6/24/08

Big-endian or Little-endian

Important: Write a function that determines whether a computer is big-endian or little-endian.

This problem tests your knowledge of computer architectures as much as it tests your ability to program. The interviewer wants to know whether you are familiar with the term endian. If you are familiar with it, you should define it or at least try to point out the differences between big-endian and little-endian, even if you forget which is which. If you are not familiar with the term, you’ll have to ask the interviewer to explain it.
Endianness refers to the order in which a computer stores the bytes of a multibyte value. (Or, technically, the units of a multiunit value; for example, the computer may use a 16-bit unit size instead of an 8-bit unit size. We restrict ourselves to 8-bit units for simplicity, however.) Almost all modern computers use multibyte sequences to represent certain primitive data types.
For example, an integer is usually 4 bytes. The bytes within an integer can be arranged in any order, but they are almost always either least-significant byte (LSB) to most-significant byte (MSB) or MSB to LSB. Significance refers to the place value a byte represents within a multibyte value. If a byte represents the lowest place values in a (two-byte) word the byte is the LSB. For example, in the number 5A6C, 6C is the LSB. Conversely, if a byte represents the highest place values in the word, it is the MSB. In the 5A6C example, 5A is the MSB. In a big-endian machine the MSB has the lowest address; in a little-endian machine the LSB has the lowest address. For example, a big-endian machine stores the 2-byte hexadecimal value A45C by placing A4 in the first byte and 5C in the second. In contrast, a little-endian machine stores 5C in the first byte and A4 in the second.
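To make the two layouts concrete, here is a minimal sketch that writes the bytes of a 2-byte value into an array in each order. The helper names layout_big_endian and layout_little_endian are hypothetical, and index 0 of the array stands in for the lowest memory address. Because the helpers use shifts rather than the machine's own storage, they produce the same result on any host.

```c
#include <stdint.h>

/* Hypothetical helper: lays out `value` the way a big-endian machine
 * would store it, with the MSB at the lowest index. */
void layout_big_endian(uint16_t value, unsigned char out[2]) {
    out[0] = (unsigned char)(value >> 8);   /* MSB first */
    out[1] = (unsigned char)(value & 0xFF); /* LSB second */
}

/* Hypothetical helper: lays out `value` the way a little-endian machine
 * would store it, with the LSB at the lowest index. */
void layout_little_endian(uint16_t value, unsigned char out[2]) {
    out[0] = (unsigned char)(value & 0xFF); /* LSB first */
    out[1] = (unsigned char)(value >> 8);   /* MSB second */
}
```

For the value A45C, the first helper fills the array with A4 then 5C, and the second with 5C then A4, matching the description above.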
Endianness is important to know when reading or writing data structures, especially across networks, so that different applications can communicate with each other. Sometimes the endianness is hidden from the developer: Java uses a fixed endianness to store data, regardless of the underlying platform’s endianness, so data exchanges between two Java applications won’t normally be affected by endianness.
But other languages, C in particular, don’t specify an endianness for data storage, leaving the implementation free to choose the endianness that works best for the platform. The solution to this problem is written in C.
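Because C leaves the byte order to the platform, code that exchanges data with other machines typically fixes the order itself. The following is a sketch of one common approach, assuming a big-endian wire format and 8-bit bytes; the function names put_u32_be and get_u32_be are made up for illustration.

```c
#include <stdint.h>

/* Writes `value` into `buf` with the MSB first, whatever the host order. */
void put_u32_be(uint32_t value, unsigned char buf[4]) {
    buf[0] = (unsigned char)(value >> 24);
    buf[1] = (unsigned char)(value >> 16);
    buf[2] = (unsigned char)(value >> 8);
    buf[3] = (unsigned char)value;
}

/* Rebuilds the value from four bytes that were stored MSB first. */
uint32_t get_u32_be(const unsigned char buf[4]) {
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

A sender and a receiver that both use these two functions agree on the value even if their native byte orders differ.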
To answer the problem, you have to choose some multibyte data type to work with. It’s not important which one you choose, just that the type is more than one byte. A 32-bit integer is a good choice. You need to determine how you can test this integer to figure out which byte is LSB and which is MSB. If you set the value of the integer to 1, you can distinguish between the MSB and the LSB because in an integer with the value 1, the LSB has the value 1 and the MSB has the value 0.
Unfortunately, it’s not immediately clear how to access the bytes of an integer. You might try using the bit operators because they allow access to individual bits in a variable. However, they are not particularly useful because the bit operators act as if the bits are arranged in order from least-significant bit to most-significant bit. For example, if you use the left-shift operator to shift the integer 8 bits, the operator works on the integer as if it were 32 consecutive bits, regardless of the true internal byte order. This property prevents you from using the bit operators to determine byte order.
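A short sketch shows why the bit operators cannot help. The hypothetical function byte_by_significance picks out a byte with a shift and a mask, and its result depends only on the integer's value, so it is identical on big-endian and little-endian machines.

```c
#include <stdint.h>

/* Returns byte `index` of `value` counted by significance
 * (0 = LSB, 3 = MSB). Shifts and masks operate on the value,
 * not on its layout in memory. */
unsigned char byte_by_significance(uint32_t value, unsigned index) {
    return (unsigned char)((value >> (8u * index)) & 0xFFu);
}
```

For an integer holding 1, index 0 yields 1 and index 3 yields 0 on every machine, which tells you nothing about which of those bytes sits at the lowest address.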

How might you be able to examine the individual bytes of an integer? A character is a single-byte data type. It could be useful to view an integer as four consecutive characters. To do this, you create a pointer to the integer. Then, you can cast the integer pointer to a character pointer. This enables you to access the integer like an array of 1-byte data types. Using the character pointer, you can examine the bytes and determine the format.
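As a rough illustration of viewing an integer as consecutive bytes, the sketch below copies the bytes of an integer into an array in increasing address order. The function name bytes_in_address_order is hypothetical, and unsigned char is used in place of char only to avoid sign surprises when inspecting byte values.

```c
#include <stddef.h>

/* Copies the bytes of `value` into `out`, lowest address first.
 * `out` must have room for sizeof(unsigned int) bytes. */
void bytes_in_address_order(unsigned int value, unsigned char *out) {
    const unsigned char *p = (const unsigned char *)&value;
    for (size_t i = 0; i < sizeof value; i++) {
        out[i] = p[i]; /* p[i] is the byte at address (&value) + i */
    }
}
```

Unlike the shift operators, this does depend on the machine: for the value 1, the single nonzero byte lands at index 0 on a little-endian machine and at the last index on a big-endian one.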
Specifically, to determine the computer’s endianness, get a pointer to an integer with the value of 1. Then, cast the pointer to a char *. This changes the size of the data to which the pointer points. When you dereference this pointer you access a 1-byte character instead of a 4-byte integer. Thus, you can test the first byte and see if it is 1. If the byte’s value is 1, then the machine is little-endian because the LSB is at the lowest memory address. If the byte’s value is 0, then the machine is big-endian because the MSB is at the lowest memory address. In outline form, here is the procedure:
Set an integer to 1
Cast a pointer to the integer as a char *
If the dereferenced pointer is 1, the machine is little-endian
If the dereferenced pointer is 0, the machine is big-endian

The code for this test is as follows:


#include <stdbool.h> /* provides bool in C99 and later */

/* Returns true if the machine is little-endian, false if the
 * machine is big-endian
 */
bool endianness(void){
    int testNum;
    char *ptr;

    testNum = 1;
    ptr = (char *) &testNum;
    return (*ptr); /* Returns the byte at the lowest address */
}
This solution is sufficient for an interview. However, because the goal of an interview is not just to solve problems but also to impress your interviewer, you may want to consider a slightly more elegant way to solve this problem. It involves using a feature of C/C++ called union types. A union is like a struct, except that all of the members are allocated starting at the same location in memory. This enables you to access the same data with different variable types. (Strictly speaking, C permits reading a member other than the one last written, whereas standard C++ leaves that read undefined, so treat this as a C technique.) The syntax is almost identical to that of a struct. Using a union, the code is as follows:

#include <stdbool.h> /* provides bool in C99 and later */

/* Returns true if the machine is little-endian, false if the
 * machine is big-endian
 */
bool endianness(void){
    union {
        int theInteger;
        char singleByte;
    } endianTest;

    endianTest.theInteger = 1;
    return endianTest.singleByte; /* Byte at the lowest address */
}
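If the interviewer asks about machines that are neither purely big-endian nor little-endian, one possible extension of the union approach is to overlay the integer with a byte array and check all four bytes. This is only a sketch, assuming a 32-bit uint32_t and 8-bit bytes; the enum and the name detect_byte_order are invented for illustration.

```c
#include <stdint.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_OTHER };

/* Stores a value whose four bytes are all different, then checks the
 * order in which they appear in memory. */
enum byte_order detect_byte_order(void) {
    union {
        uint32_t word;
        unsigned char bytes[4];
    } probe;

    probe.word = 0x01020304u;
    if (probe.bytes[0] == 0x04 && probe.bytes[1] == 0x03 &&
        probe.bytes[2] == 0x02 && probe.bytes[3] == 0x01) {
        return ORDER_LITTLE; /* LSB at the lowest address */
    }
    if (probe.bytes[0] == 0x01 && probe.bytes[1] == 0x02 &&
        probe.bytes[2] == 0x03 && probe.bytes[3] == 0x04) {
        return ORDER_BIG;    /* MSB at the lowest address */
    }
    return ORDER_OTHER;      /* some mixed arrangement */
}
```

Using four distinct byte values instead of the value 1 lets the function tell a mixed layout apart from the two common ones, at the cost of a few more comparisons.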
