6/26/08

Write a function that determines the number of 1 bits in the computer’s internal representation of a given integer

This problem may at first sound like a base conversion problem in which you have to design an algorithm to convert a base 10 number to a two’s complement binary number. That approach is circuitous because the computer already internally stores its numbers in two’s complement binary. Instead of doing a base conversion, try counting the 1’s directly.

You can count the number of 1’s by checking the value of each bit. Ideally, you’d like to use an operator that would tell you the value of a specified bit. That way, you could iterate over all of the bits and count how many of them were 1’s. Unfortunately, this ideal operator doesn’t exist.

You can begin by trying to create a procedure that determines the value of each bit using the existing bit operators. Focus on figuring out a way to get the value of the lowest bit. One way to do this is to AND the given integer with the value 1. Let’s use 8-bit integers to keep our examples manageable, in which case 1 is stored as 00000001. The result would be either 00000000 if the given integer’s lowest bit had the value 0, or 00000001 if the given integer’s lowest bit had the value 1. In general, you can get the value of any bit if you create the correct mask. In this case, the mask is an integer with all the bits set to 0 except the bit you’re checking, which is set to 1. When you AND a mask with the value you’re checking, the result is either a 0, indicating that the bit you are checking has the value 0, or a non-zero result, indicating that the bit you are checking has the value 1.

You could create a mask for each of the bits and count the number of 1 bits. For example, the first mask would be 00000001, followed by masks of 00000010, 00000100, 00001000, and so on. This would work, but your interviewer probably doesn’t want to watch you write out that many masks. Consider the differences between each mask. Each mask is the same as the previous mask, but the 1 bit is moved one place to the left. Instead of predefining your masks, you can construct them using the shift left operator. Simply start with a mask of 00000001 and repeatedly shift the integer one bit to the left to generate all the necessary masks. This is a good technique, and if you work it out to its conclusion, it yields an acceptable answer. However, there’s a prettier and slightly faster solution that uses only one mask.

Think about what you can do with a single mask. You are trying to examine each bit of the integer, so you need to mask a different bit on each iteration. So far, you’ve been accomplishing this by shifting the mask and keeping the integer in place, but if you shifted the integer, you could examine all of its bits using the same mask. The most natural mask to use is 00000001, which yields the least-significant bit. If you keep shifting the integer right, each bit will eventually become the rightmost bit. Try working through 00000101 as an example. The rightmost bit is 1 so you would add 1 to your count and shift the integer right, yielding 00000010. This time the rightmost bit is 0. Shifting right again produces 00000001. The least significant bit in this integer is 1, so you would increment your count to 2. When you shift right a third time, the integer becomes 00000000. When the integer’s value reaches zero there are no 1 bits remaining, so you can stop counting. As in this example, you may not have to iterate through all the bits to count all the 1’s, so in many cases this algorithm is more efficient than the multiple mask algorithm. In outline, the single mask algorithm is as follows:

Start with count = 0
While the integer is not 0
If the integer AND 1 equals 1, increment count
Shift the integer one bit to the right
Return count

Finally, check for any error cases in this code; you’ll want to look for problems with positive numbers, negative numbers, and zero. If the integer has the value of 0, the algorithm immediately and correctly returns that there are zero 1’s in the binary representation. Now consider the case where you are passed a negative number. The number will be shifted to the right, but the new bit added on the left will depend on how the shift operator treats negative values. Let’s avoid the problem entirely and use Java’s >>> operator for our example. In the other C-like languages, you can also avoid the problem by reading the value as an unsigned integer. (That solution doesn’t work with Java because there are no unsigned integer types in Java.) Using either the >>> or an unsigned integer means that the shift operator will not sign extend, and the new bits that are added during the right shifting will be 0’s. The number will eventually become all 0’s. Finally, consider the case where you are given a positive integer. This is the sample case that you worked with, and the algorithm works correctly here.

The code for this algorithm is as follows:

int numOnesInBinary( int number )
{
int numOnes = 0;
while( number != 0 ){
if( ( number & 1 ) == 1 )
numOnes++;
number = number >>> 1;
}
return numOnes;
}

What’s the running time of this function? The function will iterate through the while loop until all the 1’s have been counted. In the best case, the given integer is 0, and the function never executes the while loop. In the worst case, this is O(n), where n is the size, in bits, of an integer.

Unless you’re incredibly good at bitwise operations, this is the best solution you’re likely to come up with in an interview. There are better solutions, though. Consider what happens at the bit level when you subtract 1 from a number. Subtracting 1 produces a value that has all the same bits as the original integer except that all the low bits up to and including the lowest 1 are flipped. For example, subtracting 1 from the value 01110000 results in the value 01101111.

If you apply the AND operation to the integer and the result of the subtraction, the result is a new number that is the same as the original integer except that the rightmost 1 is now a 0. For example, 01110000 AND (01110000 – 1) = 01110000 AND 01101111 = 01100000.

You can count the number of times that you can perform this process before the integer’s value reaches 0. This is the number of 1’s in the computer’s representation of the number. In outline form this algorithm is as follows:

Start with count = 0
While the integer is not zero
AND the integer with the integer – 1
Increment count
Return count

Here is the code:

int numOnesInBinary( int number ){
int numOnes = 0;
while( number != 0 ){
number = number & (number – 1);
numOnes++;
}
return numOnes;
}

This solution has a running time of O(m), where m is the number of 1’s in the solution. There may be even better solutions. Keep in mind that this solution was presented for interest, and the first solution is all that would be expected in an interview.

No comments: