The Binary Search Algorithm

If we have an array that is sorted, we can use a much more efficient algorithm called a Binary Search.  Consider the array outlined in the following example. An array reference parameter with the name number is refering to an integer array of 9 elements.  Those 9 elements are filled with values as shown in the figure.  Note that the elements are ALREADY sorted in the array.  The Binary Search algorithm depends on the array being already sorted. The Linear Search did not depend on having a sorted array.

The idea behind the Binary Search is to split this array in half multiple times to "zero-in" on the value we're looking for. Assume we are looking for the value 44 in the array, and we want to know the index of the element that this value is located in, if, in fact, it is in the array at all (remember that we always have to prepare for the case where the element is not found at all).  In this example, the value 44 is located in the 6th element of the array.  In the linear search shown above, it would require six comparisons between array elements and the search key to find out that the 6th element of the array contained the value that we are looking for.  Let's see how the binary search works now.

In the series of figures below, a sequence of passes is shown for the binary search. Let's go over them step-by-step.  The first step is to look at the array in it's initial state.  We are going to have to keep three "pointers" into the array for this algorithm - three integer variables that contain the indicies of three different places that we are concerned with in the array: the low index that we are still looking at, the high index that we are still looking at, and the midpoint index between the low and high.  The figure below shows you these values.  The low and high indices are the first and last element indices of the array, and the midpoint is shown to be (low+high)/2.  Note that we need to do integer division to find the midpoint.  That way, if the number of elements in the array is even, and thus the "midpoint" is actually not an element, we will set the mid pointer to one less than what floating point division would give us.  If you didn't catch on to what integer division did way back at the beginning of the semester, now is the time to make sure you do. So if there are 8 elements, then (0+7)/2 would be 3.5 with floating point division, but will return 3 with integer division.

So, the mid pointer in the example below will start out pointing to element 4, which contains the value 38.  Here comes the key to the Binary Search - pay attention!  First, you compare the value that mid points to to see if it is the value we are looking for (44).  It is not in our case.  So, you then ask the question: is the value that mid points to higher or lower than our search value?  In this example, it is lower.  Now, since the array is sorted, we KNOW that the value we are searching for MUST be in the UPPER HALF of the array, since it is larger than the midpoint element value! So in one comparison, we have discarded the lower half of the array as elements that we need to search! This is a powerful tool for searching large arrays! But we still haven't found where the value really is, so let's continue with the next figure.

So now we take the 2nd pass in the Binary Search algorithm. We know that we just need to search the upper half of the array.  We know that the value that mid pointed to above is NOT the value that we are looking for.  So, we now reset our "low" pointer to point to the index of the element that is one higher than our previous midpoint.  This actually points to our search value, but we don't know that yet.  So now low points to index 5, and high still points to the last index in the array.  We recalculate the midpoint, and using integer division, (5+8)/2 will give 6 as the midpoint index to use. So now we repeat the process.  Does the element at the mid index contain our value? No.  Is the value we are searching for higher or lower than the value of the element at our midpoint index?  In this case, it is lower (44 is less than 77).  So now we are going to reset our pointers and do a third pass. See the next figure.

For our third pass, we reset the HIGH pointer since our search value was lower than the value of the element at the midpoint.  In the figure below, you can see that we reset the high pointer to point to one less than the previous mid pointer (since we already knew that the mid pointer did NOT point to our value).  We leave the low pointer alone.  Note that now, low and high both point to element 5, and so (5+5)/2  = 5, and now the mid pointer will point to 5 as well.  So  now we see if the element in the array that mid is pointing to contains the value that we are searching for. And it does!  We have successfully searched for and found our value in three comparison steps!  As noted at the beginning of this section, the linear search would have taking six comparisons to find the same value.

Now let's look at the Java code to implement this Binary Search.  Note again, that this is NOT a "Java algorithm".  The Binary Search is a general algorithm, and can be implemented in almost any programming language (I don't know of any languages that you couldn't do it in, but then again, I don't know every programming language that exists!)

In the code in the figure below, we again declare a public static final variable that represents the value returned from binarySearch if searchValue is not found. The binarySearch method has the same signature as the linearSearch method (except for the method name, of course!).  It will return an int that represents the index of the found element, or NOT_FOUND if the search key was not found.  It takes a reference to a sorted integer array and the integer value to search for.  Note:  This implementation assumes that the array that number refers to is already sorted!  If it is not, this algorithm will NOT work!  We talk about sorting in the next lesson.

Inside the method, you can see the declaration and initialization of the three element pointers: low, mid and high.  The low value starts at 0, the high value at number.length - 1 (you do understand why it is one less than the length, right?  It is the last index, not the length!). Then mid is calculated as we discussed above.

Now comes the actual logic of the algorithm.  Similar to the linear search, a while loop is used in this example. The stopping criteria for the loop is slightly different, however.  This loop will continue as long as low <= high, and as long as the value in the index that mid points to is not our number.  (Once low==high, then we are almost done - if mid doesn't point to our value, then the value is not in the array)  If mid points to the element with our value, we will end right then.  If we cut the array into halves enough times to get down to one element, we are done.  So if either of these happens, the while loop ends.

Inside the while loop, we have to keep track of the low, mid, and high values for each iteration of the algorithm.  The if-else statement does this.  First, the if block checks to see if the number at the mid point of the current range is less than our search value.  Note that through the loop termination condition, we have already checked to see if the midpoint element contains our value. So if number[mid] < searchValue, we reset low to mid+1.  Otherwise, our number must be below the midpoint, so we reset high to mid-1.  Finally, after the if-else, for each iteration of the loop, we recalculate the midpoint of the current low to high range.  The termination condition checks the midpoint element, and we go through another iteration of the loop (i.e., another pass of the algorithm).

After the loop terminates, we check to see if low>high.  If we DON'T find the value we are looking for in the loop, then we will have done one last iteration where low==mid==high, and then the if-else will have either added 1 to low or subtracted 1 from high, making low>high.  So that is our condition to know that we did not find the search key in the array, and that we want to return NOT_FOUND.  Otherwise, we know that mid will point to the element that our search key is stored in, and we can return mid.  In either case, we are done, and the return value from the method either has the index of the found element, or the value of NOT_FOUND.


Back to Lesson Page

 CSC123  Computer Science I (Programming in Java)