The simplest way to find an element in an array is by checking each element one by one until a match is found or the end of the array is reached. This approach in the worst-case scenario can lead to traversal of the whole array. Given a sorted array, we can improve the efficiency of our search operation by
performing binary search algorithm on it. Let’s start by observing the property of a sorted array that makes it suitable for binary search.
Property of an ascending sorted array
- Every element is bigger than or equal to all the elements on its left side. For instance, 19 in the array below is greater than all the elements on its left side.
2. Every element is smaller than or equal to all the elements in its right side. For instance, 46 in the array below is smaller than all the elements on its right side.
In short for an element at index i,
Similar observations can be made for a descending sorted array but for the sake of simplicity, we will move forward with an example of ascending sorted array in the article.
We can exploit this property of a sorted array to optimize our search by reducing our search space to only the region where we are certain to get the target element. Let’s understand this statement by an example: — We are given a sorted array of length n, where we need to search for element 46.
We can perform the search by picking an element at a random index ‘rdm’ say 26 and compare it to the target element. This can lead to three possibilities: -
Possibility 1: target element is equal to the element at rdm. In that case, we have found the target and can simply return its index.
Possibility 2: target element is smaller than the element at rdm. In that case, we continue the search in its left side.
Possibility 3: target element is greater than the element at rdm. In that case, we continue the search in its right side.
Now that we know the possibilities, let’s explore them in this example.
For the first possibility, we see that target = 46 ≠ 26 = arr[rdm]. Since the target is not equal to element at rdm, we need to explore the other two possibilities. Is the target element 46 greater than random element 26? Yes. So, in that case we can deduce that target element must be present on the right side of 26. This means that we need not perform the search operation on left side of arr[rdm] effectively reducing the search space.
Now we again pick an element at a random position, say 82 (refer to the image below). Since the target element is not equal to 82, we decide to continue searching. Target element 46 is smaller than 82, so we
can deduce that target will be present on the left side of 82 and perform the search in that region thereby reducing the search space.
We repeat this process until the element chosen at random position is equal to the target element. Once we do, we return its index.
We find that instead of picking an element at a random position every time, picking the element at middle position in the current search space leads to the least number of comparisons, so that’s what we will do. That is the main idea of binary search algorithm. We pick the element at middle position. Then we check whether it is equal to the target element. If it is, we return its position. Otherwise, we continue the search in either its left side or right side thereby discarding half the number of elements and repeat the process until we find the element or there is no element left to be searched.
Let’s identify the middle index of our search space as ‘mid’.
Defining search space
A search space is defined as the region in the array where we are finding the target element. We are going to identify the starting and ending of the search space using two pointers. We will call them ‘start’ and ‘end’. Following are the examples of a valid search space:
A valid search space has at least one element. From the above picture we know that a valid search space has start <= end. Below is an example of an invalid search space.
Calculating the middle position
Middle index of a search space can be easily calculated using its starting and ending positions.
For an array with even number of elements, we usually pick any of the two middle positions and this above formula is also well suited to calculate it. For those of you, who fear that the above formula will lead to overflow and give garbage value, we can repurpose it this way:
This formula will yield to the following mid values for these example search spaces:
Step by step iterative implementation
Now that we have covered all the prerequisites, we can start implementing the algorithm. Let’s say we are searching for a target element E.
Initially, we are going to find E in the complete array, so we will define the start of our search space as 0, and end as n-1, since the length of array is n.
Next, we will calculate the middle position.
Now, we check for the three possibilities:
Possibility one: target element E is equal to the element at middle position. In that case we return its position ‘mid’
Possibility two: target E is smaller than the element at mid. In that case, we need to start searching in the left side of mid by discarding all the elements in the left of it. For the new search space, starting position remains the same, but the ending position becomes mid — 1.
Possibility three: target E is greater than the element at mid. In that case, we need to start searching in the right side of mid by discarding all the elements on the left of it. For the new search space, ending position remains the same, but the starting position becomes mid + 1
Now, we need to repeat this procedure until we don’t have a match at mid or the search space is valid, i.e., there is at least one element in the search space. For a search space having at least one element, start <= end. So, we put the whole procedure into a while loop.
What if the element is not present in the array? While searching, there will be a point, when we will completely exhaust the search space, i.e., no element will be left to search for. In that case, start > end and the loop will terminate, so we can be certain and return -1 as an indication that the element was not found.
Example dry run
The following is an algorithm dry run when element to be searched is 19:
For the folks who understand what time complexity is and are wondering what it is for binary search, the following section covers it briefly. There are two facts that can be observed from the behavior of this algorithm:
Fact 1: For every iteration, the number of elements in the search space is reduced to half.
Fact 2: In the worst-case scenario, the algorithm runs until the search space has one element left.
Taking these two facts in consideration, if there are n elements initially and we assume that the algorithm terminates after k iterations, then it is clear that the number of elements in the search space after k iterations = number of elements in search space after last iteration.
This leads to the following deduction: