Binary search
In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array.
Binary search runs in logarithmic time in the worst case, making $O(\log n)$ comparisons, where $n$ is the number of elements in the array. Binary search is faster than linear search except for small arrays. However, the array must be sorted first to be able to apply binary search. There are specialized data structures designed for fast searching, such as hash tables, that can be searched more efficiently than binary search. However, binary search can be used to solve a wider range of problems, such as finding the next-smallest or next-largest element in the array relative to the target even if it is absent from the array.
There are numerous variations of binary search. In particular, fractional cascading speeds up binary searches for the same value in multiple arrays. Fractional cascading efficiently solves a number of search problems in computational geometry and in numerous other fields. Exponential search extends binary search to unbounded lists. The binary search tree and B-tree data structures are based on binary search.
Algorithm
Binary search works on sorted arrays. Binary search begins by comparing an element in the middle of the array with the target value. If the target value matches the element, its position in the array is returned. If the target value is less than the element, the search continues in the lower half of the array. If the target value is greater than the element, the search continues in the upper half of the array. By doing this, the algorithm eliminates the half in which the target value cannot lie in each iteration.
Procedure
Given an array $A$ of $n$ elements with values or records $A_0, A_1, A_2, \ldots, A_{n-1}$ sorted such that $A_0 \le A_1 \le A_2 \le \cdots \le A_{n-1}$, and target value $T$, the following subroutine uses binary search to find the index of $T$ in $A$.
1. Set $L$ to $0$ and $R$ to $n - 1$.
2. If $L > R$, the search terminates as unsuccessful.
3. Set $m$ (the position of the middle element) to $L$ plus the floor of $\frac{R - L}{2}$, which is the greatest integer less than or equal to $\frac{R - L}{2}$.
4. If $A_m < T$, set $L$ to $m + 1$ and go to step 2.
5. If $A_m > T$, set $R$ to $m - 1$ and go to step 2.
6. Now $A_m = T$, the search is done; return $m$.
In the pseudocode below, floor is the floor function, and unsuccessful refers to a specific value that conveys the failure of the search.
function binary_search(A, n, T) is
    L := 0
    R := n − 1
    while L ≤ R do
        m := L + floor((R − L) / 2)
        if A[m] < T then
            L := m + 1
        else if A[m] > T then
            R := m − 1
        else:
            return m
    return unsuccessful
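The procedure above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `binary_search` and the use of `None` as the "unsuccessful" value are assumptions made for this example.

```python
def binary_search(A, T):
    """Return an index m with A[m] == T, or None if T is absent.

    A must be sorted in non-decreasing order. Using None as the
    "unsuccessful" value is an assumption of this sketch.
    """
    L = 0
    R = len(A) - 1
    while L <= R:
        m = L + (R - L) // 2  # L plus the floor of (R - L) / 2
        if A[m] < T:
            L = m + 1  # target can only lie in the upper half
        elif A[m] > T:
            R = m - 1  # target can only lie in the lower half
        else:
            return m
    return None


print(binary_search([1, 3, 5, 7, 9, 11], 7))  # prints 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))  # prints None
```

An empty array is handled by the loop condition: with `R = -1` the loop body never runs and the search is reported as unsuccessful.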
Alternatively, the algorithm may take the ceiling of $\frac{R - L}{2}$. This may change the result if the target value appears more than once in the array.
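To illustrate how the rounding direction of the midpoint can change which duplicate is found, the sketch below runs the same search with a floor midpoint and with a ceiling midpoint. The helper `binary_search_rounding` and its `use_ceiling` flag are hypothetical names introduced only for this example.

```python
def binary_search_rounding(A, T, use_ceiling=False):
    """Basic binary search with a selectable midpoint rounding.

    Hypothetical helper for illustration; returns None when T is absent.
    """
    L, R = 0, len(A) - 1
    while L <= R:
        span = R - L
        # (span + 1) // 2 equals ceil(span / 2) for non-negative integers.
        offset = (span + 1) // 2 if use_ceiling else span // 2
        m = L + offset
        if A[m] < T:
            L = m + 1
        elif A[m] > T:
            R = m - 1
        else:
            return m
    return None


A = [1, 2, 3, 4, 4, 5, 6, 7]
print(binary_search_rounding(A, 4))                    # floor midpoint: prints 3
print(binary_search_rounding(A, 4, use_ceiling=True))  # ceiling midpoint: prints 4
```

Both results are valid matches; they differ only in which of the two equal elements the first probe happens to land on.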
Alternative procedure
In the above procedure, the algorithm checks whether the middle element is equal to the target in every iteration. Some implementations leave out this check during each iteration. The algorithm would perform this check only when one element is left. This results in a faster comparison loop, as one comparison is eliminated per iteration, while it requires only one more iteration on average. Hermann Bottenbruch published the first implementation to leave out this check in 1962.
- Set $L$ to $0$ and $R$ to $n - 1$.
- While $L \ne R$,
  - Set $m$ (the position of the middle element) to $L$ plus the ceiling of $\frac{R - L}{2}$, which is the least integer greater than or equal to $\frac{R - L}{2}$.
  - If $A_m > T$, set $R$ to $m - 1$.
  - Else, $A_m \le T$; set $L$ to $m$.
- Now $L = R$, the search is done. If $A_L = T$, return $L$. Otherwise, the search terminates as unsuccessful.
Where ceil is the ceiling function, the pseudocode for this version is:
function binary_search_alternative(A, n, T) is
    L := 0
    R := n − 1
    while L != R do
        m := L + ceil((R − L) / 2)
        if A[m] > T then
            R := m − 1
        else:
            L := m
    if A[L] = T then
        return L
    return unsuccessful
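A possible Python rendering of this alternative procedure is sketched below. The function name and the `None` return value are assumptions of the example. The explicit guard for an empty array is also an addition: the loop as described presumes at least one element, since with $n = 0$ the bounds start at $L = 0$ and $R = -1$.

```python
def binary_search_alternative(A, T):
    """Binary search that defers the equality test until one element is left.

    Returns the index of T (the rightmost one if T is duplicated),
    or None if T is absent. None is an assumption of this sketch.
    """
    if not A:
        return None  # added guard: the loop below assumes a non-empty array
    L = 0
    R = len(A) - 1
    while L != R:
        m = L + (R - L + 1) // 2  # L plus the ceiling of (R - L) / 2
        if A[m] > T:
            R = m - 1
        else:
            L = m  # A[m] <= T, so m stays inside the candidate range
    if A[L] == T:
        return L
    return None


print(binary_search_alternative([1, 2, 3, 4, 4, 5, 6, 7], 4))  # prints 4
```

The ceiling matters for termination: because `m` is always at least `L + 1` while `L < R`, assigning `L = m` strictly shrinks the range on every pass.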
Duplicate elements
The procedure may return any index whose element is equal to the target value, even if there are duplicate elements in the array. For example, if the array to be searched was $[1, 2, 3, 4, 4, 5, 6, 7]$ and the target was $4$, then it would be correct for the algorithm to return either the 4th or the 5th element. The regular procedure would return the 4th element in this case. It does not always return the first duplicate. However, it is sometimes necessary to find the leftmost element or the rightmost element for a target value that is duplicated in the array. In the above example, the 4th element is the leftmost element of the value 4, while the 5th element is the rightmost element of the value 4. The alternative procedure above will always return the index of the rightmost element if such an element exists.
Procedure for finding the leftmost element
To find the leftmost element, the following procedure can be used:
- Set $L$ to $0$ and $R$ to $n$.
- While $L < R$,
  - Set $m$ (the position of the middle element) to $L$ plus the floor of $\frac{R - L}{2}$, which is the greatest integer less than or equal to $\frac{R - L}{2}$.
  - If $A_m < T$, set $L$ to $m + 1$.
  - Else, $A_m \ge T$; set $R$ to $m$.
- Return $L$.
Where floor is the floor function, the pseudocode for this version is:
function binary_search_leftmost(A, n, T):
    L := 0
    R := n
    while L < R:
        m := L + floor((R − L) / 2)
        if A[m] < T:
            L := m + 1
        else:
            R := m
    return L
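As a sketch, the leftmost procedure translates to Python as shown below. The function name is an assumption of the example; the comparison against `bisect.bisect_left` from Python's standard library is included only as a cross-check, since that function returns the same insertion point.

```python
import bisect


def binary_search_leftmost(A, T):
    """Return the number of elements of A that are less than T.

    If T occurs in A, this is the index of its leftmost occurrence.
    The caller still has to check that the index is in range and that
    A[index] == T before treating the result as a match.
    """
    L = 0
    R = len(A)
    while L < R:
        m = L + (R - L) // 2  # L plus the floor of (R - L) / 2
        if A[m] < T:
            L = m + 1
        else:
            R = m
    return L


A = [1, 2, 3, 4, 4, 5, 6, 7]
i = binary_search_leftmost(A, 4)
print(i, i < len(A) and A[i] == 4)                     # prints: 3 True
print(binary_search_leftmost(A, 8))                    # prints 8 (past the end, so 8 is absent)
print(binary_search_leftmost(A, 4) == bisect.bisect_left(A, 4))  # prints True
```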
Procedure for finding the rightmost element
To find the rightmost element, the following procedure can be used:
- Set $L$ to $0$ and $R$ to $n$.
- While $L < R$,
  - Set $m$ (the position of the middle element) to $L$ plus the floor of $\frac{R - L}{2}$, which is the greatest integer less than or equal to $\frac{R - L}{2}$.
  - If $A_m > T$, set $R$ to $m$.
  - Else, $A_m \le T$; set $L$ to $m + 1$.
- Return $R - 1$.
Where floor is the floor function, the pseudocode for this version is:
function binary_search_rightmost(A, n, T):
    L := 0
    R := n
    while L < R:
        m := L + floor((R − L) / 2)
        if A[m] > T:
            R := m
        else:
            L := m + 1
    return R - 1
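The rightmost procedure can be sketched in Python in the same way. The function name is again an assumption; `bisect.bisect_right` from the standard library is used only as a cross-check, since it returns one more than the value computed here.

```python
import bisect


def binary_search_rightmost(A, T):
    """Return the index of the last element of A that is <= T.

    If T occurs in A, this is the index of its rightmost occurrence.
    The result is -1 when every element is greater than T, so the
    caller must check the index and A[index] == T before using it
    as a match.
    """
    L = 0
    R = len(A)
    while L < R:
        m = L + (R - L) // 2  # L plus the floor of (R - L) / 2
        if A[m] > T:
            R = m
        else:
            L = m + 1
    return R - 1


A = [1, 2, 3, 4, 4, 5, 6, 7]
print(binary_search_rightmost(A, 4))  # prints 4
print(binary_search_rightmost(A, 0))  # prints -1
print(binary_search_rightmost(A, 4) == bisect.bisect_right(A, 4) - 1)  # prints True
```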
Approximate matches
The above procedure only performs exact matches, finding the position of a target value. However, it is trivial to extend binary search to perform approximate matches because binary search operates on sorted arrays. For example, binary search can be used to compute, for a given value, its rank (the number of smaller elements), predecessor (next-smallest element), successor (next-largest element), and nearest neighbor. Range queries seeking the number of elements between two values can be performed with two rank queries.
- Rank queries can be performed with the procedure for finding the leftmost element. The number of elements less than the target value is returned by the procedure.
- Predecessor queries can be performed with rank queries. If the rank of the target value is $r$, its predecessor is the element at index $r - 1$.
- For successor queries, the procedure for finding the rightmost element can be used. If the result of running the procedure for the target value is $r$, then the successor of the target value is the element at index $r + 1$.
- The nearest neighbor of the target value is either its predecessor or successor, whichever is closer.
- Range queries are also straightforward. Once the ranks of the two values are known, the number of elements greater than or equal to the first value and less than the second is the difference of the two ranks. This count can be adjusted up or down by one according to whether the endpoints of the range should be considered to be part of the range and whether the array contains entries matching those endpoints.
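The queries in the list above can be sketched with Python's standard `bisect` module, whose `bisect_left` returns the same value as the leftmost procedure and whose `bisect_right` returns one more than the rightmost procedure. All function names below are hypothetical, and returning `None` when no predecessor or successor exists, as well as breaking nearest-neighbor ties toward the predecessor, are assumptions of this example.

```python
import bisect


def rank(A, T):
    """Number of elements of A that are strictly less than T."""
    return bisect.bisect_left(A, T)


def predecessor(A, T):
    """Largest element strictly less than T, or None if there is none."""
    r = rank(A, T)
    return A[r - 1] if r > 0 else None


def successor(A, T):
    """Smallest element strictly greater than T, or None if there is none."""
    i = bisect.bisect_right(A, T)  # rightmost-procedure result plus one
    return A[i] if i < len(A) else None


def nearest_neighbor(A, T):
    """Predecessor or successor of T, whichever is closer (ties: predecessor)."""
    p, s = predecessor(A, T), successor(A, T)
    if p is None:
        return s
    if s is None:
        return p
    return p if T - p <= s - T else s


def count_in_range(A, lo, hi):
    """Number of elements x with lo <= x < hi, as a difference of two ranks."""
    return rank(A, hi) - rank(A, lo)


A = [1, 2, 3, 4, 4, 5, 6, 7]
print(rank(A, 4), predecessor(A, 4), successor(A, 4))  # prints: 3 3 5
print(nearest_neighbor([10, 20, 30], 23))              # prints 20
print(count_in_range(A, 2, 5))                         # prints 4
```

Counting a closed range instead (both endpoints included) would use `bisect.bisect_right` for the upper bound, which is the "adjust by one" case described in the last list item.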
Performance
In the worst case, binary search makes $\lfloor \log_2(n) + 1 \rfloor$ iterations of the comparison loop, where the $\lfloor \cdot \rfloor$ notation denotes the floor function that yields the greatest integer less than or equal to the argument, and $\log_2$ is the binary logarithm. This is because the worst case is reached when the search reaches the deepest level of the comparison tree that represents the search, and there are always $\lfloor \log_2(n) + 1 \rfloor$ levels in the tree for any binary search.
The worst case may also be reached when the target element is not in the array. If $n$ is one less than a power of two, then this is always the case. Otherwise, the search may perform $\lfloor \log_2(n) + 1 \rfloor$ iterations if the search reaches the deepest level of the tree. However, it may make $\lfloor \log_2(n) \rfloor$ iterations, which is one less than the worst case, if the search ends at the second-deepest level of the tree.
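The worst-case bound of $\lfloor \log_2(n) + 1 \rfloor$ iterations can be checked empirically with a small sketch. The helpers below are hypothetical names for this example; the test array of even numbers is chosen so that the odd numbers in between serve as absent targets. For integers $n \ge 1$, Python's `n.bit_length()` equals $\lfloor \log_2(n) \rfloor + 1$, which avoids floating-point logarithms.

```python
def count_iterations(A, T):
    """Count comparison-loop iterations of the basic procedure for target T."""
    L, R, iterations = 0, len(A) - 1, 0
    while L <= R:
        iterations += 1
        m = L + (R - L) // 2
        if A[m] < T:
            L = m + 1
        elif A[m] > T:
            R = m - 1
        else:
            break
    return iterations


def worst_case(n):
    """Maximum iteration count over every present and absent target."""
    A = list(range(0, 2 * n, 2))  # 0, 2, ..., 2n - 2
    return max(count_iterations(A, T) for T in range(-1, 2 * n))


def unsuccessful_counts(n):
    """Set of iteration counts observed for targets that are not in the array."""
    A = list(range(0, 2 * n, 2))
    return {count_iterations(A, T) for T in range(-1, 2 * n, 2)}  # odd targets


for n in (1, 7, 10, 100, 1000):
    print(n, worst_case(n), n.bit_length())  # the last two columns agree

print(unsuccessful_counts(7))   # prints {3}: n is one less than a power of two
print(unsuccessful_counts(10))  # prints {3, 4}: deepest or second-deepest level
```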
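On average, assuming that each element is equally likely to be searched, binary search makes $\lfloor \log_2(n) \rfloor + 1 - (2^{\lfloor \log_2(n) \rfloor + 1} - \lfloor \log_2(n) \rfloor - 2)/n$ iterations when the target element is in the array. This is approximately equal to $\log_2(n) - 1$ iterations. When the target element is not in the array, binary search makes $\lfloor \log_2(n) \rfloor + 2 - 2^{\lfloor \log_2(n) \rfloor + 1}/(n + 1)$ iterations on average, assuming that the range between and outside elements is equally likely to be searched.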
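As a rough check of the average case, the sketch below measures the average iteration counts exactly and compares them with the closed forms $\lfloor \log_2(n) \rfloor + 1 - (2^{\lfloor \log_2(n) \rfloor + 1} - \lfloor \log_2(n) \rfloor - 2)/n$ for successful searches and $\lfloor \log_2(n) \rfloor + 2 - 2^{\lfloor \log_2(n) \rfloor + 1}/(n + 1)$ for unsuccessful ones. The helper names are hypothetical, and the array of even numbers with odd "gap" targets is just a convenient test setup assumed for this example.

```python
from fractions import Fraction


def count_iterations(A, T):
    """Count comparison-loop iterations of the basic procedure for target T."""
    L, R, iterations = 0, len(A) - 1, 0
    while L <= R:
        iterations += 1
        m = L + (R - L) // 2
        if A[m] < T:
            L = m + 1
        elif A[m] > T:
            R = m - 1
        else:
            break
    return iterations


def measured_averages(n):
    """Exact average iterations for present targets and for the n + 1 gaps."""
    A = list(range(0, 2 * n, 2))                                   # 0, 2, ..., 2n - 2
    hits = [count_iterations(A, x) for x in A]
    misses = [count_iterations(A, x) for x in range(-1, 2 * n, 2)]  # odd targets
    return Fraction(sum(hits), n), Fraction(sum(misses), n + 1)


def predicted_averages(n):
    """Closed-form averages; f is floor(log2(n)) for n >= 1."""
    f = n.bit_length() - 1
    hit = f + 1 - Fraction(2 ** (f + 1) - f - 2, n)
    miss = f + 2 - Fraction(2 ** (f + 1), n + 1)
    return hit, miss


print(measured_averages(10))   # prints (Fraction(29, 10), Fraction(39, 11))
print(predicted_averages(10))  # prints the same pair
```

Using `Fraction` keeps the comparison exact, so agreement does not depend on floating-point rounding.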
In the best case, where the target value is the middle element of the array, its position is returned after one iteration.
In terms of iterations, no search algorithm that works only by comparing elements can exhibit better average and worst-case performance than binary search. The comparison tree representing binary search has the fewest levels possible, as every level above the lowest level of the tree is filled completely. A search algorithm whose comparison tree is not filled in this way eliminates fewer elements in some iterations, increasing the number of iterations required in the average and worst case. This is the case for other search algorithms based on comparisons: while they may work faster on some target values, their average performance over all elements is worse than that of binary search. By dividing the array in half, binary search ensures that the sizes of the two subarrays are as similar as possible.