I’m trying to optimize a binary search function for strings in JavaScript. Since each comparison step involves checking if the key is equal to or less than the pivot, this results in two string comparisons per iteration.
In C-like languages, we have strcmp()
which returns a ternary value (-1, 0, +1), allowing just one comparison.
What and how can I use a similar approach in JavaScript. Is there a native method or recommended pattern to compare strings in JavaScript that returns a comparable result (-1, 0, 1) in one go?
If you haven’t explored this yet, localeCompare() is your friend here. It’s JavaScript’s built-in way to compare strings and return a ternary value just like strcmp() in C.
const result = "apple".localeCompare("banana");
// Returns -1 because "apple" < "banana"
So for a binary search, you can write:
function binarySearch(arr, key) {
let low = 0;
let high = arr.length - 1;
while (low <= high) {
const mid = Math.floor((low + high) / 2);
const cmp = key.localeCompare(arr[mid]);
if (cmp === 0) return mid;
else if (cmp < 0) high = mid - 1;
else low = mid + 1;
}
return -1;
}
Why this is awesome?
Handles locale-specific ordering if needed (but you can ignore that with localeCompare(otherString, undefined, { sensitivity: 'base' }))
For simple ASCII, use manual comparison (only if you’re chasing microseconds)
If you don’t need locale awareness (e.g., sorting plain lowercase English words), and you’re really trying to eke out performance, you could go with something like this:
function compareStrings(a, b) {
if (a === b) return 0;
return a < b ? -1 : 1;
}
This is slightly faster in some environments than localeCompare() but lacks its flexibility and correctness for non-English or case-sensitive data.
This isn’t an optimization per se, but during dev, it helps to log your comparison results:
const cmp = key.localeCompare(arr[mid]);
console.log(`Comparing ${key} with ${arr[mid]}: ${cmp}`);
I’ve found that when optimizing string-based algorithms, having these logs in place (and then removing them later) really helps spot logic bugs or unexpected behaviors with similar-looking strings.