Big O Notation

Optimizing existing code typically buys you 10-20%. If you really need to speed a program up, you have to compare the performance of its algorithm against alternative algorithms. Big O notation gives us a way to compare how algorithms perform as the data set grows large. Big O notation uses “n” to represent the number of data items.

To analyze a function, we look at how many operations it performs in terms of n. Generally, we don’t distinguish between integer, floating-point, string, or boolean operations: each operation costs 1. We drop all constant factors; if only constant factors remain, we call the function O(1). This represents any constant-time function, regardless of how large a data set is passed to it. Such a function could be fairly complicated, with lots of operations, as long as its running time doesn’t depend on the size of the data.
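
As a rough sketch in C++ (the function name is made up for illustration), the following is O(1): it always performs the same handful of operations, however long the vector is.

    #include <vector>

    // O(1): a fixed number of operations, regardless of v.size().
    // Assumes v is non-empty.
    double midpoint_of_ends(const std::vector<double>& v) {
        double first = v.front();   // constant-time element access
        double last  = v.back();
        return (first + last) / 2;  // a few arithmetic ops, still O(1)
    }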

A simple loop over the data gives us O(n). We drop any coefficient on n and any additive constants (because the “n” term dominates them). A nested loop, with both loops running from 1 to n, gives O(n^2). In this case, we drop anything smaller than n^2 (e.g. n, 1, etc.) because for large n the running time is dominated by the n^2 term.
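
A sketch of both cases in C++ (the function names are mine, for illustration):

    #include <vector>

    // O(n): a single pass over the data.
    long long total(const std::vector<int>& items) {
        long long sum = 0;
        for (int x : items)   // runs n times
            sum += x;
        return sum;           // roughly 2n + 1 operations, so O(n)
    }

    // O(n^2): for each of the n items, scan all n items again.
    long long count_ordered_pairs(const std::vector<int>& items) {
        long long pairs = 0;
        for (int a : items)       // n iterations
            for (int b : items)   // n iterations each, so n * n in total
                if (a < b)
                    ++pairs;
        return pairs;
    }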

Recursive functions work the same way. A standard recursive factorial function is O(n), where n is the number whose factorial we are computing rather than the size of a data set. An iterative factorial is also O(n), but might be faster in practice depending on the language and compiler.
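
A sketch of both versions in C++ (note that the result overflows an unsigned 64-bit integer for n > 20):

    // O(n): one recursive call for each value from n down to 1.
    unsigned long long factorial_recursive(unsigned n) {
        if (n <= 1)
            return 1;
        return n * factorial_recursive(n - 1);
    }

    // Also O(n), but avoids function-call overhead, so it may be faster.
    unsigned long long factorial_iterative(unsigned n) {
        unsigned long long result = 1;
        for (unsigned i = 2; i <= n; ++i)
            result *= i;
        return result;
    }
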
Some common algorithms and their performance:

  • Binary search of a sorted list: O(log n) (sketched after this list)
  • Linear search: O(n)
  • Bubble sort: O(n^2)
  • Quicksort: O(n log n)
  • Median of a sorted list: O(1)
  • Mean of a sorted list: O(n)

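As an example of a logarithmic algorithm, here is a sketch of binary search in C++ (the function name is mine): each comparison halves the remaining search range, so at most about log2(n) steps are needed.

    #include <vector>

    // O(log n): returns the index of target, or -1 if it is not present.
    int binary_search_index(const std::vector<int>& sorted_items, int target) {
        int lo = 0, hi = static_cast<int>(sorted_items.size()) - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids overflow of lo + hi
            if (sorted_items[mid] == target)
                return mid;
            else if (sorted_items[mid] < target)
                lo = mid + 1;               // discard the lower half
            else
                hi = mid - 1;               // discard the upper half
        }
        return -1;
    }
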
Some languages provide Big O bounds on the performance of their standard libraries (or portions of them). For example, the C++ standard specifies complexity bounds for many Standard Template Library containers and algorithms.
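
Two such guarantees, sketched below (assuming C++11 or later, where std::sort is required to be O(n log n)):

    #include <algorithm>
    #include <vector>

    void stl_examples(std::vector<int>& v, int target) {
        // Guaranteed O(n log n) comparisons since C++11.
        std::sort(v.begin(), v.end());

        // O(log n) comparisons: a binary search over the sorted range.
        auto it = std::lower_bound(v.begin(), v.end(), target);
        (void)it;   // use the resulting iterator as needed
    }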
