# How does the time complexity of iterating through two unsorted lists of integers compare to sorting and then iterating through them?

This question is regarding the Time Complexity of two algorithms.

So I am solving a question involving two unsorted lists of integers, and I want to return the pair of integers (one from each list) with the smallest absolute difference.

1. The naive approach is simply to iterate through both lists, keep track of the smallest difference throughout, and at the end return the pair that produced this difference. I believe this takes `O(mn)` time, where `n` and `m` are the lengths of the lists, as we simply use two nested `for` loops.

2. Another approach is to `sort` both lists, then iterate through both of them; there is some logic that lets us break out of the loops in many cases without iterating through the whole of the second list. I believe this takes `O(n log n + m log m)` time, again where `n` and `m` are the lengths of the two lists.

I read on the solution page that solution 2 is more efficient, but I don't see why. If the time complexities I mentioned above are correct, how does `n log n + m log m` end up smaller than `mn`?

We can better see the difference with an example. Let's say we have two lists, each containing the numbers from 1 to 100 in some shuffled order.

• `list1` = `[11, 2, 33, 4, 5, 67, ...]` = `n` items where `n` is 100
• `list2` = `[8, 90, 1, 23, 45, 7, ...]` = `m` items where `m` is 100

## Naive approach

Steps:

• Iterate over each element of list1’s `n` items
• Time complexity: `n = 100`
• For each such iteration, iterate over each of list2's `m` items to find the number that produces the smallest difference with the current item of list1.
• Time complexity: `n * m = 100 * 100 = 10,000`

Total time complexity: `n * m = 10,000`
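The steps above can be sketched in Python (a minimal sketch; the function name and the small sample lists are my own, not from the solution page):

```python
def closest_pair_naive(list1, list2):
    """Return the pair (a, b), one from each list, with the smallest
    absolute difference. Checks every combination: O(n * m)."""
    best_pair = None
    best_diff = float("inf")
    for a in list1:          # n iterations
        for b in list2:      # m iterations for each item of list1
            diff = abs(a - b)
            if diff < best_diff:
                best_diff = diff
                best_pair = (a, b)
    return best_pair

# For n = m = 100 this performs 100 * 100 = 10,000 comparisons.
print(closest_pair_naive([11, 2, 33], [8, 90, 1]))  # → (2, 1)
```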

## Sorted data approach 1

Steps:

• Sort `list2` producing `[1, 2, 3, 4, 5, ...]`
• Time complexity: `m * log2(m) = 100 * log2(100) ≈ 664.4 ≈ 700`
• Iterate over each element of list1’s `n` items
• Time complexity: `n = 100`
• For each of the said iteration, perform a binary search on the sorted items of `list2` to see where the current item of `list1` is the closest to.
• To further explain this, let’s say we have a sorted 8-item list `[0, 5, 10, 15, 20, 25, 30, 35]`
• If the number whose closest value we want is `22`, we don't need to iterate through the whole list. It works just like a dictionary/thesaurus: if you were looking for the word "Potato", you wouldn't browse the dictionary from page 1. Instead you would open to the middle; if the page is lower than "P" (say "M"), go further forward, but if it's higher (say "S"), go back, then repeat.
• So the items we would check are the middle `15`, then `25`, then `20`. These 3 checks illustrate the binary search time complexity of `log2(n) = log2(8) = 3`, since we halve the list at every step.
• Once we land on `20`, we check the surrounding numbers, namely `[15, 20, 25]`, to see which is closest to `22`; that is `20`, so it is the number to pair with for the smallest difference. This check is just a constant `O(1)` operation.
• Time complexity: `n * log2(m) = 100 * log2(100) ≈ 664.4 ≈ 700`

Total time complexity: `(m * log(m)) + (n * log(m)) ≈ 700 + 700 = 1,400`
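Here is a minimal sketch of this approach using Python's standard `bisect` module for the binary search (the function name is my own, not from the solution page):

```python
import bisect

def closest_pair_binary_search(list1, list2):
    """Sort list2 once (m log m), then binary-search it for each item
    of list1 (n log m). Total: O(m log m + n log m)."""
    sorted2 = sorted(list2)
    best_pair = None
    best_diff = float("inf")
    for a in list1:
        # Index where `a` would be inserted to keep sorted2 sorted.
        i = bisect.bisect_left(sorted2, a)
        # The closest value must be a neighbour of that index, so
        # checking the surrounding items is a constant O(1) step.
        for j in (i - 1, i):
            if 0 <= j < len(sorted2):
                diff = abs(a - sorted2[j])
                if diff < best_diff:
                    best_diff = diff
                    best_pair = (a, sorted2[j])
    return best_pair

print(closest_pair_binary_search([11, 2, 33], [8, 90, 1]))  # → (2, 1)
```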

## Sorted data approach 2

• Sort `list1` producing `[1, 2, 3, 4, 5, ...]`
• Time complexity: `n * log2(n) = 100 * log2(100) ≈ 664.4 ≈ 700`
• Sort `list2` producing `[1, 2, 3, 4, 5, ...]`
• Time complexity: `m * log2(m) = 100 * log2(100) ≈ 664.4 ≈ 700`
• Iterate through both `list1` and `list2` at the same time to find the pair that produces the smallest difference, always advancing the iterator that points at the smaller value. The idea is like two people carrying wood up a staircase: the lower person is the one who moves up, so the two stay in sync and remain as close to each other as possible at all times (otherwise the wood falls ^_^).
• Time complexity: `n + m = 100 + 100 = 200`

Total time complexity: `(n * log(n)) + (m * log(m)) + n + m = (n * log(n)) + (m * log(m)) = 700 + 700 = 1,400`
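The two-pointer walk above can be sketched as follows (a minimal sketch; the function name is my own, not from the solution page):

```python
def closest_pair_two_pointers(list1, list2):
    """Sort both lists, then walk one pointer through each.
    Advancing the pointer at the smaller value keeps the two in
    sync, so the merge-like pass is O(n + m) after sorting."""
    a, b = sorted(list1), sorted(list2)
    i = j = 0
    best_pair = (a[0], b[0])
    best_diff = abs(a[0] - b[0])
    while i < len(a) and j < len(b):
        diff = abs(a[i] - b[j])
        if diff < best_diff:
            best_diff = diff
            best_pair = (a[i], b[j])
        # Move the pointer sitting on the smaller value.
        if a[i] < b[j]:
            i += 1
        else:
            j += 1
    return best_pair

print(closest_pair_two_pointers([11, 2, 33], [8, 90, 1]))  # → (2, 1)
```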

• `n + m` was dropped because it is a non-dominant term

## Conclusion

For two lists of size 100 each, the sorted data approach takes ~1,400 steps instead of 10,000, making it ~8,600 iterations faster than the naive approach.