My question is around the constant O(1) lookup time for a dictionary.
So I have a dictionary of key/value pairs. And I have a list of items I am iterating through. For each item, I want to check whether it is in the dictionary, and if it is not in there, I want to add it, otherwise I will simply pass. Is this constant or linear time?
Because I am iterating over the whole dictionary, however I know that lookup in a dictionary is O(1)?- this has confused me so much!
My real question is, is the overall time complexity O(N^2)? Because for each item in the list, I am checking the entire dictionary to see if it is there, and I am adding each item to it if it is not there.
The overall complexity is O(n). You are performing n lookups each of which takes O(1) time, for a combined time complexity of O(n×1) = O(n).
For each item, I want to check whether it is in the dictionary, and if it is not in there, I want to add it…
Side note, checking and adding in separate steps is a common code smell. It’s usually better to simply attempt an insertion. The insertion will fail if the key already exists, no need to check first. This is more efficient as it does one lookup instead of two&mdash—after all, insertion has to do the exact same lookup as a key check to figure out where to insert. Checking first merely duplicates work.
In the worst case, it’s not just an efficiency improvement but also one of correctness. If the dictionary is accessed by multiple threads a check-then-insert access pattern can result in a time-of-check to time-of-use (TOCTOU) error. Namely, you might check if the key exists; it doesn’t, so you attempt to insert it; but meanwhile another thread has snuck in and inserted the same key while you weren’t looking, and you overwrite the other thread’s entry.
Another common TOCTOU error is checking if a file exists and creating it if it doesn’t. See Common Weakness Enumeration 367 – TOCTOU Race Condition for more details and examples.