The question is published on by Tutorial Guruji team.
Which of these is faster, and why? Or are they the same? Does the answer vary by any conditions (size of dictionary, type of data, etc.)?
Traditional:
for key in dict: x = dict[key] x = key
Hipster:
for key, value in dict.items(): y = value y = key
I haven’t seen an exact duplicate, but if there is one I’d be happy to be pointed to it.
Answer
It turns out there are actually orders of magnitude of difference.
I don’t know much about performance testing, but what I tried to do was create 3 dicts of varying sizes, with each smaller dict being a subset of the larger dict. I then ran all three dicts through the two functions (Traditional vs. Hipster). Then I did that 100 times.
The dictionary sizes (number of key-value pairs) for dict1, dict2, and dict3 are 1000, 50000, 500000 respectively.
There seems to be a significant difference, with d.items()
being generally faster and d.items()
being WAY faster on smaller dictionaries. This is in line with expectations (Python generally rewarding “pythonic” code).
Results:
--d[key]-- dict1 -- mean: 0.0001113555802294286, st. dev: 1.9951038526222054e-05 dict2 -- mean: 0.01669296698019025, st. dev: 0.019088713496142 dict3 -- mean: 0.2553815016898443, st. dev: 0.02778986771642094 --d.items()-- dict1 -- mean: 6.005059978633653e-05, st. dev: 1.1960199272812617e-05 dict2 -- mean: 0.00507106617995305, st. dev: 0.009871762371401046 dict3 -- mean: 0.07369932165958744, st. dev: 0.023440325168927384
Code (repl.it) providing results:
import timeit import random import statistics def traditional(dicty): for key in dicty: x = dicty[key] x = key def hipster(dicty): for key, value in dicty.items(): y = value y = key def generate_random_dicts(): random_dict1, random_dict2, random_dict3 = {}, {}, {} for _ in range(1000): key = generate_random_str_one_to_ten_chars() val = generate_random_str_one_to_ten_chars() random_dict1[key] = val random_dict2[key] = val random_dict3[key] = val for _ in range(49000): key = generate_random_str_one_to_ten_chars() val = generate_random_str_one_to_ten_chars() random_dict2[key] = val random_dict3[key] = val for _ in range(450000): key = generate_random_str_one_to_ten_chars() val = generate_random_str_one_to_ten_chars() random_dict3[key] = val return [random_dict1, random_dict2, random_dict3] def generate_random_str_one_to_ten_chars(): ret_str = "" for x in range(random.randrange(1,10,1)): ret_str += chr(random.randrange(40,126,1)) return ret_str dict1, dict2, dict3 = generate_random_dicts() test_dicts = [dict1, dict2, dict3] times = {} times['traditional_times'] = {} times['hipster_times'] = {} for _ in range(100): for itr, dictx in enumerate(test_dicts): start = timeit.default_timer() traditional(dictx) end = timeit.default_timer() time = end - start try: times['traditional_times'][f"dict{itr+1}"].append(time) except KeyError: times['traditional_times'][f"dict{itr+1}"] = [time] start = timeit.default_timer() hipster(dictx) end = timeit.default_timer() time = end - start try: times['hipster_times'][f"dict{itr+1}"].append(time) except KeyError: times['hipster_times'][f"dict{itr+1}"] = [time] print("--d[key]--") for x in times['traditional_times'].keys(): ltimes = times['traditional_times'][x] mean = statistics.mean(ltimes) stdev = statistics.stdev(ltimes) print(f"{x} -- mean: {mean}, st. dev: {stdev}nn") print("--d.items()--") for x in times['hipster_times'].keys(): ltimes = times['hipster_times'][x] mean = statistics.mean(ltimes) stdev = statistics.stdev(ltimes) print(f"{x} -- mean: {mean}, st. dev: {stdev}")