Following up on my previous blog post on the most performant timestamp functions in Python, I sparked quite a heated discussion in this Reddit post.
To summarize the discussion:
- A handful of "why are you micro-optimizing?"
- The much-expected "if performance is so important to you, don't use Python"
- A note that accessing module attributes (e.g. `datetime.datetime.now`) on each iteration adds overhead
- A note that Python 3.12 deprecates `datetime.utcnow()` - I did not know that!
- An interesting point about Windows versus Linux machines
Setting aside the first two points, I decided to expand the initial analysis to compare Python 3.10 and 3.12. Moreover, I was curious to compare the performance of native Windows against WSL2. I also ran the tests on an old Ubuntu server I have lying around.
Regarding the third point, I now resolve the attribute in the setup step (`fn = datetime.datetime.now`) and call the resolved function in the loop.
Because `utcnow()` is deprecated, I removed it from the test cases.
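For reference, a minimal sketch of the migration: `utcnow()` returned a naive datetime, while the documented replacement produces a timezone-aware one.

```python
import datetime

# Deprecated since Python 3.12:
# datetime.datetime.utcnow()

# Documented replacement: an aware datetime in UTC
now_utc = datetime.datetime.now(datetime.timezone.utc)
print(now_utc.tzinfo)  # UTC
```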
Functions tested
```python
import time

time.time()
```

```python
import datetime

datetime.datetime.now()
```

```python
import datetime

datetime.datetime.now().timestamp()
```

```python
# This is the recommended replacement for `datetime.datetime.utcnow()`
import datetime
import pytz

datetime.datetime.now(pytz.UTC)
```

```python
import datetime
import pytz

a_timezone = pytz.timezone('America/Los_Angeles')
datetime.datetime.now(a_timezone)
```
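As an aside, not part of the benchmark: since Python 3.9 the standard library's `zoneinfo` module covers the same use case as `pytz` (on Windows it may require the third-party `tzdata` package for the timezone database).

```python
import datetime
from zoneinfo import ZoneInfo

a_timezone = ZoneInfo("America/Los_Angeles")
now_la = datetime.datetime.now(a_timezone)
print(now_la.tzinfo)
```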
Hardware tested
- A native Ubuntu 20 server. This is running older hardware, and is expectedly slower.
- A native Windows 10 laptop.
- A WSL2 Ubuntu 20 machine running on the same Windows 10 laptop.
Because Windows 10 and WSL2 run on the same hardware, we can compare the two directly.
Results
Conclusion
- WSL2 is faster than native Windows (on the same physical machine) for all test cases except `time.time()`, where it is slightly slower.
- No solution is as fast as the deprecated `datetime.utcnow()` for generating the current timestamp as a datetime object. This was noted by the CPython team: https://github.com/python/cpython/issues/103857
- Python 3.12 is slightly faster across the board than 3.10.
Code
On each hardware/interpreter
```python
import timeit

results = {}

results["time.time()"] = timeit.timeit(setup="import time; fn=time.time", stmt="fn()")
results["datetime.now().timestamp()"] = timeit.timeit(
    setup="import datetime; fn=datetime.datetime.now", stmt="fn().timestamp()"
)
results["datetime.now()"] = timeit.timeit(
    setup="import datetime; fn=datetime.datetime.now", stmt="fn()"
)
results["datetime.now(pytz.UTC)"] = timeit.timeit(
    setup="import datetime, pytz; fn=datetime.datetime.now", stmt="fn(pytz.UTC)"
)
results["datetime.now(tz)"] = timeit.timeit(
    setup="import datetime, pytz; a_timezone = pytz.timezone('America/Los_Angeles'); fn=datetime.datetime.now",
    stmt="fn(a_timezone)",
)

# Sanity check: compare the wall clock against the monotonic performance counter
import time
print(f"{time.time()} -vs- {time.perf_counter()}")

# Print the results as CSV rows, fastest first
results_sorted = sorted(results.items(), key=lambda t: t[1])
for name, result_s in results_sorted:
    print(f"{name},{result_s}")
```
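The analysis script below reads `analysis_time_2_data.csv` with columns `machine`, `python`, `fn`, and `time_s`. That file isn't shown in the post, so here is a hypothetical helper (the name, path, and exact schema are my assumptions) for collecting each run's results into that shape:

```python
import csv
import os

# Assumed column order, matching what the analysis script selects
FIELDS = ["machine", "python", "fn", "time_s"]

def append_results(path, machine, python_version, results):
    """Append one benchmark run as CSV rows; write the header if the file is new."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(FIELDS)
        for fn_name, time_s in results.items():
            writer.writerow([machine, python_version, fn_name, time_s])

# Example with made-up numbers
append_results("example_results.csv", "WSL2", "3.12", {"time.time()": 0.015})
```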
To compile the results
```python
# %%
import pandas as pd

data = pd.read_csv("analysis_time_2_data.csv")

# %%
ubuntu_20 = data[data["machine"] == "Ubuntu 20"]
wsl2 = data[data["machine"] == "WSL2"]
windows_10 = data[data["machine"] == "Windows 10"]

# %%
# Compare each machine, Python 3.10 vs 3.12
import plotly.express as px

px.bar(
    ubuntu_20,
    x="fn",
    y="time_s",
    color="python",
    log_y=True,
    barmode="group",
    title="Ubuntu 20 native",
)

# %%
px.bar(
    wsl2,
    x="fn",
    y="time_s",
    color="python",
    log_y=True,
    barmode="group",
    title="WSL2",
)

# %%
px.bar(
    windows_10,
    x="fn",
    y="time_s",
    color="python",
    log_y=True,
    barmode="group",
    title="Windows 10",
)

# %%
data_by_fn_arch = data[["fn", "time_s", "machine"]].groupby(["machine", "fn"]).mean()
data_by_fn_arch = data_by_fn_arch.sort_index().reset_index()
data_by_fn_arch.sort_values(["time_s"], inplace=True)
fig_1 = px.line(
    data_by_fn_arch,
    x="fn",
    y="time_s",
    color="machine",
    log_y=True,
    title="Mean time per machine type for 100000 calls",
)

# %%
data_by_fn_python = data[["fn", "time_s", "python"]].groupby(["python", "fn"]).mean()
data_by_fn_python = data_by_fn_python.sort_index().reset_index()
data_by_fn_python.sort_values(["time_s"], inplace=True)
fig_2 = px.line(
    data_by_fn_python,
    x="fn",
    y="time_s",
    color="python",
    log_y=True,
    title="Mean time per Python version for 100000 calls",
)

# %%
import plotly.io

plotly.io.write_json(fig_1, "time_per_machine.json")
plotly.io.write_json(fig_2, "time_per_python.json")
```