Python profiling and optimization is about measuring where your program spends time or memory and then improving those slow or heavy parts without changing the overall behavior of the code.
Tools such as cProfile, timeit, and line profilers help you find bottlenecks.

🔎 cProfile – built-in function-level profiler
cProfile gives you per-function statistics: number of calls and time spent.
You can run it from the command line with python -m cProfile your_script.py, or invoke it from code with cProfile.run() or Profile objects.

⏱️ timeit – micro-benchmarking tool
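As a sketch of the in-code API (the workload function here is just a stand-in, not part of the walkthrough), you can wrap any block of code in a Profile object and print the collected statistics with the standard pstats module:

```python
import cProfile
import pstats

def work():
    # Hypothetical workload used only for illustration
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Sort the collected stats by time spent in each function and show the top 5
stats = pstats.Stats(profiler)
stats.sort_stats("tottime").print_stats(5)
```

This is handy when you only want to profile one section of a larger program instead of the whole script.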
timeit is used for timing small snippets of code accurately, by running them many times to smooth out measurement noise.
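A minimal example (the expression being timed is arbitrary):

```python
from timeit import timeit

# Run the snippet 10,000 times and return the total elapsed seconds
elapsed = timeit("sum(range(1000))", number=10_000)
print(f"10,000 runs took {elapsed:.4f} s")
```

Note that timeit returns the total time for all runs, so divide by number to get a per-run figure.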
A typical workflow: profile first to locate the hot spot, then optimize it and measure again. Below is an intentionally slow function that checks for duplicates using a nested loop; we will profile it with cProfile.
import cProfile
import random

def has_duplicates_slow(items):
    # O(n^2) approach – compare every pair
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False

def main():
    data = [random.randint(1, 10_000) for _ in range(50_000)]
    print("Has duplicates?", has_duplicates_slow(data))

if __name__ == "__main__":
    cProfile.run("main()", sort="tottime")
The same task can be solved faster using a set (average O(1) membership check), turning the algorithm roughly into O(n).
def has_duplicates_fast(items):
    # Faster O(n) duplicate check using a set
    seen = set()
    for value in items:
        if value in seen:
            return True
        seen.add(value)
    return False
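As an aside (an alternative idiom, not part of the walkthrough above): if you do not need the early exit, the same check can be written in one line by comparing lengths, at the cost of always building the full set:

```python
def has_duplicates_oneliner(items):
    # Builds the entire set even if a duplicate appears early,
    # so has_duplicates_fast can be quicker when duplicates are common
    return len(set(items)) != len(items)
```

This only works when the items are hashable, the same requirement the set-based version above already has.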
Use timeit from the standard library to compare the slow and fast versions.
from timeit import timeit
import random

def build_data():
    # Generate a list of random integers
    return [random.randint(1, 10_000) for _ in range(20_000)]

data = build_data()

# Time the slow and fast implementations
slow_time = timeit("has_duplicates_slow(data)", globals=globals(), number=3)
fast_time = timeit("has_duplicates_fast(data)", globals=globals(), number=3)

# Display the timing results
print(f"Slow version: {slow_time:.4f} seconds")
print(f"Fast version: {fast_time:.4f} seconds")
For quick checks, you can also manually measure elapsed time using time.perf_counter().
import time

def do_work():
    # Simple loop to simulate some CPU work
    total = 0
    for i in range(1_000_000):
        total += i
    return total

# Record the start time
start = time.perf_counter()
result = do_work()
# Record the end time
end = time.perf_counter()

# Print the result and elapsed time
print("Result:", result)
print(f"Elapsed: {end - start:.6f} seconds")
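The start/stop pattern above can be wrapped in a small context manager for reuse — a convenience sketch, not something provided by the standard library:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # Measure the wall-clock time of the enclosed block
    start = time.perf_counter()
    try:
        yield
    finally:
        end = time.perf_counter()
        print(f"{label}: {end - start:.6f} seconds")

with timed("sum loop"):
    total = sum(range(1_000_000))
```

The try/finally ensures the elapsed time is printed even if the timed block raises an exception.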
When you run the cProfile.run("main()", sort="tottime") example, you will see a table that includes columns like:
ncalls – how many times each function was called
tottime – total time spent inside the function itself, excluding subcalls
cumtime – cumulative time, including time spent in called functions
percall – time per call (tottime / ncalls or cumtime / ncalls)

The function with the highest tottime is usually your main bottleneck. In this example, has_duplicates_slow will dominate the runtime, showing that its O(n²) behavior is expensive for large lists.
After switching to has_duplicates_fast, rerun the profiler/timeit code: you should see a significant reduction in the measured time. This confirms that the optimization is real and not just a guess.
In short: use cProfile for whole programs and timeit for small snippets.

Exercises:
1. Profile one of your own scripts with cProfile. Then rewrite a slow part using a more efficient approach and compare the results with timeit.
2. Write one function that builds a string with + inside a loop and another that uses "".join() on a list. Use timeit to see the difference.
3. Run python -m cProfile -o profile.out your_script.py and inspect where most time is spent.
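For the last exercise, the saved profile data can be inspected with the standard pstats module. The sketch below profiles a stand-in expression (instead of your script) so that profile.out exists:

```python
import cProfile
import pstats

# Create profile.out by profiling a small command (stand-in for your script)
cProfile.run("sum(i * i for i in range(50_000))", "profile.out")

# Load the saved profile and show the 10 most expensive functions
stats = pstats.Stats("profile.out")
stats.strip_dirs().sort_stats("tottime").print_stats(10)
```

strip_dirs() shortens file paths in the report, and sort_stats("tottime") orders functions the same way as the sort="tottime" argument used earlier.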