How benchmarks can go wrong

The computation got optimized out
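
A minimal sketch of this failure mode, using CPython constant folding as a small stand-in for the dead-code elimination that optimizing compilers perform. The snippets `FOLDED` and `LIVE` and the helper `has_binary_op` are hypothetical names chosen for illustration. When both operands are literals, the multiplication is computed at compile time, so the "benchmark" only times loading a constant.

```python
import dis
import timeit

# Hypothetical snippets, chosen only for illustration.
FOLDED = "x = 1000 * 1000"  # both operands are literals
LIVE = "x = a * b"          # operands are only known at run time


def has_binary_op(src):
    """Return True if the compiled bytecode still contains a binary operation."""
    code = compile(src, "<bench>", "exec")
    return any(ins.opname.startswith("BINARY_") for ins in dis.get_instructions(code))


# The folded snippet is expected to contain no multiply instruction at all.
print("folded snippet still multiplies:", has_binary_op(FOLDED))
print("live snippet still multiplies:  ", has_binary_op(LIVE))

# Timing the folded snippet measures a constant load, not a multiplication.
t_folded = timeit.timeit(FOLDED, number=200_000)
t_live = timeit.timeit(LIVE, globals={"a": 1000, "b": 1000}, number=200_000)
print(f"folded: {t_folded:.4f}s  live: {t_live:.4f}s")
```

Inspecting the generated code (bytecode here, assembly in compiled languages) is one way to check that the work being timed is still present.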

Benchmarking the wrong code

Related to the previous point: the benchmark can introduce code that is unrelated to what is actually being benchmarked, so the measurement reflects the wrong work.
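
One common way this happens is setup work leaking into the timed region. The sketch below assumes the goal is to time a sort. The input size `N` and the function names are hypothetical, picked only for illustration.

```python
import random
import timeit

N = 10_000  # hypothetical input size


def build_input():
    """Create a reproducible list of N random floats."""
    rng = random.Random(0)
    return [rng.random() for _ in range(N)]


def sort_with_setup():
    # Wrong target: building the input is timed together with the sort.
    return sorted(build_input())


data = build_input()  # setup done once, outside the timed region


def sort_only():
    # Closer to the intent: only the sort runs inside the timed region.
    return sorted(data)


t_mixed = timeit.timeit(sort_with_setup, number=20)
t_sort = timeit.timeit(sort_only, number=20)
print(f"sort + setup: {t_mixed:.4f}s  sort only: {t_sort:.4f}s")
```

`sorted(data)` leaves `data` untouched. Calling `data.sort()` instead would sort already-sorted input on every iteration after the first, which is another way to end up timing the wrong code.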

Throughput vs latency

See latency vs throughput in benchmarking

Cold cache

See code cache

Cache

Benchmarks often have a small cache footprint, whereas real-world applications often use much more cache, so a result measured on a tiny working set may not carry over.
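
A rough way to probe this is to repeat the same access loop over working sets of increasing size. The sketch below is an assumption-laden illustration, not a calibrated measurement: the helper `ns_per_access` and the sizes are hypothetical, and interpreter overhead plus the index list add a roughly constant cost to every run, so only the trend across sizes is of interest.

```python
import random
import time
from array import array


def ns_per_access(size, accesses=200_000, seed=0):
    """Time random reads over `size` 8-byte integers.

    Returns (checksum, nanoseconds per access).
    """
    data = array("q", [1]) * size  # working set of size * 8 bytes
    rng = random.Random(seed)
    indices = [rng.randrange(size) for _ in range(accesses)]

    start = time.perf_counter()
    total = 0
    for i in indices:
        total += data[i]
    elapsed = time.perf_counter() - start

    # The checksum is returned so the reads are actually used.
    return total, elapsed * 1e9 / accesses


# Hypothetical sizes: from a few KiB up to tens of MiB.
for size in (1 << 10, 1 << 16, 1 << 22):
    _, ns = ns_per_access(size)
    print(f"{size * 8 / 1024:>10.0f} KiB working set: {ns:.1f} ns/access")
```

If the per-access time rises as the working set outgrows the caches, a benchmark that only exercises the smallest size would understate the cost seen by a larger application.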

References