Execution time can be defined in more than one way. When we have code like the following,
for (int i = 0; i < N; i++)
    q[i] = rand();

int checksum = 0;

for (int i = 0; i < N; i++)
    checksum ^= lower_bound(q[i]);
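if we time the whole thing and divide by the number of iterations, we are measuring throughput rather than latency: the queries are independent of each other, so the CPU is free to overlap the execution of several of them.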
To measure actual latency, we need to introduce a dependency between invocations:
for (int i = 0; i < N; i++)
    checksum ^= lower_bound(checksum ^ q[i]);