lmbench is an open-source performance benchmarking tool. Its src/lat_ops.c file contains test cases for many basic operations, such as int add, int mul, int64 div, and double div.
The int mul test case is implemented by the following function:
void do_integer_mul(iter_t iterations, void* cookie)
{
    struct _state *pState = (struct _state*)cookie;
    register int r = pState->N + 37431;
    register int s = pState->N + 4;
    register int t = r * s * s * s * s * s * s * s * s * s * s - r;
    while (iterations-- > 0) {
        TEN(r *= s;); r -= t;
        TEN(r *= s;); r -= t;
    }
    use_int(r);
}
The TEN() macro is defined as:
#define TEN(a) a a a a a a a a a a
The preprocessed result is shown below. Before the loop starts, t = r * s^10 - r (here ^ denotes exponentiation, not the C XOR operator). Inside the loop, TEN(r *= s;) computes r = r * s^10, and the r -= t that immediately follows restores r to the value it had before the loop started: pState->N + 37431.
void do_integer_mul(iter_t iterations, void* cookie)
{
    struct _state *pState = (struct _state*)cookie;
    register int r = pState->N + 37431;
    register int s = pState->N + 4;
    register int t = r * s * s * s * s * s * s * s * s * s * s - r;
    while (iterations-- > 0) {
        r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s;; r -= t;
        r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s; r *= s;; r -= t;
    }
    use_int(r);
}
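The restore-to-start behavior described above can be checked with a small standalone sketch. The helper names make_t and mul_step are hypothetical and are not part of lmbench. The sketch also uses uint32_t rather than int, an assumption made here so that overflow wraps in a well-defined way.

```c
#include <stdint.h>

/* Hypothetical helper (not part of lmbench): computes t = r * s^10 - r,
 * mirroring the initialization of t in do_integer_mul. Unsigned
 * arithmetic is used so that overflow wraps modulo 2^32. */
static uint32_t make_t(uint32_t r, uint32_t s)
{
    uint32_t p = r;
    for (int i = 0; i < 10; i++)
        p *= s;              /* p = r * s^10 (mod 2^32) */
    return p - r;
}

/* Hypothetical helper: one "TEN(r *= s;); r -= t;" sequence. */
static uint32_t mul_step(uint32_t r, uint32_t s, uint32_t t)
{
    for (int i = 0; i < 10; i++)
        r *= s;              /* ten multiplications, as TEN() expands to */
    return r - t;            /* subtracting t undoes them */
}
```

For example, with N = 1 (so r = 37432 and s = 5), mul_step(r, s, make_t(r, s)) returns r unchanged, however many times it is applied.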
In this case, a newer compiler (for example, gcc 8.3.0) can optimize the multiplications away entirely, because r returns to the same value after every TEN(r *= s;); r -= t; sequence, so the test case no longer measures multiplication performance. To work around this, use an older compiler (for example, gcc 4.8.5), or change the value of t so that r is no longer restored, for example t = r * s^9 - r (nine factors of s instead of ten).
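The effect of changing t can be illustrated with a minimal sketch. The function run_mul_loop and its t_power parameter are hypothetical stand-ins, not lmbench code, and uint32_t is again an assumption made to keep overflow well defined. With t_power = 10 the value of r never changes; with t_power = 9 it differs after each iteration, so the multiplications can no longer be folded into a no-op.

```c
#include <stdint.h>

/* Hypothetical stand-in for the body of do_integer_mul (not lmbench code).
 * n plays the role of pState->N, and t_power is the number of factors
 * of s used when computing t (10 in the original source). */
static uint32_t run_mul_loop(uint32_t n, unsigned long iterations, int t_power)
{
    uint32_t r = n + 37431;
    uint32_t s = n + 4;
    uint32_t t = r;

    for (int i = 0; i < t_power; i++)
        t *= s;
    t -= r;                      /* t = r * s^t_power - r (mod 2^32) */

    while (iterations-- > 0) {
        /* Two "TEN(r *= s;); r -= t;" sequences per iteration. */
        for (int i = 0; i < 10; i++)
            r *= s;
        r -= t;
        for (int i = 0; i < 10; i++)
            r *= s;
        r -= t;
    }
    return r;
}
```

Calling run_mul_loop(1, 1000, 10) returns the initial value 37432, whereas run_mul_loop(1, 1, 9) returns a different value. Whether a given compiler version still simplifies the modified loop would need to be confirmed by inspecting the generated assembly.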