Starting Point
void vsum(vec_ptr v, int *dest) {
int i;
*dest = 0;
for (i = 0; i < vec_len(v); i++) {
int val;
get_vec_element(v, i, &val);
*dest += val;
}
}
vec_len(v) a procedure call whose side effects the compiler must assume
get_vec_element(v, i, &val) : another procedure call and memory operation that suffers from potential pointer aliasing
- CPE = 42.06
Letting the Compiler Do Its Job (vsum1)
- Simply compiling with
-O2 allows compiler to perform basic optimizations (scheduling, register allocation)
- CPE = 31.25
Manual High-Level Optimizations
Manual Loop Invariant Code Motion
- Move
vec_len(v) out of the loop
- Compiler must treat
vec_len() as a black box with potential side effects
- CPE = 20.66
Manual Inlining + LICM (vsum3)
- Inline the
get_vec_element() function and move get_vec_start(v) out of the loop
-O2 avoids aggressive inlining to prevent code bloat
- CPE = 6
Reducing Memory References (vsum4)
- Use a local variable
sum for accumulation instead of repeatedly writing to *dest
- Compiler must assume that some other pointer might point to
*dest and change its value during the loop
- Using a local variable eliminates this problem
- It is difficult for the compiler to promote a pointer dereference to a register
- Compiler needs to know that no other pointer can point to
*dest
- Otherwise the pointer could be used to change
*dest