Local value numbering is a way to perform compiler optimization locally in a control block. There is a global counterpart of local value numbering called global value numbering.
Motivation
Let’s consider some challenges in local optimization. What do these different forms have in common?
Dead Code Elimination
main {a: int = const 100; // can be deleteda: int = const 42;print a;}
Copy Propagation
main { x: int = const 4; copy1: int = id x; copy2: int = id copy1; copy3: int = id copy2; print copy3; // can delete all the copies and print x}
Common Subexpression Elimination (CSE)
main { a: int = const 4; b: int = const 2; sum1: int = add a b; sum2: int = add a b; // same as sum1 prod: int = mul sum1 sum2; print prod;}
Answer
All those redundancies in computation are caused by using variables. If we can look at the underlying values instead, the redundancies disappears.
Value Numbering
Value numbering stores pairs of value and canonical variable name into a table. When building up values, we can refers to previous values by number in expression.
Example
number
value(expression)
Canonical Variable Name
1
x
2
(+, #1, #1)
a
3
(*, #2, #1)
d
4
(+, #3, #2)
c
We also need to interpret the basic block. Though unlike an actual interpreter, instead of calculating the actual values, we calculate values as the numbers shown above. (since we don’t know the actual inputs)
Example
Consider our common subexpression elimination example again:
main { a: int = const 4; b: int = const 2; sum1: int = add a b; sum2: int = add a b; // same as sum1 prod: int = mul sum1 sum2; print prod;}
It will generate LVN data structure like this
number
value
Canonical Variable Name
1
const 4
a
2
const 2
b
3
(add, #1, #2)
sum1
4
(mul, #3, #3)
prod
and a map from variables to numbers
{ a → 1, b → 2, sum1 → 3, sum2 → 3, prod → 4 }
Notice that both sum1 and sum2 points to number 3.
Every time when we processed on instruction, we will reconstruct the instruction base on the table. This process happens on-the-fly with the construction of the table. Thus, when the construction of the table is completed, we should get the following instructions
main { a: int = const 4; b: int = const 2; sum1: int = add a b; sum2: int = id sum1; // Now sum2 is never used prod: int = mul sum1 sum1; print prod;}
While the basic version is already useful for common subexpression elimination, value numbering is a framework that we can extend with fancier optimizations.
Copy Propagation
For example, the basic implementation of LVN can’t optimize out our copy propagation example. To make this example work, our LVN algorithm need to understand the semantics of the instructions. In this case, we can say that whenever we encounter the id instruction, we assign the value to the underlying value.
Example
main { x: int = const 4; copy1: int = id x; copy2: int = id copy1; copy3: int = id copy2; print copy3; // can delete all the copies and print x}
We can replace our original code with
main { x: int = const 4; copy1: int = id x; copy2: int = id x; copy3: int = id x; print x;}
In this particular example, since we know the value of x as a constant, we can even apply constant propagation and we can get
main { x: int = const 4; copy1: int = const 4; copy2: int = const 4; copy3: int = const 4; print x;}
Commutativity
Another example is that we can also exploit commutativity of integer arithmetic. This optimization enhance common subexpression elimination.
We can achieve that by canonicalizing values in instruction.
Example
a: int = const 4;b: int = const 2;sum1: int = add a b;sum2: int = add b a;prod: int = mul sum1 sum2;print prod;
can be transformed into
a: int = const 4;b: int = const 2;sum1: int = add a b;prod: int = mul sum1 sum1;print prod;
Constant Propagation
Constant propagation is just like copy propagation but we replace id with a const instruction.
Example
x: int = const 4;copy1: int = id x;copy2: int = id copy1;copy3: int = id copy2;print copy3;
transforms to
x: int = const 4;copy1: int = const 4;copy2: int = const 4;copy3: int = const 4;print x;
Constant Folding
It is possible to achieve constant folding with LVN by combining all the above extensions
Example
@main { a: int = const 4; b: int = const 2; sum1: int = add a b; sum2: int = add a b; prod1: int = mul sum1 sum2; sum1: int = const 0; sum2: int = const 0; sum3: int = add a b; prod2: int = mul sum3 sum3; print prod2;}
can be reduced into
@main { prod2: int = const 36; print prod2;}
Pseudocode for Basic LVN
Note: We need to define equality comparison for the structure that represent a value
table = {} # mapping from value tuples to canonical variables, # with each row numberedvar2num = {} # mapping from variable names to their current # value numbers (i.e., rows in table)for instr in block: value = (instr.op, var2num[instr.args[0]], ...) if value in table: # The value has been computed before; reuse it. num, var = table[value] replace instr with copy of var # a.k.a. dest = id var else: # A newly computed value. num = fresh value number # TODO: Can SSA make this redundent? # To handle mutation in cases like this # x = a + b # ... # x = c # ... # y = x # where we can rename the old variable to a different name # x' = a + b # ... # x = c # ... # y = x' if instr will be overwritten later: dest = fresh variable name instr.dest = dest else: dest = instr.dest # Add new entry to the table table[value] = num, dest # Use (canonical) variables in the table rather than old args for a in instr.args: replace a with table[var2num[a]].var # Always need to update the value of destination variable # The value is represented by number in the table var2num[instr.dest] = num