Peephole optimization


Peephole optimization is an optimization technique performed on a small set of compiler-generated instructions, known as a peephole or window, that involves replacing the instructions with a logically equivalent set that has better performance.
For example:
  • Instead of pushing a register onto the stack and then immediately popping the value back into the register, remove both instructions.
  • Instead of multiplying x by 2, do.
  • Instead of multiplying a floating-point register by 8, add 3 to the floating-point register's exponent.
The term peephole optimization was introduced by William Marshall McKeeman in 1965.

Replacements

Peephole optimization replacements include but are not limited to:
  • Null sequences – delete useless operations.
  • Combine operations – replace several operations with one equivalent.
  • Algebraic laws – use algebraic laws to simplify or reorder instructions.
  • Special-case instructions – use instructions designed for special operand cases.
  • Address-mode operations – use address modes to simplify code.

Implementation

Modern compilers often implement peephole optimizations with a pattern-matching algorithm.

Examples

Replacing slow instructions with faster ones

The following Java bytecode:
aload 1
aload 1
mul
can be replaced with the following, which executes faster:
aload 1
dup
mul
As for most peephole optimizations, this is based on the relative efficiency of different instructions. In this case, dup is known/assumed to be more efficient than aload.

Removing redundant code

The following source code:
a = b + c;
d = a + e;
is straightforwardly compiled to

MOV b, R0 ; Copy b to the register
ADD c, R0 ; Add c to the register, the register is now b+c
MOV R0, a ; Copy the register to a
MOV a, R0 ; Copy a to the register
ADD e, R0 ; Add e to the register, the register is now a+e
MOV R0, d ; Copy the register to d

but can be optimized to

MOV b, R0 ; Copy b to the register
ADD c, R0 ; Add c to the register, which is now b+c
MOV R0, a ; Copy the register to a
ADD e, R0 ; Add e to the register, which is now b+c+e
MOV R0, d ; Copy the register to d

Removing redundant stack instructions

If the compiler saves registers on the stack before calling a subroutine and restores them when returning, consecutive calls to subroutines may have redundant stack instructions.
Suppose the compiler generates the following Z80 instructions for each procedure call:

PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR
POP HL
POP DE
POP BC
POP AF

If there were two consecutive subroutine calls, they would look like this:

PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF

The sequence POP regs followed by PUSH for the same registers is generally redundant. In cases where it is redundant, a peephole optimization would remove these instructions. In the example, this would cause another redundant POP/PUSH pair to appear in the peephole, and these would be removed in turn. Assuming that subroutine _ADDR2 does not depend on previous register values, removing all of the redundant code in the example above would eventually leave the following code:

PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF