Skip to content

Rewriting of induction variables #24

@maximumspatium

Description

@maximumspatium

Induction variables are widely emitted by compilers during program optimization. They greatly assist in strength reduction.

Consider the following code:

int arrayOfInts[10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

for (i = 0; i < 10; i++)
    printf("Next int %d", arrayOfInts[i]);

One possible transformation of it could look like that:

int *p = intPtr;
int arrayOfInts[10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

p = arrayOfInts;

for (i = 0; i < 10; i++) {
    printf("Next int %d", *p);
    p++;
}

The expression arrayOfInts[i] has been transformed to the equivalent pointer dereference introducing the induction variable "p".

This is surely a very simple example. Real-world programs will usually contain more complex cases. A well-optimized PowerPC program I used to work on contained up to 4 induction variables per loop.

The ability to (partially) undo this kind of optimization in the decompiler would greatly improve the readability of its output. Such a transformation would be applied at a later decompilation stage and could be human-assisted.

The great book "Advanced compiler design and implementation" by Steven S. Muchnick already scratched an algorithm for induction variable identification and removal (see chapter 14).

A good starting point would be simplifications of pointers that depend on the loop counter.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions