Static single assignment form

In compiler theory, static single assignment form, more often abbreviated SSA form or just SSA, is an intermediate representation (IR) in which every variable is assigned exactly once. Existing variables in the original IR are split into versions, new variables typically indicated by the original name with a subscript, so that every definition gets its own version. In SSA form, use-def chains are explicit and each contains a single element.

SSA was developed by researchers at IBM in the 1980s.

In functional language compilers, such as those for Scheme, ML and Haskell, continuation passing style (CPS) is generally used where one might expect to find SSA in a compiler for Fortran or C.

Contents

Benefits of SSA

The primary usefulness of SSA comes from how it simultaneously simplifies and improves the results of a variety of compiler optimizations, by simplifying the properties of variables. For example, consider this piece of code:

 y := 1
 y := 2
 x := y

As humans, we can see that the first assignment is not necessary, and that the value of y being used in the third line comes from the second assignment of y. A program would have to perform reaching definition analysis to determine this. But if the program is in SSA form, both of these are immediate:

 y1 := 1
 y2 := 2
 x1 := y2

Compiler optimization algorithms which are either permitted or strongly enhanced by the use of SSA include:

Converting to SSA

Introduction

Converting ordinary code into SSA form is primarily a simple matter of replacing the target of each assignment with a new variable, and replacing each use of a variable with the "version" of the variable reaching that point. For example, consider the following control flow graph:

An example control flow graph, before conversion to SSA

Notice that we could change the name on the left side of "x <math>\leftarrow<math> x - 3", and change the following uses of x to use that new name, and the program would still do the same thing. We exploit this in SSA by creating two new variables, x1 and x2, each of which is assigned only once. We likewise give distinguishing subscripts to all the other variables, and we get this:

Missing image
SSA_example1.2.png
An example control flow graph, partially converted to SSA

We've figured out which definition each use is referring to, except for one thing: the uses of y in the bottom block could be referring to either y1 or y2, depending on which way the control flow came from. So how do we know which one to use?

The answer is that we add a special statement, called a Φ (Phi) function, to the beginning of the last block. This statement will generate a new definition of y, y3, by "choosing" either y1 or y2, depending on which arrow control arrived from:

An example control flow graph, fully converted to SSA

Now, the uses of y in the last block can simply use y3, and they'll obtain the correct value either way. You might ask at this point, do we need to add a Φ function for x too? The answer is no; only one version of x, namely x2 is reaching this place, so there's no problem.

A more general question along the same lines is, given an arbitrary control flow graph, how can I tell where to insert Φ functions, and for what variables? This is a difficult question, but one that has an efficient solution that can be computed using a concept called dominance frontiers.


Note: the Φ functions are not actually implemented; instead, they're just markers for the compiler to place the value of all the variables grouped together by the Φ function, in the same location in memory (or same register).

Note 2: SSA is a type of Three address code Intermediate representation. Therefore, it can't be done on Abstract syntax trees

Computing minimal SSA using dominance frontiers

First, we need the concept of a dominator: we say that a node A strictly dominates a different node B in the control flow graph if it's impossible to reach B without passing through A first. This is useful, because if we ever reach B we know that any code in A has run. We say that A dominates B if either A strictly dominates B or A = B.

Now we can define the dominance frontier: the dominance frontier of a node A is the set of nodes that A does not strictly dominate, but does dominate some immediate predecessor of. From A's point of view, these are the nodes at which other control paths that don't go through A make their earliest appearance.


Dominance frontiers capture the precise places at which we need Φ functions: if the node A defines a certain variable, then that definition and that definition alone will reach every node A dominates. Only when we leave these nodes and enter the dominance frontier must we account for other flows bringing in other definitions of the same variable. Moreover, no other Φ functions are needed in the control flow graph to deal with A's definitions, and we can do with no less.


One algorithm for computing the dominance frontier set is

for each node b
    if the number of predecessors of b ≥ 2
        for each p in predecessors of b
            runner := p
            while runner ≠ doms(b)
                add b to runner’s dominance frontier set
                runner := idom(runner)

Note: in the code above, a predecessor is any node from which control is transferred to this node, doms(n) is a set of dominators of node n, and idom(n) is the immediate dominator of node n.

There is an efficient algorithm for finding dominance frontiers of each node. This algorithm was originally described in the paper "Efficiently computing static single assignment form and the control dependence graph", by R. Cytron, J. Ferrante, B. Rosen, M. Wegman and F. Zadek, ACM Trans. on Programming Languages and Systems 13(4) 1991 pp.451–490. Also useful is chapter 19 of the book "Modern compiler implementation in Java" by Andrew Appel (Cambridge University Press, 2002). See the paper for more details.

Keith D. Cooper, Timothy J. Harvey, and Ken Kennedy of Rice University describe an algorithm in their paper titled A Simple, Fast Dominance Algorithm (http://www.hipersoft.rice.edu/grads/publications/dom14.pdf). The algorithm uses well engineered data structures to improve performance.

Pruned SSA

Pruned SSA removes even more Φ functions by also considering variable liveness at Φ function insertion points. For instance, if a phi function would be assigning to a dead (never used) variable, the Φ function is unnecessary.

Converting out of SSA form

As SSA form is no longer useful for direction execution, it is frequently used "on top of" another IR with which it remains in direct correspondence. This can be accomplished by "constructing" SSA as a set of functions which map between parts of the existing IR (basic blocks, instructions, operands, etc.) and its SSA counterpart. When the SSA form is no longer needed, these mapping functions may be discarded, leaving only the now-optimized IR.

If one absolutely needed to execute a program in SSA form, phi functions could be moved to the bottom of each of their predecessor blocks and then turned into simple move operations.

Simple copy insertion, algorithms to reduce copies, etc. If overlapping lifetimes are allowed simply dropping the subscripts doesn't work. Pure SSA form doesn't really use subscripts anyway.

Extensions to SSA Form

Extensions to SSA form can be divided into two categories.

Renaming scheme extensions alter the renaming criterion. Recall that SSA form renames each variable when it is assigned a value. Alternative schemes include static single use form (which renames each variable at each statement when it is used) and static single information form (which renames each variable when it is assigned a value, and in each conditional context in which that variable is used).

Feature-specific extensions retain the single assignment property for variables, but incorporate new semantics to model additional features. Some feature-specific extensions model high-level programming language features like arrays, objects and aliased pointers. Other feature-specific extensions model low-level architectural features like speculation and predication.

Compilers using SSA form

SSA form is a relatively recent development in the compiler community. As such, many older compilers only use SSA form for some part of compilation or optimization process, but most do not rely on it. Examples of compilers that rely heavily on SSA form include:

  • The LLVM Compiler Infrastructure (http://llvm.org/) uses SSA form for all scalar register values (everything except memory) in its primary code representation. SSA form is only eliminated once register allocation occurs, late in the compile process (often at link time).
  • The open source SGI compiler ORC (http://ipf-orc.sourceforge.net/) uses SSA form in its global scalar optimizer, though the code is brought into SSA form before and taken out of SSA form afterwards. ORC uses extensions to SSA form to represent memory in SSA form as well as scalar values.
  • IBM's open source adaptive Java virtual machine, Jikes RVM (http://jikesrvm.sourceforge.net/), uses extended Array SSA, an extension of SSA that allows analysis of scalars, arrays, and object fields in a unified framework. Extended Array SSA analysis is only enabled at the maximum optimization level, which is applied to the most frequently executed portions of code.
  • jackcc (http://jackcc.sf.net) is an open-source compiler for the academic instruction set Jackal 3.0. It uses a simple 3-operand code with SSA for its intermediate representation. As an interesting variant, it replaces Φ functions with a so-called SAME instruction, which instructs the register allocator to place the two live ranges into the same physical register.
  • Although not a compiler, Boomerang (http://boomerang.sourceforge.net/) (a decompiler) uses SSA form in its internal representation. Applicability of various SSA algorithms to decompilation is an active area of SSA research.

Further Reading

Template:Book reference Template:Book reference Template:Book reference Template:Book reference


See also

External links

  • Steven Bosscher and Diego Novillo. GCC gets a new Optimizer Framework (http://lwn.net/Articles/84888/). An article about GCC's use of SSA and how it improves over older IRs.
Navigation

  • Art and Cultures
    • Art (https://academickids.com/encyclopedia/index.php/Art)
    • Architecture (https://academickids.com/encyclopedia/index.php/Architecture)
    • Cultures (https://www.academickids.com/encyclopedia/index.php/Cultures)
    • Music (https://www.academickids.com/encyclopedia/index.php/Music)
    • Musical Instruments (http://academickids.com/encyclopedia/index.php/List_of_musical_instruments)
  • Biographies (http://www.academickids.com/encyclopedia/index.php/Biographies)
  • Clipart (http://www.academickids.com/encyclopedia/index.php/Clipart)
  • Geography (http://www.academickids.com/encyclopedia/index.php/Geography)
    • Countries of the World (http://www.academickids.com/encyclopedia/index.php/Countries)
    • Maps (http://www.academickids.com/encyclopedia/index.php/Maps)
    • Flags (http://www.academickids.com/encyclopedia/index.php/Flags)
    • Continents (http://www.academickids.com/encyclopedia/index.php/Continents)
  • History (http://www.academickids.com/encyclopedia/index.php/History)
    • Ancient Civilizations (http://www.academickids.com/encyclopedia/index.php/Ancient_Civilizations)
    • Industrial Revolution (http://www.academickids.com/encyclopedia/index.php/Industrial_Revolution)
    • Middle Ages (http://www.academickids.com/encyclopedia/index.php/Middle_Ages)
    • Prehistory (http://www.academickids.com/encyclopedia/index.php/Prehistory)
    • Renaissance (http://www.academickids.com/encyclopedia/index.php/Renaissance)
    • Timelines (http://www.academickids.com/encyclopedia/index.php/Timelines)
    • United States (http://www.academickids.com/encyclopedia/index.php/United_States)
    • Wars (http://www.academickids.com/encyclopedia/index.php/Wars)
    • World History (http://www.academickids.com/encyclopedia/index.php/History_of_the_world)
  • Human Body (http://www.academickids.com/encyclopedia/index.php/Human_Body)
  • Mathematics (http://www.academickids.com/encyclopedia/index.php/Mathematics)
  • Reference (http://www.academickids.com/encyclopedia/index.php/Reference)
  • Science (http://www.academickids.com/encyclopedia/index.php/Science)
    • Animals (http://www.academickids.com/encyclopedia/index.php/Animals)
    • Aviation (http://www.academickids.com/encyclopedia/index.php/Aviation)
    • Dinosaurs (http://www.academickids.com/encyclopedia/index.php/Dinosaurs)
    • Earth (http://www.academickids.com/encyclopedia/index.php/Earth)
    • Inventions (http://www.academickids.com/encyclopedia/index.php/Inventions)
    • Physical Science (http://www.academickids.com/encyclopedia/index.php/Physical_Science)
    • Plants (http://www.academickids.com/encyclopedia/index.php/Plants)
    • Scientists (http://www.academickids.com/encyclopedia/index.php/Scientists)
  • Social Studies (http://www.academickids.com/encyclopedia/index.php/Social_Studies)
    • Anthropology (http://www.academickids.com/encyclopedia/index.php/Anthropology)
    • Economics (http://www.academickids.com/encyclopedia/index.php/Economics)
    • Government (http://www.academickids.com/encyclopedia/index.php/Government)
    • Religion (http://www.academickids.com/encyclopedia/index.php/Religion)
    • Holidays (http://www.academickids.com/encyclopedia/index.php/Holidays)
  • Space and Astronomy
    • Solar System (http://www.academickids.com/encyclopedia/index.php/Solar_System)
    • Planets (http://www.academickids.com/encyclopedia/index.php/Planets)
  • Sports (http://www.academickids.com/encyclopedia/index.php/Sports)
  • Timelines (http://www.academickids.com/encyclopedia/index.php/Timelines)
  • Weather (http://www.academickids.com/encyclopedia/index.php/Weather)
  • US States (http://www.academickids.com/encyclopedia/index.php/US_States)

Information

  • Home Page (http://academickids.com/encyclopedia/index.php)
  • Contact Us (http://www.academickids.com/encyclopedia/index.php/Contactus)

  • Clip Art (http://classroomclipart.com)
Toolbox
Personal tools