Layout of these lectures:
Data Flow Analysis –> How Data flows on CFG? –> How application-specific Data Flows through the Nodes and Edges of CFG( a program ) ?
Recall 1st lecture:
Node(BBs/statements) –> Transfer function in nodes
Edges(control flows)–> Control-flow handling (merge/union)
(This is quite similar to the state machine abstraction mentioned in OS, which is taught by Yanyan Jiang.)
graph TD;
1[S1]
2[S2]
3[S3]
1-->3
2-->3
$$ In[S_3] = OUT[S_1] \and OUT[S_2] $$
In each data-flow analysis application, we associate with every program point a data-flow value that represents an abstraction of the set of all possible program states that can be observed for that point.
Data-flow analysis is to find a solution to a set of safe-approximation-directed constraints on th IN[s]’s and OUT[s]’s, for all statements s.
$$ OUT[s] = f_s[IN[s]] $$
$$ IN[s] = F_s(OUT[s]) $$
We can just convert the CFG to turn backward analysis to forward analysis. (Details not mentioned here)
$$ IN[s_{i+1}] = OUT[s_{i}] $$
for all i = 1, 2, … , n-1
$$ IN[B] = IN[s_1] $$
$$ OUT[B] = OUT[s_n] $$
Forward: $$ OUT[B] = f_B(IN[B]) $$ Where $$ f_B = f_{s_{n}} (f_{s_{n-1}}(…f_{s_{1}})) $$
$$ IN[B] = \and \space OUT[P] $$ where P is a predecessor of B
The backward version is similar to forward version, not listed here.
Declaration:
A definition d at program point p reaches a point q if there is a path from p to q such that d is not ‘killed’ along the path.
Abstraction
Safe approximation
We can try to understand safe approximation with the following example:
This statement ‘generates’ a definition D of variable v and kills all the other definitions before that define v.
Transfer function $$ OUT[B] = gen_B \cup (IN[B] - kill_B) $$
Control flow $$ IN[B] = \cup \space OUT[P] $$ where P is a predecessor of B
We can note that in our definition of control flow, a definition reaches a program point as long as there exists at least one path along which the definition reaches.
For some analysis, OUT[entry] is not defined as empty set, in which situation we can easily change this template algorithm to satisfy those requirements.
The initialization of OUT[B] will be introduced in later lectures. Remember that we are doing may analysis here
Proof of correctness: