1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
|
<!-- -*- ispell-local-dictionary: "english" -*- -->
<html><head>
<title>Semantics and application to program verification, École normale supérieure, 2013-2014</title>
<meta name="author" content="Antoine Mine">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="Stylesheet" title="Stylesheet" type="text/css" href="TP1_files/preferred.css">
<script src="TP1_files/main.js" type="text/javascript"></script>
</head>
<body>
<h1>Semantics and application to program verification</h1>
<div id="main">
<h2>TP1 - Denotational semantics</h2>
<p>
The goal of this session is to program an interpreter to compute the denotational semantics of a simple language.
We will use OCaml.
</p><h3>Language</h3>
<p>
Start by downloading the package: <a href="http://www.di.ens.fr/%7Emine/enseignement/l3/codeV1.tgz">codeV1.tgz</a>.
</p><p>
The package contains a parser for the language, programmed in OCamlLex and Menhir (<tt>lexer.mll</tt> and <tt>parser.mly</tt>).
The parser outputs an abstract syntax tree defined in <tt>abstract_syntax_tree.ml</tt>.
A pretty-printer, to print back an abstract syntax tree into the original language, is provided in <tt>abstract_syntax_printer.ml</tt>.
In <tt>main.ml</tt>, you will find a simple driver that takes a file name passed as argument, parses it, and prints it back.
Just typing <tt>make</tt> should compile the simple driver.
</p><h4><a name="syntax">Syntax</a></h4>
<p>
The language is a very simple "curly brackets" <tt>C</tt>-like language.
A program is composed of a sequence of statements of the form:
</p><ul>
<li> assignment: <tt>var = expr;</tt>
</li><li> tests: <tt>if (expr) stat-1;</tt> or <tt>if (expr) stat-1; else stat-2;</tt>
</li><li> while loops: <tt>while (expr) stat;</tt>
</li><li> blocks in curly brackets <tt>{ stat-1; ... stat-n; }</tt>
</li></ul>
<p>
non-standard statements include:
</p><ul>
<li> assertions of boolean expressions: <tt>assert (expr);</tt>
</li><li> variable printing: <tt>print (var-1,...,var-n);</tt>
</li><li> failure: <tt>halt;</tt> which stops the program immediately
</li></ul>
<p>
Expressions include: integer arithmetic operators: <tt>+</tt>, <tt>-</tt>, <tt>*</tt>, <tt>/</tt>, <tt>%</tt> (modulo); boolean operators: <tt>&&</tt> (and), <tt>||</tt> (or), <tt>!</tt> (negation); integer comparisons <tt><</tt>, <tt><=</tt>, <tt>></tt>, <tt>>=</tt>.
Equality <tt>==</tt> and disequality <tt>!=</tt> can be used to compare either two integers or two boolean values.
Constants include integers, and the boolean constants <tt>true</tt> and <tt>false</tt>.
Finally, <tt>rand(l,h)</tt> denotes the non-deterministic interval of integers between the constant <tt>l</tt> and the constant <tt>h</tt>.
You can use <tt>/* ... */</tt> and <tt>//</tt> comments.
You can use parentheses and the operators have their usual precedence.
</p><p>
Unlike <tt>C</tt>, variables do not need to be declared; they start existing when first assigned a value.
There are no local variables nor functions (see the extension part).
</p><h4>Semantics</h4>
<p>
Variables have no type and can hold either an integer or a boolean
value.
Subsequently, we do not distinguish statically between boolean and
integer expressions: only values have a type.
It is an error to use operators with values of the wrong type, such as
adding two boolean values.
This is detected at run-time, when evaluating the expression in the
current environment.
Other run-time errors include: divisions and modulos by zero; using a
non-boolean value in tests, loops and assert conditions; using a
variable that has never been assigned to; asserting a condition that is
false; executing the <tt>halt</tt> statement.
</p><h3>Deterministic semantics</h3>
<p>
We first consider the deterministic subset of the language, i.e., we ignore the <tt>AST_int_rand</tt> expression node for now.
</p><p>
<b>Write an interpreter that executes the program by induction on the
syntax of statements and expressions; it returns either an environment
mapping variables to values, or an error.</b>
</p><p>
You can use the following steps:
</p><ol>
<li> Define a type <tt>ival</tt> for values. It should contain integers,
booleans and errors.
You can use a string representation for errors, which will give the user
some information on the location and cause of the error.
To avoid possible overflow in arithmetic, you can use the <tt>ZArith</tt> OCaml library (its API is similar to that of <tt>Int32</tt> and <tt>Int64</tt>; the module is called <tt>Z</tt>).
</li><li> Define a type <tt>env</tt> for environments. You can use the <tt>Map</tt>
functor from the standard OCaml library to represent mappings from
variables to (non-erroneous) values. In addition to such mappings, an <tt>env</tt> object can also represent an error. The initial environment is an empty map (no variable).
</li><li> Write an expression evaluation function <tt>eval_expr: env -> expr ext -> ival</tt> by induction on the syntax of expressions.
It returns an error if its <tt>env</tt> argument is an error
(strictness) or if an error is detected in the expression evaluation,
which then percolates to the expression root.
Avoid using OCaml exceptions to propagate errors in the function: it
will only make the construction of the non-deterministic version more
painful!
</li><li> Write a statement evaluation function <tt>eval_stat: env -> stat ext -> env</tt>.
When should the function return an error environment?
</li><li>
Test your interpreter on the programs from the <tt>examples</tt> directory.
Can you detect infinite loops in <tt>loop.c</tt>, <tt>loop2.c</tt>, and <tt>loop3.c</tt>?
Under which condition does your interpreter terminate?
</li></ol>
<p class="framed">
<b>new</b>: correction available in <a href="http://www.di.ens.fr/%7Emine/enseignement/l3/correction/interp_det.ml">interp_det.ml</a>.
(The correction uses abstract syntax trees from the extended language in <a href="http://www.di.ens.fr/%7Emine/enseignement/l3/codeV1x.tgz">codeV1x.tgz</a>)
</p><h3>Non-deterministic semantics</h3>
<p>
We now consider the full language including the non-deterministic expression node <tt>rand(l,h)</tt>.
</p><p>
<b>Write an interpreter for this language that outputs the set of all
possible environments at the end of the program as well as the set of
all errors that can be encountered.</b>
</p><p>
The structure of the interpreter will be similar to the one in the previous question.
You can use the following steps:
</p><ol>
<li> Define a type <tt>ivalset</tt> to represent sets of values <tt>ival</tt> (including errors).
You can use OCaml's standard <tt>Set</tt> functor.
</li><li> Define a type <tt>envset</tt> to represent sets of environments (including errors).
</li><li> Program a function <tt>eval_expr: env -> expr ext -> ivalset</tt>
to evaluate an expression in a single environment and return the set of
its possible values in the environment.
When encountering a unary node, the operator must be applied to each
possible value of its argument expression; you can use iterators such as
<tt>fold</tt>.
Binary nodes require nested <tt>fold</tt>.
</li><li> Program a filter function <tt>filter: envset -> expr ext -> envset</tt> that returns the subset of its <tt>envset</tt> argument that can satisfy the expression, enriched with the errors encountered during the expression evaluation.
This function will be useful to model loops, tests and assertions.
Remember that an environment can satisfy both an expression and its negation!
</li><li> Program a generic fixpoint operator <tt>fix: ('a -> 'a) -> 'a -> 'a</tt> that iterates a function from a base element to reach a fixpoint.
Use it then in the semantics of loops.
</li><li>
Test your interpreter on the <tt>examples</tt> directory, including non-deterministic programs such as <tt>gcd_nd.c</tt> and <tt>loop4.c</tt>.
</li></ol>
<p class="framed">
<b>new</b>: correction available in <a href="http://www.di.ens.fr/%7Emine/enseignement/l3/correction/interp_ndet.ml">interp_ndet.ml</a>.
(The correction uses abstract syntax trees from the extended language in <a href="http://www.di.ens.fr/%7Emine/enseignement/l3/codeV1x.tgz">codeV1x.tgz</a>)
</p><h3>Extensions</h3>
<p>
You will find in <a href="http://www.di.ens.fr/%7Emine/enseignement/l3/codeV1x.tgz">codeV1x.tgz</a>
an extension of the language with the following additional constructs:
variable declarations (global and local), procedures and functions, <tt>break</tt> and <tt>return</tt> statements, labels and <tt>goto</tt>
statements.
The language is completely backward compatible with the simpler one.
The abstract syntax tree also enriches the previous one with new kinds
of nodes, so that you can simply reuse your interpreter and extend it to
support some or all of the new constructs.
<br><br>
</p><hr>
<br>
<p>
Author: <a href="http://www.di.ens.fr/%7Emine">Antoine Miné</a>
<br><br>
</p></div>
</body></html>
|