Implementation of SFL

You will use strings heavily in the implementation, and you will not want to manage memory for each one separately. A good approach is to keep a table of every string that you have seen. Create such a table. Include a function intern(s) that looks up string s in the table and returns the copy stored there. If s is not in the table, intern should insert it first.
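One possible shape for such a table is sketched below in C, as a small chained hash table. The bucket count, the hash function and the names Entry and buckets are assumptions made for illustration; only intern(s) comes from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1021             /* arbitrary prime, chosen for illustration */

typedef struct Entry {
    char *str;                    /* the table's own copy of the string */
    struct Entry *next;           /* next entry in the same bucket */
} Entry;

static Entry *buckets[NBUCKETS];  /* all buckets start out empty */

/* A simple multiplicative string hash. */
static unsigned long hash(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Return the table's copy of s, inserting a copy first if s is new. */
const char *intern(const char *s)
{
    unsigned long b = hash(s) % NBUCKETS;

    for (Entry *e = buckets[b]; e != NULL; e = e->next)
        if (strcmp(e->str, s) == 0)
            return e->str;

    Entry *e = malloc(sizeof *e);
    assert(e != NULL);            /* sketch only: no recovery on failure */
    e->str = malloc(strlen(s) + 1);
    assert(e->str != NULL);
    strcpy(e->str, s);
    e->next = buckets[b];
    buckets[b] = e;
    return e->str;
}
```

Because intern always returns the same pointer for equal strings, two interned strings can be compared with ==, and nothing outside the table ever needs to free a string.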

Lexical analysis

Create a lexical analyzer for SFL using Flex. Examine the forms of programs to see what the tokens are.

Create a header file that defines tokens. It can have a form similar to the following.
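As a rough illustration only, a token header might be organized along these lines. Every token name here is hypothetical, since the real set depends on what you find in SFL programs, and token_name is an assumed helper for the lexer test program, not part of Flex.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes. They start above 255 so that a
   single-character token such as '(' can be returned as its own
   character code. */
enum {
    TOK_ID = 258,      /* identifier; the attribute is the lexeme */
    TOK_NUMBER,        /* numeric literal; the attribute is the lexeme */
    TOK_IF,            /* assumed spelling of a keyword */
    TOK_LET            /* assumed spelling of a keyword */
};

/* The attribute type: a string. */
typedef const char *YYSTYPE;

/* The lexer stores the current token's attribute here. The variable
   itself must be defined in exactly one source file. */
extern YYSTYPE yylval;

/* Printable name of a token, for a program that prints each token.
   The result for a single-character token lives in a static buffer
   that the next such call overwrites. */
const char *token_name(int tok)
{
    static char single[2];

    assert(tok > 0);
    switch (tok) {
    case TOK_ID:     return "ID";
    case TOK_NUMBER: return "NUMBER";
    case TOK_IF:     return "IF";
    case TOK_LET:    return "LET";
    default:
        single[0] = (char)tok;    /* the token is its own character */
        single[1] = '\0';
        return single;
    }
}
```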

Some tokens need attributes. Make the attribute be the lexeme. Store the attribute into a variable called yylval, of type YYSTYPE. (There is method in the madness, so do this.) You want YYSTYPE to be a string type.

Test your lexer. Write a program that just runs the lexer repeatedly, printing each token and attribute. To compile a flex program called lexer.lex, use command

By default, yylex reads from the standard input. To get it to read from a file, include the following in your main program.
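A scanner generated by Flex reads from the global variable yyin, of type FILE *, so redirecting input amounts to assigning an open file to it before the first call to yylex. The helper below is a sketch of that step; open_source is an assumed name, not something Flex provides.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Open a source file for reading. On failure, report the reason on
   the standard error and return NULL. */
FILE *open_source(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    return f;
}

/* In the main program, before the first call to yylex:
 *
 *     extern FILE *yyin;        // defined by the Flex-generated scanner
 *
 *     yyin = open_source(argv[1]);
 *     if (yyin == NULL)
 *         return 1;
 */
```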

Parser

As an initial phase of development, make the parser just read a file and either say nothing if the program is syntactically correct, or give an error message if the program is incorrect. If there is a syntax error, the parser should report the line number where the error occurs.

Bison manages attributes of tokens automatically. It assumes that the token attribute is in variable yylval, of type YYSTYPE. (You see, there is method in the madness.)

Syntax trees

Create a type of abstract syntax trees that describe an expression or definition. Decide on the structure of trees. Define some functions (or constructors) that create particular kinds of nodes in the tree.

Do not try to be too close to the syntax. These trees describe expressions and definitions; they are not direct descriptions of the language syntax. For example, there should be nothing in the syntax tree that corresponds to a parenthesized expression; that is entirely syntactic. You do not need anything that corresponds to a let expression, since those expressions can be translated into a more basic form. You will probably want kinds of nodes that correspond to the following.

Write two functions that print syntax trees. One should show the details of the structure, and is used for debugging. The other writes the tree in a form that looks similar to the language syntax.
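The sketch below shows one way the tree type, its constructors and the debugging printer could fit together. It is only an outline: the field layout and the names (Node, mk_const, show and so on) are assumptions, and it covers just a few node kinds. Note that there is no node kind for parentheses or for let.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum { N_CONST, N_ID, N_OP, N_IF, N_APPLY, N_LAMBDA } Kind;

typedef struct Node {
    Kind kind;
    long value;              /* N_CONST: the number */
    const char *name;        /* N_ID, N_OP, N_LAMBDA: identifier,
                                operator or parameter name */
    struct Node *a, *b, *c;  /* subtrees; unused ones are NULL */
} Node;

static Node *node(Kind k, long v, const char *s, Node *a, Node *b, Node *c)
{
    Node *t = malloc(sizeof *t);
    assert(t != NULL);       /* sketch only: no recovery on failure */
    t->kind = k; t->value = v; t->name = s;
    t->a = a; t->b = b; t->c = c;
    return t;
}

Node *mk_const(long v)            { return node(N_CONST, v, NULL, NULL, NULL, NULL); }
Node *mk_id(const char *x)        { return node(N_ID, 0, x, NULL, NULL, NULL); }
Node *mk_op(const char *op, Node *l, Node *r)
                                  { return node(N_OP, 0, op, l, r, NULL); }
Node *mk_if(Node *c, Node *t, Node *e)
                                  { return node(N_IF, 0, NULL, c, t, e); }
Node *mk_apply(Node *f, Node *x)  { return node(N_APPLY, 0, NULL, f, x, NULL); }
Node *mk_lambda(const char *p, Node *body)
                                  { return node(N_LAMBDA, 0, p, body, NULL, NULL); }

/* Debugging printer: every node is shown with its kind spelled out. */
void show(FILE *out, const Node *t)
{
    switch (t->kind) {
    case N_CONST:
        fprintf(out, "const(%ld)", t->value);
        break;
    case N_ID:
        fprintf(out, "id(%s)", t->name);
        break;
    case N_OP:
        fprintf(out, "op %s(", t->name);
        show(out, t->a); fputs(", ", out); show(out, t->b);
        fputc(')', out);
        break;
    case N_IF:
        fputs("if(", out);
        show(out, t->a); fputs(", ", out);
        show(out, t->b); fputs(", ", out);
        show(out, t->c); fputc(')', out);
        break;
    case N_APPLY:
        fputs("apply(", out);
        show(out, t->a); fputs(", ", out); show(out, t->b);
        fputc(')', out);
        break;
    case N_LAMBDA:
        fprintf(out, "lambda %s(", t->name);
        show(out, t->a);
        fputc(')', out);
        break;
    }
}
```

The second printer, the one that imitates the language syntax, would walk the tree in the same way but emit concrete syntax, adding parentheses only where they are needed.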

Syntax tree construction and table management

Create a table that stores, with each identifier that is defined in a program, the tree that describes its value. Modify the parser so that it constructs a syntax tree for each definition. After the definition is made, it should make an entry in the table, associating the defined identifier with the syntax tree that it names.
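A minimal sketch of such a table follows, written as a linked list for brevity. The names define and lookup are assumptions, Node is a stand-in for your real tree type, and letting a repeated definition replace the earlier one is also an assumption; reporting an error would be another reasonable choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real syntax tree type. */
typedef struct Node { int kind; } Node;

typedef struct Def {
    const char *name;        /* defined identifier; not copied, so it
                                must outlive the table */
    Node *tree;              /* tree describing its value */
    struct Def *next;
} Def;

static Def *defs = NULL;

/* Associate name with tree, replacing any earlier definition. */
void define(const char *name, Node *tree)
{
    for (Def *d = defs; d != NULL; d = d->next)
        if (strcmp(d->name, name) == 0) {
            d->tree = tree;
            return;
        }

    Def *d = malloc(sizeof *d);
    assert(d != NULL);       /* sketch only: no recovery on failure */
    d->name = name;
    d->tree = tree;
    d->next = defs;
    defs = d;
}

/* Return the tree defined for name, or NULL if there is none. */
Node *lookup(const char *name)
{
    for (Def *d = defs; d != NULL; d = d->next)
        if (strcmp(d->name, name) == 0)
            return d->tree;
    return NULL;
}
```

The parser action for a definition would call define once the tree for the right-hand side has been built. If the names come from the string table, the strcmp calls could be replaced by pointer comparisons.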

Note that some expressions need to be converted to large trees. For example, expression [1,2,3] should yield the same tree as 1:2:3:[]. Do not invent too many kinds of abstract syntax tree.
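The list translation can be sketched as a fold from the right: start with the empty list and wrap each element around it, last element first. The node kinds below are a cut-down assumption; in a full implementation N_CONS would most likely be the ordinary operator node for ":" rather than a kind of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { N_NUM, N_NIL, N_CONS } Kind;

typedef struct Node {
    Kind kind;
    long value;                 /* N_NUM only */
    struct Node *head, *tail;   /* N_CONS only */
} Node;

static Node *mk(Kind k, long v, Node *h, Node *t)
{
    Node *n = malloc(sizeof *n);
    assert(n != NULL);          /* sketch only: no recovery on failure */
    n->kind = k; n->value = v; n->head = h; n->tail = t;
    return n;
}

Node *mk_num(long v)            { return mk(N_NUM, v, NULL, NULL); }
Node *mk_nil(void)              { return mk(N_NIL, 0, NULL, NULL); }
Node *mk_cons(Node *h, Node *t) { return mk(N_CONS, 0, h, t); }

/* Build the tree for [e1, ..., en] as e1 : (e2 : ... (en : [])). */
Node *list_to_cons(Node **elems, size_t n)
{
    Node *result = mk_nil();
    while (n > 0)
        result = mk_cons(elems[--n], result);
    return result;
}
```

With this translation the interpreter never sees a bracketed list, only the same nodes that the programmer could have written with ":" directly.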

Test the new parser, having it read each definition in a program, then print the expression that occurs in a definition.

Eliminating names

You will find it awkward to deal with identifiers. Write a function that takes a tree and removes all identifier names from it, as follows.

Do not perform this transformation when a definition is made. A definition needs to be able to refer to later definitions. This transformation is only to be made just before evaluation.
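One common way to remove names, and one that agrees with the substitution rules given in the interpretation section, is to replace each bound identifier by the number of \ nodes between its use and its binder, so that 0 means the nearest enclosing \. The sketch below assumes that numbering. It also assumes that identifiers not bound by any enclosing \ are left as named nodes, to be resolved through the definition table; MAX_DEPTH and all function names are illustrative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH 256    /* assumed limit on the nesting of \ nodes */

typedef enum { N_CONST, N_ID, N_VAR, N_APPLY, N_LAMBDA } Kind;

typedef struct Node {
    Kind kind;
    long value;          /* N_CONST: the constant; N_VAR: the number */
    const char *name;    /* N_ID: the name; N_LAMBDA: parameter name
                            (NULL once names have been removed) */
    struct Node *a, *b;
} Node;

static Node *mk(Kind k, long v, const char *name, Node *a, Node *b)
{
    Node *t = malloc(sizeof *t);
    assert(t != NULL);   /* sketch only: no recovery on failure */
    t->kind = k; t->value = v; t->name = name; t->a = a; t->b = b;
    return t;
}

/* scope[0 .. depth-1] holds the parameter names of the enclosing
   \ nodes, outermost first. */
static Node *strip(const Node *t, const char **scope, int depth)
{
    switch (t->kind) {
    case N_ID:
        /* Search from the innermost binder outward. */
        for (int i = depth - 1; i >= 0; i--)
            if (strcmp(scope[i], t->name) == 0)
                return mk(N_VAR, depth - 1 - i, NULL, NULL, NULL);
        return mk(N_ID, 0, t->name, NULL, NULL);  /* not bound: keep name */
    case N_LAMBDA:
        assert(depth < MAX_DEPTH);
        scope[depth] = t->name;
        return mk(N_LAMBDA, 0, NULL, strip(t->a, scope, depth + 1), NULL);
    case N_APPLY: {
        Node *f = strip(t->a, scope, depth);
        Node *x = strip(t->b, scope, depth);
        return mk(N_APPLY, 0, NULL, f, x);
    }
    default:             /* N_CONST and N_VAR are copied unchanged */
        return mk(t->kind, t->value, NULL, NULL, NULL);
    }
}

/* Return a copy of t in which bound identifiers are numbers. */
Node *eliminate_names(const Node *t)
{
    const char *scope[MAX_DEPTH];
    return strip(t, scope, 0);
}
```

For example, a function of x whose body is a function of y with body x y becomes \ \ (i_1 i_0): x is one binder further out than y.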

Interpretation

Write an interpreter that evaluates an expression, where the expression is described by a syntax tree. The interpreter should assume that identifiers have been converted to numbers, as explained above. The interpreter should reduce the expression to its simplest form (where no more evaluations can be done) and return that simplified form.

The rules for evaluating expressions should be fairly obvious. Use call-by-value. To evaluate expression A B, first evaluate A, and check that its result is a function. Then evaluate B. Then perform a function application by doing a substitution. Here are basic rules for computing the simplest form, also called a normal form. nf(a) is the normal form of tree a. Node kinds should be obvious. For example, op(a, b) is an operator node with subtrees a and b. perform-op indicates the result of performing operator op, and perform-fun indicates the result of performing a given function.

nf(c)	=	c [c a constant]
nf(op(a,b))	=	perform-op(nf(a), nf(b)) [when nf(a) and nf(b) are constants]
nf(op(a,b))	=	op(nf(a), nf(b)) [when nf(a) and nf(b) are not both constants]
nf(fun(f))	=	fun(f) [fun(f) a built-in function]
nf(actor(f))	=	actor(f)
nf(i_n)	=	i_n [i_n a numbered identifier]
nf(if(a,b,c))	=	nf(b) [if nf(a) = true]
	=	nf(c) [if nf(a) = false]
	=	error [otherwise]
nf(apply(a, b))	=	subst(nf(a), nf(b)) [when nf(a) is a \ node]
nf(apply(a, b))	=	perform-fun(nf(a), nf(b)) [when nf(a) is a built-in function]
nf(apply(a, b))	=	apply(nf(a), nf(b)) [when nf(a) is neither a \ node nor a built-in function]
nf(\ a)	=	\ (nf(a))
nf(seq(a,b))	=	seq (a,b)
nf(seqf(a,f))	=	seqf(a,f)

subst(\ a, r)	=	subst1(a, r, 0)
subst1(c, r, n)	=	c [c a constant]
subst1(op(a,b), r, n)	=	op(subst1(a,r,n), subst1(b,r,n))
subst1(fun(f), r, n)	=	fun(f)
subst1(actor(f), r, n)	=	actor(f)
subst1(i_n, r, n)	=	r
subst1(i_n, r, k)	=	i_n [k =/= n]
subst1(if(a,b,c), r, n)	=	if(subst1(a,r,n), subst1(b,r,n), subst1(c,r,n))
subst1(apply(a, b), r, n)	=	apply(subst1(a,r,n), subst1(b,r,n))
subst1(\ a, r, n)	=	\ (subst1(a, r, n+1))
subst1(seq(a,b), r, n)	=	seq(subst1(a,r,n), subst1(b,r,n))
subst1(seqf(a,f), r, n)	=	seqf(subst1(a,r,n), subst1(f,r,n))
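To make the rules concrete, here is a sketch of nf and subst1 in C for a reduced set of node kinds: constants, numbered identifiers, operators, application and \ nodes. The representation and the three operators are assumptions. The sketch also normalizes the result of each substitution again, on the assumption that the substituted body may contain expressions that have only now become reducible.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { N_CONST, N_VAR, N_OP, N_APPLY, N_LAMBDA } Kind;

typedef struct Node {
    Kind kind;
    long value;           /* N_CONST: the number; N_VAR: the identifier
                             number; N_OP: the operator character */
    struct Node *a, *b;   /* subtrees; unused ones are NULL */
} Node;

static Node *mk(Kind k, long v, Node *a, Node *b)
{
    Node *t = malloc(sizeof *t);
    assert(t != NULL);    /* sketch only: no recovery on failure */
    t->kind = k; t->value = v; t->a = a; t->b = b;
    return t;
}

/* subst1(t, r, n): replace numbered identifier i_n in t by r. */
Node *subst1(Node *t, Node *r, long n)
{
    switch (t->kind) {
    case N_VAR:
        return t->value == n ? r : t;
    case N_OP:
        return mk(N_OP, t->value, subst1(t->a, r, n), subst1(t->b, r, n));
    case N_APPLY:
        return mk(N_APPLY, 0, subst1(t->a, r, n), subst1(t->b, r, n));
    case N_LAMBDA:        /* one binder deeper, so look for i_(n+1) */
        return mk(N_LAMBDA, 0, subst1(t->a, r, n + 1), NULL);
    default:              /* constants are unchanged */
        return t;
    }
}

/* perform-op for a few assumed arithmetic operators. */
static Node *perform_op(long op, long x, long y)
{
    switch (op) {
    case '+': return mk(N_CONST, x + y, NULL, NULL);
    case '-': return mk(N_CONST, x - y, NULL, NULL);
    case '*': return mk(N_CONST, x * y, NULL, NULL);
    default:  assert(0 && "unknown operator"); return NULL;
    }
}

/* nf(t): the normal form of t. */
Node *nf(Node *t)
{
    switch (t->kind) {
    case N_OP: {
        Node *a = nf(t->a), *b = nf(t->b);
        if (a->kind == N_CONST && b->kind == N_CONST)
            return perform_op(t->value, a->value, b->value);
        return mk(N_OP, t->value, a, b);
    }
    case N_APPLY: {
        Node *f = nf(t->a);          /* call-by-value: function first, */
        Node *x = nf(t->b);          /* then the argument */
        if (f->kind == N_LAMBDA)
            return nf(subst1(f->a, x, 0));   /* renormalize (assumption) */
        return mk(N_APPLY, 0, f, x);
    }
    case N_LAMBDA:
        return mk(N_LAMBDA, 0, nf(t->a), NULL);
    default:              /* constants and numbered identifiers */
        return t;
    }
}
```

For instance, applying \ (i_0 + 1) to 2 first normalizes the function to itself, substitutes 2 for i_0 to get 2 + 1, and then reduces that to 3.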

Add the interpreter to your implementation. Look up the value of main, replace identifiers in it, run the interpreter on it, and perform it. If the value is an actor, seq or seqf node, then run it, as indicated above. If it is any other kind of node, then print it.