CSC 533: Organization of Programming Languages
Fall 2002

HW3: Implementing a Simple Interpreter

For this assignment, you are to write an interpreter for a simple programming language named SILLY (Simple, Interpreted, Limited Language for You). For now, the SILLY language only contains two types of statements: assignments and output statements. The grammar rules for the SILLY language are as follows:

<program>    --> 'begin' { <statement> } 'end'
<statement>  --> <assignment> | <output>
<assignment> --> <identifier> '=' <expression>
<expression> --> <term> { '+' <term> }
<term>       --> <integer> | <identifier>
<output>     --> 'output' ( <string> | <expression> )
<identifier> --> <letter> [ <digit> ]
<string>     --> '"' { <letter> | <digit> | ' ' } '"'
<integer>    --> <digit> { <digit> }
<letter>     --> 'a' | 'b' | 'c' | ... | 'z' | 'A' | 'B' | 'C' | ... | 'Z'
<digit>      --> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

The SILLY language is case sensitive, so a and A are distinct variables. Variables are not explicitly declared; a variable that has not been assigned a value is assumed to have the value 0. An assignment statement assigns an integer value to a variable. An output statement displays either a string constant or the value of an expression, followed by a new line. Whitespace (any sequence of spaces, tabs, and returns) separates the individual tokens (language elements) in a program.
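One possible way to model these variable rules is sketched below. It assumes a std::map keyed by variable name; the names VariableTable, lookup, and assign are hypothetical and are not part of the provided classes.

```cpp
#include <map>
#include <string>

// Hypothetical helper type: maps a SILLY variable name to its integer value.
// std::string keys compare case-sensitively, so "a" and "A" are separate entries.
typedef std::map<std::string, int> VariableTable;

// Look up a variable; a name that was never assigned yields 0
// and is not added to the table.
int lookup(const VariableTable& vars, const std::string& name) {
    VariableTable::const_iterator it = vars.find(name);
    return (it == vars.end()) ? 0 : it->second;
}

// Assign a value, creating the variable on first use.
void assign(VariableTable& vars, const std::string& name, int value) {
    vars[name] = value;
}
```

With this sketch, calling lookup on an unassigned name such as y returns 0, which matches the rule that variables start at 0.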

Your SILLY interpreter should read the program from standard input (i.e., cin) and display the output that would be produced by the SILLY program. If a syntax error is encountered, the interpreter should display "SYNTAX ERROR" and halt. For example:

SAMPLE PROGRAM               OUTPUT

begin
  x = 0
  output "the number is"     the number is
  output x                   0
  p2 = x + 2
  output x + p2 + 5          7
  output "done"              done
end

begin
  x0 = y + 1
  output "x0 and y are"      x0 and y are
  output x0                  1
  output y                   0
  x4 = x0 +                  SYNTAX ERROR
  output "done"
end
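The second sample fails because a '+' must be followed by a term, and output is neither an integer nor an identifier. The sketch below shows one way to evaluate the <expression> rule and detect that case. It is a simplification: tokens are assumed to be plain strings already split out of the input, rather than objects of the provided Token class, and the function names are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// <integer> --> <digit> { <digit> }
bool isInteger(const std::string& s) {
    if (s.empty()) return false;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    return true;
}

// <identifier> --> <letter> [ <digit> ]
bool isIdentifier(const std::string& s) {
    if (s.empty() || s.size() > 2) return false;
    if (!std::isalpha(static_cast<unsigned char>(s[0]))) return false;
    return s.size() == 1 || std::isdigit(static_cast<unsigned char>(s[1])) != 0;
}

// <term> --> <integer> | <identifier>
// Consumes one token at pos and returns its value; throws on anything else.
int evalTerm(const std::vector<std::string>& tokens, std::size_t& pos,
             const std::map<std::string, int>& vars) {
    if (pos >= tokens.size()) throw std::runtime_error("SYNTAX ERROR");
    const std::string& tok = tokens[pos];
    if (isInteger(tok)) {
        ++pos;
        return std::atoi(tok.c_str());
    }
    if (isIdentifier(tok)) {
        ++pos;
        std::map<std::string, int>::const_iterator it = vars.find(tok);
        return (it == vars.end()) ? 0 : it->second;  // unassigned variables are 0
    }
    throw std::runtime_error("SYNTAX ERROR");
}

// <expression> --> <term> { '+' <term> }
// Stops at the first token after a term that is not "+".
int evalExpression(const std::vector<std::string>& tokens, std::size_t& pos,
                   const std::map<std::string, int>& vars) {
    int total = evalTerm(tokens, pos, vars);
    while (pos < tokens.size() && tokens[pos] == "+") {
        ++pos;  // consume '+'
        total += evalTerm(tokens, pos, vars);
    }
    return total;
}
```

In this sketch a syntax error is reported by throwing an exception; the caller would catch it, display "SYNTAX ERROR", and halt. Other designs, such as returning an error flag, would work equally well.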

Two useful classes are provided for your use:

Token (Token.h and Token.cpp)
This class encapsulates a token (language element), using an enumerated type to distinguish the different token types.
Tokenizer (Tokenizer.h and Tokenizer.cpp)
This class defines a token reader, which can read individual tokens from an input stream. It provides two member functions: TokensRemain(), which returns true if there are any remaining tokens to be read, and GetToken(), which reads and returns the next token from the stream.

To demonstrate the workings of these classes, the program demo.cpp utilizes a Tokenizer object to read in tokens from standard input and display the tokens with their corresponding types.

Helpful hint 1: Using Visual C++, it is possible to redirect the contents of a file to standard input. To do this, select "Settings" under the "Project" menu, then click on the "Debug" tab. In the box labeled "Program arguments:", type a less-than symbol followed by a file name. For example, typing "< silly1.txt" would cause standard input to read from silly1.txt.

Helpful hint 2: To convert a string of digits into its corresponding integer value, use the atoi function from the <cstdlib> library. For example, the call atoi("12") would return the integer value 12.
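Note that atoi takes a C-style string (const char*), so a std::string must be converted with c_str() first. A minimal sketch, in which the helper name toInteger is hypothetical:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: convert the text of an integer token
// (assumed to contain only digits) into its int value.
int toInteger(const std::string& digits) {
    return std::atoi(digits.c_str());
}
```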

Helpful hint 3: Be forward thinking. For HW4, you will extend your interpreter to handle additional language features: conditionals and loops. Design your solution to this assignment with extensibility in mind.
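As one illustration of what an extensible design might look like (not a required structure), each kind of statement could be a class derived from a common base, so that a new statement type is added as a new derived class. All class names below are hypothetical, and the two statement classes are deliberately simplified: the assignment stores a precomputed value rather than an expression, and the output statement handles only string constants.

```cpp
#include <iostream>
#include <map>
#include <string>

// Hypothetical base class: every statement kind knows how to execute itself
// against the variable table and an output stream.
class Statement {
public:
    virtual ~Statement() {}
    virtual void execute(std::map<std::string, int>& vars,
                         std::ostream& out) const = 0;
};

// Simplified assignment: stores a fixed value in a variable.
// A fuller version would hold an expression and evaluate it here.
class Assignment : public Statement {
public:
    Assignment(const std::string& name, int value)
        : name_(name), value_(value) {}
    virtual void execute(std::map<std::string, int>& vars,
                         std::ostream&) const {
        vars[name_] = value_;
    }
private:
    std::string name_;
    int value_;
};

// Simplified output statement: prints a string constant and a new line.
class OutputString : public Statement {
public:
    explicit OutputString(const std::string& text) : text_(text) {}
    virtual void execute(std::map<std::string, int>&,
                         std::ostream& out) const {
        out << text_ << '\n';
    }
private:
    std::string text_;
};
```

Under this arrangement, the main loop only calls execute through a Statement pointer or reference, so it would not need to change when conditionals and loops are introduced.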