CSC 533: Organization of Programming Languages
Fall 2003

HW3: Implementing a Simple Interpreter

For this assignment, you are to write an interpreter for a simple programming language named SILLY (Simple, Interpreted, Limited Language for You). For now, the SILLY language only contains two types of statements: assignments and output statements. The grammar rules for the SILLY language are as follows:

    <program> --> 'begin' { <statement> } 'end'
    <statement> --> <assignment> | <output>
    <assignment> --> <identifier> '=' <expression>
    <expression> --> <term> { ('+' | '-') <term> }
    <term> --> <integer> | <identifier>
    <output> --> 'output' ( <string> | <expression> )
    <identifier> --> <letter> [ <digit> ]
    <string> --> '"' { <letter> | <digit> | ' ' } '"'
    <integer> --> <digit> { <digit> }
    <letter> --> 'a' | 'b' | 'c' | ... | 'z' | 'A' | 'B' | 'C' | ... | 'Z'
    <digit> --> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

The SILLY language is case sensitive, so a and A are distinct variables. Variables are not explicitly declared; a variable that has not yet been assigned is assumed to have the value 0. An assignment statement assigns an integer value to a variable. An output statement displays a single value (string or integer) on a line by itself. Within the program file, whitespace (any sequence of spaces, tabs, and returns) separates the individual tokens (language elements) in a program.
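One way to get the "unassigned variables are 0" rule is to keep variable values in a map and treat a missing name as 0. The sketch below is only an illustration: the class name VariableTable and its member functions are hypothetical and are not part of the provided code.

```cpp
#include <map>
#include <string>

// Hypothetical helper class (not part of the assignment's provided code).
// Stores SILLY variable values by name; names are case sensitive because
// std::string comparison is case sensitive.
class VariableTable {
public:
    // Returns the value of a variable; a name never assigned reads as 0.
    int Lookup(const std::string& name) const {
        std::map<std::string, int>::const_iterator it = values.find(name);
        return (it == values.end()) ? 0 : it->second;
    }

    // Stores (or overwrites) the value of a variable.
    void Assign(const std::string& name, int value) {
        values[name] = value;
    }

private:
    std::map<std::string, int> values;
};
```

With this sketch, looking up y before any assignment yields 0, and assigning to a leaves A untouched.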

Your SILLY interpreter should read the program from a file specified by the user and display the output that would be produced by the SILLY program. If a syntax error is encountered, the interpreter should display "SYNTAX ERROR" and halt. For example:

SAMPLE PROGRAM                  OUTPUT

begin                           the number is
    x = 0                       0
    output "the number is"      7
    output x                    done
    p2 = x + 2
    output x + p2 + 5
    output "done"
end

begin                           x0 and y are
    x0 = y + 1                  1
    output "x0 and y are"       0
    output x0                   SYNTAX ERROR
    output y
    x4 = x0 +
    output "done"
end
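The <expression> and <term> rules map naturally onto a pair of functions, one per rule. The following is a minimal sketch under stated assumptions, not the required design: plain strings stand in for the provided Token class, a std::map holds the variables, and all names (TokenList, Variables, IsInteger, IsIdentifier, EvalTerm, EvalExpression) are hypothetical. A syntax error is reported here by throwing an exception, which a caller could catch to display "SYNTAX ERROR" and halt.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Assumption: strings stand in for Token objects from the provided classes.
typedef std::vector<std::string> TokenList;
typedef std::map<std::string, int> Variables;

// <integer> --> <digit> { <digit> }
bool IsInteger(const std::string& s) {
    if (s.empty()) return false;
    for (std::string::size_type i = 0; i < s.size(); i++) {
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    }
    return true;
}

// <identifier> --> <letter> [ <digit> ]
bool IsIdentifier(const std::string& s) {
    if (s.size() < 1 || s.size() > 2) return false;
    if (!std::isalpha(static_cast<unsigned char>(s[0]))) return false;
    return s.size() == 1 || std::isdigit(static_cast<unsigned char>(s[1])) != 0;
}

// <term> --> <integer> | <identifier>
int EvalTerm(const TokenList& tokens, std::size_t& pos, const Variables& vars) {
    if (pos >= tokens.size()) throw std::runtime_error("SYNTAX ERROR");
    const std::string& tok = tokens[pos];
    int value = 0;
    if (IsInteger(tok)) {
        value = std::atoi(tok.c_str());
    } else if (IsIdentifier(tok)) {
        Variables::const_iterator it = vars.find(tok);
        if (it != vars.end()) value = it->second;   // unassigned reads as 0
    } else {
        throw std::runtime_error("SYNTAX ERROR");
    }
    pos++;
    return value;
}

// <expression> --> <term> { ('+' | '-') <term> }
int EvalExpression(const TokenList& tokens, std::size_t& pos, const Variables& vars) {
    int value = EvalTerm(tokens, pos, vars);
    while (pos < tokens.size() && (tokens[pos] == "+" || tokens[pos] == "-")) {
        bool add = (tokens[pos] == "+");
        pos++;
        int rhs = EvalTerm(tokens, pos, vars);
        value = add ? value + rhs : value - rhs;
    }
    return value;
}
```

In the second sample program, the tokens x0 + output would make EvalTerm throw, because output is neither an integer nor an identifier under the grammar. The sample also shows earlier output appearing before SYNTAX ERROR, which suggests that statements are executed as they are read rather than after the whole program has been checked.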

Two useful classes are provided for your use:

Token (Token.h and Token.cpp)
This class encapsulates a token (language element), using an enumerated type to distinguish the different token types.
Tokenizer (Tokenizer.h and Tokenizer.cpp)
This class defines a token reader, which can read individual tokens from an input file. It provides two member functions: TokensRemain(), which returns true if there are any remaining tokens to be read, and GetToken(), which reads and returns the next token from the file.

To demonstrate the workings of these classes, the program demo.cpp utilizes a Tokenizer object to read in tokens from a file and display the tokens with their corresponding types.

Helpful hint 1: To convert a string of digits into its corresponding integer value, use the atoi function from the <cstdlib> library. For example, the call atoi("12") would return the integer value 12.
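Note that atoi takes a C-style string, so a std::string must be passed through c_str(). A small sketch follows; the function name IntegerValue is hypothetical.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: converts the text of an integer token to its value.
// atoi reports no error and returns 0 when no conversion can be performed,
// so the caller should already have checked that the text is all digits.
int IntegerValue(const std::string& digits) {
    return std::atoi(digits.c_str());
}
```

For example, IntegerValue("12") returns 12 and IntegerValue("007") returns 7.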

Helpful hint 2: Be forward thinking. For HW4, you will extend your interpreter to handle additional language features: conditionals and loops. Design your solution to this assignment with extensibility in mind.