CSC 533: Organization of Programming Languages
Spring 2010

HW6: Advanced Scheme


PART 1: DNA Translations

In biology, the four bases that make up messenger RNA (mRNA) are adenine (A), cytosine (C), guanine (G), and uracil (U). Within a cell, an mRNA sequence is processed by a ribosome and translated into amino acids. Each grouping of 3 bases is known as a codon, and is translated to a particular amino acid. For example, the codon GCU translates to Alanine. The following Scheme association list (defined in dna.scm) maps each codon (represented by a triplet of base symbols) to the abbreviation for the corresponding amino acid. (define ACIDS '( ((U U U) F) ((U U C) F) ((U U A) L) ((U U G) L) ((U C U) S) ((U C C) S) ((U C A) S) ((U C G) S) ((U A U) Y) ((U A C) Y) ((U A A) *) ((U A G) *) ((U G U) C) ((U G C) C) ((U G A) *) ((U G G) W) ((C U U) L) ((C U C) L) ((C U A) L) ((C U G) L) ((C C U) P) ((C C C) P) ((C C A) P) ((C C G) P) ((C A U) H) ((C A C) H) ((C A A) Q) ((C A G) Q) ((C G U) R) ((C G C) R) ((C G A) R) ((C G G) R) ((A U U) I) ((A U C) I) ((A U A) I) ((A U G) M) ((A C U) T) ((A C C) T) ((A C A) T) ((A C G) T) ((A A U) N) ((A A C) N) ((A A A) K) ((A A G) K) ((A G U) S) ((A G C) S) ((A G A) R) ((A G G) R) ((G U U) V) ((G U C) V) ((G U A) V) ((G U G) V) ((G C U) A) ((G C C) A) ((G C A) A) ((G C G) A) ((G A U) N) ((G A C) N) ((G A A) E) ((G A G) E) ((G G U) G) ((G G C) G) ((G G A) G) ((G G G) G) ))

Download a copy of this file and add the following definitions. Note: you should comment each function with a description of its behavior.

  1. Define a function named translate that takes one input, a list representing a codon, and returns the abbreviation for the corresponding amino acid. For example, (translate '(G G A)) should evaluate to G.

  2. Define a function named translate-sequence that takes one input, an arbitrarly long sequence of bases, and returns the corresponding list of amino acid abbreviations. That is, the first symbol in the returned list should be the amino acid corresponding to the first three bases, the second symbol should be the amino acid corresponding to the next three bases, etc. If the length of the sequence is not a multiple of three, any trailing bases should be ignored. For example, (translate-sequence '(G G A U U C A C C C)) should evaluate to (G F T).

  3. When scientists extract mRNA samples from a cell, they invariably obtain partial sequences. Since the start of the sequence may be unknown, it is unclear exactly how the codons will be grouped in an actual translation. For example, in the sequence GGAUUCACCC the codons could be grouped starting at the beginning (yielding codons GGA UUC ACC), at the second base (yielding codons GAU UCA CCC), or at the third base (yielding codons AUU CAC). In addition, it is possible that the sequence has been read backwards, so the actual sequence to be translated should be CCCACUUAGG.

    Define a function named translate-all-possible that takes one input, an arbitrarly long sequence of bases, and returns a list containing all six possible translations of that sequence. Any extra bases at the beginning or end of the sequence should be ignored in the translation. For example, (translate-all-possible '(G G A U U C A C C C)) should evaluate to ((G F T) (N S P) (I H) (P T *) (P L R) (H L)).

PART 2: Binary Trees

In class, we discussed how structured lists could be used to represent binary trees. For example, the following binary tree is represented by the list to the right:
(define TREE1 '(dog (bird (aardvark () ()) (cat () ())) (possum (frog () ()) (wolf () ()))))

Several utility functions (empty?, root, left-subtree, and right-subtree) were provided for you in (defined in tree.scm). Download this file and add the following functions for manipulating binary trees.

  1. Define a function named num-nodes that takes one input, a list representing a binary tree, and returns the number of values stored in that tree. For example, given the TREE1 structure above, (num-nodes TREE1) should evaluate to 7.

  2. Define a function named leftmost that takes one input, a list representing a binary tree, and returns the value that is stored leftmost in the tree. For example, given the TREE1 structure above, (leftmost TREE1) should evaluate to aardvark. Note: you may not assume that the input is a binary search tree, so the leftmost value may not be the smallest value in the tree.

  3. Similarly, define a function named rightmost that takes one input, a list representing a binary tree, and returns the value that is stored rightmost in the tree. For example, given the TREE1 structure above, (rightmost TREE1) should evaluate to wolf. Note: you may not assume that the input is a binary search tree, so the righmost value may not be the largest value in the tree.

  4. Define a function named contains? that takes two inputs, a list representing a binary tree and a value, and returns #t if the list contains the value (otherwise #f). For example, given the TREE1 structure above, (contains? TREE1 'emu) should evaluate to #f while (contains? TREE1 'frog) should evaluate to #t. Note: you may not assume that the input is a binary search tree, so the contains-bst? function from the lectures will not suffice here.

  5. Define a function named dupes? that takes one input, a list representing a binary tree, and returns #t if the list contains any duplicate values (otherwise #f). For example, given the TREE1 structure above, (dupes? TREE1) should evaluate to #f.

PART 3: Hailstone Sequence

An interesting unsolved problem in mathematics concerns what is called the hailstone sequence of integers. This sequence is defined as follows: start with any integer. If that number is odd, then multiply it by 3 and add 1. If it is even, then divide it by 2. Now, repeat. For example, if we start with the number 5, we get the following sequence: 5, 16, 8, 4, 2, 1, 4, 2, 1, 4, 2, 1, . . . Here, the subsequence 4, 2, 1 is reached which produces a loop. It is conjectured that no matter what number you start with, you will always end up stuck in the 4, 2, 1 loop. It has, in fact, been shown to hold for all starting values less than 1,200,000,000,000 . However, it still has not been proven for all numbers.

Define a function named hailstone which has one input, the starting number, and which displays the hailstone sequence starting with that number and ending with 1. The function should return the length of the sequence. For example, (hailstone 1) should display the lone number 1 and evaluate to 1, while (hailstone 5) should display 5 16 8 4 2 1 and evaluate to 6. Your function should only utilize tail-recursion (perhaps a help function will be required). Save your function in a file named hailstone.scm.


PART 4: Eliza

One of the most famous/infamous programs from Artificial Intelligence was Joseph Weizenbaum's Eliza. Written in 1966, the Eliza program was intended to demonstrate the superficiality of AI programs at the time. Despite its lack of deep understanding, the program became extremely popular and earned a place in AI history.

The Eliza program is intended to simulate an interview with psychotherapist. The user interacts with the therapist by entering a comment or question, to which the therapist responds with a comment or question of its own. Despite the fact that the program utilizes no language understanding, it is able to often to produce realistic responses due to its pattern-matching capabilities.

A very basic implementation of Eliza can be found in eliza.scm. Download a copy of this file and make the following modifications to the Eliza rules:

  1. The Eliza rules currently include a rule that looks for the word "computer", and responds accordingly. However, this rule will not catch the plural "computers". Add a rule to the database that simiarly responds to user input containing "computers".
  2. Add a rule to the Eliza rule database that recognizes the phrase "I hate ___" in the user's input. When this phrase is found anywhere in the user input, Eliza should respond with either "Hate is such an intense emotion" or else a response of the form "What is it that you hate about ___".
  3. Add at least three more rules to the Eliza rule database of your own design.