CSC 221: Computer Programming I
Fall 2005

HW5: File and Text Processing


The Flesch readability index was invented by Dr. Rudolf Flesch as a tool for estimating how difficult a document is to read and comprehend. The index does not consider the meaning of words, only their lengths and the lengths of sentences, in order to assign a readability index to the document. The higher the readability index, the easier a document is to comprehend. Flesch readability indexes are often translated into the educational level that is usually necessary to understand a document, as shown in the table below:

Flesch Index    Educational Level      Example
90-100          5th grader             Comics
80-90           6th grader             Consumer ads
70-80           7th grader             Alice in Wonderland
65-70           8th grader             Sports Illustrated
50-65           High school student    Time Magazine
30-50           College student        New York Times
0-30            College graduate       Auto Insurance
< 0             Law school graduate    IRS tax code
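
The table above can be read as a simple lookup from index to educational level. The sketch below is one possible way to express it in Java; the class name FleschLevel and the method name levelFor are hypothetical and not part of the assignment. Because adjacent ranges in the table share their endpoints (for example, 80-90 and 90-100), the sketch assumes that a boundary value belongs to the higher range, and that any index above 100 is treated as 5th grader level.

```java
public class FleschLevel {
    /**
     * Maps a Flesch index to the educational level from the table.
     * Assumption: each range includes its lower bound, so 90.0 counts
     * as "5th grader" rather than "6th grader".
     */
    public static String levelFor(double index) {
        if (index >= 90) return "5th grader";
        if (index >= 80) return "6th grader";
        if (index >= 70) return "7th grader";
        if (index >= 65) return "8th grader";
        if (index >= 50) return "High school student";
        if (index >= 30) return "College student";
        if (index >= 0)  return "College graduate";
        return "Law school graduate";
    }

    public static void main(String[] args) {
        // Sample indexes chosen to fall in different rows of the table.
        double[] samples = {95.0, 84.2, 28.4, -5.0};
        for (double index : samples) {
            System.out.println(index + " -> " + levelFor(index));
        }
    }
}
```

Checking the ranges from highest to lowest keeps each test to a single comparison, since every earlier test has already ruled out the higher ranges.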

The Flesch readability index for a document is calculated using the total number of sentences, words, and syllables in the document:

Flesch Index = 206.835 - 84.6 * (avg # of syllables per word) - 1.015 * (avg # of words per sentence)
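
As a minimal sketch of the arithmetic only, the formula can be written as a Java method that takes the three totals as parameters. The class name FleschFormula, the method name fleschIndex, and the sample counts are hypothetical; how the sentences, words, and syllables are actually counted is left to the classes described later in the assignment.

```java
public class FleschFormula {
    /**
     * Computes the Flesch index from document totals.
     * Assumption: the caller has already counted syllables, words,
     * and sentences; this method only applies the formula.
     */
    public static double fleschIndex(int syllables, int words, int sentences) {
        if (words <= 0 || sentences <= 0) {
            throw new IllegalArgumentException(
                "need at least one word and one sentence");
        }
        // Cast to double so the averages are not truncated by integer division.
        double syllablesPerWord = (double) syllables / words;
        double wordsPerSentence = (double) words / sentences;
        return 206.835 - 84.6 * syllablesPerWord - 1.015 * wordsPerSentence;
    }

    public static void main(String[] args) {
        // Hypothetical totals: 150 syllables, 100 words, 5 sentences.
        System.out.println(fleschIndex(150, 100, 5));
    }
}
```

With the hypothetical totals above, the averages are 1.5 syllables per word and 20 words per sentence, so the index is 206.835 - 126.9 - 20.3, or roughly 59.6.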

The purpose of the index is to enable authors to assess the difficulty of their writing and, subsequently, to guide them in revising the text to match its intended audience. For example, the following sentence has a Flesch index of 28.4, corresponding to a college graduate reading level.

    The above index was invented by Flesch as a simple tool to estimate the legibility of a document without linguistic analysis.

By splitting the sentence into two and substituting shorter words, the following translation has a Flesch index of 84.2, corresponding to a 6th grader reading level.

    Flesch invented an index to check whether a text is easy to read. To compute the index, you do not need to look at the meaning of the words.

It is possible to set up Microsoft Word so that it will calculate and display the Flesch readability index for a file. For this assignment, however, you will write a collection of Java classes that will read a file, calculate its readability index, and determine its corresponding reading level. To keep things relatively simple, we will make the following assumptions:

The Word Class

The Document Class

You should create a few simple text files to test your Document class. Create the files in any text editor, such as Notepad, and save them in your BlueJ project folder. You can then create Document objects by specifying the file names in constructor calls.

Once you are sure that your class works correctly, you can execute it on larger documents, such as the following:

Work                                                              Flesch Index    Educational Level
Alice's Adventures in Wonderland, by Lewis Carroll                76.5            7th grader
The Gettysburg Address, by Abraham Lincoln                        64.1            high school student
Relativity: The Special and General Theory, by Albert Einstein    43.4            college student