You may work with one (and only one) partner on this assignment.
Your team will submit a single solution, with both names listed, and both team members will share the same grade.
A variety of measures have been developed to characterize the readability of text. Usually, these measures describe readability in terms of grade level, e.g., "This sentence is at a seventh-grade reading level." For this assignment, you will write Python functions for calculating the readability grade level of text files using two different measures, the Flesch-Kincaid grade level and the Gunning Fog grade level.
For example, suppose a text file contained 10 sentences, consisting of 50 words. Those 50 words contained a total of 100 syllables, with 10 of the words having three or more syllables in them. Then,
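As a sketch of that computation, the example can be worked through in Python 3. The formulas here are an assumption: they are the commonly published forms (Flesch-Kincaid: 0.39 × words per sentence + 11.8 × syllables per word − 15.59; Gunning Fog: 0.4 × (words per sentence + 100 × fraction of words with three or more syllables)), so check them against the formulas given for this assignment. The function names fleschKincaid and gunningFog are illustrative only.

```python
def fleschKincaid(sentences, words, syllables):
    # Assumed standard form of the Flesch-Kincaid grade level formula.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunningFog(sentences, words, complexWords):
    # Assumed standard form of the Gunning Fog formula; complexWords is the
    # number of words with three or more syllables.
    return 0.4 * ((words / sentences) + 100 * (complexWords / words))


# The example above: 10 sentences, 50 words, 100 syllables, 10 complex words.
print(round(fleschKincaid(10, 50, 100), 2))  # 9.96
print(round(gunningFog(10, 50, 10), 2))      # 10.0
```

Under these assumed formulas, the example file would sit at roughly a tenth grade reading level by both measures.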
Due to the complexity of the English language, identifying the ends of sentences and the number of syllables in a word can be tricky. To make these tasks manageable, we will make the following simplifications:
"heavy" has two syllables while "Italian" has three syllables.
Define a function named isEndOfSentence that has a single word as input. The function should return True if the word ends in a period, exclamation point, or question mark (ignoring trailing quotation marks). For example, isEndOfSentence("What?") should return True, while isEndOfSentence("So,") should return False.
Hint: to ignore trailing quotation marks, use the string rstrip method. For example, the following assignment will strip trailing quotation marks off of a word and save the resulting string in stripped:
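A minimal sketch of such an assignment, followed by one possible way to use it inside isEndOfSentence. It assumes that both double and single quote characters count as quotation marks; adjust the argument to rstrip if only one kind should be ignored.

```python
def isEndOfSentence(word):
    # Assumption: both " and ' are treated as quotation marks to ignore.
    stripped = word.rstrip("\"'")
    # str.endswith accepts a tuple of suffixes and returns True if any match.
    return stripped.endswith((".", "!", "?"))


print(isEndOfSentence("What?"))    # True
print(isEndOfSentence("So,"))      # False
print(isEndOfSentence('done."'))   # True (the trailing quote is ignored)
```

Note that rstrip treats its argument as a set of characters to remove, not as a literal suffix, so any mix of trailing quote characters is stripped.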
Define a function named countSyllables that has a single word as input. The function should return the number of consecutive vowel sequences in the word. For example, countSyllables("people") should return 2, while countSyllables("Italian") should return 3.
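One way to count consecutive vowel sequences is to walk through the word and count each position where a vowel follows a non-vowel (or starts the word). The sketch below assumes the vowels are a, e, i, o, u, and y; counting y is an assumption made so that "heavy" comes out as two syllables, as in the example given earlier.

```python
def countSyllables(word):
    vowels = "aeiouy"  # assumption: 'y' is treated as a vowel
    count = 0
    previousWasVowel = False
    for ch in word.lower():
        isVowel = ch in vowels
        # A new vowel sequence starts when a vowel follows a non-vowel.
        if isVowel and not previousWasVowel:
            count += 1
        previousWasVowel = isVowel
    return count


print(countSyllables("people"))   # 2
print(countSyllables("Italian"))  # 3
print(countSyllables("heavy"))    # 2
```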
Be sure to test your functions thoroughly before moving on to the next part.
Define a function named gradeLevel that processes a text file, which is selected by the user using a dialog box, and displays its readability grade level using each of the formulas listed above. Since the file may be large, your function should read its contents one line at a time, breaking each line into individual words (using the string split method). It should collect statistics on the words in the file (using the helper functions written in Part 1) and use those statistics to calculate the Flesch-Kincaid grade level and Gunning Fog grade level. In addition to printing the two grade levels, the function should also display the name of the file, and the total number of syllables, words, and sentences in the file. For example:
One special case you will need to watch out for when processing a text file is "words" that contain no syllables. These include numbers, e.g., "2011", and punctuation sequences, e.g., "--". For this assignment, any "word" that contains no syllables (i.e., no vowels) should not contribute to the word count for the file. For example, the sentence "The year is 2011." would be considered to have only 3 words in it.
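The line-by-line processing and the special case above can be sketched as follows. This is only an outline under stated assumptions: collectStats is a hypothetical helper name, it takes a file name directly instead of using a dialog box, y is counted as a vowel, and the final grade level calculations and printing are left out. Note that a "word" such as "2011." is skipped for the word count but still marks the end of a sentence.

```python
import os
import tempfile

VOWELS = "aeiouy"  # assumption: 'y' counts as a vowel


def isEndOfSentence(word):
    return word.rstrip("\"'").endswith((".", "!", "?"))


def countSyllables(word):
    count = 0
    previousWasVowel = False
    for ch in word.lower():
        isVowel = ch in VOWELS
        if isVowel and not previousWasVowel:
            count += 1
        previousWasVowel = isVowel
    return count


def collectStats(filename):
    """Hypothetical helper: tally sentences, words, syllables, complex words."""
    stats = {"sentences": 0, "words": 0, "syllables": 0, "complex": 0}
    with open(filename) as infile:
        for line in infile:              # read one line at a time
            for word in line.split():    # split on whitespace
                if isEndOfSentence(word):
                    stats["sentences"] += 1
                syllables = countSyllables(word)
                if syllables == 0:
                    continue             # e.g. "2011." or "--": not a word
                stats["words"] += 1
                stats["syllables"] += syllables
                if syllables >= 3:
                    stats["complex"] += 1
    return stats


# Small demonstration using a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("The year is 2011.\nWhat a year!\n")
    demoPath = tmp.name
demoStats = collectStats(demoPath)
os.remove(demoPath)
print(demoStats)
# {'sentences': 2, 'words': 6, 'syllables': 6, 'complex': 0}
```

Checking for the end of a sentence before the syllable test is a deliberate choice here: otherwise "The year is 2011." would contribute three words but no sentence.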