Evaluating Interface Efficiency
Comparing results from the Keystroke-Level Model and empirical
timings
For this assignment, you will apply two methods for evaluating
the efficiency of an application's user interface. Here,
efficiency is defined as the amount of time it takes a practiced
user to complete a core task of the application.
The first method is Card, Moran, and Newell's Keystroke-Level
Model (KLM), which we have already discussed and demonstrated in
class. This analytical method predicts the time needed to
complete a task. As you recall, the method assumes an
experienced user who performs the task without slips or
mistakes.
The second method involves timing users as they complete the
task. Because you will compare the results of the two methods,
make sure that your users are experienced with the application
before you time them.
Assignment procedure
Using the application assigned to you, specify a
representative task. You will be evaluating the efficiency for
this task using both the KLM (analytical method) and actual user
timings (empirical method).
Analysis for the Keystroke-Level Model
- Prepare an outline of the abstract steps that an experienced
user would take when completing the task.
- List the primitive physical operators (i.e., K, P,
H, and R) needed to accomplish each abstract step.
- Add the M operators according to the KLM rules.
- Total the time constants to produce the predicted time
needed to complete the task.
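The final totaling step can be sketched in a few lines of Python. This is only an illustration: the operator times below are commonly cited values from Card, Moran, and Newell, so use the constants given in class if they differ. The function name `predict_time`, the `response_seconds` parameter, and the example operator sequence are all hypothetical.

```python
# Operator times in seconds. These are commonly cited values from Card, Moran,
# and Newell; substitute the constants given in class if they differ.
OPERATOR_SECONDS = {
    "K": 0.20,  # keystroke or button press (average skilled typist)
    "P": 1.10,  # point with the mouse to a target on screen
    "H": 0.40,  # home the hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def predict_time(sequence, response_seconds=0.0):
    """Sum the operator times for a sequence such as 'M P K H M K K K K K'.

    R (system response) has no fixed constant, so every R in the sequence is
    charged response_seconds, which must be measured for the application.
    """
    total = 0.0
    for op in sequence.split():
        if op == "R":
            total += response_seconds
        else:
            total += OPERATOR_SECONDS[op]
    return total


# Hypothetical task: think, point to a field, click, home to the keyboard,
# think, then type four characters and press Enter.
example = "M P K H M K K K K K"
print(f"Predicted time: {predict_time(example):.2f} s")  # Predicted time: 5.40 s
```

Writing the sequence out as a string keeps the operator list from the earlier steps visible next to the total, which makes it easier to check the placement of each M.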
Timing users completing the task
- Create the instructions for completing the task that you
will present to your test users.
- For each test user:
  - Allow the user to become familiar with the application
    by performing similar tasks.
  - Present the task instructions to the user.
  - Time how long it takes the user to complete the task.
- Collect timings from at least 12 different users and compute
the following:
  - Average
  - Standard deviation
  - 95% confidence interval of the average
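As a rough sketch, the three statistics can be computed with Python's standard library. The example assumes a t-based interval, mean ± t × s / sqrt(n), which is a common choice for small samples. The function name `summarize`, the small table of critical values, and the timings themselves are made up for illustration.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom
# (n - 1). Only a few sample sizes are tabulated here; extend as needed.
T_CRITICAL_95 = {11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 19: 2.093}


def summarize(timings):
    """Return (mean, sample standard deviation, (ci_low, ci_high))."""
    n = len(timings)
    mean = statistics.mean(timings)
    sd = statistics.stdev(timings)  # sample standard deviation (divides by n - 1)
    half_width = T_CRITICAL_95[n - 1] * sd / math.sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)


# Made-up timings in seconds for 12 users, for illustration only.
timings = [5.1, 6.3, 5.8, 7.0, 5.5, 6.1, 6.8, 5.9, 6.4, 5.2, 6.6, 6.0]
mean, sd, (low, high) = summarize(timings)
print(f"mean = {mean:.2f} s, sd = {sd:.2f} s, 95% CI = [{low:.2f}, {high:.2f}] s")
```

A spreadsheet or statistics package will give the same numbers; the point of the sketch is only to show which quantities go into the interval.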
Discussion questions
- What are some possible reasons why the KLM might produce
an inaccurate estimate of actual task completion times?
- What are some possible reasons why the average of the
collected times might produce an inaccurate estimate of how long
an experienced user would take under real usage conditions?
- How does the predicted time of the KLM compare to the
average of the collected timings? Does the predicted time lie
within the confidence interval of the average? If not, what
might account for the discrepancy?
- How might you change the way you collect timings so that a
narrower confidence interval is produced?
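For the question about whether the predicted time lies within the confidence interval, the check itself is a simple comparison. The sketch below assumes a symmetric interval, so that its midpoint equals the average of the collected timings. The function name `compare` and all of the numbers are hypothetical.

```python
def compare(predicted, ci_low, ci_high):
    """Report whether a KLM prediction falls inside a confidence interval."""
    midpoint = (ci_low + ci_high) / 2  # the sample mean, for a symmetric interval
    return {
        "inside_interval": ci_low <= predicted <= ci_high,
        "difference_seconds": predicted - midpoint,
        "percent_difference": 100 * (predicted - midpoint) / midpoint,
    }


# Hypothetical numbers: a 5.40 s prediction against an interval of 5.70 to 6.40 s.
result = compare(5.40, 5.70, 6.40)
print(result["inside_interval"])  # False: the prediction is below the interval
```

Reporting the signed difference alongside the yes/no answer shows whether the KLM under- or over-predicted, which is useful when explaining a discrepancy.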