CUE-3 – Inspection

The purpose of CUE-3 was to investigate whether inspection results from independently conducted professional inspections differed as much as usability test results. It turned out that they did.

CUE-3 was a comparative usability inspection of the www.avis.com website conducted in August and September 2001. Eleven Danish usability professionals independently evaluated the site, each using their favorite inspection technique.


Practitioner’s Take Away

The reported results from the 11 usability inspections of the Avis rent-a-car website were very different. Each of the 11 evaluators reported an average of 35 (16%) of the 220 problems. The overlap in reported problems between any two evaluators averaged only 9%. As many as 174 (79%) of the problems were reported by just one or two evaluators. Hence, the 11 inspections exhibit a substantial evaluator effect.
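The figures above can be reproduced from a table of which evaluator reported which problem. The following is a minimal sketch, not the study's own analysis script. It assumes that the overlap between two evaluators is measured as the problems both reported divided by the problems either reported, which is one common definition of any-two agreement; the summary above does not state the exact formula. The function names and the toy data are hypothetical, and every evaluator is assumed to report at least one problem.

```python
from itertools import combinations


def any_two_agreement(reports):
    """Mean of |A & B| / |A | B| over all pairs of evaluators (assumed overlap measure)."""
    pairs = list(combinations(reports.values(), 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def mean_detection_rate(reports):
    """Average share of all known problems that a single evaluator reported."""
    all_problems = set().union(*reports.values())
    total_reported = sum(len(found) for found in reports.values())
    return total_reported / (len(reports) * len(all_problems))


def share_found_by_at_most(reports, k):
    """Share of all known problems that were reported by k or fewer evaluators."""
    all_problems = set().union(*reports.values())
    counts = [sum(p in found for found in reports.values()) for p in all_problems]
    return sum(1 for c in counts if c <= k) / len(all_problems)


# Hypothetical toy data: three evaluators, problems identified by number.
toy_reports = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {1, 5, 6},
}

print(f"any-two agreement:   {any_two_agreement(toy_reports):.2f}")
print(f"mean detection rate: {mean_detection_rate(toy_reports):.2f}")
print(f"found by one only:   {share_found_by_at_most(toy_reports, 1):.2f}")

# Sanity check of the percentages quoted in the text.
assert round(35 / 220 * 100) == 16
assert round(174 / 220 * 100) == 79
```

With the real data, `reports` would hold 11 sets drawn from the 220 problems, and `share_found_by_at_most(reports, 2)` would correspond to the share of problems reported by just one or two evaluators.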

The evaluator effect would be less critical if severe problems were reported more consistently than cosmetic problems, which have little impact on a website’s usability. A problem was defined as severe if it appeared in one or more executive summaries. Each evaluator reported an average of 24% of the 33 severe problems. Seventeen (52%) of the severe problems were reported by just one or two evaluators. Hence, the evaluator effect persisted for severe problems.

The substantial differences in the individual reports stand in stark contrast to the perception the evaluators acquired during the group work. They left the group work with a strong, and reassuring, feeling of agreement. This became evident during the plenary session, as exemplified by the following quotes from five of the evaluators:

  • “I was surprised to see how little we disagreed.”
  • “A very high level of agreement.”
  • “It is not that subjective after all. There is consensus about what the problems are.”
  • “General agreement, but a number of concrete details differ.”
  • “We are all in agreement. We haven’t made the same observations, though.”

Nobody countered these statements.

Read more in our paper about CUE-3.

Paper about CUE-3

Morten Hertzum, Niels Ebbe Jacobsen, and Rolf Molich, “Usability Inspections by Groups of Specialists: Perceived Agreement in Spite of Disparate Observations,” CHI2002 Extended Abstracts, ACM Press, pp. 662-663, www.acm.org (2 pages, PDF, 19 KB).

Available Downloads

  • The proposal and scenario for CUE-3. The proposal describes the background and rules for the study, and the scenario shows what each professional received at the start of the study.
    (9 pages, PDF, 27 KB)
  • All evaluation reports in one PDF file. Note that the reports from teams B and E are in Danish; the rest are in English.
    (149 pages, PDF, 5,959 KB)