[Best Practices]

15 October 2003

Books and Book Lists

I've just started reading Jeffrey Zeldman's book, Designing with Web Standards. It's looking to be a good read (yeah, I got no life). I think Chris has this book, too. I also purchased Eric Meyer's Cascading Style Sheets: The Definitive Guide and its pocket reference. If you want to borrow these books, just let me know.

I know at one time we were going to inventory the books we had among our various desks. I also have quite a few books at home that others might want to borrow (and even read). I think it would be useful to maintain a book list here on URID.org. I'll even volunteer to do it, if you all think it would be worthwhile.

Posted by John | Add/View Comments | Permanent Link | Filed under: Best Practices

30 September 2003

W2W NetRaker Eval

I'm finalizing the usability eval for W2W, and I wanted to ask the team to do a dry run before I "start" the eval. I've got a couple of questions about how things are set up, and I'm guessing more will come up once everyone goes through it. I'd like everyone's feedback.

Thanks!

29 September 2003

Flavors of Remotely Evaluating Applications

We've now had two experiences with remote users looking at and responding to wireframes. While it may have felt a little bumpy, we should declare victory, document our process, and move forward in the bright, brave light of few resources for user research.

As Erika pointed out in a discussion that she and I had after Friday's session, what she was doing was market research, not user research. Because of the constraints of the evaluation situation that we were in, we were collecting user responses, not user behavior. That is not to say, however, that we don't gain valuable insight from the technique that she employed.

The sessions we've had so far are a reminder that no matter how low-fidelity the evaluation or how constrained the process, we gain value from communicating directly with users. We also gain experience and confidence in our ability to facilitate.

Some Evaluation Techniques

I'm proposing a couple of dimensions for evaluation that I think our team should consider:

  • Facilitation dimension
  • Stimulus dimension

Facilitation Dimension

The facilitation dimension extends from a facilitated phone conversation, in which the facilitator cannot see the user or the user's interaction, to a full-blown usability test, in which the facilitator directly observes both the user and the interaction.

Phone Facilitation. The facilitator talks with the user over a phone line and cannot see the user or the user's behavior. In the APS example, Jessica and Erika talked the user through the wireframes, asking questions and documenting responses. While it's evident that phone facilitation does identify issues, the facilitator is flying blind. Here are some issues:

  • The facilitator cannot validate user behavior.
  • The facilitator may have to coach the user to elicit a response.
  • User responses tend to be more about the application functions and less about the user experience.
  • Evaluation results are not valid and may not be actionable.
  • The evaluation is relatively inexpensive.
  • We need to have a discussion about the kinds of questions that are appropriate for this kind of situation.

NetRaker Facilitation. Using NetRaker, the facilitator can see the user's interaction with the stimulus, but may miss other important behavioral cues. NetRaker should make it possible to validate user behavior. Its clickstream capabilities will also provide valuable information. Issues:

  • While not flying blind, the facilitator still does not have a complete picture of the user experience.
  • Because the evaluation can be conducted in the user's workplace, the evaluation may have an added realism that a usability test doesn't have.
  • The evaluation protocol can be modeled on a usability test protocol.
  • The results may be valid and actionable.
  • The evaluation requires resources, but can probably be absorbed without it becoming a separate project expense.

Usability Test. Certainly, our team has the most experience with formal usability testing. AIR facilitators know what they are doing. Tests are well documented with notes, tapes, and reports. While this kind of evaluation has enormous constraints, it also provides authoritative results:

  • The testing situation is artificial.
  • Fannie Mae doesn't have lab facilities.
  • Because we farm the testing out, AIR provides an expert seal of approval--it's not just us saying something.
  • The results are valid and actionable.
  • Usability tests are difficult to set up and expensive to run. We can't hide these in project expenses.

Stimulus Dimension

The stimulus dimension extends from wireframes to a full-blown application; in other words, from non-interactive to fully interactive.

Wireframes. Wireframes document the application interface, but don't provide any interaction:

  • The lack of user interaction with the prototypes is a major drawback.
  • Because wireframes are produced in the course of our work, they are always available for evaluation.
  • We discovered in the APS evaluations that the "flatness" of the wireframes and the constraints of their HTML rendering made it difficult for users to see options, links, and buttons.
  • Wireframes are amenable to remote evaluation, but don't have high enough fidelity to be used in a usability test.
  • Users could not see the contents of drop-down lists.

Paper Prototype. Paper prototypes have traction during the early design process. We should consider adding them to our evaluation tool suite.

  • Paper prototyping may be too expensive to test with end users because of recruiting and facilities requirements.
  • Paper prototyping requires face-to-face facilitation (although we might experiment with remote evaluation via videoconferencing).
  • Paper prototypes are useful for discovering usability issues with "surrogate" users (business owners and subject matter experts).
  • Paper prototypes are inexpensive to devise, and infinitely malleable during the design/evaluation process.
  • Paper prototypes are not easily adapted to remote evaluation techniques.

HTML Prototype. HTML prototypes provide interaction and navigation, opening up opportunities for validation that aren't possible with wireframes.

  • HTML prototypes require extra resources that our team doesn't have.
  • Developing an HTML prototype template would help mitigate the resource issue (see the sketch after this list).
  • Prototype evaluation can validate user behavior.
  • Prototypes can lull observers into believing that application development is much further along than it actually is.
  • HTML prototypes are amenable to the full range of evaluation techniques.
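
To make the template idea concrete, here's a minimal sketch of what such a skeleton might look like. Everything in it is hypothetical (the file names, the screen names, and prototype.css are invented for illustration, not an actual URID template); the point is that each screen is a copy of the same skeleton with only the content block changed, while the links and form controls actually work:

    <!-- prototype-template.html: hypothetical skeleton for one clickable screen -->
    <html>
    <head>
      <title>Prototype - Search</title>
      <link rel="stylesheet" href="prototype.css"> <!-- shared look and feel -->
    </head>
    <body>
      <div class="nav">
        <!-- real links let the user move between static screens -->
        <a href="search.html">Search</a> |
        <a href="results.html">Results</a> |
        <a href="detail.html">Detail</a>
      </div>
      <div class="content">
        <!-- only this block changes from screen to screen -->
        <form action="results.html" method="get">
          <select name="loanType"> <!-- drop-downs open for real, unlike flat wireframes -->
            <option>Fixed Rate</option>
            <option>Adjustable Rate</option>
          </select>
          <input type="submit" value="Search">
        </form>
      </div>
    </body>
    </html>

Because the navigation and drop-downs behave for real, a user in a NetRaker or phone session can actually click through, which addresses the "flatness" problem we hit with the APS wireframe evaluations.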

Application Prototype. Application prototypes can be used to evaluate limited application functionality.

  • If a development team uses iterative development, an application prototype provides an opportunity for validating the user's experience with the prototype's functionality.
  • Results can be folded into a subsequent iteration.
  • URID resources would be used only to evaluate the prototype, not to develop it.
  • Technical difficulties are the order of the day when working with application prototypes. They tend to be buggy, and setting up an evaluation requires working through firewall, environment, user id, and test data issues.
  • Application prototypes are amenable to the full range of evaluation techniques.

Working Application. Evaluating a working application provides the most complete coverage of functionality.

  • URID resources would be used only to evaluate the application.
  • Any evaluation results would have to be folded into the next (or subsequent) release of the product.
  • There will be some technical issues, but probably not on the scale of testing an application prototype that's in a testing environment.
  • Applications are amenable to the full range of evaluation techniques.

Analysis

Use of resources increases from phone facilitation -> usability test, and from wireframes -> HTML prototype -> working application. Validity increases in the same direction; actionability, however, does not. Usability findings are more actionable early in the development process. The evaluation models we employ should provide:

  • little or no additional resource commitment (at least for URID)
  • facilitation that captures significant user behavior
  • findings that are readily actionable by the development team.

For the most part, NetRaker provides a facilitation model that captures important user behavior without the expense of a full-fledged usability test. An augmented wireframe or HTML prototype provides some degree of user interaction (I vote for the HTML prototype). Wireframe and prototype development also occur at the point in the development process where the results from a usability evaluation will be the most actionable.

Posted by John | Add/View Comments | Permanent Link | Filed under: Best Practices