Intervention/Implementation Strategy/System Usability Scale (SUS/IUS/ISUS)

The Intervention Usability Scale (IUS), Implementation Strategy Usability Scale (ISUS), and System Usability Scale (SUS) are a set of related scales that support assessment of usability: the extent to which a system or service can be used by specific people to achieve specified goals with effectiveness, efficiency, and satisfaction within a specified context of use (ISO 9241-11:2018). They apply, respectively, to interventions, implementation strategies, and systems, where a system may be a software program or a paper form. These scales can support decisions, such as whether a prototype is ready to advance to more widespread testing, and comparisons, such as whether a redesigned intervention is more usable than the original. While they can help answer how usable something is, the scales on their own do not directly guide how to redesign something to be more usable.

Highlights  

Here are some ways you can use these measures in the design process:

  • SUS/IUS/ISUS can be administered in early usability tests to provide quick, quantifiable feedback. This allows designers to assess whether users find the intervention interface intuitive and easy to navigate. 
  • Although the SUS/IUS/ISUS provides only a general usability score, when paired with qualitative follow-ups it can help identify specific issues (e.g., confusing navigation or awkward flow). 
  • Repeated use of the SUS, or a shorter-form measure (see below), throughout the redesign process can track improvements (i.e., benchmarking), as sketched after this list. If the score increases over time, it signals that changes are positively affecting usability. 
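
For example, benchmarking can be as simple as tabulating the mean score for each iteration and checking it against a threshold. The Python sketch below is illustrative only: the iteration labels and scores are made up, and the benchmark of 70 reflects the satisfactory-usability threshold discussed later in this guide.

```python
# Hypothetical example: tracking mean SUS scores across design iterations.
# The iteration labels and scores below are invented for illustration.
from statistics import mean

sus_by_iteration = {
    "iteration_1": [55.0, 62.5, 57.5, 60.0],
    "iteration_2": [65.0, 70.0, 62.5, 67.5],
    "iteration_3": [72.5, 75.0, 70.0, 77.5],
}

BENCHMARK = 70  # commonly cited threshold for satisfactory usability

for label, scores in sus_by_iteration.items():
    avg = mean(scores)
    status = "meets" if avg >= BENCHMARK else "below"
    print(f"{label}: mean SUS = {avg:.1f} ({status} the {BENCHMARK} benchmark)")
```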

Which to use: Implementation Strategy Usability Scale, Intervention Usability Scale, or System Usability Scale? 

Overall, our guidance parallels the name of each scale. If your goal is to assess the usability of the intervention (including any associated supporting tools), use the Intervention Usability Scale (IUS). If your goal is to assess the usability of an implementation strategy, use the Implementation Strategy Usability Scale (ISUS). If your goal is to assess the usability of a system or artifact, use the System Usability Scale (SUS). The SUS was developed primarily for digital systems but may work for other kinds of artifacts, such as worksheets. 

There may be times when you need to assess two or all three. In these instances, however, we recommend against using the SUS and IUS in the same session with the same participants, as the similarity of the items may lead to fatigue or confusion. Further, because UWAC projects often investigate both the usability of an intervention and the usability of a system, it is important for the object or target of each survey to be clear. 

Role of scales in each DDBT phase: When and how to use them 

Even if you use the same scale in each phase, it may have different roles. 

Discover and Test Phases 

From the center's perspective, we hope to collect or set up comparisons between the usability of the un-adapted intervention or implementation strategy and the usability of the redesigned intervention or implementation strategy, among the intended users and within the intended population. Example comparisons include: 

  • Comparing the un-adapted intervention in the discover phase to the redesigned intervention in the test phase using the IUS 
  • In a project whose test phase includes a control condition using the un-adapted intervention/implementation strategy, comparing IUS/ISUS scores between the control group (un-adapted) and the redesign group (adapted) after a period of use in the test phase 
  • In a project that focuses on redesigning an artifact-based intervention or implementation strategy (e.g., software, a paper form) to improve usability, comparing the artifact's usability between the discover phase and the test phase using the SUS 

In projects where the un-adapted intervention or implementation strategy cannot be used at all in the destination context without adaptation, it may not be possible to collect meaningful IUS/ISUS/SUS data at this stage. In this situation, please reach out to the Methods Core to discuss what might be useful for both your project and center goals. 

Design/Build Phase 

Teams may also use the IUS/ISUS/SUS to assess usability during the design/build phase. Early in this process, a scale might be used with scenarios or storyboards to assess the perceived or anticipated usability of a design direction. Interviewers might probe responses to items that indicate potential usability issues to guide redesign. Later, as prototypes mature, participants can complete the IUS/ISUS/SUS after performing a set of tasks (e.g., using the system or role-playing steps in an intervention or implementation strategy). Usability experts tend to look for a SUS score of at least 70 (a common benchmark for satisfactory usability) before moving to a test phase; advancing earlier likely means that significant usability issues remain that would interfere with the overall goals of the test phase. It is not known whether a similar benchmark carries over to the newer IUS/ISUS. This guide provides further detail on interpreting SUS scores.  

The shorter 4-item Usability Metric for User Experience (UMUX), or the 2-item UMUX-Lite, might also be used in this iterative process to avoid measurement fatigue. The UMUX-Lite correlates well with the SUS, at least for evaluation of technologies, but to convert between UMUX-Lite and SUS scores, you must use 7-point response scales; a conversion sketch follows. Currently, there are no published examples of using the UMUX or UMUX-Lite for non-digital interventions and implementation strategies. 
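
If you use the UMUX-Lite, responses can be mapped to an approximate SUS equivalent. A minimal Python sketch is below; it assumes 7-point response scales, uses the regression reported by Lewis, Utesch, and Maher (2013), and the function name is ours for illustration. Treat the output as an approximation for benchmarking, not an exact SUS score.

```python
def umux_lite_to_sus_estimate(item1: int, item2: int) -> float:
    """Estimate a SUS-equivalent score from the two UMUX-Lite items.

    Assumes 7-point agreement scales (1 = strongly disagree, 7 = strongly
    agree) and the regression reported by Lewis, Utesch, and Maher (2013):
    SUS estimate = 0.65 * UMUX-Lite + 22.9.
    """
    # Rescale the two items (each 1-7) to a 0-100 UMUX-Lite score.
    umux_lite = ((item1 - 1) + (item2 - 1)) * (100 / 12)
    return 0.65 * umux_lite + 22.9

# Example: both items rated 6 of 7 yields roughly 77, near "satisfactory".
print(round(umux_lite_to_sus_estimate(6, 6), 1))
```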

As teams wrap up the design/build phase, participants must complete the IUS/ISUS/SUS for the redesigned intervention or implementation strategy. Please report this final administration (and not all iterations) to the center, as we would like to know what usability level is reached by the end of design. 

Adapting the wording of scale items 

Before administering the survey, review (and possibly pilot) your selected scale to assess whether the wording of items is appropriate for your participants and for what you are evaluating. Sometimes it can make sense to adapt items; you are welcome to meet with the Methods Core to discuss possible adaptations or added clarifications. Please meet with the Methods Core before dropping items or changing the response scale, as this can inhibit later comparisons with other projects. 

If you need to translate the scale into a different language, the translated scale should be piloted and discussed with community partners. A protocol you can consider using for translating the IUS/ISUS/SUS, which has also been used to translate the AIM/IAM/FIM, is: Toma G, Guetterman TC, Yaqub T, Talaat N, Fetters MD. A systematic approach for accurate translation of instruments: Experience with translating the Connor–Davidson Resilience Scale into Arabic. Methodological Innovations. 2017;10(3):2059799117741406. 

Example of the scale adapted for children: Putnam C, Puthenmadom M, Cuerdo MA, Wang W, Paul N. Adaptation of the system usability scale for user testing with children. In: Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems; 2020. p. 1-7. 

Some examples of adaptations: https://www.questionpro.com/blog/system-usability-scale/  

An example set of uses: PST Aid 

Consider the following example: a team is developing a technological tool, PST Aid, to support the delivery of Problem Solving Treatment (PST). The team has already completed the IUS regarding PST in the Discover phase. 

The team chooses to use the IUS and SUS in the design phase to inform their design direction (IUS) and assess readiness of the prototype (SUS), and the IUS in the test phase to compare usability of PST without PST Aid to PST with PST Aid. 

Design phase

In early design workshops, the team uses the IUS to assess participant perceptions of whether PST Aid will make PST, the intervention, more usable. They describe PST (without the Aid) and ask participants to complete the IUS based on it, using “PST” as the object they are evaluating. They then describe PST Aid and how it can be used to support PST delivery, and ask participants to repeat the IUS, this time using “PST, as supported by PST Aid” as what they are evaluating. 

As the team begins to work on their prototype, they periodically present it to participants and ask them to complete tasks using the prototype. They integrate collection of the SUS into their usability testing and interviews. After a participant completes a set of tasks, they ask the participant to complete the SUS regarding PST Aid’s usability. In early rounds, when they know the prototype has many areas for improvement, they probe the items with the most opportunity for improvement (e.g., “What problems kept this from being a 5?” for positively worded items). This, along with their observations of participants using the tool, informs their design refinements. As the design matures, they begin paying attention to the total score, as they want to ensure they are above the benchmark of 70 (satisfactory usability) before moving to the test phase. 

Because the team knows that training will be part of how PST Aid is introduced to clinicians, they ask clinicians to anticipate post-training usability when answering SUS items, using framing such as “After you have been trained to use [system]…”. Because clients will not receive training, they do not adapt those items for clients. 

Test phase

In the test phase, the team uses the IUS to compare the usability of PST with PST Aid to PST without PST Aid. Their test phase has a control group (PST without the aid) and a test group (PST with PST Aid), so they decide it is best to collect the IUS during the test phase and compare groups. This is also the comparison in which the center is most interested, as it helps answer the question of whether the introduction of PST Aid makes PST (the intervention) more usable. A sketch of such a group comparison follows. 
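
As a purely illustrative sketch of this kind of between-group comparison, the Python snippet below compares hypothetical IUS scores with an independent-samples t-test via scipy. The data are invented, and whether a t-test (versus another model) is appropriate depends on your design; consult the Methods Core on the analysis plan.

```python
# Hypothetical sketch: comparing IUS scores between the control group
# (PST without the aid) and the redesign group (PST with PST Aid).
# The scores below are invented for illustration.
from scipy.stats import ttest_ind

ius_control = [57.5, 62.5, 55.0, 60.0, 65.0, 52.5]   # PST without PST Aid
ius_redesign = [72.5, 70.0, 77.5, 67.5, 75.0, 80.0]  # PST with PST Aid

result = ttest_ind(ius_redesign, ius_control)
print(f"mean control  = {sum(ius_control) / len(ius_control):.1f}")
print(f"mean redesign = {sum(ius_redesign) / len(ius_redesign):.1f}")
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```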

System Usability Scale

You should include qualitative probing about scores in an interview format; consult with the Methods Core on best practices for doing this, including how long your interview may be and the ordering (usability scale then interview, interview then scale, etc.). See also “Usability Interviews & Task-based Usability Testing” below.  

Goal: To get a baseline usability score and inform whether the redesign is ready  

Response scale: 1 = Strongly disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly agree  

  1. I think that I would like to use [system] frequently  
  2. I found [system] unnecessarily complex  
  3. I thought [system] was easy to use  
  4. I think that I would need the support of an expert consultant to be able to use [system] 
  5. I found the various functions in [system] were well integrated  
  6. I thought there was too much inconsistency in [system] 
  7. I would imagine that most people would learn to use [system] very quickly  
  8. I found [system] very cumbersome to use  
  9. I felt very confident using [system] 
  10. I needed to learn a lot of things before I could get going with [system] 

Intervention Usability Scale (IUS)/Implementation Strategy Usability Scale (ISUS)

Response scale: 1 = Strongly disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly agree

  1. I like to use [intervention/implementation strategy] frequently  
  2. I find [intervention/implementation strategy] unnecessarily complex  
  3. I think [intervention/implementation strategy] is easy to use  
  4. I need the support of an expert consultant to be able to use [intervention/implementation strategy] 
  5. I find the various components of [intervention/implementation strategy] are well integrated  
  6. I think there is too much inconsistency in [intervention/implementation strategy]  
  7. I would imagine that most people would learn to use [intervention/implementation strategy] very quickly  
  8. I find [intervention/implementation strategy] very cumbersome to use  
  9. I felt very confident using [intervention/implementation strategy] 
  10. I needed to learn a lot of things before I could get going on [intervention/implementation strategy] 

Child/Youth Version of the Intervention Usability Scale (IUS)

😠 😞 😐 🙂 😁

Response scale: 1 = Strongly disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly agree    

  1. I like to use [intervention/implementation strategy] frequently  
  2. I find [intervention/implementation strategy] unnecessarily complex  
  3. I think [intervention/implementation strategy] is easy to use  
  4. I need the support of an expert consultant to be able to use [intervention/implementation strategy] 
  5. I find the various components of [intervention/implementation strategy] are well integrated  
  6. I think there is too much inconsistency in [intervention/implementation strategy]  
  7. I would imagine that most people would learn to use [intervention/implementation strategy] very quickly  
  8. I find [intervention/implementation strategy] very cumbersome to use  
  9. I felt very confident using [intervention/implementation strategy] 
  10. I needed to learn a lot of things before I could get going on [intervention/implementation strategy] 

Scoring guidelines are included here. All three measures are scored the same way; a minimal sketch of the standard scoring procedure follows. 
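
For reference, here is a minimal Python sketch of the standard SUS-style scoring: odd-numbered (positively worded) items contribute (response - 1), even-numbered (negatively worded) items contribute (5 - response), and the 0-40 sum is multiplied by 2.5 to yield a 0-100 score. The example responses are hypothetical.

```python
def score_sus_style(responses: list[int]) -> float:
    """Score a 10-item SUS/IUS/ISUS response set on the usual 0-100 scale.

    Assumes the standard coding: odd-numbered items are positively worded
    (contribution = response - 1), even-numbered items are negatively worded
    (contribution = 5 - response), with 1-5 response options.
    """
    if len(responses) != 10:
        raise ValueError("Expected responses to all 10 items.")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5  # rescale the 0-40 sum to 0-100

# Example: a fairly positive (hypothetical) response pattern scores 82.5.
print(score_sus_style([4, 2, 5, 1, 4, 2, 4, 2, 5, 2]))
```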

Usability Interviews & Task-based Usability Testing 

Critical Incident Technique Interviews 

Sometimes, teams may be most interested in learning about usability issues that emerge only in complex, real-world situations and that are hard to reproduce in usability evaluations, in the lab, or in other controlled contexts. For this, interviews that elicit details of past events can be most effective, despite being limited by people’s ability to recall information.  

Example questions ask respondents to recall a time when they engaged in a certain behavior, for example, “Tell me about a time you used an app in your job.” This prompt is slightly different from “Tell me about the last time you used an app in your job.” A critical incident variation could be “Tell me about a particular time you used an app in your job where it did not help you accomplish your work.”1

Sessions that combine interviews with other methods 

Using interviews alone to gather data may be limiting because of issues with recall and/or challenges with describing behavior. Interviews can be particularly insightful when they incorporate observation or demonstrations, since people’s ability to recall and articulate details of their use of a product or system is limited. In observation, you might ask a respondent to complete or demonstrate tasks and then ask questions based on what you see (see Box 1). Observation during interviews focuses on monitoring and recording people, behavior, artifacts, and environments. When environments or behaviors are well defined, structured observation (such as using checklists to record observed behavior) is a good option (Hanington & Martin, 2019, p. 158).2 Unstructured observation is more exploratory and leaves the researcher open to seeing what they may not anticipate. 

  1. Introduce purpose of study, what you’re hoping to observe and learn, and obtain consent
  2. Pre-observation interview to ask questions about first impressions or respondent’s typical day
  3. Observe respondent and take note of respondent’s behavior
  4. Post-observation interview to ask questions about what you observed

Box 1. Sample sequence of interview and observation 

Observation can be similar to a cognitive walkthrough, which is a usability assessment method to systematically walk through sequential steps of a system or process from a user’s perspective to identify potential usability issues. Cognitive walkthroughs are usually conducted by domain experts, who may be part of the design team, and can be conducted one-on-one or in groups.  

Task-based usability testing 

Usability evaluations often involve asking participants to complete one or more tasks using a product or service. This could mean using the baseline intervention/implementation strategy/app, using partial or complete prototypes of the redesign, or using the newly redesigned intervention, implementation strategy, or supporting artifacts. After each task, researchers might present participants with a scale or ask follow-up questions, though if this interrupts the flow, you may save it until after all tasks are completed. You may refer to this example of a task-based usability testing protocol from a UWAC project.   

For tasks that involve collaboration (e.g., a session between a clinician and a patient), it may be necessary to have a researcher take on one of the roles. This increases internal reliability but decreases external validity.  

Task design. Designing appropriate tasks requires practice and iteration. If a task is too unclear, you may instead uncover usability issues with your task design, not with what you are studying! However, if the task description mirrors too closely the language of what a participant must do (e.g., if you tell them to click the button labeled “search”), the task is leading, and you may not uncover key usability issues.  

Think aloud protocol. As we cannot read people’s minds, participants are often asked to think aloud while working through tasks to help researchers learn as much as possible. This can help researchers learn what a participant is considering doing next and why, better understand their in-the-moment goals, and identify misconceptions. To incorporate think aloud into your interview guide, include instructions for the facilitator to give to the participant about the think aloud process. The facilitator should then demonstrate the technique with an unrelated task so that respondents understand it as well as possible. The participant may still forget (especially when concentrating hard on a task!), and it is often necessary for the facilitator to encouragingly remind participants to think aloud. Even with reminders, some respondents may find it distracting, or speaking while acting may not be contextually appropriate. In these cases, it is not worth pushing the technique; instead, probe respondents on their task experience after they have completed their tasks. For example, you can ask a respondent to walk you through how they accomplished their task.3

Additionally, the think aloud protocol is not well suited to tasks that require speaking (e.g., talk therapy, interacting with a voice assistant, etc.). In these situations, an alternative is to record the task (e.g., video, screen recording, audio) and then play the recording back to participants, asking them to describe what they were thinking at the time. This retrospective think aloud is less reliable than in-the-moment think aloud, but sometimes it is the best compromise available. 

Facilitation. Participants asked to complete tasks may feel like they are being evaluated, especially if those tasks parallel anything they might have to do for certification in a therapy or anything related to their professional expertise. As a result, it is all the more important for facilitators to remind participants that the intervention/implementation strategy/artifact is being evaluated, not them. 

When testing new designs (or existing designs with significant usability issues), it is also not uncommon for participants to have interactions that frustrate them. To an extent, it is valuable to allow this frustration to continue so you learn how the participant would navigate the barriers. If participants ask for help, the facilitator might at first turn the question back to them and ask, “What would you do if I were not here?” However, the facilitator should use their discretion in offering assistance that keeps the session moving or that prevents frustration from becoming so great that the rest of the session is lost.  

Although much task-based usability testing has historically been applied to digital technologies, the approach is quite relevant to complex psychosocial interventions such as client-facing interventions and implementation strategies. As one example, the Usability Evaluation for Evidence-Based Psychosocial Interventions (USE-EBPI) method specifies how “lab-based” user testing (one of the array of sub-methods specified within USE-EBPI) can be completed for interventions such as psychotherapies. 

  1. Introduce purpose of study, what you’re hoping to observe and learn, and obtain consent
  2. Pre-test interview to ask questions about first impressions, demographics, experience with similar products
  3. Describe task 1
  4. Respondent performs task 1
  5. Describe subtask 1a
  6. Respondent performs subtask 1a
  7. Describe subtask 1b
  8. Respondent performs subtask 1b
  9. Post-task interview to debrief on what was observed during task and subtasks (and reduce cognitive load of recall)
  10. Describe task 2
  11. Respondent performs task 2
  12. Describe subtask 2a
  13. Respondent performs subtask 2a
  14. Describe subtask 2b
  15. Respondent performs subtask 2b

Box 2. Sample sequence for usability test

Sample Usability Questions 

  • Following a task: How would you describe your experience completing this task? 
  • What is one thing you would change about this intervention or product? Why? 
  • How did your experience compare to [a different intervention or product]? 
  • What are features that would encourage you to use this intervention or product? 

  1. “The Critical Incident Technique in UX.” Nielsen Norman Group, 26 Jan. 2020, https://www.nngroup.com/articles/critical-incident-technique/. Accessed 26 Aug. 2023. 
  2. Hanington, B., & Martin, B. (2019). Universal Methods of Design Expanded and Revised: 125 Ways to Research Complex Problems, Develop Innovative Ideas, and Design Effective Solutions. Rockport Publishers. 
  3. Rubin, J., & Chisnell, D. (2008). Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests. John Wiley & Sons. 