LIKERT SCALE ANALYSIS AND DATA INTERPRETATION
Assessdo uses Likert scales as a useful and relatively simple method of obtaining data when measuring attitudes toward clear-cut issues on which respondents either agree or disagree. Likert scales are commonly used to measure attitudes, knowledge, perceptions, values, and behavioral changes. Likert responses form an ordinal scale: responses can be rated or ranked, but the distance between responses is not measurable. Thus, the differences between “always,” “often,” and “sometimes” on a frequency-response Likert scale are not necessarily equal. Such a rating scale (a closed-ended survey question used to represent respondent feedback in a comparative form for specific features, products, or services) quantitatively assesses opinions, attitudes, or behaviors. A Likert scale is made up of four or more questions that measure a single attitude or trait when response scores are combined, with response options usually ranging from one extreme attitude to another. Likert scales assume that attitudes can be measured and that the strength or intensity of an attitude is linear, that is, on a continuum from strongly agree to strongly disagree. For analysis, the resulting scores are treated as having directionality and even spacing between them, although that even spacing is an assumption rather than a measured property.
The traditional way to report on a Likert scale combines two views of the data: qualitative (non-numerical, categorical) data treated as ordinal, which describes the order of values; and quantitative (numerically measurable) data treated as discrete, that is, whole numbers that cannot be subdivided. Ordinal levels of measurement assign values to variables based on their relative ranking within a given data set, depicting an ordered relationship among the variable’s observations.
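Because ordinal responses have order but no measurable distance, they are usually summarized with frequency counts, the median, and the mode rather than the mean. The following is a minimal sketch of that idea using the five-point Quality scale; the function name `summarize` and the sample responses are hypothetical and are not taken from Assessdo's reporting tools.

```python
from collections import Counter
from statistics import median_low, mode

# Quality scale, ordered from lowest to highest.
QUALITY = ["Poor", "Fair", "Good", "Very good", "Excellent"]

def summarize(responses, scale):
    """Return ordinal-appropriate summaries: counts, median label, modal label."""
    # Convert each label to its rank (position) on the scale.
    ranks = [scale.index(r) for r in responses]
    counts = Counter(responses)
    return {
        # Report every category, including those nobody chose.
        "counts": {label: counts.get(label, 0) for label in scale},
        # median_low always returns an actual observed rank, never a midpoint.
        "median": scale[median_low(ranks)],
        "mode": scale[mode(ranks)],
    }

# Hypothetical responses from six respondents to one item.
responses = ["Good", "Fair", "Good", "Excellent", "Very good", "Good"]
summary = summarize(responses, QUALITY)
print(summary)
```

Using `median_low` instead of `median` avoids reporting a value halfway between two categories, which would have no label on an ordinal scale.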
Assessdo’s Likert Scales consist of the following:
- Frequency: Never – Rarely – Sometimes – Often – Always
- Quality: Poor – Fair – Good – Very good – Excellent
- Likelihood: Extremely Unlikely – Unlikely – Neutral – Likely – Extremely Likely
- Agreement: Strongly disagree – Disagree – Somewhat disagree – Neither agree nor disagree – Somewhat agree – Agree – Strongly agree
- Importance: Not at all important – Low importance – Slightly important – Neutral – Moderately important – Very important – Extremely important
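As an illustration of how response scores are combined into a single scale score, the sketch below encodes the Likelihood scale as an ordered tuple and sums one respondent's answers across several items. The helper names `encode` and `scale_score` and the sample answers are assumptions made for this example, not part of an Assessdo API.

```python
# Likelihood scale, ordered from lowest to highest.
LIKELIHOOD = (
    "Extremely Unlikely",
    "Unlikely",
    "Neutral",
    "Likely",
    "Extremely Likely",
)

def encode(label, scale):
    """Map a response label to its 1-based rank on the scale."""
    # tuple.index raises ValueError if the label is not on the scale.
    return scale.index(label) + 1

def scale_score(item_responses, scale):
    """Sum the ranks of one respondent's answers across all items."""
    return sum(encode(label, scale) for label in item_responses)

# Hypothetical answers from one respondent to a four-item scale.
answers = ["Likely", "Neutral", "Extremely Likely", "Likely"]
total = scale_score(answers, LIKELIHOOD)
print(total)  # 4 + 3 + 5 + 4 = 16
```

Summing ranks treats the steps between response options as equal, which is the even-spacing assumption described earlier.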
The typical Assessdo Likert scale is designed as a 5-point scale because it is simple to understand and provides three pieces of information: direction (positive/negative), intensity of opinion, and a neutral point. Assessdo understands that in some response formats used with children, the Likert scale may vary from 3 to 5 response points. Reducing the response points from 5 to 3 makes the scale easier for an elementary-age child to understand. In the aptly named article “Are Three-Point Scales Always Good Enough?”, Lehmann & Hulbert (1972) argued that the main problem with two- and three-point scales is that they force respondents to choose and thereby introduce rounding error. The authors conducted a simulation study using items with three, five, seven, and nine points. They concluded that two or three points are probably fine when averaging across people and across many items. But if the focus of the research is on individual scales, a minimum of five to six scale points is probably necessary to get an accurate measure of the variable.
Assessdo Uses the Smiley Face Likert (SFL) Scale With Elementary-Age Youth
Typically used with K-12 students, our Likert scale surveys explore student motivational aspects of self-awareness, self-management, decision making, social awareness, and relationship skills to determine whether students rank low, moderate, or high in a number of categories, such as their need for achievement and need for social connections. For elementary-age children, Assessdo follows Tischer and Lang (1983) in substituting faces that depict various degrees of happiness or sadness for written choice points. We have found that smiley-face Likert scales are an appropriate rating method for children to communicate judgements when they are given quantitative questions in evaluations. This format is known as the Smiley Face Likert (SFL) scale. The SFL scale has a long history of use in pediatrics as a subjective measure of children’s medical and mental health conditions.
In collecting quantitative data, Assessdo believes that, for a child to provide an optimal response, the following must be true:
1. The child must be able to understand the words and the sentence that forms the question statement
2. The child must be able to associate the question statement with a past experience of their own in order to retrieve the required information to complete step 3
3. The child must understand that the questionnaire is asking them to make a judgment of their past experience against the question statement
4. The child must be provided with an effective method to communicate the judgment made in step 3
The advantage of smiley face scales is that the scales convey levels of a particular affective domain such as satisfaction without requiring respondents to read and understand verbal text scales. Smiley face scales are also a way to make surveys more enjoyable. Some disadvantages are that respondents who process only smiley face scales may interpret scales differently from those who also process verbal scale labels. Potentially, adding smiley faces may influence the meaning of a scale compared to text labels alone.
The Use of Likert Scales With Children
By: David Mellor, PhD, Kathleen A. Moore, PhD
In consideration of the capacity of children to respond to such scales, some authors have been careful in choosing item wording (e.g., Piers-Harris Children’s Self-Concept Scale) where items are written at a second-grade reading level, or they have reduced the number of response choices, for example, Wright and Asmundson (2003) who changed the original 5-point Likert scale response format for the Illness Attitudes Scale to a 3-point format to make it more easily understood by children. Other authors have followed Tischer and Lang (1983) and substituted faces on which various degrees of happiness or sadness are depicted for written choice points (e.g., Mellor, McCabe, Ricciardelli, & Ball, 2004).
Despite these variations, little other consideration seems to have been given to the more fundamental issue of whether children actually have the capacity to respond to Likert scale formats in a way that accurately reflects their judgments, attitudes, or values. Cognitive development literature would suggest that this matter is of critical importance. For example, Gelman and Baillargeon (1983) argued that younger children primarily think dichotomously. Thus, asking them to respond on a 5-point scale may be beyond their capacity. With regard to content, Marsh (1986) examined a sample of children aged between 7 and 12 years and found that some children, specifically younger children and those with poor verbal skills, were less able to respond to negatively worded items. Other researchers have tested children aged 5–12 years (Chambers & Craig, 1998; Chambers & Johnston, 2002) and 5–11 years (von Baeyer, Carlson, & Webb, 1997) and suggested that younger children have a tendency to endorse responses at the extreme end of scales when presented with items based on a Likert scale, thus providing unrefined measures of the constructs under investigation. However, Chambers and Johnston (2002) did suggest that this may vary according to what is being assessed.
The importance of these findings is that, as described above, many scales administered to children are used to assess intangible theoretical constructs (including emotions) or subjective judgments about the self. These are different from judgments about matters having an objective accuracy (e.g., a number of objects, or people). In Chambers and Johnston’s (2002) study, younger children were found to respond as accurately as older children to tasks involving judgments about physical objects, but used extremes in responding to questions about feelings. This pattern was found with both 3-point and 5-point response scales, suggesting that simplifying the response format did not increase children’s capacity to use scales.
For a scale to produce reliable and valid data, it must accurately and consistently reflect the measured judgment, attitude, or value. Of critical importance to the use of Likert scales with children is whether an accurate or appropriate internal response will be elicited by the declarative statements. The use of the Likert format assumes that the accurate and representative response has already been internally generated by the child, which may not necessarily be the case. Zeman, Cassano, Perry-Parrish, and Stegall (2006) noted that children’s emotional development shares a transactional relationship with their social, neurophysiological, cognitive, and language development. Thus, any scale that uses a Likert format to assess feeling states may be confronted with issues of whether the states are differentiated internally by the child, as well as their cognitive capacity. The work of cognitive developmentalists such as Piaget (1954) would suggest that certain types of judgments should be harder for children in the stage of concrete operations (7–11 years of age), during which the child develops the capacity to make judgments and reason about the physical world, than the subsequent stage of formal operations (11–16 years) in which the capacity to think in abstract terms (usually) evolves. Thus, it would seem that the use of Likert scales for assessing judgments about tangible/physical materials or their representations may be more amenable to assessment in younger children than those about intangible/abstract concepts such as internal feelings. Furthermore, theorists focusing on working memory capacity (e.g., Barrouillet & Lepine, 2005) and on basic arithmetic proficiency (e.g., Haverty, Koedinger, Klahr, & Alibali, 2000) typically support such an age progression in abilities. However, others have found a U-shape exists across ages 7–11 years on mathematical equivalence. For instance, McNeil (2007) found a decrease in performance between the ages of 7 and 9 years, which was reversed by age 11. Of course, children’s metacognitive development can also be enhanced and perhaps earlier than the formal operations stage as shown by White and Frederiksen (2005) in their manipulation of metacognitive abilities among fifth-grade children.
This study explored this issue by investigating children’s responses to Likert scale items requiring judgments about both physical and abstract concepts. If children are unable to respond accurately to the objectively verifiable and manipulated physical events, then it could be argued that the Likert format cannot accurately assess their judgment about subjective and more abstract matters. On the other hand, if they can respond with accuracy to questions about physical matters, it might be argued that they could have the capacity to use Likert scales in other realms. In line with the findings of Chambers and Johnston (2002), we expected that older children would be able to use Likert formats in both domains, but that younger children’s ability would be limited to the concrete physical domain. However, since Zeman, Klimes-Dougan, Cassano, and Adrian (2007) suggested that future research should focus on alternative response formats for assessing children with Likert scales, we also examined a number of alternative anchor points to establish which provides the optimal scale format for all children when abstract constructs are under investigation, in terms of their consistency with a “gold standard” yes/no response. We used yes/no as the gold standard because we believed that it provided the least ambiguity for the participants, as they were not required to respond in terms of degrees of agreement. While Fritzley et al. reported a “yes” bias to this format in a sample of 2–5-year-olds, this bias was found mainly among 2- and 3-year-olds. Other recent research by Rocha, Marche, and Briere (2013) supported the use of the yes/no format in older children. They argued that according to fuzzy trace theory, some forms of multiple-choice questions should elicit higher error rates than yes/no questions.
The alternative anchor formats were numeric values (1–5), as well as word-based frequencies (e.g., never to regularly), similarities to self (e.g., not like me at all, to very much like me), and agreeability (strongly agree to strongly disagree). The rationale for selecting these different Likert anchor formats is that they are used commonly in various measures.
QUALITATIVE / QUANTITATIVE TEST
We investigated elementary school children’s ability to use a variety of Likert response formats to respond to concrete and abstract items.
A total of 111 children, aged 6–13 years, responded to 2 physical tasks that required them to make objectively verifiable judgments, using a 5-point response format. Then, using 25 items, we ascertained the consistency between responses using a “gold standard” yes/no format and responses using 5-point Likert formats including numeric values, as well as word-based frequencies, similarities to self, and agreeability.
All groups responded similarly to the physical tasks. For the 25 items, the use of numbers to signify agreement yielded low concordance with the yes/no answer format across age-groups. Formats based on words provided higher, but not perfect, concordance for all groups.
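To make the idea of concordance concrete, the sketch below computes the proportion of items on which a child's 5-point rating agrees with the same child's yes/no answer. The dichotomization rule (ratings of 4–5 read as “yes”, 1–2 as “no”, and the midpoint counted as a non-match) is an assumption chosen for illustration; it is not stated in the study summary above, and the function names and sample data are hypothetical.

```python
def to_yes_no(rating):
    """Dichotomize a 1-5 rating.

    Assumed rule (not from the study): 4-5 -> "yes", 1-2 -> "no",
    3 (the midpoint) -> None, meaning no clear yes/no reading.
    """
    if rating >= 4:
        return "yes"
    if rating <= 2:
        return "no"
    return None

def concordance(yes_no_answers, ratings):
    """Proportion of items where the dichotomized rating matches the yes/no answer."""
    if len(yes_no_answers) != len(ratings):
        raise ValueError("Both formats must cover the same items.")
    if not ratings:
        raise ValueError("At least one item is required.")
    # A midpoint rating maps to None and therefore never matches.
    matches = sum(
        1 for answer, rating in zip(yes_no_answers, ratings)
        if to_yes_no(rating) == answer
    )
    return matches / len(ratings)

# Hypothetical answers from one child to the same five items in both formats.
yes_no_answers = ["yes", "no", "yes", "yes", "no"]
ratings = [5, 2, 3, 4, 4]
rate = concordance(yes_no_answers, ratings)
print(rate)  # 3 of the 5 items match
```

Treating the midpoint as a non-match is the stricter choice; excluding midpoint items from the denominator instead would give a higher figure for the same data.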
Researchers and clinicians need to be aware of the limited understanding that children have of Likert response formats.
A full citation of this study is shown below.
Mellor, D., & Moore, K. A. (2014). The Use of Likert Scales With Children. Journal of Pediatric Psychology, Volume 39, Issue 3, April 2014, Pages 369–379.
Three-Point Scales May Be Good Enough When Using Many Items
In response to Green and Rao, in the rather bluntly titled “Three-Point Likert Scales Are Good Enough” article, Matell and Jacoby (1971) argued that three points are good enough in some cases. They had 360 undergraduate psychology students each answer one of eighteen versions of a 72-item questionnaire of values, with the number of scale points in the response options varying between two and nineteen (so there were twenty respondents per condition).
Respondents were then given the same questionnaire three weeks later. The authors found little difference in reliabilities or in their measure of validity and concluded that as few as two response categories may be adequate in practice. They suggested that both reliability and validity are independent of the number of response categories, and their results implied that collapsing data from longer scales into two- or three-point scales would not diminish the reliability or validity of the resulting scores.
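Collapsing a longer scale into a shorter one can be sketched as a simple recoding step. The example below maps 5-point ratings onto a 3-point scale; the particular mapping (1–2 to 1, 3 to 2, 4–5 to 3) and the function name `collapse_to_three` are assumptions for illustration, not the procedure used by the authors.

```python
def collapse_to_three(rating):
    """Collapse a 1-5 rating to a 1-3 rating.

    Assumed mapping: 1-2 -> 1 (negative), 3 -> 2 (neutral), 4-5 -> 3 (positive).
    """
    if not 1 <= rating <= 5:
        raise ValueError(f"Rating out of range: {rating}")
    if rating <= 2:
        return 1
    if rating == 3:
        return 2
    return 3

# Hypothetical 5-point ratings from seven respondents.
five_point = [1, 2, 3, 4, 5, 4, 2]
three_point = [collapse_to_three(r) for r in five_point]
print(three_point)  # [1, 1, 2, 3, 3, 3, 1]
```

The recoding keeps direction and the neutral point but discards intensity, since “agree” and “strongly agree” become indistinguishable.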
A full citation of this article is shown below.
“Is A 3 Point Scale Good Enough,” by Jeff Sauro, PhD, August 14, 2019