Enhancing Anxiety Assessments: Identifying Key Items and Testing Dimensionality with the Rasch Model

Evaluating an anxiety scale for effective item performance and unidimensionality with item analysis and the Rasch model.
Rasch Modelling
IRT
Item Analysis
Author

Alex Wainwright

Published

October 27, 2024

Overview

In this analysis, we move beyond simulations to analyse real-world data from a 50-item anxiety scale. We’ll assess item performance and then apply psychometric modelling to examine how well these items measure anxiety.

Item Analysis

# Libraries ---------------------

library(data.table)
library(eRm)
library(ggplot2)
library(kableExtra)

# Read Data ---------------------

# Anxiety Scale Data
tma_data <-
  fread("data.csv")

# Codebook for scale
codebook <-
  read.csv("codebook.txt", header = F) |>
  as.data.table()

# Tidy Codebook Data ------------

codebook <-  # Filter to item data (Name and Text)
  codebook[grepl("^Q[0-9]{1,2}", V1)]

codebook[, V1 := paste(V1, V2)]
codebook[, V2 := NULL]

codebook[, c("item_name", "item_text") := tstrsplit(V1, "\\.", keep = c(1, 2))]

codebook[, V1 := NULL]

# Clean Anxiety Scale Data ------

tma_data[, score := NULL]  # Drop total score column

item_locations <-  # Identify columns starting with Q
  grep("^Q", names(tma_data), value = T)

tma_data[,  # Transform columns (0 to NA, 2 to 0, and 1 as 1)
         (item_locations) := lapply(.SD, function(x)
           ifelse(x == 0, NA, ifelse(x == 2, 0, 1))),
         .SDcols = item_locations]

reverse_scored_items <-  # Specify items that will be reversed
  c("Q1",
    "Q3",
    "Q4",
    "Q9",
    "Q12",
    "Q15",
    "Q18",
    "Q20",
    "Q29",
    "Q32",
    "Q38",
    "Q50")

tma_data[,  # Reverse score items
         (reverse_scored_items) := lapply(.SD, function(x)
           1 - x),
         .SDcols = reverse_scored_items]

# Recalculate total score
tma_data[, total_score := apply(.SD, 1, sum), .SDcols = patterns("^Q")]

# Drop respondents whose total score is missing (at least one unanswered item)
tma_data <-
  tma_data[!is.na(total_score)]

# Calculate Item Endorsement ---------------

item_endorsement <-
  tma_data[,
           .(item_name = names(.SD),
             item_endorsement = apply(.SD, 2, function(x)
               # Proportion endorsing each item
               mean(x == 1))),
           .SDcols = patterns("^Q")]

# Calculate Item Discrimination ------------

# Split data by top and bottom 27%
cut_off <- quantile(tma_data$total_score, probs = c(.27, .73))

tma_data[, group_cutoff := fcase(total_score <= cut_off[1],
                                 "lower_score",
                                 total_score >= cut_off[2],
                                 "higher_score")]

# Calculate the item endorsement by group
group_endorsement <-
  tma_data[!is.na(group_cutoff),
           .(item_name =  names(.SD),
             item_endorsement = apply(.SD, 2, function(x)
               mean(x == 1))),
           by = group_cutoff,
           .SDcols = patterns("^Q")] |>
  dcast(item_name ~ group_cutoff,
        value.var = "item_endorsement")

# Calculate the discrimination index for each item
group_endorsement[, discrimination_index := higher_score - lower_score]

# Merge the item statistics into the codebook
codebook <-
  codebook |>
  merge(
    item_endorsement,
    by = "item_name"
  ) |>
  merge(
    group_endorsement,
    by = "item_name"
  )

codebook[, item_name := as.integer(gsub("Q", "", item_name))]

setorder(codebook, item_name)

codebook |>
  kbl(digits = 2) |>
  kable_paper()
Table 1: Item endorsement and discrimination index values for the 50-item anxiety scale.

| item_name | item_text | item_endorsement | higher_score | lower_score | discrimination_index |
|---|---|---|---|---|---|
| 1 | I do not tire quickly | 0.66 | 0.84 | 0.43 | 0.41 |
| 2 | I am troubled by attacks of nausea | 0.39 | 0.62 | 0.16 | 0.46 |
| 3 | I believe I am no more nervous than most others | 0.69 | 0.80 | 0.53 | 0.27 |
| 4 | I have very few headaches | 0.48 | 0.73 | 0.24 | 0.49 |
| 5 | I work under a great deal of tension | 0.66 | 0.80 | 0.47 | 0.33 |
| 6 | I cannot keep my mind on one thing | 0.79 | 0.94 | 0.57 | 0.37 |
| 7 | I worry over money and business | 0.75 | 0.89 | 0.55 | 0.34 |
| 8 | I frequently notice my hand shakes when I try to do something | 0.39 | 0.62 | 0.17 | 0.46 |
| 9 | I blush no more often than others | 0.46 | 0.64 | 0.32 | 0.32 |
| 10 | I have diarrhea once a month or more | 0.54 | 0.74 | 0.33 | 0.41 |
| 11 | I worry quite a bit over possible misfortunes | 0.85 | 0.99 | 0.59 | 0.39 |
| 12 | I practically never blush | 0.61 | 0.81 | 0.42 | 0.39 |
| 13 | I am often afraid that I am going to blush | 0.29 | 0.50 | 0.12 | 0.38 |
| 14 | I have nightmares every few nights | 0.35 | 0.59 | 0.12 | 0.47 |
| 15 | My hands and feet are usually warm | 0.57 | 0.62 | 0.53 | 0.09 |
| 16 | I sweat very easily even on cool days | 0.53 | 0.74 | 0.34 | 0.40 |
| 17 | Sometimes when embarrassed I break out in a sweat | 0.63 | 0.86 | 0.38 | 0.48 |
| 18 | I hardly ever notice my heart pounding and I am seldom short of breath | 0.74 | 0.91 | 0.51 | 0.40 |
| 19 | I feel hungry almost all the time | 0.43 | 0.59 | 0.29 | 0.30 |
| 20 | I am very seldom troubled by constipation | 0.44 | 0.59 | 0.30 | 0.29 |
| 21 | I have a great deal of stomach trouble | 0.47 | 0.75 | 0.21 | 0.54 |
| 22 | I have had periods in which I lost sleep over worry | 0.78 | 0.95 | 0.55 | 0.40 |
| 23 | My sleep is fitful and disturbed | 0.58 | 0.82 | 0.30 | 0.51 |
| 24 | I dream frequently about things that are best kept to myself | 0.56 | 0.80 | 0.30 | 0.49 |
| 25 | I am easily embarrassed | 0.67 | 0.90 | 0.37 | 0.54 |
| 26 | I am more sensitive than most other people | 0.80 | 0.94 | 0.57 | 0.37 |
| 27 | I frequently find myself worrying about something | 0.90 | 1.00 | 0.68 | 0.32 |
| 28 | I wish I could be as happy as others seem to be | 0.84 | 0.99 | 0.56 | 0.42 |
| 29 | I am usually calm and not easily upset | 0.63 | 0.88 | 0.28 | 0.60 |
| 30 | I cry easily | 0.54 | 0.78 | 0.27 | 0.51 |
| 31 | I feel anxiety about something or someone almost all the time | 0.80 | 0.98 | 0.46 | 0.52 |
| 32 | I am happy most of the time | 0.65 | 0.89 | 0.33 | 0.56 |
| 33 | It makes me nervous to have to wait | 0.72 | 0.94 | 0.44 | 0.50 |
| 34 | I have periods of such great restlessness that I cannot sit long I a chair | 0.54 | 0.80 | 0.28 | 0.52 |
| 35 | Sometimes I become so excited that I find it hard to get to sleep | 0.70 | 0.87 | 0.49 | 0.38 |
| 36 | I have sometimes felt that difficulties were piling up so high that I could not overcome them | 0.83 | 0.99 | 0.55 | 0.44 |
| 37 | I must admit that I have at times been worried beyond reason over something that really did not matter | 0.88 | 0.99 | 0.66 | 0.33 |
| 38 | I have very few fears compared to my friends | 0.79 | 0.94 | 0.54 | 0.39 |
| 39 | I have been afraid of things or people that I know could not hurt me | 0.63 | 0.87 | 0.32 | 0.56 |
| 40 | I certainly feel useless at times | 0.79 | 0.97 | 0.50 | 0.47 |
| 41 | I find it hard to keep my mind on a task or job | 0.69 | 0.91 | 0.41 | 0.50 |
| 42 | I am usually self-conscious | 0.86 | 0.97 | 0.67 | 0.30 |
| 43 | I am inclined to take things hard | 0.80 | 0.96 | 0.53 | 0.43 |
| 44 | I am a high-strung person | 0.59 | 0.77 | 0.34 | 0.42 |
| 45 | Life is a trial for me much of the time | 0.67 | 0.92 | 0.31 | 0.61 |
| 46 | At times I think I am no good at all | 0.71 | 0.96 | 0.33 | 0.63 |
| 47 | I am certainly lacking in self-confidence | 0.71 | 0.95 | 0.37 | 0.57 |
| 48 | I sometimes feel that I am about to go to pieces | 0.75 | 0.98 | 0.38 | 0.60 |
| 49 | I shrink from facing crisis of difficulty | 0.63 | 0.90 | 0.28 | 0.62 |
| 50 | I am entirely self-confident | 0.86 | 0.98 | 0.65 | 0.33 |

As a measure of anxiety, the scale should contain items that differentiate between people with high and low anxiety. The item endorsement and discrimination values in Table 1 let us separate well-performing items from poorly performing ones. Item endorsement is the proportion of respondents who endorse a given item (for reverse-scored items, the proportion giving the anxiety-keyed response), which helps identify frequently endorsed items. Item discrimination is the difference in endorsement between the highest- and lowest-scoring groups, and shows each item’s capacity to differentiate between high- and low-anxiety individuals.
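As a compact restatement of these two statistics (the symbols \(p_i\), \(D_i\), \(x_{vi}\), and \(N\) are notation assumed here rather than taken from the analysis code), for item \(i\) with scored responses \(x_{vi} \in \{0, 1\}\) from \(N\) respondents:

\[
p_i = \frac{1}{N} \sum_{v=1}^{N} x_{vi}, \qquad D_i = p_i^{\text{high}} - p_i^{\text{low}}
\]

Here \(p_i^{\text{high}}\) and \(p_i^{\text{low}}\) are the endorsement rates among respondents at or above the 73rd percentile and at or below the 27th percentile of the total score, respectively. \(D_i\) ranges from \(-1\) to \(1\); values near zero mean both groups endorse the item at similar rates.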

Item 46 (At times I think I am no good at all) has an endorsement rate of 71%. Those with a high raw score endorse this item at a rate of 96% compared to 33% in the low raw score group. This is a difference of 63 percentage points (discrimination index: 0.63), which shows that the item discriminates well.

Item 15 (My hands and feet are usually warm) has a 57% endorsement rate, with 62% and 53% endorsement in the high and low raw score groups, respectively. This item does not discriminate well (discrimination index: 0.09). Because the item is reverse scored, endorsement here means reporting that one’s hands and feet are not usually warm. Cold hands and feet are not specific to anxiety; therefore, the item does little to differentiate between high and low anxiety.

Let’s say we wanted a set of items with a discrimination index of at least 0.50. Based on the current data, we would retain 16 items (21, 23, 25, 29, 30, 31, 32, 33, 34, 39, 41, 45, 46, 47, 48, 49) that differentiate between high- and low-anxiety individuals. Depending on the use case, we could use this information to shorten the anxiety scale. Selecting only the best-discriminating items would give a more efficient measure of anxiety, although the shortened scale would need to be re-evaluated before use.
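The selection rule can be sketched as a simple filter. The snippet below is a minimal illustration in base R using a handful of rounded values copied from Table 1; the object names `item_stats`, `threshold`, and `retained_items` are made up for the example. Applied to the full data, the same filter would run on the unrounded `discrimination_index` column, so items displayed as 0.50 may sit very close to the cut-off.

```r
# Illustrative subset of Table 1 (rounded values); not the full item set
item_stats <- data.frame(
  item_name = c(15L, 21L, 29L, 42L, 46L),
  discrimination_index = c(0.09, 0.54, 0.60, 0.30, 0.63)
)

threshold <- 0.50  # minimum discrimination index assumed in the text

# Keep the items whose discrimination index meets the threshold
retained_items <- item_stats$item_name[item_stats$discrimination_index >= threshold]
print(retained_items)
```

Of the five items in this subset, items 21, 29, and 46 pass the filter, while items 15 and 42 are dropped.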

Rasch Modelling

Rather than relying solely on descriptive statistics, we can apply a more advanced method: Rasch Modelling.

In this approach, both items and individuals are positioned on the same underlying continuum. For items, their position represents difficulty, with higher values indicating greater difficulty. For individuals, their position corresponds to their ability on the trait being measured (e.g., anxiety). Higher values for individuals reflect greater levels of the underlying trait, making them more likely to endorse certain items (in the case of an anxiety measure) or answer correctly (in the case of cognitive tests). Consequently, we would expect individuals with higher anxiety levels to endorse items more frequently compared to those with lower anxiety.
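For reference, the dichotomous Rasch model can be written as follows. The symbols are notation assumed here: \(\theta_v\) is the position of person \(v\) and \(b_i\) is the difficulty of item \(i\), both on the same logit scale.

\[
P(X_{vi} = 1 \mid \theta_v, b_i) = \frac{\exp(\theta_v - b_i)}{1 + \exp(\theta_v - b_i)}
\]

When \(\theta_v = b_i\), the probability of endorsing the item is 0.5, and it rises above 0.5 as the person’s position exceeds the item’s difficulty. For example, a person one logit above an item has an endorsement probability of \(\exp(1) / (1 + \exp(1)) \approx 0.73\).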

rasch_model <-
  RM(tma_data[, .SD, .SDcols = patterns("^Q")])

Dimensionality

When applying Rasch modelling, we assume that the scale measures a single underlying variable. In our case, this means the 50-item anxiety scale is measuring a single dimension of anxiety. By item content alone, we may find this not to be the case as we have items related to physical manifestations of anxiety (e.g., headaches) and psychological manifestations of anxiety (e.g., feeling afraid).

To test this, we apply Andersen’s Likelihood Ratio test. This involves splitting the data into two groups based on a criterion (the median raw score in our case) and fitting the Rasch model to each subgroup. The test compares the item difficulty estimates between the groups. If the differences are significant, it suggests the Rasch model does not fit the data equally well across both groups, indicating a possible dimensionality issue. In our case, the result is significant (\(\chi^2(49) = 2480.627, p < .001\)), pointing toward a misfit of the model.
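In general form, Andersen’s statistic compares the conditional log-likelihood of the pooled model with the sum of the subgroup log-likelihoods. This is the standard textbook expression, with notation assumed here: \(L_c\) is the conditional likelihood, \(G\) the number of subgroups, and \(k\) the number of items.

\[
LR = 2 \left( \sum_{g=1}^{G} \log L_c\big(\hat{\beta}^{(g)}; X^{(g)}\big) - \log L_c\big(\hat{\beta}; X\big) \right)
\]

Under the Rasch model, \(LR\) is asymptotically \(\chi^2\)-distributed with \((G - 1)(k - 1)\) degrees of freedom. With \(G = 2\) groups and \(k = 50\) items, this gives \((2 - 1)(50 - 1) = 49\), which matches the degrees of freedom reported for the test.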

lr_test <- 
  LRtest(rasch_model, splitcr = "median")

lr_test

Andersen LR-test: 
LR-value: 2480.627 
Chi-square df: 49 
p-value:  0 
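The printed p-value of 0 reflects how small the value is rather than an exact zero. As a rough check using base R only (the object names are illustrative), the upper tail of a chi-square distribution with 49 degrees of freedom can be evaluated directly and on the log scale, which is less prone to numerical underflow:

```r
lr_value <- 2480.627  # LR statistic reported by the test
lr_df <- 49           # degrees of freedom reported by the test

# Upper-tail probability; for a statistic this large it may underflow to 0
p_value <- pchisq(lr_value, df = lr_df, lower.tail = FALSE)

# The same probability on the natural-log scale
log_p_value <- pchisq(lr_value, df = lr_df, lower.tail = FALSE, log.p = TRUE)

print(p_value)
print(log_p_value)
```

Either way, the probability is far below the conventional .001 threshold, which is why it is reported as \(p < .001\).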

Next, we calculate the differences in item difficulty estimates between the two groups and visualise the largest differences:

# eRm reports item easiness (beta); negate to obtain item difficulty
group_item_beta <-
  data.table(
    item_name = paste0("Beta Q", 1:50),
    low_group = -lr_test$betalist$low,
    high_group = -lr_test$betalist$high
  )

group_item_beta[, group_diff := abs(low_group - high_group)]
group_item_beta[, rank_diff := frank(-group_diff)]

group_item_beta |>
  ggplot(aes(x = low_group, y = high_group)) +
  geom_point() +  # exact positions, so points line up with their labels
  geom_text(
    data = group_item_beta[rank_diff %in% 1:5],
    aes(
      label = item_name
    )
  ) +
  geom_abline() +
  labs(
    x = "Low Anxiety Score Group",
    y = "High Anxiety Score Group"
  ) +
  scale_x_continuous(limits = c(-4, 4)) +
  scale_y_continuous(limits = c(-4, 4)) +
  theme_classic()

The diagonal line represents identical item difficulties in the two groups. Items far from the line show a large difference between the high and low groups, indicating potential multidimensionality. The five items with the largest differences are labelled for further analysis. For example, item 27 (I frequently find myself worrying about something) is a very easy item in the high-scoring group, but only moderately easy in the low-scoring group. Item 15 (My hands and feet are usually warm), on the other hand, is a more challenging item for the high-scoring group than for the low-scoring group.
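To read the direction of a difference as well as its size, the signed gap between the two groups can be computed. The sketch below uses made-up difficulty values purely for illustration (they are not estimates from this analysis), and the names `toy_beta`, `signed_diff`, and the item labels are hypothetical:

```r
# Hypothetical item difficulties (logits) for three items in each score group
toy_beta <- data.frame(
  item_name = c("A", "B", "C"),
  low_group = c(-1.0, 0.5, 0.2),
  high_group = c(-2.5, 0.6, 1.4)
)

# Positive values: the item is harder for the high-scoring group
toy_beta$signed_diff <- toy_beta$high_group - toy_beta$low_group

# Rank by absolute difference, largest first (1 = furthest from the diagonal)
toy_beta$rank_diff <- rank(-abs(toy_beta$signed_diff))

print(toy_beta)
```

In this toy example, item A mirrors the pattern described for item 27 (easier in the high-scoring group), while item C mirrors item 15 (harder in the high-scoring group).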

Conclusion

Through item analysis, we identified several items that could be dropped due to their limited ability to differentiate between individuals with low and high anxiety. Content analysis suggests that the scale may not capture a single dimension of anxiety. This is consistent with the Rasch model results: Andersen’s Likelihood Ratio test was significant, indicating that item difficulties differ between score groups and that the unidimensionality assumption may be violated.

Next Steps

To refine this scale, future work could focus on removing poorly performing items and re-evaluating the scale’s dimensionality, considering whether separate subscales for physical and psychological anxiety may be more appropriate. Ultimately, enhancing the scale in these ways could increase its reliability and interpretive power for assessing anxiety.