# Libraries ------------------
library(data.table)
library(flextable)
library(haven)
library(lavaan)
library(stringi)
Overview
The Bullshitting Frequency Scale (BFS) was originally developed to assess how frequently individuals engage in “everyday bullshitting”: the act of communicating with little regard for truth, often to impress, influence, or avoid discomfort. Here, I revisit the BFS using exploratory factor analysis (EFA) to test its underlying structure and psychometric validity.
The data come from the original scale development paper. The original 18 items are presented in Table 1.
# Read Data ------------------------
bsf_data <- read_sav("BSF Study 1 - validation.sav") |>
  as.data.frame() |>
  as.data.table()
# Extract Statement Details ---------
item_information <- bsf_data[, .(Item = names(.SD),
                                 Statement = lapply(.SD, function(x)
                                   attributes(x)$label)),
                             .SDcols = patterns("BSF_")][Statement != "NULL"]
item_information[, Statement := gsub("BFS - ", "", Statement)]
item_information |>
  as_flextable(
    max_row = nrow(item_information),
    show_coltype = FALSE)
Item | Statement |
---|---|
BSF_1 | By pretending to know more about a topic than I actually do. |
BSF_2 | When I want the thing(s) I'm talking about to sound more interesting or exciting. |
BSF_3 | When I know it will be easy to get away with it. |
BSF_4 | When I want to impress the people I'm talking to. |
BSF_5 | When I know it will get me what I need or want. |
BSF_6 | When someone asks me something that I want to avoid giving a direct answer to. |
BSF_7 | When I need to fake/bluff my way out of a conversation or situation. |
BSF_8 | When being fully honest would be harmful or embarrassing to me or someone else. |
BSF_9 | When I want others to see me as more intelligent or knowledgeable. |
BSF_10 | When I want to contribute to a conversation or discussion even though I'm not well-informed on the topic. |
BSF_11 | When I know it will help me achieve a goal. |
BSF_12 | When I feel obligated to share my opinion. |
BSF_13 | Regardless of whether I actually know what I'm talking about. |
BSF_14 | When I'm trying to fit in better or be more accepted by the person or people I'm interacting with. |
BSF_15 | When I'm trying to avoid looking stupid. |
BSF_16 | When I want to deflect criticism or questions that might make me look bad. |
BSF_17 | When I'm "put on the spot" and asked about something I don't know much about. |
BSF_18 | When I don't want to tell someone what I really think. |
n: 18 |
Analytical Approach
To analyse the data, the original authors applied Exploratory and Confirmatory Factor Analysis (EFA and CFA) to the same data. We don’t follow that approach here. Instead, EFA will be applied to the data set to explore various factor solutions, without using CFA to confirm a structure with the same data.
EFA Model Fit Results
All 18 original items of the BFS were used in the EFA. One to four factors were extracted using geomin rotation and the DWLS estimator.
efa_mods <- efa(bsf_data[, BSF_1:BSF_18], nfactors = 1:4, rotation = "geomin", ordered = TRUE)
efa_fit_measures <- fitmeasures(efa_mods,
                                fit.measures = c("chisq.scaled",
                                                 "df.scaled",
                                                 "pvalue.scaled",
                                                 "cfi.robust",
                                                 "rmsea.robust")) |>
  as.data.table(keep.rownames = TRUE) |>
  melt(id.vars = "rn")
setnames(
  efa_fit_measures,
  old = c("rn", "variable"),
  new = c("Fit Measure", "Factor")
)
efa_fit_measures[, Factor := stri_extract(Factor, regex = "[1-4]")]
efa_fit_measures <- efa_fit_measures |>
  dcast(Factor ~ `Fit Measure`, value.var = "value")
efa_fit_measures[, .(Factor, chisq.scaled, df.scaled, pvalue.scaled, cfi.robust, rmsea.robust)] |>
  as_flextable(show_coltype = FALSE) |>
  colformat_double(digits = 2) |>
  set_header_labels(values = c("Factor", "Chi-Square", "df", "p-value", "CFI", "RMSEA"))
Factor | Chi-Square | df | p-value | CFI | RMSEA |
---|---|---|---|---|---|
1 | 644.64 | 135 | 0.00 | 0.86 | 0.12 |
2 | 432.06 | 118 | 0.00 | 0.91 | 0.11 |
3 | 238.05 | 102 | 0.00 | 0.95 | 0.09 |
4 | 172.00 | 87 | 0.00 | 0.96 | 0.08 |
n: 4 |
Fit measures for the one- to four-factor models are presented in Table 2. Global fit improves with each additional factor, so on fit alone the four-factor model (CFI = 0.96, RMSEA = 0.08) would be preferred. The one- and two-factor solutions are the least preferred options.
Four-Factor Model
Table 3 shows that a four-factor solution is ill-defined in terms of its loadings (loadings with absolute values below .3 are suppressed). There are many cross-loadings and items without a clear target factor. In this case, pursuing better global fit yields a model with an unclear, diffuse factor structure.
efa_4_factor_loadings <- efa_mods$loadings$nf4 |>
  as.data.table(keep.rownames = TRUE)
cols_to_change <- grep("^f", names(efa_4_factor_loadings), value = TRUE)
efa_4_factor_loadings[, (cols_to_change) := lapply(.SD, function(x) ifelse(abs(x) >= .3, x, NA)), .SDcols = cols_to_change]
efa_4_factor_loadings |>
  as_flextable(
    max_row = nrow(efa_4_factor_loadings),
    show_coltype = FALSE) |>
  colformat_double(digits = 2) |>
  set_header_labels(values = c("Item", "F1", "F2", "F3", "F4")) |>
  autofit()
Item | F1 | F2 | F3 | F4 |
---|---|---|---|---|
BSF_1 | 0.71 | |||
BSF_2 | 0.67 | |||
BSF_3 | 0.37 | 0.69 | ||
BSF_4 | 0.75 | |||
BSF_5 | 0.61 | 0.55 | ||
BSF_6 | 0.72 | |||
BSF_7 | 0.58 | |||
BSF_8 | 0.62 | |||
BSF_9 | 0.80 | |||
BSF_10 | 0.83 | |||
BSF_11 | 0.31 | 0.66 | ||
BSF_12 | 0.31 | 0.74 | ||
BSF_13 | 0.94 | |||
BSF_14 | 0.86 | |||
BSF_15 | 0.65 | |||
BSF_16 | 0.43 | 0.42 | ||
BSF_17 | 0.74 | |||
BSF_18 | 0.84 | |||
n: 18 |
Two-Factor Model
Loadings of the two-factor model are presented in Table 4. Items 6, 8, and 18 load strongly on a separate second factor. In all three cases, the statements refer to another person. Item 7 loads moderately on this second factor and also reflects interactions with others, suggesting a common theme of interpersonal avoidance or mitigation. Loadings on the first factor are moderate to strong, with items 15 and 16 loading lowest within the set. Thematically, this factor appears to be about general bullshitting to achieve personal or social goals.
efa_2_factor_loadings <- efa_mods$loadings$nf2 |>
  as.data.table(keep.rownames = TRUE)
cols_to_change <- grep("^f", names(efa_2_factor_loadings), value = TRUE)
efa_2_factor_loadings[, (cols_to_change) := lapply(.SD, function(x) ifelse(abs(x) >= .3, x, NA)), .SDcols = cols_to_change]
efa_2_factor_loadings |>
  as_flextable(
    max_row = nrow(efa_2_factor_loadings),
    show_coltype = FALSE) |>
  colformat_double(digits = 2) |>
  set_header_labels(values = c("Item", "F1", "F2")) |>
  autofit()
Item | F1 | F2 |
---|---|---|
BSF_1 | 0.76 | |
BSF_2 | 0.75 | |
BSF_3 | 0.84 | |
BSF_4 | 0.86 | |
BSF_5 | 0.79 | |
BSF_6 | | 0.69 |
BSF_7 | | 0.53 |
BSF_8 | | 0.64 |
BSF_9 | 0.87 | |
BSF_10 | 0.81 | |
BSF_11 | 0.76 | |
BSF_12 | 0.63 | |
BSF_13 | 0.88 | |
BSF_14 | 0.78 | |
BSF_15 | 0.59 | |
BSF_16 | 0.42 | 0.45 |
BSF_17 | 0.62 | |
BSF_18 | | 0.85 |
n: 18 |
We know the two-factor model has less than adequate global fit. To go further, we can inspect the residual correlations to identify points of misfit. Heuristics can be employed when inspecting residuals (e.g., considering only residuals with absolute values ≥ .2). Here, I’m interested in any absolute value ≥ .1. These occur between:
Item 3: Residual with 12
Item 5: Residuals with 12, 15, 17
Item 11: Residual with 13
residuals(efa_mods$nf2, type = "cor")$cov
BSF_1 BSF_2 BSF_3 BSF_4 BSF_5 BSF_6 BSF_7 BSF_8 BSF_9 BSF_10
BSF_1 0.000
BSF_2 -0.024 0.000
BSF_3 -0.030 0.024 0.000
BSF_4 0.018 0.025 0.026 0.000
BSF_5 -0.071 0.041 0.086 0.023 0.000
BSF_6 0.026 0.004 0.061 -0.008 0.003 0.000
BSF_7 0.028 0.005 0.040 0.045 0.043 0.010 0.000
BSF_8 -0.060 0.014 0.034 0.075 0.062 0.022 -0.004 0.000
BSF_9 0.055 -0.004 0.006 0.008 -0.015 -0.037 -0.025 -0.001 0.000
BSF_10 0.050 -0.056 0.021 -0.018 -0.093 0.046 -0.042 -0.083 -0.030 0.000
BSF_11 -0.075 0.031 -0.015 -0.003 0.096 -0.010 -0.040 0.010 0.006 -0.010
BSF_12 -0.058 -0.019 -0.109 -0.023 -0.106 -0.037 0.014 -0.001 -0.023 -0.017
BSF_13 0.013 -0.033 0.029 -0.021 -0.083 0.009 -0.034 -0.031 -0.015 0.010
BSF_14 -0.053 0.012 -0.067 0.001 -0.012 -0.014 -0.065 0.016 -0.030 0.030
BSF_15 0.074 -0.017 -0.076 -0.057 -0.158 -0.013 0.025 -0.068 0.026 0.026
BSF_16 -0.009 -0.007 -0.041 -0.014 0.009 -0.028 0.004 -0.002 0.025 -0.045
BSF_17 -0.010 -0.041 -0.081 -0.088 -0.118 -0.037 -0.032 -0.024 -0.024 0.058
BSF_18 0.026 -0.004 -0.018 -0.054 -0.031 0.004 -0.010 0.000 0.015 0.030
BSF_11 BSF_12 BSF_13 BSF_14 BSF_15 BSF_16 BSF_17 BSF_18
BSF_1
BSF_2
BSF_3
BSF_4
BSF_5
BSF_6
BSF_7
BSF_8
BSF_9
BSF_10
BSF_11 0.000
BSF_12 0.015 0.000
BSF_13 -0.104 0.069 0.000
BSF_14 0.001 0.080 0.008 0.000
BSF_15 -0.038 0.002 0.018 0.011 0.000
BSF_16 0.001 0.019 -0.005 0.019 0.051 0.000
BSF_17 -0.059 0.060 0.083 0.031 0.062 -0.024 0.000
BSF_18 0.027 -0.006 -0.003 -0.013 -0.017 -0.002 0.051 0.000
The flagged residuals cluster around items 3, 5, 12, 15, and 17. Items 12, 15, and 17 cover social navigation and a need to save face, whereas items 3 and 5 are linked to strategic deceit.
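The thresholding rule used above can be sketched in a few lines of base R. This is a minimal illustration rather than part of the original analysis: `flag_residuals()` is a hypothetical helper, and the small matrix holds only the residuals among items 3, 5, 12, 15, and 17, copied from the printed output.

```r
# Residual correlations among five items, copied from the two-factor output.
items <- c("BSF_3", "BSF_5", "BSF_12", "BSF_15", "BSF_17")
res <- matrix(0, nrow = 5, ncol = 5, dimnames = list(items, items))
res[lower.tri(res)] <- c(
   0.086, -0.109, -0.076, -0.081,  # BSF_3 with items 5, 12, 15, 17
  -0.106, -0.158, -0.118,          # BSF_5 with items 12, 15, 17
   0.002,  0.060,                  # BSF_12 with items 15, 17
   0.062                           # BSF_15 with item 17
)

# Hypothetical helper: list item pairs whose absolute residual meets the cutoff.
# Only the lower triangle is scanned, so each pair is reported once.
flag_residuals <- function(res, cutoff = 0.1) {
  idx <- which(abs(res) >= cutoff & lower.tri(res), arr.ind = TRUE)
  data.frame(
    item_a   = rownames(res)[idx[, "row"]],
    item_b   = colnames(res)[idx[, "col"]],
    residual = res[idx],
    stringsAsFactors = FALSE
  )
}

flagged <- flag_residuals(res)
print(flagged)
```

With the fitted models in memory, the same helper could presumably be applied to the full matrix, for example `flag_residuals(residuals(efa_mods$nf2, type = "cor")$cov)`.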
Scale Refinement
If we were to consider a general factor of BS-ing, a second factor of interpersonal avoidance seems tangential. Therefore, let’s consider a unidimensional model based on the items with the strongest absolute loadings (≥ .7) on the general BS factor. Ten items meet this criterion: 1, 2, 3, 4, 5, 9, 10, 11, 13, and 14. I have chosen to drop item 5 on account of its high number of large residuals with other items, which leaves nine items.
Thematically, the remaining nine items describe motivations behind the act of BS-ing, as opposed to a socially motivated perspective.
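The selection rule can be written out explicitly. The sketch below is an illustration under an assumption: the first-factor loadings are copied from Table 4, and items 6, 7, 8, and 18 are left out because the text assigns them to the second factor.

```r
# First-factor loadings from the two-factor solution (Table 4).
# Assumption: items 6, 7, 8, and 18 belong to the second factor and are omitted.
f1_loadings <- c(
  BSF_1 = 0.76, BSF_2 = 0.75, BSF_3 = 0.84, BSF_4 = 0.86, BSF_5 = 0.79,
  BSF_9 = 0.87, BSF_10 = 0.81, BSF_11 = 0.76, BSF_12 = 0.63, BSF_13 = 0.88,
  BSF_14 = 0.78, BSF_15 = 0.59, BSF_16 = 0.42, BSF_17 = 0.62
)

# Keep items with absolute loadings of at least .7 on the general factor.
candidates <- names(f1_loadings)[abs(f1_loadings) >= 0.7]

# Drop item 5 because of its large residuals with other items.
refined_items <- setdiff(candidates, "BSF_5")
print(refined_items)
```

The resulting vector matches the nine items passed to `efa()` for the refined model.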
Fit measures for the refined scale are presented in Table 5. Compared to the 18-item single-factor model, the 9-item model shows improved global fit (CFI of 0.97 versus 0.86, RMSEA of 0.09 versus 0.12).
efa_refined_mod <- efa(bsf_data[, .(BSF_1, BSF_2, BSF_3, BSF_4, BSF_9, BSF_10, BSF_11, BSF_13, BSF_14)], nfactors = 1, rotation = "geomin", ordered = TRUE)
efa_fit_measures <- fitmeasures(efa_refined_mod,
                                fit.measures = c("chisq.scaled",
                                                 "df.scaled",
                                                 "pvalue.scaled",
                                                 "cfi.robust",
                                                 "rmsea.robust")) |>
  as.data.table(keep.rownames = TRUE) |>
  melt(id.vars = "rn")
setnames(
  efa_fit_measures,
  old = c("rn", "variable"),
  new = c("Fit Measure", "Factor")
)
efa_fit_measures[, Factor := stri_extract(Factor, regex = "[1]")]
efa_fit_measures <- efa_fit_measures |>
  dcast(Factor ~ `Fit Measure`, value.var = "value")
flextable(efa_fit_measures[, .(Factor, chisq.scaled, df.scaled, pvalue.scaled, cfi.robust, rmsea.robust)]) |>
  colformat_double(digits = 2) |>
  set_header_labels(values = c("Factor", "Chi-Square", "df", "p-value", "CFI", "RMSEA"))
Factor | Chi-Square | df | p-value | CFI | RMSEA |
---|---|---|---|---|---|
1 | 84.39 | 27.00 | 0.00 | 0.97 | 0.09 |
The factor loadings for a one-factor model using the refined 9-item scale are presented in Table 6. Loadings are strong, and the single factor accounts for 64.55% of the variance in the items.
efa_1_factor_loadings <- efa_refined_mod$loadings |>
  as.data.table(keep.rownames = TRUE)
cols_to_change <- grep("^f", names(efa_1_factor_loadings), value = TRUE)
efa_1_factor_loadings[, (cols_to_change) := lapply(.SD, function(x) ifelse(abs(x) >= .3, x, NA)), .SDcols = cols_to_change]
efa_1_factor_loadings |>
  as_flextable(
    max_row = nrow(efa_1_factor_loadings),
    show_coltype = FALSE) |>
  colformat_double(digits = 2) |>
  set_header_labels(values = c("Item", "F1")) |>
  autofit()
Item | F1 |
---|---|
BSF_1 | 0.79 |
BSF_2 | 0.77 |
BSF_3 | 0.80 |
BSF_4 | 0.86 |
BSF_9 | 0.83 |
BSF_10 | 0.81 |
BSF_11 | 0.78 |
BSF_13 | 0.78 |
BSF_14 | 0.82 |
n: 9 |
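As a rough check on the variance figure, the share of item variance explained by a single factor can be approximated as the mean of the squared standardized loadings. The sketch below uses the rounded loadings from Table 6, so it can only approximate the reported value, and whether the original figure was computed this way is an assumption.

```r
# Standardized loadings of the refined one-factor model, as rounded in Table 6.
loadings <- c(
  BSF_1 = 0.79, BSF_2 = 0.77, BSF_3 = 0.80, BSF_4 = 0.86, BSF_9 = 0.83,
  BSF_10 = 0.81, BSF_11 = 0.78, BSF_13 = 0.78, BSF_14 = 0.82
)

# Each squared loading is the share of that item's variance explained by the factor.
communalities <- loadings^2

# Averaging over items gives the overall proportion of variance explained.
prop_var_explained <- mean(communalities)
print(round(100 * prop_var_explained, 1))
```

The result lands close to the reported percentage; a small gap is expected because the inputs here are rounded to two decimals.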
Residuals show that areas of localised strain have reduced, with no absolute value being ≥ .10.
residuals(efa_refined_mod$nf1, type = "cor")$cov
BSF_1 BSF_2 BSF_3 BSF_4 BSF_9 BSF_10 BSF_11 BSF_13 BSF_14
BSF_1 0.000
BSF_2 -0.033 0.000
BSF_3 -0.037 0.016 0.000
BSF_4 0.008 0.008 0.014 0.000
BSF_9 0.048 -0.009 -0.004 0.002 0.000
BSF_10 0.039 -0.059 0.012 -0.030 -0.035 0.000
BSF_11 -0.066 0.048 0.002 0.003 0.018 -0.001 0.000
BSF_13 0.015 -0.025 0.046 -0.021 -0.009 0.013 -0.075 0.000
BSF_14 -0.049 0.020 -0.062 0.002 -0.023 0.031 0.029 0.026 0.000
Summary and Takeaways
Four-factor solution: Best overall fit, but not interpretable, likely “overfitted noise.”
Two-factor solution: Thematically coherent, suggesting (1) general bullshitting and (2) interpersonal avoidance/mitigation.
One-factor refined scale solution: Improved global fit, interpretable factor, and fewer areas of localised strain.
Next steps: For applied work or future refinement, a two-factor model appears most interpretable, but a unitary, high-loading 9-item scale offers a practical, psychometrically robust alternative. Further validation in independent samples is recommended.
Bottom line: Sometimes, statistical fit rewards unnecessary complexity. For the BFS, clarity and interpretability matter. Both a two-factor solution (general + interpersonal avoidance) and a refined one-factor, 9-item scale offer practical, theory-informed ways forward.
Takeaway: If you need a robust, interpretable scale for "everyday bullshitting," try the 9-item version (BSF_1, 2, 3, 4, 9, 10, 11, 13, 14), and validate it in your own data.