library(data.table)
library(flextable)
library(haven)
library(lavaan)
library(rstanarm)
<-
study_data read_sav("RemoteWorkData.sav") |>
as.data.table()
Remote working is now a defining feature of many jobs, offering employees greater flexibility and reduced commuting costs. However, it also comes with trade-offs, most notably, a decline in day-to-day social interactions, which can lead to feelings of isolation.
I recently came across a published paper examining the link between job satisfaction and remote worker status. While the authors used regression-based modelling to explore their research questions, they did so without addressing the underlying measurement model. Since the dataset is open-source, we have the opportunity to revisit their analysis. In this post, I’ll re-examine two of their research questions, aiming both to replicate their findings and assess the validity of the measurement model:
What is the relationship between job satisfaction and remote worker status (remote vs. non-remote)?
What is the relationship between teamwork and job satisfaction?
Table 1 lists the items that are used to answer the two research questions.
study_data[, .(item_id = names(.SD),
item_label = unlist(lapply(.SD, function (x) attr(x, "label")))
|>
)] grepl("^Job_Satisfaction|^TeamWork", item_id, perl = T)] |>
_[as_flextable(
max_row = 14,
show_coltype = F) |>
set_header_labels(
values = c("Item", "Label")
)
Item | Label |
---|---|
Job_Satisfaction_1 | How do you feel about your job? |
Job_Satisfaction_2 | How do you feel about the people you work with-your co-workers? |
Job_Satisfaction_3 | How do you feel about the work you do on your job-the work itself? |
Job_Satisfaction_4 | What is it like where your work-the physical surroundings, the hours, the amount of work you are asked to do? |
Job_Satisfaction_5 | How do you feel about what you have available for doing your job¬ I mean the equipment, information, good supervision, and so on? |
TeamWork_1 | In our Team at work, - we communicate in a timely manner |
TeamWork_2 | In our Team at work, - we communicate exactly and precisely. |
TeamWork_3 | In our Team at work, - we communicate result-oriented. |
TeamWork_4 | In our Team at work, - we keep each other posted. |
TeamWork_5 | In our Team at work, - we proactively inform the others about relevant news. |
TeamWork_6 | In our Team at work, - we exchange our knowledge and experiences. |
TeamWork_7 | In our Team at work, - we come up with solutions through spontaneous conversations. |
TeamWork_8 | In our Team at work, - we also discuss things spontaneously. |
TeamWork_9 | In our Team at work, - we communicate on a short notice. |
n: 14 |
Research Question 1
To answer research question 1, we begin by specifying a measurement model for job satisfaction. All five job satisfaction items are hypothesised to load onto a single latent factor. As the items are ordinal, we used a categorical estimator.
<- "
rq_1_model job_satisfaction =~ Job_Satisfaction_1 + Job_Satisfaction_2 + Job_Satisfaction_3 + Job_Satisfaction_4 + Job_Satisfaction_5
"
<- sem(rq_1_model,
rq_1_fit
study_data,estimator = "ULSMV",
ordered = T)
Model Fit
Based on global fit indices (see Table 2), the model does not fit the data well. Although the Comparative Fit Index (CFI) is borderline acceptable at 0.93, the RMSEA is high at 0.16, and the chi-square test is significant, indicating poor model fit overall.
fitmeasures(
rq_1_fit,fit.measures = c(
"chisq.scaled",
"df.scaled",
"pvalue.scaled",
"cfi.robust",
"tli.robust",
"rmsea.robust"
)|>
) as.data.table(keep.rownames = T) |>
:= c(
_[, V1 "χ² (scaled)",
"df",
"p-value",
"CFI",
"TLI",
"RMSEA"
|>
)] as_flextable(show_coltype = F) |>
colformat_double(digits = 2) |>
set_header_labels(
values = c("Fit Index", "Value")
)
Fit Index | Value |
---|---|
χ² (scaled) | 23.08 |
df | 5.00 |
p-value | 0.00 |
CFI | 0.93 |
TLI | 0.87 |
RMSEA | 0.16 |
n: 6 |
Residual Inspection
Inspection of the residual correlation matrix suggests that item 4 shows substantial residual correlations with item 2 (-0.113) and item 5 (0.096). This may indicate content overlap. Item 2 references co-workers, and Item 5 addresses equipment and supervision, both of which relate to the broader work environment. Item 4, which asks about physical surroundings and workload, appears to blend multiple aspects of the workplace, leading to potential multidimensionality.
<-
rq_1_residuals residuals(rq_1_fit)$cov
upper.tri(rq_1_residuals)] <- NA
rq_1_residuals[
|>
rq_1_residuals as.data.table(keep.rownames = T) |>
as_flextable(
show_coltype = F
|>
) colformat_double(
digits = 3
)
rn | Job_Satisfaction_1 | Job_Satisfaction_2 | Job_Satisfaction_3 | Job_Satisfaction_4 | Job_Satisfaction_5 |
---|---|---|---|---|---|
Job_Satisfaction_1 | 0.000 | ||||
Job_Satisfaction_2 | 0.094 | 0.000 | |||
Job_Satisfaction_3 | 0.023 | -0.002 | 0.000 | ||
Job_Satisfaction_4 | -0.038 | -0.113 | 0.021 | 0.000 | |
Job_Satisfaction_5 | -0.044 | 0.001 | -0.047 | 0.096 | 0 |
n: 5 |
Model Re-Specification
The model is re-estimated with the re-specification of correlated errors between items 2, 4, and 5.
<- "
rq_1_model job_satisfaction =~ Job_Satisfaction_1 + Job_Satisfaction_2 + Job_Satisfaction_3 + Job_Satisfaction_4 + Job_Satisfaction_5
Job_Satisfaction_2 ~~ Job_Satisfaction_4
Job_Satisfaction_2 ~~ Job_Satisfaction_5
Job_Satisfaction_4 ~~ Job_Satisfaction_5
"
<- sem(rq_1_model,
rq_1_fit
study_data,estimator = "ULSMV",
ordered = T)
Model Fit (Re-Specified Model)
The re-specified model shows excellent fit (see Table 3). CFI is 1.00, RMSEA is 0.04, and the chi-square test is non-significant, suggesting that the unidimensional model for job satisfaction now fits the data well.
fitmeasures(
rq_1_fit,fit.measures = c(
"chisq.scaled",
"df.scaled",
"pvalue.scaled",
"cfi.robust",
"tli.robust",
"rmsea.robust"
)|>
) as.data.table(keep.rownames = T) |>
:= c(
_[, V1 "χ² (scaled)",
"df",
"p-value",
"CFI",
"TLI",
"RMSEA"
|>
)] as_flextable(show_coltype = F) |>
colformat_double(digits = 2) |>
set_header_labels(
values = c("Fit Index", "Value")
)
Fit Index | Value |
---|---|
χ² (scaled) | 3.41 |
df | 2.00 |
p-value | 0.18 |
CFI | 1.00 |
TLI | 0.99 |
RMSEA | 0.04 |
n: 6 |
Residuals correlations have also improved, with no notable areas of localised strain.
<-
rq_1_residuals residuals(rq_1_fit)$cov
upper.tri(rq_1_residuals)] <- NA
rq_1_residuals[
|>
rq_1_residuals as.data.table(keep.rownames = T) |>
as_flextable(
show_coltype = F
|>
) colformat_double(
digits = 3
)
rn | Job_Satisfaction_1 | Job_Satisfaction_2 | Job_Satisfaction_3 | Job_Satisfaction_4 | Job_Satisfaction_5 |
---|---|---|---|---|---|
Job_Satisfaction_1 | 0.000 | ||||
Job_Satisfaction_2 | 0.033 | 0.000 | |||
Job_Satisfaction_3 | 0.000 | -0.043 | 0.000 | ||
Job_Satisfaction_4 | -0.028 | 0.000 | 0.036 | 0 | |
Job_Satisfaction_5 | 0.003 | -0.000 | -0.004 | 0 | 0 |
n: 5 |
Structural Model
Having confirmed an acceptable measurement model, we now proceed to evaluate the structural component by introducing remote worker status as a covariate. In this model, we regress the latent variable job_satisfaction on the observed binary covariate (remote vs. non-remote).
We assume no direct effects of the covariate on the latent variable’s indicators. Estimating such effects without justification would lead to an underidentified model. Instead, we take an exploratory approach: all direct effects are fixed to zero, and we inspect modification indices to assess whether any residual direct effects appear to be significant.
<- "
rq_1_model job_satisfaction =~ Job_Satisfaction_1 + Job_Satisfaction_2 + Job_Satisfaction_3 + Job_Satisfaction_4 + Job_Satisfaction_5
Job_Satisfaction_2 ~~ Job_Satisfaction_4
Job_Satisfaction_2 ~~ Job_Satisfaction_5
Job_Satisfaction_4 ~~ Job_Satisfaction_5
job_satisfaction ~ Remote_Work
Job_Satisfaction_1 ~ 0*Remote_Work
Job_Satisfaction_2 ~ 0*Remote_Work
Job_Satisfaction_3 ~ 0*Remote_Work
Job_Satisfaction_4 ~ 0*Remote_Work
Job_Satisfaction_5 ~ 0*Remote_Work
"
<- sem(rq_1_model,
rq_1_fit
study_data,estimator = "ULSMV",
ordered = T)
The regression of job satisfaction on remote worker status showed no significant effect, consistent with the findings of the original paper. The parameter estimate is small and non-significant, with a wide confidence interval that crosses zero (see Table 4).
parameterestimates(rq_1_fit) |>
as.data.table() |>
== "job_satisfaction" & op == "~" & rhs == "Remote_Work"] |>
_[lhs flextable() |>
colformat_double(digits = 2)
lhs | op | rhs | est | se | z | pvalue | ci.lower | ci.upper |
---|---|---|---|---|---|---|---|---|
job_satisfaction | ~ | Remote_Work | -0.10 | 0.18 | -0.53 | 0.60 | -0.46 | 0.26 |
These results indicate that, in this sample, being a remote worker is not associated with a statistically significant change in overall job satisfaction.
Differential Item Functioning
To examine potential differential item functioning (DIF), we inspect modification indices for direct effects of the remote work variable on individual job satisfaction items. None of the modification indices exceed typical thresholds for model re-specification (e.g., MI > 3.84), suggesting that there is no evidence of DIF related to remote status (see Table 5).
modificationindices(rq_1_fit) |>
as.data.table() |>
== "~" & grepl("^Job_Satisfaction_[0-9]", lhs, perl = T) & grepl("Remote_Work", rhs)] |>
_[op as_flextable(show_coltype = F) |>
colformat_double(digits = 2)
lhs | op | rhs | mi | epc | sepc.lv | sepc.all | sepc.nox |
---|---|---|---|---|---|---|---|
Job_Satisfaction_1 | ~ | Remote_Work | 1.96 | 0.13 | 0.13 | 0.06 | 0.13 |
Job_Satisfaction_2 | ~ | Remote_Work | 0.34 | 0.05 | 0.05 | 0.02 | 0.05 |
Job_Satisfaction_3 | ~ | Remote_Work | 0.41 | 0.05 | 0.05 | 0.02 | 0.05 |
Job_Satisfaction_4 | ~ | Remote_Work | 1.73 | -0.11 | -0.11 | -0.05 | -0.11 |
Job_Satisfaction_5 | ~ | Remote_Work | 2.40 | -0.13 | -0.13 | -0.06 | -0.13 |
n: 5 |
Research Question 2
For Research Question 2, we examine the relationship between teamwork and job satisfaction. Specifically, we regress the latent factor of job satisfaction onto latent teamwork factors.
To simplify the model and isolate the relationship of interest, we omit the remote worker covariate from this analysis.
Initially, we attempt to estimate a four-factor measurement model for teamwork, consistent with the original paper. However, this resulted in computational issues, likely due to poor factor separation or estimation instability in the sample.
<- "
rq_2_model job_satisfaction =~ Job_Satisfaction_1 + Job_Satisfaction_2 + Job_Satisfaction_3 + Job_Satisfaction_4 + Job_Satisfaction_5
Job_Satisfaction_2 ~~ Job_Satisfaction_4
Job_Satisfaction_2 ~~ Job_Satisfaction_5
Job_Satisfaction_4 ~~ Job_Satisfaction_5
focused_communication =~ TeamWork_1 + TeamWork_2 + TeamWork_3
knowledge_sharing =~ TeamWork_4 + TeamWork_5 + TeamWork_6
spontaneous_communication =~ TeamWork_7 + TeamWork_8 + TeamWork_9
"
<- sem(rq_2_model,
rq_2_fit
study_data,estimator = "ULSMV",
ordered = T)
Warning: lavaan->lav_model_vcov():
The variance-covariance matrix of the estimated parameters (vcov) does not
appear to be positive definite! The smallest eigenvalue (= 7.344696e-18)
is close to zero. This may be a symptom that the model is not identified.
At face value, the items appear to perform reasonably. To further probe potential issues, we inspect the correlation matrix between the job satisfaction and teamwork items.
This allows us to:
Identify items with unusually low or high correlations,
Check for cross-loadings or overlapping constructs,
Assess whether specific items may be contributing to estimation difficulties.
Initial inspection reveals that Job_Satisfaction_2 (“How do you feel about the people you work with – your co-workers?”) shows moderate correlations with several teamwork items. This is not surprising given the shared focus on interpersonal aspects of work.
This overlap may indicate that Job_Satisfaction_2 taps into a construct more aligned with teamwork than with satisfaction per se, potentially contributing to cross-loading or model instability.
<-
rq_2_cor lavCor(study_data[, .SD, .SDcols = patterns("Job_Satisfaction_[0-9]|TeamWork_[0-9]")], ordered = T)
upper.tri(rq_2_cor)] <- NA
rq_2_cor[
|>
rq_2_cor as.data.table(keep.rownames = T) |>
as_flextable(
max_row = 14,
show_coltype = F
|>
) colformat_double(
digits = 3
)
rn | Job_Satisfaction_1 | Job_Satisfaction_2 | Job_Satisfaction_3 | Job_Satisfaction_4 | Job_Satisfaction_5 | TeamWork_1 | TeamWork_2 | TeamWork_3 | TeamWork_4 | TeamWork_5 | TeamWork_6 | TeamWork_7 | TeamWork_8 | TeamWork_9 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Job_Satisfaction_1 | 1.000 | |||||||||||||
Job_Satisfaction_2 | 0.494 | 1.000 | ||||||||||||
Job_Satisfaction_3 | 0.594 | 0.316 | 1.000 | |||||||||||
Job_Satisfaction_4 | 0.584 | 0.222 | 0.504 | 1.000 | ||||||||||
Job_Satisfaction_5 | 0.589 | 0.332 | 0.447 | 0.625 | 1.000 | |||||||||
TeamWork_1 | 0.242 | 0.508 | 0.199 | 0.267 | 0.425 | 1.000 | ||||||||
TeamWork_2 | 0.316 | 0.483 | 0.237 | 0.321 | 0.409 | 0.757 | 1.000 | |||||||
TeamWork_3 | 0.155 | 0.314 | 0.187 | 0.284 | 0.451 | 0.461 | 0.512 | 1.000 | ||||||
TeamWork_4 | 0.398 | 0.601 | 0.275 | 0.314 | 0.422 | 0.656 | 0.592 | 0.510 | 1.000 | |||||
TeamWork_5 | 0.382 | 0.542 | 0.149 | 0.307 | 0.386 | 0.539 | 0.409 | 0.410 | 0.807 | 1.000 | ||||
TeamWork_6 | 0.404 | 0.592 | 0.195 | 0.198 | 0.334 | 0.543 | 0.439 | 0.414 | 0.598 | 0.637 | 1.000 | |||
TeamWork_7 | 0.330 | 0.580 | 0.245 | 0.201 | 0.333 | 0.493 | 0.506 | 0.409 | 0.619 | 0.525 | 0.636 | 1.000 | ||
TeamWork_8 | 0.316 | 0.467 | 0.259 | 0.282 | 0.278 | 0.497 | 0.488 | 0.272 | 0.540 | 0.494 | 0.565 | 0.771 | 1.000 | |
TeamWork_9 | 0.279 | 0.181 | 0.257 | 0.390 | 0.372 | 0.300 | 0.318 | 0.289 | 0.426 | 0.372 | 0.352 | 0.399 | 0.508 | 1 |
n: 14 |
Exploratory Factor Analysis
To move beyond the constraints of the confirmatory approach, we conduct an exploratory factor analysis (EFA) assuming a four-factor structure, in line with the intended design. However, estimation problems persist, and the factor solution reveals further issues with cross-loadings.
In particular, several job satisfaction items exhibit substantial loadings on non-target teamwork factors:
Job_Satisfaction_2 (“How do you feel about your co-workers?”) loads strongly on the same factor as TeamWork_4, TeamWork_5, and TeamWork_6 – all of which pertain to communication and information sharing among team members.
Job_Satisfaction_4 and Job_Satisfaction_5—which relate to workload, resources, and the physical work environment—also load onto factors intended to represent efficient and result-oriented team communication (e.g. TeamWork_1, TeamWork_2, TeamWork_3).
This pattern of cross-loading suggests content overlap between the job satisfaction and teamwork items. The teamwork items are not purely about group-level processes but often touch on conditions that affect the individual’s satisfaction with their work environment. In turn, some job satisfaction items capture perceptions that are social or context-dependent, blurring the distinction between the two constructs.
efa(
.SDcols = patterns("Job_Satisfaction_[0-9]|TeamWork_[0-9]")],
study_data[, .SD, ordered = T,
nfactors = 4)
Warning: lavaan->lav_model_vcov():
The variance-covariance matrix of the estimated parameters (vcov) does not
appear to be positive definite! The smallest eigenvalue (= 2.076088e-18)
is close to zero. This may be a symptom that the model is not identified.
Warning: lavaan->lav_model_vcov():
The variance-covariance matrix of the estimated parameters (vcov) does not
appear to be positive definite! The smallest eigenvalue (= 2.445125e-16)
is close to zero. This may be a symptom that the model is not identified.
f1 f2 f3 f4
Job_Satisfaction_1 0.745* 0.394*
Job_Satisfaction_2 . 0.625* .
Job_Satisfaction_3 0.618* .
Job_Satisfaction_4 0.692* 0.485* .
Job_Satisfaction_5 0.583* 0.558*
TeamWork_1 0.723* .
TeamWork_2 0.770* .
TeamWork_3 0.530* .
TeamWork_4 . 0.723*
TeamWork_5 0.833*
TeamWork_6 0.602* .
TeamWork_7 0.372 0.599*
TeamWork_8 0.904*
TeamWork_9 .* . 0.356*
A decision was made to drop Job_Satisfaction_2 and Job_Satisfaction_5. Job_Satisfaction_4 was retained as it has a stronger loading on its target factor.
<- "
rq_2_model job_satisfaction =~ Job_Satisfaction_1 + Job_Satisfaction_3 + Job_Satisfaction_4
focused_communication =~ TeamWork_1 + TeamWork_2 + TeamWork_3
knowledge_sharing =~ TeamWork_4 + TeamWork_5 + TeamWork_6
spontaneous_communication =~ TeamWork_7 + TeamWork_8 + TeamWork_9
"
<- sem(rq_2_model,
rq_2_fit
study_data,estimator = "ULSMV",
ordered = T)
The model fit is poor (see Table 6). CFI is 0.90, RMSEA is 0.12, and the Chi-Square is significant. The four-factor model does not fit the data well.
fitmeasures(
rq_2_fit,fit.measures = c(
"chisq.scaled",
"df.scaled",
"pvalue.scaled",
"cfi.robust",
"tli.robust",
"rmsea.robust"
)|>
) as.data.table(keep.rownames = T) |>
:= c(
_[, V1 "χ² (scaled)",
"df",
"p-value",
"CFI",
"TLI",
"RMSEA"
|>
)] as_flextable(show_coltype = F) |>
colformat_double(digits = 2) |>
set_header_labels(
values = c("Fit Index", "Value")
)
Fit Index | Value |
---|---|
χ² (scaled) | 90.60 |
df | 48.00 |
p-value | 0.00 |
CFI | 0.90 |
TLI | 0.86 |
RMSEA | 0.12 |
n: 6 |
Residual correlations show the job satisfaction items still remain a problem, in addition to issues between teamwork items.
<-
rq_2_residuals residuals(rq_2_fit)$cov
upper.tri(rq_2_residuals)] <- NA
rq_2_residuals[
|>
rq_2_residuals as.data.table(keep.rownames = T) |>
as_flextable(
max_row = 12,
show_coltype = F
|>
) colformat_double(
digits = 3
)
rn | Job_Satisfaction_1 | Job_Satisfaction_3 | Job_Satisfaction_4 | TeamWork_1 | TeamWork_2 | TeamWork_3 | TeamWork_4 | TeamWork_5 | TeamWork_6 | TeamWork_7 | TeamWork_8 | TeamWork_9 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Job_Satisfaction_1 | 0.000 | |||||||||||
Job_Satisfaction_3 | 0.036 | 0.000 | ||||||||||
Job_Satisfaction_4 | -0.049 | 0.023 | 0.000 | |||||||||
TeamWork_1 | -0.073 | -0.036 | -0.007 | 0.000 | ||||||||
TeamWork_2 | 0.016 | 0.013 | 0.062 | 0.066 | 0.000 | |||||||
TeamWork_3 | -0.061 | 0.020 | 0.099 | -0.080 | -0.006 | 0.000 | ||||||
TeamWork_4 | 0.019 | -0.008 | -0.013 | 0.050 | 0.005 | 0.050 | 0.000 | |||||
TeamWork_5 | 0.051 | -0.097 | 0.020 | 0.010 | -0.104 | 0.009 | 0.092 | 0.000 | ||||
TeamWork_6 | 0.091 | -0.041 | -0.067 | 0.018 | -0.066 | 0.030 | -0.110 | 0.018 | 0.000 | |||
TeamWork_7 | -0.026 | -0.026 | -0.102 | -0.010 | 0.022 | 0.044 | -0.008 | -0.022 | 0.103 | 0.000 | ||
TeamWork_8 | -0.027 | -0.002 | -0.011 | 0.013 | 0.023 | -0.079 | -0.063 | -0.032 | 0.052 | 0.060 | 0.000 | |
TeamWork_9 | 0.037 | 0.073 | 0.181 | -0.047 | -0.014 | 0.041 | -0.006 | -0.006 | -0.012 | -0.105 | 0.022 | 0 |
n: 12 |
Given the results of the analysis, the measurement model for teamwork and job satisfaction is not tenable. The constructs overlap considerably, primarily due to both being evaluative in nature, with shared content relating to co-worker interactions and work conditions.
As a result, we do not proceed with the structural model in the SEM framework. The regression of job satisfaction on teamwork cannot be supported using latent variables due to model misfit and interpretability concerns.
Conclusion
This re-analysis aimed to evaluate the relationship between job satisfaction and remote work using a more rigorous measurement modelling approach. In line with the original study, we found no significant association between remote worker status and overall job satisfaction. However, our analysis highlights critical measurement concerns: some job satisfaction items overlap with teamwork content, making it difficult to distinguish the constructs cleanly.
Attempts to fit a confirmatory factor model incorporating both teamwork and job satisfaction failed, and even an exploratory approach revealed problematic cross-loadings. As such, the measurement model could not be accepted, and structural modelling between teamwork and job satisfaction could not proceed.
These findings underscore the importance of validating measurement models before testing structural hypotheses. While remote work continues to reshape how teams interact and how employees evaluate their roles, clear distinctions between constructs like job satisfaction and teamwork remain essential for drawing accurate conclusions.