In the previous post, I learned how to run path analysis with R for the first time. Last time, I imported data from SPSS. This time, I want to try something different – use a covariance matrix to run path analysis.
How to Run Path Analysis with R using a Covariance Matrix
Everything about how to use a covariance matrix as input is explained on the lavaan project page. I pretty much follow this tutorial step-by-step.
Create a covariance matrix
First things first, I obviously have to prepare a covariance matrix among the variables of interest. For this learning exercise, I use a dataset that I collected a few years ago. It’s saved in .sav format.
I have 5 main variables to use: (1) intention to eat fast food items (e.g., hamburger, chocolate, fries, soda drink, etc.), (2) exposure to fast food advertisements, (3) attitudes toward fast food, (4) perceived norms toward fast food, and (5) self-efficacy regarding consumption of fast food.
There are multiple ways to create a covariance matrix with SPSS, but I go to Analyze > Scale > Reliability Analysis > and enter all variables that I want to use. I also request Inter-Item Covariances from the Statistics option.
Now, I have a covariance matrix – I am reporting only the lower half of the matrix.
19.791
 5.373 12.710
 6.847  4.086 33.840
 9.425  5.138 10.753 27.833
-2.777 -1.181 -1.267 -2.479  7.833
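If you don't have SPSS at hand, the same kind of matrix can be computed directly in R. Below is a minimal sketch, assuming the survey is already loaded as a data frame. Note that `survey` here is simulated placeholder data (not my real dataset), and `cov_mat` and `lower_half` are just names I picked for illustration.

```r
# Simulated placeholder data: NOT the real survey, just five numeric columns
set.seed(1)
survey <- data.frame(
  intention = rnorm(100),
  ads       = rnorm(100),
  attitudes = rnorm(100),
  norms     = rnorm(100),
  efficacy  = rnorm(100)
)

# cov() returns the full symmetric covariance matrix (denominator n - 1)
cov_mat <- cov(survey)

# Blank out the upper triangle so only the lower half is printed
lower_half <- round(cov_mat, 3)
lower_half[upper.tri(lower_half)] <- NA
print(lower_half, na.print = "")
```

With real data, the printed lower half is what would be pasted into the quoted string in the next step.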
Read a covariance matrix
Next, I open RStudio. First, I check my working directory with the getwd() function.
getwd()
[1] "c:/users/document"
I am now storing this blog folder in my Dropbox folder. So, to change the working directory, I use the setwd() function.
setwd("c:/users/dropbox/r")
Then, call lavaan.
library(lavaan)
As instructed on the lavaan project page, I now read the covariance matrix and store it in a new variable that I call food. The covariance matrix should be enclosed in single quotes (' ') at the beginning and at the end.
food <- '
19.791
 5.373 12.710
 6.847  4.086 33.840
 9.425  5.138 10.753 27.833
-2.777 -1.181 -1.267 -2.479  7.833'
Then, I can add variable names to each column of the matrix. Very neat! I did not know how to do it, but apparently, I can use the c() function within the getCov() function. So, I add the name of each variable to the food covariance matrix, and store the result in a new food.cov variable.
food.cov <- getCov(food, names = c("intention", "ads", "attitudes", "norms", "efficacy"))
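To get a feel for what getCov() hands back, here is a base-R sketch that rebuilds the full symmetric matrix from the same lower-half values. This is only my illustration of the idea, not lavaan's actual implementation; `vals`, `vars`, and `full` are names I made up.

```r
# Lower-half values, row by row, exactly as reported above
vals <- c(19.791,
           5.373, 12.710,
           6.847,  4.086, 33.840,
           9.425,  5.138, 10.753, 27.833,
          -2.777, -1.181, -1.267, -2.479, 7.833)
vars <- c("intention", "ads", "attitudes", "norms", "efficacy")

full <- matrix(0, nrow = 5, ncol = 5, dimnames = list(vars, vars))

# R fills matrices column by column, so filling the UPPER triangle
# column-wise puts the row-wise lower-half values in the mirrored cells
full[upper.tri(full, diag = TRUE)] <- vals

# Mirror the upper triangle into the lower triangle
full[lower.tri(full)] <- t(full)[lower.tri(full)]

print(full)
```

Printing food.cov after the getCov() call should show a named 5 x 5 matrix of this same shape, which is a quick way to check that the names landed on the right columns.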
Specify a model
From this point on, everything is almost the same as the previous attempt. I just first need to specify a model and then estimate it!
So, I have this model in mind based on the theory of planned behavior.
model <- '
intention ~ attitudes + norms + efficacy
attitudes ~ ads
norms ~ ads
efficacy ~ ads'
The model indicates that the intention to consume fast food items is a function of attitudes (like-dislike fast food), perceived norms (people around you approve-disapprove of eating fast food), and efficacy (the belief that you can resist consumption of fast food).
Attitudes, norms, and efficacy are, in turn, a function of exposure to fast food advertisements. Fast food ads always glorify fast food items with fancy presentations. So, being exposed to fast food ads might be related to increases in attitudes toward fast food and perceived norms, and decreases in efficacy.
Estimate a model
I again use the sem() function and store the results in a new variable, result. The only difference is that inside sem(), this time I pass sample.cov = (the covariance matrix) and sample.nobs = (the sample size) instead of just the name of a dataset.
result <- sem(model, sample.cov = food.cov, sample.nobs = 448)
Get results
Then, I type the following to get the basic results along with standardized coefficients, fit indices, and modification indices!
summary(result, standardized=TRUE, fit.measures=TRUE, modindices=TRUE)
lavaan (0.5-22) converged normally after 21 iterations

  Number of observations                           448

  Estimator                                         ML
  Minimum Function Test Statistic               55.822
  Degrees of freedom                                 3
  P-value (Chi-square)                           0.000

Model test baseline model:

  Minimum Function Test Statistic              242.611
  Degrees of freedom                                10
  P-value                                        0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.773
  Tucker-Lewis Index (TLI)                       0.243

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -6315.702
  Loglikelihood unrestricted model (H1)      -6287.791

  Number of free parameters                         11
  Akaike (AIC)                               12653.405
  Bayesian (BIC)                             12698.558
  Sample-size adjusted Bayesian (BIC)        12663.648

Root Mean Square Error of Approximation:

  RMSEA                                          0.198
  90 Percent Confidence Interval          0.155  0.245
  P-value RMSEA <= 0.05                          0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.091

Parameter Estimates:

  Information                                 Expected
  Standard Errors                             Standard

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  intention ~
    norm              0.086    0.032    2.687    0.007    0.086    0.114
    attitude          0.234    0.036    6.513    0.000    0.234    0.282
    efficacy         -0.224    0.066   -3.424    0.001   -0.224   -0.143
    ads               0.280    0.054    5.138    0.000    0.280    0.228
  norm ~
    ads               0.321    0.076    4.253    0.000    0.321    0.197
  attitude ~
    ads               0.404    0.067    6.011    0.000    0.404    0.273
  efficacy ~
    ads              -0.093    0.037   -2.523    0.012   -0.093   -0.118

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .intention        14.840    0.992   14.967    0.000   14.840    0.775
   .norm             32.454    2.168   14.967    0.000   32.454    0.961
   .attitude         25.698    1.717   14.967    0.000   25.698    0.925
   .efficacy          7.706    0.515   14.967    0.000    7.706    0.986

Modification Indices:

   lhs      op rhs           mi    epc sepc.lv sepc.all sepc.nox
12 ads      ~~ ads        0.000  0.000   0.000    0.000    0.000
16 norm     ~~ attitude  44.296  9.081   9.081    0.297    0.297
17 norm     ~~ efficacy   1.404 -0.885  -0.885   -0.055   -0.055
18 attitude ~~ efficacy   9.023 -1.997  -1.997   -0.136   -0.136
19 norm     ~  intention 41.506  1.295   1.295    0.975    0.975
20 norm     ~  attitude  44.296  0.353   0.353    0.320    0.320
21 norm     ~  efficacy   1.404 -0.115  -0.115   -0.055   -0.055
22 attitude ~  intention 41.890  1.957   1.957    1.625    1.625
23 attitude ~  norm      44.296  0.280   0.280    0.309    0.309
24 attitude ~  efficacy   9.023 -0.259  -0.259   -0.137   -0.137
25 efficacy ~  intention 10.425 -0.330  -0.330   -0.517   -0.517
26 efficacy ~  norm       1.404 -0.027  -0.027   -0.057   -0.057
27 efficacy ~  attitude   9.023 -0.078  -0.078   -0.146   -0.146
28 ads      ~  intention  0.000  0.000   0.000    0.000    0.000
29 ads      ~  norm       0.000  0.000   0.000    0.000    0.000
30 ads      ~  attitude   0.000  0.000   0.000    0.000    0.000
31 ads      ~  efficacy   0.000  0.000   0.000    0.000    0.000
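The summary() printout is long, so it can be handy to pull out just the pieces you care about. Here is a self-contained sketch that repeats the steps of this post and then uses lavaan's fitMeasures(), parameterEstimates(), and modindices() functions. The cutoff of 10 for modification indices is an arbitrary choice of mine, and `fit`, `est`, and `mi` are names I made up. I have not listed expected numbers, since they depend on the exact model and lavaan version.

```r
library(lavaan)

# Lower half of the covariance matrix, as in this post
food <- '
19.791
 5.373 12.710
 6.847  4.086 33.840
 9.425  5.138 10.753 27.833
-2.777 -1.181 -1.267 -2.479  7.833'
food.cov <- getCov(food, names = c("intention", "ads", "attitudes",
                                   "norms", "efficacy"))

# One regression formula per line
model <- '
  intention ~ attitudes + norms + efficacy
  attitudes ~ ads
  norms     ~ ads
  efficacy  ~ ads
'
result <- sem(model, sample.cov = food.cov, sample.nobs = 448)

# Selected fit indices as a named vector
fit <- fitMeasures(result, c("chisq", "df", "pvalue",
                             "cfi", "tli", "rmsea", "srmr"))
print(fit)

# Regression paths only, with standardized estimates
est <- parameterEstimates(result, standardized = TRUE)
print(est[est$op == "~", c("lhs", "op", "rhs", "est", "se", "pvalue", "std.all")])

# Modification indices of at least 10 (arbitrary cutoff), largest first
mi <- modindices(result, sort. = TRUE, minimum.value = 10)
print(mi)
```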
Looking at the fit indices (CFI = 0.773, TLI = 0.243, RMSEA = 0.198, SRMR = 0.091), the model does not fit the data well. But, that’s OK for the purpose of this practice exercise.
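To see where some of these fit indices come from, they can be recomputed by hand from the two chi-square statistics in the output. This is a sketch using the usual textbook formulas, which I am assuming match what lavaan does here (lavaan's internals may differ in small details such as using N versus N - 1). The variable names are my own.

```r
# Values copied from the summary() output above
chisq_m <- 55.822   # user model chi-square
df_m    <- 3        # user model degrees of freedom
chisq_b <- 242.611  # baseline model chi-square
df_b    <- 10       # baseline model degrees of freedom
n       <- 448      # sample size

# Comparative Fit Index: 1 minus the ratio of model misfit to baseline misfit
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)

# Tucker-Lewis Index: compares chi-square/df ratios of the two models
tli <- ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1)

# RMSEA: misfit per degree of freedom, scaled by sample size
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n))

print(round(c(cfi = cfi, tli = tli, rmsea = rmsea), 3))
#   cfi   tli rmsea
# 0.773 0.243 0.198
```

Commonly cited rules of thumb look for CFI and TLI of roughly 0.95 or higher and RMSEA of roughly 0.06 or lower, so all three values here are far from what is usually considered a good fit.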
Wrapping up
I guess I kind of understand the fundamentals of how to perform path analysis with R and lavaan. Please let me know if you find any errors though.
For the next one, I am thinking of moving back to web scraping with an API or data preprocessing for text mining. As I read more, I find more great resources and ideas shared by more experienced users!
Happy learning, and type it hard!
Image courtesy of taniadimas