How to Run Path Analysis with R – Part 2

In the previous post, I learned how to run path analysis with R for the first time. Last time, I imported data from SPSS. This time, I want to try something different – use a covariance matrix to run path analysis.

How to Run Path Analysis with R using a Covariance Matrix

Everything about how to use a covariance matrix as input is explained on the lavaan project page. I pretty much follow this tutorial step-by-step.

Create a covariance matrix

First things first: I obviously have to prepare a covariance matrix for the variables of interest. For this learning exercise, I use a dataset that I collected a few years ago. It’s saved in SPSS .sav format.

I have 5 main variables to use: (1) intention to eat fast food items (e.g., hamburger, chocolate, fries, soda drink, etc.), (2) exposure to fast food advertisements, (3) attitudes toward fast food, (4) perceived norms toward fast food, and (5) self-efficacy regarding consumption of fast food.

There are multiple ways to create a covariance matrix with SPSS, but I go to Analyze > Scale > Reliability Analysis > and enter all variables that I want to use. I also request Inter-Item Covariances from the Statistics option.
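If the raw data were already in R, the same kind of matrix could be computed there as well. This is a minimal sketch, assuming a data frame with one numeric column per variable. The data below are simulated and the column names (x1, x2, x3) are made up for illustration, so the numbers have nothing to do with my dataset.

```r
# Hypothetical example data: 100 simulated respondents, 3 made-up variables
set.seed(1)
toy <- data.frame(
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = rnorm(100)
)

# cov() returns the full symmetric covariance matrix (n - 1 denominator)
S <- cov(toy)

# Keep only the lower half, rounded to 3 decimals, the way the matrix
# is reported in this post
S_lower <- round(S, 3)
S_lower[upper.tri(S_lower)] <- NA
print(S_lower, na.print = "")
```

The printed lower half is the part that would be pasted into the quoted text later on.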

Now, I have a covariance matrix. I am reporting only the lower half of the matrix.

19.791 
5.373 12.710 
6.847 4.086 33.840 
9.425 5.138 10.753 27.833 
-2.777 -1.181 -1.267 -2.479 7.833

Read a covariance matrix

Next, I open RStudio. First, I check my working directory with the getwd() function.

getwd()
[1] "c:/users/document"

I now keep the folder for this blog in my Dropbox folder. So, to change the working directory, I use the setwd() function.

setwd("c:/users/dropbox/r")

Then, I load lavaan.

library(lavaan)

As instructed on the lavaan project page, I now read the covariance matrix and store it in a new variable that I call food. The matrix should be enclosed in single quotes (' ') at the beginning and at the end.

food <- '
19.791 
5.373 12.710 
6.847 4.086 33.840 
9.425 5.138 10.753 27.833 
-2.777 -1.181 -1.267 -2.479 7.833'

Then, I can add variable names to each column of the matrix. Very neat! I did not know how to do it, but apparently, I can pass a character vector built with c() to the names argument of the getCov() function. So, I add the name of each variable to the food covariance matrix and store the result in a new variable, food.cov.

food.cov <- getCov(food, names = c("intention", "ads", "attitudes", "norms", "efficacy"))
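To see what kind of object ends up in food.cov, here is a base R sketch that builds a full symmetric matrix by hand from the same lower-half values. It is not how getCov() is implemented, just an illustration of going from the lower half to the full matrix; the object names vals, vars, and full are assumptions made for this example.

```r
# Lower-half values, row by row, copied from the matrix in this post
vals <- c(19.791,
           5.373, 12.710,
           6.847,  4.086, 33.840,
           9.425,  5.138, 10.753, 27.833,
          -2.777, -1.181, -1.267, -2.479, 7.833)
vars <- c("intention", "ads", "attitudes", "norms", "efficacy")

# Start from an empty 5 x 5 matrix with named rows and columns
full <- matrix(0, nrow = 5, ncol = 5, dimnames = list(vars, vars))

# R fills column by column, so the row-wise lower half goes into the
# upper triangle; by symmetry these are the same numbers
full[upper.tri(full, diag = TRUE)] <- vals

# Mirror the upper triangle into the lower one without doubling the diagonal
full <- full + t(full) - diag(diag(full))

full
```

With names attached, single entries can be looked up directly, for example full["ads", "intention"] for the covariance between ads and intention. Calling cov2cor(full) converts the matrix to correlations, which are often easier to read.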

Specify a model

From this point on, everything is almost the same as in the previous attempt. I first need to specify a model and then estimate it!

So, I have this model in mind, based on the theory of planned behavior.

model <- 'intention ~ attitudes + norms + efficacy
attitudes ~ ads
norms ~ ads
efficacy ~ ads'

The model indicates that the intention to consume fast food items is a function of attitudes (like-dislike fast food), perceived norms (people around you approve-disapprove of eating fast food), and efficacy (the belief that you can resist consumption of fast food).

In turn, attitudes, norms, and efficacy are each a function of exposure to fast food advertisements. Fast food ads always glorify fast food items with fancy presentations. So, being exposed to fast food ads might be related to more favorable attitudes toward fast food, stronger perceived norms, and lower efficacy.

Estimate a model

I again use the sem() function and store the results in a new variable, result. The only difference is that, inside the sem() call, this time I pass sample.cov = (the covariance matrix) and sample.nobs = (the sample size) instead of just the name of the dataset.

result <- sem(model, sample.cov = food.cov, sample.nobs = 448)

Get results

Then, I type the following to get the basic results along with standardized coefficients, fit indices, and modification indices!

summary(result, standardized=TRUE, fit.measures=TRUE, modindices=TRUE)

> summary(result, standardized=TRUE, fit.measures=TRUE, modindices=TRUE)
lavaan (0.5-22) converged normally after 21 iterations

Number of observations 448

Estimator ML
 Minimum Function Test Statistic 55.822
 Degrees of freedom 3
 P-value (Chi-square) 0.000

Model test baseline model:

Minimum Function Test Statistic 242.611
 Degrees of freedom 10
 P-value 0.000

User model versus baseline model:

Comparative Fit Index (CFI) 0.773
 Tucker-Lewis Index (TLI) 0.243

Loglikelihood and Information Criteria:

Loglikelihood user model (H0) -6315.702
 Loglikelihood unrestricted model (H1) -6287.791

Number of free parameters 11
 Akaike (AIC) 12653.405
 Bayesian (BIC) 12698.558
 Sample-size adjusted Bayesian (BIC) 12663.648

Root Mean Square Error of Approximation:

RMSEA 0.198
 90 Percent Confidence Interval 0.155 0.245
 P-value RMSEA <= 0.05 0.000

Standardized Root Mean Square Residual:

SRMR 0.091

Parameter Estimates:

Information Expected
 Standard Errors Standard

Regressions:
 Estimate Std.Err z-value P(>|z|) Std.lv Std.all
 intention ~ 
 norm 0.086 0.032 2.687 0.007 0.086 0.114
 attitude 0.234 0.036 6.513 0.000 0.234 0.282
 efficacy -0.224 0.066 -3.424 0.001 -0.224 -0.143
 ads 0.280 0.054 5.138 0.000 0.280 0.228
 norm ~ 
 ads 0.321 0.076 4.253 0.000 0.321 0.197
 attitude ~ 
 ads 0.404 0.067 6.011 0.000 0.404 0.273
 efficacy ~ 
 ads -0.093 0.037 -2.523 0.012 -0.093 -0.118

Variances:
 Estimate Std.Err z-value P(>|z|) Std.lv Std.all
 .intention 14.840 0.992 14.967 0.000 14.840 0.775
 .norm 32.454 2.168 14.967 0.000 32.454 0.961
 .attitude 25.698 1.717 14.967 0.000 25.698 0.925
 .efficacy 7.706 0.515 14.967 0.000 7.706 0.986

Modification Indices:

lhs op rhs mi epc sepc.lv sepc.all sepc.nox
12 ads ~~ ads 0.000 0.000 0.000 0.000 0.000
16 norm ~~ attitude 44.296 9.081 9.081 0.297 0.297
17 norm ~~ efficacy 1.404 -0.885 -0.885 -0.055 -0.055
18 attitude ~~ efficacy 9.023 -1.997 -1.997 -0.136 -0.136
19 norm ~ intention 41.506 1.295 1.295 0.975 0.975
20 norm ~ attitude 44.296 0.353 0.353 0.320 0.320
21 norm ~ efficacy 1.404 -0.115 -0.115 -0.055 -0.055
22 attitude ~ intention 41.890 1.957 1.957 1.625 1.625
23 attitude ~ norm 44.296 0.280 0.280 0.309 0.309
24 attitude ~ efficacy 9.023 -0.259 -0.259 -0.137 -0.137
25 efficacy ~ intention 10.425 -0.330 -0.330 -0.517 -0.517
26 efficacy ~ norm 1.404 -0.027 -0.027 -0.057 -0.057
27 efficacy ~ attitude 9.023 -0.078 -0.078 -0.146 -0.146
28 ads ~ intention 0.000 0.000 0.000 0.000 0.000
29 ads ~ norm 0.000 0.000 0.000 0.000 0.000
30 ads ~ attitude 0.000 0.000 0.000 0.000 0.000
31 ads ~ efficacy 0.000 0.000 0.000 0.000 0.000
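
As a sanity check on the output, the efficacy ~ ads row can be reproduced by hand, because efficacy has only one predictor. This is a minimal sketch in base R, assuming the second and fifth entries of the covariance matrix are ads and efficacy (as named earlier), and assuming lavaan's default behavior under ML of rescaling the sample covariance matrix by (N - 1)/N.

```r
# Values copied from the covariance matrix in this post
var_ads      <- 12.710   # variance of ads
var_efficacy <- 7.833    # variance of efficacy
cov_ads_eff  <- -1.181   # covariance between ads and efficacy
n            <- 448      # sample size

# Unstandardized slope of efficacy on ads: cov(x, y) / var(x)
slope <- cov_ads_eff / var_ads

# Standardized slope; with a single predictor this is the correlation
std_slope <- cov_ads_eff / sqrt(var_ads * var_efficacy)

# Residual variance, rescaled by (n - 1) / n to mimic ML estimation
resid_var <- (var_efficacy - cov_ads_eff^2 / var_ads) * (n - 1) / n

round(c(slope = slope, std_slope = std_slope, resid_var = resid_var), 3)
```

These round to -0.093, -0.118, and 7.706, which match the Estimate and Std.all columns for efficacy ~ ads and the .efficacy residual variance in the output.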

Looking at all the fit indices (CFI = 0.773, TLI = 0.243, RMSEA = 0.198, SRMR = 0.091), the model does not fit the data well. But that’s OK for the purpose of this practice exercise.
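
The headline fit indices can be recomputed from the two chi-square statistics in the output, which makes it easier to see where they come from. This is a sketch in base R using the standard textbook formulas; I am assuming these are the versions lavaan reports here, and some sources use N - 1 instead of N in the RMSEA denominator.

```r
# Values copied from the summary() output
chisq_m <- 55.822;  df_m <- 3    # user model
chisq_b <- 242.611; df_b <- 10   # baseline model
n <- 448                         # sample size

# CFI: 1 minus the ratio of model misfit to baseline misfit
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)

# TLI: compares the chi-square / df ratios of the two models
tli <- (chisq_b / df_b - chisq_m / df_m) / (chisq_b / df_b - 1)

# RMSEA: misfit per degree of freedom, scaled by sample size
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n))

round(c(cfi = cfi, tli = tli, rmsea = rmsea), 3)
```

These come out at 0.773, 0.243, and 0.198, the same values as in the output. Common rules of thumb look for CFI and TLI of roughly 0.95 or higher and RMSEA of roughly 0.06 or lower, so this model falls well short.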

Wrapping up

I guess I kind of understand the fundamentals of how to perform path analysis with R and lavaan. Please let me know if you find any errors though.

For the next one, I am thinking of moving back to web scraping with an API or data preprocessing for text mining. As I read more, I find more great resources and ideas shared by more experienced users!

Happy learning, and type it hard!

Image courtesy of taniadimas
