Exploratory Factor Analysis in R (EFA in R)

Exploratory Factor Analysis (EFA) is a data-driven statistical technique for identifying the underlying constructs that produce variation among observed measures. EFA can show researchers how individual variables correlate with one another and where patterns emerge across inventories, assessments, or questionnaires. This article discusses how the process differs from confirmatory factor analysis and what you need to know in order to apply it successfully in R.

Exploratory Factor Analysis (EFA), like other statistical analyses, works on a sample in order to say something about a population. EFA is not usually treated as a “subset” of other analytical methods such as ANOVA, regression, or structural equation modeling. Generally speaking, EFA can be thought of as a one-step analysis: it works only from the correlations between the observed variables and does not model conditional relationships among them (i.e., the effect of one variable on another with the others held constant).
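To make the idea concrete, the sketch below simulates a small data set and prints the correlation matrix that an EFA starts from. The data, the item names (item1 to item6) and the two simulated factors are made up for illustration and are not the data behind this article's example.

```r
# Simulated data (hypothetical): six items driven by two latent factors.
set.seed(42)
n  <- 300
f1 <- rnorm(n)  # first latent factor
f2 <- rnorm(n)  # second latent factor
dataset <- data.frame(
  item1 = 0.80 * f1 + rnorm(n, sd = 0.60),
  item2 = 0.70 * f1 + rnorm(n, sd = 0.70),
  item3 = 0.75 * f1 + rnorm(n, sd = 0.65),
  item4 = 0.80 * f2 + rnorm(n, sd = 0.60),
  item5 = 0.70 * f2 + rnorm(n, sd = 0.70),
  item6 = 0.75 * f2 + rnorm(n, sd = 0.65)
)

# The correlation matrix is the raw material of an EFA.
R <- cor(dataset)
print(round(R, 2))
```

In the printed matrix, items 1 to 3 correlate more strongly with each other than with items 4 to 6. Blocks of this kind are the pattern that EFA tries to summarize as factors.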

How does it differ from confirmatory factor analysis?

The purpose of the exploratory factor analysis is to identify factor structures that are suggested by correlations among observed variables. The confirmatory factor analysis, on the other hand, assesses whether a proposed theoretical model is supported by empirical data.

When should you use exploratory factor analysis in R?

Confirmatory factor analysis is applied when you already have a hypothesis about which observed variables measure which factors. It aims to test hypotheses that are grounded in theory or in previously collected data. Confirmatory factor analysis requires careful consideration of the research question, the theoretical model, and the number of sub-questions within the research. Exploratory factor analysis fits the earlier stage, when no such model exists yet; its results are often used to guide a later confirmatory factor analysis.

Analyzing data with exploratory factor analysis has an advantage over confirmatory approaches: it does not require assumptions about relationships among variables or about how many factors there are. Confirmatory factor analysis does require such assumptions, including “how many” factors underlie the data, and they limit the conclusions that can be drawn from it. Exploratory Factor Analysis (EFA) is only appropriate for datasets that consist of multiple correlated variables; it cannot be run on a single variable. It is commonly used with questionnaire data, but on its own it does not provide a formal test of whether the constructs measured are truly distinct from one another.

How to do an exploratory factor analysis in R with the “principal axis factoring” method?

This article will explain the process of doing an exploratory factor analysis in R with the “principal axis factoring” method. This is something you would do on a set of correlated variables, such as the items of a questionnaire.

Step 1 – Data Preparation

The first step is to assemble a data set containing the variables to be included in your exploratory factor analysis (EFA). EFA has no dependent or explanatory variables; instead, each variable needs to correlate (have a relationship) with at least one of the other variables, which means that they should co-vary in some way. The more variables that reflect the same construct, the more likely it is that they will form clusters of high correlations.

You can then exclude any irrelevant variables and start ‘gathering’ your relevant variables. It is useful to make a list of the relevant variables and any grouping labels (e.g., “Group one”, “Group two”) that will help you group similar variables together.
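One way to screen variables is to check whether each one correlates with at least one of the others. The sketch below is a rough illustration on simulated data. The column names, the deliberately unrelated variable, and the 0.3 cutoff are all assumptions chosen for the example, not fixed rules.

```r
# Simulated data (hypothetical): three related items, one unrelated
# numeric variable, and one non-numeric column.
set.seed(42)
n  <- 300
f1 <- rnorm(n)
dataset <- data.frame(
  item1     = 0.80 * f1 + rnorm(n, sd = 0.60),
  item2     = 0.70 * f1 + rnorm(n, sd = 0.70),
  item3     = 0.75 * f1 + rnorm(n, sd = 0.65),
  unrelated = rnorm(n),                              # shares nothing with the items
  group     = sample(c("a", "b"), n, replace = TRUE) # not usable in an EFA
)

# Keep numeric columns and complete rows only.
numeric_data <- dataset[, sapply(dataset, is.numeric), drop = FALSE]
numeric_data <- na.omit(numeric_data)

# For each variable, find its strongest absolute correlation with any other variable.
R <- cor(numeric_data)
diag(R) <- NA
strongest <- apply(abs(R), 1, max, na.rm = TRUE)

cutoff <- 0.3  # assumed rule of thumb; adjust to your field
candidates_to_drop <- names(strongest)[strongest < cutoff]
relevant <- numeric_data[, strongest >= cutoff, drop = FALSE]

print(round(strongest, 2))
print(candidates_to_drop)
```

A variable flagged here is only a candidate for removal. Whether it is truly irrelevant is a judgment about content as much as about correlations.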

Step 2 – Extraction and Rotation Method

There are many different ways of doing an EFA. This article will focus on principal axis factoring, an extraction method that is common and sometimes considered best practice. You can learn more about other methods at: https://statisticalhorizons.com/2013/07/02/how-to-do-an-exploratory-factor-analysis/.

The next thing to decide in your EFA is the rotation method. There are a few different ways that you can do this. Before choosing, it is a good idea to look at your variables and see if there are any outliers in the data set. Outliers can distort the correlations between variables, so if you have any, they may change the results of your EFA.

There are several options for the rotation. In this article, we will extract the factors with principal axis factoring and rotate them with oblimin, an oblique rotation.

The next step is to run your EFA! You can do this by running the following R code in RStudio or by using another program if you prefer. For now, let’s assume that your dataset is called “dataset”.

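library(psych)  # provides fa(); the oblimin rotation also needs the GPArotation package

fit <- fa(dataset, nfactors = 2, fm = "pa", rotate = "oblimin")  # fm = "pa" is principal axis factoring; nfactors = 2 is a placeholder

print(fit)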

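The resulting output lists the factor loadings, the communalities, the variance each factor accounts for and, because the rotation is oblique, the correlations between the factors. If those factor correlations are all close to zero, an orthogonal rotation such as varimax would give a very similar and simpler solution. If you already have a specific structure in mind that you want to test, you should look into other methods such as CFA or structural equation modeling (SEM).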
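If you are curious what principal axis factoring does under the hood, here is a minimal base-R sketch on simulated data. The function name paf, the simulated items and the tolerance settings are hypothetical choices for this illustration. The rotation uses promax from base R, which is a different oblique rotation from the oblimin used above, so the numbers will not match a psych analysis exactly.

```r
# Simulated data (hypothetical): six items driven by two latent factors.
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
dataset <- data.frame(
  item1 = 0.80 * f1 + rnorm(n, sd = 0.60),
  item2 = 0.70 * f1 + rnorm(n, sd = 0.70),
  item3 = 0.75 * f1 + rnorm(n, sd = 0.65),
  item4 = 0.80 * f2 + rnorm(n, sd = 0.60),
  item5 = 0.70 * f2 + rnorm(n, sd = 0.70),
  item6 = 0.75 * f2 + rnorm(n, sd = 0.65)
)

# Iterated principal axis factoring on a correlation matrix R (illustrative helper).
paf <- function(R, nfactors, max_iter = 200, tol = 1e-4) {
  h2 <- 1 - 1 / diag(solve(R))  # starting communalities: squared multiple correlations
  for (i in seq_len(max_iter)) {
    R_reduced <- R
    diag(R_reduced) <- h2        # put communalities on the diagonal instead of 1s
    e <- eigen(R_reduced, symmetric = TRUE)
    vals <- pmax(e$values[seq_len(nfactors)], 0)
    L <- e$vectors[, seq_len(nfactors), drop = FALSE] %*% diag(sqrt(vals), nfactors)
    h2_new <- rowSums(L^2)       # updated communalities implied by the loadings
    converged <- max(abs(h2_new - h2)) < tol
    h2 <- h2_new
    if (converged) break
  }
  rownames(L) <- colnames(R)
  list(loadings = L, communalities = rowSums(L^2))
}

fit_pa  <- paf(cor(dataset), nfactors = 2)
rotated <- promax(fit_pa$loadings)  # oblique rotation from base R (not oblimin)

print(round(unclass(rotated$loadings), 2))
print(round(fit_pa$communalities, 2))
```

The key step is replacing the 1s on the diagonal of the correlation matrix with communality estimates, so that only the variance the items share is factored. That is what separates principal axis factoring from a principal components analysis.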

Step 3 – Exploratory Factor Analysis

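Once you have run your EFA with an oblique rotation, the last step is naming and interpreting your factors. Before that, check how many factors to keep. Using Kaiser’s criterion (retain factors with an eigenvalue > 1), we can determine how many factors we want to extract; if the answer differs from the number you started with, rerun the EFA with that number. The eigenvalue tells us how much of the variance in the data is accounted for by a specific factor, where a value of 1 corresponds to the variance of a single standardized variable.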
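As a rough illustration, Kaiser’s criterion can be checked in base R by counting the eigenvalues of the correlation matrix that exceed 1. The data below are simulated with two factors, and the item names are placeholders.

```r
# Simulated data (hypothetical): six items driven by two latent factors.
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
dataset <- data.frame(
  item1 = 0.80 * f1 + rnorm(n, sd = 0.60),
  item2 = 0.70 * f1 + rnorm(n, sd = 0.70),
  item3 = 0.75 * f1 + rnorm(n, sd = 0.65),
  item4 = 0.80 * f2 + rnorm(n, sd = 0.60),
  item5 = 0.70 * f2 + rnorm(n, sd = 0.70),
  item6 = 0.75 * f2 + rnorm(n, sd = 0.65)
)

# Eigenvalues of the correlation matrix; they sum to the number of variables.
ev <- eigen(cor(dataset), symmetric = TRUE, only.values = TRUE)$values
print(round(ev, 3))

# Kaiser's criterion: keep as many factors as there are eigenvalues above 1.
n_kaiser <- sum(ev > 1)
print(n_kaiser)
```

Because the eigenvalues of a correlation matrix add up to the number of variables, an eigenvalue above 1 means the factor accounts for more variance than any single standardized item does.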

In this example, we will first name the factors “Factor 1”, “Factor 2”, and so on, one for each eigenvalue > 1, and then label these factors by what they seem most closely related to. This way, you can construct a narrative about what the EFA results mean and determine what to call your factors.

For example, if we extracted 2 factors, we could call these “Factor 1” and “Factor 2”. We could then refer back to our list of variables and think about which ones seem most closely related to Factor 1 or Factor 2. You can also see which variables load most strongly on each factor. This can be useful for determining if you have too many variables in the EFA.

How to interpret the results of an EFA using the “principal axis factoring” method

Example of an EFA using “principal axis factoring” for a hypothetical inventory, assessment, or questionnaire data set.

Principal axis factoring is the extraction method; in this example it is combined with an oblique rotation. Oblique means that the factors are not assumed to be uncorrelated with one another, so they can capture underlying constructs that may overlap (be related). The oblimin (oblique) rotation used here was found to provide a factor structure that matched our hypotheses. No further rotation was necessary.

Our sample size is quite low here, so we should be cautious about over- or under-extracting factors. Our eigenvalue > 1 criterion (Kaiser’s rule) tells us that we should extract 3 factors, and it is worth following this up with a parallel analysis. The Kaiser criterion is useful here as a first check on whether we have too many factors to consider, although it is known to over-extract in some data sets.

The first three factors have eigenvalues above our Kaiser criterion of 1 (1.488), so these can be interpreted as they are. The other two factors each have an eigenvalue lower than 1, so we would stop there and review what we have extracted so far.
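A parallel analysis compares the observed eigenvalues with eigenvalues obtained from random data of the same size, and keeps only the factors that beat the random benchmark. The following is a minimal base-R sketch on simulated data, not the analysis behind the example above. The number of simulations (200) and the 95th percentile benchmark are assumptions, and this version works on the eigenvalues of the full correlation matrix.

```r
# Simulated data (hypothetical): six items driven by two latent factors.
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
dataset <- data.frame(
  item1 = 0.80 * f1 + rnorm(n, sd = 0.60),
  item2 = 0.70 * f1 + rnorm(n, sd = 0.70),
  item3 = 0.75 * f1 + rnorm(n, sd = 0.65),
  item4 = 0.80 * f2 + rnorm(n, sd = 0.60),
  item5 = 0.70 * f2 + rnorm(n, sd = 0.70),
  item6 = 0.75 * f2 + rnorm(n, sd = 0.65)
)

p <- ncol(dataset)
observed_ev <- eigen(cor(dataset), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from uncorrelated random data with the same n and p.
n_sims <- 200
random_ev <- replicate(n_sims, {
  random_data <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(random_data), symmetric = TRUE, only.values = TRUE)$values
})

# Benchmark: 95th percentile of the random eigenvalues at each position.
threshold <- apply(random_ev, 1, quantile, probs = 0.95)

# Retain factors from the first one up to the first that fails the benchmark.
exceeds <- observed_ev > threshold
n_parallel <- if (all(exceeds)) p else which(!exceeds)[1] - 1

print(round(rbind(observed = observed_ev, benchmark = threshold), 3))
print(n_parallel)
```

When Kaiser’s rule and the parallel analysis disagree, it is usually worth fitting both candidate solutions and seeing which one is easier to interpret.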

We will now review the individual items to explore what they relate to and determine labels for each factor. We can also see which variables are most strongly related to each factor. In this case, all but two items load strongly onto the first factor, Factor 1: “interest in math”. These two items could potentially be removed from the model as they do not seem to belong.
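The sketch below shows one way to list each item’s strongest loading and flag items that do not load clearly on any factor. It uses simulated data with one deliberately unrelated item (item7), a 0.4 cutoff that is only a common rule of thumb, and base R’s factanal, which uses maximum likelihood extraction, not principal axis factoring.

```r
# Simulated data (hypothetical): six items on two factors plus one pure-noise item.
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
dataset <- data.frame(
  item1 = 0.80 * f1 + rnorm(n, sd = 0.60),
  item2 = 0.70 * f1 + rnorm(n, sd = 0.70),
  item3 = 0.75 * f1 + rnorm(n, sd = 0.65),
  item4 = 0.80 * f2 + rnorm(n, sd = 0.60),
  item5 = 0.70 * f2 + rnorm(n, sd = 0.70),
  item6 = 0.75 * f2 + rnorm(n, sd = 0.65),
  item7 = rnorm(n)  # unrelated to either factor
)

# Maximum likelihood EFA with an oblique (promax) rotation from base R.
fit <- factanal(dataset, factors = 2, rotation = "promax")
L <- unclass(fit$loadings)  # items in rows, factors in columns

# Strongest loading per item and the factor it belongs to.
strongest <- apply(abs(L), 1, max)
summary_table <- data.frame(
  item    = rownames(L),
  factor  = colnames(L)[apply(abs(L), 1, which.max)],
  loading = round(strongest, 2),
  row.names = NULL
)
print(summary_table)

# Items whose strongest loading falls below the cutoff are candidates for removal.
cutoff <- 0.4  # assumed rule of thumb
weak_items <- rownames(L)[strongest < cutoff]
print(weak_items)
```

A table like this is a starting point for labeling: read the content of the items grouped under each factor and ask what they have in common.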

Please note that, without a larger sample size, we can’t say much here about how well the factors represent our data and whether this factor structure generalizes across participants. This EFA might make more sense if we had a larger sample or a stronger theoretical background.

This example is a hypothetical inventory situation. You can see that we rotated to oblique factors and then named them using what they seem most closely related to (e.g., “interest in math” for Factor 1). We also chose the label of “math anxiety” for Factor 2 as this best represented the items on this factor. In your own analyses, you may come up with a different set of labels and factors.

The next step is to determine whether you have too many factors to interpret or not enough. It’s also useful to see how well your factors represent the data. You can use an EFA with a larger sample size for this purpose, as well as a more thorough theoretical background if available.

Some things to consider when interpreting the EFA results include:

– which factors are most important, based on eigenvalue?

– which items load strongly onto each factor?

– do your factors make sense given the psychological construct you are studying? If not, what seems to be missing? You can also look at relationships between factors (e.g., one factor might be related to 2 other factors).

– which variables are most strongly associated with each factor?

– if the underlying construct is continuous, how many items do you need to measure this construct well? If it’s more than what you have, consider writing some new items or revising the existing items.

– how well do your factors generalize? Are there any outliers that you should consider removing from the model if possible? Are there any items that don’t belong on a factor, even after considering their content?
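Two of the questions above, relationships between factors and items that do not sit cleanly on one factor, can be checked numerically. The following is a rough base-R sketch on simulated data in which the two factors are built to correlate at about 0.5. The data, the 0.3 cross-loading cutoff, and the use of factanal with a promax rotation (not principal axis factoring with oblimin) are all assumptions for illustration.

```r
# Simulated data (hypothetical): two latent factors that correlate about 0.5.
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- 0.5 * f1 + sqrt(1 - 0.5^2) * rnorm(n)
dataset <- data.frame(
  item1 = 0.80 * f1 + rnorm(n, sd = 0.60),
  item2 = 0.70 * f1 + rnorm(n, sd = 0.70),
  item3 = 0.75 * f1 + rnorm(n, sd = 0.65),
  item4 = 0.80 * f2 + rnorm(n, sd = 0.60),
  item5 = 0.70 * f2 + rnorm(n, sd = 0.70),
  item6 = 0.75 * f2 + rnorm(n, sd = 0.65)
)

# Extract two unrotated factors, then apply an oblique promax rotation.
fit <- factanal(dataset, factors = 2, rotation = "none")
rot <- promax(loadings(fit))
L   <- unclass(rot$loadings)

# Factor correlations implied by the rotation matrix U: phi = (U'U)^-1.
U   <- rot$rotmat
phi <- solve(t(U) %*% U)
print(round(phi, 2))

# Items with more than one sizeable loading (possible cross-loaders).
cross_loaders <- rownames(L)[rowSums(abs(L) > 0.3) > 1]
print(cross_loaders)
```

The sign of a factor correlation depends on the arbitrary direction of each factor, so look at its size first and then check the loadings to see which way each factor points.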

These are just some things to think about when interpreting EFA results. To determine whether an EFA is appropriate for your data, consider whether you can provide a clear and precise definition of the construct you are studying (e.g., anxiety), whether using a variety of methods to assess this construct would make sense (e.g., self-report questionnaire, observational measure), the sample size required for factor analysis, etc.

 

Concluding remarks about how to apply this technique successfully when doing research

Exploratory factor analysis is a statistical technique used to identify the underlying factors that explain variation among observed variables. It helps you break down your data into smaller, manageable pieces so you can better understand what it means and what it represents. Factor analysis can be applied in many different ways depending on the needs of the researcher. This article has provided a brief overview of how the method can help with psychology-related projects, where it allows researchers to look for patterns between items or behaviors across individuals, but there are many other applications as well. The benefits might not always be clear at first glance, but they become more apparent once you start exploring. For example, after conducting an EFA you may better understand the topic at hand and be able to determine which questions or factors to focus on. The design and development of psychological assessments is an essential part of research, and reducing a pool of items to the most representative ones can help with this process.

Additionally, factor analysis allows one to “look beyond what we want and need to know and consider what we may not be able to know” (p. 168). In order to achieve this, one must “[identify] the questions that cannot yet be answered with these data, because the sample size is too small or because of other limitations in the design” (pp. 188-189). As noted by Fabrigar and Wegener (2012), “factor analysis represents a distillation of a set of variables into a smaller set of composite variables that, in some sense, optimally capture their relationship to one another” (p. 54). In short, it provides researchers with an opportunity to focus on the most relevant aspects of their data so they can start moving ahead with the project. This is not to say that additional variables cannot be added later with a larger sample, but it can help researchers narrow down what issues are most important and how they might best be investigated with their resources.

The goal of factor analysis is ultimately to summarize the correlations between some sets of variables more parsimoniously than by listing all of the coefficients.

By Muthali Ganesh

I am an engineer with a master’s in business administration from Chennai, India. I love discovering and sharing hacks.