The different variables may correspond to repeated measurements over time, to a battery of surrogates for one or more latent traits, or to multiple types of outcomes having an unknown dependence structure. Hierarchical models that incorporate subject-specific ...
Such subject-specific ... There are two modeling frameworks that have been particularly widely used as hierarchical generalizations of linear regression models. Linear mixed effects (LME) models extend linear regression to incorporate two components, with the ... LMEs have also been increasingly used for function estimation.

One thing to keep in mind when interpreting lasso parameter estimates is that they are biased toward zero because of the shrinkage (Tibshirani). To address this bias, one can refit the model without any penalty in a second stage that includes only the chosen subset of predictors.
Because we did not follow this two-stage approach, the regularized coefficients are best interpreted only as zero or nonzero. We have argued that regularized SEM is a powerful and underutilized method for researchers who want to examine a relatively large number of predictors, or who have a relatively modest sample size combined with a model of moderate complexity.
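The two-stage idea described above (select with the lasso, then refit without a penalty) can be sketched for ordinary regression. This is a minimal illustration under stated assumptions, not the procedure used in the article: the data are simulated, there is no intercept because the simulated variables have mean zero, and the helper names `lasso_cd` and `ols` as well as the penalty value `lam=0.2` are hypothetical choices made for the example.

```python
import random

def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    col_ms = [sum(row[j] ** 2 for row in X) / n for j in range(p)]  # mean square per column
    resid = list(y)  # residuals for the current coefficients (all zero at the start)
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that ignores column j's own contribution.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * coef[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_ms[j]
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                coef[j] = new
    return coef

def ols(X, y):
    """Unpenalised least squares via the normal equations (Gauss-Jordan elimination)."""
    n, p = len(X), len(X[0])
    A = [[sum(X[i][a] * X[i][k] for i in range(n)) for k in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[c][p] / A[c][c] for c in range(p)]

# Simulated data (an assumption for illustration): two real effects, three noise predictors.
random.seed(1)
true_beta = [1.0, 0.5, 0.0, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in true_beta] for _ in range(500)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 0.5) for row in X]

# Stage 1: the lasso selects predictors, but shrinks the retained coefficients toward zero.
lasso_est = lasso_cd(X, y, lam=0.2)
selected = [j for j, b in enumerate(lasso_est) if b != 0.0]

# Stage 2: refit without a penalty, using only the selected predictors.
X_sel = [[row[j] for j in selected] for row in X]
refit_est = dict(zip(selected, ols(X_sel, y)))

print("lasso:", [round(b, 3) for b in lasso_est])
print("refit:", {j: round(b, 3) for j, b in refit_est.items()})
```

Comparing the two printed sets of estimates shows the shrinkage directly: the stage 1 coefficients for the retained predictors are smaller in magnitude than their stage 2 counterparts. The refit estimates are conditional on the selection step, so their standard errors should not be read as if the predictor set had been fixed in advance.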
In our simulation study, models with lasso penalties incurred less error than MLE models when sample sizes were small and demonstrated higher power to detect effects of small and medium magnitude. Our results illustrate how sample size and the correlation among regressors influence the accuracy of parameter estimates and how variable selection is performed in an extremely complex model. Starting with a complex model of 48 distinct white-matter tracts, the regularized model identified 6 tracts as determinants of visual short-term memory (VSTM).
Finally, in our last example, we used a regularized model to identify a broad set of variables that explain individual differences in stress, anxiety, and depression. Our simulation study showed that regularized SEM may be a viable option for researchers looking to identify relatively low-dimensional sets of predictors in fields with broad sets of candidate variables, such as cognitive neuroscience and behavior genetics. Notably, this technique goes beyond traditional methods used to correct for multiple comparisons in neuroimaging studies.
It may be possible to combine regularized SEM with methods of joint comparisons, such as principal component regression, to estimate the joint predictive value of multiple components across many voxels even in cases with modest sample sizes. Although we have illustrated several benefits of regularization in regression and SEM when sample sizes are small, we did not include any conditions with sample sizes below in our simulation study.
This was mostly due to the complexity of our model, as we were unable to achieve stable estimates at a sample size of or below. In regularized regression, it is possible to test models in which there are more predictors than observations; however, to our knowledge, methods of testing models with more predictors than observations have not been extended to SEM, and we were unsuccessful in our attempt to apply regularized SEM to such cases in our simulation study. Additionally, bias induced by high degrees of collinearity may be reduced by first creating factor scores, and thus fixing the factor loadings.
Frequentist software for regularized SEM currently requires complete cases. As it is rare for psychological data to have no missing values, this requirement is currently a considerable weakness of regularized SEM. One strategy for modeling data with missing values is multiple imputation. The main issue with using this strategy with regularized SEM concerns how to combine the results. In traditional multiple imputation for SEM, parameter estimates can be aggregated across 10 to 20 data sets or more by averaging the parameter estimates and correcting the standard errors for the lack of randomness in the process.
However, regularization is most often used to perform variable selection, and this necessitates a way to aggregate a set of 0-or-1 (exclude or include) decisions across imputed data sets. Lockhart, Taylor, Tibshirani, and Tibshirani have derived sampling distributions to calculate p values that take into account the adaptive nature of the lasso regression model, but this work has not been extended to SEM with the lasso. When parameter estimates are not accompanied by p values or confidence intervals, researchers may feel uncertain in making inferences. Consequently, inference can be more challenging with regularized structural equation models than with regularized regression models, particularly given the inherent bias in estimation.
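To make the aggregation problem concrete, the sketch below shows one possible way to combine include/exclude decisions across imputed data sets: compute, for each penalized path, the proportion of imputations in which it was estimated as nonzero, and keep paths that pass a vote threshold. The voting rule, the threshold of 0.5, the function names, and the path labels and estimates are all assumptions introduced for illustration; the text above does not prescribe this rule.

```python
def selection_proportions(estimates_by_imputation, tol=1e-8):
    """Proportion of imputed data sets in which each parameter was estimated as nonzero."""
    m = len(estimates_by_imputation)
    names = estimates_by_imputation[0].keys()
    return {
        name: sum(abs(est[name]) > tol for est in estimates_by_imputation) / m
        for name in names
    }

def select_by_vote(proportions, threshold=0.5):
    """Keep parameters selected in at least `threshold` of the imputations (assumed rule)."""
    return sorted(name for name, prop in proportions.items() if prop >= threshold)

# Hypothetical penalized path estimates from four imputed data sets.
imputed_fits = [
    {"f ~ age": 0.31, "f ~ sex": 0.00, "f ~ income": 0.12},
    {"f ~ age": 0.28, "f ~ sex": 0.00, "f ~ income": 0.00},
    {"f ~ age": 0.35, "f ~ sex": 0.04, "f ~ income": 0.09},
    {"f ~ age": 0.30, "f ~ sex": 0.00, "f ~ income": 0.15},
]

proportions = selection_proportions(imputed_fits)
kept = select_by_vote(proportions, threshold=0.5)
print(proportions)
print(kept)
```

A lower threshold gives a more inclusive selection and a higher one a more conservative selection, so the threshold plays a role similar to the choice of penalty.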
One proposed method for overcoming this challenge is the relaxed lasso (Meinshausen), which has been shown to produce unbiased parameter estimates when applied to mediation models (Serang et al.). It may be difficult to change the mind-set of relying on p values and instead to characterize nonzero paths as important.
To overcome this difficulty, we recommend thinking in terms of generalizing to an alternative sample. Although researchers may incur bias when using regularization, the more important aim is generalization, which is achieved by reducing variance and preferring models of a complexity that is afforded by the observed data. This holds true particularly for exploratory studies, which are less concerned with within-sample inference and more concerned with informing future research.
In many cases the constraints that are meaningful in one framework translate to constraints in the equivalent model that lack a clear interpretation in the other framework. In other words, this approach reflects the hypothesis that the true underlying model has few nonzero parameters. From top to bottom, the graphs show results for noise predictors and predictors with small, medium, and large effect sizes.
In exploratory studies, we generally recommend a liberal stance; that is, more emphasis should be given to the inclusion of potentially important variables, and the possibility of including variables that do not have either predictive or inferential value should be of less concern. In an ideal setting, researchers would apply regularized SEM to data from a pilot or initial study in the hopes of being maximally efficient in identifying what variables should be included in a future, possibly larger study. Our simulation study supports the idea that applying MLE when the sample is small and the number of variables is large will result in the exclusion of potentially relevant variables.
Note, however, that our conclusions depended not only on the method of regularization applied but also on the specific heuristic for choosing the penalty. The penalty values that align with the goals of researchers who want to be relatively inclusive in variable selection ... SEM trees directly use the observed covariates to partition observations, and in the process, only a subset of covariates is used to create the model. This allows researchers to uncover nonlinearities and interactions. Additional methods include the use of heuristic search algorithms.
One of the biggest challenges to such work is software implementation. We encourage researchers to think of regularization as an approach that can combine confirmatory and exploratory modeling. Regularization gives researchers more flexibility to make both their uncertainty and their knowledge concrete. It is particularly suitable when researchers hope to use a principled approach to go beyond the limitations of their theory to identify potentially fruitful avenues for future study.
In both our simulation and our empirical examples, we conducted exploratory searches for important predictors in relation to a confirmatory latent-variable model. This is only one example of how these types of modeling can be fused, and we look forward to seeing new areas of application.
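We hope that this article sheds light on a new family of statistical methods that have much utility for psychological research.

With two predictors, $y_1$ and $y_2$, and just one indicator, $x_1$, of a single latent variable, the covariance between $x_1$ and $y_1$ is $\operatorname{cov}(x_1, y_1) = \lambda_1 \beta_1 \operatorname{var}(y_1) + \lambda_1 \beta_2 \operatorname{cov}(y_1, y_2)$, where $\lambda_1$ denotes the factor loading of $x_1$ and $\beta_1$ and $\beta_2$ denote the regression coefficients of the latent variable on $y_1$ and $y_2$. This equation means that when predictor covariance is high, the estimation of the second regression coefficient plays a large role.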
Thus, whenever parameters are overpenalized (either because the sample is not large enough to estimate them or because sparsity is desired), this bias not only is incurred in the regression but also trickles down to the factor loadings. Adding a large number of predictors makes the problem much worse.
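The argument above can be illustrated with a small numeric sketch. It assumes a standard specification in which the indicator is x1 = loading * latent + error, the latent variable is regressed on y1 and y2 with coefficients beta1 and beta2, and the errors are uncorrelated with the predictors, so that cov(x1, y1) = loading * (beta1 * var(y1) + beta2 * cov(y1, y2)). The function names and all parameter values are hypothetical, and holding beta1 fixed is a deliberate simplification; in practice all parameters are estimated jointly.

```python
def implied_cov_x1_y1(loading, beta1, beta2, var_y1, cov_y1_y2):
    """Model-implied covariance between the indicator x1 and the predictor y1."""
    return loading * (beta1 * var_y1 + beta2 * cov_y1_y2)

def compensating_loading(target_cov, beta1, beta2, var_y1, cov_y1_y2):
    """Loading needed to reproduce target_cov once beta1 and beta2 are fixed."""
    return target_cov / (beta1 * var_y1 + beta2 * cov_y1_y2)

# Hypothetical population values: loading 1.0, both regression coefficients 0.5.
for cov_y1_y2 in (0.1, 0.8):
    target = implied_cov_x1_y1(1.0, 0.5, 0.5, 1.0, cov_y1_y2)
    # Overpenalisation forces beta2 to zero; beta1 is held at 0.5 for simplicity.
    distorted = compensating_loading(target, 0.5, 0.0, 1.0, cov_y1_y2)
    print(f"cov(y1, y2) = {cov_y1_y2}: loading needed to fit cov(x1, y1) = {distorted:.2f}")

low_cov_loading = compensating_loading(
    implied_cov_x1_y1(1.0, 0.5, 0.5, 1.0, 0.1), 0.5, 0.0, 1.0, 0.1)
high_cov_loading = compensating_loading(
    implied_cov_x1_y1(1.0, 0.5, 0.5, 1.0, 0.8), 0.5, 0.0, 1.0, 0.8)
```

Under these assumed values, the loading must move further from its population value of 1.0 when the predictors covary strongly, which is the sense in which the bias reaches the factor loadings.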
Action Editor: Jennifer L. Tackett served as action editor for this article.
Author Contributions: R. Jacobucci, A. Brandmaier, and R. Kievit generated the idea for the studies and developed the simulation specification. R. Jacobucci ran the analyses. All three authors analyzed the results, generated the figures, and wrote the manuscript. All the authors approved the final submitted version of the manuscript.
Declaration of Conflicting Interests: The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Open Practices. This article has received the badge for Open Materials. Note that regression paths can be penalized regardless of which variables they connect. For example, paths from manifest variables to latent variables can be penalized, as can the reversed paths, paths between latent variables, and paths between manifest variables. In fact, lasso regression can be seen as a subset of the RegSEM lasso method.
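The nesting of lasso regression within the RegSEM lasso can be made explicit by writing the two objective functions side by side. The notation here is assumed for illustration: $F_{\text{ML}}$ is the maximum likelihood fit function of the structural equation model, $\lambda$ is the penalty, and $\mathcal{P}$ is the set of penalized parameters.

```latex
\begin{align}
F_{\text{lasso}}(\beta)  &= \frac{1}{2n}\lVert y - X\beta \rVert_2^2
                            + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\
F_{\text{regsem}}(\theta) &= F_{\text{ML}}(\theta)
                            + \lambda \sum_{j \in \mathcal{P}} \lvert \theta_j \rvert
\end{align}
```

When the model contains only manifest variables, with one outcome regressed on the predictors, and $\mathcal{P}$ holds exactly those regression paths, the second objective penalizes the same coefficients as the first. The two then differ only in the fit function and its scaling, so a given value of $\lambda$ need not correspond to the same amount of shrinkage in both.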
We tested a sample size of as well, but with this sample size the models generated by the regsem package failed to converge at a high rate. Therefore, we did not include these results. Advances in Methods and Practices in Psychological Science.