Computational Aspects, Statistical Algorithms and Software in Psychometrics

A special issue of Psych (ISSN 2624-8611). This special issue belongs to the section "Psychometrics and Educational Measurement".

Deadline for manuscript submissions: closed (31 August 2021) | Viewed by 132319

Special Issue Editor


Dr. Alexander Robitzsch
Guest Editor
IPN – Leibniz Institute for Science and Mathematics Education, University of Kiel, Olshausenstraße 62, 24118 Kiel, Germany
Interests: item response models; linking; methodology in large-scale assessments; multilevel models; missing data; cognitive diagnostic models; Bayesian methods and regularization

Special Issue Information

Dear Colleagues,

Statistical software in psychometrics has made tremendous progress in providing open-source solutions (e.g., the software R, Julia, and Python). This Special Issue therefore focuses, on the one hand, on computational aspects and statistical algorithms for psychometric methods: shared experiences with efficient implementations, or with handling vast datasets in psychometric modeling, are of particular interest. On the other hand, articles introducing new software packages are invited. We also invite software reviews that evaluate a single package, review several packages, or compare packages empirically, as well as software tutorials that give applied researchers guidance on estimating recent psychometric models in statistical software. Potential psychometric models include, but are not limited to, item response models, structural equation models, and multilevel models.

Dr. Alexander Robitzsch
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Psych is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1200 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • statistical software
  • estimation algorithms
  • software tutorials
  • software reviews
  • item response models
  • multilevel models
  • structural equation models
  • open-source software

Published Papers (32 papers)

Editorial

5 pages, 225 KiB  
Editorial
Editorial of the Psych Special Issue “Computational Aspects, Statistical Algorithms and Software in Psychometrics”
by Alexander Robitzsch
Psych 2022, 4(1), 114-118; https://0-doi-org.brum.beds.ac.uk/10.3390/psych4010011 - 02 Mar 2022
Cited by 1 | Viewed by 1949
Abstract
Statistical software in psychometrics has made tremendous progress in providing open-source solutions (e.g., the software R, Julia, and Python). [...]

Research

16 pages, 372 KiB  
Article
Evaluating Stan’s Variational Bayes Algorithm for Estimating Multidimensional IRT Models
by Esther Ulitzsch and Steffen Nestler
Psych 2022, 4(1), 73-88; https://0-doi-org.brum.beds.ac.uk/10.3390/psych4010007 - 05 Feb 2022
Cited by 1 | Viewed by 2806
Abstract
Bayesian estimation of multidimensional item response theory (IRT) models in large data sets may come with impractical computational burdens when general-purpose Markov chain Monte Carlo (MCMC) samplers are employed. Variational Bayes (VB)—a method for approximating the posterior distribution—poses a potential remedy. Stan’s general-purpose VB algorithms have drastically improved the accessibility of VB methods for a wide psychometric audience. Using marginal maximum likelihood (MML) and MCMC as benchmarks, the present simulation study investigates the utility of Stan’s built-in VB function for estimating multidimensional IRT models with between-item dimensionality. VB yielded a marked speed-up in comparison to MCMC, but did not generally outperform MML in terms of run time. VB estimates were trustworthy only for item difficulties, while bias in item discriminations depended on the model’s dimensionality. Under realistic conditions of non-zero correlations between dimensions, VB correlation estimates were subject to severe bias. The practical relevance of performance differences is illustrated with data from PISA 2018. We conclude that in its current form, Stan’s built-in VB algorithm does not pose a viable alternative for estimating multidimensional IRT models.
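
For readers who want to try the approach, a minimal sketch of both estimators with the rstan interface follows; the Stan program file (irt_2pl.stan) and the data list (stan_data) are assumptions, not materials from the article.

```r
library(rstan)

# Compile a multidimensional IRT model; "irt_2pl.stan" is a placeholder name.
mod <- stan_model("irt_2pl.stan")

# Stan's built-in variational Bayes (ADVI); mean-field is the default variant.
fit_vb <- vb(mod, data = stan_data, algorithm = "meanfield",
             output_samples = 1000, seed = 1)

# General-purpose MCMC (NUTS) as the benchmark.
fit_mcmc <- sampling(mod, data = stan_data, chains = 4, iter = 2000, seed = 1)
```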

28 pages, 380 KiB  
Article
An Introduction to Factored Regression Models with Blimp
by Brian Tinnell Keller
Psych 2022, 4(1), 10-37; https://0-doi-org.brum.beds.ac.uk/10.3390/psych4010002 - 31 Dec 2021
Cited by 1 | Viewed by 3296
Abstract
In this paper, we provide an introduction to the factored regression framework. This modeling framework applies the rules of probability to break up or “factor” a complex joint distribution into a product of conditional regression models. Using this framework, we can easily specify the complex multivariate models that missing data modeling requires. The article provides a brief conceptual overview of factored regression and describes the functional notation used to conceptualize the models. Furthermore, we present a conceptual overview of how the models are estimated and imputations are obtained. Finally, we discuss how the free software package Blimp can be used to estimate the models in the context of a mediation example.

19 pages, 425 KiB  
Article
Automated Essay Scoring Using Transformer Models
by Sabrina Ludwig, Christian Mayer, Christopher Hansen, Kerstin Eilers and Steffen Brandt
Psych 2021, 3(4), 897-915; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040056 - 14 Dec 2021
Cited by 12 | Viewed by 5672
Abstract
Automated essay scoring (AES) is gaining increasing attention in the education sector as it significantly reduces the burden of manual scoring and allows ad hoc feedback for learners. Natural language processing based on machine learning has been shown to be particularly suitable for text classification and AES. While many machine-learning approaches for AES still rely on a bag-of-words (BOW) approach, we consider a transformer-based approach in this paper, compare its performance to a logistic regression model based on the BOW approach, and discuss their differences. The analysis is based on 2088 email responses to a problem-solving task that were manually labeled in terms of politeness. Both transformer models considered in the analysis outperformed the regression-based model without any hyperparameter tuning. We argue that, for AES tasks such as politeness classification, the transformer-based approach has significant advantages, while a BOW approach suffers from not taking word order into account and from reducing words to their stems. Further, we show how such models can help increase the accuracy of human raters, and we provide detailed instructions on how to implement transformer-based models for one’s own purposes.

24 pages, 2913 KiB  
Article
Cognitively Diagnostic Analysis Using the G-DINA Model in R
by Qingzhou Shi, Wenchao Ma, Alexander Robitzsch, Miguel A. Sorrel and Kaiwen Man
Psych 2021, 3(4), 812-835; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040052 - 08 Dec 2021
Cited by 6 | Viewed by 5191
Abstract
Cognitive diagnosis models (CDMs) have increasingly been applied in education and other fields. This article provides an overview of a widely used CDM, namely, the G-DINA model, and demonstrates a hands-on example of using multiple R packages for a series of CDM analyses. This overview involves a step-by-step illustration and explanation of performing Q-matrix evaluation, CDM calibration, model fit evaluation, item diagnosticity investigation, classification reliability examination, and the result presentation and visualization. Some limitations of conducting CDM analysis in R are also discussed.
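
As a flavor of that workflow, here is a minimal sketch using the GDINA package and its bundled simulated data; the article itself walks through several packages and many more options.

```r
library(GDINA)

dat <- sim10GDINA$simdat   # simulated responses shipped with the package
Q   <- sim10GDINA$simQ     # corresponding Q-matrix

est <- GDINA(dat = dat, Q = Q, model = "GDINA")   # G-DINA calibration

Qval(est)         # empirical Q-matrix validation
modelfit(est)     # absolute model fit evaluation
itemfit(est)      # item-level fit
personparm(est)   # attribute-profile (classification) estimates
```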

29 pages, 734 KiB  
Article
Comparing the MCMC Efficiency of JAGS and Stan for the Multi-Level Intercept-Only Model in the Covariance- and Mean-Based and Classic Parametrization
by Martin Hecht, Sebastian Weirich and Steffen Zitzmann
Psych 2021, 3(4), 751-779; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040048 - 30 Nov 2021
Cited by 8 | Viewed by 3203
Abstract
Bayesian MCMC is a widely used model estimation technique, and software from the BUGS family, such as JAGS, has been popular for over two decades. Recently, Stan entered the market with promises of higher efficiency fueled by advanced and more sophisticated algorithms. With this study, we want to contribute empirical results to the discussion about the sampling efficiency of JAGS and Stan. We conducted three simulation studies in which we varied the number of warmup iterations, the prior informativeness, and the sample size, and employed the multi-level intercept-only model in the covariance- and mean-based and in the classic parametrization. The target outcome was MCMC efficiency measured as effective sample size per second (ESS/s). Based on our specific (and limited) study setup, we found that (1) MCMC efficiency is much higher for the covariance- and mean-based parametrization than for the classic parametrization, (2) Stan clearly outperforms JAGS when the covariance- and mean-based parametrization is used, and (3) JAGS clearly outperforms Stan when the classic parametrization is used.
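
The efficiency criterion itself is easy to reproduce for any rstan fit; a sketch, assuming fit is a stanfit object:

```r
library(rstan)

ess  <- summary(fit)$summary[, "n_eff"]         # effective sample sizes
secs <- sum(get_elapsed_time(fit)[, "sample"])  # post-warmup seconds, all chains

ess_per_sec <- ess / secs                       # ESS/s, the study's outcome
sort(ess_per_sec)                               # least efficient parameters first
```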

23 pages, 541 KiB  
Article
Concepts and Coefficients Based on John L. Holland’s Theory of Vocational Choice—Examining the R Package holland
by Florian G. Hartmann, Jörg-Henrik Heine and Bernhard Ertl
Psych 2021, 3(4), 728-750; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040047 - 29 Nov 2021
Cited by 4 | Viewed by 10734
Abstract
John L. Holland’s theory of vocational choice is one of the most prominent career theories and is used by both researchers and practitioners around the world. The theory states that people should seek work environments that fit their vocational interests in order to be satisfied and successful. Its application in research and practice requires the determination of coefficients, which quantify its core concepts such as person-environment fit. The recently released R package holland aims at providing a holistic collection of the references, descriptions and calculations of the most important coefficients. The current paper presents the package and examines it in terms of its application for research and practice. For this purpose, the functions of the package are applied and discussed. Furthermore, recommendations are made for cases in which multiple coefficients exist for the same theoretical concept, and features that future releases should include are discussed. The R package holland is a promising computational environment providing multiple coefficients for Holland’s most important theoretical concepts.

14 pages, 493 KiB  
Article
Anonymiced Shareable Data: Using mice to Create and Analyze Multiply Imputed Synthetic Datasets
by Thom Benjamin Volker and Gerko Vink
Psych 2021, 3(4), 703-716; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040045 - 23 Nov 2021
Cited by 4 | Viewed by 4018
Abstract
Synthetic datasets simultaneously allow for the dissemination of research data while protecting the privacy and confidentiality of respondents. Generating and analyzing synthetic datasets is straightforward, yet a synthetic data analysis pipeline is seldom adopted by applied researchers. We outline a simple procedure for generating and analyzing synthetic datasets with the multiple imputation software mice (Version 3.13.15) in R. We demonstrate through simulations that the analysis results obtained on synthetic data yield unbiased and valid inferences and lead to synthetic records that cannot be distinguished from the true data records. The ease of use when synthesizing data with mice, along with the validity of inferences obtained through this procedure, opens up a wealth of possibilities for data dissemination and further research on initially private data.
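
A minimal sketch of the core idea: flagging every cell for imputation via mice's where argument overimputes the observed data, so the completed data sets are fully synthetic. The variable names in the analysis step are illustrative, and pooling for synthetic data follows different rules than ordinary multiple imputation (see the article).

```r
library(mice)

# Flag every cell for (over)imputation to obtain fully synthetic data sets.
syn <- mice(dat, m = 10, method = "cart",
            where = matrix(TRUE, nrow(dat), ncol(dat)), seed = 1)

# Analyze each synthetic data set; y, x1, x2 are assumed column names.
fits <- with(syn, lm(y ~ x1 + x2))
```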

21 pages, 435 KiB  
Article
Handling Missing Responses in Psychometrics: Methods and Software
by Shenghai Dai
Psych 2021, 3(4), 673-693; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040043 - 19 Nov 2021
Cited by 11 | Viewed by 3899
Abstract
The presence of missing responses in assessment settings is inevitable and may yield biased parameter estimates in psychometric modeling if ignored or handled improperly. Many methods have been proposed to handle missing responses in assessment data that are often dichotomous or polytomous. Their applications remain nominal, however, partly because (1) the literature offers no sufficient support for an optimal method; (2) many practitioners and researchers are not familiar with these methods; and (3) these methods are usually not implemented in psychometric software, so missing responses need to be handled separately. This article introduces and reviews the commonly used missing response handling methods in psychometrics, along with the literature that examines and compares the performance of these methods. Further, the use of the TestDataImputation package in R is introduced and illustrated with an example data set and a simulation study. Corresponding R codes are provided.
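
One of the classical methods in this literature, two-way imputation, is simple enough to sketch by hand: a missing response is replaced by person mean + item mean - grand mean, rounded for dichotomous items. A base-R sketch, assuming X is a persons-by-items 0/1 matrix with NAs (packages such as TestDataImputation provide ready-made implementations of such methods):

```r
two_way_impute <- function(X) {
  pm <- rowMeans(X, na.rm = TRUE)  # person means
  im <- colMeans(X, na.rm = TRUE)  # item means
  gm <- mean(X, na.rm = TRUE)      # grand mean
  for (i in seq_len(nrow(X))) {
    for (j in seq_len(ncol(X))) {
      if (is.na(X[i, j])) {
        # Two-way estimate, rounded to 0/1 for dichotomous items.
        X[i, j] <- as.numeric(pm[i] + im[j] - gm >= 0.5)
      }
    }
  }
  X
}
```
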
21 pages, 3026 KiB  
Article
An Evaluation of DIF Tests in Multistage Tests for Continuous Covariates
by Rudolf Debelak and Dries Debeer
Psych 2021, 3(4), 618-638; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040040 - 15 Oct 2021
Cited by 2 | Viewed by 2321
Abstract
Multistage tests are a widely used and efficient type of test presentation that aims to provide accurate ability estimates while keeping the test relatively short. Multistage tests typically rely on the psychometric framework of item response theory. Violations of item response models and other assumptions underlying a multistage test, such as differential item functioning, can lead to inaccurate ability estimates and unfair measurements. There is a practical need for methods to detect problematic model violations to avoid these issues. This study compares and evaluates three methods for the detection of differential item functioning with regard to continuous person covariates in data from multistage tests: a linear logistic regression test and two adaptations of a recently proposed score-based DIF test. While all tests show a satisfactory Type I error rate, the score-based tests show greater power against three types of DIF effects.
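
The logistic regression approach generalizes directly to a continuous covariate: the item response is regressed on the matching score, the covariate, and their interaction, and the covariate terms capture uniform and non-uniform DIF. A base-R sketch for a single item, with assumed vectors resp (0/1 responses), rest (matching rest score), and age (continuous covariate):

```r
m0 <- glm(resp ~ rest,                  family = binomial)  # no DIF
m1 <- glm(resp ~ rest + age,            family = binomial)  # + uniform DIF
m2 <- glm(resp ~ rest + age + rest:age, family = binomial)  # + non-uniform DIF

anova(m0, m2, test = "LRT")  # likelihood-ratio test against any DIF in age
```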

25 pages, 1964 KiB  
Article
The Theoretical and Statistical Ising Model: A Practical Guide in R
by Adam Finnemann, Denny Borsboom, Sacha Epskamp and Han L. J. van der Maas
Psych 2021, 3(4), 593-617; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040039 - 08 Oct 2021
Cited by 8 | Viewed by 6420
Abstract
The “Ising model” refers to both the statistical and the theoretical use of the same equation. In this article, we introduce both uses and contrast their differences. We accompany the conceptual introduction with a survey of Ising-related software packages in R. Since the model’s different uses are best understood through simulations, we make this process easily accessible with fully reproducible examples. Using simulations, we show how the theoretical Ising model captures local-alignment dynamics. Subsequently, we present it statistically as a likelihood function for estimating empirical network models from binary data. In this process, we give recommendations on when to use traditional frequentist estimators as well as novel Bayesian options.
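
A compact illustration of the two uses with packages covered by the survey: IsingSampler simulates from a theoretical Ising model, and IsingFit estimates the statistical (eLasso) network back from the binary data. Network size and parameter values are illustrative.

```r
library(IsingSampler)
library(IsingFit)

set.seed(1)
k <- 5
W <- matrix(0, k, k)
W[upper.tri(W)] <- rbinom(k * (k - 1) / 2, 1, 0.4) * 0.5
W <- W + t(W)            # symmetric weight matrix, zero diagonal
tau <- rep(-0.5, k)      # thresholds

dat <- IsingSampler(n = 1000, graph = W, thresholds = tau)  # theoretical use

fit <- IsingFit(dat, family = "binomial", plot = FALSE)     # statistical use
fit$weiadj               # estimated weight matrix
```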

17 pages, 3206 KiB  
Article
Bivariate Distributions Underlying Responses to Ordinal Variables
by Laura Kolbe, Frans Oort and Suzanne Jak
Psych 2021, 3(4), 562-578; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040037 - 01 Oct 2021
Cited by 2 | Viewed by 3235
Abstract
The association between two ordinal variables can be expressed with a polychoric correlation coefficient. This coefficient is conventionally based on the assumption that responses to ordinal variables are generated by two underlying continuous latent variables with a bivariate normal distribution. When the underlying bivariate normality assumption is violated, the estimated polychoric correlation coefficient may be biased. In such a case, we may consider other distributions. In this paper, we aimed to provide an illustration of fitting various bivariate distributions to empirical ordinal data and examining how estimates of the polychoric correlation may vary under different distributional assumptions. Results suggested that the bivariate normal and skew-normal distributions rarely hold in the empirical datasets. In contrast, mixtures of bivariate normal distributions were often not rejected.
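
Under the conventional assumption, the polychoric correlation is a one-liner with the psych package; the alternative underlying distributions examined in the article require custom likelihoods instead.

```r
library(psych)

# dat: assumed data frame of ordinal (e.g., Likert-type) variables.
pc <- polychoric(dat)
pc$rho   # polychoric correlation matrix under bivariate normality
pc$tau   # estimated thresholds
```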

11 pages, 489 KiB  
Article
Robust Chi-Square in Extreme and Boundary Conditions: Comments on Jak et al. (2021)
by Tihomir Asparouhov and Bengt Muthén
Psych 2021, 3(3), 542-551; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030035 - 10 Sep 2021
Cited by 4 | Viewed by 2648
Abstract
In this article we describe a modification of the robust chi-square test of fit that yields more accurate Type I error rates when the estimated model is at the boundary of the admissible space.

21 pages, 3237 KiB  
Article
Modelling Norm Scores with the cNORM Package in R
by Sebastian Gary, Wolfgang Lenhard and Alexandra Lenhard
Psych 2021, 3(3), 501-521; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030033 - 30 Aug 2021
Cited by 9 | Viewed by 3340
Abstract
In this article, we explain and demonstrate how to model norm scores with the cNORM package in R. This package is designed specifically to determine norm scores when the latent ability to be measured covaries with age or other explanatory variables such as grade level. The mathematical method used in this package draws on polynomial regression to model a three-dimensional hyperplane that smoothly and continuously captures the relation between raw scores, norm scores and the explanatory variable. By doing so, it overcomes the typical problems of classical norming methods, such as overly large age intervals, missing norm scores, large amounts of sampling error in the subsamples or huge requirements with regard to the sample size. After a brief introduction to the mathematics of the model, we describe the individual methods of the package. We close the article with a practical example using data from a real reading comprehension test.
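
In its simplest form the package needs only raw scores and a grouping variable. A sketch using the reading-comprehension data shipped with the package; the norm-table call is an assumption about the interface, so consult the package documentation for details.

```r
library(cNORM)

# elfe: reading comprehension data included in the package.
model <- cnorm(raw = elfe$raw, group = elfe$group)

# Norm table for one level of the explanatory variable (assumed interface).
normTable(3, model)
```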

22 pages, 524 KiB  
Article
Estimating the Stability of Psychological Dimensions via Bootstrap Exploratory Graph Analysis: A Monte Carlo Simulation and Tutorial
by Alexander P. Christensen and Hudson Golino
Psych 2021, 3(3), 479-500; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030032 - 27 Aug 2021
Cited by 70 | Viewed by 6760
Abstract
Exploratory Graph Analysis (EGA) has emerged as a popular approach for estimating the dimensionality of multivariate data using psychometric networks. Sampling variability, however, has made reproducibility and generalizability a key issue in network psychometrics. To address this issue, we have developed a novel bootstrap approach called Bootstrap Exploratory Graph Analysis (bootEGA). bootEGA generates a sampling distribution of EGA results from which several statistics can be computed. Descriptive statistics (median, standard error, and dimension frequency) provide researchers with a general sense of the stability of their empirical EGA dimensions. Structural consistency estimates how often dimensions are replicated exactly across the bootstrap replicates. Item stability statistics provide information about whether dimensions are unstable due to misallocation (e.g., item placed in the wrong dimension), multidimensionality (e.g., item belonging to more than one dimension), or item redundancy (e.g., similar semantic content). Using a Monte Carlo simulation, we determine guidelines for acceptable item stability. Afterward, we provide an empirical example that demonstrates how bootEGA can be used to identify structural consistency issues (including a fully reproducible R tutorial). In sum, we demonstrate that bootEGA is a robust approach for identifying the stability and robustness of dimensionality in multivariate data.
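
A condensed sketch of the workflow with the EGAnet package, assuming dat is a data frame of item responses:

```r
library(EGAnet)

ega  <- EGA(dat)                    # empirical EGA dimensions
boot <- bootEGA(dat, iter = 500,    # bootstrap sampling distribution
                type = "parametric")

dimensionStability(boot)            # structural consistency per dimension
itemStability(boot)                 # item-level replication frequencies
```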

25 pages, 3122 KiB  
Article
shinyReCoR: A Shiny Application for Automatically Coding Text Responses Using R
by Nico Andersen and Fabian Zehner
Psych 2021, 3(3), 422-446; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030030 - 16 Aug 2021
Cited by 10 | Viewed by 3926
Abstract
In this paper, we introduce shinyReCoR: a new app that utilizes a cluster-based method for automatically coding open-ended text responses. Reliable coding of text responses from educational or psychological assessments requires substantial organizational and human effort. The coding of natural language in responses to tests depends on the texts’ complexity, corresponding coding guides, and the guides’ quality. Manual coding is thus not only expensive but also error-prone. With shinyReCoR, we provide a more efficient alternative. The use of natural language processing makes texts utilizable for statistical methods. shinyReCoR is a Shiny app deployed as an R-package that allows users with varying technical affinity to create automatic response classifiers through a graphical user interface based on annotated data. The present paper describes the underlying methodology, including machine learning, as well as peculiarities of the processing of language in the assessment context. The app guides users through the workflow with steps like text corpus compilation, semantic space building, preprocessing of the text data, and clustering. Users can adjust each step according to their needs. Finally, users are provided with an automatic response classifier, which can be evaluated and tested within the process.

18 pages, 360 KiB  
Article
Between-Item Multidimensional IRT: How Far Can the Estimation Methods Go?
by Mauricio Garnier-Villarreal, Edgar C. Merkle and Brooke E. Magnus
Psych 2021, 3(3), 404-421; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030029 - 09 Aug 2021
Cited by 7 | Viewed by 3376
Abstract
Multidimensional item response models are known to be difficult to estimate, with a variety of estimation and modeling strategies being proposed to handle the difficulties. While some previous studies have considered the performance of these estimation methods, they typically include only one or two methods, or a small number of factors. In this paper, we report on a large simulation study of between-item multidimensional IRT estimation methods, considering five different methods, a variety of sample sizes, and up to eight factors. This study provides a comprehensive picture of the methods’ relative performance, as well as each individual method’s strengths and weaknesses. The study results lead us to make recommendations for applied research, related to which estimation methods should be used under various scenarios.
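
Several of the estimation methods compared in such studies are exposed through a single argument in the mirt package; a sketch for a between-item three-factor 2PL (the item blocks are illustrative):

```r
library(mirt)

model <- "
  F1 = 1-10
  F2 = 11-20
  F3 = 21-30
  COV = F1*F2*F3
"

fit_em  <- mirt(dat, model, itemtype = "2PL", method = "EM")     # quadrature EM
fit_qmc <- mirt(dat, model, itemtype = "2PL", method = "QMCEM")  # quasi-Monte Carlo EM
fit_mh  <- mirt(dat, model, itemtype = "2PL", method = "MHRM")   # stochastic approximation

coef(fit_qmc, simplify = TRUE)
```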

18 pages, 1961 KiB  
Article
cdcatR: An R Package for Cognitive Diagnostic Computerized Adaptive Testing
by Miguel A. Sorrel, Pablo Nájera and Francisco J. Abad
Psych 2021, 3(3), 386-403; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030028 - 09 Aug 2021
Cited by 4 | Viewed by 2766
Abstract
Cognitive diagnosis models (CDMs) are confirmatory latent class models that provide fine-grained information about skills and cognitive processes. These models have gained attention in the last few years because of their usefulness in educational and psychological settings. Recently, numerous developments have been made to allow for the implementation of cognitive diagnosis computerized adaptive testing (CD-CAT). Despite methodological advances, CD-CAT applications are still scarce. To facilitate research and the emergence of empirical applications in this area, we have developed the cdcatR package for R software. The purpose of this document is to illustrate the different functions included in this package. The package includes functionalities for data generation, model selection based on relative fit information, implementation of several item selection rules (including item exposure control), and CD-CAT performance evaluation in terms of classification accuracy, item exposure, and test length. In conclusion, an R package is made available to researchers and practitioners that allows for an easy implementation of CD-CAT in both simulation and applied studies. Ultimately, this is expected to facilitate the development of empirical applications in this area.

26 pages, 1101 KiB  
Article
Predicting Differences in Model Parameters with Individual Parameter Contribution Regression Using the R Package ipcr
by Manuel Arnold, Andreas M. Brandmaier and Manuel C. Voelkle
Psych 2021, 3(3), 360-385; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030027 - 06 Aug 2021
Cited by 3 | Viewed by 2706
Abstract
Unmodeled differences between individuals or groups can bias parameter estimates and may lead to false-positive or false-negative findings. Such instances of heterogeneity can often be detected and predicted with additional covariates. However, predicting differences with covariates can be challenging or even infeasible, depending on the modeling framework and type of parameter. Here, we demonstrate how the individual parameter contribution (IPC) regression framework, as implemented in the R package ipcr, can be leveraged to predict differences in any parameter across a wide range of parametric models. First and foremost, IPC regression is an exploratory analysis technique to determine if and how the parameters of a fitted model vary as a linear function of covariates. After introducing the theoretical foundation of IPC regression, we use an empirical data set to demonstrate how parameter differences in a structural equation model can be predicted with the ipcr package. Then, we analyze the performance of IPC regression in comparison to alternative methods for modeling parameter heterogeneity in a Monte Carlo simulation.

12 pages, 397 KiB  
Article
Using the Effective Sample Size as the Stopping Criterion in Markov Chain Monte Carlo with the Bayes Module in Mplus
by Steffen Zitzmann, Sebastian Weirich and Martin Hecht
Psych 2021, 3(3), 336-347; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030025 - 30 Jul 2021
Cited by 10 | Viewed by 3154
Abstract
Bayesian modeling using Markov chain Monte Carlo (MCMC) estimation requires researchers to decide not only whether estimation has converged but also whether the Bayesian estimates are well-approximated by summary statistics from the chain. However, in software such as the Bayes module in Mplus, which helps researchers check whether convergence has been achieved by comparing the potential scale reduction (PSR) with a prespecified maximum PSR, the size of the MCMC error or, equivalently, the effective sample size (ESS) is not monitored. Zitzmann and Hecht (2019) proposed a method that can be used to check whether a minimum ESS has been reached in Mplus. In this article, we evaluated this method with a computer simulation. Specifically, we fit a multilevel structural equation model to a large number of simulated data sets and compared different prespecified minimum ESS values with the actual (empirical) ESS values. The empirical values were approximately equal to or larger than the prespecified minimum ones, thus indicating the validity of the method.
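
Once posterior draws are exported, the check itself is a few lines with the coda package; draws is an assumed matrix of post-burn-in draws (parameters in columns):

```r
library(coda)

ess <- effectiveSize(as.mcmc(draws))  # ESS per parameter

min_ess <- 400                        # prespecified minimum ESS
all(ess >= min_ess)                   # FALSE: keep sampling
```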

14 pages, 490 KiB  
Article
Testing and Interpreting Latent Variable Interactions Using the semTools Package
by Alexander M. Schoemann and Terrence D. Jorgensen
Psych 2021, 3(3), 322-335; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030024 - 30 Jul 2021
Cited by 26 | Viewed by 7753
Abstract
Examining interactions among predictors is an important part of a developing research program. Estimating interactions using latent variables provides additional power to detect effects over testing interactions in regression. However, when predictors are modeled as latent variables, estimating and testing interactions requires additional steps beyond the models used for regression. We review methods of estimating and testing latent variable interactions with a focus on product indicator methods. Product indicator methods of examining latent interactions provide an accurate method to estimate and test latent interactions and can be implemented in any latent variable modeling software package. Significant latent interactions require additional steps (plotting and probing) to interpret interaction effects. We demonstrate how these methods can be easily implemented using functions in the semTools package with models fit using the lavaan package in R, and we illustrate how these methods work using an applied example concerning teacher stress and testing.
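
A condensed sketch of the product-indicator workflow with semTools and lavaan; the indicator names (x1-x3, z1-z3, y1-y3) and probing values are illustrative:

```r
library(lavaan)
library(semTools)

# Double-mean-centered, matched product indicators for the interaction.
dat2 <- indProd(dat, var1 = c("x1", "x2", "x3"), var2 = c("z1", "z2", "z3"),
                match = TRUE, doubleMC = TRUE)

model <- "
  X  =~ x1 + x2 + x3
  Z  =~ z1 + z2 + z3
  XZ =~ x1.z1 + x2.z2 + x3.z3
  Y  =~ y1 + y2 + y3
  Y  ~ X + Z + XZ
"
fit <- sem(model, data = dat2)

# Probe and plot the latent interaction at chosen moderator values.
pr <- probe2WayMC(fit, nameX = c("X", "Z", "XZ"), nameY = "Y",
                  modVar = "Z", valProbe = c(-1, 0, 1))
plotProbe(pr, xlim = c(-3, 3))
```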

14 pages, 3345 KiB  
Article
Estimating Explanatory Extensions of Dichotomous and Polytomous Rasch Models: The eirm Package in R
by Okan Bulut, Guher Gorgun and Seyma Nur Yildirim-Erbasli
Psych 2021, 3(3), 308-321; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030023 - 29 Jul 2021
Cited by 12 | Viewed by 3882
Abstract
Explanatory item response modeling (EIRM) enables researchers and practitioners to incorporate item and person properties into item response theory (IRT) models. Unlike traditional IRT models, explanatory IRT models can explain common variability stemming from the shared variance among item clusters and person groups. In this tutorial, we present the R package eirm, which provides a simple and easy-to-use set of tools for preparing data, estimating explanatory IRT models based on the Rasch family, extracting model output, and visualizing model results. We describe how functions in the eirm package can be used for estimating traditional IRT models (e.g., Rasch model, Partial Credit Model, and Rating Scale Model), item-explanatory models (i.e., Linear Logistic Test Model), and person-explanatory models (i.e., latent regression models) for both dichotomous and polytomous responses. In addition to demonstrating the general functionality of the eirm package, we also provide real-data examples with annotated R codes based on the Rosenberg Self-Esteem Scale.
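
A brief sketch of the package's formula interface, using the verbal aggression data from lme4; the exact set of item properties is illustrative:

```r
library(eirm)
data("VerbAgg", package = "lme4")

# Item-explanatory (LLTM-type) model: item properties replace item dummies.
mod <- eirm(formula = "r2 ~ -1 + btype + situ + mode + (1|id)", data = VerbAgg)
print(mod)   # easiness parameters for the item properties
```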

16 pages, 1013 KiB  
Article
RALSA: Design and Implementation
by Plamen Vladkov Mirazchiyski
Psych 2021, 3(2), 233-248; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3020018 - 12 Jun 2021
Cited by 1 | Viewed by 3919
Abstract
International large-scale assessments (ILSAs) provide invaluable information for researchers and policy makers. Analysis of their data, however, requires methods that go beyond the usual analysis techniques assuming simple random sampling. Several software packages that serve this purpose are available. One such package is the newly developed R Analyzer for Large-Scale Assessments (RALSA). The package can work with data from a large number of ILSAs. It was designed with user experience in mind and is suitable for analysts who lack technical expertise and/or familiarity with the R programming language and statistical software. This paper presents the technical aspects of RALSA: the overall design and structure of the package, its internal organization, and the structure of the analysis and data preparation functions. The use of the data.table package for memory efficiency, speed, and embedded computations is explained through examples. The central aspect of the paper is the utilization of code reuse practices to achieve consistency, efficiency, and safety of the computations performed by the analysis functions of the package. The comprehensive output system that produces multi-sheet MS Excel workbooks is presented and its workflow explained. The paper also explains how the graphical user interface is constructed and how it is linked to the data preparation and analysis functions available in the package.

36 pages, 502 KiB  
Article
Evaluating the Observed Log-Likelihood Function in Two-Level Structural Equation Modeling with Missing Data: From Formulas to R Code
by Yves Rosseel
Psych 2021, 3(2), 197-232; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3020017 - 07 Jun 2021
Cited by 5 | Viewed by 3335
Abstract
This paper discusses maximum likelihood estimation for two-level structural equation models when data are missing at random at both levels. Building on existing literature, a computationally efficient expression is derived to evaluate the observed log-likelihood. Unlike previous work, the expression is valid for the special case where the model implied variance–covariance matrix at the between level is singular. Next, the log-likelihood function is translated to R code. A sequence of R scripts is presented, starting from a naive implementation and ending at the final implementation as found in the lavaan package. Along the way, various computational tips and tricks are given.
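
The single-level analogue of the paper's starting point is compact enough to show here: under multivariate normality, the observed log-likelihood sums each case's density over its observed variables only (mu, Sigma, and the data matrix Y are assumed given; the two-level case handled in the paper requires considerably more bookkeeping).

```r
library(mvtnorm)

loglik_case <- function(yi, mu, Sigma) {
  obs <- !is.na(yi)  # observed part of this case
  dmvnorm(yi[obs], mean = mu[obs],
          sigma = Sigma[obs, obs, drop = FALSE], log = TRUE)
}

# Y: n x p data matrix with values missing at random.
ll <- sum(apply(Y, 1, loglik_case, mu = mu, Sigma = Sigma))
```
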
19 pages, 645 KiB  
Article
Evaluating Cluster-Level Factor Models with lavaan and Mplus
by Suzanne Jak, Terrence D. Jorgensen and Yves Rosseel
Psych 2021, 3(2), 134-152; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3020012 - 31 May 2021
Cited by 9 | Viewed by 3944
Abstract
Background: Researchers frequently use the responses of individuals in clusters to measure cluster-level constructs. Examples are the use of student evaluations to measure teaching quality, or the use of employee ratings of organizational climate. In earlier research, Stapleton and Johnson (2019) provided advice for measuring cluster-level constructs based on a simulation study with inadvertently confounded design factors. We extended their simulation study using both Mplus and lavaan to reveal how their conclusions were dependent on their study conditions. Methods: We generated data sets from the so-called configural model and the simultaneous shared-and-configural model, both with and without nonzero residual variances at the cluster level. We fitted models to these data sets using different maximum likelihood estimation algorithms. Results: Stapleton and Johnson’s results were highly contingent on their confounded design factors. Convergence rates could be very different across algorithms, depending on whether between-level residual variances were zero in the population or in the fitted model. We discovered a worrying convergence issue with the default settings in Mplus, resulting in seemingly converged solutions that are actually not. Rejection rates of the normal-theory test statistic were as expected, while rejection rates of the scaled test statistic were seriously inflated in several conditions. Conclusions: The defaults in Mplus carry specific risks that are easily checked but not well advertised. Our results also shine a different light on earlier advice on the use of measurement models for shared factors.
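
For reference, a shared-and-configural measurement model of the kind studied can be specified in lavaan as follows (indicator and cluster names are illustrative); fixing the between-level residual variances to zero corresponds to one of the simulated conditions:

```r
library(lavaan)

model <- "
  level: 1
    fw =~ y1 + y2 + y3      # configural (within-cluster) factor
  level: 2
    fb =~ y1 + y2 + y3      # shared (between-cluster) factor
    y1 ~~ 0*y1              # between-level residual variances fixed to zero
    y2 ~~ 0*y2
    y3 ~~ 0*y3
"

fit <- sem(model, data = dat, cluster = "cluster_id")
summary(fit)
```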

21 pages, 611 KiB  
Article
How to Estimate Absolute-Error Components in Structural Equation Models of Generalizability Theory
by Terrence D. Jorgensen
Psych 2021, 3(2), 113-133; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3020011 - 29 May 2021
Cited by 10 | Viewed by 3073
Abstract
Structural equation modeling (SEM) has been proposed to estimate generalizability theory (GT) variance components, primarily focusing on estimating relative error to calculate generalizability coefficients. Proposals for estimating absolute-error components have given the impression that a separate SEM must be fitted to a transposed data matrix. This paper uses real and simulated data to demonstrate how a single SEM can be specified to estimate absolute error (and thus dependability) by placing appropriate constraints on the mean structure, as well as thresholds (when used for ordinal measures). Using the R packages lavaan and gtheory, different estimators are compared for normal and discrete measurements. Limitations of SEM for GT are demonstrated using multirater data from a planned missing-data design, and an important remaining area for future development is discussed.
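
To give a flavor of the approach, here is a one-facet (persons by raters) sketch in lavaan: unit loadings identify the person factor, equality constraints give the error component, and defined parameters convert the mean structure into a rater variance component and a dependability coefficient. This is an illustrative reduction of the article's models, not a reproduction of them.

```r
library(lavaan)

model <- "
  p =~ 1*r1 + 1*r2 + 1*r3        # person factor with unit loadings
  p ~~ vp*p                      # person variance component
  r1 ~~ ve*r1                    # equal residual (error) variances
  r2 ~~ ve*r2
  r3 ~~ ve*r3
  r1 ~ i1*1                      # rater intercepts (mean structure)
  r2 ~ i2*1
  r3 ~ i3*1
  m   := (i1 + i2 + i3) / 3
  vr  := ((i1 - m)^2 + (i2 - m)^2 + (i3 - m)^2) / 2   # rater component
  phi := vp / (vp + vr/3 + ve/3) # dependability for k = 3 raters
"

fit <- sem(model, data = ratings)  # ratings: wide data with columns r1-r3
summary(fit)
```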

17 pages, 412 KiB  
Article
Automated Test Assembly in R: The eatATA Package
by Benjamin Becker, Dries Debeer, Karoline A. Sachse and Sebastian Weirich
Psych 2021, 3(2), 96-112; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3020010 - 21 May 2021
Cited by 5 | Viewed by 4061
Abstract
Combining items from an item pool into test forms (test assembly) is a frequent task in psychological and educational testing. Although efficient methods for automated test assembly exist, these are often unknown or unavailable to practitioners. In this paper we present the R package eatATA, which allows using several mixed-integer programming solvers for automated test assembly in R. We describe the general functionality and the common workflow of eatATA using a minimal example. We also provide four more elaborate use cases of automated test assembly: (a) the assembly of multiple test forms for a pilot study; (b) the assembly of blocks of items for a multiple matrix booklet design in the context of a large-scale assessment; (c) the assembly of two linear test forms for individual diagnostic purposes; (d) the assembly of multi-stage testing modules for individual diagnostic purposes. All use cases are accompanied by example item pools and commented R code.

44 pages, 1809 KiB  
Article
Comparison of Recent Acceleration Techniques for the EM Algorithm in One- and Two-Parameter Logistic IRT Models
by Marie Beisemann, Ortrud Wartlick and Philipp Doebler
Psych 2020, 2(4), 209-252; https://0-doi-org.brum.beds.ac.uk/10.3390/psych2040018 - 10 Nov 2020
Cited by 3 | Viewed by 2394
Abstract
The expectation–maximization (EM) algorithm is an important numerical method for maximum likelihood estimation in incomplete data problems. However, convergence of the EM algorithm can be slow, and for this reason, many EM acceleration techniques have been proposed. After a review of acceleration techniques in a unified notation with illustrations, three recently proposed EM acceleration techniques are compared in detail: quasi-Newton methods (QN), “squared” iterative methods (SQUAREM), and parabolic EM (PEM). These acceleration techniques are applied to marginal maximum likelihood estimation with the EM algorithm in one- and two-parameter logistic item response theory (IRT) models for binary data, and their performance is compared. QN and SQUAREM methods accelerate convergence of the EM algorithm for the two-parameter logistic model significantly in high-dimensional data problems. Compared to the standard EM, all three methods reduce the number of iterations, but increase the number of total marginal log-likelihood evaluations per iteration. Efficient approximations of the marginal log-likelihood are hence an important part of implementation.
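
The SQUAREM scheme from the comparison needs nothing more than the EM update written as a fixed-point map. A toy illustration, accelerating EM for the mixing weight of a two-component normal mixture with known component densities:

```r
library(SQUAREM)

set.seed(1)
y <- c(rnorm(300, 0, 1), rnorm(700, 3, 1))  # 0.3/0.7 mixture

# One EM update for the mixing weight p of the N(0, 1) component.
em_step <- function(p) {
  post <- p * dnorm(y, 0, 1) /
    (p * dnorm(y, 0, 1) + (1 - p) * dnorm(y, 3, 1))
  mean(post)
}

acc <- squarem(par = 0.5, fixptfn = em_step)  # accelerated EM
acc$par      # estimated mixing weight
acc$fpevals  # number of EM-map evaluations needed
```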

Other

32 pages, 511 KiB  
Tutorial
Reproducible Research in R: A Tutorial on How to Do the Same Thing More Than Once
by Aaron Peikert, Caspar J. van Lissa and Andreas M. Brandmaier
Psych 2021, 3(4), 836-867; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040053 - 09 Dec 2021
Cited by 9 | Viewed by 5216
Abstract
Computational reproducibility is the ability to obtain identical results from the same data with the same computer code. It is a building block for transparent and cumulative science because it enables the originator and other researchers, on other computers and later in time, to reproduce and thus understand how results came about, while avoiding a variety of errors that may lead to erroneous reporting of statistical and computational results. In this tutorial, we demonstrate how the R package repro supports researchers in creating fully computationally reproducible research projects with tools from the software engineering community. Building upon this notion of fully automated reproducibility, we present several applications including the preregistration of research plans with code (Preregistration as Code, PAC). PAC eschews all ambiguity of traditional preregistration and offers several more advantages. Making technical advancements that serve reproducibility more widely accessible for researchers holds the potential to innovate the research process and to help it become more productive, credible, and reliable.

14 pages, 6930 KiB  
Tutorial
Tutorial on the Use of the regsem Package in R
by Xiaobei Li, Ross Jacobucci and Brooke A. Ammerman
Psych 2021, 3(4), 579-592; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3040038 - 05 Oct 2021
Cited by 7 | Viewed by 3190
Abstract
Sparse estimation through regularization is gaining popularity in psychological research. Such techniques penalize the complexity of the model and can perform variable/path selection automatically, and are thus particularly useful in models that have small parameter-to-sample-size ratios. This paper gives a detailed tutorial of the R package regsem, which implements regularization for structural equation models. Example R code is also provided to highlight the key arguments for implementing regularized structural equation models in this package. The tutorial ends by discussing remedies for some known drawbacks of a popular type of regularization, computational methods supported by the package that can improve the selection result, and some other practical issues such as dealing with missing data and categorical variables.
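
The usual entry point is cv_regsem(), which refits the model over a grid of penalty values and returns the parameter paths; model and dat stand for an assumed lavaan specification and data set:

```r
library(lavaan)
library(regsem)

fit_lav <- sem(model, data = dat)

# Lasso-penalized SEM across 30 penalty values, penalizing regression paths.
fit_cv <- cv_regsem(fit_lav, type = "lasso", pars_pen = "regressions",
                    n.lambda = 30, jump = 0.02)

plot(fit_cv)     # parameter trajectories across the penalty grid
summary(fit_cv)
```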

21 pages, 465 KiB  
Tutorial
Analysis of Categorical Data with the R Package confreq
by Jörg-Henrik Heine and Mark Stemmler
Psych 2021, 3(3), 522-541; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030034 - 07 Sep 2021
Cited by 2 | Viewed by 3116
Abstract
The person-centered approach in categorical data analysis is introduced as a complementary approach to the variable-centered approach. The former classifies persons, animals, or objects on the basis of their combinations of characteristics, which can be displayed in multiway contingency tables. Configural Frequency Analysis (CFA) and log-linear modeling (LLM) are the two most prominent (and related) statistical methods. Both compare observed frequencies (f_o) with expected frequencies (f_e). While LLM primarily uses a model-fitting approach, CFA analyzes residuals of non-fitting models. Residuals with significantly more observed than expected frequencies (f_o > f_e) are called types, while residuals with significantly fewer observed than expected frequencies (f_o < f_e) are called antitypes. The R package confreq is presented and its use is demonstrated with several data examples. Results of contingency table analyses can be displayed in tables but also in graphics representing the size and type of residual. The expected frequencies represent the null hypothesis, and different null hypotheses result in different expected frequencies. Different kinds of CFAs are presented: the first-order CFA based on the null hypothesis of independence, CFA with covariates, and the two-sample CFA. The calculation of the expected frequencies can be controlled through the design matrix, which can be easily handled in confreq.

32 pages, 1412 KiB  
Tutorial
Flexible Item Response Modeling in R with the flexmet Package
by Leah Feuerstahler
Psych 2021, 3(3), 447-478; https://0-doi-org.brum.beds.ac.uk/10.3390/psych3030031 - 16 Aug 2021
Cited by 6 | Viewed by 2040
Abstract
The filtered monotonic polynomial (FMP) model is a semi-parametric item response model that allows flexible response function shapes but also includes traditional item response models as special cases. The flexmet package for R facilitates the routine use of the FMP model in real data analysis and simulation studies. This tutorial provides several code examples illustrating how the flexmet package may be used to simulate FMP model parameters and data (for both dichotomously and polytomously scored items), estimate FMP model parameters, transform traditional item response models to different metrics, and more. This tutorial serves both as an introduction to the unique features of the FMP model and as a practical guide to its implementation in R via the flexmet package.
