Fit the Low-Rank Item Response Model Using the lmirt Package


1. Description

The lmirt package is a toolkit for fitting the multidimensional two-parameter logistic (M2PL) model under a conjugate prior. The model is subsumed under a second-order exponential (SOE) family admitting a low-rank parameter matrix. The package estimates the model parameters efficiently by capitalizing on the closed-form likelihood function within the SOE framework. Estimation is based on the pseudo-likelihood and implemented via the limited-memory BFGS algorithm. The package supports analysis of mid- to high-dimensional data in the presence of external covariates.

2. Method

Details of the model and inference schemes are given in our paper (Kang et al., 2017).

2.1 Model

Consider that a test measures \(K\) latent traits on a continuous scale. Let \(\boldsymbol{\theta} = (\theta_1, \, \ldots, \, \theta_K) \in \mathbb{R}^K\) denote the \(K\)-dimensional latent trait vector characterizing proficiency levels of a subject. The M2PL model defines the probability of a correct response to an item \(j\) as

\[ \text{Pr} (X_j = 1 \, | \, \boldsymbol{\theta} ) = \frac{ \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + b_j \right) }{ 1 + \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + b_j \right) }, \label{eq:mirf} \]

where \(X_j \in \{0, \, 1\}\) denotes the binary response score on item \(j\); \(\mathbf{a}_j = (a_{j1}, \, \ldots, \, a_{jK})^\top\) is the vector of item discrimination parameters; \(\top\) denotes vector or matrix transpose; and \(b_j\) is the scalar parameter relating to the difficulty of the item.
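As a minimal sketch of the item response function above, the following evaluates the M2PL probability for a single item. The function name and the numeric values are illustrative assumptions and are not part of the lmirt package.

```python
import math


def m2pl_probability(a_j, b_j, theta):
    """Pr(X_j = 1 | theta) under the M2PL model (illustrative helper)."""
    # Linear predictor a_j^T theta + b_j.
    eta = sum(a * t for a, t in zip(a_j, theta)) + b_j
    # exp(eta) / (1 + exp(eta)), written in the equivalent form 1 / (1 + exp(-eta)).
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical item with K = 2 discrimination parameters.
a_j = [1.2, 0.5]
b_j = -0.3
p_at_origin = m2pl_probability(a_j, b_j, [0.0, 0.0])
p_high_trait = m2pl_probability(a_j, b_j, [1.0, 1.0])
```

With positive discrimination parameters, raising any coordinate of \(\boldsymbol{\theta}\) raises the probability of a correct response.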

Let \(\mathbf{X}=(X_1, \, \ldots \, , \, X_J)^\top\) be a random vector of responses to \(J\) test items, and \(\mathbf{x} = (x_1, \, \ldots \, , \, x_J)^\top\) be a realization of \(\mathbf{X}\) for a given subject. The discrimination and difficulty parameters of the \(J\) items are accordingly written as a matrix and a vector: \(\mathbf{A}=(\mathbf{a}_1, \, \ldots \, , \, \mathbf{a}_J)^\top\) and \(\mathbf{b} = (b_1, \, \ldots \, , \, b_J)^\top\). Since the latent variable \(\boldsymbol \theta\) is unobservable, statistical inference about the item parameters is made via a marginal distribution

\[ f ( \mathbf{X} = \mathbf{x} \, | \, \mathbf{A}, \, \mathbf{b} ) = \int f (\mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b} ) \, d \boldsymbol{\theta}. \]

We consider the joint distribution of \((\mathbf X, \, \boldsymbol{\theta})\) that conceives a special prior such that it becomes an SOE family:

\[ f ( \mathbf{X} = \mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b} ) \propto \exp \bigg( - \frac{1}{2} || \boldsymbol{\theta} ||^2 + \mathbf{x}^\top \mathbf{A} \boldsymbol{\theta} + \mathbf{x}^\top \mathbf{b} \bigg), \]

where \(|| \cdot ||\) is the usual Euclidean norm on \(\mathbb{R}^K\). The marginal distribution of \(\mathbf x\) then has the closed form

\[ f (\mathbf{X} = \mathbf{x} \, | \, \mathbf{A}, \, \mathbf{b} ) = \int f ( \mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b} ) \, d \boldsymbol{\theta} = \frac{(2 \pi)^{K/2}}{z (\mathbf{A}, \, \mathbf{b})} \exp \left( \frac{1}{2} \mathbf{x}^\top \mathbf{AA}^\top \mathbf{x} + \mathbf{x}^\top \mathbf{b} \right), \]

where \(z (\mathbf{A}, \, \mathbf{b})\) is a normalizing constant obtained as

\[ z (\mathbf{A}, \, \mathbf{b}) = \sum_{\mathbf{x} \in \{0, \, 1\}^{J}} \int \exp \bigg( - \frac{1}{2} || \boldsymbol{\theta} ||^2 + \mathbf{x}^\top \mathbf{A} \boldsymbol{\theta} + \mathbf{x}^\top \mathbf{b} \bigg) \, d \boldsymbol{\theta}. \]
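The closed form can be checked numerically on a toy example. The sketch below integrates \(\boldsymbol{\theta}\) out analytically, so each summand of \(z\) equals \((2\pi)^{K/2} \exp(\frac{1}{2} \mathbf{x}^\top \mathbf{AA}^\top \mathbf{x} + \mathbf{x}^\top \mathbf{b})\), and then enumerates all \(2^J\) response patterns. The function names and parameter values are assumptions made for illustration; the enumeration is feasible only for small \(J\).

```python
import itertools
import math


def exponent(x, A, b):
    """0.5 * x^T A A^T x + x^T b for a binary response pattern x."""
    K = len(A[0])
    # A^T x is a K-vector; x^T A A^T x is its squared Euclidean norm.
    At_x = [sum(A[j][k] * x[j] for j in range(len(x))) for k in range(K)]
    quad = sum(v * v for v in At_x)
    lin = sum(xj * bj for xj, bj in zip(x, b))
    return 0.5 * quad + lin


def normalizing_constant(A, b):
    """z(A, b): sum over all 2^J patterns, with theta integrated out analytically."""
    J, K = len(A), len(A[0])
    patterns = itertools.product((0, 1), repeat=J)
    return (2 * math.pi) ** (K / 2) * sum(math.exp(exponent(x, A, b)) for x in patterns)


def marginal_pmf(x, A, b):
    """Closed-form marginal probability of the response pattern x."""
    K = len(A[0])
    return (2 * math.pi) ** (K / 2) * math.exp(exponent(x, A, b)) / normalizing_constant(A, b)


# Hypothetical parameters for J = 3 items and K = 2 traits.
A = [[0.8, 0.0], [0.5, 0.3], [0.0, 0.9]]
b = [-0.2, 0.1, 0.4]
total = sum(marginal_pmf(x, A, b) for x in itertools.product((0, 1), repeat=3))
```

The probabilities sum to one up to floating-point error. The cost of computing \(z\) doubles with every additional item.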

2.2 eMIRT

We employ the pseudo-likelihood as the basis of inference. Let \(\mathbf{X}_{-j} = (X_1, \, \ldots \, , \, X_{j-1}, \, X_{j+1}, \, \ldots \, , \, X_J)\) denote the vector of response variables excluding \(X_j\), and let \(\mathbf{x}_{-j}\) be a realization of \(\mathbf{X}_{-j}\). The pseudo-likelihood for all observations is given by

\[ \mathcal{L} ( \mathbf{A}, \, \mathbf{b} ) = \prod_{i=1}^N \prod_{j=1}^J \mathcal{L}_j (\mathbf{A}, \, \mathbf{b}; \, \mathbf{x}_i), \]

where \(\mathbf{x}_i\) is the vector of response scores for individual \(i\) (\(i=1, \, \ldots, \, N\)), and

\[ \mathcal{L}_j (\mathbf{A}, \, \mathbf{b}; \, \mathbf{x}) = f (X_j = x_j \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b} ). \]

The conditional distribution of \(X_j\) given \(\mathbf{X}_{-j} = \mathbf{x}_{-j}\) (i.e., \(f (X_j \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b} )\)) is Bernoulli with success probability

\[ \text{Pr} ( X_j = 1 \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b} ) = \frac{ \displaystyle \exp \left( \frac{1}{2} \mathbf{a}_j^\top \mathbf{a}_j + b_j + \sum_{j' \ne j} \mathbf{a}_j^\top \mathbf{a}_{j'} x_{j'} \right) }{ \displaystyle 1 + \exp \left( \frac{1}{2} \mathbf{a}_j^\top \mathbf{a}_j + b_j + \sum_{j' \ne j} \mathbf{a}_j^\top \mathbf{a}_{j'} x_{j'} \right) }. \]
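To make the pseudo-likelihood concrete, the sketch below evaluates the conditional success probability and sums the Bernoulli log-likelihood contributions over persons and items. It is a plain-Python illustration under assumed names and toy data, not the package's implementation. In practice this function would be maximized with a numerical optimizer such as L-BFGS.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def conditional_probability(j, x, A, b):
    """Pr(X_j = 1 | X_{-j} = x_{-j}, A, b); the value of x[j] itself is ignored."""
    a_j = A[j]
    eta = 0.5 * dot(a_j, a_j) + b[j] + sum(
        dot(a_j, A[jp]) * x[jp] for jp in range(len(x)) if jp != j
    )
    return 1.0 / (1.0 + math.exp(-eta))


def log_pseudo_likelihood(X, A, b):
    """Sum over persons i and items j of log L_j(A, b; x_i)."""
    total = 0.0
    for x in X:
        for j in range(len(x)):
            p = conditional_probability(j, x, A, b)
            total += math.log(p) if x[j] == 1 else math.log(1.0 - p)
    return total


# Hypothetical parameters (J = 3, K = 2) and responses of N = 3 persons.
A = [[0.8, 0.0], [0.5, 0.3], [0.0, 0.9]]
b = [-0.2, 0.1, 0.4]
X = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
value = log_pseudo_likelihood(X, A, b)
```

Each evaluation costs on the order of \(N J^2 K\) operations in this naive form. Unlike the full likelihood, it does not require the normalizing constant.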

The pseudo-likelihood estimator of the item parameters is then obtained as

\[ (\hat{\mathbf{A}}, \, \hat{\mathbf{b}}) = \underset{ (\mathbf{A}, \, \mathbf{b}) }{ \textrm{arg max} } \; \log \mathcal{L} (\mathbf{A}, \, \mathbf{b}) \label{eq:opt_obj} \]

subject to positive semidefinite \(\mathbf{AA}^\top\). Once the item parameters are estimated with enough precision, person parameters can be calculated as a weighted sum of the responses. Specifically, the posterior distribution of \(\boldsymbol{\theta}\) given \(\mathbf{x}\) is obtained as the multivariate normal distribution with mean

\[ E \big[ \boldsymbol{\theta} \, | \, \mathbf{X} = \mathbf{x} \big] = \mathbf{A}^\top \mathbf{x} \]

and identity covariance matrix.
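A short sketch of this scoring rule: given estimated discrimination parameters, the posterior mean of \(\boldsymbol{\theta}\) is the sum of the rows of \(\mathbf{A}\) that correspond to correctly answered items. The function name and the values of \(\mathbf{A}\) and \(\mathbf{x}\) are illustrative assumptions.

```python
def posterior_mean(x, A):
    """E[theta | X = x] = A^T x, returned as a list of length K."""
    K = len(A[0])
    return [sum(A[j][k] * x[j] for j in range(len(x))) for k in range(K)]


# Hypothetical estimates for J = 3 items and K = 2 traits.
A_hat = [[0.8, 0.0], [0.5, 0.3], [0.0, 0.9]]
scores = posterior_mean([1, 0, 1], A_hat)
```

Here the first and third items are answered correctly, so the score vector is the sum of the first and third rows of the estimated \(\mathbf{A}\).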

2.3 cMIRT

The lmirt package supports confirmatory analysis, in which restrictions are placed on the model parameters a priori: for example, the dimension of \(\boldsymbol{\theta}\), the relations between the latent factors and item responses, or the correlations among the latent factors. Let \(\mathbf \Sigma\) denote the pre-identified covariance matrix of the latent factors, and let the factor loading structure of the items be given by a \(\mathbf Q\)-matrix whose entry \(q_{jk}\) (\(j = 1, \, \ldots, \, J\); \(k = 1, \, \ldots, \, K\)) indicates whether or not item \(j\) is designed to measure the \(k\)th latent factor. The marginal likelihood of the item parameters is given by

\[ f (\mathbf{X} = \mathbf{x} \, | \, \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}) = \int f ( \mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ) \, d \boldsymbol{\theta} = \frac{ (2 \pi)^{K/2} | \mathbf{\Sigma}|^{1/2} }{ z(\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}) } \exp \left( \frac{1}{2} \mathbf{x}^\top \mathbf{A} \mathbf{\Sigma} \mathbf{A}^\top \mathbf{x} + \mathbf{x}^\top \mathbf{b} \right), \]

where \(a_{jk} = 0\) if \(q_{jk} = 0\).

As in the exploratory analysis, inference about the model parameters is based on the pseudo-likelihood function:

\[ \mathcal{L} ( \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ) = \prod_{i=1}^N \prod_{j=1}^J \mathcal{L}_j (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}; \, \mathbf{x}_i), \]

where

\[ \mathcal{L}_j (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}; \, \mathbf{x}) = f (X_j = x_j \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ). \]
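For reference, the conditional success probability can be worked out from the marginal likelihood above by the same argument as in the exploratory case. The expression below is derived here under the assumption that \(\mathbf{\Sigma}\) is symmetric. It is obtained by comparing the exponent at \(x_j = 1\) and \(x_j = 0\), and is not quoted from the paper:

\[ \text{Pr} ( X_j = 1 \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ) = \frac{ \displaystyle \exp \left( \frac{1}{2} \mathbf{a}_j^\top \mathbf{\Sigma} \mathbf{a}_j + b_j + \sum_{j' \ne j} \mathbf{a}_j^\top \mathbf{\Sigma} \mathbf{a}_{j'} x_{j'} \right) }{ \displaystyle 1 + \exp \left( \frac{1}{2} \mathbf{a}_j^\top \mathbf{\Sigma} \mathbf{a}_j + b_j + \sum_{j' \ne j} \mathbf{a}_j^\top \mathbf{\Sigma} \mathbf{a}_{j'} x_{j'} \right) }. \]

Setting \(\mathbf{\Sigma}\) to the identity matrix recovers the exploratory expression.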

The estimator is then obtained as

\[ (\hat{\mathbf{A}}, \, \hat{\mathbf{b}}, \, \hat{\mathbf{\Sigma}}) = \underset{ (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}) }{ \textrm{arg max} } \; \log \mathcal{L} (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ) \]

subject to \(a_{jk} = 0\) if \(q_{jk}=0\) and positive semidefinite \(\mathbf{\Sigma}\). The posterior distribution of \(\boldsymbol \theta\) given \(\mathbf{X} = \mathbf{x}\) follows the multivariate normal distribution with

\[ E \big[ \boldsymbol{\theta} \, | \, \mathbf{X} = \mathbf{x} \big] = \mathbf{\Sigma} \mathbf{A}^\top \mathbf{x} \quad \text{and} \quad \text{Var} \big[ \boldsymbol{\theta} \, | \, \mathbf{X} = \mathbf{x} \big] = \mathbf{\Sigma} . \]
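The two confirmatory ingredients, the zero pattern imposed by the \(\mathbf{Q}\)-matrix and the posterior mean \(\mathbf{\Sigma} \mathbf{A}^\top \mathbf{x}\), can be sketched as follows. All names and numbers are hypothetical. Masking an unconstrained matrix is only one simple way to represent the constraint and is not necessarily how the package enforces it.

```python
def apply_q_matrix(A, Q):
    """Set a_jk to 0 wherever q_jk is 0; other entries are left unchanged."""
    return [[a if q else 0.0 for a, q in zip(row_a, row_q)] for row_a, row_q in zip(A, Q)]


def posterior_mean(x, A, Sigma):
    """E[theta | X = x] = Sigma A^T x, returned as a list of length K."""
    K = len(Sigma)
    At_x = [sum(A[j][k] * x[j] for j in range(len(x))) for k in range(K)]
    return [sum(Sigma[k][l] * At_x[l] for l in range(K)) for k in range(K)]


# Hypothetical unconstrained loadings, Q-matrix, and factor covariance (J = 3, K = 2).
A_free = [[0.8, 0.4], [0.5, 0.3], [0.2, 0.9]]
Q = [[1, 0], [1, 1], [0, 1]]
Sigma = [[1.0, 0.5], [0.5, 1.0]]

A = apply_q_matrix(A_free, Q)
mean = posterior_mean([1, 0, 1], A, Sigma)
```

Because the factors are correlated here, a correct response to an item that loads only on the first factor also shifts the posterior mean of the second factor.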

2.4 cMIRT_cov

The pseudo-likelihood estimator can be extended to fit mixed-effects M2PL models. The mixed-effects M2PL model in the presence of observable covariates is written as

\[ \text{Pr} (X_j=1 \, | \, \boldsymbol{\theta}, \, \mathbf{z}) = \frac{ \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + \boldsymbol{\gamma}_j^\top \mathbf{z} \right) }{ 1 + \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + \boldsymbol{\gamma}_j^\top \mathbf{z} \right) }, \]

where \(\mathbf{z}\) is an observed \(P\)-dimensional covariate vector and \(\boldsymbol{\gamma}_j\) is the \(P\)-dimensional vector of fixed-effect parameters (including the item difficulty parameter). In the presence of external covariates, the optimization problem can be written in the same way as for the estimators above, but as a function of \(\mathbf{A}\) and \((\boldsymbol{\gamma}_1, \, \ldots, \, \boldsymbol{\gamma}_J)\), which take the place of \(\mathbf{b}\).
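A minimal sketch of the mixed-effects response function follows. It assumes that the first entry of \(\mathbf{z}\) is a constant 1, so that the first entry of \(\boldsymbol{\gamma}_j\) plays the role of the difficulty parameter \(b_j\). That coding convention, the function name, and the numbers are assumptions made for illustration.

```python
import math


def m2pl_cov_probability(a_j, gamma_j, theta, z):
    """Pr(X_j = 1 | theta, z) under the mixed-effects M2PL model."""
    # Linear predictor a_j^T theta + gamma_j^T z.
    eta = sum(a * t for a, t in zip(a_j, theta)) + sum(g * c for g, c in zip(gamma_j, z))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical item with K = 2 traits and P = 2 covariates (intercept plus one covariate).
a_j = [1.2, 0.5]
gamma_j = [-0.3, 0.6]
z = [1.0, 0.5]  # assumed convention: z[0] = 1 so that gamma_j[0] acts as b_j
p = m2pl_cov_probability(a_j, gamma_j, [0.0, 0.0], z)
```

In this example the covariate effect exactly offsets the intercept, so the probability at \(\boldsymbol{\theta} = \mathbf{0}\) is one half.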

References

Chen, Y., Li, X., Liu, J., & Ying, Z. (2016). A Fused Latent and Graphical Model for Multivariate Binary Data. arXiv preprint arXiv:1606.08925.

Sun, J., Chen, Y., Liu, J., Ying, Z., & Xin, T. (2016). Latent Variable Selection for Multidimensional Item Response Theory Models via L1 Regularization. Psychometrika, 81(4), 921-939.

Kang, H., Zhang, S., Chen, Y., Liu, J., & Ying, Z. (2017). Multidimensional Latent Trait Models via Low-rank Second-order Exponential Family.