# Fitting the Low-Rank Item Response Model with the lmirt Package

## 1. Description

The lmirt package is a toolkit for fitting the multidimensional two-parameter logistic (M2PL) model under a conjugate prior. The model is subsumed under a second-order exponential (SOE) family admitting a low-rank parameter matrix. The package estimates the model parameters efficiently by capitalizing on the closed-form likelihood function within the SOE framework. Estimation is based on the pseudo-likelihood and implemented via the limited-memory BFGS (L-BFGS) algorithm. The package supports analysis of mid- to high-dimensional data in the presence of external covariates.

## 2. Method

Details of the model and inference schemes are given in our paper.

### 2.1 Model

Consider that a test measures $$K$$ latent traits on a continuous scale. Let $$\boldsymbol{\theta} = (\theta_1, \, \ldots, \, \theta_K) \in \mathbb{R}^K$$ denote the $$K$$-dimensional latent trait vector characterizing proficiency levels of a subject. The M2PL model defines the probability of a correct response to an item $$j$$ as

$\text{Pr} (X_j = 1 \, | \, \boldsymbol{\theta} ) = \frac{ \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + b_j \right) }{ 1 + \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + b_j \right) },$

where $$X_j \in \{0, \, 1\}$$ denotes the binary response score on item $$j$$; $$\mathbf{a}_j = (a_{j1}, \, \ldots, \, a_{jK})^\top$$ is the vector of item discrimination parameters; $$\top$$ denotes vector or matrix transpose; and $$b_j$$ is the scalar parameter relating to the difficulty of the item.
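The item response function above is a plain logistic function of the linear predictor $$\mathbf{a}_j^\top \boldsymbol{\theta} + b_j$$. A minimal sketch in Python (illustrative values only; this is not the lmirt API):

```python
import numpy as np

def m2pl_prob(a_j, b_j, theta):
    """P(X_j = 1 | theta) under the M2PL model: logistic of a_j' theta + b_j."""
    eta = np.dot(a_j, theta) + b_j
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical two-dimensional example
a_j = np.array([1.2, 0.5])    # discrimination parameters for item j
b_j = -0.3                    # intercept (difficulty-related) parameter
theta = np.array([0.8, -0.4])  # latent trait vector of one subject
p = m2pl_prob(a_j, b_j, theta)  # logistic of 1.2*0.8 + 0.5*(-0.4) - 0.3 = 0.46
```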

Let $$\mathbf{X}=(X_1, \, \ldots \, , \, X_J)^\top$$ be a random vector of responses to $$J$$ test items, and $$\mathbf{x} = (x_1, \, \ldots \, , \, x_J)^\top$$ be a realization of $$\mathbf{X}$$ for a given subject. The discrimination and difficulty parameters of the $$J$$ items are accordingly written as a matrix and a vector: $$\mathbf{A}=(\mathbf{a}_1, \, \ldots \, , \, \mathbf{a}_J)^\top$$ and $$\mathbf{b} = (b_1, \, \ldots \, , \, b_J)^\top$$. Since the latent variable $$\boldsymbol \theta$$ is unobservable, statistical inference about the item parameters is made via a marginal distribution

$f ( \mathbf{X} = \mathbf{x} \, | \, \mathbf{A}, \, \mathbf{b} ) = \int f (\mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b} ) \, d \boldsymbol{\theta}.$

We consider the joint distribution of $$(\mathbf X, \, \boldsymbol{\theta})$$ that conceives a special prior such that it becomes an SOE family:

$f ( \mathbf{X} = \mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b} ) \propto \exp \bigg( - \frac{1}{2} || \boldsymbol{\theta} ||^2 + \mathbf{x}^\top \mathbf{A} \boldsymbol{\theta} + \mathbf{x}^\top \mathbf{b} \bigg),$

where $$|| \cdot ||$$ is the usual Euclidean norm on $$\mathbb{R}^K$$. The marginal distribution of $$\mathbf x$$ then has the closed form

$f (\mathbf{X} = \mathbf{x} \, | \, \mathbf{A}, \, \mathbf{b} ) = \int f ( \mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b} ) \, d \boldsymbol{\theta} = \frac{(2 \pi)^{K/2}}{z (\mathbf{A}, \, \mathbf{b})} \exp \left( \frac{1}{2} \mathbf{x}^\top \mathbf{AA}^\top \mathbf{x} + \mathbf{x}^\top \mathbf{b} \right),$

where $$z (\mathbf{A}, \, \mathbf{b})$$ is a normalizing constant obtained as

$z (\mathbf{A}, \, \mathbf{b}) = \sum_{\mathbf{x} \in \{0, \, 1\}^{J}} \int \exp \bigg( - \frac{1}{2} || \boldsymbol{\theta} ||^2 + \mathbf{x}^\top \mathbf{A} \boldsymbol{\theta} + \mathbf{x}^\top \mathbf{b} \bigg) \, d \boldsymbol{\theta}.$
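Because the Gaussian integral over $$\boldsymbol{\theta}$$ contributes the factor $$(2\pi)^{K/2} \exp(\frac{1}{2} \mathbf{x}^\top \mathbf{AA}^\top \mathbf{x})$$, the marginal can be verified numerically for a small test by enumerating all $$2^J$$ response patterns. A sketch with hypothetical parameter values (not the package's implementation):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
J, K = 5, 2
A = rng.normal(scale=0.4, size=(J, K))  # illustrative discrimination matrix
b = rng.normal(scale=0.5, size=J)       # illustrative intercepts

def unnorm(x):
    """Unnormalized marginal mass: exp(0.5 x'AA'x + x'b)."""
    return np.exp(0.5 * x @ A @ A.T @ x + x @ b)

# z(A, b) = (2 pi)^{K/2} * sum over all x in {0,1}^J of exp(0.5 x'AA'x + x'b)
total = sum(unnorm(np.array(x)) for x in itertools.product([0, 1], repeat=J))
z = (2 * np.pi) ** (K / 2) * total

# The closed-form marginal probabilities then sum to one over {0,1}^J
probs = [(2 * np.pi) ** (K / 2) / z * unnorm(np.array(x))
         for x in itertools.product([0, 1], repeat=J)]
```

This brute-force enumeration is only feasible for small $$J$$; the package avoids $$z(\mathbf{A}, \mathbf{b})$$ altogether by working with the pseudo-likelihood described next.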

### 2.2 eMIRT

We employ the pseudo-likelihood as the basis of inference. Let $$\mathbf{X}_{-j} = (X_1, \, \ldots \, , \, X_{j-1}, \, X_{j+1}, \, \ldots \, , \, X_J)$$ denote the vector of response variables excluding $$X_j$$, and $$\mathbf{x}_{-j}$$ be a realization of $$\mathbf{X}_{-j}$$. The pseudo-likelihood for all observations is given by

$\mathcal{L} ( \mathbf{A}, \, \mathbf{b} ) = \prod_{i=1}^N \prod_{j=1}^J \mathcal{L}_j (\mathbf{A}, \, \mathbf{b}; \, \mathbf{x}_i),$

where $$\mathbf{x}_i$$ is the vector of response scores for individual $$i$$ ($$i=1, \, \ldots, \, N$$), and

$\mathcal{L}_j (\mathbf{A}, \, \mathbf{b}; \, \mathbf{x}) = f (X_j = x_j \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b} ).$

The conditional likelihood of $$X_j$$ given $$\mathbf{x}_{-j}$$ (i.e., $$f (X_j \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b} )$$) follows the Bernoulli distribution with success probability

$\text{Pr} ( X_j = 1 \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b} ) = \frac{ \displaystyle \exp \left( \frac{1}{2} \mathbf{a}_j^\top \mathbf{a}_j + b_j + \sum_{j' \ne j} \mathbf{a}_j^\top \mathbf{a}_{j'} x_{j'} \right) }{ \displaystyle 1 + \exp \left( \frac{1}{2} \mathbf{a}_j^\top \mathbf{a}_j + b_j + \sum_{j' \ne j} \mathbf{a}_j^\top \mathbf{a}_{j'} x_{j'} \right) }.$
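Every inner product $$\mathbf{a}_j^\top \mathbf{a}_{j'}$$ above is an entry of the Gram matrix $$\mathbf{AA}^\top$$, so the conditional probability can be computed with one matrix product. A hedged sketch (illustrative, not the lmirt internals):

```python
import numpy as np

def conditional_prob(j, x, A, b):
    """P(X_j = 1 | X_{-j} = x_{-j}, A, b) from the pseudo-likelihood."""
    G = A @ A.T                   # Gram matrix: G[j, j'] = a_j' a_{j'}
    # 0.5 a_j'a_j + b_j + sum over j' != j of a_j'a_{j'} x_{j'}
    eta = 0.5 * G[j, j] + b[j] + G[j] @ x - G[j, j] * x[j]
    return 1.0 / (1.0 + np.exp(-eta))

rng = np.random.default_rng(0)
A = rng.normal(scale=0.5, size=(4, 2))  # hypothetical parameters
b = rng.normal(size=4)
x = np.array([1.0, 0.0, 1.0, 1.0])
p1 = conditional_prob(1, x, A, b)
```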

The pseudo-likelihood estimator of the item parameters is then obtained as

$(\hat{\mathbf{A}}, \, \hat{\mathbf{b}}) = \underset{ (\mathbf{A}, \, \mathbf{b}) }{ \textrm{arg max} } \; \log \mathcal{L} (\mathbf{A}, \, \mathbf{b})$

subject to positive semidefinite $$\mathbf{AA}^\top$$. Once the item parameters are estimated with enough precision, person parameters can be calculated as a weighted sum of the responses. Specifically, the posterior distribution of $$\boldsymbol{\theta}$$ given $$\mathbf{x}$$ is obtained as the multivariate normal distribution with mean

$E \big[ \boldsymbol{\theta} \, | \, \mathbf{X} = \mathbf{x} \big] = \mathbf{A}^\top \mathbf{x}$

and identity covariance matrix.
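The maximization can be sketched with a generic L-BFGS routine. The following is an illustrative reimplementation under assumed data, not the lmirt interface; note that parameterizing by $$\mathbf{A}$$ directly makes $$\mathbf{AA}^\top$$ positive semidefinite by construction.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_pl(params, X, J, K):
    """Negative log pseudo-likelihood summed over N persons and J items."""
    A = params[:J * K].reshape(J, K)
    b = params[J * K:]
    G = A @ A.T
    # eta[i, j] = 0.5 G_jj + b_j + sum_{j' != j} G_jj' x_ij'
    eta = 0.5 * np.diag(G) + b + X @ G.T - X * np.diag(G)
    # Bernoulli negative log-likelihood, numerically stable via logaddexp
    return np.sum(np.logaddexp(0.0, eta) - X * eta)

rng = np.random.default_rng(1)
N, J, K = 200, 6, 2
X = rng.integers(0, 2, size=(N, J)).astype(float)  # placeholder responses

x0 = 0.1 * rng.standard_normal(J * K + J)
res = minimize(neg_log_pl, x0, args=(X, J, K), method="L-BFGS-B")
A_hat = res.x[:J * K].reshape(J, K)
b_hat = res.x[J * K:]
theta_hat = X @ A_hat  # posterior means E[theta | x] = A' x, one row per person
```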

### 2.3 cMIRT

The lmirt package supports confirmatory analysis in which restrictions are placed on the model parameters a priori, for example, the dimension of $$\boldsymbol{\theta}$$, the relations between the latent factors and item responses, and the correlations among the latent factors. Denote the pre-specified covariance matrix by $$\mathbf{\Sigma}$$ and the factor loading structure of the items by the $$\mathbf{Q}$$-matrix, whose entry $$q_{jk}$$ ($$j = 1, \, \ldots, \, J$$; $$k = 1, \, \ldots, \, K$$) indicates whether or not item $$j$$ is designed to measure the $$k$$th latent factor. The marginal likelihood of the item parameters is given by

$f (\mathbf{X} = \mathbf{x} \, | \, \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}) = \int f ( \mathbf{x}, \, \boldsymbol{\theta} \, | \, \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ) \, d \boldsymbol{\theta} = \frac{ (2 \pi)^{K/2} | \mathbf{\Sigma}|^{1/2} }{ z(\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}) } \exp \left( \frac{1}{2} \mathbf{x}^\top \mathbf{A} \mathbf{\Sigma} \mathbf{A}^\top \mathbf{x} + \mathbf{x}^\top \mathbf{b} \right),$

where $$a_{jk} = 0$$ if $$q_{jk} = 0$$.

As in the exploratory analysis, inference about the model parameters is based on the pseudo-likelihood function:

$\mathcal{L} ( \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ) = \prod_{i=1}^N \prod_{j=1}^J \mathcal{L}_j (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}; \, \mathbf{x}_i),$

where

$\mathcal{L}_j (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}; \, \mathbf{x}) = f (X_j = x_j \, | \, \mathbf{X}_{-j} = \mathbf{x}_{-j}, \, \mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} ).$

The estimator is then obtained as

$(\hat{\mathbf{A}}, \, \hat{\mathbf{b}}, \, \hat{\mathbf{\Sigma}}) = \underset{ (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma}) }{ \textrm{arg max} } \; \log \mathcal{L} (\mathbf{A}, \, \mathbf{b}, \, \mathbf{\Sigma} )$

subject to $$a_{jk} = 0$$ if $$q_{jk}=0$$ and positive semidefinite $$\mathbf{\Sigma}$$. The posterior distribution of $$\boldsymbol \theta$$ given $$\mathbf{X} = \mathbf{x}$$ follows the multivariate normal distribution with

$E \big[ \boldsymbol{\theta} \, | \, \mathbf{X} = \mathbf{x} \big] = \mathbf{\Sigma} \mathbf{A}^\top \mathbf{x} \quad \text{and} \quad \text{Var} \big[ \boldsymbol{\theta} \, | \, \mathbf{X} = \mathbf{x} \big] = \mathbf{\Sigma} .$
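A small sketch of the confirmatory setup with a hypothetical $$\mathbf{Q}$$-matrix and pre-specified $$\mathbf{\Sigma}$$ (illustrative values; not the lmirt API):

```python
import numpy as np

# Hypothetical Q-matrix: items 1-3 load on factor 1, items 4-6 on factor 2
Q = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
A = Q * np.array([[1.1, 0.0], [0.9, 0.0], [1.3, 0.0],
                  [0.0, 0.8], [0.0, 1.2], [0.0, 1.0]])  # a_jk = 0 wherever q_jk = 0
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.0]])   # pre-specified factor covariance matrix

x = np.array([1.0, 0.0, 1.0, 1.0, 1.0, 0.0])  # one subject's responses
post_mean = Sigma @ A.T @ x      # E[theta | X = x] = Sigma A' x
post_cov = Sigma                 # Var[theta | X = x] = Sigma
```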

### 2.4 cMIRT_cov

The pseudo-likelihood estimator can be extended to fit mixed-effects M2PL models. The mixed-effects M2PL model in the presence of observable covariates is written as

$\text{Pr} (X_j=1 \, | \, \boldsymbol{\theta}, \, \mathbf{z}) = \frac{ \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + \boldsymbol{\gamma}_j^\top \mathbf{z} \right) }{ 1 + \exp \left( \mathbf{a}_j^\top \boldsymbol{\theta} + \boldsymbol{\gamma}_j^\top \mathbf{z} \right) },$

where $$\mathbf{z}$$ is an observed $$P$$-dimensional covariate vector and $$\boldsymbol{\gamma}_j$$ is the $$P$$-dimensional parameter vector of the fixed effects (including the item difficulty parameter). The optimization problem in the presence of external covariates can be written in the same manner as the above estimators, but as a function of $$\mathbf{A}$$ and $$(\boldsymbol{\gamma}_1, \, \ldots, \, \boldsymbol{\gamma}_J)$$ in place of $$\mathbf{b}$$.
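Since the intercept is folded into the fixed effects, a covariate vector whose first entry is 1 recovers the plain M2PL model as a special case. A minimal illustration (hypothetical values, not the package interface):

```python
import numpy as np

def m2pl_cov_prob(a_j, gamma_j, theta, z):
    """P(X_j = 1 | theta, z): logistic of a_j' theta + gamma_j' z."""
    eta = a_j @ theta + gamma_j @ z
    return 1.0 / (1.0 + np.exp(-eta))

a_j = np.array([1.2, 0.5])
gamma_j = np.array([-0.3, 0.2])  # first entry plays the role of b_j
z = np.array([1.0, 0.5])         # leading 1 absorbs the item intercept
theta = np.array([0.8, -0.4])
p = m2pl_cov_prob(a_j, gamma_j, theta, z)  # logistic of 0.76 - 0.2 = 0.56
```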

## References

Chen, Y., Li, X., Liu, J., & Ying, Z. (2016). A Fused Latent and Graphical Model for Multivariate Binary Data. arXiv preprint arXiv:1606.08925.

Sun, J., Chen, Y., Liu, J., Ying, Z., & Xin, T. (2016). Latent Variable Selection for Multidimensional Item Response Theory Models via L1 Regularization. Psychometrika, 81(4), 921-939.

Kang, H., Zhang, S., Chen, Y., Liu, J., & Ying, Z. (2017). Multidimensional Latent Trait Models via Low-rank Second-order Exponential Family.