Chapter 1 Introduction to Exploratory Cognitive Diagnostic Models

Exploratory Cognitive Diagnostic Models (ECDMs) are versions of classical Cognitive Diagnostic Models (CDMs) that do not require an expert-crafted Q matrix. This class of models is new to the world of psychometric models. The goal of this textbook is to provide an overview of their implementation in the ecdm package, written by the authors who developed it!

Before we continue, please “bookmark” the ecdm repository on GitHub:

https://github.com/tmsalab/ecdm

The website provides direct access to the developers behind ecdm. In particular, you can file issues or bug reports, ask questions, and stay up to date on the latest developments.

1.1 Installation

Before we can get started, please install the ecdm package from GitHub. The package is still under development and is currently only available from GitHub, so installing it with install.packages('ecdm') isn’t possible.

As many of the routines are written in C++, the ecdm package requires a compiler to install. To assist in setting up the compiler, we’ve created the following guides:

From there, please use the remotes package to retrieve the latest development version.

if(!requireNamespace('remotes', quietly = TRUE)) install.packages("remotes")
remotes::install_github("tmsalab/ecdm")
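Before running the install command above, it can help to confirm that R can find a compiler. The following is a minimal sketch, assuming the pkgbuild package is available; pkgbuild is not mentioned in the instructions above and is used here only for its has_build_tools() check. If pkgbuild is not installed, the check is skipped rather than attempted.

```r
# Assumption: pkgbuild is used here only to detect a compiler;
# it is not part of the installation steps described above.
if (requireNamespace("pkgbuild", quietly = TRUE)) {
  # TRUE if a working C/C++ toolchain was found, FALSE otherwise.
  has_compiler <- pkgbuild::has_build_tools(debug = FALSE)
} else {
  # pkgbuild is not installed, so the check cannot be run.
  has_compiler <- NA
}

has_compiler
```

A result of FALSE suggests setting up the compiler before calling remotes::install_github().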

1.2 Loading the Package

Accessing the ecdm routines requires loading the package into R. Please load the ecdm package by pressing “run”.

library(ecdm)

1.3 Supplementary Data Sets

The ecdm package has an accompanying data package called ecdmdata that comes equipped with three different data sets:

Examination for the Certificate of Proficiency in English (ECPE), Templin, J. and Hoffman, L. (2013).
- items_ecpe: N = 2922 subject responses to J = 28 items.
- qmatrix_ecpe: J = 28 items and K = 3 traits.
Fraction Addition and Subtraction, Tatsuoka, K. K. (1984).
- items_fractions: N = 536 subject responses to J = 20 items.
- qmatrix_fractions: J = 20 items and K = 8 traits.
Revised PSVT:R, Culpepper and Balamuta (2013).
- items_spatial: N = 516 subject responses to J = 30 items.

As time goes on, more data sets will likely be added.

Let’s take a look at the Fraction Addition and Subtraction data sets. Once the ecdmdata package is loaded, typing the name of a data set and running the command will print it. As these data sets are relatively big, let’s use the function head() to view only the first 6 rows.

library(ecdmdata)

head( items_fractions )

##            Item01 Item02 Item03 Item04 Item05 Item06 Item07 Item08 Item09
## Subject001      0      0      0      1      0      0      1      1      0
## Subject002      0      1      1      1      0      1      1      1      1
## Subject003      0      1      1      1      0      1      1      1      0
## Subject004      1      1      1      1      1      1      0      1      0
## Subject005      0      0      0      1      0      1      0      1      0
## Subject006      0      0      0      0      0      1      0      1      1
##            Item10 Item11 Item12 Item13 Item14 Item15 Item16 Item17 Item18
## Subject001      1      1      1      0      1      1      1      0      1
## Subject002      1      1      1      1      1      1      1      1      1
## Subject003      0      0      0      0      1      1      1      0      0
## Subject004      1      1      1      0      1      0      1      0      1
## Subject005      0      0      1      0      0      0      0      0      0
## Subject006      0      0      0      0      1      0      0      0      0
##            Item19 Item20
## Subject001      1      1
## Subject002      1      1
## Subject003      0      0
## Subject004      0      1
## Subject005      0      0
## Subject006      0      0

head( qmatrix_fractions )

##        Trait1 Trait2 Trait3 Trait4 Trait5 Trait6 Trait7 Trait8
## Item01      0      0      0      1      0      1      1      0
## Item02      0      0      0      1      0      0      1      0
## Item03      0      0      0      1      0      0      1      0
## Item04      0      1      1      0      1      0      1      0
## Item05      0      1      0      1      0      0      1      1
## Item06      0      0      0      0      0      0      1      0

Within this textbook, the following notation will be used for dimensions:

\(K\): Number of Traits (columns) in the Q Matrix
\(J\): Number of Items (rows in the Q Matrix, columns in the Item Matrix)
\(N\): Number of Subjects (rows) in the Item Matrix

To retrieve this information in R, we can use the dimension function, dim(), which lists the size of the data as rows by columns.

Find the dimensions of items_fractions and qmatrix_fractions:

dim(items_fractions)

## [1] 536  20

dim(qmatrix_fractions)

## [1] 20  8
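The link between these dimensions and the notation \(N\), \(J\), and \(K\) can also be checked in code. The sketch below uses a hypothetical helper, cdm_dims(), which is not part of ecdm or ecdmdata, together with small made-up matrices so that it runs without either package installed.

```r
# Hypothetical helper (not part of ecdm or ecdmdata): collect N, J, and K
# from an item matrix and a Q matrix, and check that they agree on J.
cdm_dims <- function(items, qmatrix) {
  # Items are columns of the item matrix and rows of the Q matrix.
  stopifnot(ncol(items) == nrow(qmatrix))
  c(N = nrow(items), J = ncol(items), K = ncol(qmatrix))
}

# Toy data for illustration only: 4 subjects, 3 items, 2 traits.
toy_items <- matrix(c(1, 0, 1,
                      0, 0, 1,
                      1, 1, 1,
                      0, 1, 0),
                    nrow = 4, byrow = TRUE)
toy_qmatrix <- matrix(c(1, 0,
                        0, 1,
                        1, 1),
                      nrow = 3, byrow = TRUE)

toy_dims <- cdm_dims(toy_items, toy_qmatrix)
toy_dims

# With ecdmdata loaded, the equivalent call would be:
# cdm_dims(items_fractions, qmatrix_fractions)
```

The stopifnot() check makes a mismatch between the two matrices fail early, which is a common source of confusion when an item matrix and a Q matrix come from different files.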

1.4 Help!

Each function within the package has a help file that documents its implementation, and some functions include worked examples as well. To view this information, type either ?function_name or help(function_name). Let’s verify the dimensions found earlier for the items_fractions data set by checking its entry in the documentation.

?items_fractions

If you are curious to see how a function performs, you can opt to use example(function_name, package = "ecdm"). Be aware that some examples may take considerably longer than the rest to run.

1.5 Notation

For consistency, we aim to use the following notation.

Denoting individuals:

\(N\) is the total number of individuals taking the assessment.
\(i\) is the current individual.

Denoting items:

\(J\) is the total number of items on the assessment.
\(j\) is the current item.
\(Y_{ij}\) is the observed binary response for individual \(i\) (\(1\leq i \leq N\)) to item \(j\) (\(1\leq j\leq J\)).
\(s_j\) is the probability of slipping on item \(j\).
\(g_j\) is the probability of guessing on item \(j\).

Denoting attributes:

\(K\) is the total number of attributes for the assessment.
\(k\) is the current attribute.
\(\boldsymbol\alpha_i=\left(\alpha_{i1},\dots,\alpha_{iK}\right)^\prime\) where \(\boldsymbol\alpha_i\in \left\{0,1\right\}^K\) and \(\alpha_{ik}\) is the latent binary attribute for individual \(i\) on attribute \(k\) (\(1\leq k\leq K\)).

Denoting the skill/attribute “Q” matrix:

\(\boldsymbol q_{j}=\left(q_{j1},\dots,q_{jK}\right)^\prime\) is the \(j\)th row of \(\boldsymbol Q\), where \(q_{jk}=1\) if attribute \(k\) is required for item \(j\) and \(q_{jk}=0\) otherwise.
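To see how the symbols fit together, the following sketch builds a small Q matrix, two attribute profiles, and made-up slipping and guessing values, then combines them. It assumes a DINA-type model, which this section does not itself specify: an individual is expected to answer item \(j\) correctly only when they hold every attribute the item requires. The indicator eta, and all of the numbers, are illustrative assumptions and do not come from the ecdm package.

```r
# Toy Q matrix: J = 3 items (rows) by K = 2 attributes (columns).
Q <- matrix(c(1, 0,
              0, 1,
              1, 1),
            nrow = 3, byrow = TRUE)

# Toy attribute profiles: N = 2 individuals (rows) by K = 2 attributes.
alpha <- matrix(c(1, 0,
                  1, 1),
                nrow = 2, byrow = TRUE)

# Made-up slipping (s) and guessing (g) probabilities, one per item.
s <- c(0.1, 0.2, 0.1)
g <- c(0.2, 0.1, 0.3)

# eta[i, j] = 1 when individual i has every attribute that item j requires.
# alpha %*% t(Q) counts the required attributes that individual i holds;
# comparing it with the number required by item j gives the indicator.
required <- rowSums(Q)
eta <- (alpha %*% t(Q) ==
          matrix(required, nrow(alpha), nrow(Q), byrow = TRUE)) * 1

# Assumed DINA-type response probability:
# P(Y_ij = 1) is 1 - s_j when eta_ij = 1, and g_j otherwise.
p_correct <- eta * matrix(1 - s, nrow(alpha), length(s), byrow = TRUE) +
  (1 - eta) * matrix(g, nrow(alpha), length(g), byrow = TRUE)

eta
p_correct
```

Here the first individual holds only attribute 1, so eta is 1 for item 1 alone and the remaining items fall back to the guessing probability. The second individual holds both attributes, so every item uses one minus the slipping probability.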