Chapter 1 Introduction to Exploratory Cognitive Diagnostic Models
Exploratory Cognitive Diagnostic Models (ECDMs) are versions of classical
Cognitive Diagnostic Models (CDMs) that do not require an expertly crafted
Q matrix. This class of models is new to the world of psychometric models.
The goal of this textbook is to provide an overview of their implementation
in the ecdm package, written by the authors who developed it!
Before we continue, please “bookmark” the ecdm repository on GitHub:
https://github.com/tmsalab/ecdm
The website provides direct access to the developers behind ecdm. In
particular, it lets you file issues or bug reports, ask questions, and stay
up to date on the latest breakthroughs.
1.1 Installation
Before we can get started, please install the ecdm package from GitHub.
The ecdm package is currently only available via GitHub as it is still being
developed. As a result, installing with install.packages('ecdm') isn’t possible.
As many of the routines are written in C++, the ecdm package requires a
compiler to install. To assist in setting up the compiler, we’ve created
the following guides:
From there, please use the remotes package to retrieve the latest
development version.
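A minimal sketch of that installation step is given here. It assumes the remotes package can be installed from CRAN and that a C++ compiler is already set up; the repository path is the one linked earlier in this chapter.

```r
# Install the 'remotes' helper from CRAN if it is not already present.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Fetch and build the development version of ecdm from GitHub.
# A working C++ compiler is needed for this step to succeed.
remotes::install_github("tmsalab/ecdm")
```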
1.2 Loading the Package
Accessing the ecdm routines requires loading the package into R. Please
load the ecdm package by running library(ecdm).
1.3 Supplementary Data Sets
The ecdm package has an accompanying data package called ecdmdata that
comes equipped with three different data sets:
- Examination for the Certificate of Proficiency in English (ECPE), Templin, J. and Hoffman, L. (2013).
  - items_ecpe: N = 2922 subject responses to J = 28 items.
  - qmatrix_ecpe: J = 28 items and K = 3 traits.
- Fraction Addition and Subtraction, Tatsuoka, K. K. (1984).
  - items_fractions: N = 536 subject responses to J = 20 items.
  - qmatrix_fractions: J = 20 items and K = 8 traits.
- Revised PSVT:R, Culpepper and Balamuta (2013).
  - items_spatial: N = 516 subject responses to J = 30 items.
As time goes on, more data sets will likely be added.
Let’s take a look at the Fraction Addition and Subtraction data sets. Typing
the name of each data set and running the command will load the data into R
if the ecdmdata package is loaded. As these data sets are relatively big,
let’s use the function head() to view only the first 6 rows.
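The commands behind the two listings that follow are roughly as sketched here. The sketch assumes the ecdmdata package is already installed and that its data sets become available by name once the package is loaded, as described above; if the package is missing, it prints a message instead.

```r
# Assumes the ecdmdata package has been installed beforehand.
has_ecdmdata <- requireNamespace("ecdmdata", quietly = TRUE)

if (has_ecdmdata) {
  library(ecdmdata)

  # First 6 rows of the item responses (subjects in rows, items in columns).
  print(head(items_fractions))

  # First 6 rows of the accompanying Q matrix (items in rows, traits in columns).
  print(head(qmatrix_fractions))
} else {
  message("ecdmdata is not installed; skipping the preview.")
}
```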
## Item01 Item02 Item03 Item04 Item05 Item06 Item07 Item08 Item09
## Subject001 0 0 0 1 0 0 1 1 0
## Subject002 0 1 1 1 0 1 1 1 1
## Subject003 0 1 1 1 0 1 1 1 0
## Subject004 1 1 1 1 1 1 0 1 0
## Subject005 0 0 0 1 0 1 0 1 0
## Subject006 0 0 0 0 0 1 0 1 1
## Item10 Item11 Item12 Item13 Item14 Item15 Item16 Item17 Item18
## Subject001 1 1 1 0 1 1 1 0 1
## Subject002 1 1 1 1 1 1 1 1 1
## Subject003 0 0 0 0 1 1 1 0 0
## Subject004 1 1 1 0 1 0 1 0 1
## Subject005 0 0 1 0 0 0 0 0 0
## Subject006 0 0 0 0 1 0 0 0 0
## Item19 Item20
## Subject001 1 1
## Subject002 1 1
## Subject003 0 0
## Subject004 0 1
## Subject005 0 0
## Subject006 0 0
## Trait1 Trait2 Trait3 Trait4 Trait5 Trait6 Trait7 Trait8
## Item01 0 0 0 1 0 1 1 0
## Item02 0 0 0 1 0 0 1 0
## Item03 0 0 0 1 0 0 1 0
## Item04 0 1 1 0 1 0 1 0
## Item05 0 1 0 1 0 0 1 1
## Item06 0 0 0 0 0 0 1 0
Within this textbook, the following notation will be used for dimensions:
- \(K\): Number of Traits (columns) in the Q Matrix
- \(J\): Number of Items (rows in the Q Matrix, columns in the Item Matrix)
- \(N\): Number of Subjects (rows) in the Item Matrix
To retrieve this information in R, we can use the dimension function, dim(),
which lists the size of the data as rows by columns.
Find the dimensions of the items_fractions and qmatrix_fractions data sets.
## [1] 536 20
## [1] 20 8
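To show how these two outputs map onto the dimension notation, here is a small sketch that runs without ecdmdata. It uses zero-filled placeholder matrices with the same shapes as reported above; the names items_demo and qmatrix_demo are hypothetical stand-ins, not the real data.

```r
# Placeholder matrices with the same shapes as the fractions data sets.
# These are stand-ins for illustration only, not the real responses.
items_demo   <- matrix(0L, nrow = 536, ncol = 20)  # subjects x items
qmatrix_demo <- matrix(0L, nrow = 20,  ncol = 8)   # items x traits

# dim() returns c(rows, columns).
dim(items_demo)
dim(qmatrix_demo)

# Recover the dimension notation used in this textbook.
N <- nrow(items_demo)    # number of subjects
J <- ncol(items_demo)    # number of items
K <- ncol(qmatrix_demo)  # number of traits

# The item count must agree between the two matrices.
stopifnot(J == nrow(qmatrix_demo))
```

With ecdmdata loaded, the same dim() calls on items_fractions and qmatrix_fractions should reproduce the two outputs shown above.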
1.4 Help!
Each function within the package has a help file that documents its
implementation, and some functions have worked examples as well. To view
this information, type either ?function_name or help(function_name). Let’s
verify the previously acquired numbers for the items_fractions data set by
checking its entry in the documentation.
If you are curious to see how a function performs, you can opt to use
example(function_name, package = "ecdm"). Be aware that some examples
may take considerably longer than the rest to run.
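As a concrete illustration of these help facilities, the sketch below uses the base R function mean() as a stand-in, since it is available in every R session. The same calls should apply to ecdm functions and ecdmdata data sets once those packages are loaded.

```r
# help() returns an object that opens the help page when printed;
# at the console, typing ?mean or help(mean) displays the page directly.
help_page <- help("mean", package = "base")
class(help_page)

# Inspect a function's example code without running it:
# give.lines = TRUE returns the example source as a character vector.
example_lines <- example("mean", package = "base", give.lines = TRUE)
head(example_lines)

# Data sets are documented the same way, e.g. once ecdmdata is loaded:
# ?items_fractions
```

Looking at the example source first is a convenient way to judge whether an example is likely to be slow before running it.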
1.5 Notation
For consistency, we aim to use the following notation.
Denoting individuals:
- \(N\) is the total number of individuals taking the assessment.
- \(i\) is the current individual.
Denoting items:
- \(J\) is the total number of items on the assessment.
- \(j\) is the current item.
- \(Y_{ij}\) is the observed binary response for individual \(i\) (\(1\leq i \leq N\)) to item \(j\) (\(1\leq j\leq J\)).
- \(s_j\) is the probability of slipping on item \(j\).
- \(g_j\) is the probability of guessing on item \(j\).
Denoting attributes:
- \(K\) is the total number of attributes for the assessment.
- \(k\) is the current attribute.
- \(\boldsymbol\alpha_i=\left(\alpha_{i1},\dots,\alpha_{iK}\right)^\prime\) where \(\boldsymbol\alpha_i\in \left\{0,1\right\}^K\) and \(\alpha_{ik}\) is the latent binary attribute for individual \(i\) on attribute \(k\) (\(1\leq k\leq K\)).
Denoting the skill/attribute “Q” matrix:
- \(\boldsymbol q_{j}=\left(q_{j1},\dots,q_{jK}\right)^\prime\) is the \(j\)th row of \(\boldsymbol Q\) such that \(q_{jk}=1\) if attribute \(k\) is required for item \(j\) and zero otherwise.
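To make the shapes behind this notation concrete, the sketch below builds small toy objects in base R. The sizes, the randomly drawn values, the example Q matrix, and the slipping and guessing values are all arbitrary assumptions chosen for illustration; none of them come from the data sets above.

```r
set.seed(1)  # reproducible toy values

N <- 5  # individuals
J <- 4  # items
K <- 3  # attributes

# Y: N x J matrix of observed binary responses, Y[i, j] = Y_ij.
Y <- matrix(rbinom(N * J, size = 1, prob = 0.5), nrow = N, ncol = J)

# alpha: N x K matrix whose i-th row is the attribute profile alpha_i.
alpha <- matrix(rbinom(N * K, size = 1, prob = 0.5), nrow = N, ncol = K)

# Q: J x K matrix whose j-th row q_j flags the attributes item j requires.
Q <- matrix(c(1, 0, 0,
              0, 1, 0,
              1, 1, 0,
              0, 1, 1), nrow = J, ncol = K, byrow = TRUE)

# Item-level slipping and guessing probabilities, one value per item.
s <- rep(0.1, J)
g <- rep(0.2, J)

# Attributes required by item j = 3, read off the third row of Q.
required_item3 <- which(Q[3, ] == 1)
required_item3
```

Storing the responses as an N by J matrix and the Q matrix as a J by K matrix matches the layout of items_fractions and qmatrix_fractions described earlier.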