10 Validity Evidence

The Dynamic Learning Maps® (DLM®) Alternate Assessment System is based on the core belief that all students should have access to challenging, grade-level academic content. The DLM assessment provides students with the most significant cognitive disabilities the opportunity to demonstrate what they know and can do.

The 2021–2022 academic year was the seventh operational administration of the DLM science assessments. This technical manual update provides evidence from the 2021–2022 year intended to evaluate the propositions and assumptions that undergird the assessment system, as described at the onset of its design in the DLM Theory of Action. The contents of this manual are summarized in Table 10.1. Evidence summarized in this manual builds on the original evidence included in the 2015–2016 Technical Manual—Science (Dynamic Learning Maps Consortium, 2017) and in the subsequent technical manual updates (Dynamic Learning Maps Consortium, 2018a, 2018b, 2019, 2020, 2021a). Together, the documents summarize the validity evidence collected to date.

Table 10.1: Review of Technical Manual Update Contents
Chapter	Contents
1	Provides an overview of information updated for the 2021–2022 year
2	Not updated for 2021–2022
3, 4	Provides evidence collected during 2021–2022 of assessment development and administration, including field-test information, item analyses, and test administrator survey results
5	Describes the statistical model used to produce results based on student responses, along with a summary of item parameters
6	Not updated for 2021–2022
7, 8	Describes results and analyses from the seventh operational administration, including how students performed on the assessment, the distributions of those results (aggregated and disaggregated), and the consistency of student responses
9	Provides evidence collected during 2021–2022 on participation in professional development modules, including participant evaluations
10	Summarizes validity evidence collected during the 2021–2022 academic year

This chapter reviews the evidence provided in this technical manual update and discusses future research studies as part of ongoing and iterative processes of program responsiveness, validation, and evaluation.

10.1 Validity Evidence Summary

The accumulated evidence available by the end of the 2021–2022 year provides additional support for the validity argument. Three scoring interpretation and use claims are summarized in Table 10.2. Each claim is addressed by evidence in one or more of the sources of validity evidence defined in the Standards for Educational and Psychological Testing (Standards, American Educational Research Association et al., 2014). While many sources of evidence contribute to multiple propositions, Table 10.2 lists the primary associations. For example, Proposition 4 is indirectly supported by content-related evidence described for Propositions 1 through 3. Table 10.3 shows the titles and sections for the chapters cited in Table 10.2.

Table 10.2: DLM Alternate Assessment System Claims and Sources of Updated Evidence for 2021–2022
	Sources of evidence*
Claim	Test content	Response processes	Internal structure	Relations with other variables	Consequences of testing
Mastery results represent what students know and can do.	3.1, 3.2, 3.3, 3.4, 4.1, 4.3, 4.4, 4.5, 7.1, 7.2	4.1, 4.2	3.3, 3.4, 3.6, 5.1, 8.1		3.5, 7.1, 7.2
Results indicate summative performance relative to alternate achievement standards.	7.1, 7.2		8.1		3.5, 7.1, 7.2
Results can be used for instructional decision-making.					3.5
* See Table 10.3 for a list of evidence sources. Only direct sources of evidence are listed. Some propositions are also supported indirectly by evidence presented for other propositions.

Table 10.3: Evidence Sources Cited in Table 10.2
Evidence no.	Chapter	Section
3.1	3	Testlet and Item Writing
3.2	3	External Reviews
3.3	3	Operational Assessment Items for 2021–2022
3.4	3	Field Testing
3.5	3	Educator Perception of Assessment Content
3.6	3	Evaluation of Item-Level Bias
4.1	4	User Experience With the DLM System
4.2	4	Accessibility Support Selections
4.3	4	Test Administration Observations
4.4	4	Student Experience
4.5	4	Opportunity to Learn
5.1	5	All
7.1	7	Student Performance
7.2	7	Score Reports
8.1	8	All

10.2 Continuous Improvement

As noted previously in this manual, 2021–2022 was the seventh year the DLM Science Alternate Assessment System was operational. While the 2021–2022 assessments were carried out in a manner that supports the validity of inferences made from results for the intended purposes, the DLM Governance Board is committed to continual improvement of assessments, educator and student experiences, and technological delivery of the assessment system. Based on formal research and evaluation, as well as informal feedback, some improvements have already been implemented for 2022–2023. This section describes notable improvements from the sixth year to the seventh year of operational administration, as well as examples of improvements to be made during the 2022–2023 year.

10.2.1 Improvements to the Assessment System

Overall, there were no significant changes to the item-writing procedures, item flagging outcomes, assessment administration procedures, or the process for scoring assessments from previous years to 2021–2022.

Based on an ongoing effort to improve the evaluation of the psychometric model used for scoring the assessments, the methods for calibrating the psychometric model and estimating reliability were updated in 2021–2022. The model calibration now implements a Bayesian rather than a maximum likelihood estimator, which allows for more accurate evaluations of model fit. The simulated retest method for estimating reliability now more closely approximates the conditions of test administration to better estimate the consistency in reported results. Both of these changes were implemented on the advice of the DLM Technical Advisory Committee (TAC).
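As an illustration only, the general idea behind a simulated retest can be sketched as follows. The sketch assumes a simplified two-class (master or nonmaster) model for a single skill, hypothetical item parameters, and a hypothetical mastery threshold of 0.8; none of these details are taken from this chapter, and the code is not the operational DLM procedure. For each student, a mastery status is drawn from the posterior, a new set of item responses is generated from that status, the new responses are scored, and the retest classification is compared with the observed one.

```python
import random


def posterior_mastery(responses, p_master, p_nonmaster, base_rate):
    """Posterior probability of mastery under a two-class model.

    Item probabilities are assumed to lie strictly between 0 and 1.
    """
    like_m, like_n = base_rate, 1.0 - base_rate
    for x, pm, pn in zip(responses, p_master, p_nonmaster):
        like_m *= pm if x else 1.0 - pm
        like_n *= pn if x else 1.0 - pn
    return like_m / (like_m + like_n)


def simulated_retest_agreement(observed, p_master, p_nonmaster, base_rate,
                               threshold=0.8, n_reps=100, seed=0):
    """Proportion of simulated retests whose mastery classification
    matches the classification from the observed responses."""
    rng = random.Random(seed)
    agree = total = 0
    for responses in observed:
        post = posterior_mastery(responses, p_master, p_nonmaster, base_rate)
        observed_class = post >= threshold
        for _ in range(n_reps):
            # Draw a mastery status from the student's posterior.
            is_master = rng.random() < post
            probs = p_master if is_master else p_nonmaster
            # Generate retest responses conditional on that status.
            retest = [rng.random() < p for p in probs]
            retest_post = posterior_mastery(retest, p_master, p_nonmaster,
                                            base_rate)
            agree += observed_class == (retest_post >= threshold)
            total += 1
    return agree / total


if __name__ == "__main__":
    # Hypothetical parameters for a three-item testlet (illustrative only).
    p_m = [0.90, 0.85, 0.80]   # P(correct | master)
    p_n = [0.20, 0.25, 0.30]   # P(correct | nonmaster)
    data = [[1, 1, 1], [0, 0, 0], [1, 0, 1], [0, 1, 0]]
    print(simulated_retest_agreement(data, p_m, p_n, base_rate=0.5))
```

Agreement values closer to 1 indicate that the reported classification would rarely change on a hypothetical second administration. A fuller simulation would also mirror the actual administration conditions, such as which testlets each student received.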

The validity evidence collected in 2021–2022 expands upon the data compiled in the first six operational years for four of the critical sources of evidence described in the Standards (American Educational Research Association et al., 2014): evidence based on test content, response processes, internal structure, and consequences of testing. Specifically, analysis of the opportunity to learn data contributed to the evidence based on test content. Test administrator survey responses on test administration contributed to the evidence based on response processes. Evaluation of item-level bias via differential item functioning analysis, along with item-pool statistics and model parameters, provided additional evidence based on internal structure. Test administrator survey responses also provided evidence based on consequences of testing. Studies planned for 2022–2023 to provide additional validity evidence are summarized in the following section.

10.2.2 Future Research

The continuous improvement process also leads to future directions for research to inform and improve the DLM System in 2022–2023 and beyond. This section describes some areas for further investigation.

DLM staff members are planning several studies for spring 2023 to collect data from educators in the states administering DLM assessments. The test administrator survey will collect information on educator ratings of student mastery as additional evidence to evaluate the extent to which mastery ratings are consistent with other measures of student knowledge, skills, and understandings. In addition, the test administrator survey will continue to provide a source of data from which to investigate changes over time in the long-term effects of the assessment system for students and educators. DLM staff are also examining new ways to collect information on students’ opportunity to learn and to evaluate the extent to which educators provide aligned instruction. DLM staff will continue to collaborate with the DLM Governance Board on additional data collection as needed.

In addition to data collected from students and educators, there is an ongoing research agenda to improve the evaluation of item- and person-level model fit. A modeling subcommittee of DLM TAC members guides this research agenda.

Advice from the DLM TAC and DLM Governance Board will guide all future studies, using processes established over the life of the DLM System.