

A Vision for Using an Argument-Based Framework for Validity Applied to a Comprehensive System of Assessments for English Learners in Secondary Grades

By Margaret Heritage, Caroline Wylie, Molly Faulkner-Bond, and Aída Walqui



American Educational Research Association (AERA), American Psychological Association, National Council on Measurement in Education, Joint Committee on Standards for Educational and Psychological Testing (U.S.). (2014). Standards for educational and psychological testing. Washington, DC: AERA.

Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly: An International Journal, 2(1), 1-34.


Black, P. (1993). Formative and summative assessment by teachers. Studies in Science Education, 21(1), 49–97. DOI: 10.1080/03057269308560014


Brooks, M. D. (2018). Pushing past myths: Designing instruction for long‐term English Learners. TESOL Quarterly, 52(1), 221–233.


Bruner, J. (1983). Child’s talk. New York, NY: Norton.


Callahan, R. M. (2005). Tracking and high school English Learners: Limiting opportunity to learn. American Educational Research Journal, 42(2), 305–328.


Callahan, R. M., & Shifrer, D. (2016). Equitable access for secondary English Learner students. Educational Administration Quarterly.


Carlson, D., & Knowles, J. E. (2016). The effect of English language learner reclassification on student ACT scores, high school graduation, and postsecondary enrollment: Regression discontinuity evidence from Wisconsin. Journal of Policy Analysis and Management, 35(3), 559–586.


Connolly, S., Klenowski, V., & Wyatt-Smith, C. M. (2012). Moderation and consistency of teacher judgement: Teachers’ views. British Educational Research Journal, 38(4), 593-614.


Durán, R. P. (2008). Assessing English-language learners’ achievement. Review of Research in Education, 32(1), 292–327.


Glick, Y., & Walqui, A. (2021). Affordances in the development of student voice and agency. The case of bureaucratically labeled Long Term English Learners. In A. Kibler, G. Valdés, & A. Walqui (Eds.), Reconceptualizing the role of critical dialogue in American classrooms. Promoting equity through dialogic education. New York, NY: Routledge.


Gordon, E. W. (2020). Toward assessment in the service of learning. Educational Measurement: Issues and Practice, 39(3), 72–78.


Gordon, E. W., Gordon, E. W., Aber, L., & Berliner, D. (2012). Changing paradigms for education. Assessment, Teaching, and Learning, 2(2).


Heritage, M., Faulkner-Bond, M., & Walqui, A. (2021). A new direction for assessing English learners in the secondary grades. San Francisco: WestEd.


Herman, J. L. (2010). Coherence: Key to next generation assessment success. Los Angeles, CA: University of California.


Herman, J. L., Aschbacher, P., & Winters, L. (1992). A practical guide to alternative assessment. Alexandria, VA: Association for Supervision and Curriculum Development.


Herman, J. L., Heritage, M., & Goldschmidt, P. (2011). Guidance for developing and selecting assessments of student growth for use in teacher evaluation systems. Los Angeles, CA: University of California.


Johnson, A. (2019). A matter of time: Variations in high school course-taking by years-as-EL subgroup. Educational Evaluation and Policy Analysis, 41(4), 461–482.


Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17-64). Westport, CT: American Council on Education/Praeger.


Kane, M. (2013). The argument-based approach to validation. School Psychology Review, 42(4), 448-457.


Kibler, A. K., & Valdés, G. (2016). Conceptualizing language learners: Socioinstitutional mechanisms and their consequences. The Modern Language Journal, 100(S1), 96–116.


Levi, T., & Poehner, M. E. (2018). Employing dynamic assessment to enhance agency among L2 learners. In J. P. Lantolf, M. E. Poehner, & M. Swain (Eds.), The Routledge handbook of sociocultural theory and second language development (pp. 295–309). New York, NY: Routledge.


Moss, P. A., Girard, B. J., & Haniford, L. C. (2006). Validity in educational assessment. Review of Research in Education, 30(1), 109-162.


National Center for Education Statistics (2018). NAEP data explorer. Retrieved September 10, 2019, from


National Research Council. (2001). Knowing what students know. Washington, DC: National Academies Press.


Organisation for Economic Co-operation and Development. (2012). Equity and equality of opportunity. Education Today 2013: The OECD Perspective. Paris, France: Author.


Paris, D. (2012). Culturally sustaining pedagogy: A needed change in stance, terminology, and practice. Educational Researcher, 41(3), 93–97.


Perie, M., & Forte, E. (2011). Developing a validity argument for assessments of students in the margins. In M. Russell & M. Kavanaugh (Eds.), Assessing Students in the Margins: Challenges, Strategies, and Techniques (pp. 335-381). Charlotte, NC: Information Age Publishing.

Rogoff, B. (1995). Observing sociocultural activity on three planes: Participatory appropriation, guided participation, and apprenticeship. In J. Wertsch, P. del Rio, & A. Alvarez (Eds.), Sociocultural studies of mind (pp. 139–164). Cambridge, UK: Cambridge University Press.


Rosa, J. (2019). Looking like a language, sounding like a race: Raciolinguistic ideologies and the learning of latinidad. New York and Oxford: Oxford University Press.


Smith, J. K. (2003). Reconsidering reliability in classroom assessment and grading. Educational Measurement: Issues and Practice, 22(4), 26-33.


Swain, M. (2006). Languaging, agency and collaboration in advanced second language proficiency. In H. Byrnes (Ed.), Advanced language learning: The contribution of Halliday and Vygotsky (pp. 95–108). London: Continuum.


Swain, M., & Lapkin, S. (2011). Languaging as agent and constituent of cognitive change in an older adult: An example. Canadian Journal of Applied Linguistics, 14(1), 104–117.


Umansky, I. M. (2016). Leveled and exclusionary tracking: English Learners’ access to academic content in middle school. American Educational Research Journal, 53(6), 1792–1833.


Umansky, I. M., & Dumont, H. (2021). English Learner Labeling: How English Learner Classification in Kindergarten Shapes Teacher Perceptions of Student Skills and the Moderating Role of Bilingual Instructional Settings. American Educational Research Journal, 0002831221997571.


Umansky, I. M., & Porter, L. (2020). State English learner education policy: A conceptual framework to guide comprehensive policy action. Education Policy Analysis Archives, 28(0), 17.


Volante, L., DeLuca, C., Adie, L., Baker, E., Harju‐Luukkainen, H., Heritage, M., Schneider, C., Stobart, G., Tan, K., & Wyatt‐Smith, C. (2020). Synergy and tension between large‐scale and classroom assessment: International trends. Educational Measurement: Issues and Practice, 39(4), 21-29.


Valdés, G., Kibler, A., & Walqui, A. (2014, March). Changes in the expertise of ESL professionals: Knowledge and action in an era of new standards. Alexandria, VA: TESOL International Association.


Vygotsky, L.S. (1978). Mind in society. Cambridge, MA: Harvard University Press.


Vygotsky, L.S. (1986). Thought and language. Cambridge, MA: MIT Press.


van Lier, L. (2000). From input to affordance: social interactive learning from an ecological perspective. In J. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 245–259). New York, NY: Oxford University Press.


van Lier, L. (2004). The ecology and semiotics of language learning. A sociocultural perspective. Dordrecht, NL: Kluwer Academic.


van Lier, L., & Walqui, A. (2012). Language and the common core standards. In K. Hakuta & M. Santos (Eds.), Understanding language: Commissioned papers on language and literacy issues in the common core state standards and next generation science standards (pp. 44–51). Palo Alto, CA: Stanford University.


Walqui, A. (2006). Scaffolding instruction for English language learners: A conceptual framework. International Journal of Bilingual Education and Bilingualism, 9(2), 159–180.


Walqui, A., & Heritage, M. (2011). Instruction for diverse groups of English language learners. In K. Hakuta & M. Santos (Eds.), Understanding language: Commissioned papers on language and literacy issues in the Common Core State Standards and Next Generation Science Standards (pp. 94–103). Palo Alto, CA: Stanford University.


Webb, N. M., & Shavelson, R. J. Generalizability theory: Overview. In B. Everitt & D. Howell (Eds.), Encyclopedia of statistics in behavioral science (Vol. 2, pp. 717-719). Chichester, UK: Wiley.


Webb, N. M., Shavelson, R. J., & Haertel, E. H. (2006). Reliability coefficients and generalizability theory. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics (Vol. 26, pp. 81-124).


Wyatt‐Smith, C., Klenowski, V., & Gunn, S. (2010). The centrality of teachers’ judgement practice in assessment: A study of standards in moderation. Assessment in Education: Principles, Policy & Practice, 17(1), 59-75.
