Language of school and SES matter in standardized testing of bilinguals

Assessing children from diverse language backgrounds can be a challenge, but at least for Spanish speakers, SLPs have a decent array of resources available, including a growing number of standardized tests. The CELF–4S is one of these, designed to diagnose language disorders in Spanish speakers (mono- or bilingual) ages 5–21. It’s not just a Spanish translation of the English CELF; it was written specifically for speakers of Spanish. Great, right?


The problem is that the norming sample for this test was somewhat smaller than what’s recommended, so the norms in the test manual may not be valid for all groups. The test creators and other researchers have also disagreed about whether monolingual and bilingual speakers need separate norms (in the test manual, they’re combined).

This study focused on children from 5–7 years old with multiple risk factors for underperformance on standardized language tests. These included low SES (low-income family and parents with lower levels of education) and attending an English-only school, which favors English to the detriment of the home language. The researchers gave the CELF–4S to a huge group (656) of these kids, a lot more per age bracket than the test was originally normed on. The average Core Language Score was 83.57. That is more than one standard deviation below the mean, and it falls under 85, the cut-off score the manual gives for identifying a language disorder. In Table 3, you can see how the results break down by subtest and age group. And, yes. You read that right. Given the published test norms, over half of these kids would appear to have DLD.
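To make the "more than one standard deviation below the mean" point concrete, here is a quick sketch of the arithmetic. It assumes the usual standard-score scale with a mean of 100 and a standard deviation of 15, which is consistent with 85 sitting exactly one standard deviation below the mean; the function name `sd_units` is just an illustrative choice.

```python
# Assumption: standard scores with mean 100 and SD 15
# (consistent with a cut-off of 85 being one SD below the mean).
MEAN = 100.0
SD = 15.0


def sd_units(score):
    """Return how far a standard score sits from the mean, in SD units."""
    return (score - MEAN) / SD


sample_mean = 83.57  # average Core Language Score reported in the study

print(f"Sample mean: {sd_units(sample_mean):.2f} SD from the mean")
print(f"Cut-off of 85: {sd_units(85):.2f} SD from the mean")
```

Under that assumption, the group average lands a bit more than one standard deviation below the mean, which is why it falls on the "disordered" side of the published cut-off.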

Wow. This is clearly not okay. So what do we do?

It looks like we need separate test norms for low-SES children in English schools. The authors used a subset of the original sample (still large at 299, 28 of whom had been found to have a language disorder via multiple methods of assessment) to look into the test’s diagnostic accuracy. That cut-off score of 85? Yeah, it resulted in so many false positives (specificity of only 65%) that it wasn’t clinically useful. The researchers computed an adjusted cut-off score of 78 for this group, which had acceptable diagnostic sensitivity and specificity (85% and 80%, respectively).
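If sensitivity and specificity feel abstract, a small sketch may help show how moving a cut-off score changes them. Everything here is hypothetical: the scores and diagnoses are made-up toy data (not the study’s), the function name `diagnostic_accuracy` is invented for illustration, and the rule that a score strictly below the cut-off counts as a positive result is an assumption.

```python
def diagnostic_accuracy(scores, has_disorder, cutoff):
    """Return (sensitivity, specificity) for a given cut-off score.

    Assumption: a score strictly below `cutoff` is flagged as a positive.
    """
    tp = fn = tn = fp = 0
    for score, disordered in zip(scores, has_disorder):
        flagged = score < cutoff
        if disordered and flagged:
            tp += 1  # child with a disorder, correctly flagged
        elif disordered:
            fn += 1  # child with a disorder, missed
        elif flagged:
            fp += 1  # typically developing child, wrongly flagged
        else:
            tn += 1  # typically developing child, correctly passed
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy data: 3 children with a disorder, 7 without.
scores = [70, 74, 80, 76, 79, 82, 84, 88, 90, 95]
has_disorder = [True, True, True, False, False, False, False, False, False, False]

for cutoff in (85, 78):
    sens, spec = diagnostic_accuracy(scores, has_disorder, cutoff)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy set, lowering the cut-off flags fewer typically developing children (higher specificity) at the cost of missing one child with a disorder (lower sensitivity). That trade-off is what an adjusted cut-off tries to balance.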

The big takeaway is this: Use the CELF–4S very cautiously. Understand the limitations of the normative sample used to standardize the test. If you are working with kids matching the profile of this paper’s sample (5–7 years old, low SES and low maternal education, and in English-only schools), keep that adjusted cut-off score of 78 in mind. And above all, remember that standardized testing alone is not a good way to assess young English learners.


Barragan, B., Castilla-Earls, A., Martinez-Nieto, L., Restrepo, M. A., & Gray, S. (2018). Performance of Low-Income Dual Language Learners Attending English-Only Schools on the Clinical Evaluation of Language Fundamentals–Fourth Edition, Spanish. Language, Speech, and Hearing Services in Schools. Advance online publication. doi: 10.1044/2017_LSHSS-17-0013.