And more...

Esmaeeli et al. found that family history was the biggest predictor of reading difficulties in children at the end of second grade, though emergent literacy and oral language skills also played a role. As SLPs, we should always take family history into account when screening or testing for reading disorders.

Two studies this month looked at standardized language tests for Spanish–English bilingual children. Fitton et al. studied the sentence repetition task from the Bilingual English–Spanish Assessment (BESA) and found that it was a valid measure of morphosyntax in both Spanish and English. Wood & Schatschneider studied the Peabody Picture Vocabulary Test (PPVT-4) and found that it was biased against Spanish–English dual language learners (see also this review).

Méndez & Simon-Cereijido looked at Spanish–English bilingual preschoolers with developmental language disorder* (DLD) and found that children with better Spanish vocabulary skills also had better English grammar skills. They suggest targeting vocabulary in students’ home language to support English learning.

In a survey of nearly 3,000 children, Reinhartsen et al. found that children with autism are significantly more likely to have expressive language skills that outstrip their receptive skills. Children with this profile tended to have more severe delays and more significantly impaired language overall compared to children without this profile.

Rudolph et al. studied the diagnostic accuracy of finite verb morphology composite (FVMC) scores. Unlike previous studies, they found that FVMC wasn’t good at identifying 6-year-olds with developmental language disorder (DLD). The difference might be due to a larger, more representative sample of children. (NOTE: “The FVMC is derived from a spontaneous language sample, in either a free-play or elicited narrative scenario, and reflects the percent occurrence in obligatory contexts of eight T/A morphemes: regular past tense –ed, 3S, and present tense uncontracted and contracted copula and auxiliary BE forms (am, is, are).” ~Rudolph et al., 2019)
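Since the quote above defines the FVMC as a percent-occurrence measure, the arithmetic is simple to sketch: pool correct productions and obligatory contexts across the tense/agreement morphemes, then divide. Here's a minimal illustration (the morpheme tallies are invented, not data from Rudolph et al.):

```python
# Illustrative only: FVMC as percent occurrence in obligatory contexts,
# pooled across the tense/agreement morphemes. Tallies are hypothetical.

def fvmc(tallies):
    """tallies: {morpheme: (produced_correctly, obligatory_contexts)}"""
    produced = sum(p for p, _ in tallies.values())
    contexts = sum(c for _, c in tallies.values())
    return 100 * produced / contexts

sample = {
    "regular past -ed":       (8, 10),
    "3rd person singular -s": (5, 8),
    "copula BE":              (12, 14),
    "auxiliary BE":           (6, 9),
}

print(round(fvmc(sample), 1))  # 75.6
```

The same tally-and-divide logic applies however you split the eight morphemes; what matters is counting every obligatory context, not just the attempts.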

Verschuur et al. studied two types of parent training in Pivotal Response Treatment (PRT), finding that both group and individual training improved parents’ ability to create communication opportunities and increased children’s initiations. Furthermore, group training had additional benefits for parents’ stress levels and feelings of self-efficacy. The authors suggest that combining group and individual sessions might be a good way to build parents’ skills while conserving resources.

Venker et al. surveyed SLPs about their use of telegraphic speech. The vast majority of SLPs reported using telegraphic input for commenting on play, prompting for verbal imitations, and giving directions. However, only 18% of SLPs reported that they felt telegraphic speech is useful, which doesn’t make much sense! More research is needed to help align SLPs’ practices and perspectives on the use of telegraphic input. (Editors’ note: Perhaps it’s just a habit that’s hard to break? Even culturally influenced?)


*Note: The children in this study were those with Specific Language Impairment (SLI), which refers to children with Developmental Language Disorder (DLD) and normal nonverbal intelligence. We use DLD throughout our website for consistency purposes (read more here).


Esmaeeli, Z., Kyle, F.E., & Lundetræ, K. (2019). Contribution of family risk, emergent literacy and environmental protective factors in children’s reading difficulties at the end of second-grade. Reading and Writing. doi:10.1007/s11145-019-09948-5.

Fitton, L., Hoge, R., Petscher, Y., & Wood, C. (2019). Psychometric evaluation of the Bilingual English-Spanish Assessment sentence repetition task for clinical decision making. Journal of Speech, Language, and Hearing Research. doi:10.1044/2019_JSLHR-L-1

Méndez, L. I., & Simon-Cereijido, G. (2019). A view of the lexical-grammatical link in young Latinos with specific language impairment using language-specific and conceptual measures. Journal of Speech, Language, and Hearing Research. doi:10.1044/2019_JSLHR-L-18-0315

Reinhartsen, D.B., Tapia, A.L., Watson, L., Crais, E., Bradley, C., Fairchild, J., Herring, A.H., & Daniels, J. (2019). Expressive dominant versus receptive dominant language patterns in young children: Findings from the study to explore early development. Journal of Autism and Developmental Disorders. doi:10.1007/s10803-019-03999-x

Rudolph, J. M., Dollaghan, C. A., & Crotteau, S. (2019). Finite verb morphology composite: Values from a community sample. Journal of Speech, Language, and Hearing Research. doi:10.1044/2019_JSLHR-L-18-0437 

Venker, C.E., Yasick, M., & McDaniel, J. (2019). Using telegraphic input with children with language delays: A survey of speech-language pathologists’ practices and perspectives. American Journal of Speech–Language Pathology. doi:10.1044/2018_AJSLP-18-0140

Verschuur, R., Huskens, B. & Didden, R. (2019). Effectiveness of Parent Education in Pivotal Response Treatment on Pivotal and Collateral Responses. Journal of Autism and Developmental Disorders. doi:10.1007/s10803-019-04061-6

Wood, C., & Schatschneider, C. (2019). Item bias: Predictors of accuracy on Peabody Picture Vocabulary Test-Fourth Edition items for Spanish-English-speaking children. Journal of Speech, Language, and Hearing Research. doi:10.1044/2018_JSLHR-L-18-0145

What’s driving our clinical decision-making?

We know a lot about what types of assessment tools SLPs tend to use (see here, here, and here, for example), but we don’t know much about how we synthesize and prioritize the information we gather in those assessments to come up with a diagnosis (or lack thereof). How do we reconcile inconsistent results? What factors tend to carry the most weight? How much do outside influences (e.g., policies and caseload issues) affect our decisions? Two different studies this month dive into the minds of SLPs to begin answering these questions.

Fulcher-Rood et al. begin by pointing out that school-based SLPs receive conflicting information on how to assess and diagnose language disorders from our textbooks, our federal/state/local guidelines and policies, and the research. So how do we actually approach this problem in real life? To learn more, they used a pretty cool case study method, where lots of assessment results were available for each of five real 4- to 6-year-olds (cognitive and hearing screenings, parent/teacher questionnaires, three different standardized tests, and two different language samples, transcribed and analyzed against SALT norms), but the 14 experienced SLPs who participated only saw the results they specifically asked for to help them make their diagnoses. This better reflects actual practice than just giving the SLPs everything upfront, because in school settings you’re for sure not going to have SPELT-3 scores or LSA stats to consider unless you’re purposefully making that happen. The case studies were chosen so that some showed a match between formal and informal results (all within or all below normal limits), whereas others showed a mismatch between formal and informal testing, or overall borderline results. Importantly, SLPs were instructed not to consider the “rules” of where they work when making a diagnosis.

Here were some major findings:

  • Unsurprisingly, when all data pointed in the same direction, SLPs were unanimous in determining that a disorder was or wasn’t present.

  • When there was conflicting information (standard scores pointed one direction, informal measures the other), almost all the SLPs made decisions aligning with the standardized test results.

  • Across cases, almost all the SLPs looked at CELF-P2 and/or PLS-5 scores to help them make a diagnosis, and in most cases they asked for parent/teacher concerns and language sample transcripts as well. A third of the SLPs didn’t ask for LSA at all.

  • Only a few SLPs used SPELT-3 scores, and no one asked for language sample analyses that compared performance to developmental norms.

These results reinforce what we learned in the survey studies linked above: SLPs use a lot of standardized tests, combined with informal measures like parent/teacher reports, and not so much language sampling. What’s troubling here is the under-utilization of tools with a really good track record of diagnosing language disorders accurately (like the SPELT-3 and LSA measures), as well as over-reliance on standardized test scores that we know can be problematic—even when there’s tons of other information available and time/workplace policies aren’t a factor.

The second study, from Selin et al., tapped into a much bigger group of SLPs (over 500!), to ask a slightly different question:


Under ideal conditions, where logistical/workplace barriers are removed, how are SLPs approaching clinical decision-making? And what about the children, or the SLPs themselves, influences those decisions? 

Their method was a little different from the first study. SLPs read a paragraph about each case, including standard scores (TOLD-P:4 or CELF-4, PPVT-4, GFTA-2, and nonverbal IQ) and information about symptoms and functional impairments (use of finiteness, MLU, pragmatic issues, etc.). Rather than giving a diagnosis, the SLPs made eligibility decisions—should the child continue to receive services, and if so, in what area(s) and what type of service (direct, consultation, monitoring, etc.)?

The survey method this team used yielded a TON of information, but we’ll share a few highlights:

  • Freed from the constraints of caseloads and time, SLPs recommended continued service more often than we do in real life. We know that workplace policies and huge caseloads can prevent us from using best practices, but it’s helpful to see that play out in the research. It’s not just you!

  • Six cases were specifically set up to reflect the clinical profile of Specific Language Impairment*, but when determining services and goal areas, SLPs’ choices didn’t consistently align with that profile. Even when a case was consistent with SLI, services weren’t always recommended, and when they were, the goals didn’t necessarily correspond with the underlying deficits of that disorder. As a group, then, our operational knowledge of EBP for language disorders has a lot of room for improvement. Unlike with speech sound disorders, SLPs were not sensitive to clinical symptoms of SLI (tense/agreement errors, decreased MLU) when making eligibility decisions.

  • Yet again, SLPs relied heavily on standardized scores, even when other evidence of impairments was present.  

So what can you do with all this information? First of all, think about what YOU do in your language assessments. What tools do you lean on to guide your decisions, and why? Are you confident that those choices are evidence-based? Second, keep doing what you’re doing right now—learning the research! There is tons of work being done on assessment and diagnosis of language disorders, use of standardized tests, and LSA (hit the links to take a wander through our archives!). Taking a little time here and there to read up can add up to a whole new mindset before you know it.  

*SLI, or developmental language disorder (DLD) with average nonverbal intelligence.


Fulcher-Rood, K., Castilla-Earls, A., & Higginbotham, J. (2019). Diagnostic Decisions in Child Language Assessment: Findings From a Case Review Assessment Task. Language, Speech, and Hearing Services in Schools. doi:10.1044/2019_LSHSS-18-0044

Selin, C. M., Rice, M. L., Girolamo, T., & Wang, C. J. (2019). Speech-Language Pathologists’ Clinical Decision Making for Children With Specific Language Impairment. Language, Speech, and Hearing Services in Schools. doi:10.1044/2018_LSHSS-18-0017

A one–two punch for assessing young Spanish–English learners

Do you serve pre-K or kindergarten-aged kids? Are some/lots/all of them from Hispanic backgrounds and learning Spanish AND English? Mandatory reading right here, friends!

So—a major issue for young, dual-language learners? Appropriate language assessments. We talk about it a lot (plus here, here, here, and here, to name a few). In this new study, the authors compared a handful of assessments to see which could most accurately classify 4- and 5-year-olds (all Mexican–American and dual-language learners) as having typical vs. disordered language.


The single measure with the best diagnostic accuracy was two subtests of the Bilingual English-Spanish Assessment (BESA)—Morphosyntax and Semantics (the third subtest is phonology, which they didn’t use here). But to get even more accurate? Like, sensitivity of 100% and specificity of about 93%? Add in a story retell task (they used Frog, Where Are You?). Sample both Spanish and English, and take the better MLUw of the two. This BESA + MLU assessment battery outperformed other options in the mix (English and Spanish CELF-P2, plus a composite of the two, a parent interview, and a dynamic vocab assessment).

Not familiar with the BESA? It’s a newer test, designed—as the name implies—specifically for children who are bilingual, with different versions (not translated) of subtests in each language. If you give a subtest in both languages, you use the one with the highest score. And before you ask—yes, the test authors believe that monolingual SLPs can administer the BESA, given preparation and a trained assistant.

Now, the researchers here don’t include specific cut scores to work with on these assessments, but you can look at Table 2 in the paper and see the score ranges for the typical vs. disordered language groups. They also note that an MLUw of 4 or less can be a red flag for this group.
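Since that red flag is stated in terms of MLUw, here's the arithmetic as a quick sketch: mean length of utterance in words is total words divided by total utterances, and you keep the better of the two languages. The toy utterances and the Spanish value below are invented for illustration:

```python
# Illustrative only: MLUw = total words / total utterances.
# These toy utterances and the Spanish value are invented for the example.

def mlu_words(utterances):
    return sum(len(u.split()) for u in utterances) / len(utterances)

english = ["the frog jump out", "he look in the boot", "dog fall down"]
spanish_mlu = 4.6  # pretend MLUw from the Spanish retell

best = max(mlu_words(english), spanish_mlu)  # take the better of the two
print(best, best <= 4)  # flag if the better MLUw is 4 or less
```

In practice you'd compute both values from full transcribed retells, of course; the point is just that the comparison uses whichever language the child performed better in.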

The major issue with this study, affecting our ability to generalize what it tells us, is that the sample size was really small—just 30 kids total. So, take these new results on board, but don’t override all that other smart stuff you know about assessing dual-language learners (see our links above for some refreshers if needed). And keep an eye out for more diagnostic studies down the road—you know we’ll point them out when they come!


Lazewnik, R., Creaghead, N. A., Smith, A. B., Prendeville, J.-A., Raisor-Becker, L., & Silbert, N. (2018). Identifiers of Language Impairment for Spanish-English Dual Language Learners. Language, Speech, and Hearing Services in Schools. Advance online publication.

And more...

  • Briley & Ellis found that 52% of children who stutter (CWS; ages 3–17) also had at least one additional developmental disability, compared to just 15% of children who do not stutter (CWNS), per parent report gathered in a large-scale survey. Specifically, CWS had significantly higher odds of having intellectual disability, learning disability, ADHD/ADD, ASD, or another delay than CWNS.

  • Deevy and Leonard found that preschoolers with DLD were less sensitive to number information (i.e. is vs. are) in sentences with fronted auxiliary verbs than younger, typically developing children. “Is the nice little boy running?” is an example of this form (note the auxiliary “is” at the front of the sentence). The authors suggest children with DLD might need explicit instruction to understand tense and agreement markers—in other words, it might not be enough to just practice producing them correctly.

  • Duncan & Lederberg examined the ways that teachers of K–2nd grade deaf/hard of hearing children communicated in the classroom and related it to the students’ language outcomes. They found that explicitly teaching vocabulary predicted improvements in both vocabulary and morphosyntax over the school year, and that reformulating/recasting children’s statements also predicted vocabulary growth.

  • Kelly et al. interviewed teenagers with high-functioning autism, who reported their perceptions of their own social communication skills. They shared individual experiences with challenges with verbal and nonverbal communication, managing challenging feelings during communication with peers, and feelings of isolation and rejection.

  • Mandak et al.* added to the evidence on Transition to Literacy (T2L) features in AAC software with visual scene displays (VSDs). They found that when digital books were programmed with these features—hotspots that, when touched, would speak the target word and display it dynamically—and used in therapy for preschool-aged children with autism, the children made gains in the ability to read targeted sight words.

  • Goodrich et al. administered three subtests of the Test of Preschool Early Literacy (TOPEL) to 1,221 preschool children, including 751 who were Spanish-speaking language-minority children. Despite the TOPEL being written in English, they found that it provided reliable and valid measures of Spanish-speaking preschoolers’ early literacy skills in English.

*Disclosure: Kelsey Mandak is a writer for The Informed SLP. She was not involved in the selection or review of this article.  

Briley, P. M., & Ellis, C., Jr. (2018). The Coexistence of Disabling Conditions in Children Who Stutter: Evidence From the National Health Interview Survey. Journal of Speech, Language, and Hearing Research. Advance online publication. doi:10.1044/2018_JSLHR-S-17-0378

Deevy, P., & Leonard, L. (2018). Sensitivity to morphosyntactic information in preschool children with and without developmental language disorder: A follow-up study. Journal of Speech, Language, and Hearing Research. Advance online publication. doi:10.1044/2018_JSLHR-L-18-0038

Duncan, M. K., & Lederberg, A. R. (2018). Relations Between Teacher Talk Characteristics and Child Language in Spoken-Language Deaf and Hard-of-Hearing Classrooms. Journal of Speech, Language, and Hearing Research. Advance online publication. doi:10.1044/2018_JSLHR-L-17-0475

Goodrich, J. M., Lonigan, C. J., & Alfonso, S. V. (2019). Measurement of early literacy skills among monolingual English-speaking and Spanish-speaking language-minority children: A differential item functioning analysis. Early Childhood Research Quarterly. doi:10.1016/j.ecresq.2018.10.007

Kelly, R., O’Malley, M., & Antonijevic, S. (2018). ‘Just trying to talk to people… it’s the hardest’: Perspectives of adolescents with high-functioning autism spectrum disorder on their social communication skills. Child Language Teaching and Therapy. doi:10.1177/0265659018806754

Mandak, K., Light, J., & McNaughton, D. (2018). Digital Books with Dynamic Text and Speech Output: Effects on Sight Word Reading for Preschoolers with Autism Spectrum Disorder. Journal of Autism and Developmental Disorders. Advance online publication. doi:10.1007/s10803-018-3817-1

And more...

Brinton et al. found that five elementary-age children with DLD rarely described characters’ mental states (responses, plans, emotions) when generating stories and struggled to answer direct questions about characters’ mental states. The authors suggest that children with DLD may have difficulty with social and emotional concepts. 

Chenausky et al. found that baseline phonetic inventory and ADOS scores were most predictive of speech target approximations post-speech therapy in minimally verbal children with autism (more than IQ, language, or age). And that’s not terribly surprising (except the age part—cool that they made good speech gains in older elementary children!). Perhaps the more interesting thing about this study, though, is what they did in speech therapy. It’s called “Auditory-Motor Mapping Training,” and it’s basically the addition of rhythm (tapping drums) and intonation (singing the speech targets) to speech therapy. The researchers are finding that adding these tactile and auditory cues is better than not having them; so it’s worth trying! 

Cooke and Millard asked school-aged children who stutter what they considered to be the most important therapy outcomes. The children reported increased fluency, independence, and confidence, as well as others knowing how to support them and how to make communication situations feel easier. This study serves as a good reminder that stuttering is more than dysfluent speech. The cognitive (thoughts and attitudes) and affective (feelings) components should also play a role in how we evaluate therapy outcomes.  

Dyson et al. taught 20 vocabulary words to elementary-age children with low vocabulary scores using examples, games, and worksheets. After 10 weeks of 20-minute small-group sessions, children had learned five new words on average, significantly more than children in a control group. (Email the authors for free materials!)

Giusto and Ehri found that third-graders with poor decoding and average listening comprehension benefitted from a partial-read aloud test accommodation with pacing (PRAP). When examiners read aloud only directions, proper nouns, and multiple choice questions, the students improved their reading comprehension of the test passages. Although you may not be directly assessing these students, these findings may be helpful if you’re ever in the position to recommend accommodations for this subset of children.

Gough Kenyon et al. found that, compared to typical peers, 10- to 11-year-olds with developmental language disorder (DLD) struggled with making elaborative inferences (drawing on background knowledge not stated) but not cohesive inferences (linking information given) after reading a passage. They suggest targeting elaborative inferencing to boost reading comprehension for children with DLD.

Millard et al. add to the evidence base for Palin Parent–Child Interaction Therapy for young children who stutter, finding a reduction in stuttering severity and improvements in both parent and child attitudes and confidence following a year of participation in the program.

Sabri & Fabiano-Smith analyzed a case study and found that, given early implantation and support in both languages, a bilingual child with cochlear implants can acquire two phonological systems, although likely at a slower rate than other bilingual children.

Using (and maybe struggling with) the Lidcombe Program with your young clients who stutter? Van Eerdenbrugh et al. studied the challenges clinicians have with implementing the program and surveyed experts to come up with solutions.


Brinton, B., Fujiki, M., & Asai, N. (2018). The ability of five children with developmental language disorder to describe mental states in stories. Communication Disorders Quarterly. Advance online publication. doi:10.1177/1525740118779767

Chenausky, K., Norton, A., Tager-Flusberg, H., & Schlaug, G. (2018). Behavioral predictors of improved speech output in minimally verbal children with autism. Autism Research. Advance online publication. doi:10.1002/aur.2006

Cooke, K., & Millard, S. K. (2018). The most important therapy outcomes for school-aged children who stutter: An exploratory study. American Journal of Speech-Language Pathology, 27(3S), 1152.

Dyson, H., Solity, J., Best, W., & Hulme, C. (2018). Effectiveness of a small-group vocabulary intervention programme: Evidence from a regression discontinuity design. International Journal of Language & Communication Disorders, 53, 947–958. doi:10.1111/1460-6984.12404

Giusto, M., & Ehri, L. C. (2018). Effectiveness of a partial read-aloud test accommodation to assess reading comprehension in students with a reading disability. Journal of Learning Disabilities. Advance online publication. doi:10.1177/0022219418789377

Gough Kenyon, S. M., Palikara, O., & Lucas, R. M. (2018). Explaining reading comprehension in children with developmental language disorder: The importance of elaborative inferencing. Journal of Speech, Language, and Hearing Research, 61(10), 2517–2531. 

Millard, S. K., Zebrowski, P., & Kelman, E. (2018). Palin Parent–Child Interaction Therapy: The Bigger Picture. American Journal of Speech–Language Pathology, 27(3S), 1211–1223.

Sabri, M., & Fabiano-Smith, L. (2018). Phonological Development in a Bilingual Arabic–English-Speaking Child With Bilateral Cochlear Implants: A Longitudinal Case Study. American Journal of Speech–Language Pathology. Advance online publication. doi:10.1044/2018_AJSLP-17-0162

Van Eerdenbrugh, S., Packman, A., O'Brian, S., & Onslow, M. (2018). Challenges and Strategies for Speech-Language Pathologists Using the Lidcombe Program for Early Stuttering. American Journal of Speech–Language Pathology, 27(3S), 1259–1272.

Just say "yes" to narrative assessment for ASD

We all have those high-functioning kids with ASD who score in the average range on the CELF but so clearly have language issues. It can be hard to justify services for students like this, especially in school districts where test scores are the main criterion for eligibility. King & Palikara sought a solution to this frequent dilemma by using a variety of different assessment tools.


The researchers tested groups of adolescents with and without high-functioning ASD using the CELF-4, a standardized vocabulary test, a variety of narrative analysis tasks, and the Children’s Communication Checklist (CCC-2), completed by parents and teachers.

Not surprisingly, the adolescents with ASD scored similarly to typically developing peers on the CELF-4 and vocabulary measure. However, students with ASD scored significantly lower on a variety of narrative tasks.

Compared to peers, adolescents with ASD produced narratives that:

  • Were shorter and less grammatically complex
  • Used more limited vocabulary
  • Included less reasoning and fewer explanations
  • Made fewer references to emotion and thoughts
  • Made use of fewer linguistic enrichment devices
  • Contained less conflict resolution and reduced character development
  • Were overall less coherent

Did you get all that?

Basically, when assessing high-functioning students with ASD, especially those on the verge of qualifying, do yourself a favor and include some kind of narrative measure. I know, I know—narrative analysis can be complex and time-consuming, and the authors note this as well. But using narratives in assessment can give us great information about specific areas of difficulty that the CELF just doesn’t address. Besides, narrative assessment results translate so easily into IEP goals, so it will be worth your while. Check out the original article for more details on how they used and analyzed narrative assessment!


King, D., & Palikara, O. (2018). Assessing language skills in adolescents with autism spectrum disorder. Child Language Teaching and Therapy, 34(2), 101–113.

School-based assessments: Why do we do what we do?


Fulcher-Rood et al. interviewed school-based SLPs across the United States about how we choose assessment tools and diagnose/qualify our students. They wanted to understand not just which tools we use, but why we choose them, what “rules” we follow when we make diagnostic decisions, and what external factors affect those decisions. We’ve reviewed some other surveys of SLPs’ current assessment practices in the past—on the use of LSA, and on methods we’re using to assess bilingual clients—and these findings are kinda similar. There’s a lot of detail in the survey, but we’ll just focus on a couple things here.

  • We give a LOT of standardized tests, and qualify most of our students for service on the basis of those scores, with reference to some established cut-off (e.g. 1.5 SD below the mean)
  • We don’t do a ton of language sample analysis (at least the good ol’ record-transcribe-analyze variety)
  • We use informal measures to fill in the gaps and show academic impacts, but those results are less important when deciding who qualifies for service

None of this is likely to surprise you, but given what we know about the weaknesses of standardized tests (especially given diversity in home languages, dialects, and SES), the arbitrary nature of most cut-off scores, and the many advantages of LSA and other non-standard measures… it’s a problem.

So, what barriers are we up against when it comes to implementation of evidence-based assessment practices? First—let’s say it all together—TIME. Always time. Standardized tests are easy to pull, fairly quick to administer and score, and you often have a handy dandy report template to follow. Besides that, we’re often subject to institutional guidelines or policies that require (or *seem* to require) standard scores to qualify students for services.

None of the SLPs in the survey mentioned that research was informing their selection of assessment tools or diagnostic decisions. That doesn’t necessarily mean none of them consider the research—they just didn’t bring it up. But guys! We need to be bringing it up! And by “we,” I mean YOU! The person taking your all-too-limited time to read these reviews. The authors of the study pointed out (emphasis mine) that “there are differences between policies (what must be done) and guidelines (how can it be done)... potentially, school-based SLPs interpret some of the guidelines as mandatory, instead of as suggested.” Maybe there’s some wiggle room that we aren’t taking advantage of. We can speak up, evaluation by evaluation, sharing our knowledge of research and best practices.

It all boils down to this: “While it is important for SLPs to adhere to the policies set forth by their employment agency, it is equally important for SLPs to conduct evaluations guided by best practice in the field. SLPs may need to advocate for policy changes to ensure that evidence-based practice is followed.”

Fulcher-Rood, K., Castilla-Earls, A. P., & Higginbotham, J. (2018). School-Based Speech-Language Pathologists’ Perspectives on Diagnostic Decision Making. American Journal of Speech-Language Pathology. Advance online publication.

Language of school and SES matter in standardized testing of bilinguals

Assessing children from diverse language backgrounds can be a challenge, but at least for Spanish speakers, SLPs have a decent array of resources available—including a growing number of standardized tests. The CELF–4S is one of these, designed to diagnose language disorders in Spanish speakers (mono- or bilingual) from 5–21 years old. It’s not just a Spanish translation of the English CELF, but is written specifically for speakers of Spanish. Great, right?


The problem is that the norming sample for this test was somewhat smaller than what’s recommended, and so the norms in the test manual may not be valid for all groups. Previously, there have been disagreements between the test creators and other researchers about whether you need separate norms for monolingual and bilingual speakers (in the test manual, they’re together).

This study focused on children from 5–7 years old with multiple risk factors for underperformance on standardized language tests. These included low SES (low-income family and parents with lower levels of education) and attending an English-only school, which favors English to the detriment of the home language. The researchers gave the CELF–4S to a huge group (656) of these kids, a lot more per age bracket than the test was originally normed on. The average Core Language Score was 83.57: more than one standard deviation below the mean, and below the cut-off score of 85 that the manual gives for identifying a language disorder. In Table 3, you can see how the results break down by subtest and age group. And, yes. You read that right. Given the published test norms, over half of these kids would appear to have DLD.

Wow. This is clearly not okay. So what do we do?

It looks like we need separate test norms for low-SES children in English schools. The authors used a subset of the original sample (still large at 299, 28 of whom had been found to have a language disorder via multiple methods of assessment) to look into the test’s diagnostic accuracy. That cut-off score of 85? Yeah, it resulted in so many false positives (specificity of only 65%) that it wasn’t clinically useful. The researchers computed an adjusted cut-off score of 78 for this group, which has acceptable diagnostic sensitivity and specificity (85% and 80%, respectively).
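To see why lowering the cut-off score helps this group, it can help to play with the tradeoff directly: a higher cut-off catches more children with DLD (sensitivity) but misclassifies more typically developing children (specificity). Here's a sketch with invented score lists (none of these numbers are from Barragan et al.):

```python
# Illustrative only: sweep candidate cut-offs and report sensitivity
# (DLD scores at/below the cut-off) and specificity (TD scores above it).
# The score lists are invented, not the study's data.

dld = [62, 68, 70, 74, 76, 77, 80, 83]          # children with DLD
td  = [72, 75, 79, 81, 84, 86, 88, 90, 95, 99]  # typically developing

def accuracy(cutoff):
    sens = sum(s <= cutoff for s in dld) / len(dld)
    spec = sum(s > cutoff for s in td) / len(td)
    return sens, spec

for cutoff in (85, 78):
    sens, spec = accuracy(cutoff)
    print(cutoff, f"sensitivity={sens:.0%} specificity={spec:.0%}")
```

With these toy numbers, dropping the cut-off from 85 to 78 trades some sensitivity for a big specificity gain, which mirrors the direction (though not the exact values) of the adjustment the authors report.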

The big takeaway is this: Use the CELF–4S very cautiously. Understand the limitations of the normative sample used to standardize the test. If you are working with kids matching the profile of this paper’s sample (5–7 years old, from low-SES families with lower maternal education, and attending English-only schools), keep that adjusted cut-off score of 78 in mind. And above all, remember that standardized testing alone is not a good way to assess young English learners.


Barragan, B., Castilla-Earls, A., Martinez-Nieto, L., Restrepo, M. A., & Gray, S. (2018). Performance of Low-Income Dual Language Learners Attending English-Only Schools on the Clinical Evaluation of Language Fundamentals–Fourth Edition, Spanish. Language, Speech, and Hearing Services in Schools. Advance online publication. doi: 10.1044/2017_LSHSS-17-0013.

Throwback (2006): The evidence-based way to interpret standardized language test scores

When you give a standardized language test to a child, how do you interpret his or her standard score to decide if a disorder is present, or if services should be offered? Is the magic number 85 (1 standard deviation below the mean)? Does your district or state give you some other number to use? Back in 2006, Spaulding et al. schooled us all by explaining that it depends on the test. As in, there is no universal magic number. Instead, we need more information to determine an appropriate cutoff score for each test. Using the wrong number has serious implications: “If the cutoff criteria do not match the test selected, typically developing children may be misdiagnosed as language impaired, and children with language impairments [language disorder] may go undiagnosed.”


The authors walk us through how to find and interpret values for sensitivity and specificity at a specific cutoff score listed in the test’s manual or in a research article. Remember, sensitivity tells us the percentage of children who have language disorder who were correctly identified by the test as having language disorder. Specificity tells us the percentage of children who are typically developing who were correctly identified by the test as typically developing. Ideally, both should be 80% or higher. If a test lists sensitivity and specificity values for multiple cutoff scores, we use the one that balances both. If sensitivity and specificity aren’t listed, we can look at the mean group difference between the groups with and without language disorder, but this evidence is not as strong.
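If it helps to see the arithmetic spelled out, here’s a minimal sketch of how those two values are computed from a validation study’s counts. The counts below are hypothetical, loosely mirroring the subsample sizes in the Barragan et al. study above (28 children with DLD out of 299); they are not the published data:

```python
# Sensitivity = children with DLD correctly flagged / all children with DLD.
# Specificity = typical children correctly cleared / all typical children.

def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of children with language disorder that the test catches."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of typically developing children that the test clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 28 children with DLD (24 flagged, 4 missed) and
# 271 typically developing children (217 cleared, 54 false positives).
print(f"sensitivity: {sensitivity(24, 4):.2%}")    # sensitivity: 85.71%
print(f"specificity: {specificity(217, 54):.2%}")  # specificity: 80.07%
```

Both hypothetical values clear the 80% bar the authors recommend; notice that a stricter cut-off would catch fewer children with DLD (lower sensitivity) while clearing more typical children (higher specificity), which is exactly the trade-off behind choosing a cut-off score.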

The authors listed sensitivity and specificity values and cutoff scores for the tests they reviewed (see Table 4). The bad news is that many of the language tests reviewed by Spaulding et al. have been updated since this article was published. This means that we might have to dive into a test manual or the literature to find sensitivity and specificity values for a newer test like the CELF-5 (start with Table 10 in this recent article). 

Even if sensitivity and specificity are sufficient, we still need to make sure that the test has adequate reliability and validity and that the normative sample included children like the client we’re testing (considering things like dialect and SES). The authors say that evaluating these things is important, but that it isn’t worth our time if sensitivity and specificity are lacking (see Figure 6 for a decision tree).

Overall, this article is a good reminder of the potential pitfalls of making diagnostic decisions based on standardized tests alone. It’s also a good one to have in your pocket if you want to challenge your state’s or district’s policy for standardized test cutoff scores.


Spaulding, T. J., Plante, E., & Farinella, K. A. (2006). Eligibility criteria for language impairment: Is the low end of normal always appropriate? Language, Speech, and Hearing Services in Schools, 37(1), 61–72. doi: 10.1044/0161-1461(2006/007).