Combining genetic and behavioral predictors of 11-year language outcome
Survey
LSAC
Author(s)
Loretta Gasparini
loretta.gasparini@mcri.edu.au
Murdoch Children's Research Institute
0000-0002-1561-5572
Daisy A Shepherd
daisy.shepherd@unimelb.edu.au
Murdoch Children's Research Institute
0000-0001-8540-0473
katherine.Lange@mcri.edu.au
Murdoch Children's Research Institute
0000-0002-8175-6285
Ellen Verhoef
jing.wang@mcri.edu.au
Murdoch Children’s Research Institute
0000-0001-5701-476X
Edith L. Bavin
E.Bavin@latrobe.edu.au
School of Psychology and Public Health, La Trobe University
N/A
Beate St Pourcain
Beate.StPourcain@mpi.nl
Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
0000-0002-4680-3517
Angela T. Morgan
melissa.wake@mcri.edu.au
Murdoch Children’s Research Institute
0000-0001-7501-9257
Date Issued
2025-11-13
Pages
9
Keywords
Language development
Language disorders
Genetics
Polygenic score
Sensitivity and specificity
Longitudinal studies
Machine learning
Abstract
Background
Rapid population-level identification of language disorders could help provide care to young children to improve their outcomes. Two previous studies identified and replicated up to six parent-reported items that predicted 11-year language outcome with ≥71 % sensitivity and specificity. Here, we assess whether including genetic propensity for toddlerhood vocabulary improves predictive accuracy.
Method
The Early Language in Victoria Study (ELVS) recruited 1910 8-month-olds in Melbourne in 2003–2004. The Longitudinal Study of Australian Children (LSAC) recruited 5107 0–1-year-olds across Australia in 2004. Both collected parent-reported items at 2–3 years, a comparable 11-year language outcome (the Core Language score or Recalling Sentences subtest of the Clinical Evaluation of Language Fundamentals, CELF-4), and biospecimens for genotyping. We derived polygenic scores capturing participants’ genetic propensity for parent-reported 24–38-month vocabulary. We calculated univariate associations with continuous language outcomes. We used the ensemble method SuperLearner to estimate how accurately the parent-reported predictors and polygenic scores predict low 11-year language outcome (>1.5 standard deviations below the mean) in each cohort.
Results
Language outcome was available for 839 ELVS and 1441 LSAC participants. Polygenic scores accounted for little variance in continuous language outcomes (R² < 1.3 %). Adding polygenic scores to the predictor sets increased the accuracy of predicting language outcome by up to 7 %, but inconsistently between analyses.
Conclusions
Polygenic scores derived for toddlerhood vocabulary did not meaningfully improve predictive accuracy of individuals’ language outcome when added to the phenotypic predictor set. Presently, parent-reported measures or clinician observation appear best for predicting language outcome at this age.
ISBN
0165-1781
Type
Journal Articles
