Harnessing Acoustic Analysis in Primary Care to Detect Cognitive Impairment: A Novel Screening Approach

Highlight

This diagnostic study evaluates the feasibility and accuracy of machine learning (ML) models that analyze acoustic features from brief patient-primary care physician conversations to identify cognitive impairment (CI). Using data from 966 patients across two urban centers, the study reports moderate discrimination, with an AUROC of approximately 0.73 in both the internal and external cohorts. Key acoustic indicators include pitch, timing, and variability, supporting the case for a passive, scalable screening tool in routine clinical care.

Study Background

Cognitive impairment, often under-recognized in primary care, poses significant challenges due to its insidious onset and heterogeneous manifestations. Early stages of mild cognitive impairment and dementia frequently go undetected owing to limited clinical time, lack of standardized screening, and resource constraints. Detecting CI early allows prompt intervention, which can improve patient outcomes and guide care planning.

Traditional screening tools, such as the Montreal Cognitive Assessment (MoCA), require clinician time and patient cooperation, limiting widespread adoption. Advances in artificial intelligence and speech processing provide new opportunities for passive, automated screening. Prior studies have suggested that speech disruptions and prosodic changes correlate with cognitive decline. This study leverages cutting-edge acoustic feature extraction and ML methods to evaluate speech characteristics from routine clinical conversations for CI detection.

Study Design

This multi-center diagnostic study was conducted from August 2020 through December 2021 in primary care settings in New York City and Chicago, enrolling English-speaking patients aged 55 years and older without prior dementia or mild CI diagnoses. Audio recordings of routine patient-physician encounters were obtained with portable devices.

From the audio recordings, multiple 30-second speech segments were extracted for analysis. Acoustic features were derived through both foundation AI models (Whisper, HuBERT, wav2vec 2.0) and expert-defined feature sets including eGeMAPS and prosodic parameters.
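As a rough illustration of the segmentation step, the Python sketch below cuts a mono waveform into consecutive 30-second windows. The function name, the non-overlapping windows, and the choice to drop a trailing partial window are assumptions made for illustration; the source does not describe how segments were selected, and feature extraction with Whisper, HuBERT, wav2vec 2.0, or eGeMAPS is not shown.

```python
from typing import List, Sequence


def split_into_segments(
    samples: Sequence[float],
    sample_rate: int,
    segment_seconds: float = 30.0,
) -> List[Sequence[float]]:
    """Cut a mono waveform into consecutive fixed-length segments.

    Assumption: windows do not overlap and a trailing partial
    window is discarded. The study's actual rule is not stated.
    """
    seg_len = int(sample_rate * segment_seconds)
    if seg_len <= 0:
        raise ValueError("sample_rate and segment_seconds must be positive")
    n_full = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]


if __name__ == "__main__":
    # 70 seconds of silence at a hypothetical 16 kHz sampling rate.
    waveform = [0.0] * (70 * 16_000)
    segments = split_into_segments(waveform, sample_rate=16_000)
    print(len(segments), "full 30-second segments")
```

Each returned segment would then be passed to whichever feature extractor is in use, yielding one feature vector per segment.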

The primary outcome of CI was defined as a Montreal Cognitive Assessment score at least one standard deviation below age- and education-adjusted norms. ML classifiers were trained on the extracted features to distinguish patients with versus without CI. Performance metrics included area under the receiver operating characteristic curve (AUROC) and maximum F1 score (Fmax). An external validation cohort from Chicago tested generalizability.
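The outcome definition and the two performance metrics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the function names are hypothetical, the label function assumes a MoCA score already converted to an age- and education-adjusted z-score, AUROC is computed by direct pairwise comparison, and Fmax is taken as the best F1 over all observed score thresholds.

```python
from typing import Sequence


def label_ci(moca_z: float, cutoff: float = -1.0) -> int:
    """Return 1 (CI) if the adjusted MoCA z-score is at least 1 SD below the norm."""
    return 1 if moca_z <= cutoff else 0


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random CI case scores higher than a random non-case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f_max(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Maximum F1 over all thresholds, predicting CI when score >= threshold."""
    best = 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(labels, scores) if s < t and y == 1)
        denom = 2 * tp + fp + fn
        if denom:
            best = max(best, 2 * tp / denom)
    return best


if __name__ == "__main__":
    # Toy data, not study data.
    y = [label_ci(z) for z in (0.3, -0.2, -1.4, -2.0)]
    p = [0.10, 0.40, 0.35, 0.80]
    print(auroc(y, p), f_max(y, p))
```

Because Fmax depends on class balance, an Fmax near 0.5 at a 21% prevalence is not directly comparable to F1 values from balanced datasets.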

Key Findings

The study enrolled 787 patients in the primary cohort and 179 in the external validation cohort, totaling 966 participants with a mean age of 67.2 years and a 21% prevalence of cognitive impairment.

Among acoustic feature models, those based on Whisper-derived features achieved the highest discrimination: AUROC of 0.733 (95% CI, 0.714–0.752) with Fmax of 0.502 (95% CI, 0.471–0.533) in the internal cohort, and AUROC of 0.727 (95% CI, 0.714–0.740) with Fmax of 0.459 (95% CI, 0.441–0.477) in the external cohort. AUROC was therefore nearly identical across sites, while Fmax was somewhat lower in the external cohort.

Model interpretability analyses identified key acoustic features predictive of CI, including pitch variability, timing irregularities, and prosodic changes, consistent with known speech alterations in cognitive decline.

When applied as a screening tool, the algorithm achieved a sensitivity of 68.2% (95% CI, 61.8%–74.6%), specificity of 63.6% (95% CI, 59.8%–67.4%), and positive predictive value of 30.4% (95% CI, 28.7%–32.1%) on the held-out cohort. In practical terms, the tool would flag roughly two-thirds of patients with CI, but about seven of every ten flagged patients would not have CI on MoCA, so a positive result is best read as a prompt for formal cognitive evaluation rather than as a diagnosis.
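To make the relationship between these screening metrics concrete, the sketch below computes them from confusion-matrix counts and derives positive predictive value from sensitivity, specificity, and prevalence using Bayes' rule. The function names and the example counts are hypothetical. Applying the overall 21% prevalence is also an assumption, since the source does not state the prevalence in the held-out cohort.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV via Bayes' rule: true positives divided by all positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical counts, not study data.
    print(screening_metrics(tp=30, fp=40, tn=110, fn=20))
    # Reported sensitivity and specificity at the overall 21% prevalence (an assumption).
    print(round(ppv_from_prevalence(0.682, 0.636, 0.21), 3))
```

At a 21% prevalence the reported sensitivity and specificity imply a PPV of about 33%, slightly above the reported 30.4%. That gap would be consistent with a somewhat lower CI prevalence in the held-out cohort, though the source does not confirm this.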

Expert Commentary

This study demonstrates a promising and innovative approach to address the critical gap in cognitive impairment detection in primary care. By leveraging routinely captured audio and state-of-the-art ML acoustic analysis, clinicians may gain a low-burden method to screen older adults during everyday visits without additional patient effort or significant workflow disruption.

Limitations include the exclusive focus on English-speaking patients and reliance on MoCA as the reference standard, which itself has limitations. Further research is needed to assess applicability in diverse linguistic and cultural populations, and to refine models for improved specificity. Integration with electronic health records and real-time decision support could enhance utility.

The identified acoustic markers (pitch, timing, and variability) are biologically plausible correlates of CI-related speech alterations and may reflect changes in motor planning, linguistic processing, and executive function. This plausibility lends support to the approach.

Conclusion

Passive, machine learning-based acoustic analysis of conversations between patients and primary care clinicians offers a feasible and scalable means to identify cognitive impairment. With similar discrimination (AUROC of approximately 0.73) in the internal and external cohorts, this approach holds potential to enhance early detection of cognitive disorders in routine primary care settings, facilitating timely diagnosis and management.

Ongoing research should focus on broadening linguistic applicability, optimizing model precision, and assessing clinical impact to enable translation into practice. Meanwhile, acoustic-based screening can be considered a complement to established cognitive assessments, not a replacement, with the aim of improving detection rates; any effect on patient outcomes remains to be demonstrated.

Funding and Registration

The study details do not explicitly mention funding sources or clinical trial registration numbers. Future reports should clarify these for transparency and reproducibility.
