Using Data Science to Understand Suicide – Management of Suicidality

Posted on: February 24, 2020
Last Updated: October 10, 2020

This article is based on the talk by Dr Rina Dutta at RCPsychIC 2019. Dr Dutta is a Consultant Psychiatrist at the Affective Disorders Service, Maudsley Hospital and a Clinician Scientist Fellow of the Academy of Medical Sciences/the Health Foundation.

  • Data science has many roles in mental health research, including improving detection and the ability to screen high-risk patients. Many studies to date have been case studies of broad risk factors, e.g. bereavement, chronic illness, gender and family history. Much less is known about the dynamic changes within an individual over the course of their life.
  • It is difficult to accurately quantify the risk of death for individuals within a small subgroup of high-risk people with mental health problems. Doing so requires a useful prediction tool and a move from traditional statistical modelling towards Machine Learning (ML) models. Existing risk assessment tools are not tailored to the individual and often rely on summative or threshold scores with no real predictive value, so data-driven methods that work on real-world data are needed.[Velupillai et al. 2019]
  • Mental health signs and symptoms are not well captured in the structured data of Electronic Health Records (EHRs). Much of the information is written as free text, in high volume and with considerable variability in the quality of the records kept.[Roberts 2017]
  • The EHR can also contain bias, for example by gender or diagnosis, and clinicians differ in how they use verbatim quotation of the patient's experience. The way clinicians phrase suicide attempts, through their use of adverbs, lexical markers, syntax and grammar, poses a further challenge.
  • The association between language use and later risk of suicide has been investigated. A data-driven study of suicidality applied Natural Language Processing (NLP) and standard Cox regression modelling, and found a 30% reduction in the risk of suicide when the clinician expressed positive sentiment during the admission.[McCoy et al. 2016]
  • Another study, of suicide risk in US Veterans, used NLP to identify changes in referencing: how patients are referred to in the EHR text and how many third-person pronouns are used in the year prior to suicide. The relative frequency of distancing language increased prior to suicide.[Leonard Westgate et al. 2015]
  • Studies that used Machine Learning (ML) to predict suicidal behaviour [Barak-Corren et al. 2017] and risk of suicide [Walsh et al. 2018] used a Bayesian ML and a Random Forest approach, respectively. The former study used longitudinal EHR data from structured fields but no NLP. Sensitivity was only 33-45%, but the model was able to predict the risk of suicide 3-4 years in advance in some individuals. The latter study showed a higher sensitivity of 94-96%, and accuracy improved as the prediction window shortened from 720 days to 7 days before the suicide attempt.
  • The e-HOST-IT (Electronic health records to predict HOspitalised Suicide attempts: Targeting Information Technology) study is part of the Clinical Record Interactive Search (CRIS) project at the South London and Maudsley NHS Foundation Trust.
  • The study aims to predict hospitalised suicide attempts using information technology solutions. EHRs linked with hospital statistics data and mortality data will be used to understand the temporal factors that precede a suicide attempt, and ultimately to provide visual feedback for clinical support.[Downs et al. 2018]
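
To make the idea of "relative frequency of distancing language" more concrete, the short Python sketch below computes the share of words in a note that are third-person pronouns. It is only an illustration: the pronoun list, the function name `third_person_pronoun_rate` and the two example notes are assumptions invented here, and the method used in the cited Veterans study is not described in this article.

```python
import re

# Assumption: a small, hand-picked pronoun list for illustration only.
# The lexicon used in the cited study is not given in this article.
THIRD_PERSON_PRONOUNS = {
    "he", "she", "him", "her", "his", "hers",
    "they", "them", "their", "theirs",
}


def third_person_pronoun_rate(note: str) -> float:
    """Return the fraction of word tokens in a note that are third-person pronouns."""
    # Lowercase the text and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", note.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in THIRD_PERSON_PRONOUNS)
    return hits / len(tokens)


# Hypothetical notes, invented for this example (not real patient data).
early_note = "John said the week went well. John plans to see family."
late_note = "He reports low mood. She noted that he missed his appointment."

print(third_person_pronoun_rate(early_note))
print(third_person_pronoun_rate(late_note))
```

Tracking a rate like this across successive notes for the same patient would show whether references shift from names towards pronouns over time. A real analysis would need a validated lexicon and would have to distinguish pronouns that refer to the patient from those that refer to other people.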


References