CEGS N-GRID Natural Language Processing (NLP) Challenge and Workshop

November 11, 2016 (All day)
Hilton Chicago
720 South Michigan Avenue
Chicago, IL 60605
312-922-4400

Data will be released in November 2017. Selected publications below:

Dai HJ, Su EC, Uddin M, Jonnagaddala J, Wu CS, Syed-Abdul S. Exploring associations of clinical and social parameters with violent behaviors among psychiatric patients. J Biomed Inform. 2017 Aug 16. pii: S1532-0464(17)30188-0. doi: 10.1016/j.jbi.2017.08.009. [Epub ahead of print] PubMed PMID: 28822857.

Clark C, Wellner B, Davis R, Aberdeen J, Hirschman L. Automatic classification of RDoC positive valence severity with a neural network. J Biomed Inform. 2017 Jul 8. pii: S1532-0464(17)30161-2. doi: 10.1016/j.jbi.2017.07.005. [Epub ahead of print] PubMed PMID: 28694118.

Tao C, Filannino M, Uzuner Ö. Prescription extraction using CRFs and word embeddings. J Biomed Inform. 2017 Aug;72:60-66. doi: 10.1016/j.jbi.2017.07.002. Epub 2017 Jul 4. PubMed PMID: 28684255; PubMed Central PMCID: PMC5551970.

Buchan K, Filannino M, Uzuner Ö. Automatic prediction of coronary artery disease from clinical narratives. J Biomed Inform. 2017 Aug;72:23-32. doi: 10.1016/j.jbi.2017.06.019. Epub 2017 Jun 27. PubMed PMID: 28663072; PubMed Central PMCID: PMC5592829.

Stubbs A, Filannino M, Uzuner Ö. De-identification of psychiatric intake records: Overview of 2016 CEGS N-GRID shared tasks Track 1. J Biomed Inform. 2017 Jun 11. pii: S1532-0464(17)30134-X. doi: 10.1016/j.jbi.2017.06.011. [Epub ahead of print] PubMed PMID: 28614702.

Dehghan A, Kovacevic A, Karystianis G, Keane JA, Nenadic G. Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes. J Biomed Inform. 2017 Jun 7. pii: S1532-0464(17)30128-4. doi: 10.1016/j.jbi.2017.06.005. [Epub ahead of print] PubMed PMID: 28602908.

Scheurwegs E, Sushil M, Tulkens S, Daelemans W, Luyckx K. Counting trees in Random Forests: Predicting symptom severity in psychiatric intake reports. J Biomed Inform. 2017 Jun 7. pii: S1532-0464(17)30130-2. doi: 10.1016/j.jbi.2017.06.007. [Epub ahead of print] PubMed PMID: 28602906.

Liu Z, Tang B, Wang X, Chen Q. De-identification of clinical notes via recurrent neural network and conditional random field. J Biomed Inform. 2017 Jun 1. pii: S1532-0464(17)30122-3. doi: 10.1016/j.jbi.2017.05.023. [Epub ahead of print] PubMed PMID: 28579533.

Goodwin TR, Maldonado R, Harabagiu SM. Automatic recognition of symptom severity from psychiatric evaluation records. J Biomed Inform. 2017 May 30. pii: S1532-0464(17)30119-3. doi: 10.1016/j.jbi.2017.05.020. [Epub ahead of print] PubMed PMID: 28576748.

Posada JD, Barda AJ, Shi L, Xue D, Ruiz V, Kuan PH, Ryan ND, Tsui FR. Predictive modeling for classification of positive valence system symptom severity from initial psychiatric evaluation records. J Biomed Inform. 2017 May 29. pii: S1532-0464(17)30118-1. doi: 10.1016/j.jbi.2017.05.019. [Epub ahead of print] PubMed PMID: 28571784.

Liu Y, Gu Y, Nguyen JC, Li H, Zhang J, Gao Y, Huang Y. Symptom severity classification with gradient tree boosting. J Biomed Inform. 2017 May 22. pii: S1532-0464(17)30110-7. doi: 10.1016/j.jbi.2017.05.015. [Epub ahead of print] PubMed PMID: 28545836.

Filannino M, Stubbs A, Uzuner Ö. Symptom severity prediction from neuropsychiatric clinical records: Overview of 2016 CEGS N-GRID shared tasks Track 2. J Biomed Inform. 2017 Apr 25. pii: S1532-0464(17)30087-4. doi: 10.1016/j.jbi.2017.04.017. [Epub ahead of print] PubMed PMID: 28455151.



The 2016 Centers of Excellence in Genomic Science (CEGS) Neuropsychiatric Genome-Scale and RDoC Individualized Domains (N-GRID) Challenge, also known as the RDoC for Psychiatry Challenge, aims to extract symptom severity from neuropsychiatric clinical records. RDoC (Research Domain Criteria) is a framework developed under the aegis of the National Institute of Mental Health (NIMH) that facilitates the study of human behaviour, from normal to abnormal, in various domains. The goal of the Challenge is to classify a patient's symptom severity in a domain based on the information in the initial psychiatric evaluation recorded as part of that individual's electronic health record.

This Challenge will be conducted on initial psychiatric evaluations (one per patient) that have been fully de-identified and scored by clinical experts in a symptom domain. The data for this task are derived from actual clinical records used as part of this project and will be released under a Rules of Conduct and Data Use Agreement.

The evaluation for the various NLP tracks will be conducted using withheld test data. Participating teams are asked to stop development as soon as the test data are downloaded. Each team may upload up to three system runs for each task; the output of each run must be submitted for scoring in the exact format of the ground truth annotations provided by the organizers.

Participants are asked to submit a 500-word abstract describing their methodologies. Abstracts may also include a graphical summary of the proposed architecture. This document should not exceed 2 pages with 1.5 line spacing and a 12-point font. The authors of either the top-performing systems or particularly novel approaches will be invited to present or demonstrate their systems at the Workshop. A special issue of an appropriate journal will be organized following the Workshop so that the NLP community at large may benefit from this work.


Organizing Committee:

Ozlem Uzuner, co-chair, SUNY at Albany

Amber Stubbs, co-chair, Simmons College

Michele Filannino, co-chair, SUNY at Albany

Tianxi Cai, Harvard T.H. Chan School of Public Health

Susanne Churchill, Harvard Medical School

Isaac Kohane, Harvard Medical School

Thomas McCoy, MGH, Harvard Medical School

Roy Perlis, MGH, Harvard Medical School

Peter Szolovits, MIT

Uma Vaidyanathan, NIMH

Philip Wang, American Psychiatric Association


Additional information, Challenge milestones, and DUAs are available here.