Open Collections
UBC Research Data
Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application
French, Leon; Pavlidis, Paul
Description
We describe the WhiteText project and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the more than 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/.
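The recall and precision figures quoted in the description can be combined into a single F1 score, and the corpus counts imply a size for the original corpus. The short Python sketch below works through that arithmetic. It is illustrative only: the F1 value is not reported in the description, the function and variable names are hypothetical, and the implied original corpus size assumes the merged corpus is a plain sum of the original and new statements with no overlap.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the description.
precision = 0.51          # 51% precision
recall = 0.67             # 67% recall
new_statements = 2111     # statements added by the new corpus
merged_statements = 5208  # statements in the merged corpus

# Derived value, not stated in the description.
f1 = f1_score(precision, recall)

# Assumption: merged = original + new, with no overlapping statements.
implied_original = merged_statements - new_statements

print(f"F1 = {f1:.2f}")
print(f"Implied original corpus size = {implied_original}")
```

Under these assumptions, the harmonic mean sits between the two quoted figures and closer to the lower one, which is why F1 is often preferred over a simple average when precision and recall differ.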
Item Metadata
Title | Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application
Date Created | 2015
Date Issued | 2019-03-11
Date Available | 2019-03-11
Provider | University of British Columbia Library
License | CC0 Waiver
DOI | 10.14288/1.0363908
Aggregated Source Repository | Dataverse