Natural Language Processing for Health-Related Texts

Biomedical Informatics

Abstract

Narrative text is an important component of communication in health care. It includes patient-specific information in health record reports and notes, as well as general biomedical knowledge in papers, textbooks, and web resources. Information can be retrieved from such sources with keyword indexing, but this approach fails to distinguish texts that discuss a topic from those that merely mention it. Natural language processing (NLP) techniques analyze narrative text to make these distinctions and even to find instances where a concept is discussed but not explicitly mentioned. Earlier grammar-based methods parsed sentences into their syntactic structure to infer semantics. Machine-learning methods are now being applied that infer semantics through techniques such as statistical pattern recognition. Regular expressions, rule bases, neural networks, and word embeddings are among the approaches that are improving the ability of automated systems to carry out language understanding, information extraction, and question answering.
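The limitation of keyword indexing noted in the abstract can be illustrated with a minimal sketch. It is not the chapter's method: the negation cue list, the function names `keyword_match` and `asserted_mention`, and the two sample notes are all assumptions made for illustration. The sketch contrasts a plain keyword lookup with a small regular-expression rule that ignores a term when a negation cue precedes it in the same sentence.

```python
import re

# Hypothetical negation cues, chosen for illustration only.
NEGATION_CUES = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)


def keyword_match(text, term):
    """Plain keyword lookup: true whenever the term appears at all."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def asserted_mention(text, term):
    """True only if some sentence has the term with no negation cue before it."""
    pattern = r"\b" + re.escape(term) + r"\b"
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        match = re.search(pattern, sentence, re.IGNORECASE)
        if match and not NEGATION_CUES.search(sentence[:match.start()]):
            return True
    return False


# Invented sample notes (not taken from any real record).
notes = [
    "Chest X-ray shows pneumonia in the right lower lobe.",
    "Patient denies fever. No evidence of pneumonia.",
]

for note in notes:
    print(keyword_match(note, "pneumonia"), asserted_mention(note, "pneumonia"))
```

Both notes match the keyword, but only the first asserts the finding. A rule this simple misses many cases (for example, cues that follow the term, or uncertainty), which is part of the motivation for the richer methods the abstract lists.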

Author information

Corresponding author

Correspondence to Noémie Elhadad.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Demner-Fushman, D., Elhadad, N., Friedman, C. (2021). Natural Language Processing for Health-Related Texts. In: Shortliffe, E.H., Cimino, J.J. (eds) Biomedical Informatics. Springer, Cham. https://doi.org/10.1007/978-3-030-58721-5_8

  • DOI: https://doi.org/10.1007/978-3-030-58721-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58720-8

  • Online ISBN: 978-3-030-58721-5

  • eBook Packages: Medicine, Medicine (R0)
