Natural Language Processing (NLP) and text mining are research fields aimed at exploiting rich knowledge resources to understand, extract, and retrieve information from unstructured text. However, little consideration has been given to generating meaningful natural language through inference and assumption. We at Goddard are redefining the limits of natural language processing by combining deep learning models with literature-based discovery to generate new interdisciplinary scientific knowledge and hypotheses.
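The classic form of literature-based discovery is Swanson's ABC model: if one body of literature links concept A to concept B, and a disjoint literature links B to C, a hidden A-C connection can be hypothesized. The sketch below is a minimal, hypothetical illustration of that inference step over toy co-occurrence pairs (the term names echo Swanson's well-known fish-oil/Raynaud's example; they are not drawn from any Goddard system):

```python
from itertools import product

def abc_discovery(ab_links, bc_links):
    """Swanson-style ABC inference: if term A co-occurs with B in one
    literature and B co-occurs with C in another, hypothesize a
    hidden A-C connection bridged by B."""
    hypotheses = set()
    for (a, b1), (b2, c) in product(ab_links, bc_links):
        if b1 == b2 and a != c:
            hypotheses.add((a, b1, c))
    return sorted(hypotheses)

# Toy co-occurrence pairs mined from two disjoint literatures (hypothetical).
ab = {("fish oil", "blood viscosity"), ("fish oil", "platelet aggregation")}
bc = {("blood viscosity", "Raynaud's disease")}
print(abc_discovery(ab, bc))
# [('fish oil', 'blood viscosity', "Raynaud's disease")]
```

Real systems replace the toy pairs with relations extracted from millions of abstracts and rank the candidate A-C hypotheses, but the bridging logic is the same.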
Pre-training large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pre-training efforts focus on general-domain corpora, such as newswire and Web text, which transfer poorly to specialized domains. At Goddard, we are devising new strategies to improve the performance of language models on domain-specific tasks.
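Domain adaptation typically continues the same self-supervised objective used in general-domain pre-training, but on in-domain text. For BERT, that objective is masked language modeling: randomly hide a fraction of tokens and train the model to recover them. The following is a minimal sketch of just the corruption step, with a made-up sentence and a simplified masking rule (real BERT also applies an 80/10/10 replace/random/keep split, omitted here for brevity):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """BERT-style corruption: select roughly mask_prob of the positions
    as prediction targets and replace each with a mask token.
    Returns the corrupted sequence and a position -> original-token map."""
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok          # the model must recover this token
            corrupted.append(mask_token)
        else:
            corrupted.append(tok)
    return corrupted, targets

# Hypothetical domain sentence; in practice this runs over a large in-domain corpus.
sent = "solar wind interacts with the magnetosphere".split()
corrupted, targets = mask_tokens(sent)
```

Continuing pre-training with this objective on domain text shifts the model's vocabulary statistics toward the target domain before any task-specific fine-tuning.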
Abstractive summarization is a technique in which the summary is composed of novel sentences, produced by rephrasing or introducing new words, rather than simply extracting the most important sentences from the source. The complexity of natural language makes abstractive summarization a challenging task. At Goddard, we are investigating new deep learning techniques for semantic-based abstractive summarization.
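What distinguishes abstractive from extractive summarization is the generation loop: each summary word is produced from a model distribution conditioned on the source and the summary so far, rather than copied wholesale from the input. The sketch below shows that loop with greedy decoding; the scoring table is a hypothetical stand-in for a trained sequence-to-sequence model, not an actual summarizer:

```python
def greedy_decode(next_token_scores, max_len=10, bos="<s>", eos="</s>"):
    """Greedy generation loop used by abstractive summarizers: repeatedly
    pick the highest-scoring next word until the end-of-sequence token."""
    summary = [bos]
    for _ in range(max_len):
        scores = next_token_scores(summary)   # stub for a trained seq2seq model
        word = max(scores, key=scores.get)
        if word == eos:
            break
        summary.append(word)
    return summary[1:]

# Hypothetical next-word distributions keyed by the summary prefix.
table = {
    ("<s>",): {"goddard": 0.9, "the": 0.1},
    ("<s>", "goddard"): {"studies": 0.8, "</s>": 0.2},
    ("<s>", "goddard", "studies"): {"summarization": 0.7, "</s>": 0.3},
    ("<s>", "goddard", "studies", "summarization"): {"</s>": 1.0},
}
stub = lambda prefix: table[tuple(prefix)]
print(greedy_decode(stub))  # ['goddard', 'studies', 'summarization']
```

Production systems swap greedy selection for beam search or sampling, but the word-by-word generation that makes the output "abstractive" is the same.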