Knowledge Discovery and Decision Support Systems Using Natural Language Processing in Applications for Societal Good

Heidarysafa, Mojtaba, Systems Engineering - School of Engineering and Applied Science, University of Virginia
Brown, Don, DS-Research, University of Virginia

Artificial Intelligence (AI) is becoming a crucial innovation that significantly impacts
our everyday life. Not only have commercial AI applications improved our
daily experiences, but AI has also proved to be beneficial in social domains such
as healthcare, transportation, and education. As an essential sub-field of AI,
Natural Language Processing (NLP) has great potential to benefit such social
domains. In particular, many social domains have textual information that different
NLP techniques can leverage to help individuals generate insights or aid in
quick decision-making. This study presents several NLP approaches that
can provide societal benefits in three problem areas: transportation and
safety, security and counter-terrorism, and labor market gaps.
In the first application, we focused on short domain-specific types of texts in
the field of transportation and safety. We used accident reports’ narratives, a
free text field, to identify the causes of accidents. We used two deep-learning
architectures, Recurrent Neural Networks and Convolutional Neural Networks,
combined with two word embeddings (Word2Vec and GloVe) to train a model
capable of identifying any accident’s cause based on the accident’s narrative. Such
a system can be used as a decision-support tool for evaluating accident reports.
In the security and counter-terrorism domain, we used NLP to analyze ISIS’s
propaganda approach to women and compared it with that of a nonviolent
religious group for women. We collected the relevant texts using web scraping
and optical character recognition (OCR) and analyzed them with an unsupervised
learning method. We also used emotion analysis to examine the emotional
aspects of these documents.
Finally, to address the skill gap in data science-related jobs, we collected a
large corpus of online job advertisements and used the embedding vector space
of the advertised skill terms and phrases in a semi-supervised approach to
identify the hard skills these jobs require and extract them from the
documents. We also presented a complete framework for analyzing skills in the
U.S. that allows individuals and organizations to understand the job market.

PHD (Doctor of Philosophy)
natural language processing, deep learning, word embedding, topic modeling
Issued Date: