Tracking the Occurrence and Academic Effects of Sexual Violence at a University

Liu, Xiaoqian, Systems Engineering - School of Engineering and Applied Science, University of Virginia
Brown, Donald, Department of Systems and Information Engineering, University of Virginia
Gerber, Matthew, Department of Systems and Information Engineering, University of Virginia
Laughon, Kathryn, School of Nursing, University of Virginia

Most student sexual assault victims are unwilling to report to the police. The resulting lack of documented incident addresses makes investigation and spatio-temporal analysis less representative. This under-reporting motivates us to discover additional sexual assault incidents from sources outside the local crime report, such as emergency records and news reports. The addresses found in these sources can reveal patterns that analysis of the crime report alone would miss. In this research, we present an approach to automatically extract street addresses from news reports by applying a sequential labeling technique and semi-supervised learning. Previous work on address extraction focuses only on web pages where addresses are set apart from other text; our problem, in contrast, requires retrieving addresses embedded in running text. We built Gradient Boosting and Conditional Random Field (CRF) models to solve this problem. In addition, we used a semi-supervised learning algorithm to exploit additional unlabeled data and further improve predictive performance. Finally, we compared the patterns of the extracted addresses with those of the documented addresses in the crime report.
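To make the sequential labeling framing concrete, the following is a minimal sketch of how an address embedded in running text might be represented for a token-level model such as a CRF. It is an illustration only, not the thesis implementation: the BIO tag names (`B-ADDR`, `I-ADDR`, `O`), the feature set, the suffix list, the helper names, and the example sentence are all assumptions made here.

```python
import re

# Hypothetical list of street suffixes used as a lexical feature (assumption).
STREET_SUFFIXES = {"st", "street", "ave", "avenue", "rd", "road",
                   "blvd", "dr", "drive", "ln", "lane"}


def tokenize(text):
    """Split text into word tokens and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def token_features(tokens, i):
    """Build a feature dict for token i, including its immediate neighbors."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_digit": tok.isdigit(),
        "is_title": tok.istitle(),
        "is_suffix": tok.lower() in STREET_SUFFIXES,
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def decode_addresses(tokens, labels):
    """Turn a BIO label sequence back into a list of address strings."""
    addresses, current = [], []
    for tok, lab in zip(tokens, labels):
        if lab == "B-ADDR":
            # A new address starts; close any address already in progress.
            if current:
                addresses.append(" ".join(current))
            current = [tok]
        elif lab == "I-ADDR" and current:
            current.append(tok)
        else:
            # An "O" label (or a stray "I-ADDR") ends the current address.
            if current:
                addresses.append(" ".join(current))
            current = []
    if current:
        addresses.append(" ".join(current))
    return addresses


# Made-up example sentence with hand-assigned labels (one label per token).
text = "The incident was reported near 1400 University Avenue on Friday."
tokens = tokenize(text)
labels = ["O"] * 5 + ["B-ADDR", "I-ADDR", "I-ADDR"] + ["O"] * 3

features = [token_features(tokens, i) for i in range(len(tokens))]
extracted = decode_addresses(tokens, labels)
```

In a setup like this, a sequence model would be trained to predict `labels` from `features`, and `decode_addresses` would recover address strings from its predictions. Neighbor features such as `prev` and `next` are one simple way to capture the context that distinguishes an embedded address from surrounding prose.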

MS (Master of Science)
All rights reserved (no additional license for public reuse)
Issued Date: