Reevaluating Adversarial Examples in Natural Language; Political Implications of Near Human-Level Natural Language Processing Models

Author:
Lifland, Eli, School of Engineering and Applied Science, University of Virginia

Advisors:
Foley, Rider, EN-Engineering and Society, University of Virginia
Qi, Yanjun, EN-Comp Science Dept, University of Virginia

Abstract:

Machine learning models have been shown to be vulnerable to small, targeted changes to images and text that fool them. My technical research addresses the lack of standardization in how adversarial attacks on natural language processing (NLP) models, which analyze text, are evaluated. Adversarial attacks are useful for probing the blind spots of NLP models. However, without consistent methods of evaluation, it is difficult to tell which attacks are most effective and whether an attack has succeeded at all.
We are developing an open-source library that decouples attack methodology from the evaluation of attack success, so that NLP researchers can measure the effect of a change to one independently of the other. As NLP models improve, they can be used more effectively for both beneficial and malicious purposes. One common concern is the automated generation of fake news. Additionally, NLP models may be biased and are often available only in select languages.
Several STS theories bear on the rapid progress of NLP models. One is the Risk Society, which offers a lens on how we think about the risks of automated disinformation campaigns and abusive content. Another is technological determinism: a segment of the NLP community asserts that NLP progress will continue no matter what, and that it is necessary to progress as fast as possible without considering the consequences. I will analyze the impacts of recent NLP models through the techno-politics framework.
My STS research will consist of a case study of three recently released NLP models: BERT, GPT-2, and GROVER. I will collect primary evidence through interviews with various actors, including people who develop these models and people who deploy them in applications. I will also collect secondary evidence from prior literature and media accounts of the NLP models and their implications. I will analyze the political consequences of NLP models, both intentional and unintentional. I expect to learn more about known threats, such as automated disinformation campaigns, and to discover new ones.
While my technical research aims to better understand the weaknesses of NLP models, my STS research asks what political impacts they will have as they inevitably continue to improve. Thinking about the implications of technologies and how best to mitigate their downsides is important, especially for technologies that could have large positive and negative effects. As NLP models continue to improve, it is our responsibility to consider how best to use them and how to mitigate their misuse.

Degree:
BS (Bachelor of Science)

Keywords:
Natural Language Processing, NLP, Machine Learning, Adversarial Examples, Techno-politics

Notes:

School of Engineering and Applied Science
Bachelor of Science in Computer Science
Technical Advisor: Yanjun Qi
STS Advisor: Rider Foley
Technical Team Members: Jack Morris, Jack Lanchantin

Language:
English

Rights:
Attribution 4.0 International (CC BY)

Issued Date:
2020/05/01

Persistent Link:
https://doi.org/10.18130/v3-sxw2-yk14
