Fast, Robust, and Secure HTML Parser for Sensitive Data; Reforming The Security Clearance Process: How Relevant Factors Get Weighed by Policymakers

Tierney, Evan, School of Engineering and Applied Science, University of Virginia
Forelle, MC, EN-Engineering and Society, University of Virginia
Vrugtman, Rosanne, EN-Comp Science Dept, University of Virginia

Both my technical capstone and STS research focus on improving the US government’s capacity to quickly complete objectives on classified subjects. My technical report covers the development of an HTML parser I engineered during my internship at Allied Associates Intl. to aid the government in doing classified work. My STS research project analyzes recent reforms to the security clearance investigation process that are intended to help the government employ a large, trustworthy workforce for classified work.
During my internship at Allied, the company needed to parse massive HTML files containing sensitive data from US government clients. Unfortunately, existing HTML parsers were inadequate because of a combination of security restrictions and the obscure nature of the data. To address this issue, I was tasked with developing a fast and robust HTML parser that Allied could use to securely analyze these large HTML files. I designed the parser to take XPath queries from the user and return the HTML tags that matched. To build it, I followed test-driven and agile development practices and used tools such as Git, Visual Studio, and NUnit. By the end of my internship, I had developed a parser that could successfully parse test sets ranging from hundreds of megabytes to several gigabytes of HTML data in under 20 seconds. To continue this project, I suggested that the HTML parser be further optimized and tested against real-world data sets to uncover any remaining bugs.
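As an illustration only, the query-in, tags-out design described above can be sketched in a few lines. This is not the parser built at Allied: it is written in Python rather than the .NET tooling named above, it relies on the standard library's `html.parser`, and it supports only a tiny, hypothetical subset of XPath (`//tag` and `//tag[@attr='value']`). The names `XPathSubsetMatcher` and `find_tags` are assumptions made for this sketch.

```python
import re
from html.parser import HTMLParser

# Assumed query shapes for this sketch: //tag, //*, or //tag[@attr='value'].
_QUERY = re.compile(r"^//(\w+|\*)(?:\[@([\w-]+)='([^']*)'\])?$")


class XPathSubsetMatcher(HTMLParser):
    """Streams HTML and records every start tag that matches one simple query."""

    def __init__(self, query):
        super().__init__()
        parsed = _QUERY.match(query)
        if parsed is None:
            raise ValueError(f"unsupported query: {query!r}")
        tag, self.attr, self.value = parsed.groups()
        # HTMLParser reports tag and attribute names in lowercase.
        self.tag = tag.lower()
        if self.attr is not None:
            self.attr = self.attr.lower()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if self.tag != "*" and tag != self.tag:
            return
        attr_map = dict(attrs)
        if self.attr is not None and attr_map.get(self.attr) != self.value:
            return
        self.matches.append((tag, attr_map))


def find_tags(html_chunks, query):
    """Feed the document chunk by chunk so a large file never has to sit in memory at once."""
    matcher = XPathSubsetMatcher(query)
    for chunk in html_chunks:
        matcher.feed(chunk)
    matcher.close()
    return matcher.matches
```

For example, `find_tags(['<a class="ext" href="x">link</a>'], "//a[@class='ext']")` returns a list with one entry for the matching `a` tag and its attributes. Feeding the input in chunks is one plausible way to cope with multi-gigabyte files; whether the original parser worked this way is not stated here.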
My STS research project seeks to understand why policymakers considered certain factors while neglecting other relevant factors when implementing the Trusted Workforce 2.0 (TW 2.0) reforms. Specifically, I ask why the investigation backlog, budget constraints, and the integration of evolving technology received such intense focus while the newly emerging societal norms of younger generations went unaddressed. In my research, I analyzed TW 2.0 policies along with reports and commentary on their implementation, results, and areas for improvement. I used Susan Leigh Star's ethnography of infrastructure framework to guide my analysis of these sources. I focused on the relationality of the infrastructure involved in the security clearance investigation process to see what policymakers recognized and what they missed. I also narrowed my analysis to three of the properties of infrastructure that Star outlined: embodiment of standards, built on an installed base, and becomes visible upon breakdown. Through my research and analysis, I found that policymakers addressed only the areas they understood when implementing TW 2.0 reforms and neglected important areas of difficulty for other groups. I argue that groups looking to influence future security clearance reforms need to ensure that policymakers fully understand their viewpoints. This is especially important for groups that are not well represented among policymakers, such as young people. I hope my research will help future reforms create more meaningful and holistic change in the clearance process and other areas of national security.
Working on these projects simultaneously gave me firsthand experience of the security clearance concerns my coworkers and I had and showed me how they contrasted with the concerns of policymakers. My technical capstone inspired my STS research topic because I was frustrated with how long my clearance investigation was taking. Not being able to see the classified data that would be passed to the HTML parser I was developing created extra difficulty for me. Additionally, while developing the parser, I was surrounded by cleared individuals and others seeking clearances. This allowed me to hear the opinions of a diverse set of people regarding the security clearance process. This set of people went beyond the typical policymaker, which helped me recognize some of the differences between the issues policymakers saw in the clearance process and the issues other groups saw.
My STS research also helped me understand the purpose behind my technical capstone project. I had been somewhat confused about why I had to develop an HTML parser when plenty already existed. I knew that the data was somewhat obscure, but I still figured that some existing parsers could handle it. My supervisors had told me that any parsers that were advanced enough had been disqualified by extra security requirements for software that handles classified data, such as restrictions on developer nationality. By diving deeper into what policymakers considered when making decisions about the clearance process, I came to understand why these extra security requirements existed. I saw how policymakers approached national security topics from their own perspective and not the perspective of a software engineer, like me, who might think it repetitive and inefficient to develop an entirely new parser.

BS (Bachelor of Science)
Security Clearance, Parser, Policymakers, HTML

School of Engineering and Applied Science
Bachelor of Science in Computer Science
Technical Advisor: Rosanne Vrugtman
STS Advisor: MC Forelle
Technical Team Members: Evan Tierney

All rights reserved (no additional license for public reuse)
Issued Date: