Amazon Influencer Recommendations Widget; The Ethics of Using Customer Data in Machine Learning

Author:
Lee, Justin, School of Engineering and Applied Science, University of Virginia
Advisors:
Earle, Joshua, EN-Engineering and Society, University of Virginia
Abstract:

Through the Amazon Influencer Program, Amazon customers can follow influencers whose content they enjoy. Before my technical project was implemented, customers had to locate each influencer individually. During my internship at Amazon, I implemented a recommendations widget to suggest influencers and their content to customers. This widget provided a dedicated place for customers to discover new content and used machine learning (ML) to optimize recommendations based on what customers liked. The technical portion of this paper provides an overview of the design, architecture, and results of the recommendations widget project. It describes the full technology stack, from frontend design decisions to a backend built with ML extensibility in mind.
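
The paper details the actual architecture; purely as illustration, the sketch below shows one simple way a widget backend might rank influencers by engagement signals, with weights an ML layer could later tune. Every name, signal, and weight here is a hypothetical assumption, not the widget's real implementation.

```python
# Hypothetical sketch of widget-style ranking: score influencers by a
# weighted blend of engagement signals. All names and weights are invented.
from dataclasses import dataclass

@dataclass
class InfluencerStats:
    influencer_id: str
    follows: int    # customers following this influencer
    clicks: int     # clicks on the influencer's content
    purchases: int  # purchases attributed to that content

def score(stats: InfluencerStats,
          w_follow: float = 1.0,
          w_click: float = 0.5,
          w_purchase: float = 3.0) -> float:
    """Weighted engagement score; an ML layer could learn these weights."""
    return (w_follow * stats.follows
            + w_click * stats.clicks
            + w_purchase * stats.purchases)

def recommend(candidates: list[InfluencerStats], k: int = 5) -> list[str]:
    """Return the top-k influencer IDs by engagement score."""
    ranked = sorted(candidates, key=score, reverse=True)
    return [s.influencer_id for s in ranked[:k]]

if __name__ == "__main__":
    pool = [
        InfluencerStats("inf_a", follows=120, clicks=900, purchases=40),
        InfluencerStats("inf_b", follows=300, clicks=200, purchases=10),
        InfluencerStats("inf_c", follows=50, clicks=1500, purchases=90),
    ]
    print(recommend(pool, k=2))
```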

Though the machine learning algorithm behind this particular widget was simple, it raised the issue of customer data privacy. Because machine learning algorithms become more effective as they consume more data, companies have a strong incentive to collect as much customer data as possible, making data ethics a complex sociotechnical dilemma. Additionally, there is a blatant lack of transparency in how tech companies handle their users' data. This problem is further compounded by the lack of effective policy, both public and private, to oversee the ethical use of these immense volumes of data.

To demonstrate why machine learning applications have become so prevalent, I first investigate their power, specifically in the form of recommendation engines that help users find new content through personalized suggestions. The dominance of these applications lies in their potential to drive revenue, personalization, and content discovery. I then frame the issue of customer data privacy using the Social Construction of Technology (SCOT), the idea that social groups dictate nearly every aspect of a technology and thereby shape its function in society. Although these algorithms appear objective, algorithmic bias often enters through the implicit biases of the engineers who implement them. This is dangerous because it ignores the nuances of the social groups the data represents, producing discriminatory outputs. To explore the ethics of data collection and transparency, I analyze recommender systems using utilitarian and virtue ethics.
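
To make the mechanics of such personalization concrete, the sketch below implements minimal user-based collaborative filtering with cosine similarity; the interaction matrix and identifiers are invented for illustration, and production recommender systems are far more elaborate.

```python
# Minimal user-based collaborative filtering sketch, illustrating how a
# recommendation engine turns accumulated customer data into personalized
# suggestions. The interaction data below is invented.
import numpy as np

# Rows = users, columns = items; 1 means the user engaged with the item.
interactions = np.array([
    [1, 1, 0, 0, 1],   # user 0
    [1, 0, 0, 1, 1],   # user 1
    [0, 1, 1, 0, 0],   # user 2
], dtype=float)

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two interaction vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def recommend(user: int, k: int = 2) -> list[int]:
    """Score unseen items by similarity-weighted engagement of other users."""
    sims = np.array([cosine_sim(interactions[user], interactions[u])
                     if u != user else 0.0
                     for u in range(len(interactions))])
    scores = sims @ interactions              # aggregate similar users' tastes
    scores[interactions[user] > 0] = -np.inf  # exclude items already seen
    return list(np.argsort(scores)[::-1][:k])

print(recommend(user=0))  # items user 0 has not engaged with yet
```

More interaction data (more rows and columns) directly sharpens these similarity estimates, which is precisely the incentive behind the data collection practices this paper examines.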

To explore my research question, “To what extent is it ethical, from a utilitarian and virtue standpoint, for tech companies to use customer data to provide more personalized experiences?”, I employ several research methods. First, I survey my UVA peers to gauge their stances on software companies using their data for personalized services. The survey results highlight an interesting paradox: respondents were generally uncomfortable with companies tracking their data, yet they would not stop using these services because of their unique value and the lack of viable alternatives.

I then analyze two case studies of ML algorithmic bias and data persistence across different platforms. In the first, courts in Broward County, Florida, used ML to assign recidivism risk scores to 18,000 criminal defendants, but the scores systematically overestimated the risk for Black defendants. In the second, the Chinese government used artificial intelligence (AI) to profile the Muslim Uighur minority, marking the first known example of a government using AI for racial profiling. This research demonstrates a pressing need for effective data policies that promote ethical and effective ML data collection.

Degree:
BS (Bachelor of Science)
Keywords:
machine learning, utilitarianism, artificial intelligence, big tech, tech companies, virtue ethics, ethics, Social Construction of Technology, SCOT, Google, Amazon, Facebook
Language:
English
Rights:
All rights reserved (no additional license for public reuse)
Issued Date:
2022/05/15