Poisoning Attacks on Online Learning to Rank

Bamrara, Rishab, Computer Science - School of Engineering and Applied Science, University of Virginia
Wang, Hongning, EN-Comp Science Dept, University of Virginia

Online Learning to Rank (OL2R) methods have been popularly studied in Information Retrieval (IR). The key idea is to estimate a ranker directly based on the interaction with users. Dueling Bandit Gradient Descent (DBGD) is one of the most popularly studied OL2R methods, which is deeply rooted in online convex optimization. However, little is known about the robustness of such algorithms in scenarios where a malicious attacker tries to change the user’s feedback in such a way that the final learned ranker deviates away from what the user intended it to be. This work describes the real-world setting(s) on how a malicious attacker can use the characteristics of DBGD-based algorithm to attack it. We have also described in detail the threat model that we have used. We then propose two novel attacking strategies, namely, Naive intersection attack and Frequency attack, that the attacker can use on DBGD to divert the ranker towards the attacker’s goal. The attacks and the corresponding deviation have been subsequently analyzed to show the robustness of DBGD based methods. The results show that DBGD is not robust against the proposed attacks. To the best of our knowledge, we are the first to propose poisoning attacks on an OL2R algorithm, show empirical results and analyze them.

MS (Master of Science)
Online Learning to Rank, Poisoning Attack, Dueling Bandit Gradient Descent (DBGD)
Issued Date: