Impact-Aware Retrieval Defense: Mitigating Word Substitution Ranking Attacks for Enhanced Stability Conference

Liu, J, Li, X, Hu, X et al. (2025). Impact-Aware Retrieval Defense: Mitigating Word Substitution Ranking Attacks for Enhanced Stability. (2025), 1084-1093. 10.1109/BigData66926.2025.11400863

cited authors

  • Liu, J; Li, X; Hu, X; Yang, W; Li, W; Yang, J; Zhang, W; Guo, Y

abstract

  • Recent years have witnessed substantial progress in document retrieval, driven by advancements in numerous language models. However, these models remain vulnerable to adversarial attacks, such as the Word Substitution Ranking Attack (WSRA), which manipulates retrieval results by subtly replacing words in the document content. Existing defense methods often rely on adversarial training or ensemble-based certification, both of which require extensive supervision and limit their practicality. Accordingly, we propose the Impact-Aware Defense (IAD) algorithm, which explicitly leverages three simple yet effective masking strategies. Specifically, IAD stabilizes retrieval results by minimizing the impact of word-level perturbations, ensuring that the removal of arbitrary words does not significantly alter the retrieval result. Furthermore, our theoretical analysis guarantees ranking stability by constraining perturbation-induced score deviations. Empirical results on three widely adopted retrieval benchmarks show that IAD achieves substantial robustness improvements against adversarial attacks, establishing a new state-of-the-art with up to 29.4% relative gain in Mean Reciprocal Rank (MRR) over prior best-performing methods.
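
The two quantities the abstract refers to, the score change caused by removing a word and the relative gain in MRR, can be illustrated with a minimal Python sketch. This is not the paper's IAD algorithm or its masking strategies: the scorer `overlap_score`, the helper names, and the toy ranks are all hypothetical, chosen only to make the definitions concrete.

```python
from typing import Callable, Sequence


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score (an assumption, not the paper's model):
    the fraction of document words that also appear in the query."""
    query_terms = set(query.lower().split())
    words = doc.lower().split()
    if not words:
        return 0.0
    return sum(w in query_terms for w in words) / len(words)


def max_removal_impact(
    query: str,
    doc: str,
    score: Callable[[str, str], float] = overlap_score,
) -> float:
    """Largest absolute score change caused by deleting any single word."""
    words = doc.split()
    base = score(query, doc)
    deviations = (
        abs(base - score(query, " ".join(words[:i] + words[i + 1:])))
        for i in range(len(words))
    )
    return max(deviations, default=0.0)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """ranks[i] is the 1-based rank of the first relevant document for
    query i; 0 means no relevant document was retrieved."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r if r > 0 else 0.0 for r in ranks) / len(ranks)


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, e.g. 0.294 for 29.4%."""
    return (new - baseline) / baseline


if __name__ == "__main__":
    # Illustrative inputs only; these are not results from the paper.
    print(max_removal_impact("neural ranking",
                             "neural ranking models rank documents"))
    baseline = mean_reciprocal_rank([1, 4, 2, 0])
    defended = mean_reciprocal_rank([1, 2, 2, 5])
    print(baseline, defended, relative_gain(defended, baseline))
```

A small `max_removal_impact` means no single word dominates the score, which is the kind of stability the abstract describes. The relative gain is the MRR difference divided by the baseline MRR.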

publication date

  • January 1, 2025

Digital Object Identifier (DOI)

  • 10.1109/BigData66926.2025.11400863

start page

  • 1084

end page

  • 1093

issue

  • 2025