
A prompt-based approach to adversarial examples

Article Highlight | 21-Aug-2024

Higher Education Press

Figure 1

Credit: Yuting YANG, Pei HUANG, Juan CAO, Jintao LI, Yun LIN, Feifei MA

Recent years have seen the wide application of NLP models in crucial areas such as finance, medical treatment, and news media, raising concerns about model robustness. Existing methods are mainly limited to synonym perturbation. The researchers find that the prompt paradigm can probe special robustness defects of pre-trained language models. To address these problems, a research team led by Juan CAO published their new research on 15 August 2024 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team first proposes a prompt-based adversarial example generation method: malicious prompt texts are constructed for the inputs, and a pre-trained language model generates adversarial examples for victim models via mask-filling. Experimental results show that the prompt paradigm can efficiently generate adversarial examples that are more diverse than synonym substitution alone. The team then proposes a novel robust training approach based on the prompt paradigm, which incorporates prompt texts as alternatives to adversarial examples and enhances robustness under a lightweight minimax-style optimization framework.
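The release itself includes no code, but the generation step can be illustrated with a minimal sketch using the Hugging Face transformers fill-mask pipeline. The prompt template, the choice of roberta-base, and the victim_predict function below are illustrative assumptions, not the paper's actual method:

# Minimal sketch of prompt-based adversarial example generation via
# mask-filling. Assumptions (not from the paper): the prompt template,
# the choice of roberta-base, and the hypothetical victim_predict().
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")
MASK = fill_mask.tokenizer.mask_token  # "<mask>" for RoBERTa


def victim_predict(text):
    """Hypothetical victim classifier; substitute any real model here."""
    raise NotImplementedError


def generate_adversarial(text, orig_label, top_k=20):
    # Construct a malicious prompt around the input; the pre-trained LM
    # fills the mask, yielding perturbations beyond synonym substitution.
    prompt = f"{text} {MASK}"
    adversarial = []
    for cand in fill_mask(prompt, top_k=top_k):
        filled = cand["sequence"]  # prompt text with the mask filled in
        if victim_predict(filled) != orig_label:
            adversarial.append(filled)  # victim's prediction flipped
    return adversarial

Because the pre-trained language model proposes whole tokens in context rather than dictionary synonyms, the candidates it generates can differ structurally from the input, which is what lets this style of attack go beyond synonym perturbation.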
Experiments on three real-world tasks and two deep neural models show that the approach can significantly improve the robustness of models against adversarial attacks. To the authors' knowledge, this work is the first to explore the potential of the prompt paradigm for probing fundamental flaws of pre-trained language models and fine-tuning them for downstream tasks.
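The minimax-style robust training described above can likewise be sketched. The loop below is a hedged illustration assuming PyTorch; model, encode, and make_prompted_variants are hypothetical placeholders for the classifier, its tokenizer, and the construction of prompted inputs, and none of them come from the paper:

# Hedged sketch of minimax-style robust training in which prompted
# variants of each input stand in for adversarial examples.
import torch
import torch.nn.functional as F


def robust_training_step(model, optimizer, texts, labels,
                         make_prompted_variants, encode):
    # Outer minimization over model parameters; inner maximization picks,
    # per input, the prompted variant with the highest current loss.
    optimizer.zero_grad()
    worst_losses = []
    for text, label in zip(texts, labels):
        variants = make_prompted_variants(text)  # prompt texts as adversarial alternatives
        losses = [F.cross_entropy(model(encode(v)), label.view(1))
                  for v in variants]
        worst_losses.append(torch.stack(losses).max())  # inner max
    loss = torch.stack(worst_losses).mean()
    loss.backward()  # outer min: gradient step on the worst-case loss
    optimizer.step()
    return loss.item()

This is lightweight compared with classical adversarial training because the inner maximization only scores a small, fixed set of prompted variants instead of running an iterative attack per example.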
DOI: 10.1007/s11704-023-2639-2

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.
