research18h
Generalised Eigenvalue Geometry of Semantic Adversarial Attacks
Researchers developed a continuous local model of semantic paraphrase perturbations that explains how semantically equivalent paraphrases can fool classifiers. The model captures the geometry of attacks and provides a theoretical framework for understanding robustness. This work aims to improve classifier robustness against adversarial attacks.
Key takeaways
- New model explains paraphrase attacks on classifiers.
- Captures geometry of semantic perturbations.
- Aims to improve classifier robustness.