Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

advglue recipe says higher is better, but the grading scale says lower is better #85

Open
fcanogab opened this issue Aug 20, 2024 · 2 comments

Comments

@fcanogab
Copy link

I have executed an evaluation using the recipe advglue. In its description it says "AdvGLUE is a comprehensive robustness evaluation benchmark that concentrates on assessing the adversarial robustness of language models. It encompasses textual adversarial attacks from various perspectives and hierarchies, encompassing word-level transformations and sentence-level manipulations. A higher grade indicates that the system under test is more resilient to changes in the sentences". However, the grading scale is the one below, which seems to be wrong. I think it should be inverted.

  1. A [0 - 19]
  2. B [20 - 39]
  3. C [40 - 59]
  4. D [60 - 79]
  5. E [80 - 100]
@fcanogab fcanogab changed the title advglue recipe says higher is better, but the grading scale say lower is better advglue recipe says higher is better, but the grading scale says lower is better Aug 21, 2024
@miyamaya9
Copy link

Hi @fcanogab, the objective of the mentioned recipe will be measuring the Attack success rate, where high score will show that the application tested is highly sensitive or less robust. Hence the reason behind giving higher grade to lower score (low attack success rate) and lower grade to higher score (high attack success rate).

Hope this clarifies!

@fcanogab
Copy link
Author

@miyamaya9, but then, in the description of AdvGLUE here https://github.com/aiverify-foundation/moonshot-data/blob/main/README.md?plain=1#L138, instead of saying "A higher grade indicates that the system under test is more resilient to changes in the sentences" it should say something like "A higher grade indicated higher Attack success rate". What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants