
JS: Using supervised QE models for available language pairs #378

Merged
6 commits merged on Mar 15, 2022

Conversation

abhi-agg
Contributor

@abhi-agg abhi-agg commented Mar 14, 2022

Fixes: #318

Modified the wasm test page and tested it there. It works.
Supervised QE binary models are being used for 3 language pairs.

@abhi-agg
Contributor Author

@jelmervdl @jerinphilip Could one of you review the PR? It is a small one. I have to merge it today to be able to make progress on integrating supervised QE models in the extension.

@jerinphilip
Contributor

jerinphilip commented Mar 15, 2022

> It works.

There are no tests accompanying this change. At least put a screenshot. How does the colour coding change? Does the visual output make sense?

> I have to merge it today to be able to make progress on integrating supervised QE models in the extension.

There's nothing really stopping you from bringing the bigger PR that adds the QE feature as a whole to the UI. That would additionally allow experimenting with values etc. (#370 is probably not going to work, for instance. Do we need calibration?). I fail to understand what stops you from dogfooding your own extension: develop the UI experimentation in a branch and bring more complete changes here - look at #312 for example. Alternatively, do it properly on the test page and show application examples (screenshots of the UI, not console logs) - which perhaps makes an import into the extension UI easier. The larger context will help review here.

@abhi-agg
Contributor Author

> There's nothing really stopping you from bringing the bigger PR which adds the QE feature as a whole to UI. This will additionally allow experimenting with values etc (#370 is probably not going to work, for instance. Do we need calibration?). I fail to understand what stops you from dogfooding your own extension UI experimentation developing in a branch and bringing changes here more complete - look at #312 for example.

Are you saying that I should complete the whole implementation in the extension first and then bring those complete changes to the test page? Because that is the complete opposite of the natural workflow here: it makes development in this repo dependent on the extension repo, while clearly it should be the other way around.

I am looking for your review on the bindings, not on the wasm/JS. Not having the QE bindings merged here does block my progress on the extension integration, because the extension depends on the artifacts generated with this change, not the other way around.

> Alternatively, do it properly at the test page and show application examples (screenshots of UI, not console logs) - which makes an import into the extension UI perhaps easier. The larger context will help review here.

I can attach a screenshot with console logs showing that the supervised models return scores within the range [-1.0, 0], which will prove that the bindings work. However, it is difficult to show a color-coding change on the test page; that depends on the model and the sentences used for translation. At least, I couldn't find a sentence for which supervised QE gives visibly better results than the free scores from the translation models.

Anyway, the sanity of the scores returned by the supervised QE models lies in the C++ realm, not in JS.
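For reference, a minimal sketch of the kind of range check and color banding being discussed. The helper name and band thresholds are hypothetical, not taken from the test page or the extension:

```javascript
// Hypothetical helper: validate a supervised QE score and map it to a
// colour band. Scores returned by the bindings are expected to lie in
// [-1.0, 0], with values closer to 0 indicating higher quality.
// The band thresholds below are illustrative only.
function qeColourBand(score) {
  if (typeof score !== "number" || score < -1.0 || score > 0) {
    throw new RangeError(`QE score outside expected range [-1.0, 0]: ${score}`);
  }
  if (score >= -0.2) return "green";  // high confidence
  if (score >= -0.5) return "yellow"; // medium confidence
  return "red";                       // low confidence
}

console.log(qeColourBand(-0.1)); // "green"
console.log(qeColourBand(-0.4)); // "yellow"
console.log(qeColourBand(-0.9)); // "red"
```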

Member

@jelmervdl jelmervdl left a comment


I tested it locally by loading models that have (enes) and don't have (bgen) quality models associated with them. Both work, in the sense that I can see translations with scores attached, and the scores seem to be in the range [-1.0, 0]. So I'm not sure whether the binding changes are working, but they're not breaking anything.

Maybe the people working on QE have some reference data, e.g. an example sentence combined with one of our models for which they expect a certain score. Or some easy examples we can use to test it: e.g. I use German <-> English often to test movement. Similarly, there must be example sentences that trigger a visible response from the QE models.

But this doesn't block this pull request imo.
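The reference-data idea above could be sketched roughly like this. Everything here is a placeholder: the sentence, the expected score, and the helper names are hypothetical, and real values would have to come from the QE model authors:

```javascript
// Hypothetical reference-data check: pin an example sentence per language
// pair to an expected QE score plus a tolerance, so a regression in the
// bindings would surface as a failed check. The entry below is a
// placeholder, not real reference data.
const references = [
  { pair: "deen", sentence: "Hallo Welt", expected: -0.15, tolerance: 0.05 },
];

// scoreFor would be backed by the actual bindings; here it is a stub
// passed in by the caller.
function checkReference(scoreFor, ref) {
  const score = scoreFor(ref.pair, ref.sentence);
  return Math.abs(score - ref.expected) <= ref.tolerance;
}

console.log(checkReference(() => -0.12, references[0])); // true  (within tolerance)
console.log(checkReference(() => -0.60, references[0])); // false (regression)
```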

@jelmervdl
Member

Worth mentioning though: this is an API breaking change for all WASM consumers out there. Which is just Mozilla and my fork of their extension. Consider them all informed, hehe.

@abhi-agg
Contributor Author

> So not sure whether the binding changes are working, but they're not breaking anything.

Actually, it is easy to see that the bindings work: whenever supervised models are available, you will see the QE model size in the console log (which indicates that the TranslationModel construction API uses the QE-model-aligned memory).
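A hedged sketch of the construction logic being described, with hypothetical argument and field names (the real bindings API may differ):

```javascript
// If a supervised QE model is available for a language pair, its aligned
// memory is passed along to the TranslationModel construction; otherwise
// it is omitted and the free scores from the translation model are used.
// All names here are illustrative, not the actual bindings API.
function buildModelArgs(files) {
  const args = {
    model: files.model,
    vocab: files.vocab,
    shortlist: files.shortlist,
  };
  if (files.qualityModel) {
    args.qualityModel = files.qualityModel;
    // The presence of this log line is what indicates that QE-aligned
    // memory is in use.
    console.log(`QE model size: ${files.qualityModel.byteLength} bytes`);
  }
  return args;
}

const withQE = buildModelArgs({
  model: new ArrayBuffer(8),
  vocab: new ArrayBuffer(8),
  shortlist: new ArrayBuffer(8),
  qualityModel: new ArrayBuffer(16),
});
console.log("qualityModel" in withQE); // true
```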

Attaching screenshot for reference.

Screenshot 2022-03-15 at 15 52 39

> Maybe the people working on QE have some reference data, e.g. an example sentence combined with one of our models that they expect a certain score for. Or some easy examples we can use to test it. E.g. I use german <-> english often to test movement. Similarly, there must be example sentences that trigger a response from the QE models we can see.

This sounds like an option for verification.

I am merging this one.

@abhi-agg abhi-agg merged commit 0a52a6d into browsermt:main Mar 15, 2022
@abhi-agg abhi-agg deleted the supervised-qe branch March 15, 2022 14:55
Successfully merging this pull request may close these issues.

Ability to provide quality estimation models through WASM Bindings
3 participants