This repository has been archived by the owner on Jun 17, 2020. It is now read-only.

Translation quality assurance and issue template #302

Closed
wordsandstuff opened this issue Feb 7, 2018 · 29 comments
Labels
zz-Operations NEEDS SPONSOR guides: @TrenchFloat, @jimscarver @Tonyprisca13 zz-Translation NEEDS SPONSOR

Comments

@wordsandstuff

Native speakers need to have priority over the translations. I don't think that someone who doesn't speak German or Spanish or whatever language should be able to be the main translator. Google Translate should NOT be the first method of translation. QUALITY should always be the TOP priority. Yes, I would like to translate something with Google Translate and make money, but I know a native speaker can do it better.

@flowpoint flowpoint added the zz-Translation NEEDS SPONSOR label Feb 7, 2018
@flowpoint

To make this issue smart, we could maybe streamline translation with a short translation_guide and find a way to find native speakers of the languages.
Also, how should we reference this with the many current translation issues?

@dckc
Contributor

dckc commented Feb 7, 2018

Which translation used Google translate with poor quality results? Pointer, please?

@lapin7 invited work on a translation policy in an Oct 11 comment. Unfortunately, aside from my suggestion to take inspiration from W3C Translations, it's empty.

@jimscarver
Contributor

Someone must take responsibility for developing the translation policy.

I believe no translation can be considered good until several speakers of the language have reviewed and improved it. There is a workflow process. If no one is willing to review/correct it given reasonable bounties, it can be considered worthless. Reviewers budget/claim/award a percent of the bounty commensurate with the amount of work needed and delivered toward a finished product.

@llerner

llerner commented Feb 7, 2018

Review by consensus. Three native speakers should agree upon the translation. Split the bounty accordingly (E.g., translator gets 70%, validators get 15%)
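
The split suggested above could be sketched as follows. This is only an illustration, not an agreed policy: it assumes the 15% applies to each of two validators (so 70% + 2 × 15% = 100%), and the function name `split_bounty` and its parameters are made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP


def split_bounty(total, translator_share=Decimal("0.70"),
                 validator_share=Decimal("0.15"), validators=2):
    """Return (translator_amount, per_validator_amount).

    Assumption: validator_share is paid to EACH validator, and all
    shares together must cover exactly 100% of the bounty.
    """
    total = Decimal(str(total))
    if translator_share + validator_share * validators != Decimal("1"):
        raise ValueError("shares must add up to 100% of the bounty")
    cent = Decimal("0.01")
    # Round each validator's part to whole cents.
    per_validator = (total * validator_share).quantize(cent, rounding=ROUND_HALF_UP)
    # The translator receives the remainder, so rounding never
    # creates or loses money.
    translator = total - per_validator * validators
    return translator, per_validator


if __name__ == "__main__":
    translator, validator = split_bounty(100)
    print(f"translator: {translator}, each validator: {validator}")
```

Paying the translator the remainder rather than a separately rounded 70% keeps the parts summing to the original bounty even for amounts that do not divide evenly.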

@dckc
Contributor

dckc commented Feb 7, 2018

Yes, my impression is that we are doing just that: review by several native speakers.

At some point, it becomes cost-effective to lay down rules explicitly, but it's not much fun and it takes time away from more directly productive work. Just trusting each other is more fun and efficient. So I'd like to see evidence that just trusting each other isn't working. Pointer, please?

@zsluedem
Contributor

zsluedem commented Feb 8, 2018

Yeah. I think splitting the bounty will encourage people to review.

We could open a bounty issue for the docs that have already been translated. If anyone finds an improper translation, and the finding is approved by 3 (or 2?) other native speakers, he/she can get a bounty (shared with the reviewers, but at a different percentage, e.g. $1 per sentence and then split).

BTW, in the last Members Meeting, someone (Brandon, I think?) said there's a translation that was done with Google. Can you provide a link or something?

@Viraculous

I agree with @dckc and @llerner. Your idea is splendid, @wordsandstuff. In a cooperative and decentralized community such as RChain, what we market is solidarity in trust to achieve precision in the work done.

@Keaycee
Contributor

Keaycee commented Feb 8, 2018

I agree with @dckc & @llerner. You have a point, @wordsandstuff: the Google Translate API is very poor and its translations are not accurate, and I do not advise or support any work done with Google Translate. But you must also know that there are native speakers who are not members of our cooperative but do collaborate with coop members to get such work done. There are native speakers whose job is to translate documents or assist in business interactions; one can collaborate with any of them and get a good job done. What we should consider is the quality and accuracy of the translation work being done.

@flowpoint

flowpoint commented Feb 8, 2018

Pointer, please?

@dckc
not suspecting/blaming anyone but in:
#266 this document
for example, line 789:

[...] wissen, dass die dag-Operationen, die Haschisch [...]

in the English transcript:

'know, to the DAG operations that Hashgraph'

'Haschisch' is a German word for marijuana, which doesn't fit at all,
and I doubt that a human would translate English "hash" into German "Haschisch".
Also, $5000 isn't a suitable reward, IMO.

@dckc
Contributor

dckc commented Feb 8, 2018

I guess I don't see a problem in the case of #266. It's still clearly on their TODO list to "Have it reviewed by others with deeper understanding of German and English". They don't claim to be done.

My experience with translation is all 2nd hand, but I don't see any reason to rule out Google translate as a way to start. As long as the end product is good quality and people treat each other with respect along the way, I don't much care how it gets done.

@dckc
Contributor

dckc commented Feb 8, 2018

oh! @flowpoint you're the one doing the reviewing. If you see a problem, then there is a problem.

@n10n

n10n commented Feb 9, 2018

Food for thought: https://www.quora.com/How-effective-is-Google-translate... varies between 58% and 88%, and this number is language dependent. If you would like the real numbers, use this: http://mt-quality.multilizer.com/machine-translation-quality-statistics/. Question: do you want to use this tool for the translation?

@flowpoint

flowpoint commented Feb 9, 2018

my gripe is:

  1. If the result matters, then the goal and budget should be fixed; how someone got there isn't important.
    or
  2. If the amount of effort matters and the reward is proportional, then whether Google Translate was used matters a lot.

Without clear decisions about that, it will lead to:
someone claiming a reward for his effort (2.) without having to exert himself toward the result (1.)

possibly here:

someone using Google Translate, getting 80% accuracy, claiming it's his work and he can't do better.

-> little work done, big reward

P.S. Google Translate from a cleaned English transcript to German is very good. How do we decide which translation was done by software and which was badly done by a human?

@Keaycee
Contributor

Keaycee commented Feb 9, 2018

@flowpoint How the work is done does not really matter. What matters is the quality and accuracy of the work done. We need the work to be reviewed and proofread to catch further errors and make corrections. The work may be translated by a native speaker in collaboration with coop members, or perhaps by a translator. Whichever way the work is done, what's important is work well done. It's a decentralized cooperative; we can all contribute to any particular issue based on specialty or knowledge.

@dcpnlau

dcpnlau commented Feb 12, 2018

I agree with @Keaycee: quality over quantity/how work has been done. If Google Translate was 100% accurate, it would be fine to use and we could have perfect translations at lower cost as it would be easier to do. Unfortunately, this is not the case, so a suggestion I'd like to propose:

Can we create a list of native speakers by language who are willing to proofread translated transcripts or translate projects altogether? That way we can easily notify and assign issues to native speakers who can then decide if they want to work on it.

Is there any way we can use the KYC process to confirm nationality and whether someone is a native speaker, as it will inform us with great accuracy of someone's background? Or is this information too private? I understand that, e.g., someone who has a Spanish passport may have grown up in the UK and not be fluent, but I think the number of such people will be minimal and we'll be able to pick up on them quickly.

@dckc
Contributor

dckc commented Feb 12, 2018

@dcpnlau writes:

Can we create a list of native speakers by language who are willing to proofread translated transcripts or translate projects altogether?

That's a fine idea. Would you please do so?

I think it's worth putting some more quality control measures in place.

In #100 I requested that translation issues be specific to a source document and a language. These are not specific to a document:

#317 is a dup of #312.

#317 and several others also show a lack of familiarity with the self-starter aspect of bounty work. Work in other languages is significantly less transparent -- many of us cannot evaluate the quality of the work.

@Ojimadu
Contributor

Ojimadu commented Feb 13, 2018

While we give native speakers preference in translation, we also need to note that a good number of native speakers are not fluent in English. Some might use computer-aided translation to convert the text into English; all this can affect how the translation work turns out overall.

@dckc
Contributor

dckc commented Feb 14, 2018

In cleaning up some goofy git commits (#319) I just realized that many of the translations are for administrative files in this repository. Compared to translating the architecture document or some such, these have little value in telling the RChain story.

Also, I expect README and CONTRIBUTING to change significantly as a result of our onboarding discussion, so I would only want those translated in the case where a commitment to maintain them over time is arranged.

Until we resolve this issue and put some additional quality control in place, I recommend against merging any translation PRs. (Note that they all need to be rebased in any case as a result of repository clean-up in #319).

I see #351 suggests folders for each language. I wonder about Setting up a Proper Multilingual Site with GitHub Pages and Jekyll.

On the other hand, I expect most of the translations to live in other repositories, so maybe that's overkill.

@dckc
Contributor

dckc commented Feb 21, 2018

@patrick727 which languages show up high in web site analytics?

@dckc
Contributor

dckc commented Feb 21, 2018

Perhaps an issue template for translations:


Thanks for your interest in telling the story of RChain in another language.

  • Why this document? (mark all that apply)
    • it's one of the /rchain/reference documents
    • I contacted the author, who agreed it's important to translate
    • I know a person or audience who would benefit from it: _____
    • other: ___
  • How can people who do not speak this language be assured of the quality of the translation? (all are required)
    • several reviewers are involved (assign the ticket to them)
    • at least two of them have high proficiency in (the target language)
    • at least two of them have high proficiency in English
    • at least two of them have either
      • significant reputation at stake; source: ____
      • a track record of quality work on rchain bounties: #nn, #nn, ...
  • What about maintenance? (choose one)
    • the source document is stable; further translation work is not expected
    • our plan to keep the translation up to date is: ____
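
The quality section of the checklist above could also be checked mechanically before a translation issue is accepted. The following is a minimal sketch under stated assumptions: the `Reviewer` fields and the function `unmet_quality_items` are hypothetical names, "several reviewers" is read as at least two, and nothing here talks to GitHub.

```python
from dataclasses import dataclass, field


@dataclass
class Reviewer:
    # All fields are illustrative; they mirror the checklist items.
    handle: str
    target_language_proficient: bool = False
    english_proficient: bool = False
    reputation_source: str = ""          # "significant reputation at stake; source"
    prior_bounties: list = field(default_factory=list)  # issue references

    @property
    def accountable(self):
        """True if the reviewer has reputation at stake or a track record."""
        return bool(self.reputation_source or self.prior_bounties)


def unmet_quality_items(reviewers, minimum=2):
    """Return the checklist items the assigned reviewers do not satisfy."""
    counts = {
        "high proficiency in the target language":
            sum(r.target_language_proficient for r in reviewers),
        "high proficiency in English":
            sum(r.english_proficient for r in reviewers),
        "reputation at stake or a bounty track record":
            sum(r.accountable for r in reviewers),
    }
    unmet = [item for item, n in counts.items() if n < minimum]
    # Assumption: "several reviewers" means at least `minimum` people.
    if len(reviewers) < minimum:
        unmet.insert(0, "several reviewers are involved")
    return unmet


if __name__ == "__main__":
    team = [
        Reviewer("reviewer-a", True, True, reputation_source="public profile"),
        Reviewer("reviewer-b", True, False),
    ]
    print(unmet_quality_items(team))
```

An empty list means every "all are required" item is covered; anything else names what is still missing, which gives people who do not speak the language something concrete to check.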

@Ojimadu
Contributor

Ojimadu commented Feb 21, 2018

@dckc A good template up there.
IMO, the most important documents that should be translated are

  1. How to contribute
  2. White paper and architecture documents

Translation of any other document should meet a quorum set by the community through a proposal, and the target languages should be among the top 10 spoken languages (other than English) by number of speakers, except in cases where the translation has a clear value and an audience for it.

@dckc
Contributor

dckc commented Feb 21, 2018

@ICA3DaR5 would you please help shepherd translations for a while? I would much rather focus on other things, and @dcpnlau isn't available -- at least not today (cf. #381).

Perhaps you could recruit @AbnerZheng to help you?

I think I read that GitHub supports multiple issue templates. Would you please start by installing the checklist in my comment above?

@dckc dckc assigned ghost and unassigned dcpnlau Feb 21, 2018
@ghost

ghost commented Feb 21, 2018

@dckc Sure.

@ghost

ghost commented Feb 21, 2018

@dckc We can do the same thing we did on #398 for Pull Requests too.

@dckc dckc changed the title from "Translation requirements." to "Translation quality assurance and issue template" Feb 22, 2018
@dckc
Contributor

dckc commented Feb 22, 2018

This is done to my satisfaction.

@wordsandstuff and others, feel free to re-open it if you see more to do that shouldn't go in a separate issue.

@dckc
Contributor

dckc commented Mar 13, 2018

#483 shows the approach here isn't working well enough.

@dckc dckc reopened this Mar 13, 2018
@dckc dckc assigned 9rb and dckc and unassigned patrick727 Mar 14, 2018
@dckc dckc added the zz-Operations NEEDS SPONSOR guides: @TrenchFloat, @jimscarver @Tonyprisca13 label Mar 14, 2018
@Barkov-F

The usage of machine translation is detectable, but I have no idea how to implement such a system here.
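
One very rough way to approach this is to compare a submitted translation with machine output for the same source text and flag near-identical results for closer human review. The sketch below is a heuristic only, not a detector: it uses Python's standard `difflib`, the function names and the 0.9 threshold are arbitrary assumptions, and the caller has to supply the machine output (no translation service is called). A high score does not prove machine use, and a lightly edited machine translation can score low.

```python
from difflib import SequenceMatcher


def _normalize(text):
    # Ignore case and whitespace differences before comparing.
    return " ".join(text.lower().split())


def machine_similarity(submission, machine_output):
    """Return a ratio in [0, 1]; 1.0 means the normalized texts are identical."""
    matcher = SequenceMatcher(
        None, _normalize(submission), _normalize(machine_output), autojunk=False
    )
    return matcher.ratio()


def needs_human_review(submission, machine_output, threshold=0.9):
    """Flag submissions that are nearly identical to the machine output.

    The threshold is an arbitrary illustrative value, not a calibrated one.
    """
    return machine_similarity(submission, machine_output) >= threshold


if __name__ == "__main__":
    submitted = "Das ist ein  Beispiel."
    machine = "das ist ein Beispiel."
    print(machine_similarity(submitted, machine))
    print(needs_human_review(submitted, machine))
```

A flag like this would only route a submission to native-speaker reviewers; the quality judgment itself still has to come from them.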

@ghost

ghost commented Mar 15, 2018

@Barkov-F I think colleges use something like this to avoid machine translation and copying from other students.

@dckc
Contributor

dckc commented May 7, 2018

#483 seems like a dup of this one, but I guess the action is over there...

@dckc dckc closed this as completed May 7, 2018