Distinguish between different variations of the same language #46
Comments
Hi @BLKSerene, thank you for your request. The library already distinguishes between Bokmål and Nynorsk. As for Simplified and Traditional Chinese, I have not yet found suitable training corpora that consist solely of either Simplified or Traditional Chinese. Do you perhaps know a good source for those?
There are two UD Chinese corpora.
Ah, those look suitable, thank you. For
The
+1 on the feature request 🙏
If it helps anyone: in the meantime, I've had some success identifying Traditional and Simplified Chinese with hanzidentifier, which is based on zhon.
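For anyone curious how a character-based workaround like this can work, here is a minimal sketch of the general idea: check whether a text contains characters that occur in only one of the two scripts. This is not hanzidentifier's actual code or API. The function name `classify_script` and the ten-entry character table are hypothetical and hand-picked for illustration; a real tool would need a complete character inventory.

```python
# Illustrative only: a tiny, hand-picked simplified -> traditional table.
# A real identifier needs a full character inventory, not ten pairs.
SIMPLIFIED_TO_TRADITIONAL = {
    "这": "這", "个": "個", "们": "們", "说": "說", "语": "語",
    "汉": "漢", "简": "簡", "国": "國", "学": "學", "书": "書",
}
SIMPLIFIED_ONLY = set(SIMPLIFIED_TO_TRADITIONAL)
TRADITIONAL_ONLY = set(SIMPLIFIED_TO_TRADITIONAL.values())


def classify_script(text: str) -> str:
    """Classify text by the script-specific characters it contains.

    Hypothetical helper: returns "simplified", "traditional", "mixed",
    or "undetermined" (no character from the table was found).
    """
    chars = set(text)
    has_simplified = bool(chars & SIMPLIFIED_ONLY)
    has_traditional = bool(chars & TRADITIONAL_ONLY)
    if has_simplified and has_traditional:
        return "mixed"
    if has_simplified:
        return "simplified"
    if has_traditional:
        return "traditional"
    return "undetermined"


if __name__ == "__main__":
    for sample in ("这是简体字", "這是繁體字", "中文"):
        print(sample, classify_script(sample))
```

Many characters are shared by both scripts, so short inputs such as "中文" cannot be classified this way. That is one reason a statistical detector would still need separate training corpora for each variant.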
Hi, I'm wondering whether it is possible for `lingua` to distinguish between variations of the same language, for example: Simplified Chinese and Traditional Chinese, Norwegian Bokmål and Norwegian Nynorsk. AFAIK, `langdetect` could distinguish between Simplified and Traditional Chinese while other alternatives can't.