Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store combining class in a static byte array #47

Closed
wants to merge 0 commits into from
Closed

Store combining class in a static byte array #47

wants to merge 0 commits into from

Conversation

Bodigrim
Copy link
Collaborator

This is the first step to #41, introducing a static byte array for combining classes. There are no changes in other areas, so NormalizeStream still queries this array twice (both for isCombining check and for getCombiningClass), but, surprisingly, performance became better in several cases.

Ideally, we would unify this array to cover isDecomposable property as well (represented as combining class 255, see Section 14.1.3), but since there are two ways to decompose (NFD and NFKD), such change would involve quite a lot of restructuring, which I do not have time to dive into right now, unfortunately.

Benchmark          Before      After
-------------------------------------
NFD/Japanese      7.138 ms   6.604 ms
NFD/Vietnamese    6.186 ms   5.500 ms
NFD/AllChars      5.907 ms   5.140 ms
NFD/Deutsch       2.075 ms   2.126 ms
NFD/English       1.556 ms   1.583 ms
NFD/Devanagari    6.262 ms   5.708 ms
NFD/Korean        7.366 ms   7.795 ms
NFKD/Japanese     7.722 ms   7.189 ms
NFKD/Vietnamese   6.723 ms   5.593 ms
NFKD/AllChars     9.341 ms   8.449 ms
NFKD/Deutsch      2.059 ms   2.147 ms
NFKD/English      1.522 ms   1.587 ms
NFKD/Devanagari   6.128 ms   5.744 ms
NFKD/Korean       7.924 ms   8.315 ms
NFC/Japanese      11.14 ms   9.846 ms
NFC/Vietnamese    10.73 ms   9.218 ms
NFC/AllChars      9.389 ms   8.648 ms
NFC/Deutsch       2.754 ms   2.840 ms
NFC/English       1.962 ms   2.107 ms
NFC/Devanagari    7.242 ms   6.265 ms
NFC/Korean        7.432 ms   7.469 ms
NFKC/Japanese     11.98 ms   10.38 ms
NFKC/Vietnamese   11.01 ms   8.959 ms
NFKC/AllChars     14.35 ms   12.88 ms
NFKC/Deutsch      3.404 ms   2.762 ms
NFKC/English      1.996 ms   1.969 ms
NFKC/Devanagari   7.427 ms   5.876 ms
NFKC/Korean       7.794 ms   7.389 ms

@harendra-kumar
Copy link
Member

harendra-kumar commented May 17, 2020

Looks good. I will probably get some time on next weekend. I will try making use of it if you don't do it until then.

@harendra-kumar
Copy link
Member

I will not be able to get back to this until Sep. I guess we can be a bit lazy on this one as we have already pushed several good perf improvements in the upcoming release.

@Bodigrim Bodigrim closed this Sep 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants