util: add fast path for Latin1 decoding #55275

Open
wants to merge 6 commits into
base: main
from

Conversation

mertcanaltin
Member

@mertcanaltin mertcanaltin commented Oct 5, 2024

I added a fast path for Latin1 (windows-1252) decoding to improve performance. It bypasses the slower ICU-based decoder for Latin1 input and decodes the bytes directly, similar to the existing UTF-8 fast path.
Refs: nodejs/performance#178
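
To illustrate what decoding single-byte input directly can look like, here is a minimal JavaScript sketch. It assumes a hypothetical helper, `decodeLatin1Sketch`, which is not a Node.js API and is not the C++ code in this PR. It uses each byte value as the code point, which is strict ISO-8859-1. windows-1252 assigns different characters to some bytes in the 0x80 to 0x9F range, so a real fast path would have to handle that range separately.

```javascript
// Hypothetical helper for illustration only; not a Node.js API.
// Each byte value is used directly as a UTF-16 code unit, which matches
// ISO-8859-1 (bytes 0x00-0xFF map to U+0000-U+00FF).
function decodeLatin1Sketch(bytes) {
  const CHUNK_SIZE = 8192; // keeps the argument count passed to apply() small
  let result = '';
  for (let i = 0; i < bytes.length; i += CHUNK_SIZE) {
    const chunk = bytes.subarray(i, i + CHUNK_SIZE);
    result += String.fromCharCode.apply(null, chunk);
  }
  return result;
}

const sample = new Uint8Array([0x63, 0x61, 0x66, 0xe9]); // "café" in Latin1
console.log(decodeLatin1Sketch(sample));
```

For this byte range the result matches `Buffer.from(sample).toString('latin1')`, which also performs a one-to-one byte mapping.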

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/security-wg
  • @nodejs/v8-update

nodejs-github-bot added labels Oct 5, 2024:

  • c++: Issues and PRs that require attention from people who are familiar with C++.
  • encoding: Issues and PRs related to the TextEncoder and TextDecoder APIs.
  • needs-ci: PRs that need a full CI run.
  • v8 engine: Issues and PRs related to the V8 dependency.

mertcanaltin requested review from anonrig and joyeecheung and removed request for anonrig October 5, 2024 08:45

codecov bot commented Oct 5, 2024

Codecov Report

Attention: Patch coverage is 17.24138% with 24 lines in your changes missing coverage. Please review.

Project coverage is 88.40%. Comparing base (bbdfeeb) to head (b12ba58).
Report is 1 commit behind head on main.

Files with missing lines    Patch %   Lines
src/encoding_binding.cc     12.50%    21 Missing ⚠️
lib/internal/encoding.js    40.00%    2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #55275      +/-   ##
==========================================
- Coverage   88.41%   88.40%   -0.02%     
==========================================
  Files         652      652              
  Lines      186612   186641      +29     
  Branches    36062    36068       +6     
==========================================
+ Hits       165001   165003       +2     
- Misses      14885    14910      +25     
- Partials     6726     6728       +2     
Files with missing lines    Coverage Δ
src/encoding_binding.h      100.00% <ø> (ø)
lib/internal/encoding.js    99.02% <40.00%> (-0.49%) ⬇️
src/encoding_binding.cc     73.52% <12.50%> (-10.72%) ⬇️

... and 31 files with indirect coverage changes

@anonrig
Member

anonrig commented Oct 5, 2024

Can you update benchmarks as well?

lib/internal/encoding.js
@@ -443,6 +446,10 @@ function makeTextDecoderICU() {
        return decodeUTF8(input, this[kIgnoreBOM], this[kFatal]);
      }

      if (this[kLatin1FastPath]) {
Member

After line 443, we need the following, since the fast path does not support options.stream:

this[kLatin1FastPath] &&= !(options?.stream);
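
The effect of the `&&=` suggestion above can be seen in a small standalone sketch. The class `SketchDecoder` and its `fastPath` property are hypothetical stand-ins for the decoder and `kLatin1FastPath`; the point is only that logical AND assignment turns the flag off on the first streaming call and never turns it back on.

```javascript
// Hypothetical stand-in for the decoder; `fastPath` plays the role of
// the kLatin1FastPath flag from the review suggestion.
class SketchDecoder {
  constructor() {
    this.fastPath = true;
  }

  decode(input, options) {
    // Assigns only when fastPath is currently truthy, so once a streaming
    // call sets it to false it stays false for later calls.
    this.fastPath &&= !(options?.stream);
    return this.fastPath ? 'fast' : 'slow';
  }
}

const decoder = new SketchDecoder();
const bytes = new Uint8Array([0x41]);
console.log(decoder.decode(bytes));                   // fast
console.log(decoder.decode(bytes, { stream: true })); // slow
console.log(decoder.decode(bytes));                   // slow
```

Keeping the flag off after the first streaming call is the conservative choice: a decoder that has buffered partial input should not switch back to a stateless path.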

Member Author

I'll apply this as suggested, thank you.

@@ -443,6 +446,10 @@ function makeTextDecoderICU() {
        return decodeUTF8(input, this[kIgnoreBOM], this[kFatal]);
      }

      if (this[kLatin1FastPath]) {
        return decodeLatin1(input);
Member

You should support the ignoreBOM and fatal options as well.
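
For reference, this is how the two options behave on the public `TextDecoder` API. The sketch uses UTF-8 because it is available in every Node.js build; the helper names `decodeWith` and `throwsOnInvalid` are made up for this example, and how the options should map onto the Latin1 path is left to the PR.

```javascript
const bomPrefixed = new Uint8Array([0xef, 0xbb, 0xbf, 0x41]); // UTF-8 BOM + "A"
const invalid = new Uint8Array([0xff]); // 0xFF is never valid in UTF-8

// Hypothetical helper: decode `bytes` as UTF-8 with the given options.
function decodeWith(options, bytes) {
  return new TextDecoder('utf-8', options).decode(bytes);
}

// Hypothetical helper: reports whether decoding invalid input throws a TypeError.
function throwsOnInvalid(options) {
  try {
    decodeWith(options, invalid);
    return false;
  } catch (err) {
    return err instanceof TypeError;
  }
}

// ignoreBOM: false (default) strips the BOM; true keeps it as U+FEFF.
console.log(decodeWith({}, bomPrefixed).length);                  // 1
console.log(decodeWith({ ignoreBOM: true }, bomPrefixed).length); // 2

// fatal: false (default) substitutes U+FFFD; true throws a TypeError.
console.log(decodeWith({}, invalid) === '\ufffd'); // true
console.log(throwsOnInvalid({ fatal: true }));     // true
```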

Member Author

I made an update covering this, thank you.

@mertcanaltin
Member Author

> Can you update benchmarks as well?

Would benchmark/util/normalize-encoding.js be the right place for this?
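
Independently of which benchmark file is chosen, a quick standalone timing sketch can give a rough before/after comparison. It does not use the repository's benchmark harness; `timeDecode` is a hypothetical helper, and the input size and iteration count are arbitrary.

```javascript
// Hypothetical standalone timing helper; not the Node.js benchmark harness.
function timeDecode(label, byteLength, iterations) {
  const decoder = new TextDecoder(label);
  // ASCII "a" bytes decode identically in UTF-8 and Latin1.
  const input = new Uint8Array(byteLength).fill(0x61);
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    decoder.decode(input);
  }
  return process.hrtime.bigint() - start; // elapsed nanoseconds as a BigInt
}

console.log(`utf-8: ${timeDecode('utf-8', 1024, 1000)} ns`);
```

On a build with full ICU data, passing `'latin1'` as the label would time the path this PR changes. Numbers from a loop like this are noisy and only indicative.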

5 participants