Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: Detect session encoding incapable of U+00F1 #6567

Merged
merged 2 commits into from
Oct 11, 2024

Conversation

aitap
Copy link
Contributor

@aitap aitap commented Oct 10, 2024

iconv() does not necessarily fail to convert U+00F1 to ASCII and return NA. For example, FreeBSD iconv() succeeds and returns ? instead of the character in question. Use identical() to compare the result of the conversion back to the original (which internally converts both to UTF-8).

Closes: #6339

Should with the part of #6559 that's about encodings, but not OpenMP

iconv() does not necessarily fail to convert U+00F1 to ASCII. For
example, FreeBSD iconv() succeeds and returns '?' instead of the
character in question. Use identical() to compare the result of the
conversion back to the original (which internally converts both to
UTF-8).
Copy link

codecov bot commented Oct 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.62%. Comparing base (9b8d496) to head (873f581).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6567   +/-   ##
=======================================
  Coverage   98.62%   98.62%           
=======================================
  Files          79       79           
  Lines       14450    14450           
=======================================
  Hits        14251    14251           
  Misses        199      199           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@aitap aitap mentioned this pull request Oct 11, 2024
@nunotexbsd
Copy link

Hello,

This change fixes test #6559

Thanks

@@ -18768,7 +18768,7 @@ if (test_bit64) local({
# non-ASCII plain symbol in by, #4708
# NB: recall we can't use non-ASCII symbols in the test script. The text is a-<n-tilde>-o (year in Spanish)
native_ano = iconv("a\U00F1o", "UTF-8", "")
if (!is.na(native_ano)) { # #6339: symbol must be represented in native encoding
if (identical(native_ano, "a\U00F1o")) { # #6339: symbol must be represented in native encoding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, thanks. It still leaves us with the else branch here on FreeBSD where this is unanticipated / the behavior of DT[,.N, by=año] goes untested there. But let's start with un-breaking the test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up is #6573

@MichaelChirico MichaelChirico merged commit 315ab54 into master Oct 11, 2024
8 checks passed
@MichaelChirico MichaelChirico deleted the iconv-result-identical branch October 11, 2024 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-ascii symbols on systems without UTF-8
3 participants