Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writing floats is locale dependent #70

Open
smurfix opened this issue Aug 31, 2020 · 6 comments
Open

Writing floats is locale dependent #70

smurfix opened this issue Aug 31, 2020 · 6 comments

Comments

@smurfix
Copy link

smurfix commented Aug 31, 2020

Hi,
CJSON is not locale independent. Specifically, if the current locale wants us to encode the decimal point as a comma, as in Germany, CJSON will happily do so.

This makes the other side rather unhappy. Worse, if the problem occurs in a list it's not even recognized by the receiving parser.

Please fix!

@mpx
Copy link
Owner

mpx commented Sep 1, 2020

Afaik, this bug was fixed in 1.0.4.

Can you please provide some more details about the platform you are using, and how you are compiling Lua CJSON?

There are 2 ways of compiling floating point support:

  • Adapting the system strtod/snprintf routines to use "." with any locale (the default)
  • Using internal dtoa/g_fmt routines to avoid any locale issues and increase performance. (USE_INTERNAL_FPCONV).

This is documented in the manual under Built-in floating point conversion.

@smurfix
Copy link
Author

smurfix commented Sep 2, 2020

I'm using the Debian package for 2.1 which doesn't use specific preprocessor arguments.
Apparently the program I'm linking with manages to change the current locale after your fpconv_init gets called.

I'll ask the Debian maintainer to build with USE_INTERNAL_FPCONV, but cjson still needs to support usage with a non-constant locale setting IMHO.

@smurfix
Copy link
Author

smurfix commented Sep 2, 2020

Also, what about locales that don't even use "our" digits?
Please consider unconditionally switching to USE_INTERNAL_FPCONV and/or using that as the default.

@mpx
Copy link
Owner

mpx commented Sep 2, 2020

While the hack to detect/adapt the decimal separator for different locales is ugly, I'm not aware of any specific locales where it doesn't work (happy to hear otherwise).

Changing the locale after the program has started is typically a bad strategy. setlocale isn't thread-safe, and various code might make assumptions based on the previous configuration.

I agree it would be better to have a single implementation. However, there are tradeoffs:

  • Platform routines
    • Pro: Work everywhere
    • Con: Slower, breaks if locale changes to a different decimal separator mid-execution (ideally that should never happen).
  • Internal dtoa routines
    • Pro: Faster
    • Con: Needs to be correctly built for little/big endian, and with/without Pthreads.

It's an awkward tradeoff. I wanted Lua CJSON to just work out of the box, which means using the platform routines. The internal dtoa has advantages but must be build correctly for the platform and the codebase.

If Debian wants to use the internal routines it will need to correctly set the endianness, and probably always enable/link pthreads (since it may be used in a multi-threaded application).

@mpx
Copy link
Owner

mpx commented Sep 3, 2020

Another option: Can you call fpconv_init after the locale has been changed? fpconv_init isn't thread safe, but neither is setlocale, so it shouldn't be any worse.

@mpx
Copy link
Owner

mpx commented Jan 23, 2021

Another option..

You could use compiler defines to autodetect the correct endianness for USE_INTERNAL_FPCONV by adapting something like this: https://stackoverflow.com/a/58642785

It might be reasonable to add something similar to CJSON as well, but it would need testing across a wide range of compilers/platforms first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants