Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add decimal validator #763

Merged
merged 2 commits into from
Aug 9, 2023
Merged

add decimal validator #763

merged 2 commits into from
Aug 9, 2023

Conversation

davidhewitt
Copy link
Contributor

@davidhewitt davidhewitt commented Jul 11, 2023

Change Summary

Adds a Rust implementation of decimal validation and serialization. Needs more tests and benchmarks to confirm it's worth going for this instead of pydantic/pydantic#6327

Related issue number

N/A

Checklist

  • Unit tests for the changes exist
  • Documentation reflects the changes where applicable
  • Pydantic tests pass with this pydantic-core (except for expected changes)
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @dmontagu

@codspeed-hq
Copy link

codspeed-hq bot commented Jul 11, 2023

CodSpeed Performance Report

Merging #763 will degrade performances by 43.74%

Comparing dh/decimal (997cbde) with main (c94fa15)

Summary

🔥 1 improvements
❌ 1 regressions
✅ 133 untouched benchmarks

🆕 3 new benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main dh/decimal Change
🔥 test_dont_raise_error 36 µs 25.6 µs +40.74%
test_tagged_union_int_keys_python 33.3 µs 59.1 µs -43.74%
🆕 test_decimal_from_string_core N/A 67.2 µs N/A
🆕 test_decimal_from_string_limit N/A 20 µs N/A
🆕 test_decimal_from_string_pyd N/A 65.2 µs N/A

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a good start.

The easiest way to check performance would be to copy the validator constructed by pydantic and add it to benchmarks here, together with the function from pydantic. Then run benchmarks. You can probably remove that once we're sure it's faster.

src/input/input_json.rs Outdated Show resolved Hide resolved
src/input/input_json.rs Outdated Show resolved Hide resolved
src/input/input_json.rs Outdated Show resolved Hide resolved
src/input/input_python.rs Outdated Show resolved Hide resolved
src/input/input_json.rs Outdated Show resolved Hide resolved
src/serializers/type_serializers/decimal.rs Outdated Show resolved Hide resolved
src/validators/decimal.rs Outdated Show resolved Hide resolved
src/validators/decimal.rs Outdated Show resolved Hide resolved
src/validators/decimal.rs Outdated Show resolved Hide resolved
src/validators/decimal.rs Outdated Show resolved Hide resolved
Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple more points.

src/input/input_python.rs Outdated Show resolved Hide resolved
src/input/input_python.rs Outdated Show resolved Hide resolved
@davidhewitt
Copy link
Contributor Author

I think the implementation here for the validator is now complete and is as optimized as can be. Still to do:

  • Serializer feedback
  • More tests
  • Create known error types

I will park this until we make a decision whether to move forwards with it. The alternative is pydantic/pydantic#6781

@davidhewitt davidhewitt force-pushed the dh/decimal branch 3 times, most recently from 1586486 to a088741 Compare July 24, 2023 15:40
@codecov
Copy link

codecov bot commented Jul 24, 2023

Codecov Report

Merging #763 (997cbde) into main (c94fa15) will decrease coverage by 0.24%.
The diff coverage is 80.92%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #763      +/-   ##
==========================================
- Coverage   94.06%   93.82%   -0.24%     
==========================================
  Files         102      104       +2     
  Lines       15023    15365     +342     
  Branches       25       25              
==========================================
+ Hits        14131    14416     +285     
- Misses        886      943      +57     
  Partials        6        6              
Files Changed Coverage Δ
src/errors/mod.rs 92.30% <ø> (ø)
src/serializers/shared.rs 79.56% <0.00%> (-0.98%) ⬇️
src/validators/decimal.rs 71.09% <71.09%> (ø)
src/serializers/type_serializers/decimal.rs 93.02% <93.02%> (ø)
src/input/input_python.rs 98.05% <96.96%> (-0.10%) ⬇️
python/pydantic_core/core_schema.py 96.77% <100.00%> (+0.06%) ⬆️
src/errors/types.rs 99.50% <100.00%> (+0.01%) ⬆️
src/input/input_abstract.rs 90.40% <100.00%> (+0.40%) ⬆️
src/input/input_json.rs 91.63% <100.00%> (+0.30%) ⬆️
src/validators/mod.rs 97.74% <100.00%> (+0.01%) ⬆️

... and 3 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c94fa15...997cbde. Read the comment docs.

@davidhewitt davidhewitt force-pushed the dh/decimal branch 2 times, most recently from ed0bd96 to 2224e98 Compare July 25, 2023 16:49
@davidhewitt davidhewitt force-pushed the dh/decimal branch 2 times, most recently from 2fa96b3 to 0f7ca0d Compare July 25, 2023 19:18
input: &'a impl Input<'a>,
decimal_type: &'a PyType,
) -> ValResult<'a, &'a PyAny> {
decimal_type.call1((arg,)).map_err(|e| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning this part maybe we can do the calculation in rust and set the attributes of the python class Decimal: __slots__ = ('_exp','_int','_sign', '_is_special') WDYT @davidhewitt ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR I'm trying to keep the same logic as pydantic just move the implementation. I would definitely be interested in this as a future improvement. I believe @samuelcolvin had some previous thoughts on how the parsing might be done (by looking at the digits as a string), not sure if there was any implementation put somewhere?

fn strict_decimal(&'a self, decimal_type: &'a PyType) -> ValResult<&'a PyAny> {
let py = decimal_type.py();
match self {
JsonInput::Float(f) => create_decimal(PyString::new(py, &f.to_string()), self, decimal_type),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way for us to make sure that when parsing numeric JSON into a decimal it is treated as a string rather than loaded into an f64 or whatever? I think @samuelcolvin said he did something like this in speedate

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Maybe it's already doing that, not sure, but it's probably worth checking that we load many-digit JSON numbers into decimal properly)

src/input/input_python.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me worth adding a test that Decimal('123456789123456789123456789.123456789123456789123456789') (or some other similarly high-precision decimal) gets serialized properly (at least to JSON).

@dmontagu
Copy link
Collaborator

I made various comments; I think at least some merit changes (e.g., removing the print line), and I do think there are some logic changes worth making (i.e., allowing proper subclasses of Decimal to pass validation unmodified), but happy to defer to your judgement.

For hooky: please update

@davidhewitt
Copy link
Contributor Author

Thanks for the thorough review; that definitely caught a couple of bugs!

I've rebased this PR and added a commit to make the desired review changes.

@davidhewitt
Copy link
Contributor Author

please review

@davidhewitt
Copy link
Contributor Author

Many thanks, I rebased and dropped the commit which pointed the integration tests at pydantic/pydantic#6858 👍

@davidhewitt davidhewitt enabled auto-merge (squash) August 9, 2023 22:43
@davidhewitt davidhewitt merged commit 672c2b4 into main Aug 9, 2023
28 of 29 checks passed
@davidhewitt davidhewitt deleted the dh/decimal branch August 9, 2023 22:49
@Masynchin Masynchin mentioned this pull request Jul 22, 2024
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants