Release Notes v2024.06.21
stress-tess released this 21 Jun 19:30
Bug Fixes
- Issues #3074, #3234 - Fix bug reading `Segarray`s from parquet files
- Issues #3001, #3185 - Fix broadcast bugs involving `nan`s and `Strings`
- Issue #3156 - Fixes `Categorical.sort_values` bug
- Issues #3311, #3112 - Fix Parquet multi column byte writing and Parquet string column free
- Issue #3115 - Fixes non-deterministic `sparse_sum` failure
- Issue #3089 - Avoids out of memory crashes caused by `in` intents on `makeDistArray`
- Issue #3009 and PRs #3232, #3316 - Improve performance of `indexof1d` and fix handling of null values
- Issues #3158, #3222 - Fix print bugs involving `Dataframe` or `Series` containing a `Segarray`
Major Updates
- PR #3303 - Drops support for Chapel `1.31`
- Issues #3343, #3346 - Pin `numpy < 2.0` and `python < 3.12.4`
- Issue #3148 - Updates IO functions to always return a dictionary
- PRs #3238, #3314 and Issue #3347 - Reimplements CSV read to increase performance
- Issue #3108 - Adds `groupby.sample` and `dataframe.groupby.sample`
- Issue #2893 - Changes the behavior of `dataframe.GroupBy.count` to align with pandas
- Issues #3086, #3118, #3245, #3322, #3167 and PRs #3110, #3280 - Add updates to `Random` module:
  - Adds `choice`, `poisson`, `normal` to random number generators
- PRs #3242, #3305, #3160, #3223, #3237, #3142 - Improvements to Array API:
  - Add documentation for Array API functions
  - Add implementations of `vstack`, `clip`, `diff`, `pad` and missing stats, search, and sort functions to Array API module
  - Compatibility improvements for Xarray chunk-manager
- Issues #3213, #3206, #3202, #3208, #3217, #3188 - Add `Index` and `MultiIndex` properties:
  - Including `levels`, `equals`, `names`, `ndim`, etc
- Issues #3050, #3192, #3128, #3196, #3198, #3200, #3130, #3123, #3194 - Work on proto tests:
  - Improvements to tests for `dataframe`, `dtypes`, `groupby`, `io`, `numeric`, `symbol_table`
  - Adds `make-proto-tests` command and updates our CI to run it
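The version pins listed under Major Updates (`numpy < 2.0` and `python < 3.12.4`) can be checked against a local environment before upgrading. The following is a minimal sketch using only the standard library, not part of the release itself. The helper names `parse_version` and `satisfies_upper_bound` are hypothetical, and the comparison is a simplified numeric one that ignores pre-release suffixes such as `rc1`.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn a dotted version string such as '1.26.4' into a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    and any non-digit suffix (e.g. 'rc1') is dropped. This is a simplification.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies_upper_bound(version, bound):
    """Return True when `version` is strictly below `bound`."""
    return parse_version(version) < parse_version(bound)


# Upper bounds copied from the release notes.
NUMPY_UPPER = "2.0"
PYTHON_UPPER = "3.12.4"

python_version = ".".join(str(n) for n in sys.version_info[:3])
print("python", python_version, "ok:",
      satisfies_upper_bound(python_version, PYTHON_UPPER))

try:
    numpy_version = metadata.version("numpy")
    print("numpy", numpy_version, "ok:",
          satisfies_upper_bound(numpy_version, NUMPY_UPPER))
except metadata.PackageNotFoundError:
    # numpy is not installed in this environment, so there is nothing to check.
    print("numpy not installed")
```

For real dependency resolution, the pins declared in the project's own packaging files are what the installer enforces; this sketch is only a quick sanity check.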
Minor Updates
- Issues #3006, #3007 - Add `median` and `count_nonzero`
- Issues #3079, #3080 - Add `sum` and `+=` for boolean pdarrays
- PRs #3221, #3211 - Add NYC taxi tutorial from CUG 2024
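As a reference point for the new `median` and `count_nonzero` functions and for `sum` on boolean arrays, here is a pure-Python sketch of what such functions compute on a one-dimensional sequence. It assumes the new functions follow the semantics of the NumPy functions of the same name, which these notes do not state explicitly. It does not call the library, and the function names `count_nonzero_1d` and `median_1d` are hypothetical.

```python
import statistics


def count_nonzero_1d(values):
    """Count the entries that are not equal to zero (False counts as zero)."""
    return sum(1 for v in values if v != 0)


def median_1d(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    return statistics.median(values)


data = [0, 4, 1, 0, 3, 2]
flags = [True, False, True, True]

print(count_nonzero_1d(data))   # number of nonzero entries in data
print(median_1d(data))          # median of data
print(sum(flags))               # summing booleans counts the True entries
```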
Auto-Generated Release Notes
- Closes #3068 add doc strings for numpy imports by @ajpotts in #3077
- Add a random sampling with support for a weights array by @jeremiah-corrado in #3110
- Closes #3112: Fix Parquet string column free by @bmcdonald3 in #3113
- Closes #3115: Fix non-deterministic sparse_sum failure by @stress-tess in #3117
- Closes #3086: Add `choice` to random number generators by @stress-tess in #3114
- Closes #3118: Move `choice` implementation into arkouda by @stress-tess in #3138
- Closes #2947 change the name of the class dataframe.GroupBy by @ajpotts in #3146
- Avoid a warning about mismatched parSafe settings for list initialization by @lydia-duncan in #3149
- Closes #3116 remove DataFrame._columns by @ajpotts in #3147
- Closes #3124-dataframe.pyi-file and Closes #3097 numpy import docs at module level by @ajpotts in #3141
- Closes #3135 Update scipy/special_test by @ajpotts in #3137
- 3050 groupby etc by @drculhane in #3111
- multidimensional array bug fixes by @jeremiah-corrado in #3142
- Closes #3123-make-proto-tests by @ajpotts in #3126
- Closes #2893 dataframe.GroupBy.count to align with pandas by @ajpotts in #3125
- Closes #3051 Update akscipy_test by @ajpotts in #3136
- Fixes #3158: `Dataframe` containing a `Segarray` `.__str__()` bug by @stress-tess in #3161
- Closes #3089: Avoid OOM Crashes caused due to `in` intents on `makeDistArray` by @ShreyasKhandekar in #3163
- Resolve deprecation warning about not using 'new' in dmapped expressions by @jeremiah-corrado in #3162
- Closes #3079 and #3080: Sum and Plus Equal of Boolean Arrays by @jaketrookman in #3154
- Closes #3108: Add `groupby.sample` and `dataframe.groupby.sample` by @stress-tess in #3157
- Closes #3174: loosens type return restrictions of sum by @stress-tess in #3175
- Fixes #3001: nan broadcast bug by @stress-tess in #3173
- Dataframe Indexing by @brandon-neth in #3109
- Closes 3190 add mypy.ini by @ajpotts in #3191
- Closes #3192 PROTO_tests/tests/dtypes_test.py is failing by @ajpotts in #3193
- Fixes #3156: `Categorical.sort_values` bug by @stress-tess in #3168
- Closes #3148: Update IO functions to always return a dictionary by @stress-tess in #3164
- Re # 3128 fixes errors and omissions in PROTO-tests version of datafr… by @drculhane in #3139
- 3130 numeric test slight revamp by @drculhane in #3151
- 1D implementations of median and count_nonzero by @drculhane in #3187
- Closes #3196 PROTO_tests/tests/symbol_table.py failing by @ajpotts in #3197
- Closes #3198 PROTO_tests/tests/io_test.py failing by @ajpotts in #3199
- Closes #3200 PROTO_tests/tests/dataframe_test.py failing by @ajpotts in #3201
- Closes #3204 is_numeric to handle Index and Series type by @ajpotts in #3205
- Closes #3206 MultiIndex.levels by @ajpotts in #3207
- Array-API slice Assignment by @jeremiah-corrado in #3166
- Implement missing stats, search and sort functions for Array API by @jeremiah-corrado in #3160
- Closes #3202 Index.inferred_type by @ajpotts in #3203
- Closes #3208-Index.equals by @ajpotts in #3209
- Closes #3194 add proto tests to CI by @ajpotts in #3195
- Add benchmark for CSV Read and write perf by @ShreyasKhandekar in #3189
- Fixes #3185: strings broadcast bug by @stress-tess in #3210
- Closes #3167: Add `normal` to random number generators by @stress-tess in #3180
- Add NYC taxi tutorial from CUG 2024 by @bmcdonald3 in #3211
- Fix jupyter notebook formatting by @bmcdonald3 in #3221
- Closes #3009: `indexof1d` to handle null values by @stress-tess in #3169
- Compatibility improvements for Xarray chunk-manager by @jeremiah-corrado in #3223
- Closes #3215: Index.__get__item can accept a list by @ajpotts in #3216
- Closes #3217: MultiIndex.get_level_values by @ajpotts in #3218
- Move some definitions from ArrowFunctions header to source by @e-kayrakli in #3236
- Reduce file size for csvIO benchmark by @ShreyasKhandekar in #3239
- Part of #3229: CI failures due to `indexof1d` by @stress-tess in #3232
- Fixes #3074: Bug reading segarrays from parquet files by @stress-tess in #3233
- Closes #3227 add pandas stubs library by @ajpotts in #3228
- Closes #3213 Index properties by @ajpotts in #3214
- Add implementations of `clip`, `diff`, `pad` to Array API module by @jeremiah-corrado in #3237
- Closes #3188 multi index.equals by @ajpotts in #3225
- Fixes #3222: series of segarray print bug by @stress-tess in #3240
- Fixing a missing `iloc` usage by @brandon-neth in #3243
- Closes #3249: Fix issue with finding incorrect conftest file for proto tests by @bmcdonald3 in #3250
- Fixes #3234: segarray with empty segments and nans parquet bug by @stress-tess in #3241
- Array API Documentation by @jeremiah-corrado in #3242
- Fixes #3252: proto `test_segarray_read` failure with multi-locale by @stress-tess in #3254
- Closes #3255 move numeric.floor to numpy module by @ajpotts in #3257
- Remove single-column cases from multi-col-merge test. by @brandon-neth in #3248
- Benchmark Display Performance Fix by @brandon-neth in #3276
- Closes #3245: Add `poisson` distribution to random number generators by @stress-tess in #3253
- poisson cleanup by @stress-tess in #3280
- Drop support for Chapel 131 by @bmcdonald3 in #3303
- Improve the CSV Read performance with new impl by @ShreyasKhandekar in #3238
- fixes typo, server host maintained for connect call by @brandon-neth in #3307
- Clean up Compat modules by @jeremiah-corrado in #3306
- Add vstack implementation by @jeremiah-corrado in #3305
- Closes #3311: Fix Parquet multi column byte writing by @bmcdonald3 in #3312
- Fix CSV Reading Bugs by @ShreyasKhandekar in #3314
- Part of #3229: Re-enable find implementation of `indexof1d` by @stress-tess in #3316
- Reverts DataFrame Indexing Changes by @brandon-neth in #3323
- Closes #3343: Pin `numpy < 2.0` to avoid CI failures by @stress-tess in #3344
- Related to #3346: Disable `3.12`/`3.x` workflows by @stress-tess in #3349
- Fixes #3322: `poisson` same seed reproducibility bug by @stress-tess in #3331
- Resolves #3347: Eliminate csv file reads fully into memory by @ShreyasKhandekar in #3348
- Closes #3339: Add multi-batch parquet read tests by @stress-tess in #3350
New Contributors
- @ShreyasKhandekar made their first contribution in #3163
Full Changelog: v2024.04.19...v2024.06.21