Polars: py-0.19.13 Release

Release date: November 10, 2023
Previous version: py-0.19.13-rc.1 (released November 1, 2023)
Magnitude: 8,954 Diff Delta
Contributors: 20 total committers

94 Commits in this Release

Top Contributors in py-0.19.13

stinodego
ritchie46
orlp
reswqa
MarcoGorelli
alexander-beedie
moritzwilksch
nameexhaustion
owrior
Julian-J-S

Release Notes Published

🏆 Highlights

  • Improve join performance through radix partitioned join (#12270)
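
For readers unfamiliar with the term, a radix-partitioned hash join first scatters both inputs into partitions by key hash and then joins each pair of partitions independently, which keeps every hash table small and makes the work easy to split across threads. The sketch below illustrates that general idea in plain Python. It is not Polars' Rust implementation; the function name `radix_partitioned_join`, the row layout, and the partition count are assumptions chosen for illustration.

```python
from collections import defaultdict


def radix_partitioned_join(left, right, n_partitions=4):
    """Inner-join two lists of (key, value) rows by hash-partitioning first.

    Illustrative only: the name and the partition count are hypothetical.
    """
    # Phase 1: scatter the rows of both sides into partitions by key hash.
    left_parts = [[] for _ in range(n_partitions)]
    right_parts = [[] for _ in range(n_partitions)]
    for key, value in left:
        left_parts[hash(key) % n_partitions].append((key, value))
    for key, value in right:
        right_parts[hash(key) % n_partitions].append((key, value))

    # Phase 2: join each partition pair with its own small hash table.
    # Equal keys always land in the same partition, so the pairs are
    # independent and could be handled by separate threads.
    out = []
    for lpart, rpart in zip(left_parts, right_parts):
        table = defaultdict(list)
        for key, value in rpart:
            table[key].append(value)
        for key, lvalue in lpart:
            for rvalue in table.get(key, ()):
                out.append((key, lvalue, rvalue))
    return out


left = [(1, "a"), (2, "b"), (3, "c"), (2, "d")]
right = [(2, "x"), (3, "y"), (4, "z"), (2, "w")]
result = sorted(radix_partitioned_join(left, right))
```

The output order depends on how rows fall into partitions, so the usage example sorts the result before inspecting it.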

⚠️ Deprecations

  • Rename write_csv parameter has_header to include_header (#12351)
  • Deprecate _saturating in duration string language, make it the default (#12301)
  • Switch args for Decimal and set default scale=0 (#12224)
  • Rename dt.seconds to dt.total_seconds (likewise for days, hours, minutes, milliseconds, microseconds, and nanoseconds) (#12179)
  • Deprecate DataFrame.as_dict positional input (#12131)

🚀 Performance improvements

  • indexvec in group-by (#12371)
  • Reduce allocations in hash join (#12368)
  • Change concurrency parameters (#12321)
  • Improve join performance through radix partitioned join (#12270)
  • Remove extra multiplication in hash_to_partition (#12233)
  • Allow non-power-of-two partitions (#12225)
  • Reduce compute in error message for failed datetime parsing (#12147)
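
Two of the items above concern how a 64-bit hash is mapped to a partition index. One well-known way to support partition counts that are not powers of two without a modulo is the multiply-shift range reduction, sketched below in Python. Whether Polars' `hash_to_partition` is implemented exactly this way is an assumption; the Python function here is a hypothetical stand-in that only demonstrates the technique.

```python
MASK64 = (1 << 64) - 1


def hash_to_partition(h: int, n_partitions: int) -> int:
    """Map a 64-bit hash to the range [0, n_partitions) via multiply-shift."""
    # Treat h as an unsigned 64-bit value and keep the high 64 bits of the
    # 128-bit product h * n_partitions. No power-of-two count is required.
    return ((h & MASK64) * n_partitions) >> 64


# Hashes spread over the 64-bit range map proportionally onto 3 partitions.
partitions = [hash_to_partition(h, 3) for h in (0, 1 << 62, 1 << 63, MASK64)]
```

The mapping uses the high bits of the hash, so it relies on the hash function distributing values across the full 64-bit range.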

✨ Enhancements

  • Updated BytecodeParser for Python 3.12 (#12348)
  • Add round_sig_figs expression for rounding to significant figures (#11959)
  • Change concurrency parameters (#12321)
  • Deprecate _saturating in duration string language, make it the default (#12301)
  • Auto-infer ambiguous for truncate and round (#12204)
  • Allow construction of Datetime series from datetime.date array (#12175)
  • New Config options for numeric formatting: digit grouping and thousands/decimal separator (#12099)
  • Allow non-aggregation predicate in ternary groupby (#12286)
  • Add name= in .write_avro to set schema name (#12255)
  • Update write_delta to write large arrow types without casting (#12260)
  • Add support for reading zstd compressed files (no-options) in read_csv (#12214)
  • Start prefetching all files immediately (#12201)
  • Expose more options to plugin registration (#12197)
  • Add .list.to_array expression (#12192)
  • Consolidate & improve all casting failure error messages (#12168)
  • Add Binary dtype to hypothesis tests (#12140)
  • Tunable concurrency (#12171)
  • Support reverse sort in streaming (#12169)
  • Add .arr.to_list expression (#12136)
  • Support decimals in assert utils (#12119)
  • Add concurrency budget (#12117)
  • Improved support for use of file-like objects with DataFrame "write" methods (#12113)
  • Introduce ignore_nulls for str.concat (#12108)

🐞 Bug fixes

  • Do not cast lit if has same dtype (#12342)
  • Fix index column name of rolling/dynamic group by (#12365)
  • Ternary broadcasting with empty truthy or falsy and agg predicate (#12357)
  • UInt64 should be correctly extracted from python object (#12338)
  • Ignore IDE-mediated DeprecationWarning when debugging tests under 3.12 (#12343)
  • expr_output_name include literal (#12335)
  • Fix Decimal dtype table repr (#12318)
  • Fix behavior of month intervals in date_range (#12317)
  • Include row_count when scanning an empty CSV (#12316)
  • zip_with also broadcast mask (#12309)
  • respect hive_partitioning flag when dealing with multiple files (#12315)
  • parquet, add row_count to empty file materialization (#12310)
  • Fix invalid DeprecationWarning generated from date_range defined with 'saturating' interval (#12311)
  • fix download ranges in parquet (#12313)
  • object store path derivation for local URL (#12308)
  • don't move right endpoint of windows in rolling in default offset==-period case (#12267)
  • Raise more informative error on invalid reshape input (#12288)
  • incorrect super type for literals in nested binary exprs (#12238)
  • typo in exception message (#12278)
  • fix ambiguous aggregation type (#12269)
  • return frames from read_excel in the originally specified order (#12243)
  • Consistently propagate nulls for numpy ufuncs (#12212)
  • respect return_scalar of list scalars (#12251)
  • fix plugins system on Windows (#12230)
  • potential overflow (#12206)
  • always start a new thread if the thread is already blocking (#12202)
  • with_row_count should block predicate push down for lazy csv (#12187)
  • rechunk failed-list series before iterate (#12189)
  • Fix interchange protocol boolean buffer size (#12177)
  • fix incorrect desc sort behavior (#12141)
  • take should block predicate pushdown (#12130)
  • use null type when read from unknown row (#12128)
  • boundary predicate to block all accumulated predicates in push down (#12105)
  • make python schema_overrides information available to the rust-side inference code when initialising from records/dicts (#12045)
  • fix panic when initializing Series with array of list dtype (#12148)
  • Fix schema of arr.min/max (#12127)
  • ensure filter predicate inputs exist in schema (#12089)
  • Update null_count after arithmetic (#12280)

🛠️ Other improvements

  • Workaround for maturin issue (#12370)
  • Fix incorrect boundary column name in group_by_dynamic docstrings (#12366)
  • Fix typo in rolling_* docstrings (#12362)
  • Fix ruff linting invocation (#12350)
  • Clean up conversion utils (#11789)
  • Organize Cargo.toml (#12323)
  • Consolidate "getting started" and "user guide" sections (#12246)
  • Minor updates to prepare for Python 3.12 support (#12314)
  • Move script for testing map warning (#12306)
  • simplify expr checking in predicate push down (#12287)
  • Remove external link (#12223)
  • Fix rebase issue breaking CI (#12296)
  • Add top-level make clippy, simplify Rust linting workflows (#12290)
  • ensure we git-ignore ALL .venv dirs (#12289)
  • incorrect super type for literals in nested binary exprs (#12238)
  • Remove recommended setting from IDE docs (#12275)
  • Clean up Python test workflow (#12261)
  • clarify contains selector (#12265)
  • Add py-polars to Cargo workspace (#12256)
  • Use .with_columns in some docstrings (#12250)
  • Add test for scan_csv plus slice (#12239)
  • Fix emphasis formatting in docstring (#12240)
  • Fix emphasis formatting in docstring (#12237)
  • add deprecation notices to the docs for expressions moved into the new name namespace (#12236)
  • update Cargo.lock (#12226)
  • make sort test work with unstable sort (#12221)
  • Build Python wheels on manylinux_2_28 (#12211)
  • Include rust-toolchain.toml with sdist/wheels (#12184)
  • Standardize project name formatting across docs (#12185)
  • Update sqlparser to 0.39 (#12173)
  • pin ring (#12176)
  • Improve strip_{prefix, suffix} & strip_chars_{start, end} (#12161)
  • Fix tests for pyarrow 14 (#12170)
  • Fix rendering of note in DataFrame.fold (#12164)
  • Fix triggers for docs deployment (#12159)
  • Refactor some tests (#12121)
  • Consolidate contributing info (#12109)
  • Fix typo in user-guide/expressions/plugins.md (#12115)
  • Render docstring text in single backticks as code (#12096)
  • use more ergonomic syntax in select/with_columns where possible (#12101)
  • Update CODEOWNERS (#12107)
  • visualize plugin directory layout in user guide (#12092)
  • Minor tweak in code example in section Expressions/Aggregation (#12033)
  • Minor tweak in code example in section Expressions/Missing data (#12080)
  • Minor improvements to the docs website (#12084)

Thank you to all our contributors for making this release possible! @JulianCologne, @MarcoGorelli, @Priyansh121096, @alexander-beedie, @cmdlineluser, @daviskirk, @dependabot, @dependabot[bot], @dgilman, @hirohira9119, @ion-elgreco, @jrycw, @mcrumiller, @moritzwilksch, @nameexhaustion, @orlp, @owrior, @rancomp, @reswqa, @ritchie46, @rob-sil, @stefmolin, @stinodego and @wsyxbcl