Polars: py-1.17.0 Release

Release date: December 8, 2024
Previous version: py-1.16.0 (released December 3, 2024)
Magnitude: 13,861 Diff Delta
Contributors: 29 total committers

134 Commits in this Release

Top Contributors in py-1.17.0

coastalwhite
nameexhaustion
ritchie46
mcrumiller
siddharth-vi
MarcoGorelli
alexander-beedie
stijnherfst
EvanLai88
lukapeschke

Release Notes Published

🚀 Performance improvements

  • Add fast paths for series.arg_sort and dataframe.sort (#19872)
  • Much faster Series construction from subclasses of standard Python types (#20166)
  • Utilize the RangedUniqueKernel for Enum/Categorical (#20150)
  • Reduce memory copy when scanning from Python objects (#20142)
  • Construct Series for bytes/binary data 10x faster when dtype not explicitly set (#20157)
  • Don't instantiate validity mask when unneeded in Parquet (#20149)

✨ Enhancements

  • Retry with reloaded credentials on cloud error (#20185)
  • Support reading Enum dtype from csv (#20188)
  • Improve dtype inference and load for DataFrame cols constructed from Python Enum values (#20180)
  • Allow sorting of lists and arrays (#20169)
  • Add maintain_order parameter to joins (#20026)
  • Allow to_datetime / strptime to automatically parse dates with single-digit hour/minute/second (#20144)
  • Issue warning when using to_struct() without a list of field names (#20158)
  • Experimental cloud write support (#20129)
  • Add lazy support for pl.select (#20091)
  • Enable view arrow export in write_delta (#20092)

🐞 Bug fixes

  • Don't trigger length check in array construction (#20205)
  • Allow row encoding for 32-bit architectures (e.g. WASM) (#20186)
  • Properly project unordered column in parquet prefiltered (#20189)
  • Csv stop simd cache if eol char is hit (#20199)
  • Estimated size for object (#20191)
  • Respect parallel argument in parquet (#20187)
  • Only validate UTF-8 for selected items when all below len 128 (#20183)
  • Serialize categories of Enum in arrow metadata (#20181)
  • Don't use RLE encoding for Parquet Boolean (#20172)
  • Invalid bitwise_xor for ScalarColumn (#20140)
  • Series construct with large nested u64 (#20167)
  • Add temporal feature gate in is_elementwise_top_level (#20177)
  • Column name mismatch or not found in Parquet scan with filter (#20178)
  • Raise if apply returns different types (#20168)
  • Deal with masked out list elements (#20161)
  • Fix index out of bounds in uniform_hist_count (#20133)
  • Implement arg_sort for Null series (#20135)
  • Handle slice pushdown in PythonUDF GroupBy (#20132)
  • Check shape for *_horizontal functions (#20130)
  • Properly coerce types in lists (#20126)
  • Incorrect aggregation of empty groups after slice (#20127)
  • DataFrame .get_column after drop_in_place (#20120)
  • Subtraction with underflow on empty FixedSizeBinaryArray (#20109)
  • Materialize smallest dyn ints to use feature gate for i8/i16 (#20108)
  • Return null instead of 0. for rolling_std when window contains a single element and ddof=1 and there are nulls elsewhere in the Series (#20077)
  • Only slice after sort when slice is smaller than frame length (#20084)
  • Preserve Series name in __rpow__ operation (#20072)
  • Allow nested is_in() in when()/then() for full-streaming (#20052)

📖 Documentation

  • Add more Rust examples to User Guide (#20194)
  • Expand plotting docs (#19719)
  • Fix Rust examples in user guide (#20075)
  • Update by param description for rolling_*_by functions (#19715)
  • Correct supported compression formats (#20085)
  • Specify strictness in cast (#20067)

📦 Build system

  • Upgrade sqlparser-rs from version 0.49 to 0.52 (#20110)
  • Bump memmap2 to version 0.9 (#20105)
  • Bump object_store to version 0.11 (#20102)
  • Bump fs4 to version 0.12 (#20101)
  • Bump thiserror to version 2 (#20097)
  • Bump atoi_simd to version 0.16 (#20098)
  • Bump chrono-tz to 0.10 (#20094)
  • Update Rust dependency ndarray to 0.16 (#20093)
  • Bump Rust toolchain to nightly-2024-11-28 (#20064)

🛠️ Other improvements

  • Deprecate ddof parameter for correlation coefficient (#20197)
  • Move Bitwise aggregations to FunctionExpr (#20193)
  • Add ragged lines test (#20182)
  • Set delta version check higher (#20153)
  • Fix typo in assertion in datatype copy test (#20121)
  • Move horizontal methods to polars-ops (#20134)
  • Remove useless SeriesTrait::get implementations (#20136)
  • Add a bunch more automated row encoding sortedness tests (#20056)

Thank you to all our contributors for making this release possible! @DzenanJupic, @MarcoGorelli, @YichiZhang0613, @alexander-beedie, @coastalwhite, @dependabot, @dependabot[bot], @flowlight0, @henryharbeck, @iharthi, @ion-elgreco, @jqnatividad, @lukapeschke, @lukemanley, @mcrumiller, @nameexhaustion, @ptiza, @ritchie46, @siddharth-vi, @stijnherfst, @stinodego and @wsyxbcl