Polars: py-0.20.11 Release

Release date: February 27, 2024
Previous version: py-0.20.10 (released February 19, 2024)
Magnitude: 3,467 Diff Delta
Contributors: 15 total committers

55 Commits in this Release

Top Contributors in py-0.20.11

nameexhaustion
stinodego
c-peters
mcrumiller
orlp
MarcoGorelli
ritchie46
mbuhidar
alexander-beedie
Object905

Release Notes Published

[!WARNING]
This release was deleted from PyPI. Please use the 0.20.13 release instead.

πŸ† Highlights

  • fast path for COUNT(*) queries (#14574)

⚠️ Deprecations

  • Deprecate passing time_unit=None to Datetime constructor (#14708)
  • Rename Expr.meta.write_json/Expr.from_json to Expr.meta.serialize/Expr.deserialize (#14490)
  • Deprecate default value for ignore_nulls for ewm methods (#14663)
  • Deprecate DataFrame/LazyFrame.approx_n_unique (#14594)

🚀 Performance improvements

  • 2-3x speedup in creating literals/Series of type Date (#14716)
  • fix accidental quadratic utf8 validation in parquet (#14705)
  • Add __slots__ to most Polars classes (#13236)
  • fast path for COUNT(*) queries (#14574)
  • Elide the total order wrapper for non-(float/option) types (#14648)
  • add utf8-validation fast paths for utf8view (#14644)
  • don't reassign chunks back to df owner (#14633)
  • If there are many small chunks in write_parquet(), convert to a single chunk (#14484) (#14487)
  • Fix the Polars thread pool not being used properly in various functions (#14583)

✨ Enhancements

  • Change default for maximum number of Series items printed to 10 to match DataFrame (#14703)
  • Change default number of rows printed in Notebooks for DataFrame/Series to 10 (#14536)
  • Infer values columns in DataFrame.pivot when values is None (#14477)
  • fast path for COUNT(*) queries (#14574)
  • let rolling accept index_column of type UInt32 or UInt64 (#14669)
  • Treat float -0.0 == 0.0 and -NaN == NaN in group-by, joins and unique (#14617)
  • Improve consistency of dtype inference from Python types (#14600)
  • Properly cache object-stores (#14598)
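
The float-equality change above (#14617) can be pictured as canonicalizing floats before they are used as hash keys. The sketch below is plain Python and is not Polars' implementation; `canonical_key` and `unique_floats` are hypothetical helper names introduced only for this illustration.

```python
import math
import struct


def canonical_key(x: float) -> bytes:
    """Return a byte key under which -0.0 equals 0.0 and every NaN equals NaN."""
    if math.isnan(x):
        x = math.nan  # collapse all NaN bit patterns, including -NaN
    elif x == 0.0:
        x = 0.0  # drop the sign bit of -0.0
    return struct.pack("<d", x)


def unique_floats(values):
    """Keep the first value seen for each canonical key, preserving order."""
    seen = {}
    for v in values:
        seen.setdefault(canonical_key(v), v)
    return list(seen.values())


result = unique_floats([0.0, -0.0, math.nan, -math.nan, 1.5])
print(len(result))
```

Without the canonicalization step, the raw bit patterns of `0.0` and `-0.0` differ, so a hash-based group-by, join, or unique would treat them as distinct keys.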

🐞 Bug fixes

  • Fix parallel strategy for LazyFrame not being applied (#14696)
  • Block slice pushdown past non-literal projections or when the projection doesn't contain any columns from the input (#14684)
  • Fix number of rows printed in DataFrame/Series repr (edge cases) (#14548)
  • Fix contention panics in file gc threads (#14690)
  • Fix feature combination (#14688)
  • Only push predicates depending on the subset columns past unique() (#14668)
  • Properly handle a single empty RecordBatch in from_arrow (#14683)
  • More accurate type hints for binary file-like inputs (#14674)
  • Fix reading RLE_DICTIONARY-encoded Parquet incorrectly coalescing NULL to an empty string in some cases (#14670)
  • use correct flooring division/modulo operator in literal optimizer and const_lhs <> series ops (#14671)
  • Enable is_in for string in categorical/enum (#14576)
  • Fix a read_database issue loading specific datetime types from SQL Server backends (#14627)
  • Fix the Polars thread pool not being used properly in various functions (#14583)
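
The flooring division/modulo fix above (#14671) does not spell out the details here. Assuming it concerns the usual difference between flooring and truncating semantics for negative operands, the distinction looks like this in plain Python. `trunc_div` and `trunc_mod` are hypothetical helpers written for this sketch and are not Polars functions.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division that rounds toward zero (C/Rust-style)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def trunc_mod(a: int, b: int) -> int:
    """Remainder paired with trunc_div; takes the sign of the dividend."""
    return a - b * trunc_div(a, b)


# Python's // and % floor toward negative infinity.
floor_q, floor_r = -7 // 2, -7 % 2
trunc_q, trunc_r = trunc_div(-7, 2), trunc_mod(-7, 2)

print(floor_q, floor_r)  # -4 1
print(trunc_q, trunc_r)  # -3 -1
```

The two conventions agree when both operands are positive, which is why a mix-up of this kind tends to surface only with negative inputs.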

📖 Documentation

  • Improve some DataType docstrings (#14719)
  • Fix bad link due to boldness in pl.count (#14691)
  • Improve docstrings for ewm_* and rolling_* methods (#14667)
  • Improve examples for Series.binary.encode and Series.binary.decode. (#14579)
  • Add examples for Series.kurtosis (#14681)
  • Fix docstring for LazyGroupBy.len (#14661)
  • Separate "writing a plugin" from "registering an expression" in user guide, add some extra links, don't use deprecated _register_plugin (#14621)
  • Fix code block path for group by example in getting started guide (#14612)
  • Add missing 'string' column in reading-writing Rust example to match Python example (#14597)

📦 Build system

  • Limit CMake threads to fix crash compiling libz-ng-sys on macOS (#14715)

πŸ› οΈ Other improvements

  • Limit CMake threads to fix crash compiling libz-ng-sys on macOS (#14715)
  • Fix make requirements when conda environment is active (#14693)
  • update rustc (#14678)
  • Remove redundant imports in all crates (#14662)
  • Avoid unnecessary cast in Series constructor (#14650)
  • Add test on selecting Enum columns (#14628)
  • Use uv for make requirements (#14618)
  • Rename coverage file (#14607)
  • Add a lint-only Makefile option (#14602)
  • No longer use SeriesView in Series.to_numpy (#14588)

Thank you to all our contributors for making this release possible! @Kylea650, @MarcoGorelli, @Object905, @alexander-beedie, @bsubei, @c-peters, @eLVas, @itamarst, @mbuhidar, @mcrumiller, @nameexhaustion, @orlp, @rijkvp, @ritchie46 and @stinodego