Polars: py-0.20.8 Release

Release date: February 14, 2024
Previous version: py-0.20.7 (released February 4, 2024)
Magnitude: 3,734 Diff Delta
Contributors: 26 total committers

86 Commits in this Release

Top Contributors in py-0.20.8

grinya007
stinodego
mcrumiller
rben01
itamarst
deanm0000
MarcoGorelli
reswqa
alexander-beedie
taki-mekhalfa

Release Notes Published

πŸ† Highlights

  • Implemented tree formatting for LogicalPlan (#14221)

⚠️ Deprecations

  • Deprecate positional args in pivot to prepare new functionality (#14428)

🚀 Performance improvements

  • Combine small chunks in sinks for streaming pipelines (#14346)
  • reduce heap allocs in expression/logical-plan iteration (#14440)
  • simplify and speed up cum_sum and cum_prod (#14409)
  • simplify negated predicates to improve row groups skipping (#14370)

✨ Enhancements

  • Increase verbosity of duplicate column error message (#11899)
  • change print to warn in reading csv from python file like object (#14469)
  • Raise if pivot would introduce duplicate column names (#14431)
  • apply negate in simplify expression pass (#14436)
  • restrict more cloud interop to semaphore budget (#14435)
  • Implement min/max for categorical dtype (#14112)
  • Hide polars.testing.* in pytest stack traces (#14399)
  • expose numpy view to integer types (#14405)
  • Allow column name input in clip (#14410)
  • add boolean rle decoding for parquet (#14403)
  • Allow brackets in SQL join conditions (#14263)
  • Implemented tree formatting for LogicalPlan (#14221)
  • Implement mean_horizontal expression (#14369)
  • support decimal comparison (#14338)
  • Implements arr.shift (#14298)
  • Implements list.n_unique (#14306)
  • Do not panic when casting from an empty Series to pl.Decimal (#14330)
  • unset WRITEABLE flag in zero-copy output (#14283)
  • Support Categorical/Enum in Series.to_numpy (#14275)
  • add parametric testing support for the Array dtype (#14265)

🐞 Bug fixes

  • don't gc after variadic buffers are written (#14473)
  • Increase verbosity of duplicate column error message (#11899)
  • Return appropriate data type for duration mean and median (#14376)
  • change print to warn in reading csv from python file like object (#14469)
  • regression in out-of-core group-by by new string-type (#14464)
  • DataFrame.pivot was returning incorrect results when multiple columns were passed to index and one of them was Struct (#14438)
  • remove literal Series from projection state (#14437)
  • pivot was producing incorrect results when (single) index was Struct (#14308)
  • Error on some invalid clip inputs (#14416)
  • Series.hist panicking on empty/all-null (#14407)
  • rechunk series when apply_lambda (#14406)
  • Raise if invalid strategy is passed to map_elements (#14397)
  • Require exact checking for Decimals in assertion utils (#14357)
  • fix ufunc for unlimited column args (#14328)
  • Handle chunked Series in Series.to_numpy (#14341)
  • Remove duplicated content in error messages (#8107)
  • Fix set_operation when the input is sliced and broadcast (#14303)
  • Wrap par_iter in list.to_struct by POOL.install (#14304)
  • Do not panic when casting from an empty Series to pl.Decimal (#14330)
  • Preserve name when casting to Enum (#14320)
  • list.get does not work on list of decimals (#14276)
  • relax precision when up scaling (#14270)
  • Allow format object series with registry (#14272)

📖 Documentation

  • Update read_database docstring note about getting the connection URI string for sqlalchemy (#14461)
  • Fix typo in plugins section (#14402)
  • Add debugging section to contributing docs (#10576)
  • Define what a 'character' means in slice / len_chars (#14395)
  • Clarify behavior of DataFrame.rows_by_key (#14149)
  • Fix some typos (#14394)
  • Realign file structure of user guide (#14360)
  • Rust examples for data structures in user guide (#14339)
  • Add deprecation period policy example for post-1.0.0 (#14184)
  • Add example for Series.bin.contains (#14297)
  • Small clarifications in the contributing guide (#14310)
  • Fix capitalization of user guide references (#14291)
  • Fix explode docstring mentioning String types (#14285)
  • Update deltalake docstrings to new link (#14282)

πŸ› οΈ Other improvements

  • Ignore unclosed file warnings for now (#14467)
  • Raise better error in import timings test (#14441)
  • Refactor arg_min/max test case (#14439)
  • Skip some OOC tests that fail randomly in the CI (#14434)
  • Bump release drafter to v6 (#14429)
  • Set specific temp dir for OOC tests (#14420)
  • Bump setup-graphviz action to v2 (#14418)
  • Minor test refactor (#14404)
  • Update make clean command (#14408)
  • Internal rename of _or to or_ in PyO3 (same for _xor/_and) (#14393)
  • Minor refactor of DataFrame.to_numpy structured code (#14348)
  • Update Series.to_numpy to handle Decimal/Time types in Rust (#14296)
  • Add test for Series.to_numpy with timezones (#14337)
  • Bump ruff version to 0.2.0 (#14294)
  • Temporarily fix failing deltalake test (#14288)
  • remove dataframe consortium standard api entrypoint (#14279)

Thank you to all our contributors for making this release possible! @BGR360, @CaselIT, @MarcoGorelli, @Migi, @NedJWestern, @Vincenthays, @alexander-beedie, @deanm0000, @dependabot, @dependabot[bot], @engdoreis, @flisky, @grinya007, @itamarst, @janosh, @kalekundert, @lukemanley, @mbuhidar, @mcrumiller, @petrosbar, @r-brink, @rben01, @reswqa, @ritchie46, @stinodego, @taki-mekhalfa and @thomasfrederikhoeck