Polars: py-0.17.0 Release

Release date: April 7, 2023
Previous version: py-0.16.18 (released April 3, 2023)
Magnitude: 8,420 Diff Delta
Contributors: 7 total committers

48 Commits in this Release

Ordered by the degree to which they evolved the repo in this version.

Top Contributors in py-0.17.0

stinodego
ritchie46
alexander-beedie
ghuls
rben01
universalmind303
MarcoGorelli

Release Notes Published

⚠️ Breaking changes

  • rename some function arguments (#8017)
  • don't create duplicate pivot names (#8002)
  • Remove deprecated behaviour (#7978)
  • rename toggle_string_cache to enable_string_cache (#7970)
  • change top_k(descending) -> bottom_k (#7969)
  • in sort, top_k, sort_by, and arg_sort_by, raise if descending is a sequence and its length doesn't match the number of columns to sort by (#7957)
  • Use RowsError instead of RowsException as recommended … (#6009)
  • Use time_unit/time_zone instead of tu/tz (#7910)
  • More ergonomic args for struct, concat_str, and arg_sort_by (#7308)
  • swap arguments of shift_and_fill and add default… (#7192)
  • set maintain_order=False for df/lf.unique (#7468)
  • Rename pipe arg func to function (#7139)
  • Set some args for Series/Expr methods to keyword-only (#7860)
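The stricter `descending` handling (#7957) can be pictured with a small pure-Python sketch. It is not Polars' implementation: the helper name `normalize_descending`, the broadcasting of a single flag, and the `ValueError` exception type are all assumptions made for illustration. The only behaviour taken from the note is that a sequence whose length differs from the number of sort columns is rejected.

```python
from typing import List, Sequence, Union


def normalize_descending(
    by: Sequence[str], descending: Union[bool, Sequence[bool]]
) -> List[bool]:
    """Return one descending flag per sort column (hypothetical helper)."""
    if isinstance(descending, bool):
        # Assumption: a single flag applies to every sort column.
        return [descending] * len(by)
    flags = list(descending)
    if len(flags) != len(by):
        # Assumption: ValueError stands in for whatever error Polars raises.
        raise ValueError(
            f"the length of `descending` ({len(flags)}) does not match "
            f"the number of columns to sort by ({len(by)})"
        )
    return flags
```

Under this sketch, `normalize_descending(["a", "b"], [True])` raises, while `normalize_descending(["a", "b"], [True, False])` returns the flags unchanged.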

🚀 Performance improvements

  • FromParallelIter<Option<str>> for Utf8Chunked ~1.9x (#8058)
  • speed up from_par_iter Option<bool> ~2.5x (#8057)
  • parallelize numeric ChunkedArray materialization ~2x. (#8053)
  • parallelize into_groups materialization ~-25% (#8036)
  • use a trusted anyvalue builder (#8001)
  • numeric grouptuples with nulls hash in single pass ~25% (#7980)
  • ensure primitives are parsed first in anyvalue conversion (#7955)
  • use perfect hash table for categoricals (#7951)

✨ Enhancements

  • multiple sql contexts & optional sql highlighting in cli (#8072)
  • implement arg_sort for struct dtype (#8051)
  • Support DataFrame init from pyarrow RecordBatch objects, and improve init from Array (#8011)
  • allow write_ipc to take file=None (returning BytesIO) (#7997)
  • Add __array__ method to DataFrame (#7979)
  • support struct in df.unique (#7976)
  • change top_k(descending) -> bottom_k (#7969)
  • basic sanity-checks for some Config methods, reference POLARS_MAX_THREADS in threadpool_size docstring (#7965)
  • optimize away nested unions in lp (#7861)
  • Use RowsError instead of RowsException as recommended … (#6009)
  • More ergonomic args for struct, concat_str, and arg_sort_by (#7308)
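The `top_k(descending)` to `bottom_k` change (#7969) splits one flagged method into two. The following pure-Python sketch shows the intended semantics only, assuming `top_k` returns the k largest values and `bottom_k` the k smallest. The function names mirror the Polars methods, but the bodies use the standard-library `heapq` module and say nothing about how Polars computes or orders its results.

```python
import heapq
from typing import Iterable, List


def top_k(values: Iterable[int], k: int) -> List[int]:
    """Return the k largest values, largest first (illustrative only)."""
    return heapq.nlargest(k, values)


def bottom_k(values: Iterable[int], k: int) -> List[int]:
    """Return the k smallest values, smallest first (illustrative only)."""
    return heapq.nsmallest(k, values)


data = [3, 1, 4, 1, 5]
largest_two = top_k(data, 2)      # [5, 4]
smallest_two = bottom_k(data, 2)  # [1, 1]
```

Code that previously passed a descending flag to `top_k` to get the smallest values would call `bottom_k` instead.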

🐞 Bug fixes

  • check element count in multi-column explode (#8050)
  • set lower limit for chunk_size (#8048)
  • impl to_static for struct (#8037)
  • create Series with list of only None with Float32 dtype (#8015)
  • version gate pyarrow version for `to_pandas=(use_pyarrow… (#8026)
  • Only allow correct type for get_column and to_series arg… (#7983)
  • Output correct dtype for values of remapping dict in map… (#8013)
  • all/any empty sets (#8012)
  • struct null_count, cast string, transpose and describe (#8009)
  • fix pivot and transpose of struct data (#8005)
  • don't create duplicate pivot names (#8002)
  • Fix test_literal_group_agg_chunked_7968 test (#7991)
  • fix chunked literals in expression engine (#7973)
  • in sort, top_k, sort_by, and arg_sort_by, raise if descending is a sequence and its length doesn't match the number of columns to sort by (#7957)
  • pandas 2.0 compat (#7962)
  • concat object types (#7958)
  • fix decimal conversion alignment (#7954)

🛠️ Other improvements

  • Fix Expr.apply docstring for return_dtype parameter (#8069)
  • rename some function arguments (#8017)
  • Remove deprecated behaviour (#7978)
  • Add docstring examples for top_k and bottom_k (#7987)
  • rename toggle_string_cache to enable_string_cache (#7970)
  • add remaining operator-equivalent method docstrings and a related html/docs entry (#7953)
  • Use time_unit/time_zone instead of tu/tz (#7910)
  • swap arguments of shift_and_fill and add default… (#7192)
  • set maintain_order=False for df/lf.unique (#7468)
  • Rename pipe arg func to function (#7139)
  • Set some args for Series/Expr methods to keyword-only (#7860)

Thank you to all our contributors for making this release possible! @MarcoGorelli, @StefanBRas, @alexander-beedie, @ghuls, @rben01, @ritchie46, @stinodego and @universalmind303