Polars: py-0.20.6 Release

Release date: January 26, 2024
Previous version: py-0.20.6-rc.1 (released January 24, 2024)
Magnitude: 4,762 Diff Delta
Contributors: 17 total committers

59 Commits in this Release

Top Contributors in py-0.20.6

alexander-beedie
r-brink
c-peters
ritchie46
stinodego
flisky
reswqa
itamarst
mcrumiller
Wainberg

Release Notes Published

🏆 Highlights

  • new implementation for String/Binary type. (#13748)

⚠️ Deprecations

  • Deprecate dtype_if_empty parameter for Series constructor (#13976)

🚀 Performance improvements

  • improve string/binary reverse performance (#14016)
  • add "calamine" support to read_excel, using fastexcel (~8-10x speedup) (#14000)
  • optimize DataFrame.describe by presorting columns (#13822)
  • elide redundant bound checks. (#13909)
  • speedup boolean filter (#13905)
  • speedup binview filter (#13902)
  • allow python threads in read_ functions (#13886)
  • improve binview filter (#13878)
  • apply string view GC more conservatively (#13850)
  • add optimized BinaryViewArray comparison kernels (#13839)
  • lazy cache binview bytes len (#13830)
  • fast-path for eager int_range (#13811)
  • Optimize arr.sum for inner non-null bool (#13800)

✨ Enhancements

  • Add UnstableWarning for unstable functionality (#13948)
  • DataFrame supports explode by array column (#13958)
  • add "calamine" support to read_excel, using fastexcel (~8-10x speedup) (#14000)
  • improve binary formatting (#13981)
  • preserve Enum information when going to IPC (#13943)
  • support calling describe on a LazyFrame (#13982)
  • support kwargs in plugin 'field' functions and raise error on unsupported binview layout (#13944)
  • support cast decimal to utf8 (#13829)
  • add SQL support for timestamp precision modifier (#13936)
  • support negative indexing and expressions for LEFT, RIGHT and SUBSTR SQL string funcs (#13888)
  • Introduce explode for ArrayNameSpace (#13923)
  • unify Series/DataFrame describe code (#13720)
  • raise better error message for .dt.time on Date column (#13932)
  • List set_operations supports float (#13920)
  • Add ignore_nulls for arr.join (#13919)
  • register 'set_sorted' as batch/elementwise (#13896)
  • move Enum/Categorical categories to binview (#13882)
  • Add ignore_nulls for list.join (#13701)
  • Add ignore_nulls for pl.concat_str (#13877)
  • Align int_range and int_ranges signatures (#13867)
  • fix parquet for binview (#13873)
  • support mmap for binview in OOC (#13872)
  • implement ffi for binview (#13871)
  • Support zero fill null strategy for binary and string columns (#13869)
  • allow df.rename and lf.rename to take a renaming function (#13708)
  • Implement/fix unary minus operator -pl.col(...) (#13776)
  • extend SQL EXTRACT with "century", "millennium", and "timezone" parts (#13634)
  • fix binview ipc format (#13842)
  • add SQL support for numeric and/or decimal types (#13739)
  • improve panic message (#13836)
  • Expressify str.zfill (#13790)
  • new implementation for String/Binary type. (#13748)
  • Add typing to hvplot plot namespace (#13813)
  • Add nulls_last for Series.sort (#13794)
  • allow ftp URLs, improve URL check (#13781)

🐞 Bug fixes

  • count matches on list categorical (#14021)
  • list.min/max with empty and/or None elements (#14018)
  • Make to_pandas() work for DataFrame and Series with dtype Object (#13910)
  • raise for pl.concat(how="align") when no columns are shared between frames (#13941)
  • Fix casting from categorical to numeric (#13957)
  • read_csv preserve whitespace and newlines (#13934)
  • omit implicit 'site' from import-timing test (#14009)
  • append decimal with different scale (#13977)
  • Use date_as_object=False as default for Series.to_pandas (just like DataFrame.to_pandas) (#13984)
  • serialize decimal type (#13997)
  • check input type for arr/list.contains (#13959)
  • Fix max_colname_length formatting in glimpse() (#13969)
  • Allow dtype merge when inner dtype is enum (#13938)
  • recurse less in streaming shared sinks (#13930)
  • ensure order is preserved if streaming from different sources (#13922)
  • Fix is_not_null for Struct columns (#13921)
  • convert object-dtyped NumPy str/bytes arrays to pl.String/pl.Binary instead of pl.Object (#13712)
  • allow extract of numeric from str AnyValue (#13865)
  • single-element .dt.time() and .dt.date() should always preserve sortedness (#13808)
  • prune empty chunks before set operations (#13898)
  • treat null columns as zero in sum_horizontal (#13880)
  • include null count in rolling window validity with min_periods (#13863)
  • Fix interchange protocol for new String type (#13881)
  • parquet hybrid RLE encoding did not always align to bit width (#13883)
  • Add ignore_nulls for list.join (#13701)
  • .dt.time() was panicking for datetimes prior to unix epoch (#13812)
  • allow list creation of decimals (#13851)
  • ensure kwargs filter behaviour matches docstring (expect equivalence with eq) (#13864)
  • Implement abs for Decimal, error on Date/Time/Datetime (#13821)
  • rolling nested groups deadlock (#13835)
  • gather_every should work on agg context (#13810)
  • Fix segfault of is_in (#13814)
  • don't panic on full null qcut (#13815)
  • validate operator arithmetic with None, fix Series edge-case (#13780)

📖 Documentation

  • Add missing doc entries (#14006)
  • add missing len to rst file (#13999)
  • Improve structure of user guide (#13951)
  • Improve structure of user guide (#13639)
  • Introduce ecosystem page in user guide (#13903)
  • Mention deltalake write support in README (#13890)
  • use proper argument names in the code blocks of api.rst (#13866)

🛠️ Other improvements

  • make Enums an actual datatype (#14011)
  • omit implicit 'site' from import-timing test (#14009)
  • Constructor improvements - part 1 (#14001)
  • Add glimpse test (#13979)
  • Move PyO3 ChunkedArray conversion logic into its own module (#13973)
  • Fix xdist streaming group (#13974)
  • Fix spurious test failures (#13961)
  • minor describe tidy-up, and slight rewording of some Exception docstrings (#13942)
  • Fix pip warning filter return code (#13935)
  • Minor refactor of PyO3 conversions module (#13929)
  • move filter to polars-compute (#13897)
  • Revert pandas warning filter (#13893)
  • Make functions in expr/general non-anonymous (#13832)
  • Fix doctests (#13831)
  • Refactor Python release workflow (#13807)

Thank you to all our contributors for making this release possible! @ByteNybbler, @JulianCologne, @MarcoGorelli, @Wainberg, @alexander-beedie, @c-peters, @dependabot, @dependabot[bot], @edavisau, @flisky, @ion-elgreco, @itamarst, @jacksonthall22, @kstoneriv3, @mcrumiller, @mkucijan, @nameexhaustion, @orlp, @petrosbar, @r-brink, @reswqa, @ritchie46, @stinodego, @taki-mekhalfa, @thomasaarholt and @valorien