Polars: py-1.21.0 Release

Release date:
February 4, 2025
Previous version:
py-1.20.0 (released January 13, 2025)
Magnitude:
7,926 Diff Delta
Contributors:
16 total committers

74 Commits in this Release

Top Contributors in py-1.21.0

orlp
ritchie46
mcrumiller
coastalwhite
etiennebacher
nameexhaustion
lukemanley
alexander-beedie
Feiyang472
stinodego

Release Notes Published

🚀 Performance improvements

  • Use BitmapBuilder in yet more places (#20868)
  • Make an owned version of append (#20800)
  • Use BitmapBuilder in a lot more places (#20776)
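
For readers unfamiliar with the term, the following is a minimal pure-Python sketch of what an append-oriented bitmap builder does in general: it packs booleans into bytes, one bit per value, least-significant bit first (the layout Arrow-style validity bitmaps use). The class name `BitmapBuilderSketch` and its methods are hypothetical and are not the Polars Rust API; the sketch only illustrates the data structure the entries above refer to.

```python
class BitmapBuilderSketch:
    """Hypothetical append-only builder for a bit-packed boolean buffer."""

    def __init__(self) -> None:
        self._bytes = bytearray()
        self._len = 0  # number of bits pushed so far

    def push(self, value: bool) -> None:
        byte_index, bit_index = divmod(self._len, 8)
        if bit_index == 0:
            # The previous byte is full (or the buffer is empty): start a new byte.
            self._bytes.append(0)
        if value:
            # Set the bit, least-significant bit first within each byte.
            self._bytes[byte_index] |= 1 << bit_index
        self._len += 1

    def extend(self, values) -> None:
        for value in values:
            self.push(value)

    def freeze(self) -> tuple[bytes, int]:
        # Return the packed bytes together with the logical length in bits.
        return bytes(self._bytes), self._len


builder = BitmapBuilderSketch()
builder.extend([True, False, True, True, False, False, False, False, True])
packed, length = builder.freeze()
print(packed.hex(), length)
```

Nine booleans occupy two bytes here instead of nine, which is the main reason bit-packed buffers are used for validity masks.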

✨ Enhancements

  • Stabilize methods/functions (#20850)
  • Add linear_space (#20678)
  • Improve string → temporal parsing in read_excel and read_ods (#20845)
  • Implement df.unique() on new-streaming engine (#20875)
  • Experimental credential provider support for Delta read/scan/write (#20842)
  • Allow column expressions in DataFrame unnest (#20846)
  • Auto-initialize Python credential providers in more cases (#20843)
  • Add unique operations for Decimal dtype (#20855)
  • Add NDJson sink for the new streaming engine (#20805)
  • Support nested keys in window functions (#20837)
  • Add CSV sink for the new streaming engine (#20804)
  • Periodically check Python signals ('CTRL-C' handling) (#20826)
  • Experimental unity catalog client (#20798)
  • Support cumulative aggregations for Decimal dtype (#20802)
  • Account for SurrealDB Python API updates (handle both SurrealDB and AsyncSurrealDB classes) in read_database (#20799)
  • Drop nest-asyncio in favor of custom logic (#20793)
  • Improve window function caching strategy (#20791)
  • Support lakefs:// URI for delta scanner (#20757)
  • Additional support for loading numpy.float16 values (as Float32) (#20769)
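
As a rough illustration of the idea behind the new linear_space entry above, the sketch below generates evenly spaced values between two endpoints in plain Python. The helper name `linear_space_sketch` is hypothetical, and the sketch assumes both endpoints are included and at least two samples are requested; it is not the Polars implementation and does not reproduce its options or edge-case behavior.

```python
def linear_space_sketch(start: float, end: float, num_samples: int) -> list[float]:
    """Return `num_samples` evenly spaced values from `start` to `end`, inclusive."""
    if num_samples < 2:
        # Sketch limitation: the single-sample and empty cases are left undefined here.
        raise ValueError("this sketch requires num_samples >= 2")
    # With both endpoints included there are (num_samples - 1) equal intervals.
    step = (end - start) / (num_samples - 1)
    return [start + i * step for i in range(num_samples)]


values = linear_space_sketch(0, 1, 5)
print(values)
```

Computing each value as `start + i * step` rather than accumulating `step` repeatedly avoids compounding floating-point error across the sequence.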

🐞 Bug fixes

  • Warn if asof keys not sorted (#20887)
  • Ensure explicit values given to column_widths override autofit in write_excel (#20893)
  • Avoid name collisions and panicking in object conversion (#20890)
  • Incorrect scale used in log and exp for Decimal type (#20888)
  • Don't deep clone ManuallyDrop in GroupsPosition (#20886)
  • Fix DuplicateError when selecting columns after join_where or cross join + filter (#20865)
  • Incorrect Decimal value for fill_null(strategy="one") (#20844)
  • Fix one edge case (out of many) of int128 literals not working (#20830)
  • Add height check to frame-level row indexing when key is int (#20778)
  • Remove assert that panics on group_by followed by head(n), where n is larger than the frame height (#20819)
  • Selectors should raise on + between themselves (#20825)
  • Fix panic InvalidHeaderValue scanning from S3 on Windows (#20820)
  • Fix clip for Decimal returning wrong values (#20814)
  • Incorrect height from slicing after projecting only the file path column (#20817)
  • Shift mask when skipping Bitpacked values in Parquet (#20810)
  • Error instead of truncate if length mismatch for several str functions (#20781)
  • Support cumulative aggregations for Decimal dtype (#20802)
  • Allow is_in values to be given as custom Collection (#20801)
  • Propagate null instead of panicking in pl.repeat_by() (#20787)
  • Do not print sensitive information to output on POLARS_VERBOSE (#20797)
  • Ignore file cache allocation error if fallocate() is not permitted (#20796)
  • Incorrect logic in assert_series_equal for infinities (#20763)
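
The last entry above concerns infinities in approximate equality checks. The release note does not say what the incorrect logic was, so the following is only a general illustration, in plain Python, of why tolerance-based comparisons usually need an explicit branch for infinite values. Both function names are hypothetical and neither is taken from Polars.

```python
import math


def naive_close(a: float, b: float, abs_tol: float = 1e-8) -> bool:
    # inf - inf is NaN, and any comparison with NaN is False,
    # so two equal infinities are reported as different.
    return abs(a - b) <= abs_tol


def close_with_infinities(a: float, b: float, abs_tol: float = 1e-8) -> bool:
    if math.isinf(a) or math.isinf(b):
        # Infinities match only when they are exactly equal (same sign).
        return a == b
    return abs(a - b) <= abs_tol


inf = float("inf")
print(naive_close(inf, inf))             # False: the subtraction yields NaN
print(close_with_infinities(inf, inf))   # True
print(close_with_infinities(inf, -inf))  # False
```

The standard library's `math.isclose` handles this case in a similar spirit by comparing the values for exact equality before applying any tolerance.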

📖 Documentation

  • Update source URL for legislators-historical.csv (#20858)
  • Update ML part of ecosystem user guide page (#20596)

🛠️ Other improvements

  • Disable 'catalog' in build (#20897)
  • Implement negative slice for new streaming IPC (#20866)
  • Debloat Series bitops (#20873)
  • Reduce python map bloat (#20871)
  • Remove todo and test restriction for new-streaming (#20861)
  • Dispatch to the in-mem engine for AExpr::Gather (#20862)
  • Dispatch to the in-memory engine for multifile sources (#20860)
  • Add tests for open issues (#20857)
  • Mark 'register_startup' as unsafe (#20841)
  • Reduce mode bloat (#20839)
  • Rename ContainsMany to ContainsAny (#20785)
  • Unpin NumPy in type checking workflow (#20792)
  • Add various tests (#20768)
  • Small drive-by's (#20772)
  • Touch the upload probe for the remote benchmark (#20767)

Thank you to all our contributors for making this release possible! @alexander-beedie, @arnabanimesh, @braaannigan, @burakemir, @coastalwhite, @etiennebacher, @ion-elgreco, @itamarst, @lukemanley, @mcrumiller, @nameexhaustion, @orlp, @ritchie46 and @stinodego