Polars: py-1.23.0 Release

Release date: February 27, 2025
Previous version: py-1.22.0 (released February 11, 2025)
Magnitude: 10,793 Diff Delta
Contributors: 35 total committers

137 Commits in this Release

Top Contributors in py-1.23.0

coastalwhite
ritchie46
orlp
nameexhaustion
borchero
mcrumiller
r-brink
JakubValtar
alexander-beedie
lukemanley

Release Notes Published

🚀 Performance improvements

  • Toggle projection pushdown for eager rolling (#21405)
  • Fix pathologic rolling + group-by performance and memory explosion (#21403)
  • Add sampling to new-streaming equi join to decide between build/probe side (#21197)
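
The last item above (#21197) is about choosing which input of an equi join becomes the hash-table (build) side. The following is a minimal pure-Python sketch of that general idea, not the Polars implementation: the helper names `choose_build_side` and `equi_join`, the `sample_limit` parameter, and the rule "a side that runs dry inside the sample becomes the build side, otherwise default to the right side" are all assumptions made for illustration.

```python
from itertools import chain, islice


def choose_build_side(left, right, sample_limit=1000):
    """Peek at up to `sample_limit` rows per side and pick a build side.

    Hypothetical heuristic: a side that is exhausted inside the sample is
    known to be small, so it is preferred as the build side. If neither side
    is exhausted, fall back to building on the right side.
    """
    left_it, right_it = iter(left), iter(right)
    left_sample = list(islice(left_it, sample_limit))
    right_sample = list(islice(right_it, sample_limit))

    left_small = len(left_sample) < sample_limit
    right_small = len(right_sample) < sample_limit
    if left_small and (not right_small or len(left_sample) <= len(right_sample)):
        build = "left"
    else:
        build = "right"

    # Re-attach the sampled rows so that no input is lost.
    return build, chain(left_sample, left_it), chain(right_sample, right_it)


def equi_join(left, right, key, sample_limit=1000):
    """Inner equi join of two iterables of dict rows on column `key`."""
    build, left_rows, right_rows = choose_build_side(left, right, sample_limit)
    if build == "left":
        build_rows, probe_rows = left_rows, right_rows
    else:
        build_rows, probe_rows = right_rows, left_rows

    # Build phase: hash the chosen side by join key.
    table = {}
    for row in build_rows:
        table.setdefault(row[key], []).append(row)

    # Probe phase: stream the other side and emit (left_row, right_row) pairs.
    out = []
    for row in probe_rows:
        for match in table.get(row[key], []):
            out.append((match, row) if build == "left" else (row, match))
    return build, out
```

In this sketch the small side is hashed regardless of whether it is passed as the left or right argument, so the hash table stays small while the larger input is only streamed through once.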

✨ Enhancements

  • Implement i128 -> str cast (#21411)
  • Connect polars-cloud (#21387)
  • Version DSL (#21383)
  • Make user facing binary formats mostly self describing (#21380)
  • Filter hive files using predicates in new streaming (#21372)
  • Add negative slicing to new streaming multiscan (#21219)
  • Allow iterable of frames as input to align_frames (#21209)
  • Implement sorted flags for struct series (#21290)
  • Support reading arrow Map type from Delta (#21330)
  • Add a dedicated remove method for DataFrame and LazyFrame (#21259)
  • Rename credentials parameter to credential in CredentialProviderAzure (#21295)
  • Implement merge_sorted for struct (#21205)
  • Add positive slice for new streaming MultiScan (#21191)
  • Don't take in rewriting visitor (#21212)
  • Add SQL support for the DELETE statement (#21190)
  • Add row index to new streaming multiscan (#21169)
  • Improve DataFrame fmt in explain (#21158)

🐞 Bug fixes

  • Method dt.ordinal_day was returning UTC results as opposed to those on the local timestamp (#21410)
  • Use Kahan summation for rolling sum kernels. Fix numerical stability issues (#21413)
  • Add scalar checks for n and fill_value parameters in shift (#21292)
  • Upcast small integer dtypes for rolling sum operations (#21397)
  • Don't silently produce null values from invalid input to pl.datetime and pl.date (#21013)
  • Allow duration multiplied w/ primitive to propagate in IR schema (#21394)
  • Struct arithmetic broadcasting behavior (#21382)
  • Prefiltered optional plain primitive kernel (#21381)
  • Panic when projecting only row index from IPC file (#21361)
  • Properly update groups after gather in aggregation context (#21369)
  • Mark test as may_fail_auto_streaming (#21373)
  • Properly set fast_unique in EnumBuilder (#21366)
  • Rust test race condition (#21368)
  • Fix unequal DataFrame column heights from parquet hive scan with filter (#21340)
  • Fix ColumnNotFound error selecting len() after semi/anti join (#21355)
  • Merge Parquet nested and flat decoders (#21342)
  • Incorrect atomic ordering in Connector (#21341)
  • Method dt.offset_by was discarding month and year info if day was included in offset for timezone-aware columns (#21291)
  • Fix pickling polars.col on Python versions <3.11 (#21333)
  • Fix duplicate column names after join if suffix already present (#21315)
  • Skip Batches Expression for boolean literals (#21310)
  • Fix performance regression for eager join_where (#21308)
  • Fix incorrect predicate pushdown for predicates referring to right-join key columns (#21293)
  • Panic in to_physical for series of arrays and lists (#21289)
  • Resolve deadlock due to leaking in Connector recv drop (#21296)
  • Incorrect result for merge_sorted with lexical categorical (#21278)
  • Add Int128 path for join_asof (#21282)
  • Categorical min/max returning String dtype rather than Categorical (#21232)
  • Checking overflow in Sliced function (#21207)
  • Adding a struct field using a literal raises InvalidOperationError (#21254)
  • Return nulls for is_finite, is_infinite, and is_nan when dtype is pl.Null (#21253)
  • Account for minor change in new connectorx release (#21277)
  • Properly implement and test Skip Batch Predicate (#21269)
  • Infinite recursion when broadcasting into struct zip_outer_validity (#21268)
  • Deadlock due to bad logic in new-streaming join sampling (#21265)
  • Incorrect result for top_k/bottom_k when input is sorted (#21264)
  • UTF-8 validation of nested string slice in Parquet (#21262)
  • Raise instead of panicking when casting a Series to a Struct with the wrong number of fields (#21213)
  • Defer credential provider resolution to take place at query collection instead of construction (#21225)
  • Do not panic in strptime() if format ends with '%' (#21176)
  • Raise error instead of panicking for unsupported SQL operations (#20789)
  • Projection of only row index in new streaming IPC (#21167)
  • Fix projection count query optimization (#21162)
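
One of the fixes above (#21413) switches the rolling sum kernels to Kahan summation for numerical stability. As a rough illustration of the technique only, here is a pure-Python sketch of a sliding-window sum with a Kahan compensation term. The function names and the add-then-subtract window update are assumptions for this example and are not taken from the Polars kernels.

```python
def kahan_add(total, comp, value):
    """One compensated-addition step; returns the updated (total, comp)."""
    y = value - comp            # apply the correction carried from earlier steps
    t = total + y               # low-order bits of y may be lost here
    comp = (t - total) - y      # recover what was lost (algebraically zero)
    return t, comp


def rolling_sum_kahan(values, window):
    """Rolling sum that adds the entering value and subtracts the leaving one,
    both through compensated addition. Positions without a full window are None."""
    out = []
    total, comp = 0.0, 0.0
    for i, v in enumerate(values):
        total, comp = kahan_add(total, comp, v)
        if i >= window:
            total, comp = kahan_add(total, comp, -values[i - window])
        out.append(total if i >= window - 1 else None)
    return out


def rolling_sum_naive(values, window):
    """Same sliding update without compensation, for comparison."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]
        out.append(total if i >= window - 1 else None)
    return out
```

For the input `[1e16, 1.0, 1.0]` with a window of 3, the naive version returns `1e16` for the last position because each `+ 1.0` is rounded away, while the compensated version returns `1e16 + 2.0`.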

📖 Documentation

  • Fix doc for SQL Functions navigation (#21412)
  • Fix initial selector example (#21321)
  • Add pandas strictness API difference (#21312)
  • Improve Expr.name.map docstring example (#21309)
  • Add logo to Ask AI (#21261)
  • Fix docs for Catalog (#21252)
  • AI widget again (#21257)
  • Revert plugin (#21250)
  • Add kappa ask ai widget (#21243)
  • Update social icons in API reference docs (#21214)
  • Improve Arrow key feature description (#21171)
  • Improve example in IO plugins user guide (#21146)

🛠️ Other improvements

  • Move storage of hive partitions to DataFrame (#21364)
  • Feature gate merge sorted in new streaming engine (#21338)
  • Remove new streaming old multiscan (#21300)
  • Add tests for fixed open issues (#21185)
  • Try to mimic all steps (#21249)
  • Require version for POLARS_VERSION (#21248)
  • Fix docs (#21246)
  • Avoid unnecessary packaging dependency (#21223)
  • Remove unused file (#21240)
  • Add use_field_init_shorthand = true to rustfmt (#21237)
  • Don't mutate arena by default in Rewriting Visitor (#21234)
  • Disable the TraceMalloc allocator (#21231)
  • Add feature gate to old streaming deprecation warning (#21179)
  • Install seaborn when running remote benchmark (#21168)

Thank you to all our contributors for making this release possible! @GiovanniGiacometti, @JakubValtar, @MarcoGorelli, @Matt711, @Shoeboxam, @YichiZhang0613, @alexander-beedie, @bschoenmaeckers, @coastalwhite, @edwinvehmaanpera, @erikbrinkman, @etiennebacher, @hemanth94, @henryharbeck, @jqnatividad, @lukemanley, @mcrumiller, @nameexhaustion, @orlp, @r-brink, @ritchie46 and @ydagosto