🏆 Highlights
- Implemented tree formatting for LogicalPlan (#14221)
⚠️ Deprecations
- Deprecate positional args in `pivot` to prepare new functionality (#14428)
🚀 Performance improvements
- Combine small chunks in sinks for streaming pipelines (#14346)
- reduce heap allocs in expression/logical-plan iteration (#14440)
- simplify and speed up cum_sum and cum_prod (#14409)
- simplify negated predicates to improve row groups skipping (#14370)
✨ Enhancements
- Increase verbosity of duplicate column error message (#11899)
- change print to warn in reading csv from python file like object (#14469)
- Raise if `pivot` would introduce duplicate column names (#14431)
- apply negate in simplify expression pass (#14436)
- restrict more cloud interop to semaphore budget (#14435)
- Implement `min`/`max` for categorical dtype (#14112)
- Hide `polars.testing.*` in pytest stack traces (#14399)
- expose numpy view to integer types (#14405)
- Allow column name input in `clip` (#14410)
- add boolean rle decoding for parquet (#14403)
- Allow brackets in SQL join conditions (#14263)
- Implemented tree formatting for LogicalPlan (#14221)
- Implement `mean_horizontal` expression (#14369)
- support decimal comparison (#14338)
- Implements `arr.shift` (#14298)
- Implements `list.n_unique` (#14306)
- Do not panic when casting from an empty Series to pl.Decimal (#14330)
- unset WRITEABLE flag in zero-copy output (#14283)
- Support `Categorical/Enum` in `Series.to_numpy` (#14275)
- add parametric testing support for the `Array` dtype (#14265)
🐞 Bug fixes
- don't gc after variadic buffers are written (#14473)
- Increase verbosity of duplicate column error message (#11899)
- Return appropriate data type for duration `mean` and `median` (#14376)
- change print to warn in reading csv from python file like object (#14469)
- regression in out-of-core group-by by new string-type (#14464)
- DataFrame.pivot was returning incorrect results when multiple columns were passed to `index` and one of them was Struct (#14438)
- remove literal `Series` from projection state (#14437)
- pivot was producing incorrect results when (single) `index` was Struct (#14308)
- Error on some invalid `clip` inputs (#14416)
- Series.hist panicking on empty/all-null (#14407)
- rechunk series when apply_lambda (#14406)
- Raise if invalid strategy is passed to `map_elements` (#14397)
- Require exact checking for Decimals in assertion utils (#14357)
- fix ufunc for unlimited column args (#14328)
- Handle chunked Series in `Series.to_numpy` (#14341)
- Remove duplicated content in error messages (#8107)
- Fix `set_operation` if the input is sliced and broadcast (#14303)
- Wrap `par_iter` in `list.to_struct` by `POOL.install` (#14304)
- Do not panic when casting from an empty Series to pl.Decimal (#14330)
- Preserve name when casting to Enum (#14320)
- `list.get` does not work on list of decimals (#14276)
- relax precision when up scaling (#14270)
- Allow format object series with registry (#14272)
📖 Documentation
- Update `read_database` docstring note about getting the connection URI string for sqlalchemy (#14461)
- Fix typo in plugins section (#14402)
- Add debugging section to contributing docs (#10576)
- Define what a 'character' means in `slice` / `len_chars` (#14395)
- Clarify behavior of `DataFrame.rows_by_key` (#14149)
- Fix some typos (#14394)
- Realign file structure of user guide (#14360)
- Rust examples for data structures in user guide (#14339)
- Add deprecation period policy example for post-1.0.0 (#14184)
- Add example for `Series.bin.contains` (#14297)
- Small clarifications in the contributing guide (#14310)
- Fix capitalization of user guide references (#14291)
- Fix explode docstring mentioning String types (#14285)
- Update deltalake docstrings to new link (#14282)
🛠️ Other improvements
- Ignore unclosed file warnings for now (#14467)
- Raise better error in import timings test (#14441)
- Refactor `arg_min/max` test case (#14439)
- Skip some OOC tests that fail randomly in the CI (#14434)
- Bump release drafter to v6 (#14429)
- Set specific temp dir for OOC tests (#14420)
- Bump `setup-graphviz` action to v2 (#14418)
- Minor test refactor (#14404)
- Update `make clean` command (#14408)
- Internal rename of `_or` to `or_` in PyO3 (same for `_xor/_and`) (#14393)
- Minor refactor of `DataFrame.to_numpy` structured code (#14348)
- Update `Series.to_numpy` to handle Decimal/Time types in Rust (#14296)
- Add test for `Series.to_numpy` with timezones (#14337)
- Bump ruff version to 0.2.0 (#14294)
- Temporarily fix failing deltalake test (#14288)
- remove dataframe consortium standard api entrypoint (#14279)
Thank you to all our contributors for making this release possible!
@BGR360, @CaselIT, @MarcoGorelli, @Migi, @NedJWestern, @Vincenthays, @alexander-beedie, @deanm0000, @dependabot, @dependabot[bot], @engdoreis, @flisky, @grinya007, @itamarst, @janosh, @kalekundert, @lukemanley, @mbuhidar, @mcrumiller, @petrosbar, @r-brink, @rben01, @reswqa, @ritchie46, @stinodego, @taki-mekhalfa and @thomasfrederikhoeck