An upgrade guide is available on our website.
🏆 Highlights
- implementing sink_csv for LazyFrame (#10682)
- Support DataFrame init from queries against users' existing database connections (#10649)
- Rename groupby to group_by (#10656)
💥 Breaking changes
- return f64 for rank when method="average" (#10734)
- Update a lot of error types (#10637)
- Remove deprecated behavior from vertical aggregations (#10602)
- Read/write support for IPC streams in DataFrames (#10606)
- Change behavior of all: fix Kleene logic implementation for all/any (#10564)
- Improve consistency of parsing expression input (#9512)
- allow from_arrow to take a generator of RecordBatches, change error type to TypeError (#10529)
- remove fixed_seed and add pl.set_random_seed (#10388)
- Make arange an alias for int_range (#9983)
- date_range/time_range no longer return a List type (#10526)
- Remove various functionalities deprecated before 0.18 (#10527)
- Improve some error types and messages (#10470)
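
For readers unfamiliar with the term in #10564, Kleene (three-valued) logic treats a null as "unknown": the result is only definite when the known values already decide it. The snippet below is a minimal plain-Python sketch of that rule, using None to stand in for null. It is not Polars' implementation, and kleene_all / kleene_any are hypothetical helper names used only for illustration.

```python
from typing import Iterable, Optional


def kleene_all(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """Three-valued 'all': False wins, otherwise null makes the result unknown."""
    saw_null = False
    for v in values:
        if v is False:
            return False  # a single False decides the result
        if v is None:
            saw_null = True  # unknown value; keep looking for a False
    return None if saw_null else True


def kleene_any(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """Three-valued 'any': True wins, otherwise null makes the result unknown."""
    saw_null = False
    for v in values:
        if v is True:
            return True  # a single True decides the result
        if v is None:
            saw_null = True  # unknown value; keep looking for a True
    return None if saw_null else False
```

Under this rule, kleene_all([True, None]) is unknown (None) because the null could be False, while kleene_all([False, None]) is False regardless of the null.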
⚠️ Deprecations
- Rename map to map_batches (#10801)
- Rename GroupBy.apply to map_groups (#10799)
- Rename DataFrame.apply to map_rows (#10797)
- Rename Series/Expr.rolling_apply to rolling_map (#10750)
- Rename Series/Expr.apply to map_elements (#10678)
- Rename groupby to group_by (#10656)
- Deprecate some parameters of cut/qcut (#10484)
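
The renames above can be collected into a lookup table when auditing a codebase. The following is a rough plain-Python sketch, not an official migration tool: RENAMES and find_deprecated_calls are hypothetical names, and the regex only matches method-call syntax by name, so it will also flag unrelated methods that happen to be called map or apply. Because apply maps to a different new name depending on the receiver (Series/Expr, DataFrame, or GroupBy), all candidates are reported and the choice is left to the reader.

```python
import re
from typing import List, Tuple

# Old method name -> candidate new names, taken from the deprecation list above.
# "apply" has several candidates because the new name depends on the receiver.
RENAMES = {
    "groupby": ("group_by",),
    "rolling_apply": ("rolling_map",),
    "map": ("map_batches",),
    "apply": ("map_elements", "map_rows", "map_groups"),
}

# Matches ".name(" so that only method-call syntax is considered.
_CALL = re.compile(r"\.(\w+)\s*\(")


def find_deprecated_calls(source: str) -> List[Tuple[int, str, Tuple[str, ...]]]:
    """Return (line number, old name, candidate new names) for each match."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _CALL.finditer(line):
            name = match.group(1)
            if name in RENAMES:
                findings.append((lineno, name, RENAMES[name]))
    return findings


if __name__ == "__main__":
    example = 'df.groupby("a").agg(pl.col("b").apply(f))\ns.rolling_apply(g, 3)'
    for lineno, old, new in find_deprecated_calls(example):
        print(f"line {lineno}: {old} -> {' or '.join(new)}")
```

Every hit still needs manual review, since the scan works on text alone and cannot tell a Polars call from any other method with the same name.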
🚀 Performance improvements
- parse time zones outside of downcast_iter() in replace_time_zone (#10713)
- use binary abstraction for atan2 (#10588)
- use binary abstraction in pow (#10562)
✨ Enhancements
- activate cse for group_by (again) (#10749)
- implementing sink_csv for LazyFrame (#10682)
- Supports series unique & arg_unique & n_unique for list (#10743)
- repeat_by should also support broadcasting of LHS (#10735)
- deprecate 'use_earliest' argument in favour of 'ambiguous', which can take expressions (#10719)
- is_first also supports numeric list type. (#10727)
- improve slice pushdown in unions (#10723)
- Explicitly implement Protocol for interchange classes (#10688)
- Support min and max strategy for binary & str columns fill null (#10673)
- support broadcasting in list set operations (#10668)
- csv: add schema argument (#10665)
- Support DataFrame init from queries against users' existing database connections (#10649)
- add truncate_ragged_lines (#10660)
- supports cast to list (#10623)
- Update a lot of error types (#10637)
- preserve whitespace in notebook output (#10644)
- Remove deprecated behavior from vertical aggregations (#10602)
- support selector usage in write_excel arguments (#10589)
- Add LazyFrame.collect_async and pl.collect_all_async (#10616)
- Read/write support for IPC streams in DataFrames (#10606)
- propagate null in is_in and more generic array construction (#10614)
- Change behavior of all: fix Kleene logic implementation for all/any (#10564)
- frame-level cast support (#10504)
- Improve consistency of parsing expression input (#9512)
- Add failed column to cast exception (#10507)
- allow from_arrow to take a generator of RecordBatches, change error type to TypeError (#10529)
- Remove deprecated get_idx_type; use get_index_type instead (#10556)
- Make arange an alias for int_range (#9983)
- date_range/time_range no longer return a List type (#10526)
- Remove various functionalities deprecated before 0.18 (#10527)
- Improve some error types and messages (#10470)
- suggest str.to_datetime instead of apply and stdlib strptime (#10266)
🐞 Bug fixes
- get_single_leaf can't handle Expr::Count (#10790)
- support groupby literal in streaming (#10771)
- ORDER BY on unselected columns (#10752)
- Fix is_in cannot cast list type for float (#10769)
- whitespace CSS in Notebook HTML updated to use pre-wrap instead of pre (#10739)
- only preserve sortedness flag in replace_time_zone when safe (#10738)
- Error on value_counts on column named "counts" (#10737)
- return f64 for rank when method="average" (#10734)
- Keep min/max and arg_min/arg_max consistent. (#10716)
- use time zone from dtype to overwrite output time zone when initialising Series (#10689)
- Cast small int type when scan csv in streaming mode. (#10679)
- raise exception with invalid on arg type for join_asof (#10690)
- Reused input series in rolling_apply should not be orderly (#10694)
- re-sort buffer when update window swap the whole buffer (#10696)
- Set the correct fast_explode flag for ListUtf8ChunkedBuilder (#10684)
- Sorted Utf8Chunked max_str and min_str should consider null value (#10675)
- Correctly handle time zones in write_delta (#10633)
- fix apply for empty series in threading mode (#10651)
- respect 'ignore_errors=False' in csv parser (#10641)
- fix rename + projection pushdown (#10624)
- fix int/float downcast in is_in (#10620)
- Change behavior of all: fix Kleene logic implementation for all/any (#10564)
- Fix serialization for categorical chunked. (#10609)
- Take input_schema to create physical expr for Selection (#10571)
- Clear window cache after evaluate predication expr (#10505)
- Parsing regex col in Expr::Columns (#10551)
- sanitize column naming in boolean ops (#10531)
- Fix write_delta with schema in delta_write_options (#10541)
- remove fixed_seed and add pl.set_random_seed (#10388)
- respect pl.Config options relating to shape, column names, and types when rendering HTML (#10449)
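
As background for the rank fix in #10734: with the "average" method, tied values share the mean of the positions they occupy, which can be fractional, so a float result represents them exactly. The plain-Python sketch below illustrates the idea only. It is not Polars' implementation, and average_rank is a hypothetical helper name.

```python
from typing import List, Sequence


def average_rank(values: Sequence[float]) -> List[float]:
    """Rank values from 1; tied values share the mean of their positions."""
    # Indices of the input, ordered by the value they point at.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Mean of the 1-based positions i+1 .. j+1 occupied by the tied run.
        avg = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


if __name__ == "__main__":
    print(average_rank([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

The two tied values occupy positions 2 and 3 and therefore both receive rank 2.5, a value that an integer result type could not hold.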
🛠️ Other improvements
- update cargo.lock (#10800)
- Create .venv in repo root (#10789)
- refactored write_database unit tests to properly separate concerns (#10773)
- Fix some broken links / formatting (#10772)
- Document chained when-then behaviour more prominently (#10759)
- Fix test failing due to new adbc release (#10763)
- Unpin connectorx and bump other Python dependencies (#10753)
- add note to testing docs about module import (#10741)
- Clear GitHub Actions caches weekly (#10715)
- Update for new pyarrow 13.0.0 behavior (#10691)
- Fix minor issue with sink_parquet docs (#10669)
- Remove deprecate_renamed_methods util (#10537)
- add "see also" entries to ne/eq_missing and update related examples (#10667)
- fix potential memory leak from usage of inspect.currentframe (#10630)
- give more relevant example for polars.apply (#10631)
- Bump ruff and enable new setting (#10626)
- Add docstrings for Expr.meta namespace (#10617)
- Enforce up-to-date Cargo.lock (#10555)
- deprecate DataFrame.replace (#10600)
- ensure that make requirements fully refreshes unpinned packages/deps (#10591)
- fix out-of-date explain default parameter (#10566)
- Fix expr_dispatch decorator to work on methods with decorators (#10549)
- Fix link to source code (#10542)
- Add title to index page (#10539)
- Disable SIM108 lint (#10519)
- Keep versioned docs (#10500)
- switch to pyo3/maturin-action (#10503)
- Update URLs for dev documentation (#10495)
- Skip failing test (#10496)
- Add version switcher to API reference (#10488)
Thank you to all our contributors for making this release possible!
@JulianCologne, @MarcoGorelli, @Object905, @OndrejSlamecka, @SeanTroyUWO, @VasanthakumarV, @alexander-beedie, @aminalaee, @braaannigan, @c-peters, @ion-elgreco, @lorepozo, @marki259, @mcrumiller, @messense, @orlp, @owrior, @rben01, @reswqa, @ritchie46, @sdamashek, @stinodego, @svaningelgem, @titoeb, @trueb2, @washcycle and @zundertj