⚠️ Deprecations
- Fix group keys in partition_by(as_dict=True) / GroupBy.__iter__ in some cases (#13646)
- Rename row_count_name/row_count_offset parameters in IO functions to row_index_* (#13563)
- Deprecate dt.datetime in favor of dt.replace_time_zone(None) (#13520)
- Rename with_row_count to with_row_index (#13494)
- Deprecate Expr.where in favor of filter (#13440)
- Allow drop with no inputs as a no-op (#13460)
🚀 Performance improvements
- elide parallelism restriction on generic rolling expressions (#13662)
- ensure time groups are parallelized (#13660)
- do not eagerly compute bitcount (#13562)
- optimise SQL engine string concat (#13499)
- Refactor expression parsing logic of predicates/constraints (#13468)
- Represent Enum categories as Series (#13434)
- remove lifetime requirement from CategoricalChunkedBuilder (#13319)
✨ Enhancements
- write parquet ColumnOrder (#13672)
- Impl contains for ArrayNameSpace (#13638)
- improve rolling() expression formatting (#13657)
- Implement is_between in Rust (#11945)
- Add base PolarsError and PolarsWarning class (#13615)
- typing overloads for Series operator methods ge, gt, ... (#13167)
- Expressify pattern of str.extract (#13607)
- Impl join for ArrayNameSpace (#13586)
- add SQL engine support for string cast to json (#13624)
- add SQL engine support for EXTRACT and DATE_PART (#13603)
- Allow drop with no inputs as a no-op (#13460)
- add SQL engine support for POSITION and STRPOS (#13585)
- additional multi-column support for pl.<function> entries (#13336)
- is_in support for array dtype (#13559)
- add new str.find expression, returning the index of a regex pattern or literal substring (#13561)
- Impl and dispatch arr.first/last to get (#13536)
- Implement from_dataframe natively (interchange protocol) (#10701)
- add SQL engine support for LIKE and ILIKE pattern matching (#13522)
- improve hive partition pruning (#13358) (#13426)
- Add compact syntax for int_range starting from 0 (#13530)
- don't rechunk by default in lazy scans (#13518)
- Add cum_count expression function (#13478)
- add SQL engine support for IF control flow function (#13491)
- add SQL engine support for MOD function (#13502)
- return datetime for datetime mean & median (#13417)
- add SQL engine support for CONCAT_WS string function (#13483)
- Allow map_batches to auto-convert output NumPy arrays to Series (#13277)
- add SQL engine support for RIGHT and REVERSE string functions (#13461)
- implement BinaryView and Utf8View in polars-arrow (#13243)
- add SQL engine support for variadic string CONCAT function (#13428)
- add support for AND in SQL join-clause context (#13242)
- Impl ordering ops for array namespace (#13414)
- add SQL engine support for REPLACE string function (#13431)
- add SQL engine support for SIGN function (#13429)
- add SQL engine support for IFNULL function (#13432)
- additional SQL support for bytes, bit, and hex literals (#13389)
🐞 Bug fixes
- gather.get schema (#13679)
- Fix group keys in partition_by(as_dict=True) / GroupBy.__iter__ in some cases (#13646)
- ensure we hit proper cache in nested rolling expressions (#13666)
- Allow av_buffer cast numeric record to temporal type (#13661)
- streaming cross join if swapped is hit (#13656)
- Make sure rolling key is projected when process projection (#13622)
- fix schema inference for json (#13637)
- Improve parsing of inputs for Expr dunders (#13635)
- Empty series of AggregatedList should also have list dtype (#13620)
- Series.eq_missing should return an Expr when the input is an Expr (#13628)
- fallback to cast kernel if inline_cast AnyValue raise (#13595)
- Fix formatting in describe for precise quantiles (#13593)
- fix reverse variable row decoding (#13587)
- Fix scatter for null values (#13578)
- Fix cum_count with regards to start value / null values (#13535)
- Fix precision/scale handling and invalid numbers in string-to-decimal conversions. (#13548)
- Treat Python None as null value for Object dtype (#13564)
- Fix scatter to allow single temporal inputs (#13577)
- Fix interchange protocol data buffer dtype (#10787)
- Expr.replace to single value did not replace NULLs (#13551)
- improve hive partition pruning (#13358) (#13426)
- fix projection pushdown for new outer join schema (#13527)
- don't raise when partial function is passed to map_elements (#13524)
- improve reading of mixed string/other dtype column data from spreadsheets with openpyxl and pyxlsb engines (#13495)
- ensure size-hint of TrueIdxIter is correct (#13508)
- correct 'outer_coalesce' logic in case of duplicate names (#13501)
- raise for out-of-range datetimes in to_datetime/strptime (#13403)
- Fix Series equality for List/Array types (#13477)
- Keep logical type when getting values from list (#13456)
- Handle duplicate/ambiguous inputs for replace (#13217)
- Handle empty inputs to Enum constructor (#13446)
- Fix group_by iteration when grouping by certain selectors (#13437)
- Fix to_pandas for 0x0 dataframe (#13420)
- Fix offsets for numeric types in from_buffer (#13398)
📖 Documentation
- Clarify documentation for the agg_list argument in Expr.map_batches (#13625)
- fix linking to feature flags in user guide (#13644)
- bring sink_ndjson docstring in line with other sink docstrings (#13636)
- Update then and otherwise docstrings with "strings are parsed as column names" (#13630)
- Add sink_ndjson to API reference. (#13627)
- Improve documentation on broadcasting (#13394)
- Add note about toolchain issue under native Windows (#13590)
- Hint about ruff setting in VSCode (#13421)
- Clarify examples for .transpose() (#13581)
- Add additional Series docstring examples (#13558)
- Doc example for read_csv (#13161) (#13545)
- Add more doc examples on how to create an index column (#13532)
- update SQL section of the README (#13529)
- Add note to int_range docs for creating an index column (#13516)
- add a note to the read_database_uri docstring about escaping special characters in the connection string (#13514)
- update polars-business > polars-xdt link (#13509)
- Fix various typos, grammar and formatting in docstrings and user guide (#13506)
- Doc examples for threadpool_size and get_index_type (#13496)
- Add missing datetime examples to docs (#13487)
- add polars-distance to plugins page (#13454)
- define file-like object in read_parquet docstring (#13463)
- Move Series.struct.json_encode to methods in Sphinx autosummary (#13443)
- Add missing examples of series/list.py (#13423)
- show datetime.date import in code block (#13419)
- clarify documentation for rle and rle_id (#13397)
- use named series in Series.plot example (#13407)
- fix alphabetical order of documentation entries (#13396)
🛠️ Other improvements
- Auto-add 'needs triage' label to bugs (#13671)
- make rolling index column visible to optimizer (#13658)
- Enable new error message lint to improve stack trace display (#13596)
- Add Documentation / Build system sections to the changelog (#13594)
- Filter unhelpful messages in make build (#13579)
- Remove extra line break between checkboxes in GitHub bug report issues (#13576)
- Narrow type hint for get_index_type util (#13556)
- Fix some test failures/slowdowns (#13504)
- pandas 2.2 compat (#13467)
- Increase timeout for gevent async test (#13448)
- Do not end docstrings with a blank line (#13193)
Thank you to all our contributors for making this release possible!
@Bromeon, @MarcNuebel, @MarcoGorelli, @ShivMunagala, @Wainberg, @aaarrti, @alexander-beedie, @bchalk101, @c-peters, @cgevans, @cmdlineluser, @collinprince, @deanm0000, @hamishs, @henryharbeck, @ion-elgreco, @jcrozum, @mcrumiller, @nameexhaustion, @orlp, @petrosbar, @r-brink, @reswqa, @ritchie46, @s-banach, @shritesh, @stinodego, @tim-stephenson and @wjandrea