pola-rs/polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Polars: py-0.20.4 Release

Release date: January 24, 2024

Previous version: py-0.20.3 (released January 2, 2024)

Magnitude: 5,997 Diff Delta

Contributors: 29 total committers

Commits: 127

127 Commits in this Release

Commits are ordered by how much each one changed the repository in this release.

depr(python,rust!): Rename `row_count_name`/`row_count_offset` parameters in IO functions to `row_index_*` (#13563)

Authored January 9, 2024

depr(python,rust!): Rename `with_row_count` to `with_row_index` (#13494)

Authored January 8, 2024

feat(rust): `BinaryView`/`Utf8View` IPC support (#13464)

Authored January 6, 2024

feat(rust,python): add new `str.find` expression, returning the index of a regex pattern or literal substring (#13561)

Authored January 10, 2024

feat(rust,python,cli): add SQL engine support for `RIGHT` and `REVERSE` string functions (#13461)

Authored January 5, 2024

fix(rust,python): Fix precision/scale handling and invalid numbers in string-to-decimal conversions. (#13548)

Authored January 10, 2024

feat: Implement `is_between` in Rust (#11945)

Authored January 11, 2024

feat(rust,python): Impl ordering ops for array namespace (#13414)

Authored January 5, 2024

feat(python): Impl and dispatch arr.first/last to get (#13536)

Authored January 9, 2024

#11290 add support for AND in SQL context (#13242)

Authored January 5, 2024

perf(python): Represent `Enum` categories as Series (#13434)

Authored January 4, 2024

perf: do not eagerly compute bitcount (#13562)

Authored January 10, 2024

perf(python): Refactor expression parsing logic of predicates/constraints (#13468)

Authored January 6, 2024

feat(python): Add base `PolarsError` and `PolarsWarning` class (#13615)

Authored January 11, 2024

perf: elide all possible casts in parquet reading (#13604)

Authored January 11, 2024

docs(python): Add missing datetime examples to docs (#13487)

Authored January 7, 2024

feat: Impl `contains` for ArrayNameSpace (#13638)

Authored January 12, 2024

perf: ensure time groups are parallelized (#13660)

Authored January 12, 2024

feat(rust,python,cli): add SQL engine support for `LIKE` and `ILIKE` pattern matching (#13522)

Authored January 8, 2024

fix(python): `Series.eq_missing` should return an Expr when the input is an Expr (#13628)

Authored January 11, 2024

fix: Fix `cum_count` with regards to start value / null values (#13535)

Authored January 10, 2024

feat(python): Add compact syntax for `int_range` starting from 0 (#13530)

Authored January 8, 2024

fix: streaming cross join if swapped is hit (#13656)

Authored January 12, 2024

fix: Treat Python `None` as null value for `Object` dtype (#13564)

Authored January 10, 2024

feat(python): typing overloads for Series operator methods `ge, gt, ...` (#13167)

Authored January 11, 2024

docs(python): clarify documentation for rle and rle_id (#13397)

Authored January 3, 2024

feat: Add `cum_count` expression function (#13478)

Authored January 8, 2024

feat: Impl `join` for ArrayNameSpace (#13586)

Authored January 11, 2024

docs(python): Clarify documentation for the `agg_list` argument in `Expr.map_batches` (#13625)

Authored January 12, 2024

feat(rust, python): return datetime for datetime mean & median (#13417)

Authored January 6, 2024

feat: improve hive partition pruning (#13358) (#13426)

Authored January 8, 2024

feat(python): Allow map_batches to auto-convert output NumPy arrays to Series (#13277)

Authored January 6, 2024

feat: Expressify `pattern` of `str.extract` (#13607)

Authored January 11, 2024

docs(python): Update `then` and `otherwise` docstrings with "strings are parsed as column names" (#13630)

Authored January 11, 2024

docs(python): Hint about ruff setting in VSCode (#13421)

Authored January 10, 2024

fix: fix reverse variable row decoding (#13587)

Authored January 10, 2024

fix: Handle duplicate/ambiguous inputs for `replace` (#13217)

Authored January 5, 2024

chore(python): Narrow type hint for `get_index_type` util (#13556)

Authored January 9, 2024

fix(python): Fix interchange protocol data buffer dtype (#10787)

Authored January 9, 2024

chore: Remove extra line break between checkboxes in GitHub bug report issues (#13576)

Authored January 9, 2024

Browse Other Releases

py-0.20.7, released February 4, 2024 (5,979 Δ)

py-0.20.6, released January 26, 2024 (4,762 Δ)

py-0.20.6-rc.1, released January 24, 2024 (4,863 Δ)

py-0.20.5, released January 24, 2024 (8,906 Δ)

py-0.20.4, released January 24, 2024 (5,997 Δ)

py-0.20.3, released January 2, 2024 (2,175 Δ)

py-0.20.3-rc.2, released December 28, 2023 (686 Δ)

py-0.20.3-rc.1, released December 28, 2023 (2,094 Δ)

py-0.20.2, released December 20, 2023 (614 Δ)

py-0.20.1, released December 18, 2023 (244 Δ)

Top Contributors in py-0.20.4

stinodego

reswqa

ritchie46

alexander-beedie

cgevans

petrosbar

aaarrti

orlp

Wainberg

jcrozum

Release Notes Published

⚠️ Deprecations

Fix group keys in partition_by(as_dict=True) / GroupBy.__iter__ in some cases (#13646)
Rename row_count_name/row_count_offset parameters in IO functions to row_index_* (#13563)
Deprecate dt.datetime in favor of dt.replace_time_zone(None) (#13520)
Rename with_row_count to with_row_index (#13494)
Deprecate Expr.where in favor of filter (#13440)
Allow drop with no inputs as a no-op (#13460)

🚀 Performance improvements

elide parallelism restriction on generic rolling expressions (#13662)
ensure time groups are parallelized (#13660)
do not eagerly compute bitcount (#13562)
optimise SQL engine string concat (#13499)
Refactor expression parsing logic of predicates/constraints (#13468)
Represent Enum categories as Series (#13434)
remove lifetime requirement from CategoricalChunkedBuilder (#13319)

✨ Enhancements

write parquet ColumnOrder (#13672)
Impl contains for ArrayNameSpace (#13638)
improve rolling() expression formatting (#13657)
Implement is_between in Rust (#11945)
Add base PolarsError and PolarsWarning class (#13615)
typing overloads for Series operator methods ge, gt, ... (#13167)
Expressify pattern of str.extract (#13607)
Impl join for ArrayNameSpace (#13586)
add SQL engine support for string cast to json (#13624)
add SQL engine support for EXTRACT and DATE_PART (#13603)
Allow drop with no inputs as a no-op (#13460)
add SQL engine support for POSITION and STRPOS (#13585)
additional multi-column support for pl.<function> entries (#13336)
is_in support for array dtype (#13559)
add new str.find expression, returning the index of a regex pattern or literal substring (#13561)
Impl and dispatch arr.first/last to get (#13536)
Implement from_dataframe natively (interchange protocol) (#10701)
add SQL engine support for LIKE and ILIKE pattern matching (#13522)
improve hive partition pruning (#13358) (#13426)
Add compact syntax for int_range starting from 0 (#13530)
don't rechunk by default in lazy scans (#13518)
Add cum_count expression function (#13478)
add SQL engine support for IF control flow function (#13491)
add SQL engine support for MOD function (#13502)
return datetime for datetime mean & median (#13417)
add SQL engine support for CONCAT_WS string function (#13483)
Allow map_batches to auto-convert output NumPy arrays to Series (#13277)
add SQL engine support for RIGHT and REVERSE string functions (#13461)
implement BinaryView and Utf8View in polars-arrow (#13243)
add SQL engine support for variadic string CONCAT function (#13428)
add support for AND in SQL join-clause context (#13242)
Impl ordering ops for array namespace (#13414)
add SQL engine support for REPLACE string function (#13431)
add SQL engine support for SIGN function (#13429)
add SQL engine support for IFNULL function (#13432)
additional SQL support for bytes, bit, and hex literals (#13389)

🐞 Bug fixes

gather.get schema (#13679)
Fix group keys in partition_by(as_dict=True) / GroupBy.__iter__ in some cases (#13646)
ensure we hit proper cache in nested rolling expressions (#13666)
Allow av_buffer cast numeric record to temporal type (#13661)
streaming cross join if swapped is hit (#13656)
Make sure rolling key is projected when process projection (#13622)
fix schema inference for json (#13637)
Improve parsing of inputs for Expr dunders (#13635)
Empty series of AggregatedList should also have list dtype (#13620)
Series.eq_missing should return an Expr when the input is an Expr (#13628)
fallback to cast kernel if inline_cast AnyValue raise (#13595)
Fix formatting in describe for precise quantiles (#13593)
fix reverse variable row decoding (#13587)
Fix scatter for null values (#13578)
Fix cum_count with regards to start value / null values (#13535)
Fix precision/scale handling and invalid numbers in string-to-decimal conversions. (#13548)
Treat Python None as null value for Object dtype (#13564)
Fix scatter to allow single temporal inputs (#13577)
Fix interchange protocol data buffer dtype (#10787)
Expr.replace to single value did not replace NULLs (#13551)
improve hive partition pruning (#13358) (#13426)
fix projection pushdown for new outer join schema (#13527)
dont raise when partial function is passed to map_elements (#13524)
improve reading of mixed string/other dtype column data from spreadsheets with openpyxl and pyxlsb engines (#13495)
ensure size-hint of TrueIdxIter is correct (#13508)
correct 'outer_coalesce' logic in case of duplicate names (#13501)
raise for out-of-range datetimes in to_datetime/strptime (#13403)
Fix Series equality for List/Array types (#13477)
Keep logical type when getting values from list (#13456)
Handle duplicate/ambiguous inputs for replace (#13217)
Handle empty inputs to Enum constructor (#13446)
Fix group_by iteration when grouping by certain selectors (#13437)
Fix to_pandas for 0x0 dataframe (#13420)
Fix offsets for numeric types in from_buffer (#13398)

📖 Documentation

Clarify documentation for the agg_list argument in Expr.map_batches (#13625)
fix linking to feature flags in user guide (#13644)
bring sink_ndjson docstring in line with other sink docstrings (#13636)
Update then and otherwise docstrings with "strings are parsed as column names" (#13630)
Add sink_ndjson to API reference. (#13627)
Improve documentation on broadcasting (#13394)
Add note about toolchain issue under native Windows (#13590)
Hint about ruff setting in VSCode (#13421)
Clarify examples for .transpose() (#13581)
Add additional Series docstring examples (#13558)
Doc example for read_csv (#13161) (#13545)
Add more doc examples on how to create an index column (#13532)
update SQL section of the README (#13529)
Add note to int_range docs for creating an index column (#13516)
add a note to the read_database_uri docstring about escaping special characters in the connection string (#13514)
update polars-business > polars-xdt link (#13509)
Fix various typos, grammar and formatting in docstrings and user guide (#13506)
Doc examples for threadpool_size and get_index_type (#13496)
Add missing datetime examples to docs (#13487)
add polars-distance to plugins page (#13454)
define file-like object in read_parquet docstring (#13463)
Move Series.struct.json_encode to methods in Sphinx autosummary (#13443)
Add missing examples of series/list.py (#13423)
show datetime.date import in code block (#13419)
clarify documentation for rle and rle_id (#13397)
use named series in Series.plot example (#13407)
fix alphabetical order of documentation entries (#13396)

🛠️ Other improvements

Auto-add 'needs triage' label to bugs (#13671)
make rolling index column visible to optimizer (#13658)
Enable new error message lint to improve stack trace display (#13596)
Add Documentation / Build system sections to the changelog (#13594)
Filter unhelpful messages in make build (#13579)
Remove extra line break between checkboxes in GitHub bug report issues (#13576)
Narrow type hint for get_index_type util (#13556)
Fix some test failures/slowdowns (#13504)
pandas 2.2 compat (#13467)
Increase timeout for gevent async test (#13448)
Do not end docstrings with a blank line (#13193)

Thank you to all our contributors for making this release possible! @Bromeon, @MarcNuebel, @MarcoGorelli, @ShivMunagala, @Wainberg, @aaarrti, @alexander-beedie, @bchalk101, @c-peters, @cgevans, @cmdlineluser, @collinprince, @deanm0000, @hamishs, @henryharbeck, @ion-elgreco, @jcrozum, @mcrumiller, @nameexhaustion, @orlp, @petrosbar, @r-brink, @reswqa, @ritchie46, @s-banach, @shritesh, @stinodego, @tim-stephenson and @wjandrea