
v0.3.0

@github-actions github-actions released this 20 Aug 22:13
· 163 commits to refs/heads/main since this release
b3f5260

‼️ v0.2 → v0.3 Migration Guide ‼️

We're proud to release version 0.3.0 of Daft! Please note that with this minor version increment, v0.3 contains several breaking changes:

  • daft.read_delta_lake
    • This function was deprecated in favor of daft.read_deltalake in v0.2.26 and is now removed. (#2663)
  • daft.read_parquet / daft.read_csv / daft.read_json
    • Schema hints are deprecated in favor of two parameters: infer_schema (whether to turn on schema inference) and schema. If infer_schema is False, schema is used as the definitive schema; otherwise it is treated as a schema hint that is applied after inference. (#2326)
  • Expression.str.normalize()
    • All parameters now default to False and must each be enabled individually. (#2647)
  • DataFrame.agg / GroupedDataFrame.agg
    • Tuple syntax for aggregations was deprecated in v0.2.18 and is no longer supported. Please use aggregation expressions instead. (#2663)
    • Ex: df.agg([(col("x"), "sum"), (col("y"), "mean")]) should be written instead as df.agg(col("x").sum(), col("y").mean())
  • DataFrame.count
    • Calling .count() with no arguments now returns a DataFrame with a column “count” containing the length of the entire DataFrame, instead of a count for each column. (#1996)
  • DataFrame.with_column
    • Resource requests should now be specified on UDF expressions (@udf(num_gpus=…)) instead of on projections (through .with_column(..., resource_request=...)). (#2654)
  • DataFrame.join
    • When joining two DataFrames, columns will now be merged only if they exactly match join keys. (#2631)
    • Ex:
import daft
from daft import col

df1 = daft.from_pydict({
    "a": ["x", "y"],
    "b": [1, 2],
})

df2 = daft.from_pydict({
    "a": ["y", "z"],
    "b": [20, 30],
})

result_df = df1.join(
    df2,
    left_on=[col("a"), col("b")],
    right_on=[col("a"), col("b") / 10],  # note the "/10": this key is an expression, not the bare column "b"
    how="outer",
)

result_df.sort("a").collect()
# before (v0.2)
╭──────┬───────╮
│ a    ┆ b     │
│ ---  ┆ ---   │
│ Utf8 ┆ Int64 │
╞══════╪═══════╡
│ x    ┆ 1     │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ y    ┆ 2     │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ z    ┆ 30    │
╰──────┴───────╯

# after (v0.3)
╭──────┬───────┬─────────╮
│ a    ┆ b     ┆ right.b │
│ ---  ┆ ---   ┆ ---     │
│ Utf8 ┆ Int64 ┆ Int64   │
╞══════╪═══════╪═════════╡
│ x    ┆ 1     ┆ None    │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ y    ┆ 2     ┆ 20      │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ z    ┆ None  ┆ 30      │
╰──────┴───────┴─────────╯
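
The column-naming rule behind the join example can be sketched in plain Python. This is only a toy model of the behavior described in this guide, not Daft's implementation: the names Key and output_columns are hypothetical, and the assumption that a non-clashing right column keeps its original name is ours. Only the "right." prefix for a clashing column comes from the output shown above.

```python
from typing import List, NamedTuple, Optional


class Key(NamedTuple):
    """Hypothetical stand-in for a join-key expression."""
    column: str                      # column the key reads from
    transform: Optional[str] = None  # e.g. "/10"; None means a bare column reference


def output_columns(
    left_cols: List[str],
    right_cols: List[str],
    left_on: List[Key],
    right_on: List[Key],
) -> List[str]:
    """Return the output column names of a join under the v0.3 rule (toy model)."""
    # A right column is merged into the left one only when the paired
    # join keys are the same bare column on both sides.
    merged = {
        rk.column
        for lk, rk in zip(left_on, right_on)
        if lk == rk and lk.transform is None
    }
    out = list(left_cols)
    for name in right_cols:
        if name in merged:
            continue  # merged into the left column of the same name
        # Assumption: only clashing names get the "right." prefix.
        out.append(f"right.{name}" if name in left_cols else name)
    return out


# Mirrors the example above: "a" matches exactly, "b" is joined against b/10.
print(output_columns(
    ["a", "b"], ["a", "b"],
    [Key("a"), Key("b")],
    [Key("a"), Key("b", "/10")],
))
```

Under this model, joining on [a, b] against [a, b] with no transformation would yield only the columns a and b, while the "/10" variant keeps the right-hand b as a separate column.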

Changes

✨ New Features

🚀 Performance Improvements

👾 Bug Fixes

📖 Documentation

🧰 Maintenance