colnade_dask¶
Dask backend adapter, construction functions, and I/O functions.
DaskBackend¶
Colnade backend adapter for Dask.
Expression translation produces callables of the form (df) -> Series | scalar, since Dask, like pandas, has no standalone lazy expression API. The callables build lazy Dask task graphs rather than executing immediately.
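The callable contract can be illustrated with a toy translator. This is a pure-Python sketch, not Colnade's implementation: the tuple-based AST nodes and the dict-of-lists "frame" are stand-ins for Colnade expression nodes and a Dask DataFrame.

```python
def translate(expr):
    """Recursively turn a toy AST node into a callable (df) -> result."""
    kind = expr[0]
    if kind == "col":                      # ("col", name) -> column lookup
        _, name = expr
        return lambda df: df[name]
    if kind == "lit":                      # ("lit", value) -> constant
        _, value = expr
        return lambda df: value
    if kind == "add":                      # ("add", left, right)
        _, left, right = expr
        lf, rf = translate(left), translate(right)
        return lambda df: [a + b for a, b in zip_broadcast(lf(df), rf(df))]
    raise ValueError(f"unknown node: {kind}")

def zip_broadcast(a, b):
    """Pair two operands, broadcasting a scalar against a column."""
    if not isinstance(a, list):
        a = [a] * len(b)
    if not isinstance(b, list):
        b = [b] * len(a)
    return zip(a, b)

df = {"x": [1, 2, 3]}
expr = ("add", ("col", "x"), ("lit", 10))  # x + 10
fn = translate(expr)                       # translation builds a callable...
result = fn(df)                            # ...which only runs when applied
```

The key property mirrored here is that `translate` does no work on data; it only composes functions, which in the real backend assemble a lazy Dask task graph.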
translate_expr(expr)¶
Recursively translate a Colnade AST node to a callable (df -> result).
validate_schema(source, schema)¶
Validate that a Dask DataFrame matches the schema.
validate_field_constraints(source, schema)¶
Validate value-level constraints (Field(), @schema_check) on data.
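Value-level validation can be sketched as follows. The constraint format (a dict of predicates) and the violation tuples are assumptions for illustration only; Colnade's actual Field() and @schema_check machinery differs.

```python
def check_constraints(columns, constraints):
    """Return a list of (column, row_index, value) constraint violations.

    `columns` is a dict of column-name -> list of values; `constraints`
    maps column names to predicates, standing in for Field() rules.
    """
    violations = []
    for name, predicate in constraints.items():
        for i, value in enumerate(columns[name]):
            if not predicate(value):
                violations.append((name, i, value))
    return violations

data = {"age": [25, -3, 40]}
errors = check_constraints(data, {"age": lambda v: v >= 0})
# the negative age at row 1 fails the Field-style non-negativity check
```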
to_arrow_batches(source, batch_size)¶
Convert a Dask DataFrame to an iterator of Arrow RecordBatches.
from_arrow_batches(batches, schema)¶
Reconstruct a Dask DataFrame from Arrow RecordBatches.
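The batching round trip behind these two functions can be sketched in plain Python. A real backend would emit and consume pyarrow.RecordBatch objects; here dicts of row-slices stand in so the chunking logic is visible.

```python
def to_batches(columns, batch_size):
    """Yield successive column-dict chunks of at most batch_size rows."""
    n = len(next(iter(columns.values())))
    for start in range(0, n, batch_size):
        yield {name: values[start:start + batch_size]
               for name, values in columns.items()}

def from_batches(batches):
    """Concatenate column-dict chunks back into one columnar dict."""
    out = {}
    for batch in batches:
        for name, values in batch.items():
            out.setdefault(name, []).extend(values)
    return out

data = {"id": [1, 2, 3, 4, 5], "score": [0.1, 0.2, 0.3, 0.4, 0.5]}
chunks = list(to_batches(data, batch_size=2))   # 3 chunks: 2 + 2 + 1 rows
restored = from_batches(chunks)                 # lossless round trip
```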
from_dict(data, schema)¶
Create a Dask DataFrame from a columnar dict with schema-driven dtypes.
Construction Functions¶
from_dict(schema, data)¶
Create a typed LazyFrame from a columnar dict.
Returns a LazyFrame because Dask is inherently lazy — use
.collect() to materialize. The schema drives dtype coercion so
plain Python values ([1, 2, 3]) are cast to the correct native
types (e.g. UInt64).
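Schema-driven coercion of the kind described above can be sketched like this. The string-keyed dtype mapping is a hypothetical stand-in; Colnade schemas declare their types differently.

```python
# Map declared dtype names to Python-level casts (illustrative subset).
COERCERS = {"UInt64": int, "Float64": float, "Utf8": str}

def coerce_columns(schema, data):
    """Cast each column's values according to the schema's declared type."""
    return {name: [COERCERS[dtype](v) for v in data[name]]
            for name, dtype in schema.items()}

schema = {"id": "UInt64", "price": "Float64"}
out = coerce_columns(schema, {"id": [1, 2, 3], "price": [1, 2, 3]})
# plain Python ints in "price" are cast to floats per the schema
```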
from_rows(schema, rows)¶
Create a typed LazyFrame from Row[S] instances.
Returns a LazyFrame because Dask is inherently lazy — use
.collect() to materialize. The type checker verifies that rows
match the schema — passing Orders.Row where Users.Row is
expected is a static error.
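The row-to-column pivot behind a from_rows-style constructor can be sketched with dataclasses. The UserRow class is a hypothetical stand-in for a Users.Row instance; Colnade's Row[S] type works differently.

```python
from dataclasses import dataclass, fields, astuple

@dataclass
class UserRow:            # hypothetical stand-in for Users.Row
    id: int
    name: str

def rows_to_columns(rows):
    """Pivot a list of row objects into a dict of column lists."""
    names = [f.name for f in fields(rows[0])]
    columns = {name: [] for name in names}
    for row in rows:
        for name, value in zip(names, astuple(row)):
            columns[name].append(value)
    return columns

cols = rows_to_columns([UserRow(1, "ada"), UserRow(2, "grace")])
# -> {'id': [1, 2], 'name': ['ada', 'grace']}
```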
I/O Functions¶
scan_parquet(path, schema, **kwargs)¶
Lazily scan a Parquet file into a typed LazyFrame backed by Dask.
scan_csv(path, schema, **kwargs)¶
Lazily scan a CSV file into a typed LazyFrame backed by Dask.
Applies the schema's dtype mapping to ensure correct column types.
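Because CSV carries no type information, the dtype mapping must be applied at read time. A stdlib-only sketch of that step (a real Dask backend would instead pass the mapping to dask.dataframe.read_csv via its dtype argument):

```python
import csv
import io

def read_csv_typed(text, dtypes):
    """Parse CSV text and cast each column via the dtype mapping."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in dtypes}
    for row in reader:
        for name, cast in dtypes.items():
            columns[name].append(cast(row[name]))
    return columns

text = "id,price\n1,9.5\n2,3.25\n"
out = read_csv_typed(text, {"id": int, "price": float})
# every column arrives with its schema-declared type, not as strings
```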
write_parquet(df, path, **kwargs)¶
Write a DataFrame or LazyFrame to a Parquet file.
write_csv(df, path, **kwargs)¶
Write a DataFrame or LazyFrame to a CSV file.