Null Handling¶
This tutorial demonstrates how to work with nullable data in Colnade.
Runnable example
The complete code is in examples/null_handling.py.
Setup¶
import colnade as cn
from colnade_polars import from_dict

class Users(cn.Schema):
    id: cn.Column[cn.UInt64]
    name: cn.Column[cn.Utf8]
    age: cn.Column[cn.UInt64 | None]
    score: cn.Column[cn.Float64 | None]

df = from_dict(Users, {
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Charlie", "Diana", "Eve"],
    "age": [30, None, 35, None, 40],
    "score": [85.0, 92.5, None, 95.0, None],
})
fill_null — replace nulls with a value¶
filled = df.with_columns(
    Users.age.fill_null(0).alias(Users.age),
    Users.score.fill_null(0.0).alias(Users.score),
)
fill_null(value) creates a FunctionCall expression. For example, the Polars backend translates Users.age.fill_null(0) to pl.col("age").fill_null(0). Multiple columns can be filled in a single with_columns call.
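To make the semantics concrete without depending on Colnade or Polars, the following plain-Python sketch models what the two fill_null calls do to the tutorial data. Columns are represented as ordinary lists with None standing in for null. The helper name fill_null_values is hypothetical and is not part of either library.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def fill_null_values(column: list[Optional[T]], value: T) -> list[T]:
    """Return a copy of `column` with every None replaced by `value`.

    Hypothetical helper for illustration only; not a Colnade or Polars API.
    """
    return [value if item is None else item for item in column]


# Same data as the tutorial's `age` and `score` columns.
age = [30, None, 35, None, 40]
score = [85.0, 92.5, None, 95.0, None]

filled_age = fill_null_values(age, 0)
filled_score = fill_null_values(score, 0.0)

print(filled_age)    # [30, 0, 35, 0, 40]
print(filled_score)  # [85.0, 92.5, 0.0, 95.0, 0.0]
```

Note that the fill value matches the column's type (an integer for age, a float for score), mirroring the 0 and 0.0 used in the tutorial.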
drop_nulls — remove rows with nulls¶
# Drop rows where score is null
clean = df.drop_nulls(Users.score)
# Drop rows where any of the specified columns are null
clean = df.drop_nulls(Users.age, Users.score)
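As a rough model of the two calls above, the plain-Python sketch below shows which rows of the tutorial data survive in each case. It assumes, as the comment above states, that a row is dropped when any of the listed columns is null. The helper drop_null_rows is hypothetical and exists only for this illustration.

```python
from typing import Any

# The tutorial data, one dict per row, with None standing in for null.
rows: list[dict[str, Any]] = [
    {"id": 1, "name": "Alice", "age": 30, "score": 85.0},
    {"id": 2, "name": "Bob", "age": None, "score": 92.5},
    {"id": 3, "name": "Charlie", "age": 35, "score": None},
    {"id": 4, "name": "Diana", "age": None, "score": 95.0},
    {"id": 5, "name": "Eve", "age": 40, "score": None},
]


def drop_null_rows(rows: list[dict[str, Any]], *columns: str) -> list[dict[str, Any]]:
    """Keep only rows where every listed column is non-null.

    Hypothetical helper for illustration only; not a Colnade or Polars API.
    """
    return [row for row in rows if all(row[col] is not None for col in columns)]


by_score = drop_null_rows(rows, "score")
by_age_and_score = drop_null_rows(rows, "age", "score")

print([row["id"] for row in by_score])          # [1, 2, 4]
print([row["id"] for row in by_age_and_score])  # [1]
```

Listing more columns can only shrink the result: here only Alice has both an age and a score.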
is_null / is_not_null — filter by null status¶
# Keep only rows with null scores
null_scores = df.filter(Users.score.is_null())
# Keep only rows with non-null scores
valid_scores = df.filter(Users.score.is_not_null())
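The two filters above are complementary: every row ends up in exactly one of the two results. A minimal plain-Python sketch of that partition, using a boolean mask in place of the is_null expression (the variable names are illustrative and not part of Colnade):

```python
# Tutorial data for the `name` and `score` columns; None stands in for null.
names = ["Alice", "Bob", "Charlie", "Diana", "Eve"]
score = [85.0, 92.5, None, 95.0, None]

# One boolean per row: True where the score is null.
null_mask = [value is None for value in score]

# Rows selected by the mask, and rows selected by its negation.
null_score_names = [name for name, is_null in zip(names, null_mask) if is_null]
valid_score_names = [name for name, is_null in zip(names, null_mask) if not is_null]

print(null_score_names)   # ['Charlie', 'Eve']
print(valid_score_names)  # ['Alice', 'Bob', 'Diana']
```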
Combining null handling with other operations¶
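A common pattern is to fill nulls first and then filter on the filled column. Alternatively, combine a null check with a value filter so that rows with missing values are excluded rather than replaced.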
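The difference between the two patterns can be sketched in plain Python, independent of Colnade. The threshold of 90.0 and all variable names below are assumptions chosen for illustration; they do not come from the tutorial's own example file.

```python
from typing import Any

# The tutorial data, one dict per row, with None standing in for null.
rows: list[dict[str, Any]] = [
    {"id": 1, "name": "Alice", "age": 30, "score": 85.0},
    {"id": 2, "name": "Bob", "age": None, "score": 92.5},
    {"id": 3, "name": "Charlie", "age": 35, "score": None},
    {"id": 4, "name": "Diana", "age": None, "score": 95.0},
    {"id": 5, "name": "Eve", "age": 40, "score": None},
]

THRESHOLD = 90.0  # illustrative cutoff, not taken from the tutorial

# Pattern 1: fill null scores with 0.0, then filter on the filled column.
filled = [
    {**row, "score": 0.0 if row["score"] is None else row["score"]}
    for row in rows
]
low_after_fill = [row["id"] for row in filled if row["score"] < THRESHOLD]

# Pattern 2: combine a null check with the value filter in one predicate.
low_known_only = [
    row["id"]
    for row in rows
    if row["score"] is not None and row["score"] < THRESHOLD
]

print(low_after_fill)  # [1, 3, 5]
print(low_known_only)  # [1]
```

Filling first treats a missing score as a real 0.0, so Charlie and Eve count as low scorers. Checking for null in the predicate leaves them out. Which behavior is right depends on whether a missing value should be read as a known default or as unknown.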