Code
The code repository can contain two types of code files: regular Python scripts and transformations. A file is identified as a transformation when it defines a function called transform that returns a LazyFrame (or DataFrame). The function is called automatically when appropriate.
import polars as pl

def transform(airport_info_ingest, jfk_ingest):
    # Join the two upstream transforms on their shared "id" column.
    return airport_info_ingest.join(jfk_ingest, on="id")
The function can take multiple parameters. Each parameter of the transform function must have the same name as an existing transform. Declaring a transform as a parameter updates the dependency graph; the named transform is then read and its DataFrame is passed to the transform function as that argument.
For example, the connection_statistics transform declares a dependency on airport_info_ingest and jfk_ingest just by specifying those transform names as parameters of its transform function.
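The name-based wiring described above can be sketched with the standard library alone. This is a minimal illustration, not the platform's actual implementation: the TRANSFORMS registry and the dependencies and run_all helpers are hypothetical, each transform is modelled as a function named after the transform (rather than a file containing a function called transform), and plain lists of dicts stand in for LazyFrames so that the sketch runs without polars.

```python
import inspect
from graphlib import TopologicalSorter


# Hypothetical stand-ins for three transforms. Lists of dicts replace
# real LazyFrames so the sketch needs only the standard library.
def airport_info_ingest():
    return [{"id": 1, "airport": "JFK"}, {"id": 2, "airport": "LGA"}]


def jfk_ingest():
    return [{"id": 1, "flights": 120}]


def connection_statistics(airport_info_ingest, jfk_ingest):
    # Inner join on "id", mirroring the polars join shown earlier.
    flights_by_id = {row["id"]: row for row in jfk_ingest}
    return [
        {**row, **flights_by_id[row["id"]]}
        for row in airport_info_ingest
        if row["id"] in flights_by_id
    ]


# Hypothetical registry mapping transform names to their functions.
TRANSFORMS = {
    "airport_info_ingest": airport_info_ingest,
    "jfk_ingest": jfk_ingest,
    "connection_statistics": connection_statistics,
}


def dependencies(func):
    """Return the parameter names of func, read as upstream transform names."""
    return list(inspect.signature(func).parameters)


def run_all(transforms):
    """Run every transform after its dependencies and return all results."""
    graph = {name: dependencies(func) for name, func in transforms.items()}
    results = {}
    # static_order() yields each transform only after its dependencies.
    for name in TopologicalSorter(graph).static_order():
        inputs = [results[dep] for dep in graph[name]]
        results[name] = transforms[name](*inputs)
    return results


results = run_all(TRANSFORMS)
```

Here results["connection_statistics"] holds the joined rows. Reading the parameter names with inspect.signature is what lets a transform declare its inputs without any separate configuration, and the topological sort ensures the two ingest transforms run before the transform that depends on them.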
