Code

Transformations are essentially Python functions. Each transformation must define a function named transform that returns a pandas DataFrame. The function is called automatically when appropriate.

import pandas as pd

def transform():
    df = pd.DataFrame()
    return df

The function can take multiple parameters. Each parameter of the transform function must have the same name as an existing transform. Declaring a transform as a parameter adds it to the dependency graph; that transform is then read and its DataFrame is passed to the transform function as the corresponding argument.
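To make the naming rule concrete, here is a minimal sketch of how parameter names could be read as dependencies, using Python's standard inspect module. It is an illustration only, not DataSpace's actual implementation; the helper dependencies_of and the transform name raw_flights are hypothetical.

```python
import inspect

import pandas as pd


def dependencies_of(func):
    """Return the parameter names of func; each one is read as a transform name."""
    return list(inspect.signature(func).parameters)


# Hypothetical transform that depends on a transform named raw_flights.
def transform(raw_flights):
    # Drop incomplete rows from the upstream DataFrame.
    return raw_flights.dropna()


print(dependencies_of(transform))
```

The point of the sketch is that the parameter name alone carries the dependency, so no separate registration step is needed.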

The following example shows how the connection_statistics transform declares a dependency on airport_info_ingest and jfk_ingest simply by naming those transforms as parameters of the transform function.
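A minimal sketch of what the connection_statistics transform could look like is given here. Only the two parameter names come from the text above; the column names (destination, code, name) and the statistic computed are assumptions made for illustration.

```python
import pandas as pd


# Body of the connection_statistics transform. The parameter names match the
# upstream transforms, so their DataFrames are passed in as arguments.
def transform(airport_info_ingest, jfk_ingest):
    # Assumed columns: jfk_ingest has "destination";
    # airport_info_ingest has "code" and "name".
    counts = (
        jfk_ingest.groupby("destination")
        .size()
        .reset_index(name="connections")
    )
    # Attach the airport name to each destination code.
    df = counts.merge(
        airport_info_ingest, left_on="destination", right_on="code", how="left"
    )
    return df[["destination", "name", "connections"]]
```

To try the function outside the platform, call it directly with two small hand-built DataFrames that have the assumed columns.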

Shared Library

In some cases, extracting common definitions and functions into a shared space can be beneficial for code reuse. DataSpace offers a dedicated tab called "Library" for this purpose. Code written in the Library tab is automatically exported into a module named lib, making it easily accessible. Here's how you can use it:

import lib

# get data
lib.my_data

# call a function
lib.my_func()
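For the usage above to work, the Library tab has to define my_data and my_func. A minimal sketch of such Library code might look like the following; the contents of my_data and the behavior of my_func are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical contents of the Library tab; exposed to transforms as lib.

# Shared data, accessed elsewhere as lib.my_data.
my_data = pd.DataFrame({"code": ["JFK", "LAX"], "hub": [True, False]})


def my_func():
    """Shared helper, called elsewhere as lib.my_func(); returns hub airport codes."""
    return my_data.loc[my_data["hub"], "code"].tolist()
```

Keeping shared constants and helpers in one place like this means each transform only needs the import lib line to reuse them.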
