DataSpace

Code


Last updated 11 months ago

A transformation is essentially a Python function. Each transformation must define a function named transform that returns a pandas DataFrame. DataSpace calls this function automatically when appropriate.

import pandas as pd

def transform():
    # Build the result and return it (an empty DataFrame here, for illustration).
    df = pd.DataFrame()
    return df

The transform function can take multiple parameters. Each parameter must have the same name as another transform. Declaring a transform as a parameter adds it to the dependency graph: that transform is read, and its DataFrame is passed in as the corresponding argument of the transform function.

The following example shows how the connection_statistics transform declares a dependency on airport_info_ingest and jfk_ingest simply by naming those transforms as parameters of its transform function.
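
A minimal sketch of what the connection_statistics transform could look like. The column names (destination, airport_code, name) and the aggregation are assumptions made for illustration and are not taken from the actual datasets; only the parameter names come from the text above.

```python
import pandas as pd


def transform(airport_info_ingest: pd.DataFrame, jfk_ingest: pd.DataFrame) -> pd.DataFrame:
    # Each parameter name matches another transform, so that transform's
    # DataFrame is passed in here.

    # Count connections per destination (hypothetical "destination" column).
    counts = (
        jfk_ingest.groupby("destination")
        .size()
        .reset_index(name="connections")
    )

    # Attach airport details (hypothetical "airport_code" key column).
    return counts.merge(
        airport_info_ingest,
        left_on="destination",
        right_on="airport_code",
        how="left",
    )
```

Because the dependencies are expressed only through parameter names, renaming a parameter would change which transform is read.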

Shared Library

In some cases, extracting common definitions and functions into a shared space can be beneficial for code reuse. DataSpace offers a dedicated tab called "Library" for this purpose. Code written in the Library tab is automatically exported into a module named lib, making it easily accessible. Here's how you can use it:

import lib

# get data
lib.my_data

# call a function
lib.my_func()
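
For illustration, the Library tab behind the snippet above might contain something like the following. The names my_data and my_func come from the example; their contents here are placeholders and purely an assumption.

```python
# Hypothetical contents of the Library tab; everything defined at the top
# level here would be reachable as lib.<name> from a transform.

# Shared data: a small lookup table (placeholder values).
my_data = {"JFK": "New York", "LAX": "Los Angeles"}


def my_func() -> list:
    # Shared helper: return the known airport codes in sorted order.
    return sorted(my_data)
```

With this in place, lib.my_data would refer to the dictionary and lib.my_func() would return the sorted codes.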