Shadow Ingest DataFrame Contract

This page describes what users should expect from the returned polars.DataFrame objects.

Common Contract

For all dataframe-returning public APIs:

  • the public return type is polars.DataFrame
  • transport choice is hidden from users
  • the core identifying columns listed for each API below should keep stable names and types; other columns may vary by dataset
  • pandas users can convert explicitly with df.to_pandas()

gather_daily_price(...)

Stable core columns:

  • trade_date: Date
  • stock_code: String
  • requested price fields such as open, close, volume

Behavior:

  • one row per trade_date x stock_code
  • requested field subset is respected
  • trade_date is normalized to Date regardless of which transport served the data
  • total_turnover refers to traded value, not share volume

gather_daily_snapshot(...)

Stable core columns:

  • trade_date: Date
  • stock_code: String

Behavior:

  • one trading date per call
  • cross-sectional dataframe for the requested stock list
  • each row is keyed by trade_date x stock_code
  • additional fields depend on the snapshot dataset

gather_financial_snapshot(...)

Stable core column:

  • stock_code

Behavior:

  • returns the latest snapshot available for each requested stock code
  • financial statement columns depend on statement_type
  • result may be empty if the upstream financial PIT dataset is empty or incomplete

get_industry_mapping(...)

Stable core columns:

  • trade_date: Date
  • stock_code: String
  • stock_name: String | Null
  • standard: String

Behavior:

  • one row per matching stock_code within the selected standard and date
  • public results always expose stock_code, even if the backing dataset uses order_book_id
  • hierarchy columns include first-, second-, and third-level industry code/name pairs

get_industry_members(...)

Stable core columns:

  • trade_date: Date
  • stock_code: String
  • stock_name: String | Null
  • standard: String

Behavior:

  • returns member rows for one matched industry definition
  • returns the same stable hierarchy columns as get_industry_mapping(...)
  • stock_name is joined from common_stock.symbol for the same trade date