Forecasting

Compose tabular data forecasting pipelines.

Components:

ForecastingPrepareDataForTrainOp(...[, ...])

Prepares the parameters for the training step.

ForecastingPreprocessingOp(project, ...[, ...])

Preprocesses BigQuery tables for training or prediction.

ForecastingValidationOp(input_tables, ...[, ...])

Validates BigQuery tables for training or prediction.

v1.forecasting.ForecastingPrepareDataForTrainOp(input_tables: list, preprocess_metadata: dict, model_feature_columns: list | None = None) -> Outputs

Prepares the parameters for the training step.

Converts the input_tables and the output of ForecastingPreprocessingOp into the input parameters of TimeSeriesDatasetCreateOp and AutoMLForecastingTrainingJobRunOp (see the pipeline sketch after the returns list below).

Parameters
input_tables: list

Serialized JSON array that specifies the input BigQuery tables and their specs.

preprocess_metadata: dict

The output of ForecastingPreprocessingOp: a serialized dictionary with two fields, processed_bigquery_table_uri and column_metadata.

model_feature_columns: list | None = None

Serialized list of column names to use as input features in the training step. If None, all columns are used in training.

Returns

NamedTuple with the following fields:

time_series_identifier_column: Name of the column that identifies the time series.

time_series_attribute_columns: Serialized column names that should be used as attribute columns.

available_at_forecast_columns: Serialized names of the columns that are available at forecast time.

unavailable_at_forecast_columns: Serialized names of the columns that are unavailable at forecast time.

column_transformations: Serialized transformations to apply to the input columns.

preprocess_bq_uri: The BigQuery table that stores the preprocessing result and will be used as training input.

target_column: Name of the column whose values the model is to predict.

time_column: Name of the column that identifies time order in the time series.

predefined_split_column: Name of the column that specifies an ML use of the row.

weight_column: Name of the column that should be used as the weight column.

data_granularity_unit: The data granularity unit.

data_granularity_count: The number of data granularity units between data points in the training data.
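A minimal pipeline sketch of how these outputs are typically wired into the downstream dataset-creation and training ops. The import paths and output names follow this page; the downstream parameter names (bq_source, forecast_horizon, etc.) mirror the GCPC v1 dataset and AutoML training components and should be verified against the installed package version. Project, table, and display-name values are placeholders, and the input_tables key names are illustrative assumptions (see the spec sketch under ForecastingPreprocessingOp below).

```python
import json

from kfp import dsl
from google_cloud_pipeline_components.v1.automl.training_job import (
    AutoMLForecastingTrainingJobRunOp,
)
from google_cloud_pipeline_components.v1.dataset import TimeSeriesDatasetCreateOp
from google_cloud_pipeline_components.v1.forecasting import (
    ForecastingPrepareDataForTrainOp,
    ForecastingPreprocessingOp,
)

PROJECT = "my-project"  # placeholder
# Serialized JSON array of table specs; key names are illustrative assumptions.
INPUT_TABLES = json.dumps(
    [{"bigquery_uri": "bq://my-project.my_dataset.sales",
      "table_type": "FORECASTING_PRIMARY"}]
)

@dsl.pipeline(name="forecasting-train")
def train_pipeline():
    preprocess_task = ForecastingPreprocessingOp(
        project=PROJECT,
        input_tables=INPUT_TABLES,
    )
    # Convert the preprocessing metadata into training parameters.
    prepare_task = ForecastingPrepareDataForTrainOp(
        input_tables=INPUT_TABLES,
        preprocess_metadata=preprocess_task.outputs["preprocess_metadata"],
    )
    # The preprocessed BigQuery table becomes the dataset source.
    dataset_task = TimeSeriesDatasetCreateOp(
        project=PROJECT,
        display_name="forecasting-dataset",
        bq_source=prepare_task.outputs["preprocess_bq_uri"],
    )
    AutoMLForecastingTrainingJobRunOp(
        project=PROJECT,
        display_name="forecasting-model",
        dataset=dataset_task.outputs["dataset"],
        target_column=prepare_task.outputs["target_column"],
        time_column=prepare_task.outputs["time_column"],
        time_series_identifier_column=prepare_task.outputs[
            "time_series_identifier_column"
        ],
        time_series_attribute_columns=prepare_task.outputs[
            "time_series_attribute_columns"
        ],
        available_at_forecast_columns=prepare_task.outputs[
            "available_at_forecast_columns"
        ],
        unavailable_at_forecast_columns=prepare_task.outputs[
            "unavailable_at_forecast_columns"
        ],
        column_transformations=prepare_task.outputs["column_transformations"],
        data_granularity_unit=prepare_task.outputs["data_granularity_unit"],
        data_granularity_count=prepare_task.outputs["data_granularity_count"],
        forecast_horizon=30,  # assumption: user-chosen horizon
    )
```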

v1.forecasting.ForecastingPreprocessingOp(project: str, input_tables: list, preprocess_metadata: dsl.OutputPath(dict), preprocessing_bigquery_dataset: str | None = '', location: str | None = 'US')

Preprocesses BigQuery tables for training or prediction.

Creates a BigQuery table for training or prediction based on the input tables. For training, a primary table is required; attribute tables are optional. For prediction, include all the tables that were used in training, plus a plan table.
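This page does not spell out the table-spec schema, so the snippet below is only a shape sketch: the key names (bigquery_uri, table_type) and the table_type values are assumptions for illustration, and should be checked against the component reference. Note also that the docs annotate input_tables as list while describing it as a serialized JSON array; the sketches here pass the serialized string, per the description.

```python
import json

# Hypothetical table specs: the key names and table_type values below are
# assumptions for illustration, not the component's documented schema.
table_specs = [
    {
        "bigquery_uri": "bq://my-project.my_dataset.sales",     # primary table
        "table_type": "FORECASTING_PRIMARY",
    },
    {
        "bigquery_uri": "bq://my-project.my_dataset.products",  # attribute table
        "table_type": "FORECASTING_ATTRIBUTE",
    },
]

# The component takes the array serialized as a JSON string.
input_tables = json.dumps(table_specs)
```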

Parameters
project: str

The GCP project ID that runs the pipeline.

input_tables: list

Serialized JSON array that specifies the input BigQuery tables and their specs.

preprocessing_bigquery_dataset: str | None = ''

Optional BigQuery dataset in which to save the preprocessing result table. If not set, the component creates a new dataset.

location: str | None = 'US'

Optional location for the BigQuery data; defaults to US.

Returns

preprocess_metadata: dsl.OutputPath(dict)

Serialized dictionary with two fields, processed_bigquery_table_uri and column_metadata, consumed by ForecastingPrepareDataForTrainOp.
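A minimal usage sketch, with placeholder project and dataset values; the output is consumed downstream via the task's outputs dictionary:

```python
from kfp import dsl
from google_cloud_pipeline_components.v1.forecasting import ForecastingPreprocessingOp

@dsl.pipeline(name="forecasting-preprocess")
def preprocess_pipeline(input_tables: str):
    preprocess_task = ForecastingPreprocessingOp(
        project="my-project",                         # placeholder
        input_tables=input_tables,
        preprocessing_bigquery_dataset="my_dataset",  # optional; omit to auto-create
        location="US",
    )
    # preprocess_task.outputs["preprocess_metadata"] is what
    # ForecastingPrepareDataForTrainOp consumes downstream.
```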

v1.forecasting.ForecastingValidationOp(input_tables: list, validation_theme: str, location: str | None = 'US')

Validates BigQuery tables for training or prediction.

Validates BigQuery tables for training or prediction against predefined requirements. For training, a primary table is required; attribute tables are optional. For prediction, include all the tables that were used in training, plus a plan table.

Parameters
input_tables: list

Serialized JSON array that specifies the input BigQuery tables and their specs.

validation_theme: str

Theme to use for validating the BigQuery tables. Acceptable values are FORECASTING_TRAINING and FORECASTING_PREDICTION.

location: str | None = 'US'

Optional location for the BigQuery data; defaults to US.
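A minimal sketch of running validation before preprocessing; the project value is a placeholder, and the ordering uses standard kfp task dependencies:

```python
from kfp import dsl
from google_cloud_pipeline_components.v1.forecasting import (
    ForecastingPreprocessingOp,
    ForecastingValidationOp,
)

@dsl.pipeline(name="forecasting-validate")
def validate_pipeline(input_tables: str):
    validation_task = ForecastingValidationOp(
        input_tables=input_tables,
        validation_theme="FORECASTING_TRAINING",  # or FORECASTING_PREDICTION
    )
    preprocess_task = ForecastingPreprocessingOp(
        project="my-project",  # placeholder
        input_tables=input_tables,
    )
    # Standard kfp task ordering: preprocess only after validation succeeds.
    preprocess_task.after(validation_task)
```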