AutoML Forecasting

Experimental AutoML forecasting components.

Components:

ForecastingEnsembleOp(project, location, ...)

Ensembles AutoML Forecasting models.

ForecastingStage1TunerOp(project, location, ...)

Searches AutoML Forecasting architectures and selects the top trials.

ForecastingStage2TunerOp(project, location, ...)

Tunes AutoML Forecasting models and selects top trials.

Functions:

get_learn_to_learn_forecasting_pipeline_and_parameters(*, ...)

Returns l2l_forecasting pipeline and formatted parameters.

get_sequence_to_sequence_forecasting_pipeline_and_parameters(*, ...)

Returns seq2seq forecasting pipeline and formatted parameters.

get_temporal_fusion_transformer_forecasting_pipeline_and_parameters(*, ...)

Returns tft_forecasting pipeline and formatted parameters.

get_time_series_dense_encoder_forecasting_pipeline_and_parameters(*, ...)

Returns timeseries_dense_encoder_forecasting pipeline and parameters.

preview.automl.forecasting.ForecastingEnsembleOp(project: str, location: str, root_dir: str, transform_output: dsl.Input[system.Artifact], metadata: dsl.Input[system.Artifact], tuning_result_input: dsl.Input[system.Artifact], instance_baseline: dsl.Input[system.Artifact], instance_schema_path: dsl.Input[system.Artifact], prediction_image_uri: str, gcp_resources: dsl.OutputPath(str), model_architecture: dsl.Output[system.Artifact], example_instance: dsl.Output[system.Artifact], unmanaged_container_model: dsl.Output[google.UnmanagedContainerModel], explanation_metadata: dsl.OutputPath(dict), explanation_metadata_artifact: dsl.Output[system.Artifact], explanation_parameters: dsl.OutputPath(dict), encryption_spec_key_name: str | None = '')

Ensembles AutoML Forecasting models.

Parameters:
project: str

Project to run the job in.

location: str

Region to run the job in.

root_dir: str

The Cloud Storage path to store the output.

transform_output: dsl.Input[system.Artifact]

The transform output artifact.

metadata: dsl.Input[system.Artifact]

The tabular example gen metadata.

tuning_result_input: dsl.Input[system.Artifact]

AutoML Tabular tuning result.

instance_baseline: dsl.Input[system.Artifact]

The instance baseline used to calculate explanations.

instance_schema_path: dsl.Input[system.Artifact]

The path to the instance schema, describing the input data for the tf_model at serving time.

encryption_spec_key_name: str | None = ''

Customer-managed encryption key.

prediction_image_uri: str

URI of the Docker image to be used as the container for serving predictions. This URI must identify an image in Artifact Registry or Container Registry.

Returns:

gcp_resources: dsl.OutputPath(str)

GCP resources created by this component. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.

model_architecture: dsl.Output[system.Artifact]

The architecture of the output model.

unmanaged_container_model: dsl.Output[google.UnmanagedContainerModel]

Model information needed to perform batch prediction.

explanation_metadata: dsl.OutputPath(dict)

The explanation metadata used by Vertex online and batch explanations.

explanation_metadata_artifact: dsl.Output[system.Artifact]

The explanation metadata used by Vertex online and batch explanations in the format of a KFP Artifact.

explanation_parameters: dsl.OutputPath(dict)

The explanation parameters used by Vertex online and batch explanations.

example_instance: dsl.Output[system.Artifact]

An example instance which may be used as an input for predictions.

preview.automl.forecasting.ForecastingStage1TunerOp(project: str, location: str, root_dir: str, num_selected_trials: int, deadline_hours: float, num_parallel_trials: int, single_run_max_secs: int, metadata: dsl.Input[system.Artifact], transform_output: dsl.Input[system.Artifact], materialized_train_split: dsl.Input[system.Artifact], materialized_eval_split: dsl.Input[system.Artifact], gcp_resources: dsl.OutputPath(str), tuning_result_output: dsl.Output[system.Artifact], study_spec_parameters_override: list | None = [], worker_pool_specs_override_json: list | None = [], reduce_search_space_mode: str | None = 'regular', encryption_spec_key_name: str | None = '')

Searches AutoML Forecasting architectures and selects the top trials.

Parameters:
project: str

Project to run hyperparameter tuning.

location: str

Location for running the hyperparameter tuning.

root_dir: str

The Cloud Storage location to store the output.

study_spec_parameters_override: list | None = []

JSON study spec. E.g., [{"parameter_id": "activation", "categorical_value_spec": {"values": ["tanh"]}}]

worker_pool_specs_override_json: list | None = []

JSON worker pool specs. E.g., [{"machine_spec": {"machine_type": "n1-standard-16"}}, {}, {}, {"machine_spec": {"machine_type": "n1-standard-16"}}]

reduce_search_space_mode: str | None = 'regular'

The reduce search space mode. Possible values: "regular" (default), "minimal", "full".

num_selected_trials: int

Number of selected trials. The number of weak learners in the final model is 5 * num_selected_trials.

deadline_hours: float

Number of hours the hyperparameter tuning should run.

num_parallel_trials: int

Number of parallel training trials.

single_run_max_secs: int

Max number of seconds each training trial runs.

metadata: dsl.Input[system.Artifact]

The tabular example gen metadata.

transform_output: dsl.Input[system.Artifact]

The transform output artifact.

materialized_train_split: dsl.Input[system.Artifact]

The materialized train split.

materialized_eval_split: dsl.Input[system.Artifact]

The materialized eval split.

encryption_spec_key_name: str | None = ''

Customer-managed encryption key.

Returns:

gcp_resources: dsl.OutputPath(str)

GCP resources created by this component. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.

tuning_result_output: dsl.Output[system.Artifact]

The trained model and architectures.

preview.automl.forecasting.ForecastingStage2TunerOp(project: str, location: str, root_dir: str, num_selected_trials: int, deadline_hours: float, num_parallel_trials: int, single_run_max_secs: int, metadata: dsl.Input[system.Artifact], transform_output: dsl.Input[system.Artifact], materialized_train_split: dsl.Input[system.Artifact], materialized_eval_split: dsl.Input[system.Artifact], tuning_result_input_path: dsl.Input[system.Artifact], gcp_resources: dsl.OutputPath(str), tuning_result_output: dsl.Output[system.Artifact], worker_pool_specs_override_json: list | None = [], encryption_spec_key_name: str | None = '')

Tunes AutoML Forecasting models and selects top trials.

Parameters:
project: str

Project to run stage 2 tuner.

location: str

Cloud region for running the component (e.g., us-central1).

root_dir: str

The Cloud Storage location to store the output.

worker_pool_specs_override_json: list | None = []

JSON worker pool specs. E.g., [{"machine_spec": {"machine_type": "n1-standard-16"}}, {}, {}, {"machine_spec": {"machine_type": "n1-standard-16"}}]

num_selected_trials: int

Number of selected trials. This is the number of weak learners in the final model.

deadline_hours: float

Number of hours the cross-validation trainer should run.

num_parallel_trials: int

Number of parallel training trials.

single_run_max_secs: int

Max number of seconds each training trial runs.

metadata: dsl.Input[system.Artifact]

The forecasting example gen metadata.

transform_output: dsl.Input[system.Artifact]

The transform output artifact.

materialized_train_split: dsl.Input[system.Artifact]

The materialized train split.

materialized_eval_split: dsl.Input[system.Artifact]

The materialized eval split.

encryption_spec_key_name: str | None = ''

Customer-managed encryption key.

tuning_result_input_path: dsl.Input[system.Artifact]

Path to the JSON file of hyperparameter tuning results to use when evaluating models.

Returns:

gcp_resources: dsl.OutputPath(str)

GCP resources created by this component. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.

tuning_result_output: dsl.Output[system.Artifact]

The trained (private) model artifact paths and their hyperparameters.
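These tuner components are typically invoked by the prebuilt pipelines returned from the get_*_pipeline_and_parameters functions below, but they can also be wired together directly in a KFP pipeline. The following is only a minimal sketch, assuming the upstream artifacts (metadata, transform output, and materialized splits) already exist in Cloud Storage; every project, region, and URI value is a hypothetical placeholder:

    from kfp import compiler, dsl
    from google_cloud_pipeline_components.preview.automl.forecasting import (
        ForecastingStage1TunerOp,
        ForecastingStage2TunerOp,
    )

    ROOT = 'gs://my-bucket/pipeline_root'  # hypothetical bucket


    @dsl.pipeline(name='forecasting-tuning-sketch')
    def tuning_pipeline():
        # Import pre-existing artifacts produced by upstream steps
        # (all URIs are placeholders).
        metadata = dsl.importer(
            artifact_uri=f'{ROOT}/metadata', artifact_class=dsl.Artifact)
        transform_output = dsl.importer(
            artifact_uri=f'{ROOT}/transform_output', artifact_class=dsl.Artifact)
        train_split = dsl.importer(
            artifact_uri=f'{ROOT}/materialized_train', artifact_class=dsl.Artifact)
        eval_split = dsl.importer(
            artifact_uri=f'{ROOT}/materialized_eval', artifact_class=dsl.Artifact)

        # Stage 1 searches architectures and keeps the top trials.
        stage_1 = ForecastingStage1TunerOp(
            project='my-project',
            location='us-central1',
            root_dir=ROOT,
            num_selected_trials=5,
            deadline_hours=1.0,
            num_parallel_trials=5,
            single_run_max_secs=3600,
            metadata=metadata.output,
            transform_output=transform_output.output,
            materialized_train_split=train_split.output,
            materialized_eval_split=eval_split.output,
        )

        # Stage 2 tunes the architectures selected in stage 1.
        ForecastingStage2TunerOp(
            project='my-project',
            location='us-central1',
            root_dir=ROOT,
            num_selected_trials=5,
            deadline_hours=1.0,
            num_parallel_trials=5,
            single_run_max_secs=3600,
            metadata=metadata.output,
            transform_output=transform_output.output,
            materialized_train_split=train_split.output,
            materialized_eval_split=eval_split.output,
            tuning_result_input_path=stage_1.outputs['tuning_result_output'],
        )


    compiler.Compiler().compile(tuning_pipeline, 'tuning_pipeline.json')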

preview.automl.forecasting.get_learn_to_learn_forecasting_pipeline_and_parameters(*, project: str, location: str, root_dir: str, target_column: str, optimization_objective: str, transformations: dict[str, list[str]], train_budget_milli_node_hours: float, time_column: str, time_series_identifier_columns: list[str], time_series_identifier_column: str | None = None, time_series_attribute_columns: list[str] | None = None, available_at_forecast_columns: list[str] | None = None, unavailable_at_forecast_columns: list[str] | None = None, forecast_horizon: int | None = None, context_window: int | None = None, evaluated_examples_bigquery_path: str | None = None, window_predefined_column: str | None = None, window_stride_length: int | None = None, window_max_count: int | None = None, holiday_regions: list[str] | None = None, stage_1_num_parallel_trials: int | None = None, stage_1_tuning_result_artifact_uri: str | None = None, stage_2_num_parallel_trials: int | None = None, num_selected_trials: int | None = None, data_source_csv_filenames: str | None = None, data_source_bigquery_table_path: str | None = None, predefined_split_key: str | None = None, training_fraction: float | None = None, validation_fraction: float | None = None, test_fraction: float | None = None, weight_column: str | None = None, dataflow_service_account: str | None = None, dataflow_subnetwork: str | None = None, dataflow_use_public_ips: bool = True, feature_transform_engine_bigquery_staging_full_dataset_id: str = '', feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16', feature_transform_engine_dataflow_max_num_workers: int = 10, feature_transform_engine_dataflow_disk_size_gb: int = 40, evaluation_batch_predict_machine_type: str = 'n1-standard-16', evaluation_batch_predict_starting_replica_count: int = 25, evaluation_batch_predict_max_replica_count: int = 25, evaluation_dataflow_machine_type: str = 'n1-standard-16', evaluation_dataflow_max_num_workers: int = 25, evaluation_dataflow_starting_num_workers: int = 22, evaluation_dataflow_disk_size_gb: int = 50, study_spec_parameters_override: list[dict[str, Any]] | None = None, stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None, stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None, enable_probabilistic_inference: bool = False, quantiles: list[float] | None = None, encryption_spec_key_name: str | None = None, model_display_name: str | None = None, model_description: str | None = None, run_evaluation: bool = True, group_columns: list[str] | None = None, group_total_weight: float = 0.0, temporal_total_weight: float = 0.0, group_temporal_total_weight: float = 0.0) → tuple[str, dict[str, Any]]

Returns l2l_forecasting pipeline and formatted parameters.

Parameters:
project: str

The GCP project that runs the pipeline components.

location: str

The GCP region that runs the pipeline components.

root_dir: str

The root GCS directory for the pipeline components.

target_column: str

The target column name.

optimization_objective: str

"minimize-rmse", "minimize-mae", "minimize-rmsle", "minimize-rmspe", "minimize-wape-mae", "minimize-mape", or "minimize-quantile-loss".

transformations: dict[str, list[str]]

Dict mapping auto and/or type-resolutions to feature columns. The supported types are: auto, categorical, numeric, text, and timestamp.

train_budget_milli_node_hours: float

The train budget for creating this model, expressed in milli node hours, i.e., a value of 1,000 means 1 node hour.

time_column: str

The column that indicates the time.

time_series_identifier_columns: list[str]

The columns which distinguish different time series.

time_series_identifier_column: str | None = None

[Deprecated] The column which distinguishes different time series.

time_series_attribute_columns: list[str] | None = None

The columns that are invariant across the same time series.

available_at_forecast_columns: list[str] | None = None

The columns that are available at the forecast time.

unavailable_at_forecast_columns: list[str] | None = None

The columns that are unavailable at the forecast time.

forecast_horizon: int | None = None

The length of the horizon.

context_window: int | None = None

The length of the context window.

evaluated_examples_bigquery_path: str | None = None

The BigQuery dataset to write the predicted examples into for evaluation, in the format bq://project.dataset.

window_predefined_column: str | None = None

The column that indicates the start of each window.

window_stride_length: int | None = None

The stride length used to generate windows.

window_max_count: int | None = None

The maximum number of windows that will be generated.

holiday_regions: list[str] | None = None

The geographical regions where the holiday effect is applied in modeling.

stage_1_num_parallel_trials: int | None = None

Number of parallel trials for stage 1.

stage_1_tuning_result_artifact_uri: str | None = None

The stage 1 tuning result artifact GCS URI.

stage_2_num_parallel_trials: int | None = None

Number of parallel trials for stage 2.

num_selected_trials: int | None = None

Number of selected trials.

data_source_csv_filenames: str | None = None

A string that represents a comma-separated list of CSV filenames.

data_source_bigquery_table_path: str | None = None

The BigQuery table path in the format bq://bq_project.bq_dataset.bq_table.

predefined_split_key: str | None = None

The predefined_split column name.

training_fraction: float | None = None

The training fraction.

validation_fraction: float | None = None

The validation fraction.

test_fraction: float | None = None

The test fraction.

weight_column: str | None = None

The weight column name.

dataflow_service_account: str | None = None

The full service account name.

dataflow_subnetwork: str | None = None

The dataflow subnetwork.

dataflow_use_public_ips: bool = True

True to enable dataflow public IPs.

feature_transform_engine_bigquery_staging_full_dataset_id: str = ''

The full ID of the feature transform engine staging dataset.

feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16'

The dataflow machine type of the feature transform engine.

feature_transform_engine_dataflow_max_num_workers: int = 10

The max number of dataflow workers of the feature transform engine.

feature_transform_engine_dataflow_disk_size_gb: int = 40

The disk size of the dataflow workers of the feature transform engine.

evaluation_batch_predict_machine_type: str = 'n1-standard-16'

Machine type for the batch prediction job in evaluation, such as 'n1-standard-16'.

evaluation_batch_predict_starting_replica_count: int = 25

Number of replicas to use in the batch prediction cluster at startup time.

evaluation_batch_predict_max_replica_count: int = 25

The maximum count of replicas the batch prediction job can scale to.

evaluation_dataflow_machine_type: str = 'n1-standard-16'

Machine type for the dataflow job in evaluation, such as 'n1-standard-16'.

evaluation_dataflow_max_num_workers: int = 25

Maximum number of dataflow workers.

evaluation_dataflow_starting_num_workers: int = 22

Starting number of dataflow workers.

evaluation_dataflow_disk_size_gb: int = 50

The disk space in GB for dataflow.

study_spec_parameters_override: list[dict[str, Any]] | None = None

The list for overriding study spec.

stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 1 tuner worker pool spec.

stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 2 trainer worker pool spec.

enable_probabilistic_inference: bool = False

If probabilistic inference is enabled, the model will fit a distribution that captures the uncertainty of a prediction. If quantiles are specified, then the quantiles of the distribution are also returned.

quantiles: list[float] | None = None

Quantiles to use for probabilistic inference. Up to 5 unique quantiles are allowed, each with a value strictly between 0 and 1.

encryption_spec_key_name: str | None = None

The KMS key name.

model_display_name: str | None = None

Optional display name for model.

model_description: str | None = None

Optional description.

run_evaluation: bool = True

True to evaluate the ensembled model on the test split.

group_columns: list[str] | None = None

A list of time series attribute column names that define the time series hierarchy.

group_total_weight: float = 0.0

The weight of the loss for predictions aggregated over time series in the same group.

temporal_total_weight: float = 0.0

The weight of the loss for predictions aggregated over the horizon for a single time series.

group_temporal_total_weight: float = 0.0

The weight of the loss for predictions aggregated over both the horizon and time series in the same hierarchy group.

Returns:

Tuple of pipeline_definition_path and parameter_values.
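For illustration, here is a minimal sketch of how the returned tuple is typically passed to a Vertex AI PipelineJob. All project, bucket, dataset, and column names below are hypothetical placeholders:

    from google.cloud import aiplatform
    from google_cloud_pipeline_components.preview.automl.forecasting import (
        get_learn_to_learn_forecasting_pipeline_and_parameters,
    )

    pipeline_path, parameter_values = (
        get_learn_to_learn_forecasting_pipeline_and_parameters(
            project='my-project',                     # hypothetical project
            location='us-central1',
            root_dir='gs://my-bucket/pipeline_root',  # hypothetical bucket
            target_column='sales',
            optimization_objective='minimize-rmse',
            transformations={'auto': ['sales', 'date', 'store_id']},
            train_budget_milli_node_hours=1000,       # 1 node hour
            time_column='date',
            time_series_identifier_columns=['store_id'],
            forecast_horizon=30,
            context_window=30,
            data_source_bigquery_table_path='bq://my-project.my_dataset.sales',
            evaluated_examples_bigquery_path='bq://my-project.my_dataset',
        )
    )

    # Submit the compiled pipeline definition with the formatted parameters.
    job = aiplatform.PipelineJob(
        display_name='l2l-forecasting',
        template_path=pipeline_path,
        parameter_values=parameter_values,
        pipeline_root='gs://my-bucket/pipeline_root',
    )
    job.run()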

preview.automl.forecasting.get_sequence_to_sequence_forecasting_pipeline_and_parameters(*, project: str, location: str, root_dir: str, target_column: str, optimization_objective: str, transformations: dict[str, list[str]], train_budget_milli_node_hours: float, time_column: str, time_series_identifier_columns: list[str], time_series_identifier_column: str | None = None, time_series_attribute_columns: list[str] | None = None, available_at_forecast_columns: list[str] | None = None, unavailable_at_forecast_columns: list[str] | None = None, forecast_horizon: int | None = None, context_window: int | None = None, evaluated_examples_bigquery_path: str | None = None, window_predefined_column: str | None = None, window_stride_length: int | None = None, window_max_count: int | None = None, holiday_regions: list[str] | None = None, stage_1_num_parallel_trials: int | None = None, stage_1_tuning_result_artifact_uri: str | None = None, stage_2_num_parallel_trials: int | None = None, num_selected_trials: int | None = None, data_source_csv_filenames: str | None = None, data_source_bigquery_table_path: str | None = None, predefined_split_key: str | None = None, training_fraction: float | None = None, validation_fraction: float | None = None, test_fraction: float | None = None, weight_column: str | None = None, dataflow_service_account: str | None = None, dataflow_subnetwork: str | None = None, dataflow_use_public_ips: bool = True, feature_transform_engine_bigquery_staging_full_dataset_id: str = '', feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16', feature_transform_engine_dataflow_max_num_workers: int = 10, feature_transform_engine_dataflow_disk_size_gb: int = 40, evaluation_batch_predict_machine_type: str = 'n1-standard-16', evaluation_batch_predict_starting_replica_count: int = 25, evaluation_batch_predict_max_replica_count: int = 25, evaluation_dataflow_machine_type: str = 'n1-standard-16', evaluation_dataflow_max_num_workers: int = 25, evaluation_dataflow_starting_num_workers: int = 22, evaluation_dataflow_disk_size_gb: int = 50, study_spec_parameters_override: list[dict[str, Any]] | None = None, stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None, stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None, encryption_spec_key_name: str | None = None, model_display_name: str | None = None, model_description: str | None = None, run_evaluation: bool = True)

Returns seq2seq forecasting pipeline and formatted parameters.

Parameters:
project: str

The GCP project that runs the pipeline components.

location: str

The GCP region that runs the pipeline components.

root_dir: str

The root GCS directory for the pipeline components.

target_column: str

The target column name.

optimization_objective: str

"minimize-rmse", "minimize-mae", "minimize-rmsle", "minimize-rmspe", "minimize-wape-mae", "minimize-mape", or "minimize-quantile-loss".

transformations: dict[str, list[str]]

Dict mapping auto and/or type-resolutions to feature columns. The supported types are: auto, categorical, numeric, text, and timestamp.

train_budget_milli_node_hours: float

The train budget for creating this model, expressed in milli node hours, i.e., a value of 1,000 means 1 node hour.

time_column: str

The column that indicates the time.

time_series_identifier_columns: list[str]

The columns which distinguish different time series.

time_series_identifier_column: str | None = None

[Deprecated] The column which distinguishes different time series.

time_series_attribute_columns: list[str] | None = None

The columns that are invariant across the same time series.

available_at_forecast_columns: list[str] | None = None

The columns that are available at the forecast time.

unavailable_at_forecast_columns: list[str] | None = None

The columns that are unavailable at the forecast time.

forecast_horizon: int | None = None

The length of the horizon.

context_window: int | None = None

The length of the context window.

evaluated_examples_bigquery_path: str | None = None

The BigQuery dataset to write the predicted examples into for evaluation, in the format bq://project.dataset.

window_predefined_column: str | None = None

The column that indicates the start of each window.

window_stride_length: int | None = None

The stride length used to generate windows.

window_max_count: int | None = None

The maximum number of windows that will be generated.

holiday_regions: list[str] | None = None

The geographical regions where the holiday effect is applied in modeling.

stage_1_num_parallel_trials: int | None = None

Number of parallel trials for stage 1.

stage_1_tuning_result_artifact_uri: str | None = None

The stage 1 tuning result artifact GCS URI.

stage_2_num_parallel_trials: int | None = None

Number of parallel trials for stage 2.

num_selected_trials: int | None = None

Number of selected trials.

data_source_csv_filenames: str | None = None

A string that represents a comma-separated list of CSV filenames.

data_source_bigquery_table_path: str | None = None

The BigQuery table path in the format bq://bq_project.bq_dataset.bq_table.

predefined_split_key: str | None = None

The predefined_split column name.

training_fraction: float | None = None

The training fraction.

validation_fraction: float | None = None

The validation fraction.

test_fraction: float | None = None

The test fraction.

weight_column: str | None = None

The weight column name.

dataflow_service_account: str | None = None

The full service account name.

dataflow_subnetwork: str | None = None

The dataflow subnetwork.

dataflow_use_public_ips: bool = True

True to enable dataflow public IPs.

feature_transform_engine_bigquery_staging_full_dataset_id: str = ''

The full ID of the feature transform engine staging dataset.

feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16'

The dataflow machine type of the feature transform engine.

feature_transform_engine_dataflow_max_num_workers: int = 10

The max number of dataflow workers of the feature transform engine.

feature_transform_engine_dataflow_disk_size_gb: int = 40

The disk size of the dataflow workers of the feature transform engine.

evaluation_batch_predict_machine_type: str = 'n1-standard-16'

Machine type for the batch prediction job in evaluation, such as 'n1-standard-16'.

evaluation_batch_predict_starting_replica_count: int = 25

Number of replicas to use in the batch prediction cluster at startup time.

evaluation_batch_predict_max_replica_count: int = 25

The maximum count of replicas the batch prediction job can scale to.

evaluation_dataflow_machine_type: str = 'n1-standard-16'

Machine type for the dataflow job in evaluation, such as 'n1-standard-16'.

evaluation_dataflow_max_num_workers: int = 25

Maximum number of dataflow workers.

evaluation_dataflow_starting_num_workers: int = 22

Starting number of dataflow workers.

evaluation_dataflow_disk_size_gb: int = 50

The disk space in GB for dataflow.

study_spec_parameters_override: list[dict[str, Any]] | None = None

The list for overriding study spec.

stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 1 tuner worker pool spec.

stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 2 trainer worker pool spec.

encryption_spec_key_name: str | None = None

The KMS key name.

model_display_name: str | None = None

Optional display name for model.

model_description: str | None = None

Optional description.

run_evaluation: bool = True

True to evaluate the ensembled model on the test split.

Returns:

Tuple of pipeline_definition_path and parameter_values.
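The returned tuple is submitted exactly as in the learn-to-learn sketch above; only the factory function and, in this sketch, the data source differ (all names hypothetical):

    from google_cloud_pipeline_components.preview.automl.forecasting import (
        get_sequence_to_sequence_forecasting_pipeline_and_parameters,
    )

    # CSV sources are passed as a single comma-separated string of GCS paths.
    pipeline_path, parameter_values = (
        get_sequence_to_sequence_forecasting_pipeline_and_parameters(
            project='my-project',
            location='us-central1',
            root_dir='gs://my-bucket/pipeline_root',
            target_column='sales',
            optimization_objective='minimize-mae',
            transformations={'auto': ['sales', 'date', 'store_id']},
            train_budget_milli_node_hours=1000,
            time_column='date',
            time_series_identifier_columns=['store_id'],
            data_source_csv_filenames=(
                'gs://my-bucket/train_1.csv,gs://my-bucket/train_2.csv'),
            run_evaluation=False,  # skip evaluation to keep the sketch minimal
        )
    )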

preview.automl.forecasting.get_temporal_fusion_transformer_forecasting_pipeline_and_parameters(*, project: str, location: str, root_dir: str, target_column: str, optimization_objective: str, transformations: dict[str, list[str]], train_budget_milli_node_hours: float, time_column: str, time_series_identifier_columns: list[str], time_series_identifier_column: str | None = None, time_series_attribute_columns: list[str] | None = None, available_at_forecast_columns: list[str] | None = None, unavailable_at_forecast_columns: list[str] | None = None, forecast_horizon: int | None = None, context_window: int | None = None, evaluated_examples_bigquery_path: str | None = None, window_predefined_column: str | None = None, window_stride_length: int | None = None, window_max_count: int | None = None, holiday_regions: list[str] | None = None, stage_1_num_parallel_trials: int | None = None, stage_1_tuning_result_artifact_uri: str | None = None, stage_2_num_parallel_trials: int | None = None, data_source_csv_filenames: str | None = None, data_source_bigquery_table_path: str | None = None, predefined_split_key: str | None = None, training_fraction: float | None = None, validation_fraction: float | None = None, test_fraction: float | None = None, weight_column: str | None = None, dataflow_service_account: str | None = None, dataflow_subnetwork: str | None = None, dataflow_use_public_ips: bool = True, feature_transform_engine_bigquery_staging_full_dataset_id: str = '', feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16', feature_transform_engine_dataflow_max_num_workers: int = 10, feature_transform_engine_dataflow_disk_size_gb: int = 40, evaluation_batch_predict_machine_type: str = 'n1-standard-16', evaluation_batch_predict_starting_replica_count: int = 25, evaluation_batch_predict_max_replica_count: int = 25, evaluation_dataflow_machine_type: str = 'n1-standard-16', evaluation_dataflow_max_num_workers: int = 25, evaluation_dataflow_starting_num_workers: int = 22, evaluation_dataflow_disk_size_gb: int = 50, study_spec_parameters_override: list[dict[str, Any]] | None = None, stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None, stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None, encryption_spec_key_name: str | None = None, model_display_name: str | None = None, model_description: str | None = None, run_evaluation: bool = True)

Returns tft_forecasting pipeline and formatted parameters.

Parameters:
project: str

The GCP project that runs the pipeline components.

location: str

The GCP region that runs the pipeline components.

root_dir: str

The root GCS directory for the pipeline components.

target_column: str

The target column name.

optimization_objective: str

"minimize-rmse", "minimize-mae", "minimize-rmsle", "minimize-rmspe", "minimize-wape-mae", "minimize-mape", or "minimize-quantile-loss".

transformations: dict[str, list[str]]

Dict mapping auto and/or type-resolutions to feature columns. The supported types are: auto, categorical, numeric, text, and timestamp.

train_budget_milli_node_hours: float

The train budget for creating this model, expressed in milli node hours, i.e., a value of 1,000 means 1 node hour.

time_column: str

The column that indicates the time.

time_series_identifier_columns: list[str]

The columns which distinguish different time series.

time_series_identifier_column: str | None = None

[Deprecated] The column which distinguishes different time series.

time_series_attribute_columns: list[str] | None = None

The columns that are invariant across the same time series.

available_at_forecast_columns: list[str] | None = None

The columns that are available at the forecast time.

unavailable_at_forecast_columns: list[str] | None = None

The columns that are unavailable at the forecast time.

forecast_horizon: int | None = None

The length of the horizon.

context_window: int | None = None

The length of the context window.

evaluated_examples_bigquery_path: str | None = None

The BigQuery dataset to write the predicted examples into for evaluation, in the format bq://project.dataset.

window_predefined_column: str | None = None

The column that indicates the start of each window.

window_stride_length: int | None = None

The stride length used to generate windows.

window_max_count: int | None = None

The maximum number of windows that will be generated.

holiday_regions: list[str] | None = None

The geographical regions where the holiday effect is applied in modeling.

stage_1_num_parallel_trials: int | None = None

Number of parallel trials for stage 1.

stage_1_tuning_result_artifact_uri: str | None = None

The stage 1 tuning result artifact GCS URI.

stage_2_num_parallel_trials: int | None = None

Number of parallel trials for stage 2.

data_source_csv_filenames: str | None = None

A string that represents a comma-separated list of CSV filenames.

data_source_bigquery_table_path: str | None = None

The BigQuery table path in the format bq://bq_project.bq_dataset.bq_table.

predefined_split_key: str | None = None

The predefined_split column name.

training_fraction: float | None = None

The training fraction.

validation_fraction: float | None = None

The validation fraction.

test_fraction: float | None = None

The test fraction.

weight_column: str | None = None

The weight column name.

dataflow_service_account: str | None = None

The full service account name.

dataflow_subnetwork: str | None = None

The dataflow subnetwork.

dataflow_use_public_ips: bool = True

True to enable dataflow public IPs.

feature_transform_engine_bigquery_staging_full_dataset_id: str = ''

The full ID of the feature transform engine staging dataset.

feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16'

The dataflow machine type of the feature transform engine.

feature_transform_engine_dataflow_max_num_workers: int = 10

The max number of dataflow workers of the feature transform engine.

feature_transform_engine_dataflow_disk_size_gb: int = 40

The disk size of the dataflow workers of the feature transform engine.

evaluation_batch_predict_machine_type: str = 'n1-standard-16'

Machine type for the batch prediction job in evaluation, such as 'n1-standard-16'.

evaluation_batch_predict_starting_replica_count: int = 25

Number of replicas to use in the batch prediction cluster at startup time.

evaluation_batch_predict_max_replica_count: int = 25

The maximum count of replicas the batch prediction job can scale to.

evaluation_dataflow_machine_type: str = 'n1-standard-16'

Machine type for the dataflow job in evaluation, such as 'n1-standard-16'.

evaluation_dataflow_max_num_workers: int = 25

Maximum number of dataflow workers.

evaluation_dataflow_starting_num_workers: int = 22

Starting number of dataflow workers.

evaluation_dataflow_disk_size_gb: int = 50

The disk space in GB for dataflow.

study_spec_parameters_override: list[dict[str, Any]] | None = None

The list for overriding study spec.

stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 1 tuner worker pool spec.

stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 2 trainer worker pool spec.

encryption_spec_key_name: str | None = None

The KMS key name.

model_display_name: str | None = None

Optional display name for model.

model_description: str | None = None

Optional description.

run_evaluation: bool = True

True to evaluate the ensembled model on the test split.

Returns:

Tuple of pipeline_definition_path and parameter_values.
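Usage mirrors the sketches above; note that this factory takes no num_selected_trials argument. A hedged example with hypothetical project, bucket, and column names (the holiday region code is an assumption):

    from google_cloud_pipeline_components.preview.automl.forecasting import (
        get_temporal_fusion_transformer_forecasting_pipeline_and_parameters,
    )

    pipeline_path, parameter_values = (
        get_temporal_fusion_transformer_forecasting_pipeline_and_parameters(
            project='my-project',
            location='us-central1',
            root_dir='gs://my-bucket/pipeline_root',
            target_column='sales',
            optimization_objective='minimize-rmse',
            transformations={'auto': ['sales', 'date', 'store_id']},
            train_budget_milli_node_hours=1000,
            time_column='date',
            time_series_identifier_columns=['store_id'],
            forecast_horizon=30,
            context_window=30,
            holiday_regions=['US'],  # assumed region code
            data_source_bigquery_table_path='bq://my-project.my_dataset.sales',
            run_evaluation=False,  # skip evaluation to keep the sketch minimal
        )
    )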

preview.automl.forecasting.get_time_series_dense_encoder_forecasting_pipeline_and_parameters(*, project: str, location: str, root_dir: str, target_column: str, optimization_objective: str, transformations: dict[str, list[str]], train_budget_milli_node_hours: float, time_column: str, time_series_identifier_columns: list[str], time_series_identifier_column: str | None = None, time_series_attribute_columns: list[str] | None = None, available_at_forecast_columns: list[str] | None = None, unavailable_at_forecast_columns: list[str] | None = None, forecast_horizon: int | None = None, context_window: int | None = None, evaluated_examples_bigquery_path: str | None = None, window_predefined_column: str | None = None, window_stride_length: int | None = None, window_max_count: int | None = None, holiday_regions: list[str] | None = None, stage_1_num_parallel_trials: int | None = None, stage_1_tuning_result_artifact_uri: str | None = None, stage_2_num_parallel_trials: int | None = None, num_selected_trials: int | None = None, data_source_csv_filenames: str | None = None, data_source_bigquery_table_path: str | None = None, predefined_split_key: str | None = None, training_fraction: float | None = None, validation_fraction: float | None = None, test_fraction: float | None = None, weight_column: str | None = None, dataflow_service_account: str | None = None, dataflow_subnetwork: str | None = None, dataflow_use_public_ips: bool = True, feature_transform_engine_bigquery_staging_full_dataset_id: str = '', feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16', feature_transform_engine_dataflow_max_num_workers: int = 10, feature_transform_engine_dataflow_disk_size_gb: int = 40, evaluation_batch_predict_machine_type: str = 'n1-standard-16', evaluation_batch_predict_starting_replica_count: int = 25, evaluation_batch_predict_max_replica_count: int = 25, evaluation_dataflow_machine_type: str = 'n1-standard-16', evaluation_dataflow_max_num_workers: int = 25, evaluation_dataflow_starting_num_workers: int = 22, evaluation_dataflow_disk_size_gb: int = 50, study_spec_parameters_override: list[dict[str, Any]] | None = None, stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None, stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None, enable_probabilistic_inference: bool = False, quantiles: list[float] | None = None, encryption_spec_key_name: str | None = None, model_display_name: str | None = None, model_description: str | None = None, run_evaluation: bool = True, group_columns: list[str] | None = None, group_total_weight: float = 0.0, temporal_total_weight: float = 0.0, group_temporal_total_weight: float = 0.0) → tuple[str, dict[str, Any]]

Returns timeseries_dense_encoder_forecasting pipeline and parameters.

Parameters:
project: str

The GCP project that runs the pipeline components.

location: str

The GCP region that runs the pipeline components.

root_dir: str

The root GCS directory for the pipeline components.

target_column: str

The target column name.

optimization_objective: str

"minimize-rmse", "minimize-mae", "minimize-rmsle", "minimize-rmspe", "minimize-wape-mae", "minimize-mape", or "minimize-quantile-loss".

transformations: dict[str, list[str]]

Dict mapping auto and/or type-resolutions to feature columns. The supported types are: auto, categorical, numeric, text, and timestamp.

train_budget_milli_node_hours: float

The train budget for creating this model, expressed in milli node hours, i.e., a value of 1,000 means 1 node hour.

time_column: str

The column that indicates the time.

time_series_identifier_columns: list[str]

The columns which distinguish different time series.

time_series_identifier_column: str | None = None

[Deprecated] The column which distinguishes different time series.

time_series_attribute_columns: list[str] | None = None

The columns that are invariant across the same time series.

available_at_forecast_columns: list[str] | None = None

The columns that are available at the forecast time.

unavailable_at_forecast_columns: list[str] | None = None

The columns that are unavailable at the forecast time.

forecast_horizon: int | None = None

The length of the horizon.

context_window: int | None = None

The length of the context window.

evaluated_examples_bigquery_path: str | None = None

The BigQuery dataset to write the predicted examples into for evaluation, in the format bq://project.dataset.

window_predefined_column: str | None = None

The column that indicates the start of each window.

window_stride_length: int | None = None

The stride length used to generate windows.

window_max_count: int | None = None

The maximum number of windows that will be generated.

holiday_regions: list[str] | None = None

The geographical regions where the holiday effect is applied in modeling.

stage_1_num_parallel_trials: int | None = None

Number of parallel trials for stage 1.

stage_1_tuning_result_artifact_uri: str | None = None

The stage 1 tuning result artifact GCS URI.

stage_2_num_parallel_trials: int | None = None

Number of parallel trials for stage 2.

num_selected_trials: int | None = None

Number of selected trials.

data_source_csv_filenames: str | None = None

A string that represents a comma-separated list of CSV filenames.

data_source_bigquery_table_path: str | None = None

The BigQuery table path in the format bq://bq_project.bq_dataset.bq_table.

predefined_split_key: str | None = None

The predefined_split column name.

training_fraction: float | None = None

The training fraction.

validation_fraction: float | None = None

The validation fraction.

test_fraction: float | None = None

The test fraction.

weight_column: str | None = None

The weight column name.

dataflow_service_account: str | None = None

The full service account name.

dataflow_subnetwork: str | None = None

The dataflow subnetwork.

dataflow_use_public_ips: bool = True

True to enable dataflow public IPs.

feature_transform_engine_bigquery_staging_full_dataset_id: str = ''

The full ID of the feature transform engine staging dataset.

feature_transform_engine_dataflow_machine_type: str = 'n1-standard-16'

The dataflow machine type of the feature transform engine.

feature_transform_engine_dataflow_max_num_workers: int = 10

The max number of dataflow workers of the feature transform engine.

feature_transform_engine_dataflow_disk_size_gb: int = 40

The disk size of the dataflow workers of the feature transform engine.

evaluation_batch_predict_machine_type: str = 'n1-standard-16'

Machine type for the batch prediction job in evaluation, such as 'n1-standard-16'.

evaluation_batch_predict_starting_replica_count: int = 25

Number of replicas to use in the batch prediction cluster at startup time.

evaluation_batch_predict_max_replica_count: int = 25

The maximum count of replicas the batch prediction job can scale to.

evaluation_dataflow_machine_type: str = 'n1-standard-16'

Machine type for the dataflow job in evaluation, such as 'n1-standard-16'.

evaluation_dataflow_max_num_workers: int = 25

Maximum number of dataflow workers.

evaluation_dataflow_starting_num_workers: int = 22

Starting number of dataflow workers.

evaluation_dataflow_disk_size_gb: int = 50

The disk space in GB for dataflow.

study_spec_parameters_override: list[dict[str, Any]] | None = None

The list for overriding study spec.

stage_1_tuner_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 1 tuner worker pool spec.

stage_2_trainer_worker_pool_specs_override: dict[str, Any] | None = None

The dictionary for overriding stage 2 trainer worker pool spec.

enable_probabilistic_inference: bool = False

If probabilistic inference is enabled, the model will fit a distribution that captures the uncertainty of a prediction. If quantiles are specified, then the quantiles of the distribution are also returned.

quantiles: list[float] | None = None

Quantiles to use for probabilistic inference. Up to 5 unique quantiles are allowed, each with a value strictly between 0 and 1.

encryption_spec_key_name: str | None = None

The KMS key name.

model_display_name: str | None = None

Optional display name for model.

model_description: str | None = None

Optional description.

run_evaluation: bool = True

True to evaluate the ensembled model on the test split.

group_columns: list[str] | None = None

A list of time series attribute column names that define the time series hierarchy.

group_total_weight: float = 0.0

The weight of the loss for predictions aggregated over time series in the same group.

temporal_total_weight: float = 0.0

The weight of the loss for predictions aggregated over the horizon for a single time series.

group_temporal_total_weight: float = 0.0

The weight of the loss for predictions aggregated over both the horizon and time series in the same hierarchy group.

Returns:

Tuple of pipeline_definition_path and parameter_values.
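As a final sketch, the time series dense encoder factory with probabilistic inference enabled; the quantile values are illustrative and all resource names are hypothetical placeholders:

    from google_cloud_pipeline_components.preview.automl.forecasting import (
        get_time_series_dense_encoder_forecasting_pipeline_and_parameters,
    )

    pipeline_path, parameter_values = (
        get_time_series_dense_encoder_forecasting_pipeline_and_parameters(
            project='my-project',
            location='us-central1',
            root_dir='gs://my-bucket/pipeline_root',
            target_column='sales',
            optimization_objective='minimize-quantile-loss',
            transformations={'auto': ['sales', 'date', 'store_id']},
            train_budget_milli_node_hours=1000,
            time_column='date',
            time_series_identifier_columns=['store_id'],
            enable_probabilistic_inference=True,
            quantiles=[0.1, 0.5, 0.9],  # up to 5 unique values in (0, 1)
            data_source_bigquery_table_path='bq://my-project.my_dataset.sales',
            run_evaluation=False,  # skip evaluation to keep the sketch minimal
        )
    )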