AutoML Forecasting

GA AutoML forecasting components.

Components:

ProphetTrainerOp(project, location, ...[, ...])

Trains and tunes one Prophet model per time series using Dataflow.

v1.automl.forecasting.ProphetTrainerOp(project: str, location: str, root_dir: str, target_column: str, time_column: str, time_series_identifier_column: str, forecast_horizon: int, window_column: str, data_granularity_unit: str, predefined_split_column: str, source_bigquery_uri: str, gcp_resources: dsl.OutputPath(str), unmanaged_container_model: dsl.Output[google.UnmanagedContainerModel], evaluated_examples_directory: dsl.Output[system.Artifact], optimization_objective: str | None = 'rmse', max_num_trials: int | None = 6, encryption_spec_key_name: str | None = '', dataflow_max_num_workers: int | None = 10, dataflow_machine_type: str | None = 'n1-standard-1', dataflow_disk_size_gb: int | None = 40, dataflow_service_account: str | None = '', dataflow_subnetwork: str | None = '', dataflow_use_public_ips: bool | None = True)

Trains and tunes one Prophet model per time series using Dataflow.

Parameters
project: str

The GCP project that runs the pipeline components.

location: str

The GCP region for Vertex AI.

root_dir: str

The Cloud Storage location to store the output.

time_column: str

Name of the column that identifies time order in the time series.

time_series_identifier_column: str

Name of the column that identifies the time series.

target_column: str

Name of the column that the model is to predict values for.

forecast_horizon: int

The number of time periods into the future for which forecasts will be created. Future periods start after the latest timestamp for each time series.

optimization_objective: str | None = 'rmse'

Optimization objective for tuning. Supported metrics come from Prophet’s performance_metrics function. These are mse, rmse, mae, mape, mdape, smape, and coverage.

data_granularity_unit: str

String representing the units of time for the time column.

predefined_split_column: str

The predefined_split column name. A string that represents a list of comma-separated CSV filenames.

source_bigquery_uri: str

The BigQuery table path, in the format bq://bq_project.bq_dataset.bq_table.

window_column: str

Name of the column that should be used to filter input rows. The column should contain either booleans or string booleans; if the value of the row is True, a sliding window is generated from that row.

max_num_trials: int | None = 6

Maximum number of tuning trials to perform per time series. There are up to 100 possible combinations to explore for each time series. Recommended values to try are 3, 6, and 24.

encryption_spec_key_name: str | None = ''

Customer-managed encryption key.

dataflow_machine_type: str | None = 'n1-standard-1'

The Dataflow machine type used for training.

dataflow_max_num_workers: int | None = 10

The maximum number of Dataflow workers used for training.

dataflow_disk_size_gb: int | None = 40

Dataflow worker’s disk size in GB during training.

dataflow_service_account: str | None = ''

Custom service account to run Dataflow jobs.

dataflow_subnetwork: str | None = ''

Dataflow’s fully qualified subnetwork name. When empty, the default subnetwork is used.

dataflow_use_public_ips: bool | None = True

Specifies whether Dataflow workers use public IP addresses.
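The constraints stated in the parameter descriptions can be checked before a pipeline is compiled. The following is a minimal sketch, not part of the component: `build_prophet_trainer_args` is a hypothetical helper, every value in the usage call is a placeholder, and the positivity checks on `forecast_horizon` and `max_num_trials` are assumptions rather than documented rules. Only the metric list and the `bq://` path format come from the descriptions above.

```python
# Tuning metrics listed in the optimization_objective description.
SUPPORTED_OBJECTIVES = {"mse", "rmse", "mae", "mape", "mdape", "smape", "coverage"}


def build_prophet_trainer_args(
    project,
    location,
    root_dir,
    source_bigquery_uri,
    target_column,
    time_column,
    time_series_identifier_column,
    forecast_horizon,
    window_column,
    data_granularity_unit,
    predefined_split_column,
    optimization_objective="rmse",
    max_num_trials=6,
):
    """Hypothetical helper: validate inputs and return keyword arguments."""
    if optimization_objective not in SUPPORTED_OBJECTIVES:
        raise ValueError(
            f"optimization_objective must be one of {sorted(SUPPORTED_OBJECTIVES)}"
        )
    # Documented format: bq://bq_project.bq_dataset.bq_table
    prefix = "bq://"
    if not source_bigquery_uri.startswith(prefix):
        raise ValueError("source_bigquery_uri must start with 'bq://'")
    if len(source_bigquery_uri[len(prefix):].split(".")) != 3:
        raise ValueError("source_bigquery_uri must name project.dataset.table")
    # Assumption: both counts must be positive integers.
    if forecast_horizon < 1 or max_num_trials < 1:
        raise ValueError("forecast_horizon and max_num_trials must be >= 1")
    return {
        "project": project,
        "location": location,
        "root_dir": root_dir,
        "source_bigquery_uri": source_bigquery_uri,
        "target_column": target_column,
        "time_column": time_column,
        "time_series_identifier_column": time_series_identifier_column,
        "forecast_horizon": forecast_horizon,
        "window_column": window_column,
        "data_granularity_unit": data_granularity_unit,
        "predefined_split_column": predefined_split_column,
        "optimization_objective": optimization_objective,
        "max_num_trials": max_num_trials,
    }


# Placeholder values for illustration only.
trainer_args = build_prophet_trainer_args(
    project="my-project",
    location="us-central1",
    root_dir="gs://my-bucket/prophet",
    source_bigquery_uri="bq://my-project.my_dataset.sales",
    target_column="units_sold",
    time_column="date",
    time_series_identifier_column="store_id",
    forecast_horizon=30,
    window_column="use_for_window",
    data_granularity_unit="day",  # assumed value; accepted units are not listed here
    predefined_split_column="split",
)
```

Inside a pipeline function, the resulting dictionary would typically be unpacked into the component call, for example `ProphetTrainerOp(**trainer_args)`. The output parameters shown in the signature (`gcp_resources`, `unmanaged_container_model`, `evaluated_examples_directory`) are not passed by the caller.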

Returns

gcp_resources: dsl.OutputPath(str)

Serialized gcp_resources proto tracking the custom training job.

unmanaged_container_model: dsl.Output[google.UnmanagedContainerModel]

The UnmanagedContainerModel artifact.
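A downstream step may want to read the job reference out of the `gcp_resources` output. The sketch below assumes the serialized proto is JSON with a top-level `resources` list whose entries carry `resourceType` and `resourceUri` fields. That layout is an assumption not stated on this page, and `custom_job_uris` and the sample payload are hypothetical.

```python
import json


def custom_job_uris(gcp_resources_text):
    """Return the resourceUri of each entry in a serialized gcp_resources payload.

    Assumes a JSON layout of {"resources": [{"resourceType": ..., "resourceUri": ...}]}.
    """
    payload = json.loads(gcp_resources_text)
    return [
        entry["resourceUri"]
        for entry in payload.get("resources", [])
        if "resourceUri" in entry
    ]


# Hypothetical payload for illustration; the URI is a placeholder.
sample = json.dumps(
    {
        "resources": [
            {
                "resourceType": "CustomJob",
                "resourceUri": "https://example.invalid/v1/projects/p/locations/l/customJobs/123",
            }
        ]
    }
)
uris = custom_job_uris(sample)
```

Entries without a `resourceUri` field are skipped rather than raising, so the helper tolerates a payload with no resources.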