BigQuery ML¶
Create and execute machine learning models via SQL using Google Cloud BigQuery ML.
Components:
- Launches a BigQuery create model job and waits for it to finish.
- Launches a BigQuery detect anomalies model job and waits for it to finish.
- Launches a BigQuery drop model job and waits for it to finish.
- Launches a BigQuery evaluate model job and waits for it to finish.
- Launches a BigQuery ML.EXPLAIN_FORECAST job that lets you explain the forecast of an ARIMA_PLUS or ARIMA model.
- Launches a BigQuery explain predict model job and waits for it to finish.
- Launches a BigQuery export model job and waits for it to finish.
- Launches a BigQuery ML.FORECAST job that lets you forecast with an ARIMA_PLUS or ARIMA model.
- Launches a BigQuery ML advanced weights job and waits for it to finish.
- Launches a BigQuery ML.ARIMA_COEFFICIENTS job that lets you see the ARIMA coefficients.
- Launches a BigQuery ML.ARIMA_EVALUATE job and waits for it to finish.
- Launches a BigQuery ML.CENTROIDS job and waits for it to finish.
- Launches a BigQuery confusion matrix job and waits for it to finish.
- Launches a BigQuery feature importance fetching job and waits for it to finish.
- Launches a BigQuery feature info job and waits for it to finish.
- Launches a BigQuery global explain fetching job and waits for it to finish.
- Launches a BigQuery ML.PRINCIPAL_COMPONENT_INFO job and waits for it to finish.
- Launches a BigQuery ML.PRINCIPAL_COMPONENTS job and waits for it to finish.
- Launches a BigQuery ML.RECOMMEND job and waits for it to finish.
- Launches a BigQuery ML reconstruction loss job and waits for it to finish.
- Launches a BigQuery ROC curve job and waits for it to finish.
- Launches a BigQuery ML training info fetching job and waits for it to finish.
- Launches a BigQuery ML trial info job and waits for it to finish.
- Launches a BigQuery ML weights job and waits for it to finish.
- Launches a BigQuery predict model job and waits for it to finish.
- Launches a BigQuery query job and waits for it to finish.
-
v1.bigquery.BigqueryCreateModelJobOp(query: str, model: dsl.Output[google.BQMLModel], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery create model job and waits for it to finish.
- Parameters¶:
- location: str = 'us-central1'¶ Location of the job to create the BigQuery model. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- query: str¶ SQL query text to execute. Only standard SQL is supported. If the query is specified both here and in job_configuration_query, the value here overrides the other.
- query_parameters: list[str] = []¶ Query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery model creation job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
model: dsl.Output[google.BQMLModel]
Describes the model which is created.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
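As a hedged illustration (the project, dataset, and column names below are hypothetical), the query argument typically carries a CREATE MODEL statement in standard SQL:

```python
# Hypothetical CREATE MODEL statement to pass as BigqueryCreateModelJobOp's
# `query` argument. The target dataset and model name live inside the SQL
# itself; the component only submits the query and waits for the job.
create_model_query = """
CREATE OR REPLACE MODEL `my-project.my_dataset.penguin_weight_model`
OPTIONS (
  model_type = 'linear_reg',
  input_label_cols = ['body_mass_g']
) AS
SELECT culmen_length_mm, flipper_length_mm, body_mass_g
FROM `my-project.my_dataset.penguins`
WHERE body_mass_g IS NOT NULL
"""

# Labels must be lowercase, start with a letter, and stay within 63 characters.
labels = {"team": "forecasting", "env": "dev"}
assert all(k[0].isalpha() and len(k) <= 63 for k in labels)
```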
-
v1.bigquery.BigqueryDetectAnomaliesModelJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', table_name: str = '', query_statement: str = '', contamination: float = -1.0, anomaly_prob_threshold: float = 0.95, query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery detect anomalies model job and waits for it to finish.
- Parameters¶:
- location: str = 'us-central1'¶ Location to run the BigQuery model prediction job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model for prediction. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-detect-anomalies#model_name
- table_name: str = ''¶ BigQuery table id of the input table that contains the data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-detect-anomalies#table_name
- query_statement: str = ''¶ Query statement string used to generate the data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-detect-anomalies#query_statement
- contamination: float = -1.0¶ Contamination is the proportion of anomalies in the training dataset that are used to create the AUTOENCODER, KMEANS, or PCA input models. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-detect-anomalies#contamination
- anomaly_prob_threshold: float = 0.95¶ The ARIMA_PLUS model supports the anomaly_prob_threshold custom threshold for anomaly detection. The anomaly probability at each timestamp is calculated from the actual time-series value, the predicted time-series values, and the variance from the model training. The actual time-series value at a specific timestamp is identified as anomalous if the anomaly probability exceeds the anomaly_prob_threshold value. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-detect-anomalies#anomaly_prob_threshold
- query_parameters: list[str] = []¶ Query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str = ''¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery model prediction job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the model prediction results should be stored. This property must be set for large results that exceed the maximum response size. For queries that produce anonymous (cached) results, this field will be populated by BigQuery.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
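To make the input conventions concrete, here is a small, hypothetical sketch of the arguments: the empty-string and -1.0 defaults mean "not set", and exactly one of table_name or query_statement should identify the input data:

```python
# Hypothetical arguments for BigqueryDetectAnomaliesModelJobOp.
table_name = "my-project.my_dataset.sensor_readings"  # input data as a table id
query_statement = ""  # left at its '' default because table_name is set

# For ARIMA_PLUS models: flag points whose anomaly probability exceeds this.
anomaly_prob_threshold = 0.99

assert bool(table_name) != bool(query_statement), "set exactly one input source"
assert 0.0 < anomaly_prob_threshold < 1.0
```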
-
v1.bigquery.BigqueryDropModelJobOp(model: dsl.Input[google.BQMLModel], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery drop model job and waits for it to finish.
- Parameters¶:
- location: str = 'us-central1'¶ Location of the job to drop the BigQuery model. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model to drop.
- query_parameters: list[str] = []¶ Query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery model drop job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryEvaluateModelJobOp(model: dsl.Input[google.BQMLModel], evaluation_metrics: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', table_name: str = '', query_statement: str = '', threshold: float = -1.0, query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery evaluate model job and waits for it to finish.
- Parameters¶:
- location: str = 'us-central1'¶ Location to run the BigQuery model evaluation job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model for evaluation. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-evaluate#eval_model_name
- table_name: str = ''¶ BigQuery table id of the input table that contains the evaluation data, as in ML.EVALUATE(MODEL model_name[, {TABLE table_name | (query_statement)}]. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-evaluate#eval_table_name
- query_statement: str = ''¶ Query statement string used to generate the evaluation data, as in ML.EVALUATE(MODEL model_name[, {TABLE table_name | (query_statement)}]. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-evaluate#eval_query_statement
- threshold: float = -1.0¶ A custom threshold for the binary-class classification model to be used for evaluation. The default value is 0.5. The threshold value that is supplied must be of type STRUCT. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-evaluate#eval_threshold
- query_parameters: list[str] = []¶ jobs.query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str = ''¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery model evaluation job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
evaluation_metrics: dsl.Output[system.Artifact]
Describes the evaluation metrics applicable to the type of model supplied.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
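For example (model and dataset names hypothetical), evaluation data can be supplied as a query statement, with an optional custom classification threshold:

```python
# Hypothetical inputs for BigqueryEvaluateModelJobOp, mirroring
# ML.EVALUATE(MODEL model_name[, {TABLE table_name | (query_statement)}]).
query_statement = """
SELECT * FROM `my-project.my_dataset.eval_split`
"""

# Custom cutoff for a binary classifier; BigQuery's documented default is
# 0.5, and the component's -1.0 default means "use that".
threshold = 0.6
assert 0.0 <= threshold <= 1.0
```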
-
v1.bigquery.BigqueryExplainForecastModelJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', horizon: int = 3, confidence_level: float = 0.95, query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery ML.EXPLAIN_FORECAST job that lets you explain the forecast of an ARIMA_PLUS or ARIMA model.
This function only applies to the time-series ARIMA_PLUS and ARIMA models.
- Parameters¶:
- location: str = 'us-central1'¶ Location to run the BigQuery job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model for ML.EXPLAIN_FORECAST. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-forecast
- horizon: int = 3¶ Horizon is the number of time points to explain forecast. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-forecast#horizon
- confidence_level: float = 0.95¶ The percentage of the future values that fall in the prediction interval. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-forecast#confidence_level
- query_parameters: list[str] = []¶ Query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str = ''¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the model explain forecast results should be stored. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-forecast#mlexplain_forecast_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryExplainPredictModelJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', table_name: str = '', query_statement: str = '', top_k_features: int = -1, threshold: float = -1.0, num_integral_steps: int = -1, query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery explain predict model job and waits for it to finish.
- Parameters¶:
- location: str = 'us-central1'¶ Location to run the BigQuery model prediction job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model for explaining prediction. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-predict#model_name
- table_name: str = ''¶ BigQuery table id of the input table that contains the prediction data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-predict#table_name
- query_statement: str = ''¶ Query statement string used to generate the prediction data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-predict#query_statement
- top_k_features: int = -1¶ This argument specifies how many top feature attribution pairs are generated per row of input data. The features are ranked by the absolute values of their attributions. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-predict#top_k_features
- threshold: float = -1.0¶ A custom threshold for the binary logistic regression model used as the cutoff between two labels. Predictions above the threshold are treated as positive predictions; predictions below the threshold are negative predictions. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#threshold
- num_integral_steps: int = -1¶ This argument specifies the number of steps to sample between the example being explained and its baseline for approximating the integral in integrated gradients attribution methods. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-explain-predict#num_integral_steps
- query_parameters: list[str] = []¶ Query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str = ''¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery model prediction job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the model prediction results should be stored. This property must be set for large results that exceed the maximum response size. For queries that produce anonymous (cached) results, this field will be populated by BigQuery.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
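A brief, hypothetical sketch of the explanation settings (the -1 and -1.0 defaults mean "not set"; the values below are illustrative, not recommendations):

```python
# Hypothetical explanation settings for BigqueryExplainPredictModelJobOp.
top_k_features = 5       # keep the 5 largest-magnitude feature attributions per row
num_integral_steps = 50  # sampling steps for integrated-gradients attribution
threshold = 0.5          # binary logistic regression cutoff

assert top_k_features > 0 and num_integral_steps > 0
assert 0.0 <= threshold <= 1.0
```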
-
v1.bigquery.BigqueryExportModelJobOp(model: dsl.Input[google.BQMLModel], model_destination_path: str, exported_model_path: dsl.OutputPath(str), gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', job_configuration_extract: dict[str, str] = {}, labels: dict[str, str] = {}, project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery export model job and waits for it to finish.
- Parameters¶:
- location: str = 'us-central1'¶ Location of the job to export the BigQuery model. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model to export.
- model_destination_path: str¶ The Cloud Storage bucket to export the model to.
- job_configuration_extract: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery model export job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
exported_model_path: dsl.OutputPath(str)
The Cloud Storage path to which the model was exported.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
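For illustration, model_destination_path is a Cloud Storage URI (bucket and object names here are hypothetical):

```python
# Hypothetical Cloud Storage destination for BigqueryExportModelJobOp. The
# BigQuery service account needs write access to this bucket.
model_destination_path = "gs://my-bucket/exported_models/penguin_weight_model"
assert model_destination_path.startswith("gs://")
```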
-
v1.bigquery.BigqueryForecastModelJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', horizon: int = 3, confidence_level: float = 0.95, query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery ML.FORECAST job that lets you forecast with an ARIMA_PLUS or ARIMA model.
This function only applies to the time-series ARIMA_PLUS and ARIMA models.
- Parameters¶:
- location: str = 'us-central1'¶ Location to run the BigQuery job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model for ML.FORECAST. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-forecast
- horizon: int = 3¶ Horizon is the number of time points to forecast. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-forecast#horizon
- confidence_level: float = 0.95¶ The percentage of the future values that fall in the prediction interval. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-forecast#confidence_level
- query_parameters: list[str] = []¶ jobs.query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str = ''¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the model forecast results should be stored. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-forecast#mlforecast_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
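A minimal, hypothetical sketch of the forecast settings (values chosen for illustration only):

```python
# Hypothetical ML.FORECAST settings for BigqueryForecastModelJobOp: forecast
# the next 30 time points with a 90% prediction interval.
horizon = 30
confidence_level = 0.90

assert horizon >= 1
assert 0.0 <= confidence_level < 1.0
```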
-
v1.bigquery.BigqueryMLAdvancedWeightsJobOp(model: dsl.Input[google.BQMLModel], advanced_weights: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery ML advanced weights job and waits for it to finish.
- Parameters¶:
- location: str = 'us-central1'¶ Location of the job to create the BigQuery model. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model for the ML advanced weights job. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#predict_model_name
- query_parameters: list[str] = []¶ jobs.query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] = {}¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery ML advanced weights job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
advanced_weights: dsl.Output[system.Artifact]
Describes different output columns for different models. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-advanced-weights#mladvanced_weights_output.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryMLArimaCoefficientsJobOp(model: dsl.Input[google.BQMLModel], arima_coefficients: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', query_parameters: list[str] = [], job_configuration_query: dict[str, str] = {}, labels: dict[str, str] = {}, encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')¶ Launches a BigQuery ML.ARIMA_COEFFICIENTS job that lets you see the ARIMA coefficients.
This function only applies to the time-series ARIMA_PLUS and ARIMA models.
- Parameters¶:
- location: str = 'us-central1'¶ Location to run the BigQuery job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶ BigQuery ML model for ML.ARIMA_COEFFICIENTS. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-arima-coefficients
- query_parameters: list[str] = []¶ Query parameters for standard SQL queries. If query_parameters are specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] = {}¶ A JSON-formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- encryption_spec_key_name: str = ''¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str = '{{$.pipeline_google_cloud_project_id}}'¶ Project to run the BigQuery job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
arima_coefficients: dsl.Output[system.Artifact]
Describes the ARIMA coefficients of the model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-arima-coefficients#mlarima_coefficients_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
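Under the hood the component runs the documented BigQuery ML statement against the model. As a rough sketch only (the helper function below is hypothetical, not part of the component API), the SQL behind ML.ARIMA_COEFFICIENTS can be built like this:

```python
def arima_coefficients_sql(model_name: str) -> str:
    # ML.ARIMA_COEFFICIENTS takes only the model reference; it applies
    # to time-series ARIMA_PLUS and ARIMA models.
    return f"SELECT * FROM ML.ARIMA_COEFFICIENTS(MODEL `{model_name}`)"
```

For example, `arima_coefficients_sql("my_project.my_dataset.my_arima_model")` yields the statement the arima_coefficients output artifact is populated from.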
-
v1.bigquery.BigqueryMLArimaEvaluateJobOp(model: dsl.Input[google.BQMLModel], arima_evaluation_metrics: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, show_all_candidate_models: bool =False
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ML.ARIMA_EVALUATE job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery model evaluation job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for ML.ARIMA_EVALUATE. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-arima-evaluate#model_name
- show_all_candidate_models: bool =
False
¶ You can use show_all_candidate_models to show evaluation metrics or an error message for either all candidate models or for only the best model with the lowest AIC. The value is type BOOL and is part of the settings STRUCT. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-arima-evaluate#show_all_candidate_models
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here takes precedence.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery model evaluation job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
arima_evaluation_metrics: dsl.Output[system.Artifact]
Describes arima metrics. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-arima-evaluate#mlarima_evaluate_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
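The show_all_candidate_models flag is passed to ML.ARIMA_EVALUATE inside the settings STRUCT. A minimal, hypothetical sketch of the resulting statement (the helper name is illustrative, not part of the component API):

```python
def arima_evaluate_sql(model_name: str, show_all_candidate_models: bool = False) -> str:
    # show_all_candidate_models is a BOOL field of the settings STRUCT;
    # TRUE reports metrics for all candidate models, FALSE only the best one.
    flag = "TRUE" if show_all_candidate_models else "FALSE"
    return (
        f"SELECT * FROM ML.ARIMA_EVALUATE(MODEL `{model_name}`, "
        f"STRUCT({flag} AS show_all_candidate_models))"
    )
```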
-
v1.bigquery.BigqueryMLCentroidsJobOp(model: dsl.Input[google.BQMLModel], centroids: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, standardize: bool =False
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ML.CENTROIDS job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ML.CENTROIDS job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for ML.CENTROIDS. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-centroids#mlcentroids_syntax
- standardize: bool =
False
¶ Determines whether the centroid features should be standardized to assume that all features have a mean of zero and a standard deviation of one. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-centroids#mlcentroids_syntax
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here takes precedence.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ML.CENTROIDS job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
centroids: dsl.Output[system.Artifact]
Information about the centroids in a k-means model. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-centroids#mlcentroids_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryMLConfusionMatrixJobOp(model: dsl.Input[google.BQMLModel], confusion_matrix: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, table_name: str =''
, query_statement: str =''
, threshold: float =-1.0
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery confusion matrix job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery confusion matrix job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for confusion matrix. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-confusion#eval_model_name
- table_name: str =
''
¶ BigQuery table id of the input table that contains the evaluation data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-confusion#eval_table_name
- query_statement: str =
''
¶ Query statement string used to generate the evaluation data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-confusion#eval_query_statement
- threshold: float =
-1.0
¶ A custom threshold for your binary classification model used for evaluation. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-confusion#eval_threshold
- query_parameters: list[str] =
[]
¶ Query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery confusion matrix job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
confusion_matrix: dsl.Output[google.BQTable]
Describes common metrics applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-confusion#mlconfusion_matrix_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
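table_name and query_statement are alternative ways to supply the evaluation data, and the -1.0 default for threshold acts as an "unset" sentinel. A hypothetical sketch (helper name and sentinel handling are illustrative) of how such inputs map onto the documented ML.CONFUSION_MATRIX syntax:

```python
def confusion_matrix_sql(model_name: str,
                         table_name: str = "",
                         query_statement: str = "",
                         threshold: float = -1.0) -> str:
    # table_name and query_statement are mutually exclusive inputs;
    # a negative threshold is treated as "not set".
    if table_name and query_statement:
        raise ValueError("Specify either table_name or query_statement, not both.")
    parts = [f"MODEL `{model_name}`"]
    if table_name:
        parts.append(f"TABLE `{table_name}`")
    elif query_statement:
        parts.append(f"({query_statement})")
    if threshold >= 0:
        parts.append(f"STRUCT({threshold} AS threshold)")
    return f"SELECT * FROM ML.CONFUSION_MATRIX({', '.join(parts)})"
```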
-
v1.bigquery.BigqueryMLFeatureImportanceJobOp(model: dsl.Input[google.BQMLModel], feature_importance: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery feature importance fetching job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery feature importance job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for feature importance. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#predict_model_name
- query_parameters: list[str] =
[]
¶ Query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here takes precedence.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery feature importance job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
feature_importance: dsl.Output[system.Artifact]
Describes common metrics applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-importance
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
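The job_configuration_query argument mirrors the REST JobConfigurationQuery message. For example, to route a component's output to an explicit destination table (the project, dataset, and table ids below are placeholders):

```python
# A job_configuration_query value routing component output to a fixed
# destination table. Field names follow the REST JobConfigurationQuery
# message; the ids are placeholders.
job_configuration_query = {
    "destinationTable": {
        "projectId": "my-project",
        "datasetId": "my_dataset",
        "tableId": "feature_importance_results",
    },
    "writeDisposition": "WRITE_TRUNCATE",  # overwrite the table if it exists
}
```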
-
v1.bigquery.BigqueryMLFeatureInfoJobOp(model: dsl.Input[google.BQMLModel], feature_info: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery feature info job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery feature info job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for evaluation. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#predict_model_name
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery feature info job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
feature_info: dsl.Output[system.Artifact]
Describes common metrics applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-feature#mlfeature_info_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryMLGlobalExplainJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, class_level_explain: bool =False
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery global explain fetching job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery global explain job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for global explain. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#predict_model_name
- class_level_explain: bool =
False
¶ For classification models, if class_level_explain is set to TRUE then global feature importances are returned for each class. Otherwise, the global feature importance of the entire model is returned rather than that of each class. By default, class_level_explain is set to FALSE. This option only applies to classification models. Regression models only have model-level global feature importance.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery global explain job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the global explain results should be stored.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
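The class_level_explain option is passed to ML.GLOBAL_EXPLAIN inside a STRUCT. A minimal, hypothetical sketch of the resulting statement (the helper is illustrative, not part of the component API):

```python
def global_explain_sql(model_name: str, class_level_explain: bool = False) -> str:
    # For classification models, class_level_explain=True returns global
    # feature importances per class instead of one model-level ranking.
    flag = "TRUE" if class_level_explain else "FALSE"
    return (
        f"SELECT * FROM ML.GLOBAL_EXPLAIN(MODEL `{model_name}`, "
        f"STRUCT({flag} AS class_level_explain))"
    )
```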
-
v1.bigquery.BigqueryMLPrincipalComponentInfoJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ML.principal_component_info job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ML.principal_component_info job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for ML.principal_component_info. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-principal-component-info#mlprincipal_component_info_syntax
- query_parameters: list[str] =
[]
¶ Query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here takes precedence.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ML.principal_component_info job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table which stores common metrics applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-principal-component-info#mlprincipal_component_info_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryMLPrincipalComponentsJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ML.principal_components job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ML.principal_components job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for ML.principal_components. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-principal-components#mlprincipal_components_syntax
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here takes precedence.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ML.principal_components job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table which stores common metrics applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-principal-components#mlprincipal_components_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryMLRecommendJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, table_name: str =''
, query_statement: str =''
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ML.Recommend job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ML.Recommend job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for ML.Recommend. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-recommend#recommend_model_name
- table_name: str =
''
¶ BigQuery table id of the input table that contains the user and/or item data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-recommend#recommend_table_name
- query_statement: str =
''
¶ Query statement string used to generate the evaluation data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-recommend#recommend_query_statement
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here takes precedence.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ML.Recommend job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the recommendation results should be stored. This property must be set for large results that exceed the maximum response size. For queries that produce anonymous (cached) results, this field will be populated by BigQuery.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
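For ML.RECOMMEND the user/item input is optional: with neither table_name nor query_statement, predictions cover every user-item combination from training, and supplying one restricts the output. A hypothetical sketch (helper name is illustrative) of how the inputs map onto the documented syntax:

```python
def recommend_sql(model_name: str, table_name: str = "", query_statement: str = "") -> str:
    # table_name and query_statement are mutually exclusive; omitting both
    # asks ML.RECOMMEND for ratings over all user-item pairs.
    if table_name and query_statement:
        raise ValueError("Specify either table_name or query_statement, not both.")
    parts = [f"MODEL `{model_name}`"]
    if table_name:
        parts.append(f"TABLE `{table_name}`")
    elif query_statement:
        parts.append(f"({query_statement})")
    return f"SELECT * FROM ML.RECOMMEND({', '.join(parts)})"
```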
-
v1.bigquery.BigqueryMLReconstructionLossJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, table_name: str =''
, query_statement: str =''
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ml reconstruction loss job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ml reconstruction loss job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-reconstruction-loss#reconstruction_loss_model_name
- table_name: str =
''
¶ BigQuery table id of the input table that contains the input data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-reconstruction-loss#reconstruction_loss_table_name
- query_statement: str =
''
¶ Query statement string used to generate the input data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-reconstruction-loss#reconstruction_loss_query_statement
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here takes precedence.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ml reconstruction loss job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the ml reconstruction loss job results should be stored. This property must be set for large results that exceed the maximum response size. For queries that produce anonymous (cached) results, this field will be populated by BigQuery.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
-
v1.bigquery.BigqueryMLRocCurveJobOp(model: dsl.Input[google.BQMLModel], roc_curve: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, table_name: str =''
, query_statement: str =''
, thresholds: str =''
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery roc curve job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery roc curve job. If not set, defaults to the US multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location
- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for the BigQuery roc curve job. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-roc#roc_model_name
- table_name: str =
''
¶ BigQuery table id of the input table that contains the evaluation data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-roc#roc_table_name
- query_statement: str =
''
¶ Query statement string used to generate the evaluation data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-roc#roc_query_statement
- thresholds: str =
''
¶ Percentile values of the prediction output. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-roc#roc_thresholds
- query_parameters: list[str] =
[]
¶ Query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here takes precedence.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery roc curve job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
roc_curve: dsl.Output[google.BQTable]
Describes common metrics applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-roc#mlroc_curve_output
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
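Under the hood this component issues an ML.ROC_CURVE query against the supplied model. A minimal sketch of the equivalent SQL, assuming a hypothetical binary classifier `mydataset.mymodel` and evaluation table `mydataset.eval_data`:

```sql
-- Compute ROC curve points for a binary classification model,
-- sampling thresholds in 0.01 increments.
SELECT
  threshold,
  recall,
  false_positive_rate,
  true_positives,
  false_positives
FROM
  ML.ROC_CURVE(MODEL `mydataset.mymodel`,
               TABLE `mydataset.eval_data`,
               GENERATE_ARRAY(0, 1, 0.01))
```

The third argument corresponds to the component's thresholds parameter; omit it to let BigQuery ML pick thresholds from the prediction output percentiles.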
-
v1.bigquery.BigqueryMLTrainingInfoJobOp(model: dsl.Input[google.BQMLModel], ml_training_info: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ml training info fetching job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ml training info job. If not set, default to
US
multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model to fetch training info for.
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key. Example: { “name”: “wrench”, “mass”: “1.3kg”, “count”: “3” }.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ML training info job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
ml_training_info: dsl.Output[system.Artifact]
Describes the training info applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-train
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
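The component wraps an ML.TRAINING_INFO query. A minimal sketch of the equivalent SQL, assuming a hypothetical model `mydataset.mymodel`:

```sql
-- Inspect per-iteration training metrics (loss, evaluation loss,
-- and wall-clock duration) for a trained model.
SELECT
  iteration,
  loss,
  eval_loss,
  duration_ms
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
ORDER BY
  iteration
```

The exact output columns vary by model type; iterative models such as logistic regression report one row per training iteration.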
-
v1.bigquery.BigqueryMLTrialInfoJobOp(model: dsl.Input[google.BQMLModel], trial_info: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ml trial info job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ml trial info job. If not set, default to
US
multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-trial-info#predict_model_name
- query_parameters: list[str] =
[]
¶ Query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key. Example: { “name”: “wrench”, “mass”: “1.3kg”, “count”: “3” }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ml trial info job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
trial_info: dsl.Output[system.Artifact]
Describes the trial info applicable to the type of model supplied. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-trial-info
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
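The component wraps an ML.TRIAL_INFO query, which only applies to models trained with hyperparameter tuning. A minimal sketch of the equivalent SQL, assuming a hypothetical tuned model `mydataset.mymodel`:

```sql
-- List hyperparameter tuning trials for a model, surfacing the
-- optimal trial first.
SELECT
  trial_id,
  status,
  is_optimal
FROM
  ML.TRIAL_INFO(MODEL `mydataset.mymodel`)
ORDER BY
  is_optimal DESC, trial_id
```

Additional columns (the hyperparameter values and per-trial evaluation metrics) are also returned; see the ML.TRIAL_INFO reference linked above for the full schema.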
-
v1.bigquery.BigqueryMLWeightsJobOp(model: dsl.Input[google.BQMLModel], weights: dsl.Output[system.Artifact], gcp_resources: dsl.OutputPath(str), location: str =
'us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery ml weights job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery ml weights job. If not set, default to
US
multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model to fetch weights for.
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key. Example: { “name”: “wrench”, “mass”: “1.3kg”, “count”: “3” }.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery ml weights job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
weights: dsl.Output[system.Artifact]
Describes different output columns for different models. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-weights#mlweights_output.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
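The component wraps an ML.WEIGHTS query. A minimal sketch of the equivalent SQL, assuming a hypothetical linear regression model `mydataset.mymodel`:

```sql
-- Retrieve the learned coefficient for each input feature of a
-- linear (or logistic) regression model.
SELECT
  processed_input,
  weight
FROM
  ML.WEIGHTS(MODEL `mydataset.mymodel`)
```

As the output reference linked above notes, the columns differ by model type, e.g. k-means models report centroid coordinates rather than regression weights.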
-
v1.bigquery.BigqueryPredictModelJobOp(model: dsl.Input[google.BQMLModel], destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), table_name: str =
''
, query_statement: str =''
, threshold: float =-1.0
, location: str ='us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery predict model job and waits for it to finish.
- Parameters¶:
- location: str =
'us-central1'
¶ Location to run the BigQuery model prediction job. If not set, default to
US
multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location- model: dsl.Input[google.BQMLModel]¶
BigQuery ML model for prediction. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#predict_model_name
- table_name: str =
''
¶ BigQuery table id of the input table that contains the prediction data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#predict_table_name
- query_statement: str =
''
¶ Query statement string used to generate the prediction data. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#predict_query_statement
- threshold: float =
-1.0
¶ A custom threshold for the binary logistic regression model used as the cutoff between two labels. Predictions above the threshold are treated as positive predictions; predictions below the threshold are negative predictions. For more details, see https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict#threshold
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key. Example: { “name”: “wrench”, “mass”: “1.3kg”, “count”: “3” }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery model prediction job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the model prediction results should be stored. This property must be set for large results that exceed the maximum response size. For queries that produce anonymous (cached) results, this field will be populated by BigQuery.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
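The component wraps an ML.PREDICT query. A minimal sketch of the equivalent SQL, assuming a hypothetical binary logistic regression model `mydataset.mymodel` and input table `mydataset.new_data`; the STRUCT argument mirrors the component's threshold parameter:

```sql
-- Score new rows with a trained model, using a custom decision
-- threshold of 0.55 instead of the default 0.5.
SELECT
  *
FROM
  ML.PREDICT(MODEL `mydataset.mymodel`,
             (SELECT * FROM `mydataset.new_data`),
             STRUCT(0.55 AS threshold))
```

When table_name is set instead of query_statement, the subquery is replaced by a `TABLE` reference; the STRUCT argument only applies to model types that support a threshold.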
-
v1.bigquery.BigqueryQueryJobOp(destination_table: dsl.Output[google.BQTable], gcp_resources: dsl.OutputPath(str), query: str =
''
, location: str ='us-central1'
, query_parameters: list[str] =[]
, job_configuration_query: dict[str, str] ={}
, labels: dict[str, str] ={}
, encryption_spec_key_name: str =''
, project: str ='{{$.pipeline_google_cloud_project_id}}'
)¶ Launch a BigQuery query job and waits for it to finish.
Note: The total input commands/args to the component can be at most 50KB. This means the BigQuery query must be less than 50KB, since the input commands/args contain other non-query characters, including all parameter names, parameter values, and various JSON characters.
- Parameters¶:
- location: str =
'us-central1'
¶ Location for creating the BigQuery job. If not set, default to
US
multi-region. For more details, see https://cloud.google.com/bigquery/docs/locations#specifying_your_location- query: str =
''
¶ SQL query text to execute. Only standard SQL is supported. If the query is specified both here and in job_configuration_query, the value here overrides the other.
- query_parameters: list[str] =
[]
¶ jobs.query parameters for standard SQL queries. If query_parameters is specified both here and in job_configuration_query, the value here overrides the other.
- job_configuration_query: dict[str, str] =
{}
¶ A json formatted string describing the rest of the job configuration. For more details, see https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery
- labels: dict[str, str] =
{}
¶ The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key. Example: { “name”: “wrench”, “mass”: “1.3kg”, “count”: “3” }.
- encryption_spec_key_name: str =
''
¶ Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key. If encryption_spec_key_name is specified both here and in job_configuration_query, the value here overrides the other.
- project: str =
'{{$.pipeline_google_cloud_project_id}}'
¶ Project to run the BigQuery query job. Defaults to the project in which the PipelineJob is run.
- Returns¶:
destination_table: dsl.Output[google.BQTable]
Describes the table where the query results should be stored. This property must be set for large results that exceed the maximum response size. For queries that produce anonymous (cached) results, this field will be populated by BigQuery.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the BigQuery job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
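This component runs arbitrary standard SQL, so in a BigQuery ML pipeline it is typically what executes the CREATE MODEL statement that later components consume. A minimal sketch of a statement that might be passed as the query argument, assuming hypothetical `mydataset.training_data` with a `label` column:

```sql
-- Train a logistic regression model from a table of labeled
-- examples; the resulting model can feed the evaluate/predict ops.
CREATE OR REPLACE MODEL `mydataset.mymodel`
OPTIONS (
  model_type = 'logistic_reg',
  input_label_cols = ['label']
) AS
SELECT
  *
FROM
  `mydataset.training_data`
```

Note the 50KB limit above: long training queries may need to be factored into views so the text passed to the component stays small.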