Model Evaluation
Model evaluation preview components.
Components:
| DetectDataBiasOp | Detects data bias metrics in a dataset. |
| DetectModelBiasOp | Detects bias metrics from a model's predictions. |
| ModelEvaluationFeatureAttributionOp | Computes feature attribution on a trained model's batch explanation results. |
| ModelImportEvaluationOp | Imports a model evaluation artifact to an existing Vertex model with ModelService.ImportModelEvaluation. |
Pipelines:
| FeatureAttributionGraphComponentOp | A pipeline to compute feature attributions by sampling data for batch explanations. |
-
preview.model_evaluation.DetectDataBiasOp(gcp_resources: dsl.OutputPath(str), data_bias_metrics: dsl.Output[system.Artifact], target_field_name: str, bias_configs: list[Any], location: str = 'us-central1', dataset_format: str = 'jsonl', dataset_storage_source_uris: list[str] = [], dataset: dsl.Input[google.VertexDataset] = None, columns: list[str] = [], encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')
Detects data bias metrics in a dataset.
Creates a Dataflow job with Apache Beam to categorize each data point in the dataset into the corresponding bucket based on the bias configs, then computes data bias metrics for the dataset.
- Parameters:
- location: str = 'us-central1'
Location for running data bias detection.
- target_field_name: str
The full name path of the features target field in the predictions file. Formatted to be able to find nested columns, delimited by `.`. Alternatively referred to as the ground truth (or ground_truth_column) field.
- bias_configs: list[Any]
A list of google.cloud.aiplatform_v1beta1.types.ModelEvaluation.BiasConfig. When provided, data bias metrics are computed for each defined slice. Below is an example of how to format this input.
1: First, create a BiasConfig. `from google.cloud.aiplatform_v1beta1.types.ModelEvaluation import BiasConfig` `from google.cloud.aiplatform_v1.types.ModelEvaluationSlice.Slice import SliceSpec` `from google.cloud.aiplatform_v1.types.ModelEvaluationSlice.Slice.SliceSpec import SliceConfig` `bias_config = BiasConfig(bias_slices=SliceSpec(configs={ 'feature_a': SliceConfig(SliceSpec.Value(string_value= 'label_a') ) }))`
2: Create a list to store the bias configs in. `bias_configs = []`
3: Format each BiasConfig as JSON or a dict. `bias_config_json = json_format.MessageToJson(bias_config)` or `bias_config_dict = json_format.MessageToDict(bias_config)`
4: Append each formatted bias config to the list. `bias_configs.append(bias_config_json)`
5: Finally, pass bias_configs as a parameter to this component. `DetectDataBiasOp(bias_configs=bias_configs)`
- dataset_format: str = 'jsonl'
The file format for the dataset. jsonl and csv are the currently allowed formats.
- dataset_storage_source_uris: list[str] = []
Google Cloud Storage URI(-s) to unmanaged test datasets. jsonl and csv are the currently allowed formats. If dataset is also provided, this field will be overridden by the provided Vertex Dataset.
- dataset: dsl.Input[google.VertexDataset] = None
A google.VertexDataset artifact of the dataset. If dataset_storage_source_uris is also provided, this Vertex Dataset argument will override the GCS source.
- encryption_spec_key_name: str = ''
Customer-managed encryption key options for the Dataflow job. If this is set, then all resources created by the Dataflow job will be encrypted with the provided encryption key. Has the form: projects/my-project/locations/my-location/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- project: str = '{{$.pipeline_google_cloud_project_id}}'
Project to run data bias detection. Defaults to the project in which the PipelineJob is run.
- Returns:
data_bias_metrics: dsl.Output[system.Artifact]
Artifact tracking the data bias detection output.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the Dataflow job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
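The bias_configs formatting steps for this component can be sketched without the Vertex AI SDK by writing each config as a plain dict. This is only an illustration under stated assumptions: the camelCase keys (`biasSlices`, `configs`, `value`, `stringValue`) are a guess at what `json_format.MessageToDict` would emit for a BiasConfig, and `make_bias_config` is a hypothetical helper, not part of any library.

```python
import json


def make_bias_config(feature: str, label: str) -> dict:
    """Build one BiasConfig-shaped dict that slices `feature` on a string value.

    Hypothetical helper: the key names are assumed, not taken from the SDK.
    """
    return {
        "biasSlices": {
            "configs": {
                feature: {"value": {"stringValue": label}},
            }
        }
    }


# Step 2: a list to hold the configs.
bias_configs = []

# Steps 3 and 4: format each config and collect it.
bias_configs.append(make_bias_config("feature_a", "label_a"))

# Every entry has to be JSON-serializable before it reaches the component.
serialized = json.dumps(bias_configs)
```

The resulting list would then be passed as the `bias_configs` argument, as in step 5.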
-
preview.model_evaluation.DetectModelBiasOp(gcp_resources: dsl.OutputPath(str), bias_model_metrics: dsl.Output[system.Artifact], target_field_name: str, bias_configs: list, location: str = 'us-central1', predictions_format: str = 'jsonl', predictions_gcs_source: dsl.Input[system.Artifact] = None, predictions_bigquery_source: dsl.Input[google.BQTable] = None, thresholds: list = [0.5], encryption_spec_key_name: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')
Detects bias metrics from a model’s predictions.
Creates a Dataflow job with Apache Beam to categorize each data point into the corresponding bucket based on the bias configs and predictions, then computes model bias metrics for classification problems.
- Parameters:
- location: str = 'us-central1'
Location for running data bias detection.
- target_field_name: str
The full name path of the features target field in the predictions file. Formatted to be able to find nested columns, delimited by `.`. Alternatively referred to as the ground truth (or ground_truth_column) field.
- predictions_format: str = 'jsonl'
The file format for the batch prediction results. jsonl is the only format currently allowed.
- predictions_gcs_source: dsl.Input[system.Artifact] = None
An artifact with its URI pointing toward a GCS directory with prediction or explanation files to be used for this evaluation. For prediction results, the files should be named “prediction.results-”. For explanation results, the files should be named “explanation.results-”.
- predictions_bigquery_source: dsl.Input[google.BQTable] = None
BigQuery table with prediction or explanation data to be used for this evaluation. For prediction results, the table column should be named “predicted_*”.
- bias_configs: list
A list of google.cloud.aiplatform_v1beta1.types.ModelEvaluation.BiasConfig. When provided, model bias metrics are computed for each defined slice. Below is an example of how to format this input.
1: First, create a BiasConfig. `from google.cloud.aiplatform_v1beta1.types.ModelEvaluation import BiasConfig` `from google.cloud.aiplatform_v1.types.ModelEvaluationSlice.Slice import SliceSpec` `from google.cloud.aiplatform_v1.types.ModelEvaluationSlice.Slice.SliceSpec import SliceConfig` `bias_config = BiasConfig(bias_slices=SliceSpec(configs={ 'feature_a': SliceConfig(SliceSpec.Value(string_value= 'label_a') ) }))`
2: Create a list to store the bias configs in. `bias_configs = []`
3: Format each BiasConfig as JSON or a dict. `bias_config_json = json_format.MessageToJson(bias_config)` or `bias_config_dict = json_format.MessageToDict(bias_config)`
4: Append each formatted bias config to the list. `bias_configs.append(bias_config_json)`
5: Finally, pass bias_configs as a parameter to this component. `DetectModelBiasOp(bias_configs=bias_configs)`
- thresholds: list = [0.5]
A list of float values to be used as prediction decision thresholds.
- encryption_spec_key_name: str = ''
Customer-managed encryption key options for the Dataflow job. If this is set, then all resources created by the Dataflow job will be encrypted with the provided encryption key. Has the form: projects/my-project/locations/my-location/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- project: str = '{{$.pipeline_google_cloud_project_id}}'
Project to run data bias detection. Defaults to the project in which the PipelineJob is run.
- Returns:
bias_model_metrics: dsl.Output[system.Artifact]
Artifact tracking the model bias detection output.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the Dataflow job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
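The target_field_name parameter is described as a `.`-delimited path that can reach nested columns. A minimal sketch of how such a path could be resolved against one parsed jsonl record follows. The helper `get_nested_field` and the sample record are hypothetical, and the component's actual lookup logic is not shown in this reference.

```python
from typing import Any


def get_nested_field(record: dict, path: str, delimiter: str = ".") -> Any:
    """Walk `record` one key at a time along a delimited path."""
    current: Any = record
    for key in path.split(delimiter):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"field {path!r} not found at {key!r}")
        current = current[key]
    return current


# Hypothetical prediction record, as it might appear after json.loads().
record = {
    "instance": {"label": "approved", "age": 42},
    "prediction": {"scores": [0.2, 0.8]},
}

# "instance.label" plays the role of target_field_name here.
target = get_nested_field(record, "instance.label")
```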
-
preview.model_evaluation.FeatureAttributionGraphComponentOp(location: str, prediction_type: str, vertex_model: VertexModel, batch_predict_instances_format: str, batch_predict_gcs_destination_output_uri: str, batch_predict_gcs_source_uris: list[str] = [], batch_predict_bigquery_source_uri: str = '', batch_predict_predictions_format: str = 'jsonl', batch_predict_bigquery_destination_output_uri: str = '', batch_predict_machine_type: str = 'n1-standard-16', batch_predict_starting_replica_count: int = 5, batch_predict_max_replica_count: int = 10, batch_predict_explanation_metadata: dict = {}, batch_predict_explanation_parameters: dict = {}, batch_predict_explanation_data_sample_size: int = 10000, batch_predict_accelerator_type: str = '', batch_predict_accelerator_count: int = 0, dataflow_machine_type: str = 'n1-standard-4', dataflow_max_num_workers: int = 5, dataflow_disk_size_gb: int = 50, dataflow_service_account: str = '', dataflow_subnetwork: str = '', dataflow_use_public_ips: bool = True, encryption_spec_key_name: str = '', force_runner_mode: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}') outputs
A pipeline to compute feature attributions by sampling data for batch explanations.
This pipeline guarantees support for AutoML Tabular models that contain a valid explanation_spec.
- Parameters:
- location: str
The GCP region that runs the pipeline components.
- prediction_type: str
The type of prediction the model is to produce. “classification”, “regression”, or “forecasting”.
- vertex_model: VertexModel
The Vertex model artifact used for batch explanation.
- batch_predict_instances_format: str
The format in which instances are given. Must be one of the Model’s supportedInputStorageFormats. For more details about this input config, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#InputConfig.
- batch_predict_gcs_destination_output_uri: str
The Google Cloud Storage location of the directory where the output is to be written. In the given directory a new directory is created. Its name is prediction-<model-display-name>-<job-create-time>, where the timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. Inside of it, files predictions_0001.<extension>, predictions_0002.<extension>, …, predictions_N.<extension> are created, where <extension> depends on the chosen predictions_format, and N may equal 0001 and depends on the total number of successfully predicted instances. If the Model has both instance and prediction schemata defined, then each such file contains predictions as per the predictions_format. If prediction for any instance failed (partially or completely), then additional errors_0001.<extension>, errors_0002.<extension>, …, errors_N.<extension> files are created (N depends on the total number of failed predictions). These files contain the failed instances, as per their schema, followed by an additional error field whose value is a google.rpc.Status containing only code and message fields. For more details about this output config, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#OutputConfig.
- batch_predict_gcs_source_uris: list[str] = []
Google Cloud Storage URI(-s) to your instances to run batch prediction on. May contain wildcards. For more information on wildcards, see https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames. For more details about this input config, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#InputConfig.
- batch_predict_bigquery_source_uri: str = ''
Google BigQuery URI to your instances to run batch prediction on. May contain wildcards. For more details about this input config, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#InputConfig.
- batch_predict_predictions_format: str = 'jsonl'
The format in which Vertex AI gives the predictions. Must be one of the Model’s supportedOutputStorageFormats. For more details about this output config, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#OutputConfig.
- batch_predict_bigquery_destination_output_uri: str = ''
The BigQuery project location where the output is to be written. In the given project a new dataset is created with the name prediction_<model-display-name>_<job-create-time>, where <model-display-name> is made BigQuery-dataset-name compatible (for example, most special characters become underscores), and the timestamp is in YYYY_MM_DDThh_mm_ss_sssZ “based on ISO-8601” format. In the dataset two tables will be created, predictions and errors. If the Model has both instance and prediction schemata defined, then the tables have columns as follows: the predictions table contains instances for which the prediction succeeded, with columns as per a concatenation of the Model’s instance and prediction schemata. The errors table contains rows for which the prediction failed, with instance columns, as per the instance schema, followed by a single “errors” column whose values are google.rpc.Status represented as a STRUCT and containing only code and message. For more details about this output config, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#OutputConfig.
- batch_predict_machine_type: str = 'n1-standard-16'
The type of machine for running batch prediction on dedicated resources. If the Model supports DEDICATED_RESOURCES this config may be provided (and the job will use these resources). If the Model doesn’t support AUTOMATIC_RESOURCES, this config must be provided. For more details about the BatchDedicatedResources, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#BatchDedicatedResources. For more details about the machine spec, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/MachineSpec
- batch_predict_starting_replica_count: int = 5
The number of machine replicas used at the start of the batch operation. If not set, Vertex AI decides the starting number, not greater than max_replica_count. Only used if machine_type is set.
- batch_predict_max_replica_count: int = 10
The maximum number of machine replicas the batch operation may be scaled to. Only used if machine_type is set.
- batch_predict_explanation_metadata: dict = {}
Explanation metadata configuration for this BatchPredictionJob. Can be specified only if generate_explanation is set to True. This value overrides the value of Model.explanation_metadata. All fields of explanation_metadata are optional in the request. If a field of the explanation_metadata object is not populated, the corresponding field of the Model.explanation_metadata object is inherited. For more details, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/ExplanationSpec#explanationmetadata.
- batch_predict_explanation_parameters: dict = {}
Parameters to configure explaining for the Model’s predictions. Can be specified only if generate_explanation is set to True. This value overrides the value of Model.explanation_parameters. All fields of explanation_parameters are optional in the request. If a field of the explanation_parameters object is not populated, the corresponding field of the Model.explanation_parameters object is inherited. For more details, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/ExplanationSpec#ExplanationParameters.
- batch_predict_explanation_data_sample_size: int = 10000
Desired size to downsample the input dataset that will then be used for batch explanation.
- batch_predict_accelerator_type: str = ''
The type of accelerator(s) that may be attached to the machine as per batch_predict_accelerator_count. Only used if batch_predict_machine_type is set. For more details about the machine spec, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/MachineSpec
- batch_predict_accelerator_count: int = 0
The number of accelerators to attach to the batch_predict_machine_type. Only used if batch_predict_machine_type is set.
- dataflow_machine_type: str = 'n1-standard-4'
The Dataflow machine type for evaluation components.
- dataflow_max_num_workers: int = 5
The max number of Dataflow workers for evaluation components.
- dataflow_disk_size_gb: int = 50
Dataflow worker’s disk size in GB for evaluation components.
- dataflow_service_account: str = ''
Custom service account to run Dataflow jobs.
- dataflow_subnetwork: str = ''
Dataflow’s fully qualified subnetwork name. When empty, the default subnetwork will be used. Example: https://cloud.google.com/dataflow/docs/guides/specifying-networks#example_network_and_subnetwork_specifications
- dataflow_use_public_ips: bool = True
Specifies whether Dataflow workers use public IP addresses.
- encryption_spec_key_name: str = ''
Customer-managed encryption key options. If set, resources created by this pipeline will be encrypted with the provided encryption key. Has the form: projects/my-project/locations/my-location/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- force_runner_mode: str = ''
Forces a specific runner mode. Valid options are Dataflow and DirectRunner.
- project: str = '{{$.pipeline_google_cloud_project_id}}'
The GCP project that runs the pipeline components. Defaults to the project in which the PipelineJob is run.
- Returns:
A system.Metrics artifact with feature attributions.
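The encryption_spec_key_name parameter has a fixed resource-name form and must refer to a key in the same region as the compute resources. The sketch below shows one way a caller might check both conditions before submitting a pipeline. `check_encryption_key` is a hypothetical helper, treating the empty default as "no customer-managed key" is an assumption, and comparing the key's location string directly to the pipeline location is a simplification.

```python
import re

# Resource-name form quoted in the parameter description.
_KEY_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)/cryptoKeys/(?P<key>[^/]+)$"
)


def check_encryption_key(key_name: str, location: str) -> bool:
    """Return True if key_name is empty, or well-formed and located in `location`."""
    if not key_name:
        # Assumption: the empty default means no customer-managed key is used.
        return True
    match = _KEY_PATTERN.match(key_name)
    if match is None:
        raise ValueError(f"malformed encryption_spec_key_name: {key_name!r}")
    return match.group("location") == location


# Hypothetical key name following the documented form.
key = "projects/my-project/locations/us-central1/keyRings/my-kr/cryptoKeys/my-key"
same_region = check_encryption_key(key, "us-central1")
```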
-
preview.model_evaluation.ModelEvaluationFeatureAttributionOp(gcp_resources: dsl.OutputPath(str), feature_attributions: dsl.Output[system.Metrics], problem_type: str, location: str = 'us-central1', predictions_format: str = 'jsonl', predictions_gcs_source: dsl.Input[system.Artifact] = None, predictions_bigquery_source: dsl.Input[google.BQTable] = None, dataflow_service_account: str = '', dataflow_disk_size_gb: int = 50, dataflow_machine_type: str = 'n1-standard-4', dataflow_workers_num: int = 1, dataflow_max_workers_num: int = 5, dataflow_subnetwork: str = '', dataflow_use_public_ips: bool = True, encryption_spec_key_name: str = '', force_runner_mode: str = '', project: str = '{{$.pipeline_google_cloud_project_id}}')
Computes feature attribution on a trained model’s batch explanation results.
Creates a Dataflow job with Apache Beam and TFMA to compute feature attributions. Computes feature attribution for every target label where possible, which is typically the case for AutoML Classification models.
- Parameters:
- location: str = 'us-central1'
Location for running feature attribution. If not set, defaults to us-central1.
- problem_type: str
Problem type of the pipeline: one of classification, regression, and forecasting.
- predictions_format: str = 'jsonl'
The file format for the batch prediction results. jsonl, csv, and bigquery are the allowed formats, from Vertex Batch Prediction. If not set, defaults to jsonl.
- predictions_gcs_source: dsl.Input[system.Artifact] = None
An artifact with its URI pointing toward a GCS directory with prediction or explanation files to be used for this evaluation. For prediction results, the files should be named “prediction.results-” or “predictions_”. For explanation results, the files should be named “explanation.results-”.
- predictions_bigquery_source: dsl.Input[google.BQTable] = None
BigQuery table with prediction or explanation data to be used for this evaluation. For prediction results, the table column should be named “predicted_*”.
- dataflow_service_account: str = ''
Service account to run the Dataflow job. If not set, Dataflow will use the default worker service account. For more details, see https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#default_worker_service_account
- dataflow_disk_size_gb: int = 50
The disk size (in GB) of the machine executing the evaluation run. If not set, defaults to 50.
- dataflow_machine_type: str = 'n1-standard-4'
The machine type executing the evaluation run. If not set, defaults to n1-standard-4.
- dataflow_workers_num: int = 1
The number of workers executing the evaluation run. If not set, defaults to 1.
- dataflow_max_workers_num: int = 5
The max number of workers executing the evaluation run. If not set, defaults to 5.
- dataflow_subnetwork: str = ''
Dataflow’s fully qualified subnetwork name. When empty, the default subnetwork will be used. More details: https://cloud.google.com/dataflow/docs/guides/specifying-networks#example_network_and_subnetwork_specifications
- dataflow_use_public_ips: bool = True
Specifies whether Dataflow workers use public IP addresses.
- encryption_spec_key_name: str = ''
Customer-managed encryption key for the Dataflow job. If this is set, then all resources created by the Dataflow job will be encrypted with the provided encryption key.
- force_runner_mode: str = ''
Flag to choose the Beam runner. Valid options are DirectRunner and Dataflow.
- project: str = '{{$.pipeline_google_cloud_project_id}}'
Project to run the feature attribution container. Defaults to the project in which the PipelineJob is run.
- Returns:
gcs_output_directory: Unknown
JsonArray of the downsampled dataset GCS output.
bigquery_output_table: Unknown
String of the downsampled dataset BigQuery output.
gcp_resources: dsl.OutputPath(str)
Serialized gcp_resources proto tracking the Dataflow job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
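The predictions_gcs_source description says result files are recognized by name prefix: “prediction.results-” or “predictions_” for predictions, and “explanation.results-” for explanations. As an illustration of that naming rule only, the hypothetical helper below sorts object names into the two kinds. The bucket paths are made up, and the component's own file discovery may differ.

```python
from typing import Optional

# Prefixes quoted in the predictions_gcs_source description.
PREDICTION_PREFIXES = ("prediction.results-", "predictions_")
EXPLANATION_PREFIXES = ("explanation.results-",)


def classify_result_file(uri: str) -> Optional[str]:
    """Classify a result file by the prefix of its base name."""
    name = uri.rsplit("/", 1)[-1]
    if name.startswith(EXPLANATION_PREFIXES):
        return "explanation"
    if name.startswith(PREDICTION_PREFIXES):
        return "prediction"
    return None  # e.g. error files or unrelated objects


# Hypothetical object names in a batch prediction output directory.
kinds = [
    classify_result_file("gs://my-bucket/out/explanation.results-00000-of-00001"),
    classify_result_file("gs://my-bucket/out/predictions_0001.jsonl"),
    classify_result_file("gs://my-bucket/out/errors_0001.jsonl"),
]
```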
-
preview.model_evaluation.ModelImportEvaluationOp(model: dsl.Input[google.VertexModel], gcp_resources: dsl.OutputPath(str), evaluation_resource_name: dsl.OutputPath(str), metrics: dsl.Input[system.Metrics] | None = None, row_based_metrics: dsl.Input[system.Metrics] | None = None, problem_type: str | None = None, classification_metrics: dsl.Input[google.ClassificationMetrics] | None = None, forecasting_metrics: dsl.Input[google.ForecastingMetrics] | None = None, regression_metrics: dsl.Input[google.RegressionMetrics] | None = None, text_generation_metrics: dsl.Input[system.Metrics] | None = None, question_answering_metrics: dsl.Input[system.Metrics] | None = None, summarization_metrics: dsl.Input[system.Metrics] | None = None, explanation: dsl.Input[system.Metrics] | None = None, feature_attributions: dsl.Input[system.Metrics] | None = None, embedding_metrics: dsl.Input[system.Metrics] | None = None, display_name: str = '', dataset_path: str = '', dataset_paths: list[str] = [], dataset_type: str = '')
Imports a model evaluation artifact to an existing Vertex model with ModelService.ImportModelEvaluation.
For more details, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.models.evaluations. One of the metrics inputs must be provided: metrics together with problem_type, classification_metrics, regression_metrics, forecasting_metrics, text_generation_metrics, question_answering_metrics, summarization_metrics, or embedding_metrics.
- Parameters:
- model: dsl.Input[google.VertexModel]
Vertex model resource that will be the parent resource of the uploaded evaluation.
- metrics: dsl.Input[system.Metrics] | None = None
Path of metrics generated from an evaluation component.
- row_based_metrics: dsl.Input[system.Metrics] | None = None
Path of row_based_metrics generated from an evaluation component.
- problem_type: str | None = None
The problem type of the metrics being imported to the VertexModel. classification, regression, forecasting, text-generation, question-answering, and summarization are the currently supported problem types. Must be provided when metrics is provided.
- classification_metrics: dsl.Input[google.ClassificationMetrics] | None = None
google.ClassificationMetrics artifact generated from the ModelEvaluationClassificationOp component.
- forecasting_metrics: dsl.Input[google.ForecastingMetrics] | None = None
google.ForecastingMetrics artifact generated from the ModelEvaluationForecastingOp component.
- regression_metrics: dsl.Input[google.RegressionMetrics] | None = None
google.RegressionMetrics artifact generated from the ModelEvaluationRegressionOp component.
- text_generation_metrics: dsl.Input[system.Metrics] | None = None
system.Metrics artifact generated from the LLMEvaluationTextGenerationOp component. Subject to change to google.TextGenerationMetrics.
- question_answering_metrics: dsl.Input[system.Metrics] | None = None
system.Metrics artifact generated from the LLMEvaluationTextGenerationOp component. Subject to change to google.QuestionAnsweringMetrics.
- summarization_metrics: dsl.Input[system.Metrics] | None = None
system.Metrics artifact generated from the LLMEvaluationTextGenerationOp component. Subject to change to google.SummarizationMetrics.
- explanation: dsl.Input[system.Metrics] | None = None
Path for model explanation metrics generated from an evaluation component.
- feature_attributions: dsl.Input[system.Metrics] | None = None
The feature attributions metrics artifact generated from the feature attribution component.
- embedding_metrics: dsl.Input[system.Metrics] | None = None
The embedding metrics artifact generated from the embedding retrieval metrics component.
- display_name: str = ''
The display name for the uploaded model evaluation resource.
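The rule that one metrics input must be supplied, and that problem_type must accompany metrics, can be checked before a pipeline is compiled. The sketch below is a hypothetical pre-flight check, not part of the component. It takes the names of the inputs a caller intends to wire up, and it assumes that explanation, feature_attributions, and row_based_metrics do not satisfy the requirement on their own, since the description does not list them.

```python
from typing import Iterable, Optional

# Inputs the description lists as sufficient on their own.
STANDALONE_METRICS = {
    "classification_metrics",
    "regression_metrics",
    "forecasting_metrics",
    "text_generation_metrics",
    "question_answering_metrics",
    "summarization_metrics",
    "embedding_metrics",
}


def validate_metric_inputs(
    provided: Iterable[str], problem_type: Optional[str] = None
) -> None:
    """Raise ValueError if the chosen inputs break the documented rule."""
    names = set(provided)
    if "metrics" in names:
        # Generic metrics need an explicit problem type.
        if not problem_type:
            raise ValueError("problem_type must be provided when metrics is provided")
        return
    if not names & STANDALONE_METRICS:
        raise ValueError("at least one metrics input must be provided")


# Both of these satisfy the rule and return without raising.
validate_metric_inputs({"metrics"}, problem_type="classification")
validate_metric_inputs({"regression_metrics", "feature_attributions"})
```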