google_cloud_pipeline_components.experimental.evaluation module

Google Cloud Pipeline Model Evaluation components.

google_cloud_pipeline_components.experimental.evaluation.ModelEvaluationOp(project: str, root_dir: str, problem_type: str, batch_prediction_job: google.VertexBatchPredictionJob, class_names: list, ground_truth_column: str, location: str = 'us-central1', predictions_format: str = 'jsonl', classification_type: str = None, prediction_score_column: str = 'prediction.scores', prediction_label_column: str = 'prediction.classes', prediction_id_column: str = '', example_weight_column: str = '', positive_classes: list = '{}', generate_feature_attribution: bool = False, dataflow_service_account: str = None, dataflow_disk_size: int = 200, dataflow_machine_type: str = 'n1-standard-4', dataflow_workers_num: int = '10', dataflow_max_workers_num: int = '100')

model_evaluation: Computes evaluation metrics on a trained model’s batch prediction results. Creates a Dataflow job to compute the metrics.

Args:
project (str):

Project to run evaluation container.

location (Optional[str]):

Location for running the evaluation. If not set, defaults to us-central1.

root_dir (str):

The GCS directory for keeping staging files. A random subdirectory will be created under the directory to keep job info for resuming the job in case of failure.

problem_type (str):

The problem type being addressed by this evaluation run. The currently supported problem types are classification and regression.

predictions_format (Optional[str]):

The file format for the batch prediction results. jsonl is currently the only allowed format. If not set, defaults to jsonl.

batch_prediction_job (google.VertexBatchPredictionJob):

The VertexBatchPredictionJob with prediction or explanation results for this evaluation run. For prediction results, the files should be in the format “prediction.results-”. For explanation results, the files should be in the format “explanation.results-”.

classification_type (str):

The type of classification problem, either multiclass or multilabel. Required when problem_type is classification.

class_names (Sequence[str]):

The list of class names, in the same order they appear in the batch predictions column.

ground_truth_column (str):

The column name of the feature containing ground truth. Nested columns can be addressed using ‘.’ as the delimiter. Prefixed with ‘instance.’ for Vertex Batch Prediction.

prediction_score_column (str):

The column name of the field containing batch prediction scores. Nested columns can be addressed using ‘.’ as the delimiter.

prediction_label_column (Optional[str]):

Optional. The column name of the field containing classes the model is scoring. Nested columns can be addressed using ‘.’ as the delimiter.

prediction_id_column (Optional[str]):

Optional. The column name of the field containing ids for classes the model is scoring. Nested columns can be addressed using ‘.’ as the delimiter.

example_weight_column (Optional[str]):

Optional. The column name of the field containing example weights. Nested columns can be addressed using ‘.’ as the delimiter.

positive_classes (Optional[Sequence[str]]):

Optional for classification problem_type. The list of class names for which to create binary classification metrics. Metrics are computed one-vs-rest for each value of positive_classes provided.

generate_feature_attribution (Optional[bool]):

Optional. Set to False by default. If set to True, the explanations generated by the VertexBatchPredictionJob are used to generate feature attributions. This succeeds only if the input VertexBatchPredictionJob generated explanations.

dataflow_service_account (Optional[str]):

Service account to run the Dataflow job.

dataflow_disk_size (Optional[int]):

The disk size (in GB) of the machine executing the evaluation run. If not set, defaults to 200.

dataflow_machine_type (Optional[str]):

The machine type executing the evaluation run. If not set, defaults to n1-standard-4.

dataflow_workers_num (Optional[int]):

The number of workers executing the evaluation run. If not set, defaults to 10.

dataflow_max_workers_num (Optional[int]):

The maximum number of workers executing the evaluation run. If not set, defaults to 100.
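
The column arguments above all use the same ‘.’-delimited convention, and class_names has to follow the order of the score values. The following is a minimal sketch of how those pieces could fit together on a single batch prediction record. It is not the component's implementation: the helper names (get_nested, one_vs_rest), the record layout, and the instance.species column are assumptions made for illustration.

```python
import json


def get_nested(record, column):
    """Follow a '.'-delimited column name through nested dicts."""
    value = record
    for key in column.split("."):
        value = value[key]
    return value


def one_vs_rest(label, positive_class):
    """Binarize a class label against one positive class."""
    return 1 if label == positive_class else 0


# Hypothetical JSONL line; the field names inside "instance" are assumptions.
line = json.dumps({
    "instance": {"species": "dog"},
    "prediction": {"classes": ["cat", "dog", "bird"],
                   "scores": [0.1, 0.7, 0.2]},
})
record = json.loads(line)

# ground_truth_column and prediction_score_column resolved with '.' as delimiter.
ground_truth = get_nested(record, "instance.species")
scores = get_nested(record, "prediction.scores")

# class_names must be listed in the same order as the scores.
class_names = ["cat", "dog", "bird"]
predicted = class_names[scores.index(max(scores))]

# One binary view of the labels per entry in positive_classes.
positive_classes = ["dog"]
binary_truth = {c: one_vs_rest(ground_truth, c) for c in positive_classes}
binary_pred = {c: one_vs_rest(predicted, c) for c in positive_classes}
```

If class_names were listed in a different order than the scores, the lookup in this sketch would silently return the wrong label, which is why the ordering requirement matters.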

Returns:
evaluation_metrics (system.Metrics):

System metrics artifact representing the evaluation metrics in GCS. Work is in progress to update this to a google.VertexMetrics type with additional functionality.
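
The constraints documented above (supported problem types, the single allowed predictions format, and classification_type being required for classification) can be checked before a pipeline is compiled. The sketch below shows one possible way to do that. The helper build_evaluation_args is hypothetical and not part of the library, and the project, bucket, and column values are placeholders.

```python
SUPPORTED_PROBLEM_TYPES = {"classification", "regression"}
SUPPORTED_CLASSIFICATION_TYPES = {"multiclass", "multilabel"}


def build_evaluation_args(project, root_dir, problem_type, class_names,
                          ground_truth_column, classification_type=None,
                          predictions_format="jsonl", **overrides):
    """Hypothetical helper: check documented constraints, return keyword arguments."""
    if problem_type not in SUPPORTED_PROBLEM_TYPES:
        raise ValueError(f"unsupported problem_type: {problem_type!r}")
    if predictions_format != "jsonl":
        raise ValueError("jsonl is the only documented predictions_format")
    if (problem_type == "classification"
            and classification_type not in SUPPORTED_CLASSIFICATION_TYPES):
        raise ValueError(
            "classification_type must be 'multiclass' or 'multilabel'")

    args = {
        "project": project,
        "root_dir": root_dir,
        "problem_type": problem_type,
        "class_names": list(class_names),
        "ground_truth_column": ground_truth_column,
        "predictions_format": predictions_format,
    }
    if classification_type is not None:
        args["classification_type"] = classification_type
    # Remaining documented arguments (for example dataflow_workers_num)
    # are passed through unchanged.
    args.update(overrides)
    return args


# Placeholder values; replace with a real project and GCS path.
eval_args = build_evaluation_args(
    project="my-project",
    root_dir="gs://my-bucket/eval-staging",
    problem_type="classification",
    classification_type="multiclass",
    class_names=["cat", "dog", "bird"],
    ground_truth_column="instance.species",
    dataflow_workers_num=5,
)
```

Inside a pipeline definition, the resulting dictionary could then be unpacked into ModelEvaluationOp together with the batch_prediction_job artifact produced by an upstream batch prediction step.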