google_cloud_pipeline_components.experimental.custom_job.custom_job module

Module for supporting Google Vertex AI Custom Training Job Op.

google_cloud_pipeline_components.experimental.custom_job.custom_job.custom_training_job_op(component_spec: Callable, display_name: Optional[str] = '', replica_count: Optional[int] = 1, machine_type: Optional[str] = 'n1-standard-4', accelerator_type: Optional[str] = '', accelerator_count: Optional[int] = 1, boot_disk_type: Optional[str] = 'pd-ssd', boot_disk_size_gb: Optional[int] = 100, timeout: Optional[str] = '', restart_job_on_worker_restart: Optional[bool] = False, service_account: Optional[str] = '', network: Optional[str] = '', worker_pool_specs: Optional[List[Mapping[str, Any]]] = None, encryption_spec_key_name: Optional[str] = '', tensorboard: Optional[str] = '', base_output_directory: Optional[str] = '', labels: Optional[Dict[str, str]] = None) → Callable

Run a pipeline task using Vertex AI custom training job.

For detailed doc of the service, please refer to https://cloud.google.com/vertex-ai/docs/training/create-custom-job

Args:

component_spec: The task (ContainerOp) object to run as Vertex AI custom

job.

display_name (Optional[str]): The name of the custom job. If not provided

the component_spec.name will be used instead.

replica_count (Optional[int]): The number of replicas to be split between

master workerPoolSpec and worker workerPoolSpec. (master always has 1 replica).

machine_type (Optional[str]): The type of the machine to run the custom

job. The default value is “n1-standard-4”. For more details about this input config, see https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types.

accelerator_type (Optional[str]): The type of accelerator(s) that may be

attached to the machine as per accelerator_count. For more details about this input config, see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/MachineSpec#acceleratortype.

accelerator_count (Optional[int]): The number of accelerators to attach to

the machine. Defaults to 1 if accelerator_type is set.

boot_disk_type (Optional[str]):

Type of the boot disk (default is “pd-ssd”). Valid values: “pd-ssd”: (Persistent Disk Solid State Drive) or “pd-standard” (Persistent Disk Hard Disk Drive).

boot_disk_size_gb (Optional[int]): Size in GB of the boot disk (default is

100GB).

timeout (Optional[str]): The maximum job running time. The default is 7

days. A duration in seconds with up to nine fractional digits, terminated by ‘s’, for example: “3.5s”.

restart_job_on_worker_restart (Optional[bool]): Restarts the entire

CustomJob if a worker gets restarted. This feature can be used by distributed training jobs that are not resilient to workers leaving and joining a job.

service_account (Optional[str]): Sets the default service account for

workload run-as account. The service account running the pipeline

(https://cloud.google.com/vertex-ai/docs/pipelines/configure-project#service-account): submitting jobs must have act-as permission on this run-as account. If unspecified, the Vertex AI Custom Code Service
Agent(https://cloud.google.com/vertex-ai/docs/general/access-control#service-agents): for the CustomJob’s project is used.

network (Optional[str]): The full name of the Compute Engine network to

which the job should be peered. For example, projects/12345/global/networks/myVPC. Format is of the form projects/{project}/global/networks/{network}. Where {project} is a project number, as in 12345, and {network} is a network name. Private services access must already be configured for the network. If left unspecified, the job is not peered with any network.

worker_pool_specs (Optional[List[Mapping[str, Any]]]): Worker_pool_specs for

distributed training. This will overwite all other cluster configurations. For details, please see: https://cloud.google.com/ai-platform-unified/docs/training/distributed-training

encryption_spec_key_name (Optional[str]): Customer-managed encryption key

options for the CustomJob. If this is set, then all resources created by the CustomJob will be encrypted with the provided encryption key.

tensorboard (Optional[str]): The name of a Vertex AI Tensorboard resource

to which this CustomJob will upload Tensorboard logs.

base_output_directory (Optional[str]): The Cloud Storage location to store

the output of this CustomJob or HyperparameterTuningJob. see below for more details: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/GcsDestination

labels (Optional[Dict[str, str]]): The labels with user-defined metadata

to organize CustomJobs. See https://goo.gl/xmQnxf for more information.

Returns:

A Custom Job component operator correspoinding to the input component operator.