Dataflow¶
Create Google Cloud Dataflow jobs from within Vertex AI Pipelines.
Components:

| Component | Description |
| --- | --- |
| `v1.dataflow.DataflowPythonJobOp` | Launch a self-executing Beam Python file on Google Cloud using the Dataflow Runner. |
v1.dataflow.DataflowPythonJobOp(python_module_path: str, temp_location: str, gcp_resources: dsl.OutputPath(str), location: str = 'us-central1', requirements_file_path: str = '', args: list[str] = [], project: str = '{{$.pipeline_google_cloud_project_id}}')¶

Launch a self-executing Beam Python file on Google Cloud using the Dataflow Runner.

Parameters¶

- python_module_path: The GCS path to the Python file to run.
- temp_location: A GCS path for Dataflow to stage temporary job files created during the execution of the pipeline.
- location: Location of the Dataflow job. If not set, defaults to 'us-central1'.
- requirements_file_path: The GCS path to the pip requirements file.
- args: The list of args to pass to the Python file. Can include additional parameters for the Dataflow Runner.
- project: Project to create the Dataflow job. Defaults to the project in which the PipelineJob is run.

Returns¶
`gcp_resources: dsl.OutputPath(str)`: Serialized gcp_resources proto tracking the Dataflow job. For more details, see
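A minimal usage sketch follows, assuming the pipeline is compiled and submitted to Vertex AI Pipelines; the GCS paths and pipeline name are hypothetical placeholders. Because the component returns as soon as the job is launched, it is paired here with `WaitGcpResourcesOp` so downstream steps run only after the Dataflow job completes.

```python
from kfp import dsl
from google_cloud_pipeline_components.v1.dataflow import DataflowPythonJobOp
from google_cloud_pipeline_components.v1.wait_gcp_resources import WaitGcpResourcesOp


@dsl.pipeline(name="dataflow-python-job-example")
def pipeline():
    # Launch the self-executing Beam file; all gs:// paths are placeholders.
    dataflow_op = DataflowPythonJobOp(
        python_module_path="gs://my-bucket/dataflow/wordcount.py",
        temp_location="gs://my-bucket/dataflow/temp",
        requirements_file_path="gs://my-bucket/dataflow/requirements.txt",
        # Extra args are forwarded to the Python file and may include
        # additional Dataflow Runner options.
        args=["--output", "gs://my-bucket/dataflow/output"],
        location="us-central1",
    )
    # DataflowPythonJobOp finishes once the job is submitted. WaitGcpResourcesOp
    # polls the serialized gcp_resources proto and completes only when the
    # Dataflow job reaches a terminal state.
    WaitGcpResourcesOp(gcp_resources=dataflow_op.outputs["gcp_resources"])
```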