Tag: google-cloud-dataflow

Libraries cannot be found on Dataflow/Apache Beam job launched from CircleCI

I am having serious issues running a Python Apache Beam pipeline on the GCP Dataflow runner, launched from CircleCI. I would really appreciate it if someone could give me a hint on how to tackle this; I’ve tried everything, but nothing seems to work. Basically, the pipeline runs in Dataflow and uses google-api-python-client-1.12.3. If I

Dataflow BigQuery to BigQuery

I am trying to create a Dataflow script that reads from BigQuery and writes back to BigQuery. Our main table is massive and breaks the extraction capabilities. I’d like to create a simple table (as the result of a query) containing all the relevant information. The SQL query ‘Select * from table.orders where paid = false limit 10’ is a simple one
