Integrating ML models into production pipelines with Dataflow | C2C Community

  • 1 September 2022

Google Cloud’s Dataflow recently announced General Availability support for RunInference, Apache Beam's generic machine learning prediction and inference transform. In this article, Reza Rokni, Senior Staff Developer Advocate for Dataflow, takes a deeper dive into the transform, including:

  1. Showing the RunInference transform used with a simple model as an example, in both batch and streaming mode.

  2. Using the transform with multiple models in an ensemble.

  3. Providing an end-to-end pipeline example that makes use of an open source model from Torchvision. 

He also explains that, before RunInference, Apache Beam developers who wanted to use a machine learning model locally in a production pipeline had to hand-code the call to the model within a user-defined function (DoFn), taking on technical debt in the form of layers of boilerplate code. Let's have a look at what would have been needed:

  1. Load the model from a common location using the framework's load method.

  2. Ensure that the model is shared amongst the DoFns, either by hand or via the Shared class utility in Beam.

  3. Batch the data before the model is invoked to improve the model's efficiency. The developer would set this up either by hand or via one of Beam's group-into-batches utilities.

  4. Provide a set of metrics from the transform.

  5. Provide production grade logging and exception handling with clean messages to help that SRE out at 2 in the morning! 

  6. Pass specific parameters to the models, or start to build a generic transform that allows the configuration to determine information within the model. 

To learn more, follow the link below:


3 replies

Dataflow is great for streaming, and you can get more out of it if you know Apache Beam.

Reza Rokni has the knowledge and he can help people understand all the concepts.

Thanks for sharing his post @malamin 

Yes, @ilias, it is a good combination. In addition, I experimented with Apache Beam in a Google Cloud Skills Boost lab.

 

Great job @malamin 😉
