Google Cloud’s Dataflow recently announced General Availability of Apache Beam's generic machine learning prediction and inference transform, RunInference. In this article, Reza Rokni, Senior Staff Developer Advocate for Dataflow, takes a deeper dive into the transform, including:
- Showing the RunInference transform used with a simple model as an example, in both batch and streaming mode.
- Using the transform with multiple models in an ensemble.
- Providing an end-to-end pipeline example that makes use of an open source model from Torchvision.
He also discusses how, previously, Apache Beam developers who wanted to make use of a machine learning model locally in a production pipeline had to hand-code the call to the model within a user defined function (DoFn), taking on the technical debt of layers of boilerplate code. Let's have a look at what would have been needed:
- Load the model from a common location using the framework's load method.
- Ensure that the model is shared amongst the DoFns, either by hand or via the shared class utility in Beam.
- Batch the data before the model is invoked to improve model efficiency, set up either by hand or via one of the GroupIntoBatches utilities.
- Provide a set of metrics from the transform.
- Provide production-grade logging and exception handling, with clean messages to help that SRE out at 2 in the morning!
- Pass specific parameters to the model, or start to build a generic transform that allows the configuration to determine information within the model.
To learn more, follow the link below: