Architecture
The infernet-ml library comes with two classes of workflows:
- Inference Workflows
- Training Workflows (coming soon)
Inference Workflows
Inference workflows all subclass the BaseInferenceWorkflow class.
Just like we did in our Quickstart section earlier, to run an inference workflow, you need to:
- Instantiate the workflow class
- Call the setup() method to prepare the workflow for running
- Call the inference() method to execute the workflow
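The three steps above can be sketched as follows. This is illustrative only: EchoWorkflow is a hypothetical stand-in class that mimics the documented setup()/inference() interface, not a class from the infernet-ml library.

```python
# Hypothetical stand-in for a BaseInferenceWorkflow subclass, used only
# to illustrate the instantiate -> setup -> inference calling pattern.
class EchoWorkflow:
    def __init__(self) -> None:
        self.ready = False

    def setup(self) -> "EchoWorkflow":
        # One-time preparation (e.g. downloading a model) happens here.
        self.ready = True
        return self

    def inference(self, input_data):
        if not self.ready:
            raise RuntimeError("setup() must be called before inference()")
        return input_data


workflow = EchoWorkflow()              # 1. instantiate the workflow class
workflow.setup()                       # 2. prepare the workflow for running
result = workflow.inference("hello")   # 3. execute the workflow
```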
Life-Cycle of an Inference Workflow
All workflows implement the following life-cycle methods:
do_setup(self) -> Any
This method is called by the setup method, and only once during the life-cycle of the workflow. It performs any setup operations required before the workflow can run. For workflows that run the model themselves, this is where the model is downloaded and any ahead-of-time verifications are performed.
do_preprocessing(self, input_data: Any) -> Any
This method is the first method called by the inference method. It performs any preprocessing operations on the input data before it is passed to the model. The output of this method is passed to the do_run_model method.
do_run_model(self, preprocessed_data: Any) -> Any
This method is called by the inference method after the do_preprocessing method. This is where the model is run on the preprocessed data. The output of this method is passed to the do_postprocessing method.
do_postprocessing(self, input_data: Any, output_data: Any) -> Any
This method is called by the inference method after the do_run_model method. It performs any postprocessing operations on the output data, and its result is what is returned to the user.
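The life-cycle described above can be sketched end-to-end. Note this is a minimal toy sketch of the dispatch pattern, not the actual library code: it assumes only what is documented here, namely that setup() delegates to do_setup() and that inference() chains do_preprocessing, do_run_model, and do_postprocessing in order. The DoublerWorkflow subclass and its "model" are hypothetical.

```python
from typing import Any


class SketchBaseWorkflow:
    """Toy base class mirroring the documented life-cycle hooks."""

    def setup(self) -> Any:
        # Called once; delegates one-time preparation to do_setup.
        return self.do_setup()

    def inference(self, input_data: Any) -> Any:
        # Chains the three life-cycle hooks in the documented order.
        preprocessed = self.do_preprocessing(input_data)
        output = self.do_run_model(preprocessed)
        return self.do_postprocessing(input_data, output)

    def do_setup(self) -> Any:
        raise NotImplementedError

    def do_preprocessing(self, input_data: Any) -> Any:
        raise NotImplementedError

    def do_run_model(self, preprocessed_data: Any) -> Any:
        raise NotImplementedError

    def do_postprocessing(self, input_data: Any, output_data: Any) -> Any:
        raise NotImplementedError


class DoublerWorkflow(SketchBaseWorkflow):
    """Hypothetical workflow whose 'model' doubles each number."""

    def do_setup(self) -> Any:
        # In a real workflow, model weights would be downloaded here.
        self.factor = 2
        return self

    def do_preprocessing(self, input_data: Any) -> Any:
        # Coerce raw inputs to floats before the model sees them.
        return [float(x) for x in input_data]

    def do_run_model(self, preprocessed_data: Any) -> Any:
        return [self.factor * x for x in preprocessed_data]

    def do_postprocessing(self, input_data: Any, output_data: Any) -> Any:
        # Shape the raw model output into a user-facing result.
        return {"input": input_data, "output": output_data}


wf = DoublerWorkflow()
wf.setup()
result = wf.inference([1, 2, 3])
# result["output"] == [2.0, 4.0, 6.0]
```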
Class Hierarchy
Available Inference Workflows
The following inference workflows are available in the infernet-ml
library:
- HFInferenceClientWorkflow: Uses the Huggingface Inference Client library to run models hosted on Huggingface.
- ONNXInferenceWorkflow: Runs models in the ONNX format.
- TorchInferenceWorkflow: Runs PyTorch models.
- TTSInferenceWorkflow: The base class for all text-to-speech workflows.
- TGIClientInferenceWorkflow: Uses the TGI client to run models hosted on a Huggingface Text Generation Inference server.
- BarkHFInferenceWorkflow: Runs Suno's Bark model.
Training Workflows
Coming soon, stay tuned!