Architecture
The infernet-ml library comes with two classes of workflows:
- Inference Workflows
- Training Workflows (coming soon)
Inference Workflows
Inference workflows all subclass the BaseInferenceWorkflow class.
Just like we did in our Quickstart section earlier, to run an inference workflow, you need to:
- Instantiate the workflow class
- Call the setup() method to prepare the workflow for running
- Call the inference() method to execute the workflow
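The three steps above can be sketched as follows. This is illustrative only: EchoWorkflow is a hypothetical stand-in class that mimics the documented setup()/inference() interface, not a class from the infernet-ml library.

```python
# Hypothetical stand-in for a BaseInferenceWorkflow subclass, used only
# to illustrate the instantiate -> setup -> inference calling pattern.
class EchoWorkflow:
    def __init__(self) -> None:
        self.ready = False

    def setup(self) -> "EchoWorkflow":
        # One-time preparation (e.g. downloading a model) happens here.
        self.ready = True
        return self

    def inference(self, input_data):
        if not self.ready:
            raise RuntimeError("setup() must be called before inference()")
        return input_data


workflow = EchoWorkflow()              # 1. instantiate the workflow class
workflow.setup()                       # 2. prepare the workflow for running
result = workflow.inference("hello")   # 3. execute the workflow
```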
Life-Cycle of an Inference Workflow
All workflows implement the following life-cycle methods:
do_setup(self) -> Any
This method is called by the setup method, and only once during the life-cycle of the workflow. It performs any setup operations required before the workflow can run. For workflows that run the model themselves, this is where the model is downloaded and any ahead-of-time verifications are performed.
do_preprocessing(self, input_data: Any) -> Any
This method is the first method called by the inference method. It performs any preprocessing operations on the input data before it is passed to the model. The output of this method is passed to the do_run_model method.
do_run_model(self, preprocessed_data: Any) -> Any
This method is called by the inference method after the do_preprocessing method. This is where the model is run on the preprocessed data. The output of this method is passed to the do_postprocessing method.
do_postprocessing(self, input_data: Any, output_data: Any) -> Any
This method is called by the inference method after the do_run_model method. It performs any postprocessing operations on the output data, and its result is what is returned to the user.
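The life-cycle described above can be sketched end-to-end. Note this is a minimal toy sketch of the dispatch pattern, not the actual library code: it assumes only what is documented here, namely that setup() delegates to do_setup() and that inference() chains do_preprocessing, do_run_model, and do_postprocessing in order. The DoublerWorkflow subclass and its "model" are hypothetical.

```python
from typing import Any


class SketchBaseWorkflow:
    """Toy base class mirroring the documented life-cycle hooks."""

    def setup(self) -> Any:
        # Called once; delegates one-time preparation to do_setup.
        return self.do_setup()

    def inference(self, input_data: Any) -> Any:
        # Chains the three life-cycle hooks in the documented order.
        preprocessed = self.do_preprocessing(input_data)
        output = self.do_run_model(preprocessed)
        return self.do_postprocessing(input_data, output)

    def do_setup(self) -> Any:
        raise NotImplementedError

    def do_preprocessing(self, input_data: Any) -> Any:
        raise NotImplementedError

    def do_run_model(self, preprocessed_data: Any) -> Any:
        raise NotImplementedError

    def do_postprocessing(self, input_data: Any, output_data: Any) -> Any:
        raise NotImplementedError


class DoublerWorkflow(SketchBaseWorkflow):
    """Hypothetical workflow whose 'model' doubles each number."""

    def do_setup(self) -> Any:
        # In a real workflow, model weights would be downloaded here.
        self.factor = 2
        return self

    def do_preprocessing(self, input_data: Any) -> Any:
        # Coerce raw inputs to floats before the model sees them.
        return [float(x) for x in input_data]

    def do_run_model(self, preprocessed_data: Any) -> Any:
        return [self.factor * x for x in preprocessed_data]

    def do_postprocessing(self, input_data: Any, output_data: Any) -> Any:
        # Shape the raw model output into a user-facing result.
        return {"input": input_data, "output": output_data}


wf = DoublerWorkflow()
wf.setup()
result = wf.inference([1, 2, 3])
# result["output"] == [2.0, 4.0, 6.0]
```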
Class Hierarchy
Available Inference Workflows
The following inference workflows are available in the infernet-ml
library:
- HFInferenceClientWorkflow: Uses the Huggingface Inference Client library to run models hosted on Huggingface.
- ONNXInferenceWorkflow: Runs models in the ONNX format.
- TorchInferenceWorkflow: Runs PyTorch models.
- TTSInferenceWorkflow: The base class for all text-to-speech workflows.
- TGIClientInferenceWorkflow: Uses the TGI client to run models hosted on a Huggingface Text Generation Inference server.
- BarkHFInferenceWorkflow: Runs Suno's Bark model.
Training Workflows
Coming soon, stay tuned!