Fast and Efficient JavaScript Inference with ONNX Runtime Web

Table of Contents

  1. Introduction
  2. Setting Up the Template
  3. Exploring the User Interface
  4. Image Preprocessing
  5. Creating the Tensor
  6. Setting Up the Model
  7. Running the Inference Session
  8. Displaying the Results
  9. Additional Configuration
  10. Using the ONNX Runtime Node Package
  11. Conclusion

Introduction

In this article, we will explore how to use ONNX Runtime Web to perform inferencing with JavaScript in the browser. We will be using the ORT Web JavaScript site template built with Next.js, a React framework for building production apps. We will cover the steps to set up the template, preprocess images, create tensors, set up the model, run the inference session, and display the results. Additionally, we will discuss the option of using the ONNX Runtime Node package for server-side inferencing. So let's dive in and learn how to perform inferencing with ONNX Runtime Web!

Setting Up the Template

To get started, we need to install the necessary packages and set up the template. We will be using the ORT Web JavaScript site template available in the Microsoft GitHub repository. We can clone the template locally and install the required packages using npm. Once the template is set up, we can proceed to explore its functionality.
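For reference, a typical setup flow looks something like this; the repository URL below is based on Microsoft's onnxruntime-nextjs-template and should be checked against the current location of the template:

```bash
# Clone the template and install its dependencies.
# (Repository URL assumed -- verify against the Microsoft GitHub org.)
git clone https://github.com/microsoft/onnxruntime-nextjs-template.git
cd onnxruntime-nextjs-template
npm install

# Start the local development server.
npm run dev
```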

Exploring the User Interface

The ORT Web template provides a basic UI that lets us perform inferencing on sample images. The UI displays the image, the confidence percentage, and the inference time. When we click the "Run SqueezeNet Inference" button, the template selects a random image from the provided list and performs inferencing using ONNX Runtime. The UI gives a simple view of the inferencing process, but let's dig into the code to understand how it works under the hood.

Image Preprocessing

To prepare the image for inferencing, the template uses Jimp, a JavaScript image processing library. Jimp provides many transformations, but for this template we only need to resize the image to the model's expected input size. The resized image data is then converted into a tensor to feed into the inference session. The template handles these preprocessing steps to ensure the image is ready for inferencing, roughly as sketched below.
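As a rough sketch of that step (assuming the classic Jimp API and a model that expects 224x224 input), the load-and-resize step might look like this:

```ts
import Jimp from 'jimp';

// Load an image and resize it to the dimensions the model expects;
// 224x224 is typical for SqueezeNet and other ImageNet models.
async function getImageData(path: string, width = 224, height = 224): Promise<Buffer> {
  const image = await Jimp.read(path);
  image.resize(width, height);
  return image.bitmap.data; // raw interleaved RGBA pixel data
}
```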

Creating the Tensor

Once the image data is preprocessed, it needs to be converted into a tensor for inferencing. The template retrieves the RGB values from the processed image data and shapes them into a tensor using the ONNX Runtime Web Tensor object. This tensor is used as the input for the inferencing process.
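A minimal sketch of that conversion, assuming RGBA input from Jimp and a [1, 3, 224, 224] NCHW layout (the real template may also apply model-specific normalization such as ImageNet mean/std):

```ts
import { Tensor } from 'onnxruntime-web';

// De-interleave RGBA pixel data into planar R, G, B channels,
// scale each value to [0, 1], and wrap the result in an ORT Web Tensor.
function imageDataToTensor(data: Buffer, dims: number[] = [1, 3, 224, 224]): Tensor {
  const [, , height, width] = dims;
  const size = height * width;
  const float32Data = new Float32Array(3 * size);
  for (let i = 0; i < size; i++) {
    float32Data[i] = data[i * 4] / 255;                // red plane
    float32Data[i + size] = data[i * 4 + 1] / 255;     // green plane
    float32Data[i + 2 * size] = data[i * 4 + 2] / 255; // blue plane (alpha dropped)
  }
  return new Tensor('float32', float32Data, dims);
}
```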

Setting Up the Model

Before running the inference session, we need to set up the model. The template uses the ONNX Runtime Web InferenceSession API for this purpose. The inference session is created by providing the path to the model and selecting an execution provider. In this template, we are using WebGL as the execution provider, which utilizes the GPU. Alternatively, the template allows using the WebAssembly (WASM) execution provider for CPU-based inferencing.
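With the ORT Web API, session creation looks roughly like this (the model filename is illustrative):

```ts
import { InferenceSession } from 'onnxruntime-web';

// Create the session once and reuse it across inferences.
async function createModelSession(modelPath: string): Promise<InferenceSession> {
  return InferenceSession.create(modelPath, {
    executionProviders: ['webgl'], // GPU; use ['wasm'] for CPU-based inferencing
  });
}

// Usage (path is illustrative):
// const session = await createModelSession('./squeezenet1_1.onnx');
```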

Running the Inference Session

With the model set up, we can now run the inference session. The template initializes the session, sets up the feeds for the model, and executes the run method. The results of the inference session include the predictions and the inference time. The template processes these results to display the top inference results on the UI.
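A sketch of that flow using the ORT Web API (the helper name is illustrative):

```ts
import { InferenceSession, Tensor } from 'onnxruntime-web';

// Run the model on a prepared input tensor and measure the inference time.
async function runInference(session: InferenceSession, input: Tensor) {
  const feeds: Record<string, Tensor> = { [session.inputNames[0]]: input };
  const start = performance.now();
  const results = await session.run(feeds);
  const inferenceTime = performance.now() - start;
  const output = results[session.outputNames[0]]; // raw scores, one per class
  return { output, inferenceTime };
}
```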

Displaying the Results

After the inference session completes, the template displays the top prediction on the UI. It retrieves the highest-scoring result from the predictions and updates the display accordingly, showing the confidence percentage and the label of the predicted class. This straightforward display lets users understand the outcome of the inferencing process at a glance.
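For example, turning raw output scores into a top-1 prediction might look like the following sketch, where imagenetLabels is an assumed lookup table of the 1,000 ImageNet class names:

```ts
// Convert raw logits into probabilities with a numerically stable softmax.
function softmax(logits: Float32Array): number[] {
  let max = -Infinity;
  for (const v of logits) max = Math.max(max, v);
  const exps = Array.from(logits, (v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

// Pick the highest-scoring class and format it for display.
// imagenetLabels is an assumed array of class names.
function topPrediction(logits: Float32Array, imagenetLabels: string[]): string {
  const probs = softmax(logits);
  const top = probs.indexOf(Math.max(...probs));
  return `${imagenetLabels[top]} (${(probs[top] * 100).toFixed(1)}%)`;
}
```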

Additional Configuration

The template comes with some configuration options to support different use cases. The webpack configuration file handles additional settings such as the polyfill plugin required by the Jimp library and the copy plugin for copying the WASM files to the output folder. These configurations ensure the template functions properly and support static site generation.
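A sketch of what that wiring might look like in next.config.js, using the node-polyfill-webpack-plugin and copy-webpack-plugin packages (the exact destination path depends on the template's layout):

```js
// next.config.js (sketch -- paths are illustrative)
const NodePolyfillPlugin = require('node-polyfill-webpack-plugin');
const CopyPlugin = require('copy-webpack-plugin');

module.exports = {
  webpack: (config) => {
    config.plugins.push(
      // Polyfill the Node built-ins that Jimp expects in the browser.
      new NodePolyfillPlugin(),
      // Copy the ORT WASM binaries so the wasm execution provider can load them.
      new CopyPlugin({
        patterns: [
          { from: './node_modules/onnxruntime-web/dist/*.wasm', to: 'static/chunks/[name][ext]' },
        ],
      })
    );
    return config;
  },
};
```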

Using the ONNX Runtime Node Package

If you prefer to perform inferencing on the server with the Node package of ONNX Runtime, the template provides that option as well. By utilizing the onnxruntime-node package, you can set up an API route and perform inferencing on the server. This lets you use the same Next.js template while leveraging the server-side features of ONNX Runtime.
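A minimal sketch of such an API route, assuming the onnxruntime-node package and preprocessed input data posted by the client (the file path, route name, and input shape are illustrative):

```ts
// pages/api/predict.ts -- hypothetical Next.js API route
import type { NextApiRequest, NextApiResponse } from 'next';
import { InferenceSession, Tensor } from 'onnxruntime-node';

let session: InferenceSession | undefined;

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  // Lazily create the session on the first request and cache it.
  if (!session) {
    session = await InferenceSession.create('./model/squeezenet1_1.onnx');
  }
  // Expect preprocessed float data in the POST body.
  const input = new Tensor('float32', Float32Array.from(req.body.data), [1, 3, 224, 224]);
  const results = await session.run({ [session.inputNames[0]]: input });
  const output = results[session.outputNames[0]].data as Float32Array;
  res.status(200).json({ output: Array.from(output) });
}
```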

Conclusion

In this article, we have explored how to use ONNX Runtime Web for inferencing in the browser. We started by setting up the ORT Web JavaScript site template and installing the necessary packages. Then we dug into the code, covering image preprocessing, tensor creation, model setup, and running the inference session. We also discussed additional configuration options and the option of using the ONNX Runtime Node package for server-side inferencing. With this knowledge, you can start using ONNX Runtime Web for your own inferencing projects. Happy coding!

Highlights

  • Learn how to perform inferencing in the browser using ONNX Runtime Web.
  • Set up the ORT Web JavaScript site template with Next.js.
  • Preprocess images and convert them into tensors.
  • Set up the model and run the inference session.
  • Display the results and explore additional configuration options.
  • Use the ONNX Runtime Node package for server-side inferencing.

FAQ

Q: Can I use the ONNX Runtime Web template with any JavaScript framework? A: Yes, you can use the TypeScript files that handle image preprocessing in any JavaScript framework, or even with vanilla JavaScript. The Next.js and React components in the template are only responsible for the UI.

Q: What execution providers are available in the ORT Web template? A: The ORT Web template supports two execution providers: WebGL and WebAssembly (WASM). WebGL utilizes the GPU, while WASM uses the CPU for inferencing.

Q: Is the ORT Web template suitable for static site generation? A: Yes, the ORT Web template provides options for both running the server and generating a static site. You can choose static site generation for a backend-free solution using only HTML, JavaScript, and CSS.

Q: Can I use the ONNX Runtime Node package with the ORT Web template? A: Yes. By setting up an API route using the ONNX Runtime Node package, you can perform inferencing on the server instead of in the browser.

Q: What configurations are available in the webpack config file? A: The webpack config file lets you configure additional settings such as the polyfill plugin for the Jimp library and the copy plugin for copying WASM files to the output folder for static site generation.

Q: Can I use different models with the ORT Web template? A: Yes, the ORT Web template is flexible and supports different models. By modifying the template, you can replace the SqueezeNet model with any other ImageNet model of your choice.
