We have hosted the application AIMET so that it can be run on our online workstations, either with Wine or directly.


Quick description of AIMET:

Qualcomm Innovation Center (QuIC) is at the forefront of enabling low-power inference at the edge through its pioneering model-efficiency research. QuIC's mission is to help migrate the ecosystem toward fixed-point inference. With this goal, QuIC presents the AI Model Efficiency Toolkit (AIMET), a library that provides advanced quantization and compression techniques for trained neural network models. AIMET enables neural networks to run more efficiently on fixed-point AI hardware accelerators. Quantized inference is significantly faster than floating-point inference. For example, models that we've run on the Qualcomm® Hexagon™ DSP rather than on the Qualcomm® Kryo™ CPU have shown a 5x to 15x speedup. In addition, an 8-bit model has a 4x smaller memory footprint than a 32-bit model. However, quantizing a machine learning model (e.g., from 32-bit floating point to an 8-bit fixed-point value) often sacrifices model accuracy.
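
To make the 8-bit versus 32-bit trade-off concrete, here is a minimal, framework-agnostic sketch in plain NumPy (not AIMET's API) of affine quantization of a float32 tensor to int8 and back. The min/max scale and zero-point scheme is just one common convention and is used only for illustration.

```python
import numpy as np

def quantize_int8(x):
    """Affine-quantize a float32 array to int8 (illustrative only)."""
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / 255.0
    if scale == 0.0:
        scale = 1.0  # degenerate case: constant tensor
    zero_point = round(-x_min / scale) - 128  # maps x_min to -128
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float32 array from int8 values."""
    return (q.astype(np.float32) - zero_point) * scale

weights = np.random.randn(64, 128).astype(np.float32)
q, scale, zp = quantize_int8(weights)

print(weights.nbytes // q.nbytes)                        # 4 -> 4x smaller footprint
print(np.abs(weights - dequantize(q, scale, zp)).max())  # small reconstruction error
```

The reconstruction error is bounded by roughly half the quantization step, which is the accuracy loss that AIMET's techniques aim to minimize.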

Features:
  • Equalizes weight tensors to reduce amplitude variation across channels
  • Splits a large layer into two smaller ones using a tensor-decomposition technique
  • Corrects the shift in layer outputs introduced by quantization
  • Removes redundant input channels from a layer and reconstructs the layer weights
  • Uses quantization simulation to train the model further and improve accuracy (see the sketch after this list)
  • Automatically selects how much to compress each layer in the model
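
As a concrete illustration of the quantization-simulation feature, below is a minimal sketch of the calibrate-then-fine-tune workflow built around aimet_torch's QuantizationSimModel. It assumes aimet_torch is installed; the argument names (dummy_input, default_param_bw, default_output_bw) follow the AIMET 1.x documentation and may differ between releases, and the tiny model and random calibration data are placeholders.

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

# Any trained PyTorch model; a tiny stand-in is used here for illustration.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3), torch.nn.ReLU(),
    torch.nn.Flatten(), torch.nn.Linear(8 * 30 * 30, 10)).eval()
dummy_input = torch.randn(1, 3, 32, 32)

# Wrap the model with simulated 8-bit quantization ops.
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)

def calibrate(sim_model, _):
    # Run representative data through the model so AIMET can compute
    # quantization encodings (value ranges) for weights and activations.
    with torch.no_grad():
        for _ in range(8):
            sim_model(torch.randn(4, 3, 32, 32))

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)

# sim.model can now be fine-tuned with an ordinary training loop (QAT)
# to recover accuracy, then exported with its encodings for the target runtime.
sim.export(path='.', filename_prefix='quantized_model', dummy_input=dummy_input)
```

Fine-tuning sim.model rather than the original model lets the weights adapt to the simulated quantization noise, which is what recovers the accuracy typically lost in plain post-training quantization.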


Programming Language: Python.
Categories:
Machine Learning, Neural Network Libraries
