
WHAT SOFTWARE DOES EDGE AI NEED?


Edge AI, or running AI algorithms on-device rather than in the cloud, is a fast-growing field. Insights from Counterpoint Research suggest that by 2028, 54% of all mobile devices will be AI-capable. One driver of this growth is the performance-efficiency gains of the latest generations of Edge AI processors. Another is the huge amount of innovation taking place in the software space.

Typically, an Edge AI algorithm has fewer than 10 billion parameters. Lower parameter counts mean lower computational and memory requirements, making these algorithms a better fit for performance-, power-, and area-constrained Edge AI devices. What is more, smaller models can normally process data faster, reducing the time to first token and providing a smoother user experience. In addition to models with fewer parameters, advances in model compression techniques such as quantisation, pruning and sparsity allow Edge AI models to run more efficiently on mobile devices.
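To make the idea of quantisation concrete, here is a minimal sketch of post-training affine int8 quantisation: floating-point weights are mapped onto 8-bit integers via a scale and zero-point, shrinking storage by roughly 4x at a small cost in precision. The function names and the example weights are illustrative, not taken from any particular framework.

```python
def quantise(weights, num_bits=8):
    """Map float weights onto integers in [0, 2**num_bits - 1]
    using an affine (scale + zero-point) mapping."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against a constant tensor
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantise(q, scale, zero_point):
    """Recover approximate float weights from the quantised values."""
    return [(qi - zero_point) * scale for qi in q]


weights = [-1.5, -0.2, 0.0, 0.7, 1.5]
q, scale, zp = quantise(weights)
recovered = dequantise(q, scale, zp)
# Each recovered weight differs from the original by at most one
# quantisation step (i.e. at most `scale`).
```

In practice, frameworks apply this per-tensor or per-channel and may calibrate the range on representative data rather than using the raw min/max, but the scale/zero-point mapping above is the core of the technique.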

Edge AI covers a diverse range of applications, from defect monitoring in factories to chatbots on smartphones. The underlying algorithms are equally diverse, ranging from various neural network types to newer transformer-based models.

Computer vision is an example of an Edge AI application that recurs in different guises across various industries. In automotive, it is used within the perception element of a vehicle autonomy system, helping the car to understand its environment, or in driver monitoring. In agriculture, it can be used to analyse crops so that water or fertiliser is deployed more precisely. On phones, it supports face-detection security features.

Most computer vision applications use convolutional neural networks (CNNs), a form of deep learning model designed specifically for processing structured grid data, such as images. A CNN starts by detecting features such as edges, textures, and patterns within tiles of the grid, then combines this understanding into a macro-interpretation of the image as a whole, and finally applies high-level reasoning to predict what the image shows. Examples of CNNs commonly used at the edge include Tiny YOLO, a real-time object detection system, and MobileNet, a family of lightweight CNN architectures designed specifically for mobile and embedded vision applications.
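The feature-detection step described above can be illustrated with a toy convolution: sliding a small kernel over an image grid and recording how strongly each neighbourhood matches the pattern the kernel encodes. The kernel below is a hand-written vertical-edge detector; in a real CNN such as MobileNet the kernels are learned during training, and there are many of them per layer.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D convolution of an image grid with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            # Multiply the kernel element-wise with the image patch and sum.
            acc = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(acc)
        out.append(row)
    return out


# 4x4 image: dark left half (0s), bright right half (1s).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# 3x3 vertical-edge kernel (Sobel-like).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = conv2d(image, kernel)
# Every 3x3 window here straddles the dark/bright boundary, so the
# response is uniformly strong: feature_map == [[3, 3], [3, 3]].
```

Stacking many such layers, with learned kernels and non-linearities in between, is what lets a CNN progress from edges and textures to whole-image understanding.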

Generative AI is an Edge AI application that has risen to popularity only in the last couple of years, as model capabilities improved significantly following the shift from earlier neural network approaches to transformer-based architectures. Transformers use self-attention mechanisms to process input data in parallel rather than sequentially, delivering far more life-like results. An example of a transformer-based model that can be deployed at the edge is BERT-Base, a model used for text classification and question answering. Generative AI now has a massive range of applications, including personalising education with tailored resources, accelerating software development through code generation, and improving customer service by handling requests with AI.
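The self-attention mechanism at the heart of transformers can be sketched in a few lines: each token's query is compared against every token's key, the scores are normalised with a softmax, and the result weights a mix of the value vectors. The tiny hand-written matrices below are purely illustrative; real models learn the query/key/value projections and run many attention heads at once.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention: each output row is a weighted mix of
    the value rows, weighted by how well a query matches every key."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, V))
            for j in range(len(V[0]))
        ])
    return out


# Three tokens with two-dimensional embeddings; every token attends to all
# three simultaneously, which is the parallelism referred to above.
Q = K = V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = attention(Q, K, V)
```

Because each output row depends on all tokens at once rather than on a left-to-right recurrence, the whole computation maps naturally onto the parallel hardware found in GPUs and NPUs.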

One of the key challenges in getting these models to deliver value across the vast number of Edge AI devices is software portability. The edge consists of millions of different devices: different form factors, different power supplies, different processor architectures. It takes time for a software developer to get an application running well on a new device, and this time spent on performance tuning is expensive. Solutions are needed that allow developers to take the code they write in popular frameworks such as PyTorch, ONNX and PaddlePaddle and deploy it to any device.

For some edge devices, such as smartphones, high-performance runtime libraries such as LiteRT exist for executing machine learning models across a wide variety of platforms. LiteRT is an open solution, meaning that semiconductor companies like Imagination can drop libraries optimised for their processors into it, allowing developers to maximise GPU utilisation and performance.

For ahead-of-time compilation, Apache TVM is an alternative open-source machine learning compiler framework that can deploy machine learning models across CPUs, GPUs, and specialised accelerators. It can be deployed alongside hardware-specific, low-level libraries and back-end graph compilers for maximum Edge AI performance. 

Edge AI is poised to revolutionise numerous industries by bringing powerful AI capabilities directly to devices. As technology continues to advance, the integration of Edge AI will drive innovation, improve efficiency, and enhance user experiences, making AI an integral part of everyday life. Software is appearing that can deliver exceptional on-device results; the next step for Edge AI software is to continue to develop standards and processes for software portability. 

Here at Imagination, we develop hardware and software solutions for deploying AI across the edge. Find out more about our latest generation of Edge AI technology.