
THE BEST PROCESSORS FOR EDGE AI


The emergence of artificial intelligence is having profound implications for hardware development in every market. Data centres are being built with massive compute resources to handle the demands of training and connected inference. For Edge AI, embedded hardware system designers are dealing with the challenges of delivering AI within power-, memory- and compute-constrained environments.  

Edge AI is essential for applications where connectivity may be limited, latency cannot be tolerated, privacy is needed, or power constraints require efficient local processing. However, for it to be a success, system designers need to unlock a significant uplift in compute performance within the well-established power, memory, and area constraints of edge devices.  

Extracting maximum performance from Edge AI algorithms is, at its core, a parallelisation problem. The computational tasks within an Edge AI model are independent (they do not rely on intermediate results from other tasks), require minimal data exchange during execution, and are decomposable (processing can be split into many identical tasks). They can be parallelised into hundreds or thousands of simultaneous operations to boost performance.  
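The properties above can be made concrete with a minimal sketch. Here, each "task" computes one neuron's output of a small dot-product layer (a hypothetical example, not tied to any specific framework): the tasks are identical, exchange no data, and need no intermediate results from one another, so they can be dispatched in parallel unchanged.

```python
# Minimal sketch: an Edge AI layer as many identical, independent tasks.
# The weights and inputs below are illustrative values only.
from concurrent.futures import ThreadPoolExecutor

weights = [[1, 2], [3, 4], [5, 6]]   # 3 output neurons, 2 inputs each
inputs = [10, 20]

def neuron_output(w):
    # Independent: needs no intermediate result from any other neuron.
    return sum(wi * xi for wi, xi in zip(w, inputs))

# Sequential (CPU-style) execution, one task after the other:
sequential = [neuron_output(w) for w in weights]

# The same tasks dispatched simultaneously -- identical code, no data
# exchange between tasks, so the order of execution does not matter:
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(neuron_output, weights))

assert sequential == parallel == [50, 110, 170]
```

On real parallel hardware such as a GPU, thousands of such tasks run at once rather than the handful of threads shown here; the point is that the decomposition itself requires no change to the per-task code.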

Edge AI systems consist of three distinct kinds of processors: a CPU, a GPU, and an NPU / DSP: 

CPUs are the original processor and are designed to handle operations sequentially, one after the other. They are exceptionally flexible, able to support almost any kind of Edge AI workload, and are backed by an extensive, well-established software ecosystem. However, CPUs are not naturally suited to highly parallel workloads like graphics and AI. Some CPUs introduce a degree of parallelism through, for example, vector extensions; however, this is modest compared to a naturally parallel processor like a GPU or an NPU.  
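The idea behind vector extensions can be sketched in plain Python. A scalar loop handles one element per step, while a vector unit processes a fixed-width lane of elements in a single instruction. This is purely illustrative: the four-wide lane is a hypothetical choice, and real vector units (such as Arm NEON or x86 AVX) do this in hardware.

```python
# Hedged sketch: scalar vs. vector-style execution, modelled in software.
data = list(range(16))

# Scalar: one multiply per iteration -> 16 iterations.
scalar = [x * 2 for x in data]

LANES = 4  # a hypothetical 4-wide vector unit

def vector_mul2(chunk):
    # Stands in for one vector instruction acting on LANES elements at once.
    return [x * 2 for x in chunk]

# Vector-style: four elements per iteration -> only 4 iterations.
vectorised = []
for i in range(0, len(data), LANES):
    vectorised.extend(vector_mul2(data[i:i + LANES]))

assert scalar == vectorised
```

Even so, a handful of lanes per core is small-scale parallelism next to a GPU, which schedules thousands of such operations concurrently.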

NPUs can deliver the best performance for specific, known Edge AI algorithms, and can do so in an exceptionally area- and power-efficient manner. However, they are not easy to program. They depend on a proprietary programming model, have only a small ecosystem of developers (meaning companies using NPUs have only a small pool of talent to recruit from), and support only specific data formats. This poor programmability and inflexible hardware make them difficult for third-party companies to target, and mean that they struggle to support newer Edge AI algorithms. 

The GPU sits in the sweet spot between the CPU and the NPU. It delivers the high performance of parallel processing alongside the programmability of a more general-purpose processor. While it may not match the performance/Watt or performance/mm² of an NPU on known workloads, GPUs are supported by a well-established software ecosystem, handle a wide range of data formats, and can easily adapt to new workloads. 

Find out more about the different processors for Edge AI in our white paper, Getting Real About AI Processors.

New generations of GPUs, like the Imagination E-Series GPU IP, deliver significantly more AI performance than traditional GPUs, further cementing their position as the processor of choice for Edge AI systems.