The rise of GPU compute

“If you were ploughing a field, which would you rather use: two strong oxen or 1024 chickens?” Seymour Cray, the father of supercomputing

GPU compute refers to the current trend of using cores designed for rendering graphics to perform computational tasks usually handled by the CPU. It has had a major impact on the way programmers develop their applications.

The concept implies using GPUs and CPUs together in modern SoCs: the sequential part of the application runs on the CPU, while the data-parallel, computationally intensive part, which is often the more substantial one, is handled by the GPU. Momentum behind the GPU compute model has been building rapidly, as some experts have predicted that GPU compute capability is likely to increase by 500x, while ‘pure’ CPU capability will progress by a more limited 10x.
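The effect of this split on overall run time can be reasoned about with Amdahl's law, a standard model that this article does not itself cite. The sketch below is an illustration only: the function name `overall_speedup` and the example fraction and acceleration factors are assumptions, not measurements of any real SoC.

```python
def overall_speedup(parallel_fraction: float, accel_factor: float) -> float:
    """Amdahl's law: speedup when only the data-parallel part is accelerated.

    parallel_fraction: share of the original run time that is data-parallel
                       (hypothetical, between 0 and 1).
    accel_factor:      how much faster that part runs on the GPU (hypothetical).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    sequential_fraction = 1.0 - parallel_fraction  # this part stays on the CPU
    return 1.0 / (sequential_fraction + parallel_fraction / accel_factor)


if __name__ == "__main__":
    # Illustrative numbers only: 90% of the work is assumed to be data-parallel.
    for accel in (2.0, 10.0, 100.0):
        print(f"GPU part {accel:>5.0f}x faster -> "
              f"system {overall_speedup(0.9, accel):.2f}x faster")
```

With 90% of the work offloaded, the remaining sequential 10% caps the overall gain at 10x however fast the GPU is, which is why the size of the data-parallel portion matters as much as raw GPU throughput.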

Sharing the work in this way lets graphics processors achieve tremendous computational performance while maintaining power efficiency, and at the same time offers the end user an overall system speedup that is transparent, seamless and easy to achieve.

The Need for Speed: a Crash Course in GPU compute APIs

Early on, the hardware was accessible but dedicated software support was lacking, so applications initially attempted to express compute work through the feature set of traditional graphics APIs like OpenGL. This proved to be somewhat inefficient, and a number of solutions have since appeared for the GPU compute programming problem.

Dedicated multi-threaded languages and APIs such as OpenCL™ (driven by Apple at first, but now a widely adopted Khronos standard), DirectX 11.1 (which provides access to the DirectCompute technology) and C for CUDA (Compute Unified Device Architecture) have been driven by key semiconductor and software companies and have become a tangible reality. In the high-performance workstation market, there are FireStream and CUDA-compliant products, although neither of those standards has been ported to the embedded space.

The approach was so successful that the industry started comparing computing systems by FLOPS (floating-point operations per second) rather than by CPU frequency. The graph below shows that most GPUs outclass high-end CPUs by a large margin in theoretical computational capacity.

The theoretical performance of GPUs vs. CPUs (GFLOPS comparison)
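Theoretical figures of this kind are typically derived by multiplying the number of execution units by the clock frequency and by the floating-point operations each unit can issue per cycle. The sketch below shows that arithmetic; the function name `peak_gflops` and both device configurations are invented for illustration and do not describe any real CPU or GPU.

```python
def peak_gflops(units: int, clock_mhz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS = units * clock (GHz) * FLOPs per unit per cycle."""
    return units * (clock_mhz / 1000.0) * flops_per_cycle


if __name__ == "__main__":
    # Hypothetical devices, chosen only to show the arithmetic.
    cpu = peak_gflops(units=4, clock_mhz=3000.0, flops_per_cycle=8)
    gpu = peak_gflops(units=512, clock_mhz=800.0, flops_per_cycle=2)
    print(f"hypothetical CPU: {cpu:.1f} GFLOPS")
    print(f"hypothetical GPU: {gpu:.1f} GFLOPS")
```

A fused multiply-add is conventionally counted as two operations, which is why `flops_per_cycle` is often 2 for a simple scalar unit. Peak figures are upper bounds; sustained performance also depends on memory bandwidth and on how well the workload keeps the units busy.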

The Usual Suspects

Imagination’s PowerVR graphics technologies support all the main APIs now in use for GPU computing. These APIs are presently getting wider deployment, particularly in desktop products but also in embedded systems.

Thanks to the USSE™ (Universal Scalable Shader Engine) present in the PowerVR SGX™ Series5 graphics IP cores and its updated sibling, USSE2™, which arrived with PowerVR Series5XT, Imagination was able to become an early adopter of OpenCL. Both products are currently available on the market and offer advanced capabilities such as round-to-nearest in floating-point mathematics, full 32-bit integer support and 64-bit integer emulation.

These features that enable GPU computing have already been integrated in several popular platforms that can be found in most mobile phones and tablets. By offering the possibility to combine up to sixteen PowerVR SGX cores on a chip, Imagination is able to deliver performance on par with discrete GPU vendors, while still retaining unrivalled power, area and bandwidth efficiency. Because power consumption increases super-linearly with frequency, the PowerVR SGX family achieves high parallelism at low clock frequencies, enabling programmers to write efficient applications that benefit from the OpenCL mobile API ecosystem. This in turn enables advanced applications and parallel computing for imaging and graphics solutions.
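The case for many cores at low clock frequencies can be illustrated with the textbook approximation for dynamic power in CMOS logic, P ≈ C·V²·f. The sketch below additionally assumes that supply voltage scales linearly with clock frequency and that the workload parallelizes perfectly; both are simplifications, and the function names and normalized units are hypothetical rather than PowerVR figures.

```python
def relative_dynamic_power(cores: int, relative_freq: float) -> float:
    """Dynamic power relative to one core at full frequency.

    Uses P ~ C * V^2 * f per core and ASSUMES V scales linearly with f,
    so each core's power scales with relative_freq ** 3.
    """
    return cores * relative_freq ** 3


def relative_throughput(cores: int, relative_freq: float) -> float:
    """Throughput relative to one core at full frequency (ASSUMES ideal scaling)."""
    return cores * relative_freq


if __name__ == "__main__":
    for cores in (1, 2, 4, 8, 16):
        freq = 1.0 / cores  # slow each core down so total throughput stays constant
        print(f"{cores:>2} cores at {freq:.4f}x clock: "
              f"throughput {relative_throughput(cores, freq):.2f}x, "
              f"power {relative_dynamic_power(cores, freq):.4f}x")
```

Under these assumptions, N cores running at 1/N of the clock deliver the same throughput for 1/N² of the dynamic power. Real silicon falls short of this, since voltage cannot be lowered indefinitely and static leakage grows with area, but the direction of the trade-off is the same.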

Here is an example of how PowerVR GPUs can improve both the overall performance and power efficiency of a mobile platform.

PowerVR OpenCL demonstration on the TI OMAP 4 platform

The Godfather, part 6: PowerVR Series6 GPUs

The newly launched PowerVR Series6 IP cores address the problem of achieving optimal general purpose computational throughput while taking into account memory latency and power efficiency. This revolutionary family of GPUs is designed to integrate the graphics and compute functionalities together, optimizing interoperation between the two, both at hardware and software driver levels.

Another very important aspect of PowerVR Series6’s GPU compute capabilities lies in how the graphics core can dramatically improve overall system performance by offloading the CPU. The new family of GPUs offers a multi-tasking, multi-threaded engine that maximizes utilization and compute efficiency through a scalar/wide SIMD execution model and ensures true scalability in performance. This matters because the industry is sending a clear message that the CPU-GPU relationship is no longer based on a master-slave model but on a peer-to-peer communication mechanism.

A heterogeneous computing system

With its design targeting efficiency in the mobile space, the CPU is fundamentally a sequential processor. Therefore, it cannot handle intensive data-plane processing without quickly becoming overloaded and virtually stalling the whole system. As a result, computing architectures need to become heterogeneous systems, with true parallel-core GPUs, like the PowerVR Series6 IP graphics cores, working together with multi-core CPUs and other processing units within the system.

There is an ever-expanding variety of use cases where GPU computing based on PowerVR graphics cores brings great benefits. Examples include image processing (stabilization, correction, improvement, or face detection and beautification tools), multimedia (real-time stabilization, information extraction or superimposition of information), computer vision (augmented reality, edge and feature detection) and general gaming, provided the applications are written with the right approach in mind.
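Edge detection is a convenient illustration of why such workloads suit a GPU: each output pixel depends only on a small neighbourhood of the input, so every pixel can be computed independently. The sketch below expresses this in plain Python, with a per-pixel function standing in for what would be a GPU kernel. The names `gradient_kernel` and `run_over_image` are hypothetical, the gradient filter is a deliberately simple stand-in for a real edge detector, and the loop runs sequentially here purely to show the structure.

```python
from typing import List

Image = List[List[int]]


def gradient_kernel(src: Image, x: int, y: int) -> int:
    """Compute one output pixel: absolute horizontal gradient at (x, y).

    Reads only src and writes nothing shared, so calls for different
    (x, y) are independent of each other.
    """
    width = len(src[0])
    left = src[y][max(x - 1, 0)]            # clamp at the image border
    right = src[y][min(x + 1, width - 1)]
    return abs(right - left)


def run_over_image(src: Image) -> Image:
    """Apply the kernel to every pixel (sequential stand-in for a parallel launch)."""
    height, width = len(src), len(src[0])
    return [[gradient_kernel(src, x, y) for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    # Tiny made-up image: dark on the left, bright on the right.
    image = [
        [0, 0, 255, 255],
        [0, 0, 255, 255],
    ]
    for row in run_over_image(image):
        print(row)
```

On a GPU, the nested loop would be replaced by launching one work-item per pixel, for example through OpenCL, with the body of `gradient_kernel` becoming the kernel itself.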

Want to find out more about our latest and greatest PowerVR cores? If you can’t make it to one of our upcoming technical events or exhibitions, join us online on our YouTube channel and inside the Demo Room or follow @ImaginationTech for more exciting announcements.

Alex Voica

Before deciding to pursue his dream of working in technology marketing, Alexandru held various engineering roles at leading semiconductor companies in Europe. His background also includes research in computer graphics and VR at the School of Advanced Studies Sant'Anna in Pisa. You can follow him on Twitter @alexvoica.
