Deep dive: OpenCL face detection on PowerVR [part 3]

Imagination’s R&D group has developed a face detection algorithm, which is based on a classifier cascade and is optimized to run on mobile devices comprising a CPU and PowerVR GPU. The algorithm employs several optimizations to improve performance and accuracy. In particular, instead of searching each entire frame for faces, the detector limits its search to regions in which faces were previously detected plus a few randomly selected regions. Tracking previously-found faces ensures they are not lost, while testing a variety of other regions ensures that new faces are found quickly.

The main steps performed are illustrated in the figure below:

Figure: Block-level implementation of face detection on CPU and GPU

Source image preprocessing

The preprocessing kernel constructs three temporary images from a single source image:

  1. A mipmap containing multiple versions of the source image at different scales.
  2. A copy of the image in chromatic colour space.
  3. A single-channel image (or probability map) that for each pixel records the probability that the corresponding pixel in the source image has skin colour, calculated by comparing the colour in the chromatic image to the colour of faces detected in previous frames.

The chromatic image and probability map are stored at quarter-resolution, which is sufficient to preserve accuracy while minimising memory and bandwidth requirements.
The preprocessing kernel operates on pixels of the source image in parallel: each work-item processes a separate block of 4×4 pixels and outputs one pixel of the chromatic image and one pixel of the probability map.
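
As a CPU-side illustration of what one work-item computes, the C++ sketch below reduces a single 4×4 block of RGB pixels to one chromatic value and one skin probability. The article does not say which chromatic colour space or probability model the kernel uses, so the normalised r-g chromaticity formula and the Gaussian skin model here are assumptions, and the names `Rgb`, `Chroma`, `SkinModel`, `blockChroma` and `skinProbability` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };
struct Chroma { float r, g; };  // normalised chromaticity (assumed colour space)

// Hypothetical skin model: mean and spread of chromaticity, which would be
// updated from the faces detected in previous frames.
struct SkinModel { Chroma mean; Chroma sigma; };

// Average the 4x4 block whose top-left corner is (bx*4, by*4) and convert the
// average to chromaticity: r = R/(R+G+B), g = G/(R+G+B).
Chroma blockChroma(const std::vector<Rgb>& img, std::size_t width,
                   std::size_t bx, std::size_t by) {
    assert(width % 4 == 0 && img.size() % width == 0);
    assert((by * 4 + 3) * width + bx * 4 + 3 < img.size());
    unsigned sumR = 0, sumG = 0, sumB = 0;
    for (std::size_t y = 0; y < 4; ++y) {
        for (std::size_t x = 0; x < 4; ++x) {
            const Rgb& p = img[(by * 4 + y) * width + bx * 4 + x];
            sumR += p.r; sumG += p.g; sumB += p.b;
        }
    }
    const float total = static_cast<float>(sumR + sumG + sumB);
    if (total == 0.0f) return {1.0f / 3.0f, 1.0f / 3.0f};  // black block: neutral
    return {sumR / total, sumG / total};
}

// Unnormalised Gaussian likelihood in (0, 1]; 1 means the colour matches the
// model mean exactly.
float skinProbability(const Chroma& c, const SkinModel& m) {
    const float dr = (c.r - m.mean.r) / m.sigma.r;
    const float dg = (c.g - m.mean.g) / m.sigma.g;
    return std::exp(-0.5f * (dr * dr + dg * dg));
}
```

In an OpenCL kernel the two loops would sit inside a single work-item, with `bx` and `by` taken from the work-item's global ID.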

Tile generation

To facilitate parallel processing, the source image is divided into multiple tiles that can be processed independently on separate GPU clusters. Each tile is described by an integral image, which simplifies the computation of Haar-like features.
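
To show why an integral image helps, here is a minimal C++ sketch of the standard technique: once the running sums are built, the sum of any rectangle costs four lookups, and a two-rectangle Haar-like feature costs two such sums. The type and function names (`IntegralImage`, `rectSum`, `haarEdgeHorizontal`) are hypothetical and the layout is not necessarily the one used in Imagination's kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Integral image with a one-pixel border of zeros: entry (x, y) holds the sum
// of all source pixels strictly above and to the left of (x, y).
struct IntegralImage {
    std::size_t width, height;        // dimensions of the source tile
    std::vector<std::uint32_t> sums;  // (width + 1) * (height + 1) entries

    IntegralImage(const std::vector<std::uint8_t>& src, std::size_t w, std::size_t h)
        : width(w), height(h), sums((w + 1) * (h + 1), 0) {
        assert(src.size() == w * h);
        for (std::size_t y = 0; y < h; ++y) {
            std::uint32_t rowSum = 0;
            for (std::size_t x = 0; x < w; ++x) {
                rowSum += src[y * w + x];
                sums[(y + 1) * (w + 1) + (x + 1)] = sums[y * (w + 1) + (x + 1)] + rowSum;
            }
        }
    }

    // Sum of the rectangle [x, x + w) x [y, y + h) using four lookups.
    std::uint32_t rectSum(std::size_t x, std::size_t y, std::size_t w, std::size_t h) const {
        assert(x + w <= width && y + h <= height);
        const std::size_t stride = width + 1;
        return sums[(y + h) * stride + (x + w)] - sums[y * stride + (x + w)]
             - sums[(y + h) * stride + x] + sums[y * stride + x];
    }
};

// Two-rectangle (edge) Haar-like feature: left half minus right half.
std::int64_t haarEdgeHorizontal(const IntegralImage& ii, std::size_t x, std::size_t y,
                                std::size_t w, std::size_t h) {
    assert(w % 2 == 0);
    const std::size_t half = w / 2;
    return static_cast<std::int64_t>(ii.rectSum(x, y, half, h))
         - static_cast<std::int64_t>(ii.rectSum(x + half, y, half, h));
}
```

The cost of a feature is independent of its size, which is what makes evaluating many windows at many scales affordable.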

Cascade classification

The cascade classifier limits its search to three kinds of region: the vicinity of any faces detected in the previous frame, skin-coloured areas identified by thresholding the probability map, and regions selected by the random candidate generator.
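
The three sources of candidates can be sketched in C++ as a single list-building step. This is an illustration only: the `Window` type, the `gatherCandidates` function, the quarter-size shift used for "vicinity" and the uniform random generator are all assumptions, since the article does not describe how the real candidate generator is parameterised.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

struct Window { int x, y, size; };  // square search window, top-left corner

// Keep a window fully inside the image.
Window clampToImage(Window w, int imageWidth, int imageHeight) {
    w.x = std::max(0, std::min(w.x, imageWidth - w.size));
    w.y = std::max(0, std::min(w.y, imageHeight - w.size));
    return w;
}

std::vector<Window> gatherCandidates(const std::vector<Window>& previousFaces,
                                     const std::vector<Window>& skinRegions,
                                     int imageWidth, int imageHeight,
                                     int randomCount, int randomSize,
                                     std::mt19937& rng) {
    assert(randomCount >= 0);
    assert(randomSize > 0 && randomSize <= imageWidth && randomSize <= imageHeight);
    std::vector<Window> out;

    // 1. Previous faces and their vicinity: the face itself plus four copies
    //    shifted by a quarter of its size (the shift is an assumption).
    for (const Window& f : previousFaces) {
        const int step = std::max(1, f.size / 4);
        const int dx[] = {0, -step, step, 0, 0};
        const int dy[] = {0, 0, 0, -step, step};
        for (int i = 0; i < 5; ++i) {
            out.push_back(clampToImage({f.x + dx[i], f.y + dy[i], f.size},
                                       imageWidth, imageHeight));
        }
    }

    // 2. Skin-coloured regions reported by the skin region detector.
    for (const Window& s : skinRegions) {
        out.push_back(clampToImage(s, imageWidth, imageHeight));
    }

    // 3. A few uniformly random windows so that new faces are found quickly.
    std::uniform_int_distribution<int> distX(0, imageWidth - randomSize);
    std::uniform_int_distribution<int> distY(0, imageHeight - randomSize);
    for (int i = 0; i < randomCount; ++i) {
        const int x = distX(rng);
        const int y = distY(rng);
        out.push_back({x, y, randomSize});
    }
    return out;
}
```

The resulting list is far shorter than an exhaustive sliding-window scan of the frame, which is the point of restricting the search.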

Unlike the sequential sliding-window approach used on a CPU, GPU work-items can evaluate multiple windows in parallel. A property of the algorithm is that some evaluations complete much sooner than others: each window requires anywhere from one to one hundred stages of computation. To maintain parallelism, a work-item that finishes evaluating one window immediately starts evaluating another.
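
The early-exit behaviour and the "take another window when done" scheduling can be sketched in C++ as follows. The shared atomic counter is one common way to hand out work and is an assumption here; the article does not say how the OpenCL kernel assigns windows to work-items. `Stage`, `evaluateCascade` and `worker` are hypothetical names, and the `score` callback stands in for the summed Haar-like features of a real stage.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Window { int x, y, size; };

// One cascade stage: a window is rejected as soon as its score for a stage
// falls below that stage's threshold.
struct Stage {
    std::function<float(const Window&)> score;  // stand-in for summed Haar features
    float threshold;
};

// Returns the number of stages actually evaluated and sets `isFace`.
std::size_t evaluateCascade(const std::vector<Stage>& cascade, const Window& w,
                            bool& isFace) {
    std::size_t evaluated = 0;
    for (const Stage& s : cascade) {
        ++evaluated;
        if (s.score(w) < s.threshold) {  // early exit: most windows stop here
            isFace = false;
            return evaluated;
        }
    }
    isFace = true;
    return evaluated;
}

// Body of one worker (the analogue of a work-item): keep claiming the next
// unprocessed window from a shared counter until none remain.
void worker(const std::vector<Stage>& cascade, const std::vector<Window>& windows,
            std::atomic<std::size_t>& next, std::vector<char>& results) {
    assert(results.size() == windows.size());
    for (;;) {
        const std::size_t i = next.fetch_add(1, std::memory_order_relaxed);
        if (i >= windows.size()) return;
        bool isFace = false;
        evaluateCascade(cascade, windows[i], isFace);
        results[i] = isFace ? 1 : 0;
    }
}
```

Because a worker that rejects a window after one stage immediately claims another, workers stay busy even though individual windows take very different amounts of time.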

Find regions with skin colour

The skin region detector finds areas of the probability map that have high probability, passing these coordinates to the cascade classifier.
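
A minimal C++ sketch of this thresholding step is shown below. It assumes the probability map is a row-major array of floats and that a scale factor of 4 maps its coordinates back to the source image (inferred from the one-output-pixel-per-4×4-block preprocessing described earlier); `Point` and `findSkinRegions` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { int x, y; };

// Return source-image coordinates of every probability-map pixel whose value
// is at least `threshold`. `scale` converts map coordinates back to source
// coordinates.
std::vector<Point> findSkinRegions(const std::vector<float>& probMap,
                                   std::size_t mapWidth, std::size_t mapHeight,
                                   float threshold, int scale) {
    assert(probMap.size() == mapWidth * mapHeight);
    std::vector<Point> hits;
    for (std::size_t y = 0; y < mapHeight; ++y) {
        for (std::size_t x = 0; x < mapWidth; ++x) {
            if (probMap[y * mapWidth + x] >= threshold) {
                hits.push_back({static_cast<int>(x) * scale,
                                static_cast<int>(y) * scale});
            }
        }
    }
    return hits;
}
```

A production detector would likely merge neighbouring hits into larger regions before handing them to the classifier; that step is omitted here.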

Zero-copy implementation

The CPU code is implemented in C++ and the GPU kernels are implemented in OpenCL. As shown in the diagram below, an Android demonstration application is created using the PowerVR imaging framework (introduced in a previous article in this series). This framework enables the face detection algorithm to be efficiently pipelined across the ISP, GPU and CPU, making use of shared zero-copy memory and cache allocations that minimize synchronization overheads.

Figure: Creating an Android app using the PowerVR imaging framework

When integrated into an application based on the PowerVR Imaging Framework SDK, Imagination’s optimized face detection algorithm can detect up to four faces in real time at 1080p and 30fps on a two-cluster GPU clocked at 200MHz. This leaves plenty of headroom to add other tasks to the software pipeline, such as image stabilization beforehand and beautification afterwards, while still achieving 1080p30 performance on many mobile and tablet products available today.
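
To put those figures in perspective, the short C++ sketch below works out the clock-cycle budget they imply. It is plain arithmetic on the numbers quoted above (1920×1080 pixels, 30 frames per second, 200MHz); the function names are hypothetical, and the result is a count of clock cycles, not of arithmetic operations, since each cluster executes many operations per cycle.

```cpp
#include <cassert>
#include <cstdint>

// GPU clock cycles available per frame at a given frame rate.
std::int64_t cyclesPerFrame(std::int64_t clockHz, std::int64_t fps) {
    assert(clockHz > 0 && fps > 0);
    return clockHz / fps;
}

// GPU clock cycles available per source pixel.
double cyclesPerPixel(std::int64_t clockHz, std::int64_t width,
                      std::int64_t height, std::int64_t fps) {
    assert(clockHz > 0 && width > 0 && height > 0 && fps > 0);
    return static_cast<double>(clockHz) /
           static_cast<double>(width * height * fps);
}
```

Calling `cyclesPerFrame(200000000, 30)` gives about 6.7 million cycles per frame, and `cyclesPerPixel(200000000, 1920, 1080, 30)` gives roughly 3.2 cycles per pixel. A budget that tight helps explain why the detector restricts its search to a small set of candidate regions instead of scanning every window in every frame.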

Concluding remarks

Imagination’s hardware portfolio enables silicon vendors to create devices that deliver best-in-class performance while operating under a tight power and thermal envelope. Its PowerVR GPUs provide the performance and flexibility needed to accelerate both graphics and data-parallel computations across many mobile and embedded devices in the market today.

By pairing Imagination hardware with the PowerVR Imaging Framework, designers can now harness the performance available across their target SoC, including the GPU, ISP, CPU, video codecs and hardware accelerators. Imagination’s close collaboration with strategic OEMs (and in some cases their third-party software partners) has already helped deliver new computational photography and computer vision use cases to market that intelligently distribute the required computations across the available heterogeneous hardware components.

Please let us know if you have any feedback on the materials published on the blog and leave a comment on what you’d like to see next. Make sure you also follow us on Twitter (@ImaginationTech, @GPUCompute and @PowerVRInsider) for more news and announcements from Imagination.

Alex Kelley

Alex Kelley has over 20 years of sales, marketing, and general management experience in 3D computer graphics and has worked in the USA and several countries in Asia. Since joining Imagination Technologies, Alex has launched the Visualizer brand, which has most recently brought a photorealistic virtual camera to SketchUp, transforming the way people view 3D models. Alex was a Vice President at Caustic Graphics, a startup acquired by Imagination, and before that held Vice President roles at Autodesk and Alias. Alex is fluent in Japanese and holds B.S. and M.S. degrees in Computer Science from Arizona State University.
