OpenVX 1.1 with Convolutional Neural Network extension in action

In December last year, Imagination announced that we were the first to submit an OpenVX 1.1 conformant implementation. In this blog post, we show how that work has developed since then: one of the first implementations of the Khronos OpenVX 1.1 API, along with the very first implementation of the accompanying Convolutional Neural Network (CNN) extension.

First, a bit of background. OpenVX is an API developed by the Khronos Group, of which Imagination Technologies is an active promoter member. The new API gives developers a standard way to write applications that apply vision operations to an image efficiently. Examples of these operations are “edge detection” and “thresholding”.

The OpenVX API enables users to chain these operations together into a graph for purposes such as detecting corners in an image or adjusting an image’s perspective. This makes developing applications quicker and easier than implementing them in another API such as OpenCL or Compute, which are not targeted at vision operations. Our implementation runs on OpenCL-capable PowerVR GPUs.
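To make the idea of chaining operations concrete, here is a minimal sketch in plain Python. It is not the OpenVX API: the `Graph` class, `add_node`, `process`, and both operation functions are hypothetical names chosen for illustration, and the image is just a list of pixel rows.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale image, one list of pixel values per row


def threshold(image: Image, level: int = 128) -> Image:
    """Binary threshold: pixels above `level` become 255, all others 0."""
    return [[255 if px > level else 0 for px in row] for row in image]


def horizontal_edges(image: Image) -> Image:
    """Mark pixels whose right-hand neighbour has a different value."""
    out = []
    for row in image:
        edges = [255 if row[x] != row[x + 1] else 0 for x in range(len(row) - 1)]
        out.append(edges + [0])  # the last column has no right-hand neighbour
    return out


class Graph:
    """Hypothetical stand-in for a vision graph: an ordered chain of nodes."""

    def __init__(self) -> None:
        self.nodes: List[Callable[[Image], Image]] = []

    def add_node(self, node: Callable[[Image], Image]) -> "Graph":
        self.nodes.append(node)
        return self

    def process(self, image: Image) -> Image:
        # Feed the output of each node into the next one.
        for node in self.nodes:
            image = node(image)
        return image


graph = Graph().add_node(threshold).add_node(horizontal_edges)
result = graph.process([[10, 20, 200, 210], [10, 200, 210, 220]])
print(result)
```

The application only describes which operations to run and in what order. In a real OpenVX implementation, that description is what allows the runtime to decide how to execute the graph on the target hardware.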

Read this blog post to find out more information on how Imagination created the first conformant implementation of OpenVX 1.1.

One advantage of this API for developers is that an application written for one OpenVX implementation should work on any other implementation. For example, there could be an implementation that runs on the CPU of a system without a GPU. Because both implementations expose the same standard API, the application should not need to be modified to run on either platform.

CNN extension

Another first is our implementation of the CNN extension for OpenVX.

What is a CNN? CNN stands for ‘Convolutional Neural Network’, a type of machine learning model. CNNs are used in many different areas, one of which is image recognition. Imagination has implemented this extension for OpenVX, and we have created a demo to show the feature running on a PowerVR GPU.

First, we use standard vision operations in OpenVX to get the bounding box of the image that the user has drawn. Then we use a simple LeNet CNN graph to estimate what the input image represents. In this demo, we recognise numerical digits, and the network has been trained on the MNIST set of handwritten digits.
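The bounding-box step can be sketched in a few lines of plain Python. This is an illustration only, not the demo’s code and not the OpenVX API: the function names `bounding_box` and `crop` and the tiny `canvas` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), all inclusive


def bounding_box(image: List[List[int]]) -> Optional[Box]:
    """Return the smallest box containing every non-zero pixel, or None if blank."""
    rows = [y for y, row in enumerate(image) if any(row)]
    cols = [x for row in image for x, px in enumerate(row) if px]
    if not rows:
        return None
    return min(cols), rows[0], max(cols), rows[-1]


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the boxed region out of the image."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A tiny hand-made "drawing": 1 marks ink, 0 marks empty canvas.
canvas = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(canvas)
digit = crop(canvas, box)
print(box, digit)
```

Cropping to the drawn region means the network sees the digit itself rather than a mostly empty canvas, whatever part of the canvas the user happened to draw on.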

Of course, we could extend this to recognise a variety of things. For example, the same approach could be used to build an application that performs voice recognition in real time. Compared to running on the CPU, running on the GPU means better efficiency and longer battery life on mobile. Compared to an application doing the same thing in OpenCL, this application should be portable between implementations. Using OpenVX, we were able to implement this demonstration in relatively few lines of code and with only tens of man-hours of effort, compared with the hundreds or thousands it would take to develop from scratch.

Our graph is composed of a very simple set of operations so far, but this can be extended to more complicated graphs and to deep learning algorithms. This demo is currently running on general-purpose hardware, but we can envisage an easy step forward where we use dedicated hardware to streamline certain operations. This would further improve performance and efficiency compared to running on general-purpose processors, and would let us move more algorithms from the cloud to the mobile device. Doing this would greatly reduce latency and remove the need for a network connection.

OpenVX-image 3

In the above image, you can see we have drawn a very tall and thin ‘8’ character. You can see that the ‘1’ and ‘8’ nodes both have high activation. If we continued to make this ‘8’ character taller and thinner, the graph would break down and incorrectly label the result as a ‘1’. We could avoid this by modifying the graph to learn about these edge cases. However, we have kept this demo simple, as it is intended to show the OpenVX 1.1 API and the extension we have implemented.
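The situation where two output nodes compete can be illustrated with a short Python sketch. It assumes the network ends in a softmax over ten raw scores, which is common for LeNet-style classifiers but is not confirmed here for the demo. The scores below are invented for illustration and are not measurements.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities: List[float], k: int = 2) -> List[Tuple[int, float]]:
    """Return the k most likely (digit, probability) pairs, best first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical raw output scores for the digits 0-9; not taken from the demo.
scores = [0.1, 3.9, 0.2, 0.3, 0.1, 0.2, 0.4, 0.5, 4.1, 0.3]
probabilities = softmax(scores)
best, runner_up = top_k(probabilities)
print(f"best guess: {best[0]}, runner-up: {runner_up[0]}")
```

With scores this close, a small change in the input, such as a slightly thinner stroke, is enough to swap the order of the two candidates.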

Above is a video of the demo in action. You’ll notice that we draw the number four as a closed digit, rather than as a two-stroke ‘open’ four. This is because at present the graph does not recognise the ‘4’ character well. This is likely to be because the training data set uses other ways of drawing this digit, such as the one below. If we extended the data set to include these ways of writing the digit, the network would become more robust.

OpenVX number 4

We are showing this demo in person at the Embedded Vision Summit in California from 1st to 3rd May 2017. Come and join us to talk about how we can help you with your vision or machine learning problem. Imagination’s Paul Brasnett will also be talking at the summit on the subject of ‘Training CNNs for Efficient Inference’.

