Sunday, 14 December 2014

SuperComputing goes Embedded

Graphics Processors, or more specifically General Purpose Graphics Processing Units (GPGPUs), have been steadily making inroads into the SuperComputer market over the last 5 years or so. Their high rate of floating point performance coupled with lower power requirements driving the shift from CPU to GPU cores. The graph below from NVIDIA demonstrates this trend.


There are only two major players in the GPGPU market, AMD and NVIDIA, with NVIDIA seen as the market leader, particularly with their Kepler architecture.

Earlier this year NVIDIA made an announcement on a breakthrough in Embedded System-on-a-Chip (SoC) design with the Tegra K1. The diagram below outlines the Tegra K1 architecture.


Essentially an ARM A15 2.3GHz 4+1 core CPU mated with a 192 core Kepler GPU providing an amazing ~350 GFLOPS of compute at < 10W of power.

Obviously NVIDIA have an eye on the mobile gaming market, building on their Shield strategy, but equally recognise this step change in GFLOPS/W opens up major opportunities in the embedded market. This ranges from real-time vision and computation for autonomous cars, to advanced imaging applications for defence applications, UAVs in particular. In fact, General Electric Intelligent Platforms have signed a deal with NVIDIA to license the Tegra K1 SoC for their next generation embedded vehicle computing and avionics systems.

But, the really great thing about the Tegra K1 is that NVIDIA have released a development board call the Jetson K1, which retails for an amazing $192 in the US.

I've purchased one myself and have started to get to grips with the challenges of CUDA programming. Once you've mastered the conceptual shift to parallel programming with CUDA, then it starts to become relatively straightforward to develop algorithms and computations that take advantage of the GPU.

If you want to find out more detail on the Jetson K1, then I'd recommend visiting the Jetson page on eLinux.org.