Libretro

RetroArch – Hardware video decoding – coming soon!

As you may well know, RetroArch has embedded video player support on platforms such as Windows and Linux. Just like VLC, Kodi, mpv and other video players out there, it accomplishes this by leveraging the ffmpeg project.

Up until now, all video decoding was performed entirely in software. This meant that the CPU had to do all the decoding instead of being able to delegate it to the GPU. As a result, video playback could be too slow on systems with an underpowered CPU. This happens to be the case on many ARM SoC devices out there, such as the Raspberry Pi and Odroid boards.

Now, we finally support hardware video decoding through ffmpeg’s own APIs! This should really help on systems where the CPU is the bottleneck and the GPU supports hardware decoding. However, whether or not you are able to decode 1080p, 1440p or 4K in hardware depends entirely on your GPU’s capabilities.
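If you are curious which hardware acceleration methods an ffmpeg build on your system was compiled with, the `ffmpeg` command-line tool can list them via its `-hwaccels` option. Below is a minimal Python sketch that wraps that call. Note the assumptions: the helper names `parse_hwaccels` and `list_hwaccels` are ours for illustration, an `ffmpeg` binary must be on your PATH, and that binary may be built differently from the ffmpeg libraries RetroArch itself links against. A method appearing in the list also does not guarantee that your GPU can decode a given codec or resolution.

```python
import shutil
import subprocess
from typing import List


def parse_hwaccels(output: str) -> List[str]:
    """Extract method names from the text printed by `ffmpeg -hwaccels`.

    The tool prints a "Hardware acceleration methods:" header followed by
    one method name per line.
    """
    names = []
    seen_header = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if not seen_header:
            if line.lower().startswith("hardware acceleration methods"):
                seen_header = True
            continue
        if line:
            names.append(line)
    return names


def list_hwaccels(ffmpeg: str = "ffmpeg") -> List[str]:
    """Return the hwaccel methods reported by the ffmpeg binary.

    Returns an empty list if the binary is missing or cannot be run.
    """
    exe = shutil.which(ffmpeg)
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "-hide_banner", "-hwaccels"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    return parse_hwaccels(result.stdout)


if __name__ == "__main__":
    methods = list_hwaccels()
    print("Reported hwaccel methods:", ", ".join(methods) or "(none found)")
```

On a Windows build you might see entries such as `dxva2` or `d3d11va`, and on Linux `vaapi` or `vdpau`, depending on how that particular binary was configured.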

In addition to hardware decoding, frame-based multithreading is now enabled for software (SW) video decoders, although its actual effectiveness hasn’t been proven yet.

The core switches back to SW-based decoding if HW-based decoding cannot be initialized.
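The fallback behaviour described above can be sketched in a few lines. This is only an illustration of the control flow, not the core's actual code: `open_decoder`, `init_hw` and `init_sw` are hypothetical names, and trying several hardware backends in order before giving up is our assumption for the sake of the example.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def open_decoder(
    hw_backends: Iterable[str],
    init_hw: Callable[[str], Any],
    init_sw: Callable[[], Any],
) -> Tuple[Any, Optional[str]]:
    """Try each hardware backend in turn, then fall back to software.

    init_hw(backend) is expected to raise RuntimeError when the backend
    cannot be initialized. Returns (decoder, backend_name), where
    backend_name is None if software decoding was selected.
    """
    for backend in hw_backends:
        try:
            decoder = init_hw(backend)
        except RuntimeError:
            # This backend is unavailable; try the next one.
            continue
        return decoder, backend
    # No hardware backend could be initialized: use the software decoder.
    return init_sw(), None


if __name__ == "__main__":
    # Stand-in initializers: pretend only "dxva2" works on this machine.
    def fake_init_hw(backend: str) -> str:
        if backend != "dxva2":
            raise RuntimeError("cannot initialize " + backend)
        return "hw-decoder(" + backend + ")"

    def fake_init_sw() -> str:
        return "sw-decoder"

    print(open_decoder(["vaapi", "dxva2"], fake_init_hw, fake_init_sw))
    print(open_decoder(["vaapi"], fake_init_hw, fake_init_sw))
```

The point of this shape is that a failed hardware initialization is never fatal: playback always ends up with some working decoder.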

The following backends have been tested:

We have performed the following tests so far:

As a stress test video, we picked a 4K video (3840×2160) with a total bitrate of 29561 kb/s (h264/AVC1, YUV420P), running at 30 frames per second. The CPU we’re using for this test is an Intel Core i7-7700K. With such a CPU, we don’t really have a CPU bottleneck, so we are only GPU bound when it comes to rendering the content.

With software decoding (the current default in RetroArch), we averaged around 55fps with the 2080 Ti. CPU load averaged around 15% and GPU load around 11%.

With hardware decoding (the 2080 Ti defaults to DXVA2 for this test), we averaged 77fps with the 2080 Ti. CPU load averaged around 11% and GPU load around 20%.

NOTE: The above is long since out of date. The same video now runs at 256fps with hardware decoding and at 224fps with threaded video decoding using an automatically chosen number of threads. Quite the improvement from 55fps, I’m sure you’ll agree.
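To put those figures in perspective, here is a small Python sketch that works out the speedup of each result relative to the original 55fps software baseline, and how much headroom each leaves over the video's native 30 frames per second. The numbers are taken straight from this post; the names `RESULTS_FPS`, `speedup` and `headroom` are only for illustration, and we are assuming all four results are comparable (same video and test machine).

```python
# Frame rates reported in this post for the 4K, 30fps test video.
NATIVE_FPS = 30
BASELINE_FPS = 55  # original software decoding result

RESULTS_FPS = {
    "software (original)": 55,
    "hardware, DXVA2 (original)": 77,
    "hardware (updated)": 256,
    "threaded software (updated)": 224,
}


def speedup(fps: float, baseline: float = BASELINE_FPS) -> float:
    """How many times faster than the original software result."""
    return fps / baseline


def headroom(fps: float, native: float = NATIVE_FPS) -> float:
    """How many times faster than needed for real-time playback."""
    return fps / native


if __name__ == "__main__":
    for name, fps in RESULTS_FPS.items():
        print(
            f"{name}: {fps}fps, "
            f"{speedup(fps):.2f}x baseline, "
            f"{headroom(fps):.2f}x real time"
        )
```

Since the video only needs 30 frames per second for smooth playback, anything above 1.0x real time is spare capacity on this particular machine. That spare capacity matters far more on weaker hardware.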

What remains to be done

We will still need to gather tests for the following backends:

Future plans

In short, we hope this will really help out RetroArch’s video playback capabilities not only on desktop platforms such as Windows and Linux, but also on ARM SoCs, and specifically on our own Linux distribution, Lakka.

But hardware video decoding is not the be-all and end-all. There is certainly a lot of room for further speedups, and these are being investigated. That’s the subject of another blog post somewhere down the line.

For now, rest assured that big things are coming up for the next version of RetroArch!
