A quick rant about web performance

It was surprisingly easy finishing up my RetroArch port to the web, thanks to Emscripten and all the hard work that went into it. And making it easier for programs to get on the web is a good thing. However, during my journey, I ran into more than a couple issues and learned more than a couple discouraging facts that make me think twice about the great future of full-blown programs on the web.

A lot of the tech is still very very young

Most of the big tech in web apps is new. Stuff like WebGL, Web Audio, and even Emscripten itself is still in its growing stages, and it shows. You would never expect to run into a bug with a platform’s standard C library, but that was exactly an issue I had to debug and fix. (Note to future C library developers: while isprint and isgraph sound similar, they are not the same.) Even when these technologies do work, they don’t always work the same way across browsers. Right now my port has to resort to a hack to get Firefox working, because currentTime only updates when the Javascript event loop is idle. Mozilla claims this is the correct behavior, yet Chrome goes ahead and updates continuously. Who is right? The spec doesn’t say with certainty which way is correct, so until this is expanded upon, you will run into these issues. (BTW, if anyone from Mozilla happens to read this, I would really like you guys to adopt the Chrome behavior.)
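For anyone wondering what the isprint/isgraph difference actually is: in the C locale, isprint accepts every printing character including the space, while isgraph accepts the same set minus the space. Here is a minimal Python sketch of those two ASCII ranges. It is not the Emscripten libc code, and the function names simply mirror the C ones for illustration.

```python
def isprint(code):
    """C-locale isprint: printing characters, space included (0x20-0x7E)."""
    return 0x20 <= code <= 0x7E


def isgraph(code):
    """C-locale isgraph: printing characters, space excluded (0x21-0x7E)."""
    return 0x21 <= code <= 0x7E


# Collect every ASCII character on which the two predicates disagree.
differing = [chr(c) for c in range(128) if isprint(c) != isgraph(c)]
print(differing)
```

The only character the two disagree on is the space, which is exactly the kind of thing that is easy to get wrong when implementing one in terms of the other.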

Doing things the “not-Web” way is bad, but it works better

One thing I learned very quickly: blocking the event loop is bad. Very very very VERY bad. It’s so bad. Never do it. No reason to ever do it. But it’s the only way to do some things.

RetroArch, being an emulator, has very different audio/video sync requirements than something like a game or a video. The audio and video the emulator spits out must get played immediately, or you get horrible stuff like input lag or audio/video desync. One way to combat this is to “block” on audio: make a buffer for audio and fill it up. If it’s full, wait until the audio API empties it a bit, then fill it up again and go on your way. If you’re on native Linux and using something like ALSA, that functionality is built into the API and works just fine, and for something like OpenAL that doesn’t have it, you can simulate it with busy loops. However, every time I brought that up in a bug report I was immediately shot down. “Busy-waiting in Javascript is a big no-no.” And it is, when you let it get into an infinite loop. However, busy-waiting is required to do blocking audio with any sort of good performance on the Web Audio API. Web Audio has no blocking features, so you have to use busy-loops. I tried using some of the other interfaces I had to avoid this, like some callback-based ones, but this leads to…

You are at the mercy of the event loop

The event loop is 100% unpredictable. It also must be hit to do anything involving input/output. Video needs it to display, input needs it to capture events, and audio needs it to play. I tried to implement an audio callback method to avoid blocking on audio, but the callback fired at intervals that were in no way predictable, anywhere from one to five times every 20 milliseconds. On top of that, relying on requestAnimationFrame is spotty at best. Hoping to get a stable 60FPS off of it is an exercise in futility. Most of these issues are not issues for something like a video or a normal game: audio can be queued and you can play individual samples instead of streaming. For emulators, there is no alternative, and the tools the web platform has right now do not help. They have to be worked around to get a program that performs well.
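The workaround in question, blocking on audio with a busy loop, can be sketched roughly as follows. This is a toy model in Python rather than the port’s actual code: FakeAudioDevice, its tick-based drain, and blocking_write are all hypothetical names made up for the illustration, and none of this is the Web Audio API.

```python
from collections import deque


class FakeAudioDevice:
    """Hypothetical stand-in for an audio API: plays a fixed number of samples per tick."""

    def __init__(self, capacity, drain_per_tick):
        self.capacity = capacity            # maximum samples the buffer can hold
        self.drain_per_tick = drain_per_tick
        self.queue = deque()                # samples waiting to be played
        self.played = []                    # samples already consumed, in order

    def free_space(self):
        return self.capacity - len(self.queue)

    def tick(self):
        """Let one unit of time pass; the device consumes what it can."""
        for _ in range(min(self.drain_per_tick, len(self.queue))):
            self.played.append(self.queue.popleft())


def blocking_write(device, samples):
    """Queue every sample, spinning whenever the buffer is full.

    Returns the number of ticks spent waiting. In a browser the inner loop
    would be a busy loop polling the audio clock instead of calling tick().
    """
    waited = 0
    for sample in samples:
        while device.free_space() == 0:
            device.tick()
            waited += 1
        device.queue.append(sample)
    return waited


device = FakeAudioDevice(capacity=4, drain_per_tick=2)
ticks_waited = blocking_write(device, list(range(10)))
print(ticks_waited, device.played, list(device.queue))
```

The point of the pattern is that the emulator never gets ahead of the audio device: the producer only continues once there is room, so audio and video stay locked together without relying on callbacks firing on time.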

What can be done?

The first thing that can be done is to make the browsers behave similarly. Right now Firefox doesn’t have the constantly updating currentTime for Web Audio, and the precision on performance timers on Chrome is really odd. These can be fixed (or a consensus agreed upon).

For more performance-heavy things, there needs to be a way to either get around the event loop or control it better. The yield keyword in ES6 is a nice first step, but there needs to be a way to guarantee (or at least make a better attempt at ensuring) that a callback or timer fires when it should. Even better would be a way to start a thread separate from the main loop. Can Web Workers be used for this? Not sure, but probably not with any audio/video output, which makes it a no-go for us. Maybe some new tech can allow this in browsers. A pipe dream perhaps, but hey, it’ll probably be better than NaCl in the long run.

RetroArch Android 0.9.9.5

By Squarepusher –

Another point release – and a lot more to talk about again.

New cores

So given the power of the Shield, we decided to dust off some libretro cores that have previously only been used for PC. bsnes/higan Performance core was high on our list. Thankfully, the Nvidia Shield puts on quite the show.

I have done some extensive performance tests with bsnes/higan v0.92 on RetroArch Android (on the Shield) and I can confirm that every single non-coprocessor game runs at fullspeed and runs great. That means – every game that is not an SA-1/SuperFX/DSP/Cx4-coprocessor enhanced game will run just fine with bsnes/higan on a Shield.

bsnes/higan Balanced core according to maister did around 57fps with Zelda 3 – so *nearly fullspeed* but obviously Performance core is a better candidate for now on the Shield. Perhaps with the Shield 2 (Tegra 5?) we could expect co-processor games to run at fullspeed on the Performance core and for every non-coprocessor game to run at fullspeed on the Balanced core.

Also, need I remind you – yes, battery usage with bsnes will be higher than with any other SNES core. And no, unlike bsnes/higan on the PC, you can just use your trusty old .SFC/.SMC ROMs on it.

UI

I’m gradually coming around to the realization that the people badmouthing the current UI are, in fact, somewhat correct. However, this complaint I feel is valid only when it comes to the Android frontend which does indeed suck. So I have begun to restructure it all and in this release you can start seeing the first fruits of that labor. It’s a lot better organized now and on a microconsole like the Ouya/Shield it shouldn’t require you to take your fingers off the gamepad and reach for the touchscreen or the mouse in order to reach certain parts of the UI.

TV Mode

Given that microconsoles seem to be all the rage now – I thought adding in this mode from the iOS port would be nice. What this does is boot you straight into RGUI. From there, you can navigate the menu with your gamepad and launch cores/games. The best part about RGUI (which could always be toggled from an overlay or from a gamepad that has a menu button BTW) is that it has a ‘history list’. It keeps a history of every game you have played – and you can select that game from the history list and it will instantly switch to that game. This mode is even more convenient when you have “Auto-load state” and “Auto-save state” turned on so that it instantly starts again at the point where you last left off.

This “History list” is also going to be making an appearance in the Android frontend UI at some point for convenience.

Ouya

Somebody has offered to send an Ouya. We’ll see if it arrives here safely. If so, I’ll assume control over the RetroArch Ouya release as well along with Moonlighting and make sure that it’s a decent user experience on Ouya.

Download links

APK (r19) – http://themaister.net/retroarch-dl/android/org.retroarch.browser.RetroArch.r19.apk

Google Play – https://play.google.com/store/apps/details?id=org.retroarch&hl=en

BTW – the iOS port will come a day later. It will have Picodrive and all updated cores and changes.

RetroArch Android 0.9.9.4

By Squarepusher –

Despite this being a point release, a couple of very important changes have been made regarding the Android port, which compelled me to make this blog post so that I can explain some of the things that have changed.

New cores added

People always love this part – especially the Xperia Play guys who are ever struggling to hold onto ever-decreasing internal storage on their outdated devices :) .

First up is a new core – Picodrive. This has been closed-source for a fair while because notaz didn’t like certain guys profiting from it through iPhone ports with donation buttons. With libretro and RetroArch now out, there’s justifiable reason to open source it again since the payware/donationware guys are going to be on the run from now on.

Picodrive is a Sega Genesis/Sega CD/Sega 32X emulator especially optimized for ARM processors. It is the fastest Mega Drive emulator in existence for ARM-based devices like Android and iOS. Best of all, it comes with a 32X core which should run most games at fullspeed even on a weak ARM Cortex A8 CPU. There is only one game that has somewhat higher system requirements (Virtua Fighter 32X). An iPad Mini/2 runs this game at 56~57fps – about two FPS shy of fullspeed – your mileage may vary on how it performs on [insert your device]. It runs fullspeed on the Nvidia Shield though.

What else is new? A Stella core – this is an Atari 2600 emulator. We already bundled this for the iOS port but I guess we neglected to add it to the Android port up until now. Anyway, here it is.

Anything else besides? SNES9x mainline – ie. not the Next speedhacked version. Use this if you have a beefy device and think you can get away with a non-speedhacked version of SNES9x. It runs at fullspeed with all games on the Nvidia Shield – your mileage may vary on how it performs on your device. This version of SNES9x is more accurate than SNES9x Next – but is also a lot slower.

We also threw in the Desmume core – Nintendo DS emulator – with meancoot’s ARM JIT backend. It runs pretty much the same as the iOS port – no, it’s not going to be giving that payware closed-source emu Drastic any run for its money and it’s just thrown in for the ‘ah what the heck’ factor – its main usefulness as a libretro core is on the PC – but seeing how fast ARM hardware is catching up with even laptop Core processors, perhaps it will only take another year or two until ARM devices are at the level of a Core i5 – so we will keep it on life-support until then.

I felt it was also time to throw in Mednafen PSX – an alternative PlayStation1 emulator. Users of the PC version will know this emulator – it’s one of the most accurate open-source PS1 emus around right now. It’s always been deemed as totally unsuitable for ARM devices since it has such high system requirements – however, the Nvidia Shield is showing that there is reason for optimism. On Shield it hovers right now between 35 and 40fps with occasional spikes to 44fps. Give it some time with stuff like ARM Cortex A57 coming out and who knows if this will be able to run at fullspeed. So – it’s included by default from now in anticipation of that.

For practical purposes – it is recommended you keep using PCSX ReARMed. Don’t complain to me that this core runs too slow.

There is one other core which has been added – Instancing Viewer. This is another GL tech demo that is meant to show off instancing being done in GL. Load it up with any PNG image file and it should be rendered on a bunch of cubes – you can look at them from a first-person perspective. You can increase the cube amount by going to RGUI->Core Options and increasing cube size. Note that the cube sizes are to the power of two – so 8 is 2 ^ 8, meaning 256 cubes. There is no practical gameplay purpose to this right now – it’s just a Libretro GL tech demo. It might become something more useful later on.

No more static syncing by default

We now enable the threaded video option by default. There have been a couple of improvements to it in that it now applies adaptive jittering which should make the jittering less apparent.

This option should lead to audio crackle-less gameplay on most devices. Of course, there is still the option to use static syncing and we certainly recommend its use if you know what you are doing and can set the ‘refresh rate’ in the RetroArch app to exactly match that of your display source’s refreshrate. This option just turned out to be too difficult to set up for most users and because any refresh rate mismatch in RetroArch leads to audio/video sync being incorrect, this would manifest itself in audio crackles and lead to a broken user experience.

On a platform like iOS, this is simply no issue. On Android unfortunately it is. Therefore, we are playing it safe here. Upon starting up RetroArch Android for the first time, it will ask you whether you want to use threaded video or whether you want to ‘synchronize by refreshrate’. By choosing the latter option you get the old way of how things worked.
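To get a feel for why a refresh rate mismatch turns into audio crackle under static syncing, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model that is not taken from the RetroArch source: one emulated frame is shown per display refresh, and audio is resampled as if the display ran at the configured rate. The function names and all the numbers are made up for illustration.

```python
def audio_drift_per_second(configured_hz, actual_hz, output_rate=48000):
    """Samples gained (+) or lost (-) per second of real time.

    Assumed model: one emulated frame is shown per display refresh, and audio
    is resampled as if the display ran at configured_hz.
    """
    produced = output_rate * actual_hz / configured_hz
    return produced - output_rate


def seconds_until_underrun(buffered_samples, drift):
    """Time until the audio buffer runs dry, or None if it never does."""
    if drift >= 0:
        return None
    return buffered_samples / -drift


# Configured for 60 Hz, but the panel really refreshes at 59 Hz (made-up numbers).
drift = audio_drift_per_second(configured_hz=60.0, actual_hz=59.0)
print(drift, seconds_until_underrun(4000, drift))
```

With these made-up numbers the audio buffer loses 800 samples every second, so a 4000-sample buffer runs dry after five seconds and the driver has nothing left to play. That is the crackle. A mismatch in the other direction overfills the buffer instead.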

NOTE: Static syncing should work fine on nearly all libretro cores. However, there has been one annoying exception, and that is PCSX ReARMed. Certain games (like Chrono Cross, Final Fantasy VII, Crash Bandicoot 1, and lots more) use variable refresh rates instead of running at a fixed refresh rate. This seems to play havoc with RetroArch Android’s way of doing static syncing right now (it is not a problem on iOS however). Enabling threaded video solves all these problems overnight, so we recommend that, if you have your ‘forced refreshrate’ set up right, you set ‘threaded video’ off for the PS1 fighter games like the Tekken games, Tobal games, Street Fighter games and others – ie. games that are guaranteed to run at 60Hz. For everything else, turn on threaded video for now.

Appeal to low-latency audio in Android 4.1 and up

There have been changes to the OpenSL audio driver to target the new low audio-latency capabilities of Android 4.1 and up. This should lead to much better results where the audio isn’t as hopelessly behind the video like it was in previous releases (which really wasn’t our fault but more due to us having to deal with the awful state of Android’s audio capabilities prior to 4.1).

We are aware that people with sub-Android 4.0 devices (and even 4.0 itself) will likely get regressions because their Android version can’t possibly deal with low-latency audio buffer sizes. Unfortunately, there’s no getting around the fact that Android was really fundamentally broken on many different levels up to maybe 4.1 and 4.2. Things certainly have been improving a lot since then, and it makes no sense for us to keep appealing to the lowest common denominator when those phones/tablets are all going to be replaced in the near foreseeable future anyway (and they should). There’s also no point trying to upgrade most of these outdated devices (like the Xperia Play) to the latest Android version because the system requirements of newer versions of Android won’t allow for it. RAM requirements for instance have gone through the roof since Android 4.0 and up, and so it cuts off all older devices with only 512MB of RAM (and even less).

If the demand is high enough on these old outdated devices, we might introduce back a ‘high-latency audio’ option. For this release, we are trying to cater to the crowd that bought devices this year and one year ago. We believe that is the right strategy here.

Nvidia Shield support

So we received an Nvidia Shield from Nvidia (two in fact – one for me, the other for Themaister) and we are certainly walking away with a higher overall impression of Android now. There are still problems, but they seem at least surmountable now. John Carmack in a recent QuakeCon talk has spoken about some of these issues, but really, it doesn’t take a rocket scientist to fire up adb and look at Logcat and see the pages and pages of garbage collector stalls passing by to sense there is something architecturally very wrong going on with this OS from a high-performance games machine perspective. How Google is ever going to fix this and bring it to parity with iOS on most devices is frankly Google’s problem.

Returning back to Shield for a minute – I am pleased to announce I’ve gotten much more consistent runtime performance results on it than any other Android device so far. Regarding the performance of it – it’s insane – I’ve done a lot of performance tests over the past few days and so far it’s definitely the most powerful ARM-based device I’ve used.

http://t.co/nZDSmzpjec

Really, people like to deride the device for the Xbox 1 Duke-esque form factor and the high price, but really, I consider this a much better ‘micro console’ than something like Ouya or GameStick so far. (more on the Ouya stuff later). In fact, I’d say that using it as a RetroArch console alone is worth the cost for $299 – building our own RetroArch console has been a thing on my mind for sometime but really – there’s no way we’d manage to do a better job at that pricepoint than Nvidia themselves and certainly not with such hardware inside it. So let’s hope they are successful. I am certainly a believer so far.

And yeah, GPU-wise it still might not be at parity with a PS3 or 360, but CPU-wise? It slaughters a PS3 or Xbox 360. Really, I’ve been stressing this for years and it seems to have only recently been gaining traction after Mark Cerny and co have started admitting how bad the Cell was (he didn’t say that exactly but it’s fairly obvious that is what they are referring to beyond all the media training) – the per-core performance of a Cell-based CPU (this includes both Xenon CPU in 360 and Cell in PS3 itself) was something like a Pentium 4 2.4GHz CPU. It exhibited most of the same problems – deep pipelines, high penalties (to the tune of 500-cycle L2 cache misses – goddamn!), and to top it all off, in-order execution (and a bunch of SPUs in PS3 which are useless for general-purpose code and with too little local storage).

By comparison, anything based on an ARM Cortex A15 design (like Nvidia Shield) is happily blazing past a Core 2 Duo – which has a level of IPC that a PS3 or 360 could only dream of. So really – if this is about a ‘comparison’ between the current-gen consoles and something like Shield, then the current cream of the crop of micro-consoles already wins out by a fair margin when it comes to CPU power – and then some. I have no doubt that Tegra 5 will exceed PS3 RSX/360 ATI performance levels and from there it’s basically a race to catching up with the next-gen consoles.

Long story short – this is no moneyhats – I consider this the RetroArch handheld games console I wanted to build from the onset and which none of the el-cheapo tablets/phones ever delivered.

Improved input support – analog stick support

Coinciding with the support for the Nvidia Shield gamepad, we have improved input support in a number of big ways –

Analog stick support

Some cores (like TyrQuake, SceneWalker) already have native analog stick controls – however, RetroArch Android never exposed analog stick support – up until version 0.9.9.4, that is. Devices like the Nvidia Shield, the Xbox 360 gamepad, the Logitech Rumblepad 2 have all been preconfigured to default to ‘Dual Analog mode’ now. Libretro cores which implement ‘RETRO_DEVICE_ANALOG’ will now be able to make use of the analog sticks on an input device.

We will need more trial-and-error testing to add this functionality to more input devices. Users are encouraged to help us out in this endeavor.

ANR issues fixed (Application Not Responding)

So it turns out that the ever-reliable Google had a bug in their native activity glue input code that was causing the input buffers to become congested and then start issuing ‘Application Not Responding’ events to the application if an input event had failed to be picked up for more than 5 seconds.

It seems Nvidia picked up on this issue earlier and made a blog post about it – which is the only reason I have been able to fix it for this release. According to the changelogs, Google has fixed this issue now in their code for the latest NDK version. Unfortunately for them, the latest NDK version seems to be a totally broken regression city fest where at least half of the cores that used to compile fine previously now issue ‘Internal Compiler errors’ – so I was forced to use NDK r8b either way.

Speaking of which – to any and all devs reading this stuff – here are the links which hook you up – Google doesn’t seem to provide an archive of their previous NDK versions (even though they should) but luckily Nvidia still has a mirror up for it – so here goes –

https://t.co/P5poAtsS0e (NDK r8b for Linux x86_64)

https://developer.nvidia.com/content/nativeactivity-input-crashes-and-anrs-simple-fix-dangerous-bug (NativeActivity Input Crashes and ANRs: A Simple fix for a Dangerous Bug)

Input devices fixed (Xbox 360, iPega PG-9017)

Thanks to generous gifts, I was able to fix a number of input devices. First and foremost is the Xbox 360 pad – the D-pad hat controls should now be properly working. Analog stick support for RETRO_DEVICE_ANALOG has also been added.

Two modes for the iPega PG-9017 have also been added. Set iCade Profile in Settings->Input to either of the two iPega options to use them with RetroArch.

Ouya Store (??)

Some guy has registered on our forum a few days ago and has been talking about wanting to bring this over to the Ouya Store.

We don’t have an Ouya, are likely not to get one unless somebody gifts it ($99 for an Android-based Tegra 3? no thanks), and the Ouya Store policies are absolutely insane (http://forum.themaister.net/viewtopic.php?id=741&p=2) – a bunch of startup guys thinking they can pull an ‘Apple Store’ in terms of draconian app store policies. So it’s probably a good thing somebody else is going to prostitute themselves before these guys because I quite honestly wouldn’t have the patience for it.

We’ll see what comes out of it.

Anyway, version 0.9.9.4 is out now. Enjoy – we hope that people with more recent Android devices will get a much better experience now with it.  The iOS version will come a few days later – most of the work really has been on the Android version for this release.

Download Links

APK (r18) – https://anonfiles.com/file/afb0f033a3f779cb111884400406cd7b

Google Play – https://play.google.com/store/apps/details?id=org.retroarch&hl=en

 

Libretro ffmpeg

By Squarepusher

Lion King running on libretro ffmpeg with around 5/6 shaders stacked – hence the low framerate (my GPU can’t keep up)

This core isn’t particularly new – maister has been dabbling on/off with a libretro ffmpeg port for a good two years now. The problem was that up until now it was never really particularly useful except for morbid curiosity.

The main Achilles heel has always been that video rendering was software-based through libretro. Software-rendered video is still awfully slow compared to hardware-accelerated rendering, and launching a movie player with no hardware acceleration would definitely not compare favorably to pre-existing media players.

Now that it has made the leap to libretro GL, its usefulness has increased by a lot. The most noteworthy aspect of this core is that there is a core option enabling/disabling temporal interpolation. Through motion blur it will ‘fake’ a higher framerate in movies (fake 60fps).
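For a rough idea of what faking a higher framerate through blending means, here is a minimal Python sketch. It assumes a plain linear mix between neighbouring frames and an integer rate multiplier. The core’s actual GL implementation may well differ, and the blend and interpolate names are made up for illustration.

```python
def blend(prev_frame, next_frame, step, steps):
    """Linear mix of two frames; step/steps is the weight of next_frame."""
    return [
        (p * (steps - step) + n * step) // steps
        for p, n in zip(prev_frame, next_frame)
    ]


def interpolate(frames, factor):
    """Return a sequence with `factor` output frames per source-frame interval."""
    if not frames:
        return []
    out = []
    for prev_frame, next_frame in zip(frames, frames[1:]):
        for step in range(factor):
            out.append(blend(prev_frame, next_frame, step, factor))
    out.append(list(frames[-1]))
    return out


# Two tiny "frames" of two grayscale pixels each, doubled in frame rate.
source = [[0, 100], [100, 200]]
doubled = interpolate(source, 2)
print(doubled)
```

The in-between frame is simply the average of its two neighbours, which is why motion looks smoother but also slightly blurred. Going from a typical movie framerate to 60fps is not an integer multiple, so a real implementation would need fractional blend weights rather than the fixed steps used here.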

The Matrix running on libretro ffmpeg with waterpaint-mudlord shader – looks like The Matrix meets Waking Life/A Scanner Darkly. (in case you think the video quality leaves much to be desired – remember that the input source here is a low-quality SD Xvid video circa mid ’00s.)

Another very appealing aspect of the ffmpeg libretro core is (of course) the mere virtue of it running inside RetroArch, which means for ports that have shader support, shader passes can be applied on top of the image. We’re pretty confident no other movie player is offering 8-pass shader stacking right now – never mind it being dynamically configurable from a built-in menu. Also included (an option of most interest to otakus who like to watch anime) is ASS subtitle support.

Despite the very cool nature of this ffmpeg port, it should be noted that this is fundamentally a very backwards way of implementing a movie player. While most movie players are high-latency affairs that depend on buffering and advanced A/V synchronization strategies, this ffmpeg core instead depends on a low-latency frontend (ie. RetroArch in this case) in order to deliver good audio and video. Something which might simply be too tall an order on Android given the high-latency audio/video drivers on that platform.

Terminator 1 with bsnes-gamma-ramp applied. What you can’t see is how smooth this looks with temporal interpolation turned on.

An attempt will be made by me to get this running on mobiles and anything in fact supporting libretro GL – this might have to involve baking in ffmpeg as a static library since on the mobile platforms ffmpeg is not available and cannot be installed as a dependency.

All in all, the temporal interpolation option really makes a big difference in the movies I’ve tried it with, and overall it’s an exciting and promising indicator that libretro doesn’t necessarily have to be confined to merely emulators or games.

RetroArch v0.9.9 Released – where to get it on each platform

RetroArch v0.9.9 has officially been rolled out on all platform targets.

The new platforms that are supported with this release of RetroArch are as follows:

  • iOS (both jailbroken and non-jailbroken – non-jailbroken requires that you are a registered developer and can compile your own copy of RetroArch + cores)
  • Blackberry 10
  • Blackberry Playbook Tablet OS

The other platforms which are already supported by the RetroArch/libretro projects have all received updates (with some pretty extensive changes – more on that in an upcoming blog post).

WHERE TO GET IT

Windows: New users can download 32- and 64-bit flavors of RetroArch and RetroArch-Phoenix from Themaister’s site:

http://themaister.net/retroarch.html

Existing users can/should download the new version through RetroArch-Phoenix’s built-in ‘RetroArch Updater’ utility. (this is the preferred update method for existing users to save massive bandwidth!)

Mac OS X users can download hunterk’s builds from this post on the libretro forum:

http://forum.themaister.net/viewtopic.php?pid=459#p459

Debian/Ubuntu/Mint users can add hunterk’s Launchpad PPA repository to their Synaptic/apt sources:

https://launchpad.net/~hunter-kaller/+archive/ppa

iOS users can find RetroArch iOS in one of Cydia’s default repositories – ZodTTD & MacCiti.

You can also add our own Cydia repository in order to get it, located at:

http://themaister.net/cydia

Most cores will work with both tethered and untethered jailbreaks, but cores that require the use of a dynamic recompiler (dynarec; DeSmuME and PCSX-ReARMed) will require a full, untethered jailbreak to function.

Android users can get the latest version from the Google Play Store. Xperia Play controls seem to be wonky, but we hope to have that fixed very soon.

Wii users should use this package:

https://anonfiles.com/file/4536ac12f0071a397b2f1d70672814cf

Blackberry Playbook users should use this package:

http://themaister.net/retroarch-dl/blackberry/playbook/RetroArch-1_0_0_1.bar

Blackberry 10 users should use this package:

http://themaister.net/retroarch-dl/blackberry/bb10/RetroArch-Cascades-1_0_0_1.bar

PS3 users can get the DEX and CEX versions from the usual sources.

Xbox1 and Xbox360 users can get their respective versions from the usual sources.

OpenPandora users can get builds from lifning’s repo:

http://repo.openpandora.org/?page=detail&app=retroarch.lifning.001