RetroArch Steam – Beta 1 key giveaway (Read more for details)

All keys are gone. Thanks for participating!

New beta testing keys will be available on 9/26/2020 – for the exact time check this page here.

As you probably already know, a year ago we announced that RetroArch would be releasing on Steam. We have worked hard on this for a fair while now. The process is slower than expected due to reasons beyond our control.

It’s been a lengthy process and we have had to significantly retool things to conform to Steam’s policies and guidelines, one of which is no Core Updater (just like on Google Play now).

While we wait for our release candidate build to be manually approved by Steam (which we’ve been told is a lengthy process), we will start giving away beta testing keys. We want to gather as much feedback as possible from users so that the final experience on Steam lives up to people’s expectations.

So with that in mind, we are giving away keys for our first beta test version, Beta 1.

How do you get a Steam beta key?

Although we want everyone to be involved in this testing process, we cannot do it all at once. We will distribute the codes for a while at the link down below.

For Patreon users: We feel it’s important to express our gratitude to the people who gave their support when we fell on hard times with the hacking attack. Patreon subscribers can request their testing key by sending us a Direct Message on Patreon.

Disclaimer

NOTE: These keys are not being sold, and as per Steam’s rules regarding crowdfunding, we are allowed to do this. We refer to the following section:

Crowdfunding.
You can use the keys for crowdfunding rewards and give to your supporters. Before the game launches, you can also give your supporters beta keys if you wish, but these keys should only be owned by your supporters, and unless beta access is available for sale through Steam, these keys should not be sold elsewhere outside of a crowdfunding campaign.

Steam version details

So what’s different when using the Steam version right now vs. the regular version?

  • No core updater. You install cores through the Steam Store instead. After you install RetroArch, you install the Libretro core DLCs that are available separately. We have made 10 cores available as DLC so far. They are all free and are already available on Steam.
  • All updateable assets (including shaders, overlays, etc.) are pre-packaged and updated with new RetroArch builds. Basically, nothing is downloaded from our servers; everything goes through Steam.
  • No Desktop Menu.
  • Remote Play support. See the next section.

Remote Play support

RetroArch on Steam has full Remote Play support. This means that you will be able to play any multiplayer game online with another Steam user that also has a RetroArch beta key.

This feature is exclusive to the Steam version, and has the following advantages:
* Not dependent on RetroArch’s netplay functionality
* Because of this, it does not depend on serialization in order to work
* It apparently works very well, on par with something like Parsec and perhaps even better

NOTE: Right now, Player 2 needs to use a gamepad in order to be recognized as Player 2. If Player 2 uses the keyboard, RetroArch will mistakenly treat it as Player 1.

Which cores are available right now on Steam?

The following cores are currently available:

Final Burn Neo Progress Report – September 2020

Final Burn Neo (FBN/FBNeo) is the follow-up to Final Burn Alpha (FBA/FBAlpha), an alternative to MAME for arcade emulation. It’s more focused on playability and less on accuracy/preservation. The team is composed of dink, iq_132, JacKc, kev and me. It supports most libretro features (netplay, runahead, RetroAchievements, …) and is part of the libretro Steam launch lineup.

New supported games and other improvements

Konami

We added support for “WEC Le Mans” and “Hot Chase”, 2 racing games from 1986 and 1988. You could see those games as Konami’s response to Sega’s “Out Run” and “Hang On”.
We also added support for the 6-player version of “X-Men”, which is playable from start to end without any glitch for the first time in an arcade emulator (our fixes were also ported to current MAME and MAME2003+).

We also improved emulation of the K054539 sound chip, which is used in lots of Konami games (including X-Men). This board had an echo effect that wasn’t emulated in any arcade emulator previously.

PGM

“Photo Y2K 2”, a game dumped nearly 20 years ago, finally had its protection defeated thanks to the efforts of iq_132 and dink. I wouldn’t call it a must-play, but it’s always nice to make a breakthrough after so long. Various PGM hacks and bootlegs were also added along the way.

Sega

“Opa Opa”, a Sega System E game from 1987, is now playable. The game is a kind of mix between Pac-Man and a shoot’em’up, with upgrades and co-op play.
Our Sega System 18 driver also got a nice overhaul after being on the todo list for a long time; games like “Michael Jackson’s Moonwalker” now play without any gfx issues.
2 Mega Drive arcade bootlegs were also added: “Super Bubble Bobble” and “Top Shooter”.

Midway

For years there was a nasty issue with our Midway W/T Unit emulation (Mortal Kombat 1/2/3 and Rampage World Tour): only half of the frames were properly displayed. It has now been fixed thanks to the efforts of Marcos Medeiros (Romhack/zxmarcos)!

Taito

We added support for “Galactic Storm”, a third-person shoot’em’up from 1992. This one was quite an unusual addition since it involves some polygon rendering, and it’s quite rare for FBNeo to do 3D-ish things.

DataEast

“Gondomania”, a shoot’em’up from 1987, is now playable. “Super Shanghai Dragon’s Eye”, a puzzle/mahjong game from 1992, was also added.

Capcom

There was another batch of gfx fixes for CPS3 (Street Fighter 3, JoJo, Warzard). Our emulation should now be on par with MAME’s apart from DMA status emulation, and there is a good reason for that: it would critically increase the hardware requirements for something that doesn’t seem to affect any phase of gameplay, which is just not worth it on an emulator that is more focused on playability than accuracy/preservation. SF3-1 and SF3-2 from that driver also have a new dipswitch to enable wide screen mode (previously, only SF3-2 had that wide screen mode available, and only through its service menu). Don’t ask about SF3-3.
The MSM6295 sound chip used by CPS1 (older Street Fighter 2 and dozens of other games) also got nice improvements that make the bgm sound clearer. I’m mentioning it for CPS1, but it’s also used in quite a lot of other arcade systems. On that note, I’ve seen some messed up comparisons on the internet, so I’ll say it: it’ll never reach the quality of the CD/vinyl soundtracks of those CPS1 games, because it operates at a 7.5 kHz sample rate and the bgms were recorded for that sample rate.
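The sample-rate argument can be made concrete with the Nyquist limit: audio sampled at a given rate cannot represent frequencies above half that rate. The short sketch below uses the 7.5 kHz figure quoted above and, for comparison, the standard 44.1 kHz audio CD rate (the CD figure and the function name are not from this article; they are only there for illustration).

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2

msm6295_rate_hz = 7_500   # figure quoted in the text above
cd_rate_hz = 44_100       # standard audio CD rate, used here only for comparison

print(nyquist_limit_hz(msm6295_rate_hz))  # 3750.0
print(nyquist_limit_hz(cd_rate_hz))       # 22050.0
```

In other words, anything above roughly 3.75 kHz simply cannot be stored at that sample rate, whatever the emulator does.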

SNK

Support for Neo-Geo Pocket and Pocket Color was added.

Nintendo

More NES mappers, more NES games, and reduced input lag ! If your favorite NES game isn’t emulated yet, feel free to ask here !

Miscellaneous

Pre 90s

Added support for missing games running on the same hardware as “Q-Bert”, including “Insector”, “Argus”, “Krull”, “Curve Ball”, “Tylz”, “Knightmare”, “Reactor”, “Screw Loose”, “Wiz Warz”, “Video Vince and the Game Factory”, and “The Three Stooges In Brides Is Brides”.
We finally completed our support for Nichibutsu games: “Tube Panic” and “Mag Max”, 2 shoot’em’ups from 1984 and 1985, are now supported.

Post 90s

Support for “Hyper Duel”, a really nice Technosoft shoot’em’up from 1993, was added.

A “Cisco Heat” driver was added. It includes the eponymous racing game from Jaleco, and other games like “Big Run”, “Grand Prix Star”, “Wild Pilot”, and “Scud Hammer”.
Support for the Hyperstone CPU family, high-spec CPUs from the late 90s, was added, and along with it support for most arcade systems using them. This includes games like “Vamf x1/2”, “X2222”, “Crazy War”, and dozens of others. I’ll be honest though, technically those games don’t live up to the specs of their CPUs.

Better savestates support

Over the last few months, savestate issues that could cause glitches, mostly with runahead (single instance) and netplay, were fixed. This includes Neo Geo, CPS1, PGM, Irem M92, Sega System 1 and, last but not least, pretty much any game using Yamaha sound boards, and I can tell you those boards were quite popular in the 90s.

Better support for other archs/platforms (arm, PS3 and Wii U)

Final Burn Neo is written for x86 CPUs, and while we try to keep the code compatible with other archs, we don’t always have the time or the means to properly test them. Lots of work was done on that front lately, including a few fixes for ARM CPUs (e.g. smartphones, tablets and small computers like the Raspberry Pi), and tons of fixes for big-endian PPC CPUs (i.e. PS3 and Wii U). I’m not saying the ~15000 romsets we support are all OK; with an emulator that is basically a conglomerate of hundreds of smaller emulators, they probably wouldn’t be even after years of work. But hundreds of games were fixed, including popular systems like PGM or Neo Geo CD. A special thanks to CrystalCT, author of FBNeoRLPlus (an alternative libretro frontend specialized in FBNeo for PS3), for his dedication on this.
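For readers wondering why big-endian CPUs need dedicated fixes at all, here is a small generic illustration of byte order. It is not taken from FBNeo’s code base (which is not written in Python); it only shows how the same four bytes yield different values depending on the byte order the reader assumes.

```python
import struct

raw = bytes([0x12, 0x34, 0x56, 0x78])  # four bytes as they sit in memory

little = struct.unpack("<I", raw)[0]  # read as a little-endian 32-bit value (x86 style)
big = struct.unpack(">I", raw)[0]     # read as a big-endian 32-bit value (PPC style)

print(hex(little))  # 0x78563412
print(hex(big))     # 0x12345678

# Stating the byte order explicitly gives the same result on any host CPU.
portable = int.from_bytes(raw, "little")
print(portable == little)  # True
```

Code that reads multi-byte values without stating the byte order tends to work on the architecture it was written on and break elsewhere, which is the general class of problem such fixes address.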

And much more ! This article is far from being exhaustive.
If you want to help FBNeo, keep sending us bug reports when you see something wrong.
Our forum : https://neo-source.com/
Our github : https://github.com/finalburnneo/FBNeo
Our discord : https://discord.gg/8EGVd9v

RetroArch: Steam Beta Testing – Steam Cloud

We are happy to present one of the expected features, Steam Cloud backups. With this feature, you will be able to back up some files in your RetroArch directory to your Steam Cloud.

RetroArch/saves // saves folder
retroarch.cfg // main configuration file
RetroArch/states *.state // every state file
RetroArch/config *.opt // dlc options
*lpl // history entries

In order for the Steam Cloud sync to work correctly, default directories shouldn’t be changed to other locations.
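To check which files in a RetroArch directory would fall under the list above, the patterns can be approximated with glob matching. This is only a sketch: the function name `cloud_candidates` and the pattern table are assumptions made for illustration (with `*lpl` read as `*.lpl` in the RetroArch root), and they do not describe how Steam Cloud itself is configured or how it evaluates paths.

```python
from pathlib import Path

# Patterns derived from the list above, relative to the RetroArch directory.
# This is an interpretation of that list, not Steam's actual configuration.
SYNC_PATTERNS = [
    "saves/**/*",      # saves folder (any depth)
    "retroarch.cfg",   # main configuration file
    "states/*.state",  # every state file
    "config/*.opt",    # DLC (core) options
    "*.lpl",           # history entries
]

def cloud_candidates(root):
    """Return the sorted relative paths of files matching SYNC_PATTERNS."""
    root = Path(root)
    found = set()
    for pattern in SYNC_PATTERNS:
        for path in root.glob(pattern):
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

Calling `cloud_candidates` on the default RetroArch directory gives a quick overview of what the list covers; files living in relocated directories would not show up, which matches the advice above about keeping the default locations.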

RetroArch Steam – Beta 1 key giveaway (Read more for details)

UPDATE 9/21/2020 – NEW KEYS INCOMING TODAY

New beta testing keys will be available on 9/21/2020 at exactly 20:00 GMT+1 (one hour ahead of Greenwich Mean Time).

ORIGINAL POST 9/6/2020

Buildbot and Github mostly restored – the current status and future plans

Thanks to m4xw and Xer The Squirrel, we have managed to:

  • Restore our buildbot server.
  • Restore the vandalised Github repositories.

State of the buildbot server

We have managed to restore most of the 1.9.0 stable downloads. Some files are still missing though, such as the PS2 stable and the non-RPX WiiU builds. Unfortunately, you’ll have to wait until 1.9.1 before we release another stable.

All the stable versions prior to 1.9.0 are gone.

As far as nightlies go, these should be fully operational again for now. There are some slight omissions: right now there is no mainline MAME core, and some other cores might also be missing. Overall, though, most of the stuff should be back again.

The Core Installer should work again on any RetroArch build.

State of the Github organization

Most of the affected Github repositories have been restored. Unfortunately, there are some shenanigans with Github issues that were closed. For reasons unknown to us, these closed issues cannot be manually re-opened. Github hasn’t really been of any help in this department, so we don’t know what to do about this other than to simply move on and ask users to create new Github issues for the affected repos.

No real data loss has happened and things should be back to normal on the organization.

New server

Thanks to the massive outpouring of support on our Patreon in the wake of the attack, we now have the additional resources to significantly beef up our server infrastructure. We are in the process of moving to a far more powerful server that will cover both Lakka and Libretro/RetroArch. We will go into more detail on this as we move closer to retiring the current buildbot server. While we are in the transition phase, we are paying out of pocket for both servers, which will undoubtedly double our monthly bill for now, but we think it will be worth it in the end to our users. We again thank our users for believing in us and giving us the stimulus boost necessary to finally do something about our underpowered infrastructure. It is massively appreciated.

What’s next?

No doubt, this attack has set us back some, and it has resulted in some weeks being lost that we could have otherwise put to good use elsewhere. Nevertheless, we believe we are on the road to recovery. We are working on a solution for the Google Play situation. We will create a separate version of RetroArch for Google Play without the Core Installer but with an alternative that is compatible with Google’s recently updated TOS. We don’t think this version will be better than the one you already know and use on Android, but you will always have the option of downloading the version with Core Installer support from our own website. We will not remove this version; it will remain available alongside the new Play Store builds.

Other than that, we don’t know yet when the next version of RetroArch releases. Ideally the new server will be ready by the time we get to it, since building new releases has been a pain on the current one and we really don’t want to go through it again. We will see. For now, we thank you all for the massive outpouring of support and for giving us the means to finally do something about our situation.

How to donate

Remember that this project exists for the benefit of our users, and that we wouldn’t keep doing this were it not for spreading the love with our users. This project exists because of your support and belief in us to keep going doing great things. If you’d like to show your support, consider donating to us. Check here in order to learn more. In addition to being able to support us on Patreon, there is now also the option to sponsor us on Github Sponsors! You can also help us out by buying some of our merch on our Teespring store!

Hacker vandalised our buildbot and Github organization

Approximately 5 hours ago, we were the target of a premeditated cybercrime attack on our key infrastructure.

The hacker did the following damage:

  • He accessed our buildbot server and crippled the nightly/stable buildbot services, and the netplay lobby service. Right now, the Core Updater won’t work. The websites for these have also been rendered inaccessible for the moment.
  • He gained access to our Libretro organization on Github by impersonating a very trusted member of the team and force-pushed a blank initial commit to a fair percentage of our repositories, effectively wiping them. He managed to do damage to 3 out of 9 pages of repositories. RetroArch and everything preceding it on page 3 were left intact before his access got curtailed.

We are still awaiting any sort of response or support from Github. We hope they will be able to help us restore some of these vandalised Github repos to their proper state, and also to help us narrow down the attacker’s identity.

We wanted to clear up some confusion that may have arisen in the wake of this news breaking:

  • No cores or RetroArch installations should be considered compromised. The attacker simply wiped our buildbot server clean; there is nothing being distributed that could be considered malicious to your system. Nothing has happened here and there is no need for any concern.
  • For the time being, the Core Installer is non-functional until further notice. The same goes for ‘Update Assets’, ‘Update Overlays’, ‘Update Shaders’.

The IP he was using while doing this was ‘54.167.104.253’, which seems to lead back to AWS.

We’re still assessing the situation, but we think that it’s probably best not to continue with the buildbot server that was compromised earlier today. We had some long-term plans for a move to a new server, but this was always pushed back because we felt that we weren’t ready migration-wise. This might well be the catalyst for starting from scratch with a new server instead of trying to migrate the old one over. This would mean that the more commonplace builds for Linux/Windows/Android would be immediately available, but all the specialized systems like consoles, old MSVC builds and whatnot would have to wait until we have adapted them properly to the new system.

Lack of automated backups

This brings us to another key issue: the lack of backups. We last performed a backup of our buildbot server a couple of months ago. The truth is that while we already pay a hefty amount for the servers on a monthly basis, there is simply not enough money to pile on automated backups as well. We could really use your support on Patreon to help lighten our financial burden here, especially since this now-pretty-much-mandatory server switch will likely cost us a not insubstantial amount of money upfront while we keep the current server running for a month longer.

How will we restore things

So, how are we going to restore things? We hope that Github will be able to restore the affected repositories. If they are unable to do so, we could rely on the goodwill of users to supply us with git repositories that have the full history intact.

As for the buildbot? No idea to be quite frank. If we make the switch to the new server, you’ll get Android/Windows/Linux up and running early again but all other platforms will have to be added as we go along.

It’s a shame what is happening to the emulation and homebrew community. When it isn’t developers leaving for greener pastures, deciding it’s no longer worth it, prestigious developers like byuu are being forced into early retirement because of unsavory online gang-stalkers. In our situation, we can’t rule out the possibility that some of these attacks come from some of the same usual suspects (it isn’t the first time we’ve seen them abuse AWS for some of these attacks; we encountered them a year earlier targeting our lobby services). Whatever their aim may be, while they will not deter our will to continue working on this project, they have definitely increased our maintenance and cost burden for the time being. And for this we ask for your understanding and support as we attempt to come up with a plan to address these problems moving forward. Supporting us through Patreon is a great way of helping out, especially if we can reach the $1300 goal, which means we can spend a bit more each month to make sure our stuff is properly backed up.

As if the complications with Android’s new store policies, which require us to coordinate with new contributors to come up with a workable solution, were not enough of a headache, this comes along. With your help and support, we will overcome this and come out stronger than before.

Regarding the Android / Core Installer situation

While we’re on this subject, and although it’s off-topic, we felt the need to address this real quick. We will likely be making a version of RetroArch Android that is neutered ONLY for Google Play. This means that the Core Installer will not be available in it, and cores will come packaged in additional APKs that can be installed. Apparently there is a limit of 50 extra core APKs before it starts requiring a version of Android above 8.0. So, to avoid artificially bumping the Android OS system requirements, we’re deciding on a 50 core-APK limit for now. Hopefully we can fit most of the cores within such narrow constraints.

On our download site (and on F-Droid), we will have a RetroArch Android version that will work as before – with the Core Installer feature completely left intact. We feel this is a much superior version to what will be available on the Play Store, but unfortunately Google will force our hand here.

RetroArch 1.9.0 won’t be releasing on Google Play Store for now, next version severely downgraded to comply with new policies

We regret to inform you that RetroArch 1.9.0 will not be releasing on the Google Play Store for now.

RetroArch has been available on Android through the Google Play Store since 2012. We encountered two snags this week while trying to update RetroArch on Google Play. First, it complained that our APK size was too big, and that an APK could not be any bigger than 100MB. So we had to go back and start removing some shaders in order to get things to fit.

After this, we tried uploading again. This time, we hit another snag that we have not encountered before:

Issue: Violation of Malicious Behavior policy

An app distributed via Google Play may not modify, replace, or update itself using any method other than Google Play’s update mechanism. Likewise, an app may not download executable code (e.g. dex, JAR, .so files) from a source other than Google Play.

The way we interpret it, the former is not an issue since RetroArch cannot update itself. The latter is probably the issue. We take it that apparently now the Core Updater service is a point of contention, and that it has to go from RetroArch Android. We were not aware of this, and throughout the app’s near-decade presence on the Google Play Store, there has never been a problem before on this front.

Unfortunately, this means that we have to go back to the drawing board now and fundamentally re-engineer RetroArch for Android. However it ends up being re-tooled, it will be a huge setback for the end-user experience; there’s no way around that.

Some possibilities that might exist:

  1. We keep RetroArch dynamically linked, but each core has to be installed separately through the Google Play interface as installable DLC. NOTE: We have no idea if this even works the way it can with Steam, so it would have to be explored first.
  2. We make RetroArch statically linked and therefore there needs to be a separate new app store entry for every single combination of RetroArch with every single core.

Whichever of the two we choose, it will mean no more Core Updater. Given the interpretation of the rules, updating assets is probably still permissible as it would be pretty silly to block that, so things like updating shaders and overlays would likely still remain included.

We cannot stress how much of a pain in the ass it will be to have to retool RetroArch like this. It’s almost at the point where it’s not worth it from our perspective to do it like this, and we feel tempted to just tell people to download it from our site instead, as we certainly have never made a single buck on the Google Play Store to begin with, so there’s no direct profit incentive there. We also don’t know if we even have the manpower right now to be able to make these fundamental changes, but we will certainly try.

For now, we recommend to users that want the ‘proper’ version to just go to our Downloads page and download RetroArch 1.9.0 from there on Android. You will be installing the APK directly on your phone. You might have to enable ‘Allow outside APKs to be installed’ or some similar setting on your phone for it to be able to be installed, but theoretically any Android phone should be able to install APKs outside of the Play Store.

Where to download RetroArch for Android for now

If you want the latest 1.9.0 version, you can get it for now either on F-Droid or on our Downloads page here.

On the Play Store, version 1.8.9 should still be available for now.

We apologize for the inconvenience and we hope we can soon offer a solution that is agreeable to Google and that doesn’t cause us a huge maintenance burden either (although it assuredly will). We also don’t really have a plan ready right now that will allow us to quickly move on this front, so we’ll just have to see how things go.

RetroArch 1.9.0 released!


RetroArch 1.9.0 has just been released.

Grab it here.

A Libretro Cores Progress Report will follow later.

Highlights

Explore View for playlists

Probably the highlight of this release – there is a new ‘Explore’ view for all playlists.

The Explore view takes advantage of the content metadata in the Libretro databases. It allows you to search / find content based on criteria such as:

  • Number of players
  • Developer
  • Publisher
  • System (game console/platform that the game released on)
  • Origin (country of origin that the game was developed in)
  • By Release Year
  • By Genre

The ‘Explore’ view only shows the content that has been added to your playlists. It will not show entries that haven’t yet been added to your collection.

The metadata is currently a bit on the incomplete side, but you can expect us to add more and more metadata to the Libretro database as we go along. This is only the beginning.

Use cases

There are tons of ways you can use the Explore view to find what you want. Here are some of them:

Granular filtering

Here is a good example of the kind of powerful context-sensitive filtering that is possible with the Explore view.

  • Go to Explore
  • Type in ‘super’. This will list all entries in your collection that have ‘super’ in the name. 219 entries are shown.
  • We will now filter the entries by a specific developer to narrow down our search. We go to ‘Additional Filters’, and select a developer of choice. Now out of 219 entries, only 25 entries are still shown that match these criteria (name has ‘super’ in the title, AND is from a specific developer).
  • Let’s add another filter to narrow down the search even more. We go to ‘Additional Filters’ again, and this time we select ‘By Release Year’, ‘1993’. There are now only 6 entries shown that match these criteria (name has ‘super’ in the title, AND is from a specific developer, AND was released in the year 1993).
  • Let’s add another filter. We go to ‘Additional Filters’, ‘Region’, ‘Europe’. Now only 3 entries are shown that match these criteria (name has ‘super’ in the title, AND is from a specific developer, AND was released in the year 1993, AND was released in region ‘Europe’).
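The narrowing steps above amount to a chain of AND-ed conditions. Here is a minimal sketch of that idea. The `explore` function, the field names (`name`, `developer`, `year`, `region`) and the sample collection are all invented for illustration; they do not reflect the actual Libretro database format or RetroArch’s implementation.

```python
def explore(entries, name_contains=None, **criteria):
    """Keep entries whose name contains the text AND that match every criterion."""
    results = []
    for entry in entries:
        if name_contains and name_contains.lower() not in entry["name"].lower():
            continue
        if all(entry.get(key) == value for key, value in criteria.items()):
            results.append(entry)
    return results

# Hypothetical collection; titles and field names are illustrative only.
collection = [
    {"name": "Super Example Kart", "developer": "Dev A", "year": 1993, "region": "Europe"},
    {"name": "Super Example Quest", "developer": "Dev A", "year": 1993, "region": "Japan"},
    {"name": "Super Other Game", "developer": "Dev B", "year": 1991, "region": "Europe"},
    {"name": "Plain Puzzle", "developer": "Dev A", "year": 1993, "region": "Europe"},
]

# Each step adds one more condition, so the result can only shrink or stay the same.
step1 = explore(collection, "super")
step2 = explore(collection, "super", developer="Dev A")
step3 = explore(collection, "super", developer="Dev A", year=1993)
step4 = explore(collection, "super", developer="Dev A", year=1993, region="Europe")
print(len(step1), len(step2), len(step3), len(step4))
```

Because every filter is combined with AND, the order in which the filters are added does not change the final result.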

NOTE: It should be mentioned that in the future, if you want the metadata to be updated automatically in existing versions of RetroArch, you should go to ‘Online Updater’ and select ‘Update Databases’. After this, you might have to restart.

Enhanced playlist search functionality

Before, RetroArch’s inbuilt ‘search’ function was woefully inadequate:

  • The user presses RetroPad ‘X’ (or keyboard ‘/’, or Material UI’s search icon), and enters a search term
  • The navigation pointer jumps to the first match
  • That’s it. There is no way to continue searching from that point, or to do much of anything, really. It’s mostly a non-feature…

The search functionality has now been enhanced as follows:

  • When viewing a playlist, the user presses RetroPad X (or /, etc.) as normal, and enters a search term
  • This becomes a filter – all matching entries will be displayed
  • The user can then perform another search to further refine the results. An arbitrary number of filters may be stacked in this fashion
  • Pressing ‘cancel’ clears the last entered filter
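The stacking behaviour described above can be pictured as a simple stack of search terms. The sketch below is an illustration only: the `PlaylistSearch` class and the sample titles are hypothetical, and case-insensitive substring matching is an assumption rather than a statement about how RetroArch compares entries.

```python
class PlaylistSearch:
    """Stack of search terms; an entry is shown only if it matches all of them."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.filters = []

    def search(self, term):
        """Add a new filter on top of the existing ones."""
        self.filters.append(term.lower())

    def cancel(self):
        """Remove the most recently added filter, if there is one."""
        if self.filters:
            self.filters.pop()

    def visible(self):
        """Return the entries that match every filter currently on the stack."""
        return [entry for entry in self.entries
                if all(term in entry.lower() for term in self.filters)]

# Hypothetical playlist labels.
playlist = PlaylistSearch(["Super Alpha", "Super Beta", "Mega Alpha"])
playlist.search("super")   # first filter: two entries remain
playlist.search("beta")    # second filter stacked on top: one entry remains
playlist.cancel()          # drops only the last filter ("beta")
print(playlist.visible())
```

With no filters left on the stack, `visible()` returns the whole playlist again.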

Content loading animations

A new “Load Content” Startup Notification option has been added under Settings > On-Screen Display > On-Screen Notifications. When enabled, a brief animation is shown whenever content is launched.

  • The animation is disabled when running a core without content (there are some underlying technical issues that prevent this)
  • The animation is disabled when running content with ‘in-built’ cores (imageviewer, music/video player).
  • The animation works both for content launched via the menu and via the command line

Easy dropdown lists for input remapping

Before, when remapping inputs via the Quick Menu (Quick Menu > Controls > Port N Controls), the user had to press left/right ad nauseam in order to select an appropriate core input for each controller button. This was somewhat cumbersome, and highly awkward on touchscreen devices.

In this new version we have added drop-down lists to the input remap entries. The user can now press OK (or keyboard enter, or tap/click the entry) to open a list of available inputs.

This works for both controller and keyboard Device Types.

In addition:

  • You can now use the RetroPad start button to reset (unbind) controller keyboard remaps (when Device Type is a keyboard type, obviously!)
  • Fixed a bug whereby pressing RetroPad select on certain entries would spawn a message box with no text – this would be invisible, and would therefore appear to ‘hang’ the menu.

    FFMpeg/Video Player Improvements

    • A new progress overlay bar has been added to the ffmpeg core (embedded in RetroArch for Windows/Linux). When you skip forwards or backwards via the directional keys or D-pad, you will see this interface appear for a brief period of time.
    • Lockups could occur when playing videos if a forwards seek operation would take the target playback position past the end of the file. We have added the following workaround: seek operations are limited to a point 1 second before the end of the file, and if the user attempts to seek past the end then playback of the file will restart from the beginning.
    • Solved several big memory leaks upon opening videos
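
The seek workaround described above boils down to a small clamping rule. The sketch below is a guess at its shape, not the ffmpeg core's real code: `clamp_seek` and its parameters are hypothetical, and treating a target exactly at the end of the file as "not past the end" is an assumption.

```python
def clamp_seek(position: float, delta: float, duration: float,
               margin: float = 1.0) -> float:
    """Return the new playback position (seconds) after a seek request.

    Hypothetical helper: `margin` is the 1-second safety gap before the
    end of the file that the workaround keeps clear.
    """
    target = position + delta
    if target > duration:
        # Seeking past the end restarts playback from the beginning.
        return 0.0
    # Otherwise keep the position within [0, duration - margin].
    latest = max(0.0, duration - margin)
    return min(max(0.0, target), latest)


print(clamp_seek(50.0, 10.0, 120.0))   # ordinary forward seek
print(clamp_seek(115.0, 4.5, 120.0))   # lands inside the last second, so it is clamped
print(clamp_seek(115.0, 10.0, 120.0))  # past the end, so playback restarts
```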

    iOS/tvOS Metal Renderer

    RetroArch on iOS/tvOS now supports Apple’s Metal graphics API. Slang shader support is already implemented and all software rendered cores should work.

    Currently the iOS/tvOS build will default to OpenGL because of a few outstanding issues that still have to be resolved:

    • Slang shaders degrade performance noticeably on most cores
    • When audio is interrupted, it doesn’t resume with the Metal renderer (OpenGL seems ok)

    Other general menu improvements

    • The RGUI menu now shows boolean settings as a ‘toggle switch’ if you have the setting ‘Show Switch Icons’ enabled (Settings -> User Interface -> Appearance).
    • Previously, RetroArch would have the bad habit of resetting the selection cursor to the first entry in the menu after returning from almost every list of selectable values for a setting. For example, if you go to Settings -> Drivers -> Audio and change the audio driver (or press Back), the selection cursor will be reset to the first entry of the Drivers menu instead of the Audio item from where we were originally. This has been fixed.
    • People previously complained that it was possible to set drivers to ‘null’ that are necessary for RetroArch to work, such as ‘Menu’ and ‘Video’. It’s now no longer possible to set a driver to ‘null’ unless there is no driver available for this type.
    • MaterialUI – The Playlist screen now shows icons of the associated system.
    • The three separate Scan Directory/Scan File/Manual Scan entries are now moved into a submenu called ‘Import Content’. They are no longer shown in any ‘top level’ menu. This unclutters the Playlist screen.
    • You can now selectively hide/enable widget notifications of several types. For instance, autoconfig messages, load content animation popups, cheat code notifications, fastforward notifications, screenshot indications, and more can all be individually configured. You no longer have to disable widgets altogether if you dislike one of the widget UI elements; you can now disable just the widget that you don’t like.
    • (Windows only) Screen resolution dropdown list improvements – it no longer includes multiple duplicate entries.
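
The ‘null’ driver protection mentioned in the list above amounts to a simple rule: offer ‘null’ only when nothing else exists for that driver type. A minimal sketch, assuming a hypothetical `selectable_drivers` helper and made-up driver names (RetroArch's own logic lives in its C menu code and may differ):

```python
from typing import List


def selectable_drivers(available: List[str]) -> List[str]:
    """Return the drivers a user may pick for one driver type.

    'null' is hidden whenever at least one real driver is available,
    so the user cannot lock themselves out of the menu or video output.
    """
    real = [name for name in available if name != "null"]
    return real if real else ["null"]


print(selectable_drivers(["gl", "vulkan", "null"]))  # 'null' is filtered out
print(selectable_drivers(["null"]))                  # nothing else exists, so 'null' stays
```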

    File I/O/Memory improvements

    A significant amount of time has been spent reducing RetroArch’s memory footprint and reducing disk I/O overhead when doing menial operations such as loading a configuration file or a playlist inside RetroArch.

    Changelog

    There’s much more to this release than meets the eye. See the CHANGELOG below for a more detailed breakdown.

    1.9.0

    • 3DS: Fix sound crackling when paused
    • ANDROID/VIBRATION: Fixes “Vibrate on Key Press” having no effect on Android devices, which occurred because only the off time/strength was defined in what should have been a pair of off/on values
    • AUTOCONFIG: Ensure correct directory is used when saving autoconfig profiles
    • BLUETOOTH: Add a Bluetooth driver (Lakka-only for now)
    • CHEATS: Fix for wrong number of remaining cheat search matches on some machines
    • CHEEVOS: Option to play sound on achievement unlock.
    • CHEEVOS: Upgrade to rcheevos 9.1
    • CHEEVOS: Restore display of unlocked achievements across hardcore modes
    • CHEEVOS: Hash buffered data when available
    • CHEEVOS: Fix ‘Auto Save State freezes RetroArch while Cheevos is enabled’
    • CORE OPTIONS: Pressing OK (or clicking/tapping) on a ‘boolean toggle’ core option no longer opens a drop-down list. The value now toggles directly, just like boolean options everywhere else in the menu
    • CORE OPTIONS: Toggling an option that changes the number of core options being displayed (i.e. things like ‘Show Advanced Audio/Video Settings’) no longer resets the navigation pointer to the start of the list
    • CORE OPTIONS: Before, RetroArch would identify core option values as being ‘boolean’ if they had labels matching the specific strings enabled or disabled. Most core devs would abide by this, but not always… As a result, we sometimes would end up with misidentified values, with all kinds of Enabled, Off, True, etc. strings littering the menu, in place of proper toggle switches. All boolean-type value labels are now detected, and replaced with standard ON/OFF strings.
    • CLI: A new command line option `--load-menu-on-error` has been added
    • CRT: On-the-fly CRT porch adjustments – these changes allow a user to adjust how the porch algorithm generates the 15kHz/31kHz output, making it possible to change over/underscan.
    • CONFIG FILE: Optimise parsing of configuration files
    • D3D9/D3D11: Fix core-initiated D3D9/D3D11 driver switches
    • DRIVERS: Implemented protection to avoid setting critical drivers to nothing thus preventing the user from locking him/herself out of the program
    • EMSCRIPTEN: Fix input code to ignore unknown keys
    • FFMPEG CORE: Prevent seeking past the end of files (hang fix)
    • FILE I/O: VFS and NBIO interfaces will now use 64-bit fseek/ftell where possible, should allow for reading/writing to files bigger than 2GB
    • INPUT MAPPING/REMAPPING: Add input remap drop-down lists
    • IOS: Fixed iOS 6 version
    • IOS: Hide the home indicator as it obscures the content too frequently
    • IOS/METAL: Metal video driver now works on RetroArch iOS
    • IOS/METAL: Support getting video metrics to support proper touchscreen interactions
    • LOCALIZATION: Updates for several languages (synchronized from Crowdin)
    • MEMORY/LINUX/ANDROID: Fix reporting of free memory
    • MEMORY/WINDOWS: Fix reporting of free memory
    • MENU: Enlarged INT/UINT selection limit from 999 to 9999
    • MENU: Fix cursor forced to first entry after displaying lists
    • MENU: Make Notification Font option visible when Graphics Widgets are enabled
    • MENU/RGUI: Add optional ‘toggle switch’ icons
    • MENU/WIDGETS: Add optional widget-based ‘load content’ launch feedback animation
    • MENU/WIDGETS: Make notification font size option visible when graphics widgets are enabled
    • ODROID GO ADVANCE: Video driver – fix race condition with RGUI callback
    • PLAYLISTS: Change playlists to use dynamic arrays. Instead of a fixed initial 12MB memory allocation (99999 * 128 byte (on 64bit arch)), use a dynamically growing array
    • PLAYLISTS: Playlist base content directory paths – portable playlists
    • PLAYLISTS/SEARCH: Enhanced playlist search functionality
    • PLAYLISTS/DATABASE: Add ‘Explore’ view
    • PLAYLISTS/DATABASE/EXPLORE: Show system icons in explore view
    • PS2: Improve FPS Limiter
    • RUNAHEAD: Prevent runahead from being disabled permanently when an error occurs
    • SCANNER: Add more region codes for GameCube/Wii game detection
    • SHADERS/SLANG: Increased Slang max Parameters, Textures & Passes
    • VIDEO FILTERS/BLARGG: Make Blargg_snes filter customizable
    • WINDOWS/RAWINPUT: Fix invalid calls to dinput_handle_message when input driver is not set to dinput
    • X11: Add lightgun support
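
To make the CORE OPTIONS entry about boolean value labels a little more concrete, here is a rough sketch of that kind of normalisation. The label sets are illustrative guesses based on the strings named in the changelog (Enabled, Off, True and so on); the exact list RetroArch recognises is not given here, and `normalise_option_label` is a hypothetical name.

```python
# Illustrative label sets only; the real list of recognised strings is an assumption.
TRUE_LABELS = {"enabled", "on", "true"}
FALSE_LABELS = {"disabled", "off", "false"}


def normalise_option_label(label: str) -> str:
    """Map any boolean-looking core option label to a standard ON/OFF string."""
    key = label.strip().lower()
    if key in TRUE_LABELS:
        return "ON"
    if key in FALSE_LABELS:
        return "OFF"
    return label  # not a boolean value, so it is shown unchanged


for raw in ["Enabled", "Off", "True", "2x"]:
    print(raw, "->", normalise_option_label(raw))
```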

    Other news

    We will follow this up with more news soon on all the recent core developments that have been going on. As usual with these blog posts, there’s a lot we don’t have time to touch on when it comes to bugfixes, improvements and additions to the Libretro/RetroArch project.

    New and improved versions of the Dolphin and Citra cores will be coming soon. The PPSSPP core is now being actively updated again and should be up-to-date with the Git upstream repository. In addition to this, there are several other big new things that will be discussed soon.

New PlayStation1 core DuckStation now available for RetroArch!


Stenzek’s new PlayStation1 emulator DuckStation is now available as a Libretro core on RetroArch! The author of this promising new PlayStation1 emulator made the core by himself and has included it in his upstream repo.

DuckStation is a totally new PlayStation 1 (aka PSX) emulator focusing on playability, speed, and long-term maintainability. Accuracy is not the main focus of the emulator, but the goal is to be as accurate as possible while maintaining performance suitable for low-end devices. “Hack” options are discouraged, the default configuration should support all playable games with only some of the enhancements having compatibility issues. A “BIOS” ROM image is required to start the emulator and to play games. You can use an image from any hardware version or region, although mismatching game regions and BIOS regions may have compatibility issues. A ROM image is not provided with the emulator for legal reasons, you should dump this from your own console using Caetla or other means. DuckStation includes hardware rendering (OpenGL, Vulkan and D3D11), upscaling and 24-bit color and a 64-bit dynarec.

It is currently available on the Libretro buildbot for the following platforms:

  • Windows
  • Linux
  • Android (AArch64-only)

As soon as a commit is pushed on Stenzek’s repository, the libretro buildbot will compile a new build, and it should from there be available shortly for all RetroArch users.

Features

  • Relatively high degree of compatibility
  • Has three hardware renderers: OpenGL, Vulkan, and Direct3D11
  • Allows you to internally upscale the resolution
  • Has a dynamic recompiler and cached interpreter CPU core
  • Ability to run PSX CDROM emulation on a separate thread, reducing frame time spikes

    How to get it

    There are two ways to install and/or update the DuckStation core:

    a – If you have already installed the core before, you can go to Online Updater and select ‘Update Installed Cores’.

    b – If you haven’t installed the core yet, go to Online Updater, ‘Core Updater’, and select ‘Sony – PlayStation (DuckStation)’ from the list. It will then download and install this core.

    BIOS required

    DuckStation, like Beetle PSX, requires a real BIOS in order to work. Unlike PCSX ReARMed, it has no HLE BIOS to fall back on.

    A “BIOS” ROM image is required to start the emulator and to play games. You can use an image from any hardware version or region, although mismatching game regions and BIOS regions may have compatibility issues. A ROM image is not provided with the emulator for legal reasons, you should dump this from your own console using Caetla or other means.

    Recognized BIOS images:

    • scph5500.bin
    • scph5501.bin
    • scph5502.bin

    Benchmarks

    System specs: CPU – Intel Core i7 7700k | GPU – Geforce RTX 2080 Ti (11GB VRAM, 2018) | 16GB RAM

    We’ve performed some basic performance tests between DuckStation and Beetle PSX HW. We are using the same baseline resolution (1x) for both cores, and we try to make the test as fair as possible by disabling features such as PGXP and texture filtering for Beetle PSX HW (both features which DuckStation lacks).

    DuckStation

    Title         | Direct3D11 | Vulkan | OpenGL
    Tekken 3      | 396fps     | 414fps | 335fps
    Alien Trilogy | 538fps     | 552fps | 499fps

    Beetle PSX HW

    NOTE: Beetle PSX HW does not have a Direct3D 11 renderer

    Title         | Vulkan | OpenGL
    Tekken 3      | 326fps | 229fps
    Alien Trilogy | 480fps | 473fps

    As you can see, DuckStation is faster in every test we ran: by roughly 15% to 27% with Vulkan, and by roughly 5% to 46% with OpenGL. It does have to be said that Beetle PSX HW right now has some unique features that DuckStation lacks, such as PGXP and texture replacement.
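
For anyone who wants to check the comparison, the relative speedups can be worked out directly from the two tables above. The snippet below only re-uses the published frame rates; the `speedup_percent` helper is just a name chosen for this example.

```python
# Frame rates copied from the benchmark tables (fps at 1x resolution).
duckstation = {
    ("Tekken 3", "Vulkan"): 414, ("Tekken 3", "OpenGL"): 335,
    ("Alien Trilogy", "Vulkan"): 552, ("Alien Trilogy", "OpenGL"): 499,
}
beetle_psx_hw = {
    ("Tekken 3", "Vulkan"): 326, ("Tekken 3", "OpenGL"): 229,
    ("Alien Trilogy", "Vulkan"): 480, ("Alien Trilogy", "OpenGL"): 473,
}


def speedup_percent(new_fps: float, old_fps: float) -> float:
    """How much faster new_fps is than old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# Only renderers that both cores offer are compared (Beetle PSX HW has no Direct3D 11).
speedups = {key: speedup_percent(duckstation[key], beetle_psx_hw[key])
            for key in beetle_psx_hw}

for (title, renderer), pct in speedups.items():
    print(f"{title:14s} {renderer:7s} +{pct:.1f}%")
```

Bear in mind that these are uncapped frame rates on a single high-end machine, so the percentages say more about headroom than about what you will see on low-end hardware.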

    Conclusion

    You should definitely give DuckStation a go if you want a high performance PlayStation1 emulator. If you find Beetle PSX HW to be running too slowly for you on your system, you should check if DuckStation is faster instead.

(Coming Soon) RetroArch 1.9.0 – Widget-based ‘load content’ animation

A new “Load Content” Startup Notification option has been added under Settings > On-Screen Display > On-Screen Notifications. When enabled, a brief animation is shown whenever content is launched – it looks something like this –

Notes:

  • The animation is disabled when running a core without content (there are some underlying technical issues that prevent this)
  • The animation is disabled when running content with ‘in-built’ cores (imageviewer, music/video player).
  • The animation works both for content launched via the menu and via the command line

Mupen64Plus-Next – v2.1

The long-anticipated big update to Mupen64Plus-Next has finally arrived!

Important Information and notes

Beforehand, be warned that the core name changed.
As you probably know, up until now the flavour (whether it’s a GLES or GL build) was appended to the core name, which caused the frontend to categorize each flavour as a separate core. Now that Vulkan support has been added, this would break remaps, game-specific core options, etc. anyway, so I decided to just kill it and append the flavour to the version instead (there was never a good reason why I added it to the name to begin with…).
Now a new folder named `Mupen64Plus-Next` will be created inside your config folder on first start.
You can move and rename your existing core config override, core options and shader presets there, named accordingly (Mupen64Plus-Next.cfg/opt/slangp…).

RetroArch Nintendo Switch Notes

With the development of the threaded renderer support we noticed a few issues in our platform-specific audio drivers, especially audren_thread, that will cause some cores, most often multithreaded cores, to randomly freeze. We have a fix for this in the pipeline, which also nearly halves our current audio latency.
Due to time constraints, though, I didn’t get to push the fix yet, and it needs more testing.
So, for now, I recommend switching to the `switch_thread` audio driver until the issues are fixed.
Another core where it’s likely to happen is PPSSPP, so if you encounter random freezes, give it a try; the only thing you will lose is audio in in-game recordings.

GLideN64

This new version of Mupen64Plus-Next should be up-to-date with the most recent versions of GLideN64.
Here are some highlights, which are now available in the libretro-core as well!

Threaded Renderer

There’s now a ‘threaded rendering’ option for the libretro core. Enabling this can significantly increase performance, at the expense of slightly more input latency.
It has been available upstream for a while, but the implementation doesn’t play well with how a libretro core works.
I started work on it sometime mid last-year and after more than a dozen iterations and months of testing, it’s now ready for production.
An enormous shoutout to fzurita, who originally came up with the implementation!

How to use it


To use it, go to Quick Menu, Options. Make sure you have set ‘RDP Plugin’ to ‘GLideN64’ (the setting will not do anything with Angrylion and/or ParaLLEl RDP). Then turn ‘Threaded Rendering’ either on or off, and restart the core (Close Content, then load the content again with the core).
Please note: I am aware that switching between fullscreen and windowed mode currently crashes when a game is running with the threaded renderer (the same applies to changing Video Threaded in RetroArch). A fix is on my to-do list.

Benchmarks

Tests were performed on a Core i7 7700k desktop PC with a Geforce RTX 2080 Ti.

Game           | Non-Threaded | Threaded   | Resolution
Super Mario 64 | 719 VI/s     | ~1000 VI/s | 2x Native Resolution
Super Mario 64 | 701 VI/s     | ~1000 VI/s | 4x Native Resolution
Super Mario 64 | 742 VI/s     | 780 VI/s   | 3840 x 2880

NOTE: These tests were performed with hyper threading enabled and CPU throttling, so take these figures with a grain of salt. The main takeaway is that, in this test, threaded rendering is nearly 300 VI/s (roughly 40%) faster than non-threaded rendering at 2x and 4x native resolution, while the gain at 3840 x 2880 is much smaller (about 5%).

This feature will significantly help platforms like Nintendo Switch and Raspberry Pi.

Dithering

In the past, HLE renderers have not really attempted to implement dithering (of course, with LLE RDP renderers you get it for free). An N64 game is typically rendered using a 16bit color buffer, and dithering is then used to reduce color banding and create the illusion of a higher color depth. GLideN64 in the past has always used 32bit rendering.

There are several new core options available:

  • Dithering Pattern
  • Dithering Quantization
  • RDRAM Image Dithering Mode

If you use native N64 resolution, you also may enable a dithering pattern to get a more authentic look, but even if you like to play in HD, this is something worth trying out!
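
To illustrate why dithering helps with a 16bit color buffer, here is a minimal sketch of generic ordered dithering. It is not GLideN64's or the RDP's actual algorithm: the 4x4 Bayer matrix, the assumption of 5 bits per colour channel and the `quantize_8_to_5` helper are all choices made for this example.

```python
# 4x4 Bayer threshold matrix (values 0..15), a common ordered-dithering pattern.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def quantize_8_to_5(value: int, x: int, y: int, dither: bool = True) -> int:
    """Reduce an 8-bit channel value to 5 bits, optionally with ordered dithering."""
    if dither:
        # Dropping 3 bits discards 8 levels, so add a position-dependent
        # offset in the range 0..7 before truncating.
        offset = BAYER_4X4[y % 4][x % 4] // 2
        value = min(255, value + offset)
    return value >> 3


# A flat area of 8-bit value 100 sits halfway between 5-bit levels 12 and 13.
plain = [quantize_8_to_5(100, x, y, dither=False) for y in range(4) for x in range(4)]
dithered = [quantize_8_to_5(100, x, y) for y in range(4) for x in range(4)]
print(sorted(set(plain)), sorted(set(dithered)), sum(dithered) / 16)
```

Without dithering every pixel in the flat area collapses to the same level, which is where banding comes from; with dithering the pattern mixes the two neighbouring levels so the area averages out close to the original shade.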

ParaLLEl RDP

By now you’ve heard all about the revolutionary Vulkan-powered ParaLLEl RDP renderer, which debuted first in ParaLLEl N64. It is now included for the first time in Mupen64Plus-Next.

All the same features are available and more –

  • Compatibility on Android with ParaLLEl RDP should be much higher now as a result of a much more up-to-date mupen64plus-core. Games like Paper Mario, GoldenEye 007 and others would previously just crash on Android with ParaLLEl RDP+RSP.
  • Performance should be roughly ~5-10% faster on average than ParaLLEl N64. Sometimes a bit more.
  • Some compatibility issues that happened even on PC x86/x64 with ParaLLEl N64 are not an issue with Mupen64Plus-Next (such as Perfect Dark crashing at startup, Pokemon Snap graphics glitches, Mario no Photopi not working, Conker’s Bad Fur Day).

Over time we will probably repurpose ParaLLEl N64 and let Mupen64Plus-Next take center stage.

Improved Core Options

Sublabel descriptions have been added to the core options. I hope this will make them a bit less confusing.
Please note that it’s currently not possible to hide options on the fly, so depending on the build configuration this might get a bit cluttered.
I am looking for solutions, but this should be a great improvement over the last versions nonetheless!

Bugfixes and changes

– Android: Fixed garbage on the framebuffer with GLES3 (where the overscan would be)
– Android: Switched to “on Vertical Interrupt buffer swap mode” (might take slightly more perf) since the touch overlay was pretty much unusable without it
– Updated Parallel-RSP
^- Fix some stability issues in parallel-rsp on 64-bit
– Added Native Resfactor core option (set to disabled / 0 to use custom resolutions as you are used to)
^- Note: With Native Resfactor the resolution option will act as viewport size!
– Added Copy Aux to RDRAM core option
– Added a script to regenerate the INI Headers, updated to the latest variants
^- Note: It seems I still had cheats for the OoT subscreen fix and DK64 bone displacement from when I first wrote the core. These caused some issues after the problems were fixed in the core itself, so I got rid of them for good; it was an oversight.
– Remove CountPerOp=1 for Quake 2 and Goldfinger
^- Note: After speaking with some upstream folks, nobody knows why it was even forced to 1. It caused crippling performance on Android and Switch, and after hours of testing no game-breaking issue was found. In the future I might work on getting rid of Count-Per-Op for good; it’s a nasty approximation.
– Allow higher Count-Per-Op
– Nintendo Switch: Lowered Firmware version requirements
– Added support for linking against system libraries
– Fixed LLE Fallback falsely being treated as supported, fixes F-Zero X Expansion HLE
– Exposed Hybrid filtering

These fixes are incorporated in both ParaLLEl N64 and Mupen64Plus-Next:

  • Vigilante 8’s character portraits are no longer wrongly coloured.
  • Mario Tennis’ intro screen no longer has tons of graphics bugs

Dynarec Issues

Over the last months, testers repeatedly encountered freezes in Ocarina of Time. Gillou and I spent hours investigating the issue and tracked it to the dynarec.
Sadly, even syncing core instances and comparing each recompiled block with the working MSVC builds has led nowhere yet (though we found a few other issues in code invalidation, which might or might not have been a problem, as well as borked caller-saved regs…).
These fixes are still in the development stage and thus not included here. However, I brought back the good ol’ TLB Invalidation hack as a core option.
Setting it to the ‘Ignore TLB Exceptions if not using TLB’ option will allow the game to continue so you can save it and restart (for this issue you actually need to Close Content and start it again; a soft reset won’t be enough). You will notice it happens when you suddenly see Epona carrots. Of course this is not a fix, but as a side effect a bunch of broken romhacks also work, and it’s useful for the upcoming GDB Server implementation, so I figured I would add it anyway.
Take note that this is confirmed as a mupen64plus-core upstream issue and that it does not arise with the Cached or Pure Interpreter!

Differences between Mupen64Plus-Next and ParaLLEl N64

  • ParaLLEl N64 has the following RDP plugins: Glide64, GLN64, Rice, Angrylion, ParaLLEl RDP. Glide64, GLN64, and Rice are aimed more at the low end of graphics cards.
  • Mupen64Plus-Next has the following RDP plugins: GlideN64, Angrylion, ParaLLEl RDP.
    GLideN64 should be the best-in-class HLE RDP renderer, but might have higher performance and GL requirements than the lower-end Glide64/GLN64/Rice from ParaLLEl N64.
  • Both ParaLLEl N64 and Mupen64Plus-Next have the same RSP plugins (HLE, cxd4 LLE interpreter, and ParaLLEl RSP)
  • ParaLLEl N64 uses the Hacktarux dynarec for x86 32bit/64bit, and new_dynarec for ARM. Mupen64Plus-Next uses new_dynarecs for both x86 and ARM architectures, and tends to be a bit faster as a result.
  • ParaLLEl N64 has some built-in game specific alternate control schemes that you can switch on/off with the Select button. Mupen64Plus-Next does not have this yet.

Conclusion

Moving forward, we recommend you use Mupen64Plus-Next if you want to use LLE N64 (ParaLLEl RDP/RSP) with the highest compatibility and best performance.
Also, GLideN64 (provided your graphics card meets the OpenGL requirements) will work better than ParaLLEl N64’s equivalents.
Furthermore, Mupen64Plus-Next has an up-to-date version of mupen64plus-core, so it tends to have fewer game compatibility issues, and the sound is better in games like Body Harvest.

ParaLLEl N64 might get repurposed towards the lower end as a result.

As a final note I want to give my thanks to dmrlawson for giving me a helping hand, fzurita for being very helpful, gonetz and his contributors for doing an awesome job with GLideN64, and Gillou68310 for all the hours he put in helping me investigate the dynarec issues (also thanks to Thom Rainier for never getting tired of OoT testing), as well as themaister for his work on Parallel RSP/RDP and the Vulkan implementation in Mupen64Plus-Next!

– m4xw

RetroArch 1.8.9 released!


RetroArch 1.8.9 has just been released.

Grab it here.

A Libretro Cores Progress Report will follow later.

Remember that this project exists for the benefit of our users, and that we wouldn’t keep doing this were it not for spreading the love with our users. This project exists because of your support and belief in us to keep going doing great things. If you’d like to show your support, consider donating to us. Check here in order to learn more. In addition to being able to support us on Patreon, there is now also the option to sponsor us on Github Sponsors! You can also help us out by buying some of our merch on our Teespring store!

Highlights

AI Service – Custom accessibility service support

The AI service feature has included new changes to allow closer integration between the service selected and the game being played, allowing the service to read and press gamepad buttons along with the current screen image. The example video above shows a custom service (still in development) designed to make Final Fantasy 1 accessible and playable by blind users.

When started, the AI service will continually parse the screen and describe what’s being shown. When in a town or overworld view, it will describe what’s around the player to the west, north, east, and south, as well as any new things of interest that have appeared on screen (eg: a townsperson, a weapon shop, treasure chest, etc.). When the emulator is paused, it will give a more detailed description of what’s on the screen, including how far the player can walk in all directions and all things of interest along with their coordinates relative to the player. If the player holds the select button at this time, then the AI service will read out the list of things of interest on the screen and allow the player to scroll through them and select one. When selected, the AI service will unpause the game and move the player to that thing and interact with it.

When on a menu or battle screen, the service will read out the text on the screen and the currently selected menu option.

We will have more information on this for you soon after the initial testing and feedback is over.

Core Management Options

  • The software license of each core is now shown in the ‘Core Downloader’ and ‘Load Core’ screen.
  • Pressing RetroPad Select on a Core Updater entry will now display any text in the description field of its info file
  • Installed cores are now highlighted via a [#] symbol
  • Pressing RetroPad Start on a selected, installed entry opens the Core Information menu (when using Material UI, swiping left or right triggers the same action). This means we can now view bios info etc. – and more importantly delete cores – without jumping through all the hoops of loading a core first and navigating all over the place
  • It’s now possible to hide ‘Experimental Cores’ from being shown in the ‘Core Downloader’ menu screen.

Backup cores when updating

By default now, a backup of the current Libretro core will be made when you upgrade a core from RetroArch’s builtin Updater service. In addition, you can also ‘freeze’ a core. ‘Freeze’ in this context means that the Updater service will not be able to overwrite your current core with the latest version from the Updater service.
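
The backup and ‘freeze’ behaviour described above can be pictured with a small sketch. Everything here is hypothetical (the `update_core` function, the `.bak` naming and the way the frozen flag is passed in); RetroArch's Updater stores and handles these things in its own way.

```python
import shutil
from pathlib import Path


def update_core(core: Path, new_build: Path, backup_dir: Path,
                frozen: bool = False) -> bool:
    """Replace `core` with `new_build`, keeping a backup of the old file.

    Returns True if the core was updated, False if it was left alone.
    """
    if frozen:
        # A frozen core must never be overwritten by the updater.
        return False
    if core.exists():
        # Hypothetical naming scheme: keep the previous build next to a .bak suffix.
        backup_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(core, backup_dir / (core.name + ".bak"))
    shutil.copy2(new_build, core)
    return True
```

Checking the frozen flag before anything else means a frozen core is never touched, not even to create a backup.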

Vulkan WSI improvements

Some platforms have had problems with our Vulkan WSI (Window System Integration) code, which version 1.8.9 partly addresses. This should theoretically reduce stalls on integrated GPUs.

  • Intel Mesa was broken when using Fences, we have to use Semaphores to acquire the swapchain or the entire GPU stalls.
  • Add support for either using fences or semaphores when syncing.
  • Prefer using semaphores for integrated GPUs (such as Intel HD) as it promotes better throughput over fences.
  • Do not use mailbox emulation on Android.
  • Also, to make this work, decouple frame index from swapchain index with regards to CPU-side synchronization. Before, swapchain index would be coupled with frame context, which is somewhat naive.
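
Two of the ideas in the list above can be modelled without any Vulkan code at all. The sketch below is a toy model only: `Gpu`, `pick_sync_primitive` and `FrameContexts` are made-up names, and using fences as the default on non-integrated GPUs is an assumption read into "prefer semaphores for integrated GPUs".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Gpu:
    name: str
    integrated: bool


def pick_sync_primitive(gpu: Gpu) -> str:
    """Choose how to wait for swapchain image acquisition."""
    # Integrated GPUs (e.g. Intel HD) prefer semaphores; the fence default
    # for everything else is an assumption of this sketch.
    return "semaphore" if gpu.integrated else "fence"


class FrameContexts:
    """CPU-side frame contexts that cycle on their own, regardless of which
    swapchain image index the presentation engine hands back."""

    def __init__(self, count: int) -> None:
        self.count = count
        self.current = 0

    def next_frame(self, swapchain_image: int) -> Tuple[int, int]:
        context = self.current
        self.current = (self.current + 1) % self.count
        # The frame context and the swapchain image are tracked separately.
        return context, swapchain_image


frames = FrameContexts(count=2)
acquired = [2, 0, 1, 2]  # image indices may arrive in any order
pairs = [frames.next_frame(image) for image in acquired]
print(pick_sync_primitive(Gpu("Intel HD", integrated=True)), pairs)
```

The point of the decoupling is visible in `pairs`: the frame context simply alternates, whatever image index the swapchain happens to return.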

Changelog

What you’ve read above is just a small sampling of what 1.8.9 has to offer. There might be things that we forgot to list in the changelog listed below, but here it is for your perusal regardless.

1.8.9

  • AUTO SAVESTATES: Ensure save states are correctly flushed to disk when quitting RetroArch (fixes broken save states when exiting RetroArch – without first closing content – with ‘Auto Save State’ enabled)
  • BUILTIN CORES: Builtin cores like ffmpeg and imageviewer would previously try to erroneously load a dynamic core named ‘builtin’ – this would fail and would just be a wasteful operation – this now skips dylib loading in libretro_get_system_info for builtin cores
  • CHEEVOS: Report API errors when unlocking achievements or submitting leaderboards
  • CHEEVOS: Support less common file extensions
  • CHEEVOS: Disable hardcore mode when playing BSV file
  • CHEEVOS: Correctly report unlocked non-hardcore achievements when hardcore is paused
  • CHEEVOS/M3U: Bugfix – did not handle absolute/relative paths in M3U files correctly before
  • CHEEVOS/M3U: Bugfix – did not handle comments/directives before
  • CHEEVOS/M3U: Bugfix – did not handle trailing whitespace before
  • CHEEVOS/M3U: Bugfix – failed when loading M3U files with certain line endings
  • CORE MANAGEMENT: Add ‘core management’ menu (Settings -> Core)
  • CORE MANAGEMENT: Add option to backup/restore installed cores
  • CORE MANAGEMENT: Improved core selection logic
  • CORE INFO: Search optimisations
  • CORE DOWNLOADER: Rename ‘Core Updater’ to ‘Core Downloader’
  • CORE DOWNLOADER: Add ‘Show Experimental Cores’ setting under Settings > Network > Updater
  • CORE DOWNLOADER: Core licenses are now shown for all entries in the Core Updater menu
  • CORE DOWNLOADER: Pressing RetroPad select on a Core Updater entry will now display any text in the description field of its info file
  • CORE DOWNLOADER: Installed cores are now highlighted via a [#] symbol
  • CORE DOWNLOADER: Pressing RetroPad start on a selected, installed entry opens the Core Information menu (when using Material UI, swiping left or right triggers the same action). This means we can now view bios info etc. – and more importantly delete cores – without jumping through all the hoops of loading a core first and navigating all over the place
  • CORE DOWNLOADER/UPDATER: Add option to automatically backup cores when updating
  • DISK CONTROL: Enable ‘Load New Disc’ while disk tray is open
  • INPUT: Added a hotkey delay option to allow hotkey input to work properly when it is assigned to another action
  • INPUT: Remove ‘All Users Control Menu’ setting, was buggy and will be properly reintroduced after input overhaul
  • LINUX: Set default saves/save states/system paths
  • LOCALIZATION: Add Persian language
  • LOCALIZATION: Add Hebrew language
  • LOCALIZATION: Add Asturian language
  • MENU: Proper line wrapping for message dialog boxes
  • MENU/HOTKEYS: Add sublabels to all hotkey bind entries
  • MENU/QUICK MENU: Suppress the display of ’empty’ quick menu listings when closing content
  • MENU/OZONE: Performance improvements
  • MENU/SDL: Add mouse controls
  • OPENGL1/VITA: Initial changes for HW context without FBO
  • OVERLAYS: Add options for moving the on-screen overlay
  • PLAYLISTS/WINDOWS: Fix core path entries in image/video/music history playlists
  • PS2: Add back CDFS support
  • SDL/GL: Advertise GLSL support
  • VIDEO/WIDGETS: Fix heap-use-after-free errors, leading to memory corruption
  • VITA: Added custom bubbles support
  • VITA: VitaGL update
  • VULKAN/WSI: Better frame pacing
  • VULKAN/WSI: Fix Intel Mesa being broken when using Fences, we have to use Semaphores to acquire the swapchain or the entire GPU stalls
  • VULKAN/WSI: Add support for either using fences or semaphores when syncing
  • VULKAN/WSI: Prefer using semaphores for integrated GPUs as it promotes better throughput over fences
  • VULKAN/WSI/ANDROID: Do not use mailbox emulation on Android
  • UWP/XBOX: Potentially improve performance by enabling ‘Game Mode’
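
The CHEEVOS/M3U entries in the changelog above cover the usual pitfalls of reading M3U playlists. As a rough illustration of what tolerant parsing looks like (this is not the actual cheevos code, and `parse_m3u` is a hypothetical helper), consider:

```python
import os
from typing import List


def parse_m3u(text: str, m3u_dir: str) -> List[str]:
    """Return the content paths listed in an M3U playlist body."""
    paths = []
    for raw in text.splitlines():   # copes with \n, \r\n and \r line endings
        line = raw.strip()          # drops trailing (and leading) whitespace
        if not line or line.startswith("#"):
            continue                # skip blank lines, comments and #EXT directives
        if not os.path.isabs(line):
            # Relative entries are resolved against the playlist's own folder.
            line = os.path.join(m3u_dir, line)
        paths.append(os.path.normpath(line))
    return paths


base = os.path.abspath("games")
body = "#EXTM3U\r\ndisc1.cue  \r\n\r\n# second disc\r\nsub/disc2.cue\n"
print(parse_m3u(body, base))
```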

ParaLLEl-RDP – How the upscaled rendering works

This is a technical article on how upscaling in LLE works on the N64 RDP. Accurate upscaling in LLE is something which has not been done before (it has been done in a HLE framework, but accurate is the key word here), due to its extremely intense performance requirements, but with paraLLEl-RDP running on the GPU with Vulkan, this is now practical, and the results are faithful to what N64 games would look like if games rendered at a very high resolution. There are no compromises on accuracy, and I believe this is a correct representation of upscaling in a “what-if” scenario. The changes required to add this were actually fairly minimal, and there aren’t really any hacks involved. However, we have to be somewhat conservative in what we attempt to enhance.

Main concepts

Unified Memory Architecture – fully accurate frame buffer behavior

A complicated problem with the N64 is that the RDP and CPU have a unified memory architecture, and this complicates things a lot. We must assume that the CPU can read arbitrary pixels that the RDP rendered, and that the CPU can overwrite pixels written by the RDP earlier. In upscaling, this gets weird very quickly since the CPU does not understand upscaling. To support this, the GPU renders everything twice, first in the native domain, and then in the upscaled domain. With this approach, the CPU cannot observe that upscaling is happening. It also improves performance in synchronous mode, since we can just render at native resolution before we unblock the CPU, and the GPU can go on to render the upscaled render passes asynchronously, which takes longer.

Rasterization at sub-pixel precision

The core mathematical problem to solve for upscaling is how we are going to rasterize at sub-pixel precision. This gets somewhat interesting, since the RDP is fully defined in fixed-point, and there is limited precision available. Fortunately, there are enough bits of precision that we can add extra sub-pixel precision to the rasterization equations. 8x is the theoretical maximum upscaling we can achieve without going beyond 32-bit fixed-point math. 8x is complete overkill; 2x and 4x are more than enough anyway.

Instancing RDRAM

Given the requirement above, paraLLEl-RDP directly implements a unified memory architecture (UMA) where the GPU reads and writes directly into RDRAM. This ensures full accuracy, and this is usually where HLE fails, as implementing UMA at this level is not practical with the traditional graphics pipeline in GPUs. To extend paraLLEl-RDP’s approach to upscaling, I went with multiple copies of RDRAM, one copy for each sub-sample. This works really well, because at any time, if we detect that a write happens in an unscaled context, e.g. CPU writes, we can simply duplicate samples up to the upscaled domain. This is essentially a kind of faux MSAA where each pixel has multiple samples associated with it. This is the memory we end up allocating for a 4x upscale (4×4 = 16 samples):

  • RDRAM (8 MB) – Allocated on host with VK_EXT_external_memory_host. This is fully coherent with emulated CPU.
  • Hidden RDRAM (4 MB) – Device local
  • RDRAM reference buffer (8 MB) – Device local
  • Multisampled RDRAM (8 * 16 MB) – Device local
  • Multisampled Hidden RDRAM (4 * 16 MB) – Device local
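As a quick sanity check, the allocations listed above can be tallied. This is only arithmetic on the figures quoted here. The function name and the generalisation to other scale factors (one copy per sample, samples = factor squared) are assumptions made for illustration, not something taken from the renderer's source.

```python
def rdram_allocation_mb(scaling_factor: int) -> dict:
    """Tally the buffers listed above, in MB, for a given upscale factor."""
    samples = scaling_factor * scaling_factor  # e.g. 4x4 = 16 for a 4x upscale
    sizes = {
        "RDRAM (host)": 8,
        "Hidden RDRAM": 4,
        "RDRAM reference buffer": 8,
        "Multisampled RDRAM": 8 * samples,
        "Multisampled Hidden RDRAM": 4 * samples,
    }
    sizes["total"] = sum(sizes.values())
    return sizes


budget = rdram_allocation_mb(4)
print(budget["Multisampled RDRAM"])  # 128
print(budget["total"])               # 212
```

So the RDRAM copies alone come to roughly 212 MB at 4x under these assumptions. This says nothing about other allocations the renderer makes.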

The reference buffer is there so we can track when CPU writes to RDRAM. Essentially, before we render anything on the GPU, we compare RDRAM against the reference buffer. If there is a difference, the CPU must have clobbered the pixel, and the RDRAM is now duplicated to all the samples of RDRAM. After rendering something, we update the reference buffer, so we know it’s safe to use upscaled pixels later.
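The reference-buffer idea can be sketched on the CPU in a few lines. This is a toy model only: the real work happens on the GPU, and the function names, the per-byte granularity and the use of `bytearray` are all assumptions chosen to keep the example small.

```python
def sync_cpu_writes(rdram, reference, samples):
    """Before rendering: copy any byte the CPU changed into every sample copy.

    Returns the list of addresses that were found to be clobbered.
    """
    dirty = []
    for addr, value in enumerate(rdram):
        if value != reference[addr]:
            for sample in samples:
                sample[addr] = value  # duplicate up to the upscaled domain
            dirty.append(addr)
    return dirty


def after_gpu_render(rdram, reference):
    """After rendering: remember what RDRAM looked like when we last wrote it."""
    reference[:] = rdram


# Toy setup: 8 bytes of RDRAM, 4 sample copies (a 2x upscale).
rdram = bytearray(8)
reference = bytearray(rdram)
samples = [bytearray(8) for _ in range(4)]

rdram[3] = 0xAB  # the emulated CPU writes behind the renderer's back
dirty = sync_cpu_writes(rdram, reference, samples)
after_gpu_render(rdram, reference)
print(dirty)  # [3]
```

Once the reference buffer matches RDRAM again, a second call to `sync_cpu_writes` finds nothing to do, which is the "safe to use upscaled pixels" state described above.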

When rendering an upscaled pixel (X, Y), we convert the coordinate to its native pixel by dropping the low bits, and convert the sub-pixel (those low bits) to an RDRAM instance, e.g.:

ivec2 upscaled_pixel = ivec2(x, y);
ivec2 subpixel = upscaled_pixel & (SCALING_FACTOR - 1);
ivec2 native_pixel = upscaled_pixel >> SCALING_LOG2;
int rdram_instance = subpixel.y * SCALING_FACTOR + subpixel.x;
read_write_rdram(native_pixel, rdram_instance);
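To make the mapping concrete, here is the same arithmetic as a small Python sketch with one worked value. It assumes a 4x upscale (SCALING_LOG2 = 2); the function name is made up for the example.

```python
SCALING_LOG2 = 2                     # assumed: 4x upscale
SCALING_FACTOR = 1 << SCALING_LOG2


def map_upscaled_pixel(x, y):
    """Return (native_pixel, rdram_instance) for an upscaled coordinate."""
    sub_x = x & (SCALING_FACTOR - 1)
    sub_y = y & (SCALING_FACTOR - 1)
    native_pixel = (x >> SCALING_LOG2, y >> SCALING_LOG2)
    rdram_instance = sub_y * SCALING_FACTOR + sub_x
    return native_pixel, rdram_instance


print(map_upscaled_pixel(5, 6))  # ((1, 1), 9)
```

Upscaled pixel (5, 6) lands in native pixel (1, 1), sub-pixel (1, 2), which is RDRAM copy number 2 * 4 + 1 = 9.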

Upscaled VI interface

Adding upscaling to the VI interface is fairly straightforward since we can convert e.g. 16 samples back to a 4×4 block of pixels. From there, we just follow the exact same algorithms that we do for native rendering. This means we get correct VI AA, divot and de-dither happening at high resolution.
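Turning the samples back into a block of pixels is just the inverse of the instance mapping used when rendering. A minimal sketch, assuming the same `sub_y * SCALING_FACTOR + sub_x` ordering and representing each RDRAM copy as a plain Python list (both are illustration choices, not the renderer's actual data layout):

```python
def gather_block(samples, native_index, scaling_factor):
    """Rebuild the scaling_factor x scaling_factor block for one native pixel.

    samples[i] is RDRAM copy i; native_index addresses a pixel inside a copy.
    """
    block = []
    for sub_y in range(scaling_factor):
        row = [
            samples[sub_y * scaling_factor + sub_x][native_index]
            for sub_x in range(scaling_factor)
        ]
        block.append(row)
    return block


# 16 copies holding 2 pixels each; values encode (instance, pixel) for clarity.
samples = [[inst * 10 + n for n in range(2)] for inst in range(16)]
block = gather_block(samples, 1, 4)
print(block[0])     # [1, 11, 21, 31]
print(block[2][1])  # 91
```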

Modifying rasterization rules

The RDP is a span rasterizer, a very classic design. The rasterization rules are extremely specific and cannot be accurately represented using normal OpenGL/Vulkan triangle rasterization rules, which are based on barycentric plane equations (to the best of my knowledge you can only approximate).

The RDP receives pre-computed triangle setup data from the RSP. We specify three lines with the triangle setup, where one line is the “major” line XH, and a second line is picked from the two “minor” lines XM/XL, depending on y >= YM. Two values YH and YL limit which scanlines we should render. This lets us implement triangles, or more complicated primitives if we want to. Bisqwit made a really cool ongoing video series on software rendering a while back which also implements a span rasterizer, which is very useful to watch if you want a deeper understanding of this approach.

This triangle setup data is defined more specifically as:

  • XH, XM, XL: 32-bit values in the format of s12.15.x. The 4 MSB are sign-extended, and the single LSB is ignored (we can exploit this bit for more precision later!)
  • dXHdy, dXMdy, dXLdy: 32-bit values in the format of s12.13.xxx. 4 MSBs are sign-extended, and 3 LSBs are ignored. This represents the slope of the line for XH, XM and XL.
  • YH: This is an s12.2 value which represents the first scanline we render. There are 2 bits of subpixel precision, which is very useful because the RDP will sample coverage for 4 sub-scanlines per scanline.
  • YM: This s12.2 value represents the first sub-scanline where XL is selected as the minor line, otherwise XM is used.
  • YL: This represents the final sub-scanline which is rendered. The sub-scanline of YL is not included in rasterization.
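For the Y values in the list above, the two fractional bits are simply the sub-scanline index. A small sketch of how such a value splits apart, and of the `& ~(SUBPIXELS - 1)` masking that rounds YH down to the start of its scanline. The helper names are hypothetical and sign handling of the raw register is left out.

```python
SUBPIXELS = 4  # sub-scanlines per scanline (2 fractional bits)


def split_y(y_fixed):
    """Split an s12.2-style value into (scanline, sub_scanline)."""
    return y_fixed >> 2, y_fixed & (SUBPIXELS - 1)


def interpolation_base(yh):
    """Round YH down to the first sub-scanline of its scanline."""
    return yh & ~(SUBPIXELS - 1)


yh = (10 << 2) | 3             # scanline 10, sub-scanline 3
print(split_y(yh))             # (10, 3)
print(interpolation_base(yh))  # 40
```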

The algorithm for native resolution in GLSL:

// Interpolate X at all 4 Y-subpixels.
// Check Y dimension.
int yh_interpolation_base = int(setup.yh) & ~(SUBPIXELS - 1);
int ym_interpolation_base = int(setup.ym);

int y_sub = int(y * SUBPIXELS);
ivec4 y_subs = y_sub + ivec4(0, 1, 2, 3);

// dxhdy and others are (setup value >> 2) since we're stepping one sub-scanline at a time, not whole lines. This is why more LSBs are ignored for the slopes.
ivec4 xh = setup.xh + (y_subs - yh_interpolation_base) * setup.dxhdy;
ivec4 xm = setup.xm + (y_subs - yh_interpolation_base) * setup.dxmdy;
ivec4 xl = setup.xl + (y_subs - ym_interpolation_base) * setup.dxldy;
xl = mix(xl, xm, lessThan(y_subs, ivec4(setup.ym)));

ivec4 xh_shifted = quantize_x(xh); // A very specific quantizer, see source ...
ivec4 xl_shifted = quantize_x(xl);

ivec4 xleft, xright;
if (flip) // Flip is a bit set in triangle setup to mark primitive winding.
{
    xleft = xh_shifted;
    xright = xl_shifted;
}
else
{
    xleft = xl_shifted;
    xright = xh_shifted;
}

We have now computed a range of which pixels to render for each sub-scanline, where [xleft, xright) is the range. If xright <= xleft, the sub-scanline does not receive coverage. The quantizer is somewhat esoteric, but we essentially quantize X down to 8 sub-pixels of precision (>> 13). This is used later for multi-sampled coverage in the X dimension.
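For readers who do not have a GLSL toolchain handy, the native-resolution edge walk can be mimicked on the CPU. This is a rough Python transcription, not the actual implementation: the setup values are passed in a plain dict, and `quantize_x_stub` is a plain `>> 13` standing in for the real quantizer, which is more specific than that.

```python
def quantize_x_stub(x):
    # Stand-in only: a bare shift down to 8 sub-pixels of precision.
    # The real quantizer is more involved (see the renderer's source).
    return x >> 13


def subscanline_spans(setup, y, flip, subpixels=4):
    """Return (xleft, xright, has_coverage) for the 4 sub-scanlines of line y."""
    yh_base = setup["yh"] & ~(subpixels - 1)
    ym_base = setup["ym"]
    spans = []
    for y_sub in range(y * subpixels, (y + 1) * subpixels):
        # Slopes are assumed to be per sub-scanline already (setup value >> 2).
        xh = setup["xh"] + (y_sub - yh_base) * setup["dxhdy"]
        xm = setup["xm"] + (y_sub - yh_base) * setup["dxmdy"]
        xl = setup["xl"] + (y_sub - ym_base) * setup["dxldy"]
        minor = xm if y_sub < setup["ym"] else xl  # XM above YM, XL from YM on
        major_q, minor_q = quantize_x_stub(xh), quantize_x_stub(minor)
        xleft, xright = (major_q, minor_q) if flip else (minor_q, major_q)
        spans.append((xleft, xright, xright > xleft))
    return spans


# Toy triangle: major edge starts at x=10 and drifts right, minor edge at x=20.
setup = {"xh": 10 << 16, "xm": 20 << 16, "xl": 20 << 16,
         "dxhdy": 1 << 13, "dxmdy": 0, "dxldy": 0, "yh": 0, "ym": 8}
for span in subscanline_spans(setup, 0, flip=True):
    print(span)
```

With these made-up values the left edge moves by one eighth of a pixel per sub-scanline (80, 81, 82, 83 in quantized units) while the right edge stays at 160. Passing `flip=False` swaps the edges, so every sub-scanline ends up with `xright <= xleft` and no coverage.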

To add upscaling, the modifications are straightforward.

int yh_interpolation_base = int(setup.yh) & ~(SUBPIXELS - 1);
int ym_interpolation_base = int(setup.ym);
yh_interpolation_base *= SCALING_FACTOR;
ym_interpolation_base *= SCALING_FACTOR;

int y_sub = int(y * SUBPIXELS);
ivec4 y_subs = y_sub + ivec4(0, 1, 2, 3);

// Interpolate X at all 4 Y-subpixels.
ivec4 xh = setup.xh * SCALING_FACTOR + (y_subs - yh_interpolation_base) * setup.dxhdy;
ivec4 xm = setup.xm * SCALING_FACTOR + (y_subs - yh_interpolation_base) * setup.dxmdy;
ivec4 xl = setup.xl * SCALING_FACTOR + (y_subs - ym_interpolation_base) * setup.dxldy;
xl = mix(xl, xm, lessThan(y_subs, ivec4(SCALING_FACTOR * setup.ym)));

This is an accurate representation, as the only thing we do here is shift more bits into triangle setup. As long as this does not overflow, we’re golden. After this step, we have scissoring. Scissor coordinates are u10.2 fixed point, which means the maximum resolution for the RDP is 1024×1024. With 8x upscale and 8 sub-pixels of X precision, we can barely pack the resulting range into unsigned 16 bits without overflow.
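One way to read the "barely fits" remark is to work out the largest X value the scissor can produce. The sketch below is an interpretation of the numbers in the paragraph above, not code from the renderer, and the function name is hypothetical.

```python
def max_packed_x(upscale, x_subpixels=8):
    """Largest scissor X, expressed in upscaled sub-pixel units."""
    max_quarter_pixels = (1 << 12) - 1  # largest u10.2 value: 4095 (1023.75 px)
    # Quarter pixels -> upscaled pixels -> units of 1/x_subpixels.
    return max_quarter_pixels * upscale * x_subpixels // 4


U16_MAX = (1 << 16) - 1
print(max_packed_x(8), max_packed_x(8) <= U16_MAX)    # 65520 True
print(max_packed_x(16), max_packed_x(16) <= U16_MAX)  # 131040 False
```

At 8x the largest value is 65520, just under the 65535 limit of an unsigned 16-bit integer. A hypothetical 16x would not fit.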

Modifying varying interpolation

Attribute interpolation is a little more interesting. There are 8 varyings, which all have the same setup data:

  • Shade Red/Green/Blue/Alpha
  • S
  • T
  • 1/W
  • Z

Each varying has 4 values:

  • Base value – sampled at coordinate (XH, YH) (kinda … it’s complicated)
  • dVdx – Change in value for 1 pixel in X dimension
  • dVde – Change in value when following the major axis down one line, and sampling at the next line’s XH. Basically dVde = dVdx * dXdy + dVdy. I’m not sure why this even exists, it makes the interpolation math a little easier I suppose?
  • dVdy – This feels very redundant, but it is what it is. It is only used for coverage fixup and LOD computation.

We cannot shift in extra bits here, unlike rasterization, so we have to be a little creative here. To stay faithful, and avoid overflow, we need to ensure that the interpolation is correct for each sample point which matches sample points for native resolution, and for the inner sub-pixels, we remove some bits of precision in the derivative. Essentially, instead of doing something like this (not the correct math, see code, here for brevity):

int base_interpolated_x = ((setup.xh + (y - base_y) * setup.dxhdy)) >> 16;
rgba = attr.rgba;
int dy = y - base_y;
int dx = x - base_interpolated_x;
rgba += dy * attr.drgba_de;
rgba += dx * attr.drgba_dx;

we do …

int base_interpolated_x = ((setup.xh + (y - base_y) * setup.dxhdy)) >> 16;
rgba = attr.rgba;
int dy = y - base_y;
int dx = x - base_interpolated_x;
rgba += (dy >> SCALING_LOG2) * attr.drgba_de + (dy & (SCALING_FACTOR - 1)) * (attr.drgba_de >> SCALING_LOG2);
rgba += (dx >> SCALING_LOG2) * attr.drgba_dx + (dx & (SCALING_FACTOR - 1)) * (attr.drgba_dx >> SCALING_LOG2);

The added error here is microscopic.
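To put a number on "microscopic", the split step can be compared against the exact fractional result. This is a sketch of the idea only, with a made-up derivative value and a 4x upscale assumed; Python's `>>` on negative integers floors in the same way as an arithmetic shift.

```python
from fractions import Fraction

SCALING_LOG2 = 2                     # assumed: 4x upscale
SCALING_FACTOR = 1 << SCALING_LOG2


def step_upscaled(delta, derivative):
    """Attribute change for `delta` upscaled pixels, split as in the text."""
    whole = (delta >> SCALING_LOG2) * derivative
    inner = (delta & (SCALING_FACTOR - 1)) * (derivative >> SCALING_LOG2)
    return whole + inner


d = 1003  # hypothetical per-native-pixel derivative, in fixed-point LSBs
print(step_upscaled(8, d))  # 2006, identical to 2 native pixels * 1003
print(step_upscaled(7, d))  # 1753, exact value would be 1755.25

worst = max(
    Fraction(delta * d, SCALING_FACTOR) - step_upscaled(delta, d)
    for delta in range(64)
)
print(float(worst))  # 2.25
```

Whenever the upscaled coordinate lands on a native sample point the result is bit-identical to native rendering. In between, the error stays below SCALING_FACTOR least-significant bits of the fixed-point attribute.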

Workarounds

Some games do not work correctly when we upscale, since the game never intended to render sub-pixels. This usually comes into play in two major scenarios, which we need to work around.

Using LOD for clever hackery

The mip-mapping on N64 is quite flexible, and sometimes two entirely different textures represent LOD 0 and LOD 1 for smooth distance based effects. When upscaling with e.g. 4x, we essentially get a LOD factor which is a LOD bias of -2 (log2(1/4)). An optional workaround is to compensate by applying a positive LOD bias ourselves to emit LOD levels the game expects. Ideally, this workaround is applied only in places where it’s needed.
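The bias figures follow directly from the scale factor. A tiny sketch of the arithmetic (the function names are illustrative only):

```python
import math


def lod_bias_from_upscale(scaling_factor):
    """Implicit LOD bias introduced by rendering at a higher resolution."""
    return math.log2(1.0 / scaling_factor)


def compensating_bias(scaling_factor):
    """Positive bias that cancels the implicit one."""
    return -lod_bias_from_upscale(scaling_factor)


print(lod_bias_from_upscale(4))  # -2.0
print(compensating_bias(4))      # 2.0
```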

Sprite rendering / TEX_RECT

Many games render sprites with TEX_RECT with the expectation that textures are rendered 1:1 with input texels to output texels. When we start upscaling, the game might have forgotten to disable bilinear filtering, and we start filtering outside the texture boundaries, i.e., against garbage, which shows up as ugly seams in the image. The simple workaround is to render TEX_RECT primitives as if they are not upscaled. This is necessary anyway for the COPY pipe, since the COPY pipe only updates the varying interpolator every 8th framebuffer byte. We cannot safely upscale these kinds of primitives either way.

Conclusion

There isn’t much more to it. Adding upscaling to ParaLLEl-RDP was not all that complicated compared to the other insanity that went into making this renderer work. It’s a principled approach to the upscaling which I believe could theoretically work in a custom RDP hardware design.

ParaLLEl RDP updates

Several important things got fixed in the latest core updates for ParaLLEl N64.

Changes

– A new deinterlacing system has been implemented that should be significantly better than the old one. Especially noticeable in games like Resident Evil 2, Turok 2’s high-res mode, Daikatana’s high-res mode, Star Wars Episode 1 Racer’s high-res mode, and tons of other games.

It just blits with a linear filter with a Y offset based on field state. Very basic, but seems good enough. Avoids the worst aspects of bob and weave.

Crop overscan added

A game rendered with Angrylion and/or ParaLLEl RDP usually retains the black borders of the screen that went unused. With the ‘Crop Overscan’ option, you can strip these away from the final output image.

See here a good example – Daikatana in high-res mode. By default, it has big borders.

Steam Launch lineup revealed

So, it’s been a long time since we (prematurely) announced our intent to launch RetroArch on Steam. We’re nearing the finish line now, however, so now is as good a time as any to start discussing how things are going to roll out.

Will launch on Windows first (Linux later)

We will be releasing on Windows first, with a release on Linux scheduled later (no ETA).

We are trying to limit our support burden at launch here since we are (understandably) concerned about the large amount of support requests and feedback we are bound to receive. Adding Linux right off the bat would further exacerbate that.

10 Cores Available On Launch Day

We have decided to launch with 10 cores. These cores have already been approved and uploaded on Steam. They are as follows:

There will be no ‘Core Downloader’ in RetroArch, or anything that is not hosted on Steam in fact. To obtain cores, you need to install cores separately that we provide as ‘DLC’. These are all free just like RetroArch itself.

NOTE: We need to stress – on its own, without installing any of the cores, the most you will be able to do with RetroArch is watch some movie files and playback music files through its builtin ffmpeg core. To make it do anything else, you will have to install cores.

Differences between regular RetroArch and Steam version

Apart from these aforementioned changes, there will be no substantial differences for now in the Steam version. Even though we have consistently improved the User Experience and tried to make things more accessible, we understand that we will still be in for a lot of criticism over the initial learning curve. We have pretty much resigned ourselves to the fact that this will happen, and we will brace for impact and do as much as we can with the criticism that will inevitably pile on. We will do our best to be as receptive to the feedback as possible, with the thickest skin possible, and try to make some much-needed UI changes.

This is also what helped inform our decision to go with 10 cores. We could have launched with over 60 cores, sure, but the ensuing fallout would have been a mess and it would have been near impossible to focus on bug reports and issues piling in. By focusing on 10 cores, we can do some much-needed Quality Control where issues inevitably get picked up, we can respond to it and in the process improve the quality of the core. This kind of isolated feedback time with a specific batch of cores is something we have found ourselves in the past always lacking, since it was always off to do the Next Big Thing as new features, cores, and other developments are made on an almost weekly basis. This gives us the much-needed time to focus on a specific batch of cores and polish them before we move on to the next batch of cores.

Beetle PSX HW – Experimental texture replacement now available!

DISCLAIMER: Libretro as a group or entity has no affiliation or involvement in the creation of assets contained in any texture pack

So this has been a project that has been cooking in the oven for about a year in the form of a bounty. The goal is to come up with a way to not only dump all the textures of a PlayStation1 game, but also to replace them with user-supplied textures.

Doing this is hard with PlayStation renderers due to the general low level of abstraction of these renderers, which is why it’s not exactly a commonplace feature in many PS1 emulators.

So far we have let it cook slowly in the oven. However, people recently released a Proof Of Concept demo in the form of a Chrono Cross texture pack, and a modified Beetle PSX HW core that adds support for custom texture injection has been circulating. This has led us to include the feature in the buildbot cores already rather than wait it out. We hope that by doing this, the feature can grow organically and that more people start taking an active interest in making their own texture packs for their own favorite content. Libretro is all about giving people the power and freedom to do what they want with their legally bought content, after all.

Requirements/Availability

Should only require the Vulkan renderer and a graphics card that is compatible with the Vulkan API. Will not work with either OpenGL or software rendering.

Android, Linux and Windows are all supported targets.

How to get it

The usual. Either you have Beetle PSX HW already installed, in which case you would just go to RetroArch’s Online Updater and select ‘Update Cores’. In case you don’t have it already installed, go to ‘Online Updater’, select ‘Core Updater’ or ‘Core Downloader’ (depends on the version of RetroArch you’re using), and then download Beetle PSX HW.

Explanation of core options

Two new core options have been added.

Dump Textures

While the game is running, it will dump all currently active textures it comes across to a directory. The name of this folder is [gamename]-texture-replacements, and it will be dumped inside the same dir that your content (ISO or other image format) comes from.

Replace Textures

It will attempt to use all HD textures from the ‘texture-replacements’ directory. The name of this folder is [gamename]-texture-replacements, and it will try to read this directory from the same dir that your content (ISO or other image format) comes from.
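If you want to script the folder setup, the location described above can be computed from the content path. Note the big assumption here: the sketch treats [gamename] as the content file name without its extension, which this article does not spell out, so verify against what the core actually creates when you enable ‘Dump Textures’. The function name and the example path are made up.

```python
from pathlib import PurePosixPath


def texture_replacement_dir(content_path):
    """Guess the replacement folder that sits next to a content file.

    Assumes [gamename] is the file name without its extension.
    """
    content = PurePosixPath(content_path)
    return content.with_name(content.stem + "-texture-replacements")


print(texture_replacement_dir("/games/psx/mygame.cue"))
# /games/psx/mygame-texture-replacements
```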

NOTE: Later on, we might add another option that allows you to point the dumping and injection path to somewhere else. Right now this is a problem for instance when you have your content stored on a slow disk device like a HDD but you want your texture replacement files to be read from your much faster but smaller SSD instead. Right now, you are forced to move the image to your SSD as well, because otherwise it just dumps and/or reads these texture replacement files from the same dir as the image, in this case the mechanical harddrive.

How to make it work

Make sure you have the textures extracted already in your [gamename]-texture-replacements dir, and make sure that the dir is in the same dir that your game content file (ISO or other image format) comes from.

Start Beetle PSX HW, make sure that you are using the Vulkan renderer (it won’t work with either the software renderer or GL renderer), and then make sure the ‘Replace Textures’ option is enabled.

If it works properly, you’ll start seeing low-resolution textures replaced by higher-resolution ones.

Screenshots

Future

We hope to provide you with an article in the near future that goes into how to create your own texture pack for a game.

Is the format set in stone? Is it complete? Probably no to both. It is a Work-In-Progress. However, we hope that by putting it out there already, the community can already start experimenting with the option, putting it through its paces, and see what its limitations are and how far it can be pushed.

paraLLEl N64 – Low-level RDP upscaling is finally here!

ParaLLEl RDP this year has singlehandedly caused a breakthrough in N64 emulation. For the first time, the very CPU-intensive accurate Angrylion renderer was lifted from CPU to GPU thanks to the powerful low-level graphics API Vulkan. This combined with a dynarec-powered RSP plugin has made low-level N64 emulation finally possible for the masses at great speeds on modest hardware configurations.

ParaLLEl RDP Upscaling

Jet Force Gemini running with 2x internal upscale

It quickly became apparent after launching ParaLLEl RDP that users have grown accustomed to seeing upscaled N64 graphics over the past 20 years. So something rendering at native resolution, while obviously accurate, bit-exact and all, was seen as unpalatable to them. Many users indicated over the past few weeks that upscaling was desired.

Well, now it’s here. ParaLLEl RDP is the world’s first Low-Level RDP renderer capable of upscaling. The graphics output you get is unlike any HLE renderer you’ve ever seen before for the past twenty years, since unlike them, there is full VI emulation (including dithering, divot filtering, and basic edge anti-aliasing). You can upscale in integer steps of the base resolution. When you set resolution upscaling to 2x, you are multiplying the input resolution by 2x. So 256×224 would become 512×448, 4x would be 1024×896, and 8x would be 2048×1792.
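The resolution figures above are plain multiplication, which a short sketch makes easy to repeat for other base resolutions. The function name is illustrative, and the allowed factors are the ones the core option exposes (1x, 2x, 4x, 8x).

```python
def upscaled_resolution(width, height, factor):
    """Internal resolution for a given native resolution and upscale factor."""
    if factor not in (1, 2, 4, 8):
        raise ValueError("upscaling factor must be 1, 2, 4 or 8")
    return width * factor, height * factor


for factor in (2, 4, 8):
    print(factor, upscaled_resolution(256, 224, factor))
# 2 (512, 448)
# 4 (1024, 896)
# 8 (2048, 1792)
```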

Now, here comes the good stuff with LLE RDP emulation. As said before, unlike so many HLE renderers, ParaLLEl RDP fully emulates the RCP’s VI Interface. As part of this interface’s postprocessing routines, it automatically applies an approximation of 8x MSAA (Multi-Sampled Anti-Aliasing) to the image. This means that even though our internal resolution might be 1024×896, this will then be further smoothed out by this aggressive AA postprocessing step.

Super Mario 64 running on ParaLLEl RDP with 2x internal upscale

This results in even games that run at just 2x native resolution looking significantly better than the same resolution running on an HLE RDP renderer. Look for instance at this Mario 64 screenshot here with the game running at 2x internal upscale (512×448).

How to install and set it up

RDP upscaling is available right now on Windows, Linux, and Android. We make no guarantees as to what kind of performance you can expect across these platforms; this is all contingent on your GPU’s Vulkan drivers and its compute power.

Anyway, here is how you can get it.

  • In RetroArch, go to Online Updater.
  • (If you have paraLLEl N64 already installed) – Select ‘Update Installed Cores’. This will update all the cores that you already installed.
  • (If you don’t have paraLLEl N64 installed already) – go to ‘Core Updater’ (older versions of RA) or ‘Core Downloader’ (newer version of RA), and select ‘Nintendo – Nintendo 64 (paraLLEl N64)’.
  • Now start up a game with this core.
  • Go to the Quick Menu and go to ‘Options’. Scroll down the list until you reach ‘GFX Plugin’. Set this to ‘parallel’. Set ‘RSP plugin’ to ‘parallel’ as well.
  • For the changes to take effect, we now need to restart the core. You can either close the game or quit RetroArch and start the game up again.

In order to upscale, you need to first set the Upscaling factor. By default, it is set to 1x (native resolution). Setting it to 2x/4x/8x then restarting the core makes the upscaling take effect.

Explanation of core options

A few new core option features have been added. We’ll briefly explain what they do and how you can go about using them.

  • (ParaLLEl-RDP) Upscaling factor (Restart)

Available options: 1x, 2x, 4x, 8x

The upscaling factor for the internal resolution. 1x is default and is the native resolution. 2x, 4x, and 8x are all possible. NOTE: It bears noting that 8x requires at least 5GB/6GB VRAM on your GPU. System requirements are steep for 8x and we generally don’t recommend anything less than a 1080 Ti or better for this. Your mileage may vary, just be forewarned. 2x and 4x by comparison are much lighter. Even when upscaling, the renderer is still rendering at full accuracy, and it is still all software rendered on the GPU. 4x upscale means 16 times the work that 1x Angrylion would churn through.

  • (paraLLEl-RDP) Downsampling

Available options: Disabled, 1/2, 1/4, 1/8

Also known as SSAA, this works pretty similarly to the SSAA downscaling feature in Beetle PSX HW’s Vulkan renderer. The idea is that you internally upscale at a higher resolution, then set this option from ‘Disabled’ to any of the other values. What happens from there is that this internal higher resolution image is then downscaled to either half its size, one quarter of its size, or one eighth of its size. This gives you a very smoothed out anti-aliased picture that for all intents and purposes still outputs at 240p/240i. From there, you can apply some frontend shaders on top to create a very nice and compelling look that still looks better than native resolution but is also still very faithful to it.

So, if you would want 4x resolution upscaling with 4x SSAA, you’d set ‘Downsample’ to ‘1/2’. With 4x upscale, and 1/4 downsample, you get 240p output with 16x SSAA, which looks great with CRT shaders.
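The relationship between the two options can be written down in a couple of lines. This sketch only restates the arithmetic from the examples above (a 1/N downsample averages N×N internal pixels per output pixel). The function name is made up and it does not touch any RetroArch setting.

```python
def downsample_result(upscale, divisor):
    """Return (output scale relative to native, SSAA samples per output pixel).

    `divisor` is 2 for '1/2', 4 for '1/4', 8 for '1/8'.
    """
    if divisor > upscale:
        raise ValueError("cannot downsample below native resolution")
    return upscale // divisor, divisor * divisor


print(downsample_result(4, 2))  # (2, 4)
print(downsample_result(4, 4))  # (1, 16)
```

So 4x upscale with ‘1/2’ gives a 2x-sized output with 4x SSAA, and 4x upscale with ‘1/4’ gives native-sized output with 16x SSAA.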

  • (paraLLEl-RDP) Use native texture LOD when upscaling

This option is disabled by default.

We have so far only found one game that absolutely required this to be turned on for gameplay purposes. If you don’t have this enabled, the Princess-to-Bowser painting transition in Mario 64 is not there and instead you just see Bowser in the portrait from a far distance. There might be further improvements later to attempt to automatically detect these cases.

Most N64 games didn’t use mipmapping, but the ones that do on average benefit from this setting being off – you get higher quality LOD textures instead of a lower-quality LOD texture eventually making way for a more detailed one as you look closer. However, turning this option on could also be desirable depending on whether you favor accurate looking graphics or a facsimile of how things used to look.

  • (paraLLEl-RDP) Use native resolution for TEX_RECT

This option is on by default.

2D elements such as sprites are usually rendered with TEX_RECT commands, and trying to upscale them inevitably leads to ugly “seams” in the picture. This option forces native resolution rendering for such sprites.

 

Managing expectations

It’s important that people understand what the focus of this renderer is. There is no intent to have yet another enhancement-focused renderer here. This is the closest thing to date to a full software-rendered reimplementation of Angrylion on the GPU, with additional niceties like upscaling. The renderer guarantees bit-exactness: what you see is what you would get on a real N64, no exceptions.

With a HLE renderer, the scene is rendered using either OpenGL or Vulkan rasterization rules. Here, neither is done – the exact rasterization steps of the RDP are followed instead, there are no API calls to GL to draw triangles here or there. So how is this done? Through compute shaders. It’s been established that you cannot correctly emulate the RDP’s rasterization rules by just simply mapping it to OpenGL. This is why previous attempts like z64gl fell flat after an initial promising start.

So the value proposition here for upscaling with ParaLLEl RDP is quite compelling – you get upscaling with the most accurate renderer this side of Angrylion. It runs well thanks to Vulkan, you can upscale all the way to 8x (which is an insane workload for a GPU done this way). And purists get the added satisfaction of seeing for the first time upscaled N64 graphics using the N64’s entire postprocessing pipeline finally in action courtesy of the VI Interface. You get nice dither filtering that smooths out really well at higher resolutions and can really fake the illusion of higher bit depth. HLE renderers have a lot of trouble with the kind of depth cuing and dithering being applied on the geometry, but ParaLLEl RDP does this effortlessly. This causes the upscaled graphics to look less sterile, whereas with traditional GL/Vulkan rasterization, you’d just see the same repeated textures everywhere with the same basic opacity everywhere. Here, we get dithering and divot filtering creating additional noise to the image leading to an overall richer picture.

So basically, the aim here is actually emulating the RDP and RSP. The focus is not on getting the majority of commercial games to just run and simulating the output they would generate through higher level API calls.

Won’t be done – where HLE wins

Therefore, the following requests will not be pursued at least in the near future:

* Widescreen rendering – Can be done through game patches (ASM patches applied directly to the ROM, or bps/ups patches or something similar). Has to be done on a per-game basis, with HLE there is some way to modify the view frustum and viewport dimensions to do this but it almost never works right due to the way the game occludes geometry and objects based on your view distance, so game patches implementing widescreen and DOF/draw distance enhancements would always be preferable.

So, in short, yes, you can do this with ParaLLEl RDP too, just with per-game specific patches. Don’t expect a core option that you can just toggle on or off.
* Rendering framebuffer effects at higher resolution – not really possible with LLE, don’t see much payoff to it either. Super-sampled framebuffer effects might be possible in theory.
* Texture resolution packs – Again, no. The nature of an LLE renderer is right there in the name, Low-Level. While the RDP is processing streams of data (fed to it by the RSP), there is barely any notion whatsoever of a ‘texture’ – it only sees TMEM uploads and tile descriptors which point to raw bytes. With High Level emulation, you have a higher abstraction level where you can ‘hook’ into the parts where you think a texture upload might be going on so you can replace it on the fly. Anyway, those looking for something like that are really at the wrong address with ParaLLEl RDP anyway. ParaLLEl RDP is about making authentic N64 rendering look as good as possible without resorting to replacing original assets or anything bootleg like that.
* Z-fighting/subpixel precision: In some games, there is some slight Z-fighting in the distance that you might see which HLE renderers typically don’t have. Again, this is because this is accurate RDP emulation. Z-fighting is a thing. The RDP only has 18-bit UNORM of depth precision with 10 bits of fractional precision during interpolation, and compression on top of that to squeeze it down to 14 bits. A HLE emulator can render at 24+ bits depth. Compounding this, because the RSP is Low-level, it’s sending 16-bit fixed point vertex coordinates to the RDP for rendering. A typical HLE renderer and HLE RSP would just determine that we are about to draw some 3D geometry and then just turn it into float values so that there is a higher level of precision when it comes to vertex positioning. If you recall, the PlayStation1’s GTE also did not deal with vertex coordinates in floats but in fixed point. There, we had to go to the effort of doing PGXP in order to convert it to float. I really doubt there is any interest to contemplate this at this point. Best to let sleeping dogs lie.
* Raw performance. HLE uses the hardware rasterization and texture units of the GPU which is far more efficient than software, but of course, it is far less accurate than software rendering.
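To give a feel for the depth precision gap mentioned in the Z-fighting item above, the bit widths can be turned into counts of distinct depth codes. This is deliberately naive: it treats every format as a flat count of representable values and ignores how the RDP's compression actually distributes them.

```python
def depth_levels(bits):
    """Number of distinct codes in a depth value of the given width."""
    return 1 << bits


rdp_interpolated = depth_levels(18)  # 18-bit UNORM during interpolation
rdp_stored = depth_levels(14)        # after compression to 14 bits
hle_typical = depth_levels(24)       # a 24-bit depth buffer

print(rdp_interpolated, rdp_stored, hle_typical)  # 262144 16384 16777216
print(hle_typical // rdp_stored)                  # 1024
```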

Where LLE wins

Conversely, there are parts where LLE wins over HLE, and where HLE can’t really go –

* HLE tends to struggle with decals, depth bias doesn’t really emulate the RDP’s depth bias scheme at all since RDP depth bias is a double sided test. Depth bias is also notorious for behaving differently on different GPUs.
* Correct dithering. A HLE renderer still has to work with a fixed function blending pipeline. A software rendered rasterizer like ParaLLEl RDP does not have to work with any pre-existing graphics API setup, it implements its own rasterizer and outputs that to the screen through compute shading. Correct dither means applying dither after blending, among other things, which is not something you can generally do [with HLE]. It generally looks somewhat tacky to do dithering in OpenGL. You need 8-bit input and do blending in 8-bit but dither + quantization at the end, which you can’t do in fixed function blending.
* The entire VI postprocessing pipeline. Again, it bears repeating that not only is the RDP graphics being upscaled, so is the VI filtering. VI filtering got a bad rep on the N64 because this overaggressive quasi-8x MSAA would tend to make the already low-resolution images look even blurrier. But at higher resolutions as you can see here, it can really shine. You need programmable blending to emulate the VI’s coverage, and this just is not practical with OpenGL and/or the current HLE renderers out there. The VI has a quite ingenious filter that distributes the dither noise where it reconstructs more color depth. So not only are we getting post-processing AA courtesy of the RDP, we’re also getting more color depth.
* What You See Is What You Get. This renderer is the hypothetical what-if scenario of what an N64 Pro unit would look like if it could pump out insane resolutions while still having the very same hardware. Notice that the RDP/VI implementations in ParaLLEl RDP have NOT been enhanced in any way. The only real change was modifying the rasterizer to test fractional pixel coordinates as well.
* Full accuracy with CPU readbacks. CPU can freely read and write on top of RDP rendered data, and we can easily deal with it without extra hacks.

Known issues

  • The deinterlacing process for interlaced video modes is still rather poor (just like Angrylion), basic bob and weave setup. There are plans to come up with a completely new system.
  • Mario Tennis glitches out a bit with upscaling for some reason. There might be subtle bugs in the implementation that only manifest in that game. This does not seem to happen on Nvidia Windows drivers though.

Screenshots

The screenshots below show ParaLLEl RDP running at its maximum internal input resolution, 8x the original native image. This means that when your game is running at say 256×224, it would be running at 2048×1792. But if your game is running at say 640×480 (some interlaced games actually set the resolution that high, Indiana Jones IIRC), then we’d be looking at 5120×3840. That’s bigger than 4K! Then bear in mind that you’re going to get the VI’s 8x MSAA on top of that, and you can probably begin to imagine just how demanding this is on your GPU given that it’s trying to run a custom software rasterizer on hardware. Suffice it to say, the demands for 2x and 4x will probably not be too steep, but if you’re thinking of using 8x, you better bring some serious GPU horsepower. You’ll need at least 5-6GB of VRAM for 8x internal resolution for starters.
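To make the resolution arithmetic above concrete, here is a small sketch. The function names are made up for illustration and are not part of ParaLLEl RDP; the code only multiplies the native size by the upscale factor and reports how many more pixels that is.

```python
def upscaled_resolution(width, height, factor):
    """Return the internal render size for a native size and an integer upscale factor."""
    return width * factor, height * factor


def pixel_ratio(width, height, factor):
    """Return how many times more pixels are rendered compared to native."""
    up_w, up_h = upscaled_resolution(width, height, factor)
    return (up_w * up_h) // (width * height)


# The two native resolutions mentioned in the text, at every upscale factor.
for native_w, native_h in [(256, 224), (640, 480)]:
    for factor in (2, 4, 8):
        w, h = upscaled_resolution(native_w, native_h, factor)
        ratio = pixel_ratio(native_w, native_h, factor)
        print(f"{native_w}x{native_h} at {factor}x -> {w}x{h} ({ratio}x the pixels)")
```

At 8x, every frame has 64 times as many pixels as native. 5120×3840 works out to roughly 19.7 million pixels, against roughly 8.3 million for a 3840×2160 4K frame.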

Anyway, without much further ado, here are some glorious screenshots. GoldenEye 007 now looks dangerously close to the upscaled bullshot images on the back of the boxart!

GoldenEye 007 running with ParaLLEl RDP at 8x internal upscale
Super Mario 64 running on ParaLLEl RDP with 8x internal upscale
Star Fox 64 running on ParaLLEl RDP with 8x internal upscale
Perfect Dark running on ParaLLEl RDP with 8x internal upscale in high-res mode
World Driver Championship running on ParaLLEl RDP with 8x internal upscale

Videos

Body Harvest

Perfect Dark

Legend of Zelda: Ocarina of Time

Super Mario 64

Coming to Mupen64Plus Next soon

ParaLLEl RDP will also be making its way into the upcoming new version of Mupen64Plus Next. Expect increased compatibility over ParaLLEl N64 (especially on Android) and potentially better performance in many games.

Future blog posts

There might eventually be some future blog post by Themaister going into more technical detail on the inner workings of ParaLLEl RDP. I will also probably release a performance test-focused blog post later testing a variety of different GPUs and how far we can take them as far as upscaling is concerned.

I can already tell you to temper your expectations with regards to Android/mobile GPUs. I tested ParaLLEl RDP with 2x upscaling on a Samsung Galaxy S10+ and performance was about 36fps, and that is with vsync off. With 1x native resolution I manage to get on average 64 to 70fps with the same games. So obviously mobile GPUs still have a lot of catching up to do with their discrete big brothers on the desktop.

At least it will make for a nice GPU benchmark for mobile hardware until we eventually crack fullspeed with 2x native!

Coming soon – paraLLEl N64 RDP – Resolution upscaling! (Video demonstration)

ParaLLEl RDP this year has singlehandedly caused a breakthrough in N64 emulation. For the first time, the very CPU-intensive accurate Angrylion renderer was lifted from CPU to GPU thanks to the powerful low-level graphics API Vulkan. This combined with a dynarec-powered RSP plugin has made low-level N64 emulation finally possible for the masses at great speeds on modest hardware configurations.

ParaLLEl RDP will be coming to Mupen64Plus Next soon

ParaLLEl RDP first made its debut in ParaLLEl N64, but it will soon make its way into the upcoming new version of Mupen64Plus Next too. Expect increased compatibility over ParaLLEl N64 (especially on Android) and potentially better performance in many games.

ParaLLEl RDP Upscaling

Jet Force Gemini running with 2x internal upscale

But that’s not what this article is going to be dedicated to. It quickly became apparent after launching ParaLLEl RDP that users have grown accustomed to seeing upscaled N64 graphics over the past 20 years. So something rendering at native resolution, while obviously accurate, bit-exact and all, was seen as unpalatable to them. Many users indicated over the past few weeks that upscaling was desired.

Well, you won’t have to wait too long, and as a demonstration, today we premiere an 11-minute-long YouTube video showcasing ParaLLEl RDP running at 4 times the native resolution. Given an input resolution of 256×224, that means the game is rendering internally at 1024×896.

Now, here comes the good stuff with LLE RDP emulation. Unlike so many HLE renderers, ParaLLEl RDP fully emulates the RCP’s VI Interface. As part of this interface’s postprocessing routines, it automatically applies the equivalent of 8x MSAA (Multi-Sampled Anti-Aliasing) to the image. This means that even though our internal resolution might be 1024×896, this will then be further smoothed out by this aggressive multisampling postprocessing step.

Super Mario 64 running on ParaLLEl RDP with 2x internal upscale

This results in even games that run at just 2x native resolution looking significantly better than the same resolution running on an HLE RDP renderer. Look for instance at this Mario 64 screenshot here with the game running at 2x internal upscale (512×448).

Screenshots

The screenshots below show ParaLLEl RDP running at its maximum internal input resolution, 8x the original native image. This means that when your game is running at say 256×224, it would be running at 2048×1792. But if your game is running at say 640×480 (some interlaced games actually set the resolution that high, Indiana Jones IIRC), then we’d be looking at 5120×3840. That’s bigger than 4K! Then bear in mind that you’re going to get the VI’s 8x MSAA on top of that, and you can probably begin to imagine just how demanding this is on your GPU given that it’s trying to run a custom software rasterizer on hardware. Suffice it to say, the demands for 2x and 4x will probably not be too steep, but if you’re thinking of using 8x, you better bring some serious GPU horsepower. You’ll need at least 5-6GB of VRAM for 8x internal resolution for starters.

Anyway, without much further ado, here are some glorious screenshots. GoldenEye 007 now looks dangerously close to the upscaled bullshot images on the back of the boxart!

GoldenEye 007 running with ParaLLEl RDP at 8x internal upscale
Super Mario 64 running on ParaLLEl RDP with 8x internal upscale
Star Fox 64 running on ParaLLEl RDP with 8x internal upscale
Perfect Dark running on ParaLLEl RDP with 8x internal upscale in high-res mode
World Driver Championship running on ParaLLEl RDP with 8x internal upscale

So where is it?

No ETAs, but it’s coming to you soon and will be available on RetroArch shortly for Windows, Linux and Android platforms. Stay tuned!

paraLLEl N64 RDP – Android support and Intel iGPU improvements – What you should know (and what to expect)

Ridge Racer 64 running on Parallel RDP on an Android phone (with RetroArch)

Themaister wrote an article a few days ago talking in-depth about all the work that has gone into ParaLLEl RDP since launch.

Two of the important things discussed in this article were:
* Intel iGPU performance
* Android support

What you might not have realized from reading the article is that with the right tweaks, you can already get ParaLLEl RDP to run reasonably well. As indicated in the article he wrote, Themaister will be looking at WSI Vulkan issues specifically related to RetroArch since there definitely do seem to be some issues that have to be resolved. In the meantime, we have to resort to some workarounds. Workarounds or not, they will do the job for now.

How to install and set it up

  • In RetroArch, go to Online Updater.
  • (If you have paraLLEl N64 already installed) – Select ‘Update Installed Cores’. This will update all the cores that you already installed.
  • (If you don’t have paraLLEl N64 installed already) – go to ‘Core Updater’, and select ‘Nintendo – Nintendo 64 (paraLLEl N64)’.
  • Now start up a game with this core.
  • Go to the Quick Menu and go to ‘Options’. Scroll down the list until you reach ‘GFX Plugin’. Set this to ‘parallel’. Set ‘RSP plugin’ to ‘parallel’ as well.
  • For the changes to take effect, we now need to restart the core. You can either close the game or quit RetroArch and start the game up again.

Intel iGPU

What you should do for optimum performance right now:

  • For Intel iGPU, I have found that what makes the biggest difference by far (on Windows 10 at least) is to run it in windowed mode instead of fullscreen. Fullscreen mode will have horribly crippled performance by comparison.

Performance

Once you have done this, a run-of-the-mill iGPU will actually not be that far behind, say, a 2080 Ti (in asynchronous mode). Sure, it’s still slower by about 30fps, but it’s no longer the massive gulf in performance it was before, where even Angrylion was beating ParaLLEl RDP in the performance department.

With synchronous, the difference between say a 2080 Ti and an iGPU should be a bit more pronounced.

Hopefully in future RetroArch versions, it will no longer be necessary to have to resort to windowed mode for good performance with Intel iGPUs. For now, this workaround will do.

Android

What you should do for optimum performance right now:

  • Turn vsync off. Go to Settings -> Video -> Synchronization, and make sure that ‘Vertical Sync (Vsync)’ is disabled.

NOTE: It is imperative that you turn V-Sync off for now. If not, performance will be so badly crippled that even Angrylion will be faster by comparison. Fortunately, there will be no noticeable screen tearing even with Vsync disabled right now.

Performance

I tested ParaLLEl RDP on two devices:

  • Nvidia Shield TV (2015)
  • Samsung Galaxy S10 Plus (2019) [European Exynos model]

NOTE: The European model of the Galaxy S10 Plus used here has the Samsung Exynos SoC (System-On-A-Chip). Generally these perform worse than the US models of the Galaxy phones, which use a Qualcomm Snapdragon SoC instead. You should therefore expect significantly better performance on a US model.

Performance on Shield TV

Here are some rough performance figures for the Nvidia Shield TV –

Title                 | Performance
Mortal Kombat Trilogy | 87 to 94fps
Yoshi’s Story         | 99fps
Doom 64               | 90 to 117fps
Tetris 64             | 117fps
Starcraft 64          | 177fps

It’s hard to put an exact number on other games, but just from a solely gameplay-focused perspective, you can get a near-locked framerate with games like Legend of Zelda: Ocarina of Time and Super Mario 64 if you run the PAL versions (which limit the framerate to 50fps instead of 60fps with NTSC versions). There might still be the odd frame drop in certain graphics intensive scenes but nothing too serious.

Similarly, games like 1080 Snowboarding drop below fullspeed with the NTSC version, but running them with the PAL version is nearly a locked framerate in all but the most intensive scenes.
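The PAL versus NTSC point comes down to simple arithmetic: the same measured framerate can be below fullspeed for a 60fps target and above it for a 50fps target. A minimal sketch, using the 50/60fps figures from the text (the helper names and the 55fps sample value are made up for illustration):

```python
NTSC_FPS = 60.0  # target framerate for NTSC versions, as stated in the text
PAL_FPS = 50.0   # target framerate for PAL versions, as stated in the text


def speed_percent(measured_fps, target_fps):
    """Return emulation speed as a percentage of fullspeed."""
    return 100.0 * measured_fps / target_fps


def runs_fullspeed(measured_fps, target_fps):
    """Return True when the measured framerate meets or exceeds the target."""
    return measured_fps >= target_fps


# Hypothetical measurement: a game that averages 55fps with vsync off.
measured = 55.0
for label, target in (("NTSC", NTSC_FPS), ("PAL", PAL_FPS)):
    percent = speed_percent(measured, target)
    verdict = "fullspeed" if runs_fullspeed(measured, target) else "below fullspeed"
    print(f"{label}: {percent:.1f}% ({verdict})")
```

By the same measure, the lower bound for Mortal Kombat Trilogy in the table (87fps) is 145% of NTSC fullspeed, so there is headroom to spare in that game.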

Performance on Samsung Galaxy S10 Plus

Performance on a high-end 2019 phone like the Galaxy S10 Plus can tend to be more variable, probably because of the aggressive dynamic throttling being done on phones. Sometimes performance would be a significant step above the Shield TV where it could run NTSC versions of games like Legend of Zelda: Ocarina of Time and Super Mario 64 at fullspeed with no problem (save for the very odd frame drop here and there in very rare scenes), and then at other times it would perform similarly to a Shield TV. Your mileage may vary there.

Conclusions

Overall, it’s clear that certain battles still have to be won on the Vulkan side, especially since vsync currently has to be disabled to get acceptable performance.

We’d like to learn more from people who have a Samsung Galaxy S20 or a similar high end phone released in 2020. Even a Snapdragon version of the S10 Plus would produce better results than what we see here.

So, Low-Level N64 emulation, is it attainable on Android? Yes, with the proper Vulkan extensions, and provided you have a reasonably modern and fast high end phone. The Shield TV is also a decent mid-range performer considering its age. Far from every game runs at fullspeed yet but the potential is certainly there for this to be a real alternative to HLE based N64 emulation on Android as hardware grows more powerful over the years.

FAQ

Some specific issues should be addressed –

Game compatibility is significantly lower on Android right now

The mupen64plus-core part of ParaLLEl N64 is older than the one found in Mupen64Plus Next. While on PC this is not so much of an issue because of the generally mature (but slower) Hacktarux dynarec, on ARM platforms it is a different story since new_dynarec was in an immature state back then. Not only that, LLE RDP + RSP plugin compatibility with new_dynarec was not even a consideration back then. So some games might not work at all right now with Parallel RDP+RSP on Android.

ParaLLEl N64 will likely receive a mupen64plus-core update soon, and Mupen64Plus Next might also in the near future get ParaLLEl RDP + ParaLLEl RSP support. So this situation will sort itself out.

You get a display error showing ‘ERR’ on your Android device

The Vulkan driver for your GPU is likely missing these two Vulkan extensions, which ParaLLEl RDP requires.

VK_KHR_8bit_storage
VK_KHR_16bit_storage
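If you can get a list of the extensions your device’s Vulkan driver exposes (for example from a Vulkan capability viewer app), checking for the two required ones is straightforward. The sketch below is only an illustration: the helper name and the sample extension list are made up, and this is not how RetroArch or ParaLLEl RDP performs the check internally.

```python
# The two extensions named above as requirements for ParaLLEl RDP.
REQUIRED_EXTENSIONS = ("VK_KHR_8bit_storage", "VK_KHR_16bit_storage")


def missing_extensions(available):
    """Return the required extensions that are absent from 'available'."""
    available = set(available)
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in available]


# Hypothetical extension list for some device; replace it with your own.
device_extensions = [
    "VK_KHR_swapchain",
    "VK_KHR_16bit_storage",
    "VK_KHR_maintenance1",
]

missing = missing_extensions(device_extensions)
if missing:
    print("Driver is missing:", ", ".join(missing))
else:
    print("Both required extensions are present.")
```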

(Intel iGPU) Performance is halved (or more) in fullscreen mode

Known issue, read above. The causes have been identified, and it’s now a matter of finding the appropriate solution.

paraLLEl-RDP update

Since the paraLLEl-RDP rewrite was unleashed upon the world, a fair bit of work has gone into it, mostly related to performance and working around various drivers.

Rendering bug fixes

Unsurprisingly, some bugs were found, but very few compared to what I expected. All the rendering bugs were fortunately rather trivial in nature, and didn’t take much effort to debug. I can only count 3 actual bugs. To be a genuine bug, the issue must be isolated to paraLLEl-RDP. Core bugs are unfortunately quite common and a lot of core bugs were mistaken as RDP ones.

Mega Man 64 – LODFrac in Cycle 1

The RDP combiner can take the LOD fractional value as an input. However, the initial implementation only considered that Cycle 0 would observe a valid LODFrac value. This game, however, uses LODFrac in Cycle 1, and that case was completely ignored. Fixing the bug was as simple as considering that case as well, and the RDP dump validated bit-exact against Angrylion. I believe this also fixed some weird glitching in Star Wars – Naboo. At least it too passed bit-exact after this fix was in place.

Mario Tennis crashes – LoadTile overflow

Some games, Mario Tennis in particular, will occasionally attempt to upload textures with broken coordinates. This is supposed to overflow in a clean way, but I missed this case, and triggered an “infinite” loop with 4 billion texels being updated. Needless to say, this triggered GPU crashes as I would exhaust VRAM while spamming an “infinite” loop with memory allocations. Fairly simple fix once I reproduced it. I believe I saw these crashes in a few other games as well, and it’s probably the same issue. Haven’t seen any issues since the fix.
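For readers wondering how broken coordinates can turn into “4 billion texels”, here is one plausible illustration. It is an assumption about the general class of bug, not the actual paraLLEl-RDP code: if an inclusive range is counted with 32-bit unsigned arithmetic and the end coordinate is smaller than the start, the count wraps around. The function names and the 4096 limit are hypothetical, and the real RDP has its own defined overflow behaviour which this sketch does not reproduce.

```python
U32_MASK = 0xFFFFFFFF


def naive_texel_count(lo, hi):
    """Count texels in [lo, hi] with 32-bit unsigned wraparound, as C would."""
    return (hi - lo + 1) & U32_MASK


def guarded_texel_count(lo, hi, limit):
    """Hypothetical guard: treat an inverted range as empty and clamp to 'limit'."""
    if hi < lo:
        return 0
    return min(hi - lo + 1, limit)


# A sane upload and a broken one where the end is before the start.
print(naive_texel_count(0, 63))           # a normal 64-texel row
print(naive_texel_count(10, 8))           # wraps to roughly 4 billion
print(guarded_texel_count(10, 8, 4096))   # the guard refuses the broken range
```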

Perfect Dark logo transition

Not really an RDP rendering issue, but VI shenanigans. This was a good old case of a workaround for another game causing issues. When the VI is fed garbage input, we should render black, but that causes insane flickering in San Francisco Rush, since it strobes invalid state every frame. Not entirely sure what’s going on here (not impossible it’s a core bug …), but I applied another workaround on top of the workaround. I don’t like this 🙁 At least the default path in the VI implementation is to do the expected thing of rendering black here, and parallel-n64 opts into using weird workarounds for invalid VI state.

Core bugs

Right now, the old parallel-n64 Mupen core is kind of the weakest link, and almost all issues people report as RDP bugs are just core bugs. I’ll need to integrate this in a newer Mupen core and see how that works out.

Improving compatibility with more Vulkan drivers

As mentioned in my last post, a workaround for lack of VK_EXT_external_memory_host was needed, and I implemented a fairly complex scheme to deal with this in a way that is not horribly slow. Effectively, we now need to shuffle memory back and forth between two views of RDRAM, the CPU-owned RDRAM, and GPU-owned RDRAM. The implementation is quite accurate, and tracks writes on a per-byte basis.

The main unit of work submitted to the GPU is a “render pass” (similar in concept to a Vulkan render pass). This is a chunk of primitives which all render to the same addresses in RDRAM and which do not have any feedback effect, where texture data is sampled from the frame buffer region being rendered to. A render pass will have a bunch of reads from RDRAM at the start of the render pass, where frame buffer data is read, along with all relevant updates to TMEM. All chunks of RDRAM which might be read, will be copied over to GPU RDRAM before rendering. We also have a bunch of potential writes after the render pass. These writes must eventually make their way back to CPU RDRAM. Until we drain the GPU for work completely, any write made by the RDP “wins” over any writes made by the CPU. During the “read” phase of the render pass, we can selectively copy bytes based on the pending writemask we maintain on the GPU. If there are no pending writes by GPU, we optimize to a straight copy.
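The two-view scheme is easier to follow with a toy model. The sketch below is a simplified illustration under my own assumptions, not the actual implementation: RDRAM is a small bytearray, the pending writemask is one flag per byte, and the function names are made up. It shows the two rules from the text: pending GPU writes “win” over CPU data during the read phase, and a region with no pending writes takes the straight-copy fast path.

```python
def sync_cpu_to_gpu(cpu_ram, gpu_ram, gpu_pending, start, size):
    """Read phase: copy CPU bytes into the GPU view, skipping bytes the GPU still owns."""
    end = start + size
    if not any(gpu_pending[start:end]):
        # No pending GPU writes in this region: optimize to a straight copy.
        gpu_ram[start:end] = cpu_ram[start:end]
        return
    for i in range(start, end):
        if not gpu_pending[i]:
            gpu_ram[i] = cpu_ram[i]


def drain_gpu_to_cpu(cpu_ram, gpu_ram, gpu_pending):
    """After the GPU is fully drained: write pending bytes back and clear the mask."""
    for i, pending in enumerate(gpu_pending):
        if pending:
            cpu_ram[i] = gpu_ram[i]
            gpu_pending[i] = 0


cpu_ram = bytearray(b"\x01\x02\x03\x04")
gpu_ram = bytearray(4)
gpu_pending = bytearray(4)

# Pretend the RDP rendered to byte 1 and the result has not been written back yet.
gpu_ram[1] = 0xAA
gpu_pending[1] = 1

sync_cpu_to_gpu(cpu_ram, gpu_ram, gpu_pending, 0, 4)
print(gpu_ram.hex())  # byte 1 keeps the GPU value, the rest comes from the CPU

drain_gpu_to_cpu(cpu_ram, gpu_ram, gpu_pending)
print(cpu_ram.hex())  # the GPU write has now reached CPU RDRAM
```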

As for performance, I get around 10-15% FPS hit on NVIDIA with this workaround. Noticeable, but not crippling.

Android

Android SoCs do not always support cache-coherency with the GPU, so that’s added complexity. We have to carefully flush caches and invalidate caches on CPU side after we write to GPU RDRAM and before we read from it respectively. I also fixed a bunch of issues with cache management in paraLLEl-RDP which would never happen on a desktop system, since everything is essentially cache coherent.

With these fixes, paraLLEl-RDP runs correctly on at least Galaxy S9/S10 with Android 10 and Mali GPUs, and the Tegra in Shield TV. However, the support for 8/16-bit storage is still very sparse on Android, and I couldn’t find a single Snapdragon/Adreno GPU supporting it, oh well. One day Android will catch up. Don’t expect any magic for the time being w.r.t. performance, there are some horrible performance issues left which are Android specific outside the control of paraLLEl-RDP, and need to be investigated separately.

Fixing various performance issues

The major bulk of the work was fixing some performance issues which would come up in some situations.

Building a profiler

To drill down into these issues, I needed better tooling to be able to correlate CPU and GPU activity. This was a good excuse to add such support into Granite, which is paraLLEl-RDP’s rendering backend, Beetle HW Vulkan’s backend, and the foundation of my personal Vulkan rendering engine. Google Chrome actually has a built-in profile UI frontend in chrome://tracing which is excellent for ad-hoc use cases such as this. Just dump out some simple JSON and off you go.

To make a simple CPU <-> GPU profiler all you need is Vulkan timestamp queries and VK_EXT_calibrated_timestamps to improve accuracy of CPU <-> GPU timestamp correlation. I made use of the “pid” feature of the trace format to show the different frame contexts overlapping each other in execution.

Anyone can make these traces now by setting environment variables: PARALLEL_RDP_BENCH=1 GRANITE_TIMESTAMP_TRACE=mytrace.json, then load the JSON in chrome://tracing.
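To give an idea of how little is needed on the JSON side, here is a minimal sketch that builds a trace in the Trace Event Format that chrome://tracing loads. It is not Granite’s actual output: the event names, timings and helper functions are made up for illustration. Each event is a “complete” event (`"ph": "X"`) with a start timestamp and duration in microseconds, and `pid` is used to separate frame contexts as described above.

```python
import json


def complete_event(name, start_us, duration_us, pid, tid=0):
    """Build one 'complete' trace event (phase X) with microsecond timestamps."""
    return {
        "name": name,
        "ph": "X",
        "ts": start_us,
        "dur": duration_us,
        "pid": pid,
        "tid": tid,
    }


def build_trace(events):
    """Serialize events into the JSON object form accepted by chrome://tracing."""
    return json.dumps({"traceEvents": events}, indent=2)


# Hypothetical timings: two frame contexts (pid 0 and 1) overlapping in execution.
events = [
    complete_event("emulate frame (CPU)", 0, 4000, pid=0, tid=0),
    complete_event("render pass (GPU)", 1500, 6000, pid=0, tid=1),
    complete_event("emulate frame (CPU)", 4000, 4000, pid=1, tid=0),
]

trace_json = build_trace(events)
print(trace_json)
```

Saving the printed text to a `.json` file and loading it in chrome://tracing should show the CPU and GPU rows side by side, which is the kind of view that makes stalls easy to spot.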

Why is Intel Mesa much slower than Intel Windows?

This was one of the major questions I had, and I figured out why using this new tool. In async mode, performance just wouldn’t improve over sync mode at all. The reason for this is that swap buffers in RetroArch would completely stall the GPU before completing (“refresh” in the trace). I filed a Mesa bug for this. I’ll need to find a workaround for this in RetroArch. With a hacky local workaround, iGPU finally gives a significant uplift over just using the CPU in this case. Trace captured on my UHD 620 ultrabook which shows buggy driver behavior. Stalling 6 ms in the main emulation thread is not fun. 🙁

Fixing full GPU stalls, or, why isn’t Async mode improving performance?

This was actually a parallel-n64 bug again. To manage CPU <-> GPU overlap, the Vulkan backend uses multiple frame contexts, where one frame on screen should correspond with one frame context. The RDP integration was notified too often that a frame was starting, and thus would wait for GPU work to complete far too early. This would essentially turn Async mode into Sync mode in many cases. Overall, fixing this gained ~10-15% FPS on my desktop systems.

Be smarter about how we batch up work for the GPU – fixing stutters in Mario Tennis

Mario Tennis is pretty crazy in how it renders some of its effects. The hazy effect is implemented with ~50 (!) render passes back to back each just rendering one primitive. This was a pathological case in the implementation that ran horribly.

The original design of paraLLEl-RDP was for larger render passes to be batched up with a sweet spot of around 1k primitives in one go, and each render pass would correspond to one vkQueueSubmit. This assumption fell flat in this case. To fix this I rewrote the entire submission logic to try to make more balanced submits to the GPU. Not too large, and not too small. Tiny render passes back-to-back will now be batched together into one command buffer, and large render passes will be split up. The goal is to submit a meaningful chunk of work to the GPU as early as possible, and not hoard tons of work while the GPU twiddles its thumbs. This is critically important for Sync mode I found, because once we hit a final SyncFull opcode, we will need to wait for the GPU to complete all pending work. If we have already submitted most of the relevant work, we won’t have to wait as long. Overall, this completely removed the performance issue in Mario Tennis for me, and overall performance improved by a fair bit. > 400 VI/s isn’t uncommon in various games now on my main system. RDP overhead in sync mode usually accounts for 0.1 ms – 0.2 ms per frame or something like that, quite insignificant.
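The “not too large, and not too small” idea can be sketched as a simple planner. This is an illustration under assumed thresholds, not the real submission logic: `min_batch` and `max_batch` are made-up values (the text only mentions a sweet spot of around 1k primitives), and a render pass is reduced to its primitive count.

```python
def plan_submissions(render_pass_sizes, min_batch=64, max_batch=1024):
    """Group render passes (primitive counts) into balanced GPU submissions.

    Returns a list of submissions, each a list of (pass_index, primitive_count)
    chunks. Tiny passes are merged together, large passes are split up.
    """
    submissions = []
    current, current_size = [], 0
    for index, size in enumerate(render_pass_sizes):
        remaining = size
        while remaining > 0:
            # Take as much of this pass as still fits in the current submission.
            chunk = min(remaining, max_batch - current_size)
            current.append((index, chunk))
            current_size += chunk
            remaining -= chunk
            if current_size >= max_batch:
                submissions.append(current)
                current, current_size = [], 0
        # Submit early once a meaningful amount of work has accumulated.
        if current_size >= min_batch:
            submissions.append(current)
            current, current_size = [], 0
    if current:
        submissions.append(current)
    return submissions


# The Mario Tennis pattern: about 50 render passes with one primitive each.
tiny = plan_submissions([1] * 50)
print(len(tiny), "submission(s) for 50 tiny passes")

# One oversized render pass gets split into several submissions.
large = plan_submissions([3000])
print([sum(count for _, count in sub) for sub in large])
```

With the one-submit-per-render-pass design described above, the first case would have been 50 separate submits; here it collapses into one.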

Performance work left?

I think paraLLEl-RDP itself is in a very solid place performance-wise now. The main remaining work is drilling down into the various WSI issues that plague Intel iGPU and Android, which I believe is where we lose most of the performance now. That work would have to go into RetroArch itself, as that’s where we handle such things.

Overall, remember that accurate LLE rendering is extremely taxing compared to HLE rendering pixel-for-pixel. The amount of work that needs to happen for a single pixel is ridiculous when bit-exactness is the goal. However, shaving away stupid, unnecessary overhead has a lot of potential for performance uplift.

RetroArch 1.8.8 released!


RetroArch 1.8.8 has just been released.

Grab it here.

Read our latest Libretro Cores Progress Report blog post here. It’s an exhaustive list, and especially the older consoles have received a lot of new cores and improvements.

Remember that this project exists for the benefit of our users, and that we wouldn’t keep doing this were it not for spreading the love with our users. This project exists because of your support and belief in us to keep going doing great things. If you’d like to show your support, consider donating to us. Check here in order to learn more. In addition to being able to support us on Patreon, there is now also the option to sponsor us on Github Sponsors! You can also help us out by buying some of our merch on our Teespring store!

Highlights

Add option to sort playlists after name truncation (Ozone)

A couple of users have complained about a feature we made in 1.8.7 (Fix sidebar playlist sort order when ‘Truncate Playlist Names’ is enabled). This new addition makes the new sidebar playlist sorting behaviour optional via a Sort Playlists After Name Truncation setting under User Interface > Appearance. When disabled, playlists will be sorted the old way (according to file name), not by display name.
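The difference between the two sort orders is easiest to see with an example. The sketch below is not RetroArch code; it assumes that truncation means dropping the manufacturer prefix before “ - ” from the playlist name, and the function names and the sample playlist list are made up for illustration.

```python
def display_name(file_name):
    """Assumed truncation: drop the '.lpl' extension and the manufacturer prefix."""
    base = file_name[:-4] if file_name.endswith(".lpl") else file_name
    _, separator, rest = base.partition(" - ")
    return rest if separator else base


def sort_playlists(file_names, sort_after_truncation):
    """Sort by display name when the setting is on, by file name when it is off."""
    key = display_name if sort_after_truncation else (lambda name: name)
    return sorted(file_names, key=key)


playlists = [
    "Nintendo - Nintendo 64.lpl",
    "Sega - Dreamcast.lpl",
    "Sony - PlayStation.lpl",
    "Atari - 2600.lpl",
]

print([display_name(p) for p in sort_playlists(playlists, sort_after_truncation=False)])
print([display_name(p) for p in sort_playlists(playlists, sort_after_truncation=True)])
```

With the setting off, the sidebar follows manufacturer order even though the manufacturer is no longer shown, so the visible names can look unsorted. With it on, the visible names themselves are in order.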

Localization – big updates and crowdsourced

A new language has been added, Slovakian. And plenty of the existing languages have received big updates as far as localization goes.

But by far the biggest change is our transition to Crowdin. This allows non-programmers to more easily contribute localization changes/additions to RetroArch. You can see the completion status of the various languages on our Crowdin page.

Improved shader preset dirs

1.8.8 restores the original behavior of the “Save Shader Preset As” option, and improves shader cycling by falling back to the fallback directories if the Video Shader directory does not contain any preset.

Input Remapping Fixes

In 1.8.7, pressing RetroPad Start to reset a core input remap to the default setting did not work correctly – analog stick inputs get the wrong defaults, and inputs that are left undefined by the core are not set to the proper RARCH_UNMAPPED value.

1.8.8 fixes the issue.

RetroArch WiiU: Gamepad hotplugging support, theoretical multi-gamepad support

1.8.8 adds support for hotplugging WiiU Gamepads. Critically for users, this makes the driver no longer assume a Gamepad is present, so when it’s broken, out of battery or missing, the first Wiimote gets slot 1 instead (helpful when All Users Control Menu is off). I [quarkawesome] also made it check Gamepad channel 2 – while it’s impossible to connect a second Gamepad on a retail console, the code to do it still appears to be there. If that feature ever becomes a thing through CFWs, it’ll work here.

RetroArch PS2 – New SDK/toolchain – big improvements

fjtrujy spent a lot of time adapting RetroArch PlayStation2 to the latest PS2 SDK. RetroArch PS2 is now being built with a modern version of the GCC compiler, and certain cores are already seeing massive speedups as a result.

As can be seen by the tweet, QuickNES went from 255fps with the old SDK to 429fps with the new SDK. This makes the core more than fast enough to use runahead – on a PlayStation2 of all things!

A newer C/C++ toolchain will also make it much easier to port software over to PS2. It was previously quite difficult to port C++ cores over to PS2.

fjtrujy also added Theodore to the list of cores supported.

Changelog

What you’ve read above is just a small sampling of what 1.8.8 has to offer. There might be things that we forgot to list in the changelog listed below, but here it is for your perusal regardless.

1.8.8

  • AUDIO/JACK: Fix regression introduced after 1.8.4 – would hang at startup
  • CHEEVOS: Disable hardcore when cheats are enabled
  • CHD: Return false when special track cannot be found
  • DISCORD/MATCHMAKING: Fix Discord ‘Ask To Join’ functionality
  • FILE PATH: Various file path handling optimisations
  • FONT: Fix Arabic, Chinese and Korean font rendering
  • INPUT MAPPING/REMAPPING: Restore broken ‘reset to default’ functionality with RetroPad ‘start’ button
  • INPUT MAPPING/REMAPPING: Fix ‘reset to default’ action for analog sticks and undefined core inputs
  • LIBRETRO: Add new message extension allowing for richer messages
  • LOCALIZATION: Update Arabic translation
  • LOCALIZATION: Update Chinese (Simplified) translation
  • LOCALIZATION: Update Chinese (Traditional) translation
  • LOCALIZATION: Update German translation
  • LOCALIZATION: Update Greek translation
  • LOCALIZATION: Update Spanish translation
  • LOCALIZATION: Update French translation
  • LOCALIZATION: Update Italian translation
  • LOCALIZATION: Update Japanese translation
  • LOCALIZATION: Update Korean translation
  • LOCALIZATION: Update Dutch translation
  • LOCALIZATION: Update Polish translation
  • LOCALIZATION: Update Portuguese Brazilian translation
  • LOCALIZATION: Update Russian translation
  • LOCALIZATION: Update Turkish translation
  • LOCALIZATION: Update Vietnamese translation
  • LOCALIZATION: Add Slovak translation
  • MENU: Small buffer optimizations
  • MENU/THUMBNAILS/BUGFIX: Fix heap-use-after-free error
  • MENU/OZONE: Add option to sort playlists after name truncation
  • MENU/OZONE/ANDROIDTV: Default to Ozone menu driver
  • MENU/OZONE/ANDROID: Gamepad-like devices default to Ozone now (Shield Portable)
  • NETPLAY: Lower announcement rate
  • OVERLAYS: Fix memory leak when loading overlays
  • SHADER PRESETS: Improved shader preset dirs
  • TIME/DATE: Enable configuration of date separator in clock and runtime ‘last played’ displays
  • VITA: Fix upside-down vertical games
  • UWP: Enable playlist and savefile compression by default (because of slow file I/O)
  • VIDEO/WIDGETS: Fix overlapping text when simultaneous pop-up notifications and core/shader messages are being displayed
  • WIIU: Gamepad hotplugging support
  • WIIU: Theoretical multi-gamepad support
  • X11: Fix crash in x11_display_server_get_screen_orientation
  • X11/XSHM: Allow X11/XSHM video driver to operate without SHM extension
  • X11/XSHM: Fix compatibility with X11 input driver
  • XVIDEO: Fix keyboard input initialization
  • XVIDEO/XWAYLAND: Fix XVideo support on xwayland (by supporting I420 and YV12)

Libretro Cores Progress Report – May 27, 2020

Yesterday marked RetroArch’s 10 Year Anniversary date. Today, we’re not only releasing a new RetroArch version, but we’re letting you know all the changes that have been made to the libretro cores since the last progress report.

Our last core progress report was on April 2, 2020. Below we detail the most significant changes to all the Libretro cores we and/or upstream partners maintain. We are listing changes that have happened since then.

How to update your cores in RetroArch

There are two ways to update your cores:

a – If you have already installed the core before, you can go to Online Updater and select ‘Update Installed Cores’.

b – If you haven’t installed the core yet, go to Online Updater, ‘Core Updater’, and select the core from the list that you want to install.

Final Burn Neo

Description: Multi-system arcade emulator

  • Add Neo Geo Pocket Color support
  • Latest updates from upstream

Flycast

Description: Sega Dreamcast/NAOMI emulator

  • Naomi/AW widescreen integrated cheats (KNIGHTS OF VALOUR THE 7 SPIRITS, Metal Slug 6, Toy Fighter, Dolphin Blue)
  • DSP: Proper MIXS input shift. Fixes Grandia 2 missing sound effects
  • DSP: fix output shift. Fix wrungp ear rape (NAOMI game)
  • Fix for: [Bug]Super Street Fighter II X for Matching Service (Japan) – Disable BIOS region patching
  • Fix for: Samurai Spirits – Frame-skipping issues before reset or changing the option – don’t reset frameskip to 0 at init
  • Haiku: Fix build
  • Rewrite nvidia jetson nano build
  • Make threaded rendering the default on all platforms. Synchronous mode enabled unless LOW_END is defined
  • Info and warning for xBRZ upscaling core option
  • ARM64: check CpuRunning at end of each timeslice. fix hang when exiting service menu in kofnw (NAOMI game)
  • NAOMI: wrungp inputs. support inverted axis for NAOMI
  • Libretro: Improve context request
  • Libretro: fix input descriptor L2/R2 mixup
  • NAOMI: Add Tokyo Bus Guide support
  • Log VMU files loading
  • CUSTOM TEXTURES: support JPEG format. Get rid of libpng and use stb_image
  • Log error if naomi eeprom save fails
  • Fix mouse state not being updated
  • PVR/NAOMI: update palette when PAL_RAM_CTRL is updated – fixes wrong palette in Gun Survivor 2 score screen
  • REND: Use original palette data to compute palette hash
  • PVR: textured background plane – fixes Who Wants To Be a Millionaire

DOSbox Core

Description: DOSbox core

  • Latest updates from upstream
  • Add option for using 2axis joystick even when only one port is connected – this fixes input problems in “Super Off Road”, but other games might also need this.
  • Add option for controlling log verbosity level – Since we can log to stdout now, it makes sense. Also, some frontends might not have configurable log verbosity levels.
  • Add option for printing log output to stdout/stderr –
    Useful if the frontend’s logging output is unreliable or too noisy (or
    both) and we only want to see log output from the core. RetroArch does
    have a configuration option for controlling frontend and core log output
    separately, but it’s bugged.
  • Enable Voodoo on all platforms and regardless of fakesdl – Software-based Voodoo emulation doesn’t require SDL anymore so it should build fine everywhere.
  • Don’t claim there’s Voodoo2 emulation – Oops. 12MB doesn’t actually mean Voodoo2. The current code only emulates the original Voodoo. The 12MB setting is just a non-standard memory configuration for the Voodoo 1.
  • Add support for changing current core option values programmatically.
    This is a hack. The libretro API does not actually support this. We
    achieve this by replacing all current values with a bogus one to force
    the frontend to forget the currently selected value, since it doesn’t
    match the bogus one. We then submit the correct values again, but with
    the default value set to the value we want to switch to. This forces
    the frontend to switch to that value because the bogus value is now
    gone and thus not a valid value anymore. Finally, we submit the values
    again but with the initial default value (we only want to change the
    current value, not set a new default.) The frontend will not switch the
    current value, as the values themselves have not changed, just the
    default has.

    RetroArch seems to be well-behaved here and does the correct thing.
    Other frontends might not play ball though.

  • Hook up 3dfx core options
  • Vita: Fix dynarec, fix build
  • Add build options to make bassmidi and fluidsynth optional
  • Fix ARM dynarec
  • Correct cdrom sector size field length according to docs.
  • Refactor input mapper –
    Code should be simpler to understand now. This also fixes a bug where
    inputs on the second port weren’t working before. Mouse emulation is now
    possible on both ports and the default emulated mouse buttons have been
    swapped with the speed modifier buttons (L/R are now mouse buttons,
    L2/R2 the speed modifiers.) This is a saner default since not all
    controllers have L2/R2 buttons.
  • GHA: Support macOS 10.13 by building with GCC instead of XCode Clang
  • Fix floppy image file size detection oopsie
  • Improve disk control related code and move it to its own source file
  • Make image file extension comparisons case-insensitive –
    This fixes the issue where loading an image that has an upper-case
    extension (like “.CUE” instead of “.cue”) results in dosbox mounting the
    image itself without going through the libretro disk control interface.

    We add new case conversion functions for this (in the new util.h/util.cpp
    files) because the existing conversion functions provided either by
    dosbox or libretro-common are crap and we’re smarter than everybody
    else.

  • Add libretro disk control interface disk labels support –
    Only retro_get_image_label_t for now. Leave retro_set_initial_image_t
    and retro_get_image_path_t undefined as it’s not clear what the use
    would be in case of DOS games, especially since we don’t handle m3u
    files yet.
  • Fix disk_replace_image_index always reporting failure –
    This silly mistake caused the “Failed to append disc” error message in
    RetroArch.
  • Add core option for setting the free space when auto-mounting drive C
  • Ensure overlay mount path ends with a dir separator –
    Otherwise dosbox will write data in bogus directories in the overlay.
  • Add option for mounting the executable’s parent dir as drive C –
    Some pre-installed games expect to be installed in C:\GAMEDIR rather
    than directly in C:\. This option allows these games to run without
    having to modify their configuration files first.
  • Enable PC speaker by default
  • Fix direct content loading of DOS executables
  • Fix BASSMIDI crashing during startup on 32-bit Windows
  • Add BASS/BASSMIDI libs to core info file
  • Add BASSMIDI MIDI driver –
    The bass and bassmidi libraries are looked for in the frontend’s system
    directory and loaded at runtime. This allows the core to work and be
    distributed in a GPL-compliant way without those libraries.
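
The three-step trick described above for changing a current core option value programmatically can be sketched with a toy model. This is only an illustration, not the core's code and not the libretro API: `ToyFrontend`, `set_variables`, `force_option_value` and the option name `cycles` are all hypothetical, and the frontend is assumed to behave as the entry describes (it keeps the current value while it is still in the submitted list, and otherwise falls back to the submitted default).

```python
from dataclasses import dataclass, field


@dataclass
class ToyFrontend:
    """Hypothetical frontend: keeps the current value only while it is still valid."""
    current: dict = field(default_factory=dict)

    def set_variables(self, key, values, default):
        # A value that is no longer in the list is forgotten and replaced
        # by whatever default the core submitted this time.
        if self.current.get(key) not in values:
            self.current[key] = default


def force_option_value(frontend, key, values, default, wanted):
    """Switch `key` to `wanted` without changing its long-term default."""
    bogus = "__bogus__"  # placeholder that matches no real value
    # 1. Submit only a bogus value so the frontend drops its current selection.
    frontend.set_variables(key, [bogus], bogus)
    # 2. Submit the real values with the wanted value posing as the default.
    frontend.set_variables(key, values, wanted)
    # 3. Submit the real values again with the true default restored.
    frontend.set_variables(key, values, default)


frontend = ToyFrontend({"cycles": "auto"})
force_option_value(frontend, "cycles", ["auto", "max", "3000"], "auto", "max")
print(frontend.current["cycles"])
```

A frontend that reacts differently to a changed value list would not follow these steps, which matches the caveat that other frontends might not play ball.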

DOSbox SVN

Description: DOSbox

  • Add SALC and XLAT to the dyn_x86 core. Improve LOCK handling a bit.
  • change new to new(std::nothrow) (vogons 73603) and some formatting
  • QNX: Adjust flags to msync based on libretro-common –
    I didn’t notice any bug before but this stuff is very difficult to trigger
  • QNX: Add support
  • Allow unaligned memory only on x86
  • Determine CPU based on actual running platform rather than build one –
    This allows cross-compiling from x86 Linux to ARM Linux

    Endianness is determined in retro_endianness.h and dynarec is determined
    in dynarec.h

  • Correct cdrom sector size field length according to docs
  • memory: Add missing std::nothrow –
    Given the subsequent check for NULL it was obviously intended for use with
    std::nothrow
  • Catch exceptions in dosbox –
    Right now an exception ends up killing the cothread, which is against libco
    recommendations and is more difficult to debug. Instead, log and exit normally
  • Fix button mappings and wrong port assignment when using both ports –
    This maps B/A/Y/X to DOS buttons 1/2/3/4 by default. Also corrects the
    issue of input not working correctly when using two controllers.
  • Fix Windows x64 hang/crash by updating libretro-common
  • Don’t submit mouse emulation descriptors when no ports are connected
  • Fix new gamepad emulated mouse defaults being swapped
  • Map mouse buttons to L/R, modifiers to L2/R2, not vice-versa –
    Many gamepads don’t have L2/R2, which were needed to press mouse
    buttons with the gamepad using the default mapping. This change
    maps mouse buttons to L/R by default. The modifiers to speed up
    or slow down mouse motion are moved from L/R to L2/R2, since they
    are less essential. All of this can be remapped via quick menu
    using the input mapper.
  • Fix gamepad emulated mouse inputs not showing in mapper sometimes
  • Vita: Fix dynarec
  • Vita: Build fix
  • Switch to libco provided by libretro-common –
    libco embedded here crashes on vita. So let’s use the common one
  • Use RETRO_CALLCONV for disk control callbacks –
    It’s surprising that there were no crashes so far on x86 32-bit…
  • Fix disk_replace_image_index always reporting failure –
    This silly mistake caused the “Failed to append disc” error message in
    RetroArch.
  • Correct an oversight of r4186 when floppy disks are mounted.

Beetle PSX

Description: Sony PlayStation1 emulator

  • Load bios from path that was checked
  • sanitize gl context requests
  • Add more detailed error messages for mmap, unlink sooner
  • Allow Solaris 11 build
  • Cleanup warnings: snprintf truncation, strncpy->memcpy, memset
  • Lightrec: Cleanup/deduplicate mmap code
  • Lightrec: Improve homebrew support
  • Fix inverted check for dma-only invalidation
  • Fix OS X compile
  • Lightrec: fix more games –
    These games now work:
    – Need For Speed: V-Rally
    – Alone In The Dark – One-Eyed Jack’s Revenge
  • Lightrec: Fix SWL/LWL using wrong mask
  • When HAVE_SHM always use global memfd so it can be closed properly on exit –
    LGTM found this once the default was to compile in lightrec
  • Keep track of mmap failing –
    If mmap failed it should prevent crashing when closing and disables
    lightrec memory mirrors so as to not use an incorrect code path
  • Update to latest lightrec and extract PGXP from lightrec –
    PGXP functions are now called from within beetle
  • Compile in Lightrec support by default –
    Specify HAVE_LIGHTREC=0 if you don’t want it compiled
  • Improve DualShock calibration reference
  • Fix disk control interface when running single-disk PBP content

Mupen64Plus Next

Description: Nintendo 64 emulator

  • Only set WITH_DYNAREC based on ARCH if not set
  • Update GLideN64
  • Add support for loading GL symbols using dlsym() instead of libretro API –
    – Required for platforms with EGL version < 1.5
    – To enable, use new compile-time define: `-DGL_USE_DLSYM`

    Currently only enabled for Raspberry Pi platforms that use the legacy Broadcom driver.
  • Add nasm variable
  • Assign variable and reference nasm
  • Bump Version to 2.0.5
  • Update Mupen64Plus INI
  • Add fbInfoDisabled to Ini parsing
  • Update GLideN64 INI

Yabause

Description: Sega Saturn emulator

  • Enable Solaris build

Kronos

Description: Sega Saturn emulator

NOTE: This core requires you to use the ‘glcore’ video driver right now. There is no software renderer and it won’t work properly yet with the regular ‘gl’ driver.

  • Activate openGL program precompilation
  • Precompile some shaders directly at boot time to have a smoother BIOS animation
  • Emit a callback at each frame for synchronizing need on ports
  • Swap the buffers at each frame
  • (libretro) framerate pacing + CDROM support + m3u extension
  • Implement a database & rework cart auto-detect
  • Add support for BIOS language
  • Fix Assault Leynos 2 black screen
  • Fix mesh improved image unit usage
  • Fix blinking in Sega Rally, CS Mode
  • Add support to remove banding when using gouraud shading
  • Modify the handling of improved mesh handling to prepare improved banding support
  • fix pause in Daytona USA
  • fix Sega Rally USA boot – might introduce other improvements or regressions
  • Prepare the SH2 threading
  • ST-V I/O is reading words
  • (libretro) fix resolution mode change
  • Rebase the openGL on Yabause since compute CS has a better rendering and openGL rework created new issues
  • (libretro) improve rendering loop
  • Better horizontal upscaling
  • Depending on vertical flip, sprite reading is not the same – Improving Sega Rally
  • (libretro) use cpu_tesselation as default for polygon mode
  • Use CPU tessellation by default
  • (libretro) make the frame rendering more libretro-friendly
  • Reintroduce some required variable for color calculation – fix Cotton Club
  • Software renderer supports 4 threads at maximum
  • Assume that only old cards limited to OpenGL 3.3 are restricted in variables and need a reduced VDP2 blitting program
  • Reintroduce software renderer
  • Reduce the VDP2 register texture width
  • Do not initialise unsupported openGL functions. Fixing some openGL Core 3.3 errors
  • For openGL, do not call the rendering loop too much – Better to maximize cache texture impact
  • Add ignition line to VDP1 commands
  • Do not flush the cache texture at each HBlank – only flush when needed
  • In case of draw every 2 frames, do not consider CMD that has already been displayed
  • We have to update the texture before the evaluated startupline, not at the end of the first line… Fixing Skeleton tearing
  • allow STV rom loading to be CRC based
  • Trigger the VDP1 rendering as soon as we consider the draw command list has executed
  • Take care of effective starting line of VDP1 command to evaluate if the core has to regenerate the textures
  • Remove NVidia related pragma – Might impact other GC
  • Calculate VDP1 cycles requested by draw commands
  • Fix some STV loading –
    Fix batmanfr, thuntk & thunt boot without breaking dnmtdeka gfx (and maybe others?)
    Also got sanjeon to boot.
  • Load the BIOS file of the right entry – Fixing Die Hard boot
  • Set up EEPROM directory at the same location as the STV ROM
  • Fix “heap-use-after-free” on exit –
    YglGenReset can’t be called after _Ygl is freed
  • fix some STV loading issues –
    suikoenb, thunt, thuntk, batmanfr & znpwfvt now boot, however only thuntk seems playable at the moment
  • Support EEPROM save & load for STV
  • Update the rotation window before the rotated layer are using the values
  • Preliminary support for STV’s Kick Harness
  • Fix horizontal RBG offset in Final Fight Revenge
  • Allow openGL 3.3 since openGL without tessellation should work on most of the games
  • Fix the RBG vertical misalignment when upscale is on – Still horizontal tearing
  • Some minor fixes for RBG CS
  • Fix VDP1 mapping on highest upscale ratio
  • Fixing coin setup on ST-V
  • Backup RAM can be accessed in word and Long – STV is doing this
  • Better handling of the upscale
  • PTMR = 0x3 corresponds to PTMR 0x2 – Fixing Skeleton Warriors startup
  • Fix the special color condition in case sprite is cc enabled, not active as first screen, but activable as second
  • For color mode other than 2, read coefficient table on the upper part of the color RAM
  • Do not handle prohibited setting on PTMR – Fix Skeleton
  • Writing lower part of Color RAM in mode 0 is overwriting upper part
  • When VDPRAM mode is 0, color from RBG is read on the upper part since lower part is for color offset
  • If the code is checking EDSR, just wait for VDP1 processing to finish – avoid changing VDP1 texture while it is changing
  • Do not try aggressive optimization yet – Fixing Disc menu performance
  • Fix Quad upscale in Compute shader
  • Fix the regeneration of the VDP1 surfaces – Fix Guardian Heroes
  • Add a compilation flag to enable/disable VDP1RAM update – enabled by default
  • LDCSR is changing the SR mask, so interrupt shall be handled as soon as it is changed – Fixing Princess Maker 2 boot
  • Fix the deinit of YGLTM
  • Fix libretro fullscreen switch – Implement a destroy of all openGL objects when openGL context is reset
  • (libretro) fix upscaling weirdness
  • (libretro) try at fixing scaling weirdness
  • On Libretro, do not execute the last resizing. It is required that the libretro framebuffer always provide a FB of size _Ygl->widthx_Ygl->height
  • (libretro) updates
  • Issue on IST cleaning. Princess Maker 2 is now crashing on master SH2 issue. Might be due to SH2 interrupt handling – Fixing Skeleton boot
  • SH2 interrupt mechanism is not accurate – Do not try to make it precise – Fix Capcom Generations 5
  • Compute the VDP1 buffer in threads
  • Double the VDP1 structure to avoid stall between frames
  • Big rework of SCU Interrupt handling – Fix Princess Maker 2 boot while fixing Nanatsue Kaze after the start screen – might introduce regressions
  • Reenable the support of asynchronous preparation of RBG layers
  • Do not enable RBG async preparation
  • Fix BIOS HLE interrupt usage
  • Fix sysclip size for rotation
  • Check if we need to send a SSH2 interrupt each time we are sending a MSH2 interrupt from SCU
  • Purge all the SCU interrupt and do not stack them
  • Changing SCU HIRQMASK can generate an interrupt
  • NMI interrupt has a specific handling and does not exit using a standard RTE command
  • Handle interrupts by priority on SH2 core
  • Rework a bit the SCU interrupt handling
  • Reintroduce interrupt removal – Fix Princess Maker 2 – might impact Sakura Wars
  • Fix Sakura Taisen video
  • Fix line color offset on RBG compute
  • Fix Highway 2000
  • In case of rotated FB, increase the system clipping by twice the offset – It looks like a workaround – It is fixing Hang On ’95 blue line on the right
  • Better to use VDP2Ram access function
  • Fix VDP1 rotation for Hang on and Power drift. It looks like a scale might be still needed for Capcom Generations 4

PCSX ReARMed

Description: Sony PlayStation1 emulator

  • Android: Support for new lightrec API
  • Update lightrec to latest upstream
  • Minimize logs when loading a cheevos-compatible content
  • Cleanup retro_run() –
    – move input query into separate functions
    – move internal fps display to separate function
  • Hide other inputs from core options –
    – This adds a core option to hide some input options like multitaps, player ports 3-8 and analog-related fine-tuning options.
    – also combine dynarec-only options in one #define directive
  • More core option fixes –
    – This PR fixes core options and moves them to the related dynarec modes where they are implemented.

    LIGHTREC = relates to platforms that support the new Lightrec mode
    NEW_DYNAREC = relates to previous dynarec implementation that is still used for some 32bit devices

    – Dynarec Recompiler core option, both dynarec implementations can be enabled or disabled

  • Move guncon options to update_variables –
    – This should stop unnecessary RETRO_ENVIRONMENT_GET_VARIABLE callback and log spamming
  • Fix an edge case where core can freeze upon loading content
  • Automatically disable Lightrec when no BIOS is present, take 2
  • cdriso: fix a disk switching deadlock when closing a CD image
  • ARM NEON: Fixed bug where MSB of a 15-bit BGR color could corrupt green value.
  • cdriso: fix a disk switching deadlock
  • unai: Add ARM-optimized lighting / blending functions

Addendum on UNAI ARM-optimized lighting/blending improvements –

“Looking at the generated ASM on 3DS, I thought I could squeeze out some extra performance by moving the inner lighting and blending functions to handwritten A32 assembly. This gives a medium improvement generally (3-5fps faster on the beach in Crash 1) and a large improvement when doing lots of blending (46-48fps before, 57-60fps after, behind the waterfall in Water Dragon Isle in Chrono Cross).

Some other notes:

  • I used the ARM11 MPCore (3DS CPU) timings for pipelining.
  • I had a few stall cycles during lighting, so I used them to preserve the MSB for lighting and blending, which saved a store, load, and orr later on. ~3-6 cycles saved overall by doing that.
  • I switched from u16 to uint_fast16_t, which is 32-bit on this platform. This saved a few useless uxth instructions for another few cycles. This shouldn’t affect other platforms, but I don’t know for sure. Could typedef if necessary.
  • A lot of the speed improvement in blending comes from not using two instructions per and. For example, & 0x8000 — the compiler preferred to mask out bytes using bic 0x7F00 and bic 0x00FF. Both slower and seemed less correct for what we’re trying to do.”
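
As a rough illustration of the masking idea in the notes above, here is a minimal sketch of a 50/50 blend on 16-bit pixels that sets the MSB aside with a single AND against 0x8000 and ORs it back at the end. It is not the handwritten assembly from the core. The function name `blend_half`, the choice of the classic 0x7BDE averaging mask, and the decision to carry over the source pixel's MSB are assumptions made for the example.

```python
CHANNEL_MASK = 0x7BDE  # BGR555 colour bits with each channel's lowest bit cleared
MSB = 0x8000           # bit 15, kept out of the colour arithmetic


def blend_half(src, dst):
    """Average two 16-bit pixels per channel, carrying over the source MSB."""
    msb = src & MSB  # one AND with 0x8000, set aside before the colour math
    mixed = ((src & CHANNEL_MASK) + (dst & CHANNEL_MASK)) >> 1
    return mixed | msb


print(hex(blend_half(0x801F, 0x0000)))
```

Clearing the lowest bit of each 5-bit channel before adding keeps a carry from one channel out of its neighbour, and masking off bit 15 keeps it out of the sum entirely.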

LRMAME 2003

Description: 0.72 version of LRMAME

  • Fix cheat input dip switch option

LRMAME 2003 Plus

Description: 0.72 version of LRMAME with updates/enhancements

See CHANGELOG for all updates/enhancements on top of LRMAME 2003

  • New working game Gulun.Pa! CPS1 prototype
  • sample pause support for journey –
    uses pause instead of the mute hack implemented.
  • TANK III Joystick bootleg
  • Update samples.c
  • fix big samples dynamic loading when not an OST item
  • Update mcr2.c –
    Sepways.wav sample support for journey
  • Update inptport.c
  • Update foodf.c –
    Allows the player to face in the direction last applied.
  • A fix for Midway MCR3 game saving

LRMAME 2010

Description: 0.139 version of LRMAME

  • Fix Selecting “Inputs (this game)” crashes RetroArch on Android (should also affect other ARM builds)
  • backport 12-bit wrapping fix

    Fix 12-bit wrapping behavior in YM2608/2610 ADPCM_A decoding, fixes some glitches in certain samples in the metal slug series, and likely other games. [Lord Nightmare, madbr]
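
To show what wrapping means here as opposed to clamping, the following is a small sketch and not the emulator's decoder. It assumes the accumulator is treated as a signed 12-bit quantity. `wrap12` and `saturate12` are hypothetical helper names.

```python
def wrap12(value):
    """Wrap an integer into the signed 12-bit range -2048..2047."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def saturate12(value):
    """Clamp instead of wrapping, shown only for contrast."""
    return max(-2048, min(2047, value))


step = 16
print(wrap12(2047 + step), saturate12(2047 + step))
```

With wrapping, an accumulator that overshoots the top of the range reappears near the bottom. A decoder that clamps at the limit would produce a different waveform for the same sample data.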

Dolphin

Description: Gamecube/Wii emulator

  • Fix build for Windows x64
  • Request glcore context when video driver is gl

ParaLLEl N64

Description: Nintendo 64 emulator

  • Sanitize GL context requests
  • Update ParaLLEl RDP – about a ~10fps speed increase on Nvidia over previous version
  • Should fix Mega Man 64 graphics glitch (electric fence not visible)
  • Add ParaLLEl RDP
  • Add SI DMA Duration hack for Tetris 64
  • Use separate cache for DRAM and hidden DRAM.
  • Add DRAM flush and fix VI_REGISTER_OUTPUT.
  • Dump the hidden RDRAM as well.
  • Fix Seg Fault on Game Unload –
    When commit 11c1ae3 split r4300_execute into r4300_execute and r4300_init, it continued to check the “stop” variable, but this is undefined. Removing these checks resolves the seg fault and does not affect functionality.

O2EM

Description: Magnavox Odyssey 2 emulator

  • Fix YES/NO keys and set 0 as default key
  • Change Action button to B to be more consistent with other cores, and remove the shortcuts to 1/2/3/4 keys (useless with the new virtual keyboard)
  • Add option for virtual keyboard transparency
  • Add graphical virtual keyboard

Opera

Description: 3DO emulator

  • Fix Haiku build
  • Remove NVRAM file and try rename again on initial failure –
    Windows doesn’t like renaming over files? This logic accommodates both
    without needing platform specific behavior.
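
The retry logic described above could look roughly like the sketch below. This is an illustration in Python rather than the core's own code, and `replace_file` is a hypothetical helper name.

```python
import os


def replace_file(src, dst):
    """Rename src over dst, removing dst first if the plain rename fails."""
    try:
        os.rename(src, dst)
    except OSError:
        # Some platforms refuse to rename over an existing file.
        os.remove(dst)
        os.rename(src, dst)
```

The same call works both where renaming over an existing file succeeds directly and where it does not, so no platform check is needed.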

VirtualJaguar

Description: Atari Jaguar emulator

  • Add Haiku build

XRick

Description: Game engine implementation for Rick Dangerous

  • Add Haiku build

vitaQuake 2

Description: Quake 2 game engine core

  • Add GLES Support and initial Rockchip platform
  • Fix intermission screen being unskippable.
  • Put on par with Vita build (Bump to v.2.3).
  • WiiU: Add build
  • PSL1GHT: Add build

vitaQuake 3

Description: Game engine implementation for Quake 3: Arena

  • Add Haiku build

mGBA

Description: Game Boy Advance emulator

  • Libretro: Add cheevos support for GB/GBC
  • PS2: Update to newest toolchain

QuickNES

Description: 8bit Nintendo Entertainment System/Famicom emulator

  • PS2: Update to newest toolchain
  • Fix potential free(NULL); problems

FCEUmm

Description: 8bit Nintendo Entertainment System/Famicom emulator

  • PS2: Update to newest toolchain
  • user-adjustable Zapper tolerance
  • reduce max Zapper tolerance to 20
  • MMC1 overrides are treated as ines 2.0, so it needs at least default values for prgRam and chrRam columns.

    Fixes FF1 pink screen due to unmapped CHRRAM.

  • Use proper geometry when switching NTSC filter on or off –
    – Width changes were previously not respected when using NTSC filter. With full use of overscan, NES width
    is 602 px when NTSC filter is used and 256 px otherwise
  • Fix build when compiling without NTSC filter support
  • fixed 3DS build
  • Adjust Zapper tolerance; make Zapper input tolerance circular rather than rectangular by default
  • Fix timing when changing from PAL/Dendy to NTSC –
    – Starting with PAL/Dendy region and then changing to NTSC can cause frame timing to get stuck at 50 Hz
    – using RETRO_ENVIRONMENT_SET_SYSTEM_AV_INFO should remedy this.
  • NTSC: Remove height doubling/scanline effect –
    – We just use shaders for scanline if needed. NTSC + height doubling causes performance hit
    for some slow devices. Any decent platform should be able to handle scanline effect shaders at least.
  • ines.c: Simplify rom info logs and cleanup
  • Fk23c: Fix chr issues for some games –
    – Affects mostly waixing using mixed chr rom/ram modes (bit 2 of ram config register $A001)
  • Update ines-correct.h –
    – Add overrides for FK23C
    – Move MMC1 overrides
    – Move MMC5 overrides
  • Move overrides out of some mappers –
    Move mapper-based overrides out and use ines-correct.h if possible. Affects the following mappers below:
    – Move Cnrom database to ines-correct.h
    – Mapper 201 update
    – Mapper 91: Add Street Fighter III (Submapper 1) to ines-correct.h
    – Add dipswitch notes to m237
    – Update mapper9 (PC10 version of Mike Tyson’s Punch-out)
  • Move battery-backed prg ram override to ines-correct.h
  • Start expanding internal override database (ines-correct.h)
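
The 602 px and 256 px widths quoted in the NTSC filter geometry entry above can be reproduced with a short calculation. The 3-in/7-out chunk ratio below is an assumption, chosen because it yields 602 for a 256 px input. The names `ntsc_out_width` and `frame_width` are hypothetical and not taken from the core.

```python
IN_CHUNK = 3   # source pixels consumed per filter step (assumed)
OUT_CHUNK = 7  # output pixels produced per filter step (assumed)


def ntsc_out_width(in_width):
    """Output width in pixels for a given unfiltered width."""
    return ((in_width - 1) // IN_CHUNK + 1) * OUT_CHUNK


def frame_width(use_ntsc_filter, base_width=256):
    """Width the core would report to the frontend."""
    return ntsc_out_width(base_width) if use_ntsc_filter else base_width


print(frame_width(True), frame_width(False))
```

Because the width differs between the two modes, the reported geometry has to be refreshed whenever the filter is toggled.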

2048

Description: 2048 game implementation

  • PS2: Update to newest toolchain
  • PS2: Apply color correction
  • DOS: Add platform support

Picodrive

Description: Sega Megadrive/Genesis/32X/CD emulator

  • PS2: Update to newest toolchain

Snes9x 2010

Description: 16bit Super Nintendo/Famicom emulator

  • libretro: added granularity in SuperFX overclock –
    There doesn’t seem to be any benefit of overclocking above 15 MHz
    (150%). The user should be allowed to be more precise with their
    overclock setting now.
  • Fix MMC reg for 64bit builds –
    Fixes a segmentation fault when playing large ROM games.
  • Fix MMC bank register bit 7 (FuSoYa) –
    Fixes the 64Mbit ExLoRom map.
  • ROM: fail if ROM is invalid –
    Fixes an issue whereby a non-SNES file would cause a segmentation fault.
    This may occur if the selected ROM is corrupt, or a file has an
    incorrect extension.
  • APU: remove unused SoundSync –
    Additionally modified the resampler to use buffer size as a parameter
    instead of the number of samples within the buffer. Previously, the
    buffer size was being changed to the number of samples, and then changed
    back within the resampler.
  • snes9x: add defines for unused multi-cart support –
    The compiler was already optimising these unused functions out. The
    libretro core can define SNES_SUPPORT_MULTI_CART 1 to re-enable support
    for multi-cart in the future.
  • snes9x: refactor defines and remove overscan –
    Out of bounds memory fix from
  • snes9x: APU: Fix buffer overrun –
    Additionally:
    This fixes linking with LTO.
    Disable audio if an error occurs in init instead of continuing and
    segfaulting.
  • snes9x: reduce APU buffer to 64ms –
    I believe this to be a more sane setting than a 1000ms buffer.
  • snes9x: backport config from upstream –
    Most options were not available in English, despite it being the default
    language. All the options that were available in Turkish are now
    available in English.
  • snes9x: fix headercount increment
  • libretro: fix pitch measurement
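
For a sense of scale on the APU buffer change from 1000 ms to 64 ms listed above, the buffer length can be converted to audio frames. This is plain arithmetic, not code from the core. The 32000 Hz sample rate is an assumption for illustration, and `buffer_frames` is a hypothetical helper.

```python
def buffer_frames(sample_rate_hz, buffer_ms):
    """Number of audio frames that fit in a buffer of the given length."""
    return sample_rate_hz * buffer_ms // 1000


RATE = 32000  # assumed sample rate in Hz
print(buffer_frames(RATE, 64), buffer_frames(RATE, 1000))
```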

Prboom

Description: Game engine core for Doom 1/2, Ultimate Doom, Final Doom

  • Switch from ad-hoc endianness handling to retro_endianness.h

Vecx

Description: Vectrex emulator

  • Fixed colour conversion (7 bit mono to RGB1555).
  • support analog controllers
  • Make line drawing code more efficient by doing RGB conversion once per line only
  • Nicer point shape
  • Allow scaling of vector display
  • more flexibility in adjusting display – allow for scaling and shifting to fit overlays
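
One plausible way to convert a 7-bit mono level to RGB1555, as mentioned in the first item above, is sketched here. It is an assumption about the general approach and not the core's actual routine. `mono7_to_rgb1555` is a hypothetical name.

```python
def mono7_to_rgb1555(level):
    """Expand a 7-bit brightness (0..127) to a grey 0RGB1555 pixel."""
    c = (level & 0x7F) >> 2          # keep the top 5 of the 7 bits
    return (c << 10) | (c << 5) | c  # same intensity in R, G and B


print(hex(mono7_to_rgb1555(127)))
```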

NeoCD

Description: SNK Neo Geo CD emulator

  • CD-ROM controller logic split into a separate file
  • Y Zooming doesn’t need a ROM file anymore
  • New system to identify and patch BIOS, should allow unknown BIOS to run.
  • BIOS files don’t need to have a specific name anymore; they are identified by contents
  • Support for Universe BIOS 3.3
  • BIOS name in the menu now includes filename
  • Add synchronous CD operation mode
  • Vita: Use synchronous CD operation mode
  • Vita: Add build
  • 3DS: Add build
  • WiiU: Add build
  • PSL1GHT: Add build
  • Emscripten: Add build
  • Implemented horizontal interrupt masking (not verified on real hardware)
  • Fix CDROM music endianness
  • Fix big-endian support

PocketCDG

Description: MP3 Karaoke audio player

  • Properly send logging to proper place –
    use logging interface as much as possible with stderr as fallback
    rather than being inconsistent
  • Swap frame before passing it to audio_batch_cb on bigendian –
    audio_batch_cb expects native-endian frames and libmad gives little-endian
    frames. Hence on big-endian we need to swap
  • libmad: Fix big-endian support
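
The byte swap described above, where little-endian frames are converted to native order before being handed to the audio callback, can be illustrated as follows. This is a sketch in Python rather than the core's code. `to_native_endian` is a hypothetical helper, and signed 16-bit samples are assumed.

```python
import sys
from array import array


def to_native_endian(le_bytes):
    """Convert little-endian signed 16-bit PCM bytes to native-endian samples."""
    samples = array("h")
    samples.frombytes(le_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # only big-endian hosts need the swap
    return samples


print(list(to_native_endian(b"\x01\x00\xff\x7f")))
```

On a little-endian host the data passes through untouched, which is why the problem only showed up on big-endian platforms.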

FreeChaF

Description: Fairchild Channel F emulator

  • Switch from ad-hoc endianness to retro_endianness.h
  • separated CHANNELF HLE code

ECWolf

Description: Wolfenstein 3D game engine core

  • Latest updates from upstream
  • iOS/TVOS: Add build
  • WiiU: Add build

UAE4Arm

Description: Commodore Amiga emulator

  • Libretro improvements
  • Two joystick support with automatic switching
  • Define controls close to PUAE default. Use right analog stick as mouse and L2-R2 as mouse button.
  • Add working virtual keyboard support
  • Emulation resolution handling is better (especially change). 2nd mouse button working.
  • Solve right mouse button not working
  • Solve different video resolution
  • Solve audio glitches

P-UAE

Description: Commodore Amiga emulator

  • Disable waiting blits by default
  • Remapping fixes + clarifications
  • Zoom mode horizontal croppings + cleanups
  • Prefixes for hidden core options
  • Better sorting in M3U generation
  • Geometry fixes, Keymap update, Cleanups
  • WHDLoad fixes –
    – Custom parameter not working on many cases
    – Error messages and copy skips if Kickstart filesizes are not correct
  • Model preset overrides, Optional region search
  • Core option for D-Pad joystick/mouse switching
  • Model config overhaul
  • NTSC fixes
  • Optional region forcing with No-Intro tag support
  • WHDLoad update
  • Sort generated M3Us, Statusbar + glue updates
  • Fix android build
  • New defaults for CPU and Drive Sound Emulation, Sound cleanups
  • User-friendly warning messages for Kickstarts and CAPSImg
  • Attempt to fix crash when reloading core on static builds
  • Fix ZIP Browse Archive
  • M3U ZIP fix
  • Hor+Ver positioning fixes
  • Add “Remove Interlace Artifacts” core option
  • Backport interlaced double line field mode to replace old frame mode
  • Fix sound filter from effectively being always Automatic at startup
  • Fix filter type update, Improve zoom
  • Add core option for CD startup delayed insert, Remove previous disc change detect trickery
  • Change model force hierarchy, Option label updates
  • Android unzippings yet again
  • Core option for muting floppy sound when drive is empty
  • Automatic horizontal centering improvement
  • ANDROID X86: Fix build
  • Fix amd64 compilation –
    m68kops for amd64 is empty, use generic exactly how it was before the rewrite
  • Rewrite libretro m68k.h and m68kops.h to use non-libretro variants –
    They encapsulate per-cpu optimizations. There is no reason to
    have a version for libretro
  • Rewrite maccess.h using new retro_endianness
  • Automatic zoom improvements
  • Control improvements:
    – Enabled D-Pad as mouse in analog joystick mode to help out menu traversing, otherwise D-Pad does nothing unless toggled to mouse mode
    – Added core option for inserting RetroPads to joystick ports in different order (for Arcadia, Dyna Blaster etc.)
  • Fix vertical touch alignment on keyboard while zoomed
  • Global conf file, MultiDrive via Disk Insert
  • CDTV core option
  • Statusbar finetuning
  • Disk Control additions:
    – Transparent ZIP support in M3U parsing
    – Insert Disk support for M3U/ZIP/etc

VICE

Description: Commodore 64 home computer emulator

  • Keymap core option + rework
  • Autoloadwarp enhancement, Core option label tweaking
  • Remapping fixes + clarifications
  • Fix Plus4 cartridge launching, Statusbar fixes
  • Autoloadwarp fix for D81s, Statusbar cleanups
  • Freeze cartridge to reset types
  • More robust floppy autoloadwarping
  • VIC-20 updates:
    – RAM block set tidying
    – Fixed starting carts in M3Us
  • Achievements environment
  • Core option for 2nd SID, Warp mode rework
  • Automatic Load Warp core option
  • NBZ support
  • Dump available and not yet core optionized resources for easier ‘vicerc’ usage
  • Disk Control finetuning –
    – Fallback to drive 8
    – Remove “.” from image type detection extensions to also match cases like “.hidden-d64”
  • Fixed and enabled printer
  • Better sorting in M3U generation
  • Disk Control updates –
    – Allow CRTs & PRGs in M3Us
    – Removed redundant short pathname label fallback in widget
  • Automatic region fixes
  • Zoom mode honing –
    – Changed manual mode operation from automatic to optional
    – Corrected non-wide calculations
    – Added VIC-II/VIC/TED border info to sublabels
  • Core option reorganizing –
    – Separated VICE option variables from core option variables
    – Prevented updating VICE variables to the same variable
    – Simplified palette options
    – Fixed CBM2 embedded palettes & added missing PLUS4 embedded files
    – Removed & disabled non-working CBM2 models (510 not selectable in standalone, therefore no point fixing embedded files)
    – Fixed Super VIC memory expansion
    – Fixed CBM2 crash on higher resolution models
  • Work disk improvements –
    – No need to reset
    – Fixed start content
  • Automatic model core options for x64 & x64sc –
    – Scans for “NTSC|(USA)” and “PAL|(Europe)” tags
    – Both PAL/NTSC and C64/C64C can be preferred
    – Default is “C64 PAL Automatic”
  • Sync finetunings –
    – GetTicks returns a ticker from get_time_usec instead of fake microSecCounter
    – Bypassed internal frameskips and delays
    – Warp mode speed improved
    – Statusbar updates FPS in 1 second intervals instead of 2 & shows real FPS from both warp and fastforward
  • Work disk core option with device selection
  • Fix for autostart detection of D71
  • Fix QNX build
  • Zoom improvements –
    – Presets for usual suspects (16:9, 16:10, 4:3, 5:4)
    – Hardcode exceptions minimized, math maximized with science
  • Fix Emscripten build
  • Switch from adhoc endianness handling to retro_endianness.h
  • Sort generated M3Us
  • Better retro_get_region, Embedded additions
  • Include NIBTOOLS for automatic NIB->G64 conversion
  • Manual cropping core options
  • Remove border stuff
  • Zoom overhaul, Reorganizing, Cleanups
  • JiffyDOS & GO64 for C128
  • Direct hotkey for joyport switching, TDE + DSE enabled by default
  • Disable JiffyDOS with tapes also on static platforms
  • Rename mouse_x to retro_mouse_x in retrostubs.c to avoid confusion with mousedrv.c
  • Fix compilation on Vita
  • PET fixes (embedded data, keyboard layout) + Cleanups
  • Extended ZIP+M3U Disk Control
  • Fix model change not triggering geometry change with borders disabled
  • PSL1GHT: Add build
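
The automatic model options listed above pick a video standard from tags in the content filename. A minimal sketch of that kind of tag scan, assuming plain substring matching and a hypothetical helper named detect_region (the core's real matching rules are not shown in this post):

```python
def detect_region(filename, default="PAL"):
    """Guess the video standard from filename tags (hypothetical helper).

    Checks for the tags named in the changelog: "NTSC" or "(USA)" select
    NTSC, "PAL" or "(Europe)" select PAL. When no tag matches, the default
    is returned, mirroring the "C64 PAL Automatic" default.
    """
    if "NTSC" in filename or "(USA)" in filename:
        return "NTSC"
    if "PAL" in filename or "(Europe)" in filename:
        return "PAL"
    return default
```

Plain substring matching is deliberately naive: a title that merely contains the letters "PAL" would also match, so a real implementation would likely want stricter tag boundaries.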

Frodo

Description: Commodore 64 emulator

  • Disable compilation of CmdPipe.cpp on libretro
  • Add safeguard in case of thread allocation failure
  • Resync libco
  • Add logging
  • 3DS: Add build
  • PSP: Add build
  • Vita: Add build
  • QNX: Add build
  • QNX: Fix cmdpipe
  • Exclude cmdpipe on PSP1 and Vita
  • Don’t use sigaction on PSP1 and Vita
  • Don’t use chdir and getcwd on PSP1 and Vita
  • Change pulsehandler not to use signals on libretro

Quasi88

  • Fix declaration of INLINE on Android
  • Support for big-endian Linux
  • Switch from ad-hoc endianness to retro_endianness

Nestopia

Description: 8bit Nintendo Famicom/NES emulator

  • Fix scratchy audio in Super Mario Bros. 3 and others

Uzem

  • 3DS: Render at half-width
  • PSP: Render at half-width
  • Android: Fix build
  • Gamecube: Add build
  • PSL1GHT: Add build
  • WiiU: Add build
  • Big-endian support
  • avr8: Fix initial memory cleaning –
    Current code zeroes out only part of intended array
  • Disable av8::idle on libretro –
    It’s not used by libretro
  • Disable recording code on libretro –
    It’s not used by retroarch but it links with popen that is not
    available on many of retroarch platforms
  • Support compilation with g++ 4.6

GME

Description: Game Music Emulator core

  • Gamecube: Add build
  • Wii: Add build
  • WiiU: Add build
  • PSP1: Add build
  • Vita: Add build
  • Use retro_endianness.h instead of ad-hoc endianness defines

Cap32 / Caprice

Description:

  • Fix video glitch on big-endian
  • Replace ad-hoc MSB_FIRST with retro_endianness.h
  • QNX: Fix build
  • PSL1GHT: Add build

CrocoDS

Description:

  • PSL1GHT: Add build
  • WiiU: Add build
  • Wii: Add build
  • Gamecube: Add build
  • Filter-out identical calls to SET_GEOMETRY
  • Fix disk-reading routines on big-endian –
    Current approach of swapping after reading/before writing is sound
    in theory but in current codebase it’s difficult to track and leads easily
    to double-swaps. Just swap the values right before use and at assignment
  • Optimize byteswap
  • Fix big-endian video rendering
  • Disk endianness fix
  • Fix arguments to ReadPort/WritePort –
    Passing 32-bit value instead of 16-bit has undefined results. It works
    on some platforms and fails on others. Fix it properly
  • Support big-endian systems
  • Use retro_endianness instead of SDL_BYTEORDER
  • PSP: Add build
  • Vita: Add build

blueMSX

Description: Home computer MSX emulator

  • Remove leftover endianness defines
  • Fix arguments for coinDeviceCreate –
    Argument mismatch is fatal on emscripten
  • PSL1GHT: Add build
  • Fix non-smooth scrolling in PAL 50Hz

meowPC98

Description:

  • Move from adhoc endianness to retro_endianness
  • PSP and Vita fixes
  • QNX: Add build
  • PSL1GHT: Add build

NP2Kai

Description:

  • Latest updates from upstream

Bk Emulator

Description: BK-0010/0011/Terak 8510a emulator

  • Wii: Add build
  • Gamecube: Add build
  • Use unsigned int instead of uint –
    For mingw compatibility
  • PSL1GHT: Add build

FUSE

Description:

  • Emscripten fixes

Lutro

Description:

  • Vita: Fix build
  • QNX: Add build

PX68k

Description: Sharp X68000 Emulator

  • Vita: Add build
  • QNX: Add build

Craft

Description:

  • Emscripten: Fix build
  • Fix path to auth database.
    Current path ends up in current working dir which might be anywhere,
    unwritable or not even exist as a concept.
  • Don’t attempt to run main loop after failure.
    It only clouds what errors have occurred
  • Android: Now goes ingame and no longer crashes at startup, doesn’t render blocks yet though

Mr. Boom

Description:

  • QNX: Add build
  • Emscripten: Fix build
  • Support big-endian Linux
  • PSL1GHT: Add build

Xmil

Description: X Millennium Sharp X1 emulator

  • Android: Fix compilation
  • Wii: Add build
  • Gamecube: Add build
  • Add big-endian support
  • PSL1GHT: Add build
  • Add iOS arm64 build support
  • PSP1 scaling messes when source is larger than psp horizontal resolution,
    hence render at quarter resolution
  • PSP: Add build
  • Fix QVGA support: Code for rendering at quarter of resolution is broken for libretro files
    altogether as it was never really implemented.

    Highres QVGA rendering forgot horizontal offset resulting in single-color lines,
    fix it as well

  • Vita: Add build
  • QNX: Add build

Atari800

Description: Atari 800 emulator

  • Support big-endian linux
  • PSP: Fixes
  • config: don’t use unaligned access on RISC –
    configure.ac enables it only on x86 and m68k. Mirror it in our manual config
  • Use frame counter instead of real clock

SMS Plus GX

Description:

  • Allow enabling or disabling FM sound (YM2413)
  • Prevent potential crash and cleanup –
    – set max geometry height to 240
    – remove some unnecessary functions and variables
    – move some callbacks from retro_init to retro_load_game
  • Add core option (Remove Border) –
    – removes left border (overscan). Works on SMS only.
    – General video renderer cleanups
  • Add core options for region and hardware type overrides
  • Add proper support for PAL content –
    – timing/region is detected using internal database
    – fix to rom names on rom info log
  • Improve performance when using NTSC filters –
    – Removes Sony decoder core options
    – Removes doubling of height when using NTSC filters
    – Use standard integers where applicable
  • Allow core to safely close when a required bios is not found –
    – Coleco rom content requires the coleco bios to exist. So if none is found,
    just allow core to safely exit. SMS roms do not require a bios to be playable, so no check is necessary for them.
  • Add support for colecovision roms (Experimental)

RACE

Description: Neo Geo Pocket (Color) emulator

  • Fix input mapping typo

TGB Dual

Description: Game Boy (Color) emulator with splitscreen link cable support

  • 3DS: Update target

FreeIntv

Description:

  • Fixes graphics issues – Intellivania not working on emulator
  • Correctly detect MTE Test Cart
  • Fixes #40 – Intellicart roms w/o A8 –
    Detects Intellicart roms using a different method for files that don’t begin with 0xA8
  • Expose RAM for Retroachievements
  • OSD, Keyboard keypad controls –
    keys 0-9 as expected. [ and ] replace C and E
    OSD updated, real messages replace cryptic “colored pixel” loading error feedback

TyrQuake

Description:

  • 3DS: Fix build
  • PSL1GHT: Add build

LRMAME

Description:

  • Update to 0.220
  • add 4-way joystick simulation option
  • Correct 4way support

Theodore

Description:

  • PS2: Add build
  • Auto load BASIC 1 cartridge instead of BASIC 128 cartridge on TO7/70
  • Add virtual keyboard transparency option
  • Add TO8/9 keyboard
  • Update TO8 keyboard image and add TO7&TO7/70 virtual keyboards
  • Add virtual keyboards for all models
  • First version of on-screen virtual keyboard

gpSP

Description: Game Boy Advance emulator

  • Fix cheevos support

RetroArch 1.8.7 released!


RetroArch 1.8.7 has just been released.

Grab it here.

We will release a Cores Progress report soon going over all the core changes that have happened since the last report. It’s an exhaustive list, and especially the older consoles will receive a lot of new cores and improvements.

Remember that this project exists for the benefit of our users, and that we wouldn’t keep doing this were it not for spreading the love with our users. This project exists because of your support and belief in us to keep going doing great things. If you’d like to show your support, consider donating to us. Check here in order to learn more. In addition to being able to support us on Patreon, there is now also the option to sponsor us on Github Sponsors! You can also help us out by buying some of our merch on our Teespring store!

Highlights

There are many things this release post will not touch upon, such as all the extra cores that have been added to the various console platforms. We’ll spend some more time on that in a future Cores Progress Report post. We’ll go over some of the other highlights instead.

Netplay bugfixes

Some major netplay regressions snuck into version 1.8.5 and have remained a problem ever since. 1.8.7 finally fixes these issues.

New desktop-style Playlist View mode for MaterialUI

1.8.7 adds a new Desktop Thumbnail View to Material UI, available when using landscape display orientations. This is similar to Ozone’s playlist view. Above is a random screenshot showing what it looks like.

Notes:

  • The status bar at the bottom can be hidden by disabling Settings > Playlists > Show Playlist Sub-Labels
  • Touching/clicking the thumbnail bar toggles the fullscreen thumbnails view

This also represents a major refactor of MaterialUI’s menu entry handling code, which will make other kinds of playlist view mode easier to implement in the future.

Finally, this fixes two small existing issues:

  • Entry dividers now fade correctly during menu transition animations (this is subtle, and I only just realised that it wasn’t working!)
  • The ‘missing thumbnail’ placeholders now fade into view, just like normal thumbnails (previously, they were always displayed instantly, which was quite jarring)

Disable ‘Use Global Core Options File’ by default

1.8.7 changes the default setting of Use Global Core Options File to OFF.

This was only set to ON by default for consistency with legacy setups. There is no material benefit to this – in fact, a global core options file has the following downsides:

More file I/O – all options have to be read/written every time content is loaded or options are saved

Difficulty in editing option values by hand – e.g. sometimes this is necessary if a particular setting causes a core to crash, and if options for all cores are bundled together then sifting through them to find the one you need becomes a chore

Obsolescence – settings for old/unused/outdated cores hang around forever, and bloat the global options file without purpose. With per-core options, it is easy to remove settings for unwanted cores

Since settings are automatically imported from the legacy global file on first run when per-core files are enabled, changing the default behaviour will not harm any existing installation.

Don’t perform unnecessary cheevos initialisation when cheevos are disabled

Previously, on all platforms with cheevos support, rcheevos_load() was called each time content was loaded. This meant the following happened even when cheevos were disabled:

  • If the core does not require the full content path (i.e. if RetroArch passes a data buffer directly), then a copy of the content data is made (up to 64 MB in size)
  • If the content is an m3u file, the file is opened and parsed to get the extension of the first file listed inside
  • A checksum is calculated for the content file extension
  • A task is pushed
  • A mutex is locked/unlocked several times

When Cheevos (Achievements) are disabled, all these things are unnecessary work, causing increased loading times and memory usage. On platforms with low memory (i.e. consoles) the unnecessary content data duplication is potentially harmful and may cause crashes.

1.8.7 very simply adds an ‘early out’ to rcheevos_load() which prevents the above unnecessary work when cheevos are disabled.
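
The ‘early out’ pattern itself is simple. Here is a minimal sketch in Python, not the actual rcheevos_load() code: the function name load_achievements, its parameters and the work_log list are all hypothetical and only stand in for the expensive steps listed above.

```python
def load_achievements(content, cheevos_enabled, work_log):
    """Hypothetical stand-in for an achievements loader with an early out.

    `work_log` is an illustrative list that records the expensive steps,
    so a caller can see which work was actually performed.
    """
    if not cheevos_enabled:
        # Early out: no content copy, no checksum, no task, no locking.
        return False

    # Everything below only runs when achievements are enabled.
    content_copy = bytearray(content)  # bytearray() always makes a real copy
    work_log.append(("copied_bytes", len(content_copy)))
    work_log.append(("task_pushed", True))
    return True
```

With the check placed first, the disabled case costs one comparison instead of a full copy of the content buffer.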

Cheevos: option to start a session with all achievements active

1.8.7 adds an option that lets players start a gaming session with all achievements active (even the ones they have already unlocked on RetroAchievements.org).

When cheevos_start_active = true, instead of You have X of Y achievements unlocked, the player will see a message like this:

How the option looks in XMB:

And in Ozone:

Fallback directories for shader presets

This allows us to use the Menu Config and config file directories as fallback to store shader presets when the Video Shader directory is not writable by the user, thus following the same behavior shown by the “Save shader as” menu option.

This allows users to handle their own presets without having to mess with the directory configuration on distros such as ArchLinux, where shaders (among other assets) are managed through additional packages. But it also goes a bit further and changes the order of the preset directories, searching first on the Menu Config path, then on the Video Shader path, and finally on the directory of the config file.

This would improve the portability of the configuration for Android users, because they cannot explore the default shaders directory without rooting their devices. Moreover, I think it makes more sense, as regular configuration overrides are already being stored on the Menu Config path by default.
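
The fallback order described above (Menu Config path, then Video Shader path, then the directory of the config file) can be sketched as follows. This is an illustration rather than RetroArch's code: the function name save_preset and its parameters are assumptions.

```python
import os


def save_preset(filename, data, menu_config_dir, video_shader_dir, config_file_dir):
    """Try each candidate directory in order; return the path written, or None.

    Hypothetical helper: the three directory arguments mirror the order
    described in the post, but the signature is made up for illustration.
    """
    for directory in (menu_config_dir, video_shader_dir, config_file_dir):
        path = os.path.join(directory, filename)
        try:
            with open(path, "w") as handle:
                handle.write(data)
        except OSError:
            # Missing or non-writable directory: fall through to the next one.
            continue
        return path
    return None
```

Attempting the write and catching the failure avoids a separate "is this writable?" check that could go stale between the check and the write.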

Testing
Assuming these directory values:

  • menu_config: /home/user/.config/retroarch/config/ (non-writable for testing purposes)
  • video_shaders: /home/user/.local/share/libretro/shaders/ (non-writable)
  • retroarch.cfg: /home/user/.config/retroarch/retroarch.cfg

The following menu options have all been successfully tested (see appended log output).

    Save options

    Save Shader Preset As

    [WARN] Failed writing shader preset to /home/user/.config/retroarch/config/foobar.glslp.
    [WARN] Failed writing shader preset to /home/user/.local/share/libretro/shaders/foobar.glslp.
    [INFO] Saved shader preset to /home/user/.config/retroarch/foobar.glslp.

    Save Global Preset

    [WARN] Failed to create preset directory /home/user/.config/retroarch/config/presets/.
    [WARN] Failed to create preset directory /home/user/.local/share/libretro/shaders/presets/.
    [INFO] Saved shader preset to /home/user/.config/retroarch/presets/global.glslp.

    Save Core Preset

    [WARN] Failed to create preset directory /home/user/.config/retroarch/config/presets/Snes9x/.
    [WARN] Failed to create preset directory /home/user/.local/share/libretro/shaders/presets/Snes9x/.
    [INFO] Saved shader preset to /home/user/.config/retroarch/presets/Snes9x/Snes9x.glslp.

    Save Content Directory Preset
    [WARN] Failed to create preset directory /home/user/.config/retroarch/config/presets/Snes9x/.
    [WARN] Failed to create preset directory /home/user/.local/share/libretro/shaders/presets/Snes9x/.
    [INFO] Saved shader preset to /home/user/.config/retroarch/presets/Snes9x/SNES.glslp.

    Save Game Preset
    [WARN] Failed to create preset directory /home/user/.config/retroarch/config/presets/Snes9x/.
    [WARN] Failed to create preset directory /home/user/.local/share/libretro/shaders/presets/Snes9x/.
    [INFO] Saved shader preset to /home/user/.config/retroarch/presets/Snes9x/Legend of Zelda, The – A Link to the Past (USA).glslp.

    Apply Changes option
    [WARN] Failed writing shader preset to /home/user/.config/retroarch/config/retroarch.glslp.
    [WARN] Failed writing shader preset to /home/user/.local/share/libretro/shaders/retroarch.glslp.
    [INFO] Saved shader preset to /home/user/.config/retroarch/retroarch.glslp.

    Run content log output
    Game specific shader preset found on fallback directory
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/config/presets
    [INFO] [Shaders]: preset directory: /home/user/.local/share/libretro/shaders/presets
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/presets
    [INFO] [Shaders]: Specific shader preset found at /home/user/.config/retroarch/presets/Snes9x/Legend of Zelda, The – A Link to the Past (USA).glslp.
    [INFO] [Shaders]: game-specific shader preset found.

    Folder specific shader preset found on fallback directory
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/config/presets
    [INFO] [Shaders]: preset directory: /home/user/.local/share/libretro/shaders/presets
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/presets
    [INFO] [Shaders]: Specific shader preset found at /home/user/.config/retroarch/presets/Snes9x/SNES.glslp.
    [INFO] [Shaders]: folder-specific shader preset found.

    Core specific shader preset found on fallback directory
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/config/presets
    [INFO] [Shaders]: preset directory: /home/user/.local/share/libretro/shaders/presets
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/presets
    [INFO] [Shaders]: Specific shader preset found at /home/user/.config/retroarch/presets/Snes9x/Snes9x.glslp.
    [INFO] [Shaders]: core-specific shader preset found.

    Global shader preset found on fallback directory
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/config/presets
    [INFO] [Shaders]: preset directory: /home/user/.local/share/libretro/shaders/presets
    [INFO] [Shaders]: preset directory: /home/user/.config/retroarch/presets
    [INFO] [Shaders]: Specific shader preset found at /home/user/.config/retroarch/presets/global.glslp.
    [INFO] [Shaders]: global shader preset found.

    Remove options

    Remove Global Preset
    [INFO] Deleted shader preset from /home/user/.config/retroarch/presets/global.glslp.

    Remove Core Preset
    [INFO] Deleted shader preset from /home/user/.config/retroarch/presets/Snes9x/Snes9x.glslp.

    Remove Content Directory Preset
    [INFO] Deleted shader preset from /home/user/.config/retroarch/presets/Snes9x/SNES.glslp.

    Remove Game Preset
    [INFO] Deleted shader preset from /home/user/.config/retroarch/presets/Snes9x/Legend of Zelda, The – A Link to the Past (USA).glslp.

    Some other noteworthy things

    • RetroArch WiiU now has working graphics widgets. OSD notifications are no longer just plain yellow-colored text.
    • RetroArch 3DS now has basic networking and Cheevos (RetroAchievements) support.
    • With RetroArch 1.8.7 and overclocking, NeoCD reaches fullspeed on RetroArch PSVita. Even audio playback doesn’t stutter any more

    Changelog

    What you’ve read above is just a small sampling of what 1.8.7 has to offer. There might be things that we forgot to list in the changelog listed below, but here it is for your perusal regardless.

    1.8.7

    • 3DS: Add IDs for Frodo
    • 3DS: Enable basic networking / cheevos
    • CHEEVOS/BUGFIX: Opening achievements list would crash RetroArch with badges enabled (on new games)
    • CHEEVOS: Option to start a session with all achievements active
    • CHEEVOS: Don’t perform unnecessary cheevos initialisation when cheevos are disabled. Should reduce startup times when loading content.
    • CORE OPTIONS: Disable ‘Use Global Core Options File’ by default
    • DOS/DJGPP: Add 32bit color support for cores
    • GLCORE: Switch to glcore video driver when requested by a core
    • LINUX/XDG: Use GenericName correctly in desktop entry
    • MAC/COCOA: Fix mouse cursor tracking
    • MENU/MATERIALUI: Add desktop-style playlist view mode
    • MENU/MATERIALUI/DESKTOPVIEW: When scrolling playlists, show last selected thumbnails while waiting for next entry to load
    • MENU/MATERIALUI: Limit tab switch rate when input repeat is active
    • MENU/OZONE: Fix sidebar playlist sort order when ‘Truncate Playlist Names’ is enabled
    • MENU/RGUI: Adjusted menu defaults, adjusted default scrolling speed
    • MENU/RGUI: Enable custom wallpaper when menu size is reduced at low resolutions
    • MENU/XMB: Limit tab switch rate when input repeat is active
    • NETPLAY: Fix regressions introduced in 1.8.5
    • RGUI: Add option to always stretch menu to fill the screen
    • WIIU: Enable graphics widgets

Reviving and rewriting paraLLEl-RDP – Fast and accurate low-level N64 RDP emulation

Over the last few months after completing the paraLLEl-RSP rewrite to a Lightrec based recompiler, I’ve been plugging away on a project which I had been putting off for years, to implement the N64 RDP with Vulkan compute shaders in a low-level fashion. Every design of the old implementation has been scrapped, and a new implementation has arisen from the ashes. I’ve learned a lot of advanced compute techniques, and I’m able to use far better methods than I was ever able to use back in the early days. This time, I wanted to do it right. Writing a good, accurate software renderer on a massively parallel architecture is not easy and you need to rethink everything. Serial C code will get you nowhere on a GPU, but it’s a fun puzzle, and quite rewarding when stuff works.

The new implementation is a standalone repository that could be integrated into any emulator given the effort: https://github.com/Themaister/parallel-rdp. For this first release, I integrated it into parallel-n64. It is licensed as MIT, so feel free to integrate it in other emulators as well.

Why?

I wanted to prove to myself that I could, and it’s … a little fun? I won’t claim this is more than it is. 🙂

Chasing bit-exactness

The new implementation is implemented in a test-driven way. The Angrylion renderer is used as a reference, and the goal is to generate the exact same output in the new renderer. I started writing an RDP conformance suite. Here, we generate RDP commands in C++, run the commands across different implementations, and compare results in RDRAM (and hidden RDRAM of course, 9-bit RAM is no joke). To pass, we must get an exact match. This is all fixed-point arithmetic, no room for error! I’ve basically just been studying Angrylion to understand what on earth is supposed to happen, and trying to make sense of what the higher level goal of everything is. In LLE, there’s a lot of weird magic that just happens to work out.
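
The core of such a conformance check is small. The following sketch is not the real suite (which is written in C++); it assumes each renderer can be modelled as a function that takes a command list and writes into a byte buffer standing in for RDRAM. Hidden RDRAM is left out here, but it could be compared the same way as a second buffer. All names are hypothetical.

```python
def first_mismatch(reference_rdram, candidate_rdram):
    """Return the first differing byte offset, or None if bit-exact."""
    for offset, (ref, got) in enumerate(zip(reference_rdram, candidate_rdram)):
        if ref != got:
            return offset
    if len(reference_rdram) != len(candidate_rdram):
        # One buffer is a strict prefix of the other.
        return min(len(reference_rdram), len(candidate_rdram))
    return None


def run_conformance_case(commands, reference_impl, candidate_impl, rdram_size):
    """Feed the same commands to both renderers and compare their memory."""
    reference = bytearray(rdram_size)
    candidate = bytearray(rdram_size)
    reference_impl(commands, reference)
    candidate_impl(commands, candidate)
    return first_mismatch(reference, candidate)
```

A case passes only when the result is None, since there is no tolerance for "close enough" in fixed-point output.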

I’m quite happy with where I’ve ended up with testing and seeing output like this gives me a small dopamine shot before committing:

122/163 Test #122: rdp-test-interpolation-color-texture-ci4-tlut-ia16 ………………………….. Passed 2.50 sec
Start 123: rdp-test-interpolation-color-texture-ci8-tlut-ia16
123/163 Test #123: rdp-test-interpolation-color-texture-ci8-tlut-ia16 ………………………….. Passed 2.40 sec
Start 124: rdp-test-interpolation-color-texture-ci16-tlut-ia16
124/163 Test #124: rdp-test-interpolation-color-texture-ci16-tlut-ia16 …………………………. Passed 2.37 sec
Start 125: rdp-test-interpolation-color-texture-ci32-tlut-ia16
125/163 Test #125: rdp-test-interpolation-color-texture-ci32-tlut-ia16 …………………………. Passed 2.45 sec
Start 126: rdp-test-interpolation-color-texture-2cycle-lod-frac
126/163 Test #126: rdp-test-interpolation-color-texture-2cycle-lod-frac ………………………… Passed 2.51 sec
Start 127: rdp-test-interpolation-color-texture-perspective
127/163 Test #127: rdp-test-interpolation-color-texture-perspective ……………………………. Passed 2.50 sec
Start 128: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac
128/163 Test #128: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac ……………… Passed 3.29 sec
Start 129: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac-sharpen
129/163 Test #129: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac-sharpen ………. Passed 3.26 sec
Start 130: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac-detail
130/163 Test #130: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac-detail ……….. Passed 3.48 sec
Start 131: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac-sharpen-detail
131/163 Test #131: rdp-test-interpolation-color-texture-perspective-2cycle-lod-frac-sharpen-detail … Passed 3.26 sec
Start 132: rdp-test-texture-load-tile-16-yuv

151/163 Test #151: vi-test-aa-none …………………………………………………………. Passed 21.19 sec
Start 152: vi-test-aa-extra-dither-filter
152/163 Test #152: vi-test-aa-extra-dither-filter ……………………………………………. Passed 48.77 sec
Start 153: vi-test-aa-extra-divot
153/163 Test #153: vi-test-aa-extra-divot …………………………………………………… Passed 64.29 sec
Start 154: vi-test-aa-extra-dither-filter-divot
154/163 Test #154: vi-test-aa-extra-dither-filter-divot ………………………………………. Passed 65.90 sec
Start 155: vi-test-aa-extra-gamma
155/163 Test #155: vi-test-aa-extra-gamma …………………………………………………… Passed 48.28 sec
Start 156: vi-test-aa-extra-gamma-dither
156/163 Test #156: vi-test-aa-extra-gamma-dither …………………………………………….. Passed 48.18 sec
Start 157: vi-test-aa-extra-nogamma-dither
157/163 Test #157: vi-test-aa-extra-nogamma-dither …………………………………………… Passed 47.56 sec

100% tests passed, 0 tests failed out of 163 #feelsgoodman

Ideally, if someone is clever enough to hook up a serial connection to the N64, it might be possible to run these tests through a real N64, which would be interesting.

I also fully implemented the VI this time around. It passes bit-exact output with Angrylion in my tests and there is a VI conformance suite to validate this as well. I implemented almost the entire thing without even running actual content. Once I got to test real content and sort out the last weird bugs, we get to the next important part of a test-driven development workflow …

The importance of dumping formats

A critical aspect of verifying behavior is being able to dump RDP commands from the emulator and replay them.

On the left I have Angrylion and on the right paraLLEl-RDP running side by side from a dump where I can step draw by draw, and drill down any pesky bugs quite effectively. This humble tool has been invaluable. The Angrylion backend in parallel-n64 can be configured to generate dumps which are then used to drill down rendering bugs offline.
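
To illustrate the dump-and-replay idea, here is a small sketch. The length-prefixed record layout is a made-up format for this example, not the dump format parallel-n64 actually writes, and the renderer callbacks are hypothetical.

```python
import struct


def dump_commands(commands):
    """Serialise command blobs as length-prefixed records (hypothetical format)."""
    out = bytearray()
    for command in commands:
        out += struct.pack("<I", len(command))  # 4-byte little-endian length
        out += command
    return bytes(out)


def load_commands(blob):
    """Inverse of dump_commands: split the blob back into command records."""
    commands = []
    offset = 0
    while offset < len(blob):
        (length,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        commands.append(blob[offset:offset + length])
        offset += length
    return commands


def first_divergent_draw(commands, render_reference, render_candidate, fb_size):
    """Replay command by command; return the index where the outputs first differ."""
    ref_fb = bytearray(fb_size)
    cand_fb = bytearray(fb_size)
    for index, command in enumerate(commands):
        render_reference(command, ref_fb)
        render_candidate(command, cand_fb)
        if ref_fb != cand_fb:
            return index
    return None
```

Because the dump is replayed offline, a bug can be reproduced without re-running the game up to the point where it appears.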

Compatibility

The compatibility is much improved and should be quite high. I won’t claim it’s perfect, but I’m quite happy with it so far. We went through essentially all relevant titles during testing (just the first few minutes), and found and fixed the few issues which popped up. Many games which were completely broken in the old implementation now work just fine. If further bugs do show up, I’m fairly confident they are solvable this time around.

Implementation techniques

With Vulkan in 2020 I have some more tools in my belt than were available back in the day. Vulkan is a quite capable compute API now.

Enforcing RDRAM coherency

A major pain point of any N64 emulator is the fact that RDRAM is shared for the CPU and RDP, and games sure know how to take advantage of this. This creates a huge burden on GPU-accelerated implementations as we now have to ensure full coherency to make it accurate. Most HLE emulators simply don’t care or employ complicated heuristics and workarounds, and that’s fine, but it’s not good enough for LLE.

The previous implementation tried to use “framebuffer manager” techniques similar to HLE emulators, but this was the wrong approach and led to a design which was impossible to fix. What if … we just import RDRAM as a buffer straight into the Vulkan driver and render to that, wouldn’t that be awesome? Yes … yes, it would be, and that’s what I did. We have an obscure, but amazing extension in Vulkan called VK_EXT_external_memory_host which lets me import RDRAM from the emulator straight into Vulkan and render to it over the PCI-e bus. That way, all framebuffer management woes simply disappear, I render straight into RDRAM, and the only thing left to do is to handle synchronization. If you’re worried about rendering over the PCI-e bus, then don’t be. The bandwidth required to write out a 320×240 framebuffer is absolutely trivial especially considering that we’re doing …

Tile-based rendering

The last implementation was tile-based as well, but the design is much improved. This time around all tile binning is done entirely on the GPU in parallel, using techniques I implemented in https://github.com/Themaister/RetroWarp, which was the precursor project for this new paraLLEl-RDP. Using tile-based rendering, it does not really matter that we’re effectively rendering over the PCI-e bus as tile-based rendering is extremely good at minimizing external memory bandwidth. Of course, for iGPU, there is no (?) external PCI-e bus to fight with to begin with, so that’s nice!
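
For readers unfamiliar with tile binning, the idea can be sketched serially on the CPU. This is only a conceptual illustration: paraLLEl-RDP does the binning in parallel in compute shaders, and the function name, the bounding-box input and the tile size of 8 pixels are assumptions made for this example.

```python
def bin_primitives(bounding_boxes, width, height, tile_size=8):
    """Map each tile (tx, ty) to the indices of primitives that may touch it.

    Each bounding box is (x0, y0, x1, y1) in inclusive pixel coordinates.
    Indices are appended in submission order, so every tile can later
    process its primitives in the order they were issued.
    """
    bins = {}
    for index, (x0, y0, x1, y1) in enumerate(bounding_boxes):
        # Clamp the box to the framebuffer.
        x0, y0 = max(x0, 0), max(y0, 0)
        x1, y1 = min(x1, width - 1), min(y1, height - 1)
        if x0 > x1 or y0 > y1:
            continue  # entirely off-screen, touches no tile
        for ty in range(y0 // tile_size, y1 // tile_size + 1):
            for tx in range(x0 // tile_size, x1 // tile_size + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins
```

Once primitives are binned, each tile can be shaded using only its own short list, and its pixels need to be written out to memory just once.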

Ubershaders with asynchronous pipeline optimization

The entire renderer is split into a very small selection of Vulkan GLSL shaders which are precompiled into SPIR-V. This time, I take full advantage of Vulkan specialization constants which allow me to fine-tune the shader for specific RDP state. This turned out to be an absolutely massive win for performance. To avoid the dreaded shader compilation stutter, I can always fall back to a generic ubershader while the specialized pipeline is being compiled. The ubershader is slow, but it works for any combination of state. This is a very similar idea to what Dolphin pioneered for emulation a few years ago.
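
The fallback logic can be sketched without any Vulkan at all. In this illustration the class name PipelineCache, the compile callback and the use of a Python thread pool are all assumptions; it only shows the control flow of "use the generic version until the specialised one is ready".

```python
from concurrent.futures import ThreadPoolExecutor


class PipelineCache:
    """Hand out a specialised pipeline once ready, else the generic fallback."""

    def __init__(self, compile_specialised, ubershader):
        self._compile = compile_specialised  # slow: state_key -> pipeline
        self._ubershader = ubershader        # valid for any state, slower to run
        self._pending = {}                   # state_key -> Future
        self._executor = ThreadPoolExecutor(max_workers=1)

    def get(self, state_key):
        future = self._pending.get(state_key)
        if future is None:
            # First time this state is seen: compile in the background.
            self._pending[state_key] = self._executor.submit(
                self._compile, state_key)
            return self._ubershader
        if future.done():
            return future.result()
        return self._ubershader  # still compiling, render without stalling

    def wait_idle(self):
        """Block until every queued compile has finished (handy in tests)."""
        for future in list(self._pending.values()):
            future.result()
```

The render loop never waits on a compile: the first frames that use a new state combination run through the slower generic path, and later frames pick up the specialised pipeline automatically.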

8/16-bit integer support

Memory accesses in the RDP are often 8 or 16 bits, and thus it is absolutely critical that we make use of 8/16-bit storage features to interact directly with RDRAM, and if the GPU supports it, we can make use of 8 and 16-bit arithmetic as well for good measure.

Async compute

Async compute is critical as well, since we can make the async compute queue high priority and ensure that RDP shading work happens with very low latency, while VI filtering and frontend shaders can happily chug along in the fragment/graphics queue. Both AMD and NVIDIA now have competent implementations here.

GPU-driven TMEM management

A big mistake I made previously was doing TMEM management on the CPU timeline. This all came crashing down once we needed framebuffer effects. To avoid this, all TMEM uploads are now driven by the GPU. This is probably the hairiest part of paraLLEl-RDP by far, but I have quite a lot of gnarly tests to test all the relevant corner cases. There are some truly insane edge cases that I cannot handle yet, but the results created would be completely meaningless to any actual content.

Performance

To talk about FPS figures it’s important to consider the three major performance hogs in a low-level N64 emulator, the VR4300 CPU, the RSP and finally the RDP. Emulating the RSP in an LLE fashion is still somewhat taxing, even with a dynarec (paraLLEl-RSP) and even if I make the RDP infinitely fast, there is an upper bound to how fast we can make the emulator run as the CPU and RSP are still completely single threaded affairs. Do keep that in mind. Still, even with multithreaded Angrylion, the RDP represents a quite healthy chunk of overhead that we can almost entirely remove with a GPU implementation.

GPU bound performance

It’s useful to look at what performance we’re getting if emulation was no constraint at all. By adding PARALLEL_RDP_BENCH=1 to environment variables, I can look at how much time is spent on GPU rendering.

Playing on a GTX 1660 Ti outside the castle in Mario 64:

[INFO]: Timestamp tag report: render-pass
[INFO]: 0.196 ms / frame context
[INFO]: 0.500 iterations / frame context

We’re talking ~0.2ms on GPU to render one frame on average, hello theoretical 5000 VI/s … Somewhat smaller frame times can be observed on my Radeon 5700 XT, but we’re getting frame rates so ridiculously high they become meaningless here. We’ve tested it on quite old cards as well and the difference in FPS between even something ancient like an R9 290x card and a 2080 Ti is minimal, since the time now spent in RDP rendering is completely irrelevant compared to CPU + RSP workloads. We seem to be getting about a 50-100% uplift in FPS, which represents the shaved away overhead that the CPU renderer had. Hello 300+ VI/s!
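For the curious, the arithmetic behind that figure is simple enough to script. The sketch below parses the report format shown above and converts the frame time into a theoretical rate, treating one frame context as one frame; the helper names are made up for this example.

```python
import re

LOG = """\
[INFO]: Timestamp tag report: render-pass
[INFO]: 0.196 ms / frame context
[INFO]: 0.500 iterations / frame context
"""


def gpu_ms_per_frame(log_text):
    """Extract the 'ms / frame context' figure from a benchmark report."""
    match = re.search(r"([0-9.]+) ms / frame context", log_text)
    if match is None:
        raise ValueError("no frame time found in log")
    return float(match.group(1))


def theoretical_rate(ms_per_frame):
    """Frames per second if GPU time were the only cost."""
    return 1000.0 / ms_per_frame


ms = gpu_ms_per_frame(LOG)
rate = theoretical_rate(ms)
print(f"{ms} ms per frame -> about {rate:.0f} frames per second on the GPU alone")
```

In practice the CPU and RSP emulation dominate, so this number is an upper bound on the RDP side only, not something you will see on screen.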

Unfortunately, Intel iGPU does not fare as well, with an overhead high enough that it does not generally beat multithreaded Angrylion running on CPU. I was somewhat disappointed by this, but I have not gone into any real shader optimization work. My early analysis suggests extremely poor occupancy and a ton of register spilling. I want to create a benchmark tool at some point to help drill into these issues.

It would be interesting to test on the AMD APUs, but none of us have the hardware handy sadly 🙁

Synchronous vs Asynchronous RDP

There are two modes for the RDP. In async mode, the emulation thread does not wait for the GPU to complete rendering. This improves performance, at the cost of accuracy. Many games unfortunately really rely on the unified memory architecture of the N64. The default option is sync, and should be used unless you have a real need for speed, or the game in question does not need sync.

Here we see an example of broken blob shadows caused by async RDP in Jet Force Gemini. This happens because the CPU is actually reading the shadowmap rendered by the RDP, and blurring it on the CPU timeline (why on earth the game would do that is another question), then reuploading it to the RDP. These kinds of effects require very tight sync between CPU and GPU and come up in many games. The N64 is particularly notorious for these kinds of rendering challenges.

Of course, given how fast the GPU implementation is on discrete GPUs, sync mode does not really pose an issue. Do note that since we’re using async compute queues here, we are not stalling on frontend shading or anything like that. The typical stall time on the CPU is on the order of 1 ms per frame, which is very acceptable. That includes the render thread doing its thing, submitting that to GPU, getting it executed and coming back to CPU, which has some extra overhead.
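The difference between the two modes can be shown with a deliberately tiny Python model. Nothing here is real emulator code: `ToyRDP` just queues writes and applies them on `flush`, which stands in for the GPU finishing its work.

```python
class ToyRDP:
    """Toy model: rendering commands are queued and only touch RDRAM when flushed."""

    def __init__(self, rdram, synchronous):
        self.rdram = rdram
        self.synchronous = synchronous
        self._queue = []

    def submit(self, address, value):
        self._queue.append((address, value))
        if self.synchronous:
            self.flush()  # the emulation thread waits for the GPU here

    def flush(self):
        for address, value in self._queue:
            self.rdram[address] = value
        self._queue.clear()


def cpu_reads_shadowmap(synchronous):
    rdram = bytearray(4)
    rdp = ToyRDP(rdram, synchronous)
    rdp.submit(0, 0xFF)        # RDP "renders" a shadowmap texel
    seen_by_cpu = rdram[0]     # game code reads it back right away
    rdp.flush()                # GPU work eventually completes either way
    return seen_by_cpu


sync_value = cpu_reads_shadowmap(synchronous=True)
async_value = cpu_reads_shadowmap(synchronous=False)
```

In sync mode the CPU sees the freshly rendered value; in async mode it reads stale memory because the GPU result has not landed yet. That is the same class of problem as the broken blob shadows described above.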

Road-map for future improvement

I believe this is solid enough for a first release, but there are further avenues for improvement.

Figure out poor performance on Intel iGPU

There is something going on here that we should be able to improve.

Implement a workaround for implementations without VK_EXT_external_memory_host (EDIT: Now implemented as of 2020-05-18)

Unfortunately there is one particular driver on desktop which doesn’t support this, and that’s NVIDIA on Linux (Windows has been supported since 2018 …). Hopefully this gets implemented soon, but we will need a fallback. This will get ugly since we’ll need to start shuffling memory back and forth between RDRAM and a GPU buffer. Hopefully the async transfer queue can help make this less painful. It might also open up some opportunities for mobile drivers, which also don’t implement this extension as we speak. There might also be incentives to rewrite some fundamental assumptions in the N64 emulator plugin specifications (can we please get rid of this crap …). If we can let the GPU backend allocate memory, we don’t need any fancy extension, but that means uprooting 20 years of assumptions and poking into the abyss … Perhaps a new implementation can break new ground here (hi @ares_emu!).

EDIT: This is now done! Takes a 5-10% performance hit in sync mode, but the workaround works quite well. A fine blend of masked SIMD moves, a writemask buffer, and atomics …
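The post does not go into detail on the workaround, so treat the following as a guess at the general shape of a writemask-based merge rather than a description of the real code: the GPU renders into its own copy of memory, marks which bytes it touched, and only those bytes are copied back into RDRAM. The function and buffer names are hypothetical, and the real thing uses SIMD and atomics rather than a Python loop.

```python
def merge_gpu_writes(rdram, gpu_buffer, writemask):
    """Copy back only the bytes the GPU marked as written, then clear the mask.

    All three arguments are equally sized bytearrays; writemask holds 0 or 1 per byte.
    """
    if not (len(rdram) == len(gpu_buffer) == len(writemask)):
        raise ValueError("buffers must have the same size")
    for i, written in enumerate(writemask):
        if written:
            rdram[i] = gpu_buffer[i]
            writemask[i] = 0


rdram = bytearray([1, 2, 3, 4])        # CPU-visible memory
gpu_buffer = bytearray([9, 9, 9, 9])   # GPU-side copy after rendering
writemask = bytearray([0, 1, 1, 0])    # GPU only touched bytes 1 and 2
merge_gpu_writes(rdram, gpu_buffer, writemask)
```

The useful property is that bytes the GPU never wrote are left alone, so anything the CPU stored there in the meantime is not clobbered by the copy.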

Internal upscaling?

It is rather counter-intuitive to do upscaling in an LLE emulator, but it might yield some very interesting results. Given how obscenely fast the discrete GPUs are at this task, we should be able to do a 2x or maybe even 4x upscale at way-faster-than-realtime speeds. It would be interesting to explore if this lets us avoid the worst artifacts commonly associated with upscaling in HLE.

Fancier deinterlacer?

Some N64 content runs at 480i, and we can probably spare some GPU cycles running a fancier deinterlacer 😉

Esoteric use cases?

PS1 wobbly polygon rendering has seen some kind of resurgence in recent years in the indie scene, perhaps we’ll see the same for the fuzzy N64 look eventually. With paraLLEl-RDP, it should be possible to build a rendering engine around an N64-lookalike game. That would be cool to see.

Conclusion

This is a somewhat esoteric implementation, but I hope I’ve inspired more implementations like this. Compute-based accurate renderers will hopefully spread to more systems that have difficulties with accurate rendering. I think it’s a very interesting topic, and it’s a fun take on emulation that is not well explored in general.

paraLLEl-RDP rewritten from scratch – available in paraLLEl n64 right now for RetroArch



The ParaLLEl N64 Libretro core has received an update today that adds the brand new paraLLEl-RDP Vulkan renderer to the emulator core.

I implore everybody to read Themaister’s blog post (Reviving and rewriting paraLLEl-RDP – Fast and accurate low-level N64 RDP emulation) for a deep dive into this new renderer.

Requirements

  • You need a graphics card that supports the Vulkan graphics API.
  • It’s currently only available on Windows and Linux.
  • Right now the renderer requires a specific Vulkan extension, called ‘VK_EXT_external_memory_host’. Currently, only Nvidia’s binary Linux drivers for Vulkan lack support for this extension. Support has been requested, but there is no ETA yet on when they will implement it.

What’s new since the old ParaLLEl RDP?

  • Completely rewritten from the ground up
  • Bit-exact renderer
  • Should be pretty much on par with Angrylion accuracy-wise now – none of the issues that plagued the old paraLLEl RDP
  • Now emulates the VI (Video Interface) as well
  • Basic deinterlacing for interlaced video modes

How to install and set it up

  • In RetroArch, go to Online Updater.
  • (If you have paraLLEl N64 already installed) – Select ‘Update Installed Cores’. This will update all the cores that you already installed.
  • (If you don’t have paraLLEl N64 installed already) – go to ‘Core Updater’, and select ‘Nintendo – Nintendo 64 (paraLLEl N64)’.
  • Now start up a game with this core.
  • Go to the Quick Menu and go to ‘Options’. Scroll down the list until you reach ‘GFX Plugin’. Set this to ‘parallel’. Set ‘RSP plugin’ to ‘parallel’ as well.
  • For the changes to take effect, we now need to restart the core. You can either close the game or quit RetroArch and start the game up again.

Progress and development in N64 emulation over the past decade

State of HLE emulation

IMHO, this release today represents one of the biggest steps taken so far to elevate Nintendo 64 emulation as a whole. N64 emulation has had a bad rep for decades because of HLE RDP renderers that fail to reproduce every game’s graphics correctly and because of tons of unemulated RSP microcode, but it has gotten significantly better over the years. On the HLE front, things have progressed: GLideN64 has made big strides in emulating most of the major games, and the HLE RSP implementation used by Mupen 64 Plus is starting to emulate most of the major microcodes that developers made for N64 games. There are also obvious limiting factors on the HLE front. For instance, GLideN64 still requires OpenGL, and renderers for Vulkan and other modern graphics APIs have not been implemented as of this date (although they could be).

State of LLE emulation

So that’s the HLE front. But for the purpose of this blog article, we are mostly concerned here about Low-Level Emulation. Both HLE and LLE N64 emulation are valid approaches, but if we want to reproduce the N64 accurately, we ultimately have to go LLE. So, what is the state of LLE emulation?

For LLE emulation, one of the advancements over the past few years has been a multithreaded version of Angrylion. Angrylion is the most accurate software RDP renderer to date. Its main problem has always been how slow it is. Up until say the mid to late ’10s, desktop PCs just did not have the CPU power to run any game at full speed with this renderer. Multithreaded Angrylion has made big gains in the performance department that were previously thought unimaginable.

However, Angrylion as a software renderer can only be taken so far. The fact remains that it is a big bottleneck on the CPU, and you can easily see CPU activity exceeding 65% on a modern rig with the multithreaded Angrylion renderer. Software rendering is just never going to be a particularly fast way of doing 3D rasterization.

So, back in 2016, the first attempt at making a hardware renderer that can compete with Angrylion was made. It was a big release for us and it marked one of the first pieces of software to be released that was designed exclusively around the then-new Vulkan graphics API. You can read our old blog post here.

It was a valiant first attempt at making a speedy Angrylion port to hardware. Unfortunately, this first version was full of bugs, and it had some big architectural issues that just made further development on it very hard. So it didn’t see much further development for the past few years.

This year, all the stars have aligned. First out of the gates was the resurrection of paraLLEl-RSP, another project by Themaister. Low-level N64 emulation places a big demand on the CPU, and while cxd4’s RSP interpreter is very accurate, to get at least a 2x leap in performance, a dynamic recompiler approach has to be taken. To that end, this year not only was paraLLEl-RSP resurrected, but we moved the dynamic recompiler architecture from LLVM to Lightrec. It’s a bit less performant than LLVM to be sure but it also has some big advantages – LLVM runtime libraries are very hard to embed and integrate for various platforms, while Lightrec doesn’t have these dependency issues. Furthermore, LLVM would take a long time recompiling code blocks, and it would cause big stutters during gameplay (for instance, bringing up the map in Doom 64 for the first time would cause like a 5-second freeze in the gameplay while it was recompiling a code block – obviously not ideal). With Lightrec, all those stutters were more or less gone.

So, Q1 2020. We now have multithreaded Angrylion which leverages the multi-core CPUs of today’s hardware to get better performance results. We have ParaLLEl RSP, a low-level RSP plugin with a dynamic recompiler that gives us a big bump in performance. But one piece of the puzzle is still missing, and it’s perhaps the most significant. Multithreaded Angrylion still is a software renderer and therefore it still massively bottlenecks the CPU. Whether you can spread that load out over multiple cores or not ultimately matters little – CPUs just are not good at doing fast 3D rasterization, a lesson learned by nearly every mid ’90s PC game developer, and why 3D accelerated hardware could not have come soon enough.

So, the obvious Next Big Thing in N64 emulation was to get rid of this CPU bottleneck and move Angrylion kicking and screaming to the GPU, and this time avoid all of the issues that plagued the initial paraLLEl RDP prototype.

Where does that leave us?

With a very accurate Angrylion-quality LLE RDP renderer running on the GPU, and a dynarec LLE RSP core, you will be surprised at how accurate Mupen 64 Plus is now. Nearly every commercial game runs now as expected with nearly no graphical issues, the sound is as you’d expect it to be, it looks, runs and functions just like a real N64. And if you’re on a discrete Nvidia or AMD GPU, your GPU activity will be 4% on average, whether it’s a stone-age GPU from the year 2013 like an AMD R9 290x, or an Nvidia Geforce 2080 Ti. Nearly any discrete GPU made from 2013 to 2020 that supports the Vulkan API seems to eat low-level N64 graphics for breakfast. CPU activity also has decreased significantly. With multithreaded Angrylion and Parallel RSP, there would be about 68% CPU activity on my rig. This is brought down to just 7 to 10% using paraLLEl RDP instead of Angrylion. Software rendering on the CPU is just a huge bottleneck no matter which way you slice it.

So for most practical purposes, using the paraLLEl RDP and paraLLEl RSP cores in tandem, the future is now. Accurate N64 emulation is here, it’s no longer slow, and it’s no longer completely CPU bound either. And you can play it on RetroArch right now, right today. We don’t have to wait for a near-accurate representation of an N64, it’s already here with us for all practical gameplay purposes.

How much faster is paraLLEl RDP compared to Angrylion? That is hard to say, and depends on the game you’re running. On average you can expect a 2x speedup. However, notice that at native resolution rendering, any discrete GPU since 2013 eats this workload for breakfast. This means you’re completely CPU bound in terms of performance most of the time. The better your CPU is at single threaded workloads (IPC), the better it will perform. Core count is a less significant factor. I think on my specific rig, it was my CPU that was the weakest link in the chain (a 7700k i7 Intel CPU paired with a 2080 Ti). The GPU matters relatively little, the 2080 Ti was mostly being completely idle during these tests. For that matter, so was an old 2013 AMD card that I would test with the same CPU – GPU activity remained flat at around 4%. As Themaister has indicated in his blog post, this leaves so much room for upscaled resolutions, which is on the roadmap for future versions.

Benchmarks

System specs: CPU – Intel Core i7 7700k | GPU – Geforce RTX 2080 Ti (11GB VRAM, 2018) | 16GB RAM

Title | Angrylion | ParaLLEl RDP (Synchronous) | ParaLLEl RDP (Asynchronous)
007 GoldenEye | 82fps | 119fps | 133fps
Banjo Tooie | 72fps | 132fps | 148fps
Doom 64 | 174fps | 282fps | 322fps
F-Zero X | 158fps | 370fps | 478fps
Hexen | 156fps | 300fps | 360fps
Indiana Jones and the Infernal Machine | 61fps | 94fps | 114fps
Killer Instinct Gold | ~103fps | ~168fps | ~240fps
Legend of Zelda: Majora’s Mask | 122fps | 202fps | 220fps
Mario Kart 64 | ~178fps | ~309fps | ~330-350fps
Perfect Dark (High-res) | 70fps | 125fps | 130fps
Pilotwings 64 | 87fps | 125fps | 144fps
Quake | 188fps | 262fps | 300fps
Resident Evil 2 | 183fps | 226fps | 383fps (*)
Star Wars Episode I: Battle for Naboo | 90fps | 136fps | 178fps
Super Mario 64 | 129fps | 204fps | 220fps
Vigilante 8 (Low-res) | 63fps | 91fps | 112fps
Vigilante 8 (High-res) | ~46-55fps | ~92-99fps | ~119fps
World Driver Championship | ~109fps | ~225fps | ~257fps

* – Has game breaking issues in this mode
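If you want to turn these numbers into speedup factors, a few lines of Python will do. The sketch below uses four rows copied from the RTX 2080 Ti table above; the `speedups` helper is just something made up for this example.

```python
# A few rows copied from the RTX 2080 Ti table: (Angrylion, sync, async) in fps.
RESULTS = {
    "007 GoldenEye": (82, 119, 133),
    "Doom 64": (174, 282, 322),
    "F-Zero X": (158, 370, 478),
    "Super Mario 64": (129, 204, 220),
}


def speedups(angrylion, sync, async_):
    """Return (sync speedup, async speedup) relative to Angrylion."""
    return sync / angrylion, async_ / angrylion


for title, fps in RESULTS.items():
    s, a = speedups(*fps)
    print(f"{title}: {s:.2f}x synchronous, {a:.2f}x asynchronous")
```

The factor varies quite a bit from game to game, which fits the earlier remark that the gain depends on the game you are running.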

System specs: CPU – Intel Core i7 7700k | GPU – AMD Radeon R9 290x (4GB VRAM, 2013) | 16GB RAM

Title | Angrylion | ParaLLEl RDP (Synchronous) | ParaLLEl RDP (Asynchronous)
007 GoldenEye | 82fps | 119fps | 133fps
Banjo Tooie | 72fps | 132fps | 148fps
Doom 64 | 174fps | 282fps | 322fps
F-Zero X | 158fps | 360fps | 439fps
Hexen | 156fps | 288fps | 352fps
Indiana Jones and the Infernal Machine | 61fps | 94fps | 114fps
Killer Instinct Gold | ~93fps | ~162fps | ~239fps
Legend of Zelda: Majora’s Mask | 122fps | 202fps | 220fps
Mario Kart 64 | ~157fps | ~274fps | ~292fps
Perfect Dark (High-res) | 70fps | 125fps | 130fps
Pilotwings 64 | 87fps | 125fps | 144fps
Quake | 189fps | 262fps | 326fps
Resident Evil 2 | 156fps | 226fps | 383fps (*)
Star Wars Episode I: Battle for Naboo | 90fps | 136fps | 178fps
Super Mario 64 | 129fps | 195fps | 209fps
Vigilante 8 (Low-res) | 63fps | 91fps | 112fps
Vigilante 8 (High-res) | ~46-55fps | ~92-99fps | ~119fps
World Driver Championship | ~109fps | ~224fps | ~257fps

* – Has game breaking issues in this mode

Core option explanations


paraLLEl RDP has some special dedicated options. You can change these by going to Quick Menu and going to Options. Here’s a quick breakdown of what they do –

ParaLLEl Synchronous RDP:

Turning this off allows for higher CPU/GPU parallelism. However, certain games might produce problems if it is left disabled. An example of such a game is Resident Evil 2.

It has been verified that with the vast majority of games, disabling this can provide for at least a +10fps speedup. Usually the performance difference is much higher though. Try experimenting with it. If you experience no game breaking bugs or visual anomalies, it’s safe to disable this for the game you’re running and enjoy higher performance.

Video Interface Options
ParaLLEl-RDP emulates the N64 RDP’s VI module. This applies plenty of postprocessing to the final output image to further smooth out the picture. Some of the options below allow you to enable/disable some of these VI settings on the fly. Toggling them can be useful if you want to use several frontend shaders on top, since disabling some of these postprocessing effects can result in a radically different output image.

(ParaLLEl-RDP) VI Interlacing: Disabling this will disable the VI serration bits used for interlaced video modes. Turning this off essentially looks like basic bob deinterlacing, so the picture might become shaky as a result.

(ParaLLEl-RDP) VI Gamma Filter: Disabling this will disable the hardware gamma filter that some games use.

(ParaLLEl-RDP) VI Divot filter: Disabling this will disable the median filter which is intended to clean up some glitched pixels coming out of the RDP. The difference in output is subtle, but it usually seems to apply to shadow blob decals.

(ParaLLEl-RDP) VI AA: Disabling this will disable Anti-Aliasing.

(ParaLLEl-RDP) VI Dither Filter: The VI’s dither filter is used to make color banding less apparent with 16-bit pixels.

(ParaLLEl-RDP) VI Bilinear: VI bilinear is the internal upscaler in the VI. Disabling this is usually a good idea, since it’s typically used to upscale horizontally.

By disabling VI AA and enabling VI Bilinear, the picture output looks just like Angrylion’s “Unfiltered” mode currently does.

FAQ

Will this renderer be ported to OpenGL?

Here is the short answer – no. Not by us, at least. Reasons: OpenGL is an outdated API compared to Vulkan and does not support the features required by ParaLLEl-RDP. GL does not support 8/16-bit storage, external memory host, or async compute. Even if someone were able to make it work, it would only work on the very best GL implementations, where Vulkan is supported anyway, rendering it mostly moot.

Ports to DirectX 12 are similarly not going to be considered by us, others can feel free to do so. One word of warning – even DirectX12 (yes, even Ultimate) is found lacking when it comes to providing the graphics techniques that ParaLLEl RDP is built around. Whoever will take on the endeavor to port this to DX12 or GL 4.5/4.6 will have their work cut out for them.

RetroArch 1.8.6 released!


RetroArch 1.8.6 has just been released.

Grab it here.

We will release a Cores Progress report soon going over all the core changes that have happened since the last report. It’s an exhaustive list, and especially the older consoles will receive a lot of new cores and improvements.

Remember that this project exists for the benefit of our users, and that we wouldn’t keep doing this were it not for spreading the love with our users. This project exists because of your support and your belief in us to keep doing great things. If you’d like to show your support, consider donating to us. Check here in order to learn more. In addition to being able to support us on Patreon, there is now also the option to sponsor us on Github Sponsors! You can also help us out by buying some of our merch on our Teespring store!

Highlights

There are many things this release post will not touch upon, such as all the extra cores that have been added to the various console platforms. We’ll spend some more time on that in a future Cores Progress Report post. We’ll go over some of the other highlights instead.

PSL1GHT PlayStation3 port

A new port of RetroArch to the PSL1GHT toolchain has been made for PlayStation3.

Right now there are no automated nightly builds for this, but you can download our experimental stable for it instead.

Working:

  • packaging
  • running cores
  • switching cores
  • gamepad including axis
  • RGUI menu driver
  • audio
  • video
  • cores: 2048, ecwolf, freechaf

Not working:

  • OSD
  • Menus other than RGUI
  • Shaders
  • Graphical acceleration
  • Proper signing
  • ODE build
  • Rumble
  • mouse

iOS/tvOS – Fix audio getting cut off on interruption

While using RetroArch, if you played back audio content (such as via the Control Center) or were interrupted by a phone call, the audio in RetroArch would stop entirely.

The audio session category is now set to “ambient” so that you can play back other audio sources and have sounds in RA at the same time.

Also, took out the bit to save the config when the app loses focus – it became too much of a distraction (the notification is distracting – this was not working previously anyway).

OpenGL Core – Slang shader improvements

Before, the OpenGL Core shader driver did not correctly initialise loaded textures. The texture filtering and wrap mode are forced on texture creation, but these settings were not recorded – subsequent updates would set garbage values, that then resolved to linear filtering OFF and wrap mode = CLAMP_TO_EDGE.

The wrap mode seemed to work regardless – perhaps once this is set the first time, it cannot change? (I don’t understand the inner workings of OpenGL…) But the texture filtering was certainly wrong. For example, this is what a background image with linear filtering enabled looks like:

…what you actually get is nearest neighbour.

This PR fixes texture initialisation so the filtering and wrap mode are recorded correctly. A linear filtered background image now looks like this:

Only write config files to disk when parameters change

We’ve been looking at ways to reduce disk I/O overhead, since it tends to be a big bottleneck on slower platforms.

Before, RetroArch would continuously overwrite its configuration files:

  • retroarch.cfg is written every time content is closed, and when closing RetroArch itself
  • Core options are written every time content is closed

This represents a large amount of unnecessary disk access, which is quite slow (and also causes wear on solid state drives!)

With 1.8.6, configuration files are only written to disk when their contents actually change.

All types of configuration file should now be ‘well behaved’ – with the exception of cheat files. These are still overwritten when closing content, since reusing old parameters may cause issues (and since I don’t use cheats at all, I didn’t feel confident enough to dabble with this)
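The general technique is a plain "dirty flag". Here is a minimal Python sketch of it; this is not RetroArch's actual C code, and the `ConfigFile` class, its method names and the file name are invented for the example.

```python
import os
import tempfile


class ConfigFile:
    """Key/value config that is only rewritten when a value actually changed."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        self.modified = False
        self.writes = 0  # counts real disk writes, for demonstration

    def set(self, key, value):
        if key not in self.values or self.values[key] != value:
            self.values[key] = value
            self.modified = True

    def save(self):
        if not self.modified:
            return False  # nothing changed: skip the disk write
        with open(self.path, "w", encoding="utf-8") as handle:
            for key in sorted(self.values):
                handle.write(f'{key} = "{self.values[key]}"\n')
        self.modified = False
        self.writes += 1
        return True


with tempfile.TemporaryDirectory() as tmp:
    cfg = ConfigFile(os.path.join(tmp, "example.cfg"))
    cfg.set("video_vsync", "true")
    cfg.save()                      # written: a value changed
    cfg.set("video_vsync", "true")  # same value again
    cfg.save()                      # skipped
    total_writes = cfg.writes
```

Setting a parameter to the value it already has does not mark the file as modified, so closing content without touching any settings costs no disk access at all.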

While making these changes, we also discovered and fixed a number of bugs:

  • RetroArch no longer crashes when attempting to save a config file after ‘unsetting’ a parameter (currently, this can be triggered quite easily by manipulating input remaps)
  • When using Material UI, RetroArch no longer modifies the wrong setting (or segfaults…) when tapping entries in the Quick Menu > Controls input remapping submenu
  • Quite a few real and potential memory leaks have been fixed.

Playlist compression

There’s a new Compress playlists option under Settings > Playlists. When enabled, playlists are stored in an archived format (using the new rzip_stream interface).

The obvious benefit is that playlist file size is reduced by ~90%, with a corresponding reduction in disk wear on solid state drives (playlists are rewritten to disk quite frequently!).

Given the small size of playlist files, these savings aren’t hugely significant – but of more interest is the fact that on one of our development machines (Linux + mechanical HDD), loading a compressed playlist takes ~20% less time than an uncompressed one (despite the extra zlib overheads). This produces noticeably smoother scrolling when switching playlists in XMB. This improvement is most likely platform-dependent, but on devices where storage speed is a real issue (e.g. 3DS, UWP) the difference in playlist loading times should be quite pronounced.
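Playlists compress well because they are highly repetitive text. A quick way to see this for yourself is the Python sketch below, which uses the standard `zlib` module as a stand-in for RetroArch's rzip_stream interface; the playlist content is made up for the example.

```python
import json
import zlib

# Hypothetical playlist content; paths and labels are illustrative only.
playlist = {
    "items": [
        {"path": f"/roms/n64/game_{i:03d}.z64", "label": f"Game {i:03d}"}
        for i in range(200)
    ]
}

plain = json.dumps(playlist, indent=2).encode("utf-8")
packed = zlib.compress(plain)
restored = json.loads(zlib.decompress(packed).decode("utf-8"))

print(f"plain: {len(plain)} bytes, compressed: {len(packed)} bytes")
```

The exact ratio depends on the content, so do not read too much into the numbers this toy prints. The round trip is lossless either way.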

We’ve also fixed some small playlist-related bugs/issues:

  • When saving playlists using the old format, default core association is now written correctly (not sure when this regression happened…)
  • When saving playlists using the old format, per-playlist sort mode is now recorded (I miscounted the number of available metadata ‘slots’ in the old format files – there was in fact just enough room for this one extra setting)
  • Whenever a playlist is cached by the menu (i.e. when a playlist is opened for display), RetroArch will check the format of the playlist (old/new) and its compression state – if either differs from the current user-set values, the file will be updated. This ensures playlists remain in sync with menu settings. (Previously, toggling the ‘use old format’ setting would do nothing unless the playlist was subsequently modified – this has long been an annoyance for me, since it meant ‘fully populated’ playlists languished in whatever state they were originally created)

It goes without saying that RetroArch will automatically detect whether or not a playlist is compressed and handle it appropriately.

If a playlist has been compressed and a user subsequently wants to edit it by hand, they can simply toggle Compress playlists off and then view the playlist via the menu – it will automatically be decompressed to plain text/JSON.

In addition to this, since human readability is not a factor when compressing playlists, we now omit all whitespace (newlines/indentation) when writing compressed JSON.

This reduces performance overheads when reading compressed JSON playlists by ~16% (!)
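Dropping the whitespace is a one-argument change in most JSON writers. As an illustration in Python (RetroArch itself is written in C, so this only shows the idea), with a made-up playlist entry:

```python
import json

entry = {"path": "/roms/example.bin", "label": "Example"}  # illustrative entry

pretty = json.dumps({"items": [entry]}, indent=2)
compact = json.dumps({"items": [entry]}, separators=(",", ":"))

# Both forms decode to the same data; only the whitespace differs.
same = json.loads(pretty) == json.loads(compact)
```

The compact form has fewer bytes for the parser to skip over, which is presumably where the read-time saving comes from.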

SRAM Compression

This is a minor follow-up to PR #10454. It adds a new SaveRAM Compression option under Settings > Saving. When enabled, SRAM save files are written to disk as compressed archives.

While SRAM saves are generally quite small, this can still yield a not insignificant space saving on storage-starved devices (e.g. the SNES/NES Classic consoles). Moreover, it reduces wear on solid state drives when SaveRAM Autosave Interval is set (in the worst case, this can write a couple of MB to disk per minute – vs. a few kB when compression is enabled).

Actual compression ratios will vary greatly depending upon core and loaded content. Here are a few examples of SRAM save sizes for random cores/games:

Core | Uncompressed | Compressed
Gambatte | 32 kB | 178 B
Genesis Plus GX | 32 kB | 83 B
mGBA | 64 kB | 1.1 kB
Mupen64Plus-Next OpenGL | 290 kB | 736 B
PCSX-ReARMed | 128 kB | 605 B
Snes9x | 8.0 kB | 183 B

In many cases, the actual on-disk save size can be reduced to almost nothing.

Notes:

  • As with save states, RetroArch will automatically detect whether SRAM saves are compressed and handle them appropriately (SaveRAM Compression can be toggled at any time).
  • This only works with cores that use the libretro SRAM interface for saving games. Many (most?) do, but there are some exceptions – e.g. Flycast writes save files directly, and so too does Beetle PSX depending on core settings.

Savestate compression

There’s a new Savestate Compression option under Settings > Saving. When enabled, save state files are written to disk as compressed archives. This both saves a substantial amount of disk space and reduces wear on solid state drives.

Actual compression ratios will vary depending upon core and loaded content. Here are a few examples of save state sizes for random cores/games:

Core | Compression OFF | Compression ON
Beetle PSX HW | 16 MB | 1.5 MB
Flycast | 27 MB | 8.9 MB
Genesis Plus GX | 1012 kB | 47 kB
mGBA | 453 kB | 45 kB
Mupen64Plus-Next OpenGL | 17 MB | 1.5 MB
PPSSPP | 40 MB | 9.3 MB
PCSX-ReARMed | 4.3 MB | 2.3 MB
PUAE | 11 MB | 793 kB
Snes9x | 421 kB | 82 kB

Notes:

  • RetroArch will automatically detect whether state files are compressed or not, and load them appropriately – i.e. Savestate Compression can be toggled at any time, and everything will Just Work (TM)
  • We now have a new file stream for reading/writing archived data: rzip_stream. This can be used to handle any compressed data writing tasks we might have in the future
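One common way to get this kind of automatic detection is to prefix compressed files with a recognisable marker and check for it on load. The Python sketch below shows that pattern only; the marker bytes are hypothetical and are not the real rzip_stream header, and `zlib` is used as a stand-in compressor.

```python
import zlib

MAGIC = b"#EXAMPLE#"  # hypothetical marker, not the real rzip_stream header


def write_data(data, compress):
    """Return the bytes that would be stored on disk."""
    if compress:
        return MAGIC + zlib.compress(data)
    return data


def read_data(stored):
    """Detect the marker and transparently decompress if it is present."""
    if stored.startswith(MAGIC):
        return zlib.decompress(stored[len(MAGIC):])
    return stored


state = b"\x00" * 4096 + b"save data"
```

With this layout the reader does not need to know how the file was written, which is what lets the option be toggled at any time. A raw file that happened to start with the marker bytes would be misdetected, so a real format needs a marker that cannot plausibly occur in uncompressed data.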

(Manual content scanner/playlist cleaner) Prevent redundant playlist entries when handling M3U content

Before, when the manual content scanner was used to scan content that includes M3U files, redundant playlist entries were created. For example, content like this:

  • Panzer Dragoon Saga CD1 (Saturn) (U).cue
  • Panzer Dragoon Saga CD2 (Saturn) (U).cue
  • Panzer Dragoon Saga CD3 (Saturn) (U).cue
  • Panzer Dragoon Saga CD4 (Saturn) (U).cue
  • Panzer Dragoon Saga (Saturn) (U).m3u

(where the .m3u references all the .cue files) would generate playlist entries for both the .m3u file and each of the .cue files. This is annoying, since the latter are pointless, and must be removed manually by the user.

1.8.6 adds M3U ‘awareness’ to the manual content scanner. Now whenever M3U files are encountered, they are parsed, and anything they reference internally is removed/omitted from the output playlist.
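In outline, the filtering works like the Python sketch below. It is a simplification and not the actual scanner code: it compares plain file names and ignores relative path resolution, and both helper names are made up.

```python
def parse_m3u(text):
    """Return the file names listed in an M3U file, skipping blanks and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def drop_m3u_members(scanned, m3u_contents):
    """Remove every scanned file that is referenced by one of the M3U files.

    scanned: list of file names found by the scan.
    m3u_contents: dict mapping an .m3u file name to its text.
    """
    referenced = set()
    for text in m3u_contents.values():
        referenced.update(parse_m3u(text))
    return [name for name in scanned if name not in referenced]


scanned = [
    "Panzer Dragoon Saga CD1 (Saturn) (U).cue",
    "Panzer Dragoon Saga CD2 (Saturn) (U).cue",
    "Panzer Dragoon Saga (Saturn) (U).m3u",
]
m3u = {
    "Panzer Dragoon Saga (Saturn) (U).m3u":
        "Panzer Dragoon Saga CD1 (Saturn) (U).cue\n"
        "Panzer Dragoon Saga CD2 (Saturn) (U).cue\n",
}
playlist = drop_m3u_members(scanned, m3u)
```

Only the .m3u entry survives, which is the behaviour described above.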

This functionality has also been added to the Playlist Management Clean Playlist task, so these redundant entries can be removed easily from existing playlists.

(Side note: 1.8.6 also adds a simple but feature complete M3U handling library – this may have additional use if someone wants to add the ability to generate M3U files for existing content…)

Improved handling of ‘broken’ playlists

RetroArch previously would fall apart when handling ‘broken’ playlists – i.e. when playlist entries have missing or invalid path/core path/core name fields. 1.8.6 should fix the most significant issues:

  • RetroArch will no longer segfault when attempting to run content via a playlist entry with missing path or core path fields.
  • When a playlist entry has either core path and/or core name set to NULL, DETECT or an empty string, attempting to load content will fall back to the normal ‘core selection’ code (currently this happens only if both core path and core name are DETECT – this is wholly inadequate!)
  • RetroArch will no longer segfault when attempting to fetch content runtime information when core path is NULL
  • Core name + runtime info will only be displayed on playlists and in the Information submenu if both the core path and core name fields are ‘valid’ (i.e. not NULL or DETECT)
  • When handling entries with missing path fields, the menu sorting order now matches that of the playlist sorting order (at present, everything goes out of sync when paths are empty). Moreover, entries with missing path fields can now be ‘selected’, so users can remove them (currently, hitting A on such an entry immediately tries – and fails – to load the content, so the only way to remove the broken entry is via the Playlist Management > Clean Playlist feature)
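
The rules above boil down to a small decision procedure. Here is a hedged sketch in Python; the dictionary keys and return values are hypothetical and only mirror the behaviour described in the list, not RetroArch's internal structures.

```python
def is_valid_core_field(value) -> bool:
    """A core path/name is usable only if it is not None, empty or 'DETECT'."""
    return value not in (None, "", "DETECT")


def resolve_launch(entry: dict) -> str:
    """Decide what to do when the user activates a playlist entry."""
    if not entry.get("path"):
        # Nothing to load: the entry can only be selected (and then removed).
        return "select-only"
    if is_valid_core_field(entry.get("core_path")) and \
       is_valid_core_field(entry.get("core_name")):
        return "load-with-core"
    # Either field is missing/DETECT: fall back to normal core selection.
    return "core-selection"


def show_runtime_info(entry: dict) -> bool:
    """Core name + runtime info are shown only when both core fields are valid."""
    return is_valid_core_field(entry.get("core_path")) and \
        is_valid_core_field(entry.get("core_name"))
```

The point of the sketch is that every branch has a defined outcome, so a malformed entry can no longer reach code that dereferences a missing field.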

(Playlist Management) Add optional per-playlist alphabetical sorting

At present, RetroArch offers a global Sort playlists alphabetically option – but several users have requested more fine-grained control. For example, users with highly customised setups might want a number of ‘hand-crafted’ playlists with specific ordering (release date, games in a particular series, etc.) without losing the ability to automatically sort their other conventional platform-based playlists.

1.8.6 adds a new Sorting Method option to the Playlist Management interface. This allows the sorting method to be overridden on a per-playlist basis. Available values are System Default (reflects Sort playlists alphabetically setting), Alphabetical and None.

Notes:

  • Content history playlists are excluded – they are never sorted (this has always been the case!)
  • This option is only available when using the ‘new’ format playlists (i.e. Save playlists using old format = OFF). There’s just not enough room in the old-style playlists for additional metadata. Since pretty much everyone uses the new format (by default), I don’t think this is an issue.
  • 1.8.6 also tweaks the way that the displayed menu entries are handled – previously, it would go as follows:

    Sort playlist
    Loop through playlist and generate menu entries
    Sort menu entries

…not only did this duplicate effort, but it meant there was a chance of the playlist and menu going out of sync – especially when using the Label Display Mode feature, which could lead to a different alphabetical ordering when processing the generated menu entries. As of 1.8.6, only the playlist is ever sorted, and menu entries are listed in exactly the same order.
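
To make the new flow concrete, here is a minimal sketch in Python. The enum and function names are hypothetical, and the case-insensitive sort key is an assumption; the sketch only shows how the per-playlist Sorting Method resolves against the global setting, and that the playlist is sorted once with the menu entries following its order.

```python
from enum import Enum


class SortMode(Enum):
    SYSTEM_DEFAULT = "System Default"
    ALPHABETICAL = "Alphabetical"
    NONE = "None"


def should_sort(mode: SortMode, global_sort: bool, is_history: bool = False) -> bool:
    """Resolve the per-playlist override against the global setting."""
    if is_history:
        return False  # content history playlists are never sorted
    if mode is SortMode.SYSTEM_DEFAULT:
        return global_sort
    return mode is SortMode.ALPHABETICAL


def build_menu(labels, mode, global_sort, is_history=False):
    """Sort the playlist once, then list menu entries in exactly that order."""
    playlist = list(labels)
    if should_sort(mode, global_sort, is_history):
        playlist.sort(key=str.casefold)  # assumed: case-insensitive ordering
    return playlist  # menu entries are never sorted a second time
```

For example, `build_menu(["b", "A", "c"], SortMode.NONE, True)` leaves a hand-crafted order untouched even though the global option is enabled.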

Ozone

Before, Ozone could display either one thumbnail + content metadata or two thumbnails (with content metadata fallback when one image is missing) for each playlist entry.

With 1.8.6, if two thumbnails are enabled then the user can toggle between the second thumbnail and content metadata by pressing RetroPad ‘select’. When metadata is shown in this way, an image icon is displayed to indicate that a second thumbnail is available. The toggle may also be performed with a mouse/touchscreen by clicking/tapping the thumbnail sidebar.

Ozone menu – Mouse/Touch input fixes

  • Pointer input is now correctly disabled when message boxes are displayed
  • It turns out that Windows reports negative pointer coordinates when the mouse cursor goes beyond the left hand edge of the RetroArch window (this doesn’t happen on Linux, so I never encountered this issue before!). As a result, if Ozone is currently not showing the sidebar (menu depth > 1), moving the cursor off the left edge of the window generates a false positive ‘cursor in sidebar’ event – which breaks menu navigation, as described in #10419. With this PR, we now handle ‘cursor in sidebar’ status correctly in all cases.
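
A hit test that is robust against negative coordinates might look like the sketch below. This is an illustration in Python with hypothetical names, not the code from the PR: it simply refuses to report a sidebar hit when no sidebar is shown, and checks both bounds of the sidebar's horizontal extent.

```python
def cursor_in_sidebar(x: int, sidebar_visible: bool, sidebar_width: int) -> bool:
    """True only when a sidebar is shown and x lies within [0, sidebar_width)."""
    if not sidebar_visible:
        return False
    # Checking the lower bound matters: Windows may report x < 0 when the
    # cursor leaves the left-hand edge of the window.
    return 0 <= x < sidebar_width
```

A test that only checks `x < sidebar_width` would treat every negative coordinate as ‘inside’, which is the kind of false positive described above.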

(RGUI) Enable automatic menu size reduction when running at low resolutions (down to 256×192)

Before, on all platforms other than the Wii/NGC, RGUI had a fixed frame buffer size of [320-426]x240 (width takes one of three values depending upon current menu aspect ratio).

In most cases this is fine, with an important exception: when running content at its native resolution (usually when connected to a CRT), the display size is often smaller than 320×240. For example, SNES titles run at 256×224; Master System titles at 256×192. In these cases, RGUI gets ‘squished’ – there are not enough scanlines on the screen, so rows of menu pixels get dropped (or blurred together if bilinear filtering is enabled). This makes the menu difficult to read/use.

This PR modifies RGUI such that its frame buffer dimensions are automatically reduced when running at low resolutions. The minimum nominal menu size is 256×192, which should enable content for almost all TV-connected consoles to be run at native resolution while maintaining pixel perfect menu scaling.
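
The sizing rule can be approximated with a simple clamp. The sketch below is Python, and the function and its exact logic are assumptions; only the 240-line nominal height, the 320 to 426 width range and the 256×192 minimum come from the text.

```python
MIN_W, MIN_H = 256, 192   # minimum nominal menu size
NOMINAL_H = 240           # standard RGUI frame buffer height


def menu_size(display_w: int, display_h: int, nominal_w: int = 320) -> tuple:
    """Shrink the nominal menu to fit the display, never below 256x192.

    nominal_w is the width chosen for the current menu aspect ratio
    (somewhere in the 320-426 range).
    """
    w = max(MIN_W, min(nominal_w, display_w))
    h = max(MIN_H, min(NOMINAL_H, display_h))
    return w, h


if __name__ == "__main__":
    print(menu_size(256, 224))   # SNES native resolution
    print(menu_size(256, 192))   # Master System native resolution
    print(menu_size(640, 480))   # large display: nominal size is kept
```

Displays smaller than the minimum (a 160×144 handheld screen, say) still get a 256×192 menu under this rule, so the menu is scaled down and no longer pixel perfect.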

(Unfortunately, going any smaller than this breaks RGUI – so for handheld systems it’s still best to run at higher resolutions with a shader or video filter)

While implementing this, we also narrowed down the detection of when the aspect ratio lock should be disabled: currently, RGUI’s aspect ratio lock ‘turns off’ when accessing the video settings menu – this now only happens when accessing the video scaling submenu, since this is the only section that can cause conflicts with the aspect lock method. (Note that the old behaviour is maintained for the Wii port, because it has special requirements relating to resolution changes)

Menu – widget and font improvements

  • The font ascender/descender metrics are now used to achieve ‘pixel perfect’ vertical text alignment
  • Message queue text now uses its own dedicated font. Previously, a single (larger) font was used for all active widgets, and this was scaled down for message queue items. This ‘squished’ the text a little; more importantly, when using the stb font renderers (on Android, etc.) it caused ugly artefacts around the edges of glyphs due to pixel interpolation errors. Now that a correctly sized font is used, the message queue is always rendered cleanly.
  • Previously, each widget font was ‘flushed’ (font_driver_flush()) at least once a frame. This is quite a slow operation. Now we only flush fonts if they have actually been used.
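
The ‘flush only when used’ optimisation is a plain dirty-flag pattern. Here is a minimal sketch in Python; the `WidgetFont` class is a hypothetical stand-in, and the counter merely represents the cost of a real font_driver_flush() call.

```python
class WidgetFont:
    """Hypothetical stand-in for a widget font with a batched draw queue."""

    def __init__(self, name: str):
        self.name = name
        self.used = False      # set whenever text is queued during a frame
        self.flush_count = 0   # counts the (expensive) flush operations

    def draw_text(self, text: str) -> None:
        self.used = True       # queueing text marks the font as dirty

    def flush_if_used(self) -> None:
        if self.used:
            self.flush_count += 1   # stands in for font_driver_flush()
            self.used = False


def end_frame(fonts) -> None:
    """Called once per frame: flush only the fonts that drew something."""
    for font in fonts:
        font.flush_if_used()
```

With a regular font and a message queue font, a frame that only draws a notification costs one flush instead of two.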

Content scanner was unable to identify games from CHD images on Android builds

The content scanner was unable to identify games from CHD images on Android builds (same files that are being properly identified on Windows builds).

It was discovered that both the extracted magic number and CRC hash differed between the two builds. This should now be resolved.

Changelog

What you’ve read above is just a small sampling of what 1.8.6 has to offer. There might be things that we forgot to list in the changelog below, but here it is for your perusal regardless.

1.8.6

  • 3DS: Add IDs for UZEM, TGB Dual, and NeoCD
  • 3DS: Fix font driver horizontal text alignment
  • 3DS: Allow button presses up to INPUT_MAX_USERS – this enables the 3DS to bind and use buttons and axis for users up to the maximum set by ‘Max Users’ in the input settings menu.
  • 3DS: Disable video filter if upscaled resolution exceeds hardware limits. The 3DS has a maximum video buffer size of 2048×2048. This is sufficient for every core that it supports, but when using software video filters the core output resolution is doubled. This is made worse by the fact that the video filter upscaling buffer size is dependent upon the maximum output resolution of the core – which in some cases is very large indeed (e.g. pcsx-rearmed sets a maximum width of 1024, for enhanced resolution support). The 3DS has very limited ‘linear memory’ for graphics buffer purposes, and a large base core buffer + video filter buffer can easily exceed this – which may also disable video output, or cause a crash. This PR very simply adds a 3DS-specific check to the video filter initialisation: if the resultant upscaling buffer exceeds the hardware limitation, then the filter is automatically disabled.
  • 3DS/FONT/BUGFIX: Text colour was wrong: the RGBA channels were muddled, and R was always set to 255
  • 3DS/FONT/BUGFIX: When drawing multiline strings, the line spacing was completely incorrect
  • 3DS/FONT: Improves the appearance of the drop shadow effect on notification text.
  • 3DS/ARCHIVE/7Z: Re-enable 7zip support.
  • ARCHIVE/ZIP: Expand functionality of ‘rzip_stream’ interface. This PR expands the functionality of the new rzip_stream archived stream interface such that it now has almost complete feature parity with the standard file_stream interface, and can therefore be used as a drop-in replacement in most situations
  • AI SERVICE: Hide redundant entries when service is disabled
  • AI SERVICE: Added in auto-translate support
  • AI SERVICE: support for NVDA and SAPI narration
  • AUTOCONFIG: Use correct port index in input device configured/disconnected notifications
  • BUGFIX: Fix race condition where task could momentarily not be in the queue when reordering
  • CHEEVOS/BUGFIX: Prevent null reference rendering achievement list while closing application
  • CHEEVOS/BUGFIX: Report non-memorymap GBA cores as unsupported
  • COMMANDLINE: Advise against using -s and -S variables on the command line.
  • CONFIG FILE: Only write config files to disk when parameters change
  • CONFIG FILE/BUGFIX: RetroArch no longer crashes when attempting to save a config file after ‘unsetting’ a parameter (currently, this can be triggered quite easily by manipulating input remaps)
  • CONFIG FILE/BUGFIX: When using Material UI, RetroArch no longer modifies the wrong setting (or segfaults…) when tapping entries in the Quick Menu > Controls input remapping submenu
  • CONFIG FILE/BUGFIX: Quite a few real and potential memory leaks have been fixed.
  • CHD: Fixes a crash caused by ignoring the return value from one of the CHD library functions
  • FASTFORWARDING: A new Mute When Fast-Forwarding option has been added under Settings > Audio. When enabled, users can fast forward without having to listen to distorted audio.
  • GLCORE/SLANG: Set filter and wrap mode correctly when initialising shader textures. Before, the glcore shader driver did not correctly initialise loaded textures. The texture filtering and wrap mode were forced on texture creation, but these settings were not recorded – subsequent updates would set garbage values, that would resolve to linear filtering OFF and wrap mode = CLAMP_TO_EDGE.
  • LOCALIZATION: Update Japanese translation
  • LOCALIZATION: Update Spanish translation
  • LOCALIZATION: Update Portuguese Brazilian translation
  • IOS: Set audio session category to ambient so sound does not get cut off on interruption (phone call/playing back audio)
  • MAC/IOHIDMANAGER/BUGFIX: Fix for Mayflash N64 adapter, in case the last hatswitch does not match the cookie. With the Mayflash N64 adapter, I was getting a BAD EXC ADDRESS (on macOS 10.13) for this line (tmp was NULL). RetroArch would crash in the GUI if I pressed a D-pad button on controller 2. With this change, it no longer crashes in the GUI and still registers the button push.
  • MAC/COCOA: Fix mouse input – this brings back two lines of code that have been removed over time but appear to be required in order for mouse input to work on macOS
  • METAL/BUGFIX: GPU capture on Metal/OSX/NVidia could crash
  • METAL/BUGFIX: Taking screenshots could capture black frames. Resulting PNG screenshots were black.
  • METAL/BUGFIX: Corrupted image due to incorrect viewport copy when taking screenshot
  • MENU: Prevent font-related segfaults when using extremely small scales/window sizes
  • MENU: Fix ‘gfx_display_draw_texture_slice()’
  • MENU/FONT: Enable correct vertical alignment of text (+ font rendering fixes)
  • MENU/RGUI: Enable automatic menu size reduction when running at low resolutions (down to 256×192)
  • MENU/OZONE: Update timedate style options for Last Played sublabel metadata
  • MENU/OZONE: Hide ‘Menu Color Theme’ setting when ‘Use preferred system color theme’ is enabled
  • MENU/OZONE: Fix thumbnail switching via ‘scan’ button functionality
  • MENU/OZONE: Prevent glitches when rendering Ozone’s selection cursor
  • MENU/OZONE: Enable proper vertical text alignment + thumbnail display improvements
  • MENU/OZONE: Enable second thumbnail/content metadata toggle using RetroPad ‘select’
  • MENU/OZONE: Refactor footer display
  • MENU/OZONE: Hide thumbnail button hints when viewing file browser lists
  • MENU/OZONE/INPUT/BUGFIX: Fix undefined behaviour when using touch screen to change input remaps
  • MENU/OZONE/INPUT/BUGFIX: It turns out that Windows reports negative pointer coordinates when the mouse cursor goes beyond the left hand edge of the RetroArch window (this doesn’t happen on Linux, so I never encountered this issue before!). As a result, if Ozone is currently not showing the sidebar (menu depth > 1), moving the cursor off the left edge of the window generates a false positive ‘cursor in sidebar’ event – which breaks menu navigation, as described in #10419. With this PR, we now handle ‘cursor in sidebar’ status correctly in all cases
  • MENU/OZONE/INPUT/BUGFIX: Pointer input is now correctly disabled when message boxes are displayed
  • MENU/XMB: Fix thumbnail switching via ‘scan’ button functionality
  • ODROID GO ADVANCE: Add DRM HW context driver
  • PSL1GHT: Initial port
  • PSL1GHT/KEYBOARD: Implement PSL1GHT keyboard
  • PLAYLIST/BUGFIX: Improve handling of ‘broken’ playlists – RetroArch will no longer segfault when attempting to run content via a playlist entry with missing path or core path fields.
  • PLAYLIST/BUGFIX: Improve handling of ‘broken’ playlists – when a playlist entry has either core path and/or core name set to NULL, DETECT or an empty string, attempting to load content will fall back to the normal ‘core selection’ code (currently this happens only if both core path and core name are DETECT – this is wholly inadequate!)
  • PLAYLIST/BUGFIX: RetroArch will no longer segfault when attempting to fetch content runtime information when core path is NULL
  • PLAYLIST/BUGFIX: Core name + runtime info will only be displayed on playlists and in the Information submenu if both the core path and core name fields are ‘valid’ (i.e. not NULL or DETECT)
  • PLAYLIST/BUGFIX: When handling entries with missing path fields, the menu sorting order now matches that of the playlist sorting order (at present, everything goes out of sync when paths are empty). Moreover, entries with missing path fields can now be ‘selected’, so users can remove them (currently, hitting A on such an entry immediately tries – and fails – to load the content, so the only way to remove the broken entry is via the Playlist Management > Clean Playlist feature)
  • PLAYLIST: Add optional per-playlist alphabetical sorting
  • PLAYLIST: Omit whitespace when writing compressed JSON format playlists
  • PLAYLIST: Add optional playlist compression
  • QNX: Support analog sticks
  • SAVESTATES: Add optional save state compression (enabled by default now)
  • SRAM: Add optional save (SRAM) file compression
  • SCANNER: Prevent redundant playlist entries when handling M3U content
  • SCANNER/ANDROID: Fix content scanner being unable to identify certain games from CHD images (raw data sector/subcode)
  • TASKS/BUGFIX: Fix task deadlocks
  • TASKS/SCREENSHOT/BUGFIX: Fix heap-use-after-free error when widgets are disabled
  • TVOS: Disable overlays for tvOS, fix app icon
  • VIDEO/WIDGETS/BUGFIX: The font ascender/descender metrics added in #10375 are now used to achieve ‘pixel perfect’ vertical text alignment
  • VIDEO/WIDGETS/BUGFIX: Message queue text now uses its own dedicated font. Previously, a single (larger) font was used for all active widgets, and this was scaled down for message queue items. This ‘squished’ the text a little; more importantly, when using the stb font renderers (on Android, etc.) it caused ugly artefacts around the edges of glyphs due to pixel interpolation errors. Now that a correctly sized font is used, the message queue is always rendered cleanly.
  • VIDEO/WIDGETS/BUGFIX: Previously, each widget font was ‘flushed’ (font_driver_flush()) at least once a frame. This is quite a slow operation. Now we only flush fonts if they have actually been used.
  • VULKAN/BUGFIX: Fix display of statistics text
  • UNIX/BUGFIX: Fix overflow when computing total memory on i386
  • WIIU/BUGFIX: Fix font driver horizontal text alignment
  • WIIU/BUGFIX: Fix non-vertex coordinates in draws using tex shader
  • WIIU/BUGFIX: Update and fix meta.xml file for the WiiU release. This change makes it so the information from the meta.xml file parsed for the WiiU’s Homebrew Launcher is displayed properly.

Kronos 2.1.2 progress report (Sega Saturn emulator)

It has been some time since the last report, let’s try to go a bit more in-depth this time.

The OpenGL CS video renderer

The Saturn is a beast. It features 8 processors, among them are 2 custom graphics processors called VDP1 and VDP2. The VDP2 handled backgrounds, while the VDP1 handled sprites, textures and polygons.

The VDP1 rendered “quads” line by line: the general idea was to interpolate endpoints along the horizontal edges, then to draw textured lines between those endpoints. It had to draw the lines with an extra pixel where the slope changed, so all of the pixels had a neighbor to the left, right, top, or bottom. This was done to prevent gaps between the lines.
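
To give a feel for this technique, here is a small CPU-side sketch in Python. It is not Kronos code and it ignores texturing. It assumes the quad is given as four corners A, B, C, D and that the endpoints are walked along the A→D and B→C edges; the function names are made up for the example.

```python
def line_4connected(x0, y0, x1, y1):
    """Bresenham line that inserts an extra pixel on every diagonal step,
    so each pixel shares an edge (not just a corner) with the previous one."""
    pixels = [(x0, y0)]
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 >= x0 else -1
    sy = 1 if y1 >= y0 else -1
    err = dx - dy
    x, y = x0, y0
    while (x, y) != (x1, y1):
        e2 = 2 * err
        step_x = e2 > -dy
        step_y = e2 < dx
        if step_x:
            err -= dy
            x += sx
        if step_x and step_y:
            pixels.append((x, y))   # extra pixel: split the diagonal step in two
        if step_y:
            err += dx
            y += sy
        pixels.append((x, y))
    return pixels


def rasterize_quad(a, b, c, d):
    """Return the set of pixels covered by quad A-B-C-D, drawn line by line."""
    steps = max(abs(d[0] - a[0]), abs(d[1] - a[1]),
                abs(c[0] - b[0]), abs(c[1] - b[1]))
    covered = set()
    for i in range(steps + 1):
        t = i / steps if steps else 0.0
        # Interpolate one endpoint on each of the two walked edges.
        px = round(a[0] + (d[0] - a[0]) * t)
        py = round(a[1] + (d[1] - a[1]) * t)
        qx = round(b[0] + (c[0] - b[0]) * t)
        qy = round(b[1] + (c[1] - b[1]) * t)
        covered.update(line_4connected(px, py, qx, qy))
    return covered
```

A plain Bresenham line moves diagonally in a single step, and two adjacent diagonal lines can then leave a one-pixel hole between them. The extra pixel closes that hole, at the cost of the slightly ‘dotted’ edges visible on real hardware.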

A modern graphics API like OpenGL doesn’t know how to do that, because its rendering pipeline is based on triangle geometry, so basically it can’t reproduce VDP1 behavior. There are tricks like tessellation, but ultimately they are just workarounds for specific issues and not all-in-one solutions for this. Here is some good news though : OpenGL 4.3 introduced a new feature called compute shaders. You might have heard about it through Flycast’s order independent transparency, or N64’s parallel. This new component gives lots of flexibility to OpenGL, and allows the implementation of routines to render quads line by line. That is what this renderer is about : reproducing VDP1 behavior accurately.

Let’s do a comparison. From first to last, these images were shot from console, Mednafen/beetle, Kronos (OpenGL CS renderer) and Kronos (the old OpenGL renderer, based on YabaSanshiro’s). There are 2 noticeable things related to this VDP1 behavior in them :

  • border of the road : on console, Mednafen and Kronos’s new renderer, if you zoom in, you’ll notice it’s not a smooth line, there are dots, this is the accurate behavior; the last screen, while the smooth line might look better, is actually inaccurate.
  • holes everywhere : if you zoom in on the last screenshot, you’ll notice some holes here and there, on the top of the hills, on the road in the back, those holes don’t exist on the other screenshots.

It’s possible to work around those holes with the OpenGL renderer, but at the end of the day you end up creating other issues in the process. Until recently we used such a workaround but, in the case of Sega Rally, it was magnifying the dots on the border of the road.

The only known downside of this new renderer is that it will require a fairly good GPU!

ST-V support was improved

While still a bit preliminary, some major rework was done recently on ST-V support :

  • You can now set your favorite bios region (NB : it will be ignored if the game doesn’t support that region though)
  • The EEPROM is now properly saved and loaded
  • ROM loading mechanism was fixed, there should be no more messages of the ST-V bios telling you there is something wrong with the game you are trying to launch
  • Lots of input issues, going from the lack of kick harness (used for 5th & 6th buttons on some games) to the inputs not responding at all, were fixed

Improvements on the Libretro port

There were some long-term issues with the Libretro implementation, but a lot of improvements were done about them :

  • Resolution switching, which is something that happens every few seconds on Saturn, was somehow wrong; one of the worst side effects was artifacts especially visible in “mesh” (if you don’t use the “improved mesh” core option). It was fixed
  • Toggling between fullscreen and windowed was causing issues ranging from glitches to crashes; it has been mostly fixed
  • While the Saturn framerate should be 50 or 60 fps depending on the region, sometimes it’s not rendering anything because the Saturn is actually shutting down its video output. Kronos tries to behave accurately here too, which is a bit of a headache for the Libretro ecosystem, which expects a more linear framerate. A better way of handling this was implemented.

Also, here is a summary of this core’s options :

  • Force HLE BIOS : it will ignore your bios file and use the old HLE bios from yabause instead, this function is unmaintained and is mainly there for debugging purpose (there is at least one known case where it’s unlocking the game though : Astal, for some reason the real bios is shutting down the video output), don’t report issues if you enabled this option.
  • Video format : will force format to PAL or NTSC, default is auto
  • Frameskip : will skip rendering at a fixed rate, it can improve playability dramatically on lower end devices
  • SH-2 cpu core : default is “kronos”, our cross-platform cached interpreter, the other one is the unmaintained yabause SH-2 interpreter, for which we have the same policy as for the HLE bios.
  • OpenGL version : this option was introduced as a workaround for setups giving false positives when asking if a specific OpenGL version was supported (it happened…), set this to the highest version your gpu supports.
  • Video renderer : to enable the new renderer, default is the old one for compatibility reasons
  • Share saves with beetle : will share save paths with beetle-saturn, allowing you to use the same savefiles.
  • Addon cartridge : to change cartridge, default is auto, it is recommended to keep the default unless you intend to play Heart of Darkness, a prototype requiring the 16M extended RAM.
  • 6Player Adaptor on Port 1 : self explanatory
  • 6Player Adaptor on Port 2 : same, one word of warning though, enabling the second multitap is known for causing a weird autofire behavior.
  • Internal Resolution : self explanatory
  • Polygon Mode : works with the default OpenGL renderer, used to fix wobbling textures issues, OpenGL CS doesn’t need this, default is cpu tessellation but gpu tessellation is recommended if your gpu supports it (OpenGL 4.2), perspective correction is more cpu friendly but heavily glitched.
  • Improved mesh : will replace fake transparency (mesh) by real transparency, default is disabled
  • RBG Compute shaders : will use compute shaders to rotate background, it is recommended if your gpu supports it, default is disabled
  • Wireframe mode : self-explanatory ? It works only with OpenGL CS, mostly for debugging but can be a fun feature, give it a try for curiosity !
  • ST-V Service/Test Buttons : enable buttons to access the service menu in ST-V games, default is disabled to avoid accidental presses
  • ST-V Favorite Region : select your region for ST-V, default is EU for censorship and language reasons.

On a sidenote, lots of other things were fixed/improved since my last report, but nothing seemed major so we decided to skip them. If you want to know more about this emulator, you can check the youtube channel, or join us on discord.

Libretro Cores Progress Report – April 2, 2020

Our last core progress report was on February 29, 2020. Below we detail the most significant changes to all the Libretro cores we and/or upstream partners maintain. We are listing changes that have happened since then.

How to update your cores in RetroArch

There are two ways to update your cores:

a – If you have already installed the core before, you can go to Online Updater and select ‘Update Installed Cores’.

b – If you haven’t installed the core yet, go to Online Updater, ‘Core Updater’, and select the core from the list that you want to install.

Final Burn Neo

Description: Multi-system arcade emulator

  • Latest updates from upstream

blueMSX

Description: Home computer MSX emulator

  • Fix non-smooth scrolling in PAL 50Hz
  • Buildfix for libnx (Switch)
  • Buildfix for 3DS

Beetle PSX

Description: Sony PlayStation emulator

  • Added “fast PAL” hack to allow PAL games to play at NTSC framerates
  • Added Force NTSC aspect ratio
  • Vulkan: Disable adaptive smoothing by default

    This should be disabled by default like the other Vulkan-exclusive
    enhancements so as to better match stock settings

  • Hide scanline core options based on content region
  • Refactor memory card core options logic

    Get rid of confusing check_variables() memcard startup logic and
    corresponding redundant variables, and update core option
    labels/sublabels to match actual core functionality.

  • Implement aspect ratio core option (psx.correct_aspect equivalent)

    Beetle PSX implementation of “psx.correct_aspect” introduced in Mednafen
    1.24.0-UNSTABLE (no relevant code backported from upstream).
    Additionally fixes aspect ratio scaling issues when cropping overscan or
    adjusting visible scanlines. “Force 4:3” is left as a legacy option for
    users preferring the old inaccurate behavior.

  • Add option for setting core-reported FPS timing
  • WIP: increase RAM to 8MB instead of the default 2
  • Improve internal FPS detection

Vitaquake 2

Description: Quake 2 game engine core

Vitaquake 2 is now available for the first time on Emscripten.

Hatari

Description: Atari ST/STE/TT/Falcon emulator

  • Port: Ported Hatari to PS Vita

Atari 800

Description:

  • Port: Ported Atari 800 to 3DS

Dosbox core

Description: MS-DOS home computer emulator

  • Latest updates from Github

Dosbox SVN

Description: MS-DOS home computer emulator

  • Latest updates from Github
  • Make 16MB RAM default, change default cycle mode to “fixed” and “10000”

    Max and auto modes are broken on some systems.

LRMAME

Description: Multi-system arcade emulator

  • Updated to version 0.219

ECWolf

Description: Wolfenstein 3D game engine core

  • Latest updates (TODO/FIXME)

Flycast

Description: Sega Dreamcast emulator

  • fix alignment issues reported by ubsan on x64
  • Fix chd lzma and zlib buffers alignment
  • Fix rec/x64 block check alignment
  • Fix ChannelEx struct alignment
  • nvmem: generate console ID at startup. rec-x64: Call stack alignment

    Generate console ID in dc_nvmem.bin if blank. Used by chuchu rocket
    login.
    Align stack to 16-byte when calling function from x64 rec

  • (NAOMI) add sfz3ugd button labels
  • (NAOMI) Alien Front Naomi needs DIV matching disabled
  • (NAOMI) VMU support (vonot, sf3zu). Fix otrigger inputs.
  • input: only use R2/L2 for trigger input even with digital triggers
  • renderer: generate mipmaps for custom textures
  • custom texture: stop loader thread before loading state
  • renderer: decrease MipmapD bias – fixes street lights in Sonic Adventure 1
  • gdrom: don’t resume CDDA if not playing. stop if cur > end – implement ATA_IDENTIFY
  • Protect RAM and VRAM when VMEM is disabled
  • (Switch) Initial Port
  • ta: defer index building and strip merging, filter out infinite vertices
  • pvr: reserve more opaque polys. Don’t crash on TA overrun
  • vmem: unprotect vram when releasing memory if NO_VMEM
  • (Switch) Iterate each Page for Permission set
  • Use -O2 for YUV_Block8x8 due to UB
  • pvr: don’t reset tile clipping value on each frame – Fixes Irides – master of blocks
  • support multi-session cue/bin. mipmap D-adjust only to increase LoD
  • limit maple schedule time
  • allow VRAM 8-bit reads
  • gl: use common ReadFramebuffer() func
  • sort triangles even with 1 polygon – fixes missing Naomi boot logo and vtennis2 black frame during replay
  • fix crash when TR poly count is 0

ChaiLove

  • Port: Ported ChaiLove core to 3DS
  • Port: Ported ChaiLove core to Android

HBMAME

Description: Emulator of homebrew and hacked games for arcade hardware

  • New core

VICE

Description: Commodore 64 home computer emulator

  • Split “Paddles” joyport type to first two RetroPads:
    Player1 vertical axis = Player2 horizontal axis
    Player1 2nd button = Player2 1st button
    Add speed modifier hotkeys (slower+faster) for paddles/mouse

    Because “Paddles” is in fact 2 controllers in one joyport, and currently it is read like a mouse with 2 axis and 2 buttons, this is not convenient for 2 player games, like Panic Analogue, which use paddles as 2 separate entities with one axis and one button.

  • Fixes for JiffyDOS, Disk Control & Statusbar –

    To evade JiffyDOS incompatibilities with CRTs, PRGs & TAPs, the allowance method is changed from whitelist to blacklist.
    Also, M3U playlists of D64 images are allowed, and playlists of TAP images are not
    Fixed not being able to insert disks at all when starting without content
    Drive type defaults to 1541, as in inserting D81s will not work for now, because drive type autodetection happens only on autostart
    Finetuned statusbar

  • Turbofire & JiffyDOS fix –
    Minor fixes:

    Turbofire pulse was off (value 2 was actually 4)
    No reason to allow JiffyDOS core option with anything other than D64 & D81, or is there..?

  • Remove Nuklear GUI, Add VKBD touch control –
    Replaced bloaty Nuklear with the lightweight VKBD from PUAE

    No drawbacks, only benefits: Touch control, better performance, simpler maintenance

  • Port: Fixed VICE core Android build
  • Port: Ported VICE core to 3DS
  • Port: Ported VICE core to Emscripten
  • Add support for disk control interface v1 (disk display labels)
  • x64: Exclude vicii-clock-stretch.c – vicii-clock-stretch.c is not really used on x64, it’s only for x128
  • Disable cpmcart on x128 –
    Both x128 and cpmcart have z80 cpu and both have z80_regs symbols.
    On platforms other than emscripten those symbols end up being aliased
    due to “-fcommon” behaviour. This would lead to very weird results if they
    would ever be used together.

    On real hw cpmcart is unnecessary due to integrated CP/M mode

    In the emulator cpmcart is runtime-enable only on x64 and x64sc but
    the relevant code is still compiled-in.

    So just remove cpmcart.c and #ifndef to avoid references

  • Core option for disabling autostart joined with autostart warp
  • Statusbar improvements, VKBD transparency core option

GME

Description: Game Music Emulator core

  • Port: Ported GME core to PSL1GHT (PS3)
  • Port: Ported GME core to 3DS

prBoom

Description: Doom 1/2 game engine core

  • retro_run: Don’t attempt to run doom loop after exit – This fixes crash on exit on 3DS
  • Port: Added PSL1GHT port (PS3)

MelonDS

Description: Nintendo DS emulator

  • (Switch) Latest updates

P-UAE

Description: Commodore Amiga emulator

  • VKBD updates, CD turbo speed backport
  • WHDLoad changes (button overrides)
  • VKBD touch control –
    Only tested with Windows and mouse, since RETRO_DEVICE_POINTER also reacts with it. Hence also disabled real mouse control while VKBD is visible.
  • Port: Ported P-UAE to PSVita
  • Minor save state improvements
  • Extended ZIP support

ScummVM

  • Update to ScummVM 2.1.1
  • Allow launching games directly from game files

Mr. Boom

Description:

  • Port: Ported Mr. Boom to 3DS
  • Port: Ported Mr. Boom to Emscripten
  • Port: Ported Mr. Boom to PSP
  • Port: Ported Mr. Boom to PS Vita
  • Port: Ported Mr. Boom to Apple tvOS
  • Fix unaligned casts

FCEUmm

Description: NES emulator core

  • Fix being unable to load some UNIF carts
  • M274 update
  • Add 42-in-80000 multicart (m380)
  • Add mapper 389 (Caltron 9-in-1)
  • BMCFK23C – update
  • Fix default palette
  • Add Mortal Kombat Trilogy – 8 People (M1274) (Ch) [!].nes to ines-cor…
  • Merge unif board BMC-Super24in1SC03 to BMC-FK23C
  • M176: Minor tweak to chr mixed ram/rom logic check and others
  • Simplify dipswitch options for Nintendo World Championships 1990 cart
  • MMC3: Make sure to free any allocated memory when using MMC3 as an external module
  • Misc mapper updates
  • m269: Move chr unscrambling to mapper init
  • Unif: Show raw values for prg/chr rom size in logs
  • Remove unneeded code in BMC-Super24in1SC03
  • Remove duplicate code in bmc-fk23c
  • Rewrite BMC-FK23C/A (m176) based on updated notes and testing
  • Fix incompatible pointer type warning
  • Add 168-in-1 New Contra Function 16 to ines-correct.h
  • unif.c: Align board map struct
  • ines.c: Cleanup mapper struct and iNESLoad()
  • Fix unterminated savestate struct
  • Update mapper 79
  • vrc2and4: Fix mapper 22 games not working (regression) and refactoring
  • Update ines-correct.h

The Powder Toy

Description: Game engine core

  • Port: Ported The Powder Toy to 3DS

PocketCDG

Description: MP3 Karaoke audio player

  • Eliminate overly verbose output – On 3DS, stderr is printed on the lower screen and is slow, which
    completely ruins performance.

Picodrive

Description: Sega Megadrive/Genesis/32X/CD emulator

  • Add option to change sound quality – Even with the fast renderer (#116), the framerate on the PSP slows down at some points in some games. Reducing the sound rate can help increase the framerate in these cases.

    It’s not ideal but it’s better than frame skipping. [bmaupin]
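
To give a feel for why a lower sound rate can help, the sketch below computes how many audio samples have to be produced per video frame at a given rate. The function name and the rates used in the usage note are illustrative assumptions, not values taken from the Picodrive option.

```c
/* Samples that must be generated for each video frame (integer division,
 * so the result is rounded down). Returns 0 if frames_per_second is 0. */
unsigned demo_samples_per_frame(unsigned sample_rate_hz,
                                unsigned frames_per_second)
{
    if (frames_per_second == 0)
        return 0;

    return sample_rate_hz / frames_per_second;
}
```

At 60 frames per second, 44100 Hz works out to 735 samples per frame, while 22050 Hz needs about half that and 11025 Hz about a quarter, so the sound code has proportionally less work to do each frame.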

VBA-M

Description: Game Boy Advance emulator

  • Fix Save Failed error for Super Monkey Ball Jr.

gpSP

Description: Game Boy Advance emulator

  • [3DS] Fix dynarec prefetch aborts
  • Add automatic frame skipping

Frodo

Description: Commodore 64 emulator

  • Support running without ROM

PX68K

Description: Sharp X68000 Emulator

  • Prevent simultaneous up+down / left+right button presses
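
One common way to prevent opposing directions is to clear both bits of a pair when they are pressed together. The sketch below shows that policy on a bitmask; the DEMO_* bit layout and the function name are hypothetical, and the PX68K core may resolve the conflict differently (for example by letting one direction win).

```c
/* Hypothetical button bit layout for this sketch only. */
#define DEMO_UP    0x01u
#define DEMO_DOWN  0x02u
#define DEMO_LEFT  0x04u
#define DEMO_RIGHT 0x08u
#define DEMO_FIRE  0x10u

/* Returns the button mask with any opposing pair (up+down or left+right)
 * cleared, leaving all other bits untouched. */
unsigned demo_filter_opposing(unsigned buttons)
{
    const unsigned vertical   = DEMO_UP | DEMO_DOWN;
    const unsigned horizontal = DEMO_LEFT | DEMO_RIGHT;

    /* Both vertical bits set: treat the axis as neutral. */
    if ((buttons & vertical) == vertical)
        buttons &= ~vertical;

    /* Both horizontal bits set: treat the axis as neutral. */
    if ((buttons & horizontal) == horizontal)
        buttons &= ~horizontal;

    return buttons;
}
```

For example, a mask of up, down and fire would come out as fire only, while up and left together pass through unchanged.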

Lutro

Lutro now runs on the 3DS.

Snake runs at 60 fps
Platformer runs at 20 fps

  • (3DS) Fix build
  • (Switch) Fix build