Libretro Cores Progress Report – February 5, 2020 (Big updates for N64, Dreamcast, PlayStation1, Saturn and 3DO emulator cores!)

Our last core progress report was on January 9, 2019. Below we detail the most significant changes to all the Libretro cores we and/or upstream partners maintain. We are listing changes that have happened since then.

How to update your cores in RetroArch

There are two ways to update your cores:

a – If you have already installed the core before, you can go to Online Updater and select ‘Update Installed Cores’.

b – If you haven’t installed the core yet, go to Online Updater, ‘Core Updater’, and select the core from the list that you want to install.

Parallel N64


Description: Nintendo 64 emulator core

Parallel RSP has been completely rewritten to use GNU Lighting instead of LLVM.

Advantages:
* LLVM was a big dependency. When statically linking this in, the core could become as big as 80MB non-stripped and 60MB stripped. Contrast this to GNU Lightning where we are sitting at 3.6MB non-stripped. LLVM also was not trivial to port to other platforms as easily as GNU Lightning. This means that Parallel RSP will make its way to Android and Switch (there is already an Aarch64 backend being written by m4xw)

* There are no more micro stutters and stalls that plagued the LLVM implementation. For instance – bringing up the auto-map in Doom 64, or the first menu screen transitions in F-Zero X, or firing your gun for the first time in Quake 64 – all of these would add temporary 1 second or more stalls the first time a code block was being compiled. With GNU Lightning, there are no such issues.

Disadvantages:
* Code generation is quite naive compared to LLVM’s, so there is somewhat of a performance tradeoff compared to the LLVM implementation. We estimate we lose about 5 to 8fps compared to the LLVM implementation. However, no microstutters/stalls and no more LLVM dependency makes it worth it, and there are ways to win this performance back and go further beyond in departments other than parallel RSP anyway.

  • Remove old parallel RSP implementation based on LLVM, replaced with Lightning-based parallel RSP. Takes care of microstutters/stalls
  • Angrylion: Option to select number of threads
  • Parallel RSP now available on Mac

Flycast


Description: Sega Dreamcast emulator core

Important updates

Flycast – Better saturate colors when converting textures to higher precision

Flycast – fix texture bleeding case when upscaling

Increased NAOMI Arcade game compatibility

Flyinghead has been busy improving arcade emulation support.

Netlink support is being worked on for Gun Survivor 2 Biohazard Code: Veronica. This is an arcade game adaption of Resident Evil Code: Veronica. It was also later released on PlayStation2. It never made the transition to the home on Dreamcast.

Second is Mazan – Flash of The Blade. The controls were not emulated before. This game is actually fairly unique in that not only was it a custom Naomi hardware design by Namco (more capable GPUs that could operate in an array), but it also had an unique input device.


It used motion sensing technology to detect swings you would make with the sword. Yes, that’s right, a motion sensing sword is your primary input device in this game.

(Upcoming) Accurate video output simulation – PowerVR 2 Post-process filtering

Leilei and Flyinghead got together to add something that accuracy purists might love. This is an upcoming feature that will be available soon –

We’ve added a couple of video output postprocessing options. To be exact, it’s an internal 24->16-bit buffer post-dithering pipeline stage. Lei-lei did this reversal of the PowerVR effects with his PowerVR PCX2 card (which has the same exact post-dithering as the newer PVR GPUs) and observing lossless official press release screenshots and xjas’s VGA capture dump.

If you recall, during the PS2’s early launch, people would often remark that the Dreamcast’s video output appeared crisper and had anti-aliasing applied whereas PS2 launch games appeared heavily aliased. In truth, what was going on was not really full-scene anti-aliasing or anything to that effect. Instead, it was a simple vertical blur the PowerVR2 GPU in the Dreamcast did to combat interlace flicker on composite video output.
The GameCube did something similar with the copy filters on a few games. Some of Sonic Team’s games on GameCube for instance did a similar vertical blur for the same deflicker purpose.

There was also noticeable VGA signal loss included in the VGA output when connecting your Dreamcast to a monitor with a VGA cable. It gives the screen a green hue and adds a ton of feedback instead of it being a clean dithered 16bpp image. This is also an option in the frontend shader, and we hope to add this too to Flycast as an optional feature.

You can now enable this with the GL renderer. If you’d like to use this shader in other cores and apply it as a frontend shader, you can do that too – we added the shader to the GLSL and Slang shader packs (see gpu/powervr2).

Frame comparison at native resolution

Frame comparison at 5120×3840

Changelog

  • Init AICA int mask/level at HLE boot – fixes missing audio in KOS homebrews
  • Disable DIV matching for Aqua GT
  • Disable DIV matching for Rayman 2 (NTSC)
  • Disable DIV matching for Rayman 2 (PAL)
  • Disable DIV matching for Elysion
  • Disable DIV matching for Silent Scope (NTSC)
  • Disable DIV matching for Silent Scope (PAL)
  • Disable DIV matching for Power Stone (US)
  • Disable DIV matching for Power Stone (JP)
  • Disable DIV matching for Power Stone (PAL)
  • Disable DIV matching for Metropolis Street Racer (NTSC)
  • Disable DIV matching for Metropolis Street Racer (PAL)
  • Disable RGB Component for Vigilante 8: 2nd Offense, Gauntlet Legends, Street Fighter Alpha 3
  • Stop CDDA when reading sector. Fixes Hydro Thunder – Time records music bug
  • (GL/Vulkan) Ignore trilinear filtering if texture isn’t mipmapped. Fixes Shenmue snowflakes color
  • (GL/GL4) PowerVR2 post-processing filter from leilei
  • (GL4) Fix blending issue when autosort=0. Fixes Sturmwind menu
  • (GL4) Don’t use extra depth scale in fog calculation. Fixes fog density in Sega Rally 2
  • New widescreen cheats: Suzuki Racing, Nightmare Creatures, Rent a Hero
  • (Maple) Safely reconnect VMUs when changing per-game VMUs option, may lead to VMU corruption otherwise. Don’t create VMU files when running Naomi or AtomisWave games
  • (Naomi) Emulate World Kicks and World Kicks PCB inputs
  • (Naomi) Fix reboot (and exiting service menu) by disabling legacy DIMM board emulation
  • (Naomi) Add input config for Mazan, emulate inputs for Mazan
  • (Reios) Support disk eject/change. Tested with Skies of Arcadia and D2
  • (PVR) Better saturate colors when converting tex to higher precision. Fixes transparency Issues in RE: Code Veronica and Dead or Alive 2
  • (PVR) Fix simple texture bleeding case when upscaling
  • Add disk control interface v1 support [jdgleaver]

Beetle PSX


Description: Sony PlayStation1 emulator core

Some important updates for Beetle PSX too – Lightning/Lightrec (the new dynarec technology being used) has seen many updates and improvements. Aarch64 compatibility should be a lot better now. ARMv7 is still a Work-In-Progress and still has many issues.

It’s now possible to set DMA/GPU Event Cycles to values as high as 1024. 1024 can offer a significant speed boost, but some games might not boot with this setting enabled. Test it yourself with a game of your choosing and see if it works reliably before you decide. You can always go for a lower value and see if that works better, while you don’t lose too much performance in the process.

So, to address the previous limitations of the dynarec – the two big ones were that runahead did not work, and that PGXP did not work with the dynarec enabled. Runahead is now working for software rendering mode, so that part is fulfilled (since hardware rendering not working reliably is not a core issue). As for PGXP, it now works with dynarec, but you will see a steep decline in performance, bringing you to performance levels just a little bit better than interpreter mode. However, there are plans to make PGXP part of the dynarec as well, which could take care of this issue.

Some performance tips for people that want to get the most out of their device:
* Set Dynarec Code Invalidation to ‘DMA Only (Slightly Faster)’. If it causes no issues in a game, this should give you a not-insignificant performance boost in some games.
* Set Dynarec DMA/GPU Event Cycles to a higher value than the default 128 if you can get away with it. If a game starts crashing or no longer boots from the BIOS screen, then you know you set it too high. Setting DMA Cycles to 1024 can have a big impact on maximum framerate.
* Software Framebuffer can be disabled for games that don’t make use of framebuffer readback. Try to turn this off if you are using the Vulkan hardware renderer. If you find certain graphics artefacts all of a sudden that were previously not there, you might have to turn this setting back on to get rid of the glitches.
* The Vulkan renderer right now might be a bit slower than the Software renderer. Some things you can try to bring the performance more in line would be to disable things like ‘Adaptive smoothing’, but if there is still a big performance gulf, you should resort to the Software renderer.
* PGXP right now will have a massive impact on performance with the dynarec. Turn it off if you care about getting the best performance possible.

  • Don’t call PGXP functions in gpu when PGXP is disabled
  • Add PGXP support in dynarec. Not much faster than interpreter, due to calling PGXP functions on every load/store opcode. Might become faster later
  • Add more DMA/GPU Event Cycles options. All multiples of 128 (default) should be fine. 1024 should be significantly faster but also the least compatible
  • Increase CPU overclock limit to 750%
  • Fix loading save states from pre-dynarec, need to use SFARRAYN with old name
  • Update Lightning
  • Add disk control interface v1 support [jdgleaver]

4DO


Description: 3DO emulator core

To learn more about some of the recent developments surrounding 4DO, be sure to read our dedicated article on this.

Beetle Saturn

Description: Sega Saturn emulator core

  • Add disk control interface v1 support [jdgleaver]

Beetle Wswan

Description: Bandai WonderSwan/WonderSwan Color emulator core

  • Backport 1.24.0 fixes
  • Backport variable color depth

Beetle Supergrafx

Description: Supergrafx emulator core

  • Move 2/6 button mode toggle to frontend

NeoCD

Description: SNK Neo Geo CD emulator core

  • Initial implementation of memory maps (untested) [fabrice-martinez]

Mupen64plus Next

Description: Nintendo 64 emulator core

  • Hotfix for Legend of Zelda: Majora’s Mask freeze due to special interrupt
  • Hotfix for Legend of Zelda: Ocarina of Time (+randomizer), this also fixes Rat Attack (only for dynarec, not interpreter) due to wrong handling of TLB exceptions for titles that don’t use TLB
  • Both hotfixes don’t fix the root cause and will be revised later on
  • Updated mupen64plus-rsp-hle, thanks to Gillou68310 the HVQM µcode is now implemented for HLE, fixing Pokemon Puzzle League and Yakouchuu II.

Note: Stay tuned for a lot of great updates coming out over the coming months, featuring threaded rendering as well as multi plugin support!

bsnes hd beta

Description: Super Nintendo emulator core

  • Update to latest version [DerKoun]

Boom3

Description: Doom 3 game engine core

  • Changed name from dhewm3 to boom3 at request of author

P-UAE

Description: Commodore Amiga emulator

  • Libco removed. 8-9% performance improvement
  • Enabled SERIAL_PORT, which fixes:
    All versions of Super Skidmarks, except that WHDLoad slave 1.1 with the ludicrous memory requirement
    Grand Prix Circuit
  • Ensure reset_drawing() is called whenever geometry changes (prevents out of bounds video buffer access)
  • D-Pad mouse acceleration + font fix
  • More statusbar options
  • VKBD glyph tuning
  • Add support for disk control interface v1 (disk display labels)
  • Remove savestate_initsave + better VKBD mouse control
  • Fix from WinUAE 4.1.0 for Chaos Engine 2 AGA crash
  • VKBD tuning
  • Audio via retro_audio_batch_cb + MDS fix + pregap fix
  • New default controls
  • Graph font & VKBD tweaks
  • HD LED writing color to red

Final Burn Neo

Description: Multi-system arcade emulator core

  • Latest updates

LRmame

Description: Multi-system arcade emulator core

  • Updated to latest version (0.218) – will be available later today [tcamargo]

NP2kai

Description: PC-9801 series emulator

  • Updated to latest version [AZO234]

Frodo

Description: Commodore 64 emulator

  • Now available on Android

Kronos

Description: Sega Saturn emulator core

  • OpenGLES preparation work
  • Fix window resize for VDP1 layer – Fix Winter Heat in resize
  • When OREG is read while status flag is clear, force command processing – avoid race – fix Rayman controls
  • Be more generic for the SMPC race issue
  • On intback continue write, status flag shall be 1 – Fix batman boot
  • Fix Batman window
  • Set the vdp1On when updating using write – Fix Sega Ages loading screen
  • If the VDP1 is cleared with a non transparent color, assume it shall
  • Introduce the development RAM Card used by Heart of Darkness
  • Display VDP1 layer cleared with non transparent color
  • (libretro) hook the dev cartridge
  • Fix two consecutive end code on core OpenGL – Fix Code R
  • fh is related to kx – shall fix some bad behavior on RBG CS

FCEUmm

Description: NES emulator core

NOTE: All changes courtesy of negativeexponent

  • Update mapper 213
  • Update mapper 319 (BMC-HP898F)
  • Update vrc2and4.c – support for big bank CHR (Contra 3) matched by hash
  • Added iNES 1.0/2.0 mappers
  • 134 – replaced Mapper134_init with Bs5652_Init
  • 391 – NC7000MM
  • 402 – 831019C J-2282
  • Added UNIF boards:
  • AB-G1L
  • BS-110
  • WELL-NO-DG450
  • KG256
  • Fix savestates – prevent possible issue on big-endian by adding mask
  • Fix savestates – specify correct variable size to state struct
  • Backport new FDS [Famicom Disk System] disk handling – fixes saving issue with some games (Bubble Bobble, Super Lode Runner II, …)
  • Add mapper 357
  • Add mapper 372
  • Add mapper 541
  • Add mapper 538
  • Add mapper 381
  • Add mapper 288
  • Update BMC-RESET-TXROM (m313)
  • Add mapper 374
  • Add mapper 390
  • Add mapper 267
  • m313: Fix incorrect bank sizes
  • Add mapper 294 (m134)
  • Add mapper 297

Genesis Plus GX

Description: Sega Genesis/Mega Drive/Master System/Game Gear emulator

  • Updated to latest version
  • Fixed runahead issues

SMS-Plus GX

Description: Sega Master System/Game Gear emulator

  • Add support for 2nd player port

Picodrive

Description: Sega Genesis/Mega Drive/Game Gear/Master System/Sega CD/32X emulator core

  • Allow access to Sega CD’s extra memory using retro_memory_map [negativeExponent]

mgba

Description: Game Boy Advance emulator core

  • Updated to latest version
  • Add Italian core options translation
  • Fixed runahead issues [endrift]
  • Add optional interframe blending

Mesen

Description: NES emulator core

  • Updated to latest version [Sour]
  • Fixed runahead issues [Sour]

PCSX ReARMed

Description: Sony PlayStation1 emulator core

  • Add input analog axis range modifier [stouken]
  • Add disk control interface v1 support [jdgleaver]

Snes9x 2005

Description: Super Nintendo emulator core

  • Should finally compile now for Raspberry Pi 4

TIC-80

Description: TIC-80 emulator core

  • Updated to latest version

PX68K Libretro

Description: NEC X68000 home computer emulator

  • Fix for M3U not registering Eject state
  • Implementation of new Disk Control interface (including custom labels)
  • px68k switch menu now accessable as core options

Parallel N64 with Parallel RSP dynarec release – fast and accurate N64 emulation is now here!

Hot on the heels of our Beetle PSX dynarec public beta release, here comes another bombshell, this time targeting that other big 5th generation console, the Nintendo 64.

ParaLLel N64 is back with a vengeance, and this time Parallel RSP coupled with Angrylion Plus RDP is the enabling technology for a big performance jump in low-level accurate N64 emulation.

Parallel RSP

We have released today a new version of Parallel N64 for Windows and Linux that adds back the Parallel RSP dynarec. This together with multithreaded Angrylion makes Parallel N64 the fastest LLE N64 emulator by far.

We have also added missing bits and pieces to Parallel RSP that allows for several games to work, such as World Driver Championship, Stunt Racer 64, and Gauntlet Legends. On top of that, we also optimized some parts in Angrylion RDP Plus which resulted in greatly enhanced core/thread utilization on a 16-core Ryzen CPU. It should also similarly scale well downwards.

Just to illustrate how dramatic of a difference all these performance-focused enhancements make: on an underpowered 2012 Core i5 laptop, the following games are capable of being played at fullspeed: Resident Evil 2, Mario Kart 64, 1080 Snowboarding, Doom 64, Quake 64, Mischief Makers, Bakuretsu Muteki Bangaioh, Bust-A Move 2 Arcade Edition, Kirby 64 – The Crystal Shards, Mortal Kombat Trilogy, Forsaken 64, Harvest Moon 64.

A 2012 Core i5 desktop CPU like the Core i5 3570k should be more than capable enough of playing the vast majority of N64 games at fullspeed with Angrylion+Parallel RSP, with only rare exceptions like Star Wars Episode 1 Racer being too much for it.

Dramatically lowering the performance ceiling like this has vast implications for the viability of low-level accurate N64 emulation, and opens the door for more people to enjoy bug-free N64 emulation, previously the preserve of only the most overpowered PCs.

Right now, only Linux and Windows are able to enjoy the benefits of Parallel RSP due to the LLVM dependency. This might solve itself later on, though.

How to get it

On Windows: Just downloaded the latest Parallel N64 core from RetroArch’s Online Updater menu. You can either download it from Core Updater, or you can select ‘Update Installed Cores’ if you already had a prior version of Parallel N64 installed.

On Linux: Download the latest Parallel N64 core from RetroArch’s Online Updater menu. You can either download it from Core Updater, or you can select ‘Update Installed Cores’ if you already had a prior version of Parallel N64 installed.

IMPORTANT: On Linux, the situation is a bit more complicated. If you find that the core won’t load on your Linux distribution, it is because LLVM on Linux normally links against libtinfo. For the version of Parallel N64, you will need to make sure you have libtinfo5 installed.

On Ubuntu Linux, you can do this by going into the terminal and typing in the following:

sudo apt-get install libtinfo5

Consult the documentation of your Linux distribution’s package manager for more details on how you can download this package. Once installed, the core should work. If you have a more recent version of libtinfo already installed, creating a symlink for libinfo5.so might be enough.

How to use it

Go into Quick Menu -> Options.

Make sure that ‘GFX Plugin’ is set to ‘angrylion’, and ‘RSP Plugin’ is set to ‘parallel’.

Restart the core.

Explanation of the options

Angrylion (VI Mode) – The N64’s video output interface ‘VI’ would postprocess the image in a very elaborate way. We will explain all of the modes down below:

VI Filter: – All VI filtering is being applied with no omissions.

VI Filter – AA+Blur – Only anti-aliasing + blur filtering by the VI is being applied.

VI Filter – AA+DeDither – Only anti-aliasing + dedithering filtering by the VI is being applied.

VI Filter – AA Only – Only anti-aliasing by the VI is being applied.

Unfiltered: – This is the fastest mode. There is no VI filtering being applied of any sort, the color buffer is being directly output.

Depth: – The VI’s depth buffer as a grayscale image

Coverage: – Coverage as a grayscale image

Angrylion (Thread sync level) – With most games it is fine to leave this at ‘Low’. Low will be the fastest, with High being significantly slower. Games which need Thread sync level set to High to prevent rendering glitches include but are not limited to Starcraft 64, Conker’s Bad Fur Day and Paper Mario.

Angrylion (Multi-threading) – You will want to always enable this. When it is disabled, Angrylion will be entirely single-threaded. It will be much slower in single-threaded mode.

Increased compatibility for Parallel RSP

The following games will now work with Parallel RSP:

* World Driver Championship
* Gauntlet Legends
* Stunt Racer 64
* Mario no Photopie (graphics fixed)

Performance tests

NOTE: All these tests were performed with the game Super Mario 64 (USA). Exact scene being tested can be seen in the screenshot down below.


* Cxd4 means the Interpreter RSP core. This was previously the only option you could use in combination with Angrylion – it needs an LLE RSP core, it cannot work with a HLE RSP core.
* All the VI Filter tests are done with Thread Sync Low.

Test hardware: Desktop PC – AMD Ryzen 9 3950x, Windows 10 (16 cores, 32 HW threads)

RSP Mode VI Filter VI Filter – AA+Blur VI Filter – AA+DeDither VI Filter – AA Only Unfiltered – Thread Sync Low Unfiltered – Thread Sync Mid Unfiltered – Thread Sync High
Cxd4 174 VI/s 176 VI/s 175 VI/s 178 VI/s 180 VI/s 175 VI/s 92 VI/s
Parallel RSP 235 VI/s 238 VI/s 238 VI/s 240 VI/s 245 VI/s 235 VI/s 106 VI/s

Test hardware: Desktop PC – Intel Core i7 7700k @ 4.2GHz, Windows 10 (4 cores, 8 HW threads)

RSP Mode VI Filter VI Filter – AA+Blur VI Filter – AA+DeDither VI Filter – AA Only Unfiltered – Thread Sync Low Unfiltered – Thread Sync Mid Unfiltered – Thread Sync High
Cxd4 95 VI/s 96 VI/s 96 VI/s 99 VI/s 104 VI/s 96 VI/s 86 VI/s
Parallel RSP 139 VI/s 145 VI/s 144 VI/s 151 VI/s 170 VI/s 146 VI/s 121 VI/s

Test hardware: Desktop PC – Intel Core i5 3570k @ 4GHz, Windows 7 x64 (4 cores, 4 HW threads)

RSP Mode VI Filter VI Filter – AA+Blur VI Filter – AA+DeDither VI Filter – AA Only Unfiltered – Thread Sync Low Unfiltered – Thread Sync Mid Unfiltered – Thread Sync High
Cxd4 89 VI/s 92 VI/s 91 VI/s 95 VI/s 103 VI/s 103 VI/s 97 VI/s
Parallel RSP 118 VI/s 124 VI/s 121 VI/s 128 VI/s 145 VI/s 145 VI/s 133 VI/s

Test hardware: Laptop PC – Intel Core i5 3210M @ 2.50GHz, Ubuntu Linux 19.04 (2 cores, 4 HW threads)

This is a 2012 Lenovo G580 laptop.

RSP Mode VI Filter VI Filter – AA+Blur VI Filter – AA+DeDither VI Filter – AA Only Unfiltered – Thread Sync Low Unfiltered – Thread Sync Mid Unfiltered – Thread Sync High
Cxd4 39 VI/s 39 VI/s 42 VI/s 41 VI/s 43 VI/s 39 VI/s 39 VI/s
Parallel RSP 52 VI/s 55 VI/s 54 VI/s 58 VI/s 67 VI/s 66 VI/s 61 VI/s

We recommend for a configuration as low-end as this that you just keep VI Overlay to ‘Unfiltered’.

The majority of games on a configuration this low-end experience dips below full-speed, however, there are a fair few games right now which already run at fullspeed. This is by no means a definitive or exhaustive list but here’s the ones for which I can confirm run at fullspeed with very rare dips if at all:

Resident Evil 2, Mario Kart 64, 1080 Snowboarding, Doom 64, Quake 64, Mischief Makers, Bakuretsu Muteki Bangaioh, Bust-A Move 2 Arcade Edition, Kirby 64 – The Crystal Shards, Mortal Kombat Trilogy, Forsaken 64, Harvest Moon 64

Currently known issues

* Do not use ‘Sync to Exact Content Framerate’ for Angrylion + Parallel RSP. You won’t get good results.
* Recompiling blocks the first time can lead to a very slight stutter. Thankfully this only happens the first time, and never happens afterwards for the runtime duration.
* On Linux there might be dependency issues related to libtinfo5. Read the section ‘How to Get It’ where we try to explain how to solve this problem at least for Ubuntu Linux. NOTE: This situation might resolve itself in the future in case we move away from LLVM for the RSP part.