Libretro Cores Progress Report – February 5, 2020 (Big updates for N64, Dreamcast, PlayStation1, Saturn and 3DO emulator cores!)

Our last core progress report was on January 9, 2019. Below we detail the most significant changes to all the Libretro cores we and/or upstream partners maintain. We are listing changes that have happened since then.

How to update your cores in RetroArch

There are two ways to update your cores:

a – If you have already installed the core before, you can go to Online Updater and select ‘Update Installed Cores’.

b – If you haven’t installed the core yet, go to Online Updater, ‘Core Updater’, and select the core from the list that you want to install.

Parallel N64

Description: Nintendo 64 emulator core

Parallel RSP has been completely rewritten to use GNU Lighting instead of LLVM.

* LLVM was a big dependency. When statically linking this in, the core could become as big as 80MB non-stripped and 60MB stripped. Contrast this to GNU Lightning where we are sitting at 3.6MB non-stripped. LLVM also was not trivial to port to other platforms as easily as GNU Lightning. This means that Parallel RSP will make its way to Android and Switch (there is already an Aarch64 backend being written by m4xw)

* There are no more micro stutters and stalls that plagued the LLVM implementation. For instance – bringing up the auto-map in Doom 64, or the first menu screen transitions in F-Zero X, or firing your gun for the first time in Quake 64 – all of these would add temporary 1 second or more stalls the first time a code block was being compiled. With GNU Lightning, there are no such issues.

* Code generation is quite naive compared to LLVM’s, so there is somewhat of a performance tradeoff compared to the LLVM implementation. We estimate we lose about 5 to 8fps compared to the LLVM implementation. However, no microstutters/stalls and no more LLVM dependency makes it worth it, and there are ways to win this performance back and go further beyond in departments other than parallel RSP anyway.

  • Remove old parallel RSP implementation based on LLVM, replaced with Lightning-based parallel RSP. Takes care of microstutters/stalls
  • Angrylion: Option to select number of threads
  • Parallel RSP now available on Mac


Description: Sega Dreamcast emulator core

Important updates

Flycast – Better saturate colors when converting textures to higher precision

Flycast – fix texture bleeding case when upscaling

Increased NAOMI Arcade game compatibility

Flyinghead has been busy improving arcade emulation support.

Netlink support is being worked on for Gun Survivor 2 Biohazard Code: Veronica. This is an arcade game adaption of Resident Evil Code: Veronica. It was also later released on PlayStation2. It never made the transition to the home on Dreamcast.

Second is Mazan – Flash of The Blade. The controls were not emulated before. This game is actually fairly unique in that not only was it a custom Naomi hardware design by Namco (more capable GPUs that could operate in an array), but it also had an unique input device.

It used motion sensing technology to detect swings you would make with the sword. Yes, that’s right, a motion sensing sword is your primary input device in this game.

(Upcoming) Accurate video output simulation – PowerVR 2 Post-process filtering

Leilei and Flyinghead got together to add something that accuracy purists might love. This is an upcoming feature that will be available soon –

We’ve added a couple of video output postprocessing options. To be exact, it’s an internal 24->16-bit buffer post-dithering pipeline stage. Lei-lei did this reversal of the PowerVR effects with his PowerVR PCX2 card (which has the same exact post-dithering as the newer PVR GPUs) and observing lossless official press release screenshots and xjas’s VGA capture dump.

If you recall, during the PS2’s early launch, people would often remark that the Dreamcast’s video output appeared crisper and had anti-aliasing applied whereas PS2 launch games appeared heavily aliased. In truth, what was going on was not really full-scene anti-aliasing or anything to that effect. Instead, it was a simple vertical blur the PowerVR2 GPU in the Dreamcast did to combat interlace flicker on composite video output.
The GameCube did something similar with the copy filters on a few games. Some of Sonic Team’s games on GameCube for instance did a similar vertical blur for the same deflicker purpose.

There was also noticeable VGA signal loss included in the VGA output when connecting your Dreamcast to a monitor with a VGA cable. It gives the screen a green hue and adds a ton of feedback instead of it being a clean dithered 16bpp image. This is also an option in the frontend shader, and we hope to add this too to Flycast as an optional feature.

You can now enable this with the GL renderer. If you’d like to use this shader in other cores and apply it as a frontend shader, you can do that too – we added the shader to the GLSL and Slang shader packs (see gpu/powervr2).

Frame comparison at native resolution

Frame comparison at 5120×3840


  • Init AICA int mask/level at HLE boot – fixes missing audio in KOS homebrews
  • Disable DIV matching for Aqua GT
  • Disable DIV matching for Rayman 2 (NTSC)
  • Disable DIV matching for Rayman 2 (PAL)
  • Disable DIV matching for Elysion
  • Disable DIV matching for Silent Scope (NTSC)
  • Disable DIV matching for Silent Scope (PAL)
  • Disable DIV matching for Power Stone (US)
  • Disable DIV matching for Power Stone (JP)
  • Disable DIV matching for Power Stone (PAL)
  • Disable DIV matching for Metropolis Street Racer (NTSC)
  • Disable DIV matching for Metropolis Street Racer (PAL)
  • Disable RGB Component for Vigilante 8: 2nd Offense, Gauntlet Legends, Street Fighter Alpha 3
  • Stop CDDA when reading sector. Fixes Hydro Thunder – Time records music bug
  • (GL/Vulkan) Ignore trilinear filtering if texture isn’t mipmapped. Fixes Shenmue snowflakes color
  • (GL/GL4) PowerVR2 post-processing filter from leilei
  • (GL4) Fix blending issue when autosort=0. Fixes Sturmwind menu
  • (GL4) Don’t use extra depth scale in fog calculation. Fixes fog density in Sega Rally 2
  • New widescreen cheats: Suzuki Racing, Nightmare Creatures, Rent a Hero
  • (Maple) Safely reconnect VMUs when changing per-game VMUs option, may lead to VMU corruption otherwise. Don’t create VMU files when running Naomi or AtomisWave games
  • (Naomi) Emulate World Kicks and World Kicks PCB inputs
  • (Naomi) Fix reboot (and exiting service menu) by disabling legacy DIMM board emulation
  • (Naomi) Add input config for Mazan, emulate inputs for Mazan
  • (Reios) Support disk eject/change. Tested with Skies of Arcadia and D2
  • (PVR) Better saturate colors when converting tex to higher precision. Fixes transparency Issues in RE: Code Veronica and Dead or Alive 2
  • (PVR) Fix simple texture bleeding case when upscaling
  • Add disk control interface v1 support [jdgleaver]

Beetle PSX

Description: Sony PlayStation1 emulator core

Some important updates for Beetle PSX too – Lightning/Lightrec (the new dynarec technology being used) has seen many updates and improvements. Aarch64 compatibility should be a lot better now. ARMv7 is still a Work-In-Progress and still has many issues.

It’s now possible to set DMA/GPU Event Cycles to values as high as 1024. 1024 can offer a significant speed boost, but some games might not boot with this setting enabled. Test it yourself with a game of your choosing and see if it works reliably before you decide. You can always go for a lower value and see if that works better, while you don’t lose too much performance in the process.

So, to address the previous limitations of the dynarec – the two big ones were that runahead did not work, and that PGXP did not work with the dynarec enabled. Runahead is now working for software rendering mode, so that part is fulfilled (since hardware rendering not working reliably is not a core issue). As for PGXP, it now works with dynarec, but you will see a steep decline in performance, bringing you to performance levels just a little bit better than interpreter mode. However, there are plans to make PGXP part of the dynarec as well, which could take care of this issue.

Some performance tips for people that want to get the most out of their device:
* Set Dynarec Code Invalidation to ‘DMA Only (Slightly Faster)’. If it causes no issues in a game, this should give you a not-insignificant performance boost in some games.
* Set Dynarec DMA/GPU Event Cycles to a higher value than the default 128 if you can get away with it. If a game starts crashing or no longer boots from the BIOS screen, then you know you set it too high. Setting DMA Cycles to 1024 can have a big impact on maximum framerate.
* Software Framebuffer can be disabled for games that don’t make use of framebuffer readback. Try to turn this off if you are using the Vulkan hardware renderer. If you find certain graphics artefacts all of a sudden that were previously not there, you might have to turn this setting back on to get rid of the glitches.
* The Vulkan renderer right now might be a bit slower than the Software renderer. Some things you can try to bring the performance more in line would be to disable things like ‘Adaptive smoothing’, but if there is still a big performance gulf, you should resort to the Software renderer.
* PGXP right now will have a massive impact on performance with the dynarec. Turn it off if you care about getting the best performance possible.

  • Don’t call PGXP functions in gpu when PGXP is disabled
  • Add PGXP support in dynarec. Not much faster than interpreter, due to calling PGXP functions on every load/store opcode. Might become faster later
  • Add more DMA/GPU Event Cycles options. All multiples of 128 (default) should be fine. 1024 should be significantly faster but also the least compatible
  • Increase CPU overclock limit to 750%
  • Fix loading save states from pre-dynarec, need to use SFARRAYN with old name
  • Update Lightning
  • Add disk control interface v1 support [jdgleaver]


Description: 3DO emulator core

To learn more about some of the recent developments surrounding 4DO, be sure to read our dedicated article on this.

Beetle Saturn

Description: Sega Saturn emulator core

  • Add disk control interface v1 support [jdgleaver]

Beetle Wswan

Description: Bandai WonderSwan/WonderSwan Color emulator core

  • Backport 1.24.0 fixes
  • Backport variable color depth

Beetle Supergrafx

Description: Supergrafx emulator core

  • Move 2/6 button mode toggle to frontend


Description: SNK Neo Geo CD emulator core

  • Initial implementation of memory maps (untested) [fabrice-martinez]

Mupen64plus Next

Description: Nintendo 64 emulator core

  • Hotfix for Legend of Zelda: Majora’s Mask freeze due to special interrupt
  • Hotfix for Legend of Zelda: Ocarina of Time (+randomizer), this also fixes Rat Attack (only for dynarec, not interpreter) due to wrong handling of TLB exceptions for titles that don’t use TLB
  • Both hotfixes don’t fix the root cause and will be revised later on
  • Updated mupen64plus-rsp-hle, thanks to Gillou68310 the HVQM µcode is now implemented for HLE, fixing Pokemon Puzzle League and Yakouchuu II.

Note: Stay tuned for a lot of great updates coming out over the coming months, featuring threaded rendering as well as multi plugin support!

bsnes hd beta

Description: Super Nintendo emulator core

  • Update to latest version [DerKoun]


Description: Doom 3 game engine core

  • Changed name from dhewm3 to boom3 at request of author


Description: Commodore Amiga emulator

  • Libco removed. 8-9% performance improvement
  • Enabled SERIAL_PORT, which fixes:
    All versions of Super Skidmarks, except that WHDLoad slave 1.1 with the ludicrous memory requirement
    Grand Prix Circuit
  • Ensure reset_drawing() is called whenever geometry changes (prevents out of bounds video buffer access)
  • D-Pad mouse acceleration + font fix
  • More statusbar options
  • VKBD glyph tuning
  • Add support for disk control interface v1 (disk display labels)
  • Remove savestate_initsave + better VKBD mouse control
  • Fix from WinUAE 4.1.0 for Chaos Engine 2 AGA crash
  • VKBD tuning
  • Audio via retro_audio_batch_cb + MDS fix + pregap fix
  • New default controls
  • Graph font & VKBD tweaks
  • HD LED writing color to red

Final Burn Neo

Description: Multi-system arcade emulator core

  • Latest updates


Description: Multi-system arcade emulator core

  • Updated to latest version (0.218) – will be available later today [tcamargo]


Description: PC-9801 series emulator

  • Updated to latest version [AZO234]


Description: Commodore 64 emulator

  • Now available on Android


Description: Sega Saturn emulator core

  • OpenGLES preparation work
  • Fix window resize for VDP1 layer – Fix Winter Heat in resize
  • When OREG is read while status flag is clear, force command processing – avoid race – fix Rayman controls
  • Be more generic for the SMPC race issue
  • On intback continue write, status flag shall be 1 – Fix batman boot
  • Fix Batman window
  • Set the vdp1On when updating using write – Fix Sega Ages loading screen
  • If the VDP1 is cleared with a non transparent color, assume it shall
  • Introduce the development RAM Card used by Heart of Darkness
  • Display VDP1 layer cleared with non transparent color
  • (libretro) hook the dev cartridge
  • Fix two consecutive end code on core OpenGL – Fix Code R
  • fh is related to kx – shall fix some bad behavior on RBG CS


Description: NES emulator core

NOTE: All changes courtesy of negativeexponent

  • Update mapper 213
  • Update mapper 319 (BMC-HP898F)
  • Update vrc2and4.c – support for big bank CHR (Contra 3) matched by hash
  • Added iNES 1.0/2.0 mappers
  • 134 – replaced Mapper134_init with Bs5652_Init
  • 391 – NC7000MM
  • 402 – 831019C J-2282
  • Added UNIF boards:
  • AB-G1L
  • BS-110
  • WELL-NO-DG450
  • KG256
  • Fix savestates – prevent possible issue on big-endian by adding mask
  • Fix savestates – specify correct variable size to state struct
  • Backport new FDS [Famicom Disk System] disk handling – fixes saving issue with some games (Bubble Bobble, Super Lode Runner II, …)
  • Add mapper 357
  • Add mapper 372
  • Add mapper 541
  • Add mapper 538
  • Add mapper 381
  • Add mapper 288
  • Update BMC-RESET-TXROM (m313)
  • Add mapper 374
  • Add mapper 390
  • Add mapper 267
  • m313: Fix incorrect bank sizes
  • Add mapper 294 (m134)
  • Add mapper 297

Genesis Plus GX

Description: Sega Genesis/Mega Drive/Master System/Game Gear emulator

  • Updated to latest version
  • Fixed runahead issues


Description: Sega Master System/Game Gear emulator

  • Add support for 2nd player port


Description: Sega Genesis/Mega Drive/Game Gear/Master System/Sega CD/32X emulator core

  • Allow access to Sega CD’s extra memory using retro_memory_map [negativeExponent]


Description: Game Boy Advance emulator core

  • Updated to latest version
  • Add Italian core options translation
  • Fixed runahead issues [endrift]
  • Add optional interframe blending


Description: NES emulator core

  • Updated to latest version [Sour]
  • Fixed runahead issues [Sour]


Description: Sony PlayStation1 emulator core

  • Add input analog axis range modifier [stouken]
  • Add disk control interface v1 support [jdgleaver]

Snes9x 2005

Description: Super Nintendo emulator core

  • Should finally compile now for Raspberry Pi 4


Description: TIC-80 emulator core

  • Updated to latest version

PX68K Libretro

Description: NEC X68000 home computer emulator

  • Fix for M3U not registering Eject state
  • Implementation of new Disk Control interface (including custom labels)
  • px68k switch menu now accessable as core options

4DO: Recent Changes and Future Plans

About two years ago I more or less adopted the libretro 4DO core. After fixing a VSYNC issue that many had complained about I ended up almost completely refactoring the code and adding a number of features. After a year’ish break I’m back and have been focusing on accuracy and some experimentation in HLE (high level emulation). Here’s an overview of some of that work and plans for the future.

Adding HLE

The 3DO was ahead of its time in a number of ways. One was that it had an operating system. It has preemptive multitasking, threading, high level memory management, dynamic loading of “folios” (libraries), IPC, IO APIs, and general abstractions to much of the hardware. The OS and most/all software for the platform is written in C and compiled using the Norcroft ARM toolchain from that era. The OS is present in the system’s ROM but also included on each CDROM. The ROM OS is pretty thorough but ultimately a bootloader and bootstraps the environment before handing control off to the software on the CDROM. Many of the core OS functions are called via the ARM SWI (software interrupt) instruction. This makes catching and overriding these calls very easy when combined with knowledge of the ARM procedure call standard.

Catching and overriding some or all of the OS functions would theoretically improve performance and/or enable advanced features. Some of these functions are somewhat complicated but one set of SWI calls were a perfect candidate for HLE: the matrix arithmetic functions. The MADAM graphics co-processor has a set of hardware accelerated 16.16 fixed point matrix arithmetic functions. This is used in a number of 3D games and called regularly. While reworking MADAM emulation in the past it was recognized that the OS could and would check the revision of the MADAM chip and had a fallback to software based matrix functions if it recognized it was running on a prototype system without the matrix hardware and that for some games there was a slight speed increase when switching between the hardware emulation and the software routines. So we know that improving the matrix arithmetic situation could help lower end systems struggling with the LLE and it should also be simple to ensure accuracy.

So we’ve added a new, optional, HLE mode that replaces the emulated hardware or software matrix functions with native fixed point calculations performed in straight C. Games that make heavy use of the matrix hardware run a few FPS faster on lower end systems such as the XU4. The option can be toggled in real time making comparison easy. You can see what games use these functions on the 3DO Development Repo site.

In the future a number of other functions would be interesting to replace such as the DSP and filesystem APIs. The DSP never got a development kit so most (all?) games use standard programs provided by The 3DO Company. Meaning that each of these programs could be statically re-implemented in C instead of using the current interpreter. The DSP is probably 1/3 of the CPU cost to emulation so HLE could make a big impact there. With the filesystem API the host storage device could be used directly allowing easier sharing and backup of save games and offering more space than a real system could have.

VDLP Rewrite

The 3DO has a fairly advanced video display line processor. It has programmable functions which provide a number of features including: managing CLUTs (color lookup tables), performing optional horizontal and vertical interpolation (upscaling from the internal 320×240 framebuffer to 640×480), managing background color, integrating secondary video data streams, etc.

The FreeDO/4DO VDLP emulation was not the best. If you read the patents the system seemed reasonably straightforward but the code not so much.

  • Has a lot of literal values for masks and shifts making it unclear what is being processed.
  • Variable names did not always match the docs.
  • Inadvertently reversed logic that rendered the wrong lines.
  • A constant bodge value used to offset the rendering line. Didn’t have a functional impact but was confusing and ugly.
  • It fails to properly manage the disabling render of a line. This can result in garbage being rendered.
  • Lacks the interpolation to full NTSC or PAL frames.
  • A general lack of PAL support.
  • It failed to properly sync line rendering with libretro’s main callback leading to additional rendering latency.
  • Has an unnecessary step in the rendering pipeline. Rather than taking the internal VRAM framebuffer and directly generating the buffer for the host to display it was copying the data into an intermediate structure which later was converted into a host compatible format. This copying offered no additional functionality and could be removed entirely.
  • Alternative color formats were ignored completely. Only P555 was supported. Though it is unclear if any software use the other formats.

A new VDLP core has been written from scratch to fix some of the above and enable future enhancements. New features include:

  • Lower frame rendering latency
  • Properly synced frame generation with libretro run loop
  • Improved performance due to removal of extra rendering stage
  • Ability to choose any of the libretro pixel formats (XRGB8888, 0RGB1555, or RGB565). Formerly only supported XRGB8888. May improve performance on 16bpp platforms.
  • Forced CLUT bypass mode. The VDLP has a special, per line and per pixel ability to disable color lookups and convert the P555 RGB to 888 RGB by a logical shift left of 3. Many games technically use a CLUT but in fact use a default palette that matches the left shift values. While this may cause color distortions on some games it also should improve performance due to a simpler pixel conversion function.

Interpolation is still on the TODO list. It will add non-insignificant cost to the rendering pipeline but was a significant feature of the platform and would be nice to have for the authenticity.


Later revisions of the 3DO had a chip called ANVIL which was the combination of the co-processors MADAM and CLIO. The ROM for those systems are very similar to the previous model’s ROMs except that it appeared to be mirrored to a different memory location. It put a single unconditional jump at bootup into that mirrored location but all other references were to the original memory range. This  memory range was not properly supported nor was the VDLP and general rendering pipeline designed to work with PAL. All of that has been fixed. You can use EU ROMs which will cause EU games to run in one of two PAL resolutions. For the time being the ROM and resolution/field rate need to be selected manually.

A list of games and their supported region modes can be found here. Few games actually render at full PAL resolutions (PAL2). Most are letterboxed NTSC (PAL1).

Future Work

  • New HLE functions as described above
  • VDLP interpolation (dev docs, patent)
  • Increased emulated system DRAM and VRAM. The OS can support up to 14MB of DRAM & 2MB of VRAM or 15MB DRAM & 1MB VRAM. The emulator needs to be rewritten to enable it. This could help with homebrew.
  • Rewrite of the ARM and DSP cores
  • Rewrite of the CEL renderer
  • Removal of the “hacks” / improve accuracy
  • Improve accuracy by adding MADAM memory fence support and proper CPU exceptions such as those related to data alignment
  • Automatic detection of NTSC/PAL modes
  • Configurable hardware revisions of MADAM and CLIO chips
  • Add features to help with homebrew development and debugging
  • Other misc things

Renaming of the Project

Development on the original 4DO projected stopped a number of years ago and while the website is still up the project is effectively dead. It’s been noticed that people regularly believe that the old 4DO project and the libretro core are largely identical however the libretro core has gone under significant restructuring, rewrites, and offers new features. To limit confusion we are planning on renaming the project to Opera in the not too distant future. Opera was the code name for the 3DO project.

Related Work

Optimus (author of OptiDoom, the reworked 3DO Doom port) and I have created a new website dedicated to all things 3DO development. We’re writing homebrew tutorials, documenting the hardware, posting development toolkits and homebrew tools, etc. The hope is to spur interest in the platform and lower the barrier to entry.

Right now there is a very small group of individuals working in the 3DO and M2 space. Whether that be documentation, emulation, or homebrew. We’d like to extend an invitation to any and all interested in the 3DO or M2 to help in any capacity or consider the platforms for their current or future homebrew projects. We’ve got a big TODO list that we would love help in tackling. If you’re just interested in following our progress you can create an account on the wiki and subscribe to updates and/or “Watch” the libretro project.

Beetle PSX Dynarec Update – Switch alpha port plus runahead support – lower latency than original hardware!

As an addendum to our earlier Beetle PSX dynarec post, a lot has changed in a couple of days:

  • Several important bugfixes have been pushed for ARM v7/AArch64. This should benefit Android users and ARM Linux SBC users.
  • Runahead support is finally working
  • Alpha release Switch version now available

Runahead support – lower latency than original hardware

Runahead is one of RetroArch’s crowning features as a frontend ever since we debuted this revolutionary feature in 2018.

Most systems more complex than the Atari 2600 will display a reaction to input after one or two frames have already been shown. For example, Super Mario Bros on the NES always has one frame of input latency on real hardware. Super Mario World on SNES has two frames of latency.

Runahead allows RetroArch to “remove” this baked-in latency by running multiple instances of a core simultaneously, and quickly emulating multiple frames at a time if input changes. This strategy has some analogs to frame rollback, similar to how GGPO handles high-lag connections. For it to work reliably, you need to figure out the amount of lag frames a game has, and then set Runahead frame count to that value. When done correctly, not only can you eliminate any perceptible lag, you can actually achieve latency that is quantifiably lower than when played on the original hardware [yes, with a CRT]. Surely that is the stuff of miracles on display technologies that have inherent far higher latency than any CRT screen, plus all the added additional latency of modern day PC OSes on top of the typical overhead brought about by emulators.

So up until now, runahead was not possible on Beetle PSX. Why not? For one, runahead relies on savestates. If serialization and savestates are well implemented, runahead can be used on a libretro core. Second, for runahead to run at fullspeed at any amount of frames, your hardware has to meet the demands of running a game at well above fullspeed. So the lower the FPS, the less conducive that core will be for runahead purposes even IF savestates are well implemented.

A couple of days ago, runahead would not work with the dynarec. This required some modifications. Zach Cook has since managed to fix these issues, and now we can use runahead just fine.

How to use it

First, make sure that you set Hardware Renderer to ‘Software’. Unless you are using a very high end CPU from the last two years, we recommend you leave Internal GPU Resolution at 1x. Combining the dynarec with runahead is a very CPU intensive task. You need all the headroom you can get.

To tinker with runahead settings, go to Quick Menu -> Latency.

Enable ‘Run-Ahead To Reduce Latency’, and set ‘Number of Frames To Run Ahead’ to a value of your choosing. For runahead to run as intended, this value shold match the amount of lag frames of the game.

NOTE: ‘Runahead Use Second Instance’ can give a big performance improvement. Consider always enabling it unless you have reason not to (for instance, some game related issue requires you disable it for whatever reason)

Alpha Switch version available

m4xw has done a quick port of Beetle PSX with the dynarec to the Switch port. He has already uploaded a build that you can try – you’ll have to pass the time with this core until we start deploying this on the buildbot properly.

To use this, extract the contents of this archive into /retroarch/cores/ (assuming RetroArch is already installed).

Performance has dramatically improved over interpreter before – just as an indication, Pepsi Man reportedly runs at fullspeed at 2x software rendering. But be sure to put this build through its paces yourself!