Libretro Cores Progress Report – February 29, 2020

Our last core progress report was on February 5, 2020. Below we detail the most significant changes made since then to the Libretro cores that we and/or our upstream partners maintain.

How to update your cores in RetroArch

There are two ways to update your cores:

a – If you have already installed the core before, you can go to Online Updater and select ‘Update Installed Cores’.

b – If you haven’t installed the core yet, go to Online Updater, ‘Core Updater’, and select the core from the list that you want to install.

Vitaquake 2

Description: Quake 2 game engine core

Vitaquake 2 is now available for the first time on 3DS and Android. It uses the software renderer on those platforms for now.

  • Bugfix: Analog input descriptors added
  • Bugfix: Actually exit on Sys_Quit and Sys_Error. They intend to exit. Not doing so only results in weird crashes later
  • Bugfix: Fix crash in soft renderer when going underwater on increased resolution
  • Port: Ported vitaQuake 2 core to 3DS (software renderer only).
  • Port: Ported vitaQuake 2 core to Android (software renderer only).

ECWolf

You’ve been able to read about the new core ECWolf in our separate blog article. Further developments since then have been a 3DS port and analog deadzone options.

  • Analog deadzone options
  • Port: Ported ECWolf to 3DS
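
As a rough illustration of what an analog deadzone option does, here is a minimal Python sketch. It is not ECWolf's code: the helper name `apply_deadzone`, the percentage parameter and the rescaling of the remaining stick travel are all assumptions made for illustration. It assumes a signed 16-bit axis value.

```python
AXIS_MAX = 0x7FFF  # largest magnitude of a signed 16-bit analog axis


def apply_deadzone(value: int, deadzone_percent: int) -> int:
    """Return 0 inside the deadzone, otherwise rescale to the full range.

    Hypothetical helper: the name, the percentage parameter and the
    rescaling step are illustrative, not taken from the core.
    """
    threshold = AXIS_MAX * deadzone_percent // 100
    magnitude = min(abs(value), AXIS_MAX)
    if magnitude <= threshold:
        return 0  # small stick drift is ignored entirely
    # Stretch the remaining travel so the output still spans 0..AXIS_MAX.
    scaled = (magnitude - threshold) * AXIS_MAX // (AXIS_MAX - threshold)
    return scaled if value > 0 else -scaled
```

With a 15% deadzone, a slightly drifting stick reading such as 1000 is reported as 0, while a fully pushed stick still reaches the maximum value.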

TIC-80

Description: TIC-80 emulator core

  • Updated to latest version

Flycast

Description: Sega Dreamcast emulator

We reported on the mipmapping accuracy improvements in two separate articles earlier this month. There have been other developments since the last progress report:

  • x64 DSP JIT support: running with the audio DSP enabled should now have far less overhead than before
  • users now cannot force enable div matching [flyinghead]
  • disable div matching for Donald Duck: Goin' Quackers [flyinghead]
  • (NAOMI) parent rom name was incorrect for GDROMs [flyinghead]
  • (NAOMI) add ringout eeprom + some crcs [barbudreadmon]

Picodrive

The fast tile-based renderer (a new addition) can give a very welcome increase in performance. Note that only the ‘accurate’ renderer will display any graphics with 32X-based games, so only use tile-based rendering for now with regular Genesis/Mega Drive games.

  • Added an option to change the renderer from the default “accurate” line-based renderer to an alternative “fast” tile-based renderer, which may improve performance on slower devices such as the PSP. [bmaupin]

gpSP

  • Fix periodic ram_translation_cache crashes
  • Add manual frame skipping

OpenLara

  • Fix Android compilation [phcoder]

    Despite what the comments say, OpenLara uses
    functions available only in GLESv3. Hence we need to
    force GLES3 and not GLES2. We then need to fix the resulting compile error,
    which was caused by targeting an Android version too old to
    include GLESv3.

prBoom

We talked about serialization support in prBoom before. The new improvement since then is that Wiggle Fix finally works properly on ARM-based platforms like Android and iOS. Previously you’d need to turn Wiggle Fix off if you didn’t want the floors and ceilings to look glitchy.

  • Implemented serialization support. Rewinding and runahead have been tested.
  • Use fixed point in precise flat calculation. This should now also work well on ARM and other platforms
  • Fix double free on failed load
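
The fixed-point change above can be pictured with a small Python sketch of Doom-style 16.16 fixed-point arithmetic. This is not the actual prBoom patch; the helper names are illustrative. The point of the technique is that all arithmetic is done on integers, so results do not depend on how a given CPU handles floating point.

```python
FRACBITS = 16
FRACUNIT = 1 << FRACBITS  # the value 1.0 in 16.16 fixed point


def to_fixed(x: float) -> int:
    """Convert a float to 16.16 fixed point (illustrative helper)."""
    return int(round(x * FRACUNIT))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two 16.16 values; the product is shifted back down."""
    # Python's >> floors on negative numbers, like an arithmetic shift.
    return (a * b) >> FRACBITS


def fixed_div(a: int, b: int) -> int:
    """Divide two 16.16 values; the dividend is pre-shifted up."""
    # Note: Python's // floors, whereas C integer division truncates.
    return (a << FRACBITS) // b
```

For example, `fixed_mul(to_fixed(1.5), to_fixed(2.0))` yields the same integer as `to_fixed(3.0)` on every platform.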

DOSBox SVN/DOSBox Core

  • Latest SVN updates

Final Burn Neo

  • Latest updates from upstream

Stella

  • Latest updates from upstream

PCSX ReARMed

The biggest improvement since the last progress report [and the articles that followed it] definitely has to be runahead second instance support.

  • [3DS] Tweak compile-time options to increase performance [justinweiss]
    Adjusting a few things in the makefile to get some easy performance boosts.

    Did some measurement with Crash Bandicoot idling on the beach he starts on in Crash 1. All of these together brought it from ~44.5 fps to ~50.9.

    Tried all four UNAI options, this was the fastest by ~0.5fps.

  • [3DS] Fix dynarec crashes [justinweiss]
    After the dynarec writes new instructions, it has to flush the
    instruction and data caches. Some of these flush operations are
    privileged on the 3DS, so the clear cache functions have to run
    through svcBackdoor. The Nintendo implementation (and CFW
    reimplementation) of svcBackdoor has a problem where interrupts and
    context switches will cause crashes.

    Even though we can disable interrupts in the flush function, there’s
    still a window of time between svcBackdoor being called and the
    function being run where an interrupt will corrupt the stack.

    Luma3DS implements a svcCustomBackdoor call we can use that also
    runs a function in supervisor mode, but uses an implementation that
    avoids this problem.

  • Fix second instance runahead [ZachCook]

Beetle PSX


Some crucial accuracy improvements to the hardware renderers mean that Gran Turismo 2 is finally glitch-free with both OpenGL and Vulkan. Before, the HUD and other parts of the screen would flicker during gameplay.

  • Separate lightrec PGXP and non-PGXP rw_func [ZachCook] –
    When tested with Soul Blade, this increases dynarec performance by ~1%
    for non-PGXP and ~2% for PGXP, as PGXP always uses rw_func due to the fast-path
    being disabled for it.

  • Add frame duping support to RSX Vulkan [ggdrt]
  • OpenGL: Eliminate redundant glClear call [ggdrt]
  • Add accurate timing macros and report noninterlaced by default
  • Vulkan: Delay VRAM framebuffer size calculation to latest possible moment in
    Vulkan renderer; don’t rely on external timing of UpdateDisplayMode.
  • Add GP1(05h) command to rsx and parallel-psx [ggdrt]
  • Vulkan: Add Display VRAM core option support [ggdrt]
  • parallel-psx/Vulkan: Decouple MDEC and SSAA filter logic [ggdrt]
  • Limit image_offset range to prevent segfault [ggdrt] –

    Also update image_offset formula to better match core option info

  • Update rsx hook sequence in GPU reset and GPU restore state [ggdrt] –

    Fixes hardware renderer glitch where a portion of the BIOS PS logo would
    briefly flicker into the bottom right corner of the viewport after the
    normal BIOS animation.

  • Make scanline core options dynamic for sw renderer [ggdrt]

Desmume 2015

  • Blank screen gap for non-hybrid modes [Exalm]
  • Account for screen gap for touch input in top/bottom mode [Exalm]

mGBA

  • 3DS: Fix build
  • Update to mGBA 0.8.1

Quasi88

  • Fix -mem_wait audio distortion

    Occurred when running in mode V1S
    due to front-loading DMA waits on
    each vertical sync.

ParaLLEl N64

We’ve run several stories on ParaLLEl N64 this month already. In addition to the changes listed here, an Angrylion VI performance enhancement has been implemented which makes the VI filtered modes have less overhead than before.

  • Fixed VI performance issue caused by false sharing

    As pointed out by Themaister, the rseed array in vi.c suffers from false sharing, since its element size is smaller than the cache line size.

    This improves VI performance by no less than 30%. [Themaister/ata4]

    P-UAE

    Description: Commodore Amiga emulator

    This core is fast becoming one of the best ways to emulate the Amiga courtesy of sonninnos.

    • Extended ZIP support [sonninnos]
    • WIIU build fix attempt + option display fix [sonninnos]
    • WHDLoad.hdf as raw files, Keylist fix, 68020 back to defaults [sonninnos]
    • Allow launching directories as hard drive images [sonninnos]
    • Save state fixes [jdgleaver]
    • Use memory-based save states (serialization support)
    • Reverted 68020 CPU shenanigans, Minor font fix, Less logging clutter [sonninnos]
    • New VKBD+statusbar fonts, RetroPad + CD32 Pad face button options [sonninnos]
    • Separate resolution & line mode core options, Automatic line mode [sonninnos]
    • Automatic resolution, Deinterlacing fixed, Statusbar fillings [sonninnos]
    • WHDLoad.hdf update [sonninnos]
    • Analog joystick device + minor reorganizing [sonninnos]
    • Resolution changing without reset [sonninnos]
    • Filesystem fixes
    • Model options for A4000, 68030+68040 [sonninnos]
    • “gl” video driver max_width fix [sonninnos]
    • Readme + core option label updates & ‘vsynctimebase’ tuning [sonninnos]
    • SuperHires video resolution & glue organizing [sonninnos]

    ScummVM

    • Update to ScummVM 2.1.1

    Opera (formerly 4DO)

    Description: 3DO emulator core

    Genesis Plus GX

    Description: Sega Genesis/Mega Drive/Master System/Game Gear emulator

    • Android: Add OPLL sound core

    FCEUmm

    Description: NES emulator core

    NOTE: All changes courtesy of negativeexponent (unless expressly attributed to somebody else)

    • Do overscan cropping after the NTSC frame is generated. This allows
      removing left and right screen artifacts when using horizontal crop, similar to
      what you would see on a TV when adjusting horizontal size (HSIZE).
    • Add core option that implements NTSC scanline effects. This requires
      the framebuffer to double in height, so a define is added (NTSC_SCANLINES at
      the top of libretro.c) to disable this on some low-memory platforms or just disable NTSC filters altogether.
    • Mapper 235 update support for 1MB/2MB carts and cart with unrom block
    • CNROM: Minor database cleanup
    • Use existing crc32 database for Bingo 75 (Asia) (Unl), this moves checksum check on mapper side instead of unif overrides.
    • m269: Deallocate memory on exit
    • Update mapper 15 based on latest notes
      – bit 7 acts as A13 only on mode 2
      – prg mask now 0x3F of data latch etc.
      – unrolled loops
    • Merge mappers 225 and 255 since they are basically the same
    • re-implement extra RAM as it's required for some multicarts
    • Fix colour emphasis support for NTSC filter & raw palette + nes decoder shader

      – this implementation is based on a more accurate colour emphasis from fceux.

      – The raw palette + nes decoder shader was kinda incomplete since the implementation was based on
      a per-frame basis, which means that the emphasis bits were read once and applied to the whole frame. This
      means that some games that use per-pixel or per-scanline emphasis would not appear correct. The more
      accurate implementation reads emphasis bits from bits 5-7 of PPU[1] and saves this info in a separate frame.

      – The same implementation is also used to fix emphasis for the ntsc filter.

    • Fix palette for vs. system
    • use fixed value for ntsc width and when it's cropped
    • Fix colour emphasis when using palette presets –

      This is an issue I found when implementing the NTSC filters. Currently when using palette presets,
      no colour emphasis is seen on games that support it. This is caused by the palette table being
      filled with the same 64*3 colour info.

      Using a custom palette (palette file) and the default both create the emphasis as expected. So internally,
      fceumm is able to support such a feature.

      The fix is to generate the base palette (64 * 3 with each triplet representing rgb colour), and then
      send this internally using FCEUI_SetPaletteArray() to fill the colour table with emphasis instead.

    • (3DS) Disable NTSC filter [jdgleaver]
    • (Android) Add NES NTSC filters
    • Implement blargg NTSC filters –

      – this implements blargg’s nes ntsc filters using core options
      – an optional height doubling is also added but disabled for performance reasons (might make that optional as a core option)
      – since PS2 and PSP have their own blitter branches, these platforms do not have the ntsc filters since I don't have the means to test on those systems.
      – compile with HAVE_NTSC=1 to have these options; HAVE_NTSC=0 disables the filter, including its core options
      – HAVE_NTSC=1 is set as default, other than PS2 and PSP as stated above.

    • Add mapper 218
    • m227: Implement chr-ram write protect
    • Add UNIF BMC-WX-KB4K (m134) to supported boards
    • Update implementation of some unlicensed mappers

    • Update mapper 91
    • Add mappers 111, 356, 269, 353
    • backport mapper 111 (Cheapocabra or GTROM) from fceux
    • Add BMC-SB-5013 (m359) and UNL-82112C (m540)
    • Add mapper 543, 550, 516
    • MMC1: Better work ram and battery saves support for size greater than 8K
    • Add mappers 382, 534, 539
    • Update mapper 150/243 (unif UNL-Sachen-74LS374N)
    • Update mapper 45 –
      – fix starting register value
      – fix memory write range to 0x6000-0xffff
    • Add mappers 360, 533
    • Minimize core options shown by default
    • updates to region-related settings and overclocking, UNIF now sets ines mapper if available (used for overrides etc)
    • Unif: Pass iNES mapper number to cart struct when available
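
The per-pixel colour emphasis fix described in the list above can be sketched roughly as follows. This is a simplified Python illustration, not FCEUmm's code: the function names and the frame layout are assumptions. It shows the emphasis bits (bits 5-7 of the PPU[1] value) being captured for every pixel into a separate buffer instead of once per frame, so a mid-frame change is preserved.

```python
def emphasis_bits(ppu1: int) -> int:
    """Extract the 3 emphasis bits (bits 5-7) from a PPU[1] value."""
    return (ppu1 >> 5) & 0x07


def build_frame(scanlines):
    """Build a colour frame plus a separate per-pixel emphasis frame.

    scanlines: iterable of (pixels, ppu1) pairs, where ppu1 is the
    register value in effect while that line was drawn (an assumed,
    simplified per-scanline model).
    """
    colour_frame, emphasis_frame = [], []
    for pixels, ppu1 in scanlines:
        emph = emphasis_bits(ppu1)
        # Keep only the 6-bit palette index in the colour frame.
        colour_frame.append([p & 0x3F for p in pixels])
        # Store the emphasis alongside, one entry per pixel.
        emphasis_frame.append([emph] * len(pixels))
    return colour_frame, emphasis_frame
```

A later stage (the NTSC filter or a decoder shader) can then combine each palette index with its own emphasis value rather than a single value for the whole frame.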

    Gambatte

    Description: Game Boy/Game Boy Color emulator

    • Add optional LCD ghosting effect –
      This backports the LCD ghosting effects that were recently added to mGBA

      It replaces the existing Mix Frames core option with Interframe Blending. The old Accurate and Fast frame mixing settings have been renamed to Simple (Accurate) and Simple (Fast) – these perform the same 50:50 mix of the current and previous frames as before, required to achieve correct rendering of games that rely on LCD ghosting for transparency effects.

      In addition to these settings, there are now LCD Ghosting (Accurate) and LCD Ghosting (Fast) options. The former recreates the LCD response effect of RetroArch’s Gameboy Shader. The latter is similar, but uses a single accumulation buffer – which is more efficient, but lacks the subtlety of the shader implementation.

      Here are some stats showing the typical increase in performance overheads when using the various methods:

      Simple (Accurate): 30%
      Simple (Fast): 13%
      LCD Ghosting (Accurate): 48%
      LCD Ghosting (Fast): 28%

    • Do not add pointers to memory map if they are not available –

      skips sram when no sram is available
      skips extra RAM banks if not running in GBC mode

    • tvOS support
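
To make the Interframe Blending modes described above more concrete, here is a minimal Python sketch of the two ideas: a 50:50 mix of the current and previous frame, and a single accumulation buffer for ghosting. It is not Gambatte's implementation; the names and the `persistence` weight are assumptions chosen for illustration, and frames are modelled as flat lists of 0-255 channel values.

```python
def mix_simple(curr, prev):
    """50:50 mix of the current and previous frame."""
    return [(c + p) // 2 for c, p in zip(curr, prev)]


class GhostingFast:
    """Ghosting with one accumulation buffer (illustrative only)."""

    def __init__(self, size, persistence=0.5):
        self.acc = [0.0] * size
        self.persistence = persistence  # assumed weight of the old image

    def blend(self, curr):
        k = self.persistence
        # The old image fades out gradually instead of vanishing at once.
        self.acc = [k * a + (1.0 - k) * c for a, c in zip(self.acc, curr)]
        return [int(round(a)) for a in self.acc]
```

With the simple mix, a sprite drawn only on every other frame appears at half intensity, which is how games that rely on LCD ghosting achieve transparency. With the accumulation buffer, a pixel that turns off decays over several frames.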

    Pokemini

    • Safely power off console before closing content [jdgleaver] –
      Thanks to the efforts of @Sanaki, it was discovered that certain games do not correctly update their EEPROM save data unless the virtual Pokemon Mini console is powered off before closing content. Affected titles include Pokemon Pinball Mini and Pokemon Race Mini – up until now, it was generally thought that saving was completely broken in these games.

      This PR adds an automatic power off event to retro_unload_game(). This ensures saves are always written without any special intervention from the user.

      The issue has existed forever, and I believe it affects the standalone version of PokeMini as well – great to finally get this fixed!

    EasyRPG

    • Fix Android build

    Kronos

    Description: Sega Saturn emulator based on Yabause

    • Latest updates by F-Care

    VICE

    Description: Commodore 64 home computer emulator

    Statusbar stays at the screen border + new position options: Top or Bottom

    RetroPad face button options: Rotate, Jump, Rotate+Jump

    Mouse type core option, usable with D-Pad + left analog + mouse. Covered at least:

    • Arkanoid (Paddles port 1)
    • Arkanoid 2 (1351 port 1)
    • Maniac Mansion Mercury (1351 port 2)

    Speeds are vastly different in Arkanoid & Maniac Mansion, so it was impossible to have both cases perfect with default options. VICE allows only one potentiometer device at a time, therefore it applies to the current RetroPad port. Pretty much everything else is familiar stuff from the PUAE core.

    Bonus:

    Defaulted ReSID sampling to “Fast” on low-power platforms
    Minor reorganizing

    • Mouse support [sonninnos]
    • RetroPad face button options, More statusbar options [sonninnos]
    • New fonts + cleanups [sonninnos]

    Final Burn Neo

    Description: Multi-system arcade emulator core

    • Latest updates

Flycast Progress Report – Mip-mapping properly implemented now on both Vulkan and OpenGL



Some significant improvements have been made to the Dreamcast emulator core Flycast by flyinghead that serve to increase the graphics accuracy.

For one, the OpenGL renderer has mip-mapping support now for the first time. Second, all texture mipmap levels are now uploaded to the GPU. The Vulkan renderer no longer auto-generates mip-maps and instead uses the proper mipmap levels. What happened before is that these auto-generated mipmaps would ‘bleed’, whereas the ones provided by the game do not.

This caused issues with games like Railroad Tycoon 2 where the beach texture was not properly displayed.

Before the fix, you’d have weird magenta colors on the beaches with the Vulkan renderer. This has now been resolved.

To learn more about texture bleeding, you can read this StackExchange article here.

Another example of what texture bleeding looks like can be seen below (note that this is now also fixed). Previously, texture bleeding would regularly occur at upscaled resolutions:

NFL 2K2 – texture bleeding issue when upscaling – how it looked before

Many of these issues have now been fixed. The picture below shows how the title screen above looks now:

NFL 2K2 – texture bleeding issues fixed

What mip-mapping looks like on Dreamcast

To best illustrate the effect mip-mapping has on the video output quality, let us show some comparison pictures at the Dreamcast’s native 640×480 resolution.

The picture down below shows Soul Calibur running with mip-mapping disabled:

Soul Calibur on Dreamcast with mip-mapping disabled

The picture down below shows Soul Calibur running with mip-mapping enabled:

Soul Calibur on Dreamcast with mip-mapping enabled

At higher resolutions you will notice the blurriness starting to gradually disappear.

Combining mip-mapping with anisotropic filtering

The Vulkan renderer allows you to apply Anisotropic filtering (AF) to the textures (to learn more about it, read the Wikipedia page here).

Soul Calibur on Dreamcast with mip-mapping enabled and 16 x AF filtering

While you can still make out some of the far textures in the distance being slightly blurry as a result of the mip-mapping, overall the blurriness is significantly reduced by the aggressive 16x AF filtering being applied here, even at a very low resolution of 640×480.

Even further tweaking possible with PowerVR postprocessing filter

You don’t have to stop at mipmapping and AF filtering of course. You can also take advantage of leilei’s PowerVR post processing filters on top to further enhance the authenticity of the picture.

NOTE: For now, the PowerVR postprocessing filters only work with OpenGL. Vulkan support will arrive later.

Soul Calibur on Dreamcast with mip-mapping enabled and PowerVR postprocessing filter

Here we have mip-mapping enabled and the PowerVR 2 postprocessing filter enabled. NOTE: Because this is the OpenGL renderer, anisotropic filtering is not available right now, so we cannot show you a picture of what the PVR2 postprocessing looks like in conjunction with 16x AF. Neither can we show a similar picture on Vulkan right now because the aforementioned PVR2 postprocessing filters are not available there. Hopefully both renderers can be at feature parity soon in these departments.

How to get it

There are two ways to update your Flycast core. Start up RetroArch first –

a – If you have already installed the core before, you can go to Online Updater and select ‘Update Installed Cores’.

b – If you haven’t installed the core yet, go to Online Updater, ‘Core Updater’, and select ‘Flycast’ from the list. It will then download and install this core.

Flycast world’s first Dreamcast emulator to receive Vulkan renderer – available later today on RetroArch with nightly core!

The first Dreamcast emulator ever to get a Vulkan renderer. Completely open-source, written from scratch, and available later today on RetroArch. Update your core later today to get the latest version with the Vulkan renderer! Available for Android, Windows, and Linux.

For more information, read down below…

Wait … a new what?

The renderer is the emulator component that emulates the Dreamcast/Naomi GPU chip, namely the PowerVR Series2. It was one of the first generations of 3D chips, with only a fixed pipeline. The PowerVR2 supported DirectX 6.0, which was the graphics API used by Windows CE games on the Dreamcast. Successors of the PowerVR2 would later be found in the original iPhone and iPod Touch (PowerVR4), iPhone 4 and iPad (PowerVR5) and many, many other mobile devices.

Now the Dreamcast GPU is more than 20 years old. You might think it should be easy to emulate such an ancient chip on modern hardware, right? Well … yes, for the most part. But there’s one thing that the PVR2 does really well, and that is order-independent transparency. Even today this is not trivial to implement on modern hardware. You won’t find this feature in OpenGL or DirectX, and you need a pretty recent version of these APIs to be able to emulate it. Emulating it means manually sorting individual pixels from back to front and blending them together, and doing this for each visible pixel on the screen!

OK, but what about Vulkan?

For those of you who are not familiar with Vulkan, it is a relatively new 3D graphics API, basically a follow-on to OpenGL. OpenGL is quite permissive and has few declarative constraints. You just throw stuff at the driver when you need to and the driver’s job is to figure it out. The downside of this is that the OpenGL driver often needs to guess what you’ll do next, and it might not guess right. When it doesn’t, performance suffers. Vulkan is radically different in that everything must be declared in advance, in great detail, and there’s very little room for improvisation on the part of the driver. Vulkan works much closer to the hardware than OpenGL does. So you can expect less overhead, more reliability and better performance in many cases.

The downside of Vulkan is the sheer amount of code you have to write to display just a single triangle on the screen, let alone a full-featured Dreamcast renderer. Last time I checked, the Vulkan renderer had 47 source files and around 7800 lines of code. (The OpenGL renderer only has around 6000 lines of code.)

So what do we get?

As with OpenGL, there are actually two Vulkan renderers. The first one uses a traditional single render pass with per-triangle or per-mesh sorting done by the CPU. The second one is capable of order-independent transparency with per-pixel sorting performed by the GPU. It uses multiple subpasses to compose the final image: the first subpass draws the opaque geometry depth map and the shadows cast on it. The second subpass renders all opaque geometry to a temporary color framebuffer, and transparent geometry into a huge pixel linked list. The last subpass then renders shadow volumes for translucent geometry. Finally, all pixels are sorted and blended together using the opaque framebuffer of the previous subpass as background.
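
The final sort-and-blend step can be illustrated on the CPU with a small Python sketch. This is only a conceptual model, not Flycast's shader code: it handles a single pixel, assumes larger depth values are farther away, and uses plain "over" alpha blending, whereas the real hardware supports more blend modes.

```python
def resolve_pixel(opaque_rgb, fragments):
    """Blend translucent fragments over the opaque colour, back to front.

    fragments: list of (depth, (r, g, b), alpha) tuples collected for
    this pixel in arbitrary draw order; colours and alpha are 0..1.
    Larger depth = farther away (an assumed convention).
    """
    colour = list(opaque_rgb)
    # Sorting per pixel is what makes the result order-independent.
    back_to_front = sorted(fragments, key=lambda f: f[0], reverse=True)
    for _depth, rgb, alpha in back_to_front:
        colour = [src * alpha + dst * (1.0 - alpha)
                  for src, dst in zip(rgb, colour)]
    return tuple(colour)
```

Because the fragments are sorted for each pixel before blending, submitting the same polygons in a different order gives the same colour. A per-triangle sort cannot guarantee this when polygons intersect or overlap in complex ways.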

The next Flycast nightly build will have support for Vulkan on all major platforms: Windows, Linux and Android. In terms of features, the new renderer should be on par with the OpenGL renderer, with the notable exception of lightgun crosshair and VMU screens display, which will be added soon. However, expect to find bugs and crashes here and there, as is expected with any new piece of software. Also, it may be slower than OpenGL depending on many factors such as GPU, driver version, game being played, etc. We’ll do our best to fix any issue encountered and overcome performance issues. When reporting problems, make sure to indicate what GPU you’re using and the Vulkan driver version. It is highly recommended to upgrade your drivers to the latest version available, especially on mobile.

Here is a showcase of the differences between the basic and OIT renderers. By the way, this also applies to OpenGL.

Here the hair of these ladies shows glitching triangles in basic mode.

In Speed Devils 2, the shadow volumes (called “Modifier Volumes” in Dreamcast literature) are used in a special way to project headlights. This is only possible by using deferred rendering.

In this example, look at Ryo’s cast shadow on his left. There is a fog effect applied to this scene, but the basic single pass renderer cannot apply a fog effect to the cast shadow. In the OIT renderer, the shadow is perfectly fogged.

In Jet Set Radio, the character is composed of translucent polygons, and these polygons can be shadowed as well. Only the OIT renderer can properly render shadows cast on translucent polygons.

To finish, here is another seldom-used GPU feature: the secondary accumulation buffer. It can be used to do tri-linear filtering and other effects. This is Evil Dead – Hail to the King, and it is clear that the basic renderer is having a hard time here.

Final thoughts

Yes, the per-pixel alpha transparency option, which until now was only available on Windows and Linux, now also works on Android with the Vulkan renderer. However, keep in mind that per-pixel alpha sorting is heavily memory bandwidth-limited. It has been tested on a Mali G76 (Samsung Galaxy S10+), where it runs acceptably at 640×480 or 800×600 resolution. Your mileage may vary depending on the GPU power inside your Android phone. We recommend finding the sweet spot that works best for you, and if results are too bad with per-pixel alpha enabled, switching back to per-triangle.

Some clear advantages of the Vulkan renderer are that frame pacing is much better than with the OpenGL renderer, and performance is far higher when it comes to texture uploads and/or framebuffer manipulation. For example, when you KO an opponent in Dead Or Alive 2 against an explosive wall, the framerate would often tumble a bit on GL, but there are no such issues with Vulkan. Similar improvements can be noticed in Virtua Tennis 2: when certain framebuffer effects happen after a replay, performance is much more steady with Vulkan thanks to the high degree of parallelism.

With Vulkan, we have heard reports that virtually all sound crackles and stutters are gone. That’s because with Vulkan you choose the sync points where you wait, whereas in GL the driver has to guess and sometimes it fails. These effects use render to texture, and with OpenGL this creates sync issues.