
For years, Nintendo 64 emulation has been in poor shape, lagging significantly behind Nintendo GameCube/Wii emulation. At least 90 to 95% of the remaining problems are at the level of the RDP, the N64’s video chip. By moving away from High-Level Emulation (HLE) of the RDP, we could solve most of them. The trouble is that, for a long time, it seemed impossible to do this at playable speeds. Software rendering of a GPU from this era is too slow, and older versions of OpenGL have too many crippling limitations to allow a 1:1 port of Angrylion to GL.

At last, this dire situation is about to change. In the coming days we will release to the public something that will revolutionize N64 emulation, so that we can move away from all of the hacky HLE video plugins that have been released in recent years.

The world’s first-ever low-level N64 video plugin implemented using the Vulkan API!

And not just any video plugin either. This is a reimplementation/port of Angrylion to Vulkan. For most users, this will be the first time they can get anywhere close to playable speeds with an accuracy-focused N64 video renderer.

This hardware renderer is unique for the following reasons:

  • This is the first N64 emulator project to receive Vulkan support.
  • This is the first time an emulator takes advantage of asynchronous compute (exclusive to Direct3D 12/Vulkan) for hardware rasterization of an emulated GPU.
  • This is the first time the Angrylion renderer has been ported to a graphics API, and the first time an RDP LLE video renderer for N64 has been capable of running at full speed. It marks a shift away from decades of inaccurate high-level emulation of the N64’s RDP, which made for buggy N64 emulation in general.

How to use it?

Once it is released in the coming days, this is what you will need in order to use it:

  • You will need the latest RetroArch version (either a nightly build or the upcoming 1.3.5 release). The libretro API has been updated to make asynchronous compute cores possible, which is why ‘Mupen64plus HW libretro’ will not work on any older version of RetroArch.
  • Your video card also needs to support the Vulkan graphics API.

When RetroArch 1.3.5 gets released

Download the new RetroArch 1.3.5, go to ‘Online Updater’, go to ‘Core Updater’.

From there, go to ‘Experimental’, and download Mupen64plus HW. This will download the Vulkan-enabled Mupen64plus core.

Before trying to use it, make sure your video card supports the Vulkan API; otherwise it won’t work!
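
One rough way to check this from a script is to see whether the Vulkan loader library can be opened at all. The sketch below is only an illustration under stated assumptions: the library names are the usual loader names on each platform, and the function name `vulkan_loader_present` is made up for this example. A loader being present only suggests that a Vulkan runtime is installed; it does not prove that your GPU and driver can actually run the core.

```python
import ctypes
import sys


def vulkan_loader_present() -> bool:
    """Return True if a Vulkan loader library can be opened on this machine.

    Assumption: the loader uses its conventional file name for the platform.
    This does NOT verify that a Vulkan-capable GPU or driver is available.
    """
    if sys.platform.startswith("win"):
        candidates = ["vulkan-1.dll"]
    elif sys.platform == "darwin":
        # Only present if a third-party Vulkan layer has been installed.
        candidates = ["libvulkan.1.dylib"]
    else:
        candidates = ["libvulkan.so.1", "libvulkan.so"]

    for name in candidates:
        try:
            ctypes.CDLL(name)  # raises OSError when the library is missing
            return True
        except OSError:
            continue
    return False


if __name__ == "__main__":
    print("Vulkan loader found" if vulkan_loader_present() else "No Vulkan loader found")
```

If this reports that no loader was found, updating the graphics driver is usually the first thing to try.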

Why RDP LLE? Why is this significant?

For years, Nintendo 64 emulators have fixated upon a High-Level Emulation approach to emulate the RDP, the N64’s video rasterizer. Examples include Glide64, Rice, GLN64 (and its recent fork, GlideN64).

It is a practical but imperfect way of emulating the RDP for many reasons:

  • These plugins require numerous game-specific hacks and workarounds. Maintaining them becomes a real chore, and plenty of graphical effects are still missing to this day. Examples include missing lens flares in Turok: The Dinosaur Hunter, corrupt backgrounds in Killer Instinct Gold and GoldenEye 007, fiddly auxiliary frame buffer glitches, and inaccurate approximations of graphical effects due to combiner issues.
  • Most of these HLE RDP plugins recycle a lot of old code. For instance, GlideN64 is mostly a collage of GLN64 + Glide64 code, but the code recycling goes deeper than that. Low-level triangle rasterization functions in both Glide64 and GlideN64 are borrowed from Z64 GL, an RDP plugin by Ziggy. The problem is that bugs still exist in these sections of the code. Most of the low-level rasterization functions that keep being borrowed by these high-level plugins are directly responsible for many of the remaining glitches you can see. And since the code was written by outside people who are no longer active in the scene, it seems unlikely that it will ever get fixed.
  • There are other legacy issues. The most notorious one of all is of course Glide64, which originally targeted (you guessed it) the obsolete 3Dfx graphics API Glide. We are talking GL 1.2 / 1.3-ish era here, really stone-age. An OpenGL wrapper for Glide had to be written around Glide64 to get it to run on OpenGL-capable video cards in the first place, but the wrapper code is unfortunately far from optimal. Other plugins like Z64 GL still seem to use OpenGL 1.4x-era code and lots of questionable fixed-function wrapper code.
  • Many games use custom RSP microcode for certain tasks. For instance, Rogue Squadron uses custom RSP microcode for terrain heightmap generation, while games like Resident Evil 2 and The Legend of Zelda: Ocarina of Time use the RSP for video and image decompression routines. With HLE, this usually calls for a high-level approximation of what the game expects the RSP to hand over to the RDP, along with corresponding high-level display list handling on the RDP rasterization side. Many games simply have never had their custom microcode properly reverse engineered, so the only way to play these games is to use a combination of a low-level RSP plugin and a low-level RDP renderer. Most of the existing microcode was actually handed to devs on a silver platter, and for this reason it seems the remaining microcodes will probably never be reversed.
  • Traditional GL rendering runs into pretty big bottlenecks for which no real solutions exist: frame buffer bottlenecks, depth buffer bottlenecks, and so on. More recent versions of OpenGL (4.3+) have made it possible to fix some of these issues, with better depth compare and faster, more efficient framebuffer-to-framebuffer copying, but honestly it is still a big suboptimal mess.
  • Coverage emulation is usually completely stubbed out in HLE video plugins.
  • All of these plugins have so far completely avoided trying to emulate the VI (Video Interface). The VI reads from the RDP’s frame buffer and sends it to the digital-to-analog converter to create the video output. Along the way it applies several postprocessing effects, including what appears to be 8x MSAA. Some would blame the VI for the ‘smudged’/‘smoothed out’/‘blurry’ look of many N64 games. But hey, we’re going for authentic here 🙂
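
To give an idea of what ‘coverage’ means in the last two points, here is a minimal Python sketch that counts how many sample points inside one pixel fall within a triangle. It is a generic illustration only: the 8-sample pattern and the names `edge`, `SAMPLES` and `coverage` are assumptions made for this example, and they do not reproduce the RDP’s actual sample positions or Angrylion’s code.

```python
def edge(ax, ay, bx, by, px, py):
    """Edge function: the sign tells which side of the line a->b the point p is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


# Hypothetical 8-sample pattern inside one pixel (offsets in the range [0, 1)).
SAMPLES = [
    (0.125, 0.125), (0.625, 0.125), (0.375, 0.375), (0.875, 0.375),
    (0.125, 0.625), (0.625, 0.625), (0.375, 0.875), (0.875, 0.875),
]


def coverage(tri, x, y):
    """Count how many of the 8 samples of pixel (x, y) lie inside triangle tri."""
    (ax, ay), (bx, by), (cx, cy) = tri
    count = 0
    for ox, oy in SAMPLES:
        px, py = x + ox, y + oy
        w0 = edge(ax, ay, bx, by, px, py)
        w1 = edge(bx, by, cx, cy, px, py)
        w2 = edge(cx, cy, ax, ay, px, py)
        # Accept either winding order: all edge values share the same sign.
        inside = (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)
        if inside:
            count += 1
    return count


if __name__ == "__main__":
    triangle = [(0, 0), (4, 0), (0, 4)]
    print(coverage(triangle, 0, 0))  # interior pixel
    print(coverage(triangle, 1, 2))  # pixel crossed by the diagonal edge
    print(coverage(triangle, 5, 5))  # pixel outside the triangle
```

A renderer that stubs coverage out effectively treats every touched pixel as fully covered, so a later filtering stage that depends on partial coverage has nothing to work with.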

Enter this new renderer. It takes Angrylion (the most accurate RDP rasterizer so far) as a base and uses compute shaders to move the workload from the CPU to the GPU. Angrylion is known to render nearly all games accurately, unlike the regular HLE N64 plugins. The only problem has been that it is too slow to run at full speed, because it is completely software rendered, which puts all the strain on the CPU. Doing RDP LLE on the GPU turns that around, so that this rendering bottleneck is completely gone. With RDP LLE, the only remaining bottleneck will be the interpreter RSP plugin that a low-level RDP plugin has to use.
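
The reason this kind of work suits a GPU is that a compute shader runs the same small program once per work item, grouped into workgroups. The Python sketch below simulates that structure on the CPU. It is only an illustration: the 8x8 tile size, the placeholder `shade_pixel` function and the other names are assumptions for this example, not the renderer’s actual configuration or code.

```python
def dispatch_size(width, height, tile_w=8, tile_h=8):
    """Number of workgroups needed to cover a width x height framebuffer."""
    groups_x = (width + tile_w - 1) // tile_w   # round up
    groups_y = (height + tile_h - 1) // tile_h  # round up
    return groups_x, groups_y


def shade_pixel(x, y):
    """Placeholder for the per-pixel work (hypothetical, not real RDP logic)."""
    return (x ^ y) & 0xFF


def run_dispatch(width, height, tile_w=8, tile_h=8):
    """Simulate a compute dispatch: one 'invocation' per pixel, tile by tile."""
    framebuffer = [[None] * width for _ in range(height)]
    groups_x, groups_y = dispatch_size(width, height, tile_w, tile_h)
    for group_y in range(groups_y):
        for group_x in range(groups_x):
            for local_y in range(tile_h):
                for local_x in range(tile_w):
                    x = group_x * tile_w + local_x
                    y = group_y * tile_h + local_y
                    # Edge tiles can overhang the framebuffer, so guard the
                    # write the same way a compute shader would.
                    if x >= width or y >= height:
                        continue
                    framebuffer[y][x] = shade_pixel(x, y)
    return framebuffer


if __name__ == "__main__":
    print(dispatch_size(320, 240))  # a common N64 output resolution
```

On a GPU the four nested loops disappear: every invocation runs in parallel, and each one only needs its own workgroup and local indices to find its pixel.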

Work remaining to be done

With this video renderer we have aimed for a GL 4.3 / Vulkan feature set in order to escape most of the bottlenecks and limitations that usually drag N64 emulation down. From now on, two big tasks remain:

  • We will have to port the code over to OpenGL 4.3+. Lower versions of OpenGL won’t work, as this renderer requires compute shader support.
  • With the RDP bottleneck being completely gone with this renderer, RSP has now become the main bottleneck. We will have to write a recompiler for the RSP in order to attain even better performance and reduce the RSP bottleneck as much as possible. So far, only Project64 has an RSP recompiler like this, but there are plans of using Daeken’s generic recompiler system in order to come up with something equivalent for Mupen64plus libretro.
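
Compute shaders became part of core OpenGL in version 4.3, which is where the 4.3+ requirement in the first point comes from. As a small sketch of what such a check could look like, the Python function below parses a desktop OpenGL version string of the kind returned by `glGetString(GL_VERSION)`. The function name and the example strings are made up for illustration, and the check ignores drivers that expose compute shaders through an extension on older versions.

```python
import re


def supports_compute_shaders(gl_version: str) -> bool:
    """Return True if a desktop GL version string reports OpenGL 4.3 or newer.

    Assumption: the string starts with "major.minor", as desktop OpenGL
    version strings do. Anything else is treated as unsupported.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", gl_version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (4, 3)


if __name__ == "__main__":
    for version in ["4.5.0 ExampleVendor 1.0", "4.3", "3.3.0", "not a version"]:
        print(version, "->", supports_compute_shaders(version))
```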

Asynchronous compute raymarching libretro test core


In order to make this renderer possible, extensions to the libretro API had to be added.

For educational purposes, and to serve as a proof of concept of how to make your own libretro core that takes advantage of the recently added asynchronous compute capabilities, a test core has been made, called ‘libretro-test-vulkan-async-compute’.

It is a basic test program that demonstrates raymarching being done in Vulkan. We’d very much like to see people improve upon this and collaborate to make a more impressive core out of it.

You can find the source code for this sample test core inside RetroArch’s source tree (specifically, cores/libretro-test-vulkan-async-compute).

Conclusion

It has been a long time coming, but with paraLLEl, N64 emulation can finally become ‘good enough’, and we no longer need patchwork renderer plugins that try to fix graphics issues on a per-game basis.

31 thoughts on “First ever revolutionary N64 Vulkan emulator coming soon – only for libretro (paraLLEl)”

  1. Will there be a version of this new N64 emulator for the Android version of RetroArch?

    1. Android devices that support Vulkan, like the Nvidia Shield Tablet/Console and Galaxy S7, might indeed be able to run this.

      1. Is the GPD XD also supported? Is there a list I can check?

        1. Libretro RetroArch

          July 13, 2016 — 2:26 pm

          Check with them if they support Vulkan, if not nope.

          1. Thanks for the reply. I went on a search myself. Should be supported. GPD XD has an ARM Mali-T764 gpu and RK3288 cpu.

            https://en.wikipedia.org/wiki/Vulkan_(API) says it’s supported.
            Also, checking the ARM website (http://malideveloper.arm.com/documentation/developer-guides/vulkan/), it says the T-760 is Vulkan compatible. The T-764 is based on the 760 as far as I know.

          2. I have the Galaxy Note 4 with the Mali-T760 GPU, and RA shows me that my device supports the Vulkan API, but I think the firmware does not have the driver for it, because RA crashes when I switch to the Vulkan driver 🙁

  2. Just curious, does “Metal” support this kind of feature… ?

    1. Unless Apple gets with the program and supports Vulkan or MoltenVK becomes free, I think it’s a no for now. I’m not sure though, so you may want to do more research.

  3. Isn’t ‘asynchronous compute’ the thing which sucks on Nvidia DX12 GPUs? I wonder how well it’ll perform on my GTX980 Ti.
    So Rogue Squadron 3D will finally be playable? 😉
    Oh and you guys really rock! I never thought I would see the day when N64 emulation becomes good 😀

    1. >Isn’t ‘asynchronous compute’ the thing which sucks on Nvidia DX12 GPUs?

      Nah, that’s a common misconception. Asynchronous Compute was already working back in Maxwell under DX11, because the scheduling was done at the driver level (GPC granularity for Maxwell, SM granularity for Pascal). AMD put their schedulers in the hardware, but that meant that they could not actually be USED until DX12/Vulkan. GCN leaves a bunch of silicon idle under DX11/OpenGL because of this, whereas in Maxwell/Pascal this is already almost fully utilised. Maxwell/Pascal therefore do not see significant speedups when Async Compute is turned on in DX12/Vulkan, because they were already doing that work anyway. Same situation with multithreading: Nvidia implemented the DX11 multithreading extensions whereas AMD did not (at that time they were pushing Mantle), so the threading speedup is already mostly utilised.

  4. Will the Killer Instinct arcade game benefit from this?

    1. No. Mupen emulates N64 only.

  5. Will this fix Donkey Kong 64 completely?

  6. Will the new Vulkan core be compatible with the Wii/NGC? :O
    I would love to finally be able to play N64 games without the plugin issues on my Wii!

    I have been DYING to replay my N64 library on it!

    1. of course not…
      Vulkan is only available on modern GPUs

      1. Dino René Caballero

        July 15, 2016 — 4:58 am

        we need an option to do some frameskip in old devices and android. please!

        1. How is this link related to the Wii at all?

          1. Dino René Caballero

            July 17, 2016 — 2:54 pm

            Well, I stumbled across this discussion while looking for a way to do frameskip, because I tried everything I could find in the available manuals on configuring RetroArch cores, without much success. Reading everything related to “frameskip”, I also discovered that it is something that does not please one of the heads of the project. I admire the work and effort of the team, RetroArch is incredible, but I would not want those of us with old equipment, whether computers or Android devices, to be left behind as things move forward with Vulkan, without an option that the characteristics of our hardware sadly require. I offer this as constructive criticism. I do not know if there is another method, but what @twinaphex proposed is something I would really like to see in RetroArch. In poor countries not everyone is able to access this kind of progress. For example, the Raspberry Pi is not sold in my country, and when it does arrive it costs three or four times as much.

          2. Well, unfortunately this renderer is based on Vulkan, which only works on newer graphics hardware. It wouldn’t work on the Wii.

          3. We don’t really like frameskip as a solution.
            I don’t think video rendering is really where modern devices struggle.

          4. Dino René Caballero

            July 20, 2016 — 12:00 pm

            And do you have another solution in mind? I don’t really know of another method, but what was proposed in that discussion seemed perfect because it would rely on the interface’s own refresh rate. I hope you don’t leave us behind and that you surprise us; I think the problem is even more evident on Android. I managed to emulate 80 percent of my games successfully on my MK808, and that is where I missed that option the most. I also read something that was said about “lagframes”. I am sorry the proposal was set aside, but I have faith that you will give us a surprise solution for that case. I will try to help others based on my experience with RetroArch.

          5. Dino René Caballero

            July 20, 2016 — 2:14 pm

            That is what I meant, sorry, I couldn’t remember the term. Thanks for replying!

  7. Will this completely fix all of the video issues in Rareware games?

  8. Make sure to add a toggle for the VI, because a certain group of users has spent years trying to figure out ways to disable the AA filter for sharper output/more aliasing. And they have succeeded, so I’m sure there is still a crowd of people out there who’d like that effect.

    Though, honestly, with high-res rendering it probably won’t matter.

    Keep up the amazing work. Libretro is great.

  9. I’m not sure how Vulkan helps get rid of HLE. OpenGL should be able to do the same stuff.

    1. This is using async compute capabilities. It’s using the GPU as a processor.
      It’s not the same as using the GPU to render images.

      1. Thanks for responding! Compute shaders have been in OpenGL since 4.3 though — I’m curious why it has to be async?

  10. i cant wait for MAME emulators.

  11. get n64 and playstation working pls

    1. those are working fine for most people

Comments are closed.