Stream Raspberry Pi games to the GBA via the Link Cable

Overview

gba-remote-play

video-only.mp4

This software streams games from a Raspberry Pi to a Game Boy Advance through its Link Port. Video and audio are compressed and sent to the GBA in real time, while the GBA responds with its current input, so you can play games from any platform on the GBA (hence, Remote Play).

Features:

  • Plays any game using RetroPie on the GBA!
  • 120x80 pixels of power!
  • ~60fps using the default display mode
  • Retro scanlines 😎
  • Experimental audio support!
  • Crashes on the GB Micro! (yep, that's a feature)

Created by [r]labs.

Check out my other GBA projects!

GBA Jam 2021

All this code was made during the GBA Jam 2021. Since this project doesn't fit well into the jam (as it requires external hardware), there's a Demo available in the Releases section where one GBA sends a video with audio to another GBA via Link Cable.

Here's a video of it:

gba-jam-demo.mp4

The code of that demo is in the #gba-jam branch.

Demo with audio

video-and-audio.mp4

How it works

⚠️ This section will talk about implementation details. For setup instructions, scroll down to Setup! ⚠️

Basically, there are two programs:

  • On the GBA, a ROM that receives data.
  • On the RPI, a program that collects and sends data.

The ROM is sent to the GBA by using the multiboot protocol, which allows small programs to be sent via Link Cable. No cartridge is required.

Serial communication

Communication is done through a GBA's Link Cable, soldered to the Raspberry Pi's pins.

GBA Link Cable's pinout

Communication modes

The GBA supports several serial communication modes. Depending on what mode you use, the pins behave differently. The most common ones are:

  • Normal Mode: It's essentially SPI mode 3, but they call it "Normal mode" here. The transfer rate can be either 256Kbit/s or 2Mbit/s, and packets can be 8-bit or 32-bit.
  • Multiplayer Mode: What games normally use for multiplayer with up to 4 simultaneous GBAs. The maximum transfer rate is 115200bps and packets are always 16-bit.
  • General Purpose Input/Output: Classic GPIO, used for controlling LEDs, rumble motors, and that kind of stuff.

To have a decent frame rate, this project uses the maximum available speed: that's Normal Mode at 2Mbps, with 32-bit transfers.

Normal Mode / SPI

SPI is a synchronous protocol supported by hardware in many devices, that allows full-duplex transmission. There's a master and a slave, and when the master issues a clock cycle, the two devices send data to each other (one bit at a time).

SPI cycle

This is what happens on an SPI cycle. Both devices use shift registers to move the bits of data circularly. You can read more about the data transmission protocol here.
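To make the shift-register idea concrete, here's a tiny simulation of one SPI transfer. It's only a sketch: the function name is made up, and sending the most significant bit first is an assumption of this example.

```python
from typing import Tuple


def spi_exchange(master_value: int, slave_value: int, bits: int = 32) -> Tuple[int, int]:
    """Simulate one full-duplex SPI transfer between two shift registers."""
    mask = (1 << bits) - 1
    master, slave = master_value & mask, slave_value & mask
    for _ in range(bits):  # one iteration per clock cycle issued by the master
        # Each device shifts out its most significant bit (assumed bit order)...
        mosi = (master >> (bits - 1)) & 1
        miso = (slave >> (bits - 1)) & 1
        # ...and shifts in the bit that the other side just sent.
        master = ((master << 1) & mask) | miso
        slave = ((slave << 1) & mask) | mosi
    return master, slave
```

After 32 clock cycles the two registers have swapped contents, which is why every packet sent by the master also yields a packet from the slave.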

The GBA can work either as master or as slave, but the Raspberry Pi only works as master. So, the Raspberry Pi controls the clock.

As for the connection, only 4 pins are required for the transmission: CLK (clock), MOSI (master out, slave in), MISO (master in, slave out), and GND (ground).

  • On the GBA, these are Pin 5 (SC), Pin 3 (SI), Pin 2 (SO), and Pin 6 (GND).
  • On the RPI, these are GPIO 11 (SPI0 SCLK), GPIO 10 (SPI0 MOSI), GPIO 9 (SPI0 MISO), and one of its multiple GND pins.

GBA <-> Raspberry Pi connection diagram

Some peculiarities about GBA's Normal Mode:

  • When linking two GBAs, you need to use a GBC Link Cable. If you use a GBA one, the communication will be one-way: the slave will receive data but the master will receive zeroes.
  • Communication at 2Mbps is only reliable when using very short wires, as it's intended for special expansion hardware. Or so they say, I've tested it with a long cable and it's not "unreliable", just slower 🤷‍♂️


Reaching the maximum speed

In my tests with a Raspberry Pi 3, the maximum transfer rates I was able to achieve were:

  • Bidirectional: 1.6Mbps. From here, the Raspberry Pi starts receiving garbage from the GBA.
  • One-way: 2.56Mbps. Crank this up, and nothing good will happen.

One-way transfers are fine in this case, because we only care about input and some sync packets from the GBA. That means the code is constantly switching between two frequencies depending on whether it needs a response or not.

In all cases the Raspberry Pi has to wait a small number of microseconds to let the poor GBA's CPU rest.

Speed benchmark

The first dot means 40000 packets/second and each extra dot adds 5000 more. At maximum speed they should be all green. The one at the right indicates if we're free of corrupted packets. If it's red, adjust!


MISO waits

In classic SPI, the master blindly issues clock cycles and it's the slave's responsibility to catch up and process all packets on time. But here, the GBA is sometimes very busy doing things like putting pixels on screen, so it needs a way to tell the master to stop.

As recommended in the GBA manual, the slave can put MISO on HIGH when it's idle, and master can read its value as a GPIO input pin and wait to send until it's LOW.

Pls don't send me anything
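The waiting logic on the master side could look roughly like this. It's a sketch, not the project's code: `read_miso` is a hypothetical callback that returns the current level of the MISO pin when read as a GPIO input, and the timeout is an arbitrary value picked for the example.

```python
import time
from typing import Callable


def wait_until_ready(read_miso: Callable[[], int], timeout_s: float = 0.01) -> bool:
    """Poll MISO until the slave pulls it LOW, or give up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while read_miso() != 0:  # HIGH: the GBA is asking us to hold off
        if time.monotonic() >= deadline:
            return False  # still HIGH after the timeout, let the caller decide
    return True  # LOW: it's safe to send the next packet
```

Passing the pin reader as a parameter keeps the sketch independent of any particular GPIO library.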

Video

Reading screen pixels

First, we need to configure Raspbian to use a frame buffer size that matches the GBA's resolution: 240x160. There are two properties called framebuffer_width and framebuffer_height inside /boot/config.txt that let us change this.

Linux can provide all the pixel data shown on the screen (frame buffers) in devfiles like /dev/fb0. That works well when using desktop applications, but not for fullscreen games that use OpenGL -for example-, since they talk directly to the Raspberry Pi's GPU. So, to gather the colors no matter what application is running, we use the dispmanx API (calling vc_dispmanx_snapshot(...) once per frame), which provides us a nice RGBA32 pixel matrix with all the screen data.

Here's one of the many ways of reading the frame buffer wrong


Drawing on the GBA screen

Instead of RGBA32, the GBA understands RGB555 (or 15bpp color), which means 5 bits for red, 5 for green, and 5 for blue, with no alpha channel. Red sits in the five least significant bits, so on this little-endian system it comes first in memory.
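As a quick illustration, this is how 8-bit channels can be packed into a 15-bit color. The function name is made up for this example, and it simply drops the 3 lowest bits of each channel (no rounding or dithering).

```python
def rgb888_to_rgb555(r: int, g: int, b: int) -> int:
    """Pack 8-bit R, G, B channels into a 15-bit color (alpha is ignored)."""
    # Red goes in bits 0-4, green in bits 5-9, blue in bits 10-14.
    return (r >> 3) | ((g >> 3) << 5) | ((b >> 3) << 10)
```

For example, pure red (255, 0, 0) becomes 0x001F and white becomes 0x7FFF.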

To draw those colors on the screen, it supports 3 different bitmap modes. For this project, I used mode 4, where each pixel is an 8-bit reference to a palette of 256 15bpp colors. The only consideration to have when using mode 4 is that VRAM doesn't support 8-bit writes, so you have to read first what's on the address to rewrite the complete halfword/word.
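The read-modify-write trick for 8-bit pixels can be sketched like this. Here `vram` is just a Python list of 16-bit values standing in for video memory, and the function name is hypothetical.

```python
def write_pixel8(vram: list, x: int, y: int, color_index: int, width: int = 240) -> None:
    """Set one 8-bit pixel in a buffer that only accepts 16-bit writes."""
    offset = y * width + x        # byte offset of the pixel
    halfword = vram[offset >> 1]  # read the 16-bit value that contains it
    if offset & 1:
        # Odd byte: it's the high half of a little-endian halfword.
        halfword = (halfword & 0x00FF) | ((color_index & 0xFF) << 8)
    else:
        halfword = (halfword & 0xFF00) | (color_index & 0xFF)
    vram[offset >> 1] = halfword  # write the complete halfword back
```

In practice it's cheaper to build two (or four) pixels at once and write them in a single store, so the read step is only needed when a single pixel changes.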

15bpp color representation


Color quantization

So, the Raspberry Pi has to quantize every frame to a 256-color palette. In an initial iteration, I was using a quantization library that generated an optimal palette for each frame. Though that's the best regarding image quality, it was too slow. The implemented solution ended up using a fixed palette (this one in particular) and approximating every color to a byte that references one of the palette's colors.

Original image

Quantized image

To approximate colors faster, when running the code for the first time, it creates a 16MB lookup table called "palette cache" with all the possible color conversions. It's 16MB because there are 2^24 possible colors and each palette index is one byte.
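Here's a scaled-down sketch of that idea. The names, the squared-distance metric, and the `bits` parameter are assumptions of this example: with bits=8 the table has 2^24 one-byte entries (16MB), but pure Python would take a long time to fill it, so smaller values are handy for experimenting.

```python
def nearest_index(color, palette):
    """Index of the palette entry closest to color (squared RGB distance)."""
    r, g, b = color
    return min(
        range(len(palette)),
        key=lambda i: (palette[i][0] - r) ** 2
        + (palette[i][1] - g) ** 2
        + (palette[i][2] - b) ** 2,
    )


def build_palette_cache(palette, bits=8):
    """Precompute the palette index for every color with `bits` bits per channel."""
    levels = 1 << bits
    scale = 255 / (levels - 1)  # maps a reduced channel value back to 0..255
    cache = bytearray(levels ** 3)  # one byte per possible color
    for r in range(levels):
        for g in range(levels):
            for b in range(levels):
                color = (round(r * scale), round(g * scale), round(b * scale))
                cache[(r << (2 * bits)) | (g << bits) | b] = nearest_index(color, palette)
    return cache
```

Once the table exists, quantizing a pixel is a single array lookup instead of a search through 256 colors.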


Scaling

The frame buffer is 240x160 but what's sent to the GBA is configurable, so if you prefer a killer frame rate over detail you can send 120x80 and use the mosaic effect to scale the image so it fills the entire screen. Or, if you like old CRTs, you could send 240x80 and draw artificial scanlines between each actual line.

The Raspberry Pi discards every pixel whose index is not a multiple of the drawing scale. For example, with a 2x width scale factor, it discards the odd columns and the resulting width is 120 instead of 240.

At the time of rendering, you have to take this into account because GBA's mode 4 expects a 240x160 pixel matrix. If you give it less, you'd only fill a part of the screen.

No scaling

2x mosaic

Scanlines

Here are 3 ways of scaling the same 120x80 clip.
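The pixel-discarding step can be sketched in a few lines. The frame here is a flat, row-major list of pixels, and the function and parameter names are made up for this example.

```python
def downscale(frame, width, height, scale_x=2, scale_y=2):
    """Keep only the pixels whose x and y indexes are multiples of the scale."""
    return [
        frame[y * width + x]
        for y in range(0, height, scale_y)
        for x in range(0, width, scale_x)
    ]
```

With scale_x=2 and scale_y=2 a 240x160 frame becomes 120x80, and with scale_x=1 and scale_y=2 it becomes the 240x80 image used for scanlines.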


Image compression

Temporal diffs

The code only sends the pixels that changed since the previous frame, and what "changed" means can be configured: there's a DIFF_THRESHOLD parameter in the configuration file that controls how far a color must be from the previous one for the pixel to be refreshed.

At the compression stage, it creates a bit array where 1 means that a pixel did change, and 0 that it didn't. Then, it sends that array + the pixels with '1'.

Example of a 13x1 diff array
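A minimal sketch of the diff stage could look like this. The distance function (a plain difference between values) and the bit order inside each byte are assumptions of this example, not necessarily what the real code does.

```python
def temporal_diff(previous, current, threshold=0, distance=lambda a, b: abs(a - b)):
    """Return (diff_bits, changed_pixels) for two frames of the same size."""
    bits = bytearray((len(current) + 7) // 8)  # one bit per pixel
    changed = bytearray()
    for i, (old, new) in enumerate(zip(previous, current)):
        if distance(old, new) > threshold:
            bits[i // 8] |= 1 << (i % 8)  # mark pixel i as changed
            changed.append(new)           # and queue its new value
    return bytes(bits), bytes(changed)
```

A higher threshold means fewer pixels are refreshed, which trades image accuracy for frame rate.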


Run-length encoding

The resulting buffer of the temporal compression is run-length encoded.

When using palette images, it's highly likely that there are consecutive pixels with the same color. Or, for example, during screen transitions where all pixels are black, instead of sending N black pixels (N bytes) we can send 1 byte for N and then the black color (2 bytes). That's RLE.

However, RLE doesn't always make things better: it can sometimes produce a longer buffer than the original one because it has to add the "count" byte for every payload byte. For that reason, the encoding is made of two stages, and it only applies RLE if it helps compress the data. Then, the frame's metadata stores a bit that indicates whether the payload is RLE'd or not.

Encoding the compressed buffer
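The "only use RLE if it helps" decision can be sketched as follows. The (count, value) byte layout and the 255 cap per run are assumptions of this example.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def compress(data: bytes):
    """Return (is_rle, payload), keeping the raw data when RLE doesn't shrink it."""
    encoded = rle_encode(data)
    if len(encoded) < len(data):
        return True, encoded
    return False, data
```

The boolean returned by compress plays the role of the compressed flag stored in the frame's metadata.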


Trimming the diffs

For a render resolution of 120x80, the bit array would be 120x80/8 = 1200 bytes. That's a lot to transfer every frame, so it only sends the chunk from the first '1' to the last '1', rounded out to whole 32-bit packets.

                                                                v startPacket                   v endPacket
PACKET 0                        PACKET 1                        PACKET 2                        PACKET 3                        PACKET 4                        PACKET 5
BYTE 0  BYTE 1  BYTE 2  BYTE 3  BYTE 4  BYTE 5  BYTE 6  BYTE 7  BYTE 8  BYTE 9  BYTE 10 BYTE 11 BYTE 12 BYTE 13 BYTE 14 BYTE 15 BYTE 16 BYTE 17 BYTE 18 BYTE 19 BYTE 20 BYTE 21 BYTE 22 BYTE 23
00000000000000000000000000000000000000000000000000000000000000000000000000100100110010000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
                                                                          ^ startPixel           ^endPixel
                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ packetsToSend
																
// 192x1 screen (24-byte diff array), integer divisions
maxPackets = 6
startPixel = 74
startPacket = startPixel / 8 / 4 = 2
endPixel = 97
endPacket = endPixel / 8 / 4 = 3
packetsToSend = endPacket - startPacket + 1 = 2
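The same calculation, written as a small function you can play with. It takes one 0/1 entry per pixel instead of a packed bit array, and the function name is made up for this example.

```python
def trim_range(changed, bits_per_packet=32):
    """Return (start_packet, end_packet, packet_count) covering every changed pixel."""
    if 1 not in changed:
        return None  # nothing changed, so there's nothing to send
    start_pixel = changed.index(1)
    end_pixel = len(changed) - 1 - changed[::-1].index(1)
    start_packet = start_pixel // bits_per_packet  # same as pixel / 8 / 4
    end_packet = end_pixel // bits_per_packet
    return start_packet, end_packet, end_packet - start_packet + 1
```

With the first change at pixel 74 and the last one at pixel 97, it returns packets 2 to 3, so only 2 of the 6 packets travel through the cable.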

Input

Each frame, the GBA sends its pressed keys to the Raspberry Pi. It does so by reading REG_KEYINPUT and transferring it on the initial metadata exchange.

Bits are set when a key is **not** pressed. Weird design!
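Decoding that register on the Raspberry Pi side is just a matter of inverting the lower 10 bits. The key order below follows the GBA's KEYINPUT bit layout (bit 0 is A, bit 9 is L); the function name is made up for this example.

```python
GBA_KEYS = ["A", "B", "SELECT", "START", "RIGHT", "LEFT", "UP", "DOWN", "R", "L"]


def pressed_keys(keyinput: int):
    """Return the names of the pressed keys from a raw REG_KEYINPUT value."""
    pressed = ~keyinput & 0x03FF  # a 0 bit means pressed, so invert the 10 key bits
    return [name for bit, name in enumerate(GBA_KEYS) if pressed & (1 << bit)]
```

A value of 0x03FF means nothing is pressed, while 0x03FE means only A is held down.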

In Linux, there's /dev/uinput, which lets user-space processes create virtual devices and update their state. You can create your virtual gamepad however you like, for example, add analog sticks and then map GBA's D-pad to analog values. The current implementation just registers a simple gamepad with the same layout as the GBA.


Protocol overview

For every frame, the steps to run are:

  • (Reset if needed)
  • Build frame (RPI only)
  • Sync frame start
  • Metadata exchange (described below)
  • (If the frame has audio, sync and transfer audio)
  • Sync pixels start
  • Transfer pixels
  • Sync frame end
  • Render (GBA only)


Metadata exchange

In this step, the GBA sends its input and receives a frame metadata packet:

00000000000000000000000000000000
^#**************$$$$$$$$$$$$$$$$
|||             |
|||              > start pixel index (for faster GBA rendering)
|| > number of expected pixel packets
| > compressed flag: if 1, the frame is RLEncoded
 > audio flag: if 1, the frame includes an audio chunk

As a sanity check, this transfer is done twice. The second time, each device sends back the packet it received during the first transfer. If it doesn't match => Reset!
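To make the bit layout above easier to follow, here's a sketch that packs and unpacks such a packet. It assumes the leftmost character of the diagram is the most significant bit (1 audio bit, 1 compressed bit, 14 bits for the packet count, 16 bits for the start pixel); the function names are hypothetical.

```python
def pack_metadata(has_audio: bool, is_rle: bool, expected_packets: int, start_pixel: int) -> int:
    """Build the 32-bit frame metadata packet."""
    if not 0 <= expected_packets < (1 << 14):
        raise ValueError("expected_packets must fit in 14 bits")
    if not 0 <= start_pixel < (1 << 16):
        raise ValueError("start_pixel must fit in 16 bits")
    return (int(has_audio) << 31) | (int(is_rle) << 30) | (expected_packets << 16) | start_pixel


def unpack_metadata(packet: int):
    """Split a metadata packet into (has_audio, is_rle, expected_packets, start_pixel)."""
    return (
        bool((packet >> 31) & 1),
        bool((packet >> 30) & 1),
        (packet >> 16) & 0x3FFF,
        packet & 0xFFFF,
    )
```

For instance, a frame with audio, no RLE, 2 pixel packets, and a start pixel of 74 is encoded as 0x8002004A.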


Audio

For the audio, the GBA runs a port of the GSM Full Rate audio codec. It expects 33-byte audio frames, but in order to survive frame drops, GSM frames are grouped into chunks whose length is defined by a build-time constant called AUDIO_CHUNK_SIZE.


Reading system audio

On the Raspberry Pi side, we use a virtual sound card that is preinstalled on the system. When you start the module (sudo modprobe snd-aloop), two new sound devices appear (both for playing and recording).

Playback audio devices

Capture audio devices

How it works is that if some application plays sound on -for example- hw:0,0,0 (card 0, device 0, substream 0), another application can record on hw:0,1,0 and obtain that sound. The loopback cards have to be set as the default output on the system, so we can record whatever sound is running on the OS.

Encoding GSM frames

GSM encoding is done with ffmpeg. The GBA port requires a non-standard rate of 18157Hz, so we have to tell it to ignore its checks, like "yeah, this is not officially supported, I don't care", as well as the new rate.

This is what the recording command looks like:

ffmpeg -f alsa -i hw:0,1 -y -ac 1 -af 'aresample=18157' -strict unofficial -c:a gsm -f gsm -loglevel quiet -

I swear this is audio!


Controlling Linux pipes

The - at the end of the ffmpeg command means "send the result to stdout". The code launches this process with popen and reads through the created pipe.

Since transferring a frame takes time, it can sometimes happen that more audio frames are generated than what we can actually use. If we don't do anything about it, when reading the pipe we'd be actually reading audio from the past, producing a snowball of audio lag!

Our GBA vibing to outdated audio frames

To fix that, there's an ioctl we can use (called FIONREAD) to retrieve the amount of queued bytes. To skip over those, we call the splice system call to redirect them to /dev/null.
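Here's a rough sketch of the skipping logic. Two things differ from the description above and are assumptions of this example: it reads and throws away the stale bytes with os.read instead of splicing them to /dev/null, and it keeps the newest complete 33-byte frame. The function names are made up.

```python
import array
import fcntl
import os
import termios


def queued_bytes(fd: int) -> int:
    """Ask the kernel how many bytes are waiting to be read on fd (FIONREAD)."""
    buf = array.array("i", [0])
    fcntl.ioctl(fd, termios.FIONREAD, buf, True)
    return buf[0]


def skip_stale_audio(fd: int, frame_size: int = 33) -> int:
    """Discard every queued frame except the newest one. Returns bytes skipped."""
    stale = max(0, queued_bytes(fd) // frame_size - 1) * frame_size
    skipped = 0
    while skipped < stale:
        chunk = os.read(fd, stale - skipped)  # read-and-discard stand-in for splice
        if not chunk:
            break  # the pipe was closed
        skipped += len(chunk)
    return skipped
```

Skipping whole frames matters here: dropping a partial frame would leave the reader misaligned with the GSM frame boundaries.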


Decompressing on time

This was the most complex part of the project. Drawing pixels on the bitmap modes is already a lot of work for the GBA, and now it has to decompress GSM frames! Also, it can't lag. Lots of people tolerate low frame rates on video, but I can't think of anyone who would find high-pitched noises or even silence between audio samples acceptable.

As I understand it, GSMPlayer decodes GSM frames, puts the resulting audio samples in a double buffer, and sets up DMA1 to copy them to one of the GBA's audio addresses, using a special timing mode that syncs the copy with Timer 0.

Me attempting to modify GSMPlayer code

Audio must be copied on time to prevent stuttering, noises, etc. Regular games do this by using VBlank interrupts, but that doesn't work here. When transferring at 2.56Mbps there are very few cycles available to process data, and adding an interrupt handler just messes up the packets.

I had to make every transfer cancellable: if it's time to run the audio (we're in the VBlank period), we stop everything, run the audio, and then start a recovery process in which we tell the Raspberry Pi where we left off. At the start, at the end, and every TRANSFER_SYNC_PERIOD packets of every stream, the Raspberry Pi sends a bidirectional packet (at the slow rate) to check whether it needs to enter "recovery mode".


EWRAM Overclock

The GBA code overclocks the external RAM at the beginning, to use only one wait state instead of two. This process crashes on a GB Micro, but who would use this on a Micro anyway?

A guy using a GB Micro with a Raspberry Pi attached to it

Setup

  • Solder a Link Cable to the Raspberry Pi according to the Normal Mode / SPI section of this document.
  • Install RetroPie.
  • Set the following attributes in /boot/config.txt:
# Disable splash screen
disable_splash=1

# Aspect ratio (4:3)
hdmi_safe=0
disable_overscan=1
hdmi_group=2
hdmi_mode=6

# GBA render resolution
framebuffer_width=240
framebuffer_height=160

# Memory Split (for RetroPie)
gpu_mem_256=128
gpu_mem_512=256
gpu_mem_1024=256
#scaling_kernel=8
  • In raspi-config, enable SPI.
  • Set RetroArch to a 4:3 aspect ratio: Settings -> Video -> Aspect ratio -> 4:3.
  • Pick the required files from the Releases section of this GitHub repo.
  • Load the GBA ROM with ./multiboot.tool gba.mb.gba.
  • Run the RPI backend with sudo ./raspi.run

Audio (optional)

It's optional because the Raspberry Pi already has pins for good old analog audio, and you could attach a speaker to it and have clean high-quality sound. On the other hand, audio support here is experimental and heavily decreases the frame rate.

If you want audio coming out from the GBA speakers anyway, here's how:

Change /etc/modprobe.d/alsa-base.conf and make it look like this:

options snd_aloop index=0
options snd_bcm2835 index=1
options snd_bcm2835 index=2
options snd slots=snd-aloop,snd-bcm2835

Then, when you run cat /proc/asound/modules you should see:

 0 snd_aloop
 1 snd_bcm2835
 2 snd_bcm2835

Now run sudo modprobe snd-aloop and set Loopback (Stereo Full Duplex) as the default output audio device from the UI.

As a last step, open the config file of GBA Remote Play (config.cfg) and make sure that SPI_DELAY_MICROSECONDS is 4. It won't work with smaller values!

Credits

This project relies on the following open-source libraries:

The GBA Jam demo uses these two open Blender clips with Creative Commons licenses:

Also, here are some documentation links that I made use of:

Special thanks to my friend Lucas Fryzek (@Hazematman), who has a deep knowledge of embedded systems and helped me a lot with design decisions.
