This is a challenge from the ECW 2023.

We are provided with a Game Boy ROM, its sources and a web emulator that can be controlled with a python script (neat).

The goal of the game is to move a character to a flag inside a maze by given it a list of instructions and the number of times each instruction must be repeated. Of course, because it is a CTF challenge, the game cannot be won by playing by the rules: the flag to reach is surrounded by walls.

A gameboy with a maze on the screen. There is a small deamon head on the top left of the maze, and a flag surrounded by walls in the center. above the maze, there are two rectangle. The left rectangle contains the words “No inst”, the right rectangle is empty

Game Control:

Add an instruction

Press the A key to add an instruction, then press any directional key. This will add the instruction to the instruction list. You can add up to 16 instructions.

Repeat an instruction

Press the UP or DOWN arrow to indicate how many times the selected instruction should be repeated.

Press the LEFT or RIGHT arrow to move the selection cursor through the instruction list.

Delete an instruction

Position the cursor on the instruction to be deleted, then press B. The instruction will be removed.

Move / Merge two instructions

Position the cursor on the instruction to be moved, then press SELECT. Move the cursor to the desired position and press SELECT again. If the two selected instructions are identical, they will be merged. Otherwise, their positions will be switched.

Launch ShellBoy !

Press START to tell ShellBoy to follow your instructions.

The Vulnerability

I started by playing the game a little. Unsurprisingly, no way to win, but I noticed something strange with the move/merge option: when merging an instruction with itself (by pressing SELECT two times), the instruction is removed, and the cursor that select the instruction stay at the same position. Better, we can still change the count value of the now non existing instruction. However, we can only move the cursor to the left, in the direction of the remaining instruction.

A gameboy with a maze on the screen. Above the maze, the left rectangle contains a right arrow and an up arrow, followed by a doted square indicated a cursor. The cell selected by the cursor is empty. The right rectangle indicate “x246”, indicating that even if the instruction is not set, the count value can still be modified

A this point my inner chaos monkey just kept pressing the SELECT button until the was no more instruction in the list. Then I did it again, and surprise! Now the list is full of arrow and we can move the cursor to the right again!

A gameboy with a maze on the screen. Above the maze, the left rectangle contains the words a right arrow followed by 10 up arrows. The right rectangle indicate “x0”

Ok, time to look at the code. The control of the instruction list is defined in inst_list.c. The one function call that interest us is move_inst(inst_cursor_pos, selected_index):

void move_inst(uint8_t inst1_index, uint8_t inst2_index)
{
    uint8_t inst1_id = inst_ids[inst1_index];
    uint8_t inst2_id = inst_ids[inst2_index];
    ...
    if(inst1_id == inst2_id)
    {
        uint16_t rpt_sum = inst1_rpt + inst2_rpt;

        if(rpt_sum > 255)
            ...
        else
        {
            ...
            remove_inst(inst2_index);
        }
    }
    else
        ...
}

We can see that when merging two stack, the second one is delete if the sum is small enough, even when the two stacks are in fact the same!

void remove_inst(uint8_t inst_index)
{
    ...
    inst_count--;
}

...
        else if(KEY_TRIGGERED(J_B) && !list_empty)
        {
            remove_inst(inst_cursor_pos);
            list_empty = inst_count == 0;
            inst_cursor_pos = 0;
        }

We can notice that remove_inst decrement inst_count, without checks, and do not touch inst_cursor_pos nor list_empty, instead, those are updated when pressing the button B, after calling remove_inst. This is not done in move_inst after calling remove_inst.

Here is the explanation of the strange behaviour: remove_inst do not update inst_count nor list_empty, and move_inst assumes there are at least two instructions in the list, which is false: they can be one, or even zero instructions, so inst_count can underflow, while list_empty stay false and inst_cursor_pos can stay out of bound.

How can we use that? Let’s take a look at the other action the user can do:

    ...
        if(KEY_TRIGGERED(J_LEFT))
        {
            inst_cursor_pos -= (inst_cursor_pos > 0) ? 1 : 0;
        }
        // Select next instruction
        else if(KEY_TRIGGERED(J_RIGHT))
        {
            inst_cursor_pos += (inst_cursor_pos < inst_count - 1) ? 1 : 0;
        }
        // Increase instruction repetition
        else if(KEY_TRIGGERED(J_UP) && !list_empty)
        {
            inst_rpt[inst_cursor_pos] += (inst_rpt[inst_cursor_pos] < 255) ? 1 : 2;
        }
        // Decrease instruction repetition
        else if(KEY_TRIGGERED(J_DOWN) && !list_empty)
        {
            inst_rpt[inst_cursor_pos] -= (inst_rpt[inst_cursor_pos] > 1) ? 1 : 2;
        }
    ...

We can move the cursor between 0 and inst_count, and set the value of inst_rpt[inst_cursor_pos] from 1 to 255.

(There is also a mode to enter new instruction, but inst_count is checked against MAX_INST so it not really useful)

At this point we need to find out the memory mapping of the variable. I used Sameboy, a emulator/debugger for Game Boy, to dump the memory after setting inst_rpt to a recognisable value (by playing the game):

$ sameboy shellboy.gb > dump.mem
SameBoy v0.15.8
x/65535 $0000

$ grep -A 1 -B 1 '01 02 03 04' dump.mem
c0a0: 01 00 00 ea 03 9c 00 00 00 00 00 00 00 00 00 00
c0b0: 00 0e 02 07 02 14 02 00 02 01 02 03 04 05 06 07
c0c0: 08 09 0a 0b 0c 0d 0e 0f 00 00 00 00 00 00 00 00

I found that inst_rpt starts at 0xc0b9, and from here, by combining watch points, backtraces, the disassembled values in memory (cf Samboy documentation), the source code and a good amount of time and effort, I mapped a few value in memory:

address value symbol
0xC0B1 0x0e inst_funcs
0xC0B2 0x02
0xC0B3 0x07
0xC0B4 0x02
0xC0B5 0x14
0xC0B6 0x02
0xC0B7 0x00
0xC0B8 0x02
0xC0B9 inst_rpt
0xC0BA
0xC0BB
0xC0BC
0xC0BD
0xC0BE
0xC0BF
0xC0C0
0xC0C1
0xC0C2
0xC0C3
0xC0C4
0xC0C5
0xC0C6
0xC0C7
0xC0C8
0xC0C9 inst_ids
0xC0CA
0xC0CB
0xC0CC
0xC0CD
0xC0CE
0xC0CF
0xC0D0
0xC0D1
0xC0D2
0xC0D3
0xC0D4
0xC0D5
0xC0D6
0xC0D7
0xC0D8
0xC0D9 0xFF inst_count
0xC0DA 0xXX inst_cursor_pos
0xC0DB 0x00 selected_index
0xC0DC 0x00 listen_new_inst
0xC0DD 0x00 is_moving_inst
0xC0DE 0x00 list_empty
0xC0DF ?
0xC0E0 ?
0xC0E1 0x01 bot_x
0xC0E2 0x01 bot_y

I also noted the code that display the victory message and then loop, could be useful:

    0e8d: LD de, $0ed9
    0e90: PUSH de
    0e91: LD hl, $1204
    0e94: PUSH hl
    0e95: LD a, $01
    0e97: PUSH af
    0e98: INC sp
    0e99: CALL $10c0
    0e9c: ADD SP, $05
    0e9e: LD de, $0064
    0ea1: CALL $139b
    0ea4: JR $0e9e

Interesting, bot_x and bot_y are after inst_rpt and not to far away. This means we can change the values of the bot ourself. Before pressing start, we have to put back inst_count to a reasonable value, else the game crashes (we will see why later). Pression SELECT many time will do that.

A GIF of the screen of a gameboy. Above the maze, we can see a right arrow appear, then disappear. After that, the list of instructions is filled with up arrow. The cursor then scroll to the right approximately 40 times. Then the counters is changed to 6, then another is set to 5. As the value of the counters changes, the robot in moves in the maze, ignoring the walls, and finishes on the flags. After that the instruction list empty to only a few arrows, and the text &ldquo;Flag is at 0x06FA&rdquo; appears between the instruction list and the maze.

Arbitrary Code Execution

We beat the game! But still no flag. Sad. Instead, we have the address of the flag in memory, in a location way too far from the region we control.

We saw previously that this code is run when we win:

    0e8d: LD de, $0ed9
    0e90: PUSH de
    0e91: LD hl, $1204
    0e94: PUSH hl
    0e95: LD a, $01
    0e97: PUSH af
    0e98: INC sp
    0e99: CALL $10c0
    0e9c: ADD SP, $05
    0e9e: LD de, $0064
    0ea1: CALL $139b
    0ea4: JR $0e9e

Notice CALL $10C0, this is the function that display the text whose address is stored in the register DE (here, 0x0ED9). It would be nice if we could change the 0x0ED9 to 0x06FA in the code, by the location (0x0E8D) is still to far away, and anyway, this memory section is mapped from the ROM and is read only. We need more than just overriding variables, we need arbitrary code execution.

We need to find a way to set the value of the program pointer. The stack is around 0xDFE0, so we can’t override the return pointer. However, during the simulation, there is one point where the program call a function whose address is read from a variable:

void simulate()
{
    for(uint8_t i = 0; i < inst_count; i++)
    {
        uint8_t inst_id = inst_ids[i];
        uint8_t inst_rp = inst_rpt[i];

        for(uint8_t j = 0; j < inst_rp; j++) {
            if(inst_funcs[inst_id]()) {
                draw_simu();
                delay(100);
            }
        }
    }
    ...
}

We saw that inst_funcs is located at 0xC0B1, juste before inst_rpt, so we cannot override the addresses of the functions inside inst_funcs. Luckily, we don’t need to, because the function to call is selected by inst_id, which is located in inst_ids, juste after inst_rpt. Usually, inst_id is between 0 and 3, but with our first exploit, we can set it to any value, let say, 4. Conveniently, inst_funcs[4] correspond to the values stored at 0xC0B9: the first two octets of inst_rpt.

Earlier, we noticed the game crashed when running the simulation with an inst_count set to 0xFF. This was due to the value of inst_ids[i] being some random value, causing the program counter to jump somewhere in the RAM that probably did not contained any actual code.

So here we are, we can set the program pointer to an arbitrary value, we control a space in memory, we can write our code in memory and jump in it. I had to refresh my Z80 a little to write my payload, mostly because I only had 30 octets with an arbitrary value in the middle. (Latter someone pointed out that 0x04 is the instruction INC B, so basically a NOP in our situation, and in another write-up someone juste removed a lot of the code I used by just jumping at 0x0E90 after setting DE. I’m a little stupid, so my solution is more complex than it needs to, but hey, it works).

So here is the state of the memory after we set it:

address value symbol/disassemble
0xC0B1 0x0e inst_funcs
0xC0B2 0x02
0xC0B3 0x07
0xC0B4 0x02
0xC0B5 0x14
0xC0B6 0x02
0xC0B7 0x00
0xC0B8 0x02
0xC0B9 0xBB inst_rpt
0xC0BA 0xC0
0xC0BB 0x11 LD DE, $06FA
0xC0BC 0xFA
0xC0BD 0x06
0xC0BE 0xD5 PUSH DE
0xC0BF 0x21 LD HL, $1204
0xC0C0 0x04
0xC0C1 0x12
0xC0C2 0xE5 PUSH HL
0xC0C3 0x00 NOP
0xC0C4 0x00 NOP
0xC0C5 0x00 NOP
0xC0C6 0xC3 JMP $C0CA
0xC0C7 0xCA
0xC0C8 0xC0
0xC0C9 0x04 inst_ids
0xC0CA 0x3E LD A, $01
0xC0CB 0x01
0xC0CC 0xF5 PUSH AF
0xC0CD 0x33 INCR SP
0xC0CE 0xCD CALL $10C0
0xC0CF 0xC0
0xC0D0 0x10
0xC0D1 0xC3 JMP $0E9C
0xC0D2 0x9E
0xC0D3 0x0E
0xC0D4
0xC0D5
0xC0D6
0xC0D7
0xC0D8
0xC0D9 0xFF inst_count
0xC0DA 0x00 inst_cursor_pos
0xC0DB 0x00 selected_index
0xC0DC 0x00 listen_new_inst
0xC0DD 0x00 is_moving_inst
0xC0DE 0x00 list_empty
0xC0DF ?
0xC0E0 ?
0xC0E1 0x01 bot_x
0xC0E2 0x01 bot_y

This time no need to set inst_count to a sensible value, the first instruction will trigger our injected code which print the flag before jumping into an infinite loop.

A GIF of a screen of a Game Boy. Above the maze, we can see a right arrow appear, then disappear. After that, the list of instructions is filled with up arrow. The cursor then scroll to the right approximately 30 times. For each instruction, the counter is set to a specific value. The cursor then go back to the beginning of the list, and &ldquo;ECW{R3Tr0_Gb_RcE!}&rdquo; appears between the maze and the instruction list. The robot in the maze does not move from the top left corner.

Here is my code to automate the hack:

def win_game():
    press_button(BUTTON_A)
    press_button(BUTTON_RIGHT_ARROW)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)

    for _ in range(40):
        press_button(BUTTON_RIGHT_ARROW)
    for _ in range(5):
        press_button(BUTTON_UP_ARROW)
    press_button(BUTTON_RIGHT_ARROW)
    for _ in range(3):
        press_button(BUTTON_UP_ARROW)
    for _ in range(41):
        press_button(BUTTON_LEFT_ARROW)
    for i in range(250):
        press_button(BUTTON_SELECT)
        press_button(BUTTON_SELECT)

    press_button(BUTTON_START)

def write_code(code: bytes):
    for b in code:
        for _ in range(b):
            press_button(BUTTON_UP_ARROW)
        press_button(BUTTON_RIGHT_ARROW)
    for _ in code:
    press_button(BUTTON_LEFT_ARROW)

def display_flag():
    press_button(BUTTON_A)
    press_button(BUTTON_RIGHT_ARROW)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)

    write_code(
        bytes(
            [
                0xBB,
                0xC0,
                0x11,
                0xFA,
                0x06,
                0xD5,
                0x21,
                0x04,
                0x12,
                0xE5,
                0,
                0,
                0,
                0xC3,
                0xCA,
                0xC0,
                0x04,
                0x3E,
                0x01,
                0xF5,
                0x33,
                0xCD,
                0xC0,
                0x10,
                0xC3,
                0x9E,
                0x0E,
            ]
        )
    )
    press_button(BUTTON_START)

display_flag()

Optimizations

For the impatiens ones, I was sent by 0xskav an optimized version of write_code:

def write_code(code: bytes):

    for b in code:
        a = 255 - b

        down = False
        if a < b:
            down = True
            b = a
        for _ in range(b):
            if(down == False):
                press_button(BUTTON_UP_ARROW)
            else:
                press_button(BUTTON_DOWN_ARROW)

        press_button(BUTTON_RIGHT_ARROW)

    for _ in code:
        press_button(BUTTON_LEFT_ARROW)

And while we are at it with optimizations, we can use the shorter code injection used by !AlphaTartine:

address value symbol/disassemble
0xC0BB 0x11 LD DE, $06FA
0xC0BC 0xFA
0xC0BD 0x06
0xC0BE 0xC3 JMP $0E90
0xC0BF 0x90
0xC0C0 0x0E
def display_flag():
    press_button(BUTTON_A)
    press_button(BUTTON_RIGHT_ARROW)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)

    write_code(
        bytes(
            [
                0xBB,
                0xC0,
                0x11,
                0xFA,
                0x06,
                0xC3,
                0x90,
                0x0E,
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                0x04,
            ]
        )
    )
    press_button(BUTTON_START)