The original Xbox is, to me, an iconic piece of gaming history and I have many fond memories playing its games. It wasn't long after its release, however, that its security system was completely broken and unsigned software (e.g. Linux!) was able to run on it.
I recently wanted to do a bit of reverse-engineering and so I decided to deconstruct the boot ROM to better understand the Xbox security system. In this article, I will present the high-level boot flow of the system, the disassembled ROM code, pseudocode for the disassembly, along with some thoughts.
It should be known that there is essentially no new information presented in this article. The many flaws of the Xbox security system were well documented years ago by some really smart people. That said, I am not aware of a similar disassembly of the ROM, so perhaps this article will serve as a guide for others who are interested.
Please note that this article does not cover how to dump the ROM image. For clues on how to do that, please see Bunnie's excellent bus tap work or the A20-line hack described in 17 Mistakes Microsoft Made in the Xbox Security System.
Legal Disclaimer: This article was written for educational purposes only and is not intended to promote copyright infringement. Sensitive information including the RC4 key and decrypted 2bl signature are redacted.
Thanks
Before getting into it, I would like to extend a very special thanks to the following people and groups who made all of this possible:
- Bunnie for sharing his excellent work on the Xbox.
- Michael Steil for his talk at 22C3 and paper, which partly inspired me to do this work.
- Evan Amos for the great high-resolution Xbox photos you see here.
- The Xbox Linux Project for their many contributions, including Cromwell.
- The NASM project for their excellent disassembler, ndisasm.
Assumptions
I'm assuming that the reader has at least a basic familiarity with PC architecture and the C programming language.
Xbox Hardware
Before looking at the software, let's take a quick look at the Xbox hardware.
Perhaps unsurprisingly, the original Xbox Alpha development systems were essentially PCs. The Xbox that eventually landed on retail shelves carried over much of this PC legacy.
Major Components
For the time of its release, and for the money it cost to buy one, the original Xbox had fairly impressive specs:
- Processor: Intel Pentium III "Coppermine-based" @ 733 MHz (seen in blue below)
- Memory: 64 MiB DDR SDRAM @ 200 MHz (seen in green below)
- Storage: DVD-ROM Drive (above, left), 8 or 10 GB HDD (above, right)
- Graphics: 233 MHz nVidia NV2A Graphics Controller
- Networking: 100 MBit Ethernet
Source: Wikipedia
Some other components relevant to the discussion of the boot flow include:
Non-Volatile Storage
MCPX ROM
In addition to its documented features, the MCPX (seen in red above) contains a ROM which stores the very first instructions executed by the processor when the Xbox is switched on. These instructions make up what is called the First-Stage Bootloader.
The 512 bytes at addresses 0xfffffe00 through 0xffffffff are connected to the ROM. Reading from these addresses will not read from system memory but instead read from the boot ROM.
Flash
The Xbox also has a 1 MiB flash memory chip (seen in yellow above) that contains the Second-Stage Bootloader, the Kernel, and the X-codes (more on that in a moment).
The 16 MiB from 0xff000000 through 0xffffffff are connected to flash and so reading from this region will read from the flash device. Note that the flash device is actually only 1 MiB in size and repeats throughout the 16 MiB range.
A keen observer will notice that the ROM and flash ranges overlap. If enabled, the ROM will take precedence in the overlapping region.
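The address arithmetic used throughout the rest of this article can be sketched in a few lines of C. This is only an illustration under the sizes and base addresses stated above; the macro and function names are my own and do not come from any Xbox software.

```c
#include <stdint.h>

/* Sizes and bases taken from the description above. */
#define ROM_SIZE   0x200u      /* 512-byte MCPX ROM */
#define ROM_BASE   0xfffffe00u /* first physical address backed by the ROM */
#define FLASH_SIZE 0x100000u   /* 1 MiB flash device */
#define FLASH_BASE 0xff000000u /* start of the 16 MiB flash window */

/* Offset into the ROM image for a physical address inside the ROM window. */
static uint32_t rom_offset(uint32_t phys)
{
    return phys - ROM_BASE;
}

/* Offset into the flash image; the 1 MiB device repeats every 1 MiB. */
static uint32_t flash_offset(uint32_t phys)
{
    return (phys - FLASH_BASE) % FLASH_SIZE;
}

/* Nonzero when the ROM (if enabled) takes precedence over flash. */
static int rom_shadows_flash(uint32_t phys)
{
    return phys >= ROM_BASE;
}
```

For example, rom_offset(0xfffffe00) is 0, and flash_offset() returns the same value for 0xff000080 and 0xff100080 because of the mirroring.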
Overview of the Xbox Boot Process
Now that we have a basic understanding of the hardware, let's look at the overall boot process; that is, what happens from an "off" state to running a game.
- First-Stage Bootloader
Upon system startup (typically called "reset"), the First-Stage Bootloader is executed. This bootloader will perform basic system initialization, then decrypt and transfer control to the Second-Stage Bootloader.
- Second-Stage Bootloader
The Second-Stage Bootloader (sometimes called "2bl") will decrypt, decompress, and transfer control to the kernel.
- Kernel
The kernel is always present in system memory and is responsible for managing system resources and providing a hardware-abstraction layer for the dashboard and titles (executables).
If there is no disc present in the DVD-ROM drive on the Xbox, the kernel will launch the dashboard (stored on the HDD). If a disc is present, the kernel will launch the title (executable) located on the disc.
- Dashboard
The dashboard is the application that presents the primary user interface for the Xbox. The dashboard also has a music player that can play/rip CDs, a video player that can play DVDs, a storage manager, and a settings manager.
- Title
The Title is the main application, typically a game, on a DVD-ROM disc.
First-Stage Bootloader Steps
With the high-level boot overview in mind, let's break down the first stage. These are the major steps, in order, of the First-Stage Bootloader:
- Switch to Protected Mode
- Perform Basic System Initialization (X-codes)
- Initialize MTRRs
- Setup Caching
- Decrypt the Second-Stage Bootloader
- Jump to the Second-Stage Bootloader
Switch to Protected Mode
Reset Vector
On x86 systems, CPU execution begins at the Reset Vector. The Reset Vector is located at the physical address 0xfffffff0.
Translating the Reset Vector to an offset in the ROM yields 0x1f0. This is a good place to begin disassembling the ROM.
000001F0 EBC6 jmp short 0x1b8
The first instruction is a short jump to offset 0x1b8 in the ROM. This offset translates to the physical address 0xffffffb8.
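The jump target can be checked by hand from the two instruction bytes (EB C6). Below is a minimal sketch of how a short jump is resolved; the function name is my own. The displacement byte is signed and is relative to the address of the next instruction, which is the jump's address plus 2.

```c
#include <stdint.h>

/*
 * Target of a two-byte short jump: opcode 0xEB followed by a signed 8-bit
 * displacement, relative to the address of the following instruction.
 */
static uint32_t short_jmp_target(uint32_t addr, uint8_t rel8)
{
    /* Sign-extend the displacement without relying on narrowing casts. */
    int32_t disp = (rel8 >= 0x80) ? (int32_t)rel8 - 0x100 : (int32_t)rel8;

    /* Unsigned arithmetic wraps modulo 2^32, just like EIP. */
    return addr + 2u + (uint32_t)disp;
}
```

With addr = 0x1f0 and rel8 = 0xc6 (that is, -0x3a) this gives 0x1f2 - 0x3a = 0x1b8, matching the disassembly. The same function works with physical addresses: 0xfffffff0 resolves to 0xffffffb8.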
Load the GDT/IDT
Following the control flow at offset 0x1b8 in the ROM:
FFFFFFB8 662E0F0116F4FF o32 lgdt [cs:0xfff4]
FFFFFFBF 662E0F011EF4FF o32 lidt [cs:0xfff4]
These two instructions first load the Global Descriptor Table (GDT) then the Interrupt Descriptor Table (IDT). The operand of each instruction, cs:0xfff4, refers to physical address 0xfffffff4 because the code segment base is still 0xffff0000 at this point. In the ROM, this address translates to offset 0x1f4.
00001f0 eb c6 8b ff 18 00 d8 ff ff ff 80 c2 04 b0 02 ee
                    ^---^ ^---------^
                    Limit Base
Accounting for little-endian encoding, this reveals:
Table | Base | Limit |
---|---|---|
GDT | 0xffffffd8 | 0x0018 |
IDT | 0xffffffd8 | 0x0018 |
Now, let's read the table, located at 0x1d8 in the ROM.
00001d8 00 00 00 00 00 00 00 00 ff ff 00 00 00 9b cf 00
00001e8 ff ff 00 00 00 93 cf 00 ...
I will spare you the breakdown and just say that these bytes encode a table which sets up a flat address space model where all 4 GiB of address space is addressed linearly and can be read/written/executed. See the Intel Software Developer Manuals for more information about address space models or how to decode these bytes.
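For readers who do want the breakdown, here is a rough sketch of how one 8-byte segment descriptor unpacks, following the descriptor layout in the Intel manuals. The struct and function names are my own and purely illustrative.

```c
#include <stdint.h>

/* Fields of one 8-byte segment descriptor, unpacked. */
struct seg_desc {
    uint32_t base;   /* 32-bit segment base */
    uint32_t limit;  /* 20-bit raw limit */
    uint8_t  access; /* access byte: P, DPL, S, type */
    uint8_t  flags;  /* upper nibble of byte 6: G, D/B, L, AVL */
};

static struct seg_desc decode_desc(const uint8_t d[8])
{
    struct seg_desc s;

    s.limit  = (uint32_t)d[0] | ((uint32_t)d[1] << 8) |
               ((uint32_t)(d[6] & 0x0f) << 16);
    s.base   = (uint32_t)d[2] | ((uint32_t)d[3] << 8) |
               ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
    s.access = d[5];
    s.flags  = (uint8_t)(d[6] >> 4);
    return s;
}

/* Highest valid offset in the segment, honoring the granularity (G) bit. */
static uint32_t last_offset(const struct seg_desc *s)
{
    return (s->flags & 0x8) ? ((s->limit << 12) | 0xfffu) : s->limit;
}
```

Feeding it the bytes from the dump, the entry at GDT offset 0x08 (ff ff 00 00 00 9b cf 00) decodes to base 0, raw limit 0xfffff with 4 KiB granularity, and access byte 0x9b (present, ring 0, executable, readable). The entry at offset 0x10 differs only in its access byte, 0x93 (present, ring 0, data, writable). Both cover offsets 0 through 0xffffffff, which is the flat model described above.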
Notice that both the GDT and IDT are the same table. While the above encodes a valid GDT, it does not encode a valid IDT. It's unclear why this was done.
Switch to Protected Mode
Continuing the disassembly, the next few instructions set the Protected Mode Enable flag (bit 0) in CR0, then jump to physical address 0xfffffe00, completing the entry to Protected Mode.
FFFFFFC6 0F20C0 mov eax,cr0
FFFFFFC9 0C01 or al,0x1
FFFFFFCB 0F22C0 mov cr0,eax
FFFFFFCE 66EA00FEFFFF0800 jmp dword 0x8:0xfffffe00
Now, executing at 0xfffffe00, the Data, Extra, and Stack Segment Registers are loaded. Each Segment Register is loaded with the value 0x10, which is the offset into the GDT of the data segment (which is still 0-4 GiB).
FFFFFE00 33C0 xor eax,eax
FFFFFE02 B010 mov al,0x10
FFFFFE04 8ED8 mov ds,eax
FFFFFE06 8EC0 mov es,eax
FFFFFE08 8ED0 mov ss,eax
Notice that the Code Segment Register CS is not loaded here. That is because it is loaded automatically by the far jump into protected mode.
Basic System Initialization
Because complete system initialization requires significantly more space than the 512 bytes the boot ROM provides, and because the boot ROM cannot be updated in the field, Microsoft devised a clever solution.
Instead of putting the instructions to initialize the system in the boot ROM, and instead of simply putting the instructions in flash (which would compromise security), an interpreter was added to the First-Stage Bootloader providing a limited number of operations. This interpreter understands twelve basic commands which are read from flash at 0xff000080 (offset 0x80 in flash).
The commands read by the interpreter have been dubbed X-codes.
Interestingly, no authentication is performed on the X-codes and they can easily be overwritten by reprogramming the flash device. Because of this, the interpreter includes some precautions to limit its exploitability.
Interpreter Command Format
Each command is 9 bytes in length and has the following encoding:
Offset | Size (Bytes) | Value |
---|---|---|
0x00 | 1 | Opcode |
0x01 | 4 | Operand 1 |
0x05 | 4 | Operand 2 |
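If you want to inspect the X-codes in a flash dump on a PC, the encoding above can be decoded with something like the following sketch. The struct and function names are hypothetical, and the operands are assumed to be little-endian, as on any x86 system. Because the operands start at odd offsets, the bytes are assembled individually instead of casting to a uint32_t pointer.

```c
#include <stdint.h>

#define XCODE_SIZE 9u /* bytes per command, per the table above */

struct xcode {
    uint8_t  opcode;
    uint32_t operand_1;
    uint32_t operand_2;
};

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode command number n; cmds points at flash offset 0x80. */
static struct xcode decode_xcode(const uint8_t *cmds, uint32_t n)
{
    const uint8_t *p = cmds + n * XCODE_SIZE;
    struct xcode x;

    x.opcode    = p[0];
    x.operand_1 = read_le32(p + 1);
    x.operand_2 = read_le32(p + 5);
    return x;
}
```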
Start of Interpreter
The code of the interpreter begins at offset 0x0a in the ROM, physical address 0xfffffe0a on the system.
The first instruction sets up the interpreter command pointer.
FFFFFE0A BE800000FF mov esi,0xff000080
Pseudocode:
op_ptr = 0xff000080;
Note the register usage convention for the interpreter assembly code that follows:
Register | Usage |
---|---|
AL | Opcode |
EBX | Operand 1 |
ECX | Operand 2 |
EDI | Operation Result |
ESI | Pointer to Current Operation |
EBP | Scratch Register |
Main Loop
Next, the main loop of the interpreter is entered.
At the top of the loop, the Opcode, Operand 1, and Operand 2 are loaded into AL, EBX, and ECX, respectively.
FFFFFE0F 8A06 mov al,[esi]
FFFFFE11 8B5E01 mov ebx,[esi+0x1]
FFFFFE14 8B4E05 mov ecx,[esi+0x5]
In the loop, the opcode is checked to see which command should be executed, then that command is executed. If the opcode is unknown, it is simply skipped over.
At the end of the loop, the operation pointer is incremented and control returns to the top of the loop.
FFFFFEB4 83C609 add esi,byte +0x9
FFFFFEB7 E953FFFFFF jmp dword 0xfffffe0f
Pseudocode:
while (1) {
    opcode = *op_ptr;
    operand_1 = *((uint32_t *)(op_ptr+1));
    operand_2 = *((uint32_t *)(op_ptr+5));
    switch (opcode) {
    /* ... */
    }
    op_ptr += 9;
}
What follows in the disassembly are the instructions to detect and execute each command.
Opcode 0x07: Chain Command
This command allows re-using the result of the last operation as operand 2 to another command specified in operand 1.
FFFFFE17 3C07 cmp al,0x7
FFFFFE19 7508 jnz 0xfffffe23
FFFFFE1B 8BD1 mov edx,ecx
FFFFFE1D 8AC3 mov al,bl
FFFFFE1F 8BDA mov ebx,edx
FFFFFE21 8BCF mov ecx,edi
Pseudocode (context: before the switch statement above):
if (opcode == 0x07) {
opcode = operand_1;
operand_1 = operand_2;
operand_2 = result;
}
Opcode 0x02: Read from Memory
Simply read a double word from memory. The value is stored in the result register.
FFFFFE23 3C02 cmp al,0x2
FFFFFE25 750D jnz 0xfffffe34
FFFFFE27 81E3FFFFFF0F and ebx,0xfffffff
FFFFFE2D 8B3B mov edi,[ebx]
FFFFFE2F E980000000 jmp dword 0xfffffeb4
Pseudocode:
case 0x02:
result = *((uint32_t *)(operand_1 & 0x0fffffff));
break;
Notice here that the interpreter takes special care to prevent reading from any address above 0x0fffffff (that is, outside the first 256 MiB of the address space). This was likely done to prevent "malicious" X-codes from reading the contents of the boot ROM region directly.
Opcode 0x03: Write to Memory
Likewise, a command to write to memory is available.
FFFFFE34 3C03 cmp al,0x3
FFFFFE36 7504 jnz 0xfffffe3c
FFFFFE38 890B mov [ebx],ecx
FFFFFE3A EB78 jmp short 0xfffffeb4
Pseudocode:
case 0x03:
*((uint32_t *)operand_1) = operand_2;
break;
Opcode 0x06: AND then OR Result
Allows modifying the result register directly. Bits can be cleared using the mask in operand 1 and bits can be set with the mask in operand 2.
FFFFFE3C 3C06 cmp al,0x6
FFFFFE3E 7506 jnz 0xfffffe46
FFFFFE40 23FB and edi,ebx
FFFFFE42 0BF9 or edi,ecx
FFFFFE44 EB6E jmp short 0xfffffeb4
Pseudocode:
case 0x06:
result = (result & operand_1) | operand_2;
break;
Opcode 0x04: Write to PCI Configuration Space
A command is available that can write to PCI Configuration Space registers. This is useful for device initialization.
FFFFFE46 3C04 cmp al,0x4
FFFFFE48 751A jnz 0xfffffe64
FFFFFE4A 81FB80080080 cmp ebx,0x80000880
FFFFFE50 7503 jnz 0xfffffe55
FFFFFE52 83E1FD and ecx,byte -0x3
FFFFFE55 8BC3 mov eax,ebx
FFFFFE57 66BAF80C mov dx,0xcf8
FFFFFE5B EF out dx,eax
FFFFFE5C 80C204 add dl,0x4
FFFFFE5F 8BC1 mov eax,ecx
FFFFFE61 EF out dx,eax
FFFFFE62 EB50 jmp short 0xfffffeb4
Pseudocode:
case 0x04:
if (operand_1 == 0x80000880) {
operand_2 &= 0xfffffffd;
}
outl(operand_1, 0xcf8);
outl(operand_2, 0xcfc);
break;
Notice here that special care is given by the interpreter to the PCI register at address 0x80000880 (0xcf8 index mechanism), preventing the setting of bit 1 of PCI Bus 0 Device 1 Function 0 Register 0x80.
It was discovered (not by me) that this bit would "turn the ROM off" or otherwise disable address decoding to the ROM whenever set.
Of course, working around this limitation is trivial. This bit could very easily be set by using opcode 0x11 (Write to I/O) to write to 0xcf8/0xcfc directly. Indeed, this is known as the "MIST" hack.
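To make the filtered address easier to read, here is a small sketch that splits a 0xcf8 configuration address into its fields, assuming the standard PCI configuration mechanism #1 layout, and mirrors the check made by opcode 0x04. The struct and function names are my own; nothing here is taken from the ROM beyond the two constants.

```c
#include <stdint.h>

/* Fields of a configuration address written to I/O port 0xcf8. */
struct pci_addr {
    unsigned enable; /* bit 31: enable configuration access */
    unsigned bus;    /* bits 23:16 */
    unsigned dev;    /* bits 15:11 */
    unsigned fn;     /* bits 10:8 */
    unsigned reg;    /* bits 7:0: register offset */
};

static struct pci_addr pci_decode(uint32_t a)
{
    struct pci_addr p;

    p.enable = (a >> 31) & 0x1;
    p.bus    = (a >> 16) & 0xff;
    p.dev    = (a >> 11) & 0x1f;
    p.fn     = (a >> 8) & 0x7;
    p.reg    = a & 0xff;
    return p;
}

/* Mirror of the interpreter's filter for opcode 0x04. */
static uint32_t filter_pci_write(uint32_t addr, uint32_t value)
{
    if (addr == 0x80000880u)
        value &= 0xfffffffdu; /* and ecx,byte -0x3: clear bit 1 */
    return value;
}
```

Decoding 0x80000880 gives bus 0, device 1, function 0, register 0x80, as stated above. Note that the filter only triggers on an exact match of the address value, so any other way of reaching the same register is not covered by it.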
Opcode 0x05: Read from PCI Configuration Space
Likewise, there is a command to read from PCI Configuration Space.
FFFFFE64 3C05 cmp al,0x5
FFFFFE66 750F jnz 0xfffffe77
FFFFFE68 8BC3 mov eax,ebx
FFFFFE6A 66BAF80C mov dx,0xcf8
FFFFFE6E EF out dx,eax
FFFFFE6F 80C204 add dl,0x4
FFFFFE72 ED in eax,dx
FFFFFE73 8BF8 mov edi,eax
FFFFFE75 EB3D jmp short 0xfffffeb4
Pseudocode:
case 0x05:
outl(operand_1, 0xcf8);
result = inl(0xcfc);
break;
Opcode 0x08: Branch (JNE)
A simple branch mechanism that optionally modifies the command pointer by adding the value in operand 2 if the value in operand 1 does not match the current result value.
FFFFFE77 3C08 cmp al,0x8
FFFFFE79 7508 jnz 0xfffffe83
FFFFFE7B 3BFB cmp edi,ebx
FFFFFE7D 7435 jz 0xfffffeb4
FFFFFE7F 03F1 add esi,ecx
FFFFFE81 EB31 jmp short 0xfffffeb4
Pseudocode:
case 0x08:
if (result != operand_1) {
op_ptr += operand_2;
}
break;
Opcode 0x09: Jump
A simple jump mechanism that allows modifying the command pointer by adding the value in operand 2.
FFFFFE83 3C09 cmp al,0x9
FFFFFE85 7504 jnz 0xfffffe8b
FFFFFE87 03F1 add esi,ecx
FFFFFE89 EB29 jmp short 0xfffffeb4
Pseudocode:
case 0x09:
op_ptr += operand_2;
break;
Opcode 0x10: Read/Write Scratch Register
The interpreter allows for modifying a very small scratch pad using this one command. Bits can be cleared using the mask in operand 1 and bits can be set with the mask in operand 2.
FFFFFE8B 3C10 cmp al,0x10
FFFFFE8D 7508 jnz 0xfffffe97
FFFFFE8F 23EB and ebp,ebx
FFFFFE91 0BE9 or ebp,ecx
FFFFFE93 8BFD mov edi,ebp
FFFFFE95 EB1D jmp short 0xfffffeb4
Pseudocode:
case 0x10:
scratch = (scratch & operand_1) | operand_2;
result = scratch;
break;
Opcode 0x11: Write to I/O Port
A command to write to an I/O port.
FFFFFE97 3C11 cmp al,0x11
FFFFFE99 7507 jnz 0xfffffea2
FFFFFE9B 8BD3 mov edx,ebx
FFFFFE9D 8BC1 mov eax,ecx
FFFFFE9F EE out dx,al
FFFFFEA0 EB12 jmp short 0xfffffeb4
Pseudocode:
case 0x11:
outb(operand_2, operand_1);
break;
Opcode 0x12: Read from I/O Port
Likewise, a command to read from an I/O port.
FFFFFEA2 3C12 cmp al,0x12
FFFFFEA4 7508 jnz 0xfffffeae
FFFFFEA6 8BD3 mov edx,ebx
FFFFFEA8 EC in al,dx
FFFFFEA9 0FB6F8 movzx edi,al
FFFFFEAC EB06 jmp short 0xfffffeb4
Pseudocode:
case 0x12:
result = inb(operand_1);
break;
Opcode 0xEE: Exit Interpreter
When executed, the interpreter stops processing and jumps to 0xfffffebc.
FFFFFEAE 3CEE cmp al,0xee
FFFFFEB0 7502 jnz 0xfffffeb4
FFFFFEB2 EB08 jmp short 0xfffffebc
Pseudocode:
case 0xee:
goto enable_caching;
Undefined Opcodes
Opcodes 0x00, 0x01, 0x0A-0x0F, 0x13-0xED, and 0xEF-0xFF are undefined and will be ignored by the interpreter.
Pseudocode:
default:
break;
Summary of Opcodes
Opcode | Operation | Argument 1 | Argument 2 |
---|---|---|---|
0x02 | Read Memory | Address | N/A |
0x03 | Write Memory | Address | Value |
0x04 | Write PCI Config Space | Address | Value |
0x05 | Read PCI Config Space | Address | N/A |
0x06 | AND then OR | AND Bitmask | OR Bitmask |
0x07 | Chain Command | Next OP | Next Arg 1 |
0x08 | Branch (JNE) | Condition | Offset |
0x09 | Jump | N/A | Offset |
0x10 | Read/Write Scratch Reg | AND Bitmask | OR Bitmask |
0x11 | Write IO | 16-Bit Port | 8-Bit Value |
0x12 | Read IO | 16-Bit Port | N/A |
0xEE | Exit Interpreter | N/A | N/A |
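To experiment with the control flow of the interpreter on a PC, the pseudocode fragments above can be stitched together into a small simulator. The sketch below is my own arrangement and only models the opcodes that touch no hardware (0x06, 0x07, 0x08, 0x09, 0x10, and 0xEE); all other opcodes are skipped as if they were undefined. The names, the bounds check, and the step limit are assumptions added for safety on a host machine and have no counterpart in the ROM.

```c
#include <stddef.h>
#include <stdint.h>

struct xstate {
    uint32_t result;  /* EDI in the ROM */
    uint32_t scratch; /* EBP in the ROM */
};

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Run X-codes from buf until opcode 0xEE is reached. Returns the number of
 * commands processed, or -1 if the command pointer leaves the buffer or the
 * step limit is hit.
 */
static long xcode_run(const uint8_t *buf, size_t len, struct xstate *st)
{
    uint32_t pos = 0; /* 32-bit so that offsets wrap like ESI does */
    long count = 0;

    while (len >= 9 && pos <= len - 9 && count < 100000) {
        uint8_t  op = buf[pos];
        uint32_t a  = le32(buf + pos + 1);
        uint32_t b  = le32(buf + pos + 5);

        count++;
        if (op == 0x07) { /* chain: result becomes operand 2 */
            op = (uint8_t)a;
            a  = b;
            b  = st->result;
        }
        switch (op) {
        case 0x06: st->result = (st->result & a) | b; break;
        case 0x08: if (st->result != a) pos += b; break;
        case 0x09: pos += b; break;
        case 0x10:
            st->scratch = (st->scratch & a) | b;
            st->result  = st->scratch;
            break;
        case 0xee: return count;
        default:   break; /* unmodeled or undefined: skip */
        }
        pos += 9;
    }
    return -1;
}
```

Note that branch and jump offsets are added before the usual increment of 9, so an offset of 9 skips exactly one command, and a negative offset (encoded in two's complement) moves backwards.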
Initialize MTRRs
Clear Variable MTRRs (MSR 0x200-0x20F)
According to the Intel Software Developer Manuals (System Programming Guide, Chapter 35.1), MSRs 0x200-0x20F are the variable-range MTRR base and mask registers 0 through 7.
FFFFFEBC 33C9 xor ecx,ecx
FFFFFEBE B502 mov ch,0x2
FFFFFEC0 33C0 xor eax,eax
FFFFFEC2 33D2 xor edx,edx
FFFFFEC4 0F30 wrmsr
FFFFFEC6 41 inc ecx
FFFFFEC7 80F90F cmp cl,0xf
FFFFFECA 76F8 jna 0xfffffec4
Set Default MTRR Type (MSR 0x2FF)
FFFFFECC B1FF mov cl,0xff
FFFFFECE 8BC3 mov eax,ebx
FFFFFED0 0F30 wrmsr
Notice here that EBX is not set before the write to the MSR. It still holds operand 1 of the last X-code that was processed, which gives the X-codes the flexibility to decide the default caching type. I'm not sure yet whether this was the intended behavior.
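In the style of the pseudocode used elsewhere in this article, the two MTRR steps can be sketched as follows. Since WRMSR cannot be executed from a normal program, wrmsr() here is a hypothetical stand-in that just records what would have been written; the log structure and the function names are my own.

```c
#include <stdint.h>

#define MAX_WRITES 32

/* Hypothetical stand-in for the WRMSR instruction: records each write. */
struct msr_write {
    uint32_t msr;
    uint64_t value;
};
static struct msr_write msr_log[MAX_WRITES];
static unsigned msr_count;

static void wrmsr(uint32_t msr, uint64_t value)
{
    if (msr_count < MAX_WRITES) {
        msr_log[msr_count].msr = msr;
        msr_log[msr_count].value = value;
        msr_count++;
    }
}

/* last_operand_1 models the stale EBX left over from the interpreter. */
static void init_mtrrs(uint32_t last_operand_1)
{
    uint32_t msr;

    /* Clear the variable-range MTRRs, MSRs 0x200 through 0x20f. */
    for (msr = 0x200; msr <= 0x20f; msr++)
        wrmsr(msr, 0);

    /* Default type MSR 0x2ff: EDX is still 0, EAX is copied from EBX. */
    wrmsr(0x2ff, (uint64_t)last_operand_1);
}
```

The loop makes 16 writes because the disassembly compares CL against 0x0f with jna, which continues while CL is less than or equal to 0x0f.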
Enable Caching
From the Intel Software Developer Manuals:
If the NW and CD flags are clear, write-back is enabled for the whole of system memory, but may be restricted for individual pages or regions of memory by other cache-control mechanisms.
Clear the CD flag (bit 30) and NW flag (bit 29) of CR0.
FFFFFED2 0F20C0 mov eax,cr0
FFFFFED5 25FFFFFF9F and eax,0x9fffffff
FFFFFEDA 0F22C0 mov cr0,eax
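The constant 0x9fffffff is simply the complement of the two flag bits. A quick sketch, with macro and function names of my own choosing:

```c
#include <stdint.h>

#define CR0_NW (1u << 29) /* Not Write-through */
#define CR0_CD (1u << 30) /* Cache Disable */

/* Clear CD and NW, leaving every other CR0 bit untouched. */
static uint32_t cr0_enable_caching(uint32_t cr0)
{
    return cr0 & ~(CR0_CD | CR0_NW);
}
```

Here ~(CR0_CD | CR0_NW) evaluates to 0x9fffffff, the mask used in the disassembly, so other bits such as the Protected Mode Enable flag set earlier are preserved.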
Load the Second-Stage Bootloader
After the X-code interpreter has finished running and caching has been enabled, the Second-Stage Bootloader is read from flash, decrypted using the RC4 stream cipher, and written to memory.
RC4 Key-Scheduling Algorithm (KSA)
The RC4 Key-Scheduling Algorithm is used to initialize the RC4 "S" array.
Register | Usage |
---|---|
EAX | Scratch Register |
ECX | S Iterator (i) |
EDX | S Cursor |
ESI | S Pointer (0x8f000) |
FFFFFEDD B800010203 mov eax,0x3020100
FFFFFEE2 B940000000 mov ecx,0x40
FFFFFEE7 BE00F00800 mov esi,0x8f000
FFFFFEEC 8BD6 mov edx,esi
FFFFFEEE 8902 mov [edx],eax
FFFFFEF0 83C204 add edx,byte +0x4
FFFFFEF3 0504040404 add eax,0x4040404
FFFFFEF8 49 dec ecx
FFFFFEF9 75F3 jnz 0xfffffeee
Pseudocode:
uint8_t *s = (uint8_t *)0x8f000;
uint32_t i;
for (i = 0; i <= 255; i++) {
    s[i] = i;
}
It may not be immediately obvious, but the assembly code is optimized to write double words instead of writing each byte.
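To convince yourself that the double word version produces the same table, here is a small sketch that runs both variants on a PC and compares the results. The function names are my own. The 32-bit stores are written out byte by byte in little-endian order so that the comparison does not depend on the host's endianness.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise initialization, as in the pseudocode: s[i] = i. */
static void init_bytes(uint8_t s[256])
{
    unsigned i;

    for (i = 0; i < 256; i++)
        s[i] = (uint8_t)i;
}

/* Double word initialization, mirroring the ROM loop. */
static void init_dwords(uint8_t s[256])
{
    uint32_t value = 0x03020100u; /* bytes 00 01 02 03 in memory */
    unsigned n;

    for (n = 0; n < 0x40; n++) {
        /* Equivalent of "mov [edx],eax" on a little-endian CPU. */
        s[4 * n + 0] = (uint8_t)(value);
        s[4 * n + 1] = (uint8_t)(value >> 8);
        s[4 * n + 2] = (uint8_t)(value >> 16);
        s[4 * n + 3] = (uint8_t)(value >> 24);
        value += 0x04040404u; /* add 4 to each of the four bytes */
    }
}

/* Returns nonzero when both variants produce the same table. */
static int inits_match(void)
{
    uint8_t a[256], b[256];

    init_bytes(a);
    init_dwords(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Adding 0x04040404 never carries between bytes within the 64 iterations that are stored, because the largest byte value written is 0xff in the final double word (0xfffefdfc).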
Register Usage:
Register | Usage |
---|---|
EBP | Key Pointer (0xffffffa5) |
ECX | Key Iterator (i % keylength) |
ESI | S Pointer (0x8f000) |
EDI | S Iterator (i) |
EBX | j |
EAX | Scratch Register |
EDX | Scratch Register |
FFFFFEFB 33C9 xor ecx,ecx
FFFFFEFD 33FF xor edi,edi
FFFFFEFF BDA5FFFFFF mov ebp,0xffffffa5
FFFFFF04 888E00010000 mov [esi+0x100],cl
FFFFFF0A 888E01010000 mov [esi+0x101],cl
FFFFFF10 33DB xor ebx,ebx
FFFFFF12 33D2 xor edx,edx
FFFFFF14 33C0 xor eax,eax
FFFFFF16 8A1437 mov dl,[edi+esi]
FFFFFF19 8A440D00 mov al,[ebp+ecx+0x0]
FFFFFF1D 02D8 add bl,al
FFFFFF1F 41 inc ecx
FFFFFF20 02DA add bl,dl
FFFFFF22 47 inc edi
FFFFFF23 8A0433 mov al,[ebx+esi]
FFFFFF26 884437FF mov [edi+esi-0x1],al
FFFFFF2A 83F910 cmp ecx,byte +0x10
FFFFFF2D 881433 mov [ebx+esi],dl
FFFFFF30 7502 jnz 0xffffff34
FFFFFF32 33C9 xor ecx,ecx
FFFFFF34 81FF00010000 cmp edi,0x100
FFFFFF3A 72D6 jc 0xffffff12
Pseudocode:
uint8_t *key = (uint8_t *)0xffffffa5; /* ROM offset 0x1a5. */
uint8_t j, t;
/* It is unclear why s[0x100] and s[0x101] are set to 0 here. They are not
 * modified by the KSA, but later they will be used as the initial i, j
 * values in the PRGA.
 */
s[0x100] = 0x00;
s[0x101] = 0x00;
for (i = 0, j = 0; i <= 255; i++) {
    j = j + s[i] + key[i % 16];
    /* Swap s[i] and s[j]. */
    t = s[i];
    s[i] = s[j];
    s[j] = t;
}
Pseudo-Random Generation Algorithm (PRGA)
Register Usage:
Register | Usage |
---|---|
EAX | Scratch |
EBP | Remaining Length of Message (0x6000) |
EDI | Message Iterator |
ESI | S Pointer (0x8f000) |
ECX | i |
EDX | j |
FFFFFF3C 33C9 xor ecx,ecx
FFFFFF3E 33D2 xor edx,edx
FFFFFF40 33FF xor edi,edi
FFFFFF42 33C0 xor eax,eax
FFFFFF44 BE00F00800 mov esi,0x8f000
FFFFFF49 BD00600000 mov ebp,0x6000
FFFFFF4E 8A8E00010000 mov cl,[esi+0x100]
FFFFFF54 8A9601010000 mov dl,[esi+0x101]
FFFFFF5A FEC1 inc cl
FFFFFF5C 8A040E mov al,[esi+ecx]
FFFFFF5F 02D0 add dl,al
FFFFFF61 8A1C16 mov bl,[esi+edx]
FFFFFF64 881C0E mov [esi+ecx],bl
FFFFFF67 880416 mov [esi+edx],al
FFFFFF6A 02C3 add al,bl
FFFFFF6C 8A9F009EFFFF mov bl,[edi-0x6200]
FFFFFF72 8A0406 mov al,[esi+eax]
FFFFFF75 32D8 xor bl,al
FFFFFF77 889F00000900 mov [edi+0x90000],bl
FFFFFF7D 47 inc edi
FFFFFF7E 4D dec ebp
FFFFFF7F 75D9 jnz 0xffffff5a
Pseudocode:
uint8_t *encrypted = (uint8_t *)0xFFFF9E00; /* 2bl */
uint8_t *decrypted = (uint8_t *)0x90000; /* Decrypted 2bl Destination */
uint32_t pos;
/* As noted above, s[0x100..0x101] were set to 0 earlier, but have not been
 * modified since. The RC4 algorithm defines i and j both to be set to 0
 * before PRGA begins. */
i = s[0x100];
j = s[0x101];
for (pos = 0; pos < 0x6000; pos++) {
    /* Update i, j. */
    i = (i + 1) & 0xff;
    j += s[i];
    /* Swap s[i] and s[j]. */
    t = s[i];
    s[i] = s[j];
    s[j] = t;
    /* Decrypt message and write output. The index is truncated to 8 bits,
     * matching the "add al,bl" in the disassembly. */
    decrypted[pos] = encrypted[pos] ^ s[(uint8_t)(s[i] + s[j])];
}
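For experimenting on a PC, the KSA and PRGA pseudocode above can be combined into an ordinary, self-contained RC4 routine. This is a generic textbook RC4 sketch with names of my own choosing, not a transcription of the ROM: it keeps i and j in a struct instead of at s[0x100] and s[0x101], and it takes the key as a parameter because the real key is redacted here.

```c
#include <stddef.h>
#include <stdint.h>

struct rc4 {
    uint8_t s[256];
    uint8_t i, j;
};

/* Key-Scheduling Algorithm. keylen must be nonzero. */
static void rc4_init(struct rc4 *c, const uint8_t *key, size_t keylen)
{
    unsigned n;
    uint8_t j = 0, t;

    for (n = 0; n < 256; n++)
        c->s[n] = (uint8_t)n;
    for (n = 0; n < 256; n++) {
        j = (uint8_t)(j + c->s[n] + key[n % keylen]);
        t = c->s[n];
        c->s[n] = c->s[j];
        c->s[j] = t;
    }
    c->i = 0;
    c->j = 0;
}

/* Pseudo-Random Generation Algorithm, XORed over the input. */
static void rc4_crypt(struct rc4 *c, const uint8_t *in, uint8_t *out,
                      size_t len)
{
    size_t pos;
    uint8_t t;

    for (pos = 0; pos < len; pos++) {
        c->i = (uint8_t)(c->i + 1);
        c->j = (uint8_t)(c->j + c->s[c->i]);
        t = c->s[c->i];
        c->s[c->i] = c->s[c->j];
        c->s[c->j] = t;
        out[pos] = in[pos] ^ c->s[(uint8_t)(c->s[c->i] + c->s[c->j])];
    }
}
```

Because RC4 is a symmetric stream cipher, the same two calls both encrypt and decrypt. Following the values in the disassembly, decrypting a 2bl image from a flash dump would mean calling rc4_init() with the 16-byte key and then rc4_crypt() over the 0x6000 bytes that the ROM reads from 0xffff9e00.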
Check Signature
Now that the Second-Stage Bootloader has been loaded, a quick sanity-check is performed: a "magic" signature is verified. If the signature doesn't match, control goes to the error handler (see below). (Signature redacted.)
FFFFFF81 A1E45F0900 mov eax,[0x95fe4]
FFFFFF86 3D________ cmp eax,0x________
FFFFFF8B 7507 jnz 0xffffff94
Pseudocode:
if (*((uint32_t *) 0x95fe4) != MAGIC_SIGNATURE) {
goto error_handler;
}
Otherwise, control goes to the 2bl entry point. The entry point address is located at the start of the decrypted 2bl code in memory at 0x90000:
FFFFFF8D A100000900 mov eax,[0x90000]
FFFFFF92 FFE0 jmp eax
Error Handler
FFFFFF94 B880080080 mov eax,0x80000880
FFFFFF99 66BAF80C mov dx,0xcf8
FFFFFF9D EF out dx,eax
FFFFFF9E EAFAFFFFFF0800 jmp dword 0x8:0xfffffffa
...
FFFFFFFA 80C204 add dl,0x4
FFFFFFFD B002 mov al,0x2
FFFFFFFF EE out dx,al
Notice that the PCI Configuration Space register at Bus 0, Device 1, Function 0, Register 0x80 is again given special care: the bit that was not supposed to be set with the 0x04 opcode is now being set in the error condition. Setting this bit in the error condition was presumably done to prevent reading the ROM in case of an error.
Cleverly, this error handler is split across the ROM such that the last instruction is located at the highest address. When EIP rolls over to 0x00000000, the CPU was expected to halt. This is the behavior on AMD processors but, on the Intel processors that shipped with the Xbox, execution happily continues at 0x00000000.
Observing this behavior, it would be possible to replace some existing X-codes with ones that write new x86 instructions into memory and then intentionally cause the 2bl to fail the signature check. The system will eventually begin executing the modified code. Indeed, this is known as the "Visor" hack, after the person who discovered it.
End
So that is how the Xbox Boot ROM works. It's not very complicated, but it was fun to deconstruct.
This article only briefly touched on a couple of the vulnerabilities of the Xbox. For a much more in-depth security analysis, please refer to Michael Steil's 17 Mistakes Microsoft Made in the Xbox Security System.
If you have any feedback or would like to correct a mistake that I made, please drop a comment below.