Reverse engineering a simple game archive
Sometimes I like to poke around in the files of the games I bought on Steam, just for fun or maybe to listen to the game music. Most game engines put the game files in their own proprietary archive files, so I started to reverse engineer those formats, at least to the point where I could extract the files. Some game archive file formats proved to be so simple that I was able to easily write the archives, too. So I thought I'd write down how I did it. Maybe someone else finds this useful.
The main tools I use are a hex editor (bless) and Python.
Short detour: I'm not 100% satisfied with bless, though. Let me show you what I mean:
I would like those fields in the status bar to be text fields that I can edit, making it easy to select a certain area of the file when I know the size in bytes. If anyone knows a (Linux) hex editor that does that, let me know!
But what I do like about bless is that it displays the bytes at the cursor in all possible number formats and it's dynamically resizable, making table structures in the file much more apparent. See:
So much for that. Now to the actual reverse engineering. Let's start with the simplest game archive format that I saw: FEZ .pak files. Looking at the start of FEZ/Content/Essentials.pak we see this:
00000000: d700 0000 186f 7468 6572 2074 6578 7475  .....other textu
00000010: 7265 735c 6675 6c6c 7768 6974 65cb 0000  res\fullwhite...
00000020: 0058 4e42 7705 01cb 0000 0001 9401 4d69  .XNBw.........Mi
00000030: 6372 6f73 6f66 742e 586e 612e 4672 616d  crosoft.Xna.Fram
00000040: 6577 6f72 6b2e 436f 6e74 656e 742e 5465  ework.Content.Te
00000050: 7874 7572 6532 4452 6561 6465 722c 204d  xture2DReader, M
00000060: 6963 726f 736f 6674 2e58 6e61 2e46 7261  icrosoft.Xna.Fra
00000070: 6d65 776f 726b 2e47 7261 7068 6963 732c  mework.Graphics,
00000080: 2056 6572 7369 6f6e 3d34 2e30 2e30 2e30   Version=4.0.0.0
00000090: 2c20 4375 6c74 7572 653d 6e65 7574 7261  , Culture=neutra
000000a0: 6c2c 2050 7562 6c69 634b 6579 546f 6b65  l, PublicKeyToke
000000b0: 6e3d 3834 3263 6638 6265 3164 6535 3035  n=842cf8be1de505
000000c0: 3533 0000 0000 0001 0000 0000 0200 0000  53..............
000000d0: 0200 0000 0100 0000 1000 0000 ffff ffff  ................
000000e0: ffff ffff ffff ffff ffff ffff 1f6f 7468  .............oth
000000f0: 6572 2074 6578 7475 7265 735c 7472 616e  er textures\tran
00000100: 7370 6172 656e 7477 6869 7465 cb00 0000  sparentwhite....
00000110: 584e 4277 0501 cb00 0000 0194 014d 6963  XNBw.........Mic
00000120: 726f 736f 6674 2e58 6e61 2e46 7261 6d65  rosoft.Xna.Frame
00000130: 776f 726b 2e43 6f6e 7465 6e74 2e54 6578  work.Content.Tex
00000140: 7475 7265 3244 5265 6164 6572 2c20 4d69  ture2DReader, Mi
00000150: 6372 6f73 6f66 742e 586e 612e 4672 616d  crosoft.Xna.Fram
00000160: 6577 6f72 6b2e 4772 6170 6869 6373 2c20  ework.Graphics, 
00000170: 5665 7273 696f 6e3d 342e 302e 302e 302c  Version=4.0.0.0,
00000180: 2043 756c 7475 7265 3d6e 6575 7472 616c   Culture=neutral
00000190: 2c20 5075 626c 6963 4b65 7954 6f6b 656e  , PublicKeyToken
000001a0: 3d38 3432 6366 3862 6531 6465 3530 3535  =842cf8be1de5055
000001b0: 3300 0000 0000 0100 0000 0002 0000 0002  3...............
000001c0: 0000 0001 0000 0010 0000 00ff ffff 00ff  ................
000001d0: ffff 00ff ffff 00ff ffff 0018 6f74 6865  ............othe
...
So there are 5 bytes that aren't printable ASCII characters and then there is a bit of ASCII. Those first bytes don't look like a file signature/magic, so I guess it's already data. But denoting what and how? For the how: Most CPU architectures today are little-endian, and Windows certainly only runs on little-endian systems (ignoring the Xbox 360). Little- and big-endian are ways to encode integers in memory: they describe the order of the bytes that make up the number. A 32-bit integer consists of 4 bytes. In little-endian the least significant byte comes first, in big-endian it's the other way around. Almost everyone has strong opinions about which encoding is the "correct" one, but let's not go into that.
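To make the byte order thing concrete, here is a minimal sketch in Python. The value 0x12345678 is just an example I picked because every byte is different; it has nothing to do with FEZ.

```python
import struct

# Four example bytes (not taken from any game file).
raw = bytes([0x78, 0x56, 0x34, 0x12])

# The same 4 bytes read in both byte orders.
little = int.from_bytes(raw, byteorder="little")
big = int.from_bytes(raw, byteorder="big")

print(hex(little))  # 0x12345678: least significant byte (0x78) came first
print(hex(big))     # 0x78563412: most significant byte (0x78) came first

# struct uses "<" for little-endian and ">" for big-endian.
print(struct.pack("<I", 0x12345678) == raw)  # True
```

If a guessed integer comes out absurdly large, trying the other byte order is usually the first thing worth checking.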
As a first assumption (which I never had to overthrow so far) let's assume integers are stored in little-endian encoding. As a further assumption let's say that sizes of embedded files and things like file counts are stored as unsigned 32-bit values. Only if the game files exceeded 4 GB (or 2 GB for signed values) would one need 64-bit integers. So let's interpret the first 4 bytes as an unsigned 32-bit integer. What do we get? 215.
That's a low enough number that it might be a file count. Let's run with that. But there are 5 non-ASCII bytes, so what is the next one? Well, file names usually are quite short, so maybe FEZ limits them to 255 bytes and stores the length in a single byte? Then we get 24, followed by 24 bytes of printable ASCII characters ("other textures\fullwhite") and then more non-ASCII bytes. So that seems right. How many non-ASCII bytes? 4! As a 32-bit integer it's the number 203. A file size? Maybe the next 203 bytes are the file data?
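The guesses so far can be checked in a few lines of Python. This is only a sketch: the bytes are typed in by hand from the hex dump above, and the variable names are mine.

```python
import struct

# The first 0x21 bytes of Essentials.pak, transcribed from the hex dump.
header = (
    bytes.fromhex("d7000000")        # guessed: file count
    + bytes.fromhex("18")            # guessed: file name length
    + b"other textures\\fullwhite"   # the printable ASCII run
    + bytes.fromhex("cb000000")      # guessed: file data size
)

file_count, = struct.unpack_from("<I", header, 0)
name_length = header[4]
name = header[5:5 + name_length].decode("ascii")
data_size, = struct.unpack_from("<I", header, 5 + name_length)

print(file_count, name_length, name, data_size)
# 215 24 other textures\fullwhite 203
```

The name length of 24 matching the length of the ASCII run exactly is what makes the one-byte length guess plausible.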
Marking the bytes of the file count and the first file in the hex dump above, you get this layout:
- 0x00-0x03: number of files (d7 00 00 00 = 215)
- 0x04: file name length (18 = 24)
- 0x05-0x1c: file name ("other textures\fullwhite")
- 0x1d-0x20: file data size (cb 00 00 00 = 203)
- 0x21-0xeb: file data (203 bytes, starting with "XNBw")
The next record then starts at 0xec with the name length byte 1f (31) for "other textures\transparentwhite".
Indeed it works to read the file like that (1 byte file name length, file name, 4 bytes file data length, file data, repeat). That was easy!
Using what we learned we can write a small Python script to list files of a FEZ game archive:
#!/usr/bin/env python3
import struct
import sys

with open(sys.argv[1], 'rb') as fp:
    # struct.unpack() reads binary data from a buffer.
    # < indicates little-endian and I stands for unsigned 32bit integer.
    # It returns a tuple of all the values that were unpacked, hence the comma.
    file_count, = struct.unpack("<I", fp.read(4))
    for file_index in range(file_count):
        file_name_length = fp.read(1)[0]
        file_name = fp.read(file_name_length).decode()
        file_size, = struct.unpack("<I", fp.read(4))
        print("%5d %10d %10d %s" % (file_index, fp.tell(), file_size, file_name))
        # Skip over the file data (whence 1 = relative to the current position).
        fp.seek(file_size, 1)

    # Check if there is some data at the end after all the files which
    # we haven't reverse engineered (spoiler: there isn't):
    file_table_end = fp.tell()
    # Jump to the end of the archive (whence 2 = relative to the file end).
    fp.seek(0, 2)
    tail_byte_count = fp.tell() - file_table_end
    if tail_byte_count != 0:
        print("ERROR: %d bytes are not accounted for" % tail_byte_count)
Which produces this output if you pass Essentials.pak to it (columns: index, data offset, data size, file name):
    0         33        203 other textures\fullwhite
    1        272        203 other textures\transparentwhite
    2        504        211 other textures\fullblack
    3        745      10150 other textures\smooth_ray
    4      10928      44506 other textures\rainbow_flare
    5      55465       3980 other textures\flare_alpha
    6      59480        210 background planes\white_square
    7      59726        251 background planes\dust_particle
    8      60005       3016 effects\basicposteffect
    9      63043       3928 effects\doteffect
   10      67012      22804 effects\instancedanimatedplaneeffect
   11      89848      10192 effects\animatedplaneeffect
   12     100075       3680 effects\fakepointspriteseffect
   13     103790       5760 effects\defaulteffect_textured
   14     109590       5148 effects\defaulteffect_vertexcolored
   15     114781       7288 effects\defaulteffect_litvertexcolored
   16     122107       7624 effects\defaulteffect_littextured
   17     129779       5644 effects\defaulteffect_texturedvertexcolored
   18     135474       7672 effects\defaulteffect_littexturedvertexcolored
   19     143183      17316 effects\instancedblackholeeffect
   20     160527     128137 sounds\ui\menu\exitgame
   21     288691      42781 sounds\ui\menu\confirm
   22     331502     151699 sounds\nature\watersplash
   23     483237    1197237 sounds\intro\zoomtofarawayplace
   24    1680502     264705 sounds\zu\blackholebuzz
   25    1945232      21014 resources\statictext
   26    1966280        278 other textures\glyphs\abutton
   27    1966597        278 other textures\glyphs\abutton_sony
   28    1966912        308 other textures\glyphs\backbutton
   29    1967262        334 other textures\glyphs\backbutton_sony
   30    1967630        278 other textures\glyphs\bbutton
   31    1967947        278 other textures\glyphs\bbutton_sony
   32    1968260        260 other textures\glyphs\dpaddown
   33    1968560        280 other textures\glyphs\dpaddown_sony
   34    1968875        262 other textures\glyphs\dpadleft
   35    1969177        284 other textures\glyphs\dpadleft_sony
   36    1969497        260 other textures\glyphs\dpadright
   37    1969798        284 other textures\glyphs\dpadright_sony
   38    1970115        258 other textures\glyphs\dpadup
   39    1970411        280 other textures\glyphs\dpadup_sony
   40    1970727        316 other textures\glyphs\leftarrow
   41    1971080        288 other textures\glyphs\leftbumper
   42    1971409        294 other textures\glyphs\leftbumper_ps3
   43    1971744        308 other textures\glyphs\leftbumper_ps4
   44    1972088        302 other textures\glyphs\leftstick
   45    1972431        264 other textures\glyphs\leftstick_sony
   46    1972733        266 other textures\glyphs\lefttrigger
   47    1973041        276 other textures\glyphs\lefttrigger_ps3
   48    1973359        278 other textures\glyphs\lefttrigger_ps4
   49    1973674        316 other textures\glyphs\rightarrow
   50    1974028        298 other textures\glyphs\rightbumper
   51    1974368        296 other textures\glyphs\rightbumper_ps3
   52    1974706        314 other textures\glyphs\rightbumper_ps4
   53    1975057        310 other textures\glyphs\rightstick
   54    1975409        278 other textures\glyphs\rightstick_sony
   55    1975726        276 other textures\glyphs\righttrigger
   56    1976045        286 other textures\glyphs\righttrigger_ps3
   57    1976374        292 other textures\glyphs\righttrigger_ps4
   58    1976704        322 other textures\glyphs\startbutton
   59    1977069        304 other textures\glyphs\startbutton_sony
   60    1977407        280 other textures\glyphs\xbutton
   61    1977726        278 other textures\glyphs\xbutton_sony
   62    1978038        286 other textures\glyphs\ybutton
   63    1978363        280 other textures\glyphs\ybutton_sony
   64    1978682        276 other textures\keyboard_glyphs\p_0
   65    1978997        266 other textures\keyboard_glyphs\p_1
   66    1979302        284 other textures\keyboard_glyphs\p_2
   67    1979625        278 other textures\keyboard_glyphs\p_3
   68    1979942        278 other textures\keyboard_glyphs\p_4
   69    1980259        282 other textures\keyboard_glyphs\p_5
   70    1980580        282 other textures\keyboard_glyphs\p_6
   71    1980901        278 other textures\keyboard_glyphs\p_7
   72    1981218        274 other textures\keyboard_glyphs\p_8
   73    1981531        280 other textures\keyboard_glyphs\p_9
   74    1981850        280 other textures\keyboard_glyphs\p_a
   75    1982174        352 other textures\keyboard_glyphs\p_arrows
   76    1982574        272 other textures\keyboard_glyphs\p_arrow_down
   77    1982894        280 other textures\keyboard_glyphs\p_arrow_left
   78    1983223        280 other textures\keyboard_glyphs\p_arrow_right
   79    1983549        276 other textures\keyboard_glyphs\p_arrow_up
   80    1983864        270 other textures\keyboard_glyphs\p_b
   81    1984181        366 other textures\keyboard_glyphs\p_backspace
   82    1984586        280 other textures\keyboard_glyphs\p_c
   83    1984908        358 other textures\keyboard_glyphs\p_caps
   84    1985309        268 other textures\keyboard_glyphs\p_comma
   85    1985616        274 other textures\keyboard_glyphs\p_d
   86    1985934        346 other textures\keyboard_glyphs\p_delete
   87    1986324        282 other textures\keyboard_glyphs\p_divide
   88    1986645        278 other textures\keyboard_glyphs\p_e
   89    1986964        312 other textures\keyboard_glyphs\p_end
   90    1987319        358 other textures\keyboard_glyphs\p_enter
   91    1987720        266 other textures\keyboard_glyphs\p_equal
   92    1988027        320 other textures\keyboard_glyphs\p_esc
   93    1988391        356 other textures\keyboard_glyphs\p_escape
   94    1988789        334 other textures\keyboard_glyphs\p_esdf
   95    1989162        278 other textures\keyboard_glyphs\p_f
   96    1989480        290 other textures\keyboard_glyphs\p_f0
   97    1989810        280 other textures\keyboard_glyphs\p_f1
   98    1990131        290 other textures\keyboard_glyphs\p_f10
   99    1990462        278 other textures\keyboard_glyphs\p_f11
  100    1990781        290 other textures\keyboard_glyphs\p_f12
  101    1991111        294 other textures\keyboard_glyphs\p_f2
  102    1991445        288 other textures\keyboard_glyphs\p_f3
  103    1991773        292 other textures\keyboard_glyphs\p_f4
  104    1992105        288 other textures\keyboard_glyphs\p_f5
  105    1992433        292 other textures\keyboard_glyphs\p_f6
  106    1992765        286 other textures\keyboard_glyphs\p_f7
  107    1993091        286 other textures\keyboard_glyphs\p_f8
  108    1993417        290 other textures\keyboard_glyphs\p_f9
  109    1993746        282 other textures\keyboard_glyphs\p_g
  110    1994067        278 other textures\keyboard_glyphs\p_h
  111    1994387        320 other textures\keyboard_glyphs\p_home
  112    1994746        266 other textures\keyboard_glyphs\p_i
  113    1995054        328 other textures\keyboard_glyphs\p_ijkl
  114    1995426        360 other textures\keyboard_glyphs\p_insert
  115    1995825        272 other textures\keyboard_glyphs\p_j
  116    1996136        284 other textures\keyboard_glyphs\p_k
  117    1996459        262 other textures\keyboard_glyphs\p_l
  118    1996764        318 other textures\keyboard_glyphs\p_l_alt
  119    1997129        272 other textures\keyboard_glyphs\p_l_bracket
  120    1997448        352 other textures\keyboard_glyphs\p_l_control
  121    1997845        348 other textures\keyboard_glyphs\p_l_shift
  122    1998240        356 other textures\keyboard_glyphs\p_l_windows
  123    1998635        286 other textures\keyboard_glyphs\p_m
  124    1998964        268 other textures\keyboard_glyphs\p_minus
  125    1999278        286 other textures\keyboard_glyphs\p_multiply
  126    1999603        278 other textures\keyboard_glyphs\p_n
  127    1999923        320 other textures\keyboard_glyphs\p_none
  128    2000286        324 other textures\keyboard_glyphs\p_num_0
  129    2000653        318 other textures\keyboard_glyphs\p_num_1
  130    2001014        330 other textures\keyboard_glyphs\p_num_2
  131    2001387        326 other textures\keyboard_glyphs\p_num_3
  132    2001756        320 other textures\keyboard_glyphs\p_num_4
  133    2002119        328 other textures\keyboard_glyphs\p_num_5
  134    2002490        326 other textures\keyboard_glyphs\p_num_6
  135    2002859        322 other textures\keyboard_glyphs\p_num_7
  136    2003224        324 other textures\keyboard_glyphs\p_num_8
  137    2003591        322 other textures\keyboard_glyphs\p_num_9
  138    2003959        352 other textures\keyboard_glyphs\p_num_lock
  139    2004350        276 other textures\keyboard_glyphs\p_o
  140    2004665        274 other textures\keyboard_glyphs\p_p
  141    2004986        356 other textures\keyboard_glyphs\p_page_down
  142    2005387        312 other textures\keyboard_glyphs\p_page_up
  143    2005743        266 other textures\keyboard_glyphs\p_period
  144    2006051        260 other textures\keyboard_glyphs\p_pipe
  145    2006353        278 other textures\keyboard_glyphs\p_plus
  146    2006670        276 other textures\keyboard_glyphs\p_q
  147    2006992        274 other textures\keyboard_glyphs\p_question
  148    2007310        268 other textures\keyboard_glyphs\p_quotes
  149    2007617        282 other textures\keyboard_glyphs\p_r
  150    2007942        316 other textures\keyboard_glyphs\p_r_alt
  151    2008305        270 other textures\keyboard_glyphs\p_r_bracket
  152    2008622        352 other textures\keyboard_glyphs\p_r_control
  153    2009019        342 other textures\keyboard_glyphs\p_r_shift
  154    2009408        350 other textures\keyboard_glyphs\p_r_windows
  155    2009797        280 other textures\keyboard_glyphs\p_s
  156    2010121        354 other textures\keyboard_glyphs\p_scroll
  157    2010522        268 other textures\keyboard_glyphs\p_semicolon
  158    2010833        354 other textures\keyboard_glyphs\p_space
  159    2011226        274 other textures\keyboard_glyphs\p_t
  160    2011541        308 other textures\keyboard_glyphs\p_tab
  161    2011892        280 other textures\keyboard_glyphs\p_tilde
  162    2012211        272 other textures\keyboard_glyphs\p_u
  163    2012522        282 other textures\keyboard_glyphs\p_v
  164    2012843        276 other textures\keyboard_glyphs\p_w
  165    2013161        340 other textures\keyboard_glyphs\p_wasd
  166    2013540        288 other textures\keyboard_glyphs\p_x
  167    2013867        282 other textures\keyboard_glyphs\p_y
  168    2014188        284 other textures\keyboard_glyphs\p_z
  169    2014514        358 other textures\keyboard_glyphs\p_zqsd
  170    2014914        254 other textures\speech_bubble\b_prompt
  171    2015216        238 other textures\speech_bubble\speechbubblene
  172    2015502        234 other textures\speech_bubble\speechbubblenw
  173    2015784        212 other textures\speech_bubble\speechbubblese
  174    2016044        238 other textures\speech_bubble\speechbubblesw
  175    2016332        230 other textures\speech_bubble\speechbubbletail
  176    2016598        324 other textures\thought_glyphs\b
  177    2016959        386 other textures\thought_glyphs\up
  178    2017380        420 other textures\black_hole\rips
  179    2017836        418 other textures\black_hole\stars
  180    2018289       3460 other textures\splash\polytron
  181    2021789       9470 other textures\splash\polytron_1440
  182    2031294       9124 other textures\splash\trapdoor
  183    2040452       5186 other textures\splash\trixels
  184    2045677       9132 other textures\splash\trixels_1440
  185    2054831     486056 fonts\chinese big
  186    2540911     242918 fonts\chinese small
  187    2783852     335020 fonts\japanese big
  188    3118897     161618 fonts\japanese small
  189    3280536     217780 fonts\korean big
  190    3498339     114350 fonts\korean small
  191    3612709       3860 fonts\latin big
  192    3616591       3866 fonts\latin small
  193    3620473       2572 fonts\zuish
  194    3623075    2326381 sounds\intro\fezlogodrone
  195    5949488      34247 sounds\intro\fezlogoglitch1
  196    5983767      34247 sounds\intro\fezlogoglitch2
  197    6018046      34247 sounds\intro\fezlogoglitch3
  198    6052321    1428457 sounds\intro\hyperspeed
  199    7480810     803029 sounds\intro\lensflaregleam
  200    8283865    1157729 sounds\intro\logozoom
  201    9441619    1411305 sounds\intro\pandown
  202   10852956     482837 sounds\intro\polytronjingle
  203   11335817     881177 sounds\intro\reboot
  204   12217020    1625505 sounds\intro\starzoom
  205   13842555     185953 sounds\intro\trixellogoin
  206   14028539     185953 sounds\intro\trixellogoout
  207   14214528    1197237 sounds\intro\zoomtofarawayplace
  208   15411794    1182929 sounds\intro\zoomtohouse
  209   16594746     103267 sounds\dot\comeout
  210   16698038      78067 sounds\dot\heylisten
  211   16776125     132405 sounds\dot\hide
  212   16908550     112717 sounds\dot\idle
  213   17021287     150125 sounds\dot\move
  214   17171432     604905 sounds\dot\talk
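A quick way to convince yourself that the record layout is right is to predict each data offset from the previous row. Here is a small sketch using the first four rows of the listing; the helper name next_data_offset is made up for this example.

```python
# (data offset, data size, file name) for the first four files,
# copied from the listing above.
listing = [
    (33, 203, "other textures\\fullwhite"),
    (272, 203, "other textures\\transparentwhite"),
    (504, 211, "other textures\\fullblack"),
    (745, 10150, "other textures\\smooth_ray"),
]

def next_data_offset(offset, size, next_name):
    # After the current file's data come: 1 byte name length,
    # the next file's name, and 4 bytes data size.
    return offset + size + 1 + len(next_name.encode()) + 4

for (offset, size, _), (expected, _, next_name) in zip(listing, listing[1:]):
    predicted = next_data_offset(offset, size, next_name)
    print(predicted, expected, predicted == expected)
# 272 272 True
# 504 504 True
# 745 745 True
```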
And finally we can also dump files from a FEZ game archive like this:
#!/usr/bin/env python3
import struct
import sys
import os

with open(sys.argv[1], 'rb') as fp:
    file_count, = struct.unpack("<I", fp.read(4))
    for file_index in range(file_count):
        file_name_length = fp.read(1)[0]
        file_name = fp.read(file_name_length).decode()
        file_size, = struct.unpack("<I", fp.read(4))
        file_data = fp.read(file_size)
        # The archive uses Windows path separators.
        file_path = file_name.replace('\\', os.sep)
        print(file_path)
        subdir = os.path.split(file_path)[0]
        # WARNING: If subdir would start with '/' ('C:\' or '\\' under Windows)
        # or otherwise escapes the current directory (e.g. using '..') this
        # will overwrite unexpected files. You need to check for that in a
        # proper program, but here I checked the file paths manually by
        # looking at the output of list_fez.py first.
        if subdir:
            # os.makedirs('') would fail for files without a subdirectory.
            os.makedirs(subdir, exist_ok=True)
        with open(file_path, 'wb') as outfp:
            outfp.write(file_data)
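The path check mentioned in the warning could look roughly like this. It's only a sketch of one possible approach, not what fezpak does; safe_target is a made-up helper name and dest_dir is a parameter I introduced for the example.

```python
import os

def safe_target(dest_dir, archive_name):
    """Map an archive file name to a path inside dest_dir.

    Hypothetical helper: raises ValueError if the name would end up
    outside of dest_dir (absolute paths, '..' components, empty names).
    """
    rel_path = archive_name.replace('\\', os.sep)
    dest_root = os.path.realpath(dest_dir)
    # os.path.join() discards dest_root if rel_path is absolute, and
    # realpath() resolves any '..' components, so both cases show up
    # as a target that is no longer below dest_root.
    target = os.path.realpath(os.path.join(dest_root, rel_path))
    if target == dest_root:
        raise ValueError("unsafe file name: %r" % archive_name)
    if os.path.commonpath([dest_root, target]) != dest_root:
        raise ValueError("unsafe file name: %r" % archive_name)
    return target

print(safe_target("out", "effects\\doteffect"))
try:
    safe_target("out", "..\\evil")
except ValueError as error:
    print(error)
```

In the dump script you would call something like this instead of using file_path directly, and create the parent directory of the returned path.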
With this knowledge I wrote a small Python program that can list, unpack, pack and even read-only mount FEZ game archives. See fezpak on GitHub.
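Writing an archive is just the reverse of reading it. Here is a minimal in-memory sketch of both directions, assuming file names are UTF-8 (that's what the scripts above implicitly use via decode()). The function names are made up for this example and fezpak itself may well do it differently.

```python
import io
import struct

def pack_entries(entries):
    """Build archive bytes from a list of (name, data) pairs."""
    out = io.BytesIO()
    out.write(struct.pack("<I", len(entries)))       # file count
    for name, data in entries:
        encoded = name.encode("utf-8")               # encoding is an assumption
        if len(encoded) > 255:
            raise ValueError("file name too long: %r" % name)
        out.write(bytes([len(encoded)]))             # 1 byte name length
        out.write(encoded)
        out.write(struct.pack("<I", len(data)))      # 4 bytes data size
        out.write(data)
    return out.getvalue()

def unpack_entries(blob):
    """Parse archive bytes back into a list of (name, data) pairs."""
    fp = io.BytesIO(blob)
    file_count, = struct.unpack("<I", fp.read(4))
    entries = []
    for _ in range(file_count):
        name_length = fp.read(1)[0]
        name = fp.read(name_length).decode("utf-8")
        size, = struct.unpack("<I", fp.read(4))
        entries.append((name, fp.read(size)))
    return entries

files = [("other textures\\demo", b"XNBw"), ("sounds\\beep", b"\x00\x01")]
blob = pack_entries(files)
print(unpack_entries(blob) == files)  # True
```

The 255 byte check matters because the name length has to fit into a single byte.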
Next time I'll talk about a more complicated file format, which I reverse engineered in order to make a simple sprite swap game mod.