Reverse engineering a simple game archive

Sometimes I like to poke around in the files of the games I bought on Steam, just for fun or maybe to listen to the game music. Most game engines put the game files in their own proprietary archive files, so I started to reverse engineer those formats, at least to the point where I could extract the files. Some game archive formats proved to be so simple that I could easily write archives, too. So I thought I'd write down how I did it; maybe someone else will find this useful.

The main tools I use are a hex editor (bless) and Python.

Short detour: I'm not 100% satisfied with bless, though. Let me show you what I mean:

I would like those fields in the status bar to be text fields that I can edit, making it easy to select a certain area of the file when I know its size in bytes. If anyone knows a (Linux) hex editor that does that, let me know!

But what I do like about bless is that it displays the bytes at the cursor in all possible number formats and it's dynamically resizable, making table structures in the file much more apparent. See:

So much for that. Now to the actual reverse engineering. Let's start with the simplest game archive format that I have seen: FEZ .pak files. Looking at the start of FEZ/Content/Essentials.pak we see this:

00000000: d700 0000 186f 7468 6572 2074 6578 7475  .....other textu
00000010: 7265 735c 6675 6c6c 7768 6974 65cb 0000  res\fullwhite...
00000020: 0058 4e42 7705 01cb 0000 0001 9401 4d69  .XNBw.........Mi
00000030: 6372 6f73 6f66 742e 586e 612e 4672 616d  crosoft.Xna.Fram
00000040: 6577 6f72 6b2e 436f 6e74 656e 742e 5465  ework.Content.Te
00000050: 7874 7572 6532 4452 6561 6465 722c 204d  xture2DReader, M
00000060: 6963 726f 736f 6674 2e58 6e61 2e46 7261  icrosoft.Xna.Fra
00000070: 6d65 776f 726b 2e47 7261 7068 6963 732c  mework.Graphics,
00000080: 2056 6572 7369 6f6e 3d34 2e30 2e30 2e30   Version=4.0.0.0
00000090: 2c20 4375 6c74 7572 653d 6e65 7574 7261  , Culture=neutra
000000a0: 6c2c 2050 7562 6c69 634b 6579 546f 6b65  l, PublicKeyToke
000000b0: 6e3d 3834 3263 6638 6265 3164 6535 3035  n=842cf8be1de505
000000c0: 3533 0000 0000 0001 0000 0000 0200 0000  53..............
000000d0: 0200 0000 0100 0000 1000 0000 ffff ffff  ................
000000e0: ffff ffff ffff ffff ffff ffff 1f6f 7468  .............oth
000000f0: 6572 2074 6578 7475 7265 735c 7472 616e  er textures\tran
00000100: 7370 6172 656e 7477 6869 7465 cb00 0000  sparentwhite....
00000110: 584e 4277 0501 cb00 0000 0194 014d 6963  XNBw.........Mic
00000120: 726f 736f 6674 2e58 6e61 2e46 7261 6d65  rosoft.Xna.Frame
00000130: 776f 726b 2e43 6f6e 7465 6e74 2e54 6578  work.Content.Tex
00000140: 7475 7265 3244 5265 6164 6572 2c20 4d69  ture2DReader, Mi
00000150: 6372 6f73 6f66 742e 586e 612e 4672 616d  crosoft.Xna.Fram
00000160: 6577 6f72 6b2e 4772 6170 6869 6373 2c20  ework.Graphics, 
00000170: 5665 7273 696f 6e3d 342e 302e 302e 302c  Version=4.0.0.0,
00000180: 2043 756c 7475 7265 3d6e 6575 7472 616c   Culture=neutral
00000190: 2c20 5075 626c 6963 4b65 7954 6f6b 656e  , PublicKeyToken
000001a0: 3d38 3432 6366 3862 6531 6465 3530 3535  =842cf8be1de5055
000001b0: 3300 0000 0000 0100 0000 0002 0000 0002  3...............
000001c0: 0000 0001 0000 0010 0000 00ff ffff 00ff  ................
000001d0: ffff 00ff ffff 00ff ffff 0018 6f74 6865  ............othe
...

So there are 5 bytes that aren't printable ASCII characters, and then there is a bit of ASCII. Those first bytes don't look like a file signature/magic, so I guess it's already data. But denoting what, and how? For the how: most CPU architectures today are little-endian, and Windows certainly only runs on little-endian systems (ignoring the Xbox 360). Little- and big-endian are ways to encode integers in memory: they define the order of the bytes that make up the number. A 32 bit integer consists of 4 bytes. In little-endian the least significant byte comes first; in big-endian it's the other way around. Almost everyone has strong opinions about which encoding is the "correct" one, but let's not go into that.
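To make the byte order concrete, here is a small sketch that decodes the first four bytes of the dump above both ways with Python's struct module. The variable names are only for illustration.

```python
import struct

# The first four bytes of Essentials.pak, as shown in the hex dump.
raw = bytes.fromhex("d7000000")

# "<I" = little-endian unsigned 32 bit, ">I" = big-endian unsigned 32 bit.
# unpack() returns a tuple, hence the trailing comma.
little, = struct.unpack("<I", raw)
big, = struct.unpack(">I", raw)

print(little)  # 215
print(big)     # 3607101440
```

Read as big-endian, the value is far too large to be a plausible count or size for a file of this kind, which is a hint that little-endian is the better guess.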

As a first assumption (which I haven't had to abandon so far), let's assume integers are stored in little-endian encoding. As a further assumption, let's say that sizes of embedded files and things like file counts are stored as unsigned 32 bit values. Only if the game files exceeded 4 GB (or 2 GB for signed values) would one need 64 bit integers.

So let's interpret the first 4 bytes as an unsigned 32 bit integer. What do we get? 215. That's a low enough number that it might be a file count. Let's run with that. But there are 5 non-ASCII bytes, so what is the next one? Well, file names usually are quite short, so maybe they are limited to 255 characters in FEZ and the length is stored in a single byte? Then we get 24, followed by 24 bytes of printable ASCII characters ("other textures\fullwhite") and then more non-ASCII bytes. So that seems right. How many non-ASCII bytes? 4! As a 32 bit integer it's the number 203. A file size? Maybe the next 203 bytes are the file data?
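The same guesses can be checked in a few lines of Python. This is only a sketch: the bytes are copied by hand from the hex dump above, and the field layout is the hypothesis being tested, not a documented format.

```python
import struct

# First 33 bytes of Essentials.pak, copied from the hex dump above.
header = bytes.fromhex(
    "d7000000"  # guessed: file count
    "18"        # guessed: length of the first file name
    "6f746865722074657874757265735c66756c6c7768697465"  # file name
    "cb000000"  # guessed: size of the first file's data
)

file_count, = struct.unpack_from("<I", header, 0)
name_length = header[4]
name = header[5:5 + name_length].decode("ascii")
file_size, = struct.unpack_from("<I", header, 5 + name_length)

print(file_count, name_length, name, file_size)
# 215 24 other textures\fullwhite 203
```

If the guess is right, the file data starts at offset 33 and the next file entry starts 203 bytes later, at offset 236 (0xec).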

Marking the bytes of the file count and the first file in the dump above gives this layout:

  • number of files: offset 0x00, 4 bytes (d7 00 00 00 = 215)
  • file name length: offset 0x04, 1 byte (18 = 24)
  • file name: offset 0x05, 24 bytes ("other textures\fullwhite")
  • file data size: offset 0x1d, 4 bytes (cb 00 00 00 = 203)
  • file data: offset 0x21, 203 bytes (up to and including offset 0xeb)

The byte at offset 0xec is 1f (31), which is exactly the length of the next file name, "other textures\transparentwhite".

Indeed it works to read the file like that (1 byte file name length, file name, 4 bytes file data length, file data, repeat). That was easy!

Using what we learned, we can write a small Python script (list_fez.py) to list the files of a FEZ game archive:

#!/usr/bin/env python3
import struct
import sys

with open(sys.argv[1], 'rb') as fp:
    # struct.unpack() reads binary data from a buffer.
    # < indicates little-endian and I stands for unsigned 32 bit integer.
    # It returns a tuple of all the values that were unpacked, hence the comma.
    file_count, = struct.unpack("<I", fp.read(4))

    for file_index in range(file_count):
        file_name_length = fp.read(1)[0]
        file_name = fp.read(file_name_length).decode()
        file_size, = struct.unpack("<I", fp.read(4))
        # Columns: index, offset of the file data, size of the file data, name.
        print("%5d %10d %10d %s" % (file_index, fp.tell(), file_size, file_name))
        # Skip over the file data (whence=1: relative to the current position).
        fp.seek(file_size, 1)

    # Check if there is some data at the end after all the files which
    # we haven't reverse engineered (spoiler: there isn't):
    file_table_end = fp.tell()
    fp.seek(0, 2)  # whence=2: relative to the end of the file
    tail_byte_count = fp.tell() - file_table_end
    if tail_byte_count != 0:
        print("ERROR: %d bytes are not accounted for" % tail_byte_count)

Which produces this output if you pass Essentials.pak to it:

    0         33        203 other textures\fullwhite
    1        272        203 other textures\transparentwhite
    2        504        211 other textures\fullblack
    3        745      10150 other textures\smooth_ray
    4      10928      44506 other textures\rainbow_flare
    5      55465       3980 other textures\flare_alpha
    6      59480        210 background planes\white_square
    7      59726        251 background planes\dust_particle
    8      60005       3016 effects\basicposteffect
    9      63043       3928 effects\doteffect
   10      67012      22804 effects\instancedanimatedplaneeffect
   11      89848      10192 effects\animatedplaneeffect
   12     100075       3680 effects\fakepointspriteseffect
   13     103790       5760 effects\defaulteffect_textured
   14     109590       5148 effects\defaulteffect_vertexcolored
   15     114781       7288 effects\defaulteffect_litvertexcolored
   16     122107       7624 effects\defaulteffect_littextured
   17     129779       5644 effects\defaulteffect_texturedvertexcolored
   18     135474       7672 effects\defaulteffect_littexturedvertexcolored
   19     143183      17316 effects\instancedblackholeeffect
   20     160527     128137 sounds\ui\menu\exitgame
   21     288691      42781 sounds\ui\menu\confirm
   22     331502     151699 sounds\nature\watersplash
   23     483237    1197237 sounds\intro\zoomtofarawayplace
   24    1680502     264705 sounds\zu\blackholebuzz
   25    1945232      21014 resources\statictext
   26    1966280        278 other textures\glyphs\abutton
   27    1966597        278 other textures\glyphs\abutton_sony
   28    1966912        308 other textures\glyphs\backbutton
   29    1967262        334 other textures\glyphs\backbutton_sony
   30    1967630        278 other textures\glyphs\bbutton
   31    1967947        278 other textures\glyphs\bbutton_sony
   32    1968260        260 other textures\glyphs\dpaddown
   33    1968560        280 other textures\glyphs\dpaddown_sony
   34    1968875        262 other textures\glyphs\dpadleft
   35    1969177        284 other textures\glyphs\dpadleft_sony
   36    1969497        260 other textures\glyphs\dpadright
   37    1969798        284 other textures\glyphs\dpadright_sony
   38    1970115        258 other textures\glyphs\dpadup
   39    1970411        280 other textures\glyphs\dpadup_sony
   40    1970727        316 other textures\glyphs\leftarrow
   41    1971080        288 other textures\glyphs\leftbumper
   42    1971409        294 other textures\glyphs\leftbumper_ps3
   43    1971744        308 other textures\glyphs\leftbumper_ps4
   44    1972088        302 other textures\glyphs\leftstick
   45    1972431        264 other textures\glyphs\leftstick_sony
   46    1972733        266 other textures\glyphs\lefttrigger
   47    1973041        276 other textures\glyphs\lefttrigger_ps3
   48    1973359        278 other textures\glyphs\lefttrigger_ps4
   49    1973674        316 other textures\glyphs\rightarrow
   50    1974028        298 other textures\glyphs\rightbumper
   51    1974368        296 other textures\glyphs\rightbumper_ps3
   52    1974706        314 other textures\glyphs\rightbumper_ps4
   53    1975057        310 other textures\glyphs\rightstick
   54    1975409        278 other textures\glyphs\rightstick_sony
   55    1975726        276 other textures\glyphs\righttrigger
   56    1976045        286 other textures\glyphs\righttrigger_ps3
   57    1976374        292 other textures\glyphs\righttrigger_ps4
   58    1976704        322 other textures\glyphs\startbutton
   59    1977069        304 other textures\glyphs\startbutton_sony
   60    1977407        280 other textures\glyphs\xbutton
   61    1977726        278 other textures\glyphs\xbutton_sony
   62    1978038        286 other textures\glyphs\ybutton
   63    1978363        280 other textures\glyphs\ybutton_sony
   64    1978682        276 other textures\keyboard_glyphs\p_0
   65    1978997        266 other textures\keyboard_glyphs\p_1
   66    1979302        284 other textures\keyboard_glyphs\p_2
   67    1979625        278 other textures\keyboard_glyphs\p_3
   68    1979942        278 other textures\keyboard_glyphs\p_4
   69    1980259        282 other textures\keyboard_glyphs\p_5
   70    1980580        282 other textures\keyboard_glyphs\p_6
   71    1980901        278 other textures\keyboard_glyphs\p_7
   72    1981218        274 other textures\keyboard_glyphs\p_8
   73    1981531        280 other textures\keyboard_glyphs\p_9
   74    1981850        280 other textures\keyboard_glyphs\p_a
   75    1982174        352 other textures\keyboard_glyphs\p_arrows
   76    1982574        272 other textures\keyboard_glyphs\p_arrow_down
   77    1982894        280 other textures\keyboard_glyphs\p_arrow_left
   78    1983223        280 other textures\keyboard_glyphs\p_arrow_right
   79    1983549        276 other textures\keyboard_glyphs\p_arrow_up
   80    1983864        270 other textures\keyboard_glyphs\p_b
   81    1984181        366 other textures\keyboard_glyphs\p_backspace
   82    1984586        280 other textures\keyboard_glyphs\p_c
   83    1984908        358 other textures\keyboard_glyphs\p_caps
   84    1985309        268 other textures\keyboard_glyphs\p_comma
   85    1985616        274 other textures\keyboard_glyphs\p_d
   86    1985934        346 other textures\keyboard_glyphs\p_delete
   87    1986324        282 other textures\keyboard_glyphs\p_divide
   88    1986645        278 other textures\keyboard_glyphs\p_e
   89    1986964        312 other textures\keyboard_glyphs\p_end
   90    1987319        358 other textures\keyboard_glyphs\p_enter
   91    1987720        266 other textures\keyboard_glyphs\p_equal
   92    1988027        320 other textures\keyboard_glyphs\p_esc
   93    1988391        356 other textures\keyboard_glyphs\p_escape
   94    1988789        334 other textures\keyboard_glyphs\p_esdf
   95    1989162        278 other textures\keyboard_glyphs\p_f
   96    1989480        290 other textures\keyboard_glyphs\p_f0
   97    1989810        280 other textures\keyboard_glyphs\p_f1
   98    1990131        290 other textures\keyboard_glyphs\p_f10
   99    1990462        278 other textures\keyboard_glyphs\p_f11
  100    1990781        290 other textures\keyboard_glyphs\p_f12
  101    1991111        294 other textures\keyboard_glyphs\p_f2
  102    1991445        288 other textures\keyboard_glyphs\p_f3
  103    1991773        292 other textures\keyboard_glyphs\p_f4
  104    1992105        288 other textures\keyboard_glyphs\p_f5
  105    1992433        292 other textures\keyboard_glyphs\p_f6
  106    1992765        286 other textures\keyboard_glyphs\p_f7
  107    1993091        286 other textures\keyboard_glyphs\p_f8
  108    1993417        290 other textures\keyboard_glyphs\p_f9
  109    1993746        282 other textures\keyboard_glyphs\p_g
  110    1994067        278 other textures\keyboard_glyphs\p_h
  111    1994387        320 other textures\keyboard_glyphs\p_home
  112    1994746        266 other textures\keyboard_glyphs\p_i
  113    1995054        328 other textures\keyboard_glyphs\p_ijkl
  114    1995426        360 other textures\keyboard_glyphs\p_insert
  115    1995825        272 other textures\keyboard_glyphs\p_j
  116    1996136        284 other textures\keyboard_glyphs\p_k
  117    1996459        262 other textures\keyboard_glyphs\p_l
  118    1996764        318 other textures\keyboard_glyphs\p_l_alt
  119    1997129        272 other textures\keyboard_glyphs\p_l_bracket
  120    1997448        352 other textures\keyboard_glyphs\p_l_control
  121    1997845        348 other textures\keyboard_glyphs\p_l_shift
  122    1998240        356 other textures\keyboard_glyphs\p_l_windows
  123    1998635        286 other textures\keyboard_glyphs\p_m
  124    1998964        268 other textures\keyboard_glyphs\p_minus
  125    1999278        286 other textures\keyboard_glyphs\p_multiply
  126    1999603        278 other textures\keyboard_glyphs\p_n
  127    1999923        320 other textures\keyboard_glyphs\p_none
  128    2000286        324 other textures\keyboard_glyphs\p_num_0
  129    2000653        318 other textures\keyboard_glyphs\p_num_1
  130    2001014        330 other textures\keyboard_glyphs\p_num_2
  131    2001387        326 other textures\keyboard_glyphs\p_num_3
  132    2001756        320 other textures\keyboard_glyphs\p_num_4
  133    2002119        328 other textures\keyboard_glyphs\p_num_5
  134    2002490        326 other textures\keyboard_glyphs\p_num_6
  135    2002859        322 other textures\keyboard_glyphs\p_num_7
  136    2003224        324 other textures\keyboard_glyphs\p_num_8
  137    2003591        322 other textures\keyboard_glyphs\p_num_9
  138    2003959        352 other textures\keyboard_glyphs\p_num_lock
  139    2004350        276 other textures\keyboard_glyphs\p_o
  140    2004665        274 other textures\keyboard_glyphs\p_p
  141    2004986        356 other textures\keyboard_glyphs\p_page_down
  142    2005387        312 other textures\keyboard_glyphs\p_page_up
  143    2005743        266 other textures\keyboard_glyphs\p_period
  144    2006051        260 other textures\keyboard_glyphs\p_pipe
  145    2006353        278 other textures\keyboard_glyphs\p_plus
  146    2006670        276 other textures\keyboard_glyphs\p_q
  147    2006992        274 other textures\keyboard_glyphs\p_question
  148    2007310        268 other textures\keyboard_glyphs\p_quotes
  149    2007617        282 other textures\keyboard_glyphs\p_r
  150    2007942        316 other textures\keyboard_glyphs\p_r_alt
  151    2008305        270 other textures\keyboard_glyphs\p_r_bracket
  152    2008622        352 other textures\keyboard_glyphs\p_r_control
  153    2009019        342 other textures\keyboard_glyphs\p_r_shift
  154    2009408        350 other textures\keyboard_glyphs\p_r_windows
  155    2009797        280 other textures\keyboard_glyphs\p_s
  156    2010121        354 other textures\keyboard_glyphs\p_scroll
  157    2010522        268 other textures\keyboard_glyphs\p_semicolon
  158    2010833        354 other textures\keyboard_glyphs\p_space
  159    2011226        274 other textures\keyboard_glyphs\p_t
  160    2011541        308 other textures\keyboard_glyphs\p_tab
  161    2011892        280 other textures\keyboard_glyphs\p_tilde
  162    2012211        272 other textures\keyboard_glyphs\p_u
  163    2012522        282 other textures\keyboard_glyphs\p_v
  164    2012843        276 other textures\keyboard_glyphs\p_w
  165    2013161        340 other textures\keyboard_glyphs\p_wasd
  166    2013540        288 other textures\keyboard_glyphs\p_x
  167    2013867        282 other textures\keyboard_glyphs\p_y
  168    2014188        284 other textures\keyboard_glyphs\p_z
  169    2014514        358 other textures\keyboard_glyphs\p_zqsd
  170    2014914        254 other textures\speech_bubble\b_prompt
  171    2015216        238 other textures\speech_bubble\speechbubblene
  172    2015502        234 other textures\speech_bubble\speechbubblenw
  173    2015784        212 other textures\speech_bubble\speechbubblese
  174    2016044        238 other textures\speech_bubble\speechbubblesw
  175    2016332        230 other textures\speech_bubble\speechbubbletail
  176    2016598        324 other textures\thought_glyphs\b
  177    2016959        386 other textures\thought_glyphs\up
  178    2017380        420 other textures\black_hole\rips
  179    2017836        418 other textures\black_hole\stars
  180    2018289       3460 other textures\splash\polytron
  181    2021789       9470 other textures\splash\polytron_1440
  182    2031294       9124 other textures\splash\trapdoor
  183    2040452       5186 other textures\splash\trixels
  184    2045677       9132 other textures\splash\trixels_1440
  185    2054831     486056 fonts\chinese big
  186    2540911     242918 fonts\chinese small
  187    2783852     335020 fonts\japanese big
  188    3118897     161618 fonts\japanese small
  189    3280536     217780 fonts\korean big
  190    3498339     114350 fonts\korean small
  191    3612709       3860 fonts\latin big
  192    3616591       3866 fonts\latin small
  193    3620473       2572 fonts\zuish
  194    3623075    2326381 sounds\intro\fezlogodrone
  195    5949488      34247 sounds\intro\fezlogoglitch1
  196    5983767      34247 sounds\intro\fezlogoglitch2
  197    6018046      34247 sounds\intro\fezlogoglitch3
  198    6052321    1428457 sounds\intro\hyperspeed
  199    7480810     803029 sounds\intro\lensflaregleam
  200    8283865    1157729 sounds\intro\logozoom
  201    9441619    1411305 sounds\intro\pandown
  202   10852956     482837 sounds\intro\polytronjingle
  203   11335817     881177 sounds\intro\reboot
  204   12217020    1625505 sounds\intro\starzoom
  205   13842555     185953 sounds\intro\trixellogoin
  206   14028539     185953 sounds\intro\trixellogoout
  207   14214528    1197237 sounds\intro\zoomtofarawayplace
  208   15411794    1182929 sounds\intro\zoomtohouse
  209   16594746     103267 sounds\dot\comeout
  210   16698038      78067 sounds\dot\heylisten
  211   16776125     132405 sounds\dot\hide
  212   16908550     112717 sounds\dot\idle
  213   17021287     150125 sounds\dot\move
  214   17171432     604905 sounds\dot\talk

And finally we can also dump the files from a FEZ game archive with a second script (dump_fez.py):

#!/usr/bin/env python3
import struct
import sys
import os

with open(sys.argv[1], 'rb') as fp:
    file_count, = struct.unpack("<I", fp.read(4))

    for file_index in range(file_count):
        file_name_length = fp.read(1)[0]
        file_name = fp.read(file_name_length).decode()
        file_size, = struct.unpack("<I", fp.read(4))
        file_data = fp.read(file_size)

        # Archive names use backslashes; convert them to the local separator.
        file_path = file_name.replace('\\', os.sep)
        print(file_path)
        subdir = os.path.split(file_path)[0]

        # WARNING: If subdir would start with '/' ('C:\' or '\\' under Windows)
        # or otherwise escapes the current directory (e.g. using '..') this
        # will overwrite unexpected files. You need to check for that in a
        # proper program, but here I checked the file paths manually by
        # looking at the output of list_fez.py first.
        # os.makedirs('') raises an error, so skip names without a directory.
        if subdir:
            os.makedirs(subdir, exist_ok=True)

        with open(file_path, 'wb') as outfp:
            outfp.write(file_data)
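The warning in the script's comments can be turned into an actual check. Below is one possible sketch, not the check used by any existing tool: `safe_join` is a hypothetical helper name, and the approach (resolve the target path and reject anything that ends up outside the output directory) is just one of several reasonable ways to do it.

```python
import os

def safe_join(base_dir, archive_name):
    """Map an archive file name to a path below base_dir, or raise ValueError."""
    # Archive names use backslashes as separators.
    relative = archive_name.replace("\\", os.sep)
    base = os.path.realpath(base_dir)
    # For an absolute `relative`, os.path.join() discards `base` entirely,
    # which the check below then catches.
    target = os.path.realpath(os.path.join(base, relative))
    # commonpath() is the deepest directory shared by both paths; if that is
    # not base itself, target lies outside of base.
    if os.path.commonpath([base, target]) != base:
        raise ValueError("unsafe path in archive: %r" % archive_name)
    return target
```

In the dump script one would call something like `safe_join("out", file_name)` instead of building `file_path` directly, and create the parent directory of the returned path before writing.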

With this knowledge I wrote a small Python program that can list, unpack, pack and even read-only mount FEZ game archives. See fezpak on GitHub.
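Writing an archive is just the read loop in reverse. The following is a minimal sketch based only on the layout described in this post, not the actual fezpak code. `pack_fez` and `unpack_fez` are names made up for this example, and it assumes that an encoded file name never exceeds 255 bytes, since the length has to fit in one byte.

```python
import io
import struct

def pack_fez(files):
    """Serialize {archive name: bytes} into the layout described above."""
    out = io.BytesIO()
    out.write(struct.pack("<I", len(files)))       # 4 bytes: file count
    for name, data in files.items():
        encoded = name.encode()
        if len(encoded) > 255:
            raise ValueError("file name too long: %r" % name)
        out.write(bytes([len(encoded)]))           # 1 byte: name length
        out.write(encoded)                         # file name
        out.write(struct.pack("<I", len(data)))    # 4 bytes: data size
        out.write(data)                            # file data
    return out.getvalue()

def unpack_fez(blob):
    """Inverse of pack_fez(), using the same read loop as the scripts above."""
    fp = io.BytesIO(blob)
    file_count, = struct.unpack("<I", fp.read(4))
    files = {}
    for _ in range(file_count):
        name = fp.read(fp.read(1)[0]).decode()
        size, = struct.unpack("<I", fp.read(4))
        files[name] = fp.read(size)
    return files

# Round trip: packing and unpacking should give back the same files.
example = {"other textures\\demo": b"\x01\x02\x03"}
blob = pack_fez(example)
print(blob.hex())
assert unpack_fez(blob) == example
```

A dict keeps the sketch short, but it cannot represent an archive in which the same name occurs twice (as sounds\intro\zoomtofarawayplace does in the listing above); a list of (name, data) pairs would be needed for that.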

Next time I'll talk about a more complicated file format, which I reverse engineered in order to make a simple sprite swap game mod.
