Not just code, but data too

Originally the project included a lot of data structures dumped directly from the ROMs, such as sprite animation, palettes, hitboxes, AI bytecode and more. The only data not ported were the actual tiles; the game depended on four tile ROMs from the original game to be present.

Since all this data from the code ROMs is a considerable part of the original code, I decided to pull it all out and require the entire ROM set to be present, basically as much as you’d need to run the game in MAME.

This has its own difficulties. For a start, the 68000 is big-endian and most host machines are not, so we need to do endian-swapping each time we read anything larger than a byte. I wondered how much this would impact performance, but it turns out not very much. Some CPUs even have special instructions just to do this.
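As a rough illustration of the byte-swapping involved, here is a minimal sketch of reading big-endian values out of a ROM image. The names rom_read16 and rom_read32, and the idea of passing the image as a plain byte buffer, are assumptions made for this example and not the project's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: read big-endian (68000 order) values from a ROM
   image held as raw bytes. Building the value byte by byte gives the
   same result on any host, whatever its native endianness. */
static uint16_t rom_read16(const uint8_t *rom, size_t rom_size, size_t addr)
{
    assert(addr + 2 <= rom_size);   /* stay inside the image */
    return (uint16_t)((rom[addr] << 8) | rom[addr + 1]);
}

static uint32_t rom_read32(const uint8_t *rom, size_t rom_size, size_t addr)
{
    assert(addr + 4 <= rom_size);
    return ((uint32_t)rom[addr] << 24) | ((uint32_t)rom[addr + 1] << 16) |
           ((uint32_t)rom[addr + 2] << 8) | (uint32_t)rom[addr + 3];
}
```

Written this way there is no explicit swap step at all, and an optimising compiler may well reduce each function to a load plus a byte-swap instruction where the CPU has one.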

It also has some benefits. Getting the data out of the ROMs and into C-style structs and byte arrays was error-prone, and the utility I wrote to do this sometimes messed up and produced incomplete or corrupt data. Many arrays are jagged. Sometimes sentinel elements are used. Sometimes I had to guess lengths of arrays and got it wrong. Going to the ROMs for all this ensures nothing is missing.

I wrote a compatibility layer to take care of locating these structures and doing the necessary byte-swapping.

As an example, here’s how a 2D jagged array of words would be stored.

00001ff0 lea 0x2000(%pc), %a6    ;get address of row table
00001ff4 move.w (%a6, %d0.w), %d0  ;get offset of row indexed by %d0
00001ff8 lea (%d0, %a6), %a6     ;add offset to base address
00001ffc jmp some_routine_that_uses_this_data

00002000 dc.w 0x0008   ;first row starts at 0x2000 + 0x8 = 0x2008
00002002 dc.w 0x0018   ;next row starts at 0x2000 + 0x18 = 0x2018
00002004 dc.w 0x001a   ;and so on
00002006 dc.w 0x0020

00002008 dc.w 0x1234   ;first element of first row
0000200a dc.w 0x5678   ;second element of first row
....
00002016 dc.w 0x8888   ;last element of first row
00002018 dc.w 0x1235   ;first (and only) element of second row
....
0000201a dc.w 0x1236   ;first element of third row

I guess they did it this way to save a little space at the cost of a couple of extra instructions, as opposed to just using 32-bit pointers, although there was plenty of free space in the ROMs.
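Read from C, the lookup in the listing above might look something like the sketch below. The function names and the bounds checks are assumptions for illustration. The offsets are treated as unsigned here, which is enough for the small tables shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 16-bit word from the ROM image. */
static uint16_t be16(const uint8_t *rom, size_t addr)
{
    return (uint16_t)((rom[addr] << 8) | rom[addr + 1]);
}

/* Hypothetical accessor: table_addr is the ROM address of the row-offset
   table (0x2000 in the listing). Each table entry is a 16-bit offset,
   relative to table_addr, of the start of that row. */
static uint16_t jagged_word(const uint8_t *rom, size_t rom_size,
                            size_t table_addr, unsigned row, unsigned col)
{
    size_t entry = table_addr + 2u * row;      /* row-table entry address */
    assert(entry + 2 <= rom_size);
    size_t elem = table_addr + be16(rom, entry) + 2u * col;
    assert(elem + 2 <= rom_size);
    return be16(rom, elem);
}
```

Note that nothing in the table records how long each row is, so the caller still has to know the length or watch for a sentinel element.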

Switch-style statements are very similar (and very frequently used throughout the code). Case variables are often kept to even numbers to avoid having to do pointer arithmetic. An example from the state-machine:

000077dc  302d 000c   move.w (game_mode_3, %a5), %d0
000077e0  323b 0006   move.w (#6, %pc, %d0.w), %d1
000077e4  4efb 1002   jmp    (#2, %pc, %d1.w)
;the jump-offset table follows
000077e8  000a        dc.w 0x000a ;jump to 0x77f2
000077ea  0016        dc.w 0x0016 ;jump to 0x77fe
;... several more ...
; here is case 0:
000077f2  546d 000c   addq.w #2, (game_mode_3, %a5) ;next time do the next case
000077f6  4eb8 2794   jsr 0x2794   ; reset display state
000077fa  4ef8 2c8a   jmp 0x2c8a   ; initialize both players (tail call)

000077fe  546d 000c   addq.w #2, (game_mode_3, %a5) ;again, next time do the next case
00007802  4eb8 1698   jsr 0x1698   ; set up palettes
00007806  4ef8 1ae2   jmp 0x1ae2   ; set up scrolls (tail call again)

This is more or less equivalent to

switch (g_game_mode_3) {
    case 0:
        g_game_mode_3 += 2;
        reset_display_state();
        init_players();
        break;
    case 2:
        g_game_mode_3 += 2;
        set_up_palettes();
        set_up_scrolls();
        break;
    /* more cases */
}

If the case is odd, the machine will crash with an address error exception, because the 68000 cannot read a word from an odd address. This is why 2 is added to g_game_mode_3. If the case is beyond the end of the jump-offset table, the behaviour is undefined. I’ve put default clauses in all my switch statements to land in the debugger to let me know something’s up.
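A minimal sketch of that kind of guard follows. The helper case_is_valid, the two-case table and the use of assert as the way into the debugger are all assumptions for this example; the real handlers are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check mirroring the jump-offset table: a case value is
   usable only if it is even and indexes an existing 2-byte table entry. */
static bool case_is_valid(unsigned value, unsigned table_entries)
{
    return (value % 2u == 0u) && (value / 2u < table_entries);
}

static int g_game_mode_3;   /* state variable from the example above */

static void step_state_machine(void)
{
    switch (g_game_mode_3) {
    case 0:
        g_game_mode_3 += 2;     /* next call runs case 2 */
        break;
    case 2:
        g_game_mode_3 += 2;
        break;
    default:
        /* Odd or out-of-range value: the original would crash or jump
           somewhere arbitrary. In a debug build, stop here instead. */
        assert(0 && "unexpected g_game_mode_3 value");
        break;
    }
}
```

Run under a debugger, the failed assert stops execution at the offending call, which is roughly the effect described above.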

15 thoughts on “Not just code, but data too”

  1. Jordi says:

    SF2 was originally coded by 4 programmers using previous knowledge and libraries from older CAPCOM games. You have disassembled, labeled and understood the code, outstanding! I can’t imagine how many hours you have invested. 😉

    In 1992, Hung Hsi Enterprise Taiwan hacked SF2 Champion edition creating SF2 Rainbow edition, do you think someone inside CAPCOM leaked the source code or did they do a full disassembly in order to modify the game?

    Thanks!

    1. I’ve always wondered how the clones were done and whether they got hold of a leaked source. I haven’t got around to looking at their ROMs yet, but it seems like a lot of the hacks could have been done by changing a few vars and patching out jumps. Maybe one day I’ll look into them. Thanks!

    1. I moved house this month and one of my jobs has got me pretty busy, which is good because the move was costly and I need to get some money in the bank.

      Don’t worry, I’ll be back! Thanks for checking in.

  2. John says:

    Hi SF2PLATINUM, could you upload some portions of your C code for people to view? I’m no good with assembler but can read C code. I would like to know how the original AI performs. Thank you.

  3. JORDI says:

    Hi SF2PLATINUM, how are you? So long without an update… 😦 I understand you may be busy, but have you considered releasing some files of code? I’m not interested in compiling the full game in C, but looking at some units like the SF2 AI would be nice. Would that be possible?

    1. About a year ago I modified my code so it pulls stuff like AI data directly from the ROMs so there’s no risk of my shipping someone else’s code. This means it should be easier for me to share the AI code, but unfortunately at this stage the AI is probably the buggiest of the lot. I’ll hopefully get time around Xmas to look at this.

  4. denpashogai says:

    I hope you find time to pick up this project again! It’s something I wanted to do for many years but just didn’t get there.

    1. Got some plans to gut and rewrite the OpenGL layer. When I wrote it I was pretty new to OpenGL and did it all in what I later discovered to be legacy OpenGL, and used some silly hacks to do things like indexed colour textures that are much easier with shaders.

  5. Denny says:

    The thing that would interest me the most: How was the general AI implemented? I assume the game uses a decision tree and fetched character-specific values from data in ROM for it. But that decision tree itself, that’s something I’d like to know more about. What conditions did the program check for? How did the whole AI routine “behave”? Could you please tell us more about it?

  6. AI seems to be the area most people are interested in, so I’ll make this my next post. I know I haven’t updated in ages (gotta pay the bills! I’ve been really busy), but I’m hoping to do a bit of work on the project these summer holidays.

  7. JORDI says:

    So happy to see movement here!
    SF2PLATINUM, as I said a year ago, please consider releasing some parts of the code! It doesn’t need to compile; I’d just like to see some parts (even if uncommented) for study.
    SF2 is so old no one could make profit viewing its code.

    Thx!
