No new build to share but just a quick note that my crazy "DLL" scheme worked for transparently spreading gameplay code across multiple SWRAM banks.
Because the code already has a jump table interface between code modules, I moved all of these jump tables to main RAM (above PAGE, below SHADOW) and changed the jmp opcodes to macros:
Code:
\*-------------------------------
\* auto.asm
\*-------------------------------
.AutoCtrl JUMP_A AUTOCTRL, AUTO_BASE, 0
.checkstrike JUMP_A CHECKSTRIKE, AUTO_BASE, 1
.checkstab JUMP_A CHECKSTAB, AUTO_BASE, 2
.AutoPlayback JUMP_A AUTOPLAYBACK, AUTO_BASE, 3
.cutcheck JUMP_A CUTCHECK, AUTO_BASE, 4
.cutguard JUMP_A CUTGUARD, AUTO_BASE, 5
.addguard JUMP_A ADDGUARD, AUTO_BASE, 6
.cut JUMP_A CUT, AUTO_BASE, 7
Where the JUMP_A macro looks like:
Code:
MACRO JUMP_A name, base, index
{
    \\ Preserve X
    STX DLL_REG_X
    \\ Load function index
    LDX #(base + index)
    \\ Call jump function
    JMP jump_to_A
}
ENDMACRO ; 3c+2c+3c = 8c + 7b overhead per fn call
And the jump_to_A function contains:
Code:
.jump_to_A
{
    \\ Preserve A
    STA DLL_REG_A
    \\ Remember current bank
    LDA &F4: PHA
    LDA #6 ; hard code this aux A = 6
    STA &F4: STA &FE30
    \\ Load function address
    LDA aux_core_fn_table_A_LO, X
    STA jump_to_addr_A + 1
    LDA aux_core_fn_table_A_HI, X
    STA jump_to_addr_A + 2
    \\ Restore registers before fn call
    LDX DLL_REG_X
    LDA DLL_REG_A
}
\\ Call function
.jump_to_addr_A
    JSR &FFFF
{
    \\ Preserve A
    STA DLL_REG_A
    \\ Restore original bank
    PLA
    STA &F4: STA &FE30
    \\ Restore A before return
    LDA DLL_REG_A
    RTS
} ; 3c+3c+3c+2c+3c+4c+4c+4c+4c+4c+3c+3c+6c+3c+4c+3c+4c+3c+6c = 69c overhead
So that's a hefty cycle overhead per function call (8c in the stub plus 69c in jump_to_A) and a not insignificant memory overhead per stub (7 bytes), but the game seems to run OK! I guess the increase in CPU speed is enough to compensate for this, and it's likely that the screen plot routines dominate the frame time anyway (and they are resident in main RAM).
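For anyone who wants to sanity-check those numbers, here is a small Python tally of the cycle counts from the two comments above. It is only a rough sketch: the per-instruction costs are copied from the listings (so they assume DLL_REG_A and DLL_REG_X live in zero page and that the table reads don't cross a page), and the 40,000 cycles per frame and the 50 calls per frame are my own illustrative assumptions (2 MHz CPU, one 50 Hz display frame), not measurements from the game.

```python
# Cycle costs copied from the comments in the listings above.
# Assumes zero-page DLL_REG_A / DLL_REG_X and no page-crossing penalties.
STUB_CYCLES = [3, 2, 3]  # STX zp, LDX #imm, JMP abs
TRAMPOLINE_CYCLES = [3, 3, 3, 2, 3, 4, 4, 4, 4, 4, 3, 3, 6, 3, 4, 3, 4, 3, 6]

# Assumption: 2 MHz CPU and a 50 Hz display, i.e. 40,000 cycles per frame.
CYCLES_PER_FRAME = 2_000_000 // 50


def call_overhead() -> int:
    """Total extra cycles for one call routed through the DLL stub."""
    return sum(STUB_CYCLES) + sum(TRAMPOLINE_CYCLES)


def frame_share(calls_per_frame: int) -> float:
    """Fraction of one display frame spent purely on stub overhead."""
    return calls_per_frame * call_overhead() / CYCLES_PER_FRAME


if __name__ == "__main__":
    print("overhead per call:", call_overhead(), "cycles")
    # 50 calls per frame is a made-up figure for illustration only.
    print("share of frame at 50 calls:", f"{frame_share(50):.1%}")
```

If the real call count per game tick turns out to be in the tens rather than the hundreds, that would be consistent with the overhead not being noticeable next to the plot routines.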
This only works if (non-sprite) data is either permanently resident (e.g. in HAZEL) or only accessed through a code interface. Fortunately the animation frame data and sequence tables are accessed this way, so they can be grouped with the corresponding module.
Because the gameplay functions can be called from main RAM or from either SWRAM bank arbitrarily, the original bank has to be restored for the caller, otherwise we can return into the wrong place when these calls are chained together. Also, there are instances where the sprite data is temporarily paged in to check the size of frames for collision detection etc., so this needs to be handled carefully.
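To illustrate why the save and restore of the bank matters, here is a toy Python model of chained calls through the stub. It is a sketch rather than the real code: only bank 6 for aux A comes from the listing above, while the aux B bank (5), the sprite bank (4) and the function names are hypothetical values picked for the example.

```python
AUX_A = 6        # from the listing: "hard code this aux A = 6"
AUX_B = 5        # hypothetical second code bank
SPRITE_BANK = 4  # hypothetical bank paged in by the main-RAM caller


class Machine:
    """Toy model: the currently paged-in bank plus the 6502 stack."""

    def __init__(self, bank):
        self.bank = bank   # models the &F4 copy / &FE30 latch
        self.stack = []    # models PHA / PLA

    def call_via_stub(self, target_bank, fn, restore=True):
        self.stack.append(self.bank)   # LDA &F4: PHA
        self.bank = target_bank        # STA &F4: STA &FE30
        fn(self)                       # JSR to the real function
        saved = self.stack.pop()       # PLA
        if restore:
            self.bank = saved          # STA &F4: STA &FE30


def run_chain(restore):
    """Main RAM calls a bank A function, which calls a bank B function."""
    machine = Machine(SPRITE_BANK)
    seen = []

    def inner(m):
        seen.append(("inner", m.bank))

    def outer(m):
        m.call_via_stub(AUX_B, inner, restore)
        seen.append(("outer resumes", m.bank))

    machine.call_via_stub(AUX_A, outer, restore)
    seen.append(("caller resumes", machine.bank))
    return seen


if __name__ == "__main__":
    print(run_chain(restore=True))
    print(run_chain(restore=False))
```

With the restore in place, each level resumes with the bank it started in. Without it, the outer function resumes with bank B paged in while its own code lives in bank A, which is the "return into the wrong place" case.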
Later on, once the memory map settles down, any function calls between code modules within the same SWRAM bank can be made direct rather than indirected through the DLL stub to save unnecessary cycles.
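As a rough sketch of how that later pass could be planned, the Python below classifies calls between modules as "direct" or "stub" from a module-to-bank map. Everything except "auto" living in bank 6 is hypothetical: the other module names, their banks, the call list and the call counts are placeholders, and the 77 cycle figure is simply the 8c + 69c from the comments above.

```python
STUB_OVERHEAD_CYCLES = 8 + 69  # stub + jump_to_A, from the comments above

# Hypothetical layout: only "auto" in bank 6 is taken from the post.
MODULE_BANK = {"auto": 6, "mover": 6, "coll": 5}

# Hypothetical (caller, callee) pairs between modules.
CALLS = [("auto", "mover"), ("auto", "coll"), ("coll", "mover")]


def plan_calls(calls, module_bank):
    """Mark each call 'direct' if both modules share a bank, else 'stub'."""
    return {
        (caller, callee): (
            "direct" if module_bank[caller] == module_bank[callee] else "stub"
        )
        for caller, callee in calls
    }


def cycles_saved(plan, call_counts):
    """Cycles saved by bypassing the stub on every 'direct' call."""
    return sum(
        STUB_OVERHEAD_CYCLES * call_counts.get(edge, 0)
        for edge, kind in plan.items()
        if kind == "direct"
    )


if __name__ == "__main__":
    plan = plan_calls(CALLS, MODULE_BANK)
    print(plan)
    print(cycles_saved(plan, {("auto", "mover"): 10}))
```

The stub entries themselves would presumably still need to stay for callers in main RAM or in the other bank; only the call sites inside the same bank change.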
Obviously if you were writing a Master-only game from scratch you wouldn't add such an unnecessary overhead, but it seems to be working OK and gives me lots of flexibility to continue moving code around and porting the remaining functions etc. The original Apple II code makes a nice clean separation between gameplay & rendering and uses a similar interface (main vs aux RAM), and it even includes some similar extra stubs for those occasions when the game needs functions that are in an inconvenient place in RAM. It looks like Jordan ended up doing quite a bit of juggling towards the end of development, as there are a few functions that are in the "wrong" code module.
Final quick question to the group: for simplicity's sake, I am thinking of changing the SWRAM handling to be hardcoded to the Master default banks 4,5,6,7. Does anyone have a reason why I shouldn't do that? At the moment there is a SWRAM check at boot and the game will barf if fewer than 4 banks are detected. It uses "slots" numbered 0-3 which will typically map to banks 4-7. Given this is Master only, would anyone have a setup with 64K SWRAM but not in banks 4-7? It would be weird to use the internal ROM slots then add the SWRAM back in a cartridge, for instance.
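To make the two options concrete, here is a small Python model of the slot-to-bank mapping. It is a sketch under assumptions: the real boot check probes the hardware, whereas here the set of RAM banks is just passed in, and picking the lowest-numbered RAM banks first is my guess at the policy rather than what the game actually does.

```python
MASTER_DEFAULT_BANKS = (4, 5, 6, 7)  # Master default SWRAM banks, per the post


def detect_slots(ram_banks, needed=4):
    """Map slots 0..needed-1 to the lowest-numbered RAM banks found.

    `ram_banks` stands in for the result of a boot-time SWRAM probe.
    """
    found = tuple(sorted(ram_banks))[:needed]
    if len(found) < needed:
        raise RuntimeError(f"need {needed} SWRAM banks, found {len(found)}")
    return found


def hardcoded_slots():
    """The proposed simplification: always use banks 4-7."""
    return MASTER_DEFAULT_BANKS


if __name__ == "__main__":
    # Typical Master: both approaches agree.
    print(detect_slots({4, 5, 6, 7}), hardcoded_slots())
    # Hypothetical unusual setup with extra RAM in banks 0 and 1.
    print(detect_slots({0, 1, 4, 5, 6, 7}), hardcoded_slots())
```

On a stock machine the two give the same mapping, so the only setups that would notice the change are ones where RAM sits outside banks 4-7.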