I find the most common pair of letters and replace it with char 128 (may be 96), then find the next most common pair, which might include char 128, and replace it with char 129. I repeat up to 253 or thereabouts and then write out the two half-page tables giving the pair that each char replaces.
Decompression requires a small stack, as reading the next char might mean pushing one char on the stack for later and decoding the first.
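Here is a minimal, self-contained sketch of the idea described above, not the code I actually used. It assumes 7-bit ASCII input, starts tokens at 128, and all the names (Packed, compressPairs, expandPairs) are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container: compressed bytes plus the pair table.
struct Packed {
    std::vector<uint8_t> data;                      // literals (< 128) and tokens (>= 128)
    std::vector<std::pair<uint8_t, uint8_t>> dict;  // dict[i] is what token 128 + i expands to
};

// Repeatedly replace the most common adjacent pair with a fresh token.
// Assumes the input is 7-bit ASCII so that 128..253 are free for tokens.
Packed compressPairs(const std::string& text, int maxTokens = 126) {
    Packed out;
    out.data.assign(text.begin(), text.end());
    for (int t = 0; t < maxTokens; ++t) {
        // Count every adjacent pair, including pairs that contain earlier tokens.
        std::map<std::pair<uint8_t, uint8_t>, int> counts;
        for (size_t i = 0; i + 1 < out.data.size(); ++i)
            ++counts[std::make_pair(out.data[i], out.data[i + 1])];
        std::pair<uint8_t, uint8_t> best(0, 0);
        int most = 1;  // a pair that occurs only once is not worth a token
        for (const auto& kv : counts)
            if (kv.second > most) { best = kv.first; most = kv.second; }
        if (most < 2) break;
        // Replace the winning pair left to right.
        std::vector<uint8_t> next;
        for (size_t i = 0; i < out.data.size(); ++i) {
            if (i + 1 < out.data.size() &&
                out.data[i] == best.first && out.data[i + 1] == best.second) {
                next.push_back(uint8_t(128 + t));
                ++i;  // skip the second half of the pair
            } else {
                next.push_back(out.data[i]);
            }
        }
        out.data.swap(next);
        out.dict.push_back(best);
    }
    return out;
}

// Expand with a small stack: a token pushes its second half for later
// and its first half on top, so the first half is decoded next.
std::string expandPairs(const Packed& in) {
    std::string text;
    std::vector<uint8_t> stack;
    for (uint8_t byte : in.data) {
        stack.push_back(byte);
        while (!stack.empty()) {
            uint8_t c = stack.back();
            stack.pop_back();
            if (c >= 128) {
                assert(size_t(c - 128) < in.dict.size());
                stack.push_back(in.dict[c - 128].second);
                stack.push_back(in.dict[c - 128].first);
            } else {
                text.push_back(char(c));
            }
        }
    }
    return text;
}
```

Calling expandPairs(compressPairs(s)) should give back s for any 7-bit ASCII string. The stack depth is bounded by how deeply tokens nest, which is why a small stack is enough.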
I can't remember if I posted the code, but someone else took the idea and extended the range of characters used as replacements to give a bit more compression.
In the menu system, I get all the game titles and publishers and compress them. If that won't fit in memory, I repeat with title + publisher for the first in each series of publishers, leaving the rest of the games as just title. If that doesn't fit, I just do titles, which does.
I also have some abbreviations that I substitute and special characters that I replace.
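The fallback can be sketched roughly like this. This is only an illustration: the names (Game, Detail, buildMenuText, chooseDetail), the text layout and the budget are all made up, packedSize is a stand-in for whatever compressor is used, and I am assuming "the first in each series of publishers" means the first game in each run of adjacent games from the same publisher.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical record for one menu entry.
struct Game { std::string title, publisher; };

// The three levels of detail, richest first.
enum class Detail { TitleAndPublisher, PublisherOncePerRun, TitleOnly };

// Build the text to compress for a given level. Assumes the games are
// already sorted so that one publisher's games are adjacent.
std::string buildMenuText(const std::vector<Game>& games, Detail detail) {
    std::string text;
    for (size_t i = 0; i < games.size(); ++i) {
        text += games[i].title;
        bool firstOfRun = (i == 0) || games[i].publisher != games[i - 1].publisher;
        if (detail == Detail::TitleAndPublisher ||
            (detail == Detail::PublisherOncePerRun && firstOfRun))
            text += " (" + games[i].publisher + ")";
        text += '\n';
    }
    return text;
}

// Try each level in turn and keep the first whose compressed size fits.
// Titles only is the last resort and is returned without checking.
Detail chooseDetail(const std::vector<Game>& games, size_t budget,
                    const std::function<size_t(const std::string&)>& packedSize) {
    assert(packedSize);  // a compressor size function must be supplied
    for (Detail d : {Detail::TitleAndPublisher, Detail::PublisherOncePerRun})
        if (packedSize(buildMenuText(games, d)) <= budget)
            return d;
    return Detail::TitleOnly;
}
```

Because the compressor runs on a PC rather than on the beeb, recompressing the whole list up to three times costs nothing worth worrying about.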
The code was just a quick hack to see if it would work and I never went back to tidy it!
Code:
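unsigned char dictionary[128+32][2];   // dictionary[n] holds the two bytes that token n stands for
short pages[256];
if (size_t bytes = dst - strings)//fread(strings, 1, sizeof(strings), f))
{
    OutputDebugStringA("File: "); OutputDebugStringA(_itoa(int(bytes), num, 10));
    // One pass per token: find the most common pair, then replace it everywhere.
    for (int replace = 0; replace < 128+32; ++replace)// if (replace != 13 && replace != 9)
    {
        // Count every adjacent pair, keyed as (second << 8) | first.
        // Pairs touching byte 252 are skipped, so that byte is never merged into a token.
        std::map<unsigned short, int> occurences;
        for (int c = 0; c < int(bytes) - 1; ++c)
        {
            if (252 != strings[c + 0] && 252 != strings[c + 1])
            {
                ++occurences[strings[c + 0] | (strings[c + 1] << 8)];
            }
        }
        // Pick the pair with the highest count.
        unsigned short best = 0; int most = 0;
        for (auto n : occurences)
        {
            if (n.second > most)
            {
                best = n.first;
                most = n.second;
            }
        }
        // p (declared elsewhere) views the same address as bytes (p.c) or as a 16-bit pair (p.s).
        // Store the winning pair in the dictionary.
        p.c = dictionary[replace];
        *p.s = best;
        // Rewrite the buffer in place: ch is the write pointer, p.c the read pointer.
        unsigned char * ch = p.c = strings;
        while (ch < strings + bytes - 1)
        {
            if (best == *p.s)
            {
                *ch++ = replace;   // two bytes become one token
                p.c += 2;
                --bytes;
            }
            else
            {
                *ch++ = *p.c++;
            }
        }
        // Copy the last byte if it was not part of a replaced pair.
        while (ch < strings + bytes)
        {
            *ch++ = *p.c++;
        }
    }
}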
It takes about a second iirc, so it doesn't really matter, which is the great thing about compressing for a beeb but not on a beeb: exhaustive searches aren't usually exhausting.
Much like the compressor I use for my games to fit them in ROM, it was good enough, fast enough and small enough, so I kept it.
Decompressor looks like this:
Code:
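    ldx #0                                              ; X is the stack pointer, 0 means the stack is empty
    ldy #0
.next_line
    jsr OSNEWL : lda #CRSR_RIGHT : jsr OSWRCH : jsr OSWRCH : jsr OSWRCH  ; new line, then three CRSR_RIGHTs
.next_byte
    lda stack,x : cpx #0 : bne stack_not_empty          ; use the pending byte if there is one
    lda (src,x) : inc src {bne pg : inc src+1 : .pg}    ; otherwise read the next source byte (X is 0 here)
    dex                                                 ; pretend it came off the stack so the inx below balances
.stack_not_empty
    cmp #128+32 : bcs not_compressed                    ; 160 and above is not a token
.pop_this
    tay : lda dict1,y : sta stack,x                     ; second half of the pair goes on the stack for later
    dex : lda dict0,y ;;: sta stack,x
    cmp #128+32 : bcc pop_this                          ; the first half may itself be a token
.not_compressed
    cmp #254 : bcs end_string                           ; 254 and 255 end the string
    inx                                                 ; pop
    and #&7F : jsr OSWRCH                               ; strip the top bit and print
    jmp next_byte
.end_string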
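For anyone who doesn't read 6502, here is a rough C++ model of the same loop. It is a sketch based on my reading of the code above rather than a tested port: decodeString is a made-up name, and the byte ranges in the comment are taken from the cmp instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Byte format as the 6502 loop appears to treat it:
//   0..159    token: expands to dict0[token] followed by dict1[token]
//   160..253  literal with the top bit set (printed as byte & 0x7F)
//   254, 255  end of string
std::string decodeString(const uint8_t* src, const uint8_t* dict0, const uint8_t* dict1) {
    assert(src && dict0 && dict1);
    std::string out;
    std::vector<uint8_t> stack;      // stands in for the 6502 'stack' page
    for (;;) {
        uint8_t a;
        if (!stack.empty()) {        // .next_byte: pending byte first
            a = stack.back();
            stack.pop_back();
        } else {
            a = *src++;              // otherwise the next source byte
        }
        while (a < 160) {            // .pop_this: keep expanding tokens
            stack.push_back(dict1[a]);   // second half waits on the stack
            a = dict0[a];                // first half is decoded now
        }
        if (a >= 254) return out;    // .end_string
        out.push_back(char(a & 0x7F));
    }
}
```

The vector makes the push and pop explicit, where the assembler version gets the same effect by moving X up and down over a fixed buffer.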
I don't know how well other methods would work, as the strings are relatively small (name or name + publisher, maybe author, I don't remember), but it would be easy to check I guess!
I was thinking about spitting out separate menus for B and Master, as either you would have a disc for each machine, or at least two if you have multiple machines. But I don't think all the games are correctly labeled as to which systems they work on, so I just left it as try it. It doesn't take long to select another game, especially if you have an MMC.