py8dis - a programmable static tracing 6502 disassembler in Python

Post by hoglet »

Steve,

I've been using py8dis today to disassemble OnliBasic on the System and Atom, and it's been really great.

One small thing I don't understand is why this STA instruction is not disassembling properly:
https://github.com/hoglet67/OnliBasic/b ... c.asm#L271

Code: Select all

equb &99, <(l0099), >(l0099) ; sta+2 l0099,y            ; d133: 99 99 00
I was expecting to see:

Code: Select all

sta l0099,y            ; d133: 99 99 00
I don't see anything unusual about it.

All the files you need to run this are checked in:
https://github.com/hoglet67/OnliBasic/t ... bly_system

Any ideas?

Dave

Post by SteveF »

hoglet wrote: Wed Sep 15, 2021 9:48 pm I've been using py8dis today to disassemble OnliBasic on the System and Atom, and it's been really great.
Thanks!
hoglet wrote: Wed Sep 15, 2021 9:48 pm One small thing I don't understand is why this STA instruction is not disassembling properly:
https://github.com/hoglet67/OnliBasic/b ... c.asm#L271

Code: Select all

equb &99, <(l0099), >(l0099) ; sta+2 l0099,y            ; d133: 99 99 00
I was expecting to see:

Code: Select all

sta l0099,y            ; d133: 99 99 00
This is a false positive for an "oversized" instruction - I was lazy and defaulted to assuming that STA zp,Y existed and therefore the assembler might mis-assemble this instruction if it were emitted in the normal way rather than as an equb directive. I've pushed a fix for this to master and the lazy branch mentioned in my last post and it should no longer happen for STA abs,Y.

I haven't yet been over all the instructions to look for other cases where there is no zero page version of an absolute instruction, so if you spot this happening anywhere else please let me know.

Post by hoglet »

SteveF wrote: Wed Sep 15, 2021 10:08 pm I haven't yet been over all the instructions to look for other cases where there is no zero page version of an absolute instruction, so if you spot this happening anywhere else please let me know.
As far as I'm aware, the only ZP, Y instructions that exist are:

Code: Select all

LDX ZP,Y
STX ZP,Y
Any other flavour has to be ABS, Y
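
To make that concrete, here is a minimal Python sketch (not py8dis's actual code) of the check an "oversized instruction" test needs for abs,Y opcodes. The table lists the documented NMOS 6502 abs,Y opcodes; needs_equb is a hypothetical helper name used only for illustration. Note that STX zp,Y exists but STX abs,Y does not, so LDX is the only abs,Y instruction an assembler could silently shorten.

```python
# Documented NMOS 6502 opcodes that use absolute,Y addressing.
ABS_Y_OPCODES = {
    0x19: "ORA", 0x39: "AND", 0x59: "EOR", 0x79: "ADC",
    0x99: "STA", 0xB9: "LDA", 0xD9: "CMP", 0xF9: "SBC",
    0xBE: "LDX",
}

# Of those, only LDX also has a zero page,Y encoding.
HAS_ZP_Y_FORM = {"LDX"}


def needs_equb(opcode, operand):
    """Hypothetical helper: True if an abs,Y instruction with a zero page
    operand might be re-assembled as the shorter zp,Y form."""
    mnemonic = ABS_Y_OPCODES.get(opcode)
    if mnemonic is None:
        return False  # not an abs,Y instruction at all
    return operand < 0x100 and mnemonic in HAS_ZP_Y_FORM
```

With this table, the STA at d133 (99 99 00) is safe to emit as a normal instruction, while an LDX abs,Y with a zero page operand would still need the equb treatment.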

Dave

Post by SteveF »

Thanks, I think I've now updated the master and lazy branches to reflect this.

Post by SteveF »

I've done a bit more tweaking and there's a new version on the assembler branch. This fixes a subtle bug in move() support where assembler labels might not always be defined, generalises the internal support for different assemblers a little bit and adds support for the "xa" assembler (specify -x or --xa option after the control file name to use this).

I have lightly tested it (not with gfoot's Repton 2 disassembly yet, though) and it seems to work. Currently no assertions (to verify expressions give the expected values) are emitted for "xa" as I don't know how to make it work. If I try to assemble this:

Code: Select all

lffe3 = $ffe3
    * = $2000

pydis_start
    ldx #$00                                                // 2000: a2 00
// Referenced 1 time by $200b
c2002
    lda l200e,x                                             // 2002: bd 0e 20
    jsr lffe3                                               // 2005: 20 e3 ff
    inx                                                     // 2008: e8
    cpx #$0e                                                // 2009: e0 0e
    bne c2002                                               // 200b: d0 f5
    rts                                                     // 200d: 60
l200e
    .asc "Hello, world!"                                    // 200e: 48 65 6c ...
    .byt $0d                                                // 201b: .
    .byt <c2002
pydis_end

// Label references by decreasing frequency:
//     c2002:   1
//     lffe3:   1

#if c2002 <> $2002
    #echo c2002 has invalid value
#endif

#if (<c2002) <> $02 // this is line 28
    #echo foo
#endif
I get this error:

Code: Select all

z.asm:line 28: 201d:Syntax error
    #echo foo
z.asm:line 29: 201d:Syntax error
Break after 2 errors
I am guessing I can't use arbitrary expressions in "#if" statements, which makes a certain amount of sense (it probably helps avoid problems where the first and second pass assemble different code), but I "need" this in order to be able to assert at assembly time that a certain expression has a certain value. Am I doing something wrong/missing something, or is this something I just can't do with xa?

Post by hoglet »

A couple more bug reports.

I'm trying to disassemble the System 3 MOS, which is 0x1000 bytes and starts at 0xf000.

I get the following warning:

Code: Select all

load() would overflow memory
It looks like there is an off-by-one error here:

Code: Select all

py8dis/commands.py:
        if addr + len(data) > 0xffff:
            utils.die("load() would overflow memory")
After fudging that, I get the following warning from BeebASM:

Code: Select all

TOSDOS-S3.asm:129: error: Out of range.

    guard &10000
So you would need to also omit the guard statement in this case I think.
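
The arithmetic behind the load() off-by-one can be checked with a few lines of Python. This is only a sketch of the bounds test, not the fix that went into py8dis; MEMORY_SIZE and load_fits are names made up for the example.

```python
MEMORY_SIZE = 0x10000  # a 6502 addresses bytes 0x0000..0xffff inclusive


def load_fits(addr, length):
    """Return True if `length` bytes loaded at `addr` stay within memory."""
    # The end address is exclusive, so reaching exactly 0x10000 is fine.
    return addr + length <= MEMORY_SIZE


addr, length = 0xf000, 0x1000   # the System 3 MOS case described above
last_byte = addr + length - 1   # address of the final byte loaded
```

Here last_byte is 0xffff, so the ROM fits, even though addr + length (0x10000) is greater than 0xffff and trips the original test.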

Finally, I'm having trouble when adding this to my .py file:

Code: Select all

hook_subroutine(0xf009, "print_string", stringhi_hook)
I end up with a python stack trace:

Code: Select all

Traceback (most recent call last):
  File "TOSDOS-S3.py", line 16, in <module>
    go()
  File "/disk1/home/dmb/atom/py8dis/py8dis/commands.py", line 121, in go
    trace.trace()
  File "/disk1/home/dmb/atom/py8dis/py8dis/trace.py", line 20, in trace
    new_entry_points = config.disassemble_instruction()(entry_point)
  File "/disk1/home/dmb/atom/py8dis/py8dis/trace6502.py", line 362, in disassemble_instruction
    return opcode.disassemble(addr)
  File "/disk1/home/dmb/atom/py8dis/py8dis/trace6502.py", line 164, in disassemble
    return_addr = jsr_hooks.get(target, lambda target, addr: addr + 3)(target, addr)
  File "/disk1/home/dmb/atom/py8dis/py8dis/commands.py", line 84, in stringhi_hook
    return stringhi(addr + 3)
  File "/disk1/home/dmb/atom/py8dis/py8dis/classification.py", line 235, in stringhi
    disassembly.add_classification(initial_addr, String(addr - initial_addr, False))
  File "/disk1/home/dmb/atom/py8dis/py8dis/disassembly.py", line 57, in add_classification
    assert not is_classified(addr, classification.length())
Any ideas?

The TOSDOS-S3 disassembly files are checked in here:
https://github.com/hoglet67/OnliBasic/t ... _tosdos_s3

Dave

Post by SteveF »

Thanks Dave, I've pushed some changes to the assembler branch which should fix these problems. I can now run TOSDOS-S3.py and get a file which correctly reassembles the input using beebasm.

I just put fixes in for top-of-memory problems where I needed to in order to make this work; I haven't been over the code exhaustively, so if you find any other examples of this please let me know. (When/if the code settles down I will probably review it all looking for things like this and generally tidying up.)

The backtrace you report was caused by stringhi() - and the other related functions, FWIW - failing to respect existing classifications. py8dis wants every byte to be classified as exactly one thing (part of an instruction, string, byte data or word data). stringhi() didn't stop searching when it hit something that was already classified, so the string it was producing overlapped an existing classification and py8dis complained because a byte can't be classified twice. stringhi() now stops when it hits an existing classification so this doesn't happen any more.
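
The "every byte is classified exactly once" rule can be sketched in a few lines of Python. This is a simplified stand-in, not the real py8dis code: the function names echo the traceback above, but the signatures, the string labels and unclassified_run are assumptions made for the example.

```python
classifications = [None] * 0x10000  # one slot per 6502 address


def is_classified(addr, length=1):
    """True if any byte in [addr, addr + length) already has a classification."""
    return any(c is not None for c in classifications[addr:addr + length])


def add_classification(addr, length, kind):
    """Claim `length` bytes at `addr`; a byte may never be claimed twice."""
    assert not is_classified(addr, length)
    for a in range(addr, addr + length):
        classifications[a] = kind


def unclassified_run(addr, wanted):
    """Count how many of `wanted` bytes from `addr` are still free."""
    n = 0
    while (n < wanted and addr + n < len(classifications)
           and classifications[addr + n] is None):
        n += 1
    return n


add_classification(0xf030, 3, "instruction")
string_len = unclassified_run(0xf02c, 8)  # a string scan that wants 8 bytes
add_classification(0xf02c, string_len, "string")
```

The string scan is cut short at the existing instruction, so the second add_classification() call does not trip the assertion.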

Post by hoglet »

Cheers Steve.

One more tiny request....

Do you think you could add a stringhiz() function/hook that terminates at bit 7 set or zero?

Curiously the System 3 MOS print_string works like that, with a BMI/BEQ as the terminator.

Code: Select all

.print_string
    pla                                                     ; f009: 68
    sta l00d8                                               ; f00a: 85 d8
    pla                                                     ; f00c: 68
    sta l00d9                                               ; f00d: 85 d9
    ldy #&00                                                ; f00f: a0 00
.cf011
    inc l00d8                                               ; f011: e6 d8
    bne cf017                                               ; f013: d0 02
    inc l00d9                                               ; f015: e6 d9
.cf017
    lda (l00d8),y                                           ; f017: b1 d8
    bmi cf023                                               ; f019: 30 08
    beq cf023                                               ; f01b: f0 06
    jsr cfff4                                               ; f01d: 20 f4 ff
    jmp cf011                                               ; f020: 4c 11 f0
I guess that allows the same function to be used for general printing and for errors.

Dave

Post by SteveF »

Hi Dave,

I've sort of done this... :-)

There was already a stringhiz() function which terminates at top-bit-set or 0x00. It didn't have an associated stringhiz_hook() function so I've added one.

However, that doesn't do quite what I think you want here. stringhiz() returns the address of the first byte after the string, and it *doesn't* include the terminator in the string. This meant that using stringhiz_hook for print_string disassembles the 0x00 terminator as BRK.

Edit: This is in fact probably exactly what you wanted, now I look at the code better. Anyway, I'll leave the following here as it's perhaps useful background material for anyone wanting to write their own hook functions, and I think the change I've made to add an optional argument to stringhiz() isn't bad.

I've made stringhiz() take an optional argument include_terminator_fn which is a function which is called with the terminator byte; if that function returns True, the terminator is included in the string (and therefore the return address points one past it), otherwise it isn't (and the return address points to the terminator).

Given this variant, you can write in TOSDOS-S3.py:

Code: Select all

def RENAMEME(target, addr):
    return stringhiz(addr + 3, include_terminator_fn=lambda c: c == 0)

hook_subroutine(0xf009, "print_string", RENAMEME)
I think this does what you want, if it doesn't please let me know. I am happy to include RENAMEME in the standard commands/functions in commands.py if you can suggest a good name for it.

If you think it's clearer, you can also just write it out yourself explicitly in TOSDOS-S3.py:

Code: Select all

def print_string_hook(target, addr):
    addr += 3
    start = addr
    while True:
        if disassembly.is_classified(addr):
            break
        if memory[addr] == 0:
            addr += 1
            break
        if memory[addr] & 0x80 == 0x80:
            break
        addr += 1
    string(start, addr - start)
    return addr

hook_subroutine(0xf009, "print_string", print_string_hook)
That's longer but maybe less confusing than using a standard function which you still need to wrap in a hook and pass a custom (if trivial) lambda function.
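
For anyone who wants to experiment with the include_terminator_fn behaviour outside py8dis, here is a standalone sketch that works on a plain bytes object. scan_hiz is a made-up name and this is not the library's implementation; it only mirrors the return-address rules described above.

```python
def scan_hiz(memory, addr, include_terminator_fn=None):
    """Scan from `addr` until a byte that is zero or has bit 7 set.

    Returns the address just past the string. The terminator is only
    counted as part of the string if include_terminator_fn(byte) is true.
    """
    while addr < len(memory):
        c = memory[addr]
        if c == 0 or c & 0x80:
            if include_terminator_fn is not None and include_terminator_fn(c):
                addr += 1  # swallow the terminator
            break
        addr += 1
    return addr


zero_ended = b"Hi\x00\xea"  # 0x00 terminator followed by a NOP
nop_ended = b"Hi\xea"       # NOP (top bit set) acts as the terminator
```

With include_terminator_fn=lambda c: c == 0, the scan of zero_ended returns 3 (tracing resumes at the NOP), while the scan of nop_ended returns 2 (tracing resumes at the terminator itself, which is a real instruction).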

All these changes have been pushed to the "assembler" branch.

Edit: Nothing to do with py8dis, but as the System 3 code at &F000 itself shows, NOP has top-bit-set so you can use it as a harmless terminator and then you don't need to recognise 0x00 as a special case. But I suppose 0x00 is a more traditional terminator and if this is a standard function for the platform it's probably nice to recognise it.

Post by hoglet »

SteveF wrote: Thu Sep 16, 2021 8:31 pm Edit: This is in fact probably exactly what you wanted, now I look at the code better.
Yes, that's now working as intended... Thanks

I still have a few more bits of currently unreachable code to track down.

But I think this is a good test case.

Dave

Post by hoglet »

Here's another interesting edge case.

Code: Select all

.cfb50
    lda l2210,y                                             ; fb50: b9 10 22
    sta l00dd                                               ; fb53: 85 dd
    lda l2213,y                                             ; fb55: b9 13 22
    sta l00de                                               ; fb58: 85 de
    equb &a0                                                ; fb5a: .
.cfb5b
    brk                                                     ; fb5b: 00
    equb &b1, &dd, &48, &a4, &c2, &a9, &fe, &20             ; fb5c: ..H....
    equb &06, &fb, &a6, &dd, &e8, &8a, &99, &10             ; fb64: ........
    equb &22, &d0, &19, &18, &b9, &11, &22, &69             ; fb6c: "....."i
    equb &01, &99, &11, &22, &b9, &12, &22, &69             ; fb74: ...".."i
The second byte of LDY #&00 (A0 00) is being used as a convenient BRK instruction.

It's going to be a bit awkward to represent this, but the main issue is that it stops the tracing of the path from 0xfb50 prematurely.

There are several examples of this in this ROM.

Dave

Post by SteveF »

That is a bit of an awkward one. I *haven't* tried this, but since the tracing process puts newly discovered instructions ("branch targets") at the *end* of the internal list after any supplied by the user using entry(), if you were to do:

Code: Select all

entry(0xfb5a)
I think that would tell py8dis that you want that instruction to be disassembled as LDY #&00 (and whatever transfers control to that BRK will just have to do the best it can by generating a "lfb5b = cfb5a+1" label).

Edited to add: I am not sure there's any way to handle this completely automatically, although if anyone has any suggestions I'm definitely interested! Edited some more :-): I suppose rather than stopping the tracing process when it hits something that's already been classified, the tracing process could continue tracing but leave the existing classification alone. That way the fact that we know LDY #&00 doesn't divert the control flow will then cause us to continue tracing even if we can't classify those two bytes as an LDY #&00 because something got in before us and used the &00 as a BRK. Let me see if I can make that change...

Final (?) edit: OK, I think I've made that change and it *seems* to work - please give the latest code on the "assembler" branch a try and let me know what you think. Using "entry(0xfb5a)" as suggested above does provide a way to specify that you'd prefer the "LDY #&00" to be disassembled instead of the BRK, but it is not necessary to do this to force the following instructions to be traced.

Also edited: at the risk of stating the obvious or missing something myself :-), you could also just do:

Code: Select all

entry(0xfb5c)
to restart the tracing process after the brk instruction.

Post by SteveF »

I've pushed a further tweak to the "assembler" branch which is a bit more aggressive about disassembling without regard to existing classifications and which attempts to show overlapping instructions via comments. For example:

Code: Select all

    sta l2211,y                                             ; fb75: 99 11 22     
    lda l2212,y                                             ; fb78: b9 12 22     
; overlapping: adc #&00                                     ; fb7b: 69 00        
    equb &69                                                ; fb7b: i            
.cfb7c                                                                           
    brk                                                     ; fb7c: 00           
    sta l2212,y                                             ; fb7d: 99 12 22     
This still passes my test cases but it's possible I've broken something, comments welcome as always!

Where there's an overlap you should still be able to force your preferred disassembly by using entry() on the overlapping instructions.
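
As a toy model of this behaviour (a sketch only, not how py8dis is written), the tracer below keeps following control flow even when an instruction's bytes have already been claimed, and records the loser as "overlapping". The bytes are the start of the fb5a sequence from Dave's post with an RTS appended to end the trace; trace, owner and overlapping are made-up names, and the order of entry_points stands in for user entry() calls being processed before discovered targets.

```python
# Instruction lengths for the few opcodes used below (NMOS 6502).
LENGTHS = {0xA0: 2,   # LDY #imm
           0xB1: 2,   # LDA (zp),Y
           0x48: 1,   # PHA
           0x00: 1,   # BRK
           0x60: 1}   # RTS
STOPS = {0x00, 0x60}  # control does not fall through these


def trace(memory, base, entry_points):
    owner = {}         # address -> start address of the instruction owning it
    overlapping = []   # instructions that could not claim their bytes
    visited = set()
    work = list(entry_points)
    while work:
        addr = work.pop(0)
        if addr in visited or not base <= addr < base + len(memory):
            continue
        visited.add(addr)
        opcode = memory[addr - base]
        length = LENGTHS.get(opcode)
        if length is None:
            continue   # opcode unknown to this toy table
        span = range(addr, addr + length)
        if any(a in owner for a in span):
            overlapping.append(addr)   # leave the existing owner alone
        else:
            for a in span:
                owner[a] = addr
        if opcode not in STOPS:
            work.append(addr + length)  # keep tracing either way
    return owner, overlapping


rom = bytes([0xA0, 0x00, 0xB1, 0xDD, 0x48, 0x60])
brk_first = trace(rom, 0xfb5a, [0xfb5b, 0xfb5a])
ldy_first = trace(rom, 0xfb5a, [0xfb5a, 0xfb5b])
```

In brk_first the BRK wins and LDY #&00 is reported as overlapping, yet fb5c onwards is still traced; in ldy_first the preference is reversed simply by the order of the entry points.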

Edited to add: There's also a new nonentry(addr) command which tells the tracing process *not* to treat the byte at addr as an opcode. You probably won't need this, but since the tracing process now doesn't stop just because something is classified, it might be helpful to use this after things like a conditional branch that's always taken in practice and which is followed by a string, in order to stop lots of "overlapping" comments as the tracing code tries to interpret the string as opcodes.

Post by dp11 »

Just had my first play.

Feature request: for obvious cases I would find it useful to be able to decode osbyte, osword etc., so

Code: Select all

ldx #$c8
ldy #$19
lda #$5
jsr osword 
would become something like:

Code: Select all

ldx #low (l19c8 )
ldy #high (l19c8)
lda #$5
jsr osword ;  Read I/O processor memory 
 
I say obvious case because A XY is not always obvious.

Post by hoglet »

Steve,

Thanks for improving the behaviour with overlapping instructions.

Out of the box it worked very well and in all cases continued to trace the following code.

I did end up adding a few additional entry() commands to the python to coerce it into disassembling the longer instruction (i.e. the normal code path), so that the BRK case became the exception.

I think it's always going to be subjective which of the overlapping instructions you want to be disassembled. I tried to think of some heuristics, for example always disassemble the longer instruction. But that doesn't work with the earlier BIT absolute example, where it's better to show the BIT as equb &2C.

Anyway, I'm now done with the TOSDOS disassembly and am very happy with the result.
https://github.com/hoglet67/OnliBasic/b ... DOS-S3.asm

Dave

Post by SteveF »

dp11 wrote: Fri Sep 17, 2021 9:44 am Just had my first play.

Feature request: for obvious cases I would find it useful to be able to decode osbyte, osword etc., so

Code: Select all

ldx #$c8
ldy #$19
lda #$5
jsr osword 
would become something like:

Code: Select all

ldx #low (l19c8 )
ldy #high (l19c8)
lda #$5
jsr osword ;  Read I/O processor memory 
 
This is a good idea, thanks. I've had a go at implementing it on the autoexpr branch. If you do:

Code: Select all

import acorn
acorn.add_standard_labels()
in your control file, py8dis will do its best to do this kind of decoding. The implementation is pretty hacky but it seems to work fairly well on my test cases; I'll look into tidying it up later once people have had a chance to give feedback or further suggestions.

It would be possible to take this further than I have, for example if *FX21 is recognised, we could change "LDX #0" into "LDX #buffer_keyboard".

Technical note: The implementation looks for sequences of instructions which end with a JSR or JMP and contain only:
  • "neutral" instructions (like STX or PHP)
  • LDA #, LDX #, LDY # instructions
  • A/X/Y corrupting instructions like TXA or LDA (other than the immediate loads)
It then traces through the sequence of instructions and records the address of the operand of the most recent "live" LDA #/LDX #/LDY #; A/X/Y corrupting instructions set the "live" address to None for that register.

When that's been done, all functions which have been registered by calling add_sequence_hook() are then called with the target of the JSR/JMP and the A/X/Y operand addresses, if any (not the literal values). acorn.add_standard_labels() registers a sequence hook called acorn_sequence_hook() which recognises some (not yet all) OS calls and does its best to decode them, typically by either recognising that YX points to a block of memory which we can assign a label to and/or by using a named constant for the LDA # operand.

The list of "neutral" instructions is probably incomplete. We could also be smarter about tracking "across" a JMP instruction, although I don't know if that would be worth the complexity. Edit: A less hacky implementation would probably contain some generic-ish instruction modelling where py8dis knows roughly what each instruction does to each register.
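
The register tracking described in the technical note can be sketched roughly as follows. This is an illustration under stated assumptions rather than the autoexpr branch's code: live_operands, the tuple format and the (deliberately incomplete) instruction sets are invented for the example, and the addresses in the usage lines are made up.

```python
IMMEDIATE_LOADS = {"LDA": "a", "LDX": "x", "LDY": "y"}
# Illustrative, non-exhaustive lists.
NEUTRAL = {"STA", "STX", "STY", "PHP", "PHA", "CLC", "SEC"}
CORRUPTS = {"TXA": "a", "TYA": "a", "PLA": "a",
            "TAX": "x", "TSX": "x", "INX": "x", "DEX": "x",
            "TAY": "y", "INY": "y", "DEY": "y"}


def live_operands(sequence):
    """sequence: (addr, mnemonic, is_immediate) tuples leading up to a JSR/JMP.

    Returns a dict mapping "a"/"x"/"y" to the address of the operand byte of
    the most recent live immediate load (or None), or None for the whole
    sequence if it contains an instruction this sketch does not know.
    """
    live = {"a": None, "x": None, "y": None}
    for addr, mnemonic, is_immediate in sequence:
        if mnemonic in IMMEDIATE_LOADS:
            reg = IMMEDIATE_LOADS[mnemonic]
            # Non-immediate loads corrupt the register instead.
            live[reg] = addr + 1 if is_immediate else None
        elif mnemonic in CORRUPTS:
            live[CORRUPTS[mnemonic]] = None
        elif mnemonic not in NEUTRAL:
            return None
    return live


# dp11's OSWORD example, placed at a made-up address of 0x2000.
memory = {0x2001: 0xc8, 0x2003: 0x19, 0x2005: 0x05}
seq = [(0x2000, "LDX", True), (0x2002, "LDY", True), (0x2004, "LDA", True)]
live = live_operands(seq)
block_addr = memory[live["x"]] | (memory[live["y"]] << 8)
```

A sequence hook given these operand addresses can then label block_addr (0x19c8 here) and rewrite the two loads as low/high expressions of that label.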
Last edited by SteveF on Fri Sep 17, 2021 7:12 pm, edited 2 times in total.

Post by SteveF »

hoglet wrote: Fri Sep 17, 2021 11:31 am Thanks for improving the behaviour with overlapping instructions.

Out of the box it worked very well and in all cases continued to trace the following code.

I did end up adding a few additional entry() commands to the python to coerce it into disassembling the longer instruction (i.e. the normal code path), so that the BRK case became the exception.

I think it's always going to be subjective which of the overlapping instructions you want to be disassembled. I tried to think of some heuristics, for example always disassemble the longer instruction. But that doesn't work with the earlier BIT absolute example, where it's better to show the BIT as equb &2C.
Thanks Dave, I'm glad it worked!

I think you're right about this being subjective. Given it's possible to force a preference with entry() calls I am not sure it's a huge problem, but I am nevertheless toying with the idea of loosening the "there must be at most one classification for each address" idea in py8dis - at the moment the "overlapping" instructions are second-class citizens and just get emitted as comments, but it would be possible to create classifications for all instructions we find. We'd then just need to ensure we emit only one classification per address when generating the final disassembly, which would be where some new code to apply some suitable heuristics could kick in. It might be possible to recognise the "BIT-as-skip" pattern and otherwise prioritise longer instructions, for example.

I don't think I'm likely to rush into this but if you or anyone else has any further thoughts let me know.

Post by gfoot »

SteveF wrote: Fri Sep 17, 2021 6:57 pm
dp11 wrote: Fri Sep 17, 2021 9:44 am Feature request: for obvious cases I would find it useful to be able to decode osbyte, osword etc., so
This is a good idea, thanks. I've had a go at implementing it on the autoexpr branch.
Oh wow, that's great - I held back from requesting this as I thought all the static analysis would be really complex!

The case I wanted was actually for custom routines - like in Repton 2 there are draw_sprite and erase_sprite routines, that take a sprite number in A. I want that sprite number to be decoded from an enum of sprite values.

It comes back a bit to my feelings about "expr". I was thinking that it could be better - rather than saying exactly which constant goes in each memory reference - to be able to say e.g. "this is a sprite number", then have py8dis decide which actual sprite constant it is. It would also allow marking up a whole array of things in one go, with py8dis deciding which actual constant name to apply to each reference.

Armed with that, it would be possible to define that for a particular entry point, whatever was most recently loaded into A - if it was an immediate load - should be referenced via a particular class of constant name - without having to manually say so at each call site. I think it's very similar to what you've just done.

I'm really glad to see the overlapping instruction support as well. I don't really mind which instruction gets disassembled - both options seem messy in different ways - so long as both disassemblies are present at least in comments.

Post by SteveF »

gfoot wrote: Fri Sep 17, 2021 7:38 pm Oh wow, that's great - I held back from requesting this as I thought all the static analysis would be really complex!

The case I wanted was actually for custom routines - like in Repton 2 there are draw_sprite and erase_sprite routines, that take a sprite number in A. I want that sprite number to be decoded from an enum of sprite values.
I think/hope that should be possible fairly easily. I suggest you take a look at acorn_sequence_hook in acorn.py to see how this works, but I'm guessing something like this will do what you want (just a sketch, not tested):

Code: Select all

sprite_names = {
   0x00: "sprite_00",
   0x01: "sprite_01",
}

def sprite_sequence_hook(target, a_addr, x_addr, y_addr):
    if target not in (0xxxxx, 0xxxxx):
        return
    if a_addr is not None:
        sprite_num = memory[a_addr]
        if sprite_num in sprite_names:
            name = sprite_names[sprite_num]
            constant(sprite_num, name) # not necessary if you do this elsewhere
            expr(a_addr, name)

add_sequence_hook(sprite_sequence_hook)
I could imagine making something a bit like sprite_sequence_hook() available as a standard function to do this kind of labelling without it all needing to be spelled out explicitly in the control file, but since it's all experimental at the moment it's not there yet. I expect what the standard function(s) should do will become clearer as this feature gets some testing.

Edited to add: if you try this and can't get it to work let me know and I'll take a look, it's all very hacky code and there could be bugs, of course!

Post by dp11 »

SteveF wrote: Fri Sep 17, 2021 6:57 pm This is a good idea, thanks. I've had a go at implementing it on the autoexpr branch. If you do:

Code: Select all

import acorn
acorn.add_standard_labels()
in your control file, py8dis will do its best to do this kind of decoding. The implementation is pretty hacky but it seems to work fairly well on my test cases; I'll look into tidying it up later once people have had a chance to give feedback or further suggestions.

It would be possible to take this further than I have, for example if *FX21 is recognised, we could change "LDX #0" into "LDX #buffer_keyboard".
Wow! Great work. I'll try this branch on Twinhead, which I need to disassemble to find out why one of the demos doesn't quite work on the new jitter on PiTubeDirect.

Post by hoglet »

dp11 wrote: Fri Sep 17, 2021 8:15 pm I'll try this branch on Twinhead, which I need to disassemble to find out why one of the demos doesn't quite work on the new jitter on PiTubeDirect.
Have you tried asking Sarah for a copy of the source?

Post by dp11 »

I think Sarah is a bit busy at the moment.

Next simple feature request is to add in the Sheila register names, maybe with a Master variant for fe34 etc. This is really going to be the go-to disassembler.

Post by SteveF »

dp11 wrote: Fri Sep 17, 2021 10:37 pm Next simple feature request is to add in the Sheila register names, maybe with a Master variant for fe34 etc. This is really going to be the go-to disassembler.
Thanks! I've added the addresses for the video ULA, both VIAs and ACCCON and pushed this to the autoexpr branch. You need to add:

Code: Select all

import acorn
acorn.add_standard_labels() # might be renamed later, official OS API addresses for BBC OSes
acorn.hardware_bbc() # hardware addresses common to BBC B/B+/Master
acorn.hardware_master() # Master-specific hardware addresses
to your control file to get all of the labels.

Let me know if there are any other hardware addresses you're interested in and I'll add them.

Post by dp11 »

This is amazing support. I would find the Tube ULA useful. You should be close to automatically disassembling the MOS and having all the labels filled in. Well done.

Post by SteveF »

Cheers. I've added the tube ULA and pushed this to the autoexpr branch. To get the host side labels you need:

Code: Select all

acorn.hardware_bbc()
and for the parasite side labels you need:

Code: Select all

acorn.hardware_6502sp()
You can include both in the same control file if you wish.

Post by fordp »

I will look to link to your project from my https://github.com/fordp2002/PyAcornDFS project so you can disassemble a file from an SSD, DSD or MMB file. I already support converting a basic file to a text file so exporting a machine code binary as an assembler file would be fun. I think I am the only one using my code but I love working on it.
FordP (Simon Ellwood)
Time is an illusion. Lunchtime, doubly so!

Post by TobyLobster »

Would it make more sense for acorn.hardware_master() to call acorn.hardware_bbc() which calls acorn.add_standard_labels()?
Then for a Master compatible disassembly you need less boilerplate code:

Code: Select all

import acorn
acorn.hardware_master()
or am I missing something?

Post by SteveF »

fordp wrote: Sat Sep 18, 2021 4:55 pm I will look to link to your project from my https://github.com/fordp2002/PyAcornDFS project so you can disassemble a file from an SSD, DSD or MMB file. I already support converting a basic file to a text file so exporting a machine code binary as an assembler file would be fun. I think I am the only one using my code but I love working on it.
That's a neat idea. PyAcornDFS looks pretty cool - I like the GUI!
TobyLobster wrote: Sat Sep 18, 2021 5:01 pm Would it make more sense for acorn.hardware_master() to call acorn.hardware_bbc() which calls acorn.add_standard_labels()?
Then for a Master compatible disassembly you need less boilerplate code:

Code: Select all

import acorn
acorn.hardware_master()
or am I missing something?
This does seem sensible, thanks. I've pushed something similar to the autoexpr branch:
  • bbc() adds the OS labels and hardware labels for the BBC B
  • b_plus() does the same for the B+
  • master() does the same for the Master
  • hardware_master() now calls hardware_bbc()
  • hardware_b_plus() now calls hardware_bbc()
I didn't want to make any of the hardware_*() functions call add_standard_labels() because of the (admittedly largely theoretical) possibility that you might be disassembling something like a custom OS ROM and don't want the Acorn OS labels defining along with the hardware addresses.
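
The layering described above can be pictured with this small sketch. The function names follow the post, but the bodies are stand-ins written for illustration and the label strings are assumptions; only the addresses (OSWORD at &FFF1, OSBYTE at &FFF4, the video ULA at &FE20, ACCCON at &FE34) are standard.

```python
labels = {}  # address -> label name; stand-in for the disassembler's label table


def add_standard_labels():
    # OS API entry points only, no hardware addresses.
    labels[0xfff1] = "osword"
    labels[0xfff4] = "osbyte"


def hardware_bbc():
    # Hardware common to the B/B+/Master (label name assumed).
    labels[0xfe20] = "video_ula_control"


def hardware_master():
    hardware_bbc()              # Master hardware is a superset
    labels[0xfe34] = "acccon"   # Master-specific register


def master():
    # Convenience wrapper: OS labels plus Master hardware labels.
    add_standard_labels()
    hardware_master()
```

Calling hardware_master() on its own gives the hardware labels without osbyte/osword, which is the custom OS ROM case mentioned above; master() gives everything in one call.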

Edited: add_standard_labels() should probably be renamed to indicate it deals with OS stuff, the name is just left over from when it was the only function except is_sideways_rom().

Post by TobyLobster »

SteveF wrote: Sat Sep 18, 2021 6:13 pm Edited: add_standard_labels() should probably be renamed to indicate it deals with OS stuff, the name is just left over from when it was the only function except is_sideways_rom().
Naming is hard.
"acorn.mos_labels()" perhaps?

Post by SteveF »

TobyLobster wrote: Sat Sep 18, 2021 9:29 pm Naming is hard.
"acorn.mos_labels()" perhaps?
I quite like that. Do you think it's worth making a distinction between different MOS versions and having (say) mos_120_labels(), mos_200_labels() etc? My inclination is to just go with "mos_labels()", since I think the OS 1.20 API is a subset of the OS 3.20 API.