py8dis - a programmable static tracing 6502 disassembler in Python


Post by TobyLobster »

I've made a number of improvements to py8dis, and these are all now merged into the master branch. To try the latest and greatest, pull the master branch https://github.com/ZornsLemma/py8dis . This is a substantial change - do shout if you find issues!

Everything is documented in the README.md, but here's a summary of the changes:
  • Local labels are new. They are like regular labels, but are only used when referenced from within a given range or ranges of addresses. This is useful when a memory location is reused elsewhere in the binary for a different purpose, because each use can then be given its own label name. (This could be done previously using a label maker hook function, but this is simpler.)
  • Suppose we have a dictionary where the keys are integers in the range 0-255 and the values are constant names of a particular type (e.g. sprite ids). We can pass this to "expr(addr, dict)" to substitute the appropriate constant name for the byte value at that address.
  • That same dictionary can be passed to a new function "substitute_constants(instruction, reg, dict)" which looks for a given instruction and if there is a load immediate instruction somewhere before it, substitutes the constant name.
    e.g. it will find instances of the instruction "jsr plot_sprite" and a load immediate instruction before it, e.g.:

    Code: Select all

        lda #SpriteId_Egg    ; <--- number replaced by constant name automatically
        ldx ...
        ldy ...
        jsr plot_sprite
    
  • Inline comments: Support for writing comments at the end of a line.
  • A new option to show the cycle count of every instruction.
  • Output of the initial list of constants and memory labels is tidier.
  • If assembler specific code is required, an assembler can be tested for using the new "is_assembler()" function.
  • Support for 8080 code, adding to the existing support for 6502 and 65c02.
  • A testing facility (See "./go" script in "/examples") which disassembles every example for each assembler, then reassembles them and checks the binaries are identical. Optionally, also checks the disassemblies match a known good version.
  • Works with Python 2 or 3 with no third party dependencies (use of "six" is removed).
  • Miscellaneous fixes.
  • Updated the README documentation.
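As a rough illustration of the idea behind substitute_constants(), here is a toy model in plain Python. It is not py8dis code: the instruction tuples, the `substitute` helper and the `SpriteId_Player` name are made up for the example, and the real function works on the traced disassembly rather than on a list.

```python
# Toy model only: instructions are (mnemonic, operand) tuples, not py8dis objects.
# An integer operand stands for an immediate value such as "lda #1".
sprite_ids = {0: "SpriteId_Player", 1: "SpriteId_Egg"}  # hypothetical values

def substitute(instructions, target, load_mnemonic, names):
    """Before each 'jsr target', find the nearest earlier load of the given
    register and replace its numeric operand with the matching constant name."""
    result = list(instructions)
    for i, (mnemonic, operand) in enumerate(result):
        if (mnemonic, operand) != ("jsr", target):
            continue
        # Walk backwards from the call to the closest matching load.
        for j in range(i - 1, -1, -1):
            load, value = result[j]
            if load == load_mnemonic:
                if isinstance(value, int) and value in names:
                    result[j] = (load, "#" + names[value])
                break
    return result

code = [("lda", 1), ("ldx", 10), ("ldy", 20), ("jsr", "plot_sprite")]
print(substitute(code, "plot_sprite", "lda", sprite_ids))
```

Only the load nearest to the call is considered, which mirrors the "lda #SpriteId_Egg ... jsr plot_sprite" example above.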
Upgrading from the old version is fairly painless but there are just a few things you may need to fix up:
  • The old 'import trace6502' system is no more. Remove the import line and add "6502" (for example) as the third parameter to the load() function.
  • Configuration now has proper setter functions. These are listed in the README.md.
  • hook_subroutine() is a regular command, no need to do "from trace6502 import hook_subroutine"
Thanks to Steve for creating this tool in the first place, and for sharing advice and encouragement.

Post by SteveF »

Thanks Toby, this is some amazing work - you were obviously too polite to mention that you've also generally tidied up the code and it's a lot more readable than it was before. :-)

Post by MarkMoxon »

Always good to see py8dis getting an update - I’m sure I’m going to find some of these new features useful in my next project.

Thanks Toby, thanks Steve. This tool is great! =D>

Mark

Post by TobyLobster »

I've made some further improvements to py8dis, now available in the master branch: https://github.com/ZornsLemma/py8dis

It can now automatically detect all calls to regular OS routines (including all of OSBYTE, OSWORD, OSWRCH, OSFILE, OSARGS, OSGBPB, etc) and automatically comment each time they are called. It knows the meaning of each OSBYTE and OSWORD individually, so where register values are known before the call, commentary is tailored to take this into account. After the call, any results in registers are also commented where used.

For example, from Chuckle Egg, these comments are generated automatically:

Code: Select all

    lda #osbyte_scan_keyboard               ; 306b: a9 79       .y  
    ldx #&81                                ; 306d: a2 81       ..  
    jsr osbyte                              ; 306f: 20 f4 ff     ..    ; Test for 'CTRL' key pressed (X=129)
    txa                                     ; 3072: 8a          .      ; X has top bit set if 'CTRL' pressed
    bpl didntpressctrl                      ; 3073: 10 05       ..  
or from DFS226:

Code: Select all

    lda #osbyte_read_oshwm                  ; 9a8d: a9 83       ..
    jsr osbyte                              ; 9a8f: 20 f4 ff     ..    ; Read top of operating system RAM address (OSHWM)
    sty l10cf                               ; 9a92: 8c cf 10    ...    ; X and Y contain the address of OSHWM (low, high)
    lda #osbyte_read_himem                  ; 9a95: a9 84       ..
    jsr osbyte                              ; 9a97: 20 f4 ff     ..    ; Read top of user memory (HIMEM)
    tya                                     ; 9a9a: 98          .      ; X and Y contain the address of HIMEM (low, high)
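The tailoring of comments to known register values can be pictured as a table lookup. The sketch below is only a guess at the general shape, not how py8dis stores its data: `OSBYTE_COMMENTS` and `osbyte_comment` are hypothetical names, and the comment text is copied from the two listings above.

```python
# Keys are (A, X); X is None when the comment does not depend on X.
OSBYTE_COMMENTS = {
    (0x79, 0x81): "Test for 'CTRL' key pressed (X=129)",
    (0x83, None): "Read top of operating system RAM address (OSHWM)",
    (0x84, None): "Read top of user memory (HIMEM)",
}

def osbyte_comment(a, x=None):
    """Return a comment for 'jsr osbyte' given the register values known
    at the call site, preferring an entry tailored to X over a generic one."""
    if a is None:
        return None  # A is unknown before the call, so nothing can be said
    return OSBYTE_COMMENTS.get((a, x)) or OSBYTE_COMMENTS.get((a, None))

print(osbyte_comment(0x79, 0x81))
print(osbyte_comment(0x83))
```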
This feature is not only available for OS routines, but also for regular subroutines in the disassembly. The new `subroutine()` command allows you to describe a function in words and document its entry and exit parameters (all parameters are optional, so can be filled in as you go along). All calls to this subroutine will be automatically annotated.

If any automatic comment is not wanted at a particular address (maybe you prefer to write your own comment manually) then you can inhibit it using `no_automatic_comment(addr)`

One more minor feature is also added: comments can now be indented.

This is quite a large change, so if there are any problems, shout.

Post by gfoot »

I love it. I'm now thinking of revisiting the ones I already did and updating them based on the latest version.

Post by hoglet »

Here's a quick question for the py8dis experts out there....

I'm doing a disassembly of the Atom Econet ROM, and I've come up against something I don't know how to do.

The MC68B54 has a write-only control register and a read-only status register mapped to the same address.

I'd like to have this called ADLC_CONTROL when it's written (e.g. in a STA instruction), and ADLC_STATUS when it's read (e.g. in an LDA).

I'm guessing it's not easy to do this?

Dave

Post by SteveF »

Hi Dave,

Perhaps a bit hacky, but how about this?

Code: Select all

from commands import *
import cpu6502

load(0x2000, "rw.orig", "6502", "f047c8f074c0396a3a210a95c1c0cc37")

entry(0x2000)

def our_label_maker(addr, context, suggestion):
    # TODO: I haven't worried about runtime vs binary addrs here; this only
    # matters if move() is used, but if this code gets promoted into the
    # "standard library" this needs thinking about.
    # TODO: Edge case of read+write instructions like "ROR abs"
    if disassembly.is_code(context - 1):
        opcode_value = memory[context - 1]
        opcode_obj = trace.cpu.opcodes.get(opcode_value)
        if opcode_obj is not None:
            if opcode_obj.mnemonic in ("STA", "STX", "STY"):
                if addr == 0xff00:
                    return "write_only_register"
    return None

set_label_maker_hook(our_label_maker)

label(0xff00, "read_only_register")

go()
rw.orig is in the attached zip file. It just does "LDA &FF00:STA &FF00".

This gives:

Code: Select all

read_only_register  = &ff00
write_only_register = &ff00

    org &2000

.pydis_start
    lda read_only_register                                            ; 2000: ad 00 ff    ...
    sta write_only_register                                           ; 2003: 8d 00 ff    ...
.pydis_end
The basic idea is to use a label maker hook to override the standard labelling process for writes (because they're easier to identify; if I'm not mistaken only STA/STX/STY can write to arbitrary addresses). Edit: I was slightly mistaken - see Toby's post below.

This works for me with the latest code on the (very slow work in progress) undocumented-opcodes branch, but I expect it will work with any recent branch.
Attachments
rw.zip
(954 Bytes) Downloaded 45 times
Last edited by SteveF on Sat Dec 24, 2022 1:55 am, edited 2 times in total.

Post by hoglet »

SteveF wrote: Thu Dec 22, 2022 5:01 pm The basic idea is to use a label maker hook to override the standard labelling process for writes (because they're easier to identify; if I'm not mistaken only STA/STX/STY can write to arbitrary addresses).

This works for me with the latest code on the (very slow work in progress) undocumented-opcodes branch, but I expect it will work with any recent branch.
That looks just what I need - thanks!

Dave

Post by TobyLobster »

Nice, that looks good. If it were generalised in the library, one edge case is that it's not so clear what the label should be for ROL mem, LSR mem etc., which both read and write :-).
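To make the edge case concrete, here is a small standalone sketch that sorts standard NMOS 6502 mnemonics into write, read-write and read accesses. The function names are hypothetical and this is not part of py8dis; the 65C02 additions (STZ, TSB, TRB) are deliberately left out, and accumulator forms such as "ASL A" are not distinguished because they never reach a hardware register.

```python
WRITE_ONLY = {"STA", "STX", "STY"}
READ_MODIFY_WRITE = {"ASL", "LSR", "ROL", "ROR", "INC", "DEC"}

def access_kind(mnemonic):
    """Classify how an instruction with a memory operand touches that memory."""
    m = mnemonic.upper()
    if m in WRITE_ONLY:
        return "write"
    if m in READ_MODIFY_WRITE:
        return "read-write"
    return "read"  # LDA, CMP, BIT, ADC and so on

def register_label(mnemonic, read_name, write_name):
    """Pick a label for a dual-purpose register, or None when the
    instruction both reads and writes and no single name fits."""
    kind = access_kind(mnemonic)
    if kind == "write":
        return write_name
    if kind == "read":
        return read_name
    return None

for m in ("LDA", "STA", "ROL"):
    print(m, register_label(m, "ADLC_STATUS", "ADLC_CONTROL"))
```

Returning None for the read-write case leaves the decision to the caller, which could fall back to the default label or raise a warning.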

Post by SteveF »

Good spot Toby, I've added a TODO to my earlier post to make this obvious if it does get lifted from here and put into the "standard library" at some point.

Post by KenLowe »

Hi Steve,

Just tried this for the first time today and have been able to create a relatively simple disassembly (see here for some history). Really like how powerful it is, yet simple(ish) to use =D> =D> =D>.

However, there's one bit that's not working quite correctly for me, and I'm not quite sure how to fix it. Basically I've got an envelope table with 4 different entries (env_01..env_04). The start address of each entry is defined in another table, which is stored at address 'env_table':

Code: Select all

; &0b1b referenced 2 times by &0aa2, &0b32
.envelope_setup
    ldy env_table_offset                                              ; 0b1b: ac 36 0b    .6.
    ldx env_table,y                                                   ; 0b1e: be 37 0b    .7.
    iny                                                               ; 0b21: c8          .
    lda env_table,y                                                   ; 0b22: b9 37 0b    .7.
    iny                                                               ; 0b25: c8          .
    sty env_table_offset                                              ; 0b26: 8c 36 0b    .6.
    tay                                                               ; 0b29: a8          .
    lda #osword_envelope                                              ; 0b2a: a9 08       ..
    jsr osword                                                        ; 0b2c: 20 f1 ff     ..            ; ENVELOPE command
    dec env_count                                                     ; 0b2f: ce 35 0b    .5.
    bne envelope_setup                                                ; 0b32: d0 e7       ..
    rts                                                               ; 0b34: 60          `

; &0b35 referenced 1 time by &0b2f
.env_count
    equb 4                                                            ; 0b35: 04          .
; &0b36 referenced 2 times by &0b1b, &0b26
.env_table_offset
    equb 0                                                            ; 0b36: 00          .
; &0b37 referenced 2 times by &0b1e, &0b22
.env_table
    equw env_01, env_02, env_03, env_04                               ; 0b37: 3f 0b 4d... ?.M
.env_01
    equb   1,   1,   0,   0,   0,   0,   0,   0, &7e, &ff,   0, &ff   ; 0b3f: 01 01 00... ...
    equb &7e,   0                                                     ; 0b4b: 7e 00       ~.
.env_02
    equb   2,   3,   0,   0,   0,   1,   1,   1, &5a, &ec, &ec, &fe   ; 0b4d: 02 03 00... ...
    equb &5a,   0                                                     ; 0b59: 5a 00       Z.
.env_03
    equb   3,   2,   1,   1,   0,   5, &0a, &28, &1e, &f6, &f6, &f1   ; 0b5b: 03 02 01... ...
    equb &7f,   0                                                     ; 0b67: 7f 00       ..
.env_04
    equb   4, &83,   0,   0,   0, &19,   2, &fe, &6e,   0, &fc, &f8   ; 0b69: 04 83 00... ...
    equb &6e, &50                                                     ; 0b75: 6e 50       nP
To achieve the above, this is what I've done. In particular, I've used the expr() function to build the table of four addresses at env_table:

Code: Select all

label(0x0b1b, "envelope_setup")
label(0x0b35, "env_count")
label(0x0b36, "env_table_offset")
label(0x0b37, "env_table")
label(0x0b3f, "env_01")
label(0x0b4d, "env_02")
label(0x0b5b, "env_03")
label(0x0b69, "env_04")

expr(0x0b37, "env_01")
expr(0x0b39, "env_02")
expr(0x0b3b, "env_03")
expr(0x0b3d, "env_04")
...but this has created the following asserts in the assembly, which I don't really want. These env_0x variables should be dynamic and change as the code grows and shrinks.

Code: Select all

    assert env_01 == &0b3f
    assert env_02 == &0b4d
    assert env_03 == &0b5b
    assert env_04 == &0b69
I'm sure there must be a better way of doing this? I thought I might be able to substitute the address within expr() with the variable 'env_table':

Code: Select all

expr(env_table, "env_01")
expr(env_table+2, "env_02")
expr(env_table+4, "env_03")
expr(env_table+6, "env_04")
...but that didn't work. Any ideas?

Post by SteveF »

Hi Ken, I'm glad it's mostly working for you!

I hope the following doesn't seem patronising; I'm trying to give a general "how py8dis works in this area" answer, and I appreciate you may already know some of this.

The idea behind these asserts is that they make it more obvious if you make a mistake during the disassembly, as py8dis currently has no way to evaluate the second argument of expr() to see if it's right.

For example, you (correctly, I assume) have foo.py with:

Code: Select all

expr(0x0b37, "env_01")
expr(0x0b39, "env_02")
Let's suppose you accidentally got these the wrong way round and wrote:

Code: Select all

expr(0x0b37, "env_02")
expr(0x0b39, "env_01")
If you ran your foo.py script with this, py8dis would happily spit out:

Code: Select all

.env_table
    equw env_02, env_01, env_03, env_04                             
This will assemble but the results won't be identical to the input. Despite what I'm about to say about py8dis trying hard to avoid this being a problem, I advise using a shell/batch script to run your foo.py script, reassemble the output afterwards and compare it against the input to make sure it's identical. This adds a lot of peace of mind, even if py8dis tries to avoid problems. (You may well be doing this already, but I wanted to emphasise the value in doing it for anyone else reading this thread in future.)

However, even if you are automatically reassembling and comparing the binaries to catch problems like this, when that happens you will have to work back from the mismatch at offset n in the binary, convert that to a hex address in the program and work out what you did wrong. So py8dis emits these asserts to check at assembly time that env_01 does evaluate to 0x0b3f (the data in memory corresponding to its position in the equw statement), hopefully making it relatively quick and painless to fix. (You might like to try deliberately making the wrong way round error above and seeing if I'm right; I hope I am, but I may have missed something.)
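The "work back from the mismatch at offset n" step can be sketched in a few lines of plain Python. This is an illustration only: `first_mismatch` is a made-up helper, and it assumes the binary is loaded at a single address with no move() involved.

```python
def first_mismatch(original, rebuilt, load_addr):
    """Return the runtime address of the first differing byte, or None
    if the two binaries are identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return load_addr + offset
    if len(original) != len(rebuilt):
        # One file is a prefix of the other: the difference starts where
        # the shorter one ends.
        return load_addr + min(len(original), len(rebuilt))
    return None

# The first two words of env_table, and the same words with env_01 and
# env_02 swapped as in the example above.
original = bytes([0x3f, 0x0b, 0x4d, 0x0b])
swapped = bytes([0x4d, 0x0b, 0x3f, 0x0b])
print(hex(first_mismatch(original, swapped, 0x0b37)))
```

The assertions do the same job at assembly time and name the offending label directly, which is why they are usually quicker to act on.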

(In this specific case, py8dis probably could and probably should be capable of evaluating the simple label "env_01" and realising it has value 0xb3f so it can generate an error immediately when it's emitted as part of the equw with 0xb4d in the corresponding part of the load()ed binary, rather than emitting an assertion into the output. But the expression could be something arbitrarily complicated in the syntax of an arbitrary-ish assembler like expr(0xb37, "lo(foo)+2*bar"), so doing this in general is trickier. By making the assembler do the work, we sidestep this problem and we also gain an additional sanity check because if a bug in py8dis means the env_01 label is *not* emitted in the right place and it doesn't actually have value 0xb3f even though it should, the assertion will catch that too.)

I understand that once you have a good disassembly, you will want to start moving things around and these assertions will start to fail. The idea - which I appreciate is not conveyed very well by the virtually non-existent "getting started with py8dis" documentation :-) - is that once you've abandoned py8dis like a spent rocket booster after the disassembly phase, you can remove the assertions from the generated code and start to edit it to your heart's content.

It does seem a bit fiddly to have to manually delete them, even if it's probably a one-off step, so I've just pushed a commit to the master branch which adds a new:

Code: Select all

config.set_include_assertions(False)
option you can specify in your foo.py file to disable generation of assertions. I do strongly suggest that you leave them on until you have finished all your disassembly work and are going to start modifying the resulting code. (Again, emphasis just to attract the attention of anyone reading this in future.)

I hope this makes sense and that I haven't got hold of the wrong end of the stick here - let me know if I have!

Edit: Just to be clear, I think the code you've written is perfectly correct, except the last fragment of Python code where you attempted to work around the assertions (and you already know that didn't work anyway).

Edit: It may or may not be helpful if I point out that the assertion:

Code: Select all

    assert env_01 == &0b3f
is generated because you wrote:

Code: Select all

expr(0x0b37, "env_01")
and the word at address 0x0b37 in the binary is 0x0b3f.

The fact that you wrote:

Code: Select all

label(0x0b3f, "env_01")
doesn't cause the assertion to be generated.
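The relationship between expr() and the emitted assertion can be modelled as follows. This is a standalone sketch rather than py8dis internals: `expected_assertions` is a hypothetical helper, and the bytes are the eight bytes of env_table taken from the listing earlier in the thread.

```python
def expected_assertions(memory, base, table_addr, names):
    """For each name placed on a word of the table, read the little-endian
    word actually stored there and state that the name must equal it."""
    lines = []
    for i, name in enumerate(names):
        offset = table_addr - base + 2 * i
        value = memory[offset] | (memory[offset + 1] << 8)
        lines.append("assert %s == &%04x" % (name, value))
    return lines

# Bytes at &0b37..&0b3e: the four pointers in env_table.
table = bytes([0x3f, 0x0b, 0x4d, 0x0b, 0x5b, 0x0b, 0x69, 0x0b])
names = ["env_01", "env_02", "env_03", "env_04"]
for line in expected_assertions(table, 0x0b37, 0x0b37, names):
    print(line)
```

The value on the right-hand side comes from the binary, never from the label() call, so a label placed at the wrong address makes the assertion fail when the output is assembled.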

Post by KenLowe »

Thank you for the comprehensive reply. All noted and understood. I can, indeed, remove the asserts once I'm finished with py8dis. I just thought I must have been doing something wrong to generate them in the first place.
SteveF wrote: Wed Nov 08, 2023 2:10 am Despite what I'm about to say about py8dis trying hard to avoid this being a problem, I advise using a shell/batch script to run your foo.py script, reassemble the output afterwards and compare it against the input to make sure it's identical. This adds a lot of peace of mind, even if py8dis tries to avoid problems. (You may well be doing this already, but I wanted to emphasise the value in doing it for anyone else reading this thread in future.)
So, what I've been doing is the following:
  1. Open the control file repton.py and the disassembly file repton.asm in Notepad++
  2. Open the command prompt so I can run the python script.
  3. Make small adjustments to the control file (adding label(), expr(), etc) and save that
  4. Switch to the command prompt and run the script
  5. Switch back to Notepad++ which recognises that repton.asm has changed, and reloads the latest version
  6. Check that the repton.asm changes match what I have tried to do
  7. Loop back to step 3
At this stage I've not actually done a re-assembly at point 4 to check against the original, but the changes I'm making are small enough that I'm fairly confident I'm not introducing any errors. However, I will have a look to see how easy it is to do a more thorough and automated check each time I run the python script.
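One possible shape for the automated check is sketched below. It only covers the comparison step; running the control script and the assembler is left out because the exact commands depend on the setup. The helper names and the file names in the usage comment are hypothetical. MD5 is used simply because the load() call shown earlier in the thread takes what appears to be an MD5 checksum of the input.

```python
import hashlib
from pathlib import Path

def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()

def rebuild_matches(original_path, rebuilt_path):
    """True if the reassembled binary is byte-for-byte identical to the
    original, judged by comparing checksums."""
    original = Path(original_path).read_bytes()
    rebuilt = Path(rebuilt_path).read_bytes()
    return md5_hex(original) == md5_hex(rebuilt)

# Hypothetical usage after each run of the control file and the assembler:
#     if not rebuild_matches("repton.orig", "repton.new"):
#         print("Reassembled binary differs from the original")
```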

Post by dp11 »

A little bit of help please. How should I describe this function, which takes a string terminated by a byte with bit 7 set?

Code: Select all

.c9190
    jsr pass_string_Acorn_VFS                                         ; 9190: 20 57 8c     W.
    equs "Acorn VFS", &0d                                             ; 9193: 41 63 6f... Aco
; overlapping: sta l0818                                              ; 919d: 8d 18 08    ...
    equb &8d, &18       
I'm currently using

Code: Select all

hook_subroutine(0x8c57, "pass_string_Acorn_VFS", stringhi_hook)
I'd like the termination char 0x8d to be included with the equs line so the next instruction decodes correctly

Post by SteveF »

I don't have access to a PC right now to test this, but I suspect you can define your own hook function which is a lightly tweaked copy of the existing stringhi_hook:

Code: Select all

def my_stringhi_hook(target, addr):
    return stringhi(addr + 3) + 1
stringhi() also takes an optional include_terminator_fn which *might* let you do something like:

Code: Select all

def my_stringhi_hook(target, addr):
    return stringhi(addr + 3, include_terminator_fn=lambda b: True)
Are you sure the &8d isn't an actual STA opcode? I am wondering if some other bug in py8dis is confusing matters here.

Post by dp11 »

I think the 0x8D is the terminator

Code: Select all

.pass_string_Acorn_VFS
    pla                                                               ; 8c57: 68          h
    sta l00b6                                                         ; 8c58: 85 b6       ..
    pla                                                               ; 8c5a: 68          h
    sta l00b7                                                         ; 8c5b: 85 b7       ..
    ldy #1                                                            ; 8c5d: a0 01       ..
; &8c5f referenced 1 time by &8c67
.loop_c8c5f
    lda (l00b6),y                                                     ; 8c5f: b1 b6       ..
    bmi c8c69                                                         ; 8c61: 30 06       0.
    jsr sub_c8c7a                                                     ; 8c63: 20 7a 8c     z.
    iny                                                               ; 8c66: c8          .
    bne loop_c8c5f                                                    ; 8c67: d0 f6       ..
; &8c69 referenced 1 time by &8c61
.c8c69
    and #&7f                                                          ; 8c69: 29 7f       ).
    jsr sub_c8c7a                                                     ; 8c6b: 20 7a 8c     z.
    tya                                                               ; 8c6e: 98          .
    clc                                                               ; 8c6f: 18          .
    adc l00b6                                                         ; 8c70: 65 b6       e.
    tay                                                               ; 8c72: a8          .
    lda #0                                                            ; 8c73: a9 00       ..
    adc l00b7                                                         ; 8c75: 65 b7       e.
    pha                                                               ; 8c77: 48          H
    phy                                                               ; 8c78: 5a          Z
    rts   
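Reading the routine above, the string starts straight after the 3-byte JSR and runs up to and including the first byte with bit 7 set, after which execution resumes. A plain-Python model of that (not py8dis code; `resume_address` is a made-up name and the dictionary stands in for memory) looks like this:

```python
def resume_address(memory, jsr_addr):
    """Return the address where execution continues after the inline
    string that follows the JSR at jsr_addr."""
    addr = jsr_addr + 3                  # first byte after the JSR operand
    while (memory[addr] & 0x80) == 0:    # ordinary characters have bit 7 clear
        addr += 1
    return addr + 1                      # skip the terminator itself

# Bytes from the listing at &9190: jsr &8c57, "Acorn VFS", &0d, &8d, &18
base = 0x9190
image = bytes([0x20, 0x57, 0x8c]) + b"Acorn VFS\r" + bytes([0x8d, 0x18])
memory = {base + i: b for i, b in enumerate(image)}
print(hex(resume_address(memory, base)))
```

On this data the terminator is the &8d at &919d, so code continues at &919e, which is consistent with wanting the &8d kept in the equs line rather than decoded as an STA.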

Post by SteveF »

I think you're right (though I'm not immune to the power of suggestion!), and I suppose it makes sense as it would be a double CR after the string.

Did either of my suggested tweaks fix the disassembly?

Post by BigEd »

Wondering if I'm doing something dim... I'm loading a 16k ROM to 0xc000, and somehow py8dis is trying to trace an address of 0x10000 which springs an assertion. Any idea how I can narrow down what's going on?

Post by hoglet »

BigEd wrote: Tue Feb 06, 2024 4:56 pm Wondering if I'm doing something dim... I'm loading a 16k ROM to 0xc000, and somehow py8dis is trying to trace an address of 0x10000 which springs an assertion. Any idea how I can narrow down what's going on?
Does anything change if you mark the vectors at the end of the ROM?

Code: Select all

wordentry(0xfffa, 3)
What are you disassembling? The Digiac ROM?

Post by BigEd »

No, same thing. I added a print line to trace.py:

Code: Select all

        for runtime_addr, label in self.labels.items():
            runtime_addr = memorymanager.RuntimeAddr(runtime_addr) # TODO: OK? Should keys in this dict be RuntimeAddrs to start with?                  
            print("runtime_addr:", hex(runtime_addr))
and I got this output

Code: Select all

runtime_addr: 0xe0dc
runtime_addr: 0xf022
runtime_addr: 0xe0ca
runtime_addr: 0xc000
runtime_addr: 0x10000
The first three are the vectors, the C000 is the load address (not sure why although it does happen to be valid) and then the fatal overlarge address.

Post by BigEd »

I think we've found it, looks like a bug - if we truncate the final two bytes from the ROM, then all is well.
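The thread does not say what the underlying bug was. The arithmetic below only shows where the number 0x10000 sits relative to the ROM: it is the address one past the last byte of a 16K image loaded at 0xc000, and the final two bytes are the 6502 IRQ/BRK vector at 0xfffe. `rom_bounds` is a hypothetical helper written for this note.

```python
def rom_bounds(load_addr, size):
    """Return (first, last, end) where 'end' is one past the last byte."""
    end = load_addr + size
    return load_addr, end - 1, end

first, last, end = rom_bounds(0xc000, 16 * 1024)
print(hex(first), hex(last), hex(end))

# The three 6502 vectors occupy the last six bytes of the address space.
VECTORS = {"NMI": 0xfffa, "RESET": 0xfffc, "IRQ/BRK": 0xfffe}
# Only the IRQ/BRK vector has its high byte in the final byte of the ROM.
print([name for name, addr in VECTORS.items() if addr + 1 == last])
```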

Post by dp11 »

SteveF wrote: Mon Feb 05, 2024 2:25 am

Code: Select all

def my_stringhi_hook(target, addr):
    return stringhi(addr + 3, include_terminator_fn=lambda b: True)
The above solves my issue. Thanks

Post by TobyLobster »

BigEd wrote: Tue Feb 06, 2024 5:47 pm I think we've found it, looks like a bug - if we truncate the final two bytes from the ROM, then all is well.
I've submitted a pull request with a fix.

Post by BigEd »

Thanks!

Post by hoglet »

I'm helping BigEd with a disassembly of the Digiac-MAC-III ROM.

We're struggling to find a way to handle this case....

The code has a table of pointers to error strings:

Code: Select all

.lee69
    equw &effa, &ee9f, &eeb1, &eec0, &eed8, &eeed, &ef03, &effa, &ef14, &effa
    equw &ef20, &ef41, &ef56, &effa, &effa, &effa, &effa, &effa, &effa, &effa
    equw &ef6e, &ef81, &ef8e, &ef9c, &efad, &efc7, &efe3
 
This is followed by the strings themselves:

Code: Select all

   
    equs "  Unknown command"
    equb &0d
    equs "  Syntax error"
    equb &0d
    equs "  Unknown register name"
    equb &0d
    equs "  Value out of range"
    equb &0d
    equs "  Device name unknown"
    equb &0d
    equs "  Invalid number"
    equb &0d
    equs "  Missing ", '"'
    equb &0d
    equs "  No more breakpoints can be set"
    equb &0d
    equs "  No breakpoints set"
    equb &0d
    equs "  Memory limit exceeded"
    equb &0d
    equs "  Device not ready"
    equb &0d
    equs "  Read error"
    equb &0d
    equs "  Write error"
    equb &0d
    equs "  Checksum error"
    equb &0d
    equs "  Unknown 'S' record type"
    equb &0d
    equs "  Escape character detected"
    equb &0d
    equs "  Invalid I/O function"
    equb &0d, &0d
I'd like to end up with the table filled with automatically generated labels (string_error_0, string_error_1, etc). I was expecting to have to write some python to iterate over the table, but I can't even find a way to do this manually.

Specifically, if I define a label:

Code: Select all

label(0xee9f, "string_error_0")
How can I get this label to be used in the pointer table?

It seems this works automatically for jump tables (using wordentry()) but not for data tables defined with word().

I feel I must be missing something obvious here....

Dave

Post by TobyLobster »

hoglet wrote: Sun Feb 18, 2024 10:48 am We're struggling to find a way to handle this case....
In theory, the manual way should be:

Code: Select all

label(0xeffa, "string_error_0")
expr(0xee69, "string_error_0")
This can be automated by reading the memory and constructing the addresses, something like:

Code: Select all

for i in range(27):
    addr = get_u8_binary(0xee69+2*i) + 256*get_u8_binary(0xee69+2*i + 1)
    label(addr, f"string_error_{i}")
    expr(0xee69+2*i, f"string_error_{i}")
Untested.

Post by hoglet »

TobyLobster wrote: Sun Feb 18, 2024 3:05 pm Untested.
Thanks Toby, something like that worked:

Code: Select all

# a table of error strings
word(0xee69, 27)
for i in range(27):
    addr = get_u8_binary(0xee69+2*i) + 256*get_u8_binary(0xee69+2*i + 1)
    label(addr, "string_error_" + str(i))
    expr(0xee69+2*i, "string_error_" + str(i))
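One thing to be aware of with the loop above: the table contains &effa many times, so that address is given several different names. If one name per distinct string is preferred, the pointers can be de-duplicated first. The sketch below is plain Python on a copy of the first eight table entries; `name_table_targets` is a hypothetical helper, and its numbering counts distinct strings rather than table slots, so the names differ from the index-based ones above.

```python
def name_table_targets(table, prefix):
    """Read little-endian words from 'table' and give each distinct target
    address one name, so repeated pointers share a label."""
    names = {}      # target address -> label name
    entries = []    # label name for each table slot, in order
    for i in range(0, len(table) - 1, 2):
        target = table[i] | (table[i + 1] << 8)
        if target not in names:
            names[target] = "%s%d" % (prefix, len(names))
        entries.append(names[target])
    return names, entries

# First eight pointers of the table at &ee69, stored low byte first.
words = [0xeffa, 0xee9f, 0xeeb1, 0xeec0, 0xeed8, 0xeeed, 0xef03, 0xeffa]
table = b"".join(w.to_bytes(2, "little") for w in words)
names, entries = name_table_targets(table, "string_error_")
print(entries)
```

In a control file, `names` would then drive the label() calls and `entries` the expr() calls, one per table slot.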

Post by KenLowe »

I have a similar question, but the high and low bytes in the table are separated by 5 bytes:

Code: Select all

; &8e0c referenced 1 time by &8e12
.loop_c8e0c
    lda l00bd,y                                                       ; 8e0c: b9 bd 00    ...
    sta (l009c),y                                                     ; 8e0f: 91 9c       ..
    dey                                                               ; 8e11: 88          .
    bpl loop_c8e0c                                                    ; 8e12: 10 f8       ..
    iny                                                               ; 8e14: c8          .
    lda (l00f0),y                                                     ; 8e15: b1 f0       ..
    rts                                                               ; 8e17: 60          `

; &8e18 referenced 1 time by &8e06
.l8e18
    equb &32, &ef, &52, &7a, &71                                      ; 8e18: 32 ef 52... 2.R
; &8e1d referenced 1 time by &8e02
.l8e1d
    equb &8e, &8e, &8e, &8e, &8f                                      ; 8e1d: 8e 8e 8e... ...
From the table, the code to be executed is at:

Code: Select all

&8e33
&8ef0
&8e53
&8e7b
&8f72
ie address in table + 1
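The decoding of the split table can be checked with a few lines of plain Python. This is only a worked calculation on the bytes shown above; `split_table_targets` is a made-up name. Storing "address minus one" is commonly a sign that dispatch is done by pushing the bytes and executing RTS, but the dispatch code is not shown in this post, so treat that as an assumption.

```python
lo = [0x32, 0xef, 0x52, 0x7a, 0x71]   # low bytes, stored at &8e18
hi = [0x8e, 0x8e, 0x8e, 0x8e, 0x8f]   # high bytes, stored at &8e1d

def split_table_targets(lo_bytes, hi_bytes, adjust=1):
    """Combine parallel low/high byte tables into addresses and apply the
    'table holds address - 1' adjustment."""
    return [((h << 8) | l) + adjust for l, h in zip(lo_bytes, hi_bytes)]

targets = split_table_targets(lo, hi)
print([hex(t) for t in targets])
```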

I've already defined labels for the entry points:

Code: Select all

entry(0x8E33,label="l8E33")
entry(0x8E53,label="l8E53")
entry(0x8EF0,label="l8EF0")
entry(0x8E7B,label="l8E7B")
entry(0x8F72,label="l8F72")
I would like the table to update if the code gets relocated. I don't think I need to worry too much about the table being expanded, but if it's easy to manage that as well, that would be ideal.

What's the best way to do this?

Thanks

Edit: I think I might have got this now. It's probably not the most elegant way of doing it, but I think it works:

Code: Select all

byte(0x8E18,n=10, cols=1)
expr(0x8e18, make_lo(make_subtract("l8e33", 1)))
expr(0x8e1d, make_hi(make_subtract("l8e33", 1)))
expr(0x8e18+1, make_lo(make_subtract("l8ef0", 1)))
expr(0x8e1d+1, make_hi(make_subtract("l8ef0", 1)))
expr(0x8e18+2, make_lo(make_subtract("l8e53", 1)))
expr(0x8e1d+2, make_hi(make_subtract("l8e53", 1)))
expr(0x8e18+3, make_lo(make_subtract("l8e7b", 1)))
expr(0x8e1d+3, make_hi(make_subtract("l8e7b", 1)))
expr(0x8e18+4, make_lo(make_subtract("l8f72", 1)))
expr(0x8e1d+4, make_hi(make_subtract("l8f72", 1)))
This generates:

Code: Select all

; &8e0c referenced 1 time by &8e12
.loop_c8e0c
    lda l00bd,y                                                       ; 8e0c: b9 bd 00    ...
    sta (l009c),y                                                     ; 8e0f: 91 9c       ..
    dey                                                               ; 8e11: 88          .
    bpl loop_c8e0c                                                    ; 8e12: 10 f8       ..
    iny                                                               ; 8e14: c8          .
    lda (l00f0),y                                                     ; 8e15: b1 f0       ..
    rts                                                               ; 8e17: 60          `

; &8e18 referenced 1 time by &8e06
.l8e18
l8e1d = l8e18+5
    equb <(l8e33 - 1)                                                 ; 8e18: 32          2
    equb <(l8ef0 - 1)                                                 ; 8e19: ef          .
    equb <(l8e53 - 1)                                                 ; 8e1a: 52          R
    equb <(l8e7b - 1)                                                 ; 8e1b: 7a          z
    equb <(l8f72 - 1)                                                 ; 8e1c: 71          q
    equb >(l8e33 - 1)                                                 ; 8e1d: 8e          .
    equb >(l8ef0 - 1)                                                 ; 8e1e: 8e          .
    equb >(l8e53 - 1)                                                 ; 8e1f: 8e          .
    equb >(l8e7b - 1)                                                 ; 8e20: 8e          .
    equb >(l8f72 - 1)                                                 ; 8e21: 8f          .
; &8e1d referenced 1 time by &8e02