py8dis - a programmable static tracing 6502 disassembler in Python
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Thanks Toby, I've raised https://github.com/fachat/xa65/issues/10 for this.
- TobyLobster
- Posts: 622
- Joined: Sat Aug 31, 2019 7:58 am
- Contact:
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I've made a number of improvements to py8dis, and these are all now merged into the master branch. To try the latest and greatest, pull the master branch https://github.com/ZornsLemma/py8dis . This is a substantial change - do shout if you find issues!
Everything is documented in the README.md, but here's a summary of the changes:
- Local labels are new. They are like regular labels but are only used when referenced from within a given range or ranges of addresses. This is useful when a memory location is reused elsewhere in the binary for a different purpose, since each use can then have its own label name. (This could be done previously using a label maker hook function, but this is simpler.)
- Suppose we have a dictionary where the keys are integers in the range 0-255 and the values are constant names of a particular type (e.g. sprite ids). We can pass this to "expr(addr, dict)" to substitute the appropriate constant name for the byte value at that address.
- That same dictionary can be passed to a new function "substitute_constants(instruction, reg, dict)" which looks for a given instruction and if there is a load immediate instruction somewhere before it, substitutes the constant name.
e.g. it will find instances of the instruction "jsr plot_sprite" and a load immediate instruction before it, e.g.:
Code: Select all
lda #SpriteId_Egg ; <--- number replaced by constant name automatically
ldx ...
ldy ...
jsr plot_sprite
- Inline comments: Support for writing comments at the end of a line.
- A new option to show the cycle count of every instruction.
- Output of the initial list of constants and memory labels is tidier.
- If assembler specific code is required, an assembler can be tested for using the new "is_assembler()" function.
- Support for 8080 code, adding to the existing support for 6502 and 65c02.
- A testing facility (See "./go" script in "/examples") which disassembles every example for each assembler, then reassembles them and checks the binaries are identical. Optionally, also checks the disassemblies match a known good version.
- Works with Python 2 or 3 with no third party dependencies (use of "six" is removed).
- Miscellaneous fixes.
- Updated the README documentation.
- The old 'import trace6502' system is no more. Remove the import line and add "6502" (for example) as the third parameter to the load() function.
- Configuration now has proper setter functions. These are listed in the README.md.
- hook_subroutine() is a regular command, no need to do "from trace6502 import hook_subroutine"
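As a rough illustration of the constant-substitution idea in the list above, here is a standalone Python sketch. It does not call py8dis and is not how substitute_constants() is implemented; the tuple representation of instructions, the helper name substitute_before_call, and every sprite id other than SpriteId_Egg (including all the numeric values) are assumptions made up for the example.

```python
# Standalone illustration only: instructions are modelled as (mnemonic, operand).
# The names and numbers in this dictionary are hypothetical.
sprite_ids = {0: "SpriteId_Player", 1: "SpriteId_Egg", 2: "SpriteId_Hen"}

def substitute_before_call(instructions, call, load_mnemonic, names):
    """Return a copy in which the nearest matching load before each `call`
    has its numeric operand replaced by the matching constant name."""
    result = list(instructions)
    for i, instr in enumerate(result):
        if instr != call:
            continue
        # Walk backwards from the call looking for e.g. "lda #<number>".
        for j in range(i - 1, -1, -1):
            mnemonic, operand = result[j]
            if mnemonic == load_mnemonic:
                if isinstance(operand, int) and operand in names:
                    result[j] = (mnemonic, names[operand])
                break  # only the nearest load is considered
    return result

code = [("lda", 1), ("ldx", 10), ("ldy", 20), ("jsr", "plot_sprite")]
patched = substitute_before_call(code, ("jsr", "plot_sprite"), "lda", sprite_ids)
for mnemonic, operand in patched:
    print(mnemonic, operand)
```

A real tool would presumably also stop the backwards search at labels and branches; this sketch ignores control flow entirely.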
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Thanks Toby, this is some amazing work - you were obviously too polite to mention that you've also generally tidied up the code and it's a lot more readable than it was before.
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Always good to see py8dis getting an update - I’m sure I’m going to find some of these new features useful in my next project.
Thanks Toby, thanks Steve. This tool is great!
Mark
- TobyLobster
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I've made some further improvements to py8dis, now available in the master branch: https://github.com/ZornsLemma/py8dis
It can now automatically detect all calls to regular OS routines (including all of OSBYTE, OSWORD, OSWRCH, OSFILE, OSARGS, OSGBPB, etc.) and automatically comment each time they are called. It knows the meaning of each OSBYTE and OSWORD individually, so where register values are known before the call, commentary is tailored to take this into account. After the call, any results in registers are also commented where used.
For example, from Chuckie Egg, these comments are generated automatically:
Code: Select all
lda #osbyte_scan_keyboard ; 306b: a9 79 .y
ldx #&81 ; 306d: a2 81 ..
jsr osbyte ; 306f: 20 f4 ff .. ; Test for 'CTRL' key pressed (X=129)
txa ; 3072: 8a . ; X has top bit set if 'CTRL' pressed
bpl didntpressctrl ; 3073: 10 05 ..
or from DFS226:
Code: Select all
lda #osbyte_read_oshwm ; 9a8d: a9 83 ..
jsr osbyte ; 9a8f: 20 f4 ff .. ; Read top of operating system RAM address (OSHWM)
sty l10cf ; 9a92: 8c cf 10 ... ; X and Y contain the address of OSHWM (low, high)
lda #osbyte_read_himem ; 9a95: a9 84 ..
jsr osbyte ; 9a97: 20 f4 ff .. ; Read top of user memory (HIMEM)
tya ; 9a9a: 98 . ; X and Y contain the address of HIMEM (low, high)
This feature is not only available for OS routines, but also for regular subroutines in the disassembly. The new `subroutine()` command allows you to describe a function in words and document its entry and exit parameters (all parameters are optional, so can be filled in as you go along). All calls to this subroutine will be automatically annotated.
If any automatic comment is not wanted at a particular address (maybe you prefer to write your own comment manually) then you can inhibit it using `no_automatic_comment(addr)`
One more minor feature is also added: comments can now be indented.
This is quite a large change, so if there are any problems, shout.
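For readers wondering how register-aware comments like these can be produced in principle, here is a minimal standalone sketch. It is not py8dis code: the table layout and the function name comment_for_osbyte are assumptions, and only the three OSBYTE calls shown in the listings above are covered, with the comment text copied from them.

```python
# Standalone sketch, not py8dis code. Descriptions are copied from the
# listings above; the lookup structure itself is an assumption.
OSBYTE_COMMENTS = {
    0x83: "Read top of operating system RAM address (OSHWM)",
    0x84: "Read top of user memory (HIMEM)",
}

def comment_for_osbyte(a, x=None):
    """Return a comment for 'jsr osbyte' given the known A (and maybe X) values."""
    if a == 0x79 and x is not None and (x & 0x80):
        # With the top bit of X set, OSBYTE &79 tests a single key whose
        # internal key number is X EOR &80 (internal key 1 is CTRL).
        key_number = x ^ 0x80
        key = "'CTRL' key" if key_number == 1 else "internal key %d" % key_number
        return "Test for %s pressed (X=%d)" % (key, x)
    return OSBYTE_COMMENTS.get(a)  # None when nothing is known about this call

print(comment_for_osbyte(0x79, 0x81))
print(comment_for_osbyte(0x83))
```

The point of the sketch is only that the comment can depend on more than one register when their values are known at the call site.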
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I love it. I'm now thinking of revisiting the ones I already did and updating them based on the latest version.
- TobyLobster
Re: py8dis - a programmable static tracing 6502 disassembler in Python
It is fun to see the comments magically appear
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Here's a quick question for the py8dis experts out there....
I'm doing a disassembly of the Atom Econet ROM and have come up against something I don't know how to do.
The MC68B54 has a write-only control register and a read-only status register mapped to the same address.
I'd like to have this called ADLC_CONTROL when it's written (e.g. in a STA instruction), and ADLC_STATUS when it's read (e.g. in an LDA).
I'm guessing it's not easy to do this?
Dave
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Hi Dave,
Perhaps a bit hacky, but how about this?
Code: Select all
from commands import *
import cpu6502
load(0x2000, "rw.orig", "6502", "f047c8f074c0396a3a210a95c1c0cc37")
entry(0x2000)
def our_label_maker(addr, context, suggestion):
    # TODO: I haven't worried about runtime vs binary addrs here; this only
    # matters if move() is used, but if this code gets promoted into the
    # "standard library" this needs thinking about.
    # TODO: Edge case of read+write instructions like "ROR abs"
    if disassembly.is_code(context - 1):
        opcode_value = memory[context - 1]
        opcode_obj = trace.cpu.opcodes.get(opcode_value)
        if opcode_obj is not None:
            if opcode_obj.mnemonic in ("STA", "STX", "STY"):
                if addr == 0xff00:
                    return "write_only_register"
    return None
set_label_maker_hook(our_label_maker)
label(0xff00, "read_only_register")
go()
rw.orig is in the attached zip file. It just does "LDA &FF00:STA &FF00".
This gives:
Code: Select all
read_only_register = &ff00
write_only_register = &ff00
org &2000
.pydis_start
lda read_only_register ; 2000: ad 00 ff ...
sta write_only_register ; 2003: 8d 00 ff ...
.pydis_end
The basic idea is to use a label maker hook to override the standard labelling process for writes (because they're easier to identify; if I'm not mistaken only STA/STX/STY can write to arbitrary addresses). Edit: I was slightly mistaken - see Toby's post below.
This works for me with the latest code on the (very slow work in progress) undocumented-opcodes branch, but I expect it will work with any recent branch.
- Attachments: rw.zip (954 Bytes)
Last edited by SteveF on Sat Dec 24, 2022 1:55 am, edited 2 times in total.
Re: py8dis - a programmable static tracing 6502 disassembler in Python
SteveF wrote: Thu Dec 22, 2022 5:01 pm
The basic idea is to use a label maker hook to override the standard labelling process for writes (because they're easier to identify; if I'm not mistaken only STA/STX/STY can write to arbitrary addresses).
This works for me with the latest code on the (very slow work in progress) undocumented-opcodes branch, but I expect it will work with any recent branch.
That looks just what I need - thanks!
Dave
- TobyLobster
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Nice, that looks good. If it were generalised in the library, one edge case is that it's not so clear what the label should be for ROL mem, LSR mem, etc., which both read and write.
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Good spot Toby, I've added a TODO to my earlier post to make this obvious if it does get lifted from here and put into the "standard library" at some point.
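To make the edge case concrete, here is a small standalone sketch of how a generalised version might classify instructions before choosing a label. It is not py8dis code; the function name register_label and the policy of returning None for read-modify-write instructions are assumptions for illustration, and only plain 6502 mnemonics are listed (65C02 additions such as STZ, TRB and TSB are left out).

```python
# Standalone sketch: classify 6502 mnemonics by how they touch a memory operand.
WRITE_ONLY = {"STA", "STX", "STY"}
READ_MODIFY_WRITE = {"ASL", "LSR", "ROL", "ROR", "INC", "DEC"}

def register_label(mnemonic, read_name, write_name):
    """Pick a label for an address that means different things on read and write.

    Returns None for read-modify-write instructions, where neither name is
    clearly right and a human should decide (a hypothetical policy).
    """
    m = mnemonic.upper()
    if m in WRITE_ONLY:
        return write_name
    if m in READ_MODIFY_WRITE:
        return None
    return read_name

for op in ("lda", "sta", "ror"):
    print(op, register_label(op, "ADLC_STATUS", "ADLC_CONTROL"))
```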
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Hi Steve,
Just tried this for the first time today and have been able to create a relatively simple disassembly (see here for some history). Really like how powerful it is, yet simple(ish) to use.
However, there's one bit that's not working quite correctly for me, and I'm not quite sure how to fix it. Basically I've got an envelope table with 4 different entries (env_01..env_04). The start address of each entry is defined in another table, which is stored at address 'env_table':
Code: Select all
; &0b1b referenced 2 times by &0aa2, &0b32
.envelope_setup
ldy env_table_offset ; 0b1b: ac 36 0b .6.
ldx env_table,y ; 0b1e: be 37 0b .7.
iny ; 0b21: c8 .
lda env_table,y ; 0b22: b9 37 0b .7.
iny ; 0b25: c8 .
sty env_table_offset ; 0b26: 8c 36 0b .6.
tay ; 0b29: a8 .
lda #osword_envelope ; 0b2a: a9 08 ..
jsr osword ; 0b2c: 20 f1 ff .. ; ENVELOPE command
dec env_count ; 0b2f: ce 35 0b .5.
bne envelope_setup ; 0b32: d0 e7 ..
rts ; 0b34: 60 `
; &0b35 referenced 1 time by &0b2f
.env_count
equb 4 ; 0b35: 04 .
; &0b36 referenced 2 times by &0b1b, &0b26
.env_table_offset
equb 0 ; 0b36: 00 .
; &0b37 referenced 2 times by &0b1e, &0b22
.env_table
equw env_01, env_02, env_03, env_04 ; 0b37: 3f 0b 4d... ?.M
.env_01
equb 1, 1, 0, 0, 0, 0, 0, 0, &7e, &ff, 0, &ff ; 0b3f: 01 01 00... ...
equb &7e, 0 ; 0b4b: 7e 00 ~.
.env_02
equb 2, 3, 0, 0, 0, 1, 1, 1, &5a, &ec, &ec, &fe ; 0b4d: 02 03 00... ...
equb &5a, 0 ; 0b59: 5a 00 Z.
.env_03
equb 3, 2, 1, 1, 0, 5, &0a, &28, &1e, &f6, &f6, &f1 ; 0b5b: 03 02 01... ...
equb &7f, 0 ; 0b67: 7f 00 ..
.env_04
equb 4, &83, 0, 0, 0, &19, 2, &fe, &6e, 0, &fc, &f8 ; 0b69: 04 83 00... ...
equb &6e, &50 ; 0b75: 6e 50 nP
To achieve the above, this is what I've done. In particular, I've used the expr() function to build the table of four addresses at env_table:
Code: Select all
label(0x0b1b, "envelope_setup")
label(0x0b35, "env_count")
label(0x0b36, "env_table_offset")
label(0x0b37, "env_table")
label(0x0b3f, "env_01")
label(0x0b4d, "env_02")
label(0x0b5b, "env_03")
label(0x0b69, "env_04")
expr(0x0b37, "env_01")
expr(0x0b39, "env_02")
expr(0x0b3b, "env_03")
expr(0x0b3d, "env_04")
...but this has created the following asserts in the assembly, which I don't really want. These env_0x variables should be dynamic and change as the code grows and shrinks.
Code: Select all
assert env_01 == &0b3f
assert env_02 == &0b4d
assert env_03 == &0b5b
assert env_04 == &0b69
I'm sure there must be a better way of doing this? I thought I might be able to substitute the address within expr() with the variable 'env_table':
Code: Select all
expr(env_table, "env_01")
expr(env_table+2, "env_02")
expr(env_table+4, "env_03")
expr(env_table+6, "env_04")
...but that didn't work. Any ideas?
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Hi Ken, I'm glad it's mostly working for you!
I hope the following doesn't seem patronising; I'm trying to give a general "how py8dis works in this area" answer, and I appreciate you may already know some of this.
The idea behind these asserts is that they make it more obvious if you make a mistake during the disassembly, as py8dis currently has no way to evaluate the second argument of expr() to see if it's right.
For example, you (correctly, I assume) have foo.py with:
Code: Select all
expr(0x0b37, "env_01")
expr(0x0b39, "env_02")
Let's suppose you accidentally got these the wrong way round and wrote:
Code: Select all
expr(0x0b37, "env_02")
expr(0x0b39, "env_01")
If you ran your foo.py script with this, py8dis would happily spit out:
Code: Select all
.env_table
equw env_02, env_01, env_03, env_04
This will assemble but the results won't be identical to the input. Despite what I'm about to say about py8dis trying hard to avoid this being a problem, I advise using a shell/batch script to run your foo.py script, reassemble the output afterwards and compare it against the input to make sure it's identical. This adds a lot of peace of mind, even if py8dis tries to avoid problems. (You may well be doing this already, but I wanted to emphasise the value in doing it for anyone else reading this thread in future.)
However, even if you are automatically reassembling and comparing the binaries to catch problems like this, when that happens you will have to work back from the mismatch at offset n in the binary, convert that to a hex address in the program and work out what you did wrong. So py8dis emits these asserts to check at assembly time that env_01 does evaluate to 0x0b3f (the data in memory corresponding to its position in the equw statement), hopefully making it relatively quick and painless to fix. (You might like to try deliberately making the wrong way round error above and seeing if I'm right; I hope I am, but I may have missed something.)
(In this specific case, py8dis probably could and probably should be capable of evaluating the simple label "env_01" and realising it has value 0xb3f so it can generate an error immediately when it's emitted as part of the equw with 0xb4d in the corresponding part of the load()ed binary, rather than emitting an assertion into the output. But the expression could be something arbitrarily complicated in the syntax of an arbitrary-ish assembler like expr(0xb37, "lo(foo)+2*bar"), so doing this in general is trickier. By making the assembler do the work, we sidestep this problem and we also gain an additional sanity check because if a bug in py8dis means the env_01 label is *not* emitted in the right place and it doesn't actually have value 0xb3f even though it should, the assertion will catch that too.)
I understand that once you have a good disassembly, you will want to start moving things around and these assertions will start to fail. The idea - which I appreciate is not conveyed very well by the virtually non-existent "getting started with py8dis" documentation - is that once you've abandoned py8dis like a spent rocket booster after the disassembly phase, you can remove the assertions from the generated code and start to edit it to your heart's content.
It does seem a bit fiddly to have to manually delete them, even if it's probably a one-off step, so I've just pushed a commit to the master branch which adds a new:
Code: Select all
config.set_include_assertions(False)
option you can specify in your foo.py file to disable generation of assertions. I do strongly suggest that you leave them on until you have finished all your disassembly work and are going to start modifying the resulting code. (Again, emphasis just to attract the attention of anyone reading this in future.)
I hope this makes sense and that I haven't got hold of the wrong end of the stick here - let me know if I have!
Edit: Just to be clear, I think the code you've written is perfectly correct, except the last fragment of Python code where you attempted to work around the assertions (and you already know that didn't work anyway).
Edit: It may or may not be helpful if I point out that the assertion:
Code: Select all
assert env_01 == &0b3f
is generated because you wrote:
Code: Select all
expr(0x0b37, "env_01")
and the word at address 0x0b37 in the binary is 0x0b3f.
The fact that you wrote:
Code: Select all
label(0x0b3f, "env_01")
doesn't cause the assertion to be generated.
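As a sanity check on the reasoning above, the following standalone Python sketch mimics what the emitted assertions test. It is not py8dis code and does not reflect how py8dis generates them; the helper names are made up, and the four table bytes are taken from the env_table listing earlier in the thread (the fourth byte is inferred from env_02 being &0b4d).

```python
# Standalone sketch: model the check that each generated "assert" line performs.
memory = {0x0b37: 0x3f, 0x0b38: 0x0b, 0x0b39: 0x4d, 0x0b3a: 0x0b}
labels = {"env_01": 0x0b3f, "env_02": 0x0b4d}

def read_word(addr):
    """Read a little-endian 16-bit word from the modelled binary."""
    return memory[addr] | (memory[addr + 1] << 8)

def failed_assertions(exprs):
    """exprs maps a table address to a label name, like expr(addr, name).

    Returns the assertion lines that would fail at assembly time."""
    return [
        "assert %s == &%04x" % (name, read_word(addr))
        for addr, name in exprs.items()
        if labels[name] != read_word(addr)
    ]

correct = {0x0b37: "env_01", 0x0b39: "env_02"}
swapped = {0x0b37: "env_02", 0x0b39: "env_01"}
print(failed_assertions(correct))
print(failed_assertions(swapped))
```

With the correct mapping nothing fails; with the two names swapped, both assertions fail and each one names the label and the expected value directly.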
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Thank you for the comprehensive reply. All noted and understood. I can, indeed, remove the asserts once I'm finished with py8dis. I just thought I must have been doing something wrong to generate them in the first place.
SteveF wrote: Wed Nov 08, 2023 2:10 am
Despite what I'm about to say about py8dis trying hard to avoid this being a problem, I advise using a shell/batch script to run your foo.py script, reassemble the output afterwards and compare it against the input to make sure it's identical. This adds a lot of peace of mind, even if py8dis tries to avoid problems. (You may well be doing this already, but I wanted to emphasise the value in doing it for anyone else reading this thread in future.)
So, what I've been doing is the following:
- Open the control file repton.py and the disassembly file repton.asm in Notepad++
- Open the command prompt so I can run the python script.
- Make small adjustments to the control file (adding label(), expr(), etc) and save that
- Switch to the command prompt and run the script
- Switch back to Notepad++ which recognises that repton.asm has changed, and reloads the latest version
- Check that the repton.asm changes match what I have tried to do
- Loop back to step 3
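For the "reassemble and compare" check suggested in the quote, the comparison half could look something like the sketch below. This is a hedged illustration only: the function name first_mismatch is made up, running the control script and the assembler is deliberately left out, and it assumes the binary is loaded at one contiguous address with no move() involved.

```python
def first_mismatch(original, rebuilt, load_addr):
    """Return the address of the first differing byte, or None if identical.

    `original` and `rebuilt` are bytes objects; `load_addr` is where the
    original binary is loaded, so the result is an address, not a file offset.
    """
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return load_addr + offset
    if len(original) != len(rebuilt):
        # One file is a prefix of the other: report where the shorter one ends.
        return load_addr + min(len(original), len(rebuilt))
    return None

# Example: the first two words of a pointer table written the wrong way round.
good = bytes([0x3f, 0x0b, 0x4d, 0x0b])
bad = bytes([0x4d, 0x0b, 0x3f, 0x0b])
where = first_mismatch(good, bad, 0x0b37)
print("identical" if where is None else "first difference at &%04x" % where)
```

In practice the two byte strings would come from reading the original file and the freshly assembled output.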
Re: py8dis - a programmable static tracing 6502 disassembler in Python
A little bit of help please. How should I describe this function, which takes a string terminated with bit 7 set?
Code: Select all
.c9190
jsr pass_string_Acorn_VFS ; 9190: 20 57 8c W.
equs "Acorn VFS", &0d ; 9193: 41 63 6f... Aco
; overlapping: sta l0818 ; 919d: 8d 18 08 ...
equb &8d, &18
I'm currently using
Code: Select all
hook_subroutine(0x8c57, "pass_string_Acorn_VFS", stringhi_hook)
I'd like the termination char 0x8d to be included with the equs line so the next instruction decodes correctly
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I don't have access to a PC right now to test this, but I suspect you can define your own hook function which is a lightly tweaked copy of the existing stringhi_hook:
Code: Select all
def my_stringhi_hook(target, addr):
    return stringhi(addr + 3) + 1
stringhi() also takes an optional include_terminator_fn which *might* let you do something like:
Code: Select all
def my_stringhi_hook(target, addr):
    return stringhi(addr + 3, include_terminator_fn=lambda b: True)
Are you sure the &8d isn't an actual STA opcode? I am wondering if some other bug in py8dis is confusing matters here.
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I think the 0x8D is the terminator
Code: Select all
.pass_string_Acorn_VFS
pla ; 8c57: 68 h
sta l00b6 ; 8c58: 85 b6 ..
pla ; 8c5a: 68 h
sta l00b7 ; 8c5b: 85 b7 ..
ldy #1 ; 8c5d: a0 01 ..
; &8c5f referenced 1 time by &8c67
.loop_c8c5f
lda (l00b6),y ; 8c5f: b1 b6 ..
bmi c8c69 ; 8c61: 30 06 0.
jsr sub_c8c7a ; 8c63: 20 7a 8c z.
iny ; 8c66: c8 .
bne loop_c8c5f ; 8c67: d0 f6 ..
; &8c69 referenced 1 time by &8c61
.c8c69
and #&7f ; 8c69: 29 7f ).
jsr sub_c8c7a ; 8c6b: 20 7a 8c z.
tya ; 8c6e: 98 .
clc ; 8c6f: 18 .
adc l00b6 ; 8c70: 65 b6 e.
tay ; 8c72: a8 .
lda #0 ; 8c73: a9 00 ..
adc l00b7 ; 8c75: 65 b7 e.
pha ; 8c77: 48 H
phy ; 8c78: 5a Z
rts
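To double-check that reading of the routine, here is a standalone Python model of what it does with the return address. This is not py8dis code; the function name resume_address is made up, and the byte values are taken from the c9190 listing earlier in the thread (the JSR, the string, &0d, &8d and the two bytes that follow).

```python
# Standalone sketch: the bytes at &9190 onwards, as shown in the c9190 listing.
data = bytes([0x20, 0x57, 0x8c]) + b"Acorn VFS" + bytes([0x0d, 0x8d, 0x18, 0x08])
rom = {0x9190 + i: b for i, b in enumerate(data)}

def resume_address(jsr_addr):
    """Model the routine: scan the inline string until a byte has bit 7 set."""
    pointer = jsr_addr + 2          # JSR pushes the address of its own last byte
    y = 1                           # "ldy #1" starts at the first string byte
    while rom[pointer + y] < 0x80:  # "bmi" leaves the loop on the terminator
        y += 1
    terminator = pointer + y        # this address is pushed back on the stack
    return terminator + 1           # RTS resumes one byte past the pushed address

print("terminator byte: &%02x" % rom[resume_address(0x9190) - 1])
print("execution resumes at &%04x" % resume_address(0x9190))
```

On this model the &8d at &919d is consumed as the terminator and execution resumes at &919e, which matches the suggestion that the terminator belongs with the string data.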
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I think you're right (though I'm not immune to the power of suggestion!), and I suppose it makes sense as it would be a double CR after the string.
Did either of my suggested tweaks fix the disassembly?
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Wondering if I'm doing something dim... I'm loading a 16k ROM to 0xc000, and somehow py8dis is trying to trace an address of 0x10000 which springs an assertion. Any idea how I can narrow down what's going on?
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Does anything change if you mark the vectors at the end of the ROM?
Code: Select all
wordentry(0xfffa, 3)
Re: py8dis - a programmable static tracing 6502 disassembler in Python
No, same thing. I added a print line to trace.py:
Code: Select all
for runtime_addr, label in self.labels.items():
    runtime_addr = memorymanager.RuntimeAddr(runtime_addr) # TODO: OK? Should keys in this dict be RuntimeAddrs to start with?
    print("runtime_addr:", hex(runtime_addr))
and I got this output
Code: Select all
runtime_addr: 0xe0dc
runtime_addr: 0xf022
runtime_addr: 0xe0ca
runtime_addr: 0xc000
runtime_addr: 0x10000
The first three are the vectors, the C000 is the load address (not sure why although it does happen to be valid) and then the fatal overlarge address.
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I think we've found it, looks like a bug - if we truncate the final two bytes from the ROM, then all is well.
Re: py8dis - a programmable static tracing 6502 disassembler in Python
SteveF wrote: Mon Feb 05, 2024 2:25 am
Code: Select all
def my_stringhi_hook(target, addr):
    return stringhi(addr + 3, include_terminator_fn=lambda b: True)
The above solves my issue. Thanks
- TobyLobster
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I've submitted a pull request with a fix.
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I'm helping BigEd with a disassembly of the Digiac-MAC-III ROM.
We're struggling to find a way to handle this case....
The code has a table of pointers to error strings:
Code: Select all
.lee69
equw &effa, &ee9f, &eeb1, &eec0, &eed8, &eeed, &ef03, &effa, &ef14, &effa
equw &ef20, &ef41, &ef56, &effa, &effa, &effa, &effa, &effa, &effa, &effa
equw &ef6e, &ef81, &ef8e, &ef9c, &efad, &efc7, &efe3
This is followed by the strings themselves:
Code: Select all
equs " Unknown command"
equb &0d
equs " Syntax error"
equb &0d
equs " Unknown register name"
equb &0d
equs " Value out of range"
equb &0d
equs " Device name unknown"
equb &0d
equs " Invalid number"
equb &0d
equs " Missing ", '"'
equb &0d
equs " No more breakpoints can be set"
equb &0d
equs " No breakpoints set"
equb &0d
equs " Memory limit exceeded"
equb &0d
equs " Device not ready"
equb &0d
equs " Read error"
equb &0d
equs " Write error"
equb &0d
equs " Checksum error"
equb &0d
equs " Unknown 'S' record type"
equb &0d
equs " Escape character detected"
equb &0d
equs " Invalid I/O function"
equb &0d, &0d
I'd like to end up with the table filled with automatically generated labels (string_error_0, string_error_1, etc). I was expecting to have to write some Python to iterate over the table, but I can't even find a way to do this manually.
Specifically, if I define a label:
Code: Select all
label(0xee9f, "string_error_0")
How can I get this label to be used in the pointer table?
It seems this works automatically for jump tables (using wordentry()) but not for data tables defined with word().
I feel I must be missing something obvious here....
Dave
- TobyLobster
Re: py8dis - a programmable static tracing 6502 disassembler in Python
In theory, the manual way should be:
Code: Select all
label(0xeffa, "string_error_0")
expr(0xee69, "string_error_0")
Code: Select all
for i in range(27):
    addr = get_u8_binary(0xee69+2*i) + 256*get_u8_binary(0xee69+2*i + 1)
    label(addr, f"string_error_{i}")
    expr(0xee69+2*i, f"string_error_{i}")
Re: py8dis - a programmable static tracing 6502 disassembler in Python
Thanks Toby, something like that worked:
Code: Select all
# a table of error strings
word(0xee69, 27)
for i in range(27):
    addr = get_u8_binary(0xee69+2*i) + 256*get_u8_binary(0xee69+2*i + 1)
    label(addr, "string_error_" + str(i))
    expr(0xee69+2*i, "string_error_" + str(i))
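One thing worth noticing about this table is that several slots point at the same string (&effa appears ten times). The standalone sketch below walks the same 27 words outside py8dis and reuses one name per distinct target. That reuse policy is a variation, not what the loop above does, and the function name table_labels is made up; the table values are copied from the lee69 listing earlier in the thread.

```python
import struct

# The 27 words from the table at &ee69, copied from the listing in the thread.
TABLE = [
    0xeffa, 0xee9f, 0xeeb1, 0xeec0, 0xeed8, 0xeeed, 0xef03, 0xeffa, 0xef14, 0xeffa,
    0xef20, 0xef41, 0xef56, 0xeffa, 0xeffa, 0xeffa, 0xeffa, 0xeffa, 0xeffa, 0xeffa,
    0xef6e, 0xef81, 0xef8e, 0xef9c, 0xefad, 0xefc7, 0xefe3,
]
raw = struct.pack("<27H", *TABLE)  # little-endian bytes, as they sit in the ROM

def table_labels(data, count, prefix):
    """Return one label name per table slot, reusing the name for repeated targets."""
    names = {}   # target address -> first name given to it
    slots = []
    for i in range(count):
        (target,) = struct.unpack_from("<H", data, 2 * i)
        names.setdefault(target, "%s%d" % (prefix, i))
        slots.append(names[target])
    return slots

slots = table_labels(raw, 27, "string_error_")
print(slots[:8])
print(len(set(slots)), "distinct strings")
```

In a control file the two values per slot would then feed label() and expr() as in the loop above.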
Re: py8dis - a programmable static tracing 6502 disassembler in Python
I have a similar question, but the high and low bytes in the table are separated by 5 bytes:
Code: Select all
; &8e0c referenced 1 time by &8e12
.loop_c8e0c
lda l00bd,y ; 8e0c: b9 bd 00 ...
sta (l009c),y ; 8e0f: 91 9c ..
dey ; 8e11: 88 .
bpl loop_c8e0c ; 8e12: 10 f8 ..
iny ; 8e14: c8 .
lda (l00f0),y ; 8e15: b1 f0 ..
rts ; 8e17: 60 `
; &8e18 referenced 1 time by &8e06
.l8e18
equb &32, &ef, &52, &7a, &71 ; 8e18: 32 ef 52... 2.R
; &8e1d referenced 1 time by &8e02
.l8e1d
equb &8e, &8e, &8e, &8e, &8f ; 8e1d: 8e 8e 8e... ...
From the table, the code to be executed is at:
Code: Select all
&8e33
&8ef0
&8e53
&8e7b
&8f72
i.e. address in table + 1
I've already defined labels for the entry points:
Code: Select all
entry(0x8E33,label="l8E33")
entry(0x8E53,label="l8E53")
entry(0x8EF0,label="l8EF0")
entry(0x8E7B,label="l8E7B")
entry(0x8F72,label="l8F72")
I would like the table to update if the code gets relocated. I don't think I need to worry too much about the table being expanded, but if it's easy to manage that as well, that would be ideal.
What's the best way to do this?
Thanks
Edit, I think I might have got this now. It's probably not the most elegant way of doing it, but I think it works:
Code: Select all
byte(0x8E18,n=10, cols=1)
expr(0x8e18, make_lo(make_subtract("l8e33", 1)))
expr(0x8e1d, make_hi(make_subtract("l8e33", 1)))
expr(0x8e18+1, make_lo(make_subtract("l8ef0", 1)))
expr(0x8e1d+1, make_hi(make_subtract("l8ef0", 1)))
expr(0x8e18+2, make_lo(make_subtract("l8e53", 1)))
expr(0x8e1d+2, make_hi(make_subtract("l8e53", 1)))
expr(0x8e18+3, make_lo(make_subtract("l8e7b", 1)))
expr(0x8e1d+3, make_hi(make_subtract("l8e7b", 1)))
expr(0x8e18+4, make_lo(make_subtract("l8f72", 1)))
expr(0x8e1d+4, make_hi(make_subtract("l8f72", 1)))
This generates:
Code: Select all
; &8e0c referenced 1 time by &8e12
.loop_c8e0c
lda l00bd,y ; 8e0c: b9 bd 00 ...
sta (l009c),y ; 8e0f: 91 9c ..
dey ; 8e11: 88 .
bpl loop_c8e0c ; 8e12: 10 f8 ..
iny ; 8e14: c8 .
lda (l00f0),y ; 8e15: b1 f0 ..
rts ; 8e17: 60 `
; &8e18 referenced 1 time by &8e06
.l8e18
l8e1d = l8e18+5
equb <(l8e33 - 1) ; 8e18: 32 2
equb <(l8ef0 - 1) ; 8e19: ef .
equb <(l8e53 - 1) ; 8e1a: 52 R
equb <(l8e7b - 1) ; 8e1b: 7a z
equb <(l8f72 - 1) ; 8e1c: 71 q
equb >(l8e33 - 1) ; 8e1d: 8e .
equb >(l8ef0 - 1) ; 8e1e: 8e .
equb >(l8e53 - 1) ; 8e1f: 8e .
equb >(l8e7b - 1) ; 8e20: 8e .
equb >(l8f72 - 1) ; 8e21: 8f .
; &8e1d referenced 1 time by &8e02
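For a longer split table, writing each pair by hand gets tedious and is easy to get wrong, so the pairs could be derived from the bytes instead. The standalone sketch below shows the arithmetic only: it is not py8dis code, the function name split_table_exprs is made up, the "l%04x" label naming simply follows the labels seen in the thread, and the expression strings imitate the `<(...)` / `>(...)` form in the generated output rather than calling make_lo() or make_hi().

```python
LO = [0x32, 0xef, 0x52, 0x7a, 0x71]   # the five bytes at &8e18
HI = [0x8e, 0x8e, 0x8e, 0x8e, 0x8f]   # the five bytes at &8e1d

def split_table_exprs(lo_bytes, hi_bytes, lo_base, hi_base):
    """Return (address, expression) pairs for a split table of 'target - 1' words."""
    pairs = []
    for i, (lo, hi) in enumerate(zip(lo_bytes, hi_bytes)):
        # The table stores target - 1 (the usual push-and-RTS dispatch trick).
        target = (((hi << 8) | lo) + 1) & 0xffff
        label = "l%04x" % target
        pairs.append((lo_base + i, "<(%s - 1)" % label))
        pairs.append((hi_base + i, ">(%s - 1)" % label))
    return pairs

pairs = split_table_exprs(LO, HI, 0x8e18, 0x8e1d)
for addr, expression in pairs:
    print("&%04x %s" % (addr, expression))
```

The derived targets come out as &8e33, &8ef0, &8e53, &8e7b and &8f72, matching the list of entry points given in the post.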