Hacker needed ... for Zarch ;-)

subjects relating to classic games for the archimedes and risc pc
trixster
Posts: 1173
Joined: Wed May 06, 2015 12:45 pm
Location: York
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by trixster »

It must be time for an update!?? [-o< :D
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

trixster wrote:It must be time for an update!?? [-o< :D
You've not watched the "Let's play with the Zarch source code" series on the JASPP YouTube channel then?

I'll have the Pan-O-Vision version running both on the projector and a Pi at the London Show. If I get time, I'll try to merge it with the 50Hz version that the YouTube series is about, but I'm not making any promises as I'm still setting up all the machines for the show and will probably run out of time.
trixster
Posts: 1173
Joined: Wed May 06, 2015 12:45 pm
Location: York
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by trixster »

I have not! But I will now!
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

trixster wrote:I have not! But I will now!
With the London Show looming and me desperately trying to get a dozen machines ready in time, I've not had a chance to continue the series. Hopefully I'll resume it in November, when I plan to finish the 50Hz work, look at recoding for dynamic resolution and add 16M colour support... or, more accurately, use the full 4096 colour palette that Zarch uses internally and solve the "fade to black" issue for the depth plotting.

Once all that's done I'll look at merging Pan-O-Vision into the original code, which on first inspection isn't going to be straightforward. I was hoping the original source would easily allow the width/depth of the landscape to be increased; sadly that doesn't appear to be the case.
sbadger
Posts: 499
Joined: Mon Mar 25, 2013 1:12 pm
Location: Farnham, Surrey
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sbadger »

Hi Jon,

I'd missed your YT vids too. Fantastic, fps looks so smooth!
Does the source code allow an increase in resolution on the pi?
So many projects, so little time...
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

sbadger wrote:Does the source code allow an increase in resolution on the pi?
No. Although there are variables for the width/height of the screen, the code contains hardcoded multiplications by 320, implemented with shifts.
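To show why a shifted multiply ties the code to one resolution, here is a minimal Python sketch (not taken from the Zarch source). It assumes the common idiom of splitting 320 into 256 + 64; the function names and that exact decomposition are my assumptions, not something confirmed in the thread.

```python
def row_offset_320(y):
    # y * 320 using shifts only: 320 = 256 + 64 = (1 << 8) + (1 << 6)
    return (y << 8) + (y << 6)

def row_offset(y, width):
    # The general form a resizable screen would need instead
    return y * width

# The shifted version agrees with a real multiply for a 320-wide screen...
for y in (0, 1, 100, 255):
    assert row_offset_320(y) == row_offset(y, 320)

# ...but it silently gives the wrong offset for any other width
print(row_offset_320(10), row_offset(10, 640))
```

Every place that uses shifts like this has to be found and rewritten before the width variable means anything.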
crj
Posts: 858
Joined: Thu May 02, 2013 5:58 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by crj »

I've only just spotted this thread, so I'm not sure which people here have what levels of background knowledge about writing fast ARM code for this sort of thing.

Important things to consider:
  • The combination of ARM2 and MEMC, when both code and data are in DRAM, does three relevant kinds of cycle:
    • N-cycle: 2 ticks
    • S-cycle: 1 tick, unless it crosses a 16-byte boundary in which case 2 ticks
    • I-cycle: always 1 tick
  • LDR(B) takes n+i+s
  • STR(B) takes 2n
  • STM of x registers takes 2n+(x-1)s
  • ALU operations usually take s
  • B/BL/PC-dest ALU take n+2s
(Here, a "tick" is a cycle of the un-stretched CPU clock, so 125ns if running at 8MHz.)
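As a rough way to play with these numbers, here is a small Python model of the rules listed above. It is only a sketch: it assumes an S-cycle "crosses a 16-byte boundary" exactly when the word being written sits at an address that is a multiple of 16, and the function names are made up for illustration.

```python
N_TICKS = 2   # non-sequential cycle
I_TICKS = 1   # internal cycle

def s_ticks(addr):
    # Sequential cycle: 1 tick, or 2 when the access starts a new 16-byte line
    return 2 if addr % 16 == 0 else 1

def stm_ticks(addr, regs):
    # STM of x registers = 2n + (x-1)s; the first word is covered by an N-cycle
    ticks = 2 * N_TICKS
    for i in range(1, regs):
        ticks += s_ticks(addr + 4 * i)
    return ticks

def str_ticks():
    # STR(B) = 2n
    return 2 * N_TICKS

def ldr_ticks():
    # LDR(B) = n + i + s, assuming the S-cycle does not cross a boundary
    return N_TICKS + I_TICKS + 1

for addr, regs in ((0, 4), (4, 4), (0, 8), (0, 12)):
    print(f"STM of {regs} registers at offset {addr}: {stm_ticks(addr, regs)} ticks")
```

Under those assumptions the model gives 7, 8, 12 and 17 ticks for the four STM cases, which is what the STMIA figures quoted in this post work out to.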

If you're trying to fill memory with bytes, there are various strategies you can adopt:
Code                                                                           | bytes | ticks | bytes/tick
STRB r0,[r1],#1                                                                | 1     | 4     | 0.25
STRB r0,[r1],#1 : STRB r0,[r1],#1                                              | 2     | 8     | 0.25
STRB r0,[r1],#1 : STRB r0,[r1],#1 : STRB r0,[r1],#1                            | 3     | 12    | 0.25
LDR r2,[r1,#-1]! : AND r2,r2,#&FF000000 : ORR r2,r2,r0,LSR#8 : STR r2,[r1],#4  | 3     | 10    | 0.3
STR r0,[r1],#4                                                                 | 4     | 4     | 1
STMIA r4!,{r0-r3} (unaligned)                                                  | 16    | 8     | 2
STMIA r4!,{r0-r3} (aligned)                                                    | 16    | 7     | 2.29
STMIA r8!,{r0-r7} (aligned)                                                    | 32    | 12    | 2.67
STMIA r12!,{r0-r11} (aligned)                                                  | 48    | 17    | 2.82
CMP rN,rEND : BLO loop                                                         | 0     | 5     | -
That last consideration means that:

Code: Select all

.loop
  STMIA r12!,{r0-r11}
  STMIA r12!,{r0-r11}
  STMIA r12!,{r0-r11}
  STMIA r12!,{r0-r11}
  STMIA r12!,{r0-r11}
  STMIA r12!,{r0-r11}
  CMP r12,r13
  BLO loop
...takes 107 ticks for 288 bytes = 2.69 bytes/tick, whereas:

Code: Select all

.loop
  STMIA r12!,{r0-r11}
  CMP r12,r13
  BLO loop
...takes 22 ticks for 48 bytes = 2.18 bytes/tick.
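The arithmetic behind those two figures generalises to any unroll factor. A quick Python sketch, using only the 17-tick aligned 12-register STMIA and the 5-tick CMP/BLO pair from the table (the function name is just for illustration):

```python
STM12_TICKS = 17     # STMIA r12!,{r0-r11} (aligned)
LOOP_TICKS = 5       # CMP + taken BLO
BYTES_PER_STM = 48

def loop_throughput(unroll):
    # Ticks and bytes for one pass round a loop holding `unroll` STMs
    ticks = unroll * STM12_TICKS + LOOP_TICKS
    nbytes = unroll * BYTES_PER_STM
    return ticks, nbytes, nbytes / ticks

for unroll in (1, 2, 6, 12):
    ticks, nbytes, rate = loop_throughput(unroll)
    print(f"unroll x{unroll}: {ticks} ticks, {nbytes} bytes, {rate:.2f} bytes/tick")
```

The rate climbs towards the 2.82 bytes/tick of a bare STMIA as the unroll factor grows, with diminishing returns each time.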

In summary:
  • Load/mask/store is 20% quicker than STRBs for storing 3 bytes in a word
  • STR store is 4x speed of STRB (duh!)
  • STMs can be 2x to 2.8x speed of STR
  • Aligned STMs are 15% quicker than unaligned
  • Loop unrolling saves about 20%
  • In code, positioning your loads/stores at addresses 12 mod 16 can save 5-25%
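The first point can be turned into a tiny decision rule. This Python sketch assumes a load/mask/store always costs the 10 ticks shown in the table regardless of which bytes are masked, which is a simplification; the function name is hypothetical.

```python
STRB_TICKS = 4          # one STRB (2n)
LOAD_MASK_STORE = 10    # LDR + AND + ORR + STR

def cheapest_partial_word(nbytes):
    """Return (ticks, method) for storing 1-3 bytes that share one word."""
    if not 1 <= nbytes <= 3:
        raise ValueError("expects 1, 2 or 3 bytes")
    options = [
        (nbytes * STRB_TICKS, "STRB"),
        (LOAD_MASK_STORE, "load/mask/store"),
    ]
    return min(options)

for n in (1, 2, 3):
    print(n, cheapest_partial_word(n))
```

On these figures STRB wins for one or two bytes and load/mask/store only pays off at three.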
Another consideration: with an ARM3 you need to care about cache occupancy, meaning it's important to keep code size down. With an ARM2 executing the same instruction again is no quicker than executing a different instruction, so the only limit to unrolling is how much RAM you're prepared to sacrifice for the code.

Bringing this all together, I'd be tempted to try something like:

Code: Select all

.entry
  \ r0 = start address
  \ r1 = end address
  \ r4 = byte to store, quadruplicated into every byte of the word

  \ Find the common high-order bits of start and end addresses
  EOR r2,r0,r1
  \ Find the start address modulo 32
  AND r3,r0,#&1F
  
  CMP r2,#&20
  BHS large
  
  \ If we get here, start and end are in the same aligned 32-byte block
  \ Compose r3 into r2
  ORR r2,r2,r3,LSL#5
  
  \ Now dispatch to one of 32*32=1024 different code fragments
  \ Each is up to 16 instructions = 64 bytes long
  \
  \ 16 instructions, because the worst case is:
  \   Copy r4 into 5 more registers
  \   Store 3 bytes in a word (4 instructions)
  \   STM (1 instruction)
  \   Store another 3 bytes in the final word (4 instructions)
  \   MOV PC,lr (1 instruction)
  \ Total: 15 instructions
  \
  \ So the entire table is 64Kbytes, but we've only taken 10 ticks so far,
  \ and the code from this point on is *certain* to be optimal!
  ADD PC,PC,r2,LSL#6

  \ Insert 64K of mechanically-generated code here. (-8
  \ NB: Align .entry so that this table is 16-byte aligned

  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

  \ This code will be executed last, but has to come before .large so that it's within ADR range.
  \ We will land in this code fragment the appropriate number of instructions back from the end.
  \ To be able to store max_bytes bytes, this table must contain max_bytes/32 entries
  
  STMIA r0!,{r4-r11}
  STMIA r0!,{r4-r11}
  STMIA r0!,{r4-r11}
  \ ...
  STMIA r0!,{r4-r11}
.no_blocks
  \ Align preceding STMs so that this MOV is at an address 12 mod 16
  MOV PC,lr

  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
 
.large
  \ copy r4 so we have it in eight consecutive registers
  MOV r5,r4 : MOV r6,r4 : MOV r7,r4 : MOV r8,r4 : MOV r9,r4 : MOV r10,r4 : MOV r11,r4

  \ Down this path, we know there are some higher-order bits in r2 to clear
  AND r2,r2,#&1F
  \ Compose r3 into r2, as before
  ORR r2,r2,r3,LSL#5

  ADR r3,no_blocks

  \ Now we dispatch into a different 64Kbyte cluster of code fragments
  \ In this case the job is to:
  \   Write the initial 0-31 bytes, incrementing r0, worst case:
  \     Store 3 bytes in a word (4 instructions)
  \     STM (1 instruction)
  \   Write the final 0-31 bytes, decrementing r1, worst case:
  \     Store 3 bytes in a word (4 instructions)
  \     STM (1 instruction)
  \   (Maybe) dispatch into the block storer to fill the gap (3 instructions)
  \ Total: 13 instructions
  \
  \ The final 3 instructions in every case are:
  \   SUBS r2,r1,r0  \ NB: By now, both r1 and r0 are 32-byte-aligned
  \   MOVEQ PC,lr
  \   SUB PC,r3,r2,LSR#3  \ Offset backwards 4*number of 32-byte blocks
  \
  \ Note that the MOVEQ PC,lr saves 5 ticks for short lines, but costs 1 tick for
  \ longer ones. You'd need profiling to work out whether or not to include it.
  
  ADD PC,PC,r2,LSL#6

  \ Insert second 64K of mechanically-generated code here. (-8
  \ NB: Align .large so that this table is 16-byte aligned
Look ma, no loops!
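For anyone who wants to check the index arithmetic without an assembler to hand, here is a Python sketch of the dispatch. It assumes the end address is exclusive (the listing does not say either way), and the function names are invented for illustration.

```python
BLOCK = 32            # bytes written by one STMIA r0!,{r4-r11}
FRAGMENT_BYTES = 64   # 16 instructions per generated fragment

def dispatch(start, end):
    """Mirror the EOR/AND/CMP/ORR sequence: return (path, byte offset into table)."""
    diff = start ^ end                             # EOR r2,r0,r1
    start_low = start & (BLOCK - 1)                # AND r3,r0,#&1F
    path = "small" if diff < BLOCK else "large"    # CMP r2,#&20 : BHS large
    index = (diff & (BLOCK - 1)) | (start_low << 5)
    return path, index * FRAGMENT_BYTES            # ADD PC,PC,r2,LSL#6

def split_large(start, end):
    """Head bytes, whole 32-byte blocks and tail bytes, assuming end is exclusive."""
    head = -start & (BLOCK - 1)
    tail = end & (BLOCK - 1)
    blocks = ((end - tail) - (start + head)) // BLOCK
    return head, blocks, tail

def block_jump_back(blocks):
    # SUB PC,r3,r2,LSR#3: each STMIA is 4 bytes, so step back 4 bytes per block
    return (blocks * BLOCK) >> 3

print(dispatch(0x1001, 0x101F))
print(dispatch(30, 100), split_large(30, 100))
```

The largest possible index is 1023, so the offset never exceeds the 64K table, and head + 32*blocks + tail always adds back up to end - start.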
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

crj wrote:I'm not sure which people here have what levels of background knowledge about writing fast ARM code for this sort of thing.
If you take a look at this post, I took the guesswork out of it by writing a BASIC program that works out the optimal set of instructions to line fill up to 64 pixels at every offset within a Quad word. There's also various graphs later on that show the performance when using varying numbers of colour registers.
crj
Posts: 858
Joined: Thu May 02, 2013 5:58 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by crj »

Hmm. It's tricky to make sense of the output from your BASIC program, but it doesn't look correct. In particular, if I understand it correctly, when filling three pixels there's only one alignment for which it will use load/mask/store yet there are two alignments for which it's optimal.

Come to that, for the aligned case, no misalignment fixup is needed. STRB and load/mask/store seem to be your only two options?

Also, it's definitely not the case that the optimal algorithm can always be expressed by three steps. For many lengths and alignments it will be best to deal with an initial part-word, then initial part-line, then whole lines, then final part-line, then final part-word.

Knowing the correct sequence of operations for a particular length and alignment is only half the battle, though. As you can see, I mainly took that as a given and focused on efficient decode, dispatch and code alignment issues.
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

crj wrote:Hmm. It's tricky to make sense of the output from your BASIC program, but it doesn't look correct. In particular, if I understand it correctly, when filling three pixels there's only one alignment for which it will use load/mask/store yet there are two alignments for which it's optimal.
It tries STRB and LDR/mask/STR at both the start and end of the fill, if you run it you'll see the full timing output of every test.

I posted it here so people such as yourself with a good understanding of the CPU could modify it and/or make corrections, so feel free to make the changes you propose and repost.
trixster
Posts: 1173
Joined: Wed May 06, 2015 12:45 pm
Location: York
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by trixster »

Any updates Jon? I'm out in the Falklands at the moment so zero access to YouTube to check your channel for updates.
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

trixster wrote:Any updates Jon?
I've not touched it for the past six months. I have to be in the mood to tinker with Zarch as it's mind-numbingly tedious.

If memory serves me correctly, in the last "let's play" video I was attempting to get the game to run at the monitor refresh rate by internally running at 100Hz. This was proving a real problem because the hardcoded timing values weren't granular enough to increase the game's internal refresh rate without side effects, e.g. the ship flight patterns were affected, both the aliens' and your ship's in demo mode.

The other huge obstacle that's put me off is merging the changes to increase the plot area. It's probably several weeks of man hours in itself and not something I'm looking forward to. Essentially I have two forks: one based on a disassembly that has the increased plot area, and the increased FPS version based on the original source code. The "let's play" series is the latter of the two.

TL;DR large parts of the code need rewriting.
trixster
Posts: 1173
Joined: Wed May 06, 2015 12:45 pm
Location: York
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by trixster »

Any updates, Jon?
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

trixster wrote: Wed Mar 11, 2020 6:35 pm Any updates, Jon?
If you add on the six months from when you last asked, I've not touched it for 2.5 years.
RichP
Posts: 124
Joined: Tue Jan 24, 2012 4:07 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by RichP »

sirbod wrote: Fri Mar 13, 2020 9:20 pm
trixster wrote: Wed Mar 11, 2020 6:35 pm Any updates, Jon?
If you add on the six months from when you last asked, I've not touched it for 2.5 years.
Hey you should start a Patreon. I love Zarch so much, I'd donate money!
chappo
Posts: 1
Joined: Thu Feb 04, 2021 12:27 am
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by chappo »

WOW!!! What a read!!

Any chance that you can release a patch? ;)
iainfm
Posts: 602
Joined: Thu Jan 02, 2020 8:31 pm
Location: Dumbarton
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by iainfm »

I watched these with utter fascination. Amazing to see someone with such an intimate knowledge of ARM assembly that they can pick this up and modify it straight away.

One thing that interested me perhaps a little more than it should... At certain points (eg 13:52 in video 1), the decryption keys for the competition codes are visible. One is a sort of phrase in hex (c0d3ba5e), but the other is a number that looks a lot like a Cambridge telephone number 0223xxxxxx (I'll not post it here).

Does anyone know what this was? Maybe Superior Software's, or one for Mr B himself? Googling it doesn't bring up anything interesting. I've not dared to try calling it :lol:
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

Despite my many hours in the code, I can’t say I looked at the competition section. It will be a phone number for Superior Software. They’re still going, so it might still work.
kaiserh
Posts: 1
Joined: Wed Oct 20, 2021 10:39 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by kaiserh »

Thank you, sirbod, for putting in this enormous amount of work.
If you have a patch for the game that increases the plot area and it's in a state to release, I'd sure love to see it in action.
RichP
Posts: 124
Joined: Tue Jan 24, 2012 4:07 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by RichP »

Hope this still gets finished. Was looking great!
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

I thought I'd spend the start of this year modifying my procedurally generated line-fill routine so it supports dithering.

I believe it now produces code that's mostly as optimal as possible on an ARM2. As with the previous procedurally generated code, it tries all permutations of STRB and LDR with mask at the beginning and end of the line, and also tries aligning to a Quad boundary as soon as possible and corrupting/restoring registers. It's using one colour register on entry, then up to four for STMs, plus a tmp register for odd aligned pixels that need the other colour in the dither.

Here's the dithered line-fill cost on an ARM2:
Line fill cost with one colour register dithered.png
The X-Axis is the line length, Y is time in ns on an ARM2 and Z is where the line starts in the Quad-word alignment.

And here's the solid line-fill cost on an ARM2:
Line fill cost with one colour register.png

Using dithered line-fills is not a big performance hit. Here's the delta cost of dithering vs solid. Higher is worse:
Delta solid vs dithered.png

This graph shows the delta between using the optimised dithered line-fill compared to Zarch's original line-fill. Higher is better:
Dithered line-fill speed compared to Zarch.png

Combined with the optimised quadrilateral and triangle routines, there's a substantial performance improvement. Here's a graph showing the delta in CPU cycles across the first 120,000 line-fills generated by the Zarch demo. This is a direct comparison of the original Zarch against the optimised Zarch with the new triangle and quadrilateral routines. Lower values indicate fewer CPU cycles spent on line-fills of that length:
Delta CPU cycles.png
And here is the same graph flattened, showing CPU cycles spent on each line-fill length:
Delta CPU cycles 2D.png

The two graphs above show that the line-fill lengths have increased by switching from triangles to quadrilaterals, and that it's now spending more time on longer line-fills and a lot less time on short line-fills.

Putting all this into actual CPU cycles, here's the actual time taken to plot those 120,000 lines on an ARM2:

Code: Select all

 Original Zarch CPU cycles: 47,954,662 (6s)
Optimised Zarch CPU cycles: 19,961,447 (2.5s)
          CPU cycles saved: 27,993,215 (3.5s)
In other words, it's done the same work in 3.5 seconds less time, or can do over twice the work for any given time period.
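The conversion from cycles to seconds is easy to reproduce. This Python sketch assumes an 8MHz ARM2 (125ns per cycle, as mentioned earlier in the thread); the variable and function names are only illustrative.

```python
CLOCK_HZ = 8_000_000   # assumed 8MHz ARM2

original = 47_954_662
optimised = 19_961_447

def seconds(cycles, clock_hz=CLOCK_HZ):
    # Convert a cycle count to wall-clock seconds at the given clock rate
    return cycles / clock_hz

saved = original - optimised
speedup = original / optimised

print(f"original:  {seconds(original):.1f}s")
print(f"optimised: {seconds(optimised):.1f}s")
print(f"saved:     {saved:,} cycles ({seconds(saved):.1f}s)")
print(f"speed-up:  {speedup:.2f}x")
```

The ratio comes out at roughly 2.4, which is where "over twice the work" comes from.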

The next step is to put all this into Zarch and see if it actually makes any noticeable difference.
davidb
Posts: 3395
Joined: Sun Nov 11, 2007 10:11 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by davidb »

sirbod wrote: Mon Feb 20, 2023 10:59 pm In other words, it's done the same work in 3.5 seconds less time, or can do over twice the work for any given time period.
Nice work! =D> And nice graphs, too. ;)
NickLuvsRetro
Posts: 285
Joined: Sat Jul 17, 2021 4:18 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by NickLuvsRetro »

sirbod wrote: Fri Nov 06, 2015 9:37 pm I can tell you that I'll be using 19 bit precision floats, lookup tables for vector angles of distances smaller than 64x64 pixels (which covers 99% of tri/quads plotted) and that left and right screen clipping is done as part of the line fill and is effectively free. Vertical clipping may not be required on the top of the screen (it's covered by the top screen blanking code) and the bottom clipping comes free as you stop at Y=256
Apologies for dragging up such an old comment, but it's relevant to something I'm working on at the moment.

"left and right screen clipping is done as part of the line fill and is effectively free."

If possible, could you elaborate on this? I understand that if a minX(r0) was -200 (too far left) and maxX(r1) was 360 (40 too far for a 320 screen), you might clip it in each vertical scanline iteration with something like:

Code: Select all

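CMP   r0,#0        \ minX off the left edge?
MOVLT r0,#0        \ if so, clamp to column 0
CMP   r1,#&140     \ maxX beyond 320?
MOVGT r1,#&140     \ if so, clamp to the right edge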
But what you write suggests that there's a way to do this that doesn't involve cost?
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

NickLuvsRetro wrote: Mon Mar 13, 2023 3:39 pm "left and right screen clipping is done as part of the line fill and is effectively free."

If possible, could you elaborate on this?
That comment is referring to the cost in the Quadrilateral routine, which would have to check four points. Zarch implements clipping by checking every triangle point against the screen boundaries and using a dedicated clipped triangle routine, but that starts to get a bit convoluted with Quads. If you look at PROCline() in this post, you'll see it's performing the line clipping as you describe.

Essentially you either ensure the triangle points never go off screen, or clip in the line-fill routine. I'm using a combination of both, with the Y clipped in the Tri/Quad routine and the X clipped in the line-fill routine.

You could also optimize slightly by having the triangle routine use an ADDS/SUBS or MOVS to work out the start X and jump to either a clipped or non-clipped line-fill to save a few cycles.
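In higher-level terms, the X clipping in the line-fill boils down to something like the Python sketch below. It is not the PROCline() code; it assumes a 320-pixel-wide screen and inclusive span ends, and both function names are made up.

```python
SCREEN_WIDTH = 320  # width used in the thread; an assumption for this sketch

def clip_span(x0, x1, width=SCREEN_WIDTH):
    """Clamp an inclusive pixel span [x0, x1] to the screen, or None if nothing is visible."""
    if x1 < 0 or x0 >= width or x0 > x1:
        return None
    return max(x0, 0), min(x1, width - 1)

def needs_clipping(x0, x1, width=SCREEN_WIDTH):
    # The cheap test a caller could use to pick a clipped or unclipped line-fill
    return x0 < 0 or x1 >= width

print(clip_span(-200, 360))
print(clip_span(10, 20), needs_clipping(10, 20))
```

The second function is the idea behind jumping to a clipped or non-clipped line-fill: spans that pass the test skip the clamping entirely.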
NickLuvsRetro
Posts: 285
Joined: Sat Jul 17, 2021 4:18 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by NickLuvsRetro »

Thanks Jon! I'm having a look at that code now, which is proving very insightful. I think in my case I can very easily just test for instances where all points are inside the screen and choose between a safe (slightly slower) and non-safe (fast) triangle routine. The majority will be in the latter.
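Roughly what I have in mind, as a Python sketch. The 320x256 screen size and the routine labels are assumptions for illustration only:

```python
def all_inside(points, width=320, height=256):
    # True when every vertex lies on the visible screen
    return all(0 <= x < width and 0 <= y < height for x, y in points)

def pick_routine(points):
    # Only fall back to the clipping-aware routine when a vertex is off-screen
    return "fast" if all_inside(points) else "safe"

print(pick_routine([(10, 10), (50, 20), (30, 90)]))
print(pick_routine([(-5, 10), (50, 20), (30, 90)]))
```

The test costs a handful of compares per triangle, paid once rather than on every scanline.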
michael
Posts: 7
Joined: Wed Oct 20, 2021 2:34 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by michael »

sirbod wrote: Mon Feb 20, 2023 10:59 pm Putting all this into actual CPU cycles, here's the actual time taken to plot those 120,000 lines on an ARM2:

Code: Select all

 Original Zarch CPU cycles: 47,954,662 (6s)
Optimised Zarch CPU cycles: 19,961,447 (2.5s)
          CPU cycles saved: 27,993,215 (3.5s)
In other words, it's done the same work in 3.5 seconds less time, or can do over twice the work for any given time period.
Wow, this sounds very impressive! Is the goal to get the dithering with the increased landscape view area, all at the same frame rate as the original?
sirbod wrote: Mon Feb 20, 2023 10:59 pm The next step is to put all this into Zarch and see if it actually makes any noticeable difference.
Good luck! Are you moving all your changes across to the original source code too?
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

michael wrote: Wed Mar 29, 2023 5:04 pm Is the goal to get the dithering with the increased landscape view area, all at the same frame rate as the original?
Getting the game running at 25fps on ARM2 is the short answer, i.e. double its frame-rate.

The original idea was to increase the game to 50 FPS and plot a full screen landscape. The latter was finished many years ago; the 50 FPS is proving a bit more of an issue, as all the alien movement is updated each frame and not in a way that allows a straight jump in frame-rate.

The next phase was to statistically analyse the Zarch code and figure out how to get it running consistently at 25fps on an ARM2. That required moving from triangles to quadrilaterals and optimising the line-fills. The former was tested in BASIC first (it's posted above somewhere) to optimise the plot order and then migrated to assembler. I'm still working on it: although it's now in assembler, I need to go back and optimise the register use to avoid unnecessary stacking.

For ARM2 I'll probably turn off the dithering as it does add cycles and as the CPU isn't powerful enough to increase the landscape area by much, you can't really notice the difference unless you pause and look closely.
michael wrote: Wed Mar 29, 2023 5:04 pm Good luck! Are you moving all your changes across to the original source code too?
I haven't merged anything into the original source as it arrived after I'd already finished phase 1. I did spend a few hours looking at changing the frame rate, which is a series of videos on the YouTube channel.

I've made so many changes to the Zarch code, fixing bugs etc., that merging the source is a project in itself at this point!
RichP
Posts: 124
Joined: Tue Jan 24, 2012 4:07 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by RichP »

sirbod wrote: Mon Feb 20, 2023 10:59 pm Putting all this into actual CPU cycles, here's the actual time taken to plot those 120,000 lines on an ARM2:

Code: Select all

 Original Zarch CPU cycles: 47,954,662 (6s)
Optimised Zarch CPU cycles: 19,961,447 (2.5s)
          CPU cycles saved: 27,993,215 (3.5s)
In other words, it's done the same work in 3.5 seconds less time, or can do over twice the work for any given time period.

The next step is to put all this into Zarch and see if it actually makes any noticeable difference.
You really worked hard on this. And all the graphs too. Great work!
sirbod wrote: Sat Nov 07, 2015 7:08 am
Rich Talbot-Watkins wrote:The gradient method is quick regardless of line gradient, but has the setup cost of a division (or reciprocal mult if you store reciprocal tables). However it's easier to perform accurate clipping against the viewport if you have the gradient.

What have people done in the past for their filled poly routines?
In that case I've only ever used the gradient method with a reciprocal table and MUL, and in the case of Zarch, where the point distances are below 64, I've added a pre-calculated table of the reciprocal already multiplied. So the setup cost of the gradient is 7S+1N+1I best case, or +16I worst case if it falls back to the reciprocal with MUL. All three gradients take 21S+3N+3I (or ~4650ns using the table above), which is negligible when you consider the amount of screen memory writes going on.

I suspect we might get away with ignoring distances over 64x64 (I need to analyse a longer recording of tri's/quad's to be certain), in which case we can reduce that to 4S+1N+1I per gradient and drop the reciprocal table.

The 64x64 lookup table is currently 32kB, it's actually 128x64 (ie +-63X by 63Y to avoid sign conversion), increasing the distance to +-256x128 would use 128kB - which fits within the 640kB limit we have easily.
This is good information for me. I am working on a polygon fill routine on a 68000 machine and decided to use the gradient method. I might use a look-up table when I come to optimize it.
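To check I have understood the table idea, here is how I would sketch it in Python before porting anything. The 16.16 fixed-point format, the 4-byte entry size and the index layout are my assumptions; the quoted post only gives the 128x64 dimensions and the 32kB total.

```python
FIXED_SHIFT = 16             # 16.16 fixed point; an assumption

DX_RANGE = range(-64, 64)    # 128 columns, signed so no sign conversion is needed
DY_RANGE = range(0, 64)      # 64 rows

def build_gradient_table():
    # Pre-compute dx/dy for every small delta so the fill setup needs no divide
    table = []
    for dx in DX_RANGE:
        for dy in DY_RANGE:
            table.append((dx << FIXED_SHIFT) // dy if dy else 0)
    return table

TABLE = build_gradient_table()

def gradient(dx, dy):
    # One lookup replaces the division (or reciprocal multiply)
    return TABLE[(dx + 64) * 64 + dy]

print(len(TABLE) * 4, "bytes")
print(gradient(3, 2) / (1 << FIXED_SHIFT))
```

With 4-byte entries that layout comes to 32kB, which matches the size quoted above.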

I am a beginner and wanted to ask why dividing into triangles is faster? I am making a 3D rotating cube to start with and I assumed that if I fill each polygon (side of the cube) at once it would be faster?
sirbod
Posts: 1624
Joined: Mon Apr 09, 2012 9:44 am
Location: Essex
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by sirbod »

RichP wrote: Mon Aug 21, 2023 12:30 am wanted to ask why dividing into triangles is faster? I am making a 3D rotating cube to start with and I assumed that if I fill each polygon (side of the cube) at once it would be faster?
Quads are quicker than Triangles where applicable. A large part of this project was replacing Zarch's Triangle routine with a Quadrilateral for the landscape, to reduce the number of short line fills caused by splitting into triangles.

The various statistical analysis graphs above show the comparison of line fill lengths with Tri vs Quad and the most recent graphs show the difference in actual time performing those line fills.
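As a toy illustration of the effect (not the analysis behind the graphs), this Python sketch counts the line fills needed for an axis-aligned rectangle drawn as one quad versus the same rectangle cut along its diagonal into two triangles. The shape and the function names are assumptions chosen to keep the example short.

```python
def quad_spans(w, h):
    # One span per scanline, each the full width of the quad
    return [w] * h

def split_spans(w, h):
    # Same rectangle cut along its diagonal into two triangles
    spans = []
    for y in range(h):
        cut = (w * y) // h    # where the diagonal crosses this scanline
        spans.extend(n for n in (cut, w - cut) if n > 0)
    return spans

for name, spans in (("quad", quad_spans(16, 8)), ("two tris", split_spans(16, 8))):
    print(f"{name}: {len(spans)} fills, average length {sum(spans) / len(spans):.1f}")
```

The same pixels get written either way, but the triangle version needs nearly twice as many line fills and each one is shorter, so the per-fill setup cost is paid more often.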
RichP
Posts: 124
Joined: Tue Jan 24, 2012 4:07 pm
Contact:

Re: Hacker needed ... for Zarch ;-)

Post by RichP »

sirbod wrote: Mon Aug 21, 2023 9:21 am Quads are quicker than Triangles where applicable. A large part of this project was replacing Zarch's Triangle routine with a Quadrilateral for the landscape, to reduce the number of short line fills caused by splitting into triangles.

The various statistical analysis graphs above show the comparison of line fill lengths with Tri vs Quad and the most recent graphs show the difference in actual time performing those line fills.
Thank you for the help. In what situations would triangles actually be faster then?

I was looking at those graphs last night. If I was reading them right, they were much faster with quads. This is a good thread to learn more advanced ideas from. Seems odd that Braben went for triangles after seeing that? I'm sure I read he wrote this game and Lander in 3 months, so he was under time pressure I guess. Probably only just had time to get it working.