Benchmarks
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
Added results for Kinetic StrongARM (233 MHz), also regular StrongARM running RO 4.03 for comparison. Obviously no benefit of Kinetic vs SA for the synthetics outside of bandwidth, or for BASIC (which largely fits in the SA cache anyway), but the remaining tests show gains of 10-30%.
The Kinetic PRM provides a little information on RAM timings, but what it gives appears to be misleading and/or wrong. My interpretation of the results is that reads are performed as a pair of 4-word bursts (5-1-1-1) plus a couple of cycles of CAS latency each, for a total of 20 (66 MHz) bus cycles per 32-byte line fetch, while writes are handled as a read-modify-write 4-word burst, resulting in 18 bus cycles for a 16-byte write burst.
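For anyone wanting to check the arithmetic, here is a rough sketch of what those cycle counts imply for peak bandwidth. It assumes the bus runs at exactly 66 MHz, that "a couple of cycles" means 2, and that 1 MB is 10^6 bytes; the function names are made up for illustration and none of this comes from the Kinetic PRM.

```python
# Bus clock quoted in the post; nominal 66 MHz is assumed to be exactly 66e6 Hz.
BUS_HZ = 66_000_000

def burst_cycles(pattern, cas_latency, bursts):
    """Total bus cycles for `bursts` bursts with the given per-word cycle pattern."""
    return bursts * (sum(pattern) + cas_latency)

def bandwidth_mb_s(bytes_moved, bus_cycles, bus_hz=BUS_HZ):
    """Sustained bandwidth in MB/s, with 1 MB taken as 10**6 bytes (an assumption)."""
    return bytes_moved * bus_hz / bus_cycles / 1e6

# Reads: two 5-1-1-1 bursts, each with 2 cycles of CAS latency ("a couple" read as 2).
read_cycles = burst_cycles((5, 1, 1, 1), cas_latency=2, bursts=2)
read_mb_s = bandwidth_mb_s(32, read_cycles)

# Writes: 18 bus cycles per 16-byte burst, taken directly from the post.
write_mb_s = bandwidth_mb_s(16, 18)

print(f"read:  {read_cycles} cycles per 32-byte line, {read_mb_s:.1f} MB/s")
print(f"write: 18 cycles per 16-byte burst, {write_mb_s:.1f} MB/s")
```

The read pattern does add up to the 20 cycles quoted, so the interpretation is at least self-consistent.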
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
I put 60ns RAM in an A5000 and overclocked MEMC to the same rate (16.67MHz): viewtopic.php?p=316641#p316641
My results:
A5000-16, 25MHz ARM, 4MB RAM @ 16.67MHz, Econet only, RISC OS 3.11
Dhrystone/sec: 23954
Whetstone/sec: 263
Main memory read MB/s: 27.76
Main memory write MB/s: 35.67
CLOCKSP: 220.19
CLOCKSP, Rmfaster: 239.59
Mandelbrot: 436.78
Mandelbrot, RMfaster: 420.37
Reach - Galaxy: 1380
Reach - Tunnel: 6454
I'm not sure about these Reach values - they don't seem to be related to others quoted here, but these were the numbers printed by the two tests.
I wish I'd taken figures for a stock 12MHz/25MHz A5000 before doing the mod. Maybe I need to reverse it and take the measurements!
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Benchmarks
Thanks! Reach numbers are in the table as frames-per-second, for Galaxy that's 436 / (time / 100), for Tunnel it's 896 / (time / 100).
Doom would also be an interesting test for this machine as it's somewhat more memory intensive than a lot of the other benchmarks, but if you don't have a copy it's a moot point. Also Econet might throw the numbers out.
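The Reach conversion above can be sketched as a small helper. This reads the 436 and 896 in the formula as frame counts and the divide-by-100 as a centiseconds-to-seconds conversion, both of which are interpretations rather than anything stated by the demo itself; the names are made up for illustration.

```python
# Constants from the formula in the post, interpreted here as frames rendered per run.
REACH_FRAMES = {"galaxy": 436, "tunnel": 896}

def reach_fps(test, reported_time):
    """Convert the number printed by a Reach test into frames per second.

    `reported_time` is assumed to be in centiseconds, hence the division by 100.
    """
    seconds = reported_time / 100
    return REACH_FRAMES[test] / seconds

# The A5000-16 figures quoted earlier in the thread.
galaxy_fps = reach_fps("galaxy", 1380)
tunnel_fps = reach_fps("tunnel", 6454)
print(f"Galaxy: {galaxy_fps:.2f} fps, Tunnel: {tunnel_fps:.2f} fps")
```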
- DutchAcorn
- Posts: 2674
- Joined: Fri Mar 21, 2014 9:56 am
- Location: Maarn, Netherlands
- Contact:
Re: Benchmarks
Did a quick test on my A5000a+FPA10, now running @33MHz. I'd be interested if the FPA11 performance is different.
A5000a
33+FPA10
OS version 3.11
Dhrystone/sec 26291
Whetstone/sec 3259
Main memory read MB/s 23.29
Main memory write MB/s 26.04
Paul
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
SarahWalker wrote: ↑Sun Apr 11, 2021 9:24 am Doom would also be an interesting test for this machine as it's somewhat more memory intensive than a lot of the other benchmarks, but if you don't have a copy it's a moot point. Also Econet might throw the numbers out.
I "probably" have Doom, somewhere, but there were so many versions flying around that unless we use exactly the same build, I'm not sure it's quite a fair/useful benchmark.
Econet causing a difference - I considered it. I've no feel for how much an idle Econet (only 1 other active station) would impact the numbers. I guess I could run some tests again with the cable pulled. I really just need to sort out power so I can bung an IDE drive on this board... might populate the floppy power output connector stuff just for fun (perfect for running a CF adaptor).
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
I've been using R-Comp's Doom+, were there multiple releases of that? I'm certainly aware of non-+ Doom and a couple of free ports.
I meant Econet causing a difference for Doom; it pages in graphics from disc during gameplay, pulling graphics over Econet would presumably be quite slow.
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
SarahWalker wrote: ↑Sun Apr 11, 2021 1:54 pm I've been using R-Comp's Doom+, were there multiple releases of that? I'm certainly aware of non-+ Doom and a couple of free ports.
Ah. Then no.
SarahWalker wrote: ↑Sun Apr 11, 2021 1:54 pm I meant Econet causing a difference for Doom; it pages in graphics from disc during gameplay, pulling graphics over Econet would presumably be quite slow.
Oh. Ah. Mm. Well, I just completed another wee A5K upgrade by way of adding a Wizzo 5th column ROM to allow me to chuck a spare 4GB CF card in there - all useful. But indeed, no change in benchmark results after all that.
Re: Benchmarks
DutchAcorn wrote: ↑Sun Apr 11, 2021 11:13 am Did a quick test on my A5000a+FPA10, now running @33MHz. I'd be interested if the FPA11 performance is different.
A5000a
33+FPA10
OS version 3.11
Dhrystone/sec 26291
Whetstone/sec 3259
Main memory read MB/s 23.29
Main memory write MB/s 26.04
FPA11 should be identical since it's the same core just with a higher top speed (and I didn't see any difference when running a few quick Si tests of both on my A540...although I wasn't rigorously comparing...).
Nice to see the performance increase your FPA10 has at 33MHz vs the FPA10 Sarah reported at 26MHz in her R260.
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
Added A7000 numbers - unsurprisingly similar to the RiscPC w/o VRAM, though BASIC being slower was a surprise. Presumably a regression due to the different cache layout between ARM7500 and ARM610.
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
Where can I find Membench?
Re: Benchmarks
Hi @sarah,
Here are more results for the 24.86MHz A3020 for the benchmarks results table:
!Sick
21137.60 Dhrystones/sec
98.18102 kWhetstones/sec
Main memory read 56.25 MB/s
Main memory write 56.25MB/s
Cache read 55.55 MB/s
Cache read 55.55 MB/s
-------------------
Clocksp combined (which I think is Unweighted Average?) 82.60
Clocksp rmfaster 201.54
---------------------
Reach Galaxy = 1332 = 32.73
Reach Tunnel = 6537 = 13.70
Re: Benchmarks
One other thing to report: I realised I ran demo3 for Doom rather than the demo1 which you requested.
So the results for demo1 are 10.079 fps for low detail and 7.631 for normal detail
Re: Benchmarks
I am curious about running !Sick 1.28 on Arculator. Sorry, I still can't do this; I get errors. At first I tried the A305 (RiscOS 3.11) and made the two screenshots attached to this post. Then I tried the A440/1 and A3000 (RiscOS 3.19). I got a message in German: "Interner Fehler: Abbruch beim Holen der Instruktion bei &000286A0. at line 4217" ("Internal error: abort on instruction fetch at &000286A0"). What could I be doing wrong? Would anybody like to help me a bit?
I also tried to run !Sick on the Raspberry Pi model 1B. I got an error again:
Measuring timers...
abort data transfer at &00029E50
It seems 700MHz is just too much for Sick.
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
!SICK should definitely work on Arculator; I've run it numerous times. Maybe your copy is corrupt? Try the attached.
- Attachments: SICK128.zip (53.61 KiB)
Re: Benchmarks
SarahWalker wrote: ↑Thu Jan 13, 2022 5:20 pm !SICK should definitely work on Arculator; I've run it numerous times. Maybe your copy is corrupt? Try the attached.
Thank you. But I use the same archive. Maybe !Sick needs additional libraries? Would anybody like to run !Sick on just installed Arculator 2.1? I simply place !Sick-folder on hostfs... It always stops after "Detecting Processor(s)..." for me.
Re: Benchmarks
Sarah's !SICK archive (largely*) works for me - completely fresh Arculator 2.1 (on Windows) and emulated A5000+RO3.11.
However, it does reproduce litwr's ' stops after "Detecting Processor(s)..."' problem on an emulated A440/1+RO3.11.
* There's a Division by zero error at 1246 when it reports Screen memory read speed.
Miserable old curmudgeon who still likes a bit of an ARM wrestle now and then. Pi 4, 3, ARMX6, SA Risc PC, A540, A440
-
- Posts: 963
- Joined: Sat Aug 27, 2011 11:50 am
- Contact:
Re: Benchmarks
Stressed muffin.
I have several versions of !Si and !ArmSi (yes Sirbod I know you "bowed out" from that after version 3.12, it says so in the !Readme)
Basically I get inconsistent results on my overclocked Arm3/FPA A3010.
4.0 (dated Aug 1996) tells me I need Fpemulator running, it is, as is fpe400.... but runs the other non fpu benchmarks fine. I suspect that there's a bug in the FPA detection.... but I'm no coder and without printing and looking at the text side by side for difference.... yeah, basically I'm not skilled enough to fix.
3.46H tells me everything is hunky dory, and runs the FPA no problem.
!Sick 1.28 also completes without issue.
The really old version of !Si furnished by sirbod has a whole lot of "unknown" stuff and tells me the highest mips speed 18.86.
I can only run in mode 27 (crappy monitor.... vga only)
That the machine is fast is without doubt. But I'm not sure which app/version/ Module to trust/use.
Urgh
Apologies for contamination of thread, I'm back where I was 6 years ago viewtopic.php?f=29&t=11229#p139317 yes that's how long my A3010 has been out of action
Essentially repeating myself here but I'd completely forgotten.
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
AndyMc1280 wrote: ↑Wed Apr 06, 2022 10:33 pm 4.0 (dated Aug 1996) tells me I need Fpemulator running, it is, as is fpe400.... but runs the other non fpu benchmarks fine. I suspect that there's a bug in the FPA detection.... but I'm no coder and without printing and looking at the text side by side for difference.... yeah, basically I'm not skilled enough to fix.
I'm not sure what you're seeing now, because ARMSI 4.00 ran fine on your A3010 when it was here. We even compared the reported FLOPS.
Here's a screenshot of the file I saved from your machine: And for reference, here's my 25MHz R140 when I borrowed your FPA chip: And my original A540 with 33MHz FPA:
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
Filling in some gaps - added StrongARM @ 202 and 287 MHz. Also remembered that ArcQuake will run on pre-StrongARM RiscPCs.
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
Had a play with graphs, because bored. Also I felt like visualising all these results I've been collecting.
Firstly, StrongARM performance scaling, all on RO 4.03:
Unsurprisingly, the overclocked 287 MHz CPU is fastest at the more synthetic tests, but the Kinetic has a noticeable (though variable) lead in the rest of the tests.
Performance across the board on RiscPC era machines on RO 3.7?
Not too surprising, though it is interesting that FPEmulator on even a 287 MHz StrongARM can't keep up with a 33 MHz FPA11. Acorn sure put out a lot of machines with roughly the same performance.
Something more targeted, how about Doom performance across (almost) all machines?
Once again, nothing too surprising, though the relatively poor performance of the ARM7500FE machines is interesting.
How efficient is each machine though?
Perhaps an arbitrary test, though the results are interesting. The pre-StrongARM RiscPC CPUs dominate here, with the StrongARMs struggling with memory bandwidth. The Kinetic does a little better though not brilliantly; I suspect it's bottlenecking on VRAM writes. The ARM3s are also doing poorly, most likely due to slow writes caused by the lack of write buffering.
-
- Posts: 101
- Joined: Sun Jan 16, 2022 5:19 pm
- Contact:
Re: Benchmarks
Hi!
Sorry, some questions about this:
1) I ran !SICK128 on my machines. On both StrongARMs it showed a "Processor bugs" section listing 4 bugs: STM, Istream abort, LDMIB and MSR. Is this normal? Does it cause any trouble?
2) I had Doom on the hard disc of one of my RiscPCs. When I run it, it does not work; I always get an error message. Could this be because of these processor bugs?
3) What do I need to do to get Doom and Quake running? Are there any patches out there?
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
"Processor bugs" is normal. What error message do you get from Doom?
-
- Posts: 101
- Joined: Sun Jan 16, 2022 5:19 pm
- Contact:
Re: Benchmarks
Thank you for your answer.
I can start Doom. Then I add a WAD file to it. In my case I used the file "Ultimate".
Doom tries to start, and I see a progress bar. It gives me some funny messages like "Prepare to die".
Then it crashes.
The error it shows is: Internal error, no stack for trap handler: Internal error: abort on data transfer at &0224A594, pc = 00000000: registers at 0005B020
These hex numbers are always the same.
I can cancel the error popup, but the screen is then not refreshed properly. I need to reset or switch off the machine.
Re: Benchmarks
(Thanks for the extra info in the table, Sarah!)
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
Additional results for basic Arm250 12MHz A3010:
Dhrystone 9548
kWhetstones 57
Main memory read 25.16 MB/sec
Main memory write 25.16 MB/sec
MemBench:
Fetch: 31.50 MB/sec
Read: 26.25 MB/sec
Write: 26.25 MB/sec
Mandelbrot 2025 sec
Mandelbrot, RmFaster 1123 sec
Reach – Galaxy 2931
Reach – Tunnel 14036
These results show the machine is "faster" than Sarah's original A3010, but I believe I know why.
If I configure MonitorType 3, and reboot, then MODE12 uses different timings so that it's "compatible" with a VGA monitor.
Those timings mean it refreshes at 70Hz, not 50Hz. Thus VIDC is hitting RAM 70 times a second, not 50...
so the machine seems slower, because VIDC's sucking up more of the RAM bandwidth.
I proved this theory holds by configuring the exact machine used for the above tests, to MonitorType 3, connecting to a VGA monitor, still in MODE 12, and rerunning the benchmarks...
Dhrystone 8892, kWhetstones 54
It makes me wonder whether any/all the other results are similarly skewed.
The better way to avoid this issue happening would have been to have Video DMA disabled completely during the testing.
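As a rough illustration of the size of the effect, the sketch below works out the slowdown between the two runs and estimates how much RAM bandwidth video DMA would take at each refresh rate. The MODE 12 geometry (640x256 at 4 bits per pixel) and the idea that the whole frame buffer is fetched once per refresh are assumptions for the estimate, and cursor and sound DMA are ignored; the names are made up for illustration.

```python
def slowdown_percent(baseline, slower):
    """Percentage drop from `baseline` to `slower`."""
    return (baseline - slower) / baseline * 100

# Figures from the two runs in the post (50Hz timings vs MonitorType 3 at 70Hz).
dhrystone_drop = slowdown_percent(9548, 8892)
whetstone_drop = slowdown_percent(57, 54)

# Assumed MODE 12 frame buffer: 640 x 256 pixels at 4 bits per pixel.
MODE12_BYTES = 640 * 256 * 4 // 8

def video_dma_mb_s(frame_bytes, refresh_hz):
    """Estimated video DMA traffic in MB/s (1 MB = 10**6 bytes), one fetch per refresh."""
    return frame_bytes * refresh_hz / 1e6

dma_50 = video_dma_mb_s(MODE12_BYTES, 50)
dma_70 = video_dma_mb_s(MODE12_BYTES, 70)

print(f"Dhrystone drop: {dhrystone_drop:.2f}%, kWhetstone drop: {whetstone_drop:.2f}%")
print(f"Video DMA: {dma_50:.2f} MB/s at 50Hz, {dma_70:.2f} MB/s at 70Hz")
```

Under those assumptions the extra video traffic at 70Hz is in the same ballpark as the drop in the CPU benchmarks, which fits the theory but is no more than a sanity check.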
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
Results for "Adelaide" Arm2/mezzanine 12MHz A3010 (This proves it's identical timing to my stock Arm250 A3010 at least!)
Dhrystone 9548/sec
kWhetstones 57/sec
Main memory read 25.14 MB/sec
Main memory write 25.14 MB/sec
MemBench:
Fetch: 31.74 MB/sec
Read: 26.45 MB/sec
Write: 26.45 MB/sec
Mandelbrot 2010 sec
Mandelbrot, RmFaster 1118 sec
Reach – Galaxy 2908
Reach – Tunnel 13926
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
Results for a mildly overclocked Arm250 A3010 (16.66MHz) ... (machine was set up to fill in the 12MHz results, forgot I'd OC'd it a bit!) ...
SICK:
CPU Arm250
Clock speed 16.66MHz
OS version 3.11
Dhrystone 13712/sec
kWhetstone 64/sec
Main memory read 36.12MB/sec
Main memory write 36.12MB/sec
MemBench:
Fetch: 45.63MB/sec
Read: 38.01MB/sec
Write: 38.03MB/sec
CLOCKSP 60.68MHz
CLOCKSP, RmFaster 148.03MHz
Mandelbrot 1815.79 sec
Mandelbrot, RmFaster 779.1 sec
Reach – Galaxy 2037
Reach – Tunnel 9868
povray 5h 25m 18s
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
Ta, added!
Odd about the Dhrystone result being off. I would check what was going on with that A3010 (it's very unlikely it was running in VGA mode) but it's long gone
- IanJeffray
- Posts: 6018
- Joined: Sat Jun 06, 2020 3:50 pm
- Contact:
Re: Benchmarks
The 24MHz Arm250 A3010 kWhetstone result looks decidedly fishy.
- SarahWalker
- Posts: 1599
- Joined: Fri Jan 14, 2005 3:56 pm
- Contact:
Re: Benchmarks
My guess is FPEmulator was *RMFaster'ed on that one.