Idea: Fastest Conway's Game of Life in BBC Basic
Idea: Fastest Conway's Game of Life in BBC Basic
Having seen Richard Russell's presentation including demos of 3D shaders running within BBC Basic, and having previously seen the idea of using a shader to run Conway's Life, I thought the combination might be well worth exploring.
If the shader idea works on the Raspberry Pi, so much the better!
Here's a browser demo which uses the shader in WebGL.
If the shader idea works on the Raspberry Pi, so much the better!
Here's a browser demo which uses the shader in WebGL.
- Diminished
- Posts: 1252
- Joined: Fri Dec 08, 2017 9:47 pm
- Contact:
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Yes indeed! In a related tangent, there's a juicy thread elsewhere which points back nearer home: Conway's Life on 6502.
But my feeling is that a GPU is going to do pretty well.
But my feeling is that a GPU is going to do pretty well.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
There are several shader implementations of The Game of Life at shadertoy.com. They are of course stupendously fast, and the cell size can be as small as a single pixel, but if anything I find them less visually impressive as a result.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Right, I have a shader-based Game of Life running in BBC BASIC! This involved something I've never attempted before: multi-stage shader programming. This is essential for GoL because each iteration's input is the previous iteration's output, which breaks the normal rendering 'pipeline'. This opens up a whole new set of possibilities, but my existing shaderlib.bbc library would need some revisions to support it properly.
It works OK in the desktop edition, but I've not yet tried it with the in-browser edition (there's no guarantee it will work without tweaks, because that's WebGL rather than OpenGL). I might try it tomorrow.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
I was right, it didn't work in WebGL. I had forgotten that it has a restriction that textures must have power-of-two dimensions (unlike OpenGL which can have any dimensions). So I've modified it to use a 512x512 grid and now it works: click here using a suitable browser.
Obviously running it on a desktop platform is much better from the point of view of experimenting; you can try things like changing the grid size, altering the initial condition (currently it's random), slowing down the generations etc. Here's the source if you want to play. It shouldn't be too incomprehensible:
Code: Select all
REM Conway's 'Game of Life': Shader version for BBC BASIC for SDL 2.0
REM Adapted from various examples at Shadertoy.com by Richard Russell
REM Size of grid (max = window size, must be power-of-two for WebGL):
VDU 23,22,512;512;8,16,16,0
GridW% = 512
GridH% = 512
REM Initial random pattern:
FOR C% = 1 TO 80*25 : VDU 32+RND(94) : NEXT
OSCLI "GSAVE """ + @tmp$ + "initial.bmp"" 0,0," + STR$(GridW%*2) + "," + STR$(GridH%*2)
CLS
REM Install libraries:
INSTALL @lib$ + "shaderlib"
REM Define constants:
GL_FRAMEBUFFER = &8D40
GL_COLOR_ATTACHMENT0 = &8CE0
REM Create arrays and structures:
DIM Vertex$(10), Fragment$(999), Final$(10), Float{0%,1%}
REM Fill vertex and fragment shader arrays from DATA statements:
PROC_readshader(Vertex$())
PROC_readshader(Fragment$())
PROC_readshader(Final$())
REM Initialise window and get its size:
ON MOVE IF @msg% <> 5 RETURN ELSE PROCcleanup
VDU 26
IF POS REM SDL thread sync
ScrW% = @size.x%
ScrH% = @size.y%
REM Ensure cleanup on exit:
ON CLOSE PROCcleanup : QUIT
ON ERROR PROCcleanup : MODE 3 : PRINT REPORT$ : END
REM Initialise shader library:
PROC_shaderinit(oVertex%, oFragment%)
SYS `glCreateShader`, GL_FRAGMENT_SHADER, @memhdc% TO oFinal%
`glGenFramebuffers` = FN_gpa("glGenFramebuffers")
`glBindFramebuffer` = FN_gpa("glBindFramebuffer")
`glFramebufferTexture2D` = FN_gpa("glFramebufferTexture2D")
`glDeleteFramebuffers` = FN_gpa("glDeleteFramebuffers")
REM Compile shaders:
PROC_compileshader(oVertex%, Vertex$(), "Vertex")
PROC_compileshader(oFragment%, Fragment$(), "Fragment")
PROC_compileshader(oFinal%, Final$(), "Final")
REM Link shaders:
oProgram1% = FN_useshaders(oVertex%, oFragment%)
oProgram2% = FN_useshaders(oVertex%, oFinal%)
REM Get pointers to uniforms:
SYS `glGetUniformLocation`, oProgram1%, "iResolution", @memhdc% TO pResolution1%
SYS `glGetUniformLocation`, oProgram1%, "iChannel0", @memhdc% TO pChannel0_1%
SYS `glGetUniformLocation`, oProgram2%, "iResolution", @memhdc% TO pResolution2%
SYS `glGetUniformLocation`, oProgram2%, "iChannel0", @memhdc% TO pChannel0_2%
REM Load and bind initial random texture:
SYS `glActiveTexture`, GL_TEXTURE0, @memhdc%
Texture1% = FN_loadtexture1(@tmp$ + "initial.bmp")
Texture2% = FN_loadtexture1(@tmp$ + "initial.bmp")
IF Texture1% = 0 OR Texture2% = 0 ERROR 100, "Couldn't load initial.bmp"
SYS `glUniform1i`, pChannel0_1%, 0, @memhdc%
SYS `glUniform1i`, pChannel0_2%, 0, @memhdc%
REM Create Framebuffer object:
DIM align{fbo%}
SYS `glGenFramebuffers`, 1, align{}, @memhdc%
oBuffer% = align.fbo%
IF oBuffer% = 0 ERROR 100, "Couldn't create framebuffer"
REM Render loop:
REPEAT
REM Stage one: render to texture:
Float.0% = FN_f4(GridW%) : Float.1% = FN_f4(GridH%)
SYS `glBindFramebuffer`, GL_FRAMEBUFFER, oBuffer%, @memhdc%
SYS `glFramebufferTexture2D`, GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, \
\ GL_TEXTURE_2D, Texture2%, 0, @memhdc%
SYS `glBindTexture`, GL_TEXTURE_2D, Texture1%, @memhdc%
SYS `glUseProgram`, oProgram1%, @memhdc%
SYS `glUniform2fv`, pResolution1%, 1, Float{}, @memhdc%
SYS `glDrawArrays`, GL_TRIANGLES, 0, 6, @memhdc%
REM Stage two: render to canvas:
Float.0% = FN_f4(ScrW%) : Float.1% = FN_f4(ScrH%)
SYS `glBindFramebuffer`, GL_FRAMEBUFFER, 0, @memhdc%
SYS `glBindTexture`, GL_TEXTURE_2D, Texture2%, @memhdc%
SYS `glUseProgram`, oProgram2%, @memhdc%
SYS `glUniform2fv`, pResolution2%, 1, Float{}, @memhdc%
PROC_render
SWAP Texture1%, Texture2%
UNTIL FALSE
END
DEF PROCcleanup
ON ERROR OFF
oBuffer%% += 0 : IF oBuffer% SYS `glDeleteFramebuffers`, 1, ^oBuffer%, @memhdc%
Texture1% += 0 : IF Texture1% SYS `glDeleteTextures`, 1, ^Texture1%, @memhdc%
Texture2% += 0 : IF Texture2% SYS `glDeleteTextures`, 1, ^Texture2%, @memhdc%
oProgram2% += 0 : IF oProgram2% SYS `glDeleteProgram`, oProgram2%, @memhdc%
oFinal% += 0 : IF oFinal% SYS `glDeleteShader`, oFinal%, @memhdc%
oProgram1% += 0 : oVertex% += 0 : oFragment% += 0
PROC_shaderexit(oProgram1%, oVertex%, oFragment%)
*REFRESH ON
ENDPROC
REM Minimal vertex shader:
DATA "attribute vec4 vPosition;"
DATA "void main()"
DATA "{"
DATA "gl_Position = vPosition ;"
DATA "}"
DATA ""
REM Intermediate fragment shader (outputs to texture):
DATA "uniform vec2 iResolution;"
DATA "uniform sampler2D iChannel0;"
DATA "int cell( in vec2 p )"
DATA "{"
DATA "return (texture2D(iChannel0, p/iResolution.xy).g > 0.5 ) ? 1 : 0;"
DATA "}"
DATA "void main()"
DATA "{"
DATA "vec2 px = gl_FragCoord.xy;"
DATA "int k = cell(px+vec2(-1,-1)) + cell(px+vec2(0,-1)) + cell(px+vec2(1,-1))"
DATA " + cell(px+vec2(-1, 0)) + cell(px+vec2(1, 0))"
DATA " + cell(px+vec2(-1, 1)) + cell(px+vec2(0, 1)) + cell(px+vec2(1, 1));"
DATA "int e = cell(px);"
DATA "float f = (((k==2) && (e==1)) || (k==3)) ? 1.0 : 0.0;"
DATA "gl_FragColor = vec4( f, f, f, 1.0 );"
DATA "}"
DATA ""
REM Final fragment shader (outputs to canvas):
DATA "uniform vec2 iResolution;"
DATA "uniform sampler2D iChannel0;"
DATA "void main()"
DATA "{"
DATA "gl_FragColor = texture2D(iChannel0, gl_FragCoord.xy/iResolution.xy);"
DATA "}"
DATA ""
Last edited by Deleted User 9295 on Sun Jan 10, 2021 1:02 pm, edited 1 time in total.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Excellent, thanks! One very nice thing about this sort of resolution and speed is that gliders really do glide in a very satisfactory way.
Edit: is this the fastest? I rather expect it is the fastest.
Edit: is this the fastest? I rather expect it is the fastest.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
It's locked to frame sync (60 generations per second on most setups). You could make it faster by turning off vSync, but some of the generations would never be displayed and gliders would no longer move smoothly, so is that meaningful? I would argue not.
That's one reason why I'm not sure that "fastest" is a sensible challenge: any half decent GPU is going to manage 60 generations a second, unless you have a bigger grid. You could instead ask: "how big can the grid be whilst still running at 60 Hz?".
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Ah, I wasn't quite asking if this is as fast as it can go, rather asking if any conventional BBC Basic program, not using a shader, might manage such a high update rate.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
By "conventional" do you mean including assembly language, or 'pure' BASIC code only? Do you consider shader code to be distinctly different from assembler code, or in the same general category?
An interesting question would be: is there any way of leveraging BBC BASIC's array arithmetic to compute multiple cells at once? If there is that would be a game-changer for a pure BASIC solution, but I'm somewhat doubtful.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
I was thinking of portable Basic code, so that rules out assembly language. I was thinking of any conventional approach which has two arrays of cell states, and reads one while writing the other. (It might be that your BBC Basic could do this operation to be programmed explicitly to use multiple threads??)
And indeed, it might be that a sufficiently modern BBC Basic can do those array operations without explicit FOR loops, and avoid a lot of overhead. I hadn't thought of that.
And indeed, it might be that a sufficiently modern BBC Basic can do those array operations without explicit FOR loops, and avoid a lot of overhead. I hadn't thought of that.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Assuming that by "portable" you mean it would have to work in ARM BASIC 5 and Matrix Brandy, that rules out a lot of possible 'tricks' too (I was thinking that to leverage array arithmetic it would need some evil poking of array descriptors, but that wouldn't be portable).
I have so often regretted that array descriptors have to be contiguous with the array itself (that's true in my versions, anyway). It means you can't pretend that part of a large array is itself a small array, which would have been possible had there been an extra level of indirection. Do you happen to know whether the same is true in ARM BASIC 5 and/or Brandy?
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Oh, nested or overlapping arrays! I have no idea about that.
By portable I mainly meant your implementations, but for Pi, desktop, and browser.
By portable I mainly meant your implementations, but for Pi, desktop, and browser.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Oh, I don't know - but I think it would be terribly slow. I was trying to write a (hopefully decent) Basic implementation last year, but my powers of concentration were just not up to it. It was intended to be a follow on of the program presented here.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
With array arithmetic and descriptor hacking, I'm getting just over 3 generations per second with a 512x512 grid on this laptop. The trouble is that I have to do such contortions to make array arithmetic do things that it was never intended to, it may well not end up being any faster than the naive but less wasteful approach.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Here's the pure BASIC (but specific to my current versions) listing. It achieves in the region of 3 generations per second here with a 512x512 grid (compared with the shader version's 60 per second with time to spare). I'd be interested to know if anybody can improve on it - without using a faster interpreter like Brandy of course.
Code: Select all
REM Conway's 'Game of Life' in pure BASIC (with some cheating!)
REM By Richard Russell, http://www.bbcbasic.co.uk/, 10-Jan-2021
HIMEM = PAGE + 8000000
REM Set grid size and initialise screen:
W% = 512 : H% = 512
VDU 23,22,W%;H%;8,8,16,0
OFF
REM Define arrays:
DIM g&(W%-1,H%-1), ft%(9)
REM Initialise the grid to a random state:
FOR Y% = 0 TO H%-1
FOR X% = 0 TO W%-1
g&(X%,Y%) = RND(2) - 1
NEXT
NEXT
REM Create 8 copy arrays with gaps:
G% = W% * 2 + 2 : p%% = 0
DIM k&(W%-1,H%-1), p%% G%, l&(W%-1,H%-1), p%% G%, m&(W%-1,H%-1), p%% G%, n&(W%-1,H%-1), p%% G%
DIM o&(W%-1,H%-1), p%% G%, p&(W%-1,H%-1), p%% G%, q&(W%-1,H%-1), p%% G%, r&(W%-1,H%-1), p%% G%
REM Do some evil descriptor hacking to offset the arrays:
PROCoffset(k&(),W%+1) : PROCoffset(l&(),W%+1) : PROCoffset(m&(),W%+1) : PROCoffset(n&(),W%+1)
PROCoffset(o&(),W%+1) : PROCoffset(p&(),W%+1) : PROCoffset(q&(),W%+1) : PROCoffset(r&(),W%+1)
REM Repeat for each generation:
*REFRESH OFF
GCOL 15
ref% = TIME
@%=&102010A
ft%() = 30
REPEAT
REM Display grid
CLS
FOR Y% = 0 TO H%-1
FOR X% = 0 TO W%-1
IF g&(X%,Y%) PLOT X%*2, Y%*2
NEXT
NEXT
REM Make 8 copies of grid:
k&() = g&() : l&() = g&() : m&() = g&() : n&() = g&()
o&() = g&() : p&() = g&() : q&() = g&() : r&() = g&()
REM Offset arrays:
PROCoffset(k&(),-W%-1) : PROCoffset(l&(),-W%) : PROCoffset(m&(),-W%+1)
PROCoffset(n&(),-1) : PROCoffset(o&(),+1)
PROCoffset(p&(),+W%-1) : PROCoffset(q&(),+W%) : PROCoffset(r&(),+W%+1)
REM Sum the border cells:
k&() += l&() + m&() + n&() + o&() + p&() + q&() + r&()
REM Calculate next state:
l&() = k&() * k&() DIV 4
o&() = l&() DIV 2
m&() = l&() DIV -8
m&() += 1
l&() AND= m&() AND g&() : REM survives
p&() = o&() DIV -2
p&() += 1
o&() AND= p&() AND 1 : REM survives or reproduces
g&() = l&() OR o&()
REM Undo the offsets:
PROCoffset(k&(),+W%+1) : PROCoffset(l&(),+W%) : PROCoffset(m&(),+W%-1)
PROCoffset(n&(),+1) : PROCoffset(o&(),-1)
PROCoffset(p&(),-W%+1) : PROCoffset(q&(),-W%) : PROCoffset(r&(),-W%-1)
*REFRESH
T% = (T% + 1) MOD (DIM(ft%(),1) + 1)
new% = TIME
ft%(T%) = new% - ref%
ref% = new%
SYS "SDL_SetWindowTitle", @hwnd%, STR$(1000/SUM(ft%())) + " generations per second", @memhdc%
UNTIL FALSE
END
REM This is non-portable code:
DEF PROCoffset(RETURN a&(), O%)
LOCAL p%%, W%, H% : p%% = PTR(a&())
W% = DIM(a&(),1) + 1 : H% = DIM(a&(),2) + 1
?p%% = 0 : p%%!1 = 0 : p%%!5 = 0
p%% += O% : PTR(a&()) = p%%
?p%% = 2 : p%%!1 = W% : p%%!5 = H%
ENDPROC
Last edited by Deleted User 9295 on Mon Jan 11, 2021 5:36 pm, edited 1 time in total.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Thanks Richard - I'll give that a go tomorrow.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
I've tried to assess that by writing a naive non-optimised implementation myself; on the same PC it runs at around one generation per second so the array-arithmetic version is quite a bit faster despite its messiness.Richard Russell wrote: ↑Sun Jan 10, 2021 9:00 pmit may well not end up being any faster than the naive but less wasteful approach.
I wonder if anybody can improve on the code I devised to compute the next state of each cell. To be usable with entire arrays it can't use any comparisons, so the challenge is to implement a functional equivalent to this but without using the '=', '<' or '>' operators, where k is in the range 0 to 8 and g is 0 or 1:
Code: Select all
IF (k = 2) AND (g = 1) OR (k = 3) THEN g = 1 ELSE g = 0
Code: Select all
l = k * k DIV 4
o = l DIV 2
m = l DIV -8
m += 1
l AND= m AND g
p = o DIV -2
p += 1
o AND= p AND 1
g = l OR o
Re: Idea: Fastest Conway's Game of Life in BBC Basic
sounds a bit like a palette look up to me, k concatenated with g to form an index.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Yeah, but again not something that can work with entire arrays. The only array operators available (AFAIK) are: + - * / DIV AND OR EOR. Can you suggest some actual code that's simpler than mine?
Re: Idea: Fastest Conway's Game of Life in BBC Basic
I think it's true, and it might help, that g gets 1 if and only if k+g == 3, which may be slightly simpler - we only need the inclusive count of the 9-cell.
Edit: this is wrong! See below. Posting while tired, that's my excuse.
Edit: this is wrong! See below. Posting while tired, that's my excuse.
Last edited by BigEd on Tue Jan 12, 2021 9:03 am, edited 1 time in total.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
g remains 1 also if (k = 3 AND g = 1), which means k+g == 4.
I've found a way of speeding it up a little by changing the way it's displayed, but there's a bug (must be in the previous version too) because it never seems to stabilise however long it's left running. I won't publish it until I can find a fix.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
New version below; changes are as follows:
- Grid corruption bug fixed (hopefully).
- Grid displayed by mimicking a BMP file, so quite a lot faster.
- Compatible with both BB4W and BBCSDL (only updating the title bar is affected).
Code: Select all
REM Conway's 'Game of Life' in pure BASIC (with some cheating!)
REM By Richard Russell, http://www.bbcbasic.co.uk/, 11-Jan-2021
HIMEM = PAGE + 8000000
REM Set grid size and initialise screen:
W% = 512 : H% = 512
VDU 23,22,W%;H%;8,8,16,0
OFF
REM Define structures and arrays:
DIM bm{bfType{l&,h&}, bfSize%, bfReserved%, bfOffBits%, \
\ biSize%, biWidth%, biHeight%, biPlanes{l&,h&}, biBitCount{l&,h&}, \
\ biCompression%, biSizeImage%, biXPelsPerMeter%, biYPelsPerMeter%, \
\ biClrUsed%, biClrImportant%, palette%(255)}, g&(W%-1,H%-1), ft%(9)
REM Initialise bitmap structure:
bm.bfType.l& = ASC"B" : bm.bfType.h& = ASC"M"
bm.bfOffBits% = ^g&(0,0) - bm{} : bm.bfSize% = bm.bfOffBits% + W%*H%
bm.biSize% = 40 : bm.biWidth% = W% : bm.biHeight% = -H%
bm.biPlanes.l& = 1 : bm.biBitCount.l& = 8
bm.palette%(0) = &FF000000 : bm.palette%(1) = &FFFFFFFF
REM Initialise the grid to a random state:
FOR Y% = 0 TO H%-1
FOR X% = 0 TO W%-1
g&(X%,Y%) = RND(2) - 1
NEXT
NEXT
REM Create 8 copy arrays with gaps:
G% = W% * 2 + 2 : p%% = 0
DIM k&(W%-1,H%-1), p%% G%, l&(W%-1,H%-1), p%% G%, m&(W%-1,H%-1), p%% G%, n&(W%-1,H%-1), p%% G%
DIM o&(W%-1,H%-1), p%% G%, p&(W%-1,H%-1), p%% G%, q&(W%-1,H%-1), p%% G%, r&(W%-1,H%-1), p%% G%
REM Do some evil descriptor hacking to offset the arrays:
PROCoffset(k&(),W%+1) : PROCoffset(l&(),W%+1) : PROCoffset(m&(),W%+1) : PROCoffset(n&(),W%+1)
PROCoffset(o&(),W%+1) : PROCoffset(p&(),W%+1) : PROCoffset(q&(),W%+1) : PROCoffset(r&(),W%+1)
REM Repeat for each generation:
*REFRESH OFF
GCOL 15
R% = TIME
@%=&102010A
ft%() = 30
REPEAT
REM Display grid:
OSCLI "MDISPLAY " + STR$~bm{}
REM Make 8 copies of grid:
k&() = g&() : l&() = g&() : m&() = g&() : n&() = g&()
o&() = g&() : p&() = g&() : q&() = g&() : r&() = g&()
REM Offset arrays:
PROCoffset(k&(),-W%-1) : PROCoffset(l&(),-W%) : PROCoffset(m&(),-W%+1)
PROCoffset(n&(),-1) : PROCoffset(o&(),+1)
PROCoffset(p&(),+W%-1) : PROCoffset(q&(),+W%) : PROCoffset(r&(),+W%+1)
REM Sum the border cells:
k&() += l&() + m&() + n&() + o&() + p&() + q&() + r&()
REM Calculate next state (by Svein Svensson):
k&() OR= g&()
g&() = k&() * 80 DIV 240 : REM Relies on 8-bit wraparound
REM Undo the offsets:
PROCoffset(k&(),+W%+1) : PROCoffset(l&(),+W%) : PROCoffset(m&(),+W%-1)
PROCoffset(n&(),+1) : PROCoffset(o&(),-1)
PROCoffset(p&(),-W%+1) : PROCoffset(q&(),-W%) : PROCoffset(r&(),-W%-1)
*REFRESH
REM Report speed in title bar:
T% = (T% + 1) MOD (DIM(ft%(),1) + 1)
N% = TIME
ft%(T%) = N% - R%
R% = N%
title$ = STR$(1000/SUM(ft%())) + " generations per second"
IF INKEY$(-256) = "W" THEN
SYS "SetWindowText", @hwnd%, title$
ELSE
SYS "SDL_SetWindowTitle", @hwnd%, title$, @memhdc%
ENDIF
UNTIL FALSE
END
REM This is non-portable code:
DEF PROCoffset(RETURN a&(), O%)
LOCAL i%%, p%%, W%, H% : p%% = PTR(a&())
W% = DIM(a&(),1) + 1 : H% = DIM(a&(),2) + 1
IF O% < 0 FOR i%% = p%% + O% + 9 TO p%% + 8 : ?i%% = 0 : NEXT
IF O% > 0 FOR i%% = p%% + W%*H% + 9 TO p%% + W%*H% + O% + 8 : ?i%% = 0 : NEXT
p%% += O% : PTR(a&()) = p%%
?p%% = 2 : p%%!1 = W% : p%%!5 = H%
ENDPROC
Last edited by Deleted User 9295 on Tue Jan 12, 2021 11:31 pm, edited 2 times in total.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Although it can't be used as an index as such, 'concatenating' k with g (in practice I used k*2 + g) was key to the latest improvement. This reduces the total number of operations required to calculate the next state (using only whole-array operators) from 12 to 7:
Code: Select all
p = k*2 + g + 8
q = p + 3
g = q DIV 16 - p DIV 16
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Svein Svensson came up with this suggestion at the Discussion Group, which is either brilliant or ugly depending on your attitude to relying on 'side effects' (it works only because byte variables 'wrap around' on overflow, like 8-bit indirection does, rather than reporting a 'Number too big' error):
Code: Select all
REM Calculate next state (by Svein Svensson):
k&() OR= g&()
g&() = k&() * 80 DIV 240 : REM Relies on 8-bit wraparound
Re: Idea: Fastest Conway's Game of Life in BBC Basic
I've listed below what I'm tentatively calling the 'final' version of the program. The last tweak, to squeeze the very last drop of performance, was to remove the synchronisation to vertical sync, because that had become redundant with the new display method.
On my fastest desktop PC it manages a sustained 18 generations-per-second or so, with a 512x512 grid (262,144 cells), which I think is pretty impressive for an entirely-interpreted language. It will run in BBC BASIC for SDL 2.0 or BBC BASIC for Windows, but in the latter case it needs the as-yet unreleased v6.14a.
To achieve this I have made maximum use of BBC BASIC's features:
On my fastest desktop PC it manages a sustained 18 generations-per-second or so, with a 512x512 grid (262,144 cells), which I think is pretty impressive for an entirely-interpreted language. It will run in BBC BASIC for SDL 2.0 or BBC BASIC for Windows, but in the latter case it needs the as-yet unreleased v6.14a.
To achieve this I have made maximum use of BBC BASIC's features:
- Array arithmetic, allowing me to calculate the next state of every cell all at once.
- Byte arrays, used for compactness and to take advantage of 8-bit numeric wraparound.
- Structures, making the main grid array look like a BMP file for fast display.
- Poking directly into array descriptors, to effectively offset the data within the arrays.
Code: Select all
REM Conway's 'Game of Life' in pure BASIC (with some cheating!)
REM By Richard Russell, http://www.bbcbasic.co.uk/, 13-Jan-2021
HIMEM = PAGE + 8000000
REM Set grid size and initialise screen:
W% = 512 : H% = 512
VDU 23,22,W%;H%;8,8,16,0
OFF
REM Define structures and arrays:
DIM bm{bfType{l&,h&}, bfSize%, bfReserved%, bfOffBits%, \
\ biSize%, biWidth%, biHeight%, biPlanes{l&,h&}, biBitCount{l&,h&}, \
\ biCompression%, biSizeImage%, biXPelsPerMeter%, biYPelsPerMeter%, \
\ biClrUsed%, biClrImportant%, palette%(255)}, g&(H%-1,W%-1), ft%(9)
REM Initialise bitmap structure:
bm.bfType.l& = ASC"B" : bm.bfType.h& = ASC"M"
bm.bfOffBits% = ^g&(0,0) - bm{} : bm.bfSize% = bm.bfOffBits% + W%*H%
bm.biSize% = 40 : bm.biWidth% = W% : bm.biHeight% = -H%
bm.biPlanes.l& = 1 : bm.biBitCount.l& = 8
bm.palette%(0) = &FF000000 : bm.palette%(1) = &FFFFFFFF
REM Initialise the grid to a random state:
FOR Y% = 0 TO H%-1
FOR X% = 0 TO W%-1
g&(Y%,X%) = RND(2) - 1
NEXT
NEXT
REM Create 8 copy arrays with gaps:
G% = W% * 2 + 2 : p%% = 0
DIM k&(H%-1,W%-1), p%% G%, l&(H%-1,W%-1), p%% G%, m&(H%-1,W%-1), p%% G%, n&(H%-1,W%-1), p%% G%
DIM o&(H%-1,W%-1), p%% G%, p&(H%-1,W%-1), p%% G%, q&(H%-1,W%-1), p%% G%, r&(H%-1,W%-1), p%% G%
REM Do some evil descriptor hacking to offset the arrays:
PROCoffset(k&(),W%+1) : PROCoffset(l&(),W%+1) : PROCoffset(m&(),W%+1) : PROCoffset(n&(),W%+1)
PROCoffset(o&(),W%+1) : PROCoffset(p&(),W%+1) : PROCoffset(q&(),W%+1) : PROCoffset(r&(),W%+1)
REM Repeat for each generation:
GCOL 15
R% = TIME
@%=&102010A
ft%() = 30
REPEAT
REM Display grid:
OSCLI "MDISPLAY " + STR$~bm{}
REM Make 8 copies of grid:
k&() = g&() : l&() = g&() : m&() = g&() : n&() = g&()
o&() = g&() : p&() = g&() : q&() = g&() : r&() = g&()
REM Offset arrays:
PROCoffset(k&(),-W%-1) : PROCoffset(l&(),-W%) : PROCoffset(m&(),-W%+1)
PROCoffset(n&(),-1) : PROCoffset(o&(),+1)
PROCoffset(p&(),+W%-1) : PROCoffset(q&(),+W%) : PROCoffset(r&(),+W%+1)
REM Sum the border cells:
k&() += l&() + m&() + n&() + o&() + p&() + q&() + r&()
REM Calculate next state (by Svein Svensson):
k&() OR= g&()
g&() = k&() * 80 DIV 240 : REM Relies on 8-bit wraparound
REM Undo the offsets:
PROCoffset(k&(),+W%+1) : PROCoffset(l&(),+W%) : PROCoffset(m&(),+W%-1)
PROCoffset(n&(),+1) : PROCoffset(o&(),-1)
PROCoffset(p&(),-W%+1) : PROCoffset(q&(),-W%) : PROCoffset(r&(),-W%-1)
REM Report speed in title bar:
T% = (T% + 1) MOD (DIM(ft%(),1) + 1)
N% = TIME
ft%(T%) = N% - R%
R% = N%
title$ = STR$(1000/SUM(ft%())) + " generations per second"
IF INKEY$(-256) = "W" THEN
SYS "SetWindowText", @hwnd%, title$
ELSE
SYS "SDL_SetWindowTitle", @hwnd%, title$, @memhdc%
ENDIF
UNTIL FALSE
END
REM This is non-portable code:
DEF PROCoffset(RETURN a&(), O%)
LOCAL i%%, p%%, W%, H% : p%% = PTR(a&())
W% = DIM(a&(),1) + 1 : H% = DIM(a&(),2) + 1
IF O% < 0 FOR i%% = p%% + O% + 9 TO p%% + 8 : ?i%% = 0 : NEXT
IF O% > 0 FOR i%% = p%% + W%*H% + 9 TO p%% + W%*H% + O% + 8 : ?i%% = 0 : NEXT
p%% += O% : PTR(a&()) = p%%
?p%% = 2 : p%%!1 = W% : p%%!5 = H%
ENDPROC
Last edited by Deleted User 9295 on Thu Jan 14, 2021 6:25 pm, edited 1 time in total.
Re: Idea: Fastest Conway's Game of Life in BBC Basic
Very impressive! And very educational too as your successive versions got faster.
It would be nice to see what can be done in, say, Basic V, which runs on reasonably fast machines (the Pi) but doesn't have the benefit of the overlapping array techniques. Perhaps someone could have a go at that one day.
It would be nice to see what can be done in, say, Basic V, which runs on reasonably fast machines (the Pi) but doesn't have the benefit of the overlapping array techniques. Perhaps someone could have a go at that one day.