Idea: Fastest Conway's Game of Life in BBC Basic

for discussion of bbc basic for windows/sdl, brandy and more
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

BigEd wrote: Thu Jan 14, 2021 11:59 amIt would be nice to see what can be done in, say, Basic V, which runs on reasonably fast machines (the Pi) but doesn't have the benefit of the overlapping array techniques. Perhaps someone could have a go at that one day.
Indeed. I see no reason why the 'array offsetting' technique should not be attempted in BASIC V; it may not provide a 'legitimate' way of accessing variable and array pointers in the way my BASICs do (without using assembler code), but there are often workarounds available. For example you can usually get a pointer to an item on the heap by immediately preceding its declaration with a zero-length memory allocation:

Code: Select all

      DIM P% -1, array%(255,255)
There is a high likelihood that P% will now point to the array%() entry on the heap. Alternatively you could 'walk' the relevant variable linked-list to find the entry for array%(. It doesn't matter how slow or messy this step is because the array will never move once declared, so it can be done just once during initialisation.

So the main time-saving tricks - using array arithmetic and offsetting arrays by patching their descriptors (information blocks) - are in principle available in BASIC V, even if not in quite such a straightforward fashion. It would certainly be interesting to see what could be achieved.
User avatar
BigEd
Posts: 6261
Joined: Sun Jan 24, 2010 10:24 am
Location: West Country
Contact:

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by BigEd »

Am I right in thinking that the cost or saving is a bit of address arithmetic? Some small integer constants added or subtracted?
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

BigEd wrote: Thu Jan 14, 2021 1:23 pm Am I right in thinking that the cost or saving is a bit of address arithmetic? Some small integer constants added or subtracted?
Sorry, I'm not sure what you are referring to here. Address arithmetic where? The (big) saving in manipulating array descriptors is in effectively shifting the array in memory without actually moving it at all!
User avatar
BigEd
Posts: 6261
Joined: Sun Jan 24, 2010 10:24 am
Location: West Country
Contact:

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by BigEd »

Oh, I must have entirely failed to guess your approach - I'll confess I haven't even tried to study your approach, only skimmed it.

This is what I thought: you have a single state array, and you define 8 other 'arrays' which don't have storage but which give you easy access to the 8 neighbours of each point. As you use array operators, you don't need to iterate over the array in interpreted loops, instead you combine the 9 views of the one grid to construct the next state of the grid in one go.

In Basic V, you'd need the same array, and you'd need to iterate over each cell, and therefore (without extra cleverness) you need a second copy to accept the new-states. For each cell position, you need to access the state of the 8 neighbours, so conventionally that's 8 simple arithmetic expressions to compute the appropriate indices.
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

BigEd wrote: Thu Jan 14, 2021 3:30 pm you define 8 other 'arrays' which don't have storage but which give you easy access to the 8 neighbours of each point.
That would be great, and would speed it up even further, but unfortunately in BBC BASIC (both Sophie's and mine, and I suspect Brandy too) there isn't the extra level of indirection it would need. In other words the method you propose would require the 'array descriptor' (array information block) to contain a pointer to the array's data, but sadly it doesn't; the array's data follows immediately after the descriptor.

So the virtual 'shifting' of the arrays is more messy than you suggest. They must have actual storage (which is why in my loop I make 8 copies of the main grid array, which is the most time-consuming operation of all) and to effectively shift the data you must shift the entire descriptor.

I'm trying to write a BASIC 5 version now, but it's the first time I've ever written a substantial-sized program in that dialect so it's all a bit of a shock to the system compared with my BASICs! I presume in BASIC 5 there's no faster way to display the grid than to iterate through all the cells, plotting each one in turn?
User avatar
BigEd
Posts: 6261
Joined: Sun Jan 24, 2010 10:24 am
Location: West Country
Contact:

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by BigEd »

Oh, goodness, I didn't expect you were making 8 copies. At least each copy is a single action - no interpreted loops over each cell.

Yes, I would expect to have to iterate over the cells, issuing a PLOT for each. There might be cleverer ways but I don't have any in mind. Actually, if we have both present state and next state arrays, we need only PLOT when a cell changes state - that will save a lot. (There are so many ways to approach a Life implementation! I'm firmly thinking of a fairly straightforward array-based approach.)
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

BigEd wrote: Thu Jan 14, 2021 3:47 pm Actually, if we have both present state and next state arrays, we need only PLOT when a cell changes state - that will save a lot.
Yet another array copy! But I suppose it's a winner time-wise. I may add that, if I can get it to work in BASIC 5 at all; currently I can't. :(
User avatar
BigEd
Posts: 6261
Joined: Sun Jan 24, 2010 10:24 am
Location: West Country
Contact:

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by BigEd »

It's possible, one way or another, to swap the roles of the previous and the next array on each generation, so no need for a bulk copy. Perhaps put both contents into a double-sized array, and twiddle the access indices appropriately?

It's also possible, and this is where I came unstuck, to have only a single array, but to buffer only a line or two. Or, to use one array for cell values and a second one for horizontal moving sums (sum of three adjacent cells). In this case one pass reads the main array and writes the sums, and a second pass reads the sums and writes the new values back to the main array. Two arrays, but no copying. I tried this and tied myself in knots, but I'm sure it's not that hard.

Edit: thinking about it, to save a lot of PLOT cost, we do actually want a before-and-after view of each cell, or a bitmap of which cells changed.
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

BigEd wrote: Thu Jan 14, 2021 4:46 pm It's possible, one way or another, to swap the roles of the previous and the next array on each generation, so no need for a bulk copy.
True, you can SWAP arrays. I'll allow you to make that change yourself to see what improvement you can achieve (displaying the grid is the bottleneck at the moment). Here's a BASIC 5 version, based very closely on the BBCSDL one, which I believe is running correctly now (tested in Red Squirrel):

Code: Select all

   10 REM Conway's 'Game of Life' in pure BASIC 5 (with some cheating!)
   20 REM By Richard Russell, http://www.bbcbasic.co.uk/, 14-Jan-2021
   30
   40 REM Set grid size and initialise screen:
   50 W% = 512 : H% = 512
   60 MODE 18
   70 OFF
   80
   90 DIM g%(H%-1,W%-1)
  100
  110 REM Initialise the grid to a random state:
  120 FOR Y% = 0 TO H%-1
  130   FOR X% = 0 TO W%-1
  140     g%(Y%,X%) = RND(2) - 1
  150   NEXT
  160 NEXT
  170
  180 REM Create 8 copy arrays with gaps:
  190 G% = W% * 8 + 8
  200 DIM K% -1, k%(H%-1,W%-1), L% G%, L% -1, l%(H%-1,W%-1), M% G%
  210 DIM M% -1, m%(H%-1,W%-1), N% G%, N% -1, n%(H%-1,W%-1), O% G%
  220 DIM O% -1, o%(H%-1,W%-1), P% G%, P% -1, p%(H%-1,W%-1), Q% G%
  230 DIM Q% -1, q%(H%-1,W%-1), R% G%, R% -1, r%(H%-1,W%-1), S% G%
  240
  250 REM Do some evil descriptor hacking to offset the arrays:
  260 PROCoffset(K%,W%+1) : PROCoffset(L%,W%+1) : PROCoffset(M%,W%+1) : PROCoffset(N%,W%+1)
  270 PROCoffset(O%,W%+1) : PROCoffset(P%,W%+1) : PROCoffset(Q%,W%+1) : PROCoffset(R%,W%+1)
  280
  290 REM Repeat for each generation:
  300 GCOL 7
  310 REPEAT
  320   REM Display grid:
  330   CLS
  340   FOR Y% = 0 TO H%-1
  350     FOR X% = 0 TO W%-1
  360       IF g%(Y%,X%) POINT X%*2,Y%*2
  370     NEXT
  380   NEXT Y%
  390
  400   REM Make 8 copies of grid:
  410   k%() = g%() : l%() = g%() : m%() = g%() : n%() = g%()
  420   o%() = g%() : p%() = g%() : q%() = g%() : r%() = g%()
  430
  440   REM Offset arrays:
  450   PROCoffset(K%,-W%-1) : PROCoffset(L%,-W%) : PROCoffset(M%,-W%+1)
  460   PROCoffset(N%,-1)    : PROCoffset(O%,+1)
  470   PROCoffset(P%,+W%-1) : PROCoffset(Q%,+W%) : PROCoffset(R%,+W%+1)
  480
  490   REM Sum the border cells:
  500   k%() = k%() + l%() : m%() = m%() + n%() : o%() = o%() + p%() : q%() = q%() + r%()
  510   k%() = k%() + m%() : o%() = o%() + q%() : k%() = k%() + o%()
  520
  530   REM Calculate next state:
  540   k%() = k%() + k%() : p%() = k%() + g%() : p%() += 8 : q%() = p%() + 3
  550   p%() = p%() / 16   : q%() = q%() / 16   : g%() = q%() - p%()
  560
  570   REM Undo the offsets:
  580   PROCoffset(K%,+W%+1) : PROCoffset(L%,+W%) : PROCoffset(M%,+W%-1)
  590   PROCoffset(N%,+1)    : PROCoffset(O%,-1)
  600   PROCoffset(P%,-W%+1) : PROCoffset(Q%,-W%) : PROCoffset(R%,-W%-1)
  610
  620 UNTIL FALSE
  630 END
  640
  650 REM This is non-portable code:
  660 DEF PROCoffset(A%, O%)
  670 LOCAL I%, P%, W%, H%
  680 P% = A%!8 : W% = P%!0 : H% = P%!4
  690 IF O% < 0 FOR I% = P% + O%*4 + 16 TO P% + 12 STEP 4 : !I% = 0 : NEXT
  700 IF O% > 0 FOR I% = P% + W%*H%*4 + 16 TO P% + W%*H%*4 + O%*4 + 12 STEP 4 : !I% = 0 : NEXT
  710 P% += O%*4 : A%!8 = P%
  720 P%!0 = W% : P%!4 = H% : P%!8 = 0 : P%!12 = W%*H%
  730 ENDPROC
(Edited to work correctly with different grid sizes).
Last edited by Deleted User 9295 on Thu Jan 14, 2021 11:10 pm, edited 3 times in total.
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

In case you want to use a non-square grid, it's necessary to swap the array indices like this:

Code: Select all

   10 REM Conway's 'Game of Life' in pure BASIC 5 (with some cheating!)
   20 REM By Richard Russell, http://www.bbcbasic.co.uk/, 14-Jan-2021
   30
   40 REM Set grid size and initialise screen:
   50 W% = 512 : H% = 512
   60 MODE 18
   70 OFF
   80
   90 DIM g%(H%-1,W%-1)
  100
  110 REM Initialise the grid to a random state:
  120 FOR Y% = 0 TO H%-1
  130   FOR X% = 0 TO W%-1
  140     g%(Y%,X%) = RND(2) - 1
  150   NEXT
  160 NEXT
  170
  180 REM Create 8 copy arrays with gaps:
  190 G% = W% * 8 + 8
  200 DIM K% -1, k%(H%-1,W%-1), L% G%, L% -1, l%(H%-1,W%-1), M% G%
  210 DIM M% -1, m%(H%-1,W%-1), N% G%, N% -1, n%(H%-1,W%-1), O% G%
  220 DIM O% -1, o%(H%-1,W%-1), P% G%, P% -1, p%(H%-1,W%-1), Q% G%
  230 DIM Q% -1, q%(H%-1,W%-1), R% G%, R% -1, r%(H%-1,W%-1), S% G%
  240
  250 REM Do some evil descriptor hacking to offset the arrays:
  260 PROCoffset(K%,W%+1) : PROCoffset(L%,W%+1) : PROCoffset(M%,W%+1) : PROCoffset(N%,W%+1)
  270 PROCoffset(O%,W%+1) : PROCoffset(P%,W%+1) : PROCoffset(Q%,W%+1) : PROCoffset(R%,W%+1)
  280
  290 REM Repeat for each generation:
  300 GCOL 7
  310 REPEAT
  320   REM Display grid:
  330   CLS
  340   FOR Y% = 0 TO H%-1
  350     FOR X% = 0 TO W%-1
  360       IF g%(Y%,X%) POINT X%*2,Y%*2
  370     NEXT
  380   NEXT Y%
  390
  400   REM Make 8 copies of grid:
  410   k%() = g%() : l%() = g%() : m%() = g%() : n%() = g%()
  420   o%() = g%() : p%() = g%() : q%() = g%() : r%() = g%()
  430
  440   REM Offset arrays:
  450   PROCoffset(K%,-W%-1) : PROCoffset(L%,-W%) : PROCoffset(M%,-W%+1)
  460   PROCoffset(N%,-1)    : PROCoffset(O%,+1)
  470   PROCoffset(P%,+W%-1) : PROCoffset(Q%,+W%) : PROCoffset(R%,+W%+1)
  480
  490   REM Sum the border cells:
  500   k%() = k%() + l%() : m%() = m%() + n%() : o%() = o%() + p%() : q%() = q%() + r%()
  510   k%() = k%() + m%() : o%() = o%() + q%() : k%() = k%() + o%()
  520
  530   REM Calculate next state:
  540   k%() = k%() + k%() : p%() = k%() + g%() : p%() += 8 : q%() = p%() + 3
  550   p%() = p%() / 16   : q%() = q%() / 16   : g%() = q%() - p%()
  560
  570   REM Undo the offsets:
  580   PROCoffset(K%,+W%+1) : PROCoffset(L%,+W%) : PROCoffset(M%,+W%-1)
  590   PROCoffset(N%,+1)    : PROCoffset(O%,-1)
  600   PROCoffset(P%,-W%+1) : PROCoffset(Q%,-W%) : PROCoffset(R%,-W%-1)
  610
  620 UNTIL FALSE
  630 END
  640
  650 REM This is non-portable code:
  660 DEF PROCoffset(A%, O%)
  670 LOCAL I%, P%, W%, H%
  680 P% = A%!8 : W% = P%!0 : H% = P%!4
  690 IF O% < 0 FOR I% = P% + O%*4 + 16 TO P% + 12 STEP 4 : !I% = 0 : NEXT
  700 IF O% > 0 FOR I% = P% + W%*H%*4 + 16 TO P% + W%*H%*4 + O%*4 + 12 STEP 4 : !I% = 0 : NEXT
  710 P% += O%*4 : A%!8 = P%
  720 P%!0 = W% : P%!4 = H% : P%!8 = 0 : P%!12 = W%*H%
  730 ENDPROC
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

To my considerable surprise (and after a lot of experimentation) I discovered that there is indeed a fast way of displaying the grid in RISC OS, using SYS "OS_SpriteOp", 52. Having never written a single program for RISC OS previously, achieving this was quite a steep learning curve.

So here I present the ultimate Conway's Game of Life in pure BASIC 5 on RISC OS, using most of the tricks I was able to deploy in the BBCSDL/BB4W version (only 'byte' arrays not being available)! This does need HIMEM set at least 10 Mbytes or so above PAGE, so may need to be run in a non-GUI environment (I'm booting straight into BASIC on Red Squirrel):

Code: Select all

   10 REM Conway's 'Game of Life' in pure BASIC 5 (with some cheating!)
   20 REM By Richard Russell, http://www.bbcbasic.co.uk/, 17-Jan-2021
   30
   40 REM Set grid size and initialise screen:
   50 W% = 512 : H% = 512
   60 MODE 18
   70 OFF
   80
   90 REM Define sprite and main grid:
  100 DIM sparea% 15 + 12, sprite% 43, G% -1, g%(H%-1,W%-1), T% 3
  110
  120 sprite%!0 = T% - sprite% : REM Total size of sprite
  130 $(sprite%+4) = "Game of Life"
  140 sprite%!16 = W% - 1      : REM Sprite width in words, - 1
  150 sprite%!20 = H% - 1      : REM Sprite height in pixels, - 1
  160 sprite%!28 = 31          : REM Last bit on right edge
  170 sprite%!32 = G% - sprite% + 24 : REM Offset to sprite image
  180 sprite%!36 = 44          : REM Offset to sprite mask (none)
  190 sprite%!40 = &B01680B5   : REM Sprite type 6, DPI = 90 x 90
  200
  210 sparea%!0 = T% - sparea%
  220 sparea%!4 = 1
  230 sparea%!8 = sprite% - sparea%
  240 sparea%!12 = T% - sparea%
  250
  260 DIM table% 11
  270 SYS "ColourTrans_GenerateTable", sparea%, sprite%, 18, -1, table%, 1, 0, 0
  280
  290 REM Initialise the grid to a random state:
  300 FOR Y% = 0 TO H%-1
  310   FOR X% = 0 TO W%-1
  320     IF RND(2) = 2 g%(Y%,X%) = TRUE
  330   NEXT
  340 NEXT
  350
  360 REM Create 8 copy arrays with gaps:
  370 G% = W% * 8 + 8
  380 DIM K% -1, k%(H%-1,W%-1), L% G%, L% -1, l%(H%-1,W%-1), M% G%
  390 DIM M% -1, m%(H%-1,W%-1), N% G%, N% -1, n%(H%-1,W%-1), O% G%
  400 DIM O% -1, o%(H%-1,W%-1), P% G%, P% -1, p%(H%-1,W%-1), Q% G%
  410 DIM Q% -1, q%(H%-1,W%-1), R% G%, R% -1, r%(H%-1,W%-1), S% G%
  420
  430 REM Do some evil descriptor hacking to offset the arrays:
  440 PROCoffset(K%,W%+1) : PROCoffset(L%,W%+1) : PROCoffset(M%,W%+1) : PROCoffset(N%,W%+1)
  450 PROCoffset(O%,W%+1) : PROCoffset(P%,W%+1) : PROCoffset(Q%,W%+1) : PROCoffset(R%,W%+1)
  460
  470 REM Repeat for each generation:
  480 REPEAT
  490   REM Display grid:
  500   SYS "OS_SpriteOp", 52+512, sparea%, sprite%, 0, 0, 0, 0, table%
  510
  520   REM Make 8 copies of grid, converting -1 to 1:
  530   k%() = -g%() : l%() = -g%() : m%() = -g%() : n%() = -g%()
  540   o%() = -g%() : p%() = -g%() : q%() = -g%() : r%() = -g%()
  550
  560   REM Offset arrays:
  570   PROCoffset(K%,-W%-1) : PROCoffset(L%,-W%) : PROCoffset(M%,-W%+1)
  580   PROCoffset(N%,-1)    : PROCoffset(O%,+1)
  590   PROCoffset(P%,+W%-1) : PROCoffset(Q%,+W%) : PROCoffset(R%,+W%+1)
  600
  610   REM Sum the border cells:
  620   k%() = k%() + l%() : m%() = m%() + n%() : o%() = o%() + p%() : q%() = q%() + r%()
  630   k%() = k%() + m%() : o%() = o%() + q%() : k%() = k%() + o%()
  640
  650   REM Calculate next state:
  660   k%() = k%() + k%() : p%() = k%() - g%() : p%() += 8 : q%() = p%() + 3
  670   p%() = p%() / 16   : q%() = q%() / 16   : g%() = p%() - q%()
  680
  690   REM Undo the offsets:
  700   PROCoffset(K%,+W%+1) : PROCoffset(L%,+W%) : PROCoffset(M%,+W%-1)
  710   PROCoffset(N%,+1)    : PROCoffset(O%,-1)
  720   PROCoffset(P%,-W%+1) : PROCoffset(Q%,-W%) : PROCoffset(R%,-W%-1)
  730
  740 UNTIL FALSE
  750 END
  760
  770 REM This is non-portable code:
  780 DEF PROCoffset(A%, O%)
  790 LOCAL I%, P%, W%, H%
  800 P% = A%!8 : W% = P%!0 : H% = P%!4
  810 IF O% < 0 FOR I% = P% + O%*4 + 16 TO P% + 12 STEP 4 : !I% = 0 : NEXT
  820 IF O% > 0 FOR I% = P% + W%*H%*4 + 16 TO P% + W%*H%*4 + O%*4 + 12 STEP 4 : !I% = 0 : NEXT
  830 P% += O%*4 : A%!8 = P%
  840 P%!0 = W% : P%!4 = H% : P%!8 = 0 : P%!12 = W%*H%
  850 ENDPROC
Soruk
Posts: 1136
Joined: Mon Jul 09, 2018 11:31 am
Location: Basingstoke, Hampshire
Contact:

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Soruk »

On my machine, I tried your SpriteOp version, and having added some timing (to exclude the SpriteOp call, so only the calculations are being measured), it hovers between 9 and 10 centiseconds in RPCEmu on my Core I5 laptop (dating back to 2013) running CentOS 8.
Matrix Brandy BASIC VI (work in progress) The Distillery (another work in progress) Note Quiz (New educational software for the BBC and modern kit)
BBC Master 128, PiTubeDirect (Pi 3B), Pi1MHz, 5.25+3.5in dual floppy.
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

Soruk wrote: Sun Jan 17, 2021 8:57 pm On my machine, I tried your SpriteOp version, and having added some timing (to exclude the SpriteOp call, so only the calculations are being measured), it hovers between 9 and 10 centiseconds in RPCEmu on my Core I5 laptop (dating back to 2013) running CentOS 8.
Displaying the grid is a crucial part of the Game of Life, so all my timing measurements have involved the display element (which is after all why I bothered researching and experimenting with OS_SpriteOp). 'Generations per second' (including display) has been my metric, the best I've managed here being about 18 for a 512x512 grid.
User avatar
BigEd
Posts: 6261
Joined: Sun Jan 24, 2010 10:24 am
Location: West Country
Contact:

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by BigEd »

An interesting result with the Sprite approach. I have been mulling over whether, for example, user-defined characters couldn't help with plotting several pixels at once.

Failing that, a relatively normal optimisation is to plot only changed cells: far fewer points plotted, generally, and no need to clear the screen.

However, the two ideas are perhaps not straightforward to combine.
Deleted User 9295

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by Deleted User 9295 »

BigEd wrote: Sun Jan 17, 2021 9:59 pmI have been mulling over whether, for example, user-defined characters couldn't help with plotting several pixels at once.
But the SYS "OS_SpriteOp", 52 plots 262,144 pixels at once. I wouldn't have thought an approach that can plot at most 64 pixels at once could come anywhere close. But if you do manage to beat my latest version I'll be mightily impressed!
User avatar
BigEd
Posts: 6261
Joined: Sun Jan 24, 2010 10:24 am
Location: West Country
Contact:

Re: Idea: Fastest Conway's Game of Life in BBC Basic

Post by BigEd »

Oh, indeed. My thinking, so far, is this: if a GPU is available, that's the fastest. Otherwise, if on an OS offering this Sprite-like update is available, that's the fastest. Otherwise, if the Basic dialect supports array operations, that'll be fastest. Otherwise, we're down to clever algorithms which need least code and perform fewest operations. Often there are space/time tradeoffs to be made.

(I'm quite interested in the performance on Basic V (or 1.35) as found on the Native ARM coprocessor.)
Post Reply

Return to “modern implementations of classic programming languages”