I thought I'd have a go at porting it to the ARM co-processor to see what sort of performance it was possible to get out of PiTubeDirect and BeebEm with the ARM7TDMI co-processor enabled. I used beebScreen to help speed up the porting process, as I can't imagine sending 40,000 plot commands over the Tube for the host to plot would be particularly fast.
I set the resolution to 120x240 in mode 2, which allows for double buffering on a Model B (28,800 bytes needed for both screens). By converting the sin/cos calls into a lookup table I was able to get it running at a surprising speed in BeebEm: we're talking about a frame per second rather than multiple seconds per frame. On the native ARM co-processor on PiTubeDirect it pretty much runs as fast as it can transfer the screen over.
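For anyone wondering where the 28,800 figure comes from, here's a quick sketch of the arithmetic. Mode 2 uses 4 bits per pixel, so two pixels share a byte. The function name is just something I made up for illustration, it isn't part of beebScreen.

```c
#include <stddef.h>

/* Bytes needed for one screen buffer at the given size and colour depth.
   screen_bytes is a hypothetical helper, not a beebScreen call. */
size_t screen_bytes(unsigned width, unsigned height, unsigned bits_per_pixel)
{
    return (size_t)width * height * bits_per_pixel / 8;
}

/* 120x240 at 4 bits per pixel: 14,400 bytes per screen, and double
   buffering needs two of them. */
size_t double_buffer_bytes(void)
{
    return 2 * screen_bytes(120, 240, 4);
}
```

That leaves a little headroom under the 32K on a Model B for everything else the host needs.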
I borrowed the palette settings used for the native BBC BASIC version, so it uses colours 1-4 in mode 2 and probably looks very similar to the BASIC version. However, since I have more memory, I have created a 65536-entry sine/cosine table, because I can use an unsigned short as the index and get free wrapping of the index.
I've attached an SSD image that contains the program, simply type
to run the program (with an ARM7TDMI or NativeARM co-processor enabled).
Here's a video of it in action on PiTubeDirect:
https://youtu.be/FAj-eaU-S2s
I may have another pass at it to see if I can reduce the number of floating point operations. Currently everything is done in doubles and then converted to integer when required. This isn't really an issue on the Pi, but in BeebEm there is no FPU so it's all done in software, which is going to cost a lot more than pure integer would.