cswblks.php

Diminished · Post by **Diminished** » Fri Nov 18, 2022 11:37 pm

Here's the latest thrilling installment of "bad PHP scripts I wrote while working on Quadbike". I think this is stable enough to be worth releasing now.

It is essentially a Beeb/Elk CSW block decoder, capable of listing the blocks contained within a CSW and, if requested, extracting them into a directory. (You may then manually concatenate these blocks into complete CFS files, if such activities bring you pleasure.)

It has a trick up its sleeve -- every byte decoded from the CSW is stored along with the sample number in the source audio at which the byte was found (look for the numbers in square brackets). Consequently it is very useful for accurately pinpointing the locations of interesting features in the source audio file. Most relevantly for my purposes, it allows me quickly to find the precise piece of audio that has led to a particular error in the CSW.

It currently only correctly handles bog-standard 8N1 Beeb-style CFS blocks.
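
For anyone unfamiliar with the term, 8N1 means each byte is framed as one start bit (0), eight data bits sent least-significant bit first, no parity, and one stop bit (1). The Python sketch below is not taken from cswblks.php. The function name, the (sample number, bit) input format, and the choice to tag each byte with the sample number of its start bit are all assumptions made for illustration. It shows roughly how a decoder can carry a sample number along with each byte and flag a bad stop bit:

```python
def decode_8n1(bits):
    """Decode 8N1 frames from a list of (sample_number, bit) pairs.

    Hypothetical helper for illustration only.
    Returns (decoded, errors): decoded is a list of (sample_number, byte),
    errors is a list of (sample_number, message).
    """
    decoded, errors = [], []
    i = 0
    while i + 10 <= len(bits):
        start_sample, start_bit = bits[i]
        if start_bit != 0:
            i += 1                      # idle/leader (1s): keep hunting for a start bit
            continue
        value = 0
        for k in range(8):
            value |= bits[i + 1 + k][1] << k   # data bits arrive LSB first
        stop_sample, stop_bit = bits[i + 9]
        if stop_bit != 1:
            errors.append((stop_sample, "Bad stop bit"))
        decoded.append((start_sample, value))  # assumption: tag byte with start-bit position
        i += 10
    return decoded, errors


def frame(byte):
    """Build the ten bits of one 8N1 frame (used here only to make test input)."""
    return [0] + [(byte >> k) & 1 for k in range(8)] + [1]


stream = [1, 1, 1] + frame(0x2A) + frame(0x4D)
bits = [(index * 37, bit) for index, bit in enumerate(stream)]  # 37 is an arbitrary spacing
print(decode_8n1(bits))
```

A real decoder has to recover the bits from pulses first; this only covers the framing step.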

I've attached a ZIP of the PHP, but here is the "readme" (from the comments at the top of the source code):

Code: Select all

    WHAT?
    -----
    This software will load and inspect a CSW file, and attempt to decode its blocks.
    Currently only MOS-standard 8N1 BBC Micro/Electron-style blocks are supported.
    
    HOW?
    ----
    By default, the software will list the 8N1 blocks in a CSW file:
    
    $ php -f cswblks.php <CSW file>
    
    If there is a particular block that is of interest (usually because it
    contains errors), you can request verbose details of that particular block
    based on its block ID (which will have been displayed on the prior run):
    
    $ php -f cswblks.php +b <block ID> +v +e <CSW file>
    
    (+v provides a verbose block listing; +e provides a verbose error listing.)
    
    One other useful trick that this tool can perform is to extract all of the
    8N1 blocks from a CSW into a directory, as individual files:
    
    $ php -f cswblks.php +x <output dir> <CSW file>
    
    At this time, this software will automatically create the directory if it
    does not exist, and it has no qualms about overwriting existing files, so be
    careful. This behaviour can be altered easily (change "TRUE, TRUE" to
    "FALSE, FALSE" in the call to save_blocks), but these choices are not exposed
    via the command line right now.
    
    Currently, this software provides no way to extract whole CFS files rather than
    individual blocks, but the blocks it does write can be manually concatenated into
    complete files fairly easily. On Unixalikes a shell command such as the following
    will probably do the job, once this software has extracted the blocks to,
    say, "/tmp/src", assuming you want the files in "/tmp/dst":
  
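    SRC="/tmp/src" ; DST="/tmp/dst"
    mkdir -p "${DST}"   # the destination directory must exist before appending
    for N in $(ls "${SRC}" | cut -d _ -f 3-4 | sort | uniq) ; do
      # ">>" appends, so start with an empty ${DST} to avoid duplicated data
      cat "${SRC}/"*"${N}"* >> "${DST}/$(echo "${N}" | cut -d _ -f 1)" ;
    done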
  
    WHY?
    ----
    This software was written for testing CSWs produced by Quadbike. I was previously
    using beebjit in an ad-hoc way for this purpose, but decided I needed something
    better.
    
    It is expected that this software will be useful for anyone else who
    is unfortunate enough to be developing software to convert an audio
    file into CSW data.
    
    From this perspective, it offers two key innovations:
    
    i)  Every byte decoded from the CSW is stored along with the sample number
        within the CSW where that byte originated. As such, it is very useful
        for determining whereabouts in the original audio file certain features
        lie. Header field locations, data locations, CRC locations and any error
        locations are all displayed to the user, in terms of the number of samples
        into the source audio file that was used to generate the CSW.
        
    ii) An error count and block count are provided at the end of the block decoding
        process. In my adventures with Quadbike, I have found that making a change to
        an audio-to-CSW algorithm often improves transcription of some tapes, but
        only at the expense of other ones. By having a large corpus of test data,
        and by summing errors over this entire corpus, it should be possible to
        measure scientifically the overall efficacy of any change to a CSW encoding
        algorithm (hopefully including my own).
        
        Unlike beebjit's CSW error reporting, this software only counts errors
        that occur within blocks -- so errors caused by transients on the tape
        during silent sections or leader sections will not contribute to the
        error count. It also reports the sample number of stop bit errors, which
        beebjit does not at this time.
        
        Additionally, beebjit's CSW-loading heuristic doesn't play nicely with Quadbike;
        it was intended for dealing with output from CSW.exe, which measures pulse
        lengths from a tape. Quadbike, however, artificially synthesises pairs of
        pulses based on frequency data, so using a hard threshold between 1-pulses
        and 0-pulses, as e.g. b-em currently does, works much better for Quadbike's
        CSW output.
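
To illustrate the hard-threshold idea mentioned at the end of the readme, here is a minimal Python sketch. It assumes the usual Acorn tape tones (2400 Hz for 1-bits, 1200 Hz for 0-bits) and places the threshold at the midpoint between the two ideal pulse lengths. The midpoint is my assumption and not necessarily the value b-em or any other emulator uses, and the function name is made up for this example:

```python
def classify_pulses(pulse_lengths, rate=44100):
    """Label each CSW pulse (a length in samples) as a '1'-pulse or a '0'-pulse.

    One pulse is half a cycle, so the ideal lengths are rate/4800 samples
    for a 2400 Hz tone and rate/2400 samples for a 1200 Hz tone.
    """
    one_len = rate / (2 * 2400)            # about 9.2 samples at 44100 Hz
    zero_len = rate / (2 * 1200)           # about 18.4 samples at 44100 Hz
    threshold = (one_len + zero_len) / 2   # assumed midpoint, about 13.8 samples
    return ["1" if length < threshold else "0" for length in pulse_lengths]


print(classify_pulses([8, 8, 17, 17]))
```

With pulse lengths of 8, 8, 17 and 17 samples this yields two 1-pulses followed by two 0-pulses. Because Quadbike synthesises its pulses, the lengths cluster tightly around the ideal values, so a fixed cut like this has plenty of margin.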

Diminished · Post by **Diminished** » Sat Nov 19, 2022 10:05 am

Here's some sample output.

Code: Select all

$ php -f cswblks.php /tmp/qb.csw 
M: Loading /tmp/qb.csw ... 116240 bytes.
App:            "Quadbike-alpha 2"
Rate:           44100
Pulses:         1232921
Polarity:       starts low
Zipped:         yes
Hdr. ext. len.: 0

M: Body starts at 0x34
M: Unzipped body from 116188 to 1232929 bytes.
M: Counted 1232921 pulses (OK).
Decoded 22754 bytes.
----|----------|------------|--------|--------|----|----|--------|---|--|--
 id    smpnum     filename     load     exec    num  len nextfile flg hE dE
----|----------|------------|--------|--------|----|----|--------|---|--|--
#  0 [  609910]  "MICROCOSM"     1900        0    0  100        0          
#  1 [  743014]  "MICROCOSM"     1900        0    1  100        0 F      DC
  > [  765301] Bad stop bit (data)  (4 more errors ...)

#  2 [ 1262558] "Microcosm1"     4000     4b7c    0  100        0          
#  3 [ 1396504] "Microcosm1"     4000     4b7c    1  100        0          
#  4 [ 1530422] "Microcosm1"     4000     4b7c    2  100        0          
#  5 [ 1664368] "Microcosm1"     4000     4b7c    3  100        0          
#  6 [ 1798281] "Microcosm1"     4000     4b7c    4  100        0          
#  7 [ 1932248] "Microcosm1"     4000     4b7c    5  100        0          
#  8 [ 2066216] "Microcosm1"     4000     4b7c    6  100        0          
#  9 [ 2200153] "Microcosm1"     4000     4b7c    7  100        0          
# 10 [ 2334123] "Microcosm1"     4000     4b7c    8  100        0          
# 11 [ 2468082] "Microcosm1"     4000     4b7c    9  100        0          
# 12 [ 2602073] "Microcosm1"     4000     4b7c    a  100        0          
# 13 [ 2736073] "Microcosm1"     4000     4b7c    b  100        0          
# 14 [ 2870019] "Microcosm1"     4000     4b7c    c  100        0          
# 15 [ 3004047] "Microcosm1"     4000     4b7c    d  100        0          
# 16 [ 3138074] "Microcosm1"     4000     4b7c    e  100        0          
# 17 [ 3272035] "Microcosm1"     4000     4b7c    f  100        0          
# 18 [ 3406029] "Microcosm1"     4000     4b7c   10  100        0          
# 19 [ 3540013] "Microcosm1"     4000     4b7c   11   22        0 F        
# 20 [ 3985044] "Microcosm2"     4300     4300    0  100        0          
# 21 [ 4119052] "Microcosm2"     4300     4300    1  100        0          
# 22 [ 4253042] "Microcosm2"     4300     4300    2  100        0        DC
  > [ 4323264] Bad stop bit (data)  (1 more errors ...)

# 23 [ 4521042] "Microcosm2"     4300     4300    4  100        0        DC
  > [ 4601734] Bad stop bit (data)  (3 more errors ...)

# 24 [ 4789033] "Microcosm2"     4300     4300    6  100        0          
# 25 [ 4923034] "Microcosm2"     4300     4300    7  100        0          
# 26 [ 5057000] "Microcosm2"     4300     4300    8   30        0 F        
# 27 [ 5507101] "Microcosm3"     5500     5500    0  100        0          
# 28 [ 5641140] "Microcosm3"     5500     5500    1  100        0 F        
M: Metrics|Errors|11|Blocks|30
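
The final Metrics line lends itself to the corpus-wide error summing described in the readme. A minimal Python sketch follows, assuming the line always has the `M: Metrics|Errors|<n>|Blocks|<n>` shape seen above. The function names are illustrative and not part of cswblks.php:

```python
import re

# Assumed shape of the summary line, taken from the sample output above.
METRICS_RE = re.compile(r"^M: Metrics\|Errors\|(\d+)\|Blocks\|(\d+)\s*$", re.MULTILINE)


def parse_metrics(output):
    """Return (errors, blocks) from one run's output, or None if no Metrics line."""
    match = METRICS_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def sum_metrics(outputs):
    """Sum errors and blocks over the captured output of many runs."""
    total_errors = total_blocks = 0
    for text in outputs:
        metrics = parse_metrics(text)
        if metrics is None:
            continue  # run produced no Metrics line; skip it
        total_errors += metrics[0]
        total_blocks += metrics[1]
    return total_errors, total_blocks


sample = "Decoded 22754 bytes.\nM: Metrics|Errors|11|Blocks|30\n"
print(parse_metrics(sample))
```

The output of each run could be captured with something like `subprocess.run(["php", "-f", "cswblks.php", path], capture_output=True, text=True).stdout` and the resulting strings passed to `sum_metrics`.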

Block ID #1 has some errors, so maybe we want to take a look at that block in detail. Note how every line starts with the sample number, so you can quickly find all the header fields, each line of the data hexdump, and the location of all the errors.
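
If your audio editor works in time rather than in samples, dividing the sample number by the rate reported in the header (44100 above) gives the position. A small Python sketch; the helper name is hypothetical, and it assumes the CSW rate matches the sample rate of the source audio:

```python
def sample_to_timestamp(sample_number, rate=44100):
    """Format a sample offset as m:ss.mmm for use in an audio editor."""
    total_ms = round(sample_number * 1000 / rate)
    minutes, rem_ms = divmod(total_ms, 60_000)
    seconds, millis = divmod(rem_ms, 1000)
    return f"{minutes}:{seconds:02d}.{millis:03d}"


# First error reported for block #1 in the listing above.
print(sample_to_timestamp(765301))   # 0:17.354
```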

Code: Select all

$ php -f cswblks.php +b 1 +v +e /tmp/qb.csw
M: Loading /tmp/qb.csw ... 116240 bytes.
App:            "Quadbike-alpha 2"
Rate:           44100
Pulses:         1232921
Polarity:       starts low
Zipped:         yes
Hdr. ext. len.: 0

M: Body starts at 0x34
M: Unzipped body from 116188 to 1232929 bytes.
M: Counted 1232921 pulses (OK).
Decoded 22754 bytes.
           ID                            #1
[  743350] name (hexdump follows):
[  743350]     4d 49 43 52 4f 43 4f 53 4d                       MICROCOSM
[  746734] load address                  1900
[  748082] execution address                0
[  749435] MOS block number                 1
[  750110] data length                    100
[  751128] next file address                0
[  750788] flags (final/empty/lock)        80 (F  )
[  752471] hCRC (read/computed)   7b82 / 7b82
[  839967] dCRC (read/computed)   8730 / 18e2
[  753145] data (hexdump follows):
[  753145] 00  30 2c 30 2c 2d 31 32 37 2c 31 32 36 2c 30 3a e2  0,0,-127,126,0:.
[  758545] 10  34 2c 31 2c 30 2c 30 2c 30 2c 30 2c 30 2c 30 2c  4,1,0,0,0,0,0,0,
[  763953] 20  31 32 37 2c 32 2c 30 2c 2d 32 2c 31 32 36 2c 30  127,2,0,-2,126,0
[  769733] 30  0d 00 1e 3c 20 eb 37 3a e7 a6 28 2d 32 35 36 29  ...< .7:..(-256)
[  775134] 40  3c 3e 2d 31 20 f1 22 53 65 72 69 65 73 20 31 20  <>-1 ."Series 1 
[  780545] 50  4f 70 65 72 61 74 69 6e 67 20 53 79 73 74 65 6d  Operating System
[  785948] 60  20 72 65 71 75 69 72 65 64 22 3a e0 0d 00 28 48   required":...(H
[  791350] 70  20 41 25 3d 32 33 34 3a 58 25 3d 30 3a 59 25 3d   A%=234:X%=0:Y%=
[  796755] 80  32 35 35 3a e7 ba 28 26 46 46 46 34 29 80 26 46  255:..(&FFF4).&F
[  802160] 90  46 30 30 20 f1 22 53 77 69 74 63 68 20 54 55 42  F00 ."Switch TUB
[  807561] a0  45 20 6f 66 66 20 61 6e 64 20 72 65 73 74 61 72  E off and restar
[  812956] b0  74 22 3a e0 0d 00 32 1c 20 d3 3d 26 35 38 30 30  t":...2. .=&5800
[  818366] c0  3a 2a 52 55 4e 20 4d 69 63 72 6f 63 6f 73 6d 31  :*RUN Microcosm1
[  823767] d0  0d 00 3c 14 20 2a 52 55 4e 20 4d 69 63 72 6f 63  ..<. *RUN Microc
[  829167] e0  6f 73 6d 32 0d 00 46 14 20 2a 52 55 4e 20 4d 69  osm2..F. *RUN Mi
[  834561] f0  63 72 6f 63 6f 73 6d 33 0d ff 00 00 00 00 00 db  crocosm3........

  > [  765301] Bad stop bit (data) 
  > [ 1054670] Bad pulse seq (data CRC) : 8,8,17,17 
  > [ 1054678] Bad pulse seq (data CRC) : 8,17,17,18 
  > [ 1054703] Bad stop bit (data CRC) 
  > [   N/A  ] Data CRC

M: Metrics|Errors|11|Blocks|30
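
The hCRC and dCRC lines compare the CRC read from the tape with one computed from the decoded bytes. Here is a rough Python sketch of that kind of check. It assumes the CFS CRC is the CRC-16/XMODEM variant (polynomial 0x1021, initial value 0, most significant bit first). That is my understanding of the format rather than something confirmed in this thread, and the function names are illustrative:

```python
def crc16_xmodem(data):
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def check_crc(payload, read_crc):
    """Return (ok, computed) for a payload and the CRC value read from the tape."""
    computed = crc16_xmodem(payload)
    return computed == read_crc, computed


# Standard check value for this CRC variant.
print(hex(crc16_xmodem(b"123456789")))   # 0x31c3
```

In the listing above, block #1's data CRC was read as 8730 but computed as 18e2, which is what produces the "Data CRC" error.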

Diminished · Post by **Diminished** » Sat Dec 10, 2022 1:12 pm

Here's an update -- version 3.

This adds the +p command-line option, which will detect adjacent blocks with inconsistent polarities, and generate errors for them.

It also improves cosmetics a little bit.

EDIT: 3.1 fixes another bug that would sometimes erroneously report a polarity switch on block 0 where none existed.

fuzzel · Post by **fuzzel** » Wed May 03, 2023 5:20 pm

I'd like to use this, could someone tell me where to download php from please and how I install it on my Windows 11 pc?

vanekp · Post by **vanekp** » Wed May 03, 2023 6:16 pm

fuzzel wrote: ↑Wed May 03, 2023 5:20 pm I'd like to use this, could someone tell me where to download php from please and how I install it on my Windows 11 pc?

You can download PHP for Windows from here.
No need to install it; just unzip it into a folder such as PHP and run it from a command prompt.
Also note that there were some new, fixed versions of csw2wav.php and cswblks.php released.

Diminished · Post by **Diminished** » Wed May 03, 2023 6:37 pm

fuzzel wrote: ↑Wed May 03, 2023 5:20 pm I'd like to use this, could someone tell me where to download php from please and how I install it on my Windows 11 pc?

This is probably the version you want.

And yes, don't run the version of cswblks.php from this thread, use the more recent one from the Quadbike thread.

stardot.org.uk
