Unexpected difference between Linux and Windows

Post by Deleted User 9295 »

I have noticed a strange and unexpected difference between the way my Console Mode editions of BBC BASIC behave on Windows compared with Linux and MacOS. I have tracked it down to what happens in the case of this (example) code:

Code:

	char buffer[256];
	FILE *file = fopen("debug.tmp", "w+b");
	size_t bytes_written = fwrite(buffer, 1, 256, file);
	size_t bytes_read = fread(buffer, 1, 256, file);
	fclose(file);

What this does is to create a new (empty) file for writing and reading. It first writes 256 bytes to the file, and then attempts to read 256 bytes from the file. In Linux and MacOS bytes_written is set to 256 and bytes_read is set to zero (because the file pointer is already at the end of the file). This is what I would expect, and BBC BASIC behaves as it should.

But in Windows (compiled using gcc) bytes_written is set to 256 and bytes_read is also set to 256! In other words it appears to be reading beyond the 'end of file' without reporting any error or truncating the data. This confuses BBC BASIC and causes any files written to be unexpectedly padded to a multiple of 256 bytes.

My reading of the documentation of fopen, fwrite and fread doesn't suggest that I am relying on undefined behaviour here; I've opened the file for both writing and reading so it should be legitimate to do both, with fread respecting the usual end-of-file semantics (i.e. the value returned being less than the requested count).

Can anybody shed some light on this, and/or suggest a workaround?

Post by sweh »

I'd suggest doing an fseek() to ensure the pointer is at the end of the file before the fread.

Ah... The C standard seems to claim this is undefined behaviour:

https://wiki.sei.cmu.edu/confluence/dis ... oning+call

And they also recommend an fseek().
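
As a minimal sketch of that suggestion (not code from the project under discussion), the function below repeats the original write-then-read sequence but puts a zero-offset fseek() between the fwrite and the fread. The name read_after_write and the scratch file path passed to it are made up for illustration. With the seek in place the read should report zero bytes on any conforming implementation, because the position is at end of file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes n bytes to a fresh file, then tries to read
   n bytes back from the current position. Returns the count reported by
   fread, or (size_t)-1 if the setup fails. */
size_t read_after_write(const char *path, size_t n)
{
    char buffer[256];
    if (n > sizeof buffer)
        return (size_t)-1;
    memset(buffer, 0xAA, sizeof buffer);

    FILE *file = fopen(path, "w+b");
    if (file == NULL)
        return (size_t)-1;

    if (fwrite(buffer, 1, n, file) != n) {
        fclose(file);
        remove(path);
        return (size_t)-1;
    }

    /* Positioning call between output and input. The offset is zero, so
       the position does not move; the call is there for its
       synchronising effect. */
    if (fseek(file, 0L, SEEK_CUR) != 0) {
        fclose(file);
        remove(path);
        return (size_t)-1;
    }

    size_t bytes_read = fread(buffer, 1, n, file);

    fclose(file);
    remove(path);   /* tidy up the scratch file */
    return bytes_read;
}
```

Calling read_after_write("debug.tmp", 256) should then return 0 on Windows as well as on Linux and MacOS.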
Rgds
Stephen

Post by Deleted User 9295 »

sweh wrote: Fri Dec 11, 2020 3:00 pm I'd suggest doing an fseek() to ensure the pointer is at the end of the file before the fread.

Thank you, that does indeed seem to fix it! =D>

sweh wrote: The C standard seems to claim this is undefined behaviour

Hmm, in that case it would be jolly helpful if it was mentioned in the documentation of fread and fwrite!

But I'm very pleased to have a solution.

Post by Soruk »

Richard Russell wrote: Fri Dec 11, 2020 3:29 pm Thank you, that does indeed seem to fix it! [...] But I'm very pleased to have a solution.

Pipped at the post by a country mile, but it looks as though, under the lid, Windows keeps separate read and write pointers, whereas Linux and MacOS share a single pointer.

Is this under Microsoft compilers or MinGW? Edit: replicated the issue under MinGW in my Cygwin build environment...

Edit 2: ftell() returns the position of the write pointer in Windows, so you can do:

Code:

err=fseek(file, ftell(file), SEEK_SET);

to set the read pointer to the same place as the write pointer. It should be effectively a no-op on the UNIX-like systems, but could be put behind a #ifdef.

Post by sweh »

Soruk wrote: Fri Dec 11, 2020 8:43 pm It should be effectively a no-op on the UNIX-like systems, but could be put behind a #ifdef.
It must always be there 'cos the behaviour is undefined without it. What works today may break tomorrow and you'll find demons flying out of your nose. Don't risk nasal demons!
Rgds
Stephen

Post by Deleted User 9295 »

Soruk wrote: Fri Dec 11, 2020 8:43 pm ftell() returns the position of the write pointer in Windows, so you can do:

Code:

err=fseek(file, ftell(file), SEEK_SET);

This is easier (and works):

Code:

err=fseek(file, 0, SEEK_CUR);

But, somewhat amusingly, to fix it I actually had to delete code, not add code (I already had an fseek that was called only when the displacement from the current pointer was non-zero, so I just removed the test)!
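
The change described above might look roughly like this. Both function names are hypothetical and the real BBC BASIC source is not shown in this thread, so treat it as an illustrative sketch only: the "before" version skips the seek when there is nothing to move, the "after" version always seeks so that a zero displacement still acts as the positioning call.

```c
#include <stdio.h>

/* Before (hypothetical): the seek is skipped for a zero displacement, so a
   read that directly follows a write has no intervening positioning call. */
int move_pointer_if_needed(FILE *file, long displacement)
{
    if (displacement != 0)
        return fseek(file, displacement, SEEK_CUR);
    return 0;
}

/* After (hypothetical): the test is gone. A zero displacement leaves the
   position where it was but still counts as a positioning call. */
int move_pointer(FILE *file, long displacement)
{
    return fseek(file, displacement, SEEK_CUR);
}
```

With move_pointer(file, 0) called between the fwrite and the fread, the read at end of file should return zero bytes everywhere.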

Post by Deleted User 9295 »

sweh wrote: Fri Dec 11, 2020 9:32 pm It must always be there 'cos the behaviour is undefined without it.
Whilst strictly speaking you are right, this behaviour is so obscure, so unexpected and so poorly documented that there must be a lot of code out there that doesn't include the fix. I'd confidently predict that Linux will never change its behaviour to make it essential, and if there is any change it's more likely to be Windows removing the anomaly.

Interestingly, the equivalent SDL2 functions work as expected even in Windows; there doesn't seem to be any requirement to do a seek when switching from writing to reading in that case.

Post by Diminished »

It's funny how even the most basic of C functions can have different implementations. I know that on some platforms (can't remember specifics now), fread() will block until it can give you the number of bytes you asked for. On others it prefers to return early with a lesser number of bytes than you asked for, which can result in a nasty surprise if you were just assuming it would give you what you wanted.

Post by cmorley »

Diminished wrote: Sat Dec 12, 2020 5:18 am It's funny how even the most basic of C functions can have different implementations.
Not really... C dates back to 72/73 with the first standard version being C89. That's a long time for implementers to do what they thought it should do.
Diminished wrote: Sat Dec 12, 2020 5:18 am fread() will block until it can give you the number of bytes you asked for. On others it prefers to return early with a lesser number of bytes than you asked for, which can result in a nasty surprise if you were just assuming it would give you what you wanted.
Which is entirely right, and fread returns a count (a size_t) so you can deal with it. Blocking is useful if I need to wait for a disc to spin up. Blocking is entirely wrong if I hit EOF. fread can read streams too, from stdin or a serial port... should it block while waiting for the human or fail because the human can't type at MHz speeds? What is the single correct blocking behaviour? (removable media, on demand connections, user io, unimplemented devices on that platform)

fread solves this by giving you the number of bytes read. If you fail to research or choose to ignore the return value...
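
A small sketch of acting on that return value. The names read_status and read_exact are invented for this example. fread reports its count as a size_t; when the count is short, feof() and ferror() tell you whether the stream ran out of data or hit an error.

```c
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
typedef enum { READ_OK, READ_EOF, READ_ERROR } read_status;

/* Asks fread for `wanted` bytes and stores the count actually delivered in
   *got. A short count is classified with ferror(): an error if the error
   indicator is set, otherwise end of file. */
read_status read_exact(void *dest, size_t wanted, FILE *stream, size_t *got)
{
    *got = fread(dest, 1, wanted, stream);

    if (*got == wanted)
        return READ_OK;
    if (ferror(stream))
        return READ_ERROR;
    return READ_EOF;    /* short count without an error: ran out of data */
}
```

A caller that asked for 8 bytes from a 3-byte file would get READ_EOF with *got set to 3, and can decide for itself whether a partial record is acceptable.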

edit: if you want a quirk, remember that C int still has no fixed width (only a minimum of 16 bits) in a now standardised language...

Post by Coeus »

Richard Russell wrote: Fri Dec 11, 2020 3:29 pm Hmm, in that case it would be jolly helpful if it was mentioned in the documentation of fread and fwrite!
For some strange reason, on Linux at least, it is in the documentation for fopen:

Reads and writes may be intermixed on read/write streams in any order. Note that ANSI C requires that a file positioning function intervene between output and input, unless an input operation encounters end-of-file. (If this condition is not met, then a read is allowed to return the result of writes other than the most recent.) Therefore it is good practice (and indeed sometimes necessary under Linux) to put an fseek(3) or fgetpos(3) operation between write and read operations on such a stream. This operation may be an apparent no-op (as in fseek(..., 0L, SEEK_CUR) called for its synchronizing side effect).

I also wonder if there is a mistake in that and fsetpos is meant instead of fgetpos.
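
For what it's worth, as far as I can tell the C standard's own wording for update-mode streams names fflush, fseek, fsetpos and rewind as the calls that may separate output from a following input, and a file positioning function (fseek, fsetpos or rewind) for input followed by output unless the input hit end-of-file. fgetpos is not in that list, which fits the suspicion above. Here is a sketch of the fsetpos route, with resync_stream as a made-up helper name: it records the current position with fgetpos and immediately re-applies it with fsetpos, which works in both directions.

```c
#include <stdio.h>

/* Hypothetical helper: re-asserts the current file position so that the
   stream may switch between writing and reading. fgetpos only records the
   position; the fsetpos call is the positioning operation.
   Returns 0 on success, -1 on failure. */
int resync_stream(FILE *file)
{
    fpos_t here;

    if (fgetpos(file, &here) != 0)
        return -1;
    if (fsetpos(file, &here) != 0)
        return -1;
    return 0;
}
```

Used after writing "hello", a read then returns zero bytes (end of file). Used after rewinding and reading two bytes, a following write of "XY" lands at offset 2 and the file ends up holding "heXYo".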