Difference between revisions of "Current events"
Revision as of 22:53, 2 March 2010
Surprises
Doing further testing of ZX BASIC stream support has been very fruitful in shaking out bugs in the core socket library.
*Edit*
I had thought this bug had existed for the lifetime of the Spectranet and furthermore... but no, it was a regression I'd put in a couple of weeks ago (duh!). A new check for the socket going dead during the "wait for data" loop in the buffer unload routines was put in the wrong place, with the result that it caused a race condition.
*End edit*
The regression was this. When receiving, the code was checking for the socket being in the RST state before checking whether there was data to unload. (It also left the stack unbalanced, which is what actually caused the crash: if it exited due to the RST state, it wasn't popping a value that had recently been pushed.) With a short enough piece of data, the remote end was already closing down, and the resulting race went one of two ways. Either the code passed the RST check and went on to see whether there was a buffer to unload, or the RST condition was set a little sooner, so the code saw the status had gone to RST and exited before checking whether there was more data to unload from the ethernet buffer. Up until now, the RST flag in the hardware had always been set just after we'd checked it, so the buffer would get copied correctly. But for some reason, the W5100 was beating the code tonight (even though the code in question hadn't changed, and therefore took the same number of T-states).
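The ordering problem can be shown with a minimal C sketch. This is not the Spectranet code (which is Z80 assembler); the enum names and the `recv_step` function are hypothetical, and the hardware is reduced to two inputs. The point is only that the "is there data?" test comes before the "has the socket gone dead?" test, so a short message followed by an immediate close is still unloaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket states; names and values are illustrative only. */
enum sock_state { SOCK_OPEN, SOCK_RST };

/* What one pass of the "wait for data" loop should do next. */
enum recv_action { RECV_COPY, RECV_WAIT, RECV_GONE };

/* rx_waiting: bytes currently sitting in the hardware receive buffer. */
enum recv_action recv_step(enum sock_state state, size_t rx_waiting)
{
    assert(state == SOCK_OPEN || state == SOCK_RST);

    /* Look for data first: the socket may already be in RST while the
     * last bytes it received are still waiting to be unloaded. */
    if (rx_waiting > 0)
        return RECV_COPY;

    /* Only when nothing is left to unload does the dead socket matter. */
    if (state == SOCK_RST)
        return RECV_GONE;

    return RECV_WAIT;
}
```

With the checks in this order, the outcome no longer depends on whether the RST state is set just before or just after the test. In the real routine, every exit path also has to leave the stack balanced.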
The fix to the core socket library code fixed this particular problem with BASIC streams. Now to find the next bug...
Winston 22:46, 2 March 2010 (GMT)
Back to BASICs
ZX BASIC is slow. Very, very slow. But this slowness, it turns out, has its uses.
I noticed that somewhere along the way I forgot to actually get the streams module into an anywhere near finished state (while it works, it lacks certain very important features that are needed for the Spectranet to be useful to the BASIC programmer). It's been an interesting shake-down for the base socket library, which I had thought was done and dusted many months ago...
In all the tests that I had used, the Spectrum could basically keep up. The tests were written in C or raw assembler, and so were real programs, such as the network filesystem and the IRC client. For general networking, the Speccy does seem to cope quite happily with keeping up - it's never pulling in too much data at once, and never really gets behind. As you may have seen, the raw socket library goes fast enough to show full motion video. But BASIC is another matter entirely, and presents a number of interesting challenges.
First, messages end up piling up in the W5100's buffer, as the BASIC program can't even keep up with another Spectrum. I've been doing most of the testing with a server program to test the new control/status stream that's available to BASIC. This revealed a couple of bugs in the low level routines that load and unload the ethernet buffers (the "wait for data" or "wait for buffer space to fill" loops could hang, because they didn't actually check whether the connection had gone away in the meantime). It also revealed bugs in the poll routines. The W5100 has several status bits in the interrupt register, but they all get cleared when you say "I've handled the interrupt for socket N". This meant that if you used poll to check whether there was data waiting (but the socket had also closed), then when you went to poll again, the closed interrupt flag had been cleared too. In a machine code program that checks all the flags at once and knows to read and then close, that's fine. But the BASIC interface only handles one flag at a time - so the read would end up clearing the close flag, and the BASIC program would never get the message that the socket had also been closed. So the poll routines now also check the status register, so that they always reliably return a closed flag.
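A small simulation in C illustrates the lost close flag and the fix. Everything here is an assumption made for illustration: the `sim_socket` struct stands in for the hardware, the flag values are made up rather than taken from the W5100 register layout, and `poll_one_flag` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values, not the W5100's actual bit layout. */
#define IR_RECV     0x01u   /* interrupt: data arrived */
#define IR_DISCON   0x02u   /* interrupt: peer disconnected */
#define POLL_READ   0x01u
#define POLL_CLOSED 0x02u

/* Stand-in for one hardware socket. */
struct sim_socket {
    uint8_t interrupt;   /* latched interrupt bits, all cleared together */
    uint8_t closed;      /* status register view: nonzero once closed */
};

/* Report one event per call, the way the BASIC interface consumes them. */
uint8_t poll_one_flag(struct sim_socket *s)
{
    uint8_t ir;

    assert(s != NULL);
    ir = s->interrupt;
    s->interrupt = 0;            /* acknowledging clears every bit at once */

    if (ir & IR_RECV)
        return POLL_READ;        /* the disconnect bit is lost here... */
    if ((ir & IR_DISCON) || s->closed)
        return POLL_CLOSED;      /* ...but the status register still knows */
    return 0;
}
```

If the `s->closed` test is removed, a socket that received data and closed between two polls reports "read" once and then nothing at all, which is the behaviour the BASIC program was seeing.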
The extra challenges put up by BASIC are these. Essentially, BASIC doesn't really map onto the way the socket library works. The socket library only cares about bytes - it has no concept of "records". But the BASIC INPUT# routine is very much oriented around the concept of a "record", or at least a carriage return terminated string. This means the streams module buffers incoming data, fills the BASIC INPUT buffer until it hits a CR, and exits. Since it may read 256 bytes (and these 256 bytes may contain what BASIC considers to be a dozen actual records), we may do a recv() once, and INPUT# is then executed several times for that one call to recv() as the buffer is emptied. As far as the socket library is concerned, if you call poll() at this stage, there's nothing waiting, and it correctly returns with all the flags cleared. Or it may even have noticed the connection has gone away, and so returns the closed flag. But as far as BASIC is concerned, it's not over yet. There may not be anything more to come along the network... but there may still be stuff waiting to be emptied from the buffer. So the poll routine for the BASIC control/status channel must not just call the socket library's poll, but also check the state of all the stream buffers.
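The mismatch between bytes and records can be sketched in C as follows. The `stream_buf` struct and both function names are hypothetical and only model the idea: one recv() fills the buffer, several INPUT#-style reads drain it one CR-terminated record at a time, and the BASIC-level poll has to look at the buffer as well as the socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-stream buffer filled by a single recv(). */
struct stream_buf {
    char   data[256];
    size_t head;   /* index of the next unread byte */
    size_t len;    /* number of valid bytes in data */
};

/* Copy one CR-terminated record into out (CR dropped, NUL added).
 * Overlong records are truncated to fit. Returns 1 if a complete record
 * was found, 0 if the buffer ran dry first (out is then unspecified). */
int read_record(struct stream_buf *b, char *out, size_t outsz)
{
    size_t i, n = 0;

    assert(b != NULL && out != NULL && outsz > 0);
    for (i = b->head; i < b->len; i++) {
        if (b->data[i] == '\r') {
            out[n] = '\0';
            b->head = i + 1;     /* consume the record and its CR */
            return 1;
        }
        if (n + 1 < outsz)
            out[n++] = b->data[i];
    }
    return 0;                    /* no CR yet: leave the bytes in place */
}

/* BASIC-level poll: readable if the socket has data OR bytes remain buffered. */
int basic_poll_readable(const struct stream_buf *b, int socket_has_data)
{
    assert(b != NULL);
    return socket_has_data || b->head < b->len;
}
```

For a buffer holding "10", CR, "20", CR, two calls to `read_record` return "10" and then "20", and `basic_poll_readable` keeps reporting data until the second record has been consumed, even when the socket itself has nothing more to offer.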
All this writing is coming to one final point: I want to get the BASIC streams support into a ...umm...basically demonstrable state before I make a video of "Progress so far". Soon, I hope, really soon...
Winston 21:16, 15 February 2010 (GMT)
Porting and optimizations
The simple portable TNFS server is (I think) pretty much done, except for an 8 bit version. I've tested it on Mac OSX (64 bit intel), Linux (32 bit intel), OpenBSD (64 bit SPARC) and Windows XP (32 bit Intel). Windows of course gave all the problems. Its POSIX interfaces are annoyingly ever-so-slightly-different to the Unix world (for example, "open" requires an extra flag O_BINARY to work - I notice that Python fell into this trap too), and the socket interface on Windows needs extra things that are not needed for Unix types. And all the header files are different. But at least it works now!
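One common way to handle the O_BINARY difference in portable C is sketched below. This is not taken from the TNFS server source; `open_for_read` is a hypothetical helper, and the header choice for Windows assumes a C runtime that provides `open` in `<io.h>`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#ifdef _WIN32
#include <io.h>       /* open() lives here on Windows C runtimes */
#else
#include <unistd.h>
#endif

/* Unix has no O_BINARY; defining it as 0 makes the flag a no-op there. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: open a file for reading with no newline translation.
 * Returns a file descriptor, or -1 on failure, exactly as open() does. */
int open_for_read(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_BINARY);
}
```

The same call then compiles on both families of system, and files served to the Spectrum come through byte for byte on Windows instead of having their line endings altered.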
I've also worked a little on the optimization of the network file system routines. The read routine was essentially double buffering the data (reading the block off the network into a Spectranet buffer, then copying it to wherever it was supposed to finally land). This has been changed so that the header is first decoded, then the remainder of the data (the actual block of data read() is supposed to return) is copied directly to its destination from the W5100's buffer. This probably saves around 8,000 T-states per 512 byte block read. It certainly seems faster when loading a 128K snapshot file, although it probably only saves 2/3rds of a second (a 128K snapshot loads in about 2 secs).
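As a rough cross-check of those figures, the saving can be worked out in a few lines of C. The function name is hypothetical, and the 3.5 MHz clock in the usage note is an assumption (the nominal Z80 clock of a 48K Spectrum; the 128K models run very slightly faster).

```c
#include <assert.h>

/* Seconds saved over a whole transfer, given a per-block saving in T-states. */
double seconds_saved(unsigned long total_bytes, unsigned long block_bytes,
                     unsigned long tstates_per_block, double clock_hz)
{
    unsigned long blocks;

    assert(block_bytes > 0 && clock_hz > 0.0);
    blocks = (total_bytes + block_bytes - 1) / block_bytes;   /* round up */
    return (double)blocks * (double)tstates_per_block / clock_hz;
}
```

Calling `seconds_saved(131072UL, 512UL, 8000UL, 3500000.0)` gives 256 blocks times 8,000 T-states, or 2,048,000 T-states, which is a little under 0.6 seconds. That is in the same region as the estimate above.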
Now I probably ought to make that video I've been going on about for the last 3 months...
Winston 17:18, 31 January 2010 (GMT)
Portable network fileserver
The current job is a portable fileserver program for the simple TNFS protocol.
The server program I have used to test the filesystem is something I hacked together using perl, for reasons of expediency. However, to make it simple for Windows users (who don't have a perl interpreter by default), and also as a basis for a fileserver for 8 bit systems, I'm writing a lightweight server in C. It'll also be better for "retroservers" like the MicroVAX that I have. Although the perl server runs on the VAX with adequate speed, a server written in C will run a lot faster. The idea is to have something very portable and small and simple. The architectures I intend to test with the first version will be sparc64 (big endian, 64 bit), VAX (little endian, 32 bit), Win32 (x86, little endian, 32 bit), macppc (32 bit big endian) and mac on intel (x86_64, 64 bit little endian). The endianness matters for testing because 16 and 32 bit integers are encoded in the datagrams as little endian values (most 8 bit systems seem to deal with 16 or 32 bit values in a little endian manner).
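A minimal sketch of endian-independent encoding in C follows. The helper names are hypothetical and are not taken from the TNFS server; the technique is simply to build and read the wire bytes one at a time with shifts, so the same code behaves identically on big endian (SPARC, PowerPC) and little endian (x86, VAX) hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 16 bit value into a datagram buffer, least significant byte first. */
void put_le16(uint8_t *p, uint16_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}

/* Read a 16 bit little endian value back out of the buffer. */
uint16_t get_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Write a 32 bit value, least significant byte first. */
void put_le32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)(v >> 24);
}

/* Read a 32 bit little endian value back out of the buffer. */
uint32_t get_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Because nothing here casts a buffer pointer to an integer type, the helpers also avoid alignment faults on machines such as SPARC, where an unaligned 32 bit load is an error.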
My goal at the moment is to make the "simple" TNFS server. This TNFS server is for LAN use, so that someone may share files between their PC and a Spectrum. It doesn't include things like authentication and authorization. The idea is that the user sets the server up to share a specific directory tree and it works a bit like ResiDOS - it lets you access files within. A later goal is to find out what I need to use for ESXDOS on the Spectrum, and make a Speccy version of the fileserver, so it can serve files from a DivIDE. All the protocol level stuff should be portable, as will the network code. Filesystem level stuff will probably need some Speccy-specific #asm sections - given that ESXDOS is brand new it may be some time before the z88dk gets ESXDOS support for its base I/O libraries. (However, ESXDOS does have a POSIX like interface which should mean it won't be too bad to port).
Later, I may make a proper "multiuser server" (that is, it'll integrate with the host OS authentication and authorization methods). However, I'll only do this for Unix. The last time I touched Windows at this level was with NT 4.0 and lots of things have changed with Vista and newer, so someone more familiar than me with the Windows security model would have to take that up.
The snapshot manager
Unfortunately "real life" problems have kept me away from doing serious work with the Spectranet software for a week or three, so I'm well behind where I expected to be. However, I still managed to get the first pass of the snapshot manager working. At the moment it's rather rudimentary, but it does at least have a scrollable file/directory browser, and it can load and save snapshots. It is invoked via the NMI menu (and is a ROM module). I am also moving the snapshot code itself to this module - for testing, this had lived in the BASIC extensions module. Hopefully, by next week I'll have the current firmware in a state where it's worthy of making a video update on what the Spectranet does so far.
The other feature I need to add is a rename system call for filesystems in the base ROM. This will need a slight rearrangement of some internal functions because the jump table is full, and there are some functions that can be combined or moved elsewhere to free up entries (I don't really want the jump table to take up more than 256 bytes). Also I need to write a portable TNFS server (the current one is in Perl, and although this is portable enough, Windows users generally don't have a perl interpreter, and most importantly - a Spectrum will never have a perl interpreter and I want the Spectrum itself to be able to be a file server for the ultimate retro network). Then after that, get some new hardware made for those who want one!
Winston 23:05, 28 December 2009 (GMT)