Creating a filesystem

Introduction

The Spectranet has been designed so that new filesystems can be developed without too much effort.

The basic principles are simple. A filesystem abstraction layer provides a uniform set of system calls that would be familiar to anyone who has used the Unix fcntl functions (or the Windows equivalent). Transparently to the writer of a program that does filesystem operations, the layer passes the parameters on to the right filesystem module. This means the application programmer doesn't need to care whether the file is on the network, on an http site, on an IDE disc, or on a +3 floppy: it should just work, much as a VFS layer "just works" on a BSD, Linux, Macintosh or Windows machine. What the programmer wants to do gets sent to the right place, regardless of where the file physically is or what has to happen to get to it.

To understand what is needed, you first need to know that the Spectranet has a system of modules. Each module is 4K (if the code does not fit in 4K, there's no reason why one logical module can't be made up of several 4K modules). The modules are general purpose: they can hold just data, they can hold arbitrary code with a call interface via the MODCALL mechanism, or pretty much anything else, including the code that makes a filesystem work. So any new filesystem will be written as a Spectranet module.

To act as a filesystem module, a module must:

  • Have a MOUNT entry in its vector table.
  • Have a MOUNT function that recognises the filesystem type it's for (a short string).
  • Contain a jump table that maps the entry points for filesystem operations (like open, close, read, write etc.) to the appropriate code that carries out that function.

And that's really it. A filesystem needn't support every single FS call. For example, a filesystem that's only ever going to be read-only does not need to implement "write" (except to return "not implemented" or perhaps the error code for "read only filesystem").
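As a rough illustration of the jump table idea, here is a host-side model in C rather than real module code. The operation order, the function signature and the error value FS_EROFS are all hypothetical; a real module would use a Z80 jump table and whatever error codes the Spectranet defines. The point is only that "write" still has an entry, it just reports that the filesystem is read-only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; the real Spectranet error numbers are not given here. */
enum { FS_OK = 0, FS_EROFS = -1 };

/* Hypothetical operation slots; a real jump table has more entries (open, close, ...). */
enum { OP_READ, OP_WRITE, OP_COUNT };

/* Every operation shares one signature so the table can hold them all. */
typedef int (*fs_op)(int handle, void *buf, size_t len);

static const char rom_data[] = "hello";

/* Read: copy up to len bytes out of a fixed, read-only buffer. */
static int ro_read(int handle, void *buf, size_t len) {
    (void)handle;
    assert(buf != NULL);
    size_t n = len < sizeof rom_data ? len : sizeof rom_data;
    memcpy(buf, rom_data, n);
    return (int)n;
}

/* Write: the entry exists, but only reports "read only filesystem". */
static int ro_write(int handle, void *buf, size_t len) {
    (void)handle; (void)buf; (void)len;
    return FS_EROFS;
}

/* The jump table: one entry per operation, indexed by operation number. */
const fs_op ro_jump_table[OP_COUNT] = {
    [OP_READ]  = ro_read,
    [OP_WRITE] = ro_write,
};

/* What the abstraction layer does with the table: look up and call. */
int fs_call(const fs_op *table, int op, int handle, void *buf, size_t len) {
    assert(table != NULL && op >= 0 && op < OP_COUNT);
    return table[op](handle, buf, len);
}
```

Because every slot is filled, the caller never has to check whether an operation exists; unsupported operations simply return an error.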

The lifecycle of a filesystem

The basics are these:

  • The filesystem gets mounted.
  • Operations happen on files and directories.
  • The filesystem gets unmounted.

The last thing might not happen if the user just turns the machine off, of course, so you have to account for that somehow. Modern operating systems have some kind of shutdown button to deal with this, but Spectrum users are used to just pulling the plug!

Mounting a filesystem

So to go through these things in a bit more detail, what happens is this. The filesystem gets mounted, either at boot using the automounter, or by the user typing in a command like:

%mount 0,"tnfs://example.com/"

Whatever accepts the filesystem URL then parses it and splits it up into the following components:

  • Desired mount point (in this case, 0)
  • Filesystem type (in this case, tnfs)
  • Username, if needed
  • Password, if needed
  • Hostname (in this case, example.com) - not all filesystems need a hostname
  • Path (in this case, /)

The filesystem module doesn't need to parse all of this itself. The parsing is done elsewhere, and the results arrive handily pointed to by the IX register, so the filesystem programmer doesn't have to care about it too much. All these parameters arrive as null-terminated strings.
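To make the split concrete, here is a small C model of what the parser has to produce for a URL of the form type://host/path. It is only a sketch: the struct layout, the field sizes and the function name are hypothetical (the real control block addressed through IX is not described here), and the optional username and password are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the control block handed to the mount routine. */
struct mount_params {
    int  mountpoint;
    char fstype[16];   /* e.g. "tnfs" */
    char host[64];     /* e.g. "example.com" */
    char path[64];     /* e.g. "/" */
};

/* Split "type://host/path" into null-terminated strings.
   Returns 0 on success, -1 if the URL is malformed or a part is too long. */
int split_fs_url(int mountpoint, const char *url, struct mount_params *out) {
    assert(url != NULL && out != NULL);

    const char *sep = strstr(url, "://");
    if (sep == NULL || sep == url) return -1;          /* no type given */
    size_t type_len = (size_t)(sep - url);
    if (type_len >= sizeof out->fstype) return -1;

    const char *host = sep + 3;
    const char *slash = strchr(host, '/');
    size_t host_len = slash ? (size_t)(slash - host) : strlen(host);
    const char *path = slash ? slash : "/";            /* default to the root */
    if (host_len >= sizeof out->host) return -1;
    if (strlen(path) >= sizeof out->path) return -1;

    out->mountpoint = mountpoint;
    memcpy(out->fstype, url, type_len);
    out->fstype[type_len] = '\0';
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';
    strcpy(out->path, path);
    return 0;
}
```

Called with mount point 0 and "tnfs://example.com/", this fills in "tnfs", "example.com" and "/", matching the component list above.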

Note we talk about "hostnames" here. This doesn't mean all filesystems must be network filesystems. The hostname part might not actually be needed or can be repurposed. For example, let's imagine we're writing a filesystem that supports FAT32 on IDE. We could have a mount command that looks something like this:

%mount 0,"idefat://0/"

In this case, the "hostname" field is being used to identify the device, which is device 0. To be even more elaborate, someone may write a set of device driver modules that support block devices: let's say one for IDE, one for SCSI and one for SATA (on a Spectrum!). We could then have a URL like this:

%mount 0,"fat://sata/1"

...and the "fat" filesystem module would interpret the hostname to mean "use the SATA driver module" and the first bit of the path to say "use SATA interface 1".

Effectively, for non-network filesystems, it's up to the filesystem module what it does with the hostname parameter. For network filesystems, the hostname parameter ought to be an actual hostname.
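One way a block-device filesystem might repurpose these fields is sketched below in C. Everything here is an assumption made for illustration: the driver names come from the imagined IDE/SCSI/SATA example, and the function name, the 0 to 255 unit range and the return convention are invented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver names, taken from the imagined example in the text. */
static const char *const known_drivers[] = { "ide", "scsi", "sata" };

/* Treat the hostname as a driver name and the first path component as a
   unit number. Returns the driver's index and stores the unit in *unit,
   or returns -1 if either part cannot be interpreted. */
int select_block_device(const char *host, const char *path, int *unit) {
    assert(host != NULL && path != NULL && unit != NULL);
    if (path[0] != '/') return -1;

    char *end;
    long n = strtol(path + 1, &end, 10);
    if (end == path + 1) return -1;                 /* no digits at all */
    if (*end != '\0' && *end != '/') return -1;     /* junk after the number */
    if (n < 0 || n > 255) return -1;                /* arbitrary sanity limit */

    for (size_t i = 0; i < sizeof known_drivers / sizeof known_drivers[0]; i++) {
        if (strcmp(host, known_drivers[i]) == 0) {
            *unit = (int)n;
            return (int)i;
        }
    }
    return -1;                                      /* unknown driver name */
}
```

With host "sata" and path "/1" this selects the third driver and unit 1; an unknown driver name or a path with no number is rejected.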

So what happens when the user enters the %mount command (or the automounter does it)? The Spectranet filesystem layer first looks through all the modules it has for MOUNT vector table entries. Every Spectranet module with executable code has a vector table right at the start, and it looks something like this:

.section vectors
sig:    defb 0xAA               ; This is a ROM module
romid:  defb 0xFA               ; for a filesystem only.
reset:  defw F_init             ; reset vector
mount:  defw F_tnfs_mount       ; The mount routine
        defw 0xFFFF
        defw 0xFFFF
        defw F_startmsg
        defw 0xFFFF
idstr:  defw STR_ident          ; ROM identity string
        jp J_tnfs_modcall       ; MODCALL entry point

First there's the signature. All code modules start with 0xAA, regardless of whether they are filesystems, drivers, snapshot managers, debuggers etc. The next byte is the ROM ID - this only matters for modules that have public functions that are to be called by the MODULECALL calling mechanism. Filesystems don't have to support this if they don't need to. Then there's a reset vector, which is code that gets called when the Spectrum is starting up (after the network is up, but before any filesystems have been mounted, and long before BASIC starts). Filesystems don't have to have any code here, but quite often you'll need to claim a page of RAM to store filesystem data, and usually you'll want to do this in this routine. The next vector is for the mount function. (If you want to know about writing ROM modules generally, see Spectranet:_Tutorial_7.)
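The layout of the first few bytes can be checked with a short C model that inspects a copy of a module's vector table. The offsets follow directly from the listing (one byte each for the signature and ROM ID, then two-byte words). Treating 0xFFFF as "vector not used" is an assumption based on how the listing fills its empty entries, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODULE_SIG 0xAA    /* signature byte shared by all code modules */
#define VEC_UNUSED 0xFFFF  /* assumption: marks a vector the module does not provide */

/* Byte offsets within the vector table, following the listing. */
#define OFF_SIG   0
#define OFF_ROMID 1
#define OFF_RESET 2
#define OFF_MOUNT 4

/* Read a 16-bit word stored low byte first, the order `defw` emits on the Z80. */
static uint16_t read_word(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 and stores the mount routine's address if the module has the
   code signature and a MOUNT vector; returns 0 otherwise. */
int find_mount_vector(const uint8_t *module, size_t len, uint16_t *mount_addr) {
    assert(module != NULL && mount_addr != NULL);
    if (len < OFF_MOUNT + 2) return 0;              /* table too short */
    if (module[OFF_SIG] != MODULE_SIG) return 0;    /* not a code module */

    uint16_t addr = read_word(module + OFF_MOUNT);
    if (addr == VEC_UNUSED) return 0;               /* no mount routine */

    *mount_addr = addr;
    return 1;
}
```

A module whose first six bytes are AA FA 00 20 34 21 would be reported as having a mount routine at 0x2134.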

When the filesystem abstraction layer finds a module with a MOUNT vector, it will go ahead and call it, passing all the information talked about earlier in a control block. The mount routine (in the example above, the function called "F_tnfs_mount") then looks at the filesystem type: for example, in the URL "tnfs://example.com" the filesystem type is "tnfs". This is just a plain string compare. If the module recognises the string as a filesystem it handles, it does whatever it needs to do to attach to the filesystem, and returns either an error if that didn't work or success if it did. In either case, the filesystem abstraction layer stops looking for other modules that might handle this type.

If the filesystem type is not something this module handles, then it just returns a value saying "this isn't for me", and the filesystem abstraction layer goes on to the next module and so on (until it either finds a module that understands this filesystem, or exhausts all possibilities and returns a "filesystem not known" error).
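The search described above can be modelled in a few lines of C. This is a sketch of the control flow only: the result codes, the function signature and the two example modules are hypothetical, and a real mount routine receives its parameters in a control block rather than as C arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for the three outcomes a mount routine can
   report, plus the layer's own "filesystem not known" result. */
enum mount_result { MOUNT_OK, MOUNT_FAILED, MOUNT_NOT_MINE, MOUNT_UNKNOWN_FS };

typedef enum mount_result (*mount_fn)(const char *fstype, const char *host);

/* Example module: claims "tnfs". An empty hostname stands in for a
   server that cannot be reached. */
static enum mount_result tnfs_mount(const char *fstype, const char *host) {
    if (strcmp(fstype, "tnfs") != 0) return MOUNT_NOT_MINE;
    return host[0] != '\0' ? MOUNT_OK : MOUNT_FAILED;
}

/* Example module: claims "idefat" and always succeeds. */
static enum mount_result idefat_mount(const char *fstype, const char *host) {
    (void)host;
    return strcmp(fstype, "idefat") == 0 ? MOUNT_OK : MOUNT_NOT_MINE;
}

/* Installed modules; NULL stands for a module with no MOUNT vector. */
mount_fn modules[] = { NULL, idefat_mount, tnfs_mount };

/* Try each module in turn. The first one that does not answer
   "not mine" ends the search, whether it succeeded or failed. */
enum mount_result vfs_mount(const mount_fn *mods, size_t count,
                            const char *fstype, const char *host) {
    assert(mods != NULL && fstype != NULL && host != NULL);
    for (size_t i = 0; i < count; i++) {
        if (mods[i] == NULL) continue;
        enum mount_result r = mods[i](fstype, host);
        if (r != MOUNT_NOT_MINE) return r;
    }
    return MOUNT_UNKNOWN_FS;
}
```

Note that a failed mount is still a final answer: once a module has claimed the type, no other module is asked.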

That is probably the most complex thing that the filesystem abstraction layer ever has to do. Once a filesystem is mounted on a mount point (there may be up to four mounted filesystems, from 0 to 3), it knows to pass on any system call parameters on that mount point to the filesystem that got mounted. To do this, there is an area in the Spectranet system variables space that associates mount points with the ROM page that the module was installed in, so the system can quickly look up the right module and send the call parameters to it. From your point of view as the author of a filesystem module, it's all rather straightforward: to recap, all you need to actually do is:

  • Provide a valid mount vector in the vector table.
  • Do a string compare on the filesystem type against the type you handle.
  • Return success, fail, or "not my filesystem".

Of course the gory details of actually attaching to the filesystem may be more complex, but that's your problem.
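For completeness, the association between mount points and ROM pages mentioned earlier can be pictured as a four-entry table. The C model below is only a guess at the shape of that bookkeeping: the real system variable layout is not described here, and the 0xFF "empty" marker and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MOUNTS 4      /* mount points 0 to 3, as stated in the text */
#define NO_PAGE    0xFF   /* assumption: marks a mount point with nothing mounted */

/* One ROM page number per mount point. */
static uint8_t mount_page[MAX_MOUNTS] = { NO_PAGE, NO_PAGE, NO_PAGE, NO_PAGE };

/* Record that the module in rom_page now owns this mount point.
   Returns 0 on success, -1 if the mount point is invalid or in use. */
int record_mount(int mountpoint, uint8_t rom_page) {
    assert(rom_page != NO_PAGE);
    if (mountpoint < 0 || mountpoint >= MAX_MOUNTS) return -1;
    if (mount_page[mountpoint] != NO_PAGE) return -1;
    mount_page[mountpoint] = rom_page;
    return 0;
}

/* Look up which ROM page handles calls on this mount point.
   Returns the page number, or -1 if nothing is mounted there. */
int page_for_mount(int mountpoint) {
    if (mountpoint < 0 || mountpoint >= MAX_MOUNTS) return -1;
    if (mount_page[mountpoint] == NO_PAGE) return -1;
    return mount_page[mountpoint];
}

/* Free the mount point again. Returns 0 on success, -1 if it was not in use. */
int record_umount(int mountpoint) {
    if (page_for_mount(mountpoint) < 0) return -1;
    mount_page[mountpoint] = NO_PAGE;
    return 0;
}
```

Every later filesystem call on a mount point then starts with the equivalent of page_for_mount, followed by paging in that module and jumping through its table.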

TODO

More to come! Including a simple tutorial.