The HYDROGEN Virtual Machine Specification

The HYDROGEN Virtual Machine Specification AlaricSnell-Pym 2006200720082009Warhead.org.uk Ltd Foreword TODO: Explain our goals. Document-wide TODOs: Make sure stack effect syntax is consistent (one hyphen or two?), standardise all words to lower-case. Write a HYDROGEN primer chapter (non-normative) that assumes access to an interactive console. Explain how words are executed in order, operating on the stack, and that some words do runtime things while some are 'immediate words' that control the compilation process. Docbook usage: words are in function elements, registers etc. in varname, examples in programlisting. See document for standard wordlist table layout. Add rationales for things in note elements The core CPUs, registers, stacks, and memory Any HYDROGEN system as a whole has some key properties relating to its internal architecture. Core implementation parameters Name Restrictions Description Cell Size (bits) >=16, must be a power of two, must be a multiple of sixteen, and must be at least as large as an address The native integer size of the machine. Allocation Unit size 'byte' or 'cell' Whether an address refers to a cell-sized region of memory, or to a particular byte. Most modern architectures are byte-addressed. An allocation unit is the smallest unit of memory that can be allocated, and the smallest offset that can be added to a pointer. Allocation Unit is frequently abbreviated to AU.

A HYDROGEN system contains one or more CPUs. Each CPU has a set of registers, and set of stacks. All the CPUs in the system share a single memory, and memory is considered uniform, in that there is no special affinity between individual CPUs and individual regions of memory; applications should choose regions of memory without any concern for which CPUs may perform operations on that region of memory. CPU registers are one cell wide. CPU registers Name Function IP Instruction Pointer; the address in memory of the next instruction to execute A and B Address holding registers, used by application code as a store for memory addresses IPL Interrupt priority level (see )

The implementation may or may not check for stack overflows and underflows; if it does, it should panic (see ) in response to such eventualities. If it does not, then the results of an overflow or underflow are undefined. CPU stacks Name Element type Minimum capacity Function D Cells 64 The D, or Data stack, is used by application code as a working store for data, much as most CPU architectures use general-purpose registers. R Cells 64 The R, or Return stack, is used to record return addresses when subroutines or interrupt handlers are executed. It may also be used by the runtime system and applications as a secondary working store; as such, there are more strict requirements concerning its use, in order to prevent clashes. Also, unlike the Data stack, the Return stack must be stored in memory; in other words, every cell on the stack must have an address.

Each CPU executes a stream of instructions from the memory. The format of instructions is implementation-dependent, but the IP register of any CPU always points to the next instruction it will execute, apart from changes caused by interrupts or flow control instructions. Data Types As well as the Cell, which is the fundamental word size of the underlying implementation, various other data types are defined by HYDROGEN. All data types are given a short codename. Data Types Code Name Description c Cell A single cell. When referred to merely as a cell, no information about the intended interpretation of its contents is implied; it is simply a string of bits. s Signed Integer A single cell, interpreted as a two's complement signed binary integer. u Unsigned Integer A single cell, interpreted as an unsigned binary integer. ch Character A single cell, interpreted as a character. The mapping between characters and Character values is not defined by this specification; instead, we provide a set of words to perform operations on characters, to abstract out their representation. 2c Two Cells Two adjacent cells, with no information about their interpretation as anything other than a string of bits implied. The first cell of the pair is the one at the lowest address when in memory, or the deepest when on the stack. p Pointer A single cell, containing a memory address. Pointers are unsigned cells, specifying an offset in allocation units relative to the start of addressible memory. Note that the size of an allocation unit (byte, cell, or something else) is unspecified. b Block Two adjacent cells; the first one being a pointer, the second being an unsigned integer specifying the number of AUs of memory, from that address onwards, are part of the block. 'b' can be considered shorthand for 'p u', with the added implication of the relationship between the two cells. bool Boolean A single cell. If the cell consists entirely of zero bits, then the boolean represents the False value. Otherwise, it represents True. h Handle A single cell containing a reference to an executable subroutine. The exact meaning of the bits that comprise the handle are implementation dependent; for example, it could be an index into a table, a pointer to the executable code in memory, or a pointer to a memory block containing various fields, one of which is a pointer to the actual code. h(a b c - d e f) Handle with declared interface This is a handle, with an added assertion about the stack effect of the subroutine when invoked. The example given above states that the subroutine expects a value of type 'c' to be at the top of the D stack upon entry, then a value of type 'b', then a value of type 'a', and that when it returns, a value of type 'f' will be at the top of the stack, then a value of type 'e', then a value of type 'd'. u8 Eight bit unsigned integer u16 16 bit unsigned integer u32 32 bit unsigned integer s8 Eight bit two's complement signed integer s16 16 bit two's complement signed integer s32 32 bit two's complement signed integer c8 Eight bits. No interpretation of the meaning of those bits is implied. c16 16 bits. No interpretation of the meaning of those bits is implied. c32 32 bits. No interpretation of the meaning of those bits is implied.

Fixed-width integer types The types u8, u16, u32, s8, s16, and s32 are known as the fixed-width integer types. The number of cells taken up by each of these types is implementation-specific, except where specified otherwise. In particular, the results of arithmetic operations on them are not bounded to remain within the range that might be expected. An implementation must provide at least the specified number of bits, but may provide more. Therefore, the result of performing u8 addition on the u8 values 128 and 128 may either be 256 or 0. Provisions are made (see ) to normalise such values into the legal range of any fixed-width integer type. Arithmetic Operators A full range of arithmetic operations on integers is provided. Versions of each operation exist for each integer type: the fixed-width types, and upon cells interpreted as integers. Where the signedness of integers is irrelevant - operations such as addition, subtraction, and bitwise operations - we use the prefix 'c'; where signedness is relevant, we use the prefixes 'u' or 's' as appropriate. The prefix is followed by a bit-width number for fixed-width operations, which is omitted for cell-width operations. Note that pointers are just cell-width unsigned integers, so the cell-width operations may be used to perform pointer arithmetic. However, note that pointers address memory in allocation units, and the size of an allocation unit may vary between implementations, so the effect of adding an integer constant to a pointer is not specified. See for words that return the sizes of various objects in allocation units for pointer arithmetic. In the following table of integer arithmetic words, only the cell-width variants are listed, for brevity. For every word listed, four variants actually exist: the cell-width variant, as listed, plus fixed-width variants for 8, 16, and 32 bits. Implementations supporting the LARGE-INT or FLOAT features will also provide other variants of some or all words, too. Integer Arithmetic Words Name Stack Effect Description c+ c.1 c.2 -- c.3 c.3 = c.1 + c.2 c- c.1 c.2 -- c.3 c.3 = c.1 - c.2 cneg c.1 -- c.2 c.2 = -(c.1) u* u.1 u.2 -- u.3 u.3 = u.1 * u.2 u** uN.1 uN.2 -- u[2N].3 u.3 = u.1 * u.2. This is defined ONLY for fixed-width types; the result is of double the bit width of the two inputs, so overflow cannot occur. Eg, u8** multiples two u8 numbers and produces a single u16. u/ u.1 u.2 -- u.3 u.3 = u.1 / u.2, rounding towards 0 u*/ u.1 u.2 u.3 -- u.4 u.4 = u.1 * u.2 / u.3, rounding towards 0, and with a double-width intermediate product so that u.1 * u.2 does NOT overflow. umod u.1 u.2 -- u.3 u.3 = u.1 % u.2 u/mod u.1 u.2 -- u.3 u.4 u.3 = u.1 % u.2, u.4 = u.1 / u.2, rounding towards 0. s* s.1 s.2 -- s.3 s.3 = s.1 * s.2 s** sN.1 sN.2 -- s[2N].3 s.3 = s.1 * s.2. This is defined ONLY for fixed-width types; the result is of double the bit width of the two inputs, so overflow cannot occur. Eg, s8** multiples two s8 numbers and produces a single s16. s/ s.1 s.2 -- s.3 s.3 = s.1 / s.2, rounding towards 0 s*/ s.1 s.2 s.3 -- s.4 s.4 = s.1 * s.2 / s.3, rounding towards 0, and with a double-width intermediate product so that s.1 * s.2 does NOT overflow. smod s.1 s.2 -- s.3 s.3 = s.1 % s.2 s/mod s.1 s.2 -- s.3 s.4 s.3 = s.1 % s.2, s.4 = s.1 / s.2, rounding towards 0. cAND c.1 c.2 -- c.3 c.3 = c.1 AND c.2 cOR c.1 c.2 -- c.3 c.3 = c.1 OR c.2 cXOR c.1 c.2 -- c.3 c.3 = c.1 XOR c.2 cNOT c.1 -- c.2 c.2 = NOT c.1 c1ROL c.1 -- c.2 c.2 = c.1 rotated left 1 bit c1ROR c.1 -- c.2 c.2 = c.1 rotated right 1 bit c1SHL c.1 -- c.2 c.2 = c.1 shifted left 1 bit, with a zero for the least significant bit u1SHR u.1 -- u.2 u.2 = u.1 shifted right 1 bit, with a zero for the most signiciant bit s1SHR s.1 -- s.2 s.2 = s.1 arithmetic-shifted right 1 bit, with the most significant bit preserved and duplicated right one place cROL c.1 u8 -- c.2 c.2 = c.1 rotated left u8 bits cROR c.1 u8 -- c.2 c.2 = c.1 rotated right u8 bits cSHL c.1 u8 -- c.2 c.2 = c.1 shifted left u8 bits, with zeroes for the least significant bits uSHR u.1 u8 -- u.2 u.2 = u.1 shifted right u8 bits, with zeroes for the most signiciant bits sSHR s.1 u8 -- s.2 s.2 = s.1 arithmetic-shifted right u8 bits, with the most significant bit preserved and duplicated right. c-getbit c u8 -- bool Extracts bit u8 of c, and returns TRUE if the bit is set, FALSE otherwise. c-setbit c.1 u8 bool -- c.2 If bool is true, c.2 = c.1 with bit u8 set, otherwise c.2 = c.1 with bit u8 cleared. c=? c.1 c.2 -- bool Returns TRUE of c.1 and c.2 are identical c/=? c.1 c.2 -- bool Returns TRUE of c.1 and c.2 are not identical u<? u.1 u.2 -- bool Returns TRUE of u.1 is less than u.2 u<=? u.1 u.2 -- bool Returns TRUE of u.1 is less than or equal to u.2 u>? u.1 u.2 -- bool Returns TRUE of u.1 is greater than u.2 u>=? u.1 u.2 -- bool Returns TRUE of u.1 is greater than or equal to u.2 u<? u.1 u.2 -- bool Returns TRUE of u.1 is less than u.2 u<=? u.1 u.2 -- bool Returns TRUE of u.1 is less than or equal to u.2 u> u.1 u.2 -- bool Returns TRUE of u.1 is greater than u.2 u><? u.1 u.2 u.3 -- bool Returns TRUE of u.1 is greater than or equal to u.2, and less than or equal to u.3 (eg, within the inclusive range u.2..u.3) u<>? u.1 u.2 u.3 -- bool Returns TRUE of u.1 is less then u.2, or greater than u.3 (eg, NOT within the inclusive range u.2..u.3) s<? s.1 s.2 -- bool Returns TRUE of s.1 is less than s.2 s<=? s.1 s.2 -- bool Returns TRUE of s.1 is less than or equal to s.2 s>? s.1 s.2 -- bool Returns TRUE of s.1 is greater than s.2 s>=? s.1 s.2 -- bool Returns TRUE of s.1 is greater than or equal to s.2 s<? s.1 s.2 -- bool Returns TRUE of s.1 is less than s.2 s<=? s.1 s.2 -- bool Returns TRUE of s.1 is less than or equal to s.2 s>? s.1 s.2 -- bool Returns TRUE of s.1 is greater than s.2 s><? s.1 s.2 s.3 -- bool Returns TRUE of s.1 is greater than or equal to s.2, and less than or equal to s.3 (eg, within the inclusive range s.2..s.3) s<>? s.1 s.2 s.3 -- bool Returns TRUE of s.1 is less then s.2, or greater than s.3 (eg, NOT within the inclusive range s.2..s.3)

Many of the variants of these operations will simply be aliases to each other. For example, since cells have a minimum width of 16 bits, c+, c8+, and c16+ are quite likely to all refer to the same actual operation. If cells are 32 bits in size, then c32+ may also be an alias. However, the implementation is free to implement these operations individually. Stack Control The fixed-width integer system exists to permit software to be precise about the precision it requires for different quantities (within the limitations of the available granularity), while remaining independent of the underlying cell width, and permitting the trivial generation of efficient code. However, not knowing the size of different objects on the stack - or, even, if they are actually all on the same stack, or split into different stacks for different sizes or types of objects - would make stack rearrangement nonportable in an environment with the traditional FORTH stack manipulation words like DROP and OVER that operate on single cells. Therefore, we provide a single powerful portable stack rearrangement operation that, during code generation, may be analysed with implementation-specific knowledge about the implementation of the stack to generate appropriate and efficient code. The word is shuffle[, and it is a parsing immediate word: that is, rather than a call to the word being compiled, the word is executed immediately by the compiler, and proceeds to parse from the input stream itself and process what it finds. Namely, it parses the input stream for a variant on stack picture notation, until it encounters a closing ] marker. The first part of the stack picture, up to the -- marker, represents the original state of the stack. It consists of one or more space- delimited entries, each of which consists of a type code, followed by a period, then a unique whitespace-terminated name which serves as an identifier for this entry . After the -- marker is found a list of zero or more space-separated entries, which specify the desired state of the stack after the shuffle. Each name references an input object previously specified; using a name not defined to the left of the marker is illegal. What's more, a name may be used zero or more times. Failing to use a name on the right hand side causes the corresponding entry on the left hand side to be discarded. Using a name more than once causes it to be duplicated as many times as required. Using the same name with a different type requests a type conversion. TODO: List legal type conversions. Talk about what happens when we go from c to u8 and there's an overflow (answer: implementation defined! See normalisation words!) For example, shuffle[ u8.1 c.2 c.3 -- c.2 c.2 ] will expect the stack to have two cells on top, then an unsigned 8-bit integer. After the operation has executed, the stack will simply have two copies of the lower of the two cells, with the other cell and the 8-bit integer having been discarded. The other stack-noise-avoidance words is begin[, which again takes a stack input picture in the notation used by shuffle[, but without a -- marker and output specification; instead, it has a ] marker to end the stack picture then arbitrary source code up to a end marker. The begin[ ... ] consumes the stack items depicted, taking them off of the stack and placing them in a "stack frame" elsewhere; from there up to the end, references to the names of stack items defined in the begin[ ... ] block cause a copy of that stack item to be pushed onto the data stack. After the end the names are out of syntactic scope, and space reserved for the "stack frame" is discarded. TODO: Refer to the Return stack memory allocation words below to explain where the stack frame is declared. Do we make the details entirely implementation-private, or export the presence of a stack frame pointer register in the CPU and provide lower-level words to mess with it? Or would it be better to just say that begin[ merely might use the return stack, but could be allocating things to registers for speed? Normalisation Implementations are free to implement the fixed-width integer types with more bits than necessary, but fixed-width arithmetic operations are not required to force their results to be valid values of the fixed-width integer type in question. This allows words such as u8+ to be implemented as aliases to c+. In situations where it is required that values be within the normal range of the type rather than the implementation-defined range (which may be larger), the following words are provided to restrict a value to within that range: Fixed-Width Integer Normalisation Words Name Stack Effect Description u8normalise u8 - u8 Discard all but the lower 8 bits of the top of the stack s8normalise s8 - s8 Discard all but the lower 8 bits of the top of the stack u16normalise u16 - u16 Discard all but the lower 16 bits of the top of the stack s16normalise s16 - s16 Discard all but the lower 16 bits of the top of the stack u32normalise u32 - u32 Discard all but the lower 32 bits of the top of the stack s32normalise s32 - s32 Discard all but the lower 32 bits of the top of the stack

Memory Access Basic memory access All CPUs share a single random-access memory, consisting of a series of allocation units. Each allocation unit has a unique address, which is one of the legal values of the pointer type. Pointers are the same size as a cell. Not all pointer values are valid, however. For a start, the system is unlikely to have precisely enough memory to use the entire range of the pointer type; secondly, the available memory need not be in one contiguous block. The HYDROGEN runtime system is responsible for knowing which ranges of pointer values are valid, and then allocating contiguous ranges for different functions. Attempts to read, write, or execute code from an invalid pointer may result in an invalid access panic if the system provides such a facility, or may have undefined effects if not. The size of an allocation unit is not specified. In practice, many systems will have bytes as the units of allocation, meaning that a pointer is capable of identifying any byte in memory, and the individual bytes of a multi-byte quantity can be addressed. However, some systems may have cells or other multi-byte objects as the fundamental unit of allocation. In which case, arrays of smaller types may actually be stored with an entire cell used for each. Applications need not be aware of this distinction, as they can use the array-indexing operations such as 8[] to calculate indices into arrays. The memory system does not care about the signedness of integers stored in it, or indeed anything about them beyond their size; as such, all of the memory operations defined in the base operate in terms of five fundamental types: c8, c16, c32, c, and ch. All other core types are defined as being the same size as at least one of the types in that list. If a pointer is valid for loading and storing one type, it may not necessarily be valid for loading and storing values of another type. For example, systems with small allocation units such as bytes may have alignment restrictions, where byte-sized quantities may be read from any location but 32-bit quantities may only be read from addresses divisible by four. In which case, a pointer may be valid for loading eight-bit quantities but not larger quantities. Attempts to read, write, or execute code from a pointer that is not valid for the operation being performed may result in an invalid alignment panic if the system provides such a facility, or may have undefined effects if not. A pointer may be valid for reading, but not for writing. For example, there may be read-only regions of memory. Attempts to write to pointers that are not valid for writing may result in a write protection panic if the system provides such a facility, or may have undefined effects if not. The CPU model provides two address registers, A and B, which store pointers. These are used as the addresses for memory loads and stores, rather than a pointer on the stack, as this facility tends to make operations on memory blocks clearer for humans to understand. Note that there are versions of the reading and writing words that advance A or B to the next quantity of the appropriate size in memory. These are intended for implementing loops that examine each element of an array in turn. The advancing versions of words simply add the size of the object in question to the register, with no knowledge of pointer validity restrictions; using such words to advance a pointer beyond the end of the region of memory allocated by the system for the array may result in a pointer that is invalid. Conventionally, the words specified in this chapter do not alter the contents of A or B unless otherwise specified, but any other words are not expected to preserve A or B. Memory access words are defined in terms of a "selected address register", which may be A or B; the selection of an address register is not done at run time, but is a syntactic concept. Memory access words must be prefixed immediately with at least A or B, and optionally additional modifiers from the table. Using the prefixes on a word other than one of the memory access words defined in this chapter is illegal, as is using such a word without at least A or B. Address Mode Selection Words Name Stack Effect Description A - An immediate word that selects A as the selected address register for the subsequently compiled memory access word to use, with no post- or pre- increment or decrement. B - An immediate word that selects B as the selected address register for the subsequently compiled memory access word to use, with no post- or pre- increment or decrement. |++ - An immediate word that modifies the subsequently compiled memory access word to increment the selected address register by the size of the object accessed after the word has completed. |-- - An immediate word that modifies the subsequently compiled memory access word to decrement the selected address register by the size of the object accessed after the word has completed. ++| - An immediate word that modifies the subsequently compiled memory access word to increment the selected address register by the size of the object accessed before the word executes. --| - An immediate word that modifies the subsequently compiled memory access word to decrement the selected address register by the size of the object accessed before the word executes. bulk-memory-operation - An immediate word that gives the implementation a hint that the next memory access is likely to be a small part of a large series of memory accesses, and thus unlikely to exhibit high locality. The implementation may ignore this or, if the implementation supports it, it may modify the implementation of the subsequently compiled memory access word in a way that will be more efficient for large bulk operations on memory (for example, by bypassing data caches). Applications should prefix memory operations such as large array scans with this word.

TODO: Write some examples of the above. Eg, A |++ 8! A |++ 8! A |++ 8! writes the top three bytes on the stack to the addresses at A, A+1, and A+2, in order. Memory Reading Words Name Stack Effect Description 8@ - c8 If the selected address register contains a valid pointer, returns the contents of the eight-bit quantity stored at that location in memory. 16@ - c16 If the selected address register contains a valid pointer, returns the contents of the sixteen-bit quantity stored at that location in memory. 32@ - c32 If the selected address register contains a valid pointer, returns the contents of the thirty-two-bit quantity stored at that location in memory. c@ - c If the selected address register contains a valid pointer, returns the contents of the cell stored at that location in memory. ch@ - ch If the selected address register contains a valid pointer, returns the contents of the character stored at that location in memory.

Memory Writing Words Name Stack Effect Description 8! c8 - If the selected address register contains a valid pointer, writes the eight-bit quantity supplied to that location in memory. 16! c16 - If the selected address register contains a valid pointer, writes the 16-bit quantity supplied to that location in memory. 32! c32 - If the selected address register contains a valid pointer, writes the 32-bit quantity supplied to that location in memory. c! c - If the selected address register contains a valid pointer, writes the cell supplied to that location in memory. ch! ch - If the selected address register contains a valid pointer, writes the character supplied to that location in memory.

Array indexing words use the currently selected address register, but ignore the pre/post increment/decrement settings. Array Indexing Words Name Stack Effect Description 8[]@ u - c8 Given a pointer to the start of an array of eight-bit quantities stored in the selected address register, and an index into that array, reads the value of the indexed element of the array onto the top of the stack. 8[]! c8 u - Given a pointer to the start of an array of eight-bit quantities stored in the selected address register, and an index into that array and a value, writes the value into the indexed element of the array. 16[]@ u - c16 Given a pointer to the start of an array of sixteen-bit quantities stored in the selected address register, and an index into that array, reads the value of the indexed element of the array onto the top of the stack. 16[]! c16 u - Given a pointer to the start of an array of sixteen-bit quantities stored in the selected address register, and an index into that array and a value, writes the value into the indexed element of the array. 32[]@ u - c32 Given a pointer to the start of an array of thirty-two-bit quantities stored in the selected address register, and an index into that array, reads the value of the indexed element of the array onto the top of the stack. 32[]! c32 u - Given a pointer to the start of an array of thirty-two-bit quantities stored in the selected address register, and an index into that array and a value, writes the value into the indexed element of the array. c[]@ u - c Given a pointer to the start of an array of cell-sized quantities stored in the selected address register, and an index into that array, reads the value of the indexed element of the array onto the top of the stack. c[]! c u - Given a pointer to the start of an array of cell-sized quantities stored in the selected address register, and an index into that array and a value, writes the value into the indexed element of the array. ch[]@ u - ch Given a pointer to the start of an array of characters stored in the selected address register, and an index into that array, reads the value of the indexed element of the array onto the top of the stack. ch[]! ch u - Given a pointer to the start of an array of characters stored in the selected address register, and an index into that array and a value, writes the value into the indexed element of the array.

Array Sizing Words Name Stack Effect Description 8s->aus u - u Given a number, return the number of allocation units required for an array of that many eight-bit quantities. 16s->aus u - u Given a number, return the number of allocation units required for an array of that many sixteen-bit quantities. 32s->aus u - u Given a number, return the number of allocation units required for an array of that many thirty-two-bit quantities. cs->aus u - u Given a number, return the number of allocation units required for an array of that many cells. chs->aus u - u Given a number, return the number of allocation units required for an array of that many characters.

Miscellaneous Memory-Related Words Name Stack Effect Description >A p - Loads the A register with a pointer from the stack. <A - p Stores the contents of the A register to the stack, without modifying A. >B p - Loads the B register with a pointer from the stack. <B - p Stores the contents of the B register to the stack, without modifying B. block-copy b p - (may corrupt A or B) Copies the identified block of memory to the identically-sized region of memory starting at the supplied pointer. If the blocks overlap, the contents of the destination block will end up being identical to the contents the source block had before the copy. block-copy! b p - (may corrupt A or B) Copies the identified block of memory to the identically-sized region of memory starting at the supplied pointer. If the blocks overlap, the final state of the target block is undefined. background-block-copy b p h(-) - (may corrupt A or B) Starts copying the identified block of memory to the identically-sized region of memory starting at the supplied pointer. If the blocks overlap, the contents of the destination block will end up being identical to the contents the source block had before the copy. The word may return before the copy is complete if the system has the facility to perform background copies. When the copy is complete, the supplied argumentless sburoutine is invoked (but see for notes on how the THREADS feature may affect this). Between the execution of this word and the start of the handler being called, the state of every aligned cell in memory overlapping the target region is undefined. If the source region is written to before the copy is complete, then the final state of the target memory region is undefined. background-block-copy! b p h(-) - (may corrupt A or B) Starts copying the identified block of memory to the identically-sized region of memory starting at the supplied pointer. If the blocks overlap, the final state of the target block is undefined. The word may return before the copy is complete if the system has the facility to perform background copies. When the copy is complete, the supplied argumentless subroutine is invoked(but see for notes on how the THREADS feature may affect this). Between the execution of this word and the start of the handler being called, the state of every aligned cell in memory overlapping the target region is undefined. If the source region is written to before the copy is complete, then the final state of the target memory region is undefined.

Multi-Accessor Memory Semantics "Normal Semantics" refers to the idea that if one writes a value to a valid location in memory, subsequent reads of that location should return the new value, previous reads return the old value, and never is any other value ever read. However, providing such semantics in multiprocessor systems can be expensive, so the HYDROGEN virtual machine does not guarantee it will be so. Additionally, some implementations may have elements of the system that behave like additional CPUs, despite not being capable of executing code; this includes background memory operations such as background-block-copy!, and potentially implementation-dependent things such as memory-mapped hardware or direct access to system memory from peripherals (a form of background-block-copy!). In this section, we will consider system CPUs and any other implementation-dependent subsystems that might independently and asynchronously write to memory as "writers", and any CPU or other subsystem that might read memory as a "reader". It is guaranteed that if a writer writes to a region of memory the size of a cell or larger, then unless another writer modifies that memory region, then if the writer is also a reader, it will read back exactly the data that was written. In other words, normal semantics apply when the same CPU or subsystem is the only reader and writer using a cell-sized or larger region of memory. For regions smaller than a cell, this may not hold. If two different writers are modifying objects smaller than a cell that happen to be located within the same aligned cell-sized region of memory, then the result of subsequent reads of the entire cell by any reader are undefined. This is because implementations may implement writes smaller than a cell as a non-atomic read-modify-write, in which case, two of them overlapping may succeed in one of the writes appearing to succeed but being invisible undone; we hedge our bets by specifying that the entire cell is completely undefined in such a situation. However, the undefined status of that cell may be cleared by a normal write to it. Also, if a writer writes to any sized region of memory, the contents of that region of memory as seen by other readers is undefined until the writer "commits" the write. This is why the target region of background-block-copy! is undefined until the completion-notification handler is called; the background block copy might only commit the entire block write just before calling the handler. Note that background operations such as block copies may be implemented using special hardware, which may be its own reader in order to read the source block of memory; in such implementations, the words that initiate background operations will automatically commit any uncommitted writes from the CPU that initiates the operation. However, if any writes occur to that region while the copy is in progress, then the background copier may then encounter regions of memory with uncommitted writes to them, producing undefined results. In practice, this means that regions of memory that are shared between multiple writers need to be handled carefully. Some means must be taken to ensure that no readers attempt to read from the memory region between writing and committing; either locking or atomic writes (which are discussed below) or some higher-level means of synchronisation (as with background block copies; the background copy is not initiated until the source block of memory is ready to be copied). Locking One way of ensuring that two indepedent CPUs do not use some shared resource (such as a region of memory) at the same time is locking. The HYDROGEN virtual machine provides only spin locks; issues such as suspending a blocked thread and rescheduling it when it gains the lock are the scope of the application. TODO: Document a simple spin-lock system Atomic Writes One way of ensuring that no readers read memory regions with uncommitted writes pending (resulting in undefined state) is to use atomic writes. There are operations that perform a write and commit it in one indivisible operation, such that no reads or other writes may occur between the write and the commit. TODO: Document atomic cell-sized writes, compare-and-swap, and a timestamped compare-and-swap that's ABA-proof. Atomic Data Structures TODO: Some platforms have hardware support for aiding queues, stacks, and dictionaries. Provide atomic versions of same. Portable implementation can use the atomic write operations or spin locks. Make no claims about lock-freedom, but we hope they are. The Return Stack TODO: As mentioned above in the discussion of the CPU stacks, the Return stack must live in memory, because it must be possible to use it to allocate memory for a dynamic scope. Provide some words to allocate and free memory this way. Subroutines Subroutine handles represent executable code, stored somewhere, as a cell-sized value. A subroutine contains code, which is notionally a list of fundamental operations to be executed in order, but the exact representation of which is implementation-dependent. TODO: Words to call a subroutine given its handle (indirect call), to tail-call, etc. TODO: Flow control words to conditionally or repeatedly call a subroutine, etc. Compilation New subroutines may be generated by compilation of code, by feeding fundamental ops to a compiler context (which wraps a heap stream), then getting a handle out at the end. TODO: Fundamental ops are literal pushes, and literal subroutine calls. Jumps (aka tail calls), indirect calls, and so on are modelled as calls to special subroutines. Smart implementations will peephole-optimise calls to such special 'primitives'. Interpretation TODO: Describe the interpreter mechanism, mainly revolving around dictionaries, parsing words, etc. See HYDROGEN code generation post on my blog. ( word invokes the compiler up to a closing ) to generate a subroutine and pushes its handle on the stack (or, in compile mode, compiles code to push it on the stack). Document that core HYDROGEN words defined in this document appear in the a dictionary which is bound to the name core-dictionary in itself. Interpreter context contains a list of dictionaries to look stuff up in (as a kind of namespace stack that may be modified during interpretation), and a dictionary to bind new definitions into by default, which is always implicitly on the top of the namespace stack. Decide if we want an implicit thread-global "current interpret context" register (with an interpreter context containing a saved copy of the old value, so they form a stack), to make it easier to nest meta-operations. The Heap HYDROGEN itself provides a very simple memory allocation model suitable for bootstrapping or using as the sole memory allocate in a simple system, but with hooks so that it can be extended to support more general heaps built on top. The heap is a region of memory subject to dynamic allocation; and memory can be allocated using two main interfaces. Block allocation is used when the size of memory region desired is known in advance; you request a certain number of AUs and receive either a pointer to a cell-aligned block of that size, or a null pointer to indicate an out-of-memory error. Stream allocation, on the other hand, involves creating a stream, writing data items to it, then 'sealing' the stream, whereupon the system gives you back a pointer to the start of the stream in contiguous memory. However a block is allocated, it may be returned to the heap by giving the start and length of the block. During stream allocation, the application may request the current offset into the stream, in AUs; after the stream is sealed, it may then refer back to those offsets by adding them to the initial pointer. TODO: Write up words to do all this. The bootstrap heap engine is very simple. It starts off with a list of free memory regions, provided by an implementation-dependent mechanism; it picks the largest free block to use for allocation, and constructs a free list from the others, by placing the length of the block in the first cell and a pointer to the next block (or null if this is the last in the chain) in the second cell (blocks of less than two cells in size are just discarded). It has three variables, which it makes available in the core dictionary as the words HEAP-POSITION, HEAP-LIMIT and HEAP-FREELIST. HEAP-POSITION is a pointer initialised to the start of the block used for allocation, and HEAP-LIMIT points to the last AU of the block. HEAP-FREELIST points to the head of the free list. ALLOC then works by copying HEAP-POSITION, adding the requested number of AUs to it (rounded up to cell-align the result), then returning the original HEAP-POSITION; unless this would exceed HEAP-LIMIT. FREE takes the specified position and length of block, and adds them to the head of the freelist. Stream allocation works by using a single global stream allocation start pointer which is private to the implementation; supplied values are written to HEAP-POSITION, moving it along as they go, and calculating offsets by subtracting the start pointer from HEAP-POSITION; sealing the stream simply returns the saved start pointer and the final offset, then advances HEAP-POSITION to cell-align it. Clearly, only one heap stream may be written to at once, and block allocation may not occur during heap allocation. This must be enforced by the application, or it can replace the heap implementation; all the allocation words are actually indirect handles which can be modified to point to application-supplied allocation primitives. If a more advanced heap package is installed, it will find the free list ready for it to use in HEAP-FREELIST, plus the space between HEAP-POSITION and HEAP-LIMIT. If you want to switch back to the original heap package, you can convert whatever free space you have into a free list and call it HEAP-FREELIST, apart from the largest block, which you can call HEAP-POSITION...HEAP-LIMIT, and reinstate the heap words. Interrupts Interrupts are asynchronous events that interrupt the normal flow of execution on a CPU. A number of events may trigger an interrupt, ranging from external hardware events (the completion of an output operation, the arrival of a new input, or any other event indicating that the hardware needs attention) to software events that cause the CPU to divert the normal flow of execution, such as a panic. For each interrupt source present in the system, there will be a word which registers a handler for that interrupt. The specification of the interrupt source will state what arguments are to be passed to the handler subroutine, and in some cases, the subroutine will be expected to return results. If no subroutine (or a null handle) is registered for an interrupt, then the interrupt should be ignored. The implementation may effect this by disabling the interrupt in hardware, or if this is not possible, providing a null handle that returns instantly with some sensible default result. If the THREADS feature is provided, then interrupt handlers may use the INTERRUPT-TASK-SWITCH word to alter the CPU state saved by the interrupt, thus effecting task switches. With the exception of panic interrupts, which are covered later, all interrupts have a priority which is assigned at the same time as the handler. At the same time, each CPU has a priority level as part of its state, stored in the IPL register. When an interrupt occurs, the system chooses a CPU with priority level lower than the priority level of the interrupt to handle it. The currently executing code on that CPU is suspended, the CPU's priority is set to the priority of the interrupt, and the interrupt handler executes. When execution completes, the saved CPU state (including the priority level) is restored. If no CPU is currently able to handle the interrupt, then it must be queued, and executed on the first CPU that is able to handle it. Some implementations do not have the facility to reprogram the priorities of their interrupt hardware; many have priorities assigned to their interrupts in hardware. And some systems may have no notion of interrupt priority. In these cases, the implementation must emulate interrupt priorities in the platform- dependent interrupt handler stubs, by use of a CPU priority level stored in the thread state and a queue of pending interrupts. When the stub runs, it checks to see if the interrupt may be serviced on this CPU by comparing the current priority to the interrupt priority, and if not, queue the interrupt for later. Whenever the CPU's current priority is lowered (be it by a load of a saved CPU state or an explicit set-ipl or exchange-ipl), the queue of pending interrupts should be checked for interrupts that are now eligible to run, and their handlers invoked. Interrupt priorities are specified as unsigned cells, which almost certainly provides a much finer resolution of priority specification than any underlying hardware. The implementation must map from the priority specifications to priority levels supported by the hardware, maintaining the relative ordering of priorities. In which case, the cell-sized IPL register must still be maintained and contain the full cell-sized value specified by the application, but on every setting of IPL, the hardware IPL must be calculated and stored into the underlying "real" IPL register. Interrupt priority control words Name Stack Effect Description exchange-ipl u - u Sets the IPL register to the chosen value, and returns the old value. set-ipl u - Sets the IPL register to the chosen value. get-ipl - u Returns the contents of the IPL register.

Panics A panic is a particular type of interrupt raised by the system when an unresolvable situation occurs. When the panic interrupt handler is entered, a panic code is supplied as the single argument and the saved CPU state is subsequently invalid. Returning from the panic handler without having used the BOOT-THREAD or INTERRUPT-TASK-SWITCH words to switch to a valid state will cause another panic. TODO: Think about how to handle panics in minimal systems with no THREADS feature: Just REBOOT/TERMINATE/SHUTDOWN? The panic interrupt handler will always run on the CPU upon which the panic occurred, with the code that caused the panic being the saved CPU state in effect. Panic interrupt handlers are invoked immediately upon a panic-causing event, regardless of the current CPU priority. It is expected (but not required), therefore, that panic handlers will either suspend the panicing CPU with HALT, or terminate the thread that caused the panic and then pass control to the scheduler to switch to a new thread. If we defined a DEBUG feature, then as part of that we may define ways to 'repair' CPU states that have suffered a panic and continue execution. There is a single system-wide panic handler shared by all CPUs. Panic interrupt handler control words Name Stack Effect Description set-panic-handler h(c - ) - Use the given handle as the panic interrupt handler on all CPUs from this point onwards.

TODO: Find all mentions of panics in this document, and construct a table of panic codes here. System Lifecycle Some external event initiates startup of the system. For bare-metal systems, power is supplied to the system, initiating the implementation-dependent boot process. For hosted systems, some software action initiates the HYDROGEN runtime system's startup. The nature of the boot process is platform and implementation dependent, but it will involve using some set of "boot files" from a mass storage device, to initialise the basic HYDROGEN system, then load the application. By "platform", we mean the underlying platform upon which the HYDROGEN implementation runs; this may be bare metal of a certain type, or a virtual machine, or a host operating system. The platform defines the initial steps of the boot process, the format the HYDROGEN kernel must be in to be executed, and the resources the HYDROGEN kernel has available to it. By "implementation", we mean the precise version of the HYDROGEN kernel that runs on top of the platform and provides the HYDROGEN virtual machine. Different implementations on the same platform may provide different interfaces for extensions and drivers, or have a different format of configuration. For a bare-metal system, the boot files may consist of a number of stages of boot loader; perhaps a small boot sector, a larger second-stage loader, then the HYDROGEN kernel and perhaps a configuration file describing details of available hardware that cannot be gleaned through autoconfiguration, perhaps a set of extension files containing additional drivers, then finally, the application. For an embedded system, there might be only one physical boot file, the contents of a FLASH memory; but it will still logically consist of HYDROGEN kernel and application, at least. For a hosted system, there might be a platform-specific HYDROGEN kernel executable, then a number of drivers, a configuration file, and the application. In most situations, the system has the ability to replace its own boot files. Bare-metal systems can rewrite the boot sector and install new boot loaders and kernels; embedded systems may be able to re-FLASH themselves (but may not); and hosted systems may (subject to administrative restrictions) be able to replace their own binaries and configuration. Either way, the HYDROGEN boot process is responsible for loading the implementation-dependent installation-independent HYDROGEN kernel, loading any implementation-dependent installation-dependent additional drivers or configuration files, then loading the implementation-independent installation-independent application and its implementation-independent installation-dependent configuration. The nature of the kernel, additional drivers, and kernel configuration files is implementation-dependent, but an abstract interface for replacing them may be made available by implementations via the BOOT-MEDIA feature describe in , along with more concrete interfaces for replacing the application and its configuration. The application and its configuration must be provided as one or more blocks of characters, with an ordering. If the BOOT-MEDIA feature is provided, then the identification of these blocks and their order of loading is as defined by the feature; if not, then an implementation-dependent mechanism for providing this series of blocks of characters in the correct order must be provided. Once the kernel and its drivers and configuration have been loaded, providing a basic system in an initial state with all the words defined in this document loaded into an initial wordlist, one CPU is started while any others remain suspended; this CPU then proceeds to interpret the application in the environment containing the words specified in this document, which is bound to CORE-DICTIONARY within itself. CORE-DICTIONARY should not be modified; the application interpreter context binds new words into an initially empty dictionary referenced by APPLICATION-DICTIONARY. Once interpreting the application terminates, the system is ready to run. At this point, the implementation may store a snapshot of the state of the system that can be "cached" and restored in subsequent boots, as if none of the boot files are changed, then rebooting should consistently return to this state. It is recommended, but not required, that implementations provide some implementation-specific means of interrupting this process and dropping to an interactive prompt at the console for maintenance. At this point, if the DEVICE-TREE feature is present and the application has installed a device tree change interrupt handler, then the interrupt handler will now be invoked with the initial change from an empty device tree to the current state of the system; when that handler has completed, the word START, which the application should have defined in APPLICATION-DICTIONARY, is executed on one CPU; any other CPUs execute IDLE, which is described below. If the application expects to use multiple processor other than for interrupt handling, then it should use the CPU:POKE word or dynamic CPU device calls to send "software interrupt"s to the other CPUs in the system to get them working. If no START word has been defined by the application, the implementation will panic. We need a few utility words for large-scale system management. More fine-grained control can be had via power management devices, but these words provide a basic core that is adequate for many purposes. System Management Words Name Stack Effect Description IDLE Suspend execution on this CPU, ideally reducing CPU power usage, until an interrupt occurs; at which point the CPU is restarted to handle the interrupt, and suspends again when the interrupt terminates. The word executes this idle loop forever, unless an interrupt handler switches to another thread with INTERRUPT-TASK-SWITCH. HALT Suspend execution on this CPU, ideally reducing CPU power usage, but unlike IDLE, does not attempt to handle interrupts. REBOOT HALTs all CPUs, terminates all asynchronous operations, resets hardware devices, and initiates the boot process from scratch. TERMINATE HALTs all CPUs. What happens then is implementation-specific, but it must not reboot without some administrative action telling it to do so. SHUTDOWN HALTs all CPUs, then powers down the entire system if possible. PLATFORM-IDENTIFIER -- b Pushes a block which contains a character string, identifying the specific platform upon which HYDROGEN is implemented. IMPLEMENTATION-IDENTIFIER -- b Pushes a block which contains a character string, identifying the specific HYDROGEN implementation. SYSTEM-IDENTIFIER -- b Pushes a block which contains a character string, identifying this specific system. Ideally it should be immutable, such as a serial number. If there is no identifying information available, then an empty string should be returned.

Features TODO The REQUIRE word declares that the application requires a named feature This may be implemented as a kind of include file system, or the words for all supported features might already be present, and REQUIRE then merely becomes 'abort if the desired feature is not available' Features roughly split into two types: features that provide access to a certain type of hardware (eg, clocks) and features that may benefit from certain types of hardware (eg, DSP operations, virtual machines) but that can still be provided without them. The latter kind should have portable implementations provided in the specification, as reference implementations; most implementations can then provide these features by just including the portable implementation of each feature verbatim. However, where special hardware is available, the implementation can instead use that hardware. TODO Common Features FLOAT: Floating point arithmetic TODO. Provide IEEE 32- and 64-bit floats. And maybe a 'native float' with variable characteristics? LARGE-INT: Large Integers TODO Adds [us]{64..128}, [us]2 arithmetic TODO DSP-ARITH-n: DSP Arithmetic tools TODO n being the width of the DSP 'cells', eg 16 or 32. Implementations may provide more than one DSP-ARITH-n feature. TODO Saturating arithmetic - DSPn+ etc TODO Fixpoint arithmetic TODO Multiply-Add TODO cyclic buffers TODO 1d FFT, FIR, etc as well - common high level algorithms can be hardcoded in assembly for speed TODO THREADS: Process state manipulation The state of computation is modelled as the stacks and processor registers defined in along with the state of any background asynchronous operations such as background-block-copy, but as this is a virtual machine, the actual representation of that state is implementation-dependent. Therefore, a set of words is provided to load and store the state in a portable manner in order to facilitate task switching. If this feature is present, then an additional T register added to the virtual CPU pointers to a region of memory set aside for thread state, known as a Thread Control Block (TCB), which consists of any memory required by the runtime system for storing CPU context information, space to store any state that normally resides in other locations (eg, physical CPU registers), and a region for applications to store thread-local state in. A TCB starts with the platform-dependent context area, and the user area follows after that. Upon boot, each processor starts with their T register pointing to a primordial TCB which has a zero-length user area and may not even be able to store a CPU state; the initial contents of the T register must either be left in-place and the THREADS feature words not used, or replaced with the BOOT-THREAD word. A TCB is considered active if the T register of a CPU currently points to it. An active TCB may not completely represent the state of the CPU, as parts of it may be cached in physical CPU registers, so many words that modify TCB contents are illegal on active TCBs. A TCB still counts as active even if an interrupt handler is executing on that CPU. The application may allocated TCBs as it sees fit; however, the start of a TCB must be cell-aligned. THREADS feature words Name Stack Effect Description thread-context-size -- u Pushes the (constant) size of the implementation-dependent area of a TCB, in AUs, onto the stack. This can be used to compute the size of a TCB to allocate storage for it, by adding the amount of space required for the user area. thread-context-initialise p h -- Given a pointer to a region of memory at least thread-context-size AUs long, initialises the first thread-context-size AUs to represent an initial CPU state with empty stacks, zero A, B and IPL registers, no background operations in progress, and IP such that execution will start at the start of the supplied subroutine (which must never return). thread-context-push p c -- Given a pointer to an initialised inactive TCB, pushes the supplied cell onto the D stack of the stored CPU state. Intended for supplying parameters to the initial handler of a thread, but may be extended by a DEBUGGING feature into a suite of introspection tools to examine and modify all the contents of a TCB. thread-context-@ p u -- c Given a pointer to an initialised TCB, active or inactive, fetches the cell at the specified index (in cells) into the user area. Might be implemented as cs->aus thread-context-size + + >A A C@. thread-context-! c p u -- Given a pointer to an initialised TCB, active or inactive, stores the cell into the specified index (in cells) into the user area. Might be implemented as cs->aus thread-context-size + + >A A C!. T@ u -- c Fetches the cell at the specified index (in cells) into the user area of the TCB pointed at by T. T! u c -- Stores the supplied cell into the cell at the specified index (in cells) into the user area of the TCB pointed at by T. <T -- p Pushes the current value of T onto the stack. The TCB remains active. boot-thread p -- (never returns) Discards the current CPU state, sets T to the supplied pointer (which must refer to an inactive TCB), and loads the CPU state therein, making the TCB active. interrupt-task-switch p -- (never directly returns) This word may only be called from within an interrupt handler. It saves the CPU state as captured on entry to the interrupt handler into the TCB pointed at by T, making that TCB inactive, loads T with the supplied pointer (which must point to an inactive TCB), and exits the interrupt handler, returning to the state stored in the new TCB (which becomes active). task-switch p -- Saves the entirety of the current CPU state into the TCB pointed at by T, making that TCB inactive, loads T with the supplied pointer (which must point to an inactive TCB), and returns to the state stored in the new TCB (which becomes active).

CONTINUATIONS: Explicit continuations Closely related to the THREADS feature, the CONTINUATIONS feature provides explicit support for continuations. A continuation represents the future execution of code after the current word has returned. As such, it contains a subset of the state of the current thread. Also, a thread context represents a region of memory that is potentially continuously modified while a thread runs, since on most implementations it will contain the stacks (or at least the portions of them not cached in the CPU itself); while a continuation is a read-only object, permanently freezing a particular state of the system that can be "copied" into a thread context as many times as required in future. A continuation object is created by the MAKE-CONTINUATION word, which captures the current state of the stacks along with the contents of the IPL register. This is copied into a block of memory allocated from the heap, in an implementation-dependent format, and a pointer to same is returned. The heap space (and any other resources) consumed by a continuation object can be freed by passing the pointer to DESTROY-CONTINUATION. A continuation may be restored with the RESTORE-CONTINUATION word. This takes two arguments, the continuation pointer and a single cell. It saves the current return address from the R stack, then discards the current contents of the stacks and replaces them with those stored within the continuation, then pushes its single cell argument onto the data stack and returns to the saved return address. Note that this means that RESTORE-CONTINUATION returns to the word that called it, regardless of the contents of the R stack saved in the continuation. The calling word must then perform any stack operations it needs to do in the new continuation context before returning. The return from the calling word will then return to whatever code called the word that saved the continuation. This feature is intended to be used to support the implementation of higher-level languages that provide explicit continuations. TODO TODO: Formal definitions of the words TAGGED-DATA: Runtime tagged data NOTE: This is really a library abstracted out of IRON, so that it can be used in other systems wanting tagged data. This is a higher level language support feature, mainly. Higher level languages like runtime dynamic types Supports the idea of a 2-bit tag field, 00=fixnum, 01=flonum, 10=character, 11=pointer. Provides a set of base arithmetic operations on tagged value cells, which for fixnum/flonum operations work directly but invoke event handlers on overflow so the arguments can be promoted to bignums if supported, and for pointers call to appropriate user type handlers (which, along with the overflow handlers, are specified as registers with their own load/store instructions) Also provides BOX and UNBOX operations for the basic types. For pointers, we provide relative-to-pointer indexing operations. Given an offset in AUs and a tagged pointer, calculate a native unboxed pointer. Hardware Access Features DEVICE-TREE: Dynamic device access TODO Interrupts defined for the system to notify the application of the addition, removal or change of devices. See for details on how the initial state of the device tree is notified to the application. Devices have one or more of the types listed below. They have attributes, and operations. Many hardware features will both provide direct access words and, if the device tree feature is present, a device tree entry. This is so that embedded systems without the overhead of a device tree can still use a standard interface to the hardware they depend on. If the device tree feature is not present, then the DEVICE-* features listed below may still be REQUIREd to use the device via direct access words. See notes.oo3 Note that if the application does not REQUIRE the feature corresponding to a type of device, then that device should still appear in the tree for dynamic access, although the application cannot call the direct access words for that feature; the whole point of the dynamic nature of the device tree is that applications may not require, but can take advantage of, devices that appear in it. Include sample device trees from notes.oo3 DEVICE-STORAGE: Persistent storage device TODO Asynchronous I/O interface to a block storage device. Available as a device-tree node type with properties such as STORAGE:BLOCK-SIZE (bytes), and STORAGE:BLOCKS Also available via direct-access words if the feature is REQUIREd; access words take a device number parameter, and there is a word to return the number of devices. No support for dynamic device addition and removal; use the device tree if you want that. DEVICE-REMOVABLE: Removable storage drive TODO REMOVABLE:INSERTED attribute and REMOVABLE:EJECT, REMOVABLE:LOCK, etc. commands TODO When media is inserted, this device sprouts a child device of type STORAGE TODO DEVICE-FILESYSTEM: High level filesystem storage TODO Provided by POSIX systems, or by native systems finding disk partitions containing known filesystem types TODO DEVICE-REALTIME: Real time clock TODO REALTIME:TICK-N / REALTIME:TICK-D - number of ticks per second as a fraction N/D TODO REALTIME:IS-SETTABLE - settable clock or not TODO REALTIME:SOURCE - if non-NULL, name of reference clock (GPS, etc); NULL for non-accurate clocks TODO REALTIME:GET-TIME ( - u64 ) returns the number of clock ticks since the UNIX epoch TODO REALTIME:SET-TIME ( u64 - ) sets the number of clock ticks since the UNIX epoch TODO DEVICE-ETHERNET: Ethernet interface TODO Include support for common Ethernet accelerator features TODO DEVICE-WIFI: Wireless Ethernet interface TODO DEVICE-IP: High-level IP interface TODO DEVICE-GPIO: General purpose I/O lines TODO DEVICE-PORT: Byte I/O stream DEVICE-PORT-SERIAL: Serial port TODO Control lines generally also exist as a DEVICE-GPIO TODO DEVICE-PORT-PARALLEL: Centronics parallel port TODO Generally also exists as a DEVICE-GPIO TODO DEVICE-MONITOR: Hardware monitoring TODO Monitor device instances are linked to the device they monitor by MONITOR:DEVICE, and have a MONITOR:UNIT and MONITOR:SCALE-FACTOR, and a MONITOR:NOMINAL value, as well as MONITOR:OK-MIN and MONITOR:OK-MAX, then MONITOR:ALLOWABLE-MIN and MONITOR:ALLOWABLE-MAX as a range outside. TODO DEVICE-SYSTEM: System root node TODO Mainly a placeholder to be a root to the device tree TODO DEVICE-SUBSYSTEM: System subdivision TODO An administrative subdivision of the device tree, perhaps to group a number of devices with a common temperature monitor or some such TODO DEVICE-CPU: Central processing unit TODO Direct-access words are available to find the number of CPUs. cpu:poke takes a CPU number, an IPL, and a handler, and interrupts the CPU at that IPL to run that handler, in order to make an idle CPU do something other than handle interrupts as they come. TODO CONSOLE: System console Many systems have an interactive console hardwired to them; those that do not can have a virtual console, accessible through an appropriate mechanism. Traditional system consoles have been very simple, defined by such standards as VT100 or ANSI, with only character-based input and output, and do not encourage bulk administration of large numbers of systems. Therefore, we define a high-level console that can be implemented on top of even basic teletypes, but can take advantage of advanced interface hardware; and, more importantly, is amenable to bulk administration of large numbers of systems. The architecture of the console abstraction is as follows: Asynchronous events may be logged by any component of the system, given a log level (ranging from debug messages up to fatal errors), a subsystem (allowing filtering of messages), and the message text. The user may be prompted with a prompt string and then allowed to enter a string, which is then returned to the application, or the user may cancel. The application may reject it with an error message, or accept it. A number of monitorable statistics may be registered by the application, and then updated by the application whenever new data are available. Each statistic has a name and either a quantity (as an integer number or 'unknown', a unit string, minimum and maximum ok values, minimum and maximum warning values) or a status string and a state (ok, warning, unknown, error). Quantised statistics have a state calculated; if the measured quantity is unknown then the state is 'unknown', if it is within the ok values then it's 'ok', if it's outside of that range but within the warning values then the state is 'warning', and otherwise it's 'error'. Statistic state changes need not be explicitly logged by the application, as the console system will log them automatically, with suitable hysterisis when statistics 'flap'. For identification of a system in a large room full of identical hardware, there's also a locator signal that can be turned on and off by the implementation. The exact effect is platform and implementation dependent, but audible sounds and flashing lights are recommended. All of the above may be reflected in a hardware console, or instead diverted to a software implementation, if the application provides callbacks, or to both (the default). How these are implemented depends on the implementation and the capabilities of the platform. However, an implementation on something like present-day server hardware might divide a text-mode screen into three regions; one showing all statistics in states other than 'ok', one showing log events, and the last one appearing on demand whenever the user is being prompted for a string. A general alarm level is maintained, ranging from 'ok' to 'error'; whenever a log event at that level or above appears from the application, the alarm level is raised to the log event's severity, and the system may optionally sound an alarm and illuminate any "error" LEDs available, which can both be silenced by pressing a button on the keyboard. Statistics going out of bounds will also trigger the alarm, but in this case, the alarm will silence automatically if the statistic returns within normal bounds. The locator signal can be implemented via beeping, plus the illumination of any suitable software-controlled LEDs that are available. TODO: Write up the words for all this. BOOT-MEDIA: Access to boot media As discussed in , some systems may be able to modify their own bootstrapping media. If so, they should provide the BOOT-MEDIA feature to standardise this process, even though the details of some parts of the boot media will be platform-dependent or implementation-dependent. The boot media can be split into four sections, but may have finer structure within. The four sections are the HYDROGEN kernel (which is platform-specific, but will be shared by many systems), configuration and additional drivers for the kernel (which are also platform-specific, and implementation-specific, but is generally unique to one system), the application (which is platform- and implementation-independent and will generally be shared by many systems), and application configuration and extension (which are platform- and implementation-independent but specific to this system). In order to identify the implementation and the system itself so that the correct platform-specific, implementation-specific or system-specific media may be obtained, platform-identifier, implementation-identifier and system-identifier words are provided in . Having obtained suitable boot media from some repository, it may be installed via the BOOT-MEDIA words: Boot media access words Name Stack Effect Description boot-media:install-kernel b u - Install the given sequence of AUs (in some platform-specific format) into the HYDROGEN kernel boot phase identified by the supplied number. The meanings of the phases is platform- and implementation-specific, so it is not possible to switch HYDROGEN implementations using this word. By convention they should start at 0 and work upwards in the order of use during the boot sequence; eg, 0 for a boot sector, 1 for a second-stage loader, 2 for the HYDROGEN kernel. It is an error to attempt to load a phase that is not supported by this implementation combination, which will cause a panic. Ideally there should be some measure of checksumming in the phases, so that invalid phases can be rejected. boot-media:install-kernel-configuration b u - Install the given sequence of AUs (in some implementation-dependent format, but starting with a zero-terminated sequence of characters) into the kernel configuration area, indexed by the supplied number. The sequence of characters at the start of the configuration item must be a human-readable label, but the format of the rest of configuration item is implementation-dependent; if at all possible, items are encouraged to be valid HYDROGEN source code to be interpreted in order, with any other types of object (eg, precompiled native code libraries, configuration files in special formats, etc) being wrapped in a HYDROGEN envelope, but this is not enforced. It is intended that extra driver modules, configuration files to specify details of hardware devices that cannot be obtained via introspection, etc. be provided here. Some index numbers may be "special" (eg, perhaps slot zero must be a configuration file), but outside the range of special indices, the application is permitted to use any valid unsigned integer, to allow an unbounded number of drivers etc. to be added to the core kernel. A configuration item may be removed by installing a zero-AU block over it. boot-media:install-application-component b u - Install the given sequence of AUs into the application area, indexed by the supplied number. The contents of the block must be valid HYDROGEN source code. Any unsigned integer may be used as an application component index, and during booting, they will be interpreted in numeric order. An application component may be removed by installing a zero-AU block over it.

CACHE: Memory block cache TODO Rationale: access to fast memory outside of address space - PAE, EMS, XMS TODO API from Cache section of notes.oo3 TODO CLOCK-INTERRUPT: Clock interrupts TODO CLOCK-TICK ( - u u ) - returns number of ticks per second as a rational fraction CLOCK-REGULAR ( u h - ) - sets up handler h to be called every u ticks. Call it with NULL h to disable. CLOCK-ALARM ( u h - ) - sets up handler h to be called in u ticks, then never again. Call it again to change the schedule; only one alarm is 'stored' TODO CLOCK-RUNTIME: Profiling clock TODO CLOCK-RUNTIME-TICK ( - u u ) - returns number of ticks per second as a rational fraction TODO CLOCK-RUNTIME ( - u64 ) returns the number of clock ticks since system boot, for this CPU. In a virtualised environment, this may not tally with the change to CLOCK-REALTIME, as only ticks spent running this VM count for CLOCK-RUNTIME. Likewise, the runtime clock does not tick when a CPU is suspended in an IDLE instruction, or halted. TODO Platform features POSIX: POSIX base OS interface TODO POSIX syscall API as a set of words, plus a special POSIX-SYSTEM device tree node (which forms the root of the tree) TODO Hardware Virtualisation Features VM: Generic VM operations TODO VMs are identified by a single-cell value (probably a pointer to a data area) The VM feature provides words to pause, resume, and destroy VMs, access their CPU time counters, etc TODO Creation and configuration of VMs is handled by each VM type's own words, but the general process is: 1) Create VM 2) Set up virtualised resources (I/O ranges which are handled by code outside of the VM, negotiate amount of memory to allocate (which may be hardcoded, eg when the VM system uses a physical slave CPU+RAM, so the negotiation may fail), etc 3) Load code 4) Run! TODO VM-HYDROGEN: Virtual HYDROGEN machine TODO Create a VM which gets the same feature set as the current VM, then provide it with a device tree and/or sets of words that are 'upcalls' into the parent VM, then load code and run, with the option to trigger interrupts in the VM. Operations to access contents of memory, stacks, and registers of paused VMs. Upcall words must use VM introspection words to access their parameters from the VM stacks, memory, etc. TODO VM-X86: Virtual X86 machine TODO The V86 machine is a basic x86 CPU+RAM. The creator must provide I/O mappings and memory mappings, and the VM provides 256 interrupts that trigger the x86 vectored ints. Plus operations to access the RAM and CPU state of paused VMs. It's up to the creator to provide things like PCI systems. Implementations on x86 hardware may use virtualisation or VMWare techniques to provide this in hardware; others must emulate the CPU. TODO VM-X86-XEN: Virtual X86/XEN machine TODO Provides a XEN domain, which can be efficiently implemented on x86 hardware without virtualisation or VMWare tricks. Creator must provide Xen high level disk/network/etc devices. Non x86 implementations providing this feature must do so via CPU emulation. TODO Implementation notes: Portable C TODO A very basic portable C kernel TODO Implementation notes: POSIX TODO The portable C kernel, plus POSIX interfaces: mass storage devices as raw block devices or as files, IP interfaces, etc: and the POSIX feature TODO Implementation notes: x86 assembly TODO