The Anitra Computer

A Complete Minimalist Computer System Designed, Built and Programmed at a Low Level of Abstraction
A hobby electronics project by Eirik Bakke (2004).

Abstract
Introduction and Goal
Deduction of Architecture
Specification and Documentation
Conclusion
Appendix A: Software Development/Downloads
Appendix B: Hardware Prototype Construction/Photos
References
Endnotes

Introduction and Goal

A computer is a machine that, given enough time and memory, can perform any rigidly defined computational operation. A minimalist computer is a computer that satisfies this requirement with only a minimum level of architectural complexity. Hypothetical minimalist computers have long been an interesting study in Computer Science, because they allow for a deeper understanding of how computers essentially work. However, there exist few actual hardware implementations of such computers.

The question of this investigation is as follows: How simple can a basic computer be designed to be, given the requirements below?

A computer is a machine that, given enough time and data retaining units (memory), can perform any rigidly defined computational operation on a set of input data.
The hardware of the computer should be designed in detail using only standard 74-series TTL-compatible digital logic circuits (ICs). A single standard SRAM chip may be used for main memory, since this requires less supporting circuitry than DRAM. Some analog support circuits may be included as appropriate.^[1]
The design's degree of simplicity is measured primarily by the total number of simple logic gates logically involved in the standard circuits used, but also by the number of physical ICs and connection points involved. It should be practically possible to build a prototype of the hardware.
The amount of accessible and addressable memory should not be the main functional limitation of the computer (see the first point). Specifically, the computer can be designed for an address space of anywhere between 2 and 64 kilobytes, inclusive.
Given the necessary level of complexity decided on after considering the requirements above, it is also a goal to get as much functionality as possible out of the available components through efficient design.

Deduction of Architecture

In this section, I will try to deduce an appropriate architecture for my computer. I will start by discussing what hardware will be absolutely necessary in order to perform various required operations. If I can show logically that a given set of components cannot be omitted from the architecture, and then design an architecture using only these components, I will theoretically have an architecture that cannot be simplified further. While the argument below is certainly not meant to be of a mathematically satisfactory standard, it does provide a suitable starting point for an actual design. The premises of the argument are the basic requirements given in the introduction.

1. The computer must have access to memory

Following the requirement of including between 2 and 64 kilobytes of main memory, we must include an SRAM chip; the data-retaining components found in the 74-series cannot hold more than single bytes at a time. We will have to choose among standard data and address bus widths. For practical reasons, the data bus width is already more or less given to be 8 bits, since 4- and 16-bit standard memories fall outside of our desired size range. This leaves us with an address bus that should be between 11 and 16 bits wide, inclusive. A full address will in any case not fit in a single data bus width. For the moment, we can decide to use an address bus of 16 bits, which is exactly twice the size of the data bus. This will simplify later parts of the discussion.

2. The computer must be able to combine data

A working computer must in some way or another be able to logically combine two values from memory into a single value that can be written back into memory. Without such a capability, there would be no remotely practical way of implementing arithmetic functions in software. All bits on the bus must be included in this operation; if some of the bits are not, they will be logically inaccessible for all arithmetic functions and hence wasted memory. Hence, we must include in our design the appropriate logic for combining two values of data bus width.

Combining two values can either be done with simple two-input gates or with arithmetic full adders. The number of ICs will be the same for both techniques, but an adder configuration is far more logically complex. In a minimalist computer design, it is not altogether obvious why the relatively complex addition operation should be worthy of inclusion. However, the addition operation has one essential property that justifies its use: it has the ability to logically combine not only two values of several parallel bits each, but also to combine the individual bits within these values through a carry mechanism. Without this ability, the software would in the best case have to mask out and process each data bit individually in order to perform arithmetic operations, which would be highly inefficient. This fact is important, and explains why addition is often taken as the most basic of all computing operations. Hence, our design may include an arithmetic adder of data bus width. However, the addition operation by itself is not strictly universal enough for software synthesis of, for instance, subtraction. Because the SRAM uses a bidirectional data bus, a tri-state buffer must in any case exist between source data to be written to memory and the data bus. By using an inverting rather than a non-inverting tri-state buffer at this point, the functionality of the data combining logic may be enhanced without extra cost in terms of design complexity.
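The benefit of the inverting buffer can be illustrated with a little arithmetic. The sketch below is not part of the original design: the helper names `cmov`, `cadd` and `sub` are hypothetical, and the only assumption is that the hardware can write back the 8-bit complement of either a single value or a sum. Under that assumption, subtraction modulo 256 can be synthesized from two complements and one addition.

```python
MASK = 0xFF  # width of the 8-bit data bus

def cmov(s):
    """Complement of a single value: 255 - s."""
    return MASK - (s & MASK)

def cadd(s, q):
    """Complement of an 8-bit sum: 255 - ((s + q) mod 256)."""
    return MASK - ((s + q) & MASK)

def sub(a, b):
    """a - b modulo 256, built only from the two primitives above."""
    return cadd(cmov(a), b)

print(sub(200, 58))  # 142
print(sub(5, 7))     # 254, which is -2 modulo 256
```

The identity behind `sub` is 255 - ((255 - a) + b) = a - b (mod 256), so a plain adder without the inversion would not be enough.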

3. The computer's software must have access to the memory^[2]

A computer must be able to store any value at an address in memory that originates completely from memory itself. Put more simply, a computer must be able to resolve an address reference. Memory locations not reachable in this way would in the best case be extremely hard to utilize, because there would be no practical way for the software to specify where to perform operations in memory.

To load a complete address from memory, two read cycles must be completed, since a single 16-bit address must be split into two parts to fit in 8-bit memory locations. In order to write a value to an address loaded this way, at least four 8-bit wide data retaining units are needed simultaneously: one to hold the value to be written to memory, one to hold the first half of the destination address while the second half is retrieved, and two for providing both parts of the address of this second half of the destination address. This is a very important point. Data retaining units available in the 74-series include, in order of decreasing internal logic complexity, programmable counters, flip-flops, shift registers and non-programmable counters. However, only the first two of these can be programmed directly from a bus of parallel bits. The simplest flip-flop is the D-flip-flop. In the situation above, the unit that holds data to be written to memory cannot be anything simpler than a set of D-flip-flops; its value must be able to originate from the data bus as well. Two of the other three units must also be sets of D-flip-flops in order to be able to hold addresses originating from memory. Hence, at least three of the four required data retaining units must at least be D-flip-flops. The last data retaining unit may be a non-programmable counter if nothing further is shown to be needed. Given that no further data retaining units are introduced at this point, the outputs of two of the required four data retaining units will need to access the same memory address inputs, as they will necessarily contain the same part of two different addresses during the retrieval of an address from memory. The selection between these two units' outputs can be done either with multiplexers or tri-state buffers.
As at least one of the two data retaining units will be a set of D-flip-flops that can include a tri-state buffer on its IC, and as this will save one IC circuit, the tri-state approach is chosen. The other data retaining unit sharing this half of the memory address must necessarily be the possible non-programmable counter; the other half of the memory address must come from a set of D-flip-flops if it shall be possible to fetch it from the memory itself.

4. The computer's various components must work together

To function as a computer, the various required components must interact in the correct order, and a certain amount of logic is needed to generate the appropriate control signals for these. The number of simple gates logically involved in this control logic is likely to be small compared to that of, for instance, a single data retaining unit of data bus width, due to the serial nature of control signals. In addition, it will be left to the control logic to get as much functionality from the computer as possible. For these reasons, I will assume considerably more freedom during the design of the control logic.

Resulting Architecture of the Minimalist Computer

Figure 1: Resulting Architecture of the Minimalist Computer (PDF file available).

A practical architecture based on the deduced minimum hardware is illustrated in Figure 1. It uses no more components than were shown to be needed, but once correctly controlled, it will be capable of executing useful instructions. The design includes a simple accumulator that can either be accumulated with data bus values or cleared. This was the only way I was able to implement the data combination logic using only one set of flip-flops. The address preparation components are arranged so that the requirement described in the third point above may be satisfied. I have not found other working configurations of these three data retaining units. Note that the address bus is one bit smaller than initially suggested; this is to leave space for an instruction qualifier bit which will be described later.

One of the more interesting things to note is the configuration of the program counter (PC). I have tried not to intentionally force familiar concepts such as instructions and a program counter onto my computer. As it turned out in the previous discussion, it was possible to justify the need for a data retaining unit that would keep track of where the computer should look for memory address references next. It felt quite natural to name this register the program counter, and the memory address references instructions. What I could not seem to justify the need for, however, was making PC a programmable, rather than a non-programmable, counter. A fundamental operation in computer programs is to have the program counter branch to another part of the program. This would traditionally be done by having a programmable program counter load a target address from the branch instruction. This is not necessary here. Essentially, there can be two low-level reasons for a branch: it is either to return to an earlier instruction to create a loop, or to skip a number of following instructions. In this computer, the first requirement is satisfied by having all instructions execute in a single eternal loop. For the second requirement, there exists a conditional skip instruction that, if triggered, will skip a number of subsequent instructions based on set rules. These branching possibilities may sound severely limiting, but this is not the case. The limited number of instructions in the loop may be used, for instance, to create a software interpreter for higher-level virtual instructions that might reside in other parts of memory^[3].

Specification and Documentation

Using the proposed architecture from the previous section, I have laid out the complete circuits of a computer as shown in Figure 2 and Figure 3. The computer, from now on called Anitra, is split into a traditional central processing unit (CPU), which is the actual instruction fetching and executing part of the computer, and a development board. This section is written both as a description and a documentation of the Anitra computer.

Figure 2: Schematics of the Anitra CPU (PDF file)

Figure 3: Schematics of the Anitra Development Board (PDF file)

Definitions

The following terms and definitions will be used when describing Anitra's theory of operation:

accumulator The logic that calculates and fetches the sums during arithmetic addition operations.
CPU The main part of Anitra, which fetches and executes instructions.
argument One of two full addresses that make up an instruction.
block Sub-area in the two first segments in memory consisting of 32 bytes (or eight instructions).
control logic The part of the CPU that generates control signals for the flip-flops, counters and buffers around the data bus as well as the external interface.
debug Method of IO where the external interface accesses memory by taking control over the data bus and flip-flop sets A and S.
execution step A given stage in the process of executing an instruction.
external interface Additional support circuits connected to the CPU to provide memory, IO, power and such.
instruction pointer The actual prepared near address used for fetching instruction bytes from memory.
instruction qualifier bit The most significant bit of the last argument in the current instruction.
IO exchange The exchange of data with the outside world during the last execution step of the last instruction in the loop.
loop The program made up of instructions in the two first segments of memory.
near address 8-bit value for selecting a byte within a segment.
segment Sub-area in memory consisting of 256 bytes.
segment number 7-bit value for selecting a segment within memory.

General Structure

Anitra's CPU, shown in Figure 2, consists of a network of data registers, an arithmetic adder and two data buffers, as well as the control logic necessary to make these circuits successfully fetch and execute instructions from memory. Structurally, the CPU is designed to access a 32-kilobyte SRAM memory chip with an 8-bit data bus. Full 15-bit addresses for the memory are made from a 7-bit segment number and an 8-bit near address. The two flip-flop sets S and A may both retrieve data from the memory through the data bus. The segment number is always prepared in flip-flops S. It may contain data from the memory, or it may be reset to zero. The near address can come from either flip-flops A or the counter PC, depending on which of the two sources currently has its tri-state outputs enabled. PC's first and last 4 bits may be increased or reset individually. A simple accumulator system is connected to the data bus. The flip-flops R may be clocked, which will add the current data from memory to their existing value through an arithmetic adder, or they may be reset to correspond to a value of zero. The result may be directed back to memory through an inverting tri-state buffer. For IO, two methods are provided, both involving an external interface with direct access to the data bus.
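The addressing scheme described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and placing the segment number in the upper address bits is an assumption that follows from the definitions of segment and near address rather than from the schematics.

```python
def full_address(segment, near):
    """Combine a 7-bit segment number and an 8-bit near address
    into one 15-bit memory address (segment in the upper bits)."""
    if not 0 <= segment < 128:
        raise ValueError("segment number must fit in 7 bits")
    if not 0 <= near < 256:
        raise ValueError("near address must fit in 8 bits")
    return (segment << 8) | near

def split_address(address):
    """Inverse of full_address: return (segment, near)."""
    return address >> 8, address & 0xFF

print(full_address(127, 255))  # 32767, the last byte of a 32-kilobyte memory
print(split_address(513))      # (2, 1)
```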

Execution Sequence

To execute instructions, the control logic outputs a seven-step sequence of control signals. The control logic keeps track of its progress in terms of the current execution step. Because some of the steps in the execution sequence are performed twice for each instruction, there are only four different execution steps. Each instruction is 4 bytes long, consisting of two 2-byte full addresses. The most significant bit of the segment number of the last address, the instruction qualifier bit, is used to distinguish between two different instruction types, since this bit is left over after a 15-bit address has been created from two 8-bit bytes. Branching possibilities are limited, but sufficient. All instructions are executed sequentially in a loop, and grouped in blocks of eight. Branching from an instruction is done by skipping the remaining instructions in the current block, or when branching from the last instruction in a block, by skipping the complete following block. The address of an instruction can be prepared for the memory by resetting flip-flops S and selecting PC as the source of the near address. The least significant bit of this instruction address will not come from PC, but from the control logic, while PC's outputs will make up the next 7 bits. PC's remaining and most significant bit is used to control the least significant bit of the segment number whenever PC is selected over A. This system simplifies the control logic because PC will not need to be increased on two clock cycles in a row, and in addition, it is in design terms a cheap way to maximize functionality by making it possible to use both the first and the second segments in memory to store executable instructions.
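The way an instruction byte address is assembled from PC and the control logic can be sketched as follows. This is one reading of the description above, not a transcription of the schematics: the function name is hypothetical, and it assumes PC is an 8-bit counter that advances once per 2-byte argument.

```python
def instruction_byte_address(pc, lsb):
    """Return (segment, near) of an instruction byte.

    pc  -- value of the 8-bit counter PC
    lsb -- bit supplied by the control logic (0 = first byte
           of the argument, 1 = second byte)
    """
    pc &= 0xFF
    segment = pc >> 7                      # S is reset; PC's top bit sets the segment's lowest bit
    near = ((pc & 0x7F) << 1) | (lsb & 1)  # PC's low 7 bits, then the control logic bit
    return segment, near

print(instruction_byte_address(0, 0))    # (0, 0)
print(instruction_byte_address(0, 1))    # (0, 1)
print(instruction_byte_address(128, 0))  # (1, 0)
print(instruction_byte_address(255, 1))  # (1, 255)
```

Under this reading the 256 counter states address 256 arguments, or 128 four-byte instructions, spread over the first two segments.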

The execution sequence starts by resetting flip-flop sets R and S, and selecting PC. The memory will now load the first instruction byte at the memory address denoted by PC and the control logic, and place it on the data bus. In the next step, this byte is fetched by A just as PC is selected. The memory will load the second instruction byte, which is next fetched by S while A is selected. PC is increased. The address that is now passed to memory is no longer the address of the instruction itself, but the pointer that was specified as its first argument. The memory will load the value at this address, which is then fetched into the accumulator. Since flip-flops R have been reset and initially contain zero, nothing will be added to the value before it is saved this time. This sequence is then repeated to fetch the next two bytes of the instruction and load the value pointed to by its second argument. At this point, the loaded value may or may not be fetched into the accumulator to be added to the value of the previous argument, depending on the current instruction type specified by the instruction qualifier bit. If it is so, and if the arithmetic addition operation overflows and returns a carry signal, a branch operation will be performed on PC. In any case, the inverse of the resulting accumulator value is finally written to memory at the same address as the last instruction argument. Note that the instruction qualifier bit is included for extra functionality only; it allows the software programmer to benefit from an additional, simplified instruction.

Input/Output

The regular approach to IO, referred to as IO exchange, requires the external interface to supply a set of D-flip-flops for output and a tri-state buffer for input. IO exchange is achieved by having the control logic modify the last execution step of the last instruction in the loop if that instruction is not skipped by a branch. Instead of activating the accumulator's tri-state buffer to write the accumulator's value to memory, the control logic will leave the data bus as it is and notify the external interface (through the signal IOX). The external interface should immediately respond by having its D-flip-flops fetch the last data bus value, which will be the one pointed to by the current instruction's last argument. At the same time, it should activate its own tri-state buffer to make sure the value it wants to return is put on the data bus and written to memory in place.

The second method of IO, called debug, is meant for setting the initial contents of memory, and for inspecting memory during execution. To perform debug, the external interface must first halt the CPU during the appropriate execution state. This is done by ceasing to supply clock pulses once the CPU sends a debug possibility notification (through the signal DOK). During debug, the external interface may override the control signals of flip-flop sets S and A (through the signals DSG and _DAD, respectively) and the memory. The CPU will select the flip-flops A as the memory's near address source, and the data bus will be left free for the external interface to control it. The external interface may thus access any location in memory by first fetching the appropriate address into S and A through the data bus.

Control logic

The control logic, shown in Figure 2 (page 2), is responsible for activating the correct control signals at the correct time for all of the CPU's other components. A complete list of signals administered by the control logic is given in Table 1. Anitra's exact execution sequence is given in terms of control signal activations in Table 2.

Table 1: The Anitra CPU Hardware Registers and Signals List (PDF file)

Table 2: Control signals generated per execution step.

Execution step number	Tri-state activation: R or Mem selected?	Tri-state activation: A or PC (instruction pointer) selected?	Value of instruction pointer's least significant bit	Other control signals generated
0	Mem	PC	0	Clear S; Clear R
1	Mem	PC	1	Fetch A
2	Mem	A	x	Fetch S; Increase PC
0	Mem	PC	0	Clear S; Fetch R
1	Mem	PC	1	Fetch A
2	Mem	A	x	Fetch S; Increase PC
3	None if current instruction is last in loop, otherwise R	A	x	Write to memory; Fetch R if instruction qualifier bit is 1; Have PC skip to next block if accumulator overflowed and instruction qualifier bit is 1; Request IO exchange if current instruction is last in loop
x = without significance (because PC is not selected)

A set of D-flip-flops, called ES, is configured to keep track of the current execution step. It will proceed one step per clock cycle, repeating the first three steps every other time as implied from the execution sequence. PC's least significant bit will indicate whether these steps are being performed for the first or the second time. The outputs of ES are then processed in logic gates to produce the desired control signals. One D-flip-flop, called LP, is used to detect when PC overflows and starts over from the beginning; the latter means that the currently executing instruction must be the last one in the loop and should be treated specially. Another D-flip-flop, called BR, is used to fetch the accumulator's carry bit after arithmetic addition has been performed. If there is carry, PC is made to skip to the beginning of the next block by increasing it from its fourth most significant bit and resetting its four least significant bits. Anitra's maximum clock frequency will be limited by execution step number 2, which requires time both to load a value from memory and to pass it through the accumulator.

To minimize brief bus contentions between tri-state devices during switching, and to ensure that memory addresses are not invalidated too quickly before loaded values are fetched from the data bus, the timing of some control signals has been delayed using additional gates.

The Anitra Development Board

The Anitra development board serves as an external interface to the Anitra CPU. While the CPU, being the most theoretically interesting part of Anitra, has been designed and optimized with great care, the development board is set up more loosely simply to demonstrate how Anitra may work as a complete, standalone computer. It contains, most importantly, a memory chip, a simple power supply, a clock oscillator^[4], the buffers and flip-flops needed for ordinary IO exchange, and some supporting logic gates. An IEEE1284 parallel port interface^[5] is also included so that software may be uploaded from another computer (such as an IBM PC) using the debug method. (Memory inspection has not been included as a feature in the current configuration of the development board; because of the limited number of output signals from the parallel port, memory is set to be write-only during debug. Should memory inspection be necessary, however, the effort required to rewire this configuration would be minimal.)

For a short time after external power is first fed to the development board, a reset signal will be sent to the CPU. Anitra is not intended to use a ROM chip for initialization, nor to require the use of the debug interface at every powerup. The memory chip used for the development board is battery-backed and will retain data between power losses. In addition, whenever external power disappears and Anitra starts running on remaining power supply capacitor power, the development board's logic will make sure that the CPU is halted as soon as it reaches the same state as it will be in when reset the next time, that is, when it is about to start execution of the first instruction in the loop. The CPU provides a separate notification signal for this (the signal RET).^[6]

Resulting Specifications from the Software Programmer's Point of View

The software programming premises that result from Anitra's method of operation are summarised below.

The Anitra computer has up to 32 KB of memory, divided into 128 segments of 256 bytes each. Full memory addresses consist of a segment number and a near address, written in the form [segment:near] and ranging from [0:0] to [127:255]. Executable instructions must be placed in the first two segments. Each instruction takes 4 bytes, making 128 instructions available for machine coding. The instructions will be executed sequentially in an eternal loop, returning to start after the last one. The first two segments are organized in 16 blocks of 8 consecutive instructions each. Branching from an instruction is done by skipping the remaining instructions in the current block, or when branching from the last instruction in a block, by skipping the complete following block. The instruction format is shown in Table 3.
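The branching rule can be expressed in terms of instruction indices 0 to 127. The following is a minimal sketch with hypothetical names; it assumes that a skip past the last block wraps around to the start of the loop, as an overflowing counter would.

```python
INSTRUCTIONS_PER_BLOCK = 8
LOOP_LENGTH = 128  # 16 blocks of 8 instructions

def next_instruction(index, branch_taken):
    """Index of the instruction executed after `index`."""
    if not branch_taken:
        return (index + 1) % LOOP_LENGTH
    block, slot = divmod(index, INSTRUCTIONS_PER_BLOCK)
    # From the last slot of a block, the whole following block is skipped.
    skip = 2 if slot == INSTRUCTIONS_PER_BLOCK - 1 else 1
    return ((block + skip) * INSTRUCTIONS_PER_BLOCK) % LOOP_LENGTH

print(next_instruction(3, True))     # 8: rest of block 0 skipped
print(next_instruction(7, True))     # 16: all of block 1 skipped
print(next_instruction(127, False))  # 0: the loop starts over
```

A programmer can therefore treat the last instruction of a block as a way to choose between two alternative blocks, and any other instruction as an early exit from the current block.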

Table 3: Instruction format and argument bit patterns.

argument S	argument Q
Byte 1 `AAAAAAAA`	Byte 2 `xBBBBBBB`	Byte 3 `CCCCCCCC`	Byte 4 `iDDDDDDD`
near address	segment number	near address	segment number
`x` = don't care
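The byte layout of Table 3 can be made concrete with a small encoder and decoder. This is a sketch only: the function names are hypothetical, addresses are passed as (segment, near) pairs, and the don't-care bit `x` is written as 0 by assumption.

```python
def encode_instruction(s_addr, q_addr, qualifier):
    """Pack two (segment, near) addresses and qualifier bit i into 4 bytes."""
    for segment, near in (s_addr, q_addr):
        if not (0 <= segment < 128 and 0 <= near < 256):
            raise ValueError("address out of range")
    if qualifier not in (0, 1):
        raise ValueError("qualifier bit must be 0 or 1")
    (s_seg, s_near), (q_seg, q_near) = s_addr, q_addr
    # Byte order follows Table 3: near address first, then segment number.
    return bytes([s_near, s_seg, q_near, (qualifier << 7) | q_seg])

def decode_instruction(raw):
    """Unpack 4 instruction bytes into (s_addr, q_addr, qualifier)."""
    b1, b2, b3, b4 = raw
    return (b2 & 0x7F, b1), (b4 & 0x7F, b3), b4 >> 7

raw = encode_instruction((1, 2), (3, 4), 1)
print(list(raw))                # [2, 1, 4, 131]
print(decode_instruction(raw))  # ((1, 2), (3, 4), 1)
```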

The arguments are two full addresses. All instructions work by processing the values at memory locations [BBBBBBB:AAAAAAAA] (called S) and [DDDDDDD:CCCCCCCC] (called Q). There are two instruction types, distinguished by i, the instruction qualifier bit. In addition, the last instruction in the loop is altered to serve as a special IO instruction. The resulting instructions are given in Table 4.

Table 4: Anitra's machine code instructions.

Qualifier	Instruction	Description	Operation
Bit `i`=0	`mov S,Q`	(move with complement)	Q:=255-S
Bit `i`=1	`add S,Q`	(add with complement)	Q:=255-((S+Q) mod 256); branch if S+Q>255
Bit `i`=0 + instruction is last in loop	`iox S,Q`	(exchange input/output, don't care S)	output register:=Q; Q:=input register
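The effect of the two regular instructions on memory can be modelled in a few lines. This is a behavioural sketch, not the author's emulator: the function name and the flat list used for memory are assumptions, and the `iox` instruction is left out.

```python
def execute(memory, s, q, qualifier):
    """Apply one mov (qualifier 0) or add (qualifier 1) instruction.

    memory -- list of byte values indexed by full 15-bit address
    s, q   -- full addresses of the two arguments
    Returns True if the instruction triggers a branch.
    """
    total = memory[s]
    branch = False
    if qualifier:
        total += memory[q]
        branch = total > 255          # carry out of the 8-bit adder
    memory[q] = 255 - (total & 0xFF)  # the inverse is written back to Q
    return branch

memory = [0] * 32768
memory[0x200], memory[0x201] = 200, 100
print(execute(memory, 0x200, 0x201, 1), memory[0x201])  # True 211
print(execute(memory, 0x200, 0x202, 0), memory[0x202])  # False 55
```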

Conclusion

Given my requirements, I have shown it possible to construct a computer that comes close to a provable lower component limit, and my investigations suggest that a simpler datapath portion of the CPU is unlikely to exist. In order to write a value to an address in memory, at least four 8-bit wide data retaining units are needed simultaneously: one to hold the value to be written to memory, one to hold the first half of the destination address while the second half is retrieved, and two for providing both parts of the address of this second half of the destination address. Interestingly, one of these registers takes on the familiar role of a program counter. The computer hardware is, however, simplified by placing limits on this register's operation. The finished computer, called Anitra, is capable of executing two primitive yet universal instructions which are both based on an inverted addition operation.

The study of minimalist computers is interesting because it casts light upon issues in computer architecture design that may otherwise go unnoticed, and because it stimulates a better understanding of how software and hardware specifications interact with each other. Software and hardware engineers may not necessarily have the same perception of what is simple and what is not.

Appendix A: Software Development for the Anitra Computer

[IBM PC-type computer with Anitra attached.]

Although I have now completed the design of Anitra's hardware, the computer will not be able to perform any useful tasks before it is programmed in some way.

Specification of the Anitra Assembly Language

The assembly language serves as a way to define the contents of Anitra's memory according to the software programming specifications given earlier. Source code is contained in an ASCII file of extension AAS. The assembler will output a headerless memory dump file of extension BIN. Any bytes in memory not defined in the code are given a value of zero. The source code is interpreted in terms of white-space separated case-sensitive words. The comma (,) is always treated as a separate word. A semicolon (;) denotes that the rest of the line is a comment. There are four types of words, as explained below.
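The word-splitting rules above can be sketched as a small tokenizer. This is an illustration under the stated rules only, not the author's assembler; the function name and the sample source line are hypothetical.

```python
def tokenize(source):
    """Split assembly source into words.

    Comments run from a semicolon to the end of the line, words are
    separated by white space, and a comma is always its own word.
    """
    words = []
    for line in source.splitlines():
        code = line.split(";", 1)[0]     # drop the comment, if any
        code = code.replace(",", " , ")  # isolate commas as separate words
        words.extend(code.split())
    return words

print(tokenize("mov 1 2,3 4 ; copy with complement"))
# ['mov', '1', '2', ',', '3', '4']
```

Because words are case-sensitive, the tokenizer leaves their spelling untouched.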

Data Words

Each of these denotes one or more bytes to be placed in Anitra's memory. These bytes are placed in memory in the order their words are entered in the source.

Numeric literal: A decimal number between 0 and 255 to be inserted as a byte.
Uninitialized byte: A hash sign (#). Used to denote a byte in memory whose initial value is not important. A value of zero will be used.
Two-byte address reference: The name of a reference that is defined somewhere else in the source with a reference definition (see below). Will insert a byte equal to the near address of the memory location pointed to by the reference followed by another byte equal to its segment number.
One-byte address reference: The name of a reference as above, but followed by either an at sign (@) or a dollar sign ($). Will insert a byte equal to either the near address or the segment number, respectively, of the referenced memory location. If the reference is directly preceded by a decimal number followed by a plus or minus sign enclosed in brackets ([n+] or [n-]), the value will be added to or subtracted from the given number modulo 256 before it is inserted.
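How a one-byte address reference might be resolved is sketched below. The concrete spelling `[255-]name@` is my reading of the description above and should be treated as an assumption, as should the function name and the `references` dictionary that maps each name to a (segment, near) pair.

```python
import re

# Optional "[n+]" or "[n-]" prefix, a reference name, then "@" or "$".
WORD = re.compile(r"^(?:\[(\d+)([+-])\])?([A-Za-z0-9_.]+)([@$])$")

def one_byte_reference(word, references):
    """Return the byte that a one-byte address reference inserts."""
    match = WORD.match(word)
    if match is None:
        raise ValueError("not a one-byte address reference: " + word)
    number, sign, name, selector = match.groups()
    segment, near = references[name]
    value = near if selector == "@" else segment
    if number is not None:
        # The referenced value is added to or subtracted from n.
        value = int(number) + value if sign == "+" else int(number) - value
    return value % 256

refs = {"counter": (2, 16)}
print(one_byte_reference("counter@", refs))        # 16
print(one_byte_reference("counter$", refs))        # 2
print(one_byte_reference("[255-]counter@", refs))  # 239
```

The `[255-]` form is convenient on a machine whose instructions always write back a complement, since it lets the assembler pre-invert a constant.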

Reference Definitions

A user-defined reference name of at least two characters immediately followed by a colon (:). The new reference will point to the first memory location of the next data word. Reference names may contain small and capital letters A through Z, numbers, underscore (_) and period (.), and must include at least one letter. If an asterisk (*) is included immediately before the reference definition's colon, the assembler will print the reference's name and full address during assembly.
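The naming rules for reference definitions can be checked with a regular expression. The following is a sketch with a hypothetical function name, written from the rules as stated above rather than from the assembler's source.

```python
import re

# Name of two or more allowed characters, optional "*", then ":".
DEFINITION = re.compile(r"^([A-Za-z0-9_.]{2,})(\*?):$")

def parse_reference_definition(word):
    """Return (name, print_address) for a valid definition, else None."""
    match = DEFINITION.match(word)
    if match is None:
        return None
    name = match.group(1)
    if not re.search(r"[A-Za-z]", name):  # at least one letter is required
        return None
    return name, match.group(2) == "*"

print(parse_reference_definition("loop:"))     # ('loop', False)
print(parse_reference_definition("sum.lo*:"))  # ('sum.lo', True)
print(parse_reference_definition("42:"))       # None
```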

Instructions and the Comma

There are three instructions, denoted by the words mov, add and iox. Inserting one of these words indicates that the next 4 bytes in memory will be part of a new instruction, and the most significant bit of the byte 4 memory addresses ahead will be set to 0, 1 or 0, respectively. In addition, a comma will be expected between the data words that define the first and the second byte pair. A single data word representing more than one byte, such as a reference, can represent bytes within one pair only. Instruction words may only be placed wherever the next data word would reside in one of the first two segments at a near address divisible by 4. The very last of these locations may only be used for the instruction iox, and this instruction cannot appear anywhere else.

Assembler Messages

Words starting with a percent sign (%), giving various other assembler directions.

Code containers: All other words must be contained between the assembler messages %ANITRASM and %END. The source cannot end in the middle of an instruction segment, and may end in the middle of a data segment only if automatic padding has been enabled (see below).
Memory location verifiers: %NEW_SEGMENT and %NEW_BLOCK. Triggers an error message if the next data word will not be the first one in a segment or block, respectively. Only the first two segments are organized in blocks.
Automatic padding: %AUTO_PAD. Instructs the assembler to insert, if necessary, dummy instructions to fill remaining space at the end of blocks and instruction segments, and before the iox instruction to pad it to the last instruction slot, and to accept incompletely defined data segments. Note: The use of automatic padding implies that the iox instruction will not necessarily be assembled into the same block as data words that directly precede it in the source, and that other instructions will only be last in their blocks if all preceding instructions in the same block have been defined manually.

Testing and Software Tools

For the purpose of developing Anitra software using an ordinary desktop computer, I have written a cross-assembler, which translates assembly language code into a binary memory image, a debugger/emulator, which inputs the image and interactively simulates the software's operation on the Anitra computer, and a parallel port uploader, which transfers the image to the Development Board's memory chip. These tools make the programming process similar to that of any modern computer or microcontroller.

The project's most important piece of Anitra software is the Debug Routine. It tests all distinct aspects of Anitra's operation by running a sequence of tests that all result in different numerical answers, and then outputs the sum of all results to the user. Since the tests are designed to give a different result if Anitra does not behave according to specification, the presence of the expected sum on the output is very likely to indicate a working model. The routine was used in all development stages: first to test the operation of the emulator, then to test circuit simulations in CAD software, and finally to test the physical prototype.

Another interesting piece of Anitra software is the Virtual Machine Emulator. The routine executes virtual instructions of another, hypothetical computer. The virtual machine is far more advanced than Anitra itself, with 14 instructions, built-in function calls, separate data and return stacks, relative local variable addressing, unconstrained branches and so on (the instructions being JCZ, JMP, JZ, CALL, RET, PUSH, POP, LIT, SEG, LOAD, STORE, ADD, COM, and NOP). Although it comes at a cost in speed, this allows Anitra to be programmed without any of the initial limitations on code size, branching, etc. Another interesting observation is that the two simple instructions provided by Anitra seem to be perfectly sufficient for solving common programming tasks. The code is fairly compact, and there is plenty of space for emulating more virtual instructions, or possibly for emulating a 16-bit machine instead.

Downloads

Download the Anitra development tools here (207Kb, ZIP archive). This includes the assembler, the debugger, the uploader, an assembler source code template, the main machine code debug routine, a speaker test routine, and the Virtual Machine Emulator. Tools include BASIC source code (compiles with Microsoft QBasic 7.1).

[The Anitra Machine Code Debugger in action]

Appendix B: The Construction of a Hardware Prototype of the Anitra Computer

I eventually built a working prototype of the Anitra computer. I managed to fit the CPU portion on three Veroboards (one for registers S, A, and PC, one for the accumulator R, and one for the control logic) and the development board on a single one. The four boards are shown connected together in the photo at the top of this article. They could also be folded and stacked in a box. In addition, there were two smaller I/O boards with arrays of switches and LEDs, a speaker, and an RS-232 interface; see the photos below.

It took about six weeks to draw, build, and debug the prototype. While the digital logic worked flawlessly as designed and simulated, I had to deal with several bugs relating to analog circuitry, layout, and construction. These included (lessons learned!):

Missing decoupling capacitors and awkward power distribution, leading the ES portion of the CPU's control logic to generate the wrong series of one-hot pulses E[0..2] once in maybe a thousand clock cycles.
Various missing or bad solder joints, and some short-circuits. The whole layout called for 1622 solder joints, so this wasn't unexpected.
Non-ideal behavior of the parallel port connection, including noise and an inability to drive TTL inputs directly. This was a design error; some signal cleaning and isolation circuitry would have been needed to avoid the hacks I had to implement in hardware and software to work around the problem and to make the software upload process more reliable.
Missing on-board test points and sockets for the ICs, making debugging harder.

What made debugging really hard was my lack of equipment. At the time of construction I had no oscilloscope, logic analyzer, or signal generator available, and my home-made variable voltage supply was falling apart. I first ran the main clock from a manual push button, then at various low frequencies (up to 300 Hz) using a 555 oscillator, and finally with the 8 MHz crystal oscillator shown in the diagram above. At one point during the debugging process I built a quite well-working six-channel logic analyzer out of a spare 74HCT14 and a parallel port cable (visualized with yet another QBasic program...); this helped me catch the first problem above.

The prototype was eventually destroyed as I accidentally fed 17V of unregulated transformer power onto the 5V Vcc rail. What a fine opportunity to conclude the project after many years of tinkering!

[The Anitra Computer.]

[Anitra Computer on lab test bench.]

References

This is a hobby project I have done in my spare time without supervision of any kind. The sources below have been of help and interest during my work.

The following source was used extensively during the hardware design of the Anitra computer:
STMicroelectronics. Standard TTL compatible logic circuitry datasheets for Technical Literature, Datasheets, Logic, HC/HCT High Speed CMOS as well as for the M48Z35 Zeropower SRAM. Available from World Wide Web, valid as of February 28, 2004:
[URL: http://www.st.com]
[URL: http://www.st.com/stonline/books/toc/ds/index.htm]
The following document was consulted during the design of the Anitra development board:
Craig Peacock. Interfacing the standard parallel port. Copyright 1999-2001 Craig Peacock August 19, 2001. Available from World Wide Web, valid as of February 28, 2004:
[URL: http://www.beyondlogic.org/spp/parallel.htm]
The following book is a simple and excellent introduction to digital electronics and low level computer science:
Charles Petzold. Code. First edition. Microsoft Press. WA, USA.
The following book gives an introduction to basic digital circuits:
Berndt Andersson, Lars Asplund. Elektronikk for alle, del 3. Aschehoug, Oslo, Norway, 1981.
The following contains the first description of one of the more well-known minimalist computers:
Ross Cunniff. The One Instruction Set Computer. Available from World Wide Web, valid as of February 28, 2004:
[URL: http://www.cse.psu.edu/~cg331/samp/OISC/README]
The following contains good discussions of the concepts of power-on reset and power-fail signals:
Jay Scolio. Managing your supervisor. Published in April 1, 2004 issue of EDN magazine. Available from World Wide Web, valid as of August 26, 2004:
[URL: http://www.reed-electronics.com/ednmag/article/CA404514?spaceesc=contributedFeature]
Howard Johnson. Power-on reset. Published in December 13, 1998 issue of EDN magazine. Available from World Wide Web, valid as of August 26, 2004:
[URL: http://www.reed-electronics.com/ednmag/archives/1998/120398/25john.htm]
The following document gives an example of how a minimalist computer, although much more sophisticated than Anitra, can be of practical and commercial use:
C. H. Ting. MuP21 Programming Manual. Second edition. Order number 1014 from Offete Enterprises, San Mateo, CA, USA.
The circuit diagrams were drawn using a registered version of the computer program ISIS Lite from Labcenter Electronics.
Thanks to Christian Bernt Håkonsen for shooting the demo video on the front page.
Thanks to Gene Conover and Bradley Dickinson from the Department of Electrical Engineering, Princeton University, for letting me use one of their test benches right ahead of the EU Contest.

Endnotes

^[1]In the circuits presented in this project, I have used the TTL-compatible HC family of logic circuits (high-speed CMOS). Because the SRAM chip selected supplies TTL-level voltage outputs, the TTL-input-compatible HCT family has been used for circuits that take inputs from the data bus. See HC/HCT High Speed CMOS datasheets. I have illustrated the voltage levels specified by these datasheets in this diagram (PDF file).

^[2]Since we only have access to one memory chip, any software must reside in the main memory. Our computer will hence be of the von Neumann type.

^[3]This technique works well in practice; I have developed such an interpreter capable of emulating a virtual machine with two stacks and 14 different instructions, including subroutine calls and control structures.

^[4]The oscillator is taken from Elektronikk for alle, p. 93. A 470 μF capacitor has been replaced by an 8 MHz crystal. Although this configuration works in the prototype, a better solution may be to use an integrated oscillator module.

^[5]See Interfacing the standard parallel port; this page was of great help during the design of the parallel port interface.

^[6]The power-on reset is achieved using a Schmitt-trigger with an attached capacitor that will charge through R16 after powerup and discharge through R19 after shutdown. The resistor values have been chosen so that the trigger will generate valid HC-family logic output voltage levels. A separate analog circuit is used to detect whether the main power source fails and Anitra starts running on power from the voltage regulator's discharging input capacitor. Some time after I had laid out the development board's circuits, I learned that the kind of functionality described here had already been thoroughly discussed by engineers, and that Anitra's present supervisory circuits were somewhat flawed with respect to brown-out power conditions and quick power interruptions. See Managing your supervisor and Power-on reset.

accumulator	The logic that calculates and fetches the sums during arithmetic addition operations.
CPU	The main part of Anitra, which fetches and executes instructions.
argument	One of two full addresses that make up an instruction.
block	Sub-area in the two first segments in memory consisting of 32 bytes (or eight instructions).
control logic	The part of the CPU that generates control signals for the flip-flops, counters and buffers around the data bus as well as the external interface.
debug	Method of IO where the external interface accesses memory by taking control over the data bus and flip-flop sets A and S.
execution step	A given stage in the process of executing an instruction.
external interface	Additional support circuits connected to the CPU to provide memory, IO, power and such.
instruction pointer	The actual prepared near address used for fetching instruction bytes from memory.
instruction qualifier bit	The most significant bit of the last argument in the current instruction.
IO exchange	The exchange of data with the outside world during the last execution step of the last instruction in the loop.
loop	The program made up of instructions in the two first segments of memory.
near address	8-bit value for selecting a byte within a segment.
segment	Sub-area in memory consisting of 256 bytes.
segment number	7-bit value for selecting a segment within memory.
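
The relationship between the address terms defined above can be sketched in Python. The sketch assumes that a full address is the 7-bit segment number followed by the 8-bit near address, with the segment number in the high bits and segments numbered from 0; this bit ordering, and the names `split_address`, `join_address` and `block_of`, are assumptions made for illustration.

```python
SEGMENT_SIZE = 256          # bytes per segment (8-bit near address)
SEGMENT_COUNT = 128         # 7-bit segment number
BLOCK_SIZE = 32             # bytes per block (eight 4-byte instructions)
INSTRUCTION_SEGMENTS = 2    # only the first two segments are organized in blocks


def split_address(full):
    """Split an assumed 15-bit full address into (segment number, near address)."""
    if not 0 <= full < SEGMENT_COUNT * SEGMENT_SIZE:
        raise ValueError("full address out of range")
    return full // SEGMENT_SIZE, full % SEGMENT_SIZE


def join_address(segment, near):
    """Combine a segment number and a near address into a full address."""
    return segment * SEGMENT_SIZE + near


def block_of(segment, near):
    """Return the block index within an instruction segment, or None elsewhere."""
    if not 0 <= segment < INSTRUCTION_SEGMENTS:
        return None
    return near // BLOCK_SIZE
```

Under these assumptions, full address 508 splits into segment 1 and near address 252, which lies in block 7, the last block of that segment.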