\documentclass[11pt, a4paper]{article}

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[margin=1.0in]{geometry}
\usepackage[british]{babel}
\usepackage{indentfirst}
\usepackage{array,booktabs,longtable}
\usepackage{multirow}
\usepackage{comment}

\newcommand{\prog}[1]{{\tt#1}}
\newcommand{\underscore}{$\_\,$}

\begin{document}


\title{Conception and realization of the VIVACE architecture
             \\ \normalsize{\textsc{Projet de Système digital}}}
\author{A. Auvolat \and É. Enguehard \and J. Laurent}
\maketitle


The VIVACE architecture is
a minimalistic 16 bits RISC microprocessor architecture, largely inspired by the MIPS
microprocessor.


The principal characteristics of the architecture are:

\begin{itemize}
\item \textit{8 general-purpose registers}, which can hold 16 bits integers: \prog{Z, A, B, C, D, E, F, G}
\item \textit{16 bit memory addressing}, enabling the CPU to use up to $64kb$ of memory.
\end{itemize}

In order to implement and run the architecture, the following programs have been written:

\begin{itemize}
\item \textit{Netlist  simulator} and \textit{netlist optimizer}
\item An \textit{OCaml library} for generating netlists from Caml code
\item The code for the \textit{CPU implementation} (written in Caml)
\item A \textit{monitor} which is used to interact dynamically with the netlist simulator
\item An \textit{assembler} which can be used to produce the ROM files run by the CPU.
\end{itemize}

The reader keen on learning valuable and useful information should skip the next
section and jump to section \ref{sec:useful}. He who does not strive for usefulness
may proceed.

\section{The name VIVACE}

VIVACE is a recursive acronym standing for “Virtually Infallible VIVACE Automated
Computing Environment”. We do not think anyone would dispute that VIVACE is an
automated computing environment, and shall thus discuss the rest of the allegations
that are made in this name.

\subsection{Why is VIVACE virtually infallible ?}

VIVACE bears a great many similarities to the pope, in that, among other things,
it is virtually infallible the sole and only reason that we, its creators, are able
to claim as loudly as needs be that it is so.

\subsection{Why “Vivace” ?}

“Vivace”, as anyone knows, is an italian word meaning, surprisingly enough, “vivacious”.
Outside of its homeland, the word is mostly known for its place of honour in the musical
glossary. Its inscription at the beginning of a musical piece means the composer
of said piece intends it to be played in a lively tone, yet the interpret should not
strive for fastness and virtuosity, as he should if the inscription were “Allegro”
or “Presto”.

Its name, thus, suits perfectly VIVACE; it is certainly as fast as a software-emulated
microprocessor can be, yet, as it is the sad fate of any software-emulated microprocessor,
it is extremely slow.

One may ask: what does VIVACE has to do with music? In fact, the VIVACE CPU has 
eight registers, seven of which are referred to with the letters \prog{A} through \prog{G}.
In English musical notation, those letters happen to be the names of the seven notes
of the usual scale.\footnote{In German musical notation, the letters \prog{A}
through \prog{G} are the names of the same seven notes, but in F major instead of C
major. Thus the letter \prog{B} represents an English B-flat, while an English B
is represented by the letter H.}

This relation between VIVACE and music can be used to translate VIVACE programs
into musical pieces, which, while extremely interesting from an intellectual point
of view, is remarkably useless. Consider, for instance, the following program,
that uses only elementary instructions (see
section \ref{sec:assembly} for a description of the VIVACE assembly language):

\begin{verbatim}
.text
    # reads a nonnegative rational number and outputs its integer part and
    # whether it is an integer or not
    liuz A 0x41 # input address
    lbr B 0(A) # numerator
    lra E 2
    lbr C 0(A)
    jer E C Z # denominator must be non-zero
    divu D B C
    se G E Z
    mulu F D C
    lil A 0x02 # output address
    sb F 0(A)
    sb G 0(A)
\end{verbatim}

Mapping cycles to notes' length and write access to registers to the pitch, we get
the following piece, for the input \prog{31 10}:

\lilypond[fragment,notime,quote]{
\set Timing.defaultBarType = ""
a'4 b'4. e'4 c'4. s \appoggiatura e'8 d'\breve g'4 f'2 a'4 s2}

As should be expected, it sounds awful.

\section{How to run the VIVACE cpu}
\label{sec:useful}
\subsection{Preparation}

All the tools described in the introduction must first be compiled:

\begin{verbatim}
    $ cd csim; make; cd .. 
    $ cd sched; make; cd .. 
    $ cd monitor; make; cd ..
    $ cd asm; make; cd ..
\end{verbatim}

To run the VIVACE CPU, type the following:

\begin{verbatim}
    $ cd cpu; make
\end{verbatim}

\subsection{Monitor commands}

You are now running the VIVACE CPU. The monitor accepts a few commands to control the simulation.
First, you must configure the monitor to communicate with the CPU. Type:

\begin{verbatim}
    t 0
    s 1 19 18
    d7 20 21 22 23 24 25 26 27
\end{verbatim}

The first command sets up the tick input (a tick is sent once every second on this input by the monitor). The
second command sets up the serial input/output. The third command sets up the 7-segment display (8 digits displayed).
Now, use the following commands to control the simulation:

\begin{itemize}
    \item \prog{a} run the simulation at full speed
    \item \prog{m} run the simulation step by step (enter an empty command to run a step)
    \item \prog{f <freq>} run the simulation at fixed frequency (frequency is dynamically ajusted so
            this is not very accurate)
    \item \prog{q} exit simulation
\end{itemize}

The CPU recieves commands on the serial input. To send a command to the CPU, use the following syntax:

\begin{verbatim}
    :<cpu_command>
\end{verbatim}

For instance:

\begin{verbatim}
    :Y2014
\end{verbatim}

These commands are essentially used to set one of the six variables \prog{YMDhms} ; the syntax is similar to the
example command given above. An empty CPU command tells the CPU to just tell us what time and what day it is.


\section{Program details}
\subsection{Generating netlists from Caml code}

We have developped a library that enables us to easily generate netlists from Caml code. The Caml code we write
has the same abstraction level that MiniJazz has, but it is more comfortable to write circuits like this than
with MiniJazz code.

The library functions are defined in \prog{cpu/netlist\_gen.mli}. Basically, we have created functions that build
the graph of logical operations. The abstract type \prog{t} is actually a closure to a function that adds the
required equation to a netlist program being built, therefore the generation of a netlist consists in two steps:
the generation of a closure graph that describes the graph of logical operations, and the execution of these
closures on a program which, at the beginning, has only the circuit inputs. The equations are progressively
added to the program when the closures are called.

The VIVACE CPU has been entirely realized using this library.

\subsection{The VIVACE CPU}

\subsubsection{Control structure}

The CPU is able to execute instructions that need several cycles to run. The two first cycles of an instruction's
execution are used to load that instruction (16 bits have to be read, ie two bytes). Most instructions finish
their execution on the second cycle, but some executions need more cycles to run:

\begin{itemize}
    \item Load and store instructions need one or two extra cycles
    \item The multiplication operation needs as many cycles as the position
        of the most-significant non-null bit in the second operand.
    \item The division always runs on 16 cycles.
\end{itemize}

The execution of instructions on several cycles is implemented using a ``control bit'' that cycles through
several steps: load instruction, various steps of instruction execution. A few of these step control bits
appear in the simulator, as CPU outputs:

\begin{itemize}
    \item \prog{read\_ilow}, \prog{read\_ihi} CPU is reading low byte/high byte of the instruction
    \item \prog{ex\_instr} CPU begins execution of the instruction
    \item \prog{ex\_finish} CPU finishes execution of the instruction (modified registers may only appear
        in the monitor at the next step)
\end{itemize}


\subsubsection{ROM, RAM and MMIO}

The CPU has uniform acces to a 64kb address space, which contains the ROM (\prog{0x0000-0x3FFF}), MMIO (\prog{0x4000-0x7FFF})
and the RAM (\prog{0x8000-0xFFFF}).
The \prog{cpu\_ram} (\prog{cpu.ml}) subcircuit is basically a bunch of multiplexers that redirect reads and writes to the correct places.

The serial input/output is implemented using one input and two outputs :

\begin{itemize}
    \item Input \prog{ser\_in} (8 bits) : when this input is non-null, the character entered is buffered by
        the CPU. This buffer can be read by reading MMIO byte at address \prog{0x4100}. The buffer is reset to zero
        on read.
    \item Output \prog{ser\_in\_busy} (1 bit) : signals when the input buffer is nonzero (ie a character is
        pending, waiting for the CPU to read and handle it).
    \item Output \prog{ser\_out} (8 bits) : when non-null, the CPU is sending a character to the serial output.
        This output can be written by writing MMIO byte at address \prog{0x4102}.
\end{itemize}

The clock is also handled by MMIO : the CPU recieves a tick every second on input \prog{tick}. When a tick is
recieved, the tick buffer is incremented by one. This tick buffer can be read by reading MMIO word at address
\prog{0x4000}. When the word is read, the buffer is reset to zero.

The 7-segment display is also handled by MMIO : the 8 digits can be modified by writing a byte to MMIO addresses
\prog{0x4200} to \prog{4207}.

\subsubsection{The ALU}

\subsection{The VIVACE assembly}
\label{sec:assembly}

The VIVACE assembly language is mostly inspired from the MIPS assembly language.
An assembly program is made of a \prog{.text} and an optional \prog{.data} segment, both of
which may contain labels that behave exactly like in MIPS, except that they may not
begin with a capital letter. End-of-line characters are
used as delimiters. Comments begin with \prog{\#} and end with en end-of-line character.

\subsubsection{The \prog{.data} segment}

Due to limitations of the simulator that we did not consider were worth the trouble to remove,
the \prog{.data} segment only contains uninitialized data. Its main use is to declare labels for use
in the \prog{.text} segment. Instead of formally describing the straightforward syntax of the
\prog{.data} segment, we will provide a simple example:

\begin{verbatim}
.data
label1:
    word 3 # three 16-bit words
label2: # label2 is label1 + 6
    byte 1 # one byte
\end{verbatim}

\subsubsection{The \prog{.text} segment}

The \prog{.text} segment contains both instructions and read-only data. The programmer
has to ensure that the program counter never reaches the data. Read-only data is declared
in the following manner:

\begin{verbatim}
    # here be instructions
    # they'd better end with a jump
data:
    word 0b101010 42 '*' # three words with the same value
    word -1 # 0xFFFF
    byte -1 255 # 0xFF 0xFF
string1:
    ascii "hello world!\n" # a trailing null character is automatically added
string2:
    ascii "well hello to you good sir!"
strings_addr:
    word string1 string2 # this will make things easier
\end{verbatim}

Instructions have between 0 and 3 operands, which are to be separated by spaces.
Depending on each instruction, operands may be integers, labels or registers.
Registers are referenced either by their name (\prog{Z}, \prog{A} etc.) or by
their number (\prog{\$0}, \prog{\$1} and so on). Here is a list of possible
instruction formats:

\begin{verbatim}
# format R3:
    add A B C
    add A B -5 # signed, 5 bits
# format R2:
    move A B
# format I:
    incri A 2 # signed byte
# format J:
    j -5 # signed, 11 bits
    j label
# format R:
    jr RA # RA is G
# format MEM:
    lw A 2(D) # signed, 5 bits
    lw A addr_label
# format H:
    hlt
\end{verbatim}

The first register is usually the destination register. Like in MIPS, the \prog{sw}
and \prog{sb} instructions are exceptions to this rule.

Here is a list of all instructions supported by the VIVACE assembly language. The
column specifies wether the instruction is actually supported by the processor or
it is translated to elementary instructions. Elementary instructions may still
be translated to several instructions, for example if the operands are too big,
or if integer operands are used in a command that only accepts registers. In that
case, the \prog{E} register is used to store the operands.\footnote{While the assembler will process
all listed instructions, some of them may not have been implemented in the CPU itself.}

\begin{center}
\begin{longtable}{lccl}
\toprule
Name & Elementary? & Format & Comments \\ \midrule
\prog{add} & Yes & R3 & \\ \cmidrule{1-4}
\prog{sub} & Yes & R3 & \\ \cmidrule{1-4}
\prog{mul} & Yes & R3 & \\ \cmidrule{1-4}
\prog{div} & Yes & R3 & \\ \cmidrule{1-4}
\prog{addu} & Yes & R3 & Unsigned version of \prog{add} \\ \cmidrule{1-4}
\prog{subu} & Yes & R3 & \\ \cmidrule{1-4}
\prog{mulu} & Yes & R3 & \\ \cmidrule{1-4}
\prog{divu} & Yes & R3 & \\ \cmidrule{1-4}
\prog{and} & Yes & R3 & \\ \cmidrule{1-4}
\prog{or} & Yes & R3 & \\ \cmidrule{1-4}
\prog{xor} & Yes & R3 & \\ \cmidrule{1-4}
\prog{nor} & Yes & R3 & \\ \cmidrule{1-4}
\prog{lsl} & Yes & R3 & \\ \cmidrule{1-4}
\prog{lsr} & Yes & R3 & \\ \cmidrule{1-4}
\prog{asr} & Yes & R3 & \\ \cmidrule{1-4}
\prog{se} & Yes & R3 & “Set if Equal” \\ \cmidrule{1-4}
\prog{sne} & Yes & R3 & \\ \cmidrule{1-4}
\prog{slt} & Yes & R3 & “Set if Lower Than” \\ \cmidrule{1-4}
\prog{sle} & Yes & R3 & “Set if Lower or Equal” \\ \cmidrule{1-4}
\prog{sltu} & Yes & R3 & Unsigned version of \prog{slt} \\ \cmidrule{1-4}
\prog{sleu} & Yes & R3 & “Set if Lower Than” \\ \cmidrule{1-4}
\prog{incri} & Yes & I & Allows for larger integers than \prog{add} \\ \cmidrule{1-4}
\prog{shi} & Yes & I & Shifts first operand by second operand; mostly useless\\ \cmidrule{1-4}
\prog{j} & Yes & J & If operand is an integer, performs a relative jump \\ \cmidrule{1-4}
\prog{jal} & Yes & J & “Jump and link;” return address is stored in \prog{G}\\ \cmidrule{1-4}
\prog{jr} & Yes & R & \\ \cmidrule{1-4}
\prog{jalr} & Yes & R & \\ \cmidrule{1-4}
\prog{jer} & Yes & R3 & “Jump if (R2 and R3 are) Equal to (address stored in) Register” \\ \cmidrule{1-4}
\prog{jner} & Yes & R3 & \\ \cmidrule{1-4}
\prog{jltru} & Yes & R3 & \\ \cmidrule{1-4}
\prog{jleru} & Yes & R3 & \\ \cmidrule{1-4}
\prog{lra} & Yes & J & Instead of performing jump, loads target address in register
\footnote{Labels may be loaded with command \prog{li}.} \\ \cmidrule{1-4}
\prog{lw} & Yes & MEM & \\ \cmidrule{1-4}
\prog{sw} & Yes & MEM & \\ \cmidrule{1-4}
\prog{lb} & Yes & MEM & \\ \cmidrule{1-4}
\prog{sb} & Yes & MEM & \\ \cmidrule{1-4}
\prog{lwr} & Yes & R3 & Looks up at address R2 + R3\\ \cmidrule{1-4}
\prog{swr} & Yes & R3 & \\ \cmidrule{1-4}
\prog{sbr} & Yes & R3 & \\ \cmidrule{1-4}
\prog{lbr} & Yes & R3 & \\ \cmidrule{1-4}
\prog{hlt} & Yes & H & Infinite loop \\ \cmidrule{1-4}
\prog{li} & No & I & Translates to appropriate \prog{li*} commands \\ \cmidrule{1-4}
\prog{lilz} & Yes & I & Loads lower byte, zeroes upper byte \\ \cmidrule{1-4}
\prog{liuz} & Yes & I & Loads upper byte, zeroes lower byte \\ \cmidrule{1-4}
\prog{lil} & Yes & I & Loads lower byte \\ \cmidrule{1-4}
\prog{liu} & Yes & I & Loads upper byte \\ \cmidrule{1-4}
\prog{push} & No & R & Stores contents of register on top of stack  \\ \cmidrule{1-4}
\prog{pop} & No & R & Loads contents of register from top of stack  \\ \cmidrule{1-4}
\prog{move} & No & R2 & \\ \cmidrule{1-4}
\prog{not} & No & R2 & \\ \cmidrule{1-4}
\prog{move} & No & R2 & \\ \cmidrule{1-4}
\prog{jz} & No & R2 & “Jump if Zero“ \\ \cmidrule{1-4}
\prog{jnz} & No & R2 & \\ \bottomrule
\end{longtable}
\end{center}

In addition to this, labels \prog{\underscore clock}, \prog{\underscore output}
and \prog{\underscore input} are
mapped respectively to the clock counter, the serial output and the serial input.

Here is an example function that outputs a string on the serial output:

\begin{verbatim}
# PROCEDURE: ser_out_str
# ROLE: write null-terminated string to serial output
# ARGUMENTS: address of string in register A
ser_out_str:
    li C _output
_ser_out_str_loop:
    lb B 0(A)
    jz B _ser_out_str_ret
    sb B 0(C)
    incri A 1
    j _ser_out_str_loop
_ser_out_str_ret:
    jr RA
\end{verbatim}

\subsubsection{The assembler}

The assembler itself is written in OCaml in a quite straightforward manner, using
the \prog{ocamllex} and \prog{menhir} tools, which allows for great flexibility.

\subsection{The simulator and the monitor}

The simulator is written in C for performance reasons.

The monitor is a C program, using the curses library for output to the console.

The simulator and the monitor communicate via Unix named pipes (FIFO's), which are created in
the files \prog{/tmp/sim2mon} and \prog{/tmp/mon2sim}. The synchronization of the two programs
has been somewhat problematic, due to incorrect use of \prog{scanf} making the programs hang.

\subsection{The operating system}

\section{Results and benchmarking}

\end{document}