## A Multipurpose Formal RISC-V Specification

Without Creating New Tools

Thomas Bourgeat, Ian Clester, Andres Erbsen, Samuel Gruetter, Andrew Wright, Adam Chlipala MIT CSAIL

## Abstract

RISC-V is a relatively new, open instruction set architecture with a mature ecosystem and an official formal machinereadable specification. It is therefore a promising playground for formal-methods research.

However, we observe that different formal-methods research projects are interested in different aspects of RISC-V and want to simplify, abstract, approximate, or ignore the other aspects. Often, they also require different encoding styles, resulting in each project starting a new formalization from-scratch. We set out to identify the commonalities between projects and to represent the RISC-V specification as a program with holes that can be instantiated differently by different projects.

Our formalization of the RISC-V specification is written in Haskell and leverages existing tools rather than requiring new domain-specific tools, contrary to other approaches. To our knowledge, it is the first RISC-V specification able to serve as the interface between a processor-correctness proof and a compiler-correctness proof, while supporting several other projects with diverging requirements as well.

### 1 Introduction

#### 1.1 RISC-V as a formal-methods-research enabler

Until recently, formal-methods projects requiring specifications of processors were in an uncomfortable situation: The ISAs used in real processors were very complex, did not have openly accessible specifications, or were protected by patents.

Therefore, each formal-methods project was either inventing its own ISA (e.g. [11, 24]) or formalizing from-scratch, at the desired abstraction level, a small subset of an existing ISA (e.g. [22]).

RISC-V is a game changer: it is particularly compelling for formal-methods research, because it is a simple clean-slate design easy to reason about and yet is part of a mature open ecosystem. RISC-V is a family of instruction sets broken into different native bitwidths (32, 64, 128) and extensions (e.g., multiplication and division instructions, atomics, ...) that may be mixed and matched. The the RISC-V software ecosystem contains fully upstreamed Clang, GCC, and Linux; and more and more commercial and open-source processor designs are becoming available [1–3, 9, 10, 13, 34, 35]. All in all, today one can run a mostly complete Linux distribution on a RISC-V machine. It makes for a very compelling platform for experimentation in basic research, progressively extensible to full-fledged realistic systems.

## 1.2 From a *golden standard* to formal-methods infrastructure

There is the potential for a mutually beneficial relationship between the community of formal methods and the RISC-V community.

On the one hand, the RISC-V Foundation already adopted a *golden standard* in the form of an ISA specification in the Sail language [6]. There is now an agreed-on, machine-processed description of authorized behaviors for a significant portion of the spec, beyond the usual natural-language description.

On the other hand, the formal-methods community can leverage RISC-V as a great laboratory to prototype ideas, where researchers will be able to get mature compilation toolchains, realistic processors, and systems to turn minimal experiments into more real-world projects.

However, different research projects use different languages and tools, and even projects using the same language tend to embrace different specification styles depending on their needs. Therefore, even when several formal-methods projects use the same RISC-V ISA and maybe even the same language and tools, it seems that they still all need different formalizations. It would be better if the formalization of the ISA specification were shared between projects, because a specification is only useful and meaningful if many eyes proofread and validated it against existing systems in many ways. This large effort should be shared between projects. In other words, we aim for reusable *multipurpose formalization*.

While the current golden standard, the Sail model, has been translated automatically into the languages of proof assistants, those translated models are produced via a very verbose intermediate representation of Sail (several megabytes of files) quite impractical for formal-methods projects involving manipulation of those models by-hand.

Considering the social context and incentive structure, this outcome is not surprising. The idea of having one single formalization for use in multiple projects is fundamentally at-odds with the interests and requirements of each project using the spec, and it goes beyond the language used. Typically, each research project is interested in a different *aspect* of the specification and wants to approximate soundly, or even simplify incorrectly, the other parts of the specification. The requirement to simplify and opt out of features unrelated to the subject of study is even more important for prototypes and early-stage research projects, where no one is willing to deal with unrelated complexities. Yet, it is exactly these early-stage projects with limited engineering resources that would benefit most from being able to reuse an existing specification, rather than having to start incrementally formalizing the RISC-V ISA one more time, inevitably making design mistakes in the process.

#### 1.3 Abstracting over use cases

The main obstacle in writing a multipurpose ISA spec is to consolidate the conflicting expectations of the different users of the spec.

Again, it is natural to begin by worrying about "Tower of Babel" problems, where different users of machine specifications apply quite-different tools and programming languages. While we acknowledge that the different programming languages used can be a first-order practical concern, this paper considers that question of syntax as orthogonal to another one, subject of this paper, and at least as interesting: *To what extent do the different uses of ISA specifications fundamentally require different presentations?* 

The standard approach [6] considers the ISA specification mostly as an abstract syntax tree (almost like a very structured JSON file), with custom translators and pretty-printers crafted as-needed by different users. This syntactical tradition has been shown to be very productive to share tools between different ISAs (RISC-V, ARM, x86, ...). However, it misses an opportunity to study what is semantically invariant in all the different usages of an ISA semantics.

We instead consider that an ISA specification can be described as a *program with holes*. For us, different instantiations of those holes with different subprograms will cover the different use cases of the specification.

The idea is that a user of the specification can describe her notion of machine state, what it means to set one of the 32 registers, what it means to load something from memory, etc., and we will give her a program that completes what it means to execute any RISC-V instruction on one of her states, to translate a virtual address in that state, etc.

**Contributions.** This paper shows that it is possible to write, in a general-purpose programming language, a multipurpose RISC-V formal specification: a specification used by a variety of research projects in formal methods. We demonstrate and explain how to achieve compatibility with the following classic use cases of machine specs:

- *Interactive theorem proving*: using our specification in the Coq proof assistant for various proofs of functional correctness
- *Verilog modeling*: generating a combinational Verilog circuit specifying the state update of the machine for every instruction for model-checking verification

- *SMT-based explorations*: for memory-model exploration, an interactive generation of partial executions, iteratively checked to be compliant with the memory model axioms
- Testing: batch execution of regression tests
- *Simulation*: execution of off-the-shelf software like Linux

Our specific formal-methods case studies include the following that mimic exercises with past multipurpose specs.

- Cross model-checking of our spec with another spec
- Functional-correctness proof in Coq of individual machinecode programs
- Checking of short litmus-test programs against RISC-V's standard weak memory model

Moreover, we note the existence of the following case studies [14] based on our RISC-V Coq specification, not described in detail in the present paper, but relevant because to our knowledge they are firsts for use of a multipurpose ISA spec.

- Functional-correctness proof of a pipelined processor
- Functional-correctness proof of a software compiler
- The bridging of the two theorems into *end-to-end correctness of a software and hardware system* (including proof of an application in the compiler's source language)

The remaining sections review important elements of the RISC-V ISA, explain our specification style, and discuss how to cover different use cases (differentiated by how they instantiate holes in our spec). Our semantics is available on GitHub<sup>1</sup> under a permissive open-source license.

## 2 Overview

The nonprofit RISC-V Foundation developed an instruction set architecture specification in English [31, 32]. The goal of our project is to translate the English specification into a broadly applicable, machine- and human-readable formal specification.

There is a particular emphasis on avoiding overspecification: Platform-specific details that are left unspecified by the English specification should also be unspecified in the formal specification, while at the same time enabling users of the formal specification to pin down as many of these platform-specific details as appropriate for their use cases.

Orthogonally, we want to support the many different use cases shown in Figure 1: For instance, some users want an executable specification that returns one single final machine state, while others might want to leave some inputs, parameters, or nondeterministic choices abstract and obtain a logic formula restricting the set of possible final states, or obtain a list of final states, or obtain a checker that simply

<sup>&</sup>lt;sup>1</sup>The Haskell code is at https://github.com/mit-plv/riscv-semantics and the Coq code is at https://github.com/mit-plv/riscv-coq/.



**Figure 1.** Projects using our RISC-V specification. The boxes with rounded corners are typeclass instantiations, the instances starting with a plus denote instances derived from other instances (adding new features). The grey boxes show external projects that our specification interacts with.

answers whether a given execution trace is allowed by the specification, etc.

To summarize, these requirements lead us to the following dimensions of parameterization:

- Supported extensions (see Table 1) and bitwidth (32bit, 64-bit or left abstract)<sup>2</sup>
- Platform-specific details
- Use-case-specific details

### 2.1 Choice of language

Serious commitment to a multipurpose specification requires careful thought about the language it should be written in. One goal was to write readable functional programs that could be understood *intuitively* by hardware engineers or compiler hackers, without necessarily being familiar with the underlying features (such as monads, type classes, etc.) enabling readability and parameterizability of the specification. Further goals were to create a specification that is practical to use in interactive theorem provers and to connect our specification to other specifications, especially from the hardware world.

We found that Haskell was able to cover most of the constraints we considered, thanks to the following:

 do notation provides syntactic sugar for readable imperative-looking code, particularly useful for this specification.

- the Clash compiler [27], compiling Haskell (with bounded recursion and finite datatypes) to Verilog/VHDL, is a bridge to hardware-model-checking tools.
- hs-to-coq [7], a compiler that uses the GHC frontend to generate Haskell-like Coq code, built to prove Haskell programs in Coq, is a good bridge to interactive theorem proving in Coq.

We restrict ourselves to concepts supported by these three Haskell compilers (hs-to-coq, Clash, GHC). Via hs-to-coq, we produced a semantics that was chosen as the reference machine model in several Coq projects. Via Clash, we produced a minimal "single-cycle" Verilog execution model for which external people (authors of other specifications) checked the agreement between our spec and theirs. Finally, via GHC, we demonstrated the possibility to explore the basics of the RISC-V memory model and to test our specification as a simulator.

In our work, Haskell is a convenient stand-in for whatever turns out to be the ideal general-purpose language for writing a variety of multipurpose specifications. Writing translators to the input formats of different formal-methods tools can be a significant undertaking, so that it pays to introduce as few languages as possible that require translation. It may turn out that the ideal community-shared spec language is quite different from Haskell, but our interest here is showing that a general-purpose language can work well as the host for a multipurpose spec, leaving fine-tuning of the approach for future work. An important point is that we avoid use of any spec-language features specialized to ISAs.

 $<sup>^2\</sup>mathrm{The}$  RISC-V specification also defines a 128-bit variant that we did not consider.

| Description                | Name  | hs?          | Coq?         | Clash? |
|----------------------------|-------|--------------|--------------|--------|
| Integer                    | Ι     | 1            | 1            | 1      |
| Integer Multiply/Divide    | М     | 1            | 1            | ×      |
| Atomics                    | А     | $\checkmark$ | ×            | X      |
| Single Floating-Point      | F     | $\checkmark$ | ×            | X      |
| Control & Status Registers | Zicsr | $\checkmark$ | (\checkmark) | ×      |

Excluded: E, D, Q, L, C, B, J, T, P V, N, Zifencei, Zam, Ztso **Table 1.** Standard Extensions of RISC-V Version 20191213

#### 2.2 Structure of the specification

The RISC-V specification is composed of several *extensions* listed in Table 1, and an implementation can choose which subset of them to support. Our formalization of the specification only covers the most important of them.

**The primitives.** The key to supporting many different use cases is to specify the semantics of each instruction in terms of a small number of *primitives* listed in Figure 2, while leaving the implementations of these primitives to be filled in by the concrete use cases. The primitives include state-like constructs (for the registers, the memory, ...), plus control-flow-like constructs (endCycle) to capture the control-flow change in case of an exception (see subsection 3.2) raised in the middle of the semantics of a function (an early return).

*Missing equations.* One may complain that our specification style does not reveal what it means to write a register, and so it is incomplete.

This reaction is reasonable on one level. We could hope to define a set of laws that an instance needs to follow to be dubbed "a RISC-V instance." For example, getting the value of a register that we just wrote should return the value we have just written.

While desirable and an interesting question that we have asked ourselves, such an axiomatic description would be intractably complicated. Moreover, many questions related to the specification of RISC-V are left to be decided and formalized – they are still research questions. Examples include the memory model in presence of multi-size accesses, virtual memory, self-modifying code, etc. As such, our specification should be useful for exploring the possibilities.

**Instantiation.** The abstract monad p (of kind \* -> \*) can be instantiated differently by each use case, which keeps our spec agnostic to the concrete state of the machine and to the kind of effects that instructions can have. For instance, depending on the platform and the use case, an invocation of the storeWord primitive could update the memory of the machine state, or it could fail if the address is outside of the physical address range, or it could record constraints in a memory-model graph, or it could record an I/O event if

```
-- Indicates which stage is the source of a memory access
data SourceType = VirtualMemory | Fetch | Execute
-- Type class providing the RISC-V primitives:
class (Monad p, MachineWidth t) => RiscvMachine p t
  | p -> t where
 getRegister :: Register -> p t
  setRegister :: Register -> t -> p ()
 getFPRegister :: FPRegister -> p Int32
 setFPRegister :: FPRegister -> Int32 -> p ()
 loadByte :: SourceType -> t -> p Int8
 loadHalf :: SourceType -> t -> p Int16
 loadWord :: SourceType -> t -> p Int32
 loadDouble :: SourceType -> t -> p Int64
  storeByte :: SourceType -> t -> Int8 -> p ()
  storeHalf :: SourceType -> t -> Int16 -> p ()
  storeWord :: SourceType -> t -> Int32 -> p ()
  storeDouble :: SourceType -> t -> Int64 -> p ()
 makeReservation :: t \rightarrow p ()
 checkReservation :: t -> p Bool
 clearReservation :: t -> p ()
 getCSRField :: CSRField -> p MachineInt
 unsafeSetCSRField :: (Integral s) => CSRField -> s -> p ()
 getPC :: p t
 setPC :: t -> p ()
 getPrivMode :: p PrivMode
  setPrivMode :: PrivMode -> p ()
 commit :: p ()
 endCycle :: forall z. p z
 flushTLB :: p ()
 fence :: MachineInt -> MachineInt -> p ()
 getPlatform :: p Platform
```

Figure 2. The primitives of the abstract RISC-V monad

the address is in a range that the platform uses for memorymapped I/O, etc., and our specification is completely agnostic to these options.

The abstract type t is the type of the values stored in the integer registers. It can be instantiated with Int32, Int64, or left abstract for use cases where it makes sense to reason about all bitwidths at once. Requiring a MachineWidth type-class instance for t guarantees that there are arithmetic and logical operators for t. To distinguish t from helper integer values that do not live in registers, we introduce an additional integer type MachineInt, which is an alias for Int64, and whose more-significant bits are sometimes ignored. In fact, whenever we need an *n*-bit integer (with  $n \le 64$ ) that does not live in a register, we use MachineInt, applying bitmasking where necessary. Like some other authors [21], we believe that it is better to avoid complicating specification languages with very precise typing and therefore do not try to use *n*-bit integers for many different *n*.

Instructions. The above choice also shows up in our algebraic-datatype representation of decoded instructions:

```
data InstructionI =
```

```
Sw { rs1 :: Register, rs2 :: Register, simm12 :: MachineInt } |
Add { rd :: Register, rs1 :: Register, rs2 :: Register } |
Beq { rs1 :: Register, rs2 :: Register, sbimm12 :: MachineInt } |
```

For instance, even though the offset field of the store-word instruction is only a signed 12-bit immediate, we represent it with a (64-bit) MachineInt for simplicity. Note that this simplification does not compromise correctness, because the specification only creates instructions in the decoder, which only ever writes 12-bit values into that field.

Decode. The decoder starts by defining symbolic names for notable bitfields of the instruction inst being decoded:

```
opcode = bitSlice inst 0 7
rd = bitSlice inst 7 12
rs1 = bitSlice inst 15 20
rs2 = bitSlice inst 20 25
simm12 = signExtend 12 $
 shift (bitSlice inst 25 32) 5 .|. bitSlice inst 7 12
. . .
```

and then defines a decoder for each RISC-V extension:

```
decodeT
```

```
opcode==opcode_STORE, funct3==funct3_SW = Sw rs1 rs2 simm12
| opcode==opcode_BRANCH, funct3==funct3_BEQ = Beq rs1 rs2 sbimm1<sup>±</sup>/<sub>2</sub>ion. For the Coq proofs, we use a simplified<sup>3</sup> version that
```

and finally, checks if the decoded instruction is part of the supported extensions.

*Encode.* Our specification does not contain an instruction encoder, because its definition is implied by the decoder. However, we did write an encoder in Coq and prove that it is the inverse of the decoder. While conceptually trivial, this proof took a few days of engineering effort to work around performance limitations in Coq due to the many case distinctions that appear in the proof.

*Execute.* For each RISC-V extension supported, there is an execute function that expresses the effects of each instruction of the extension in terms of the primitives listed in Figure 2. For instance, here is the definition of the jump-andlink-register instruction, for which an explanatory prose will be provided shortly thereafter:

```
execute (Jalr rd rs1 oimm12) = do
 x <- getRegister rs1</pre>
 pc <- getPC
 let newPC = (x + fromImm oimm12) .&. (complement 1)
 if (remu newPC 4 /= 0)
   then raiseExceptionWithInfo 0 0 (fromIntegral newPC)
   else (do
     setRegister rd (pc + 4)
     setPC newPC)
```

It was transcribed from the following English definition:

The indirect jump instruction JALR (jump and link register) uses the I-type encoding. The target address is obtained by adding the sign-extended 12-bit I-immediate to the register rs1, then setting the least-significant bit of the result to zero. The address of the instruction following the jump (pc + 4) is written to register rd. Register x0 can be used as the destination if the result is not required.

The JAL and JALR instructions will generate an instruction-address-misaligned exception if the target address is not aligned to a four-byte boundary.

It uses Haskell's do notation to chain monadic operations, and it can also use standard Haskell constructs such as let or if. The binary operators (such as +, /= and .&. in this example) are provided through the MachineWidth type class, which takes a type parameter t that can be instantiated with 32-bit or 64-bit integers depending on the desired bitwidth. It inherits from Haskell's standard type classes Integral and Bits, allowing us to use the standard infix operators. MachineWidth also provides fromImm to convert immediates from instructions into register values t.

Run. Finally, we define what it means to run one instrucjust fetches an instruction, decodes and executes it, and updates the program counter, whereas the Haskell version also considers interrupts and exceptions.

Use-case- and platform-specific code. The components described so far form the specification and are grouped together in a directory called Spec. However, we have not yet defined how state is represented and how the primitives of Figure 2 are to be implemented. These use-case- and platformspecific definitions are in a separate directory called Platform and are the subject of the next section. In terms of code size, the Spec and Platform directories each are about 3K lines of code.

## 3 Different monads for different use cases

Our RISC-V specification benefits from using a monad (the p in Figure 2) in the very same way as Wadler's interpreter in his classic "The essence of functional programming" paper [30]. In this section, we want to demonstrate the resulting wide coverage of practical use cases.

#### 3.1 Simulation

The most common way of modeling processors in the formalmethods world is to consider the machine to be deterministic,

<sup>&</sup>lt;sup>3</sup>but still accurate with respect to real software and hardware, because both the processor and the compiler are proven to respect this same simplified specification

each cycle updating the state of the registers and the memory depending on the instruction present in memory at the location pointed to by the program counter (PC) register.

Concretely, we wrote two instances of the RiscvMachine type class, named Minimal32 and Minimal64, to obtain a 32and a 64-bit *machine simulator*. We instantiated the type class in the I/O monad, using references and arrays to implement the registers, the program counter, and the memory.

#### 3.2 Supporting RISC-V exceptions

Many formal-methods-oriented projects do not want to deal with exceptions or interrupts, while others are interested in modeling and leveraging them. The formal-spec user should experience the costs of those further features incrementally as they are brought into play.

Raising an exception involves two components: modifying a bunch of state (namely numerous special state registers [CSRs]) and bailing out of a cycle early, not performing the effects of the instruction that would have come after an exception is raised.

Cutting the current cycle short (via an exception) is like returning from a function early: anything prior to the exception actually happens, and nothing after the exception should have any effect at all.

Now, what can cause an exception?

- a virtual-memory translation failure
- a system privilege check
- an alignment problem
- etc....

To be able to write monadic values carrying this early-exit information, we encode the early return in a layer of a MaybeT monad transformer. The crucial primitive already appeared in our definition of the RiscvMachine monad.

endCycle :: forall z. p z

Conspicuously, an implementation of endCycle is missing from the base machine previously described as Minimal64/32, but there is one given in source file Machine.hs:

```
instance (RiscvMachine p t)
    => RiscvMachine (MaybeT p) t where
getRegister r = lift (getRegister r)
setRegister r v = lift (setRegister r v)
...
endCycle = MaybeT (return Nothing)
...
```

This instance demonstrates something powerful about our spec: composability. It takes some existing instance of the same type class and builds on it, adding in the functionality of the Maybe monad. In that monad, a computation halts as soon as a step returns Nothing, precisely capturing the "earlyreturn" behavior we want upon encountering an exception. So, endCycle returns Nothing (halting the computation), and all other functions do what they would normally do, just "lifted" into the new Maybe-infused form of the monad.

With this two-layer specification, we ran in simulation the riscv-tests test suite (rv64mi, rv64si, rv64ui, rv64ua), which is the standard community-maintained test suite.

#### 3.3 Platform modeling, MMIO, and devices

We actually use this kind of instance augmentation repeatedly in our code, first to encode the semantics of core features (like RISC-V exceptions, as we have just seen), but also to add features like memory-mapped I/O devices to existing RiscvMachine instances.

For example, we enrich the platform with a concurrently memory-mapped device: a UART that one can connect to from a terminal, which generates interrupts received by the main loop of the simulator.

With this implementation, composed of three layers of specifications, we were able to run Linux in January 2019 at about 100k instructions per second. (We have not tracked new versions of Linux since then.)

While this strategy allows us to experiment, test, and vouch for our good coverage of the spec, this artifact is not especially competitive for running significant RISC-V programs, e.g. the popular QEMU runs at several hundred million instructions per second.

#### 3.4 Interactive theorem proving

**3.4.1** Translation from Haskell. Using hs-to-coq [7], we can translate the Haskell specification to Coq. Since hs-to-coq was designed to model Haskell semantics in Coq as faithfully as possible, it ships with handwritten and auto-generated translations of Haskell's standard-library files, and by default they are referenced by the Coq files produced by hs-to-coq. However, for this project, we were not seeking a faithful reproduction of Haskell semantics in Coq but rather an idiomatic RISC-V specification in Coq. Therefore, we used hs-to-coq's *edit files* feature, which allows one to provide renaming and rewriting patterns to be applied during the translation, so that we could map all Haskell standard-library references to reasonably close Coq equivalents and obtain an idiomatic, Haskell-independent Coq specification.

We used hs-to-coq to translate the files specifying how each instruction is executed, the instruction decoder, as well as the CSR-file specification, while we manually wrote the remaining files such as supporting utility definitions, the definition of the RiscvMachine type class, and proof-specific files.

**3.4.2** Use as the interface between software and hardware. Our RISC-V Coq specification was used successfully in a project [14] that combines a compiler-correctness proof with a processor-correctness proof.

The combined theorem states that the I/O trace produced by the processor matches the one produced by the source program fed to the compiler, without referencing the RISC-V specification any more. Thus, auditors of the system can know the behavior of the system without having to audit whether both the compiler and the processor interpret the RISC-V specification in the same way, which greatly reduces the auditing burden.

3.4.3 Opting out of features and opting back in. Our first version of the translation to Coq was driven by the requirements of the compiler-correctness project mentioned above, which required a very simple and manageable spec to get started, so it was decided that initially, CSRs should not be modeled. However, this also meant that we could not use the real raiseException function, nor the translate function (translating virtual to physical addresses), which starts by reading a CSR that indicates whether virtual memory is enabled. The solution was surprisingly simple: Since we had already decided to translate manually the file containing the declaration of the RiscvMachine type class, we were free to abstract over raiseException and translate by adding it to the primitives of RiscvMachine (Figure 2). That is, we made our specification more configurable than RISC-V allows, and for the compiler, we instantiated translate to the identity function and raiseException to hard failure, because no instructions emitted by the compiler rely on exceptions, while at the same time, we kept open the possibility to instantiate these two functions with (more) real definitions.

These modifications are not RISC-V-compliant, but we consider it an important feature of our specification that we were able to make them, while still being able to translate most of the specification to Coq automatically and thus continuously pull updates and bugfixes made in Haskell into the Coq code base.

Later, when we added CSRs to the Coq specification, we wrote a simplified raiseException function. Since the compiler does not use it, it was trivial to integrate this update with the compiler-correctness proof, and now another project, still in a very early stage, that requires CSRs can use exactly the same version of our RISC-V Coq specification as the compiler-correctness proof.

#### 3.4.4 Simulator in Coq.

**State monad.** In Coq, the simplest-possible instantiation of the monad is p :=State MachineState, where State is the state monad defined as State(S A: Type) :=S $\rightarrow$  (A \* S), and MachineState is a record containing the values of the processor's registers, the program counter, the memory, the CSR file, and the current privilege level. This instantiation can be used to obtain a deterministic RISC-V simulator.

**State monad with failure.** An arguably even-simpler monad instantiation is p := OState MachineState, where

OState(S A: Type) :=  $S \rightarrow$  (option A) \* S uses a None answer to indicate that a failure occurred. Its Bind and Return operations are implemented as

Bind A B (m: OState S A) (f: A  $\rightarrow$  OState S B) := fun (s: S)  $\Rightarrow$  match m s with | (Some a, s')  $\Rightarrow$  f a s' | (None, s')  $\Rightarrow$  (None, s') end; Return A (a: A) := fun (s: S)  $\Rightarrow$  (Some a, s)

and an unrecoverable (hard) failure can be implemented as

fail\_hard S A: OState S A := fun (s: S)  $\Rightarrow$  (None, s)

For compiler-correctness proofs, fail-hard can be used to indicate that a situation occurred that the compiler is supposed to avoid, e.g. memory access at an invalid address, and a compiler-correctness proof then states that all valid source programs are translated to RISC-V programs that never cause failures.

Moreover, if the compiler has been designed to emit code that does not use certain features, the RISC-V specification can be simplified by implementing the primitives of Figure 2 used by these features as just fail\_hard. For instance, the compiler presented in [14] emits code that does not depend on the CSRs, does not use floating-point operations or atomics, and assumes that there is no virtual memory and that the code always runs at the MachineMode privilege level. Therefore, the monad instantiation used to specify its correctness implements the primitives makeReservation, checkReservation, clearReservation, getCSRField, as well as unsafeSetCSRField, getPrivMode, setPrivMode of Figure 2 as just fail\_hard (while the TLB- and floating-point-related methods were omitted altogether in the translation from Haskell to Coq).

**3.4.5** Adding instruction counters. In a separate project (unpublished at the time of writing) building on top of the compiler mentioned above, the MachineState record was extended to include counters for the number of executed instructions, number of memory accesses, and number of jumps, and proofs about how the compiler preserves these cost metrics were written, allowing one to calculate loose (but formally proven) upper bounds on the execution time of RISC-V programs.

**3.4.6** Nondeterminism. One way to add nondeterminism is to use the nondeterministic option state monad, OStateND S A := S  $\rightarrow$  option (A \* S)  $\rightarrow$  Prop, where the option's None constructor is used to indicate failure, and option (A \* S)  $\rightarrow$  Prop can be thought of as the set of all possible outcomes. Its Bind and Return operations are implemented as

Bind A B (m: OStateND S A)(f : A  $\rightarrow$  OStateND S B) := fun (s : S) (obs: option (B \* S))  $\Rightarrow$ (m s None  $\land$  obs = None)  $\lor$  (∃ a s', m s (Some (a, s')) ∧ f a s' obs); Return A (a : A) := **fun** (s : S) (oas: option (A \* S)) ⇒ oas = Some (a, s)

Why not monad transformers? We use monad transformers [23] to add logging or early returns in Haskell, but they do not work to add nondeterminism as an additional feature on top of an existing instance, because in order to obtain the right type for OStateND, one has to *start* the composition with the nondeterminism monad, rather than adding nondeterminism *at the end* of the monad-transformer composition chain, as would be required to reuse code written for OState in code for OStateND. Moreover, since this code serves as a specification, it should be easy to audit and understand, and we found that the definitions of Bind and Return above are much easier to digest than the composition of several monad transformers, where certain composition orders can result in unintended semantics.

**3.4.7 Runtime input.** Once we have nondeterminism, we can use it to model memory-mapped I/O (MMIO). For instance, in the implementation of the loadWord primitive, if the address is not a physical memory address, we delegate to the following helper function:

mmio\_load32 addr: OStateND S int32 := fun s oas ⇒
 (isMMIOAddr addr ∧ ∃ v: int32, oas =
 Some (v, (appendLog (mmioLoadEvent addr v) s))) ∨
 (~isMMIOAddr addr ∧ oas = None)

It can be read as a function that for each current state s returns a proposition that indicates whether an outcome oas (of type option (int32 \* MachineState)) is in the set of possible outcomes, distinguishing two cases based on whether the address lies in the address range reserved for MMIO. We also augment MachineState with a log to which we append an MMIO event on each load and store that falls into the MMIO address range.

Proof of a compiler targeting this specification will have to show that all states in the outcome set given by mmio\_load32 satisfy the compiler's correctness guarantees (such as being related to a state of the source-language execution), so the body of mmio\_load32 will appear on the left-hand side of an implication, so the existentially quantified v becomes universally quantified, and as expected, the compiler has to prove that its guarantees hold for all possible values v that this MMIO load could have read.

**3.4.8** Nondeterminism by means of weakest preconditions. The compiler project [14] that uses our RISC-V specification requires RISC-V semantics that given an initial state s, a monadic computation m corresponding to the execution of a sequence of primitives from Figure 2, and a desired *postcondition*, returns the weakest precondition that must hold in order for the postcondition to hold. Therefore, it seems that we need the following bridge definition

that tells when a monadic OStateND computation satisfies a postcondition:

mcomp\_sat S A (m: OStateND S A) s post :=  $\forall o, m s o \rightarrow \exists a s', o = Some (a, s') \land post a s'$ 

For an example relating this definition to the previous subsection, m could be instantiated with mmio\_load32 addr and post could be instantiated with the claim that the final state is related to a state of the source-language execution.

When instantiating m with a monadic computation involving many Binds, unfolding mcomp\_sat and all the Binds quickly leads to huge formulas involving an existential for each intermediate state and answer, and we found these formulas to be larger than what human brains can deal with productively. The solution was to treat mcomp\_sat and Bind as opaque and to prove weakest-precondition-style rules for each primitive of Figure 2, using only these rules in the compiler-correctness proof, so that the large formulas were confined to just the proofs of these rules.

However, when a processor in the Coq-embedded hardwaredescription language Kami [12] was being proved against our RISC-V specification, the same formula-explosion problem struck again, but this time, on the other (left-hand) side of the implication. Inversion rules for mcomp\_sat of primitives, dual to the weakest-precondition-style rules mentioned above, might have been a way to go, but it turned out that it is simpler (both for the compiler and the processor) to use an instantiation of the abstract monad that is more suitable for weakest-precondition generation, namely a free monad.

We use a Coq Inductive for effects with one constructor per primitive of Figure 2, and a generic free monad with one constructor for an effect followed by a continuation, plus a second constructor to indicate termination. Bind for this monad can be defined as a **Fixpoint** that flattens monadic computations that might have nested Binds as the first argument of Bind into a more canonical form. A result is almost a sequential list of effects (ended by the termination constructor of the free monad), except in the case of nondeterminism, where branching can occur, so the shape becomes a tree there.

On this free-monad structure, we can run an interpreter that computes weakest preconditions. The crucial difference between OStateND and the free-monad interpreter is that the former creates an existential for the intermediate state and answer of each Bind, whereas the latter works similarly to a continuation-passing-style interpreter and just passes updated states to the right-hand sides of the Binds, leading to considerably simpler formulas. For comparison, here is the helper function that the interpreter invokes in the loadWord case when the address is not a physical memory address:

```
mmio_load32 addr := fun s post ⇒
isMMIOAddr addr ∧ ∀ v: int32,
post v (appendLog (mmioLoadEvent addr v) s)
```

Note how, contrary to OStateND, no case for failure is needed, and the value v being read is already universally quantified, rather than existentially quantified on the left-hand side of the implication of mcomp\_sat, and if more code follows after this snippet, it will be put into post and thus be invoked with the updated state (appendLog (mmioLoadEvent addr v) s), so no intermediate existential is created.

#### 3.5 Weak memory models

In this section we outline our approach to instantiate the type class to generate all the possible outcomes of small multicore litmus tests with respect to the memory model. We instantiate the RiscvMachine type class with a runtime implementing the exploration algorithm of Kokologiannakis and Vafeiadis [20].

This algorithm revolves around 4 data structures:

- a control/data/addr dependency-bookkeeping data structure, to maintain a list of all the memory events that imply dependencies on the currently interpreted instruction
- a current partial execution graph, which is the graph of the memory events and their memory-model relations
- two bookkeeping data structures necessary for backtracking during search: a list of alternative partial execution graphs to explore later and a maintained set of all the read events that would be subject to revisiting, if a store to the same address would occur.

Intuitively, our implementation goes as follow: we write an interpreter in charge of exploring an execution path depthfirst. That interpreter also records all the alternative decisions that it could have taken on its way. The interpreter can either return successfully with a valid execution, or it can return that the execution that it explored ended up violating the memory model. In both cases, the interpreter updates the global bookkeeping of alternative executions.

More precisely, the interpreter is a classic interpreter except for the following twists.

- It keeps track of the dependencies that previous memory events have on the different registers and the PC. It is worth noting that this functionality does not need to be interleaved in the execution semantics. Rather, it can be done directly from the decoded instruction without looking at the values in registers. The reason is that, for one instruction, dependencies are statically known.
- On a load, the interpreter adds to the partial execution that the load reads from one of the stores to the same address already present in the current partial execution. The interpreter also adds all other possible alternative stores to the same address as alternative executions to explore later. The interpreter then interacts with Alloy [19] to give it the partial execution

graph that it just extended. Alloy verifies that this extension is RISC-V-compliant. If not, the interpreter signals failure after updating the backtracking structures; otherwise, it keeps running.

- On a store, the interpreter both adds a new store event to the current execution and updates the alternative partial executions: any load in the revisit set that loads from the address of the current store is the source of a new partial execution to add in the alternative partial execution list.
- When the interpreter reaches the end of one thread, it starts running the next thread. When the interpreter finishes running the last thread successfully, it returns that the current execution is a valid execution.

This straightline interpreter does not do the backtracking itself. Instead, it is called from a toplevel loop that keeps calling the interpreter on the next partial execution to explore, each time either getting a valid execution or a failure but an updated backtracking strucutre. The top level calls the interpreter until the alternative execution list is empty.

Interestingly enough, this algorithm does not require us to model several cores. Instead, we effectively only run threads one-at-a-time, just keeping track of a single register file and program counter.

The complete implementation is 800 lines of Haskell.

We use the upstream official Alloy specification [25] for RVWMO, one of the several machine-readable forms available online for the RISC-V weak memory model.

We only support word load and store instructions plus a TSO fence, as our intention was simply to demonstrate that one can implement state-of-the-art memory-model-exploration algorithms using our specification. Hence we did not go through the implementation of the exploration for atomics and release/acquire fences, which, based on the study of Kokologiannakis and Vafeiadis [20], we predict would not require interestingly different ingredients.

As is standard in memory-model work, we assume the instruction memory is separate from the data memory and not under the memory-model exploration. We also do not support mixed-size or misaligned accesses, virtual memory is deactivated, and there are no exceptions/interrupts.

At the top level, we expect an ELF file of the source program, plus the start and end program counters of the different threads. We require every thread to have only bounded executions (no scenario in which a thread could run into an infinite loop). Our implementation is then able to generate all the possible executions of the program under the RISC-V memory model, that is, which loads observed which stores for every thread.

Performance-wise, this prototype is currently bottlenecked by the interaction with Alloy. For each partial execution, we start a new Alloy session to query the freshly generated model-finding instance. Hence, the time is completely dominated by both the repeated Java runtime startups and the Alloy model finding (currently using SAT4J).

We tested our instance on a set of variations of store-buffer and message-passing litmus tests, with and without fences, with and without address dependencies. The longest example takes on the order of 10 seconds.

## 3.6 Model-checking the decode and execute functions

The riscv-formal project [33] proposes a Verilog description of the Boolean function updating one cycle: assuming a single-cycle machine, it specifies how the register file and the memory are transformed by an arbitrary instruction from the backbone of the base ISA. They also have infrastructure to model-check their Verilog description against other descriptions, for example the standard RISC-V simulator Spike [4].

We used the Haskell-to-hardware converter Clash [27] to transform our specification, using a minimal state monad instance, into a Verilog Boolean function, and the authors of riscv-formal model-checked that output against their reference riscv-formal to find discrepancies.

Interestingly enough, the Clash instance of the specification is quite similar to the Minimal instances. However, the Minimal instances are not directly usable in Clash because they use a Map for the register file, and these potentially arbitrary-size maps do not normalize well in Clash. Instead, we use the Vector datatypes in Clash.

Clash compiles by normalizing a Haskell expression into a static circuit form (eliminating recursion by unfolding it, and compiling **if**s to muxes), but it means that we are sometimes facing instabilities when working with Clash, where the normalization engine diverges. Forcing ourselves to write code that Clash can normalize led us, to eradicate programming patterns that we think were good ideas to eradicate anyway, for example, avoiding recursive code that is not obviously well-founded, in the virtual-memory translation.

Indeed, even though the virtual-memory code is always simplified away for our simple instance that does not use virtual memory, such dead code is not obviously dead to the Clash compiler during compilation, and we often observed divergence, before we manually performed the constant propagation that would allow it to realize that the tricky code was dead. By passing to Clash even conditionally dead code in the specification, we make compilation more robust to Clash version changes (normalization engine changing) and keep the door open to writing an instance in the future that does exercise this code.

However, there exists a snippet worth discussing because it is causing trouble in the current version of the Clash compiler (understandably so). The following snippet is a list concatenating the different possible decodings (by the different extensions that are activated in the machine) of an instruction bit pattern. By construction, that list is guaranteed to be of size at most 1, as each bit pattern is defined in at most one extension.

```
resultI ++
(if supportsM iset then resultM else []) ++
(if supportsA iset then resultA else []) ++
(if supportsF iset then resultF else []) ++
(if bitwidth iset == 64 then resultI64 else []) ++
(if bitwidth iset == 64 && supportsM iset
    then resultM64 else []) ++
(if bitwidth iset == 64 && supportsA iset
    then resultA64 else []) ++
(if bitwidth iset == 64 && supportsF iset
    then resultF64 else []) ++
resultCSR
```

We used a bit of a convoluted programming pattern here to maximize readability and concision. But lists (and more generally, recursive data structures) are not Clash-friendly, as they need to be normalized at compile time. Here, this normalization creates an enormous term if iset is not known. More importantly, even if iset is known, which is the case for us, it seems that Clash does not perform the constant propagation before running into the normalization black hole. It is probably possible to improve Clash's normalization so that this case is handled properly, but we currently shamelessly use a sed script to comment the offending lines.

This snippet highlights a classical example of a tradeoff: if one considers extensions and bitwidths to be statically fixed in one machine, using macros instead of **if** expressions would have side-stepped this problem altogether. On the other hand, having one spec able to handle bitwidths 32 and 64 simultaneously was a requirement for our users writing a generic RISC-V compiler and proving it generically. We decided having to replay all the proofs twice would have been a bigger headache.

# 3.7 Simple symbolic execution and an SMT-solver connection

As a small experiment to illustrate how flexible and lightweight our specification is, we used a combination of Coq and Z3 to prove equivalence between some tiny program snippets and variations or optimizations of them.

For instance, we can prove that the following two program snippets behave the same (a further optimization could remove the addi on the right that became dead code, but then the effect on register t3 is not strictly the same any more):

| addi t3, zero, 31 | slli t1, t2, 5   |
|-------------------|------------------|
| mul t1, t2, t3    | sub t1, t1, t2   |
|                   | addi t3, zero, 3 |

In Coq, we instantiate the RiscvMachine instance with a state monad with failure, using a state record that only contains a register file and a program counter, but no memory. Most

1

of the primitives of Figure 2 are unused in this experiment, so we implement them as failure. We implement a custom top-level instruction-running function that does not fetch instructions from memory but instead from a program represented as a list of instructions. Then, we state a proof goal saying that for all initial register files, the final register file obtained after running the program on the left equals the one obtained after running the program on the right. We use Coq's term-simplification tactics (the one called hnf turned out to work well for our use case) to partially apply the instruction-running function to these two concrete programs and this abstract register file, so that we obtain an equality between two register files containing the effects of the RISC-V instructions of the two programs.

Such equalities could of course be proven manually in Coq, but for the sake of the experiment, we preferred to think of Coq only as a partial-evaluation engine, and we defined some Coq notations to pretty-print this equality into the SMT-LIB format, using which we fed this proof obligation into the Z3 SMT solver. Processing the Coq file takes less than 2 seconds, and the tiny examples we tried are solved by Z3 within less than 2 seconds, except for one that took 50 seconds.

The point of this experiment is *not* to claim that we did any serious program verification here, but rather, that using our RISC-V spec, one can prototype an idea with minimal effort: The whole experiment, including the RiscvMachine instance, the sample programs, the partial-application tactics, and the pretty-printing into SMT-LIB syntax, fits into a Coq file of 200 lines of code (and a two-line shell script feeds Coq's output into Z3).

# 3.8 Building hardware gadgets by partial application

We also experimented with a more esoteric use of Clash on our specification. When we fix a machine-code program by choosing a very specialized instance of our RiscvMachine type class, Clash partially evaluates the specification to produce a description of a specialized circuit.

For example, we can partially apply the RISC-V specification on a program that computes the GCD of two numbers. If we use a type-class instance that does not have memory, except for two hardcoded addresses connected to input wires and an address connected to an output wire, and use Clash to compile to hardware, we get a sequential machine. This technique probably is not practical, though it makes for a fun demonstration of expressivity of our type-class representation.

## 4 Limitations

Our RISC-V specification is not the one officially blessed as the golden reference model by the RISC-V Foundation, and we have not (yet) validated ours against it. Moreover, we might not be fully up-to-date with the latest modifications made in the PDF specification [31, 32] that we used as our reference. As shown in Table 1, we only support the most common extensions. We also do not support dynamically changing the bitwidth.

From a design point of view, it is unsatisfactory that our RiscvMachine type class always contains the getFPRegister and setFPRegister primitives. It would be better if they only were present if the floating-point extension is supported, and similarly for makeReservation, checkReservation, and clearReservation required by the atomics extension.

Sometimes, a deep embedding of the specification is required rather than a shallow one, for instance, to export the specification to another language. In the case of exporting the specification to Coq, we were able to use the existing tool hs-to-coq which already parses Haskell to obtain a deep embedding of the specification, so we did not need to get access to a deep embedding ourselves.

We did a few experiments trying to translate the specification to other languages, but it turned out to require more engineering effort than originally expected. Parsing Haskell is only the first step, and after that, unless the target language supports type classes the same way as Haskell does, the translator would have to perform semantic analysis to resolve the meanings of overloaded operators like + or fromIntegral: They sometimes refer to operations on the integer type t (that can be 32-bit or 64-bit), while in other locations, they refer to machine-width-independent operations.

The expressive power of Haskell is a double-edged sword: It enables very concise formal specifications but puts more burden on those who want to translate it to another language.

## 5 Related work

Most recent work on multipurpose ISA specs has employed domain-specific languages toward ends similar to ours. The Sail [6, 26] language is the highest-profile today for defining ISA semantics.

Probably the most important domain-specific features of languages like Sail are dependent types for tracking bitwidths, specialized support for instructions and their bit-level representations, and specialized support for machine state and updating it imperatively.

In contrast, we restrict ourselves to work within Haskell, i.e. without dependent types or any ISA-specific language features. One more-distinctive Haskell feature we do lean heavily on is type classes, for one type class in particular standing for *machine models*, a crucial interposition point for running our spec in different modes.

One important distinction is that Sail is fundamentally designed to support multiple ISAs, where we are only concerned with one specific ISAs. Our understanding is that Sail compiles its source files to an intermediate language, a lambda calculus with dependent types, structs, and vectors, which is then pretty-printed to the different target. While this strategy works well to generate fast C code, the output of Sail for proof assistants (Coq, for example) seems to be tainted with artifacts introduced by the elaboration to the intermediate language. The output is a Coq representation hard to manipulate by hand. While Sail focuses on translation to different languages, we decided first to address the multitude of problems already arising when trying to target very few languages.

Another DSL specific to the ARM ISA family [28] received a lot of attention recently, thanks to systematic adoption for several of ARM's most-important ISA variations. That second DSL has also been translated automatically to Sail. A notable predecessor to Sail was L3 [15]. These DSLs have been used for encoding several significant mainstream ISAs beyond RISC-V, and indeed it is possible that our approach would be less appropriate for legacy ISAs with complications and baggage beyond what we had to deal with in RISC-V.

There has also been a good amount of past work writing ISA semantics directly in the languages of proof assistants. For instance, Fox and Myreen [16] defined and validated an ARM semantics in HOL4. As in our semantics, theirs uses a monadic style for state-threading and incorporation of effects. However, they implemented a single monad, rather than using parameterization over a monad as the central approach to applying one semantics in different use cases. They also performed extensive validation across a few use cases, most notably in automated testing against real ARM processors.

Goel and Hunt [17, 18] developed a detailed model of x86 in ACL2. Like us, they observed that different use cases impose different constraints on the formalization: On one hand, they want to execute their model efficiently to validate it against real x86 processors, and on the other hand, they want to prove correctness of programs against their x86 model. To bridge between these different requirements, they use an abstract state representation for program verification and a concrete state representation for execution, define correspondence predicates between the two, and prove that all modifications preserve the correspondence. Our approach can be seen as generalizing theirs to consider more than two interpretations of a semantics within a single logic.

Crafting semantics for multicore systems with weak memory models has become a substantial research area in its own right, with some of the earliest work on mainstream ISAs centered at the University of Cambridge [5, 29]. With the Sail language and others, it had already been demonstrated that the meanings of opcodes could be separated from definition of memory models. Our preliminary results show that the same should work with our semantics style.

An ISA semantics is the natural meeting point of proofs for software and hardware, though we have been surprised to see how few past multipurpose specifications have been used for substantial proofs on both sides. Considering more single-purpose specs, CompCert [22] includes quite a few

assembly-language backends with associated operational semantics (including for RISC-V), but as far as we know, these have not been reused to reason about hardware. The CakeML project has connected hardware and software proofs via a semantics for the Silver ISA [24], where, to our knowledge, neither that ISA nor its semantics have been applied outside that case study. The most-involved functional-correctness proof we know of, connected to prior multipurpose specifications, relates to a Sail ARM spec, proving correctness of address translation in Isabelle/HOL [6] (with an ongoing port to Coq [8]). In contrast, our specification has been validated through a case study [14] doing a complete functionalcorrectness proof for a simple embedded system, connecting proofs of hardware and software parts into a final theorem whose statement does not depend on our semantics. The software and hardware sides of that project are compatible with mainstream RISC-V artifacts: the verified software runs on an off-the-shelf RISC-V microcontroller, while the verified processor also runs RISC-V machine code produced by GCC.

## 6 Conclusion

In this paper we presented a machine representation of the RISC-V ISA that we demonstrated was usable in a wide variety of formal-methods projects.

While the architectural side of the RISC-V community is watching first and foremost for readability and a golden standard, we think that the part of the formal-methods community that wants to use the RISC-V ISA for new research has different priorities: the specification should work with various tools of the community. We showed our representation could achieve that: from a small prototype interacting with Alloy for memory models, to the integration of our specification in various sizable projects doing theorem proving in Coq.

We also made sure our specification was complete enough: we did not want to have a specification where the simplicity of the specification would have come only from simplifications made in the subset of the ISA modeled.

While not supporting the complete ISA, we support enough of the specification to boot an operating system (Linux) in simulation, while thanks to various modularity techniques, those features are completely hidden in developments that do not need them.

This paper demonstrated a ready-to-use and customizable solution for formal projects (Coq especially) that want a realistic specification. We hope our specification style promotes smooth extension of proofs to cover system-level instructions, exceptions, and interrupts – paying the cost of each new complexity only as it is first considered, even though all are present in the ISA semantics from the start.

## References

- [1] 2019. Kendryte K210 Datasheet. (2019). https://github.com/kendryte/kendryte-doc-datasheet.
- [2] 2019. SiFive Freedom Platforms. (2019). https://github.com/sifive/ freedom.
- [3] 2020. EH1 SweRV RISC-V Core(TM) 1.8 from Western Digital. (2020). https://github.com/chipsalliance/Cores-SweRV.
- [4] 2020. Spike, a RISC-V ISA Simulator. https://github.com/riscv/riscvisa-sim.
- [5] Jade Alglave, Anthony C. J. Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit Sarkar, Peter Sewell, and Francesco Zappa Nardelli. 2009. The Semantics of Power and ARM Multiprocessor Machine Code. In *Declarative Aspects of Multicore Programming (DAMP)*, Leaf Petersen and Manuel M. T. Chakravarty (Eds.). ACM, 13–24.
- [6] Alasdair Armstrong, Thomas Bauereiss, Brian Campbell, Alastair Reid, Kathryn E. Gray, Robert M. Norton, Prashanth Mundkur, Mark Wassell, Jon French, Christopher Pulte, Shaked Flur, Ian Stark, Neel Krishnaswami, and Peter Sewell. 2019. ISA Semantics for ARMv8-A, RISC-V, and CHERI-MIPS. *Proceedings of the ACM on Programming Languages* 3, POPL (Jan. 2019), 1–31. https://doi.org/10.1145/3290384
- [7] Joachim Breitner, Antal Spector-Zabusky, Yao Li, Christine Rizkallah, John Wiegley, and Stephanie Weirich. 2018. Ready, Set, Verify! Applying Hs-to-Coq to Real-World Haskell Code (Experience Report). *Proceedings of the ACM on Programming Languages* 2, ICFP (July 2018), 89:1–89:16. https://doi.org/10.1145/3236784
- [8] Brian Campbell. 2020. Port of Isabelle Address Translation Proof (WIP). https://github.com/rems-project/armv8a-address-translation-coq.
- [9] Christopher Celio. 2018. A Highly Productive Implementation of an Out-of-Order Processor Generator. Thesis. University of California at Berkeley.
- [10] C. Chen, X. Xiang, C. Liu, Y. Shang, R. Guo, D. Liu, Y. Lu, Z. Hao, J. Luo, Z. Chen, C. Li, Y. Pu, J. Meng, X. Yan, Y. Xie, and X. Qi. 2020. Xuantie-910: A Commercial Multi-Core 12-Stage Pipeline outof-Order 64-Bit High Performance RISC-V Processor with Vector Extension : Industrial Product. In 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA). 52–64. https: //doi.org/10.1109/ISCA45697.2020.00016
- [11] Adam Chlipala. 2013. The Bedrock Structured Programming System: Combining Generative Metaprogramming and Hoare Logic in an Extensible Program Verifier. In *ICFP*. ACM Press, 391. https: //doi.org/10.1145/2500365.2500592
- [12] Joonwon Choi, Muralidaran Vijayaraghavan, Benjamin Sherman, Adam Chlipala, and Arvind. 2017. Kami: A Platform for High-Level Parametric Hardware Specification and Its Modular Verification. *Proceedings of the ACM on Programming Languages* 1, ICFP (Aug. 2017), 1–30. https://doi.org/10.1145/3110268
- [13] P. Davide Schiavone, F. Conti, D. Rossi, M. Gautschi, A. Pullini, E. Flamand, and L. Benini. 2017. Slow and Steady Wins the Race? A Comparison of Ultra-Low-Power RISC-V Cores for Internet-of-Things Applications. In 2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS). 1–8. https://doi.org/10.1109/PATMOS.2017.8106976
- [14] Andres Erbsen, Samuel Gruetter, Joonwon Choi, Clark Wood, and Adam Chlipala. 2021. Integration Verification Across Software and Hardware for a Simple Embedded System. *PLDI'21 (Conditionally Accepted)* (2021).
- [15] Anthony Fox. 2012. Directions in ISA Specification. In *Interactive Theorem Proving*, Lennart Beringer and Amy Felty (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 338–344.
- [16] Anthony C. J. Fox and Magnus O. Myreen. 2010. A Trustworthy Monadic Formalization of the ARMv7 Instruction Set Architecture. In *Interactive Theorem Proving (ITP)*, Matt Kaufmann and Lawrence C. Paulson (Eds.). Springer, 243–258.

- [17] Shilpi Goel. 2016. Formal Verification of Application and System Programs Based on a Validated X86 ISA Model. Thesis. University of Texas at Austin. https://repositories.lib.utexas.edu/handle/2152/46437.
- [18] Shilpi Goel and Warren A. Hunt. 2013. Automated Code Proofs on a Formal Model of the X86. In Verified Software: Theories, Tools, Experiments (Lecture Notes in Computer Science), Ernie Cohen and Andrey Rybalchenko (Eds.). Springer, Berlin, Heidelberg, 222–241. https://doi.org/10.1007/978-3-642-54108-7\_12
- [19] Daniel Jackson. 2002. Alloy: A Lightweight Object Modelling Notation. ACM Transactions on Software Engineering and Methodology 11, 2 (April 2002), 256–290. https://doi.org/10.1145/505145.505149 https: //dl.acm.org/doi/10.1145/505145.505149.
- [20] Michalis Kokologiannakis and Viktor Vafeiadis. 2020. HMC: Model Checking for Hardware Memory Models. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, Lausanne Switzerland, 1157– 1171. https://doi.org/10.1145/3373376.3378480
- [21] Leslie Lamport and Lawrence C. Paulson. 1999. Should Your Specification Language Be Typed. ACM Transactions on Programming Languages and Systems 21, 3 (May 1999), 502–526. https://doi.org/10. 1145/319301.319317
- [22] Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Commun. ACM 52, 7 (2009), 107–115. http://xavierleroy.org/publi/compcert-CACM.pdf.
- [23] Sheng Liang, Paul Hudak, and Mark Jones. 1995. Monad Transformers and Modular Interpreters. In In Proceedings of the 22nd ACM Symposium on Principles of Programming Languages.
- [24] Andreas Lööw, Ramana Kumar, Yong Kiam Tan, Magnus O Myreen, Michael Norrish, Oskar Abrahamsson, and Anthony Fox. 2019. Verified Compilation on a Verified Processor. *PLDI*'19 (2019), 13.
- [25] Daniel Lustig. 2018. A Formalization of the RVWMO (RISC-V) Memory Model. https://github.com/daniellustig/riscv-memory-model.
- [26] Prashanth Mundkur, Rishiyur Nikhil, Jon French, Brian Campbell, Robert Norton, Alasdair Armstrong, Thomas Bauereiss, Shaked Flur, Christopher Pulte, and Peter Sewell. 2020. Sail RISC-V Model. (2020). https://github.com/rems-project/sail-riscv.
- [27] QBayLogic. 2020. Clash. https://clash-lang.org/.
- [28] Alastair Reid. 2016. Trustworthy Specifications of ARM® V8-A and v8-M System Level Architecture. In 2016 Formal Methods in Computer-Aided Design (FMCAD). 161–168. https://doi.org/10.1109/FMCAD. 2016.7886675
- [29] Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge, Thomas Braibant, Magnus O. Myreen, and Jade Alglave. 2009. The Semantics of X86-CC Multiprocessor Machine Code. In *Principles of Programming Languages (POPL)*, Zhong Shao and Benjamin C. Pierce (Eds.). ACM, 379–391.
- [30] Philip Wadler. 1992. The Essence of Functional Programming. In Proceedings of the 19th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '92). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/143165. 143169
- [31] Andrew Waterman and Krste Asanovic (Eds.). 2019. The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Document Version 20191213. RISC-V Foundation (Dec. 2019). https://riscv.org/technical/ specifications/.
- [32] Andrew Waterman and Krste Asanovic (Eds.). 2019. The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Document Version 20190608-Priv-MSU-Ratified. *RISC-V Foundation* (June 2019). https://riscv.org/technical/specifications/.
- [33] Claire Wolf. 2018. RISC-V Formal Verification Framework. Symbiotic EDA. https://github.com/SymbioticEDA/riscv-formal.
- [34] S. Zhang, A. Wright, T. Bourgeat, and A. Arvind. 2018. Composable Building Blocks to Open up Processor Design. In 2018 51st Annual

IEEE/ACM International Symposium on Microarchitecture (MICRO). 68– 81. https://doi.org/10.1109/MICRO.2018.00015

[35] B. Zimmer, Y. Lee, A. Puggelli, J. Kwak, R. Jevtic, B. Keller, S. Bailey, M. Blagojevic, P. Chiu, H. Le, P. Chen, N. Sutardja, R. Avizienis, A. Waterman, B. Richards, P. Flatresse, E. Alon, K. Asanović, and B. Nikolić. 2015. A RISC-V Vector Processor with Tightly-Integrated Switched-Capacitor DC-DC Converters in 28nm FDSOI. In 2015 Symposium on VLSI Circuits (VLSI Circuits). C316–C317. https: //doi.org/10.1109/VLSIC.2015.7231305