This virtual machine contains programs that you can run to evaluate the results presented in our paper.
After logging in (password: sle2017), open a terminal by clicking on the black rectangular icon on the left. Go to the artifact directory by entering cd ~/sle17.rifl.artifact/
We present the source code for an OCaml RIFL interpreter in rifl/interpreter/. The RIFL language is presented briefly in the paper (section 3). Its full syntax and operational semantics are presented in our accompanying technical report (section 3).
The source code corresponds to RIFL as follows:
rifl/interpreter/parser.mly implements the RIFL syntax.
rifl/interpreter/rifl.ml implements the RIFL operational semantics.
We will describe how to run this interpreter later with the benchmark applications (section 5).
The major results of our paper are from the controlled experiment (section 4). We present the steps to obtain these results.
We included the original virtual machines and instructions used in the controlled experiment in the separate folder "full_design_raw_data" (not in this virtual machine). This separate folder contains a readme file that points to the materials used to set up the controlled experiment. The folder also contains the raw data that we collected.
For your convenience, we have copied the relevant files into this virtual machine, under experiment/. This directory contains the following files:
experiment/starter/
experiment/submissions/*/prog*.stu, among which prog3.stu is the thumbnail program discussed in the paper. More information about the other files is available in the separate folder "full_design_raw_data" (not in this virtual machine).
experiment/submissions/*/thumbnail.stu. As discussed in the paper (section 4.2, second paragraph), we fixed minor bugs in the original programs prog3.stu. The fixed programs are named thumbnail.stu. To see our slight modifications, run diff prog3.stu thumbnail.stu.
experiment/interpreters/{inspect,control}/, for the inspect group and the control group, respectively.
experiment/interpreters/foc/
experiment/inputs/, which are used in the paper (table 5) and presented in the technical report (appendix A).
experiment/outputs/, which are described in the paper (table 5).
experiment/{test,run,complexity}.sh, which we will describe below.
Please first compile the interpreters as follows:
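The artifact's exact build commands are not reproduced here; assuming each interpreter directory ships a standard Makefile (an assumption, not something the text confirms), the compilation step looks roughly like this:

```shell
# Hypothetical compile step -- assumes a Makefile in each interpreter
# directory; the artifact's actual build commands may differ.
cd ~/sle17.rifl.artifact/experiment/interpreters/inspect/ && make
cd ~/sle17.rifl.artifact/experiment/interpreters/control/ && make
cd ~/sle17.rifl.artifact/experiment/interpreters/foc/ && make
```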
To run a single program with a single input file, use ./run.sh [interpreter version] [developer id] [input name]. For example:
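A sketch of one such invocation; the developer id "01" and input name "input1" are hypothetical placeholders, so substitute a directory under experiment/submissions/ and a file under experiment/inputs/:

```shell
# Hypothetical example -- "01" and "input1" are placeholders.
# Runs the inspect-group interpreter on developer 01's program
# with the input file named input1.
cd ~/sle17.rifl.artifact/experiment/
./run.sh inspect 01 input1
```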
To test all the thumbnail programs with all the input files, use ./test.sh. You should see the following in the terminal:
The outputs of the thumbnail programs are stored in experiment/outputs/, one per developer program. These outputs are consistent with the behavior presented in the paper (table 5), with a few exceptions: some programs with long-running loops may time out and output "[Aborted for time out]" under the current test script, though these programs would terminate given more time.
You may compare these outputs to the correct outputs presented in our accompanying technical report (appendix A). You may also add other input files to experiment/inputs/, run test.sh again, and see the updated outputs in experiment/outputs/.
We identified defects in the developer thumbnail programs by running tests and analyzing code manually. Please read the programs in experiment/submissions/*/thumbnail.stu and refer to the test outputs (table 5) to verify the defects that we found (table 4).
We provide parsers to analyze the code complexity.
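The complexity script listed in the directory contents above (experiment/complexity.sh) is presumably the entry point; a sketch, assuming it takes no arguments:

```shell
# Run the code-complexity analysis over the developer submissions.
# Assumes complexity.sh takes no arguments; check the script if the
# invocation differs.
cd ~/sle17.rifl.artifact/experiment/
./complexity.sh
```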
You should see the following output:
These numbers are presented in the paper (figure 6) and used in statistical analysis.
We provide a script experiment/statistics.r, written in R, to perform the statistical analysis presented in the paper.
To use this script, first launch the R software by clicking the "R" icon on the left. In the R terminal, enter the following command:
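If you prefer not to use the R GUI, the same analysis can likely be run non-interactively from a shell, assuming the VM also provides the Rscript front end (an assumption; the workflow above uses the interactive R terminal):

```shell
# Non-interactive alternative to the R GUI -- assumes Rscript is
# installed; output goes to the shell instead of the R terminal.
cd ~/sle17.rifl.artifact/experiment/
Rscript statistics.r
```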
You should see at least these outputs in the R terminal:
Please scroll up to find other outputs that did not fit on the screen. The p-values produced in R are consistent with the results reported in the paper (section 4.2).
To quit the R software, enter q() in the R terminal. When prompted to save the workspace, enter n.
We present the benchmark applications discussed in the paper (section 5) and the technical report (section 5). Each application was implemented in four different versions that have the same functionality (technical report section 5.1).
You may run these programs using the full RIFL interpreter. To do so, first compile the full RIFL interpreter.
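Assuming the full interpreter is the one in rifl/interpreter/ described earlier and builds like a standard OCaml project with a Makefile (both assumptions), the compilation step is roughly:

```shell
# Hypothetical build step for the full RIFL interpreter -- assumes a
# Makefile in rifl/interpreter/; the artifact's actual commands may differ.
cd ~/sle17.rifl.artifact/rifl/interpreter/ && make
```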
Detailed steps to run each application are as follows.
Programs to process binary formats ZIP, PCAP, and PNG:
Note that the output files are the same across all four versions.
Programs to process text formats JSON, OBJ, and CSV:
Note that the outputs on the screen are the same across all four versions.
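A sketch of running one application version via the rifl/applications/run.sh script mentioned later in this document; the arguments "png" and "v1" are hypothetical placeholders for an application name and one of its four versions, and run.sh's actual argument convention may differ:

```shell
# Hypothetical invocation -- "png" and "v1" are placeholders;
# consult run.sh for the actual application and version names.
cd ~/sle17.rifl.artifact/rifl/applications/
./run.sh png v1
```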
We provide a parser to compute the code complexity for benchmark programs.
You should see the following:
These are the numbers used to calculate the percentage figures in the paper (section 5). The calculation is discussed in more detail in the technical report (section 5.2).
The default setting for all interpreters in this artifact is to produce only the standard outputs of the programs, without generating error logs.
To enable the runtime error logs mentioned in the paper (footnote 2), run the interpreter with verbose level 1 or 2. For example, you may modify the script rifl/applications/run.sh at lines 21 and 23 to use the interpreter with option 1 or 2 instead of 0: