Getting Started

For this class, we are going to look at the DIVA artifact. ICSE has a great artifact culture, so the artifact is all packaged up and ready to go for us in a VM. (With that said, please install the VM before class, and make sure it works on your computer.)

We will rerun parts of the evaluation of DIVA, and peek into the output to see what the proofs it produces looks like, and how it fails when it fails. We'll also discuss a bit about what reproducibility really means in the context of machine learning tools (we would need fancy hardware to retrain the model, for example, and determinism seems to be near impossible), and how tactic prediction models could be integrated into proof engineering workflows.

Installation Instructions

  1. Install VirtualBox if you do not already have it.
  2. Download and icse2022.ova from the official artifact package.
  3. Unzip and open from the unzipped directory.
  4. Follow the instructions from to install the VM and run DIVA on a single project.

If you have any trouble, please post your questions in this thread. Please also drop any installation tips you have in that thread.

If installation fails, and the class forums are no help, do not panic; we will try to match you with someone in class for whom installation worked.

You can find the exercise we will do together in class below.

Let's Hack!

Let's Hack!

During class, after dividing into groups, open up the VM for the artifact. Then, as a group, run the model on a number of different projects (see the instructions that you downloaded when you installed the VM). Look at the outputs for particular proofs and projects, and discuss the following questions with your group:

  1. What proofs does the model seem to do well on? Do they have anything in common?
  2. What proofs does the model seem to struggle with? Do they have anything in common?
  3. Does the model seem to have any common modes of failure (for example, running the same tactic over and over again)?
  4. What would it mean to reproduce the results from this paper fully, in an ideal world? What resources might it require access to?
  5. Can you think of any ideas to address any limitations you encounter, like common failure modes or reproducibility limitations?

Please have your group post your answers in the class forum thread, listing all group members in your response.

<< Previous | Class Schedule | Next >>