Practical Exercises in Automatic Algorithm Configuration

Escuela de Invierno – redHEUR SEIO 2024

Manuel López-Ibáñez, University of Manchester, UK (https://lopez-ibanez.eu)

Download

Download the materials from here: https://lopez-ibanez.eu/2024-redheur/all.zip

Setup

Install R: https://mlopez-ibanez.github.io/irace/#installing-r
Install RStudio Desktop (optional but useful if you have never used R before): https://posit.co/download/rstudio-desktop/

We are going to install the development version of irace (which will become irace 4.0 soon). Open RStudio (or the R console) and type (this may take a while):
```
install.packages('irace', repos = c('https://mlopez-ibanez.r-universe.dev', 'https://cloud.r-project.org'))
```
Find where irace is installed. Type in the R console:
```
library(irace)
irace_cmdline("--help")
cat(system.file(package="irace", "bin", mustWork=TRUE), "\n")
```
The last command gives you the installation folder of irace, for example, /home/user/R/irace/bin. Make a note of it!
You do not need to launch R or RStudio to run irace. If you visit the above folder, you will see that irace provides several executable files. You can call the executables directly from the Bash shell, Terminal or PowerShell:
```
/home/user/R/irace/bin/irace --help
```
If you add irace’s bin/ folder to the PATH environment variable of your operating system, then you can simply type: irace --help (or irace.exe --help in Windows). How to do that is left as homework.

For simplicity, we will use the RStudio console for the rest of this tutorial.

Install iraceplot (optional)

Install rmarkdown from the R console:

```
install.packages("rmarkdown", repos = "https://cloud.r-project.org")
```

If you have installed RStudio, then you may have pandoc installed already. Otherwise, you have to follow the instructions from: https://pandoc.org/installing.html . You can verify that pandoc is correctly installed with:
```
rmarkdown::find_pandoc()
```
and it should print the folder where pandoc is installed. Otherwise, you need to set the correct folder where pandoc (or pandoc.exe in Windows) is located with:
```
rmarkdown::find_pandoc(dir="C:/path/to/pandoc/bin/")
```
We will install the development version of iraceplot, which is compatible with irace version 4.0:
```
install.packages('iraceplot', repos = c('https://auto-optimization.r-universe.dev', 'https://cloud.r-project.org'))
```

Once the above is working, you can do:

```
library(iraceplot)
# If the default web browser does not work,
# use something like this to specify yours:
# options(browser = "google-chrome")
example("report", ask=FALSE)
```

Install Python3 and scipy (optional)

Some exercises use Python 3 and the scipy package. The installation depends on the OS that you are using. Make a note of where the python3 executable (python3.exe in Windows) is installed.
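
One way to find that path, and to check that scipy can be imported, is to ask Python itself. The following is a minimal sketch that uses only the standard library; the helper name python_setup is made up for this example.

```python
import importlib.util
import sys


def python_setup():
    """Return the interpreter path and whether scipy can be imported."""
    # sys.executable is the absolute path of the running interpreter.
    has_scipy = importlib.util.find_spec("scipy") is not None
    return sys.executable, has_scipy


executable, has_scipy = python_setup()
print("Python executable:", executable)
print("scipy available:", has_scipy)
```

Run it with the same Python that you plan to use for the exercises, because a machine may have several installations.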

Part 1: A basic scenario

Exercise 1: Basic usage

Open the folder basic and look at the files train-instances.txt, parameters.txt and scenario.txt.
Open the RStudio console and change the working directory to the location of the basic folder (Tools | Change Working Dir... or Session | Set Working Directory depending on the version of RStudio). From the R console, if the location of basic is /path/to/basic, then you can type:
```
setwd("/path/to/basic")
list.files()
```
Run irace and see what happens:
```
irace_cmdline("")
```
If the above command says that it cannot find the function, you need to load irace first using:
```
library(irace)
```
Open parameters.txt and change the value of debug to 1. Run irace again. This example illustrates how you can communicate with the targetRunner via fixed parameters. Remember to change debug back to 0.
You can also tell irace to report more details on what irace is doing:
```
irace_cmdline("--debug-level 2")
```
Let’s help irace a bit by providing an initial configuration:
```
irace_cmdline("--debug-level 2 --configurations-file=initial.txt")
```
Is there anything different between the initial configuration and parameters.txt?
Now let’s ask irace to run the best parameter configuration found on a set of test instances.
1. Edit scenario.txt and remove the # character before the line testInstancesFile.
2. Run irace again as you did above. What has happened now that did not happen before?

🥳 Congratulations! You have successfully finished your first automatic parameter configuration! 🥳

Exercise 2: Time as tuning budget

Look at the file scenario-time.txt. What is different from the file scenario.txt?
Run irace on this scenario using:
```
irace_cmdline("--scenario scenario-time.txt")
```
Looking at the output, how many runs of the targetRunner was irace able to execute?

How many different configurations was irace able to execute?

On how many instances was the best configuration evaluated?
Change maxTime to a lower value, such as 100, until you see the message:
```
WARNING: with the current settings and estimated time per run ...
```
You can do this with:
```
irace_cmdline("--scenario scenario-time.txt --max-time 100")
```
Command-line options to irace override those in the scenario.txt file.
Reduce maxTime until you see the message:
```
Error: == irace == Insufficient budget
```
What happened?
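
The sketch below illustrates, under simplifying assumptions, why a small maxTime can leave irace without enough budget: a time budget has to be converted into a number of targetRunner calls. The function name and all the numbers are made up, and the estimate that irace actually computes is more elaborate.

```python
import math


def estimated_runs(max_time, time_used, time_per_run):
    """Number of targetRunner calls that fit in the remaining time budget."""
    remaining = max(max_time - time_used, 0.0)
    return math.floor(remaining / time_per_run)


# Assume 5 seconds were spent estimating that one run takes 2.5 seconds.
for max_time in (1000, 100, 10):
    runs = estimated_runs(max_time, time_used=5.0, time_per_run=2.5)
    print(f"maxTime={max_time}: about {runs} runs")
```

With only a handful of runs left, there is no room to evaluate several configurations on several instances, which is what a race needs.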

Exercise 3: Capping

Look at the file scenario-capping.txt. What is different from the file scenario-time.txt?
Run irace on this scenario using:
```
irace_cmdline("--scenario scenario-capping.txt")
```
Looking at the output, how many runs of the targetRunner was irace able to execute?

Notice also there is now a new column Bound in the output.
Now we will disable adaptive capping:
```
irace_cmdline("--scenario scenario-capping.txt --capping 0")
```
Looking at the output, how many runs of the targetRunner was irace able to execute?
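
Roughly speaking, with capping enabled irace gives each run an execution-time bound (the Bound column), and the targetRunner is expected to stop once that bound is reached. The sketch below only illustrates this idea on the targetRunner side; run_with_bound and step are hypothetical names and are not part of irace.

```python
import time


def run_with_bound(step, bound_seconds):
    """Call step() until it returns True or the time bound is reached.

    Returns the time used, never more than bound_seconds.
    """
    start = time.perf_counter()
    while time.perf_counter() - start < bound_seconds:
        if step():  # step() returns True when the algorithm has finished
            break
    elapsed = time.perf_counter() - start
    return min(elapsed, bound_seconds)


# A run that never finishes is cut at the bound.
print(run_with_bound(lambda: False, bound_seconds=0.05))
```

Cutting poor runs early in this way is what lets irace fit more runs into the same time budget.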

Exercise 4: Examining the log file

irace creates a log file irace.Rdata that contains lots of data about the configuration process. You can load the file with:
```
results <- read_logfile("irace.Rdata")
print(results$allConfigurations)
print(results$experiments)
```

There is a lot more information in results if you know where to look. A better way to analyze the logfile is to use the iraceplot package, which we have installed above.

```
library(iraceplot)
# If the default web browser does not work,
# use something like this to specify yours:
# options(browser = "google-chrome")
report("irace.Rdata")
```

Part 2: target-runner as an executable

In this exercise, we will tune the parameters of the executable target-runner-dummy (or target-runner-dummy.exe) that you can find in the folder where irace is installed. To find that folder, type in the R console:

```
cat(system.file(package="irace", "bin", mustWork=TRUE), "\n")
```

You can invoke this executable from your OS console or directly from the RStudio console as follows (replace the path with the one you obtained above):
```
system2("/home/manu/R/x86_64-pc-linux-gnu-library/4.1/irace/bin/target-runner-dummy")
```
Open scenario.txt in the folder dummy/ and change the value of targetRunner to match the path you obtained above (possibly adding .exe at the end of the filename).
Now open the RStudio console and change the working directory to the location of the dummy folder (Tools | Change Working Dir... or Session | Set Working Directory depending on the version of RStudio). From the R console, if the location of dummy is /path/to/dummy, then you can type:
```
setwd("/path/to/dummy")
list.files()
```
First, let’s check that everything works. In the R console, run:
```
irace_cmdline("--check")
```
If it says “Check unsuccessful”, then maybe you provided the wrong path to target-runner-dummy or you forgot to add .exe at the end (in Windows only).
Now, let’s launch irace and see what it is doing:
```
irace_cmdline("")
```

Part 3: target-runner as a Python script

In this exercise, we will tune the parameters of the differential evolution optimizer provided by SciPy. Usually, you would need to write your own target-runner.py script that communicates between irace and differential_evolution. In this case, I have written a possible target-runner.py that you can find in the folder differential_evolution/.
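
To give an idea of what such a script involves, here is a minimal sketch; it is not the target-runner.py shipped with the materials. It assumes that irace calls the script with a configuration ID, an instance ID, a seed and an instance, followed by the parameter switches. The switch names (--popsize and so on), the use of the instance as the problem dimension and the Rosenbrock test function are all assumptions made for illustration.

```python
import argparse


def parse_irace_args(argv):
    """Split the arguments into run metadata and optimizer parameters."""
    config_id, instance_id, seed, instance = argv[:4]
    parser = argparse.ArgumentParser()
    # Hypothetical switches; real ones must match parameters.txt.
    parser.add_argument("--strategy", default="best1bin")
    parser.add_argument("--popsize", type=int, default=15)
    parser.add_argument("--mutation", type=float, default=0.5)
    parser.add_argument("--recombination", type=float, default=0.7)
    params = parser.parse_args(argv[4:])
    return int(seed), instance, params


def run(seed, dimension, params):
    """Minimize the Rosenbrock function and return the best value found."""
    # Imported here so that parse_irace_args() can be tried without SciPy.
    from scipy.optimize import differential_evolution, rosen

    bounds = [(-5.0, 5.0)] * dimension
    result = differential_evolution(
        rosen, bounds, strategy=params.strategy, popsize=params.popsize,
        mutation=params.mutation, recombination=params.recombination,
        seed=seed, maxiter=50, polish=False)
    return result.fun


# Example call with a made-up argument list.
seed, instance, params = parse_irace_args(
    ["1", "3", "42", "5", "--popsize", "20", "--mutation", "0.8"])
print(seed, instance, params.popsize, params.mutation)
```

A real script would pass sys.argv[1:] to parse_irace_args and print the value returned by run, since irace reads the cost from the standard output of the script.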

Open the target-runner.py. What is it doing?
Now open instances.txt and parameters.txt and try to understand how they relate to target-runner.py.
Open scenario.txt. What is different from other scenario files we have used so far?
If you are in Linux/MacOS, you can typically execute target-runner.py directly by typing in the terminal:
```
chmod u+x ./target-runner.py
./target-runner.py
```
In Windows, you need to find where python3.exe is installed, let’s say: C:/Python/bin/python3.exe. Then, in scenario.txt, set the value of targetRunnerLauncher to that string and remove the character '#' at the start of the line.
Now open the RStudio console and change the working directory to the location of the differential_evolution folder (Tools | Change Working Dir... or Session | Set Working Directory depending on the version of RStudio). From the R console, if the location of differential_evolution is /path/to/differential_evolution, then you can type:
```
setwd("/path/to/differential_evolution")
list.files()
```
First, let’s check that everything works. In the R console, run:
```
irace_cmdline("--check")
```
If it says “Check unsuccessful”, then “target-runner.py” may not have executable permissions, irace may not be able to find python3 (or python3.exe), or a Python package such as scipy may be missing.
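
The first two causes can be checked with a few lines of Python. This is only a sketch: check_target_runner is a made-up helper, and the permission test is meaningful on Linux/MacOS only.

```python
import os
import shutil


def check_target_runner(path, interpreter="python3"):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if not os.path.isfile(path):
        problems.append(f"{path} does not exist")
    elif not os.access(path, os.X_OK):
        problems.append(f"{path} is not executable (try chmod u+x)")
    # shutil.which() returns None when the program is not on the PATH.
    if shutil.which(interpreter) is None:
        problems.append(f"{interpreter} was not found on the PATH")
    return problems


print(check_target_runner("./target-runner.py"))
```

Run it from the differential_evolution folder so that the relative path points to the right file.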
Now, let’s launch irace and see what it is doing:
```
irace_cmdline("--debug-level 2")
```
Usually we do not want so much detail, so let’s cancel the execution with Ctrl+C (in Linux), ESC (in Windows) or by clicking the stop button in RStudio. You can also open the Task Manager and kill the python process, which will force irace to stop with an error.
Let’s launch irace again but this time using 2 CPUs to execute multiple calls to target-runner.py in parallel:
```
irace_cmdline("--parallel 2")
```
(If you have 4 CPUs, you could use --parallel 4)

What interesting things do you notice in the output?
Let’s wait until irace finishes to do an ablation analysis in the next part.

Part 4: Ablation analysis

You should have a file irace.Rdata in the folder differential_evolution.
We are going to do an ablation analysis between the default configuration and the best found by irace. In the R console, type:
```
ablation("irace.Rdata", src = 1, nrep=10)
```
(You can use target= to give the ID of the target configuration. By default, the best configuration found is used.)

If irace was unlucky, it could happen that the best configuration found was the default (1) and ablation will give an error.
Now you should have a file log-ablation.Rdata that contains the ablation results. Let’s visualize it:
```
plotAblation("log-ablation.Rdata", type="boxplot")
```
What can you conclude from this plot?
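
To help read the plot: the idea behind ablation is to walk from the source configuration to the target one by changing one parameter at a time, at each step keeping the change that gives the best cost. The toy sketch below shows that idea with made-up parameters and a made-up cost table. It is not irace's implementation, which evaluates each step by running the targetRunner on instances.

```python
# Made-up penalty of each parameter value; lower is better.
PENALTY = {
    "popsize": {15: 1, 20: 0},
    "mutation": {0.5: 5, 0.8: 0},
    "strategy": {"rand1bin": 2, "best1bin": 0},
}


def toy_cost(config):
    """Sum of the penalties of the values in a configuration."""
    return sum(PENALTY[name][value] for name, value in config.items())


def ablation_path(source, target, cost):
    """Greedy ablation: change one parameter at a time, best change first."""
    current = dict(source)
    remaining = [name for name in source if source[name] != target[name]]
    path = [("source", cost(current))]
    while remaining:
        # Try each remaining change and keep the one with the lowest cost.
        trials = []
        for name in remaining:
            trial = dict(current)
            trial[name] = target[name]
            trials.append((cost(trial), name, trial))
        best_cost, best_name, current = min(trials, key=lambda t: t[0])
        remaining.remove(best_name)
        path.append((best_name, best_cost))
    return path


source = {"popsize": 15, "mutation": 0.5, "strategy": "rand1bin"}
target = {"popsize": 20, "mutation": 0.8, "strategy": "best1bin"}
for name, value in ablation_path(source, target, toy_cost):
    print(name, value)
```

The parameters changed first are those that contribute most to the improvement, which is how the steps in the ablation plot are typically read.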

Part 5: ACOTSP scenario

For this exercise, we will use the ACOTSPQAP software.

In the folder acotsp, you will find a file scenario.txt. What is different from other scenario files that we have examined?
Examine also parameters-acotsp.txt and target-runner-acotsp.py.
We need to compile the C code in acotsp/src/. In Linux and MacOS, you should be able to do it from the shell / terminal with:
```
cd ./acotsp/src
make acotsp
```
In Windows, you may need to do something different to compile the code.

If everything works, you should have an executable file acotsp in the folder acotsp/src/ and the following should work:
```
./acotsp --help
```
Now go back to the folder acotsp. If you are in Linux/MacOS, you can typically execute target-runner-acotsp.py directly by typing in the terminal:
```
chmod u+x ./target-runner-acotsp.py
./target-runner-acotsp.py
```
In Windows, you need to find where python3.exe is installed, let’s say: C:/Python/bin/python3.exe. Then, in scenario.txt, set the value of targetRunnerLauncher to that string and remove the character '#' at the start of the line.
In RStudio, change the working directory to the location of the acotsp folder (Tools | Change Working Dir... or Session | Set Working Directory depending on the version of RStudio). From the R console, if the location is /path/to/acotsp, then you can type:
```
setwd("/path/to/acotsp")
list.files()
```
First, let’s check that everything works:
```
irace_cmdline("--check")
```
If it says “Check unsuccessful”, then “target-runner-acotsp.py” or “./acotsp/src/acotsp” (or “./acotsp/src/acotsp.exe”) may not have executable permissions, or irace may not be able to find python3 (or python3.exe).
Now, let’s launch irace and see what it is doing:
```
irace_cmdline("--debug-level 2")
```
Usually we do not want so much detail, so let’s cancel the execution with Ctrl+C (in Linux), ESC (in Windows) or by clicking the stop button in RStudio. You can also open the Task Manager and kill the python process, which will force irace to stop with an error.
Let’s launch irace again but this time using 2 CPUs to execute multiple calls to target-runner-acotsp.py in parallel:
```
irace_cmdline("--parallel 2")
```
(If you have 4 CPUs, you could use --parallel 4)

What interesting things do you notice in the output?
If you have enough time, let irace run to completion and then do an ablation analysis like we did earlier.

Homework

Add irace’s bin/ folder to the PATH environment variable of your operating system. Check that it works by opening a system terminal (bash shell, Terminal or PowerShell) and typing: irace --help (or irace.exe --help in Windows). Now you can repeat all the exercises by using irace from the terminal. For example, where you evaluated irace_cmdline("--check") in R, you will type irace --check in the terminal.
Use iraceplot to analyze the irace.Rdata file generated by each exercise.
You can also tune multi-objective optimizers with irace. Check the example provided by the MOEADr package: https://fcampelo.github.io/MOEADr/articles/Comparison_Usage.html