Visualization Generation

The visualization framework can plot a wide range of runtime data. You can find all possible arguments here.

Common Prerequisites

This section describes what is required to generate visualizations with the included visualization scripts. The same requirements apply if you only use the Visualization Suite.

Software requirements

The scripts will mostly give helpful error messages if you have misconfigured anything, so please pay attention to their output.

Python

Except for the comparison_table script, all visualizations are generated using Python. For those, Python 3 must be installed correctly on your system:

sudo apt-get install python3

Python-Packages

Each script needs additional packages, some of which are probably not yet installed on your computer. To install them, first install pip:

sudo apt-get install python3-pip

Then install the needed packages via

pip3 install numpy matplotlib pandas scikit-learn

Furthermore, we recommend installing the package

tqdm

which shows progress bars; these are particularly useful for very large runtime data.
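Since tqdm is only recommended, a script of your own can treat it as optional. The following is a minimal sketch of that pattern; the helper name `progress` and the function `total_length` are hypothetical and not taken from the included scripts.

```python
# Optional progress bars: use tqdm when it is installed, otherwise fall back
# to plain iteration. The name `progress` is a hypothetical helper.
try:
    from tqdm import tqdm as progress
except ImportError:
    def progress(iterable, **kwargs):
        """Fallback that ignores tqdm keyword arguments such as `desc`."""
        return iterable


def total_length(lines):
    """Sum the lengths of all lines while (optionally) showing progress."""
    total = 0
    for line in progress(lines, desc="parsing"):
        total += len(line)
    return total
```

With this pattern, the same loop runs with or without tqdm; only the progress display differs.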

LaTeX

For the pricing visualizations, a working LaTeX installation is required, i.e. the packages installed by

sudo apt-get install dvipng texlive-latex-extra texlive-fonts-recommended

should be present and up to date.
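Before running the pricing plotter, it can help to check that the LaTeX tools are reachable. The sketch below assumes that the relevant executables are called latex and dvipng; the function name `missing_tools` is hypothetical and not part of the included scripts.

```python
import shutil


def missing_tools(tools=("latex", "dvipng")):
    """Return the names of the given executables that are not on the PATH.

    The default tool names are an assumption about what the pricing
    visualizations need; adjust them to your setup.
    """
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list returned by `missing_tools()` means that both executables were found on the PATH.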

Run tests

First, you have to do a test run to gather the data for the visualizations. A guide on how to do that can be found here.
The scripts sometimes expect certain flags to be set. The following table gives an overview of the requirements on the runtime data for each visualization script.

Visualization           Expected Input   Required compile flags   Required test flags
Performance Profile     .out/.res        -                        -
General                 .out             -                        `STATISTICS=true`
Bounds                  .out             `STATISTICS=true`        `STATISTICS=true`
Classifier/Detection    .out             -                        `DETECTIONSTATISTICS=true`
Pricing                 .out, .vbc       `STATISTICS=true`        -
Tree                    .vbc                                      -
Comparison Table        .res             -                        -

Generate Visualizations Manually

You can create different plots with the included scripts in the stats folder. All scripts are briefly explained in the following. When describing how to execute them, it is always assumed that you are inside the stats folder.

Common arguments

The following arguments are common across all following scripts (except for the performance profile plotter and comparison table).

Defining the Output Directory

-o OUTDIR, --outdir OUTDIR
                      output directory (default: "plots")

Parsing without Plotting, Plotting without Parsing

-save, --savepickle      parses the given .out-file without plotting
-load, --loadpickle      loads the given .pkl-file and plots it
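The idea behind these two arguments is to parse the runtime data once and plot it later as often as needed. The following sketch illustrates that workflow with Python's pickle module; the function names and the shape of the parsed data are assumptions made for illustration and do not reflect the actual format written by the scripts.

```python
import os
import pickle
import tempfile


def save_parsed(data, path):
    """Write already-parsed runtime data to a pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(data, handle)


def load_parsed(path):
    """Read parsed runtime data back without touching the .out file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical parsed data: one record per instance.
parsed = {
    "instance_a": {"DETECTION TIME": 1.5},
    "instance_b": {"DETECTION TIME": 12.0},
}
pickle_path = os.path.join(tempfile.mkdtemp(), "check.gcg.pkl")
save_parsed(parsed, pickle_path)
restored = load_parsed(pickle_path)
```

Loading the pickle skips the parsing step entirely, which pays off for large .out files.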

Performance Profile Plotter

Using this plotter, one can generate performance profiles. In those, the x-axis represents the factor by which the respective run is worse than the optimal run, while the y-axis shows the corresponding probability, i.e. the fraction of instances on which the run is at most that factor worse.
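To make the two axes concrete, the following sketch computes the points of a performance profile from plain lists of solving times. It is an illustration under simplifying assumptions (all times positive, every run solved every instance), with hypothetical function names; it is not the implementation used by performance_profile.py.

```python
def performance_ratios(runtimes):
    """Map each run to its per-instance ratio to the best run.

    `runtimes` maps a run name to a list of positive solving times, one per
    instance, in the same instance order for every run.
    """
    runs = list(runtimes)
    n_instances = len(runtimes[runs[0]])
    best = [min(runtimes[run][i] for run in runs) for i in range(n_instances)]
    return {
        run: [runtimes[run][i] / best[i] for i in range(n_instances)]
        for run in runs
    }


def profile_value(ratios, factor):
    """Fraction of instances on which the run is within `factor` of the best."""
    return sum(1 for ratio in ratios if ratio <= factor) / len(ratios)


# Hypothetical solving times of two runs on four instances.
times = {"run1": [1.0, 4.0, 10.0, 3.0], "run2": [2.0, 2.0, 5.0, 3.0]}
ratios = performance_ratios(times)
```

Here, `profile_value(ratios["run1"], 1.0)` is the height of the run1 curve at x = 1, i.e. the share of instances on which run1 was (one of) the fastest.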

Execution

python3 general/performance_profile.py FILES

with FILES being some (space-separated) .res or .out files in the format as shown in 'What files do I get'.

General Plotter

The general plotter is able to plot two or more arguments parsed by the general_parser.py script in different ways. You can find all parsed arguments here.

Execution

python3 general/{bubble,plot,twin,time}.py [args] FILES

with FILES being either some pickles, or some outfiles, and [args] being as defined below.

Arguments

Defining Data to be plotted

Times are the most commonly plotted arguments (hence the naming), but you can use any parsed argument you want. You can find a list of those here.

You can define what data to plot with the --times argument:

-t [TIMES [TIMES ...]], --times [TIMES [TIMES ...]]
                      times to be plotted.

The number of times you may define depends on the script:

for bubble.py and twin.py, you may define exactly two times (or none),
for plot.py, you can define as many times as you like, and
for time.py, you can also define an arbitrary number, but the "OTHERS" time will be added automatically.

Defining Filters for the Data

For all arguments, you can define filters. For example, if you only want to plot those instances that had a detection time between 10 and 20 seconds, you simply add those bounds to the time argument:

python3 general/time.py check.gcg.out -A -t "DETECTION TIME" 10 20 "RMP LP TIME"

In this example, we generate all plots (-A) for all instances in the check.gcg.out file that had a detection time between 10 and 20 seconds, using the arguments "DETECTION TIME" (with filter) and "RMP LP TIME" (without filter).
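The filter syntax can be read as "a name, optionally followed by a lower and an upper bound". The sketch below shows one way such a token list could be grouped and applied; the function names are hypothetical, the bounds are assumed to be inclusive, and the actual scripts may handle this differently.

```python
def is_number(token):
    """Return True if the token can be read as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def group_times(tokens):
    """Group tokens into (name, lower, upper); bounds are None if absent."""
    groups = []
    i = 0
    while i < len(tokens):
        name = tokens[i]
        bounds = tokens[i + 1:i + 3]
        if len(bounds) == 2 and all(is_number(b) for b in bounds):
            groups.append((name, float(bounds[0]), float(bounds[1])))
            i += 3
        else:
            groups.append((name, None, None))
            i += 1
    return groups


def passes(record, groups):
    """Check that a record satisfies every filter that carries bounds."""
    return all(
        lower is None or lower <= record[name] <= upper
        for name, lower, upper in groups
    )


groups = group_times(["DETECTION TIME", "10", "20", "RMP LP TIME"])
```

An instance is kept only if `passes` returns True for its record, so arguments without bounds never exclude an instance.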

Defining additional Arguments (only for time.py)

For the time.py script, you can define which plots you want to generate. The arguments are as follows:

-A, --all             create all visualizations
-B, --bar             create barchart
-G, --grouped-bar     create barchart, grouped by CPU times
-P, --plot            create simple plot
--pie                 create a pie chart plot (will automatically activate --single (see below))

Additionally, for the grouped-bar plot, you may define a number of buckets into which the script automatically sorts the instances, such that each bucket holds (nearly) the same number of instances.

--buckets BUCKETS     amount of buckets (resolution of grouped_bar plot)
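The bucketing can be pictured as sorting the instances by CPU time and cutting the sorted list into pieces of (nearly) equal size. The following sketch shows this idea with a hypothetical function `split_into_buckets`; it is not the code used by time.py.

```python
def split_into_buckets(values, n_buckets):
    """Sort values and split them into buckets whose sizes differ by at most one."""
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n_buckets)
    buckets = []
    start = 0
    for index in range(n_buckets):
        # The first `extra` buckets receive one additional element.
        size = base + (1 if index < extra else 0)
        buckets.append(ordered[start:start + size])
        start += size
    return buckets


# Hypothetical CPU times (in seconds) of seven instances.
cpu_times = [12.0, 0.5, 3.2, 7.7, 1.1, 30.4, 2.0]
buckets = split_into_buckets(cpu_times, 3)
```

A higher number of buckets gives a finer resolution of the grouped bar chart, at the cost of fewer instances per group.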

Finally, you can compare different runs. It is recommended, but not required, that the runs were performed on the same test set.

--single              if set, all outfiles will be summed and cumulated in a single plot
--compare             if set, each outfile will be summed and plotted with all other outfiles

Bounds Plotter

The Bounds Plotter generates a plot showing the development of the primal and dual bound and gap in the root node, as well as the basic variables generated.

Execution

python3 bounds/plotter_bounds.py FILES

with FILES being one or more .out files.

Classification and Detection Plotter

This plotter works on a whole testset and makes plots similar to performance profiles, showing the performance of the classifiers and detectors.

Execution

python3 detection/plotter_detection.py FILES

with FILES being one or more .out files.

Pricing Plotter

The Pricing Plotter generates 7 different plots illustrating the pricing procedure during a single instance's solving process. When given an outfile with more than one instance, it generates the plots sequentially.

Execution

python3 pricing/plotter_pricing.py FILES --vbcdir VBC

with FILES being one .out file and VBC being the directory that contains all corresponding .vbc files (default: check/results/vbc/).

Complete Plot

This plot shows, for all nodes of the branch-and-bound tree, how many variables were generated in each pricing round and how much time that took. The node numbers are shown above the plot and the rounds are in the line below that. Each bar represents the iteration of one pricing problem. In the example shown, only one pricing problem existed, thus there is just one color. Note that the numbers of the pricing problems can have gaps in between, since pricing problems could have been aggregated prior to the pricing. Whether the variables are useful is shown by the bars below zero: they mean that variables of that pricing iteration are in the optimal solution of the root LP (Root LP Sol) or IP (Incumbent). Finally, the dots show how many columns are taken from the column pool. Note that the shown plot was generated with the --lines 0 argument, which means that no lines are printed between pricing rounds.

Time Plot

The Pricing Time Statistics include four pie charts. The first one shows how much of the runtime was needed in the reduced cost pricing, in the master LP and during the initial Farkas pricing. The upper center one shows the relative (and, inside the slices, absolute) time needed by each pricing problem whose slice spans at least 11 degrees of the pie (the smallest angle at which the absolute numbers inside the slices are still readable). Note that if no absolute numbers are needed and as many slices (pricing problems) as possible should be shown, the --short-times argument can be set. The pie chart to the upper right shows how many columns were generated by each pricing problem. The ratio between the upper right and the upper center chart, i.e. the variables per second, is shown in the lower left, illustrating which pricing problem yielded the most variables for the RMP. Finally, an additional subplot illustrates the seconds needed by each pricing problem to generate a variable.
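The two derived quantities in this plot, the 11-degree threshold and the variables per second, are simple arithmetic. The sketch below illustrates both on made-up numbers; the function names and the data layout are hypothetical and not taken from plotter_pricing.py.

```python
def labeled_slices(times, min_degrees=11.0):
    """Return the pricing problems whose pie slice spans at least `min_degrees`."""
    total = sum(times.values())
    return [
        problem
        for problem, seconds in times.items()
        if 360.0 * seconds / total >= min_degrees
    ]


def variables_per_second(columns, times):
    """Columns generated per second of pricing time, for each pricing problem."""
    return {problem: columns[problem] / times[problem] for problem in times}


# Hypothetical data: pricing problem id -> seconds spent / columns generated.
times = {0: 6.0, 1: 3.75, 2: 0.25}
columns = {0: 12, 1: 15, 2: 1}
```

In this made-up example, pricing problem 2 covers only 9 degrees of the pie and would therefore not be labeled, although it has a high rate of variables per second. The seconds needed per variable are the reciprocal of that rate.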

Summary Plot

The summary plot aims to illustrate the same thing as the "complete plot". The end of the root node, which is treated in more detail in the Bounds Plot (see the Bounds Plotter section), is marked by a red line. The plot has two different y-axes, one representing the time (in seconds) needed for the pricing and the other the fraction of pricing problems that generated variables. This makes it possible to identify pricing rounds that ran for a long time and to see when and how many pricing problems were successful. In the given plot, one can see that during the initial Farkas pricing, nearly all pricing problems generated columns, while in the rest of the time spent in the root node, close to none did. Then, in the rest of the tree, for most of the time, fewer than 30% of the pricing problems did not generate any variables.
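The value on the second y-axis can be computed per pricing round as the share of pricing problems that returned at least one column. The following sketch uses a hypothetical data layout (one list of column counts per round) purely for illustration.

```python
def success_fraction(columns_per_problem):
    """Fraction of pricing problems that generated at least one column."""
    successful = sum(1 for n in columns_per_problem if n > 0)
    return successful / len(columns_per_problem)


# Hypothetical rounds: (pricing time in seconds, columns found per pricing problem).
rounds = [
    (0.8, [3, 1, 2, 1]),
    (2.5, [0, 0, 1, 0]),
    (0.1, [0, 0, 0, 0]),
]
summary = [(seconds, success_fraction(cols)) for seconds, cols in rounds]
```

Each entry of `summary` pairs the two y-values of one pricing round: the time spent and the fraction of successful pricing problems.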

Bubble Plot

In this visualization, one can see all pricing problems listed vertically along the y-axis. In the left subfigure, they are shown against the pricing rounds on the x-axis. Every time the pricer yielded at least one variable resulting from a pricing problem, a dot is printed in the round where it was generated. This makes it possible to see not only how useful each pricing problem is, but also in which rounds which pricing problem performed best. For example, around the 200th round, only one or two pricing problems generated columns, whereas at the end of the Farkas pricing, nearly all generated variables. The subplot on the right-hand side shows what percentage of the variables was generated by which problem.

Gap Plot

The gap plot compares the solving time of a pricing problem (of which usually more than one are solved during a pricing round) with the size of the gap in the root node at that point in time. The gap here is the ratio between the maximum gap (the ratio of the worst, but finite primal bound and the worst, but finite dual bound) and the gap in the root node at the point in time given by the x-axis. In the example shown, the pricing problems that ran when most of the gap was closed were solved nearly immediately. Note that it is also possible to show the time of one pricing iteration instead of one pricing problem, which sums up all points seen here that belong to the same iteration. Note that the data used for the plots is in general only sampled once per 0.01 seconds, which leads to a "discrete" distribution.

Depth Plot

This figure illustrates how the gap develops along the depth of the branching tree. Each dot represents the gap given by the primal and dual bounds in one specific node, as listed in the GCG "root bounds" table (just like in the bounds plotter). The node is located at the tree depth that can be read on the x-axis, such that for each x-coordinate, at most 2^x points can exist. Furthermore, a plot of the mean is given.
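The mean curve and the 2^x bound can be illustrated with a few lines of Python. The sketch below assumes binary branching and a hypothetical input of (depth, gap) pairs; it does not reproduce how the pricing plotter obtains these values.

```python
from collections import defaultdict


def mean_gap_by_depth(nodes):
    """Average the gap of all nodes that share the same tree depth."""
    gaps = defaultdict(list)
    for depth, gap in nodes:
        gaps[depth].append(gap)
    return {depth: sum(values) / len(values) for depth, values in sorted(gaps.items())}


def within_binary_limit(nodes):
    """Check that no depth d holds more than 2**d nodes (binary branching)."""
    counts = defaultdict(int)
    for depth, _ in nodes:
        counts[depth] += 1
    return all(count <= 2 ** depth for depth, count in counts.items())


# Hypothetical nodes given as (depth, gap).
nodes = [(0, 0.5), (1, 0.25), (1, 0.75), (2, 0.125), (2, 0.375)]
means = mean_gap_by_depth(nodes)
```

The dictionary `means` corresponds to the plotted mean line, with one value per x-coordinate.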

Node ID Plot

The Node ID plot is similar to the Depth Plot. Instead of the depth in the branch-and-bound tree, the x-axis now shows the node ID. This makes it possible to see behavior that depends not on the depth, but on the progression of time during branching.

Tree Statistics Plotter

The Tree Statistics Plotter, just like the Pricing Plotter, needs the vbc files to function correctly. It will plot how many nodes were opened on each level.

Execution

python3 tree/plotter_tree.py FILES

with FILES being some .vbc files.
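Counting the nodes per level only requires the parent of every node. The sketch below works on a hypothetical child-to-parent mapping instead of a real .vbc file, since the file format is not described here; the function names are illustrative as well.

```python
from collections import Counter


def node_depths(parents):
    """Compute the depth of every node from a child -> parent mapping.

    The root is the node whose parent is None.
    """
    depths = {}

    def depth_of(node):
        # Recursive lookup with caching; very deep trees would need an
        # iterative variant to stay below Python's recursion limit.
        if node not in depths:
            parent = parents[node]
            depths[node] = 0 if parent is None else depth_of(parent) + 1
        return depths[node]

    for node in parents:
        depth_of(node)
    return depths


def nodes_per_level(parents):
    """Count how many nodes were opened on each level of the tree."""
    return dict(sorted(Counter(node_depths(parents).values()).items()))


# Hypothetical tree: node id -> parent id.
parents = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 4}
levels = nodes_per_level(parents)
```

The resulting dictionary maps each level to its number of nodes, which is the quantity the tree statistics plots display.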

Comparison Table

A (quite raw) comparison of test runs can be done using this script in the general folder. The script puts the statistics of all runs given as arguments into a .tex file and prints it as ASCII on the console.

Execution

./general/comparison_table.sh run1.res run2.res run3.res ...

with run1.res, run2.res, ... being .res files in the format as shown in 'What files do I get'.

Tree Visualizations

Note: The following guide concerns external software. We provide neither warranty nor support for it.

Note: The tree visualization tool (vbctool) does not display aggregated information of the tree's development as the tree statistics plotter does, but shows the tree's development interactively.

In this section, we give a brief guide on how to use a tool that visualizes branch-and-cut algorithms, graphically showing how the tree was built during branching.

Installation

In order to generate pictures of the Branch and Bound tree that GCG used during solving, you can use the vbctool. Since the executable might have issues with the linking of the libraries, it is suggested to download the vbctool source code and additionally the Motif Framework source code, both available on the website. Unzip the Motif Framework source code tarball into the lib/ folder of the vbctool. Before starting with the Build Instructions, install the following packages:

sudo apt-get install libmotif-dev libxext-dev

Then, compile the program (as explained in the Build Instructions):

cd lib/GFE
make
cd ../GraphInterface/
make
cd ../MotifApp/
make
cd ../..
make

Now you can start the program using

./vbctool

Usage

The files you now have to read (File -> Load) are included in the folder check/results/vbc.


In order to generate the tree, click on Emulation -> Start. Before doing that, you can configure the emulation in Emulation -> Setup, where you can also set the time it will need to generate the tree.

If left on default values, the tree will generate as fast as it generated in GCG during execution, offering you a good insight into how long GCG was 'stuck' in certain nodes.
If changed, for example to 1 second, it will just generate the tree all at once and you can then save it.

To save the generated tree, just click on File -> Print and it will save a .ps file.

Test Set Selection

Using existing runtime data, you can filter instances using the instructions under "General Plotter -> Arguments -> Defining Filters for the Data". For the strIPlib, we can provide a full data set (.out and .pkl format) of runtime data (which is also used in the strIPlib). To then export a test set including the instances that your filter applies to, you can set the flag -ts as follows:

python3 general/plot.py check.gcg.out -ts -t "DETECTION TIME" 10 20 "RMP LP TIME"

Note that the test set file export mode is only implemented in the standard plotter, so please call plot.py with your filters and the test set export flag. After executing this command, you will get a test set file filtered.test in the given output directory, with which you can call the make test target as usual. If .dec files were used, they will be included in the exported test set file.

Whenever the test set that you get is too large to use or does not satisfy your requirements, we recommend checking out our diverse test set generation functionality.

Custom Visualizations

When creating custom visualizations, one has to know exactly what data is needed to make the visualization. With this data in mind, one can then check whether it is already parsed. A list of the currently parsed data is located here. If so, one of the parsers (parser_general.py, parser_bounds.py or parser_detection.py) can be used. Otherwise, for plotter_pricing.py, the --savepickle argument of the plotter should be used each time to parse the runtime data and save it to a pickle, which is then read again for the custom plotter. Then, the plotter_ script can be created; it should import the parser that already provides the needed data with a simple import parser_.... Finally, the parser can be used just like in the other plotters.

Troubleshooting

Q: Why don't I get any detection times?
A: You probably ran the test without setting a mode (e.g. MODE=0), so GCG fell back to the readdec mode, reading any .dec files it could find instead of detecting.
