Documentation
Solve Attributes
From the solver outcomes like solving time and primal and dual gaps, PAVER computes additional values.
Gaps
The gap specifies the relative gap between the primal and dual bounds reported by a solver.
The gap between two values a and b is computed as
$$ gap(a,b) = \begin{cases}0, & \textrm{if } |a-b| < tol, \\
\infty, & \textrm{if } \min(|a|,|b|)\lt tol, \\
\infty, & \textrm{if } \max(|a|,|b|)\gt\infty, \\
\infty, & \textrm{if } a\, b\lt 0, \\
\frac{a-b}{\min(|a|,|b|)}, & \textrm{otherwise,}
\end{cases}
$$ where tol is by default 1e-9.
For instances with known optimal value, also the gap between primal and dual bound to optimal value are computed.
These are denoted as primal gap and dual gap, respectively.
Integrals
If the progress of primal and/or dual bound during the solving progress is available, e.g., in form of a GAMS solvetrace file,
PAVER can compute the primal integral, dual integral, and primal-dual integral [Berthold] from this information.
These values are computed as follows.
Let $ t_0=0 \leq t_1 \leq \ldots \leq t_n = T $ be the points in time when information about the primal and dual bounds are available and denote these values as $p(t)$ and $d(t)$, respectively.
Let $ opt $ be the known optimal value.
Then the primal integral is obtained by integrating the relative difference of $p(t)$ and $ opt $ over the time:
$$ \sum_{i=1}^n (t_i - t_{i-1})\cdot \min(|gap(p(t),opt)|, 1.0) $$
Analogously, the dual integral is obtained by integrating the relative difference of $d(t)$ and $ opt $ over the time:
$$ \sum_{i=1}^n (t_i - t_{i-1})\cdot \min(|gap(opt,d(t))|, 1.0) $$
Finally, the primal-dual integral is obtained by integrating the relative difference of $p(t)$ and $d(t)$ over the time:
$$ \sum_{i=1}^n (t_i - t_{i-1})\cdot \min(|gap(p(t),d(t))|, 1.0) $$
Note, that the formula we use for the gap is different from the one used in [Berthold]
(we divide by the minimum of primal and dual bound, while [Berthold] divides by the maximum).
Statistics
Absolute Statistics
For a solve attribute and a filter, a table like the following may be shown:
Attribute values were projected onto interval [1.0, inf].
|
SolverA |
SolverB |
SolverC |
virt. best |
virt. worst |
count |
77.00 |
77.00 |
72.00 |
77.00 |
72.00 |
arith. mean |
68.18 |
57.22 |
100.82 |
52.67 |
106.86 |
arith. std. |
175.63 |
154.20 |
156.22 |
154.20 |
166.77 |
geom. mean |
10.99 |
10.51 |
29.22 |
8.21 |
32.62 |
geom. std. |
5.92 |
5.94 |
5.84 |
5.91 |
5.42 |
sh.geom. mean |
18.52 |
18.20 |
41.58 |
14.96 |
44.22 |
sh.geom. std. |
3.11 |
2.90 |
3.30 |
2.86 |
3.29 |
min |
1.00 |
1.00 |
1.00 |
1.00 |
1.18 |
10.0% |
1.38 |
1.02 |
2.46 |
1.00 |
2.85 |
25.0% |
2.81 |
1.99 |
10.00 |
1.79 |
10.79 |
50.0% |
7.92 |
8.68 |
36.12 |
7.46 |
37.47 |
75.0% |
26.50 |
40.18 |
98.39 |
24.00 |
119.52 |
90.0% |
175.49 |
125.89 |
386.89 |
97.70 |
386.89 |
max |
912.34 |
902.50 |
663.14 |
902.50 |
667.91 |
geometric mean shift: 10.0
These numbers have the following meaning:
- count: Number of instances, after filtering, where the attribute has a value for the corresponding attribute.
In this example, solvers A and B provide a value for the attribute for 77 instance, while solver C does so for only 72 instances.
All following measures disregard instances where no value for the attribute is given.
- arith. mean and arith. std.: Arithmetic mean and standard deviation of attribute.
The arithmetic mean and standard deviation for a sequence of numbers are defined as
$$\mu = \frac{1}{n}\sum_{i=1}^n v_i \qquad \textrm{and} \qquad \sqrt{\frac{1}{n}\sum_{i=1}^n (v_i - \mu)^2}.$$
- (sh.)geom. mean and (sh.)geom. std.: (Shifted) geometric mean and standard deviation of attribute.
The (shifted) geometric mean and standard deviation for a sequence of numbers and a shift s are defined as
$$\nu = \prod_{i=1}^n (v_i + s)^{1/n} - s \qquad \textrm{and} \qquad
\exp{\sqrt{\frac{1}{n}\sum_{i=1}^n \log(v_i+s)^2 - \log(\nu + s)^2}}.$$
In this example, the shifted geometric mean and std.dev. are computed for a shift of 10.
- min, x%, max: Minimal, x-quantile, and maximal value of attribute.
That is, for Solver A, the minimal value of the attribute is 1.18, for 10% of the instances, it is below 1.38, ..., and the maximal one is 912.34.
The virtual best and virtual worst solvers are computed instance-wise from the values of all solvers.
If some solver does not provide a value, also the virtual worst one does not.
Bar charts for counts and mean values may be shown, too.
Further, a box plot may plot the attribute values for each solver,
together with the median (red line), 25% quantile (lower edge of the blue box), 75% quantile (upper edge of the blue box),
the lowest data point still within 1.5 * (75%-quantile - 25%-quantile) of the 25% quantile (the lower whisker),
the largest data point still within 1.5 * (75%-quantile - 25%-quantile) of the 75% quantile (the upper whisker),
and data points that are beyond the whiskers (the outliers).
Relative Statistics
For a solve attribute, a filter, and a reference solver, a table like the following may be shown:
Performance with respect to virt. best:
Tolerance:
relative 0.1
absolute 1.0
|
SolverA |
SolverB |
SolverC |
virt. worst |
count |
68.00 |
68.00 |
68.00 |
68.00 |
arith. mean |
1.48 |
1.73 |
19.38 |
19.63 |
arith. std. |
0.83 |
2.95 |
82.38 |
82.33 |
min |
1.00 |
1.00 |
1.00 |
1.18 |
10.0% |
1.00 |
1.00 |
1.03 |
1.35 |
25.0% |
1.00 |
1.00 |
1.58 |
2.20 |
50.0% |
1.02 |
1.00 |
3.81 |
4.12 |
75.0% |
1.53 |
1.27 |
7.67 |
7.67 |
90.0% |
2.74 |
1.97 |
20.93 |
20.93 |
max |
5.08 |
24.22 |
642.43 |
642.43 |
better |
0.00 |
0.00 |
0.00 |
0.00 |
close |
46.00 |
45.00 |
13.00 |
6.00 |
worse |
22.00 |
23.00 |
55.00 |
62.00 |
These numbers have the following meaning:
- count: Number of instances, after filtering, for which a ratio of the attribute value to the reference solver is computed.
- arith. mean / std.: Arithmetic mean and standard deviation of ratios.
- min, x%, max: Minimal, x-quantile, and maximal value of attribute.
As this is 1.0 for solvers A, B, and C in this example, it means that each solver was at least for one instance as good as the best solver on this instance (w.r.t. the considered attribute).
The max row shows, that solver A, B, and C were at most 5, 24, and 642 times slower than the best one on some instance, respectively.
- better, worse, close: Number of instances where a solver is better / worse / not-better-and-not-worse than the reference solver w.r.t. specified relative and absolute tolerances.
In this example, no solver is better than the virtual best one (which is clear by definition).
Solver A was on 22 instances at by least 10% and at least 1.0 worse than the best one,
while it was on 46 instances less than 10% or less than 1.0 worse than the best one.
Performance Profiles
The performance profiles plot the number of instances (w.r.t. the current filter) for which a certain value is below x for increasing x.
For an absolute performance profile, the value is the value of the solve attribute.
For a relative performance profile [Dolan, Moré], the value is the ratio of the solve attribute value of the solver to the minimal one among all solvers.
For an extended performance porfile [Mahajan, Leyffer, Kirches], the value is the ratio of the solve attribute value of the solver to the minimal one among all other solvers.
References
- Timo Berthold, Measuring the impact of primal heuristics, Operations Research Letters, 41:6, 611-614 (2013)
- Elizabeth D. Dolan and Jorge J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming, 91:2, 201-213 (2002)
- Ashutosh Mahajan, Sven Leyffer, Christian Kirches, Solving Mixed-Integer Nonlinear Programs by QP-Diving, Optimization Online, 2012