Combinatorial Optimization: Exercise Session 5

Exercise Session 5: Probabilistic and Monte-Carlo algorithms, Heuristics.

Reminders

Course Session 5 (in French)
How to submit your work
Deadline for submission: tonight at 23h59.
To run the tests : download test.tar.gz and unpack it in your work directory

Info about all future sessions

We will use the or-tools package from Google (a toolbox of Combinatorial Optimization Algorithms and solvers).
The install instructions are here. It can take a long time (~30min), so it's a good idea to do it now, in the background, so that you don't lose that time in the next session!
Wihtout or-tools, you won't be able to test your code!

Exercise 1: Graph diameter

Given an undirected unweighted graph, you want to evaluate its diameter, which is the largest distance between two nodes of the graph.
Issue: the graph can be very large (millions of nodes!).
It won't be possible to compute the diameter's exact value every time. For instance, running a BFS (Breadth-First-Search) from each node would be too slow...

We will therefore compute an approximate value, and more precisely a lower bound. This was discussed during the course, but ask again if you didn't understand.

I recommend that you simply re-use the BFS implementation done in the previous session (TD1): it's bundled in the test archive test.tar.gz.

In the file diam1.cc, finish writing the function DiametreLB() described in file diam1.h:

⚠ I'm providing excellent boilerplate for the .cc, with the "Time Limit" all written for you. Use it! ⚠

#ifndef __DIAM1_H
#define __DIAM1_H

#include <utility>
#include "ugraph.h"

using namespace std;

// Given a graph, this function tries to find two nodes whose distance
// to each other is the as large as possible (the "distance" between two
// nodes is the length of the shortest path between them).
// 
// This will represent a lower bound of the diameter.
//
// WARNING:
// Your algorithm will have to run within a given maximum time:
// Tmax = 50ms + (M+N)*1µs, where M is the number of edges and N the
// number of nodes. This corresponds to O(M+N) complexity.
//
// For example, if M = 10M (10 millions), N = 1M (1million), Tmax = 11.05s.
//
// Then, the score you'll get will solely depend on the diameter your
// program finds for the various test graphs: the greater the lower bound,
// the better the score.
// But if you exceed the time limit (on my test machine!), your score
// will be zero: keep a safe margin!
pair<int, int> DiametreLB(const UndirectedGraph& graph);

#endif  // __DIAM1_H

Test:

rm test.tar.gz; wget --no-cache http://fabien.viger.free.fr/oc/td5/test.tar.gz
tar xf test.tar.gz
make diam1

TO SUBMIT: diam1.cc

Implement in a file diam2.cc the function DiametreUB() described in file diam2.h:

#ifndef __DIAM2_H
#define __DIAM2_H

#include <vector>
#include "ugraph.h"

using namespace std;

// Given an undirected unweighted graph, this function returns either:
// - a "central" node (so, a vector<int> with a single element)
//   that minimizes the "radius" of the graph for that center: the
//   "radius" is the largest distance from the center to any other node.
// - a "central" sub-clique, i.e. a subset of nodes that are pairwise
//   directly connected by an edge, and such that the "radius" is minimal
//   for that "multi-center" (if that's unclear, ask me).
//
// In both cases, one can deduce an upper bound of the graph diameter:
// - if we returned a single node, with radius R: Diameter <= 2*R
// - if we returned a subclique, with radius R: Diameter <= 2*R+1
//
// To start simple, you can start by always returning a single node.
// The next improvement step should be to either return a single node,
// or an edge (which is a subclique of size 2).
// Then, you may try to further improve by returning larger sub-cliques
// when it helps.
//
// WARNING:
// Same time limit as the previous exercise.
// The score will this time be based on the upper bound of the diameter
// that can be deduced from your "center": the smaller the radius, the
// better the diameter upper bound, and the better the score.
vector<int> DiametreUB(const UndirectedGraph& graph);

#endif  // __DIAM2_H

Test: make diam2

TO SUBMIT: diam2.cc

Exercise 2: Minimum Feedback Arc Set

A Karate Club president is obsessed by ranking the club members. She has the history of all the Karate fights between members of the club.
She considers that "A is better than B" if A and B fought at least once, and A won the majority of fights against B.
So, we have a comparison function for any pair of two club members. Can we rank them all?
Intuitively, one may assume that if "A is better than B" and "B is better than C", then "A is better than C"...
But it's actually wrong!

There might be zero fight between A and C on record
Or, for example, C might actually have won most fights against A, i.e. C is better than A.

So, how can we establish a total ranking?
A solution is to search for a minimal size Feedback Arc Set in the directed graph whose nodes are the club members and whose arcs are the "X is better than Y".
I.e. find the smallest possible set of arcs that, when removed, makes the graph acyclic:
we'd then be able to rank all club members by using a topological sort.

Implement in a file fas.cc the function MaxDAG() described in file fas.h:

#ifndef __FAS_H
#define __FAS_H

#include "graph.h"

// Returns a subgraph of "graph" (same nodes, but only a subset of its
// arcs) that is acyclic and that contains as many arcs as possible.
//
// TIME LIMIT: Like the previous exercises, you will have a time limit,
// which will be a bit greater:  Tmax = 50ms + (M+N)*10µs
//
// SCORING:
// - 0 if your code crashes or returns an invalid output, even if that
//   only happens for a single test case.
// - If all outputs are correct for all test cases:
//   - Each test case gets 0 if it violates the time limit
//   - Else, it'll depend on the number of arcs in your returned subgraph.
DirectedGraph MaxDAG(const DirectedGraph& graph);

#endif  // __FAS_H

Test: make fas

TO SUBMIT: fas.cc