First, let's recall the formal definition of the value of a flow . The value is the net flow leaving the source:
where
represents the total flow leaving the source () and is any node to which an edge exists from .
represents the total flow entering the source (), where is any node from which an edge exists to .
The strategy is to iteratively modify to reduce the incoming flows to zero while maintaining flow conservation at all other nodes.
Algorithm:
We perform the following procedure as long as there exists an edge entering the source with positive flow :
We start at node and trace the flow backwards. Since flow conservation holds at , the positive flow leaving towards must have arrived at from some other node. We can recursively find an incoming edge with positive flow at the previous node. We repeat this backward tracing until we encounter one of two termination cases:
Case 1: We reach the source , meaning we have found a cycle of positive flow that starts and ends at : something like .
Case 2: We reach the sink . This means we have found a path of positive flow from to : like . Since the graph is finite, we cannot trace backwards indefinitely without forming a cycle or hitting a source/sink node.
Let be the sequence of edges found (either the cycle or the path). Let be the bottleneck flow on this sequence. Since all are integers, . We can define a new flow by subtracting from the flow of every edge in :
Since we subtracted the same amount from the flow entering and leaving every intermediate node in , the flow conservation is preserved at all nodes . In addition, since we only reduced flow, . The capacity constraints are still satisfied.
Let's see if
In case 1 we reduced the flow leaving (the start of the cycle) by and the flow entering (the end of the cycle) by . Looking at the formula for , both terms decrease by the same amount. Thus, the net value remains unchanged: .
In case 2 we reduced the flow entering by , but we did not change the flow leaving (since the path started at ). In the formula , the term decreases, while stays constant. Then, the net value increases: .
Exercise 4
Exercise 4.1
We must prove that for a DAG , the condition (R) is equivalent to the condition (C), where we can formalize (C) as "every node has and every node has ".
Proof that
For simplicity I will divide R in R1 (every node is reachable from on some directed path) and R2 (and is reachable from every node on some directed path).
We can use a proof by contradiction, first for and then for for .
Proof for :
Let's assume there exists a node with (that is, no incoming edges). Since is a DAG, any path reaching must terminate at via an incoming edge. If has no incoming edges, then cannot be reached from any other node, including the source . This directly contradicts R1. Therefore, must hold for all .
Proof for :
Assume there exists a node with (that is, no outgoing edges). Since has no outgoing edges, there is no path leading from to . This directly contradicts requirement R2. Therefore, must hold for all .
Since both sub-conditions are necessary consequences of R, we conclude that if R holds, then C must hold.
Proof that
Like the proof above, I will divide R in R1 and R2.
We can still use a proof by contradiction, first for R1 and then for R2.
Proof for R1: every node is reachable from
Let be the set of nodes that are not reachable from . We assume, for contradiction, that is non-empty. Since is a DAG, it admits a topological ordering. We select as the first node of that appears in the topological ordering. Since is unreachable and , by constraint C, must have at least one incoming edge (). Let be one such incoming edge. Because the ordering is topological, must appear before in the ordering. Since was chosen as the first unreachable node, its predecessor must be reachable from .
But if is reachable from , then is reachable from via the path to followed by the edge . This contradicts our initial assumption that is unreachable. Therefore, the set must be empty, and every node is reachable from .
Proof for R2: is reachable from every node .
This is the dual of the previous proof. We can consider the transposed graph , where all edges are flipped. In , the constraint in means that every node has . We can now apply the exact same topological induction used for R1 on , treating as the source. This proves that in , can reach every node. Therefore, is reachable from every node in the original graph .
Since both R and R2 are proven, we conclude that if C holds, then R holds.
Thus, a DAG satisfies property R if and only if every node has at least one incoming edge, and every node has at least one outgoing edge.
Exercise 4.2
The original problem is to maximize the number of deleted edges. This is mathematically equivalent to minimizing the size of the required set of retained edges . So
By the result of Exercise 4.1, the retained set must satisfy the C condition (every node requires an incoming edge, and every node requires an outgoing edge)
We can use the constructed bipartite graph , which has vertices. The vertices explicitly represent all connectivity requirements of Condition (C):
: nodes requiring an outgoing edge
: nodes requiring an incoming edge
A retained edge in corresponds to an edge in . Retaining a set of edges satisfies C if and only if the corresponding set of edges in constitutes an edge cover.
Therefore, finding the minimum number of retained edges is equivalent to finding the minimum size of an edge cover, , in the bipartite graph .
We can now use the relation between the minimum edge cover and the maximum matching:
Since is a constant (), the formula shows that minimizing the number of retained edges is equivalent to maximizing the size of the matching in the graph .
By substituting this relationship back into the max deletion problem, we get
Thus, the problem is solved by finding the maximum matching in the constructed bipartite graph .
Algorithm:
To obtain the set of edges to delete, we perform the following steps:
We compute a maximum matching in the bipartite graph (we proved how to do it during the lessons).
We start with . All nodes incident to these edges satisfy condition (C).
For every vertex that is not covered by (unmatched), select any available edge in and add it to .
For every vertex that is not covered by (unmatched), select any available edge in and add it to .
The final set of edges to delete from the original graph is the set of all edges not corresponding to those in , that is .
We can also verify that the number of edges retained by this procedure matches exactly the theoretical minimum derived from the edge cover theorem. The algorithm first selects the edges from the matching, contributing exactly edges. The number of remaining unmatched vertices is the total vertices minus the matched vertices, which is . Since the algorithm selects exactly one additional edge for each unmatched vertex, the total size of the retained set is . This simplifies to , which is precisely the formula for the minimum edge cover derived earlier.
Exercise 4.3
Let be the number of nodes and the number of edges in the original graph .
The total time complexity is the sum of three components:
Let's analyze , that is the construction of the bipartite graph. We create vertices ( and ). This takes .
Then, we iterate through the edges of . For each edge , we add exactly one edge to . This takes . Then .
Then we have the Maximum Bipartite Matching (). The entire algorithm is dominated by the calculation of .
To calculate we can reduce the problem to a maximum flow problem (as we saw during the lesson), so .
Finally we have the selection of retained edges, that is . In this phase, we iterate through the matching , and since , identifying these edges takes . Then we iterate through all vertices of ( nodes) to check if they are covered by . This scan takes .
For each unmatched node, we select one incident edge. In an adjacency list representation, accessing the first neighbor of a node takes constant time . In the worst case, if we scan neighbors, the total time over all nodes is proportional to the number of edges, .
Finally, we compare the retained set with the original set to determine deletions, taking .
Thus, .
Since is asymptotically negligible compared to (assuming the graph is connected enough such that ), the total time complexity is dominated by the matching phase.
Thus, the resulting algorithm runs in .