Assumption: Without loss of generality, we assume the elements are distinct. This represents the strict worst-case scenario, as the presence of equal elements would only reduce the complexity of the decision tree.
For distinct elements, there are possible total orderings. A deterministic algorithm follows a fixed path in a binary decision tree, where each comparison has two possible outcomes ( or ).
Therefore, a deterministic comparison-based algorithm can distinguish at most:
2 scenarios after 1 comparison ()
4 scenarios after 2 comparisons ()
8 scenarios after 3 comparisons ()
With only 2 comparisons, the algorithm can reach at most 4 distinct leaves in the decision tree. However, an adversary can force the algorithm into a state where ambiguity remains between orderings that yield different medians.
Analysis of the worst path:
Comparison 1 ( vs ): we suppose the outcome is .
Comparison 2: The algorithm must compare the third element with one of the others.
If it compares vs and finds , we know is the maximum. However, the relative order of and is unknown. The true order could be (median is ) or (median is ).
If it compares vs and finds , we know is the minimum, but the order of and is unknown. The median could be or .
In this worst-case scenario, after 2 comparisons, the set of consistent orderings still contains candidates that yield different medians. Therefore, a third comparison is strictly necessary to resolve this final ambiguity.
Exercise 5.2
Let the set of elements be . We perform this steps:
We start by selecting one element uniformly at random. Let the other two elements be and .
We compare with and with . This costs 2 comparisons.
Based on the results of the previous step:
If one element is smaller than and the other is larger than (e.g., ), then is the median. We can terminate. In this case we would do only 2 comparisons.
If both elements are smaller than (i.e., and ), then is the maximum. We have to perform one additional comparison between and to determine the median. In this case we would do 3 comparisons.
If both elements are larger than (i.e., and ), then is the minimum. We have to perform one additional comparison between and to determine the median. In this case we would do 3 comparisons.
We now calculate the expected number of comparisons. Let the true sorted order of the elements be , where is the median (the algorithm does not know this order, but the randomness depends only on our choice of , not on the input values).
Since we choose uniformly at random from , there are three equally likely (each with probability ) cases for :
Case 1: is the minimum ().
In this case, the other two elements are and , and they are both are larger than . The algorithm detects is the minimum (step 2) and must compare vs (step 3) to find the median. The cost is 3 comparisons.
Case 2: is the maximum ().
In this case, the other two elements are and , both are smaller than . The algorithm detects is the maximum (step 2) and must compare vs (step 3) to find the median.
The cost is 3 comparisons.
Case 3: is the median ().
In this case, one remaining element is smaller () and one is larger (). The comparisons in step 2 will reveal and . The algorithm correctly identifies as the median immediately. No third comparison is needed. The cost is 2 comparisons.
So the expected number of comparisons is:
The bound holds for any input since the distribution of the is always uniform, regardless of the specific values or their initial arrangement.
Exercise 5.3
To find the minimum of elements deterministically we need 2 comparisons in the worst case.
Let the set of elements be . We can use a simple algorithm: we compare vs , then compare the smaller of those two with . After these two comparisons we know the minimum. We cannot do it with only one comparison in the worst case because one comparison only gives the order of two elements and leaves the third element unconstrained.
Randomization cannot reduce the expected number of comparisons below 2.
This is because a Las Vegas algorithm must always return the correct minimum for every possible input ordering. Let's consider the first comparison the algorithm makes: it compares two elements, we call them and . There are two possible outcomes, or .
Let's analyze the condition . Given only this information, the third element could still be:
smaller than , so the minimum is
between and , so the minimum is
or larger than , so the minimum is
Thus the single outcome is consistent with at least two different orderings that have different minimum. Therefore the algorithm cannot safely stop and declare the minimum after only this one comparison, because that would make it incorrect on some input consistent with the observed outcome. The same argument applies to the other outcome .
Hence every run uses 2 comparisons, so the expected number of comparisons is .
Optional
The case of can be generalized to arbitrary . For the problem of finding the minimum, comparisons are necessary. We can view the set of candidates for the minimum as a pool of size . To reduce this pool to a single element (the true minimum), elements must be eliminated. Since a single pairwise comparison eliminates exactly one candidate (the larger of the two), and no comparison can eliminate two candidates simultaneously, any algorithm must perform exactly comparisons to guarantee a correct result. Unlike the median problem, the problem of finding the extremum is rigid: the progress rate is fixed at one elimination per step.
If we consider the problem of selecting the median for general , efficient linear-time algorithms exist for both deterministic and randomized approaches. The randomized approach is exemplified by Quickselect, which achieves an expected time complexity of . Although its worst-case performance can degrade to , it is highly efficient (due to the low constant factors) and simpler to implement in practice. A deterministic counterpart is the Median-of-Medians, that guarantees finding the median in worst-case time.
From a theoretical standpoint, any comparison-based algorithm that selects the exact median must perform comparisons, as every element must be processed to determine its rank relative to the pivot. Therefore, randomization in this context does not lower the asymptotic complexity class below linear (it cannot beat ). Instead, its primary advantage lies in simplifying the algorithm and achieving excellent expected-time bounds without the high constant factors and implementation complexity associated with the deterministic Median-of-Medians algorithm.
Exercise 6
The problem bears a structural resemblance to the classic Travelling salesman problem, as it requires finding a minimum-cost tour covering a set of vertices. However, two distinct characteristics fundamentally alter the approach. First, the turn costs means the cost of entering a node depends on the edge used to reach it. Consequently, the standard TSP state , where is the set of visited nodes and is the current node, is insufficient because it lacks the "memory" of the previous location required to calculate the turn cost .
Second, the requirement to visit every node at least once (rather than exactly once) fundamentally changes the topological structure of the state space. In a standard TSP, the set of visited nodes strictly grows with every step, forming a Directed Acyclic Graph. In this variation, the optimal path allows revisiting nodes to perform cheaper turns or traverse lower-cost edges. This means a transition can lead from a state with subset of visited nodes to a new state with the same subset , introducing cycles into the state space. Because the graph of states contains cycles but all edge and turn costs are non-negative, we can model the problem as a shortest-path problem on a big weighted graph. We can therefore employ Dijkstra’s Algorithm over the state space to guarantee finding the global minimum, where represents the predecessor node.
Pseudocode of the algorithm
function solution(V, E, c, t):
Dist = HashMap() # Where key = (subset, previous, current) and value = cost
PQ = PriorityQueue() # We will push (cost, (subset, previous, current))
Parent = HashMap() # To reconstruct the path if needed
# Initialization: The path can start with any single edge in the graph.
# An edge (u, v) implies nodes u and v are visited.
for each (u, v) in E:
# In an actual implementation, we would use bitmasking for efficiency
# e.g., subset = (1 << u) | (1 << v)
subset = {u, v}
cost = c(u, v)
state = (subset, u, v)
Dist[state] = cost
Parent[state] = None
PQ.push(cost, state)
while PQ is not empty:
current_cost, current_state = PQ.pop()
subset, u, v = current_state
# Deletion for outdated entries in PQ
if current_cost > Dist[current_state]:
continue
# Termination condition: All nodes visited
if subset == V:
return Parent, current_state, current_cost
for each neighbor w of v:
# Calculate total step cost: Turn (u->v->w) + Edge (v->w)
step_cost = t((u, v), (v, w)) + c(v, w)
new_cost = current_cost + step_cost
# Update visited set
# In an actual implementation, we would use bitmasking for efficiency
# e.g., new_subset = subset | (1 << w)
new_subset = subset ∪ {w}
new_state = (new_subset, v, w)
# Relaxation step
# If we find a cheaper way to reach new_state, update and push to PQ
if new_cost < Dist[new_state]:
Dist[new_state] = new_cost
Parent[new_state] = current_state
PQ.push(new_cost, new_state)
return HashMap(), Null, Infinity # Should not be reached if graph is connected
The algorithm treats the problem as a search for the shortest path in a "state graph." A specific node in this abstract graph is defined by the triplet , representing a partial path that covers the set of nodes , currently ends at node , and arrived there immediately from node . We begin by identifying all possible starting moves. Since the path is undirected and has no prescribed start, any edge in the original graph is a valid beginning of a tour. We initialize the priority queue with states corresponding to traversing every single edge, setting the initial visited subset to include both endpoints.
The core of the algorithm proceeds via Dijkstra's method. In each iteration, we extract the state with the lowest accumulated cost found so far. From the current node , we explore all outgoing edges to neighbors . For every transition, we calculate the cost to extend the path, which is the sum of the edge cost and the turn cost associated with the sequence . We then determine the new state: the current node becomes , the previous node becomes , and the visited subset is updated. In an actual implementation, we would use bitmasking and the bitwise OR operation to ensure that if was already in the set , the mask would remain unchanged. This handles the "at least once" constraint, allowing the algorithm to revisit nodes without breaking the logic. If this new path to the new state is cheaper than any previously found path to that same state, we update the distance and push the new state into the priority queue. The process repeats until we extract a state where the mask contains all nodes in the graph.
In case the sequence of nodes corresponding to the minimum cost is needed, we maintain a Parent map (that is a predecessor array). Whenever a state is updated during the relaxation step (i.e., a cheaper path is found), we record the predecessor state that led to it (Parent[(S', v, w)] = (S, u, v)).
Once the target state is reached, we backtrack using these pointers. Starting from the final state, we iteratively look up the parent state until we reach the initial edge. This sequence of states is then reversed to produce the optimal ordered list of visited nodes. Crucially, storing the full state (subset and previous node) as the parent, rather than just the node index, allows us to correctly unroll cycles in the physical graph into a simple path in the state space.
Proof of Correctness
The correctness of this approach rests on the reduction of the original problem to a Shortest Path problem on a weighted directed graph. Let be a graph where every vertex is a valid state and edges represent valid physical moves in the original graph. The weight of an edge in connecting state to corresponds exactly to the physical cost of traversing edge plus the specific turn cost incurred at .
Since all edge costs and turn costs are given as non-negative numbers, there are no negative cycles in this state graph . Dijkstra’s algorithm is proven to find the shortest path from a source set to a destination in any graph with non-negative edge weights. The relaxation property ensures that for every state extracted from the priority queue, the cost is the global minimum to reach that specific configuration of visited nodes and orientation. The algorithm terminates the first time it extracts a state where the mask corresponds to the set of all vertices . Because Dijkstra extracts states in strictly non-decreasing order of total cost, the first time this condition is met represents the cheapest possible cost to visit all nodes. The ability to revisit nodes (where the mask does not change size) is handled correctly because such transitions are simply edges in ; infinite cycles of re-visitation are avoided because they would strictly increase the cost without changing the state, meaning they would never be prioritized over the simpler path.
Time Complexity
To determine the time complexity, we analyze the size of the state graph and the number of edges within it. The number of possible subsets is , where is the number of nodes. For each subset, the path tracks the current node ( possibilities) and the previous node (at most possibilities, bounded by the degree of ). Thus, the total number of vertices in the state graph is bounded by .
From any given state , the algorithm attempts to transition to every neighbor of . In the worst case (a complete graph), there are neighbors. Therefore, the number of edges in the state graph is . Dijkstra’s algorithm implemented with a binary heap has a complexity of . Substituting our values, we get . Since , the strictly polynomial factors are dominated by the exponential term. The path reconstruction step would take time linear in the length of the optimal path, thus it would not affect the overall asymptotic time complexity. Using the notation, the time complexity is .