Course home | Homework | Tests | Handouts | Stat |
Ongoing reading homework:
By now you will have reviewed the following subjects from Discrete Mathematics:
If you read an assignment within 30 minutes of its posting, check it again later for immediate updates.
"DO" problems: do not hand them in, solve them for your benefit.
"BONUS PROBLEMS" have no deadline except they expire when discussed in class or tutorial. Hand them in on a separete sheet. If you are working on a bonus problem, let the instructor know so he won't discuss it in class without checking with you.
16.1 DO: The Traveling Salesman Problem (TSP)
takes a weighted complete
graph as input (every edge has a real number, the "weight" assigned to it)
and asks to find the least expensive Hamilton cycle. The Metric
Traveling Salesman Problem (MTSP) lives in a metric space: all edge-weights
are positive and they satify the triangle inequality:
w(i,j)+w(j,k) ≥ w(i,k).
(a) State the decision version of the TSP. (b) Assuming HAMILTONIAN
(the set of Hamiltonian graphs)
is NP-complete, prove that the decision version of TSP, and even of
MTSP, is NP-complete. (c) Prove that the following algorithm
approximates the optimum MTSP solution within a factor of 2 (i.e.,
it outputs a Hamilton cycle with cost ≤ 2 OPT):
a. Find a min-cost spanning tree T.
b. Traverse T in any order.
This means moving from vertex to
neighboring vertex in T and crossing every edge of T
once in each direction. (Examples: BFS, DFS, in-order-traversal.)
c. Cut out repeated vertices from your traversal (except for the return
to the start vertex). (So if your traversal
sequence was 25857592 then the reduced sequence will be 258792.)
This yields the desired Hamilton cycle.
16.2 DO: Suppose an NP-complete language L can be recognized in time O(nlog n). Prove that every language M ∈ NP can be recognized in nO(log n). Why did the big-Oh migrate to the exponent? If the Karp-reduction from M to L takes nc, determine the constant hidden in the big-Oh notation in the expression nO(log n).
16.3 DO: Given a black box for 3-COL (decides 3-colorability), construct a polynomial-time algorithm to find a 3-coloring of a given input graph if a 3-coloring exists. The algorithm may make calls to the black box. Make as few calls as you can; state the number of calls made.
16.4 DO by Thursday, March 8. Let r,b be positive integers. Let G be an undirected graph with n vertices. Assume every vertex has degree ≤ r+b+1. Prove that it is possible to color the vertices red and blue such that every red vertex has at most r red neighbors and every blue vertex has at most b blue neighbors. Base your proof on the following algorithm. A "bad" vertex is one that violates the condition, i.e., if it is red, it has more than r red neighbors, and if it is blue then it has more than b blue neighbors. We wish to show that there is a coloring without bad vertices.
Algorithm (Lovász toggle)
initially, color the vertices arbitrarily.
while there is a bad vertex, switch its color.
Prove that this algorithm terminates; estimate the number of rounds in terms of n,r,b. (You will show in particular that this is a polynomial-time algorithm to construct a coloring without bad vertices.) Recommend a better initialization of the coloring. (Note that these colorings are not "legal" in the usual sense.) (WARNING: observe the boldness of this algorithm: a round (an execution of the "while" loop) can increase the number of bad vertices; yet we claim that eventually, no bad vertices remain.)
15.1 Prove that 561 is a Carmichael number, i.e., that if g.c.d.(a, 561)=1 then a560≡ 1 mod 561. (Note: 561 = 3.11.17.) (5 points)
15.2 DO: The Discrete Log problem takes as inputs a prime number p and two integers g and b and asks for x such that gx≡ b mod p, or a decision that no such x exists. The decision version of this problem takes as input a quintuple (p,g,b,e,f) of positive integers, where p is prime, and asks if there exists an integer x such that e ≤ x ≤ f and gx≡ b mod p. Let DL denote the set of yes-instances (quintuples to which the answer is "YES"). Prove: (a) DL ∈ NP; (b) DL ∈ coNP. (Part (b) is a great deal more difficult.) (Note: this problem is not believed to be solvable in polynomial time. This assumption plays an important role in cryptography.)
15.3 DO: Prove: if NPC &cap coNP is not empty then NP = coNP. (Note: it is conjectured that NP ≠ coNP.)
15.4 DO: Suppose the language L is Karp-reducible to the language M. The reduction function is computable in time O(nc). The language M is recognizable in polynomial time, specifically in O(nd). Prove that L is recognizable in O(nf ) for some constant f; determine f as a function of c and d. (WARNING: n has a different meaning in each of these statements; it always refers to the length of the input relevant to the given statement.)
14.1 ABSOLUTELY DO: review the handout on number theoretic algorithms (Euclid's algorithm and modular exponentiation).
14.2 DO: Review the proof of NP-completeness of CLIQUE discussed in class. Study further NP-completeness proofs from the text. Learn the NP-completeness proof of 3-COLORABLE GRAPHS (all details); learn the outline of the NP-completeness proofs of the following problems: 3D MATCHING, SUBSET SUM, HAMILTONICITY.
14.3 DO: Let k-COL denote the set of k-colorable graphs (encoded as strings over a finite alphabet in some natural manner). Assuming 3-COL is NP-complete, prove that 4-COL is NP-complete by reducing one of these probems to the other (which one to which one?) (WARNING: This is a key test of your understanding of the concept of Karp-reduction. In the past, many students produced incorrect solutions to this problem. Check your solution with the TA.)
14.4 DO: Prove: if, in the definition of NP, we omit the polynomial bound on the length of the certificate, we obtain the definition of the class of recursively (computably) enumerable languages.
14.5 DO: (a) Give a Karp-reduction from 3-COL to HALTING.
(b) Prove that there does not exist a Karp-reduction from
HALTING to 3-COL.
14.6 NP-completeness of the (0,1)-integer linear programming problem (2+5+5+5+12 points)
14.7 DO: (a) Prove that, given the integer k (in binary),
the Fibonacci number Fk cannot be computed in
polynomial time.
(b) Given the integers k and m, compute
Fk mod m in polynomial time.
(Hint: use 2 × 2 matrices.)
14.8 (Correctness of RSA) DO:
Let p,q be distinct primes, N=pq,
M=(p-1)(q-1). Let further e be relatively prime
to M and let f be a multiplicative inverse of
e mod M. Prove: for all x,
xef ≡ x mod N.
(All variables in this problem denote positive integers.)
(WARNING: x is NOT required to be relatively prime to N.)
13.1 DO: There was an error in one of the midterm problems. Here is the correction. Prove that the PARITY function in n variables can be represented (a) by a Boolean circuit of size O(n); and (b) by a Boolean formula of size O(n2).
13.2 DO: Prove the equivalence of the two definitions of "recursively (computably) enumerable languages" given in class. One of the definitions said a language L is recursively enumerable if there exists a Turing machine which halts exactly on the strings in L; the other definition said L is recursively enumerable if either L is empty or L is the set of outputs of a halting Turing machine (a TM that halts on all inputs).
12.1 DO: (a) (Approximate solution of quadratic equations.)
Let a,b,c,k be integers. Assume a, k >0.
Consider the quadratic function f(x)=ax2+bx+c.
(a1) Determine the number of real solutions to the equation f(x)=0.
(a2) If the number of solutions is 1, prove that the solution is a rational
number and determine its value as a fraction. (a3) If the number of solutions
is 2, find an approximate solution within 1/k of a
real solution.
Your answer should be a binary fraction z=r/2s) such that
|z-w|<1/k for a real number w which satisfies
f(w)=0. Your algorithm should run in polynomial time, i.e.,
in time O(nC) where n is the total bitlength of the
input (including k) and C is a constant.
(b) Let p,q be integers, N=pq and M=(p-1)(q-1).
Given N and M, determine p and q in polynomial
time.
(c) Note the connection of part (b) with the analysis of the RSA
encryption scheme: if an adversary can get hold of M, they may
as well know p and q, i.e., they are able to
factor N. In other words, finding M (which would allow
to compute f) is no easier than factoring N. All this
does not prove, however, that computing f is as hard as
factoring N.
12.2 DO: Modular exponentiation is the problem that takes as inputs three integers a,b,m where b ≥ 0 and returns ab mod m. This is a key part in implementing the RSA cryptosystem. Prove that the following algorithm is (a) correct and (b) useless (does not run in polynomial time).
initialize: X = 1
for i = 1 to b do
X = aX mod m
return X
12.3 Study the handout about Euclid's algorithm and modular exponentiation.
12.4 (a)
Prove that Euclid's algorithm for g.c.d. runs in polynomial time
by proving that every other round in Euclid's algorithm reduces the
remainder by half: if the sequence of remainders in an execution of
Euclid's algorithm is r1,...,rt then
ri+2 ≤ ri/2. (8 points)
(b) DO: Given the relatively prime integers e and M,
compute the multiplicative inverse of e modulo M
in polynomial time. Note how this fact pertains to the feasibility
of the RSA cryptosystem.
12.5 DO: Given an undirected graph G, construct, in
polynomial time, a 3-CNF formula fG such that
for all G,
G is 3-colorable if and only if fG
is satisfiable.
12.6 DO: (a) Prove: the OR of four Boolean variables
cannot be expressed as a 3-CNF formula.
(b)
Given a Boolean circuit C, construct, in
polynomial time, a 3-CNF formula fC such that
C is satisfiable if and only if fC
is satisfiable.
12.7 DO: Suppose we are given a black box that accepts an undirected graph as input and returns YES if the input graph is 3-colorable; NO otherwise. If the answer is YES, find a 3-coloring in polynomial time, including O(V) queries to the black box, where V is the number of vertices. Queries are free but your algorithm needs to construct the input to each query. Describe your algorithm in pseudocode.
12.8 DO: Suppose we are given a black box that accepts as input a pair (a,b) of positive integers and returns YES if a has a prime divisor ≤ b. Given a positive integer x, compute a its prime factorization (represent it as a product of primes) in polynomial time. If x has n digits, make sure to use no more than O(n2) black-box queries. Describe your algorithm in pseudocode.
Recall that problems 10.5, 10.6, 10.7 are also due Feb 13.
11.1 DO: ( Range query ) Construct a dynamic data structure which stores a list Q of real numbers and supports the following query: on INPUT x≤ y (reals), RETURN the number of keys k ∈ Q such that x ≤ k ≤ y (in addition to supporting INSERT and DELETE). (Example: if Q = (3,5,7,7,7,9,13, 15) and x=6, y=13 then the output is 5 (corresponding to the sublist 7,7,7,9,13). All operations (INSERT(key), DELETE(key), RANGE(x,y)) must be performed in O(log n) steps where n is the current size of Q.
11.2 DO: Recall that a function f(n) is polynomially bounded if f(n)=O(nconst). Let f(n)>1. Prove: f(n) is polynomially bounded if and only if log(f(n))=O(log n).
11.3 DO: Decide which of the following functions is polynomially bounded. (a) n2log n; (b) (log n)5; (c) 5log n; (d) (n!)1/n; (e) (log log n)log n; (f) (log n)log log n;
11.4 We want to decide whether an integer x is prime by performing trial divisions (we divide x by all integers up to √x). Is this a polynomial-time algorithm? Prove your answer. (5 points)
11.5 With each node x of a binary search tree, we store the leftheight(x) and rightheight(x) information. By inserting a new key, we create a new leaf z. Update the height information stored at each node. Write your algorithm in elegant pseudocode. Your algorithm should work in O(depth(z)) where depth(z)) is the length of the (unique) path from the root to z. (Note: this problem is not specifically about AVL trees, no balancing required.) (6 points)
11.6 DO: An exact 3-CNF formula is an AND of 3-clauses (every clause has exactly 3 literals connected with ORs; the literals of each clause correspond to 3 distinct variables). Suppose we have an exact 3-CNF formula consisting of m clauses. (a) Prove: at least 7m/8 of the clauses are simultaneously satisfiable. In other words, there exists a (0,1)-assignment to the variables which satisfies at least 7m/8 of the clauses. (Hint: probabilistic method.) (b) BONUS PROBLEM: Find such an assignment by a randomized algorithm in expected polynomial time (worst case analysis!) (6 points) (c) BONUS PROBLEM: Find such an assignment in (deterministic) polynomial time. (15 points)
11.7 BONUS PROBLEM: 2-SAT is the satifiability problem for 2-CNF formulas (at most 2 literals per clause). Solve 2-SAT in polynomial time. (10 points)
10.1 DO: Given an undirected graph, count its connected components in linear time. (The graph is given by an array of adjacency lists.) Give your solution in simple and elegant pseudocode. Hint: BFS. Be careful about the degree of detail required for the timing analysis.
10.2 DO: (a) Prove the correctness of Borůvka's algorithm. Assume all edges are of different weight. (b) Do not assume all edges have different weights. How should we modify Borůvka's algorithm to cover all cases?
10.3 (a) Prove: Borůvka's algorithm makes ≤log n rounds. (In each round, Borůvka selects in parallel the lightest edge from each component of the current forest.) (b) Show that this bound is tight: let n=2k; construct a weighted undirected graph with n vertices which makes Borůvka's algorithm perform k rounds. (5+6 points)
10.4 DO: Implement Borůvka's algorithm in O(m log n). Do not use the Union-Find data structure; find a simple way to track the components. Describe your solution in pseudocode.
10.5 (Due Tuesday, Feb 13) DO: Given a set of n coins, some of which are fake, count the fake coins using O((log n)2) measurements on a balance. As usual, all fake coins weigh the same; all true coins weigh the same; and the fake coins are lighter. (Do not assume that n is a power of 2; it won't help.) Describe your solution in pseudocode.
10.6 (Due Tuesday, Feb 13) (Min-cost basis) (a) Let A be a k×m matrix of rank r with columns a1,...,am. A basis of A is a set of r linearly independent columns of A. Given an array w[1..m] of weights associated with the columns, find a minimum weight basis of A. (The weight of a set of columns is the sum of the corresponding values w[i].) Take the following "linear dependence test" as the elementary operation: given a set S of vectors and a vector b, determine whether or not b belongs to the span of S (i.e., whether b can be written as a linear combination of vectors in S). Solve the problem using O(m log m) comparisons and at most m linear dependence tests. (Hint: greedy.) Prove both the correctness and the complexity estimate. (b) Reduce the min-cost spanning tree problem (for a graph with n vertices and m edges) to the min-cost basis problem (for an n × m matrix): Given a weighted undirected connected graph G, initialize an n × m matrix to zero, then change certain entries of the matrix in linear (O(n+m)) steps (based on G) and assign weights to the columns. Now you obtained an instance of the min-cost basis problem; its solution should automatically give a solution to the min-cots spanning tree problem. -- Prove that your reduction is correct. Clearly state what this means; such a clear statement accounts for half the credit. (c) After your reduction (part (b)), your algorithm (part (a)) turns into one of the known algorithms for min-cost spanning tree. Which one? (8+8+2 points).
10.7 (a) (Due Tuesday, Feb 13)
UPDATED Feb 5, 11:40 PM.
Given an array A[1..n]
of real numbers and an integer k where 2 ≤ k ≤ n,
determine the minima
of every interval of length k. The output should be an array
B[1..n-k+1] where
B[i]=min A[i..i+k-1].
Solve this in O(n log k). (12 points)
Hint:
one possible solution uses a combination of divide-and-conquer and
dynamic programming. Another solution would use other tricks learned
in class. You receive up to 8 bonus points if you give two
solutions based on essentially different principles.
(b) (BONUS PROBLEM) Solve this in O(n). (12 points) -- Comment. The problem arose in computational vision.
(c) (BONUS PROBLEM, same conditions as above.) In part (a), replace "minimum" by "median" (assume k is odd). Now solve in O(n log k). (8 points). -- Comment. I am not aware of an O(n) solution. Let me know if you find one, either mentally or in the literature.
9.1 DO: Study Borůvka's, Jarník's (a.k.a. Prim's) and Kruskal's algorithms for min-cost spanning trees. The source for Borůvka's algorithm is your class notes and the Matoušek--Nešetříl text used in Discrete Mathematics.
9.2 DO: Prove the correctness of Jarník's algorithm.
8.1 DO: study the "Loop invariants" handout (LI) (click "Handouts" on the banner). (Note: the handout was slightly updated on Jan 29, 1:00 am; see the footnote in the new version. The student who commmunicated the error to the instructor received 5 hw bonus points.)
8.2 DO: Solve problems LI 1, 2, 3.
8.3 Solve problem LI 4 (prove that a certain statement is NOT a loop invariant for Dijkstra's algorithm). (10 points)
8.4 DO: Use the loop invariants LI 1--4 to prove the correctness of Dijkstra's algorithm.
8.5 DO: Study the "Amortized cost" handout (AC).
8.6 DO: Solve problems AC 1(a), 2(a).
8.7 Solve problem AC 1(b) (find invariant for amortized cost of "increment and double" data structure) (4 points)
8.8 Solve problem AC 2(b) (find invariant for simulation of queue by two stacks) (4 points)
8.9 Solve problem AC 2(c) (amortized cost of k-dequeue) (10 points)
7.1 DO: redo Quiz-2 without the time pressure.
7.2 DO: Suppose the positive function T(n) satisfies the inequality T(n) ≤ T(n/5)+T(7n/10)+O(n). (The selection algorithm discussed in class lead to this recurrent inequality.) Use the method of reverse inequalities to prove: T(n) = O(n).
7.3 Prove: Dijkstra's algorithm fails if we permit one edge to have a negative weight (all the other edges still must have positive weights), even if the graph has no cycles (is a DAG). Draw a small counterexample. Tabulate the evolution of the variables. For each execution of the "while" loop, show the new pivot (the node selected to finalize), the starting value of the cost at each node, and the parent link for each node. (8 points)
7.4 Edit distance. (12 points)
DO: Study Chapter 3 of the Kleinberg--Tardos text. WARNING. The analysis of BFS and DFS in Sections 3.2, 3.3, 3.4 in the text refer to undirected graphs while in class we discuss directed graphs. However, the DFS code given at the top of page 84 correctly describes DFS for digraphs. Adopt this as the definition of DFS. Add the update of the "parent" link to obtain the DFS tree and add notation for the status (white, grey, black). To which shade(s) does the "explored" label used in the text correspond? Note that statement (3.7) on p. 85 of the text applies to undirected graphs only, while (3.6) applies to directed graphs as well.
NOTE. In this course, "graph" without adjective means directed graph. Sometimes we add the adjective "directed" for emphasis.
6.1 Show with a small example that statement (3.7) on p. 85 of the text is false for directed graphs. Draw your (directed) graph, indicating the source (start vertex). Number the vertices in DFS order. Highlight the edges of the DFS tree. Indicate which edge contradicts (3.7). (6 points)
6.2 DO: write an elegant pseudocode for Radix sort assuming all strings are of equal length.
6.3 DO: write an elegant pseudocode for Radix sort NOT assuming all strings are of equal length. (The process needs to run in O(n) where n is the total number of characters in the input.)
For all problems below, the input graph G is given in adjacency list representation (by an array of adjacency lists). If the output is a graph, it is also required to be given in adjacency list representation. All algorithms must run in linear time (i.e., O(n+m)) and must be described in elegant pseudocode.
6.4 For a graph G, the reverse graph Grev is defined by reversing all arrows. Given G, find Grev. (6 points)
6.5 Given G, find a representation of G with monotone adjacency lists. (Each adjacency list needs to be increasing.) (6 points)
6.6 Given a graph in adjacency list representation, decide whether or not the graph is undirected. (6 points)
5.1 Dynamic programming: The all-ones square problem (12 points)
5.2 Find four sequences an, bn, cn, dn of positive numbers such that an∼cn and bn∼dn but an-bn is not asymptotically equal to cn-dn. (5 points)
5.3 Solve problem 1 of Quiz-1 (asymptotic expansion of ln(n!)). Beware of the fact stated in problem 5.2 above. Note that there is a possibility that your solution in the quiz was accepted even though you made a major error (because the grader is human). (10 points)
5.4 DO: Solve problem 4 of Quiz-1 (two players collect coins from the ends of a string of coins). Write your solution in elegant pseudocode. Hint: you can make your solution more elegant by using the following objective function: the amount collected by the first player minus the amount collected by the second player.
4.1 DO: Study the "Divide and Conquer: The Karatsuba-Ofman algorithm for multiplication of large integers" handout.
4.2 DO: Study the "Method of Reverse Inequalities" problem set. Solve the problems stated.
4.3 From the "Method of Reverse Inequalities" problem set: Solve problems 4.1 and 4.2 (analysis of the recurrences arising from Strassen's matrix multiplication and from MERGE-SORT) using the general result described in item 3. (5 + 5 points)
3.1 DO: write elegant pseudocodes for the HEAP implementation of the "INCREASE-KEY(x, newkey)" command ("bubble down").
3.2 DO: compute the inifinite series ∑ i/2i where the summation extends over the range i=0 to ∞. (Hint. Consider the derivative of the power series expansion of 1/(1-x).)
3.3 Consider the knapsack problem with integer values (rather than integer weights, as in class - now the weights and the weight limit are reals). Let V denote the sum of all values. Solve the knapsack problem at cost O(nV). (Warning. Half the credit goes for the "brain" of your dynamic programming solution: a clear definition of the array of problems considered.) (16 points)
3.4 Suppose we have n data (real numbers) arranged in a heap; and the heap is implemented as an array (as in class). Prove that in order to sort the data, we still need to make asymptotically at least n log n comparisons. (Warning: this problem is NOT about the HEAPSORT algorithm. We are allowed to compare any pair of data. The problem says we don't start from scratch, the heap stucture means some comparisons have already been made, so if we take advantage of those comparisons, we may achieve some savings. You need to prove that the savings is asymptotically negligible. Note also that you need to prove more than an Ω(n log n) lower bound; you need to prove the number of comparisons required is greater than or asymptotically equal to n log n. (See the "Asymptotic notation" chapter of the Discrete Math Lecture Notes for this concept.) (8 points)
2.1 DO: read the "Binary search" handout.
2.2 DO: Given 13 coins, one of which is fake, we want to find the fake coin and also decide whether it is heavier or lighter. Prove that 3 measurements on a balance are not enough.
2.3 Suppose we have 2n coins, some of which are fake. All the fake coins have equal weight, and they are lighter than the true coins (which also have equal weight). Using O(log n) measurements on a balance, divide the coins into two sets of n coins of nearly equal weight. "Nearly equal" means if the number of fake coins is even, they must be evenly split between the two parts; if their number is odd, one part must have exactly one more fake coin than the other part. Write your algorithm in elegant pseudocode. (12 points)
2.4 DO: Given n real numbers, we want to determine their min in the comparison model. (Comparisons cost one unit each, bookkeeping is free.) Prove: (a) n-1 comparisons suffice. (b) n-2 comparisons do not suffice.
2.5 DO: Given n real numbers, we want to determine both their min and their max in the comparison model. Prove: 3n/2 comparisons suffice. (Note that this is rather surprising; how could determining the min reduce the cost of determining the max?)
1.1 DO: Read the handout Dynamic programming: The knapsack problem.
1.2 The greedy matching problem. Do not hand in solutions to the "verify" or "why" questions in the preamble of the problem sheet; just solve them for your benefit. Hand in solutions to the highlighted Problem ((a1) 4 points, (a2) 8 points, (a3) 5 points)
1.3 The Interval Sum problem. (12 points)