CMSC 22300 Homework 2, Due 4/12/2012, 11pm setfn.sml is a source file containing the functor SetFn from tutorial2.sml. It has been modified by adding three functions val subset : set * set -> bool val fromList : element list -> set val toList : set -> element list 1. [20] Replace the simple but inefficient versions of union and diff SetFn with more efficient versions (modeled after the subset function) that are linear in the sum of the sizes of the two argument sets. You can do this by modifying the supplied setfn.sml source file in your repository. 2. [20] Note that with the addition of fromList and toList, we can use a set module like IntSet2 (produced by applying SetFn to IntOrd) to perform sorting. For instance, fun sort m = IntSet2.toList(IntSet2.fromList m) or, more concisely, val sort : int list -> int list = IntSet2.toList o IntSet2.fromList where "o" is the infix operator for function composition. The sorting algorithm used here is called "insertion sort". Use this idea to more directly implement the sorting algorithm by implementing a functor SortFn. functor SortFn (X: ORDER) : sig type element = X.t val sort : element list -> element list end = struct ... end You can submit this by modifying the supplied sortfn.sml source file template in your repository. 3. [40] Unification We can define a simple language of "terms" by specifying a set of function symbols, each with an associated arities that specifies the number or arguments that it takes, and a set of variables. For instance, here we will deal with a language specified by: functions: {A: 0, B: 0, F: 1, G: 1, H: 2} variables: {x, y, z, ...} (an unbounded supply of variables) Function symbols of arity 0 like A and B are considered constants, and we write "A" and "B" instead of "A()" and "B()". Here are some example terms in this language: A, x, F(x), G(A), H(G(B),F(y)), H(A,HF(x),y)) And here are some nonexamples, because of arity violations: A(x), F(A,B,x), H(F(A)) The unification problem for this language is to find a substitution of terms for variables that makes an equation (or list of equations) between terms true. For instance, the equation H(x, H(F(A), G(y))) = H(G(z)), H(F(u), z)) is solved, or made true, by the substitution s = X -> G(G(Y)), U -> A, Z -> G(Y) since if we apply s to both sides of the equation, we get H(G(G(Y)), H(F(A), G(Y))) = H(G(G(Y)), H(F(A), G(Y))) Such a substitution is called a "unifier" for the set of term equations. Terminology: A term is either a variable (e.g. x) or an application of a function symbol to arguments (e.g. F(t), or A()). In the later case, we call the function symbol being applied the "head" function symbol. We can "unify" or solve a list of equations as follows: Let the first equation be t1 = t2, so the list is of the form t1 = t2; eqns, where eqns are the rest of the equations. Case: t1 and t2 are identical terms Then procede to solve the remainder of the equations, eqns. Case: t1 and t2 have the same head function symbol. (e.g. A = A, F(t1') = F(t2'), H(t11,t12) = H(t21,t22)) Then eliminate t1 = t2 and replace it with equations between the corresponding arguments of the function at the front of the equation list: A = A; eqns ==> eqns (no arguments) F(t1') = F(t2'); eqns ==> t1' = t2'; eqns H(t11,t12) = H(t21, t22); eqns ==> t11 = t21; t12 = t22; eqns Case: t1 and t2 have different head function symbols (e.g. F(x) = G(A)) Then fail, because the equations obviously cannot be solved. Case: t1 = x, a variable, t2 a term not containing x Then substitute t2 for x throughout the remaining equations eqns, obtaining a modified list of equations eqns', and procede to solve eqns'. (Do this even if t2 is also a variable.) Case: t1 = x, a variable, t2 a term containing x Then fail, because equations like x = F(x) have no (finite) solutions. But x = x is an exception, because it is covered by the first rule above. This is called the "occurrence check". Case: t2 = x, a variable, while t1 is not a variable. Reverse the first equation, yielding x = t1; eqns, and proceed as above. This procedure will eventually either fail, either because of incompatible head function symbols or because a variable is equated with a term containing that variable, or it will terminate with an empty list of remaining equations, in which the equations have all been solved. This procedure does not actually produce the unifying substitution, it only determines whether there is a solution or not. Example: The equation list is: x = F(A), H(x,y) = H(F(A),x) Then t1 = x, t2 = F(A), and eqns is H(x,y) = H(F(A),x). x does not occur in t2, so we substitute F(A) for x in eqns, yielding the modified equation list eqns': eqns': H(F(A),y) = H(F(A),F(A)) This will get broken down into: eqns'': F(A) = F(A), y = F(A) by the case for matching head function symbols. Further steps give us: A = A, y = F(A) y = F(A) (substituting F(A) for y in the empty list) which indicates a successful solution. We can represent terms in this language by the ML datatype datatype term = A | B | F of term | G of term | H of term * term | Var of string (where a variable x is represented as Var "x"). Write an implementation of the above "unification" algorithm is SML. You will need to implement at least the following two functions: val subst : string * term * term -> term such that, subst("x",t1,t2) substitutes t1 for occurrences of Var "x" in t2, e.g. subst("x", F(A), Var "x") = F(A) subst("x", G(Var "y"), H(Var "x",G(B))) = H(G(Var "y"), G(B)) and val unify : (term * term) list -> bool which takes a list of equations (term pairs), and returns true or false indicating whether the list of equations has a solution. You can submit this by modifying the supplied unify.sml source file template in your repository.