Heap-Sort Algorithm

Van C. Ngo · December 30, 2016

In the previous post, I talked about heap data structure, a tree-based implementation in OCaml. This post will discuss how it is used to implement the heap sort algorithm. This sorting algorithm has \(O(nlogn)\) worst-case time complexity.

Algorithm

Assume that we want to sort the list [4; 1; 5; 3; 2] in decreasing order. We assume further that the input list is represented by a complete binary min heap with 5 nodes as shown on the top of Figure 1.

Heap-sort

To sort the list we create an empty list as the output list. We pick the key value 1 of the heap’s root and put it into the output list. Then we remove the root Node 1 from the heap. After these actions, we get a new heap with the root is Node 2 and a new output list with one element 1. If we do the actions above until the heap is empty (e.g., represented by a node Leaf), then we get the final output list is in decreasing order [5; 4; 3; 2; 1]. The process is depicted in Figure 1. Thus the heap-sort algorithm can be divided into the following steps. The result list is initialized empty.

  1. Build a complete binary min heap from the input list
  2. Get the root of the heap and concatenate it with the result list
  3. Remove the root. If the heap is not empty, then goto step 2
  4. If the sorting direction is decreasing then return the result list, otherwise, return the reversed list

The heap building is \(O(nlogn)\) time complexity. The getting and removing the root operations are \(O(1)\) and \(O(logn)\), respectively, and they are called \(n\) times. Therefore, the time complexity of the heap-sort algorithm is \(O(nlogn)\).

Code

This post demonstrates the use of extending modules with functor in OCaml programming language as well. Let’s see how this works in the context of the Heap module to add the additional functionality for sorting. Instead of writing a heap-sort module extending the heap module with the sorting function by hand, we use a functor to add this functionality to the heap module that supports the fold_on_min function.

We create a new module Heapsort that automates the process of adding sorting function to the Heap module. As you can see, Heapsort contains a type k of the Heap key type, a function sort_list to sort a list of this type, and a functor Make that allows one to extend a Heap module.

(************************************)
(*            Heap-sort             *)
(*          Van Chan Ngo            *)
(************************************)

(** Implementation of binary heap over ordered types *)
open Heap

type direction = Dec | Inc

module type Heapsort = sig
	type k
	val sort_list : direction -> k list -> k list
end

module Make (H : Heap) : (Heapsort with type k = H.key)
open Heap

type direction = Dec | Inc

module type Heapsort = sig
	type k
	val sort_list : direction -> k list -> k list
end

module Make (H : Heap) = struct
	type k = H.key

	let sort_list d l = 
	let h = H.of_list l in
	let ol = H.fold_on_min (fun a k -> k::a) [] h in 
	match d with
	| Dec -> ol
	| Inc -> List.rev ol

end

In order to apply the functor, we create a heap module from a module OrdertedType. For example, I want to sort a list of integer pairs, I first create a module IntPair module. Then a heap module H is created from IntPair. Finally, a heap-sort module is created from H using the functor.

open Heap
open Heapsort

module IntPair =
  struct
    type t = int * int

    let to_string x =
      let (x1,x2) = x in
      "(" ^ (string_of_int x1) ^ "," ^ (string_of_int x2) ^ ")"

    let compare (x1,y1) (x2,y2) =
      if x1 > x2 then
        Gt
      else if x1 < x2 then
        Lt
      else
        begin
          if y1 > y2 then
            Gt
          else if y1 < y2 then
            Lt
          else
            Eq
        end
  end   

module H = Heap.Make(IntPair)
module HS = Heapsort.Make(H)
    
let _ =
  let ol = HS.sort_list (Inc :direction) [(1,2); (2,3); (2,343223); (432332525,1333); (342395835289,29235899589); (248,35989835)] in
  List.iter (fun (x,y) -> Printf.printf "(%d,%d) " x y) ol; Printf.printf "\n"

The sorted output list is

(1,2) (2,3) (2,343223) (248,35989835) (432332525,1333) (342395835289,29235899589)