In this post, I will talk about the heap data structure, a specialized tree-based data structure. I will recall how it can be implemented in imperative languages. Finally, I will discuss how heap can be implemented with lists in functional languages and show that it does not satisfy the requirements. A tree-based implementation in OCaml will also be given in this post.

## Heap data-type

A (complete) binary (min) heap is a binary tree that satisfies the following conditions:

• Shape property: it is a complete binary tree; that is, all levels of the tree, except possibly the last one (deepest) are fully filled, and, if the last level of the tree is not complete, the nodes of that level are filled from left to right.

• Heap property: the key stored in each node is less than or equal to the keys in the node’s children, according to some total order.

Figure 1 shows an example of a complete binary min heap, the numbers represent the value of the keys in the nodes.

Heap is known for its efficient algorithms (logarithmic time complexity) of inserting and deleting the smallest element in implementing a priority queue. It is also used in implementing heapsort algoritm that has $O(nlogn)$ time complexity.

In imperative world, a heap is generally placed in an array with the layout of complete binary tree, mapping complete binary tree nodes to the array indices. For instance, with the zero-based array, the root node is represented by index 0, if i is the index of the parent, then the indices of its left and right child are: left_child(i) = 2*i + 1 and right_child(i) = 2*i + 1. However, in functional languages, if we implement an array using list data-structure. Then we cannot obtain the constant time complexity for accessing elements of the array. Therefore, lists cannot be used to implement efficient heaps. Instead, we need to use tree data structures to implement heaps in functional world. The following will explain how we heaps are implemented in this way, especially how a new node is inserted into and the min node is removed from a heap with $O(logn)$ time complexity.

A complete binary heap is represented by a complete binary tree that satisfies the heap property. A node can be a terminal node, called Leaf or a non-terminal node, called Node, that contains information about the number of nodes, the height, the key, its left and right trees. We will see that the number of nodes and the height help us decide to add a new node on the left or right child tree. The heap data-type is given in the following snippet.

(** type of node value. *)
type key = El.t

(** a node contains # nodes, height, left tree, key, and right tree. *)
type heap = Leaf
| Node of int * int * heap * key * heap