Loop Invariant Insertion Sort

Insertion Sort

Welcome to part 8 on loop invariants. Please refer to the first part (introduction) here. Today, we are going to discuss insertion sort algorithm. Let us get started.

Problem definition

Given an array (A) of (n) numbers. Assume the array starts at position (1). Develop an algorithm to sort the array in increasing order using insertion sort. Insertion sort works in a similar way when we sort playing cards. We select one number (we are free to choose any number we want) then insert it in the correct position within the already sorted section of the array. Take a look at the following diagram to better understand the algorithm:

Insertion Sort Diagram

Let us now follow the same steps we have learned from the previous articles.

Identify the goal of the loop and write it as a post condition

The goal of the algorithm is to sort the elements in the array. This means we are going to get another permutation (contains original elements) of the array where each element is less than or equal to its neighbor to the right.

//Post condition: 
//A[k] <= A[k + 1] where 1 <= k < n
//A is a permutation of the input array

Write the loop specifying the guard (loop condition)

Insertion sort uses two loops so we will have two loop invariants. As in the previous articles, we will be using while loops in our demonstration because it clearly separates the guard (loop condition) from initialization and loop index, however we can use any other kind of loop if we wish (for example a for loop). Here is the skeleton of the while loops along with the post condition from step 1:

while (i <= n)
{
	while (j >= 1 && Cond)
	{
		j--;
	}

	i++;
}

//Post condition: 
//A[k] <= A[k + 1] where 1 <= k < n
//A is a permutation of the input array

Note that the outer loop index (i) increases until it is equal to (n) and the inner loop index (j) decreases until it is equal to (1). Note also that the inner loop may terminate not only when (j = 0) but also if (Cond) is false. We will define what (Cond) stands for and things will start to make more sense when we choose an appropriate invariant for each loop.

Fill in the loop invariant

Loop invariant should describe the goal of the algorithm. It should hold true just before entering the loop and after each iteration. It should give an idea about the current progress towards the final goal. In the case of insertion sort we have an outer loop and an inner loop so we have two invariants not only one. We can think of the inner loop as the implementation (in this case part of the implementation) of the outer loop then only think about the outer loop. After that, we find an appropriate invariant for the inner loop. Imagine the array is split into two sections around some position (i). Assume the elements to the left of (i) are sorted and the elements to the right of (i) including (i) itself are not sorted. The inner loop shifts a subset of the elements in the sorted section to the right so that we can insert one element from the unsorted section into its correct position. On the other hand, the outer loop pushes the position (i) to the right each iteration until all elements are in the correct place. As you can see, as the outer loop progresses, the sorted section grows. The outer loop invariant should give us an idea about this sorting progress. For example we can use the following invariant for the outer loop:

"Elements to the left of (i) are sorted". 

Let us now find an appropriate invariant for the inner loop. The inner loop shifts all elements starting on the left side of (i) all the way back to the position where we are supposed to insert the new element. Inner loop invariant for insertion sort is a little bit complicated so it is not a good idea to state that in words instead we will indicate it formally as follows:

//Recall that after each outer loop iteration a new 
//element from the unsorted section is inserted at 
//some position (p) in the sorted section. Elements to 
//the left side of (p) are sorted. We formulate this as follows: 
"A[1..j] is sorted"

//Similarly, elements to the right side of (p)
//up to the end of the sorted section 
//should be sorted as well. We formulate that as follows:
"A[j+2..i] is sorted"

//(key) a variable that holds A[i] which is the element 
//from the unsorted section that needs to be inserted 
//into the sorted section. (Key) should be less than 
//the smallest element of the sorted section on the 
//right side of (p). We are just using more words to 
//say that A[i] should be inserted in the correct 
//place in the sorted section. We formulate that as follows:
"key < A[j+2] where j+2 <= i"

//The greatest element in the sorted section on the 
//left side of (p) should be less than or equal to the smallest 
//element in the sorted section on the right side of (p). 
//It is another way to say the sorted section should be 
//maintained after the shift operation to make room for 
//the newly inserted element. We formulate that as follows:
"A[j] <= A[j+2] where j+2 <= i"

//(j) should be greater than or equal to (0). 
//The (0) value is the value that makes the inner loop terminates.
"0 <= j"

The formal way to describe the inner loop might be confusing to some extent but the idea should be easy. In plain English, we are trying to say keep moving backwards in the sorted section until we find the correct position of the element to be inserted. In order to find that position we just need a comparison statement to compare A[i] and A[j]. As long as (A[i] < A[j]) we keep moving back in the sorted section until we find the correct position. Remember we used a Boolean variable called (Cond) in the inner loop. (Cond) is equivalent to (A[i] < A[j]) but because the value A[i] is saved to some variable (for example (key)) then (Cond) becomes (key < A[j]). The reason why we save A[i] in (key) is because shifting elements to the right destroys A[i]. Please note that the body of the outer loop in reality consists from three steps:

  1. Pick an element A[i] from the unsorted section then save it (because its position will be destroyed by the shift operation).
  2. Shift all elements to the right starting from the position of the selected element in the sorted section
  3. Insert the saved element into the correct place.

The following code shows the loops and their invariants in a formal way:


//Outer invariant: 
//Elements (A[1] .. A[i-1]) to the left of (i) are sorted
while (i <= n)
{
	//Inner invariant:
	//A[1..j] is sorted
	//A[j+2..i] is sorted
	//key < A[j+2] where j+2 <= i
	//A[j] <= A[j+2] where j+2 <= i
	//0 <= j

	while (j >= 1 && Cond)
	{
		j--;

		//Inner invariant:
		//A[1..j] is sorted
		//A[j+2..i] is sorted
		//key < A[j+2] where j+2 <= i
		//A[j] <= A[j+2] where j+2 <= i
		//0 <= j
	}

	i++;

	//Outer invariant: 
	//Elements (A[1] .. A[i-1]) to the left of (i) are sorted
}

//Post condition: 
//A[k] <= A[k + 1] where 1 <= k < n
//A is a permutation of the input array

Fix the initialization so that the loop invariant evaluate to true

Let us first work on the outer loop. Before entering the loop the sorted section consists from the first element in the array. This means we should initialize (i = 2). Apply that to the outer invariant as follows:

"Elements (A[1] .. A[i-1]) to the left of (i) are sorted"
"Elements (A[1] .. A[2-1]) to the left of (2) are sorted"
"Elements (A[1] .. A[1]) to the left of (2) are sorted"

This basically says A[1] which is to the left of (i = 2) is sorted. This is trivially true because a list of one element is sorted by default. In order to prove that the outer invariant holds true after each iteration, let us test the first iteration when (i = 3). Apply that to the invariant:

"Elements (A[1] .. A[i-1]) to the left of (i) are sorted"
"Elements (A[1] .. A[3-1]) to the left of (3) are sorted"
"Elements (A[1] .. A[2]) to the left of (3) are sorted"

As you can see, as the outer loop progresses the length of the sorted section increases. Please note that we should increment the outer loop index (i) after the inner loop otherwise the outer loop invariant will not hold true. We start with one sorted element A[1] before we enter the outer loop. If we increment the loop index before the inner loop then we are claiming we have two elements sorted while we have one so far. Now let us see what happens when the outer loop terminates. We expect that the invariant would imply the post condition after the loop terminates. The outer loop terminates when (i = n + 1). Apply that to the invariant:

"Elements (A[1] .. A[i-1]) to the left of (i) are sorted"
"Elements (A[1] .. A[n + 1 - 1]) to the left of (n) are sorted"
"Elements (A[1] .. A[n]) to the left of (n) are sorted"

This means the whole array is sorted which is what we are trying to achieve. If you think for a second you will notice that we are assuming the body of the outer loop does its job correctly by inserting one element from the unsorted section to the sorted section each iteration otherwise as the outer loop advances we will not be able to grow the sorted section. In order for the body of the outer loop to do its job correctly we need to implement the three steps outlined earlier. For now we are only focusing on the initialization step and will keep the implementation later. Let us now initialize the inner loop so that the inner invariant holds true before and after each iteration. Let us see what happens if we set (j = i-1):

We have the following inner loop invariant:

A[1..j] is sorted
A[j+2..i] is sorted
key < A[j+2] where j+2 <= i
A[j] <= A[j+2] where j+2 <= i
0 <= j

Substitute (j = i-1) you get:

A[1..i-1] is sorted
A[i+1..i] is sorted
key < A[i+1] where i+1 <= i
A[i-1] <= A[i+1] where i+1 <= i
0 <= i-1

Let us see if the above (5) conditions hold:

  1. Elements from (1) to (i-1) are sorted. This is given by the outer loop (elements to the left of (i) are sorted)
  2. Elements from (i+1) to (i) is an empty set which is trivially sorted
  3. Does not apply because j+2 <= i is not true
  4. Does not apply because j+2 <= i is not true
  5. Trivial

Here is how the code looks like after the initialization stage:


int i = 2;

//Outer invariant: 
//Elements (A[1] .. A[i-1]) to the left of (i) are sorted
while (i <= n)
{
	int j = i-1;

	//Inner invariant:
	//A[1..j] is sorted
	//A[j+2..i] is sorted
	//key < A[j+2] where j+2 <= i
	//A[j] <= A[j+2] where j+2 <= i
	//0 <= j

	while (j >= 1 && Cond)
	{
		j--;

		//Inner invariant:
		//A[1..j] is sorted
		//A[j+2..i] is sorted
		//key < A[j+2] where j+2 <= i
		//A[j] <= A[j+2] where j+2 <= i
		//0 <= j
	}

	i++;

	//Outer invariant: 
	//Elements (A[1] .. A[i-1]) to the left of (i) are sorted
}

//Post condition: 
//A[k] <= A[k + 1] where 1 <= k < n
//A is a permutation of the input array

Figure out how to achieve the goal by filling in the body of the loop

Our implementation will be doing the following:

  1. Save A[i] in (key): int key = A[i]
  2. Replace (cond) with (key < A[i])
  3. Shift the array one element to the right: A[j+1] = A[j]
  4. After the inner loop terminates, insert (key) into its position in the sorted array: A[j+1] = key

Here is how the code looks like eventually:


int i = 2;

//Outer invariant: 
//Elements (A[1] .. A[i-1]) to the left of (i) are sorted
while (i <= n)
{
	//1. Save A[i]
	int key = A[i];

	int j = i-1;

	//Inner invariant:
	//A[1..j] is sorted
	//A[j+2..i] is sorted
	//key < A[j+2] where j+2 <= i
	//A[j] <= A[j+2] where j+2 <= i
	//0 <= j

	//2. Shift elements to the right
	while (j >= 1 && key < A[i])
	{
		//One shift each iteration
		A[j+1] = A[j];

		j--;

		//Inner invariant:
		//A[1..j] is sorted
		//A[j+2..i] is sorted
		//key < A[j+2] where j+2 <= i
		//A[j] <= A[j+2] where j+2 <= i
		//0 <= j
	}

	//3. Insert key into the right position
	A[j+1] = key;

	i++;

	//Outer invariant: 
	//Elements (A[1] .. A[i-1]) to the left of (i) are sorted
}

//Post condition: 
//A[k] <= A[k + 1] where 1 <= k < n
//A is a permutation of the input array

Now let us verify that the inner loop invariant holds true after each iteration and when the loop terminates. Start with:

A[1..j] is sorted
A[j+2..i] is sorted
key < A[j+2] where j+2 <= i
A[j] <= A[j+2] where j+2 <= i
0 <= j

After we apply (j = i-2) we get:

A[1..i-2] is sorted
A[i..i] is sorted
key < A[i] where i <= i
A[i-2] <= A[i] where i <= i
0 <= i-2

All the above conditions hold true:

  1. Elements A[1..i-1] are sorted (we know that from the initialization stage). The inner loop implementation indicated above does not modify the first (i-2) elements which means A[1..i-2] are sorted as well so the condition holds
  2. A[i..i] is only one element which is trivially sorted
  3. (key < A[i] where i <= i) this condition holds true due to the fact that we are inside the inner loop otherwise the inner loop would have been terminated. Note that (key < A[i]) is one of the inner loop guards
  4. (A[i-2] <= A[i] where i <= i) this condition holds true because A[i] holds the same value as A[i-1] (due to the shift operation) so the condition becomes A[i-2] <= A[i-1] and this is also true because the first (i-1) elements are sorted
  5. Trivial

Let us now verify that the inner loop invariant holds true after termination. The inner loop terminates when (j = 0) or (key >= A[i]). Let us start with the first case:

Apply (j = 0) and for simplicity assume (i = n) to the inner loop invariant:

A[1..j] is sorted
A[j+2..i] is sorted
key < A[j+2] where j+2 <= i
A[j] <= A[j+2] where j+2 <= i
0 <= j

We get:

  1. A[1..0] is sorted. This is trivially true because it is an empty array.
  2. A[2..n] is sorted. This implies the post condition i.e. the whole array is sorted. Elements from (2) to (n) are in place and the first element is indeed in place because point (3) below (which we will prove) says (key < A[2]) which means (key) will be inserted in A[1]
  3. (key < A[2] where 2 <= n) same reasoning as point (3) mentioned earlier in the first iteration
  4. (A[0] <= A[2] where 2 <= n) this does not apply as A[0] is not define (our array starts at (1))
  5. (0 <= 0) this is trivial

Let us now work on the second case when (key = A[i] and for simplicity i = n):

  1. Combine the case assumption (key = A[n]) with point (3) (key < A[j+2]) we obtain A[n] = key < A[j+2]
  2. Combining this with point (1) (A[1..j] is sorted) and point (2) (A[j+2..n] is sorted) follows that the assignment A[j+1] = key makes A[1..n] sorted

That is it for today, in the next article we will discuss a new example. Thanks for reading.

Search Terms...
  1. I enjoyed your series on loop invariant very much. Would you continue with this series with more complicated algorithms? How is this technique compared to recursion?

    Thanks!

  2. Perfectly explained, thank you! Keep up this good work!

  3. well explained. i would love to read loop invarients of quick,merge, heap and selections sort as well.

Leave a Reply

%d bloggers like this: