Czech version
logolink

< Back to the list of lessons

Bubblesort - Sorting Algorithm

AlgortimyContent of the lesson:

  • Principle of Algorithm
  • Bubblesort

Principle of Algorithm

The sorting algorithm called BubbleSort is based on the principle of bubbles which push the higher numbers "upwards".

Sample

Sort numbers 4, 8, 1, 4, 5, 2, 3 using gradual substitutions.

bubble sort

The mark separates the unsorted part (left part) from the sorted part of the array. At the beginning it is behind the last item of the array. The program browses through the whole array from the beginning to the end and tries to swap nearby items in case the previous value is higher than the following one. Using this method you will get the highest value to the end after the first step. Move the marker one place left and browse through the unsorted list again (until the marker is reached) and also swap nearby items in case the values are not sorted. The second biggest value is placed at the right position. Continue using the same procedure. Make the last step when the marker is set to the second item from the left. Compare the remaining two items, swap them if needed and the array is sorted.

Idea (rough algorithm)

We repeat this procedure until the marker gets from the last position to the second one:

  • browse the unsorted part until the marker is reached (marker is the k variable) and swap nearby items if needed.

The source code:

Bubblesort
for i:=1 to n-1 do
{browse the unsorted part and switch items}

Browsing and comparing can be defined as the following command:

Bubblesort
for k:=1 to n-i do
if numbers[which]>numbers[which+1] then

Bubblesort - The Slowest Variant

This variant is used only to show how you can quickly improve performance using a small change.

Bubblesort - the slowest variant
FOR k:=1 TO n-1 DO
begin
 FOR i:=1 TO n-1 DO
 begin
  IF p[i]>p[i+1] THEN
  begin
    pom:=p[i];
    p[i]:=p[i+1];
    p[i+1]:=pom;
  end;
 end;
end;

Assume we have defined an array p having n items which are filled with random numbers and we want to sort them.

During the first passage of the FOR cycle, the variables k and i have value 1 and the program compares whether the the first value p[i] is higher than the following one p[i+1]. If the condition is fulfilled, the values are swapped using these three lines:

  • pom:=p[i]; -> The value of one item is saved into the variable pom.
  • p[i]:=p[i+1]; -> The value from the other item is saved into the item which was copied to the variable pom.
  • p[i+1]:=pom; -> The variable pom is copied into the other item inside the array.

The condition has ended and our program returns back to the second passage of the FOR cycle (i=2). Now the second item and the third one will be compared using the same way as explained.

When the variable i reaches the value of variable n, the cycle will be terminated and the program continues executing the first FOR cycle - the variable k is increased to 2 and swapping numbers is done again.

This variant is the slowest one because the program browses the whole array from the first item to the last one until the FOR cycle k:=1 to (n-1) reaches the value (n-1). Try to imagine a situation having an array of 10 000 numbers and having 100 users who want to sort the array at once - this is a useless waste of processor time.

Bubblesort - Second Variant (50% performance gain)

We will improve our algorithm changing the second maximum value of the FOR cycle from n-1 to n-i.

Bubblesort - Second Variant
FOR k:=1 TO n-1 DO
begin
 FOR i:=1 TO n-i DO
 begin
  IF p[i]>p[i+1] THEN
  begin
    pom:=p[i];
    p[i]:=p[i+1];
    p[i+1]:=pom;
  end;
 end;
end;

When the counter reaches the value of variable n then the cycle is terminated and the program returns to the first FOR cycle. The first counter is then increased to value 2 and the inner FOR cycle starts. It will run until the second counter reaches the n-k value because the rest of the array has already been sorted so it is not required to browse it anymore.

One more thing remains to be explained - why does the first FOR cycle has the maximum value n-1 instead of n. It is because of the fact that you have to make n-1 comparisons for n elements - if you select the four biggest values of an array of 5 elements, it is clear that the last one will be the smallest one.

Thanks to the change from n-1 to n-k our program does not have to check all numbers in every step but it checks less and less numbers in each following step (because the value of k variable is increased permanently).

Sample: We have an array of 5 numbers

bubblesort

*this step will be skipped because there is no data to be compared to - it is shown only for the completeness of all "stairs".

Questions

  1. Describe the principle of the BubbleSort algorithm.
webdesign, xhtml, css, php - Mgr. Michal Mikláš