Wednesday, November 6, 2013

Sorting


Sorting algorithm is the most essential element for computing big data or for working with large data. Think about this:

You have a list of names of 1000 people, in a random order. You need to sort this list of names into alphabetical order. You can only look at one letter in each name at a time. How do you do it in the most time efficient way? You'd probably start by running your eye down the entire list, and get everyone whose name starts with A and put them at the top. Then you'd run your eye down again and put everyone whose name starts with B underneath all the As. Do that until all the names are in first-letter alphabetical order. But then you realise the list of names starting with A isn't in alphabetical order! Azir is in front of Aaron! That's wrong! So you run your eye down each second letter of the names, and put them in order according to the second letter. But then they're still not in alphabetical order, because of the third letter in the names! Aaliyah is in front of Aaron! So then you start sorting the A's by fourth letter....

So you do this until the entire list of A's are sorted. But then you need to do the same thing for the names that start with B! And C! And D! As you can see, it would take an extremely long time to sort a large list of names. It even takes a computer, which can look at millions of names per second, a relatively long time. Especially when you have a list of 1,000,000s of names!

Essentially some clever people notice that there must be a better way to solve this. As time passes they have came up with many different ways for computers to sort large lists like these. There are many different ways (algorithms) that computers can sort, and the most efficient way depends on what kind of data is being sorted. Names? Numbers? Dates? They're all different and require different ways of sorting. Some algorithms first sort the list by length of the name, then by alphabetical. Some advance sorting would split the list in half, and get each core in your dual-core computer to sort each list individually, then combine them at the end. Some count how many unique elements exist in the list, assign it a number, then run over and over the list, averaging them, until the list is in perfect order.

No comments:

Post a Comment