At the beginning of the course, I thought that Python doesn't support stack. But in fact Python's list has all the structural elements of stack. Also because of the structure of the namespace (check inheritance link) in
Python, Python doesn't need to implement a separate class for stack like Java. Python has some nice syntactic sugar for dealing with lists.
Other things that make Python great are meta-programming, closures, dynamic loading of modules and portability. It's not that other language doesn't or can't do these things, it's that it doesn't or can't do them out of the box nor as cleanly. Moreover, you don't need to download libraries. Just install Python and go. Starting in Python and then polishing performance in C is a more efficient workflow overall. That might not be personally true for you.
Programming experience: Java & C (not C++)
Saturday, November 30, 2013
Sorting (Python)
There are some slow and fast sorting algorithms. We use Big O notation to define it. Big O notation, basically it says how the time needed scales when you add more elements. So a sorting algorithm with O(n^2 ). Sorting a list with 10 elements would need - very roughly - 10^2 = 100 units of time. Big O notation can give us the average time of the sorting algorithm.
O(nlog2n). Fast
- Quicksort: Typically fastest in practice, simple to code and very processor friendly. Also doesn't need any extra space. O(nlogn) is the average case, the worst case for quicksort is O(n^2). The worst case for quicksort is when the list is ordered, since it needs to partition every item in the list. If part of the list is ordered, then we can solve this by choosing randomly a pivot.
- Mergesort: Not as common as quicksort but occasionally useful (sorting stuff on disk for instance). It needs extra memory space compare to quicksort. Mergesort efficient is same for its worst case and best case.
- Selection Sort: Easy to understand and commonly taught but rarely used (insertion sort better in pretty much every way). Selection sort efficient is the same with every list since it would iterate every item with any case.
- Insertion Sort: Simple algorithm, works well for small arrays. Also runs almost linearly for nearly sorted lists. Pretty common to use this instead heavy-duty sorts for simple jobs. The best case for insertion sort is when the list is ordered, then it will be O(n). The reason is that it doesn't need to shift the items in the list when sorting, hence O(n).
Thursday, November 28, 2013
Recursion Tracing (Regex Assignment 2)
Back to the post of recursion, there are two kinds to trace a recursion algorithm. Tracing step by step and tracing by faith. This assignment is about converting a string of regex to regex tree and marching a regex tree to a string of binary. Converting a string of regex to regex tree, we iterate each token of the string and build it into the tree when we counter an end parenthesis. Is not hard to trace, since being just one level. What is challenged is second part of the assignment, marching a regex tree to a string of binary, especially star node.
There is this particular example:
((1.(0|1)*).0) matches any string that starts with '1' and ends with '0'. Lets say that is marching with '11111' and the bar and dot node work as it should be. The hard part is to implement the starnode to know where to stop tracing the string of binary. For example if the starnode doesn't know where to stop, it will return True without checking the last dot node. The solution is to check every possible split of the string of binary:
...
elif isinstance(tree_regex, StarNode):
if len(partition) is 0:
return True
else:
for i in range(len(partition), 0, -1):
if match(tree_regex.children[0], partition[:i]):
return match(tree_regex, partition[i:])
return False
...
For human to trace step by step is really hard, since each split is a level deeper. Human are only good at tracing two levels of details. This case is the perfect example to trace it by faith. First, just try tracing a simple string, like '110' and '111'; if it work, then hope that it will work with complex string of binary.
This assignment helps me to realize why tracing step by step is hard for human.
There is this particular example:
((1.(0|1)*).0) matches any string that starts with '1' and ends with '0'. Lets say that is marching with '11111' and the bar and dot node work as it should be. The hard part is to implement the starnode to know where to stop tracing the string of binary. For example if the starnode doesn't know where to stop, it will return True without checking the last dot node. The solution is to check every possible split of the string of binary:
...
elif isinstance(tree_regex, StarNode):
if len(partition) is 0:
return True
else:
for i in range(len(partition), 0, -1):
if match(tree_regex.children[0], partition[:i]):
return match(tree_regex, partition[i:])
return False
...
This assignment helps me to realize why tracing step by step is hard for human.
Wednesday, November 27, 2013
Memory Model (The tour of Anne Hoy; Assignment 1)
The tour of Anne Hoy is much like the tower of Hanoi with the difference of using four pegs instead of three pegs. Starting with the assignment we have the knowledge of the algorithm for the tower of Hanoi. So the tricky part of the assignment is not implementing on how to move the disk(cheese), the tricky part is to implement the splitting part of the first peg.
So basically find the optimal i of M(n). The solution is to pick different i and try them all to see which one is fit for n. If you didn't save the i for that particular n, every time it run to find the i for (n+1), it need to find i for n again, which is not very efficient. So the solution is memory model. It stores every i that had run before. It also called memorization. There is this function that Danny Heap wrote that you only need to input the function that you want to memorize; it will record the output for future use. With the additional advantage that it will secure the table you need to memorize, because only the function can access the table.
I can see why Danny didn't teach us the use memorization before the assignment. It will only confuse us, such as inputting a function as a parameter. After knowing how Python works around its object with namespace, this all makes sense.
| Equation of TOAH |
So basically find the optimal i of M(n). The solution is to pick different i and try them all to see which one is fit for n. If you didn't save the i for that particular n, every time it run to find the i for (n+1), it need to find i for n again, which is not very efficient. So the solution is memory model. It stores every i that had run before. It also called memorization. There is this function that Danny Heap wrote that you only need to input the function that you want to memorize; it will record the output for future use. With the additional advantage that it will secure the table you need to memorize, because only the function can access the table.
I can see why Danny didn't teach us the use memorization before the assignment. It will only confuse us, such as inputting a function as a parameter. After knowing how Python works around its object with namespace, this all makes sense.
Wednesday, November 6, 2013
Sorting
Sorting algorithm is the most essential element for computing big data or for working with large data. Think about this:
You have a list of names of 1000 people, in a random order. You need to sort this list of names into alphabetical order. You can only look at one letter in each name at a time. How do you do it in the most time efficient way? You'd probably start by running your eye down the entire list, and get everyone whose name starts with A and put them at the top. Then you'd run your eye down again and put everyone whose name starts with B underneath all the As. Do that until all the names are in first-letter alphabetical order. But then you realise the list of names starting with A isn't in alphabetical order! Azir is in front of Aaron! That's wrong! So you run your eye down each second letter of the names, and put them in order according to the second letter. But then they're still not in alphabetical order, because of the third letter in the names! Aaliyah is in front of Aaron! So then you start sorting the A's by fourth letter....
So you do this until the entire list of A's are sorted. But then you need to do the same thing for the names that start with B! And C! And D! As you can see, it would take an extremely long time to sort a large list of names. It even takes a computer, which can look at millions of names per second, a relatively long time. Especially when you have a list of 1,000,000s of names!
Essentially some clever people notice that there must be a better way to solve this. As time passes they have came up with many different ways for computers to sort large lists like these. There are many different ways (algorithms) that computers can sort, and the most efficient way depends on what kind of data is being sorted. Names? Numbers? Dates? They're all different and require different ways of sorting. Some algorithms first sort the list by length of the name, then by alphabetical. Some advance sorting would split the list in half, and get each core in your dual-core computer to sort each list individually, then combine them at the end. Some count how many unique elements exist in the list, assign it a number, then run over and over the list, averaging them, until the list is in perfect order.
Friday, November 1, 2013
Binary Tree (Python)
What is a Binary tree?
A binary tree is a collection of of nodes that are linked by node with child nodes less than the node to left and nodes more than the node to the right. Just like the linked list, binary tree has a recursive definition. The leaves of a binary tree are the node that has no children and interval node are the opposite. The level of the tree is the depth of the recursion minus the head of the tree.
Inserting node
To insert a node in a tree, first you have to compare with the current node, if the node you are about to insert is less that the current node then insert the node in the left node if not insert to the right node. If the node then equal does nothing.
Traversing
Traversing into the tree can be accomplished by either preorder, inorder or postorder. Preorder means to visit the parent first the children from left to right, inorder means to visit the parent in the midway and postorder is after visit the children visit the parent.
Deleting node
There are three situations when deleting a node. Deleting a leaf, simply remove the node from there. Deleting a node with one child, remove the node and replace it with its child. Deleting a node with two children, remove the node and replace it with its in-order predecessor node, i.e. the rightmost node from the left child.
Application
Binary tree can be useful to store an expression that is infix. Such as x + y, x * y or (x + y) * x. Binary tree can also be used to construct yes/no conditional question. Binary Tree can be efficient when its element is sorted, and the tree is balance. When the condition is fulfilled, searching in a binary tree have the same efficiency of a binary search, which it's O(log(n)). However when constructing the tree is quite slow comparing to linked list or array list. The worst is when deleting a node from the tree, it has an efficiency of O (nlogn).
Subscribe to:
Comments (Atom)