|
|
Auth |
Today's Proof: Show Sum(i=0 to n | nCi) = 2n. Avneesh and I managed to
prove it, but I'm not gonna put the proof down here because I already wasted too much time this
week. :)
Next challenge: come up with a data structure that has these features:
I did some work on the algorithm today at the airport. For quite a while (since writing the
AVL code for DbgMem while in White Plains), I've been thinking
it should be possible to create an "indexed" binary tree, but I wasn't sure. My idea is that at
each node, you could keep a count of the number of right children. In the root, you keep a count
of the total number of nodes in the tree. There's a simple formula for the number of nodes
at each point in the tree: Parent = Left + Right + 1. To find any node by index,
you can just walk down the tree, going left or going right and subtracting, until you find the
node with the index you are looking for. The question in my mind was, could you keep the count
field in each node correct while performing the standard operations on the tree (without having to
traverse the whole tree and re-index)? I was able to show that for simple add, simple delete,
rotate left, and rotate right, you can keep the counts up to date. Excellent! This means that you
can create an "indexed" binary tree. In fact, you can keep it as an AVL tree, with the nodes
sorted by their index. The neat thing is, when you add to the middle and shift the indicies of
half the nodes by one, it has no effect on the relationships between the nodes and thus no
effect on the tree structure. You just add a node and all the subsequent nodes are "magically"
re-indexed! (This is because the index of each node is implicit.)
So, using this data structure, you can create a list which 1) retains arbitray order, 2)
has fast (O(log2 n)) index access, and
3) has fast (O(log2 n)) add / remove
from any location in the list. Thread it, and you get constant time iteration. Overlay a hash
table and you get "constant" time identity access. Memory usage for the threaded tree
is O(n) - same number
of nodes as elements. The hash table adds O(n) bins,
so we're still O(n). There's a lot of constant time pointer manipulation, so you
might not see any improvements for small n, but it should fly on a large data set.
Mmm! Pointer twiddling! I hope I have an excuse to write this sometime.
This evening I flew to Seattle and met Michelle at the airport. We picked up our rental car, drove to our hotel in Bellevue, and went to bed!
| Louis K. Thomas <louisth@hotmail.com> | Auth | 2003-10-02 (1794 days ago) |