the issue
imagine you are creating trees, either pure data structures or ones that try to mimic real trees, and that you are interested in the number of connected branch segments. How would you calculate these numbers for each node in the tree?
If we have a small tree like the one in the illustration, where we have placed the node index inside each sphere, we might store for each node the index of its parent, like in the list:
p = [ -1, 0, 1, 2, 2, 4 ]
Note that node number 0 has -1 as the index of its parent to indicate it has no parent because it is the root node. Also node 3 and 4 have the same parent node because they represent a fork in a branch.
naive solution
to calculate the number of connected nodes for each node in the tree we could simply iterate over all the nodes and traverse the list of parent nodes until we reach the root node, all the while adding 1 to the number of connections. This might look like this (the numbers are profiling information)
hits time t/hit
7385 4037 0.5 for p in parents:
683441 1627791 2.4 while p >= 0:
676057 410569 0.6 connections[p] += 1
676057 351262 0.5 p = parents[p]
The result has the following values
c = [5, 4, 3, 0, 1, 0]
The tips have no connections while the root counts all nodes as connections minus itself so our 6-node tree had a root with 5 connections. This is also illustrated in the illustration (on the right).
If we time this simple algorithm for a moderately sized tree of just over 7000 nodes (which might look like the one in the image below to get a sense of the complexity) I find that it takes about 1.16 seconds on my machine.
numpy implementation
Now we all know that
numpy is good at working with large arrays of numbers (and conveniently already bundled with Blender) so we might try some simple changes. Almost identical code but with numpy arrays instead of python lists:
hits time t/hit
7385 4563 0.6 for p in self.parent[:self.nverts]:
683441 1695443 2.5 while p >= 0:
676057 2004637 3.0 self.connections[p] += 1
676057 456995 0.7 p = self.parent[p]
But in fact this is slower! ( 2.68 seconds on my machine) If we look at the timings from the excellent
kernprof line profiler (the numbers on the left of the code snippets) we see that numpy spends an awful lot of time on both the iteration over the parents and especially the indexing of the connection array. Apparently indexing a single element in a numpy array is slower than the equivalent operation on a python list. (There is a good explanation why indexing a single element of a numpy array takes so long on
StackOverflow.)
reworking the algorithm
It must be possible to do this faster right? Yes it is, have a look at the following code:
hits time t/hit
1 2 2.0 p = self.parent[:self.nverts]
1 19 19.0 p = p[p>=0]
137 97 0.7 while len(p):
136 1684 12.4 c = np.bincount(p)
136 1013 7.4 self.connections[:len(c)] += c
136 3395 25.0 p = self.parent[p]
136 1547 11.4 p = p[p>=0]
This produces the same result (trust me, i'll explain in a minute), yet uses only
milliseconds (7 to be precise). The trick is to use as much numpy code with looping and indexing built in as possible:
bincount()
counts the number of occurrences of a number and stores it the corresponding bin. So if 3 nodes have node 0 as their parent, c[0] will be 3. Next we add this list in one go to the cumulative number of connections and finally we get the parent indices also in a single statement and remove the -1 entries (those nodes that reached root in their tree traversal). We repeat this as long as there are nodes that have not yet reached root.
conclusion
Numpy might often be faster than pure Python but it pays to check, using careful profiling. Using an appropriate algorithm that does away as much as possible with any python overhead might however give you a sizeable speed improvement although the algorithm might not be immediately obvious to find.
Space tree pro
This bit of research is not just academical, I am currently investigating options to speed up tree generation as implemented in my
Space Tree Pro Blender add-on (available on BlenderMarket)