This article is a follow-up to this one, where we introduced the foreach_get()/foreach_set()
methods and had a quick look at NumPy.
I want to present a small add-on that implements different methods to determine the distance from the active camera to the closest vertex in the active mesh. As we will see, all three methods use different approaches for accessing vertex data and calculating distances, which significantly impacts performance and memory trade-offs.
1. NAIVE Implementation
What It Does
The NAIVE approach iterates over mesh.vertices using a standard Python loop and calculates distances with per-vertex Vector math in pure Python. Measuring its performance gives us a baseline to compare the other methods against.
The relevant code is straightforward: to get the vertex coordinates we loop over all vertices, read the co attribute (a Vector object), and return the results as a Python list:
def get_vertex_positions(obj: Object) -> list[Vector]:
    mesh: Mesh = obj.data  # type: ignore
    return [v.co for v in mesh.vertices]
The code to determine the closest distance is also straightforward:
def get_closest_vertex_index_to_camera_naive(
    world_verts: npt.NDArray[np.float32], cam_pos: npt.NDArray[np.float32]
) -> Tuple[int, float | np.floating[Any]]:
    closest_distance = np.inf
    closest_index = -1
    for vertex_index, vertex_pos in enumerate(world_verts):
        direction = vertex_pos - cam_pos
        distance = np.linalg.norm(direction)
        if distance < closest_distance:
            closest_distance = distance
            closest_index = vertex_index
    return closest_index, closest_distance
Note that the type annotations mention NumPy NDArray so that it will be possible to pass ndarrays, but in the NAIVE code path we actually pass plain Python lists. If we passed NumPy arrays, this code would work just as well.
We use np.linalg.norm() here only to make it possible to accept ndarrays; had we gone for a completely naive Python/Blender approach, we would have used distance = direction.length instead.
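To make that fully Python route concrete, here is a minimal, Blender-free sketch. The function name closest_vertex_pure_python is hypothetical (it is not part of the add-on); plain (x, y, z) tuples stand in for mathutils.Vector objects, and math.dist stands in for direction.length, so the snippet runs outside Blender:

```python
import math
from typing import Sequence, Tuple


def closest_vertex_pure_python(
    world_verts: Sequence[Tuple[float, float, float]],
    cam_pos: Tuple[float, float, float],
) -> Tuple[int, float]:
    closest_distance = math.inf
    closest_index = -1
    for vertex_index, vertex_pos in enumerate(world_verts):
        # math.dist computes the Euclidean distance between two points,
        # standing in for (vertex_pos - cam_pos).length in Blender.
        distance = math.dist(vertex_pos, cam_pos)
        if distance < closest_distance:
            closest_distance = distance
            closest_index = vertex_index
    return closest_index, closest_distance
```

Inside Blender you would feed it the list returned by get_vertex_positions(); the structure of the loop is identical to the NAIVE function above.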
Key Trade-offs
- Pros: It's the simplest and most readable method, and doesn't require NumPy.
- Cons: It incurs high per-vertex Python overhead (for attribute access and method calls), making it slow on large meshes.
- Performance: Acceptable for small meshes (hundreds to low thousands of vertices). It scales poorly for 10,000+ vertices.
2. FOREACH (mesh.foreach_get / foreach_set)
What It Does
The FOREACH method uses mesh.foreach_get to perform a bulk copy of vertex coordinates into a flat numeric buffer (typically a NumPy array). However, in this code path the closest vertex is still found using a Python loop, meaning only the data transfer is accelerated. (The values are stored in an ndarray, which is why the get_closest_vertex_index_to_camera_naive() function accepts those as well.)
def get_vertex_positions_np(obj: Object) -> npt.NDArray[np.float32]:
    mesh: Mesh = obj.data  # type: ignore
    coords = np.empty(len(mesh.vertices) * 3, dtype=np.float32)
    mesh.vertices.foreach_get("co", coords)
    return coords.reshape(-1, 3)
Key Trade-offs
- Pros: The bulk copy significantly reduces Python attribute overhead for reading vertex data, making it much faster than NAIVE for data transfer.
- Cons: If the distance calculation remains in a Python loop, you still keep the performance-limiting Python-loop overhead. This method also requires NumPy.
- Performance: A good improvement over NAIVE for medium meshes, but it doesn't achieve the speed of fully vectorized numerical processing.
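The flat-buffer pattern that foreach_get relies on can be demonstrated without Blender. In this sketch we fill the flat float32 buffer by hand (foreach_get would do this bulk copy in C); the helper name flat_to_points is made up for illustration:

```python
import numpy as np


def flat_to_points(flat: np.ndarray) -> np.ndarray:
    # foreach_get("co", coords) fills a flat array [x0, y0, z0, x1, y1, z1, ...];
    # reshape(-1, 3) reinterprets it as an (N, 3) array without copying.
    return flat.reshape(-1, 3)


# Two vertices, stored the way foreach_get lays them out in the buffer.
flat = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 3.0], dtype=np.float32)
points = flat_to_points(flat)
```

Because reshape returns a view on the same buffer, this step costs essentially nothing; the expensive part of FOREACH is avoided per-vertex attribute access, not the reshape.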
3. BROADCAST (NumPy Vectorized)
What It Does
The BROADCAST method also reads vertex coordinates into a NumPy array using foreach_get. Crucially, it then performs the distance calculations using vectorized NumPy operations. This means no per-vertex Python loop is used; instead, np.argmin() finds the closest index.
def get_closest_vertex_index_to_camera(
    world_verts: npt.NDArray[np.float32], cam_pos: npt.NDArray[np.float32]
) -> Tuple[int, float | np.floating[Any]]:
    dists = np.linalg.norm(world_verts - cam_pos, axis=1)
    i = np.argmin(dists)
    return i, dists[i]
This function is not only a lot shorter, with no Python loop in sight, it also nicely shows the power of NumPy. world_verts is an N x 3 array, and cam_pos a length-3 array. NumPy's broadcasting rules interpret the subtraction to mean we want to subtract the camera position from each vertex position, giving an N x 3 array of direction vectors; np.linalg.norm() with axis=1 then collapses this to a length-N array of distances. np.argmin() takes this array of distances and returns the index of the smallest one.
All this happens inside optimized C/C++ code, sidestepping the Python loop performance penalty completely.
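The broadcasting step can be traced with a tiny, Blender-free example using made-up coordinates:

```python
import numpy as np

# Three made-up vertex positions (N x 3) and a camera position (length 3).
world_verts = np.array(
    [[0.0, 0.0, 0.0],
     [1.0, 0.0, 0.0],
     [5.0, 5.0, 5.0]], dtype=np.float32)
cam_pos = np.array([0.9, 0.0, 0.0], dtype=np.float32)

# Broadcasting: the (3,) cam_pos is subtracted from every row of the
# (N, 3) world_verts, yielding an (N, 3) array of direction vectors.
# norm over axis=1 collapses each row to a single distance.
dists = np.linalg.norm(world_verts - cam_pos, axis=1)
i = int(np.argmin(dists))
```

Here vertex 1 at (1, 0, 0) is the closest to the camera at (0.9, 0, 0), at a distance of about 0.1.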
Key Trade-offs
- Pros: Distances are computed in highly optimized C code (via BLAS/NumPy internals), resulting in minimal Python overhead. This is the fastest approach for large meshes.
- Cons: It requires extra memory for NumPy arrays and temporary arrays (verts_h, world_verts, dists). It also requires NumPy and careful management of array shapes and data types.
- Performance: Best for large meshes (tens of thousands of vertices and up). While array allocation overhead may make it slightly slower for tiny meshes, it's generally still acceptable.
Performance results
If we look at the measured performance we see a clear linear dependence on the number of vertices for all methods:
Note that both axes are logarithmic to make it possible to graph the results of many orders of magnitude.
The naive method is too slow to go beyond 1 million vertices: The last point at 1.5M verts already takes more than 4 seconds.
The method that uses foreach_get() is about twice as fast, but that still wouldn't get us anywhere near 10 million vertices.
However, the method that uses numpy operations to calculate distances and figure out what the minimum distance is, is a whole lot faster. The last point on the yellow line clocks in at about 1 second for 25 million vertices, over a 60x improvement on the naive method.
The actual numbers may of course be different on your computer but the overall trend is pretty clear: using numpy is the way to go.
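You can reproduce the trend on your own machine without Blender. This rough benchmark sketch uses random points as stand-in vertices; the absolute timings will differ per machine, but the gap between the loop and the vectorized version should be obvious:

```python
import time
import numpy as np

# Made-up data: 100k random points stand in for mesh vertices.
rng = np.random.default_rng(0)
world_verts = rng.random((100_000, 3)).astype(np.float32)
cam_pos = np.array([0.5, 0.5, 0.5], dtype=np.float32)

# Naive: per-vertex Python loop, one norm per vertex.
t0 = time.perf_counter()
best_i, best_d = -1, np.inf
for idx, v in enumerate(world_verts):
    d = np.linalg.norm(v - cam_pos)
    if d < best_d:
        best_i, best_d = idx, d
t_naive = time.perf_counter() - t0

# Broadcast: one vectorized norm over all vertices at once.
t0 = time.perf_counter()
dists = np.linalg.norm(world_verts - cam_pos, axis=1)
i = int(np.argmin(dists))
t_vec = time.perf_counter() - t0

print(f"naive: {t_naive:.3f}s, vectorized: {t_vec:.3f}s")
```

Both code paths find the same closest point; only the time they take differs.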
Practical Guidance & Rule of Thumb
Since the code isn't all that much more complicated, always use the available NumPy functionality. Only in cases where you are severely constrained by memory might you consider the other approaches, because the extra buffer for the foreach_get() method and any temporary arrays that NumPy creates can add up when working with large numbers of vertices.