In this article I want to highlight some details on how to profile a Blender add-on.
Now you can of course simply put some code around a call to the execute() method that measures the elapsed time, but to gain insight into what is taking up the most time we need to do a little bit more. Here I will show how we can use the line-profiler package, and how we can make sure that it will not incur any extra overhead once the add-on is running in production.
The example used is an add-on we looked at in a previous article, so it might be a good idea to read that one first if you haven't already.
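As a baseline, such a coarse measurement could look like the sketch below. This is a standalone example; timed_call and the stand-in execute function are hypothetical names, not part of the add-on. It shows total elapsed time only, with no insight into where inside the function time is spent.

```python
import time

def timed_call(fn, *args, **kwargs):
    # Coarse timing: measures the total elapsed time of a single call.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{fn.__name__} took {elapsed:.4f} s")
    return result

# A stand-in for an operator's expensive execute() method:
def execute():
    return sum(i * i for i in range(100_000))

timed_call(execute)
```

This is the approach line-profiler improves on: instead of one number per call, we get a number per line.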
The development environment
To make testing easier it is probably a good idea to work in a development container and run Blender as a Python module. This is exactly the kind of setup provided by the blenderaddons-ng repository that I wrote about previously.
With this setup we can use the @profile decorator to provide timing information on a line-by-line basis, and write our code in such a way that if the line-profiler package is not available (for example once you distribute your add-on to others), or we don't enable it explicitly with an environment variable, it is not imported and the decorator resolves to a no-op function.
The line-profiler package
Near the top of our add-on we add some code to see if we want and can import the line-profiler package:
from os import environ
try:
    if environ.get("LINE_PROFILE") == "1":
        from line_profiler import profile
    else:
        profile = lambda x: x
except ImportError:
    profile = lambda x: x
This code will try to import the line_profiler package if the environment variable LINE_PROFILE is set to 1. If that fails (for example because the package isn't present), or if that environment variable is not set, then we assign a lambda function to profile that simply returns its argument. This way we can always add a @profile decorator to a method or a function, and it will either be profiled or not, but we don't have to change anything about the function or method itself.
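To see that the fallback really is transparent, here is a tiny standalone demonstration (the add function is just a hypothetical example): decorating with the identity lambda hands back the original function object, untouched.

```python
# The no-op fallback from the guarded import above:
profile = lambda x: x

@profile
def add(a, b):
    return a + b

print(add(2, 3))      # the decorated function still works normally -> 5
print(add.__name__)   # and is literally the same object -> "add"
```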
In a bash shell, setting an environment variable for just a single run can be done by prepending an assignment:
LINE_PROFILE=1 python3 add_ons/foreach_examples.py
Profiling a function
With the decorator in place, we can simply decorate any function we might want to profile, for example:
@profile
def get_closest_vertex_index_to_camera_naive(
    world_verts: npt.NDArray[np.float32], cam_pos: npt.NDArray[np.float32]
) -> Tuple[int, float | np.floating[Any]]:
    closest_distance = np.inf
    closest_index = -1
    for vertex_index, vertex_pos in enumerate(world_verts):
        direction = vertex_pos - cam_pos
        distance = np.linalg.norm(direction)
        if distance < closest_distance:
            closest_distance = distance
            closest_index = vertex_index
    return closest_index, closest_distance
If this function is called, and profile is the profiler from the line-profiler package, information will be collected for every line of code that is executed.
Printing profiling results
In our test environment we do not install the add-on in Blender but run Blender as a module, so we will have a section in our code that only runs when we call it from the command line (note that this code will not run if we install the add-on in Blender, because then only the register() function will be called, but it is safe to leave this code in the add-on). It might look like this:
if __name__ == "__main__":
    ...
    register()  # make sure we can call the operator defined elsewhere in the file
    ...
    bpy.ops.object.foreach_ex("INVOKE_DEFAULT", mode=cli_mode)
    ...
    unregister()  # unregister anything registered previously

    if (
        profile
        and hasattr(profile, "print_stats")
        and environ.get("LINE_PROFILE") == "1"
    ):
        profile.print_stats()
We still need to call the register() function that registers our add-on, otherwise we can't call it, but then we can simply invoke our operator. In this example we registered an operator called foreach_ex that takes a mode argument; this would of course be different for your add-on. The point here is that any method this operator calls which is decorated with the @profile decorator will be profiled (if the decorator was not replaced by the no-op lambda).
After running the operator, we call the unregister() function to clean up. I don't think this is necessary when we run Blender as a module, but it doesn't hurt either.
Finally we check if the profile variable is truthy, and if it has a print_stats attribute. It will have one if it is the real decorator, but not if it is the dummy lambda function. Checking for the LINE_PROFILE environment variable is a bit superfluous here (because if it was not set, profile would be the lambda and the hasattr check would already fail), but it makes it extra clear that we only print the statistics if any were gathered at all.
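The hasattr() check can be illustrated in isolation. In this sketch FakeProfiler is a hypothetical stand-in for the object the real line_profiler decorator provides; the point is only that the lambda fallback lacks a print_stats attribute, so the guard skips it safely.

```python
class FakeProfiler:
    """Hypothetical stand-in for line_profiler's profile object."""
    def print_stats(self):
        print("stats would be printed here")

noop = lambda x: x  # the dummy fallback decorator

print(hasattr(FakeProfiler(), "print_stats"))  # True  -> safe to call
print(hasattr(noop, "print_stats"))            # False -> skip printing
```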
A profile run
If we now run our add-on from the command line and turn profiling on with
LINE_PROFILE=1 python3 add_ons/foreach_examples.py
we see something like this:
Timer unit: 1e-09 s
0.00 seconds - /workspaces/blenderaddons-ng/add_ons/foreach_examples.py:65 - get_active_camera_position
0.01 seconds - /workspaces/blenderaddons-ng/add_ons/foreach_examples.py:55 - to_world_space
0.01 seconds - /workspaces/blenderaddons-ng/add_ons/foreach_examples.py:41 - get_vertex_positions
0.02 seconds - /workspaces/blenderaddons-ng/add_ons/foreach_examples.py:73 - get_closest_vertex_index_to_camera_naive
0.04 seconds - /workspaces/blenderaddons-ng/add_ons/foreach_examples.py:127 - do_execute
Wrote profile results to profile_output.txt
Wrote profile results to profile_output_2025-10-23T110532.txt
Wrote profile results to profile_output.lprof
To view details run:
python3 -m line_profiler -rtmz profile_output.lprof
So for every profiled function we get a summary of the elapsed time, and near the end are some instructions on how to look at the details.
Profiling details
If we follow the suggestion and run
python3 -m line_profiler -rtmz profile_output.lprof
we get detailed output for each profiled function. I have shown the output here for just our top-level function.
Total time: 0.0363764 s
File: /workspaces/blenderaddons-ng/add_ons/foreach_examples.py
Function: do_execute at line 127
Line # Hits Time Per Hit % Time Line Contents
==============================================================
127 @profile # type: ignore (if line_profiler is available we get a complaint here)
128 def do_execute(self, context: Context):
129 """Expensive part is moved out of the execute method to allow profiling.
130
131 Note that no profiling is done if line_profiler is not available or
132 if the environment variable `LINE_PROFILE` is not set to "1".
133 """
134 1 0.7 0.7 0.0 obj: Object = context.active_object
135
136 1 4.3 4.3 0.0 if self.mode == "NAIVE":
137 1 8160.5 8160.5 22.4 arr = get_vertex_positions(obj)
138 else:
139 arr = get_vertex_positions_np(obj)
140
141 1 5992.8 5992.8 16.5 world_arr = to_world_space(arr, obj)
142
143 1 7.2 7.2 0.0 if self.mode == "BROADCAST":
144 return get_closest_vertex_index_to_camera(world_arr, cam_pos=self.cam_pos)
145 else:
146 2 22207.5 11103.8 61.0 return get_closest_vertex_index_to_camera_naive(
147 1 3.4 3.4 0.0 world_arr, cam_pos=self.cam_pos
148 )
It may look complicated because this is the code we used to see the effect of using foreach_get() and NumPy functions, and what we call is determined by the mode variable, but since we called this with mode=NAIVE, we are effectively executing:
obj: Object = context.active_object
arr = get_vertex_positions(obj)
world_arr = to_world_space(arr, obj)
return get_closest_vertex_index_to_camera_naive(
    world_arr, cam_pos=self.cam_pos
)
And as you can see from the profile information, those are the only lines with a significant timing percentage: we spend about 22%, 16% and 61% of our time in them, respectively.
If, instead of using a naive Python loop, we use a call to foreach_get(), we get different results (some lines were removed to reduce clutter):
Line # Hits Time Per Hit % Time Line Contents
==============================================================
134 1 0.6 0.6 0.0 obj: Object = context.active_object
135
136 1 4.4 4.4 0.0 if self.mode == "NAIVE":
137 arr = get_vertex_positions(obj)
138 else:
139 1 147.9 147.9 0.7 arr = get_vertex_positions_np(obj)
140
141 1 159.8 159.8 0.7 world_arr = to_world_space(arr, obj)
142
143 1 3.9 3.9 0.0 if self.mode == "BROADCAST":
144 return get_closest_vertex_index_to_camera(world_arr, cam_pos=self.cam_pos)
145 else:
146 2 22196.4 11098.2 98.6 return get_closest_vertex_index_to_camera_naive(
147 1 2.9 2.9 0.0 world_arr, cam_pos=self.cam_pos
148 )
We can see that getting vertex positions with the alternative function get_vertex_positions_np(), which uses the foreach_get() method, is so much more efficient that now almost 99% of the time is spent in the function get_closest_vertex_index_to_camera_naive(). This means we can focus our optimization efforts solely on that function, and ignore to_world_space(), because that one contributes negligibly to the overall run time when presented with NumPy arrays instead of Python lists.
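The BROADCAST mode mentioned above calls get_closest_vertex_index_to_camera(), whose implementation is not shown in this article. A minimal sketch of what such a broadcast-based version could look like (the function name and example data here are made up, not taken from the add-on):

```python
import numpy as np
import numpy.typing as npt

def closest_vertex_broadcast(
    world_verts: npt.NDArray[np.float32], cam_pos: npt.NDArray[np.float32]
) -> tuple[int, float]:
    # Broadcast subtraction: (N, 3) - (3,) -> (N, 3) direction vectors,
    # then one vectorized norm over axis 1 instead of a Python-level loop.
    distances = np.linalg.norm(world_verts - cam_pos, axis=1)
    index = int(np.argmin(distances))
    return index, float(distances[index])

# Example: three vertices, camera at the origin; vertex 1 is closest.
verts = np.array([[3, 0, 0], [1, 0, 0], [2, 2, 0]], dtype=np.float32)
print(closest_vertex_broadcast(verts, np.zeros(3, dtype=np.float32)))
```

Replacing the per-vertex Python loop with a single vectorized norm is exactly the kind of change the profile output suggests trying.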
Caveat
A typical add-on operator will have an execute() method, and it might seem logical to add a @profile decorator to it directly, but if we did that Blender would complain: the wrapped function doesn't look exactly like what Blender expects (even though it would function the same). The remedy is to move all expensive code to its own do_execute() method and add the @profile decorator to that.
So typical code will have the following structure:
class OBJECT_OT_your_operator(bpy.types.Operator):
    ...  # other code omitted for brevity

    @profile
    def do_execute(self, context: Context):
        ...  # this is the expensive part we want to profile

    def execute(self, context: Context):
        self.do_execute(context)
        return {"FINISHED"}  # Blender expects execute() to return a status set
Summary
Profiling a Blender add-on with line-profiler is rather easy: wrap functions with a guarded @profile decorator (enabled by LINE_PROFILE=1, or replaced by a dummy if the line-profiler package isn't available) so profiling is optional and has no runtime cost in production. With the line-profiler package we can collect and print line-by-line timings, and use the results to pinpoint and fix hotspots.