basic profiling support #349

Draft
wants to merge 2 commits into base: master

Conversation

antirotor
Contributor

@antirotor antirotor commented Sep 2, 2019

Goal

It would be nice to have a mode that would generate profiling data for the pyblish process and plugins, producing lots of timing and other data that can later be used for visualizing and debugging the whole session.

Motivation

My idea is to capture the whole context after each plugin is done, as snapshots along with its timing, memory consumption, etc., to json. Then to have a simple application processing this json file, drawing all the data to a canvas, so we'll be able to visualize what is going on with all the data during publishing. This could be useful when debugging the behaviour of pipelines where a lot of plugins are doing many things.

This implements #348

@antirotor
Contributor Author

This still has some caveats. Getting memory information on Windows is somewhat cumbersome without additional external dependencies (such as psutil). The problem with psutil is that it needs the development version of Python on Linux (gcc and Python headers), complicating installation somewhat.

A tool to process the profiling info is also still missing.

@mottosso
Member

mottosso commented Sep 2, 2019

Thanks for this.

Re psutil; it's kind of the de-facto standard for doing these kinds of OS queries; how about we depend on it as an optional dependency? That is, if it's available, the profiling will be that much more thorough, and without it you get the basics. That way, users on Windows can pip install psutil and those on Linux could e.g. apt-get install python-psutil or whatever means are available, if they are interested in that kind of information.

If there is a pure-Python equivalent, that would of course be best. But in my experience it just isn't possible. :(
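The optional-dependency pattern described above could be sketched like this (a minimal sketch; `current_memory_usage` is a hypothetical helper name, not part of the patch):

```python
try:
    import psutil  # optional; richer profiling when available
except ImportError:
    psutil = None


def current_memory_usage():
    # hypothetical helper: resident set size in bytes when psutil
    # is installed, None otherwise (the "basics only" path)
    if psutil is not None:
        return psutil.Process().memory_info().rss
    return None
```

Callers can then branch on `None` instead of caring whether psutil is installed.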

@antirotor
Contributor Author

I've changed it to use the optional psutil module. Pure Python isn't there; the only option is the resource module, but that is available only on Linux and Darwin systems.
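A sketch of what that `resource`-based fallback looks like (`peak_memory_kb` is a hypothetical name; `resource` is a POSIX-only stdlib module, absent on Windows):

```python
import sys


def peak_memory_kb():
    # resource exists only on POSIX systems (Linux, Darwin), not Windows
    if sys.platform.startswith(("linux", "darwin")):
        import resource
        usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is kilobytes on Linux but bytes on macOS
        return usage // 1024 if sys.platform == "darwin" else usage
    return None
```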

Member

@mottosso mottosso left a comment

Sorry for the delay in looking at this. I've made some notes in-line, and overall I think there's one thing missing, which is that profiling currently happens per-process, but the data is stored as one global object, overwritten each time.

Instead, I think we should store them together with the context.data["results"] data. And actually, looking through the code, I found that we're already collecting the duration of each process in the results dict.

result["duration"] = (__end - __start) * 1000 # ms

This would be a good spot to continue adding profile data I think, so let's move the profiling context manager to there.
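A minimal sketch of what that could look like, keeping the existing duration measurement in place (the `profile` key is an assumption, not part of the patch):

```python
import time

__start = time.time()
# ... the plug-in's process() would run here ...
__end = time.time()

result = {}
result["duration"] = (__end - __start) * 1000  # ms, as Pyblish already records
# hypothetical extension: keep profile data next to the duration,
# so each entry in context.data["results"] carries its own stats
result["profile"] = {
    "startTime": __start,
    "endTime": __end,
}
```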

Let me know what you think.

profiling_path = os.path.join(tempfile.gettempdir(),
                              "{}-pyblish_profile.json".format(context.id))

append_to_json(profiling_data, profiling_path)
Member

Ah no, I don't think this is a good idea.

Why not let the user decide what to do with the profiling information?

context = util.publish()

with open("somefile.json", "w") as f:
    json.dump(context.data["profiling"], f)

GUIs could get a hold of it via the context as well, inbetween each process if need be, but likely also as a post-process of sorts.

Contributor Author

So you think it shouldn't save to a json file implicitly, and it's up to the user to handle the data in context.data.get('statistics') (let's call it that) afterward? Just checking that I understand correctly.

Member

Yes, don't write files without the user's knowledge. In the vast majority of cases, the user won't care for those files anyway. Profiling is primarily useful for the plug-in author.
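An opt-in helper in that spirit could be as simple as this (a sketch; `save_statistics` is a hypothetical name, and the `profiling` key follows the example earlier in the thread):

```python
import json
import os
import tempfile


def save_statistics(context_data, path=None):
    # hypothetical helper: nothing is written during publishing;
    # the plug-in author explicitly opts in by calling this afterwards
    if path is None:
        path = os.path.join(tempfile.gettempdir(), "pyblish_profile.json")
    with open(path, "w") as f:
        json.dump(context_data.get("profiling", {}), f, indent=2)
    return path
```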


# collected using psutil module, None if it is not installed
try:
profiling_data['initial_memory_sage'] = profiler.initial_mem
Member

Typo on _sage. The naming convention should be mixedCase for dictionary keys as well (JSON convention).

result = __explicit_process(plugin, context, instance, action)
else:
result = __implicit_process(plugin, context, instance, action)
if not context.get('profiling'):
Member

Typo here, should be context.data.get(), as below.

Unsure about whether profiling is the best name here. Grammatically it's a verb, describing some sort of state: that it is currently "profiling". As it isn't meant as that, it should probably be profile, but even that could be confused with an action, to "profile" something.

I'm thinking context.data["statistics"]

Contributor Author

I agree, statistics is much better.

@@ -58,6 +60,45 @@
STRICT_DATATYPES = EARLY_ADOPTER or STRICT_DATATYPES


class ProfilingTimer():
Member

I think we should stick with a @contextmanager, like it's being used elsewhere in Pyblish.

import contextlib
import time

@contextlib.contextmanager
def profiling():
    stats = {}
    stats["startTime"] = time.time()
    yield stats
    stats["endTime"] = time.time()

with profiling() as stats:
    ...

Alternatively..

stats = {}
with profiling(stats):
    ...
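The alternative form, filled out with a computed duration and made exception-safe via try/finally (a sketch; the `duration` key is an assumption):

```python
import contextlib
import time


@contextlib.contextmanager
def profiling(stats):
    # mutate a caller-provided dict, per the alternative form above
    stats["startTime"] = time.time()
    try:
        yield stats
    finally:
        # recorded even if the profiled block raises
        stats["endTime"] = time.time()
        stats["duration"] = (stats["endTime"] - stats["startTime"]) * 1000  # ms

stats = {}
with profiling(stats):
    sum(range(1000))
```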

context.data['profiling'][plugin] = profiling_data
profiling_data["snapshot"] = {
"timestamp": int(time.time()),
"context": json.dumps(context, default=lambda o: '<not serializable>')
Member

Not sure about "snapshot".. If the goal is capturing more data, I think we should instead favour a mechanism with which to let the plug-in author provide his own data, as mentioned in #348 (comment)

Contributor Author

I was thinking about the snapshot as a "map", if you will: with all this profiling data you could later visualize the whole pyblish process, with all its timings and what changed in the context at each step. But yes, adding custom data from every plugin should be implemented as well.
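For reference, the `default=` fallback in the quoted code makes non-serializable objects degrade into placeholder strings rather than raising. A standalone sketch (`take_snapshot` and the plugin name are hypothetical):

```python
import json
import time


def take_snapshot(context_data):
    # anything json can't serialize becomes a placeholder string,
    # matching the default= fallback in the quoted patch
    return {
        "timestamp": int(time.time()),
        "context": json.dumps(context_data,
                              default=lambda o: "<not serializable>"),
    }

# object() is not json-serializable, so it becomes the placeholder
snap = take_snapshot({"plugin": "CollectScene", "handle": object()})
```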


context.data['profiling'][plugin] = profiling_data
profiling_data["snapshot"] = {
"timestamp": int(time.time()),
Member

If we keep startTime from stats we can remove this.

@mottosso
Member

Hi @antirotor, revisiting this long-forgotten issue and would like to merge it. It's a good feature! There are just a few notes left before a first 1.0 of the feature, if you're up for it it would be good to see those though.

@antirotor
Contributor Author

> Hi @antirotor, revisiting this long-forgotten issue and would like to merge it. It's a good feature! There are just a few notes left before a first 1.0 of the feature, if you're up for it it would be good to see those though.

Sure thing, sorry for this zombie state. I had more urgent things to do, but it is still on my mind. I'll try to take care of it ASAP.
