Konstantin's weekly report #34

KonstantinDmitriev · March 25, 2014, 5:57am

Hello, everyone! I am happy to present the Synfig weekly report #34.

Last week we got the first results of the optimization work done by Ivan Mahonin. The software renderer was heavily reworked and now delivers a 150-300% speed boost for vector artwork rendering. See the infographic below for details.

And yes, surely there is still a lot of room to improve.

For example, while discussing the results with Gerald Young, we found that it’s possible to get even better speed by disabling tile rendering. In this case Synfig uses only one CPU, but (surprise!) the render times are better.

So, we are considering dropping tile rendering and thinking about how to utilize the remaining CPU power. For example, we could render several neighboring frames in parallel with the current one. In addition, it would be nice to cache already rendered frames for the current view. We could also do background rendering during idle time. We get 8-10 fps with the non-tiled renderer on the current test file (utilizing just one CPU). If we use 4 CPUs to render several frames at once, then animation playback at 24 fps becomes a reality. That way the Preview Dialog would become history.

I discussed this idea with Ivan and he said that’s certainly possible, but the current implementation of the rendering system doesn’t provide the isolation needed for rendering several frames in parallel (even an ordinary user can notice that: just try to launch a preview and edit something on the workarea while it’s rendering - you’ll get a damaged image on the workarea or a damaged rendering). So, implementing this would require a significant rework of the internal timeline-layer model. Although this might require some time, Ivan still considers it possible.

Besides that, we still have a lot of other components awaiting optimization, such as raster filters (blur, distortions, etc.).

This is a good place to mention that our fundraising campaign for next month has only 8 days to go and is only at 44% now.
You can help us continue full-time development next month by contributing to the campaign or spreading the word. Your help is much appreciated!

As part of this report I can’t avoid mentioning the work done by Yu Chen - he has continued reworking the UI for the Tool Options panel. In the previous report he demonstrated the improved layout for the Circle Tool and now he has applied it to other tools - Rectangle, Gradient, Star, Polygon, Spline, Draw and Text. See some of them below.

Ivan is now continuing the optimization work and we will try to bring you testing binary packages by the end of this week - so you will be able to test the results yourself.
Stay tuned!

Darkspace_be · March 25, 2014, 7:36am

Hi!
Great work by Yu Chen on the reworked UI. Probably a too obvious question, but will the speed optimization also be carried over to the Windows version of Synfig?
Greetz!

KonstantinDmitriev · March 25, 2014, 3:44pm

At the moment we have problems with Windows builds. As soon as they are resolved, the optimization will be available for Windows users as well.

Yoyobuae · March 25, 2014, 11:07pm

Zelgadis:

So, we are considering dropping tile rendering and thinking about how to utilize the remaining CPU power. For example, we could render several neighboring frames in parallel with the current one. In addition, it would be nice to cache already rendered frames for the current view. We could also do background rendering during idle time. We get 8-10 fps with the non-tiled renderer on the current test file (utilizing just one CPU). If we use 4 CPUs to render several frames at once, then animation playback at 24 fps becomes a reality. That way the Preview Dialog would become history.

I discussed this idea with Ivan and he said that’s certainly possible, but the current implementation of the rendering system doesn’t provide the isolation needed for rendering several frames in parallel (even an ordinary user can notice that: just try to launch a preview and edit something on the workarea while it’s rendering - you’ll get a damaged image on the workarea or a damaged rendering). So, implementing this would require a significant rework of the internal timeline-layer model. Although this might require some time, Ivan still considers it possible.

I was thinking instead about rendering layers in parallel.

There are some cases in which layers could be rendered in parallel without interference. For example, layers inside a group can be rendered in parallel with the layers under it. The layers inside the group are rendered into an intermediate surface first, then the layers underneath are rendered into another surface, then both surfaces are blended using the group’s blend method.

Each “branch” of the operation could execute in parallel to the other, if I’m not mistaken. It’s kind of similar to a MapReduce system.
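
To make the idea concrete, here is a minimal C++ sketch of the two branches running concurrently. It is not Synfig code: `Surface`, `Layer`, `render_stack`, `blend` and `render_group_parallel` are hypothetical stand-ins, and the "blend method" is assumed to be a simple linear mix.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for a render surface: one float per pixel.
using Surface = std::vector<float>;

// Hypothetical layer: "rendering" it just adds a constant to every pixel.
struct Layer { float value; };

// Render a stack of layers into a fresh surface of `size` pixels.
Surface render_stack(const std::vector<Layer>& layers, std::size_t size) {
    Surface s(size, 0.0f);
    for (const Layer& l : layers)
        for (float& px : s) px += l.value;
    return s;
}

// Blend the group surface over the surface below it (assumed linear mix).
Surface blend(const Surface& below, const Surface& group, float amount) {
    Surface out(below.size());
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = below[i] * (1.0f - amount) + group[i] * amount;
    return out;
}

// Each branch writes only to its own surface, so the two can run concurrently.
Surface render_group_parallel(const std::vector<Layer>& inside,
                              const std::vector<Layer>& below,
                              std::size_t size, float amount) {
    // Branch 1: layers inside the group, on another thread.
    auto group_future = std::async(std::launch::async,
                                   [&] { return render_stack(inside, size); });
    // Branch 2: layers underneath, on the calling thread.
    Surface below_surface = render_stack(below, size);
    // "Reduce" step: wait for branch 1, then blend both surfaces.
    return blend(below_surface, group_future.get(), amount);
}
```

The sketch only works because neither branch reads the other's surface before the final blend; layers that depend on what is under them would not fit this scheme as written.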

KonstantinDmitriev · March 26, 2014, 5:10am

I was thinking about that too, but here we should take into account that creating a thread takes some time too. So, if a group contains just one circle, then creating a thread for it will result in a slowdown, not a speedup. So, branching parallelization would require a pre-analysis of the current layer tree, identifying the “expensive” parts, and some heuristic to effectively map those parts to the available cores.

Yoyobuae · March 26, 2014, 3:26pm

Zelgadis:

Yoyobuae:

I was thinking instead about rendering layers in parallel.

There are some cases in which layers could be rendered in parallel without interference. For example, layers inside a group can be rendered in parallel with the layers under it. The layers inside the group are rendered into an intermediate surface first, then the layers underneath are rendered into another surface, then both surfaces are blended using the group’s blend method.

Each “branch” of the operation could execute in parallel to the other, if I’m not mistaken. It’s kind of similar to a MapReduce system.

I was thinking about that too, but here we should take into account that creating a thread takes some time too. So, if a group contains just one circle, then creating a thread for it will result in a slowdown, not a speedup. So, branching parallelization would require a pre-analysis of the current layer tree, identifying the “expensive” parts, and some heuristic to effectively map those parts to the available cores.

Threads can be created ahead of time (i.e. a thread pool). Then the only cost would be the communication between threads, which is less expensive.

Besides, if group layers have only a single layer very frequently, then maybe we should be looking for a way to eliminate the need for that intermediate group layer.

KonstantinDmitriev · March 26, 2014, 3:36pm

This is what I mean by “pre-analysis”.

Yoyobuae · March 26, 2014, 4:25pm

“pre-analysis” doesn’t eliminate the (IMO unnecessary) group layer from the layer list nor from SIF file.

Genete · March 26, 2014, 6:39pm

Have you seen the optimize_layers function?
github.com/synfig/synfig/blob/a … .cpp#L1145

You can force removal of the Paste Canvas layer when it doesn’t add anything to the layers inside.
-G

thatraja · March 28, 2014, 1:34am

Off topic: Finally backed the pledge today - current campaign. Still disappointed with PayPal, as they didn’t provide very good support.

Well, any news about next release of Synfig AND/OR Morevna project?

KonstantinDmitriev · March 31, 2014, 2:55pm

Ah, looks like I have misunderstood you. ^^
My example of “single circle in the group” was an exaggeration. “Two circles in one group” would cause a slowdown as well. ^^

We could make a pre-analysis of the layer tree by assigning a “cost” value to each layer. For example, “1” for circle, “2” for shape, “6” for blur, “9” for distortion, etc… (values are just samples here). A group layer with a blur on top of a circle will have a cost of “1 + 6”.
Then such a cost tree could be analyzed and the most effective branching calculated depending on the number of available cores.
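
As a rough illustration of such a cost tree (a sketch only: `LayerNode`, `total_cost` and `worth_own_thread` are hypothetical names, the group's own cost is assumed to be zero, and the threshold rule is just one possible heuristic):

```cpp
#include <vector>

// Hypothetical node of the layer tree with a sample "cost" value.
struct LayerNode {
    int own_cost;                     // e.g. 1 for circle, 6 for blur (sample values)
    std::vector<LayerNode> children;  // non-empty for group layers
};

// Cost of a node is its own cost plus the cost of everything inside it.
int total_cost(const LayerNode& node) {
    int cost = node.own_cost;
    for (const LayerNode& child : node.children)
        cost += total_cost(child);
    return cost;
}

// Assumed heuristic: only branches at or above `threshold` get their own thread.
bool worth_own_thread(const LayerNode& node, int threshold) {
    return total_cost(node) >= threshold;
}
```

With the sample values, a group holding a circle and a blur gets `total_cost` of 1 + 6 = 7, so with a threshold of 5 it would be rendered in its own thread while a lone circle would not.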

Thank you, thatraja! I am glad you’ve got your PayPal problems solved and appreciate your support.
The Synfig release date is not defined, but I feel we are getting close to it. Morevna project is suspended, because all my efforts are invested into Synfig right now.

Yoyobuae · March 31, 2014, 5:27pm

Did a quick test using Cairo to draw one circle (with outline). Drawing the circle directly on the main thread takes ~150 microseconds for a small diameter circle (50px or so) and ~1 millisecond when the circle was as tall as my screen (700px or so). The time doesn’t change linearly with size, but in steps (I suppose it depends on the sizes of the different cache levels on the CPU).

Doing the work in a separate thread by creating the thread each time the circle was rendered added only ~50 to ~100 microseconds to each render pass. And creating the thread beforehand and using a cond var to pass the work to the thread gives pretty much the same results.
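
For anyone who wants to repeat this kind of measurement, the comparison can be sketched roughly as below. This is not my actual test code: `fake_draw` is a hypothetical stand-in for the Cairo drawing call, and `elapsed_us`, `direct_us` and `threaded_us` are names made up for the sketch.

```cpp
#include <chrono>
#include <thread>

// Run `work` once and return the elapsed wall time in microseconds.
template <typename F>
double elapsed_us(F&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Hypothetical stand-in for drawing a circle: some busy work scaled by `n`.
double fake_draw(int n) {
    volatile double acc = 0.0;  // volatile so the loop is not optimized away
    for (int i = 0; i < n; ++i) acc = acc + i * 0.5;
    return acc;
}

// Time the work done directly on the calling thread.
double direct_us(int n) {
    return elapsed_us([n] { fake_draw(n); });
}

// Time the same work including thread creation and join.
double threaded_us(int n) {
    return elapsed_us([n] {
        std::thread worker([n] { fake_draw(n); });
        worker.join();
    });
}
```

The difference between `threaded_us` and `direct_us` for the same `n` approximates the per-pass thread overhead; the numbers will vary by machine, so it is worth averaging over many runs.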

Also, measuring time is easy/cheap. The cost of each layer can be measured as Synfig is rendering a frame. When the next frame is rendered the previous measurements can be used to get an idea of how much work each layer costs.

No need to invent cost values for each layer; that would probably be difficult and require lots of adjustment to get the values right.
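
A minimal sketch of what I mean by measuring while rendering (hypothetical names throughout: `CostTracker`, `render_and_measure` and `estimate_us` are not Synfig API, and layers are assumed to be identified by a string id):

```cpp
#include <chrono>
#include <string>
#include <unordered_map>

// Hypothetical helper that remembers how long each layer took last time.
class CostTracker {
public:
    // Run the layer's render callable and record its duration in microseconds.
    template <typename F>
    void render_and_measure(const std::string& layer_id, F&& render) {
        auto start = std::chrono::steady_clock::now();
        render();
        auto stop = std::chrono::steady_clock::now();
        last_us_[layer_id] =
            std::chrono::duration<double, std::micro>(stop - start).count();
    }

    // Cost estimate for the next frame: the previous measurement,
    // or `fallback` if the layer has not been rendered yet.
    double estimate_us(const std::string& layer_id, double fallback) const {
        auto it = last_us_.find(layer_id);
        return it == last_us_.end() ? fallback : it->second;
    }

private:
    std::unordered_map<std::string, double> last_us_;
};
```

The estimates from the previous frame could then feed whatever heuristic decides which layers are worth sending to another thread.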

KonstantinDmitriev · April 6, 2014, 12:50pm

Thank you, this is useful to know. I’ll forward this to Ivan.