TL;DR: Performance test results (PerformanceTests/ShadowDOM):

             v1-small-deep-layout          v1-small-shallow-layout
  Before CL: 216.85 ms                     3.14 ms
  After CL:  39.85 ms (about 5x faster)    2.03 ms (better even for a shallow tree)
This CL achieves a significant improvement in layout time after DOM mutation.
See the context here: https://bugs.chromium.org/p/chromium/issues/detail?id=692893
Let's use an example. Suppose that a slot's distributed nodes are:
- before updating distribution: [a, b, c, d]
- after updating distribution: [b, c, d]
In this case, only |a| should be reattached in theory; we don't need to reattach |b|, |c|, or |d|. However, the current code reattaches all nodes. Since creating a layout (sub)tree is the most expensive operation in our engine, it is worth spending some time to avoid unnecessary reattachment here.
Actually, Shadow DOM v0 already makes some effort here using heuristic rules. See https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/dom/shadow/InsertionPoint.cpp?q=InsertionPoint.cpp+package:%5Echromium$&l=54
For example, in the following case,
- before: [a, b, c, d]
- after: [b, c, d]
It reattaches only |a|. That is good; however, there are several cases where a simple heuristic rule doesn't work well. For example:
- before: [a, b, c, d]
- after: [e, b, c, d, f]
In this case, it reattaches all nodes: |a|, |b|, |c|, |d|, |e|, and |f|, even though |b|, |c|, and |d| don't have to be reattached.
We need a more sophisticated approach to minimize the number of reattached nodes.
This situation is similar to the well-known "Longest Common Subsequence" (LCS) problem. We can use 2D dynamic programming to find the optimal solution.
For example, for the following case:
- before: [a, b, c, d, e]
- after: [d, a, b, f, c, g, e]
DP finds [a, b, c, e] as the longest common subsequence, and these nodes are not reattached.
Since the dynamic programming takes O(MN) here, both in time and space, where M is the number of old distributed nodes and N is the number of new distributed nodes, we should be careful about its size.
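The DP above can be sketched in standard C++ as follows. This is a standalone illustration, not the Blink implementation: the function name LongestCommonSubsequenceIndices is hypothetical, and strings stand in for distributed nodes.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical sketch of the O(M*N) LCS dynamic programming step.
// Returns the indices in |old_nodes| that belong to one longest common
// subsequence with |new_nodes|; those nodes would not be reattached.
std::vector<size_t> LongestCommonSubsequenceIndices(
    const std::vector<std::string>& old_nodes,
    const std::vector<std::string>& new_nodes) {
  const size_t m = old_nodes.size();
  const size_t n = new_nodes.size();
  // lcs[i][j] = LCS length of old_nodes[0..i) and new_nodes[0..j).
  std::vector<std::vector<size_t>> lcs(m + 1, std::vector<size_t>(n + 1, 0));
  for (size_t i = 1; i <= m; ++i) {
    for (size_t j = 1; j <= n; ++j) {
      if (old_nodes[i - 1] == new_nodes[j - 1])
        lcs[i][j] = lcs[i - 1][j - 1] + 1;
      else
        lcs[i][j] = std::max(lcs[i - 1][j], lcs[i][j - 1]);
    }
  }
  // Backtrack through the table to recover one LCS.
  std::vector<size_t> keep;
  size_t i = m, j = n;
  while (i > 0 && j > 0) {
    if (old_nodes[i - 1] == new_nodes[j - 1]) {
      keep.push_back(--i);
      --j;
    } else if (lcs[i - 1][j] >= lcs[i][j - 1]) {
      --i;
    } else {
      --j;
    }
  }
  std::reverse(keep.begin(), keep.end());
  return keep;
}
```

For the example above (before: [a, b, c, d, e], after: [d, a, b, f, c, g, e]), this returns the indices of |a|, |b|, |c|, and |e| in the old list.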
In most cases, the cost of DP is justified because avoiding the reattachment of a subtree saves far more time than the DP costs; the cost of creating a layout tree still dominates in the layout engine. However, it would be difficult to consider every possibility beforehand: we can't know the number of nodes or the depth of each distributed node's subtree before we start to use DP for the distributed nodes.
Thus, I introduced a hard limit (== 16) to decide whether to use DP. That is probably the safest approach as a starting point.
If N is bigger than the limit, we don't use DP. We might want to use a heuristic approach for large N, but that is another issue, to be solved later.
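The guard could look like the following sketch. The constant name kLCSTableSizeLimit and the check on both dimensions are assumptions; the text above only states that DP is skipped when N exceeds the limit.

```cpp
#include <cstddef>

// Hypothetical hard limit on the LCS table size, as described above.
constexpr size_t kLCSTableSizeLimit = 16;

// Skip the O(M*N) table when it would grow past the limit; the caller
// then falls back to reattaching every distributed node.
bool ShouldApplyLCS(size_t old_size, size_t new_size) {
  return old_size <= kLCSTableSizeLimit && new_size <= kLCSTableSizeLimit;
}
```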
Change-Id: I796573dc308faaf04b7d4aee0ca12d4c0c5becf7
Reviewed-on: https://chromium-review.googlesource.com/535493
Commit-Queue: Hayato Ito
ceed06a Minimize the number of reattached nodes after updating distribution by 2D dynamic programming
third_party/WebKit/Source/core/BUILD.gn | 1 +
.../WebKit/Source/core/html/HTMLSlotElement.cpp | 73 ++++++++++--
.../WebKit/Source/core/html/HTMLSlotElement.h | 45 +++++++
.../Source/core/html/HTMLSlotElementTest.cpp | 131 +++++++++++++++++++++
4 files changed, 241 insertions(+), 9 deletions(-)