Editor: Improve revisions diff pairing performance#77126
Editor: Improve revisions diff pairing performance#77126
Conversation
Replace the O(n*m) diffWords-based similarity check in pairSimilarBlocks with an O(n) word-set overlap (Jaccard index). This eliminates the main performance bottleneck when sliding through revisions. Additionally: - Strip HTML tags before similarity comparison so markup doesn't inflate scores for short blocks - Directly pair 1:1 removed/added blocks of the same type without similarity check (no ambiguity) - Raise similarity threshold from 0.3 to 0.5 to prevent pairing unrelated paragraphs that share common words Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
|
Size Change: +184 B (0%) Total Size: 7.74 MB 📦 View Changed
ℹ️ View Unchanged
|
Replace regex-based word splitting with Intl.Segmenter for proper multilingual support (CJK, Thai, etc). Remove HTML tag stripping since Intl.Segmenter's isWordLike filter naturally handles tags. Add a test for pairing blocks with similar content (fox jumps/leaps). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Flaky tests detected in ae4278b. 🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/24125702268
|
What?
Improve the performance of the revisions diff pairing algorithm (
pairSimilarBlocks) in the editor.Why?
The
textSimilarityfunction useddiffWords(O(n*m) per pair) to score every candidate pair of removed/added blocks. For posts with many paragraphs of the same type, this created an R×A matrix of expensive calls — the main source of jank when sliding through revisions.How?
diffWords-based similarity with O(n) word-set overlap (Jaccard index), stripping HTML tags before comparisonThe new similarity produces identical pairing decisions for real-world content (verified across 24 revision pairs), while being ~67x faster in benchmarks on an 8×8 paragraph matrix.
Testing Instructions
npm run test:unit -- --testPathPattern="post-revisions-preview/test/block-diff"Testing Instructions for Keyboard
Use of AI Tools
This PR was authored with Claude Code (Claude Opus 4.6). All code was reviewed and tested manually.
🤖 Generated with Claude Code