perf: pre-cached indent arrays for bulk newline+spaces #676
Open
He-Pin wants to merge 1 commit into databricks:master
Conversation
stephenamar-db requested changes on Apr 8, 2026
Force-pushed f2e7618 to 9db668d
He-Pin (Contributor, Author) replied:
Good catch — extracted the magic number into a MaxCachedDepth constant. Note that the indent cache content is instance-specific (it depends on the indent setting).
Force-pushed 9db668d to 7ec85ce
Extract MaxCachedDepth=16 to Renderer companion object constant per review. Pre-compute indentCache arrays for depths 0..15 to replace per-character space emission with a single bulk appendAll in flushBuffer.
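A minimal, self-contained sketch of how such a cache can be built (the MaxCachedDepth value and indent width here are illustrative stand-ins, not necessarily the PR's final values):

```scala
object IndentCacheSketch {
  // Illustrative values; in the PR these live on Renderer / its companion object.
  final val MaxCachedDepth = 16
  final val indent = 2 // assumed spaces-per-level setting

  // indentCache(d) holds '\n' followed by indent * d spaces, so emitting an
  // indent at depth d becomes one bulk append instead of a per-character loop.
  val indentCache: Array[Array[Char]] = {
    val cache = new Array[Array[Char]](MaxCachedDepth)
    var d = 0
    while (d < MaxCachedDepth) {
      val spaces = indent * d
      val buf = new Array[Char](spaces + 1)
      buf(0) = '\n'
      java.util.Arrays.fill(buf, 1, spaces + 1, ' ')
      cache(d) = buf
      d += 1
    }
    cache
  }
}
```

Each entry is computed once at construction time, so the per-render cost is a single array lookup.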
Force-pushed 7ec85ce to abfe59a
```scala
var d = 0
while (d < MaxCachedDepth) {
  val spaces = indent * d
  val buf = new Array[Char](spaces + 1)
```
Collaborator
Wouldn't

```scala
val buf = Array.fill(spaces + 1) { ' ' }
buf(0) = '\n'
```

be more terse?
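For what it's worth, the two formulations produce identical arrays; a quick equivalence check (helper names here are hypothetical, just for the comparison):

```scala
object FillCheck {
  // The hand-rolled version from the diff: allocate, set '\n', loop spaces in.
  def manual(spaces: Int): Array[Char] = {
    val buf = new Array[Char](spaces + 1)
    buf(0) = '\n'
    var i = 1
    while (i <= spaces) { buf(i) = ' '; i += 1 }
    buf
  }

  // The terser version suggested above: fill with spaces, then overwrite slot 0.
  def terse(spaces: Int): Array[Char] = {
    val buf = Array.fill(spaces + 1)(' ')
    buf(0) = '\n'
    buf
  }
}
```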
Motivation
The JSON renderer generates indentation strings (newline + spaces) on every nested element. For deeply nested Jsonnet output, Renderer.visitKey and visitEnd repeatedly construct identical indent strings. The current implementation calls elemBuilder.append('\n') followed by a while loop appending spaces — this is O(depth) per indent operation.
Key Design Decision
Pre-cache indent strings (newline + spaces) in a companion object array up to depth 64 (MaxCachedDepth). For depths ≤ 64, indent operations become a single array lookup + bulk write. For depths > 64 (rare in practice), fall through to the original loop.
Modification
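A sketch of the fast path with its fall-through, using a plain StringBuilder as a stand-in for elemBuilder (the cache bound and indent width are illustrative, not the PR's exact values):

```scala
object IndentFastPath {
  final val MaxCachedDepth = 16
  final val indent = 2 // assumed spaces-per-level setting

  // Pre-computed "\n" + spaces for each cached depth.
  private val indentCache: Array[Array[Char]] =
    Array.tabulate(MaxCachedDepth)(d => ("\n" + " " * (indent * d)).toCharArray)

  // Emits '\n' plus depth * indent spaces into sb.
  def writeIndent(sb: StringBuilder, depth: Int): Unit =
    if (depth < MaxCachedDepth) {
      sb.appendAll(indentCache(depth)) // fast path: single bulk write
    } else {
      sb.append('\n') // rare deep nesting: original per-character loop
      var i = 0
      while (i < depth * indent) { sb.append(' '); i += 1 }
    }
}
```

Both branches produce identical output; only the cached branch avoids the O(depth) per-character appends.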
sjsonnet/src/sjsonnet/Renderer.scala:
- Add MaxCachedDepth = 64 constant and indentCache: Array[Array[Char]]
- Pre-compute "\n" + " " * (depth * indent) as char arrays for depths 0–64
- flushBuffer() fast path: when depth ≤ MaxCachedDepth, uses elemBuilder.appendAll(cachedArray, len) instead of the character-by-character loop
Benchmark Results
JMH — Full Suite (35 benchmarks, 1+1 warmup)
No regressions detected. All benchmarks within noise margin.
Note
The indentation cache optimization primarily benefits:
- std.manifestJsonEx — uses indentation for pretty-printing
- Scala Native, where bulk System.arraycopy-style writes replace per-character appends
Analysis
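One way to see the mechanism (illustrative, not the PR's benchmark): both paths below produce identical output, but the bulk append delegates to an arraycopy-style block copy instead of one method call per character:

```scala
object BulkVsPerChar {
  // Pre-computed indent for one depth (8 spaces, illustrative).
  val cached: Array[Char] = ("\n" + " " * 8).toCharArray

  // Per-character emission: one append call per space.
  def perChar(): String = {
    val sb = new StringBuilder
    sb.append('\n')
    var i = 0
    while (i < 8) { sb.append(' '); i += 1 }
    sb.toString
  }

  // Bulk emission: a single appendAll over the pre-computed array.
  def bulk(): String = {
    val sb = new StringBuilder
    sb.appendAll(cached)
    sb.toString
  }
}
```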
References
- upickle.core.CharBuilder.appendAll(char[], int) for bulk writes
- Renderer.flushBuffer
Result
Pre-cached indent arrays eliminate per-character overhead for nested JSON rendering. No regressions. Benefits deeply nested output and Scala Native.