perf: stdlib batch 3 — while-loop conversions and allocation reduction (19 functions)#702
He-Pin wants to merge 1 commit into databricks:master
Convert 19 stdlib/core functions from Scala collection operations to while-loops and replace intermediate allocations with direct array ops:
- ArrayModule: std.all, std.any, std.count, std.member (array), std.contains, std.remove (while-loop search + arraycopy), std.removeAt (arraycopy), std.repeat (pre-sized array + arraycopy), std.sum/std.avg (single-pass)
- ObjectModule: prune (fuse map+filter with ArrayBuilder)
- StringModule: escapeStringXml (StringBuilder + charAt)
- ManifestModule: Lines, manifestYamlStream, manifestIni, manifestPythonVars (all StringBuilder conversions)
- SetModule: sortArr (reuse argBuf, while-loop key compute + index remap)
- DecimalFormat: leftPad/rightPad (StringBuilder)
- Evaluator: visitApply (while-loop arg evaluation)
- Util: compareStringsByCodepoint (fast path for equal non-surrogate chars)

Key safety guards:
- std.count/std.member: guard x.value with empty-array check to preserve lazy semantics (std.count([], error) must return 0, not throw)
- compareStringsByCodepoint: equality fast-path restricted to c1 < 0xD800 to avoid misaligned surrogate pair comparison

Upstream: jit branch commits b3c696f, 9ff850f, e19833b, 72eb971, d1629fd, cd612df, af4832f
Code Review
I found several correctness issues in this PR: 1.
Closing: superseded by consolidated stdlib optimization effort. All 19 function optimizations from this PR are preserved and will be resubmitted with comprehensive Scala Native benchmarks.
Motivation
Continue the systematic stdlib optimization effort. This batch converts 19 functions across 8 files from Scala collection operations (.map, .filter, .forall, .exists, .foreach, .mkString) to while-loops and replaces intermediate allocations with direct array operations (System.arraycopy, StringBuilder).

Key Design Decisions
- Lazy semantics preservation: std.count and std.member guard x.value evaluation with empty-array checks: std.count([], error "boom") must return 0, not throw. This was caught during cross-review.
- Surrogate-safe string comparison: The compareStringsByCodepoint equality fast-path is restricted to c1 < 0xD800 to prevent misaligned surrogate pair comparisons when one side has valid pairs and the other has unpaired surrogates.
- Single-pass sum/avg: Old code did two passes (validate all elements with forall, then map + sum). New code validates and accumulates in a single pass: same error message, earlier failure on invalid input.

Modification
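To make the lazy-semantics guard from the design decisions concrete, here is a minimal sketch. `Lazy` and `LazyCountSketch` are hypothetical stand-ins invented for illustration; they are not sjsonnet's actual types, and the real std.count operates on Jsonnet values rather than a generic array.

```scala
// Hypothetical stand-in for a lazily evaluated argument (not sjsonnet's real type).
final class Lazy[A](compute: () => A) {
  lazy val value: A = compute()
}

object LazyCountSketch {
  // Counts elements equal to x, forcing x only when the array is non-empty.
  def count[A](arr: Array[A], x: Lazy[A]): Int =
    if (arr.length == 0) 0 // guard: x.value is never touched for an empty array
    else {
      val target = x.value // hoisted: forced once, outside the loop
      var i = 0
      var n = 0
      while (i < arr.length) {
        if (arr(i) == target) n += 1
        i += 1
      }
      n
    }
}
```

Without the empty-array branch, hoisting `x.value` out of the loop would force the argument even when there is nothing to compare against, which is the behavior change the guard is meant to prevent.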
ArrayModule.scala (9 functions)
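As a rough illustration of the System.arraycopy pattern used for std.removeAt and std.repeat in this file, consider the sketch below. `ArrayCopySketch` and its `Array[AnyRef]` element type are assumptions made for the example, and the out-of-range handling in `removeAt` is likewise assumed rather than taken from the PR.

```scala
object ArrayCopySketch {
  // Returns a copy of arr without the element at idx, using two bulk copies
  // instead of slice/++ (which would allocate intermediate arrays).
  def removeAt(arr: Array[AnyRef], idx: Int): Array[AnyRef] =
    if (idx < 0 || idx >= arr.length) arr // assumption: out-of-range index removes nothing
    else {
      val out = new Array[AnyRef](arr.length - 1)
      System.arraycopy(arr, 0, out, 0, idx)                          // elements before idx
      System.arraycopy(arr, idx + 1, out, idx, arr.length - idx - 1) // elements after idx
      out
    }

  // Repeats arr n times into a pre-sized result array.
  def repeat(arr: Array[AnyRef], n: Int): Array[AnyRef] = {
    require(n >= 0, "repeat count must be non-negative")
    val out = new Array[AnyRef](arr.length * n)
    var i = 0
    while (i < n) {
      System.arraycopy(arr, 0, out, i * arr.length, arr.length)
      i += 1
    }
    out
  }
}
```

Because the result size is known up front in both cases, a single allocation plus bulk copies replaces the growable builder or concatenation.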
- std.all: forall → while-loop with Val.False(pos) early return
- std.any: iterator.exists → while-loop with Val.True(pos) early return
- std.count: foreach closure → while-loop; x.value hoisted with empty guard
- std.member (array): indexWhere → while-loop with var found; empty guard
- std.contains: indexWhere → while-loop
- std.remove: indexWhere + slice/++ → while-loop search + System.arraycopy
- std.removeAt: slice/++ → System.arraycopy
- std.repeat (array): ArrayBuilder → pre-sized array + System.arraycopy
- std.sum/avg: two-pass → single-pass while-loop with pattern match

ObjectModule.scala
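The fused map+filter shape that prune moves to might look like the following sketch. The tiny `J` value type and `PruneSketch` object are hypothetical and cover arrays only; sjsonnet's prune works on its own value classes and also handles objects.

```scala
import scala.collection.mutable.ArrayBuilder

object PruneSketch {
  // Hypothetical miniature JSON model, for illustration only.
  sealed trait J
  case object JNull extends J
  final case class JNum(value: Double) extends J
  final case class JArr(items: Array[J]) extends J

  // "Empty" here means null or a zero-length array.
  private def isEmpty(v: J): Boolean = v match {
    case JNull       => true
    case JArr(items) => items.length == 0
    case _           => false
  }

  // Recursively prunes each child (the "map") and drops empty results
  // (the "filter") in the same loop, so no intermediate array is built.
  def prune(v: J): J = v match {
    case JArr(items) =>
      val kept = ArrayBuilder.make[J]
      var i = 0
      while (i < items.length) {
        val pruned = prune(items(i))
        if (!isEmpty(pruned)) kept += pruned
        i += 1
      }
      JArr(kept.result())
    case other => other
  }
}
```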
- prune: for-comprehension → ArrayBuilder while-loop, fusing map+filter

StringModule.scala
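A minimal sketch of the StringBuilder + while + charAt pattern applied to XML escaping is shown here. `XmlEscapeSketch` is a hypothetical name, and the five entity replacements are assumed to be the standard XML ones; the PR text does not list them.

```scala
object XmlEscapeSketch {
  // Escapes the five XML special characters in a single pass over the input.
  def escapeXml(s: String): String = {
    // Slightly over-allocate so a few escapes do not force a resize.
    val sb = new java.lang.StringBuilder(s.length + 16)
    var i = 0
    while (i < s.length) {
      s.charAt(i) match {
        case '<'  => sb.append("&lt;")
        case '>'  => sb.append("&gt;")
        case '&'  => sb.append("&amp;")
        case '"'  => sb.append("&quot;")
        case '\'' => sb.append("&apos;")
        case c    => sb.append(c)
      }
      i += 1
    }
    sb.toString
  }
}
```

Indexing with `charAt` avoids the per-character closure call of a `for` over the string, and `java.lang.StringBuilder` avoids the extra wrapper layer of a `StringWriter`.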
- escapeStringXml: StringWriter + for → java.lang.StringBuilder + while + charAt

ManifestModule.scala (4 functions)
- Lines: filter + map + mkString → while-loop + StringBuilder
- manifestYamlStream: map + mkString → while-loop + StringBuilder
- manifestIni: flatMap + Seq + mkString → StringBuilder direct append
- manifestPythonVars: map + mkString → StringBuilder

SetModule.scala
- sortArr: reuse single-element argBuf for keyF invocations; while-loops for key computation, index remapping, and strict evaluation

DecimalFormat.scala
- leftPad/rightPad: string concat with "0" * n → StringBuilder

Evaluator.scala
- visitApply: e.args.map(...) → while-loops (Array[Val] eager, Array[Eval] lazy)

Util.scala
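The surrogate-aware fast paths in compareStringsByCodepoint can be sketched roughly as follows. `CodepointCompareSketch` is a hypothetical name and the slow path shown here (decoding with `codePointAt`) is an assumption; only the two `< 0xD800` conditions come from the PR description.

```scala
object CodepointCompareSketch {
  // Orders strings by Unicode code point rather than by UTF-16 code unit.
  def compare(s1: String, s2: String): Int = {
    val n1 = s1.length
    val n2 = s2.length
    var i1 = 0
    var i2 = 0
    while (i1 < n1 && i2 < n2) {
      val c1 = s1.charAt(i1)
      val c2 = s2.charAt(i2)
      if (c1 == c2 && c1 < 0xD800) {
        // Fast skip: equal chars below the surrogate range are one code point each.
        i1 += 1
        i2 += 1
      } else if (c1 < 0xD800 && c2 < 0xD800) {
        // Both below the surrogate range and different: code-unit order is code-point order.
        return c1 - c2
      } else {
        // Slow path (assumed): decode full code points so surrogate pairs compare correctly.
        val cp1 = s1.codePointAt(i1)
        val cp2 = s2.codePointAt(i2)
        if (cp1 != cp2) return Integer.compare(cp1, cp2)
        i1 += Character.charCount(cp1)
        i2 += Character.charCount(cp2)
      }
    }
    // One string is a prefix of the other (or they are equal): shorter sorts first.
    Integer.compare(n1 - i1, n2 - i2)
  }
}
```

Restricting the equality skip to `c1 < 0xD800` means the loop never advances by a single char while sitting on a surrogate, so the two indices cannot drift out of alignment with code-point boundaries.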
- compareStringsByCodepoint: add c1 == c2 && c1 < 0xD800 fast skip + c1 < 0xD800 && c2 < 0xD800 direct subtraction

Benchmark Results
JMH (focused, wi=3, i=5, f=1)
Full Regression Suite (22 cases)
No significant regressions across any benchmark case.
Analysis
These are incremental micro-optimizations that primarily reduce GC pressure through fewer intermediate allocations (no lambda closures, no temporary arrays from .map). The gains are most visible in benchmarks that exercise stdlib functions heavily. The visitApply while-loop benefits any code with variable-arity function calls.

References
b3c696f2, 9ff850f3, e19833b5, 72eb9711, d1629fde, cd612df7, af4832f2

Result
All 55 test suites pass. 22 benchmark cases show no regressions.