After a deep analysis of both codebases, "Rust is faster" is NOT the real explanation for jrsonnet's performance advantage. The performance difference comes from specific architectural and implementation choices that could theoretically be implemented in any language.
go-jsonnet:
- Interface-based values: `value` is an interface (value.go:29-33), requiring runtime type dispatch
- Each value is heap-allocated: `*valueNumber`, `*valueBoolean`, `*valueObject`, etc.
- Size varies: No size assertions; each type has its own struct with an embedded `valueBase`
jrsonnet:
- Enum-based values: `Val` is a single enum (val.rs:524-546)
- Fixed size, 24 bytes on 64-bit (val.rs:548-549): `static_assertions::assert_eq_size!(Val, [u8; 24])`
- Inline small values: Booleans, null, and numbers are stored inline without allocation
The enum representation allows:
- Cache-friendly memory layout (predictable size)
- No virtual dispatch for type checking
- Fewer allocations for primitives
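The layout benefit is easy to see in a small sketch. The following is a hypothetical, simplified value enum (not jrsonnet's real `Val`, which has many more variants): primitives are stored inline, only strings and arrays carry a pointer, and every variant shares the same fixed footprint.

```rust
use std::rc::Rc;

// Illustrative value enum: small values inline, heap only for payloads.
#[derive(Clone)]
enum Val {
    Null,
    Bool(bool),
    Num(f64),
    Str(Rc<str>),       // shared, heap-allocated payload
    Arr(Rc<Vec<Val>>),  // shared, heap-allocated payload
}

fn type_name(v: &Val) -> &'static str {
    // A plain `match` compiles to a jump on the discriminant:
    // no vtable lookup, unlike an interface/trait-object design.
    match v {
        Val::Null => "null",
        Val::Bool(_) => "boolean",
        Val::Num(_) => "number",
        Val::Str(_) => "string",
        Val::Arr(_) => "array",
    }
}

fn main() {
    // Every variant occupies the same, fixed number of bytes.
    println!("size_of::<Val>() = {}", std::mem::size_of::<Val>());
    assert!(std::mem::size_of::<Val>() <= 24);
    assert_eq!(type_name(&Val::Num(1.5)), "number");
    assert_eq!(type_name(&Val::Str(Rc::from("hi"))), "string");
    assert_eq!(type_name(&Val::Arr(Rc::new(vec![Val::Null]))), "array");
}
```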
go-jsonnet:
- No interning: Each string is a fresh `[]rune` slice (value.go:86-90)
- Rune-based: Converts to `[]rune` for indexing support, quadrupling memory for ASCII text (each `rune` is 4 bytes)
- String comparison: Character-by-character (value.go:219-231)
- Tree strings: Uses `valueStringTree` for concatenation, but flattens on first use
jrsonnet:
- Global string interning (jrsonnet-interner/src/lib.rs):
  - Thread-local pool with deduplication (lib.rs:224-226)
  - O(1) equality: Pointer comparison only (lib.rs:61-65)
  - O(1) hashing: Hashes the pointer, not the content (lib.rs:73-78)
- UTF-8 native: Stores `[u8]` with a UTF-8 flag (inner.rs:12-14)
- Custom reference counting inline with the data (inner.rs:49)
String interning is HUGE for jsonnet because:
- Object field names are compared constantly
- Many strings repeat across configurations
- Field lookups become pointer comparisons instead of string comparisons
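A minimal sketch of the technique, built from std `Rc` and a thread-local `HashSet` (jrsonnet's real interner in `jrsonnet-interner` is more elaborate, with custom refcounting): after interning, equal strings share one allocation, so equality is a single pointer comparison.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;

// Thread-local deduplication pool (illustrative, not jrsonnet's code).
thread_local! {
    static POOL: RefCell<HashSet<Rc<str>>> = RefCell::new(HashSet::new());
}

fn intern(s: &str) -> Rc<str> {
    POOL.with(|pool| {
        let mut pool = pool.borrow_mut();
        if let Some(existing) = pool.get(s) {
            // Already interned: hand out another pointer to the same bytes.
            return Rc::clone(existing);
        }
        let rc: Rc<str> = Rc::from(s);
        pool.insert(Rc::clone(&rc));
        rc
    })
}

fn main() {
    let a = intern("fieldName");
    let b = intern("fieldName");
    // O(1) equality: both handles point at the same allocation.
    assert!(Rc::ptr_eq(&a, &b));
    // A different string gets its own allocation.
    let c = intern("other");
    assert!(!Rc::ptr_eq(&a, &c));
}
```

Hashing gets the same treatment: hash the pointer value instead of the string bytes, which is what makes interned field names cheap as hash-map keys.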
go-jsonnet:
- Single array type (value.go:288-291): `[]*cachedThunk`
- Concatenation copies: `concatArrays` creates a new slice (value.go:319-324)
- No specialized variants
jrsonnet:
- Multiple specialized array types (arr/spec.rs):
  - `RangeArray`: O(1) `std.range()` operations (spec.rs:341-390)
  - `SliceArray`: O(1) views into arrays (spec.rs:26-82)
  - `ExtendedArray`: O(1) concatenation via a tree structure (spec.rs:207-293)
  - `MappedArray`: Lazy `std.map()` with caching (spec.rs:415-499)
  - `ReverseArray`: O(1) reversal (spec.rs:393-413)
  - `RepeatedArray`: O(1) `std.repeat()` (spec.rs:501-546)
Common array operations like slicing, mapping, and concatenation avoid copies entirely.
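A simplified sketch of the idea behind those variants (a hypothetical `Arr` type, not the actual `arr/spec.rs` code): each view answers `len`/`get` from a small descriptor instead of materializing a new vector.

```rust
use std::rc::Rc;

// Illustrative lazy array views; real jrsonnet arrays hold thunks.
enum Arr {
    Vec(Rc<Vec<i64>>),
    // O(1) std.range(): stores only the bounds.
    Range { start: i64, end: i64 },
    // O(1) slicing: a window over another array.
    Slice { inner: Rc<Arr>, from: usize, len: usize },
    // O(1) concatenation: a two-node tree, no copying.
    Extended { left: Rc<Arr>, right: Rc<Arr> },
}

impl Arr {
    fn len(&self) -> usize {
        match self {
            Arr::Vec(v) => v.len(),
            Arr::Range { start, end } => (end - start).max(0) as usize,
            Arr::Slice { len, .. } => *len,
            Arr::Extended { left, right } => left.len() + right.len(),
        }
    }
    fn get(&self, i: usize) -> i64 {
        match self {
            Arr::Vec(v) => v[i],
            Arr::Range { start, .. } => start + i as i64,
            Arr::Slice { inner, from, .. } => inner.get(from + i),
            Arr::Extended { left, right } => {
                if i < left.len() { left.get(i) } else { right.get(i - left.len()) }
            }
        }
    }
}

fn main() {
    // std.range(0, 1000000) without allocating a million elements:
    let range = Rc::new(Arr::Range { start: 0, end: 1_000_000 });
    // arr[10:15] without copying:
    let slice = Rc::new(Arr::Slice { inner: Rc::clone(&range), from: 10, len: 5 });
    // a + b without copying either side:
    let both = Arr::Extended { left: Rc::clone(&slice), right: slice.clone() };
    assert_eq!(both.len(), 10);
    assert_eq!(both.get(0), 10);
    assert_eq!(both.get(9), 14);
}
```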
go-jsonnet:
- Tree-based inheritance (value.go:618-653): `extendedObject` stores left/right
- Field lookup walks the tree: `findField` recurses through the inheritance chain (value.go:658-680)
- Caches per object instance: `cache map[objectCacheKey]value` (value.go:428)
- Map copying on extend: `prepareFieldUpvalues` creates new maps (value.go:682-701)
jrsonnet:
- Same tree-based inheritance (obj.rs:627-641)
- More aggressive caching: Cache key includes the `this` pointer (obj.rs:165, obj.rs:694-718)
- Uses interned strings for keys: Field lookups are pointer comparisons
- Trait-based polymorphism: The `ObjectLike` trait allows specialized implementations (obj.rs:181-203)
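A minimal sketch of tree-based inheritance with right-biased field lookup (hypothetical types; both implementations follow this shape): `a + b` becomes a two-node tree, and lookup tries the overriding side first.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Illustrative object representation; real fields hold lazy thunks.
enum Obj {
    Literal(HashMap<Rc<str>, i64>),
    Extended { left: Rc<Obj>, right: Rc<Obj> },
}

fn find_field(obj: &Obj, name: &str) -> Option<i64> {
    match obj {
        Obj::Literal(fields) => fields.get(name).copied(),
        Obj::Extended { left, right } => {
            // The right side overrides the left, like Jsonnet's `+`.
            find_field(right, name).or_else(|| find_field(left, name))
        }
    }
}

fn literal(fields: &[(&str, i64)]) -> Rc<Obj> {
    Rc::new(Obj::Literal(
        fields.iter().map(|(k, v)| (Rc::from(*k), *v)).collect(),
    ))
}

fn main() {
    // base + override, as in Jsonnet's `{x: 1, y: 2} + {y: 3}`:
    let base = literal(&[("x", 1), ("y", 2)]);
    let over = literal(&[("y", 3)]);
    let extended = Obj::Extended { left: base, right: over };
    assert_eq!(find_field(&extended, "x"), Some(1)); // inherited from the left
    assert_eq!(find_field(&extended, "y"), Some(3)); // overridden by the right
    assert_eq!(find_field(&extended, "z"), None);
}
```

With interned keys, the `HashMap` lookups above would hash a pointer instead of the string bytes, and a per-object cache keyed on `(field, this)` would avoid repeating the tree walk.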
go-jsonnet:
- Stack-based: `callStack` with `[]*callFrame` (interpreter.go:96-101)
- Variable lookup walks the stack: `lookUpVar` iterates backwards (interpreter.go:197-209)
- Map creation per scope: `bindingFrame map[ast.Identifier]*cachedThunk` (value.go:67)
jrsonnet:
- Layered hash map: `LayeredHashMap` with a parent pointer (map.rs:7-11)
- Shared structure: Parent contexts are shared via `Cc<>` reference counting
- Lookup walks layers: But with fewer allocations thanks to sharing (map.rs:40-45)
- Interned keys: `IStr` keys make lookups faster
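A minimal sketch of a layered scope map, using `Rc` in place of jrsonnet's `Cc` (names are illustrative): entering a scope allocates one small layer, while all parent layers are shared rather than copied.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Illustrative layered scope map, modeled on the LayeredHashMap idea.
struct Layer {
    bindings: HashMap<Rc<str>, i64>,
    parent: Option<Rc<Layer>>,
}

impl Layer {
    fn extend(parent: Option<Rc<Layer>>, bindings: HashMap<Rc<str>, i64>) -> Rc<Layer> {
        Rc::new(Layer { bindings, parent })
    }
    fn lookup(&self, name: &str) -> Option<i64> {
        // Walk from the innermost layer outwards.
        match self.bindings.get(name) {
            Some(v) => Some(*v),
            None => self.parent.as_ref().and_then(|p| p.lookup(name)),
        }
    }
}

fn main() {
    let global = Layer::extend(None, HashMap::from([(Rc::from("x"), 1)]));
    // Two sibling scopes share `global` without copying its bindings.
    let inner_a = Layer::extend(Some(Rc::clone(&global)), HashMap::from([(Rc::from("y"), 2)]));
    let inner_b = Layer::extend(Some(Rc::clone(&global)), HashMap::from([(Rc::from("y"), 3)]));
    assert_eq!(inner_a.lookup("x"), Some(1));
    assert_eq!(inner_a.lookup("y"), Some(2));
    assert_eq!(inner_b.lookup("y"), Some(3));
    // The parent layer is shared, not duplicated:
    assert_eq!(Rc::strong_count(&global), 3);
}
```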
go-jsonnet:
- Go's tracing GC: All values are GC-managed
- Write barriers: Every pointer write involves GC bookkeeping
- GC pauses: Periodic stop-the-world or concurrent marking
- Thunk caching: `cachedThunk` holds onto `env` until evaluated (thunks.go:52-61), then sets it to nil (thunks.go:83)
jrsonnet:
- Cycle-collecting reference counting via `jrsonnet-gcmodule` (similar to Python's approach):
  - `Cc<T>`: Cycle-collected smart pointer for cyclic structures
  - `Trace` trait: Manual tracing implementation for cycle detection
  - Deterministic: Values are freed immediately when the refcount hits zero (except cycles)
Jsonnet creates MANY short-lived thunks. In Go:
- Each thunk is a GC-tracked object
- GC must scan them all to find live references
- Memory pressure triggers GC cycles
In jrsonnet:
- Short-lived thunks are freed immediately via refcount
- Only cyclic structures need cycle collection
- Less GC overhead, more predictable latency
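The deterministic-cleanup point can be demonstrated with plain `Rc` (note that `Rc` alone cannot reclaim cycles; `jrsonnet-gcmodule` adds a cycle collector on top of this model): the destructor runs the instant the last handle is dropped, with no tracing pass.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Illustrative stand-in for a thunk; records when it is freed.
struct Thunk<'a> {
    freed: &'a Cell<bool>,
}

impl Drop for Thunk<'_> {
    fn drop(&mut self) {
        // Runs deterministically when the refcount hits zero.
        self.freed.set(true);
    }
}

fn main() {
    let freed = Cell::new(false);
    let thunk = Rc::new(Thunk { freed: &freed });
    let alias = Rc::clone(&thunk);
    drop(thunk);
    assert!(!freed.get()); // one handle is still live
    drop(alias);
    assert!(freed.get()); // freed immediately; no GC cycle needed
}
```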
go-jsonnet:

```go
// thunks.go:52-61
type cachedThunk struct {
	env     *environment
	body    ast.Node
	content value // cached result
	err     error
}
```

- Standard lazy thunk with a pointer to the AST
jrsonnet:

```rust
// val.rs:56-61
enum ThunkInner<T: Trace> {
    Computed(T),
    Errored(Error),
    Waiting(TraceBox<dyn ThunkValue<Output = T>>),
    Pending,
}
```

- Same structure, but with a `Pending` state for infinite-recursion detection
- The `TraceBox` allows trait objects for the thunk computation
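A runnable sketch of the `Pending` trick, simplified to `i64` values and `String` errors (the real `ThunkInner` is generic and uses `TraceBox`): forcing a thunk first swaps in `Pending`, so a re-entrant force, meaning a thunk that depends on its own value, is reported as an error instead of looping forever.

```rust
use std::cell::RefCell;

// Illustrative, simplified thunk state machine.
enum ThunkInner {
    Computed(i64),
    Errored(String),
    Waiting(Box<dyn FnOnce() -> Result<i64, String>>),
    // Set while the body is running; hitting it again means the
    // thunk depends on its own value (infinite recursion).
    Pending,
}

struct Thunk(RefCell<ThunkInner>);

impl Thunk {
    fn new(f: impl FnOnce() -> Result<i64, String> + 'static) -> Self {
        Thunk(RefCell::new(ThunkInner::Waiting(Box::new(f))))
    }
    fn force(&self) -> Result<i64, String> {
        // Swap Pending in before evaluating the body.
        let state = self.0.replace(ThunkInner::Pending);
        let result = match state {
            ThunkInner::Computed(v) => Ok(v),
            ThunkInner::Errored(e) => Err(e),
            ThunkInner::Waiting(f) => f(),
            ThunkInner::Pending => Err("infinite recursion detected".into()),
        };
        // Cache the outcome so the body runs at most once.
        self.0.replace(match &result {
            Ok(v) => ThunkInner::Computed(*v),
            Err(e) => ThunkInner::Errored(e.clone()),
        });
        result
    }
}

fn main() {
    let t = Thunk::new(|| Ok(40 + 2));
    assert_eq!(t.force(), Ok(42));
    assert_eq!(t.force(), Ok(42)); // cached; the closure ran once
}
```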
| Factor | Impact | Explanation |
|---|---|---|
| String interning | HIGH | O(1) equality/hashing vs O(n) character comparison |
| Specialized arrays | HIGH | O(1) slice/map/concat vs O(n) copies |
| Fixed-size Val enum | MEDIUM | Cache-friendly, no virtual dispatch |
| Reference counting | MEDIUM | Immediate cleanup, less GC pressure |
| Interned field keys | MEDIUM | Object field lookups are pointer comparisons |
| Layered contexts | LOW-MEDIUM | Less copying of binding maps |
The GC impact is real but secondary. The dominant factors are:
- String interning: transforms O(n) string operations into O(1)
- Lazy array views: avoid copying in common operations like `std.map` and slicing
- Enum-based values: better memory layout and no virtual dispatch
These are implementation choices that could theoretically be added to go-jsonnet. The language (Rust vs Go) makes some of these easier (enums, zero-cost abstractions), but the core insight is algorithmic, not linguistic.
Portable to go-jsonnet:
- String interning for field names (biggest win)
- Lazy array views instead of eager slice copying
- More aggressive field caching with composite keys
Rust-specific advantages:
- Sum types (enums) for a fixed-size value representation (Go lacks these)
- Zero-cost abstractions for specialized array types (harder in Go)
- Deterministic destruction (Go's GC model doesn't support this)