Performance Considerations

This chapter covers performance considerations in Sipha’s design.

Memory Efficiency

Green trees are designed for sharing:

Arc-based: Nodes use Arc for reference counting
Shared subtrees: Unchanged subtrees are shared across edits
Arena allocation: Nodes can be arena-allocated for efficiency

Compact Representation

Green trees use compact representations:

Small nodes: Minimal overhead per node
Compact strings: Use CompactString for short tokens
Efficient children: Optimized child storage

Parsing Performance

DFA-Based Lexing

Lexer uses DFA for O(n) tokenization:

Compiled DFA: DFA is compiled once
Fast matching: O(n) tokenization time
Efficient patterns: Character classes are efficient

Table-Based Parsing

Parsers use tables for fast lookups:

Parse tables: Pre-computed parse tables
Fast lookups: O(1) table lookups
Compact tables: Optimized table representation

Incremental Performance

Node Reuse

Incremental parsing reuses nodes:

Budget-based: Limit reuse candidates
Position indexing: Fast node lookup
Smart invalidation: Efficient cache invalidation

Cache Management

Parse cache is optimized:

Version-based: Efficient version management
Automatic eviction: Old versions are evicted
Interned strings: Rule names are interned

Optimization Tips

Grammar Optimization

Factor patterns: Extract common subexpressions
Minimize nesting: Flatten deeply nested structures
Use appropriate lookahead: Don’t use more than needed

Lexer Optimization

Order patterns: Most specific patterns first
Use character classes: Prefer over regex
Keyword trie: Keywords use efficient trie

Parser Optimization

Choose right backend: Use appropriate backend
Enable cache: Always provide grammar for cache
Adjust budget: Tune reuse budget for use case

Benchmarking

Measuring Performance

use std::time::Instant;

// Assuming parser, tokens, and entry are already defined
let start = Instant::now();
let result = parser.parse(&tokens, entry);
let duration = start.elapsed();

Profiling

Use profilers to find bottlenecks:

perf: Linux profiling
valgrind: Memory and call profiling
cargo-flamegraph: Flamegraph generation

Best Practices

Profile first: Profile before optimizing
Measure changes: Benchmark before and after
Use incremental: Always use for interactive apps
Optimize grammar: Factor and simplify
Choose right backend: Use appropriate backend

Next Steps

See Performance for optimization tips
Check Design Principles for design rationale