If only ( m ) out of ( n ) possible leaves are filled, a sparse Merkle tree stores only the non-empty subtrees. It is represented as a binary trie of depth ( k ) in which each empty subtree is replaced by an empty marker.
The proof size remains ( O(\log n) ), while pruning empty paths reduces storage.
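The pruning idea can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes SHA-256 as ( H ), uses the hash of the empty string as the empty-leaf marker, and omits domain separation between leaves and internal nodes. The names `empty_hashes` and `sparse_root` are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the abstract hash function H.
    return hashlib.sha256(data).digest()

def empty_hashes(depth: int) -> list[bytes]:
    """empty[d] is the root hash of an all-empty subtree of height d."""
    hashes = [H(b"")]  # assumption: the empty-leaf marker is H of the empty string
    for _ in range(depth):
        hashes.append(H(hashes[-1] + hashes[-1]))
    return hashes

def sparse_root(leaves: dict[int, bytes], depth: int) -> tuple[bytes, int]:
    """Return (root hash, number of non-empty nodes actually computed).

    `leaves` maps a leaf index in [0, 2**depth) to its data block.
    Only nodes with at least one filled leaf below them are touched.
    """
    empty = empty_hashes(depth)
    level = {i: H(v) for i, v in leaves.items()}
    stored = len(level)
    for d in range(depth):
        parents = {}
        for idx in {i >> 1 for i in level}:
            # Missing children are replaced by the precomputed empty marker.
            left = level.get(2 * idx, empty[d])
            right = level.get(2 * idx + 1, empty[d])
            parents[idx] = H(left + right)
        level = parents
        stored += len(level)
    return level.get(0, empty[depth]), stored
```

With ( m ) filled leaves and depth ( k ), the sketch touches at most ( m(k+1) ) nodes, regardless of how large ( n = 2^k ) is.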
A Merkle tree is a complete binary tree (though generalizations exist) where:
Let ( n ) be the number of data blocks (leaf count). For simplicity, assume ( n = 2^k ) for some ( k \in \mathbb{N} ). In practice, incomplete trees are handled by duplicating the last leaf or using balanced representations.
Given:
The verifier recomputes: [ h_0 = H(L_i) ] For ( j = 0 ) to ( k-1 ): [ h_{j+1} = H( \text{order}(h_j, S_j) ) ] where ( \text{order}(a,b) = a \parallel b ) if the node carrying ( h_j ) is a left child, else ( b \parallel a ).
Finally, check ( h_k \stackrel{?}{=} R_{\text{known}} ).
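The recomputation above can be sketched as follows. This is an illustration under stated assumptions: SHA-256 plays the role of ( H ), the siblings ( S_0, \dots, S_{k-1} ) are passed bottom-up as a list, and the left/right decision at level ( j ) is read from bit ( j ) of the leaf index ( i ). The function name `verify` and its signature are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the abstract hash function H.
    return hashlib.sha256(data).digest()

def verify(leaf: bytes, index: int, siblings: list[bytes], root: bytes) -> bool:
    """Recompute h_0 .. h_k from the leaf and compare h_k with the known root."""
    h = H(leaf)                      # h_0 = H(L_i)
    for s in siblings:               # S_0 .. S_{k-1}, ordered from leaf to root
        if index % 2 == 0:
            h = H(h + s)             # current node is a left child: h_j || S_j
        else:
            h = H(s + h)             # current node is a right child: S_j || h_j
        index //= 2                  # move to the parent's position
    return h == root
```

For a tree with four leaves, a proof for leaf 2 consists of the hash of leaf 3 followed by the hash of the left subtree, so the verifier performs ( k + 1 = 3 ) hash evaluations.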
Merkle trees assume a static data set or require rebuilding on updates. For dynamic data, Merkle hash trees can be extended to authenticated dictionaries with ( O(\log n) ) update and proof costs, but this requires balancing (e.g., using Merkle AVL trees). The mathematical trade-off is between update flexibility and proof optimality: no known hash-based structure of this kind achieves ( o(\log n) ) for both without relaxing security assumptions.
For append-only logs without fixed ( n ), Merkle Mountain Ranges (MMRs) allow dynamic insertion with ( O(\log n) ) proof updates. The structure is a set of perfect binary trees (peaks).
Mathematical invariant: For total size ( n ), the binary representation of ( n ) determines the peaks. If ( n = \sum_{j=1}^{t} 2^{k_j} ) (binary expansion), there are ( t ) peaks.
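The invariant is easy to check numerically. The sketch below reads the peak heights ( k_j ) directly from the set bits of ( n ); the function name `mmr_peak_heights` is hypothetical, and the ordering (largest peak first) is a choice made for this example.

```python
def mmr_peak_heights(n: int) -> list[int]:
    """Heights k_j of the perfect trees (peaks) for n leaves, largest first.

    Each set bit k of n contributes one peak holding 2**k leaves.
    """
    return [k for k in range(n.bit_length() - 1, -1, -1) if (n >> k) & 1]
```

For example, ( n = 11 = 2^3 + 2^1 + 2^0 ) gives three peaks holding 8, 2 and 1 leaves.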
A Merkle tree of ( N ) leaves has height ( \lceil \log_2 N \rceil ). The verification path length grows as ( O(\log N) ), which is a classic result in asymptotic analysis: ( \lim_{N \to \infty} \frac{\text{path length}}{\log_2 N} = 1 ). This convergence is a direct application of limits from real analysis.
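A small numeric check makes the limit concrete. The sketch assumes the path length equals the tree height ( \lceil \log_2 N \rceil ) and computes it with integer arithmetic; `tree_height` and `path_ratio` are hypothetical helper names.

```python
import math

def tree_height(n_leaves: int) -> int:
    """ceil(log2(N)) for N >= 1, computed exactly with integer arithmetic."""
    if n_leaves < 1:
        raise ValueError("a tree needs at least one leaf")
    return (n_leaves - 1).bit_length()

def path_ratio(n_leaves: int) -> float:
    """Ratio of path length to log2(N); requires N >= 2."""
    return tree_height(n_leaves) / math.log2(n_leaves)
```

The worst case for the ratio is ( N = 2^k + 1 ), where the height is ( k + 1 ) and the ratio is slightly below ( (k+1)/k ), which tends to 1 as ( k ) grows.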
To build a tree from scratch:
Total hash operations = ( 2n - 1 ): ( n ) leaf hashes plus ( n - 1 ) internal-node hashes.
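The count can be confirmed with a short bottom-up construction. This is a sketch under the simplifying assumption ( n = 2^k ), with SHA-256 as ( H ) and no leaf/internal domain separation; the function name `build` and its return shape are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the abstract hash function H.
    return hashlib.sha256(data).digest()

def build(blocks: list[bytes]) -> tuple[list[list[bytes]], int]:
    """Build the tree bottom-up; return (levels, number of hash calls).

    levels[0] holds the leaf hashes and levels[-1] holds the single root.
    """
    n = len(blocks)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes n is a power of two")
    level = [H(b) for b in blocks]   # n leaf hashes
    count = n
    levels = [level]
    while len(level) > 1:
        # Pair adjacent nodes; each level halves the node count.
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        count += len(level)
        levels.append(level)
    return levels, count
```

For ( n = 8 ) the levels contain 8, 4, 2 and 1 hashes, giving ( 15 = 2 \cdot 8 - 1 ) hash operations.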
For dynamic updates (changing one leaf), recompute path from leaf to root:
This logarithmic cost ( O(\log n) ) is the core efficiency feature.
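A single-leaf update can be sketched as follows, assuming the tree is kept as a list of levels (leaf hashes first, root last), ( n = 2^k ), and SHA-256 as ( H ). The helpers `build_levels` and `update_leaf` are hypothetical names introduced for this example.

```python
import hashlib

def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the abstract hash function H.
    return hashlib.sha256(data).digest()

def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Full build for n = 2**k blocks; levels[0] = leaf hashes, levels[-1] = [root]."""
    levels = [[H(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def update_leaf(levels: list[list[bytes]], index: int, new_block: bytes) -> int:
    """Replace one leaf in place and rehash only its path to the root.

    Returns the number of hash calls, which is k + 1 for n = 2**k leaves.
    """
    levels[0][index] = H(new_block)
    hashes = 1
    for d in range(1, len(levels)):
        index //= 2                              # position of the parent
        left = levels[d - 1][2 * index]
        right = levels[d - 1][2 * index + 1]
        levels[d][index] = H(left + right)
        hashes += 1
    return hashes
```

For ( n = 8 ) an update costs 4 hash calls, compared with 15 for a full rebuild, and the resulting levels match those of a tree rebuilt from the modified blocks.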