Contents
μpb Arena Fusion and References
μpb generally follows a thread-compatibility model where only operations on a
const pointer may occur concurrently from multiple threads; non-const
operations must not race with each other or operations on const pointers.
Fusion, via upb_Arena_Fuse, binds the lifetime of two arenas together, such
that none are freed until all of the transitively fused arenas reach refcount 0.
This is useful to avoid dangling pointers when a message in one arena references
a message in another - such as when a child message is constructed in isolation
then set onto a parent message - by fusing the parent’s arena with the child’s,
the child’s lifetime is tied to the parent’s without needing to copy it.
Both modifying the refcount and fusing are thread safe; if you want to
conceptually perform allocations concurrently from multiple threads with the
same arena lifetime, one way to achieve that would be to share a const
upb_Arena* parent plus a dedicated arena per thread, and fuse the
thread-specific arena with the parent arena. By contrast, reference counting
cannot help to achieve concurrent allocations on multiple threads, it only helps
in synchronizing lifetimes from multiple things observing the same arena (which
may be a non-const pointer for multiple writers in a single threaded context,
or a const pointer for multiple readers in a multithreaded context).
To be usable everywhere, fusion has a lock-free implementation, documented below.
Data Structure
Each arena has three pointer sized members used to track the relationship between arenas. Together, they implement a hybrid disjoint set forest and doubly-linked list.
// Tagged pointer - tracked as black arrows in diagrams
UPB_ATOMIC(uintptr_t) parent_or_count;
// Linked list - tracked as red arrows in diagrams
UPB_ATOMIC(struct upb_ArenaInternal*) next;
// Linked list - previous tracked as blue arrows in diagrams, tail as dashed
UPB_ATOMIC(uintptr_t) previous_or_tail;
The disjoint set is used to check whether two arenas are already fused and to
provide agreement on a single reference count across all fused arenas. The
linked list is used to implement freeing arenas once the refcount reaches 0, and
to implement upb_Arena_SpaceAllocated.
Two arenas are considered fused when the disjoint set says they are; the linked list is guaranteed to converge before the refcount reaches 0, but may only track a subset of fused arenas while racing with fuse calls.
Finding the root
Each set of fused arenas is uniquely identified by the root node of its tree. To
find the root for a given arena, traverse the parent_or_count pointer until
you reach a node with a count instead of a parent pointer - that’s the root. To
avoid expensive lookups if frequently repeated, this data structure uses path
splitting to halve the distance from the root for every node traversed.
Repeatedly finding the root for a series of nodes that are part of the same
fused set will converge to O(1) quickly. Let’s find the root of D, starting
with:
digraph {
C [label="C (next)"]
D [label="D (current)"]
A -> B [dir=back]
B -> C [dir=back]
C -> D [dir=back]
}
We traverse our parents; each time we pass one we link it to its grandparent, so that future lookups are faster.
digraph {
B [label="B (next)"]
C [label="C (current)"]
A -> B [dir=back]
B -> C [dir=back]
B -> D [dir=back]
}
Then:
digraph {
A [label="A (next)"]
B [label="B (current)"]
A -> B [dir=back]
A -> C [dir=back]
B -> D [dir=back]
}
D and C’s paths got shorter; if we query D’s root again, all subsequent lookups will only require one step.
Fusing
Fusing starts by identifying the roots of each arena to fuse - if they already share a root, there’s nothing to be done. We’ll work through the example of fusing C and D - which is the same as fusing A and B, as they’re the roots.
digraph {
A [label="A (refs=2)"]
B [label="B (refs=3)"]
A -> C [dir=back]
A -> C [weight=0.001, color="red"]
C -> A [weight=0.001, color="blue"]
A -> C [style=dashed]
B -> D [dir=back]
B -> D [weight=0.001, color="red"]
D -> B [weight=0.001, color="blue"]
B -> D [style=dashed]
}
A and B are roots, so they store a refcount in their parent_or_count and the
start of the linked list in their next pointers; C and D are not, so they
store pointers to their parent node in parent_or_count. C and D store NULL
in next as they happen to be the last nodes in their set.
Pass refcount
As soon as the parent changes, all future refcount operations will switch to the new root. To avoid decrementing to 0 while the old root still has active references, we add the lower-addressed node’s refcount to the new root.
digraph {
A [label="A (refs=5)"]
B [label="B (refs=3)"]
A -> C [dir=back]
A -> C [weight=0.001, color="red"]
C -> A [weight=0.001, color="blue"]
A -> C [style=dashed]
B -> D [dir=back]
B -> D [weight=0.001, color="red"]
D -> B [weight=0.001, color="blue"]
B -> D [style=dashed]
}
The total transferred refcount is tracked throughout the operation so it can be fixed up at the end if retries resulted in over-transfer.
Union
Now the arena with a lower address becomes the new root (A in this case), by
compare-and-exchange the higher address arena’s parent_or_count pointer.
digraph {
A [label="A (refs=5)"]
A -> C [dir=back]
A -> C [weight=0.001, color="red"]
C -> A [weight=0.001, color="blue"]
A -> C [style=dashed]
B -> D [dir=back]
B -> D [weight=0.001, color="red"]
D -> B [weight=0.001, color="blue"]
B -> D [style=dashed]
A -> B [dir=back]
}
This atomically eliminates B’s refcount and makes it one of A’s children. Any change to B’s refcount (or fusing it to a different arena with a lower address) will make the swap fail, and the process starts again from the beginning.
Refcount fixups
If B’s refcount changed while performing the union, we increase or decrease A’s
refcount by the difference between the initial refcount we added and B’s final
refcount, observed as part of the compare-and-exchange of B’s parent_or_count.
Fuse linked lists
To fuse the linked lists, we need to make A’s tail point to B. The root node is
always the start of the linked list, so that the list can’t contain cycles. From
A’s tail pointer, we traverse until we hit the end (C in this case), and CAS its
next to B.
digraph {
A [label="A (refs=5)"]
A -> C [dir=back]
A -> C [weight=0.001, color="red"]
C -> A [weight=0.001, color="blue"]
A -> C [style=dashed]
B -> D [dir=back]
B -> D [weight=0.001, color="red"]
D -> B [weight=0.001, color="blue"]
B -> D [style=dashed]
A -> B [dir=back]
C -> B [weight=0.001, color="red"]
}
With this, B is now reachable from the root and the final free operation will be able to see it, traversing A->C->B->D.
Update tail pointer
If we stopped at the previous step, it’s easy to see how fusing many arenas could result in O(n2) behavior, as we’d spend all of our time traversing the linked list from the root to find the tail. To avoid this, we update A’s tail pointer to point to B’s tail - so the next fuse operation will not have to traverse C or B.
digraph {
A [label="A (refs=5)"]
A -> C [dir=back]
A -> C [weight=0.001, color="red"]
C -> A [weight=0.001, color="blue"]
A -> D [style=dashed]
B -> D [dir=back]
B -> D [weight=0.001, color="red"]
D -> B [weight=0.001, color="blue"]
B -> D [style=dashed]
A -> B [dir=back]
C -> B [weight=0.001, color="red"]
}
Update back links
To make it possible to always traverse the list in both directions for
upb_Arena_SpaceAllocated, we need to make the matching back link for our
doubly-linked list. Since we linked C to B, we need to link B to C via B’s
previous_or_tail pointer.
digraph {
A [label="A (refs=5)"]
A -> C [dir=back]
A -> C [weight=0.001, color="red"]
C -> A [weight=0.001, color="blue"]
A -> D [style=dashed]
B -> D [dir=back]
B -> D [weight=0.001, color="red"]
D -> B [weight=0.001, color="blue"]
B -> C [weight=0.001, color="blue"]
A -> B [dir=back]
C -> B [weight=0.001, color="red"]
}
One-Way References
Fusion creates a bi-directional lifetime dependency between arenas: fused arenas will have the same lifetime. In some cases, only a one-way dependency is needed. If messages in arena A contain pointers to messages in arena B but not the reverse, then arena B only needs to live at least as long as arena A.
upb_Arena_RefArena(A, B) establishes such a one-way dependency by having arena
A increment B’s refcount. When arena A is freed, it releases its reference
on B. This is implemented by allocating a special block in A that holds a
pointer to B; this block is processed during upb_Arena_Free(A). Care is
taken to ensure that these blocks are processed before the memory blocks
containing them are freed from the underlying allocator.
Unlike fusion, upb_Arena_RefArena(A, B) is not thread-safe when called
concurrently with other operations on arena A, but it is thread-safe on B.
It is an error to create cycles of references (e.g., upb_Arena_RefArena(A, B);
upb_Arena_RefArena(B, A)), or to create a reference between arenas that are
fused. Since fusion creates a bi-directional dependency, fusing arenas A and B
and calling upb_Arena_RefArena(B, A) would create a cycle A <-> B -> A. In
debug builds, upb checks for such cycles, with the implementation documented
below.
Counting allocated space
Traversing the linked list while the arenas are still live can be tricky, as we
can’t necessarily reach our own node via the linked list from the root. Instead,
we scan the linked list in both directions starting from the caller-specified
node. This makes space counting weakly consistent - it’s possible to see that A
and B are fused, but not see A’s space reflected in SpaceAllocated(B) if the
fuse operation that joined them is still in progress. But counting allocated
space is always consistent with itself and any fully complete fuses - nodes are
only appended or prepended to the list, so starting the traversal at the same
point each time guarantees that future scans see a superset of previous ones.
Referenced arenas are not included in the count of allocated space.
Detecting reference cycles
To help prevent memory leaks from uncollectable arenas, upb checks for cycles in
debug builds when creating a reference or fusing arenas. A cycle can be formed
purely by references (eg. A->B->A) or by a combination of references and
fusions (eg. Fuse(A, B), then RefArena(B, A) creates the cycle A<->B->A).
If such cycles were permitted, the arenas involved could never be freed.
After performing a fusion or adding a reference, the cycle detector runs. This check is performed after the operation because cycle detection cannot be performed atomically; if the check was done prior to the fusion or reference, we could miss a case where two concurrent operations resulted in a cycle. Both would check for a possible cycle, find none, and then proceed.
Consider arenas with references A->B->C.
digraph {
rankdir=LR;
A; B; C;
A -> B [label="ref", color=blue];
B -> C [label="ref", color=blue];
}
If we attempt RefArena(C, A), the call will add ref C->A, then check that
C is not reachable from A.
digraph {
rankdir=LR;
A -> B [label="ref", color=blue];
B -> C [label="ref", color=blue];
C -> A [label="ref", color=blue];
}
The traversal will reveal A -> B -> C and trigger an assertion failure.
If instead attempt Fuse(A, C), fusion will occur:
digraph {
rankdir=LR;
node [shape=box, style=rounded];
A -> B [label="ref", color=blue];
B -> C [label="ref", color=blue];
C -> A [label="fuse", dir=both, color=red];
}
The traversal will find C <-> A -> B -> C, and trigger an assertion failure.
Cycle detection algorithm
Cycles are detected with a non-memoizing recursive depth-first search (DFS). A path can traverse fusions bi-directionally and references uni-directionally to find a cycle that contains at least one unidirectional edge. While this is not asymptotically optimal as it may traverse the same set of nodes numerous times, it doesn’t require allocating memory, making it less intrusive as a debug-only check. It also may result in infinite recursion if a cycle is created on one thread while another thread is performing cycle checks, but that would have caused an assertion failure anyway.
Fast check for fusion
If a directional ref was added and from and to are already fused
(upb_Arena_IsFused(from, to) returns true), they are mutually reachable, so a
path containing a directed edge exists and the assertion fails.
Traversing fusion group members
To check all references originating from from’s fusion group, we must visit
every arena fused with from. Because fusion operations can race with this
check, we cannot rely on traversing from the fusion root, which might change.
Instead, like upb_Arena_SpaceAllocated, the algorithm finds members of
from’s fusion group by first traversing backwards from from using
previous_or_tail, then iterates forwards.
Traverse fusion group and DFS on references
From the head of the list segment, we traverse forwards using next to visit
all nodes in from’s fusion group. For each arena X in this group, we iterate
through all of its outgoing references (the list of arenas Y such that
RefArena(X, Y) was called). For each such reference X -> Y:
- If
Y == to, thentois directly referenced, so a path exists. We returntrueand eventually trigger an assertion failure. - If
Y != to, we recurse to continue the depth-first search fromY, finding members of its fusion group and inspecting their outgoing references. If this recursive call returnstrue, thentois reachable viaY, and we returntrueto trigger an assertion failure.
If we explore all arenas in the fusion group and all their transitive references
without finding a path to to, no path exists, and we return false.