Data Types#
The genogrove::data_type namespace contains genomic data type definitions and utilities.
key_type_base Concept#
The key_type_base concept defines the requirements for custom key types used with the grove:
a < b, a > b, a == b – Comparison operators
T::overlaps(a, b) – Static overlap detection returning bool
T::aggregate(a, b) – Static pairwise aggregation returning T
a.to_string() – String representation
All built-in key types (interval, genomic_coordinate, numeric, kmer) satisfy this concept.
interval#
-
class interval#
Genomic interval representing a contiguous region with start and end positions.
This class represents basic genomic intervals without strand information, satisfying the key_type_base concept for use in grove structures. It provides simple range-based semantics for interval storage, overlap detection, and aggregation.
Public Functions
-
inline constexpr interval()#
Default constructor creating an uninitialized interval.
-
inline constexpr interval(size_t start, size_t end)#
Construct an interval with specified start and end positions.
- Parameters:
start – Starting position (0-based, inclusive)
end – Ending position (0-based, inclusive)
- Throws:
std::invalid_argument – if start > end
-
~interval() = default#
-
inline constexpr bool operator<(const interval &other) const#
Less-than comparison based on start position, then end position.
Intervals are ordered first by start position (ascending), then by end position (ascending) if start positions are equal.
- Parameters:
other – Interval to compare against
- Returns:
true if this interval is less than other
-
inline constexpr bool operator>(const interval &other) const#
Greater-than comparison based on start position, then end position.
- Parameters:
other – Interval to compare against
- Returns:
true if this interval is greater than other
-
inline constexpr bool operator==(const interval &other) const#
Equality comparison (both start and end must match).
- Parameters:
other – Interval to compare against
- Returns:
true if start and end positions are both equal
-
std::string to_string() const#
Convert interval to string representation.
Format: “[start,end]” (e.g., “[100,200]”)
Note
Required by key_type_base concept for debugging/display
- Returns:
String representation of the interval
-
inline constexpr size_t get_start() const noexcept#
Get the start position (0-based, inclusive).
- Returns:
Start position
-
inline constexpr void set_range(size_t start, size_t end)#
Set both start and end positions atomically.
- Parameters:
start – Start position (0-based, inclusive)
end – End position (0-based, inclusive)
- Throws:
std::invalid_argument – if start > end
-
inline constexpr size_t get_end() const noexcept#
Get the end position (0-based, inclusive).
- Returns:
End position
-
void serialize(std::ostream &os) const#
Serialize the interval to an output stream.
Writes the interval in binary format for persistence.
- Parameters:
os – Output stream to write to
Public Static Functions
-
static inline constexpr bool overlaps(const interval &a, const interval &b)#
Determine if two intervals overlap.
Two intervals overlap if they share any positions in their ranges. Uses the standard range intersection test.
-
static inline constexpr interval aggregate(const interval &a, const interval &b)#
Aggregate two intervals into a bounding interval.
Returns the minimal bounding interval encompassing both inputs.
Note
Required by key_type_base concept for internal node construction
- Parameters:
a – First interval
b – Second interval
- Returns:
Bounding interval with min start and max end
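As an illustration of the overlap and aggregation semantics, here is a small sketch for closed ranges. It assumes the "standard range intersection test" is a.start <= b.end AND b.start <= a.end, as spelled out for genomic_coordinate; closed_span, spans_overlap and bounding_span are hypothetical names, not library API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a closed interval [start, end]; not the library class.
struct closed_span {
    std::size_t start;
    std::size_t end;
};

// Closed ranges intersect when each one starts no later than the other ends.
constexpr bool spans_overlap(const closed_span& a, const closed_span& b) {
    return a.start <= b.end && b.start <= a.end;
}

// Minimal bounding span: smallest start, largest end.
constexpr closed_span bounding_span(const closed_span& a, const closed_span& b) {
    return {std::min(a.start, b.start), std::max(a.end, b.end)};
}

static_assert(spans_overlap({100, 200}, {200, 300}));   // both contain position 200
static_assert(!spans_overlap({100, 199}, {200, 300}));  // adjacent, nothing shared
static_assert(bounding_span({100, 200}, {150, 400}).end == 400);

// Runtime spot checks of the same rules.
inline void check_closed_spans() {
    assert(spans_overlap(closed_span{5, 10}, closed_span{0, 5}));
    assert(bounding_span(closed_span{5, 10}, closed_span{0, 7}).start == 0);
}
```

Because both ends are inclusive, two intervals that merely touch at one position count as overlapping.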
genomic_coordinate#
-
class genomic_coordinate#
Stranded genomic interval representing a region on a specific strand.
This class represents genomic intervals with start/end positions and strand information, satisfying the key_type_base concept for use in grove structures. It extends the basic interval type with strand-awareness, enabling strand-specific queries and operations.
Public Functions
-
inline constexpr genomic_coordinate()#
Default constructor creating an invalid coordinate (strand=’.’, start=0, end=0).
-
inline constexpr genomic_coordinate(char strand, std::size_t start, std::size_t end)#
Construct a genomic coordinate with specified strand and position.
- Parameters:
strand – Strand indicator (‘+’, ‘-’, ‘.’, or ‘*’)
start – Starting position (0-based, inclusive)
end – Ending position (0-based, inclusive)
- Throws:
std::invalid_argument – if strand is not one of ‘+’, ‘-’, ‘.’, ‘*’
std::invalid_argument – if start > end
-
~genomic_coordinate() = default#
-
inline constexpr bool operator<(const genomic_coordinate &other) const#
Less-than comparison using coordinate-first sorting.
Comparison order: start → end → strand (with strand order: * < . < + < -)
- Parameters:
other – Coordinate to compare against
- Returns:
true if this coordinate is less than other
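The coordinate-first ordering can be sketched with a tuple comparison. This is an illustrative reimplementation under the stated order (start, then end, then strand with * < . < + < -); stranded_span, strand_rank and coordinate_less are hypothetical names, and ranking unknown characters last is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Hypothetical stand-in for a stranded closed interval; not the library class.
struct stranded_span {
    char strand;
    std::size_t start;
    std::size_t end;
};

// Rank strands so that '*' < '.' < '+' < '-'.
// Unknown characters sort last (assumption made for this sketch).
constexpr int strand_rank(char s) {
    switch (s) {
        case '*': return 0;
        case '.': return 1;
        case '+': return 2;
        case '-': return 3;
        default:  return 4;
    }
}

// Compare by start, then end, then strand rank.
constexpr bool coordinate_less(const stranded_span& a, const stranded_span& b) {
    return std::make_tuple(a.start, a.end, strand_rank(a.strand)) <
           std::make_tuple(b.start, b.end, strand_rank(b.strand));
}

static_assert(coordinate_less({'-', 100, 200}, {'+', 100, 201}));   // end decides before strand
static_assert(coordinate_less({'+', 100, 200}, {'-', 100, 200}));   // '+' sorts before '-'
static_assert(!coordinate_less({'-', 100, 200}, {'*', 100, 200}));  // '*' sorts first

// Runtime spot check: start position dominates everything else.
inline void check_coordinate_order() {
    assert(coordinate_less(stranded_span{'-', 1, 900}, stranded_span{'*', 2, 3}));
}
```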
-
inline constexpr bool operator>(const genomic_coordinate &other) const#
Greater-than comparison using coordinate-first sorting.
- Parameters:
other – Coordinate to compare against
- Returns:
true if this coordinate is greater than other
-
inline constexpr bool operator==(const genomic_coordinate &other) const#
Equality comparison (all three components must match).
- Parameters:
other – Coordinate to compare against
- Returns:
true if strand, start, and end are all equal
-
std::string to_string() const#
Convert coordinate to string representation.
Format: “strand:start-end” (e.g., “+:100-200”)
Note
Required by key_type_base concept for debugging/display
- Returns:
String representation of the coordinate
-
inline constexpr char get_strand() const noexcept#
Get the strand indicator.
- Returns:
Strand character (‘+’, ‘-’, ‘.’, or ‘*’)
-
inline constexpr std::size_t get_start() const noexcept#
Get the start position (0-based, inclusive).
- Returns:
Start position
-
inline constexpr std::size_t get_end() const noexcept#
Get the end position (0-based, inclusive).
- Returns:
End position
-
inline constexpr void set_strand(char strand)#
Set the strand indicator.
- Parameters:
strand – Strand character (‘+’, ‘-’, ‘.’, or ‘*’)
- Throws:
std::invalid_argument – if strand is not one of ‘+’, ‘-’, ‘.’, ‘*’
-
inline constexpr void set_range(std::size_t start, std::size_t end)#
Set both start and end positions atomically.
- Parameters:
start – Start position (0-based, inclusive)
end – End position (0-based, inclusive)
- Throws:
std::invalid_argument – if start > end
-
void serialize(std::ostream &os) const#
Serialize the genomic coordinate to an output stream.
Writes the coordinate in binary format for persistence.
- Parameters:
os – Output stream to write to
Public Static Functions
-
static inline constexpr bool overlaps(const genomic_coordinate &a, const genomic_coordinate &b)#
Determine if two genomic coordinates overlap.
Overlap requires both spatial overlap AND strand compatibility:
Coordinates overlap if: a.start <= b.end AND b.start <= a.end
Strands must match exactly, EXCEPT wildcard ‘*’ matches any strand
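The two conditions combine as a simple conjunction. The sketch below follows the rule as written, so '.' is treated as an ordinary strand value and only '*' acts as a wildcard; stranded_span, strands_compatible and coordinates_overlap are hypothetical names rather than library API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a stranded closed interval; not the library class.
struct stranded_span {
    char strand;
    std::size_t start;
    std::size_t end;
};

// Strands are compatible when equal, or when either side is the wildcard '*'.
constexpr bool strands_compatible(char a, char b) {
    return a == b || a == '*' || b == '*';
}

// Spatial overlap on closed ranges AND strand compatibility.
constexpr bool coordinates_overlap(const stranded_span& a, const stranded_span& b) {
    return a.start <= b.end && b.start <= a.end &&
           strands_compatible(a.strand, b.strand);
}

static_assert(coordinates_overlap({'+', 100, 200}, {'+', 150, 250}));
static_assert(!coordinates_overlap({'+', 100, 200}, {'-', 150, 250}));  // opposite strands
static_assert(coordinates_overlap({'*', 100, 200}, {'-', 150, 250}));   // wildcard query
static_assert(!coordinates_overlap({'*', 100, 200}, {'-', 300, 400}));  // no spatial overlap

// Runtime spot check: '.' only matches '.' under the literal reading of the rule.
inline void check_strand_overlap() {
    assert(!coordinates_overlap(stranded_span{'.', 1, 9}, stranded_span{'+', 1, 9}));
}
```

Querying with strand '*' therefore behaves like an unstranded interval query.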
-
static inline constexpr genomic_coordinate aggregate(const genomic_coordinate &a, const genomic_coordinate &b)#
Aggregate two coordinates into a bounding coordinate.
Returns the minimal bounding coordinate encompassing both inputs:
Start: minimum start position
End: maximum end position
Strand: ‘*’ (wildcard) if strands differ, otherwise common strand
Note
Required by key_type_base concept for internal node construction
- Parameters:
a – First coordinate
b – Second coordinate
- Returns:
Bounding coordinate with min start, max end, and merged strand
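A short sketch of the merge rule described above, using a hypothetical stranded_span struct and a hypothetical bounding_coordinate function rather than the library class:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a stranded closed interval; not the library class.
struct stranded_span {
    char strand;
    std::size_t start;
    std::size_t end;
};

// Min start, max end, and '*' whenever the two strands disagree.
constexpr stranded_span bounding_coordinate(const stranded_span& a, const stranded_span& b) {
    return {a.strand == b.strand ? a.strand : '*',
            std::min(a.start, b.start),
            std::max(a.end, b.end)};
}

static_assert(bounding_coordinate({'+', 100, 200}, {'+', 150, 400}).strand == '+');
static_assert(bounding_coordinate({'+', 100, 200}, {'-', 150, 400}).strand == '*');
static_assert(bounding_coordinate({'+', 100, 200}, {'-', 150, 400}).end == 400);

// Runtime spot check of the bounding start.
inline void check_bounding_coordinate() {
    assert(bounding_coordinate(stranded_span{'-', 30, 40}, stranded_span{'-', 10, 20}).start == 10);
}
```

Falling back to the wildcard keeps an internal node compatible with queries on either strand of its children.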
-
static genomic_coordinate deserialize(std::istream &is)#
Deserialize a genomic coordinate from an input stream.
Reads the coordinate from binary format and returns it.
- Parameters:
is – Input stream to read from
- Returns:
Deserialized genomic coordinate
Public Static Attributes
-
static constexpr bool is_interval = true#
Indicates this is an interval type (enables interval-aware operations).
key#
-
template<key_type_base key_t, typename data_t = void>
class key#
Wrapper class combining a key value with optional associated data.
This template class wraps a key_t (e.g., interval, genomic_coordinate, numeric) with an optional data_t payload. It serves as the fundamental storage unit in grove structures, enabling efficient indexing while maintaining arbitrary metadata.
Public Functions
-
inline key()#
Default constructor initializing both value and data with defaults.
Only available when both key_t and data_t are default-initializable.
Note
Constrained by requires clause - will not compile if types are not default-initializable
-
inline key()
Default constructor for the data-less form (data_t = void).
The requires clause above needs data_t to be default-initializable, which void is not, so key<T, void> needs its own default constructor.
data is std::monostate here and is value-initialized implicitly.
-
inline explicit key(key_t kvalue)#
Construct a key with the specified key value.
When data_t is void: Creates a key with only the value. When data_t is non-void: Creates a key with value and default-constructed data.
- Parameters:
kvalue – The key value (moved into the key)
-
template<typename D = data_t>
inline key(key_t key_value, D &&data_value)#
Construct a key with both key value and associated data.
Only available when data_t is not void (enforced by requires clause). Uses perfect forwarding to efficiently transfer the data value.
Note
This constructor only exists when data_t != void
- Template Parameters:
D – data_t type (deduced, should match data_t)
- Parameters:
key_value – The key value (moved into the key)
data_value – The associated data (forwarded)
-
key &operator=(const key&) = default#
Copy assignment operator (defaulted).
- Returns:
Reference to this key
-
key &operator=(key&&) noexcept = default#
Move assignment operator (defaulted, noexcept).
- Returns:
Reference to this key
-
~key() = default#
Destructor (defaulted).
-
inline const key_t &get_value() const noexcept#
Get the key value (const reference).
- Returns:
Const reference to the underlying key_t value
-
inline void set_value(key_t new_value)#
Set the key value.
- Parameters:
new_value – The new key value (moved)
-
template<typename D = data_t>
inline const D &get_data() const noexcept#
Get the associated data (const reference).
Only available when data_t is not void (enforced by requires clause). Provides read-only access to the associated data.
Note
This method only exists when data_t != void
Note
Returns by const reference for efficiency
- Template Parameters:
D – data_t type (deduced, should match data_t)
- Returns:
Const reference to the associated data
-
template<typename D = data_t>
inline D &get_data() noexcept#
Get mutable reference to associated data.
Only available when data_t is not void (enforced by requires clause). Allows in-place modification of the data without copying.
Note
This method only exists when data_t != void
Note
Useful for efficient in-place updates
- Template Parameters:
D – data_t type (deduced, should match data_t)
- Returns:
Mutable reference to the associated data
-
template<typename D = data_t>
inline void set_data(D new_data)#
Set the associated data.
Only available when data_t is not void (enforced by requires clause).
Note
This method only exists when data_t != void
- Template Parameters:
D – data_t type (deduced, should match data_t)
- Parameters:
new_data – The new data value (moved)
-
inline constexpr bool has_data() const noexcept#
Check if this key has associated data.
Compile-time constant determined by template parameter.
- Returns:
true if data_t is not void, false otherwise
-
inline std::string to_string() const#
Convert key to string representation.
Delegates to the key_t’s to_string() method. Does not include data in the string representation.
- Returns:
String representation of the key value
-
inline void serialize(std::ostream &os) const#
Serialize the key to an output stream.
Writes the key in binary format for persistence:
Always serializes the key_t value
Serializes data_t only when non-void
Uses type-specific serialization_traits for both key and data.
Note
Serialization format depends on serialization_traits specializations
- Parameters:
os – Output stream to write to
-
inline bool operator==(const key &other) const#
Comparison operators.
Comparisons are delegated to the wrapped key_t value; data_t is treated as decoration and ignored. This matches the B+ tree’s notion of identity (the tree orders by value) and frees data_t from needing any comparison operators of its own. < and > are unconditionally available because the key_type_base concept already requires them on key_t.
- Parameters:
other – Key to compare against
Public Static Functions
-
static inline key deserialize(std::istream &is)#
Deserialize a key from an input stream.
Reads the key from binary format and reconstructs it:
Always deserializes the key_t value
Deserializes data_t only when non-void
Note
Must match the format written by serialize()
Note
Static method - creates and returns a new key
- Parameters:
is – Input stream to read from
- Returns:
Deserialized key object
query_result#
-
template<key_type_base key_t, typename data_t = void>
class query_result#
Container for query results holding matching keys and the original query.
This class stores the results of intersection/search operations performed on grove structures. It maintains both the original query and a collection of pointers to all keys that matched (overlapped with) the query.
Public Functions
-
inline explicit query_result(key_t query)#
Construct a query result with the specified query.
Initializes an empty result set for the given query. Keys are added later via add_key() as the search traverses the grove structure.
- Parameters:
query – The query used for intersection (stored by value)
-
inline const key_t &get_query() const noexcept#
Get the original query that produced this result.
Returns a const reference to the query that was used to search the grove.
- Returns:
Const reference to the query value
-
inline const std::vector<key<key_t, data_t>*> &get_keys() const#
Get all matching keys found by the query.
Returns a const reference to the vector of pointers to keys that overlapped with the query. The pointers reference keys owned by the grove and remain valid as long as the grove exists and the keys are not removed.
Note
Pointers remain valid as long as the grove is not modified
Note
Keys are stored in the order they were found during tree traversal
- Returns:
Const reference to vector of pointers to matching keys (may be empty)
-
inline void add_key(key<key_t, data_t> *key)#
Add a matching key to the result set.
Appends a pointer to a matching key to the internal collection. This method is typically called internally by grove search operations as they traverse the tree structure.
Note
This is primarily an internal method used during grove traversal
Note
No ownership is transferred; the pointer is stored as-is
- Parameters:
key – Pointer to a matching key (must not be nullptr)
flanking_query_result#
-
template<key_type_base key_t, typename data_t = void>
class flanking_query_result#
Result of a flanking-key query: the predecessor and successor of a query in the grove’s sort order, restricted to keys that do not overlap the query.
Returned by grove::flanking(). Either field may be null:
predecessor == nullptr if no key K satisfies K < query AND !overlaps(K, query)
successor == nullptr if no key K satisfies K > query AND !overlaps(K, query)
For interval-like keys (interval, genomic_coordinate), this corresponds to the key with the smallest gap distance to the query on each side. For scalar key types (numeric, kmer), it is the closest key by sort order on either side, excluding any key that satisfies overlaps() with the query.
Distance to a returned key is type-specific and computed by the caller from the key values (e.g., query.start - predecessor.end - 1 for closed-coord intervals; query.value - predecessor.value for numeric).
Public Functions
-
flanking_query_result() = default#
Default-construct with both flanking keys null.
-
inline key<key_t, data_t> *get_predecessor() const noexcept#
Get the predecessor: largest non-overlapping key less than the query.
- Returns:
Pointer to the predecessor key, or nullptr if none exists
-
inline key<key_t, data_t> *get_successor() const noexcept#
Get the successor: smallest non-overlapping key greater than the query.
- Returns:
Pointer to the successor key, or nullptr if none exists
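The distance formulas mentioned in the class description can be worked through with a small sketch. The predecessor formula is the one given above; the successor formula is its mirror image and is an assumption made here by symmetry. closed_span and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a closed interval [start, end]; not the library class.
struct closed_span {
    std::size_t start;
    std::size_t end;
};

// Positions strictly between a predecessor and the query.
// Precondition: pred.end < query.start (the predecessor does not overlap).
constexpr std::size_t gap_before(const closed_span& query, const closed_span& pred) {
    return query.start - pred.end - 1;
}

// Mirror image for the successor (assumed by symmetry).
// Precondition: succ.start > query.end.
constexpr std::size_t gap_after(const closed_span& query, const closed_span& succ) {
    return succ.start - query.end - 1;
}

// For point-like numeric keys the distance is a plain difference.
constexpr int numeric_distance(int query, int pred) { return query - pred; }

static_assert(gap_before({100, 200}, {50, 90}) == 9);   // positions 91..99
static_assert(gap_after({100, 200}, {201, 300}) == 0);  // directly adjacent
static_assert(numeric_distance(42, 40) == 2);

// Runtime spot check of the successor gap.
inline void check_gaps() {
    assert(gap_after(closed_span{100, 200}, closed_span{250, 300}) == 49);
}
```

The preconditions matter because the positions are unsigned: calling these helpers on an overlapping key would wrap around instead of producing a negative distance.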
numeric#
-
class numeric#
Simple numeric (integer) key type for basic B+ tree operations.
This class wraps an integer value and satisfies the key_type_base concept, enabling use in grove structures as a simple ordered key without range semantics. Unlike interval types that represent ranges, numeric represents a single point value.
Public Functions
-
inline constexpr numeric()#
Default constructor initializing to INT_MIN.
Uses the minimum representable value as a sentinel so that max-based aggregation in internal nodes works correctly: any real value will be greater than the default.
Warning
A default-constructed numeric and a numeric holding the real value INT_MIN are indistinguishable, so numeric{} compares equal to (and overlaps) numeric{INT_MIN}. Don’t rely on the default being distinct from stored data.
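The sentinel behaviour and its caveat can be reproduced with a toy type. This is a sketch only: point_value and max_aggregate are hypothetical names, and std::numeric_limits<int>::min() is used as the equivalent of INT_MIN.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical point-like key whose default is the smallest int.
struct point_value {
    int value = std::numeric_limits<int>::min();  // sentinel default
};

// Max-based aggregation, as used for internal nodes.
constexpr point_value max_aggregate(const point_value& a, const point_value& b) {
    return {std::max(a.value, b.value)};
}

// Any real value wins against the default sentinel.
static_assert(max_aggregate(point_value{}, point_value{-7}).value == -7);

// The caveat: a stored INT_MIN cannot be told apart from the default.
static_assert(point_value{}.value ==
              point_value{std::numeric_limits<int>::min()}.value);

// Runtime spot check with two real values.
inline void check_sentinel() {
    assert(max_aggregate(point_value{3}, point_value{11}).value == 11);
}
```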
-
inline explicit constexpr numeric(int value)#
Construct a numeric with the specified integer value.
- Parameters:
value – Integer value to wrap
-
~numeric() = default#
-
inline constexpr bool operator<(const numeric &other) const#
Less-than comparison based on integer value.
- Parameters:
other – Numeric to compare against
- Returns:
true if this value is less than other’s value
-
inline constexpr bool operator>(const numeric &other) const#
Greater-than comparison based on integer value.
- Parameters:
other – Numeric to compare against
- Returns:
true if this value is greater than other’s value
-
inline constexpr bool operator==(const numeric &other) const#
Equality comparison based on integer value.
- Parameters:
other – Numeric to compare against
- Returns:
true if values are equal
-
std::string to_string() const#
Convert the numeric value to string representation.
Format: Simple integer string (e.g., “42”, “-7”)
Note
Required by key_type_base concept for debugging/display
- Returns:
String representation of the value
-
inline constexpr int get_value() const noexcept#
Get the integer value.
- Returns:
The wrapped integer value
-
inline constexpr void set_value(int value)#
Set the integer value.
- Parameters:
value – New integer value
-
void serialize(std::ostream &os) const#
Serialize the numeric to an output stream.
Writes the value in binary format for persistence.
- Parameters:
os – Output stream to write to
Public Static Functions
-
static inline constexpr bool overlaps(const numeric &a, const numeric &b)#
Determine if two numeric values overlap.
For point values, overlap occurs only when they are exactly equal. This differs from interval overlap which uses range intersection.
-
static inline constexpr numeric aggregate(const numeric &a, const numeric &b)#
Aggregate two numerics by returning the maximum.
Internal nodes store the maximum value in their subtree, allowing search operations to correctly traverse to child nodes.
Note
Required by key_type_base concept for internal node construction
- Parameters:
a – First numeric
b – Second numeric
- Returns:
The greater of the two values
kmer#
-
class kmer#
K-mer key type for sequence-based B+ tree operations.
This class represents a k-mer (substring of length k from a DNA sequence) using a compact 2-bit encoding. It satisfies the key_type_base concept, enabling use in grove structures for k-mer indexing and membership queries.
Public Functions
-
inline constexpr kmer()#
Default constructor creating an empty k-mer (k=0).
-
explicit kmer(std::string_view sequence)#
Construct a k-mer from a DNA sequence string.
Converts the sequence to 2-bit encoding. Only A, C, G, T (case-insensitive) are valid characters.
- Parameters:
sequence – DNA sequence (must contain only A, C, G, T)
- Throws:
std::invalid_argument – if sequence contains invalid characters
std::invalid_argument – if sequence length exceeds 32
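A 2-bit packing scheme of this kind can be sketched as below. The concrete mapping (A=0, C=1, G=2, T=3) and the bit order (first base in the most significant used bits) are assumptions chosen for illustration, since this section does not state them; base_to_bits, pack_kmer, unpack_kmer and rejects are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>

// Assumed mapping for this sketch: A=0, C=1, G=2, T=3 (case-insensitive).
constexpr std::uint8_t base_to_bits(char base) {
    switch (base) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: throw std::invalid_argument("base must be A, C, G, or T");
    }
}

// Packs up to 32 bases into 64 bits, first base most significant (assumption).
constexpr std::uint64_t pack_kmer(std::string_view seq) {
    if (seq.size() > 32) throw std::invalid_argument("k-mer longer than 32 bases");
    std::uint64_t bits = 0;
    for (char base : seq) bits = (bits << 2) | base_to_bits(base);
    return bits;
}

// Reverses pack_kmer for a k-mer of length k.
inline std::string unpack_kmer(std::uint64_t bits, std::uint8_t k) {
    std::string out(k, 'A');
    for (std::size_t i = k; i-- > 0;) {
        out[i] = "ACGT"[bits & 3u];
        bits >>= 2;
    }
    return out;
}

// True when packing the sequence is refused with std::invalid_argument.
inline bool rejects(std::string_view seq) {
    try {
        (void)pack_kmer(seq);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

static_assert(pack_kmer("ACGT") == 27);              // 00 01 10 11 in binary
static_assert(pack_kmer("AC") < pack_kmer("CA"));    // numeric order follows sequence order
```

Under these assumptions, comparing two encodings of the same length as integers agrees with comparing the sequences alphabetically, and 32 bases is the most that fits in 64 bits.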
-
inline constexpr kmer(uint64_t encoding, uint8_t k)#
Construct a k-mer from a pre-computed encoding.
- Parameters:
encoding – 2-bit encoded k-mer value
k – Length of the k-mer (1-32)
-
~kmer() = default#
-
inline constexpr bool operator<(const kmer &other) const#
Less-than comparison based on encoding value.
K-mers of different lengths are compared by length first. K-mers of equal length are compared by their encoding, which gives lexicographic ordering.
- Parameters:
other – K-mer to compare against
- Returns:
true if this k-mer is less than other
-
inline constexpr bool operator>(const kmer &other) const#
Greater-than comparison based on encoding value.
- Parameters:
other – K-mer to compare against
- Returns:
true if this k-mer is greater than other
-
inline constexpr bool operator==(const kmer &other) const#
Equality comparison (encoding and k must both match).
- Parameters:
other – K-mer to compare against
- Returns:
true if both encoding and k are equal
-
std::string to_string() const#
Convert the k-mer to its DNA sequence string.
Decodes the 2-bit encoding back to A, C, G, T characters.
Note
Required by key_type_base concept for debugging/display
- Returns:
DNA sequence string of length k
-
inline constexpr uint64_t get_encoding() const noexcept#
Get the 2-bit encoding value.
- Returns:
The encoded k-mer as a 64-bit integer
-
inline constexpr uint8_t get_k() const noexcept#
Get the k-mer length.
- Returns:
The value of k (1-32)
-
void serialize(std::ostream &os) const#
Serialize the k-mer to an output stream.
Writes encoding and k in binary format for persistence.
- Parameters:
os – Output stream to write to
Public Static Functions
-
static inline constexpr bool overlaps(const kmer &a, const kmer &b)#
Determine if two k-mers overlap.
For k-mers, overlap occurs only when they are exactly equal (same encoding and same k value).
-
static inline constexpr kmer aggregate(const kmer &a, const kmer &b)#
Aggregate two k-mers by returning the maximum.
Internal nodes store the maximum k-mer in their subtree for proper B+ tree navigation.
Note
Required by key_type_base concept for internal node construction
- Parameters:
a – First k-mer
b – Second k-mer
- Returns:
The greater of the two k-mers
-
static kmer deserialize(std::istream &is)#
Deserialize a k-mer from an input stream.
Reads encoding and k from binary format.
- Parameters:
is – Input stream to read from
- Returns:
Deserialized k-mer
-
static inline constexpr uint8_t encode_base(char base)#
Encode a single nucleotide to its 2-bit representation.
- Parameters:
base – Nucleotide character (A, C, G, T - case insensitive)
- Throws:
std::invalid_argument – if base is not A, C, G, or T
- Returns:
2-bit encoding (0-3)
-
static inline constexpr char decode_base(uint8_t encoding)#
Decode a 2-bit value to its nucleotide character.
- Parameters:
encoding – 2-bit encoding (0-3)
- Returns:
Nucleotide character (A, C, G, or T)
-
static inline constexpr bool is_valid(std::string_view sequence)#
Check if a sequence contains only valid nucleotides.
- Parameters:
sequence – DNA sequence to validate
- Returns:
true if sequence contains only A, C, G, T (case insensitive)
registry#
-
template<registry_value Key, typename Tag = void, typename Payload = Key>
class registry#
Singleton registry that interns values into small integer IDs.
Every distinct key gets one stable ID; calling intern() with the same key always returns the same ID. The point is to collapse many references to the same identity down to a 4-byte ID stored elsewhere, which is useful when the same identity appears thousands of times across grove entries.
First-write-wins on payload. When Payload != Key and a caller calls intern(k, p) against a key that is already present, the existing payload is preserved and the new p is silently dropped. This matches the typical “first source has the canonical record; later sources may carry placeholder fields” pattern (e.g. annotations sorted first, downstream entries reusing the id).
Each (Key, Tag, Payload) triple has its own singleton with an independent ID space. Use the Tag parameter when two unrelated pools share the same value type and must not collide:
    using transcript_registry = registry<std::string, struct transcript_tag>;
    using source_registry = registry<std::string, struct source_tag>;
    transcript_registry::instance().intern("ENST00000001"); // 0 in transcript pool
    source_registry::instance().intern("HAVANA");           // 0 in source pool (separate)
Example (identity is the whole value):
    auto& reg = registry<std::string>::instance();
    uint32_t a = reg.intern("chr1"); // 0 (new)
    uint32_t b = reg.intern("chr1"); // 0 (existing, deduplicated)
    uint32_t c = reg.intern("chr2"); // 1 (new)
    const std::string& s = reg.get(a); // "chr1"
Example (identity is a subset of the payload):
    struct gene_info { std::string gene_name; std::string gene_biotype; };
    using gene_reg = registry<std::string, void, gene_info>;
    auto id1 = gene_reg::instance().intern("ENSG001", {"FOO", "protein_coding"});
    auto id2 = gene_reg::instance().intern("ENSG001", {"placeholder", ""});
    // id1 == id2; the placeholder payload is dropped (first-write-wins).
    const gene_info& g = gene_reg::instance().get(id1); // {"FOO", "protein_coding"}
Note
Thread safety: intern(), find(), clear(), serialize(), and deserialize() are protected by an internal mutex. get(), contains(), size(), empty() are unlocked fast paths. get(id) is safe under concurrent intern() iff the caller obtained id from a prior intern() that happens-before this thread (e.g. via thread join, mutex, atomic publication, queue). size()/empty()/contains() return best-effort snapshots under concurrent writes.
Note
Singleton lifetime: Data persists for program duration. Call reset() in tests to clear state between cases.
- Template Parameters:
Key – The identity type used for deduplication. Must be hashable and equality-comparable.
Tag – Phantom type used only to discriminate singletons. Different Tag arguments produce distinct types with independent ID pools; the default void preserves the original “one singleton per (Key, Payload)” behavior. Tag never appears in the body: no storage, no serialization, no runtime cost.
Payload – The value type stored against each ID. Defaults to Key (the common case: identity is the whole value, like std::string intern pools). When Payload != Key, the registry stores Payload values keyed on Key, which is useful when identity is a subset of a larger record (e.g. gene_id keying a gene_info{ id, name, biotype } blob). No constraint on Payload itself; serialize()/deserialize() additionally require Payload to be readable/writable via serializer<Payload>.
Public Types
-
using id_type = uint32_t#
Type used for registry IDs.
Public Functions
-
~registry() = default#
-
inline id_type intern(const Key &key, const Payload &payload)#
Intern a (key, payload) pair, returning its stable ID.
Note
Idempotent on key: intern(k, _) always returns the same id for k.
Note
Thread-safe.
- Parameters:
key – The identity used to deduplicate.
payload – The value to store under that identity.
- Throws:
std::runtime_error – if the registry has reached maximum capacity.
- Returns:
The ID for key. If key is already interned, returns the existing ID and silently drops payload (first-write-wins); otherwise allocates a new ID and stores payload.
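The first-write-wins behaviour can be illustrated with a stripped-down intern table. This is a sketch under simplifying assumptions, not the library's implementation: tiny_registry is a hypothetical class with no singleton, no locking, and no capacity check.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical single-threaded intern table; not the library's registry.
template <typename Key, typename Payload = Key>
class tiny_registry {
public:
    using id_type = std::uint32_t;

    // Returns the existing id for a known key, or allocates the next id.
    id_type intern(const Key& key, const Payload& payload) {
        auto [it, inserted] =
            ids_.try_emplace(key, static_cast<id_type>(payloads_.size()));
        if (inserted) payloads_.push_back(payload);  // first write stores the payload
        return it->second;                           // later writes only get the id back
    }

    // Lookup without inserting.
    std::optional<id_type> find(const Key& key) const {
        auto it = ids_.find(key);
        if (it == ids_.end()) return std::nullopt;
        return it->second;
    }

    // Throws std::out_of_range for an unknown id.
    const Payload& get(id_type id) const { return payloads_.at(id); }

private:
    std::unordered_map<Key, id_type> ids_;
    std::vector<Payload> payloads_;
};

// Re-interning a known key keeps the original payload and the original id.
inline bool first_write_wins() {
    tiny_registry<std::string, std::string> genes;
    auto id1 = genes.intern("ENSG001", "FOO");
    auto id2 = genes.intern("ENSG001", "placeholder");
    auto id3 = genes.intern("ENSG002", "BAR");
    return id1 == id2 && id3 == id1 + 1 && genes.get(id1) == "FOO" &&
           !genes.find("ENSG999").has_value();
}
```

Because IDs are handed out in insertion order, the payload vector doubles as the id-to-payload lookup.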
-
inline id_type intern(const Key &value)#
Intern a value (single-arg form when key and payload are the same type).
Note
Only available when Key == Payload (the default). For Payload != Key, use the two-arg form.
- Parameters:
value – The value to intern; used as both identity and payload.
- Returns:
The ID for value.
-
inline std::optional<id_type> find(const Key &key) const#
Look up the ID for a key without inserting.
Note
Thread-safe.
- Parameters:
key – The identity to look up.
- Returns:
The ID if key is interned, std::nullopt otherwise.
-
inline const Payload &get(id_type id) const#
Get the payload for a given ID (const access).
Note
Unlocked. Safe under concurrent intern() iff id was obtained from an intern() that happens-before this call.
- Parameters:
id – The ID returned from intern().
- Throws:
std::out_of_range – if id is not a valid ID.
- Returns:
Const reference to the stored payload.
-
inline bool contains(id_type id) const noexcept#
Check whether an ID refers to a valid entry.
Note
Unlocked best-effort read; size may be observed stale under concurrent writes.
- Parameters:
id – The ID to check.
- Returns:
true if valid, false otherwise.
-
inline std::size_t size() const noexcept#
Number of interned entries.
Note
Unlocked best-effort read under concurrent writes.
-
inline bool empty() const noexcept#
Whether the registry has any entries.
Note
Unlocked best-effort read under concurrent writes.
-
inline void clear()#
Clear all interned data.
Note
Primarily intended for testing; use with caution in production.
Note
Thread-safe.
Warning
Invalidates all previously returned IDs.
-
inline void serialize(std::ostream &os) const#
Serialize the registry to an output stream.
Note
Wire format depends on Key == Payload:
When Key == Payload (default): uint64_t count followed by each payload via serializer<Payload>. The lookup map is reconstructed on deserialize() by treating each payload as its own key. This matches the historical format.
When Key != Payload: uint64_t count followed by (key, payload) pairs written in ID order. Both serializer<Key> and serializer<Payload> are required.
Note
Thread-safe (acquires the mutex for a coherent snapshot).
- Parameters:
os – Output stream to write to.
Public Static Functions
-
static inline registry &instance()#
Get the singleton instance for this type.
Note
Uses the Meyers singleton pattern (a function-local static) for thread-safe initialization
- Returns:
Reference to the singleton registry instance
-
static inline void reset()#
Reset the singleton by clearing all data.
Note
Convenience for tests; equivalent to instance().clear().
-
static inline registry &deserialize(std::istream &is)#
Deserialize registry data from an input stream into the singleton.
Note
Replaces existing data on success; all previous IDs become invalid.
Note
Loaded entries keep their original IDs.
Note
Thread-safe.
Note
Strong exception guarantee: if the stream throws or contains truncated data, the singleton is left exactly as it was before the call. The new state is built into local containers and only move-assigned into the singleton after the read loop completes.
- Parameters:
is – Input stream to read from.
- Returns:
Reference to the singleton (now populated with deserialized data).
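The "build locally, then move-assign" approach behind the strong exception guarantee can be sketched as follows. The wire format used here (a uint64_t count, then a uint64_t length plus raw bytes per entry) is a hypothetical format for this example only, and write_all, load_into and truncated_load_keeps_state are illustrative names.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical writer for the sketch's format: count, then (length, bytes) per entry.
inline void write_all(std::ostream& os, const std::vector<std::string>& items) {
    const std::uint64_t n = items.size();
    os.write(reinterpret_cast<const char*>(&n), sizeof n);
    for (const auto& s : items) {
        const std::uint64_t len = s.size();
        os.write(reinterpret_cast<const char*>(&len), sizeof len);
        os.write(s.data(), static_cast<std::streamsize>(len));
    }
}

// Reads into a local container and only replaces target after every read succeeded.
inline void load_into(std::vector<std::string>& target, std::istream& is) {
    std::vector<std::string> staged;
    std::uint64_t n = 0;
    if (!is.read(reinterpret_cast<char*>(&n), sizeof n))
        throw std::runtime_error("truncated count");
    for (std::uint64_t i = 0; i < n; ++i) {
        std::uint64_t len = 0;
        if (!is.read(reinterpret_cast<char*>(&len), sizeof len))
            throw std::runtime_error("truncated length");
        std::string s(len, '\0');
        if (!is.read(s.data(), static_cast<std::streamsize>(len)))
            throw std::runtime_error("truncated entry");
        staged.push_back(std::move(s));
    }
    target = std::move(staged);  // commit point: nothing above touched target
}

// A stream cut short must leave the previous contents untouched.
inline bool truncated_load_keeps_state() {
    std::ostringstream out;
    write_all(out, {"chr1", "chr2"});
    std::string bytes = out.str();
    bytes.resize(bytes.size() - 2);  // simulate truncation
    std::istringstream in(bytes);
    std::vector<std::string> state{"old"};
    try {
        load_into(state, in);
    } catch (const std::runtime_error&) {
        return state == std::vector<std::string>{"old"};
    }
    return false;
}

// A complete stream replaces the previous contents.
inline bool full_load_replaces_state() {
    std::ostringstream out;
    write_all(out, {"chr1", "chr2"});
    std::istringstream in(out.str());
    std::vector<std::string> state{"old"};
    load_into(state, in);
    return state == std::vector<std::string>{"chr1", "chr2"};
}
```

The design choice is that all failure points come before the single move-assignment, so a failed load cannot leave a half-populated container.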
Serialization Utilities#
serialization_traits#
-
template<typename T>
struct serialization_traits#
serializer#
-
template<typename T>
struct serializer#
Trait-based serialization dispatcher.
Dispatches serialization calls based on type capabilities:
If type has member serialize()/static deserialize() → use those
Otherwise → fall back to serialization_traits<T>
- Template Parameters:
T – The type to serialize/deserialize