genetic_structures Module
Core genetic architecture definitions for the simulation.
Overview
The genetic_structures module defines the immutable genetic architecture of the simulation, including chromosomes, loci, species, and genome templates.
Complete Module Reference
natal.genetic_structures
Defines the immutable genetic architecture of the simulation.
This module defines static, model-level genetic elements (loci, chromosomes, species) that serve as the authoritative blueprint for creating and validating genetic entities.
This module is responsible for:
- Representing static, model-level genetic elements (loci, chromosomes, species)
- Storing configuration and rules such as locus order, recombination rates, and chromosome IDs
- Serving as the authoritative blueprint for creating and validating genetic entities
- Optionally tracking bound entities via internal registration mechanisms
Note:
- No runtime dependency on genetic_entities to avoid circular imports
- Modifications to a structure after binding entities are discouraged
SexChromosomeType
Bases: Enum
Sex chromosome type enumeration.
Defines common sex chromosome categories and inheritance constraints used by chromosome-level sex-system logic.
Attributes:
| Name | Type | Description |
|---|---|---|
| AUTOSOME | SexChromosomeType | Autosome not involved in sex determination. |
| X | SexChromosomeType | X chromosome in the XY system; can come from either parent. |
| Y | SexChromosomeType | Y chromosome in the XY system; paternal only. |
| Z | SexChromosomeType | Z chromosome in the ZW system; can come from either parent. |
| W | SexChromosomeType | W chromosome in the ZW system; maternal only. |
sex_system
property
Returns the sex determination system this chromosome belongs to
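As a rough illustration of how an enumeration like this can expose its sex system, here is a minimal sketch. The class `SexChromosomeTypeSketch`, its member values, and the `maternal_only` property are assumptions made for the example; this is not the natal implementation.

```python
from enum import Enum
from typing import Optional


class SexChromosomeTypeSketch(Enum):
    """Hypothetical stand-in for SexChromosomeType (not the natal class)."""

    AUTOSOME = "autosome"  # member values are assumptions
    X = "X"
    Y = "Y"
    Z = "Z"
    W = "W"

    @property
    def sex_system(self) -> Optional[str]:
        # X/Y belong to the XY system, Z/W to the ZW system, autosomes to none.
        if self in (SexChromosomeTypeSketch.X, SexChromosomeTypeSketch.Y):
            return "XY"
        if self in (SexChromosomeTypeSketch.Z, SexChromosomeTypeSketch.W):
            return "ZW"
        return None

    @property
    def paternal_only(self) -> bool:
        # Only Y is described as paternal only.
        return self is SexChromosomeTypeSketch.Y

    @property
    def maternal_only(self) -> bool:
        # Assumed counterpart of paternal_only: only W is maternal only.
        return self is SexChromosomeTypeSketch.W
```

Keeping the inheritance constraints on the enum itself means chromosome-level code can ask the type directly instead of comparing against string literals.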
RegistryBase
Bases: ABC, Generic[T]
Base class for registries.
Provides the common interface for register/unregister operations while delegating storage semantics to subclass hooks.
Attributes:
| Name | Type | Description |
|---|---|---|
| _expected_type | Optional[type[GeneticStructure[E]]] | Runtime type used to validate registry items when provided. |
Source code in src/natal/genetic_structures.py
register
Register one or more items.
unregister
unregister(item_or_items: Union[T, str, List[Union[T, str]], Tuple[Union[T, str], ...], Set[Union[T, str]]]) -> None
Unregister one or more items (by key or item object).
EntityRegistry
Bases: RegistryBase[E]
Registry for entity objects. Deduplication by object identity.
Attributes:
| Name | Type | Description |
|---|---|---|
| _storage | List[E] | Ordered storage for deterministic iteration. |
| _set | Set[E] | Identity-based membership set for O(1) deduplication checks. |
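The combination of ordered storage and identity-based deduplication can be sketched as follows. `IdentityRegistrySketch` is a hypothetical class written for this example; it tracks `id()` values in a set, which is one possible way to get identity semantics and may differ from what the library does internally.

```python
from typing import Generic, Iterator, List, Set, TypeVar

E = TypeVar("E")


class IdentityRegistrySketch(Generic[E]):
    """Hypothetical ordered registry that deduplicates by object identity."""

    def __init__(self) -> None:
        self._storage: List[E] = []  # keeps insertion order
        self._ids: Set[int] = set()  # id() of every stored object

    def register(self, item: E) -> bool:
        """Store item unless this exact object is already present."""
        if id(item) in self._ids:
            return False
        self._ids.add(id(item))
        self._storage.append(item)
        return True

    def unregister(self, item: E) -> None:
        """Remove this exact object; equal but distinct objects are kept."""
        if id(item) in self._ids:
            self._ids.remove(id(item))
            # Compare with `is`, not `==`, so equal-but-distinct objects survive.
            self._storage = [x for x in self._storage if x is not item]

    def __iter__(self) -> Iterator[E]:
        return iter(self._storage)
```

Two objects that compare equal are still treated as separate entries, which is the behaviour "deduplication by object identity" implies.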
ChildStructureRegistry
ChildStructureRegistry(owner: GeneticStructure[Any], expected_type: Optional[type[GeneticStructure[E]]] = None)
Bases: RegistryBase[S]
Registry for child structures. Keyed by name, preserves insertion order. Supports both register (existing) and add (create + register).
Attributes:
| Name | Type | Description |
|---|---|---|
| _owner | GeneticStructure[Any] | Parent structure that owns this registry. |
| _storage | Dict[str, S] | Name-to-child mapping for registered structures. |
add
Create a new child structure and register it. This is a convenience method: create + register.
If a child with the same name already exists in this registry, the
cached instance is returned immediately. This makes add idempotent
and consistent with GeneticStructure.__new__, which also returns
cached instances rather than creating duplicates.
Uses Species-level caching to ensure uniqueness within the same Species.
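The idempotent behaviour described above can be sketched with a small name-keyed registry. `ChildRegistrySketch` and its `factory` argument are hypothetical names chosen for illustration; the real registry creates children through the structure classes and Species-level caches rather than a plain callable.

```python
from typing import Any, Callable, Dict


class ChildRegistrySketch:
    """Hypothetical name-keyed registry with an idempotent add()."""

    def __init__(self, factory: Callable[..., Any]) -> None:
        self._factory = factory
        self._storage: Dict[str, Any] = {}  # dicts preserve insertion order

    def add(self, name: str, **kwargs: Any) -> Any:
        # Return the cached child if the name is already registered.
        existing = self._storage.get(name)
        if existing is not None:
            return existing
        child = self._factory(name, **kwargs)
        self._storage[name] = child
        return child

    def get(self, name: str) -> Any:
        return self._storage[name]
```

In this sketch a second `add` with the same name ignores the new keyword arguments and hands back the first instance; whether the library warns about differing arguments is not stated in this reference.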
get
GeneticStructure
GeneticStructure(name: str, parent: Optional[GeneticStructure[Any]] = None, species: Optional[Species] = None)
Bases: Generic[E]
Base class for genetic structures.
Structure uniqueness is scoped to a Species rather than being global. Within the same Species, structures of the same type must have unique names.
Attributes:
| Name | Type | Description |
|---|---|---|
| child_structure_type | Optional[type[GeneticStructure[Any]]] | Child structure class used by subclasses, or None when no child structures are supported. |
| name | str | Structure identifier unique within the same structure type and species. |
| species | Optional[Species] | Species this structure is currently bound to. |
| all_entities | List[E] | Snapshot of currently registered runtime entities. |
Examples:
>>> species1 = Species("Species1")
>>> locus1 = Locus("A", species=species1)
>>> locus2 = Locus("A", species=species1)
>>> assert locus1 is locus2 # Same object within species1
>>>
>>> species2 = Species("Species2")
>>> locus3 = Locus("A", species=species2)
>>> assert locus1 is not locus3 # Different species allow the same name
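One way to obtain this per-species uniqueness is to consult a cache inside `__new__`. The sketch below uses two hypothetical classes, `SpeciesSketch` and `StructureSketch`. It is a simplified assumption about the mechanism rather than the library's code, and it does not model the global fallback cache for structures created without a species.

```python
from typing import Any, Dict, Optional


class SpeciesSketch:
    """Hypothetical species object that owns a per-type structure cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.structure_cache: Dict[type, Dict[str, Any]] = {}


class StructureSketch:
    """Hypothetical structure, unique per (species, type, name)."""

    def __new__(cls, name: str, species: Optional[SpeciesSketch] = None):
        if species is None:
            # No species: nothing to deduplicate against in this sketch.
            return super().__new__(cls)
        per_type = species.structure_cache.setdefault(cls, {})
        if name not in per_type:
            per_type[name] = super().__new__(cls)
        return per_type[name]

    def __init__(self, name: str, species: Optional[SpeciesSketch] = None) -> None:
        # __init__ runs again on cache hits; reassigning the same values is harmless here.
        self.name = name
        self.species = species
```

Because Python calls `__init__` on whatever `__new__` returns, a real implementation has to guard against re-initialising a cached instance with different arguments.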
entity_type
property
Override in subclass to specify the entity type. Using property allows lazy import to avoid circular dependencies.
all_entities
property
Returns a list of all entities currently registered to this structure.
clear_cache
classmethod
Deprecated: Caching is now managed by Species. This method does nothing but is kept for backward compatibility.
clear_all_caches
Clear all caches, including:
- The global fallback cache (for structures without a Species)
- All Species-specific caches, which are cleared via Species.clear_all_caches()
This method is primarily for testing and cleanup.
add
add(name_or_specs: Union[str, List[str], List[Tuple[str, Dict[str, Any]]]], **kwargs: Any) -> Union[GeneticStructure[Any], List[GeneticStructure[Any]]]
Add child structure(s) to this structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name_or_specs | Union[str, List[str], List[Tuple[str, Dict[str, Any]]]] | Can be: str (single child name); List[str] (list of child names); List[Tuple[str, Dict]] (list of (name, kwargs) tuples). | required |
| **kwargs | Any | Additional keyword arguments for single child creation. | {} |
Returns:
| Type | Description |
|---|---|
| Union[GeneticStructure[Any], List[GeneticStructure[Any]]] | Single child structure or list of child structures. |
Examples:
>>> linkage.add("LocusA", location=100) # Single child
>>> linkage.add(["LocusA", "LocusB"]) # Multiple children
>>> linkage.add([("LocusA", {"location": 100}), ("LocusB", {"location": 200})])
remove
remove(name_or_child: Union[str, GeneticStructure[Any], List[Union[str, GeneticStructure[Any]]]]) -> None
Remove child structure(s) from this structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name_or_child | Union[str, GeneticStructure[Any], List[Union[str, GeneticStructure[Any]]]] | Can be: str (child name to remove); GeneticStructure (child instance to remove); List (list of names or instances to remove). | required |
Examples:
>>> linkage.remove("LocusA") # Remove by name
>>> linkage.remove(locus_a) # Remove by instance
>>> linkage.remove(["LocusA", "LocusB"]) # Remove multiple
get_child
Get a child structure by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the child structure. | required |
Returns:
| Type | Description |
|---|---|
| GeneticStructure[Any] | The child structure instance. |
Raises:
| Type | Description |
|---|---|
| KeyError | If no child with that name exists. |
register
Register a single entity or an iterable of entities with this structure.
EntityRegistry performs runtime type validation based on the expected type provided at construction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| entity_or_entities | Union[E, List[E], Tuple[E, ...], Set[E]] | Single entity or iterable of entities to register. | required |
Returns:
| Type | Description |
|---|---|
| GeneticStructure[E] | The GeneticStructure instance (for chaining). |
unregister
unregister(entity_or_entities: Union[E, str, List[Union[E, str]], Tuple[Union[E, str], ...], Set[Union[E, str]]]) -> GeneticStructure[E]
Unregister a single entity or an iterable of entities from this structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| entity_or_entities | Union[E, str, List[Union[E, str]], Tuple[Union[E, str], ...], Set[Union[E, str]]] | Single entity or iterable of entities to unregister. | required |
Returns:
| Type | Description |
|---|---|
| GeneticStructure[E] | The GeneticStructure instance (for chaining). |
with_entities
classmethod
with_entities(name: str, entity_ids: Union[str, Iterable[str]], **entity_kwargs: Any) -> GeneticStructure[E]
Factory method to create a GeneticStructure instance and register entities by their identifiers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the genetic structure. | required |
| entity_ids | Union[str, Iterable[str]] | Single identifier or iterable of identifiers for entities to register. | required |
| **entity_kwargs | Any | Additional keyword arguments to pass to the entity constructor. | {} |
Locus
Locus(name: str, position: Optional[Union[int, float]] = None, chromosome: Optional[Chromosome] = None, parent: Optional[Chromosome] = None, **kwargs: Any)
Bases: GeneticStructure['Gene']
Represents a genetic locus with its name.
A Locus is a blueprint for a genetic position. Multiple Gene entities (alleles) can be bound to a single Locus.
Attributes:
| Name | Type | Description |
|---|---|---|
| position | Union[int, float] | The linear position on the chromosome. Used for defining recombination rates. If not specified, defaults to max(position) + 1 among existing loci in the parent Linkage, or 0 if no parent. |
| alleles | List[Gene] | Registered allele entities bound to this locus. |
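The default-position rule can be written out as a small helper. `default_position` is a hypothetical function for illustration. The case of a parent that has no loci yet is not covered by the description, so returning 0 there is an assumption.

```python
from typing import Optional, Sequence, Union

Number = Union[int, float]


def default_position(existing: Optional[Sequence[Number]]) -> Number:
    """Hypothetical helper: pick a position for a locus created without one.

    `existing` holds the positions already used in the parent chromosome,
    or None when the locus has no parent.
    """
    if existing is None:
        return 0  # no parent: documented default
    if not existing:
        return 0  # assumption: an empty parent also starts at 0
    return max(existing) + 1
```

With this rule, loci added without explicit positions are appended after the current last locus, so insertion order and positional order agree.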
alleles
property
Alias for all_entities - returns all registered alleles (genes).
register
Register gene entities and invalidate species gene index cache.
unregister
unregister(entity_or_entities: Union[Gene, str, List[Union[Gene, str]], Tuple[Union[Gene, str], ...], Set[Union[Gene, str]]]) -> Locus
Unregister gene entities and invalidate species gene index cache.
register_allele
unregister_allele
add_alleles
Add one or more alleles (genes) to this locus.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| alleles_or_allele_names | Union[List[Union[Gene, str]], Gene, str] | Single Gene instance, single allele name (str), or list of Gene instances and/or allele names (str). | required |
Returns: The Locus instance (for chaining). Other structure-level add methods return the child structure(s) added; this method returns self for consistency with register_allele and unregister_allele.
with_alleles
classmethod
with_alleles(name: str, alleles_or_allele_names: Union[List[Union[Gene, str]], Gene, str], position: Optional[Union[int, float]] = None) -> Locus
Factory method to create a Locus and register alleles (genes) by names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the locus. | required |
| alleles_or_allele_names | Union[List[Union[Gene, str]], Gene, str] | Single Gene instance, single allele name (str), or list of Gene instances and/or allele names (str). | required |
| position | Optional[Union[int, float]] | Optional position on the chromosome. | None |
Returns:
| Type | Description |
|---|---|
| Locus | Locus instance with registered alleles. |
Examples:
>>> locus = Locus.with_alleles("A", ["A1", "A2", "A3"])
>>> locus.alleles # -> [Gene("A1"), Gene("A2"), Gene("A3")]
Chromosome
Chromosome(name: str, loci: Optional[List[Locus]] = None, species: Optional[Species] = None, parent: Optional[Species] = None, recombination_rates: Optional[Union[List[float], ndarray]] = None, sex_type: Optional[Union[SexChromosomeType, str]] = None)
Bases: GeneticStructure['Haplotype']
Represents a chromosome structure with linkage information among loci.
A Chromosome groups multiple Loci that are physically linked on the same chromosome. It also stores the recombination rates between loci.
Attributes:
| Name | Type | Description |
|---|---|---|
| sex_type | SexChromosomeType | Sex chromosome type: None or 'autosome' (autosome, default); 'X' (X chromosome in XY system); 'Y' (Y chromosome in XY system, paternal only); 'Z' (Z chromosome in ZW system); 'W' (W chromosome in ZW system, maternal only). |
| loci | List[Locus] | Child loci sorted by position. |
| recombination_map | RecombinationMap | Adjacent-locus recombination map. |
| recombination_matrix | RecombinationMap | Backward-compatible alias of recombination_map. |
| is_sex_chromosome | bool | Whether this chromosome participates in sex determination. |
| is_autosome | bool | Whether this chromosome is an autosome. |
| sex_system | Optional[str] | Sex system inferred from sex_type ('XY', 'ZW', or None). |
Examples:
>>> chr_x = Chromosome('X', sex_type='X')
>>> chr_y = Chromosome('Y', sex_type='Y')
>>> print(chr_x.is_sex_chromosome) # True
>>> print(chr_y.sex_type.paternal_only) # True
This class is also exported as Linkage for backward compatibility.
sex_system
property
Returns the sex determination system this chromosome belongs to ('XY', 'ZW', or None)
loci
property
Returns the list of loci in this chromosome, sorted by position (cached).
recombination_map
property
Returns the recombination map for this chromosome.
The recombination map stores recombination rates between adjacent loci. For n loci, the map has n-1 entries where entry i is the recombination rate between locus i and locus i+1.
recombination_matrix
property
Deprecated: Use recombination_map instead.
RecombinationMap
A 1D container storing recombination rates between adjacent loci.
For n loci, the map has n-1 entries where entry i is the recombination rate between locus i and locus i+1 (in sorted order by position).
Attributes:
| Name | Type | Description |
|---|---|---|
| loci_names | List[str] | Locus names in sorted positional order. |
| _rates | ndarray | Adjacent-locus recombination rates with length n-1. |
Examples:
For loci [A, B, C, D], the map is [r(A,B), r(B,C), r(C,D)] where index i = rate between locus i and locus i+1.
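The indexing convention can be made concrete with a small stand-in class. `RecombinationMapSketch` and its `rate_between` method are hypothetical names used only for this example, and it stores rates in a plain list instead of an ndarray.

```python
from typing import List, Tuple


class RecombinationMapSketch:
    """Hypothetical stand-in: n loci in positional order, n-1 adjacent rates."""

    def __init__(self, loci_names: List[str], rates: List[float]) -> None:
        if len(rates) != max(len(loci_names) - 1, 0):
            raise ValueError("expected one rate per adjacent pair of loci")
        self.loci_names = list(loci_names)
        self.rates = list(rates)

    def get_adjacent_loci(self, index: int) -> Tuple[str, str]:
        # Entry i is the rate between locus i and locus i + 1.
        return self.loci_names[index], self.loci_names[index + 1]

    def rate_between(self, left: str, right: str) -> float:
        """Look up the rate for an adjacent pair given in positional order."""
        i = self.loci_names.index(left)
        if i + 1 >= len(self.loci_names) or self.loci_names[i + 1] != right:
            raise KeyError(f"{left!r} and {right!r} are not adjacent in this order")
        return self.rates[i]
```

Storing only adjacent rates keeps the container one-dimensional; rates between non-adjacent loci would have to be derived from the chain of adjacent entries.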
name_to_index
validate
Validate the recombination map.
get_adjacent_loci
Get the names of the two loci at the given rate index.
add_locus
add_locus(locus_or_name: Union[Locus, str], position: Optional[Union[int, float]] = None, recombination_rate_with_previous: float = 0.0, **kwargs: Any) -> Locus
Add a locus to this chromosome.
When inserting a new locus:
- If it is the first locus of the chromosome, the recombination rate parameter sets the rate with the next (second) locus.
- Otherwise, the recombination rate parameter sets the rate with the previous locus, and the rate with the next locus is inherited from the old rate between the previous and next loci.
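The insertion rule can be sketched on a plain list of adjacent rates. `insert_rate` is a hypothetical helper, not a library function. It reads "first locus" as the locus that ends up first in positional order, and it drops the rate when the chromosome was empty, which is an assumption.

```python
from typing import List


def insert_rate(rates: List[float], n_loci: int, index: int, rate: float) -> List[float]:
    """Hypothetical helper: adjacent rates after inserting a locus at `index`.

    `rates` has n_loci - 1 entries (empty for 0 or 1 loci); `index` is the
    sorted slot the new locus will occupy, with 0 <= index <= n_loci.
    """
    if n_loci == 0:
        return []  # a single locus has no neighbours, so there are no rates yet
    if index == 0:
        # New first locus: `rate` applies to the pair (new, old first).
        return [rate] + list(rates)
    # Otherwise `rate` applies to (previous, new); the old (previous, next)
    # rate, if there is one, now describes (new, next).
    return list(rates[: index - 1]) + [rate] + list(rates[index - 1 :])
```

For loci A, B, C with rates `[0.1, 0.2]`, inserting a locus between A and B with rate 0.05 gives `[0.05, 0.1, 0.2]`: the new locus takes 0.05 with A and inherits the old A-B rate of 0.1 toward B.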
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| locus_or_name | Union[Locus, str] | Either a Locus instance or a name to create a new Locus. | required |
| position | Optional[Union[int, float]] | Optional position (only used when creating new Locus by name). If not specified, defaults to max(position) + 1 among existing loci. | None |
| recombination_rate_with_previous | float | Recombination rate with the adjacent locus. Defaults to 0 (complete linkage). If the first locus of the chromosome, sets the rate with the second locus; otherwise sets the rate with the previous locus. | 0.0 |
| **kwargs | Any | Additional custom parameters to pass to the Locus constructor. | {} |
Returns:
| Type | Description |
|---|---|
| Locus | The added Locus instance. |
remove_locus
Remove a locus from this chromosome.
When removing a locus, the recombination rates are adjusted to maintain connectivity between the remaining loci.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| locus_or_name | Union[Locus, str] | Either a Locus instance or a name. | required |
get_locus
Get a locus by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the locus. | required |
Returns:
| Type | Description |
|---|---|
| Optional[Locus] | The Locus instance or None if not found. |
invalidate_recombination_map_cache
get_locus_index
set_recombination
Set the recombination rate between two adjacent loci.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| locus_a | Union[Locus, str] | First locus (by name or Locus object). | required |
| locus_b | Union[Locus, str] | Second locus (by name or Locus object). | required |
| rate | float | Recombination rate (must be in [0, 0.5]). | required |
Raises:
| Type | Description |
|---|---|
| KeyError | If the loci are not adjacent. |
| ValueError | If rate is out of range or there are fewer than 2 loci. |
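The documented checks can be mirrored in a few lines. `set_adjacent_rate` is a hypothetical free function operating on plain lists. Accepting the two loci in either order is an assumption, and an unknown locus name surfaces here as the ValueError raised by `list.index`, which the reference does not specify.

```python
from typing import List


def set_adjacent_rate(loci: List[str], rates: List[float], a: str, b: str, rate: float) -> None:
    """Hypothetical in-place setter mirroring the documented error cases."""
    if len(loci) < 2:
        raise ValueError("need at least two loci to set a recombination rate")
    if not 0.0 <= rate <= 0.5:
        raise ValueError(f"rate must be in [0, 0.5], got {rate}")
    i, j = loci.index(a), loci.index(b)
    if abs(i - j) != 1:
        raise KeyError(f"{a!r} and {b!r} are not adjacent")
    # The rate for a pair is stored at the index of its left-hand locus.
    rates[min(i, j)] = rate
```

The upper bound of 0.5 corresponds to independent assortment, which is why larger values are rejected.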
set_recombination_bulk
set_recombination_all
Set all recombination rates to the same value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float | Recombination rate (must be in [0, 0.5]). | required |
set_recombination_default
set_recombination_rate
Deprecated: Use set_recombination instead.
set_recombination_rates
Deprecated: Use set_recombination_bulk instead.
Species
Species(name: str, chromosomes: Optional[List[Chromosome]] = None, gamete_labels: Optional[list[str]] = None)
Bases: GeneticStructure['HaploidGenome']
Represents the complete genetic architecture defined by chromosomes.
A Species is the top-level structure that contains multiple Chromosomes, each representing a chromosome with its loci and recombination rates.
Attributes:
| Name | Type | Description |
|---|---|---|
| child_structures | ChildStructureRegistry[Chromosome] | Registry of chromosomes. |
| structure_cache | Dict[type, Dict[str, GeneticStructure[Any]]] | Species-scoped structure cache grouped by structure type. |
| gamete_labels | List[str] | Optional labels used to identify gamete categories. |
This class is also exported as GenomeTemplate for backward compatibility.
gamete_labels
property
writable
Return the list of gamete labels for this species.
structure_cache
property
Public accessor for species-scoped structure caches.
sex_system
property
Returns the sex determination system ('XY', 'ZW', or None).
Automatically inferred from Chromosome.sex_type. Raises an error if multiple systems are found.
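The inference can be sketched as a reduction over the chromosomes' sex types. `infer_sex_system` is a hypothetical helper that takes the types as strings, and raising ValueError for mixed systems is an assumption, since the reference only says that an error is raised.

```python
from typing import Iterable, Optional


def infer_sex_system(sex_types: Iterable[Optional[str]]) -> Optional[str]:
    """Hypothetical helper: derive 'XY', 'ZW', or None from chromosome sex types."""
    system_of = {"X": "XY", "Y": "XY", "Z": "ZW", "W": "ZW"}
    # Autosomes (None or 'autosome') contribute nothing to the result.
    found = {system_of[t] for t in sex_types if t in system_of}
    if len(found) > 1:
        raise ValueError(f"conflicting sex systems: {sorted(found)}")
    return next(iter(found), None)
```

A species with only autosomes yields None, while a species that mixes X and Z chromosomes is rejected.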
gene_index
property
Returns a mapping from gene names to gene instances.
clear_structure_cache
Clear all structure caches for this Species. This removes all cached Structure instances (Locus, Chromosome) within this Species.
clear_entity_cache
Clear all entity caches for this Species. This removes all cached Entity instances (Gene, Haplotype, etc.) within this Species.
clear_all_caches
invalidate_gene_index_cache
add_chromosome
add_chromosome(chrom_or_name: Union[Chromosome, str], loci: Optional[List[Locus]] = None, sex_type: Optional[Union[SexChromosomeType, str]] = None) -> Chromosome
Add a chromosome to this species.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chrom_or_name | Union[Chromosome, str] | Either a Chromosome instance or a name to create a new one. | required |
| loci | Optional[List[Locus]] | Optional list of loci (only used when creating new Chromosome by name). | None |
| sex_type | Optional[Union[SexChromosomeType, str]] | Optional sex chromosome type ('X', 'Y', 'Z', 'W', or None). | None |
Returns:
| Type | Description |
|---|---|
| Chromosome | The added Chromosome instance. |
add_linkage
add_linkage(linkage_or_name: Union[Chromosome, str], loci: Optional[List[Locus]] = None) -> Chromosome
Alias for add_chromosome (backward compatibility).
remove_chromosome
Remove a chromosome from this species.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chrom_or_name | Union[Chromosome, str] | Either a Chromosome instance or a name. | required |
remove_linkage
get_all_loci
Returns all loci across all chromosomes.
from_dict
classmethod
from_dict(name: str, structure: Dict[str, Union[List[str], Dict[str, List[str]], Dict[str, Any]]], gamete_labels: Optional[List[str]] = None) -> Species
Create a Species with complete hierarchy from a dictionary specification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the species. | required |
| structure | Dict[str, Union[List[str], Dict[str, List[str]], Dict[str, Any]]] | Dictionary defining the structure. Each chromosome name maps to one of: a list of locus names (['Locus1', 'Locus2', ...]); a dict from locus name to allele names ({'Locus1': ['allele1', 'allele2'], 'Locus2': ['allele1', 'allele2']}); or a dict with 'sex_type' and 'loci' keys ({'sex_type': 'X', 'loci': {'Locus1': ['allele1', 'allele2']}}). | required |
Returns:
| Type | Description |
|---|---|
| Species | Species instance with all Chromosomes and Loci created. |
Examples:
>>> # Simple: just loci names
>>> species = Species.from_dict('Species', {
... 'Chr1': ['LocusA', 'LocusB'],
... 'Chr2': ['LocusC']
... })
>>>
>>> # With alleles
>>> species = Species.from_dict('Species', {
... 'Chr1': {
... 'LocusA': ['A1', 'A2'],
... 'LocusB': ['B1', 'B2', 'B3']
... },
... 'Chr2': {
... 'LocusC': ['C1', 'C2']
... }
... })
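The three accepted shapes for a chromosome entry can be reduced to one internal form. The helper below, `normalize_chromosome_spec`, is hypothetical and only illustrates the dispatch. It assumes that a dict containing a 'loci' key is the extended form, so a locus literally named 'loci' would be ambiguous in this sketch.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

ChromSpec = Union[List[str], Dict[str, Any]]


def normalize_chromosome_spec(spec: ChromSpec) -> Tuple[Optional[str], Dict[str, List[str]]]:
    """Hypothetical helper: reduce any of the three forms to (sex_type, {locus: alleles})."""
    if isinstance(spec, list):
        # Simple form: locus names only, no alleles yet.
        return None, {locus: [] for locus in spec}
    if "loci" in spec:
        # Extended form: explicit 'sex_type' plus nested 'loci'.
        loci = spec["loci"]
        return spec.get("sex_type"), {locus: list(alleles) for locus, alleles in loci.items()}
    # Allele form: locus name -> allele names.
    return None, {locus: list(alleles) for locus, alleles in spec.items()}
```

For instance, `{'sex_type': 'X', 'loci': {'LocusC': ['C1', 'C2']}}` normalizes to `('X', {'LocusC': ['C1', 'C2']})`.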
get_locus
Get a locus by name across all chromosomes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the locus. | required |
Returns:
| Type | Description |
|---|---|
| Optional[Locus] | The Locus instance or None if not found. |
get_chromosome
Get a chromosome by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the chromosome. | required |
Returns:
| Type | Description |
|---|---|
| Optional[Chromosome] | The Chromosome instance or None if not found. |
get_gene
Get a gene by name across all loci.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the gene. | required |
Returns:
| Type | Description |
|---|---|
| Optional[Gene] | The Gene instance or None if not found. |
Raises:
| Type | Description |
|---|---|
| ValueError | If duplicate gene names exist in the species. |
has_gene
Check if a gene with the given name exists in the species.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the gene to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the gene exists, False otherwise. |
Raises:
| Type | Description |
|---|---|
| ValueError | If duplicate gene names exist in the species. |
get_linkage
get_haploid_genome_from_str
Create or retrieve a HaploidGenome from a string representation.
Supported syntax includes:
- Semicolon (;) separates different chromosomes
- Slash (/) separates genes within a chromosome
- If all genes are single characters, slash can be omitted
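The separator rules above can be sketched as a plain tokenizer. `split_haploid_string` is a hypothetical function that only splits the text. It treats any chunk without a slash as a run of single-character gene names, whereas the real method presumably also resolves the names against the genes registered in the species.

```python
from typing import List


def split_haploid_string(text: str) -> List[List[str]]:
    """Hypothetical tokenizer: one list of gene names per chromosome."""
    chromosomes = []
    for chunk in text.split(";"):
        chunk = chunk.strip()
        if "/" in chunk:
            genes = [g.strip() for g in chunk.split("/")]
        else:
            # No slash: treat every character as a single-character gene name.
            genes = list(chunk)
        chromosomes.append(genes)
    return chromosomes
```

Under these rules `"ABC;XY"` and `"A/B/C;X/Y"` produce the same token lists, which is why the slash can be dropped for single-character names.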
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| haploid_str | str | String like "ABC;XY" or "a1/b1/c1;x1/y1". | required |
Returns:
| Type | Description |
|---|---|
| HaploidGenome | HaploidGenome instance. |
Examples:
>>> species = Species.from_dict("Test", {
... "Chr1": {"A": ["A", "a"], "B": ["B", "b"], "C": ["C", "c"]},
... "Chr2": {"X": ["X", "x"], "Y": ["Y", "y"]}
... })
>>> hap = species.get_haploid_genome_from_str("ABC;XY")
>>> hap = species.get_haploid_genome_from_str("a/b/c;x/y") # slash-separated form, here selecting the lowercase alleles
get_haploid_genotype_from_str
get_genotype_from_str
Create or retrieve a Genotype from a string representation.
Supported syntax includes:
- Pipe (|) separates maternal (left) and paternal (right) haploid genomes
- Semicolon (;) separates different chromosomes
- Slash (/) separates genes within a chromosome
- If all genes are single characters, slash can be omitted
The order of chromosomes in the string does not need to match the internal chromosome order - matching is done by gene names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| genotype_str | str | String like "ABC\|abc" or "a1/b1/c1\|a2/b2/c2;X/Y\|x/y". | required |
Returns:
| Type | Description |
|---|---|
| Genotype | Genotype instance. |
Examples:
>>> species = Species.from_dict("Test", {
... "Chr1": {"A": ["A", "a"], "B": ["B", "b"], "C": ["C", "c"]},
... "Chr2": {"X": ["X", "x"], "Y": ["Y", "y"]}
... })
>>>
>>> # Simple single-char genes
>>> gt = species.get_genotype_from_str("ABC|abc;XY|xy")
>>>
>>> # Multi-char genes with slash separator (requires a species whose alleles are named A1, A2, ...)
>>> gt = species.get_genotype_from_str("A1/B1/C1|A2/B2/C2;X1/Y1|X2/Y2")
>>>
>>> # Order doesn't matter (unordered matching)
>>> gt = species.get_genotype_from_str("XY|xy;ABC|abc") # Same result
resolve_genotype_selectors
resolve_genotype_selectors(selector: Union[Genotype, str, Tuple[Union[Genotype, str], ...]], all_genotypes: Optional[Iterable[Genotype]] = None, context: str = 'selector') -> List[Genotype]
Resolve one or more genotype selectors into concrete Genotype objects.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
selector
|
Union[Genotype, str, Tuple[Union[Genotype, str], ...]]
|
Selector expression to resolve. Supported forms:
- |
required |
all_genotypes
|
Optional[Iterable[Genotype]]
|
Optional candidate genotype iterable used by pattern
matching. If |
None
|
context
|
str
|
Human-readable context label used in error messages (for
example |
'selector'
|
Returns:
| Type | Description |
|---|---|
List[Genotype]
|
A list of resolved |
Raises:
| Type | Description |
|---|---|
TypeError
|
If a selector atom is neither |
ValueError
|
If the selector is invalid, if pattern parsing fails, if a pattern matches no genotypes, or if a tuple selector is empty. |
parse_genotype_pattern
Parse a genotype pattern string and return a filter function.
Supports regex-like syntax for flexible pattern matching:
- ; separates chromosomes
- | separates maternal (left) and paternal (right)
- / separates loci within a chromosome
- * matches any allele
- {A,B,C} matches any allele in the set
- !A matches any allele except A
- :: matches unordered pair (A::B matches A|B or B|A)
- () explicitly groups chromosome loci
- Omitted chromosomes default to wildcard matching
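To make the allele-level operators concrete, here is a minimal sketch restricted to a single locus. `allele_matches` and `pair_matches` are hypothetical helpers. They do not handle chromosome separators, grouping with parentheses, or omitted chromosomes, and they are not the library's parser.

```python
def allele_matches(token: str, allele: str) -> bool:
    """Hypothetical matcher for one allele token of the pattern syntax."""
    token = token.strip()
    if token == "*":
        return True  # wildcard
    if token.startswith("!"):
        return allele != token[1:]  # negation
    if token.startswith("{") and token.endswith("}"):
        return allele in {t.strip() for t in token[1:-1].split(",")}  # set
    return allele == token  # literal


def pair_matches(locus_pattern: str, maternal: str, paternal: str) -> bool:
    """Match one locus written as 'A|B' (ordered) or 'A::B' (unordered)."""
    if "::" in locus_pattern:
        left, right = locus_pattern.split("::")
        return (allele_matches(left, maternal) and allele_matches(right, paternal)) or (
            allele_matches(left, paternal) and allele_matches(right, maternal)
        )
    left, right = locus_pattern.split("|")
    return allele_matches(left, maternal) and allele_matches(right, paternal)
```

The unordered operator simply tries both assignments of the two tokens to the maternal and paternal alleles, so `A::B` accepts both `A|B` and `B|A`.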
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pattern | str | Pattern string, e.g. "A1/B1\|A2/B2; C1/C2". | required |
Returns:
| Type | Description |
|---|---|
| Callable[[Genotype], bool] | A filter function that takes a Genotype and returns bool. |
Examples:
>>> filter_func = species.parse_genotype_pattern("A1/B1|A2/B2; C1::*")
>>> genotypes = [gt for gt in pop.genotypes if filter_func(gt)]
Raises:
| Type | Description |
|---|---|
| PatternParseError | If the pattern is invalid. |
filter_genotypes_by_pattern
Filter a collection of genotypes by a pattern string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| genotypes | Iterable[Genotype] | Iterable of Genotype objects to filter. | required |
| pattern | str | Pattern string (see parse_genotype_pattern for syntax). | required |
Returns:
| Type | Description |
|---|---|
| List[Genotype] | List of genotypes that match the pattern. |
enumerate_genotypes_matching_pattern
enumerate_genotypes_matching_pattern(pattern: str, max_count: Optional[int] = None) -> Iterable[Genotype]
Enumerate all genotypes matching a pattern.
Yields all possible genotype combinations that satisfy the pattern. Uses the pattern's built-in matching logic to filter candidates, avoiding complex combination generation.
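The generate-and-filter approach with an optional cap can be sketched with the standard library. `enumerate_matching` is a hypothetical function that works on tuples of allele names for one haploid set instead of Genotype objects, and the `accepts` callable stands in for a parsed pattern.

```python
from itertools import islice, product
from typing import Callable, Dict, Iterator, List, Optional, Tuple


def enumerate_matching(
    alleles_per_locus: Dict[str, List[str]],
    accepts: Callable[[Tuple[str, ...]], bool],
    max_count: Optional[int] = None,
) -> Iterator[Tuple[str, ...]]:
    """Hypothetical generate-and-filter enumeration over allele combinations."""
    # Every combination of one allele per locus, produced lazily.
    candidates = product(*alleles_per_locus.values())
    matching = (combo for combo in candidates if accepts(combo))
    # islice(..., None) yields everything; an int caps the output lazily.
    return islice(matching, max_count)
```

Because everything is lazy, a small `max_count` stops the enumeration early even when the full combination space is very large.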
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pattern | str | Pattern string (see parse_genotype_pattern for syntax). | required |
| max_count | Optional[int] | Maximum number of genotypes to yield (prevents explosion). If None, yields all possible genotypes. | None |
Yields:
| Type | Description |
|---|---|
| Iterable[Genotype] | Genotype objects matching the pattern. |
Raises:
| Type | Description |
|---|---|
| PatternParseError | If the pattern is invalid. |
parse_haploid_genome_pattern
Parse a haploid genome pattern string and return a filter function.
Supports regex-like syntax for flexible pattern matching of haploid genomes. A HaploidGenome represents one complete DNA strand (all haplotypes). Uses the same syntax as Genotype patterns but applies to single haplotypes:
- ; separates chromosomes
- / separates loci within a chromosome
- * matches any allele
- {A,B,C} matches any allele in the set
- !A matches any allele except A
- () explicitly groups chromosome loci
- Omitted chromosomes default to wildcard matching
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pattern | str | Pattern string, e.g. "A1/B1; C1/C2". | required |
Returns:
| Type | Description |
|---|---|
| Callable[[HaploidGenome], bool] | A filter function that takes a HaploidGenome and returns bool. |
Examples:
>>> filter_func = species.parse_haploid_genome_pattern("A1/B1; C1")
>>> haploid_genomes = [hg for hg in pop.haploid_genomes if filter_func(hg)]
Raises:
| Type | Description |
|---|---|
| PatternParseError | If the pattern is invalid. |
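To make the pattern syntax concrete, here is a minimal self-contained sketch of a matcher for the same notation. It is not the library's parser: `compile_pattern` and `_locus_matcher` are hypothetical names, a haploid genome is modelled as a plain nested list of allele names rather than a HaploidGenome, parentheses are simply stripped, and no PatternParseError is raised for malformed input.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for HaploidGenome: one list of allele names per chromosome.
Haploid = Sequence[Sequence[str]]


def _locus_matcher(token: str) -> Callable[[str], bool]:
    """Turn one locus token into a predicate over allele names."""
    token = token.strip()
    if token == "*":  # wildcard: any allele
        return lambda allele: True
    if token.startswith("!"):  # negation: anything except the named allele
        banned = token[1:]
        return lambda allele: allele != banned
    if token.startswith("{") and token.endswith("}"):  # set membership
        allowed = {name.strip() for name in token[1:-1].split(",")}
        return lambda allele: allele in allowed
    return lambda allele: allele == token  # literal allele name


def compile_pattern(pattern: str) -> Callable[[Haploid], bool]:
    """Compile a ';' and '/' separated pattern into a filter function."""
    chromosomes: List[List[Callable[[str], bool]]] = [
        [_locus_matcher(tok) for tok in chrom.strip().strip("()").split("/")]
        for chrom in pattern.split(";")
    ]

    def matches(genome: Haploid) -> bool:
        if len(chromosomes) > len(genome):
            return False
        # zip stops at the end of the pattern, so omitted chromosomes act as wildcards.
        for matchers, loci in zip(chromosomes, genome):
            if len(matchers) != len(loci):
                return False
            if not all(m(a) for m, a in zip(matchers, loci)):
                return False
        return True

    return matches


genomes = [
    [["A1", "B1"], ["C1"]],
    [["A1", "B2"], ["C1"]],
    [["A2", "B1"], ["C2"]],
]
exact = [g for g in genomes if compile_pattern("A1/B1; C1")(g)]
loose = [g for g in genomes if compile_pattern("{A1,A2}/!B2")(g)]
```

The second pattern names only the first chromosome, so the second chromosome is left unconstrained.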
Source code in src/natal/genetic_structures.py
filter_haploid_genomes_by_pattern
filter_haploid_genomes_by_pattern(haploid_genomes: Iterable[HaploidGenome], pattern: str) -> List[HaploidGenome]
Filter a collection of haploid genomes by a pattern string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| haploid_genomes | Iterable[HaploidGenome] | Iterable of HaploidGenome objects to filter. | required |
| pattern | str | Pattern string (see parse_haploid_genome_pattern for syntax). | required |
Returns:
| Type | Description |
|---|---|
| List[HaploidGenome] | List of haploid genomes that match the pattern. |
Source code in src/natal/genetic_structures.py
enumerate_haploid_genomes_matching_pattern
enumerate_haploid_genomes_matching_pattern(pattern: str, max_count: Optional[int] = None) -> Iterable[HaploidGenome]
Enumerate all haploid genomes matching a pattern.
Yields all possible haploid genome combinations that satisfy the pattern. Uses the pattern's built-in matching logic to filter candidates, avoiding complex combination generation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pattern | str | Pattern string (see parse_haploid_genome_pattern for syntax). | required |
| max_count | Optional[int] | Maximum number of haploid genomes to yield (prevents explosion). If None, yields all possible haploid genomes. | None |
Yields:
| Type | Description |
|---|---|
| Iterable[HaploidGenome] | HaploidGenome objects matching the pattern. |
Raises:
| Type | Description |
|---|---|
| PatternParseError | If the pattern is invalid. |
Source code in src/natal/genetic_structures.py
get_sex_chromosome_groups
count_alleles
Count the total number of alleles across all loci.
Returns:
| Type | Description |
|---|---|
| int | Total allele count. |
Source code in src/natal/genetic_structures.py
count_haploid_genotypes
Calculate the total number of possible haploid genomes.
The haploid genome count is the product of the number of alleles at each locus. If sex chromosome groups exist, only one chromosome is selected per group.
Returns:
| Type | Description |
|---|---|
| int | Total number of possible haploid genomes. |
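The counting rule can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: `count_haploid` and its dictionary inputs are hypothetical, and "only one chromosome is selected per group" is interpreted here as the alternatives within a group adding up rather than multiplying.

```python
from math import prod
from typing import Dict, List


def count_haploid(
    autosomes: Dict[str, List[int]],
    sex_groups: List[Dict[str, List[int]]],
) -> int:
    """Count haploid genomes from per-locus allele counts.

    `autosomes` maps a chromosome name to the allele count of each locus.
    Each entry of `sex_groups` maps alternative chromosomes (e.g. X and Y)
    to their per-locus allele counts.
    """
    # Independent loci multiply; prod of an empty iterable is 1.
    total = prod(n for loci in autosomes.values() for n in loci)
    for group in sex_groups:
        # Assumption: exactly one chromosome of the group is present,
        # so the group's alternatives are summed.
        total *= sum(prod(loci) for loci in group.values())
    return total


autosomal_only = count_haploid({"chr1": [2, 3]}, [])
with_xy = count_haploid({"chr1": [2, 3]}, [{"X": [2], "Y": [1]}])
```

With two loci of 2 and 3 alleles the autosomal count is 6; adding an X with one 2-allele locus and a Y with one 1-allele locus multiplies that by 2 + 1.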
Source code in src/natal/genetic_structures.py
count_genotypes
Calculate the total number of possible diploid genotypes.
If _valid_sex_genotypes is defined, only valid sex chromosome combinations are counted.
Sex chromosome system configuration:
- Can be automatically inferred by setting Chromosome.sex_type
- Can also be manually set via _sex_chromosome_groups and _valid_sex_genotypes
Returns:
| Type | Description |
|---|---|
| int | Total number of possible genotypes. |
Source code in src/natal/genetic_structures.py
iter_haploid_genotypes
Iterate over all possible haploid genomes (HaploidGenome).
If sex chromosome groups exist, only one chromosome is selected per group.
Note: This method returns all possible haploid genomes without distinguishing between maternal and paternal origin. When that distinction matters, use iter_maternal_haploid_genotypes() and iter_paternal_haploid_genotypes().
Yields:
| Type | Description |
|---|---|
| Iterable[HaploidGenome] | HaploidGenome instances. |
Source code in src/natal/genetic_structures.py
iter_maternal_haploid_genotypes
Iterate maternal haploid genomes that can be transmitted.
Yields:
| Type | Description |
|---|---|
| Iterable[HaploidGenome] | HaploidGenome instances. |
Source code in src/natal/genetic_structures.py
iter_paternal_haploid_genotypes
Iterate paternal haploid genomes that can be transmitted.
Yields:
| Type | Description |
|---|---|
| Iterable[HaploidGenome] | HaploidGenome instances. |
Source code in src/natal/genetic_structures.py
iter_genotypes
Iterate all possible diploid genotypes.
Maternal and paternal sides are ordered, so (A|B) and (B|A)
are distinct genotypes. When _valid_sex_genotypes or
Chromosome.sex_type constraints are present, only valid sex
chromosome pairings are emitted.
Yields:
| Type | Description |
|---|---|
| Iterable[Genotype] | Genotype instances. |
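The ordering rule and the sex chromosome constraint can be pictured with plain tuples. This is a minimal sketch rather than the library's iterator: the allele labels are hypothetical, and the XY constraint follows the SexChromosomeType description that Y is paternal only.

```python
from itertools import product

# Hypothetical haploid genomes for a single XY sex chromosome group.
maternal = ["X_a", "X_b"]       # a mother can only transmit an X
paternal = ["X_a", "X_b", "Y"]  # a father can transmit an X or the Y

# Ordered (maternal, paternal) pairs: swapping the two sides gives a new genotype.
genotypes = list(product(maternal, paternal))
```

Both ("X_a", "X_b") and ("X_b", "X_a") appear in the result, while no pairing places Y on the maternal side.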
Source code in src/natal/genetic_structures.py
get_all_haploid_genotypes
Get a list of all possible haploid genomes.
Returns:
| Type | Description |
|---|---|
| List[HaploidGenome] | List of all HaploidGenome instances. |
get_maternal_haploid_genotypes
Get all maternal-transmissible haploid genomes.
Returns:
| Type | Description |
|---|---|
| List[HaploidGenome] | List of maternal haploid genomes. |
get_paternal_haploid_genotypes
Get all paternal-transmissible haploid genomes.
Returns:
| Type | Description |
|---|---|
| List[HaploidGenome] | List of paternal haploid genomes. |
get_haploid_genotypes
get_haploid_genotypes(parent: Optional[Literal['maternal', 'paternal']] = None) -> List[HaploidGenome]
Get haploid genomes, optionally constrained by parent role.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parent | Optional[Literal['maternal', 'paternal']] | Parent role constraint. Accepted values are 'maternal', 'paternal', or None (no constraint). | None |
Returns:
| Type | Description |
|---|---|
| List[HaploidGenome] | List of haploid genomes for the requested scope. |
Raises:
| Type | Description |
|---|---|
| ValueError | If parent is not one of the accepted values. |
Source code in src/natal/genetic_structures.py
get_all_genotypes
Get a list of all possible diploid genotypes.
Returns:
| Type | Description |
|---|---|
| List[Genotype] | List of all Genotype instances. |
ensure_type
Ensures that an object is an instance of a given class, with lazy import.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obj | any | The object to check. | required |
| expected_type | type | The expected class type. | required |
Raises:
| Type | Description |
|---|---|
| TypeError | If obj is not an instance of the specified class. |
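One common way to combine a type check with a lazy import is to resolve the class only when the check runs. The sketch below shows that general technique and is not the library's ensure_type: the name `ensure_type_sketch` and the dotted-path string form of the expected type are assumptions made for illustration.

```python
import importlib
from fractions import Fraction
from typing import Any, Union


def ensure_type_sketch(obj: Any, expected: Union[type, str]) -> None:
    """Raise TypeError unless `obj` is an instance of `expected`.

    `expected` may be a class or a dotted path such as "fractions.Fraction"
    (hypothetical convention); the path is imported only at call time, which
    avoids importing the module when this helper is defined.
    """
    if isinstance(expected, str):
        module_name, _, class_name = expected.rpartition(".")
        expected = getattr(importlib.import_module(module_name), class_name)
    if not isinstance(obj, expected):
        raise TypeError(
            f"expected {expected.__name__}, got {type(obj).__name__}"
        )


# Passes silently: the object is an instance of the lazily resolved class.
ensure_type_sketch(Fraction(1, 2), "fractions.Fraction")
```

Deferring the import in this way is a typical means of avoiding circular imports between modules that reference each other's classes.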