genetic_entities Module

Genetic entities representing individual genetic elements.

Overview

The genetic_entities module defines the runtime genetic entities that represent individual genetic elements like genes, haplotypes, and genotypes.

Complete Module Reference

natal.genetic_entities

Defines mutable biological entities bound to genetic structures.

This module represents concrete genetic instances such as genes, haplotypes, haploid genomes, and genotypes. Instances are validated against their bound structures and are registered with those structures at creation time.

Note
  • This module depends at runtime on genetic_structures for architecture definitions.
  • Entities should not modify their bound structures.
  • Every entity must be bound to a structure (mandatory binding rule).

GeneticEntity

GeneticEntity(name: str, structure: Any = None, **kwargs: Any)

Bases: Generic[S]

Base class for genetic entities bound to genetic structures.

Entities follow three invariants:
  1. An entity must be bound to a structure.
  2. An entity auto-registers to its structure at creation time.
  3. The same entity name under the same structure resolves to one cached instance.

Attributes:

structure_type (type[GeneticStructure[Any]]): Required bound structure type for subclasses.
name (str): Entity identifier within its bound structure.
structure (GeneticStructure[Any]): Bound structure instance.

Examples:

gene = Gene("A1", locus=locus_A)      # ✅ Required locus
assert gene in locus_A.all_entities   # ✅ Auto-registered
gene2 = Gene("A1", locus=locus_A)     # ✅ Returns same instance
assert gene is gene2
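The three invariants can be illustrated with a minimal, self-contained sketch. It does not use the natal classes: ToyStructure, ToyEntity, and the cache key layout are hypothetical stand-ins, and the real module may organize its `__new__` and cache differently.

```python
from typing import Any, Dict, List, Tuple


class ToyStructure:
    """Hypothetical stand-in for a genetic structure such as a Locus."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.all_entities: List["ToyEntity"] = []

    def register(self, entity: "ToyEntity") -> None:
        self.all_entities.append(entity)


class ToyEntity:
    """Hypothetical stand-in for an entity that is cached per structure."""

    # Assumed key layout for this sketch: (id of structure, name, class)
    _instance_cache: Dict[Tuple[int, str, type], "ToyEntity"] = {}

    def __new__(cls, name: str, structure: Any = None) -> "ToyEntity":
        key = (id(structure), name, cls)
        cached = cls._instance_cache.get(key)
        if cached is not None:
            return cached  # invariant 3: same name + structure -> same object
        instance = super().__new__(cls)
        instance._pending_cache_key = key
        return instance

    def __init__(self, name: str, structure: Any = None) -> None:
        # __init__ runs again on cached instances, so skip re-initialization
        if getattr(self, "_initialized", False):
            return
        if structure is None:
            raise TypeError("ToyEntity must be bound to a structure.")  # invariant 1
        self.name = name
        self.structure = structure
        structure.register(self)  # invariant 2: register upon creation
        self._initialized = True
        # Cache only after initialization succeeded
        ToyEntity._instance_cache[self._pending_cache_key] = self
        del self._pending_cache_key


locus_A = ToyStructure("A")
gene = ToyEntity("A1", structure=locus_A)
gene2 = ToyEntity("A1", structure=locus_A)
print(gene is gene2, len(locus_A.all_entities))
```

Caching the instance only after `__init__` succeeds means a failed construction (for example, a missing structure) never leaves a half-initialized object in the cache.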

Source code in src/natal/genetic_entities.py
def __init__(
    self,
    name: str,
    structure: Any = None,
    **kwargs: Any
):
    # Prevent re-initialization of cached instances
    if hasattr(self, "_initialized") and self._initialized:
        return

    if name.strip() == "":
        raise ValueError("Entity name cannot be empty.")

    if structure is None:
        raise TypeError(
            f"{self.__class__.__name__} must be bound to a structure. "
            f"Please provide a valid structure parameter."
        )
    structure = cast(GeneticStructure[Any], structure)
    _ = kwargs  # keep constructor signature aligned with __new__


    # Validate structure type using class attribute
    expected_type = self.__class__.structure_type
    if expected_type != GeneticStructure and not isinstance(structure, expected_type):
        raise TypeError(
            f"structure must be of type {expected_type.__name__}, "
            f"got {type(structure).__name__}."
        )

    self.name = name
    self.structure = structure

    # Auto-register with the structure ("register upon creation")
    register_owner = cast(Any, structure)
    register_owner.register(self)

    # Mark as initialized
    self._initialized = True

    # Cache the instance AFTER successful initialization
    if hasattr(self, '_pending_cache_key'):
        GeneticEntity._instance_cache[self._pending_cache_key] = self
        del self._pending_cache_key
clear_cache classmethod
clear_cache() -> None

Clear the instance cache for this entity class. Useful for testing or resetting the global state.

Source code in src/natal/genetic_entities.py
@classmethod
def clear_cache(cls) -> None:
    """
    Clear the instance cache for this entity class.
    Useful for testing or resetting the global state.
    """
    keys_to_remove = [k for k in GeneticEntity._instance_cache if k[3] == cls]
    for key in keys_to_remove:
        del GeneticEntity._instance_cache[key]
clear_all_caches classmethod
clear_all_caches() -> None

Clear all entity instance caches.

Source code in src/natal/genetic_entities.py
@classmethod
def clear_all_caches(cls) -> None:
    """
    Clear all entity instance caches.
    """
    GeneticEntity._instance_cache.clear()
clear_species_cache classmethod
clear_species_cache(species_id: int) -> None

Clear entity cache entries that belong to one species id.

Source code in src/natal/genetic_entities.py
@classmethod
def clear_species_cache(cls, species_id: int) -> None:
    """Clear entity cache entries that belong to one species id."""
    keys_to_remove = [k for k in GeneticEntity._instance_cache if k[0] == species_id]
    for key in keys_to_remove:
        del GeneticEntity._instance_cache[key]
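The cache-clearing methods above filter a shared dictionary by one position of a tuple key. The sketch below shows that filtering on a plain dict. Only positions 0 (species id) and 3 (entity class) are taken from the source listings; the two middle fields and the placeholder classes ToyGene and ToyHaplotype are assumptions made for illustration.

```python
from typing import Dict, Tuple


class ToyGene:
    """Hypothetical placeholder entity class."""


class ToyHaplotype:
    """Hypothetical placeholder entity class."""


# Assumed key layout: (species_id, structure_name, entity_name, entity_class)
CacheKey = Tuple[int, str, str, type]


def make_cache() -> Dict[CacheKey, object]:
    return {
        (1, "locus_A", "A1", ToyGene): object(),
        (1, "chr1", "A1/B1", ToyHaplotype): object(),
        (2, "locus_A", "A1", ToyGene): object(),
    }


def clear_by_class(cache: Dict[CacheKey, object], cls: type) -> None:
    # Collect keys first; deleting while iterating a dict raises RuntimeError
    for key in [k for k in cache if k[3] == cls]:
        del cache[key]


def clear_by_species(cache: Dict[CacheKey, object], species_id: int) -> None:
    for key in [k for k in cache if k[0] == species_id]:
        del cache[key]


by_class = make_cache()
clear_by_class(by_class, ToyGene)

by_species = make_cache()
clear_by_species(by_species, 1)

print(len(by_class), len(by_species))
```

Because the class comparison is an equality test on the class object itself, clearing one class in this sketch leaves entries of its subclasses in place.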

Gene

Gene(name: str, locus: Optional[Locus] = None, **kwargs: Any)

Bases: GeneticEntity[Locus]

Represents a single allele at a genetic locus.

A Gene must be bound to a Locus and is automatically registered upon creation. Creating a Gene with the same name under the same Locus returns the same instance. Gene is also available under the alias Allele.

Attributes:

name (str): The name of the gene.
locus (Locus): The locus structure this gene is bound to.

Examples:

>>> locus = Locus("A")
>>> gene1 = Gene("A1", locus=locus)
>>> gene2 = Gene("A1", locus=locus)
>>> assert gene1 is gene2
Source code in src/natal/genetic_entities.py
def __init__(
    self,
    name: str,
    locus: Optional[Locus] = None,
    **kwargs: Any
):
    # Prevent re-initialization of cached instances
    if hasattr(self, "_initialized") and self._initialized:
        return

    if locus is None:
        raise TypeError("Gene must be bound to a Locus. Please provide locus parameter.")

    # Set locus alias
    self.locus = locus

    # Store custom parameters as attributes
    for key, value in kwargs.items():
        setattr(self, key, value)

    # Validate name format
    if not validate_name(name):
        raise ValueError(f"Invalid gene name format: '{name}'. "
                         f"Gene names must contain only letters, numbers, and underscores.")

    # Check for duplicate gene names in the species
    species = locus.species
    if species is not None:
        # Use the public has_gene method to check for existing gene
        if hasattr(species, 'has_gene') and species.has_gene(name):
            # If has_gene returns True, the gene exists
            existing_gene = species.get_gene(name)
            if existing_gene:
                raise ValueError(
                    f"Duplicate gene name '{name}' found in species. "
                    f"Gene names must be unique for string-based lookups. "
                    f"Found at locus '{existing_gene.locus.name}' and '{locus.name}'."
                )

    # Call parent constructor which handles registration
    super().__init__(name, structure=locus)

Haplotype

Haplotype(chromosome: Optional[Chromosome] = None, genes: Optional[List[Gene]] = None, **kwargs: Any)

Bases: GeneticEntity[Chromosome]

Represents a haplotype - genes on a single chromosome from one parent.

A Haplotype is bound to a Chromosome structure and contains a list of Genes, one for each Locus in the Chromosome (unless the chromosome explicitly allows incomplete haplotypes). The same gene combination under the same Chromosome structure returns the same instance.

Attributes:

chromosome (Chromosome): Bound chromosome structure.
genes (List[Gene]): One gene per locus in chromosome order.
linkage (Chromosome): Backward-compatible alias for chromosome.

Source code in src/natal/genetic_entities.py
def __init__(
    self,
    chromosome: Optional[Chromosome] = None,
    genes: Optional[List[Gene]] = None,
    **kwargs: Any
):
    # Prevent re-initialization of cached instances
    if hasattr(self, "_initialized") and self._initialized:
        return

    if chromosome is None:
        raise TypeError("Haplotype must be bound to a Chromosome. Please provide chromosome parameter.")
    if genes is None:
        raise TypeError("Haplotype requires a genes list. Please provide genes parameter.")

    # Validate completeness and uniqueness
    chrom_loci = chromosome.loci  # List of loci in chromosome

    # Check 1: All genes must belong to this chromosome
    chrom_loci_set = set(chrom_loci)
    for gene in genes:
        if gene.locus not in chrom_loci_set:
            raise ValueError(
                f"Gene {gene.name!r} at locus {gene.locus.name!r} "
                f"is not part of chromosome {chromosome.name!r}."
            )

    # Check 2: No duplicate loci (each locus can only have one gene)
    seen_loci: set[Locus] = set()
    for gene in genes:
        if gene.locus in seen_loci:
            raise ValueError(
                f"Duplicate locus {gene.locus.name!r} in haplotype. "
                f"Each locus can only have one gene in a haplotype."
            )
        seen_loci.add(gene.locus)

    # Check 3: Completeness - must cover all loci (with exceptions)
    missing_loci = set(chrom_loci) - seen_loci
    if missing_loci:
        # Check if this is allowed (e.g., sex chromosomes)
        if not getattr(chromosome, '_allow_incomplete_haplotype', False):
            missing_names = [locus.name for locus in missing_loci]
            raise ValueError(
                f"Incomplete haplotype for chromosome {chromosome.name!r}. "
                f"Missing genes for loci: {missing_names}. "
                f"All loci must be covered unless chromosome allows incomplete haplotypes."
            )

    # Set attributes
    self.chromosome = chromosome
    self.genes = genes

    # Alias for backward compatibility
    self.linkage = chromosome

    # Store custom parameters as attributes
    for key, value in kwargs.items():
        setattr(self, key, value)

    # Generate a name from gene names for identification
    gene_names = "/".join(g.name for g in genes)

    # Call parent constructor which handles registration
    super().__init__(name=gene_names, structure=chromosome)
get_gene_at_locus
get_gene_at_locus(locus: Locus) -> Optional[Gene]

Get the gene at a specific locus.

Source code in src/natal/genetic_entities.py
def get_gene_at_locus(self, locus: Locus) -> Optional[Gene]:
    """Get the gene at a specific locus."""
    for gene in self.genes:
        if gene.locus is locus:
            return gene
    return None
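The Haplotype constructor runs three checks in order: membership, no duplicate loci, and completeness. The sketch below reproduces that sequence on plain strings so it can be run without the library. The function check_haplotype and its tuple-based inputs are hypothetical simplifications.

```python
from typing import Sequence, Set, Tuple


def check_haplotype(
    chrom_loci: Sequence[str],
    genes: Sequence[Tuple[str, str]],  # (gene_name, locus_name) pairs
    allow_incomplete: bool = False,
) -> None:
    """Raise ValueError if the genes do not form a valid haplotype."""
    loci_set = set(chrom_loci)

    # Check 1: every gene must sit on a locus of this chromosome
    for gene_name, locus_name in genes:
        if locus_name not in loci_set:
            raise ValueError(f"Gene {gene_name!r} is not on this chromosome.")

    # Check 2: at most one gene per locus
    seen: Set[str] = set()
    for _, locus_name in genes:
        if locus_name in seen:
            raise ValueError(f"Duplicate locus {locus_name!r} in haplotype.")
        seen.add(locus_name)

    # Check 3: every locus is covered, unless incompleteness is allowed
    missing = loci_set - seen
    if missing and not allow_incomplete:
        raise ValueError(f"Missing genes for loci: {sorted(missing)}.")


def fails(*args, **kwargs) -> bool:
    """Helper for the demo: True if check_haplotype raises ValueError."""
    try:
        check_haplotype(*args, **kwargs)
    except ValueError:
        return True
    return False


loci = ["A", "B"]
print(fails(loci, [("A1", "A"), ("B2", "B")]))   # valid haplotype
print(fails(loci, [("A1", "A"), ("C1", "C")]))   # foreign locus
print(fails(loci, [("A1", "A"), ("A2", "A")]))   # duplicate locus
print(fails(loci, [("A1", "A")]))                # incomplete
```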

HaploidGenotype

HaploidGenotype(species: Optional[Species] = None, haplotypes: Optional[List[Haplotype]] = None, **kwargs: Any)

Bases: GeneticEntity[Species]

Represents a complete haploid genome - all haplotypes from one parent.

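A HaploidGenotype is bound to a Species and contains one Haplotype for each Chromosome in the Species. When the species defines sex chromosome groups, the constructor instead requires exactly one chromosome from each group. The same haplotype combination under the same Species returns the same instance.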

Attributes:

species (Species): Bound species structure.
haplotypes (List[Haplotype]): One haplotype per required chromosome.
genome (Species): Backward-compatible alias for species.
chromosomes (List[Haplotype]): Backward-compatible alias for haplotypes.

This class is also exported as HaploidGenome for backward compatibility.

Source code in src/natal/genetic_entities.py
def __init__(
    self,
    species: Optional[Species] = None,
    haplotypes: Optional[List[Haplotype]] = None,
    **kwargs: Any
):
    # Prevent re-initialization of cached instances
    if hasattr(self, "_initialized") and self._initialized:
        return

    if species is None:
        raise TypeError("HaploidGenotype must be bound to a Species. Please provide species parameter.")
    if haplotypes is None:
        raise TypeError("HaploidGenotype requires haplotypes. Please provide haplotypes parameter.")

    # Validate completeness and uniqueness
    species_chroms = species.chromosomes  # List of chromosomes

    # Check 1: All haplotypes must belong to this species
    species_chroms_set = set(species_chroms)
    for hap in haplotypes:
        if hap.chromosome not in species_chroms_set:
            raise ValueError(
                f"Haplotype for chromosome {hap.chromosome.name!r} "
                f"is not part of species {species.name!r}."
            )

    # Check 2: No duplicate chromosomes (each chromosome can only have one haplotype)
    seen_chroms: set[Chromosome] = set()
    for hap in haplotypes:
        if hap.chromosome in seen_chroms:
            raise ValueError(
                f"Duplicate chromosome {hap.chromosome.name!r} in haploid genotype. "
                f"Each chromosome can only have one haplotype in a haploid genotype."
            )
        seen_chroms.add(hap.chromosome)

    # Check 3: Completeness - must cover required chromosomes (with exceptions)
    # Prefer public API; keep a compatibility fallback for legacy objects.
    get_groups = getattr(species, 'get_sex_chromosome_groups', None)
    if callable(get_groups):
        sex_chr_groups = get_groups()
    else:
        sex_chr_groups = getattr(species, '_sex_chromosome_groups', None)

    if sex_chr_groups:
        sex_chr_groups = cast(Dict[str, List[Chromosome]], sex_chr_groups)
        # For sex chromosomes: must have exactly one from each group
        for group_name, group_chroms in sex_chr_groups.items():
            group_chroms_set = set(group_chroms)
            present_in_group = [c for c in seen_chroms if c in group_chroms_set]

            if len(present_in_group) == 0:
                group_names = [c.name for c in group_chroms]
                raise ValueError(
                    f"Missing chromosome from {group_name} group. "
                    f"Must have exactly one of: {group_names}"
                )
            elif len(present_in_group) > 1:
                present_names = [c.name for c in present_in_group]
                raise ValueError(
                    f"Multiple chromosomes from {group_name} group: {present_names}. "
                    f"Can only have one."
                )
    else:
        # No sex chromosomes: must have all chromosomes
        missing_chroms = set(species_chroms) - seen_chroms
        if missing_chroms:
            missing_names = [c.name for c in missing_chroms]
            raise ValueError(
                f"Incomplete haploid genotype for species {species.name!r}. "
                f"Missing haplotypes for chromosomes: {missing_names}. "
                f"All chromosomes must be covered."
            )

    # Set attributes
    self.species = species
    self.haplotypes = haplotypes

    # Aliases for backward compatibility
    self.genome = species
    self.chromosomes = haplotypes

    # Store custom parameters as attributes
    for key, value in kwargs.items():
        setattr(self, key, value)

    # Generate a canonical, species-parsable name from haplotype names
    # Each haplotype name already uses "/" between alleles; join haplotypes with ";"
    hap_names = ";".join(h.name for h in haplotypes)

    # Call parent constructor which handles registration
    super().__init__(name=hap_names, structure=species)
to_string
to_string() -> str

Return species-parsable string for this haploid genotype.

Source code in src/natal/genetic_entities.py
def to_string(self) -> str:
    """Return species-parsable string for this haploid genotype."""
    return self.name
get_haplotype_for_chromosome
get_haplotype_for_chromosome(chromosome: Chromosome) -> Haplotype

Get the haplotype for a specific chromosome.

Source code in src/natal/genetic_entities.py
def get_haplotype_for_chromosome(self, chromosome: Chromosome) -> Haplotype:
    """Get the haplotype for a specific chromosome."""
    for hap in self.haplotypes:
        if hap.chromosome is chromosome:
            return hap
    raise ValueError(
        f"Chromosome {chromosome.name!r} not found in haploid genotype for species {self.species.name!r}."
    )
from_str classmethod
from_str(species: Species, haploid_str: str) -> HaploidGenotype

Create a HaploidGenotype from string by delegating to Species parser.

Keeps a convenient factory on the entity class for callers who prefer HaploidGenotype.from_str(species, s) instead of calling the Species parser directly.

Source code in src/natal/genetic_entities.py
@classmethod
def from_str(cls, species: Species, haploid_str: str) -> HaploidGenotype:
    """
    Create a HaploidGenotype from string by delegating to Species parser.

    Keeps a convenient factory on the entity class for callers who prefer
    `HaploidGenotype.from_str(species, s)` instead of calling the Species
    parser directly.
    """
    return species.get_haploid_genotype_from_str(haploid_str)
get_chromosome_for_linkage
get_chromosome_for_linkage(linkage: Chromosome) -> Optional[Haplotype]

Alias for get_haplotype_for_chromosome (backward compatibility).

Source code in src/natal/genetic_entities.py
def get_chromosome_for_linkage(self, linkage: Chromosome) -> Optional[Haplotype]:
    """Alias for get_haplotype_for_chromosome (backward compatibility)."""
    return self.get_haplotype_for_chromosome(linkage)
get_gene_at_locus
get_gene_at_locus(locus: Locus) -> Optional[Gene]

Get the gene at a specific locus across all haplotypes.

Source code in src/natal/genetic_entities.py
def get_gene_at_locus(self, locus: Locus) -> Optional[Gene]:
    """Get the gene at a specific locus across all haplotypes."""
    for hap in self.haplotypes:
        gene = hap.get_gene_at_locus(locus)
        if gene is not None:
            return gene
    return None
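The sex-chromosome rule in the HaploidGenotype constructor ("exactly one chromosome from each group") can be sketched on plain strings. The function check_sex_groups and the example group name are hypothetical, and the sketch mirrors only the group rule, not the other constructor checks.

```python
from typing import Dict, List, Set


def check_sex_groups(present: Set[str], groups: Dict[str, List[str]]) -> None:
    """Raise ValueError unless exactly one chromosome per group is present."""
    for group_name, members in groups.items():
        found = sorted(c for c in present if c in members)
        if len(found) == 0:
            raise ValueError(
                f"Missing chromosome from {group_name} group; "
                f"must have exactly one of {members}."
            )
        if len(found) > 1:
            raise ValueError(
                f"Multiple chromosomes from {group_name} group: {found}."
            )


def group_check_fails(present: Set[str], groups: Dict[str, List[str]]) -> bool:
    """Helper for the demo: True if check_sex_groups raises ValueError."""
    try:
        check_sex_groups(present, groups)
    except ValueError:
        return True
    return False


groups = {"sex": ["X", "Y"]}
print(group_check_fails({"chr1", "X"}, groups))       # one member: accepted
print(group_check_fails({"chr1"}, groups))            # none: rejected
print(group_check_fails({"chr1", "X", "Y"}, groups))  # both: rejected
```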

Genotype

Genotype(species: Species, maternal: HaploidGenotype, paternal: HaploidGenotype)

Represents a diploid genotype consisting of two haploid genomes.

A Genotype pairs two HaploidGenotypes (maternal and paternal) that are both bound to the same Species structure. The distinction between maternal and paternal origin is preserved for modeling phenomena like maternal effects, cytoplasmic inheritance, and genomic imprinting.

Attributes:

species (Species): Species shared by maternal and paternal haploid genomes.
maternal (HaploidGenotype): Maternal haploid genotype.
paternal (HaploidGenotype): Paternal haploid genotype.
genome (Species): Backward-compatible alias for species.
name (str): Canonical species-parsable genotype string.

Note: Genotype uses identity comparison (is) since instances are cached.

This class is also exported as Genome, DiploidGenome, and DiploidGenotype.

Source code in src/natal/genetic_entities.py
def __init__(
    self,
    species: Species,
    maternal: HaploidGenotype,
    paternal: HaploidGenotype
):
    # Prevent re-initialization of cached instances
    if hasattr(self, '_initialized') and self._initialized:
        return



    # Validate both haploid genomes are bound to the same species
    if maternal.species is not species or paternal.species is not species:
        raise ValueError("Both haploid genomes must be bound to the same species.")

    self.species = species
    self.maternal = maternal
    self.paternal = paternal

    # Alias for backward compatibility
    self.genome = species

    # Cache for gamete frequencies (Mendelian only)
    # Single cache entry per genotype
    self._gamete_cache: Optional[Dict[HaploidGenotype, float]] = None

    self._initialized = True

    # Cache the instance AFTER successful initialization
    if hasattr(self, '_pending_cache_key'):
        cls = self.__class__
        cls._cache[self._pending_cache_species][self._pending_cache_key] = self
        del self._pending_cache_species
        del self._pending_cache_key

    # Set canonical name for this genotype (species-parsable)
    try:
        self.name = self.to_string()
    except Exception:
        # Fallback: keep existing cache-key name if to_string fails
        self.name = getattr(self, 'name', None)
get_alleles_at_locus
get_alleles_at_locus(locus: Locus) -> Tuple[Optional[Gene], Optional[Gene]]

Get the pair of alleles at a specific locus.

Returns:

Tuple[Optional[Gene], Optional[Gene]]: Tuple of (maternal_allele, paternal_allele).

Source code in src/natal/genetic_entities.py
def get_alleles_at_locus(self, locus: Locus) -> Tuple[Optional[Gene], Optional[Gene]]:
    """
    Get the pair of alleles at a specific locus.

    Returns:
        Tuple of (maternal_allele, paternal_allele)
    """
    mat_gene = self.maternal.get_gene_at_locus(locus)
    pat_gene = self.paternal.get_gene_at_locus(locus)
    return (mat_gene, pat_gene)
is_homozygous_at
is_homozygous_at(locus: Locus) -> bool

Check if the genotype is homozygous at a given locus.

Source code in src/natal/genetic_entities.py
def is_homozygous_at(self, locus: Locus) -> bool:
    """Check if the genotype is homozygous at a given locus."""
    mat, pat = self.get_alleles_at_locus(locus)
    # Since entities are cached, we can use identity comparison
    return mat is pat
is_heterozygous_at
is_heterozygous_at(locus: Locus) -> bool

Check if the genotype is heterozygous at a given locus.

Source code in src/natal/genetic_entities.py
def is_heterozygous_at(self, locus: Locus) -> bool:
    """Check if the genotype is heterozygous at a given locus."""
    return not self.is_homozygous_at(locus)
produce_gametes
produce_gametes() -> Dict[HaploidGenotype, float]

Generate all possible gametes (haploid genotypes) from this diploid genotype, along with their theoretical Mendelian frequencies.

This method computes pure Mendelian segregation based on recombination rates. No gene drives or other modifiers are applied - this is the baseline calculation.

For gene drives, gamete selection, or other modifications, use Population-level gamete modifiers via Population.set_gamete_modifier().

Recombination behavior is controlled by the Species's RecombinationMap:
  • If recombination rates are defined and non-zero, recombinant haplotypes will be generated with appropriate frequencies.
  • If recombination rates are zero or undefined, only parental haplotypes are produced (simple Mendelian segregation).

For chromosomes where maternal and paternal haplotypes are identical, produces only 1 gamete (the identical haplotype) with frequency 1.0.

Returns:

Dict[HaploidGenotype, float]: Dict mapping HaploidGenotype instances to their theoretical frequencies. All frequencies sum to 1.0.

Examples:

>>> # Get Mendelian gamete frequencies
>>> gametes = genotype.produce_gametes()
>>> sum(gametes.values())  # → 1.0
>>> for haploid_genotype, freq in gametes.items():
...     print(f"{haploid_genotype}: {freq:.3f}")
Note

Results are cached for performance. Each genotype has one cached result.

If you modify the recombination rates after calling this method, you must manually clear the cache by setting the genotype's _gamete_cache attribute to None. Best practice: set recombination rates during Chromosome construction.
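The final step of produce_gametes combines per-chromosome haplotype frequencies with the multiplication rule. A minimal sketch of that step is shown here, using strings in place of Haplotype objects and tuples in place of HaploidGenotype instances. The function combine_chromosomes is a hypothetical name, and `math.prod` stands in for the `np.prod` call used in the source.

```python
import itertools
import math
from typing import Dict, List, Tuple


def combine_chromosomes(
    per_chromosome: List[Dict[str, float]],
) -> Dict[Tuple[str, ...], float]:
    """Combine independent per-chromosome frequencies into gamete frequencies."""
    gametes: Dict[Tuple[str, ...], float] = {}
    # One (haplotype, frequency) pair is drawn from each chromosome
    for combo in itertools.product(*(d.items() for d in per_chromosome)):
        haplotypes = tuple(hap for hap, _ in combo)
        frequency = math.prod(freq for _, freq in combo)
        # Accumulate in case two combinations yield the same gamete
        gametes[haplotypes] = gametes.get(haplotypes, 0.0) + frequency
    return gametes


per_chromosome = [
    {"A1/B1": 0.5, "A2/B2": 0.5},  # heterozygous chromosome, no recombination
    {"C1": 1.0},                   # homozygous chromosome
    {"D1": 0.5, "D2": 0.5},        # second heterozygous chromosome
]
gametes = combine_chromosomes(per_chromosome)
for gamete, freq in gametes.items():
    print(";".join(gamete), freq)
```

Since each per-chromosome distribution sums to 1.0, the combined gamete frequencies also sum to 1.0.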

Source code in src/natal/genetic_entities.py
def produce_gametes(self) -> Dict[HaploidGenotype, float]:
    """
    Generate all possible gametes (haploid genotypes) from this diploid genotype,
    along with their theoretical Mendelian frequencies.

    This method computes pure Mendelian segregation based on recombination rates.
    No gene drives or other modifiers are applied - this is the baseline calculation.

    For gene drives, gamete selection, or other modifications, use Population-level
    gamete modifiers via `Population.set_gamete_modifier()`.

    Recombination behavior is controlled by the Species's RecombinationMap:
    - If recombination rates are defined and non-zero, recombinant haplotypes
      will be generated with appropriate frequencies.
    - If recombination rates are zero or undefined, only parental haplotypes
      are produced (simple Mendelian segregation).

    For chromosomes where maternal and paternal haplotypes are identical,
    produces only 1 gamete (the identical haplotype) with frequency 1.0.

    Returns:
        Dict mapping HaploidGenotype instances to their theoretical frequencies.
        All frequencies sum to 1.0.

    Examples:
        >>> # Get Mendelian gamete frequencies
        >>> gametes = genotype.produce_gametes()
        >>> sum(gametes.values())  # → 1.0
        >>> for haploid_genotype, freq in gametes.items():
        ...     print(f"{haploid_genotype}: {freq:.3f}")

    Note:
        Results are cached for performance. Each genotype has one cached result.

        If you modify the recombination rates after calling this method,
        you must manually clear the cache by setting `self._gamete_cache = None`.
        Best practice: set recombination rates during Chromosome construction.
    """
    # Check cache first
    if self._gamete_cache is not None:
        return self._gamete_cache

    # Dictionary to accumulate gamete frequencies
    # Key: chromosome/group index -> Dict[haplotype, frequency]
    chromosome_gamete_frequencies: List[Dict[Haplotype, float]] = []

    def _find_haplotype_in_group(
        haploid: HaploidGenotype,
        group_chromosomes: List[Chromosome],
    ) -> Optional[Haplotype]:
        """Return the unique haplotype from a sex-chromosome group, if present."""
        found: Optional[Haplotype] = None
        for group_chromosome in group_chromosomes:
            try:
                current = haploid.get_haplotype_for_chromosome(group_chromosome)
            except ValueError:
                continue

            if found is not None and found is not current:
                raise ValueError(
                    "Haploid genotype contains multiple chromosomes from the same sex group."
                )
            found = current
        return found

    sex_groups: Optional[Dict[str, List[Chromosome]]] = None
    get_groups = getattr(self.species, "get_sex_chromosome_groups", None)
    if callable(get_groups):
        sex_groups = cast(Optional[Dict[str, List[Chromosome]]], get_groups())

    sex_chromosomes: set[Chromosome] = set()
    if sex_groups:
        for group in sex_groups.values():
            sex_chromosomes.update(group)

    # For each autosome, compute possible haplotypes and frequencies.
    for chromosome in self.species.chromosomes:
        if chromosome in sex_chromosomes:
            continue

        mat_haplotype = self.maternal.get_haplotype_for_chromosome(chromosome)
        pat_haplotype = self.paternal.get_haplotype_for_chromosome(chromosome)

        if mat_haplotype is pat_haplotype:
            # Homozygous chromosome - only one gamete type (frequency 1.0)
            chromosome_gamete_frequencies.append({mat_haplotype: 1.0})
        else:
            # Heterozygous autosome
            if self._should_use_recombination(chromosome):
                frequencies = self._compute_recombinant_haplotypes_for_chromosome(
                    mat_haplotype, pat_haplotype, chromosome
                )
                chromosome_gamete_frequencies.append(frequencies)
            else:
                chromosome_gamete_frequencies.append({
                    mat_haplotype: 0.5,
                    pat_haplotype: 0.5,
                })

    # For each sex-chromosome group, choose one maternal and one paternal
    # haplotype from that group (e.g., X/Y in XY systems).
    if sex_groups:
        for group_name, group_chromosomes in sex_groups.items():
            mat_haplotype = _find_haplotype_in_group(self.maternal, group_chromosomes)
            pat_haplotype = _find_haplotype_in_group(self.paternal, group_chromosomes)

            if mat_haplotype is None and pat_haplotype is None:
                continue
            if mat_haplotype is None or pat_haplotype is None:
                raise ValueError(
                    f"Incomplete sex chromosome pair in group '{group_name}' for genotype '{self}'."
                )

            if mat_haplotype is pat_haplotype:
                chromosome_gamete_frequencies.append({mat_haplotype: 1.0})
                continue

            # Recombination only applies when both haplotypes are from the same
            # chromosome. For X/Y or Z/W pairs, use simple Mendelian segregation.
            if mat_haplotype.chromosome is pat_haplotype.chromosome and self._should_use_recombination(mat_haplotype.chromosome):
                frequencies = self._compute_recombinant_haplotypes_for_chromosome(
                    mat_haplotype,
                    pat_haplotype,
                    mat_haplotype.chromosome,
                )
                chromosome_gamete_frequencies.append(frequencies)
            else:
                chromosome_gamete_frequencies.append({
                    mat_haplotype: 0.5,
                    pat_haplotype: 0.5,
                })

    if not chromosome_gamete_frequencies:
        raise ValueError("Cannot produce gametes: no chromosome haplotypes available in genotype.")

    # Combine chromosome gametes using the multiplication rule
    # Each gamete is a combination of one haplotype per chromosome
    gamete_combinations: List[Tuple[Tuple[Haplotype, float], ...]] = list(
        itertools.product(*[tuple(d.items()) for d in chromosome_gamete_frequencies])
    )

    # Build gamete frequencies: Dict[HaploidGenotype, float]
    gamete_freqs: Dict[HaploidGenotype, float] = {}
    for combination in gamete_combinations:
        # combination is a tuple of (haplotype, frequency) pairs per chromosome
        haplotypes = [hap for hap, _ in combination]
        frequency = float(np.prod([freq for _, freq in combination]))

        # Create HaploidGenotype from haplotypes
        haploid_genotype = HaploidGenotype(species=self.species, haplotypes=haplotypes)

        if haploid_genotype in gamete_freqs:
            gamete_freqs[haploid_genotype] += frequency
        else:
            gamete_freqs[haploid_genotype] = frequency

    # Cache the result (single cache per genotype)
    self._gamete_cache = gamete_freqs

    return gamete_freqs
to_string
to_string() -> str

Return a species-parsable string representation of this genotype.

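Format: "<maternal_hap_str>|<paternal_hap_str>", where each hap_str is a semicolon-separated list of chromosome haplotype allele lists, and alleles on a chromosome are joined with '/'. Note that the source listing for this method assembles the string chromosome by chromosome: each chromosome contributes "<maternal_alleles>|<paternal_alleles>", and these per-chromosome pairs are joined with ';'. The two descriptions agree for single-chromosome species.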

Source code in src/natal/genetic_entities.py
def to_string(self) -> str:
    """
    Return a species-parsable string representation of this genotype.

    Format: "<maternal_hap_str>|<paternal_hap_str>"
    where each hap_str is a semicolon-separated list of chromosome haplotype
    allele lists, and alleles on a chromosome are joined with '/'.
    """
    species = self.species

    # For each chromosome produce "maternal_part|paternal_part"
    chrom_pairs: List[str] = []
    for chrom in species.chromosomes:
        mat_hap = self.maternal.get_haplotype_for_chromosome(chrom)
        pat_hap = self.paternal.get_haplotype_for_chromosome(chrom)

        def hap_allele_str(
            hap: Optional[Haplotype],
            loci: List[Locus] = chrom.loci,
        ) -> str:
            if hap is None:
                return ""
            names: List[str] = []
            for locus in loci:
                gene = hap.get_gene_at_locus(locus)
                names.append(gene.name if gene is not None else "")
            return "/".join(names)

        mat_str = hap_allele_str(mat_hap)
        pat_str = hap_allele_str(pat_hap)
        chrom_pairs.append(f"{mat_str}|{pat_str}")

    return ";".join(chrom_pairs)
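The separator conventions used in this module ("/" between alleles on a chromosome, ";" between chromosomes, "|" between the maternal and paternal side) can be sketched on plain lists of allele names. The helper names are hypothetical, and genotype_string follows the per-chromosome assembly seen in the to_string source above.

```python
from typing import List, Sequence, Tuple


def haplotype_name(alleles: Sequence[str]) -> str:
    """Alleles on one chromosome are joined with '/'."""
    return "/".join(alleles)


def haploid_name(haplotypes: Sequence[Sequence[str]]) -> str:
    """Haplotypes of one haploid genotype are joined with ';'."""
    return ";".join(haplotype_name(h) for h in haplotypes)


def genotype_string(pairs: Sequence[Tuple[Sequence[str], Sequence[str]]]) -> str:
    """Each chromosome contributes 'maternal|paternal'; pairs are joined with ';'."""
    return ";".join(
        f"{haplotype_name(mat)}|{haplotype_name(pat)}" for mat, pat in pairs
    )


maternal: List[List[str]] = [["A1", "B1"], ["C1"]]
paternal: List[List[str]] = [["A2", "B2"], ["C1"]]

print(haploid_name(maternal))                         # A1/B1;C1
print(genotype_string(list(zip(maternal, paternal)))) # A1/B1|A2/B2;C1|C1
```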

compute_recombinant_haplotypes

compute_recombinant_haplotypes(n_loci: int, recombination_rates: ndarray, start_maternal: bool = True) -> Tuple[np.ndarray, np.ndarray]

Compute all possible recombinant haplotype patterns and their frequencies.

Abstract problem: Given a sequence of loci [0, 1, 2, ..., n_loci-1] with recombination rates between adjacent loci, enumerate all crossover patterns and produce the resulting haplotype pattern (which chain at each locus).

Parameters:

n_loci (int, required): Number of loci (>= 1).
recombination_rates (ndarray, required): Shape (n_loci - 1,). recombination_rates[i] is the rate between locus i and locus i+1.
start_maternal (bool, default True): If True, start from the maternal chain (0); otherwise start from the paternal chain (1).

Returns:

haplotype_patterns (ndarray): Shape (2^(n_loci-1), n_loci). Each row is a 0/1 sequence: 0 = maternal allele at that locus, 1 = paternal allele.
frequencies (ndarray): Shape (2^(n_loci-1),). Frequency of each pattern.

Examples:

>>> n_loci = 3
>>> recomb_rates = np.array([0.1, 0.2])  # rate between 0-1 and 1-2
>>> patterns, freqs = compute_recombinant_haplotypes(n_loci, recomb_rates, True)
>>> patterns
array([[0, 0, 0],   # No crossovers: mat, mat, mat
       [0, 1, 1],   # Crossover between loci 0 and 1: mat, pat, pat
       [0, 0, 1],   # Crossover between loci 1 and 2: mat, mat, pat
       [0, 1, 0]], dtype=int64)  # Two crossovers: mat, pat, mat
>>> freqs
array([0.72, 0.08, 0.18, 0.02])  # 0.9*0.8, 0.1*0.8, 0.9*0.2, 0.1*0.2
Source code in src/natal/genetic_entities.py
def compute_recombinant_haplotypes(
    n_loci: int,
    recombination_rates: np.ndarray,
    start_maternal: bool = True
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Compute all possible recombinant haplotype patterns and their frequencies.

    Abstract problem: Given a sequence of loci [0, 1, 2, ..., n_loci-1] with
    recombination rates between adjacent loci, enumerate all crossover patterns
    and produce the resulting haplotype pattern (which chain at each locus).

    Args:
        n_loci: Number of loci (>= 1)
        recombination_rates: Shape (n_loci - 1,). recombination_rates[i] = rate between locus i and i+1
        start_maternal: If True, start from maternal chain (0); else paternal (1)

    Returns:
        haplotype_patterns: Shape (2^(n_loci-1), n_loci). Each row is a 0/1 sequence:
                            0 = maternal allele at that locus, 1 = paternal allele
        frequencies: Shape (2^(n_loci-1),). Frequency of each pattern.

    Examples:
        >>> n_loci = 3
        >>> recomb_rates = np.array([0.1, 0.2])  # rate between 0-1 and 1-2
        >>> patterns, freqs = compute_recombinant_haplotypes(n_loci, recomb_rates, True)
        >>> patterns
        array([[0, 0, 0],   # No crossovers: mat, mat, mat
               [0, 1, 1],   # Crossover between loci 0 and 1: mat, pat, pat
               [0, 0, 1],   # Crossover between loci 1 and 2: mat, mat, pat
               [0, 1, 0]], dtype=int64)  # Two crossovers: mat, pat, mat
        >>> freqs
        array([0.72, 0.08, 0.18, 0.02])  # 0.9*0.8, 0.1*0.8, 0.9*0.2, 0.1*0.2
    """
    if n_loci < 1:
        raise ValueError("n_loci must be >= 1")

    if n_loci == 1:
        patterns = np.array([[int(not start_maternal)]], dtype=np.int64)
        frequencies = np.array([1.0], dtype=np.float64)
        return patterns, frequencies

    n_boundaries = n_loci - 1
    n_patterns = 2 ** n_boundaries

    patterns = np.zeros((n_patterns, n_loci), dtype=np.int64)
    frequencies = np.zeros(n_patterns, dtype=np.float64)

    for pattern_idx in range(n_patterns):
        current_chain = 0 if start_maternal else 1
        frequency = 1.0
        patterns[pattern_idx, 0] = current_chain

        for boundary_idx in range(n_boundaries):
            has_crossover = (pattern_idx >> boundary_idx) & 1
            recomb_rate = recombination_rates[boundary_idx]

            if has_crossover:
                frequency *= recomb_rate
                current_chain = 1 - current_chain
            else:
                frequency *= (1.0 - recomb_rate)

            patterns[pattern_idx, boundary_idx + 1] = current_chain

        frequencies[pattern_idx] = frequency

    return patterns, frequencies
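The same enumeration can be written without explicit Python loops. The following is a minimal vectorized sketch, assuming only NumPy; `crossover_patterns` is a hypothetical name and an illustrative re-implementation, not the module's code. It uses the same convention as the listing above: bit b of the pattern index marks a crossover at boundary b.

```python
import numpy as np


def crossover_patterns(rates, start_maternal=True):
    """Enumerate crossover patterns over len(rates) + 1 loci (illustrative)."""
    rates = np.asarray(rates, dtype=np.float64)
    n_boundaries = rates.size
    n_patterns = 2 ** n_boundaries

    # Bit b of the pattern index says whether boundary b has a crossover.
    idx = np.arange(n_patterns)[:, None]
    crossovers = (idx >> np.arange(n_boundaries)) & 1

    # The chain at each later locus is the start chain, flipped once for
    # every crossover that occurred before it.
    start = 0 if start_maternal else 1
    chains = (start + np.cumsum(crossovers, axis=1)) % 2
    first = np.full((n_patterns, 1), start)
    patterns = np.hstack([first, chains]).astype(np.int64)

    # Each boundary contributes r if crossed and 1 - r otherwise.
    freqs = np.prod(np.where(crossovers == 1, rates, 1.0 - rates), axis=1)
    return patterns, freqs


patterns, freqs = crossover_patterns([0.1, 0.2])
print(patterns.tolist())  # [[0, 0, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0]]
print(np.round(freqs, 2).tolist())  # [0.72, 0.08, 0.18, 0.02]
```

Because every boundary is either crossed or not, the frequencies of all 2^(n_loci-1) patterns sum to 1, which makes a convenient sanity check for either implementation.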

compute_recombinant_haplotypes_with_alleles

compute_recombinant_haplotypes_with_alleles(maternal_alleles: List[str], paternal_alleles: List[str], recombination_rates: ndarray, start_maternal: bool = True) -> Dict[str, float]

Compute recombinant haplotypes with actual allele symbols.

Given maternal and paternal allele sequences, compute all recombinant haplotypes considering recombination rates, and return them as strings mapped to their frequencies.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| maternal_alleles | List[str] | List of allele symbols at each locus (maternal chain) | required |
| paternal_alleles | List[str] | List of allele symbols at each locus (paternal chain) | required |
| recombination_rates | ndarray | Recombination rates between adjacent loci | required |
| start_maternal | bool | Start from maternal (True) or paternal (False) | True |

Returns:

| Type | Description |
| --- | --- |
| Dict[str, float] | Dict mapping haplotype string (e.g., "A1/a2/A3") to frequency |

Source code in src/natal/genetic_entities.py
def compute_recombinant_haplotypes_with_alleles(
    maternal_alleles: List[str],
    paternal_alleles: List[str],
    recombination_rates: np.ndarray,
    start_maternal: bool = True
) -> Dict[str, float]:
    """
    Compute recombinant haplotypes with actual allele symbols.

    Given maternal and paternal allele sequences, compute all recombinant
    haplotypes considering recombination rates, and return them as strings
    mapped to their frequencies.

    Args:
        maternal_alleles: List of allele symbols at each locus (maternal chain)
        paternal_alleles: List of allele symbols at each locus (paternal chain)
        recombination_rates: Recombination rates between adjacent loci
        start_maternal: Start from maternal (True) or paternal (False)

    Returns:
        Dict mapping haplotype string (e.g., "A1/a2/A3") to frequency
    """
    n_loci = len(maternal_alleles)
    if len(paternal_alleles) != n_loci:
        raise ValueError("maternal_alleles and paternal_alleles must have same length")

    # Compute patterns (auto-selects Numba or Python)
    patterns, frequencies = compute_recombinant_haplotypes(
        n_loci, recombination_rates, start_maternal
    )

    # Convert patterns to haplotype strings
    result: Dict[str, float] = {}
    for pattern_idx, pattern in enumerate(patterns):
        alleles = [
            maternal_alleles[i] if chain == 0 else paternal_alleles[i]
            for i, chain in enumerate(pattern)
        ]
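        # Sum rather than overwrite: different crossover patterns produce the
        # same string when both chains carry the same allele at a locus.
        key = "/".join(alleles)
        result[key] = result.get(key, 0.0) + float(frequencies[pattern_idx])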

    return result
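Mapping patterns to allele strings has one subtlety: when both parental chains carry the same allele at a locus, several crossover patterns collapse to the same string, so their frequencies have to be summed. The following is a self-contained sketch of that idea using only the standard library; `recombinant_gametes` is a hypothetical name and not the module's function.

```python
from itertools import product
from typing import Dict, List, Sequence


def recombinant_gametes(
    maternal: List[str],
    paternal: List[str],
    rates: Sequence[float],
    start_maternal: bool = True,
) -> Dict[str, float]:
    """Map each recombinant allele string to its total frequency (illustrative)."""
    if not maternal or len(maternal) != len(paternal):
        raise ValueError("allele lists must be non-empty and of equal length")
    if len(rates) != len(maternal) - 1:
        raise ValueError("need exactly one rate per pair of adjacent loci")

    result: Dict[str, float] = {}
    for crossovers in product((0, 1), repeat=len(rates)):
        chain = 0 if start_maternal else 1
        freq = 1.0
        alleles = [maternal[0] if chain == 0 else paternal[0]]
        for i, (crossed, rate) in enumerate(zip(crossovers, rates), start=1):
            if crossed:
                chain = 1 - chain
                freq *= rate
            else:
                freq *= 1.0 - rate
            alleles.append(maternal[i] if chain == 0 else paternal[i])
        key = "/".join(alleles)
        # Accumulate so that identical strings from different patterns add up.
        result[key] = result.get(key, 0.0) + freq
    return result


# Heterozygous at both loci: two distinct gametes.
het = recombinant_gametes(["A1", "B1"], ["A2", "B2"], [0.1])
# Homozygous at the second locus: both patterns yield "A1/B1".
hom = recombinant_gametes(["A1", "B1"], ["A2", "B1"], [0.1])
print(sorted(het), sorted(hom))  # ['A1/B1', 'A1/B2'] ['A1/B1']
```

In the heterozygous case the two strings carry frequencies of about 0.9 and 0.1. In the homozygous case the single string carries their sum, so the returned frequencies still total 1.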

create_haplotype_from_allele_names

create_haplotype_from_allele_names(chromosome: Chromosome, allele_names: List[str]) -> Haplotype

Create a Haplotype from allele names.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| chromosome | Chromosome | The Chromosome structure this haplotype belongs to. | required |
| allele_names | List[str] | List of allele names, one per locus in order. | required |

Returns:

| Type | Description |
| --- | --- |
| Haplotype | A new Haplotype instance. |

Source code in src/natal/genetic_entities.py
def create_haplotype_from_allele_names(
    chromosome: Chromosome,
    allele_names: List[str]
) -> Haplotype:
    """
    Create a Haplotype from allele names.

    Args:
        chromosome: The Chromosome structure this haplotype belongs to.
        allele_names: List of allele names, one per locus in order.

    Returns:
        A new Haplotype instance.
    """
    if len(allele_names) != len(chromosome.loci):
        raise ValueError(
            f"Number of alleles ({len(allele_names)}) must match "
            f"number of loci ({len(chromosome.loci)}) in chromosome."
        )

    genes: List[Gene] = []
    for locus, allele_name in zip(chromosome.loci, allele_names):
        # Find existing gene or raise error
        matching_genes = [g for g in locus.alleles if g.name == allele_name]
        if not matching_genes:
            raise ValueError(
                f"No allele named {allele_name!r} found at locus {locus.name!r}. "
                f"Available alleles: {[g.name for g in locus.alleles]}"
            )
        genes.append(matching_genes[0])

    return Haplotype(chromosome=chromosome, genes=genes)
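The validation in the listing above follows a simple pattern: check that the number of names matches the number of loci, then resolve each name against the alleles registered at its locus. The following is a minimal sketch of that pattern with a stand-in class. `StubLocus` and `resolve_alleles` are hypothetical and exist only for illustration; they are not natal's Locus, Gene, or Haplotype, whose constructors are not shown here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StubLocus:
    """Hypothetical stand-in for a locus: a name plus its known allele names."""
    name: str
    alleles: List[str] = field(default_factory=list)


def resolve_alleles(
    loci: List[StubLocus], allele_names: List[str]
) -> List[Tuple[str, str]]:
    """Pair each locus with its requested allele, failing on unknown names."""
    if len(allele_names) != len(loci):
        raise ValueError(
            f"Number of alleles ({len(allele_names)}) must match "
            f"number of loci ({len(loci)})."
        )
    resolved: List[Tuple[str, str]] = []
    for locus, allele_name in zip(loci, allele_names):
        if allele_name not in locus.alleles:
            raise ValueError(
                f"No allele named {allele_name!r} found at locus {locus.name!r}. "
                f"Available alleles: {locus.alleles}"
            )
        resolved.append((locus.name, allele_name))
    return resolved


loci = [StubLocus("A", ["A1", "A2"]), StubLocus("B", ["B1", "B2"])]
resolved = resolve_alleles(loci, ["A2", "B1"])
print(resolved)  # [('A', 'A2'), ('B', 'B1')]
```

Looking up existing alleles rather than creating new ones means a typo in an allele name fails with an error instead of silently adding an allele to the locus.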