genetic_entities Module
Genetic entities representing individual genetic elements.
Overview
The genetic_entities module defines the runtime genetic entities that represent individual genetic elements like genes, haplotypes, and genotypes.
Complete Module Reference
natal.genetic_entities
Defines mutable biological entities bound to genetic structures.
This module represents concrete genetic instances such as genes, haplotypes, haploid genomes, and genotypes. Instances are validated against their bound structures and are registered to those structures at creation time.
Note
- Runtime dependency on genetic_structures for architecture definitions
- Entities should not modify their bound structures
- Entity must be bound to a Structure (mandatory binding rule)
GeneticEntity
Bases: Generic[S]
Base class for genetic entities bound to genetic structures.
Entities follow three invariants:
1. An entity must be bound to a structure.
2. An entity auto-registers to its structure at creation time.
3. The same entity name under the same structure resolves to one cached instance.
Attributes:
| Name | Type | Description |
|---|---|---|
| `structure_type` | `type[GeneticStructure[Any]]` | Required bound structure type for subclasses. |
| `name` | `str` | Entity identifier within its bound structure. |
| `structure` | `GeneticStructure[Any]` | Bound structure instance. |
Examples:
gene = Gene("A1", locus=locus_A)     # ✅ Required locus
assert gene in locus_A.all_entities  # ✅ Auto-registered
gene2 = Gene("A1", locus=locus_A)    # ✅ Returns same instance
assert gene is gene2
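The three invariants can be illustrated with a small interning pattern. The sketch below is not the natal implementation: `ToyStructure`, `ToyEntity`, and the cache key layout are hypothetical names chosen for illustration, and the real classes may enforce the same rules differently.

```python
from typing import ClassVar, Dict, List, Tuple


class ToyStructure:
    """Hypothetical stand-in for a genetic structure such as a Locus."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.all_entities: List["ToyEntity"] = []


class ToyEntity:
    """Hypothetical entity interned per (class, structure, name)."""

    # One shared cache; the key layout is an assumption for this sketch.
    _cache: ClassVar[Dict[Tuple[type, int, str], "ToyEntity"]] = {}

    def __new__(cls, name: str, structure: ToyStructure) -> "ToyEntity":
        # Invariant 1: an entity must be bound to a structure.
        if structure is None:
            raise ValueError("an entity must be bound to a structure")
        key = (cls, id(structure), name)
        cached = cls._cache.get(key)
        if cached is not None:
            # Invariant 3: same name under the same structure -> same instance.
            return cached
        instance = super().__new__(cls)
        instance.name = name
        instance.structure = structure
        # Invariant 2: auto-register with the structure at creation time.
        structure.all_entities.append(instance)
        cls._cache[key] = instance
        return instance

    @classmethod
    def clear_cache(cls) -> None:
        """Drop cached instances of this class only."""
        for key in [k for k in cls._cache if k[0] is cls]:
            del cls._cache[key]


locus_a = ToyStructure("A")
locus_b = ToyStructure("B")
gene = ToyEntity("A1", locus_a)
same = ToyEntity("A1", locus_a)    # cache hit, no second registration
other = ToyEntity("A1", locus_b)   # different structure, different instance
```

Keying the cache on the structure as well as the name is what lets two structures reuse the same entity name without colliding.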
Source code in src/natal/genetic_entities.py
clear_cache
classmethod
Clear the instance cache for this entity class. Useful for testing or resetting the global state.
clear_all_caches
classmethod
clear_species_cache
classmethod
Clear entity cache entries that belong to one species id.
Gene
Bases: GeneticEntity[Locus]
Represents a single allele at a genetic locus.
A Gene must be bound to a Locus and is automatically registered
upon creation. Same name under same Locus returns the same instance.
(Alias: Allele)
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | The name of the gene. |
| `locus` | `Locus` | The locus structure this gene is bound to. |
Examples:
>>> locus = Locus("A")
>>> gene1 = Gene("A1", locus=locus)
>>> gene2 = Gene("A1", locus=locus)
>>> assert gene1 is gene2
Haplotype
Haplotype(chromosome: Optional[Chromosome] = None, genes: Optional[List[Gene]] = None, **kwargs: Any)
Bases: GeneticEntity[Chromosome]
Represents a haplotype - genes on a single chromosome from one parent.
A Haplotype is bound to a Chromosome structure and contains a list of Genes, one for each Locus in the Chromosome. Same gene combination under same Chromosome structure returns the same instance.
Attributes:
| Name | Type | Description |
|---|---|---|
| `chromosome` | `Chromosome` | Bound chromosome structure. |
| `genes` | `List[Gene]` | One gene per locus in chromosome order. |
| `linkage` | `Chromosome` | Backward-compatible alias for chromosome. |
get_gene_at_locus
HaploidGenotype
HaploidGenotype(species: Optional[Species] = None, haplotypes: Optional[List[Haplotype]] = None, **kwargs: Any)
Bases: GeneticEntity[Species]
Represents a complete haploid genome - all haplotypes from one parent.
A HaploidGenotype is bound to a Species and contains one Haplotype for each Chromosome in the Species. Same haplotype combination under same Species returns the same instance.
Attributes:
| Name | Type | Description |
|---|---|---|
| `species` | `Species` | Bound species structure. |
| `haplotypes` | `List[Haplotype]` | One haplotype per required chromosome. |
| `genome` | `Species` | Backward-compatible alias for species. |
| `chromosomes` | `List[Haplotype]` | Backward-compatible alias for haplotypes. |
This class is also exported as HaploidGenome for backward compatibility.
to_string
get_haplotype_for_chromosome
Get the haplotype for a specific chromosome.
from_str
classmethod
Create a HaploidGenotype from string by delegating to Species parser.
Keeps a convenient factory on the entity class for callers who prefer
HaploidGenotype.from_str(species, s) instead of calling the Species
parser directly.
get_chromosome_for_linkage
Alias for get_haplotype_for_chromosome (backward compatibility).
get_gene_at_locus
Get the gene at a specific locus across all haplotypes.
Genotype
Represents a diploid genotype consisting of two haploid genomes.
A Genotype pairs two HaploidGenotypes (maternal and paternal) that are both bound to the same Species structure. The distinction between maternal and paternal origin is preserved for modeling phenomena like maternal effects, cytoplasmic inheritance, and genomic imprinting.
Attributes:
| Name | Type | Description |
|---|---|---|
| `species` | `Species` | Species shared by maternal and paternal haploid genomes. |
| `maternal` | `HaploidGenotype` | Maternal haploid genotype. |
| `paternal` | `HaploidGenotype` | Paternal haploid genotype. |
| `genome` | `Species` | Backward-compatible alias for species. |
| `name` | `str` | Canonical species-parsable genotype string. |
Note: Genotype uses identity comparison (is) since instances are cached.
This class is also exported as Genome, DiploidGenome, and DiploidGenotype.
get_alleles_at_locus
Get the pair of alleles at a specific locus.
Returns:
| Type | Description |
|---|---|
| `Tuple[Optional[Gene], Optional[Gene]]` | Tuple of (maternal_allele, paternal_allele). |
is_homozygous_at
Check if the genotype is homozygous at a given locus.
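A zygosity check can be expressed on top of the (maternal, paternal) allele pair. The sketch below uses plain strings in place of Gene instances and assumes that a missing allele (`None`) counts as neither homozygous nor heterozygous; the library's handling of that case is not stated here, so treat it as an assumption.

```python
from typing import Optional, Tuple

AllelePair = Tuple[Optional[str], Optional[str]]


def is_homozygous(alleles: AllelePair) -> bool:
    """True when both alleles are present and equal."""
    maternal, paternal = alleles
    return maternal is not None and maternal == paternal


def is_heterozygous(alleles: AllelePair) -> bool:
    """True when both alleles are present and differ."""
    maternal, paternal = alleles
    return maternal is not None and paternal is not None and maternal != paternal


homo = is_homozygous(("A1", "A1"))
hetero = is_heterozygous(("A1", "A2"))
missing = is_homozygous((None, None))
```

Because Gene instances are cached, the library could compare them by identity (`is`); string equality is used here only because the sketch has no cache.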
is_heterozygous_at
produce_gametes
Generate all possible gametes (haploid genotypes) from this diploid genotype, along with their theoretical Mendelian frequencies.
This method computes pure Mendelian segregation based on recombination rates. No gene drives or other modifiers are applied - this is the baseline calculation.
For gene drives, gamete selection, or other modifications, use Population-level
gamete modifiers via Population.set_gamete_modifier().
Recombination behavior is controlled by the Species's RecombinationMap:
- If recombination rates are defined and non-zero, recombinant haplotypes will be generated with appropriate frequencies.
- If recombination rates are zero or undefined, only parental haplotypes are produced (simple Mendelian segregation).
For chromosomes where maternal and paternal haplotypes are identical, produces only 1 gamete (the identical haplotype) with frequency 1.0.
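As a rough illustration of the no-recombination case, the sketch below builds a per-chromosome haplotype distribution and multiplies the distributions together. It assumes chromosomes assort independently, and it uses strings in place of Haplotype instances; `segregate` and `combine_chromosomes` are hypothetical helpers, not part of the module.

```python
from itertools import product
from typing import Dict, List, Tuple


def segregate(maternal: str, paternal: str) -> Dict[str, float]:
    """Haplotype distribution for one chromosome without recombination."""
    if maternal == paternal:
        # Identical haplotypes collapse to a single outcome with frequency 1.0.
        return {maternal: 1.0}
    return {maternal: 0.5, paternal: 0.5}


def combine_chromosomes(
    per_chromosome: List[Dict[str, float]],
) -> Dict[Tuple[str, ...], float]:
    """Multiply per-chromosome frequencies (independent assortment assumed)."""
    gametes: Dict[Tuple[str, ...], float] = {}
    for combo in product(*(dist.items() for dist in per_chromosome)):
        haplotypes = tuple(name for name, _ in combo)
        freq = 1.0
        for _, f in combo:
            freq *= f
        gametes[haplotypes] = gametes.get(haplotypes, 0.0) + freq
    return gametes


# Heterozygous on chromosome 1, homozygous on chromosome 2.
gametes = combine_chromosomes([segregate("A1", "A2"), segregate("B1", "B1")])
total = sum(gametes.values())
```

In this example the homozygous chromosome contributes a single haplotype, so only two gametes appear, each with frequency 0.5.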
Returns:
| Type | Description |
|---|---|
| `Dict[HaploidGenotype, float]` | Dict mapping HaploidGenotype instances to their theoretical frequencies. All frequencies sum to 1.0. |
Examples:
>>> # Get Mendelian gamete frequencies
>>> gametes = genotype.produce_gametes()
>>> sum(gametes.values()) # → 1.0
>>> for haploid_genotype, freq in gametes.items():
... print(f"{haploid_genotype}: {freq:.3f}")
Note
Results are cached for performance. Each genotype has one cached result.
If you modify the recombination rates after calling this method,
you must manually clear the cache by setting self._gamete_cache = None.
Best practice: set recombination rates during Chromosome construction.
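The caching caveat can be reproduced with a toy class. Only the attribute name `_gamete_cache` comes from the note above; `ToyGenotype`, its `rates` list, and the two-outcome result are hypothetical simplifications.

```python
from typing import Dict, List, Optional


class ToyGenotype:
    """Hypothetical genotype that memoizes its gamete frequencies."""

    def __init__(self, rates: List[float]) -> None:
        self.rates = rates  # mutable recombination rates (assumed shared state)
        self._gamete_cache: Optional[Dict[str, float]] = None
        self.computations = 0  # counts real recomputations, for illustration

    def produce_gametes(self) -> Dict[str, float]:
        if self._gamete_cache is None:
            self.computations += 1
            r = self.rates[0]
            self._gamete_cache = {"parental": 1.0 - r, "recombinant": r}
        return self._gamete_cache


g = ToyGenotype([0.1])
first = g.produce_gametes()
g.rates[0] = 0.3              # rates changed after the first call
stale = g.produce_gametes()   # still reflects the old rate
g._gamete_cache = None        # manual invalidation, as the note describes
fresh = g.produce_gametes()   # recomputed with the new rate
```

Setting rates once during construction avoids the stale read entirely, which is why the note recommends it.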
to_string
Return a species-parsable string representation of this genotype.
Format: "
compute_recombinant_haplotypes
compute_recombinant_haplotypes(n_loci: int, recombination_rates: ndarray, start_maternal: bool = True) -> Tuple[np.ndarray, np.ndarray]
Compute all possible recombinant haplotype patterns and their frequencies.
Abstract problem: Given a sequence of loci [0, 1, 2, ..., n_loci-1] with recombination rates between adjacent loci, enumerate all crossover patterns and produce the resulting haplotype pattern (which chain at each locus).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_loci` | `int` | Number of loci (>= 1). | required |
| `recombination_rates` | `ndarray` | Shape `(n_loci - 1,)`. `recombination_rates[i]` is the rate between locus `i` and locus `i+1`. | required |
| `start_maternal` | `bool` | If True, start from the maternal chain (0); otherwise the paternal chain (1). | `True` |
Returns:
| Name | Type | Description |
|---|---|---|
| `haplotype_patterns` | `ndarray` | Shape `(2^(n_loci-1), n_loci)`. Each row is a 0/1 sequence: 0 = maternal allele at that locus, 1 = paternal allele. |
| `frequencies` | `ndarray` | Shape `(2^(n_loci-1),)`. Frequency of each pattern. |
Examples:
>>> n_loci = 3
>>> recomb_rates = np.array([0.1, 0.2]) # rate between 0-1 and 1-2
>>> patterns, freqs = compute_recombinant_haplotypes(n_loci, recomb_rates, True)
>>> patterns
array([[0, 0, 0], # No crossovers: all maternal
[0, 0, 1], # Crossover after locus 1: mat, mat, pat
[0, 1, 1], # Crossover after locus 0: mat, pat, pat
[0, 1, 0]], dtype=int64) # Two crossovers: mat, pat, mat
>>> freqs
array([0.72, 0.18, 0.08, 0.02]) # 0.9*0.8, 0.9*0.2, 0.1*0.8, 0.1*0.2
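The enumeration can be sketched in pure Python. Each interval between adjacent loci either has a crossover (probability `r_i`) or not (probability `1 - r_i`), and a pattern's frequency is the product over intervals. This is an illustrative re-derivation, not the library's NumPy implementation; `enumerate_patterns` is a hypothetical name, and the row order simply follows `itertools.product`.

```python
from itertools import product
from typing import List, Sequence, Tuple


def enumerate_patterns(
    rates: Sequence[float], start_maternal: bool = True
) -> List[Tuple[Tuple[int, ...], float]]:
    """Return (pattern, frequency) for every crossover combination."""
    start = 0 if start_maternal else 1
    results: List[Tuple[Tuple[int, ...], float]] = []
    # One boolean per interval: True means a crossover in that interval.
    for crossovers in product((False, True), repeat=len(rates)):
        chain = start
        pattern = [chain]
        freq = 1.0
        for crossed, rate in zip(crossovers, rates):
            if crossed:
                chain = 1 - chain      # switch between maternal and paternal
                freq *= rate
            else:
                freq *= 1.0 - rate
            pattern.append(chain)
        results.append((tuple(pattern), freq))
    return results


results = enumerate_patterns([0.1, 0.2])
patterns = [p for p, _ in results]
freqs = [f for _, f in results]
```

For rates 0.1 and 0.2 this yields the patterns (0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0) with frequencies of approximately 0.72, 0.18, 0.08, and 0.02, which sum to 1.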
compute_recombinant_haplotypes_with_alleles
compute_recombinant_haplotypes_with_alleles(maternal_alleles: List[str], paternal_alleles: List[str], recombination_rates: ndarray, start_maternal: bool = True) -> Dict[str, float]
Compute recombinant haplotypes with actual allele symbols.
Given maternal and paternal allele sequences, compute all recombinant haplotypes considering recombination rates, and return them as strings mapped to their frequencies.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `maternal_alleles` | `List[str]` | List of allele symbols at each locus (maternal chain). | required |
| `paternal_alleles` | `List[str]` | List of allele symbols at each locus (paternal chain). | required |
| `recombination_rates` | `ndarray` | Recombination rates between adjacent loci. | required |
| `start_maternal` | `bool` | Start from the maternal chain (True) or the paternal chain (False). | `True` |
Returns:
| Type | Description |
|---|---|
| `Dict[str, float]` | Dict mapping haplotype string (e.g., "A1/a2/A3") to frequency. |
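Mapping crossover patterns onto allele symbols might look like the sketch below. It is a standalone pure-Python illustration, not the library function: `recombinant_strings` is a hypothetical name, the "/" separator is taken from the example string above, and merging patterns that yield the same string is an assumption.

```python
from itertools import product
from typing import Dict, Sequence


def recombinant_strings(
    maternal: Sequence[str],
    paternal: Sequence[str],
    rates: Sequence[float],
    start_maternal: bool = True,
) -> Dict[str, float]:
    """Map each recombinant haplotype string to its frequency."""
    if len(maternal) != len(paternal) or len(rates) != len(maternal) - 1:
        raise ValueError("need equal-length allele lists and n_loci - 1 rates")
    chains = (maternal, paternal)
    out: Dict[str, float] = {}
    for crossovers in product((False, True), repeat=len(rates)):
        chain = 0 if start_maternal else 1
        alleles = [chains[chain][0]]
        freq = 1.0
        for locus, (crossed, rate) in enumerate(zip(crossovers, rates), start=1):
            if crossed:
                chain = 1 - chain
                freq *= rate
            else:
                freq *= 1.0 - rate
            alleles.append(chains[chain][locus])
        key = "/".join(alleles)
        # Assumption: identical strings from different patterns are summed.
        out[key] = out.get(key, 0.0) + freq
    return out


two_locus = recombinant_strings(["A1", "b1"], ["a2", "B2"], [0.25])
shared = recombinant_strings(["A1", "B1"], ["A2", "B1"], [0.25])
```

In the second call both chains carry `B1` at the second locus, so the crossover and no-crossover patterns produce the same string and their frequencies add up to 1.0.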
create_haplotype_from_allele_names
Create a Haplotype from allele names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chromosome` | `Chromosome` | The Chromosome structure this haplotype belongs to. | required |
| `allele_names` | `List[str]` | List of allele names, one per locus in order. | required |
Returns:
| Type | Description |
|---|---|
| `Haplotype` | A new Haplotype instance. |