genetic_patterns Module

Genetic pattern matching and filtering.

Overview

The genetic_patterns module implements the syntax and logic for matching genotypes and haploid genomes against symbolic patterns.

Complete Module Reference

natal.genetic_patterns

Genetic pattern matching system for genotypes and haploid genomes.

This module provides regex-like pattern matching for genetic sequences:
  • PatternElement: Base class for allele-level matching
  • HaplotypePath: Pattern for a single DNA strand of one chromosome
  • ChromosomePairPattern: Pattern for a pair of homologous chromosomes
  • GenotypePattern: Pattern for a complete diploid genotype
  • HaploidGenomePattern: Pattern for a complete haploid genome
  • GenotypePatternParser: Parser for pattern syntax strings
  • GenotypeSelector: Unified genotype selector for observation/filtering

PatternParseError

Bases: Exception

Error raised during genotype pattern parsing.

PatternElement

Bases: ABC

Base class for all pattern elements representing allele-level matching.

matches abstractmethod
matches(gene: Optional[Gene]) -> bool

Check if a single allele matches this pattern element.

Parameters:

Name Type Description Default
gene Optional[Gene]

The Gene object to match, or None.

required

Returns:

Type Description
bool

True if the gene matches this pattern element.

Source code in src/natal/genetic_patterns.py
@abstractmethod
def matches(self, gene: Optional['Gene']) -> bool:
    """Check if a single allele matches this pattern element.

    Args:
        gene: The Gene object to match, or None.

    Returns:
        True if the gene matches this pattern element.
    """
    pass

AllelePattern

AllelePattern(allele_name: str)

Bases: PatternElement

Exact match for a single allele name.

Source code in src/natal/genetic_patterns.py
def __init__(self, allele_name: str):
    self.allele_name = allele_name

WildcardPattern

Bases: PatternElement

Wildcard (*) - matches any allele.

SetPattern

SetPattern(alleles: Set[str], negate: bool = False)

Bases: PatternElement

Set pattern - matches alleles in a set, with optional negation.

Initialize a set pattern.

Parameters:

Name Type Description Default
alleles Set[str]

Set of allele names to match.

required
negate bool

If True, match alleles NOT in this set.

False
Source code in src/natal/genetic_patterns.py
def __init__(self, alleles: Set[str], negate: bool = False):
    """Initialize a set pattern.

    Args:
        alleles: Set of allele names to match.
        negate: If True, match alleles NOT in this set.
    """
    self.alleles = alleles
    self.negate = negate
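
The three concrete elements (exact name, wildcard, set with optional negation) can be pictured with a small self-contained sketch. The `matches` bodies of these classes are not shown on this page, so everything below is an assumption for illustration: the `Gene` stand-in, its `name` attribute, and in particular the handling of `None` are guesses, not the library's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Gene:
    """Hypothetical stand-in for natal's Gene; only a name is assumed."""
    name: str


def match_exact(allele_name: str, gene: Optional[Gene]) -> bool:
    # Assumption: a missing gene (None) never matches an exact allele name.
    return gene is not None and gene.name == allele_name


def match_wildcard(gene: Optional[Gene]) -> bool:
    # Assumption: '*' accepts anything, including a missing gene.
    return True


def match_set(alleles: Set[str], negate: bool, gene: Optional[Gene]) -> bool:
    # Assumption: a missing gene fails both {A,B} and !A style patterns.
    if gene is None:
        return False
    # Membership decides the result; negate flips it.
    return (gene.name in alleles) != negate


in_set = match_set({"A1", "A2"}, False, Gene("A2"))   # like {A1,A2}
not_a1 = match_set({"A1"}, True, Gene("A2"))          # like !A1
```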

LocusPattern

LocusPattern(maternal_pattern: PatternElement, paternal_pattern: PatternElement, unordered: bool = False)

Pattern for a single locus across the two homologous chromosomes (one maternal and one paternal allele).

Initialize a locus pattern.

Parameters:

Name Type Description Default
maternal_pattern PatternElement

PatternElement for maternal allele.

required
paternal_pattern PatternElement

PatternElement for paternal allele.

required
unordered bool

If True, use :: ordering (match either maternal|paternal or paternal|maternal).

False
Source code in src/natal/genetic_patterns.py
def __init__(
    self,
    maternal_pattern: PatternElement,
    paternal_pattern: PatternElement,
    unordered: bool = False
):
    """Initialize a locus pattern.

    Args:
        maternal_pattern: PatternElement for maternal allele.
        paternal_pattern: PatternElement for paternal allele.
        unordered: If True, use :: ordering (match either maternal|paternal or paternal|maternal).
    """
    self.maternal_pattern = maternal_pattern
    self.paternal_pattern = paternal_pattern
    self.unordered = unordered
matches
matches(mat_gene: Optional[Gene], pat_gene: Optional[Gene]) -> bool

Check if a pair of alleles matches this locus pattern.

Parameters:

Name Type Description Default
mat_gene Optional[Gene]

Maternal allele.

required
pat_gene Optional[Gene]

Paternal allele.

required

Returns:

Type Description
bool

True if the allele pair matches.

Source code in src/natal/genetic_patterns.py
def matches(self, mat_gene: Optional['Gene'], pat_gene: Optional['Gene']) -> bool:
    """Check if a pair of alleles matches this locus pattern.

    Args:
        mat_gene: Maternal allele.
        pat_gene: Paternal allele.

    Returns:
        True if the allele pair matches.
    """
    if self.unordered:
        # Try both orderings
        match_straight = (
            self.maternal_pattern.matches(mat_gene) and
            self.paternal_pattern.matches(pat_gene)
        )
        match_reversed = (
            self.maternal_pattern.matches(pat_gene) and
            self.paternal_pattern.matches(mat_gene)
        )
        return match_straight or match_reversed
    else:
        # Strict ordering
        return (
            self.maternal_pattern.matches(mat_gene) and
            self.paternal_pattern.matches(pat_gene)
        )
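
The difference between strict (`|`) and unordered (`::`) matching shown above can be reproduced without the library. In this sketch an "element" is reduced to a plain predicate on an allele name; `exact` and `locus_matches` are hypothetical helpers that mirror the control flow of `LocusPattern.matches`, not part of natal.

```python
from typing import Callable, Optional

# Hypothetical stand-in: a pattern element is just a predicate on an allele name.
Element = Callable[[Optional[str]], bool]


def exact(name: str) -> Element:
    """Build an element that accepts exactly one allele name."""
    return lambda allele: allele == name


def locus_matches(
    mat_pat: Element,
    pat_pat: Element,
    mat: Optional[str],
    pat: Optional[str],
    unordered: bool = False,
) -> bool:
    # Strict order: maternal pattern on maternal allele, paternal on paternal.
    straight = mat_pat(mat) and pat_pat(pat)
    if not unordered:
        return straight
    # Unordered: also accept the swapped assignment.
    return straight or (mat_pat(pat) and pat_pat(mat))


A, a = exact("A"), exact("a")
ordered_hit = locus_matches(A, a, "A", "a")            # A|a against A|a
ordered_miss = locus_matches(A, a, "a", "A")           # A|a against a|A
unordered_hit = locus_matches(A, a, "a", "A", True)    # A::a against a|A
```

Only the unordered form accepts the swapped pair, which is the behaviour the `unordered` flag is documented to provide.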

HaplotypePath

HaplotypePath(locus_patterns: Sequence[PatternElement])

Pattern for a single Haplotype (one copy of a pair of homologous chromosomes).

Initialize a haplotype pattern.

Parameters:

Name Type Description Default
locus_patterns Sequence[PatternElement]

Sequence of PatternElement for each locus in order. Each PatternElement matches a single allele at that locus.

required
Source code in src/natal/genetic_patterns.py
def __init__(self, locus_patterns: Sequence[PatternElement]):
    """Initialize a haplotype pattern.

    Args:
        locus_patterns: Sequence of PatternElement for each locus in order.
                       Each PatternElement matches a single allele at that locus.
    """
    self.locus_patterns = locus_patterns
matches
matches(haplotype: Haplotype) -> bool

Check if a haplotype matches this pattern.

Parameters:

Name Type Description Default
haplotype Haplotype

The Haplotype to match.

required

Returns:

Type Description
bool

True if all loci match.

Source code in src/natal/genetic_patterns.py
def matches(self, haplotype: 'Haplotype') -> bool:
    """Check if a haplotype matches this pattern.

    Args:
        haplotype: The Haplotype to match.

    Returns:
        True if all loci match.
    """
    # Get loci from the haplotype's chromosome
    loci = haplotype.chromosome.loci

    if len(self.locus_patterns) != len(loci):
        return False

    for pattern_elem, locus in zip(self.locus_patterns, loci):
        gene = haplotype.get_gene_at_locus(locus)
        if not pattern_elem.matches(gene):
            return False

    return True
to_filter
to_filter() -> Callable[[Haplotype], bool]

Convert to a filter function.

Returns:

Type Description
Callable[[Haplotype], bool]

A callable that takes a Haplotype and returns bool.

Source code in src/natal/genetic_patterns.py
def to_filter(self) -> Callable[['Haplotype'], bool]:
    """Convert to a filter function.

    Returns:
        A callable that takes a Haplotype and returns bool.
    """
    return lambda haplotype: self.matches(haplotype)
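
A minimal sketch of the position-wise matching and of how a `to_filter()`-style predicate plugs into Python's built-in `filter`. Haplotypes are modelled here as plain lists of allele names and elements as predicates; both are simplifying assumptions, and `haplotype_matches` is a hypothetical helper rather than the library method.

```python
from typing import Callable, List, Optional, Sequence

Element = Callable[[Optional[str]], bool]


def haplotype_matches(
    elements: Sequence[Element], alleles: Sequence[Optional[str]]
) -> bool:
    # A pattern with a different number of loci never matches.
    if len(elements) != len(alleles):
        return False
    # Every locus must satisfy its element, in order.
    return all(elem(allele) for elem, allele in zip(elements, alleles))


def is_a1(allele: Optional[str]) -> bool:
    return allele == "A1"


def any_allele(allele: Optional[str]) -> bool:
    return True


pattern = [is_a1, any_allele]  # roughly the pattern "A1/*"
haplotypes: List[List[str]] = [["A1", "B1"], ["A2", "B1"], ["A1"]]


def keep(haplotype: Sequence[Optional[str]]) -> bool:
    # Same shape as the callable returned by to_filter(): one argument, bool result.
    return haplotype_matches(pattern, haplotype)


selected = list(filter(keep, haplotypes))
```

The single-locus haplotype `["A1"]` is rejected by the length check even though its first allele fits.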

ChromosomePairPattern

ChromosomePairPattern(maternal_pattern: HaplotypePath, paternal_pattern: HaplotypePath, unordered: bool = False, explicit_grouping: bool = False)

Pattern for a pair of homologous chromosomes.

Initialize a chromosome pair pattern.

Parameters:

Name Type Description Default
maternal_pattern HaplotypePath

HaplotypePath for maternal haplotype.

required
paternal_pattern HaplotypePath

HaplotypePath for paternal haplotype.

required
unordered bool

If True, use :: ordering (match either order).

False
explicit_grouping bool

If True, this pattern was explicitly grouped with ().

False
Source code in src/natal/genetic_patterns.py
def __init__(
    self,
    maternal_pattern: HaplotypePath,
    paternal_pattern: HaplotypePath,
    unordered: bool = False,
    explicit_grouping: bool = False
):
    """Initialize a chromosome pair pattern.

    Args:
        maternal_pattern: HaplotypePath for maternal haplotype.
        paternal_pattern: HaplotypePath for paternal haplotype.
        unordered: If True, use :: ordering (match either order).
        explicit_grouping: If True, this pattern was explicitly grouped with ().
    """
    self.maternal_pattern = maternal_pattern
    self.paternal_pattern = paternal_pattern
    self.unordered = unordered
    self.explicit_grouping = explicit_grouping
matches
matches(haplotype_pair: Tuple[Haplotype, Haplotype]) -> bool

Check if a pair of haplotypes (one chromosome pair) matches.

Parameters:

Name Type Description Default
haplotype_pair Tuple[Haplotype, Haplotype]

Tuple of (maternal_haplotype, paternal_haplotype).

required

Returns:

Type Description
bool

True if the haplotype pair matches.

Source code in src/natal/genetic_patterns.py
def matches(self, haplotype_pair: Tuple['Haplotype', 'Haplotype']) -> bool:
    """Check if a pair of haplotypes (one chromosome pair) matches.

    Args:
        haplotype_pair: Tuple of (maternal_haplotype, paternal_haplotype).

    Returns:
        True if the haplotype pair matches.
    """
    mat_hap, pat_hap = haplotype_pair

    if self.unordered:
        # Try both orderings
        match_straight = (
            self.maternal_pattern.matches(mat_hap) and
            self.paternal_pattern.matches(pat_hap)
        )
        match_reversed = (
            self.maternal_pattern.matches(pat_hap) and
            self.paternal_pattern.matches(mat_hap)
        )
        return match_straight or match_reversed
    else:
        # Strict ordering: maternal | paternal
        return (
            self.maternal_pattern.matches(mat_hap) and
            self.paternal_pattern.matches(pat_hap)
        )
to_filter
to_filter() -> Callable[[Tuple[Haplotype, Haplotype]], bool]

Convert to a filter function.

Returns:

Type Description
Callable[[Tuple[Haplotype, Haplotype]], bool]

A callable that takes a haplotype pair and returns bool.

Source code in src/natal/genetic_patterns.py
def to_filter(self) -> Callable[[Tuple['Haplotype', 'Haplotype']], bool]:
    """Convert to a filter function.

    Returns:
        A callable that takes a haplotype pair and returns bool.
    """
    return lambda pair: self.matches(pair)

GenotypePattern

GenotypePattern(chromosome_patterns: List[Optional[ChromosomePairPattern]])

Complete genotype pattern matching multiple chromosomes.

Initialize a complete genotype pattern.

Parameters:

Name Type Description Default
chromosome_patterns List[Optional[ChromosomePairPattern]]

List of ChromosomePairPattern (or None for omitted chromosomes). None means that chromosome is not constrained by the pattern.

required
Source code in src/natal/genetic_patterns.py
def __init__(self, chromosome_patterns: List[Optional[ChromosomePairPattern]]):
    """Initialize a complete genotype pattern.

    Args:
        chromosome_patterns: List of ChromosomePairPattern (or None for omitted chromosomes).
                           None means that chromosome is not constrained by the pattern.
    """
    self.chromosome_patterns = chromosome_patterns
matches
matches(genotype: Genotype) -> bool

Check if a genotype matches this pattern.

Parameters:

Name Type Description Default
genotype Genotype

The Genotype to match.

required

Returns:

Type Description
bool

True if the genotype matches all specified chromosome patterns.

Source code in src/natal/genetic_patterns.py
def matches(self, genotype: 'Genotype') -> bool:
    """Check if a genotype matches this pattern.

    Args:
        genotype: The Genotype to match.

    Returns:
        True if the genotype matches all specified chromosome patterns.
    """
    species = genotype.species

    for i, chr_pattern in enumerate(self.chromosome_patterns):
        if chr_pattern is None:
            # Omitted chromosome - no constraint
            continue

        # Get the haplotype pair for this chromosome
        chromosome = species.chromosomes[i]
        try:
            mat_hap = genotype.maternal.get_haplotype_for_chromosome(chromosome)
            pat_hap = genotype.paternal.get_haplotype_for_chromosome(chromosome)
        except (AttributeError, KeyError, IndexError):
            return False

        if not chr_pattern.matches((mat_hap, pat_hap)):
            return False

    return True
to_filter
to_filter() -> Callable[[Genotype], bool]

Convert to a filter function for use in rules.

Returns:

Type Description
Callable[[Genotype], bool]

A callable that takes a Genotype and returns bool.

Source code in src/natal/genetic_patterns.py
def to_filter(self) -> Callable[['Genotype'], bool]:
    """Convert to a filter function for use in rules.

    Returns:
        A callable that takes a Genotype and returns bool.
    """
    return lambda genotype: self.matches(genotype)

HaploidGenomePattern

HaploidGenomePattern(haplotype_patterns: List[Optional[HaplotypePath]])

Pattern for a complete HaploidGenome (one DNA strand of an individual).

Initialize a haploid genome pattern.

Parameters:

Name Type Description Default
haplotype_patterns List[Optional[HaplotypePath]]

List of HaplotypePath for each chromosome. None means that chromosome is not constrained.

required
Source code in src/natal/genetic_patterns.py
def __init__(self, haplotype_patterns: List[Optional[HaplotypePath]]):
    """Initialize a haploid genome pattern.

    Args:
        haplotype_patterns: List of HaplotypePath for each chromosome.
                           None means that chromosome is not constrained.
    """
    self.haplotype_patterns = haplotype_patterns
matches
matches(haploid_genome: HaploidGenome) -> bool

Check if a haploid genome matches this pattern.

Parameters:

Name Type Description Default
haploid_genome HaploidGenome

The HaploidGenome to match.

required

Returns:

Type Description
bool

True if the haploid genome matches all specified patterns.

Source code in src/natal/genetic_patterns.py
def matches(self, haploid_genome: 'HaploidGenome') -> bool:
    """Check if a haploid genome matches this pattern.

    Args:
        haploid_genome: The HaploidGenome to match.

    Returns:
        True if the haploid genome matches all specified patterns.
    """
    species = haploid_genome.species

    for i, haplotype_pattern in enumerate(self.haplotype_patterns):
        if haplotype_pattern is None:
            # Omitted chromosome - no constraint
            continue

        # Get the haplotype for this chromosome
        chromosome = species.chromosomes[i]
        try:
            haplotype = haploid_genome.get_haplotype_for_chromosome(chromosome)
        except (AttributeError, KeyError, IndexError):
            return False

        if not haplotype_pattern.matches(haplotype):
            return False

    return True
to_filter
to_filter() -> Callable[[HaploidGenome], bool]

Convert to a filter function.

Returns:

Type Description
Callable[[HaploidGenome], bool]

A callable that takes a HaploidGenome and returns bool.

Source code in src/natal/genetic_patterns.py
def to_filter(self) -> Callable[['HaploidGenome'], bool]:
    """Convert to a filter function.

    Returns:
        A callable that takes a HaploidGenome and returns bool.
    """
    return lambda genome: self.matches(genome)

GenotypePatternParser

GenotypePatternParser(species: Species)

Parses genotype pattern strings into GenotypePattern objects.

Initialize parser for a specific species.

Parameters:

Name Type Description Default
species Species

The Species object to use for validation and context.

required
Source code in src/natal/genetic_patterns.py
def __init__(self, species: 'Species'):
    """Initialize parser for a specific species.

    Args:
        species: The Species object to use for validation and context.
    """
    self.species = species
parse
parse(pattern_str: str) -> GenotypePattern

Parse a pattern string into a GenotypePattern.

Supported syntax includes
  • ; separates chromosomes (outside parentheses)
  • | separates maternal (left) and paternal (right)
  • / separates loci within a chromosome
  • * matches any allele
  • {A,B,C} matches any allele in the set
  • !A matches any allele except A
  • :: matches unordered pair (A::B matches A|B or B|A)
  • () groups loci within a chromosome, ; inside () separates loci
  • Omitted chromosomes are left unconstrained and match anything

Parameters:

Name Type Description Default
pattern_str str

The pattern string to parse.

required

Returns:

Type Description
GenotypePattern

A GenotypePattern object.

Raises:

Type Description
PatternParseError

If the pattern is invalid.

Source code in src/natal/genetic_patterns.py
def parse(self, pattern_str: str) -> GenotypePattern:
    """Parse a pattern string into a GenotypePattern.

    Supported syntax includes:
        - `;` separates chromosomes (outside parentheses)
        - `|` separates maternal (left) and paternal (right)
        - `/` separates loci within a chromosome
        - `*` matches any allele
        - `{A,B,C}` matches any allele in the set
        - `!A` matches any allele except A
        - `::` matches unordered pair (A::B matches A|B or B|A)
        - `()` groups loci within a chromosome, `;` inside () separates loci
        - Omitted chromosomes are left unconstrained and match anything

    Args:
        pattern_str: The pattern string to parse.

    Returns:
        A GenotypePattern object.

    Raises:
        PatternParseError: If the pattern is invalid.
    """
    pattern_str = pattern_str.strip()

    # Check cache
    cache_key = (id(self.species), pattern_str)
    if cache_key in self._pattern_cache:
        return self._pattern_cache[cache_key]

    try:
        # Split by semicolon, respecting parentheses
        chr_pattern_strs = self._split_by_semicolon_respecting_parens(pattern_str)

        chromosome_patterns: List[Union[ChromosomePairPattern, Literal["WILDCARD_CHROMOSOME"]]] = []
        for chr_str in chr_pattern_strs:
            chr_pattern = self._parse_chromosome_pair(chr_str)
            chromosome_patterns.append(chr_pattern)

        # Handle wildcard chromosome markers and fill remaining chromosomes
        final_patterns: List[Optional[ChromosomePairPattern]] = []
        for i, pattern in enumerate(chromosome_patterns):
            if pattern == "WILDCARD_CHROMOSOME":
                # Create a fully wildcard pattern for this chromosome
                if i < len(self.species.chromosomes):
                    chromosome = self.species.chromosomes[i]
                    num_loci = len(chromosome.loci)
                    wildcard_patterns = [WildcardPattern() for _ in range(num_loci)]
                    maternal_path = HaplotypePath(wildcard_patterns)
                    paternal_path = HaplotypePath(wildcard_patterns.copy())
                    final_patterns.append(ChromosomePairPattern(maternal_path, paternal_path))
                else:
                    final_patterns.append(None)
            else:
                final_patterns.append(pattern)

        # Fill remaining chromosomes with None
        while len(final_patterns) < len(self.species.chromosomes):
            final_patterns.append(None)

        result = GenotypePattern(final_patterns)
        self._pattern_cache[cache_key] = result
        return result

    except PatternParseError:
        raise
    except Exception as e:
        raise PatternParseError(f"Failed to parse pattern '{pattern_str}'") from e
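
The first parsing step splits the pattern on `;` while ignoring any `;` inside parentheses. The private helper `_split_by_semicolon_respecting_parens` is not shown on this page, so the following is only a sketch of one way such a split could work; `split_top_level` is a hypothetical name, and the example pattern string is illustrative and has not been checked against the real parser.

```python
from typing import List


def split_top_level(pattern: str, sep: str = ";") -> List[str]:
    """Split on `sep` only where the parenthesis depth is zero."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in pattern:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"Unbalanced ')' in pattern {pattern!r}")
        if ch == sep and depth == 0:
            # Top-level separator: close the current chunk.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError(f"Unbalanced '(' in pattern {pattern!r}")
    parts.append("".join(current).strip())
    return parts


chunks = split_top_level("(A1; B1)|(A2; B2); C1::C2")
print(chunks)  # ['(A1; B1)|(A2; B2)', 'C1::C2']
```

Tracking depth with a counter keeps the split linear in the length of the string and lets unbalanced parentheses be reported early.
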
parse_haplotype_pattern
parse_haplotype_pattern(pattern_str: str) -> HaplotypePath

Parse a complete haplotype pattern.

Parameters:

Name Type Description Default
pattern_str str

Pattern string for a single haplotype (e.g., "A1/B1; C1")

required

Returns:

Type Description
HaplotypePath

HaplotypePath object with all loci patterns combined.

Raises:

Type Description
PatternParseError

If the pattern is invalid.

Source code in src/natal/genetic_patterns.py
def parse_haplotype_pattern(self, pattern_str: str) -> HaplotypePath:
    """Parse a complete haplotype pattern.

    Args:
        pattern_str: Pattern string for a single haplotype (e.g., "A1/B1; C1")

    Returns:
        HaplotypePath object with all loci patterns combined.

    Raises:
        PatternParseError: If the pattern is invalid.
    """
    pattern_str = pattern_str.strip()

    try:
        # Split by semicolon to get loci from all chromosomes
        chr_strs = [s.strip() for s in pattern_str.split(";") if s.strip()]

        all_locus_patterns: List[PatternElement] = []
        for chr_str in chr_strs:
            locus_strs = chr_str.split("/")
            for locus_str in locus_strs:
                pattern_elem = self._parse_allele_element(locus_str.strip())
                all_locus_patterns.append(pattern_elem)

        return HaplotypePath(all_locus_patterns)

    except PatternParseError:
        raise
    except Exception as e:
        raise PatternParseError(f"Failed to parse haplotype pattern '{pattern_str}'") from e
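
The flattening performed above (split on `;`, then on `/`, then parse each element) can be sketched end to end. `_parse_allele_element` is private and not shown here, so `parse_element` below is a hypothetical stand-in that assumes the documented element syntax (`*`, `{A,B,C}`, `!A`, plain name) and returns simple tuples instead of `PatternElement` objects.

```python
from typing import FrozenSet, List, Tuple

# (kind, alleles) where kind is "any", "in" or "not_in". Hypothetical representation.
Token = Tuple[str, FrozenSet[str]]


def parse_element(text: str) -> Token:
    text = text.strip()
    if not text or text == "!":
        raise ValueError(f"Invalid allele element: {text!r}")
    if text == "*":
        return ("any", frozenset())
    if text.startswith("!"):
        return ("not_in", frozenset({text[1:].strip()}))
    if text.startswith("{") and text.endswith("}"):
        names = [n.strip() for n in text[1:-1].split(",") if n.strip()]
        return ("in", frozenset(names))
    return ("in", frozenset({text}))


def parse_haplotype(text: str) -> List[Token]:
    # Mirrors the flattening above: chromosomes and loci end up in one flat list.
    tokens: List[Token] = []
    for chromosome in (s for s in text.split(";") if s.strip()):
        tokens.extend(parse_element(locus) for locus in chromosome.split("/"))
    return tokens


tokens = parse_haplotype("A1/{B1,B2}; !C1")
```

Here `tokens` holds three entries, one per locus, regardless of which chromosome each locus came from.
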
parse_haploid_genome_pattern
parse_haploid_genome_pattern(pattern_str: str) -> HaploidGenomePattern

Parse a haploid genome pattern (single DNA strand of individual).

For haploid genomes:
  • ; at top level separates different chromosomes
  • () brackets represent a single haplotype (one DNA strand)
  • Inside brackets, ; separates different loci on that strand
  • / is not used inside brackets for haploid (it's only for diploid)

Parameters:

Name Type Description Default
pattern_str str

Pattern string (e.g., "A1/B1; C1" or "(A1; B1); C1")

required

Returns:

Type Description
HaploidGenomePattern

HaploidGenomePattern object.

Raises:

Type Description
PatternParseError

If the pattern is invalid.

Source code in src/natal/genetic_patterns.py
def parse_haploid_genome_pattern(self, pattern_str: str) -> HaploidGenomePattern:
    """Parse a haploid genome pattern (single DNA strand of individual).

    For haploid genomes:
    - `;` at top level separates different chromosomes
    - `()` brackets represent a single haplotype (one DNA strand)
    - Inside brackets, `;` separates different loci on that strand
    - `/` is not used inside brackets for haploid (it's only for diploid)

    Args:
        pattern_str: Pattern string (e.g., "A1/B1; C1" or "(A1; B1); C1")

    Returns:
        HaploidGenomePattern object.

    Raises:
        PatternParseError: If the pattern is invalid.
    """
    pattern_str = pattern_str.strip()

    try:
        # Split by semicolon, respecting parentheses
        chr_strs = self._split_by_semicolon_respecting_parens(pattern_str)

        haplotype_patterns: List[Optional[Union[HaplotypePath, Literal["WILDCARD_CHROMOSOME"]]]] = []
        for chr_str in chr_strs:
            if chr_str == "*":
                # Wildcard chromosome - will be expanded later
                haplotype_patterns.append("WILDCARD_CHROMOSOME")
            elif chr_str.startswith("(") and chr_str.endswith(")"):
                # Bracketed haplotype for this chromosome
                inner = chr_str[1:-1].strip()
                haplotype_path = self._parse_bracketed_haplotype_path(inner)
                haplotype_patterns.append(haplotype_path)
            else:
                # Standard form: A1/B1/C1
                haplotype_path = self._parse_haplotype_path(chr_str)
                haplotype_patterns.append(haplotype_path)

        # Handle wildcard markers and expand
        final_haplotype_patterns: List[Optional[HaplotypePath]] = []
        for i, pattern in enumerate(haplotype_patterns):
            if pattern == "WILDCARD_CHROMOSOME":
                # Create wildcard pattern for this chromosome
                if i < len(self.species.chromosomes):
                    chromosome = self.species.chromosomes[i]
                    num_loci = len(chromosome.loci)
                    wildcard_patterns = [WildcardPattern() for _ in range(num_loci)]
                    final_haplotype_patterns.append(HaplotypePath(wildcard_patterns))
                else:
                    final_haplotype_patterns.append(None)
            else:
                final_haplotype_patterns.append(pattern)

        # Fill remaining chromosomes with None
        while len(final_haplotype_patterns) < len(self.species.chromosomes):
            final_haplotype_patterns.append(None)

        return HaploidGenomePattern(final_haplotype_patterns)

    except PatternParseError:
        raise
    except Exception as e:
        raise PatternParseError(f"Failed to parse haploid genome pattern '{pattern_str}'") from e
get_allowed_alleles
get_allowed_alleles(pattern_element: PatternElement) -> List[str]

Get all allowed allele names for a pattern element.

Parameters:

Name Type Description Default
pattern_element PatternElement

The PatternElement to analyze.

required

Returns:

Type Description
List[str]

List of allowed allele names.

Source code in src/natal/genetic_patterns.py
def get_allowed_alleles(self, pattern_element: PatternElement) -> List[str]:
    """Get all allowed allele names for a pattern element.

    Args:
        pattern_element: The PatternElement to analyze.

    Returns:
        List of allowed allele names.
    """
    if isinstance(pattern_element, AllelePattern):
        return [pattern_element.allele_name]
    elif isinstance(pattern_element, WildcardPattern):
        return self._get_all_allele_names()
    elif isinstance(pattern_element, SetPattern):
        if pattern_element.negate:
            all_alleles = set(self._get_all_allele_names())
            return list(all_alleles - pattern_element.alleles)
        else:
            return list(pattern_element.alleles)
    else:
        raise ValueError(f"Unknown pattern element type: {type(pattern_element)}")
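
The negated branch is a set difference against every known allele name. A small sketch of that expansion follows, with the allele universe passed in explicitly; `allowed_alleles` and the allele names are hypothetical. Unlike the listing above, the sketch sorts its result, since the order of `list(some_set)` is arbitrary.

```python
from typing import List, Set


def allowed_alleles(all_names: List[str], alleles: Set[str], negate: bool) -> List[str]:
    # Sorting gives a stable order; iterating a set directly does not.
    if negate:
        return sorted(set(all_names) - alleles)
    return sorted(alleles)


known = ["A1", "A2", "A3"]  # hypothetical universe of allele names
print(allowed_alleles(known, {"A1"}, negate=True))          # ['A2', 'A3']
print(allowed_alleles(known, {"A1", "A3"}, negate=False))   # ['A1', 'A3']
```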

GenotypeSelector

GenotypeSelector(species: Species)

Unified genotype selector for observation and filtering.

This class provides a unified interface for selecting genotypes using various input formats, leveraging the existing pattern matching system.

Initialize genotype selector for a specific species.

Parameters:

Name Type Description Default
species Species

The Species object to use for pattern parsing.

required
Source code in src/natal/genetic_patterns.py
def __init__(self, species: 'Species'):
    """Initialize genotype selector for a specific species.

    Args:
        species: The Species object to use for pattern parsing.
    """
    self.species = species
    self.parser = GenotypePatternParser(species)
resolve_genotype_indices
resolve_genotype_indices(gen_spec: Optional[Iterable[Any]], diploid_genotypes: Optional[Sequence[Any]], unordered: bool = False) -> List[int]

Resolve genotype selectors into a list of indices.

This method provides the same functionality as observation.py's _resolve_genotype_list() but uses the pattern matching system.

Parameters:

Name Type Description Default
gen_spec Optional[Iterable[Any]]

Genotype selector specification. Can be:
  • None: select all genotypes
  • int: genotype index
  • str: genotype pattern string
  • Genotype: genotype object
  • Iterable of any of the above

required
diploid_genotypes Optional[Sequence[Any]]

Sequence of diploid genotypes for resolution.

required
unordered bool

Whether to treat genotypes as unordered (A|a == a|A).

False

Returns:

Type Description
List[int]

List of resolved genotype indices.

Raises:

Type Description
ValueError

If diploid_genotypes is required but missing.

Source code in src/natal/genetic_patterns.py
def resolve_genotype_indices(
    self,
    gen_spec: Optional[Iterable[Any]],
    diploid_genotypes: Optional[Sequence[Any]],
    unordered: bool = False,
) -> List[int]:
    """Resolve genotype selectors into a list of indices.

    This method provides the same functionality as observation.py's
    _resolve_genotype_list() but uses the pattern matching system.

    Args:
        gen_spec: Genotype selector specification. Can be:
            - None: select all genotypes
            - int: genotype index
            - str: genotype pattern string
            - Genotype: genotype object
            - Iterable of any of the above
        diploid_genotypes: Sequence of diploid genotypes for resolution.
        unordered: Whether to treat genotypes as unordered (A|a == a|A).

    Returns:
        List of resolved genotype indices.

    Raises:
        ValueError: If diploid_genotypes is required but missing.
    """
    if gen_spec is None:
        if diploid_genotypes is None:
            raise ValueError("diploid_genotypes required to enumerate genotypes")
        return list(range(len(diploid_genotypes)))

    # Handle single item vs iterable
    if not isinstance(gen_spec, (list, tuple, set)):
        gen_spec = [gen_spec]

    resolved_indices: Set[int] = set()

    for selector in gen_spec:
        if isinstance(selector, int):
            # Direct index
            resolved_indices.add(selector)
        elif isinstance(selector, str):
            # Pattern string - use pattern matching system
            pattern = self.parser.parse(selector)
            if diploid_genotypes is None:
                raise ValueError("diploid_genotypes required for pattern matching")

            for i, genotype in enumerate(diploid_genotypes):
                if pattern.matches(genotype):
                    resolved_indices.add(i)
        else:
            # Assume it's a Genotype object or similar
            if diploid_genotypes is None:
                raise ValueError("diploid_genotypes required for genotype matching")

            for i, genotype in enumerate(diploid_genotypes):
                if self._genotypes_equal(selector, genotype, unordered):
                    resolved_indices.add(i)

    return sorted(resolved_indices)
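
The resolution logic above (wrap a single selector, collect indices into a set, return them sorted) can be exercised with toy data. In this sketch genotypes are plain strings and `toy_match` replaces the real pattern parser; both are assumptions made so the example runs on its own.

```python
from typing import Any, Callable, List, Optional, Sequence, Set


def resolve_indices(
    spec: Optional[Any],
    genotypes: Sequence[str],
    match: Callable[[str, str], bool],
) -> List[int]:
    if spec is None:
        # No selector: every genotype index is returned.
        return list(range(len(genotypes)))
    if not isinstance(spec, (list, tuple, set)):
        spec = [spec]  # a single selector becomes a one-element list
    found: Set[int] = set()
    for selector in spec:
        if isinstance(selector, int):
            found.add(selector)  # taken as-is, no bounds check
        else:
            found.update(i for i, g in enumerate(genotypes) if match(selector, g))
    return sorted(found)


def toy_match(pattern: str, genotype: str) -> bool:
    # Toy rule: '*' matches everything, otherwise require string equality.
    return pattern == "*" or pattern == genotype


genotypes = ["A|A", "A|a", "a|A", "a|a"]
indices = resolve_indices(["A|a", 3, "a|a"], genotypes, toy_match)
print(indices)  # [1, 3]
```

Index 3 is selected twice (directly and through `"a|a"`), and the set removes the duplicate before sorting.
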
create_filter_function
create_filter_function(gen_spec: Optional[Iterable[Any]], unordered: bool = False) -> Callable[[Any], bool]

Create a filter function for genotype selection.

Parameters:

Name Type Description Default
gen_spec Optional[Iterable[Any]]

Genotype selector specification.

required
unordered bool

Whether to use unordered matching.

False

Returns:

Type Description
Callable[[Any], bool]

A callable that takes a genotype and returns True if it matches.

Source code in src/natal/genetic_patterns.py
def create_filter_function(
    self,
    gen_spec: Optional[Iterable[Any]],
    unordered: bool = False
) -> Callable[[Any], bool]:
    """Create a filter function for genotype selection.

    Args:
        gen_spec: Genotype selector specification.
        unordered: Whether to use unordered matching.

    Returns:
        A callable that takes a genotype and returns True if it matches.
    """
    if gen_spec is None:
        # Match all genotypes
        return lambda genotype: True

    # Handle single item vs iterable
    if not isinstance(gen_spec, (list, tuple, set)):
        gen_spec = [gen_spec]

    # Create pattern-based filters for string selectors
    pattern_filters: List[Callable[[Any], bool]] = []
    other_selectors: List[Any] = []

    for selector in gen_spec:
        if isinstance(selector, str):
            pattern = self.parser.parse(selector)
            pattern_filters.append(pattern.to_filter())
        else:
            other_selectors.append(selector)

    def filter_func(genotype: Any) -> bool:
        # Check pattern filters
        for pattern_filter in pattern_filters:
            if pattern_filter(genotype):
                return True

        # Check other selectors
        for selector in other_selectors:
            if self._genotypes_equal(selector, genotype, unordered):
                return True

        return False

    return filter_func
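
The returned filter combines its selectors with OR semantics, and `None` behaves differently from an empty list: `None` accepts everything, while an empty list accepts nothing because neither loop runs. The sketch below reproduces that behaviour with string genotypes and hypothetical predicates (`make_filter`, `is_homozygous`, `carries_a`), none of which are library names.

```python
from typing import Callable, Iterable, List, Optional

Predicate = Callable[[str], bool]


def make_filter(predicates: Optional[Iterable[Predicate]]) -> Predicate:
    if predicates is None:
        # No specification at all: accept everything.
        return lambda genotype: True
    preds: List[Predicate] = list(predicates)

    def combined(genotype: str) -> bool:
        # OR semantics: one matching selector is enough.
        return any(pred(genotype) for pred in preds)

    return combined


def is_homozygous(genotype: str) -> bool:
    maternal, paternal = genotype.split("|")
    return maternal == paternal


def carries_a(genotype: str) -> bool:
    return "a" in genotype.split("|")


genotypes = ["A|A", "A|a", "B|C"]
either = list(filter(make_filter([is_homozygous, carries_a]), genotypes))
everything = list(filter(make_filter(None), genotypes))
nothing = list(filter(make_filter([]), genotypes))
```
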
get_pattern_for_selector
get_pattern_for_selector(selector: Any) -> Optional[GenotypePattern]

Convert a selector to a GenotypePattern if possible.

Parameters:

Name Type Description Default
selector Any

Genotype selector.

required

Returns:

Type Description
Optional[GenotypePattern]

GenotypePattern if selector can be converted, None otherwise.

Source code in src/natal/genetic_patterns.py
def get_pattern_for_selector(self, selector: Any) -> Optional[GenotypePattern]:
    """Convert a selector to a GenotypePattern if possible.

    Args:
        selector: Genotype selector.

    Returns:
        GenotypePattern if selector can be converted, None otherwise.
    """
    if isinstance(selector, str):
        return self.parser.parse(selector)
    elif isinstance(selector, GenotypePattern):
        return selector
    else:
        return None