drvi.utils.tools.iterate_on_top_differential_vars
- drvi.utils.tools.iterate_on_top_differential_vars(traverse_adata, key, title_col='title', order_col='order', gene_symbols=None, score_threshold=0.0)
Create an iterator of top differential variables per latent dimension.
This function processes differential analysis results into a list of top differentially expressed genes for each latent dimension and direction. Genes within each entry are sorted by effect score, and entries follow the dimension order.
- Parameters:
traverse_adata (AnnData) – AnnData object with differential analysis results from calculate_differential_vars. Must contain differential effect data for the specified key.
key (str) – Key prefix for the differential variables in traverse_adata. Should correspond to a key used in find_differential_effects or calculate_differential_vars. Common value: “combined_score”.
title_col (str (default: 'title')) – Column name in traverse_adata.obs containing dimension titles. These titles will be used in the output dimension names.
order_col (str (default: 'order')) – Column name in traverse_adata.obs containing dimension ordering.
gene_symbols (str | None (default: None)) – Column name in traverse_adata.var containing gene symbols. If None, uses the index of traverse_adata.var (usually gene IDs). Useful for converting between gene IDs and readable gene names.
score_threshold (float (default: 0.0)) – Minimum score threshold to include genes in the results. Only genes with scores above this threshold will be included.
- Return type:
list[tuple[str, pd.Series]]
- Returns:
List of tuples, where each tuple contains:
- str: Dimension title with direction indicator (e.g., “Cell Cycle+”, “Cell Cycle-”)
- pd.Series: Series of gene scores for that dimension/direction, sorted descending
The list is sorted by dimension order, with each dimension appearing at most twice (once for positive effects, once for negative effects).
- Raises:
KeyError – If required columns or differential effect data are missing.
ValueError – If the specified key doesn’t exist in the AnnData object.
Notes
The function performs the following steps:
1. Extracts positive and negative differential effects for the specified key
2. Maps gene names to symbols if gene_symbols is provided
3. Filters genes by score threshold
4. Organizes results by dimension and direction (positive/negative)
5. Returns a list sorted by dimension order
Output Structure:
Each dimension can appear up to twice in the results: once for positive effects and once for negative effects. The direction is indicated by “+” or “-” appended to the dimension title.
Only dimensions with at least one gene above the threshold are included.
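The filtering and ordering described in these notes can be pictured with plain pandas. The sketch below is not drvi's implementation: `build_top_vars`, the two score tables, and the dimension titles are hypothetical stand-ins, and the choice to list "+" before "-" within a dimension is an assumption.

```python
import pandas as pd


def build_top_vars(pos, neg, titles, order, score_threshold=0.0):
    """Hypothetical stand-in: pos/neg are gene-by-dimension score tables."""
    results = []
    # Walk the dimensions in the order given by the `order` series.
    for dim in sorted(titles.index, key=lambda d: order[d]):
        for sign, table in (("+", pos), ("-", neg)):
            scores = table[dim]
            # Keep only genes strictly above the threshold, highest first.
            scores = scores[scores > score_threshold].sort_values(ascending=False)
            # A dimension/direction with no surviving gene is dropped.
            if len(scores) > 0:
                results.append((f"{titles[dim]}{sign}", scores))
    return results


# Toy data (made up for illustration): three genes, two latent dimensions.
genes = ["G1", "G2", "G3"]
pos = pd.DataFrame({"d0": [2.0, 0.0, 0.5], "d1": [0.0, 0.0, 0.0]}, index=genes)
neg = pd.DataFrame({"d0": [0.0, 0.0, 0.0], "d1": [0.0, 1.5, 0.0]}, index=genes)
titles = pd.Series({"d0": "DR 1", "d1": "DR 2"})
order = pd.Series({"d0": 1, "d1": 0})

top_vars = build_top_vars(pos, neg, titles, order)
for title, scores in top_vars:
    print(title, scores.index.tolist())
```

In this toy case each dimension contributes a single entry, because the other direction has no gene above the threshold.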
Examples
>>> from drvi.utils.tools import iterate_on_top_differential_vars
>>> # Basic iteration over top differential variables
>>> top_vars = iterate_on_top_differential_vars(traverse_adata, "combined_score")
>>> for dim_title, gene_scores in top_vars:
...     print(f"{dim_title}: {len(gene_scores)} genes")
...     print(f"Top genes: {gene_scores.head().index.tolist()}")
>>> # With custom parameters and gene symbols
>>> top_vars = iterate_on_top_differential_vars(
...     traverse_adata, "max_possible", gene_symbols="gene_symbol", score_threshold=1.0
... )
>>> # Create a summary of results
>>> for dim_title, gene_scores in top_vars:
...     print(f"{dim_title}: {gene_scores.head().index.tolist()}")
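Because the return value is a plain list of (title, Series) tuples, it can be flattened into a single table for export or plotting. The following is a minimal sketch that assumes only the documented return shape; the `top_vars` list here is hand-written, hypothetical data standing in for a real call, and the gene names and scores are made up.

```python
import pandas as pd

# Hypothetical stand-in for the list returned by the function.
top_vars = [
    ("DR 1+", pd.Series({"GeneA": 3.2, "GeneB": 1.1})),
    ("DR 1-", pd.Series({"GeneC": 2.4})),
]

# One row per (dimension, gene) pair, keeping at most 10 genes per entry.
rows = [
    {"dimension": title, "gene": gene, "score": score}
    for title, scores in top_vars
    for gene, score in scores.head(10).items()
]
summary = pd.DataFrame(rows)
print(summary)
```

A long-format table like this keeps the direction suffix in the `dimension` column, so positive and negative entries for the same dimension stay distinguishable.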