drvi.utils.plotting.show_differential_vars_scatter_plot

drvi.utils.plotting.show_differential_vars_scatter_plot(traverse_adata, key_x, key_y, key_combined, title_col='title', order_col='order', gene_symbols=None, score_threshold=0.0, dim_subset=None, ncols=3, show=True, **kwargs)

Show a scatter plot of differential variables considering multiple criteria.

This function creates one scatter plot per latent dimension, comparing two differential effect measures (usually “max_possible” and “min_possible”) against each other. Points are color-coded by the combined score. The plot is useful for understanding how the different analysis methods relate to each other and for identifying genes that show consistent effects across multiple criteria. The top 20 genes by combined score are labeled with their names.

Parameters:
  • traverse_adata (AnnData) – AnnData object containing the differential analysis results from calculate_differential_vars. Must contain differential effect data for all specified keys.

  • key_x (str) – Key for the x-axis variable in traverse_adata.varm. Typically “max_possible” or “min_possible”.

  • key_y (str) – Key for the y-axis variable in traverse_adata.varm. Typically “min_possible” or “max_possible”.

  • key_combined (str) – Key for the color-coded variable in traverse_adata.varm. Typically “combined_score” for the final combined effect.

  • title_col (str (default: 'title')) – Column name in traverse_adata.obs that contains the titles for each dimension. These titles will be used as subplot titles.

  • order_col (str (default: 'order')) – Column name in traverse_adata.obs that specifies the order of dimensions. Results will be sorted by this column. Ignored if dim_subset is provided.

  • gene_symbols (str | None (default: None)) – Column name in traverse_adata.var that contains gene symbols. If provided, gene symbols will be used for point labels instead of gene indices.

  • score_threshold (float (default: 0.0)) – Threshold value for gene scores. Only genes with combined scores above this threshold will be plotted.

  • dim_subset (Sequence[str] | None (default: None)) – Subset of dimensions to plot. If None, all dimensions with significant effects are plotted.

  • ncols (int (default: 3)) – Number of columns in the plot grid.

  • show (bool (default: True)) – Whether to display the plot. If False, returns the figure object.

  • **kwargs – Additional keyword arguments passed to the scatter plot (e.g., alpha, s for point size).

Returns:

matplotlib.figure.Figure or None – The figure object if show=False, otherwise None.
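A minimal usage sketch, assuming traverse_adata was already produced by calculate_differential_vars and that its varm holds the typical keys named above (“max_possible”, “min_possible”, “combined_score”). The wrapper name plot_min_vs_max is hypothetical and not part of drvi; alpha and s are the example keyword arguments mentioned for **kwargs. The import sits inside the function so the wrapper can be defined before drvi is loaded.

```python
def plot_min_vs_max(traverse_adata, dim_subset=None, score_threshold=0.0):
    """Hypothetical wrapper: return the figure instead of displaying it."""
    # Imported lazily; calling this function requires the drvi package.
    from drvi.utils.plotting import show_differential_vars_scatter_plot

    return show_differential_vars_scatter_plot(
        traverse_adata,
        key_x="max_possible",           # assumed to exist in traverse_adata.varm
        key_y="min_possible",           # assumed to exist in traverse_adata.varm
        key_combined="combined_score",  # assumed to exist in traverse_adata.varm
        score_threshold=score_threshold,
        dim_subset=dim_subset,
        ncols=3,
        show=False,  # documented to return the Figure rather than None
        alpha=0.7,   # forwarded to the scatter plot via **kwargs
        s=10,        # point size, also forwarded via **kwargs
    )
```

Because show=False returns a matplotlib Figure, the result can be saved afterwards, for example with fig.savefig("differential_vars.png").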

Raises:
  • KeyError – If required data is missing from traverse_adata.

  • ValueError – If any of the specified keys don’t exist in the AnnData object.
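To catch these errors before plotting, the inputs can be checked up front. The sketch below is a hypothetical helper (find_missing_inputs is not part of drvi) that only relies on traverse_adata.varm supporting membership tests and traverse_adata.obs exposing a columns attribute, as AnnData objects do. A stand-in object is used so the example runs without AnnData installed.

```python
from types import SimpleNamespace


def find_missing_inputs(traverse_adata, varm_keys, obs_cols):
    """Hypothetical helper: list expected varm keys / obs columns that are absent."""
    missing = [f"varm[{k!r}]" for k in varm_keys if k not in traverse_adata.varm]
    missing += [f"obs[{c!r}]" for c in obs_cols if c not in traverse_adata.obs.columns]
    return missing


# Stand-in with the same attribute layout as an AnnData object (demo only).
fake = SimpleNamespace(
    varm={"max_possible": None, "min_possible": None},
    obs=SimpleNamespace(columns=["title", "order"]),
)

print(find_missing_inputs(
    fake,
    ["max_possible", "min_possible", "combined_score"],
    ["title", "order"],
))
# → ["varm['combined_score']"]
```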

Notes

The function performs the following steps:
  1. Extracts differential variables for all three keys (x, y, combined)
  2. Creates scatter plots for each dimension comparing the two measures
  3. Color-codes points by the combined score
  4. Labels the top 20 genes by combined score
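The filtering and labeling logic in these steps can be illustrated for a single dimension in plain Python. This is a sketch under stated assumptions, not the library's implementation: select_points and its arguments are hypothetical, and the threshold is treated as a strict "greater than" comparison, which is one reading of "above this threshold".

```python
def select_points(genes, x, y, combined, score_threshold=0.0, n_labels=20):
    """Illustrative only: filter by combined score, then pick genes to label."""
    rows = [
        (g, xv, yv, cv)
        for g, xv, yv, cv in zip(genes, x, y, combined)
        if cv > score_threshold  # assumption: strictly above the threshold
    ]
    # Highest combined score first; the first n_labels names would get text labels.
    rows.sort(key=lambda r: r[3], reverse=True)
    labeled = [r[0] for r in rows[:n_labels]]
    return rows, labeled


# Made-up values for four genes in one latent dimension.
genes = ["g1", "g2", "g3", "g4"]
x = [2.0, 0.5, 1.5, 0.1]          # e.g. the key_x measure
y = [1.0, 0.2, 1.4, 0.0]          # e.g. the key_y measure
combined = [1.0, 0.2, 1.4, 0.0]   # e.g. the key_combined score

rows, labeled = select_points(genes, x, y, combined, score_threshold=0.1, n_labels=2)
print(labeled)  # → ['g3', 'g1']
```

Each remaining row carries the x position, y position, and color value for one point; only the genes in labeled would be annotated.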

Interpretation:

  • X-axis: Effect measure from key_x (e.g., max_possible)

  • Y-axis: Effect measure from key_y (e.g., min_possible)

  • Color: Combined score from key_combined

  • Point position: Relationship between the two measures

  • Labeled points: Genes with highest combined scores