API
Import pyvia as:
import pyVIA.core as via
Full API
pyVIA core
- class VIA.core.VIA(data, true_label=None, edgepruning_clustering_resolution_local=1, edgepruning_clustering_resolution=0.15, labels=None, keep_all_local_dist='auto', too_big_factor=0.4, resolution_parameter=1.0, partition_type='ModularityVP', small_pop=10, jac_weighted_edges=True, knn=30, n_iter_leiden=5, random_seed=42, num_threads=-1, distance='l2', time_smallpop=15, super_cluster_labels=False, super_node_degree_list=False, super_terminal_cells=False, x_lazy=0.99, alpha_teleport=0.99, root_user=None, preserve_disconnected=True, dataset='', super_terminal_clusters=[], is_coarse=True, csr_full_graph='', csr_array_locally_pruned='', ig_full_graph='', full_neighbor_array='', full_distance_array='', embedding=None, df_annot=None, preserve_disconnected_after_pruning=False, secondary_annotations=None, pseudotime_threshold_TS=30, cluster_graph_pruning=0.15, visual_cluster_graph_pruning=0.15, neighboring_terminal_states_threshold=3, num_mcmc_simulations=1300, piegraph_arrow_head_width=0.1, piegraph_edgeweight_scalingfactor=1.5, max_visual_outgoing_edges=2, via_coarse=None, velocity_matrix=None, gene_matrix=None, velo_weight=0.5, edgebundle_pruning=None, A_velo=None, CSM=None, edgebundle_pruning_twice=False, pca_loadings=None, time_series=False, time_series_labels=None, knn_sequential=10, knn_sequential_reverse=0, t_diff_step=1, single_cell_transition_matrix=None, embedding_type='via-mds', do_compute_embedding=False, color_dict=None, user_defined_terminal_cell=[], user_defined_terminal_group=[], do_gaussian_kernel_edgeweights=False, RW2_mode=False, working_dir_fp='/home/', memory=5, viagraph_decay=0.9, p_memory=1, graph_init_pos=None, spatial_coords=None, do_spatial_knn=False, do_spatial_layout=False, spatial_knn=15, spatial_aux=[])[source]
A class to represent the VIA analysis
- Parameters:
data (ndarray) – input matrix of size n_cells x n_dims. Expects the PCs or features that will be used in the TI computation. Can be e.g. adata.obsm[‘X_pca][:,0:20]
true_label (list) – list of str/int that correspond to the ground truth or reference annotations. Can also be None when no labels are available
labels (ndarray (nsamples, )) – default is None. and PARC clusters are used for the viagraph. alternatively provide a list of clustermemberships that are integer values (not strings) to construct the viagraph using another clustering method or available annotations
edgepruning_clustering_resolution_local (float) – default = 2 local level of pruning for PARC graph clustering stage. Range (0.1,3) higher numbers mean more edge retention. For large datasets can stick to just tuning edgepruning_clustering_resolution
edgepruning_clustering_resolution (float) – (optional, default = 0.15, can also set as ‘median’) graph pruning for PARC clustering stage. Higher value keeps more edges, results in fewer clusters. Smaller value removes more edges and results in more clusters. Number of standard deviations below the network’s mean-jaccard-weighted edges. 0.1-1 provide reasonable pruning. higher value means less pruning (more edges retained). e.g. a value of 0.15 means all edges that are above mean(edgeweight)-0.15*std(edge-weights) are retained. We find both 0.15 and ‘median’ to yield good results/starting point and resulting in pruning away ~ 50-60% edges
keep_all_local_dist (bool, str) – default value of ‘auto’ means that for smaller datasets local-pruning is done prior to clustering, but for large datasets local pruning is set to False for speed. can also set to be bool of True or False
too_big_factor (float) – (optional, default=0.4). Forces clusters > 0.4*n_cells to be re-clustered
resolution_parameter (float) – (default =1) larger value means more and smaller clusters
partition_type (str) – (default “ModularityVP”) Options
small_pop (int) – (default 10) Via attempts to merge Clusters with a population < 10 cells with larger clusters. If you have a very small dataset (e.g. few hundred cells), then consider lowering to e.g. 5
jac_weighted_edges (bool) – (default = True) Use weighted edges in the PARC clustering step
knn (int) – (optional, default = 30) number of K-Nearest Neighbors for HNSWlib KNN graph. Larger knn means more graph connectivity. Lower knn means more loosely connected clusters/cells
n_iter_leiden (int) –
random_seed (int) – Random seed to pass to clustering
num_threads –
distance (str) – (default ‘l2’) Euclidean distance ‘l2’ by default; other options ‘ip’ and ‘cosine’ for graph construction and similarity
visual_cluster_graph_pruning (float) – (optional, default = 0.15) This only comes into play if the user deliberately chooses not to use the default edge-bundling method of visualizating edges (draw_piechart_graph()) and instead calls draw_piechart_graph_nobundle(). It is often set to the same value as the PARC clustering level of edgepruning_clustering_resolution. This does not impact computation of terminal states, pseudotime or lineage likelihoods. It controls the number of edges plotted for visual effect
cluster_graph_pruning (float) – (optional, default =0.15) Pruning level of the cluster graph (does not impact number of clusters). Only impacts the connectivity of the clustergraph. Often set to the same value as the PARC clustering level of edgepruning_clustering_resolution.Reasonable range [0.1,1] To retain more connectivity in the clustergraph underlying the trajectory computations, increase the value
time_smallpop (max time to be allowed handling singletons) –
x_lazy (float) – (default =0.95) 1-x = probability of staying in same node (lazy). Values between 0.9-0.99 are reasonable
alpha_teleport (float) – (default = 0.99) 1-alpha is probability of jumping. Values between 0.95-0.99 are reasonable unless prior knowledge of teleportation
root_user (list, None) – can be a list of strings, a list of int or None (default is None) When the root_user is set as None and an RNA velocity matrix is available, a root will be automatically computed if the root_user is None and not velocity matrix is provided, then an arbitrary root is selected if the root_user is [‘celltype_earlystage’] where the str corresponds to an item in true_label, then a suitable starting point will be selected corresponding to this group if the root_user is [678], where 678 is the index of the cell chosen as a start cell, then this will be the designated starting cell. It is possible to give a list of root indices and groups. [120, 699] or [‘traj1_earlystage’, ‘traj2_earlystage’] when there are more than one trajectories
preserve_disconnected (bool) – (default = True) If you believe there may be disconnected trajectories then set this to False
dataset (str) – Can be set to ‘group’ or ‘’ (default). this refers to the type of root label (group level root or single cell index) you are going to provide. if your true_label has a sensible group of cells for a root then you can set dataset to ‘group’ and make the root parameter [‘labelname_root_cell_type’] if your root corresponds to one particular cell then set dataset = ‘’ (default)
embedding (ndarray) – (optional, default = None) embedding (e.g. precomputed tsne, umap, phate, via-umap) for plotting data. Size n_cells x 2 If an embedding is provided when running VIA, then a scatterplot colored by pseudotime, highlighting terminal fates
velo_weight (float) – (optional, default = 0.5) #float between [0,1]. the weight assigned to directionality and connectivity derived from scRNA-velocity
neighboring_terminal_states_threshold (int) – (default = 3). Candidates for terminal states that are neighbors of each other may be removed from the list if they have this number of more of terminal states as neighbors
knn_sequential (int) – (default =10) number of knn in the adjacent time-point for time-series data (t_i and t_i+1)
knn_sequential_reverse (int) – (default = 0) number of knn enforced from current to previous time point
t_diff_step (int) – (default =1) Number of permitted temporal intervals between connected nodes. If time data is labeled as [0,25,50,75,100,..] then t_diff_step=1 corresponds to ‘25’ and only edges within t_diff_steps are retained
is_coarse (bool) – (default = True) If running VIA in two iterations where you wish to link the second fine-grained iteration with the initial iteration, then you set to False
via_coarse (VIA) – (default = None) If instantiating a second iteration of VIA that needs to be linked to a previous iteration (e.g. via0), then set via_coarse to the previous via0 object
df_annot (DataFrame) – (default None) used for the Mouse Organ data
preserve_disconnected_after_pruning (bool) – (default = False) If you believe there are disconnected trajectories then set this to True and test your hypothesis
A_velo (ndarray) – Cluster Graph Transition matrix based on rna velocity [n_clus x n_clus]
velocity_matrix (matrix) – (default None) matrix of size [n_samples x n_genes]. this is the velocity matrix computed by scVelo (or similar package) and stored in adata.layers[‘velocity’]. The genes used for computing velocity should correspond to those useing in gene_matrix Requires gene_matrix to be provided too.
gene_matrix (matrix) – (default None) Only used if Velocity_matrix is available. matrix of size [n_samples x n_genes]. We recommend using a subset like HVGs rather than full set of genes. (need to densify input if taking from adata = adata.X.todense())
time_series (bool) – (default False) if the data has time-series labels then set to True
time_series_labels (list) – (default None) list of integer values of temporal annoataions corresponding to e.g. hours (post fert), days, or sequential ordering
pca_loadings (array) – (default None) the loadings of the pcs used to project the cells (to projected euclidean location based on velocity). n_cells x n_pcs
secondary_annotations (None) – (default None)
edgebundle_pruning (float) – (default=None) will by default be set to the same as the cluster_graph_pruning and influences the visualized level of pruning of edges. Typical values can be between [0,1] with higher numbers retaining more edges
edgebundle_pruning_twice (bool) –
default: False. When True, the edgebundling is applied to a further visually pruned (visual_cluster_graph_pruning) and can sometimes simplify the visualization. it does not impact the pseudotime and lineage computations piegraph_arrow_head_width: float
(default = 0.1) size of arrow heads in via cluster graph
piegraph_edgeweight_scalingfactor – (defaulf = 1.5) scaling factor for edge thickness in via cluster graph
max_visual_outgoing_edges (int) – (default =2) Only allows max_visual_outgoing_edges to come out of any given node. Used in differentiation_flow()
edgebundle_pruning – (default=None) will by default be set to the same as the cluster_graph_pruning and influences the visualized level of pruning of edges. Typical values can be between [0,1] with higher numbers retaining more edges
edgebundle_pruning_twice – default: False. When True, the edgebundling is applied to a further visually pruned (visual_cluster_graph_pruning) and can sometimes simplify the visualization for very cluttered graphs. it does not impact the pseudotime and lineage computations
pseudotime_threshold_TS (int) – (default = 30) corresponds to the criteria for a state to be considered a candidate terminal cell fate to be 30% or later of the computed psuedotime range
num_mcmc_simulations (int) – (default = 1300) number of random walk simulations conducted
embedding_type (str) – (default = ‘via-mds’, other options are ‘via-atlas’ and ‘via-force’
do_compute_embedding (bool) – (default = False) If you want an embedding (n_samples x2) to be computed on the basis of the via sc graph then set this to True
do_gaussian_kernel_edgeweights (bool) – (default = False) Type of edgeweighting on the graph edges
memory (1/q * edge weight to a next-node that is not a neighbor of previous node. larger number means more memory and more introspective walk. small number <1 means more exploration) – (default = 2) higher q means more memory, more retrospective/inwards randomwalk. memory = 2 means run using the non-memory Via 1.0 mode
viagraph_decay (float) – (default = 0.9) increasing decay causes more edges to merge
memory –
p_memory (1/p * edge weight to next node = previous node. large value means more exploration) –
graph_init_pos (matrix (or list of lists) to initialize the viagraph) –
spatial_coords (np.ndarray of size n_cells x 2 (denoting x,y coordinates) of each spot/cell) –
do_spatial_knn (Whether or not to do spatial mode of StaVia for graph augmentation) –
do_spatial_layout (whether to use spatial coords for layout of the clustergraph) –
spatial_knn (int = 15. number of knn's added based on spatial proximity indiciated by spatial_coords) –
spatial_aux (list = [] a list of slice IDs so that only cells/spots on the same slice are considered when building the spatial_knn graph) –
- labels
length (n_samples, ) of cluster labels ndarray pre determined cluster labels user defined. #np.asarray(pre_labels).flatten()
- Type:
array
- single_cell_pt_markov
length n_samples of pseudotime
- Type:
list
- single_cell_bp
[n_lineages x n_samples] array of single cell branching probabilities towards each lineage (lineage normalized). Each column corresponds to a terminal state, in the order presented by the terminal_clusters attribute
- Type:
ndarray
- single_cell_bp_rownormed
[n_lineages x n_samples] array of single cell branching probabilities towards each lineage (cell normalized). Each column corresponds to a terminal state, in the order presented by the terminal_clusters attribute
- Type:
ndarray
- terminal_clusters
list of clusters that are cell fates/ unique lineages
- Type:
list
- cluster_bp
[n_clusters x n_terminal_states]. Lineage probability of cluster towards a particular terminal cluster state
- Type:
ndarray
- CSM
[n_cluster x n_clusters] array of cosine similarity used to weight the cluster graph transition matrix by velocity
- Type:
ndarray
- single_cell_transition_matrix
[n_samples x n_samples]
- Type:
ndarray
- terminal_clusters
(default None) list of terminal clusters
- Type:
list
- csr_full_graph
- Type:
csr matrix of single-cell graph (augmented with sequential data when providing time_series information)
- csr_array_locally_pruned
- Type:
csr matrix
- ig_full_graph
- full_neighbor_array
- user_defined_terminal_cell
- Type:
list=[]
- user_defined_terminal_group
- Type:
list=[]
- n_milestones
- Type:
int = None Number of milestones in the via-mds computation (anything more than 10,000 can be computationally heavy and time consuming) Typically auto-determined within the via-mds function
- embedding
[n_cells x 2] provided by user or autocomputed with via-mds or via-umap
- Type:
ndarray
- sc_transition_matrix(smooth_transition, b=10, use_sequentially_augmented=False)[source]
#computes the single cell level transition directions that are later used to calculate velocity of embedding #based on changes at single cell level in genes and single cell level velocity
- Parameters:
smooth_transition –
b – slope of logistic function
- Returns:
Plotting
- VIA.plotting_via.animate_atlas(hammerbundle_dict=None, via_object=None, linewidth_bundle=2, frame_interval=10, n_milestones=None, facecolor='white', cmap='plasma_r', extra_title_text='', size_scatter=1, alpha_scatter=0.2, saveto='/home/user/Trajectory/Datasets/animation_default.gif', time_series_labels=None, lineage_pathway=[], sc_labels_numeric=None, show_sc_embedding=False, sc_emb=None, sc_size_scatter=10, sc_alpha_scatter=0.2, n_intervals=50, n_repeat=2)[source]
- Parameters:
ax – axis to plot on
hammer_bundle – hammerbundle object with coordinates of all the edges to draw
layout – coords of cluster nodes and optionally also contains the numeric value associated with each cluster (such as time-stamp) layout[[‘x’,’y’,’numeric label’]] sc/cluster/milestone level
CSM – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when available
velocity_weight – percentage weightage given to the RNA velocity based transition matrix
pt – cluster-level pseudotime
alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
frame_interval (
int
) – smaller number, faster refresh and videofacecolor (
str
) – default = whiteheadwidth_bundle – headwidth of arrows used in bundled edges
arrow_frequency – min dist between arrows (bundled edges otherwise have overcrowding of arrows)
show_direction – True will draw arrows along the lines to indicate direction
milestone_edges – pandas DataFrame milestone_edges[[‘source’,’target’]]
:param t_diff_factor scaling the average the time intervals (0.25 means that for each frame, the time is progressed by 0.25* mean_time_differernce_between adjacent times (only used when sc_labels_numeric are directly passed instead of using pseudotime) :type show_sc_embedding:
bool
:param show_sc_embedding: plot the single cell embedding under the edges :param sc_emb numpy array of single cell embedding (ncells x 2) :param sc_alpha_scatter, Alpha transparency value of points of single cells (1 is opaque, 0 is fully transparent) :param sc_size_scatter. size of scatter points of single cells :param n_repeat. number of times you repeat the whole process :return: axis with bundled edges plotted
- VIA.plotting_via.animate_atlas_old(hammerbundle_dict=None, via_object=None, linewidth_bundle=2, frame_interval=10, n_milestones=None, facecolor='white', cmap='plasma_r', extra_title_text='', size_scatter=1, alpha_scatter=0.2, saveto='/home/user/Trajectory/Datasets/animation_default.gif', time_series_labels=None, lineage_pathway=[], sc_labels_numeric=None, t_diff_factor=0.25, show_sc_embedding=False, sc_emb=None, sc_size_scatter=10, sc_alpha_scatter=0.2, n_intervals=50)[source]
- Parameters:
ax – axis to plot on
hammer_bundle – hammerbundle object with coordinates of all the edges to draw
layout – coords of cluster nodes and optionally also contains the numeric value associated with each cluster (such as time-stamp) layout[[‘x’,’y’,’numeric label’]] sc/cluster/milestone level
CSM – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when available
velocity_weight – percentage weightage given to the RNA velocity based transition matrix
pt – cluster-level pseudotime
alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
frame_interval (
int
) – smaller number, faster refresh and videofacecolor (
str
) – default = whiteheadwidth_bundle – headwidth of arrows used in bundled edges
arrow_frequency – min dist between arrows (bundled edges otherwise have overcrowding of arrows)
show_direction – True will draw arrows along the lines to indicate direction
milestone_edges – pandas DataFrame milestone_edges[[‘source’,’target’]]
:param t_diff_factor scaling the average the time intervals (0.25 means that for each frame, the time is progressed by 0.25* mean_time_differernce_between adjacent times (only used when sc_labels_numeric are directly passed instead of using pseudotime) :type show_sc_embedding:
bool
:param show_sc_embedding: plot the single cell embedding under the edges :param sc_emb numpy array of single cell embedding (ncells x 2) :param sc_alpha_scatter, Alpha transparency value of points of single cells (1 is opaque, 0 is fully transparent) :param sc_size_scatter. size of scatter points of single cells :param time_series_labels, should be a single-cell level list (n_cells) of numerical values that form a discrete set. I.e. not continuous like pseudotime, :return: axis with bundled edges plotted
- VIA.plotting_via.animate_streamplot(via_object, embedding, density_grid=1, linewidth=0.5, min_mass=1, cutoff_perc=None, scatter_size=500, scatter_alpha=0.2, marker_edgewidth=0.1, smooth_transition=1, smooth_grid=0.5, color_scheme='annotation', other_labels=[], b_bias=20, n_neighbors_velocity_grid=None, fontsize=8, alpha_animate=0.7, cmap_scatter='rainbow', cmap_stream='Blues', segment_length=1, saveto='/home/shobi/Trajectory/Datasets/animation.gif', use_sequentially_augmented=False, facecolor_='white', random_seed=0)[source]
Draw Animated vector plots. the Saved .gif file saved at the saveto address, is the best for viewing the animation as the fig, ax output can be slow
- Parameters:
via_object – viaobject
embedding – ndarray (nsamples,2) umap, tsne, via-umap, via-mds
density_grid –
linewidth –
min_mass –
cutoff_perc –
scatter_size –
scatter_alpha –
marker_edgewidth –
smooth_transition –
smooth_grid –
color_scheme – ‘annotation’, ‘cluster’, ‘other’
add_outline_clusters –
cluster_outline_edgewidth –
gp_color –
bg_color –
title –
b_bias –
n_neighbors_velocity_grid –
fontsize –
alpha_animate –
cmap_scatter –
cmap_stream – string of a cmap for streamlines, default = ‘Blues’ (for dark blue lines) . Consider ‘Blues_r’ for white lines OR ‘Greys/_r’ ‘gist_yard/_r’
color_stream – string like ‘white’. will override cmap_stream
segment_length –
- Returns:
fig, ax.
- VIA.plotting_via.get_gene_expression(via_object, gene_exp, cmap='jet', dpi=150, marker_genes=[], linewidth=2.0, n_splines=10, spline_order=4, fontsize_=8, marker_lineages=[], optional_title_text='', cmap_dict=None)[source]
- Parameters:
via_object – via object
gene_exp (
DataFrame
) – dataframe where columns are features (gene) and rows are single cellscmap (
str
) – default: ‘jet’dpi (
int
) – default:150marker_genes (
list
) – Default is to use all genes in gene_exp. other provide a list of marker genes that will be used from gene_exp.linewidth (
float
) – default:2n_slines – default:10 Note n_splines must be > spline_order.
spline_order (
int
) – default:4 n_splines must be > spline_order.marker_lineages – Default is to use all lineage pathways. other provide a list of lineage number (terminal cluster number).
cmap_dict (
dict
) – {lineage number: ‘color’}
- Returns:
fig, axs
- VIA.plotting_via.make_dict_of_clusters_for_each_celltype(via_labels=[], true_label=[], verbose=False)[source]
- Parameters:
via_labels (
list
) – usually set to via_object.labels. list of length n_cells of cluster membershiptrue_label (
list
) – cell type labels (list of length n_cells)
- Returns:
- VIA.plotting_via.make_edgebundle_milestone(embedding=None, sc_graph=None, via_object=None, sc_pt=None, initial_bandwidth=0.03, decay=0.7, n_milestones=None, milestone_labels=[], sc_labels_numeric=None, weighted=True, global_visual_pruning=0.5, terminal_cluster_list=[], single_cell_lineage_prob=None, random_state=0)[source]
Perform Edgebundling of edges in a milestone level to return a hammer bundle of milestone-level edges. This is more granular than the original parc-clusters but less granular than single-cell level and hence also less computationally expensive requires some type of embedding (n_samples x 2) to be available
- Parameters:
embedding (
ndarray
) – optional (not required if via_object is provided) embedding single cell. also looks nice when done on via_mds as more streamlined continuous diffused graph structure. Umap is a but “clustery”graph – optional (not required if via_object is provided) igraph single cell graph level
via_object – via_object (best way to run this function by simply providing via_object)
sc_graph – igraph graph set as the via attribute self.ig_full_graph (affinity graph)
initial_bandwidth – increasing bw increases merging of minor edges
decay – increasing decay increases merging of minor edges #https://datashader.org/user_guide/Networks.html
milestone_labels (
list
) – default list=[]. Usually autocomputed. but can provide as single-cell level labels (clusters, groups, which function as milestone groupings of the single cells)sc_labels_numeric (
list
) – default is None which automatically chooses via_object’s pseudotime or time_series_labels (when available). otherwise set to a list of numerical values representing some sequential/chronological informationterminal_cluster_list (
list
) – default list [] and automatically uses all terminal clusters. otherwise set to any of the terminal cluster numbers within a list
- Returns:
dictionary containing keys: hb_dict[‘hammerbundle’] = hb hammerbundle class with hb.x and hb.y containing the coords hb_dict[‘milestone_embedding’] dataframe with ‘x’ and ‘y’ columns for each milestone and hb_dict[‘edges’] dataframe with columns [‘source’,’target’] milestone for each each and [‘cluster_pop’], hb_dict[‘sc_milestone_labels’] is a list of milestone label for each single cell
- VIA.plotting_via.plot_all_spatial_clusters(spatial_coords, true_label, via_labels, save_to='', color_dict={}, cmap='rainbow', alpha=0.4, s=5, verbose=False, reference_labels=[], reference_labels2=[])[source]
- Parameters:
spatial_coords – ndarray of x,y coords of tissue location of cells (ncells x2)
true_label – categorial labels (list of length n_cells)
via_labels – cluster membership labels (list of length n_cells)
save_to (
str
) –color_dict (
dict
) – optional dict with keys corresponding to true_label type. e.g. {true_label_celltype1: ‘green’,true_label_celltype2: ‘red’}cmap (
str
) – string default = rainbowreference_labels (
list
) – optional list of single-cell labels (e.g. time, annotation). Used to selectively provide a grey background to cells not in the cluster being inspected. If you have multipe time points, then set reference_labels to the time_points. All cells in the most prevalent timepoint seen in the cluster of interest will be plotted as a backgroundreference_labels2 (
list
) – optional list of single-cell labels (e.g. time, annotation). this will be used in the title of each subplot to note the majority cell (ref2) type for each cluster
- Returns:
list lists of [[fig1, axs_set1], [fig2, axs_set2],…]
- VIA.plotting_via.plot_atlas_view(hammerbundle_dict=None, via_object=None, alpha_bundle_factor=1, linewidth_bundle=2, facecolor='white', cmap='plasma', extra_title_text='', alpha_milestones=0.3, headwidth_bundle=0.1, headwidth_alpha=0.8, arrow_frequency=0.05, show_arrow=True, sc_labels_sequential=None, sc_labels_expression=None, initial_bandwidth=0.03, decay=0.7, n_milestones=None, scale_scatter_size_pop=False, show_milestones=True, sc_labels=None, text_labels=False, lineage_pathway=[], dpi=300, fontsize_title=6, fontsize_labels=6, global_visual_pruning=0.5, use_sc_labels_sequential_for_direction=False, sc_scatter_size=3, sc_scatter_alpha=0.4, add_sc_embedding=True, size_milestones=5, colorbar_legend='pseudotime')[source]
Edges can be colored by time-series numeric labels, pseudotime, lineage pathway probabilities, or gene expression. If not specificed then time-series is chosen if available, otherwise falls back to pseudotime. to use gene expression the sc_labels_expression is provided as a list. To specify other numeric sequential data provide a list of sc_labels_sequential = [] n_samples in length. via_object.embedding must be an ndarray of shape (nsamples,2)
- Parameters:
hammer_bundle_dict – dictionary with keys: hammerbundle object with coordinates of all the edges to draw. If hammer_bundle and layout are None, then this will be computed internally
via_object – type via object, if hammerbundle_dict is None, then you must provide a via_object. Ensure that via_object has embedding attribute
layout – coords of cluster nodes and optionally also contains the numeric value associated with each cluster (such as time-stamp) layout[[‘x’,’y’,’numeric label’]] sc/cluster/milestone level
CSM – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when available
velocity_weight – percentage weightage given to the RNA velocity based transition matrix
pt – cluster-level pseudotime
alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
alpha_milestones (
float
) – float 0.3 alpha of milestonessize_milestones (
int
) – scatter size of the milestones (use sc_size_scatter to control single cell scatter when using in conjunction with lineage probs/ sc embeddings)arrow_frequency (
float
) – min dist between arrows (bundled edges otherwise have overcrowding of arrows)show_direction – True will draw arrows along the lines to indicate direction
milestone_edges – pandas DataFrame milestoone_edges[[‘source’,’target’]]
milestone_numeric_values – the milestone average of numeric values such as time (days, hours), location (position), or other numeric value used for coloring edges in a sequential manner if this is None then the edges are colored by length to distinguish short and long range edges
arrow_frequency – 0.05. higher means fewer arrows
n_milestones (
int
) – int None. if no hammerbundle_dict is provided, but via_object is provided, then the user can specify level of granularity by setting the n_milestones. otherwise it will be automatically selectedscale_scatter_size_pop (
bool
) – bool default Falsesc_labels_expression (
list
) – list single cell numeric values used for coloring edges and nodes of corresponding milestones mean expression levels (len n_single_cell samples) edges can be colored by time-series numeric (gene expression)/string (cell type) labels, pseudotime, or gene expression. If not specificed then time-series is chosen if available, otherwise falls back to pseudotime. to use gene expression the sc_labels_expression is provided as a listsc_labels_sequential (
list
) – list single cell numeric sequential values used for directionality inference as replacement for pseudotime or via_object.time_series_labels (len n_samples single cell)sc_labels (
list
) – list None list of single-cell level labels (categorial or discrete set of numerical values) to label the nodestext_labels (
bool
) – bool False if you want to label the nodes based on sc_labels (or true_label if via_object is provided)lineage_pathway (
list
) – list of terminal states to plot lineage pathwaysuse_sc_labels_sequential_for_direction (
bool
) – use the sequential data (timeseries labels or other provided by user) to direct the arrows
:param lineage_alpha_threshold number representing the percentile (0-100) of lineage likelikhood in a particular lineage pathway, below which edges will be drawn with lower alpha transparency factor :type sc_scatter_alpha:
float
:param sc_scatter_alpha: transparency of the background singlecell scatter when plotting lineages :type add_sc_embedding:bool
:param add_sc_embedding: add background of single cell scatter plot for Atlas :param scatter_size_sc_embedding :param colorbar_legend str title of colorbar :return: fig, axis with bundled edges plotted
- VIA.plotting_via.plot_clusters_spatial(spatial_coords, clusters=[], via_labels=[], title_sup='', fontsize_=6, color='green', s=5, alpha=0.5, xlim_max=None, ylim_max=None, xlim_min=None, ylim_min=None, reference_labels=[], reference_labels2=[], equal_axes_lim=True)[source]
- Parameters:
spatial_coords – ndarray of spatial coords ncellsx2 dims
clusters – the clusters in via_object.labels which you want to plot (usually a subset of the total number of clusters)
via_labels – via_object.labels (cluster level labels, list of n_cells length)
title_sup – title of the overall figure
fontsize – fontsize for legend
color – color of scatter points
s (
int
) – size of scatter pointsalpha – float alpha transparency of scatter (0 fully transporent, 1 is opaque)
xlim_max – limits of axes
ylim_max – limits of axes
xlim_min – limits of axes
ylim_min – limits of axes
reference_labels (
list
) – optional list of single-cell labels (e.g. time, annotation). this will be used in the title of each subplot to note the majority cell (ref2) type for each clusterreference_labels2 (
list
) – optional list of single-cell labels (e.g. time, annotation). this will be used in the title of each subplot to note the majority cell (ref2) type for each cluster
- Returns:
fig, axs
- VIA.plotting_via.plot_differentiation_flow(via_object, idx=None, dpi=150, marker_lineages=[], label_node=[], do_log_flow=True, fontsize=8, alpha_factor=0.9, majority_cluster_population_dict=None, cmap_sankey='rainbow', title_str='Differentiation Flow', root_cluster_list=None)[source]
#SANKEY PLOTS G is the igraph knn (low K) used for shortest path in high dim space. no idx needed as it’s made on full sample knn_hnsw is the knn made in the embedded space used for query to find the nearest point in the downsampled embedding that corresponds to the single cells in the full graph
- Parameters:
via_object –
embedding – n_samples x 2. embedding is 2D representation of the full dataset.
idx (
list
) – if one uses a downsampled embedding of the original data, then idx is the selected indices of the downsampled samples used in the visualizationcmap_name –
dpi –
:param do_log_flow bool True (default) take the natural log (1+edge flow value) :param label_node list of labels for each cell (could be cell type, stage level) length is n_cells :param scatter_size: if None, then auto determined based on n_cells :param marker_lineages: Default is to use all lineage pathways. other provide a list of lineage number (terminal cluster number). :type alpha_factor:
float
:param alpha_factor: float transparency :type root_cluster_list:list
:param root_cluster_list: list of roots by cluster number e.g. [5] means a good root is cluster number 5 :return: fig, axs
- VIA.plotting_via.plot_gene_trend_heatmaps(via_object, df_gene_exp, marker_lineages=[], fontsize=8, cmap='viridis', normalize=True, ytick_labelrotation=0, fig_width=7)[source]
Plot the gene trends on heatmap: a heatmap is generated for each lineage (identified by terminal cluster number). Default selects all lineages
- Parameters:
via_object –
df_gene_exp (
DataFrame
) – pandas DataFrame single-cell level expression [cells x genes]marker_lineages (
list
) – list default = None and plots all detected all lineages. Optionally provide a list of integers corresponding to the cluster number of terminal cell fatesfontsize (
int
) – int default = 8cmap (
str
) – str default = ‘viridis’normalize (
bool
) – bool = Trueytick_labelrotation (
int
) – int default = 0
- Returns:
fig and list of axes
- VIA.plotting_via.plot_piechart_only_viagraph(via_object, type_data='pt', gene_exp=[], cmap_piechart='rainbow', title='', cmap=None, ax_text=True, dpi=150, headwidth_arrow=0.1, alpha_edge=0.4, linewidth_edge=2, edge_color='darkblue', reference_labels=None, show_legend=True, pie_size_scale=0.8, fontsize=8, pt_visual_threshold=99, highlight_terminal_clusters=True, size_node_notpiechart=1, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]
plot two subplots with a clustergraph level representation of the viagraph showing true-label composition (lhs) and pseudotime/gene expression (rhs) Returns matplotlib figure with two axes that plot the clustergraph using edge bundling left axis shows the clustergraph with each node colored by annotated ground truth membership. right axis shows the same clustergraph with each node colored by the pseudotime or gene expression
- Parameters:
via_object – is class VIA (the same function also exists as a method of the class and an external plotting function
type_data – string default ‘pt’ for pseudotime colored nodes. or ‘gene’
gene_exp (
list
) – list of values (or column of dataframe) corresponding to feature or gene expression to be used to color nodes at CLUSTER levelcmap_piechart (
str
) – str cmap for piechart categoriestitle – string
cmap (
str
) – default None. automatically chooses coolwarm for gene expression or viridis_r for pseudotimeax_text – Bool default= True. Annotates each node with cluster number and population of membership
dpi – int default = 150
headwidth_arrow – default = 0.1. width of arrowhead used to directed edges
reference_labels – None or list. list of categorical (str) labels for cluster composition of the piecharts (LHS subplot) length = n_samples.
pie_size_scale (
float
) – float default=0.8 scaling factor of the piechart nodespt_visual_threshold (
int
) – int (percentage) default = 95 corresponding to rescaling the visual color scale by clipping outlier cluster pseudotimes
:param highlight_terminal_clusters:bool = True (red border around terminal clusters) :type size_node_notpiechart:
float
:param size_node_notpiechart: scaling factor for node size of the viagraph (not the piechart part) :param initial_bandwidth: (float = 0.05) increasing bw increases merging of minor edges. Only used when tune_edges = True :param decay: (decay = 0.9) increasing decay increases merging of minor edges . Only used when tune_edges = True :param edgebundle_pruning (float = 0.5). takes on values between 0-1. smaller value means more pruning away edges that can be visualised. Only used when tune_edges = True :return: f, ax, ax1
- VIA.plotting_via.plot_piechart_viagraph(via_object, type_data='pt', gene_exp=[], cmap_piechart='rainbow', title='', cmap=None, ax_text=True, dpi=150, headwidth_arrow=0.1, alpha_edge=0.4, linewidth_edge=2, edge_color='darkblue', reference_labels=None, show_legend=True, pie_size_scale=0.8, fontsize=8, pt_visual_threshold=99, highlight_terminal_clusters=True, size_node_notpiechart=1, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]
plot two subplots with a clustergraph level representation of the viagraph showing true-label composition (lhs) and pseudotime/gene expression (rhs) Returns matplotlib figure with two axes that plot the clustergraph using edge bundling left axis shows the clustergraph with each node colored by annotated ground truth membership. right axis shows the same clustergraph with each node colored by the pseudotime or gene expression
- Parameters:
via_object – is class VIA (the same function also exists as a method of the class and an external plotting function
type_data – string default ‘pt’ for pseudotime colored nodes. or ‘gene’
gene_exp (
list
) – list of values (or column of dataframe) corresponding to feature or gene expression to be used to color nodes at CLUSTER levelcmap_piechart (
str
) – str cmap for piechart categoriestitle – string
cmap (
str
) – default None. automatically chooses coolwarm for gene expression or viridis_r for pseudotimeax_text – Bool default= True. Annotates each node with cluster number and population of membership
dpi – int default = 150
headwidth_arrow – default = 0.1. width of arrowhead used to directed edges
reference_labels – None or list. list of categorical (str) labels for cluster composition of the piecharts (LHS subplot) length = n_samples.
pie_size_scale (
float
) – float default=0.8 scaling factor of the piechart nodespt_visual_threshold (
int
) – int (percentage) default = 95 corresponding to rescaling the visual color scale by clipping outlier cluster pseudotimes
:param highlight_terminal_clusters:bool = True (red border around terminal clusters) :type size_node_notpiechart:
float
:param size_node_notpiechart: scaling factor for node size of the viagraph (not the piechart part) :param initial_bandwidth: (float = 0.05) increasing bw increases merging of minor edges. Only used when tune_edges = True :param decay: (decay = 0.9) increasing decay increases merging of minor edges . Only used when tune_edges = True :param edgebundle_pruning (float = 0.5). takes on values between 0-1. smaller value means more pruning away edges that can be visualised. Only used when tune_edges = True :return: f, ax, ax1
- VIA.plotting_via.plot_population_composition(via_object, time_labels=None, celltype_list=None, cmap='rainbow', legend=True, alpha=0.5, linewidth=0.2, n_intervals=20, xlabel='time', ylabel='', title='Cell populations', color_dict=None, fraction=True)[source]
- Parameters:
via_object – optional. this is required unless both time_labels and cell_labels are provided as arguments to the function
time_labels (
list
) – list length n_cells of pseudotime or known stage numeric labelscell_labels – list of cell type or cluster length n_cells
- Returns:
ax
- VIA.plotting_via.plot_sc_lineage_probability(via_object, embedding=None, idx=None, cmap_name='plasma', dpi=150, scatter_size=None, marker_lineages=[], fontsize=8, alpha_factor=0.9, majority_cluster_population_dict=None, cmap_sankey='rainbow', do_sankey=False)[source]
G is the igraph knn (low K) used for shortest path in high dim space. no idx needed as it’s made on full sample knn_hnsw is the knn made in the embedded space used for query to find the nearest point in the downsampled embedding that corresponds to the single cells in the full graph
- Parameters:
via_object –
embedding (
ndarray
) – n_samples x 2. embedding is either the full or downsampled 2D representation of the full dataset.idx (
list
) – if one uses a downsampled embedding of the original data, then idx is the selected indices of the downsampled samples used in the visualizationcmap_name –
dpi –
scatter_size – if None, then auto determined based on n_cells
marker_lineages – Default is to use all lineage pathways. other provide a list of lineage number (terminal cluster number).
alpha_factor (
float
) – float transparency
- Returns:
fig, axs
- VIA.plotting_via.plot_scatter(embedding, labels, cmap='rainbow', s=5, alpha=0.3, edgecolors='None', title='', text_labels=True, color_dict=None, via_object=None, sc_index_terminal_states=None, true_labels=[], show_legend=True, hide_axes_ticks=True, color_labels_reverse=False)[source]
General scatter plotting tool for numeric and categorical labels on the single-cell level
- Parameters:
embedding (
ndarray
) – ndarray n_samples x 2labels (
list
) – list single cell labels list of number or stringscmap – str default = ‘rainbow’
s – int size of scatter dot
alpha – float with 0 transparent to 1 opaque default =0.3
edgecolors –
title (
str
) – strtext_labels (
bool
) – bool default =Truevia_object –
sc_index_terminal_states (
list
) – list of integers corresponding to one cell in each of the terminal statescolor_dict – {‘true_label_group_1’: #COLOR,’true_label_group_2’: #COLOR2,….} where the dictionary keys correspond to the provided labels
true_labels (
list
) – list of single cell labels used to annotate the terminal states
- Returns:
matplotlib pyplot fig, ax
- VIA.plotting_via.plot_trajectory_curves(via_object, embedding=None, idx=None, title_str='Pseudotime', draw_all_curves=True, arrow_width_scale_factor=15.0, scatter_size=50, scatter_alpha=0.5, linewidth=1.5, marker_edgewidth=1, cmap_pseudotime='viridis_r', dpi=150, highlight_terminal_states=True, use_maxout_edgelist=False)[source]
projects the graph based coarse trajectory onto a umap/tsne embedding
- Parameters:
via_object – via object
embedding (
ndarray
) – 2d array [n_samples x 2] with x and y coordinates of all n_samples. Umap, tsne, pca OR use the via computed embedding via_object.embeddingidx (
Optional
[list
]) – default: None. Or List. if you had previously computed a umap/tsne (embedding) only on a subset of the total n_samples (subsampled as per idx), then the via objects and results will be indexed according to idx tootitle_str (
str
) – title of figuredraw_all_curves (
bool
) – if the clustergraph has too many edges to project in a visually interpretable way, set this to False to get a simplified view of the graph pathwaysarrow_width_scale_factor (
float
) –scatter_size (
float
) –scatter_alpha (
float
) –linewidth (
float
) –marker_edgewidth (
float
) –cmap_pseudotime (
str
) –dpi (
int
) – int default = 150. Use 300 for paper figureshighlight_terminal_states (
bool
) – whether or not to highlight/distinguish the clusters which are detected as the terminal states by via
- Returns:
f, ax1, ax2
- VIA.plotting_via.plot_viagraph(via_object, type_data='gene', df_genes=None, gene_list=[], arrow_head=0.1, edgeweight_scale=1.5, cmap=None, label_text=True, size_factor_node=1, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]
cluster level expression of gene/feature intensity :param via_object: :param type_data: :param gene_exp: pd.Dataframe size n_cells x genes. Otherwise defaults to plotting pseudotime :type gene_list:
list
:param gene_list: list of gene names corresponding to the column name :type arrow_head:float
:param arrow_head: :type edgeweight_scale:float
:param edgeweight_scale: :param cmap: :type label_text:bool
:param label_text: bool to add numeric values of the gene exp level :param size_factor_node size of graph nodes :type tune_edges:bool
:param tune_edges: bool (false). if you want to change the number of edges visualized, then set this to True and modify the tuning parameters (initial_bandwidth, decay, edgebundle_pruning) :param initial_bandwidth: (float = 0.05) increasing bw increases merging of minor edges. Only used when tune_edges = True :param decay: (decay = 0.9) increasing decay increases merging of minor edges . Only used when tune_edges = True :param edgebundle_pruning (float = 0.5). takes on values between 0-1. smaller value means more pruning away edges that can be visualised. Only used when tune_edges = True :return: fig, axs
- VIA.plotting_via.plot_viagraph_(ax=None, hammer_bundle=None, layout=None, CSM=None, velocity_weight=None, pt=None, alpha_bundle=1, linewidth_bundle=2, edge_color='darkblue', headwidth_bundle=0.1, arrow_frequency=0.05, show_direction=True, ax_text=True, title='', plot_clusters=False, cmap='viridis', via_object=None, fontsize=9, dpi=300, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]
this plots the edgebundles on the via clustergraph level and also adds the relevant arrow directions based on the TI directionality
- Parameters:
ax – axis to plot on
hammer_bundle – hammerbundle object with coordinates of all the edges to draw. self.hammer
layout (
ndarray
) – coords of cluster nodesCSM (
ndarray
) – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when availablevelocity_weight (
float
) – percentage weightage given to the RNA velocity based transition matrixpt (
list
) – cluster-level pseudotime (or other intensity level of features at average-cluster level)alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
headwidth_bundle – headwidth of arrows used in bundled edges
arrow_frequency – min dist between arrows (bundled edges otherwise have overcrowding of arrows)
show_direction – bool default True. will draw arrows along the lines to indicate direction
plot_clusters (
bool
) – bool default False. When this function is called on its own (and not from within draw_piechart_graph() then via_object must be providedax_text (
bool
) – bool default True. Show labels of the clusters with the cluster population and PARC cluster labelfontsize (
float
) – float default 9 Font size of labels
- Returns:
fig, ax with bundled edges plotted
- VIA.plotting_via.via_atlas_emb(via_object=None, X_input=None, graph=None, n_components=2, alpha=1.0, negative_sample_rate=5, gamma=1.0, spread=1.0, min_dist=0.1, init_pos='via', random_state=0, n_epochs=100, distance_metric='euclidean', layout=None, cluster_membership=None, parallel=False, saveto='', n_jobs=2)[source]
Run dimensionality reduction using the VIA modified HNSW graph using via cluster graph initialization when Via_object is provided
- Parameters:
via_object – if via_object is provided then X_input and graph are ignored
X_input (
ndarray
) – ndarray nsamples x features (PCs)graph (
csr_matrix
) – csr_matrix of knngraph. This usually is via’s pruned, sequentially augmented sc-knn graph accessed as an attribute of via via_object.csr_full_graphn_components (
int
) –alpha (
float
) –negative_sample_rate (
int
) –gamma (
float
) – Weight to apply to negative samples.spread (
float
) – The effective scale of embedded points. In combination with min_dist this determines how clustered/clumped the embedded points are.min_dist (
float
) – The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result on a more even dispersal of pointsinit_pos (
Union
[str
,ndarray
]) – either a string (default) ‘via’ (uses via graph to initialize), or ‘spectral’. Or a n_cellx2 dimensional ndarray with initial coordinatesrandom_state (
int
) –n_epochs (
int
) – The number of training epochs to be used in optimizing the low dimensional embedding. Larger values result in more accurate embeddings. If 0 is specified a value will be selected based on the size of the input dataset (200 for large datasets, 500 for small).distance_metric (
str
) –layout (
Optional
[list
]) – ndarray . custom initial layout. (n_cells x2). also requires cluster_membership labelscluster_membership (
Optional
[list
]) – via_object.labels (cluster level labels of length n_samples corresponding to the layout)
- Return type:
ndarray
- Returns:
ndarray of shape (nsamples,n_components)
- VIA.plotting_via.via_forcelayout(X_pca, viagraph_full=None, k=10, n_milestones=2000, time_series_labels=[], knn_seq=5, saveto='', random_seed=0)[source]
Compute force directed layout. #TODO not complete
- Parameters:
X_pca –
viagraph_full (
csr_matrix
) – optional. if calling before via, then None. if calling after or from within via, then we can use the via-graph to reinforce the layoutk (
int
) –random_seed (
int
) –t_diffusion –
n_milestones –
time_series_labels (
list
) –knn_seq (
int
) –
- Return type:
ndarray
- Returns:
ndarray
- VIA.plotting_via.via_mds(via_object=None, X_pca=None, viagraph_full=None, k=15, random_seed=0, diffusion_op=1, n_milestones=2000, time_series_labels=[], knn_seq=5, k_project_milestones=3, t_difference=2, saveto='', embedding_type='mds', double_diffusion=False)[source]
Fast computation of a 2D embedding FOR EXAMPLE: via_object.embedding = via.via_mds(via_object = v0) plot_scatter(embedding = via_object.embedding, labels = via_object.true_labels)
- Parameters:
via_object –
X_pca (
ndarray
) – dimension reduced (only if via_object is not passed)viagraph_full (
csr_matrix
) – optional. if calling before or without via, then None and a milestone graph will be computed. if calling after or from within via, then we can use the via-graph to reinforce the layout of the milestone graphk (
int
) – number of knn for the via_mds reinforcement graph on milestones. default =15. integers 5-20 are reasonablerandom_seed (
int
) – randomseed integert_diffusion – default integer value = 1 with higher values generate more smoothing
n_milestones – number of milestones used to generate the initial embedding
time_series_labels (
list
) – numerical values in list form representing some sequentual informationknn_seq (
int
) – if time-series data is available, this will augment the knn with sequential neighbors (2-10 are reasonable values) default =5embedding_type (
str
) – default = ‘mds’ or set to ‘umap’double_diffusion (
bool
) – default is False. To achieve sharper strokes/lineages, set to Truek_project_milestones (
int
) – number of milestones in the milestone-knngraph used to compute the single-cell projectionn_iterations – number of iterations to run
neighbors_distances – array of distances of each neighbor for each cell (n_cells x knn) used when called from within via.run() for autocompute via-mds
- Return type:
ndarray
- Returns:
numpy array of size n_samples x 2
- VIA.plotting_via.via_streamplot(via_object, embedding=None, density_grid=0.5, arrow_size=0.7, arrow_color='k', color_dict=None, arrow_style='-|>', max_length=4, linewidth=1, min_mass=1, cutoff_perc=5, scatter_size=500, scatter_alpha=0.5, marker_edgewidth=0.1, density_stream=2, smooth_transition=1, smooth_grid=0.5, color_scheme='annotation', add_outline_clusters=False, cluster_outline_edgewidth=0.001, gp_color='white', bg_color='black', dpi=300, title='Streamplot', b_bias=20, n_neighbors_velocity_grid=None, labels=None, use_sequentially_augmented=False, cmap='rainbow', show_text_labels=True)[source]
Construct vector streamplot on the embedding to show a fine-grained view of inferred directions in the trajectory
- Parameters:
via_object –
embedding (
ndarray
) – np.ndarray of shape (n_samples, 2) umap or other 2-d embedding on which to project the directionality of cellsdensity_grid (
float
) –arrow_size (
float
) –arrow_color (
str
) –arrow_style –
max_length (
int
) –linewidth (
float
) – width of lines in streamplot, default = 1min_mass –
cutoff_perc (
int
) –scatter_size (
int
) – size of scatter points default =500scatter_alpha (
float
) – transpsarency of scatter pointsmarker_edgewidth (
float
) – width of outline arround each scatter point, default = 0.1density_stream (
int
) –smooth_transition (
int
) –smooth_grid (
float
) –color_scheme (
str
) – str, default = ‘annotation’ corresponds to self.true_labels. Other options are ‘time’ (uses single-cell pseudotime) and ‘cluster’ (via cluster graph) and ‘other’. Alternatively provide labels as a listadd_outline_clusters (
bool
) –cluster_outline_edgewidth –
gp_color –
bg_color –
dpi –
title –
b_bias – default = 20. higher value makes the forward bias of pseudotime stronger
n_neighbors_velocity_grid –
labels (
list
) – list (will be used for the color scheme) or if a color_dict is provided these labels should matchuse_sequentially_augmented –
cmap (
str
) –
- Returns:
fig, ax
Datasets
- VIA.datasets_via.cell_cycle(foldername='./')[source]
Load cell cycle data as AnnData object
- Args:
foldername (string): Directory of dataset
- Returns:
AnnData object
- VIA.datasets_via.cell_cycle_cyto_data(foldername='./')[source]
Load cell cycle imagine based flow-cyto features AnnData object with n_obs × n_vars = 2036 × 38 obs: ‘cell_cycle_phase’ :param foldername (string) Default current directory. path to directory where you want to store the dataset
- Returns:
anndata
- VIA.datasets_via.embryoid_body(foldername='./')[source]
Load embryoid body data as AnnData object
- Args:
foldername (string): Directory to save dataset
- Returns:
AnnData object
- VIA.datasets_via.moffitt_preoptic(foldername='./')[source]
Load preoptic hypothalamus mouse data from moffitt et al.,m as AnnData object
- Args:
foldername (string): foldername (string): path to directory where you want to store the dataset ‘./’ current directory is default
- Returns:
AnnData object
- VIA.datasets_via.scATAC_hematopoiesis(foldername='./')[source]
Load scATAC seq Hematopoiesis data as AnnData object
- Args:
foldername (string): Directory of dataset
- Returns:
AnnData object
- VIA.datasets_via.scRNA_hematopoiesis(foldername='./')[source]
Load scRNA seq Hematopoiesis data as AnnData object
- Args:
foldername (string): Directory of dataset
- Returns:
AnnData object
- VIA.datasets_via.toy_disconnected(foldername='./')[source]
Load Toy_Disconnected data as AnnData object
To access obs (label) as list, use AnnData.obs[‘group_id’].values.tolist()
- Args:
foldername (string): Default current directory. path to directory where you want to store the dataset
- Returns:
AnnData object
- VIA.datasets_via.toy_multifurcating(foldername='./')[source]
Load Toy_Multifurcating data as AnnData object
To access obs (label) as list, use AnnData.obs[‘group_id’].values.tolist()
- Args:
foldername (string): foldername (string): path to directory where you want to store the dataset ‘./’ current directory is default
- Returns:
AnnData object