API

Import pyvia as:

import pyVIA.core as via

Full API

pyVIA core

class VIA.core.VIA(data, true_label=None, edgepruning_clustering_resolution_local=1, edgepruning_clustering_resolution=0.15, labels=None, keep_all_local_dist='auto', too_big_factor=0.4, resolution_parameter=1.0, partition_type='ModularityVP', small_pop=10, jac_weighted_edges=True, knn=30, n_iter_leiden=5, random_seed=42, num_threads=-1, distance='l2', time_smallpop=15, super_cluster_labels=False, super_node_degree_list=False, super_terminal_cells=False, x_lazy=0.99, alpha_teleport=0.99, root_user=None, preserve_disconnected=True, dataset='', super_terminal_clusters=[], is_coarse=True, csr_full_graph='', csr_array_locally_pruned='', ig_full_graph='', full_neighbor_array='', full_distance_array='', embedding=None, df_annot=None, preserve_disconnected_after_pruning=False, secondary_annotations=None, pseudotime_threshold_TS=30, cluster_graph_pruning=0.15, visual_cluster_graph_pruning=0.15, neighboring_terminal_states_threshold=3, num_mcmc_simulations=1300, piegraph_arrow_head_width=0.1, piegraph_edgeweight_scalingfactor=1.5, max_visual_outgoing_edges=2, via_coarse=None, velocity_matrix=None, gene_matrix=None, velo_weight=0.5, edgebundle_pruning=None, A_velo=None, CSM=None, edgebundle_pruning_twice=False, pca_loadings=None, time_series=False, time_series_labels=None, knn_sequential=10, knn_sequential_reverse=0, t_diff_step=1, single_cell_transition_matrix=None, embedding_type='via-mds', do_compute_embedding=False, color_dict=None, user_defined_terminal_cell=[], user_defined_terminal_group=[], do_gaussian_kernel_edgeweights=False, RW2_mode=False, working_dir_fp='/home/', memory=5, viagraph_decay=0.9, p_memory=1, graph_init_pos=None, spatial_coords=None, do_spatial_knn=False, do_spatial_layout=False, spatial_knn=15, spatial_aux=[])[source]

A class to represent the VIA analysis
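As a rough illustration of how some of the constructor arguments fit together, the sketch below collects a few of them into keyword dictionaries. Only parameter names from the signature above are used. The label 'celltype_earlystage' and the cell index 678 are the placeholder values used in the root_user description, and the final call is left as a comment because it needs pyVIA and a real data matrix.

```python
# Keyword arguments for a group-level root: the root is named by a label
# that must appear in true_label, and dataset is set to 'group'.
group_root_kwargs = dict(
    knn=30,
    edgepruning_clustering_resolution=0.15,
    cluster_graph_pruning=0.15,
    root_user=["celltype_earlystage"],  # placeholder label
    dataset="group",
    random_seed=42,
)

# Keyword arguments for a single-cell root: the root is a cell index and
# dataset keeps its default value of ''.
cell_root_kwargs = dict(group_root_kwargs, root_user=[678], dataset="")

# With pyVIA installed and `data` / `true_label` prepared, the object would
# be created along these lines:
#   import pyVIA.core as via
#   v0 = via.VIA(data, true_label=true_label, **group_root_kwargs)
```

Keeping the arguments in a dictionary makes it easy to reuse the same settings for a second run that differs only in the root specification.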

Parameters:

data (ndarray) – input matrix of size n_cells x n_dims. Expects the PCs or features that will be used in the TI computation, e.g. adata.obsm['X_pca'][:, 0:20]
true_label (list) – list of str/int that correspond to the ground truth or reference annotations. Can also be None when no labels are available
labels (ndarray (nsamples, )) – (default None) when None, PARC clusters are used for the viagraph. Alternatively, provide a list of integer (not string) cluster memberships to construct the viagraph using another clustering method or available annotations
edgepruning_clustering_resolution_local (float) – (default = 1) local level of pruning for the PARC graph clustering stage. Range (0.1, 3); higher numbers mean more edge retention. For large datasets it is usually enough to tune only edgepruning_clustering_resolution
edgepruning_clustering_resolution (float) – (optional, default = 0.15, can also be set to ‘median’) graph pruning for the PARC clustering stage, expressed as the number of standard deviations below the network’s mean jaccard-weighted edge weight. A higher value means less pruning (more edges retained) and results in fewer clusters; a smaller value removes more edges and results in more clusters. Values of 0.1-1 provide reasonable pruning. E.g. a value of 0.15 means all edges above mean(edge-weights) - 0.15*std(edge-weights) are retained. Both 0.15 and ‘median’ give a good starting point and prune away ~50-60% of edges
keep_all_local_dist (bool, str) – (default ‘auto’) with ‘auto’, local pruning is done prior to clustering for smaller datasets, but is switched off for large datasets for speed. Can also be set explicitly to True or False
too_big_factor (float) – (optional, default=0.4). Forces clusters > 0.4*n_cells to be re-clustered
resolution_parameter (float) – (default =1) larger value means more and smaller clusters
partition_type (str) – (default “ModularityVP”) Options
small_pop (int) – (default 10) Via attempts to merge Clusters with a population < 10 cells with larger clusters. If you have a very small dataset (e.g. few hundred cells), then consider lowering to e.g. 5
jac_weighted_edges (bool) – (default = True) Use weighted edges in the PARC clustering step
knn (int) – (optional, default = 30) number of K-Nearest Neighbors for HNSWlib KNN graph. Larger knn means more graph connectivity. Lower knn means more loosely connected clusters/cells
n_iter_leiden (int) –
random_seed (int) – Random seed to pass to clustering
num_threads –
distance (str) – (default ‘l2’) Euclidean distance ‘l2’ by default; other options ‘ip’ and ‘cosine’ for graph construction and similarity
visual_cluster_graph_pruning (float) – (optional, default = 0.15) Only comes into play if the user deliberately chooses not to use the default edge-bundling method of visualizing edges (draw_piechart_graph()) and instead calls draw_piechart_graph_nobundle(). It is often set to the same value as the PARC clustering level of edgepruning_clustering_resolution. It does not impact computation of terminal states, pseudotime or lineage likelihoods; it only controls the number of edges plotted for visual effect
cluster_graph_pruning (float) – (optional, default = 0.15) Pruning level of the cluster graph. It does not impact the number of clusters, only the connectivity of the clustergraph. Often set to the same value as the PARC clustering level of edgepruning_clustering_resolution. Reasonable range [0.1, 1]. To retain more connectivity in the clustergraph underlying the trajectory computations, increase the value
time_smallpop – (default = 15) maximum time allowed for handling singletons
x_lazy (float) – (default = 0.99) 1-x = probability of staying in same node (lazy). Values between 0.9-0.99 are reasonable
alpha_teleport (float) – (default = 0.99) 1-alpha is the probability of jumping. Values between 0.95-0.99 are reasonable unless there is prior knowledge of teleportation
root_user (list, None) – (default None) can be a list of strings, a list of int, or None. If root_user is None and an RNA velocity matrix is available, a root is computed automatically. If root_user is None and no velocity matrix is provided, an arbitrary root is selected. If root_user is [‘celltype_earlystage’], where the str corresponds to an item in true_label, a suitable starting point is selected from this group. If root_user is [678], where 678 is the index of the cell chosen as a start cell, this cell is the designated starting cell. When there is more than one trajectory, a list of root indices or groups can be given, e.g. [120, 699] or [‘traj1_earlystage’, ‘traj2_earlystage’]
preserve_disconnected (bool) – (default = True) If you believe there may be disconnected trajectories then set this to False
dataset (str) – can be set to ‘group’ or ‘’ (default). Refers to the type of root label (group-level root or single-cell index) you are going to provide. If your true_label has a sensible group of cells for a root, set dataset to ‘group’ and set root_user to [‘labelname_root_cell_type’]. If your root corresponds to one particular cell, set dataset = ‘’ (default)
embedding (ndarray) – (optional, default = None) embedding (e.g. precomputed tsne, umap, phate, via-umap) for plotting data. Size n_cells x 2. If an embedding is provided when running VIA, then a scatterplot colored by pseudotime, highlighting terminal fates, is plotted
velo_weight (float) – (optional, default = 0.5) float in [0,1]. The weight assigned to directionality and connectivity derived from scRNA-velocity
neighboring_terminal_states_threshold (int) – (default = 3) Candidates for terminal states that are neighbors of each other may be removed from the list if they have this number or more of terminal states as neighbors
knn_sequential (int) – (default =10) number of knn in the adjacent time-point for time-series data (t_i and t_i+1)
knn_sequential_reverse (int) – (default = 0) number of knn enforced from current to previous time point
t_diff_step (int) – (default =1) Number of permitted temporal intervals between connected nodes. If time data is labeled as [0,25,50,75,100,..] then t_diff_step=1 corresponds to ‘25’ and only edges within t_diff_steps are retained
is_coarse (bool) – (default = True) If running VIA in two iterations where you wish to link the second fine-grained iteration with the initial iteration, then you set to False
via_coarse (VIA) – (default = None) If instantiating a second iteration of VIA that needs to be linked to a previous iteration (e.g. via0), then set via_coarse to the previous via0 object
df_annot (DataFrame) – (default None) used for the Mouse Organ data
preserve_disconnected_after_pruning (bool) – (default = False) If you believe there are disconnected trajectories then set this to True and test your hypothesis
A_velo (ndarray) – Cluster Graph Transition matrix based on rna velocity [n_clus x n_clus]
velocity_matrix (matrix) – (default None) matrix of size [n_samples x n_genes]. This is the velocity matrix computed by scVelo (or a similar package) and stored in adata.layers[‘velocity’]. The genes used for computing velocity should correspond to those used in gene_matrix. Requires gene_matrix to be provided too.
gene_matrix (matrix) – (default None) only used if velocity_matrix is available. Matrix of size [n_samples x n_genes]. We recommend using a subset like HVGs rather than the full set of genes. (The input needs to be densified if taken from adata, e.g. adata.X.todense())
time_series (bool) – (default False) if the data has time-series labels then set to True
time_series_labels (list) – (default None) list of integer values of temporal annotations corresponding to e.g. hours (post fert), days, or sequential ordering
pca_loadings (array) – (default None) the loadings of the pcs used to project the cells (to projected euclidean location based on velocity). n_cells x n_pcs
secondary_annotations (None) – (default None)
edgebundle_pruning (float) – (default=None) will by default be set to the same as the cluster_graph_pruning and influences the visualized level of pruning of edges. Typical values can be between [0,1] with higher numbers retaining more edges
edgebundle_pruning_twice (bool) – (default = False) When True, the edgebundling is applied to a further visually pruned (visual_cluster_graph_pruning) graph, which can sometimes simplify the visualization of very cluttered graphs. It does not impact the pseudotime and lineage computations
piegraph_arrow_head_width (float) – (default = 0.1) size of arrow heads in via cluster graph
piegraph_edgeweight_scalingfactor – (default = 1.5) scaling factor for edge thickness in via cluster graph
max_visual_outgoing_edges (int) – (default =2) Only allows max_visual_outgoing_edges to come out of any given node. Used in differentiation_flow()
pseudotime_threshold_TS (int) – (default = 30) a state is considered a candidate terminal cell fate only if it lies at 30% or later of the computed pseudotime range
num_mcmc_simulations (int) – (default = 1300) number of random walk simulations conducted
embedding_type (str) – (default = ‘via-mds’) other options are ‘via-atlas’ and ‘via-force’
do_compute_embedding (bool) – (default = False) If you want an embedding (n_samples x2) to be computed on the basis of the via sc graph then set this to True
do_gaussian_kernel_edgeweights (bool) – (default = False) Type of edgeweighting on the graph edges
memory – (default = 5) memory parameter q of the random walk: the edge weight to a next node that is not a neighbor of the previous node is scaled by 1/q. A larger value means more memory and a more retrospective/inwards random walk; a small value < 1 means more exploration. memory = 2 means run using the non-memory Via 1.0 mode
viagraph_decay (float) – (default = 0.9) increasing decay causes more edges to merge
p_memory – (default = 1) the edge weight to a next node that is the previous node is scaled by 1/p. A large value means more exploration
graph_init_pos – matrix (or list of lists) to initialize the viagraph
spatial_coords (np.ndarray) – array of size n_cells x 2 (denoting x,y coordinates) of each spot/cell
do_spatial_knn (bool) – (default = False) whether or not to use the spatial mode of StaVia for graph augmentation
do_spatial_layout (bool) – (default = False) whether to use spatial coords for the layout of the clustergraph
spatial_knn (int) – (default = 15) number of knn's added based on spatial proximity indicated by spatial_coords
spatial_aux (list) – (default = []) a list of slice IDs so that only cells/spots on the same slice are considered when building the spatial_knn graph
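The pruning rule given for edgepruning_clustering_resolution (retain edges whose weight is above mean(edge-weights) - k*std(edge-weights)) can be sketched with the standard library. This is a minimal illustration on made-up weights, not the library's implementation: the function name prune_edges is hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def prune_edges(edges, k=0.15):
    """Keep edges whose weight exceeds mean(weights) - k * std(weights).

    edges: list of (source, target, weight) tuples.
    k: number of standard deviations below the mean; larger k keeps more edges.
    """
    weights = [w for _, _, w in edges]
    # Population standard deviation is an assumption made for this sketch.
    threshold = mean(weights) - k * pstdev(weights)
    return [e for e in edges if e[2] > threshold]


# Made-up edge weights, for illustration only.
toy_edges = [(0, 1, 0.9), (0, 2, 0.1), (1, 2, 0.5), (2, 3, 0.3), (3, 4, 0.7)]
kept_default = prune_edges(toy_edges, k=0.15)  # stricter pruning
kept_lenient = prune_edges(toy_edges, k=1.0)   # more edges retained
print(len(kept_default), len(kept_lenient))
```

Raising k lowers the threshold, which matches the description that higher values mean less pruning.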

labels

ndarray of length (n_samples, ) holding pre-determined, user-defined cluster labels

Type:: array

single_cell_pt_markov

length n_samples of pseudotime

Type:: list

single_cell_bp

[n_samples x n_lineages] array of single cell branching probabilities towards each lineage (lineage normalized). Each column corresponds to a terminal state, in the order presented by the terminal_clusters attribute

Type:: ndarray

single_cell_bp_rownormed

[n_samples x n_lineages] array of single cell branching probabilities towards each lineage (cell normalized). Each column corresponds to a terminal state, in the order presented by the terminal_clusters attribute

Type:: ndarray
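To make the cell-normalized variant concrete, the sketch below row-normalizes a toy matrix with one row per cell and one column per terminal state, then looks up a column through the order of terminal_clusters. The numbers and cluster ids are made up, and row_normalize is a hypothetical helper; the source does not spell out the exact scaling used for the lineage-normalized array, so only the row normalization is shown.

```python
def row_normalize(matrix):
    """Scale each row (one cell) so that its lineage probabilities sum to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # Leave all-zero rows unchanged to avoid division by zero.
        normalized.append([v / total for v in row] if total > 0 else list(row))
    return normalized


terminal_clusters = [4, 7]   # hypothetical terminal cluster numbers
toy_bp = [
    [0.2, 0.6],              # cell 0; columns follow terminal_clusters
    [0.5, 0.5],              # cell 1
    [0.0, 0.0],              # cell 2 (no lineage signal)
]
toy_bp_rownormed = row_normalize(toy_bp)

# Column index of the lineage that ends in terminal cluster 7.
col = terminal_clusters.index(7)
prob_cell0_to_7 = toy_bp_rownormed[0][col]
```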

terminal_clusters

list of clusters that are cell fates/ unique lineages

Type:: list

cluster_bp

[n_clusters x n_terminal_states]. Lineage probability of cluster towards a particular terminal cluster state

Type:: ndarray

CSM

[n_cluster x n_clusters] array of cosine similarity used to weight the cluster graph transition matrix by velocity

Type:: ndarray
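Cosine similarity itself is a standard quantity. A minimal sketch of computing it between a velocity vector and an expression-change vector (both made up here) follows; how VIA aggregates such values into the cluster-level CSM is not shown, and returning 0 for zero vectors is a convention chosen for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


velocity = [1.0, 0.0, 2.0]        # made-up velocity of a cell
delta_toward = [2.0, 0.0, 4.0]    # expression change aligned with velocity
delta_away = [-1.0, 0.0, -2.0]    # expression change opposite to velocity
delta_orth = [0.0, 3.0, 0.0]      # expression change unrelated to velocity

sim_toward = cosine_similarity(velocity, delta_toward)
sim_away = cosine_similarity(velocity, delta_away)
sim_orth = cosine_similarity(velocity, delta_orth)
```

Values near 1 indicate that the change in expression towards a neighbor agrees with the velocity direction, values near -1 that it opposes it.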

single_cell_transition_matrix

[n_samples x n_samples]

Type:: ndarray
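The x_lazy and alpha_teleport constructor arguments describe a lazy, teleporting random walk: 1-x is the probability of staying in the same node and 1-alpha is the probability of jumping. The sketch below shows one standard way such parameters modify a row-stochastic transition matrix. It illustrates the general technique under that assumption; the function name lazy_teleport is hypothetical and this is not a statement of how VIA applies the parameters internally.

```python
def lazy_teleport(P, x_lazy=0.99, alpha_teleport=0.99):
    """Return a lazy, teleporting version of a row-stochastic matrix P.

    With probability 1 - x_lazy the walker stays put, and with probability
    1 - alpha_teleport it jumps to a uniformly chosen node.
    """
    n = len(P)
    result = []
    for i in range(n):
        row = []
        for j in range(n):
            stay = 1.0 if i == j else 0.0
            lazy = x_lazy * P[i][j] + (1.0 - x_lazy) * stay
            row.append(alpha_teleport * lazy + (1.0 - alpha_teleport) / n)
        result.append(row)
    return result


# Toy 3-node chain: 0 <-> 1 <-> 2
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
P_mod = lazy_teleport(P, x_lazy=0.9, alpha_teleport=0.95)
row_sums = [sum(row) for row in P_mod]
```

Each row of the result still sums to 1, and every entry becomes strictly positive because of the teleport term.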

terminal_clusters

(default None) list of terminal clusters

Type:: list

csr_full_graph

csr matrix of the single-cell graph (augmented with sequential data when time_series information is provided)

Type:: csr matrix
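The sequential augmentation is governed by the t_diff_step constructor argument, which limits how many temporal intervals may separate two connected nodes. A minimal sketch of such a filter is given below. It assumes that intervals are counted as steps between consecutive unique time labels; filter_edges_by_time is a hypothetical helper and the edge list is made up.

```python
def filter_edges_by_time(edges, time_labels, t_diff_step=1):
    """Keep edges whose endpoints are at most t_diff_step time points apart."""
    # Map each unique time label to its position in the sorted sequence,
    # e.g. [0, 25, 50, 75, 100] -> ranks 0..4 (assumption of this sketch).
    rank = {t: i for i, t in enumerate(sorted(set(time_labels)))}
    kept = []
    for source, target in edges:
        gap = abs(rank[time_labels[source]] - rank[time_labels[target]])
        if gap <= t_diff_step:
            kept.append((source, target))
    return kept


time_labels = [0, 0, 25, 50, 75, 100]   # one label per cell
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 5), (4, 5)]
kept_1 = filter_edges_by_time(edges, time_labels, t_diff_step=1)
kept_2 = filter_edges_by_time(edges, time_labels, t_diff_step=2)
```

With t_diff_step=1 the edge between a cell at time 0 and a cell at time 50 is dropped, while t_diff_step=2 retains it.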

csr_array_locally_pruned

Type:: csr matrix

ig_full_graph

full_neighbor_array

user_defined_terminal_cell

Type:: list=[]

user_defined_terminal_group

Type:: list=[]

n_milestones

(default None) Number of milestones in the via-mds computation (anything more than 10,000 can be computationally heavy and time consuming). Typically auto-determined within the via-mds function

Type:: int

embedding

[n_cells x 2] provided by user or autocomputed with via-mds or via-umap

Type:: ndarray

sc_transition_matrix(smooth_transition, b=10, use_sequentially_augmented=False)[source]

Computes the single-cell level transition directions that are later used to calculate the velocity of the embedding, based on changes at the single-cell level in genes and on single-cell level velocity

Parameters:

smooth_transition –
b – slope of logistic function
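The parameter b is described only as the slope of a logistic function. As a generic reminder of what that slope controls, the sketch below evaluates the textbook logistic curve; this is not necessarily the exact expression used inside sc_transition_matrix.

```python
import math


def logistic(x, b=10):
    """Textbook logistic curve; larger b gives a sharper transition around 0."""
    return 1.0 / (1.0 + math.exp(-b * x))


midpoint = logistic(0.0)              # always 0.5, regardless of b
steep = logistic(0.1, b=10)           # close to saturation for small x
shallow = logistic(0.1, b=1)          # still near 0.5 for the same x
print(midpoint, steep, shallow)
```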

Returns:

Plotting

VIA.plotting_via.animate_atlas(hammerbundle_dict=None, via_object=None, linewidth_bundle=2, frame_interval=10, n_milestones=None, facecolor='white', cmap='plasma_r', extra_title_text='', size_scatter=1, alpha_scatter=0.2, saveto='/home/user/Trajectory/Datasets/animation_default.gif', time_series_labels=None, lineage_pathway=[], sc_labels_numeric=None, show_sc_embedding=False, sc_emb=None, sc_size_scatter=10, sc_alpha_scatter=0.2, n_intervals=50, n_repeat=2)[source]

Parameters:

ax – axis to plot on
hammer_bundle – hammerbundle object with coordinates of all the edges to draw
layout – coords of cluster nodes and optionally also contains the numeric value associated with each cluster (such as time-stamp) layout[[‘x’,’y’,’numeric label’]] sc/cluster/milestone level
CSM – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when available
velocity_weight – percentage weightage given to the RNA velocity based transition matrix
pt – cluster-level pseudotime
alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
frame_interval (int) – smaller number, faster refresh and video
facecolor (str) – default = white
headwidth_bundle – headwidth of arrows used in bundled edges
arrow_frequency – min dist between arrows (bundled edges otherwise have overcrowding of arrows)
show_direction – True will draw arrows along the lines to indicate direction
milestone_edges – pandas DataFrame milestone_edges[[‘source’,’target’]]

t_diff_factor – scaling of the average time interval (0.25 means that for each frame, time progresses by 0.25 * the mean time difference between adjacent times). Only used when sc_labels_numeric are passed directly instead of using pseudotime
show_sc_embedding (bool) – plot the single cell embedding under the edges
sc_emb – numpy array of single cell embedding (ncells x 2)
sc_alpha_scatter – alpha transparency value of points of single cells (1 is opaque, 0 is fully transparent)
sc_size_scatter – size of scatter points of single cells
n_repeat – number of times the whole process is repeated

Returns:

axis with bundled edges plotted

VIA.plotting_via.animate_atlas_old(hammerbundle_dict=None, via_object=None, linewidth_bundle=2, frame_interval=10, n_milestones=None, facecolor='white', cmap='plasma_r', extra_title_text='', size_scatter=1, alpha_scatter=0.2, saveto='/home/user/Trajectory/Datasets/animation_default.gif', time_series_labels=None, lineage_pathway=[], sc_labels_numeric=None, t_diff_factor=0.25, show_sc_embedding=False, sc_emb=None, sc_size_scatter=10, sc_alpha_scatter=0.2, n_intervals=50)[source]

Parameters:

ax – axis to plot on
hammer_bundle – hammerbundle object with coordinates of all the edges to draw
layout – coords of cluster nodes and optionally also contains the numeric value associated with each cluster (such as time-stamp) layout[[‘x’,’y’,’numeric label’]] sc/cluster/milestone level
CSM – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when available
velocity_weight – percentage weightage given to the RNA velocity based transition matrix
pt – cluster-level pseudotime
alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
frame_interval (int) – smaller number, faster refresh and video
facecolor (str) – default = white
headwidth_bundle – headwidth of arrows used in bundled edges
arrow_frequency – min dist between arrows (bundled edges otherwise have overcrowding of arrows)
show_direction – True will draw arrows along the lines to indicate direction
milestone_edges – pandas DataFrame milestone_edges[[‘source’,’target’]]

t_diff_factor – scaling of the average time interval (0.25 means that for each frame, time progresses by 0.25 * the mean time difference between adjacent times). Only used when sc_labels_numeric are passed directly instead of using pseudotime
show_sc_embedding (bool) – plot the single cell embedding under the edges
sc_emb – numpy array of single cell embedding (ncells x 2)
sc_alpha_scatter – alpha transparency value of points of single cells (1 is opaque, 0 is fully transparent)
sc_size_scatter – size of scatter points of single cells
time_series_labels – should be a single-cell level list (n_cells) of numerical values that form a discrete set, i.e. not continuous like pseudotime

Returns:

axis with bundled edges plotted

VIA.plotting_via.animate_streamplot(via_object, embedding, density_grid=1, linewidth=0.5, min_mass=1, cutoff_perc=None, scatter_size=500, scatter_alpha=0.2, marker_edgewidth=0.1, smooth_transition=1, smooth_grid=0.5, color_scheme='annotation', other_labels=[], b_bias=20, n_neighbors_velocity_grid=None, fontsize=8, alpha_animate=0.7, cmap_scatter='rainbow', cmap_stream='Blues', segment_length=1, saveto='/home/shobi/Trajectory/Datasets/animation.gif', use_sequentially_augmented=False, facecolor_='white', random_seed=0)[source]

Draw animated vector plots. The .gif file saved at the saveto address is the best way to view the animation, as the fig, ax output can be slow

Parameters:

via_object – viaobject
embedding – ndarray (nsamples,2) umap, tsne, via-umap, via-mds
density_grid –
linewidth –
min_mass –
cutoff_perc –
scatter_size –
scatter_alpha –
marker_edgewidth –
smooth_transition –
smooth_grid –
color_scheme – ‘annotation’, ‘cluster’, ‘other’
add_outline_clusters –
cluster_outline_edgewidth –
gp_color –
bg_color –
title –
b_bias –
n_neighbors_velocity_grid –
fontsize –
alpha_animate –
cmap_scatter –
cmap_stream – string of a cmap for streamlines, default = ‘Blues’ (for dark blue lines). Consider ‘Blues_r’ for white lines, or ‘Greys’/‘Greys_r’ and ‘gist_yarg’/‘gist_yarg_r’
color_stream – string like ‘white’. will override cmap_stream
segment_length –

Returns:

fig, ax.

VIA.plotting_via.get_gene_expression(via_object, gene_exp, cmap='jet', dpi=150, marker_genes=[], linewidth=2.0, n_splines=10, spline_order=4, fontsize_=8, marker_lineages=[], optional_title_text='', cmap_dict=None)[source]

Parameters:

via_object – via object
gene_exp (DataFrame) – dataframe where columns are features (gene) and rows are single cells
cmap (str) – default: ‘jet’
dpi (int) – default:150
marker_genes (list) – Default is to use all genes in gene_exp. Otherwise provide a list of marker genes that will be used from gene_exp.
linewidth (float) – default:2
n_splines (int) – default:10. Note n_splines must be > spline_order.
spline_order (int) – default:4 n_splines must be > spline_order.
marker_lineages – Default is to use all lineage pathways. Otherwise provide a list of lineage numbers (terminal cluster numbers).
cmap_dict (dict) – {lineage number: ‘color’}

Returns:

fig, axs

VIA.plotting_via.make_dict_of_clusters_for_each_celltype(via_labels=[], true_label=[], verbose=False)[source]

Parameters:

via_labels (list) – usually set to via_object.labels. list of length n_cells of cluster membership
true_label (list) – cell type labels (list of length n_cells)

Returns:

VIA.plotting_via.make_edgebundle_milestone(embedding=None, sc_graph=None, via_object=None, sc_pt=None, initial_bandwidth=0.03, decay=0.7, n_milestones=None, milestone_labels=[], sc_labels_numeric=None, weighted=True, global_visual_pruning=0.5, terminal_cluster_list=[], single_cell_lineage_prob=None, random_state=0)[source]

Perform edge bundling at the milestone level and return a hammer bundle of milestone-level edges. Milestones are more granular than the original PARC clusters but less granular than single cells, and hence also less computationally expensive than the single-cell level. Requires some type of embedding (n_samples x 2) to be available

Parameters:

embedding (ndarray) – optional (not required if via_object is provided) single-cell embedding. Also looks nice when done on via_mds, which gives a more streamlined, continuous, diffused graph structure. UMAP is a bit “clustery”
graph – optional (not required if via_object is provided) igraph single cell graph level
via_object – via_object (best way to run this function by simply providing via_object)
sc_graph – igraph graph set as the via attribute self.ig_full_graph (affinity graph)
initial_bandwidth – increasing bw increases merging of minor edges
decay – increasing decay increases merging of minor edges (see https://datashader.org/user_guide/Networks.html)
milestone_labels (list) – default list=[]. Usually autocomputed. but can provide as single-cell level labels (clusters, groups, which function as milestone groupings of the single cells)
sc_labels_numeric (list) – default is None which automatically chooses via_object’s pseudotime or time_series_labels (when available). otherwise set to a list of numerical values representing some sequential/chronological information
terminal_cluster_list (list) – default list [] and automatically uses all terminal clusters. otherwise set to any of the terminal cluster numbers within a list

Returns:

dictionary containing the keys:
hb_dict[‘hammerbundle’] – hb, a hammerbundle class with hb.x and hb.y containing the coords
hb_dict[‘milestone_embedding’] – dataframe with ‘x’ and ‘y’ columns for each milestone
hb_dict[‘edges’] – dataframe with columns [‘source’,’target’] giving the milestones of each edge, and [‘cluster_pop’]
hb_dict[‘sc_milestone_labels’] – list of the milestone label of each single cell
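The returned keys can be unpacked along the following lines. The dictionary here is a hand-built stand-in that uses plain Python containers (the real function returns a hammerbundle object and pandas DataFrames) so that the snippet runs without pyVIA; all values are made up.

```python
from types import SimpleNamespace

# Stand-in for the dictionary returned by make_edgebundle_milestone.
hb_dict = {
    "hammerbundle": SimpleNamespace(x=[0.0, 0.5, 1.0], y=[0.0, 0.4, 1.0]),
    "milestone_embedding": {"x": [0.0, 1.0], "y": [0.0, 1.0]},
    "edges": {"source": [0], "target": [1], "cluster_pop": [120]},
    "sc_milestone_labels": [0, 0, 1, 1, 1],
}

hb = hb_dict["hammerbundle"]
edge_coords = list(zip(hb.x, hb.y))   # points along the bundled edges
n_milestones = len(hb_dict["milestone_embedding"]["x"])

# Count how many single cells were assigned to each milestone.
cells_per_milestone = {}
for label in hb_dict["sc_milestone_labels"]:
    cells_per_milestone[label] = cells_per_milestone.get(label, 0) + 1
```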

VIA.plotting_via.plot_all_spatial_clusters(spatial_coords, true_label, via_labels, save_to='', color_dict={}, cmap='rainbow', alpha=0.4, s=5, verbose=False, reference_labels=[], reference_labels2=[])[source]

Parameters:

spatial_coords – ndarray of x,y coords of tissue location of cells (ncells x2)
true_label – categorical labels (list of length n_cells)
via_labels – cluster membership labels (list of length n_cells)
save_to (str) –
color_dict (dict) – optional dict with keys corresponding to true_label type. e.g. {true_label_celltype1: ‘green’,true_label_celltype2: ‘red’}
cmap (str) – string default = rainbow
reference_labels (list) – optional list of single-cell labels (e.g. time, annotation). Used to selectively provide a grey background to cells not in the cluster being inspected. If you have multiple time points, then set reference_labels to the time_points. All cells in the most prevalent timepoint seen in the cluster of interest will be plotted as a background
reference_labels2 (list) – optional list of single-cell labels (e.g. time, annotation). this will be used in the title of each subplot to note the majority cell (ref2) type for each cluster

Returns:

list of lists of [[fig1, axs_set1], [fig2, axs_set2],…]

VIA.plotting_via.plot_atlas_view(hammerbundle_dict=None, via_object=None, alpha_bundle_factor=1, linewidth_bundle=2, facecolor='white', cmap='plasma', extra_title_text='', alpha_milestones=0.3, headwidth_bundle=0.1, headwidth_alpha=0.8, arrow_frequency=0.05, show_arrow=True, sc_labels_sequential=None, sc_labels_expression=None, initial_bandwidth=0.03, decay=0.7, n_milestones=None, scale_scatter_size_pop=False, show_milestones=True, sc_labels=None, text_labels=False, lineage_pathway=[], dpi=300, fontsize_title=6, fontsize_labels=6, global_visual_pruning=0.5, use_sc_labels_sequential_for_direction=False, sc_scatter_size=3, sc_scatter_alpha=0.4, add_sc_embedding=True, size_milestones=5, colorbar_legend='pseudotime')[source]

Edges can be colored by time-series numeric labels, pseudotime, lineage pathway probabilities, or gene expression. If not specified, time-series is chosen if available, otherwise falls back to pseudotime. To use gene expression, provide sc_labels_expression as a list. To specify other numeric sequential data, provide a list sc_labels_sequential of length n_samples. via_object.embedding must be an ndarray of shape (nsamples,2)

Parameters:

hammerbundle_dict – dictionary with keys: hammerbundle object with coordinates of all the edges to draw. If hammer_bundle and layout are None, then this will be computed internally
via_object – type via object, if hammerbundle_dict is None, then you must provide a via_object. Ensure that via_object has embedding attribute
layout – coords of cluster nodes and optionally also contains the numeric value associated with each cluster (such as time-stamp) layout[[‘x’,’y’,’numeric label’]] sc/cluster/milestone level
CSM – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when available
velocity_weight – percentage weightage given to the RNA velocity based transition matrix
pt – cluster-level pseudotime
alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
alpha_milestones (float) – float 0.3 alpha of milestones
size_milestones (int) – scatter size of the milestones (use sc_scatter_size to control single cell scatter when using in conjunction with lineage probs/ sc embeddings)
arrow_frequency (float) – (default 0.05) min dist between arrows (bundled edges otherwise have overcrowding of arrows). Higher means fewer arrows
show_direction – True will draw arrows along the lines to indicate direction
milestone_edges – pandas DataFrame milestone_edges[[‘source’,’target’]]
milestone_numeric_values – the milestone average of numeric values such as time (days, hours), location (position), or other numeric value used for coloring edges in a sequential manner. If this is None then the edges are colored by length to distinguish short and long range edges
n_milestones (int) – int None. if no hammerbundle_dict is provided, but via_object is provided, then the user can specify level of granularity by setting the n_milestones. otherwise it will be automatically selected
scale_scatter_size_pop (bool) – bool default False
sc_labels_expression (list) – list of single-cell numeric values (length n_samples) used for coloring edges and nodes by the mean expression level of the corresponding milestones. Edges can be colored by time-series numeric (gene expression) or string (cell type) labels, pseudotime, or gene expression. If not specified, time-series is chosen if available, otherwise falls back to pseudotime. To use gene expression, provide sc_labels_expression as a list
sc_labels_sequential (list) – list single cell numeric sequential values used for directionality inference as replacement for pseudotime or via_object.time_series_labels (len n_samples single cell)
sc_labels (list) – (default None) list of single-cell level labels (categorical or a discrete set of numerical values) to label the nodes
text_labels (bool) – bool False if you want to label the nodes based on sc_labels (or true_label if via_object is provided)
lineage_pathway (list) – list of terminal states to plot lineage pathways
use_sc_labels_sequential_for_direction (bool) – use the sequential data (timeseries labels or other provided by user) to direct the arrows

lineage_alpha_threshold – number representing the percentile (0-100) of lineage likelihood in a particular lineage pathway, below which edges are drawn with a lower alpha transparency factor
sc_scatter_alpha (float) – transparency of the background single-cell scatter when plotting lineages
add_sc_embedding (bool) – add background of single cell scatter plot for Atlas
scatter_size_sc_embedding –
colorbar_legend (str) – title of colorbar

Returns:

fig, axis with bundled edges plotted
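The percentile-based fading described for lineage_alpha_threshold can be sketched as follows. This is an illustration only: the helper names, the linear-interpolation percentile, and the base_alpha / faded_alpha values are assumptions made for the example, not parameters of plot_atlas_view.

```python
def percentile(values, q):
    """Linear-interpolation percentile of values for q in [0, 100]."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def edge_alphas(likelihoods, lineage_alpha_threshold=50,
                base_alpha=1.0, faded_alpha=0.2):
    """Give edges below the percentile cutoff a lower alpha (hypothetical values)."""
    cutoff = percentile(likelihoods, lineage_alpha_threshold)
    return [base_alpha if p >= cutoff else faded_alpha for p in likelihoods]


# Made-up lineage likelihoods, one per edge.
likelihoods = [0.05, 0.10, 0.40, 0.80, 0.95]
alphas = edge_alphas(likelihoods, lineage_alpha_threshold=50)
```

Edges whose lineage likelihood falls below the chosen percentile are faded, which keeps the dominant pathway visually prominent.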

VIA.plotting_via.plot_clusters_spatial(spatial_coords, clusters=[], via_labels=[], title_sup='', fontsize_=6, color='green', s=5, alpha=0.5, xlim_max=None, ylim_max=None, xlim_min=None, ylim_min=None, reference_labels=[], reference_labels2=[], equal_axes_lim=True)[source]

Parameters:

spatial_coords – ndarray of spatial coords ncellsx2 dims
clusters – the clusters in via_object.labels which you want to plot (usually a subset of the total number of clusters)
via_labels – via_object.labels (cluster level labels, list of n_cells length)
title_sup – title of the overall figure
fontsize_ – fontsize for legend
color – color of scatter points
s (int) – size of scatter points
alpha – float alpha transparency of scatter (0 is fully transparent, 1 is opaque)
xlim_max – limits of axes
ylim_max – limits of axes
xlim_min – limits of axes
ylim_min – limits of axes
reference_labels (list) – optional list of single-cell labels (e.g. time, annotation). this will be used in the title of each subplot to note the majority cell (ref2) type for each cluster
reference_labels2 (list) – optional list of single-cell labels (e.g. time, annotation). this will be used in the title of each subplot to note the majority cell (ref2) type for each cluster

Returns:

fig, axs
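The "majority cell type for each cluster" mentioned for reference_labels can be computed along the following lines. This is a sketch under stated assumptions, not the library's code: majority_label_per_cluster is a hypothetical helper and the labels are toy values.

```python
from collections import Counter


def majority_label_per_cluster(via_labels, reference_labels):
    """Map each cluster to its most common reference label."""
    if len(via_labels) != len(reference_labels):
        raise ValueError("both lists must have n_cells entries")
    per_cluster = {}
    for cluster, ref in zip(via_labels, reference_labels):
        per_cluster.setdefault(cluster, Counter())[ref] += 1
    return {c: counts.most_common(1)[0][0]
            for c, counts in per_cluster.items()}


# Toy cluster labels and annotations, one entry per cell.
via_labels = [0, 0, 0, 1, 1, 2]
reference_labels = ["HSC", "HSC", "Mono", "Ery", "Ery", "Mono"]
majority = majority_label_per_cluster(via_labels, reference_labels)
print(majority)
```

Both lists must be of length n_cells and in the same cell order as spatial_coords.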

VIA.plotting_via.plot_differentiation_flow(via_object, idx=None, dpi=150, marker_lineages=[], label_node=[], do_log_flow=True, fontsize=8, alpha_factor=0.9, majority_cluster_population_dict=None, cmap_sankey='rainbow', title_str='Differentiation Flow', root_cluster_list=None)[source]

Sankey plots. G is the igraph knn (low K) used for shortest path in high-dimensional space. No idx is needed as it is made on the full sample. knn_hnsw is the knn made in the embedded space, used for query to find the nearest point in the downsampled embedding that corresponds to the single cells in the full graph.

Parameters:

via_object –
embedding – n_samples x 2. embedding is 2D representation of the full dataset.
idx (list) – if one uses a downsampled embedding of the original data, then idx is the selected indices of the downsampled samples used in the visualization
cmap_name –
dpi –

do_log_flow (bool) – True (default) takes the natural log of (1 + edge flow value)
label_node (list) – list of labels for each cell (could be cell type, stage level), length is n_cells
scatter_size – if None, then auto determined based on n_cells
marker_lineages – Default is to use all lineage pathways. Otherwise provide a list of lineage numbers (terminal cluster numbers).
alpha_factor (float) – float transparency
root_cluster_list (list) – list of roots by cluster number, e.g. [5] means a good root is cluster number 5

Returns:

fig, axs
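The do_log_flow transform, the natural log of (1 + edge flow value), compresses large flows so that thin ribbons stay visible. A minimal sketch, assuming the flows are plain non-negative numbers; log_flow is a hypothetical helper and not part of pyVIA.

```python
import math


def log_flow(edge_flows, do_log_flow=True):
    """Apply ln(1 + flow) to every edge flow, or return the flows unchanged."""
    if not do_log_flow:
        return list(edge_flows)
    return [math.log1p(flow) for flow in edge_flows]


# Toy edge flow values.
flows = [0.0, 1.0, 99.0]
scaled = log_flow(flows)
print(scaled)
```

A flow of 0 stays at 0, while the ratio between the largest and smallest non-zero flows shrinks considerably.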

VIA.plotting_via.plot_gene_trend_heatmaps(via_object, df_gene_exp, marker_lineages=[], fontsize=8, cmap='viridis', normalize=True, ytick_labelrotation=0, fig_width=7)[source]

Plot the gene trends on heatmap: a heatmap is generated for each lineage (identified by terminal cluster number). Default selects all lineages

Parameters:

via_object –
df_gene_exp (DataFrame) – pandas DataFrame single-cell level expression [cells x genes]
marker_lineages (list) – list, default = [] plots all detected lineages. Optionally provide a list of integers corresponding to the cluster number of terminal cell fates
fontsize (int) – int default = 8
cmap (str) – str default = ‘viridis’
normalize (bool) – bool = True
ytick_labelrotation (int) – int default = 0

Returns:

fig and list of axes

VIA.plotting_via.plot_piechart_only_viagraph(via_object, type_data='pt', gene_exp=[], cmap_piechart='rainbow', title='', cmap=None, ax_text=True, dpi=150, headwidth_arrow=0.1, alpha_edge=0.4, linewidth_edge=2, edge_color='darkblue', reference_labels=None, show_legend=True, pie_size_scale=0.8, fontsize=8, pt_visual_threshold=99, highlight_terminal_clusters=True, size_node_notpiechart=1, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]

Plot two subplots with a clustergraph-level representation of the viagraph, showing true-label composition (LHS) and pseudotime/gene expression (RHS). Returns a matplotlib figure with two axes that plot the clustergraph using edge bundling: the left axis shows the clustergraph with each node colored by annotated ground truth membership, and the right axis shows the same clustergraph with each node colored by pseudotime or gene expression.

Parameters:

via_object – is class VIA (the same function also exists as a method of the class and as an external plotting function)
type_data – string default ‘pt’ for pseudotime colored nodes. or ‘gene’
gene_exp (list) – list of values (or column of dataframe) corresponding to feature or gene expression to be used to color nodes at CLUSTER level
cmap_piechart (str) – str cmap for piechart categories
title – string
cmap (str) – default None. automatically chooses coolwarm for gene expression or viridis_r for pseudotime
ax_text – Bool default= True. Annotates each node with cluster number and population of membership
dpi – int default = 150
headwidth_arrow – default = 0.1. width of arrowhead used for directed edges
reference_labels – None or list. list of categorical (str) labels for cluster composition of the piecharts (LHS subplot) length = n_samples.
pie_size_scale (float) – float default=0.8 scaling factor of the piechart nodes
pt_visual_threshold (int) – int (percentage) default = 99 corresponding to rescaling the visual color scale by clipping outlier cluster pseudotimes

highlight_terminal_clusters (bool) – default = True (red border around terminal clusters)
size_node_notpiechart (float) – scaling factor for node size of the viagraph (not the piechart part)
initial_bandwidth – (float = 0.05) increasing bw increases merging of minor edges. Only used when tune_edges = True
decay – (float = 0.9) increasing decay increases merging of minor edges. Only used when tune_edges = True
edgebundle_pruning – (float = 0.5) takes on values between 0-1. Smaller value means more pruning away edges that can be visualised. Only used when tune_edges = True

Returns:

f, ax, ax1
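Clipping outlier cluster pseudotimes, as described for pt_visual_threshold, might look like the following sketch. It is an illustration only: percentile and clip_for_color_scale are hypothetical helpers, the interpolation rule is an assumption, and pyVIA's own rescaling may differ.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def clip_for_color_scale(cluster_pseudotimes, pt_visual_threshold=99):
    """Cap pseudotimes at the given percentile so outliers do not stretch the colormap."""
    upper = percentile(cluster_pseudotimes, pt_visual_threshold)
    return [min(pt, upper) for pt in cluster_pseudotimes]


# Toy cluster-level pseudotimes with one outlier.
cluster_pt = [0.0, 1.0, 2.0, 3.0, 50.0]
clipped = clip_for_color_scale(cluster_pt, pt_visual_threshold=75)
print(clipped)
```

Only the values used for coloring are capped in this sketch; the underlying pseudotimes are left untouched.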

VIA.plotting_via.plot_piechart_viagraph(via_object, type_data='pt', gene_exp=[], cmap_piechart='rainbow', title='', cmap=None, ax_text=True, dpi=150, headwidth_arrow=0.1, alpha_edge=0.4, linewidth_edge=2, edge_color='darkblue', reference_labels=None, show_legend=True, pie_size_scale=0.8, fontsize=8, pt_visual_threshold=99, highlight_terminal_clusters=True, size_node_notpiechart=1, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]

Parameters:

via_object – is class VIA (the same function also exists as a method of the class and as an external plotting function)
type_data – string default ‘pt’ for pseudotime colored nodes. or ‘gene’
gene_exp (list) – list of values (or column of dataframe) corresponding to feature or gene expression to be used to color nodes at CLUSTER level
cmap_piechart (str) – str cmap for piechart categories
title – string
cmap (str) – default None. automatically chooses coolwarm for gene expression or viridis_r for pseudotime
ax_text – Bool default= True. Annotates each node with cluster number and population of membership
dpi – int default = 150
headwidth_arrow – default = 0.1. width of arrowhead used for directed edges
reference_labels – None or list. list of categorical (str) labels for cluster composition of the piecharts (LHS subplot) length = n_samples.
pie_size_scale (float) – float default=0.8 scaling factor of the piechart nodes
pt_visual_threshold (int) – int (percentage) default = 99 corresponding to rescaling the visual color scale by clipping outlier cluster pseudotimes

VIA.plotting_via.plot_population_composition(via_object, time_labels=None, celltype_list=None, cmap='rainbow', legend=True, alpha=0.5, linewidth=0.2, n_intervals=20, xlabel='time', ylabel='', title='Cell populations', color_dict=None, fraction=True)[source]

Parameters:

via_object – optional. This is required unless both time_labels and celltype_list are provided as arguments to the function
time_labels (list) – list of length n_cells of pseudotime or known stage numeric labels
celltype_list – list of cell type or cluster labels, of length n_cells
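The quantity this plot displays can be approximated by binning cells along the time axis and counting cell types per bin. The sketch below assumes equal-width intervals and treats fraction=True as "normalize each interval to sum to 1"; both are assumptions, and population_composition is a hypothetical helper rather than the library's code.

```python
from collections import Counter


def population_composition(time_labels, celltype_list, n_intervals=20, fraction=True):
    """Count (or take fractions of) cell types within equal-width time intervals."""
    if len(time_labels) != len(celltype_list):
        raise ValueError("both lists must have n_cells entries")
    t_min, t_max = min(time_labels), max(time_labels)
    # Fall back to a width of 1.0 if every cell has the same time value.
    width = (t_max - t_min) / n_intervals or 1.0
    bins = [Counter() for _ in range(n_intervals)]
    for t, cell_type in zip(time_labels, celltype_list):
        # The maximum time value is placed in the last interval.
        i = min(int((t - t_min) / width), n_intervals - 1)
        bins[i][cell_type] += 1
    if not fraction:
        return [dict(b) for b in bins]
    return [{k: v / sum(b.values()) for k, v in b.items()} for b in bins]


# Toy pseudotimes and cell types, one entry per cell.
times = [0.0, 0.1, 0.4, 0.6, 0.9, 1.0]
types = ["stem", "stem", "stem", "mono", "mono", "ery"]
composition = population_composition(times, types, n_intervals=2)
print(composition)
```

Intervals that contain no cells come back as empty dictionaries in this sketch.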

Returns:

VIA.plotting_via.plot_sc_lineage_probability(via_object, embedding=None, idx=None, cmap_name='plasma', dpi=150, scatter_size=None, marker_lineages=[], fontsize=8, alpha_factor=0.9, majority_cluster_population_dict=None, cmap_sankey='rainbow', do_sankey=False)[source]

G is the igraph knn (low K) used for shortest path in high-dimensional space. No idx is needed as it is made on the full sample. knn_hnsw is the knn made in the embedded space, used for query to find the nearest point in the downsampled embedding that corresponds to the single cells in the full graph.

Parameters:

via_object –
embedding (ndarray) – n_samples x 2. embedding is either the full or downsampled 2D representation of the full dataset.
idx (list) – if one uses a downsampled embedding of the original data, then idx is the selected indices of the downsampled samples used in the visualization
cmap_name –
dpi –
scatter_size – if None, then auto determined based on n_cells
marker_lineages – Default is to use all lineage pathways. Otherwise provide a list of lineage numbers (terminal cluster numbers).
alpha_factor (float) – float transparency

Returns:

fig, axs
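When the embedding was computed on a subset of cells, idx records which cells were kept so that per-cell values can be aligned with the embedding rows. A minimal sketch of that bookkeeping, assuming a simple random downsample; downsample_indices and subset are hypothetical helpers and the labels are toy values.

```python
import random


def downsample_indices(n_cells, n_keep, seed=0):
    """Pick a sorted random subset of cell indices."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_cells), n_keep))


def subset(per_cell_values, idx):
    """Keep only the entries that belong to the downsampled cells."""
    return [per_cell_values[i] for i in idx]


# One label per cell in the full dataset.
labels = ["A", "B", "C", "D", "E", "F", "G", "H"]
idx = downsample_indices(n_cells=len(labels), n_keep=4)
labels_sub = subset(labels, idx)
print(idx, labels_sub)
```

In this sketch the embedding would have len(idx) rows, with row j corresponding to cell idx[j] of the full dataset.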

VIA.plotting_via.plot_scatter(embedding, labels, cmap='rainbow', s=5, alpha=0.3, edgecolors='None', title='', text_labels=True, color_dict=None, via_object=None, sc_index_terminal_states=None, true_labels=[], show_legend=True, hide_axes_ticks=True, color_labels_reverse=False)[source]

General scatter plotting tool for numeric and categorical labels on the single-cell level

Parameters:

embedding (ndarray) – ndarray n_samples x 2
labels (list) – list of single-cell labels (numbers or strings)
cmap – str default = ‘rainbow’
s – int size of scatter dot
alpha – float with 0 transparent to 1 opaque default =0.3
edgecolors –
title (str) – str
text_labels (bool) – bool default =True
via_object –
sc_index_terminal_states (list) – list of integers corresponding to one cell in each of the terminal states
color_dict – {'true_label_group_1': #COLOR, 'true_label_group_2': #COLOR2, ...} where the dictionary keys correspond to the provided labels
true_labels (list) – list of single cell labels used to annotate the terminal states

Returns:

matplotlib pyplot fig, ax
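A color_dict of the shape described above can be built from the labels themselves. The sketch below assigns evenly spaced hues and stores them as hex strings; make_color_dict is a hypothetical helper, and the choice of hex strings as the color format is an assumption based on the #COLOR placeholder in the parameter description.

```python
import colorsys


def make_color_dict(labels):
    """Assign each unique label an evenly spaced hue, returned as a hex string."""
    unique = sorted(set(labels))
    color_dict = {}
    for i, label in enumerate(unique):
        r, g, b = colorsys.hsv_to_rgb(i / len(unique), 0.8, 0.9)
        color_dict[label] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255))
    return color_dict


# Toy single-cell labels.
true_labels = ["HSC", "Ery", "Mono", "HSC", "Ery"]
color_dict = make_color_dict(true_labels)
print(color_dict)
```

Sorting the unique labels keeps the mapping stable between runs, so the same group gets the same color in every figure.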

VIA.plotting_via.plot_trajectory_curves(via_object, embedding=None, idx=None, title_str='Pseudotime', draw_all_curves=True, arrow_width_scale_factor=15.0, scatter_size=50, scatter_alpha=0.5, linewidth=1.5, marker_edgewidth=1, cmap_pseudotime='viridis_r', dpi=150, highlight_terminal_states=True, use_maxout_edgelist=False)[source]

projects the graph based coarse trajectory onto a umap/tsne embedding

Parameters:

via_object – via object
embedding (ndarray) – 2d array [n_samples x 2] with x and y coordinates of all n_samples. Umap, tsne, pca OR use the via computed embedding via_object.embedding
idx (Optional[list]) – default: None. Or List. if you had previously computed a umap/tsne (embedding) only on a subset of the total n_samples (subsampled as per idx), then the via objects and results will be indexed according to idx too
title_str (str) – title of figure
draw_all_curves (bool) – if the clustergraph has too many edges to project in a visually interpretable way, set this to False to get a simplified view of the graph pathways
arrow_width_scale_factor (float) –
scatter_size (float) –
scatter_alpha (float) –
linewidth (float) –
marker_edgewidth (float) –
cmap_pseudotime (str) –
dpi (int) – int default = 150. Use 300 for paper figures
highlight_terminal_states (bool) – whether or not to highlight/distinguish the clusters which are detected as the terminal states by via

Returns:

f, ax1, ax2

VIA.plotting_via.plot_viagraph(via_object, type_data='gene', df_genes=None, gene_list=[], arrow_head=0.1, edgeweight_scale=1.5, cmap=None, label_text=True, size_factor_node=1, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]

cluster level expression of gene/feature intensity

Parameters:

via_object –
type_data –
gene_exp – pd.DataFrame size n_cells x genes. Otherwise defaults to plotting pseudotime
gene_list (list) – list of gene names corresponding to the column name
arrow_head (float) –
edgeweight_scale (float) –
cmap –
label_text (bool) – bool to add numeric values of the gene exp level
size_factor_node – size of graph nodes
tune_edges (bool) – bool (False). If you want to change the number of edges visualized, then set this to True and modify the tuning parameters (initial_bandwidth, decay, edgebundle_pruning)
initial_bandwidth – (float = 0.05) increasing bw increases merging of minor edges. Only used when tune_edges = True
decay – (float = 0.9) increasing decay increases merging of minor edges. Only used when tune_edges = True
edgebundle_pruning – (float = 0.5) takes on values between 0-1. Smaller value means more pruning away edges that can be visualised. Only used when tune_edges = True

Returns:

fig, axs

VIA.plotting_via.plot_viagraph_(ax=None, hammer_bundle=None, layout=None, CSM=None, velocity_weight=None, pt=None, alpha_bundle=1, linewidth_bundle=2, edge_color='darkblue', headwidth_bundle=0.1, arrow_frequency=0.05, show_direction=True, ax_text=True, title='', plot_clusters=False, cmap='viridis', via_object=None, fontsize=9, dpi=300, tune_edges=False, initial_bandwidth=0.05, decay=0.9, edgebundle_pruning=0.5)[source]

this plots the edgebundles on the via clustergraph level and also adds the relevant arrow directions based on the TI directionality

Parameters:

ax – axis to plot on
hammer_bundle – hammerbundle object with coordinates of all the edges to draw. self.hammer
layout (ndarray) – coords of cluster nodes
CSM (ndarray) – cosine similarity matrix. cosine similarity between the RNA velocity between neighbors and the change in gene expression between these neighbors. Only used when available
velocity_weight (float) – percentage weightage given to the RNA velocity based transition matrix
pt (list) – cluster-level pseudotime (or other intensity level of features at average-cluster level)
alpha_bundle – alpha when drawing lines
linewidth_bundle – linewidth of bundled lines
edge_color –
headwidth_bundle – headwidth of arrows used in bundled edges
arrow_frequency – min dist between arrows (bundled edges otherwise have overcrowding of arrows)
show_direction – bool default True. will draw arrows along the lines to indicate direction
plot_clusters (bool) – bool default False. When this function is called on its own (and not from within draw_piechart_graph()), then via_object must be provided
ax_text (bool) – bool default True. Show labels of the clusters with the cluster population and PARC cluster label
fontsize (float) – float default 9 Font size of labels

Returns:

fig, ax with bundled edges plotted

VIA.plotting_via.via_atlas_emb(via_object=None, X_input=None, graph=None, n_components=2, alpha=1.0, negative_sample_rate=5, gamma=1.0, spread=1.0, min_dist=0.1, init_pos='via', random_state=0, n_epochs=100, distance_metric='euclidean', layout=None, cluster_membership=None, parallel=False, saveto='', n_jobs=2)[source]

Run dimensionality reduction using the VIA modified HNSW graph using via cluster graph initialization when Via_object is provided

Parameters:

via_object – if via_object is provided then X_input and graph are ignored
X_input (ndarray) – ndarray nsamples x features (PCs)
graph (csr_matrix) – csr_matrix of knngraph. This usually is via’s pruned, sequentially augmented sc-knn graph accessed as an attribute of via via_object.csr_full_graph
n_components (int) –
alpha (float) –
negative_sample_rate (int) –
gamma (float) – Weight to apply to negative samples.
spread (float) – The effective scale of embedded points. In combination with min_dist this determines how clustered/clumped the embedded points are.
min_dist (float) – The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points
init_pos (Union[str, ndarray]) – either a string (default) ‘via’ (uses via graph to initialize), or ‘spectral’. Or an n_cells x 2 dimensional ndarray with initial coordinates
random_state (int) –
n_epochs (int) – The number of training epochs to be used in optimizing the low dimensional embedding. Larger values result in more accurate embeddings. If 0 is specified a value will be selected based on the size of the input dataset (200 for large datasets, 500 for small).
distance_metric (str) –
layout (Optional[list]) – ndarray . custom initial layout. (n_cells x2). also requires cluster_membership labels
cluster_membership (Optional[list]) – via_object.labels (cluster level labels of length n_samples corresponding to the layout)

Return type:

ndarray

Returns:

ndarray of shape (nsamples,n_components)
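The n_epochs description says that passing 0 selects a value from the dataset size (200 for large datasets, 500 for small). The sketch below shows that selection logic only. The 10,000-cell cutoff is an assumption, since the documentation does not say where "large" begins, and choose_n_epochs is a hypothetical helper.

```python
def choose_n_epochs(n_cells, n_epochs=0, large_dataset_cutoff=10000):
    """Return the requested epoch count, or a size-based default when n_epochs is 0."""
    if n_epochs < 0:
        raise ValueError("n_epochs must be non-negative")
    if n_epochs > 0:
        return n_epochs
    # Assumed cutoff: the docs only say "large" versus "small" datasets.
    return 200 if n_cells > large_dataset_cutoff else 500


print(choose_n_epochs(n_cells=500))
print(choose_n_epochs(n_cells=50000))
print(choose_n_epochs(n_cells=50000, n_epochs=100))
```

Larger datasets get fewer epochs because each epoch already processes many more edges of the graph.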

VIA.plotting_via.via_forcelayout(X_pca, viagraph_full=None, k=10, n_milestones=2000, time_series_labels=[], knn_seq=5, saveto='', random_seed=0)[source]

Compute force directed layout. #TODO not complete

Parameters:

X_pca –
viagraph_full (csr_matrix) – optional. if calling before via, then None. if calling after or from within via, then we can use the via-graph to reinforce the layout
k (int) –
random_seed (int) –
t_diffusion –
n_milestones –
time_series_labels (list) –
knn_seq (int) –

Return type:

ndarray

Returns:

ndarray

VIA.plotting_via.via_mds(via_object=None, X_pca=None, viagraph_full=None, k=15, random_seed=0, diffusion_op=1, n_milestones=2000, time_series_labels=[], knn_seq=5, k_project_milestones=3, t_difference=2, saveto='', embedding_type='mds', double_diffusion=False)[source]

Fast computation of a 2D embedding. For example:

via_object.embedding = via.via_mds(via_object = v0)
plot_scatter(embedding = via_object.embedding, labels = via_object.true_labels)

Parameters:

via_object –
X_pca (ndarray) – dimension reduced (only if via_object is not passed)
viagraph_full (csr_matrix) – optional. if calling before or without via, then None and a milestone graph will be computed. if calling after or from within via, then we can use the via-graph to reinforce the layout of the milestone graph
k (int) – number of knn for the via_mds reinforcement graph on milestones. default =15. integers 5-20 are reasonable
random_seed (int) – randomseed integer
t_diffusion – default integer value = 1; higher values generate more smoothing
n_milestones – number of milestones used to generate the initial embedding
time_series_labels (list) – numerical values in list form representing some sequential information
knn_seq (int) – if time-series data is available, this will augment the knn with sequential neighbors (2-10 are reasonable values) default =5
embedding_type (str) – default = ‘mds’ or set to ‘umap’
double_diffusion (bool) – default is False. To achieve sharper strokes/lineages, set to True
k_project_milestones (int) – number of milestones in the milestone-knngraph used to compute the single-cell projection
n_iterations – number of iterations to run
neighbors_distances – array of distances of each neighbor for each cell (n_cells x knn) used when called from within via.run() for autocompute via-mds

Return type:

ndarray

Returns:

numpy array of size n_samples x 2

VIA.plotting_via.via_streamplot(via_object, embedding=None, density_grid=0.5, arrow_size=0.7, arrow_color='k', color_dict=None, arrow_style='-|>', max_length=4, linewidth=1, min_mass=1, cutoff_perc=5, scatter_size=500, scatter_alpha=0.5, marker_edgewidth=0.1, density_stream=2, smooth_transition=1, smooth_grid=0.5, color_scheme='annotation', add_outline_clusters=False, cluster_outline_edgewidth=0.001, gp_color='white', bg_color='black', dpi=300, title='Streamplot', b_bias=20, n_neighbors_velocity_grid=None, labels=None, use_sequentially_augmented=False, cmap='rainbow', show_text_labels=True)[source]

Construct vector streamplot on the embedding to show a fine-grained view of inferred directions in the trajectory

Parameters:

via_object –
embedding (ndarray) – np.ndarray of shape (n_samples, 2) umap or other 2-d embedding on which to project the directionality of cells
density_grid (float) –
arrow_size (float) –
arrow_color (str) –
arrow_style –
max_length (int) –
linewidth (float) – width of lines in streamplot, default = 1
min_mass –
cutoff_perc (int) –
scatter_size (int) – size of scatter points default =500
scatter_alpha (float) – transparency of scatter points
marker_edgewidth (float) – width of outline around each scatter point, default = 0.1
density_stream (int) –
smooth_transition (int) –
smooth_grid (float) –
color_scheme (str) – str, default = ‘annotation’ corresponds to self.true_labels. Other options are ‘time’ (uses single-cell pseudotime) and ‘cluster’ (via cluster graph) and ‘other’. Alternatively provide labels as a list
add_outline_clusters (bool) –
cluster_outline_edgewidth –
gp_color –
bg_color –
dpi –
title –
b_bias – default = 20. higher value makes the forward bias of pseudotime stronger
n_neighbors_velocity_grid –
labels (list) – list (will be used for the color scheme) or if a color_dict is provided these labels should match
use_sequentially_augmented –
cmap (str) –

Returns:

fig, ax

Datasets

VIA.datasets_via.cell_cycle(foldername='./')[source]

Load cell cycle data as AnnData object

Parameters:

foldername (string) – Directory of dataset

Returns:

AnnData object

https://github.com/ShobiStassen/VIA/blob/master/Figures/mb231_overall_300dpi.png?raw=true

VIA.datasets_via.cell_cycle_cyto_data(foldername='./')[source]

Load cell cycle imaging-based flow cytometry features as an AnnData object with n_obs × n_vars = 2036 × 38 and obs: ‘cell_cycle_phase’

Parameters:

foldername (string) – Default current directory. Path to directory where you want to store the dataset

Returns:

AnnData object

VIA.datasets_via.embryoid_body(foldername='./')[source]

Load embryoid body data as AnnData object

Parameters:

foldername (string) – Directory to save dataset

Returns:

AnnData object

VIA.datasets_via.moffitt_preoptic(foldername='./')[source]

Load preoptic hypothalamus mouse data from Moffitt et al. as AnnData object

Parameters:

foldername (string) – path to directory where you want to store the dataset. ‘./’ (current directory) is the default

Returns:

AnnData object

https://github.com/ShobiStassen/VIA/blob/master/Figures/Bregma29_tissue.png?raw=true

VIA.datasets_via.scATAC_hematopoiesis(foldername='./')[source]

Load scATAC seq Hematopoiesis data as AnnData object

Parameters:

foldername (string) – Directory of dataset

Returns:

AnnData object

VIA.datasets_via.scRNA_hematopoiesis(foldername='./')[source]

Load scRNA seq Hematopoiesis data as AnnData object

Parameters:

foldername (string) – Directory of dataset

Returns:

AnnData object

https://github.com/ShobiStassen/VIA/blob/master/Figures/humancd34_streamplot.png?raw=true

VIA.datasets_via.toy_disconnected(foldername='./')[source]

Load Toy_Disconnected data as AnnData object

To access obs (label) as list, use AnnData.obs['group_id'].values.tolist()

Parameters:

foldername (string) – Default current directory. Path to directory where you want to store the dataset

Returns:

AnnData object

https://github.com/ShobiStassen/VIA/blob/master/Figures/stream_plot_toy4.png?raw=true

VIA.datasets_via.toy_multifurcating(foldername='./')[source]

Load Toy_Multifurcating data as AnnData object

To access obs (label) as list, use AnnData.obs['group_id'].values.tolist()

Parameters:

foldername (string) – path to directory where you want to store the dataset. ‘./’ (current directory) is the default

Returns:

AnnData object

https://github.com/ShobiStassen/VIA/blob/master/Figures/toy3_streamvia.png?raw=true

VIA.datasets_via.zesta(foldername='./')[source]

Returns: