2. Via 2.0 Cartography on Zebrahub (Trajectory Inference)

This tutorial focuses on how to perform TI analysis of Zebrahub. See Tutorial #3 for Atlas View creation

import scanpy as sc
import matplotlib.pyplot as plt
import numpy as np
import pyVIA.core as via
from datetime import datetime
import pyVIA.datasets_via as datasets

Load the data

Data has been filtered and library normalized and log1p. Data is annotated with time/stage, cluster labels, and cell type labels (coarse and fine).

print(f'{datetime.now()}\tStart reading data')
adata=sc.read_h5ad( filename='/home/user/Trajectory/Datasets/Zebrafish_Lange2023/Rawdata/Zebrahub_data_via.h5ad')
#compute PCA
adata.uns['log1p']['base'] = None #fix for small bug in scanpy #see https://github.com/theislab/single-cell-tutorial/issues/97
sc.pp.highly_variable_genes(adata, n_top_genes=2000)
sc.tl.pca(adata, n_comps=100)
print(adata)

#load the adata_velo below if you want to try to use velocity. this is a 8GB file so is very RAM intensive
adata_velo = sc.read_h5ad('/home/user/Trajectory/Datasets/Zebrafish_Lange2023/Rawdata/Zebrahub_data_velocity_via.h5ad') #10GB file. if you dont have the RAM to load this, then you can still run Via2 without velocity.
print('adata_velo')
print(adata_velo)


#the data can also be loaded using
adata = datasets.zebrahub(foldername="./", use_velocity = False)

2023-09-25 08:25:15.560669	Start reading data
AnnData object with n_obs × n_vars = 119457 × 28187
    obs: 'n_genes', 'n_counts', 'fish', 'timepoint_cluster', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_nc', 'pct_counts_nc', 'developmental_stage', 'timepoint', 'zebrafish_anatomy_ontology_class', 'zebrafish_anatomy_ontology_id', 'parc', 'kmeans50', 'celltype_extrainfo', 'zebrafish_anatomy_ontology_class_shobi', 'adjusted_timepoint', 'bool_list', 'global annotation', 'time_coarse_cell_type', 'time_fine_cell_type', 'parc_cluster', 'parc_cluster_and_cell_type'
    var: 'gene_ids', 'feature_types', 'genome', 'mt', 'nc', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'global_annotation_colors', 'hvg', 'log1p', 'pca', 'timepoint_colors', 'zebrafish_anatomy_ontology_class_colors', 'zebrafish_anatomy_ontology_id_colors'
    obsm: 'X_U', 'X_Via2_Atlas', 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts'
adata_velo
AnnData object with n_obs × n_vars = 119457 × 3000
    obs: 'clusters', 'seurat_clusters', 'timepoint', 'fish', 'initial_size_spliced', 'initial_size_unspliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'velocity_length', 'velocity_confidence', 'velocity_confidence_transition', 'barcode in labeled data', 'parc', 'kmeans50', 'ontology_class', 'parc_cluster', 'parc_cluster_and_cell_type'
    var: 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std', 'gene_count_corr', 'velocity_gamma', 'velocity_qreg_ratio', 'velocity_r2', 'velocity_genes', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'fit_r2'
    uns: 'hvg', 'neighbors', 'pca', 'recover_dynamics', 'seurat_clusters_colors', 'umap', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
    obsm: 'X_Via2_Atlas', 'X_pca', 'X_umap', 'noveloumap', 'velocity_umap'
    varm: 'PCs', 'loss'
    layers: 'Ms', 'Mu', 'counts', 'fit_t', 'fit_tau', 'fit_tau_', 'matrix', 'spliced', 'unspliced', 'variance_velocity', 'velocity', 'velocity_u'
    obsp: 'connectivities', 'distances'

Extract some annotation

n_cells = adata.shape[0]
print('cells is', n_cells)
parc_cluster = [i for i in adata.obs['parc_cluster']]
parc_cluster_and_cell_type = [i for i in adata.obs['parc_cluster_and_cell_type']]
coarse_cell_type = [i for i in adata.obs['zebrafish_anatomy_ontology_class']]
fine_cell_type = [i for i in adata.obs['zebrafish_anatomy_ontology_class_shobi']] #consolidated annotaions from individual time point files
coarse_fine_cell_type = [coarse_cell_type[i] + fine_cell_type[i] for i in range(n_cells)]
time_point_numeric = [i for i in adata.obs['adjusted_timepoint']] #assigned a numeric value to each stage (e.g. 5hpf is 5, 10hpf is 10,...)
early_cell_type = [i if str(i) != 'nan' else '' for i in adata.obs['global annotation']]

hybrid_cell_time = [str(fine_cell_type[i]) + '_'+early_cell_type[i]+'_'+str(time_point_numeric[i]) for i in range(n_cells)] #labels of early stage cells which was provided for cells <24hpf also included
parc_numeric_cluster_labels = [i for i in adata.obs['parc']]
parc_cluster = [i for i in adata.obs['parc_cluster']]

cells is 119457

Find a suitable root

root_str = 'paraxial mesoderm_Somites_0'
loc_root = np.where(np.asarray(hybrid_cell_time) == root_str)[0] #'paraxial mesoderm_PSM_0' 1 cell , 'paraxial mesoderm_Somites_0' has 51 cells, 'paraxial mesoderm_Muscle_0', 'tail bud_NMPs_0'
root = [loc_root[0]]
n_pcs = 100
print(f'{datetime.now()}\troot is cell number  {root[0]} of type {fine_cell_type[root[0]]}')

2023-09-25 08:28:03.566017	root is cell number  2604 of type paraxial mesoderm

Initialize parameters

knn=30
time_series = True  
memory = 50
jac_std_global = 1  
random_seed = 0#1  
knn_sequential = 5  
t_diff_step = 2
do_gaussian_kernel_edgeweights = False
edgebundle_pruning_twice = True
cluster_graph_pruning_std = 0.15  
edgebundle_pruning = 0.05  
neighboring_terminal_states_threshold = 4 
viagraph_decay = 0.7
max_vis_out = 2
use_velocity = True
velo_weight = 0.5

if use_velocity:
    gene_matrix =np.asmatrix(adata_velo.layers['Ms'])
    velocity_matrix = np.asmatrix(adata_velo.layers['velocity'])
    velo_weight=velo_weight
else: 
    gene_matrix =None
    velocity_matrix = None
    velo_weight=0

Set terminal states (optional)

# The parameter "user_defined_terminal_group" is being set to a list of group level annotations corresponding to the provided true_label. These were terminal fates detected in previous runs and saved for reproducability.
# This parameter can also be left to its default user_defined_terminal_group = []. the results should be similar.
user_defined_cellfates= ['0_cartilage element', '2_dermis', '49_fin',
                                          '67_pharyngeal arch', '120_cranial cartilage',
                                          '1_forebrain', '71_neuron', '89_diencephalon',
                                          '136_radial glial cell',
                                          '10_epidermal cell', '83_neuromast', 
                                          '145_enteric musculature', '55_myotome', 
                                          '46_hematopoietic cell',   '102_vasculature', '127_Muller cell', '105_oligodendrocyte',   '69_liver', '90_pharynx',
                                          '95_intestine']

Initialize and Run the Via2.0 Class

# Initialize and Run the Via2.0 Class
print(f'{datetime.now()}\tRun Via2.0')

v0 = via.VIA(adata.obsm['X_pca'][:, 0:n_pcs], true_label=parc_cluster_and_cell_type, jac_std_global=jac_std_global,
             labels=parc_cluster,
             dist_std_local=1, knn=knn, memory=memory,
             user_defined_terminal_group=user_defined_cellfates,
             cluster_graph_pruning_std=cluster_graph_pruning_std,
             neighboring_terminal_states_threshold=neighboring_terminal_states_threshold,
             too_big_factor=0.3, resolution_parameter=1,
             root_user=root, dataset='', random_seed=random_seed,
             is_coarse=True, preserve_disconnected=False, pseudotime_threshold_TS=40, x_lazy=0.99,
             do_gaussian_kernel_edgeweights=do_gaussian_kernel_edgeweights,
             alpha_teleport=0.99, edgebundle_pruning=edgebundle_pruning, edgebundle_pruning_twice=True,
             time_series=time_series, time_series_labels=time_point_numeric, knn_sequential=knn_sequential,
             knn_sequential_reverse=knn_sequential, t_diff_step=t_diff_step, velo_weight=velo_weight, velocity_matrix=velocity_matrix, gene_matrix=gene_matrix,
             do_compute_embedding=False)

v0.run_VIA()
v0.embedding = adata.obsm['X_Via2_Atlas']

2023-09-25 08:28:03.589451	Run Via2.0
2023-09-25 08:28:04.804497	Running VIA over input data of 119457 (samples) x 100 (features)
2023-09-25 08:28:04.804574	Knngraph has 30 neighbors
2023-09-25 08:28:42.727156	Using time series information to guide knn graph construction 
2023-09-25 08:28:42.736297	Time series ordered set [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]
2023-09-25 08:29:35.321478	Shape neighbors (119457, 30) and sequential neighbors (119457, 10)
2023-09-25 08:29:35.354718	Shape augmented neighbors (119457, 40)
2023-09-25 08:29:35.380011	Actual average allowable time difference between nodes is 10.0
2023-09-25 08:30:02.089539	Finished global pruning of 30-knn graph used for clustering at level of 1. Kept 87.6 % of edges. 
2023-09-25 08:30:02.213207	Number of connected components used for clustergraph  is 1
2023-09-25 08:30:18.393414	Using predfined labels provided by user (this must be provided as an array)
2023-09-25 08:30:18.393502	Making cluster graph. Global cluster graph pruning level: 0.15
2023-09-25 08:30:19.415532	Graph has 1 connected components before pruning
2023-09-25 08:30:19.425338	Graph has 1 connected components after pruning
2023-09-25 08:30:19.425605	Graph has 1 connected components after reconnecting
2023-09-25 08:30:19.427311	44.3% links trimmed from local pruning relative to start
2023-09-25 08:30:19.427334	87.4% links trimmed from global pruning relative to start
2023-09-25 08:30:19.430975	Graph has 1 connected components before pruning
2023-09-25 08:30:19.444368	Graph has 3 connected components after pruning
2023-09-25 08:30:19.448015	Graph has 1 connected components after reconnecting
2023-09-25 08:30:19.449751	44.3% links trimmed from local pruning relative to start
2023-09-25 08:30:19.449775	90.1% links trimmed from global pruning relative to start
size velocity matrix 119457 (119457, 3000)

/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/scipy/sparse/_index.py:103: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])

2023-09-25 08:30:23.954500	Looking for initial states
2023-09-25 08:30:24.106221	Stationary distribution normed [0.01  0.005 0.009 0.032 0.015 0.01  0.006 0.006 0.008 0.022 0.008 0.009
 0.012 0.007 0.009 0.011 0.006 0.021 0.001 0.002 0.008 0.004 0.003 0.009
 0.006 0.014 0.006 0.016 0.011 0.001 0.003 0.008 0.003 0.007 0.007 0.004
 0.006 0.001 0.007 0.002 0.005 0.012 0.009 0.005 0.007 0.018 0.    0.005
 0.011 0.006 0.007 0.008 0.017 0.006 0.017 0.004 0.005 0.003 0.01  0.005
 0.012 0.007 0.015 0.002 0.007 0.005 0.006 0.005 0.018 0.005 0.011 0.006
 0.    0.008 0.006 0.006 0.005 0.004 0.003 0.007 0.006 0.008 0.001 0.002
 0.011 0.008 0.001 0.005 0.002 0.003 0.005 0.001 0.006 0.003 0.002 0.006
 0.006 0.019 0.008 0.007 0.003 0.012 0.001 0.004 0.009 0.001 0.001 0.002
 0.012 0.008 0.007 0.011 0.003 0.008 0.005 0.001 0.004 0.006 0.011 0.002
 0.005 0.005 0.021 0.009 0.006 0.002 0.005 0.001 0.001 0.002 0.005 0.002
 0.008 0.015 0.    0.017 0.006 0.005 0.005 0.004 0.001 0.004 0.006 0.005
 0.003 0.005]
2023-09-25 08:30:24.107358	Top 5 candidates for root: [134  72  46  37  82  18 102 127  86 105] with stationary prob (%) [0.032 0.039 0.043 0.055 0.055 0.07  0.076 0.082 0.089 0.092]
2023-09-25 08:30:24.107614	Top 5 candidates for terminal: [  3   9 122  17  97]
cell type of suggested roots: ['120_cranial cartilage', '24_myotome', '6_ventricular zone', '49_pectoral fin', '89_diencephalon', '74_cranial cartilage', '23_eye', '1_forebrain', '107_neuron', '103_fin']
cell stage of suggested roots: [45, 15, 35, 45, 35, 40, 35, 35, 40, 25]
2023-09-25 08:30:24.122752	component number 0 out of  [0]
2023-09-25 08:30:24.460261	The root index, 2604 provided by the user belongs to cluster number 97 and corresponds to cell type 97_paraxial mesoderm
2023-09-25 08:30:24.629263	Computing lazy-teleporting expected hitting times
2023-09-25 08:30:32.855588	 ended all multiprocesses, will retrieve and reshape
try rw2 hitting times setup
start computing walks with rw2 method
g.indptr.size, 147

/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/pecanpy/graph.py:90: UserWarning: WARNING: Implicitly set node IDs to the canonical node ordering due to missing IDs field in the raw CSR npz file. This warning message can be suppressed by setting implicit_ids to True in the read_npz function call, or by setting the --implicit_ids flag in the CLI
  warnings.warn(

memory for rw2 hittings times  2. Using rw2 based pt
do scaling of pt
2023-09-25 08:30:41.521269	Terminal cluster list based on user defined cells/groups: [('0_cartilage element', 0), ('2_dermis', 2), ('49_fin', 49), ('67_pharyngeal arch', 67), ('120_cranial cartilage', 120), ('1_forebrain', 1), ('71_neuron', 71), ('89_diencephalon', 89), ('136_radial glial cell', 136), ('10_epidermal cell', 10), ('83_neuromast', 83), ('145_enteric musculature', 145), ('55_myotome', 55), ('46_hematopoietic cell', 46), ('102_vasculature', 102), ('127_Muller cell', 127), ('105_oligodendrocyte', 105), ('69_liver', 69), ('90_pharynx', 90), ('95_intestine', 95)]
2023-09-25 08:30:41.521658	Terminal clusters corresponding to unique lineages in this component are [0, 2, 49, 67, 120, 1, 71, 89, 136, 10, 83, 145, 55, 46, 102, 127, 105, 69, 90, 95] 
TESTING rw2_lineage probability at memory 50
testing rw2 lineage probability at memory 50
g.indptr.size, 147

/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/pecanpy/graph.py:90: UserWarning: WARNING: Implicitly set node IDs to the canonical node ordering due to missing IDs field in the raw CSR npz file. This warning message can be suppressed by setting implicit_ids to True in the read_npz function call, or by setting the --implicit_ids flag in the CLI
  warnings.warn(

2023-09-25 08:30:46.871385	 Cluster or terminal cell fate 0 is reached 194.0 times
2023-09-25 08:30:47.265164	 Cluster or terminal cell fate 2 is reached 211.0 times
2023-09-25 08:30:47.620006	 Cluster or terminal cell fate 49 is reached 38.0 times
2023-09-25 08:30:47.959715	 Cluster or terminal cell fate 67 is reached 206.0 times
2023-09-25 08:30:48.442676	 Cluster or terminal cell fate 120 is reached 52.0 times
2023-09-25 08:30:48.917698	 Cluster or terminal cell fate 1 is reached 265.0 times
2023-09-25 08:30:49.369804	 Cluster or terminal cell fate 71 is reached 148.0 times
2023-09-25 08:30:49.691185	 Cluster or terminal cell fate 89 is reached 485.0 times
2023-09-25 08:30:50.014253	 Cluster or terminal cell fate 136 is reached 137.0 times
2023-09-25 08:30:50.368288	 Cluster or terminal cell fate 10 is reached 29.0 times
2023-09-25 08:30:50.585364	 Cluster or terminal cell fate 83 is reached 128.0 times
2023-09-25 08:30:50.860216	 Cluster or terminal cell fate 145 is reached 37.0 times
2023-09-25 08:30:51.067859	 Cluster or terminal cell fate 55 is reached 76.0 times
2023-09-25 08:30:51.453572	 Cluster or terminal cell fate 46 is reached 21.0 times
2023-09-25 08:30:51.802014	 Cluster or terminal cell fate 102 is reached 62.0 times
2023-09-25 08:30:52.075750	 Cluster or terminal cell fate 127 is reached 70.0 times
2023-09-25 08:30:52.443008	 Cluster or terminal cell fate 105 is reached 111.0 times
2023-09-25 08:30:52.775031	 Cluster or terminal cell fate 69 is reached 26.0 times
2023-09-25 08:30:53.049582	 Cluster or terminal cell fate 90 is reached 26.0 times
2023-09-25 08:30:53.420929	 Cluster or terminal cell fate 95 is reached 15.0 times
2023-09-25 08:30:54.022736	There are (20) terminal clusters corresponding to unique lineages {0: '0_pectoral fin', 2: '2_dermis', 49: '49_fin', 67: '67_pharyngeal arch', 120: '120_cranial cartilage', 1: '1_forebrain', 71: '71_neuron', 89: '89_diencephalon', 136: '136_radial glial cell', 10: '10_epidermal cell', 83: '83_neuromast', 145: '145_enteric musculature', 55: '55_myotome', 46: '46_hematopoietic cell', 102: '102_vasculature', 127: '127_Muller cell', 105: '105_oligodendrocyte', 69: '69_liver', 90: '90_pharynx', 95: '95_intestine'}
2023-09-25 08:30:54.022861	Begin projection of pseudotime and lineage likelihood
2023-09-25 08:31:03.461077	Start reading data
2023-09-25 08:31:03.461154	Correlation of Via pseudotime with developmental stage 72.11 %
2023-09-25 08:31:03.498392	Transition matrix with weight of 0.5 on RNA velocity
2023-09-25 08:31:03.498789	Cluster graph layout based on forward biasing
2023-09-25 08:31:03.528940	Graph has 1 connected components before pruning
2023-09-25 08:31:03.536532	Graph has 57 connected components after pruning
2023-09-25 08:31:03.599803	Graph has 1 connected components after reconnecting
2023-09-25 08:31:03.600527	68.3% links trimmed from local pruning relative to start
2023-09-25 08:31:03.600550	81.0% links trimmed from global pruning relative to start
2023-09-25 08:31:03.600584	Additional Visual cluster graph pruning for edge bundling at level: 0.15
2023-09-25 08:31:03.602326	Make via clustergraph edgebundle

/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/scipy/sparse/_index.py:103: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])

2023-09-25 08:31:06.258257	Hammer dims: Nodes shape: (146, 2) Edges shape: (918, 3)
2023-09-25 08:31:06.490458	Time elapsed 151.8 seconds

Visualise Via 2.0 Cluster Graph

print(f'{datetime.now()}\t plotting piechart_graph')
fig,ax1, ax2 = via.draw_piechart_graph(via_object=v0, reference_labels=time_point_numeric, show_legend=False, ax_text=False,
                        headwidth_arrow=0.8, highlight_terminal_clusters=False, cmap_piechart='plasma', cmap='viridis',
                        pie_size_scale=0.6,size_node_notpiechart=1)
fig.set_size_inches(40, 15)
fig,ax1, ax2=via.draw_piechart_graph(via_object=v0, reference_labels=coarse_cell_type, show_legend=False, ax_text=False,
                        headwidth_arrow=0.8, highlight_terminal_clusters=True, cmap_piechart='rainbow',cmap='viridis',
                        pie_size_scale=0.5,size_node_notpiechart=1)
fig.set_size_inches(40, 15)

fig,ax1, ax2=via.draw_piechart_graph(via_object=v0, reference_labels=coarse_cell_type, show_legend=False, ax_text=True,
                        headwidth_arrow=0.8, highlight_terminal_clusters=True, cmap_piechart='rainbow', 
                        pie_size_scale=0.5,size_node_notpiechart=1)
fig.set_size_inches(40, 15)
plt.show()

2023-09-25 08:31:06.497122	 plotting piechart_graph

/home/user/PycharmProjects/Via2_May2023_py310/plotting_via.py:3242: UserWarning: No data for colormapping provided via 'c'. Parameters 'cmap' will be ignored
  sct = ax.scatter(node_pos[:, 0], node_pos[:, 1],
/home/user/PycharmProjects/Via2_May2023_py310/plotting_via.py:3242: UserWarning: No data for colormapping provided via 'c'. Parameters 'cmap' will be ignored
  sct = ax.scatter(node_pos[:, 0], node_pos[:, 1],
/home/user/PycharmProjects/Via2_May2023_py310/plotting_via.py:3242: UserWarning: No data for colormapping provided via 'c'. Parameters 'cmap' will be ignored
  sct = ax.scatter(node_pos[:, 0], node_pos[:, 1],

../_images/61b4179f8c857eeca5452408865fa0b1fa276d975eb90e02073af6fffc0da33e.png

../_images/c204a72eea8ea489ef38544efbd7c46f3403d93eac4ac20db30f412b2d5ffd31.png

../_images/1a2d8ca1976a1be6b7c0e69a64b8a44c56ee2c14dbf27b1f65af45f7617f5309.png

Plot single-cell lineage probabilities

Memory

The lineage probability pathways below are computed based on a memory parameter value of 10 (set when we initialized Via), this is a high degree of memory. The drawback of too high memory can be that some populations are difficult to reach depending on the connectivity of the network. You can always try lowering the value to e.g. 10 or 50.
If you run Via at memory=1 (the no memory case), and then plot the lineage probability pathways and gene trends, you will notice that many of the pathways are diffuse (migrating into unrelated cell populations) and the gene trends can overlap and not be very discriminative.

ecto_lineages=[1,71,89]#,136,10]#,83,115] #CNS
meso_lineages = [0,49,67]#,120,145]#,55,37,46,102]

fig, ax = via.plot_atlas_view(via_object=v0, lineage_pathway=meso_lineages,
                               linewidth_bundle=3, alpha_bundle_factor=2, headwidth_bundle=0.3,
                               size_scatter=0.1, alpha_scatter=0.5, use_sc_labels_sequential_for_direction=True)

fig.set_size_inches(30 , 8)

fig, ax = via.plot_atlas_view(via_object=v0, lineage_pathway=ecto_lineages,
                               linewidth_bundle=3, alpha_bundle_factor=2, headwidth_bundle=0.3,
                               size_scatter=0.1, alpha_scatter=0.5, use_sc_labels_sequential_for_direction=True)
fig.set_size_inches(30 , 8)
fig, ax = via.draw_sc_lineage_probability(via_object=v0, alpha_factor=0.6, scatter_size=4,
                                          marker_lineages=meso_lineages)
fig.set_size_inches(30 , 8)
fig, ax = via.draw_sc_lineage_probability(via_object=v0, alpha_factor=0.6, scatter_size=4,
                                          marker_lineages=ecto_lineages)
fig.set_size_inches(30 , 8)
plt.show()

2023-09-25 09:15:19.086314	Computing Edges
2023-09-25 09:15:19.086621	Start finding milestones

/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(

2023-09-25 09:15:49.264351	End milestones with 150
2023-09-25 09:15:50.147607	Recompute weights
2023-09-25 09:15:50.186651	pruning milestone graph based on recomputed weights
2023-09-25 09:15:50.188110	Graph has 1 connected components before pruning
2023-09-25 09:15:50.188877	Graph has 1 connected components after pruning
2023-09-25 09:15:50.189092	Graph has 1 connected components after reconnecting
2023-09-25 09:15:50.191130	regenerate igraph on pruned edges
2023-09-25 09:15:50.266700	Setting numeric label as time_series_labels or other sequential metadata for coloring edges
2023-09-25 09:15:50.554701	Making smooth edges
location of 0 is at [0] and 0
location of 49 is at [2] and 2
location of 67 is at [3] and 3
2023-09-25 09:16:25.367232	Computing Edges
2023-09-25 09:16:25.367304	Start finding milestones

/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(

2023-09-25 09:16:53.996278	End milestones with 150
2023-09-25 09:16:54.899173	Recompute weights
2023-09-25 09:16:54.939138	pruning milestone graph based on recomputed weights
2023-09-25 09:16:54.940593	Graph has 1 connected components before pruning
2023-09-25 09:16:54.941371	Graph has 1 connected components after pruning
2023-09-25 09:16:54.941591	Graph has 1 connected components after reconnecting
2023-09-25 09:16:54.943708	regenerate igraph on pruned edges
2023-09-25 09:16:55.014257	Setting numeric label as time_series_labels or other sequential metadata for coloring edges
2023-09-25 09:16:55.267735	Making smooth edges
location of 1 is at [5] and 5
location of 71 is at [6] and 6
location of 89 is at [7] and 7
2023-09-25 09:17:28.852338	Marker_lineages: [0, 49, 67]
2023-09-25 09:17:28.852410	Automatically setting embedding to via_object.embedding
2023-09-25 09:17:28.966372	The number of components in the original full graph is 1
2023-09-25 09:17:28.966428	For downstream visualization purposes we are also constructing a low knn-graph 
2023-09-25 09:18:33.313344	Check sc pb 1.0 
f getting majority comp
2023-09-25 09:18:39.231618	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 0: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 0]
2023-09-25 09:18:39.231657	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 2: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 15, 34, 2]
2023-09-25 09:18:39.231674	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 49: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 137, 49]
2023-09-25 09:18:39.231688	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 67: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 7, 67]
2023-09-25 09:18:39.231702	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 120: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 0, 120]
2023-09-25 09:18:39.231713	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 1: [97, 3, 17, 61, 113, 80, 53, 1]
2023-09-25 09:18:39.231724	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 71: [97, 3, 17, 61, 113, 80, 53, 1, 71]
2023-09-25 09:18:39.231736	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 89: [97, 3, 17, 9, 92, 50, 8, 23, 32, 89]
2023-09-25 09:18:39.231748	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 136: [97, 3, 17, 61, 113, 80, 53, 1, 6, 48, 136]
2023-09-25 09:18:39.231761	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 10: [97, 3, 52, 45, 68, 27, 59, 31, 109, 95, 141, 44, 90, 10]
2023-09-25 09:18:39.231772	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 83: [97, 3, 52, 45, 68, 27, 59, 31, 83]
2023-09-25 09:18:39.231785	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 145: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 15, 145]
2023-09-25 09:18:39.231796	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 55: [97, 25, 41, 24, 14, 65, 20, 55]
2023-09-25 09:18:39.231806	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 46: [97, 54, 70, 117, 86, 46]
2023-09-25 09:18:39.231815	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 102: [97, 54, 70, 96, 76, 102]
2023-09-25 09:18:39.231826	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 127: [97, 3, 17, 9, 92, 50, 8, 23, 32, 22, 127]
2023-09-25 09:18:39.231839	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 105: [97, 3, 17, 9, 92, 50, 8, 23, 32, 22, 128, 107, 105]
2023-09-25 09:18:39.231851	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 69: [97, 3, 52, 45, 68, 27, 59, 31, 109, 130, 69]
2023-09-25 09:18:39.231864	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 90: [97, 3, 52, 45, 68, 27, 59, 31, 109, 95, 141, 44, 90]
2023-09-25 09:18:39.231875	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 95: [97, 3, 52, 45, 68, 27, 59, 31, 109, 95]
2023-09-25 09:18:39.357220	Revised Cluster level path on sc-knnGraph from Root Cluster 97 to Terminal Cluster 0 along path: [54, 3, 53, 0, 0, 0, 0, 0, 0]
2023-09-25 09:18:39.440505	Revised Cluster level path on sc-knnGraph from Root Cluster 97 to Terminal Cluster 49 along path: [54, 54, 54, 75, 120, 49, 49, 49, 49]
2023-09-25 09:18:39.536499	Revised Cluster level path on sc-knnGraph from Root Cluster 97 to Terminal Cluster 67 along path: [54, 54, 54, 54, 54, 28, 7, 7, 7, 7]
2023-09-25 09:18:39.552106	Marker_lineages: [1, 71, 89]
2023-09-25 09:18:39.552142	Automatically setting embedding to via_object.embedding
2023-09-25 09:18:39.666859	The number of components in the original full graph is 1
2023-09-25 09:18:39.666929	For downstream visualization purposes we are also constructing a low knn-graph 
2023-09-25 09:19:43.632672	Check sc pb 1.0 
f getting majority comp
2023-09-25 09:19:49.565437	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 0: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 0]
2023-09-25 09:19:49.565473	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 2: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 15, 34, 2]
2023-09-25 09:19:49.565489	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 49: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 137, 49]
2023-09-25 09:19:49.565504	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 67: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 7, 67]
2023-09-25 09:19:49.565518	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 120: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 0, 120]
2023-09-25 09:19:49.565529	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 1: [97, 3, 17, 61, 113, 80, 53, 1]
2023-09-25 09:19:49.565540	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 71: [97, 3, 17, 61, 113, 80, 53, 1, 71]
2023-09-25 09:19:49.565553	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 89: [97, 3, 17, 9, 92, 50, 8, 23, 32, 89]
2023-09-25 09:19:49.565569	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 136: [97, 3, 17, 61, 113, 80, 53, 1, 6, 48, 136]
2023-09-25 09:19:49.565583	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 10: [97, 3, 52, 45, 68, 27, 59, 31, 109, 95, 141, 44, 90, 10]
2023-09-25 09:19:49.565594	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 83: [97, 3, 52, 45, 68, 27, 59, 31, 83]
2023-09-25 09:19:49.565606	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 145: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 15, 145]
2023-09-25 09:19:49.565617	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 55: [97, 25, 41, 24, 14, 65, 20, 55]
2023-09-25 09:19:49.565627	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 46: [97, 54, 70, 117, 86, 46]
2023-09-25 09:19:49.565637	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 102: [97, 54, 70, 96, 76, 102]
2023-09-25 09:19:49.565648	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 127: [97, 3, 17, 9, 92, 50, 8, 23, 32, 22, 127]
2023-09-25 09:19:49.565661	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 105: [97, 3, 17, 9, 92, 50, 8, 23, 32, 22, 128, 107, 105]
2023-09-25 09:19:49.565673	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 69: [97, 3, 52, 45, 68, 27, 59, 31, 109, 130, 69]
2023-09-25 09:19:49.565685	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 90: [97, 3, 52, 45, 68, 27, 59, 31, 109, 95, 141, 44, 90]
2023-09-25 09:19:49.565696	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 95: [97, 3, 52, 45, 68, 27, 59, 31, 109, 95]
2023-09-25 09:19:49.689435	Revised Cluster level path on sc-knnGraph from Root Cluster 97 to Terminal Cluster 1 along path: [54, 3, 1, 1, 1, 1]
2023-09-25 09:19:49.777811	Revised Cluster level path on sc-knnGraph from Root Cluster 97 to Terminal Cluster 71 along path: [54, 3, 80, 71, 71]
2023-09-25 09:19:49.872623	Revised Cluster level path on sc-knnGraph from Root Cluster 97 to Terminal Cluster 89 along path: [54, 3, 1, 89]

../_images/5cc4825cbd4bf57958a003485d72909422f05dc1aeae4ba95bab5a3275d4bf06.png

../_images/d235278cfc70c2fa2643763f5bdf9c2206e68166a24ca30ee8e4f9fd9b408146.png

../_images/176c6f88d9d16c28da8d94500d327f22ae7445242049e34d1fe2c2714412ec2e.png

../_images/643b20d310e699d8933ba1710de2ce641eed09ee0acd0c34ba18ae0d49c058dc.png

Differentiation Flow

Generates a differentiation flow diagram based on the Via2.0 clustergraph. The second plot in each function call has the details by parc cluster and major cell population. In this case, we assign a different root node for each set of lineages.

ecto_lineages=[1,22,71,89,136,10,83,115] #CNS
meso_lineages = [0,49,67,120,145,55,37,46,102]
via.draw_differentiation_flow(via_object=v0, marker_lineages=meso_lineages, label_node=coarse_fine_cell_type,
                              do_log_flow=True, root_cluster_list=[97])  # cluster corresponding to root

via.draw_differentiation_flow(via_object=v0, marker_lineages=ecto_lineages, label_node=coarse_fine_cell_type,
                              do_log_flow=True, root_cluster_list=[52])  # cluster corresponding to a suitable CNS root
plt.show()

2023-09-25 08:49:15.613940	Marker_lineages: [0, 49, 67, 120, 145, 55, 37, 46, 102]
automatically setting embedding to via_object.embedding
2023-09-25 08:49:15.614169	Start dictionary modes
2023-09-25 08:49:15.852553	End dictionary modes
2023-09-25 08:49:18.016249	Check sc pb 1.0 
2023-09-25 08:49:25.668206	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 0: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 0]

2023-09-25 08:49:25.741354	Holoviews for TC 0
dest 0, is at loc 0 on the bp_array
2023-09-25 08:49:25.741804	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 49: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 137, 49]

2023-09-25 08:49:25.806408	Holoviews for TC 49
dest 49, is at loc 2 on the bp_array
2023-09-25 08:49:25.807038	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 67: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 7, 67]

2023-09-25 08:49:25.840009	Holoviews for TC 67
dest 67, is at loc 3 on the bp_array
2023-09-25 08:49:25.840436	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 120: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 11, 58, 0, 120]

2023-09-25 08:49:25.889504	Holoviews for TC 120
dest 120, is at loc 4 on the bp_array
2023-09-25 08:49:25.889883	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 145: [97, 25, 41, 24, 14, 65, 20, 110, 36, 26, 43, 79, 15, 145]

2023-09-25 08:49:25.941053	Holoviews for TC 145
dest 145, is at loc 11 on the bp_array
2023-09-25 08:49:25.941432	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 55: [97, 25, 41, 24, 14, 65, 20, 55]

2023-09-25 08:49:26.001078	Holoviews for TC 55
dest 55, is at loc 12 on the bp_array
2023-09-25 08:49:26.001825	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 37: [97, 54, 70, 57, 125, 37]

2023-09-25 08:49:26.052467	Holoviews for TC 37
2023-09-25 08:49:26.052785	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 46: [97, 54, 70, 117, 86, 46]

2023-09-25 08:49:26.113529	Holoviews for TC 46
dest 46, is at loc 13 on the bp_array
2023-09-25 08:49:26.114352	Cluster path on clustergraph starting from Root Cluster 97 to Terminal Cluster 102: [97, 54, 70, 96, 76, 102]

2023-09-25 08:49:26.179086	Holoviews for TC 102
dest 102, is at loc 14 on the bp_array
2023-09-25 08:49:26.182116	Start sankey
2023-09-25 08:49:26.272915	make sankey color dict
2023-09-25 08:49:26.274535	set options and render

2023-09-25 08:49:27.354921	Marker_lineages: [1, 22, 71, 89, 136, 10, 83, 115]
automatically setting embedding to via_object.embedding
2023-09-25 08:49:27.355000	Start dictionary modes
2023-09-25 08:49:27.545101	End dictionary modes
2023-09-25 08:49:29.465100	Check sc pb 1.0 
2023-09-25 08:49:35.972127	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 1: [52, 3, 17, 61, 113, 80, 53, 1]

2023-09-25 08:49:36.004311	Holoviews for TC 1
dest 1, is at loc 5 on the bp_array
2023-09-25 08:49:36.004850	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 22: [52, 3, 17, 9, 92, 50, 8, 23, 32, 22]

2023-09-25 08:49:36.043009	Holoviews for TC 22
2023-09-25 08:49:36.043484	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 71: [52, 3, 17, 61, 113, 80, 53, 1, 71]

2023-09-25 08:49:36.088468	Holoviews for TC 71
dest 71, is at loc 6 on the bp_array
2023-09-25 08:49:36.089030	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 89: [52, 3, 17, 9, 92, 50, 8, 23, 32, 89]

2023-09-25 08:49:36.139421	Holoviews for TC 89
dest 89, is at loc 7 on the bp_array
2023-09-25 08:49:36.139808	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 136: [52, 3, 17, 61, 113, 80, 53, 1, 6, 48, 136]

2023-09-25 08:49:36.198432	Holoviews for TC 136
dest 136, is at loc 8 on the bp_array
2023-09-25 08:49:36.198699	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 10: [52, 45, 68, 27, 59, 31, 109, 95, 141, 44, 90, 10]

2023-09-25 08:49:36.264525	Holoviews for TC 10
dest 10, is at loc 9 on the bp_array
2023-09-25 08:49:36.264947	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 83: [52, 45, 68, 27, 59, 31, 83]

2023-09-25 08:49:36.322102	Holoviews for TC 83
dest 83, is at loc 10 on the bp_array
2023-09-25 08:49:36.322475	Cluster path on clustergraph starting from Root Cluster 52 to Terminal Cluster 115: [52, 45, 68, 27, 98, 138, 142, 66, 78, 115]

2023-09-25 08:49:36.379335	Holoviews for TC 115
2023-09-25 08:49:36.381979	Start sankey
2023-09-25 08:49:36.459963	make sankey color dict
2023-09-25 08:49:36.462874	set options and render