Ara_data.hdf5
===================================================================================================
Contains:
    counts: (matrix) 22271 x 16641 - Merged Arabidopsis scRNA-seq expression data from 4 sources
    coldata: (dataframe) 16641 x 6 - Cell metadata
    rowdata: (dataframe) 22271 x 1 - Gene metadata
    embedding: (dataframe) 16641 x 3 - UMAP embedding

TM_data.hdf5
===================================================================================================
Contains:
    counts: (matrix) 23433 x 44949 - Merged Tabula Muris Smart-seq expression data
    coldata: (dataframe) 44949 x 8 - Cell metadata
    rowData: (dataframe) 23433 x 2 - Gene metadata
    embedding: (dataframe) 44949 x 3 - tSNE embedding

20200625_tagr_nodups.Rdata
===================================================================================================
Contains:
    oa.mat: (dataframe) 271,068 x 18 - Table of 1-to-1 orthologs between a pair of species (gene symbol and Ensembl IDs), the number of algorithms predicting the orthologs, and their coexpression conservation and specificity scores

spe37_divergence_timetree.csv
===================================================================================================
Contains:
    counts: (matrix) 37 x 37 - Upper triangular matrix of estimated divergence times for all pairs of 37 species, from timetree.org

scripts/
===================================================================================================
Contains:
    ortholog_scores_1to1_topN.R: (script) - Functions to calculate coexpression conservation and specificity scores using only 1-to-1 orthologs, given a species pair (species1, species2)
    ortholog_scores_manytomany_topN.R: (script) - Functions to calculate coexpression conservation and specificity scores using many-to-many orthologs, given a species pair (species1, species2)

geneInfo/
===================================================================================================
Contains:
    _info.csv: (dataframe) - Table of genes present in the species' coexpression network and their corresponding IDs in other databases (e.g., Entrez, Ensembl, OrthoDB)

orthoNets1-1/
===================================================================================================
Contains:
    -orthologMatrix.Rdata:
        - net11: Dataframe listing the 1-to-1 orthologs shared by the species pair

orthoNetsN-M/
===================================================================================================
Contains:
    __NM_orthologmat.Rdata:
        - om: Matrix mapping many-to-many orthologs across the species pair. Rows and columns contain OrthoDB IDs of genes in species2 and species1, respectively. An entry of 1 indicates that the two genes are many-to-many orthologs (either 1-to-1 or part of a gene family); an entry of 0 indicates that they are not.

coexp_cons/
===================================================================================================
Contains:
    __CoCoBLAST_scores.hdf5:
        - sp1_netid: List of genes from the coexpression network of species1 (gene symbol or Ensembl IDs)
        - sp1_orthoid: OrthoDB IDs corresponding to the genes in sp1_netid
        - sp2_netid: List of genes from the coexpression network of species2 (gene symbol or Ensembl IDs)
        - sp2_orthoid: OrthoDB IDs corresponding to the genes in sp2_netid
        - ortho_map: Dataframe of many-to-many orthologs in the species pair
        - fc_scores: x Matrix of coexpression conservation scores for all gene pairs across species. Scores for many-to-many orthologs can be shortlisted by combining this matrix with ortho_map.
        - sc_scores: x Matrix of coexpression conservation specificity scores for all gene pairs across species. Scores for many-to-many orthologs can be shortlisted by combining this matrix with ortho_map.
        - chunkSize: Chunk size used to store the dataset
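
The fc_scores and sc_scores entries say that scores for many-to-many orthologs can be shortlisted by combining the score matrix with ortho_map. The snippet below is a minimal sketch of that lookup in Python, using small made-up values in place of the real datasets, which would normally be read from the HDF5 file. It rests on two assumptions that this README does not confirm: that the rows of the score matrix follow sp1_orthoid and the columns follow sp2_orthoid, and that ortho_map can be reduced to (species1 ID, species2 ID) pairs. The function name shortlist_scores and all IDs are hypothetical.

```python
# Toy stand-ins for the datasets described above. All IDs and scores are
# made up for illustration; real values would come from the HDF5 file.
sp1_orthoid = ["og1_a", "og2_a", "og3_a"]  # hypothetical OrthoDB IDs, species1
sp2_orthoid = ["og1_b", "og2_b"]           # hypothetical OrthoDB IDs, species2

# ASSUMPTION: rows follow sp1_orthoid and columns follow sp2_orthoid.
# Check the orientation of the real matrix before relying on this.
fc_scores = [
    [0.91, 0.40],
    [0.35, 0.88],
    [0.52, 0.47],
]

# ASSUMPTION: ortho_map reduces to (species1 ID, species2 ID) pairs.
ortho_map = [("og1_a", "og1_b"), ("og2_a", "og2_b"), ("og3_a", "og2_b")]


def shortlist_scores(scores, row_ids, col_ids, pairs):
    """Return {(row_id, col_id): score} for every pair found in both ID lists."""
    row_index = {gene: i for i, gene in enumerate(row_ids)}
    col_index = {gene: j for j, gene in enumerate(col_ids)}
    shortlisted = {}
    for gene1, gene2 in pairs:
        # Skip pairs whose genes are absent from either network.
        if gene1 in row_index and gene2 in col_index:
            shortlisted[(gene1, gene2)] = scores[row_index[gene1]][col_index[gene2]]
    return shortlisted


ortholog_fc = shortlist_scores(fc_scores, sp1_orthoid, sp2_orthoid, ortho_map)
for pair, score in ortholog_fc.items():
    print(pair, score)
```

The same function would apply to sc_scores under the same assumptions. Building the two index dictionaries once keeps each lookup constant-time, which matters when the score matrix covers every gene pair across the two species.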