The file refseq93.edge.tsv contains 235,545,792 edges, so it is impossible to build a network of all of them on my computer. Therefore, I looked at the relationships between species instead of between individual genomes.
To do so, I first separated the genomes into isolated clusters, such that no genome in one cluster is connected to any genome in another. I wrote a C++ program, getclusters.cc, to handle this job; its output is cluster.txt. Then I rearranged the identification numbers of the clusters with get_cluster_arrange.cc to get cluster_arrange.txt. After that, I removed every cluster that contains only genomes of the same species with get_cluster_reduced.cc, and extracted the edges between genomes in the remaining clusters with get_edges_from_cluster_reduced.cc. Next, I deleted the edges whose two genomes belong to the same species with get_edges_from_cluster_reduced2.cc. Finally, I replaced each genome by its color using get_edges_color.cc, and reduced the size of the result with get_edges_color_reduced.cc.
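The clustering step amounts to finding the connected components of the genome graph. The sketch below shows one common way to do that, a disjoint-set (union-find) structure; it is not the code of getclusters.cc. The names DisjointSet and label_clusters are hypothetical, and genomes are assumed to have already been mapped to integer indices 0..n-1.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Disjoint-set (union-find) over genome indices 0..n-1.
// Hypothetical helper, not taken from getclusters.cc.
class DisjointSet {
public:
    explicit DisjointSet(std::size_t n) : parent_(n), size_(n, 1) {
        // Every genome starts as the representative of its own cluster.
        std::iota(parent_.begin(), parent_.end(), std::size_t{0});
    }

    // Returns the representative of x's cluster, shortening paths on the way.
    std::size_t find(std::size_t x) {
        assert(x < parent_.size());
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // path halving
            x = parent_[x];
        }
        return x;
    }

    // Merges the clusters containing a and b (union by size).
    void unite(std::size_t a, std::size_t b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (size_[a] < size_[b]) std::swap(a, b);
        parent_[b] = a;
        size_[a] += size_[b];
    }

private:
    std::vector<std::size_t> parent_;
    std::vector<std::size_t> size_;
};

// Gives every genome a cluster label. Two genomes get the same label
// exactly when a path of edges connects them.
std::vector<std::size_t> label_clusters(
    std::size_t genome_count,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    DisjointSet ds(genome_count);
    for (const auto& e : edges) ds.unite(e.first, e.second);

    std::vector<std::size_t> label(genome_count);
    for (std::size_t i = 0; i < genome_count; ++i) label[i] = ds.find(i);
    return label;
}
```

With this approach the edges can be streamed from the file one at a time, so only the two per-genome arrays have to fit in memory, not the 235,545,792 edges themselves.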
In the final file, edges_color_reduced.txt, the only remaining connections are between species, which are identified by colors, so the network built from these edges shows the relationships between living species.
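As an illustration of how genome edges can be collapsed into species edges, here is a minimal sketch. It assumes that "reducing the size" means keeping each unordered pair of colors only once; the source does not spell this out. The function collapse_to_color_edges and the type aliases are hypothetical names, and the genome-to-color map is assumed to be loaded beforehand.

```cpp
#include <set>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using GenomeEdge = std::pair<std::string, std::string>;  // two genome IDs
using ColorEdge = std::pair<std::string, std::string>;   // two species colors

// Replaces each genome by its color, drops edges inside one species,
// and keeps every unordered color pair once.
// Assumption: duplicates are what the size-reduction step removes.
std::set<ColorEdge> collapse_to_color_edges(
    const std::vector<GenomeEdge>& genome_edges,
    const std::unordered_map<std::string, std::string>& color_of) {
    std::set<ColorEdge> result;
    for (const auto& e : genome_edges) {
        auto a = color_of.find(e.first);
        auto b = color_of.find(e.second);
        // Skip edges that mention a genome without a known color.
        if (a == color_of.end() || b == color_of.end()) continue;

        const std::string& ca = a->second;
        const std::string& cb = b->second;
        if (ca == cb) continue;  // both genomes belong to the same species

        // Store the pair in sorted order so (x, y) and (y, x) coincide.
        if (ca < cb) {
            result.emplace(ca, cb);
        } else {
            result.emplace(cb, ca);
        }
    }
    return result;
}
```

A std::set is used here only to make the deduplication obvious; for very large inputs, sorting a vector and removing adjacent duplicates would be a reasonable alternative.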
The edges_color_reduced.txt network was rendered in Cytoscape with 8 GB allocated to both heap and stack memory. To illustrate the number of genomes of each species, I wrote get_size.cc to calculate the radius of each node as the square root of the number of genomes belonging to it. In Cytoscape, the node size is a linear function of this radius, ranging from 30 (at the smallest radius) to 100 (at the largest one).
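The size calculation can be sketched as follows. The first function mirrors the square-root rule described for get_size.cc; the second reproduces by hand the kind of linear mapping that Cytoscape applies, only to make the arithmetic explicit. The function names node_radii and display_sizes are hypothetical, and the behavior when all radii are equal (every node gets the minimum size) is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Radius of each node: square root of the number of genomes in the species.
std::vector<double> node_radii(const std::vector<long>& genome_counts) {
    std::vector<double> radii;
    radii.reserve(genome_counts.size());
    for (long count : genome_counts) {
        radii.push_back(std::sqrt(static_cast<double>(count)));
    }
    return radii;
}

// Linear mapping from radius to display size: the smallest radius maps to
// min_size and the largest to max_size (30 and 100 in the text above).
std::vector<double> display_sizes(const std::vector<double>& radii,
                                  double min_size = 30.0,
                                  double max_size = 100.0) {
    std::vector<double> sizes;
    if (radii.empty()) return sizes;

    auto range = std::minmax_element(radii.begin(), radii.end());
    const double lo = *range.first;
    const double hi = *range.second;

    sizes.reserve(radii.size());
    for (double r : radii) {
        // Assumption: if all radii are equal, use the minimum size.
        const double t = (hi > lo) ? (r - lo) / (hi - lo) : 0.0;
        sizes.push_back(min_size + t * (max_size - min_size));
    }
    return sizes;
}
```

For example, species with 1, 4, 16 and 100 genomes get radii 1, 2, 4 and 10, so the first node is drawn at size 30 and the last at size 100, with the others in between.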
The file nodes_color.txt is a condensed version of refseq93.color that keeps only two columns: the genomes and their colors.
Cytoscape 3.7.1 was used to draw the connections between species in the file edges_color_reduced.txt. Each species has its own color. The small label in each node is the name of the corresponding species. I chose the yFiles organic layout for the graph. The network can be downloaded here.
Please click the link to get the files you need. All files are explained in Readme.txt.
Please click the link to view the poster.
