I find this very useful for quickly visualizing interaction data (here, genes) supplied as a CSV file.
Data file [a.csv]
APC,TP73
BARD1,BRCA1
BARD1,ESR1
BARD1,KRAS2
BARD1,SLC22A18
BARD1,TP53
BRCA1,BRCA2
BRCA1,CHEK2
BRCA1,MLH1
BRCA1,PHB
BRCA2,CHEK2
BRCA2,TP53
CASP8,ESR1
CASP8,KRAS2
CASP8,PIK3CA
CASP8,SLC22A18
CDK2,CDKN1A
CHEK2,CDK2
ESR1,BRCA1
ESR1,KRAS2
ESR1,PPM1D
ESR1,SLC22A18
KRAS2,BRCA1
MLH1,CHEK2
MLH1,PMS2
PIK3CA,BRCA1
PIK3CA,ESR1
PIK3CA,RB1CC1
PIK3CA,SLC22A18
PMS2,TP53
PTEN,BRCA1
PTEN,MLH3
RAD51,BRCA1
RB1CC1,SLC22A18
SLC22A18,BRCA1
TP53,PTEN
Python 3.7 venv
import networkx as nx
import matplotlib.pyplot as plt
G = nx.read_edgelist("a.csv", delimiter=",")
G.edges()
'''
[('CDKN1A', 'CDK2'), ('MLH3', 'PTEN'), ('TP73', 'APC'), ('CHEK2', 'MLH1'),
('CHEK2', 'BRCA2'), ('CHEK2', 'CDK2'), ('CHEK2', 'BRCA1'), ('BRCA2', 'TP53'),
('BRCA2', 'BRCA1'), ('KRAS2', 'CASP8'), ('KRAS2', 'ESR1'), ('KRAS2', 'BRCA1'),
('KRAS2', 'BARD1'), ('PPM1D', 'ESR1'), ('BRCA1', 'PHB'), ('BRCA1', 'ESR1'),
('BRCA1', 'PIK3CA'), ('BRCA1', 'PTEN'), ('BRCA1', 'MLH1'), ('BRCA1', 'SLC22A18'),
('BRCA1', 'BARD1'), ('BRCA1', 'RAD51'), ('CASP8', 'ESR1'), ('CASP8', 'SLC22A18'),
('CASP8', 'PIK3CA'), ('TP53', 'PMS2'), ('TP53', 'PTEN'), ('TP53', 'BARD1'),
('PMS2', 'MLH1'), ('PIK3CA', 'SLC22A18'), ('PIK3CA', 'ESR1'), ('PIK3CA', 'RB1CC1'),
('SLC22A18', 'ESR1'), ('SLC22A18', 'RB1CC1'), ('SLC22A18', 'BARD1'),
('BARD1', 'ESR1')]
'''
G.number_of_edges()
G.nodes()
'''
['CDKN1A', 'MLH3', 'TP73', 'CHEK2', 'BRCA2', 'KRAS2', 'CDK2', 'PPM1D', 'BRCA1',
'CASP8', 'TP53', 'PMS2', 'RAD51', 'PIK3CA', 'MLH1', 'SLC22A18', 'BARD1',
'PHB', 'APC', 'ESR1', 'RB1CC1', 'PTEN']
'''
G.number_of_nodes()
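As a sanity check on what `G.number_of_edges()` and `G.number_of_nodes()` report, the same kind of file can be tallied with the standard library alone. This is only a sketch, not part of the original workflow: the helper name `summarize_edges` is made up for illustration, and the inline sample is a few rows from a.csv plus one reversed duplicate, so the snippet runs without the file. Self-loops are not treated specially.

```python
import csv
import io
from collections import Counter


def summarize_edges(handle):
    """Return (nodes, edges, degree) for a two-column CSV edge list.

    Edges are treated as undirected, so "A,B" and "B,A" count once,
    which is also how an undirected nx.Graph stores them.
    """
    edges = set()
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        u, v = row[0].strip(), row[1].strip()
        edges.add(frozenset((u, v)))

    degree = Counter()
    for edge in edges:
        for node in edge:
            degree[node] += 1
    return set(degree), edges, degree


# A few rows from a.csv, plus "BRCA1,BARD1" as a reversed duplicate.
sample = io.StringIO(
    "APC,TP73\nBARD1,BRCA1\nBARD1,ESR1\nESR1,BRCA1\nBRCA1,BARD1\n"
)
nodes, edges, degree = summarize_edges(sample)
print(len(nodes), len(edges))  # 5 4
```

To run it on the real file, pass `open("a.csv", newline="")` instead of the `StringIO` sample. On the full a.csv listed above it should report 22 nodes and 36 edges, matching the 22 names and 36 pairs in the listings.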
Update
This used to work (2018-03), but now (2019-12) it fails with a pygraphviz
import error:
from networkx.drawing.nx_agraph import graphviz_layout
nx.draw(G, pos = graphviz_layout(G), node_size=1200, node_color='lightblue', \
linewidths=0.25, font_size=10, font_weight='bold', with_labels=True)
Traceback (most recent call last):
...
ImportError: libpython3.7m.so.1.0: cannot open shared object file:
No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
ImportError: ('requires pygraphviz ', 'http://pygraphviz.github.io/')
Solution
Outside Python (at the venv terminal prompt, $), install pydot.
pip install pydot
Back in Python, run the following code.
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
import networkx as nx
import matplotlib.pyplot as plt
G = nx.read_edgelist("a.csv", delimiter=",")
nx.draw(G, pos = nx.nx_pydot.graphviz_layout(G), node_size=1200, \
node_color='lightblue', linewidths=0.25, font_size=10, \
font_weight='bold', with_labels=True)
plt.show()
The main change was to replace
nx.draw(G, pos = graphviz_layout(G), ...)
with
nx.draw(G, pos = nx.nx_pydot.graphviz_layout(G), ...)
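Since which Graphviz binding can be imported depends on the environment, one option is to try them in order and fall back to a pure-NetworkX layout. This is a sketch under assumptions: `pick_layout` is a hypothetical helper, not a NetworkX function, and the final fallback to `nx.spring_layout` is an addition that produces a different-looking layout than Graphviz.

```python
import networkx as nx


def pick_layout(G):
    """Return node positions, preferring Graphviz layouts when available."""
    candidates = (
        lambda g: nx.nx_agraph.graphviz_layout(g),  # needs pygraphviz
        lambda g: nx.nx_pydot.graphviz_layout(g),   # needs pydot + Graphviz
    )
    for layout in candidates:
        try:
            pos = layout(G)
        except Exception:
            # ImportError when the binding is missing; other errors when
            # the Graphviz executables themselves cannot be run.
            continue
        if pos:  # some versions return nothing when Graphviz fails
            return pos
    # Pure-Python fallback; the seed is fixed so repeated runs look the same.
    return nx.spring_layout(G, seed=42)


# Toy graph built from three pairs that appear in a.csv.
G = nx.Graph([("BARD1", "BRCA1"), ("BRCA1", "BRCA2"), ("BRCA2", "TP53")])
pos = pick_layout(G)
print(sorted(pos))
```

The result can be passed straight to `nx.draw(G, pos=pos, ...)` in place of the explicit `graphviz_layout` call.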
References
Remove matplotlib deprecation warning
What could cause NetworkX & PyGraphViz to work fine alone but not together?
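Improving the plot layout
Reducing congestion in these static networkx/matplotlib plots is difficult; one workaround is to increase the figure size, per this StackOverflow Q&A: High Resolution Image of a Graph using NetworkX and Matplotlib: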
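plt.figure(figsize=(20,14))
# dpi is not a drawing keyword for nx.draw (newer NetworkX versions reject
# unknown keywords); set resolution via plt.figure(...) or plt.savefig(...).
nx.draw(G, pos = nx.nx_pydot.graphviz_layout(G), \
node_size=1200, node_color='lightblue', linewidths=0.25, \
font_size=10, font_weight='bold', with_labels=True)
plt.show()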
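If the goal is a high-resolution image file rather than a larger on-screen window, the resolution is usually set when the figure is saved. A minimal sketch, assuming a PNG is wanted: the helper name `save_graph_png`, the output file name, and the dpi values are illustrative, and `nx.spring_layout` stands in for the Graphviz layout so the snippet runs without pydot. The `Agg` backend line is only there so it works without a display; drop it for interactive use.

```python
import os

import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
import networkx as nx


def save_graph_png(G, path, figsize=(20, 14), dpi=150):
    """Draw G on a large canvas and write it to `path` at the given dpi."""
    fig, ax = plt.subplots(figsize=figsize)
    # Assumption: spring_layout keeps this runnable anywhere; swap in a
    # Graphviz layout if one is available.
    pos = nx.spring_layout(G, seed=42)
    nx.draw(G, pos=pos, ax=ax, node_size=1200, node_color="lightblue",
            linewidths=0.25, font_size=10, font_weight="bold",
            with_labels=True)
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(fig)  # release the figure so later plots start fresh
    return path


# Toy graph built from three pairs that appear in a.csv.
G = nx.Graph([("BARD1", "BRCA1"), ("BRCA1", "BRCA2"), ("BRCA2", "TP53")])
out = save_graph_png(G, "graph_hires.png", figsize=(8, 6), dpi=100)
print(os.path.getsize(out), "bytes written")
```

Pixel dimensions scale as figsize times dpi, so a 20 x 14 inch figure at a very high dpi gets large quickly.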
To reset the output figure size to the system default:
plt.figure()
Bonus: shortest path
nx.dijkstra_path(G, 'CDKN1A', 'MLH3')
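Because the edges in a.csv carry no weights, `nx.dijkstra_path` treats every edge as cost 1, so the result is a fewest-hops path. A plain breadth-first search finds a path of the same length. The sketch below uses only the standard library; `bfs_path` is a hypothetical helper name, and the edge list is a subset of a.csv chosen to cover the route between the two genes.

```python
from collections import deque


def bfs_path(edges, source, target):
    """Return one fewest-hops path from source to target, or None."""
    # Build an undirected adjacency map from the (u, v) pairs.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    if source not in adjacency or target not in adjacency:
        return None

    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in sorted(adjacency[node]):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Subset of the pairs in a.csv.
edges = [
    ("CDK2", "CDKN1A"), ("CHEK2", "CDK2"), ("BRCA1", "CHEK2"),
    ("BRCA2", "CHEK2"), ("BRCA2", "TP53"), ("TP53", "PTEN"),
    ("PTEN", "BRCA1"), ("PTEN", "MLH3"),
]
print(bfs_path(edges, "CDKN1A", "MLH3"))
# ['CDKN1A', 'CDK2', 'CHEK2', 'BRCA1', 'PTEN', 'MLH3']
```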
[Figures: plot1.png, plot2.png]
Although not done here, if you want to add node borders and thicken the border lines (node edge thickness: linewidths), do the following.
nx.draw(G, pos = nx.nx_pydot.graphviz_layout(G), \
node_size=1200, node_color='lightblue', linewidths=2.0, \
font_size=10, font_weight='bold', with_labels=True)
ax = plt.gca()
ax.collections[0].set_edgecolor('r')
plt.show()
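An alternative that avoids reaching into `ax.collections` is the `edgecolors` keyword of `nx.draw_networkx_nodes`. This is a hedged sketch: it assumes a reasonably recent NetworkX version that supports `edgecolors`, and it uses `nx.circular_layout` on a toy graph only to stay runnable without Graphviz. The `Agg` backend line is there so it works without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import networkx as nx

# Toy graph built from three pairs that appear in a.csv.
G = nx.Graph([("BARD1", "BRCA1"), ("BRCA1", "BRCA2"), ("BRCA2", "TP53")])
pos = nx.circular_layout(G)  # assumption: stand-in for the Graphviz layout

fig, ax = plt.subplots()
# edgecolors sets the node border colour; linewidths sets its thickness.
nodes = nx.draw_networkx_nodes(G, pos, ax=ax, node_size=1200,
                               node_color="lightblue", linewidths=2.0,
                               edgecolors="r")
nx.draw_networkx_edges(G, pos, ax=ax)
nx.draw_networkx_labels(G, pos, ax=ax, font_size=10, font_weight="bold")
ax.set_axis_off()

# The returned collection carries the border colour that was applied.
border = tuple(nodes.get_edgecolor()[0])
print(border)
```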