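I want to use Pandas to read a CSV file that contains nodes and their attributes. Not every node has every attribute, and missing attributes are simply left empty in the CSV file. When Pandas reads the CSV file, the missing values show up as nan. I want to add the nodes in bulk from the dataframe, but without adding the nan attributes.
For example, here is a sample CSV file named mwe.csv: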
Name,Cost,Depth,Class,Mean,SD,CST,SL,Time
Manuf_0001,39.00,1,Manuf,,,12,,10.00
Manuf_0002,36.00,1,Manuf,,,8,,10.00
Part_0001,12.00,2,Part,,,,,28.00
Part_0002,5.00,2,Part,,,,,15.00
Part_0003,9.00,2,Part,,,,,10.00
Retail_0001,0.00,0,Retail,253,36.62,0,0.95,0.00
Retail_0002,0.00,0,Retail,45,1,0,0.95,0.00
Retail_0003,0.00,0,Retail,75,2,0,0.95,0.00
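To make the problem concrete, here is a minimal sketch that loads a few of the rows above and counts the empty cells per node. It assumes the CSV text is inlined through io.StringIO instead of being read from mwe.csv, purely so the snippet is self-contained; the variable names csv_text and missing_per_row are only for illustration.

```python
import io
import pandas as pd

# Assumption: a subset of mwe.csv is inlined here so the snippet runs on its own.
csv_text = """Name,Cost,Depth,Class,Mean,SD,CST,SL,Time
Manuf_0001,39.00,1,Manuf,,,12,,10.00
Part_0001,12.00,2,Part,,,,,28.00
Retail_0001,0.00,0,Retail,253,36.62,0,0.95,0.00
"""

node_df = pd.read_csv(io.StringIO(csv_text))

# Empty fields are parsed as NaN, so any column with gaps is read as float64.
missing_per_row = node_df.set_index('Name').isna().sum(axis=1)
print(missing_per_row)
```

The Manuf and Part rows have several empty attribute columns, while the Retail row is complete.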
Here is how I currently handle it:
import pandas as pd
import numpy as np
import networkx as nx
node_df = pd.read_csv('mwe.csv')
graph = nx.DiGraph()
graph.add_nodes_from(node_df['Name'])
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['Cost'])), 'nodeCost')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['Mean'])), 'avgDemand')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['SD'])), 'sdDemand')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['CST'])), 'servTime')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['SL'])), 'servLevel')
# Loop through all nodes and all attributes and remove NaNs.
for i in graph.nodes:
    for k, v in list(graph.nodes[i].items()):
        if np.isnan(v):
            del graph.nodes[i][k]
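It works, but it is clunky. Is there a better way, for example, to avoid adding the nan attributes when the nodes are added, rather than deleting them afterwards?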
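One direction that might work, shown only as a minimal sketch and not as a definitive solution: build one attribute dict per row, drop the missing entries with Series.dropna(), and pass (node, attribute_dict) pairs to add_nodes_from. The column_to_attr mapping and the inlined CSV text are assumptions made for illustration; the attribute names are the ones used in the code above.

```python
import io
import pandas as pd
import networkx as nx

# Assumption: a subset of mwe.csv is inlined here so the snippet runs on its own.
csv_text = """Name,Cost,Depth,Class,Mean,SD,CST,SL,Time
Manuf_0001,39.00,1,Manuf,,,12,,10.00
Part_0001,12.00,2,Part,,,,,28.00
Retail_0001,0.00,0,Retail,253,36.62,0,0.95,0.00
"""
node_df = pd.read_csv(io.StringIO(csv_text))

# Hypothetical mapping from CSV column names to node attribute names.
column_to_attr = {
    'Cost': 'nodeCost',
    'Mean': 'avgDemand',
    'SD': 'sdDemand',
    'CST': 'servTime',
    'SL': 'servLevel',
}

# Keep only the attribute columns, indexed by node name, with the new names.
attrs = node_df.set_index('Name')[list(column_to_attr)].rename(columns=column_to_attr)

graph = nx.DiGraph()
# add_nodes_from accepts (node, attribute_dict) pairs; dropna() removes the
# missing values from each row before the node is created.
graph.add_nodes_from(
    (name, row.dropna().to_dict()) for name, row in attrs.iterrows()
)

print(dict(graph.nodes(data=True)))
```

Because each row is handled as a float Series, the values are stored as floats (for example, servTime becomes 12.0 rather than 12). Iterating row by row is also slower than vectorized operations, which should not matter for small node tables.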