我有一个pandas数据框,希望在一个使用多进程的函数中绘制它的切片。虽然当我单独调用"process_expression"函数时它可以工作,但是当我使用"multiprocessing"选项时它并没有生成任何图形。
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy
import seaborn as sns
import sys
from multiprocessing import Pool
import os
os.system("taskset -p 0xff %d" % os.getpid())
pool = Pool()
gn = pool.map(process_expression, gene_ids)
pool.close()
pool.join()
def process_expression(gn_name, df_gn=df_coding):
df_part = df_gn.loc[df_gn['Gene_id'] == gn_name]
df_part = df_part.drop('Gene_id', 1)
df_part = df_part.drop('Transcript_biotype', 1)
COUNT100 = df_part[df_part >100 ].count()
COUNT10 = (df_part[df_part >10 ].count()) - COUNT100
COUNT1 = (df_part[df_part >1].count())- COUNT100 - COUNT10
COUNT0 = (df_part[df_part >0].count())- COUNT100-COUNT10- COUNT1
result = pd.concat([COUNT0,COUNT1,COUNT10,COUNT100], axis=1)
result.columns = [ '0 TO 1', '1 TO 10','10 TO 100', '>100']
result.plot( kind='bar', figsize=(50, 20), fontsize=7, stacked=True)
plt.savefig('./expression_levels/all_genes/'+gn_name+'.png')#,bbox_inches='tight')
plt.close()
df_coding表类似于以下内容(实际上有更多列,我已经删除了一些):
Isoform_name,heart,heart.1,lung.3,Gene_id,Transcript_biotype
ENST00000296782,0.14546900000000001,0.161245,0.09479889999999999,ENSG00000164327,protein_coding
ENST00000357387,6.53902,5.86969,7.057689999999999,ENSG00000164327,protein_coding
ENST00000514735,0.0,0.0,0.0,ENSG00000164327,protein_coding
输入的数据框 df_coding 是一个带有 Gene_id 列的数据框。在这一列中,我有一个 gn_name 的列表。我想要做的是每次只取出数据框中 Gene_id 列中包含 gn_name[i] 名称的部分,并基于此数据框绘制条形图。
例如,如果我调用 'process_expression('ENSG00000164327')',这是一个特定的 gn_name,输出结果将类似于下面这个图片:
![This is the barplot of the dataframe if the gn_name is ENSG00000164327](https://istack.dev59.com/zthdj.webp)