I have an input file:
10N06_64 sc635516 93.93 100.0
10N06_64 sc711028 93.99 100.0
10N06_64 sc255425 93.46 95.8
10N06_64 sc115511 87.5 93.0
116F19_238 sc121016 91.30 12.1
116F19_238 sc1132492 90.94 6.1
116F19_238 sc513573 87.38 6.1
116F19_238 sc68511 75.93 10.5
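I need to group the rows by line[0], iterate within each group, and select the 3 rows with the highest values of line[3] and then line[2] to print, so that the output file looks like this: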
10N06_64 sc635516 93.93 100.0
10N06_64 sc711028 93.99 100.0
10N06_64 sc255425 93.46 95.8
116F19_238 sc121016 91.30 12.1
116F19_238 sc68511 75.93 10.5
116F19_238 sc1132492 90.94 6.1
Here is my attempt, but it only prints the single best row per group. How can I modify it to print the 3 best matches?
import csv
from itertools import groupby
from operator import itemgetter

with open('myfile', 'rb') as f1:
    with open('outfile', 'wb') as f2:
        reader = csv.reader(f1, delimiter='\t')
        writer1 = csv.writer(f2, delimiter='\t')
        # groupby only merges consecutive rows that share the same first column
        for group, rows in groupby(reader, itemgetter(0)):
            # max() returns just one row: highest line[3], ties broken by line[2]
            best = max(rows, key=lambda r: (float(r[3]), float(r[2])))
            writer1.writerow(best)
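
One possible direction, as a minimal sketch rather than a definitive fix: replace `max` with `heapq.nlargest`, which takes the same `key` argument but returns the top n items instead of one. The sketch below assumes Python 3 (text-mode files opened with `newline=''` instead of `'rb'`/`'wb'`) and assumes the input is tab-delimited and already grouped by the first column, as in the attempt above. The helper names `top_n_per_group` and `write_top_rows` are hypothetical, introduced only for illustration.

```python
import csv
import heapq
from itertools import groupby
from operator import itemgetter


def top_n_per_group(rows, n=3):
    """Yield the n best rows of each run of consecutive rows sharing column 0."""
    for _, group in groupby(rows, key=itemgetter(0)):
        # Rank by column 3 first, then column 2 (both as floats), highest first.
        best = heapq.nlargest(n, group, key=lambda r: (float(r[3]), float(r[2])))
        for row in best:
            yield row


def write_top_rows(in_path, out_path, n=3):
    # Assumption: Python 3, so the files are opened in text mode with newline=''.
    with open(in_path, newline='') as f_in, \
            open(out_path, 'w', newline='') as f_out:
        reader = csv.reader(f_in, delimiter='\t')
        writer = csv.writer(f_out, delimiter='\t')
        writer.writerows(top_n_per_group(reader, n))
```

Calling `write_top_rows('myfile', 'outfile')` would then write up to three rows per group. Within each group the rows come out in descending order of (line[3], line[2]), so the order of tied rows may differ slightly from the sample output shown above. If the file is not already sorted by the first column, it would need to be sorted on that column before `groupby` is applied.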