Overwriting sheets in Excel with Python

Question

python

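I'm new to Python programming and have run into a problem writing data to an Excel sheet.

I read an Excel file, sum specific columns, and write the results to a new workbook. Finally, I create two charts from the results.

The code runs, but every run creates new sheets with a number appended to the name. I just want to overwrite the sheets whose names I supply, not create new ones.

I'm not familiar enough with all the modules to understand every available option. I have looked at openpyxl, pandas, and others, but examples similar to what I'm trying to do are either hard to find or don't seem to work when I try them.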

import pandas as pd
import xlrd
import openpyxl as op
from openpyxl import load_workbook
import matplotlib.pyplot as plt

# declare the input file
input_file = 'TestData.xlsx'

# declare the output_file name to be written to
output_file = 'TestData_Output.xlsx'
book = load_workbook(output_file)
writer = pd.ExcelWriter(output_file, engine='openpyxl')
writer.book = book

# read the source Excel file and calculate sums
excel_file = pd.read_excel(input_file)
num_events_main = excel_file.groupby(['Column1']).sum()
num_events_type = excel_file.groupby(['Column2']).sum()

# create dataframes and write names and sums out to new workbook/sheets
df_1 = pd.DataFrame(num_events_main)
df_2 = pd.DataFrame(num_events_type)
df_1.to_excel(writer, sheet_name = 'TestSheet1')
df_2.to_excel(writer, sheet_name = 'TestSheet2')

# save and close
writer.save()
writer.close()

# dataframe for the first sheet
df = pd.read_excel(output_file, sheet_name='TestSheet1')
values = df[['Column1', 'Column3']]

# dataframe for the second sheet
df = pd.read_excel(output_file, sheet_name='TestSheet2')
values_2 = df[['Column2', 'Column3']]

# create the graphs
events_graph = values.plot.bar(x = 'Column1', y = 'Column3', rot = 60) # rot = rotation
type_graph = values_2.plot.bar(x = 'Column2', y = 'Column3', rot = 60) # rot = rotation
plt.show()

I get the expected results and the charts work fine. I really just want to overwrite the sheets on each run.

- subzeroLV

What happens if you don't use load_workbook? As I recall, initializing pd.ExcelWriter overwrites the existing workbook. - dubbbdan

writer = pd.ExcelWriter(output_file, engine='openpyxl') already has a book, so there is no reason to use book = load_workbook(output_file) - dubbbdan

dubbbdan, thank you! That worked! I appreciate your help. - subzeroLV

I've posted this as an answer; please accept and upvote it if you think it deserves it. - dubbbdan

1 Answer

Content originally from Stack Overflow.

- dubbbdan · Accepted Answer

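From the pd.DataFrame.to_excel documentation:

Multiple sheets may be written to by specifying a unique sheet_name. With all data written to the file, it is necessary to save the changes. Note that creating an ExcelWriter object with a file name that already exists will result in the contents of the existing file being erased.

Try writing to the book like this: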

import pandas as pd
df = pd.DataFrame({'col1':[1,2,3],'col2':[4,5,6]})
writer = pd.ExcelWriter('g.xlsx')
df.to_excel(writer, sheet_name = 'first_df')
df.to_excel(writer, sheet_name = 'second_df')
writer.close()  # writes the file to disk; ExcelWriter.save() was removed in pandas 2.0

If you inspect the workbook, you will find two sheets.

Now suppose you want to write new data to the same workbook:

writer = pd.ExcelWriter('g.xlsx')
df.to_excel(writer, sheet_name = 'new_df')
writer.close()  # writes the file to disk; ExcelWriter.save() was removed in pandas 2.0

If you inspect the workbook now, you will see only one sheet, named new_df.
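The erase-on-create behavior described above can be checked without opening Excel. The following is a minimal sketch, not part of the original answer: it assumes openpyxl is installed, writes to a temporary directory instead of the working directory, and the variable names sheets_after_first_run and sheets_after_second_run are made up for illustration.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})

# Assumption: a throwaway path in a temp directory, so no real file is touched
path = os.path.join(tempfile.mkdtemp(), 'g.xlsx')

# First run: write two sheets (the with-block saves and closes the writer)
with pd.ExcelWriter(path, engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='first_df')
    df.to_excel(writer, sheet_name='second_df')
sheets_after_first_run = load_workbook(path).sheetnames

# Second run: a fresh ExcelWriter on the same path starts from an empty workbook
with pd.ExcelWriter(path, engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='new_df')
sheets_after_second_run = load_workbook(path).sheetnames

print(sheets_after_first_run)
print(sheets_after_second_run)
```

The second list should contain only new_df, because the default write mode replaces the whole file rather than a single sheet.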

If the Excel file contains other sheets that you need to keep, and you only want to overwrite specific ones, you will need load_workbook.

With the workbook loaded this way, you can delete the sheet you are about to write before writing any data:

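# get_sheet_by_name() and remove_sheet() are deprecated in recent openpyxl;
# index the workbook by sheet name and call remove() instead
std = book['<sheet_name>']  # <sheet_name> is a placeholder for the sheet to replace
book.remove(std)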

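This prevents a number from being appended to the sheet name, which is what happens when you write a sheet whose name already exists in the workbook.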
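The removal step above can also be left to pandas. The sketch below is an alternative to the answer's approach, not the answer's own method: it assumes pandas 1.3 or later (where ExcelWriter accepts if_sheet_exists) and the openpyxl engine. The sheet name KeepMe and the sample data are invented for the demonstration; TestSheet1 and Column3 are borrowed from the question.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Assumption: a throwaway path in a temp directory, so no real file is touched
path = os.path.join(tempfile.mkdtemp(), 'TestData_Output.xlsx')

old = pd.DataFrame({'Column3': [1, 2]})
new = pd.DataFrame({'Column3': [10, 20, 30]})

# Set-up: create a workbook with the sheet to overwrite and one to preserve
with pd.ExcelWriter(path, engine='openpyxl') as writer:
    old.to_excel(writer, sheet_name='TestSheet1')
    old.to_excel(writer, sheet_name='KeepMe')  # hypothetical sheet that must survive

# mode='a' opens the existing file instead of erasing it;
# if_sheet_exists='replace' drops the old sheet of the same name before writing
with pd.ExcelWriter(path, engine='openpyxl', mode='a',
                    if_sheet_exists='replace') as writer:
    new.to_excel(writer, sheet_name='TestSheet1')

sheet_names = load_workbook(path).sheetnames
replaced = pd.read_excel(path, sheet_name='TestSheet1', index_col=0)

print(sheet_names)
print(replaced)
```

Run repeatedly, this should leave exactly the same two sheets each time, with no numbered copies of TestSheet1. On pandas versions older than 1.3 the if_sheet_exists argument is not available, and deleting the sheet with openpyxl as shown above remains the way to go.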