@MaxU的解决方案在更新版本的python和相关包中不起作用。它会引发错误:
"zipfile.BadZipFile:文件不是zip文件"
我生成了一个新版本的函数,可以与更新版本的python和相关包很好地配合使用,并测试了python:3.9 | openpyxl:3.0.6 | pandas:1.2.3
此外,我还向助手函数添加了更多功能:
- 现在它可以根据单元格内容宽度调整所有列的大小,并且所有变量都将可见(请参见“resizeColumns”)
- 如果你希望NaN显示为NaN或为空单元格,可以处理NaN(请参见“na_rep”)
- 新增了“startcol”,您可以决定从特定列开始写入,否则将从列= 0开始
以下是函数代码:
import pandas as pd
def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None, startcol=None,
truncate_sheet=False, resizeColumns=True, na_rep = 'NA', **to_excel_kwargs):
"""
Append a DataFrame [df] to existing Excel file [filename]
into [sheet_name] Sheet.
If [filename] doesn't exist, then this function will create it.
Parameters:
filename : File path or existing ExcelWriter
(Example: '/path/to/file.xlsx')
df : dataframe to save to workbook
sheet_name : Name of sheet which will contain DataFrame.
(default: 'Sheet1')
startrow : upper left cell row to dump data frame.
Per default (startrow=None) calculate the last row
in the existing DF and write to the next row...
truncate_sheet : truncate (remove and recreate) [sheet_name]
before writing DataFrame to Excel file
resizeColumns: default = True . It resize all columns based on cell content width
to_excel_kwargs : arguments which will be passed to `DataFrame.to_excel()`
[can be dictionary]
na_rep: default = 'NA'. If, instead of NaN, you want blank cells, just edit as follows: na_rep=''
Returns: None
*******************
CONTRIBUTION:
Current helper function generated by [Baggio]: https://stackoverflow.com/users/14302009/baggio?tab=profile
Contributions to the current helper function: https://stackoverflow.com/users/4046632/buran?tab=profile
Original helper function: (c) [MaxU](https://stackoverflow.com/users/5741205/maxu?tab=profile)
Features of the new helper function:
1) Now it works with python 3.9 and latest versions of pandas and openpxl
---> Fixed the error: "zipfile.BadZipFile: File is not a zip file".
2) Now It resize all columns based on cell content width AND all variables will be visible (SEE "resizeColumns")
3) You can handle NaN, if you want that NaN are displayed as NaN or as empty cells (SEE "na_rep")
4) Added "startcol", you can decide to start to write from specific column, oterwise will start from col = 0
*******************
"""
from openpyxl import load_workbook
from string import ascii_uppercase
from openpyxl.utils import get_column_letter
from openpyxl import Workbook
if 'engine' in to_excel_kwargs:
to_excel_kwargs.pop('engine')
try:
f = open(filename)
except IOError:
wb = Workbook()
ws = wb.active
ws.title = sheet_name
wb.save(filename)
writer = pd.ExcelWriter(filename, engine='openpyxl', mode='a')
try:
FileNotFoundError
except NameError:
FileNotFoundError = IOError
try:
writer.book = load_workbook(filename)
if startrow is None and sheet_name in writer.book.sheetnames:
startrow = writer.book[sheet_name].max_row
if truncate_sheet and sheet_name in writer.book.sheetnames:
idx = writer.book.sheetnames.index(sheet_name)
writer.book.remove(writer.book.worksheets[idx])
writer.book.create_sheet(sheet_name, idx)
writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
except FileNotFoundError:
pass
if startrow is None:
startrow = 0
if startcol is None:
startcol = 0
df.to_excel(writer, sheet_name, startrow=startrow, startcol=startcol, na_rep=na_rep, **to_excel_kwargs)
if resizeColumns:
ws = writer.book[sheet_name]
def auto_format_cell_width(ws):
for letter in range(1,ws.max_column):
maximum_value = 0
for cell in ws[get_column_letter(letter)]:
val_to_check = len(str(cell.value))
if val_to_check > maximum_value:
maximum_value = val_to_check
ws.column_dimensions[get_column_letter(letter)].width = maximum_value + 2
auto_format_cell_width(ws)
writer.save()
示例用法:
df = pd.DataFrame({'numbers': [1, 2, 3],
'colors': ['red', 'white', 'blue'],
'colorsTwo': ['yellow', 'white', 'blue'],
'NaNcheck': [float('NaN'), 1, float('NaN')],
})
filename = r"C:\DataScience\df.xlsx"
append_df_to_excel(filename, df, index=False, startrow=0)
append_df_to_excel(filename, df, sheet_name="Cool", index=False, startrow=0)
append_df_to_excel(filename, df, sheet_name="Cool", index=False)
append_df_to_excel(filename, df, sheet_name="Cool", index=False, startrow=0, startcol=5)
append_df_to_excel(filename, df, index=False, truncate_sheet=True, startrow=10, na_rep = '')
ExcelReader
。 - virtualxtc