Is there a way to auto-adjust Excel column widths with pandas.ExcelWriter?

Question

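I have been asked to generate some Excel reports. I currently rely heavily on pandas for my data, so naturally I would like to use the pandas.ExcelWriter method to generate these reports. However, the fixed column widths are a problem.

The code I have so far is simple enough. Say I have a dataframe called df: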

writer = pd.ExcelWriter(excel_file_path, engine='openpyxl')
df.to_excel(writer, sheet_name="Summary")

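I looked through the pandas documentation and did not see any option to set column widths. Is there a way to make the columns auto-adjust to the data? Or is there something I can do to the xlsx file after the fact to adjust the column widths? (I am using the OpenPyXL library and generating .xlsx files, if that makes any difference.)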

- badideas

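This does not look possible right now; please open an issue for this enhancement on GitHub (and maybe a PR?). It doesn't look that hard to do. - Jeff

Thanks Jeff, I have submitted the issue. I'm not sure I will have time to actually dive into the pandas codebase to solve it, but you never know :) - badideas

Yep... saw your issue... comment on the issue if you need some help! (Essentially an optional argument needs to be passed to "to_excel", maybe "col_style=dict", containing column header style elements, rather than the default "header_style", which seems to be hard-coded now.) - Jeff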

Link to the pandas issue - dmvianna

21 Answers

- Michel Kluger · Answer 1

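You can solve the problem by calling the following function, where df is the dataframe whose column sizes you want to measure and sheetname is the Excel sheet you want to modify.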

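def auto_width_columns(df, sheetname):
    # Assumes `writer` is a module-level pd.ExcelWriter created with
    # engine='xlsxwriter' (set_column is an XlsxWriter method). Column i of df
    # maps to sheet column i only if the sheet was written with index=False.
    worksheet = writer.sheets[sheetname]

    for i, col in enumerate(df.columns):
        # Width: the longest cell text, or the header plus 2 characters.
        column_len = max(df[col].astype(str).str.len().max(), len(str(col)) + 2)
        worksheet.set_column(i, i, column_len)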
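
To make the sizing rule in this answer concrete, here is a minimal pure-Python sketch of the same arithmetic with no Excel involved. The name fit_width and the padding parameter are illustrative only; they are not part of pandas or XlsxWriter. The result is a character count, which only approximates what Excel renders with a proportional font.

```python
def fit_width(header, values, padding=2):
    """Width in characters: the longest cell text, or the header plus padding.

    Hypothetical helper that mirrors the max(...) rule used in the answer.
    """
    longest_value = max((len(str(v)) for v in values), default=0)
    return max(longest_value, len(str(header)) + padding)


# The header "city" needs 4 + 2 = 6, but "San Francisco" needs 13.
print(fit_width("city", ["Oslo", "San Francisco"]))
# A long header can dominate short values: 14 + 2 = 16.
print(fit_width("population_est", [1, 22, 333]))
```

Padding only the header, as the answer does, means a column whose longest value is wider than its header gets no extra room; adding the padding after taking the maximum is a reasonable variation.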

- Yauheni Piatrouski · Answer 2

Try the worksheet.autofit() method. I got access to it by running pip install XlsxWriter==3.0.9.

P.S. I am new to answering questions, so apologies for the terse answer.

- Ssubrat Rrudra · Answer 3

import openpyxl
from openpyxl.utils import get_column_letter
# ... (_ws is an openpyxl worksheet obtained elsewhere)
for col in _ws.columns:
    max_length = 0
    # Derive the letter from the cell's column number instead of parsing its
    # repr with a regex, which breaks for multi-letter columns such as "AA".
    col_name = get_column_letter(col[0].column)
    for cell in col:
        # Measure the text form so numbers and dates are counted too.
        if cell.value is not None:
            max_length = max(max_length, len(str(cell.value)))
    adjusted_width = max_length + 2
    _ws.column_dimensions[col_name].width = adjusted_width

- Ashish Jith · Answer 4

The simplest solution is to specify the column width in the set_column method.

    for worksheet in writer.sheets.values():
        worksheet.set_column(0,last_column_value, required_width_constant)

- Jay Jara · Answer 5

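There are plenty of valid solutions here. I think the simplest and cleanest is the pywin32 package, which closely mirrors Excel VBA. The Range.AutoFit method takes care of this. Example below: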

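import win32com.client as win32

xlApp = win32.Dispatch('Excel.Application')
# file_path: path to the Excel file (placeholder, set it yourself)
wb = xlApp.Workbooks.Open(file_path)

# sheet_name: name of the worksheet whose columns should be resized (placeholder)
ws = wb.Worksheets(sheet_name)
ws.Columns.AutoFit()
wb.Save()  # persist the new column widths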

Note: The Worksheet.Columns property represents a Range object. AutoFit is a method that belongs to the Range object.

- gijsvandinther · Answer 6

This function works for me and also fixes the index width.

def write_to_excel(writer, X, sheet_name, sep_only=False):
    # writer: pd.ExcelWriter using the xlsxwriter engine (set_column is XlsxWriter API)
    # X: dataframe
    # sheet_name: name of the sheet
    # sep_only: True = write only a separate Excel file;
    #           False = also write a sheet to the writer object
    # output_folder and prefix_excel_save must be defined at module level
    if sheet_name == "":
        print("specify sheet_name!")
    else:
        X.to_excel(f"{output_folder}{prefix_excel_save}_{sheet_name}.xlsx")
        if not sep_only:
            X.to_excel(writer, sheet_name=sheet_name)

            # fix column widths
            worksheet = writer.sheets[sheet_name]  # pull worksheet object
            for idx, col in enumerate(X.columns):  # loop through all columns
                series = X[col]
                max_len = max((
                    series.astype(str).map(len).max(),  # len of largest item
                    len(str(series.name))  # len of column name/header
                    )) + 1  # adding a little extra space
                # +1 because the index occupies sheet column 0
                worksheet.set_column(idx+1, idx+1, max_len)

            # fix index width
            max_len = pd.Series(X.index.values).astype(str).map(len).max() + 1
            worksheet.set_column(0, 0, max_len)

        print(f'{sheet_name} is written as separate file')
        if not sep_only:
            print(f'{sheet_name} is written as sheet')
    return writer

Example call:

writer = write_to_excel(writer, dataframe, "Statistical_Analysis")

- Mounesh · Answer 7

Since you are using the openpyxl engine, you can try the following solution. The columns are adjusted automatically.

from openpyxl.utils import get_column_letter

# `sheet` is an openpyxl worksheet, e.g. writer.sheets["Summary"]
for column_cells in sheet.columns:
    new_column_length = max(len(str(cell.value)) for cell in column_cells)
    new_column_letter = (get_column_letter(column_cells[0].column))
    if new_column_length > 0:
        sheet.column_dimensions[new_column_letter].width = new_column_length*1.23
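
The loop in this answer needs a worksheet to operate on. As a sketch of how it might sit inside a complete write, the example below creates the writer with the openpyxl engine, writes the frame, and sizes the columns before the file is saved. The function name write_with_fitted_columns and the padding parameter are illustrative, and a fixed padding is used here in place of the answer's 1.23 scale factor.

```python
import pandas as pd
from openpyxl.utils import get_column_letter


def write_with_fitted_columns(df, path, sheet_name="Summary", padding=2):
    """Write df with the openpyxl engine, then widen each column to fit.

    Hypothetical helper; widths are character counts plus a fixed padding.
    """
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
        sheet = writer.sheets[sheet_name]
        for column_cells in sheet.columns:
            # Longest text in the column, header cell included.
            longest = max(
                (len(str(cell.value)) for cell in column_cells
                 if cell.value is not None),
                default=0,
            )
            letter = get_column_letter(column_cells[0].column)
            sheet.column_dimensions[letter].width = longest + padding
    return path
```

Because the loop walks the worksheet's cells rather than the dataframe, it also covers the index column if index=False is dropped.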

- Wagner Montanini · Answer 8

The function 'autosize_to_excel' takes a DataFrame and saves it to an Excel file, adjusting the column widths to fit the data.

import pandas as pd
from openpyxl.utils import get_column_letter
def autosize_to_excel(filename: str, df: pd.DataFrame, sheet_name: str = 'sheetName', **kwargs):
    """
    The function 'autosize_to_excel' takes a DataFrame and saves it to an Excel file, adjusting the
    column widths to fit the data.
    
    :param filename: The filename parameter is a string that specifies the name of the Excel file
    that will be created or overwritten with the data from the DataFrame
    :type filename: str
    :param df: The parameter `df` is a pandas DataFrame that contains the data you want to write to
    the Excel file
    :type df: pd.DataFrame
    :param sheet_name: The parameter "sheet_name" is used to specify the name of the sheet in the
    Excel file where the DataFrame will be written. By default, it is set to 'sheetName', defaults
    to sheetName
    :type sheet_name: str (optional)
    """
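    try:
        # column_dimensions is openpyxl API, so request that engine explicitly.
        with pd.ExcelWriter(filename, engine='openpyxl') as writer:
            df.to_excel(writer, sheet_name=sheet_name, **kwargs)

            # to_excel writes the index first unless index=False was passed
            # (this assumes a single-level index).
            offset = 1 if kwargs.get('index', True) else 0
            for column in df:
                column_length = max(df[column].astype(str).map(len).max(), len(str(column)))
                # Get the column letter (1-based, shifted past the index column)
                column_letter = get_column_letter(df.columns.get_loc(column) + 1 + offset)
                writer.sheets[sheet_name].column_dimensions[column_letter].width = column_length + 2
    except Exception as err:
        print(err)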

- Imperial_J · Answer 9

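I may be a bit late to the party, but this code works when using 'openpyxl' as the engine; sometimes pip install xlsxwriter does not solve the issue. The code below works like a charm. Edit any part as you wish.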

import functools
from decimal import Decimal

import openpyxl

def text_length(text):
    """
    Get the effective text length in characters, taking into account newlines
    """
    if not text:
        return 0
    lines = str(text).split("\n")
    return max(len(line) for line in lines)

def _to_str_for_length(v, decimals=3):
    """
    Like str() but rounds decimals to predefined length
    """
    if isinstance(v, float):
        # Round to [decimal] places
        return str(Decimal(v).quantize(Decimal('1.' + '0' * decimals)).normalize())
    else:
        return str(v)


def auto_adjust_xlsx_column_width(df, writer, sheet_name, margin=3, length_factor=1.0, decimals=3, index=False):

    sheet = writer.sheets[sheet_name]
    _to_str = functools.partial(_to_str_for_length, decimals=decimals)
    # Compute & set column width for each column
    for column_name in df.columns:
        # Convert the column's values to strings and take the longest one
        # (the header counts too), plus 5 characters of padding
        column_length =  max(df[column_name].apply(_to_str).map(text_length).max(), text_length(column_name)) + 5
        # Get index of column in XLSX
        # Column index is +1 if we also export the index column
        col_idx = df.columns.get_loc(column_name)
        if index:
            col_idx += 1
        # Set width of column to (column_length + margin)
        sheet.column_dimensions[openpyxl.utils.cell.get_column_letter(col_idx + 1)].width = column_length * length_factor + margin
    # Compute column width of index column (if enabled)
    if index: # If the index column is being exported
        index_length =  max(df.index.map(_to_str).map(text_length).max(), text_length(df.index.name))
        sheet.column_dimensions["A"].width = index_length * length_factor + margin
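
The two measuring helpers in this answer carry most of the logic: a multi-line cell is as wide as its longest line, and floats are measured after rounding. One detail worth knowing is that Decimal.normalize() can switch to scientific notation, so 1500.0 is measured as the six-character string "1.5E+3". The sketch below shows the same measuring idea with plain string formatting instead; display_length is a hypothetical name and this is a variation, not the answer's implementation.

```python
def display_length(value, decimals=3):
    """Approximate width of a cell value in characters.

    Hypothetical variation: floats are rounded with string formatting,
    and multi-line text counts as its longest line.
    """
    if value is None:
        return 0
    if isinstance(value, float):
        # Round, then drop trailing zeros and a dangling decimal point.
        text = f"{value:.{decimals}f}".rstrip("0").rstrip(".")
    else:
        text = str(value)
    return max(len(line) for line in text.split("\n"))


print(display_length("ab\ncdef"))   # longest line is "cdef"
print(display_length(3.14159265))   # measured as "3.142"
print(display_length(1500.0))       # measured as "1500"
```

Either approach only estimates the width, since Excel applies its own number formats when displaying a cell.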

- Crispy13 · Answer 10

An openpyxl version based on @alichaudry's code.
The code 1) loads an Excel file, 2) adjusts the column widths, and 3) saves it.

def auto_adjust_column_widths(excel_file : "Excel File Path", extra_space = 1) -> None:
    """
    Adjusts column widths of the excel file and replaces it with the adjusted one.
    Adjusting columns is based on the lengths of columns values (including column names).
    Parameters
    ----------
    excel_file :
        excel_file to adjust column widths.
    
    extra_space : 
        extra column width in addition to the value-based-widths
    """

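    import pandas as pd
    from openpyxl import load_workbook
    from openpyxl.utils import get_column_letter

    wb = load_workbook(excel_file)

    for ws in wb:
        df = pd.DataFrame(ws.values)

        # Longest text per column; Series.items() replaces iteritems(),
        # which was removed in pandas 2.0.
        widths = df.astype(str).apply(lambda s: s.str.len()).max(axis=0) + extra_space
        for i, r in widths.items():
            ws.column_dimensions[get_column_letter(i + 1)].width = int(r)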

    
    wb.save(excel_file)