openPyXL - assign value to range of cells during unmerge

3

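I have Excel files with multiple sheets in each, and I am writing a script that gathers data from selected sheets, if they exist in the file, and combines it into one big sheet. In general it works: it iterates over the files, finds the cell range that contains data, and appends it to a dataframe. What I need to do now is add the header rows (column names) to the dataframe, but in the sheets these are multi-row headers.

To make it look the same in the dataframe, I need to unmerge the cells in the top header rows and copy the value from the first cell to the rest of the cells in the range that was previously merged.

I am using OpenPyXL to access the Excel sheets. My function receives only the sheet to work on as its parameter. It looks like this: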

def checkForMergedCells(sheet):
    merged = sheet.merged_cell_ranges
    for mergedCell in merged:
        mc_start, mc_stop = str(mergedCell).split(':')
        cp_value = sheet[mc_start]
        sheet.unmerge_cells(mergedCell)
        cell_range = sheet[mergedCell]
        for cell in cell_range:
            cell.value = cp_value

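The problem is that cell_range is a tuple, which leads to this error message:

AttributeError: 'tuple' object has no attribute 'value'

Below is a screenshot from the debugging session showing the value held by each variable.

[Screenshot: running the debugger]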

6 Answers

6

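Accessing by index will generally return a tuple of tuples, except when you ask for an individual cell or row. For programmatic access you should use iter_rows() or iter_cols().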
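
As a minimal sketch of that difference (the in-memory workbook and the names block, error_text and coords are made up for illustration), indexing a range gives rows of cells, which is where the error in the question comes from, while iter_rows() makes the two levels explicit:

```python
from openpyxl import Workbook

# Hypothetical in-memory workbook, used only for illustration.
wb = Workbook()
ws = wb.active
ws["A1"] = "header"

single = ws["A1"]      # one coordinate: a single Cell
block = ws["A1:B2"]    # a range: a tuple of rows, each row a tuple of Cells

# Looping over `block` yields rows (tuples), so `.value` fails on them.
is_tuple = isinstance(block[0], tuple)
try:
    block[0].value
except AttributeError as exc:
    error_text = str(exc)

# iter_rows() also yields rows, so a nested loop reaches the cells.
coords = [
    cell.coordinate
    for row in ws.iter_rows(min_row=1, max_row=2, min_col=1, max_col=2)
    for cell in row
]
```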

You might also like to spend some time looking at the utils module.

from openpyxl.utils import range_boundaries

for group in ws.merged_cell_ranges:
     min_col, min_row, max_col, max_row = range_boundaries(group)
     top_left_cell_value = ws.cell(row=min_row, column=min_col).value
     for row in ws.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
         for cell in row:
             cell.value = top_left_cell_value

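I just added list(), as in list(ws.merged_cell_ranges), so that every cell gets unmerged, because the loop then iterates over a list (otherwise it skips some cells, since the collection of merged ranges changes along the way). - Marta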

4

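Issues with other answers

Regarding @Charlie Clark's selected answer and the other answers that use the code from http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python, you can unmerge the cells more easily, without dealing with range_boundaries and the conversions it requires.

I also ran into problems with the selected answer: some merged cells were unmerged while others were not, and some unmerged cells were filled with the desired data while others were not.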

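The problem is that worksheet.merged_cells.ranges is iterated live rather than copied up front, so when worksheet.unmerge_cells() is called, the worksheet.merged_cells object is mutated, and side effects occur as the loop continues over the merged cell ranges.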

Better solution

In my case, I wanted to unmerge the cells and also copy the border, font, and alignment information:

                    +-------+------+
+-------+------+    | Date  | Time |
| Date  | Time |    +=======+======+
+=======+======+    | Aug 6 | 1:00 |
|       | 1:00 | -> +-------+------+
| Aug 6 | 3:00 |    | Aug 6 | 3:00 |
|       | 6:00 |    +-------+------+
+-------+------+    | Aug 6 | 6:00 |
                    +-------+------+

For the current latest version, openpyxl==3.0.9, I found that the following works best for me:

from copy import copy

from openpyxl import load_workbook, Workbook
from openpyxl.cell import Cell
from openpyxl.worksheet.cell_range import CellRange
from openpyxl.worksheet.worksheet import Worksheet


def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    """
    Unmerges all merged cells in the given ``worksheet`` and copies the content
    and styling of the original cell to the newly unmerged cells.

    :param worksheet: The Excel worksheet containing the merged cells.
    """

    # Must convert iterator to list to eagerly evaluate all merged cell ranges
    # before looping over them - this prevents unintended side-effects of
    # certain cell ranges from being skipped since `worksheet.unmerge_cells()`
    # is destructive.
    all_merged_cell_ranges: list[CellRange] = list(
        worksheet.merged_cells.ranges
    )

    for merged_cell_range in all_merged_cell_ranges:
        merged_cell: Cell = merged_cell_range.start_cell
        worksheet.unmerge_cells(range_string=merged_cell_range.coord)

        # Don't need to convert iterator to list here since `merged_cell_range`
        # is cached
        for row_index, col_index in merged_cell_range.cells:
            cell: Cell = worksheet.cell(row=row_index, column=col_index)
            cell.value = merged_cell.value

            # (Optional) If you want to also copy the original cell styling to
            # the newly unmerged cells, you must use shallow `copy()` since
            # cell style properties are proxy objects which are not hashable.
            #
            # See <https://openpyxl.rtfd.io/en/stable/styles.html#copying-styles>
            cell.alignment = copy(merged_cell.alignment)
            cell.border = copy(merged_cell.border)
            cell.font = copy(merged_cell.font)


# Sample usage
if __name__ == "__main__":
    workbook: Workbook = load_workbook(
        filename="workbook_with_merged_cells.xlsx"
    )
    worksheet: Worksheet = workbook["My Sheet"]

    unmerge_and_fill_cells(worksheet=worksheet)
    workbook.save(filename="workbook_with_unmerged_cells.xlsx")
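
A quick way to check this behaviour without a file on disk is sketched below. It builds a hypothetical in-memory sheet shaped like the diagram above (the sample data, the bold font, and the names dates, bold_flags and remaining are assumptions for illustration) and applies the same unmerge, fill, and copy steps, limited to the font:

```python
from copy import copy

from openpyxl import Workbook
from openpyxl.styles import Font

# Hypothetical sheet mirroring the diagram: "Aug 6" merged over A2:A4.
wb = Workbook()
ws = wb.active
ws.append(["Date", "Time"])
ws.append(["Aug 6", "1:00"])
ws.append([None, "3:00"])
ws.append([None, "6:00"])
ws["A2"].font = Font(bold=True)
ws.merge_cells("A2:A4")

# Snapshot the ranges first, because unmerge_cells() mutates the collection.
for merged_range in list(ws.merged_cells.ranges):
    source = merged_range.start_cell
    ws.unmerge_cells(range_string=merged_range.coord)
    for row, col in merged_range.cells:
        target = ws.cell(row=row, column=col)
        target.value = source.value
        target.font = copy(source.font)  # style objects are copied, not shared

dates = [ws.cell(row=r, column=1).value for r in range(2, 5)]
bold_flags = [ws.cell(row=r, column=1).font.bold for r in range(2, 5)]
remaining = len(ws.merged_cells.ranges)
```

After the loop, every cell of the former range should hold the top-left value and a bold font, and no merged ranges should be left on the sheet.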

Concise solution

Here is a shorter version without the comments that does not copy the styles:

from openpyxl.worksheet.worksheet import Worksheet

def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    for merged_cell_range in list(worksheet.merged_cells.ranges):
        merged_cell = merged_cell_range.start_cell
        worksheet.unmerge_cells(range_string=merged_cell_range.coord)

        for row_col_indices in merged_cell_range.cells:
            worksheet.cell(*row_col_indices).value = merged_cell.value

1

None of the previous answers worked for me, so I worked out this one, tested it, and it works for me.

from openpyxl import load_workbook

wb = load_workbook('Example.xlsx')

# wb.worksheets holds the worksheet objects, e.g. 'Sheet1' and 'Sheet2'
for ws in wb.worksheets:
    
    # you need a separate list to iterate on (see explanation #2 below)
    mergedcells =[]  
    for group in ws.merged_cells.ranges:
        mergedcells.append(group)
    
    for group in mergedcells:
        min_col, min_row, max_col, max_row = group.bounds 
        top_left_cell_value = ws.cell(row=min_row, column=min_col).value
        ws.unmerge_cells(str(group))   # you need to unmerge before writing (see explanation #1 below)
        for irow in range(min_row, max_row+1):
            for jcol in range(min_col, max_col+1): 
                ws.cell(row = irow, column = jcol, value = top_left_cell_value)

 

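@Дмитро Олександрович is almost right, but I had to change a few things to fix his answer:

  1. You will get an AttributeError: 'MergedCell' object attribute 'value' is read-only error, because you need to unmerge the cells before changing their values. (See here: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1228)

  2. You cannot iterate over ws.merged_cells.ranges directly, because in Python, iterating over the "ranges" list object while changing it (for example with the unmerge_cells function or a pop function) results in only half of the objects being changed (see here: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1085). You need to create a separate list and iterate over that.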
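
The second point can be reproduced with plain Python lists, no workbook needed. In this sketch the strings are only stand-ins for the range objects, and remove() plays the role that unmerge_cells() plays on the real collection; newer openpyxl versions may store the ranges differently, so the exact failure mode there is not guaranteed to match:

```python
# Stand-in for ws.merged_cells.ranges: plain strings instead of range objects.
live = ["A1:B1", "C1:D1", "E1:F1", "G1:H1"]

visited_live = []
for item in live:          # iterating the list that is being modified
    visited_live.append(item)
    live.remove(item)      # shifts the remaining items left, so the next one is skipped

# Same data again, but iterating over a copy of the list.
safe = ["A1:B1", "C1:D1", "E1:F1", "G1:H1"]

visited_copy = []
for item in list(safe):    # list(...) takes a snapshot before the loop starts
    visited_copy.append(item)
    safe.remove(item)
```

Only every other entry is visited in the first loop, and two entries are left behind, while the snapshot loop visits all four and empties the list.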


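Very good answer, covers everything. This is the only solution that worked for me on version 3.0.4. - otocan
@otocan Thanks, I have just fixed the links (the openpyxl repository moved away from Bitbucket, so the paths of all the issues I cited changed with it). They are up to date now. - rgiannico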

1
The following code, from http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python/, worked very well for me.
import openpyxl 
from openpyxl.utils import range_boundaries
wbook=openpyxl.load_workbook("openpyxl_merge_unmerge.xlsx")
sheet=wbook["unmerge_sample"]
# list(...) added here: unmerge_cells() changes the collection being looped over
for cell_group in list(sheet.merged_cells.ranges):
    min_col, min_row, max_col, max_row = range_boundaries(str(cell_group))
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    sheet.unmerge_cells(str(cell_group))
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
wbook.save("openpyxl_merge_unmerge.xlsx")
exit()

0
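All of the previous solutions gave me some kind of error, probably because of differences between openpyxl versions. With the current version (3.0.10), I found this solution that works for me: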
for m_range in list(ws.merged_cells.ranges):
    merged_cell = m_range.start_cell
    ws.unmerge_cells(range_string=str(m_range))

    for row_col_indices in m_range.cells:
        ws.cell(*row_col_indices).value = merged_cell.value
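
To tie this back to the multi-row headers in the question, here is a small end-to-end sketch. The in-memory sheet and its sample header values are assumptions for illustration. It applies the loop above and then reads the header rows back as plain values, which could then be turned into dataframe column names:

```python
from openpyxl import Workbook

# Hypothetical two-row header: "Group A" is merged across A1:B1.
wb = Workbook()
ws = wb.active
ws.append(["Group A", None, "Group B"])
ws.append(["x", "y", "z"])
ws.merge_cells("A1:B1")

for m_range in list(ws.merged_cells.ranges):   # snapshot, then unmerge
    merged_cell = m_range.start_cell
    ws.unmerge_cells(range_string=str(m_range))
    for row_col_indices in m_range.cells:
        ws.cell(*row_col_indices).value = merged_cell.value

# values_only=True returns the cell values instead of Cell objects.
header_rows = list(ws.iter_rows(min_row=1, max_row=2, values_only=True))
```

Each tuple in header_rows is one header row with the formerly merged cells filled in.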

0

I kept getting errors and deprecation warnings until I did this:

from openpyxl.utils import range_boundaries

for group in sheet.merged_cells.ranges: # merged_cell_ranges deprecated
    display(range_boundaries(group._get_range_string())) # expects a string instead of an object
    min_col, min_row, max_col, max_row = range_boundaries(group._get_range_string())
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
