处理 Epplus Excel 转换为 HTML 中的合并单元格

7

我正在使用Epplus将Excel电子表格呈现为HTML。到目前为止,一切都很顺利,除了一个问题...合并单元格跨度。我似乎无法理解它的逻辑。我想分享出来看看社区如何处理。这是我迄今为止的代码。

public String ParseExcelStamps(String FileName)
{
    FileInfo theFile = new FileInfo(FileName);
    String html = "";
    using (ExcelPackage xlPackage = new ExcelPackage(theFile))
    {
        var workbook = xlPackage.Workbook;
        if (workbook != null)
        {
            for (int j = 1; j <= workbook.Worksheets.Count; j++)
            {
                Tab tab = new Tab();
                html+= "<table style='border-collapse: collapse;font-family:arial;'><tbody>";
                var worksheet = workbook.Worksheets[j];
                tab.Title = worksheet.Name;
                if (worksheet.Dimension == null) { continue; }
                int rowCount = 0;
                int maxColumnNumber = worksheet.Dimension.End.Column;
                var convertedRecords = new List<List<string>>(worksheet.Dimension.End.Row);
                var excelRows = worksheet.Cells.GroupBy(c => c.Start.Row).ToList();
                excelRows.ForEach(r =>
                {
                    rowCount++;
                    html += String.Format("<tr>");
                    var currentRecord = new List<string>(maxColumnNumber);
                    var cells = r.OrderBy(cell => cell.Start.Column).ToList();
                    Double rowHeight = worksheet.Row(rowCount).Height;
                    for (int i = 1; i <= maxColumnNumber; i++)
                    {
                        var currentCell = cells.Where(c => c.Start.Column == i).FirstOrDefault();

                        //look aheads for colspan and rowspan
                        ExcelRangeBase previousCellAbove = null;
                        ExcelRangeBase previousCell = null;
                        ExcelRangeBase nextCell = null;
                        ExcelRangeBase nextCellBelow = null;
                        try { previousCellAbove = worksheet.Cells[rowCount-1, i]; }catch (Exception) { }
                        try { previousCell = worksheet.Cells[rowCount, (i - 1)]; }catch (Exception) { }
                        try { nextCell = worksheet.Cells[rowCount, (i + 1)]; }catch (Exception) { }
                        try { nextCellBelow = worksheet.Cells[rowCount+1, i]; }catch (Exception) { }

                        if ((previousCell != null) && (previousCell.Merge) && (currentCell != null) && (currentCell.Merge)){continue;}
                        if ((previousCellAbove != null) && (previousCellAbove.Merge) && (currentCell != null)) {continue; }

                        if (currentCell == null)
                        {
                            html += String.Format("<td>{0}</td>", String.Empty);
                        }
                        else
                        {
                            int colSpan = 1;
                            int rowSpan = 1;
                            if ((nextCell != null) && (nextCell.Merge) && (currentCell.Merge)) {
                                colSpan = 2;
                               // Console.WriteLine(String.Format("{0} - {1}", currentCell.Address, nextCell.Address));
                            }

                            if ((nextCellBelow != null) && (nextCellBelow.Merge) && (currentCell.Merge)) {
                                Console.WriteLine(String.Format("{0} - {1}", currentCell.Address, nextCellBelow.Address));
                            }

                            html += String.Format("<td colspan={0} rowspan={1}>{2}</td>", colSpan, rowSpan, currentCell.Value);
                        }
                    }
                    html += String.Format("</tr>");
                });
                html += "</tbody></table>";
            }//worksheet loop
        }
    }
    return html;
}

我首先要问的问题是是否可以编辑工作表以取消合并单元格。值得一试吗? - andy holaday
我真的很想解决这个问题,而不必编辑源代码,而且我也无法控制源文件。 - BigBadOwl
这个问题的关键在于Epplus的cell.merge只返回一个布尔值。当你检查一个单元格是否被合并时,你无法确定它是与前面、后面、下面还是上面的单元格合并,因此无法确定它是colspan还是rowspan。 - BigBadOwl
1个回答

13
据我所知,这正是您需要的内容。您缺少的是工作表上的MergedCells属性,该属性列出工作表中的所有合并单元格。
我的代码处理了行跨度、列跨度和行/列同时跨越的情况。我使用包含行、列以及行/列跨度的电子表格进行了一些测试,在所有情况下它们都完美地工作。 代码:
int colSpan = 1;
int rowSpan = 1;

//check if this is the start of a merged cell
ExcelAddress cellAddress = new ExcelAddress(currentCell.Address);

var mCellsResult = (from c in worksheet.MergedCells 
                let addr = new ExcelAddress(c)
                    where cellAddress.Start.Row >= addr.Start.Row &&
                    cellAddress.End.Row <= addr.End.Row &&
                    cellAddress.Start.Column >= addr.Start.Column &&
                    cellAddress.End.Column <= addr.End.Column 
                select addr);

if (mCellsResult.Count() >0)
{
    var mCells = mCellsResult.First();

    //if the cell and the merged cell do not share a common start address then skip this cell as it's already been covered by a previous item
    if (mCells.Start.Address != cellAddress.Start.Address)
        continue;

    if(mCells.Start.Column != mCells.End.Column) {
        colSpan += mCells.End.Column - mCells.Start.Column;
    }

    if (mCells.Start.Row != mCells.End.Row)
    {
        rowSpan += mCells.End.Row - mCells.Start.Row;
    }
}

//load up data
html += String.Format("<td colspan={0} rowspan={1}>{2}</td>", colSpan, rowSpan, currentCell.Value);

这太棒了,运行得非常好。终于能够将Excel文档显示为HTML,真是太酷了。我很快会在GitHub上发布完整的代码,并在此处发布链接。感谢Peter,您的解决方案值得50分甚至更多。 - BigBadOwl
1
请确保在 currentCell.Value 周围放置 HtmlEncode。我相当确定,如果不这样做,< 或 > 符号会导致呈现方面的一些问题。 - Peter
2
+1,一个小改进:始终使用Enumerable.Any()而不是Enumerable.Count()。前者只检查是否存在任何元素,而后者则迭代所有元素。更好的做法是:if (mCellsResult.Any()) - Tim Schmelter
@TimSchmelter - 我不完全同意总是使用 Any,这取决于集合的基础类型。但在这种情况下,我同意。(https://dev59.com/g3VC5IYBdhLWcg3wZwTj) - Peter
@Peter:在Count更有效的情况下,差异微不足道,但是Count()可能会执行查询,产生数百万个元素,而Any()始终只有一个,而且命名更有意义。 - Tim Schmelter

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接