.NET Core memory usage

Question

.net, .net-core, openxml, asp.net-core-3.1, .net-framework-version

4

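I have a simple console application that reads a flat file and converts it to Excel format. To do the conversion I use the Open XML SAX approach. I ran the code as 32-bit on both .NET Framework 4.7.2 and .NET Core 3.1. On .NET Framework, converting a 1300 MB file to Excel used only 300 MB of memory, whereas on .NET Core 3.1, trying to convert a 200 MB flat file threw an out-of-memory exception.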

Note: I need to run my application on a 32-bit system.

Why does .NET Core throw an out-of-memory exception for exactly the same code? Is there a problem with memory usage in .NET Core?

- sandesh mainali

1 Answer

- b.pell · Accepted Answer

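This is caused by a change between .NET Framework and .NET Core. It is a known issue, and I was able to piece together a limited workaround from some suggestions by Microsoft. On GitHub, they pointed out that opening the Package in Write mode instead of ReadWrite mode allows large spreadsheets to be streamed with the SAX approach. Because of how this works, order matters. The first thing written in Write mode must be the large worksheet, because any other OpenXmlWriter instance opened after it needs ReadWrite access and throws an exception otherwise (hence the limitation).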

Here are the steps I followed:

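1. Create a FileStream (I used File.Create).
2. Create a Package, passing in the FileStream and using FileMode.Create and FileAccess.Write.
3. Create a SpreadsheetDocument via SpreadsheetDocument.Create.
4. Write your large WorksheetPart through an OpenXmlWriter.
5. Close and dispose of the writer, the package, the file stream, and so on.
6. Create a FileStream again (this time opening the existing file with File.Open, using FileMode.Open, FileAccess.ReadWrite and FileShare.None).
7. Create a Package, passing in the FileStream and using FileMode.Open and FileAccess.ReadWrite.
8. Create a SpreadsheetDocument via SpreadsheetDocument.Open.
9. Create an OpenXmlWriter for the WorkbookPart, add the Workbook and Sheets elements, associate a Sheet with the worksheet part added in the first pass, then close and dispose of these objects. Done.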

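Now, here is some sample code that should get you close. I am writing an IDataReader into the worksheet. There are a few string extensions used here that are not included, but you can remove or change them as needed.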

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.IO.Packaging;
using System.Linq;
using System.Reflection;

public class ExcelDoc
{
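    /// <summary>
    /// Creates a single-sheet spreadsheet from an <see cref="IDataReader"/> that is capable of writing large
    /// quantities of data with a low memory footprint on .NET Core.
    /// </summary>
    /// <param name="outputFileName">Path of the .xlsx file to create; an existing file is overwritten.</param>
    /// <param name="dr">The open data reader whose rows are written to the sheet.</param>
    /// <param name="workSheetName">The name given to the single worksheet.</param>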
    public static void ToFile(string outputFileName, IDataReader dr, string workSheetName)
    {
        string worksheetPartId;

        // Create a file with write access.  To write the large dataset, it must be the first thing written
        // to the package; any subsequent OpenXmlWriter seems to require read access.  Because of this, we
        // are limited to one large dataset on one sheet.
        using (var fs = File.Create(outputFileName))
        {
            using (var package = Package.Open(fs, FileMode.Create, FileAccess.Write))
            {
                using (var excel = SpreadsheetDocument.Create(package, SpreadsheetDocumentType.Workbook))
                {
                    // Create the Workbook for the spreadsheet
                    excel.AddWorkbookPart();

                    // Create the writer we're going to use.  It streams the row data straight into the
                    // worksheet part instead of building the whole sheet in memory.
                    List<OpenXmlAttribute> oxa;

                    var wsp = excel.WorkbookPart.AddNewPart<WorksheetPart>();
                    var oxw = OpenXmlWriter.Create(wsp);

                    // We need to get the part ID that we'll later use to associate the sheet we create with this data.
                    worksheetPartId = excel.WorkbookPart.GetIdOfPart(wsp);

                    oxw.WriteStartElement(new Worksheet());
                    oxw.WriteStartElement(new SheetData());

                    // Header Row
                    int index = 1;

                    oxa = new List<OpenXmlAttribute>();
                    // this is the row index
                    oxa.Add(new OpenXmlAttribute("r", null, index.ToString()));

                    // This is for the row
                    oxw.WriteStartElement(new Row(), oxa);

                    for (int x = 0; x < dr.FieldCount; x++)
                    {
                        // GetCell already sets the cell's data type, so no extra attributes are needed here.
                        var cell = GetCell(typeof(string), dr.GetName(x));
                        oxw.WriteElement(cell);
                    }

                    // This is for the row
                    oxw.WriteEndElement();

                    // Add a row for each data item.
                    while (dr.Read())
                    {
                        index += 1;

                        oxa = new List<OpenXmlAttribute>();

                        // this is the row index
                        oxa.Add(new OpenXmlAttribute("r", null, index.ToString()));

                        // This is for the row
                        oxw.WriteStartElement(new Row(), oxa);

                        // Add value for each field in the DataReader.
                        for (int x = 0; x < dr.FieldCount; x++)
                        {
                            // GetCell already sets the cell's data type, so no extra attributes are needed here.
                            var cell = GetCell(dr[x].GetType(), dr[x].ToString());
                            oxw.WriteElement(cell);
                        }

                        // this is for Row
                        oxw.WriteEndElement();
                    }

                    // this is for SheetData
                    oxw.WriteEndElement();

                    // this is for Worksheet
                    oxw.WriteEndElement();
                    oxw.Close();
                    oxw.Dispose();
                }
            }
        }

        // Phase 2, we've already written our large dataset, now we need to add the workbook, the sheets and
        // associate the dataset to a sheet.  This requires ReadWrite, it won't be a memory issue because this
        // part doesn't take much memory.
        using (var fs = File.Open(outputFileName, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
        {
            using (var package = Package.Open(fs, FileMode.Open, FileAccess.ReadWrite))
            {
                using (var excel = SpreadsheetDocument.Open(package))
                {
                    // Create the writer that will handle the outer portion of the spreadsheet, it will need to have
                    // these tags closed out when the spreadsheet is closed.
                    var oxw = OpenXmlWriter.Create(excel.WorkbookPart);
                    oxw.WriteStartElement(new Workbook());
                    oxw.WriteStartElement(new Sheets());

                    // Write the sheet entry, pointing it at the worksheet part created in phase 1.
                    oxw.WriteElement(new Sheet()
                    {
                        Name = workSheetName,
                        SheetId = 1,
                        Id = worksheetPartId
                    });

                    // this is for Sheets
                    oxw.WriteEndElement();

                    // this is for Workbook
                    oxw.WriteEndElement();
                    oxw.Close();
                    oxw.Dispose();
                }
            }
        }
    }

    /// <summary>
    /// Returns a spreadsheet <see cref="Cell"/> with its type set according to the .NET type of the data.
    /// </summary>
    /// <param name="type"></param>
    /// <param name="value">The CellValue for the returned <see cref="Cell"/></param>
    private static Cell GetCell(Type type, string value)
    {
        var cell = new Cell();

        if (type.ToString() == "System.RuntimeType")
        {
            cell.DataType = CellValues.String;
            cell.CellValue = new CellValue(value.SafeLeft(32767));
            return cell;
        }

        if (type.ToString() == "System.Guid")
        {
            Guid guidResult;
            Guid.TryParse(value, out guidResult);
            cell.DataType = CellValues.String;
            cell.CellValue = new CellValue(guidResult.ToString());
            return cell;
        }

        // Make sure the value isn't null before putting it into the cell.
        // If it is null, put a blank in the cell.  (A DBNull from the reader arrives here as "" via
        // ToString() and is handled by the default case below.)
        if (value == null)
        {
            cell.DataType = CellValues.String;
            cell.CellValue = new CellValue("");
            return cell;
        }

        var typeCode = Type.GetTypeCode(type);

        switch (typeCode)
        {
            case TypeCode.String:
                cell.DataType = CellValues.String;

                // `ToValidXmlAsciiCharacters` will remove any invalid XML characters falling in the ascii code range of 0-32
                cell.CellValue = new CellValue(value.SafeLeft(32767).ToValidXmlAsciiCharacters());
                break;

            case TypeCode.Int16:
            case TypeCode.Int32:
            case TypeCode.Int64:
            case TypeCode.Double:
            case TypeCode.Decimal:
            case TypeCode.Single:
            case TypeCode.UInt16:
            case TypeCode.UInt32:
            case TypeCode.UInt64:
                // Second most common cases
                cell.DataType = CellValues.Number;
                cell.CellValue = new CellValue(value);
                break;

            case TypeCode.DateTime:
                var dt = Convert.ToDateTime(value).Date;
                cell.DataType = CellValues.String;
                cell.CellValue = new CellValue($"{dt.Year}/{dt.MonthTwoCharacters()}/{dt.DayTwoCharacters()}");
                break;
            default:
                // Everything else
                cell.DataType = CellValues.String;
                cell.CellValue = new CellValue(value);
                break;
        }

        return cell;
    }
}
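
The sample calls four extension methods that are not included: SafeLeft, ToValidXmlAsciiCharacters, MonthTwoCharacters and DayTwoCharacters. Below is a minimal sketch of what they might look like. The method names come from the calls in GetCell, but the class name ExcelDocExtensions and the exact behavior (for example, keeping tab, line feed and carriage return) are assumptions inferred from how the methods are used, not the original implementations.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical stand-ins for the extension methods referenced by ExcelDoc.GetCell.
public static class ExcelDocExtensions
{
    // Returns at most the first `length` characters; null or empty input yields "".
    public static string SafeLeft(this string value, int length)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;
        return value.Length <= length ? value : value.Substring(0, length);
    }

    // Drops control characters below 0x20, which XML 1.0 does not allow,
    // while keeping tab, line feed and carriage return (assumed behavior).
    public static string ToValidXmlAsciiCharacters(this string value)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;

        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (c >= 0x20 || c == '\t' || c == '\n' || c == '\r')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Zero-padded month, e.g. March -> "03".
    public static string MonthTwoCharacters(this DateTime dt) =>
        dt.Month.ToString("00", CultureInfo.InvariantCulture);

    // Zero-padded day of month, e.g. the 7th -> "07".
    public static string DayTwoCharacters(this DateTime dt) =>
        dt.Day.ToString("00", CultureInfo.InvariantCulture);
}
```

The 32767 passed to SafeLeft in GetCell matches Excel's limit on the number of characters in a single cell.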

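Obviously this is not an ideal solution, but it will give you one large worksheet without an out-of-memory exception. I tested it on .NET 5, but it should work on 3.1 as well.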
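
Since the question starts from a flat file rather than a database, something has to present that file as an IDataReader. Here is a rough sketch of one way to do that without loading the file into memory. The DelimitedFileReader class is hypothetical and not part of the code above. It assumes the first line holds the column names and that fields contain no quoted or embedded delimiters. It implements only the members ToFile actually uses (Read, FieldCount, GetName and the indexer), plus a few cheap ones.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical helper: exposes a delimited text file as a forward-only IDataReader.
// Every value is returned as a string; only the current line is held in memory.
public sealed class DelimitedFileReader : IDataReader
{
    private readonly StreamReader _reader;
    private readonly char _delimiter;
    private readonly string[] _names;
    private string[] _current = Array.Empty<string>();

    public DelimitedFileReader(string path, char delimiter = ',')
    {
        _reader = new StreamReader(path);
        _delimiter = delimiter;
        // Assumption: the first line is a header row with the column names.
        _names = (_reader.ReadLine() ?? string.Empty).Split(delimiter);
    }

    // Advances to the next line of the file.
    public bool Read()
    {
        var line = _reader.ReadLine();
        if (line == null) return false;
        _current = line.Split(_delimiter);
        return true;
    }

    public int FieldCount => _names.Length;
    public string GetName(int i) => _names[i];
    // Short rows are padded with empty strings.
    public object GetValue(int i) => i < _current.Length ? _current[i] : string.Empty;
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));
    public int GetOrdinal(string name) => Array.IndexOf(_names, name);
    public string GetString(int i) => (string)GetValue(i);
    public Type GetFieldType(int i) => typeof(string);
    public string GetDataTypeName(int i) => "String";
    public bool IsDBNull(int i) => false;
    public int GetValues(object[] values)
    {
        int n = Math.Min(values.Length, FieldCount);
        for (int i = 0; i < n; i++) values[i] = GetValue(i);
        return n;
    }

    public int Depth => 0;
    public bool IsClosed { get; private set; }
    public int RecordsAffected => -1;
    public bool NextResult() => false;
    public void Close() { _reader.Dispose(); IsClosed = true; }
    public void Dispose() => Close();

    // Schema and typed getters are left unsupported in this sketch.
    public DataTable GetSchemaTable() => throw new NotSupportedException();
    public IDataReader GetData(int i) => throw new NotSupportedException();
    public bool GetBoolean(int i) => throw new NotSupportedException();
    public byte GetByte(int i) => throw new NotSupportedException();
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public char GetChar(int i) => throw new NotSupportedException();
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public DateTime GetDateTime(int i) => throw new NotSupportedException();
    public decimal GetDecimal(int i) => throw new NotSupportedException();
    public double GetDouble(int i) => throw new NotSupportedException();
    public float GetFloat(int i) => throw new NotSupportedException();
    public Guid GetGuid(int i) => throw new NotSupportedException();
    public short GetInt16(int i) => throw new NotSupportedException();
    public int GetInt32(int i) => throw new NotSupportedException();
    public long GetInt64(int i) => throw new NotSupportedException();
}
```

It could then be passed to the method above along the lines of `using (var dr = new DelimitedFileReader("input.txt", '|')) { ExcelDoc.ToFile("output.xlsx", dr, "Data"); }`, where the file names, delimiter and sheet name are placeholders. Because every field comes back as a string, all cells end up as string cells. Detecting numeric or date columns would need extra logic in the reader.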

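I've included the GitHub issue on the OpenXml library as well as the dotnet runtime issue where this problem was discussed and the workaround came from.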