PhpSpreadsheet如何处理大数据量

4
我有一个包含3070个值的多维数组。
$tbl= array(
  array(
    "KDNR" => 1,
    "GESCHL" => "test",
    "TITEL" => "test",
    "VORNAME" => "test",
    "FAMNAME" => "test",
    "PLZ" => "test",
    "ORT" => "test",
    "STRASSE" => "test",
    "EMAIL" => "test",
    "PRIVTEL" => "test"
  ),
  "KDNR" => 2,
    "GESCHL" => "test2",
    "TITEL" => "test2",
    "VORNAME" => "test2",
    "FAMNAME" => "test2",
    "PLZ" => "test2",
    "ORT" => "test2",
    "STRASSE" => "test2",
    "EMAIL" => "test2",
    "PRIVTEL" => "test2"
  ),
  etc...
);

我希望将包含3070个数组的数组tbl写入到xlsx文件中。

我使用PhpSpreadsheet来实现这一目标。

以下是我的php代码:

<?php
//call the autoload
require($_SERVER['DOCUMENT_ROOT'].'/src/phpspreadsheet/vendor/autoload.php');
//load phpspreadsheet class using namespaces
use PhpOffice\PhpSpreadsheet\Spreadsheet;
//call iofactory instead of xlsx writer
use PhpOffice\PhpSpreadsheet\Aligment;
use PhpOffice\PhpSpreadsheet\Fill;
use PhpOffice\PhpSpreadsheet\IOFactory;


//load from xlsx template
$reader = IOFactory::createReader('Xlsx');
$spreadsheet = $reader->load($_SERVER['DOCUMENT_ROOT']. '/src/ExcelVorlagen/polbezirk_template.xlsx');

//loop the data
$contentStartRow = 3;
$currentContenRow = 3;

//set coulm dimension to auto size
$spreadsheet->getActiveSheet()
            ->getColumnDimension('A')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('B')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('C')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('D')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('E')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('F')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('G')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('H')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('I')
            ->setAutoSize(true);
$spreadsheet->getActiveSheet()
            ->getColumnDimension('J')
            ->setAutoSize(true);

xdebug_break();
foreach($tbl as $item){
    //insert a row after current row (before current row + 1)
    $spreadsheet->getActiveSheet()->insertNewRowBefore($currentContenRow + 1,1);

    //fill the cell with Data
    $spreadsheet->getActiveSheet()
        ->setCellValue('A'.$currentContenRow, $item['KDNR'])
        ->setCellValue('B'.$currentContenRow, $item['GESCHL'])
        ->setCellValue('C'.$currentContenRow, $item['TITEL'])
        ->setCellValue('D'.$currentContenRow, $item['VORNAME'])
        ->setCellValue('E'.$currentContenRow, $item['FAMNAME'])
        ->setCellValue('F'.$currentContenRow, $item['PLZ'])
        ->setCellValue('G'.$currentContenRow, $item['ORT'])
        ->setCellValue('H'.$currentContenRow, $item['STRASSE'])
        ->setCellValue('I'.$currentContenRow, $item['EMAIL'])
        ->setCellValue('J'.$currentContenRow, $item['PRIVTEL']);
    //increment the current row number
    $currentContenRow++;                 
}

//remove last empty rows
//$spreadsheet->getActiveSheet()->removeRow($currentContenRow,2); 

//set the header first, so the result will be treated as an xlsx file
header('Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet');

//make it an attachment so we can define filename
header('Content-Disposition: attachment;filename="result.xlsx"');

//create IOFactory object
$writer = IOFactory::createWriter($spreadsheet, 'Xlsx');
//save int php output
$writer->save('php://output');


当我执行代码时,创建xlsx文件需要49分钟,对于只有3070行来说,这太长了。是否有更快的方法?还是我的代码存在瓶颈?
希望你们能帮助我。
最好的问候。

将 $spreadsheet->getActiveSheet() 存储在一个变量中并引用该变量而不是一遍又一遍地调用该方法,这样有帮助吗? - kalyfe
如果您知道要添加多少行,请将其用作insertNewRow的第二个参数,并仅调用该方法一次:https://phpoffice.github.io/PhpSpreadsheet/master/PhpOffice/PhpSpreadsheet/Worksheet/Worksheet.html#method_insertNewColumnBefore - kalyfe
3个回答

2
非常感谢你,kalyfe。 我将我的代码修改为以下形式:
$row = count($tbl);

$spreadsheet->getActiveSheet()->insertNewRowBefore($currentContenRow + 1, $row);

foreach($tbl as $item){
    //fill the cell with Data
    $spreadsheet->getActiveSheet()
        ->setCellValue('A'.$currentContenRow, $item['KDNR'])
        ->setCellValue('B'.$currentContenRow, $item['GESCHL'])
        ->setCellValue('C'.$currentContenRow, $item['TITEL'])
        ->setCellValue('D'.$currentContenRow, $item['VORNAME'])
        ->setCellValue('E'.$currentContenRow, $item['FAMNAME'])
        ->setCellValue('F'.$currentContenRow, $item['PLZ'])
        ->setCellValue('G'.$currentContenRow, $item['ORT'])
        ->setCellValue('H'.$currentContenRow, $item['STRASSE'])
        ->setCellValue('I'.$currentContenRow, $item['EMAIL'])
        ->setCellValue('J'.$currentContenRow, $item['PRIVTEL']);
    //increment the current row number
    $currentContenRow++;                 
}

现在创建 xlsx 文件需要 15 秒钟。


1

最近我也需要做类似的工作,想和大家分享一下,希望能有所帮助。

这段代码会对原始数组($tbl)进行重新格式化,将一个列标题 record 注入到数组开头,以便正确格式化数据并使其可以被 PhpSpreadsheet 处理并写入 .xlsx 文件中。

用于处理数据的函数为:$spreadsheet->getActiveSheet()->fromArray()(见下方)。

<?php

use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheet\Writer\Xlsx;

require dirname(__DIR__, 1) . "/vendor/autoload.php";

// the original array
$tbl = [
    [
    "KDNR" => 1,
    "GESCHL" => "test",
    "TITEL" => "test",
    "VORNAME" => "test",
    "FAMNAME" => "test",
    "PLZ" => "test",
    "ORT" => "test",
    "STRASSE" => "test",
    "EMAIL" => "test",
    "PRIVTEL" => "test"
    ],
    [
    "KDNR" => 2,
    "GESCHL" => "test2",
    "TITEL" => "test2",
    "VORNAME" => "test2",
    "FAMNAME" => "test2",
    "PLZ" => "test2",
    "ORT" => "test2",
    "STRASSE" => "test2",
    "EMAIL" => "test2",
    "PRIVTEL" => "test2"
    ],
];

/*
 * inject header 'record'.
 */
$headers = array_keys($tbl[0]); // get headers from source array
array_unshift($tbl, $headers); // insert headers as first record

/*
 * write data to xlsx file
 */
$spreadsheet = new Spreadsheet();
// build spreadsheet from array
$spreadsheet->getActiveSheet()->fromArray($tbl,
    NULL, // array values with this value will not be set
    'A1');

// write array data to xlsx file
$writer = new Xlsx($spreadsheet);
$writer->save('yourfile.xlsx');

重新排列后的$tbl数组已经准备好被$spreadsheet->getActiveSheet()->fromArray()处理,如下所示:
Array
(
    [0] => Array
        (
            [0] => KDNR
            [1] => GESCHL
            [2] => TITEL
            [3] => VORNAME
            [4] => FAMNAME
            [5] => PLZ
            [6] => ORT
            [7] => STRASSE
            [8] => EMAIL
            [9] => PRIVTEL
        )

    [1] => Array
        (
            [KDNR] => 1
            [GESCHL] => test
            [TITEL] => test
            [VORNAME] => test
            [FAMNAME] => test
            [PLZ] => test
            [ORT] => test
            [STRASSE] => test
            [EMAIL] => test
            [PRIVTEL] => test
        )

    [2] => Array
        (
            [KDNR] => 2
            [GESCHL] => test2
            [TITEL] => test2
            [VORNAME] => test2
            [FAMNAME] => test2
            [PLZ] => test2
            [ORT] => test2
            [STRASSE] => test2
            [EMAIL] => test2
            [PRIVTEL] => test2
        )

)

第一条记录将用于设置列标题,后续记录是行数据。
生成的xlsx文件: enter image description here

0
我会使用https://github.com/aVadim483/fast-excel-writer进行测试,在我的笔记本上创建包含10K行的xlsx文件只需要1.12秒。
require 'src/autoload.php';

$row = [
    "KDNR" => 1,
    "GESCHL" => "test",
    "TITEL" => "test",
    "VORNAME" => "test",
    "FAMNAME" => "test",
    "PLZ" => "test",
    "ORT" => "test",
    "STRASSE" => "test",
    "EMAIL" => "test",
    "PRIVTEL" => "test"
];
$tbl = [];
// fill $tpl
for ($i = 0; $i < 10000; $i++) {
    $tbl[] = $row;
}

$excel = \avadim\FastExcelWriter\Excel::create();
$sheet = $excel->getSheet();

$timer = microtime(true);

$sheet->writeRow(array_keys($row));
foreach($tbl as $row) {
    $sheet->writeRow($row);
}

$excel->save('simple.xlsx');

echo 'elapsed time: ', round(microtime(true) - $timer, 3), ' sec';

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接