The correct way to detect MIME types in PHP

5
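What is the most reliable way to detect a file's MIME type in PHP? The following code is widely recommended, but it fails to detect the MIME type of docx files: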
 $finfo = new finfo(FILEINFO_MIME_TYPE);
 $mime = $finfo->file($_FILES['file']['tmp_name']); 
 echo $mime; exit;  

This prints application/zip, but it should be application/vnd.openxmlformats-officedocument.wordprocessingml.document

6
docx/pptx/xlsx files are zip archives, just saying. This is possibly a duplicate of "PHP finfo_file function returns application/zip for DOCX files". - h2ooooooo
3
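Because a .docx file is a zip archive containing several XML files, finfo correctly identifies it as a zip file but does not look any deeper. You need to inspect the files inside the archive to determine whether they form the set of files that makes up an OfficeOpenXML Word document. - Mark Baker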
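The idea in this comment can be sketched with PHP's ZipArchive class (this requires the zip extension). The function name `detectOoxmlMime` and the directory-to-MIME map are illustrative assumptions, not part of the original thread. The check relies on the usual OOXML package layout (a `[Content_Types].xml` entry plus a `word/`, `ppt/` or `xl/` directory) and is a heuristic, not a full validation.

```php
<?php
// Hypothetical helper: the name and the prefix map below are assumptions for illustration.
function detectOoxmlMime(string $path): ?string
{
    // Top-level directory of each OOXML flavour mapped to its MIME type.
    $map = [
        'word/' => 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        'ppt/'  => 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
        'xl/'   => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
    ];

    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // not a readable zip archive
    }

    $mime = null;
    // Only treat the archive as OOXML if it carries [Content_Types].xml.
    if ($zip->locateName('[Content_Types].xml') !== false) {
        for ($i = 0; $i < $zip->numFiles && $mime === null; $i++) {
            $name = $zip->getNameIndex($i);
            if ($name === false) {
                continue;
            }
            foreach ($map as $prefix => $candidate) {
                if (strpos($name, $prefix) === 0) {
                    $mime = $candidate;
                    break;
                }
            }
        }
    }

    $zip->close();
    return $mime;
}
```

A caller could run finfo first and call this helper only when finfo reports application/zip; a null result then means the file is an ordinary zip archive or an unrecognised package.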
2 Answers

5

Based on this, I ported it to PHP:

function getMicrosoftOfficeMimeInfo($file) {
    $fileInfo = array(
        'word/' => array(
            'type'      => 'Microsoft Word 2007+',
            'mime'      => 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
            'extension' => 'docx'
        ),
        'ppt/' => array(
            'type'      => 'Microsoft PowerPoint 2007+',
            'mime'      => 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
            'extension' => 'pptx'
        ),
        'xl/' => array(
            'type'      => 'Microsoft Excel 2007+',
            'mime'      => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
            'extension' => 'xlsx'
        )
    );

    $pkEscapeSequence = "PK\x03\x04";

    $file = new BinaryFile($file);
    if ($file->bytesAre($pkEscapeSequence, 0x00)) {
        if ($file->bytesAre('[Content_Types].xml', 0x1E)) {
            if ($file->search($pkEscapeSequence, null, 2000)) {
                if ($file->search($pkEscapeSequence, null, 1000)) {
                    $offset = $file->tell() + 26;
                    foreach ($fileInfo as $searchWord => $info) {
                        $file->seek($offset);
                        if ($file->bytesAre($searchWord)) {
                            return $fileInfo[$searchWord];
                        }
                    }
                    return array(
                        'type'      => 'Microsoft OOXML',
                        'mime'      => null,
                        'extension' => null
                    );
                }
            }
        }
    }

    return false;
}

class BinaryFile_Exception extends Exception {}

class BinaryFile_Seek_Method {
    const ABSOLUTE = 1;
    const RELATIVE = 2;
}

class BinaryFile {
    const SEARCH_BUFFER_SIZE = 1024;

    private $handle;

    public function __construct($file) {
        // Open in binary mode so bytes are read unmodified on every platform
        $this->handle = fopen($file, 'rb');
        if ($this->handle === false) {
            throw new BinaryFile_Exception('Cannot open file');
        }
    }

    public function __destruct() {
        fclose($this->handle);
    }

    public function tell() {
        return ftell($this->handle);
    }

    public function seek($offset, $seekMethod = null) {
        if ($offset !== null) {
            if ($seekMethod === null) {
                $seekMethod = BinaryFile_Seek_Method::ABSOLUTE;
            }
            if ($seekMethod === BinaryFile_Seek_Method::RELATIVE) {
                $offset += $this->tell();
            }
            return fseek($this->handle, $offset);
        } else {
            return true;
        }
    }

    public function read($length) {
        return fread($this->handle, $length);
    }

    public function search($string, $offset = null, $maxLength = null, $seekMethod = null) {
        if ($offset !== null) {
            $this->seek($offset);
        } else {
            $offset = $this->tell();
        }

        $bytesRead = 0;
        $bufferSize = ($maxLength !== null ? min(self::SEARCH_BUFFER_SIZE, $maxLength) : self::SEARCH_BUFFER_SIZE);

        while ($read = $this->read($bufferSize)) {
            $bytesRead += strlen($read);
            $search = strpos($read, $string);

            if ($search !== false) {
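                // $offset is where the search started; add the bytes consumed by earlier buffers
                $this->seek($offset + ($bytesRead - strlen($read)) + $search + strlen($string));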
                return true;
            }

            if ($maxLength !== null) {
                $bufferSize = min(self::SEARCH_BUFFER_SIZE, $maxLength - $bytesRead);
                if ($bufferSize == 0) {
                    break;
                }
            }
        }
        return false;
    }

    public function getBytes($length, $offset = null, $seekMethod = null) {
        $this->seek($offset, $seekMethod);
        $read = $this->read($length);
        return $read;
    }

    public function bytesAre($string, $offset = null, $seekMethod = null) {
        return ($this->getBytes(strlen($string), $offset, $seekMethod) === $string);
    }
}

Usage:

$info = getMicrosoftOfficeMimeInfo('hi.docx');
/*
    Array
    (
        [type] => Microsoft Word 2007+
        [mime] => application/vnd.openxmlformats-officedocument.wordprocessingml.document
        [extension] => docx
    )
*/

$info = getMicrosoftOfficeMimeInfo('hi.xlsx');
/*
    Array
    (
        [type] => Microsoft Excel 2007+
        [mime] => application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
        [extension] => xlsx
    )
*/

$info = getMicrosoftOfficeMimeInfo('hi.pptx');
/*
    Array
    (
        [type] => Microsoft PowerPoint 2007+
        [mime] => application/vnd.openxmlformats-officedocument.presentationml.presentation
        [extension] => pptx
    )
*/

$info = getMicrosoftOfficeMimeInfo('hi.zip');
// bool(false)

1

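Add the MIME types you need to your MIME type configuration files (/etc/magic.mime; /etc/mime.types). Alternatively, you can use a ready-made magic.types file (search the web for one) and point to it with the mime_magic.magicfile option in php.ini.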
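As a related sketch, and not the php.ini option this answer describes, the finfo class from the question also accepts a path to a magic database as its second constructor argument. The wrapper name `mimeWithMagic` and the path in the usage note are hypothetical. Whether docx files are then reported correctly depends entirely on the rules in the database you supply.

```php
<?php
// Hypothetical wrapper: the function name is an assumption for illustration.
// $magicFile is the path to a custom magic database; null means "use the default one".
function mimeWithMagic(string $file, ?string $magicFile = null)
{
    // Fall back to the default database when no readable custom file is given.
    if ($magicFile !== null && !is_readable($magicFile)) {
        $magicFile = null;
    }

    $finfo = new finfo(FILEINFO_MIME_TYPE, $magicFile);

    // Returns the detected MIME type as a string, or false on failure.
    return $finfo->file($file);
}
```

For an upload this could be called as `mimeWithMagic($_FILES['file']['tmp_name'], '/path/to/custom/magic')`, where the second argument is a placeholder for wherever your own magic file lives.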

