How to unzip files recursively in Java?

35

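I have a zip file that contains other zip files.

For example, the main file is abc.zip, which contains xyz.zip, class1.java, and class2.java, while xyz.zip contains class3.java and class4.java.

So I need to use Java to extract the zip file into a single folder that should contain class1.java, class2.java, class3.java, and class4.java.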


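This will really mess with the task: http://aioobe.org/zip-quine/ (an infinitely recursive zip-in-a-zip-in-a-zip-in-a-...) - GKFX
1
Just a thought: none of the answers provided here actually work, because they do not take nested zip files into account. - ha9u63a7
7
Please note that the code in the answers to this question currently has a security vulnerability. For more information, see here - eldruin
10 Answers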

86

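Warning: this code is intended for trusted zip files. It does not validate paths before writing, which can lead to a security vulnerability, as described in zip-slip-vulnerability, if you use it to unzip archives uploaded by unknown clients.


This solution is very similar to the solutions posted previously, but it recreates the correct folder structure when unzipping.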

public static void extractFolder(String zipFile) throws IOException {
int buffer = 2048;
File file = new File(zipFile);

try (ZipFile zip = new ZipFile(file)) {
  String newPath = zipFile.substring(0, zipFile.length() - 4);

  new File(newPath).mkdir();
  Enumeration<? extends ZipEntry> zipFileEntries = zip.entries();

  // Process each entry
  while (zipFileEntries.hasMoreElements()) {
    // grab a zip file entry
    ZipEntry entry = zipFileEntries.nextElement();
    String currentEntry = entry.getName();
    File destFile = new File(newPath, currentEntry);
    File destinationParent = destFile.getParentFile();

    // create the parent directory structure if needed
    destinationParent.mkdirs();

    if (!entry.isDirectory()) {
      BufferedInputStream is = new BufferedInputStream(zip.getInputStream(entry));
      int currentByte;
      // establish buffer for writing file
      byte[] data = new byte[buffer];

      // write the current file to disk
      FileOutputStream fos = new FileOutputStream(destFile);
      try (BufferedOutputStream dest = new BufferedOutputStream(fos, buffer)) {

        // read and write until last byte is encountered
        while ((currentByte = is.read(data, 0, buffer)) != -1) {
          dest.write(data, 0, currentByte);
        }
        dest.flush();
        is.close();
      }
    }

    if (currentEntry.endsWith(".zip")) {
      // found a zip file, try to open
      extractFolder(destFile.getAbsolutePath());
    }
  }
}

}
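
The warning at the top of this answer mentions path validation before writing. A minimal sketch of such a check follows, assuming a hypothetical helper named resolveSafely (it is not part of the answer's code) that could stand in for the plain new File(newPath, currentEntry) call. It compares canonical paths, which is the approach usually recommended against zip slip.

```java
import java.io.File;
import java.io.IOException;

public final class ZipSlipGuard {
  /**
   * Resolves a zip entry name against the target directory and rejects
   * names (for example "../evil.sh") that would end up outside of it.
   * The class and method names are illustrative assumptions.
   */
  public static File resolveSafely(File targetDir, String entryName) throws IOException {
    File destFile = new File(targetDir, entryName);
    // The trailing separator stops "/tmp/out-evil" from matching "/tmp/out".
    String targetPath = targetDir.getCanonicalPath() + File.separator;
    if (!destFile.getCanonicalPath().startsWith(targetPath)) {
      throw new IOException("Entry is outside of the target directory: " + entryName);
    }
    return destFile;
  }
}
```

With this helper, an entry whose name climbs out of the extraction folder causes an IOException instead of a write to an unexpected location.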



1
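I know this is old, but what is the point, if any, of leaving in the commented-out line //destFile = new File(newPath, destFile.getName());? - Liam
2
@Liam, it has no real significance. I was just trying different ways to get the current file name, and I decided to use currentEntry instead of destFile.getName(). - NeilMonday
27
This code has a security vulnerability! The paths inside the zip file must be validated first. See this link for details: https://snyk.io/research/zip-slip-vulnerability - eldruin
Hmm, I'm afraid I don't see how this fixes the vulnerability. Shouldn't the canonical path of the extraction directory be checked? I can't find that in this example. Why do you think this is safer? - Eric
You don't have to write the nested zip file to disk first and then load it; you can use the ZipInputStream class directly. - Marcono1234
To write the contents of a zip entry's input stream, you can use Files.copy(InputStream, Path, CopyOption...). - Marcono1234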
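
The two suggestions above (ZipInputStream plus Files.copy) can be combined into a rough sketch like the one below. The class name StreamingUnzip, the MAX_DEPTH limit, and the choice to unpack a nested archive next to where it would have been written are all assumptions made for illustration, not something the commenters specified.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public final class StreamingUnzip {
  // Assumed nesting limit, so a self-containing archive cannot recurse forever.
  private static final int MAX_DEPTH = 16;

  /** Extracts zipFile into targetDir, unpacking nested zips in place of the nested file. */
  public static void unzip(Path zipFile, Path targetDir) throws IOException {
    try (InputStream in = Files.newInputStream(zipFile)) {
      unzip(in, targetDir.toAbsolutePath().normalize(), 0);
    }
  }

  private static void unzip(InputStream in, Path targetDir, int depth) throws IOException {
    if (depth > MAX_DEPTH) {
      throw new IOException("Zip files are nested more than " + MAX_DEPTH + " levels deep");
    }
    // Deliberately not closed here: closing it would also close the enclosing stream.
    ZipInputStream zin = new ZipInputStream(in);
    ZipEntry entry;
    while ((entry = zin.getNextEntry()) != null) {
      Path dest = targetDir.resolve(entry.getName()).normalize();
      if (!dest.startsWith(targetDir)) {
        throw new IOException("Entry escapes the target directory: " + entry.getName());
      }
      if (entry.isDirectory()) {
        Files.createDirectories(dest);
      } else if (entry.getName().toLowerCase().endsWith(".zip")) {
        // Read the nested archive straight from the stream, without a temporary file.
        unzip(zin, dest.getParent(), depth + 1);
      } else {
        Files.createDirectories(dest.getParent());
        Files.copy(zin, dest, StandardCopyOption.REPLACE_EXISTING);
      }
    }
  }
}
```

For the archive in the question, StreamingUnzip.unzip(Path.of("abc.zip"), Path.of("out")) would be expected to place class1.java through class4.java in out, with no xyz.zip written to disk.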

9

Here is some untested code, based on some old code I had for unzipping files.

public void doUnzip(String inputZip, String destinationDirectory)
        throws IOException {
    int BUFFER = 2048;
    List zipFiles = new ArrayList();
    File sourceZipFile = new File(inputZip);
    File unzipDestinationDirectory = new File(destinationDirectory);
    unzipDestinationDirectory.mkdir();

    ZipFile zipFile;
    // Open Zip file for reading
    zipFile = new ZipFile(sourceZipFile, ZipFile.OPEN_READ);

    // Create an enumeration of the entries in the zip file
    Enumeration zipFileEntries = zipFile.entries();

    // Process each entry
    while (zipFileEntries.hasMoreElements()) {
        // grab a zip file entry
        ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();

        String currentEntry = entry.getName();

        File destFile = new File(unzipDestinationDirectory, currentEntry);
        destFile = new File(unzipDestinationDirectory, destFile.getName());

        if (currentEntry.endsWith(".zip")) {
            zipFiles.add(destFile.getAbsolutePath());
        }

        // grab file's parent directory structure
        File destinationParent = destFile.getParentFile();

        // create the parent directory structure if needed
        destinationParent.mkdirs();

        try {
            // extract file if not a directory
            if (!entry.isDirectory()) {
                BufferedInputStream is =
                        new BufferedInputStream(zipFile.getInputStream(entry));
                int currentByte;
                // establish buffer for writing file
                byte data[] = new byte[BUFFER];

                // write the current file to disk
                FileOutputStream fos = new FileOutputStream(destFile);
                BufferedOutputStream dest =
                        new BufferedOutputStream(fos, BUFFER);

                // read and write until last byte is encountered
                while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                    dest.write(data, 0, currentByte);
                }
                dest.flush();
                dest.close();
                is.close();
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
    zipFile.close();

    for (Iterator iter = zipFiles.iterator(); iter.hasNext();) {
        String zipName = (String)iter.next();
        doUnzip(
            zipName,
            destinationDirectory +
                File.separatorChar +
                zipName.substring(0,zipName.lastIndexOf(".zip"))
        );
    }

}

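Why do you use the "throws" keyword in the method declaration when you actually catch and log the exception? Doesn't that mean callers will expect IOExceptions that are never thrown? - Anson MacKeracher
1
What happens if the input is a directory? - fastcodejava
I believe ZIP files can only store files, not directories. - Charlie
1
@Charlie I don't think that's correct. If I use PeaZip to extract a zip file that contains a directory structure, the resulting directories correctly recreate the folder structure. This method seems to grab all the files, regardless of their directory structure, and place them in the base destination directory. - NeilMonday
1
After removing line 47, destFile = new File(unzipDestinationDirectory, destFile.getName());, the code above works perfectly. - Harish
@Harish man, you are awesome. Such a simple solution. Great stuff. And Charlie, nice work on the quick extraction too. - Andro Selva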

7

I took ca.anderson4's code, removed the List zipFiles, rewrote it a bit, and this is what I got:

public class Unzip {

public void unzip(String zipFile) throws ZipException,
        IOException {

    System.out.println(zipFile);
    int BUFFER = 2048;
    File file = new File(zipFile);

    ZipFile zip = new ZipFile(file);
    String newPath = zipFile.substring(0, zipFile.length() - 4);

    new File(newPath).mkdir();
    Enumeration zipFileEntries = zip.entries();

    // Process each entry
    while (zipFileEntries.hasMoreElements()) {
        // grab a zip file entry
        ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();

        String currentEntry = entry.getName();

        File destFile = new File(newPath, currentEntry);
        destFile = new File(newPath, destFile.getName());
        File destinationParent = destFile.getParentFile();

        // create the parent directory structure if needed
        destinationParent.mkdirs();
        if (!entry.isDirectory()) {
            BufferedInputStream is = new BufferedInputStream(zip
                    .getInputStream(entry));
            int currentByte;
            // establish buffer for writing file
            byte data[] = new byte[BUFFER];

            // write the current file to disk
            FileOutputStream fos = new FileOutputStream(destFile);
            BufferedOutputStream dest = new BufferedOutputStream(fos,
                    BUFFER);

            // read and write until last byte is encountered
            while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                dest.write(data, 0, currentByte);
            }
            dest.flush();
            dest.close();
            is.close();
        }
        if (currentEntry.endsWith(".zip")) {
            // found a zip file, try to open
            unzip(destFile.getAbsolutePath());
        }
    }
}

public static void main(String[] args) {
    Unzip unzipper=new Unzip();
    try {
        unzipper.unzip("test/test.zip");
    } catch (ZipException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

}

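I tested it and it works.


That code was actually written by ca.anderson4; I only edited it (I don't like scrolling left and right). - Michael Myers
OK, I didn't see that :-), so thanks to ca.anderson4 and to you for the edit ;-) - fero46
This code puts the files from all subdirectories into a single folder, which destroys the archive's structure. - Ibolit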
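
The flattening described in the last comment comes from the line destFile = new File(newPath, destFile.getName());. A small sketch of the difference, using a hypothetical EntryPathDemo class that is not part of any answer:

```java
import java.io.File;

public final class EntryPathDemo {
  /** Keeps the directories stored in the entry name, for example src/pkg/A.java under targetDir. */
  public static File keepStructure(File targetDir, String entryName) {
    return new File(targetDir, entryName);
  }

  /** Keeps only the last path segment, so every file lands directly in targetDir. */
  public static File flatten(File targetDir, String entryName) {
    return new File(targetDir, new File(entryName).getName());
  }
}
```

With the flattening variant, two entries such as a/X.java and b/X.java map to the same destination, so the later one overwrites the earlier one. That may be exactly what the question asks for (all classes in one folder), but it is worth knowing that it is lossy.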

2

While testing, I noticed that File.mkdirs() does not work properly under Windows...

/** Recreates all parent directories for the given full path name. */

    private void createParentHierarchy(String parentName) throws IOException {
        File parent = new File(parentName);
        String[] parentsStrArr = parent.getAbsolutePath().split("/".equals(File.separator) ? "/" : "\\\\");

        //create the parents of the parent
        for(int i=0; i < parentsStrArr.length; i++){
            StringBuffer currParentPath = new StringBuffer();
            for(int j = 0; j < i; j++){
                currParentPath.append(parentsStrArr[j]+File.separator);
            }
            File currParent = new File(currParentPath.toString());
            if(!currParent.isDirectory()){
                boolean created = currParent.mkdir();
                if(isVerbose)log("creating directory "+currParent.getAbsolutePath());
            }
        }

        //create the parent itself
        if(!parent.isDirectory()){
            boolean success = parent.mkdir();
        }
    }
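
As a possible alternative to the hand-written loop above, java.nio.file.Files.createDirectories creates all missing parents in one call and reports failure with an exception instead of a boolean. Whether it avoids the Windows problem the author ran into is not something this sketch can confirm; the ParentDirs class name is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public final class ParentDirs {
  /** Creates every missing parent directory of the given file path. */
  public static void createParentHierarchy(Path file) throws IOException {
    Path parent = file.toAbsolutePath().getParent();
    if (parent != null) {
      // Does nothing if the directory already exists; throws IOException on failure.
      Files.createDirectories(parent);
    }
  }
}
```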

2

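I modified this to suit my needs and mixed in parts of the top answers. This version will:

  • Recursively extract zip files to a specified location

  • Create empty directories

  • Properly close the zip file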


public static void unZipAll(File source, File destination) throws IOException 
{
    System.out.println("Unzipping - " + source.getName());
    int BUFFER = 2048;

    ZipFile zip = new ZipFile(source);
    try{
        destination.getParentFile().mkdirs();
        Enumeration zipFileEntries = zip.entries();

        // Process each entry
        while (zipFileEntries.hasMoreElements())
        {
            // grab a zip file entry
            ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
            String currentEntry = entry.getName();
            File destFile = new File(destination, currentEntry);
            //destFile = new File(newPath, destFile.getName());
            File destinationParent = destFile.getParentFile();

            // create the parent directory structure if needed
            destinationParent.mkdirs();

            if (!entry.isDirectory())
            {
                BufferedInputStream is = null;
                FileOutputStream fos = null;
                BufferedOutputStream dest = null;
                try{
                    is = new BufferedInputStream(zip.getInputStream(entry));
                    int currentByte;
                    // establish buffer for writing file
                    byte data[] = new byte[BUFFER];

                    // write the current file to disk
                    fos = new FileOutputStream(destFile);
                    dest = new BufferedOutputStream(fos, BUFFER);

                    // read and write until last byte is encountered
                    while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                        dest.write(data, 0, currentByte);
                    }
                } catch (Exception e){
                    System.out.println("unable to extract entry:" + entry.getName());
                    throw e;
                } finally{
                    if (dest != null){
                        dest.close();
                    }
                    if (fos != null){
                        fos.close();
                    }
                    if (is != null){
                        is.close();
                    }
                }
            }else{
                //Create directory
                destFile.mkdirs();
            }

            if (currentEntry.endsWith(".zip"))
            {
                // found a zip file, try to extract
                unZipAll(destFile, destinationParent);
                if(!destFile.delete()){
                    System.out.println("Could not delete zip");
                }
            }
        }
    } catch(Exception e){
        e.printStackTrace();
        System.out.println("Failed to successfully unzip:" + source.getName());
    } finally {
        zip.close();
    }
    System.out.println("Done Unzipping:" + source.getName());
}

1

The zip file should be closed after unzipping.

static public void extractFolder(String zipFile) throws ZipException, IOException 
{
    System.out.println(zipFile);
    int BUFFER = 2048;
    File file = new File(zipFile);

    ZipFile zip = new ZipFile(file);
    try
    { 
       ...code from other answers ( ex. NeilMonday )...
    }
    finally
    {
        zip.close();
    }
}
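
On Java 7 and later, the same guarantee can be written with try-with-resources, since ZipFile implements Closeable. A minimal sketch, where the CloseZipDemo class and its countEntries method are placeholders standing in for the real extraction logic:

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

public final class CloseZipDemo {
  /** Counts the entries of a zip file; the archive is closed even if an exception is thrown. */
  public static int countEntries(File file) throws IOException {
    try (ZipFile zip = new ZipFile(file)) {
      return zip.size();
    }
  }
}
```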

0
Same as NeilMonday's answer, but it also extracts empty directories:
static public void extractFolder(String zipFile) throws ZipException, IOException 
{
    System.out.println(zipFile);
    int BUFFER = 2048;
    File file = new File(zipFile);

    ZipFile zip = new ZipFile(file);
    String newPath = zipFile.substring(0, zipFile.length() - 4);

    new File(newPath).mkdir();
    Enumeration zipFileEntries = zip.entries();

    // Process each entry
    while (zipFileEntries.hasMoreElements())
    {
        // grab a zip file entry
        ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
        String currentEntry = entry.getName();
        File destFile = new File(newPath, currentEntry);
        //destFile = new File(newPath, destFile.getName());
        File destinationParent = destFile.getParentFile();

        // create the parent directory structure if needed
        destinationParent.mkdirs();

        if (!entry.isDirectory())
        {
            BufferedInputStream is = new BufferedInputStream(zip
            .getInputStream(entry));
            int currentByte;
            // establish buffer for writing file
            byte data[] = new byte[BUFFER];

            // write the current file to disk
            FileOutputStream fos = new FileOutputStream(destFile);
            BufferedOutputStream dest = new BufferedOutputStream(fos,
            BUFFER);

            // read and write until last byte is encountered
            while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
                dest.write(data, 0, currentByte);
            }
            dest.flush();
            dest.close();
            is.close();
        }
        else{
            destFile.mkdirs();
        }
        if (currentEntry.endsWith(".zip"))
        {
            // found a zip file, try to open
            extractFolder(destFile.getAbsolutePath());
        }
    }
}

0

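No third-party dependencies, guards against zip slip, fully commented, recreates the directory structure recursively, ignores empty directories, reasonable source code nesting, extracts into the zip file's directory, and uses UTF-8. Usage: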

Path zipFile = Path.of( "/path/to/filename.zip" );
Zip.extract( zipFile );

Here's the code:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

import static java.nio.file.Files.createDirectories;
import static java.nio.file.StandardCopyOption.REPLACE_EXISTING;

/**
 * Responsible for managing zipped archive files.
 */
public final class Zip {
  /**
   * Extracts the contents of the zip archive into its current directory. The
   * contents of the archive must be {@link StandardCharsets#UTF_8}. For
   * example, if the {@link Path} is <code>/tmp/filename.zip</code>, then
   * the contents of the file will be extracted into <code>/tmp</code>.
   *
   * @param zipPath The {@link Path} to the zip file to extract.
   * @throws IOException Could not extract the zip file, zip entries, or find
   *                     the parent directory that contains the path to the
   *                     zip archive.
   */
  public static void extract( final Path zipPath ) throws IOException {
    assert !zipPath.toFile().isDirectory();

    try( final var zipFile = new ZipFile( zipPath.toFile() ) ) {
      iterate( zipFile );
    }
  }

  /**
   * Extracts each entry in the zip archive file.
   *
   * @param zipFile The archive to extract.
   * @throws IOException Could not extract the zip file entry.
   */
  private static void iterate( final ZipFile zipFile )
    throws IOException {
    // Determine the directory name where the zip archive resides. Files will
    // be extracted relative to that directory.
    final var path = getDirectory( zipFile );
    final var entries = zipFile.entries();

    while( entries.hasMoreElements() ) {
      final var zipEntry = entries.nextElement();
      final var zipEntryPath = path.resolve( zipEntry.getName() );

      // Guard against zip slip.
      if( zipEntryPath.normalize().startsWith( path ) ) {
        extract( zipFile, zipEntry, zipEntryPath );
      }
    }
  }

  /**
   * Extracts a single entry of a zip file to a given directory. This will
   * create the necessary directory path if it doesn't exist. Empty
   * directories are not re-created.
   *
   * @param zipFile      The zip archive to extract.
   * @param zipEntry     An entry in the zip archive.
   * @param zipEntryPath The file location to write the zip entry.
   * @throws IOException Could not extract the zip file entry.
   */
  private static void extract(
    final ZipFile zipFile,
    final ZipEntry zipEntry,
    final Path zipEntryPath ) throws IOException {
    // Only attempt to extract files, skipping empty directories.
    if( !zipEntry.isDirectory() ) {
      createDirectories( zipEntryPath.getParent() );

      try( final var in = zipFile.getInputStream( zipEntry ) ) {
        Files.copy( in, zipEntryPath, REPLACE_EXISTING );
      }
    }
  }

  /**
   * Helper method to return the normalized directory where the given archive
   * resides.
   *
   * @param zipFile The {@link ZipFile} having a path to normalize.
   * @return The directory containing the given {@link ZipFile}.
   * @throws IOException The zip file has no parent directory.
   */
  private static Path getDirectory( final ZipFile zipFile ) throws IOException {
    final var zipPath = Path.of( zipFile.getName() );
    final var parent = zipPath.getParent();

    if( parent == null ) {
      throw new IOException( zipFile.getName() + " has no parent directory." );
    }

    return parent.normalize();
  }
}

Now that you have the core algorithm, you need to check whether a file's extension is ".zip" and, if so, recursively call Zip.extract( ... ) on that file.
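
One way that recursive step might look is sketched below. Everything in it is an assumption layered on top of the answer: the NestedZips class, the Extractor callback (which would be Zip::extract in the answer's terms), and the maxDepth parameter, which is there because of the self-containing archive mentioned in the question's comments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public final class NestedZips {
  /** Hypothetical callback that extracts one archive into its own directory. */
  @FunctionalInterface
  public interface Extractor {
    void extract(Path zipPath) throws IOException;
  }

  /** Extracts zipPath, then recurses into every ".zip" entry that was written to disk. */
  public static void extractRecursively(Path zipPath, Extractor extractor, int maxDepth)
      throws IOException {
    if (maxDepth < 0) {
      throw new IOException("Archives are nested too deeply: " + zipPath);
    }
    extractor.extract(zipPath);
    for (Path nested : nestedZips(zipPath)) {
      if (Files.isRegularFile(nested)) {
        extractRecursively(nested, extractor, maxDepth - 1);
      }
    }
  }

  /** Lists where the ".zip" entries of the archive land when extracted beside it. */
  public static List<Path> nestedZips(Path zipPath) throws IOException {
    Path dir = zipPath.toAbsolutePath().normalize().getParent();
    List<Path> result = new ArrayList<>();
    try (ZipFile zip = new ZipFile(zipPath.toFile())) {
      Enumeration<? extends ZipEntry> entries = zip.entries();
      while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        Path target = dir.resolve(entry.getName()).normalize();
        boolean isZip = entry.getName().toLowerCase().endsWith(".zip");
        // Same zip slip guard as the answer: ignore entries outside the directory.
        if (!entry.isDirectory() && isZip && target.startsWith(dir)) {
          result.add(target);
        }
      }
    }
    return result;
  }
}
```

A call might then read NestedZips.extractRecursively( zipFile, Zip::extract, 10 ). Reading the nested names from the archive itself, rather than scanning the directory, avoids picking up unrelated zip files that already sat next to the archive.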


0

Here is some code that I tested, and it works very well:

package com.test;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class Unzipper {  
    private final static int BUFFER_SIZE = 2048;
    private final static String ZIP_FILE = "/home/anton/test/test.zip";
    private final static String DESTINATION_DIRECTORY = "/home/anton/test/";
    private final static String ZIP_EXTENSION = ".zip";
 
    public static void main(String[] args) {
     System.out.println("Trying to unzip file " + ZIP_FILE); 
        Unzipper unzip = new Unzipper();  
        if (unzip.unzipToFile(ZIP_FILE, DESTINATION_DIRECTORY)) {
         System.out.println("Successfully unzipped to the directory " 
             + DESTINATION_DIRECTORY);
        } else {
         System.out.println("There was some error during extracting archive to the directory " 
             + DESTINATION_DIRECTORY);
        }
    } 

 public boolean unzipToFile(String srcZipFileName,
   String destDirectoryName) {
  try {
   BufferedInputStream bufIS = null;
   // create the destination directory structure (if needed)
   File destDirectory = new File(destDirectoryName);
   destDirectory.mkdirs();

   // open archive for reading
   File file = new File(srcZipFileName);
   ZipFile zipFile = new ZipFile(file, ZipFile.OPEN_READ);

   //for every zip archive entry do
   Enumeration<? extends ZipEntry> zipFileEntries = zipFile.entries();
   while (zipFileEntries.hasMoreElements()) {
    ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
    System.out.println("\tExtracting entry: " + entry);

    //create destination file
    File destFile = new File(destDirectory, entry.getName());

    //create parent directories if needed
    File parentDestFile = destFile.getParentFile();    
    parentDestFile.mkdirs();    
    
    if (!entry.isDirectory()) {
     bufIS = new BufferedInputStream(
       zipFile.getInputStream(entry));
     int currentByte;

     // buffer for writing file
     byte data[] = new byte[BUFFER_SIZE];

     // write the current file to disk
     FileOutputStream fOS = new FileOutputStream(destFile);
     BufferedOutputStream bufOS = new BufferedOutputStream(fOS, BUFFER_SIZE);

     while ((currentByte = bufIS.read(data, 0, BUFFER_SIZE)) != -1) {
      bufOS.write(data, 0, currentByte);
     }

     // close BufferedOutputStream
     bufOS.flush();
     bufOS.close();

     // recursively unzip files
     if (entry.getName().toLowerCase().endsWith(ZIP_EXTENSION)) {
      String zipFilePath = destDirectory.getPath() + File.separatorChar + entry.getName();

      unzipToFile(zipFilePath, zipFilePath.substring(0, 
              zipFilePath.length() - ZIP_EXTENSION.length()));
     }
    }
   }
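   // bufIS stays null when the archive contains no file entries
   if (bufIS != null) {
    bufIS.close();
   }
   zipFile.close();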
   return true;
  } catch (Exception e) {
   e.printStackTrace();
   return false;
  }
 } 
}  

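I tried the most popular answer here, but it didn't unzip files recursively; it only unzipped the first level.

Source: solution for extracting files to a given directory

Also, check out this solution from the same person: solution for extracting files in memory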


-3
File dir = new File("BASE DIRECTORY PATH");
FileFilter ff = new FileFilter() {

    @Override
    public boolean accept(File f) {
        //only want zip files
        return (f.isFile() && f.getName().toLowerCase().endsWith(".zip"));
    }
};

File[] list = null;
while ((list = dir.listFiles(ff)).length > 0) {
    File file1 = list[0];
    //TODO unzip the file to the base directory
}
