How do I read a text file from the "assets" directory into a String?

9

I have a file in my assets folder... how do I read it?

Right now I'm trying:

      public static String readFileAsString(String filePath)
        throws java.io.IOException{
            StringBuffer fileData = new StringBuffer(1000);
            BufferedReader reader = new BufferedReader(
                    new FileReader(filePath));
            char[] buf = new char[1024];
            int numRead=0;
            while((numRead=reader.read(buf)) != -1){
                String readData = String.valueOf(buf, 0, numRead);
                fileData.append(readData);
                buf = new char[1024];
            }
            reader.close();
            return fileData.toString();
        }

But it throws a NullPointerException...

The file is called "origin" and it is in the assets folder.

I tried calling it with:

readFileAsString("file:///android_asset/origin");

and

readFileAsString("asset/origin");

But both fail... any suggestions?
5 answers

11

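BufferedReader's readLine() method returns null when the end of the file is reached, so you need to watch for that and avoid trying to append it to your string.

The following code should be simple enough: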

public static String readFileAsString(String filePath) throws java.io.IOException
{
    BufferedReader reader = new BufferedReader(new FileReader(filePath));
    String line, results = "";
    while( ( line = reader.readLine() ) != null)
    {
        results += line;
    }
    reader.close();
    return results;
}

Simple and to the point.
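One thing to keep in mind: readLine() strips the line terminator, so the loop above joins all lines without any separator. Below is a minimal sketch that keeps the line breaks, assuming '\n' is an acceptable separator. The class name LineReading and the method readLines are made up for illustration, and the method takes a Reader so it is not tied to FileReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

class LineReading {
    // Reads everything from the reader line by line, re-adding the '\n'
    // that readLine() strips. Hypothetical helper, not part of the answer above.
    static String readLines(Reader source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) { // null marks end of input
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Note that this appends a '\n' after the last line even if the input did not end with one.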


3
Because it uses the String class, this is very inefficient. StringBuilder should be used instead. - marcinj
1
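@luskan Not necessarily true. If you look at the generated bytecode, you'll see that it uses a StringBuilder under the hood (not in the most efficient way, but still...). - kritzikratzi
@kritzikratzi: Could you explain how StringBuilder is used internally? - Charu Khurana
Put the code above in a class file, then run javap -c YourClass. I'm not very good at reading this stuff, but I think it comes out roughly as: String results = ""; while( ( line = readline() ) != null ) results = new StringBuilder( results ).append ( line ).toString(); Of course this is wasteful, because a StringBuilder is allocated and converted back to a String on every iteration, but a lot of the time that doesn't matter. Adding a more efficient alternative now. - kritzikratzi
Aha, okay, I played around some more and found that it starts to get noticeably slow for files larger than 100k. I've added my own answer, with tests. - kritzikratzi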

6

Short answer, do this:

public static String readFile( String filePath ) throws IOException
{
    Reader reader = new FileReader( filePath );
    StringBuilder sb = new StringBuilder(); 
    char buffer[] = new char[16384];  // read 16k blocks
    int len; // how much content was read? 
    while( ( len = reader.read( buffer ) ) > 0 ){
        sb.append( buffer, 0, len ); 
    }
    reader.close();
    return sb.toString();
}

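This is very simple and direct, fast, and works for very large text files (100+ MB).


Long answer:

Sometimes it doesn't matter, but this approach is quite fast and quite readable. In fact, it has lower complexity than @Raceimation's answer: O(n) instead of O(n^2).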
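A rough estimate of where the quadratic cost comes from, assuming the file has m lines of about k characters each and that every += copies the whole string accumulated so far (the symbols m, k and n are introduced here for illustration only):

```latex
\begin{align*}
\text{chars copied with } \texttt{+=}
  &\approx \sum_{i=1}^{m} i\,k \;=\; k\,\frac{m(m+1)}{2} \;=\; O(m^2 k) \;=\; O\!\left(\frac{n^2}{k}\right),
  \qquad n = m\,k, \\
\text{chars copied with one StringBuilder}
  &= O(n) \quad \text{(amortized, since the internal array grows geometrically)}.
\end{align*}
```

For a fixed line length k, the first expression grows as O(n^2) in the file size n.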

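I tested six approaches (from slow to fast):

  • concat: read line by line and concatenate with str += *(alarmingly slow even for smallish files: about 70 seconds for a 3MB file)*
  • StringBuilder with guessed length: initialize the StringBuilder with the file size. My guess is that it is slow because it really does try to find such a huge contiguous block of memory.
  • StringBuilder with line buffer: StringBuilder, reading the file line by line
  • StringBuffer with char[] buffer: concatenate with StringBuffer, reading the file in 16k blocks
  • StringBuilder with char[] buffer: concatenate with StringBuilder, reading the file in 16k blocks
  • Preallocated byte[filesize] buffer: allocate a byte[] buffer the size of the file and let the Java API decide how to buffer the individual blocks.

Conclusion:

Preallocating the whole buffer is the fastest approach for very large files, but it is less versatile, because the total file size has to be known in advance. That is why I suggest the StringBuilder with a char[] buffer: it is still simple and, if needed, can easily be changed to accept any input stream rather than just files. And it is fast enough for all reasonable cases.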
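As a sketch of that "any input stream" variant: the same char[] loop as in the short answer, but fed from an InputStream. The class name StreamText, the method name readStream and the explicit Charset parameter are assumptions added for illustration, not part of the benchmarked code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

class StreamText {
    // Same char[]-buffer loop as readFile(), but reading from an arbitrary InputStream.
    // The charset is passed explicitly instead of relying on the platform default.
    static String readStream(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[16384]; // read 16k blocks, buffer reused on every pass
            int len; // how much content was read?
            while ((len = reader.read(buffer)) > 0) {
                sb.append(buffer, 0, len);
            }
        }
        return sb.toString();
    }
}
```

The try-with-resources block closes the reader (and with it the underlying stream) even if a read fails.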

Test results + code

import java.io.*; 

public class Test
{

    static final int N = 5; 

    public final static void main( String args[] ) throws IOException{
        test( "1k.txt", true ); 
        test( "10k.txt", true ); 
        // concat with += would take ages here, so we skip it
        test( "100k.txt", false ); 
        test( "2142k.txt", false ); 
        test( "pruned-names.csv", false ); 
        // ah, what the heck, why not try a binary file
        test( "/Users/hansi/Downloads/xcode46graphicstools6938140a.dmg", false );
    }

    public static void test( String file, boolean includeConcat ) throws IOException{

        System.out.println( "Reading " + file + " (~" + (new File(file).length()/1024) + "Kbytes)" ); 
        strbuilderwithchars( file ); 
        strbuilderwithchars( file ); 
        strbuilderwithchars( file ); 
        tick( "Warm up... " ); 

        if( includeConcat ){
            for( int i = 0; i < N; i++ )
                concat( file ); 
            tick( "> Concat with +=                  " ); 
        }
        else{
            tick( "> Concat with +=   **skipped**    " ); 
        }

        for( int i = 0; i < N; i++ )
            strbuilderguess( file ); 
        tick( "> StringBuilder init with length  " ); 

        for( int i = 0; i < N; i++ )
            strbuilder( file ); 
        tick( "> StringBuilder with line buffer  " );

        for( int i = 0; i < N; i++ )
            strbuilderwithchars( file ); 
        tick( "> StringBuilder with char[] buffer" );

        for( int i = 0; i < N; i++ )
            strbufferwithchars( file ); 
        tick( "> StringBuffer with char[] buffer " );

        for( int i = 0; i < N; i++ )
            singleBuffer( file ); 
        tick( "> Allocate byte[filesize]         " );

        System.out.println(); 
    }

    public static long now = System.currentTimeMillis(); 
    public static void tick( String message ){
        long t = System.currentTimeMillis(); 
        System.out.println( message + ": " + ( t - now )/N + " ms" ); 
        now = t; 
    }


    // StringBuilder with char[] buffer
    // + works if filesize is unknown
    // + pretty fast 
    public static String strbuilderwithchars( String filePath ) throws IOException
    {
        Reader reader = new FileReader( filePath );
        StringBuilder sb = new StringBuilder(); 
        char buffer[] = new char[16384];  // read 16k blocks
        int len; // how much content was read? 
        while( ( len = reader.read( buffer ) ) > 0 ){
            sb.append( buffer, 0, len ); 
        }
        reader.close();
        return sb.toString();
    }

    // StringBuffer with char[] buffer
    // + works if filesize is unknown
    // + faster than stringbuilder on my computer
    // - should be slower than stringbuilder, which confuses me 
    public static String strbufferwithchars( String filePath ) throws IOException
    {
        Reader reader = new FileReader( filePath );
        StringBuffer sb = new StringBuffer(); 
        char buffer[] = new char[16384];  // read 16k blocks
        int len; // how much content was read? 
        while( ( len = reader.read( buffer ) ) > 0 ){
            sb.append( buffer, 0, len ); 
        }
        reader.close();
        return sb.toString();
    }

    // StringBuilder init with length
    // + works if filesize is unknown
    // - not faster than any of the other methods, but more complicated
    public static String strbuilderguess(String filePath) throws IOException
    {
        File file = new File( filePath ); 
        BufferedReader reader = new BufferedReader(new FileReader(file));
        String line;
        StringBuilder sb = new StringBuilder( (int)file.length() ); 
        while( ( line = reader.readLine() ) != null)
        {
            sb.append( line ); 
        }
        reader.close();
        return sb.toString();
    }

    // StringBuilder with line buffer
    // + works if filesize is unknown
    // + pretty fast 
    // - speed may (!) vary with line length
    public static String strbuilder(String filePath) throws IOException
    {
        BufferedReader reader = new BufferedReader(new FileReader(filePath));
        String line;
        StringBuilder sb = new StringBuilder(); 
        while( ( line = reader.readLine() ) != null)
        {
            sb.append( line ); 
        }
        reader.close();
        return sb.toString();
    }


    // Concat with += 
    // - slow
    // - slow
    // - really slow
    public static String concat(String filePath) throws IOException
    {
        BufferedReader reader = new BufferedReader(new FileReader(filePath));
        String line, results = "";
        int i = 0; 
        while( ( line = reader.readLine() ) != null)
        {
            results += line;
            i++; 
        }
        reader.close();
        return results;
    }

    // Allocate byte[filesize]
    // + seems to be the fastest for large files
    // - only works if filesize is known in advance, so less versatile for a not significant performance gain
    // + shortest code
    public static String singleBuffer(String filePath ) throws IOException{
        FileInputStream in = new FileInputStream( filePath );
        try{
            byte buffer[] = new byte[(int) new File( filePath).length()];  // buffer for the entire file
            int len = Math.max( in.read( buffer ), 0 );  // read() returns -1 for an empty file
            return new String( buffer, 0, len ); 
        }
        finally{
            in.close();  // always release the file handle
        }
    }
}


/**
 *** RESULTS ***

Reading 1k.txt (~31Kbytes)
Warm up... : 0 ms
> Concat with +=                  : 37 ms
> StringBuilder init with length  : 0 ms
> StringBuilder with line buffer  : 0 ms
> StringBuilder with char[] buffer: 0 ms
> StringBuffer with char[] buffer : 0 ms
> Allocate byte[filesize]         : 1 ms

Reading 10k.txt (~313Kbytes)
Warm up... : 0 ms
> Concat with +=                  : 708 ms
> StringBuilder init with length  : 2 ms
> StringBuilder with line buffer  : 2 ms
> StringBuilder with char[] buffer: 1 ms
> StringBuffer with char[] buffer : 1 ms
> Allocate byte[filesize]         : 1 ms

Reading 100k.txt (~3136Kbytes)
Warm up... : 7 ms
> Concat with +=   **skipped**    : 0 ms
> StringBuilder init with length  : 19 ms
> StringBuilder with line buffer  : 21 ms
> StringBuilder with char[] buffer: 9 ms
> StringBuffer with char[] buffer : 9 ms
> Allocate byte[filesize]         : 8 ms

Reading 2142k.txt (~67204Kbytes)
Warm up... : 181 ms
> Concat with +=   **skipped**    : 0 ms
> StringBuilder init with length  : 367 ms
> StringBuilder with line buffer  : 372 ms
> StringBuilder with char[] buffer: 208 ms
> StringBuffer with char[] buffer : 202 ms
> Allocate byte[filesize]         : 199 ms

Reading pruned-names.csv (~11200Kbytes)
Warm up... : 23 ms
> Concat with +=   **skipped**    : 0 ms
> StringBuilder init with length  : 54 ms
> StringBuilder with line buffer  : 57 ms
> StringBuilder with char[] buffer: 32 ms
> StringBuffer with char[] buffer : 31 ms
> Allocate byte[filesize]         : 32 ms

Reading /Users/hansi/Downloads/xcode46graphicstools6938140a.dmg (~123429Kbytes)
Warm up... : 1665 ms
> Concat with +=   **skipped**    : 0 ms
> StringBuilder init with length  : 2899 ms
> StringBuilder with line buffer  : 2978 ms
> StringBuilder with char[] buffer: 2702 ms
> StringBuffer with char[] buffer : 2684 ms
> Allocate byte[filesize]         : 1567 ms


**/

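By the way, you may have noticed that StringBuffer comes out slightly faster than StringBuilder. That is a bit absurd, because the two classes are the same except that StringBuilder is not synchronized. If anyone can (or can't) reproduce this... I'm very curious :)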


6

You can use the AssetManager to open an input stream.

InputStream input = getAssets().open("origin");
Reader reader = new InputStreamReader(input, "UTF-8");

getAssets() is a method of the Context class.

Also note that there is no need to recreate the char buffer (buf = new char[1024], the last line of your loop).


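Good catch! There is no need to reinitialize the buffer. I was going to post the same thing! :) - Chris Aldrich
Why can I only call it from an Activity? I would like to put it in an external static class... - Mascarpone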
1
So you can supply the context as a parameter to the static method: static String readFromAssets(final Context context, final String path) { InputStream input = context.getAssets().open(path); ... } - Roman Mazur
1
@mascarpone. Or you can create a static getContext method in your application. http://stackoverflow.com/q/5114361/186636 - Peter Ajtai
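A minimal sketch of such a static helper, written so that it compiles without the Android SDK. The AssetOpener interface is a hypothetical stand-in for the single AssetManager.open(String) call the helper needs, and the class name AssetText is also made up. The asset is assumed to be UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class AssetText {
    // Stand-in for the one AssetManager method needed here, so the sketch
    // compiles without the Android SDK. Hypothetical, not an Android type.
    interface AssetOpener {
        InputStream open(String name) throws IOException;
    }

    // Reads the named asset fully and decodes it as UTF-8.
    static String readFromAssets(AssetOpener assets, String name) throws IOException {
        try (InputStream input = assets.open(name)) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192]; // allocated once, reused for every read
            int len;
            while ((len = input.read(buffer)) != -1) {
                bytes.write(buffer, 0, len);
            }
            return bytes.toString("UTF-8");
        }
    }
}
```

On Android, with Java 8 language features enabled, this could presumably be called as AssetText.readFromAssets(context.getAssets()::open, "origin"), with the Context handed in by the caller as suggested in the comment above.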

1

I wrote a function that does the same thing as yours. I wrote it a while ago, but I believe it still works correctly.

public static final String grabAsSingleString(File fileToUse) 
            throws FileNotFoundException {

        BufferedReader theReader = null;
        String returnString = null;

        try {
            theReader = new BufferedReader(new FileReader(fileToUse));
            char[] charArray = null;

            if(fileToUse.length() > Integer.MAX_VALUE) {
                // TODO implement handling of large files.
                System.out.println("The file is larger than int max = " +
                        Integer.MAX_VALUE);
            } else {
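                charArray = new char[(int)fileToUse.length()];

                // Read the information into the buffer. The file length is in bytes,
                // so keep only the chars that were actually decoded.
                int charsRead = theReader.read(charArray, 0, charArray.length);
                returnString = new String(charArray, 0, Math.max(charsRead, 0));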

            }
        } catch (FileNotFoundException ex) {
            throw ex;
        } catch(IOException ex) {
            ex.printStackTrace();
        } finally {
            try {
                if (theReader != null) {  // null if the file could not be opened
                    theReader.close();
                }
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }

        return returnString;
    }

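Now, if you want to use this function, then whether you pass the file as a File object or as a string, make sure you either provide the full path to the file, such as "C:\Program Files\test.dat", or pass a path relative to the working directory. Normally the directory you launched the application from is your working directory (unless you changed it). So if the file is in a folder named data, you would pass "./data/test.dat".

Yes, I know this runs on Android, so Windows paths don't apply, but you should get my point.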


1
You should try org.apache.commons.io.IOUtils.toString(InputStream is) to get the file content as a String. You can pass it the InputStream object that you get from
getAssets().open("xml2json.txt")

To get the String in your Activity, use the following code:
String xml = IOUtils.toString((getAssets().open("xml2json.txt")));
