我需要在每次服务器启动时读取大约50个文件,并将每个文本文件的表示放入内存中。每个文本文件都将有自己的字符串(哪种类型最适合用于字符串占用者?)。
最快的读取文件到内存的方法是什么,最好的数据结构/类型来保存文本,以便我可以在内存中操作它(主要是搜索和替换)?
谢谢
我需要在每次服务器启动时读取大约50个文件,并将每个文本文件的表示放入内存中。每个文本文件都将有自己的字符串(哪种类型最适合用于字符串占用者?)。
最快的读取文件到内存的方法是什么,最好的数据结构/类型来保存文本,以便我可以在内存中操作它(主要是搜索和替换)?
谢谢
内存映射文件是最快的...类似这样:
final File file;
final FileChannel channel;
final MappedByteBuffer buffer;
file = new File(fileName);
fin = new FileInputStream(file);
channel = fin.getChannel();
buffer = channel.map(MapMode.READ_ONLY, 0, file.length());
然后从字节缓冲区中读取。这比使用FileInputStream
或FileReader
要快得多。
编辑:
经过一番调查,发现根据您的操作系统,使用新的BufferedInputStream(new FileInputStream(file))
可能更好。但是,将整个文件一次性读入大小为文件大小的char[]中听起来是最糟糕的方法。
因此,BufferedInputStream
在所有平台上应该提供大致一致的性能,而内存映射文件可能会因基础操作系统而快或慢。与所有性能关键的东西一样,应测试您的代码并查看哪种方法最有效。
编辑:
好的,这里有一些测试(第一个测试需要运行两次以将文件放入磁盘缓存中)。
我对rt.jar类文件进行了测试,解压到硬盘上,在Windows 7 beta x64下进行。共16784个文件,总共94606637字节。
首先是结果...
(请记住,第一个测试需要重复一次以设置磁盘缓存)
ArrayTest
ArrayTest
DataInputByteAtATime
DataInputReadFully
MemoryMapped
这是代码...
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.util.HashSet;
import java.util.Set;
public class Main
{
public static void main(final String[] argv)
{
ArrayTest.main(argv);
ArrayTest.main(argv);
DataInputByteAtATime.main(argv);
DataInputReadFully.main(argv);
MemoryMapped.main(argv);
}
}
abstract class Test
{
public final void run(final File root)
{
final Set<File> files;
final long size;
final long start;
final long end;
final long total;
files = new HashSet<File>();
getFiles(root, files);
start = System.currentTimeMillis();
size = readFiles(files);
end = System.currentTimeMillis();
total = end - start;
System.out.println(getClass().getName());
System.out.println("time = " + total);
System.out.println("bytes = " + size);
}
private void getFiles(final File dir,
final Set<File> files)
{
final File[] childeren;
childeren = dir.listFiles();
for(final File child : childeren)
{
if(child.isFile())
{
files.add(child);
}
else
{
getFiles(child, files);
}
}
}
private long readFiles(final Set<File> files)
{
long size;
size = 0;
for(final File file : files)
{
size += readFile(file);
}
return (size);
}
protected abstract long readFile(File file);
}
class ArrayTest
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new ArrayTest();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
InputStream stream;
stream = null;
try
{
final byte[] data;
int soFar;
int sum;
stream = new BufferedInputStream(new FileInputStream(file));
data = new byte[(int)file.length()];
soFar = 0;
do
{
soFar += stream.read(data, soFar, data.length - soFar);
}
while(soFar != data.length);
sum = 0;
for(final byte b : data)
{
sum += b;
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputByteAtATime
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new DataInputByteAtATime();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
DataInputStream stream;
stream = null;
try
{
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++)
{
sum += stream.readByte();
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadFully
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new DataInputReadFully();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
DataInputStream stream;
stream = null;
try
{
final byte[] data;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
data = new byte[(int)file.length()];
stream.readFully(data);
sum = 0;
for(final byte b : data)
{
sum += b;
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadInChunks
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new DataInputReadInChunks();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
DataInputStream stream;
stream = null;
try
{
final byte[] data;
int size;
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
data = new byte[512];
size = 0;
sum = 0;
do
{
size += stream.read(data);
sum = 0;
for(int i = 0; i < size; i++)
{
sum += data[i];
}
}
while(size != fileSize);
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class MemoryMapped
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new MemoryMapped();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
FileInputStream stream;
stream = null;
try
{
final FileChannel channel;
final MappedByteBuffer buffer;
final int fileSize;
int sum;
stream = new FileInputStream(file);
channel = stream.getChannel();
buffer = channel.map(MapMode.READ_ONLY, 0, file.length());
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++)
{
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
File.length()
)new InputStreamReader (new FileInputStream(file), encoding)
进行读取new String(buffer)
如果您需要在启动时进行一次搜索和替换,请使用 String.replaceAll()。
如果您需要重复执行此操作,可以考虑使用 StringBuilder。它没有 replaceAll(),但您可以使用它来直接操作字符数组 (-> 不需要分配内存)。
话虽如此:
如果执行时间只需0.1秒,没有必要花费大量时间使此代码运行速度更快。
如果仍然存在性能问题,请考虑将所有文本文件放入 JAR 中,将其添加到类路径中,并使用 Class.getResourceAsStream() 读取文件。从 Java 类路径加载东西是高度优化的。
ArrayTest_CustomBuffering(将数据直接读入自己的缓冲区)
ArrayTest_CustomBuffering_StaticBuffer(将数据读入仅在开始时创建的静态缓冲区)
FileChannelArrayByteBuffer(使用NIO ByteBuffer并包装自己的byte[]数组)
FileChannelAllocateByteBuffer(使用NIO ByteBuffer和.allocate)
FileChannelAllocateByteBuffer_StaticBuffer(与4相同,但具有静态缓冲区)
FileChannelAllocateDirectByteBuffer(使用NIO ByteBuffer和.allocateDirect)
FileChannelAllocateDirectByteBuffer_StaticBuffer(与6相同,但具有静态缓冲区)
我修改过的TofuBear代码版本:
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.util.HashSet;
import java.util.Set;
public class Main {
public static void main(final String[] argv) {
ArrayTest.mainx(argv);
ArrayTest.mainx(argv);
ArrayTest_CustomBuffering.mainx(argv);
ArrayTest_CustomBuffering_StaticBuffer.mainx(argv);
DataInputByteAtATime.mainx(argv);
DataInputReadFully.mainx(argv);
MemoryMapped.mainx(argv);
FileChannelArrayByteBuffer.mainx(argv);
FileChannelAllocateByteBuffer.mainx(argv);
FileChannelAllocateByteBuffer_StaticBuffer.mainx(argv);
FileChannelAllocateDirectByteBuffer.mainx(argv);
FileChannelAllocateDirectByteBuffer_StaticBuffer.mainx(argv);
}
}
abstract class Test {
static final int BUFF_SIZE = 20971520;
static final byte[] StaticData = new byte[BUFF_SIZE];
static final ByteBuffer StaticBuffer =ByteBuffer.allocate(BUFF_SIZE);
static final ByteBuffer StaticDirectBuffer = ByteBuffer.allocateDirect(BUFF_SIZE);
public final void run(final File root) {
final Set<File> files;
final long size;
final long start;
final long end;
final long total;
files = new HashSet<File>();
getFiles(root, files);
start = System.currentTimeMillis();
size = readFiles(files);
end = System.currentTimeMillis();
total = end - start;
System.out.println(getClass().getName());
System.out.println("time = " + total);
System.out.println("bytes = " + size);
}
private void getFiles(final File dir,final Set<File> files) {
final File[] childeren;
childeren = dir.listFiles();
for(final File child : childeren) {
if(child.isFile()) {
files.add(child);
}
else {
getFiles(child, files);
}
}
}
private long readFiles(final Set<File> files) {
long size;
size = 0;
for(final File file : files) {
size += readFile(file);
}
return (size);
}
protected abstract long readFile(File file);
}
class ArrayTest extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new ArrayTest();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
InputStream stream;
stream = null;
try {
final byte[] data;
int soFar;
int sum;
stream = new BufferedInputStream(new FileInputStream(file));
data = new byte[(int)file.length()];
soFar = 0;
do {
soFar += stream.read(data, soFar, data.length - soFar);
}
while(soFar != data.length);
sum = 0;
for(final byte b : data) {
sum += b;
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class ArrayTest_CustomBuffering extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new ArrayTest_CustomBuffering();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
InputStream stream;
stream = null;
try {
final byte[] data;
int soFar;
int sum;
stream = new FileInputStream(file);
data = new byte[(int)file.length()];
soFar = 0;
do {
soFar += stream.read(data, soFar, data.length - soFar);
}
while(soFar != data.length);
sum = 0;
for(final byte b : data) {
sum += b;
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class ArrayTest_CustomBuffering_StaticBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new ArrayTest_CustomBuffering_StaticBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
InputStream stream;
stream = null;
try {
int soFar;
int sum;
final int fileSize;
stream = new FileInputStream(file);
fileSize = (int)file.length();
soFar = 0;
do {
soFar += stream.read(StaticData, soFar, fileSize - soFar);
}
while(soFar != fileSize);
sum = 0;
for(int i=0;i<fileSize;i++) {
sum += StaticData[i];
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputByteAtATime extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new DataInputByteAtATime();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
DataInputStream stream;
stream = null;
try {
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += stream.readByte();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadFully extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new DataInputReadFully();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
DataInputStream stream;
stream = null;
try {
final byte[] data;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
data = new byte[(int)file.length()];
stream.readFully(data);
sum = 0;
for(final byte b : data) {
sum += b;
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadInChunks extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new DataInputReadInChunks();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
DataInputStream stream;
stream = null;
try {
final byte[] data;
int size;
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
data = new byte[512];
size = 0;
sum = 0;
do {
size += stream.read(data);
sum = 0;
for(int i = 0;
i < size;
i++) {
sum += data[i];
}
}
while(size != fileSize);
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class MemoryMapped extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new MemoryMapped();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final FileChannel channel;
final MappedByteBuffer buffer;
final int fileSize;
int sum;
stream = new FileInputStream(file);
channel = stream.getChannel();
buffer = channel.map(MapMode.READ_ONLY, 0, file.length());
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelArrayByteBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelArrayByteBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
final ByteBuffer buffer;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
data = new byte[(int)file.length()];
buffer = ByteBuffer.wrap(data);
channel = stream.getChannel();
fileSize = (int)file.length();
nRead += channel.read(buffer);
buffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateByteBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateByteBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
final ByteBuffer buffer;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
buffer = ByteBuffer.allocate((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
nRead += channel.read(buffer);
buffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateDirectByteBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateDirectByteBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
final ByteBuffer buffer;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
buffer = ByteBuffer.allocateDirect((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
nRead += channel.read(buffer);
buffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateByteBuffer_StaticBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateByteBuffer_StaticBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
StaticBuffer.clear();
StaticBuffer.limit((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
nRead += channel.read(StaticBuffer);
StaticBuffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += StaticBuffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateDirectByteBuffer_StaticBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateDirectByteBuffer_StaticBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
StaticDirectBuffer.clear();
StaticDirectBuffer.limit((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
nRead += channel.read(StaticDirectBuffer);
StaticDirectBuffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += StaticDirectBuffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
这很大程度上取决于您的文本文件的内部结构以及您打算如何处理它们。
这些文件是键值字典(即“属性”文件)吗?XML?JSON?对于这些,您有标准结构。
如果它们具有正式结构,您还可以使用JavaCC构建文件的对象表示。
否则,如果它们只是数据块,那么读取文件并将其放入字符串中即可。
编辑:关于搜索和替换-只需使用{{link1:String的replaceAll函数}}。
任何常规方法的速度都会受到限制。我不确定你是否会从一个方法到另一个方法看到太大的差别。
我建议集中在能够使整个操作更快的业务技巧上。
例如,如果您读取了所有文件,并将它们存储在一个单独的文件中,并带有每个原始文件的时间戳,那么您可以在不实际打开它们的情况下检查文件是否发生了任何更改(换句话说,是一个简单的缓存)。
如果您的问题是快速启动GUI,那么您可以找到一种方法,在显示第一个屏幕后在后台线程中打开文件。
操作系统对于文件处理得非常好,如果这是批处理的一部分(无用户I/O),则可以使用类似以下方式的批处理文件将所有文件附加到一个大文件中,然后启动Java:
echo "file1" > file.all
type "file1" >> file.all
echo "file2" >> file.all
type "file2" >> file.all
然后只需打开file.all文件(我不确定这样做会更快,但对于我刚才提到的条件来说,这可能是最快的方法)
我想我只是在说,通常解决速度问题的解决方案往往需要稍微扩大您的视野,并使用新参数完全重新思考解决方案。修改现有算法通常只会以可读性为代价获得轻微的速度增强。
你可以使用常规工具如Commons IO FileUtils.readFileToString(File)来在不到一秒的时间内读取所有文件。
你也可以使用writeStringToFile(File, String)方法将修改后的文件保存下来。
http://commons.apache.org/io/api-release/index.html?org/apache/commons/io/FileUtils.html
顺便说一下:50并不是很多的文件。一个典型的个人电脑可能有10万个或更多的文件。