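I tried several possible approaches. Tika gives the result you expect, but I haven't seen the code you used, so I can't double-check it.
I tried the following methods, though not all of them appear in the code snippet: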
- Java 7 Files.probeContentType(path)
- URLConnection: MIME detection from the file name and content-type guessing from the stream
- JDK 6 JAF API: javax.activation.MimetypesFileTypeMap
- MimeUtil with all the MimeDetector subclasses I found
- Apache Tika
- Apache POI scratchpad
Here is the test class:
    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.net.URLConnection;
    import java.util.Collection;

    import javax.activation.MimetypesFileTypeMap;

    import org.apache.tika.detect.Detector;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.mime.MediaType;
    import org.apache.tika.parser.AutoDetectParser;

    import eu.medsea.mimeutil.MimeUtil;

    public class FindMime {

        public static void main(String[] args) {
            File file = new File("C:\\Users\\qwerty\\Desktop\\test.msg");

            System.out.println("urlConnectionGuess " + urlConnectionGuess(file));
            System.out.println("fileContentGuess " + fileContentGuess(file));

            // JAF: looks the type up from the file name only
            MimetypesFileTypeMap mimeTypesMap = new MimetypesFileTypeMap();
            System.out.println("mimeTypesMap.getContentType " + mimeTypesMap.getContentType(file));

            System.out.println("mimeutils " + mimeutils(file));
            System.out.println("tika " + tika(file));
        }

        // MimeUtil with the three detectors registered
        private static String mimeutils(File file) {
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.ExtensionMimeDetector");
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.WindowsRegistryMimeDetector");
            // try-with-resources closes the stream even if detection fails
            try (InputStream is = new BufferedInputStream(new FileInputStream(file))) {
                Collection<?> mimeTypes = MimeUtil.getMimeTypes(is);
                return mimeTypes.toString();
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }
        }

        // Apache Tika: the detector of AutoDetectParser, with the file name as a hint
        private static String tika(File file) {
            try (InputStream is = new BufferedInputStream(new FileInputStream(file))) {
                AutoDetectParser parser = new AutoDetectParser();
                Detector detector = parser.getDetector();
                Metadata md = new Metadata();
                md.add(Metadata.RESOURCE_NAME_KEY, file.getName());
                MediaType mediaType = detector.detect(is, md);
                return mediaType.toString();
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }
        }

        // URLConnection: guess from the file name only
        private static String urlConnectionGuess(File file) {
            return URLConnection.guessContentTypeFromName(file.getName());
        }

        // URLConnection: guess from the first bytes of the content
        private static String fileContentGuess(File file) {
            try (InputStream is = new BufferedInputStream(new FileInputStream(file))) {
                return URLConnection.guessContentTypeFromStream(is);
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }
        }
    }
And this is the output:
urlConnectionGuess null
fileContentGuess null
mimeTypesMap.getContentType application/octet-stream
mimeutils application/msword,application/x-hwp
tika application/vnd.ms-outlook
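Files.probeContentType from the list above is not exercised by the FindMime class. A minimal sketch of how it could be called is shown here; the class name ProbeMime, the helper probe, and the default file name are illustrative assumptions, not part of the original test. The result depends on the file type detectors installed on the platform and may be null, so no output is shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ProbeMime {

    // Returns whatever the installed FileTypeDetector(s) report,
    // or null when none of them recognises the file.
    static String probe(String fileName) {
        try {
            return Files.probeContentType(Paths.get(fileName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical default; pass a real path as the first argument.
        String fileName = args.length > 0 ? args[0] : "test.msg";
        System.out.println("probeContentType " + probe(fileName));
    }
}
```

As far as I can tell, the default detectors lean heavily on the file extension, so this is unlikely to help with renamed or extension-less files.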
Update: I added this method to test other ways of using Tika:
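    // Extra imports needed in FindMime:
    //   org.apache.tika.Tika
    //   org.apache.tika.mime.MimeTypes
    //   org.apache.tika.detect.TypeDetector
    private static void tikaMore(File file) {
        Tika defaultTika = new Tika();                 // default configuration
        Tika mimeTika = new Tika(new MimeTypes());     // a bare MimeTypes instance as detector
        Tika typeTika = new Tika(new TypeDetector());  // relies on a Content-Type metadata hint
        try {
            System.out.println(defaultTika.detect(file));
            System.out.println(mimeTika.detect(file));
            System.out.println(typeTika.detect(file));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }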
Tested with a msg file without an extension:
application/vnd.ms-outlook
application/octet-stream
application/octet-stream
Tested with a txt file renamed to msg:
text/plain
text/plain
application/octet-stream
In this case, using the empty constructor seems to be the simplest and most reliable way.
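Update: you can write your own checker using Apache POI scratchpad. For example, this is a simple implementation that returns the MIME type of the message, or null if the file is not in the proper format (usually org.apache.poi.poifs.filesystem.NotOLE2FileException: Invalid header signature):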
    import org.apache.poi.hsmf.MAPIMessage;

    public class PoiMsgMime {

        public String getMessageMime(String fileName) {
            try {
                // Throws if the file cannot be read as an Outlook message
                new MAPIMessage(fileName);
                return "application/vnd.ms-outlook";
            } catch (Exception e) {
                return null;
            }
        }
    }
MIME type: application/vnd.ms-outlook. File signature for .msg files: D0 CF 11 E0 A1 B1 1A E1. - Duffydake
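The byte sequence D0 CF 11 E0 A1 B1 1A E1 quoted above can also be checked by hand, without any library. The following is only a sketch: the class name Ole2Signature, its method names, and the default file name are made up for illustration. Note that this sequence is the generic OLE2 compound-file header, which older .doc and .xls files share, so a match only says that the file might be a .msg (this may be why MimeUtil reported application/msword in the output above).

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class Ole2Signature {

    // OLE2 compound-file header: D0 CF 11 E0 A1 B1 1A E1
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True when the given bytes begin with the OLE2 header.
    static boolean startsWithOle2Magic(byte[] head) {
        return head.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(head, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads the first 8 bytes of the file and compares them with the header.
    static boolean isOle2File(String fileName) throws IOException {
        byte[] head = new byte[OLE2_MAGIC.length];
        try (DataInputStream in =
                 new DataInputStream(Files.newInputStream(Paths.get(fileName)))) {
            in.readFully(head);
        } catch (EOFException e) {
            return false; // file is shorter than the header
        }
        return startsWithOle2Magic(head);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default; pass a real path as the first argument.
        String fileName = args.length > 0 ? args[0] : "test.msg";
        System.out.println(fileName + " has OLE2 signature: " + isOle2File(fileName));
    }
}
```

A check like this could serve as a cheap pre-filter before handing the file to POI or Tika, which look inside the container to tell a message apart from other OLE2 documents.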