我需要追踪特定文件的read
系统调用,目前通过解析strace
的输出来实现此功能。由于read
是在文件描述符上进行操作,因此我必须跟踪fd
和path
之间的当前映射关系。此外,还必须监视seek
以保持跟踪中的当前位置更新。
在Linux中,有没有更好的方法来获取每个应用程序、每个文件路径的IO跟踪记录?
我需要追踪特定文件的read
系统调用,目前通过解析strace
的输出来实现此功能。由于read
是在文件描述符上进行操作,因此我必须跟踪fd
和path
之间的当前映射关系。此外,还必须监视seek
以保持跟踪中的当前位置更新。
在Linux中,有没有更好的方法来获取每个应用程序、每个文件路径的IO跟踪记录?
您可以等待文件被打开,这样您就可以了解 fd 并在进程启动后像这样附加 strace:
strace -p pid -e trace=file -e read=fd
Systemtap是Linux下类似DTrace的重新实现,可能对这里有帮助。
与strace一样,您只有文件描述符(fd),但是通过脚本能力很容易维护fd的文件名(除非使用像dup之类的有趣东西)。 有一个示例脚本iotime可以 说明 它。
#! /usr/bin/env stap
/*
* Copyright (C) 2006-2007 Red Hat Inc.
*
* This copyrighted material is made available to anyone wishing to use,
* modify, copy, or redistribute it subject to the terms and conditions
* of the GNU General Public License v.2.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
* Print out the amount of time spent in the read and write systemcall
* when each file opened by the process is closed. Note that the systemtap
* script needs to be running before the open operations occur for
* the script to record data.
*
* This script could be used to to find out which files are slow to load
* on a machine. e.g.
*
* stap iotime.stp -c 'firefox'
*
* Output format is:
* timestamp pid (executabable) info_type path ...
*
* 200283135 2573 (cupsd) access /etc/printcap read: 0 write: 7063
* 200283143 2573 (cupsd) iotime /etc/printcap time: 69
*
*/
global start
global time_io
function timestamp:long() { return gettimeofday_us() - start }
function proc:string() { return sprintf("%d (%s)", pid(), execname()) }
probe begin { start = gettimeofday_us() }
global filehandles, fileread, filewrite
probe syscall.open.return {
filename = user_string($filename)
if ($return != -1) {
filehandles[pid(), $return] = filename
} else {
printf("%d %s access %s fail\n", timestamp(), proc(), filename)
}
}
probe syscall.read.return {
p = pid()
fd = $fd
bytes = $return
time = gettimeofday_us() - @entry(gettimeofday_us())
if (bytes > 0)
fileread[p, fd] += bytes
time_io[p, fd] <<< time
}
probe syscall.write.return {
p = pid()
fd = $fd
bytes = $return
time = gettimeofday_us() - @entry(gettimeofday_us())
if (bytes > 0)
filewrite[p, fd] += bytes
time_io[p, fd] <<< time
}
probe syscall.close {
if ([pid(), $fd] in filehandles) {
printf("%d %s access %s read: %d write: %d\n",
timestamp(), proc(), filehandles[pid(), $fd],
fileread[pid(), $fd], filewrite[pid(), $fd])
if (@count(time_io[pid(), $fd]))
printf("%d %s iotime %s time: %d\n", timestamp(), proc(),
filehandles[pid(), $fd], @sum(time_io[pid(), $fd]))
}
delete fileread[pid(), $fd]
delete filewrite[pid(), $fd]
delete filehandles[pid(), $fd]
delete time_io[pid(),$fd]
}
由于哈希表的大小限制,它只适用于一定数量的文件。
strace
现在有新选项来跟踪文件描述符:
--decode-fds=set
Decode various information associated with file descriptors. The default is decode-fds=none. set can include the following elements:
path Print file paths.
socket Print socket protocol-specific information,
dev Print character/block device numbers.
pidfd Print PIDs associated with pidfd file descriptors.
这很有用,因为文件描述符在关闭后会被重复使用,而/proc/$PID/fd只提供一次快照,在实时调试时无用。
示例输出,请注意文件名显示在尖括号中,FD 3被用于所有的/etc/ld.so.cache
,/lib/x86_64-linux-gnu/libc.so.6
,/usr/lib/locale/locale-archive
,/home/florian/hello
。
$ strace -e trace=desc --decode-fds=all cat hello 1>/dev/null
execve("/usr/bin/cat", ["cat", "hello"], 0x7fff42e20710 /* 102 vars */) = 0
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3</etc/ld.so.cache>
newfstatat(3</etc/ld.so.cache>, "", {st_mode=S_IFREG|0644, st_size=167234, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 167234, PROT_READ, MAP_PRIVATE, 3</etc/ld.so.cache>, 0) = 0x7f22edeee000
close(3</etc/ld.so.cache>) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>
read(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\206\2\0\0\0\0\0"..., 832) = 832
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0"..., 48, 848) = 48
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0+H)\227\201T\214\233\304R\352\306\3379\220%"..., 68, 896) = 68
newfstatat(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "", {st_mode=S_IFREG|0755, st_size=1983576, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f22edeec000
pread64(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2012056, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0) = 0x7f22edd00000
mmap(0x7f22edd26000, 1486848, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0x26000) = 0x7f22edd26000
mmap(0x7f22ede91000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0x191000) = 0x7f22ede91000
mmap(0x7f22ededd000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</usr/lib/x86_64-linux-gnu/libc-2.33.so>, 0x1dc000) = 0x7f22ededd000
mmap(0x7f22edee3000, 33688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f22edee3000
close(3</usr/lib/x86_64-linux-gnu/libc-2.33.so>) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f22edcfe000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3</usr/lib/locale/locale-archive>
newfstatat(3</usr/lib/locale/locale-archive>, "", {st_mode=S_IFREG|0644, st_size=6055600, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 6055600, PROT_READ, MAP_PRIVATE, 3</usr/lib/locale/locale-archive>, 0) = 0x7f22ed737000
close(3</usr/lib/locale/locale-archive>) = 0
fstat(1</dev/null<char 1:3>>, {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}) = 0
openat(AT_FDCWD, "hello", O_RDONLY) = 3</home/florian/hello>
fstat(3</home/florian/hello>, {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
fadvise64(3</home/florian/hello>, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f22edef5000
read(3</home/florian/hello>, "world\n", 131072) = 6
write(1</dev/null<char 1:3>>, "world\n", 6) = 6
read(3</home/florian/hello>, "", 131072) = 0
close(3</home/florian/hello>) = 0
close(1</dev/null<char 1:3>>) = 0
close(2</dev/pts/5<char 136:5>>) = 0
+++ exited with 0 +++
我认为重载open
、seek
和read
是一个不错的解决方案。但是,如果你想以编程方式解析和分析strace输出,我之前也做过类似的事情,并将我的代码放在了github上:https://github.com/johnlcf/Stana/wiki
(我这样做是因为我需要分析其他人运行的程序的strace结果,而要求他们使用LD_PRELOAD并不容易。)
可能做到这一点最好的方法是使用 fanotify。Fanotify 是一个 Linux 内核设施,可以便宜地监视文件系统事件。我不确定它是否允许按 PID 进行过滤,但它确实将 PID 传递给您的程序,因此您可以检查它是否是您感兴趣的那个。
这里有一个很好的代码示例: http://bazaar.launchpad.net/~pitti/fatrace/trunk/view/head:/fatrace.c
然而,目前似乎文档不够详尽。我能找到的所有文档都在 http://www.spinics.net/lists/linux-man/msg02302.html 和 http://lkml.indiana.edu/hypermail/linux/kernel/0811.1/01668.html。
man ptrace
。ptrace
的实用程序很麻烦,但从头开始编写另一个实用程序就不麻烦了吗?.. - Ruslan
/proc/$PID/fd/
。 - Ruslan-y
选项会打印与文件描述符参数相关联的路径。 - sparrowt