Delete POSIX shared memory when not in use?

Question


c linux shared-memory

16

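Is there any way, Linux-specific or not, to have POSIX shared memory segments (obtained with shm_open()) removed when no process is using them? That is, have them reference counted and have the system remove them when the reference count drops to 0.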

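A few notes:

Establishing an atexit handler to remove them does not work if the program crashes.
Currently, the Linux-specific approach is to embed the pid in the segment name and have an external program walk /dev/shm looking for unused segments. The drawback is that cleanup has to be done periodically from outside, in a rather hackish way.
Since several copies of the program can run at once, using one well-defined segment name that the program reuses at startup is not feasible.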

- user964970

Are you asking whether there is a way to do this with a system library rather than doing it manually? - Joe

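You could use gdb to debug your application so that it doesn't crash. That alleviates the problem of a crashing application not cleaning up after itself... - Nicholas Wilson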

7 Answers

6

No. At least on Linux, the kernel contains nothing that does this. Some application has to call shm_unlink() at some point to get rid of a shared memory segment.

- nos

4

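I found a way using system() and the Linux command fuser, which lists the processes that have a file open. That way you can check whether the shared memory file (located in /dev/shm) is still in use and delete it if it is not. Note that the check/delete/create operations must be performed inside an inter-process critical section, protected by a named mutex, a named semaphore, or a file lock.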

        // Fragment: needs <string>, <cstdlib> (system) and <sys/wait.h> (WEXITSTATUS);
        // service_name and _logger come from the surrounding application.
        std::string shm_file = "/dev/shm/" + service_name + "Shm";
        // Shell exit status: 0 = orphan removed, 2 = still in use, 3 = no such file;
        // 1 is treated as "rm failed".
        std::string cmd_line = "if [ -f " + shm_file + " ] ; then if ! fuser -s " + shm_file + " ; then rm -f " + shm_file + " ; else exit 2 ; fi else exit 3 ; fi";
        int res = system(cmd_line.c_str());
        switch (WEXITSTATUS(res)) {
        case 0: _logger.warning ("The shared memory file " + shm_file + " was orphaned and has been deleted");  break;
        case 1: _logger.critical("The shared memory file " + shm_file + " was orphaned but cannot be deleted"); break;
        case 2: _logger.trace   ("The shared memory file " + shm_file + " is in use by live processes");        break;
        case 3: _logger.trace   ("The shared memory file " + shm_file + " was not found");                      break;
        }

- Bobax

3

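Let's assume the most complicated case:

You have several processes communicating via shared memory.
They can start and finish at any time, even multiple times. That means there is no master process, nor is there a dedicated "first" process that can initialize the shared memory.
In other words, there is no point at which the shared memory can safely be unlinked, so neither Sergey's nor Hristo's answer applies.

I see two possible solutions and would welcome feedback on them, because the internet is remarkably silent on this question: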

Store the pid (or a more specific process identifier, if you have one) of the last process that wrote to the shared memory inside the shared memory itself, as a lock. Then you could do something like the following pseudocode:

 int* pshmem = open shared memory()

 while(true) 
     nPid = atomic_read(pshmem)
     if nPid == 0 
        // your shared memory is in a valid state
        break
     else 
        // process nPid holds a lock to your shared memory
        // or it might have crashed while holding the lock
        if process nPid still exists 
          // shared memory is valid
          break
        else 
          // shared memory is corrupt
          // try acquire lock 
          if atomic_compare_exchange(pshmem, nPid, my pid) 
             // we have the lock
             reinitialize shared memory
             atomic_write(pshmem, 0) // release lock
          else 
             // somebody else got the lock in the meantime
             // continue loop

This verifies that the last writer didn't die while writing. Note that the shared memory object itself still outlives all of your processes.
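As a rough illustration, the pseudocode above might be written in C along the following lines. This is only a sketch under several assumptions: the struct layout, the names shared_block, process_exists and validate_block, and "reinitialize" meaning "zero the payload" are all invented for the example. It also assumes pid_t is lock-free as a C11 atomic and that kill(pid, 0) is an acceptable liveness test (pids can be reused, so the test can give false positives).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout: the first word holds the writer's pid (0 = no writer). */
struct shared_block {
    _Atomic pid_t writer_pid;
    char payload[64];
};

/* Signal 0 performs only the existence/permission check, nothing is sent. */
static int process_exists(pid_t pid)
{
    return kill(pid, 0) == 0 || errno == EPERM;
}

/* Returns 1 if the block had to be reinitialized, 0 if it was already valid. */
static int validate_block(struct shared_block *blk)
{
    assert(blk != NULL);
    for (;;) {
        pid_t owner = atomic_load(&blk->writer_pid);
        if (owner == 0 || process_exists(owner))
            return 0;                       /* no writer, or writer still alive */

        /* The recorded writer is gone: try to take over its lock. */
        if (atomic_compare_exchange_strong(&blk->writer_pid, &owner, getpid())) {
            memset(blk->payload, 0, sizeof blk->payload);  /* "reinitialize" */
            atomic_store(&blk->writer_pid, 0);             /* release lock */
            return 1;
        }
        /* Another process won the race in the meantime; check again. */
    }
}
```

In real use, blk would point into the mapping obtained from shm_open() and mmap(), and every writer would store its pid in writer_pid before modifying the payload and reset it to 0 afterwards.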

Use a reader/writer file lock to find out if any process is the first process opening the shared memory object. The first process may then reinitialize the shared memory:
```
 // try to get exclusive lock on lockfile
 // (O_EXLOCK and O_SHLOCK are BSD/macOS open() flags; Linux does not provide them)
 int fd = open(lockfile, O_RDONLY | O_CREAT | O_EXLOCK | O_NONBLOCK, ...)
 if fd == -1
     // didn't work, somebody else has it and is initializing shared memory
     // acquire shared lock and wait for it
     fd = open(lockfile, O_RDONLY | O_SHLOCK)
     // open shared memory
 else 
     // we are the first
     // delete shared memory object
     // possibly delete named mutex/semaphore as well

     // create shared memory object (& semaphore)
     // degrade exclusive lock to a shared lock
     flock(fd, LOCK_SH)
```
File locks seem to be the only (?) mechanism on POSIX systems that is cleaned up automatically when the process dies. Unfortunately, the list of caveats that come with them is very, very long. The algorithm assumes that flock is supported on the underlying filesystem, at least on the local machine. It does not care whether the locks are actually visible to processes on other machines via NFS; they only have to be visible to all processes accessing the shared memory object.
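Because the pseudocode relies on open() flags that Linux lacks, a Linux variant would have to split the step into open() followed by flock(). The following is a minimal sketch of that idea, not the implementation referred to in this answer; the function names join_group and finish_init and the lock file path are made up for the example. One caveat from the flock(2) manual page applies: converting an exclusive lock to a shared one is not guaranteed to be atomic, so another process may briefly grab the exclusive lock during the downgrade.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Opens (creating if needed) the lock file and tries to become the "first"
 * process. Returns the lock fd, which must stay open for the lifetime of the
 * process, or -1 on error. *is_first is 1 if the caller holds the exclusive
 * lock and should (re)create the shared memory object.
 */
static int join_group(const char *lockfile, int *is_first)
{
    int fd = open(lockfile, O_RDONLY | O_CREAT, 0600);
    if (fd == -1)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        *is_first = 1;              /* nobody else holds any lock */
        return fd;
    }

    /* Somebody else is a member (or is initializing): wait for a shared lock. */
    *is_first = 0;
    if (flock(fd, LOCK_SH) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Called by the first process once the shared memory is set up. */
static int finish_init(int fd)
{
    return flock(fd, LOCK_SH);      /* downgrade: lets the waiters proceed */
}
```

A first process would call join_group(), recreate the shared memory object while holding the exclusive lock, and then call finish_init(). The kernel drops the lock when the descriptor is closed, including when the process dies, which is the property this approach depends on.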

This solution has been implemented on top of boost.interprocess.

- Sebastian

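The shared file lock is very interesting, thanks Sebastian! One question: the example assumes the file does not exist, but in practice that may not be the case. What happens then? - SRG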

1

open(lockfile, O_RDONLY | O_CREAT | O_EXLOCK | O_NONBLOCK, ...) creates the file if it does not exist and opens the existing file otherwise. - Sebastian

Yes, but my point is: if the file already exists, who owns the lock? Doesn't that mean every process will execute the "else" branch? - SRG

1

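I'm still not sure I understand. Whether the file exists does not matter here. The only thing that matters is whether somebody has it open and holds a lock. I try to open the file and acquire the exclusive lock at the same time (opening with O_EXLOCK). That only succeeds if nobody else has the file open and locked. It succeeds only for the first process, which is then responsible for the initialization. The lock is then converted to a shared lock, and all other processes can proceed in the if branch. - Sebastian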

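@Sebastian If the first process, the one that initialized the shared memory, terminates, it releases its file lock. After that another process could obtain the file lock and reinitialize the shared memory, right? Do you know of a solution? - tango-1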


3

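Shared memory created with the System V API can behave this way, on Linux only. It is not POSIX shared memory, but it may suit your needs.

In the book The Linux Programming Interface, one of the possible arguments to shmctl() is described as follows.

IPC_RMID: Mark the shared memory segment and its associated shmid_ds data structure for deletion. If no processes currently have the segment attached, deletion is immediate; otherwise, the segment is removed after all processes have detached from it (i.e., when the value of the shm_nattch field in the shmid_ds data structure falls to 0). In some applications, we can make sure that a shared memory segment is tidily cleared away on application termination by marking it for deletion immediately after all processes have attached it to their virtual address space with shmat(). This is analogous to unlinking a file once we've opened it. On Linux, if a shared segment has been marked for deletion using IPC_RMID, but has not yet been removed because some process still has it attached, then it is possible for another process to attach that segment. However, this behavior is not portable: most UNIX implementations prevent new attaches to a segment marked for deletion. (SUSv3 is silent on what behavior should occur in this scenario.) A few Linux applications have come to depend on this behavior, which is why Linux has not been changed to match other UNIX implementations.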
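To make the quoted behavior concrete, here is a small sketch of the "attach, then immediately mark for deletion" pattern. The function name create_self_cleaning_segment is invented for the example, and the use of IPC_PRIVATE is an assumption that suits processes related by fork(); unrelated processes would need a real key (for example from ftok()) and would have to attach before the segment is marked.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Creates a System V segment, attaches it, and marks it for deletion right
 * away. Returns the attached address (NULL on error) and stores the id in
 * *shmid_out. The kernel destroys the segment once the last attachment goes
 * away, whether through shmdt(), exit, or a crash.
 */
static void *create_self_cleaning_segment(size_t size, int *shmid_out)
{
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid == -1)
        return NULL;

    void *addr = shmat(shmid, NULL, 0);
    if (addr == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);      /* nothing attached: removed now */
        return NULL;
    }

    /* Still attached, so this only marks the segment; removal is deferred. */
    if (shmctl(shmid, IPC_RMID, NULL) == -1) {
        shmdt(addr);
        return NULL;
    }

    *shmid_out = shmid;
    return addr;
}
```

A child created with fork() after this call inherits the attachment, so the segment stays alive until both parent and child have detached or terminated.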

- Sergey

But System V IPC is now deprecated. It is not recommended for new programs. - Rachid K.

0

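I'm not sure whether the following is feasible, but I'll give it a try.

Why not have a helper program that is executed every time your program crashes?

For example, set /proc/sys/kernel/core_pattern to

|/path/to/Myprogram %p

(the leading | tells the kernel to pipe the core dump to the program instead of writing a file).

When a process crashes, Myprogram is executed with the pid of the crashed process, and you can take it from there.

See man 5 core for more information.

Hope this helps to some extent.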

- Whoami

0

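Could you use a global counting semaphore for reference counting? Wrap the attach and detach calls so that the semaphore is incremented when you attach to the memory and decremented when you detach. Release the segment when a detach brings the semaphore down to zero.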

- Joe

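This has the same implications as removing it in an atexit handler (for my particular use atexit would be acceptable, since only one process writes to the shared memory and that process can remove it). But it cannot deal with the segments if a process crashes. - user964970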

1

You could catch the crash signals in a signal handler and decrement the reference count there. - Joe


- Hristo Iliev · Accepted Answer

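If there is a point in your program's execution at which all processes that need to open the shared memory segment are known to have already done so, you can safely unlink it. Unlinking removes the object from the global namespace, but it lingers for as long as at least one process keeps its file descriptor open. If a crash occurs after that point, the file descriptor is closed automatically and the reference count is decremented. Once no open descriptors to the unlinked shared memory block remain, it is deleted.

This is useful in the following scenario: a process creates a shared memory block, unlinks it, and then forks. The child inherits the file descriptor and can use the shared memory block to communicate with the parent. Once both processes have terminated, the block is removed automatically, because both file descriptors are closed.

Once the block has been unlinked, no other process can open it. Likewise, if shm_open() is called with the same name as the unlinked block, a new and completely different shared memory block is created.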
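The fork scenario described above could be sketched as follows. This is an illustrative example only: the function name unlinked_shm_demo, the single int payload and the value 42 are made up for the example. On older glibc versions the program may need to be linked with -lrt.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the child wrote into the shared block, or -1 on error. */
static int unlinked_shm_demo(const char *name)
{
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd == -1)
        return -1;

    /* The name disappears now; the object survives while it is open or mapped. */
    shm_unlink(name);

    if (ftruncate(fd, sizeof(int)) == -1) {
        close(fd);
        return -1;
    }
    int *value = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (value == MAP_FAILED) {
        close(fd);
        return -1;
    }
    *value = 0;

    pid_t child = fork();
    if (child == -1) {
        munmap(value, sizeof(int));
        close(fd);
        return -1;
    }
    if (child == 0) {
        *value = 42;                /* child writes through the shared mapping */
        _exit(0);
    }

    waitpid(child, NULL, 0);
    int result = *value;            /* parent sees the child's write */

    /* After these two calls nothing refers to the block and it is freed. */
    munmap(value, sizeof(int));
    close(fd);
    return result;
}
```

Calling unlinked_shm_demo("/some_name") from main() should leave nothing behind in /dev/shm, even if either process is killed after the shm_unlink() call.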