更好的解决多线程谜题的方案?

7
这是任务:我需要根据文件名进行锁定。可能有多达一百万个不同的文件名。(这用于大规模基于磁盘的缓存)。
我希望内存使用率低,查找时间短,这意味着我需要一个GC'd锁字典。(只有正在使用的锁可以存在于字典中)。
回调操作可能需要几分钟才能完成,因此全局锁是不可接受的。高吞吐量至关重要。
我在下面发布了我的当前解决方案,但我对复杂性感到不满意。
编辑:请不要发布不正确的解决方案。例如,允许在“获取锁对象”阶段和“锁定”阶段之间从字典中删除锁的解决方案是不正确的,无论它是否是“接受”的设计模式。
还有比这更优雅的解决方案吗?
谢谢!
[编辑:我更新了我的代码,根据RobV的建议使用循环而不是递归]
[编辑:再次更新代码以允许“超时”和更简单的调用模式。这可能是我使用的最终代码。仍然与原始帖子中的基本算法相同。]
[编辑:再次更新代码以处理回调内部异常而不使锁对象变为孤立状态]
public delegate void LockCallback();
/// <summary>
/// Provides locking based on a string key. 
/// Locks are local to the LockProvider instance.
/// The class handles disposing of unused locks. Generally used for 
/// coordinating writes to files (of which there can be millions). 
/// Only keeps key/lock pairs in memory which are in use.
/// Thread-safe.
/// </summary>
public class LockProvider {

    /// <summary>
    /// The only objects in this collection should be for open files. 
    /// </summary>
    protected Dictionary<String, Object> locks = 
                    new Dictionary<string, object>(StringComparer.Ordinal);
    /// <summary>
    /// Synchronization object for modifications to the 'locks' dictionary
    /// </summary>
    protected object createLock = new object();
    /// <summary>
    /// Attempts to execute the 'success' callback inside a lock based on 'key'.  If successful, returns true.
    /// If the lock cannot be acquired within 'timoutMs', returns false
    /// In a worst-case scenario, it could take up to twice as long as 'timeoutMs' to return false.
    /// </summary>
    /// <param name="key"></param>
    /// <param name="success"></param>
    /// <param name="failure"></param>
    /// <param name="timeoutMs"></param>
    public bool TryExecute(string key, int timeoutMs, LockCallback success){
        //Record when we started. We don't want an infinite loop.
        DateTime startedAt = DateTime.UtcNow;

        // Tracks whether the lock acquired is still correct
        bool validLock = true; 
        // The lock corresponding to 'key'
        object itemLock = null;

        try {
            //We have to loop until we get a valid lock and it stays valid until we lock it.
            do {
                // 1) Creation/aquire phase
                lock (createLock) {
                    // We have to lock on dictionary writes, since otherwise 
                    // two locks for the same file could be created and assigned
                    // at the same time. (i.e, between TryGetValue and the assignment)
                    if (!locks.TryGetValue(key, out itemLock))
                        locks[key] = itemLock = new Object(); //make a new lock!

                }
                // Loophole (part 1):
                // Right here - this is where another thread (executing part 2) could remove 'itemLock'
                // from the dictionary, and potentially, yet another thread could 
                // insert a new value for 'itemLock' into the dictionary... etc, etc..

                // 2) Execute phase
                if (System.Threading.Monitor.TryEnter(itemLock, timeoutMs)) {
                    try {
                        // May take minutes to acquire this lock. 

                        // Trying to detect an occurence of loophole above
                        // Check that itemLock still exists and matches the dictionary
                        lock (createLock) {
                            object newLock = null;
                            validLock = locks.TryGetValue(key, out newLock);
                            validLock = validLock && newLock == itemLock;
                        }
                        // Only run the callback if the lock is valid
                        if (validLock) {
                            success(); // Extremely long-running callback, perhaps throwing exceptions
                            return true;
                        }

                    } finally {
                        System.Threading.Monitor.Exit(itemLock);//release lock
                    }
                } else {
                    validLock = false; //So the finally clause doesn't try to clean up the lock, someone else will do that.
                    return false; //Someone else had the lock, they can clean it up.
                }

                //Are we out of time, still having an invalid lock?
                if (!validLock && Math.Abs(DateTime.UtcNow.Subtract(startedAt).TotalMilliseconds) > timeoutMs) {
                    //We failed to get a valid lock in time. 
                    return false;
                }


                // If we had an invalid lock, we have to try everything over again.
            } while (!validLock);
        } finally {
            if (validLock) {
                // Loophole (part 2). When loophole part 1 and 2 cross paths,
                // An lock object may be removed before being used, and be orphaned

                // 3) Cleanup phase - Attempt cleanup of lock objects so we don't 
                //   have a *very* large and slow dictionary.
                lock (createLock) {
                    //  TryEnter() fails instead of waiting. 
                    //  A normal lock would cause a deadlock with phase 2. 
                    //  Specifying a timeout would add great and pointless overhead.
                    //  Whoever has the lock will clean it up also.
                    if (System.Threading.Monitor.TryEnter(itemLock)) {
                        try {
                            // It succeeds, so no-one else is working on it 
                            // (but may be preparing to, see loophole)
                            // Only remove the lock object if it 
                            // still exists in the dictionary as-is
                            object existingLock = null;
                            if (locks.TryGetValue(key, out existingLock)
                                && existingLock == itemLock)
                                locks.Remove(key);
                        } finally {
                            // Remove the lock
                            System.Threading.Monitor.Exit(itemLock);
                        }
                    }
                }
            }
        }
        // Ideally the only objects in 'locks' will be open operations now.
        return true;
    }
}

使用示例

LockProvider p = new LockProvider();
bool success = p.TryExecute("filename",1000,delegate(){
  //This code executes within the lock
});

1
你能使用.NET4吗?如果可以的话,你可能会对System.Collections.Concurrent namespace感兴趣。 - R. Martinho Fernandes
4
如果你在协调对文件系统的写操作,为什么不直接以独占方式打开文件,从而实现锁定,而不使用这种方法? - Jeff Foster
1
我不知道你遇到了什么问题,但文件系统是专门设计来处理这种情况的。甚至还有一个原子删除关闭功能。你可能需要使用P/Invoke来访问所有这些功能,但我认为这比编写自己的代码要好得多,无论是在正确性还是性能方面。 - Jeffrey L Whitledge
@计算机语言学家:你解决谜题了吗?我的答案有帮助吗? - Steven
@计算机语言学家:嗯,我今晚会试着看一下。我发现多线程问题,呃,谜题相当有趣 :-) - Steven
显示剩余5条评论
4个回答

1

根据您对文件的操作(您提到了基于磁盘的缓存,因此我假设包括读取和写入),我建议尝试基于ReaderWriterLock的解决方案。如果您可以升级到 .Net 3.5,则建议改用ReaderWriterLockSlim,因为它的性能更好。

作为减少潜在无限递归情况的一般步骤,将代码的第一部分更改为以下内容:

do 
{
    // 1) Creation/aquire phase
    lock (createLock){
        // We have to lock on dictionary writes, since otherwise 
        // two locks for the same file could be created and assigned
        // at the same time. (i.e, between TryGetValue and the assignment)
        if (!locks.TryGetValue(key, out itemLock)) 
            locks[key] = itemLock = new Object(); //make a new lock!

    }
    // Loophole (part 1):
    // Right here - this is where another thread could remove 'itemLock'
    // from the dictionary, and potentially, yet another thread could 
    // insert a new value for 'itemLock' into the dictionary... etc, etc..

    // 2) Execute phase
    lock(itemLock){ 
        // May take minutes to acquire this lock. 
        // Real version would specify a timeout and a failure callback.

        // Trying to detect an occurence of loophole above
        // Check that itemLock still exists and matches the dictionary
        lock(createLock){
            object newLock = null;
            validLock = locks.TryGetValue(key, out newLock);
            validLock = validLock && newLock == itemLock;
        }
        // Only run the callback if the lock is valid
        if (validLock) callback(); // Extremely long-running callback. 
    }
    // If we had an invalid lock, we have to try everything over again.
} while (!validLock);

这将用循环替换您的递归,避免了无限递归导致的堆栈溢出。


好主意。希望我能想到那个 :) - Lilith River
你是否注意到该实现中还存在其他潜在漏洞? - Lilith River
所有回调都是写操作。读取是不受控制的,我让文件系统处理那部分。反正IIS已经在做所有的读取操作了,我的手也被绑住了。 - Lilith River

1

那个解决方案看起来很脆弱和复杂。在锁内部有公共回调函数是不好的实践。为什么不让LockProvider返回某种“锁”对象,以便消费者自己进行锁定。这将把locks字典的锁定与执行分开。它可能看起来像这样:

public class LockProvider
{
    private readonly object globalLock = new object();

    private readonly Dictionary<String, Locker> locks =
        new Dictionary<string, Locker>(StringComparer.Ordinal);

    public IDisposable Enter(string key)
    {
        Locker locker;

        lock (this.globalLock)
        {
            if (!this.locks.TryGetValue(key, out locker))
            {
                this.locks[key] = locker = new Locker(this, key);
            }

            // Increase wait count ínside the global lock
            locker.WaitCount++;
        }

        // Call Enter and decrease wait count óutside the 
        // global lock (to prevent deadlocks).
        locker.Enter();

        // Only one thread will be here at a time for a given locker.
        locker.WaitCount--;

        return locker;
    }

    private sealed class Locker : IDisposable
    {
        private readonly LockProvider provider;
        private readonly string key;
        private object keyLock = new object();

        public int WaitCount;

        public Locker(LockProvider provider, string key)
        {
            this.provider = provider;
            this.key = key;
        }

        public void Enter()
        {
            Monitor.Enter(this.keyLock);
        }

        public void Dispose()
        {
            if (this.keyLock != null)
            {
                this.Exit();
                this.keyLock = null;
            }
        }

        private void Exit()
        {
            lock (this.provider.globalLock)
            {
                try
                {
                    // Remove the key before releasing the lock, but
                    // only when no threads are waiting (because they
                    // will have a reference to this locker).
                    if (this.WaitCount == 0)
                    {
                        this.provider.locks.Remove(this.key);
                    }
                }
                finally
                {
                    // Release the keyLock inside the globalLock.
                    Monitor.Exit(this.keyLock);
                }
            }
        }
    }
}

LockProvider 可以按以下方式使用:

public class Consumer
{
    private LockProvider provider;

    public void DoStufOnFile(string fileName)
    {
        using (this.provider.Enter(fileName))
        {
            // Long running operation on file here.
        }
    }
}

请注意,在进入try语句(using)之前调用Monitor.Enter,这意味着在某些宿主环境(如ASP.NET和SQL Server)中,当异步异常发生时,可能会出现锁永远不被释放的情况。像ASP.NET和SQL Server这样的宿主在超时发生时会强制终止线程。把Enter放到try内部的Monitor.Enter之外重写有点棘手。
希望这可以帮助你。

我不确定是否同意在锁内使用回调函数是一种不好的做法。暴露回调函数的API总是比直接暴露锁更简单且更不容易出现使用错误。调用者在回调函数中唯一可能犯的错误是递归调用,这是显而易见和简单的,并且容易记录。 - Lilith River
@Steven:这仍然没有解决原问题中提到的根本问题。如果在GetLockObjectForKey()和locker.Enter()之间从字典中删除锁,则会同时存在两个单独的锁对象,从而允许两个线程并行执行对同一文件的操作。 - Lilith River
@计算机语言学家:您说得完全正确。我怎么会错过这个呢。这就是为什么我们需要配对编程的原因;-) 我们需要一种等待计数器来确保我们被允许移除该项。我将在本周末更新我的答案。干杯 - Steven
@计算机语言学家:我已经更新了我的答案。现在它使用一个字段来计算等待进入锁的线程数。我认为这有效地解决了根本问题。 - Steven
是的,我认为是这样。但是,当你将超时添加到锁定中时,模式会有些破坏-清理例程做出的假设是所有锁定都成功。 - Lilith River
显示剩余10条评论

0

你能不能简单地使用一个命名的Mutex,其名称来自于你的文件名?

虽然不是轻量级的同步原语,但比管理自己的同步字典要简单。

然而,如果你真的想这样做,我认为以下实现看起来更简单。你需要一个同步字典 - 要么是.NET 4的ConcurrentDictionary,要么是你自己的实现(如果你在.NET 3.5或更低版本上)。

try
{
    object myLock = new object();
    lock(myLock)
    {
        object otherLock = null;
        while(otherLock != myLock)
        {
           otherLock = lockDictionary.GetOrAdd(key, myLock);
           if (otherLock != myLock)
           {
               // Another thread has a lock in the dictionary
               if (Monitor.TryEnter(otherLock, timeoutMs))
               {
                   // Another thread still has a lock after a timeout
                   failure();
                   return;
               }
               else
               {
                   Monitor.Exit(otherLock);
               }
           }
        }
        // We've successfully added myLock to the dictionary
        try
        {
           // Do our stuff
           success();
        }
        finally
        {
            lockDictionary.Remove(key);
        }
    }
}

我认为你的意思是!Monitor.TryEnter,但我喜欢你的方法。 我看到的唯一可能的缺点是在争用情况下会发生大量重新排序,而原始方法只会在更糟糕的巧合情况下重新排序... - Lilith River

0

在.NET中似乎没有一种优雅的方法来做到这一点,尽管我通过@RobV的循环建议改进了算法。这是我最终采用的解决方案。

它免疫了“孤立引用”错误,这似乎是@Steven答案遵循的标准模式的典型问题。

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

namespace ImageResizer.Plugins.DiskCache {

public delegate void LockCallback();
/// <summary>
/// Provides locking based on a string key. 
/// Locks are local to the LockProvider instance.
/// The class handles disposing of unused locks. Generally used for 
/// coordinating writes to files (of which there can be millions). 
/// Only keeps key/lock pairs in memory which are in use.
/// Thread-safe.
/// </summary>
public class LockProvider {

    /// <summary>
    /// The only objects in this collection should be for open files. 
    /// </summary>
    protected Dictionary<String, Object> locks = 
                    new Dictionary<string, object>(StringComparer.Ordinal);
    /// <summary>
    /// Synchronization object for modifications to the 'locks' dictionary
    /// </summary>
    protected object createLock = new object();
    /// <summary>
    /// Attempts to execute the 'success' callback inside a lock based on 'key'.  If successful, returns true.
    /// If the lock cannot be acquired within 'timoutMs', returns false
    /// In a worst-case scenario, it could take up to twice as long as 'timeoutMs' to return false.
    /// </summary>
    /// <param name="key"></param>
    /// <param name="success"></param>
    /// <param name="failure"></param>
    /// <param name="timeoutMs"></param>
    public bool TryExecute(string key, int timeoutMs, LockCallback success){
        //Record when we started. We don't want an infinite loop.
        DateTime startedAt = DateTime.UtcNow;

        // Tracks whether the lock acquired is still correct
        bool validLock = true; 
        // The lock corresponding to 'key'
        object itemLock = null;

        try {
            //We have to loop until we get a valid lock and it stays valid until we lock it.
            do {
                // 1) Creation/aquire phase
                lock (createLock) {
                    // We have to lock on dictionary writes, since otherwise 
                    // two locks for the same file could be created and assigned
                    // at the same time. (i.e, between TryGetValue and the assignment)
                    if (!locks.TryGetValue(key, out itemLock))
                        locks[key] = itemLock = new Object(); //make a new lock!

                }
                // Loophole (part 1):
                // Right here - this is where another thread (executing part 2) could remove 'itemLock'
                // from the dictionary, and potentially, yet another thread could 
                // insert a new value for 'itemLock' into the dictionary... etc, etc..

                // 2) Execute phase
                if (System.Threading.Monitor.TryEnter(itemLock, timeoutMs)) {
                    try {
                        // May take minutes to acquire this lock. 

                        // Trying to detect an occurence of loophole above
                        // Check that itemLock still exists and matches the dictionary
                        lock (createLock) {
                            object newLock = null;
                            validLock = locks.TryGetValue(key, out newLock);
                            validLock = validLock && newLock == itemLock;
                        }
                        // Only run the callback if the lock is valid
                        if (validLock) {
                            success(); // Extremely long-running callback, perhaps throwing exceptions
                            return true;
                        }

                    } finally {
                        System.Threading.Monitor.Exit(itemLock);//release lock
                    }
                } else {
                    validLock = false; //So the finally clause doesn't try to clean up the lock, someone else will do that.
                    return false; //Someone else had the lock, they can clean it up.
                }

                //Are we out of time, still having an invalid lock?
                if (!validLock && Math.Abs(DateTime.UtcNow.Subtract(startedAt).TotalMilliseconds) > timeoutMs) {
                    //We failed to get a valid lock in time. 
                    return false;
                }


                // If we had an invalid lock, we have to try everything over again.
            } while (!validLock);
        } finally {
            if (validLock) {
                // Loophole (part 2). When loophole part 1 and 2 cross paths,
                // An lock object may be removed before being used, and be orphaned

                // 3) Cleanup phase - Attempt cleanup of lock objects so we don't 
                //   have a *very* large and slow dictionary.
                lock (createLock) {
                    //  TryEnter() fails instead of waiting. 
                    //  A normal lock would cause a deadlock with phase 2. 
                    //  Specifying a timeout would add great and pointless overhead.
                    //  Whoever has the lock will clean it up also.
                    if (System.Threading.Monitor.TryEnter(itemLock)) {
                        try {
                            // It succeeds, so no-one else is working on it 
                            // (but may be preparing to, see loophole)
                            // Only remove the lock object if it 
                            // still exists in the dictionary as-is
                            object existingLock = null;
                            if (locks.TryGetValue(key, out existingLock)
                                && existingLock == itemLock)
                                locks.Remove(key);
                        } finally {
                            // Remove the lock
                            System.Threading.Monitor.Exit(itemLock);
                        }
                    }
                }
            }
        }
        // Ideally the only objects in 'locks' will be open operations now.
        return true;
    }
}
}

使用这段代码非常简单:

LockProvider p = new LockProvider();
bool success = p.TryExecute("filename",1000,delegate(){
  //This code executes within the lock
});

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接