我正在编写一个实时视频成像应用程序,需要加速这个方法。目前执行时间约为10毫秒,我希望将其降至2-3毫秒。
我尝试过Array.Copy和Buffer.BlockCopy,它们都需要约30毫秒的时间,比手动复制慢3倍。
一种想法是以整数形式复制4个字节,然后以整数形式粘贴它们,从而将4行代码减少为1行代码。不过,我不确定该如何做到这一点。
另一个想法是使用指针和不安全代码来完成这个操作,但我也不确定该如何做到这一点。
非常感谢任何帮助。谢谢!
编辑:数组大小为:inputBuffer [327680],lookupTable [16384],outputBuffer [1310720]
public byte[] ApplyLookupTableToBuffer(byte[] lookupTable, ushort[] inputBuffer)
{
System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
// Precalculate and initialize the variables
int lookupTableLength = lookupTable.Length;
int bufferLength = inputBuffer.Length;
byte[] outputBuffer = new byte[bufferLength * 4];
int outIndex = 0;
int curPixelValue = 0;
// For each pixel in the input buffer...
for (int curPixel = 0; curPixel < bufferLength; curPixel++)
{
outIndex = curPixel * 4; // Calculate the corresponding index in the output buffer
curPixelValue = inputBuffer[curPixel] * 4; // Retrieve the pixel value and multiply by 4 since the lookup table has 4 values (blue/green/red/alpha) for each pixel value
// If the multiplied pixel value falls within the lookup table...
if ((curPixelValue + 3) < lookupTableLength)
{
// Copy the lookup table value associated with the value of the current input buffer location to the output buffer
outputBuffer[outIndex + 0] = lookupTable[curPixelValue + 0];
outputBuffer[outIndex + 1] = lookupTable[curPixelValue + 1];
outputBuffer[outIndex + 2] = lookupTable[curPixelValue + 2];
outputBuffer[outIndex + 3] = lookupTable[curPixelValue + 3];
//System.Buffer.BlockCopy(lookupTable, curPixelValue, outputBuffer, outIndex, 4); // Takes 2-10x longer than just copying the values manually
//Array.Copy(lookupTable, curPixelValue, outputBuffer, outIndex, 4); // Takes 2-10x longer than just copying the values manually
}
}
Debug.WriteLine("ApplyLookupTableToBuffer(ms): " + sw.Elapsed.TotalMilliseconds.ToString("N2"));
return outputBuffer;
}
编辑:我已经更新了方法,保持了相同的变量名称,以便其他人可以看到基于HABJAN下面的解决方案该代码如何转换。
public byte[] ApplyLookupTableToBufferV2(byte[] lookupTable, ushort[] inputBuffer)
{
System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
// Precalculate and initialize the variables
int lookupTableLength = lookupTable.Length;
int bufferLength = inputBuffer.Length;
byte[] outputBuffer = new byte[bufferLength * 4];
//int outIndex = 0;
int curPixelValue = 0;
unsafe
{
fixed (byte* pointerToOutputBuffer = &outputBuffer[0])
fixed (byte* pointerToLookupTable = &lookupTable[0])
{
// Cast to integer pointers since groups of 4 bytes get copied at once
uint* lookupTablePointer = (uint*)pointerToLookupTable;
uint* outputBufferPointer = (uint*)pointerToOutputBuffer;
// For each pixel in the input buffer...
for (int curPixel = 0; curPixel < bufferLength; curPixel++)
{
// No need to multiply by 4 on the following 2 lines since the pointers are for integers, not bytes
// outIndex = curPixel; // This line is commented since we can use curPixel instead of outIndex
curPixelValue = inputBuffer[curPixel]; // Retrieve the pixel value
if ((curPixelValue + 3) < lookupTableLength)
{
outputBufferPointer[curPixel] = lookupTablePointer[curPixelValue];
}
}
}
}
Debug.WriteLine("2 ApplyLookupTableToBuffer(ms): " + sw.Elapsed.TotalMilliseconds.ToString("N2"));
return outputBuffer;
}
byte[]
操作的实时视频成像应用?不可能......迟早会遇到性能瓶颈,所以我强烈建议学习Interop和/或C++/CLI。 - Konrad Kokosaunsafe
块中使用指针算术重写代码并再次测量性能,但是这里没有简单的性能保证。 - Patryk Ćwiekunsafe
可以提高约10%的性能。当然,具体情况因人而异。 - Robert Harvey