Preserving order in a multithreaded loop

Question


6

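I've started experimenting with multithreading for a CPU-intensive batch job. Essentially, I'm trying to combine multiple single-page TIFF files into a single PDF document. This works fine with a foreach loop or standard iteration, but it can be very slow for documents with several hundred pages. I tried the following approach (based on some examples I found) to use multiple threads. It gives a significant performance boost, but it destroys the page order: instead of 1, 2, 3, 4 I get 1, 3, 4, 2, 6, 5, depending on which thread finishes first.

My question is: how can I use this technique while preserving the page order, and if I can, will that negate the performance benefit of multithreading? Thanks for your help.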

PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Count();

// Source must be array or IList.
var source = Enumerable.Range(0, 100000).ToArray();
// Partition the entire source array.
var rangePartitioner = Partitioner.Create(0, counter);
double[] results = new double[counter];
// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        f_prime = split[i].Replace(" " , "");
        PdfPage page = doc.AddPage();
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        double x = 0;
        gfx.DrawImage(image, x, 0);

    }
});

- David

3 Answers

2

Use .AsParallel().AsOrdered(), as described in this documentation: http://msdn.microsoft.com/zh-cn/library/dd460677.aspx

I think the code should look something like this:

rangePartitioner.AsParallel().AsOrdered().ForAll(
    range => 
    {
        // Loop over each range element without a delegate invocation.
        ...
    });
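
One caveat worth checking: as far as I know, ForAll runs its delegate in parallel and does not guarantee ordering, even after AsOrdered(). AsOrdered() matters when the results are consumed as a sequence (for example with ToArray() or a plain foreach). A minimal sketch of that pattern follows; Process is a hypothetical stand-in for the expensive per-page work and is not part of the original code.

```csharp
using System;
using System.Linq;

static class OrderedPlinqSketch
{
    // Hypothetical stand-in for the expensive per-page work (e.g. decoding a TIFF).
    static string Process(int pageNumber)
    {
        return "page " + pageNumber;
    }

    public static string[] ProcessInOrder(int pageCount)
    {
        return Enumerable.Range(1, pageCount)
            .AsParallel()
            .AsOrdered()                // ask PLINQ to keep source order in the output
            .Select(n => Process(n))    // the work itself may run in any order
            .ToArray();                 // materialized in source order
    }
}
```

The resulting array can then be walked sequentially on one thread, which is where anything order-sensitive (such as adding pages to the document) would go.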

- StriplingWarrior

2

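I'm not sure the other solutions will do quite what he wants. The reason is that PdfPage page = doc.AddPage(); creates the new page and adds it to the document in a single step, so the pages will always end up out of order: doc assigns page positions on a first-come, first-served basis.

If AddPage is a fast operation, you could create all of the pages up front without doing any processing, then go back and render the TIFF images onto those pages.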

PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Length;

// Partition the index range [0, counter).
var rangePartitioner = Partitioner.Create(0, counter);

// Add every page sequentially first, so the page order is fixed up front.
PdfPage[] pages = new PdfPage[counter];
for (int i = 0; i < counter; ++i)
{
    pages[i] = doc.AddPage();
}

// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        // Local variable, so the path is not shared between threads.
        string path = split[i].Replace(" ", "");
        PdfPage page = pages[i];
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(path);
        double x = 0;
        gfx.DrawImage(image, x, 0);
    }
});

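Edit

I think there is a more elegant solution, but without knowing the properties of PdfPage I didn't want to offer it earlier. If you can ask a PdfPage which page position it occupies, you can easily do something like this: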

PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Length;

// Partition the index range [0, counter).
var rangePartitioner = Partitioner.Create(0, counter);

// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        PdfPage page = doc.AddPage();
        // Only use i as a loop counter, not as the index
        int pageIndex = page.PageIndex; // This is what I don't know
        // Local variable, so the path is not shared between threads.
        string path = split[pageIndex].Replace(" ", "");
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(path);
        double x = 0;
        gfx.DrawImage(image, x, 0);
    }
});

- Andrew T Finnell

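+1: I didn't realize it until my second read, but you're right. Moving doc.AddPage() outside the loop is the only way to make sure the page numbers end up in the right order. - StriplingWarrior

Thanks for your help. I haven't tried it yet, but at a quick glance this looks like a good solution. - David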


- BrokenGlass · Accepted Answer

I suggest using the Parallel.ForEach overload that passes the element's index to the loop body:

 Parallel.ForEach(rangePartitioner, (range, loopState, elementIndex) =>

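Then, inside the loop, you can use that index to store each result in an array, and once all the work is done, walk through the results in order.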
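
A minimal sketch of that idea, assuming the expensive part can be separated from the order-sensitive part. Load is a hypothetical stand-in for the per-file work, and the sketch iterates over the file paths directly rather than over range partitions, so the index is the position of each path:

```csharp
using System;
using System.Threading.Tasks;

static class IndexedForEachSketch
{
    // Hypothetical stand-in for the expensive, order-independent work per file.
    static string Load(string path)
    {
        return path.Replace(" ", "").ToUpperInvariant();
    }

    public static string[] LoadAll(string[] paths)
    {
        var results = new string[paths.Length];

        // The third lambda parameter is the element's index in the source.
        Parallel.ForEach(paths, (path, loopState, index) =>
        {
            // Each iteration writes only to its own slot, so no locking is needed.
            results[index] = Load(path);
        });

        return results;
    }
}
```

After LoadAll returns, a plain sequential loop over the results can do the order-sensitive step (here, that would be calling doc.AddPage() and drawing each image) so the pages come out as 1, 2, 3, 4. How much of the speedup survives depends on how much of the cost sits in the parallel part versus the sequential part.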