在C#中对List<String>进行排序

16

如何根据元素的整数值对列表进行排序

列表如下

"1"
"5"
"3"
"6"
"11"
"9"
"NUM1"
"NUM0"

结果应该像这样

"1"
"3"
"5"
"6"
"9"
"11"
"NUM0"
"NUM1"

有没有使用LINQ或Lambda表达式来实现这个想法的想法?

谢谢您提前。


这些字符串值是否代表十六进制数?或者列表中可能出现“S2”吗? - El Ronnoco
@El Ronnoco:不是十六进制。可以是“S2”之类的…(编辑中) - Thorin Oakenshield
7个回答

17

这被称为“自然排序”,通常用于像文件名之类的项目排序。

下面是一个天真(因为它可能存在大量的Unicode问题)的实现,似乎可以解决问题:

您可以将下面的代码复制到LINQPad中执行并测试。

基本上,比较算法将识别字符串中的数字,并通过在最短的数字前填充前导零来处理这些数字,因此例如两个字符串"Test123Abc""Test7X"应该被视为"Test123Abc""Test007X"进行比较,这样就可以产生您想要的结果。

但是,当我说“天真”时,我的意思是我可能在这里有大量真正的Unicode问题,比如处理变音符和多代码点字符等。 如果有人可以提供更好的实现,我很愿意看到它。

注:

  • 实现实际上没有解析数字,因此任意长度的数字应该可以很好地工作
  • 由于它实际上没有将数字解析为“数字”,因此浮点数将无法正确处理,“123.45”与“123.789”将被比较为“123.045”与“123.789”,这是错误的。

代码:

void Main()
{
    List<string> input = new List<string>
    {
        "1", "5", "3", "6", "11", "9", "A1", "A0"
    };
    var output = input.NaturalSort();
    output.Dump();
}

public static class Extensions
{
    public static IEnumerable<string> NaturalSort(
        this IEnumerable<string> collection)
    {
        return NaturalSort(collection, CultureInfo.CurrentCulture);
    }

    public static IEnumerable<string> NaturalSort(
        this IEnumerable<string> collection, CultureInfo cultureInfo)
    {
        return collection.OrderBy(s => s, new NaturalComparer(cultureInfo));
    }

    private class NaturalComparer : IComparer<string>
    {
        private readonly CultureInfo _CultureInfo;

        public NaturalComparer(CultureInfo cultureInfo)
        {
            _CultureInfo = cultureInfo;
        }

        public int Compare(string x, string y)
        {
            // simple cases
            if (x == y) // also handles null
                return 0;
            if (x == null)
                return -1;
            if (y == null)
                return +1;

            int ix = 0;
            int iy = 0;
            while (ix < x.Length && iy < y.Length)
            {
                if (Char.IsDigit(x[ix]) && Char.IsDigit(y[iy]))
                {
                    // We found numbers, so grab both numbers
                    int ix1 = ix++;
                    int iy1 = iy++;
                    while (ix < x.Length && Char.IsDigit(x[ix]))
                        ix++;
                    while (iy < y.Length && Char.IsDigit(y[iy]))
                        iy++;
                    string numberFromX = x.Substring(ix1, ix - ix1);
                    string numberFromY = y.Substring(iy1, iy - iy1);

                    // Pad them with 0's to have the same length
                    int maxLength = Math.Max(
                        numberFromX.Length,
                        numberFromY.Length);
                    numberFromX = numberFromX.PadLeft(maxLength, '0');
                    numberFromY = numberFromY.PadLeft(maxLength, '0');

                    int comparison = _CultureInfo
                        .CompareInfo.Compare(numberFromX, numberFromY);
                    if (comparison != 0)
                        return comparison;
                }
                else
                {
                    int comparison = _CultureInfo
                        .CompareInfo.Compare(x, ix, 1, y, iy, 1);
                    if (comparison != 0)
                        return comparison;
                    ix++;
                    iy++;
                }
            }

            // we should not be here with no parts left, they're equal
            Debug.Assert(ix < x.Length || iy < y.Length);

            // we still got parts of x left, y comes first
            if (ix < x.Length)
                return +1;

            // we still got parts of y left, x comes first
            return -1;
        }
    }
}

我已经发布了一个问题,想找到更好地处理变音符号和多码点字符的方法,请看这里:https://dev59.com/glDTa4cB1Zd3GeqPJnwC - Lasse V. Karlsen

16

这样如何:

    list.Sort((x, y) =>
    {
        int ix, iy;
        return int.TryParse(x, out ix) && int.TryParse(y, out iy)
              ? ix.CompareTo(iy) : string.Compare(x, y);
    });

4
“NUM10”和“NUM2”之间有什么区别呢?对我们来说,“NUM2”显然是在“NUM10”之前,但是它们排序时并不会按照这个顺序排列。 - Xander
5
@Xander - 请解释一下"obviously"的意思;问题中提到了“整数值” - 但是除非原帖子定义了应该允许的术语/模式,否则我不会将“NUM10”视为整数值。 - Marc Gravell
1
对不起,我的“显然”是指人类如何感知值的含义……比如“NUM”和整数“NUM10”……这更多是提醒要确保简单直接的排序可能/可能不可取。 - Xander

2
Jeff Atwood写了一篇有关自然排序的博客文章,其中提供了一些所需算法的可用实现。
Jeff提供的链接之一指向了Dave Koelle,他提供了一个C#实现
/*
 * The Alphanum Algorithm is an improved sorting algorithm for strings
 * containing numbers.  Instead of sorting numbers in ASCII order like
 * a standard sort, this algorithm sorts numbers in numeric order.
 *
 * The Alphanum Algorithm is discussed at http://www.DaveKoelle.com
 *
 * Based on the Java implementation of Dave Koelle's Alphanum algorithm.
 * Contributed by Jonathan Ruckwood <jonathan.ruckwood@gmail.com>
 *
 * Adapted by Dominik Hurnaus <dominik.hurnaus@gmail.com> to
 *   - correctly sort words where one word starts with another word
 *   - have slightly better performance
 *
 * Released under the MIT License - https://opensource.org/licenses/MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining
 * a copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included
 * in all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
 * USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 */
using System;
using System.Collections;
using System.Text;

/*
 * Please compare against the latest Java version at http://www.DaveKoelle.com
 * to see the most recent modifications
 */
namespace AlphanumComparator
{
    public class AlphanumComparator : IComparer
    {
        private enum ChunkType {Alphanumeric, Numeric};
        private bool InChunk(char ch, char otherCh)
        {
            ChunkType type = ChunkType.Alphanumeric;

            if (char.IsDigit(otherCh))
            {
                type = ChunkType.Numeric;
            }

            if ((type == ChunkType.Alphanumeric && char.IsDigit(ch))
                || (type == ChunkType.Numeric && !char.IsDigit(ch)))
            {
                return false;
            }

            return true;
        }

        public int Compare(object x, object y)
        {
            String s1 = x as string;
            String s2 = y as string;
            if (s1 == null || s2 == null)
            {
                return 0;
            }

            int thisMarker = 0, thisNumericChunk = 0;
            int thatMarker = 0, thatNumericChunk = 0;

            while ((thisMarker < s1.Length) || (thatMarker < s2.Length))
            {
                if (thisMarker >= s1.Length)
                {
                    return -1;
                }
                else if (thatMarker >= s2.Length)
                {
                    return 1;
                }
                char thisCh = s1[thisMarker];
                char thatCh = s2[thatMarker];

                StringBuilder thisChunk = new StringBuilder();
                StringBuilder thatChunk = new StringBuilder();

                while ((thisMarker < s1.Length) && (thisChunk.Length==0 ||InChunk(thisCh, thisChunk[0])))
                {
                    thisChunk.Append(thisCh);
                    thisMarker++;

                    if (thisMarker < s1.Length)
                    {
                        thisCh = s1[thisMarker];
                    }
                }

                while ((thatMarker < s2.Length) && (thatChunk.Length==0 ||InChunk(thatCh, thatChunk[0])))
                {
                    thatChunk.Append(thatCh);
                    thatMarker++;

                    if (thatMarker < s2.Length)
                    {
                        thatCh = s2[thatMarker];
                    }
                }

                int result = 0;
                // If both chunks contain numeric characters, sort them numerically
                if (char.IsDigit(thisChunk[0]) && char.IsDigit(thatChunk[0]))
                {
                    thisNumericChunk = Convert.ToInt32(thisChunk.ToString());
                    thatNumericChunk = Convert.ToInt32(thatChunk.ToString());

                    if (thisNumericChunk < thatNumericChunk)
                    {
                        result = -1;
                    }

                    if (thisNumericChunk > thatNumericChunk)
                    {
                        result = 1;
                    }
                }
                else
                {
                    result = thisChunk.ToString().CompareTo(thatChunk.ToString());
                }

                if (result != 0)
                {
                    return result;
                }
            }

            return 0;
        }
    }
}

2
尝试编写一个小型帮助类来解析和表示您的标记。例如,不要进行过多的检查:
public class NameAndNumber
{
    public NameAndNumber(string s)
    {
        OriginalString = s;
        Match match = Regex.Match(s,@"^(.*?)(\d*)$");
        Name = match.Groups[1].Value;
        int number;
        int.TryParse(match.Groups[2].Value, out number);
        Number = number; //will get default value when blank
    }

    public string OriginalString { get; private set; }
    public string Name { get; private set; }
    public int Number { get; private set; }
}

现在编写比较器或手动排序变得很容易:
var list = new List<string> { "ABC", "1", "5", "NUM44", "3", 
                              "6", "11", "9", "NUM1", "NUM0" };

var sorted = list.Select(str => new NameAndNumber(str))
    .OrderBy(n => n.Name)
    .ThenBy(n => n.Number);

给出的结果为:
1、3、5、6、9、11、ABC、NUM0、NUM1、NUM44。

顺便提一下,代码仅涉及字符串末尾附近的数字 - a123b12 -> 名称:a123b数字:12 - Kobi

0

这是最快的算法 - 用2毫秒排序50个项目~

static void Sort()
{
    string[] partNumbers = new string[] {"A1", "A2", "A10", "A111"};
    string[] result = partNumbers.OrderBy(x => PadNumbers(x)).ToArray();
}


public static string PadNumbers(string input)
{
        const int MAX_NUMBER_LEN = 10;

        string newInput = "";
        string currentNumber = "";
        foreach (char a in input)
        {
            if (!char.IsNumber(a))
            {
                if (currentNumber == "")
                {
                    newInput += a;
                    continue;
                }
                newInput += "0000000000000".Substring(0, MAX_NUMBER_LEN - currentNumber.Length) + currentNumber;
                currentNumber = "";
            }
            currentNumber += a;
        }
        if (currentNumber != "")
        {
            newInput += "0000000000000".Substring(0, MAX_NUMBER_LEN - currentNumber.Length) + currentNumber;
        }

        return newInput;
    }

~


-1

这里是一个 C# 7 的解决方案(假设列表的名称为 a):

    var numericList = a.Where(i => int.TryParse(i, out _)).OrderBy(j => int.Parse(j)).ToList();
    var nonNumericList = a.Where(i => !int.TryParse(i, out _)).OrderBy(j => j).ToList();
    a.Clear();
    a.AddRange(numericList);
    a.AddRange(nonNumericList);

-1

我认为你只需要使用listName.Sort(),因为sort()方法使用默认比较器来快速排序节点。默认比较器恰好可以满足你的需求。


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接