Merge two XElements

3
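I'm not quite sure how to ask this, or whether it even exists, but I need to merge two XElements into a single element, with one taking precedence over the other.
The preference here is VB.NET and LINQ, but any language would help if it demonstrates how to do this without manually picking apart and resolving every element and attribute.
For example, say I have two elements. Humor me on them being as different as they are.

1.
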
<HockeyPlayer height="6.0" hand="left">
<Position>Center</Position>
<Idol>Gordie Howe</Idol>
</HockeyPlayer>

2.

<HockeyPlayer height="5.9" startinglineup="yes">
<Idol confirmed="yes">Wayne Gretzky</Idol>
</HockeyPlayer>

The result of the merge would be:
<HockeyPlayer height="6.0" hand="left" startinglineup="yes">
<Position>Center</Position>
<Idol confirmed="yes">Gordie Howe</Idol>
</HockeyPlayer>

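A few things to note: the height attribute value of #1 overrides #2. The hand attribute and its value are simply copied from #1 (it does not exist in #2). The startinglineup attribute and its value are copied from #2 (it does not exist in #1). The Position element is copied from #1 (it does not exist in #2). The Idol element value of #1 overrides #2, but #2's confirmed attribute (which does not exist in #1) is copied over.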

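In short, #1 takes precedence over #2 wherever there is a conflict (meaning both have the same element and/or attribute), and where there is no conflict, both are copied into the final result.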

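I've tried searching for this but can't seem to find anything, possibly because the words I'm searching with are too generic. Any thoughts or solutions (especially for LINQ)?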

2 Answers

6

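For the benefit of others looking for the same thing, as I assume both contributors have long since lost interest... I needed to do something similar, but a bit more complete. It is still not totally complete, though: as the XMLDoc says, it does not handle non-element content well, but I don't need it to, because my non-element content is either text or unimportant. Feel free to enhance and re-post... Oh, and it's C# 4.0, as that's what I use...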

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

/// <summary>
/// Provides facilities to merge 2 XElement or XML files. 
/// <para>
/// Where the LHS holds an element with non-element content and the RHS holds 
/// a tree, the LHS non-element content will be applied as text and the RHS 
/// tree ignored. 
/// </para>
/// <para>
/// This does not handle anything other than element and text nodes (in fact 
/// anything other than element is treated as text). Thus comments in the 
/// source XML are likely to be lost.
/// </para>
/// <remarks>You can pass <see cref="XDocument.Root"/> if you have XDocs 
/// to work with:
/// <code>
/// XDocument mergedDoc = new XDocument(MergeElements(lhsDoc.Root, rhsDoc.Root));
/// </code></remarks>
/// </summary>
public class XmlMerging
{
    /// <summary>
    /// Produce an XML file that is made up of the unique data from both
    /// the LHS file and the RHS file. Where there are duplicates the LHS will 
    /// be treated as master
    /// </summary>
    /// <param name="lhsPath">XML file to base the merge off. This will override 
    /// the RHS where there are clashes</param>
    /// <param name="rhsPath">XML file to enrich the merge with</param>
    /// <param name="resultPath">The fully qualified file name in which to 
    /// write the resulting merged XML</param>
    /// <param name="options"> Specifies the options to apply when saving. 
    /// Default is <see cref="SaveOptions.OmitDuplicateNamespaces"/></param>
    public static bool TryMergeXmlFiles(string lhsPath, string rhsPath, 
        string resultPath, SaveOptions options = SaveOptions.OmitDuplicateNamespaces)
    {
        try
        {
            // pass the caller's save options through rather than silently using the default
            MergeXmlFiles(lhsPath, rhsPath, resultPath, options);
        }
        catch (Exception)
        {
            // could integrate your logging here
            return false;
        }
        return true;
    }

    /// <summary>
    /// Produce an XML file that is made up of the unique data from both the LHS
    /// file and the RHS file. Where there are duplicates the LHS will be treated 
    /// as master
    /// </summary>
    /// <param name="lhsPath">XML file to base the merge off. This will override 
    /// the RHS where there are clashes</param>
    /// <param name="rhsPath">XML file to enrich the merge with</param>
    /// <param name="resultPath">The fully qualified file name in which to write 
    /// the resulting merged XML</param>
    /// <param name="options"> Specifies the options to apply when saving. 
    /// Default is <see cref="SaveOptions.OmitDuplicateNamespaces"/></param>
    public static void MergeXmlFiles(string lhsPath, string rhsPath, 
        string resultPath, SaveOptions options = SaveOptions.OmitDuplicateNamespaces)
    {
        XElement result = 
            MergeElements(XElement.Load(lhsPath), XElement.Load(rhsPath));
        result.Save(resultPath, options);
    }

    /// <summary>
    /// Produce a resulting <see cref="XElement"/> that is made up of the unique 
    /// data from both the LHS element and the RHS element. Where there are 
    /// duplicates the LHS will be treated as master
    /// </summary>
    /// <param name="lhs">XML Element tree to base the merge off. This will 
    /// override the RHS where there are clashes</param>
    /// <param name="rhs">XML element tree to enrich the merge with</param>
    /// <returns>A merge of the left hand side and right hand side element 
    /// trees treating the LHS as master in conflicts</returns>
    public static XElement MergeElements(XElement lhs, XElement rhs)
    {
        // if either of the sides of the merge are empty then return the other... 
        // if they both are then we return null
        if (rhs == null) return lhs;
        if (lhs == null) return rhs;

        // Otherwise build a new result based on the root of the lhs (again lhs 
        // is taken as master)
        XElement result = new XElement(lhs.Name);

        MergeAttributes(result, lhs.Attributes(), rhs.Attributes());

        // now add the lhs child elements merged to the RHS elements if there are any
        MergeSubElements(result, lhs, rhs);
        return result;
    }

    /// <summary>
    /// Enrich the passed in <see cref="XElement"/> with the contents of both 
    /// attribute collections.
    /// Again where the RHS conflicts with the LHS, the LHS is deemed the master
    /// </summary>
    /// <param name="elementToUpdate">The element to take the merged attribute 
    /// collection</param>
    /// <param name="lhs">The master set of attributes</param>
    /// <param name="rhs">The attributes to enrich the merge</param>
    private static void MergeAttributes(XElement elementToUpdate, 
        IEnumerable<XAttribute> lhs, IEnumerable<XAttribute> rhs)
    {
        // Add in the attribs of the lhs... we will only add new attribs from 
        // the rhs duplicates will be ignored as lhs is master
        elementToUpdate.Add(lhs);

        // collapse the element names to save multiple evaluations... also why 
        // we ain't putting this in as a sub-query
        List<XName> lhsAttributeNames = 
            lhs.Select(attribute => attribute.Name).ToList();
        // so add in any missing attributes
        elementToUpdate.Add(rhs.Where(attribute => 
            !lhsAttributeNames.Contains(attribute.Name)));
    }

    /// <summary>
    /// Enrich the passed in <see cref="XElement"/> with the contents of both 
    /// <see cref="XElement.Elements()"/> subtrees.
    /// Again where the RHS conflicts with the LHS, the LHS is deemed the master.
    /// Where the passed elements do not have element subtrees, but do have text 
    /// content that will be used. Again the LHS will dominate
    /// </summary>
    /// <remarks>Where the LHS has text content and no subtree, but the RHS has 
    /// a subtree; the LHS text content will be used and the RHS tree ignored. 
    /// This may be unexpected but is consistent with other .NET XML 
    /// operations</remarks>
    /// <param name="elementToUpdate">The element to take the merged element 
    /// collection</param>
    /// <param name="lhs">The element from which to extract the master 
    /// subtree</param>
    /// <param name="rhs">The element from which to extract the subtree to 
    /// enrich the merge</param>
    private static void MergeSubElements(XElement elementToUpdate, 
        XElement lhs, XElement rhs)
    {
        // see below for the special case where there are no children on the LHS
        if (lhs.Elements().Any())
        {
            // collapse the element names to a list to save multiple evaluations...
            // also why we ain't putting this in as a sub-query later
            List<XName> lhsElementNames = 
                lhs.Elements().Select(element => element.Name).ToList();

            // Add in the elements of the lhs and merge in any elements of the 
            //same name on the RHS
            elementToUpdate.Add(
                lhs.Elements().Select(
                    lhsElement => 
                        MergeElements(lhsElement, rhs.Element(lhsElement.Name))));

            // so add in any missing elements from the rhs
            elementToUpdate.Add(rhs.Elements().Where(element => 
                !lhsElementNames.Contains(element.Name)));
        }
        else
        {
            // special case for elements where they have no element children 
            // but still have content:
            // use the lhs text value if it is there
            if (!string.IsNullOrEmpty(lhs.Value))
            {
                elementToUpdate.Value = lhs.Value;
            }
            // if it isn't then see if we have any children on the right
            else if (rhs.Elements().Any())
            {
                // we do so shove them in the result unaltered
                elementToUpdate.Add(rhs.Elements());
            }
            else
            {
                // nope then use the text value (doesn't matter if it is empty 
                // as we have nothing better elsewhere)
                elementToUpdate.Value = rhs.Value;
            }
        }
    }
}

Thanks for the contribution! It looks very solid. When I get some time, I'll port it to VB.NET and give it a try. Thanks. - Todd Main

4

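Here's a console app that produces the result listed in your question. It uses recursion to process each sub-element. The one thing it doesn't check for is child elements that appear in Elem2 but not in Elem1, but hopefully this will get you started.

I'm not sure I would call this the best possible solution, but it does work.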

Imports System
Imports System.Linq
Imports System.Xml.Linq

Module Module1

Function MergeElements(ByVal Elem1 As XElement, ByVal Elem2 As XElement) As XElement

    If Elem2 Is Nothing Then
        Return Elem1
    End If

    Dim result = New XElement(Elem1.Name)

    For Each attr In Elem1.Attributes
        result.Add(attr)
    Next

    Dim Elem1AttributeNames = From attr In Elem1.Attributes _
                              Select attr.Name

    For Each attr In Elem2.Attributes
        If Not Elem1AttributeNames.Contains(attr.Name) Then
            result.Add(attr)
        End If
    Next

    If Elem1.Elements().Count > 0 Then
        For Each elem In Elem1.Elements
            result.Add(MergeElements(elem, Elem2.Element(elem.Name)))
        Next
    Else
        result.Value = Elem1.Value
    End If

    Return result
End Function

Sub Main()
    Dim Elem1 = <HockeyPlayer height="6.0" hand="left">
                    <Position>Center</Position>
                    <Idol>Gordie Howe</Idol>
                </HockeyPlayer>

    Dim Elem2 = <HockeyPlayer height="5.9" startinglineup="yes">
                    <Idol confirmed="yes">Wayne Gretzky</Idol>
                </HockeyPlayer>

    Console.WriteLine(MergeElements(Elem1, Elem2))
    Console.ReadLine()
End Sub

End Module

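Edit: I just noticed that the function was missing As XElement. I'm actually surprised that it worked without it! I work with VB.NET every day, but it has some quirks that I still don't totally understand.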
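
One way to close the gap mentioned in this answer (child elements that appear only in Elem2) is to append them after the merged Elem1 children. The following is a minimal sketch under that assumption, not the answerer's code: the names MergeDemo and MergeAll are hypothetical, and the Team child in the sample data is made up purely to exercise the new branch.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module MergeDemo

    ' Hypothetical variant of MergeElements: Elem1 wins on conflicts,
    ' and children found only in Elem2 are copied across as well.
    Function MergeAll(ByVal Elem1 As XElement, ByVal Elem2 As XElement) As XElement
        If Elem2 Is Nothing Then Return Elem1
        If Elem1 Is Nothing Then Return Elem2

        Dim result As New XElement(Elem1.Name)

        ' Attributes: all of Elem1's, then those of Elem2 that Elem1 lacks.
        result.Add(Elem1.Attributes())
        Dim attrNames As List(Of XName) = _
            Elem1.Attributes().Select(Function(a As XAttribute) a.Name).ToList()
        result.Add(Elem2.Attributes().Where( _
            Function(a As XAttribute) Not attrNames.Contains(a.Name)))

        If Elem1.Elements().Any() Then
            ' Merge each Elem1 child with the first same-named Elem2 child.
            For Each child As XElement In Elem1.Elements()
                result.Add(MergeAll(child, Elem2.Element(child.Name)))
            Next
            ' Then copy the children that exist only in Elem2.
            Dim childNames As List(Of XName) = _
                Elem1.Elements().Select(Function(e As XElement) e.Name).ToList()
            result.Add(Elem2.Elements().Where( _
                Function(e As XElement) Not childNames.Contains(e.Name)))
        ElseIf Elem1.Value <> "" Then
            ' Elem1 has text but no child elements: its text wins.
            result.Value = Elem1.Value
        Else
            ' Elem1 has no content at all, so take Elem2's nodes instead.
            result.Add(Elem2.Nodes())
        End If

        Return result
    End Function

    Sub Main()
        Dim Elem1 As XElement = <HockeyPlayer height="6.0" hand="left">
                                    <Idol>Gordie Howe</Idol>
                                </HockeyPlayer>

        ' Team is a made-up child that exists only in the second element.
        Dim Elem2 As XElement = <HockeyPlayer height="5.9" startinglineup="yes">
                                    <Idol confirmed="yes">Wayne Gretzky</Idol>
                                    <Team>Oilers</Team>
                                </HockeyPlayer>

        Console.WriteLine(MergeAll(Elem1, Elem2))
    End Sub

End Module
```

With this sample, the merged element should keep height="6.0" and the Gordie Howe text from Elem1, pick up startinglineup and confirmed from Elem2, and end with the Team child. Like the original, it pairs each Elem1 child with only the first same-named child in Elem2, so repeated sibling names would need a different matching rule.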


This is excellent, thank you. I really appreciate the work and insight you put into it! - Todd Main
