Parsing XML with LINQ to get child elements

7
<?xml version="1.0" standalone="yes"?>
<CompanyInfo>
     <Employee name="Jon" deptId="123">
      <Region name="West">
        <Area code="96" />
      </Region>
      <Region name="East">
        <Area code="88" />
      </Region>
     </Employee>
</CompanyInfo>  

public class Employee
{
    public string EmployeeName { get; set; }
    public string DeptId { get; set; }
    public List<string> RegionList {get; set;}
}

public class Region
{
    public string RegionName { get; set; }
    public string AreaCode { get; set; }
}

I'm trying to read this XML data. So far I've tried the following:

XDocument xml = XDocument.Load(@"C:\data.xml");
var xElement = xml.Element("CompanyInfo");
if (xElement != null)
    foreach (var child in xElement.Elements())
    {
        Console.WriteLine(child.Name);  
        foreach (var item in child.Attributes())
        {
            Console.WriteLine(item.Name + ": " + item.Value);
        }

        foreach (var childElement in child.Elements())
        {
            Console.WriteLine("--->" + childElement.Name);
            foreach (var ds in childElement.Attributes())
            {
                Console.WriteLine(ds.Name + ": " + ds.Value);
            }
            foreach (var element in childElement.Elements())
            {
                Console.WriteLine("------->" + element.Name);
                foreach (var ds in element.Attributes())
                {
                    Console.WriteLine(ds.Name + ": " + ds.Value);
                }
            }
        }                
    }

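This lets me get every node along with its attribute names and values, so I can save the data to the relevant database fields. But it seems long-winded and inflexible: if the XML structure changes, all of these foreach statements have to be revisited, and filtering is awkward because I need to write if statements to filter the data (for example, to get only the employees in the West region).
I'm looking for a more flexible approach using LINQ, something like this: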
List<Employees> employees =
              (from employee in xml.Descendants("CompanyInfo")
               select new employee
               {
                   EmployeeName = employee.Element("employee").Value,
                   EmployeeDeptId = ?? get data,
                   RegionName = ?? get data,
                   AreaCode = ?? get data,,
               }).ToList<Employee>();

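But I'm not sure how to get the values from the child nodes and apply a filter (to get only certain employees). Is this possible? Any help would be appreciated.

Thanks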

3 Answers

8
var employees = (from e in xml.Root.Elements("Employee")
                 let r = e.Element("Region")
                 where (string)r.Attribute("name") == "West"
                 select new Employee
                 {
                     EmployeeName = (string)e.Attribute("name"),
                     EmployeeDeptId = (string)e.Attribute("deptId"),
                     RegionName = (string)r.Attribute("name"),
                     AreaCode = (string)r.Element("Area").Attribute("code"),
                 }).ToList();

However, the query will still need to be revised whenever the structure of the XML file changes.

Edit

Querying multiple regions per employee:

var employees = (from e in xml.Root.Elements("Employee")
                 select new Employee
                 {
                     EmployeeName = (string)e.Attribute("name"),
                     DeptId = (string)e.Attribute("deptId"),
                     RegionList = e.Elements("Region")
                                   .Select(r => new Region {
                                       RegionName = (string)r.Attribute("name"),
                                       AreaCode = (string)r.Element("Area").Attribute("code")
                                   }).ToList()
                 }).ToList();

You can then filter the employee list by a specific region:
var westEmployees = employees.Where(x => x.RegionList.Any(r => r.RegionName == "West")).ToList();
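
For anyone who wants to try the edited query end to end, here is a minimal sketch. It assumes RegionList is declared as List<Region> rather than the List<string> in the question (that change is my assumption, not the asker's posted class), and EmployeeLoader, Load and InRegion are hypothetical names made up for illustration. It also uses the null-conditional operator so a Region without an Area child yields a null AreaCode instead of throwing.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Region
{
    public string RegionName { get; set; }
    public string AreaCode { get; set; }
}

public class Employee
{
    public string EmployeeName { get; set; }
    public string DeptId { get; set; }
    // Assumption: a list of Region objects, not List<string> as in the question.
    public List<Region> RegionList { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class EmployeeLoader
{
    // Maps every <Employee> directly under the root element to an Employee object.
    public static List<Employee> Load(XDocument xml)
    {
        return xml.Root.Elements("Employee")
            .Select(e => new Employee
            {
                EmployeeName = (string)e.Attribute("name"),
                DeptId = (string)e.Attribute("deptId"),
                RegionList = e.Elements("Region")
                    .Select(r => new Region
                    {
                        RegionName = (string)r.Attribute("name"),
                        // ?. leaves AreaCode null when <Area> is missing.
                        AreaCode = (string)r.Element("Area")?.Attribute("code")
                    })
                    .ToList()
            })
            .ToList();
    }

    // Keeps the employees that cover at least one region with the given name.
    public static List<Employee> InRegion(IEnumerable<Employee> employees, string regionName)
    {
        return employees
            .Where(x => x.RegionList.Any(r => r.RegionName == regionName))
            .ToList();
    }
}
```

Usage would be along the lines of EmployeeLoader.InRegion(EmployeeLoader.Load(XDocument.Load(@"C:\data.xml")), "West").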

6
You can follow the structure:
from employee in xml
      .Element("CompanyInfo")       // must be root
      .Elements("Employee")         // only direct children of CompanyInfo

Or be a bit more lenient:
from employee in xml.Descendants("Employee")    // all employees at any level

Then pick out the information you want:

       select new Employee
       {
           EmployeeName = employee.Attribute("name").Value,
           EmployeeDeptId = employee.Attribute("deptId").Value,
           RegionName = employee.Element("Region").Attribute("name").Value,
           AreaCode = employee.Element("Region").Element("Area").Attribute("code").Value,
       }

With the additional information about multiple regions, and assuming a List<Region> Regions property:

       select new Employee
       {
           EmployeeName = employee.Attribute("name").Value,
           EmployeeDeptId = employee.Attribute("deptId").Value,
           //RegionName = employee.Element("Region").Attribute("name").Value,
           //AreaCode = employee.Element("Region").Element("Area").Attribute("code").Value,
           Regions = (from r in employee.Elements("Region") select new Region 
                      {
                         Name = r.Attribute("name").Value,
                         Code = r.Element("Area").Attribute("code").Value,
                      }).ToList()
       }

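Thanks, but I forgot to mention that each employee can cover multiple regions (I've updated the XML above). This query only retrieves the first region's data. Is there a way to get all of an employee's regions? Thanks - 03Usr
Yes, but please post the Employee and Region classes first. - H H
But class Employee has no Regions (list) property. - H H
You don't have the real classes yet, do you? A List<string> isn't enough. - H H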

2
You can do the selection in one query and the filtering in a second one, or combine the two into a single query:
Two queries:
        // do the transformation
        var employees =
          from employee in xml.Descendants("CompanyInfo").Elements("Employee")
          select new
          {
              EmployeeName = employee.Attribute("name").Value,
              EmployeeDeptId = employee.Attribute("deptId").Value,
              Regions = from region in employee.Elements("Region")
                        select new
                            {
                                Name = region.Attribute("name").Value,
                                AreaCode = region.Element("Area").Attribute("code").Value,
                            }
          };

        // now do the filtering
        var filteredEmployees = from employee in employees
                                from region in employee.Regions
                                where region.AreaCode == "96"
                                select employee;

Combined into a single query (same output):

          var employees2 =
          from selectedEmployee2 in
              from employee in xml.Descendants("CompanyInfo").Elements("Employee")
              select new
              {
                  EmployeeName = employee.Attribute("name").Value,
                  EmployeeDeptId = employee.Attribute("deptId").Value,
                  Regions = from region in employee.Elements("Region")
                            select new
                                {
                                    Name = region.Attribute("name").Value,
                                    AreaCode = region.Element("Area").Attribute("code").Value,
                                }
              }
          from region in selectedEmployee2.Regions
          where region.AreaCode == "96"
          select selectedEmployee2;
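
One thing to watch with the filters above: a second from clause over the regions emits the employee once per matching region, so an employee with two regions that share the same area code would appear twice. A minimal sketch of one way around that, using Any so each employee is returned at most once, might look like the following. RegionFilterSketch, SampleXml and NamesWithAreaCode are hypothetical names, and the sample data (two regions sharing code 96) is invented purely to show the difference.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class, for illustration only.
public static class RegionFilterSketch
{
    // Invented sample data: Jon covers two regions that share area code 96.
    public const string SampleXml = @"<CompanyInfo>
  <Employee name=""Jon"" deptId=""123"">
    <Region name=""West""><Area code=""96"" /></Region>
    <Region name=""North""><Area code=""96"" /></Region>
  </Employee>
  <Employee name=""Ann"" deptId=""456"">
    <Region name=""East""><Area code=""88"" /></Region>
  </Employee>
</CompanyInfo>";

    // Returns the names of employees that have at least one region with the
    // given area code. Any() stops at the first match, so each employee is
    // listed once no matter how many of their regions match.
    public static List<string> NamesWithAreaCode(XDocument xml, string areaCode)
    {
        return xml.Root.Elements("Employee")
            .Where(e => e.Elements("Region")
                         .Any(r => (string)r.Element("Area")?.Attribute("code") == areaCode))
            .Select(e => (string)e.Attribute("name"))
            .ToList();
    }
}
```

Calling RegionFilterSketch.NamesWithAreaCode(XDocument.Parse(RegionFilterSketch.SampleXml), "96") on that sample should give a list containing just "Jon".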

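There is one small thing you should consider adding, though. For robustness, you need to check that the elements and attributes exist, and the select then looks like this: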

 var employees =
          from employee in xml.Descendants("CompanyInfo").Elements("Employee")
          select new
          {
              EmployeeName = (employee.Attribute("name") != null) ? employee.Attribute("name").Value : string.Empty,
              EmployeeDeptId = (employee.Attribute("deptId") != null) ? employee.Attribute("deptId").Value : string.Empty,
              // Elements() never returns null (it yields an empty sequence when nothing matches), so no null check is needed here
              Regions = from region in employee.Elements("Region")
                        select new
                            {
                                Name = (region.Attribute("name")!= null) ? region.Attribute("name").Value : string.Empty,
                                AreaCode = (region.Element("Area") != null && region.Element("Area").Attribute("code") != null) ? region.Element("Area").Attribute("code").Value : string.Empty,
                            }
          };
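
As a possibly shorter alternative to the explicit null checks, the explicit conversion from XAttribute to string returns null for a missing attribute instead of throwing, so it can be combined with ?? to supply a default. Below is a minimal sketch of that idea. EmployeeInfo, NullSafeSketch and Read are hypothetical names, and the ?. operator assumes C# 6 or later.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical result type, for illustration only.
public class EmployeeInfo
{
    public string Name { get; set; }
    public string DeptId { get; set; }
    public List<string> AreaCodes { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class NullSafeSketch
{
    // Reads every <Employee> under the root, substituting an empty string
    // for any attribute or <Area> element that is missing.
    public static List<EmployeeInfo> Read(XDocument xml)
    {
        return xml.Root.Elements("Employee")
            .Select(e => new EmployeeInfo
            {
                // (string)attribute is null when the attribute does not exist.
                Name = (string)e.Attribute("name") ?? string.Empty,
                DeptId = (string)e.Attribute("deptId") ?? string.Empty,
                AreaCodes = e.Elements("Region")
                    // ?. short-circuits to null when <Area> itself is missing.
                    .Select(r => (string)r.Element("Area")?.Attribute("code") ?? string.Empty)
                    .ToList()
            })
            .ToList();
    }
}
```

The trade-off is that a missing value and a genuinely empty one both end up as an empty string. If that distinction matters, dropping the ?? part and keeping the nulls may be the better choice.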
