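I need to extract data from nested divs, but I can't get it to work.
The divs are nested, and I need the data formatted properly.
I wrote code using the bs4 module, but I get an error:
BeautifulSoup: AttributeError: 'NavigableString' object has no attribute 'name'
Please help!
My HTML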
<div id="new">
<div id="newDat">
<div class="Data">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>
</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>
</div>
</div>
<div class="DataOne">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>
</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>
</div>
</div>
<div class="DataTwo">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>
</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>
</div>
</div>
<div class="DataThree">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>
</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>
</div>
</div>
</div>
</div>
My BeautifulSoup code
li = page.find('div', {'id': 'new'})
for tag in li:
    for i in tag.find_all("div", {"class": "name"}):
        print(i.getText())
        break
    for i in tag.find_all("div", {"class": "details"}):
        print(i.getText())
        break
    for i in tag.find_all("div", {"class": "header"}):
        print(i.getText())
        break
    for i in tag.find_all("div", {"class": "ideas"}):
        print(i.getText())
        break
    for i in tag.find_all("div", {"class": "profile"}):
        print(i.getText())
        break
    for i in tag.find_all("div", {"class": "phone"}):
        print(i.getText())
        break
I want the output to look like this:
Div one
Name : Jack Daniels
Details : Barcelona
header : Looking for meeting new people
ideas : I have new ideas
profile: MS in cs
tel : +123123123123123231
Div two
Name : Jack Daniels
Details : Barcelona
header : Looking for meeting new people
ideas : I have new ideas
profile: MS in cs
tel : +123123123123123231
And so on.
If there are 100 such blocks inside <div id = "new">, I need output like the above for each of them.
Why the loops with break? You can use find directly, e.g. tag.find("div", {"class": "name"}).text. - t.m.adam
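A minimal sketch of that suggestion, offered as one possible approach rather than a definitive fix. Iterating over `li` directly walks its immediate children, which include the whitespace text nodes (`NavigableString` objects) between the tags, and that is the likely source of the `AttributeError`. The sketch instead treats each `div` carrying both the `DataNew` and `new` classes as one record and calls `find` once per field. It assumes Python 3 and the built-in `html.parser`. The names `HTML`, `FIELDS`, `extract_records` and `format_records` are made up for illustration, and the sample is cut down to two records. Note that in the posted HTML the phone number sits in `<a class="phone">`, not in a `div`, so the last lookup searches for an `a` tag.

```python
from bs4 import BeautifulSoup

# Shortened copy of the HTML from the question (two records instead of four).
HTML = """
<div id="new">
<div id="newDat">
<div class="Data">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>
</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>
</div>
</div>
<div class="DataOne">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>
</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>
</div>
</div>
</div>
</div>
"""

# Hypothetical field table: (output label, tag name, CSS class).
FIELDS = [
    ("Name", "div", "name"),
    ("Details", "div", "details"),
    ("header", "div", "header"),
    ("ideas", "div", "ideas"),
    ("profile", "div", "profile"),
    ("tel", "a", "phone"),  # the phone number is in an <a>, not a <div>
]


def extract_records(html):
    """Return one dict per <div class="DataNew new"> found inside #new."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    # select() returns only Tag objects, so no NavigableString is ever visited.
    for block in soup.select("#new div.DataNew.new"):
        record = {}
        for label, tag_name, css_class in FIELDS:
            node = block.find(tag_name, class_=css_class)
            # A missing field becomes an empty string instead of raising.
            record[label] = node.get_text(strip=True) if node else ""
        records.append(record)
    return records


def format_records(records):
    """Render the records as labelled lines, one numbered group per record."""
    lines = []
    for number, record in enumerate(records, start=1):
        lines.append("Div {}".format(number))
        for label, _, _ in FIELDS:
            lines.append("{} : {}".format(label, record[label]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_records(extract_records(HTML)))
```

The groups are numbered "Div 1", "Div 2" and so on rather than spelled out as in the desired output. Because the selector matches every record block under `#new`, the same code should cover 100 blocks without changes, provided they all follow the structure shown in the question.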