Python Web Page Cleaning Experiment


text1 = '''<div class="rank-list-wrap" data-v-9b180bbc>

  • <div class="">1</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 风犬少年的天空 <div class="pgc-info">全16集</div> <div class="detail"> <span class="data-box"> 3.8亿 </span> <span class="data-box"> 402.8万 </span> <span class="data-box"> 275.6万 </span> </div> <div class="pts"> <div>189803</div>综合得分 </div> </div> <!----></div>
  • <div class="">2</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 半泽直树2 <div class="pgc-info">全10集</div> <div class="detail"> <span class="data-box"> 1260.1万 </span> <span class="data-box"> 21.3万 </span> <span class="data-box"> 33.5万 </span> </div> <div class="pts"> <div>187107</div>综合得分 </div> </div> <!----></div>
  • <div class="">3</div> <div class="content"> <div class="img"> <!----></div> <div class="info"> 我命中注定的人 <div class="pgc-info">全10集</div> <div class="detail"> <span class="data-box"> 45.7万 </span> <span class="data-box"> 1.2万 </span> <span class="data-box"> 2.6万 </span> </div> <div class="pts"> <div>151723</div>综合得分 </div> </div> <!----></div>
  • <div class="">4</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 半泽直树 <div class="pgc-info">全10集</div> <div class="detail"> <span class="data-box"> 1722.8万 </span> <span class="data-box"> 29.2万 </span> <span class="data-box"> 68.7万 </span> </div> <div class="pts"> <div>125287</div>综合得分 </div> </div> <!----></div>
  • <div class="">5</div> <div class="content"> <div class="img"> <!----></div> <div class="info"> 三国演义 <div class="pgc-info">全84集</div> <div class="detail"> <span class="data-box"> 7476.5万 </span> <span class="data-box"> 271.8万 </span> <span class="data-box"> 58.2万 </span> </div> <div class="pts"> <div>78488</div>综合得分 </div> </div> <!----></div>
  • <div class="">6</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 铠甲勇士 <div class="pgc-info">全52集</div> <div class="detail"> <span class="data-box"> 1639.5万 </span> <span class="data-box"> 78.2万 </span> <span class="data-box"> 27.7万 </span> </div> <div class="pts"> <div>57942</div>综合得分 </div> </div> <!----></div>
  • <div class="">7</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 鹿鼎记 <div class="pgc-info">全50集</div> <div class="detail"> <span class="data-box"> 404.1万 </span> <span class="data-box"> 7.4万 </span> <span class="data-box"> 3万 </span> </div> <div class="pts"> <div>57914</div>综合得分 </div> </div> <!----></div>
  • <div class="">8</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 外来媳妇本地郎 <div class="pgc-info">全2444集</div> <div class="detail"> <span class="data-box"> 2123.3万 </span> <span class="data-box"> 50.4万 </span> <span class="data-box"> 7.8万 </span> </div> <div class="pts"> <div>55888</div>综合得分 </div> </div> <!----></div>
  • <div class="">9</div> <div class="content"> <div class="img"> <!----></div> <div class="info">地下交通站 <div class="pgc-info">全28集</div> <div class="detail"> <span class="data-box"> 4539.7万 </span> <span class="data-box"> 46.9万 </span> <span class="data-box"> 25.7万 </span> </div> <div class="pts"> <div>46870</div>综合得分 </div> </div> <!----></div>
  • <div class="">10</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 少年特工亚历克斯 <div class="pgc-info">全8集</div> <div class="detail"> <span class="data-box"> 98.4万 </span> <span class="data-box"> 1.5万 </span> <span class="data-box"> 5.5万 </span> </div> <div class="pts"> <div>43370</div>综合得分 </div> </div> <!----></div>
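'''

# Each ranking entry in the sample starts with a "•" marker, so split on it.
# The first chunk is the wrapper <div> that precedes entry 1, so skip it.
items = text1.split('•')[1:]

for text in items:
    # Rank: content of the first <div class=""> element.
    rank = text.split('<div class="">')[1].split('</div>')[0]
    # Title: text right after <div class="info">, up to the next nested <div>.
    title = text.split('<div class="info">')[1].split('<div')[0].strip()
    # Play count: content of the first <span class="data-box">.
    rate = text.split('<span class="data-box">')[1].split('</span>')[0].strip()
    print("Rank {}, 《{}》, plays {}".format(rank, title, rate))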
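The split-based loop depends on the exact order of the tags in each entry. As a rough alternative, the same three fields can be captured with one regular expression from the standard `re` module. This is only a sketch: the names `sample`, `ITEM_RE`, and `parse_rank_items` are made up for illustration, and the pattern assumes every entry keeps the `<div class="">`, `<div class="info">`, and `<span class="data-box">` layout seen in the sample.

```python
import re

# Two entries copied from the sample data (entries 1 and 9).
sample = '''<div class="">1</div> <div class="content"> <div class="img"> <!----> </div> <div class="info"> 风犬少年的天空 <div class="pgc-info">全16集</div> <div class="detail"> <span class="data-box"> 3.8亿 </span> <span class="data-box"> 402.8万 </span> <span class="data-box"> 275.6万 </span> </div> <div class="pts"> <div>189803</div>综合得分 </div> </div> <!----></div>
<div class="">9</div> <div class="content"> <div class="img"> <!----></div> <div class="info">地下交通站 <div class="pgc-info">全28集</div> <div class="detail"> <span class="data-box"> 4539.7万 </span> <span class="data-box"> 46.9万 </span> <span class="data-box"> 25.7万 </span> </div> <div class="pts"> <div>46870</div>综合得分 </div> </div> <!----></div>'''

# Hypothetical pattern: three capture groups for rank, title, and play count.
ITEM_RE = re.compile(
    r'<div class="">(\d+)</div>'                      # rank number
    r'.*?<div class="info">\s*(.*?)\s*<div'           # title, whitespace trimmed
    r'.*?<span class="data-box">\s*(.*?)\s*</span>',  # first data-box only
    re.S,  # let "." cross line breaks between tags
)


def parse_rank_items(html):
    """Return a list of (rank, title, plays) string tuples found in html."""
    return ITEM_RE.findall(html)


for rank, title, plays in parse_rank_items(sample):
    print("Rank {}, 《{}》, plays {}".format(rank, title, plays))
```

Because the pattern does not rely on a separator between entries, it works whether or not the "•" markers are present. For anything beyond a fixed, well-known layout like this one, a real HTML parser is usually the safer choice.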

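Notice: 一代明君的小屋 | All rights reserved; infringement will be pursued | Original content unless otherwise noted | This site is licensed under the BY-NC-SA agreement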
