Created
September 29, 2018 14:28
Get the page title of the Nihon Keizai Shimbun (Nikkei) website
# coding: UTF-8
import urllib.request, urllib.error
from bs4 import BeautifulSoup
# URL to access
url = "http://www.nikkei.com/"
# Access the URL; the response carries HTML such as → <html><head><title>経済、株価、ビジネス、政治のニュース:日経電子版</title></head><body....
html = urllib.request.urlopen(url)
# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
# Get the title element → <title>経済、株価、ビジネス、政治のニュース:日経電子版</title>
title_tag = soup.title
# Get the element's text → 経済、株価、ビジネス、政治のニュース:日経電子版
title = title_tag.string
# Print the title element
print(title_tag)
# Print the title as a string
print(title)
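
The script imports `urllib.error` but never uses it, so a failed request stops the script with a traceback. The sketch below shows one way the fetch step might handle failures. The function name `fetch_html`, its `opener` parameter, and the 10-second timeout are assumptions for illustration and are not part of the original gist.

```python
import urllib.error
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen, timeout=10):
    """Return the response body as bytes, or None if the request fails.

    `opener` is a hypothetical hook that defaults to urllib.request.urlopen;
    it exists only so the function can be exercised without network access.
    """
    try:
        # The response is a context manager, so the connection is closed on exit.
        with opener(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as e:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        print(f"HTTP error {e.code} for {url}")
    except urllib.error.URLError as e:
        # Raised for DNS failures, refused connections, and similar problems.
        print(f"Could not reach {url}: {e.reason}")
    return None
```

If `fetch_html(url)` returns bytes, they can be passed to `BeautifulSoup(..., "html.parser")` in the same way as `html` above. A `None` result means the request failed and there is nothing to parse.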
https://qiita.com/Azunyan1111/items/9b3d16428d2bcc7c9406