[B! html][beautifulsoup] gigs123のブックマーク

gigs123 id:gigs123

htmlとbeautifulsoupに関するgigs123のブックマーク (3)

BeautifulSoupでスクレイピングのまとめ – taichino.com
何度かBeautifulSoupについては書いているのですが、未だに使い方が覚えられずにイライラします。仕方が無いのでまとめて置く事にしました。BeautifulSoupはHTMLから情報を取得するだけ無く、HTMLの編集もできますが、ここではスクレイピング用途のみに絞っています。使用するのは以下のHTMLです。このHTMLを使って色々と情報を取得したのが以下です。覚えるべきはfindAllだけです。注意する必要があるのは、textを指定した場合にタグオブジェクトが取れずに、テキストオブジェクトが取れるので、一旦parentで親のタグ取りましょうという事と、正規表現で条件指定する場合は、re.compileで正規表現オブジェクトを渡すという事位ですか。 #!/usr/bin/python # -*- coding: utf-8 -*- import re import urllib f
gigs123 2010/09/29
beautifulsoup

html

python
リンク
Beautiful Soup: We called him Tortoise because he taught us.
You didn't write that awful page. You're just trying to get some data out of it. Beautiful Soup is here to help. Since 2004, it's been saving programmers hours or days of work on quick-turnaround screen scraping projects. Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping. Three features make it powerful: Beautiful Soup provides a few simple methods and
gigs123 2010/01/26
parser

HTML

python

BeautifulSoup
リンク
Beautiful Soup documentation
Beautiful Soup Documentation by Leonard Richardson (leonardr@segfault.org) 这份文档也有中文版了 (This document is also available in Chinese translation) Этот документ также доступен в русском переводе. [Внешняя ссылка] (This document is also available in Russian translation. [External link]) Beautiful Soup 3 has been replaced by Beautiful Soup 4. You may be looking for the Beautiful Soup 4 documentation Bea
gigs123 2009/03/12
python

HTML
リンク
1

お知らせ

もっと読む

公式Twitter

@HatenaBookmark
リリース、障害情報などのサービスのお知らせ
@hatebu
最新の人気エントリーの配信

キーボードショートカット一覧

j次のブックマーク

k前のブックマーク

lあとで読む

eコメント一覧を開く

oページを開く

設定を変更しましたx