[B! perl][Web-Scraper] ishideo's bookmarks

ishideo id:ishideo

ishideo's bookmarks on perl and Web-Scraper (19)

Getting a store list with Web::Scraper - ジオ屋さん
ishideo 2012/06/19
perl

Web-Scraper

scrape
Link
[Perl] Getting PASMO usage history with WWW::Mechanize and Web::Scraper (Part 1) - blog.remora.cx
ishideo 2011/01/19
perl

pasmo

scrape

cpan

WWW-Mechanize

Web-Scraper
Link
Template::Refine
This is Jonathan Rockway's blog, where he talks about Angerwhale, Catalyst, and Everything. I released the first version of Template::Refine today. Template::Refine is my attempt to resolve the eternal conflict between developers and web designers. I'm sure you've heard of this problem before. A developer and web designer want to work on a project together. The web designer only knows HTML, not pr
ishideo 2008/09/15
perl

cpan

xpath

Template-Refine

template

Web-Scraper
Link
perl-mongers.org - This website is for sale! - perl mongers resources and information
ishideo 2008/09/15
perl

cpan

xpath

Template-Refine

template

Web-Scraper
Link
Today's CPAN Module (former site): table of contents
ishideo 2008/08/07
perl

cpan

Web-Scraper
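Link
WebScraper IDE, which generates Web::Scraper code for you when you click around - bits and bytes
I took webscraper.js, the JavaScript version of Web::Scraper I made earlier, and webscraperp.js, which added a feature that generates XPath expressions for you, gave them an interface along the lines of AutoPagerize Iteration Detector (which finds the repeating parts of an HTML document and builds SITEINFO), and turned the result into a Firefox extension: click the part you want to extract and it generates an XPath and outputs it as Web::Scraper code. It is for Firefox 3 only. Sorry. Download: WebScraper IDE (for Firefox3). Usage: as usual, I will explain how to use it with the Starbucks store search results (search by address, store name, or conditions) as the example. Once WebScraper IDE is installed, the Tools menu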
ishideo 2008/08/06
extension

firefox

javascript

perl

cpan

Web-Scraper

xpath

xml

dom

tool
Link
B10[mg]: Scraping Yahoo! Search with Web::Scraper
Yet another non-informative, useless blog As seen on TV! Scraping websites is usually pretty boring and annoying, but for some reason it always comes back. Tatsuhiko Miyagawa comes to the rescue! His Web::Scraper makes scraping the web easy and fast. Since the documentation is scarce (there are the POD and the slides of a presentation I missed), I'll post this blog entry in which I'll show how to
ishideo 2008/08/06
perl

cpan

scraper

Web-Scraper

URI
Link
perl-mongers.org - This website is for sale! - perl mongers resources and information
ishideo 2008/07/31
perl

cpan

scraper

webscraper

WWW-Mechanize-Plugin-Web-Scraper

WWW-Mechanize

Web-Scraper
Link
perl-mongers.org
ishideo 2008/07/31
perl

Web-Scraper

WWW-Mechanize

cpan

webscraper

hatena
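Link
The tale of Web::Scraper being so convenient it's a problem (CodeZine編集部ブログ)
Hello, this is 久次, one of the editors. Perl's Web::Scraper is somehow so convenient that it's almost alarming. Until now I had been muddling through with WWW::Mechanize, but this solved all sorts of things at once. While writing various things with it, I belatedly realized that HTML::TreeBuilder's look_down method is also powerful, so I wrote some code as a learning exercise. My dreams of automating the web keep growing today too... <References> Web::Scraper - Web Scraping Toolkit inspired by Scrapi - search.cpan.org; naoyaのはてなダイアリー - Web::Scraper; ブログが続かないわけ | How to use Web::Scraper (a very basic introduction); Web::Scraper is super convenient; scrAPI Cheat Sheet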
ishideo 2007/10/12
perl

cpan

Web-Scraper

scraping

tool
Link
Using Web::Scraper (continued) - Tociyuki::Diary
Yesterday I used XPath to extract entries from the archive list page of Daily Portal Z. But the ../../p part is ugly, so I worked out a way to use a CSS selector instead. The only change is the definition of $entries. my $entries = scraper { use utf8; #process q{//td/p/font[text() =~ /べつやく/]/../../p}, # 'entries[]' => $entry; process 'td>p', 'entries[]' => sub { my $h = $entry->scrape($_); ($h->{author} ||= '') =~ /べつやく/ ? $h : (); }; result 'entries'; }; In the commented-out XPath version of process, the text
ishideo 2007/07/30
feedback

perl

scraper

web

Web-Scraper

cpan

css
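Link
An example of mixing XPath and CSS selectors in Web::Scraper - Tociyuki::Diary
Web::Scraper comes with all sorts of thoughtful mechanisms built in, which makes it very convenient. The two features I use fairly often are the following. The first argument of process accepts not only a CSS selector but also an XPath expression. However, an XPath expression must always begin with a slash (/). In the second and later arguments of process, which specify where the value is taken from, you can also place a code reference. This lets you transform a value from the DOM tree as you extract it. As a concrete example, let's extract the entries by べつやくれい (Betsuyaku Rei) from the Daily Portal Z archive list. First, pulling out the entry portion of the archive page gives the following. <TD width="580" valign="top" class="tx12px"> <P> <B><FONT c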
ishideo 2007/07/27
Web-Scraper

XPath

css

scraper

perl

cpan
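Link
Hatena Group will shut down on Friday, January 31, 2020 - はてなの告知
We have decided that Hatena Group will shut down on Friday, January 31, 2020. As stated in the entry below, we had announced that we planned to shut down Hatena Group around the end of this year. "We plan to end the Hatena Group service around the end of 2019" - はてなグループ日記. We have now formally decided the shutdown date, so please check the details below. Shutdown date: Friday, January 31, 2020. Deadline for export requests: Friday, January 31, 2020. After the shutdown date, Hatena Group can no longer be viewed or posted to. If you need to export your diary, please follow the procedure in the article below. "Exporting diary data posted to Hatena Group" - はてなグループ日記. We apologize for the inconvenience to everyone who uses the service and thank you for your understanding. Addendum, 2020-06-25: The Hatena Group diary export data, February 28, 2020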
ishideo 2007/05/31
Web-Scraper

WWW-Mechanize

cpan

perl
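Link
Hatena Group will shut down on Friday, January 31, 2020 - はてなの告知
We have decided that Hatena Group will shut down on Friday, January 31, 2020. As stated in the entry below, we had announced that we planned to shut down Hatena Group around the end of this year. "We plan to end the Hatena Group service around the end of 2019" - はてなグループ日記. We have now formally decided the shutdown date, so please check the details below. Shutdown date: Friday, January 31, 2020. Deadline for export requests: Friday, January 31, 2020. After the shutdown date, Hatena Group can no longer be viewed or posted to. If you need to export your diary, please follow the procedure in the article below. "Exporting diary data posted to Hatena Group" - はてなグループ日記. We apologize for the inconvenience to everyone who uses the service and thank you for your understanding. Addendum, 2020-06-25: The Hatena Group diary export data, February 28, 2020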
ishideo 2007/05/23
perl

cpan

Web-Scraper
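Link
How to deliberately not use CustomFeed::Nowa - はこべにっき ♨
If you don't need ナニシテル？, it occurred to me that you could skip CustomFeed, collect your friends' RSS feeds yourself, and use Subscription::File => SmartFeed. plugins: - module: Subscription::File config: file: /tmp/nowas.txt - module: SmartFeed::All config: title: '[nowa] 新着記事' Ah, doing it this way makes the Atom author come out as nobody. So close. CustomFeed::Nowa computes dates from strings like "N minutes ago" as times relative to the current time, so the date changes depending on when the feed is fetched. That is a problem because Deduped can stop working properly. ナニシテル aside, for articles I think you could get the date information by scraping, but since there is RSS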
ishideo 2007/05/21
perl

plagger

cpan

Web-Scraper
Link
Articles from May 12, 2007 | ブログが続かないわけ
ishideo 2007/05/17
perl

cpan

Web-Scraper
Link
Tried out Web::Scraper - Unknown::Programming
I saw id:naoya playing with it and it looked fun, so I tried it too. Web::Scraper - naoyaのはてなダイアリー. I wondered what to fetch, and since the FizzBuzz problem happens to be trending(?) right now and the bookmark comments have turned into a one-liner contest, I made something that pulls out code (or things that look like code). #!/usr/bin/perl use strict; use warnings; use Web::Scraper; use Encode; use URI; use URI::Find; use Perl6::Say; my $url = 'http://b.hatena.ne.jp/entry/http://www.aoky.net/articles/jeff_atwood/why_cant_programmers_program.htm'; my $links = scr
ishideo 2007/05/11
perl

scraper

Web-Scraper

cpan
Link
Web::Scraper is handy! - はこべにっき ♨
Written after seeing naoyaのはてなダイアリー - Web::Scraper. This looks good. Reading the source, it seems that besides simply fetching values you can also receive results as an array or pass in a subroutine to delegate the processing, so let me try it out. use strict; use warnings; use Web::Scraper; use URI; use YAML; use Encode; my %result; sub parse_title { my $node = shift; my $text = $node->as_text; my $left = decode_utf8('『'); my $right = decode_utf8('』'); my ($nth, $title, $date) = $text =~ m/^\[(.*?)\]\s+$left(.*?)$right(.
ishideo 2007/05/11
perl

scraper

Web-Scraper

cpan
Link
Web::Scraper - naoyaのはてなダイアリー
Today I've been thinking about what to talk in YAPC::EU (and OSCON if they're short of Perl talks, I'm not sure), and came up with a few hours of hacking with web-content scraping module using Domain Specific Languages. I tried it out! #!/usr/local/bin/perl use strict; use warnings; use FindBin::libs; use URI; use Web::Scraper; use Encode; use List::MoreUtils qw/uniq/; my $links = scraper { process 'a.key
ishideo 2007/05/09
cpan

perl

Web-Scraper
Link