[B! parser][html] manabou's Bookmarks

manabou id:manabou

manabou's bookmarks tagged parser and html (4)

GitHub - ericchiang/pup: Parsing HTML at the command line
manabou 2018/05/17
pup

html

parser

scrape

css

xpath

jq
Link
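Big Sky :: pup is handy for parsing HTML from the command line
Even in 2014 there are still plenty of tasks that require parsing HTML and fiddling with the result, and a handy command for those occasions is pup. EricChiang/pup - GitHub README.md pup pup is a command line tool for processing HTML. It reads from stdin, prints to stdout,... https://github.com/EricChiang/pup Tools like this are usually provided in Perl, Ruby, Python, and so on, so running them in an environment without the runtime installed took some extra effort. pup, however, is written in Go (golang), so a single binary is all you need to run it. As for usage, if you want to get the HTML of this site's permalinks, for example: curl -s http: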
manabou 2014/09/16
html

pup

go

tool

golang

parse

parser
Link
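HTML5 parsers usable from Java - CLOVER🍀
I had a task that involved checking a fairly large number of HTML files, and figuring that toughing it out with grep or Perl one-liners would be painful, I decided today to parse the HTML files in Java instead. When it comes to HTML parsers for Java, the one that personally comes to mind first is NekoHTML. CyberNeko HTML Parser http://nekohtml.sourceforge.net/ However, it is simply old, and it does not support HTML5. So I looked for other parsers and found two, which I introduce here. Since the goal is to parse HTML, the parser must also handle HTML without closing tags, such as the following. index.html <!DOCTYPE html> <html> <head> <title>タイトル</title> </head> <body> <div i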
manabou 2014/01/20
java

parser

html

html5

groovy
Link
How Browsers Work: The Internals of Modern Web Browsers
How browsers work Preface This comprehensive primer on the internal operations of WebKit and Gecko is the result of much research done by Israeli developer Tali Garsiel. Over a few years, she reviewed all the published data about browser internals and spent a lot of time reading web browser source code. She wrot
manabou 2013/07/02
parser

html

html5
Link