[B! scala][crawler] manboubird's bookmarks

manboubird id:manboubird

manboubird's bookmarks on scala and crawler (22)

GitHub - snowplow/scala-weather: High-performance Scala library for looking up the weather
manboubird 2015/12/21
scala

weather

crawler

snowPlow
Systems Programming at Twitter
Systems Programming at Twitter. Facebook, October 30, 2012. Marius Eriksen, Twitter Inc. A history lesson: Twitter evolves. 2009: Pure Ruby-on-Rails app with MySQL; lots of memcache. Materialized timelines into memcaches. Social graph moved to a service. Delayed work through queues. 2010: Starting to move timelines out to
manboubird 2015/02/19
finagle

crawler

future

slide

twitter

scala
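Collecting images by web scraping with Scala - tototoshi の日記
In 2011, as in earlier years, I was asked over and over "Is Scala practical?", a question I can only answer with "Well, practical or not, I just use it as a matter of course...". Scala is practical: for example, you can use it to collect cosplay photos from Comiket.*1 I want to bulk-download the jpg images from 【コミケ81】コスプレイヤー画像まとめ：１日目【C81】さとろぐ。 and save each one under the file name "data/(the part of the image URL after the last /)". Key points: fetch the HTML with dispatch; convert the HTML to XML with Lift's HTML parser; parse the XML with Scala's XML support and extract the image URLs; decompose the URLs with an Extractor; save to files using scala-io. Fetching the HTML with dispatch: it does not have to be dispatch; Scala's standard scala.io.source would also work, and scalaj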
manboubird 2015/02/18
scala

crawler

scraping

image
WebCrawler in Scala - Y's note
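Crawler in Scala. Building a search crawler - Web就活日記: I previously tried a crawler built on nutch, but this time I want to summarize crawlers that are themselves written in Scala. Some of the ones introduced on the internet are not usable at all, so choose carefully. Personally, based on this survey, I recommend Joup and HtmlUnitDriver as tools that are simple to write and configure and easy to run. nomad (denigma/nomad): requires JDK/JRE7, Mongo DB, and Debian. Because of those requirements I did not test it. The source has also not been updated for two years. It can be run simply by writing three files, application.conf, filters.groovy, and seeds.txt, and the results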
manboubird 2015/02/16
crawler

scala
GitHub - reggoodwin/ferrit: Ferrit is a web crawler service written in Scala using Akka, Spray and Cassandra.
manboubird 2015/02/16
crawler

scala

spray

akka

cassandra
typesafehub/webwords · GitHub
manboubird 2015/02/14
heroku

akka

crawler

scala
GitHub - daniel-trinh/sprawler: Akka and Spray based webcrawler
manboubird 2015/01/10
spray

akka

crawler

scala
Scala School - An introduction to Finagle
Finagle is Twitter’s RPC system. This blog post explains its motivations and core design tenets; the finagle README contains more detailed documentation. Finagle aims to make it easy to build robust clients and servers. REPL Futures: Sequential composition, Concurrent composition, Composition Example: Cached Rate Limit, Composition Example: Web Crawlers Service Client Example Server Example Filter
manboubird 2015/01/10
finagle

crawler

scala

rpc
GitHub - lloydmeta/metascraper: Scala library for scraping metadata from specified URLs (e.g. OpenGraph)
manboubird 2014/12/29
scraping

crawler

metaScraper

scala

akka
GitHub - bplawler/crawler: Scala DSL for web crawling
manboubird 2014/12/29
scala

crawler

dsl

htmlUnit

headLess
denigma/nomad · GitHub
Nomad - focused highly customizable web crawler. Features: crawling of multiple domains; flexible rules to decide which links to crawl; support for robots.txt; Mongo DB (GridFS) as storage for crawled content; TitanDB (with InMemory, BerkeleyDB or Cassandra backend) to store the graph of links. Written in Scala. Works on Linux. It should work on Windows as well, but I haven't tested it. How
manboubird 2014/12/23
scala

crawler
GitHub - gip/fureteur: Fureteur is a simple, configurable, fault-tolerant web crawler written in Scala
manboubird 2014/12/23
crawler

scala

akka

amqp

fureteur
Selenium Webdriver as a deadly weapon | sap1ens.com
During my career I have watched the battle between website/web app owners and the writers of bots, scrapers, and crawlers. I thought this battle couldn't be won. But about 6 months ago I joined it, and I think I now have an [almost] deadly weapon. Selenium Webdriver is my choice. You have probably heard of it or used it before. It's the most popular tool for functional tests (also known as end-to-end tests), and projects lik
manboubird 2014/12/23
selenium

webdriver

crawler

scala
Start with Akka: Let's write some scraper | sap1ens.com
manboubird 2014/12/23
akka

crawler

scala
LinkedIn Login, Sign in | LinkedIn
manboubird 2014/12/23
crawler

akka

scala
Domain error
manboubird 2014/04/29
scala

crawler

akka

jsoup
GitHub - kushti/SimpleCrawler: Simple Crawler Written in Scala On the Top of Akka Framework
manboubird 2014/03/24
crawler

akka

scala
Scaling Out with Scala and Akka on Heroku | Heroku Dev Center
This article was contributed by Havoc Pennington. Havoc Pennington is a developer at Typesafe, the Scala company. In the past he's worked on everything from web apps to Linux UI toolkits to JavaScript runtimes. Last Updated: 08 June 2012. akka cedar scala Table of Contents Web Words Overview: a request step-by-step Akka: Actor and Future Scala Bridging HTTP to Akka Connecting the web process to the
manboubird 2014/03/24
heroku

akka

crawler

scala
Parked page Imena.UA
manboubird 2014/03/24
akka

crawler

informationExtraction

scala
Getting rid of synchronized: Using Akka from Java · Florian Hopf
I gave an internal talk on Akka, the Actor framework for the JVM, at my former company synyx. For the talk I implemented a small example application, a kind of web crawler, using Akka. I published the source code on Github and will explain some of the concepts in this post. Motivation: to see why you might need something like Akka, suppose you want to implement a simple web crawler for offl
manboubird 2014/03/23
akka

scala

crawler

multithread