[B! scrape] AKIMOTO's Bookmarks

AKIMOTO id:AKIMOTO

AKIMOTO's bookmarks about scrape (40)

City Scrapers
AKIMOTO 2025/03/17
An OSS tool for scraping public meeting information from local governments; used by https://www.documenters.org/

government oversight

scrape

OSS
Link
Documenters.org: Making Local Government More Accountable
AKIMOTO 2025/03/17
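A US project that scrapes and compiles public meeting information from local governments to advance citizen participation in, and oversight of, government. Open source

government

scrape

information

USA

local government

civic activism

government oversight
Link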
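[Latest Reiwa Edition] Web Scraping (Crawling) in the Reiwa Era [Best Practices]
Hello, I'm Ninomiya, and I write a fair amount of code at FP16 Inc. Lately I have been writing web scraping code in a variety of ways, so I want to record what I have learned here. I write some kind of web scraping code almost every day. Web scraping methods: There are many ways to do web scraping. These are the five I mainly use these days: parsing HTML with cheerio; fetching data by element selector with Playwright or similar; finding an API and calling it (reproducing the communication with the backend to get the data); analyzing the site structure with an LLM to get the data; parsing the data included in responses from Next.js. I consider these the best practices for web scraping in the Reiwa era. I choose among them depending on the goal. How to choose: Parsing HTML with Cheerio: JavaS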
AKIMOTO 2024/09/28
scrape

tool

explanation
Link
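The last technique in the excerpt above, parsing data embedded in a Next.js response, might look roughly like this. This is a minimal sketch in Python (not the article's own JavaScript tooling), assuming the page embeds its data as JSON in a `<script id="__NEXT_DATA__">` tag, as many Next.js sites do. `SAMPLE_HTML`, `NextDataParser`, and `extract_next_data` are hypothetical names made up for illustration, and the sample markup is invented.

```python
import json
from html.parser import HTMLParser


class NextDataParser(HTMLParser):
    """Collects the text inside <script id="__NEXT_DATA__"> (assumed to hold JSON)."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Start capturing only when the script tag carries the expected id.
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def payload(self):
        text = "".join(self._chunks)
        return json.loads(text) if text else None


def extract_next_data(html):
    """Return the embedded JSON as a dict, or None when the tag is absent."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    return parser.payload()


# Invented sample page; a real page would be fetched over HTTP first.
SAMPLE_HTML = """
<html><body>
<div id="__next"><h1>Widget</h1></div>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"product": {"name": "Widget", "price": 1200}}}}
</script>
</body></html>
"""

data = extract_next_data(SAMPLE_HTML)
product = data["props"]["pageProps"]["product"]
print(product["name"], product["price"])
```

Reading the embedded JSON avoids depending on CSS selectors, which tend to change more often than the underlying data shape; whether a given site exposes this tag at all has to be checked per site.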
The industry standard for working with HTML in JavaScript | cheerio
cheerioThe fast, flexible & elegant library for parsing and manipulating HTML and XML. Get Started! Proven syntaxCheerio implements a subset of core jQuery. Cheerio removes all the DOM inconsistencies and browser cruft from the jQuery library, revealing its truly gorgeous API. Blazingly fastCheerio works with a very simple, consistent DOM model. As a result parsing, manipulating, and rendering are
AKIMOTO 2024/07/30
javascript

library

scrape
Link
How to use Python and the Reddit API to build a local database of Reddit posts and comments —…
AKIMOTO 2024/04/10
reddit

API

scrape

tutorial

Praw

Python
Link
GitHub - JosephLai241/URS: Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.
AKIMOTO 2024/04/10
reddit

scrape

tool

OSS
Link
https://pricing-page-scraper.bharathvaj.me/
AKIMOTO 2024/03/27
pricing

web service

SaaS

scrape

tool

OSS
Link
Apify: Full-stack web scraping and data extraction platform
Your full‑stack platform for web scraping Apify is the largest ecosystem where developers build, deploy, and publish web scrapers, AI agents, and automation tools. We call them Actors.
AKIMOTO 2024/01/17
scrape

SaaS

freemium
Link
GitHub - Page-Replica/page-replica: Page Replica – Tool for Web Scraping, Prerendering, and SEO Boost
AKIMOTO 2024/01/02
scrape

tool

JavaScript

OSS
Link
Personal-scale Web scraping for fun and profit
AKIMOTO 2023/12/03
Deno

Astral

scrape

tips
Link
Introduction | Astral
Astral is a Puppeteer/Playwright-like library designed with Deno in mind. What can I do? Most things that you can do manually in the browser can be done using Astral! Here are a few examples to get you started: Generate screenshots and PDFs of pages. Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)). Automate form submission, UI testing, k
AKIMOTO 2023/12/03
Deno

scrape

browser

tool
Link
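Scraping with Deno and posting to Slack
I had set a goal of building something with Deno over Golden Week, so I made a simple Slack bot. Deno setup: Following azukiazusa's article, it is enough if the server can return Hello World!. Finished product. Sending text to a Slack channel: Creating a Slack token and inviting the bot. First, get a token; following this article should get you one. Confirm that the token actually works, then invite the bot to the channel. Code: I used this code as a reference, but rewrote it because I did not want to depend on much. export const sendMessage = async(token: string, channel: string, text: string) => { const response = await fetch('https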
AKIMOTO 2023/10/26
Deno

scrape

Slack

tutorial
Link
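The excerpt's `sendMessage` is cut off before its URL, so the following is not the author's code. It is a minimal Python sketch of the same idea (posting text to a channel with a bot token), assuming Slack's Web API method `chat.postMessage` with a Bearer token; the helper names `build_message_request` and `send_message` are hypothetical.

```python
import json
import urllib.request

# Assumption: the Slack Web API endpoint for posting a message.
SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_message_request(token, channel, text):
    """Build (but do not send) the HTTP request for posting a message."""
    body = json.dumps({"channel": channel, "text": text}).encode("utf-8")
    return urllib.request.Request(
        SLACK_POST_MESSAGE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def send_message(token, channel, text):
    """Send the message and return Slack's decoded JSON response."""
    request = build_message_request(token, channel, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a real bot token and a channel the bot was invited to):
# result = send_message("xoxb-your-token", "#general", "scrape finished")
# print(result.get("ok"))
```

Splitting request construction from sending keeps the payload checkable without network access, and using only the standard library matches the excerpt's stated wish to avoid extra dependencies.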
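Scraping Yahoo! Calendar with Puppeteer and Deno [Authentication Part] - 腐ったコロッケ
# Overview: For personal reasons I want to pull information from Yahoo! Calendar, but Yahoo! Calendar has no API. This article therefore proposes a way to log in to Yahoo! Calendar with a headless browser, automating everything except entering the verification code. Once logged in, you can retrieve information as long as you stay within limits that do not burden Yahoo! Calendar. # Yahoo! login flow (when logging in to Calendar): As of March 14, 2022, when this article was written, there are three ways to log in to Yahoo!: password authentication, verification-code authentication (SMS), and verification-code authentication (email). This article does not use password authentication, because Yahoo! recommends verification-code authentication and, in addition, password authentication has inconveniences in everyday life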
AKIMOTO 2023/10/26
Deno

Puppeteer

scrape

tutorial

SMS
Link
GitHub - adbar/trafilatura: Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Trafilatura is a cutting-edge Python package and command-line tool designed to gather text on the Web and simplify the process of turning raw HTML into structured, meaningful data. It includes all necessary discovery and text processing components to perform web crawling, downloads, scraping, and extraction of main texts, metadata and comments. It aims at staying handy and modular: no database is
AKIMOTO 2023/08/15
Python

web

text

tool

OSS

scrape
Link
NHK.hta
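Welcome. This is the release page for NHK.hta, a downloader for NHK radio language courses. It mainly covers installation, usage, and timer-based launching. The latest version is NHK.hta ver-3.2.10 (2022/04/19), the edition for the first half of fiscal 2022. Notices ● The latest NHK.hta ver-3.2.10 (found under "Installing NHK.hta") and ver-3.2.10a (found under "Past versions") have been released. This week's episodes (one week after broadcast) ⇒ ver-3.2.10; last week's episodes (one week from the Monday after broadcast) ⇒ ver-3.2.10a. Most courses can be downloaded with both ver-3.2.10 and ver-3.2.10a, but some can be downloaded with only one of them. <Exceptions> ・ニュースで学ぶ「現代英語」 (Learning "Contemporary English" through the News), Portuguese, Arabic: this week's episodes only ・基礎０ (Kiso 0): last week's episodes only ● おそらくWindowsによるチェックではな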
AKIMOTO 2023/02/05
NHK

Basic English

radio

scrape

tool
Link
Introducing a powerful web scraper for collecting email addresses
AKIMOTO 2022/12/07
An explanation of how to write a scraper that collects email addresses

scrape

Python

tutorial

mail address
Link
GitHub - soxoj/maigret: 🕵️‍♂️ Collect a dossier on a person by username from thousands of sites
AKIMOTO 2022/10/03
Maigret, a fork of Sherlock https://labs.cybozu.co.jp/blog/akky/2022/10/maigret-lists-accounts-info-from-websites/

username

web

tool

OSS

SNS

account

privacy

scrape

introduced
Link
THE LAB #1: Scraping data from an app
This is the first post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used. The Web Scraping Club is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. I usually write in this newsletter about how to extract data from websites but what if our target is an app with no
AKIMOTO 2022/09/05
Scraping from app traffic using Fiddler

scrape

smartphone

app

tutorial

Fiddler
Link
GitHub - ttskch/wordler: A Wordle solver implemented with symfony/panther.
AKIMOTO 2022/04/21
PHP

Panther

scrape

Wordle

solver

OSS
Link
FPGAjobs - Jobs for Logic Designers!
AKIMOTO 2022/03/10
scrape

web service

HDL

job listings
Link