Full-text search for "duckdb" - Hatena Bookmark

Search results for duckdb: 1 - 40 of 78

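Learning "Designing Data-Intensive Applications", the definitive introduction to distributed data systems, in just 30 minutes #DataEngineeringStudy | DevelopersIO
- 266 users
- dev.classmethod.jp
- Technology
- 2023/02/19
Keynote: "Designing Data-Intensive Applications in 30 Minutes". Speaker: Taro Saito (Twitter: @taroleo / GitHub: @xerial), Principal Software Engineer, Treasure Data. Graduated from the Department of Information Science, Faculty of Science, the University of Tokyo; Ph.D. in Information Science and Technology. Worked on research in databases and large-scale genome data processing, then joined the startup Treasure Data and is now based in Silicon Valley in the US. Recipient of the Database Society of Japan Kambayashi Young Researcher Award. Builds products, mainly OSS, that make programming and data processing easier. "Designing Data-Intensive Applications in 30 Minutes": touching on the latest papers as well, I will convey the appeal of the world of distributed data systems. In the second half, @tagomoris https://t.co/TQ2TnsFIOT… (Taro L.)
- database
- read later
- books
- data
- design
- book
- software design
- database
- DB
- technology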
No need to learn yet another data processing library! Ibis 100 Knocks for Python beginners - Qiita
- 249 users
- qiita.com/kunishou
- Technology
- 2024/01/08
Tags: Python, machine learning, pandas, data analysis, ibis-framework. Information 2024/1/14: Kaggle notebook for Ibis. I have prepared a sample notebook for using Ibis on Kaggle. Please make use of Ibis on Kaggle too. 🦩 [Ibis] Kaggle-Titanic-Tutorial. Supplementary article for Ibis 100 Knocks: @hkzm wrote a supplementary article on smarter ways to write the Ibis 100 Knocks solutions (I would like to brush up the content with reference to that article). "Following up on the Ibis 100 Knocks article". Introduction: Hello, this is kunishou.
- python
- read later
- library
- data analysis
- Ibis
- data processing
- qiita
- learning
- pandas
DB Pilot - DuckDB GUI Client
- 216 users
- www.dbpilot.io
- Technology
- 2024/02/04
DuckDB GUI Client DB Pilot is a database GUI client for DuckDB and various other databases. Available for Mac, with Linux and Windows support coming soon. Working with SQL has never been easier - thanks to DB Pilot's integrated AI assistant.
- DB
- database
- read later
- SQL
- GUI
- mac
An in-process SQL OLAP database management system
- 93 users
- duckdb.org
- Technology
- 2020/05/24
DuckDB is a fast in-process analytical database. DuckDB supports a feature-rich SQL dialect complemented with deep integrations into client APIs. -- Get the top-3 busiest train stations SELECT station_name, count(*) AS num_services FROM train_services GROUP BY ALL ORDER BY num_services DESC LIMIT 3;
- database
- sql
- db
- olap
- analytics
- database
- sqlite
- read later
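🦆🦆🦆🦆🦆🦆Introduction to DuckDB🦆🦆🦆🦆🦆🦆
- 73 users
- zenn.dev/notrogue
- Technology
- 2022/09/24
tl;dr: It is the OLAP version of SQLite. It is reportedly faster than Pandas and SQLite on OLAP-style queries. It conveniently reads and writes CSV, Parquet, and Pandas DataFrames. Background: position and competitors. In a word, it is the OLAP version of SQLite. For its positioning, the System Landscape in the paper (DuckDB: an Embeddable Analytical Database (SIGMOD 2019 Demo)) is easy to follow. (From DuckDB: an Embeddable Analytical Database (SIGMOD 2019 Demo)) This landscape divides databases along two axes: standalone (client-server model) versus embedded (single-machine, in-process), and OLTP versus OLAP. On top of that, …
- SQLite
- db
- OLAP
- read later
- database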
Databases in 2022: A Year in Review | OtterTune
- 69 users
- ottertune.com
- Technology
- 2023/01/03
OtterTune is an automated optimization service for PostgreSQL and MySQL running on Amazon RDS and Aurora. It uses machine learning to tune your database’s configuration knobs, indexes, and cloud settings. 🦦 Try it now on your first database for free! Another year has gone by, and I’m still alive. As such, it is an excellent time to reflect on what happened in the world of databases last year. It
- database
- db
- read later
- it
Fully understanding SQLFluff | DevelopersIO
- 62 users
- dev.classmethod.jp
- Technology
- 2023/05/01
I'm Hanzawa, a Google Cloud data engineer. This time I want to deepen my understanding of SQLFluff, a SQL linter, by trying it out. Test environment: macOS: 13.3.1, Python: 3.9.5. What is SQLFluff? SQLFluff is an open-source service that automatically points out and fixes problems in SQL formatting. Let's install it and try it right away. Installation: Python 3 is required to install SQLFluff. $ pip3 install sqlfluff Check that it installed correctly. $ sqlfluff version 2.0.7 Installation is complete. Let's prepare an actual query and try it. Trying it out: sqlfluff has two main features, lint and fix. First, li…
- SQL
- read later
- lint
- VSCode
- development
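DuckDB notes
- 61 users
- zenn.dev/voluntas
- Technology
- 2024/02/04
Motivation: I wanted a tool that reads and analyzes JSONL; compression is a must because I want to load logs; a mechanism that can read and analyze the JSONL output by our packaged product; I want customers to be able to analyze problems easily; I want to provide it as a customer-facing tool, which means running it in the customer's environment; a single binary; I want to provide it as OSS, released under Apache-2.0; log files are large but will not reach 100 GB; originally I was considering Go + SQLite + JSONB; I want to write in SQL; I want to export a SQLite file; the SQLite file format is trustworthy; I want to just put files on S3 or similar; I want to integrate it into the cloud version; why not create a duckdb file per customer?; being able to download the duckdb file would be handy. DuckDB https://d…
- json
- sql
- DB
- logs
- Rust
- read later
- API
- *read later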
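Easy data federation with DuckDB - Techtouch Developers Blog
- 58 users
- tech.techtouch.jp
- Technology
- 2024/05/20
Contents: tl;dr; Introduction; What is DuckDB; What DuckDB can read; Trying it out; Reading JSON on S3; Relational databases; Advantages of using DuckDB over other tools; Pitfalls; Be careful with versions (especially freshly released ones); Problems tend to arise when there are many S3 objects; Thread tuning may be needed; Redshift is not supported; Conclusion; Appendix; Preparing the MySQL side for the example that reads MySQL data. tl;dr: DuckDB is handy. It can be used for more than analytics. You can browse and join data from various places. Standard SQL works too. However, there are various small pitfalls, so be careful. Introduction: I'm min (@not_rogue), and I joined as a data engineer in April 2023. As the weather gets warmer, I feel the urge to go to the Minami-Izu Long Trail | Matsuzaki Town that I saw on YouTube …
- DB
- MySQL
- read later
- database
- data
- tech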
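Building a Q&A bot with LangChain using Chroma, an OSS vector DB | mah_lab / 西見公宏
- 58 users
- note.com/mahlab
- Technology
- 2023/04/15
Chroma is an up-and-coming OSS vector DB with a lot of momentum, and it can be tried out casually as an in-memory vector DB. Its selling point is integration with LangChain and LlamaIndex, but this time I simply tried it as a plain vector DB. Registering data in Chroma: this time I will register the LangChain documentation in Chroma and build a bot that can answer questions about LangChain. However, most of the LangChain documentation is in Jupyter Notebook format, so it needs to be flattened into plain text to make it easy to load into a vector DB. The following function breaks down files in Jupyter Notebook format (JSON), converts them to Markdown, and then uses the Unstructured.io Markdown splitter to split the content into chunks …
- langchain
- vectorDB
- generative_model
- DB
- OSS
- read later
- AI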
DuckDB as the New jq
- 46 users
- www.pgrs.net
- Technology
- 2024/03/22
Recently, I’ve been interested in the DuckDB project (like a SQLite geared towards data applications). And one of the amazing features is that it has many data importers included without requiring extra dependencies. This means it can natively read and parse JSON as a database table, among many other formats. I work extensively with JSON day to day, and I often reach for jq when exploring document
- JSON
- jq
- DB
- read later
- DuckDB
DuckDB 1.0, a fast open-source OLAP database that runs locally as a single binary, officially released
- 43 users
- www.publickey1.jp
- Technology
- 2024/06/06
DuckDB, an OLAP database developed as open source, has been announced to have reached version 1.0, its official release. OLAP databases are typically large client/server server applications, but DuckDB's biggest feature is that, like SQLite, it can easily be run in a local environment as a single binary. In addition to writing queries in SQL, it provides APIs callable from Python, Java, Node.js, Rust, Go, C/C++, R, ODBC, and others, so it can also be embedded in client applications. Supported platforms are Windows (x86_64), macOS (Intel/Apple Silicon), Lin…
- db
- database
- read later
- dev
- tech
- software
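🦐🦐🦐Trying out Evidence, a BI tool you write in Markdown🦐🦐🦐
- 40 users
- zenn.dev/notrogue
- Technology
- 2022/12/11
This is entry #9 of the Advent Calendar 2022 on trying out big-data tools and services I've been curious about but haven't touched. What is Evidence? It is a tool in which you write SQL queries and chart settings in Markdown to generate static HTML documents for reports. Looking at the demo screen gives a good picture of the reports you can create. This approach (defining reports in code and generating static HTML documents) seems to bring benefits such as: version control and review, just like source code; dynamic control of reports (templating) using SQL query results; and easy embedding in various places. (Excerpted from the official Evidence site) Installation and project initialization: in the directory where you want to set up the project, run the npx degit evidence-dev/template command.
- Markdown
- documentation
- SQL
- read later
- DB
- mysql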
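Miscellaneous thoughts on DataPacket
- 38 users
- zenn.dev/shiguredo
- Technology
- 2023/04/04
A look back after more than a year of using 10Gbps Unmetered Dedicated Servers | DataPacket.com for our own service. Premises: the author is in charge of selection and procurement; the author is an amateur at operations and setup and does not actually do either; nothing will be written about prices that are not public; our product has a Raft-based clustering feature. Summary (2024-03-03 edition): no complaints about DataPacket so far; the machines and network offer very good cost performance; because it cannot be pinned to the private network, we stopped using Tailscale for communication between DataPacket servers. Summary: no complaints about DataPacket so far; the machines and network offer very good cost performance; the combination with Tailscale …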
GitHub - tobymao/sqlglot: Python SQL Parser and Transpiler
- 30 users
- github.com/tobymao
- Technology
- 2022/07/05
SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between 21 different dialects like DuckDB, Presto / Trino, Spark / Databricks, Snowflake, and BigQuery. It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects. It is a very comprehensive generic SQL parser with a ro
- SQL
- python
- Parser
- github
- HotEntry
- read later
- programming
- it
SQL Polyglot
- 30 users
- codapi.org
- Technology
- 2023/12/17
Run a query and get results from: postgres:16.2 mysql:8.1 sqlite:3.45 mssql:2022 mariadb:11.2 clickhouse:23.10 duckdb:0.10 Write a query below or pick one: columns join using exists group by rollup window percentile fetch upsert except json recursive cte information schema arrays strings select dense_rank() over w as erank, first_name, dep.name as dep_name, salary from employee as emp join departm
- SQL
- DB
- read later
- Tools
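Being improved for investors, startup support, and university support; update in progress) tfidf etc embeddings cluster reconstructing vis: a means for dynamic visualization of inter-document similarity overview maps, rapid browsing, analysis, and exploration of long texts such as patents. Also a third patent search method, a dynamic knowledge extraction and management method, and automatic patent generation (added: patent generation in blank areas using similarity vectors, small language models, and ChatGPT) - Qiita
- 29 users
- qiita.com/kzuzuo
- Technology
- 2023/05/15
Tags: natural language processing, NLP, visualization, Visualization, patents. After repeated additions, this article has become inconsistent and very hard to read. I will reorganize it soon. Not only the technology but also the methodology is important. It will be important to validate retrospectively and predict prospectively. Validation is currently in progress. It would help if you could suggest topics. With retrospective validation, there is a concern that I might arbitrarily pick things that are already obvious. Personally, the company's policy …
- search
- patents
- ChatGPT
- read later
- text
- machine learning
- qiita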
DuckDB-Wasm: Efficient Analytical SQL in the Browser
- 29 users
- duckdb.org
- Technology
- 2021/10/30
TL;DR: DuckDB-Wasm is an in-process analytical SQL database for the browser. It is powered by WebAssembly, speaks Arrow fluently, reads Parquet, CSV and JSON files backed by Filesystem APIs or HTTP requests and has been tested with Chrome, Firefox, Safari and Node.js. You can try it in your browser at shell.duckdb.org or on Observable. DuckDB-Wasm is fast! If you’re here for performance numbers, h
- wasm
- sql
- webassembly
- db
- browser
- CSV
- JSON
- performance
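Fast in-process database DuckDB 1.0.0 released | gihyo.jp
- 28 users
- gihyo.jp
- Technology
- 2024/06/06
On June 3, 2024, the DuckDB Foundation released version 1.0.0 (codename "Snow Duck"), the official release of the open-source in-process analytical database DuckDB. Announcing DuckDB 1.0.0. DuckDB is a fast in-process analytical database. It has no external dependencies at build time, is easy to install and deploy, and can run in-process inside a host application or as a single binary. It runs on Linux, macOS, Windows, and all common hardware architectures, is deeply integrated with Python and R, and provides client APIs for major programming languages such as Java, C, and C++. It also offers a rich SQL dialect, and with file formats such as CSV, Parquet, and JSON, local fi…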
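Getting started with dimensional modeling! Trying "Building a Kimball dimensional model with dbt" with Snowflake and dbt Cloud | DevelopersIO
- 27 users
- dev.classmethod.jp
- Technology
- 2024/01/23
This is Sagara. Over the past two years or so, dbt has spread rapidly in Japan too, and all kinds of information can now be found in Japanese-language articles. Once you adopt dbt and make some progress using it, I think you will run into questions about how to manage data, such as "how can we manage data more efficiently and with governance?" Looking into it at that point, you find that "data modeling" is needed as a way to manage data efficiently, and you probably arrive at methods such as dimensional modeling and Data Vault. Among these data modeling methods, there is an article that puts dimensional modeling into practice using dbt …
- dbt
- Snowflake
- data analysis
- design
- read later
- data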
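Open-source analytical database system DuckDB releases version 1.0.0
- 25 users
- atmarkit.itmedia.co.jp
- Technology
- 2024/06/07
On June 3, 2024 (Netherlands time), the DuckDB team released official version 1.0.0 of the open-source analytical database system DuckDB. The first source code for the DuckDB project was written in 2018. The current C++ engine code exceeds 300,000 lines. It is designed with a focus on speed, reliability, portability, and ease of use, and supports a rich SQL dialect. Several third-party extensions have also been built and distributed. It is available as a standalone CLI (command-line interface) application, has clients for Python, R, Java, and Wasm, and is deeply integrated with packages such as pandas and dplyr. Key points of DuckDB 1.0.0. Related articles: EDB, a major PostgreSQL contributor, talks about cloud-native databases …
- read later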
Observable Framework
- 23 users
- observablehq.com
- Technology
- 2024/02/17
The best dashboards are built with code. Create fast, beautiful data apps, dashboards, and reports from the command line. Write Markdown, JavaScript, SQL, Python, R… and any language you like. Free and open-source. Observable Framework is an open-source static site generator for data apps, dashboards, reports, and more. Framework includes a preview server for local development, and a command-line
GitHub - sqlfluff/sqlfluff: A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
- 20 users
- github.com/sqlfluff
- Technology
- 2021/05/18
Although SQL is reasonably consistent in its implementations, there are several different dialects available with variations of syntax and grammar. SQLFluff currently supports the following SQL dialects (though perhaps not in full): ANSI SQL - this is the base version and on occasion may not strictly follow the ANSI/ISO SQL definition Athena BigQuery ClickHouse Databricks (note: this extends the s
- sql
- Linter
- lint
- mysql
- read later
Unexplanations: sql is syntactic sugar for relational algebra
- 18 users
- www.scattered-thoughts.net
- Technology
- 2024/03/23
This idea is particularly sticky because it was more or less true 50 years ago, and it's a passable mental model to use when learning sql. But it's an inadequate mental model for building new sql frontends, designing new query languages, or writing tools like ORMs that abstract over sql. Before we get into that, we first have to figure
Doing the unglamorous work of building a pipeline that loads real-world CSV files into BigQuery, using dlt
- 17 users
- soonraah.github.io
- Technology
- 2024/01/29
Check the installed version. $ dlt --version dlt 0.3.25 gsfs, pandas, streamlit, and google-cloud-bigquery-storage are also needed, so install them. 2. Create a pipeline project: prepare a pipeline project with the following command. This means creating a pipeline project with Filesystem as the verified source and DuckDB as the destination. Filesystem can read files from the local file system and from cloud storage such as S3 and GCS. If this command succeeds, a directory structure like the following is created. . ├── .dlt │ ├── .sources │ ├── c
- read later
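A local DWH-like setup built with DuckDB, dbt, and Rill
- 16 users
- zenn.dev/dbttokyo
- Technology
- 2022/12/01
This is the December 1 entry of the dbt Advent Calendar 2022. Summary: with DuckDB and dbt, you can build a homemade DWH-like setup in a local environment, up to a certain data volume. Even projects and companies that have no in-house data analytics platform, and that cannot even get started on data use once security and operations are taken into account, may be able to begin. Using ML data preprocessing and db Python models, it may also be possible to build pipelines for some cleansing and preprocessing in a local environment. What is DuckDB? It is a lightweight, fast OLAP database based on SQLite. With the recent growth in PC memory, a machine with 16 GB or so can process even millions of rows of data quickly and locally. For details, see the article written by @notrogue …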
Deprecating and removing Web SQL | Blog | Chrome for Developers
- 15 users
- developer.chrome.com
- Technology
- 2022/09/01
The Web SQL Database API, which allows you to store data in a structured manner on the user's computer (internally based on the SQLite database engine), was introduced in April 2009 and abandoned in November 2010. While it was implemented in WebKit (which powers Safari) and remained active in the Blink engine (which powers Chrome), Gecko (which powers Firefox) never implemented this feature and We
- Chrome
- article
Back at my old job in ~2016, we built a cheap homegrown data warehouse via Postg... | Hacker News
- 14 users
- news.ycombinator.com
- Technology
- 2022/05/25
Back at my old job in ~2016, we built a cheap homegrown data warehouse via Postgres, SQLite and Lambda. Basically, it worked like this: - All of our data lived in compressed SQLite DBs on S3. - Upon receiving a query, Postgres would use a custom foreign data wrapper we built. - This FDW would forward the query to a web service. - This web service would start one lambda per SQLite file. Each lambda
- serverless
- lambda
- database
- data
- development
GitHub - duckdb/duckdb: DuckDB is an in-process SQL OLAP Database Management System
- 12 users
- github.com/duckdb
- Technology
- 2020/02/08
- RDB
- database
- python
Announcing DuckDB 1.0.0
- 11 users
- duckdb.org
- Technology
- 2024/06/03
To install the new version, please visit the installation guide. For the release notes, see the release page. It has been almost six years since the first source code was written for the project back in 2018, and a lot has happened since: There are now over 300 000 lines of C++ engine code, over 42 000 commits and almost 4 000 issues were opened and closed again. DuckDB has also gained significant
- DB
- duckdb
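How I dealt with the error I got after running gcloud components update for the first time in a while
- 9 users
- techblog.gmo-ap.jp
- Technology
- 2020/06/05
ERROR: (gcloud.components.update) Failed to fetch component listing from server. Check your network settings and try again. Since I was working from home, I assumed it was network-related and tried changing various settings on my home router, but nothing helped. In the end, it worked after I deleted the Python libraries in my local environment and brought them up to date. The environment is macOS Mojave 10.14.6. (Change OS settings at your own risk.) Preparation: updating brew itself. Updated 3 taps (homebrew/core, homebrew/cask and homebrew/services). ==> New Formulae asuka field3d lanraragi reor
- Python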
Against SQL
- 9 users
- www.scattered-thoughts.net
- Technology
- 2021/07/10
TLDR The relational model is great: A shared universal data model allows cooperation between programs written in many different languages, running on different machines and with different lifespans. Normalization allows updating data without worrying about forgetting to update derived data. Physical data independence allows changing data-structures and query plans without having to change all of y
Vector databases (4): Analyzing the trade-offs
- 9 users
- thedataquarry.com
- Technology
- 2023/08/21
Choosing the right vector DB solution. Welcome back! In the previous post in this 4-part series, we looked at the different types of indexes typically used in vector DBs. However, indexing is just a small part of the bigger elephant in the room when it comes to vector databases. Recall that in part 2, we described what a vector database is. To distinguish between the various vector DB offerings out
- DB
- read later
Postgres is eating the database world
- 8 users
- pigsty.io
- Technology
- 2024/03/15
PostgreSQL isn’t just a simple relational database; it’s a data management framework with the potential to engulf the entire database realm. The trend of “Using Postgres for Everything” is no longer limited to a few elite teams but is becoming a mainstream best practice. OLAP’s New Challenger In a 2016 database meetup, I argued that a significant gap in the PostgreSQL ecosystem was the lack of a s
- postgresql
- database
Welcome to LangChain — 🦜🔗 LangChain 0.0.161
- 8 users
- langchain.readthedocs.io
- Technology
- 2023/01/08
Getting Started Quickstart Guide Modules Models LLMs Getting Started Generic Functionality How to use the async API for LLMs How to write a custom LLM wrapper How (and why) to use the fake LLM How (and why) to use the the human input LLM How to cache LLM calls How to serialize LLM classes How to stream LLM and Chat Model responses How to track token usage Integrations AI21 Aleph Alpha Azure OpenAI
- Python
- natural language processing
RubyKaigi 2022 - Fast data processing with Ruby and Apache Arrow #rubykaigi - 2022-09-13 - ククログ
- 8 users
- www.clear-code.com
- Technology
- 2022/09/15
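Related links: slides (Rabbit Slide Show), slides (SlideShare), repository. Content: in the Red Arrow talk at RubyKaigi Takeout 2021, I introduced many of the things that can be done, centered on Red Arrow. As a follow-up, this year I wanted to show that it has become practically usable, so I decided to focus on its fast data processing features. However, once the talk was accepted and I started preparing the material, it turned out that the implementation of each feature needed a bit more polish before it could be called "practically usable". Oh no... So, about the various ways to process data fast with Apache Arrow …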
State of data 2023
- 6 users
- state-of-data.com
- Learning
- 2023/05/26
Introduction: In the past 2 years, the data ecosystem has been evolving rapidly. New tools have been emerging every month in the modern data stack. In a hype cycle, it becomes hard to distinguish the signal from the noise. Which of those tools would eventually become simple features or actual products that we would be using in a few years? In addition to our growing number of tools, we've seen a few
DuckDB Doesn’t Need Data To Be a Database
- 6 users
- www.nikolasgoebel.com
- Technology
- 2024/05/30
28 May 2024 DuckDB Doesn’t Need Data To Be a Database One of the many enjoyable things about databases is that they generally try to separate how data is represented internally (say on disk) from how it is used. To the point that it has become the norm to not even store the data on the same hardware that is running the queries. Databases have gotten so good at this, that the term is almost mislead
- DB
- database
Building a Kimball dimensional model with dbt | dbt Developer Blog
- 6 users
- docs.getdbt.com
- Technology
- 2023/04/23
Dimensional modeling is one of many data modeling techniques that are used by data practitioners to organize and present data for analytics. Other data modeling techniques include Data Vault (DV), Third Normal Form (3NF), and One Big Table (OBT) to name a few. Data modeling techniques on a normalization vs denormalization scale. While the relevance of dimensional modeling has been debated by data pra
- read later
GitHub - lancedb/lance: Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more i
- 6 users
- github.com/lancedb
- Technology
- 2023/02/12
Lance is a modern columnar data format that is optimized for ML workflows and datasets. Lance is perfect for: Building search engines and feature stores. Large-scale ML training requiring high performance IO and shuffles. Storing, querying, and inspecting deeply nested data for robotics or large blobs like images, point clouds, and more. The key features of Lance include: High-performance random a
- data
- read later