At Instagram, we have one of the world’s largest deployments of the Apache Cassandra database. We began using Cassandra in 2012 to replace Redis and support product use cases like fraud detection, Feed, and the Direct inbox. At first we ran Cassandra clusters in an AWS environment, but migrated them over to Facebook’s infrastructure when the rest of Instagram moved. We’ve had a really good experie
A random list of Apache Cassandra anti-patterns. There is a lot of information on what to use Cassandra for and how, but not much on what not to do. This presentation works toward filling that gap.
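Yahoo! Japan's Hadoop cluster has 6,000 nodes and 120 PB. Solving exponentially growing data demand with technology. Hadoop Spark Conference Japan 2016. Yahoo! Japan (hereafter Yahoo!) is one of the companies with a big data processing platform among the largest in Japan. At "Hadoop Spark Conference Japan 2016", held on February 8, the company used its keynote to present the scale of the big data processing platform it currently operates, the challenges it faces, and how it intends to solve them. The solution it presented is a bold one: moving from being a heavy user of big data platforms such as Hadoop to being one of their builders. The company's contributions will be open source and look likely to help solve many more problems in the future. Tadashi Endo of the company's Data Infrastructure Division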
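A Cassandra Column Family as a whole has a structure like the two-dimensional Map shown below. The RowKey above is called the Partition Key in CQL, and data is placed on nodes per Partition Key. In CQL, a ColumnKey that is part of the primary key but is not the Partition Key is called a Clustering Column (as the name suggests, because key-value pairs within a partition are grouped by this key). When a large volume of reads/writes hits a single partition, the load on a particular node goes up. The Partition Key has to be chosen with load distribution in mind. refs: http://ameblo.jp/principia-ca/entry-11886808914.html Data created with CQL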
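The two-level layout described in this entry can be sketched in Python as a map of maps: the partition key picks the node and the partition, and the clustering column orders entries inside that partition. This is only an illustration under stated assumptions. The node names, the `ColumnFamily` class, and the `node_for` helper are hypothetical, and the MD5-modulo placement is a stand-in for, not a reproduction of, Cassandra's actual partitioner.

```python
import hashlib
from collections import defaultdict

# Hypothetical node names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def node_for(partition_key: str) -> str:
    """Map a partition key to a node by hashing it.

    Simplified stand-in: real Cassandra uses a token ring, not hash-modulo.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class ColumnFamily:
    """Toy model: partition key -> {clustering key -> value}."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def put(self, partition_key, clustering_key, value):
        # Every write to the same partition key lands in the same inner map,
        # which is why a hot partition means a hot node.
        self._partitions[partition_key][clustering_key] = value

    def read_partition(self, partition_key):
        # Entries within a partition come back ordered by clustering key.
        rows = self._partitions.get(partition_key, {})
        return sorted(rows.items())


cf = ColumnFamily()
cf.put("user:1", "2016-02-08", "login")
cf.put("user:1", "2016-02-07", "signup")
cf.put("user:2", "2016-02-08", "login")
```

Because `node_for` depends only on the partition key, every read and write for `"user:1"` goes to one node in this model, which mirrors the load-distribution concern raised in the entry.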
The WHERE clause restrictions depend on the type of statement, type of column, and whether a secondary index is used. For SELECT statements on partition keys, either all keys must be restricted or none. Clustering columns cannot be restricted if preceding ones are not. Secondary indexes allow restricting columns not in the primary key.
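The three SELECT rules in this summary can be written down as a small checker. This is a rough sketch, assuming a table described only by its column names. The function `check_select_where` and the example column names are hypothetical, and the sketch ignores operator types, `ALLOW FILTERING`, and any other detail the summary does not mention.

```python
def check_select_where(partition_keys, clustering_columns, indexed_columns, restricted):
    """Return a list of rule violations for a SELECT's WHERE clause.

    An empty list means the restricted columns satisfy the three rules.
    """
    restricted = set(restricted)
    errors = []

    # Rule 1: restrict all partition key columns, or none of them.
    hit = [c for c in partition_keys if c in restricted]
    if hit and len(hit) != len(partition_keys):
        errors.append("partition key must be fully restricted or not at all")

    # Rule 2: a clustering column may be restricted only if every
    # preceding clustering column is restricted too.
    gap_seen = False
    for col in clustering_columns:
        if col not in restricted:
            gap_seen = True
        elif gap_seen:
            errors.append(f"clustering column {col!r} restricted but a preceding one is not")

    # Rule 3: columns outside the primary key need a secondary index.
    primary = set(partition_keys) | set(clustering_columns)
    for col in sorted(restricted - primary):
        if col not in indexed_columns:
            errors.append(f"column {col!r} is not in the primary key and has no secondary index")

    return errors


# Hypothetical table: PRIMARY KEY ((tenant, day), hour, minute), index on status.
PK = ["tenant", "day"]
CK = ["hour", "minute"]
IDX = {"status"}
```

For example, `check_select_where(PK, CK, IDX, ["tenant", "day", "hour"])` returns an empty list, while restricting only `tenant`, or restricting `minute` without `hour`, each yields one violation.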
Performance Tuning for Cassandra Write Operations. Optimize Cassandra for write operations. The Cassandra write path is very simple and requires little tuning. The biggest performance gain for writes comes from putting the commit log on a separate disk drive. The commit log uses sequential writes, and most hard drives will meet the throughput requirement. However, if SSTables share the same drive with the commit log, I/O contention
Presenters: Michael Nelson, Development Manager at FamilySearch A recent research project at FamilySearch.org pushed Cassandra to very high scale and performance limits in AWS using a real application. Come see how we achieved 250K reads/sec with latencies under 5 milliseconds on a 400-core cluster holding 6 TB of data while maintaining transactional consistency for users. We'll cover tuning of Ca
Cassandra works optimally when the data you need to access is already in memory. Disks are comparatively slow, so when data needs to be read from disk, it works best when it is performed as a single sequential operation. In order to design an effective data model in Cassandra, it's good to keep these best practices in mind: Use clustering columns in your tables so that your rows are ordered on dis
The document discusses dealing with JVM limitations in Apache Cassandra. It identifies key pain points like garbage collection and platform-specific code. It then explores specific issues like fragmentation and offers solutions like arena allocation for memtables. The document also advocates for allowing more low-level access in Java to directly address issues like file mapping limitations, in ord