Tags

Bookmarks / aphyr.com (17)

  • Strong consistency models

    Update, 2018-08-24: For a more complete, formal discussion of consistency models, see jepsen.io. Network partitions are going to happen. Switches, NICs, host hardware, operating systems, disks, virtualization layers, and language runtimes, not to mention program semantics themselves, all conspire to delay, drop, duplicate, or reorder our messages. In an uncertain world, we want our software to mai

    kuenishi 2022/12/12
    "More formally, we say that a consistency model is the set of all allowed histories of operations."
  • Asynchronous replication with failover

    In response to my earlier post on Redis inconsistency, Antirez was kind enough to help clarify some points about Redis Sentinel’s design. First, I’d like to reiterate my respect for Redis. I’ve used Redis extensively in the past with good results. It’s delightfully fast, simple to operate, and offers some of the best documentation in the field. Redis is operationally predictable. Data structures a

    kuenishi 2018/06/21
    Happened to find a good read
  • Jepsen: Crate 0.54.9 version divergence

    In the last Jepsen analysis, we saw that RethinkDB 2.2.3 could encounter spectacular failure modes due to cluster reconfiguration during a partition. In this analysis, we’ll talk about Crate, and find out just how many versions a row’s version identifies. Crate is a shared-nothing, “infinitely scalable”, eventually-consistent SQL database built on Elasticsearch. Because Elasticsearch has and conti

    kuenishi 2016/06/29
    oh: "Overly Optimistic Concurrency Control"
  • Jepsen: RethinkDB 2.2.3 reconfiguration

    In the previous Jepsen analysis of RethinkDB, we tested single-document reads, writes, and conditional writes, under network partitions and process pauses. RethinkDB did not exhibit any nonlinearizable histories in those tests. However, testing with more aggressive failure modes, on both 2.1.5 and 2.2.3, has uncovered a subtle error in Rethink’s cluster membership system. This error can lead to st

    kuenishi 2016/02/12
  • Jepsen: Percona XtraDB Cluster

    Percona’s CTO Vadim Tkachenko wrote a response to my Galera Snapshot Isolation post last week. I think Tkachenko may have misunderstood some of my results, and I’d like to clear those up now. I’ve ported the MariaDB tests to Percona XtraDB Cluster, and would like to confirm that using exclusive write locks on all reads, as Tkachenko recommends, can recover serializable histories. Finally, we’ll ad

    kuenishi 2016/01/25
    A bit disappointing
  • Jepsen: MariaDB Galera Cluster

    There’s a neat kind of symmetry here: P1 and P2 are duals of each other, preventing a read from seeing an uncommitted write, and preventing a write from clobbering an uncommitted read, respectively. P0 prevents two writes from stepping on each other, and we could imagine its dual r1(x) … r2(x)–but since reads don’t change the value of x they commute, and we don’t need to prevent them from interlea

    kuenishi 2016/01/25
    Something like "that's not really Snapshot Isolation, is it"
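The P0/P1/P2 phenomena in the excerpt come from the ANSI SQL isolation literature. As a rough illustration (my own toy encoding, not Jepsen's), P1, the "dirty read" pattern w1(x) … r2(x) occurring before T1 commits or aborts, can be detected mechanically over a flat event history:

```python
# Toy sketch (encoding is my own assumption): a history is a list of
# events -- ("w", txn, key) for a write, ("r", txn, key) for a read,
# ("c", txn) / ("a", txn) for commit / abort. P1 ("dirty read") is the
# pattern w1(x) ... r2(x) before T1 commits or aborts.

def has_dirty_read(history):
    uncommitted = {}  # key -> set of txns with in-flight writes to it
    for event in history:
        if event[0] == "w":
            _, txn, key = event
            uncommitted.setdefault(key, set()).add(txn)
        elif event[0] in ("c", "a"):
            _, txn = event
            for writers in uncommitted.values():
                writers.discard(txn)  # txn's writes are now settled
        elif event[0] == "r":
            _, txn, key = event
            if uncommitted.get(key, set()) - {txn}:
                return True  # read observed another txn's dirty write
    return False

dirty = [("w", 1, "x"), ("r", 2, "x"), ("c", 1), ("c", 2)]
clean = [("w", 1, "x"), ("c", 1), ("r", 2, "x"), ("c", 2)]
```

P0 (dirty write) and P2 (fuzzy read) are detectable the same way by swapping which event types form the forbidden pattern; the duality the excerpt describes is just r and w exchanged.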
  • Jepsen: RethinkDB 2.1.5

    In this Jepsen report, we’ll verify RethinkDB’s support for linearizable operations using majority reads and writes, and explore assorted read and write anomalies when consistency levels are relaxed. This work was funded by RethinkDB, and conducted in accordance with the Jepsen ethics policy. RethinkDB is an open-source, horizontally scalable document store. Similar to MongoDB, documents are hiera

    kuenishi 2016/01/25
    RethinkDB is safe if used with majority writes and majority reads
  • Jepsen: Chronos

    Chronos is a distributed task scheduler (cf. cron) for the Mesos cluster management system. In this edition of Jepsen, we’ll see how simple network interruptions can permanently disrupt a Chronos+Mesos cluster. Chronos relies on Mesos, which has two flavors of node: master nodes, and slave nodes. Ordinarily in Jepsen we’d refer to these as “primary” and “secondary” or “leader” and “follower” to avo

    kuenishi 2015/08/19
  • Comments on "You Do it Too"

    In response to You Do It Too: Forfeiting Partition Tolerance in Distributed Systems, I’d like to remind folks of a few things around CAP. Partition intolerance does not mean that partitions cannot happen, it means partitions are not supported. Specifically, partition-intolerant systems must sacrifice invariants when partitions occur. Which invariants? By Gilbert & Lynch, either the system allows n

    kuenishi 2015/07/09
    A reply to a blog post by an HBase committer who misunderstands the CAP theorem
  • Jepsen: MongoDB stale reads

    Please note: our followup analysis of 3.4.0-rc3 revealed additional faults in MongoDB’s replication algorithms which could lead to the loss of acknowledged documents–even with Majority Write Concern, journaling, and fsynced writes. In May of 2013, we showed that MongoDB 2.4.3 would lose acknowledged writes at all consistency levels. Every write concern less than MAJORITY loses data by design due t

    kuenishi 2015/05/13
    The Knossos and Nemesis work is impressive
  • Jepsen: final thoughts

    Previously in Jepsen, we discussed Riak. Now we’ll review and integrate our findings. This was a capstone post for the first four Jepsen posts; it is not the last post in the series. I’ve continued this work in the years since and produced several more posts. We started this series with an open problem. Notorious computer expert Joe Damato explains: “Literally no one knows.” We’ve pushed the bound

    kuenishi 2015/05/11
    A wrap-up of the first half of the series; a good read
  • Jepsen: Elasticsearch 1.5.0

    Previously, on Jepsen, we demonstrated stale and dirty reads in MongoDB. In this post, we return to Elasticsearch, which loses data when the network fails, nodes pause, or processes crash. Nine months ago, in June 2014, we saw Elasticsearch lose both updates and inserted documents during transitive, nontransitive, and even single-node network partitions. Since then, folks continue to refer to the

    kuenishi 2015/05/11
    Running 1.5.0 through Jepsen one more time
  • Jepsen: Aerospike

    Previously, on Jepsen, we reviewed Elasticsearch’s progress in addressing data-loss bugs during network partitions. Today, we’ll see Aerospike 3.5.4, an “ACID database”, react violently to a basic partition. [Update, 2018-03-07] See the followup analysis of 3.99.0.3 Aerospike is a high-performance, distributed, schema-less, KV store, often deployed in caching, analytics, or ad tech environments. I

    kuenishi 2015/05/06
    The Aerospike test is here at last!!!
  • Builders vs option maps

    kuenishi 2015/03/30
  • Jepsen: Elasticsearch

    This post covers Elasticsearch 1.1.0. In the months since its publication, Elasticsearch has added a comprehensive overview of correctness issues and their progress towards fixing some of these bugs. Previously, on Jepsen, we saw RabbitMQ throw away a staggering volume of data. In this post, we’ll explore Elasticsearch’s behavior under various types of network failure. Elasticsearch is a distribut

    kuenishi 2014/06/24
    Elasticsearch is not designed as primary data storage
  • The trouble with timestamps

    Some folks have asked whether Cassandra or Riak in last-write-wins mode are monotonically consistent, or whether they can guarantee read-your-writes, and so on. This is a fascinating question, and leads to all sorts of interesting properties about clocks and causality. There are two families of clocks in distributed systems. The first are often termed wall clocks, which correspond roughly to the t

    kuenishi 2013/10/14
    If you want a monotonic clock you should use clock_gettime(2), but Cassandra doesn't appear to (probably for performance reasons?)
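The comment's point about monotonic clocks can be illustrated with a small Python sketch (assumptions mine): `time.monotonic()` is backed by `clock_gettime(CLOCK_MONOTONIC)` on Linux and is guaranteed never to go backwards within a process, whereas the wall clock `time.time()` can step backwards under NTP adjustments, which is what makes last-write-wins timestamps hazardous:

```python
# Sketch: measure durations with the monotonic clock, which is immune
# to wall-clock steps. time.time() could jump backwards mid-measurement;
# time.monotonic() cannot.
import time

def elapsed_safely(fn):
    """Run fn and return how long it took, using the monotonic clock."""
    start = time.monotonic()
    fn()
    return time.monotonic() - start  # always >= 0 by monotonicity

d = elapsed_safely(lambda: time.sleep(0.01))
```

The same monotonicity does not hold across processes or hosts, which is why a cluster cannot get causal ordering out of local clocks alone.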
  • Jepsen: MongoDB

    Previously in Jepsen, we discussed Redis. In this post, we’ll see MongoDB drop a phenomenal amount of data. See also: followup analyses of 2.6.7 and 3.4.0-rc3. MongoDB is a document-oriented database with a similar distribution design to Redis. In a replica set, there exists a single writable primary node which accepts writes, and asynchronously replicates those writes as an oplog to N secondaries

    kuenishi 2013/05/19
    The patterns in which MongoDB loses data