www.dbms2.com
Spark is on the rise, to an even greater degree than I thought last month. Numerous clients and other companies I talk with have adopted Spark, plan to adopt Spark, or at least think it’s likely they will. In particular: A number of analytic-stack companies are joining ClearStory in using Spark. Most of the specifics are confidential, but I hope some will be announced soon. MapR has joined Clouder
Two years ago I wrote about how Zynga managed analytic data: Data is divided into two parts. One part has a pretty ordinary schema; the other is just stored as a huge list of name-value pairs. (This is much like eBay’s approach with its Teradata-based Singularity, except that eBay puts the name-value pairs into long character strings.) … Zynga adds data into the real schema when it’s clear it will
The cardinal rules of DBMS development Rule 1: Developing a good DBMS requires 5-7 years and tens of millions of dollars. That’s if things go extremely well. Rule 2: You aren’t an exception to Rule 1. In particular: Concurrent workloads benchmarked in the lab are poor predictors of concurrent performance in real life. Mixed workload management is harder than you’re assuming it is. Those minor edge
I visited Cloudera Friday for, among other things, a chat about Impala with Marcel Kornacker and colleagues. Highlights included: Impala is meant to someday be a competitive MPP (Massively Parallel Processing) analytic RDBMS. At the moment, it is not one. For example, Impala lacks any meaningful form of workload management or query optimization. While Impala will run against any HDFS (Hadoop Distr
A lot of confusion seems to have built around the facts: Hadoop MapReduce is being opened up into something called MapReduce 2 (MRv2). Something called YARN (Yet Another Resource Negotiator) is involved. One purpose of the whole thing is to make MapReduce not be required for Hadoop. MPI (Message Passing Interface) was mentioned as a paradigmatic example of a MapReduce alternative, yet the MPI/YARN
My clients at Cloudera have been around for a while, in effect positioned as “the Hadoop company.” Their business, in a nutshell, consists of: Packaging up a Cloudera distribution of Apache Hadoop. This distribution doesn’t have proprietary code; it’s just packaged by Cloudera from Apache projects (with a decent minority of the code happening to have been contributed by Cloudera engineers). Paid s
I chatted with Oliver Ratzesberger of eBay around a Stanford picnic table yesterday (the XLDB 4 conference is being held at Jacek Becla’s home base of SLAC, which used to stand for “Stanford Linear Accelerator Center”). Todd Walter of Teradata also sat in on the latter part of the conversation. Things I learned included: eBay has thrown out Greenplum. (Edit: As per the comments below, eBay wouldn’
Once again, I find myself writing and talking a lot about MapReduce. But I suspect that MapReduce-related conversations would go better if we overcame three fairly common MapReduce myths: MapReduce is something very new MapReduce involves strict adherence to the Map-Reduce programming paradigm MapReduce is a single technology So let’s give it a try. When Dave DeWitt and Mike Stonebraker leveled th
My old client Mark Tsimelzon moved over to Yahoo after Coral8 was acquired, and I caught up with him last month. He turns out to be running development for a significant portion of Yahoo’s Hadoop effort — everything other than HDFS (Hadoop Distributed File System). Yahoo evidently plans to, within a year or so, get Hadoop to the point that it is managing 10s of petabytes of data for Yahoo, with re
So far as I can see, there are three implementations of MapReduce that matter for enterprise analytic use – Hadoop, Greenplum’s, and Aster Data’s.* Hadoop has of course been available for a while, and used for a number of different things, while Greenplum’s and Aster Data’s versions of MapReduce – both in late-stage beta – have far fewer users. *Perhaps Nokia’s Disco or another implementation will
A few weeks ago, I posted about a conversation I had with Jeff Hammerbacher of Cloudera, in which he discussed a Hadoop-based effort at Facebook he previously directed. Subsequently, Ashish Thusoo and Joydeep Sarma of Facebook contacted me to expand upon and, in a couple of instances, correct what Jeff had said. They also filled me in on Hive, a data-manipulation add-on to Hadoop that they developed
eBay’s two enormous data warehouses A few weeks ago, I had the chance to visit eBay, meet briefly with Oliver Ratzesberger and his team, and then catch up later with Oliver for dinner. I’ve already alluded to those discussions in a couple of posts, specifically on MapReduce (which eBay doesn’t like) and the astonishingly great difference between high- and low-end disk drives (to which eBay clued m
Along with five other coauthors — the lead author seems to be Andy Pavlo — famous MapReduce non-fans Mike Stonebraker and David DeWitt have posted a SIGMOD 2009 paper called “A Comparison of Approaches to Large-Scale Data Analysis.” The heart of the paper is benchmarks of Hadoop, Vertica, and “DBMS-X” on identical clusters of 100 low-end nodes, across a series of tests including (if I understood
Last week, Dan Weinreb tipped me off to something very cool: Mike Stonebraker and a group of MIT/Brown/Yale colleagues are calling for a complete rewrite of OLTP DBMS. And they have a plan for how to do it, called H-Store, as per a paper and an associated slide presentation. On the system side, some of their most radical suggestions include: No disks or other persistent storage at all. No multi-th
Political issues around big tech companies The technology industry has an increasingly complex relationship to government and politics, most importantly in three areas: Privacy and surveillance. Censorship. Antitrust, general economic regulation, and other competition management. Here’s some of what I think about that, plus links to a lot more. 1. For a long time, I’ve maintained: Privacy and surv