0x0fff.com
At the moment there is a huge buzz in the media about the Apache Spark framework, and little by little it is becoming the next big thing in the field of “Big Data”. The simplest thing we can do to prove this is to look at the Google Trends diagram: I have shown here both Hadoop and Spark for the last 2 years. So Spark is becoming more and more popular among the end customers, and they are looking over the in
This is my second article about Apache Spark architecture and today I will be more specific and tell you about the shuffle, one of the most interesting topics in the overall Spark design. The previous part was mostly about general Spark architecture and its memory management. It can be accessed here. The next one is about Spark memory management and it is available here. What is the shuffle in gen
Edit from 2015/12/17: The memory model described in this article is deprecated as of Apache Spark 1.6+; the new memory model is based on UnifiedMemoryManager and is described in this article. Recently I’ve answered a series of questions related to Apache Spark architecture on StackOverflow. All of them seem to be caused by the absence of a good general description of the Spark architecture i
When you are completely ready to start your “big data” initiative with Hadoop, one of your first questions will be about cluster sizing. What is the right hardware to choose in terms of price/performance? How much hardware do you need to handle your data and your workload? I will do my best to answer these questions in this article. Let’s start with the simplest thing, storage. There are many