[B! apache-software-foundation][apache-spark] nabinno's bookmarks

nabinno id:nabinno

nabinno's bookmarks on apache-software-foundation and apache-spark (9)

spark/KinesisWordCountASL.scala at master · apache/spark · GitHub
* Licensed to the Apache Software Foundation (ASF) under one or more
nabinno 2015/12/22
github

apache-software-foundation

apache-spark
[SPARK-7076][SPARK-7077][SPARK-7080][SQL] Use managed memory for aggregations by JoshRosen · Pull Request #5725 · apache/spark
This patch adds managed-memory-based aggregation to Spark SQL / DataFrames. Instead of working with Java objects, this new aggregation path uses sun.misc.Unsafe to manipulate raw memory. This reduces the memory footprint for aggregations, resulting in fewer spills, OutOfMemoryErrors, and garbage collection pauses. As a result, this allows for higher memory utilization. It can also result in better
nabinno 2015/10/31
github

apache-software-foundation

apache-spark

josh-rosen
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
nabinno 2015/08/25
github

apache-software-foundation

apache-spark
[SPARK-9020][SQL] Support mutable state in code gen expressions by cloud-fan · Pull Request #7392 · apache/spark
nabinno 2015/07/16
github

apache-software-foundation

apache-spark
[SPARK-5006] spark.port.maxRetries doesn't work - ASF JIRA
We normally configure spark.port.maxRetries in a properties file or SparkConf, but Utils.scala reads it from SparkEnv's conf. SparkEnv is an object whose env must be set after the JVM is launched, and Utils.scala is also an object, so in most cases portMaxRetries gets the default value of 16.
nabinno 2015/06/05
apache-software-foundation

apache-spark
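The initialization-order problem described in the SPARK-5006 excerpt can be illustrated with a small Python analogy. This is only a sketch: `Env`, `port_max_retries_from_env`, and `port_max_retries_from_conf` are hypothetical names, not Spark APIs, and the explicit-conf variant is one possible alternative rather than the fix Spark actually adopted.

```python
# Hypothetical stand-ins for illustration; none of these names are Spark APIs.
DEFAULT_MAX_RETRIES = 16  # default value quoted in the JIRA description


class Env:
    """Process-wide holder whose conf is only set some time after startup."""
    conf = None  # still unset when early start-up code runs


def port_max_retries_from_env():
    # Reads from the global env and silently falls back to the default
    # when the env has not been set yet.
    if Env.conf is None:
        return DEFAULT_MAX_RETRIES
    return int(Env.conf.get("spark.port.maxRetries", DEFAULT_MAX_RETRIES))


def port_max_retries_from_conf(conf):
    # Alternative sketch: read from a conf object passed in by the caller.
    return int(conf.get("spark.port.maxRetries", DEFAULT_MAX_RETRIES))


user_conf = {"spark.port.maxRetries": "100"}

early = port_max_retries_from_env()          # env unset: user setting ignored
explicit = port_max_retries_from_conf(user_conf)

Env.conf = user_conf                         # env becomes available later
late = port_max_retries_from_env()
```

In this sketch the early lookup ignores the user's setting because the global holder has not been populated yet, which is the pattern the excerpt describes.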
Spark Release 1.3.0 | Apache Spark
Spark 1.3.0 is the fourth release on the 1.X line. This release brings a new DataFrame API alongside the graduation of Spark SQL from an alpha project. It also brings usability improvements in Spark’s core engine and expansion of MLlib and Spark Streaming. Spark 1.3 represents the work of 174 contributors from more than 60 institutions in more than 1000 individual patches. To download Spark 1.3 vi
nabinno 2015/06/05
apache-software-foundation

apache-spark
MLlib: RDD-based API - Spark 3.5.1 Documentation
MLlib: RDD-based API This page documents sections of the MLlib guide for the RDD-based API (the spark.mllib package). Please see the MLlib Main Guide for the DataFrame-based API (the spark.ml package), which is now the primary API for MLlib. Data types Basic statistics summary statistics correlations stratified sampling hypothesis testing streaming significance testing random data generation Class
nabinno 2015/06/05
apache-software-foundation

apache-spark

mlib
RDD Programming Guide - Spark 3.5.1 Documentation
RDD Programming Guide Overview Linking with Spark Initializing Spark Using the Shell Resilient Distributed Datasets (RDDs) Parallelized Collections External Datasets RDD Operations Basics Passing Functions to Spark Understanding closures Example Local vs. cluster modes Printing elements of an RDD Working with Key-Value Pairs Transformations Actions Shuffle operations Background Performance Impact
nabinno 2015/06/05
apache-software-foundation

apache-spark

guide
Apache Spark™ - Unified Engine for large-scale data analytics
Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
nabinno 2014/12/07
apache-spark

data-mining

python

graphx

scala

apache-software-foundation

machine-learning

mapreduce

database