[B! Spark] Hash's Bookmarks

Hash id:Hash

Hash's bookmarks about Spark (15)

Breaking the “curse of dimensionality” in Genomics using “wide” Random Forests
Unified governance for all data, analytics and AI assets
Hash 2017/08/01
VariantSpark RF, apparently

bioinformatics

spark
Link
Interactive Analysis of Genomic Datasets Using Amazon Athena | Amazon Web Services
AWS Big Data Blog Interactive Analysis of Genomic Datasets Using Amazon Athena Aaron Friedman is a Healthcare and Life Sciences Solutions Architect with Amazon Web Services The genomics industry is in the midst of a data explosion. Due to the rapid drop in the cost to sequence genomes, genomics is now central to many medical advances. When your genome is sequenced and analyzed, raw sequencing file
Hash 2016/12/08
Preprocesses the 1000 Genomes Project data with Spark and then queries it with Athena. Code => https://github.com/awslabs/aws-big-data-blog/tree/master/aws-blog-athena-genomics

AWS

spark

bioinformatics

atheism
Link
Submitting User Applications with spark-submit | Amazon Web Services
AWS Big Data Blog Submitting User Applications with spark-submit Francisco Oliveira is a consultant with AWS Professional Services Customers starting their big data journey often ask for guidelines on how to submit user applications to Spark running on Amazon EMR. For example, customers ask for guidelines on how to size memory and compute resources available to their applications and the best reso
Hash 2016/10/04
spark

EMR
Link
Apache Spark @Scale: A 60 TB+ production use case
Facebook often uses analytics for data-driven decision making. Over the past few years, user and product growth has pushed our analytics engines to operate on data sets in the tens of terabytes for a single query. Some of our batch analytics is executed through the venerable Hive platform (contributed to Apache Hive by Facebook in 2009) and Corona, our custom MapReduce implementation. Facebook has
Hash 2016/09/06
facebook

spark

later
Link
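AWS Solutions Architect Blog
Generating Amazon-scale recommendations with Apache Spark and Amazon DSSTNE. Amazon's personalization uses neural networks to generate product recommendations for each customer. Because Amazon's product catalog is enormous compared with the number of products any one customer has purchased, the datasets are extremely sparse. And because the numbers of customers and products run into the hundreds of millions, our neural network models must be distributed across multiple GPUs to meet space and time constraints. For that reason, we developed and open-sourced DSSTNE (the Deep Scalable Sparse Tensor Neural Engine), which runs on GPUs. We use DSSTNE to train neural networks and generate recommendations, and the e-commerce website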
Hash 2016/07/11
A piece on the technology behind Amazon's product recommendations

spark

AWS

ECS

DSSTNE
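Link
Midsummer! Spark + Python + Data Science Festival (2016/07/25, from 19:00)
[Added 2016/07/04] Due to popular demand, capacity has been increased from 80 to 100! Front-line engineers from DMM.com Labo, CyberAgent, and Cloudera will each present from their own perspective! We are holding a meetup recommended for anyone who wants to learn about use cases, tools, and architectures for Data Science that puts big data to work with Spark and Python, and for products that apply machine learning. Audience: we are planning talks that should appeal to people who use Spark and want to build data-driven products, people who do machine learning or data analysis but have not yet used Spark, and people who want to analyze and make use of big data with Python. Overview: a gathering for sharing knowledge about big data analysis with Spark and Python and about developing products that apply machine learning. What kind of architecture to use for large volumes of data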
Hash 2016/06/29
Will sign up later (already over capacity)

spark

event

python

machine_learning
Link
Apache Spark as a Compiler: Joining a Billion Rows per Second on a Laptop
Hash 2016/05/31
On how Spark 2.0 gets even faster thanks to whole-stage code generation in the Tungsten engine

spark

performance
Link
Analyze Your Data on Amazon DynamoDB with Apache Spark | Amazon Web Services
AWS Big Data Blog Analyze Your Data on Amazon DynamoDB with Apache Spark Manjeet Chayel is a Solutions Architect with AWS Every day, tons of customer data is generated, such as website logs, gaming data, advertising data, and streaming videos. Many companies capture this information as it’s generated and process it in real time to understand their customers. Amazon DynamoDB is a fast and flexible
Hash 2016/05/20
Read later

AWS

DynamoDB

spark

later
Link
Exploring Geospatial Intelligence using SparkR on Amazon EMR | Amazon Web Services
Hash 2016/04/15
Spark

AWS

R

later
Link
Hadoop / Spark Conference Japan 2016
Hash 2016/02/02
Whoa, what is this, that's quite a stellar lineup... I'd like to go, but it's right in the middle of the work week

Spark

Hadoop

event
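Link
I wrote an appendix for the book 『Sparkによる実践データ解析』 (the Japanese edition of Advanced Analytics with Spark) - ほくそ笑む
Together with Takayanagi-san of Recruit, I co-wrote an appendix for the book 『Sparkによる実践データ解析』. Sparkによる実践データ解析 ―大規模データのための機械学習事例集. Authors: Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills, 石川有, Sky株式会社玉川竜司. Publisher: O'Reilly Japan. Release date: 2016/01/23. Format: large-format book. The appendix I wrote is "About SparkR". SparkR is a package for using Spark from the R language, and it is officially supported. I previously gave a talk about SparkR at Spark Meetup: "I gave a talk about SparkR at Spark Meetup 2015 #sparkjp - ほくそ笑む". At that time, its functional shortcomings were still noticeable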
Hash 2016/01/14
spark

R

book
Link
Introducing Redshift Data Source for Spark
Hash 2015/12/29
spark

Redshift
Link
SparkR (R on Spark) - Spark 3.5.1 Documentation
SparkR (R on Spark) Overview SparkDataFrame Starting Up: SparkSession Starting Up from RStudio Creating SparkDataFrames From local data frames From Data Sources From Hive tables SparkDataFrame Operations Selecting rows, columns Grouping, Aggregation Operating on Columns Applying User-Defined Function Run a given function on a large dataset using dapply or dapplyCollect dapply dapplyCollect Run a g
Hash 2015/12/26
spark

R

tutorial
Link
AWS News Blog
AWS Week in Review – AWS Documentation Updates, Amazon EventBridge is Faster, and More – May 22, 2023 Here are your AWS updates from the previous 7 days. Last week I was in Turin, Italy for CloudConf, a conference I’ve had the pleasure to participate in for the last 10 years. AWS Hero Anahit Pogosova was also there sharing a few serverless tips in front of a full house. Here’s a picture I […] Amaz
Hash 2015/06/18
There's a simple sample, so I'd like to try it

EMR

spark

MapReduce

later
Link
GitHub - bigdatagenomics/adam: ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
ADAM is a library and command line tool that enables the use of Apache Spark to parallelize genomic data analysis across cluster/cloud computing environments. ADAM uses a set of schemas to describe genomic sequences, reads, variants/genotypes, and features, and can be used with data in legacy genomic file formats such as SAM/BAM/CRAM, BED/GFF3/GTF, and VCF, as well as data stored in the columnar A
Hash 2014/12/09
apache

github

scala

bioinformatics

spark
Link