[B! concurrent-computing] [Page 6] nabinno's bookmarks

nabinno id:nabinno

nabinno's bookmarks on concurrent-computing (3,378)

Cloudera
See why 96% of enterprises are expanding the use of AI agents Read the report
nabinno 2019/12/22
cloudera

apache-hadoop

distributed-computing

concurrent-computing
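Spark and YARN - Qiita
This post is about Spark and YARN. Given the topic, much of it will be about infrastructure. The relationship between Spark and Hadoop: Spark does not depend on a Hadoop cluster. (Confusingly, though, it does depend on the HDFS and YARN client libraries.) It can therefore run without Hadoop. Even so, it is often run together with Hadoop, for the following reasons. YARN as the cluster manager: for each application (strictly speaking, each Spark application), Spark builds a cluster like the following. A Driver Program, the application's master component, which holds the SparkContext object and runs the main part of the application code, and a group of Executors, which perform the operations on RDDs. And the Driver Progr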
nabinno 2019/12/19
qiita

apache-spark

mapreduce

distributed-computing

apache-yarn

cluster-manager

concurrent-computing
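Amazon Athena – Interactive SQL Queries for Data in Amazon S3 | Amazon Web Services
Amazon Web Services Blog: Amazon Athena – Interactive SQL Queries for Data in Amazon S3. The amount of data we have to deal with grows every day (I still keep a floppy disk or two so that I can remember when 1.44 MB was a very large amount of storage). Today, many people routinely process and query petabyte-scale collections of structured or semi-structured files. They want to do this quickly, but they do not want to spend a lot of time on preprocessing, scanning, loading, or building indexes. Instead, they want to get going right away: identify the data, run queries that are often ad hoc and exploratory, get the results, and act on them, all within minutes. Amazon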
nabinno 2019/12/19
amazon-athena

presto

structured-query-language

mapreduce

distributed-computing

concurrent-computing
GitHub - treasure-data/trino-client-ruby: Trino/Presto client library for Ruby
require 'trino-client'
# create a client object:
client = Trino::Client.new(
  server: "localhost:8880",  # required option
  ssl: {verify: false},
  catalog: "native",
  schema: "default",
  user: "frsyuki",
  password: "********",
  time_zone: "US/Pacific",
  language: "English",
  properties: {
    "hive.force_local_scheduling": true,
    "raptor.reader_stream_buffer_size": "32MB"
  },
  http_proxy: "proxy.example.com:8080",
nabinno 2019/12/19
github

treasure-data

presto-client

ruby

presto

mapreduce

apache-hadoop

distributed-computing

concurrent-computing
Welcome to Apache Pig!
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets. At the present time, Pig's infrastructure l
nabinno 2019/12/19
apache-pig

mapreduce

apache-hadoop

distributed-computing

concurrent-computing
Apache Pig - Wikipedia
nabinno 2019/12/19
apache-pig

mapreduce

apache-hadoop

distributed-computing

concurrent-computing
Presto (SQL query engine) - Wikipedia
Presto (including PrestoDB, and PrestoSQL which was re-branded to Trino) is a distributed query engine for big data using the SQL query language. Its architecture allows users to query data sources such as Hadoop, Cassandra, Kafka, AWS S3, Alluxio, MySQL, Mongo DB and Teradata,[1] and allows use of multiple data sources within a query. Presto is community-driven open-source software released under
nabinno 2019/12/17
presto

mapreduce

apache-hadoop

distributed-computing

concurrent-computing
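A look at what Apache Spark is (Part 1) - 夢とガラクタの集積場
Hello. I am still in the middle of trying out Kafka, so the timing is awkward, but lately I have been gathering information on Apache Spark to see whether it is usable. Like MapReduce, it is a platform for distributed parallel processing, but there are claims that it is tens of times faster than MapReduce. My first reaction was that this could not be right, but the RDD mechanism it maintains internally is interesting, so I decided to read through the materials and papers for now. The first material I looked at was "Overview of Spark" (http://spark.incubator.apache.org/talks/overview.pdf). So here is a summary of what I read. What is Spark? A fast, interactive, language-integrated cluster computing platform. What are the goals of the Spark project? To extend MapReduce to better fit the following two analytics use cases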
nabinno 2019/12/15
apache-spark

mapreduce

apache-hadoop

distributed-computing

concurrent-computing
Apache Storm
Why use Apache Storm? Apache Storm is a free and open source distributed realtime computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Apache Storm is simple, can be used with any programming language, and is a lot of fun to use! Apache Storm has many use cases: realtime analytics, online m
nabinno 2019/12/15
apache-storm

stream-processing

concurrent-computing
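Let's write stream processing with Apache Storm
Hello. I am T.O., and I work a lot with the Hadoop ecosystem. Since that is what I spend my time on, I will be writing about things I have picked up recently while working with it. This time, the topic is trying out Apache Storm, one of the many stream processing engines, and writing some stream processing with it. What is Apache Storm? In a word, it is a stream processing engine. I previously wrote about Apache Apex on another blog, and Storm belongs broadly to the same category of tools. As usual, a careful read of the official documentation will tell you most of the details (the link is to the 1.1.0 version). A search on Amazon turns up a few books, but I cannot say whether they are any good. For this area, I think reading English-language documentation on the web is the best approach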
nabinno 2019/12/15
apache-storm

stream-processing

concurrent-computing
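O'Reilly Japan - 入門 PySpark
A book for mastering the techniques and know-how of PySpark, the feature for using Spark from Python. It begins by explaining the features and architecture of the faster Spark 2.0, then covers reading structured and unstructured data, the basic data types available in PySpark, and building machine learning models with the MLlib and ML packages. It goes on to teach graph operations with GraphFrames, reading streaming data, and deploying models to the cloud, with plenty of samples. It also shows how to set up a local Spark + Python + Jupyter environment. An essential book for engineers who want to process and make use of large-scale data. Foreword; Translator's Preface; Preface; Chapter 1: Understanding Spark; 1.1 What is Apache Spark; 1.2 Spark jobs and APIs; 1.2.1 Execution process; 1.2.2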
nabinno 2019/12/15
oreilly

tomasz-drabas

pyspark

apache-spark

mapreduce

distributed-computing

concurrent-computing

e-book

python
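Introduction to distributed processing with Apache Spark - Qiita
[Spark shell banner: "Welcome to Spark version 2.0.0"] Apache Spark 2.0.0 was released at the end of July 2016 and I have started trying it out, so please bear with these assorted notes 🙇 Also, the sample code in this article is mainly Java, but I feel it comes out cleaner in Scala or Python. I will keep editing this as I go. What is Apache Spark? The image above is from https://spark.apache.org. In some cases it is reportedly 100 times faster than Hadoop's MapReduce, which is impressive. Spark is an open-source framework for fast distributed processing of huge data sets. (Java Magazin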
nabinno 2019/12/15
qiita

apache-spark

mapreduce

distributed-computing

concurrent-computing

guide
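Apache Hive - Wikipedia
Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides data summarization, query, and analysis[1]. Apache Hive was initially developed by Facebook, but other organizations such as Netflix later took part in its development and became users[2][3]. Hive is also included in Amazon Elastic MapReduce on Amazon Web Services[4]. Apache Hive analyzes large datasets stored in Hadoop-compatible file systems (for example, Amazon S3). It is used through an SQL-like language called HiveQL, with full support for map/reduce. To speed up queries, it also implements indexing, including bitmap indexes[5]. By default, Hive stores metadata in an embedded Apach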
nabinno 2019/12/15
apache-hive

mapreduce

structured-query-language

apache-hadoop

distributed-computing

concurrent-computing
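Getting started with distributed processing (Hadoop + Spark) | Casley Deep Innovations Tech Blog
Hello. This is 腰塚 from the SI department. Having mostly worked on RDBs and data warehouses, I became interested in the distributed processing frameworks for big data analysis and machine learning that I kept hearing about over the past several years, but I never got around to trying them until now. Taking this blog post as an opportunity, I will feel my way in from scratch and cover everything from the basics of distributed processing, the kind it feels too late to ask others about, through to trying out Hadoop and Spark. 1. Basics of distributed processing. 1-1. The processing model of distributed processing: MapReduce. First, distributed processing means processing a single computation simultaneously and in parallel on multiple computers connected by a network. As the market for big data grows by the day, processing hundreds of terabytes to petabytes of data is no longer unusual, and systems that routinely handle data at this scale need techniques to process it at a realistic cost in time and money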
nabinno 2019/12/15
apache-hadoop

apache-spark

mapreduce

distributed-computing

concurrent-computing
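The MapReduce processing model named in the entry above can be sketched on a single machine. The following is a minimal illustration, not code from the linked article: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a thread pool stands in for the networked computers that would each process one split of the data.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group the emitted values by key across all chunks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Assumption: each chunk stands in for a data split that a real
    # cluster would hand to a separate machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["spark hadoop spark", "hadoop yarn", "spark"]))
```

The map step needs no coordination between chunks, which is what lets a real framework spread it across many machines; only the shuffle step has to bring values for the same key together.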
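From Test-Driven Development to Proof-Driven Development #JTF2019 / July Tech Festa 2019
Slides used at July Tech Festa 2019. In recent years the culture of writing tests has spread widely, and building automated tests into the development flow is now taken for granted. On closer thought, however, all that a finite number of test cases guarantees is "the outputs for a specific, finite set of inputs."…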
nabinno 2019/12/08
speaker-deck

coq

formal-methods

software-testing

concurrent-computing
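The point in the entry above, that a finite test suite only vouches for the inputs it contains, can be shown with a small sketch. This is not taken from the slides: `is_leap_naive`, `run_tests`, and `find_counterexample` are hypothetical names, and the standard library's `calendar.isleap` is assumed as the reference behaviour.

```python
import calendar


def is_leap_naive(year: int) -> bool:
    """Hypothetical buggy implementation: ignores the century rules."""
    return year % 4 == 0


def run_tests(fn) -> bool:
    """A small, plausible-looking test suite with four hand-picked inputs."""
    cases = {2019: False, 2020: True, 2000: True, 2023: False}
    return all(fn(year) == expected for year, expected in cases.items())


def find_counterexample(fn, first: int, last: int):
    """Compare fn against calendar.isleap for every year in [first, last]."""
    for year in range(first, last + 1):
        if fn(year) != calendar.isleap(year):
            return year
    return None


if __name__ == "__main__":
    print("test suite passes:", run_tests(is_leap_naive))
    print("first counterexample:", find_counterexample(is_leap_naive, 1801, 2100))
```

The suite passes even though the function is wrong for 1900. The exhaustive comparison finds that year, but it is still only a check over a bounded range; a proof would have to cover every input.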
Leslie Lamport - Wikipedia
Leslie B. Lamport (born February 7, 1941) is an American computer scientist and mathematician. Lamport is best known for his seminal work in distributed systems, and as the initial developer of the document preparation system LaTeX and the author of its first manual.[2] Lamport was the winner of the 2013 Turing Award[3] for imposing clear, well-defined coherence on the seemingly chaotic behavior o
nabinno 2019/11/28
leslie-lamport

people

latex

concurrent-computing
PRINCIPIA Limited
Seminars: We hold seminars on formal methods. For details, please see the seminar page. SyncStitch: A Model Checker based on the Process Algebra CSP SyncStitch is a model checker based on the process algebra CSP (Communicating Sequential Processes). By using SyncStitch, you can check six types of properties of the system you are developing: Deadlocks Divergences (also known as livelocks) Refinement relation on traces semantics (sa
nabinno 2019/11/22
principia

concurrent-computing

company
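What does a "Well-Architected" architecture require? Startup Architecture of the year 2019, selected by active CTOs [AWS Summit Tokyo]
On June 13, the startup pitch contest "Startup Architecture of the year" was held at AWS Summit Tokyo. Now in its second year, the contest was open to startups founded within the past three years. It focuses on each company's business vision and the system architecture that supports it, and selects a Well-Architected architecture from a range of perspectives such as scalability, performance, and cost efficiency. Seven companies that had advanced through screening from the open call took the stage and competed fiercely. Who took the grand prize? Excellent architectures gathered again this year at "Startup Architecture of the year": the "Startup Architecture of the year pitch contest" was held for the first time last year and was well received. This year again, the finalists each gave a 5-minute pi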
nabinno 2019/11/19
codezine

amazon-web-services

amazon-well-architected

concurrent-computing
A compendium of stability patterns (and their implementation) - Qiita
Nygard of Cognitect has revised Release It! for the first time in ten years, and Release It! 2nd Edition will be released soon. The content appears to be fully written as of the current beta 5, so do give it a read. https://pragprog.com/book/mnee2/release-it-second-edition From it, I would like to give an overview of the stability patterns in Chapter 4 and show implementation examples using the Failsafe library for Java. Stability Patterns: distributed systems and bloc
nabinno 2019/10/19
qiita

stability-pattern

software-design-pattern

concurrent-computing

failsafe

java
Envoy proxy - home
Envoy is an open source edge and service proxy, designed for cloud-native applications. As on-the-ground microservice practitioners quickly realize, the majority of operational problems that arise when moving to a distributed architecture are ultimately grounded in two areas: networking and observability. It is simply an orders of magnitude larger problem to network and debug a set of intertwined d
nabinno 2019/10/18
envoy

sidecar-pattern

microservice

software-design-pattern

concurrent-computing

load-balancing

kubernetes