[B! availability] dann's bookmarks

dann id:dann

dann's bookmarks on availability (15)

http://linux-ha.osdn.jp/wp/wp-content/uploads/whats_ha.pdf
dann 2019/10/02
availability
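Aim High! Choosing Hardware for a High-Availability Server That Does Not Go Down - Qiita
These are observations on building servers that do not go down, from an infrastructure engineer who has worked at financial institutions for more than 10 years. I am not a hardware specialist, so some details may be inaccurate; this is a personal view based on my experience so far. As infrastructure engineers who favor on-premises deployments, we try to deploy the server best suited to each system: for example, introducing high-availability servers of a kind that cloud services cannot provide, or prioritizing cost for servers in multi-node configurations where the failure of one machine causes no problem. Why pursue high-availability servers ■ To avoid affecting applications: Even in an Active/Standby configuration where downtime is only a few seconds from the infrastructure's point of view, some applications take time to recover, and confirming that nothing is wrong also takes time. Moreover, even if the application would be fine when a server goes down cleanly, a server sometimes stays in a half-failed state and things simply behave strangely.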
dann 2019/10/02
availability
Riak: What Mechanisms Make True High Availability Possible?
The document contains SQL statements and execution plans for counting records in a table where the ID is between 1 and 10 and the status is either '00' or '01'. It shows that for a status of '00' there are 10000 records, but for a status of '01' there are 0 records. Execution plans and statistics are provided with each statement to analyze the performance and resource usage.
dann 2014/07/04
riak

availability
How Google Backs Up the Internet Along With Exabytes of Other Data - High Scalability -
Raymond Blum leads a team of Site Reliability Engineers charged with keeping Google's data secret and keeping it safe. Of course Google would never say how much data this actually is, but from comments it seems that it is not yet a yottabyte, but is many exabytes in size. GMail alone is approaching low exabytes of data. Mr. Blum, in the video How Google Backs Up the Internet, explained common back
dann 2014/02/12
backup

availability

google
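Verifying Redundancy and Load Balancing with a Gossip Protocol
A talk on fault injection in distributed systems. Presentation slides used at NTT DATA Technology Conference 2017 https://oss.nttdata.com/hadoop/event/201710/index.html A summary of basic knowledge about container technology, which I recently started studying. [Corrections and notes] p.27-30: "Version: 1" inside "Deployment" => "Version: 2" p.37: 「終了コードをから」 => 「終了コードから」 ("from the exit code") p.39: "HTTPS is not available" => On AWS, load balancers that terminate SSL are supported. https://kubernetes.io/docs/concepts/services-networking/service/#ssl-support-on-aws p.40: "the user ... ingress con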
dann 2013/10/28
availability
GitHub availability this week
dann 2012/09/20
github

availability

mysql
Summary of the AWS Service Event in the US East Region
July 2, 2012 We’d like to share more about the service disruption which occurred last Friday night, June 29th, in one of our Availability Zones in the US East-1 Region. The event was triggered during a large scale electrical storm which swept through the Northern Virginia area. We regret the problems experienced by customers affected by the disruption and, in addition to giving more detail, also w
dann 2012/07/04
sysadmin

availability
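Who Started Calling It "Error-Oblivious Computing"?! - Plan9日記
More precisely, the question is who translated failure-oblivious computing as "error"-oblivious computing (エラー忘却型コンピューティング). In the course of translation, "failure" was swapped for "error". People in computing like to say "names matter!", yet quite a few are careless about the definitions and usage of the terms fault (障害), error (異常), and failure (故障). This got more of a response on Twitter than I expected, so I am writing it up (partly as a reminder to myself). Failure-oblivious computing is a technique proposed at OSDI 2004 by Martin Rinard et al. of MIT in the paper "Enhancing Server Availability and Security Through Failure-Oblivious Computing". Even when an invalid pointer reference occurs in a language like C, it detects the access and pretends it never happened (some suitable value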
dann 2012/04/22
availability

error
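The behavior described in the failure-oblivious computing entry above can be illustrated with a small sketch. This is not the paper's implementation, which works at the C compiler level; it is a toy Python buffer, and the class name `FailureObliviousBuffer` and the 0, 1, 2 cycle of manufactured values are assumptions made only for illustration. Invalid writes are discarded and invalid reads return a manufactured value instead of raising an error.

```python
class FailureObliviousBuffer:
    """Toy fixed-size buffer that ignores invalid accesses instead of failing.

    Hypothetical illustration only: out-of-bounds writes are dropped and
    out-of-bounds reads return a manufactured value.
    """

    def __init__(self, size):
        self._data = [0] * size
        self._next_value = 0  # next manufactured value for an invalid read

    def _in_bounds(self, index):
        return 0 <= index < len(self._data)

    def write(self, index, value):
        if self._in_bounds(index):
            self._data[index] = value
        # Invalid write: silently discarded, execution continues.

    def read(self, index):
        if self._in_bounds(index):
            return self._data[index]
        # Invalid read: return a manufactured value (assumed cycle 0, 1, 2).
        value = self._next_value
        self._next_value = (self._next_value + 1) % 3
        return value


if __name__ == "__main__":
    buf = FailureObliviousBuffer(4)
    buf.write(2, 7)     # valid write
    buf.write(10, 99)   # invalid write, ignored
    print(buf.read(2))  # valid read
    print(buf.read(10)) # invalid read, manufactured value
```

The trade-off is the one the entry hints at: the program keeps running, but the failure is hidden rather than reported.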
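A Certain Distributed Data Store, Explained in Pictures
1. Distributed cache, mumble mumble. Python Developers Festa 2012.03. A Certain Distributed Data Store Cache, Explained in Pictures. 2012/03/17 Takahiko Sato 1 2. Self-introduction: who are you? • Real name – Takahiko Sato (佐藤貴彦) • I work at Oracle Japan. – I am mainly in charge of database and middleware products. – I am also a jack of all trades. – Databases, networks, OS, and other infrastructure • Oracle's internal policy includes a clause saying "Don't talk about Oracle anonymously!", so just this once I am presenting under my real name. 2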
dann 2012/03/18
availability
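Oracle | Cloud Applications and Cloud Platform
Join us at CloudWorld Tour, Oracle's technology and networking event, to be held in Tokyo on February 13, 2025. Connect with Oracle executives and experts, partners, and industry peers, and exchange views on cloud, AI, and more.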
dann 2011/11/19
oracle

availability

asm
Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things - High Scalability -
When building a system on top of a set of wildly uncooperative and unruly computers you have knowledge problems: knowing when other nodes are dead; knowing when nodes become alive; getting information about other nodes so you can make local decisions, like knowing which node should handle a request based on a scheme for assigning nodes to a certain range of users; learning about new configuration
dann 2011/11/15
availability

gossip
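As a rough illustration of the "knowing when other nodes are dead" problem in the entry above, here is a minimal sketch of gossip-style heartbeat failure detection. It is not taken from the linked article. The names `Node`, `simulate`, `t_fail`, and the four node names are hypothetical, and the network is simulated in a single process with a seeded random generator. Each round, every live node increments its own heartbeat and sends its table to one random peer; a node is suspected once its heartbeat has not advanced for more than `t_fail` rounds.

```python
import random


class Node:
    def __init__(self, name, peers):
        self.name = name
        self.peers = peers  # names of the other nodes
        self.alive = True
        # name -> (heartbeat counter, local round when it last increased)
        self.table = {name: (0, 0)}

    def tick(self, now):
        """Increment this node's own heartbeat."""
        heartbeat, _ = self.table[self.name]
        self.table[self.name] = (heartbeat + 1, now)

    def merge(self, remote_table, now):
        """Keep the highest heartbeat seen for every node."""
        for name, (heartbeat, _) in remote_table.items():
            local_heartbeat = self.table.get(name, (-1, now))[0]
            if heartbeat > local_heartbeat:
                self.table[name] = (heartbeat, now)

    def suspected(self, now, t_fail):
        """Nodes whose heartbeat has not advanced for more than t_fail rounds."""
        return sorted(
            name
            for name, (_, seen) in self.table.items()
            if name != self.name and now - seen > t_fail
        )


def simulate(rounds, crash_round, t_fail, seed=0):
    """Run the gossip rounds; node "d" crashes at crash_round."""
    rng = random.Random(seed)
    names = ["a", "b", "c", "d"]
    nodes = {n: Node(n, [p for p in names if p != n]) for n in names}
    for now in range(1, rounds + 1):
        if now == crash_round:
            nodes["d"].alive = False
        for node in nodes.values():
            if not node.alive:
                continue
            node.tick(now)
            peer = nodes[rng.choice(node.peers)]
            if peer.alive:  # messages to a crashed node are lost
                peer.merge(node.table, now)
    return {
        n: node.suspected(rounds, t_fail)
        for n, node in nodes.items()
        if node.alive
    }


if __name__ == "__main__":
    print(simulate(rounds=100, crash_round=10, t_fail=30))
```

Choosing `t_fail` is the main design decision: a small value detects crashes quickly but risks suspecting live nodes whose gossip simply has not arrived yet.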
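■ Preface: LPI-Japan, a specified nonprofit corporation, has developed "Standard Textbook for Building Highly Reliable Systems: Virtualization and High Availab
■ Preface: LPI-Japan, a specified nonprofit corporation, has developed "Standard Textbook for Building Highly Reliable Systems: Virtualization and High Availability" (高信頼システム構築標準教科書 ― 仮想化と高可用性 ―), teaching material intended for use in Linux/OSS engineer education, has published it on the Web (URL: http://lpi.or.jp/linuxtext/ha.shtml), and is providing it free of charge. This textbook was developed in response to requests from many companies, including major IT vendors, for "a practical guidebook for building highly reliable systems with Linux/OSS". As the use of cloud services and private clouds expands, the need for Linux/OSS in mission-critical systems, starting with cloud infrastructure, keeps growing. In particular, cloud infrastructure construction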
dann 2011/06/09
availability

sysadmin
HDFS block replica placement in your hands now!
dann 2010/11/18
hdfs

replica

availability
PowerPoint Presentation
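2009/2/24 1 Scalability and Availability. Fujio Maruyama, Waseda University. Introduction: The most distinctive feature of cloud technology is the scale-out strategy of lining up many inexpensive servers to expand processing capacity. This means that in a scale-out system composed of many machines, errors in the constituent machines are probabilistically unavoidable. This is a serious problem for system availability. Introduction: The talk starts from the observation that scalability and availability conflict in distributed systems, and looks at how current cloud systems try to reconcile scalability and availability. Introduction: The availability of the cloud is, basically, machine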
dann 2010/11/10
availability
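The claim in the entry above, that machine errors become probabilistically unavoidable as a scale-out system grows, can be checked with a short calculation. This sketch assumes failures are independent and uses a hypothetical per-machine failure probability of 3% over some period; neither assumption comes from the slides. The function name `prob_any_failure` is also made up for illustration.

```python
def prob_any_failure(p_single, n_machines):
    """Probability that at least one of n independent machines fails,
    given that each fails with probability p_single in the period."""
    return 1.0 - (1.0 - p_single) ** n_machines


if __name__ == "__main__":
    # Hypothetical 3% per-machine failure probability.
    for n in (1, 10, 100, 1000):
        print(n, prob_any_failure(0.03, n))
```

With these assumed numbers, the chance of at least one failure is already above 95% at 100 machines, which is why availability has to be designed around failures rather than their absence.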
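HA Configuration, Recovery Time, and Reliability - kazuhoのメモ置き場
With two servers forming an HA pair and one of them down, I wondered within how many hours the pair needs to be restored to a two-node configuration, so I did a quick calculation. Assuming that failures occur independently on each node, $ perl -le 'print exp(log($ARGV[0])/(365*24))**$ARGV[1]' 0.97 1 prints 0.999996522927565, so when the server failure rate is 3%/year and recovery takes one hour, the chance that the surviving node also fails during recovery and the service stops is 0.001% or less. $ perl -le 'print exp(log($ARGV[0])/(365*24))**$ARGV[1]' 0.97 24 prints 0.999916553598325, so if it takes 24 hours to get back to a two-node configuration, the chance is about 0.01%. $ perl -le 'print exp(log($ARGV[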
dann 2009/10/13
availability

system

ha
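The Perl one-liner quoted in the entry above can be restated in Python to make each step explicit. This is a sketch of the same arithmetic, under the entry's own assumption of independent node failures; the function names `survival_probability` and `outage_risk` are hypothetical. The first argument is the probability that a node survives one year (0.97 for a 3%/year failure rate) and the second is the recovery time in hours.

```python
import math

HOURS_PER_YEAR = 365 * 24


def survival_probability(annual_survival, recovery_hours):
    """Probability that the remaining node survives the recovery window.

    The annual survival probability is converted to an hourly one,
    then raised to the number of hours the recovery takes.
    """
    hourly_survival = math.exp(math.log(annual_survival) / HOURS_PER_YEAR)
    return hourly_survival ** recovery_hours


def outage_risk(annual_survival, recovery_hours):
    """Probability that the remaining node also fails during recovery."""
    return 1.0 - survival_probability(annual_survival, recovery_hours)


if __name__ == "__main__":
    for hours in (1, 24):
        print(hours, survival_probability(0.97, hours), outage_risk(0.97, hours))
```

For the two cases quoted in the entry, the outage risk works out to roughly 0.0003% for a one-hour recovery and roughly 0.008% for a 24-hour recovery, consistent with the "0.001% or less" and "about 0.01%" figures.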