yass's bookmarks / October 2, 2013 - Hatena Bookmark

yass id:yass

Bookmarks for October 2, 2013 (6 items)

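Integrating Hive with HBase is hard - wyukawa's diary
Hello, this is wyukawa. After upgrading to Hive 0.11.0 I no longer have to worry about the multiple-insert bug, [HIVE-3699] Multiple insert overwrite into multiple tables query stores same results in all tables - ASF JIRA, but in exchange I ran into the nested group-by bug, [HIVE-5237] Incorrect group-by aggregation in 0.11.0 - ASF JIRA. Nested group by can come up when computing things like unique users, but I worked around it with count(distinct ...) and the like. That is not what this post is about, though; it is about integrating Hive with HBase. To state the conclusion first, it is quite
yass 2013/10/02
"What I am writing about this time is not that, but the integration of Hive and HBase. To state the conclusion first, it is quite hard. At least it is hard for me, and the reality is that operations only just keep running because I have Hive and HBase experts around me."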

hive

hbase
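I tried the Amazon EC2 instance gacha - 元RX-7乗りの適当な日々
I think any long-running cloud service tends to end up this way, but differences in the architecture and generation of the physical CPUs in the host servers lead to subtle differences in the CPU performance of server instances. Amazon EC2, which has been in service since 2006, is fairly well known to show this tendency too: server instances that you lined up assuming identical performance end up consuming different amounts of CPU resources for the same workload. con_mame has also written about this in the following entry: "Same ECU on EC2, but different CPUs" - まめ畑. In the past I often used m1.small instances in us-east, and back then they always had AMD Opteron processors, but lately they are almost all Intel Xeon. So, as of now (2013/10), the EC2 instances use
yass 2013/10/02
"I was able to confirm that the CPUs in use are converging fairly well on the Sandy Bridge generation, but my impression is that older-generation CPUs still turn up now and then (especially on 2.5ECU/1vCPU instances)."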

aws

ec2

CPU
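A survey like the EC2 one above boils down to recording which physical CPU each launched instance reports and counting the results. The excerpt does not say how the author collected this, so the following is only a minimal sketch, assuming Linux guests that expose `/proc/cpuinfo`. The helper names and the sample text are hypothetical illustrations, not measured data.

```python
from collections import Counter


def cpu_model_names(cpuinfo_text):
    """Return the 'model name' value of every logical CPU in /proc/cpuinfo text."""
    names = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            names.append(value.strip())
    return names


def tally_models(cpuinfo_texts):
    """Count instances per CPU model, using the first logical CPU of each instance."""
    counts = Counter()
    for text in cpuinfo_texts:
        names = cpu_model_names(text)
        if names:
            counts[names[0]] += 1
    return counts


# Hypothetical sample of what one instance might report (illustrative only).
SAMPLE_CPUINFO = """processor\t: 0
model name\t: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
processor\t: 1
model name\t: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
"""

if __name__ == "__main__":
    # In a real survey, each element would be the cpuinfo text of one instance.
    for model, count in tally_models([SAMPLE_CPUINFO, SAMPLE_CPUINFO]).items():
        print(count, model)
```

On a running instance, the input text could come from `open("/proc/cpuinfo").read()`.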
Feeding Eijiro data into word2vec - naoya_t@hatenablog
Feeding Eijiro into word2vec turned out to be somewhat interesting, so here are some notes. For word2vec itself, see my previous post. I used EIJI-138.TXT (one version older than the latest), which can be bought from EDP for around 1,980 yen. ■semantically-motivated {形} : 意味論的｛いみろんてき｝に動機付けられた ■semantically-restricted {形} : 意味的｛いみてき｝に制限｛せいげん｝された ■semantics {名-1} : 意味論｛いみろん｝、記号論｛きごうろん｝ ■semantics {名-2} : 《コ》〔プログラムの〕動作 ■semantics : 【＠】セマンティックス、【分節】se・man・tics ■semantics course : 意味論｛いみろん｝のコース ■semaphore {名-1} : 手旗信号｛てばたしん
yass 2013/10/02
word2vec

nlp
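The EIJI-138.TXT lines quoted in the entry above follow a visible pattern: `■`, a headword, an optional `{part-of-speech}` tag, ` : `, and a definition with readings in full-width braces. Before text like this can be handed to word2vec it has to be split into fields somehow. The sketch below is one possible way to do that, inferred only from the quoted lines; it is not the author's actual preprocessing, and `parse_entry` is a hypothetical helper.

```python
import re

# Pattern inferred from the quoted sample lines only (an assumption, not a spec).
ENTRY_RE = re.compile(
    r"^■(?P<head>.+?)(?:\s+\{(?P<pos>[^}]*)\})?\s+:\s+(?P<body>.*)$"
)
# Readings appear in full-width braces, e.g. 意味論｛いみろん｝.
RUBY_RE = re.compile(r"｛[^｝]*｝")


def parse_entry(line):
    """Split one dictionary line into headword, part of speech, and definition.

    Returns None when the line does not look like an entry.
    """
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return {
        "head": match.group("head"),
        "pos": match.group("pos"),  # None when the entry has no {...} tag
        "body": RUBY_RE.sub("", match.group("body")),  # drop the readings
    }


if __name__ == "__main__":
    print(parse_entry("■semantics {名-1} : 意味論｛いみろん｝、記号論｛きごうろん｝"))
    print(parse_entry("■semantics course : 意味論｛いみろん｝のコース"))
```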
OpenLogic Login
Open Source Support: Resolve open source issues ranging from package selection and setup to integration and production problems with expert, commercial-grade technical support. Open Source Auditing: Inventory how open source is being used within your organization through detailed analysis and scanning that delivers comprehensive, actionable reports.
yass 2013/10/02
"smem / This command reports physical memory usage, taking shared memory pages into account. In its output, unshared memory is reported as the unique set size (USS). / The USS plus a process's proportion of shared memory is reported as the proportional set size (PSS)."

linux

memory

smem
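The smem comment above defines PSS as the USS plus the process's proportional share of its shared memory. A small worked example may make that arithmetic concrete. This is a sketch of the definition only, not of how smem itself is implemented; the function name and the figures are hypothetical.

```python
def proportional_set_size(uss_kb, shared_mappings):
    """Compute PSS in kB from the USS and a list of shared regions.

    shared_mappings is a list of (size_kb, sharers) pairs, where sharers is the
    number of processes mapping that region, including this process.
    """
    pss = float(uss_kb)
    for size_kb, sharers in shared_mappings:
        if sharers < 1:
            raise ValueError("a shared region needs at least one process")
        # Each process is charged an equal share of the region.
        pss += size_kb / sharers
    return pss


# Hypothetical figures: 10240 kB unshared, plus 6144 kB shared among 3 processes.
example_pss = proportional_set_size(10240, [(6144, 3)])

if __name__ == "__main__":
    print(example_pss)  # 10240 + 6144 / 3 = 12288.0
```

Because each shared region is split among its users, summing PSS over all processes does not count shared pages more than once.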
Can deep-learning learn latin conjugation? - naoya_t@hatenablog
More Latin material, but word2vec, which Kudo-san introduced on Google+, looked interesting, so I played with https://code.google.com/p/word2vec/ a little. It learns vector representations of words using so-called deep learning. The interesting part is that the difference between two vectors approximates the relationship between the two words well. It was recently shown that the word vectors capture many linguistic regularities, for example vector operations vector('Paris') - vector('France') + vector('Italy') results in a vector that is very close to vector('Rome'), and
yass 2013/10/02
word2vec

nlp
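The vector('Paris') - vector('France') + vector('Italy') operation quoted above can be illustrated without a trained model. The sketch below uses hand-made 4-dimensional toy vectors, which are invented purely so the arithmetic is easy to follow and are not real word2vec output. The `analogy` helper is hypothetical and simply picks the remaining word with the highest cosine similarity to the combined vector.

```python
import math

# Toy vectors, invented for illustration: [is_capital, france, italy, germany].
TOY_VECTORS = {
    "Paris": [1.0, 1.0, 0.0, 0.0],
    "France": [0.0, 1.0, 0.0, 0.0],
    "Italy": [0.0, 0.0, 1.0, 0.0],
    "Rome": [1.0, 0.0, 1.0, 0.0],
    "Germany": [0.0, 0.0, 0.0, 1.0],
    "Berlin": [1.0, 0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(a, b, c, vectors):
    """Return the word closest to vector(a) - vector(b) + vector(c).

    The three query words themselves are excluded from the candidates.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(target, vectors[word]))


if __name__ == "__main__":
    print(analogy("Paris", "France", "Italy", TOY_VECTORS))
```

With real word2vec vectors the match is only approximate, which is why the result is chosen by nearest cosine similarity rather than by exact equality.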
word2vec in yhat: Word vector similarity | Daniel Rodriguez
A few weeks ago Google released some code to convert words to vectors called word2vec. The company I currently work for does something similar, and I was quite amazed by the performance and accuracy of Google's algorithm, so I created a simple Python wrapper to call the C code for training and read the training vectors into numpy arrays; you can check it out on PyPI (word2vec). At the same time
yass 2013/10/02
word2vec

nlp