Beginner must-see! A future that can be opened by learning Hadoop
DataWorks Summit
In an era where artificial intelligence (AI) is reshaping enterprises across the globe, be it in healthcare, finance, or manufacturing, it’s hard to overstate the impact that AI has had on businesses, regardless of industry or size. At Cloudera, we recognize the urgent need for bold steps to harness this potential and dramatically accelerate the time to […]
Guest post by Prasanth Jayachandran, who has been working on implementing CUBE support for Pig, as part of the large-scale distributed cubing effort. Update: As per Dmitriy’s tweet: …the naive implementation is in. The scalable count distinct impl is pending 0.11 branching, will go into 0.12. The next version of Apache Pig will support the CUBE operator (patch available here). The CUBE operator
Computing aggregates over a cube of several dimensions is a common operation in data warehousing. The standard SQL syntax is "GROUP relation BY dim1, dim2, dim3 WITH CUBE" – which in addition to all dim1-2-3, produces aggregations for just dim1, just dim1 and dim2, etc. NULL is generally used to represent "all". A presentation by Arnab Nandi describes how one might implement efficient cubing in Ma
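To make the cube idea concrete, here is a minimal Python sketch of the naive approach: for each input row, emit one grouping key per subset of the dimensions, with `None` standing in for "all" (the role NULL plays in the description above). The function name `cube_counts` and the sample rows are hypothetical and chosen only for illustration; this is not Pig's implementation, nor the efficient algorithm from the presentation.

```python
from collections import defaultdict
from itertools import combinations


def cube_counts(rows, dims):
    """Count rows for every subset of `dims`.

    A dimension that is rolled up is represented by None ("all").
    Assumption: real dimension values are never None, otherwise they
    would collide with the "all" marker.
    """
    totals = defaultdict(int)
    positions = range(len(dims))
    for row in rows:
        # Enumerate all 2^n subsets of dimensions to keep for this row.
        for size in range(len(dims) + 1):
            for kept in combinations(positions, size):
                key = tuple(
                    row[dim] if i in kept else None
                    for i, dim in enumerate(dims)
                )
                totals[key] += 1
    return dict(totals)


# Hypothetical sample data with two dimensions.
rows = [
    {"dim1": "a", "dim2": "x"},
    {"dim1": "a", "dim2": "y"},
    {"dim1": "b", "dim2": "x"},
]
result = cube_counts(rows, ["dim1", "dim2"])
```

With n dimensions each row contributes 2^n keys, which is why a naive cube grows expensive quickly and why more scalable implementations are of interest.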
Hive & Pig Two ways of doing one thing Or One way of doing two things Ashutosh Chauhan Who am I? • Pig Committer & PMC Member • Hive Committer & PMC Member • HCatalog Committer & PPMC Member • ASF Member • Software Engineer at Hortonworks Two ways of doing same thing • Both generate map-reduce jobs from a query written in higher level language. • Both free users from knowing all the little secret
The document discusses a presentation about practical problem solving with Hadoop and Pig. It provides an agenda that covers introductions to Hadoop and Pig, including the Hadoop distributed file system, MapReduce, performance tuning, and examples. It discusses how Hadoop is used at Yahoo, including statistics on usage. It also provides examples of how Hadoop has been used for applications like lo
Riding the wave of the generative AI revolution, third-party large language model (LLM) services like ChatGPT and Bard have swiftly emerged as the talk of the town, converting AI skeptics to evangelists and transforming the way we interact with technology. For proof of this megatrend look no further than the instant success of ChatGPT, […]
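"What is the difference between Pig and Hive?" The other day I came across the question "Difference between Pig and Hive? Why have both?" on Stack Overflow. It is probably something everyone wonders about at least once when working with Hadoop. Pig and Hive are both DSLs that let you write MapReduce in an SQL-like notation, but in terms of number of users Hive appears to have the upper hand. On the other hand, I sometimes hear people say "I wish I had tried Pig sooner," so if it is a (possibly) useful tool, it seems worth understanding it properly. So here we explore how to put Pig to use. Pig's performance: The reason Pig is not more widely used is, in addition to the matter of affinity with SQL, that in terms of performance there is a tendency of "Java MapReduce > Hive > Pig"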