- Apache Spark is an open-source cluster computing framework for large-scale data processing. It was originally developed at the University of California, Berkeley in 2009 and is used for distributed tasks like data mining, streaming and machine learning. - Spark utilizes in-memory computing to optimize performance. It keeps data in memory across tasks to allow for faster analytics compared to dis
![HDFSネームノードのHAについて #hcj13w](https://cdn-ak-scissors.b.st-hatena.com/image/square/275e10fec573d87ac375dca613a7798815d4744e/height=288;version=1;width=512/https%3A%2F%2Fcdn.slidesharecdn.com%2Fss_thumbnails%2Fhcj2013winterhdfsha20130121lt-130121213737-phpapp01-thumbnail.jpg%3Fwidth%3D640%26height%3D640%26fit%3Dbounds)