- Apache Spark is an open-source cluster-computing framework for large-scale data processing. Originally developed at the University of California, Berkeley in 2009, it is used for distributed workloads such as data mining, stream processing, and machine learning.
- Spark uses in-memory computing to improve performance: it keeps data in memory across tasks, enabling faster analytics compared to disk-based processing.
![Memory in Apache Spark: memory design techniques to keep applications from crashing](https://cdn-ak-scissors.b.st-hatena.com/image/square/c3fa323b49c3afec7db6f0804d2e9bb72a0956a7/height=288;version=1;width=512/https%3A%2F%2Fcdn.slidesharecdn.com%2Fss_thumbnails%2F20161221sparkmemory-161221132628-thumbnail.jpg%3Fwidth%3D640%26height%3D640%26fit%3Dbounds)