Imagine you have a 4.2GB CSV file. It has over 12 million records and 50 columns. All you need from this file is the sum of all values in one particular column. How would you do it? Writing a script in python/ruby/perl/whatever would probably take a few minutes and then even more time for the script to actually complete. A database and SQL would be fairly quick, but then you'd have load the data,