Home About Archives Search Feed

What it looks like to process 3.5 million books in Google’s cloud

🔒 googlecloudplatform.blogspot.com

This is where Google’s cloud offerings shine, seemingly purpose-built for data-first computing. In just two weeks, I was able to process 3.5 million books, spinning up a cluster of 160 cores and 1TB of RAM, followed by a single machine with 32 cores, 200GB of RAM, 10TB of SSD disk and 1TB of direct-attached scratch SSD disk. I was able to make the final results publicly accessible through BigQuery at query speeds of over 45.5GB/s.

Posted on February 17, 2016

← Next post    ·    Previous post →