There has certainly been an increase this year in both data breaches caused by data exposed in AWS S3 buckets and reports from researchers who have uncovered sensitive data exposed in AWS S3.

Data in AWS S3 is stored in "buckets", which act as logical units of storage. If the buckets are configured to be private, then the access control built into AWS S3, coupled with infrastructure-level security, will likely frustrate, and hopefully stop, the majority of attackers.

The challenge is not dissimilar to that of any other technology: if the technology is not fully understood, the people responsible for configuring, administering and securing AWS may make one simple configuration mistake that exposes their sensitive business or customer records to the public.

This can be done simply by marking a bucket as “public” within AWS S3. A bucket configured in this way opens the floodgates for both hackers and researchers, who can use automation tools to fuzz URLs and apply specific Google Dorking techniques to locate exposed details in publicly accessible S3 buckets.
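As a rough illustration of what a "public" bucket looks like from the defender's side, the following minimal Python sketch inspects a bucket ACL document and reports any permissions granted to S3's predefined public groups. It assumes the ACL is a dictionary shaped like the response of the S3 GetBucketAcl call (for example, what boto3's `get_bucket_acl` returns). The function name and the two sample ACLs are hypothetical, and the sketch does not look at bucket policies, which can also expose a bucket.

```python
# Predefined S3 group URIs: "AllUsers" means anyone on the internet,
# "AuthenticatedUsers" means anyone with any AWS account.
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_permissions(acl):
    """Return the sorted permissions an ACL grants to public groups.

    `acl` is assumed to look like a GetBucketAcl response, i.e. a dict
    with a "Grants" list of {"Grantee": {...}, "Permission": "..."}.
    """
    found = set()
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUP_URIS:
            found.add(grant.get("Permission", "UNKNOWN"))
    return sorted(found)


# Hypothetical sample ACLs for illustration only.
private_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
    ]
}
exposed_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ]
}

print(public_permissions(private_acl))  # no public grants
print(public_permissions(exposed_acl))  # READ is granted to everyone
```

In practice a check like this would be run across every bucket in an account on a schedule, so that a bucket accidentally marked public is noticed by its owner before it is found by someone else.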

In June 2017 Deep Root Analytics reported that 198 million voter records were exposed to the public. In July 2017 Verizon reported that 14 million subscriber records were exposed, and the WWE (World Wrestling Entertainment) also reported that more than three million records were exposed in the same month, all through misconfigured, publicly accessible buckets stored on AWS. The recent report of four million Time Warner records being exposed is further confirmation that this problem is still prominent.

What this really highlights is that, while we know and understand the many benefits of using proven cloud services like AWS S3, logging of authentication and data manipulation activities should not be overlooked.

As with any system connected to the business, having a baseline of normal day-to-day activity will aid in identifying unusual activity such as large file transfers leaving the cloud, numerous access denied messages during scanning and enumeration, or even destruction caused within the cloud itself through deliberate configuration changes once a foothold has been gained.
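One of the signals mentioned above, a burst of access denied messages from a single source, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the events are taken to be dictionaries carrying `sourceIPAddress` and `errorCode` fields in the style of AWS CloudTrail records, S3 activity is assumed to be logged in the first place, and the function name, threshold and sample data are all hypothetical. A real baseline would be derived from the organisation's own normal traffic rather than from a fixed number.

```python
from collections import Counter


def noisy_sources(events, threshold):
    """Return {source_ip: denied_count} for sources whose number of
    AccessDenied errors exceeds `threshold`.

    `events` is assumed to be an iterable of dicts with CloudTrail-style
    "sourceIPAddress" and (optionally) "errorCode" keys.
    """
    denied = Counter(
        event.get("sourceIPAddress", "unknown")
        for event in events
        if event.get("errorCode") == "AccessDenied"
    )
    return {ip: count for ip, count in denied.items() if count > threshold}


# Hypothetical sample events; the IPs come from documentation-only ranges.
sample_events = (
    [{"eventName": "GetObject", "sourceIPAddress": "203.0.113.7",
      "errorCode": "AccessDenied"}] * 5
    + [{"eventName": "GetObject", "sourceIPAddress": "198.51.100.4",
        "errorCode": "AccessDenied"}]
    + [{"eventName": "GetObject", "sourceIPAddress": "198.51.100.4"}] * 2
)

# With an assumed baseline of at most 3 denials per source, only the
# first address stands out.
print(noisy_sources(sample_events, threshold=3))
```

The same counting approach could be extended to the other signals, such as unusually large outbound transfers or unexpected configuration changes, once a baseline for each has been established.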