For 45 minutes on Monday morning, a range of Google services were inaccessible across Europe and North America. Google Search, Gmail, and several Drive programs were all down, and Google’s physical devices also reported critical errors during the outage. Initial reports blamed an error in the service’s authentication system, but a new report from the company shows that the problem was more widespread than initially thought.
Google revealed that the root issue was a flaw in the company’s storage management system. The issues cascaded from there: the authentication system’s capacity was limited, which meant that the entire identity-management system was broken. All users of Google Cloud Platform and Google Workspace at the time of the outage were affected.
So what lessons does this outage teach?
Big Tech Companies Aren’t Infallible
This is the third major failure in as many months, following the five-hour Amazon Web Services disruption in November and Microsoft Azure’s outage in October. It can be tempting to trust a company blindly when it has a track record of reliability and success, but track records won’t keep you afloat if a failure occurs.
Diversify and Monitor
If all your tools for support, monitoring, servicing, collaborating, etc. are on the same platform, you’ll be wiped out by that platform’s errors. While it can be tempting to unify your systems for simplicity’s sake, your monitoring tools should always be separate so that you can be notified in case of an outage. End-to-end visibility is the goal.
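As a rough illustration of separate monitoring, here is a minimal Python sketch of a health check meant to run on a machine outside the platform it watches. The function names, the timeout, and the idea of passing in your own list of URLs are assumptions made for this example, not part of any vendor’s tooling.

```python
import urllib.error
import urllib.request


def check_endpoint(url, timeout=5.0, opener=urllib.request.urlopen):
    """Probe one URL and return (is_up, detail).

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    check can be exercised without real network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
            # Treat any 2xx or 3xx answer as "the service responded".
            return (200 <= status < 400, f"HTTP {status}")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx or 5xx).
        return (False, f"HTTP {exc.code}")
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        return (False, f"unreachable: {exc}")


def find_outages(results):
    """Return the names of services whose check reported a failure."""
    return [name for name, (is_up, _detail) in results.items() if not is_up]
```

To use it, call check_endpoint on each service URL you depend on, collect the results in a dictionary keyed by service name, and send the output of find_outages through an alert channel that does not rely on the platform being checked.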
Backups Are Your Friend!
Having independent access to your data is crucial when your cloud host fails. Backups create overlapping coverage so that no single failure impacts your company. Beyond access, backups also remove any worries about losing data that’s stored remotely.
In short, these failures should keep us from becoming complacent. Security isn’t just about preventing attacks; it’s about preventing all disruptions in service. Take care of your technology, be aware of what these outages can do to your business, and take steps to prevent failure before it happens.
If you need more information on preventing service disruptions, leave a comment or email us at email@example.com.