Update on LastPass Connectivity Errors

At 3:57 Eastern Time this morning, one of the data centers that LastPass relies on went down. Our team immediately took action to migrate LastPass to run entirely on a different data center. As a result, many users experienced connection errors with the LastPass service, and LastPass.com has been intermittently unavailable throughout the morning. We have been engaged with our data center provider the entire time to resolve the issues. Please note this does not impact the security of your data.

We are doing everything we can to mitigate the impact and resolve the situation as quickly as possible, and we apologize for the inconvenience caused. We strongly recommend that users log in through the browser extensions to access their vault. Most users should have access that way, though some may still see warnings that they are in “offline mode”.

We will continue to update our user base and appreciate your patience.

Update: 1:28 pm EDT

Though one of our data centers remains completely down, the service is generally stable and should be available to the majority of users (with the exception of login favicons). Some users may see connection errors but should still be able to access their data. We continue to work as quickly as possible to get the service back to 100%.

Update: 4:13 pm EDT

Most users should now be able to connect to LastPass browser extensions and LastPass.com without errors, though favicons still may not sync. We continue to closely monitor the situation.

August 13, 2014: Post Mortem of Yesterday’s Outage

As noted in our original post, on August 12th, 2014 a data center that LastPass relies on went down around 4 am Eastern Time. Below, we have outlined the timeline of events as they unfolded at the data center and with the LastPass service at large.

We again sincerely apologize for the inconveniences caused, and want to assure our community we are moving forward stronger than before, as we remain deeply committed to the security and reliability of our service for our users.

Joe Siegrist
CEO of LastPass

Summary of Events

The majority of users were unaffected because we had proper redundancy in place to deal with the loss of a data center, as well as the built-in offline access via the LastPass browser extensions. However, during our efforts to scale at the secondary data center to ensure sufficient capacity at the peak of the day, we inadvertently worsened the situation through human error. Our team certainly has takeaways from the experience and will be implementing changes going forward, as detailed in the concluding statements below.

We did receive a full RFO from our data center confirming that the BGP routing table issues affecting other companies yesterday played a role as well. For more, see: http://www.zdnet.com/internet-hiccups-today-youre-not-alone-heres-why-7000032566/

Timeline of Events (EDT)

3:50 am – We detected extreme latency and packet loss between one of our data centers and most major networks, including inter-connectivity with the other data center.

3:54 am – Our monitoring system detected the situation as critical and paged two operators.

4:00 am – We contacted our data center provider regarding the issue we were experiencing with their service.

5:00 am – With no update from our impacted data center provider, we switched from two data centers to run entirely on the second data center and disabled the affected data center.

6:00 am – We noticed that IPv6 had suddenly started working at the now-disabled data center, making it clear to us that major networking changes were being made.

7:00 am – Our report was escalated by the impacted data center provider.

8:00 am – We determined that the outage would likely be extended, so we executed a plan to add some spare machines into load balancing at the second data center to ensure we would have plenty of spare capacity at the peak of the day.

8:15 am – We began to receive alerts of intermittent connectivity issues at our second (now only) data center.

8:30 am – A small percentage of users reported logout errors that prevented them from utilizing offline mode.

9:00 am – We continued trying to work with our impacted data center provider, but received no updates on the situation or information on resolution.

9:30 am – Latency and connectivity issues increased at the second (now only) data center, which we began investigating.

10:00 am – We received acknowledgement from our impacted provider that this was a widespread problem. They indicated they would reload the core routers and noted that it might be an extended outage.

10:30 am – The impacted data center’s network went completely down.

12:00 pm – We tracked down the source of an issue at the second data center, in which 3 machines we had added were running at 100Mbps instead of Gigabit (despite having Gigabit cards and being connected to Gigabit switches) and were network saturated.

12:45 pm – We resolved the issue with the 3 additional machines, and fully restored service still running on the second data center only, though favicons remained disabled.

2:15 pm – The impacted provider indicated they were fully online, though our machines there remained unreachable for us.

2:30 pm – We authorized the impacted data center staff to reboot our networking equipment, with no effect.

3:30 pm – We discovered the underlying reason some users were being logged off immediately after login, and resolved it.

3:45 pm – Members of our team arrived at the impacted data center, and verified that our networking equipment was still down.

4:15 pm – We completed a swap to spare equipment, bringing the impacted data center back online.

8:45 pm – We completed testing and confirmed that replication to the secondary data center looked good, and we were fully restored with both data centers active again.

Conclusions & Lessons Learned

As a result of yesterday’s events, we have formed the following key takeaways and action steps:

  • We have moved our status page to be hosted outside our network, since it was inaccessible for periods of time.
  • In an effort to gather more detailed information for our community, we delayed communicating about the situation. Going forward, we will share what information we have, however sparse, and work to update the community from there, via the blog, the status page, our social accounts, and email where appropriate.
  • Our monitoring checks now verify port speed.
  • We are considering moving to another data center provider.
  • In an effort to improve the situation, we worsened it through our actions, and we will be more cautious in taking preventative actions when running on a single data center.
  • We’re moving to a hosted model for DNS that includes external service checks.
  • Though we designed some systems to be ‘non-critical’, such as favicons for sites, we’ll be improving our systems to minimize visual disruption during a massive outage.
  • A small number of users were impacted by an inability to access the service offline; we continue to investigate and test this.
  • We will be implementing more disaster and redundancy tests of our systems to better prepare for a catastrophic, single data center scenario.
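The port speed check mentioned in the list above could be sketched roughly as follows. This is only an illustration, not the check LastPass actually runs: it assumes Linux hosts, where the kernel exposes the negotiated link speed in Mb/s at /sys/class/net/<interface>/speed, and the function names, the 1000 Mb/s threshold, and the interface names are hypothetical.

```python
from pathlib import Path
from typing import Optional, Tuple

# Assumption for this sketch: every monitored port should negotiate Gigabit.
EXPECTED_MBPS = 1000
SYSFS_NET = Path("/sys/class/net")


def parse_link_speed(raw: str) -> Optional[int]:
    """Parse the contents of a sysfs 'speed' file; return None if unknown."""
    try:
        speed = int(raw.strip())
    except ValueError:
        return None
    # The kernel reports -1 when the speed cannot be determined.
    return speed if speed > 0 else None


def read_link_speed(iface: str, sysfs_root: Path = SYSFS_NET) -> Optional[int]:
    """Return the negotiated speed of iface in Mb/s, or None if unreadable."""
    try:
        return parse_link_speed((sysfs_root / iface / "speed").read_text())
    except OSError:
        # Missing interface, or the read failed because the link is down.
        return None


def check_port_speed(
    iface: str,
    expected_mbps: int = EXPECTED_MBPS,
    sysfs_root: Path = SYSFS_NET,
) -> Tuple[bool, str]:
    """Return (ok, message) so a monitoring system can alert when ok is False."""
    speed = read_link_speed(iface, sysfs_root)
    if speed is None:
        return False, f"{iface}: link speed unknown"
    if speed < expected_mbps:
        return False, (
            f"{iface}: negotiated {speed} Mb/s, "
            f"expected at least {expected_mbps} Mb/s"
        )
    return True, f"{iface}: {speed} Mb/s"


if __name__ == "__main__":
    # "eth0" is a placeholder interface name.
    ok, message = check_port_speed("eth0")
    print("OK" if ok else "CRITICAL", message)
```

A check like this would have flagged a machine that negotiated 100 Mb/s despite Gigabit hardware before it was added to load balancing. Treating an unknown speed as a failure is a deliberate choice in this sketch, since a port that cannot report its speed is also worth a look.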

199 Comments

  • Anonymous says:

    Shame on you Lastpass

  • Anonymous says:

    Finally.
    This has been a disaster, lastpass, and shown you up to be incompetent and amateurish.
    It would be interesting to hear what the cause of “one of our data centers has failed” was – if it was a breach then that will be the end of you.

  • Anonymous says:

    back for me too

  • Anonymous says:

    It’s back up for me. Try to log in.

  • Anonymous says:

    Before moving to a different provider keep in mind that LastPass should be fully aware that their customer base can’t handle two outages. I would think that LP should be one of the most stable options after this downtime.

    I use the LastPass Pocket as well to keep a backup of my passwords. Too bad the password I need was created last night and I didn’t back it up. *sigh*

  • nia says:

    I have to disagree with these postings in defense of LastPass. LastPass DOES NOT store your passwords in the “cloud”. They store a one-way cryptographic “hash” of your master password in the cloud. Your passwords are local on your device and are unlocked using that hash. They can be unlocked other ways though too. For instance, LastPass offers several mechanisms to keep your encrypted password blob with you and locally on your computer. My recommendation: set up multi-factor authentication with Yubico’s ‘YubiKey’ product per LastPass’ instructions on their site. Optionally, use the LastPass Pocket detailed here: https://helpdesk.lastpass.com/lastpass-on-the-go-2/lastpass-pocket/

    • Anonymous says:

      Interesting as Lastpass states THEY DO, encryption is what is done locally.

    • Anonymous says:

      Rubbish (nia, not Anonymous) – if you are seeing local passwords then they will be the browser that is serving them up, not lastpass.

      The passwords are stored in the cloud, encrypted with your master password / hash. Otherwise, how would you be able to get a password down to, say, a new PC? It comes from the cloud.

    • Anonymous says:

      Lastpass DOES keep an encrypted copy of your data in the cloud. This is what allows it to keep all your devices in sync such as another computer, phone, etc.

    • nia says:

      They explain the “salted hash” concept that they use, where passwords never actually reside in their datacenter, at this link: https://lastpass.com/how-it-works/ An even better explanation of the specifics is on the Security Now! podcast from a few years ago: https://www.grc.com/sn/sn-256.htm

    • nia says:

      Folks, please educate yourselves by listening to the detailed podcast I mention above, and by reading the LastPass site. I would NOT have been able to gain access to my passwords this morning, while Lastpass was down, if the passwords were stored in the cloud. I gained access to the passwords because I used my yubikey USB one-time password device to decrypt the LOCALLY stored passwords. There you have it.

    • Anonymous says:

      So how do I log into a wiped-daily PC, go to lastpass.com, enter my password and auth code, and see a list of all my passwords then? I can guarantee that they aren’t stored locally on that PC.
      Answer: they are stored in the cloud.
      Doesn’t mean that you can’t have a local copy. You can, and I do on several PCs, but they are also available through a browser.

    • nia says:

      I think there’s a semantic misunderstanding here. A copy of your encrypted password “blob”, as I’ve heard it called, is on the LP server. They can’t decrypt it even if served a subpoena. That same blob is also local on your computer. By contrast you CAN decrypt it independently without any access to the LastPass servers in their datacenter.

      They go into more detail at the link below but here’s how LP describes this:

      “All encryption/decryption occurs on your computer, not on our servers. This means that your sensitive data does not travel over the Internet and it never touches our servers, only the encrypted data does.”

      https://helpdesk.lastpass.com/getting-started/introduction/why-is-lastpass-safe/

    • nia says:

      Only when your cryptographic ‘blob’ is downloaded to your device, be it the browser at a cybercafe or to your iPhone, does the clear text password become visible. I probably should have worded my initial post as such:

      “LastPass DOES NOT store your [unencrypted] passwords in the cloud. Your passwords are encrypted in a datafile that is local on your device, with a copy of that encrypted datafile periodically backed up to the LastPass server.”

      I’m sorry for not wording it better originally. My original point was to show that LastPass does not make its users depend on access to the Internet.

    • Bankbuddy says:

      GRC.com is a great resource. I will know what to think of all this when The Explainer in Chief speaks.
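The model the thread above converges on (a key derived locally from the master password, an encrypted blob that can be stored both on the device and on the server, and a separate one-way hash used only for login) can be illustrated with a small sketch. It is not LastPass's actual algorithm or parameters: the iteration count, the use of the username as salt, and every function name here are assumptions, and the HMAC-based keystream is only a standard-library stand-in for a real cipher such as AES.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value, for illustration only


def derive_key(master_password: str, username: str) -> bytes:
    """Derive the vault key on the client; it is never sent to the server."""
    return hashlib.pbkdf2_hmac(
        "sha256", master_password.encode(), username.encode(), ITERATIONS
    )


def login_hash(key: bytes, master_password: str) -> str:
    """One-way value sent for authentication; it does not reveal the key."""
    return hashlib.pbkdf2_hmac("sha256", key, master_password.encode(), 1).hex()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (stand-in for AES)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_vault(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt locally; the result is the 'blob' that may be synced."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt_vault(key: bytes, blob: bytes) -> bytes:
    """Decrypt locally; works offline as long as the blob is on the device."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


if __name__ == "__main__":
    key = derive_key("correct horse battery staple", "user@example.com")
    blob = encrypt_vault(key, b"example.com: hunter2")
    # A server in this model would hold only the login hash and the blob.
    print(login_hash(key, "correct horse battery staple")[:16], len(blob))
    print(decrypt_vault(key, blob))
```

In this sketch the server side would only ever see the login hash and the encrypted blob, which is why a cached local copy of the blob can still be opened while the service is unreachable.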