All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It’s all very exciting, personally, as someone not responsible for fixing it.

Apparently caused by a bad CrowdStrike update.

Edit: now being told we (who almost all generally work from home) need to come into the office Monday as they can only apply the fix in-person. We’ll see if that changes over the weekend…

  • jedibob5@lemmy.world
    link
    fedilink
    English
    arrow-up
    106
    arrow-down
    3
    ·
    3 months ago

    Reading into the updates some more… I’m starting to think this might just destroy CloudStrike as a company altogether. Between the mountain of lawsuits almost certainly incoming and the total destruction of any public trust in the company, I don’t see how they survive this. Just absolutely catastrophic on all fronts.

    • NaibofTabr@infosec.pub
      link
      fedilink
      English
      arrow-up
      53
      ·
      3 months ago

      If all the computers stuck in boot loop can’t be recovered… yeah, that’s a lot of cost for a lot of businesses. Add to that all the immediate impact of missed flights and who knows what happening at the hospitals. Nightmare scenario if you’re responsible for it.

      This sort of thing is exactly why you push updates to groups in stages, not to everything all at once.

      • rxxrc@lemmy.mlOP
        link
        fedilink
        English
        arrow-up
        36
        ·
        3 months ago

        Looks like the laptops are able to be recovered with a bit of finagling, so fortunately they haven’t bricked everything.

        And yeah staged updates or even just… some testing? Not sure how this one slipped through.

        • dactylotheca@suppo.fi
          link
          fedilink
          English
          arrow-up
          46
          ·
          3 months ago

          Not sure how this one slipped through.

          I’d bet my ass this was caused by terrible practices brought on by suits demanding more “efficient” releases.

          “Why do we do so much testing before releases? Have we ever had any problems before? We’re wasting so much time that I might not even be able to buy another yacht this year”

            • dactylotheca@suppo.fi
              link
              fedilink
              English
              arrow-up
              7
              ·
              3 months ago

              Certainly not! Or other industries for that matter. It’s a good thing executives everywhere aren’t just concentrating on squeezing the maximum amount of money out of their companies and funneling it to themselves and their buddies on the board.

              Sure, let’s “rightsize” the company by firing 20% of our workforce (but not management!) and raise prices 30%, and demand that the remaining employees maintain productivity at the level it used to be before we fucked things up. Oh and no raises for the plebs, we can’t afford it. Maybe a pizza party? One slice per employee though.

    • RegalPotoo@lemmy.world
      link
      fedilink
      English
      arrow-up
      25
      ·
      3 months ago

      Agreed, this will probably kill them over the next few years unless they can really magic up something.

      They probably don’t get sued - their contracts will have indemnity clauses against exactly this kind of thing, so unless they seriously misrepresented what their product does, this probably isn’t a contract breach.

      If you are running crowdstrike, it’s probably because you have some regulatory obligations and an auditor to appease - you aren’t going to be able to just turn it off overnight, but I’m sure there are going to be some pretty awkward meetings when it comes to contract renewals in the next year, and I can’t imagine them seeing much growth

      • Skydancer@pawb.social
        link
        fedilink
        English
        arrow-up
        4
        ·
        3 months ago

        Nah. This has happened with every major corporate antivirus product. Multiple times. And the top IT people advising on purchasing decisions know this.

        • SupraMario@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          3 months ago

          Yep. This is just uninformed people thinking this doesn’t happen. It’s been happening since av was born. It’s not new and this will not kill CS they’re still king.

    • Munkisquisher@lemmy.nz
      link
      fedilink
      English
      arrow-up
      11
      ·
      3 months ago

      Yeah saw that several steel mills have been bricked by this, that’s months and millions to restart

      • candybrie@lemmy.world
        link
        fedilink
        English
        arrow-up
        0
        ·
        3 months ago

        Why is it bad to do on a Friday? Based on your last paragraph, I would have thought Friday is probably the best week day to do it.

        • Lightor@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          ·
          edit-2
          2 months ago

          Most companies, mine included, try to roll out updates during the middle or start of a week. That way if there are issues the full team is available to address them.

    • ThrowawaySobriquet@lemmy.world
      link
      fedilink
      English
      arrow-up
      9
      ·
      3 months ago

      I think you’re on the nose, here. I laughed at the headline, but the more I read the more I see how fucked they are. Airlines. Industrial plants. Fucking governments. This one is big in a way that will likely get used as a case study.

    • Bell@lemmy.world
      link
      fedilink
      English
      arrow-up
      0
      ·
      3 months ago

      Don’t we blame MS at least as much? How does MS let an update like this push through their Windows Update system? How does an application update make the whole OS unable to boot? Blue screens on Windows have been around for decades, why don’t we have a better recovery system?

      • sandalbucket@lemmy.world
        link
        fedilink
        English
        arrow-up
        0
        ·
        3 months ago

        Crowdstrike runs at ring 0, effectively as part of the kernel. Like a device driver. There are no safeguards at that level. Extreme testing and diligence is required, because these are the consequences for getting it wrong. This is entirely on crowdstrike.