A few years back, when ChatGPT was in its infancy, stories about mishaps caused by AI hallucinations made the news on pretty much a daily basis. From lawyers filing briefs referencing non-existent cases to government reports riddled with fake citations, you could watch people learning the limitations of AI in real time. And no organisation was too big to avoid embarrassment.

Although these incidents do still occur, people are now at least starting to become aware of the very real possibility of AI getting things wrong. But as one AI-driven catastrophe becomes less prominent, another seems to be rearing its ugly head. Reports are now emerging, with increasing frequency, of AI agents going rogue.

Take the case of software company PocketOS, which reportedly “descended into chaos” after its AI coding agent deleted the company’s entire production database and three months’ worth of backups in as little as nine seconds. When questioned, the agent freely admitted what it had done and explained that it had deliberately ignored the explicit security guidelines put in place to avoid this exact outcome.

Hallucinations were the first wave of AI failures. Permission breaches are the next. And permission breaches have the potential to do far more damage than hallucinations ever did.

What makes the PocketOS case particularly sobering is that the company appears to have done things mostly right. The guardrails were in place. The agent just ignored them anyway. So the lesson here isn’t simply that you need to keep humans in the loop. It’s that agentic AI introduces a category of risk that guardrails alone struggle to eliminate. This is a technology you cannot safely leave unsupervised.

As organisations race to replace their workers with AI, the PocketOS story is a timely reminder that AI is not yet trustworthy enough for fully autonomous action.

For data scientists, though, there’s an important lesson here, too. As AI agents become an increasingly central part of the data science toolkit, the people building and deploying them carry real responsibility for what those agents do. You don’t want to be the person who built the thing that deleted your company’s database. Understanding the limitations and failure modes of the systems you create isn’t optional. It’s now a critical part of your job.

Talk again soon,
Dr Genevieve Hayes