I’m changing my mind about generative AI. I don’t think hallucination is the show-stopper anymore. I think we are.
Perplexity.ai, which I’ve been using as my main AI research assistant, has this month launched Deep Research, essentially an upgraded version of itself that doesn’t just do a cursory web search and build its response from that. (It’s obviously positioning itself as a competitor to OpenAI’s tool of the same name, launched a couple of weeks earlier and available to pro users in some jurisdictions; OpenAI’s is faster, and the two are comparable.) Deep Research prepares itself, and shows you how it is doing so, by breaking down the prompt, continually reassessing itself and customising its searches until it spits out a response, which is presented as a paper of sorts, with a beginning, middle and end.
I have to admit I was impressed. Perplexity, of course, is not the only tool to have reached this level, but to roll it out to existing users so seamlessly is impressive. Unfortunately it’s too good. Whereas previously I would run a prompt through Perplexity as an initial foray into a topic, now I more or less feel there’s nothing much to add. The result: a motivation-sapping sensation in which the question ‘What’s the point of doing any more on this?’ is hard to answer.

The case in point was a thought I had while watching Sidney Lumet’s 1965 film The Hill. The themes, the tone, the conclusion felt like more than an echo of another Lumet film released only a few months earlier, Fail Safe, and I figured it was relevant enough, with all that’s going on in the world, to compare and contrast the two and see what lessons Lumet might have been trying to convey that could be applied to us, a quarter century into the millennium: the dangers of rigid protocol and process, especially in military environments; the toxic power of sociopaths manipulating those systems; the futility of trying to do the right thing in the face of those forces; and a general pessimism about whether humans can build systems that still retain their humanity.
The Hill may be about a North African prison camp for holding and rehabilitating criminals within the British armed forces, and Fail Safe may be about the U.S. and the Soviet Union locked in a nightmare scenario of mutually assured destruction, but both were really about power, and whether the systems we build to project and protect that power are fit for purpose. Both films are extraordinary white-knuckle rides, and both end with little hope on offer.

I ran Perplexity on it, just to see if there were any other aspects I should be aware of: perhaps the conceit had already been spotted and dissected elsewhere. Perhaps I was just following in other people’s footsteps. My prompt was this:
Write a report exploring the movies The Hill and Fail Safe by Sidney Lumet, comparing and contrasting the themes and other elements
I didn’t try to pre-empt things or tip the scales by giving Perplexity a glimpse of my own thoughts, and I gave equal weight to comparison and contrast. I also tried not to restrict the discussion to themes by adding ‘other elements’, suggesting that camera work, acting, location, lighting etc could equally be explored.
And here’s the crunch. Perplexity’s answer was so good I didn’t really feel there was much more to say. Why not just send you all the Perplexity version and head off to the beach?

Here’s how it started off:
Sidney Lumet’s The Hill (1965) and Fail Safe (1964) stand as stark examinations of institutional power, systemic failure, and the human cost of rigid hierarchies. Though distinct in setting—one a WWII military prison in North Africa, the other a Cold War nuclear crisis—both films interrogate the moral decay inherent in systems that prioritize control over humanity. Through unflinching realism, claustrophobic direction, and morally complex characters, Lumet crafts two parallel narratives that reflect the anxieties of their eras while offering timeless critiques of authority. This report explores the thematic and stylistic intersections and divergences between these works, revealing Lumet’s mastery in exposing the vulnerabilities of systems designed to suppress individual agency.
I’m not a professional film reviewer, nor am I an academic, but there’s not much there I would disagree with — and I would readily admit my own thoughts on the topic are pretty prosaic compared to this. Perplexity goes on under clear headings to dissect the topic, before concluding thus:
The Hill and Fail Safe exemplify Lumet’s ability to dissect power structures through intimate, high-stakes narratives. While The Hill exposes the brutality of colonial and military hierarchies, Fail Safe interrogates the existential threats of technological overreach. Both films, however, converge on a central truth: systems designed to protect often enslave, and humanity’s greatest vulnerabilities lie in the very institutions meant to uphold order. In an era of resurgent authoritarianism and AI-driven warfare, Lumet’s films remain urgent reminders of the costs of unchecked power.
Here’s the full prompt and response.
Perplexity here is not just parroting its sources; it’s pulling them together into a paper that responds to my prompt at a level that’s breathtaking. It is thinking, in the sense that it is not merely summarising from collated sources. There’s little to quibble about here. The effect is jarring, I’ll admit. We’re not in Kansas anymore.
Or are we?
Perplexity prides itself on showing its thought processes, and that in itself is impressive. You’re watching a computer work; you’re a teacher checking a student’s construction of an argument.

But we’re still hamstrung, as we have been before with generative AI, by a lingering opacity. I can now see how Perplexity decided to go about the task, and I can see, up to a point, how its thinking evolved based on the sources it found, but I’m still not sure of the provenance of each element. Perplexity provides links in the text, sometimes after each sentence, but on closer inspection it’s not always clear how, and even whether, the source cited is relevant.
Indeed, when I asked Perplexity a follow-up question about which of the sources it cited made any direct comparison of the two films, I realised that we still had at least one foot in Kansas. How much of its answer was its own work, I was asking, essentially.
It replied that only one of the 31 sources cited made any link between the two films, and that, it said, was fleeting. In fact, the link it provided made no mention of The Hill, and when challenged Perplexity apologised and corrected its answer:
The most relevant comparisons between the two films actually come from source1, which mentions both films in Sidney Lumet’s filmography. However, it doesn’t provide a direct comparison. The closest it comes to comparing them is in separate sections about each film…
It concluded its answer thus:
The comparative analysis in my previous response was largely synthesized from separate information about each film, rather than from a source directly comparing the two. I apologize for any confusion this may have caused.
I’m not going to quibble. In this and countless other responses Perplexity is hitting them out of the park. Checking its work is easier than in a lot of similar tools, though still problematic. But in terms of delivering a well-constructed argument, one that doesn’t just synthesise sources but builds a case from scratch, with enough detail to show it isn’t simply conjuring fake stuff up (hallucinating), we’re already on a new level.
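That manual back-and-forth suggests an obvious first line of defence, one you can script yourself. Here’s a minimal sketch (mine, and nothing to do with Perplexity’s internals) of a crude citation spot-check: paste in the cited URLs by hand (the one below is a hypothetical placeholder), fetch each page, and flag whether it mentions either film at all.

```python
# First-pass citation check: does each cited page mention either film?
# A rough substring filter only; a page can mention both films
# without actually comparing them.
import requests

# Hypothetical placeholder: paste in the URLs actually cited.
CITED_URLS = [
    "https://example.com/lumet-filmography",
]

TERMS = ["the hill", "fail safe", "fail-safe"]

for url in CITED_URLS:
    try:
        text = requests.get(url, timeout=10).text.lower()
    except requests.RequestException as err:
        print(f"{url}: could not fetch ({err})")
        continue
    hits = [term for term in TERMS if term in text]
    print(f"{url}: mentions {', '.join(hits) if hits else 'neither film'}")
```

Even a crude check like this would have flagged the source that never mentioned The Hill. The harder question, whether a source genuinely compares the two films, still needs a human, or another AI, and then a check on that.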

But.
Perplexity and its ilk should be congratulated for what they’ve done, and I’m sure things will keep improving, quickly.
But for us creative/knowledge professionals, what does this mean? What does it mean for journalists?
We’ve hopefully gotten past the low-hanging fruit: the mostly knee-jerk responses that it’s going to put everyone out of work. I don’t deny that this is happening already. Why hire people to write a mealy-mouthed press release or marketing blurb when AI does it for you, without all the fingers that insist on adding unwieldy, yawn-inducing buzzwords? Why hire people like me to write sponsored content, ‘white papers’, or other nonsense? Nobody reads those things anyway, so why not have a machine do it (as, no doubt, is already happening on the side supposedly doing the reading)?
In that sense Perplexity and its ilk are simply moving the already-moved goalposts. Seen clearly, we’re moving apace towards a world where ‘just good enough’ (a level of mediocrity we accept because it is more efficient, and because that’s what is, under current conditions, considered acceptable) is now just about good enough to roll out at a level most industries and professions would be willing to sign off on. We’re not just talking creatives/media here: we’re talking law, government, manufacturing, any sector which relies, at some point in its procedures, on the written word. (Think instruction manuals, documentation, briefs, summaries etc.) One AI will create, and another will check. The human involvement will be as minimal as on a modern production line. We can’t escape this fate.

So for journalists and other creatives, who are paid to think up new ways of saying things, connecting things, reinventing stuff, moving it from one medium to another, the obvious answer is that it allows us to test story ideas, theories, half-baked thoughts. But that in itself is somewhat half-baked. It’s the ‘faster horse’ fallacy.
So let me lay down a few Wagstaff Laws, none of which come from AI, I promise, though some might perhaps have had their beginnings in other people’s ideas.
First off, we need to recognise that the better these things are, the harder it will be to find errors. We have to assume there are errors, and that those errors can be big or small: an incorrectly cited source, say (breeding claims of plagiarism, IP theft etc). Perplexity has shown that its footnoting, though impressive, is still flawed, and I’m guessing it might never be perfect. The old Perplexity provided half a dozen sources for a response; Deep Research cites more than 30.
Finding those errors isn’t going to be fun, and might end up taking more time than actually researching and writing the piece from scratch. Yes, AI could do the checking, but then you’ll need to check that as well. And how can I be sure its search, though apparently thorough, didn’t miss some stuff? I take some pride in my ability to search stuff online. I have, after all, been doing it for nearly 35 years. How do I know that Perplexity covers all the bases I would have?
And that’s just the small stuff. What if your AI comes up with some great idea that it doesn’t know it cribbed from one of its sources? We’re now into dangerous territory, and not just for academics. If journalists rely on AI to develop story ideas, it has to be assumed that the AI is going to find out whether someone else has had a similar idea. But is it going to tell you?
Related to that, my second law: the more we rely on AI for idea generation, the weaker we will become. We will depend on it not because we couldn’t come up with the ideas ourselves, but because we don’t need to. So we’ll effectively outsource that part of our cognitive effort, and we’ll wake up one day entirely dependent on something beyond our control. This is fine if it’s something like knowing how to put up a gutter pipe, or even knowing how to drive. But it’s not fine when it’s a skill that is unique: my idea for a story, unless it’s people in a burning building or the new GDP figures, is never going to be identical to someone else’s, because of all the experience, successes, failures, peccadilloes, biases, enthusiasms, hobbies etc I have allowed in over the years. If I let go of some of that, not only will I gradually get worse at it, but I won’t be absorbing and integrating new experiences, skills and peccadilloes. I will no longer be the journalist I was. The point is: we don’t really understand what makes us us, so outsourcing a core piece of what makes me me will lead to a different, and I would have to assume lesser, me.

For example: I’ve gone through all the sources for the The Hill/Fail Safe piece but I’m still not comfortable. I still don’t know where the sources end and the ideation starts. And now I don’t know where Perplexity’s ideation ends and mine begins. That dissipation of motivation to write the piece? It comes from the blurring of these lines. My creative engine depends on believing that this is my idea, my story, my execution, my name at the top. What happens when a story is mostly the work of AI?

So these are pretty obvious laws. The third is less so: the more we use tools like this, the more we’ll have to reinvent both how we use them and, in the process, ourselves. This in some ways is quite exciting. I’ve not come across many people using Perplexity, but I’m sure there are lots, and eventually, just as with Google, we’ll all be using it, or something like it. So there’s little point in journalists working on pieces that can be produced, to an acceptable level of mediocrity, by AI. So we’ll have to change. Perhaps our jobs will too, if they survive.
To do that we’ll need to figure out ways to retain our soul, by which I mean the indelible proof that what we produce is ‘us’, even if it’s a markets wrap. I suspect it will be by learning the art of the prompt well enough to squeeze out of AI what only it can do, or what only it can do well. What I mean is this: I can ask Perplexity to help me think out a piece juxtaposing two films, something I could have, and should have, done myself. But if I asked it something I couldn’t possibly do myself, at least in the time available, wouldn’t that make me a better journalist/writer/thinker?
For example: ‘Take the scripts of the following directors (Kubrick, Lumet, and 20 others) and gauge from the language used which of them would have been banned or deemed unacceptable under current laws.’ Perhaps that’s too easy. I was listening to an interview with Peter Turchin, an academic who applies mathematical rigour to history (‘cliodynamics’). When the interviewer, Nate Hagens, asked him what he’d say if he got five minutes with Donald Trump, Turchin talked about funding for the research necessary to explore what lessons from his work could be applied to the present state of the world. The research, he said, would take five years. Nate, quite reasonably, asked: are we still going to be here in five years?

To me it was a missed opportunity for Turchin. While I think we probably do have five years, it’s true to say time is something we don’t have a lot of, whether we’re talking personally, politically or planet-wise. Can AI help us with that? Yes. I would like a decent AI to look at the things Turchin is looking at (essentially periods of elite rule and wealth-pumping, and which of them lead to catastrophe and which don’t) and do it all in five days rather than five years. If Turchin and his team are good, and I’m sure they are, they should be able to rise to the occasion, and at least get us a ‘good enough’ working conclusion to take to Trump and help him make more informed decisions. There are some things where getting stuff done quickly is more important than getting it done perfectly, to paraphrase Voltaire, or Aristotle, or whoever it was.
As journalists we need to rise to this challenge. We need to realise that our core skill is coming up with better ideas than the other guy, and executing them. We need to mesh that with another skill, ‘better prompts’, which in turn should lead us to ‘better stories’, which may, in the end, look nothing like what we now think of as stories.
There is another option, which may in itself not be a bad strategy. AI will shape our world more than any other technology, because it will shape us: our minds, our bodies, and our dependence on it. It’s not like a car or a kettle; we can live without those. But AI has already made it impossible for us to switch it off. And so it follows that the most creative act is one of independence: to essentially live without AI, to not use it as a crutch, whether for finding the cheapest flight or assessing CT scans. For a journalist this would mean writing as a sort of clandestine, subversive activity. It would be like keeping one eye closed when turning on a light, to retain some night vision. You would reject all technologies that make you dependent on something you can’t fix yourself (so a manual typewriter might be OK). You would truly be an independent thinker, and I wouldn’t be surprised if you ended up smarter than the people around you.
This is a radical solution, of course. I’m convinced that we must remain masters of our technology (a red line I suspect we’ve already crossed in many cases), but I believe that entails refusing to accept the false efficiency god of mediocrity. As artists like Mark Haddon argued in a joint letter published today, urging the UK government to rethink plans to weaken the copyright laws governing artwork and writing for the sake of improving AI, we should not fool ourselves about what is at stake if we allow mediocrity to win. The creativity that every human is born with, and assiduously develops until it’s schooled out of them, would wither and die in the face of AI-generated content. We somehow have to keep the door to mediocrity closed as long as possible while seeking new definitions of what our human minds can do, to discover and unleash those capabilities we know are hidden somewhere within.
AI should be the spur that drives an assault on that bastion of locked-in creativity. Our future happiness depends on it.