Always Account for Adaptation
Back in early 2020, probably early March, I was stocking up on beans and hand sanitizer at a supermarket in Helsinki, masked up of course. The bean shelf was full, hand sanitizer was selling for normal prices, and I spotted an unmasked Sanna Marin (PM at the time) in the cheese aisle. The whole situation felt somewhat unreal. No one else seemed to have any idea what was about to hit them.
Many people following AI closely have been comparing the current situation to that moment. For those of us paying attention, it seemed like the rest of society was never going to respond sufficiently: the disease would simply rip through sclerotic Western countries and overwhelm hospitals, killing millions. I remember my dad asked me at the time, skeptically, what I expected to be the concrete economic harm from COVID. I wasn’t sure, and said something to the effect that if tons of people working in logistics are sick at the same time, this could result in severe economic problems.
In hindsight, much of this was importantly wrong. The bean shelf was empty when I went back the next week. Most societies did respond, many quite aggressively. Hospitals mostly didn’t get overwhelmed, at least not nearly to the extent many would have expected.
The response to COVID came a bit late, was ham-fisted in many ways, and some would argue it did more damage than it prevented. But, contrary to early predictions, there very much was a response, and for most of us, the COVID era is much more strongly characterized by the response than by the first-order effects of the disease.
I was making a common mistake: It’s easy to assume that new developments will hit a largely stationary society, and to underrate the extent to which society will adapt in response to them.
The point, plainly put
People have been issuing various dire warnings about the impacts of AI: of incredibly high-impact cyber incidents wreaking chaos, and of deepfakes and superpersuasion essentially driving most of society insane.
I think these predictions often underestimate important adaptation effects. People can and will simply change their behavior in response to new threats to avoid those threats. If something is no longer safe or reliable, people will stop doing it and stop relying on it. If dining at a restaurant becomes risky for yourself or others, people can, and in many cases did, just stop doing it. If letting your AI agent read random websites becomes too risky due to prompt injections, people will use whitelists. If AI can generate convincing pictures of anything, or convincing arguments for anything, people will stop trusting mere pictures or mere arguments (and indeed mostly already don’t trust mere arguments).
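To make the whitelist point concrete, here is a minimal sketch of what such a check could look like before an agent fetches a page. The hostnames in `ALLOWED_HOSTS` and the function name `is_allowed` are placeholders I made up for illustration, not any particular agent framework's API.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: these hostnames are placeholders for illustration.
ALLOWED_HOSTS = frozenset({"docs.python.org", "en.wikipedia.org"})


def is_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose exact hostname is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URLs (e.g. a broken IPv6 literal) are rejected outright.
        return False
    # Exact match on the parsed hostname, so lookalikes such as
    # "en.wikipedia.org.evil.example" do not slip through.
    return parts.scheme == "https" and host in allowed_hosts
```

The check is deliberately strict (exact hostname, https only). That strictness is the kind of adaptation cost discussed here: the agent becomes safer and also less useful.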
My point here is not to say “there’s no need to worry, we’ll just adapt”. As with COVID, the adaptation will be clumsy and costly and imperfect. But I am saying that things probably won’t be as bad as many alarmists think, and more importantly, that a large part of the impact will take the form of adaptation costs, rather than the first-order harm from the thing itself.
Cyber chaos
It’s a common refrain in conversations about security to bemoan the awful state of security practices at most organizations. Based on this, people often predict that AI-enabled cyber operations will cause catastrophic outcomes, because everyone is so underprepared.
But consider: have you been hacked in a way that caused substantial harm to you? Has someone you know? In my case at least, the answer is: not really. How can it be the case that most people and organizations have terrible security practices, yet don’t actually experience all that much harm? What this actually sounds like is people rationally investing relatively little in security because it’s mostly just not a big deal.
Of course, it’s largely not a big deal because cyber attacks are labor intensive and difficult, and there aren’t that many competent bad actors out there, so most people are not worth their opportunity cost to attack. And AI might well change this, by automating parts or all of the cyber kill chain.
But the fact that most people have terrible security practices is arguably evidence that we might be able to weather this storm just fine: it means that most people could improve quite a lot, relatively cheaply, if the threat environment got more serious.
This, combined with the fact that abundant AI security talent will also benefit defenders, suggests we’ll probably mostly be fine. Some people will get burned early on, and people might get a lot more careful about making sure they have good MFA setups, etc., and this will be somewhat inconvenient. And vibe coding your own “disposable software” might end up having a lot more limited applications than one might expect. Worst comes to worst, people might move back to using cash more for payments, or get way more careful about untrusted links. But if there’s real pressure, people will adapt.
Adaptation dynamics like this also came up in the context of some work I’ve been doing on AI integrity, i.e., the concern that AI systems’ behavior could be compromised by poisoned data and other tampering.
A key question about integrity is how easy it will be to perform “spray and pray” data poisoning, by simply putting texts and other media online that exhibit a particular bias or backdoor, and hoping that they get hoovered up into LLM training data.
For example, last year there was some concern about Russian propaganda “infecting” LLMs through a network of propagandistic news websites, but the actual effect of this particular campaign seems to have been limited.
It’s very unclear how feasible these kinds of attacks will be. But suppose that spray-and-pray poisoning is feasible: How would the whole ecosystem adapt to this? In a scenario like this we should expect all sorts of actors to perform such attacks to get LLMs to recommend their products, support their politics, or use their JavaScript frameworks. Consequently, labs would either work very hard to prevent this, or customers would learn not to trust LLMs.
This would likely prevent a worst-case outcome where some nation state is able to suddenly compromise our information ecosystem or coding agents, because other actors would essentially have compromised them already, and we would have adapted to their unreliability.
In other words, it’s quite unlikely that LLM agents would prove to be unreliable and yet be widely relied on regardless.
Of course, integrity attacks would still be important in this world, but they would be primarily important as a general drag on the reliability—and thus deployment—of AI, rather than leading to widespread compromise.
Epistemic eschatons
Something similar is true for “AI-enabled misinformation”: People have been worrying about the use of generative AI for misinformation at least since GPT-2. These days, it is already basically the case that AI can be used to generate convincing video. Has this resulted in an explosion of people believing insane things? Too soon to really tell, but so far I haven’t seen strong signs. People are instead mostly adapting by updating not to trust random pictures and video from unknown sources.
This is not without costs, of course. It might mean the end of quasi-anonymous “citizen journalism” where random people’s smartphone videos provide rapid evidence of events on the ground. Pending maturation of verification tech like C2PA, we may be forced to return to the epistemic dark ages of the 2000s, when we relied on trusted journalistic institutions sending reporters to the scene to ask people what happened. Indeed even now the vast majority of the information we act on comes to us in the form of text, which has always been trivial to falsify, yet somehow we make do just fine.
AI-enabled persuasion is another variant of this: Alarmists warn that LLMs are getting very persuasive, and implicitly thus warn that this persuasion will be a powerful instrument for shaping society. Many stories about catastrophic impacts of “superintelligence” are also partly premised on the assumption that superintelligence would be capable of superpersuasion, i.e., basically convincing anyone of anything.
The problem is, if you know that AIs can come up with highly convincing-sounding arguments for essentially any claim, regardless of whether the claim is true, wouldn’t you just stuff your ears with beeswax? Or more realistically, you could safely listen to the argument and simply not be convinced, because “an AI can make a convincing-sounding argument for X” would have ceased to be Bayesian evidence in favor of X.
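The Bayesian point can be checked with a few lines of arithmetic. This is only a sketch: the prior and the likelihoods below are numbers I made up for illustration, and the function name `posterior` is my own.

```python
def posterior(prior: float, p_arg_if_true: float, p_arg_if_false: float) -> float:
    """P(X | a convincing argument for X was presented), via Bayes' rule."""
    numerator = p_arg_if_true * prior
    evidence = numerator + p_arg_if_false * (1.0 - prior)
    return numerator / evidence


prior = 0.3  # assumed prior belief in X (illustrative)

# Assumed "human era": convincing arguments are easier to produce for true claims.
human_era = posterior(prior, p_arg_if_true=0.8, p_arg_if_false=0.2)

# Assumed "superpersuader era": a convincing argument is available either way.
ai_era = posterior(prior, p_arg_if_true=0.99, p_arg_if_false=0.99)

print(f"human era posterior: {human_era:.3f}")
print(f"AI era posterior:    {ai_era:.3f}")
```

When the argument is equally available whether or not X is true, the likelihood ratio is 1 and the posterior equals the prior: hearing the argument tells you nothing.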
Another reason you wouldn’t be convinced of random AI superpersuasion is that you would have already heard the super-persuasive argument for the opposite position from the other group’s superintelligence an hour before, or because you asked the superintelligence in your own pocket for its take on the issue.
The Cambridge Analytica case is instructive here: The media wanted to sell you a story that a small company was able to use novel technology to swing an entire US Presidential election. But the assessment of sober domain experts seems to be that this is not what happened at all, mostly because plenty of other groups were also using similar tactics, and even those tactics didn’t work very well, thanks to the onslaught of other forms of persuasion that people have already learned to weather.
Maybe you think that this would basically drive everyone insane instead? They’d be constantly changing their mind in response to confusing arguments, or they’d retreat to total epistemic nihilism.
The main reason I’m not super worried about this is that this seems to basically be the status quo ante. As every smart, nerdy teenager learns to their frustration, arguments, no matter how rational, are mostly not sufficient to convince people. This is because most people correctly recognize that “some guy can make an argument for X, and I can’t spot the flaws in the argument” is already often true regardless of whether X is true.
Instead, people mostly rely on their own experience, trusted experts, and other forms of social proof. People believe that a product is good if their friends recommend it, while mostly dismissing companies’ own marketing claims.
Call me an optimist, but I would expect that superintelligences competing to convince people will mostly look for a competitive advantage by generating real, credible evidence for their claims, and building trust through that, rather than just coming up with increasingly sophisticated “arguments”, because people already basically aren’t convinced by “arguments”.
Insofar as people currently believe badly incorrect stuff, the explanations for this seem to be largely social and psychological, not intellectual: For things where people don’t have a lot of skin in the game directly, it can be tempting to believe whatever will make your friends like you, or whatever makes you feel better about yourself. This effect might sap some of the epistemic upside from AI, but AI likely won’t make it much worse, because it’s already priced in: Plenty of biased sources of information are available for those who want to have biased views about particular topics.
I suspect that many people expect some form of epistemic armageddon because they’re implicitly expecting AI to be very sudden and concentrated. For example, if you did pop a modern image model into 2010, you might be able to fool a lot of people. And if you were the only one with access to a modern LLM, you might be able to win a lot of debates.
But even these probably only work if you’re the only one with access, and people don’t know you have it. If you get a really sudden intelligence explosion, this might be the case for some brief moment, but it’s a self-limiting situation: Using your capability aggressively makes it very likely that the existence of the capability will be noticed.
The blogger Dynomight thinks that AI could actually be extremely persuasive precisely by infiltrating our systems of social proof. But he too assumes that there’s approximately one AI that is “everywhere” and free to infiltrate your circles of trust unopposed over some substantial period. More realistically, competing AIs would compete to infiltrate, and would be incentivized to call each other out. And even if AIs are engaged in a coordinated persuasion campaign across some society or group, people in that society or group will probably notice and become distrustful in response.
Our best evidence for the end result of competition among agents of different degrees of honesty comes from the human case, and that evidence suggests that sociopathic lying sometimes kinda works if you’re careful and very smart, but that the most popular strategy seems to be basic trustworthiness, at least toward those close to you. Life is a repeated game, and liars will eventually get discovered.
To support his claim that humans are extremely persuadable, Dynomight points out that lots of people believe pretty crazy stuff, see: every religion that isn’t yours. But in almost all cases this seems well explained by people choosing to adopt certain beliefs because there are very real benefits to belonging to a community with a shared belief system. They’re not being tricked into acting against their interests. Really, it’s not clear to me that what’s happening in these quasi-religious cases is best described as anyone being “persuaded” to “believe” anything at all in the everyday sense that a friend might e.g. persuade you to buy a product.
Again, my point here isn’t really that there’s nothing to worry about. In this case, I think the thing to worry about would be the concentration of capabilities, lack of transparency into what capabilities exist, and the possibility of very sudden progress in capabilities. But the “superpersuasion” itself isn’t the central concern: If everyone’s a superpersuader, no one is.
Important Implications
The big caveat to all of this, as discussed above, is that adaptation could fail if AI capabilities are secret, concentrated, and develop very suddenly.
There are significant forces pushing against all three of those, and if the technology is momentarily all of these, it’s hard to keep it that way. But these are all mutually correlated, meaning that you should probably put at least some probability mass on worlds where all three of these are true. And it’s certainly conceivable that they could be mutually reinforcing in extreme cases.
But what I think is underrated, and what I want to write about more on this blog, is that these three are extremely central to plausible stories of doom, and might also be particularly promising to intervene on, plausibly more promising than “safety” or overall “capabilities”.
Predicting the first-order impacts of novel technologies is already hard, and correctly predicting higher-order effects is necessarily even harder. It’s easy to slip into assumptions that others will be gullible, incompetent, and slow to react, even if you wouldn’t believe that of yourself. But the historical record does show that when it really, immediately does matter, people will react, even if it can be hard to predict exactly how.
The COVID case, I think, shows that these adaptation responses can be surprisingly intense, and we may have more collective agency over them than is commonly supposed.
I’m not entirely sure what this implies for analysts and advocates now. The efforts of the biosecurity community to think about pandemic responses in advance seem to have been somewhat useful, but they were wrong about important things (masks), and lockdowns were implemented largely against prior recommendations, for better or for worse. But somewhat-random blog posts were apparently quite influential in the moment, and relative outsiders were able to set up contact-tracing infrastructure that likely would have been dismissed as socially infeasible a year earlier. (Admittedly, Claude tells me said contact-tracing infrastructure turned out to be of limited value.) And Operation Warp Speed was of course the ultimate demonstration of unexpected state capacity.
The winning move at the time seems to have been to keep as up to date as possible on the latest evidence of how the pandemic is actually progressing. Clinging to pre-existing expertise was often a mistake. I think we’re already seeing some of this with AI, where preconceptions and supposed expertise built up over decades seem to be more liability than asset.

I feel like it's way too optimistic to think "If everyone’s a superpersuader, no one is." I know that there are many people in my life who just do not have, like, the ability to understand what it means that models are so good at creating realistic video or text, and at any speed.
And remember that LLMs primarily generate text, which is the mode of communication that we've long assumed comes directly from a human. It was already 'hard' enough to tell whether someone's post came from a karma farmer, a clueless but confident person with no credibility, or a copypaster (finding evidence of those required cross-referencing their other text), which damaged the credibility of many sites, but those are easier to detect than LLMs are now.
I think it's more that people have a narrative of normalcy, which makes it seem like "quite a lot of weird effort no one will do", and that is why, for example, there has been no "big deepfake controversy" moment yet. Not because people are clued-in. Recently a streamer I watch a lot played a full AI-generated quiz on stream, constantly saying "the writing was so good", and no one seemed to notice or care apart from one other person in chat who plays with frontier AI.
I would not even say these capabilities are 'secret' but that societal awareness is terrible (not helped by hate-spamming "shameful never use AI even for the most obvious work-saving rote work, it's just autocomplete" mobs to be fair).
There are many points in the article I find reasonable, but I just feel like something is deeply missing in the whole, like I find the 'group of aware people' vs. 'group of unaware-to-AI people' divide very stark. Even thinking in terms of groups of people seems a bit much:
"But he too assumes that there’s approximately one AI that is “everywhere” and free to infiltrate your circles of trust unopposed over some substantial period. More realistically, competing AIs would compete to infiltrate, and would be incentivized to call each other out. "
What does the world look like here? I mean, I'd already think that 'secret' AIs, controlled by humans for some motive by the way, will be designed to be subtle enough that they just screw with consensus through tons of different accounts without anyone even knowing. Those are notoriously hard to detect because it involves tracking some 'randos' online. Yes, it isn't an AI-only problem, but the rapid increase in speed will be.
And if they are NOT secret, to the point that we have 'competing' AIs competing over manipulating the same community, you could probably say goodbye to humans talking in it, because that means social platforms will all become a live Moltbook. I mean, I guess I see AI stuff obviously all over my Substack feed while my own handwritten posts get 1-2 likes. I bet 'calling out' in this context will just mean more engagement bait. You've probably seen Reddit posts where a mass-upvoted post that's AI will get top comments calling it out as AI, but somehow the post remains mass-upvoted.
I have heard it suggested elsewhere that, to deal with this problem of not knowing who is real and who is just trying to gain power (which existed before AI but has gotten much worse), people should have some "proof of work" of truly good content, or at least some reputation to uphold even if they're not online primarily to write, just to show they are not an LLM. I don't know if this is very viable optics-wise, because it filters out real people who are legitimately worse than LLMs at everything or just aren't ambitious enough to care.
I guess overall I'd say we aren't really fine. We've also had lots of other erosion of trust in many areas, and I truly think a lot of people are more like "I guess this is just the new normal, who can do anything about it, oh I just ignore it, I am pacified and domesticated to not care", the blank-stare equivalent.