Commentary by Carol Kuntz
Published November 12, 2025
Two Princeton computer scientists published an article in August arguing that artificial intelligence (AI) is a "normal" technology. This perspective stands in marked contrast to the less clearly articulated but more dominant perspective that views AI as a sort of technological tsunami, relentlessly pouring into every crevice of human life, eventually serving as super-lawyers and super-doctors, and perhaps one day as benevolent dictators or evil overlords.
A "normal" technology in this rendering is still transformative but follows the pattern of other strategic technologies like electricity. It transforms the world around it, but confronts friction in its advance into jobs, homes, and schools. The advance takes decades, allowing governments, businesses, civil society, and individuals to make decisions about how to allocate benefits and risks.
A 2025 CSIS report, Artificial Intelligence and War: How the Department of Defense Can Lead Responsibly, strongly aligns with this view: AI is vulnerable to mistakes when used to solve real-world challenges. Particularly when the consequences of mistakes are high, AI experts must work closely with subject matter experts to develop analytical tools that can predict, measure, and mitigate these mistakes.
The CSIS report focused on the use of AI in the substantive domain of war. Proceeding despite significant uncertainty about prospective mistakes may be acceptable when assessing the efficacy of recommendation algorithms for streaming movies or urging the purchase of new consumer goods.
For war, though, failing to understand the prospect of errors could lead to mistaken or misguided military effects; even the bravest fighter doesn't want to die from fratricide. Military officers and policymakers should seek to anticipate these types of errors, as well as those that could lead to excessive civilian deaths or unwarranted destruction of civilian infrastructure.
Structured decision tools to elicit and integrate key insights from warfighters and technical experts would be needed in three different contexts:
Britain focused on driving tanks slowly alongside infantry, while Germany developed the new operational concept of blitzkrieg, honing new tactics and new organizations to implement it. At the beginning of World War II, history revealed which approach used tanks most effectively. Well-instrumented wargames are a valuable contribution toward developing war-winning operational concepts.
There are many reasons why known mistakes in algorithms still occur, particularly in high-consequence domains. While complicated, AI is not impenetrable.
Warfighters, the domain experts in war, should train to understand the key assumptions undergirding algorithms. They need practice identifying ways in which the characteristics of a specific military operation may be inconsistent with an algorithm's underlying assumptions.
Such an inconsistency might emerge when using a highly accurate algorithm for identifying enemy tanks that was trained on data about Russian tanks. In a particular operation, though, the algorithm should not be used despite the elegance of its math, because a member of the U.S. coalition in that operation is using Russian tanks.
Structured tools like a dashboard or a decision tree should be built to ensure that the critical expertise from the AI experts and the domain experts is elicited and reconciled before algorithms are integrated into fighting forces. Simulations and training exercises should ensure that appropriate warfighters are well-trained on the characteristics of algorithms that are particularly sensitive to the precise characteristics of a specific operation.
Whether "normal" or not, AI is coming. It may take decades, or it may be, as some observers warn, a sort of technological tsunami relentlessly pouring into every crevice of human life.
Domain experts should accept their responsibility to actively shape the use of AI in their fields. They should demand and fund assessments that highlight, instead of burying, fine-grained judgments in algorithms that influence whether mistakes of various sorts plague the use of AI in their substantive domain. Tsunamis of all sorts can cause a great deal of damage.
But irrigation, after all, was an early human innovation. Through it, humans were able to divert water, much but not all of the time, to serve beneficial purposes. Risks remained, but the damage was reduced, and the benefits were greatly increased and more widely shared.
AI mistakes generally are caused by some sort of problem in one of three contexts: first, in the mechanism of the algorithm itself; second, in the training database; or third, in the user's interpretation of the result of the algorithm.
First, consider the mechanisms of the algorithms themselves. The mechanisms of different types of algorithms can cause mistakes if the strengths and weaknesses are not well-understood. For example, there is a reasonable argument against the use of current generative AI algorithms in safety-critical functions.
Hallucinations
Generative AI can create new content in response to a query from a human interlocutor. It crafts this new content by drawing on insights derived from systematically reviewing hundreds of millions or billions of examples. Recent technical advances enable it to take account of the broader context within which words or images reside, capturing more subtle connotations and meanings.
While systematic analysis lies behind the generative efforts of these algorithms, the creation of new content represents a certain amount of risk, as in any creative enterprise. With the ability to craft a new answer to any question comes the risk of random errors, more commonly called "hallucinations."
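To make the risk concrete, below is a minimal Python sketch of one common hedge against hallucinations: sample the same question several times and trust only answers the model gives consistently. The generate_answer function is a hypothetical stand-in for a real model call, and the sampling threshold is an illustrative assumption, not an established standard.

    import random
    from collections import Counter

    def generate_answer(question: str) -> str:
        # Hypothetical stand-in for a call to a generative model; a real system
        # would query an actual model here. The toy answers simulate the fact that
        # repeated sampling can yield inconsistent, occasionally wrong output.
        return random.choice(["Paris", "Paris", "Paris", "Paris", "Lyon"])

    def self_consistent_answer(question: str, samples: int = 5, threshold: float = 0.8):
        # Ask the same question several times and keep the most common answer only
        # if it appears often enough; otherwise return None and flag for human review.
        answers = Counter(generate_answer(question) for _ in range(samples))
        best, count = answers.most_common(1)[0]
        return best if count / samples >= threshold else None

    print(self_consistent_answer("What is the capital of France?"))

Consistency checks of this sort reduce, but do not eliminate, the risk of a confidently repeated error, which is one reason to treat safety-critical uses differently.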
While there are many valuable uses for these generative AI algorithms, they should not be used for the most sensitive purposes, like final targeting and firing decisions in the DOD. The CSIS report recommends that firing and targeting algorithms be among a set of "sensitive" algorithms that require explicit approval in the rules of engagement (ROE) approved by the secretary of defense for a particular operation.
Safety-critical uses in general should not rely on generative AI, at least not until there are substantial improvements in managing its risk of random errors. Domain experts need to play a critical role in identifying and shaping the use of generative AI in "safety-critical" roles.
Aligning AI and Human Goals
Alignment problems are a second concern about the mechanism of algorithms. Alignment problems arise because of the difficulty of precisely characterizing goals. This has long been a problem with reinforcement learning, an algorithmic approach that devises an optimal strategy to achieve the specified goal.
Alignment problems could arise when the AI misunderstands or misinterprets the goal that the human specified for the algorithm. A simple example might be a human interlocutor directing an autonomous car to get to the airport in time to catch a flight, but failing to convey that there should be no serious accidents or reckless driving en route.
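A minimal Python sketch can make the airport example concrete. The state fields and penalty weights below are invented for illustration; the point is only that an optimizer maximizes what is written down, not what the human meant.

    from dataclasses import dataclass

    @dataclass
    class TripState:
        minutes_elapsed: float   # how long the trip took
        arrived: bool            # did the car reach the airport in time?
        near_misses: int         # reckless maneuvers en route
        collisions: int          # serious accidents en route

    def literal_reward(s: TripState) -> float:
        # Rewards only what the human literally asked for: get there quickly.
        return (100.0 if s.arrived else 0.0) - s.minutes_elapsed

    def intended_reward(s: TripState) -> float:
        # Adds the safety constraints the human assumed but never stated.
        return literal_reward(s) - 50.0 * s.near_misses - 1000.0 * s.collisions

    # A reckless-but-fast trip outscores a careful one under the literal objective.
    reckless = TripState(minutes_elapsed=25, arrived=True, near_misses=4, collisions=0)
    careful = TripState(minutes_elapsed=40, arrived=True, near_misses=0, collisions=0)
    print(literal_reward(reckless) > literal_reward(careful))    # True
    print(intended_reward(reckless) > intended_reward(careful))  # False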
Alignment problems will expand beyond reinforcement learning as AI agents operate through foundation models. Foundation models are the marquee AI models of this era, like OpenAI's GPT-5 and Anthropic's Claude. These models have a generative AI algorithm at their heart but are refined with supervised and reinforcement learning algorithms.
Such AI agents are only starting to creep into use but are being considered for tasks such as making airplane and hotel reservations for a trip or, perhaps eventually, responding to an episode of overheating at a nuclear power plant. The risks of misaligned goals between the AI agent and the human become more challenging as AI is asked to perform more difficult tasks, and more dangerous as AI becomes more powerful, creative, and interactive on the web.
The second source of mistakes in AI algorithms is in the training database. The AI used today is trained on databases, most of which contain examples of the phenomena of interest. These phenomena might be human genomes with cancer, potential consumers with certain interests and demographic traits, or perhaps mechanized vehicles, some of which are civilian school buses and others of which are military troop transports.
The algorithm is trained by closely studying the database. Once the algorithm has been optimized, it goes out into the real world and examines previously unseen examples of the phenomena of interest: deciding whether the patient has cancer, whether the consumer should be contacted with a particular ad, or whether a vehicle is a legitimate target for the military.
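A minimal Python sketch of this train-then-deploy pattern follows. The features and labels are randomly generated stand-ins chosen for illustration; no real targeting data or fielded system is implied.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 4))       # e.g., length, width, heat signature, speed
    y = (X[:, 2] > 0.5).astype(int)      # toy rule: 1 = troop transport, 0 = school bus

    X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.2, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # "Previously unseen examples": the model now labels vehicles it never saw in training.
    predictions = model.predict(X_new)
    print("held-out accuracy:", (predictions == y_new).mean())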
Drift
Training databases can cause problems of various sorts. The training database can become out of date if the phenomena it describes start to change in the real world. For example, a training database that contains the insight that fashionable people are all wearing skinny jeans will soon be generating recommendations that seem perilously out of vogue. This type of problem is called drift: The real world has drifted away from the one that generated the training database.
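One common way to catch drift is to compare the distribution of an input feature in the training data against the same feature in recent live data and raise a flag when they diverge. The sketch below uses a two-sample Kolmogorov-Smirnov test for that comparison; the synthetic data and the significance threshold are illustrative assumptions.

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(1)
    training_feature = rng.normal(loc=0.0, scale=1.0, size=5000)   # what the model learned from
    live_feature = rng.normal(loc=0.8, scale=1.0, size=500)        # what the world looks like now

    statistic, p_value = ks_2samp(training_feature, live_feature)
    if p_value < 0.01:
        print("Possible drift: live data no longer matches the training distribution.")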
Poisoned Data
This gap between the training data and the real world could also be created deliberately, with an adversary purposefully manipulating the data to cause the algorithm to generate the wrong result. This is called "poisoned data." Cybercriminals use it to fool the financial community, and it should certainly be expected to occur in war.
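The sketch below illustrates the simplest form of poisoning, flipping labels in the training set, and shows how it degrades a toy classifier. All data here is synthetic and invented for illustration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    X = rng.normal(size=(2000, 3))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clean = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # An adversary flips 30 percent of the training labels before the model is trained.
    poisoned_y = y_train.copy()
    flip = rng.random(len(poisoned_y)) < 0.30
    poisoned_y[flip] = 1 - poisoned_y[flip]
    poisoned = LogisticRegression(max_iter=1000).fit(X_train, poisoned_y)

    print("clean model accuracy:   ", clean.score(X_test, y_test))
    print("poisoned model accuracy:", poisoned.score(X_test, y_test))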
Edge Cases
Similarly, training databases need to be diverse. This means that they need to contain a good reflection of the entire phenomenon. Important but somewhat unusual data are called "edge cases." Autonomous cars provide a powerful example of the importance of including edge cases in the training database. In driving, the edge cases would often reveal particularly hazardous conditions or difficult intersections.
Because these events are unusual, the database is unlikely to have a thorough set of them. Yet because they are potentially challenging and dangerous, it is especially important to ensure that they are represented in the data, particularly for navigation systems. Thorough coverage of edge cases would be equally important for flying a drone or guiding an autonomous submersible.
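A minimal sketch of an edge-case coverage check follows. The scenario tags, counts, and minimum-examples threshold are invented for illustration; the point is simply that rare, high-risk situations can be counted and flagged before training begins.

    from collections import Counter

    MIN_EXAMPLES = 50  # assumed floor per high-risk category, for illustration only

    # Each training example carries a scenario tag assigned during data collection.
    scenario_tags = (["clear_highway"] * 9000 + ["night_rain"] * 120 +
                     ["unprotected_left_turn"] * 30 + ["pedestrian_occluded"] * 4)

    counts = Counter(scenario_tags)
    for scenario in ["night_rain", "unprotected_left_turn", "pedestrian_occluded"]:
        if counts[scenario] < MIN_EXAMPLES:
            print(f"Under-represented edge case: {scenario} ({counts[scenario]} examples)")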
The third source of errors in algorithms is the user's interpretation of the result. A powerful problem here is automation bias: the tendency for a human to follow the direction of a complicated quantitative process, particularly in a high-pressure, time-sensitive context such as war.
Domain experts need to demand more of themselves and their colleagues in AI. AI, on whatever timescale, is likely to be transformative. Domain experts, whether by insisting on an active role or accepting a passive one, will be at least partially responsible for whether AI is transformative principally for good or ill.
Carol Kuntz is an adjunct fellow (non-resident) with the Strategic Technologies Program at the Center for Strategic and International Studies in Washington, D.C.
Commentary is produced by the Center for Strategic and International Studies (CSIS), a private, tax-exempt institution focusing on international public policy issues. Its research is nonpartisan and nonproprietary. CSIS does not take specific policy positions. Accordingly, all views, positions, and conclusions expressed in this publication should be understood to be solely those of the author(s).
© 2025 by the Center for Strategic and International Studies. All rights reserved.