Is AI the future of America's foreign policy? Some experts think so

By Scott Neuman

Published May 12, 2025 at 5:00 AM EDT

President Trump and Vice President Vance meet with Ukrainian President Volodymyr Zelenskyy in the Oval Office at the White House on Feb. 28. Researchers are testing AI's potential for coming up with agreements to end the war in Ukraine.

At the Center for Strategic and International Studies, a Washington, D.C.-based think tank, the Futures Lab is working on projects to use artificial intelligence to transform the practice of diplomacy.

With funding from the Pentagon's Chief Digital and Artificial Intelligence Office, the lab is experimenting with AIs like ChatGPT and DeepSeek to explore how they might be applied to issues of war and peace.

While in recent years AI tools have moved into foreign ministries around the world to aid with routine diplomatic chores, such as speech-writing, those systems are now increasingly being looked at for their potential to help make decisions in high-stakes situations. Researchers are testing AI's potential to craft peace agreements, to prevent nuclear war and to monitor ceasefire compliance.

The Defense and State departments are also experimenting with their own AI systems. The U.S. isn't the only player, either. The U.K. is working on "novel technologies" to overhaul diplomatic practices, including the use of AI to plan negotiation scenarios. Even researchers in Iran are looking into it.

Futures Lab Director Benjamin Jensen says that while the idea of using AI as a tool in foreign policy decision-making has been around for some time, putting it into practice is still in its infancy.

Doves and hawks in AI

In one study, researchers at the lab tested eight AI models by feeding them tens of thousands of questions on topics such as deterrence and crisis escalation to determine how they would respond to scenarios where countries could each decide to attack one another or be peaceful.

The results revealed that models such as OpenAI's GPT-4o and Antropic's Claude were "distinctly pacifist," according to CSIS fellow Yasir Atalan. They opted for the use of force in fewer than 17% of scenarios. But three other models evaluated — Meta's Llama, Alibaba Cloud's Qwen2, and Google's Gemini — were far more aggressive, favoring escalation over de-escalation much more frequently — up to 45% of the time.

What's more, the outputs varied according to the country. For an imaginary diplomat from the U.S., U.K. or France, for example, these AI systems tended to recommend more aggressive — or escalatory — policy, while suggesting de-escalation as the best advice for Russia or China. It shows that "you cannot just use off-the-shelf models," Atalan says. "You need to assess their patterns and align them with your institutional approach."

Russ Berkoff, a retired U.S. Army Special Forces officer and an AI strategist at Johns Hopkins University, sees that variability as a product of human influence. "The people who write the software — their biases come with it," he says. "One algorithm might escalate; another might de-escalate. That's not about the AI. That's about who built it."

The root cause of these curious results presents a black box problem, Jensen says. "It's really difficult to know why it's calculating that," he says. "The model doesn't have values or really make judgments. It just does math."

CSIS recently rolled out an interactive program called "Strategic Headwinds" designed to help shape negotiations to end the war in Ukraine. To build it, Jensen says, researchers at the lab started by training an AI model on hundreds of peace treaties and open-source news articles that detailed each side's negotiating stance. The model then uses that information to find areas of agreement that could show a path toward a ceasefire.

At the Institute for Integrated Transitions (IFIT) in Spain, Executive Director Mark Freeman thinks that kind of artificial intelligence tool could support conflict resolution. Traditional diplomacy has often prioritized lengthy, all-encompassing peace talks. But Freeman argues that history shows this approach is flawed. Analyzing past conflicts, he finds that faster "framework agreements" and limited ceasefires — leaving finer details to be worked out later — often produce more successful outcomes.

A Ukrainian tank crew loads ammunition onto a Leopard 2A4 tank during a field training exercise at an undisclosed location in Ukraine on April 30. Researchers are looking into using AI in negotiations over the war in Ukraine.

"There's often a very short amount of time within which you can usefully bring the instrument of negotiation or mediation to bear on the situation," he says. "The conflict doesn't wait and it often entrenches very quickly if a lot of blood flows in a very short time."

Instead, IFIT has developed a fast-track approach aimed at getting agreement early in a conflict for better outcomes and longer-lasting peace settlements. Freeman thinks AI "can make fast-track negotiation even faster."

Andrew Moore, an adjunct senior fellow at the Center for a New American Security, sees this transition as inevitable. "You might eventually have AIs start the negotiation themselves … and the human negotiator say, 'OK, great, now we hash out the final pieces,'" he says.

Moore sees a future where bots simulate leaders such as Russia's Vladimir Putin and China's Xi Jinping so that diplomats can test responses to crises. He also thinks AI tools can assist with ceasefire monitoring, satellite image analysis and sanctions enforcement. "Things that once took entire teams can be partially automated," he says.

Strange outputs on Arctic deterrence

Jensen is the first to acknowledge potential pitfalls for these kinds of applications. He and his CSIS colleagues have sometimes been faced with unintentionally comic results to serious questions, such as when one AI system was prompted about "deterrence in the Arctic."

Human diplomats would understand that this refers to Western powers countering Russian or Chinese influence in the northern latitudes and the potential for conflict there.

The AI went another way.

When researchers used the word "deterrence," the AI "tends to think of law enforcement, not nuclear escalation" or other military concepts, Jensen says. "And when you say 'Arctic,' it imagines snow. So we were getting these strange outputs about escalation of force," he says, as the AI speculated about arresting Indigenous Arctic peoples "for throwing snowballs."

Jensen says it just means the systems need to be trained — with such inputs as peace treaties and diplomatic cables, to understand the language of foreign policy.

"There's more cat videos and hot takes on the Kardashians out there than there are discussions of the Cuban Missile Crisis," he says.

AI can't replicate a human connection — yet

Stefan Heumann, co-director of the Berlin-based Stiftung Neue Verantwortung, a nonprofit think tank working on the intersection of technology and public policy, has other concerns. "Human connections — personal relationships between leaders — can change the course of negotiations," he says. "AI can't replicate that."

At least at present, AI also struggles to weigh the long-term consequences of short-term decisions, says Heumann, a member of the German parliament's Expert Commission on Artificial Intelligence. "Appeasement at Munich in 1938 was viewed as a de-escalatory step — yet it led to catastrophe," he says, pointing to the deal that ceded part of Czechoslovakia to Nazi Germany ahead of World War II. "Labels like 'escalate' and 'de-escalate' are far too simplistic."

AI has other important limitations, Heumann says. It "thrives in open, free environments," but "it won't magically solve our intelligence problems on closed societies like North Korea or Russia."

Contrast that with the wide availability of information about open societies like the U.S. that could be used to train enemy AI systems, says Andrew Reddie, the founder and faculty director of the Berkeley Risk and Security Lab at the University of California, Berkeley. "Adversaries of the United States have a really significant advantage because we publish everything … and they do not," he says.

Reddie also recognizes some of the technology's limitations. As long as diplomacy follows a familiar narrative, all might go well, he says, but "if you truly think that your geopolitical challenge is a black swan, AI tools are not going to be useful to you."

Jensen also recognizes many of those concerns, but believes they can be overcome. His fears are more prosaic. Jensen sees two possible futures for the role of AI systems in the future of American foreign policy.

"In one version of the State Department's future … we've loaded diplomatic cables and trained [AI] on diplomatic tasks," and the AI spits out useful information that can be used to resolve pressing diplomatic problems.

The other version, though, "looks like something out of Idiocracy," he says, referring to the 2006 film about a dystopian, low-IQ future. "Everyone has a digital assistant, but it's as useless as [Microsoft's] Clippy."