
DC Keynote: Autonomous Systems and the Next Revolution in Military Affairs

by Ethan Kolasky • 7/11/2025

Introduction

Thank you, David, for the lovely intro. As David said, I come from the AI research field, and today we're going to talk about autonomous systems, AI, and the next revolution in military affairs.

The HMS Prince of Wales: A Historical Lesson

Let me start with a short story. In 1939, the HMS Prince of Wales was launched as Britain's most advanced battleship, with fourteen-inch guns, heavy armor, and a modern anti-aircraft system. By December 1941, it was at the bottom of the sea off the coast of Malaya, sunk by Japanese torpedo bombers at the cost of just four Japanese aircraft. In a single strike, Japan destroyed two of Britain's most advanced naval assets, the Prince of Wales and the battlecruiser HMS Repulse, worth over two billion dollars in today's money.

This is an example of a revolution in military affairs, and we can see it in the numbers. At the start of World War II, the American navy was built around battleships: seventeen battleships to eight aircraft carriers. By the end of the war, the emphasis had completely flipped: twenty-three battleships to ninety-six aircraft carriers.

A good definition of a revolution in military affairs is a fundamental advance in technology, doctrine, or organization that renders existing methods of conducting warfare obsolete. And we're going through several simultaneous RMAs today. These include AI, automation, and cyber.

For this presentation, we're going to talk about these topics. We're going to define what AI really is, how AI has been used in the conventional kill chain, the role of AI in supervised learning, how that differs from generative AI, and then the interplay between AI and cybersecurity.

What is AI, Really?

Let's start with the definition. What's AI, really? Artificial intelligence is the ability of machines to replicate data by learning a mathematical equation that fits that data. If this sounds like statistics, that's because it is. Statistical techniques like linear regression are AI, just scaled up to fit complex domains like human intelligence, language, and reasoning.

When we talk about an AI model—a question I get a lot from people in policy spaces—what are we actually talking about? An AI model is two parts. You've got the architecture, which is the mathematical equation, and the weights, which are the numbers we plug into the mathematical equation to make it work. To get the weights, we have the training algorithm and the training data. The training algorithm takes the training data and converts it down into those numbers.

A Concrete Example

Let's look at this through a very concrete example. We've got a data distribution with L's and X's on a 2D plot, and we're trying to build a model to replicate this data distribution. If we start by learning linear models, we separate the data by a straight line. We've got this linear regression equation: y equals ax plus b. This is the architecture and the weights, and we learn the values for a and b that fit the data.
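That fitting process can be sketched in a few lines of code. This is only an illustration: the data points below are invented, and I use a one-dimensional least-squares fit of y = ax + b rather than the two-class boundary on the slide, but "learning the weights a and b from data" is exactly this.

```python
# Minimal sketch: "learning" the weights a and b of y = a*x + b from
# data, using the closed-form least-squares solution. The data points
# are made up for illustration; they lie near the line y = 2x + 1.

def fit_line(xs, ys):
    """Return (a, b) minimizing the squared error of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = fit_line(xs, ys)   # the learned "weights" of the model
```

The architecture here is the equation y = ax + b; the weights are the two numbers the training procedure produces.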

Now, obviously, this model doesn't perfectly fit the data. We've got L's and X's on the wrong side of the line. So how do we fit the data better? We use a more complicated equation. We add a second linear regression equation: we take the output from the first, pass it through a simple nonlinearity, and feed it into the second. (The nonlinearity matters: stacking purely linear equations would just collapse back into another straight line.) By doing that, we get a more complicated equation that can better fit the data.

This is essentially a neural network. Each node in the network is one of those linear regression equations, and when we scale up to massive numbers of nodes, we end up with a network that can learn an extremely complex mathematical equation. The compelling story here is that the things we're actually interested in, things like intelligence, reasoning, or even piloting a fighter jet, can be modeled as mathematical equations, which we can approximate with neural networks.
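A tiny worked example of "more than a straight line": the XOR pattern, where no single line can separate the two classes, but two stacked linear units plus a nonlinearity can. The weights below are hand-picked for illustration rather than trained.

```python
# A two-layer network solving XOR, which no single straight line can
# separate. Each hidden unit is a linear equation (a1*x1 + a2*x2 + b)
# followed by a nonlinearity (here, a step function). Weights are
# hand-picked for illustration, not learned by training.

def step(z):
    return 1.0 if z > 0 else 0.0

def tiny_network(x1, x2):
    # Layer 1: two linear units, each followed by the nonlinearity.
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # fires if x1 OR x2
    h2 = step(1.0 * x1 + 1.0 * x2 - 1.5)    # fires if x1 AND x2
    # Layer 2: one more linear unit on the hidden outputs.
    return step(1.0 * h1 - 2.0 * h2 - 0.5)  # "OR but not AND" = XOR

outputs = [tiny_network(x1, x2) for (x1, x2) in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

Without the `step` nonlinearity between layers, the composition of the two linear layers would itself be linear, and XOR would remain out of reach.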

AI in the Kill Chain

Obviously, this is incredibly powerful technology, and it has been applied in the kill chain. We're going to look at this through the lens of the conventional kill chain, which has five steps: find (finding the target), fix (fixing the target's location), fire (deploying an asset), finish (destroying the target), and feedback (after-action reports).

The find, fix, and feedback stages use some of the more conventional aspects of AI, things like supervised learning and classic machine learning. We've got companies like BlackSky using image models for satellite surveillance, and Palantir using underwater audio models to track things like torpedo signatures, doing cross-domain analysis, and training machine learning models on that data.

The remaining stages, fire and finish, are where some of the most science-fictional, potentially revolutionary aspects of generative AI come in. These are things like Saronic building drone boats, Kratos building unmanned fighter jets, and Anduril building unmanned underwater vehicles.

Supervised Learning vs. Generative AI

Let's cover the distinction between supervised learning and generative AI, and then dive into what that distinction actually means. Many of you, probably all of you, have heard of generative AI, but many of you may not know what it actually entails.

With supervised learning, we're taking input and mapping it to an output. The input might be something like "The new Marvel movie is amazing." The output would be a classification, like: is the sentiment of the sentence positive or negative?

Generative AI is different. Generative AI is taking input and learning the data distribution from the input. In this case, you might have "The new Marvel movie is," and learn the possible next words and have a probability for every potential next word. If you have those probabilities, you can model the entire underlying data. You can replicate the data by running the model and progressively outputting each word in the sentence.
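The contrast can be made concrete in a few lines. This is a deliberately toy sketch: the keyword "classifier" and the two-sentence corpus below are invented, and the generative side is just bigram counts, but the shape of the two problems, input-to-label versus input-to-next-word-distribution, is the real distinction.

```python
from collections import Counter, defaultdict

# --- Supervised flavor: map an input sentence to a label. ---
# (A real system learns this mapping; here it is hard-coded keywords.)
POSITIVE = {"amazing", "great", "love"}
NEGATIVE = {"terrible", "boring", "hate"}

def classify(sentence):
    words = sentence.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score >= 0 else "negative"

# --- Generative flavor: learn a probability for every next word. ---
corpus = "the new marvel movie is amazing . the new marvel movie is boring ."
counts = defaultdict(Counter)
tokens = corpus.split()
for prev, nxt in zip(tokens, tokens[1:]):
    counts[prev][nxt] += 1

def next_word_probs(word):
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

probs = next_word_probs("is")  # distribution over words following "is"
```

Running the generative model repeatedly, sampling one next word at a time, is exactly the "progressively outputting each word" process described above.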

We're going to start by talking about supervised learning, because it's a revolution in military affairs in its own right.

Image Models and Supervised Learning

With images, as I'm sure most of you are aware, an image is a collection of pixels. In a hundred by hundred pixel image, you've got ten thousand pixels. Each pixel is three numbers that represent the color. So you've got thirty thousand separate numbers that make up the image. You can train an AI model to take that group of numbers and do the categorization for what the image represents. You take the input image, feed it through the model, and get out the category.
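To make "an image is just numbers" concrete, here is a minimal sketch using invented 2x2 grayscale images (one number per pixel rather than the three color numbers described above): flatten each image into a vector, then classify by distance to a per-category average, a crude stand-in for the trained model in the text.

```python
# An image really is just a grid of numbers. Tiny invented 2x2
# grayscale "images" are flattened into vectors and classified by
# nearest category centroid. Illustration only; real models learn
# far richer decision rules.

def flatten(image):
    return [px for row in image for px in row]

# Invented training data: "bright" vs "dark" images.
bright = [[[0.9, 0.8], [0.95, 0.85]], [[0.7, 0.9], [0.8, 0.75]]]
dark   = [[[0.1, 0.2], [0.05, 0.15]], [[0.3, 0.1], [0.2, 0.25]]]

def centroid(images):
    vecs = [flatten(im) for im in images]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

centroids = {"bright": centroid(bright), "dark": centroid(dark)}

def classify_image(image):
    v = flatten(image)
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(v, c))
    return min(centroids, key=lambda name: dist(centroids[name]))

label = classify_image([[0.85, 0.9], [0.8, 0.95]])
```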

One of the earliest breakthroughs in supervised learning was realizing that making these models really good takes a massive amount of training data. In 2009, Fei-Fei Li's group at Stanford released ImageNet, one of the first truly large-scale image databases, which grew to over fourteen million images across more than twenty thousand categories. By using this dataset, researchers were able to make massive performance improvements in image models. Although a ton of architectural improvements went into making these models really good, the underlying data powered all of them.

You see this across supervised learning domains, whether it's image categorization, audio categorization, or facial recognition—there's a massive benefit to having larger and larger amounts of training data, which leads to superhuman performance.

Military Applications

In the context of military affairs, this means we can do the job of intelligence analysts at scale. In just a few years, we've gone from flip books with mugshots for recognizing suspects to doing facial recognition at scale in places like a crowded train station. This is having an offensive effect on military affairs.

We can see this with Israel's campaign against Hezbollah. Most of you know this campaign through the pager attack early on. What many of you may not know is that even after the pager attack, Israel was able to continuously strike and eliminate Hezbollah leadership for weeks afterwards. They were able to do this by using AI surveillance at scale.

There's a Chinese company called Dahua that makes cameras, and these cameras are prevalent in Beirut. It turns out they have massive security vulnerabilities, like shipping with default passwords. There's an Israeli firm that specializes in hacking these cameras, using facial recognition software to identify suspects, and then using machine learning to compute heat maps of suspect movements across the city. They can identify safe houses, groups of commanders, and concentrations of Hezbollah troops across Beirut. Using this data, Israel was able to substantially eliminate Hezbollah's leadership. This plays an offensive role, but generative AI is potentially even more revolutionary.

The Generative AI Revolution

To explore this, we'll first talk about the breakthroughs that led to the generative AI revolution. One of them is a paper called "Attention Is All You Need," which proposed the transformer architecture. As a reminder, an architecture is the mathematical equation we fit to the data.

This architecture has several advantages. First, transformers can model incredibly diverse types of data. They're incredibly multimodal by design. You can model text, images, videos, 3D objects, genomics, audio, robot joint movements—pretty much whatever you want to plug into a transformer works.

The second advantage is that they allow for parallel training at scale. With previous architectures, you trained sequentially: if you're training on a sentence, you train word by word, in order. You train on the first word, then the second, then the third, and so on. A transformer processes the entire sentence in a single parallel pass. This means you can train at scale on modern compute like GPUs.
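The parallelism comes from the attention step at the heart of the transformer, which is just matrix multiplication over the whole sequence at once. Here is a miniature sketch; the word vectors and weight matrices are random stand-ins for learned values, so this shows the shape of the computation, not a trained model.

```python
import numpy as np

# Scaled dot-product self-attention in miniature. The entire
# "sentence" (a matrix of word vectors) is processed in one set of
# matrix operations, not word by word. Vectors and weights here are
# random stand-ins for learned values.

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d) matrix of word vectors. One parallel step."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # every word attends to every word
    weights = softmax(scores)                # each row is a probability distribution
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 4, 8                            # a 4-"word" sentence
X = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Because `scores` is computed for all word pairs in one matrix product, a GPU can chew through the whole sequence at once, which is exactly what makes training at scale practical.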

Scaling Laws

This enables what are called scaling laws. Scaling laws are predictive equations that tell you how much better these models will become as you scale up the training data. They're predictive because you can say, "Okay, I want a model with a certain performance; I need this much training data and this much compute."
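A scaling law typically takes a power-law form like the one popularized by the Chinchilla paper, L(N, D) = E + A/N^alpha + B/D^beta, where N is parameter count and D is training tokens. The sketch below uses that form with illustrative placeholder constants, not a real fit, just to show how the "predictive equation" framing works.

```python
# Chinchilla-style scaling-law sketch: predicted loss as a function of
# parameter count N and training tokens D. The constants are
# illustrative placeholders, not a fitted law.

def predicted_loss(N, D, E=1.7, A=400.0, B=410.0, alpha=0.34, beta=0.28):
    # E is the irreducible loss floor; the two power-law terms shrink
    # as you add parameters and training data.
    return E + A / N**alpha + B / D**beta

small = predicted_loss(N=1e9,  D=2e10)    # ~1B params, 20B tokens
large = predicted_loss(N=7e10, D=1.4e12)  # ~70B params, 1.4T tokens
```

Plug in a target loss and the equation tells you roughly what N and D (and hence compute budget) you need, which is exactly the planning exercise described next.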

When OpenAI proposes building a billion-dollar model, they can look at the scaling laws, see what performance they'll get out of it, and justify the price they're paying for the model. A lot of the breakthroughs in generative AI have come simply from looking at the scaling laws and saying, "These hold. Let's use more training data and compute and get better-performing models."

Post-Training

But these pre-trained models are not a commercial product. What takes you from pre-trained models to commercial products is post-training. Post-training takes you from good next-word completion to things like question answering in ChatGPT, coding assistants like Claude 3.7, and robotics tasks like folding laundry with humanoid robots.

For this presentation, we're going to talk about reinforcement learning, which is one class of techniques in post-training. The reason we're focusing on reinforcement learning is because it's potentially the most revolutionary post-training technique. Reinforcement learning lets you go from models that perform at human level to superhuman performance across various different domains.

A good definition of reinforcement learning: take a virtual environment to run a model in, and a set of metrics to evaluate the model's performance. Run the model in simulation in that environment, and use the metrics to progressively improve its performance.

With this example on screen, we can see we're training robot dogs. The robot dogs are walking across terrain in a virtual environment, and the metrics might be: are the robot dogs moving fast? Are they stumbling? When they stumble, how often are they on the ground? With these metrics, you can quickly train the robotics model to be able to walk on its own. It turns out that you can take these out of the virtual environment to perform really well in the real world. You see Boston Dynamics' Spot robot walking on really rough terrain. This is powered and enabled by reinforcement learning.
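The loop described above, run in an environment, score with a metric, improve, can be sketched with the simplest possible setup: Q-learning in a one-dimensional corridor. The environment and numbers below are invented for illustration; real robotics training uses rich physics simulators and many metrics, but the structure is the same.

```python
import random

# Minimal reinforcement learning: Q-learning in a tiny "virtual
# environment" (a 1-D corridor of 5 cells) with a single metric
# (reward for reaching the right end). Invented for illustration.

random.seed(0)
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # step left / step right
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(200):
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if random.random() < 0.1:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if s2 == GOAL else 0.0
        # Update the estimate of "how good is action a in state s".
        Q[s][a] += 0.5 * (reward + 0.9 * max(Q[s2]) - Q[s][a])
        s = s2

# The learned policy: best action per state (1 means "step right").
policy = [max((0, 1), key=lambda i: Q[s][i]) for s in range(N_STATES)]
```

After a couple hundred simulated episodes, the metric has shaped the policy to walk straight to the goal, the same dynamic, at toy scale, as the robot dogs learning to walk.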

Revolutionary Effects in Ukraine

These techniques are having a revolutionary effect in military affairs, as we see on the ground in Ukraine. Drones without any automation have a ten to twenty percent success rate at hitting targets. With automation, we reach a seventy percent success rate. It's important to note that these are not fully autonomous vehicles. They've got some autonomy, like last-mile striking capabilities. They're not end-to-end autonomous, but still, the level of autonomy being implemented is having a revolutionary effect.

What it means is that we're increasingly denying the battlefield to conventional large weapon systems. We see this in the Black Sea, where Ukraine can repeatedly take out Russian warships and essentially push the Russian Navy out of the Black Sea, and even on land, with Russia being able to defeat the Ukrainian counteroffensive by using things like drones.

There's a quote I saw: "The mark of a revolution in military affairs is when what was previously invaluable becomes almost worthless." We saw this with battleships in World War II, and we see it today with conventional weapon systems like Abrams tanks and warships. Things that cost militaries millions or even billions of dollars are becoming unusable on the battlefield.

The Next Wave

This is just the beginning. We're seeing the rise of unmanned ground vehicles as the next wave of automation in Ukraine. These ground vehicles have been quickly adopted. In 2024, Ukraine had a hundred ground vehicles deployed. By 2025, Ukraine is trying to deploy fifteen thousand separate ground vehicles, and they're spending five hundred million dollars to make this happen.

The US, for its part, is not standing still. We've got a range of autonomy initiatives across unmanned fighter jets, unmanned surface vehicles, and unmanned rocket launchers.

Cyber as an RMA

Obviously, this is all fascinating, but how does cyber play into this RMA? Cyber is its own RMA, but it's also got incredibly fascinating interplays with AI, and we're going to explore these.

Cyber is its own RMA because it allows for two things: it allows for striking targets far behind the front lines, and it allows for striking targets at scale instantaneously.

Critical Infrastructure Attacks

In terms of striking targets behind the front lines, we see this with critical infrastructure. As you can see on screen, we've got a graph of the different categories of critical infrastructure and the relevant incidents against each category. Almost every single category is being hit, whether it's nuclear, water treatment facilities, agriculture, oil and gas—everything is being struck.

Supply Chain Attacks

With supply chain attacks, what's different about cyber is that you can attack things at scale. With conventional supply chain attacks, you strike individual nodes in the network, doing things like bombing ports, and slowly degrade the supply chain until the whole thing collapses. With cyber, you can take out the entire supply chain in one step, because the nodes share the same software stack. You can almost instantly cripple a supply chain; the flip side is that it then recovers relatively quickly, because physical attacks do more lasting damage than cyberattacks, or are at least harder to recover from.

We saw this with the NotPetya attack, in which a Russian virus spread from Ukrainian financial infrastructure to South Korean ports operated by Maersk, doing hundreds of millions of dollars of damage and taking the ports offline for two weeks.

AI in Offensive Cyber

How does AI play into this cyber RMA? This slide shows the MITRE ATT&CK framework. The MITRE ATT&CK framework maps out how threat actors can move around a system. It gives you tactics and techniques that threat actors use while within the system. Almost every stage of the MITRE ATT&CK framework has some level of AI being applied to it.

There's a fascinating report by Google about how chatbots are being used to automate steps in the MITRE ATT&CK framework. The steps include reconnaissance—Iranian-linked threat actors used AI to research defense and nuclear facilities. Weaponization—using AI to find vulnerabilities in code. And information operations—using AI to write propaganda and generate personas.

AI-Powered Phishing

Phishing is one of the most revolutionary aspects and one of the most commonly used applications of AI in offensive cyber today. There's a report showing that about two percent of phishing attacks now use AI, and, according to a study out of Harvard, a stunning fifty-four percent of AI-powered phishing attacks are successful.

This is only the beginning. AI is going to automate larger and larger chunks of the attack chain, including things like finding vulnerabilities and writing exploit code. XBOW is one great example: it's a startup building AI agents for offensive security, and its agent became the number one performing user on the bug bounty platform HackerOne.

The Path Forward

The question is: what will it take to get from finding vulnerabilities and writing exploits to automating the entire attack chain? The answer is simply longer context. So the question becomes: when will we get there?

We talked earlier about scaling laws and how reliably they've improved model performance, but that's not the whole story. Scaling laws are running out; they're delivering diminishing marginal returns. So the question is: will there be more breakthroughs that lead to better reasoning and better performance?

This graph is my answer. If you plot the length of tasks AI agents can handle against model release dates, it's increasing exponentially. Back in 2019, GPT-2 could only do tasks that took a human about a minute. By 2023, GPT-4 could do tasks that took ten minutes. By 2025, Claude 3.7 does hour-long tasks, and the trend just keeps going.

These gains look like a series of individual breakthroughs by genius AI researchers doing incredibly impressive work. But the truth is, in aggregate they're predictable. This isn't a string of one-off miracles; it's breakthroughs occurring at a predictable rate.

Conclusion

If this continues to happen, this will be the most defining trend of our lifetime. It means that everything I've talked about, whether it's AI systems on the battlefield or AI cyber attacks, can and will come true.

Thank you, and any questions?