AI: Pilot Threat or Bean-Counter Pipe Dream? (Part 1)

Some of my old Air Force flying buddies have done tours in the DOD’s 5-Sided Buzzword Bingo Palace (aka The Pentagon). Every once in a while, one of them starts talking about “leveraging synergies between Artificial Intelligence (AI) and… [blah, blah, blah].” They frequently talk like AI is going to solve all their problems and completely replace conscious human thought in armed conflict.

After I wipe the look of disgust from my face (I’ve developed a severe buzzword aversion), I try to explain that AI technology is nowhere near capable of doing what they’re talking about. Frequently, they’re still riding a wave of good timing and good-deal assignments that hasn’t crushed their souls yet, so we end up agreeing to disagree.

I can understand how they’ve been misguided, though. Science fiction has made such a habit of imagining the future of AI that the topic isn’t even entirely the realm of nerds living in their mothers’ basements anymore. From movies like Terminator and The Matrix, where humanity’s relationship with AI goes very badly, to more optimistic predictions like Commander Data on Star Trek: The Next Generation and Isaac Asimov’s beautiful story The Bicentennial Man, we all believe that AI will be part of our future.

The problem is that SciFi authors can take liberties with technology that won’t catch up for decades. (Just look at how long it took “athleisure” to go mainstream after it was introduced by the aliens of Star Trek in the 1960s.)

I frequently hear similar discussions about AI among airline pilots. After news like FedEx and Sikorsky testing single-pilot operations on an ATR 42, some of the most pessimistic pilots I know launch into rants like:

“AI is going to replace us all!”
“Our jobs are doomed!”
“Don’t start a career in aviation. It’s too late!”
“Mark my words, there will be single-pilot widebody cargo operations next year!”

On one hand, I see this happening eventually. I’m enough of a SciFi fan that I believe computers will eventually gain the ability to safely pilot all kinds of vehicles. However, at least for the time being, reports of our demise have been greatly exaggerated.

I believe that airline pilot jobs are and will continue to be very safe for many years. I believe there’s still time for young men and women to start down our career path today and enjoy a full flying career.

Do you think otherwise? Before you jump to the comments, allow me to make my case.

How AI Works

First, it’s important to note that my BS is in Computer Engineering. My senior project was scratch-building and programming a swarm of seven autonomous robots to work together and solve a puzzle. Our project wasn’t a wild success, but I’ve stayed abreast of related developments over the years. My analysis here is based on the underlying technology…not the hopes and dreams of an Air Force General who thinks he can produce better pilots by having them fly less, or an airline exec who loses sleep trying to figure out ways to cut pilot pay.

Artificial Intelligence works very differently from traditional computer programming. Traditional computer code has to spell everything out very clearly:

  • If airspeed is greater than speed bug, then:
    • Reduce throttle
  • If airspeed is less than speed bug, then:
    • Increase throttle

This type of code has to be written by a human, and it’s subject to what we call “Garbage In, Garbage Out” (GIGO): if the programmer screws up the logic, the computer won’t do what you want it to. You’ll also notice that these four lines of code aren’t good enough. What’s the rate of our airspeed change? Are we climbing or descending? How heavy are we? A half-decent autothrottle controller will address all of those variables and more. Can you imagine how quickly that code could grow from a few lines to hundreds or even thousands?
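To give a feel for how fast those variables pile up, here’s a hypothetical rule-based autothrottle sketch in Python. Every threshold, gain, and variable name here is made up purely for illustration; a real controller would be far more involved.

```python
# A hypothetical rule-based autothrottle sketch. All gains and numbers
# are invented for illustration -- this is not a real control law.

def autothrottle_command(airspeed, speed_bug, airspeed_trend,
                         vertical_speed, gross_weight):
    """Return a throttle adjustment (positive = add power)."""
    error = speed_bug - airspeed          # knots slow (+) or fast (-)

    # Base correction proportional to the speed error.
    command = 0.5 * error

    # Anticipate the trend: if we're already accelerating toward the
    # bug, back off the correction to avoid overshoot.
    command -= 0.3 * airspeed_trend

    # Climbs bleed airspeed, descents build it; compensate.
    command += 0.001 * vertical_speed

    # Heavier aircraft respond more sluggishly; scale the gain.
    command *= gross_weight / 100_000

    return command
```

Each new consideration (flap setting, icing, an engine failure) would add more terms and branches, which is exactly how a four-line example balloons into thousands of lines.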

Debugging this kind of software means sifting through line upon line of computer code trying to find errors. It’s brutal.

It’d be nice if we could use traditional computer code to program autonomous aircraft. Unfortunately, we just can’t. There are too many variables, and the result would be millions of lines of code. If one thing went wrong, the troubleshooting process would be hopeless. Just look at what happened when one thing went wrong with the code running the original Jurassic Park….

AI is completely different. Humans don’t write the code that makes an AI work. Instead, a programmer specifies some initial conditions and basic principles, then “trains” a computer to accomplish a task. During that training, the computer essentially writes its own code. What’s crazy is that the computer does this so quickly that no human being could possibly dig through the resulting code to understand how it works. The code rapidly becomes far too complex. Instead, we just have to train the AI until it succeeds often enough to fit our risk tolerance, and then hope for the best.

(That should scare you. More importantly, that should scare airline passengers, companies that need to ship valuable goods…and politicians and generals who bear responsibility for fighting wars.)
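To make the “computer writes its own code” idea concrete, here’s a minimal sketch of a single-neuron learner in Python. A human writes only the training loop; the behavior ends up encoded in the learned weights, which nobody hand-edits. This is a toy illustration of the principle, not how a production system is built.

```python
# A minimal "the machine adjusts itself" sketch: a one-neuron classifier
# nudges its internal weights after each example. Toy data only.

def train(examples, epochs=25, lr=1.0):
    """examples: list of (feature_vector, label) pairs, labels 0 or 1."""
    n = len(examples[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # Prediction: weighted sum pushed through a threshold.
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            prediction = 1 if activation > 0 else 0
            # Learning step: nudge the weights toward the correct answer.
            error = label - prediction
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias
```

The only human-readable “code” is this loop. The actual decision-making lives in the final numbers in `weights`, which is why nobody can point to a line and say “here’s where it decided ears matter.”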

AI Training Example: Cats

Hopefully this made sense, but to hammer home these differences, we’re going to check out an example. One common AI task is image recognition, like the face-unlock feature on your mobile phone. We’ll get to some more useful applications shortly, but (in honor of my friend Coco) we’re going to start by training an AI to recognize pictures of cats.

To start out, we programmers can specify some starting conditions:

  • Cats usually have 2 ears
  • Cats usually have a tail
  • Cats are usually furry
  • Cats usually have whiskers
  • Cats are carnivorous and have sharp teeth
  • Cats walk on four legs, have paws, and generally have claws

Once we’ve specified these rules, we start training our AI by letting it try to apply them against a set of test images. We’ve carefully assembled this set of images. We know the right answers, but require the AI to make each guess before revealing the answer. So, it goes something like this:
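The quiz loop described above could be sketched like this, with `classify` standing in for whatever the AI currently does. It’s a hypothetical stand-in, not a real model; the point is only the guess-first, reveal-second, running-score structure.

```python
# Sketch of the training quiz: the AI commits to a guess before we
# reveal the answer, and we keep a running score. `classify` is a
# hypothetical stand-in for the AI's current behavior.

def run_quiz(classify, labeled_images):
    correct = 0
    for total, (image, truth) in enumerate(labeled_images, start=1):
        guess = classify(image)          # AI answers before seeing the truth
        if guess == truth:
            correct += 1
        print(f"{'Right' if guess == truth else 'Wrong'}: {correct}/{total}")
    return correct, len(labeled_images)
```

A classifier that blindly answers “cat” every time would score exactly like our AI early on: right on the cats, wrong on everything else.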

Hey AI, is this a cat?

Yes Emet, this is a cat.

Not bad AI. You’re 1/1. Now try this one:

Yes Emet, this is a cat.

Well, I didn’t program you to be a conversationalist, but yes, you got it. You’re 2/2. Next:

Yes Emet, this is a cat.

Wrong! So long, AI! You failed! The reign of our Robot Overlords has not yet come! Current score: 2/3.

At this point, the AI has to make some adjustments. It examines the picture and thinks to itself:

The object in this picture has two ears, four paws, claws, whiskers, and it is furry. Teeth and tail were not visible. Note ratio of face size to snout and nose size. Note slightly different ear shape. Future probability determinations will pay more attention to those ratios and shapes.

Note that we don’t actually know how it thought through all this. These observations and adjustments all happen in a matter of microseconds between the AI seeing the last picture and this one:

No Emet, this is not a cat.

Wrong again AI! You’re only 2/4 now. Better shape up or you’re scrap!

Why did our AI fail to correctly ID this cat? It only saw 1 leg/paw and the overall body shape was nothing like the other cats it’s seen. The cat’s eyes were closed, and the animal was in an odd orientation. The fur was fluffed and the AI couldn’t discern body shape against an otherwise chaotic background.

Let’s try again AI:

Yes Emet, this is a cat.

Whomp whomp! You’re now 2/5 AI.

In our poor AI’s defense, this was a tough one. If it weren’t for the long snout, tongue, and eyes, I bet very young human children wouldn’t do better than 50/50 at correctly identifying this one as a dog rather than a cat.

Our AI learns from this. It makes more observations about fur density, body size and ratios, and stance. Focusing on the snout, eyes, and nose gives it a better idea of what to look for. However, other pictures will throw our AI for different loops:


No Emet, this is not a cat.

You’re down to 2/6. That’s really terrible performance.

That one is almost excusable, right? There’s a lot in this picture that doesn’t look remotely like a cat. Maybe the AI even thought we were asking about the butterfly itself, which would mean it sort of answered correctly. It’s never seen a cat with so much of its face covered by a distracting shape. Maybe we don’t feel too bad about this one. How many times in history has a butterfly landed on a cat’s nose anyway? The chances of this happening again are minuscule, so do we really care if it can properly identify a cat with a butterfly on its nose?

For a task as trivial as identifying cat pictures? Absolutely not. However, that principle has significant, life-altering consequences in contexts we’ll discuss shortly.

In the meantime:

Is this a cat?

No Emet, this is not a cat.

Good job AI. That was a softball, but you’re 3/7 now. How about this one:

Yes Emet, this is a cat.

And just like that, we’re up to 50%. Not bad for a plastic case bouncing electrons around for fun. Let’s try some other tough ones:

No Emet, this is not a cat.

Wrong AI. 4/9 puts you below 50% again.

Are we being unfair here? All the AI got was part of one leg and part of a head. Worse, there’s more human body in the picture than cat. It’s not the AI’s fault that the picture had so little information. And yet, you and I had no trouble getting this one right.

Eventually, the AI learns from examples like this one though. See:


Yes Emet, this is a cat.

We’re back to 50% at 5/10. Our AI has learned to choose “Yes, cat” even if it only has part of the picture. It’s getting better.

Here’s the last one:

No Emet, this is not a cat.

Nope. 5/11 overall. Worse odds than random chance.

Why did the AI miss this last one? The cat had a lot of common features, but the complete lack of fur, the wrinkly skin, and the tattooed human in the background all detracted enough from the AI’s confidence that it couldn’t be certain enough to conclude “cat.” (Personally, I’m not sure hairless cats count as cats, but I’m looking for excuses to paint AI in a bad light, so I’m not throwing it any bones!)

Full-Scale Training

If we continued this exercise, the AI’s overall performance would improve. It learns from both its successes and its failures. It’s important to note that we can’t jump in here and tweak its code ourselves. That code is already well beyond our comprehension.

If things start going wrong, the best we can do is wipe the AI’s memory, tweak the starting conditions we give it, and run it through the same set of examples to see if it performs better.
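That wipe-and-retry process might look something like the sketch below, where `train_model` and `accuracy` are hypothetical stand-ins for the real machinery. The key point is that the knobs we control are the starting conditions, not the learned behavior itself.

```python
# Sketch of "wipe the memory, tweak the starting conditions, retrain."
# We can't hand-edit the learned behavior, so we reset and try again
# with different knobs. `train_model` and `accuracy` are hypothetical
# stand-ins, not real library functions.

def find_acceptable_model(train_model, accuracy, dataset, target=0.94):
    attempts = 0
    for seed in range(10):                         # fresh blank slate each try
        for learning_rate in (0.001, 0.01, 0.1):   # starting conditions we control
            attempts += 1
            model = train_model(dataset, learning_rate, seed)
            score = accuracy(model, dataset)
            if score >= target:                    # good enough for our risk tolerance
                return model, score, attempts
    return None, 0.0, attempts
```

Notice that “success” here just means clearing a score threshold on the test set, which is exactly the “train until it fits our risk tolerance, then hope for the best” approach described earlier.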

However, our best bet is just to give it more examples and let it program itself. In real-world development, it takes hundreds of thousands, or even millions of tests like this to train an AI.

Yes, that means training an AI to recognize cats with any reasonable degree of accuracy would require a human being to assemble a dataset with millions of cat pictures, and plenty of non-cat pictures as distractions. Can you imagine how much work that is?

There are entire companies that specialize in creating these datasets and licensing them out to AI developers to train their systems. Even if it only takes a worker 30 seconds to find an image file, download it, correctly label it, and fill out the metadata the computer uses to check its answers, it would take more than 8,000 hours to complete a 1,000,000-image dataset. That’s four full-time employees working for an entire year. If those employees make $10/hr, that dataset costs $83K to develop. That doesn’t include the cost of computers, internet, a building, electricity, benefits, management, marketing, etc. It also assumes it’s possible to find a million cat pictures on the internet. (That’s probably true for our cat example, but this assumption will have significant implications shortly.)
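As a quick sanity check, here’s the arithmetic behind those numbers, assuming a standard 2,080-hour full-time work year:

```python
# Back-of-the-envelope check of the dataset-labeling estimate.
images = 1_000_000
seconds_per_image = 30      # find, download, label, fill out metadata
wage_per_hour = 10          # dollars

hours = images * seconds_per_image / 3600
labor_cost = hours * wage_per_hour
full_time_years = hours / 2080   # standard full-time work year

print(f"{hours:,.0f} hours, ${labor_cost:,.0f}, {full_time_years:.1f} person-years")
# Roughly 8,333 hours, about $83K, about 4 person-years of labor.
```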

I don’t know about you, but I can’t think of many cat-related applications that would make it worth paying $83K to train an AI.

Synergy? You Wish!

At this point, a shiny Aide de Camp working for Major General Stanley T. Desktop at HQ USAF/A3R8H3Q9 might think, “Well, that’s a big investment, but once we’ve trained our AI to recognize cats we’ll be able to move it on to more useful tasks.”

This type of wishful thinking is one of the biggest downfalls for people who don’t understand AI. The problem is that there is no generalization. An AI is only capable of performing the task you’ve trained it to do.

If we expanded our earlier cat-identification example from 11 images to 1,000,000 or so, our AI might be able to correctly recognize a cat 94% of the time. That’d be stellar. However, that’s all our AI would be capable of doing. You could ask our AI:

Is this a cat?

And it could correctly answer: “No Emet, this is not a cat.”

However, our AI is wholly incapable of giving you any other information. If you asked “Is this a dog?” it could only respond with “That does not compute,” because it has no idea what a dog is.

You could give the AI some starting parameters for what a dog is and have it run the dataset again. However, since that dataset is mostly cats, and includes a lot of other distractors, it would not get very good at identifying dogs. If you wanted a dog-recognizing AI, you’d have to go spend another $83K+ on a dog dataset, and train a brand-new AI from scratch.

It’s critical to note that our AI would be useless for any other task. If it had answered, “No Emet, this is not a cat,” we couldn’t follow up by asking: “Well AI, if it’s not a cat then what is it?” The AI doesn’t even have the ability to comprehend that question, let alone have any idea how to process it and produce an answer.

We also couldn’t ask: “What is on the cat’s nose?” Even if we’d given it some starting parameters to understand what a nose is, the AI has no context for understanding what a butterfly is. If you want an AI capable of identifying cats with butterflies on their noses, you’d need a million pictures to teach it to identify cats, another million to teach it to identify butterflies, and another million to teach it to identify cats with butterflies on their noses. Want it to correctly identify dandelion seeds on a cat’s nose? You’d have to start all over again.

I hope your previous dreams of using AI to solve the world’s problems are slightly crushed. You should already be able to see how time-consuming and expensive it is to train an AI. Even if you have the resources to spend on a project, the results are extremely limited. What’s worse, studies show that even if an AI performs well in a laboratory setting, there’s a decent chance it won’t perform nearly as well in real-world applications.

Are you training your AI to recognize cats so that you can set up cameras around town to recognize strays and alert you to their location? Your multi-hundred-thousand-dollar AI might show decent results in the lab, but it probably won’t be nearly as good at recognizing cats in real-world imagery. Whomp, whomp.

Want more bad news? It’s difficult or even impossible to copy an AI from one type of system and implement it on another. Sure, you can train an AI on one computer and transfer an exact copy to another computer of exactly the same type. However, if you upgrade your processors, server architecture, cameras, and other systems, you’re going to have to start over training a brand new AI from scratch. There’s no way to copy Alexa onto an iPhone.

Now that we’ve taken a look at some of the basics behind how AI works we can get into some more practical examples in Part 2 of this series.

Part 1 | Part 2 | Part 3 >

(Thanks to Possessed Photography from Unsplash for this article’s feature pic. Most other pictures came from Unsplash as well.)