A Driver Test For Self-Driving Cars Isn't Enough

I recently read yet another argument that a driving road test should be enough to certify an autonomous vehicle as safe for driving. In general, the idea was that if it's good enough to put a 16 year old on the road, it should be good enough for a self-driving vehicle.  I see this idea enough that it's worth explaining why it it's a really bad one.

 CC SA 3.0

Even if we were to assume that a self-driving vehicle is no different than a person (which is clearly NOT true), applying the driving test is only half the driver license formula. The other half is the part about being 16 years old. If a 12 year old is proficient at operating a vehicle, we still don't issue a drivers license. In addition to technical skills and book knowledge, we as a society have imposed a maturity requirement in most states of "being 16." It is typical that you don't get an unrestricted license until you're perhaps 18. And even then you're probably not a great driver at any age until you get some experience. But, we won't even let you on the road under adult supervision at 12!
The maturity requirement is essential.  As a driver we're expected to have the maturity to recognize when something isn't right, to avoid dangerous situations, to bring the vehicle to a safe state when something has gone wrong, to avoid operating when the vehicle system (vehicle + driver) is impaired, to improvise when something weird happens, and to compensate for other drivers who are having a bad day (or are simply suffering from only being 16). Autonomous driving systems might be able to do some of that, or even most of it in the near term. (I would argue that they are especially bad at self-detecting when they don't know what's going on.) But the point is a normal driving test doesn't come close to demonstrating "maturity" if we could even define that term in a rigorous, testable way. It's not supposed to -- that's why licenses require both testing and "being 16."
To be sure, human age is not a perfect correlation to maturity. But as a society we've come to the situation in which this system is working well enough that we're not changing it except for some tweaks every few years for very young and very old drivers who have historically higher mishap rates. But the big point is if a 12 year old demonstrates they are a whiz at vehicle operation and traffic rules, they still don't get a license.  In fact, they don't even get permission to operate on a public road with adult supervision (i.e., no learners permit at 12 in any US state that I know of.)  So why does it make sense to use a human driving test analogy to give a driver license, or even a learner permit, to an autonomous vehicle that was designed in the last few months?  Where's the maturity?
Autonomy advocates argue that encapsulating skill and fleet-wide learning from diverse situations could help cut down the per-driver learning curve. And it could reduce the frequency of poor choices such as impaired driving and distracted driving. If properly implemented, this could all work well and could improve driving safety -- especially for drivers who are prone to impaired and distracted driving. But while it's plausible to argue that autonomous vehicles won't make stupid choices about driving impaired, that is not at all the same thing as saying that they will be mature drivers who can handle unexpected situations and in general display good judgment comparable to a typical, non-impaired human driver. Much of safe driving is not about technical skill, but rather amounts to driving judgment maturity. In other words, saying that autonomous vehicles won't make stupid mistakes does not automatically make them better than human drivers.  
I'd want to at least see an argument of technological maturity as a gate before even getting to a driving skills test. In other words, I want an argument that the car is the equivalent of "being 16" before we even issue the learner permit, let alone the driver license. Suggesting that a driving test is all it takes to put a vehicle on the road means: building a mind-boggling complex software system with technology we're just at the point of understanding how to validate, doing an abbreviated acceptance test of basic skills (a few minutes on the road), and then deciding it's fine to put in charge of several thousand pounds of metal, glass, and a human cargo as it hurtles down the highway. (Not to mention the innocent bystanders it puts at risk.)
This is a bad idea for any software.  We know from history that a functional acceptance test doesn't prove something is safe (at most it can prove it is unsafe if a mishap occurs during the test).  Not crashing during a driver exam is to be sure an impressive technical achievement, but on its own it's not the same as being safe!  Simple acceptance testing as the gatekeeper for autonomous vehicles is an even worse idea. For other types of software we have found that in practice that you can't understand software quality for life-critical systems without also considering the rigor of the engineering process. Think of good engineering process as a proxy for "being 16 years old." It's the same for self-driving cars. 
(BTW, "life-critical" doesn't mean perfect. It means designed with sufficient engineering rigor to be suitable for its intended criticality. See ISO 26262 or your favorite software safety standard, which is currently NOT required by the government for autonomous vehicles, but should be.)
It should be noted that some think that a few hundred million miles of testing can be a substitute for documenting engineering rigor. That’s a different discussion. What this essay is about is saying that a road test — even an hour long grueling road test — does not fulfill the operational requirement of “being 16” for issuing a driving license under our current system. I’d prefer a different, more engineering-based method of certifying self-driving vehicles for use on public roads. A road test should certainly be part of that approach. But if you really want to base everything on a driver test, please tell me how you plan to argue the part about demonstrating the vehicle, the design team, or some aspect of the development project has the equivalent 16-year-old maturity. If you can’t, you’re just waving your hands about vehicle safety.

(Originally posted on Dec. 19, 2016, with minor edits)

Welcome To My Blog on Self Driving Car Safety

Welcome!

This blog primarily covers Autonomous Vehicle safety, often known as self-driving car safety.

I'm splitting this blog off from my Better Embedded Software blog to reflect my increased emphasis on Autonomous Vehicle (AV) safety.

In my professor gig, these days my main research is on AV stress testing (the ASTAA and RIOT projects at CMU NREC).

I'm also very active in the startup company that I co-founded with Mike Wagner: Edge Case Research LLC. Over the past couple years we have emphasized both stress testing and creating safety cases for AVs.

Comments are moderated. I read them all.  Comments that ask a question typically get approved, although I can't offer specific advise on your particular system this way.  Additional thoughts and responsible contrary views are also typically approved.

While I appreciate the "nice blog" type posts, I typically don't approve them for posting.  (So many of them are comment spam!)  If you really want to be sure I get that type of message just e-mail it to me directly, and it will be appreciated.  If you'd like it to be posted as a comment on the blog let me know.  Send to:  BlogComments@koopman.us   

A few of the initial posts are duplicates from my other blog, but I'll be posting all my new AV content here.  I hope you enjoy the postings!

-- Phil Koopman
Pittsburgh, PA, USA

About the author:
Dr. Philip Koopman has been working on autonomous vehicle safety for more than 20 years. As a professor at Carnegie Mellon University he teaches safe, secure, high quality embedded system engineering. (http://ece.cmu.edu/~koopman/) As co-founder of Edge Case Research LLC he finds ways to improve AV safety, including robustness testing, architectural safety patterns, and structured safety argument approaches. (http://ecr.guru)

TechAD Talk on Highly Autonomous Vehicle Validation

Here are the slides from my TechAD talk on self driving car safety:


Highly Autonomous Vehicle Validation from Philip Koopman

Highly Autonomous Vehicle Validation: it's more than just road testing!
- Why a billion miles of testing might not be enough to ensure self-driving car safety.
- Why it's important to distinguish testing for requirements validation vs. testing for implementation validation.
- Why machine learning is the hard part of mapping autonomy validation to ISO 26262

(Originally posted on Nov 11, 2017)

SCAV 2017 Keynote: Challenges in Autonomous Vehicle Validation


Challenges in Autonomous Vehicle Testing and Validation from Philip Koopman


Challenges in Autonomous Vehicle Validation
Keynote Presentation Abstract
Philip Koopman
Carnegie Mellon University; Edge Case Research LLC
ECE Dept. HH A-308, 5000 Forbes Ave., Pittsburgh, PA, USA
koopman@cmu.edu

Developers of autonomous systems face distinct challenges in conforming to established methods of validating safety. It is well known that testing alone is insufficient to assure safety, because testing long enough to establish ultra-dependability is generally impractical. That’s why software safety standards emphasize high quality development processes. Testing then validates process execution rather than directly validating dependability.

Two significant challenges arise in applying traditional safety processes to autonomous vehicles. First, simply gathering a complete set of system requirements is difficult because of the sheer number of combinations of possible scenarios and faults. Second, autonomy systems commonly use machine learning (ML) in a way that makes the requirements and design of the system opaque. After training, usually we know what an ML component will do for an input it has seen, but generally not what it will do for at least some other inputs until we try them. Both of these issues make it difficult to trace requirements and designs to testing as is required for executing a safety validation process. In other words, we’re building systems that can’t be validated due to incomplete or even unknown requirements and designs.

Adaptation makes the problem even worse by making the system that must be validated a moving target. In the general case, it is impractical to validate all the possible adaptation states of an autonomy system using traditional safety design processes.

An approach that can help with the requirements, design, and adaptation problems is basing a safety argument not on correctness of the autonomy functionality itself, but rather on conformance to a set of safety envelopes. Each safety envelope describes a boundary within the operational state space of the autonomy system.

A system operating within a “safe” envelope knows that it’s safe and can operate with full autonomy. A system operating within an “unsafe” envelope knows that it’s unsafe, and must invoke a failsafe action. Multiple partial specifications can be used as an envelope set, with the intersection of safe envelopes permitting full autonomy, and the union of unsafe envelopes provoking validated, and potentially complex, failsafe responses.

Envelope mechanisms can be implemented using traditional software engineering techniques, reducing the problems with requirements, design, and adaptation that would otherwise impede safety validation. Rather than attempting to prove that autonomy will always work correctly (which is still a valuable goal to improve availability), the envelope approach measures the behavior of one or more autonomous components to determine if the result is safe. While this is not necessarily an easy thing to do, there is reason to believe that checking autonomy behaviors for safety is easier than implementing perfect, optimized autonomy actions. This envelope approach might be used to detect faults during development and to trigger failsafes in fleet vehicles.

Inevitably there will be tension between simplicity of the envelope definitions and permissiveness, with more permissive envelope definitions likely being more complex. Operating in the gap areas between “safe” and “unsafe” requires human supervision, because the autonomy system can’t be sure it is safe.

One way to look at the progression from partial to full autonomy is that, over time, systems can increase permissiveness by defining and growing “safe” envelopes, shrinking “unsafe” envelopes, and eliminating any gap areas.

ACM Reference format:
P. Koopman, 2017. Challenges in Autonomous Vehicle Validation. In
Proceedings of 1st International Workshop on Safe Control of Connected
and Autonomous Vehicles, Pittsburgh, Pennsylvania, USA, April 2017
(SCAV 2017), 1 page.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is  granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Copyright is held by the owner/author(s).
SCAV'17, April 21-21 2017, Pittsburgh, PA, USA
ACM 978-1-4503-4976-5/17/04.
http://dx.doi.org/10.1145/3055378.3055379

(Originally posted 4/23/2017)

Autonomous Vehicle Safety: An Interdisciplinary Challenge



Autonomous Vehicle Safety: An Interdisciplinary Challenge

By Phil Koopman & Mike Wagner

Abstract:
Ensuring the safety of fully autonomous vehicles requires a multi-disciplinary approach across all the levels of functional hierarchy, from hardware fault tolerance, to resilient machine learning, to cooperating with humans driving conventional vehicles, to validating systems for operation in highly unstructured environments, to appropriate regulatory approaches. Significant open technical challenges include validating inductive learning in the face of novel environmental inputs and achieving the very high levels of dependability required for full-scale fleet deployment. However, the biggest challenge may be in creating an end-to-end design and deployment process that integrates the safety concerns of a myriad of technical specialties into a unified approach.

Read the preprint version here for free (link / .pdf)

Official IEEE version (subscription required):
http://ieeexplore.ieee.org/document/7823109/  
DOI: 10.1109/MITS.2016.2583491

IEEE Intelligent Transportation Systems Magazine (Volume: 9, Issue: 1, Spring 2017, pp. 90-96)

Correction:
"This would require a safety level of about 1 billion operating hours per catastrophic event. (FAA 1988)" should be
"This would require a safety level of about 1 billion operating hours per catastrophic event due to the failure of a particular function. (FAA 1988)"  (Note that in this context a "function" is something quite high level such as the ability to provide sufficient thrust from the set of jet engines mounted on the airframe.)

(Originally posted January 30, 2017)

Response to DoT Policy on Highly Automated Vehicles

[While DoT has since published revised draft policies, the original draft policy and my response are still relevant to provide an overall picture about critical needs for self-driving vehicle safety.]

I've prepared a draft response to DoT/NHTSA on their proposed policy for highly automated vehicle safety.

EE Times article that summarizes my response

The topics I cover are:
1. Requiring a safety argument that deals with the challenges of validating machine learning
2. Requiring transparent independence in safety assessment
3. Triggering safety reassessment based on safety integrity, rather than “significant” functionality
4. Requiring assessment of changes that can compromise the triggering of fall-back strategies
5. Characterizing what “reasonable” might mean regarding anticipation of exceptional scenarios
6. Assuring the integrity of data that is likely to be used for crash investigations
7. Diagnostics that encompass non-collision failures of components and end-of-life reliability loss
8. More uniform codification of traffic rule exceptions
9. Ensuring the safety of driver-takeover strategies for SAE Level 2 systems

Document pointers:
Comments either to this blog or to my university e-mail are welcome.  
         koopman - at - cmu.edu