WiseMove: Investigating Safe Autonomous Driving

I recently worked on a software tool for experiments in autonomous driving, focused on safe policy learning and options-based training. The related paper is available on arXiv.

For the last few months, my team at WISE Lab has been working on this software tool. Our goal was to run experiments on various traffic scenarios and compare the trade-offs between safety, performance, and feasibility of the techniques available for autonomous driving.

To that end, we developed a simple intersection scenario. The goal of the ego vehicle is to reach the rightmost end, starting from the leftmost end, avoiding crashes and safety violations.

To do this, we employed a hierarchical learning architecture. We first trained some common maneuvers (such as Follow, Wait, and KeepLane) to handle local driving, and then trained a higher-level behavioral policy that learned how to use these maneuvers to reach the goal. Finally, we used an AlphaGo-like Monte Carlo Tree Search to choose safer maneuvers than the behavioral policy would pick on its own.
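To make the two-level structure concrete, here is a minimal Python sketch. It is not WiseMove's API: the `State` fields, the `JUNCTION` and `GOAL` cell indices, and the road model are all hypothetical, and a hand-written rule stands in for the learned behavioral policy. The tree search step is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical toy road: cells 0..GOAL, with the intersection at JUNCTION.
JUNCTION = 3
GOAL = 6


@dataclass(frozen=True)
class State:
    ego_pos: int              # ego vehicle's cell along the road
    other_in_junction: bool   # is another vehicle inside the intersection?


def keep_lane(s: State) -> State:
    """Low-level maneuver: advance one cell."""
    return State(s.ego_pos + 1, s.other_in_junction)


def wait(s: State) -> State:
    """Low-level maneuver: stay put. Assumption: the other vehicle
    clears the intersection after one step of waiting."""
    return State(s.ego_pos, False)


# The maneuvers play the role of options that the high-level policy picks from.
MANEUVERS: Dict[str, Callable[[State], State]] = {
    "KeepLane": keep_lane,
    "Wait": wait,
}


def behavior_policy(s: State) -> str:
    """Stand-in for the learned high-level policy: wait if the next
    cell is the intersection and it is occupied, otherwise drive on."""
    if s.ego_pos + 1 == JUNCTION and s.other_in_junction:
        return "Wait"
    return "KeepLane"


def run_episode(s: State, max_steps: int = 20) -> Tuple[List[str], State]:
    """Repeatedly pick a maneuver and execute it until the goal is reached."""
    chosen: List[str] = []
    for _ in range(max_steps):
        if s.ego_pos >= GOAL:
            break
        name = behavior_policy(s)
        s = MANEUVERS[name](s)
        chosen.append(name)
    return chosen, s


if __name__ == "__main__":
    maneuvers, final = run_episode(State(ego_pos=0, other_in_junction=True))
    print(maneuvers, final)
```

In the real system both levels are learned, so `behavior_policy` would be a trained model rather than an if-statement, and each maneuver would be a policy over continuous controls rather than a one-line state update.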

The main benefit of using WiseMove is that safety violations can be flagged beforehand, which forces the learning process to adhere to the safety conditions during training. The safety conditions can be specified in linear temporal logic (LTL).
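As a rough illustration of flagging a violation before it happens, the Python sketch below checks an LTL-style "globally" property, G !(ego_in_junction & other_in_junction), against each candidate next step and keeps only the maneuvers that do not break it. Everything here is an assumption made for illustration: the proposition names, the `safe_maneuvers` helper, and the finite-trace reading of G (LTL proper is defined over infinite traces). It does not reflect how WiseMove parses or monitors LTL.

```python
from typing import Callable, Dict, List, Sequence

# One time step: truth values of the atomic propositions (names are hypothetical).
Step = Dict[str, bool]


def globally(pred: Callable[[Step], bool], trace: Sequence[Step]) -> bool:
    """Finite-trace reading of the LTL operator G: pred holds at every step."""
    return all(pred(step) for step in trace)


def no_conflict(step: Step) -> bool:
    """Atomic safety condition: ego and another vehicle are never
    inside the intersection at the same time."""
    return not (step["ego_in_junction"] and step["other_in_junction"])


def safe_maneuvers(history: Sequence[Step],
                   candidates: Dict[str, Step]) -> List[str]:
    """Return the maneuvers whose predicted next step keeps
    G(no_conflict) true on the extended trace."""
    return [
        name
        for name, next_step in candidates.items()
        if globally(no_conflict, list(history) + [next_step])
    ]


if __name__ == "__main__":
    history = [{"ego_in_junction": False, "other_in_junction": True}]
    # Predicted outcome of each maneuver (made-up values for the example).
    candidates = {
        "KeepLane": {"ego_in_junction": True, "other_in_junction": True},
        "Wait": {"ego_in_junction": False, "other_in_junction": True},
    }
    print(safe_maneuvers(history, candidates))
```

Masking unsafe maneuvers like this is only one way to use such a check. A learner could instead receive a penalty or have the episode terminated when the property is violated.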