Learning and Approximate Solutions to Partially Observable
Markov Games
Abstract:
One of the keys to understanding autonomic computing is understanding
models of adversarial intent and action. These are manifested when
one agent thwarts, or tries to thwart, the actions of another. The
mathematical framework for this is learning and so-called zero-sum
or pursuit-evasion games. The main drawback in this area is that
computing solutions to these games, with or without partial
information, is NP-hard. I will discuss the use of approximation and
learning to make such problems tractable. To make the results concrete,
I will discuss, in parallel with the theory, their application to
a number of mobile robot agents. Versions of this work would apply
to models of immune systems or information assurance.
Deterministic
pursuit-evasion games on finite graphs have been relatively well studied,
and there have been attempts to abstract the region in which the
game takes place to a finite graph. However, when the environment
is unknown a priori, a “map-learning” phase often precedes the game,
which is time-consuming and computationally very hard even for the
simplest environments. In this research, we formulate pursuit-evasion
games involving unmanned aerial vehicles (UAVs) and unmanned ground
vehicles (UGVs) in a probabilistic framework and use reinforcement
learning and approximate dynamic programming to obtain approximate
solutions with satisfactory performance. We will discuss the
extension of this result to games involving active evaders or obstacles,
in order to compute pursuit policies that are (sub-)optimal in the sense
of minimizing the expected time to find the evader or increasing
the probability of finding the evader
within a given finite time interval.
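As a rough illustration of the probabilistic formulation, the sketch below
keeps a belief over the evader's position and moves the pursuer greedily
toward the most likely adjacent cell. It is not the method presented in the
talk: the one-dimensional corridor, the uniform random-walk evader model, and
all function names are assumptions made only for this example.

```python
def neighbors(cell, n):
    """Cells reachable in one step on a corridor of n cells (staying put allowed)."""
    return [c for c in (cell - 1, cell, cell + 1) if 0 <= c < n]


def observe_miss(belief, pursuer):
    """Bayes update after the pursuer inspects its own cell and sees no evader."""
    posterior = list(belief)
    posterior[pursuer] = 0.0
    total = sum(posterior)
    if total == 0.0:
        raise ValueError("belief was concentrated on the inspected cell")
    return [p / total for p in posterior]


def predict(belief):
    """Propagate the belief through the assumed uniform random-walk evader model."""
    n = len(belief)
    nxt = [0.0] * n
    for cell, p in enumerate(belief):
        moves = neighbors(cell, n)
        for m in moves:
            nxt[m] += p / len(moves)
    return nxt


def greedy_move(belief, pursuer):
    """Move to the adjacent cell (or stay) where the evader is believed most likely."""
    return max(neighbors(pursuer, len(belief)), key=lambda c: belief[c])


# One step of the loop: look, update, predict the evader's motion, then move.
belief = [0.25, 0.25, 0.25, 0.25]   # uniform prior over four cells
pursuer = 0
belief = predict(observe_miss(belief, pursuer))
pursuer = greedy_move(belief, pursuer)
```

A greedy one-step policy of this kind is only a heuristic. It does not by
itself minimize the expected time to capture.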
Joint work
with Dr. Jin Kim, Omid Shakernia, Dr. David Shim, Rene Vidal, Hoam
Phong, Peter Ray, and Ron Tal, and with Andrew Ng and Professor Michael
Jordan.
S. Shankar Sastry became Chairman of the Department of Electrical Engineering
and Computer Sciences, University of California, Berkeley, in January
2001. The previous year, he served as Director of the Information
Technology Office at DARPA. From 1996 to 1999, he was the Director
of the Electronics Research Laboratory at Berkeley, an organized
research unit on the Berkeley campus conducting research in computer
sciences and all aspects of electrical engineering. During his
directorship, the laboratory grew from $29M to
$50M in volume of extramural funding. He is a Professor of Electrical
Engineering and Computer Sciences and a Professor of Bioengineering.
Dr. Sastry received his Ph.D. degree in 1981 from the University
of California, Berkeley. He was on the faculty of MIT as Assistant
Professor from 1980 to 1982 and at Harvard University as a chaired
Gordon McKay Professor in 1994. He has held visiting appointments
at the Australian National University, Canberra, the University
of Rome, Scuola Normale, and the University of Pisa, the CNRS
laboratory LAAS in Toulouse (poste rouge), Professor Invite at
Institut National Polytechnique de Grenoble (CNRS laboratory VERIMAG),
and as a Vinton Hayes Visiting Fellow at the Center for Intelligent
Control Systems at MIT. His areas of research are embedded and
autonomous software, computer vision, computation in novel substrates
such as DNA, nonlinear and adaptive control, robotic telesurgery,
control of hybrid systems, embedded systems, sensor networks and
biological motor control.
Nonlinear Systems: Analysis, Stability and Control is Dr. Sastry’s
latest book, published by Springer-Verlag in 1999. He has coauthored
over 250 technical papers and 6 books, including Adaptive Control:
Stability, Convergence and Robustness (with M. Bodson, Prentice
Hall, 1989) and A Mathematical Introduction to Robotic Manipulation
(with R. Murray and Z. Li, CRC Press, 1994). He has co-edited
Hybrid Control II, Hybrid Control IV and Hybrid Control V (with
P. Antsaklis, A. Nerode, and W. Kohn, Springer Lecture Notes in
Computer Science, 1995, 1997, and 1999, respectively) and co-edited
Hybrid Systems: Computation and Control (with T. Henzinger, Springer-Verlag
Lecture Notes in Computer Science, 1998) and Essays in Mathematical
Robotics (with Baillieul and Sussmann, Springer-Verlag IMA Series).
Books on Embedded Software and Structure from Motion in Computer
Vision are in progress.
Dr. Sastry served as Associate Editor for numerous publications,
including: IEEE Transactions on Automatic Control; IEEE Control
Magazine; IEEE Transactions on Circuits and Systems; the Journal
of Mathematical Systems, Estimation and Control; IMA Journal of
Control and Information; the International Journal of Adaptive
Control and Signal Processing; and the Journal of Biomimetic Systems and
Materials.
Dr. Sastry was elected to the National Academy of Engineering
in 2001 “for pioneering contributions to the design of hybrid
and embedded systems.” He also received the President of India
Gold Medal in 1977, the IBM Faculty Development Award for 1983–1985,
the NSF Presidential Young Investigator Award in 1985, the
Eckman Award of the American Automatic Control Council
in 1990, an M.A. (honoris causa) from Harvard in 1994, election as
a Fellow of the IEEE in 1994, the Distinguished Alumnus Award of the Indian
Institute of Technology in 1999, and the David Marr Prize for
the best paper at the International Conference on Computer Vision
in 1999.
He has supervised 45 doctoral students to completion and over
50 M.S. students. His former students now hold leadership positions,
including Dean of Engineering at Caltech and Director
of the Information Systems Laboratory at Stanford, as well as positions
at the Army Research Office and on the faculties of every major
university in the United States and abroad.