Dr. Zhang and his PhD committee: Somesh Jha (University of Wisconsin), David Evans, Tom Fletcher; Tianxi Li (UVA Statistics), David Wu (UT Austin), Mohammad Mahmoody; Xiao Zhang.
Xiao will join the CISPA Helmholtz Center for Information
Security in Saarbrücken, Germany, this fall as a
tenure-track faculty member.
From Characterizing Intrinsic Robustness to Adversarially Robust Machine Learning
The prevalence of adversarial examples raises questions about the
reliability of machine learning systems, especially for their
deployment in critical applications. Numerous defense mechanisms have
been proposed that aim to improve a machine learning system’s
robustness in the presence of adversarial examples. However, none of
these methods are able to produce satisfactorily robust models, even
for simple classification tasks on benchmarks. In addition to
empirical attempts to build robust models, recent studies have
identified intrinsic limitations for robust learning against
adversarial examples. My research aims to gain a deeper understanding
of why machine learning models fail in the presence of adversaries and
to design ways to build more robust systems. In this dissertation, I
develop a concentration estimation framework to characterize the
intrinsic limits of robustness for typical classification tasks of
interest. The proposed framework leads to the discovery that, compared
with the concentration of measure, which was previously argued to be an
important factor, the existence of uncertain inputs may explain more
fundamentally the vulnerability of state-of-the-art
defenses. Moreover, to further advance our understanding of
adversarial examples, I introduce a notion of representation
robustness based on mutual information, which is shown to be related
to an intrinsic limit of model robustness for downstream
classification tasks. Finally, I advocate rethinking the current
design goal of robustness and shed light on ways to build more robust
machine learning systems, potentially escaping the intrinsic limits of
robustness.