# Security and Privacy Research at the University of Virginia

Our research seeks to empower individuals and organizations to control how their data is used. We use techniques from cryptography, programming languages, machine learning, operating systems, and other areas to both understand and improve the privacy and security of computing as practiced today, and as envisioned in the future. A major current focus is on adversarial machine learning.

Everyone is welcome at our research group meetings. To get announcements, join our Teams Group (any @virginia.edu email address can join themsleves; others should email me to request an invitation).

Security Research Group Lunch (22 August 2022)
Bargav Jayaraman, Josephine Lamp, Hannah Chen, Elena Long, Yanjin Chen,
Samee Zahur (PhD 2016), Anshuman Suri, Fnu Suya, Tingwei Zhang, Scott Hong

Past Projects
Secure Multi-Party Computation: Obliv-C · MightBeEvil

Web and Mobile Security: ScriptInspector · SSOScan
Program Analysis: Splint · Perracotta
N-Variant Systems · Physicrypt · More…

Recent Posts

## Visualizing Poisoning

How does a poisoning attack work and why are some groups more susceptible to being victimized by a poisoning attack?

We’ve posted work that helps understand how poisoning attacks work with some engaging visualizations:

Poisoning Attacks and Subpopulation Susceptibility
An Experimental Exploration on the Effectiveness of Poisoning Attacks
Evan Rose, Fnu Suya, and David Evans

Machine learning is susceptible to poisoning attacks in which adversaries inject maliciously crafted training data into the training set to induce specific model behavior. We focus on subpopulation attacks, in which the attacker’s goal is to induce a model that produces a targeted and incorrect output (label blue in our demos) for a particular subset of the input space (colored orange). We study the question, which subpopulations are the most vulnerable to an attack and why?

## Congratulations, Dr. Zhang

Congratulations to Xiao Zhang for successfully defending his PhD thesis!

Dr. Zhang and his PhD committee: Somesh Jha (University of Wisconsin), David Evans, Tom Fletcher; Tianxi Li (UVA Statistics), David Wu (UT Austin), Mohammad Mahmoody; Xiao Zhang.

Xiao will join the CISPA Helmholtz Center for Information Security in Saarbrücken, Germany this fall as a tenure-track faculty member.

From Characterizing Intrinsic Robustness to Adversarially Robust Machine Learning

## BIML: What Machine Learnt Models Reveal

I gave a talk in the Berryville Institute of Machine Learning in the Barn series on What Machine Learnt Models Reveal, which is now available as an edited video:

David Evans, a professor of computer science researching security and privacy at the University of Virginia, talks about data leakage risk in ML systems and different approaches used to attack and secure models and datasets. Juxtaposing adversarial risks that target records and those aimed at attributes, David shows that differential privacy cannot capture all inference risks, and calls for more research based on privacy experiments aimed at both datasets and distributions.

The talk is mostly about inference privacy work done by Anshuman Suri and Bargav Jayaraman.

## ICLR 2022: Understanding Intrinsic Robustness Using Label Uncertainty

(Blog post written by Xiao Zhang)

Motivated by the empirical hardness of developing robust classifiers against adversarial perturbations, researchers began asking the question “Does there even exist a robust classifier?”. This is formulated as the intrinsic robustness problem (Mahloujifar et al., 2019), where the goal is to characterize the maximum adversarial robustness possible for a given robust classification problem. Building upon the connection between adversarial robustness and classifier’s error region, it has been shown that if we restrict the search to the set of imperfect classifiers, the intrinsic robustness problem can be reduced to the concentration of measure problem.

In this work, we argue that the standard concentration of measure problem is not sufficient to capture a realistic intrinsic robustness limit for a classification problem. In particular, the standard concentration function is defined as an inherent property regarding the input metric probability space, which does not take account of the underlying label information. However, such label information is essential for any supervised learning problem, including adversarially robust classification, so must be incorporated into intrinsic robustness limits. By introducing a novel definition of label uncertainty, which characterizes the average uncertainty of label assignments for an input region, we empirically demonstrate that error regions induced by state-of-the-art models tend to have much higher label uncertainty than randomly-selected subsets.

This observation motivates us to adapt a concentration estimation algorithm to account for label uncertainty, where we focus on understanding the concentration of measure phenomenon with respect to input regions with label uncertainty exceeding a certain threshold $\gamma>0$. The intrinsic robustness estimates we obtain by incorporating label uncertainty (shown as the green dots in the figure below) are much lower than prior limits, suggesting that compared with the concentration of measure phenomenon, the existence of uncertain inputs may explain more fundamentally the adversarial vulnerability of state-of-the-art robustly-trained models.

Paper: Xiao Zhang and David Evans. Understanding Intrinsic Robustness Using Label Uncertainty. In Tenth International Conference on Learning Representations (ICLR), April 2022. [PDF] [OpenReview] [ArXiv]

## Microsoft Research Summit: Surprising (and unsurprising) Inference Risks in Machine Learning

Here are the slides for my talk at the Practical and Theoretical Privacy of Machine Learning Training Pipelines Workshop at the Microsoft Research Summit (21 October 2021):

Surprising (and Unsurprising) Inference Risks in Machine Learning [PDF]

The work by Bargav Jayaraman (with Katherine Knipmeyer, Lingxiao Wang, and Quanquan Gu) that I talked about on improving membership inference attacks is described in more details here:

The work on distribution inference is described in this paper (by Anshuman Suri):

The work on attribute inference and imputation isn’t yet posted, but feel free to contact me with any questions about it.