Security and Privacy Research at the University of Virginia

Our research seeks to empower individuals and organizations to control how their data is used. We use techniques from cryptography, programming languages, machine learning, operating systems, and other areas to both understand and improve the privacy and security of computing as practiced today, and as envisioned in the future. A major current focus is on adversarial machine learning.

Everyone is welcome at our research group meetings. To get announcements, join our Slack Group (anyone with a @virginia.edu email address can join themselves, or email me to request an invitation).

Recent Posts

Virginia Consumer Data Protection Act

Josephine Lamp presented on the new data privacy law that is pending in Virginia (it still needs a few steps, including the expected signing by the governor, but is likely to go into effect Jan 1, 2023): Slides (PDF)

This article provides a summary of the law: Virginia Passes Consumer Privacy Law; Other States May Follow, National Law Review, 17 February 2021.

The law itself is here: SB 1392: Consumer Data Protection Act


Algorithmic Accountability and the Law

Brink News (a publication of The Atlantic) published an essay I co-authored with Tom Nachbar (UVA Law School) on how the law views algorithmic accountability and the limits of what measures are permitted under the law to adjust algorithms to counter inequity:

Algorithms Are Running Foul of Anti-Discrimination Law
Tom Nachbar and David Evans
Brink, 7 December 2020



Computing systems that are found to discriminate on prohibited bases, such as race or sex, are no longer surprising. We’ve seen hiring systems that discriminate against women, image systems that are prone to cropping out dark-colored faces, and credit scoring systems that discriminate against minorities.

Anyone considering deploying an algorithm that impacts humans needs to understand the potential for such algorithms to discriminate. But what to do about it is much less clear.

The Difficulty of Mandating Fairness

There are no simple ways to ensure that an algorithm doesn’t discriminate, and many of the proposed fixes run the risk of violating anti-discrimination law. In particular, approaches that seek to optimize computing systems for various notions of fairness, especially those concerned with the distribution of outcomes along legally protected criteria such as race or sex, are in considerable tension with U.S. anti-discrimination law.

Although many arguments about discriminatory algorithms are premised on unfair outcomes, such notions have limited relevance under U.S. law.

For the most part, U.S. law lacks a notion of fairness.

Legal rules generally call upon more specific notions than fairness, even if they are connected to fairness. Thus, in the context of illegal employment discrimination (which we will use as our motivating example), instead of mandating fairness, U.S. law generally prohibits conduct that discriminates on the basis of protected characteristics, like race and sex.

Process and Intent Matter

Moreover, the law does not generally regulate behavior based on outcomes; what matters is the intent and process that led to the outcome. In the case of U.S. employment discrimination law, those rules of intent and process are contained in two types of protections against discrimination: disparate treatment and disparate impact.

An employer is liable for disparate treatment when there is either explicit or intentional discrimination. Disparate treatment protections prohibit the use of overt racial classifications, but also provide liability for hidden but intentional discrimination, such as cases where a victim can show they are a member of a racial minority and were qualified, but were rejected, and the employer cannot provide any nondiscriminatory justification for the decision, such as the case of McDonnell Douglas Corp. v. Green.

Under disparate impact, an employer following a process that has a statistically observable negative impact on a protected group is not necessarily liable. Instead, the disparate outcomes transfer the burden to the employer to show that the decision-making process is justified based on valid criteria, as in the case of Griggs v. Duke Power Co.

If you think those two approaches sound confusingly similar, you’re not alone. Disparate impact liability frequently mirrors disparate treatment liability in that the disparate outcome itself is not enough to establish a violation.

The role of disparate outcomes is to shift the burden to the employer to provide a non-discriminatory reason for its decision-making process. What matters legally is not so much the outcome, as the intention and process behind it.

The Law Doesn’t Care Whether Decisions Are Made by Algorithms or Humans

When outcomes are based on the output of some algorithm, the employer still needs to justify that the decisions it makes are based on valid criteria. The law doesn’t care whether decisions are made by algorithms or by human decision-makers; what matters is the reason for the decision. It is up to the humans responsible to explain that reasoning.

Although many have argued for increased algorithmic transparency, even the most transparent algorithms cannot really explain why they made the decisions they did. This presents a major challenge for discrimination law because, in discrimination law, the “why” matters.

Algorithmically generated explanations can help but, by themselves, cannot answer the legal “why” question. Even an interpretable model that appears to have no discriminatory intent is not necessarily non-discriminatory. The rules it has learned could have been influenced by selecting training data that disadvantages a particular group, and the features it uses could be determined in ways that are inherently discriminatory.

To satisfy discrimination law, it is the process and the intent that matter, and explanations an algorithm itself produces are insufficient to establish that intent. Indeed, explanations of the intent of algorithms should be viewed with the same skepticism that we have when humans attempt to describe their own decision-making processes. Just because an explanation is provided does not mean it should be believed.

Optimizing an Algorithm for Fairness May Be Discriminatory

This is a particular problem for methods used by systems designers, who frequently seek to optimize for particular outcomes. An optimization approach does not fit well with legal requirements. When algorithm designers focus on fairness as a property to be optimized, they ignore the legal requirements of anti-discrimination.

Discrimination law does not operate through optimization, because discrimination (or anti-discrimination) is not something to be optimized. Anti-discrimination is a side constraint on a decision-making process, not its principal goal (e.g., to find good employees).

Systems should be designed to optimize for their principal goal, with the constraint of avoiding discrimination (in intent or process) while doing so. Attempts to produce outcomes that seem less discriminatory might themselves constitute illegal discrimination. The 2009 U.S. Supreme Court case of Ricci v. DeStefano provides a prime example of that tension. In the case, the New Haven Fire Department used an examination to determine which firefighters should be promoted to lieutenant. When that test produced a result that was racially skewed compared to the population of firefighters, the city (in part because it was concerned about disparate impact liability) invalidated the results of the test. White firefighters sued, claiming the city’s response in rejecting the test was itself disparate treatment, since the motivation for rejecting the test was to produce a different racial outcome, and the Supreme Court agreed.

Algorithms Alone Cannot Save Us

Although reducing racial disparity is a laudable goal, the law substantially limits the discretion of both employers and system designers in engineering for equitable outcomes. Racially disparate outcomes may seem unfair, and they might even be evidence of underlying illegal discrimination, but the law neither deputizes systems designers to operationalize their notions of what are racially fair outcomes, nor immunizes them for acts of discrimination undertaken in order to correct racial disparities.

Correcting for past racial disparities will require a more sophisticated and deep-seated approach than simply altering algorithms to produce outcomes optimized toward some fairness criterion.

Algorithms alone are neither the source nor the solution to our problems. Solving them will require fundamental change, and the real question is whether we as a society — not just our algorithms — are prepared to do that work.


Microsoft Security Data Science Colloquium: Inference Privacy in Theory and Practice

Here are the slides for my talk at the Microsoft Security Data Science Colloquium:
When Models Learn Too Much: Inference Privacy in Theory and Practice [PDF]

The talk is mostly about Bargav Jayaraman’s work (with Katherine Knipmeyer, Lingxiao Wang, and Quanquan Gu) on evaluating privacy.


Fact-checking Donald Trump’s tweet firing Christopher Krebs

I was a source for this “Pants on Fire!” fact check by PolitiFact on Donald Trump’s tweet that fired Christopher Krebs, claiming that “The recent statement by Chris Krebs on the security of the 2020 Election was highly inaccurate, in that there were massive improprieties and fraud - including dead people voting, Poll Watchers not allowed into polling locations, “glitches” in the voting machines which changed…”

PolitiFact: Fact-checking Donald Trump’s tweet firing Christopher Krebs, 18 November 2020

David Evans, a professor of computer science at the University of Virginia, told PolitiFact that he signed the joint statement because “there is no substance … no credible specifics, and no evidence to support any of the claims” of a rigged election.

“It is always difficult to prove something did not occur, which is why people who work in security are so careful to avoid strong statements,” Evans said. “But in this case, because of the size of the margin, all of the security measures that were in place and worked as intended, and the lack of any evidence of anything fraudulent happening, one can be highly confident that there is no credible possibility that the results of the election are invalid.”

The experts’ statement (of which I am one of 59 signers) is here: Scientists say no credible evidence of computer fraud in the 2020 election outcome, but policymakers must work with experts to improve confidence.

It was covered in this article: New York Times, Election Security Experts Contradict Trump’s Voting Claims, 16 November 2020.


Voting Security

I was interviewed for a local news story by Daniel Grimes on election security: UVA cybersecurity expert: Virginia is one of the safer states to cast a ballot, NBC 29 News, 21 October 2020.