Catch What Your Model Misses: An Anticipatory Design Framework for Ethical AI Evaluation

Saturday, April 18, 2026 · 11:00 AM – 12:30 PM · UVA School of Data Science

Many costly AI mistakes are buried in decisions made long before the data science work begins. They’re hiding in the problem definition that teams locked in months earlier. By the time those blind spots surface, you’ve already spent significant time and compute finding out the hard way.

The tools most data scientists reach for, such as quantitative fairness techniques and bias metrics, are valuable, but they operate downstream of the decisions that matter most. We’ll explore how to proactively dissect use cases to prevent unintended harms before choices get locked into models.

Drawing on anticipatory design methodology and responsible AI research, this session gives participants a structured, repeatable process and tool for interrogating AI use cases before development begins, along with guidance on how to maintain alignment. This perspective equips data scientists to engage more fully in strategic AI decision-making and ethical deliberation.

What we’ll cover:

We’ll examine where AI initiatives break down, and why the problem definition phase is where ethical risk accumulates fastest and costs least to address. We’ll then work through the key dimensions a responsible pre-build evaluation should cover: the use context, stakeholder vulnerabilities, technical risk profiles, and standards-based practices for challenging embedded assumptions, all in service of better-aligned AI systems.

The core of the session is a guided exercise in which participants draft their own system instructions and methodology statement for a custom GPT-based evaluator tool. Working from a provided template and prompting scaffold, participants will build an evaluator designed to surface the conversations, knowledge gaps, and blind spots that responsible AI (RAI) literature tells us teams commonly miss. The result is a companion tool that surfaces the right questions at the right moment in your process.

Participants will then compare their evaluators in small groups. Divergent approaches become learning moments about the assumptions each of us carries into a build decision without realizing it.

Who should attend:

Data scientists, ML engineers, and AI practitioners, especially those working in regulated or high-stakes domains.

What you’ll leave with:

A framework for AI use case evaluation you can apply immediately, a reusable system instructions template for building your own GPT-based evaluator, and a curated set of resources from the RAI literature that informed the framework.

About the Speaker

Jill Heinze

Strategic Advisor, AI Product Development & Founder, Saddle-Stitch Consulting

Jill Heinze helps product leaders make smarter AI decisions through strategic intelligence and ground-truth research. As founder of Saddle-Stitch Consulting, she brings 20 years of user research and competitive intelligence experience to help organizations navigate AI uncertainty by revealing what competitors miss and heading off expensive mistakes before they happen. She serves as Responsible AI Program Director for The American College of Financial Services and hosts Responsible Tech Talks on LinkedIn Live.