
Oct 16, 2025
How AI Startups Are Accelerating Innovation in Health Care
Penguin Ai is Leveraging the Ahavi™ Data Platform Developed by UPMC Enterprises to Validate Generative AI Models
Artificial Intelligence (AI) has the potential to transform health care but faces many challenges in implementation, including access to robust data and safe environments for training, testing, and validating models prior to deployment in health care settings.
To address these challenges, Penguin Ai is leveraging the Ahavi™ real-world data platform, which was developed by UPMC Enterprises to accelerate AI in health care. Penguin will train and validate its generative AI models with Ahavi, which provides secure access to de-identified data from more than 5 million patients, including both structured and unstructured data from across the UPMC health system.
Penguin’s custom-built models are aimed at tackling the $1 trillion administrative burden in health care that is caused by inefficient, repetitive, and manual tasks. By automating repetitive tasks and streamlining workflows, Penguin helps payers and providers reduce approval times, lower costs, and improve operational efficiency.
In addition to the partnership that provides access to Ahavi, UPMC Enterprises became an investor in Penguin last month.
Ahavi Accelerates Innovation for AI Startups
There are many opportunities to improve health care with the deployment of AI applications, including enabling faster, more accurate clinical decision-making and diagnostic analysis, personalizing treatment plans, and reducing costs, inaccuracies, and inconsistencies. Taken together, these improvements can benefit patients in the form of increased access, efficiency, satisfaction, safety, and outcomes.
However, companies developing AI for health care need access to large amounts of real-world data to build accurate, reliable, and generalizable AI models. Getting access to this data can be challenging.
Ahavi solves this challenge by offering AI developers and others a secure environment with de-identified data from more than 5 million patients across UPMC’s large and diverse footprint of hospitals. The Ahavi platform includes structured and unstructured data that can be linked back as far as 2019 and 2012, respectively, with a minimum match rate of 80% between those sources.
Unlike synthetic or trial-based data, the real-world data in Ahavi captures longitudinal patient journeys, enabling AI to learn from disease progression, treatment responses, and outcomes over time. This makes Ahavi valuable to AI developers: the platform’s tools let them train and validate AI models on diverse, longitudinal datasets that reflect actual clinical environments.
Ahavi bridges the gap between innovation and real-world application in health care AI. For these reasons, collaboration between Ahavi and Penguin was a natural fit.
The Penguin Ai Partnership with Ahavi
Penguin’s purpose-built AI platform for health care organizations lowers the barrier to entry for integrating generative AI models, deploying digital agents, and automating costly routine administrative tasks. With Penguin’s custom-built language models and machine learning for health care, payer and provider organizations can automate repetitive tasks, streamline workflows, and support clinical decision-making.
A complete platform, Penguin combines AI tooling with proprietary language models and out-of-the-box digital workers and agents to address back-office workflows. The platform enables customers to get their data ready for AI, use pre-built health care generative AI models via APIs, or start with an out-of-the-box solution in areas such as medical coding, prior authorizations, claims adjudication, appeals management, risk adjustment, medical chart summarization, and payment integrity.
But to accomplish this mission, Penguin’s AI models need access to robust, longitudinal, high-quality data for testing and training, and that’s where Ahavi comes in. Penguin is leveraging Ahavi’s Research & Development Infrastructure Services environment to build, train, and validate its small language models on three cohorts of patient insights. Ahavi supplies secure, de-identified, customized test environments that allow Penguin to refine its models for further use across the market.
UPMC selected Penguin as a strategic data partner because of Penguin’s enterprise-grade platform, which is uniquely designed to support health care initiatives. Penguin’s first project being trained and refined in an Ahavi environment is a Physician 360 solution that summarizes dense structured and unstructured patient medical records into a snapshot of relevant information physicians can quickly review right before a clinical visit. The goal is to help physicians build patient rapport and make the most of clinical visit time.
Penguin’s second solution using Ahavi, Enhanced Prior Authorization, aims to refine the company’s current prior authorization Small Language Model (SLM) and Agent to improve the overall process for clinicians, insurers, and, ultimately, patients. The Agent uses AI models to distill and efficiently present information to skilled decision-makers. This “human-in-the-loop” approach balances the need to get quick answers to doctors and patients with the need to ensure those decisions are evidence-based and compassionate. Ahavi will help Penguin expand its current Prior Auth Agent’s ability to estimate the probability that a request will be accepted or denied, allowing physicians to review and attach appropriate claims documentation in real time.
This partnership marks the beginning of a long-term collaboration between UPMC Enterprises and Penguin, as both organizations work together to build patient- and provider-centered AI solutions that address the unique challenges of the health care industry.
Penguin’s CEO and co-founder is Fawad Butt, a longtime health IT executive with significant experience leading data and analytics teams and overseeing enterprise-wide platforms and tools. He was previously chief data officer at UnitedHealthcare and Optum, and at Kaiser Permanente. With deep expertise in data governance, digital transformation, and AI strategy, Butt is positioned to guide Penguin in its mission to tackle the $1 trillion administrative burden in health care by accelerating AI adoption within health systems.
“This partnership represents a significant step forward in our mission to revolutionize health care through AI,” Butt said. “With UPMC Enterprises’ support as both an investor in our company and as an early partner on their Ahavi data platform, we are well-positioned to build cutting-edge AI solutions that will benefit the entire health care ecosystem.”
Conclusion
The collaboration between Penguin and UPMC Enterprises represents a pivotal step toward more intelligent, patient-centered health care. By combining cutting-edge AI capabilities with rich, diverse clinical data, this collaboration can enable more accurate predictions, faster decision-making, and more streamlined workflows.
Penguin’s platform capabilities, specifically its comprehensive governance and bias correction, serve existing needs while also preparing for future demands in this evolving AI market. Whether companies are training, validating, or fine-tuning AI models, UPMC’s Ahavi platform transforms real-world data into actionable insights.
“Think of Ahavi as a sandbox where researchers, innovators, and AI developers can road test their solutions in a real-world setting that sits apart from actual clinical data infrastructure and does not involve health care operations,” said Deepan Kamaraj, MD, PhD, director of informatics and analytics at UPMC Enterprises. “This allows for rigorous evaluation and validation in a fraction of the time of a traditional pilot study, enabling these products to reach patients faster.”
Next Steps
- Learn more about Ahavi and how it can benefit AI developers.
- Find out how Penguin Ai is transforming health care with a purpose-built platform.
- Get regular updates on Ahavi by signing up for our newsletter.
Note: UPMC has a financial interest in Penguin Ai.