 
Data is everywhere and is currently growing at an exponential pace. Broadly, we can break down data into two categories: structured and unstructured.
Structured data is used to keep the world organized. It is standardized information that follows a model and set order. Examples in your everyday life might be the name, age, and birthdate of everyone in your family. It could be the name, stock keeping unit (SKU), and price of all the items at a grocery store. Or it might be the username, address, and products to ship to a specific customer. This structured data helps businesses make reasonable inferences about their customers, productivity, revenue, and more. But the best part is that computers can quickly understand and use structured data.
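As a minimal sketch of why structured data is easy for computers to use, the grocery store example above can be written as a small set of records. The class name, field names, SKUs, and prices below are illustrative assumptions, not real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroceryItem:
    """One structured record: every item has the same named, typed fields."""
    name: str
    sku: str      # stock keeping unit (made-up values for illustration)
    price: float  # price in dollars (made-up values for illustration)


items = [
    GroceryItem("Bananas", "SKU-0001", 0.59),
    GroceryItem("Milk", "SKU-0002", 3.49),
    GroceryItem("Bread", "SKU-0003", 2.99),
]


def items_under(items, max_price):
    """Return the names of all items priced below max_price."""
    return [item.name for item in items if item.price < max_price]


cheap = items_under(items, 3.00)
print(cheap)
```

Because every record follows the same model, a question such as "which items cost less than three dollars?" becomes a one-line filter with no human interpretation needed.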
In health care, structured data can help providers make reasonable inferences about patient care. Structured data in health care can include demographic information (age, gender, race), chart information (weight, heart rate, blood pressure), or current/past diagnoses, medications, and test results, to name a few. This data is an essential aspect of patient care. According to a 2017 IEEE Engineering in Medicine and Biology Society study, the average person is likely to generate more than one million gigabytes of health-related data in their lifetime, equivalent to 300 million books of patient data.
However, most of the data that health care providers and health plans generate is unstructured. This could include a post-encounter summary, a physical therapy progress note, a CT scan, a radiology reading, or even feedback about a recent visit. It takes a skilled person to understand what they see or read, and to extract meaning from or assign some structure to this unstructured data. This might be a nurse looking at a picture of a patient’s arm and confirming that, yes, the patient does have a rash with blisters, or it might be a physician reading through another doctor’s note to see that the patient had tightness in their chest six months ago. Historically, it has been challenging for computers to use unstructured data without human intervention.
The challenge is that unstructured data can account for up to 80 percent of a patient’s medical record. That is a lot of data that isn’t readily accessible. Moreover, many health systems or companies have this unstructured data siloed in multiple repositories.
Aggregating and interpreting this data is often a problem for health systems, because having skilled staff members scan documents and extract their findings is time-consuming, costly, and error-prone. Fortunately, the experts at UPMC Enterprises have been working on a solution to the unstructured data challenge.
The solution, known as Alexandria Charts, has been trusted, used, and optimized for over a decade at UPMC to unlock patient data within unstructured clinical documents.
Built by and for developers to accelerate solutions surrounding unstructured data, Alexandria Charts provides a rich set of modern APIs and operational tooling to help address the complex and challenging task of real-time data aggregation and cohesion.
Because of its developer-centric design, the enterprise-level, HIPAA-compliant, and HITRUST-certified Alexandria Charts platform is poised to serve as a next-generation solution accelerator and foundation for application interoperability. Companies and health systems can now use a true end-to-end data integration solution on their terms, with both SaaS and self-hosted deployment models.
“Alexandria Charts has been a huge platform accelerator for us. The combination of core functionality for EMR integration, combined with standard APIs, makes it possible for our team to work at scale building and deploying applications quickly and efficiently,” said Rebecca Jacobson, President of Astrata, a UPMC Enterprises portfolio company that is advancing health care quality measurement and value-based care.
To help companies extract meaning from unstructured data, Alexandria Charts also supports multiple Optical Character Recognition (OCR) and Natural Language Processing (NLP) engines out of the box. OCR engines can “scan” documents and extract text from images, PDFs, and other file formats. NLP enriches this process by enabling those systems to recognize relevant concepts in the resulting text, making the data more easily stored, accessed, and interpreted by both humans and machines. Finally, Alexandria Charts can help resolve patient identities across documents or repositories and govern which groups can see specific patient data.
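To make the idea of recognizing concepts in text concrete, here is a deliberately simple sketch. It assumes the OCR step has already produced plain text, and it stands in for NLP with a plain dictionary lookup. The phrase list, concept labels, and function name are hypothetical. This is not the Alexandria Charts API, and production NLP engines do far more, such as handling negation, context, and standard clinical vocabularies.

```python
import re

# Hypothetical concept dictionary: surface phrase -> normalized concept label.
CONCEPTS = {
    "rash": "skin_rash",
    "blisters": "blister",
    "tightness in their chest": "chest_tightness",
    "chest tightness": "chest_tightness",
}


def extract_concepts(text):
    """Find dictionary phrases in text.

    Returns a list of (start_offset, concept_label, matched_text) tuples,
    sorted by where each phrase appears in the text.
    """
    found = []
    for phrase, concept in CONCEPTS.items():
        # Whole-word, case-insensitive match on the literal phrase.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((match.start(), concept, match.group(0)))
    return sorted(found)


# Example note text, as it might look after OCR (invented for illustration).
note = (
    "Patient reports chest tightness six months ago. "
    "Exam today: rash with blisters on left arm."
)

concepts = extract_concepts(note)
for start, concept, matched in concepts:
    print(f"{start:3d}  {concept:16s}  {matched!r}")
```

Each tuple turns a free-text mention into a labeled, positioned record that can be stored and queried like structured data.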
Today, the Alexandria Charts platform supports a wide variety of use cases, including risk adjustment, Healthcare Effectiveness Data and Information Set (HEDIS) quality measures, adenoma detection reporting, clinical documentation improvement, and more. For UPMC, the platform has aggregated more than 350 million documents across eight million provider patients and four million UPMC Health Plan members.
Unstructured data is an ever-growing piece of the world around us. It’s also very much embedded in every health care system. Once unlocked, this data can help improve how health systems, health plans, and providers care for patients.
To learn more about how Alexandria Charts can help you unlock your unstructured data or help accelerate your solution, visit their website.


