Newsletter 2019: Q1

Research Hub Update

April 11, 2019 — What is the All of Us Research Program?

The All of Us Research Program is building one of the largest biomedical resources of its kind, specifically engaging volunteers who reflect the diversity of America. This resource will be available to a diverse community of researchers and the public to explore how lifestyle, environment, and biological makeup affect health and disease.

Researchers will be able to access participant research data, with personal identifiers removed, through the Research Hub in order to explore how various factors contribute to individual health and disease.

Learn more about the All of Us Research Program.

What is the Research Hub?

The Research Hub is your gateway to participant research data, and is designed to facilitate a one-stop-shop experience. With the Research Hub, you can learn about the breadth of data that is being collected and curated for research, and when it will be available for researchers to access and analyze.

All of Us Research Data Set

The Research Hub will house an array of data collected by the All of Us Research Program. Our 2019 release is expected to include participant-provided information from surveys, physical measurements, and electronic health records.

See below for a current list of curated data types housed at the Data Research Center (DRC). This list will change frequently as new data types become available.

Survey Data

When participants enroll in the program, they are asked to respond to a number of surveys covering a variety of topics. Currently, the DRC has populated participant-provided information from the following surveys:

  • The Basics: This survey asks questions about basic demographics from participants, including their work and home life.
  • Overall Health: This survey asks questions about a participant’s overall health, including general health, daily activities, and women’s health questions.
  • Lifestyle: This survey asks questions about a participant’s use of tobacco, alcohol, and drugs.

Physical Measurements

Physical measurements are collected from participants by trained program staff members at participating health care organizations. The data is collected via a secure online data collection tool developed exclusively for the All of Us Research Program. Physical measurements include both physiological and anthropometric measurements.

  • Physiological Measurements: Blood pressure, heart rate
  • Anthropometric Measurements: Height, weight, waist circumference, hip circumference

Other data points collected include: BMI, Pregnancy at Enrollment, and Wheelchair Use at Enrollment.
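As an illustration of how a derived data point such as BMI relates to the anthropometric measurements above, here is a minimal sketch using the standard formula (weight in kilograms divided by the square of height in meters). The function name, the units, and the sample values are assumptions made for this example; the newsletter does not state how the program itself records or derives BMI.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Standard body mass index: weight (kg) / height (m) squared."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # convert centimeters to meters
    return weight_kg / (height_m ** 2)


# Hypothetical measurements, for illustration only.
example_bmi = round(bmi(weight_kg=70.0, height_cm=175.0), 1)
print(example_bmi)
```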

Electronic Health Records (EHR)

Electronic health records, or EHR, are simply electronic records of information collected and kept in a secure electronic system when individuals receive health care. The information included depends on what kinds of health care have been received and what types of providers a patient has seen.

The All of Us Research Program employs Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) Version 5 infrastructure to ensure feasibility and standardization across EHR data for researchers. EHR data will be accessed longitudinally throughout the life of the program. More information can be found on the OHDSI/OMOP CDM Wiki page.

The All of Us Research Data Set includes EHR data derived from the following 14 OMOP tables:

  • Person: Contains basic demographic information describing a participant including gender, birth date, race, and ethnicity.
  • Visit_occurrence: Contains the type of visit a Person has at a care site (outpatient care, inpatient confinement, emergency room, or long-term care), as well as date and duration of time information.
  • Condition_occurrence: Conditions are records of a Person suggesting the presence of a disease or medical condition stated as a diagnosis, a sign, or a symptom, which is either observed by a Provider or reported by the patient.
  • Drug_exposure: Captures records about the utilization of a drug when ingested or otherwise introduced into the body. Drugs include prescription and over-the-counter medicines, vaccines, and large-molecule biologic therapies. Radiological devices ingested or applied locally do not count as drugs. Drug exposure is inferred from clinical events associated with orders, prescriptions written, pharmacy dispensings, procedural administrations, and other patient-reported information.
  • Measurement: Contains both orders and results of a systematic and standardized examination or testing of a Person or Person’s sample, including laboratory tests, vital signs, quantitative findings from pathology reports, etc.
  • Procedure_occurrence: Contains records of activities or processes ordered by, or carried out by, a health care provider on the patient for a diagnostic or therapeutic purpose.
  • Observation: Captures clinical facts about a Person obtained in the context of examination, questioning or a procedure. Any data that cannot be represented by any other domains, such as social and lifestyle facts, medical history, family history, etc. are recorded here.
  • Location: Represents a generic way to capture physical location or address information of Persons and care sites.
  • Provider: Contains a list of uniquely identified health care providers. These are individuals providing hands-on health care to patients, such as physicians, nurses, midwives, physical therapists, etc.
  • Device_exposure: Captures information about a person’s exposure to a foreign physical object or instrument which is used for diagnostic or therapeutic purposes. Devices include implantable objects (e.g. pacemakers, stents, artificial joints), medical equipment and supplies (e.g. bandages, crutches, syringes), other instruments used in medical procedures (e.g. sutures, defibrillators) and material used in clinical care (e.g. adhesives, body material, dental material, surgical material).
  • Death: Contains the clinical events surrounding how and when a Person dies.
  • Care_site: Contains a list of uniquely identified institutional (physical or organizational) units where health care delivery is practiced (offices, wards, hospitals, clinics, etc.).
  • Fact_relationship: Contains records about the relationships between facts stored as records in any table of the CDM. Relationships can be defined between facts from the same domain, or different domains. Examples of Fact Relationships include: Person relationships (parent-child), care site relationships (hierarchical organizational structure of facilities within a health system), etc.
  • Specimen: Contains the records identifying biological samples from a person.
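To show how the tables above relate to one another, here is a minimal sketch that links Person to Condition_occurrence through the shared person_id key. It uses an in-memory SQLite database with only a few of the OMOP CDM Version 5 columns. The rows and the concept IDs (1001, 1002) are made-up placeholders rather than real vocabulary concepts, and the helper function is hypothetical; this is not how the Research Hub itself stores or queries data.

```python
import sqlite3

# In-memory database with a trimmed-down subset of two OMOP CDM tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        person_id     INTEGER PRIMARY KEY,
        year_of_birth INTEGER
    );
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id               INTEGER REFERENCES person(person_id),
        condition_concept_id    INTEGER,
        condition_start_date    TEXT
    );
    """
)

# Toy rows, for illustration only (concept IDs are placeholders).
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, 1960), (2, 1985), (3, 1972)],
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (10, 1, 1001, "2018-03-01"),
        (11, 1, 1002, "2018-06-15"),
        (12, 2, 1001, "2018-09-30"),
    ],
)


def persons_with_condition(connection, concept_id):
    """Return sorted person_ids having at least one record of the concept."""
    rows = connection.execute(
        """
        SELECT DISTINCT p.person_id
        FROM person AS p
        JOIN condition_occurrence AS c ON c.person_id = p.person_id
        WHERE c.condition_concept_id = ?
        ORDER BY p.person_id
        """,
        (concept_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(persons_with_condition(conn, 1001))
```

The same person_id join pattern applies to the other event tables listed above, such as Drug_exposure, Measurement, and Procedure_occurrence.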

Within the context of the Research Hub tools, EHR data will be presented at the broadest level of organization, which is by EHR domain. Domains include: Demographics, Conditions, Procedures, Drugs, Measurements, and Visits.

Research Hub Tool Overview

See what is coming soon!

The DRC has been working diligently to develop software and tools for researchers to explore and analyze the All of Us Research Data Set. Below are brief descriptions of what is in the works!

Data Browser

The Data Browser will allow anyone to view aggregate counts of participant research data. Counts will be available for survey data collected from participants, physical measures collected during exams, and medical concepts from electronic health records (EHR) data. This tool will be available to anyone, with no registration or login required.

The Data Browser is anticipated to be released in the first half of 2019.
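To make "aggregate counts" concrete, here is a minimal sketch of how individual-level survey responses could be rolled up into counts that no longer carry participant identifiers. The record layout, the sample answers, and the function name are assumptions made for this example; the newsletter does not describe how the Data Browser actually computes its counts.

```python
from collections import Counter

# Toy individual-level responses (hypothetical): (participant, survey, answer).
responses = [
    ("p1", "Overall Health", "Good"),
    ("p2", "Overall Health", "Good"),
    ("p3", "Overall Health", "Fair"),
    ("p4", "Overall Health", "Excellent"),
    ("p1", "Overall Health", "Good"),  # duplicate row, counted once
]


def aggregate_counts(rows):
    """Count distinct participants per (survey, answer), dropping IDs."""
    distinct = {(pid, survey, answer) for pid, survey, answer in rows}
    return Counter((survey, answer) for _, survey, answer in distinct)


counts = aggregate_counts(responses)
for (survey, answer), n in sorted(counts.items()):
    print(f"{survey} / {answer}: {n}")
```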

Researcher Workbench

The Researcher Workbench, referred to simply as the Workbench, is an analysis platform designed for researchers to create cohorts of individual-level participant research data, review these cohorts, and analyze and visualize the data in a Jupyter notebook using Python or R.

The initial version of the Workbench provides three tools for working with data:

  • Workspaces: Create a project workspace. Store cohorts and notebooks. Share with team members.
  • Cohort Builder: Build and review a custom data set.
  • Notebooks: Analyze your cohort. Create graphs or tables to showcase your work.
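The build-then-analyze workflow described above can be sketched in plain Python as follows. This is only an illustration of the idea of a cohort (a subset of participants selected by criteria) followed by a simple summary. The Participant class, its fields, the selection criteria, and the sample values are all hypothetical and do not represent the Workbench's actual tools or API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Participant:
    """Hypothetical, simplified participant record."""
    participant_id: int
    year_of_birth: int
    systolic_bp: float  # mmHg


def build_cohort(participants, born_before, min_systolic):
    """Select participants born before a year with systolic BP at or above a threshold."""
    return [
        p
        for p in participants
        if p.year_of_birth < born_before and p.systolic_bp >= min_systolic
    ]


# Toy data, for illustration only.
participants = [
    Participant(1, 1950, 142.0),
    Participant(2, 1990, 118.0),
    Participant(3, 1962, 135.0),
    Participant(4, 1958, 121.0),
]

# "Cohort Builder" step: define the cohort.
cohort = build_cohort(participants, born_before=1970, min_systolic=130.0)
cohort_ids = [p.participant_id for p in cohort]

# "Notebooks" step: analyze the cohort.
mean_systolic = mean(p.systolic_bp for p in cohort)
print(cohort_ids, mean_systolic)
```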

To ensure participant privacy, researchers will be required to register and verify their identity in order to use the Workbench.

The Workbench is anticipated to be released in Winter of 2019.

What’s Happening Now in the Research Hub

The DRC aims to build the Research Hub with the end user in mind. Over the past year, the Research Hub has been subjected to multiple rounds of UX testing within the All of Us Research Program, led by the DRC, to help inform product development. The DRC has sought input from an array of potential users of the Research Hub, collated their feedback, and leveraged these insights to help guide iterative development of the tools.

Currently, the Research Hub is in its first beta launch within the DRC. The purpose of this release is to open the Research Hub up to vigorous internal testing, which includes quality control, quality assurance, data characterization, data validation, and user testing. Exhaustive security testing and functionality testing are also ongoing.