H-Scale: Healthcare Big Data


Big data technology is enabling many organizations to leverage massive and diverse data sets to generate business insights and drive performance. The healthcare industry today generates significant volumes of data from EHRs, claims systems, medical devices, labs and pharmacies, as well as from newer data sources such as consumer apps and genomics. However, healthcare organizations need to make significant healthcare-specific customizations and adaptations to standard big data tools like Apache Hadoop before they can use them effectively.

CitiusTech’s H-Scale enables healthcare organizations to accelerate the use of Hadoop and other big data technologies in healthcare, while addressing unique healthcare industry requirements. H-Scale helps healthcare organizations implement a variety of big data use cases such as data aggregation, data lakes, late binding, streaming analytics and advanced analytics.

Healthcare Data Processing, Healthcare Data Lake, Hadoop Distributions

H-Scale: Architecture

H-Scale is built on Apache Hadoop ecosystem components and is designed to be healthcare data aware. It allows highly configurable ingestion of data from a variety of source systems, such as EHRs, PACS, PMS, biometric devices, mobile apps and social media, using industry-standard transport mechanisms (e.g. REST, MLLP, XDS, DICOM).
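As an illustration of one of the transports named above, the sketch below shows how an HL7 v2 message is framed for MLLP and how the message type can be read from the MSH segment. It is a minimal example written for this page, not H-Scale code: the function names and the sample message (including its application and facility names) are hypothetical, while the framing bytes are the ones MLLP defines.

```python
# MLLP frames each HL7 v2 message as: <VT> message <FS><CR>
START_BLOCK = b"\x0b"      # <VT>, vertical tab
END_BLOCK = b"\x1c"        # <FS>, file separator
CARRIAGE_RETURN = b"\x0d"  # <CR>


def mllp_wrap(hl7_message: str) -> bytes:
    """Wrap an HL7 v2 message in an MLLP frame, ready to send over TCP."""
    return START_BLOCK + hl7_message.encode("utf-8") + END_BLOCK + CARRIAGE_RETURN


def mllp_unwrap(frame: bytes) -> str:
    """Strip the MLLP framing bytes and return the HL7 v2 message text."""
    if not (frame.startswith(START_BLOCK) and frame.endswith(END_BLOCK + CARRIAGE_RETURN)):
        raise ValueError("not a complete MLLP frame")
    return frame[1:-2].decode("utf-8")


def message_type(hl7_message: str) -> str:
    """Return MSH-9 (message type), for example 'ADT^A01'."""
    msh_segment = hl7_message.split("\r")[0]  # segments are separated by <CR>
    fields = msh_segment.split("|")
    # MSH-1 is the field separator itself, so MSH-9 sits at index 8.
    return fields[8]


# Hypothetical sample message, for illustration only.
SAMPLE = (
    "MSH|^~\\&|SENDAPP|SENDFAC|RECVAPP|RECVFAC|20240101120000||ADT^A01|MSG0001|P|2.5\r"
    "PID|1||12345^^^SENDFAC^MR||DOE^JANE\r"
)

frame = mllp_wrap(SAMPLE)
received = mllp_unwrap(frame)
print(message_type(received))
```

An ingestion endpoint would typically read from the socket until it sees the end-of-frame bytes, unwrap the message, and then route it by message type.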

The ingested data is stored in a secure healthcare data lake built on standard Hadoop ecosystem components such as HDFS and HBase. Once ingested, the data can be processed, enriched and then persisted. Standard healthcare data models based upon the HL7 V3 RIM, such as FHIR, are used as guidelines to define data persistence. The data can further be reconciled against data elements already present, so that valuable business entities can be created for analytics purposes. Proper attribution, lineage and provenance are maintained to enable data governance.
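One way to picture lineage and provenance tracking is to store a small metadata record next to every payload and to link each enriched record back to the record it was derived from. The sketch below is an assumption-laden illustration, not the H-Scale data model: the `Provenance` fields, the `with_provenance` helper and the source system names are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    source_system: str   # where the payload came from (hypothetical names below)
    received_at: str     # ISO 8601 UTC timestamp taken when the record is wrapped
    payload_sha256: str  # checksum of the payload, used as its lineage identifier
    derived_from: tuple  # checksums of upstream records; empty for raw data


def with_provenance(payload: bytes, source_system: str, derived_from=()) -> dict:
    """Return the payload together with its provenance metadata."""
    provenance = Provenance(
        source_system=source_system,
        received_at=datetime.now(timezone.utc).isoformat(),
        payload_sha256=hashlib.sha256(payload).hexdigest(),
        derived_from=tuple(derived_from),
    )
    return {"payload": payload.decode("utf-8"), "provenance": asdict(provenance)}


# A raw record as it arrives from a (hypothetical) device gateway.
raw = with_provenance(b'{"patient_id": "123", "hr": 72}', "device-gateway")

# An enrichment step adds a unit and records which raw record it was built from.
enriched_payload = json.dumps({**json.loads(raw["payload"]), "hr_unit": "bpm"}).encode("utf-8")
enriched = with_provenance(
    enriched_payload,
    "enrichment-step",
    derived_from=[raw["provenance"]["payload_sha256"]],
)

print(enriched["provenance"]["derived_from"])
```

Following the `derived_from` checksums from any enriched record leads back to the raw data it came from, which is the kind of trail a data governance process needs.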

The SaaS delivery model of H-Scale allows healthcare organizations (health systems, health plans, life science companies and medical technology companies) to quickly ramp up their big data processing and analytics capabilities in a HIPAA-compliant manner. H-Scale provides core capabilities to parse, store, manage and query massive healthcare datasets, enabling organizations to focus their effort on building their own big data solutions.


The core of H-Scale contains highly configurable, Hadoop-aware ingestion and data processing engines. These allow pipelines to be defined for the ingestion and processing of data, so that organizations can easily get specific data types into the H-Scale healthcare data lake. This capability enables organizations to rapidly aggregate and analyze high-volume data without the overhead of building their own data extraction, conversion and normalization.
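The idea of a configurable pipeline can be sketched as a list of named stages that is assembled from a configuration entry for each data type. This is a plain-Python illustration only; the stage names, the configuration keys and the `build_pipeline` helper are hypothetical and do not describe H-Scale's actual configuration format.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]


def normalize_keys(record: Record) -> Record:
    """Trim and lower-case field names so downstream stages see consistent keys."""
    return {key.strip().lower(): value for key, value in record.items()}


def drop_empty(record: Record) -> Record:
    """Remove fields whose value is an empty string or None."""
    return {key: value for key, value in record.items() if value not in ("", None)}


# Registry of available stages; a pipeline configuration refers to them by name.
STAGES: Dict[str, Callable[[Record], Record]] = {
    "normalize_keys": normalize_keys,
    "drop_empty": drop_empty,
}


def build_pipeline(config: dict) -> Callable[[Iterable[Record]], Iterator[Record]]:
    """Assemble a pipeline from a config naming a data type and an ordered stage list."""
    stages = [STAGES[name] for name in config["stages"]]

    def run(records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            for stage in stages:
                record = stage(record)
            # Tag each record with its data type before it lands in the lake.
            yield {**record, "data_type": config["data_type"]}

    return run


# Hypothetical configuration for lab results.
lab_pipeline = build_pipeline(
    {"data_type": "lab_result", "stages": ["normalize_keys", "drop_empty"]}
)

results = list(lab_pipeline([{" LOINC ": "718-7", "Value": 13.5, "Note": ""}]))
print(results)
```

Supporting a new data type then means adding a configuration entry, and possibly a new stage, rather than writing a new ingestion program.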

Customized as per client requirements

The data processed by H-Scale can be exposed to downstream systems, such as a data warehouse environment, using standard SQL views. As part of the H-Scale software service implementations offered by CitiusTech, analytics use cases are customized to client requirements so that the data is processed for organization-specific insights.

Key Use Cases

Data Aggregation and Data Lake

  • Aggregation of large structured and unstructured clinical and financial data sets into a data lake
  • Clinical content extraction from physician notes, PDF documents, image files etc.

Late Binding and Schema-on-read

  • Storage of all content (without loss of any data) on HDFS/Hadoop with relevant metadata
  • Access to stored data to meet current business needs and new use cases in the future
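The late-binding idea above can be sketched in a few lines: raw content is kept exactly as received, and a schema is applied only when the data is read. The in-memory list below stands in for files on HDFS, and the field names and the `read_with_schema` helper are hypothetical.

```python
import json

# Stand-in for raw files in the data lake: each entry keeps the original
# content untouched, alongside a little metadata about where it came from.
RAW_STORE = [
    {"metadata": {"source": "ehr"}, "raw": '{"pid": "123", "hr": "72", "note": "stable"}'},
    {"metadata": {"source": "device"}, "raw": '{"pid": "123", "spo2": "97"}'},
]


def read_with_schema(store, schema):
    """Apply a schema at read time: select the requested fields and cast them.

    Fields that a raw record does not contain are returned as None.
    """
    for entry in store:
        record = json.loads(entry["raw"])
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# Today's use case only needs heart rate.
heart_rate_view = list(read_with_schema(RAW_STORE, {"pid": str, "hr": int}))

# A later use case reads oxygen saturation from the same stored data,
# with no re-ingestion and no change to what was stored.
spo2_view = list(read_with_schema(RAW_STORE, {"pid": str, "spo2": int}))

print(heart_rate_view)
print(spo2_view)
```

Because nothing is discarded at ingestion time, a new use case only requires a new schema at read time.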

Streaming Real-time Analytics

  • Network monitoring system enabling real-time alerts and notifications
  • Real-time insights and analytics on data from devices using sensor output monitoring
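A real-time alert on device data can be pictured as a sliding window over incoming readings. The sketch below is a plain-Python stand-in rather than Spark or Storm code, and the readings, window size and threshold are made-up values for illustration, not clinical guidance.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def low_average_alerts(
    readings: Iterable[Tuple[int, float]], window: int, threshold: float
) -> Iterator[Tuple[int, float]]:
    """Yield (timestamp, average) whenever the moving average of the last
    `window` readings falls below `threshold`."""
    recent = deque(maxlen=window)  # old readings fall out automatically
    for timestamp, value in readings:
        recent.append(value)
        if len(recent) == window:
            average = sum(recent) / window
            if average < threshold:
                yield timestamp, average


# Hypothetical sensor stream of (timestamp, reading) pairs.
stream = [(1, 96.0), (2, 95.0), (3, 91.0), (4, 88.0), (5, 87.0)]

alerts = list(low_average_alerts(stream, window=3, threshold=92.0))
for timestamp, average in alerts:
    print(f"alert at t={timestamp}: moving average {average:.1f}")
```

Averaging over a window rather than alerting on single readings keeps one noisy value from triggering a notification.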

Advanced Analytics

  • Exacerbation prediction from data captured through medical devices (e.g. oxygen concentrators)
  • Mining of unstructured data (e.g. discharge summaries, clinician notes) for readmission management
  • Treatment pathways built by leveraging claims data
  • Feature extraction from diagnostic images (e.g. measuring stenosis of an artery)

CitiusTech Videos

H-Scale: Accelerating Big Data in Healthcare

CitiusTech’s H-Scale enables healthcare organizations to accelerate the use of Hadoop and other big data technologies in healthcare, while addressing unique healthcare industry requirements such as data security and encryption, data privacy, user and access management, and support for interoperability (HL7, FHIR) and clinical terminology standards.

H-Scale: Healthcare Big Data Demo

Why H-Scale?

Accelerate the use of big data technologies, while addressing data security, access management and interoperability challenges in healthcare.

  • About H-Scale

    H-Scale supports real-time processing of healthcare data using Hadoop ecosystem components such as Apache Spark and Storm. The real-time processing pipeline can access persisted data so that rich insights can be generated and then distributed using standards-based mechanisms.

    H-Scale is designed to support Hortonworks, IBM BigInsights and other popular Hadoop distributions.

  • Key Success Stories

    H-Scale provides a comprehensive healthcare big data stack to store, process and query large healthcare data sets.


To learn more about H-Scale or to request a demo, please drop us an email at h-scale@citiustech.com or call us at +1 (877) CITIUS1.