Innovating Drug Discovery with a Pioneering Consortium of Biotech Partners and the Rhino Health Federated Computing Platform

Feb 29, 2024
Malhar Patel, Senior Director of Engagement
Lili Lau, Director of Product Marketing

The intersection of advanced computing technologies like Artificial Intelligence (AI) and Machine Learning (ML) with medical research is transforming drug discovery. This article walks through a blueprint for collaboration between a pioneering Consortium of biotech partners and Rhino Health, which enables privacy-preserving AI and data analytics through Rhino Health’s Federated Computing Platform (FCP).

1. The Consortium and Rhino Health’s FCP at Work

The drug discovery process poses a variety of challenges:

  1. The process fundamentally revolves around understanding how molecules and proteins in the human body interact, as a drug’s efficacy hinges on how it interacts with its target proteins, which play a pivotal role in modulating disease.
  2. This complex investigation demands substantial computational resources, making AI an essential tool to enhance the efficiency and effectiveness of the drug discovery process.
  3. Biopharma organizations are at the forefront of developing proprietary AI solutions based on their specific drug discovery data. However, they recognize a crucial limitation: the potential of their AI models is capped by the narrow scope of each organization’s datasets. More significant insights and advancements could be achieved if they could access and analyze more extensive datasets, which typically remain isolated within other biopharma organizations.
  4. The critical need to preserve the confidentiality of model IP and drug discovery data IP complicates the situation further. Direct data or model sharing between these organizations is not feasible due to stringent IP protection requirements. Consequently, they seek a solution that allows collaborative enhancement of their AI models while safeguarding their valuable intellectual assets.

In response, the Consortium has turned to Rhino Health’s FCP, which offers a neutral, robust technological framework for secure collaboration across biopharma entities. The FCP allows members to collaboratively enhance their AI models by drawing on a broader range of datasets without direct data sharing, upholding the integrity and confidentiality of each member’s IP and data.

Rhino Health’s use of Federated Computing through its FCP goes beyond the conventional scope of Federated Learning. While Federated Learning focuses on building ML models across multiple datasets without sharing the data, Federated Computing encompasses a broader spectrum of use cases. These include Federated validation of models and Federated preprocessing of data, both of which are critical in drug discovery and research.
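To make the distinction concrete, the following is a minimal sketch of Federated validation, assuming each site scores a shared model against its own local records and reports only confusion-matrix counts. The names `SiteCounts`, `local_validate`, and `pooled_metrics`, the toy threshold model, and the sample data are hypothetical illustrations and are not part of Rhino Health’s SDK.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Aggregate counts a site is willing to report (no row-level data)."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0


def local_validate(model, records):
    """Runs inside one site; only the summary counts leave this function."""
    counts = SiteCounts()
    for features, label in records:
        predicted = model(features)
        if predicted and label:
            counts.tp += 1
        elif predicted and not label:
            counts.fp += 1
        elif label:
            counts.fn += 1
        else:
            counts.tn += 1
    return counts


def pooled_metrics(site_counts):
    """Combine per-site counts into global metrics.

    Assumes at least one positive and one negative label overall.
    """
    tp = sum(c.tp for c in site_counts)
    fp = sum(c.fp for c in site_counts)
    tn = sum(c.tn for c in site_counts)
    fn = sum(c.fn for c in site_counts)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical shared model: flag a sample when its score reaches 0.5.
def threshold_model(score):
    return score >= 0.5


# Toy (score, label) pairs standing in for each site's private records.
site_a = [(0.9, True), (0.2, False), (0.6, False), (0.4, True)]
site_b = [(0.8, True), (0.1, False), (0.7, True)]

reports = [local_validate(threshold_model, site) for site in (site_a, site_b)]
metrics = pooled_metrics(reports)
```

Because only four integers per site cross the boundary, the pooled metrics match what a centralized evaluation would produce without any record being shared.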

With Rhino Health’s FCP, Consortium members can access a more comprehensive range of data analytics and ML tools while maintaining strict controls over their data. Rhino Health FCP’s approach ensures that the data never leaves the institution’s firewall, thereby preserving privacy and security.

Here are several real-world examples highlighting the effectiveness of Rhino Health’s FCP across the healthcare industry and consortia:

  1. Collaborative AI Development in Biopharma: A biopharma client used Rhino Health’s FCP to set up an academic-industry consortium for Generative AI experiments. The platform enabled seamless collaboration among diverse stakeholders, leading to rapid advancements in drug development models.
  2. Multi-Institutional Data Analysis: Rhino Health’s FCP facilitated a multi-institutional project involving ten hospitals and eight industrial partners, primarily in GDPR-regulated regions. The project focused on building AI models on CT scans, showcasing the platform’s capability to handle complex data privacy regulations.
  3. Global Healthcare Research: The FCP was instrumental in a project involving seven hospitals across four continents. Within weeks, these institutions used the FCP to validate a brain aneurysm detection model, demonstrating the FCP’s ability to support fast-paced, global research endeavors.

These cases exemplify the Rhino Health FCP’s ability to combine diverse datasets and research efforts under a unified, secure framework. By leveraging the platform, the Consortium has been able to transcend traditional boundaries in drug discovery, paving the way for more innovative, collaborative, and efficient research methodologies.

2. The Rhino Health Federated Computing Platform

Rhino Health’s FCP is an innovative solution that addresses the need for privacy-preserving AI and data analytics. It provides a distributed computing framework that enables seamless collaboration between data owners while maintaining stringent data privacy and adhering to global healthcare privacy laws such as HIPAA¹ in the US and GDPR² in the EU.

2.1. FCP Functionality and User Experience

The FCP uses Edge Computing and Federated Learning (FL) technologies. This approach allows centralized orchestration of custom workflows and AI model training without necessitating the centralization of sensitive data. Such a structure lowers the barriers to AI adoption in highly regulated industries like healthcare and facilitates multi-site data access without complex collaborations or data transfer agreements.
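As an illustration of how model training can be orchestrated centrally while data stays local, here is a minimal sketch of federated averaging (FedAvg) on a toy linear model. This is a generic textbook technique, not a description of how the FCP implements Federated Learning; the function names, learning rate, round counts, and data are all assumptions chosen for the example.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Gradient descent on one site's data; returns new weights and sample count."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error for the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n


def federated_average(updates):
    """Average site weights, weighting each site by its number of samples."""
    total = sum(n for _, n in updates)
    w = sum(site_w[0] * n for site_w, n in updates) / total
    b = sum(site_w[1] * n for site_w, n in updates) / total
    return (w, b)


# Toy private datasets; both follow y = 2x + 1 but cover different x ranges.
sites = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]

global_weights = (0.0, 0.0)
for _ in range(100):
    # Each site trains locally; only weights and sample counts are exchanged.
    updates = [local_update(global_weights, data) for data in sites]
    global_weights = federated_average(updates)
```

Neither site ever sees the other’s data points, yet the shared model recovers the relationship that spans both datasets.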

The FCP is user-friendly, offering access through a web-based interface, Python³ SDK⁴ for more customized interactions, and a REST API. This flexibility accommodates various user preferences and technical expertise levels.

2.2. FCP Architecture and Security

Rhino Health’s FCP architecture is composed of two main components:

  1. Rhino Health Client: This software is installed on a virtual machine or physical server, typically behind the data custodian’s firewall (on-prem or on a Virtual Private Cloud). The Rhino Health Client has direct access to local datasets and the necessary computational resources (GPUs⁵/CPUs⁶) to securely run computational workloads like AI model training without moving sensitive data outside the firewall.
  2. Rhino Health Cloud: Responsible for task orchestration across Rhino Health Clients (e.g. for Federated Learning (FL) tasks), the Rhino Health Cloud serves as the access point for all user interactions with the FCP. Hosted on AWS, the Rhino Health Cloud ensures that patient data is not transferred to it, thus enabling collaborations with institutions having restrictive data-sharing policies.

2.2.1. Security Features

  • Compliance and Certifications: The FCP is HIPAA and GDPR compliant and holds ISO-27001⁷ and SOC-2 type 2⁸ certifications, demonstrating its commitment to meeting high-security standards.
  • Data Encryption: Both in transit and at rest, data within the FCP is encrypted, providing robust protection against unauthorized access.
  • Zero Trust Architecture: The platform operates on the principle of least privilege, ensuring that access is tightly controlled and monitored.
  • Data Privacy: The platform upholds data privacy through techniques like de-identification and differential privacy. This allows data to be used for collaborative purposes while minimizing risks of personal data exposure.
  • Secure Data Processing: All data processing within the FCP occurs within the confines of the institution’s firewall, ensuring that sensitive patient data remains secure.
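The article does not specify which differential privacy mechanism the platform uses. As a general illustration of the idea, the sketch below applies the classic Laplace mechanism to a counting query; the function names, the `epsilon` value, and the data are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count satisfying epsilon-differential privacy.

    A counting query has sensitivity 1: adding or removing one record
    changes the true count by at most 1, so the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Toy example: how many measurements exceed a threshold?
measurements = [0.2, 0.7, 0.9, 0.4, 0.8]
rng = random.Random(0)  # seeded only to make the example repeatable
noisy = private_count(measurements, lambda v: v > 0.5, epsilon=1.0, rng=rng)
```

Smaller values of `epsilon` add more noise and give stronger privacy; larger values give more accurate answers with weaker guarantees.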

2.2.2. Layered Structure of Rhino Health’s FCP

1. User Interaction Layer:

This layer is where users interface with the platform, primarily through two avenues:

  • Web UI⁹: A graphical user interface accessible via browsers, providing an intuitive and visual way for users to interact with the platform.
  • Python SDK: A Python-based software development kit for more technical and customized interactions with the platform. This is particularly useful for users who prefer to work within Jupyter Notebooks¹⁰ or similar Python environments.

2. Rhino Health Cloud Environment Layer:

This layer involves the Rhino Health Cloud Environment, which communicates with the Web UI and Python SDK. It comprises various components like the REST API, facilitating tasks such as Federated Learning orchestration, experiment management, distributed ETLs¹¹, secure data annotation, and audit and traceability.

The Cloud Environment is characterized by its handling of parameters, statistics, and data indexing, with a strict policy of not storing or processing Personal Health Information (PHI) or Personally Identifiable Information (PII).

3. Rhino Health Client Interface Layer:

The Rhino Health Client, installed within the institution’s network, forms this layer. It manages local data storage (raw and de-identified data) and securely interfaces with the cloud environment through encryption protocols like TLS v1.3¹².

The Rhino Health Client includes components for Federated Learning and utilizes the institution’s computational resources, such as GPUs, to run necessary workloads without exporting sensitive data.
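For readers curious what enforcing TLS v1.3 looks like in practice, the following sketch uses Python’s standard `ssl` module to build a client context that refuses anything older. It illustrates the protocol requirement only and is not taken from the Rhino Health Client; the helper name `make_tls13_context` is hypothetical, and the code assumes Python 3.7 or later linked against OpenSSL 1.1.1 or later.

```python
import ssl


def make_tls13_context():
    """Build a client-side context that only negotiates TLS 1.3."""
    # The default context verifies server certificates and hostnames.
    context = ssl.create_default_context()
    # Reject TLS 1.2 and older during the handshake.
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = make_tls13_context()
```

A context like this can be passed to `context.wrap_socket(...)` or to HTTP clients that accept an `ssl.SSLContext`.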

4. Data Source Connection Layer:

This final layer comprises hospital IT systems and data sources, such as DICOM¹³ imaging archives, HL7¹⁴ interfaces, and genomics databases. These systems connect to the Rhino Health Client, either directly or through secure protocols like SFTP¹⁵ or DICOMweb, allowing the transfer of data to the local data store in a safe and controlled manner.
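Before records from such systems are used for collaborative work, a de-identification step typically removes direct identifiers. The sketch below shows one simple approach, assuming a flat record, a short list of identifier fields, and a site-held secret key for generating stable pseudonyms. The field names and helper functions are hypothetical, and real de-identification (for example, under HIPAA’s Safe Harbor method) covers many more identifier categories than this toy example.

```python
import hashlib
import hmac

# Hypothetical field names; a real schema would list many more identifiers.
DIRECT_IDENTIFIERS = {"patient_name", "address", "phone"}


def pseudonymize(patient_id, secret_key):
    """Derive a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def deidentify(record, secret_key):
    """Drop direct identifiers and replace the patient ID with a pseudonym."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    cleaned["pseudonym"] = pseudonymize(record["patient_id"], secret_key)
    return cleaned


# Toy record; the key would be generated and kept inside the institution.
site_key = b"example-site-secret"
raw = {
    "patient_id": "12345",
    "patient_name": "Example Patient",
    "phone": "000-0000",
    "finding": "aneurysm",
}
safe = deidentify(raw, site_key)
```

Using a keyed hash rather than a plain hash means the pseudonym cannot be recomputed by anyone who lacks the site’s key, while the same patient still maps to the same pseudonym across studies at that site.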

2.2.3. Flow of Interaction in Rhino Health FCP

  • The architecture facilitates a smooth interaction flow from user interfaces to the core data processing units while maintaining data security and privacy.
  • Users can manage and configure projects, visualize data, and generate reports through the Web UI or Python SDK. These interactions pass through the Rhino Health Cloud, where they are processed and directed accordingly without exposing sensitive patient data.
  • The Rhino Health Client is a local processing hub within the institutional environment. It securely handles data for various computational tasks, including AI model training, under the governance of the cloud environment.
  • Data from the custodian’s systems is securely ingested into the Rhino Health Client, ensuring that sensitive data is not exposed outside the institution’s firewall at any point.
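The interaction flow above can be sketched as a small simulation in which a central orchestrator asks each local client for summary statistics and combines them. The classes `LocalClient` and `Orchestrator`, the `Summary` record, and the sample values are hypothetical stand-ins for the Rhino Health Client and Rhino Health Cloud roles, not their actual interfaces.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """The only payload that leaves a site: count, sum, and sum of squares."""
    n: int
    total: float
    total_sq: float


class LocalClient:
    """Stands in for software running behind an institution's firewall."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows  # raw rows never leave this object

    def run_summary(self, column):
        values = [row[column] for row in self._rows]
        return Summary(len(values), sum(values), sum(v * v for v in values))


class Orchestrator:
    """Stands in for the central service that dispatches tasks to sites."""

    def __init__(self, clients):
        self.clients = clients

    def federated_mean_std(self, column):
        summaries = [client.run_summary(column) for client in self.clients]
        n = sum(s.n for s in summaries)
        mean = sum(s.total for s in summaries) / n
        # Population variance from pooled sufficient statistics.
        variance = sum(s.total_sq for s in summaries) / n - mean ** 2
        return mean, math.sqrt(max(variance, 0.0))


# Toy data held separately at two sites.
site_a = LocalClient("site_a", [{"value": 1.0}, {"value": 2.0}, {"value": 3.0}])
site_b = LocalClient("site_b", [{"value": 4.0}, {"value": 5.0}])

orchestrator = Orchestrator([site_a, site_b])
mean, std = orchestrator.federated_mean_std("value")
```

The orchestrator obtains the same mean and standard deviation it would get from the pooled data, while handling only three numbers per site.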

The FCP architecture seamlessly integrates user-friendly interfaces with robust backend processing, ensuring secure and efficient data handling for AI and ML tasks in healthcare. This design respects the privacy of patient data and provides a versatile platform for collaborative medical research and innovation.

3. The Future of AI in Healthcare and Rhino Health FCP’s Role

The Rhino Health FCP’s integration into the Consortium’s workflow has transformed the drug discovery process by enabling the secure sharing and analysis of data across multiple organizations without compromising data privacy. The FCP can significantly accelerate the development of new drugs by the consortium, especially as the Rhino Health Client can be installed in secure environments within weeks and leveraged for multiple projects. The ability to process vast amounts of data quickly and safely across different research environments can shorten the timeline from research to clinical trials, potentially bringing life-saving treatments to market faster.

The consortium’s use of AI and ML, facilitated by Rhino Health’s FCP, opens new doors in personalized medicine. By analyzing diverse datasets from various institutions, researchers can develop more nuanced and effective treatment plans tailored to individual patient profiles. This approach promises to enhance the efficacy of treatments and minimize adverse reactions, leading to improved patient outcomes.

Rhino Health’s FCP is setting a new standard for collaborative healthcare research. Its ability to maintain data integrity and confidentiality while allowing for comprehensive data analysis across institutions has shown that concerns over data privacy and IP protection need not hinder collaborative research. This model can be replicated in other research consortia, leading to more breakthroughs in various medical fields.

One of the notable successes of this collaboration is navigating the complex landscape of regulatory compliance. Using Rhino Health’s FCP, the consortium has demonstrated that it is possible to conduct expansive, collaborative research within the bounds of stringent regulatory requirements such as GDPR and HIPAA. This compliance is crucial for the global expansion of similar research initiatives.

The partnership is a testament to the potential of AI and Federated Computing in healthcare. It is a model that can be emulated in other areas of medicine, from genomics to epidemiology. The success of this initiative suggests that the future of healthcare will be increasingly data-driven, collaborative, and personalized.

Conclusion

The collaboration between the pioneering Consortium of biotech partners and Rhino Health underscores the transformative potential of AI and Federated Computing in advancing healthcare. It sets a precedent for future innovations in drug discovery and patient care. Rhino Health’s FCP, at the heart of this collaboration, emerges as a critical enabler of this new era in healthcare research and innovation.

Discover the potential of Federated Computing technology in healthcare with Rhino Health. Connect with us now to learn more about our solutions and how they can benefit your organization.

Notes:

(1) HIPAA: The “Health Insurance Portability and Accountability Act” is a US law protecting patients’ medical information (“Protected Health Information” or PHI), setting standards for its use, disclosure, and security.

(2) GDPR: The “General Data Protection Regulation” is an EU regulation protecting the personal data of all individuals within the EU, including sensitive data like health information.

(3) Python: A high-level, interpreted programming language known for its readability and versatility, widely used in various fields including web development, data analysis, and artificial intelligence.

(4) SDK: Software Development Kit is a collection of software tools and libraries developers use to create applications for specific software, hardware platforms, or operating systems.

(5) GPUs: Graphics Processing Units are specialized processors designed to perform many calculations in parallel. Originally built for rendering graphics, they are now widely used to accelerate computationally intensive workloads such as AI model training.

(6) CPUs: Central Processing Units are the primary components of a computer that perform most of the processing inside the computer. Also known as the ‘brain’ of the computer, CPUs handle basic instructions from a computer’s software and hardware, executing the primary arithmetic, logic, control, and input/output (I/O) operations specified by the instructions.

(7) ISO-27001: An international standard that outlines best practices for an information security management system (ISMS), ensuring the secure handling of sensitive data.

(8) SOC-2 type 2: A type of audit report that evaluates an organization’s information systems relevant to security, availability, processing integrity, confidentiality, or privacy over a specified period.

(9) UI: User Interface is the means by which a user interacts with a computer system, software application, or any electronic device, typically involving elements like screens, keyboards, and menus.

(10) Jupyter Notebook: An open-source web application that allows the creation and sharing of documents containing live code, equations, visualizations, and narrative text, commonly used for data cleaning and transformation, numerical simulation, and machine learning.

(11) Distributed ETL: A data processing framework that extracts data from various sources, transforms it into a structured format, and loads it into a system for analysis, distributed across multiple computing environments to enhance efficiency and scalability.

(12) TLS v1.3: Transport Layer Security version 1.3 is the latest version of the internet security protocol used to encrypt data transmitted over a network, ensuring secure communication between clients and servers, with enhancements in speed and security compared to previous versions.

(13) DICOM: Digital Imaging and Communications in Medicine is an international standard used for storing, transmitting, and retrieving medical images, enabling the integration of medical imaging devices such as scanners, servers, workstations, and network hardware.

(14) HL7: Health Level Seven is a set of international standards, maintained by Health Level Seven International, for the exchange, integration, sharing, and retrieval of electronic health information, aimed at standardizing the transfer of clinical and administrative data.

(15) SFTP: SSH File Transfer Protocol is a secure file transfer protocol that provides file access, transfer, and management functionalities over any reliable data stream, typically used for secure file transfer operations between a client and a server.