Software

Statistics Software


CombInference: Package for combinatorial inference


The CombInference package implements the methods proposed in the paper. It is a tool for selecting graph features from high-dimensional graphical models while controlling the false discovery rate (FDR). The package applies methods such as the K-dimensional Persistent Homology Adaptive Selection (KHAN) algorithm to select significant topological features, with applications in fields such as biology, chemistry, neuroscience, and sociology.
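To make the idea of FDR control concrete, here is a minimal Python sketch of the classical Benjamini-Hochberg step-up procedure. This is a generic illustration only: it is not the KHAN algorithm and not the CombInference API, and the function name `benjamini_hochberg` and the example p-values are assumptions made for this sketch.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Hochberg
    step-up procedure at target FDR level alpha.

    Generic illustration of FDR control; NOT the KHAN algorithm.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])


# Hypothetical p-values, e.g. one per candidate graph feature.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(p_values, alpha=0.05)
print(selected)
```

In this example only the first two candidates pass the step-up thresholds. Combinatorial inference on graph features has to handle dependence among tests, which is why a dedicated procedure is used in place of a generic one like this.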

  Overview

This package is an ideal tool for researchers and data scientists working with high-dimensional graphical models and networks, allowing them to explore and quantify significant structural features with controlled false discovery rates.

RankInference: Package for ranking inference


The RankInference package implements nonparametric inferential methods for ranking large language models (LLMs) based on pairwise comparisons. It is built on the extended Bradley-Terry-Luce (BTL) model and provides a robust framework for testing hypotheses about LLM rankings and constructing confidence intervals for them across different contexts. This package facilitates reliable model comparisons, particularly in trust-sensitive fields such as medicine, law, and finance.
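As a rough illustration of ranking from pairwise comparisons, the sketch below fits the basic BTL model, in which model i beats model j with probability s_i / (s_i + s_j), using the standard minorization-maximization (MM) updates. It covers only point estimation for the plain BTL model, not the extended model or the inferential methods described above. The function names `fit_btl` and `rank_items` and the win counts are hypothetical and are not part of the RankInference API.

```python
def fit_btl(wins, n_iter=500, tol=1e-10):
    """Estimate BTL scores from a matrix of pairwise win counts.

    wins[i][j] is the number of times item i beat item j.
    Uses the classical MM update; scores are normalized to average 1.
    """
    n = len(wins)
    scores = [1.0] * n
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (scores[i] + scores[j])
                for j in range(n) if j != i
            )
            # Keep the old score if item i was never compared.
            updated.append(total_wins / denom if denom > 0 else scores[i])
        # Fix the scale, since BTL scores are identified only up to a constant.
        total = sum(updated)
        updated = [s * n / total for s in updated]
        change = max(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tol:
            break
    return scores


def rank_items(scores):
    """Return item indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


# Hypothetical win counts among three models (rows beat columns).
wins = [
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
]
scores = fit_btl(wins)
print(rank_items(scores))
```

A point estimate like this gives no indication of how stable the ranking is, which is the gap that hypothesis tests and confidence intervals for ranks are meant to fill.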

  Overview

This package is ideal for researchers, data scientists, and practitioners working with LLMs who need reliable methods for model comparison and ranking in dynamic, real-world scenarios, especially where trust and accuracy are paramount.

DIANE: Package for distributed non-convex optimization


The DIANE R package is an implementation of the methods proposed in the paper for learning feature embeddings from multi-institutional Electronic Health Records (EHR) data. The package is designed to overcome the challenges of non-convexity and communication efficiency in distributed datasets. DIANE utilizes a low-rank Ising graphical model with a non-convex bi-factored surrogate loss to estimate knowledge graphs and embeddings from large-scale binary data.

This package is ideal for researchers and data scientists working with large-scale, distributed EHR data, offering powerful methods for feature extraction, knowledge graph construction, and risk prediction while maintaining computational and communication efficiency.

FederatedRL: Package for federated reinforcement learning


The FederatedRL R package implements distributed reinforcement learning algorithms designed for multi-institutional datasets, focusing on optimizing dynamic treatment regimes (DTRs) in healthcare settings. The package is tailored for environments with privacy constraints and distributed data sources, offering efficient federated learning solutions.

  Overview

FederatedRL is designed for researchers and practitioners in healthcare and other domains requiring distributed reinforcement learning. It offers powerful tools for creating dynamic treatment regimes in environments where privacy and efficiency are essential.

Bioinformatics Software


ARCH: Package for integrating codified and narrative data from electronic health records


The ARCH R package implements methods for generating large-scale knowledge graphs (KG) by analyzing codified and narrative data from electronic health records (EHR). This package is designed to address challenges in EHR analysis by extracting meaningful relationships between clinical concepts, represented as codified data and unstructured narrative notes, using advanced representation learning techniques. It combines natural language processing (NLP) and codified data to create comprehensive, low-dimensional embeddings of clinical features with statistical certainty.

  Overview

The ARCH R package is particularly valuable for researchers and healthcare analysts working with large-scale, multi-institutional EHR datasets. It enhances data representation, feature extraction, and knowledge discovery, making it an essential tool for clinical decision-making and biomedical research.

HyperbolicEHR: Package for learning hyperbolic embeddings for electronic health records


The HyperbolicEHR package implements a multi-source hierarchical clustering algorithm specifically designed to process large-scale electronic health records (EHR) data. This package addresses the complexity and lack of organization often encountered in EHR datasets by leveraging hyperbolic geometry to construct efficient hierarchical structures. By integrating multiple sources of codified and narrative medical data, the HyperbolicEHR package enhances the analysis and interpretability of medical codes such as diagnoses, medications, and laboratory results.
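To show why hyperbolic geometry suits hierarchical data, the following sketch computes distances in the Poincaré ball, one common model of hyperbolic space. Whether HyperbolicEHR uses this particular model is an assumption made here for illustration; the function name `poincare_distance` and the example points are hypothetical and not taken from the package.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit ball.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    # Guard against rounding that could push the argument just below 1.
    return math.acosh(max(1.0, arg))


# Hypothetical embeddings: a root concept near the origin and two
# specific concepts near the boundary of the ball.
root = [0.0, 0.0]
leaf_a = [0.9, 0.0]
leaf_b = [0.0, 0.9]

print(poincare_distance(root, leaf_a))
print(poincare_distance(leaf_a, leaf_b))
```

The two leaves are much farther from each other than either is from the root, because distances grow rapidly near the boundary. That leaves room for many well-separated fine-grained codes beneath a shared parent, which is the property tree-like hierarchies need.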

The HyperbolicEHR R package is ideal for researchers and healthcare analysts seeking to leverage hierarchical clustering methods for large, complex EHR datasets. It provides a comprehensive approach to organizing and analyzing clinical data, enabling more accurate and actionable insights in healthcare applications.

MedArena: Comparing different medical large language models


MedArena provides a comprehensive framework for evaluating the performance of large language models (LLMs) specifically in the medical domain. This package is designed to test models' understanding and interpretation of medical concepts, terminology, and contextual relationships that are critical for accurate decision-making in healthcare.

The MedArena R package is a valuable tool for healthcare researchers, data scientists, and clinicians aiming to evaluate and compare the performance of large language models in medical applications. By providing a robust testing environment and domain-specific metrics, it helps assess whether models are accurate, reliable, and effective for real-world healthcare use cases.

BONME: Package for integrating multi-institutional electronic health records


The BONME R package implements the Block-wise Overlapping Noisy Matrix Embedding (BONME) method, designed for integrating and analyzing multi-source electronic health record (EHR) data. This package provides a robust framework for handling block-wise missingness in data, ensuring efficient completion of missing submatrices and enabling comprehensive EHR analysis across multiple sources.

The BONME R package is an invaluable tool for healthcare researchers, data scientists, and institutions that require robust methods to integrate and analyze large, multi-source EHR datasets. It helps uncover hidden relationships between clinical features and enhances the overall quality of healthcare analytics.