Research
Visit our GitHub page to find our most recent research projects: https://github.com/msqc-goethe
High-level Illustration of the Generic MSA concept
Research Scope and Projects
Our group's main research focus is the novel architecture and programming paradigm of modular supercomputing.
This includes communication networks for optimized interaction between, and scheduling of, disaggregated resources such as CPU and GPU clusters. It also covers the integration of future non-von-Neumann architectures, especially quantum processors, as modular compute resources for hybrid quantum-HPC calculations, as well as design concepts for modular datacenters.
Research Topics
Simulation and Analysis of Heterogeneous Network and Storage Infrastructures in HPC and MSA systems
Modular and Automated Monitoring/Analysis of System Utilization and Workload Behavior in HPC systems
Parallel I/O and Storage Optimization of Emerging Application Workloads and their I/O Subsystems (DL, AI, Big Data, Workflows etc.)
Container Standards and Container Monitoring in HPC Systems
HPC System Benchmarking – Taxonomy, Comparability, Reproducibility
Analysis of Correlation between Generated Network Traffic Patterns and File Access Patterns / Network Traffic and I/O
Integration of non-von-Neumann architectures such as quantum computers into modular supercomputing systems
Design, Porting, Benchmarking and Practical Use of Quantum Algorithms
Hybrid HPC-Quantum-Computing Algorithms
MAWA-HPC: Modular and Automated Workload Analysis for HPC
Given the complexity of modern supercomputers and HPC systems, achieving theoretical peak performance depends on myriad parameters. To optimize system performance and use the underlying resources efficiently, various methods can be applied, including simulation, benchmarking, and monitoring. However, these methods and their tools are not compatible with each other: each tool and each approach covers only part of a single domain, e.g., network, I/O, or resource allocation. At the same time, each approach generates knowledge that can be applied to similar problems or to a given system configuration. To keep such knowledge from being generated for one-time use only, and to support other users, it must be easily accessible to the community. The MAWA-HPC project aims to develop a generic workflow and tool suite that can be applied to different use cases in different domains. Through its modular design, the workflow should support different community tools at each stage, increasing the compatibility of the tools and covering new use cases.
Toward the Integration of Quantum Processors into the Modular Supercomputing Architecture
In this project, we show that quantum processors fit naturally into the MSA concept, as they are a highly specialized resource and therefore well suited only for specific applications. We show how this integration can be achieved by extending the OmpSs-2 programming model, which has already been used in the MSA context for other accelerators. In the classical code, functions can be offloaded to the QPU using pragma directives. The quantum sources that implement these functions reside in a separate file and are compiled with a customized quantum compilation toolchain. At runtime, offloading is handled by a task-aware accelerator library in OmpSs-2, of the kind already available for other accelerators. The integration of QPUs into the MSA concept is of particular interest for Variational Quantum Algorithms (VQA), as they require close interaction between classical and quantum processors.
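To illustrate why VQAs need such close classical-quantum interaction, here is a minimal sketch in Python. It is not the OmpSs-2 integration described above: the QPU is replaced by a closed-form stand-in (the expectation value of Z after an RY(theta) rotation on |0>, which equals cos(theta)), and all function names and parameter values are hypothetical. The point is only the structure of the loop, in which every classical optimizer step triggers several quantum evaluations.

```python
import math


def expectation_z(theta: float) -> float:
    """Stand-in for a QPU call (assumption: no real hardware is used).

    For a single qubit prepared as RY(theta)|0>, the expectation <Z> is cos(theta).
    """
    return math.cos(theta)


def parameter_shift_gradient(evaluate, theta: float) -> float:
    """Estimate d<Z>/d(theta) from two additional circuit evaluations."""
    shift = math.pi / 2
    return 0.5 * (evaluate(theta + shift) - evaluate(theta - shift))


def run_vqa(evaluate, theta: float = 0.4, learning_rate: float = 0.4, steps: int = 60):
    """Hybrid loop: quantum evaluations alternate with classical updates."""
    history = []
    for _ in range(steps):
        energy = evaluate(theta)                          # offloaded "quantum" step
        grad = parameter_shift_gradient(evaluate, theta)  # two more quantum calls
        theta -= learning_rate * grad                     # classical gradient descent
        history.append(energy)
    return theta, evaluate(theta), history


if __name__ == "__main__":
    theta, energy, history = run_vqa(expectation_z)
    print(f"theta = {theta:.6f}, energy = {energy:.6f}, iterations = {len(history)}")
```

Each iteration makes three calls to the evaluation function, so the latency between the classical host and the quantum resource directly bounds how fast the optimization can proceed.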
EUPEX Project: European Pilot for Exascale
The EuroHPC EUPEX consortium will design and build a European modular, exascale-ready pilot system that integrates European general-purpose processor technology (EPI), interconnect technology (BXI), and an HPC software stack based on the modular supercomputing architecture (MSA). The Modular Supercomputing and Quantum Computing group will contribute to these key technologies. In particular, we will evaluate direct communication methods for distributed accelerator technologies and adapt them for the new pre-exascale evaluation platform EUPEX. We will also investigate I/O load balancing in parallel file systems for application data and metadata and introduce suitable optimization approaches.
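As a rough illustration of what I/O load balancing across storage targets can mean, the sketch below uses a generic greedy heuristic: requests are sorted by size, and each one is placed on the target that currently holds the least data. This is not the approach used in EUPEX or in any specific file system; the function name, the request sizes, and the number of targets are hypothetical.

```python
import heapq


def balance_greedy(request_sizes, num_targets):
    """Assign each request to the currently least-loaded storage target.

    Requests are handled largest first, a common heuristic for keeping
    the final loads close to each other.
    """
    # Min-heap of (current_load, target_id); the least-loaded target is on top.
    heap = [(0, target) for target in range(num_targets)]
    heapq.heapify(heap)

    placement = {}
    order = sorted(range(len(request_sizes)), key=lambda i: -request_sizes[i])
    for idx in order:
        load, target = heapq.heappop(heap)
        placement[idx] = target
        heapq.heappush(heap, (load + request_sizes[idx], target))

    # Recompute the per-target loads from the final placement.
    loads = [0] * num_targets
    for idx, target in placement.items():
        loads[target] += request_sizes[idx]
    return placement, loads


if __name__ == "__main__":
    sizes = [8, 7, 6, 5, 4]  # hypothetical request sizes, e.g. in GiB
    placement, loads = balance_greedy(sizes, num_targets=2)
    print(placement, loads)
```

A production balancer would also have to account for metadata servers, striping, and changing load over time, none of which this sketch models.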
NHR Project: Container Standards in HPC
The NHR project focuses on implementing a central repository for curated user containers and containerized services that are portable among the participating NHR centers and other HPC sites. These developments will also be used to provide HPC-as-a-Service to NHR users. In addition, the project will provide documentation and best practices for container runtimes and container management solutions, and it will evaluate and implement security mechanisms and monitoring instrumentation for containers. The NHR (National High-Performance Computing) network consists of several centers that operate the high-performance computers and offer a coordinated consulting service on the methods of scientific high-performance computing. The goal is to provide scientists at German universities with the computing capacity they need for their research and to strengthen their skills in using this resource efficiently.
Joint Project: Hydrogeological Modeling on a Regional Scale (HYMNE II)
The simulation code "distributed density-driven flow" (d³f++) simulates density-dependent groundwater flow and nuclide transport in the geosphere, including all interactions currently relevant for long-term safety analysis, for large, hydrogeologically complex areas in practicable computation times. It uses state-of-the-art, efficient, and scalable solution methods for parallel computers, which are being developed further in the HYMNE II project. This work involves time parallelism, further development of the LIMEX multigrid method, improved data adaptivity, and modular model coupling. HYMNE II is carried out with partners from the Gesellschaft für Anlagen- und Reaktorsicherheit (GRS) and the company TechSim UG.
Selected Research Projects
A Uniform Benchmark Generation Approach for Big Data Applications based on Decision Trees
MAWA-HPC: Modular and Automated Workload Analysis for HPC Systems
GPI-Bench: A Comprehensive Benchmark Suite for GASPI
Tarazu: A Dynamic End-to-End I/O Load Balancing Framework