Current Research

Operating System Design for Disaggregated Memory Architecture

(Storage) Systems Support for Large Language Models

Past Research Directions

Networked Memory Architecture (2017 - 2020)

While storage and network technologies evolve rapidly, CPU performance has remained comparatively stagnant as Moore’s law has slowed in recent years. As a result, a CPU running heavyweight storage software can easily become the bottleneck. We tackle this problem from several angles.

CPU-efficient IO Engine (2020 - 2024)

Reducing the overhead of storage software alone is not enough; system designers must also be device-aware, since emerging hardware often exhibits counterintuitive performance behavior. For example, non-volatile storage devices have asymmetric read/write performance, device-level IO amplification, and performance variability. In this context, we designed:

Low Tail Latency Concurrency Control (2020 - 2024)

Apart from seeking higher throughput and lower latency, datacenter applications also require predictable performance (often measured as the 99th or 99.9th percentile latency). Latency variability can arise for many reasons, including shared resources (e.g., CPU cores, caches, and memory bandwidth), background activities, and queuing. In recent years, an active line of research has improved performance predictability at different layers, but it overlooks the fact that the workload itself is another source of latency spikes, caused by request conflicts. Here, I take a deeper dive into concurrency protocol design with a workload-aware principle in mind.
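
To make the tail-latency metric concrete, the sketch below computes percentile latencies from a list of samples using the nearest-rank definition. It is only an illustration: the function name `percentile` and the synthetic latency values are assumptions made for this example, not measurements or code from the projects described here.

```python
import math
from fractions import Fraction


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Use exact rational arithmetic so that p = 99.9 does not suffer
    # from floating-point rounding when computing the rank.
    rank = math.ceil(Fraction(str(p)) * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical workload: 98% of requests take 100 us, while 2% are
# delayed to 5000 us (for instance, by conflicting requests).
latencies_us = [100] * 980 + [5000] * 20

p50 = percentile(latencies_us, 50)
p99 = percentile(latencies_us, 99)
p999 = percentile(latencies_us, 99.9)
print(p50, p99, p999)
```

In this synthetic trace the median stays at the fast-path latency while the 99th and 99.9th percentiles are dominated by the small fraction of delayed requests, which is why predictability is usually stated in terms of tail percentiles rather than averages.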