Virtuoso

Fast and Accurate Virtual Memory Research via Imitation-based OS Simulation

Rapid prototyping of hardware/OS co-designs for virtual memory. Powered by MimicOS, a lightweight userspace kernel that imitates only the necessary OS functionalities.

Key Features

MimicOS

A lightweight userspace OS kernel that imitates only the necessary kernel functionalities. Accelerates simulation compared to full-system Linux while providing high-level programming interfaces for developing new OS memory management routines.

Modular and Composable

7+ physical memory allocators, 5 page table formats, 8 MMU designs, 6 TLB prefetchers, and 4 speculative translation engines. All components are composable via configuration files for rapid prototyping.

Multi-Simulator Integration

Integrates with diverse architectural simulators specializing in different system design aspects. Currently supports Sniper (event-driven CPU simulator) and Ramulator2 (cycle-accurate DRAM timing) through a shared MimicOS core.

Architecture Overview

Sniper

Event-driven CPU simulator

Address Translation

Page table walkers, MMU designs, TLB hierarchies, prefetchers, speculative engines

Ramulator2

Cycle-accurate DRAM timing

Policy-based integration

MimicOS

Lightweight userspace kernel imitating OS memory management

Physical Memory Management

Allocators, buddy system, NUMA policies, HugeTLBfs, swap, exception handlers, VMA tracking

Supported Components

Physical Memory Allocators (7)

  • Baseline : Simple 4KB buddy allocator
  • ReserveTHP : Reservation-based transparent huge pages with 2MB promotion
  • SpOT : Contiguity-aware allocator exploiting OS allocation patterns (Alverti et al., ISCA '20)
  • ASAP : Prefetched address translation with aggressive superpage allocation (Margaritov et al., MICRO '19)
  • Utopia : Restricted segments (RestSegs) with direct VA-to-PA computation (Kanellopoulos et al., MICRO '23)
  • Eager Paging : Allocates contiguous physical ranges for entire VMAs, used by RMM (Karakostas et al., ISCA '15)
  • NUMA ReserveTHP : Multi-node reservation-based allocator with per-node capacity and placement policies
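As a rough illustration of what a simple 4KB buddy allocator does, here is a minimal sketch in Python. It is not MimicOS code: the class name `BuddyAllocator`, its methods, and the use of page frame numbers (PFNs) are assumptions made for this example only. It shows the two core steps, splitting a larger free block on allocation and merging a freed block with its buddy.

```python
PAGE_SIZE = 4096  # 4KB base pages, matching the Baseline allocator's granularity


class BuddyAllocator:
    """Toy buddy allocator over page frame numbers (hypothetical, for illustration)."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free_lists[k] holds the starting PFN of every free block of 2**k pages.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)  # initially one block spans all memory

    def alloc(self, order=0):
        """Return the starting PFN of a block of 2**order pages, or None if full."""
        k = order
        # Find the smallest free block that is large enough.
        while k <= self.max_order and not self.free_lists[k]:
            k += 1
        if k > self.max_order:
            return None
        pfn = min(self.free_lists[k])
        self.free_lists[k].remove(pfn)
        # Split down to the requested order, freeing each upper half.
        while k > order:
            k -= 1
            self.free_lists[k].add(pfn + (1 << k))
        return pfn

    def free(self, pfn, order=0):
        """Release a block and merge it with its buddy for as long as possible."""
        while order < self.max_order:
            buddy = pfn ^ (1 << order)  # the buddy differs only in bit `order`
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            pfn = min(pfn, buddy)
            order += 1
        self.free_lists[order].add(pfn)


allocator = BuddyAllocator(max_order=9)  # 512 pages * 4KB = 2MB of memory
first = allocator.alloc()
second = allocator.alloc()
allocator.free(first)
allocator.free(second)  # both pages free again, so blocks merge back to order 9
```

Because freed buddies merge back into larger blocks, an allocator of this kind can later serve 2MB requests from the same pool, which is the property the huge-page-oriented allocators above build on.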

Page Table Formats (5)

  • 4-Level Radix : Standard x86-64 page table with PML4, PDPT, PD, and PT levels
  • Elastic Cuckoo Hash : Cuckoo hashing page table with elastic bucket resizing (Skarlatos et al., ASPLOS '20)
  • Hash Don't Cache : Open-addressing hash table with linear probing that removes the need for page walk caches (Yaniv and Tsafrir, SIGMETRICS '16)
  • Hash Table Chaining : Chained hash table with dynamic resizing support
  • Range Table : B-tree based range translations for contiguous physical mappings
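To make the 4-level radix format concrete, the sketch below splits a 48-bit x86-64 virtual address into its four 9-bit table indices and 12-bit page offset. The function name `split_va` is an assumption for this example and is not taken from the Virtuoso code base; the bit layout is the standard x86-64 4KB-page layout.

```python
def split_va(va):
    """Split a 48-bit x86-64 virtual address into radix page table indices.

    Returns (pml4, pdpt, pd, pt, offset). Each index is 9 bits wide and the
    page offset is 12 bits wide (4KB pages).
    """
    offset = va & 0xFFF          # bits 11:0
    pt = (va >> 12) & 0x1FF      # bits 20:12
    pd = (va >> 21) & 0x1FF      # bits 29:21
    pdpt = (va >> 30) & 0x1FF    # bits 38:30
    pml4 = (va >> 39) & 0x1FF    # bits 47:39
    return pml4, pdpt, pd, pt, offset


# Build an address from known indices and recover them again.
va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_va(va))  # (1, 2, 3, 4, 5)
```

A radix walk uses these indices one level at a time, so a translation that misses everywhere costs four dependent memory accesses. The hashed formats in the list above aim to avoid that serial chain.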

MMU Designs (8)

  • Base MMU : Standard radix page table walker with configurable multi-level TLB hierarchy
  • Speculative MMU : Races speculative translation against conventional page table walks
  • POM-TLB : Part-of-Memory TLB using DRAM as a software-managed large TLB (Ryoo et al., ISCA '17)
  • Range MMU (RMM) : Range Lookup Buffer for contiguous translations with eager paging (Karakostas et al., ISCA '15)
  • DMT : Direct Memory Translation for virtualized clouds (Zhang et al., ASPLOS '24)
  • Utopia MMU : RestSeg walker with CATS prediction and page migration (Kanellopoulos et al., MICRO '23)
  • HW Fault Handler : Hardware page fault handler with a delegated memory pool
  • Virtualization MMU : Nested MMU for two-dimensional guest-to-host address translation
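The cost of two-dimensional translation can be estimated with a short worked count. The sketch below assumes radix page tables on both the guest and host side and no TLB hits, page walk caches, or nested TLBs; the function name `nested_walk_refs` is hypothetical and only serves this example.

```python
def nested_walk_refs(guest_levels=4, host_levels=4):
    """Count memory references for one nested (2D) page walk, worst case."""
    refs = 0
    for _ in range(guest_levels):
        refs += host_levels  # host walk to locate this guest page table in memory
        refs += 1            # read the guest page table entry itself
    refs += host_levels      # host walk for the final guest-physical address
    return refs


print(nested_walk_refs(4, 4))  # 24
```

With four levels on each side the count is 24 references, against 4 for a native walk. That gap is why nested translation is a common target for designs such as the ones listed above.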

TLB Prefetchers (6)

  • Agile TLB Prefetcher (ATP) : Adaptive multi-stride prefetcher with frequency-based detection (Vavouliotis et al., ISCA '21)
  • Recency Prefetcher : Pointer-table based prefetching using page access recency
  • Distance Prefetcher : Distance-indexed prediction table for irregular access patterns
  • Stride Prefetcher : Classic next-page stride-based TLB prefetcher
  • H2 Prefetcher : History-based TLB prefetcher using past access sequences
  • ASP : PC-indexed arbitrary stride prefetcher with configurable lookahead
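As an illustration of the stride-based idea behind several of these prefetchers, here is a minimal PC-indexed stride detector in Python. It is a sketch under assumptions, not the Virtuoso implementation: the class `StridePrefetcher`, the `on_miss` hook, and the rule of prefetching only after the same stride is seen twice in a row are all choices made for this example.

```python
class StridePrefetcher:
    """Toy PC-indexed stride detector for TLB misses (hypothetical interface)."""

    def __init__(self):
        # Maps a program counter to (last missing VPN, last observed stride).
        self.table = {}

    def on_miss(self, pc, vpn):
        """Record a TLB miss and return a VPN to prefetch, or None."""
        prediction = None
        if pc in self.table:
            last_vpn, last_stride = self.table[pc]
            stride = vpn - last_vpn
            # Predict only when the same non-zero stride repeats.
            if stride != 0 and stride == last_stride:
                prediction = vpn + stride
            self.table[pc] = (vpn, stride)
        else:
            self.table[pc] = (vpn, 0)
        return prediction


prefetcher = StridePrefetcher()
results = [prefetcher.on_miss(0x400, vpn) for vpn in (10, 12, 14, 15)]
print(results)  # [None, None, 16, None]
```

The first two misses only train the table, the third confirms a stride of 2 and triggers a prefetch of page 16, and the fourth breaks the pattern, so no prefetch is issued.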

Speculative Translation Engines (4)

Additional Features

  • CHiRP : Control-Flow History Reuse Prediction for dead-entry aware cache replacement (Mirbagher-Ajorpaz et al., MICRO '20)
  • MPLRU : Adaptive bandit-based cache replacement controller
  • HugeTLBfs : Pre-allocated 2MB and 1GB huge page pool management
  • Swap Cache : Page swap tracking with free page management
  • CXL Memory : Tiered DDR + CXL.mem Type-3 performance models
  • Multicore : 2- to 16-core configurations with NUMA topology support

Workshops & Tutorials

ASPLOS 2026

Full-day tutorial on hardware/OS co-design for memory management. Hands-on demos, invited talks, and live coding sessions.


MICRO 2025

Workshop on virtual memory simulation methodology and the Virtuoso framework. Research talks and demonstrations.


Getting Started

$ git clone --recursive https://github.com/CMU-SAFARI/Virtuoso.git && cd Virtuoso

See the README for full build instructions, trace downloads, running simulations, and the experiment framework. The repository includes a 22-config smoke test suite covering all supported allocators, page tables, MMU designs, TLB prefetchers, and speculative engines.

Citation

If you use Virtuoso in your research, please cite:

@inproceedings{kanellopoulos2025virtuoso,
  title     = {{Virtuoso: Enabling Fast and Accurate Virtual Memory Research via an Imitation-based Operating System Simulation Methodology}},
  author    = {Kanellopoulos, Konstantinos and Sgouras, Konstantinos and Bostanci, F. Nisa and Kakolyris, Andreas Kosmas and Konar, Berkin Kerim and Bera, Rahul and Sadrosadati, Mohammad and Kumar, Rakesh and Vijaykumar, Nandita and Mutlu, Onur},
  booktitle = {ASPLOS},
  year      = {2025}
}