Essential Readings on Database Storage
Paper Reading
Disk I/O #
- Managing Non-Volatile Memory in Database Systems, 2018, SIGMOD
- Design Tradeoffs of Data Access Methods, 2016, SIGMOD
- Designing Access Methods: The RUM Conjecture, 2016, EDBT
- The Five Minute Rule 20 Years Later and How Flash Memory Changes the Rules, 2008, ACM Queue
- The 5 Minute Rule for Trading Memory for Disc Accesses and the 10 Byte Rule for Trading Memory for CPU Time, 1987, SIGMOD
B-Tree Families #
Buffer Management #
- Virtual-Memory Assisted Buffer Management, 2023, SIGMOD
- Memory-Optimized Multi-Version Concurrency Control for Disk-Based Database Systems, 2022, VLDB
- Are You Sure You Want to Use MMAP in Your Database Management System?, 2022, CIDR
- Spitfire: A Three-Tier Buffer Manager for Volatile and Non-Volatile Memory, 2021, SIGMOD
- In-Memory Performance for Big Data, 2014, VLDB
Logging & Recovery #
- Rethinking Logging, Checkpoints, and Recovery for High-Performance Storage Engines, 2020, SIGMOD
- FineLine: log-structured transactional storage and recovery, 2018, VLDB
- Scalable Logging through Emerging Non-Volatile Memory, 2014, VLDB
Codec & Compression #
- Mostly Order Preserving Dictionaries, 2019, ICDE
- Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems, 2014, EDBT
- Dictionary-based Order-preserving String Compression for Main Memory Column Stores, 2009, SIGMOD
- Integrating Compression and Execution in Column-Oriented Database Systems, 2006, SIGMOD
- How to Wring a Table Dry: Entropy Compression of Relations and Querying of Compressed Relations, 2006, VLDB
LSM Tree & Its Variants #
- Revisiting the Design of LSM-tree Based OLTP Storage Engine with Persistent Memory, 2021, VLDB
- LSM-based Storage Techniques: A Survey, 2019, VLDBJ
- WiscKey: Separating Keys from Values in SSD-conscious Storage, 2016, FAST
- The Log-Structured Merge-Tree (LSM-Tree), 1996
Learned Index #
- From WiscKey to Bourbon: A Learned Index for Log-Structured Merge Trees, 2020
- Learning Multi-dimensional Indexes, 2019
- The Case for Learned Index Structures, 2018
Compaction Optimizations #
TODO
Column Store #
- Column Sketches: A Scan Accelerator for Rapid and Robust Predicate Evaluation, 2018, SIGMOD
- BitWeaving: Fast Scans for Main Memory Data Processing, 2013, SIGMOD
- Column Imprints: A Secondary Index Structure, 2013, SIGMOD
- SQL Server Column Store Indexes, 2011, SIGMOD
- Cache Conscious Indexing for Decision-Support in Main Memory, 1999, VLDB
- Bitmap Index Design and Evaluation, 1998, SIGMOD
Distributed Storage #
Replication & Consistency #
- Ark: A Real-World Consensus Implementation, 2014, CoRR
Replica Placement #
- Adaptive HTAP through Elastic Resource Scheduling, 2020, SIGMOD
- MorphoSys: Automatic Physical Design Metamorphosis for Distributed Database Systems, 2020, VLDB
- Autoscaling Tiered Cloud Storage in Anna, 2019, VLDB
- Automated Demand-driven Resource Scaling in Relational Database-as-a-Service, 2016, SIGMOD
Transaction & Concurrency Control #
- Scalable and Robust Snapshot Isolation for High-Performance Storage Engines, 2022, VLDB
- Memory-Optimized Multi-Version Concurrency Control for Disk-Based Database Systems, 2022, VLDB
- An Empirical Evaluation of In-Memory Multi-Version Concurrency Control, 2017, VLDB
- Serializable Snapshot Isolation in PostgreSQL, 2012, VLDB
- Serializable Isolation for Snapshot Databases, 2009, TODS
- Making Snapshot Isolation Serializable, 2005, TODS
- A Critique of ANSI SQL Isolation Levels, 1995, SIGMOD
Systems #
LeanStore:
Umbra:
Distributed NoSQL Systems:
- PNUTS to Sherpa: Lessons from Yahoo!’s Cloud Database, 2019, VLDB
- Cassandra - A Decentralized Structured Storage System, 2010, SIGOPS OSR
- PNUTS: Yahoo!’s Hosted Data Serving Platform, 2008, VLDB
- Dynamo: Amazon’s Highly Available Key-value Store, 2007, SOSP
- Bigtable: A Distributed Storage System for Structured Data, 2006, OSDI