Skip to content

codeGROOVE-dev/multicache

Repository files navigation

multicache logo

multicache

CI Go Report Card Go Reference codecov Release License

multicache is the most well-rounded cache implementation for Go today.

Designed for real-world applications in unstable environments, it has a higher average hit rate, higher throughput, and lower latency for production workloads than any other cache. To deal with process eviction in environments like Kubernetes, Cloud Run, or Borg, it also offers an optional persistence tier.

Install

go get github.com/codeGROOVE-dev/multicache

Use

cache := multicache.New[string, int](multicache.Size(10000))
cache.Set("answer", 42)
val, ok := cache.Get("answer")

With persistence:

store, _ := localfs.New[string, User]("myapp", "")
cache, _ := multicache.NewTiered(store)

_ = cache.Set(ctx, "user:123", user)       // sync write
_ = cache.SetAsync(ctx, "user:456", user)  // async write

GetSet deduplicates concurrent loads to prevent thundering herd situations:

user, err := cache.GetSet("user:123", func() (User, error) {
    return db.LoadUser("123")
})

Options

multicache.Size(n)           // max entries (default 16384)
multicache.TTL(time.Hour)    // default expiration

Persistence

Memory cache backed by durable storage. Reads check memory first; writes go to both.

Backend Import
Local filesystem pkg/store/localfs
Valkey/Redis pkg/store/valkey
Google Cloud Datastore pkg/store/datastore
Auto-detect (Cloud Run) pkg/store/cloudrun

For maximum efficiency, all backends support S2 or Zstd compression via pkg/store/compress.

Performance

multicache has been exhaustively tested for performance using gocachemark.

Where multicache wins:

  • Throughput: 551M int gets/sec avg (2.4X faster than otter). 89M string sets/sec avg (27X faster than otter).
  • Hit rate: Wins 6 of 9 workloads. Highest average across all datasets (+2.7% vs otter, +0.9% vs sieve).
  • Latency: 8ns int gets, 10ns string gets, zero allocations (3.5X lower latency than otter)

Where others win:

  • Memory: freelru and otter use less memory per entry (73 bytes/item overhead vs 14 for otter)
  • Specific workloads: sieve +0.5% on thesios-block, clock +0.1% on ibm-docker, theine +0.6% on zipf

Much of the credit for high throughput goes to puzpuzpuz/xsync and its lock-free data structures.

Run make benchmark for full results, or see benchmarks/gocachemark_results.md.

Algorithm

multicache uses S3-FIFO, which features three queues: small (new entries), main (promoted entries), and ghost (recently evicted keys). New items enter small; items accessed twice move to main. The ghost queue tracks evicted keys in a bloom filter to fast-track their return.

multicache has been hyper-tuned for high performance, and deviates from the original paper in a handful of ways:

  • Tuned small queue - 13.7% vs paper's 10%, tuned via binary search to maximize average hit rate across 9 production traces
  • Full ghost frequency restoration - returning keys restore 100% of their previous access count
  • Increased frequency cap - max freq=5 vs paper's 3, tuned via binary search for best average hit rate
  • Death row - hot items (high peakFreq) get a second chance before eviction
  • Extended ghost capacity - 1.22x cache size for ghost tracking, tuned via binary search
  • Ghost frequency ring buffer - fixed-size 256-entry ring replaces map allocations

License

Apache 2.0