SnowPack

snowpack is a Java library that allows you to store millions of small files (called flakes) on disk, in larger files (called chunks) for performance benefits. When you store millions of small files on disk, the read performance degrades considerably because of want to disk seeks required to read the File Allocation Table as well as to read the particular disk portion. As is known, that disk seeks are expensive than disk transfer, we utilize the fact to merge many files into larger chunks, so that seeks are reduced.

The file metadata is stored in a simple LevelDB database to make sure that we have everything to read about the file from these larger chunks.

Features

  • Store millions of files into larger chunks
  • Much faster performance - store around 5000+ files totalling 250MB+ in less than a minute on commodity hardware
  • Read speeds are much faster
  • Recently used files are saved in memory cache
  • All files in the current writable chunks are in-memory cached to prevent disk seeks
  • Metadata is stored in the fast LevelDB database
  • Backups are as easy as copying the chunk files (usually 1GB in size) to the backup store
  • Recovery is possible on crash using the SnowpackRecover tool
  • Only chunk files are needed to recover

License

The library is released under the terms of Apache Public License Version 2.

Fork me on GitHub