--dedup - Xtool

The xtool component is more enigmatic. It stands for "external tool." In this context, --dedup xtool signals that the primary application (for example a file archiver such as zpaq, a backup utility such as restic, or a command-line data processing tool such as datamash) should not rely on its built-in, often generic, deduplication algorithm. Instead, it hands the responsibility, or at least the heavy lifting, to an external, user-specified tool. That tool could be a cryptographic hash calculator (sha256sum), a binary diffing utility (bsdiff), a content-defined chunker feeding a compressor such as lbzip2 in a custom pipeline, or even a machine learning classifier for fuzzy duplicates.
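To make the delegation idea concrete, here is a minimal Python sketch of a host program that leaves fingerprinting to an external command and only groups the results. It is an illustration under stated assumptions, not the mechanism of any application named above: the function names `fingerprint` and `find_duplicates` and the `tool_cmd` parameter are hypothetical, and the external tool is assumed to print a digest as the first whitespace-separated token of its output, as sha256sum does.

```python
import subprocess
from collections import defaultdict
from pathlib import Path


def fingerprint(path, tool_cmd):
    """Run the external tool on one file and return the first token it prints.

    Assumption: the tool takes the file path as its last argument and prints
    the digest first, e.g. "<hex digest>  <filename>" for sha256sum.
    """
    result = subprocess.run(
        [*tool_cmd, str(path)],
        capture_output=True,
        text=True,
        check=True,  # raise if the external tool exits with an error
    )
    return result.stdout.split()[0]


def find_duplicates(paths, tool_cmd=("sha256sum",)):
    """Group paths by the fingerprint reported by the external tool."""
    groups = defaultdict(list)
    for path in paths:
        groups[fingerprint(path, tool_cmd)].append(Path(path))
    # Only fingerprints shared by two or more files count as duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

Calling `find_duplicates(["a.bin", "b.bin", "c.bin"])` would return the groups of files whose digests match. Swapping `tool_cmd` for another command changes the notion of "duplicate" without touching the host program, which is the point of handing the work to an external tool.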

backup-agent run --src /data --dest /backup --dedup xtool --xtool-max-chunk 256KB

During extraction, the tool can simply point back to already-extracted data rather than processing the same stream multiple times.

Run XTool with larger chunk sizes to prioritize speed over space savings:

Run XTool with a smaller minimum chunk size to maximize space savings (at the cost of speed):
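To see why chunk size trades speed against space, here is a minimal Python sketch of content-defined chunking with a simple gear-style rolling hash. It is not XTool's implementation: the names `chunk` and `dedup_stats`, the `GEAR` table, and the `min_size`, `avg_size`, and `max_size` parameters are all assumptions made for illustration.

```python
import hashlib
import random

# Fixed pseudo-random table for the rolling hash (seeded so runs are repeatable).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data, min_size, avg_size, max_size):
    """Yield content-defined chunks of `data`.

    Assumptions: avg_size is a power of two and min_size <= max_size.
    A boundary is placed where the low bits of the rolling hash are zero,
    but never before min_size bytes and always by max_size bytes.
    """
    mask = avg_size - 1
    start = 0
    h = 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if length < min_size:
            continue
        if (h & mask) == 0 or length >= max_size:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]  # trailing partial chunk


def dedup_stats(data, min_size, avg_size, max_size):
    """Return (number of chunks produced, bytes kept after deduplication)."""
    unique = {}
    count = 0
    for piece in chunk(data, min_size, avg_size, max_size):
        count += 1
        # Identical chunks share a digest, so each is stored only once.
        unique.setdefault(hashlib.sha256(piece).digest(), len(piece))
    return count, sum(unique.values())
```

Comparing `dedup_stats(data, 16, 64, 256)` with `dedup_stats(data, 1024, 4096, 16384)` on the same input shows the shape of the trade-off: the small settings produce many more chunks to hash and index, which costs time, while giving repeated regions more chances to line up on identical chunks.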