ZFS Guide
If you care about your data, you should use ZFS. Personally, I think this is the best choice for most use cases where data integrity is a factor. If you are building a NAS then you should really use ZFS. A close contender would be BTRFS.
What is ZFS?
ZFS is an advanced, next-generation file system. It solves many of the problems with existing filesystems, and it combines the features of a filesystem and a volume manager in one piece of software. It is a 128-bit file system, so it provides ridiculous, astronomically large limits on filesystem size (256 quadrillion zettabytes) and individual file size (16 exbibytes).
It provides:
- RAID
- snapshots
- integrity checking
- automatic repair
- much more …
Without integrity checking, your data is at risk of being corrupted. ZFS protects data from otherwise undetectable corruption such as bit rot.
Super Tiny ZFS Cheat Sheet
This will bring you from zero to semi-competent ZFS admin in about 30 seconds.
| Command | What it does |
| --- | --- |
| sudo apt update | get latest info from repo |
| sudo apt install zfsutils-linux | install |
| sudo zpool create pool1 mirror sdc sdd | RAID 1 mirror |
| sudo zpool create pool1 mirror sdb sdc mirror sdd sde | RAID 10 |
| sudo zpool status | show status |
| sudo zpool replace pool1 sdd sde | replace failed disk |
| sudo zpool destroy pool1 | destroy pool |
| sudo zpool scrub pool1 | scrub pool |
ZFS Quick Start
WARNING - Read the section on pool scrubbing. It is important.
NOTE - On older versions of Ubuntu you had to use a different package and add an additional repository.
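For reference, on those older releases the procedure looked roughly like this (package and PPA names from memory, so treat this as a sketch rather than exact instructions):
sudo add-apt-repository ppa:zfs-native/stable # add the old ZFS on Linux PPA
sudo apt update
sudo apt install ubuntu-zfs # the pre-16.04 package name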
Pools can be created using entire disks, slices / partitions, or files.
TIP - When creating a zpool, you can use “-f” to force creation if you get an error saying a disk has no EFI label but may contain partition information in the MBR.
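For example, forcing creation on disks that previously held another filesystem might look like this (pool and device names are just placeholders):
sudo zpool create -f pool1 mirror /dev/sdb /dev/sdc # force past the leftover-MBR warning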
Install on Ubuntu 16.04/18.04
sudo apt update
sudo apt install zfsutils-linux
whereis zfs
Info Commands
Before setting up a pool, you can check what devices are available on your system:
sudo fdisk -l
You can list out existing zpools with the list command and you can check the status of your zpools with the status command.
sudo zpool list
sudo zpool status
Striping - RAID 0
You can create a basic striped pool like this. It will have no redundancy but should provide improved performance and increased space. Note that even though you don’t get redundancy with RAID 0 striping, ZFS still gives you integrity checking. I still wouldn’t do this unless you don’t care about your data.
sudo zpool create new-pool /dev/sdb /dev/sdc # striped
Mirroring - RAID 1
If you want redundancy, you can create a basic 2 disk mirror like this.
sudo zpool create new-pool mirror /dev/sdb /dev/sdc # Mirrored, mounted at /new-pool
If you want a three-way mirror for additional redundancy, you can do that too. You can specify as many disks as you like.
sudo zpool create tank mirror ada1 ada2 ada3 # 3 disks all mirrored together
RAID 10 - Striped Mirror
You can create a RAID 10 device also. This basically stripes across two mirrors.
You can create a RAID 10 pool like this:
sudo zpool create example mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
In the output of zpool status, the vdevs will show up as “mirror-0” and “mirror-1”.
Note that you can also expand a RAID 1 pool by adding another mirror. This will also result in a RAID 10 pool.
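A minimal sketch of that expansion, assuming the two-disk mirror pool “new-pool” from above and two fresh disks:
sudo zpool add new-pool mirror /dev/sdd /dev/sde # pool is now a stripe over two mirrors (RAID 10)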
RAID Z
You can also create RAID Z file systems if you like. Here are some examples showing how you would do that for RAID Z1/2/3.
zpool create tank raidz1 ada1 ada2 ada3 # create pool with a RAID Z1 vdev
zpool create tank raidz ada1 ada2 ada3 # raidz is an alias for raidz1
zpool create tank raidz2 ada1 ada2 ada3 ada4 ada5 # RAID Z2
zpool create tank raidz3 ada1 ada2 ada3 ada4 ada5 ada6 # RAID Z3
You can read more about RAID Z in another section further down in this document.
Alternate Mount Point
The default mount point matches the name of the pool. For example, a pool called test1 would be mounted at /test1. You can specify an alternate mount point with the -m switch.
sudo zpool create -m /alt-location new-pool mirror /dev/sdb /dev/sdc
Destroy a Pool
WARNING - You will lose all data on a pool when you destroy it.
You can destroy a pool like this:
sudo zpool destroy new-pool
Create a File System
NOTE - each zpool will have a filesystem created by default.
You can create and destroy additional file systems like this:
sudo zfs create test-pool1/dataset1 # create FS
sudo zfs destroy mypool/tmp # destroy an FS
If you create a new FS named “test-pool1/dataset1”, the default mount point will be /test-pool1/dataset1.
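If you want to double-check where a dataset ended up, the mountpoint property can be read back with zfs get:
sudo zfs get mountpoint test-pool1/dataset1 # show the mount point for the dataset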
Using a File as a Device
You can also build a pool out of ordinary files, which is handy for testing:
dd if=/dev/zero of=/home/user/test1.img bs=1M count=2048
dd if=/dev/zero of=/home/user/test2.img bs=1M count=2048
sudo zpool create pool-test /home/user/test1.img /home/user/test2.img
sudo zpool status
Expanding
- zpool add - This adds a vdev to a pool. This generally results in more storage space.
- zpool attach - This attaches a device to a vdev in the pool. This generally results in more redundancy.
Attaching disks:
You can attach a disk like this. It will add ada4 to whichever vdev has ada1, giving that vdev an extra mirror. This results in no capacity increase, just more redundancy.
zpool attach tank ada1 ada4
You can detach a disk like this.
zpool detach tank ada4
Adding vdevs:
Adding a RAID Z1 vdev to a pool is done like this. This will increase capacity.
zpool add tank raidz1 ada4 ada5 ada6
Adding a mirrored vdev to a pool is done like this:
zpool add tank mirror ada4 ada5 ada6 # add a 3-disk mirror vdev
zpool add OurFirstZpool ada4 # add a single disk as a new (non-redundant) vdev
Replacing a Failed Drive
We have an entire separate guide for this here:
Things You Can Do
Spares:
Hot spares can be used with any redundant vdev, not just RAID Z.
zpool add geekpool spare c1t3d0 # adding a spare to a pool
zpool status # shows up under status
zpool set autoreplace=on geekpool # set auto replace to on
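If you want to take the spare back out, zpool remove handles spare (as well as cache and log) devices:
zpool remove geekpool c1t3d0 # remove the spare from the pool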
Dry run:
use -n for a dry run when creating a pool
zpool create -n geekpool raidz2 c1t1d0 c1t2d0 c1t3d0
Export / Import Pool
- writes all unwritten data
- removes the pool from the system
zpool export geekpool # export it
zpool list # won't show the pool anymore
zpool export -f geekpool # force if something is mounted
zpool import # show pools that can be imported
zpool import -d / # show pools that use files as devices
zpool import tank1 # import by name
zpool import 940735588853575716 # import by ID
zpool import -d / geekfilepool # import pool with files as devices
zpool import -f geekpool # force
Quotas, reservations
- Quota - the FS can’t use more than this amount.
- Reservation - this much space is reserved for the FS and won’t be available to other filesystems. If a quota is defined first, a reservation can’t be set higher than the quota.
zfs set quota=500m geekpool/fs1
zfs set reservation=200m geekpool/fs1
zfs list # the USED and AVAIL columns reflect the quota and reservation
Set mount point:
zfs set mountpoint=/test geekpool/fs1
df -h | grep /test
Intent Log
ZIL (ZFS Intent Log)
Add a disk to be used for the intent log.
This can speed up synchronous writes. Typically you would use a fast disk like an SSD.
sudo zpool add -f mypool log /dev/sdg
ZFS Cache Drives
Cache drives (the L2ARC) add a layer of caching between RAM (the ARC) and the main storage drives. Typically, you would use a faster SSD for caching and larger mechanical drives for main storage.
You can add a cache drive to a zpool like this.
sudo zpool add -f mypool cache /dev/sdh
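To see whether the cache device is actually absorbing I/O, you can watch per-device statistics:
sudo zpool iostat -v mypool # per-vdev I/O stats, including the cache device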
Compression
lz4 is considered a good, fast, and safe option. Note that changing the compression property only affects data written after the change; existing data is not rewritten.
sudo zfs set compression=on mypool/projects
sudo zfs set compression=gzip-9 mypool
sudo zfs set compression=lz4 mypool
sudo zfs get compressratio # check compression ratio
ZFS Snapshots
- read only copy of FS
- saves the state of the FS at that point in time
- can be used to roll back
- can extract files from the snapshot
sudo zfs snapshot -r mypool/projects@snap1 # create snapshot
sudo zfs list -t snapshot # list snapshots
rm -rf /mypool/projects/* # destroy all the files
sudo zfs rollback mypool/projects@snap1 # rollback to snapshot
sudo zfs destroy mypool/projects@snap1
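You can also pull individual files out of a snapshot without rolling back: every dataset exposes its snapshots read-only under a hidden .zfs directory. A quick sketch using snap1 (file.txt is just a hypothetical file name):
ls /mypool/projects/.zfs/snapshot/snap1/ # browse the snapshot
cp /mypool/projects/.zfs/snapshot/snap1/file.txt /mypool/projects/ # restore a single file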
ZFS Clones
- writable copy of FS
- can only be created from a snapshot
- snapshot can’t be destroyed until all clones are destroyed
sudo zfs snapshot -r mypool/projects@snap1
sudo zfs clone mypool/projects@snap1 mypool/projects-clone
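If a clone ends up becoming the “real” filesystem, zfs promote reverses the parent/child relationship so the original dataset and snapshot can be destroyed:
sudo zfs promote mypool/projects-clone # the clone becomes the parent; the snapshot now belongs to it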
ZFS Send and Receive
- send - A snapshot can be streamed to a file or other location.
- receive - A stream can be received to create a new filesystem.
- These are great for backups.
Backup a snapshot to a file. Then restore it to a new FS.
sudo zfs snapshot -r mypool/projects@snap2
sudo zfs send mypool/projects@snap2 > ~/projects-snap.zfs
sudo zfs receive -F mypool/projects-copy < ~/projects-snap.zfs
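For ongoing backups you usually don’t want to resend everything. With -i, you can send only the changes between two snapshots; this sketch assumes the destination dataset already received a full stream containing @snap1:
sudo zfs send -i mypool/projects@snap1 mypool/projects@snap2 > ~/projects-incr.zfs # incremental stream
sudo zfs receive mypool/projects-copy < ~/projects-incr.zfs # destination must already have @snap1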
Compress and encrypt:
zfs send mybook/testzone@20100719-1600 | gzip | openssl enc -aes-256-cbc -a -salt > /storage/temp/testzone.gz.ssl
openssl enc -d -aes-256-cbc -a -in /storage/temp/testzone.gz.ssl | gunzip | zfs receive mybook/testzone_new
Backup with SSH:
zfs send mybook/testzone@20100719-1600 | ssh testbox zfs receive sandbox/testzone@20100719-1600
ZFS Ditto Blocks
- more copies of data for additional redundancy
- copies are spread at least 1/8 of the disk apart on single-device pools, or placed on another device in multi-device pools
sudo zfs set copies=3 mypool/projects
ZFS Deduplication
If two blocks are duplicates, only one copy is stored and both references point to the same block.
- Trade-off - saves space, but uses more memory for the in-memory deduplication tables
- Approximately 320 bytes of memory are needed per deduplicated block.
- Write performance will decrease as this table grows.
Setting up deduplication is usually not worth it.
sudo zfs set dedup=on mypool/projects
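If you do enable it, you can check how much space deduplication is actually saving via the read-only dedupratio pool property:
sudo zpool get dedupratio mypool # 1.00x means no savings
sudo zpool list # the DEDUP column shows the same ratio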
Pool Scrubbing
Scrubbing a pool will run an integrity check on everything in the pool. ZFS checks integrity when data is read. If you aren’t reading your data you won’t be able to detect bit rot. This is why you need to run a scrub on a regular basis.
How often should a ZFS pool be scrubbed? The general rule of thumb is that you should run a scrub once a month at the very minimum. Scrubbing your pools weekly is much better.
sudo zpool scrub mypool
sudo zpool status -v mypool
TIP - Run a scrub as a cronjob.
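A minimal sketch of such a cron job, assuming root’s crontab and that zpool lives at /sbin/zpool on your system:
# run a scrub on mypool every Sunday at 2am
0 2 * * 0 /sbin/zpool scrub mypool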
Testing
dd if=/dev/urandom of=/mypool/random.dat bs=1M count=4096 # populate test data
md5sum /mypool/random.dat
sudo dd if=/dev/zero of=/dev/sde bs=1M count=8192 # simulate a failure
sudo zpool scrub mypool # check for issues
sudo zpool status
sudo zpool detach mypool /dev/sde # remove disk
sudo zpool attach -f mypool /dev/sdf /dev/sde # add it back ( to the same vdev that has sdf )
sudo zpool scrub mypool # scrub again
Reusing Disks
Use zpool labelclear after a zpool destroy if you plan to reuse the disks.
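A quick sketch, assuming sdb belonged to the destroyed pool (add -f if ZFS refuses because it still sees an active label):
sudo zpool labelclear /dev/sdb # wipe the old ZFS label from the disk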
ZFS RAID Levels
ZFS supports several different options for RAID levels. Each of these has advantages and disadvantages.
RAID Z
This is similar to RAID 5. It is different in that it uses both dynamic stripe width and copy-on-write functionality to avoid the write hole. There are three different types of RAID Z: RAID Z1, Z2, and Z3. RAID Z1 is similar to RAID 5, Z2 is similar to RAID 6, and Z3 is similar to triple-parity RAID (sometimes called RAID 7).
RAID Levels
- RAID 0 - striped
- can’t remove vdevs without destroying the pool
- RAID 1 - mirror
- RAID-Z1
- 3 disk min, only 1 disk can die
- can’t attach to vdev but can add more vdevs to the pool
- RAID-Z2
- 4 disk min, only 2 disks can die
- RAID-Z3
- 5 disk min, only 3 disks can die
- RAID 10
- 4 disk min
- achieved by striping across 2 mirrors
Things You Should Know
Considerations
- ZFS performs poorly with less than 2 GB of RAM.
- WARNING - You still need backups. ZFS and other RAID systems are not a substitute for regular backups. The possibility of an entire Zpool being wiped out is a very real threat. If garbage is written to the filesystem, it can be written to all mirrored disks. A faulty disk controller, corrupted metadata, user error, or malware can all destroy your filesystems.
- Error-correcting (ECC) RAM is recommended.
- The data deduplication feature consumes a lot of memory; use compression instead.
- COW (copy on write) - ZFS never overwrites data in place; modified data is written to a new block and the pointers are updated. Snapshots just keep references to the old blocks, which makes them very cheap and practical.
Terms
- Dataset - filesystem. These are created inside a Zpool.
- vdevs (virtual devices) - groupings of storage providers into various RAID configurations. These are usually disks, partitions, or files.
- Storage providers - spinning disks or SSDs.
- Zpools - aggregation of vdevs into a single storage pool.
History
Really, really brief summary:
ZFS was originally created at Sun Microsystems, which made it an open source project. Oracle bought Sun Microsystems and stopped releasing its changes to the code as open source. There is now a project called OpenZFS, and this is what most projects are using today.