ZFS Part 2: Implementing Zpool/ZFS Configuration and ZFS Features

OK – we’re good to go for our final ZFS configuration. Recall from earlier that I will be configuring a two-disk RAID1 set, with an extra disk for hot-spare use, and the final disk to play with for backups, encryption, dedup, etc.

It was that easy:
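As a rough sketch, a mirrored pool with a hot spare could be created along these lines. The pool name datapool is the one used throughout this post. The three device names are placeholders of mine, not the disks used here, and the run helper is only an illustrative safety switch that prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: DISK1, DISK2 and SPARE are placeholder device names.
# Substitute the real devices for your system before running.
POOL=datapool
DISK1=disk1
DISK2=disk2
SPARE=disk3

# Illustrative helper (not a ZFS feature): print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# "mirror" groups the two disks into a RAID1-style vdev; "spare" adds the hot spare.
run zpool create "$POOL" mirror "$DISK1" "$DISK2" spare "$SPARE"
```

Running the script with APPLY=1 (as root) would execute the command for real. zpool create also accepts -n, which shows the resulting layout without creating anything.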

If we query the pool, all required elements (mirroring, hot spare) are in place:
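A minimal sketch of that query, assuming the pool is named datapool as above; the run helper is an illustrative wrapper of mine that prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
POOL=datapool

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# In the real output, look for a mirror vdev holding two disks
# and a separate "spares" section listing the hot spare.
run zpool status "$POOL"
```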

The default ZFS dataset will also have been created:

For the root dataset mount, that mountpoint will be fine. Each new ZFS dataset created will have a more appropriate mountpoint set.

So – let’s assume we need to install some software, and require a /u01 filesystem on which to install (a common sight if you’ve worked around Oracle software long enough).
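A sketch of creating such a dataset, assuming the name datapool/u01 used in the rest of this post. The run helper is an illustrative wrapper that only prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
POOL=datapool

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# Create a child dataset under the pool's root dataset.
# By default it is mounted beneath the parent's mountpoint.
run zfs create "$POOL/u01"
```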

That simple command has created the new dataset for us:

Next, let’s move the mountpoint to somewhere more sensible, namely /u01:
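A sketch of the mountpoint change, assuming the dataset is datapool/u01. The run helper is illustrative and prints rather than executes unless APPLY=1 is set.

```shell
#!/bin/sh
DATASET=datapool/u01

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# Changing the mountpoint property remounts the dataset at the new path.
run zfs set mountpoint=/u01 "$DATASET"

# Confirm the property took effect.
run zfs get mountpoint "$DATASET"
```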

Again, another simple task; it even created the non-existent mountpoint for us:

This dataset has no fixed size of its own: it draws on the free space of its parent, datapool, so the space available to it matches the pool’s until we start actually using it (or create and use other datasets).

Let’s turn on deduplication for the parent, so that it will be inherited by any other datasets configured within datapool. I won’t turn on encryption or compression here; I will control those at a finer-grained, per-dataset level.
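A sketch of setting the property on the pool's root dataset, assuming the pool name datapool. The run helper is an illustrative wrapper that prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
POOL=datapool

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# dedup is a dataset property; setting it on the root dataset
# makes it the inherited value for descendants.
run zfs set dedup=on "$POOL"
```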

Then we verify:
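One way to check, sketched under the same assumption that the pool is called datapool (the run helper is illustrative and prints the command unless APPLY=1 is set):

```shell
#!/bin/sh
POOL=datapool

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# Shows NAME, PROPERTY, VALUE and SOURCE; SOURCE should read "local"
# for a property set directly on this dataset.
run zfs get dedup "$POOL"
```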

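Because dedup is an inheritable property, datapool/u01 should already pick it up from datapool even though it was created before we set dedup=on. Setting it explicitly on that dataset as well does no harm and simply records it as a local value: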
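A sketch of that step, followed by a recursive check of where each dataset gets its value from. The dataset names come from this post; the run helper is an illustrative wrapper that prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
POOL=datapool

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# Set the property directly on the child dataset.
run zfs set dedup=on "$POOL/u01"

# -r walks all descendants; the source column shows "local" or
# "inherited from <dataset>" for each one.
run zfs get -r -o name,value,source dedup "$POOL"
```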

Let’s start creating some more interesting child datasets.

ZFS Features

The first feature I wanted to try out was deduplication. Essentially – if a block is duplicated numerous times across a pool, it will be deduplicated (i.e. its duplicates removed) thus improving storage utilisation.

I set dedup=on on datapool so any new dataset created will inherit that property from its parent. Therefore:
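A sketch of creating a fresh dataset to experiment with. The name datapool/deduptest is a hypothetical one chosen for illustration, and the run helper only prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Hypothetical dataset name, chosen for illustration only.
DATASET=datapool/deduptest

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# No dedup option is passed: the new dataset inherits it from datapool.
run zfs create "$DATASET"
```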

Verify as always:

You even get a nice little note that the value for this property has been inherited from the parent ZFS dataset – datapool.

If we check via du the “real” values are reported:
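A sketch of how such a check might look, assuming three copies of /usr/sbin are placed under a dataset mounted at /datapool/deduptest. That mountpoint and the copy names are hypothetical, and the run helper prints rather than executes unless APPLY=1 is set.

```shell
#!/bin/sh
# Hypothetical mountpoint of the test dataset.
TARGET=/datapool/deduptest

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# Three identical copies give deduplication plenty of repeated blocks.
for n in 1 2 3; do
    run cp -r /usr/sbin "$TARGET/sbin_copy$n"
done

# du counts every copy in full; deduplication savings are accounted
# for at the pool level, so they do not show up here.
run du -sh "$TARGET"
```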

A zpool list, however, has something else to say:
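A sketch of the pool-level view, assuming the pool name datapool (the run helper is an illustrative wrapper that prints the command unless APPLY=1 is set):

```shell
#!/bin/sh
POOL=datapool

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# The ALLOC and DEDUP columns show physical allocation and the dedup ratio.
run zpool list "$POOL"

# The ratio is also exposed as a read-only pool property.
run zpool get dedupratio "$POOL"
```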

You can see that only 37.1M is actually allocated, thanks to a deduplication factor of 7.37x. Between our three copies of /usr/sbin, the system was able to deduplicate by quite a large factor, saving us a couple of hundred meg. Pretty cool. You can also get a simulated deduplication histogram for a pool (with dedup=on or dedup=off) using zdb:
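A sketch of the zdb call, plus a quick sanity check of the numbers quoted above. The arithmetic is a rough approximation that treats logical data as allocated space multiplied by the dedup ratio and ignores metadata. The run helper is illustrative and prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
POOL=datapool
ALLOC_MB=37.1   # allocated space reported by zpool list
RATIO=7.37      # dedup ratio reported by zpool list

# Illustrative helper: print the command unless APPLY=1 is set.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

# -S simulates deduplication across the pool and prints a histogram.
run zdb -S "$POOL"

# Rough estimate: logical data = allocated * ratio; saved = logical - allocated.
LOGICAL_MB=$(LC_ALL=C awk -v a="$ALLOC_MB" -v r="$RATIO" 'BEGIN { printf "%.1f", a * r }')
SAVED_MB=$(LC_ALL=C awk -v a="$ALLOC_MB" -v r="$RATIO" 'BEGIN { printf "%.1f", a * r - a }')
echo "approx. logical data: ${LOGICAL_MB}M, approx. saved: ${SAVED_MB}M"
```

The estimate lands in the same range as the "couple of hundred meg" mentioned above.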

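Deduplication is not free: the pool has to checksum every block and keep a table of those checksums, which costs CPU and memory. For all but the most demanding scenarios, where that overhead would strain system resources, setting dedup=on on a dataset is normally a good idea.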
