ZFS: Deleting files doesn’t free up space

So I have a Proxmox server on which I run a few VMs, and the other day it completely ran out of space because of overprovisioning through thin volumes.

After much head scratching and metaphorically banging my head against a wall, here are the things I learnt.

Empty Trash

Local Trash

Make sure that you have emptied the trash on the VMs. Ubuntu has this issue, and so might other distributions.

Network Trash

If you have Samba enabled on your VMs, make sure that the Recycle Bin feature is not enabled. I have openmediavault running on a VM and had to go through and disable the Recycle Bin on each share. Also make sure that the existing recycle bins are emptied; they are hidden folders in the root of your shares.
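If you're not sure where those hidden folders are, something like this should hunt them down and empty them – just a sketch, with /srv/shares standing in for wherever your shares actually live, and .recycle being the usual default folder name for Samba's recycle module:

$ find /srv/shares -maxdepth 2 -type d -name '.recycle'
$ rm -rf /srv/shares/myshare/.recycle/*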

Correct Driver & Settings

  • When setting up the hard drive for your VM, make sure you use virtio-scsi (or just SCSI on the web interface).
    • If your disk is already set up using IDE or VirtIO:
      • Delete it. Don’t worry, this only removes the link; the disk itself will still show up in the interface afterwards.
      • Double-click the unattached disk and select SCSI and Discard.
      • You might have to fix the references to the drive in the OS.
  • On the device screen, make sure Discard is selected (this can also be done from the command line, as sketched below).
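If you prefer doing this from the Proxmox command line, something along these lines should work – just a sketch, with the VM ID, storage name (local-zfs) and disk volume as placeholders for your own setup:

# qm set 101 --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-101-disk-1,discard=on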

TRIM

Configure the OS to send TRIM commands to the drive

Linux

On Mount

You can add the discard option to any mount point and the filesystem will send the appropriate TRIM commands to the disk as files are deleted. HOWEVER, this continuous discard apparently comes with a big performance hit.
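For reference, a hypothetical /etc/fstab entry with the discard option would look something like this – the device and filesystem here are placeholders:

/dev/sda1  /  ext4  defaults,discard  0  1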

To trim manually instead, run

$ fstrim /

Or, to run fstrim on all supported mounted filesystems,

$ fstrim -a

Digital Ocean has a detailed post about setting up TRIM and putting it on a schedule.
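If you just want something quick and dirty, a weekly entry in root's crontab along these lines should do the job (newer distributions may already ship a systemd fstrim.timer that does the same thing):

@weekly /sbin/fstrim -a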

Windows

My deepest apologies! I don’t run Windows on any of my VMs, so I have no experience with it.

Understanding ZFS Disk Utilisation and available space

I am hopeful the following will help someone scratch their head a little less in trying to understand the info returned by zfs.

I set up a pool using four 2TB SATA disks.

$ zpool list -v
NAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool     7.25T  2.50T  4.75T         -    10%    34%  1.00x  ONLINE  -
  raidz2  7.25T  2.50T  4.75T         -    10%    34%
    sda2      -      -      -         -      -      -
    sdb2      -      -      -         -      -      -
    sdc2      -      -      -         -      -      -
    sdd2      -      -      -         -      -      -

The total size displayed here is the raw size of the four disks. The maths works out as 4 × 2TB = 8TB ≈ 7.25TiB.

RAIDZ2 is similar to RAID6 and uses two disks’ worth of space for parity. Thus, I would expect to have ~4TB, or 3.63TiB, of usable space. I haven’t been able to find this number displayed anywhere.

However, you can find the amount of disk space still available using the following command.

# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool                     1.21T  2.19T   140K  /rpool
rpool/ROOT                46.5G  2.19T   140K  /rpool/ROOT
rpool/ROOT/pve-1          46.5G  2.19T  46.5G  /
rpool/data                1.16T  2.19T   140K  /rpool/data
rpool/data/vm-100-disk-1   593M  2.19T   593M  -
rpool/data/vm-101-disk-1  87.1G  2.19T  87.1G  -
rpool/data/vm-102-disk-1  71.2G  2.19T  71.2G  -
rpool/data/vm-103-disk-1  2.26G  2.19T  2.26G  -
rpool/data/vm-103-disk-2  13.2M  2.19T  13.2M  -
rpool/data/vm-103-disk-3  13.2M  2.19T  13.2M  -
rpool/data/vm-103-disk-4    93K  2.19T    93K  -
rpool/data/vm-103-disk-5  1015G  2.19T  1015G  -
rpool/data/vm-104-disk-1  4.73G  2.19T  4.73G  -
rpool/data/vm-105-disk-1  4.16G  2.19T  4.16G  -
rpool/swap                8.66G  2.19T  8.66G  -

The value of 2.19T is the amount of unallocated space available in the pool. To verify this, you can run

# zfs get all rpool
NAME   PROPERTY   VALUE                  SOURCE
rpool  type       filesystem             -
rpool  creation   Fri Aug  4 20:39 2017  -
rpool  used       1.21T                  -
rpool  available  2.19T                  -

...

If we add the two numbers here, 1.21T + 2.19T = 3.4T.

Roughly 5% of the pool is reserved by ZFS, so 3.63TiB × 0.95 ≈ 3.45T, which matches the ~3.4T above once you allow for rounding.
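If you want to do the sums with exact byte counts rather than the rounded human-readable values, ZFS can print parseable numbers – a quick sketch:

# zfs get -H -p -o value used,available rpool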

Et voilà.

Restricting Linux Logins to Specified Group

If you have Linux boxes that authenticate over LDAP but want logins on specific boxes restricted to a particular group, there is a simple way to achieve this.

Firstly, create a new file called /etc/group.login.allow (it can be called anything – you just need to update the PAM line below to reflect the name).

In this file, pop in all the groups whose members should be able to log in:

admin
group1
group2

Edit /etc/pam.d/common-auth (on Ubuntu; on other distributions it might be called /etc/pam.d/system-auth or something very similar). At the top of the file (or at least above the other auth entries), add the following line:

auth required pam_listfile.so onerr=fail item=group sense=allow file=/etc/group.login.allow
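Before rolling this out, it's worth double-checking that your own account is actually in one of the allowed groups, otherwise you can lock yourself out. A quick sanity check (the username and group here are just examples):

$ id -nG myuser
$ getent group admin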

For the record, I found this little tidbit over at the CentOS forums.

Looping from the bash command line [1113]

I figured this out the other day out of idle curiosity. There is occasionally the need for a never-ending loop executed directly from the bash command line instead of writing a script.

I used this to run sl (yes sl, not ls – try it – I love it) repeatedly.

$ while true; do <command>; done

For example:

$ while true; do sl; done

Bear in mind that this loop is infinite and, at least with sl (which ignores Ctrl+C), there is no way to cancel out of it short of killing off the terminal.
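If you're looping over something less whimsical, dropping a sleep into the loop stops it from hammering the CPU – for instance, printing the time every five seconds:

$ while true; do date; sleep 5; done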

Expanding glusterfs volumes [1112]

Once you have set up a glusterfs volume, you might want to expand the volume to add storage. This is an astoundingly easy task.

The first thing that you’ll want to do is add in bricks. Bricks are similar to physical volumes a la LVM. The thing to bear in mind is that, depending on what type of volume you have (replicated / striped), you will need to add a certain number of bricks at a time.

Once you have initialised the nodes, you can add a set of bricks with the following command, which adds two more bricks to a volume that keeps two replicas.

$ gluster volume add-brick testvol cserver3:/gdata cserver4:/gdata

Once you have done this, you will need to rebalance the volume, which involves redistributing the files across all the bricks. There are two steps to this process: the “fixing” of the layout changes and the rebalancing of the data itself. You can perform both tasks together, as sketched below.
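As a rough sketch of those two steps (the fix-layout variant only adjusts the layout, a plain start fixes the layout and moves the data, and status lets you watch progress – check the docs for your gluster version):

$ gluster volume rebalance testvol fix-layout start
$ gluster volume rebalance testvol start
$ gluster volume rebalance testvol status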


My Thoughts on OCFS2 / Understanding OCFS2 [1110]

As mentioned earlier, we have been considering networked filesystems other than NFS to introduce into a number of complex environments. OCFS2 was one of the first candidates.

In fact, we also considered GFS2, but looking around on the net there seemed to be a general consensus recommending OCFS2 over GFS2.

Ubuntu makes it pretty easy to install and manage ocfs2 clusters. You just need to install ocfs2-tools and ocfs2console. You can then use the console to manage the cluster.
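On Ubuntu that boils down to something like the following (package names as above; sudo assumed):

$ sudo apt-get install ocfs2-tools ocfs2console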

What I totally missed in all of my research, owing to a lack of in-depth knowledge of clustered filesystems, was that OCFS2 (and GFS2, for that matter) are shared-disk filesystems.

What does this mean?


Exporting X11 to Windows [1109]

Playing Skyrim over the last week, I sometimes missed Linux so terribly that I wanted a piece of it, and not just the command-line version. I wanted X Windows on my Windows 7.

There has been a solution for this for several years; the first time I did it, I installed Cygwin with X11, but there is a far simpler way to accomplish it.

Install Xming. I then used PuTTY, which has an option to forward X11. Once logged in, running xeyes shows the window exported onto my Windows 7 desktop. Ah... so much better.

I actually used this to run terminator to connect to a number of servers. Over local LAN, the windows didn’t have any perceptible lag or delay. It was more or less like running it locally.

It is possible to set up shortcuts to run an application through PuTTY and have it exported to your desktop. I haven’t played with this enough to comment, though there is a rough sketch below.
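If you want to experiment with that, PuTTY's command-line companion plink can forward X11 as well, so a Windows shortcut along these lines should work – purely a sketch, with the user, host and terminator as placeholders, and Xming already running:

plink.exe -X user@myserver terminator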

This of course only works because I have another box running Linux. If that is not the case for you, you might want to try VirtualBox, but since the Linux kernel developers have described its kernel modules as tainted crap, you might want to consider VMware instead, which is an excellent product.