Files management


Cheatsheet for preparing a new disk and using it with:

  • LVM (Logical Volume Manager)
    • Principle: physical volumes (hardware) are merged into volume groups, and then split back into logical volumes, which can be used as a partition (or as a disk with a partition table, but not very useful?).
    • Advantages: flexible/resizable, snapshots, raid…
    • Cons: only supported in Linux, additional layer of complexity
    • Cons for removable media: volume groups are automatically activated when the device is plugged, and need to be manually deactivated before removing it, even if we haven't mounted any volume. Issues with sleep ?
  • LUKS
  • Filesystems (ext4, btrfs, …)
  • Subvolumes


This section presents how to use each tool, in a given order, but depending on the use case you may want to skip some layers, or apply them in a different order. In particular:

  • You may not use LVM
  • You may want to perform LVM over LUKS instead of LUKS over LVM, if you prefer having a single password and unlock for all volumes.
  • But you DO want to use LUKS (seriously, always encrypt your disks).


  • If you want to use LVM and don't need to boot on the disk, there is not much to do in this section: just remove all the existing partitions with gparted, then go directly to the next section to create the LVM volumes directly on the raw device.
  • Otherwise, create a gpt partition table with gparted
  • If you need to boot on the disk, with gparted:
    • create the EFI partition: set size around 200-400MB, format it to fat32, set boot and esp flags
    • create the boot partition: set size around 300-500MB, format it to ext2
    • mount everything: root partition to /mnt, boot partition to /mnt/boot, efi partition to /mnt/boot/efi, then mount --bind /dev /mnt/dev (and the same for /proc and /sys), and mount -t efivarfs efivarfs /mnt/sys/firmware/efi/efivars
    • chroot to the new root: chroot /mnt
    • install grub: grub-install --root-directory=/ --boot-directory=/boot --efi-directory=/boot/efi --bootloader-id=<os-name>
    • you can check the EFI install with efibootmgr -v (and remove an entry with efibootmgr -b <0005> -B)
    • grub-mkconfig -o /boot/grub/grub.cfg
  • If you want to use LVM, create a single large partition with the remaining space with gparted to create the LVM volumes on this partition.
  • Otherwise create the required system and data partitions with gparted
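The gparted steps above can also be done from the command line with parted. A sketch for a bootable disk, where /dev/sdX and the partition sizes are placeholders to adapt (this destroys any existing data on the disk):

```shell
# Placeholder device /dev/sdX -- verify with lsblk first, this wipes the disk
parted -s /dev/sdX mklabel gpt
parted -s /dev/sdX mkpart ESP fat32 1MiB 301MiB    # ~300MB EFI partition
parted -s /dev/sdX set 1 esp on                    # boot/esp flags
parted -s /dev/sdX mkpart boot ext2 301MiB 801MiB  # ~500MB boot partition
parted -s /dev/sdX mkpart lvm 801MiB 100%          # remaining space, e.g. for LVM
mkfs.fat -F 32 /dev/sdX1
mkfs.ext2 /dev/sdX2
```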


  • create physical volume: pvcreate <device-name> (device can be the whole device if not a boot device, or a partition).
    • Check with pvdisplay or pvs.
    • if it complains with the error Cannot use <device-name>: device is partitioned, you need to remove existing traces of partition table or filesystem with the command wipefs --all <device-name>
  • create volume group: vgcreate <vgroup-name!> <device-name>.
    • Check with vgdisplay or vgs.
  • create logical volume: lvcreate -n <lvolume-name!> [-L <absolute-size>] [-l <relative-size>] <vgroup-name>.
    • Check with lvdisplay or lvs
    • <absolute-size>: 200G, 3T, …
    • <relative-size>: +100%FREE
  • If using an SSD drive, TRIM commands from the layers below (eg filesystem) will be transparently forwarded without any special configuration. However if you wish that LVM issues its own TRIM commands when some space is not allocated by LVM, you can set the issue_discards option to 1 in /etc/lvm/lvm.conf.
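Putting the LVM steps together, a minimal sketch with placeholder names (vg0, data, rest) and a placeholder partition /dev/sdX3:

```shell
pvcreate /dev/sdX3                   # or the whole device if not a boot disk
vgcreate vg0 /dev/sdX3
lvcreate -n data -L 200G vg0         # absolute size
lvcreate -n rest -l +100%FREE vg0    # remaining space
pvs; vgs; lvs                        # check each layer
```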


  • Encrypt the volume/partition/device: cryptsetup luksFormat -c aes-xts-plain64 -h sha256 -s 512 <volume-name>
    • <volume-name> is the device or partition name if not using LVM, or /dev/mapper/<vgroup-name>-<lvolume-name> if using it.
    • Choose a strong passphrase as it can be brute-forced (at least 80 bits of entropy)
    • By default it will configure the key derivation to take about 2 seconds
  • Open (decrypt) the volume: cryptsetup luksOpen <volume-name> <evolume-name!>
  • If using an SSD drive, you probably should enable TRIM-forwarding: cryptsetup --allow-discards --persistent refresh <evolume-name> (check the security implications though). Check with cryptsetup luksDump <volume-name>. If you enabled it by mistake (for instance on a non-SSD), you can disable it with cryptsetup --persistent refresh <evolume-name> (it resets the flags).
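The LUKS steps above, chained on an LVM volume (vg0/data and edata are placeholder names):

```shell
cryptsetup luksFormat -c aes-xts-plain64 -h sha256 -s 512 /dev/mapper/vg0-data
cryptsetup luksOpen /dev/mapper/vg0-data edata
# SSD only: persistently forward TRIM (check the security implications first)
cryptsetup --allow-discards --persistent refresh edata
cryptsetup luksDump /dev/mapper/vg0-data   # verify flags and key slots
```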


  • Choose the filesystem:
    • ext4 for a default robust journaled filesystem
    • btrfs: modern filesystem based on copy-on-write (instead of journal), offering more features (integrity checks, subvolumes, snapshots / deduplication, compression, encryption, RAID, …), but also some drawbacks (slightly less stable, requires more resources, wasted allocated data, …). It is a good choice for a work volume (because integrity checks and snapshots are really useful), but less obvious for backup volumes (because native deduplication is less performant with moved files and modified files, especially if not using btrfs-send, so it is more efficient to use a deduplicating backup software, which will also handle integrity checks).
    • zfs: similar to btrfs, using both copy-on-write and a journal (for improved performance with synchronous writes), more mature and slightly more stable, but not included in the kernel due to licensing (though easy to use).
  • Create the filesystem: mkfs.<fs> <evolume-name>
  • Tune the filesystem:
    • With ext4, if not the system partition, you can remove the 5% reserved for root: tune2fs -r 0 /dev/mapper/<evolume-name>
  • Mount the filesystem: mount /dev/mapper/<evolume-name> /mnt/<evolume-name>
  • Some filesystem tuning must be done after mount:
    • With btrfs, you can enable compression: btrfs property set <fs-root> compression <algo> with <algo> equal to lzo (fastest) or zstd (compromise). Note that this syntax does not support configuring levels, nor forcing compression to disable heuristics. For that you have to use instead a mount option in the previous step: compress=zstd:1 (default is :3) or compress-force=lzo.
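For example, creating and mounting a btrfs filesystem with compression on the volume opened above (edata and the mount point are placeholders):

```shell
mkfs.btrfs /dev/mapper/edata
mkdir -p /mnt/edata
# mount-time compression option: zstd level 1 (default level is :3)
mount -o compress=zstd:1 /dev/mapper/edata /mnt/edata
```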


Some filesystems such as BTRFS and ZFS allow creating subvolumes.

  • BTRFS:
    • The filesystem root is a subvolume
    • You can create other subvolumes: btrfs subvolume create <subvolume-path> (directory must not exist, -p for creating parents).
      • Check with btrfs subvolume list <fs-root> and btrfs subvolume show <subvolume-path>



  • cryptsetup luksOpen <volume-name> <evolume-name>
  • mount /dev/mapper/<evolume-name> /mnt/<evolume-name>


  • umount /dev/mapper/<evolume-name>
  • cryptsetup luksClose <evolume-name>
  • vgchange -an <vgroup-name> (if using LVM and the disk will be removed)


  • BTRFS snapshots:
    • A snapshot is a deduplicated copy of a subvolume, using the CoW (Copy-on-Write) mechanism.
    • They are useful for storing a history with deduplication, but also to freeze the subvolume before making a copy
    • They can be stored inside the subvolume (because they are subvolumes themselves, and snapshots are not recursive)
    • Create a read-only snapshot: btrfs subvolume snapshot -r <input-subvolume-path> <output-snapshot-path> (<output-snapshot-path> is typically <input-subvolume-path>/.snapshots/<date>) (just remove -r for a read-write snapshot).
    • Delete a snapshot: btrfs subvolume delete <snapshot-path>
    • Analyze snapshot disk usage: btrfs quota enable <subvolume> and then btrfs qgroup show <subvolume>
  • ZFS snapshots:
    • They can be recursive.
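A typical BTRFS snapshot round-trip (paths are placeholders following the <subvolume>/.snapshots/<date> convention mentioned above):

```shell
mkdir -p /mnt/edata/work/.snapshots
btrfs subvolume snapshot -r /mnt/edata/work /mnt/edata/work/.snapshots/2024-07-05
btrfs subvolume list /mnt/edata                               # check it exists
btrfs subvolume delete /mnt/edata/work/.snapshots/2024-07-05  # drop it when obsolete
```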


Renaming / Updating

  • vgrename <old-name> <new-name>
  • lvrename <group> <old-name> <new-name>
  • cryptsetup luksChangeKey


  • pvck <device>
  • btrfs check <mount-point> to verify the structural integrity of the filesystem
  • btrfs scrub <mount-point> to verify the data integrity


  • cryptsetup luksHeaderBackup /dev/DEVICE --header-backup-file /path/to/backupfile
  • vgcfgbackup -f /path/to/backup/file vg01

Disk usage

  • btrfs filesystem usage <subvolume> (option -g to display GB only).


  • With btrfs:
    • btrfs filesystem defrag for defragmenting files.
      • can also be used to change compression of existing files (but breaks deduplication) with option -czstd (inherits level specified at mount).
    • btrfs balance start -dusage=<percentage> <mount-point> for defragmenting free space (only data chunks less full than <percentage> will be compacted).
  • filefrag -v <file> to analyze the fragmentation of a file, and list all extents.


  • With btrfs:
    • compsize <subvolume-path> in order to get statistics about quantity of compressed files, and compression ratio.
    • compsize <file-path> in order to get compression details about a specific file.


  • The TRIM (or discard) operation means informing the SSD drive about unused blocks, so that it can perform wear leveling efficiently.
  • Checking TRIM support: run lsblk --discard, and check for non-zero values in the columns DISC-GRAN (DISCard GRANularity) and DISC-MAX (DISCard MAX bytes).
  • Warning: make sure that your device supports TRIM before using it, or data loss can occur.
  • Each layer must forward the TRIM commands until they reach the drive. If you haven't done it persistently for LUKS as suggested in the create section, you can open it with this option: cryptsetup <…> --allow-discards
  • Then two options are available to enable it:
    • Continuous TRIM, i.e. configuring the filesystem to notify instantly each block that is freed.
      • It is not advised because doing it too often can reduce the lifetime of poor quality SSDs.
    • Periodic TRIM, i.e. explicitly notifying the free blocks periodically.
      • Using the fstrim utility from the util-linux package.
      • Manually: run fstrim --verbose <mount-point> for a single volume, or fstrim --verbose -A for all mounted filesystems listed in /etc/fstab and the root filesystem inferred from the kernel command line.
      • Weekly: enable the timer: systemctl enable --now fstrim.timer




  • Resize the LVM logical volume: lvresize -L <absolute-size> <lvolume-name>
    • <absolute-size> can also be an increment, e.g. +50G
  • Open the volume with LUKS: cryptsetup luksOpen <lvolume-name> <evolume-name>
  • Resize the filesystem:
    • ext4: e2fsck -f <evolume-name> ; resize2fs <evolume-name>
    • btrfs: mount the filesystem then btrfs filesystem resize max /mnt/<evolume-name>
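Putting the resize steps together for ext4 over LUKS over LVM (vg0/data and edata are placeholder names):

```shell
lvresize -L +50G vg0/data                         # grow the logical volume
cryptsetup luksOpen /dev/mapper/vg0-data edata    # (re)opening maps the new size
# if the volume was already open, grow the mapping instead:
# cryptsetup resize edata
e2fsck -f /dev/mapper/edata
resize2fs /dev/mapper/edata                       # ext4 grows to fill the device
```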




Everyone has personal data that nothing could recreate (pictures, emails, creations, …), or global data and configuration that it would take a lot of time to recreate. However you can lose some of them or all of them in several situations: hard drive crash, hard drive corruption, computer theft, computer destruction (fire…).

My advice:

  • partition your hard drive to have a separate partition for system and data
  • put important application data on the data partition (configuration, emails, …)
  • do a full mirror backup of the data partition regularly (eg with rsync or a deduplicating software such as Attic) on an external hard drive or a network drive. Try to keep at least one copy somewhere other than your home (network drive, or one at home and one at work).
  • take precautions to improve your odds in case of problem: make copies of your disks' MBR (output of fdisk's p command), of your encrypted partition headers, etc.



Borg Backup

  • Create a Borg repository in the current folder:
    borg init -e <encryption> [--append-only] .
    • <encryption> can be:
      • none to disable it, for instance on an already encrypted volume
      • repokey to enable it (or repokey-blake2 to use Blake2 instead of Sha256, which is often faster)
      • authenticated to disable encryption but still enable authentication (or authenticated-blake2 to use Blake2)
    • --append-only means that no data can be removed with borg; archives can only be added. It can be used to protect an online repository against malware.
  • Create archives:
    borg create <repo>::<archive!> <path> --stats --progress
        --compression auto,zstd,12 --chunker-params 15,23,19,4095 --noctime -x --exclude-caches
    • --compression: it can make sense to adjust the compression level depending on your computer speed and your storage speed, so that compression does not slow down the backup while still saving as much space as possible under this constraint. However it is not always easy to find a universal value (data that compresses very well is mostly limited by the input storage speed, while data that compresses less well is mostly limited by the output storage speed). You roughly have the choice between LZ4 (very quick), LZMA (very high compression ratio), and ZSTD (wide-range) in between.
    • --chunker-params: this is also an important but somewhat complicated tuning. The original default value created small chunks, causing huge cache and memory usage, so they switched to much larger chunks, which can however be too large for some applications (for instance when modifying only the metadata of an image file, we want the data to be deduplicated), so I came up with this compromise: 15,23,19,4095.
  • borg info <repo>
  • borg list <repo>
  • borg info <repo>::<archive>
  • borg diff <repo>::<archive1> <archive2>
  • borg mount <repo>::<archive> <mountpoint>
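An end-to-end Borg session with the options discussed above (repository path, archive name, and source path are placeholders):

```shell
mkdir -p /backup/repo
borg init -e repokey-blake2 /backup/repo
borg create /backup/repo::home-2024-07-05 /home/user \
    --stats --progress --compression auto,zstd,12 \
    --chunker-params 15,23,19,4095 --noctime -x --exclude-caches
borg list /backup/repo
borg mount /backup/repo::home-2024-07-05 /mnt/restore   # browse, then borg umount /mnt/restore
```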


  • Create a Restic repository in the current folder:
    restic init --repo .
    • Note that encryption and a password are mandatory. However you can store the password in a file in the repository, or use a password file with --password-file.
  • Create snapshots:
    restic --repo <repo> --verbose --compression auto --ignore-ctime backup <path>
    • The chunker cannot be configured, contrary to Borg. It is equivalent to 19,23,21,512, similar to Borg's default 19,23,21,4095, but not to my chosen values.
    • --compression: unlike Borg, the only choices are auto, max and off
  • restic --repo <repo> snapshots to list snapshots
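The equivalent Restic session (repository and source paths are placeholders; the password is prompted interactively or read with --password-file):

```shell
restic init --repo /backup/restic
restic --repo /backup/restic --verbose --compression auto --ignore-ctime backup /home/user
restic --repo /backup/restic snapshots
restic --repo /backup/restic mount /mnt/restore   # FUSE browse of all snapshots
```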

BTRFS snapshots

The BTRFS filesystem allows performing some sorts of backups:

  • On the work disk, regularly creating snapshots allows keeping a history, for recovery in case of a bad manual operation
  • It is also useful in order to “freeze” the content, so that the backup with another tool saves a consistent copy
  • On a backup disk, snapshots can also be used to keep a history.
  • If you update the backup with rsync for instance, then moved files and modified files will not be deduplicated (because they are sent again by rsync and won't be recognized).
  • However if you moved or modified files on a btrfs filesystem, you can send the increment between two snapshots: btrfs send -p <parent-src-snapshot> <src-snapshot> | pv | btrfs receive <target-snapshot> (<parent-src-snapshot> must have been sent already).
  • You can also deduplicate afterwards using offline tools for out-of-band deduplication (cf list)
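An incremental send between two dated snapshots (placeholder paths; the parent snapshot must already exist on the target side):

```shell
btrfs subvolume snapshot -r /mnt/work /mnt/work/.snapshots/2024-07-05
btrfs send -p /mnt/work/.snapshots/2024-07-04 /mnt/work/.snapshots/2024-07-05 \
    | pv | btrfs receive /mnt/backup/.snapshots
```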

Container files


Digital Will

The goal is threefold:

  • No unauthorized person can ever access your data
  • The trusted persons can access your data only under certain conditions (death, coma, …)
  • You are alerted when your data is accessed by the trusted persons (in case the access was not legitimate)

Different approaches:

  1. A service that gives your data to designated persons when they provide a death certificate for you (Wishbook, …)
    • Cons: a fake death certificate can be sent, does not work for coma, the service needs to remain available, price, incomplete control.
    • Variants: store it directly in a vault at the bank, or at the notary.
  2. A service that regularly checks that you are alive by requesting a connection with your private credentials (the period can be adapted to the situation), and gives out your data when you fail to do so, after warning emails (can be self-hosted).
    • Cons: there will be some delay between when you stop pinging and when your data becomes available, service needs to remain available, and if you host it yourself there is still a risk that your server crashes at the wrong time.
  3. A service that waits for a request to reveal your data with a personal password, sends you one or several emails to warn you that this request has been made, and in the absence of opposition from you in some delay (that can be adapted to the situation) sends your data (can be self-hosted).
    • Cons: there will be some delay between when you stop pinging and when your data becomes available, service needs to remain available, or if you host it yourself there is still a risk that your server crashes at the wrong time (but if the service is associated to your password manager for instance, then the availability is not a problem anymore…)
  4. Split the secret between several people (cf Shamir's secret, implemented for instance in ssss or libgfshare), so that X out of Y need to agree to obtain your data.
    • Cons: people need to remain accessible (and not lose the information), compromise between robustness and risk of conspiracy, what data can each person access
  5. Store your key on a piece of opaque paper (eg a visiting card), kept with you (eg in your smartphone), wrapped in a unique piece of paper (eg newspaper) that is glued shut. The goal is that it is impossible to read the key (eg through light) without removing the wrapping paper, impossible to remove it without tearing it apart, and impossible to replace it without you noticing quickly. This base key then has to be derived a large number of times, so that it takes several days on a classical computer to derive the final key, which the designated persons can then use to decrypt their personal message. In case of an unauthorized attempt, you can change the password. A better solution could be a proper TLP (Time Lock Puzzle) / TLE (Time Lock Encryption) / VDF (Verifiable Delay Function) / delayed encryption, which would have the big advantage of fast generation, but standard implementations do not seem to exist, even of the seminal Rivest & Shamir proposal.
    • Cons: need to actively monitor the integrity of your artifact, doesn't work if you have an accident that destroys the artifact, need to revoke the information in case of unauthorized attempt (thus only revokable information is protected, such as passwords protecting data for which it is impossible to make an unauthorized copy beforehand), need to revoke the information in case of upgrade of the derivation scheme in order to follow hardware improvements.
    • Variant: it seems possible to make scratch-off paint with dishwashing soap and gouache, so this can be an alternative way to hide the key while allowing detection of unauthorized access. Even if someone attempts to reconstruct the paint layer, if a complex color mix was chosen, with approximately random borders and even random color gradients, it would be very difficult to replicate accurately enough to go unnoticed (especially since the color changes while drying).
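For approach 4, a sketch with the ssss implementation mentioned above (the threshold, share count, and secret are illustrative):

```shell
# split a secret into 5 shares, any 3 of which can reconstruct it
echo -n 'my-master-passphrase' | ssss-split -t 3 -n 5 -q
# later, any 3 share holders run this and type their shares when prompted
ssss-combine -t 3
```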

Different methods could be combined, for instance 2 or 3 plus 4. But 3 managed by the password manager is probably unbeatable.

Ideally, for increased safety, the data to be obtained is always encrypted with a key that the designated persons possess.

What to transmit?

  • Passwords (master password of your password manager, computer, encrypted data partitions, phone, …)
  • Instructions about what data you have


  • Different levels of amount of information for your spouse, children, other family, friends?
  • How to transmit data (such as pictures) to a child? Probably has to go through a tutor.

File Systems

ext3, ext4

Reserved blocks

By default ext3 reserves 5% of disk space for the super-user. The intent is to let critical applications keep writing to the disk when it is full, but this has no use on a data partition: you just waste 5% of it.

You can check and remove these reserved blocks with the following commands:

tune2fs -l /dev/sda1 | grep Reserved
tune2fs -r 0 /dev/sda1


  • geeqie for images (fork of gqview)
  • fdupes
  • fslint

Secure delete

  • secure-delete (srm, sfill, sswap, smem)
    • -l option to be a lot faster: 2 passes instead of 38 (or -ll for only 1 pass), enough to prevent the use of consumer tools like photorec, but not for specialized companies and governments ;-)
  • shred (less advanced but more common)

Data recovery

First unmount your partition and remount it read-only.

  • extundelete --restore-file Documents/file.dat /dev/sda4 : the easiest solution if there are only a few files and you know their names. It also works without unmounting the partition, and generally works ok if you do it immediately after deleting the files.
  • testdisk (photorec) is great to recover files on a mobile storage device because it works with any filesystem (it finds signatures in the data, so no need for a journal), and finds all deleted files on the partition.
  • ext3grep
    ext3grep <partition> --restore-file <filename> # filename => file ; works great, but only for one file at a time...
    ext3grep <partition> --restore-all --deleted --after=1270639550 # dates -> files
    ext3grep <partition> --histogram=dtime --deleted --after=1270639000 --before=1270640000 # => dates
    ext3grep <partition> --ls --inode 2 # filenames => inodes (navigating in directories with inodes)
    ext3grep <partition> --search Libs/jafar/modules/ # filename,dates -> blocks
    ext3grep <partition> --restore-inode <inode> # inodes => files

    Notes: “restore-all” failed while building the stage2 cache with the error “ext3grep: void init_directories(): Assertion `lost_plus_found_directory_iter != all_directories.end()' failed.”. However doing an “ls inode” created this stage2 cache, and afterwards “restore-all” worked… but it restored everything on the disk, even non-deleted files/dirs, not taking the “after” option into account… But after manually editing the stage2 cache to keep only the files/dirs to restore, “restore-all” worked perfectly!

  • ext4magic
  • others: debugfs, foremost, unrm, ext3undel

Disk Recovery

In case the MBR/partition table of your disk is damaged.

Make a backup before

You should always keep a backup of your partition table!

The first way is to store the output of fdisk's p command.

You can also do a dump of the MBR and EBR:

dd if=/dev/sda of=sda.dd bs=512 count=1 # full MBR dump
sfdisk -d /dev/sda > sda.sfdisk         # MBR and EBR partition tables

Out of curiosity, the file command is able to interpret the content of your MBR dump:

file sda.dd

Restore with a backup

If you have the output of the p command of fdisk, then you can manually recreate the partition table with fdisk with the same information. As long as you don't mount or format, modifying the partition table with fdisk doesn't modify the partitions data.

If you have a full dump of MBR and EBR, you can automatically restore it:

dd if=sda.dd of=/dev/sda
sfdisk /dev/sda < sda.sfdisk

To restore the MBR without the partition table:

dd if=sda.dd of=/dev/sda bs=446 count=1

To restore only the partition table:

dd if=sda.dd of=/dev/sda bs=1 skip=446 count=66

Restore without a backup

If you don't have a copy of your partition info, don't panic: some software can recover it by searching for the partitions in the disk content (but they have to be formatted as standard filesystems, ie not encrypted):

  • testdisk (very good) howto
  • gpart (didn't work very well for me, only found the NTFS partition) howto

Performance optimization

  • verynice
software/files.txt · Last modified: 2024/07/05 08:09 by cyril