
# LVM Snapshot Merging

## Contents

  - [Overview](#overview)
  - [General Setup](#general-setup)
  - [Preparing a Scenario](#preparing-a-scenario)
  - [Merging a Snapshot](#merging-a-snapshot)
  - [Advanced Usage](#advanced-usage)
  - [Data Retention](#data-retention)
  - [References](#references)


## Overview

The LVM2 Device Mapper infrastructure includes a kernel module and a method for merging the contents of a snapshot back into its source using `lvconvert`. The typical use of a snapshot is to back it up and then discard it; this article outlines an alternative use, where the changes made on the snapshot are merged back into the source instead.


## General Setup

In order to work with the `lvconvert` merging process:

  - The **LVM2** packages for the distro must be installed
  - The kernel module `dm-snapshot.ko` must be loaded
  - A snapshot to merge must exist

Check for the snapshot-merge feature using `dmsetup targets` and load the module as needed:

```
# dmsetup targets
mirror           v1.14.0
striped          v1.5.6
linear           v1.1.0
error            v1.2.0

# modprobe -v dm-snapshot
insmod /lib/modules/2.6.32-642.4.2.el6.x86_64/kernel/drivers/md/dm-bufio.ko
insmod /lib/modules/2.6.32-642.4.2.el6.x86_64/kernel/drivers/md/dm-snapshot.ko

# dmsetup targets
snapshot-merge   v1.3.6
snapshot-origin  v1.9.6
snapshot         v1.13.6
mirror           v1.14.0
striped          v1.5.6
linear           v1.1.0
error            v1.2.0
```
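
The check above can also be scripted. The following is a minimal sketch, assuming a POSIX shell with `awk`; the function name `has_snapshot_merge` is hypothetical. It reads `dmsetup targets` output on standard input, so it can be tried without root access:

```shell
# Hypothetical helper: succeeds when the snapshot-merge target is listed
# in `dmsetup targets` output supplied on standard input.
has_snapshot_merge() {
    awk '$1 == "snapshot-merge" { found = 1 } END { exit !found }'
}

# Sample lines taken from the first listing above, before the module is loaded
printf '%s\n' 'mirror           v1.14.0' 'striped          v1.5.6' |
    has_snapshot_merge || echo "snapshot-merge target missing"
```

On a real system, as root, this could be used as `dmsetup targets | has_snapshot_merge || modprobe -v dm-snapshot`.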


## Preparing a Scenario

For instructional purposes, we need to prepare a scenario:

  - An LV in the VG is mounted and in use, and has a snapshot
  - The snapshot is active and mounted, and new data is added to it

The setup looks like:

```
Create a new snapshot and add dummy data:

# lvcreate -s /dev/vgcbs00/lvdata -L 10G -n lvdata_snap
# mount /dev/vgcbs00/lvdata_snap /snap
# for ii in 5 6 7 8; do dd if=/dev/zero of=/snap/data.$ii bs=4M count=10$ii; done

Examine the results:

# ls /data/ /snap/
/data/:
data.1  data.2  data.3  data.4
/snap/:
data.1  data.2  data.3  data.4  data.5  data.6  data.7  data.8

# df -h /data/ /snap/
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vgcbs00-lvdata
                       37G  1.7G   34G   5% /data
/dev/mapper/vgcbs00-lvdata_snap
                       37G  3.4G   32G  10% /snap
```

So here we have the origin (source) LV `lvdata` and a snapshot of it, `lvdata_snap`. Four additional data files were added to the mounted snapshot at `/snap/` to simulate a snapshot that has had data written to it since it was created. The snapshot now holds twice as much data as the origin (3.4G versus 1.7G), and the additional files are what we want to merge back to the origin.
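
As a sanity check on the numbers, the amount written by the `dd` loop can be worked out from its arguments. This is only an arithmetic sketch of that loop (`bs=4M`, `count=10$ii`) and does not touch any volume:

```shell
# Each dd call writes count x bs, where count is "10" followed by the loop value
total=0
for ii in 5 6 7 8; do
    total=$(( total + 4 * 10$ii ))   # bs=4M, count=105..108
done
echo "${total} MiB written to /snap"   # prints: 1704 MiB written to /snap
```

That is roughly 1.7G, which accounts for the difference between the two `Used` values reported by `df`.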


## Merging a Snapshot

To merge the snapshot back to its origin:

1. Ensure the origin has enough space to hold the contents of the additional snapshot data
2. Ensure the **origin and snapshot are unmounted** from the filesystem
3. Use the `lvconvert` command to merge the snapshot to the origin

> It is possible to perform the merge while the origin is online; see the **Advanced Usage** section for this scenario.
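
Step 2 can be verified before running the merge. The helpers below are a minimal sketch, not part of LVM: `mapper_path` and `is_mounted` are hypothetical names, and the check assumes that mounted LVs appear in `/proc/mounts` under their `/dev/mapper/` names, as in the `df` output shown earlier.

```shell
# Build the device-mapper path for a VG/LV pair. LVM doubles any hyphen
# inside a VG or LV name when it forms the mapper name.
mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# Succeed if the device appears in the first column of a mounts table.
# The second argument defaults to /proc/mounts and exists to allow testing.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example pre-flight check (commented out so the block has no side effects):
# for lv in lvdata lvdata_snap; do
#     is_mounted "$(mapper_path vgcbs00 "$lv")" && echo "$lv is still mounted"
# done
```

If either volume is reported as mounted, an immediate `lvconvert --merge` cannot proceed, and the merge is deferred as described under **Advanced Usage**.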

Using the `lvconvert` command is straightforward; use the `-i` flag to set an update interval, in seconds, for progress output. Be aware that the **snapshot will be deleted after the merge**, so make sure this is the desired outcome:

```
# umount /snap
# umount /data

# lvconvert --merge -i 2 vgcbs00/lvdata_snap
  Merging of volume lvdata_snap started.
  lvdata: Merged: 83.4%
  lvdata: Merged: 84.8%
  ...lots of status...
  lvdata: Merged: 100.0%
  Merge of snapshot into logical volume lvdata has finished.
  Logical volume "lvdata_snap" successfully removed

# mount /dev/vgcbs00/lvdata /data
# ls /data/
data.1  data.2  data.3  data.4  data.5  data.6  data.7  data.8
```

You have now successfully merged the snapshot content back to its origin.


## Advanced Usage

It is possible to reduce the offline time by scheduling the merge to start when the origin LV is next activated; the merge then runs in the background while the volume is back in use. This still requires a **complete deactivation** and reactivation of the volume group, which makes it unsuitable for the root filesystem or any other volume that cannot be deactivated while the server is online. The process is similar to the basic usage: leave the origin mounted, run `lvconvert`, cycle the volume group at a convenient time, then wait for the merge to complete in the background.

Using the same scenario setup prepared above:

```
# umount /snap

# lvconvert --merge vgcbs00/lvdata_snap
  Logical volume vgcbs00/lvdata contains a filesystem in use.
  Can't merge over open origin volume.
  Merging of snapshot vgcbs00/lvdata_snap will occur on next activation of vgcbs00/lvdata.

(stop service using /data)
# umount /data
# vgchange -an vgcbs00
# vgchange -ay vgcbs00
# mount /dev/vgcbs00/lvdata /data
(start service using /data)
```

At this point you can `ls` immediately and see the changed data on the origin, but be aware that the merge is still running in the background. To check its status, use `lvs -a` and watch the `Data%` column work its way down to zero; when the merge completes, the **snapshot LV is deleted** automatically.

```
# lvs -a
  LV            VG      Attr       LSize  Pool Origin Data%
  lvdata        vgcbs00 Owi-aos--- 37.50g
  [lvdata_snap] vgcbs00 Swi-a-s--- 10.00g      lvdata 4.76

...

# lvs -a
  LV            VG      Attr       LSize  Pool Origin Data%
  lvdata        vgcbs00 Owi-aos--- 37.50g
  [lvdata_snap] vgcbs00 Swi-a-s--- 10.00g      lvdata 3.17
```

Eventually you will see the `Data%` column decrease to 0.00, then the snapshot will be removed.
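
Watching `lvs -a` by hand can be replaced with a small polling loop. This is a sketch under stated assumptions: `snap_percent` and `wait_for_merge` are hypothetical names, and the parser expects the two-column output of `lvs -a --noheadings -o lv_name,data_percent`, with the hidden snapshot shown in brackets as in the listings above.

```shell
# Hypothetical helper: print the Data% of one LV, given on standard input
# the output of: lvs -a --noheadings -o lv_name,data_percent <vg>
snap_percent() {
    awk -v lv="$1" '{
        name = $1
        gsub(/^\[|\]$/, "", name)          # hidden LVs are listed as [name]
        if (name == lv && NF >= 2) print $2
    }'
}

# Poll until the snapshot no longer reports a Data% value.
# Arguments: VG name, snapshot LV name, seconds between checks (default 10).
wait_for_merge() {
    while :; do
        pct=$(lvs -a --noheadings -o lv_name,data_percent "$1" | snap_percent "$2")
        [ -n "$pct" ] || break
        echo "$2: Data% at $pct"
        sleep "${3:-10}"
    done
    echo "$2: no longer listed"
}
```

For the scenario above this would be called as `wait_for_merge vgcbs00 lvdata_snap 5`. Note that the loop also ends if `lvs` itself fails (for example when not run as root), so its final message should be confirmed with a plain `lvs -a`.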


## Data Retention

Simply put, the data changes on the snapshot will overwrite the data on the origin volume.

It does not matter whether a file is new or changed: the process does not attempt to merge the actual contents of a file (as diff and patch would) but simply replaces data blocks on the origin volume with those from the snapshot. File deletions are handled the same way: if a file is deleted on the snapshot, the merge removes it from the origin volume. The process does not differentiate between binary and text files; all data is treated the same during the merge.

Starting with a few data files:

```
/data/test-d1.txt
This line was edited on /data before snapshot creation

/data/test-d2.txt
This line was edited on /data before snapshot creation

/data/test-s2.txt
This line was edited on /data after snapshot creation

====

/snap/test-d1.txt
This line was edited on /data before snapshot creation
This line was edited on /data after snapshot creation

/snap/test-d2.txt
This line was edited on /data before snapshot creation
This line was edited on /snap after snapshot creation

/snap/test-s1.txt
This line was edited on /snap after snapshot creation

/snap/test-s2.txt
This line was edited on /snap after snapshot creation
```

We then run the `lvconvert` process outlined above to merge the data and end up with:

```
/data/test-d1.txt
This line was edited on /data before snapshot creation
This line was edited on /data after snapshot creation

/data/test-d2.txt
This line was edited on /data before snapshot creation
This line was edited on /snap after snapshot creation

/data/test-s1.txt
This line was edited on /snap after snapshot creation

/data/test-s2.txt
This line was edited on /snap after snapshot creation
```

As the contents of the test files show, the snapshot data blocks simply replace those of the origin, whether a file was edited in one place or both. There is no attempt to compare timestamps or otherwise merge data more intelligently; the merge is an all-or-nothing operation.
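
Because the origin ends up matching the snapshot, it can be useful to preview what the merge will change while both volumes are still mounted. The function below is a file-level approximation only (the real merge works on data blocks, not files), and `preview_merge` is a hypothetical name; it relies on the standard `diff -rq` report format in the C locale.

```shell
# Hypothetical helper: compare the mounted origin and snapshot trees and
# label each difference by what the merge would do to the origin.
preview_merge() {
    origin=${1%/}
    snap=${2%/}
    LC_ALL=C diff -rq "$origin" "$snap" | while IFS= read -r line; do
        case $line in
            "Only in $origin: "*|"Only in $origin/"*)
                echo "removed by merge:  ${line#Only in }" ;;
            "Only in $snap: "*|"Only in $snap/"*)
                echo "added by merge:    ${line#Only in }" ;;
            *)
                echo "replaced by merge: $line" ;;
        esac
    done
}
```

With the example files above, `preview_merge /data /snap` would flag `test-s1.txt` as added and `test-s2.txt` as replaced. Any changes made on the origin that should be kept need to be copied elsewhere before the merge.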


## References

  - <https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/snapshot_merge.html>