Welcome to my first attempt at a blog. Hopefully this will help someone, somewhere, someday in the DFIR community. Since this turned out to be a longer topic than I originally thought (or I'm more long-winded than I originally thought), I'll say up front that it's divided into Part 1 (this entry) and a follow-up Part 2 available at this link.
Defining the Problem
So, to get right to it, here's the challenge that gave rise to this blog entry. My day job at a Fortune 500 corporation is primarily focused on digital forensics, usually of the "deep dive" variety. In the course of investigations that involve disk-based analysis of a given server, dealing with physical hard drives, RAIDs, etc. tends to be a very rare occurrence.
The data almost always is in the form of a bunch of files from a virtualized environment (VMware for the content below). More specifically, those files are server clones -- including "pooled storage volumes" (for lack of a better description), meaning a single virtual storage volume written across two or more physical VMDK files.
As a "bonus", most forensic tools (EnCase, FTK, TSK, etc.) might be able to correctly identify the file type of those VMDK files as LVM (technically, "LVM2" or "type 8e"), but they can't stitch the multiple files back together into a single volume for purposes such as mounting an intact file system and/or creating a forensic image of the entire thing.
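For a quick check outside of a forensic suite, here is a minimal sketch that looks for the LVM2 physical volume label signature (the ASCII string "LABELONE") near the start of a raw image. The function name and the 4 MiB scan window are my own assumptions for illustration; treat a hit as a hint only, not as proof that the file is a complete volume.

```shell
# has_lvm2_label IMAGE [SCAN_BYTES]
# Returns 0 if the string "LABELONE" appears in the first SCAN_BYTES
# (default 4194304 bytes = 4 MiB, an assumed window) of IMAGE, else non-zero.
# Heuristic only: it does not parse the label or the LVM metadata.
has_lvm2_label() {
    image="$1"
    scan_bytes="${2:-4194304}"
    # -a makes grep treat the binary data as text; matches go to /dev/null
    head -c "$scan_bytes" "$image" | grep -a 'LABELONE' >/dev/null
}
```

Usage would look something like `has_lvm2_label CLONE_1-flat.dd && echo "looks like an LVM2 PV"`, where the file name is a placeholder.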
After working on several of these cases and poking around online for answers (while not finding much of anything about LVM from the forensic angle), I'm hoping the info here can shave a few hours off the research/try/fail cycle for someone else out there in DFIR-land.
CAVEAT-1: This is still a work in progress, with experimentation and everything else underway while trying to figure out some of the LVM quirks. So if there's a better way to do it than I'm laying out here, other ideas are definitely welcome.
CAVEAT-2: Most of the investigations I work are intrusions, malware, APT related, etc. So the examples here are not shown with the "typical" read-only (mount -o ro) options; those would be more of a concern for cases involving human resources, law enforcement, etc. Aside from protecting the original evidence / source VM files (which are on a physical write blocker while imaging them), the main concerns here are to get the files assembled, mounted, imaged, and start the analysis asap.
What is LVM / LVM2?
In a nutshell, LVM is a software layer on top of (or "around"?) the virtual hard disks and partitions, which creates an abstraction for administrative usage to manage hard drive replacement, re-partitioning, backups, and re-sizing disk volumes on the fly. Very handy to address on-demand storage requirements as they change.
The main concern here, for DFIR purposes, is how to get the thing mounted and forensically imaged. But anyone wanting more in-depth information about LVM technology can get a crash course by reviewing these links (at the very least, I recommend checking out the graphic at https://en.wikipedia.org/wiki/File:Lvm.svg ):
- https://en.wikipedia.org/wiki/Logical_Volume_Manager_(Linux)
- https://en.wikipedia.org/wiki/Logical_volume_management
- https://www.maketecheasier.com/lvm-set-up-ubuntu/
- https://en.wikipedia.org/wiki/File:Lvm.svg
So Let's Get To It
When it comes to doing the analysis and experimentation here, the only things we'll be using are:
- A Windows host system (Server 2012R2 in this case but anything Win7 and later should work just as well),
- A Windows-based forensic tool to examine image files (FTK Imager in this case),
- Virtual machine guest software, i.e., VMware Workstation, to run
- The SANS SIFT forensic platform, available at this link.
In this case, we'll be dealing with Linux Logical Volume Manager server files (with the ext4 file system buried somewhere in there), consisting of 2 storage volumes. One volume is completely self-contained within a single physical VMDK file (numbered 1 below) while the other larger volume is split across 2 physical VMDK files (numbered 2 and 3 below) -- which will cause mounting and imaging problems. The larger/split volume is the star of this show.
The goal is to merge the physical files from numbers 2 and 3 into one logical volume in order to build out the full file and folder structure to be forensically imaged for later analysis.
Figure-1 |
After loading the CLONE_1-flat and CLONE_2-flat VMDK files (File >> Add Evidence Item...) with the "Image Files" option under the "Select Source" dialog and then expanding the drop-downs under the "Evidence Tree", it shows the following:
Figure-2 |
We also see the administratively assigned Volume Group names (redacted, obviously). Additional key data is found to the right, under the "File List" column, identifying which segment (physical file number) it is along with the total number of expected files in the entire logical group.
So we'll go ahead and image the two VMDK clones as "Raw DD format" (unsplit/unsegmented) files under two separate jobs in FTK Imager. Fast forward to completion and we see the results -- two individual raw DD image files. Both image files are named after their source VMDK counterparts:
Figure-3 |
Now it's time to jump over to VMware, fire up the SIFT Workstation, open a shell and see what we have. Logging in as the default user (sansforensics) we'll use ls -l to see the directory contents and make sure the Windows directory containing the CLONE image files is shared with the guest VM SIFT workstation:
Figure-4 |
Figure-5 |
OK, so that didn't work. How about using the mmls command from The Sleuth Kit (TSK) to see if we can get some information about the partitions and file systems?
Figure-6 |
Figure-7 |
At this point it became time to do some research and here's where we'll fast forward ahead. After reading various online forums written by frustrated-Linux-admin-after-frustrated-Linux-admin, it turns out there's a missing Linux package that needs to be installed... lvm2. And we can confirm if it needs to be installed by issuing the lvm command:
Figure-8 |
As instructed above, we'll download and install it with sudo apt-get install lvm2...
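The check-then-install step can be sketched as a small helper. The function name is hypothetical, and it only reports what it finds; it prints the apt-get command from above rather than running it.

```shell
# check_lvm2: report whether the lvm binary is on PATH.
# Prints the install hint (not executed) and returns 1 if it is missing.
check_lvm2() {
    if command -v lvm >/dev/null 2>&1; then
        echo "lvm2 present: $(command -v lvm)"
    else
        echo "lvm2 missing; install with: sudo apt-get install lvm2"
        return 1
    fi
}
```

Calling `check_lvm2` before any of the steps below gives a quick yes/no on whether the LVM commands will be available.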
Figure-9 |
A quick crash course is available at:
http://fibrevillage.com/storage/455-lvm-command-reference
Or a more in-depth review of the commands at:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Cluster_Logical_Volume_Manager/LVM_CLI.html
For basic forensic mounting and imaging purposes, the primary commands we're concerned with are:
- pvs - reports information about LVM physical volumes
- pvscan - scans all supported LVM block devices
- lvdisplay - logical volume in-depth information
They'll be used to obtain data needed to feed two other standard commands:
- losetup - attaches/detaches the files as loop devices
- dmsetup - manages the device-mapper driver and can force removal of the mapped devices if losetup doesn't successfully detach them (which is often the case)
Returning to the example, first we'll make sure we're in the same directory as the CLONE dd image files (at /mnt/hgfs/Examples/VM_dd_Files/Linux-individual-vols) and then attach the files as loop devices with the losetup command. Note that everything from here onward will require sudo / superuser privileges, so you may want to just sudo -i before doing any of this.
To make this work, it was necessary to put each individual CLONE dd image file on successive loop devices at /dev/loop0 and /dev/loop1, as shown below. If you want it read only, use losetup -r as the command. Just beware that using the -r option sometimes makes LVM uncooperative (not sure why, but "it just happens").
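As a rough sketch of this step, the sequence might look like the following. The image file names are placeholders (the real names are whatever FTK Imager produced), and the RUN and RO_FLAG variables are my own convention: by default each command is only printed, so nothing touches the system until you deliberately switch that off.

```shell
# Sketch only. RUN defaults to "echo", so commands are printed, not executed.
# To execute for real, run as root with RUN set to an empty string.
RUN="${RUN-echo}"

attach_images() {
    # $1, $2: the two raw dd image files (placeholder names in the call below)
    ro_flag="${RO_FLAG-}"    # set RO_FLAG=-r for read-only loop devices
    $RUN losetup $ro_flag /dev/loop0 "$1"
    $RUN losetup $ro_flag /dev/loop1 "$2"
    $RUN pvscan       # scan block devices for LVM physical volumes
    $RUN pvs          # summarize the physical volumes that were found
    $RUN lvdisplay    # in-depth logical volume info (LV Path, VG Name, UUID)
}

attach_images CLONE_1-flat.dd CLONE_2-flat.dd
```

Keeping the dry-run default makes it easy to eyeball the exact commands, and the loop device numbers, before committing to them.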
Figure-10 |
Figure-11 |
Figure-12 |
But the LV Path is also included (showing the device path), which will be useful to know later, in case the device wasn't automatically mounted during the losetup activities. Other potentially useful data includes the VG Name created by the admin and the UUID (a large string of random hex, similar to ef674641-3ab9-8e5c-1dbd-eac810715327a). Let's try sudo fdisk -l again to see if anything has changed:
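If the logical volume does not show up on its own, one way to bring it up by hand is sketched below. This is an assumption on my part rather than the exact steps from the screenshots: vgchange -ay is the standard LVM command to activate a volume group, and the VG name, LV name, and mount point are all placeholders to be replaced with the values reported by lvdisplay. As in the caveats above, read-only is optional and passed through MOUNT_OPTS.

```shell
# Sketch only. RUN defaults to "echo", so commands are printed, not executed.
# To execute for real, run as root with RUN set to an empty string.
RUN="${RUN-echo}"

mount_lv() {
    # $1: VG Name, $2: LV name, $3: mount point (all placeholders)
    opts=""
    if [ -n "${MOUNT_OPTS-}" ]; then
        opts="-o $MOUNT_OPTS"    # e.g. MOUNT_OPTS=ro for a read-only mount
    fi
    $RUN vgchange -ay "$1"       # activate the volume group if it is inactive
    $RUN mkdir -p "$3"
    $RUN mount $opts "/dev/$1/$2" "$3"
}

mount_lv vg_example lv_example /mnt/lvm_example
```

The /dev/VG/LV form used here matches the shape of the LV Path that lvdisplay reports.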
Figure-13 |
Figure-14 |
Figure-15 |
Figure-16 |
So it looks like we have a properly mounted volume, the Logical Volume Manager software "barrier" is gone, and valid file/folder structure now exists for navigation. If there's a targeted examination to perform, this might be as far as we need to go since the LVM2 volume is fully accessible.
Otherwise if a full forensic image is required, the entirely assembled logical volume can now be imaged. After that, we'll verify that it's a valid image file and then undo all of the "LVM stuff" up to this point, like detaching the loop devices. But since this is a good stopping point, reviewing the imaging and unmounting procedures for the LVM volume will be continued separately in Part 2 at this link.
The system tools will have issues when a name of a LV, LVG already exists.
Have a look at https://github.com/libyal/libvslvm instead.
Would libvslvm be useful in this case (i.e., multiple segments)? According to the documentation:
Not supported:
* multiple physical volumes
* multiple segments <------------------
* non-linear stripes
* snapshots
Not at the moment. However, depending on your mileage, it is open source and can be extended.
If you have to rely on non-extendable tools for your analysis you'll find yourself working around limitations a lot of your time anyway. Could as well spend that time on improving open tools?
Ah, ok thanks. Although my code would not qualify as "improving" open tools... breaking them? YES.
Still slowly learning as I go along :)
Helping to improve can also be done without coding; you could also help by indicating the scenarios you are dealing with and creating representative test files.
Also there are numerous "forensic" gotchas with the method and tools you describe, some mentioned here:
http://forensicswiki.org/wiki/Linux_Logical_Volume_Manager_(LVM)