A Look into Linux Forensics

In 1983, Richard Stallman announced that he was going to begin developing an operating system (OS) that was similar to the UNIX OS, except that it would be made up entirely of free software. The OS was called GNU (a recursive acronym for “GNU's Not Unix”). The term “free software” refers to several freedoms that the software gives the user, not to its price; in other words, it is akin to the concept of “free speech” more than it is to “free lunch.”

The four freedoms are:

1.) The freedom to run programs as you wish.

2.) The freedom to study how the program works, and change it.

3.) The freedom to redistribute copies.

4.) The freedom to distribute copies of your modified versions.

By 1991, the GNU OS was almost complete; the only missing component was the kernel. The kernel is the interface layer between a computer’s hardware and the applications running on it, and it is responsible for allocating the hardware resources that running software requires. Linus Torvalds filled that gap with the Linux kernel, which he released under the GNU General Public License (GPL), thereby making the Linux kernel free software.

This is the same Linux kernel that is used today on many different devices, including Android devices and Chrome OS. Numerous third parties bundle the kernel with higher-level components to build fully functional computing platforms; these bundles are known as “distributions,” or “distros” for short. There are nearly a thousand available distributions, but most of them fall into one of three main development branches: Debian, Slackware, or Red Hat.

Evidence Collection on Live Machines

When dealing with evidence collection, it is important to follow proper forensic procedure when examining the suspect machine. Part of this means prioritizing proper documentation and maintaining a thorough log of all actions performed on the suspect machines. There must be a chain of custody, and all evidence should be properly seized, transported, and handled; otherwise it may be ruled inadmissible in a court of law.

First and foremost, we must recognize the computer system’s operational state. If the computer is turned on in any way (e.g. standby mode or running), the first step is to analyze any potentially valuable evidence found in running processes or memory. If we were to simply shut the system down before examining what is running on the system, we (a) lose the ability to capture data stored in volatile memory and (b) may not be able to get back into the system. Examples of volatile data are: running processes, network connection status, mounted remote file systems, loaded kernel modules, logged-on users, and contents of the /proc directory (in Linux systems).

For a Windows computer, we can check running processes simply by looking at the Task Manager (Ctrl+Shift+Esc, or Ctrl+Alt+Delete and then selecting Task Manager). We may also inspect active connections using the net sessions command and inspect any open shared files or folders using the openfiles command in the Command Prompt. It is advisable to take photographs (not screenshots) of each of these outputs before shutting down the computer.

For a Linux system, examiners can identify the currently running processes with the shell command ps, display them as a tree (command: pstree), and list open files together with the processes that opened them (command: lsof). Loaded kernel modules can be listed with lsmod and inspected with modinfo. Mounted file systems can be listed with findmnt, or by reading the static file systems table in /etc/fstab and the table of currently mounted file systems in /etc/mtab. Additionally, we can view a Linux machine’s network status with the same command used on Windows: netstat.
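The Linux commands above could be wrapped in a small collection script so that every output is saved rather than merely viewed. The following is only a sketch under stated assumptions: the function name collect_volatile, the helper run_tool, and the output file names are hypothetical, and the output directory is assumed to live on external media so that nothing is written to the suspect disk. Tools that are not installed are skipped and noted.

```shell
#!/bin/sh
# Sketch: save the output of each volatile-data command to its own file.
# collect_volatile, run_tool and the file names are hypothetical (illustration only).
collect_volatile() {
    outdir=$1
    mkdir -p "$outdir" || return 1

    # Record when the collection started (UTC) for the examiner's log.
    date -u > "$outdir/collection_started.txt"

    # run_tool NAME COMMAND...: run COMMAND if it is installed, store output in NAME.txt.
    run_tool() {
        name=$1; shift
        if command -v "$1" >/dev/null 2>&1; then
            "$@" > "$outdir/$name.txt" 2> "$outdir/$name.err" || true
        else
            echo "$1 not found" > "$outdir/$name.err"
        fi
    }

    run_tool processes    ps -ef
    run_tool process_tree pstree
    run_tool open_files   lsof -n -P
    run_tool modules      lsmod
    run_tool mounts       findmnt
    run_tool network      netstat -an
    run_tool users        who

    # Copy the mount tables named in the text, if they are readable.
    for f in /etc/fstab /etc/mtab; do
        [ -r "$f" ] && cat "$f" > "$outdir/$(basename "$f").txt"
    done
    return 0
}
```

A call such as collect_volatile /mnt/evidence/volatile (the path is a placeholder) would leave one text file per command. Running any tool on a live system changes that system slightly, so each invocation still belongs in the examiner's action log.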

Investigating Live Linux Machines

Running Linux systems may contain important information which will otherwise be lost at shutdown. RAM is the most difficult to capture due to its volatile nature. As RAM is generally used for program execution, there may be some data here that is pertinent to an examiner’s investigation.

An examiner must consider the needs of the investigation and determine which of the volatile data sources listed earlier (running processes, network connections, mounted remote file systems, loaded kernel modules, logged-on users, and the contents of /proc) to collect before shutting the system down.

Modern operating systems run in a protected mode for security reasons, so the entire physical address space can only be acquired from privileged (system) mode. Acquiring an image of the system memory (RAM) therefore often requires injecting a Linux kernel module into the running kernel. However, since the Linux kernel checks that a module has the correct version and checksums before loading it, the kernel will refuse to load a module that was pre-compiled for a different kernel version or configuration.

One solution to this problem is maintaining a library of kernel modules for every possible distribution and kernel version. For incident response, this requirement makes memory acquisition problematic, especially in the case of mobile phones, where vendors may publish the kernel version they used while the configuration and the details of vendor-specific patches remain unknown.
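The version check described above can be previewed before attempting to load a module. The sketch below compares a kernel release string with the "vermagic" string embedded in a module file. The function name release_matches is hypothetical. It checks only the release portion; the kernel also compares the remaining vermagic flags and, when module versioning is enabled, symbol checksums, so a match here is necessary but not sufficient.

```shell
#!/bin/sh
# release_matches KERNEL_RELEASE VERMAGIC: succeed if VERMAGIC begins with KERNEL_RELEASE.
# Hypothetical helper; it inspects only the first field of the vermagic string.
release_matches() {
    release=$1
    vermagic=$2
    # The first space-separated word of a vermagic string is the kernel release.
    [ "${vermagic%% *}" = "$release" ]
}
```

On a live system the two arguments could come from uname -r and modinfo -F vermagic, for example release_matches "$(uname -r)" "$(modinfo -F vermagic ./acquire.ko)", where acquire.ko stands in for whatever acquisition module the examiner intends to load.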

Examiners responding to running systems ought to attempt to identify the use of encryption by reviewing all running processes (command: ps), loaded kernel modules (commands: lsmod and modinfo), and mounted file systems (command: findmnt, or by reading the file systems table in /etc/fstab and the table of mounted file systems in /etc/mtab).
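One way to make that review repeatable is to save the command outputs to text files and search them for names associated with encryption. This is a minimal sketch: the function name find_crypto_indicators is hypothetical, and the keyword list is an illustrative assumption rather than an exhaustive signature set, so an empty result does not prove that encryption is absent.

```shell
#!/bin/sh
# find_crypto_indicators FILE...: print lines that mention common encryption components.
# The keyword list is an illustrative assumption, not a complete signature set.
find_crypto_indicators() {
    grep -H -i -E 'dm[_-]crypt|ecryptfs|luks|veracrypt|truecrypt|cryptsetup|gocryptfs|encfs' "$@"
}
```

For example, after ps -ef > ps.txt, lsmod > lsmod.txt and findmnt > mounts.txt, the call find_crypto_indicators ps.txt lsmod.txt mounts.txt prints each matching line prefixed with the file it came from, and returns a non-zero status when nothing matches.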

Collecting Data using Linux Utilities

Before dismantling a computer, it is important to take pictures of it from all angles, document the system hardware components and how they are connected, and label each wire so that each connection can be easily restored to its original configuration.

It is imperative to create a forensic image of all storage devices prior to examining them in order to preserve them as evidence. This first means forensically wiping the target drive (the drive that will receive the image) so that no residual data is left on it. We can do this manually using the dd utility on a Linux machine; the same utility is later used to produce a bit-level copy of the drive being imaged (which includes deleted files and slack space). The following command uses /dev/zero as the input file and writes it over the target partition /dev/hdb1.

dd if=/dev/zero of=/dev/hdb1 bs=2048
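After the wipe, it can be worth confirming that the target really contains nothing but null bytes. The sketch below, with the hypothetical function name is_zeroed, deletes every NUL byte from the target's contents and checks whether anything is left over; it works on an image file or, given sufficient privileges, on a device node.

```shell
#!/bin/sh
# is_zeroed TARGET: succeed only if TARGET consists entirely of NUL bytes.
# Hypothetical helper for illustration.
is_zeroed() {
    # Strip every NUL byte; if even one byte survives, residual data is present.
    leftover=$(tr -d '\000' < "$1" | head -c 1 | wc -c)
    [ "$leftover" -eq 0 ]
}
```

Used as is_zeroed /dev/hdb1 && echo "target is clean", the check reads the whole target once, so on a large drive it takes roughly as long as the wipe itself.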

Once everything on the target drive is overwritten with null values, we can use the dd utility piped into netcat (nc) to create the bit-level forensic image and transfer it onto the forensically wiped target drive. Comparing a hash of the original with a hash of the image will confirm that there have been no changes between the original hard drive and its forensic copy.
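The copy-then-compare step can be sketched as follows. The function name image_and_verify is hypothetical, SHA-256 is an assumed choice of hash (the text does not name one), and the local copy stands in for the network transfer so that the sketch can be tried on ordinary files. The commented lines show the shape of the netcat variant; the device names, address, and port are placeholders, and the listener syntax differs between netcat builds.

```shell
#!/bin/sh
# image_and_verify SOURCE IMAGE: copy SOURCE bit for bit into IMAGE, then compare digests.
# Hypothetical helper; SHA-256 is an assumed hash choice.
image_and_verify() {
    src=$1
    img=$2
    dd if="$src" of="$img" bs=4096 2>/dev/null || return 1

    # Hash both sides independently and show the digests for the examiner's notes.
    src_hash=$(sha256sum < "$src" | cut -d' ' -f1)
    img_hash=$(sha256sum < "$img" | cut -d' ' -f1)
    echo "source: $src_hash"
    echo "image:  $img_hash"
    [ "$src_hash" = "$img_hash" ]
}

# Network variant described in the text (placeholders throughout):
#   receiving host:  nc -l -p 9000 | dd of=/dev/sdX bs=4096
#   suspect host:    dd if=/dev/sdY bs=4096 | nc RECEIVER_IP 9000
```

The function returns success only when the two digests are identical, which makes it easy to record a pass or fail result alongside the digests themselves.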

Examining File Systems and Disk Configurations

There are a number of disk configurations and file systems that are exclusive to the Linux environment. Modern Linux systems use an abstraction layer, the Virtual File System (VFS), as an interface between the kernel and other applications on one side and the specific file systems (such as Ext3, Ext4, or NTFS) on the other. The VFS enables access to data on any file system regardless of the file system’s specific implementation or the actual location of the data.

Most Linux systems today are installed on the Extended file system (Ext) family of file systems. As a result, most forensic tools can parse these file systems and interpret their metadata. Ext allows security settings and ownership metadata to be applied to files and folders, including the flags for read (r), write (w), and execute (x) modes. Files are hidden in Linux by prepending a period, “.”, to the filename, and commands need specific options to reveal them (e.g. ls -a to list all files in the current directory).
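To illustrate both points at once, the sketch below walks a directory tree and prints the permission string, owner, and group of every hidden entry. The function name list_hidden is hypothetical, and the stat format option shown is the GNU coreutils one.

```shell
#!/bin/sh
# list_hidden DIR: print mode, owner, group and path of every dot-file or dot-directory under DIR.
# Hypothetical helper; relies on GNU stat's -c format option.
list_hidden() {
    find "$1" -mindepth 1 -name '.*' -exec stat -c '%A %U %G %n' {} +
}
```

Run against a mounted copy of a home directory, for instance list_hidden /mnt/image/home (a placeholder path), each output line starts with a permission string in the same r/w/x notation that ls -l uses.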

One of the first things an examiner will likely do is make a bit-level image of the system’s hard drive so that the investigation can continue without the risk of changing anything on the original. The dd utility (sometimes expanded as “data dump”) is part of the GNU coreutils found on Linux systems and is a very powerful tool for copying data regardless of its file system type. With dd, one can create a forensic image in which all the files, slack space, and unallocated data are captured.

Basic System Configuration Information

Linux system configuration settings are stored in easily accessible text-based configuration files throughout the file system. Most of the system-level configuration files are located in the /etc/ directory and most user-level configuration files are located within the user’s home directory. A forensic analysis of a Linux system generally begins with the identification of the distribution and kernel version (command: uname -a).

On disk, this information is typically found in /etc/issue or /etc/version.
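When working from a mounted forensic copy rather than a live system, uname is of no use, but the release files can be read directly. The following sketch prints whichever of them exist under a given root directory. The function name show_release is hypothetical, and /etc/os-release is an addition not mentioned above; many current distributions ship it, but treat its presence as an assumption.

```shell
#!/bin/sh
# show_release ROOT: print the release files found under ROOT (e.g. a read-only mounted image).
# Hypothetical helper; etc/os-release is an assumed extra location.
show_release() {
    root=${1:-}
    for f in etc/issue etc/version etc/os-release; do
        if [ -r "$root/$f" ]; then
            echo "== /$f =="
            cat "$root/$f"
        fi
    done
}
```

Calling show_release /mnt/image inspects the mounted copy, while calling it with an empty argument reads the files of the running system.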

Users and Groups

Most Linux distributions maintain the list of user accounts in a readable text file, /etc/passwd. Each entry in /etc/passwd contains the following colon-separated fields: the username, the encoded password field, the numeric user ID (which defines a unique user and their associated permissions), the numeric primary group ID (specified in the /etc/group file), the full name of the user, the path to the user’s home directory, and the user’s command shell.
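Because the fields are colon-separated, a passwd file copied from the evidence can be summarized with awk. The two functions below are a sketch with hypothetical names: one prints the fields most relevant to an examiner, and the other lists every account whose numeric user ID is 0, since any such account has root's privileges regardless of its name.

```shell
#!/bin/sh
# list_accounts PASSWD_FILE: print user, UID, GID, home and shell, one account per line.
list_accounts() {
    awk -F: '{ printf "%s uid=%s gid=%s home=%s shell=%s\n", $1, $3, $4, $6, $7 }' "$1"
}

# uid0_accounts PASSWD_FILE: print the name of every account whose UID is 0.
uid0_accounts() {
    awk -F: '$3 == 0 { print $1 }' "$1"
}
```

More than one name in the output of uid0_accounts is worth a closer look, as is an account whose home directory or shell points somewhere unusual.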

The user’s password is stored in the /etc/shadow file, rather than the /etc/passwd file. This is a text file containing hashed passwords and account expiration information for all users. By default, only root-privileged users can read the /etc/shadow file, in contrast to the /etc/passwd file, which is world-readable by default.
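The second field of each shadow entry reveals the state of the account's password without any cracking. The sketch below classifies that field by its prefix. The function name shadow_summary and the labels it prints are hypothetical; the prefixes listed are the common crypt identifiers, and anything else is reported as "other".

```shell
#!/bin/sh
# shadow_summary SHADOW_FILE: classify the password field of each account.
# Hypothetical helper; unrecognized formats are reported as "other".
shadow_summary() {
    awk -F: '{
        state = "other"
        if ($2 == "")                 state = "empty password"
        else if ($2 ~ /^[!*]/)        state = "locked or no login"
        else if ($2 ~ /^\$1\$/)       state = "MD5 hash"
        else if ($2 ~ /^\$2[aby]?\$/) state = "bcrypt hash"
        else if ($2 ~ /^\$5\$/)       state = "SHA-256 hash"
        else if ($2 ~ /^\$6\$/)       state = "SHA-512 hash"
        else if ($2 ~ /^\$y\$/)       state = "yescrypt hash"
        print $1 ": " state
    }' "$1"
}
```

This should be run against a copy of the file taken from the forensic image. An account with an empty password field, or a service account that unexpectedly carries a real hash, is the kind of anomaly this summary is meant to surface.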

The default Linux user profile location, /home/<user>, is where user-created data and configuration information are stored by default. A shorthand for the user’s home directory is “~/”. The root account’s home directory is typically located at /root instead.

Log Files

Linux systems are configured for robust event logging using the syslog facility. Syslog allows processes to send events for storage in local log files. As a result, many Linux daemons, services, and system-level functions use the syslog facility for logging rather than maintaining their own log files. The two main syslog tools are Rsyslog and syslog-ng. Most of the log files are found in the /var/log directory.

Below are a few of the logs that a forensics investigator may review:

  • System Log (/var/log/syslog and /var/log/messages): System events such as device mounting, network configuration changes, and security messages
  • Authorization Log (/var/log/auth.log): Authentication-related events including user logon/logoffs, sudo events and commands
  • Kernel Logs (/var/log/dmesg and /var/log/kern.log): Kernel debugging, info, error messages, kernel message buffer
  • Installer Logs (/var/log/installer): Events generated during system installation
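
As a starting point for reviewing the authorization log, the sketch below filters a log file down to lines about logins and sudo use. The function name auth_events is hypothetical, and the search strings are assumptions based on the wording typically used by OpenSSH, sudo, and PAM; other software or versions may phrase their messages differently.

```shell
#!/bin/sh
# auth_events FILE...: print log lines about logins, sessions and sudo activity.
# Hypothetical helper; the patterns assume typical OpenSSH, sudo and PAM wording.
auth_events() {
    grep -E 'sudo:|Failed password|Accepted (password|publickey)|session opened' "$@"
}
```

For example, auth_events /mnt/image/var/log/auth.log (a placeholder path) narrows a large log to the entries most likely to matter. Rotated logs are often gzip-compressed and would need to be decompressed first.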

Examining the Command Shell

The shell is a command-line interpreter that provides a user interface for the OS on Linux. A user enters commands as text and the command-line interpreter executes them. The command shell contains artifacts that an examiner may find of interest. For example, in Bash the history command lists previously entered commands, which are saved in the ~/.bash_history file; unless the user covered their tracks and deleted the history, a step-by-step command history can reveal quite a lot.
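On a forensic copy, the history file is read directly rather than through the history command. When Bash has been configured to record timestamps (the HISTTIMEFORMAT variable is set), each command in the file is preceded by a line of the form "#" followed by seconds since the epoch. The sketch below, with the hypothetical function name show_history, converts those lines to readable UTC times and prints everything else unchanged; it relies on the -d option of GNU date.

```shell
#!/bin/sh
# show_history HISTFILE: print commands; "#<epoch>" lines become UTC timestamps.
# Hypothetical helper; uses GNU date's -d "@seconds" syntax.
show_history() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            \#*[!0-9]*|\#)
                # A "#..." line that is not purely digits is an ordinary comment.
                printf '%s\n' "$line"
                ;;
            \#*)
                # Timestamp marker: print it as a prefix for the command that follows.
                printf '[%s] ' "$(date -u -d "@${line#\#}" '+%Y-%m-%d %H:%M:%S')"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done < "$1"
}
```

A call such as show_history /mnt/image/home/alice/.bash_history (a placeholder path) yields a timeline when timestamps are present and a plain command list when they are not.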

Additional Considerations and Challenges

There are a number of considerations when it comes to collecting digital evidence. Precautions should be taken to prevent any exposure that could contaminate the evidence. For digital evidence, this includes ensuring that network traffic is isolated and that all interactions with the systems are done through forensic copies (so as to prevent changing anything on the original piece of evidence). RF signal-blocking containers (e.g. a Faraday bag) are often used for physical network isolation.

The mature market for modern computing offers numerous makes and models of devices of all sorts, reaching far beyond personal computers. Many of these devices use closed-source operating systems, which can make it difficult to extract evidence (e.g. Chromebook vs. Mac). Specific expertise may be necessary in many of these cases.