Category Archives: Linux IR

Helix 3 Pro: First Impressions

Filed under Computer Forensics, Evidence Acquisition, Evidence Analysis, Incident Response, Linux IR, Memory Analysis, Registry Analysis, Windows IR

I have used several versions of Helix over the recent years.  I enjoy the tool set and recommend it to forensics colleagues, sysadmins, and even family members.

Quite a substantial ruckus was raised this year when e-fense announced that Helix 3 would no longer be free to download.  Instead, would-be users must pay to register as a forum user to get access to Helix 3 Pro updates for a year.

I took the plunge and purchased my forum membership.  Here are the first things I noticed:

  • Some of the highlights…
    • The forum allows access to the Helix 3 software once the member applies a registration token.
    • After adding the token, I was able to download not only Helix 3 Pro, but also Helix 3, and contributed tools.
    • Helix 3 Pro is really nothing like the 1.8 and 1.9 versions that came before it.  Although it still provides a bootable live CD as well as executables that can be run in Windows and Linux, the interfaces for all the modes of use have been made more consistent and seamless.  Also, a set of Mac OS X tools has been added.
    • The Helix 3 Pro CD also provides a set of cell phone forensics tools (that I will cover in a follow-on posting).
    • One of e-fense’s goals with the Helix 3 release was to provide a forensics tool that did not touch the host computer in any way.  I have not tried to verify this yet, although I intend to do so soon.
  • And the lowlights…
    • On my Dell D630 laptop (and a few other systems), the boot process generated a number of errors and, in some cases, would not detect a graphical interface mode correctly, leaving me with an unusable Helix environment.
    • The majority of the tools that made previous versions of Helix useful are completely gone.  This was apparently done so that the Helix 3 Pro image can be trusted.  I spoke to a sales representative at e-fense who told me that several customers were using Helix 3 Pro in environments where open source software of questionable origins is, well, frowned upon.
    • Static binaries formerly found on the Helix 1.x CDs are now separate downloads.  They are still available through the Helix forums.

This is the first in a series of blog postings I plan to publish on Helix 3 Pro.  Please post comments if there are specific tools or features of the LiveCD you would like me to cover.

John Jarocki, GCFA Silver #2161, is an Information Security Analyst specializing in intrusion detection, forensics, and malware analysis. He also holds GCIA, GCIH, GCFW and GSEC certifications and is the Treasurer of NM InfraGard.  John recently co-authored a controversial paper on using LiveCDs to mitigate online banking risks.

Tricking the “script” Command

Filed under Computer Forensics, Evidence Acquisition, Incident Response, Linux IR

Keeping a record of all of the commands you type as well as their output is obviously useful during a forensic investigation.  On Unix and Linux systems, we have the “script” command which does precisely this.  You run “script <filename>” and the script command spawns a new shell: everything you type and all output you receive in return is automatically captured to the specified file.

From a forensic perspective, however, the classic problem is that script insists on writing its output to a file in the local file system.  This is particularly a problem during the initial stages of incident response when you’re operating on a live system trying to verify whether or not it has been compromised.  If you capture your session with the script command, you may be trampling important data as your output file grows.  Of course you could attach a portable storage device and write your output there, but that could be problematic on many levels.

This topic came up recently when Ed Skoudis and I were working on an article for our Command Line Kung Fu blog (a useful resource for Incident Response professionals, since we provide helpful command line short-cuts using both the Windows and Unix command shells).  During our conversation, I realized that there might be a work-around that allows the script command to send its output over the network.  Sure enough, after a little bit of hacking I came up with a devious little method to accomplish exactly this.

Suppose we had another machine on the same network as the host we are investigating.  Let’s suppose this other machine has IP address 192.168.100.1 and there’s a netcat process listening on port 9999 and redirecting its output to some file (“nc -l 9999 >myoutput”).  Now, on the machine you’re investigating, run the following commands:

$ mkfifo /tmp/fifo
$ cat /tmp/fifo >/dev/tcp/192.168.100.1/9999 &
[1] 1523
$ script -f /tmp/fifo
Script started, file is /tmp/fifo
$

Some explanation is clearly in order here:

  • The mkfifo command creates a special type of object in the file system called a FIFO (short for “first in, first out”).  The FIFO allows one command to write data into this “file” and another command to read data out of it.  This is perfect for our purposes.
  • The next command uses cat to read data out of the FIFO and redirect that output into /dev/tcp/192.168.100.1/9999.  This “file name” is a specially recognized syntax in the bash shell that means “write the data over the network to host 192.168.100.1 on port 9999/tcp” (for more information see Ed’s excellent “Netcat Without Netcat” presentation).  We use “&” to put this command in the background where it will sit and wait for us to start putting data into the FIFO.
  • Finally we fire up the script command and tell it to write its data into the FIFO.  We also add the “-f” (“flush”) argument so that the script command doesn’t buffer its output: the remote side will see the commands being typed and get the output instantly.  We’re now in the subshell spawned by the script command and anything that happens from here on out should start appearing in the output file on the remote system.

The “/dev/tcp/…” syntax is a useful little bit of bash shell trickery.  But even if you didn’t have this, you could do something similar if your incident response kit included netcat.  Just change the second command in the output above to “cat /tmp/fifo | nc 192.168.100.1 9999”.

The real trick here is obviously knowing how to create and manipulate FIFOs, and for whatever reason this doesn’t seem to be something that’s commonly covered in most Unix classes.  But you have to admit it’s an impressively useful thing to know for situations such as this when you have a command like script that insists only on writing to a local “file”.
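Since FIFOs are the unfamiliar piece, here is a minimal local sketch of the same plumbing with no network involved. The function name fifo_demo, the temporary directory, and the two sample lines are illustrative assumptions; printf simply stands in for “script -f” as the writer, and a plain file stands in for the remote collector.

```shell
# Sketch: one process writes into a FIFO while another drains it.
fifo_demo() {
    workdir=$(mktemp -d) || return 1
    mkfifo "$workdir/fifo"

    # Reader: blocks until a writer opens the FIFO, then copies until EOF.
    cat "$workdir/fifo" > "$workdir/collected" &
    reader=$!

    # Writer: stands in for "script -f"; closing the FIFO gives the reader EOF.
    printf 'uname -a\nps -ef\n' > "$workdir/fifo"

    wait "$reader"
    cat "$workdir/collected"
    rm -rf "$workdir"
}
```

Calling fifo_demo prints the two lines back, showing that everything written into the FIFO came out the other side without the FIFO itself ever holding data on disk.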

Obviously, on principles of strict forensic soundness it must be admitted that creating the FIFO does alter the file system slightly.  The FIFO itself requires an inode, though no data blocks will be consumed.  Also the contents of the directory the FIFO is placed in will be modified.  However, I will point out that many Unix systems are configured by default to use memory-based file systems for directories such as /tmp or /var/run.  If this is the case on the system you’re investigating, I would recommend creating your FIFO in one of these memory-based file systems to minimize any impact to the file system on disk.
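As a rough sketch of that advice, the helper below looks for a writable directory that lives on a memory-based file system before you create the FIFO. It assumes GNU stat (the “-f -c %T” options) and a Linux system where such file systems report the type “tmpfs”; the function names and the candidate list, including /dev/shm, are my own illustrative choices and will differ on other Unix variants.

```shell
# Print the file system type that holds the given path (GNU stat assumed).
fs_type() {
    stat -f -c %T "$1"
}

# Print the first writable candidate directory that sits on tmpfs.
# Returns non-zero if none of the candidates qualifies.
pick_fifo_dir() {
    for dir in /tmp /var/run /run /dev/shm; do
        if [ -d "$dir" ] && [ -w "$dir" ] && [ "$(fs_type "$dir")" = tmpfs ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}
```

For example, “dir=$(pick_fifo_dir) && mkfifo "$dir/fifo"” creates the FIFO only when a memory-based location was found.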

Hope you find this little trick useful in future investigations!

Hal Pomeranz, Deer Run Associates, is an independent IT/Computer Security consultant and a SANS Faculty Fellow.  He spends far too much time hanging out in the sketchy parts of the Unix operating system.

Directory Link Counts and Hidden Directories

Filed under Computer Forensics, Evidence Analysis, Incident Response, Linux IR

by Hal Pomeranz, Deer Run Associates

One of the things I love about teaching at SANS is that the students are smart people and come up with great ideas.  Sometimes these ideas even lead to useful tools, as was the case a few years ago when we were talking about hidden directories in the Digital Forensics section of Sec506.

First, a little background information.  Unix file systems keep track of a “link count” to all objects in the file system.  This “link count” value is the number of different directory entries that all point to the inode associated with the object.  In the case of a regular file, the link count is the number of hard links to that file.

However, Unix file systems don’t let you create hard links to directories, yet the link count on a directory is always at least two, and even increases by one for each sub-directory in that directory.  Why is this so?

  • Any object in the file system must have a directory entry that connects it into the file system.  For example, if you have a directory like “/tmp”, there’s a pointer in the root directory (“/”) that points to the “tmp” directory entry.  So that gives you one link.
  • Every directory contains the “.” link that points back to itself.  So that gives us the minimum value of 2 links per directory.
  • Every subdirectory has a “..” link that points back to its parent, incrementing the link count on the parent directory by one for each subdirectory created.

For one thing, the above behavior is why it’s important to monitor the link count on critical directories in your file system using a file integrity assessment tool like Tripwire, Samhain, or AIDE.  You can detect people adding or deleting directories when the link counts change.

But consider the following output from a compromised system:

# ls -a /foo
.  ..
# ls -dl /foo
drwxr-xr-x    3 root     root         4096 Jun 12 18:46 /foo

Our “/foo” directory is empty except for the normal “.” and “..” links, meaning we’d expect the link count to be 2.  Yet we see from the “ls -l” output that the link count on this directory is listed as 3 (look for the link count in the second column after the permissions flags and before the file ownership).  What’s going on here?

What’s happening is that I’ve used a kernel-level rootkit to hide a subdirectory of “/foo”.  However the rootkit was not able to decrement the link count of the parent directory without causing a file system discrepancy that would show up the next time you fsck-ed the file system.  In fact, I’ve never encountered a kernel-level rootkit that has attempted to mask the parent directory link count in any fashion when it hides a directory.

As my students pointed out, this suggests an obvious heuristic for detecting the presence of hidden directories on a system.  Simply write a tool that traverses the entire file system searching for directories where the number of subdirectories in a given directory is not equal to the “link count minus two”.  While this technique will only tell you that a hidden directory exists and not necessarily give you the name of the hidden directory, it will pinpoint exactly where to start looking once you get a chance to analyze the file system image on a system that doesn’t have the kernel-level rootkit loaded.
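The heuristic can be sketched in a few lines of shell. To be clear, this is not the chkdirs source, just an illustration of the idea; the function name check_dir is hypothetical. It also assumes that a link count of 1 means the file system does not track directory links in the traditional way (btrfs behaves like this), in which case the directory is skipped rather than reported.

```shell
# Report a directory whose link count differs from (visible subdirectories + 2).
check_dir() {
    dir=$1
    # The link count is the second field of "ls -ld" output.
    links=$(ls -ld "$dir" | awk '{print $2}')

    # Assumption: a count of 1 means the file system does not maintain
    # traditional directory link counts, so the heuristic does not apply.
    [ "$links" -gt 1 ] || return 0

    # Count real subdirectories, including dot-directories, ignoring symlinks.
    subdirs=0
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        if [ -d "$entry" ] && [ ! -L "$entry" ]; then
            subdirs=$((subdirs + 1))
        fi
    done

    if [ "$links" -ne $((subdirs + 2)) ]; then
        printf '%s: link count %s, visible subdirectories %s\n' \
            "$dir" "$links" "$subdirs"
    fi
}
```

Feeding it every directory on a file system, for example with “find / -xdev -type d | while read -r d; do check_dir "$d"; done”, prints only the directories worth a closer look. A healthy directory produces no output.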

In any event, the tool was extremely simple to write.  It’s called “chkdirs” and it’s now part of the chkrootkit distribution.  And it’s all thanks to some smart and interested SANS students.

Hal Pomeranz is an independent IT/Computer Security Consultant and a SANS Faculty Fellow.  He believes that when the student is ready the master will appear… even if that master is one of your students!

Google Privacy tip of the day

Filed under Computer Forensics, Evidence Analysis, Incident Response, Linux IR, Reverse Engineering, Windows IR

by Jeff Bryner

If I keep writing on Google and forensics, they’ll probably re-arrange my searches someday to all return kittenwar. However, just for you I’ll sacrifice my sanity to pass on a helpful tidbit about Google Toolbar.

Whether you’re looking to determine information about what’s in the toolbar, or looking to protect your privacy you may be interested to know that on startup the toolbar retrieves the favicon.ico file of all sites in your bookmark list.

I don’t normally use the toolbar, but while deciphering some web traffic I had a hunch to check, so I tested it on Windows XP with IE. I bookmarked two sites, rebooted, and restarted IE with a blank home page. The network traffic on starting IE shows hits to the bookmarked sites just after a hit to Google:

www.google.com/notebook/token?zx=3SZEx
www.google.com/notebook/toolbar/?cmd=list&tok= \
aRBAwIlBy_NYqYTlGJBr6wjUPYs%3A1233632528845&num=12000&zx=LbDcf&min=1233614147935&all=1
whitehouse.gov/favicon.ico
www.whitehouse.gov/favicon.ico
microsoft.com/favicon.ico
www.microsoft.com/favicon.ico

I had bookmarked microsoft.com and whitehouse.gov without the www; the toolbar apparently follows redirects, as does wget:

wget http://microsoft.com/favicon.ico
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.microsoft.com/favicon.ico [following]
HTTP request sent, awaiting response... 200 OK
Length: 3638 (3.6K) [image/x-icon]
Saving to: `favicon.ico'

The traffic won’t show in the index.dat file of IE, because IE didn’t do it, the user didn’t do it. The toolbar did it as a favor to you so when you pull down the bookmark list your icons will be there so you recognize the site.

Now on the surface that may not seem like such a big deal, but it all depends on the sensitivity of what you’ve got bookmarked. Nuff said?

Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS and performs forensics, intrusion analysis, and security architecture work on a daily basis. He just re-upped on GCFA and is now cramming, er, procrastinating on his studies for GCIH re-cert.

Recovering Open But Unlinked File Data

Filed under Computer Forensics, Evidence Acquisition, Incident Response, Linux IR

By Hal Pomeranz, Deer Run Associates

If you’ve ever been a Unix system administrator, you may have encountered “open but unlinked” files in the course of your normal duties.  The typical scenario is a user who’s launched a process that creates an unexpectedly large output file which consumes all of the free space in the partition.  In a panic, the user deletes the output file but leaves the process running.  Unfortunately, the operating system is not allowed to reclaim the space until the last process that has the output file open actually exits.  So until the user kills their process, the space is still in use and the file system is full.  But when you as the system administrator log in to free some space in the partition, you’re unable to find the massive file that’s consuming all of the space with your normal file system tools because the file has been unlinked (deleted) from the file system.  Finding the process that’s holding the file open and killing it would free the space, but that requires some specialized knowledge and trickery which we’ll see a little later.

In an incident, attackers have been known to use open but unlinked files to hide their data.  For example, suppose the attacker were running a packet sniffer that was capturing usernames and passwords off your network and storing them in a file.  Perhaps they have another process that’s reading the data as it’s placed in the file and using some covert channel to move it off system.  At this point the attacker could delete the data file: the packet sniffer would continue writing data to the file and their reader process could continue reading the data because they opened the file before it was removed from the file system, but the system administrator would have trouble locating the file because it’s now unlinked from the file system.  In fact, the attacker can even delete the executables for the packet sniffer and the reader process from the file system and the current processes will continue to run.

This kind of open but unlinked file data can be difficult to recover from a “cold” system image, because the minute the system is shut down and the attacker’s processes are terminated, the data in these files just becomes part of the free block collection and must be recovered like any other deleted file data.  However, if you have the luxury of analyzing the running system, it is extremely easy to spot and recover this kind of file data.

Creating Our Test Case

As a stand-in for our attacker’s hypothetical packet sniffer, I’m going to start a tcpdump process and have it dump its packet captures into a file.  Since I plan on removing the tcpdump binary as part of my demonstration, I’m first going to make a copy of the binary in /tmp:

# cp /usr/sbin/tcpdump /tmp/tcpdump
# /tmp/tcpdump -w /tmp/capture &
[1] 12437
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
# ls -l /tmp/tcpdump /tmp/capture
-rw-r--r-- 1 root root   4096 2009-01-21 18:08 /tmp/capture
-rwxr-xr-x 1 root root 639416 2009-01-21 18:07 /tmp/tcpdump

So far, so good.  Our tcpdump process is running and has captured a little bit of data.  Now I’m going to remove the binary and the output file and verify that the process is still running:

# rm /tmp/tcpdump /tmp/capture
# ls -l /tmp/tcpdump /tmp/capture
ls: cannot access /tmp/tcpdump: No such file or directory
ls: cannot access /tmp/capture: No such file or directory
# ps -ef | grep tcpdump
root     12437 12289  0 18:08 pts/1    00:00:00 /tmp/tcpdump -w /tmp/capture

Great!  The process is still active, but neither the binary nor its output file are visible to standard Unix tools like ls.  Now let’s have some fun.

The Power of lsof

It turns out that the highly indispensable lsof utility has an option for detecting exactly these open but unlinked files:

# lsof +L1
COMMAND     PID USER   FD   TYPE DEVICE     SIZE NLINK   NODE NAME
init          1 root    0u   CHR    5,1              0    623 /dev/console (deleted)
init          1 root    1u   CHR    5,1              0    623 /dev/console (deleted)
init          1 root    2u   CHR    5,1              0    623 /dev/console (deleted)
wineserve 10510  hal   31u   REG  254,1 16777216     0  42233 /tmp/.wine-1000/server-fe05-4ee00e/anonmap.WKvI4J (deleted)
firefox   10826  hal   60u   REG  254,4     8200     0 139466 /var/tmp/etilqs_wkK3U9h1sAcQpD3 (deleted)
tcpdump   12437 root  txt    REG  254,1   639416     0  25463 /tmp/tcpdump (deleted)
tcpdump   12437 root    4w   REG  254,1    65536     0  25500 /tmp/capture (deleted)

The “+L1” option is translated as “show me all files with link count less than one”, in other words “show me all files with link count zero”, which is just another way of saying “files which have been unlinked from the file system”.  It’s almost anti-climactic how easy it is to spot these files with lsof.

However, the above output also demonstrates another important aspect of this discussion: not all open but unlinked files are necessarily cause for concern.  There are a few processes that are common to the Unix/Linux platform that sometimes make use of open but unlinked files as part of their normal operations.  This is clearly a situation where one must “know thy systems”– be familiar with how the operating systems you’ll be investigating appear during normal operations– so that you can spot discrepancies.  That being said, programs that I’ve seen regularly using open but unlinked files include the Linux init process, Firefox, the Wine Windows emulator (all of which you can see in the output above), and VMware Server.

OK, at this point we’ve spotted the open but unlinked files, but how can we recover the data that’s in them?  The lsof output above gives us the inode numbers for the files if you look in the “NODE” column (25463 for our deleted tcpdump binary and 25500 for the output file).  This means we could use a tool like icat from the Sleuthkit to dump the contents of these files.  But it turns out that there’s another approach that doesn’t require any tools that aren’t already present in most Unix-like operating systems.

/proc to the Rescue!

The lsof output also gives us the process ID of the processes with open but unlinked files.  In this case, our tcpdump process is PID 12437.  Let’s head over to the /proc/12437 directory and see what we can see:

# cd /proc/12437
# ls
attr		 cpuset   io	    mountinfo	oom_score  smaps
auxv		 cwd	  latency   mounts	pagemap    stat
cgroup		 environ  limits    mountstats	root	   statm
clear_refs	 exe	  loginuid  net		sched	   status
cmdline		 fd	  maps	    numa_maps	schedstat  task
coredump_filter  fdinfo   mem	    oom_adj	sessionid  wchan

There’s a lot of interesting data in /proc, but for our purposes the “exe” object is the first interesting thing:  “exe” is a link to the binary image of the running process, and reading it returns that binary even after the file has been deleted.  So all we have to do to recover the deleted executable is make a copy of this “file” under /proc.  Obviously in a real incident you wouldn’t want to copy the file back into the file system of the compromised machine because you’d potentially be overwriting important evidence.  You’d either want to copy it to a portable drive you’ve connected to the system, or move the executable off the machine via the network with a command line like:

# cat /proc/12437/exe | nc 192.168.1.1 9999

This assumes you’ve got a machine acting as a “collector” at 192.168.1.1 with another netcat process listening on port 9999/tcp and writing the incoming data into a file.
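When the binary travels over the network like this, it can help to record a hash on the compromised side so the copy written by the collector can be checked afterwards. The sketch below is one way to do that; the function name hash_proc_exe is hypothetical, and it assumes the sha256sum utility from GNU coreutils is available in your response kit.

```shell
# Print the SHA-256 hash of the binary image behind /proc/<pid>/exe.
# Nothing is written to the local file system; the data is only read.
hash_proc_exe() {
    sha256sum < "/proc/$1/exe" | awk '{print $1}'
}
```

Running “hash_proc_exe 12437” before the transfer and “sha256sum” on the file saved by the collector should yield the same value if the copy arrived intact.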

Just to prove to you that this works, however, let me just copy the binary into my local file system and demonstrate that the result is a working executable:

# cp /proc/12437/exe /tmp/testing
# /tmp/testing --version
testing version 3.9.8
libpcap version 0.9.8
Usage: testing [-aAdDeflLnNOpqRStuUvxX] [-c count] [ -C file_size ]
        [ -E algo:secret ] [ -F file ] [ -i interface ] [ -M secret ]
        [ -r file ] [ -s snaplen ] [ -T type ] [ -w file ]
        [ -W filecount ] [ -y datalinktype ] [ -Z user ]
        [ expression ]

Again, in a real incident you wouldn’t just blindly execute a suspicious program you’d recovered in this manner.  You’d only want to experiment with a copy of the executable in an isolated “sandbox” type system.

OK, so what about recovering the data file that the tcpdump process is writing?  If you look at the directory listing of /proc/12437, you’ll notice that there’s an object called /proc/12437/fd.  “fd” here stands for “file descriptor” and /proc/12437/fd is actually a directory that contains links to all files this process currently has open.  Let’s take a look at this directory in more detail:

# ls -l /proc/12437/fd
total 0
lrwx------ 1 root root 64 2009-01-21 18:14 0 -> /dev/pts/1
lrwx------ 1 root root 64 2009-01-21 18:14 1 -> /dev/pts/1
lrwx------ 1 root root 64 2009-01-21 18:12 2 -> /dev/pts/1
lrwx------ 1 root root 64 2009-01-21 18:14 3 -> socket:[4197672]
l-wx------ 1 root root 64 2009-01-21 18:14 4 -> /tmp/capture (deleted)

Notice that the link names are named for the internal file descriptor number used by the process, but the names of the files associated with these file descriptors can be clearly seen, along with the fact that the /tmp/capture file has been deleted.
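If lsof is missing from the system you are examining, the same “(deleted)” marker can be pulled straight out of the fd directory. This is a minimal sketch for Linux only; the function name list_deleted_fds is hypothetical, and it relies on the kernel appending “ (deleted)” to the link target exactly as seen in the listing above.

```shell
# List the file descriptors of a process that point at unlinked files.
list_deleted_fds() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Skip descriptors that vanish or cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

For the test case in this article, “list_deleted_fds 12437” would be expected to show descriptor 4 pointing at /tmp/capture. Looping over every numeric directory in /proc extends the check to all processes you have permission to inspect.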

The cool thing is that you can use these links as arguments to normal Unix file system commands.  For example, if you wanted to copy the data in the deleted file over the network to your capture workstation:

# cat < /proc/12437/fd/4 | nc 192.168.1.1 9999

There’s a little bit of subtlety to the command above.  Whether you write “cat /proc/12437/fd/4” or use input redirection (“<”) as shown, cat reads from the start of the file up to its current end and then exits, so what arrives at the collector is a snapshot of the file at a particular instant in time.  Because the tcpdump process still has its output file open, you can simply repeat the command later to pick up newer data.  If you want to keep receiving data as it is written, substitute a follow-mode reader such as “tail -c +1 -f /proc/12437/fd/4”, which keeps running until you interrupt it.  Being able to follow the live file this way is one of the things that makes this approach superior to just using a tool like icat to dump the current state of the file.

Some Final Thoughts

It’s fairly uncommon to find attackers making use of open but unlinked files in this manner.  But if you’re accessing the compromised system anyway– for example to capture a memory dump before you shut the machine down– it doesn’t hurt to run a quick lsof command to check for any instances of these files.  In the cases where they do exist, the techniques described above can allow you to quickly recover some information that is typically critical to analyzing what the attacker is doing to your system.

Also, don’t forget about the /proc/<pid>/cwd link, which is a link to the current working directory of the process.  If the attacker was foolish enough to start the suspicious process from their rootkit installation directory, you may be able to zero-in on the compromise in record time!
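A quick way to look at these links together is sketched below. The function name proc_links is hypothetical; “exe”, “cwd”, and “root” are all entries visible in the /proc/12437 listing earlier, and readlink simply prints where each one points.

```shell
# Print where the exe, cwd, and root links of a process point.
proc_links() {
    pid=$1
    for name in exe cwd root; do
        printf '%s -> %s\n' "$name" "$(readlink "/proc/$pid/$name")"
    done
}
```

Run as “proc_links 12437”, this shows the (possibly deleted) executable path and the directory the process was started from in one short report.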

Hal Pomeranz is an independent Computer Security/IT consultant and SANS Institute Faculty Fellow.  He wants to name his first child “Elesso’ef Slashproc”, which probably means it’s a good thing he’s not planning on having kids.

A Quick Idiom for Pretty-Printing /proc Data

Filed under Incident Response, Linux IR

This is just a short note about a useful little idiom that a lot of people I run into seem to have never seen before.  You’re all aware that the /proc file system contains a great deal of information about processes that’s useful in an incident response situation.  However, when you start looking at this data it can sometimes be difficult to read:

$ cd /proc/self
$ cat environ
GNOME_KEYRING_SOCKET=/tmp/keyring-r8yNJT/socketLOGNAME=halGDMSESSION=default...

Yuck!  All of the environment variables are jammed together in an unreadable mess.

The reason the output appears this way is that the various strings in the /proc structures use nulls (ASCII zero) instead of newlines as terminal characters (just like strings in C).  You don’t usually see the nulls because they’re non-printable characters.

But with a little help from the “tr” command you can convert the nulls to newlines and make everything much more readable:

$ cat environ | tr \\000 \\n
GNOME_KEYRING_SOCKET=/tmp/keyring-r8yNJT/socket
LOGNAME=hal
GDMSESSION=default
[...]

Notice the use of double backslashes in the command above — the extra backwhack makes sure that the arguments to “tr” end up being \000 and \n after being interpolated by the shell (or you could use single quotes).
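The single-quote form mentioned above is easy to wrap in a tiny helper, shown here as a sketch. The function name nul_to_lines is my own; the same conversion also works on /proc/<pid>/cmdline, whose arguments are separated by nulls in the same way.

```shell
# Convert null-terminated strings on standard input to one string per line.
nul_to_lines() {
    tr '\000' '\n'
}

# Typical uses (12437 is just an example PID):
#   nul_to_lines < /proc/self/environ
#   nul_to_lines < /proc/12437/cmdline
```

Because the arguments to tr are inside single quotes, the shell passes the backslashes through untouched and no doubling is needed.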

I hope you find this little trick useful.  I find myself using it constantly.

Hal Pomeranz is an independent IT/Computer Security consultant and a SANS Faculty Fellow.  He spends far too much of his life herding Unix/Linux systems.

Rapier: A Different Data Carver

Filed under Computer Forensics, Evidence Analysis, Linux IR, Windows IR

By Keven Murphy

Rapier is a data carver written for Linux. It is a bit different than the other ones out there. First of all, the data carver treats the input file as a stream of data. For example, if the header/word is broken up between cluster/sector boundaries, Rapier doesn’t see the data divided up between the clusters/sectors. Instead, it ignores these boundaries. Secondly, headers and footers (footers are not 100% implemented yet) can be up to 100 bytes/characters long. Third, there are a few built-in search patterns. Those are index.dat and registry files. Like most data carvers, it doesn’t review the data it carves out to see if it is good data. That part is left to the forensics examiner.

Every byte on the drive is reviewed by Rapier. I realize that this can make it run long as compared to other data carvers. But I have good reasons for that. About a year ago I was using Foremost trying to recover some WMV files for a case I was working. Foremost was able to recover two files. I ran PhotoRec on the drive while reviewing the Foremost output. PhotoRec found a couple more than Foremost. After that, it got me wondering if either tool had found all of the WMV files. Then I used Sigfind, which identified a total of eight files. I ended up carving them out by hand.

Based upon that experience, I was thinking that the Boyer-Moore string search (see Foremost’s engine.c file) that Foremost used was missing things. In the end, I decided it might be fun to write my own. Let me say that it was an interesting challenge and continues to be.

Author’s Note: I mean no disrespect to the people who wrote and maintain Foremost and PhotoRec. I use both pieces of software every day. Both of them are great tools. Neither am I saying that Rapier is the best thing since sliced bread. It is just that, sometimes, you need to use a different tool.

How To Run Rapier

Just running Rapier without any command line arguments shows the following:

Rapier 0.2 Alpha -c {config_filename} -f {filename} -l log {filename} [-b bsize] [-t] [-v]
Note: [] optional command arguments.
    -c   Config file
    -f   Input filename
    -l   Log filename
    -b   Block size -- Default is 512
    -t   Put starting date & time and ending date & time in the logs
    -v   Verbose

Running Rapier is pretty easy. The steps below outline the general process:

  1. Rapier requires an image to review. You can either give it a dd image, device (/dev/hda1 for example), or a blkls file that consists of unallocated space or slack space.
  2. Adjust the rapier.conf file to what you are looking for.
  3. Run Rapier:
    # ./rapier -v -t -c rapier.config  -f image_file.unallocated.dd -b 4096 -l rapier.log

File Recovery

Rapier automatically recovers the files after it has completed its scan. In the directory you ran Rapier from, you will find the files Rapier recovered in the following filename format: {extension}-{starting offset}-{ending offset}.{extension}. For example, an index.dat file might have the following name: dat-26215424-26248192.dat. In a future release, I plan on creating directories based upon the extension and putting those files in the respective directory.

Rapier Files and Log Output

Rapier creates three additional files besides the ones it recovers. The first is the log file given by the -l option. The second is a CSV file, {log file}.csv, that contains a table of the files it finds with the starting and ending offsets. The third file is a binary file that has the name of {log file}.found.dat. This file contains binary data of the files Rapier has found with the offset information. I thought it might be handy to be able to recover the files without re-scanning the entire drive. This option is something I am working on for the next version.

An Example Log File

Rapier Log File
Version: 0.2 Alpha
Copyright by Citadel Systems, L.L.C.
Website: http://www.citadelsystems.net
License: Citadel Systems, L.L.C. grants Licensee a non-exclusive and non-transferable
license to use the Product. Licensee may not use the product for commercial purposes
beyond an initial thirty (30) day evaluation period without the purchase of a commercial
license from Citadel Systems, L.L.C. Commercial purposes include any activity engaged in
for the purpose of directly generating revenue or in support of activity that generates
revenue. This license does not entitle Licensee to receive from Citadel Systems, L.L.C.
hard-copy documentation, technical support, telephone assistance, or enhancements or
updates to the Product.

Rapier started at: Sun Jan 11 15:58:08 2009

Searching for the following
---------------------------
Total Number of headers: 2
Ext.: dat
Header Length (HEX) 10 in bytes:
Header (HEX): 43 6C 69 65 6E 74 20 55 72 6C
Length to get: 10240000

Ext.: txt
Header Length (HEX) 7 in bytes:
Header (HEX): 46 49 4E 44 5F 4D 45
Length to get: 1024

=============================================================================================================================

-----------------------------------------------------------------------------------------------------------------------------
Found txt header at Block/Sector:  79    Offset: 4092  Total Offset: 327676      Additional Bytes Before Header: 10
-----------------------------------------------------------------------------------------------------------------------------
Found dat header at Block/Sector:  6400  Offset: 1024  Total Offset: 26215424    Additional Bytes Before Header: 0
-----------------------------------------------------------------------------------------------------------------------------
Found dat header at Block/Sector:  6528  Offset: 1024  Total Offset: 26739712    Additional Bytes Before Header: 0
-----------------------------------------------------------------------------------------------------------------------------
Found dat header at Block/Sector:  6656  Offset: 1024  Total Offset: 27264000    Additional Bytes Before Header: 0
-----------------------------------------------------------------------------------------------------------------------------

Note: The number in Block/Sector field above is based upon the block size (-b) option in the command line. To figure out the offset of the file based on block/sector and offset within that block/sector it would be: (6656 * 4096 block size) + 1024 = 27264000.

Opening Rapier data file: rapier.log.found.dat

Reading Rapier data file
========================

File -- Header: txt     Total Offset: 327676
Filename: txt-327676-328700.txt          Startpos: 327666        Endpos: 328700          Filelength: 1024

File -- Header: dat     Total Offset: 26215424
Filename: dat-26215424-26248192.dat      Startpos: 26215424      Endpos: 26248192        Filelength: 32768

File -- Header: dat     Total Offset: 26739712
Filename: dat-26739712-26772480.dat      Startpos: 26739712      Endpos: 26772480        Filelength: 32768

File -- Header: dat     Total Offset: 27264000
Filename: dat-27264000-27280384.dat      Startpos: 27264000      Endpos: 27280384        Filelength: 16384

Recovery of files done

Rapier finished at: Sun Jan 11 15:58:11 2009
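
The Block/Sector note in the log above comes down to one multiplication and one addition. As a quick sanity check, here is a minimal shell sketch that recomputes the Total Offset values from the example log. The helper name total_offset is made up for illustration and is not part of Rapier; the 4096-byte block size is the one used in the note.

```shell
# Sketch only: total_offset is a hypothetical helper, not a Rapier feature.
#   $1 = Block/Sector number from the log
#   $2 = block size in bytes (the value passed with -b)
#   $3 = Offset within that block
total_offset() {
    echo $(( $1 * $2 + $3 ))
}

# The four hits from the example log, in the order they were found:
total_offset 79   4096 4092   # txt header, prints 327676
total_offset 6400 4096 1024   # dat header, prints 26215424
total_offset 6528 4096 1024   # dat header, prints 26739712
total_offset 6656 4096 1024   # dat header, prints 27264000
```

The results match the Total Offset column in the log, which is a handy cross-check if you ever re-run Rapier with a different block size.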

An Example CSV File

"Rapier CSV Log File"
"Version: 0.2 Alpha"
"Copyright by Citadel Systems, L.L.C."
"Website: http://www.citadelsystems.net"
"License: Citadel Systems, L.L.C. grants Licensee a non-exclusive and"
"non-transferable license to use the Product. Licensee may not use the"
"product for commercial purposes beyond an initial thirty (30) day "
"evaluation period without the purchase of a commercial license from "
"Citadel Systems, L.L.C. Commercial purposes include any activity engaged"
"in for the purpose of directly generating revenue or in support of "
"activity that generates revenue. This license does not entitle Licensee"
"to receive from Citadel Systems, L.L.C. hard-copy documentation, "
"technical support, telephone assistance, or enhancements or updates to "

"the Product."

"Rapier started at: Sun Jan 11 15:58:08 2009"

Filename,File Type (Extension),Starting Offset,Ending Offset,File Length

txt-327676-328700.txt,txt,327666,328700,1024
dat-26215424-26248192.dat,dat,26215424,26248192,32768
dat-26739712-26772480.dat,dat,26739712,26772480,32768
dat-27264000-27280384.dat,dat,27264000,27280384,16384

"Rapier finished at: Sun Jan 11 15:58:11 2009"

Other Suggested Uses

Word Searches

Version 0.2 does something new. It can be configured to search for a word and then pull out X number of bytes before and after the word. This feature came in handy when I was doing my forensics review of Toad SQL logs (see my blog on Oracle Forensics: Toad from Quest Software). I was able to find my keyword and pull out the 4096 bytes before it. I also set the total file size to 8192 bytes, which covers the 4096 bytes before my keyword, the keyword itself, and whatever is left to make a total of 8192 bytes. For example, if my keyword length is 10, Rapier would grab 4096 bytes before my keyword + 10 (keyword length) + 4086 bytes after my keyword = 8192 bytes total.
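
To make the "bytes before + keyword + remainder" idea concrete, here is a rough shell sketch of the same windowing done by hand on a single file. This is not how Rapier is implemented; carve_window is a hypothetical helper, and it assumes GNU grep (for the -o, -b, and -a options) and a head that supports -c.

```shell
# Sketch only, not Rapier's code. Prints TOTAL bytes from FILE, starting
# BEFORE bytes ahead of the first occurrence of KEYWORD.
#   carve_window FILE KEYWORD BEFORE TOTAL
carve_window() (
    file=$1 keyword=$2 before=$3 total=$4

    # grep -b reports the byte offset of each match as "offset:match";
    # -o limits output to the match, -a treats binary data as text,
    # -F takes the keyword as a fixed string rather than a regex.
    hit=$(LC_ALL=C grep -obaF -e "$keyword" "$file" | head -n 1 | cut -d: -f1)
    [ -n "$hit" ] || exit 1          # keyword not found

    # Clamp the window start at the beginning of the file.
    if [ "$hit" -gt "$before" ]; then
        start=$(( hit - before ))
    else
        start=0
    fi

    # tail -c +N counts from 1, so add one to the zero-based offset.
    tail -c "+$(( start + 1 ))" "$file" | head -c "$total"
)
```

With a before value of 4096 and a total of 8192, something like carve_window toad.log mykeyword 4096 8192 > hit.bin (file and keyword names invented here) would produce the same kind of window described above, for the first hit only.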

ascii_config_generator.pl For Generating Rapier Config Files From Keywords

Located under the tools directory in the Rapier download is an ASCII-to-hex converter. If you have a list of keywords, use it to create your rapier.config file. Below is the output of ascii_config_generator.pl --man.

ascii_config_generator.pl [options]

Options:
    --help            Brief help message
    --man             Full documentation
    --wordfile {file} File containing the wordlist. Note: 1 word per line.
    --output {file}   New Rapier config file
    --beforetext {#}  How many bytes to grab before keyword. Default: 1024. Optional.
    --totallength {#} Total length to grab. Default: 4096. Optional.
    --nocase          Do not create lines with upper and lower case. Optional.

Below is an example of it being run at the command line:

#  ./ascii_config_generator.pl --wordfile keywords.txt \
   --output new_rapier.config --beforetext 512 --totallength 4096

Sample Output From The Script

hello,\x68\x65\x6C\x6C\x6F,,512,4096
Dodge_Charger,\x44\x6F\x64\x67\x65\x5F\x43\x68\x61\x72\x67\x65\x72,,512,4096
Daytona,\x44\x61\x79\x74\x6F\x6E\x61,,512,4096
HELLO,\x48\x45\x4C\x4C\x4F,,512,4096
DODGE_CHARGER,\x44\x4F\x44\x47\x45\x5F\x43\x48\x41\x52\x47\x45\x52,,512,4096
DAYTONA,\x44\x41\x59\x54\x4F\x4E\x41,,512,4096
hello,\x68\x65\x6C\x6C\x6F,,512,4096
dodge_charger,\x64\x6F\x64\x67\x65\x5F\x63\x68\x61\x72\x67\x65\x72,,512,4096
daytona,\x64\x61\x79\x74\x6F\x6E\x61,,512,4096

The script generates three lines for every keyword. I had three keywords in my list, so the first set of three lines holds the keywords converted to hex exactly as they appear in the keyword file. The next set is converted to all upper case and the last set to all lower case. The only thing left to do is use the new config file with Rapier.
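
If you are curious what the conversion involves, here is a small shell sketch that produces lines in the same shape as the sample output. It is not the Perl script itself: to_hex and config_lines are hypothetical names, the empty third field is simply copied from the sample output without knowing its meaning, and the sketch prints the three variants keyword by keyword rather than set by set.

```shell
# Sketch only; ascii_config_generator.pl remains the real tool.

# Convert an ASCII string to \xNN notation, e.g. hello -> \x68\x65\x6C\x6C\x6F
to_hex() {
    printf '%s' "$1" | od -An -v -tx1 |
        awk '{ for (i = 1; i <= NF; i++) printf "\\x%s", toupper($i) }
             END { print "" }'
}

# Print the as-is, upper-case, and lower-case config lines for one keyword.
#   config_lines KEYWORD BEFORE TOTAL
config_lines() (
    word=$1 before=$2 total=$3
    upper=$(printf '%s' "$word" | tr '[:lower:]' '[:upper:]')
    lower=$(printf '%s' "$word" | tr '[:upper:]' '[:lower:]')
    for w in "$word" "$upper" "$lower"; do
        # Field layout mirrors the sample: keyword,hex,,before,total
        printf '%s,%s,,%s,%s\n' "$w" "$(to_hex "$w")" "$before" "$total"
    done
)

config_lines Daytona 512 4096
```

Running it for Daytona gives the same three Daytona lines that appear in the sample output above.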

Screenshots of Rapier

Screenshot: Rapier working (Rapier 0.1 Alpha)

Screenshot: Rapier finished (Rapier 0.1 Alpha)

Additional Headers

If you need headers for files try: http://www.garykessler.net/library/file_sigs.html

Download

Download it at: http://www.citadelsystems.net/index.php/forensics-tools/34-data-carver/46-rapier

Keven Murphy, GCFA Gold #24, is an IT security manager contracted to a Fortune 100 defense contractor.

How To: Build a Response CD

Filed under Computer Forensics, Incident Response, Linux IR, Malware Analysis

In both our compliance auditing and incident response/forensics practice we make heavy use of customized CDs full of analysis tools. Let’s take a look at the process of building one step by step. For our example we’re going to use Linux, but this process works for any UNIX-based system you have.

Step One:

The first thing that you need to do is to create a directory that you’ll use to hold all of your security tools while building your CD. Generally it’s easier if you build a directory structure that’s familiar, so we’ll recommend that you first create a top level folder that will essentially be the root of the security tools CD that you create. For our purposes we’ll call this directory “response”.

Inside of the “response” folder you will want to create an organized directory structure. In reality you can just dump everything into the root of the security CD but it’s much better to keep things organized. Depending on the task I may have any number of folders here; for instance, I may include a “scripts” folder, a “forensic_tools” folder, etc. At a minimum, however, I will always have a “bin” and “lib” directory for housing both the security tools and the dynamic libraries that they require.

Create the directories for your security CD:

 mkdir response
 cd response
 mkdir bin
 mkdir lib

Step Two:

This is probably the hardest step in our process. What you need to do is determine exactly which security or analysis tools you want to include on your CD. Personally, it makes a lot of sense to me to create one complete security CD with everything that I think I’ll ever need. Here are some basic suggestions just to get us going:

bash, csh, etc. – If there’s a shell you prefer to use, make sure you’ve got a copy of it on your CD. If we’re not going to trust the tool binaries on the system, why would we trust its shell?

ls, cd, cat, more, etc. – If we’re doing incident response or forensics we clearly can’t trust anything on the system that we’re analyzing. This means that even the basics must be brought along with us on our CD.

lsof – If I had to pick just one tool to bring along for incident response or security analysis of a system it would be lsof. Since UNIX systems view pretty much everything as a file, the ability to view open files becomes incredibly important. This tool is worth an entire article all by itself!

netstat, ifconfig, ifstatus, tcpdump, etc. – Basic networking tools are a must. If you think that you might be dealing with a kernel level rootkit these tools will still be fooled, but for a userspace rootkit these tools are invaluable for figuring out what’s happening at the network level.

netcat, dd, icat, ils, fls, etc. – With this list we start getting into really specialized functionality. Netcat, often termed the “Swiss Army Knife” of network tools, likely needs no introduction. If I need to get data on or off of a box quickly this is almost always my tool of choice. Added to that we include a variety of tools from The Sleuth Kit. More often than not when doing incident response I’ve found it very useful to be able to examine the contents of a particular inode that has obviously been deleted. Certainly you could image the drive and extract it offline but sometimes there’s no substitute for a fast answer!

Now the list above is by no means intended to be a complete list. As mentioned already, we always include our own tools and scripts. What do you do with them now, though? Take your list of tools and copy each binary into the “bin” directory in your tools directory:

Copy everything you want in your toolbox into bin:

 cp /bin/ls bin
 cp /bin/netstat bin
 ...
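
If your tool list is long, copying binaries one at a time gets tedious. Here is a minimal sketch of a loop that does it for you. The helper name collect_tools is made up for this example, and it assumes you are running on a known-good build machine, since it copies whatever that machine's PATH resolves each name to.

```shell
# Sketch only: collect_tools is a hypothetical helper.
#   collect_tools DEST_DIR tool [tool ...]
collect_tools() (
    dest=$1
    shift
    mkdir -p "$dest" || exit 1

    for tool in "$@"; do
        # command -v prints a full path for external programs and just the
        # bare name for shell builtins such as cd.
        path=$(command -v "$tool" 2>/dev/null)
        case $path in
            /*) cp "$path" "$dest/" ;;
            *)  echo "skipping $tool: no binary found on PATH" >&2 ;;
        esac
    done
)

# Example, run from inside the "response" directory:
#   collect_tools bin ls cat more lsof netstat ifconfig tcpdump dd
```

Note that builtins like cd are skipped: they live inside the shell, which is one more reason to put a trusted copy of your shell on the CD.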

Step Three:

So far we’re in good shape. Our next step isn’t hard but it’s the one that usually prevents people from building their own UNIX response or security CD. UNIX binaries are, by default, compiled to use dynamic libraries. This not only allows for good reuse of code and smaller disk sizes, it also leads to smaller in-memory sizes, since each shared library is loaded only once rather than being duplicated throughout the memory space of the system. This means that we either need to recompile every tool that we want to put on our CD as a static binary (unpleasant at best) or somehow bring along the dynamic libraries.

As it turns out, there’s a really easy to use tool that can tell you which libraries any UNIX binary needs: ldd. The ldd tool (List Dynamic Dependencies) will tell you which version of each library is necessary and where on the system that library currently resides. What we need to do is extract this dynamic library information for all of the security tools that we have in our “bin” folder and copy those libraries into the “lib” folder. Here’s a little UNIX magic to make this happen: (To use this, your working directory should be “response”)

Resolve and copy library dependencies

 ldd bin/* | awk '$2 == "=>" && $3 ~ /^\// {print $3} $1 ~ /^\// {print $1}' | sort -u | xargs -I{} cp {} lib/

With that command run you should now have every required library in your “lib” folder for every command in your “bin” folder!
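
It is worth double-checking that claim before you burn anything. Here is a small sketch that reads ldd output and reports any library whose file name is not present in your “lib” folder. The function names lib_paths and missing_libs are invented for this example, and the parsing assumes the usual Linux ldd output format ("name => /path (address)" or "/path (address)").

```shell
# Sketch only: hypothetical helpers for checking the "lib" folder.

# Read ldd output on stdin and print each resolved library path once.
lib_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }
         $1 ~ /^\// && $1 !~ /:$/ { print $1 }' | LC_ALL=C sort -u
}

# Read ldd output on stdin and print the paths of libraries that have no
# file of the same name in LIBDIR.
#   ldd bin/* | missing_libs lib
missing_libs() (
    libdir=$1
    lib_paths | while IFS= read -r path; do
        [ -e "$libdir/$(basename "$path")" ] || echo "$path"
    done
)
```

If ldd bin/* | missing_libs lib prints nothing, every library that ldd could resolve has a copy in “lib”.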

Step Four: (Optional)

Our next task is optional. At this point you really could just copy this whole bundle into an archive, copy it to your Windows host and unpack it onto a CD that you burn from there. When dealing with UNIX tools, though, I find that it’s often better to actually make the ISO image on the UNIX system too so that you have control over what the file attributes will be, especially the execute bit for your security tools.

This might sound like a hard task but there’s actually a great tool for it, available on Linux systems in particular and ported to all major UNIX systems: mkisofs. You point “mkisofs” at a directory structure and it converts that into an ISO image that can be used to burn a CD! To do this, if you’ve been following along, just do the following:

Use “mkisofs” to turn “response” into an ISO (the -R option adds Rock Ridge extensions so that UNIX permissions and long file names survive):

 cd ..
 mkisofs -R -o ResponseCD.iso response

Your ISO is actually a disk image, so you can now take this to any platform and use your favorite CD burning tool to burn this image out as a CD. How do you use the CD now that you have it?

To use the CD, you take it to the system that you want to examine and mount it. Once it’s mounted you need to set your PATH variable to point to wherever your “response/bin” directory is on the mounted CD and the LD_LIBRARY_PATH to wherever the “response/lib” directory is. Now execute a clean shell (bash, csh, etc.) from the CD and you’re ready to go!!!
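
Here is roughly what that looks like as a shell function, offered as a sketch rather than a finished script. The name use_response_cd and the /mnt/cdrom mount point are assumptions for illustration. The function replaces PATH outright rather than prepending to it, on the theory that you do not want to fall through to the suspect system's binaries by accident.

```shell
# Sketch only: use_response_cd is a hypothetical helper.
#   $1 = directory on the mounted CD that contains bin/ and lib/
use_response_cd() {
    if [ ! -d "$1/bin" ] || [ ! -d "$1/lib" ]; then
        echo "expected bin/ and lib/ under $1" >&2
        return 1
    fi
    # Replace (not extend) both search paths so only CD copies are used.
    PATH="$1/bin"
    LD_LIBRARY_PATH="$1/lib"
    export PATH LD_LIBRARY_PATH
}

# Example (mount point is an assumption; adjust for your system):
#   use_response_cd /mnt/cdrom && exec /mnt/cdrom/bin/bash
```

Because the function changes the environment of the shell that calls it, it has to be defined or sourced in that shell rather than run as a separate script.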

This is just one of the many topics discussed and taught hands on in David Hoelzer’s class, “Advanced System & Network Auditing”, available through The SANS Institute.  David is a Senior Fellow with The SANS Institute and the principal examiner for Enclave Forensics.  You can find a variety of topics on his blog.

Bring Me My Pipe

Filed under Computer Forensics, Evidence Acquisition, Evidence Analysis, Incident Response, Linux IR, Malware Analysis, Memory Analysis, eDiscovery

Pipes photo courtesy of tanakawho at flickr.com

Often used and underappreciated, the pipe feature in UNIX/Linux/DOS has to be my favorite tool in incident response and forensics.

Need the device at /dev/sda imaged with progress indicators and an md5sum?

dd if=/dev/sda| pipebench | tee sda.dd | md5sum >sda.md5.txt
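
The trick in that pipeline is tee: it writes the stream to the image file and passes the same bytes on to md5sum, so the source is read only once. Below is a sketch of the same idea wrapped in two small functions, plus a re-check of the finished image. The function names are made up, pipebench is left out because it may not be installed, and md5sum from GNU coreutils is assumed.

```shell
# Sketch only: hypothetical helpers built around the dd | tee | md5sum idea.

# Copy SRC to DEST and record the MD5 of the stream in DEST.md5.txt.
#   image_with_md5 SRC DEST
image_with_md5() (
    src=$1 dest=$2
    dd if="$src" bs=65536 2>/dev/null |
        tee "$dest" |
        md5sum | awk '{ print $1 }' > "$dest.md5.txt"
)

# Succeed only if the image on disk still hashes to the recorded value.
#   verify_image DEST
verify_image() {
    [ "$(md5sum < "$1" | awk '{ print $1 }')" = "$(cat "$1.md5.txt")" ]
}

# Example (device name taken from the pipeline above):
#   image_with_md5 /dev/sda sda.dd && verify_image sda.dd
```

Re-hashing the image afterwards costs a second read of the image file, but it confirms that what landed on disk is what came off the device.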

Need a summary of the unique hosts from Internet Explorer’s index.dat history file?

pasco index.dat | grep -v 'javascript\:' | egrep -i 'ftp|http' | sort -k 4 | awk '{print $3}' | awk 'BEGIN {FS="/"}{print $2$3}'| sort | uniq | less

No other feature I can think of makes it easier to quickly analyze data. There’s even a wiki entry dedicated to pipe 101 exploration.

Here are the top 3 bash shell pipelines I use every day on my Gentoo Linux workstation:

1) Tailing log files in real-time:
tail -f tabDelimitedNastyBusyLogfile.txt | remark filterfilehere.txt | awk 'BEGIN {FS="\t"} { print $fieldnumberhere,$fieldnumberhere } system("")'
tail -f tabDelimitedNastyBusyLogfile.txt | egrep -vif whitelistfilehere.txt | awk 'BEGIN {FS="\t"} { print $fieldnumberhere, $fieldnumberhere } system("")'

Have a busy syslog file (ASA firewall?) to monitor that you couldn’t possibly read fast enough? Piping a file through remark or a grep white list can limit what you see while awk can pick out fields in the output to highlight. Remark can easily color code output which is handy for assigning colors to something like IP netblocks, keywords, etc.

Quick frustration avoidance tip: the system("") at the end of the awk script forces awk to flush its output buffer, so you see output right away instead of waiting for the buffer to fill.
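
For a version you can paste and try, here is a sketch with the field numbers passed in as arguments. It uses awk's fflush() in place of the system("") trick; that is a substitution on my part, and both are supported by the common awk implementations (gawk, mawk, BusyBox awk) as far as I know. The name pick_fields is made up.

```shell
# Sketch only: pick_fields is a hypothetical helper.
# Reads tab-delimited lines on stdin and prints two chosen fields,
# flushing after every line so output shows up immediately under tail -f.
#   $1, $2 = field numbers, counted from 1
pick_fields() {
    awk -F '\t' -v a="$1" -v b="$2" '{ print $a, $b; fflush() }'
}

# Example:
#   tail -f tabDelimitedNastyBusyLogfile.txt | pick_fields 1 3
```

Without the flush, awk typically buffers its output when writing to a pipe or file, which is why a slow log can appear to produce nothing for a long time.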

Do you use the 172.21.0.0/16 private IP space? You can quickly get a visual of good IP ranges with a remark entry like:

/172\.21\.([0-9]{1,3}\.)[0-9]{1,3}/g {green }

which will color all your private IP space green, making it easy to pick out whether you are the source or destination at a glance.

Alternatively you can filter out/white list entries with a remark entry like

/Accessed URL/ skip

if you’re not interested in those entries, allowing you to home in on what you’re after.

2) while read i
This shell construct:
somecommand | while read i;do somecommandto $i;done

is one of the most useful I’ve ever found for getting something done quickly to a lot of data. For example, recovering deleted files from an ntfs dd image:

ils -rf ntfs imagefile.dd | awk -F '|' '($2=="f") {print $1}' | while read i; do icat -rsf ntfs imagefile.dd $i > ./deleted/$i; done
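
If you want to see what the loop will do before letting it write files, a dry run helps. The sketch below separates the two halves of that pipeline: one function filters ils-style output for entries whose second field is "f", and the other uses while read to print (not run) the icat command for each. The function names are invented, and the quoting and read -r are my additions to guard against odd characters.

```shell
# Sketch only: hypothetical helpers for a dry run of the recovery loop.

# Read ils output on stdin; print the first field of every row whose
# second field is "f" (the same test as the awk filter above).
deleted_inodes() {
    awk -F '|' '$2 == "f" { print $1 }'
}

# Read ils output on stdin and print the icat command for each hit.
#   recover_commands IMAGE OUTDIR
recover_commands() (
    image=$1 outdir=$2
    deleted_inodes | while IFS= read -r inode; do
        printf 'icat -rsf ntfs %s %s > %s/%s\n' \
            "$image" "$inode" "$outdir" "$inode"
    done
)

# Example:
#   ils -rf ntfs imagefile.dd | recover_commands imagefile.dd ./deleted
```

Once the printed commands look right, you can pipe the output to sh or go back to the one-liner.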

A quick way to sort unknown files (maybe those recovered using the command above) by type:

file * | grep -i jpeg | cut -f 1 -d ':' | while read i; do mv "$i" jpegs; done

Not strictly using a ‘while’ but still useful if you need to quickly resolve hostnames in an IP range:

for (( i=1; i<=254 ; i++ )) ; do resolveip 10.0.0.$i 2>/dev/null ; done | grep -vi 'Unable'

3) Quick totals
Sort, uniq and head piped together can get you a top 10 quicker than Dave Letterman:
cat filewithlotsofIPAddresses.txt | egrep -oE "([0-9]{1,3}\.){3}[0-9]{1,3}" | sort | uniq -c | sort -rn | head -n10

will get you the top 10 IP addresses in a file sorted by appearances, highest to lowest. Not as funny as Letterman, but hey it’s Linux!
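
Here is the same pipeline as a reusable function, as a sketch. The name top_ips is made up, the count defaults to 10, and the trailing awk only tidies the padding that uniq -c adds. The pattern is the loose one from the command above, so it will also match impossible addresses such as 999.999.999.999.

```shell
# Sketch only: top_ips is a hypothetical helper.
# Prints "count address" for the most frequent IPv4-looking strings.
#   top_ips FILE [N]      (N defaults to 10)
top_ips() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" |
        sort | uniq -c | sort -rn |
        head -n "${2:-10}" |
        awk '{ print $1, $2 }'
}

# Example:
#   top_ips filewithlotsofIPAddresses.txt 10
```

The first sort groups identical addresses so uniq -c can count them, and the second sort ranks those counts from highest to lowest.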

No doubt if you’ve been around the block more than a few times you’ve got your own pipelines you can’t live without. If you’re open to it, share them so folks can pick up something new or add to their favorites.


Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS and performs forensics, intrusion analysis, and security architecture work on a daily basis.