Category Archives: Email Investigations

Facebook Memory Forensics

Filed under Browser Forensics, Computer Forensics, Email Investigations, Evidence Analysis, Memory Analysis, Reverse Engineering

OK, like everyone I joined facebook just to get updates on my high school reunion. (Who knew you could also use it as a possible alibi.)

But then, after writing pdgmail and pdymail and seeing all the neat personal information in facebook…tada pdfbook! Memory parsing to grab facebook info.

Like its predecessors pdgmail and pdymail, I’m following the simple premise that memory strings are easy to get to and yield a treasure of information given today’s web 2.0 world of javascript, dhtml, json, etc. Facebook, it turns out, doesn’t seem to cough up xml like yahoo or json like gmail, but rather unique class ID strings in its html.

What does this mean to forensics? Well, with a memory dump from any of the popular memory dumping tools, strings -el and pdfbook, you can get:

  • status updates
  • facebook emails
  • lists of friends
  • likely owners of the memory image

Friends come with their unique facebook IDs, like:

Story from friend: id:6815841748: Name:Barack Obama

Facebook emails are raw html with authors, dates, etc., like so:

FacebookEmailDetail author: Storm Large url: http://www.facebook.com/stormlarge
FacebookEmailDetail Date: October 29 at 9:41am
FacebookEmailDetail Body: Nov 19.2009 - 8:30PM
Molly Malones - Los Angeles, California
More info:

Facebook recent activity is like so:

RecentActivity:Jeff became a fan of Fishbone.

Status updates show up like so:

StoryMessage:Jeff Bryner 2 gamble @the airport or not, that is the question.

If you’re really lucky the memory image will contain enough html to produce what pdfbook recognizes as a ‘delete’ button which is only passed out to the owner of the html content. In other words, you are allowed to delete your posts on facebook, pdfbook recognizes this and your facebook userid, correlates it and deduces that the likely owner of the memory image is:


Likely Owner of fbook memory artifacts: FacebookUserID:1421688057 Name:Jeff Bryner

A sample usage:

On a Windows or Linux box, use pd from www.trapkit.de like so:
pd -p 2345 > 2345.dump

where 2345 is the process ID of a running instance of IE/firefox/browser of your choice. (Keep the space before the ">"; in a Unix shell "2345>" would be read as a redirect of file descriptor 2345.)

You can also use any memory imaging software like mdd, win32dd, etc. to grab the whole memory on the box rather than just one process. You can also use common memory repositories like pagefile.sys, hiberfil.sys, etc.

I’ll refer the reader to the memory imaging tool reference at the forensic wiki

Transfer the dumped memory to linux and do:

strings -el 2345.dump > memorystrings.txt
pdfbook -f memorystrings.txt

It’ll find what it can out of the memory image and spit out its findings to standard out. Grep your way to facebook happiness or redirect the output to a file for later viewing.
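If grep isn't enough, the output is easy to post-process. Here is a minimal Python sketch (not part of pdfbook) that collects friends and the likely owner. It assumes the output lines look exactly like the samples shown above; the function name and regexes are my own and will need adjusting if the output format differs.

```python
import re

# Patterns modeled on the sample output lines shown earlier in this post.
# Assumption: real pdfbook output follows these samples exactly.
FRIEND_RE = re.compile(r"^Story from friend: id:(\d+): Name:(.+)$")
OWNER_RE = re.compile(r"^Likely Owner of fbook memory artifacts: FacebookUserID:(\d+) Name:(.+)$")


def summarize(lines):
    """Return ({friend_id: name}, (owner_id, owner_name) or None)."""
    friends = {}
    owner = None
    for line in lines:
        line = line.strip()
        match = FRIEND_RE.match(line)
        if match:
            friends[match.group(1)] = match.group(2)
            continue
        match = OWNER_RE.match(line)
        if match:
            owner = (match.group(1), match.group(2))
    return friends, owner
```

Feed it the lines of a saved output file, e.g. summarize(open("pdfbook-output.txt")), where the file name is just an example.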

As this is mostly html parsing, it’s very brittle, meaning that a change in the classID of one of the facebook UI components breaks this program. Matter of fact, it has already broken once, after the UI rework of 10/2009. So it will work for a while until they redesign and I’m out of sync.  Maybe I’ll post it to sourceforge or github so you all can update as you see fit.

Along those lines, look for the diary of pdfbook creation with an explanation of its regex goodness at digitalforensicsmagazine.com, freshly created this month! Dissect and contribute your own regex hacks for finding stuff you recognize in your own facebook memory images.

Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS, performs forensics, intrusion analysis, and security architecture work on a daily basis and runs p0wnlabs.com just for fun.

Analysis of e-mail and appointment falsification on Microsoft Outlook/Exchange

Filed under Computer Forensics, Email Investigations, Evidence Analysis, eDiscovery

Author: Joachim Metz <forensics@hoffmannbv.nl>

Summary

In digital forensic analysis it is sometimes necessary to determine whether an e-mail has or has
not been falsified. This paper reviews certain parts of the Outlook Messaging Application
Programming Interface (MAPI) that can help in identifying falsified e-mails or altered
appointments in a Microsoft Outlook/Exchange environment.

About the libpff project

In 2008 Joachim Metz, a forensic investigator at Hoffmann Investigations, started the libpff project.
At that time the best publicly available source about the Personal Folder File (PFF) format was
the libpst project. The libpst project dated back to 2002 and had been contributed to and maintained
by David Smith, Joe Nahmias, Brad Hards and Carl Byington.

However, libpst at that time wasn’t a library and had no support for recovering deleted items
in PST and OST files. The initial goal of the libpff project was to create a shared library for PST
and OST files that had support for recovering deleted items. Recovering deleted items requires
detailed knowledge of the inner structures of the PFF format. This was the beginning of an
interesting journey, in which even recently additional information about the inner structures has
been discovered, like the 6c and 8c tables and the use of indirection in large tables.

In March 2009 PFF forensics was first discussed as part of Microsoft Office forensics in the
Hoffmann Advanced Forensic Sessions (HAFS). A paper titled ‘Personal Folder File (PFF)
forensics’ was published as part of the HAFS. This paper explains the basics of the PFF format,
which can be quite a challenge to understand. One of the main conclusions of both the paper
and the seminar was that different forensic tools provide different results when recovering deleted
items in PST and OST files.

In the meantime the libpff project has evolved. Thanks to continued analysis of the PFF format and
several contributions, new aspects of the file format have been discovered, among them the
PFF items that contain information about the recipients, sub folders, sub messages and sub
associated items.

Also, a lot of information about the MAPI has been made available. The OpenChange project
provides libmapi, which contains an Open Source implementation of the MAPI, and the
MFCMAPI project has provided a lot of MAPI information that is now available on MSDN.

Within Hoffmann Investigations libpff has been put to work for two purposes: first as a tool to
cross-reference findings from other forensic tools, and second as a tool that can provide more
information about PST and OST files than those forensic tools. PFF forensics will therefore once
more be the subject of discussion in the upcoming Hoffmann Advanced Forensic Sessions in
November 2009. In the meantime several of the interesting findings are provided in this
paper.

1. Introduction

Wouldn’t it be nice to have your forensic analysis software filter out falsified e-mails and
appointments for you? However, most of the current forensic tools provide little information about
the authenticity of e-mail messages and appointments. Therefore, certain analyses have to be done
manually. This paper will give you an understanding of parts of the Outlook Messaging Application
Programming Interface (MAPI) to help identify falsified e-mails in Microsoft Outlook/Exchange
environments.

1.1. Background

If you are a forensic investigator working in corporate environments you are probably dealing
with Microsoft Outlook and Exchange most of the time. What you might not know is that both
make heavy use of the MAPI. The MAPI is not only a programming interface but also a useful
source of information regarding the properties of e-mails. For those of you not familiar
with analyzing the Personal Folder File format used by Microsoft Outlook for PST and OST files,
I advise reading [METZ09] before reading this paper.

2. Falsified e-mail message

In a recent investigation we had to investigate if a user had sent an e-mail at a certain date and
time. We started by determining the existence of the e-mail in the mailbox of both the sender and
the recipients. But there were other characteristics that were highly interesting from a forensic
point of view.

A certain e-mail dated March 10, 2009 was forwarded on March 17, 2009. The original e-mail
could not be found in any of the mailboxes. The first indication of falsification was a discoloring
of the day of the month in a print-out of the forwarded e-mail. The 0 in ‘March 10’ was gray while
the surrounding text was clearly black.

2.1. The e-mail body

In Outlook/Exchange an e-mail message can contain RTF and/or HTML body text. Both the RTF and
HTML formats use formatting codes. Using these formatting codes we did a low-level analysis of
the body text. Most of the available forensic tools do not provide access to these formatting codes,
but luckily for us there is libpff and its tools.

After compiling libpff with verbose and debug output and having pffexport export the PST
file with the verbose option (-v), we had a detailed debug log file. In this log file we looked
up the e-mail and its RTF body. In the RTF body the following information was found:

{\*\htmltag84 <b>}\htmlrtf {\b \htmlrtf0 Sent:
{\*\htmltag92 </b>}\htmlrtf }\htmlrtf0 Tuesday March 1
{\*\htmltag84 <span style='color:#1F497D'>}\htmlrtf {\htmlrtf0 0
{\*\htmltag92 </span>}\htmlrtf }\htmlrtf0 , 2009 13:48
{\*\htmltag116 <br>}\htmlrtf \line
\htmlrtf0
{\*\htmltag4 \par }

Using other forwarded e-mails as a reference, we established that the bold formatting code should not be there.
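As a rough illustration of this kind of low-level check, the Python sketch below pulls the HTML fragments out of the \htmltag groups of an RTF body and lists any styled <span> among them. It is not part of libpff; the function names and the regular expression are assumptions made for this example, and the pattern only handles simple groups like the ones in the excerpt above (no nested braces or escaped braces).

```python
import re

# Matches groups such as {\*\htmltag84 <b>} and captures the tag number
# and the fragment. Simplification: fragments must not contain braces.
HTMLTAG_RE = re.compile(r"\{\\\*\\htmltag(\d+) ?([^{}]*)\}")

# The excerpt from the debug log shown above.
SAMPLE = r"""{\*\htmltag84 <b>}\htmlrtf {\b \htmlrtf0 Sent:
{\*\htmltag92 </b>}\htmlrtf }\htmlrtf0 Tuesday March 1
{\*\htmltag84 <span style='color:#1F497D'>}\htmlrtf {\htmlrtf0 0
{\*\htmltag92 </span>}\htmlrtf }\htmlrtf0 , 2009 13:48
{\*\htmltag116 <br>}\htmlrtf \line
\htmlrtf0
{\*\htmltag4 \par }"""


def html_fragments(rtf_text):
    """Return (tag number, fragment) pairs in document order."""
    return [(int(number), fragment)
            for number, fragment in HTMLTAG_RE.findall(rtf_text)]


def styled_spans(rtf_text):
    """Return the <span ...> fragments that carry an inline style."""
    return [fragment for _, fragment in html_fragments(rtf_text)
            if fragment.lower().startswith("<span") and "style=" in fragment.lower()]
```

A styled span that splits a date in two is only a lead. As in the investigation, it still has to be compared against how genuine forwarded e-mails from the same client look.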

2.2. Conversation index

Looking at existing e-mail messages we hypothesized that the original e-mail was not created on
March 10, 2009 but was in fact an e-mail created on March 17, 2009 that had been altered. We
wanted proof beyond the absence of the original e-mail message from the mailboxes of the sender
and the recipients.

An MSDN article titled 'Tracking conversations' provided us with a fairly reliable answer.
[MSDN] states that:

PR_CONVERSATION_INDEX (PidTagConversationIndex) indicates the position of the
message within a particular conversation. It is a client's responsibility to
set PR_CONVERSATION_INDEX for each outgoing message, whether it is a new
message, a forwarded message, or a reply. Clients can set this property
manually or call ScCreateConversationIndex, a utility function provided by
MAPI.
ScCreateConversationIndex generates the value of a conversation index for any
outgoing message. ScCreateConversationIndex implements the index as a header
block that is 22 bytes in length, followed by zero or more child blocks each 5
bytes in length.
The header block is composed of 22 bytes, divided into three parts:
 * One reserved byte. Its value is 1.
 * Five bytes for the current system time converted to the FILETIME structure
 format.
 * Sixteen bytes holding a GUID, or globally unique identifier.
Each child block is composed of 5 bytes, divided as follows:
 * One bit containing a code representing the difference between the current
 time and the time stored in the header block. This bit will be 0 if the
 difference is less than .02 second and greater than two years and 1 if the
 difference is less than one second and greater than 56 years.
 * Thirty one bits containing the difference between the current time and the
 time in the header block expressed in FILETIME units. This part of the child
 block is produced using one of two strategies, depending on the value of
 the first bit. If this bit is zero, ScCreateConversationIndex discards the
 high 15 bits and the low 18 bits. If this bit is one, the function discards
 the high 10 bits and the low 23 bits.
 * Four bits containing a random number generated by calling the Win32
 function GetTickCount.
 * Four bits containing a sequence count that is taken from part of the random
 number.

Reverse-engineering this description for the PFF format, I found that the part of the header block
containing the ‘One reserved byte’ with a value of 1 is actually the first byte of the filetime. So
the filetime occupies not 5 bytes but 6. The date and time in the header block of the
conversation index match the creation date and time of e-mail messages.

The child block contains the difference between the current and the previous time, and not the
difference from the time stored in the header block as the MSDN specification states. This was
validated using the creation date and time of multiple e-mails.
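The layout described above can be sketched in a few lines of Python. This is an illustration of the findings in this section, not libpff code. It assumes the header holds the upper 6 bytes of a big-endian FILETIME followed by 16 GUID bytes, that the bits of each child block are read most significant first, and that each delta is added to the previous time. The function names are invented for the example, and the GUID is returned as raw hex because its byte order is not covered here.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_conversation_index(data):
    """Split a conversation index into its header and child blocks."""
    if len(data) < 22 or (len(data) - 22) % 5:
        raise ValueError("expected a 22-byte header plus 5-byte child blocks")
    # Finding from the text: 6 bytes of filetime, the low 2 bytes are dropped.
    filetime = int.from_bytes(data[:6], "big") << 16
    header = {"filetime": filetime, "guid": data[6:22].hex()}
    children = []
    for offset in range(22, len(data), 5):
        block = int.from_bytes(data[offset:offset + 5], "big")
        mode = block >> 39                  # first bit selects the precision
        delta = (block >> 8) & 0x7FFFFFFF   # next 31 bits: time difference
        # MSDN: mode 0 drops the low 18 bits, mode 1 drops the low 23 bits.
        filetime += delta << (23 if mode else 18)
        children.append({
            "filetime": filetime,
            "random": (block >> 4) & 0xF,
            "sequence": block & 0xF,
        })
    return header, children
```

Passing each "filetime" value through filetime_to_datetime() gives timestamps that can be compared with the dates claimed in the message body.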

The conversation index for the specific e-mail translates to:

0x0071 (PidTagConversationIndex : Conversation index)
0x0102 (PT_BINARY : Binary data)
Header block:
 Filetime        : Mar 17, 2009 10:13:04 UTC
 GUID            : 11111111-2222-3333-4444-555555555555
Child block: 1
 Filetime        : Mar 17, 2009 10:18:03 UTC
 Random number   : 2
 Sequence count : 0
Child block: 2
 Filetime        : Mar 17, 2009 10:24:01 UTC
 Random number   : 9
 Sequence count : 0
Child block: 3
 Filetime        : Mar 17, 2009 10:42:39 UTC
 Random number   : 9
 Sequence count : 0
Child block: 4
 Filetime        : Mar 17, 2009 10:45:36 UTC
 Random number   : 14
 Sequence count : 0
Child block: 5
 Filetime        : Apr 17, 2009 07:19:08 UTC
 Random number   : 8
 Sequence count : 0

Note that the precision of the date and time difference in the child block varies and does not match
the creation date and time. The actual reason for this variation is as yet unknown.

0x3007 (PidTagCreationTime : Creation time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime        : Apr 17, 2009 08:41:20 UTC

However, there is no date of March 10, 2009 in the conversation index. Looking at the conversation
indexes of other forwarded and replied e-mail messages, this is the behavior we would expect.

Note that the GUID ‘11111111-2222-3333-4444-555555555555’ in this example was altered.

Using the GUID we found corresponding e-mails with the same GUID in the conversation index.
Most of these e-mails had different content. This finding supported our hypothesis. All of the
corresponding e-mails also had a creation date of March 17, 2009. Therefore, it was plausible that
the e-mail with the discolored zero in ‘March 10’ was falsified using another e-mail created on
March 17, 2009. Upon being confronted with the findings in an interview, the sender of the e-mail
admitted that he had altered the e-mail.

3. The modified appointment

In another investigation we found an appointment whose conversation topic contained one of the
keywords we were looking for. However, the appointment had an entirely different subject, and the
last modification date and time already indicated that the appointment was modified at a later
date.

We needed to be certain that this behavior was caused by modifying an appointment. Using
Outlook we created a PST file with an appointment. Libpff provided us with the following
information about the subject and the conversation topic:

0x0037 (PidTagSubject : Subject)
0x001f (PT_UNICODE : UTF-16 Unicode string)
Unicode string  : ^A^ATest1
0x0070 (PidTagConversationTopic : Conversation topic)
0x001f (PT_UNICODE : UTF-16 Unicode string)
Unicode string  : Test1

And about the date and time values:

0x0039 (PidTagClientSubmitTime : Client submit time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:07:47 UTC
0x0071 (PidTagConversationIndex : Conversation index)
0x0102 (PT_BINARY : Binary data)
Header block:
 Filetime         : Jul 23, 2009 14:07:47 UTC
 GUID             : 11111111-2222-3333-4444-555555555555
0x0e06 (PidTagOriginalDeliveryTime : Message delivery time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:07:47 UTC
0x3007 (PidTagCreationTime : Creation time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:04:28 UTC
0x3008 (PidTagLastModificationTime : Last modification time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:07:50 UTC

The ^A characters in the subject are control characters and can be ignored.

Note that the creation and last modification date and time are not equal.

Next we modified the appointment and had libpff provide us with information about the subject
and the conversation topic:

0x0037 (PidTagSubject : Subject)
0x001f (PT_UNICODE : UTF-16 Unicode string)
Unicode string  : ^A^AModified1
0x0070 (PidTagConversationTopic : Conversation topic)
0x001f (PT_UNICODE : UTF-16 Unicode string)
Unicode string  : Test1

And about the date and time values:

0x0039 (PidTagClientSubmitTime : Client submit time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:07:47 UTC
0x0071 (PidTagConversationIndex : Conversation index)
0x0102 (PT_BINARY : Binary data)
Header block:
 Filetime         : Jul 23, 2009 14:07:47 UTC
 GUID             : 11111111-2222-3333-4444-555555555555
0x0e06 (PidTagOriginalDeliveryTime : Message delivery time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:07:47 UTC
0x3007 (PidTagCreationTime : Creation time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:04:28 UTC
0x3008 (PidTagLastModificationTime : Last modification time)
0x0040 (PT_SYSTEM : Windows Filetime (64-bit))
Filetime          : Jul 23, 2009 14:08:37 UTC

As you can see, the conversation topic and index do not change when an appointment is modified.

The last modification date and time in the example are not much of an indication that the
appointment was modified, mainly because we did the modification right after the creation of the
appointment.
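This observation lends itself to a simple automated check. The Python sketch below compares a subject with its conversation topic after dropping control characters such as the ^A prefix. It is an illustration only: the function names are invented here, and it is meant for items like appointments. Replies and forwards normally carry a prefix such as "RE: " in the subject, so a difference there is not suspicious by itself.

```python
def strip_control_chars(text):
    """Drop control characters (code points below 0x20) such as ^A."""
    return "".join(ch for ch in text if ch >= " ")


def subject_matches_topic(subject, topic):
    """Return True when subject and conversation topic agree."""
    return strip_control_chars(subject) == strip_control_chars(topic)
```

For the two examples above, the unmodified appointment ("^A^ATest1" versus "Test1") matches, while the modified one ("^A^AModified1" versus "Test1") does not.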

4. Conclusion

E-mails and appointments in Outlook/Exchange provide us with certain properties that can be
useful for digital forensic analysis of e-mails, like the conversation index and multiple formatted
body texts. Others may be the conversation topic and original creation and/or modification dates
and times.

Appendix A. References

[METZ09]
Title:     Personal Folder File (PFF) forensics
Subtitle:  Analyzing the horrible reference file format
Author(s): Joachim Metz
URL:       http://kent.dl.sourceforge.net/sourceforge/libpff/PFF_forensics.pdf
[MSDN]
Title:     Tracking conversations
URL:      http://msdn.microsoft.com/en-us/library/cc765583.aspx

Perl Fu: Email Discovery

Filed under Email Investigations, eDiscovery

Hal Pomeranz, Deer Run Associates

I hope Mike Worman doesn’t hate on me for stealing his “Perl Fu” idea, but I recently have been dealing with a task that is perfect for Perl.  One of my customers is having to do a laborious discovery process through a huge email archive that is in “Unix mailbox format”, meaning large text files with the email messages all concatenated together.  They need to find any one of a list of relevant keywords in messages stored in these hundreds of gigabytes of large text files and output the entire text of the matching email messages.

Unix mailbox format is a file format that I’ve dealt with a lot, and I’ve written many scripts to parse these kinds of files.  So it probably took me less time to write the script to do this than it’s going to take me to write this blog post.  But I figured this is a task that other readers of the blog might encounter from time to time, so here’s the code:

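#!/usr/bin/perl
# mgrep -- match patterns and output messages from Unix mailbox files
# Usage: mgrep [-i] [-f file] [pattern] file1 ...

use strict;
use warnings;
use Getopt::Std;

my %opts = ();
getopts('if:', \%opts);

# Build the pattern from a file of patterns (-f) or from the first argument.
my $pattern = undef;
if (defined($opts{'f'}) && length($opts{'f'})) {
    open(my $fh, '<', $opts{'f'}) ||
        die "Can't open pattern file $opts{'f'}: $!\n";
    my @lines = <$fh>;
    close($fh);
    chomp(@lines);
    # Skip empty lines: an empty alternative would match every message.
    @lines = grep { length } @lines;
    die "No patterns found in $opts{'f'}\n" unless (@lines);
    $pattern = '(' . join('|', @lines) . ')';
}
else {
    $pattern = shift(@ARGV);
}
die "Usage: mgrep [-i] [-f file] [pattern] file1 ...\n"
    unless (defined($pattern) && length($pattern));
$pattern = "(?i)$pattern" if ($opts{'i'});

# Collect lines into $message; a "From " line starts the next message.
my $message = '';
while (<>) {
    if (/^From\s/) {
        print $message if ($message =~ /$pattern/s);
        $message = '';
    }
    $message .= $_;
}
# The last message of the last file still needs to be checked.
print $message if ($message =~ /$pattern/s);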

The actual meat of the program is the “while (<>) …” loop down in the bottom third of the code.  We spend more code processing arguments and setting up the pattern match than on actually processing the input files.  But here are some notes to help you make sense of what’s happening in the program:

  1. First we “use strict” to have Perl help us enforce good programming practice in our script, like pre-declaring variables with “my” to help prevent typos and other errors.
  2. Then we incorporate the standard Perl command line argument processing library (“use Getopt::Std”) and call getopts() to process the command line arguments.  Here we’re specifying that our program accepts both “-i” (case insensitive matching) and “-f” to specify a file name containing a list of patterns to match against.  The “:” after the “f” in the getopts() string means that “-f” expects an argument, namely the file name.  Any options that getopts() finds will be stored in the “%opts” hash.
  3. Next our “if” block checks to see if the “-f” option was set.  If so, then we attempt to open the specified file name and read in its contents (“die” causes the program to abort if the file can’t be opened).  We use chomp() to remove the newlines from the lines we read in and then we concatenate all of the patterns together to form a pattern string like “(pattern1|pattern2|…)” (“pattern1 or pattern2 or …”).  Note that if “-f” was not set, then we just read the pattern in from the command line like the normal Unix grep program (that’s the “else { … }” block).
  4. Next we check to see if the “-i” (case-insensitive match) option is set.  If so, then we add “(?i)” at the front of our pattern.  In a Perl pattern match, this is one way to express case-insensitive matching.
  5. Now we’re finally ready to start processing our input files.  The “while (<>) { … }” construct is a useful bit of Perl shorthand that emulates the standard Unix command-line processing.  Specifically it means that if there are any remaining command-line arguments, they should be treated as file names and opened sequentially and all lines processed one at a time from each file.  If there are no unused arguments on the command line after our argument processing, then the program should look for its input from the standard input.
  6. Within the body of the loop, we’re processing our input one line at a time.  At the end of the loop we’re simply concatenating the lines we read into the “$message” variable that holds our message text.  “$_” is the magic Perl variable that represents the text of the line we’re currently processing, and “$message .= $_” means “append $_ to the text already in $message”.
  7. Now for the uninitiated, Unix mailbox format is nothing but a large text file with messages concatenated one after the other.  You can recognize the start of each new mail message when you find a line that begins “From<whitespace>”.  Our “if { … }” block at the top of the loop matches this pattern as an indication that we’ve reached the end of one message and are starting in on another.  If the message we’ve collected so far matches the pattern specified by the user then we print the entire contents of the mail message.  Then we empty our “$message” variable so we can start collecting the next mail message.
  8. After we’ve processed all of our input files, we still need to determine whether or not we should output the last message from the last file we processed.  That’s why there’s one more print statement after the end of the loop.
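For readers more at home in Python, the loop described in the notes above can be sketched as a generator. This is my own rough equivalent, not a port of the script: the function name is made up, it tests for a literal “From ” prefix rather than /^From\s/, and Python’s re syntax differs from Perl’s in places.

```python
import re


def matching_messages(lines, pattern, ignore_case=False):
    """Yield each mbox message whose full text matches the pattern."""
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    regex = re.compile(pattern, flags)
    message = []
    for line in lines:
        # A "From " line starts a new message, so test the one collected so far.
        if line.startswith("From ") and message:
            text = "".join(message)
            if regex.search(text):
                yield text
            message = []
        message.append(line)
    # The last message is not followed by a "From " line.
    if message:
        text = "".join(message)
        if regex.search(text):
            yield text
```

Any iterable of lines works as input, so an open file object can be passed in directly and only one message is held in memory at a time.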

Whew!  That’s a lot of words for a simple script, but I hope it helps you wrap your head around some of the more obscure bits of Perl syntax and gives you some ideas for writing your own scripts.  By the way, because I chose to use Perl for this task, one of the happy accidents is that we can actually use the Perl regular expression syntax for the patterns we give as input to the program (whether we put them in a file or specify them on the command line).  This is good news because Perl’s pattern matching syntax is much more flexible and expressive than the one used by the regular Unix grep command.

Happy email hacking!

Hal Pomeranz is an independent IT/Computer Security Consultant and a SANS Faculty Fellow.  He is available as a strolling Perl programmer for weddings and bar mitzvahs.

Block Pornography – The Bane of Computer Forensics

Filed under Browser Forensics, Computer Forensics, Email Investigations, Ethics, Evidence Analysis, Reporting

By J. Michael Butler

What is more important?  Searching for porn on an organization-owned asset, or looking for misuse of organization-owned data?  Not even a trick question.  Too easy.  So why do organizations’ computer forensic experts still find themselves searching for porn?  Because it is there.

New problem?  I think not.  In T.h.e. Journal, there is an article written in 1997 addressing this same issue and suggesting a product called “Little Brother” to fix it.[1]  Today there are a plethora of software products for home and office use, ranging from free to more than $100 per workstation.  Some are more effective than others, but evaluation is outside the scope of this article.  Just know that software solutions exist.[2] 
 
Moreover, hardware solutions are also ubiquitous.[3]  Consider firewall and/or proxy type products.  With the correct settings, most porn can be stopped at the proxy.  I say most because there are so many new sites that pop up every day.  On the other hand, proxy software and hardware are constantly updated to add new blocked sites, so even if a new site is temporarily available, it will be blocked eventually.  So a good proxy will make it so much trouble to surf undesirable sites that one must assume users will eventually give it up.
 
I have included links to a few software and hardware tools at the end of this article.  This is not an endorsement of any of them; just a simple statement that there are many options available.  Nor should this list be considered all inclusive.  I assure you I am forgetting more than I listed. 

Not too long ago, the GCFA alumni e-mail list was hit with a brisk discussion of porn on corporate computers, to which one wag replied ‘just block the traffic.’  No porn – no problem.  Makes a lot of sense to me!  So, what’s stopping the companies from stopping porn surfing?  Money.

It takes money to block porn.  Not a huge amount, but enough to get the attention of management and the bean counters.  The problem is, like the old Fram oil filter ad used to say, ‘…you can pay me now, or you can pay me (a lot more) later.’

This is more than a “moral” issue or a policy issue.  Porn can be an addiction, just like booze and cigarettes.  So, do you really think you will solve the issue by writing a policy?  Perhaps you have heard the old saw:  “We can’t legislate morality.”  But, at the same time, we can take steps to keep porn out of the workplace. 
 
Let’s examine what happens when an organization has a user who spends work time on pornography.  Everybody loses.  Most of all, the organization!  First, there is the loss of productivity.  That costs money.  Whatever the user is supposed to be doing is not getting done.  Or it is taking a lot longer to complete.  Customers are not happy.  That costs money.  Coworkers are not happy.  In fact, coworkers may well be running to HR if they are offended by what they see on the offender’s computer.  Has your company looked at the legal liabilities?  That costs a LOT of money!  Check out this article from the ABA:  http://www.abanet.org/buslaw/blt/ndpolicy.html.

Finally, what is the cost of losing an otherwise trained and capable employee?  When an employee makes poor choices that cost that person a job, not only does he or she take a hit, the organization loses thousands of dollars.  Possible costs include:

  • Exit costs
  • Recruiting
  • Interviewing
  • Hiring
  • Orientation
  • Training
  • Compensation & benefits while training
  • Lost productivity
  • Customer dissatisfaction
  • Reduced or lost business
  • Administrative costs
  • Lost expertise
  • Temporary workers[4]

 
Webpronews estimates “[Losing employees] costs you 30-50% of the annual salary of entry-level employees, 150% of middle level employees, and up to 400% for specialized, high level employees!”[5]  You can also calculate in the resource costs for the HR expert, Legal representative, and forensic analyst who have to spend time on such an issue.  There is another expense.[6]

A significant percentage of inappropriate web surfing can be stopped.  Period.  Access to anonymous proxies can be stopped.   Users can be blocked from wasting time at porn sites, as well as other sites considered undesirable by management.  By carefully and thoughtfully spending appropriate funds now, an organization can avoid loss of productivity and loss of good employees.

Why is this important enough to blog about?  Because it can be stopped.  There are software and hardware tools that, among other things, will effectively block a high percentage of porn sites.  Try entering “How to block porn” in a search engine, and you will have over 1 million hits.  How much should you spend to block porn?  Well, it depends.  How many employees do you want to make sure you keep?  How important is it that all your projects are completed on time?  So, stop it, already.  Make sure your resources spend their time on your priorities.


 
[1] http://www.thejournal.com/articles/14024

 

[2] Software options for blocking porn:

Aobo Porn Blocker:  http://www.download3k.com/Press-Block-Porn-website-with-Aobo-Porn-Blocker.html

Blog article on Proxy Auto Configuration files :  http://www.ericphelps.com/security/pac.htm

CyberPatrol:  http://www.cyberpatrol.com/products.asp

Guardware:  http://www.guardware.com/default.php

NetDog Filter:  http://www.netdogsoft.com/

NetNanny:  http://www.netnanny.com/products/netnanny?pid=10-1

OpenDNS:  http://voices.washingtonpost.com/securityfix/2007/06/a_softwarefree_approach_to_blo.html

SafeSquid Proxy:  http://www.howtoforge.com/how-to-block-porn-pictures-and-images-with-safesquid-proxy-server

WebWatcher:  http://www.webwatchernow.com/Monitoring-Software/Consumer/Website-Blocking-lnd.html?gclid=CNLK-_GK9JgCFQGbnAodKUZf0g

 

[3] A few hardware vendors with solutions for blocking porn:

Blue Coat:  http://www.bluecoat.com/products/

Checkpoint:  http://www.checkpoint.com/

Cisco:  http://www.cisco.com/en/US/prod/collateral/vpndevc/ps5708/ps5710/ps1018/C78-345384-04_CiscoIntegratedFirewallSolutions.html

Juniper:  http://www.juniper.net/us/en/

Paloalto:  http://www.paloaltonetworks.com/

Radware:  http://www.radware.com/

Sonic Guard:  http://www.sonicguard.com/

Websense:  http://www.websense.com/site/scwelcome/index.html
 
[4] http://www.webpronews.com/expertarticles/2006/07/24/employee-retention-what-employee-turnover-really-costs-your-company

 

[5] http://www.webpronews.com/expertarticles/2006/07/24/employee-retention-what-employee-turnover-really-costs-your-company

 

[6] More on cost of losing/replacing employees:

Cost of losing an employee:  http://www.idahosbdc.org/upload/pdf/18TomMaydewarticle.pdf by Tom Maydew

Average Cost of Bringing On a New Employee:  http://www.entrepreneur.com/ask/answer4031.html

The Real Cost of Losing an Employee: http://www.hartfordmedia.com/pg-employee.html

The Billion Dollar Cost of Lost Business Knowledge:

http://ezinearticles.com/?The-Billion-Dollar-Cost-of-Lost-Business-Knowledge&id=1454207

J. Michael Butler, GCFA Gold #00056, is an Information Security Consultant employed by a Fortune 500 application service provider that processes approximately half of the $5 trillion of residential mortgage debt in the US. He is a certified computer forensics specialist. In addition, he authored the enterprise-wide security incident management plan and information security policies for his corporation. He can be reached at jmbutler_1 at hotmail dot com.

pdymail: Yahoo! mail in memory

3
Filed under Browser Forensics, Computer Forensics, Email Investigations, Evidence Analysis, Memory Analysis

I thought GMail gave up quite a bit of information revealed through pdgmail. Little did I know how much was in Yahoo! mail!

pdymail is the sister script to pdgmail for gathering Yahoo! email artifacts from memory.

The good thing about Web 2.0, with its AJAX, JSON, etc. interfaces, is that most of it is text, and even more is XML, which is nicely discoverable in memory. Yahoo! mail classic interface artifacts are easily found on the hard disk in browser cache files. The new Yahoo! mail interface uses XML, and while it doesn’t leave much behind on the disk, it leaves tons in memory.

Like pdgmail, pdymail is a rather simple Python script tested mostly against a pd dump of a process in memory. It also works against Mantech dd memory images, and pretty much any other way you can get memory and run strings against it.

The difference in Yahoo! mail is the vast amount of information that is retrievable! It’s already in XML in memory, and rather than parse it and lose something precious to the investigator, pdymail simply finds XML artifacts and presents them in an XML document.
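As a rough illustration of that approach (this is not pdymail's actual code), the sketch below scans the lines of a strings file for anything that looks like one complete XML element and wraps the hits in a single root element. The regular expression, the function names and the `artifacts` root tag are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: one complete element on a line, either self-closing
# or with a closing tag that matches the opening tag. Real memory artifacts
# are messier than this and will need a looser match.
XML_LINE = re.compile(r"^<([A-Za-z][\w:.-]*)(\s[^<>]*)?(/>|>.*</\1>)$")


def find_xml_fragments(lines):
    """Return the stripped lines that look like one complete XML element."""
    stripped = (line.strip() for line in lines)
    return [text for text in stripped if XML_LINE.match(text)]


def wrap_fragments(fragments, root="artifacts"):
    """Join the fragments under one root element so a parser can load them."""
    body = "\n".join("  " + fragment for fragment in fragments)
    return "<%s>\n%s\n</%s>" % (root, body, root)


def build_document(lines):
    """Parse the wrapped fragments; raises ParseError if one is malformed."""
    return ET.fromstring(wrap_fragments(find_xml_fragments(lines)))
```

Keeping the fragments verbatim, rather than picking fields out of them, follows the same reasoning as above: nothing the investigator might want is thrown away before it reaches the XML document.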

You can use this XML document to rather easily reconstruct the contents of an inbox, including dates, senders, recipients and even the IP addresses of the host that sent the email. Flags for spam, ham, read, unread, forwarded, sender in the address book, etc., are also available per message. Note that the one thing I haven’t found yet is the actual body of an email. Maybe you can? If so, send me an example and I’ll revise the script.

Here’s how to run it:

On the subject machine, use pd from www.trapkit.de like so:
pd -p 1234 > 1234.dump

where 1234 is the process ID of a running instance of IE, or some other browser you think might have Yahoo! mail artifacts in its memory.

Then on your analysis box do:
strings -el 1234.dump > memorystrings.txt
pdymail -f memorystrings.txt

It’ll spit out an XML document that you can analyze for whatever you’re looking for. In my next post I’ll detail some use cases for parsing that XML with XMLStarlet on Linux, but for now…happy hunting!

Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS and performs forensics, intrusion analysis, and security architecture work on a daily basis. He just re-upped on GCFA and is now cramming for GCIH re-cert.

Destruction of adverse documents

Comments Off
Filed under Email Investigations, Evidence Acquisition, eDiscovery

It is an offence to destroy any document that is or may be used as evidence in an ongoing or potential judicial proceeding in most western (at least the common law) jurisdictions. An organization must not destroy documents on the grounds that the evidence is unfavorable. Destroying documents that may be subject to litigation can end in a charge of obstruction of justice. This makes identifying material that was destroyed after a litigation hold came into effect a key goal of the forensic investigator.

Adverse inferences are often upheld in litigation if a party cannot produce the required documents. There is also the hazard of reputational damage. In British American Tobacco Australia Services Limited v Roxanne Joy Cowell for the estate of Rolah Ann McCabe [2002] VSCA 197, the judge at first instance strongly criticized BAT for the methodical destruction of a large number of records. Documents that may hold evidentiary value need to be retained. Ironically, implementing a record retention policy without taking proper precautions will generally draw an adverse inference from the court if there is any departure from the policy.

The consequence is that a retention policy also requires ongoing education about the policy and the procedures used to enforce it, and constant re-examination of its content. Where a document has been deliberately destroyed, the court is likely to come to a negative determination.

The litigation process of discovery
Discovery is the progression of events that follow the initiation of legal proceedings. A matter will proceed to Court only after all parties have delivered up relevant documents or have presented testimony that they cannot provide these documents. The process of e-discovery involves electronic records such as emails.

Rigidly enforced deadlines make it vital for the parties to be able to retrieve documents and emails promptly. The forensic investigator has a duty to uncover breaches of litigation hold. Documents destroyed in the period following knowledge of a lawsuit, for instance, come under this category.

Expectation of Privacy
Privacy in the workplace is a contentious subject. The definitions of privacy, and its means of protection, vary by jurisdiction. Employee email is commonplace and is used for both work and private means. Organizations have stringent legal requirements in the European Union, Australia, the United States, and other jurisdictions to guard information on private individuals from unauthorized disclosure.

The expectation of privacy does not provide the right to destroy evidence. It is a matter for the court to determine if a file is relevant to a particular case or if it may be excluded.

How strong can the law be?
To answer this, I put forth an example of a fairly recent Australian law. The Victorian Crimes (Document Destruction) Act 2006 (the Document Destruction Act) was passed into law in Victoria (an Australian State) in 2006. Together with the Evidence (Document Unavailability) Act 2006 (the Document Unavailability Act), these pieces of legislation amend the Victorian Crimes Act 1958 and Evidence Act 1958, respectively. They were issued in response to concerns raised by the Report on Document Destruction and Civil Litigation in Victoria, by Professor Peter Sallmann. These documents add weight to the need for all companies to comprehend their responsibilities in respect of how they store or destroy any documents. This incorporates email and other electronic files.

The Document Destruction Act establishes additional criminal penalties and the Document Unavailability Act sets up new civil consequences. The Document Destruction Act affects acts carried out in Victoria such as those by companies resident (or engaging in business) within Victoria. The Document Unavailability Act pertains to civil proceedings initiated within Victoria.

These particular acts are focused on proceedings that have been started within a single state in Australia. The point is that individual laws may vary (and at times be unclear), but it is nearly universal that the destruction of a document that could be used as evidence in a court is a crime. Where this really comes into effect is that evidence of the destruction of a document can in fact be worse than the material that may have been contained in the destroyed document.

Craig Wright, GCFA Gold #0265, is an author, auditor and forensic analyst. He has nearly 30 GIAC certifications, several post-graduate degrees and is one of a very small number of people who have successfully completed the GSE exam.

PTK: Evidence adding and Indexing

Comments Off
Filed under Computer Forensics, Email Investigations, Evidence Acquisition, Evidence Analysis, Incident Response, Memory Analysis, Mobile Device Forensics, Write Blockers, eDiscovery

At the moment, three output formats are mainly used for media duplication in computer forensics:

●    dd (RAW image) –  the best and most utilized format
●    Encase format (EWF) – closed format now widely supported by the CF products
●    AFF Lib Format– very complete but still expanding

PTK can recognize the formats listed above. Usually, a media copy is stored either as a single file or as split files. PTK is able to recognize the split image situation and, given the first chunk, automatically import the additional files. No log files or other types of data are allowed inside the evidence directory (i.e. file.e01, file.e02, file.log is not permitted). Through TSK, PTK automatically recognizes every partition in the image, including support for the following file system types: NTFS, FAT, UFS 1, UFS 2, EXT2FS, EXT3FS, and ISO 9660. One may also define, if necessary, the original time zone. Remember that for the FAT file system, time information is saved according to the local system time. With PTK, during the FAT image import, the timestamps are converted from the original system’s local time into GMT/UTC time. For the NTFS file system, the timestamps are already saved in GMT/UTC format, and thus the time zone setting is only a display parameter that can be changed at any time. For every piece of evidence added, you can calculate the hash (MD5, SHA1) and check it against a known value.
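The hash check mentioned above can also be done outside PTK. The following is a minimal sketch, assuming a raw (dd) image that may be split into chunks: reading the chunks in order and feeding them to one digest gives the same value as hashing the unsplit image. That does not hold for container formats such as EWF, which wrap the data. The function names and file names are hypothetical.

```python
import hashlib


def hash_image(paths, chunk_size=1024 * 1024):
    """Return (md5, sha1) hex digests over the files in `paths`, read in order.

    For a split raw image, pass the chunks in their original order.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    for path in paths:
        with open(path, "rb") as handle:
            # Read in fixed-size blocks so large images do not fill memory.
            for block in iter(lambda: handle.read(chunk_size), b""):
                md5.update(block)
                sha1.update(block)
    return md5.hexdigest(), sha1.hexdigest()


def matches_known_hash(paths, known_md5):
    """Compare the computed MD5 with a previously recorded value."""
    md5, _ = hash_image(paths)
    return md5.lower() == known_md5.strip().lower()
```

A call such as `hash_image(["image.001", "image.002"])` would return both digests for a two-part raw image.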

File system detection

ram-dump unknown file system

In case PTK is not able to identify the file system, the user can choose to import the image as a RAM dump and make use of the RAM dump analysis, or import it as a RAW image and analyse the disk through the Data Unit module or run the Live Keyword Search on it. During the evidence importing process it is possible to decide whether to create a symbolic link to the image or copy the entire evidence, split or not, inside the PTK images directory (%www path%/ptk/images).

Even though PTK doesn’t change the evidence file in any way, it is advisable to always use a write blocker. In case the write blocker is Firewire, and not ATA, it is recommended that you copy the entire evidence to a disk in order to improve data access speed and, consequently, performance. The indexing process requires considerable CPU and disk I/O resources. Once the evidence is imported, it is possible to start working directly on it through various analysis modules (File Analysis, Live Keyword Search, Data Unit, etc.) or start the indexing process. PTK’s indexing engine, discussed in previous articles, allows one to perform different automated tasks and produce results that all investigators assigned to the case can consult. The indexing process supplies all investigators with its analysis results, but it is launched only once by the Master Investigator. The diagram below shows the indexing process operated by PTK using TSK tools. The performance of the indexing engine has improved compared to the first beta versions.

PTK indexing form

PTK indexing engine

The next article will deal with PTK’s multi-user system, the ability to prevent more than one investigator from accessing specific cases, and the bookmarking features available for every investigator.

Michele Zambelli, GCFA Silver #1856, is a member of PTK Team and a Security Consultant at DFLabs Italy.

How math can help with forensics

2
Filed under Computer Forensics, Email Investigations, Evidence Acquisition, Evidence Analysis

Data mining, text mining and network association are all statistical tools that have come into their own as the sheer quantity of available computational power increases. True, you do not need to have a strong basis in math to use these programs, but math can help determine where they may be used.

Text data mining takes the standard associative keyword based search techniques and increases their effectiveness through the ability to map associations with other words and to create visual representations of the data. This allows an investigator to drill down into previously undetermined associations and also allows the investigator to analyze immense amounts of data. One of the problems in the past has been in how to represent this data.

This is where visualisation technologies come into play. These allow the investigator to uncover previously hidden relationships in the data. More importantly, the visualisation techniques that are available today make reporting to a lay jury simpler.

In the visualisation network:

A dot represents a person and is also called a node.

A line connecting two dots represents an existing conversation and is also called an edge.
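To make the node and edge idea concrete, here is a minimal sketch that builds such a network from a chat log reduced to (sender, recipient) pairs. It is not GEOMI's input format or code; the function names and the pair representation are assumptions for illustration only.

```python
from collections import defaultdict


def build_network(messages):
    """Build nodes and weighted undirected edges from (sender, recipient) pairs.

    Each person is a node. Each pair of people who exchanged at least one
    message is an edge, weighted by the number of messages between them.
    """
    nodes = set()
    edges = defaultdict(int)
    for sender, recipient in messages:
        nodes.update((sender, recipient))
        if sender != recipient:
            edges[frozenset((sender, recipient))] += 1
    return nodes, dict(edges)


def degree(nodes, edges):
    """Count how many distinct people each node has talked to."""
    counts = {name: 0 for name in nodes}
    for pair in edges:
        for person in pair:
            counts[person] += 1
    return counts
```

People with a low degree are the ones a layout would tend to push toward the outside of the picture, while tightly connected groups share many edges with each other.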

The GEOMI software developed at NICTA is an Open source project that consists of a set of Java scripts developed at the Systems Biology Initiative, University of New South Wales (Ho et al. (2008) J Proteome Res., 7:104-12).

The benefit of this type of visualisation is the simplification of complex datasets (such as social networks, chats and logs) into an easily comprehensible 3-D map that a user can rotate, zoom and otherwise interact with.

My team has used this type of program in modelling chat logs. In the image above (the names have been altered to remove the details related to a case), the social networks are displayed with the tightly connected groups being packed together and the “outsiders” to the conversations are displayed further apart in the network display. This program has allowed for the display of social relationships between chat users. Additionally, it has been used to model changes to logs and to detect tampering with evidence.

The GEOMI program is developed by the Systems Biology Initiative, UNSW. [Prof. Marc Wilkins, Director, m.wilkins@unsw.edu.au, Simone Li & Edwin Ho]. With their help, Ignatius and I shall be publishing a paper on the use of this and other visualisation programs in the following months.

Craig Wright, GCFA Gold #0265, is an author, auditor and forensic analyst. He has nearly 30 GIAC certifications, several post-graduate degrees and is one of a very small number of people who have successfully completed the GSE exam.

pdgmail: new tool for gmail memory forensics

9
Filed under Computer Forensics, Email Investigations, Memory Analysis

I saw John McCash’s article on GMail forensics … I was hooked and created pdgmail.

I’ve been messing around with the volatile toolkit for memory forensics and thought I’d try my hand at GMail memory forensics since, as John says, the GMail data isn’t supposed to end up on disk anyway, so maybe it’s in the browser memory?

Boy is it!

I used the pd dump tool from www.trapkit.de, available here, and tested against my meager GMail account, Windows XP, 2000, IE 6, IE 7 and Firefox 3. In all cases I was able to retrieve contact data, last login times and IP addresses, basic email headers and email bodies. Even if the browser was ‘logged out’ of GMail, they all still retained this data. Even for messages that were not opened, contacts that weren’t used. Simply loading up the GMail UI loads all this data in the memory image.

How to use?

First step is to gather the browser memory. Here’s a sample pd session where 6352 is the PID of a running IE instance:

E:\Program Files\tools>pd -p 6352 > 6352.dump
pd, version 1.1 tk 2006, www.trapkit.de

Dump finished.

E:\Program Files\tools>dir
Directory of E:\Program Files\tools

09/27/2008 06:57 PM 117,908,254 6352.dump

Whoa big file! But this is forensics, we don’t scare at large data sets. To use the pdgmail tool, run this memory dump through strings -el to create a strings file, then either cat that file through pdgmail, or run pdgmail with the -f flag specifying your strings filename. For example:

strings -el 6352.dump | pdgmail | less

Best mileage will be with Python 2.4.4 or 2.5 on Linux. I haven’t tested it below those versions or on Windows.
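The -el option tells strings to look for 16-bit little-endian characters, which is how much of the browser text sits in Windows process memory. If strings isn't handy, the same idea can be roughly approximated in Python. This is a sketch, not a drop-in replacement: it only matches printable ASCII characters stored as UTF-16LE, and the function names and the minimum length of 4 are choices made for this example.

```python
import re


def utf16le_strings(data, min_len=4):
    """Yield runs of printable ASCII characters stored as UTF-16LE.

    Each character is one printable byte followed by a zero byte; a run
    must contain at least `min_len` characters to be reported.
    """
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("utf-16-le")


def dump_to_strings(dump_path):
    """Read a whole dump file into memory and return its UTF-16LE strings."""
    with open(dump_path, "rb") as handle:
        return list(utf16le_strings(handle.read()))
```

Note that `dump_to_strings` loads the entire dump at once, which is fine for a single browser process but would need chunked reading for a full memory image.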

It looks for these things:

  • contacts
  • last access records
  • GMail account names
  • message headers
  • message bodies

Contacts show up as:
contact: name: "jeff bryner" email: "myemailaddress@gmail.com

Last Access records show most recent two logins and appear as:
last access: "14 hours ago" from IP "10.15.26.8", most recent access Tue Oct 14 10:57:53 2008 from IP "12.9.4.238"
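If you want those fields in a structured form, a line like the one above can be picked apart with a regular expression. This is a sketch written against that single sample line; the pattern, the field names and the assumption that the timestamp always uses this layout are mine, not something pdgmail guarantees.

```python
import re
from datetime import datetime

# Pattern modelled on the sample line only; other layouts will not match.
LAST_ACCESS = re.compile(
    r'last access: "(?P<relative>[^"]*)" from IP "(?P<first_ip>[^"]*)", '
    r'most recent access (?P<when>.+?) from IP "(?P<recent_ip>[^"]*)"'
)


def parse_last_access(line):
    """Return a dict of the fields in a 'last access' line, or None."""
    match = LAST_ACCESS.search(line)
    if match is None:
        return None
    record = match.groupdict()
    try:
        # The sample carries no time zone, so the result is a naive datetime.
        record["when"] = datetime.strptime(record["when"], "%a %b %d %H:%M:%S %Y")
    except ValueError:
        pass  # leave the raw text in place if the layout differs
    return record
```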

Email messages are the messiest mostly because memory artifacts don’t always conform to API standards, so picking them out is a best guess.

Using the most familiar email of all, headers show up as:
message header: ["ms","113b0d734737dec4","",4,"Gmail Team ","Gmail Team","mail-noreply@google.com",1184082900000,"Did you know that GMail was voted #2 in PC World's Top 100 products of 2005, ...",["^all","^i"]
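Since the header is a JSON-style array, it can often be loaded with a JSON parser once the label is stripped. The sketch below assumes two things that the sample suggests but that are not confirmed: that a truncated line can sometimes be repaired by appending a closing bracket, and that the long number in position 7 is a timestamp in milliseconds since the Unix epoch. The function names are hypothetical.

```python
import json
from datetime import datetime, timezone

PREFIX = "message header: "


def parse_header_line(line):
    """Return the header fields as a list, or None if the line won't parse."""
    text = line.strip()
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    # Try the text as-is, then with the closing bracket memory often drops.
    for candidate in (text, text + "]"):
        try:
            fields = json.loads(candidate)
        except ValueError:
            continue
        if isinstance(fields, list):
            return fields
    return None


def header_timestamp(fields, index=7):
    """Interpret fields[index] as milliseconds since the epoch (assumption)."""
    return datetime.fromtimestamp(fields[index] / 1000, tz=timezone.utc)
```

Under that assumption, the value 1184082900000 in the sample works out to 10 July 2007, 15:55 UTC.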

Message bodies are parsed to turn the unicode into proper html:

Did you know that GMail was voted #2 in PC World’s Top
100 products of 2005
, right after Firefox? Why wouldn’t you want to
switch? Well, because it can be a pain to switch to a new email
address. We know.

etc…

Nothing fancy, just some glorified regex and unicode handling dumped to stdout. It parses if possible, otherwise it just spits out a familiar line. Feel free to send me patches, tweak, rewrite, etc. Hope it helps someone!

Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS and performs forensics, intrusion analysis, and security architecture work on a daily basis.

Forensic Gmail Artifact Analysis

Comments Off
Filed under Computer Forensics, Email Investigations, Evidence Analysis, eDiscovery

I don’t know if you’ve had the pleasure of trying to extract GMail message content from a drive image, but there aren’t a lot of references out there. Those that I found helpful, I’ve listed below.

Gmail uses JavaScript to manage the user experience on the front end, and passes content back and forth between the client and server using ‘datapack’ files, which are formatted using JavaScript Object Notation (JSON). See Google for details on JSON, but basically a complete datapack file looks something like the following (indentation & newlines added):

while(1);
[
     [
          ["tag1","string1.1","string1.2","string1.3","string1.4","string1.5"]
          ,["tag2","data2.1"]
          ,["tag3"
               ,[]
               ,[]
          ]
     ,["tag4",number4.1]
     ,["tag5",number5.1]
     ,["tag6","string6.1","string6.2","string6.3","string6.4",number6.5,
      number6.6,number6.7,"string6.8","string6.9"]
.
.
.
.
     ]
]

Each pair of brackets is a data structure. Given a complete datapack file and a complete description of each tag, including its name and the ordering and individual descriptions of each of its various subordinate data fields, one could format the contents for display as the GMail application did originally.
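As a starting point for that kind of formatting, here is a minimal sketch that strips the while(1); header, loads the rest, and groups the nested arrays by their leading tag. It assumes the recovered file is complete and is strict JSON, which a carved or cached datapack may well not be. The function names are hypothetical.

```python
import json

HEADER = "while(1);"


def load_datapack(text):
    """Strip the while(1); header and parse the remainder as JSON."""
    body = text.lstrip()
    if body.startswith(HEADER):
        body = body[len(HEADER):]
    return json.loads(body)


def records_by_tag(node, found=None):
    """Collect every array whose first element is a string, keyed by it.

    This is deliberately naive: any nested array that starts with a string
    (a list of labels, for example) is treated as a tagged record too.
    """
    if found is None:
        found = {}
    if isinstance(node, list):
        if node and isinstance(node[0], str):
            found.setdefault(node[0], []).append(node[1:])
        for child in node:
            records_by_tag(child, found)
    return found
```

With the records grouped this way, looking up a tag name returns the list of data fields for every record that carried it.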

Here’s what I’ve got so far (no subfield descriptions, sorry):

Keyword/Tagname    Description
["gn",             Account Name
["st",             Server name
["qu",             Account Quota
["ds",             Folders
["t",              Message List (Thread)
["cs",             Conversation Summary
["mi",             Message Information/Index
["mb",             Message Body (This is where the meat is)
["ma",             Message Attachments (Number & Filenames)
while(1);          GMail Data Packet header (beginning of file)
["i",              Invitation
["ft",             Fast Tip (no I don't know what that means)
["ct",             Categories/Labels/Contacts
["ts",             Thread Summary (Similar to Conversation Summary)
["te",             End of Thread List
["v",              GMail Version

"So where do I find the files that contain this content?", you ask. Sad to say, sometimes you don't. The reason that this data is sometimes lying around to benefit a forensic analyst is largely because of browser bugs or lack of proper support for the no-cache HTML meta tag. This data isn't supposed to be written to disk in the first place, but due to a number of issues outside the scope of this article, it often is. I understand that support is improving for this in newer browser versions, so most GMail forensics may soon be a thing of the past. Then again, some people are still running Windows 95 (shudder) so this will probably be useful for a while at least.

When the files are cached, you will find them named “mail[somenumber]”, and located either in Temporary Internet Files, or wherever your tool of choice puts files it can’t identify the previous location of. You’ll also quite often be able to find these files in unallocated space by searching for the various keywords I’ve specified. Additionally, you will find other files in the same places named “mail[somenumber].htm”. While these contain other ‘stuff’, there’s often some JSON as described above buried inside them.
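A keyword search like that can be sketched in a few lines of Python. The example below scans a raw image or an exported block of unallocated space for byte markers and reports the offset of each hit, carrying a small overlap between reads so that a marker split across two chunks is still found. The function name is hypothetical, and it assumes the markers are stored as plain single-byte text.

```python
def find_markers(path, markers, chunk_size=1024 * 1024):
    """Yield (offset, marker) for every occurrence of each marker in a file.

    `markers` is a list of byte strings such as b'["mb",'. Hits are yielded
    chunk by chunk, so sort the results if overall offset order matters.
    """
    overlap = max(len(marker) for marker in markers) - 1
    tail = b""   # end of the previous window, kept to catch split markers
    base = 0     # file offset of the first byte of the current block
    with open(path, "rb") as handle:
        while True:
            block = handle.read(chunk_size)
            if not block:
                break
            window = tail + block
            window_start = base - len(tail)
            for marker in markers:
                pos = window.find(marker)
                while pos != -1:
                    # Skip hits lying wholly in the tail: already reported.
                    if pos + len(marker) > len(tail):
                        yield window_start + pos, marker
                    pos = window.find(marker, pos + 1)
            base += len(block)
            tail = window[-overlap:] if overlap else b""
```

Each offset can then be handed to a hex viewer or carving tool to pull out the surrounding data.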

Finally, the most useful part of this is the “mb” datapacks, which contain the formatted body of a message. All message body elements found in a given file belong to the same message, and can simply be concatenated to produce a mostly readable body. The following UNIX/cygwin shell script can be applied to a datapack file to render any message body it might contain back into more-or-less displayable HTML:

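# Render the "mb" (message body) lines of each datapack file as HTML.
# 'read -r' replaces the obsolete 'line' utility and keeps backslashes intact.
# Note: \? and \+ in the first expression are GNU sed extensions.
for I in "$@"
do
     grep '"mb"' "$I" | while read -r L
     do
          printf '%s\n' "$L" | \
          sed -e 's/\(\\n \?\)\+/<br>/g' \
          -e 's/\\u003e/>/g' \
          -e 's/\\u003d/=/g' \
          -e 's/\\u0026/\&/g' \
          -e 's/\\u003c/</g' \
          -e 's/^,\["mb","//' \
          -e 's/",1\]$//' -e 's/\\"/"/g' \
          >> "$I.html"
     done
done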

If you liked this article, want to add something to it, or simply want to call me on the carpet for some inaccuracy, please feel free to leave a comment.

References: (Some may not be available to those without Guidance Software portal access, sorry)

Slides from CEIC 2008 Presentation on Gmail Forensics

Codeproject page for GMail Agent API / Mail Notifier & Address Importer

Locating GMail Traces Article at ForensicFocus.com

A perl interface to Google’s webmail service

GMail Agent API/Mail Notifier & Address Importer

GMail Evidence – EnCase User’s Group Posting

Web Mail Question – EnCase User’s Group Posting

JSON (Google)

So, You Don’t Want To Cache, Huh?

John McCash, GCFA Silver #2816, is currently a Forensic Investigator employed by a Fortune 500 telecommunications equipment provider.