The new version of SOF-ELK is here. Download it, turn it on, and get going on forensic analysis.

 


We are excited to announce the release of an all-new version of the free SOF-ELK®, or Security Operations and Forensics ELK, virtual machine. Now based on the latest version of the Elastic Stack, SOF-ELK is a complete rebuild that is faster and more streamlined than its predecessors, making forensic and security data analysis easier than ever.

 

Since its introduction about five years ago, SOF-ELK has been downloaded more than 10,000 times around the world by computer forensic investigators and information security operations personnel in the government, law enforcement, and commercial sectors. The SOF-ELK platform is a customized build of the open source ELK stack, consisting of the Elasticsearch storage and search engine, the Logstash ingestion and enrichment component, and the Kibana dashboard frontend.

 

SOF-ELK was always designed to help minimize the typically long and involved setup process the ELK stack requires by delivering a pre-built virtual appliance that can be downloaded in a ready-to-use state. It consumes various source data types (numerous log formats as well as NetFlow), parses out the most critical data fields, and presents them on several stock dashboards. Users can also build custom data visualizations that suit their own investigative or operational requirements (and, because of the fully open source nature of the project, can choose to contribute those custom builds back to the primary code repository). Learn more about the SOF-ELK distribution.

 

The new version of SOF-ELK has been rebuilt from the ground up to take advantage of the new version of the Elastic Stack software and uses all of the Elastic Stack’s components. This required a total rebuild of all dashboards and supporting scripts. Simply download this distribution, turn it on, feed it some data and begin analysis. It’s that easy.

 

The SANS team has performed extensive testing of this distribution of the SOF-ELK platform. The new version's most immediate benefits are its speed and its refreshed browser interface, which is easier to use and can visualize massive amounts of data through the dashboards. We also expect faster development cycles. Here are some more changes in the new SOF-ELK:

 

  • Supports the latest kernel and CentOS package updates.
  • Includes new parsers from upstream and community contributions.
  • Includes all Logstash parsers rebuilt and revalidated against the latest syntax.
  • Better handles dynamic (boot-time) memory allocation for Elasticsearch.
  • Includes all Kibana dashboards rebuilt to handle updated index mappings and field names.
  • Handles IPv6 addresses as IP addresses instead of strings.
  • Features many under-the-hood changes that will make our roadmap much smoother in the future.

The SOF-ELK platform was initially developed for SANS FOR572, Advanced Network Forensics and Analysis, and is now also used in SANS SEC555, SIEM with Tactical Analytics. Additional course integrations are being actively developed and considered for future versions. However, SOF-ELK was always designed as a free resource for the digital forensic and broader information security communities at large: a ready-to-use appliance that teams can use without having to invest the many hours needed to deploy, configure, and maintain an Elastic Stack instance. We hope you check out the latest version of the SOF-ELK distribution.

____________________________________________________________________________________

JOIN THIS WEBCAST TO FIND OUT MORE: 
SOF-ELK®: A Free, Scalable Analysis Platform for Forensic, Incident Response, and Security Operations

  • When: Tuesday, March 5th, 2019 at 1:00 PM EST 
  • Conducted by Phil Hagen
  • Register now

Overview

There is no shortage of digital evidence, with many DFIR and Security Operations teams handling terabytes of log and network data per week. This amount of data presents unique challenges, and many tools are simply inadequate at such a large scale. Commercial platforms that are up to the task are often far out of budgetary reach for small- and medium-sized organizations.

The Elastic Stack, a big data storage and analysis platform, has become increasingly popular due to its scalability and open-source components. Countless investigative and security teams have incorporated Elastic into their toolkits, often realizing the significant level of effort required to customize and manage such a powerful tool. To overcome some of these hurdles, the SOF-ELK platform was created. SOF-ELK aims to be an appliance-like virtual machine that is preconfigured to ingest and parse several hundred different types of log entries, as well as NetFlow data. The intent is to provide analysts and investigators with a tool that leverages the power of the Elastic Stack with minimal setup time and effort. Originally part of the SANS FOR572, Advanced Network Forensics: Threat Hunting, Analysis, and Incident Response course, SOF-ELK has been incorporated into additional SANS courses and is released as a free and open-source platform for the overall security community.

In this webcast, we will explore SOF-ELK's use cases and the types of log data currently supported, as well as how to load data from live or archived sources. We will also show the various dashboards supplied with the VM and how new features can be activated through the project's GitHub repository.

Top 11 Reasons Why You Should NOT Miss the SANS DFIR Summit and Training this Year

The SANS DFIR Summit and Training 2018 is turning 11! The 2018 event marks 11 years since SANS started what is today the digital forensics and incident response event of the year, attended by forensicators time after time. Join us and enjoy the latest in-depth presentations from influential DFIR experts and the opportunity to take an array of hands-on SANS DFIR courses. You can also earn CPE credits and get the opportunity to win coveted DFIR course coins!


To commemorate the 11th annual DFIR Summit and Training 2018, here are 11 reasons why you should NOT miss the Summit this year:

1.     Save money! There are two ways to save on your DFIR Summit & Training registration (offers cannot be combined):

·       Register for a DFIR course by May 7 and get 50% off a Summit seat (discount automatically applied at registration), or

·       Pay by April 19 and save $400 on any 4-day or 6-day course, or up to $200 off of the Summit. Enter code “EarlyBird18” when registering.

2.     Check out our jam-packed DFIR Summit agenda!

·       The two-day Summit will kick off with a keynote presentation by Kim Zetter, an award-winning journalist who has provided the industry with the most in-depth and important investigative reporting on information security topics. Her research on such topical issues as Stuxnet and election security has brought critical technical issues to the public in a way that clearly shows why we must continue to push the security industry forward.

·       The Summit agenda will also include a presentation about the Shadow Brokers, the group that allegedly leaked National Security Agency cyber tools, leading to some of the most significant cybersecurity incidents of 2017. Jake Williams and Matt Suiche, who were among those targeted by the Shadow Brokers, will cover the history of the group and the implications of its actions.

·       All DFIR Summit speakers are industry experts who practice digital forensics, incident response, and threat hunting in their daily jobs. The Summit Advisory Board handpicked these professionals to provide you with highly technical presentations that will give you a brand-new perspective on how the industry is evolving to fight against even the toughest of adversaries. But don't take our word for it: have a sneak peek at some of the past DFIR Summit talks.


Immerse yourself in six days of the best in SANS DFIR training. Here are the courses you can choose from:

·       FOR500 – Windows Forensic Analysis

·       FOR585 – Advanced Smartphone Forensics

·       FOR610 – Reverse-Engineering Malware

·       FOR508 – Digital Forensics, Incident Response & Threat Hunting

·       FOR572 – Advanced Network Forensics: Threat Hunting, Analysis, and Incident Response

·       FOR578 – Cyber Threat Intelligence

·       FOR526 – Memory Forensics In-Depth

·       FOR518 – Mac and iOS Forensic Analysis and Incident Response

·       MGT517 – Managing Security Operations: Detection, Response, and Intelligence

3.    All courses will be taught by SANS’s best DFIR instructors. Stay tuned for more information on the courses we’re offering at the conference in a future article post.

4.     Rub elbows and network with DFIR pros at evening events, including networking gatherings and receptions. On the first night of the Summit, we’re going to gather at one of Austin’s newest entertainment venues, SPiN, a ping pong restaurant and bar featuring 14 ping pong tables, lounges, great food, and drinks. Give your overloaded brain a break after class and join us at our SANS Community Night, Monday, June 9 at Speakeasy. We will have plenty of snacks and drinks to give you the opportunity to network with fellow students.

5.     Staying to take a DFIR course after the two-day Summit? Attend SANS@Night talks guaranteed to enrich your DFIR training experience with us. Want to know about threat detection on the cheap and other topics? As for cheap (and in this case, that doesn’t mean weak), there are actions you can take now to make threat detection more effective without breaking the bank. Attend this SANS@Night talk on Sunday evening to learn some baselines you should be measuring against and how to gain visibility into high-value actionable events that occur on your systems and networks.

6.     Celebrate this year’s Ken Johnson Scholarship Recipient, who will be announced at the Summit. This scholarship was created by the SANS Institute and KPMG LLP in honor of Ken Johnson, who passed away in 2016. Early in Ken’s digital forensics career, he submitted to a Call for Presentations and was accepted to present his findings at the 2012 SANS DFIR Summit. His networking at the Summit led to his job with KPMG.

7.     Prove you’ve mastered the DFIR arts by playing in the DFIR NetWars – Coin Slayer Tournament. Created by popular demand, this tournament will give you the chance to leave Austin with a motherlode of DFIR coinage! To win the new course coins, you must answer all questions correctly from all four levels of one or more of the six DFIR Domains: Windows Forensics & Incident Response, Smartphone Analysis, Mac Forensics, Memory Forensics, Advanced Network Forensics, and Malware Analysis. Take your pick or win them all!

 

8.     Enjoy updated DFIR NetWars levels with new challenges. See them first at the Summit! But not to worry, you will have the opportunity to train before the tournament. You’ll have access to a lot of updated posters that can serve as cheat sheets to help you conquer the new challenges, as well as the famous SIFT WorkStation that will arm you with the most powerful DFIR open-source tools available. You could also choose to do an hour of crash training on how to use some of our Summit sponsors’ tools prior to the tournament. That should help give you an edge, right? That new DFIR NetWars coin is as good as won!

9.     The Forensic 4:cast Awards winners will be announced at the Summit. Help us celebrate the achievements of digital forensic investigators around the world deemed worthy of the award by their peers. There is still time to cast your vote. (You may only submit one set of votes; any additional votes will be discarded.) Voting will close at the end of the day on May 25, 2018.

10.  Come see the latest in tools offered by DFIR solution providers. Summit sponsors and exhibitors will showcase everything from managed services covering advanced threat detection, proactive threat hunting, and accredited incident response to tools that deliver rapid threat detection at scale, and reports that provide insights for identifying potential threats before they cause damage.

11.  Last but not least, who doesn't want to go to Austin?!? When you think Austin, you think BBQ, right? But this city isn't just BBQ; Austin has amazing food everywhere, and there's no place like it when it comes to having a great time. The nightlife and music include the famous 6th Street, which, by the way, is within walking distance of the Summit venue. There are many other landmarks such as Red River, the Warehouse District, Downtown, and the Market District. You will find entertainment of all kinds no matter what you're up for. Nothing wrong with some well-deserved play after days full of DFIR training, lectures, and networking!

As you can see, this is an event you do not want to miss! The SANS DFIR Summit and Training 2018 will be held at the Hilton Austin. The event features two days of in-depth digital forensics and incident response talks, nine SANS DFIR courses, two nights of DFIR NetWars, evening events, and SANS@Night talks.

The Summit will be held on June 7-8, and the training courses run from June 9-14.

We hope to see you there!

DFIR Summit 2016 – Call for Papers Now Open


The 9th annual Digital Forensics and Incident Response Summit will once again be held in the live music capital of the world, Austin, Texas.

The Summit brings together DFIR practitioners who share their experiences, case studies and stories from the field. Summit attendees will explore real-world applications of technologies and solutions from all aspects of the fields of digital forensics and incident response.

Call for Presentations – Now Open
More information 

The 9th Annual Digital Forensics and Incident Response Summit Call for Presentations is now open. If you are interested in presenting or participating on a panel, we’d be delighted to consider your practitioner-based case studies with communicable lessons.

The DFIR Summit offers speakers the opportunity for exposure and recognition as industry leaders. If you have something substantive, challenging, and original to offer, you are encouraged to submit a proposal.

Deadline to submit is December 18th, 2015.

Summit Dates: June 23 & 24, 2016
Post-Summit Training Course Dates: June 25-30, 2016

Submit now 

Timeline analysis with Apache Spark and Python

This blog post introduces a technique for timeline analysis that mixes a bit of data science and domain-specific knowledge (file-systems, DFIR).

Analyzing CSV formatted timelines by loading them with Excel or any other spreadsheet application can be inefficient, even impossible at times. It all depends on the size of the timelines and how many different timelines or systems we are analyzing.

Looking at timelines that are gigabytes in size or trying to correlate data between 10 different systems' timelines does not scale well with traditional tools.

One way to approach this problem is to leverage some of the open source data analysis tools that are available today. Apache Spark is a fast and general engine for big data processing. PySpark is its Python API, which in combination with Matplotlib, Pandas, and NumPy will allow you to drill down and analyze large amounts of data using SQL-syntax statements. This can come in handy for things like filtering, combining timelines, and visualizing some useful statistics.

These tools can be easily installed on your SANS DFIR Workstation, although if you plan on analyzing a few TBs of data I would recommend setting up a separate Spark cluster.

The reader is assumed to have a basic understanding of Python programming.

Quick intro to Spark

This section is a quick introduction to PySpark and basic Spark concepts. For further details, please check the Spark documentation; it is well written and fairly up to date. This section does not intend to be a Spark tutorial, and I'll barely scratch the surface of what's possible with Spark.

From spark.apache.org: Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.

Spark’s documentation site covers the basics and can get you up and running with minimal effort. The best way to get started is to spin up an Amazon EMR cluster, although that’s not free. You can always spin up a few VMs in your basement lab :)

In the context of this DFIR exercise, we’ll leverage Spark SQL and Spark’s DataFrame API. The DataFrame API is similar to Python’s Pandas. Essentially, it allows you to access any kind of data in a structured, columnar format. This is easy to do when you are handling CSV files.

Folks over at Databricks have been kind enough to publish a Spark package that can convert CSV files (with headers) into Spark DataFrames with a single line of Python code. Isn’t that nice? If you are brave enough, though, you can do this yourself with plain Spark and some RDD transformations.
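As a minimal sketch of that one-liner (assuming Spark 1.4+ with the spark-csv package available on the classpath; the timeline path is hypothetical), loading a log2timeline CSV into a DataFrame looks roughly like this:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="timeline-analysis")
sqlContext = SQLContext(sc)

# Read the CSV (header row included) into a DataFrame; every l2t_csv column
# (date, time, source, short, desc, ...) becomes a named, queryable field.
timeline = (sqlContext.read
            .format("com.databricks.spark.csv")
            .options(header="true", inferSchema="true")
            .load("/cases/example/supertimeline.csv"))  # hypothetical path

timeline.printSchema()
print(timeline.count())

Everything apart from the read/format/load call is standard PySpark boilerplate; in an interactive pyspark or IPython notebook session, the SparkContext and SQLContext are usually created for you.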

To summarize, you’ll need the following apps/libraries on your DFIR workstation:

  • Access to a Spark (1.3+) cluster
  • Spark-CSV package from Databricks
  • Python 2.6+
  • Pandas
  • NumPy

It should be noted that the Spark-CSV package needs to be built from source (Scala) and for that there are a few other requirements that are out of the scope of this blog post, as is setting up a Spark cluster.

Analysis (IPython notebook)

Now, let’s get on with the fun stuff!

The following is an example of how these tools can be used to analyze a Windows 7 system’s timeline generated with log2timeline.

You can find the IPython Notebook over on one of my GitHub repos.
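To give a flavor of the analysis itself, continuing from the DataFrame loaded above (a sketch only: the column names follow the standard l2t_csv layout, e.g. date, time, source, and short, and the filter values are made up for illustration), the queries tend to look something like this:

# Expose the DataFrame to Spark SQL as a temporary table.
timeline.registerTempTable("timeline")

# Which artifact sources dominate the timeline?
per_source = sqlContext.sql(
    "SELECT source, COUNT(*) AS events "
    "FROM timeline GROUP BY source ORDER BY events DESC")
per_source.show()

# Narrow to a single artifact type plus a keyword, then hand the much smaller
# result to Pandas for closer inspection or plotting inside the notebook.
hits = sqlContext.sql(
    "SELECT date, time, source, short FROM timeline "
    "WHERE source = 'WEBHIST' AND short LIKE '%Download%'")
hits_df = hits.toPandas()
print(hits_df.head())

The same pattern (SQL to cut the data down, Pandas/Matplotlib to inspect and visualize the remainder) scales from a single system's timeline to timelines from dozens of hosts.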

Is Anti-Virus Really Dead? A Real-World Simulation Created for Forensic Data Yields Surprising Results

One of the biggest complaints many have in the DFIR community is the lack of realistic data to learn from.  Starting a year ago, I set out to change that by creating a realistic scenario based on experiences from the entire cadre of instructors at SANS and additional experts who reviewed and advised the attack “script”.  We created an incredibly rich and realistic attack scenario across multiple Windows-based systems in an enterprise environment.  The attack scenario was created for the new FOR508: Advanced Forensics and Incident Response course.  Our main goal was to place the student in the middle of a real attack that they have to respond to.

The purpose is to give attendees of the new FOR508 real filesystem and memory images that they will examine in class to detect, identify, and forensicate APT-based activities across these systems.  The goal is to give students “real world” data to analyze, so they get a direct feel for what it is like to investigate advanced adversaries.

This past week, we ran through the exercise. I had a team of attackers mimic the actions of an advanced adversary similar to the APT.  Having seen APT tactics first hand, I scripted the exercise, but I also wanted to create a realistic environment that would mimic many organizations' enterprise networks.  My attack team (John Strand and Tim Tomes) quickly learned the difference between a penetration test and a test that duplicated typical APT actions.

Over the week, I learned some very valuable lessons by being able to observe the attack team first-hand.  More in future blog articles, but the first question I had on my list was: “Is A/V really dead?”

Is A/V really dead?

Over the years I knew that A/V could be circumvented, but it was not until I helped plan and execute this exercise that I was exposed to the truth first hand.  In many incidents over the years (including many APT ones), we and other IR teams have found that A/V detected signs of intrusions, but they were often ignored.  I expected at least some of those signs to exist this past week while running through the exercises we were creating.  I had hoped differently, but after a week of exploiting a network using the same APT techniques that we have seen our adversaries use, I think it paints a very dark picture of how useful A/V is in stopping advanced and capable adversaries.  This isn't an anti-A/V or anti-HIDS write-up, but it should give you something to think about when it comes to what we are blindly looking for.  I would never recommend someone go without A/V, as it still stops very basic and simple attacks, but it is clear that in order to find and defend against advanced adversaries we need to do more than rely on A/V.  A/V vendors claiming that they can defeat advanced adversaries such as the APT has been a challenge for them, and the anti-virus industry has not been very open about this very clear fact.

To be honest, I actually had some hope that some of the enterprise-level A/V and HIDS products would catch some of the more basic techniques we used (as I wanted the artifacts to be discovered by attendees), but A/V proved easy for my team to circumvent.  While I’m sure many of these products stop low-hanging-fruit attacks, we found that we basically did whatever we wanted without our enterprise-managed, host-based A/V and security suite sending up a flare.

What?  Nothing?

What is bundled into this suite?  Anti-virus, anti-spyware, safe surfing, anti-spam, device control, onsite management, and a Host Intrusion Prevention System (HIPS) are bundled into the McAfee Endpoint Protection Suite (http://shop.mcafee.com/Catalog.aspx). I also separately purchased their desktop host intrusion prevention product, built it into McAfee ePO, and deployed it across the environment as well.

To help understand how this might have happened, many have asked for the details of the network and the attack.

The Windows Based Enterprise Network:

The network was set up to mimic a standard “protected” enterprise network using standard compliance checklists.  We did not include any additional security measures that are usually implemented post-APT incident.  This was supposed to mimic a network at a “Day 0” compromise: not actively hunting, not using threat intelligence, whitelisting, and more.  However, we did have a substantial firewall, A/V, host-based IDS, automatic patching, and more.  We also had fairly restrictive users on it, but included some bad habits found in most enterprise settings (poor admin password policy, local admin accounts with the same password, an XP user with local admin rights).

  • Full auditing turned on per recommended guidelines
  • Users are restricted to standard user rights (they cannot even install a program if they wanted to)
  • Windows DC set up and configured by Jason Fossen (author of our Windows Security course); he didn't tighten down the network more than what is expected in real enterprise networks
  • Systems have real, in-use software installed (Office, Adobe, Skype, TweetDeck, email, Dropbox, Firefox, Chrome)
  • Fully patched (patches are automatically installed)
  • Enterprise incident response agents (F-Response Enterprise)
  • Enterprise A/V and on-access scan capability (McAfee Endpoint Protection – anti-virus, anti-spyware, safe surfing, anti-spam, device control, onsite management, Host Intrusion Prevention (HIPS))
  • The firewall allowed only inbound port 25 and outbound ports 25, 80, and 443.
  • The “APT actors” hit 4 of the systems in this enterprise network (Win2008R2 Domain Controller, Win7 64-bit, Win7 32-bit, WinXP).
  • Users have been “using” this network for over a year prior to the attack, so it has the look and feel of something real.  These users have set up social media (yes, they are on Twitter… you might be friends with them), email, Skype, etc. Each character has a backstory and a reason to be there working.

Bad habits we included and commonly see in most enterprise networks:

  • Local Admin User (SRL-Helpdesk) found on each system w/same password
  • A regular user with local admin rights on an XP machine.

Malware Used (non-public):

  • C2 Beacon – Port 80 C2 channel encoded in XMLRPC traffic, with a Meterpreter backend.  This malware beaconed out every 32 seconds to a specific IP address over port 80, looking like traditional web traffic.  The command and control channel was embedded in XMLRPC traffic.  For the command and control shell we decided to use a Meterpreter backend, as developing a new one was too costly and we found it was still “good enough” for our requirements.  The malware was detected by Microsoft Security Essentials due to its payload, but not by McAfee's products (I know, odd!).
  • C2 Channel – Custom Meterpreter-backed executable.  It connects out over port 80, has no persistence or beacon interval, and must be started manually to connect.

Malware Used (Public):

The evasion technique is pretty simple: wrap the executable in a Python script (you can also use Perl or Ruby), then insert it into a good executable or export it to a new one.

  • Poison Ivy – Straight export to Python Array.  Pretty sad that it worked actually.  This is where I had hoped to create some alerts that I would have had to suppress.
  • Psexec – Not malware
  • Radmin – No encoding needed.  Apparently this backdoor is OK?
  • mimikatz – No encoding.  Again, another place where I was hoping to have to suppress some alerts so we could find them in the “system forensics” piece of the exercise.

APT Attack Phases

This exercise and challenge will be used to show real adversary traces in network captures, host systems, memory, hibernation files/pagefiles, etc.  Hopefully the malware will also be used in FOR610 (malware analysis) for additional exercises, and the network captures in FOR558 (network forensics).

And through the week, none of the defenses we had put in place mattered whatsoever.  It was quite simple to evade any detection.  Our APT “team” consisted of John Strand and Tim Tomes.

  • Phase 1 – Spearphishing attack (with a signed Java applet attack – public) and malware C2 beacon installation (custom malware – encapsulated port 80 HTTP traffic and Poison Ivy)
  • Phase 2 – Lateral movement to other systems, malware utilities download, installation of additional beacons, and obtaining domain admin credentials
  • Phase 3 – Search for intellectual property, profile the network, dump email, dump enterprise hashes
  • Phase 4 – Collect data to exfiltrate and copy it to a staging system; RAR up the data using a complex passphrase
  • Phase 5 – Exfiltrate RAR files from the staging server, perform cleanup on the staging server

In the end, we will have created authentic memory captures on each box, network captures, and malware samples, in addition to full disk images with Restore Points (XP) and VSS (Win7 and Win2008).

Why did we choose McAfee’s product? 

I have seen a lot of enterprise-managed A/V and HIPS suites, and none of them have fared well against APT actors and malware.  It is too easy to obscure the malware to avoid detection, so any A/V choice here (McAfee, Symantec, etc.) would have yielded similar results.  And that matches what I and many others have witnessed when seeing these products at locations where APT adversaries roamed freely for months before detection.  In the end, it really would not have mattered to choose product X over Y.  We wanted to select a product where most attendees of FOR508 would feel at home when performing incident response.

In order to set up a realistic environment, I wanted to go with one of the product choices that is implemented in the most environments, so that attendees could easily “identify” with their own enterprise networks through the lens of this exercise.  I asked the SANS advisory board for their recommendations in late August 2011 and found that most seemed to lean toward McAfee ePO.  This made sense to me, as the DoD Host Based Security System (HBSS) also implements functionality similar to that found in the A/V products we ended up using.

Furthermore, when we installed the product we did not tighten the configuration options beyond the default settings it had chosen for itself.  I literally mimicked an admin purchasing a product, installing it, and crossing my fingers hoping it worked correctly.  In most environments we have investigated, this was typical (with standard out-of-the-box settings).  We also came to find out that the .dat files were not automatically updated; the system apparently needed a bit of care and feeding.  Having said that, the attack team verified during the test that the malware used (public and proprietary) evaded detection using the latest .dat files as of the week of 2 April 2012.

We uploaded all the log files for their team to review so they could get a better sense of what we had installed, determine whether anything was incorrectly applied, and provide additional feedback. We used their analysis tool, WebMER (http://mer.mcafee.com/enduser/downloadlatestmer.aspx), to collect the log files, operational parameters, and more from each system.  McAfee verified via a conference call that the product was installed correctly and operational during the attack, but noted that some settings could have been implemented that would have slowed the speed of the advance, though not stopped it.

Their team determined we could have implemented several key things to help slow the attack:

From McAfee — snip —

Old DAT 
SANS system’s show they are using DAT version 6498, this DAT is more than 200 days old and we recommend they update the DAT. Standard deployment is daily update of DATs.
End From McAfee —-snip—–
(Author note about the old DAT:  We were not able to verify the old DAT claim, as the product was installed with all defaults enabled, including automatic updates.  When we tested the malware, the attack team told me they tested on systems that were 100% up to date with McAfee installed.  So even if the DAT was older, it didn't matter, as nothing was detected either way, new or old DAT.  After seeing the malware used in this attack, which we shared with them, McAfee even admitted that their current DAT would not have hit on those samples the way we had them configured.)
From McAfee — snip —
VSE’s Access Protection
VSE’s Access Protection can mitigate the threat that non-public malware poses. There is a large list of pre-defined Access Protection Rules that an administrator would selectively enable as part of the deployment process depending on the systems and environment VSE is being deployed to.
 
Here are some of the pre-defined Access Protection Rules that would mitigate or at least alert on the threat posed by the approach SANS used. The first one is recommended for all systems. The latter two are recommended for servers and desktops where the usage scenario is more controlled or specific. In other systems they would typically just be marked for “report” and not “block”.
  • Prevent Windows Process spoofing – enable for BLOCK/REPORT
  • Prevent all programs from running files from the Temp folder – enable for REPORT
  • Prevent creation of new executable files in the Windows folder – enable for REPORT

End From McAfee —-snip—–

If we had enabled these protections, they would not really have stopped the attack, but they would have created logs.  I actually regret not having implemented them prior to the test exercise, as it would have been great to have someone see those logs during the investigation.

Overall, the point of the exercise was not to embarrass anyone.  I wanted to come as close to “real” as I could get.  As a result, we knew we had to include real-world implementations of some of the best tools money can buy.  In the end, this isn't about trying to shame anyone or pick on an A/V vendor.  It is about reporting “What happened?” and “What did we notice?”  Hopefully everyone learns something from the exercise and benefits from it.

Some IR/Forensic Results from the Attack

Here are some results from some quick forensic analysis of these machines:

Timeline Analysis of Spearphishing Attack

 

Memory Analysis (Quick hits using Redline without analysis – Yes… for in-depth analysis I would also use Volatility)

Conclusion:

We used a combination of custom-crafted malware and well-known malware such as Poison Ivy, Metasploit, and more. We used simple A/V evasion to get around it, and we NEVER turned it off. RESULT: NOT A PEEP from A/V.  Yes, it was installed correctly, as it did quickly detect and kill the un-armored Metasploit payload (a test to make sure it DID in fact work, as I became worried it really didn't work or was set up wrong).  I would gladly let anyone from McAfee look at our setup to make sure we didn't make a mistake, but I followed their guide to the letter and used recommended settings when installing the product (they took us up on that, and we sent in the logs from all 4 systems).  I also have found a lot of clients with incorrectly installed enterprise products, so it is clearly possible I munged something up during the install.  If we are wrong, then we are wrong, and we can go back and run through it again after we apply their suggestions, as we have the environment snapshotted inside an ESX server.  I was actually anticipating it would find at least ONE thing we did.  Nothing was found.

If anyone needs just a little proof that A/V products mainly defend against low-skilled attackers, there it is. I asked the attack team to use skills learned in most “penetration testing” courses.  They didn't use anything really advanced, which is one of the reasons many argue that even the “Advanced Persistent Threat” isn't really that advanced.  We even made many mistakes during the attack.  Even then… nothing was found and nothing was automatically blocked.  If this were a real compromise, we could have been on this network for months or years before anyone found us.  Just like in the real world.

Digital Forensics Case Leads: Bulk_extractor how-to, Verizon Report, FTK review, China prime suspect in RSA and other incidents

In this week’s edition of Case Leads we have a how-to for Bulk_extractor’s find feature, first impressions on the new database options in FTK, an extension for log2timeline for parsing the cache in Firefox, the Verizon data breach report, and statements by current and former US government officials about Stuxnet and China.

If you have an item you’d like to contribute to Digital Forensics CaseLeads, please send it to caseleads@sans.org.

Tools:

  • Bulk_extractor is a tool that is periodically mentioned on the blog.  Simson Garfinkel posted a brief how-to that demonstrates the use of bulk_extractor for finding keywords in a disk image.  The post explains why bulk_extractor is better (in some cases) than strings and grep (part of the reason is that bulk_extractor parses compressed files).
  • FTK 4.0 by AccessData has received some attention as it now provides the option of using PostgreSQL over Oracle.  This article captures some of the first impressions of that switch.

Good Reads:

  • Verizon released its annual Data Breach Investigations Report covering 2011.  (There is also an archive of previous years' reports.)  The information in these reports can be useful in honing and measuring your organization's approach to security.  As an example, the reports typically measure or estimate how long it took to penetrate an organization and how much time elapsed before the organization detected the attack.  That type of information can be used to gauge a SOC or to establish a log retention policy.
  • This could also be filed under “Tools,” but it's certainly a good read if your investigation involves Firefox and malware.  The article addresses an extension to Kristinn Gudjonsson's log2timeline application that enables it to better parse the Firefox cache.

News:

Levity:

Coming Events:

Call For Papers:

 

Digital Forensics Case Leads is a (mostly) weekly publication of the week’s news and events relating to digital forensics. If you have an item you’d like to share, please send it to caseleads@sans.org.

Digital Forensics Case Leads for 20120330 was compiled by Ray Strubinger. Ray regularly leads digital forensics and incident response efforts and when the incidents permit, he is involved in aspects of information security ranging from Data Loss Prevention to Risk Analysis.

Digital Forensics SIFT’ing: Cheating Timelines with log2timeline

Hopefully at one point in time everyone has experienced the enjoyment of a teacher who allowed them to use a “cheat sheet” on a test. For the unfamiliar, the concept is simple: take an 8.5 x 11” piece of paper, cram as much information as you can on both sides, and use it as an open reference for a test. The key was not only to put as much information as you could fit on the two-sided document, but for that information to be neatly organized and readily accessible so you could quickly reference it and articulate answers before the test clock ran out.

Without question, it can be challenging to memorize commands, and too time-consuming at other times to search through #DFIR resources (online resources, books, notes, contacts, etc.) to answer questions like “Is there an alternative to mounting a split .E01 image in the SIFT Workstation if mount_ewf.py fails?” or “How do I create a grep statement that shows me all sources in a timeline?”

It was not long until I found myself taking the “cheating” outside of school and into my #DFIR career. Within months I found it instrumental to create cheat sheets for all types of tools and processes, including imaging using dc3dd, grep expression examples, exporting mailboxes using Microsoft Exchange cmdlets, and more. At first I thought they were a great personal resource, but then everyone who saw them wanted a copy! I found that beginners used them as guides, and experts liked them as a reference for commands they rarely used.

As a novice user of “off the shelf” forensic products, I naturally gravitated to the SANS SIFT Workstation when I heard about its capabilities (and NO cost!). It was great to see open source initiatives in the #DFIR community, such as log2timeline, that had features that would otherwise only be expected from expensive off-the-shelf products.

After reading Rob Lee's blog post titled “How to Make a Difference in the Digital Forensics and Incident Response Community,” I thought to myself: perhaps if I created a cheat sheet for log2timeline, it would make a difference? You be the judge. At the #SANS360 event in DC, I released what will hopefully be one of many cheat sheets to come.

  • On the front side there is a basic checklist of items to consider when building an analysis work plan prior to performing computer forensic analysis.

  • On the back there is a simple workflow for how to use SIFT and log2timeline to produce, filter, and review timelines.

>>>> Download the PDF version of this cheat sheet (Right-click and choose Save As)

Note: It’s intended to be printed in color, double-sided and laminated. Credits to Ed Goings, Rob Lee, Kristinn Gudjonsson, and SANS for content.

About author:

David Nides is a Senior in KPMG's Forensic Technology Services practice in Chicago, IL. He currently plays a lead role in KPMG's national Incident Response team, consulting with clients globally on APT, data breach, and other cyber crime investigations. You can follow David on Twitter @davnads or at his forensic blog.

Digital Forensic SIFTing: SUPER Timeline Creation using log2timeline

This is a series of blog articles that utilize the SIFT Workstation. The free SIFT Workstation, which can match any modern forensic tool suite, is also directly featured and taught in SANS' Advanced Computer Forensic Analysis and Incident Response course (FOR508). SIFT demonstrates that advanced investigations and responding to intrusions can be accomplished using cutting-edge open-source tools that are freely available and frequently updated.

The SIFT Workstation is a VMware appliance, pre-configured with the necessary tools to perform detailed digital forensic examination in a variety of settings. It is compatible with Expert Witness Format (E01), Advanced Forensic Format (AFF), and raw (dd) evidence formats.

Super-Timeline Background

I first started teaching timeline analysis back in 2000, when I first started teaching for SANS.  In my first SANS@Night presentation, given in December 2000 at what was then called “Capitol SANS,” I demonstrated a tool I wrote called mac_daddy.pl, based off the TCT tool mactime.  Since that point, every certified GCFA has answered exam questions on timeline analysis.

We have reached a new resurgence in timeline analysis thanks to Kristinn Gudjonsson and his tool log2timeline.  Kristinn’s work in the timeline analysis field will probably change the way many of you approach cases.

First of all, all of these tools are found in the SIFT Workstation, ready to go out of the box, but you can keep them up to date via Kristinn's website, www.log2timeline.net.  Kristinn's tool was also added to the FOR508: Advanced Computer Forensic Analysis and Incident Response course last year and has already been taught to hundreds of analysts who are now using it in the field daily.

Kristinn’s log2timeline tool will parse all of the following data structures and more by AUTOMATICALLY recursing through the directories for you, instead of you having to accomplish this manually.

This is a list of the currently available formats log2timeline is able to parse.  The tool is constantly updated, so to get the current list of available input modules you can have the tool print one out:

# log2timeline -f list

Artifacts Automatically Parsed in a SUPER Timeline:

How to automatically create a SUPER Timeline

log2timeline recursively scans through an evidence image (physical or partition) and extracts timestamp data from every artifact the tool supports (see the artifacts above).  This tutorial will step a user who is interested in creating their first timeline through the process from start to finish.

Step 0 – Use the SIFT Workstation Distro

Download Latest SIFT Workstation Virtual Machine Distro: http://computer-forensics.sans.org/community/downloads

It is recommended that you use VMware Player for PCs and VMware Fusion for Macs.  Alternatively, you can install the SIFT Workstation in any virtual machine, or directly on hardware, using the downloadable ISO image.

Launch the SIFT Workstation and log in to the console using the password “forensics”.

Step 1 – Identify your evidence and gain access to it in the SIFT Workstation

The evidence file used in this example is the publicly available nps-2008-jean.E01 image.

Link your evidence files to the SIFT Workstation through the Windows file share that is enabled by default on the /cases directory.  You can plug in a USB drive, mount a remote drive share, or copy the evidence to your /cases/ directory.

Note that the SIFT Workstation is designed with a separate drive for the /cases directory to allow for a larger virtual drive; you can also connect an actual hard drive and mount it at the /cases directory.

Open in Explorer \\siftworkstation\cases\EXAMPLE-DIR-YYYYMMDD-###

If your evidence is an E01, use the previous article linked below to mount it correctly inside the SIFT Workstation.  If your evidence is RAW, go ahead and skip to Step 2.  Access to the raw image is required, as log2timeline cannot parse E01 files… yet.

http://computer-forensics.sans.org/blog/2011/11/28/digital-forensic-sifting-mounting-ewf-or-e01-evidence-image-files

  • $ sudo su –
  • # cd /cases/EXAMPLE-DIR-YYYMMDD-####/
  • # mount_ewf.py nps-2008-jean.E01 /mnt/ewf
  • # cd /mnt/ewf

Note: the commands entered by the forensicator are shown in the list above.

Step 2 – Create The Super Timeline

Manual creation of a timeline is challenging and still requires some work to get through.  We have included in the SIFT Workstation an automated method of generating a timeline via the new log2timeline tool, which can simply be pointed at a disk image (raw disk).  Again, if you are examining an E01 or AFF file, please mount it first using mount_ewf.py or affuse, respectively.

Creating a Super Timeline requires you to know whether or not your evidence image is a Physical or a Partition Image.  A Physical image will include the entire disk image and can be parsed by the tool mmls to list the partitions.  A Partition image will be the actual filesystem (e.g. NTFS) and can be parsed by the tool fsstat to list information about the partition.

Once you have figured out whether you have a physical disk image or a partition image, choose the correct form of the command to run, with the correct time zone.

Critical Failure Point Note: There is confusion over what the time zone option (-z) is used for. The -z value is the time zone of the SYSTEM being examined. It is used to correctly convert time data that is stored in “local” time, and the tool will output the data in that same time zone.

Why the need?  Certain artifacts, such as setupapi.log files and index.dat files, store times in local system time instead of UTC.  Without telling log2timeline what the local system time zone is, it would slurp up the data from those artifacts incorrectly.  To correct this, log2timeline converts the data to UTC internally but then outputs the data back into the same -z time zone by default.

The output of your timeline will always be the same time zone as your –z option unless you specify a different time zone using the “BIG” –Z option.  This will allow you to convert a system time of EST5EDT to UTC output if you desire to compare computers from two different time zones in a single timeline.

If your time zone includes areas that observe daylight saving time, it is important to use the correct location with daylight saving time. For example, on the East Coast, the correct implementation of daylight saving for the time zone value would be EST5EDT. For Mountain Time it would be MST7MDT.  log2timeline has autocomplete enabled in the SIFT Workstation, so all you need to do is type -z [tab tab] to see all the available time zone options it will recognize. If you do not use this time zone setting correctly with daylight saving accounted for, any timeline data stored in local time will be analyzed incorrectly.  So, just to reiterate: using EST as your time zone will treat all timestamps as if they were EST.  Using EST5EDT (and other similarly named ones) will take daylight saving into account.

In summary: It is crucial that the -z option matches the way the system is configured to produce accurate results.
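Putting this together, a command for a partition image from an East Coast system mounted at /mnt/windows_mount might look like the following. This is only a sketch: the winxp list-file (covered in the next section), the paths, and the optional -Z UTC output conversion are illustrative and should be adjusted for your own case.

# log2timeline -r -z EST5EDT -Z UTC -f winxp /mnt/windows_mount -w /cases/EXAMPLE-DIR-YYYYMMDD-####/supertimeline.csv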

log2timeline LIST-Files

The list-files in log2timeline are vitally important to understand.  They can be used to specify exactly which log2timeline modules you would like to run against a given image.  There are some default list-files already built into log2timeline, found in the /usr/share/perl5/Log2t/input directory and all ending in .lst.  List-files are always used with the -f list-file option and -r (recurse directory), and log2timeline will automatically parse any artifact included in the chosen list-file that it finds in the starting directory or subdirectories of the location you are examining.

Once you understand that the .lst files are just a list of artifacts you would like to examine, it is fairly simple to add your own for any type of situation.  For example, you could add one for an intrusion investigation against an IIS web server by using only the artifacts mft, evt, and iis; a sketch of such a custom list-file follows below.  This will save you a lot of time, especially if there are a bunch of IIS log files on the system.
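As a sketch only (the list-file name webserver.lst is hypothetical, and this assumes the list-file format is simply one module name per line, as the description above suggests), creating and using such a list-file could look like:

# cat > /usr/share/perl5/Log2t/input/webserver.lst <<EOF
mft
evt
iis
EOF

# log2timeline -r -z EST5EDT -f webserver /mnt/windows_mount -w intrusion.csv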

Intermediate log2timeline LIST-Files usage

If you prefer not to make a list-file, log2timeline can take any number of processors on the command line, as long as they are separated by commas.

An example: -f winxp,-ntuser,syslog.  This will load up all the modules in the winxp input list-file, then add the syslog module and remove the ntuser one.

The same can be done here: -f webhist,ntuser,altiris,-chrome.  This will load up all the modules inside the webhist.lst file, add ntuser and altiris, and then remove the chrome module from the list.

Put together, a full command using the list-files option in the SIFT Workstation might look like the example below.
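This is only a sketch; the module combination reuses the example above, and the mount point and output path are hypothetical.

# log2timeline -r -z EST5EDT -f webhist,ntuser,altiris,-chrome /mnt/windows_mount -w /cases/EXAMPLE-DIR-YYYYMMDD-####/targeted-timeline.csv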

Now that we have an understanding of the basic functionality, it is best if we quickly take a look at some cases in where a targeted timeline could be used.

Case Study 1 – Intrusion Incident (IIS Web Server)

Perhaps you are working a case in which you have to examine a web server for possible residue of an attack.  You are not sure when the attack took place, but you would like to look at not only the system's MFT but also the IIS log files and the system's event logs.  This greatly reduces the amount of clutter in your timeline, as you already know an attack via the web would be found in these three places.

Mount your disk image correctly using the SIFT Workstation on /mnt/windows_mount.

Now build the commands to create your initial timeline:

Step 1 – Find the partition starting sector:   # mmls image.dd   (calculate the byte offset: starting sector * 512 = #####)

Step 2 – Mount the image for processing:   # mount -o ro,noexec,show_sys_files,loop,offset=##### image.dd /mnt/windows_mount

Step 3 – Add filesystem, IIS, and event log data:   # log2timeline -z EST5EDT -f mft,iis,evt /mnt/windows_mount -w intrusion.csv

Once you have run these three commands, your timeline is built.  It is best to now sort and filter the timeline using l2t_process, which is most effective when you bound it by two dates rather than looking at all times on the system.

Step 4 – Filter the timeline:   # l2t_process -b intrusion.csv > filtered-timeline.csv

For example, bounded to a date range:

# l2t_process -b /cases/EXAMPLE-DIR-YYYYMMDD-####/timeline.csv 01-16-2008..02-23-2008 > timeline-sorted.csv

Case Study 2 – Restore Point Examination

In this example, we are using log2timeline against a Windows XP system's Restore Points, looking for “Evidence of Execution” only.  This shows how you can use log2timeline to provide a targeted timeline of only a piece of the drive image instead of the entire system.  Here we can see, historically, the last execution time of many executables on each day a restore point was created.

# log2timeline -r -f ntuser -z EST5EDT /mnt/windows_mount/System\ Volume\ Information/ -w /cases/forensicchallenge/restore.csv

# kedit keyword.txt   (add userassist, runmru, LastVisitedMRU, etc.)

# l2t_process -b restore.csv -k keyword.txt > filtered-timeline.csv

What if a key part of our case was determining when Internet Explorer was last executed over time?  Using timeline analysis techniques like those shown above, it is now easy to see the last time IE was executed on each specific day.  Here you can easily track the execution of a specific program across multiple days, thanks to quick analysis of the Restore Point data (NTUSER.DAT hives) with log2timeline.

Case Study 3 – Manual Super Timeline Creation

In some cases, you might not want to use the full super timeline.  To understand what log2timeline is stepping through automatically, it might be useful to accomplish the same output by hand.  The following are the steps that log2timeline takes care of for us in a single command instead of three.

In Summary:

Timeline analysis is hard.  Understanding how to use log2timeline will help you engineer better solutions to unique investigative challenges.  The tool was built for maximum flexibility to account for the need for both targeted and overall super timeline creation. Create your own preprocessors for targeted timelines.  Use log2timeline to collect only the data you need, or use it to collect everything.

In the next article, we will talk about more efficient ways of analyzing the data collected with log2timeline.

CONGRATS!

You just created your first SUPER Timeline… now you get to analyze thousands of entries!  (Wha???)

In another upcoming article, I will discuss how to parse and reduce the timeline efficiently so you can analyze the data more easily.  SUPER Timelines pull a great deal of data from your operating system, but learning how to parse it into something usable is extremely valuable.  In my SANS360 talk, I will take this technique even further.  Of course, we go through all of these techniques in our full training courses at SANS, specifically FOR508: Advanced Computer Forensic Analysis and Incident Response.

Keep Fighting Crime!

Rob Lee has over 15 years of experience in digital forensics, vulnerability discovery, intrusion detection and incident response. Rob is the lead course author and faculty fellow for the computer forensic courses at the SANS Institute and lead author for FOR408 Windows Forensics and FOR508 Advanced Computer Forensic Analysis and Incident Response.

 

 

How to Make a Difference in the Digital Forensics and Incident Response Community

Over the years of teaching, I have found that there is no shortage of talent in our DFIR community.  There are so many individuals that are incredibly sharp, truly skilled, and solving critical cases for their organizations.

Sometimes we become so focused on solving cases that we forget we could figure out a way to share some of our talents back with the community.  I commend the many peers of mine who have started blogs and authored tools that truly make a difference.  In some cases, an individual has a lot of skill but needs an idea.  Many in the community can probably list multiple research projects they would love to tackle if given enough time.  But we simply don't have that extra time… so we share these ideas with others who might have a spare CPU cycle or two.

The main point?  I truly encourage you to reach out to individuals in the community and ask “What would be a great project for me to work on?”  or “What still needs to be researched?”  You will be surprised at how often the answer to that question will be much longer than you expected.  The work you might perform on that project could potentially change the entire digital forensics community.

The Kristinn Guðjónsson Story:

Kristinn is a good friend and someone I really respect in the community.  Back in the summer of 2009, Kristinn reached out to me and started a dialog about some parsing utilities he had created.  He had just started to blog and seemed to be interested in feedback.  I started to use some of his utilities and noticed that he was a fairly decent coder.  He was also looking for a project that he would be able to submit for the GIAC GCFA Gold Certification. (Kristinn's paper is here.) It was at this point that Kristinn was thirsty for a project and seemed eager to tackle something large, so I mentioned the initial idea of the Super Timeline tool to him.  Now, extending timeline analysis wasn't exactly new.  Brian Carrier and others such as Mike Cloppert had already started research on this topic (Mike in his GCFA Gold paper on Ex-Tip), but the work wasn't extended beyond what was initially written.  One key step occurred in October 2008, when Harlan and I began collaborating on timeline analysis and I asked him to modify his regtime.pl script to produce bodyfile output.  As a result, expanding timeline capabilities beyond the filesystem and registry was a key project that had not truly been tackled before.  What we needed was someone who had the time to dedicate to the research and development.  Enter Kristinn.

Kristinn’s reply was wonderful. In the very next email he was already looking at possibly using this project as the submission for his Gold Paper.  He began work right away, and log2timeline was born soon thereafter.  I'm still fortunate that Kristinn calls upon me for feedback from time to time, but the community is now the one reaching out to Kristinn to ask, “What needs to be done?”  And that is wonderful.  That is the way it should happen.

Kristinn’s email discussing whether or not this idea could be submitted for his GIAC Gold Certification.

Kristinn’s GIAC GCFA Gold paper was submitted, and his project has clearly changed the way many of us look at forensics.  Can the right research project change your future?  Yes.  Kristinn recently landed a job at a top US IT firm doing IR/forensic work after moving from Iceland.  Much of that is due to his research and work on a project like log2timeline and his contributions to the DFIR community.

Become the Next Kristinn

The main idea here is that there are many out there who want to contribute.  There are many research projects still left to explore.  There are many tools that have not been written yet and many papers that are simply questions at this point.  No, you do NOT need to program.  You do not need to write a thesis.  However, I would recommend that you join a DFIR mailing list and ask questions or share your thoughts.  If you want to be more formal, start a blog and discuss things that interest you.

If you are looking for a project or a research idea and are short on ideas, reach out to the DFIR community for input.  Email me or others you think might have some ideas.  Tweet @sansforensics to gather some ideas.  The field is still extremely new.  Many ideas are so time consuming that we cannot explore them properly ourselves.  There are also many small groups that work as development teams.  log2timeline has started to move in that direction with the creation of a Google Group development list (http://groups.google.com/group/log2timeline-dev).  There is also the incredible group that helps develop plugins and memory analysis support as part of the volatility community; you can contribute via their code page (http://code.google.com/p/volatility/).  There are many small groups like this, and individuals who would probably love additional collaboration.  If you need ideas of where you can help, ask.

Kristinn’s story of how log2timeline was created is a great example of this.  I’m not only happy that the tool was finally made, I’m truly happy with the new friend we made in the process.

Thanks for the hard work, Kristinn.  There are many bad guys looking at the inside of a cell due to your research, leadership, and work.  *hat tip*

Rob Lee has over 15 years of experience in computer forensics, vulnerability discovery, intrusion detection and incident response. Rob is the lead course author and faculty fellow for the computer forensic courses at the SANS Institute and lead author for FOR408 Windows Forensics and FOR508 Advanced Computer Forensic Analysis and Incident Response.

Log2timeline Plugin Creation

About a year ago, I needed to add an Apache log to a supertimeline I was working on.  I wrote a bash script to do this, as I was not familiar with perl at the time.  I later went back, learned some basics of perl, and converted it into my first log2timeline plugin.  Since then, I’m wrapping up my third plugin.

Before you begin writing your plugin, in addition to this post, it’s best to look through the Gold paper Kristinn Gudjonsson wrote.  This will give you a good understanding of how the tool works and should answer many of your questions about the architecture.  In this post, I’m covering how to create an OS X PLIST plugin for the tool, but the technique is the same for most files you’ll want to parse.

Getting Started

When writing the plugin, it is important to understand the file you are parsing.  You should understand all the different conditions that may generate different results in the file.  A few ways to figure this out: review the source code for the program, look for open source tools that already parse the file, and generate your own output from the program while trying to exercise every option that will write to the log.

In the Download.plist, I’ve found four different conditions: a normal completed download, a canceled download, a file that was downloaded and then deleted, and a file downloaded outside the user directory.  If the file you are parsing exists on multiple platforms, you’ll also need to check the format for each OS.

When I start working on a new plugin, I create at least two scripts.  One I call the master script; this is in the proper log2timeline format.  The other is a scratch script for all my testing.  I find it much easier to troubleshoot basic perl problems this way rather than troubleshooting through log2timeline.  I test the code logic in the scratch file, then move it to the final script.  Below is the initial scratch script I start with for each section of code I’m testing.


#!/usr/bin/perl
use strict;
use warnings;

my $file = $ARGV[0];
open( INPUT_FILE, '<', $file )
    or die "Could not open file: $!";

while ( my $line = <INPUT_FILE> ) {
    # stuff
}
close(INPUT_FILE);

The script above takes a filename as an argument, opens the file, does something with each line in the while loop, and then closes the file.  Replace the “stuff” comment with the code you want to test on the file contents.
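For example, if I were testing a date-matching regex for a log parser, the loop body might look something like this (the pattern here is just a throwaway illustration, not tied to any particular log format):

while ( my $line = <INPUT_FILE> ) {
    # Print only the lines that contain something that looks like a date.
    print $line if ( $line =~ /\d{4}-\d{2}-\d{2}/ );
}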

Step 1 Copy a template

The default install path for log2timeline is as follows:

OSX  /opt/local/lib/perl5/site_perl/5.12.3/Log2t/
Ubuntu /usr/share/perl5/Log2t/

Copy a template to base your plug-in on.

  • The author has created  a template file that is located in the source directory dev/template_for_input_logfile.pm.
  • If there is already a parser for a similar file type, I would start with that one and make the necessary changes as needed.

Since I’m creating a PLIST plugin, I’m going to base my plugin on the /opt/local/lib/perl5/site_perl/5.12.3/Log2t/input/safari.pm plugin that Hal Pomeranz created.

Step 2 Edit basic information

At the top of the file, fill in your information. Include the version of the plugin code and explain what the plugin is going to do.

Step 3 Name your plugin

Right after the comments you should fill in the name of your plugin. Mine is safari_download. This should also match the file name of the plugin which is safari_download.pm.

package Log2t::input::safari_download;

Step 4 Determine the initial packages you will need

The great thing about having this program written in perl is the large number of libraries already available to make life easier. Log2timeline also has a set of its own libraries that are available.

  • Common (use Log2t::Common;): Mostly used by the main tool itself. Provides information about where to find library files, the version number, etc. Some input modules load this library to use the “get_username_from_path” subroutine, which tries to extract a username from the path of the file (as the name clearly indicates).
  • Time (use Log2t::Time;): Used by most if not all input modules. This module provides multiple subroutines that take a date or a timestamp in various formats and return the timestamp in Epoch format. It also has subroutines to convert Epoch time to text.
  • BinRead (use Log2t::BinRead;): Used by most input modules that deal with binary files. This library was created to make it easier to read data from binary files.
  • Network (use Log2t::Network;): Very simple library; currently the only subroutine is get_icmp_text, which takes an ICMP type and code and returns a text value.
  • Numbers (use Log2t::Numbers;): Simple library that contains two subroutines, one to join together two numbers and another to round up a number.
  • Win (use Log2t::Win;): Used by input modules that parse Windows artifacts that might contain GUIDs. It contains a list of GUIDs that can be transformed into the default values of the software they represent.
  • WinReg (use Log2t::WinReg;): A library that registry modules use to extract deleted registry entries from a hive file.

In my plugin, I’m using the following.


use strict;
use Log2t::Common;
use Log2t::Time;

Step 5 Determine how to process the file

The first subroutine you need to modify is the new() routine. It is the default constructor for the module and it starts by running the parent’s class constructor (input.pm).

The parent class defines a few variables that can be changed in the new() routine if needed:


$self->{'multi_line'} = 1;
$self->{'type'} = 'file';
$self->{'file_access'} = 0;

The above values are the default ones and need not be defined unless you want to change them. To explain each variable, it is very important to understand how the main engine in log2timeline calls the input module.

The main engine starts by initializing the module and uses the values of these variables to adjust how it calls the module when parsing files. There are basically two methods of retrieving timestamps: either the engine asks only once and the module returns a hash that contains timestamp objects (explained later), or the engine calls the input module once for each timestamp there is.

The variable that defines this behavior is the ‘multi_line’.  If it is set to one, the engine will treat this as an ASCII file or a file that contains one timestamp per line. It calls the input module once for each line that contains a timestamp, until there are no more.  If  the ‘multi_line’ variable is set to zero, then there is only one call made to retrieve timestamps. The module should return a reference to a hash that contains timestamp objects.

In this plist parser, I will be using 0, since all of the XML elements need to be parsed at one time.
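As a rough sketch, the constructor for this plugin could look something like the following. The call to the parent class follows the description above, but the exact form may differ in your version of the framework, so treat this as an outline rather than a drop-in implementation:

sub new
{
    my $class = shift;

    # Run the parent class (input.pm) constructor first.
    my $self = $class->SUPER::new();

    # The whole plist is parsed in a single call, so turn off the
    # one-timestamp-per-line behavior.
    $self->{'multi_line'} = 0;

    bless( $self, $class );
    return $self;
}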

Step 6 Plugin Descriptions

In sub get_description, enter a short description of what the plugin does. This will be displayed when you run ( log2timeline -f list ).


sub get_description
{
    return "Parse the contents of a Safari Download.plist file";
}

In get_help, enter a longer description of how the module works and what it does. This will be displayed when you run (log2timeline -f safari_download -h).


sub get_help
{
    return "Usage: $0 -f safari_download ... -- [-u username] [-h hostname]

This plugin parses the content of Download.plist, a property list
file containing Safari download history.  On Mac OS X systems,
this file is typically in /Users/<username>/Library/Safari";
}

Step 7 Determine the format of the file

This is where you actually start doing some work. Verify is the subroutine that runs and checks whether the file is in the correct format to parse.  This needs to be very specific; if other files also meet the same criteria, they will be parsed incorrectly. For a normal log file, you will need to set up a regex that matches the line format of the file and then check to make sure the data is valid.

Binary Plist Files

Plist files can be either binary or XML, so you’ll need to do something a little different.  To see what Hal did for the safari plugin, we need to look at the format of the History.plist file.

Look at the first line of the file

#cat /Users/twebb/Library/Safari/History.plist |head -n1
??list00_WebHistoryFileVersion_WebHistoryDatesPUtitle_lastVisitedDateZvisitCountQDQW_http://www.apple.com/startpage/]Apple - Start[322428070.5?

Above we can see how the file starts (the bplist00 header followed by the WebHistoryFile string). This may not be unique to the file type, so you’ll need to do some testing on similar files, but in this case it is unique. Now let’s look at Hal’s code.

read($self->{'file'}, $buf, 32);

Log2timeline is feeding the file to his plugin, and he is reading the first 32 bytes into a variable called $buf.


unless ($buf =~ /^bplist00.*WebHistoryFile/) {
    $return{'msg'} = 'Does not appear to be a History.plist file';
    return \%return;
}

Now that we have data in the variable $buf, we need to see if the file matches what we expect.  We take the contents of $buf and see if it matches the regex /^bplist00.*WebHistoryFile/.  If you look at the file header shown above, it does match the beginning of the file.  If it does not match, the error is returned.

XML Plist Files

The web history plist is a binary plist file, whereas the download history plist is a standard XML file, but the same perl library will parse both types in the same way.

#cat /Users/twebb/Library/Safari/Downloads.plist |head -n5

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>DownloadHistory</key>

What makes this file unique is the DownloadHistory key on the 5th line of the file.  We will need to read the start of the file and check to see if DownloadHistory exists. Let’s start writing the subroutine verify.


sub verify
{
my $self = shift;
my $buf = undef;

my %return = ('success' => 0, 'msg' => 'No such file or directory');
return \%return unless (-f ${$self->{'name'}});

The first portion above just sets up default variables. $buf is the name of the variable we will read data into for analysis.


my ($temp, $line);

for ( my $i = 175; $i < 200; $i++ )       # Loop over bytes 175 through 199
{
    seek($self->{'file'}, $i, 0);         # Go to byte $i
    read($self->{'file'}, $temp, 1);      # Read one byte into $temp
    $line .= $temp;                       # Append the byte to $line
}

unless ($line =~ /DownloadHistory/) {     # Search the string for DownloadHistory
    $return{'msg'} = 'Does not appear to be a DownloadHistory.plist file';
    return \%return;
}

The code above is the main portion where the match happens. We start reading the file at byte 175 and stop at byte 200. If that range contains /DownloadHistory/, the file will be parsed.  You want this portion of the script to be as fast and exact as possible; when using timescanner with this plugin, every file will be scanned using this part of the code. The faster, the better.


$return{'success'} = 1;
$return{'msg'} = 'Success';
return \%return;
}

The $return{'success'} value tells the main log2timeline program that this is a valid file to parse.  If you return the value 0, it will not parse the file.

The $return{'msg'} value will be displayed to the user. For error messages, the more descriptive, the better.

If the match works, the subroutine returns 1 to the main program along with the 'Success' message.

Step 7.1 BinRead Library

The BinRead library is set up to make reading files easier. It is able to support both ASCII and binary files. Below are the details of the library.

$line = Log2t::BinRead::read_ascii_until( $self->{'file'}, \$ofs, "\n", 100 );
read_ascii ( \*FH, \$ofs, $length )

This function returns an ASCII string of length $length read from the binary file FH (accepts FH as a reference to a typeglob of the filehandle).  The $ofs variable dictates where in the binary file the string starts.  The offset is passed as a reference because it is increased as each character is read (so the offset will be $ofs+$length at the end of the function).

read_ascii_end ( \*FH, \$ofs, $max )

This function returns an ASCII string of maximum length $max from the binary file FH (accepts FH as a reference to a typeglob of the filehandle), stopping earlier if the end of a string or a null character is seen.  The $ofs variable dictates where in the binary file the string starts.  The offset is passed as a reference because it is increased as each character is read (the offset will point to the end of the string when the function returns).

read_8 ( \*FH, \$ofs )

This function reads 8 bits, or one byte, from the file FH (accepts FH as a reference to a typeglob of the filehandle) and returns it according to the endianness set for the file (the default is little endian). The offset is then increased by one.

Program using the BinRead library

You can replace the for loop in the earlier code, for( my $i=175; $i < 200; $i++ ), with the BinRead library for cleaner code.

my $ofs = 175;
my $line = Log2t::BinRead::read_ascii( $self->{'file'}, \$ofs, 200 );

unless ($line =~ /DownloadHistory/) {     # Search the string for DownloadHistory
    $return{'msg'} = 'Does not appear to be a DownloadHistory.plist file';
    return \%return;
}
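The other BinRead calls documented above follow the same pattern. As a quick illustration (the offset and length here are arbitrary and not tied to any real file format):

my $ofs = 0;

# Read the first byte of the file, e.g. to check a magic value.
my $first_byte = Log2t::BinRead::read_8( $self->{'file'}, \$ofs );

# $ofs has been advanced to 1; read the next 7 bytes as an ASCII string.
my $magic = Log2t::BinRead::read_ascii( $self->{'file'}, \$ofs, 7 );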

Step 8 Init and File Location


sub init
{
    my $self = shift;

    # Try really hard to get a user name
    unless (defined($self->{'username'})) {
        $self->{'username'} = Log2t::Common::get_username_from_path(${$self->{'name'}});
    }

    return 1;
}

The engine calls the init subroutine after the file has been verified and before the file is parsed. For many files, this subroutine will not be needed and can be skipped or removed from the script.  If the file you are planning to parse is under the user's directory, you may want to include the code above; log2timeline will then try to parse the username from the file path.

The init section can also be used to setup other items. In my generic_linux plugin, I use it to calculate the last modified date for the syslog file. This is due to syslog not including the year along with month, day and time within the message.
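As a rough sketch of that idea (assuming the file's last-modified time is a good enough hint for the missing year; the 'syslog_year' key is just an illustrative name, not part of the framework):

sub init
{
    my $self = shift;

    # stat the file being parsed and grab its mtime (element 9 of stat).
    my $mtime = ( stat( ${$self->{'name'}} ) )[9];

    # Remember the year of the last modification so it can be used when
    # building timestamps from syslog lines that do not include a year.
    $self->{'syslog_year'} = ( localtime($mtime) )[5] + 1900;

    return 1;
}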

Step 9 get_time

Now that we know the file is valid, we need to actually parse it.  The subroutine get_time is where the magic happens.


my $self = shift;
my $Data = undef;       # Perl data structure produced from the plist file
my %container;          # the container that stores all the timestamp data
my $cont_index = 0;     # index into the container
my $objects;

eval { $objects = Mac::PropertyList::parse_plist_file($self->{'file'}); };

You’ll need to set up the variables for the parsing.  Then you’ll need to read in the file that is referenced in $self->{'file'}; this is handed to your plugin from the main log2timeline perl program.  In the instance above, the file is passed to the Mac::PropertyList library for parsing.


eval { $Data = $objects->as_perl; };

foreach my $ref (@{$$Data{'WebHistoryDates'}}) {
    # New %t_line structure.  Most of the basic information is fixed.
    $container{$cont_index} = {
        'source'     => 'WEBHIST',
        'sourcetype' => 'Safari history',
        'version'    => 2,
        'extra'      => { 'user' => $self->{'username'}, },
    };

The plist library returns what it processes in the variable $Data. The WebHistoryDates element is the main element that everything in the file branches off of, so the loop processes each item ($ref) stored under WebHistoryDates in $Data.
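To give a feel for how the rest of the loop might look, here is a rough sketch that pulls a timestamp and title out of each entry and finishes the container. The 'lastVisitedDate' and 'title' key names come from the History.plist dump shown earlier; the 978307200 offset is my own shortcut for converting Mac absolute time (seconds counted from 2001-01-01) to Unix epoch, and Hal's actual plugin may use the Log2t::Time helpers instead. The closing brace of get_time itself is omitted here.

    # Pull the timestamp and title out of this history entry.
    my $mac_time = $$ref{'lastVisitedDate'};       # Mac absolute time as a string
    my $date     = int($mac_time) + 978307200;     # assumed conversion to Unix epoch

    $container{$cont_index}->{'time'} = {
        0 => { 'value' => $date, 'type' => 'Last Visited', 'legacy' => 15 }
    };
    $container{$cont_index}->{'desc'}  = 'Visited ' . $$ref{'title'};
    $container{$cont_index}->{'short'} = $$ref{'title'};

    $cont_index++;
}

# Hand all of the timestamp objects back to the engine in a single call.
return \%container;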

TLINE STRUCTURE

This is what gets sent back to the main log2timeline program and generates the output that we all know and love.


# create the t_line variable
%t_line = (
    'time'       => { 0 => { 'value' => $date, 'type' => 'Time Written', 'legacy' => 15 } },
    'desc'       => $text,
    'short'      => $text,
    'source'     => 'PLIST',
    'sourcetype' => 'LOG',
    'version'    => 2,
    'extra'      => { 'user' => 'username extracted from line' }
);

Time:

  • Value - needs to be converted to epoch.
  • Type - what the time means, e.g. Last Visited or Time Written.
  • Legacy - MACB notation (8,4,2,1). This is a 4-bit value (see the example after this list):
  • 1 = Modify Time
  • 2 = Access Time
  • 4 = Create Time
  • 8 = Birth Time
  • If you want the entry listed under all four, add them up for 15.
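For example, an artifact that should only appear as Modified and Accessed would combine just those two bits:

'time' => { 0 => { 'value' => $date, 'type' => 'Time Written', 'legacy' => 1 + 2 } },   # Modify (1) + Access (2) = 3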

Description: This is what in the file we care about: what was accessed, downloaded, viewed, created, etc.

Short: A shorter description.

Source: A short description of where the data came from.

Sourcetype: A longer description of where the data came from. Some examples:

AV => anti-virus logs
EVT => Event Log
EVTX => Event Log (newer format)
EXIF => metadata
FILE => filesystem timestamp
LOG => log file

version: Version of the t_line format. Currently 2.

extra: Anything additional available from parsed data.

Testing

When you think you are ready to test, copy the file into the input directory under Log2timeline and give it a try.

#log2timeline -f (plugin) file

In my case I use:

#log2timeline -f safari_download /Users/webb/Library/Safari/Downloads.plist

You will need to make sure that your test file includes all the known values in the file. This will ensure that you are parsing all the data correctly.

Wrapping It Up

If you made it this far, then hopefully you have decided to create a plugin for this awesome tool. This is a great way to give back to the community and support open source. Special thanks to Kristinn Gudjonsson for help with clarifications in this post.  There is a Google group for log2timeline developers; if you're interested in working on plugins, please feel free to join.