Thursday, 19 January 2012

New Digital Detective Knowledge Base Launched

As a small company providing forensic software to both corporate and law enforcement customers, we strive to provide first-class support for our software.  To help us achieve this goal, we have taken a number of steps to improve the support we provide; in particular, we wanted to help our customers find the answers to their questions quickly.

We are pleased to announce the launch of our new, and much improved, Knowledge Base.  Each software product now has its own unique space which is fully searchable and full of rich, dynamic content such as technical articles, RSS feeds, blog posts, FAQs, Problem Solving and Tutorials.  Each knowledge base article can be exported as a PDF and is easily viewable within a web browser or on a mobile device.

 

Digital Detective Knowledge Base

 

Take a look for yourself – to get started, here are the main Product Spaces for NetAnalysis, HstEx and Blade:

 

Wednesday, 7 December 2011

Hit Counter Accuracy - Caveat Emptor!

Author: Paul Andrews, Head of Digital Forensics, Digital Detective Group

A frequent question when dealing with browser forensics is 'Does the Hit Count value mean that the user visited site 'x', on 'y' occasions?' Most browsers record a 'Hit Count' value in one or more of the files they use to track browser activity, and it is important that an analyst understands any potential pitfalls associated with the accuracy, or otherwise, of this value.

We recently received a support request from an analyst who was analysing Internet Explorer data. They had found a record relating to a Bing Images search, which showed a hit count of 911. The particular search string was significant, and very damning had it actually been used 911 times. The analyst wanted to know if the hit count value could be relied upon.

The following experiment was carried out in order to establish how this surprisingly high hit count value could have been generated. In order to obtain a data set which contained as little extraneous data as possible, a brand new VMware virtual machine was created. The machine was set up from the Microsoft Windows XP SP3 installation disc, which installed Internet Explorer v 6.0.2900.5512.xpsp.080413-2111 by default. Two user accounts were created on the machine: one to be used as an Admin account, for installing software etc.; and the other to be used as the ‘browsing’ account. This separation of the accounts further minimised the possibility of any unwanted data being present within the 'browsing' account. Using the Admin account, the version of Internet Explorer in use on the virtual machine was upgraded to IE v 8.0.6001.18702. The 'browsing' account was then used for the first time. Starting Internet Explorer immediately directed the user to the MSN homepage. The address ‘www.bing.com’ was typed into the address bar, which led to the Bing search engine homepage. The ‘Images’ tab was clicked. This Auto Suggested a search criterion of ‘Beautiful Britain’, as can be seen in the figure below:

 

Bing Image Search 1

Figure 1

The term 'aston martin' was then typed into the search box, as shown below:

 

Bing Search 2

Figure 2

None of the images were clicked or zoomed, nor was the result screen scrolled. Internet Explorer was closed, and the browsing account logged off. The Admin account was used to extract the browser data for processing in NetAnalysis. The image below shows some of the results. Both of these entries are from Master History INDEX.DAT files:

 

Figure 3 - NetAnalysis with IE Bing Results

Figure 3

As can be seen, both entries show a hit count of 5. Both of these pages were visited only once, so it is immediately apparent that the hit count value maintained by Internet Explorer may not be an accurate count of how many times a particular page has been visited. However, this still did not explain how Internet Explorer had produced a hit count of 911.

The virtual machine was started again, and the browsing account logged on. The previous steps were repeated: typing ‘www.bing.com’ into the URL bar; visiting the Bing homepage; and clicking on the ‘Images’ tab. Once again, Bing Auto Suggested the search criterion of ‘Beautiful Britain’, and displayed the same thumbnail results page. The search criterion ‘aston martin’ was again typed into the search box and the same thumbnail results page was produced. None of the images were clicked or zoomed. This time, however, the results page was scrolled using the side scroll bar, which generated more thumbnails as it went. Internet Explorer was closed, and the browsing account logged off. The Admin account was used to extract the browser data for processing in NetAnalysis. The image below shows some of the results. Both of these entries are again from Master History INDEX.DAT files:

 

Figure 4 - NetAnalysis showing 511 hit count

Figure 4

As can be seen, the ‘Beautiful Britain’ search now has a hit count of 13 - it is not at all clear how Internet Explorer determined this figure. Moreover, the ‘aston martin’ search now shows a hit count of 511. This page was not visited 511 times, nor were 511 of the thumbnail images clicked. The contents of the INDEX.DAT for the local cache folders (Content.IE5) were checked to see how many records were held relating to thumbnails that had been cached. The results were as follows:

 

Figure 5 - NetAnalysis showing 307 records

Figure 5

So it does not even appear that there are 511 thumbnails held in the local cache. The result page was scrolled quickly, so the user did not see a large proportion of the thumbnail images.

In conclusion, it is apparent that the ‘Hit Count’ maintained by Internet Explorer cannot be relied upon. Although this experiment involved a quite specific process relating solely to image searches carried out on one particular search engine, the disparity between results and reality makes it clear that unquestioning acceptance of what Internet Explorer is recording as a 'Hit Count' could lead to significant errors if presented in evidence.

To complete the experiment, two further identical Virtual Machines were created. On one, the Google Chrome browser (v 15.0.874.106 m) was installed and used. On the other, the Mozilla Firefox browser (v 8.0) was installed and used. The same steps were repeated: typing ‘www.bing.com’ into the URL bar; visiting the Bing homepage; and clicking on the ‘Images’ tab. The results from these processes are shown below:

Chrome:

Figure 6 - NetAnalysis with Google Chrome Search

Figure 6

 

Firefox:

Figure 7 - NetAnalysis with Mozilla Firefox Search

Figure 7

In this test, both of these browsers appear to maintain a more accurate 'Hit Count' than Internet Explorer.

Friday, 18 November 2011

NetAnalysis Foundation Training Announcement

Digital Detective Group is pleased to announce the launch of its all-new NetAnalysis™ Foundation training course.

NetAnalysis™ is one of the most highly regarded and accepted software tools for browser forensic analysis.  It is widely used in both public and private sectors and has become the industry leading software for the recovery and analysis of browser related artefacts.

The NetAnalysis™ Foundation training course will run at Learning Tree International, Euston House, 24 Eversholt Street, London, NW1 1AD on the following dates, with further dates being scheduled throughout the year.

 

Digital Detective Training Dates NetAnalysis Foundation Course

 

This 2-day course is competitively priced at just £830 + VAT per place.  To book a place or to check availability, please contact us on 0845 224 8892 (or our email address: sales (at) digital-detective.co.uk).

Friday, 4 November 2011

New User Manual for NetAnalysis v1.53

We are pleased to announce the release of the updated user manual for NetAnalysis v1.53.  It can be downloaded from here:

The manual contains updated information for both NetAnalysis and HstEx.

Friday, 7 October 2011

NetAnalysis v1.53 Released

This is an important release of NetAnalysis which fixes a number of issues in relation to the changes implemented in the latest browsers. 

All of the main browsers, Internet Explorer, Mozilla Firefox, Google Chrome, Apple Safari and Opera, have either made changes to their file formats or added additional new features.  These changes have necessitated a considerable amount of research, development and testing to add the required support.  This document outlines some of the changes with this release.

Apple Safari
This release has been tested with Apple Safari up to version 5.1.7354.50.  Safari has introduced a number of changes to the cache structure which are not supported in earlier versions of NetAnalysis.

Google Chrome
This release has been tested with Google Chrome up to version 14.0.835.202.  A modification in the cache in relation to the way digital certificates are stored introduced an error in NetAnalysis v1.52 when importing the cache.  This has now been resolved.

Microsoft Internet Explorer
This release has been tested with Microsoft Internet Explorer up to version 9.0.8112.16421.  Internet Explorer 9 introduced a new integrated download manager which stores the details of downloaded files in a new download INDEX.DAT file.  This file has a different structure to the standard INDEX.DAT files.  Figure 1 shows NetAnalysis 1.53 with a Download INDEX loaded.  You can see the original URL and Download Path columns.


Internet Explorer Downloads NetAnalysis

Figure 1


Mozilla Firefox
This release has been tested with Mozilla Firefox up to version 7.0.1.  Mozilla has been on a mission recently and has released versions 4 to 7 of its browser in a very short time frame.  Version 4 saw a significant change to the structure of the cache as well as the structure for storing cached files on disk.  We have also added support for the signons database.

Opera
This release has been tested with Opera up to version 11.51.  Opera is another browser which has made changes to the structure of their cache file and disk layout. 

Other New Features
In addition to the main five browsers, we have also tested this release against Sundial browser, version 4.0.1.  To see a full list of all the changes, please see the following:

Wednesday, 14 September 2011

Random Cookie Filenames

As forensic examiners will be aware, Microsoft Internet Explorer stores cached data within randomly assigned folders.  This behaviour was designed to prevent Internet data being stored in predictable locations on the local system in order to foil a number of attack types.  Prior to the release of Internet Explorer v9.0.2, cookies were an exception to this behaviour and their location was insufficiently random in many cases. 


Cookie Files

Generally, for Vista and Windows 7, cookie files are stored in the location shown below:

Microsoft Windows  Internet Explorer Cookie Location
\AppData\Roaming\Microsoft\Windows\Cookies\

Table 1

The cookie filename format was the user’s login name, the @ symbol and then a partial Hostname for the domain of the cookie. 


Digital Detective NetAnalysis Windows Cookies

Figure 1
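For examiners who want to split the older style of filename programmatically, here is a minimal sketch in Python.  It assumes the legacy files follow the pattern user@host[n].txt; the bracketed counter and the .txt extension are assumptions on my part rather than details stated above, and the function name is purely illustrative.

```python
import re
from typing import Optional, Tuple

# Assumed legacy pattern: <login name>@<partial hostname>[<counter>].txt
LEGACY_COOKIE_RE = re.compile(
    r"^(?P<user>[^@]+)@(?P<host>.+?)(?:\[\d+\])?\.txt$", re.IGNORECASE
)

def parse_legacy_cookie_name(filename: str) -> Optional[Tuple[str, str]]:
    """Return (login name, partial hostname), or None if the name does not fit."""
    match = LEGACY_COOKIE_RE.match(filename)
    if match is None:
        return None
    return match.group("user"), match.group("host")

# Hypothetical file names used only for illustration.
print(parse_legacy_cookie_name("jsmith@google[1].txt"))
print(parse_legacy_cookie_name("0A1B2C3D.txt"))
```

A name without the @ separator, such as a randomly generated one, simply returns None, which is a quick way to tell the two naming schemes apart in a mixed folder.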


With sufficient information about a user’s environment, an attacker might have been able to establish the location of any given cookie and use this information in an attack.

To mitigate the threat, Internet Explorer 9.0.2 now names the cookie files using a randomly-generated alphanumeric string.  Older cookies are not renamed during the upgrade, but are instead renamed as soon as any update to the cookie data occurs.   Figure 2 shows an updated cookie folder containing the new files.


Digital Detective NetAnalysis New Cookies Window 
Figure 2

This change has no impact on the examination of the cookie data itself.  However, it is no longer possible to identify which domain a cookie belongs to from the file name alone.

Monday, 11 July 2011

Digital Evidence Discrepancies - Casey Anthony Trial

Introduction

Over the past few weeks, there has been worldwide interest in the trial of Casey Anthony which was held in Orlando, Florida.  Anthony was indicted on charges of murder following the discovery of the body of her daughter Caylee Marie Anthony in 2008.  On Tuesday 5th July 2011, the jury returned a not guilty verdict and she was cleared of murdering her child.

Those of you who have followed this case and listened to the expert testimony may have been intrigued and possibly confused as to some of the alleged facts as the case unfolded. 

The digital forensic evidence in this case is of particular interest to me as it involved the recovery and analysis of a Mozilla Firefox history database.  The Internet history records within this database turned out to be extremely important to the prosecution case as the existence of Google searches relating to “chloroform” and other possibly relevant records prior to the child’s disappearance could have indicated premeditation.  This, of course, could have meant the difference between a conviction for murder in the first degree and manslaughter if found guilty.  The State of Florida also has the death penalty as a punishment option for capital crimes.

During a keyword search of Anthony’s computer, a hit was found for the word “chloroform”.  The hit was identified in what appeared to be a Mork database belonging to Mozilla Firefox.  The file was identified as residing in unallocated clusters, and rather surprisingly, is reported to have been intact.  Furthermore, all of the blocks belonging to the file were said to be contiguous. 

Mork Database

The Mork database structure used by Mozilla Firefox v1-2 is unusual to say the least.  It was originally developed by Netscape for their browser (Netscape v6) and the format was later adopted by Mozilla to be used in Firefox.  It is a plain text format which is not easily human readable and is not efficient in its storage structures.  For example, a single Unicode character can take many bytes to store.  The developers themselves complained it was extremely difficult to parse correctly and from Firefox v3, it was replaced by MozStorage which is based on an SQLite database.

Forensic Analysis

It is a matter of record that our software NetAnalysis (v1.37) was used during the initial examination of this data, and then at a later stage another tool was used.  This is, of course, good forensic practice and is often referred to as “dual tool verification”.

Within a Mork database, the timestamp information relating to visits is stored as a microsecond count from an epoch of 1st January 1970 at 00:00:00 hours UTC (Coordinated Universal Time).  In NetAnalysis v1.37, the forensic examiner had an option to leave the timestamps as they were recorded in the original evidence or to apply a bias to the UTC value to translate it to a local “Standard Time”.  In this older version, there was no option to present the timestamp as a local value adjusted for DST (Daylight Saving Time).  This changed in NetAnalysis v1.50 when a further date column was introduced which presented the examiner with UTC and local times adjusted for DST.
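To make the difference between a "Standard Time" bias and a DST-adjusted local time concrete, here is a minimal Python sketch.  It is not how NetAnalysis works internally.  The sample value is an assumption: it was computed from the timestamp 2008-03-21 19:16:13 UTC discussed later in this post, not copied from the evidence file.  The fixed offsets of -5 and -4 hours assume a machine set to US Eastern Time.

```python
from datetime import datetime, timedelta, timezone

# Mork timestamps count microseconds from this epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def mork_time_to_utc(microseconds: int) -> datetime:
    """Convert a Mork microsecond count to an aware UTC datetime."""
    return EPOCH + timedelta(microseconds=microseconds)

def apply_bias(utc_time: datetime, hours: int) -> datetime:
    """Shift a UTC datetime by a fixed whole-hour bias."""
    return utc_time.astimezone(timezone(timedelta(hours=hours)))

# Assumed sample value (derived, not read from the original file).
raw = 1206126973000000

utc_time = mork_time_to_utc(raw)
standard_time = apply_bias(utc_time, -5)  # Eastern Standard Time, DST ignored
daylight_time = apply_bias(utc_time, -4)  # Eastern Daylight Time, DST applied

fmt = "%Y-%m-%d %H:%M:%S"
print("UTC:          ", utc_time.strftime(fmt))       # 2008-03-21 19:16:13
print("Standard Time:", standard_time.strftime(fmt))  # 2008-03-21 14:16:13
print("DST adjusted: ", daylight_time.strftime(fmt))  # 2008-03-21 15:16:13
```

For a date in late March, the Standard Time output is one hour behind the clock the user would actually have seen, which is why an examiner might want DST-adjusted output.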

According to video footage of the trial testimony, the forensic examiner wanted the output to reflect local time and not standard time and tried another tool.  This second tool was unable to recover any records from the Mork file.  The forensic examiner then approached the developer during a training course and discussed the issues he was having with the software.  The developer of the second tool then reviewed the Mork database over a period of a few nights and corrected the problem.  That software then managed to recover 8,557 records (320 fewer than NetAnalysis was able to recover at the time).

Discrepancies between Forensic Tools

During testimony, the defence picked up on the fact that there were some major differences in the results produced by the two tools.  The defence assertion was that the initial results produced by NetAnalysis were in fact correct, and that the results from the second tool were flawed.  This was discussed at some length in the video testimony on 1st July 2011 when the forensic examiner was questioned regarding the differences.

According to CNN, Jose Baez, the lead counsel for the defence said:

“the state's computer forensic evidence involving chloroform research, a central element of their premeditation argument, was used to mislead the jury and that the flaws in that evidence infected their entire case like a cancer.” 

He pointed out the discrepancy between the first analysis the sheriff's office did that showed one visit to a website about chloroform and an analysis done later with a second program that appeared to show 84 visits.  However, according to Baez, the first report showed a progression that made it clear that the 84 visits were actually to MySpace.

This was a major discrepancy with critical digital evidence presented in an extremely serious trial.  As the software developer of NetAnalysis, I was extremely anxious to review the raw data and confirm the facts. 

The first time I was made aware of this case (and the discrepancy between the two tools) was around 9th June 2011.  To date, I have not been asked by any party representing the prosecution (or defence) to comment on the discrepancies between the tools.  I have, however, since the conclusion of the trial, obtained a copy of the recovered “History.dat” Mork database file.

Mork Database File

Using this data, I will walk through the deconstruction of the critical elements of the file and verify the evidence presented during the trial.  The file is 3,338,603 bytes in length and contains data from a Mork database.

Mork Database Header

Figure 1

The block in Figure 1 shows the definition of the database table holding the history data.  The definition identifies the fields in each row as: “URL”, “Referrer”, “LastVisitDate”, “FirstVisitDate”, “VisitCount”, “Name”, “Hostname”, “Hidden”, “Typed”, “LastPageVisited”, and “ByteOrder”.  Not all of these fields will be present in every history record.  Each field is allocated an integer value for identification purposes.  For example, the “URL” field has been allocated the value 82.

According to the Mozilla Developer Network, the model is described as:

“The basic Mork content model is a table (or synonymously, a sparse matrix) composed of rows containing cells, where each cell is a member of exactly one column (col). Each cell is one attribute in a row. The name of the attribute is a literal designating the column, and the content of the attribute is the value. The content value of a cell is either a literal (lit) or a reference (ref). Each ref points to a lit or row or table, so a cell can "contain" another shared object by reference.”

Deconstructing the Mork Database

To demonstrate how this works, and to validate the data, we will walk through a couple of examples.  As we have no access to the SYSTEM registry hive from the suspect system, we must assume the computer was correctly set to Eastern Time in 2008 during these visits (time zone verification is always one of the first tasks for the forensic examiner prior to examining any time related evidence).

Figure 2 shows a screen shot of NetAnalysis with the data loaded and filtered showing some of the records identified in the testimony from the trial.

NetAnalysis Screen with Mork Database Loaded

Figure 2

The first record (at the bottom of the screen) shows a visit to MySpace on 2008-03-21 15:16:13 (local time).  The visit count shows the value as 84.  The Mork record for this entry is shown in Figure 3.

Mork record 6E2F

Figure 3

The record is enclosed within square brackets and the individual fields for the record are enclosed within round brackets.  Each pair of round brackets contains a name/value pair.  Moving from left to right, the first block of data “-6E2F” identifies the Mork record ID (record ID values are not unique).  The first name/value pair shows (^82^B1).  Referring back to the Mork header in Figure 1, we can see that field 82 refers to the “URL” (Uniform Resource Locator).  The data for this field is stored in cell B1.  The data cell is enclosed in brackets as shown in Figure 4 (line 47).  The cell data shows (B1=http://www.myspace.com/).

Mork Field B1

Figure 4

Using the same methodology, we can see that field 84 refers to “LastVisitDate” and is stored in cell 27F42 as shown in Figure 5 (2008-03-21 19:16:13 UTC / 2008-03-21 15:16:13 Local Time).  This integer represents the number of microseconds from 1st January 1970, 00:00:00 UTC.

Mork Field 27F42

Figure 5

Field 85 refers to “FirstVisitDate” and is stored in cell BAF8 as shown in Figure 6 (2007-12-26 20:25:56 UTC / 2007-12-26 15:25:56 Local Time).

Mork Field BAF8

Figure 6

Field 88 refers to “Hostname” and is stored in cell 16F as shown in Figure 7.

Mork Field 16F

Figure 7

Field 87 refers to “Name” and is stored in cell DA as shown in Figure 8.

Mork Field DA

Figure 8

Further examination of the Index record in Figure 3 shows field 86.  This refers to the “VisitCount” and has been assigned the value 84.  This data is actually stored in the Index record itself and not in a separate cell.  If an Index record does not have a field 86, then the “VisitCount” is 1.  Once the visit count is 2 or above, field 86 is assigned a value.  The last field, 8A, refers to the “Typed” flag and has been assigned the value 1.  This is a Boolean field: 0 = False and 1 = True.
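The lookup process described above can be sketched in a few lines of Python.  This is an illustrative sketch only, not the NetAnalysis parser.  The sample row is reconstructed from the field-by-field description above rather than copied from the evidence file, so its exact punctuation is an assumption; the cell dictionary holds only the one cell value quoted in the text, and the sketch ignores Mork escape sequences such as an escaped closing bracket inside a value.

```python
import re

# Column IDs as named in the text (taken from the header in Figure 1).
COLUMNS = {
    "82": "URL", "83": "Referrer", "84": "LastVisitDate",
    "85": "FirstVisitDate", "86": "VisitCount", "87": "Name",
    "88": "Hostname", "8A": "Typed",
}

ROW_RE = re.compile(r"\[-?([0-9A-F]+)((?:\([^)]*\))*)\]")
FIELD_RE = re.compile(r"\(\^([0-9A-F]+)([\^=])([^)]*)\)")

def parse_row(row_text: str, cells: dict) -> dict:
    """Resolve one Mork row into a dictionary of named fields."""
    match = ROW_RE.fullmatch(row_text.strip())
    if match is None:
        raise ValueError("text is not a single Mork row")
    record = {"id": match.group(1)}
    for column, kind, value in FIELD_RE.findall(match.group(2)):
        name = COLUMNS.get(column, column)
        # '^' marks a reference to a data cell; '=' marks an inline literal.
        record[name] = cells.get(value) if kind == "^" else value
    # A row with no field 86 has been visited exactly once.
    record.setdefault("VisitCount", "1")
    return record

# Reconstructed sample row (an assumption, see the note above).
row = "[-6E2F(^82^B1)(^84^27F42)(^85^BAF8)(^88^16F)(^87^DA)(^86=84)(^8A=1)]"
cells = {"B1": "http://www.myspace.com/"}

record = parse_row(row, cells)
print(record["URL"], record["VisitCount"], record["Typed"])
```

Any cell reference that cannot be resolved comes back as None, which makes it easy to spot rows whose data cells are missing from the file.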

Decoded Record 6E2F

Figure 9

The data from this record has been gathered together in Figure 9.  The Name field relates to the Page Title and is stored in pseudo Unicode format with $00 representing 0x00 values. 
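As a rough illustration of how such a string can be decoded, the Python sketch below turns each $XX escape back into a byte and then decodes the result as UTF-16 (little endian).  The encoded sample is constructed for illustration and is not copied from the evidence file, and the choice of little-endian byte order is an assumption based on the $00 following each character.

```python
import string

HEX_DIGITS = set(string.hexdigits)

def decode_mork_text(encoded: str) -> str:
    """Decode Mork text in which $XX represents the byte 0xXX."""
    raw = bytearray()
    i = 0
    while i < len(encoded):
        pair = encoded[i + 1:i + 3]
        if encoded[i] == "$" and len(pair) == 2 and set(pair) <= HEX_DIGITS:
            raw.append(int(pair, 16))  # escaped byte, e.g. $00 -> 0x00
            i += 3
        else:
            raw.append(ord(encoded[i]) & 0xFF)  # plain single-byte character
            i += 1
    # Assumption: the resulting bytes are UTF-16 little endian.
    return raw.decode("utf-16-le", errors="replace")

# Constructed sample, not taken from the original file.
sample = "N$00e$00w$00 $00P$00a$00g$00e$00 $001$00"
print(decode_mork_text(sample))  # New Page 1
```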

According to the testimony during the trial, this record was not recovered by the second tool.

Visit Count Discrepancy

At various times during the trial, the prosecution referred to a visit to a page (“http://www.sci-spot.com/Chemistry/chloroform.htm”) which allegedly took place at 15:16:13 hours (local time) on 21st March 2008.  This record was recovered by the second forensic tool and indicated a visit count of 84.  This visit was as a result of a Google search for “how to make chloroform”. 

This evidence contradicts the data recovered by NetAnalysis which showed a single visit at 19:16:34 hours UTC (15:16:34 hours local time).  Figure 9 shows a visit to MySpace, which has been verified manually above, and shows 84 visits as of 21st March 2008 at 15:16:13 hours (local time).  This is the record highlighted in NetAnalysis in Figure 2.

The Mork record containing “http://www.sci-spot.com/Chemistry/chloroform.htm” is identified as record 174EF.  The Index record from the original file is highlighted and shown in Figure 10 below.

Mork Record 174EF

Figure 10

The entire record is contained within square brackets.  The highlighted line above shows the full record.  The first field 82 (“URL”) is stored in cell 27F4B, as shown in Figure 11.

Mork Field 27F4B

Figure 11

The second field 84 (“LastVisitDate”) is stored in cell 27F4C, as shown in Figure 12 (2008-03-21 19:16:34 UTC / 2008-03-21 15:16:34 Local Time).  Once again, this integer represents the number of microseconds from 1st January 1970, 00:00:00 UTC.

Mork Field 27F4C

Figure 12

The third field 85 (“FirstVisitDate”) is stored in cell 27F4C.  This is the same cell value as for (“LastVisitDate”) and indicates this is the first visit to this web site during the scope of the current recorded history.  The First and Last visit times are the same.

The fourth field 83 (“Referrer”) is stored in cell 27F49, as shown in Figure 13.

Mork Field 27F49

Figure 13

The referrer field is very interesting from a forensic point of view as it shows the referring page.  As the HTTP GET is sent to the web server for a page, the browser also sends the referring page as part of the request.  This allows web masters to log the route by which visitors land on their pages.  Mozilla Firefox records this information for each record.  It is therefore relatively easy to track the actions of a user from page to page.  In this case, the referring site was a Google search for “how to make chloroform”.  With this information (which NetAnalysis shows in the “Referral URL” Column) there really is no need to “guess” how a user arrived at a specific page.
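Pulling the search terms out of a referring URL is straightforward.  The Python sketch below uses the standard urllib.parse module; the referrer shown is a hypothetical example of a Google search URL and not the exact string recovered from the file, and the use of the "q" parameter to carry the search terms is an assumption about the search engine.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

def search_terms(referrer: str, param: str = "q") -> Optional[str]:
    """Return the search phrase carried in a referrer URL, if present."""
    query = parse_qs(urlsplit(referrer).query)
    values = query.get(param)
    return values[0] if values else None

# Hypothetical referrer, for illustration only.
referrer = "http://www.google.com/search?hl=en&q=how+to+make+chloroform"
print(search_terms(referrer))  # how to make chloroform
```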

The fifth field 88 (“Hostname”) is stored in cell 27F4D, as shown in Figure 14.

Mork Field 27F4D

Figure 14

The last field 87 (“Name”) is stored in cell 27F4E, as shown in Figure 15.  The decoded value for this string is “New Page 1”.

Mork Field 27F4E

Figure 15

Once again, I have gathered together the data for this record and presented it in a table format for easy review.  This can be seen in Figure 16.

Decoded Record 174EF

Figure 16

There are two critical points to make with this record.  Firstly, there is no field 86 (“VisitCount”) therefore this URL has only been visited once (not 84 times).  This is further corroborated by the fact that field 85 (“FirstVisitDate”) shows the exact same date/time as the “LastVisitDate”.  The second point is that the visit was recorded at 15:16:34 hours (local time) and NOT at 15:16:13 hours as was stated during the trial (from the report produced by the second forensic tool).

Validity of the Recovered File

With the release of NetAnalysis v1.50 (current version v1.52), the Mork database parser was completely re-written from scratch (as were the other parsing modules).  This was primarily to make the code easier to migrate and maintain and to ensure we were recovering as much data as possible.  I tested the current release of NetAnalysis v1.52 against the Casey Anthony data.  I know from manually examining the data that there are 9,075 individual Index records.  Loading the data into NetAnalysis resulted in 9,060 records being recovered.  This initially caused me some concern.  However, further examination of the data revealed that there was nothing to be concerned about: the difference is accounted for by 15 records which had missing “URL” cells; 14 of these records also had missing “LastVisitDate” cells.

If there are missing data cells within the file, this is a strong indicator that the file is not intact.
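A check of this kind is simple to automate.  The Python sketch below flags any Index record whose “URL” or “LastVisitDate” reference points to a cell that does not exist in the file.  The data structures and the miniature data set are hypothetical and exist only to show the idea; this is not the check performed by NetAnalysis.

```python
from typing import Dict, List, Tuple

def find_incomplete_rows(
    rows: Dict[str, Dict[str, str]],
    cells: Dict[str, str],
    required: Tuple[str, ...] = ("URL", "LastVisitDate"),
) -> Dict[str, List[str]]:
    """Map each row ID to the required fields whose data cell is missing."""
    incomplete = {}
    for row_id, refs in rows.items():
        missing = [name for name in required if refs.get(name) not in cells]
        if missing:
            incomplete[row_id] = missing
    return incomplete

# Hypothetical miniature data set: cell ID -> value, row ID -> cell references.
cells = {"A1": "http://example.com/", "A2": "1206126973000000"}
rows = {
    "1": {"URL": "A1", "LastVisitDate": "A2"},  # complete
    "2": {"URL": "FE", "LastVisitDate": "A2"},  # URL cell missing
    "3": {"URL": "FD", "LastVisitDate": "FC"},  # both cells missing
}

incomplete = find_incomplete_rows(rows, cells)
print(f"{len(rows) - len(incomplete)} of {len(rows)} rows are complete")
for row_id, missing in incomplete.items():
    print(row_id, "is missing", ", ".join(missing))
```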

Conclusion

There are a number of conclusions to be drawn from the digital evidence presented in this trial; however, I will leave this to the members of the digital forensic community.  Forensic tool validation is certainly at the forefront of our thoughts.  Whilst it may not be possible to validate a tool, it is possible to validate the results against known data sets.  If two forensic tools produce completely different results, this should at least warrant further investigation.

References