Time is valuable. Time spent on tedious tasks not only cuts into our productivity but can also have devastating effects on time-sensitive investigations.
We (examiners and investigators) rely on our mobile forensic tools to obtain, decode, and present data with speed and integrity. However, it is not possible for our tools to support every mobile device and app that exists. It is inevitable that we will need to manually process data from an extraction in order to obtain the evidence we need. That manual processing can require hours of forensic analysis.
That is where Python comes to the rescue! Python is a scripting language ideal for iterating through and parsing data. Once we identify a pattern, we can unleash the power of Python to automate processes and save countless hours of manual labor.
Here is an example involving a feature phone that no commercial mobile forensic tool on the market supported, and the hours of manual processing that would have been required without Python.
A “similar profile” was used to obtain a logical extraction of an LG feature phone. The similar profile did not decode the texts, calls, or contacts from the device. In fact, the only messages decoded using the similar profile were the auto reply messages that came with the phone. Fortunately, the file system was obtained, providing hundreds of files named “inbox####.dat” in the SMS/inbox folder of the device. It became apparent that these were incoming text messages that had not been decoded.
Not only were there hundreds of un-decoded incoming messages in the inbox folder, there were also dozens of files in the SMS/outbox folder labeled “outbox####.dat” that needed decoding as well.
Analyzing the inbox####.dat files
The inbox and outbox files were binary files and had to be analyzed using a hex viewer. Viewing one of the inbox####.dat files in hex revealed the following:
When analyzing files like these, the key is to look for patterns. In this case, a single byte at offset 205 gave the length of the text message, and the message itself started at offset 206. The timestamp was 4 bytes long and was found at offset 12. The sender’s phone number was 10 bytes long and was found at offset 515.
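For readers who want to inspect offsets like these from Python rather than a dedicated hex viewer, here is a minimal sketch of a hex-dump helper. The function name `hex_dump` and the sample bytes are hypothetical and only illustrate the idea; they are not taken from the original script.

```python
def hex_dump(data: bytes, start: int = 0, length: int = 64, width: int = 16) -> str:
    """Return a hex-viewer style dump of data[start:start + length]."""
    lines = []
    end = min(start + length, len(data))
    pad = width * 3 - 1  # width of the hex column, so the text column lines up
    for row in range(start, end, width):
        chunk = data[row:min(row + width, end)]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{row:08X}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Synthetic bytes standing in for a real inbox####.dat file:
    # a length byte at offset 205 followed by the message at offset 206.
    sample = bytes(205) + bytes([5]) + b"Hello" + bytes(10)
    print(hex_dump(sample, start=192, length=32))
```

Dumping the same region from several files side by side makes it easier to spot which offsets hold a length byte, a timestamp, or readable text.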
After comparing several inbox####.dat files to each other, this pattern was found to be consistent. Once the pattern was discovered, it was time to employ Python to do the rest of the work.
We can first write out what needs to be done in pseudocode. Pseudocode is a readable description of what our program or script will do, without having to write actual code. Here is the repeatable process that will be conducted on all the inbox####.dat files in the extraction:
For every file in this extraction where the file name is inbox####.dat, open the file and do the following:
- Go to offset 12 and read 4 bytes
- Interpret those 4 bytes as a little-endian integer and convert it to a decimal value
- Convert that value to a readable date/time format
- Store that value in a variable called “timestamp”
- Go to offset 205 and read 1 byte
- Convert that byte to a decimal value
- Store that value in a variable called “textSize”
- Go to offset 206 and read “textSize” bytes
- Store the string in a variable called “message”
- Go to offset 515 and read 10 bytes
- Store that value in a variable called “phoneNumber”
- Write the “timestamp,” “message,” and “phoneNumber” to an external file called “incoming texts.csv”
- Add it to the XAMN report
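The steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the script used in the investigation: the function and constant names are hypothetical, the timestamp is assumed to count seconds from an epoch (the Unix epoch is only a placeholder here, since the phone's actual epoch is not given above), the message is assumed to be single-byte text, and the phone number is assumed to be stored as ASCII digits. The XAMN report step is not attempted.

```python
import csv
import struct
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Offsets taken from the hex analysis described above.
TIMESTAMP_OFFSET = 12
SIZE_OFFSET = 205
MESSAGE_OFFSET = 206
PHONE_OFFSET = 515
PHONE_LENGTH = 10

# ASSUMPTION: placeholder epoch; replace with the epoch the device really uses.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_inbox(data: bytes, epoch: datetime = EPOCH) -> dict:
    """Extract timestamp, message, and phone number from one inbox file."""
    # Read 4 bytes at offset 12 as a little-endian unsigned integer.
    (seconds,) = struct.unpack_from("<I", data, TIMESTAMP_OFFSET)
    timestamp = epoch + timedelta(seconds=seconds)  # ASSUMPTION: unit is seconds
    text_size = data[SIZE_OFFSET]  # a single byte already reads as an integer
    raw_message = data[MESSAGE_OFFSET:MESSAGE_OFFSET + text_size]
    raw_phone = data[PHONE_OFFSET:PHONE_OFFSET + PHONE_LENGTH]
    return {
        "timestamp": timestamp.strftime("%Y-%m-%d %H:%M:%S"),
        "message": raw_message.decode("latin-1"),  # ASSUMPTION: single-byte text
        "phoneNumber": raw_phone.decode("ascii", errors="replace"),
    }


def export_inbox(folder: Path, csv_path: Path) -> int:
    """Parse every inbox*.dat file in folder and write the rows to a CSV file."""
    count = 0
    fields = ["timestamp", "message", "phoneNumber"]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        for path in sorted(folder.glob("inbox*.dat")):
            data = path.read_bytes()
            if len(data) < PHONE_OFFSET + PHONE_LENGTH:
                continue  # too short to match the observed layout
            writer.writerow(parse_inbox(data))
            count += 1
    return count
```

Calling `export_inbox(Path("SMS/inbox"), Path("incoming texts.csv"))` would process the whole folder in one pass. Keeping the offsets as named constants makes it straightforward to adapt the same sketch to the outbox####.dat files once their layout has been confirmed.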
The following screenshot shows what the code for parsing each inbox####.dat would look like:
The script provided the following results:
A Python script written for another device was modified to suit this phone. Analyzing the inbox####.dat files for a pattern and modifying the existing script took less than 30 minutes in total, whereas manually parsing each file for the date/time, text message, and phone number would have taken hours.
This is just one example of how Python can be leveraged to improve the efficiency of your investigations. Python has a myriad of use cases, from traversing files and parsing database records to iterating through PLists, XML, and HTML files and creating useful custom utilities.
Here are some examples of small Python utilities that I created to assist me in my digital forensic investigations:
Gesture Decoder: For decoding the gesture.key found on Android devices.
Tempus: Timestamp converter made to decode BREW timestamps found in phones running the BREW OS.
VarInt Calculator: Used to convert VarInts found in SQLite databases to integers and lengths.
Base64 Decoder: My own tool to decode files encoded in Base64.
SQLite Database Analyzer: A tool I created to analyze the SQLite database header.
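To give a sense of how small such utilities can be, here is a minimal sketch of the kind of conversion a VarInt calculator performs, based on the documented SQLite file format. It is not the code of the tool listed above; the function names `decode_varint` and `serial_type_length` are hypothetical.

```python
def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (value, length_in_bytes) for the SQLite varint at offset."""
    value = 0
    for length in range(1, 10):
        byte = data[offset + length - 1]
        if length == 9:
            # The ninth byte, if reached, contributes all 8 of its bits.
            value = (value << 8) | byte
            break
        # The first eight bytes contribute their low 7 bits each.
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break  # high bit clear: this was the last byte
    # Varints hold 64-bit two's-complement integers.
    if value >= 1 << 63:
        value -= 1 << 64
    return value, length


def serial_type_length(serial_type: int) -> int:
    """Return the content size in bytes for a record serial type."""
    fixed = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 6, 6: 8, 7: 8, 8: 0, 9: 0}
    if serial_type in fixed:
        return fixed[serial_type]
    if serial_type >= 12:
        # Even values are BLOBs of (N-12)/2 bytes, odd values are
        # text of (N-13)/2 bytes; floor division covers both cases.
        return (serial_type - 12) // 2
    raise ValueError("serial types 10 and 11 are reserved")


if __name__ == "__main__":
    print(decode_varint(bytes([0x81, 0x00])))  # prints (128, 2)
```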
James Eichbaum is MSAB’s Global Training Manager and an instructor as well. He is a former peace officer, having served a combined total of 16 years with the Modesto Police Department and Stanislaus County Sheriff’s Office in California. As a detective with both agencies, James was a digital forensics examiner assigned to the Sacramento Valley High Tech Crimes Task Force. James possesses a Bachelor’s Degree in Information Systems Security and an Associate’s Degree in Computer Science.