Keywords matter: Tips on improving your investigative search


“Great,” I thought as I sat at my desk reviewing a keyword list. Have you ever done a keyword search for the word “kill”? If so, you know where I am going with this. The word appears incredibly often, and rarely in reference to murder.

Perhaps an even worse case is when you don’t get any keywords at all, or you get something like “find evidence of prostitution,” as a colleague of mine recently did. I joke about this in class. “Really? You couldn’t give the investigator more than that?” “Find pictures” is another (not so) helpful parameter.

Above, the author uses XAMN’s text search to search for a single word. Note that an asterisk (*) can be used as a wildcard, so typing *west would also find “Northwest”. Both “and” and “or” can be used in the text filter as well, for example “kill and bill” or “kill or bill”.
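To make the wildcard and and/or behavior concrete, here is a minimal Python sketch of the idea. It is not XAMN’s implementation: the function names are hypothetical, and it assumes whole-word, case-insensitive matching, which may differ from how XAMN treats a search term without an asterisk.

```python
from fnmatch import fnmatchcase

def word_matches(pattern, word):
    """Case-insensitive match of one word against a pattern where * is a wildcard.

    Note: fnmatch also treats ? and [ ] as special characters.
    """
    return fnmatchcase(word.lower(), pattern.lower())

def text_matches(pattern, text):
    """True if any whitespace-separated word in the text matches the pattern."""
    words = (w.strip('.,;:!?"\'()') for w in text.split())
    return any(word_matches(pattern, w) for w in words)

def match_all(patterns, text):
    """'and' semantics: every pattern must match some word in the text."""
    return all(text_matches(p, text) for p in patterns)

def match_any(patterns, text):
    """'or' semantics: at least one pattern must match some word in the text."""
    return any(text_matches(p, text) for p in patterns)

message = "Heading Northwest tonight"
print(text_matches("west", message))   # whole-word match only
print(text_matches("*west", message))  # wildcard also catches "Northwest"
```

In this sketch, "kill and bill" corresponds to `match_all(["kill", "bill"], text)` and "kill or bill" to `match_any(["kill", "bill"], text)`.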

With the XAMN mobile data analysis tool, we are fortunate to have a number of methods, including filtering and categories, that help us sift through the copious amounts of data we are presented with. However, we still rely on hash values and keywords. Both can easily be imported into XAMN.

Above is the Add Filter button in XAMN. Note that both Hash Lists and Word Lists may be added.

Often keywords will be specific to the case at hand. So, in the prostitution example we would want to know names, addresses, usernames, email addresses, the applications or websites used, phone numbers, and the terms used to refer to money. Having an intake form, either a spreadsheet (sorted A-Z) or a document, that asks for keywords to be specified can make our lives easier.
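As an illustration of how a spreadsheet intake form could feed a word list, the sketch below reads a CSV export and produces a deduplicated list sorted A-Z. The column names `category` and `keyword`, the sample rows, and the function name are assumptions made for this example; they are not part of any XAMN or MSAB format.

```python
import csv
import io

def keywords_from_intake(csv_text):
    """Return unique keywords from an intake CSV, sorted A-Z (case-insensitive).

    Assumes a hypothetical 'keyword' column; rows with an empty keyword are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = {}
    for row in reader:
        keyword = (row.get("keyword") or "").strip()
        if keyword:
            # Keep the first spelling seen; compare case-insensitively
            seen.setdefault(keyword.lower(), keyword)
    return [seen[key] for key in sorted(seen)]

# Made-up sample intake form, for illustration only
intake = (
    "category,keyword\n"
    "name,Jane Doe\n"
    "phone,555-0100\n"
    "name,jane doe\n"
    "username,\n"
)
for keyword in keywords_from_intake(intake):
    print(keyword)
```

Writing the result out one keyword per line gives a plain text file that is easy to review with the investigator before it is used in a search.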

Keywords can also be regional, and multiple languages may be involved. In drug cases there are a lot of slang terms; for example, the words “smoke” or “green” may be used to identify marijuana. Other problems we can face with mobile device data are typos and autocorrect. I always suggest including common misspellings. Keep a note of terms that you come across in your investigations.
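One way to seed a list of common misspellings is to generate simple typing errors for each keyword. The sketch below is only a starting point under stated assumptions: the function name is hypothetical, it covers just three kinds of slip (a dropped letter, a doubled letter, and two swapped neighbors), and it knows nothing about autocorrect or phonetic spellings. Many of the generated strings will be noise, so review the output before adding it to a word list.

```python
def typo_variants(word):
    """Return simple misspellings of a word: dropped, doubled, and swapped letters."""
    word = word.lower()
    variants = set()
    for i in range(len(word)):
        # Dropped letter: "smoke" -> "smke"
        variants.add(word[:i] + word[i + 1:])
        # Doubled letter: "smoke" -> "smooke"
        variants.add(word[:i] + word[i] + word[i:])
    for i in range(len(word) - 1):
        # Two adjacent letters swapped: "smoke" -> "smkoe"
        variants.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    # Swapping identical neighbors reproduces the original word; remove it
    variants.discard(word)
    variants.discard("")
    return variants

for variant in sorted(typo_variants("smoke")):
    print(variant)
```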

Specificity matters. The more specific your words are, the better your results will be. If you take a generic keyword list, you will get a lot of false positives. For instance, the word “steal” could be used to discuss taking something from someone else illegally, getting a good deal on a purchase, or a baseball player stealing a base.

A Word List being imported into XAMN. Note that you can add words to the list.

Subject matter experts are also important, so I go to them for keywords. Although I have investigated arson cases, I am by no means a subject matter expert. For keywords I turned to the Fire Marshal’s Office. When putting together a search warrant for investigating the illegal killing of a moose, I asked the conservation officers for advice on keywords. I didn’t know everything we should be looking for, nor did I know the correct technical terms.

Once a Word List Filter is used, clicking on the ellipsis “…” provides more functions if needed. Note that in this test case the Word List Filter took us from 25,247 artifacts on a test device to 1,612.
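The effect of a word list filter can be sketched in a few lines of Python. This is a rough illustration of the concept, not how XAMN works internally: the artifact records, the `text` field, and the function names are all hypothetical, and the sketch assumes whole-word, case-insensitive matching.

```python
import re

def load_word_list(lines):
    """One keyword per line; blank lines and surrounding whitespace are ignored."""
    return sorted({line.strip().lower() for line in lines if line.strip()})

def compile_word_list(words):
    """Build one case-insensitive pattern that matches any keyword as a whole word."""
    if not words:
        raise ValueError("word list is empty")
    # re.escape treats each keyword literally; \b keeps "steal" out of "stealth"
    alternation = "|".join(re.escape(w) for w in words)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

def filter_artifacts(artifacts, pattern):
    """Keep only the artifacts whose text contains at least one keyword."""
    return [a for a in artifacts if pattern.search(a["text"])]

# Made-up records standing in for artifacts on a test device
artifacts = [
    {"text": "He tried to steal the car"},
    {"text": "stealth mode on"},
    {"text": "lunch at noon"},
]
pattern = compile_word_list(load_word_list(["Steal\n", "\n", " car "]))
hits = filter_artifacts(artifacts, pattern)
print(f"{len(artifacts)} artifacts reduced to {len(hits)}")
```

Matching on whole words is a deliberate choice here: it trades a few missed hits for fewer false positives, which is the same trade-off the keyword list itself has to make.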

So how do I come up with keywords?

  1. Case specific information
  2. Examining previous searches and considering if I need a subject matter expert to assist
  3. Generic keyword lists
  4. Word Cloud

I was introduced to the word cloud by a former police officer turned college professor. If you go online and do an image search for “word cloud murder” you will see that someone put together a list of words associated with murder. Some words are more useful than others, but it is helpful.

With XAMN you can add keywords or keyword lists and search all of the devices in a case at one time. Just be careful not to go beyond the scope of your search, for example by adding and running an arson keyword list in a theft case. Make sure you ask the investigator for keywords. “Find evidence of prostitution” does not help at all.

You will find my general keywords in the MSAB Customer Forum Index – Other – Keyword Lists.

Please feel free to assist your fellow examiners by adding your general keyword lists, no matter what region or language. With the amount of data we face, we could all benefit from the perfect word(s).