Analyzing date information in datasets for fraud investigation delivers many if not the most important information. So answers to questions like when something occured or how long a process took…well, all time-related information, are crucial to identify fraud cases and to define the fraud schemes.
For this purpose the data analysis software IDEA provides functions and also formulas. Formulas are a really helpful tool (like in Excel or other well known analysis tools) to better understand the dataset in front of you. The formulas focussing on date information are plenty in IDEA. One is more needed than the other. Therefore in this blogpost I will present the three most important date functions, that I use constantly in IDEA fraud reviews and audits.
Before we begin allow me to recap shortly how to use the formulas in IDEA:
Like in Excel you can add columns to your data table in IDEA, which calculate applying formulas. The formulas in IDEA start with the "@"-symbol and you apply them via the Equation Editor after e.g. adding a field or column, choosing field type "Virtual Numeric" and clicking the calculator symbol on the right side of "Parameter":
Good, now that we know how to apply equations, let´s just talk about my favourite date related formulas for fraud analysis:
The name of this formula is the abbreviation for "day of week" and thus we already infer what it does.
Yes, you guessed right: It gives you the name of the day from a specific date.
Say you have a given date like 11/11/2011.
This is btw not chosen by coincidence. Since Iam from Cologne, I like the carnival there, which starts every at 11th November :-).
Anyhow, the formula would tell you that this day was a Friday. Nice. But rather than "Friday", it´ll tell you a "6". That is simply because Sunday is "1", Monday is "2" and so on.
Why is this one a favourite in fraud investigations of big data sets?
With the information provided by this equation, you can then extract transactions that were booked on weekends or products that were shipped on a Sunday or payments that were proceeded on days that usually are no paydays according to accounting routines.
You see: Thus it´s easy to identify red flags within an initially confusingly huge amount of data.
But: These are flagged transactions. Not more, not less. So in a next step check the calendar date in terms of was this close to any cut-off date? In addition, investigate why this transaction was booked on a day when ususally nobody works!
The @workday - formula goes in a comparable direction as the @dow function just simplifying the whole thing by spitting out a "1" for a date that was during work week. In contrast it reults in a "0" once the respective date was during the weekend. That´s it.
Still, there are cases where you need to see the specific day, therefore you use @dow. If you want to review for weekend entries or working day, then I propose the simpler @workday function.
In this context let me explain how to employ the equation:
Within the Equation Editor open the "Date/Time" folder in the equation area in the center and there choose the formula by double clicking it. Now you find it included in the equation area. That is the area were you actually build your formula.
Click between the brackets within the equation area and you see below and next to the question mark the syntax, that is how is the equation structured. You see which fields need to be placed and where within in the equation.
Additional help for every equation in IDEA you find on the right side in the Equation Editor, where you find explanations of what each formula does and how to use it including examples.
This one is so important for my dataanalysis! It should be placed at first position. However, being the last it raises the tension :-)
What is the function good for?
It calculates the difference between two date fields in terms of number of days. So if for the moment it sounds boring to you, let me describe why I rate it so important: Quite simply because the difference between two events in researching a certain business process delivers you with so much insight.
You can use it for process mining to check where are bottlenecks in the process or as I describe in the blogpost about payments before invoice you can investigate for alarming payments.
You can also use it to identify old records in your accounts and even make ageing structures.
There are many use cases where this function helps you out and you will probably find your own, which I would certainly like to know of. So please send me in your use case for this or one of the previously mentioned functions. Looking forward to hear about that.