Skip Ribbon Commands
Skip to main content
FinFolio > Blog > Posts > Data Quality II - Finding Bad Data
 

 Recent Posts

 
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
1 - 30Next
 

 Categories

 
  
  
  
  
Data Quality II - Finding Bad Data

When I started FinFolio, I spent a lot of time analyzing portfolio accounting data quality issues. It was one of our biggest headaches at AdvisorMart and we feel it's the biggest challenge currently facing our industry. As I mentioned in Part I, the biggest problem isn't that portfolio accounting systems have bad data, it's that you don't *know* when there is bad data.

If you can't find it, you can't fix it.

We started our research by going through reconciliation issues one-by-one, trying to find ways to detect them before we printed reports. We were hoping to find a pattern that could be used across all issues. We built a library of small check-and-balance rules, each one designed to detect a certain problem.

We discovered that as our library of data quality rules grew, rules were starting to be tripped for problems they weren't originally intended to find. This was quite the revelation for us--as we created more and more rules to analyze transaction activities, it was harder and harder to get bad data into the system without tripping a rule. Sometimes one bad transaction would trip multiple rules, or changing a transaction would fix one rule but break another (indicating there was still a problem). Instead of finding one global fix, we discovered that many small fixes could achieve the same goal.

Think of the interlocking rules like a heist movie, where somebody is stealing an object surrounded by laser beams. One beam is easy to jump over but the more and more beams there are, the harder it is to slip through.

We currently have about fifteen different rules. Here are a two of the most common:

Historical Balance Comparison - Similar to the industry-standard balance check that compares your rolled-up share balances to a balance file sent by the custodian. The difference is that we store historical custodian balances so you can see exactly when your account went out of balance and if it went back in balance.

Double Entry - Checks all trades for an account to ensure your total purchases net to zero against your total sells at the end of every day. This finds offsetting trades that were entered on the wrong date, dividends where you didn't receive the offsetting cash purchase, and similar issues.

A good way to demonstrate alerts is to take an existing account, make a single change, and observe which alerts are created. In the FinFolio sample database, if you change the date of an existing Buy transaction, you will immediately see three errors:

  • the historical balance comparison alert shows your ending day balance no longer matches the historical custodial balance, and
  • (x2) the double-entry error fires twice, once on the date the purchase should be, and once on the changed date.

Each of these rules is useful in isolation. But when you have enough interlocking checks and balances, it becomes almost impossible to create incorrect transactions. And when you do have problems you can see patterns in the errors that let you quickly identify the cause.

(continued in Data Quality III - Presenting the Errors)

Comments

There are no comments for this post.