Case Study: How DLabs implemented an AI accounting system
Big Data and a boatload of data are not the same. But they do make a perfect match. Check out the case where we helped a large company deal with their boatload-of-accounting-data problem with a dedicated Big Data solution.
The need: cut costs and employee turnover associated with accounting activities
The client’s main activity is providing accounting services for the entire holding. This means processing about 75,000 incoming invoices a year, all of which need to be evaluated and categorized. It took two full-time employees to categorize all the documents. The tasks were overwhelmingly repetitive and time-consuming, but humans were needed because the OCR (optical character recognition) software in use was helpless whenever an invoice didn’t fit its if-then rules. For the employees, however, the work was so monotonous and unfulfilling that it resulted in high turnover in the role.
When something is too difficult for simple computing and too easy for humans, it’s time to call in machine learning as a service. With proper training, ML can bridge that gap, empowering your computers with a dose of AI and freeing your humans to do more creative work, all while taking a big machete to your costs!
The challenge: automate the invoice classification process
Suppliers send in invoices either on paper (processed as scans) or in electronic form. The following data then has to be extracted so that each document can be categorized and assigned to one of more than a hundred cost accounts:
- invoice date
- payment deadline
- customer identification number
- invoice number (any combination of letters, numbers, and characters)
- invoice amount
- supplier organization number
- supplier’s bank account
- supplier’s name on the invoice
That’s way too much data for OCR systems. Also, the invoice processing module and its automatic classification needed to be integrated with the client’s infrastructure. Simply put, AI accounting needed to be implemented.
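To make the extraction target concrete, here is a minimal sketch of a record holding the eight fields listed above. The class name, field names, types, and the date check are illustrative assumptions, not the schema used in the project.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class InvoiceFields:
    """One extracted invoice. All names here are hypothetical."""
    invoice_date: date
    payment_deadline: date
    customer_id: str
    invoice_number: str          # any mix of letters, numbers, and characters
    amount: Decimal              # Decimal avoids float rounding on money
    supplier_org_number: str
    supplier_bank_account: str
    supplier_name: str
    cost_account: Optional[str] = None  # assigned later by the classifier

    def has_consistent_dates(self) -> bool:
        # A cheap sanity check on OCR output: payment is not due
        # before the invoice was issued.
        return self.payment_deadline >= self.invoice_date


# Made-up sample values, for illustration only.
sample = InvoiceFields(
    invoice_date=date(2020, 1, 10),
    payment_deadline=date(2020, 2, 10),
    customer_id="C-1001",
    invoice_number="INV/2020-001",
    amount=Decimal("1250.00"),
    supplier_org_number="000000-0000",
    supplier_bank_account="XX00 0000 0000",
    supplier_name="Example Supplier Ltd",
)
```

Keeping the cost account optional reflects the two-stage flow described here: fields are extracted first, and the account is assigned afterwards.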
The solution: custom NLP algorithms
We decided to find classification patterns using machine learning. Because the OCR process did not separate the text from noise and we had little control over the type of input data provided, building and evaluating the model was tricky. We chose to implement custom-made classification and normalization machine learning algorithms based on NLP (natural language processing). In some cases, however, machine learning classification turned out to be ineffective, for example when the rules were evident and we could simply categorize invoices with if-then logic. The final solution, approved by the client, therefore combined an extensive rule system with ML, and this combined approach gave the best accuracy in mapping invoices to accounts.
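The combined approach can be sketched as "rules first, model as fallback". The example below is a simplified illustration under our own assumptions: the keywords, the account codes, and the tiny Naive Bayes model are stand-ins, not the custom NLP algorithms built for the client.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical rules: a keyword that decides the cost account outright.
RULES = {
    "electricity": "6100-utilities",
    "road toll": "5600-vehicle",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Tiny multinomial Naive Bayes over bag-of-words, add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify(text, model):
    """Return (account, source): evident rules win, the model covers the rest."""
    lowered = text.lower()
    for keyword, account in RULES.items():
        if keyword in lowered:
            return account, "rule"
    return model.predict(text), "ml"


# Made-up training data, for illustration only.
model = NaiveBayes().fit(
    ["office paper and printer toner", "pens paper staples",
     "software license renewal", "annual software subscription license"],
    ["6500-office", "6500-office", "6540-software", "6540-software"],
)
print(classify("Electricity, March", model))
print(classify("toner and paper", model))
```

Returning the source of each decision ("rule" or "ml") alongside the account makes it easy to audit which path handled an invoice.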
The result: invoice processing time reduced by 75%
Our custom implementation extracts the whole text from an invoice. The final classifier handles more than 1,000 invoices per hour (it took about 6 hours to extract the text from 7,047 invoices).
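The throughput figure follows directly from the two numbers quoted above. The short calculation below also extrapolates to the client's yearly volume of about 75,000 invoices; that extrapolation is our own back-of-the-envelope estimate and assumes the measured rate holds for the whole year.

```python
invoices_processed = 7047
hours_taken = 6  # "about 6 hours", so every result here is approximate

invoices_per_hour = invoices_processed / hours_taken
print(f"{invoices_per_hour:.1f} invoices per hour")  # 1174.5

# Rough estimate of machine time for a full year of invoices.
yearly_invoices = 75_000
hours_per_year = yearly_invoices / invoices_per_hour
print(f"about {hours_per_year:.0f} hours for {yearly_invoices} invoices")
```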
No current machine learning system is 100% accurate, so the client’s employees still need to verify the results. Human participation in the process, however, is now cut by three-quarters, because invoices arrive already classified, each with a probability estimate of how accurate the classification is. The human double-check also feeds back into the algorithm’s training, so each human intervention reduces the need for such interventions in the future. Over time, the algorithm needs less and less human input.
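One common way to put such probability estimates to work is to send only low-confidence predictions to a person and to keep every human correction as a new training example. The sketch below illustrates that pattern. The threshold value, class names, and auto-accept behavior are assumptions for illustration and may differ from how the client's review process actually works.

```python
from dataclasses import dataclass, field

REVIEW_THRESHOLD = 0.90  # hypothetical cut-off, to be tuned on real data


@dataclass
class Prediction:
    invoice_id: str
    account: str
    confidence: float  # model's probability estimate for this account


@dataclass
class ReviewLoop:
    threshold: float = REVIEW_THRESHOLD
    auto_accepted: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)
    training_examples: list = field(default_factory=list)

    def route(self, pred):
        """Accept confident predictions, queue the rest for a human."""
        if pred.confidence >= self.threshold:
            self.auto_accepted.append(pred)
        else:
            self.review_queue.append(pred)

    def record_review(self, pred, correct_account):
        """Store the human decision so the model can be retrained on it."""
        self.review_queue.remove(pred)
        self.training_examples.append((pred.invoice_id, correct_account))


# Made-up predictions, for illustration only.
loop = ReviewLoop()
confident = Prediction("inv-1", "6500-office", 0.97)
uncertain = Prediction("inv-2", "6540-software", 0.55)
loop.route(confident)
loop.route(uncertain)
loop.record_review(uncertain, "6500-office")  # the reviewer corrects the account
```

Lowering the threshold sends fewer invoices to people but lets more mistakes through, so the right value depends on how costly a misclassified invoice is.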
This not only reduces costs and frees staff for more rewarding tasks, but it also gives the company a significant competitive advantage in its industry.