Bank Transaction Tracker (BTT) is a multi-user, web-based application for tracking personal income and expenditure using bank transaction details. It automatically classifies transactions into categories for analysis. It is targeted at SaaS providers and banking institutions that want to provide a personal finance / budgeting application for their customers.
While developing BTT, I decided I needed a way to automatically classify bank transactions rather than have the end user manually classify each one. Initially I tried a simple pattern-matching approach, but it quickly became obvious that this would not be good enough. The description fields of bank transactions are often not intuitive, and I wanted my application to allow each individual user to choose their own personal way of classifying their spending. I'd heard a lot about Machine Learning at that time and decided to investigate whether it would be an appropriate tool to solve my problem. I completed a Udacity Machine Learning course and realised that Machine Learning was well suited to what I wanted to achieve.
Machine Learning algorithms are typically trained on a set of training data. In my case the training data was the set of existing bank transactions that the user had already classified. Each time new transactions were uploaded, I applied the trained algorithm to predict the correct classification for each one. I tested a number of Machine Learning algorithms, including Naive Bayes, Support Vector Machines (SVM) and AdaBoost. The best results were achieved using SVM. Fortunately, BTT was developed using Python 3, so it was very easy to use the excellent Python Machine Learning library, scikit-learn. You can see the code here: BTT
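
To give a feel for the approach, here is a minimal sketch of training an SVM on transaction descriptions with scikit-learn. It is not the actual BTT code: the sample descriptions, the category names, the `train_classifier` helper, and the choice of TF-IDF word features with a linear SVM (`LinearSVC`) are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical transactions that a user has already classified.
# In a real application these would come from that user's own history.
train_descriptions = [
    "TESCO STORES 3092 LONDON",
    "TESCO EXPRESS 5511",
    "SAINSBURYS S/MKT 0123",
    "SHELL SERVICE STN 442",
    "BP FUEL 8891 M1 NORTH",
    "SHELL FUEL 0071",
    "SALARY ACME LTD",
    "ACME LTD PAYROLL",
]
train_categories = [
    "Groceries", "Groceries", "Groceries",
    "Fuel", "Fuel", "Fuel",
    "Income", "Income",
]


def train_classifier(descriptions, categories):
    """Fit a text classifier on already-classified transactions.

    Assumed design: descriptions are turned into TF-IDF word features
    and fed to a linear SVM.
    """
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(descriptions, categories)
    return model


model = train_classifier(train_descriptions, train_categories)

# Newly uploaded, unclassified transactions (also hypothetical).
new_descriptions = ["TESCO STORES 4410 LEEDS", "SHELL SERVICE STN 918"]
predicted = model.predict(new_descriptions)

for description, category in zip(new_descriptions, predicted):
    print(f"{description!r} -> {category}")
```

Training one model per user, on that user's own classified transactions, is one way to let each person keep their own categories. Swapping `LinearSVC` for `MultinomialNB` or `AdaBoostClassifier` in the same pipeline is one way to compare the algorithms mentioned above.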