Construct a customer database from PDF bank statements using Python programming and Microsoft SQL
Date
2021-06
Publisher
Brac University
Author
Nandi, Bikash Kumar
Abstract
This report proposes a model that extracts customers' transaction information from
PDF bank account statements and stores the result set in a customer Microsoft SQL
(MsSQL) database for further automated analysis. In the financial sector, it is very
important to analyze bank account statements properly in order to measure
creditworthiness for credit approval. To do so, a credit analyst must spend a
significant amount of time on manual analysis, which delays credit approval, and
inaccurate analysis sometimes leads to a wrong approval decision. As a result, automated
bank account statement analysis is currently in high demand in the financial sector. This model
will overcome the aforementioned limitations and serve the current market demand.
To achieve this goal, the whole process has been divided into four
basic segments. The first segment converts the PDF to text using a Python
library (pdftotext); the second corrects the raw text file (.txt) data
by removing unnecessary characters and spaces and formatting it as needed; the
third parses the formatted text (.txt) and retrieves the desired transactional
information; and finally, the fourth stores the desired information
in a customer database.
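
The four segments can be illustrated with a minimal Python sketch. This is not the report's implementation: the statement row layout (date, description, debit, credit, balance), the regular expression, the table name `transactions`, and all function names are assumptions made for illustration. The standard-library `sqlite3` module stands in for Microsoft SQL so that the example runs without a server; with an MsSQL driver such as pyodbc, the insert step would likely look similar. The pdftotext import is deferred so that segments 2 to 4 can be run on sample text without the library installed.

```python
import re
import sqlite3


def pdf_to_text(path):
    """Segment 1: convert a PDF statement to raw text with pdftotext."""
    import pdftotext  # third-party library; imported lazily on purpose

    with open(path, "rb") as f:
        pdf = pdftotext.PDF(f)
        return "\n\n".join(pdf)  # one string per page, joined together


def clean_text(raw):
    """Segment 2: remove stray characters and extra spaces, drop blank lines."""
    lines = []
    for line in raw.splitlines():
        line = re.sub(r"[^\x20-\x7E]", " ", line)  # non-printable -> space
        line = re.sub(r"\s+", " ", line).strip()   # collapse whitespace
        if line:
            lines.append(line)
    return lines


# Assumed row layout: DD/MM/YYYY, description, debit, credit, balance.
ROW = re.compile(
    r"^(\d{2}/\d{2}/\d{4}) (.+?) "
    r"([\d,]+\.\d{2}) ([\d,]+\.\d{2}) ([\d,]+\.\d{2})$"
)


def to_number(text):
    """Turn an amount such as '5,000.00' into a float."""
    return float(text.replace(",", ""))


def parse_transactions(lines):
    """Segment 3: keep only lines that look like transaction rows."""
    rows = []
    for line in lines:
        match = ROW.match(line)
        if match:
            date, desc, debit, credit, balance = match.groups()
            rows.append((date, desc, to_number(debit),
                         to_number(credit), to_number(balance)))
    return rows


def store(conn, rows):
    """Segment 4: write parsed rows into a (hypothetical) transactions table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transactions ("
        "txn_date TEXT, description TEXT, "
        "debit REAL, credit REAL, balance REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


# Segments 2-4 on made-up sample text (no PDF needed for this demo).
sample = (
    "Statement of Account\x0c\n"
    "01/06/2021   ATM WITHDRAWAL    5,000.00   0.00   45,000.00\n"
    "\n"
    "02/06/2021 SALARY  0.00 30,000.00 75,000.00\n"
)
rows = parse_transactions(clean_text(sample))
conn = sqlite3.connect(":memory:")
store(conn, rows)
```

Real statements differ from bank to bank, so the row pattern would need to be adapted to each layout; keeping the four segments as separate functions lets the parsing step change without touching conversion or storage.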