The architecture diagram begins at the front end, with the user accessing the application through the internet. Once they log in, they are redirected over HTTPS to the main dashboard, which shows their data in categorised tables and graphs. To the viewer, this is simply an appealing web page built on a Bootstrap template, with data visualisations generated through Microsoft Power BI. The data in the graphs is dynamic and is fetched directly from the MarkLogic database by a Python script.
The Django web app is hosted on a virtual machine using an Apache server. Traffic between the user and the server is encrypted using a certificate supplied by Let’s Encrypt, which ensures a secure HTTPS connection. Django's URL configuration matches each HTTP request and routes it to the correct method in the Django view, which handles all the heavy lifting of the code. This consists of generating the tables of data (fetched from the MarkLogic database using the REST API), passing the correct data into the templates, and returning the dynamically rendered templates to the web server, which sends them to the user as an HTML response.
It is assumed that in a fully developed app, data will come from the Open Banking API, which asks the user for authentication in order to retrieve data from all their bank accounts. Data is stored in a NoSQL (MarkLogic) database on a virtual machine hosted on Azure. Data can be accessed through the REST API server, which allows URL queries for data extraction and curl commands for data upload. For this prototype, we have been using self-generated JSON dummy data stored in the database, since we have not been given access to a data store/blob.
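As a minimal sketch of the URL queries and curl uploads described above, the helpers below build a document URL and the matching curl argument list for MarkLogic's `/v1/documents` REST endpoint. The host, port, credentials and the `/accounts/<id>.json` URI layout are illustrative assumptions, not values taken from the project.

```python
from urllib.parse import urlencode

# Assumed placeholder: the real host and port of the REST API server are not given here.
BASE_URL = "http://localhost:8000"


def document_url(doc_uri, base_url=BASE_URL):
    """Build the URL that addresses one document through /v1/documents."""
    return f"{base_url}/v1/documents?{urlencode({'uri': doc_uri})}"


def upload_command(local_path, doc_uri, user="USER", password="PASSWORD"):
    """Return the curl argument list that uploads a local JSON file (HTTP PUT)."""
    return [
        "curl", "--anyauth", "--user", f"{user}:{password}",
        "-X", "PUT", "-T", local_path,
        "-H", "Content-Type: application/json",
        document_url(doc_uri),
    ]


# Hypothetical document URI for one account.
print(document_url("/accounts/1001.json"))
print(" ".join(upload_command("1001.json", "/accounts/1001.json")))
```

A GET request to the same URL would return the stored document, so one helper covers both extraction and upload.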
This Entity Relationship Diagram illustrates the structure and relationships of the stored data. Data is stored in two parts. The majority of the data is stored in the MarkLogic database, while the users' profiles and account numbers are stored locally in a SQLite database alongside the Django Python files.
The data stored in Django contains profiles and account numbers. There is a many-to-one relationship between account and user, as one user may have multiple accounts with different banks. For the time being, they are linked using the foreign key "UserID".
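The many-to-one relationship can be sketched with Python's built-in `sqlite3` module. Only the `UserID` foreign key comes from the description above; the table names, the other columns and the sample rows are hypothetical, and the project itself would define these through Django models rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
CREATE TABLE user_profile (
    UserID INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE account (
    AccountID TEXT PRIMARY KEY,
    bank      TEXT NOT NULL,
    UserID    INTEGER NOT NULL REFERENCES user_profile(UserID)
);
""")

# One user with two accounts at different banks (dummy data).
conn.execute("INSERT INTO user_profile VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-100", "Bank One", 1), ("B-200", "Bank Two", 1)],
)


def accounts_for(user_id):
    """Return the account IDs linked to a user through the UserID foreign key."""
    rows = conn.execute(
        "SELECT AccountID FROM account WHERE UserID = ? ORDER BY AccountID",
        (user_id,),
    )
    return [row[0] for row in rows]


print(accounts_for(1))
```

Because the foreign key sits on the account table, each account belongs to exactly one user while a user can own any number of accounts.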
Using the accountID stored with the account numbers, the web app then fetches data from the MarkLogic database, which contains all the information about an account. Every account also has a balance and multiple transactions associated with it, and all this detailed information can be fetched using the accountID.
Understanding Open Banking data and restructuring it to store on the MarkLogic database, Data Generation
The data obtained from the Open Banking API is in JSON format and contains information about all the accounts the user has connected. The JSON file is very detailed, so the data retrieved through the API would be extremely lengthy and have a very complicated structure. That level of detail also gave us extra freedom in structuring the data. It took us a while to understand the format, but once we did, we were able to generate our own data based on the Open Banking data format.
MarkLogic NoSQL Database
We also realised that it is inefficient to read the full data when we only present a specific set of data for each account. We therefore took the liberty of restructuring the data whenever the user links their various accounts on our web app: we generate a separate JSON file for each account and re-upload it to the MarkLogic database hosted on Microsoft Azure.
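The restructuring step could look roughly like the sketch below, which splits one combined payload into one JSON document per account. The field names (`accounts`, `accountID`, `balance`, `transactions`) and the document URI layout are hypothetical stand-ins and do not reproduce the real Open Banking format.

```python
import json


def split_by_account(payload):
    """Return {document_uri: json_text}, one entry per account in the payload."""
    documents = {}
    for account in payload["accounts"]:
        account_id = account["accountID"]
        # Keep only the fields the dashboard presents for this account.
        doc = {
            "accountID": account_id,
            "balance": account["balance"],
            "transactions": account["transactions"],
        }
        documents[f"/accounts/{account_id}.json"] = json.dumps(doc, indent=2)
    return documents


# Dummy payload in the assumed shape.
sample = {
    "accounts": [
        {"accountID": "A-100", "balance": 120.5,
         "transactions": [{"amount": -20.0, "description": "Groceries"}],
         "unusedDetail": "dropped by the restructuring"},
        {"accountID": "B-200", "balance": 980.0, "transactions": []},
    ]
}

docs = split_by_account(sample)
for uri in docs:
    print(uri)
```

Each resulting document is small and keyed by account, so the web app can fetch exactly one account's balance and transactions without reading the full payload.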
MarkLogic is a NoSQL database, so no schema is necessary when storing data in the database. MarkLogic's REST API also allows fast uploading and efficient data retrieval.