This project has been difficult to test concretely, as a large proportion of the work involved continual trial and error and research rather than development against a concrete testing platform. However, we were able to implement a few testing strategies over the course of the project.
After creating the initial version of our PDF extraction algorithm, we wrote a basic unit test suite to make sure future work would not break the algorithm's existing capabilities. This was particularly important because, at the time, we planned to build considerably more on top of what we had written. In the end, most of that work was discarded when we moved on to the BERT model. The code for the unit tests can be found here.
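As an illustration of the kind of regression test described above, here is a minimal sketch using Python's built-in `unittest` module. It is not our actual test code: `extract_fields` is a hypothetical stand-in for the extraction algorithm, and the sample page text is invented purely for the example.

```python
import unittest


def extract_fields(page_text):
    """Hypothetical stand-in for the extraction algorithm.

    Parses 'Key: value' lines into a dict and ignores lines without a colon.
    """
    fields = {}
    for line in page_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip().lower()] = value.strip()
    return fields


class ExtractionRegressionTest(unittest.TestCase):
    """Pins down current behaviour so later changes cannot silently break it."""

    def test_known_page_still_extracts_same_fields(self):
        page = "Title: Annual Report\nYear: 2019\nno separator here"
        expected = {"title": "Annual Report", "year": "2019"}
        self.assertEqual(extract_fields(page), expected)

    def test_empty_page_gives_no_fields(self):
        self.assertEqual(extract_fields(""), {})
```

Saved as a module, tests like these can be run with `python -m unittest` after every change to the algorithm, so that a failure points directly at the capability that was broken.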
When creating the final reportQuery system, the question-and-answer system was split from the code that inserts data into our database. The former was supervised by Yansong and Rachel, and the latter by Mark. Integrating these two separate systems required fairly extensive integration testing to make sure the two components interacted as intended. We started by creating a .json file of data extracted from a PDF and attempting to submit that data to our database. This allowed us to quickly debug any initial problems caused by the integration or by miscommunication during development. After this, we ran a small batch of 10 PDFs to confirm the system could work over a prolonged period, and then compared the output from BERT with what had been put into the database. Only once we were happy that there were no issues did we run the rest of the PDFs through our system.
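The integration check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for our real database, and the table layout, JSON structure, and function names (`submit_answers`, `stored_answers`, `find_mismatches`) are all hypothetical.

```python
import json
import sqlite3


def submit_answers(conn, report):
    """Insert one report's question/answer pairs (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answers "
        "(report TEXT, question TEXT, answer TEXT)"
    )
    rows = [(report["report"], q, a) for q, a in report["answers"].items()]
    conn.executemany("INSERT INTO answers VALUES (?, ?, ?)", rows)
    conn.commit()


def stored_answers(conn, report_name):
    """Read back what the database holds for one report."""
    cur = conn.execute(
        "SELECT question, answer FROM answers WHERE report = ?",
        (report_name,),
    )
    return dict(cur.fetchall())


def find_mismatches(conn, reports):
    """Submit each report, then compare the model output with the stored rows."""
    mismatches = []
    for report in reports:
        submit_answers(conn, report)
        if stored_answers(conn, report["report"]) != report["answers"]:
            mismatches.append(report["report"])
    return mismatches


# Invented sample standing in for the .json file of extracted data.
sample_json = '[{"report": "a.pdf", "answers": {"Who is the author?": "J. Smith"}}]'
conn = sqlite3.connect(":memory:")  # assumption: SQLite replaces the real database
mismatches = find_mismatches(conn, json.loads(sample_json))
```

An empty `mismatches` list means every answer read back from the database matched the answer that was submitted. The same comparison can be run first on a single file and then on a small batch before processing the full set.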
We also tested several different BERT models to determine which had the best accuracy for our project. This is explained on the Research page.