
Backend update and code structure (Jan 2022)
Since the last update I have set up the initial code structure required for our project. I have set up the Alembic migration environment, which autogenerates database migrations from the database models defined in Python. This makes adding new tables and relationships extremely easy, and lets us revert to a previous version of the database if things go wrong. It is connected to a Postgres database on my local machine.
I've set up authentication so users can log in and out, and created the endpoints for uploading and downloading models and datasets.
I have integrated the dockerising prototype into our system. The model_gen directory contains a model API and a Dockerfile. The model upload endpoint takes in source code, a pickle, and requirements.txt, and builds an image that serves an API on port 8000 when run. These models are then written as .tar files and saved to the models directory, with their file path written to the database.
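To make the upload flow a little more concrete, here is a minimal sketch of the two steps involved: laying out the uploaded files as a Docker build context, then producing the commands that build the image and save it as a .tar file. The file names (model.py, model.pkl), the function names, and the tagging scheme are assumptions for illustration, not the project's actual code. Only the `docker build -t` and `docker save -o` commands themselves are standard Docker CLI.

```python
from pathlib import Path
from typing import List, Tuple

# Assumed location, matching the "models" directory described in this post.
MODELS_DIR = Path("models")


def prepare_build_context(context_dir: Path, source_code: bytes,
                          pickled_model: bytes, requirements: bytes) -> None:
    """Write the three uploaded files where a Dockerfile could COPY them."""
    context_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical file names; the real Dockerfile may expect different ones.
    (context_dir / "model.py").write_bytes(source_code)
    (context_dir / "model.pkl").write_bytes(pickled_model)
    (context_dir / "requirements.txt").write_bytes(requirements)


def docker_commands(context_dir: Path,
                    model_name: str) -> Tuple[List[str], List[str], Path]:
    """Return the build command, the save command, and the .tar path."""
    tag = f"{model_name}:latest"  # Docker image names must be lowercase
    tar_path = MODELS_DIR / f"{model_name}.tar"
    build = ["docker", "build", "-t", tag, str(context_dir)]
    save = ["docker", "save", "-o", str(tar_path), tag]
    return build, save, tar_path
```

Each command list could be passed to `subprocess.run(cmd, check=True)`, and the returned path is the value that would be written to the database.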
I've set up the initial endpoints for uploading and downloading datasets.
Find out more about Alembic here.
We have chosen to use SQLAlchemy for database operations because it provides an object-relational mapping (ORM), meaning everything is handled in Python, which makes the source code much more readable (in our opinion).
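As an illustration of what "everything is handled in Python" looks like, here is a minimal sketch of two ORM models with a relationship, plus a pair of CRUD-style helpers. The table names, columns, and function names are hypothetical, and the sketch uses an in-memory SQLite database so it runs on its own. The real project uses Postgres and creates its tables through Alembic revisions rather than `create_all`.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class User(Base):
    __tablename__ = "users"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    username = Column(String, unique=True, nullable=False)
    models = relationship("Model", back_populates="owner")


class Model(Base):
    __tablename__ = "models"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    filepath = Column(String, nullable=False)  # where the .tar file is stored
    owner_id = Column(Integer, ForeignKey("users.id"))
    owner = relationship("User", back_populates="models")


# In-memory SQLite keeps the sketch self-contained (assumption, not Postgres).
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
SessionLocal = sessionmaker(bind=engine)


def create_model(db, owner: User, name: str, filepath: str) -> Model:
    """Insert a model row linked to its owner and return the stored object."""
    model = Model(name=name, filepath=filepath, owner=owner)
    db.add(model)
    db.commit()
    db.refresh(model)
    return model


def get_models_for_user(db, username: str) -> list:
    """Read every model that belongs to the given username."""
    return db.query(Model).join(User).filter(User.username == username).all()
```

Queries and inserts are written against Python classes, and the foreign key between the two tables shows up as ordinary attributes (`model.owner`, `user.models`).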
Here is an image of the directory structure. Let me talk you through it.

alembic
The alembic directory contains the required files for generating database revisions and the revisions themselves.
api
The api directory contains the majority of our source code.
authorise.py handles user authorisation with the OAuth 2.0 password flow and JWTs.
crud.py contains the CRUD (create, read, update and delete) functions that are required for updating and reading from our database.
database.py defines the database connection, database engine, and database session.
main.py contains our endpoints.
model_gen.py contains the code for generating Docker images from trained models.
models.py contains the definition of our database schema as Python objects, used for generating Alembic revisions.
schemas.py contains the Pydantic models/schemas for each table in our database, so that inputs and outputs for endpoints can be type hinted with FastAPI.
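For readers unfamiliar with JWTs, the sketch below shows roughly what issuing and checking a token involves once a user has logged in via the password flow. It is a standard-library illustration of an HS256-signed token, not the contents of authorise.py. A real service would normally rely on a maintained JWT library, and the secret key, function names, and 30-minute expiry here are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"change-me"  # hypothetical; a real key belongs in configuration


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SECRET_KEY, signing_input.encode("ascii"), hashlib.sha256).digest()


def create_access_token(username: str, expires_in: int = 1800) -> str:
    """Build header.payload.signature for the given user."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": username, "exp": int(time.time()) + expires_in}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input))}"


def verify_access_token(token: str):
    """Return the username if signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}")
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    if not isinstance(payload, dict) or payload.get("exp", 0) < time.time():
        return None
    return payload.get("sub")
```

A protected endpoint would call something like `verify_access_token` on the bearer token from the request and reject the request when it returns `None`.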
models
The models directory is used to store models as tar files for download.
model_gen
The model_gen directory contains the files necessary to generate Docker images with APIs for trained models.
datasets
The datasets directory is used to store datasets as zip files for download.