Deployment Setup

1.1 VMs and Deployment Environments

The virtual machines for CodaLab Worksheets are named according to the scheme:

vm-clws-<env>-<machine-type>-<index>

<env> = prod, test, or dev
<machine-type> = server, worker, gpuworker, etc.
<index> = 0 for unique machines, or the worker index for workers
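
The naming scheme above can be sketched in Python as a pair of helpers that build and parse VM names. This is only an illustration, not a tool that ships in this repo: the names `format_vm_name`, `parse_vm_name`, and `VMName` are hypothetical, and because the machine-type list ends in "etc.", the sketch accepts any lowercase machine type rather than a fixed set.

```python
import re
from typing import NamedTuple

# Environments listed in the naming scheme above.
ENVS = ("prod", "test", "dev")

# vm-clws-<env>-<machine-type>-<index>
# Assumption: machine types are lowercase letters only (server, worker, gpuworker, ...).
VM_NAME_RE = re.compile(
    r"^vm-clws-(?P<env>[a-z]+)-(?P<machine_type>[a-z]+)-(?P<index>\d+)$"
)


class VMName(NamedTuple):
    """Hypothetical container for the three variable parts of a VM name."""
    env: str
    machine_type: str
    index: int


def format_vm_name(env: str, machine_type: str, index: int = 0) -> str:
    """Build a VM name; index defaults to 0, as used for unique machines."""
    if env not in ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"vm-clws-{env}-{machine_type}-{index}"


def parse_vm_name(name: str) -> VMName:
    """Split a VM name into its parts, rejecting names outside the scheme."""
    match = VM_NAME_RE.match(name)
    if match is None or match["env"] not in ENVS:
        raise ValueError(f"not a CodaLab Worksheets VM name: {name!r}")
    return VMName(match["env"], match["machine_type"], int(match["index"]))
```

For example, `format_vm_name("prod", "server")` gives the name of the production server, and `parse_vm_name("vm-clws-dev-gpuworker-3")` recovers the environment, machine type, and worker index.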

Here are the types of VMs that exist and the components each contains:

Server vm-clws-<env>-server-0 (only one)
- The server is the central manager of requests and responses. It has the following components, each running in a separate Docker image, named as follows:
  - bundle-manager (source codalab-worksheets): responsible for creating and managing bundles and assigning them to workers
  - rest-server (source codalab-worksheets): runs the API and handles all API requests including those from CLIs, the webpage and workers
  - frontend (source codalab-worksheets): serves our Web UI to people who visit https://worksheets.codalab.org
  - monitor (source codalab-worksheets): runs a simple monitoring script that backs up the DB, runs sanity checks and sends email alerts
  - nginx (source codalab-deployments): holds our nginx config, which routes requests to the correct recipients. The config is copied over from this repo, and the container runs a standard nginx image.
Workers vm-clws-<env>-worker-<index>/vm-clws-<env>-gpuworker-<index>
- The workers act as compute machines to run bundles. They communicate with the server. Each has only one component, running in a single Docker image:
  - worker (source codalab-worksheets): worker state is cached in /mnt/scratch/bundles/
MySQL vm-clws-prod-mysql-0 (only one)
- Only for prod, this machine runs our MySQL database
- If you ever need to manipulate the prod database directly (!) you should SSH into this machine.
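
As a quick reference, the layout described above can be written down as a small lookup table. This is a sketch for orientation only, not a file in this repo: the names `COMPONENTS_BY_MACHINE_TYPE`, `COMPONENT_SOURCE`, and `components_for` are hypothetical, and the contents simply restate the list above.

```python
# Docker components per machine type, as listed above.
COMPONENTS_BY_MACHINE_TYPE = {
    "server": ["bundle-manager", "rest-server", "frontend", "monitor", "nginx"],
    "worker": ["worker"],
    "gpuworker": ["worker"],
    # The MySQL machine (prod only) runs the database; the list above names
    # no Docker components for it, so none are assumed here.
    "mysql": [],
}

# Source repo for each component, as listed above.
COMPONENT_SOURCE = {
    "bundle-manager": "codalab-worksheets",
    "rest-server": "codalab-worksheets",
    "frontend": "codalab-worksheets",
    "monitor": "codalab-worksheets",
    "nginx": "codalab-deployments",
    "worker": "codalab-worksheets",
}


def components_for(vm_name: str) -> list:
    """Return the components expected on a VM named vm-clws-<env>-<machine-type>-<index>."""
    # Fields after splitting on "-": vm, clws, <env>, <machine-type>, <index>
    machine_type = vm_name.split("-")[3]
    return COMPONENTS_BY_MACHINE_TYPE[machine_type]
```

Calling `components_for("vm-clws-prod-server-0")` returns the five server components, while any worker or gpuworker name returns just `["worker"]`.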

Note: Sign in to http://portal.azure.com using the codalab.worksheets@gmail.com account to view the VM info. (This is not necessary for day-to-day work.)

In the above VM names you might have noticed the <env> placeholder: this refers to the deployment environments, which run code at different levels of maturity. These can be:

prod: Production-grade code, running from well-tested releases and updated only on schedule by the person on call.
dev: Development-phase code, staged from the master branch to test on real infrastructure.

We have separate VMs for each of these environments, and each VM runs a copy of the Docker images that instantiate its components.

1.2 Repo Overview

This codalab-deployment repo contains credentials, scripts, deployment templates, and basically anything else you'd need to manipulate the public CodaLab deployments.

README.md            # The file you are currently looking at
certs/               # Credentials like SSH keys and login information
azure/               # For managing Azure resources like VMs
ansible/             # For managing server and worker processes on these VMs
on-call/             # Some useful tools for managing and checking the status of our services

Here are some of the tools to be familiar with:

Docker (docs): Allows us to create containers from configured images, which are executable packages that include everything needed to run an application: the code, a runtime, libraries, environment variables, and configuration files. A container is like a lightweight VM that lets us reproduce and isolate the environments that our various components run in (even if they're on the same machine).
Ansible (docs): A deployment/config automation system. We use it to set up our Azure VMs exactly as we need them and start our Docker containers remotely. Without this we'd have to manually run a whole bunch of CLI commands each time.
Vault: TODO(transition to this) Use Vault to encrypt secrets link, best practices docs