This is where the ScienceBeam model is trained.
You can read more about the computer vision model in the Wiki.
Prerequisites
- Python 2.7 (currently Apache Beam doesn't support Python 3)
- Apache Beam
- TensorFlow with Google Cloud support
- gsutil
Dependencies
Dependencies not already mentioned in the prerequisites can be installed by running:
pip install -r requirements.txt
and:
pip install -r requirements-dev.txt
Cython
Build the Cython extensions in place by running:
python setup.py build_ext --inplace
Local vs. Cloud
Almost all of the commands can be run either locally or in the cloud. To run a command in the cloud, add --cloud
to it. gsutil must be installed even when running locally.
Before running anything in the cloud, run upload-config.sh
to copy the required configuration to the cloud.
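As a rough illustration of how a wrapper script could tell the two modes apart, the sketch below scans its arguments for --cloud. The function name run_mode is hypothetical and is not taken from the repository's scripts; the actual commands may handle the flag differently.

```shell
#!/bin/bash
# Sketch only: run_mode is a hypothetical helper, not part of this repository.
# It reports "cloud" if --cloud appears among the arguments, otherwise "local".
run_mode() {
  mode=local
  for arg in "$@"; do
    if [ "$arg" = "--cloud" ]; then
      mode=cloud
    fi
  done
  echo "$mode"
}

run_mode            # prints "local"
run_mode --cloud    # prints "cloud"
```

A script built this way would default to local execution, so forgetting the flag never starts a cloud job by accident.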
Configuration
The default configuration is in the prepare-shell.sh script. Some of its variables can be overridden by adding a .config
file, e.g.:
#!/bin/bash
TRAINING_SUFFIX=-gan-1-l1-100
TRAINING_ARGS="--gan_weight=1 --l1_weight=100"
USE_SEPARATE_CHANNELS=true
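The override pattern can be sketched as follows, assuming the script sets its defaults first and then sources .config if the file exists. This is an assumption about the mechanism, not a copy of prepare-shell.sh. The default values and the temporary demo directory are made up for illustration.

```shell
#!/bin/bash
# Sketch only: shows the default-then-override pattern, not the real prepare-shell.sh.

# Hypothetical defaults (the real ones are defined in prepare-shell.sh).
TRAINING_SUFFIX=
TRAINING_ARGS=
USE_SEPARATE_CHANNELS=false

# Write an example override file to a temporary directory so that the demo
# does not touch a real .config file.
DEMO_DIR=$(mktemp -d)
cat > "$DEMO_DIR/.config" <<'EOF'
TRAINING_SUFFIX=-gan-1-l1-100
TRAINING_ARGS="--gan_weight=1 --l1_weight=100"
USE_SEPARATE_CHANNELS=true
EOF

# Source the override file if it is present; its assignments replace the defaults.
if [ -f "$DEMO_DIR/.config" ]; then
  . "$DEMO_DIR/.config"
fi

echo "TRAINING_SUFFIX=$TRAINING_SUFFIX"
echo "TRAINING_ARGS=$TRAINING_ARGS"
echo "USE_SEPARATE_CHANNELS=$USE_SEPARATE_CHANNELS"

# Clean up the temporary directory.
rm -rf "$DEMO_DIR"
```

Variables that the override file does not mention keep their default values.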
Inspecting Configuration
The configuration can be inspected by running source prepare-shell.sh
in the current shell.
For example, the following sequence of commands prints the data directory:
source prepare-shell.sh
echo $DATA_PATH
The following sections may refer to variables defined by that script.
Pipeline
The TensorFlow training pipeline is illustrated in the following diagram:
The steps from the diagram are detailed below.
Preprocessing
The individual steps performed as part of the preprocessing are illustrated in the following diagram: