
6. Experiment

Once your data pipeline is ready, you can finally begin to train models.

Create and assign an experiment issue. Assigning this issue will automatically create an associated branch. Remember to link it to the corresponding milestone.

It's important to accurately describe the goals and assumptions behind your experiment.

Experiment branches

Each single change to the configurations and/or data pipeline should be associated with a separate experiment branch. Don't launch multiple experiments randomly changing hyperparameters; it's better to run fewer, well-reasoned experiments and document the conclusions after both failures and successes.

A good first step to validate the data pipeline is to launch an experiment with little or no modification to the configuration files included in the template, unless your data exploration revealed an incompatibility (e.g. the default input shape is far too large or small for your data).

Dataset config

You might need to update some paths in your dataset configuration files (configs/datasets):

  • Check that the default annotation and image paths point to the right folders.
  • Check that the CLASSES variable is set (in the segmentation case it must always contain at least the background class) and expand it with the classes defined in your ai-dataset schema, as in the sketch after this list.
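
A minimal sketch of the relevant part of a dataset config (the file name, class names and path comments below are placeholders, not the template's actual values):

# configs/datasets/<your_dataset>.py -- illustrative sketch only
# "background" is required in the segmentation case; the remaining names
# come from your ai-dataset schema.
CLASSES = ("background", "car", "boat")

# Also check that the annotation and image path variables defined in this file
# point to the folders produced by your data pipeline.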

You can find the default dataset config files under configs/datasets in the template.

Model config

You might also need to update your model configuration files (configs/models).

You need to update the num_classes variable of your model to match the number of classes in your dataset. This is how it looks for the default detection model:

bbox_head=dict(
    type="RetinaHead",
    num_classes=2,
    in_channels=256,
    . . .
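
For a classification model the equivalent field usually sits in the classification head. A hedged sketch (the head type and the exact structure of the template's resnet_18.py may differ):

head=dict(
    type="LinearClsHead",  # assumed head type, check the template's default
    num_classes=2,         # must match the number of classes in your dataset
    . . .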

In addition, if you need deterministic training, so that the experiment results are reproducible or comparable to other experiments trained with different hyperparameters, set the seed in the model config file by assigning it an integer value, such as

seed = 0

Warning

Deterministic operations are often slower than nondeterministic operations, so single-run performance may decrease for your model. However, determinism may save time in development by facilitating experimentation, debugging, and regression testing. See the PyTorch docs on reproducibility for more information.

Finally, if you want to train your model from scratch, without starting from the default pre-trained model weights, you must update the load_from parameter:

load_from = None

topk accuracy

If you are working with a classification experiment, remember that the topk argument is used for selecting different accuracies in the evaluation stage. The value of topk can be set in two different places: 1) in the model configuration as model['head']['topk']; and 2) in the evaluation dictionary (within the schedulers folder) as evaluation['metric_options']['topk']. Both topk arguments must have consistent values for the experiment to finish without errors. They can be set as a tuple of integer values, each of which should be less than or equal to the number of CLASSES used for the current experiment.
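
For instance, a consistent pair of settings could look like the sketch below (the file names are examples taken from this template's folders; the exact defaults may differ):

# configs/models/resnet_18.py (sketch)
model = dict(
    head=dict(
        topk=(1, 2),  # each value <= number of CLASSES
        . . .

# configs/schedulers/one_cycle_8_epochs.py (sketch)
evaluation = dict(
    metric_options=dict(topk=(1, 2)),  # must match model['head']['topk']
)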

Background class

If you are working with a segmentation experiment, remember that the background counts as a class, so if you want to segment two classes car and boat you have to set num_classes=3. You can find more info about how segmentation classes and masks work at the end of this page.
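
For example, in the car and boat case the segmentation head would be configured with three classes. A hedged sketch (the exact field name and file depend on the template's default segmentation model, e.g. fpn_r50.py):

decode_head=dict(
    num_classes=3,  # background + car + boat
    . . .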

Score thresholds

If you are working with a detection experiment, remember that two different score thresholds are used during an experiment. The first is score_thr, used in the testing stage to filter out predictions below a desired score; see the test_cfg section in the model config file. This score_thr has a default value of 0.05 and shouldn't be changed, as it is used for computing the mean_ap metric during the test stage. The other score value is show_score_thr, which is used in test_stage_detection.py. It has a default value of 0.3 and is used to paint only the detection boxes above that score in the test images.
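
As a reference, the relevant piece of the model config looks roughly like this (a sketch based on the description above; the exact nesting may differ):

# configs/models/retinanet_r50_fpn.py (sketch)
test_cfg = dict(
    score_thr=0.05,  # keep the default: used to compute mean_ap in the test stage
    . . .

# show_score_thr (default 0.3) lives in test_stage_detection.py and only controls
# which boxes are painted on the test images.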

Tensorboard metrics

You might want to use tensorboard metrics to compare results between different experiments. To do so, access your runtime configuration files (configs/runtimes) and uncomment the tensorboard hook line.

log_config = dict(
    interval=1,
    hooks=[
        dict(type="TextLoggerHook"),
        dict(type="MlflowLoggerHook", exp_name="{{cookiecutter.__project_slug}}"),
        dict(type="TensorboardLoggerHook", log_dir=""),
    ],
)

Experiment stage

Before going any further, if you have changed any of the config files, add those changes to git and commit:

git add configs/
git commit -m "Update configs"

Assuming you are working on the associated branch, you can now add a new stage for the experiment by editing dvc.yaml:

Adding an experiment stage

# dvc.yaml (classification example)
# . . .
  run_experiment_mlflow:
    cmd: export MLFLOW_TRACKING_URI="http://10.5.0.58:8999/" &&
      mlflow run . --experiment-name {{cookiecutter.__project_slug}} --no-conda
      -P dataset=configs/datasets/{{cookiecutter.dataset}}.py
      -P model=configs/models/resnet_18.py
      -P runtime=configs/runtimes/runtime.py
      -P scheduler=configs/schedulers/one_cycle_8_epochs.py
    deps:
      - configs
      - results/data/transform/coco_to_mmclassification-{{cookiecutter.dataset}}
    metrics:
      - results/metrics.json:
          cache: false
    plots:
      - results/prc.json:
          cache: false
          x: recall
          y: precision
# dvc.yaml (detection example)
# . . .
  run_experiment_mlflow:
    cmd: export MLFLOW_TRACKING_URI="http://10.10.30.58:8999/" &&
      mlflow run . --experiment-name {{cookiecutter.__project_slug}} --no-conda
      -P dataset=configs/datasets/{{cookiecutter.dataset}}.py
      -P model=configs/models/retinanet_r50_fpn.py
      -P runtime=configs/runtimes/runtime.py
      -P scheduler=configs/schedulers/one_cycle_8_epochs.py
    deps:
      - configs
      - results/data/transform/property_split-{{cookiecutter.dataset}}
    metrics:
      - results/metrics.json:
          cache: false
# dvc.yaml (segmentation example)
# . . .
  run_experiment_mlflow:
    cmd: export MLFLOW_TRACKING_URI="http://10.10.30.58:8999/" &&
      mlflow run . --experiment-name {{cookiecutter.__project_slug}} --no-conda
      -P dataset=configs/datasets/{{cookiecutter.dataset}}.py
      -P model=configs/models/fpn_r50.py
      -P runtime=configs/runtimes/runtime.py
      -P scheduler=configs/schedulers/one_cycle_8_epochs.py
    deps:
      - configs
      - results/data/transform/coco_to_mmsegmentation-{{cookiecutter.dataset}}
    metrics:
      - results/metrics.json:
          cache: false

Note

You can add the gpu argument (-P gpu=N) to choose which GPU selection mode to use. See the GPU Usage section for further detail on how to set it properly.

After that, if you run dvc repro in a terminal, the data pipeline stages should be skipped and the new experiment stage launched. Depending on your project, this might take a while to complete. Results will be logged to the experiment tracking server under the experiment-name section.

You can add, commit and push the changes:

git add dvc.yaml dvc.lock results/
git commit -m "Finished experiment"
git push
dvc push

Explore the logged results in detail and, regardless of the success or failure of the experiment, open a pull request, filling in the corresponding experiment section.

The background mystery: segmentation masks

Typically we assume the existence of the background class when we define our classes using the COCO annotations schema. In this schema, class ids start at 1, e.g. {"id": 1, "name": "car"}, {"id": 2, "name": "boat"}, and so on; the id 0 is therefore always reserved for the background class, even though it isn't explicitly defined. After processing our COCO annotations with the coco_to_mmsegmentation.py script, a set of masks is created, one per image. These masks are numpy arrays that contain the ground truth segmentation class for each pixel of the image: this class can be background (represented by a 0) or any of the defined classes (represented by their corresponding class id). The absence of a user-defined class for a pixel is handled as if it were a background pixel, so it isn't necessary to explicitly annotate the background of your images. It would be very inconvenient to annotate every background pixel of an image, so in the end we just annotate the most interesting parts of it (our defined classes) and let the coco_to_mmsegmentation.py script automatically "annotate" all the background pixels. This is very helpful, but you shouldn't forget about the background class just because you don't have to annotate it.
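
As a toy illustration (not code from the template), a tiny mask for the classes car and boat could look like this:

import numpy as np

# 4x4 ground-truth mask for CLASSES = ("background", "car", "boat")
mask = np.array(
    [
        [0, 0, 0, 0],  # 0 = background (any pixel without an annotation)
        [0, 1, 1, 0],  # 1 = car
        [0, 1, 1, 0],
        [2, 2, 0, 0],  # 2 = boat
    ],
    dtype=np.uint8,
)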