Unverified commit 3b53fe15, authored by Wenwei Zhang, committed by GitHub

Add python path in scripts (#2625)

* add python path in scripts

* Make python path work

* Shorter path

* Add docs

* Change dist scripts

* Change slurm example
parent 99c879d2
@@ -40,7 +40,7 @@ This project is released under the [Apache 2.0 license](LICENSE).
## Changelog
-v2.0.0 was released on 5/5/2020.
+v2.0.0 was released on 6/5/2020.
Please refer to [changelog.md](docs/changelog.md) for details and release history.
## Benchmark and model zoo
......
## Changelog
-### v2.0.0 (4/5/2020)
+### v2.0.0 (6/5/2020)
In this release, we made many major refactorings and modifications.
1. **Faster speed**. We optimize the training and inference speed for common models, achieving up to 30% speedup for training and 25% for inference. Please refer to [model zoo](model_zoo.md#comparison-with-detectron2) for details.
......
@@ -288,13 +288,13 @@ Difference between `resume-from` and `load-from`:
If you run MMDetection on a cluster managed with [slurm](https://slurm.schedmd.com/), you can use the script `slurm_train.sh`. (This script also supports single-machine training.)
```shell
-./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} [${GPUS}]
+[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
```
Here is an example of using 16 GPUs to train Mask R-CNN on the dev partition.
```shell
-./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x 16
+GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x
```
You can check [slurm_train.sh](https://github.com/open-mmlab/mmdetection/blob/master/tools/slurm_train.sh) for full arguments and environment variables.
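Besides `GPUS`, the script also reads `GPUS_PER_NODE`, `CPUS_PER_TASK`, and `SRUN_ARGS` from the environment (see the script changes below). A minimal sketch, reusing the partition, config, and work directory from the example above and assuming the 16 GPUs are spread over two 8-GPU nodes:
```shell
# Run the same 16-GPU job on two nodes with 8 GPUs each and 5 CPUs per task.
GPUS=16 GPUS_PER_NODE=8 CPUS_PER_TASK=5 \
    ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x
```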
@@ -330,8 +330,8 @@ dist_params = dict(backend='nccl', port=29501)
Then you can launch two jobs with `config1.py` and `config2.py`.
```shell
-CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} 4
-CUDA_VISIBLE_DEVICES=4,5,6,7 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} 4
+CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
+CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
```
## Useful tools
......
@@ -128,16 +128,10 @@ pip install -v -e .
### Using multiple MMDetection versions
If there is more than one mmdetection version on your machine and you want to use them alternately, the recommended way is to create multiple conda environments and use a different environment for each version.
+The train and test scripts already modify `PYTHONPATH` to ensure that they use the MMDetection in the current directory.
-Another way is to insert the following code into the main scripts (`train.py`, `test.py`, or any other script you run):
-```python
-import os.path as osp
-import sys
-sys.path.insert(0, osp.join(osp.dirname(osp.abspath(__file__)), '../'))
-```
+To use the default MMDetection installed in the environment rather than the one you are working with, you can remove the following line from those scripts:
-Or run the following command in the terminal of the corresponding folder to temporarily use the current one:
```shell
-export PYTHONPATH=`pwd`:$PYTHONPATH
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH
```
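If it is unclear which copy ends up being used, a quick sanity check (not part of the original docs) is to prepend the repository to `PYTHONPATH` and print where `mmdet` is imported from:
```shell
# Run from the root of the MMDetection checkout you want to use.
PYTHONPATH=$(pwd):$PYTHONPATH python -c "import mmdet; print(mmdet.__version__, mmdet.__file__)"
```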
#!/usr/bin/env bash
-PYTHON=${PYTHON:-"python"}
CONFIG=$1
CHECKPOINT=$2
GPUS=$3
PORT=${PORT:-29500}
-$PYTHON -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
+python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
$(dirname "$0")/test.py $CONFIG $CHECKPOINT --launcher pytorch ${@:4}
#!/usr/bin/env bash
-PYTHON=${PYTHON:-"python"}
CONFIG=$1
GPUS=$2
PORT=${PORT:-29500}
-$PYTHON -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
+python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
$(dirname "$0")/train.py $CONFIG --launcher pytorch ${@:3}
#!/usr/bin/env bash
set -x
-export PYTHONPATH=`pwd`:$PYTHONPATH
PARTITION=$1
JOB_NAME=$2
CONFIG=$3
@@ -12,6 +12,7 @@ CPUS_PER_TASK=${CPUS_PER_TASK:-5}
PY_ARGS=${@:5}
SRUN_ARGS=${SRUN_ARGS:-""}
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
srun -p ${PARTITION} \
--job-name=${JOB_NAME} \
--gres=gpu:${GPUS_PER_NODE} \
......
@@ -6,12 +6,13 @@ PARTITION=$1
JOB_NAME=$2
CONFIG=$3
WORK_DIR=$4
-GPUS=${5:-8}
+GPUS=${GPUS:-8}
GPUS_PER_NODE=${GPUS_PER_NODE:-8}
CPUS_PER_TASK=${CPUS_PER_TASK:-5}
SRUN_ARGS=${SRUN_ARGS:-""}
-PY_ARGS=${PY_ARGS:-"--validate"}
+PY_ARGS=${@:5}
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
srun -p ${PARTITION} \
--job-name=${JOB_NAME} \
--gres=gpu:${GPUS_PER_NODE} \
......
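With `PY_ARGS=${@:5}`, everything after the fourth positional argument is now forwarded to `train.py`. A hedged sketch (paths are placeholders) that resumes the earlier `slurm_train.sh` example from its latest checkpoint:
```shell
GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py \
    /nfs/xxxx/mask_rcnn_r50_fpn_1x --resume-from /nfs/xxxx/mask_rcnn_r50_fpn_1x/latest.pth
```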