## Mirror sites
We use AWS as the main site to host our model zoo, and maintain a mirror on Aliyun.
You can replace `` with `` in model URLs.
- All FPN baselines and RPN-C4 baselines were trained using 8 GPUs with a batch size of 16 (2 images per GPU). Other C4 baselines were trained using 8 GPUs with a batch size of 8 (1 image per GPU).
- All models were trained on `coco_2017_train` and tested on `coco_2017_val`.
- We use distributed training, and BN layer statistics are fixed.
- We adopt the same training schedules as Detectron. 1x indicates 12 epochs and 2x indicates 24 epochs, which corresponds to slightly fewer iterations than Detectron; the difference is negligible.
- All pytorch-style backbones pretrained on ImageNet are from the PyTorch model zoo.
- For fair comparison with other codebases, we report the GPU memory as the maximum value of `torch.cuda.max_memory_allocated()` across all 8 GPUs. Note that this value is usually less than what `nvidia-smi` shows.
- We report the inference time as the overall time including data loading, network forwarding and post processing.
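
The schedule note above can be checked with a little arithmetic. The sketch below converts an epoch-based schedule into an iteration count and compares it with an iteration-based one. The dataset size (about 118k images in `coco_2017_train`) and the 90k-iteration figure for Detectron's 1x schedule are assumptions brought in for illustration, not numbers stated in this document, and `iterations_for_schedule` is a hypothetical helper.

```python
import math


def iterations_for_schedule(num_images: int, total_batch_size: int, epochs: int) -> int:
    """Return the number of training iterations in an epoch-based schedule."""
    iters_per_epoch = math.ceil(num_images / total_batch_size)
    return iters_per_epoch * epochs


# Assumed values for illustration only (not taken from this document).
COCO_2017_TRAIN_IMAGES = 118_287  # approximate size of coco_2017_train
DETECTRON_1X_ITERS = 90_000       # assumed length of Detectron's 1x schedule

gpus, imgs_per_gpu = 8, 2
total_batch = gpus * imgs_per_gpu  # batch size of 16, as for the FPN baselines

iters_1x = iterations_for_schedule(COCO_2017_TRAIN_IMAGES, total_batch, epochs=12)
iters_2x = iterations_for_schedule(COCO_2017_TRAIN_IMAGES, total_batch, epochs=24)

print(f"1x: {iters_1x} iterations, {iters_1x / DETECTRON_1X_ITERS:.1%} of the assumed Detectron 1x")
print(f"2x: {iters_2x} iterations")
```

Under these assumptions the epoch-based 1x schedule comes out a little over 1% shorter than the iteration-based one, which is consistent with the difference being negligible.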
More models with different backbones will be added to the model zoo.
Please refer to [Faster R-CNN]( for details.
Please refer to [Mask R-CNN]( for details.
Please refer to [Fast R-CNN]( for details.
Please refer to [RetinaNet]( for details.
Please refer to [Cascade R-CNN]( for details.
Please refer to [Cascade Mask R-CNN]( for details.
### Hybrid Task Cascade (HTC)
Please refer to [HTC]( for details.
Please refer to [SSD]( for details.
Please refer to [Group Normalization]( for details.
Please refer to [Weight Standardization]( for details.
Please refer to [Deformable Convolutional Networks]( for details.
### CARAFE: Content-Aware ReAssembly of FEatures
Please refer to [CARAFE]( for details.
### Instaboost
Please refer to [Instaboost]( for details.
Jiangmiao Pang
### Libra R-CNN
Please refer to [Libra R-CNN]( for details.
Jiangmiao Pang
### Guided Anchoring
Please refer to [Guided Anchoring]( for details.
Please refer to [FCOS]( for details.
Please refer to [FoveaBox]( for details.
Please refer to [RepPoints]( for details.
Please refer to [FreeAnchor]( for details.
Please refer to [Grid R-CNN]( for details.
Please refer to [GHM]( for details.
Please refer to [GCNet]( for details.
Please refer to [HRNet]( for details.
Please refer to [Mask Scoring R-CNN]( for details.
Please refer to [Rethinking ImageNet Pre-training]( for details.
Please refer to [NAS-FPN]( for details.
### ATSS
Please refer to [ATSS]( for details.
We also benchmark some methods on [PASCAL VOC](, [Cityscapes]( and [WIDER FACE](
We compare mmdetection with [Detectron2](
The backbone used is R-50-FPN.
- 8 NVIDIA Tesla V100 GPUs
- Intel Xeon 4114 CPU @ 2.20GHz
- Python 3.7
- PyTorch 1.4
- CUDA 10.1
- CUDNN 7.6.03
- NCCL 2.4.08
<th>Lr schd</th>
<td rowspan="2">Faster R-CNN</td>
<td>38.6 & 35.2</td>
<td>38.8 & 35.4</td>
<td>41.0 & 37.2 </td>
The training speed is measured in s/iter. Lower is better.
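
To relate an s/iter figure to more familiar quantities, the sketch below converts it into training throughput and total wall-clock time. The helper names and the sample value of 0.25 s/iter are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this benchmark.

```python
def images_per_second(sec_per_iter: float, total_batch_size: int) -> float:
    """Training throughput implied by a seconds-per-iteration figure."""
    return total_batch_size / sec_per_iter


def schedule_hours(sec_per_iter: float, num_iters: int) -> float:
    """Wall-clock hours needed to run num_iters iterations."""
    return sec_per_iter * num_iters / 3600.0


# Hypothetical numbers for illustration only.
sample_sec_per_iter = 0.25
throughput = images_per_second(sample_sec_per_iter, total_batch_size=16)
hours = schedule_hours(sample_sec_per_iter, num_iters=90_000)

print(f"{throughput:.1f} img/s, {hours:.2f} h for 90k iterations")
```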
<th>Detectron2 (V100)</th>
<th>mmdetection (V100)</th>
The inference speed is measured in fps (img/s) on a single GPU. Higher is better.
To be consistent with Detectron2, we report the pure inference speed (without the time of data loading).
For Mask R-CNN, we exclude the time of RLE encoding in post-processing.
The Detectron2 speed in brackets was measured on our own server and is slightly slower than the official speed.
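
A minimal sketch of how a "pure inference" fps number can be measured is shown below. It is not the benchmark script used for these results. `measure_fps` is a hypothetical helper: inputs are loaded before the clock starts so data loading is excluded, a few warm-up calls are discarded, and an optional `sync` callback (for example `torch.cuda.synchronize` when timing a GPU model) makes sure asynchronous work has finished before the clock is read.

```python
import time
from typing import Any, Callable, Iterable, Optional


def measure_fps(
    infer: Callable[[Any], Any],
    inputs: Iterable[Any],
    warmup: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return images per second for infer, excluding data loading time."""
    samples = list(inputs)  # load everything up front so loading is not timed
    timed = samples[warmup:]
    if not timed:
        raise ValueError("need more inputs than warm-up iterations")

    # Warm-up calls are run but not timed.
    for sample in samples[:warmup]:
        infer(sample)
    if sync is not None:
        sync()

    start = time.perf_counter()
    for sample in timed:
        infer(sample)
    if sync is not None:
        sync()  # wait for pending asynchronous work before stopping the clock
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed


def dummy_model(x: int) -> int:
    """CPU-only stand-in for a detector, used only to exercise the helper."""
    return sum(i * x for i in range(10_000))


fps = measure_fps(dummy_model, range(25), warmup=5)
print(f"{fps:.1f} img/s")
```

Post-processing steps that should be excluded, such as RLE encoding for Mask R-CNN, would simply be left out of the `infer` callable.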
<th>mmdetection (V100)</th>
<td>18.2 (17.8)</td>
<td>Faster R-CNN</td>
<td>Mask R-CNN</td>