YOLOv5 in PyTorch > ONNX > CoreML > iOS
YOLOv5 in PyTorch > ONNX > CoreML > iOS glenn-jocher released this
This release aggregates various minor bug fixes and performance improvements since the main v3.0 release and incorporates PyTorch 1.7.0 compatibility updates. v3.1 models share weights with v3.0 models but contain minor module updates (inplace
fields for nn.Hardswish() activations) for native PyTorch 1.7.0 compatibility. For PyTorch 1.7.0 release updates see https://github.com/pytorch/pytorch/releases/tag/v1.7.0.
Breaking Changes
- 'giou' hyperparameter has been renamed to 'box' to better reflect a criteria-agnostic regression loss term (#1120)
Bug Fixes
- PyTorch 1.7 compatibility update.
torch>=1.6.0
required,torch>=1.7.0
recommended (#1233) - GhostConv module bug fix (#1176)
- Rectangular padding min stride bug fix from 64 to 32 (#1165)
- Mosaic4 bug fix (#1021)
- Logging directory runs/exp bug fix (#978)
- Various additional
Added Functionality
- PyTorch Hub functionality with YOLOv5 .autoshape() method added (#1210)
- Autolabelling addition and standardization across detect.py and test.py (#1182)
- Precision-Recall Curve automatic plotting when testing (#1107)
- Self-host VOC dataset for more reliable access and faster downloading (#1077)
- Adding option to output autolabel confidence with --save-conf in test.py and detect.py (#994)
- Google App Engine deployment option (#964)
- Infinite Dataloader for faster training (#876)
- Various additional
Assets
6
glenn-jocher released this
This releases includes nn.Hardswish() activation implementation on Conv() modules, which increases mAP for all models at the expense of about 10% in inference speed. Training speeds are not significantly affected, though CUDA memory requirements increase about 10%. Training from scratch as well as finetuning both benefit from this change. The smallest models benefit the most from the Hardswish() activations, with increases of +0.9/+0.8/+0.7/+0.2mAP@0.5:0.95 for YOLOv5s/m/l/x.
All mAP values in our README are now reported at --img-size 640 (v2.0 reported at 672, and v1.0 reported at 736), so we've succeeded in increasing mAP while reducing the required --img-size :)
We've also listed YOLOv5x Test Time Augmentation (TTA) mAP and speeds for v3.0 in our README table for the first time (and for v2.0 below). Best results are YOLOv5x with TTA at 50.8 mAP@0.5:0.95. We've also updated efficientdet results in our comparison plot to reflect recent improvements in the google/automl repo.
Breaking Changes
- This release does not contain breaking changes.
- This release is only backwards compatible with v2.0 models trained with torch>=1.6.
Bug Fixes
- Hyperparameter evolution fixed, tutorial added (#607)
Added Functionality
- PyTorch 1.6 compatible.
torch>=1.6
required (43a616a) - PyTorch 1.6 native Automatic Mixed Precision (AMP) replaces NVIDIA Apex AMP (#573)
nn.Hardswish()
activations replacenn.LeakyReLU(0.1)
in base convolution modulemodels.Conv()
- Dataset Autodownload feature added (#685)
- Model Autodownload improved (#711)
- Layer freezing code added (#679)
- TensorRT export tutorial added (#623)
** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8.
- August 13, 2020: v3.0 release: nn.Hardswish() activations, data autodownload, native AMP.
- July 23, 2020: v2.0 release: improved model definition, training and mAP.
- June 22, 2020: PANet updates: new heads, reduced parameters, improved speed and mAP 364fcfd.
- June 19, 2020: FP16 as new default for smaller checkpoints and faster inference d4c6674.
- June 9, 2020: CSP updates: improved speed, size, and accuracy (credit to @WongKinYiu for CSP).
- May 27, 2020: Public release. YOLOv5 models are SOTA among all known YOLO implementations.
- April 1, 2020: Start development of future compound-scaled YOLOv3/YOLOv4-based PyTorch models.
Pretrained Checkpoints
v3.0 with nn.Hardswish()
Model | APval | APtest | AP50 | SpeedGPU | FPSGPU | params | FLOPS | |
---|---|---|---|---|---|---|---|---|
YOLOv5s | 37.0 | 37.0 | 56.2 | 2.4ms | 476 | 7.5M | 13.2B | |
YOLOv5m | 44.3 | 44.3 | 63.2 | 3.4ms | 333 | 21.8M | 39.4B | |
YOLOv5l | 47.7 | 47.7 | 66.5 | 4.4ms | 256 | 47.8M | 88.1B | |
YOLOv5x | 49.2 | 49.2 | 67.7 | 6.9ms | 164 | 89.0M | 166.4B | |
YOLOv5x + TTA | 50.8 | 50.8 | 68.9 | 25.5ms | 39 | 89.0M | 354.3B | |
YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation except for TTA. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.001
. Test Time Augmentation (TTA) runs at 3 image sizes. Reproduce TTA results by python test.py --data coco.yaml --img 832 --augment
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
v2.0 with nn.LeakyReLU(0.1)
Model | APval | APtest | AP50 | SpeedGPU | FPSGPU | params | FLOPS | |
---|---|---|---|---|---|---|---|---|
YOLOv5s | 36.1 | 36.1 | 55.3 | 2.2ms | 476 | 7.5M | 13.2B | |
YOLOv5m | 43.5 | 43.5 | 62.5 | 3.2ms | 333 | 21.8M | 39.4B | |
YOLOv5l | 47.0 | 47.1 | 65.6 | 4.1ms | 256 | 47.8M | 88.1B | |
YOLOv5x | 49.0 | 49.0 | 67.4 | 6.4ms | 164 | 89.0M | 166.4B | |
YOLOv5x + TTA | 50.4 | 50.4 | 68.5 | 23.4ms | 43 | 89.0M | 354.3B | |
YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --data coco.yaml --img 672 --conf 0.001
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
Assets
6
glenn-jocher released this
Breaking Changes
IMPORTANT: v2.0 release contains breaking changes. Models trained with earlier versions will not operate correctly with v2.0. The last commit before v2.0 that operates correctly with all earlier pretrained models is:
https://github.com/ultralytics/yolov5/tree/5e970d45c44fff11d1eb29bfc21bed9553abf986
To clone last commit prior to v2.0:
git clone https://github.com/ultralytics/yolov5 # clone repo
cd yolov5
git reset --hard 5e970d4 # last commit before v2.0
Bug Fixes
- Various
Added Functionality
- Various
** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 8, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS.
- July 23, 2020: v2.0 release: improved model definition, training and mAP.
- June 22, 2020: PANet updates: new heads, reduced parameters, improved speed and mAP 364fcfd.
- June 19, 2020: FP16 as new default for smaller checkpoints and faster inference d4c6674.
- June 9, 2020: CSP updates: improved speed, size, and accuracy (credit to @WongKinYiu for CSP).
- May 27, 2020: Public release. YOLOv5 models are SOTA among all known YOLO implementations.
- April 1, 2020: Start development of future compound-scaled YOLOv3/YOLOv4-based PyTorch models.
Pretrained Checkpoints
Model | APval | APtest | AP50 | SpeedGPU | FPSGPU | params | FLOPS | |
---|---|---|---|---|---|---|---|---|
YOLOv5s | 36.1 | 36.1 | 55.3 | 2.1ms | 476 | 7.5M | 13.2B | |
YOLOv5m | 43.5 | 43.5 | 62.5 | 3.0ms | 333 | 21.8M | 39.4B | |
YOLOv5l | 47.0 | 47.1 | 65.6 | 3.9ms | 256 | 47.8M | 88.1B | |
YOLOv5x | 49.0 | 49.0 | 67.4 | 6.1ms | 164 | 89.0M | 166.4B | |
YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --data coco.yaml --img 672 --conf 0.001
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
Assets
2
glenn-jocher released this
YOLOv5 1.0 Release Notes
- June 22, 2020: PANet updates: increased layers, reduced parameters, faster inference and improved mAP 364fcfd.
- June 19, 2020: FP16 as new default for smaller checkpoints and faster inference d4c6674.
- June 9, 2020: CSP updates: improved speed, size, and accuracy. Credit to @WongKinYiu for excellent CSP work.
- May 27, 2020: Public release of repo. YOLOv5 models are SOTA among all known YOLO implementations.
- April 1, 2020: Start development of future YOLOv3/YOLOv4-based PyTorch models in a range of compound-scaled sizes.
Pretrained Checkpoints
Model | APval | APtest | AP50 | SpeedGPU | FPSGPU | params | FLOPS | |
---|---|---|---|---|---|---|---|---|
YOLOv5s | 36.6 | 36.6 | 55.8 | 2.1ms | 476 | 7.5M | 13.2B | |
YOLOv5m | 43.4 | 43.4 | 62.4 | 3.0ms | 333 | 21.8M | 39.4B | |
YOLOv5l | 46.6 | 46.7 | 65.4 | 3.9ms | 256 | 47.8M | 88.1B | |
YOLOv5x | 48.4 | 48.4 | 66.9 | 6.1ms | 164 | 89.0M | 166.4B | |
YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --img 736 --conf 0.001
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
Assets
2
Watchers:160 |
Star:8089 |
Fork:2653 |
创建时间: 2020-05-18 11:45:11 |
最后Commits: 昨天 |
许可协议:GPL-3.0 |
69be8e7
Compare
Verified
This release implements two architecture changes to YOLOv5, as well as various bug fixes and performance improvements.
Breaking Changes
Bug Fixes
Added Functionality
Updated Results
Latest models are all slightly smaller to due removal of one convolution within each bottleneck, which have been renamed as C3() modules now in light of the 3 I/O convolutions each one does vs the 4 in the standard CSP bottleneck. The previous manual concatenation and LeakyReLU(0.1) activations have both removed, simplifying the architecture, reducing parameter count, and better exploiting the .fuse() operation at inference time.
nn.SiLU() activations replace nn.LeakyReLU(0.1) and nn.Hardswish() activations throughout the model, simplifying the architecture as we now only have one single activation function used everywhere rather than the two types before.
In general the changes result in smaller models (89.0M params -> 87.7M YOLOv5x), faster inference times (6.9ms -> 6.0ms), and improved mAP (49.2 -> 50.1) for all models except YOLOv5s, which reduced mAP slightly (37.0 -> 36.8). In general the largest models benefit the most from this update. YOLOv5x in particular is now above 50.0 mAP at --img-size 640, which may be the first time this is possible at 640 resolution for any architecture I'm aware of (correct me if I'm wrong though).
Pretrained Checkpoints