Computer Vision Model Leaderboard

Leaderboard Methodology

powered by supervision | model-leaderboard

benchmarked on: COCO 2017 (validation set, 5k images)

Frequently Asked Questions

What is mAP?

Mean Average Precision (mAP) is a metric used to evaluate object detection models. For each class, Average Precision (AP) is the area under the precision-recall curve; mAP is the average of AP across classes, computed at one or more IoU thresholds.
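
As an illustration only (not the leaderboard's actual evaluation code), here is a minimal NumPy sketch of Average Precision for a single class at a single IoU threshold. It assumes predictions have already been matched against ground truth (so each one is marked true or false positive), and it uses all-point interpolation rather than COCO's 101-point variant; the function name average_precision is hypothetical.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class at one IoU threshold.

    scores:            confidence of each prediction
    is_true_positive:  1 if the prediction matched a ground-truth box (IoU above
                       the threshold), 0 otherwise
    num_ground_truth:  total number of ground-truth boxes for this class
    """
    order = np.argsort(-np.asarray(scores))               # sort by confidence, descending
    tp = np.asarray(is_true_positive, dtype=float)[order]

    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)

    recall = cum_tp / max(num_ground_truth, 1)
    precision = cum_tp / (cum_tp + cum_fp)

    # Integrate precision over recall (all-point interpolation).
    recall = np.concatenate(([0.0], recall, [1.0]))
    precision = np.concatenate(([0.0], precision, [0.0]))
    precision = np.maximum.accumulate(precision[::-1])[::-1]  # make precision monotonic
    return float(np.sum(np.diff(recall) * precision[1:]))

# Example: 4 predictions, 3 ground-truth boxes.
print(average_precision([0.9, 0.8, 0.7, 0.3], [1, 1, 0, 1], 3))  # ~0.92
```

mAP is then the mean of this value over all classes; COCO additionally averages it over several IoU thresholds, as described below.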

What is the difference between mAP 50, mAP 75 and mAP 50:95?

mAP can be evaluated at multiple IoU thresholds. mAP 50, for example, counts a detection as correct only if it overlaps a ground-truth box with an IoU of 0.5 or greater - everything else is a false positive. mAP 50:95 is the average over all considered IoU thresholds - 0.50, 0.55, …, 0.90, 0.95. It is the primary metric showing how well the model performs across increasing levels of rigour.
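
As a rough sketch of the averaging step: given mAP values computed at each of the ten COCO thresholds (the per-threshold numbers below are made-up placeholders, e.g. produced by the AP sketch above averaged over classes), mAP 50:95 is simply their mean.

```python
import numpy as np

# The ten COCO IoU thresholds: 0.50, 0.55, ..., 0.90, 0.95.
iou_thresholds = np.linspace(0.50, 0.95, 10)

# Hypothetical per-threshold mAP values, one per IoU threshold above.
map_per_threshold = [0.62, 0.60, 0.58, 0.55, 0.52, 0.48, 0.43, 0.36, 0.27, 0.15]

# mAP 50:95 is the mean over the ten thresholds.
map_50_95 = float(np.mean(map_per_threshold))
print(map_50_95)  # 0.456
```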

What do the small, medium, and large labels next to the mAP scores mean?

The small, medium, and large labels next to the mAP scores indicate the size of the objects being evaluated. This matters because object detection models often struggle with small objects. COCO groups objects into three size categories by area: small (smaller than 32x32 pixels), medium (between 32x32 and 96x96 pixels), and large (larger than 96x96 pixels). You can learn more on the COCO detection evaluation page.
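
As a rough illustration, here is how a detection could be assigned to one of the COCO size buckets. For simplicity it uses bounding-box area, whereas COCO officially uses the object's segmentation area; the area thresholds 32**2 = 1024 and 96**2 = 9216 pixels are the values used in the COCO evaluation.

```python
def coco_size_bucket(width, height):
    """Assign a box to a COCO size bucket based on its area in pixels."""
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"

print(coco_size_bucket(20, 20))   # small
print(coco_size_bucket(50, 50))   # medium
print(coco_size_bucket(120, 90))  # large
```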

What is F1 Score?

Recall measures how many of the target objects were detected. If the model predicts more cars than there are in the image but still captures every real car among its predictions, recall is 100%. Precision measures how many of the detected objects are correct. If the model put boxes on some of the cars in an image but labelled 20% of them as bicycles, precision is 80%, regardless of how many cars it found. What if you want both high recall and high precision? The F1 Score combines the two into a single metric: a high F1 score means the model achieved both high precision and high recall.

Here is the formula for F1 Score, where P is precision and R is recall:

F1 = 2PR / (P + R)
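
Here is a small sketch of how precision, recall, and F1 can be computed from raw detection counts; the example numbers are made up for illustration.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2PR / (P + R), computed from detection counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# Example: 80 correct detections, 20 wrong detections, 10 missed objects.
print(f1_score(80, 20, 10))  # precision 0.80, recall ~0.89, F1 ~0.84
```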

Is this project open source?

Yes! You can find the code for this project on GitHub.

What is the gear icon next to model names?

Hovering over it shows the parameters used to run the model. We aim to make the parameters as similar as possible to the ones used by the original authors.

What parameters were used to run each model?

We aim to make the parameters as similar as possible to the ones used by the original authors. Hover over the gear icon next to the model name to see the parameters used.

Can I suggest a model to benchmark?

Yes! If there is a model that you would like to see benchmarked, you may open a PR with the model instructions. Please have a look at the README to learn about the structure of the repository.