This article presents 5 awesome annotation tools which I hope will help you create Computer Vision datasets.
If you are a Data Scientist working in Computer Vision, you have probably realized that you need a fast and simple labeling tool for at least one of these two reasons:
to create your datasets for PoC or R&D experiments
to ensure the quality of your data so that it won’t affect the performance of your Deep Learning algorithms
I dug deep into the world of computer vision labeling and found that it is filled with an impressive number of tools (see these three awesome-lists here, here and there, or this blog post). I spent quite some time comparing the most promising (and active) projects, and learned that most of these tools were designed to serve only one of three audiences:
If you want to start a labeling business, you need:
advanced project management features
tons of features so any task can be done
automation tools to increase efficiency
If you belong to a startup you probably require:
APIs or at least, simple ways to connect the labeling tool to private APIs
an intuitive user experience (UX) so each temporary annotator you hire can start working instantly
If you are working on your own and you:
don’t care about APIs / project management
just want to start tagging as fast as you can!
Here is a quick list of my favorite tools for annotating bounding boxes (detection) and polygons (segmentation) for computer vision applications.
If any of these tools does not work as expected, try running it in Chrome!
In Computer Vision, there are mainly three types of data for training your algorithms:
Picture + label for training classifiers (ResNets)
Bounding box + label for detectors (YOLOv3, Faster R-CNN…)
Polygon + label for segmentation applications (Mask R-CNN)
Difference between segmentation data (blue) and detection data (purple)
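To make the three data types concrete, here is a minimal Python sketch of how each kind of annotation could be represented. The class and field names are hypothetical and chosen for illustration only; they do not match the export format of any tool discussed in this article.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ClassificationLabel:
    """One label for the whole picture (classifier training)."""
    image_path: str
    label: str


@dataclass
class BoundingBox:
    """An axis-aligned box plus a label (detector training)."""
    image_path: str
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class Polygon:
    """A closed outline plus a label (segmentation training)."""
    image_path: str
    label: str
    points: List[Tuple[float, float]]  # (x, y) vertices in pixel coordinates

    def bounding_box(self) -> BoundingBox:
        """Reduce the polygon to the smallest box that encloses it."""
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return BoundingBox(
            self.image_path, self.label, min(xs), min(ys), max(xs), max(ys)
        )
```

Note the asymmetry: a polygon can always be reduced to a bounding box, but not the other way around, which is why segmentation data is the most expensive of the three to produce.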
As you have probably realized, one of the most important factors in the success of an AI project is the quantity of “quality data” that you can use. What I mean by “quality data” for computer vision applications is:
every picture/annotation has an appropriate label
each bounding box or polygon accurately surrounds the entity to train on
Even though the latter definition certainly lacks objectivity, we want our algorithms to achieve human-level performance. Thus, we require “human-level” annotations.
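One common way to put a number on "accurately surrounds" is intersection over union (IoU) between two boxes, for example the boxes drawn by two annotators on the same object. The sketch below assumes boxes are given as `(x_min, y_min, x_max, y_max)` tuples. Using IoU as an annotation quality check is my assumption here, not something any of the tools below necessarily do.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

Two annotations of the same object that score well below 1.0 are a hint that at least one of them should be reviewed. Where to put the threshold depends on the task.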
Computer Vision Annotation Tool (CVAT)
Almost 20 years after introducing OpenCV, Intel returned to the computer vision field by releasing CVAT, a very powerful and complete annotation tool. Even though it takes some time to learn and master, it offers tons of features for labeling computer vision data.
Strengths:
It is easy to install and scale since it is a web app running in Docker
It offers a lot of automation tools (such as automatic annotation using the TensorFlow Object Detection API, video interpolation…)
It allows managing collaborative work so different members of a team can work together on the same annotation task
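To give an idea of what video interpolation saves you, here is a minimal sketch of the underlying idea: annotate a box on two keyframes and fill in the frames between them linearly. This illustrates the general technique only and is not CVAT's actual implementation. The function name and the `(x_min, y_min, x_max, y_max)` box layout are assumptions.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def interpolate_box(
    start: Box, end: Box, start_frame: int, end_frame: int, frame: int
) -> Box:
    """Linearly interpolate a box between two annotated keyframes."""
    if not start_frame <= frame <= end_frame:
        raise ValueError("frame must lie between the two keyframes")
    if end_frame == start_frame:
        return start

    # Fraction of the way from the first keyframe to the second.
    t = (frame - start_frame) / (end_frame - start_frame)
    return tuple(s + t * (e - s) for s, e in zip(start, end))


# Box halfway between frame 0 and frame 10.
middle = interpolate_box((0, 0, 10, 10), (20, 10, 30, 20), 0, 10, 5)
```

With this approach the annotator only draws the object on a few keyframes instead of on every frame, then corrects the frames where the linear guess drifts.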
Weaknesses:
The UI is quite complicated. For instance, setting up an annotation task can be quite tricky the first time
Not very intuitive at first, it can take several days to master
Only runs in Chrome, so you have to find workarounds if you fear Google…
Check out an online Demo here
Visual Object Tagging Tools (VoTT)
VoTT is developed by Microsoft and offers an awesome user experience, which will probably save you a lot of time and energy while annotating. Moreover, creating a project is straightforward, so you don’t need to dive deep into the docs to use it.
Strengths:
The code is really well written (in React) and its interfaces are clearly defined, so it is easy to fork it and add the extra functionality you need
As I said, the UX is perfect, with a Dark Theme and a dashed grid following your mouse so it is really easy to know where to start a bounding box. This might seem like a bonus, but trust me, this really makes a difference!
It can use deep learning models to detect objects automatically (it ships with an SSD model trained on COCO classes)
It comes as a web app and as an Electron app. This lets you use it either as a thick client or as an app running in your web browser
Weaknesses:
To use the web app version, your data needs to be hosted on Azure, Microsoft’s cloud computing service (the Electron version lets you use data on your hard drive, but you need to install it with npm).
It does not provide a built-in API (it is quite easy to tweak the code to allow your private API to communicate with it though)
You cannot label a whole picture: you are only allowed to draw bounding boxes (or polygons) with associated labels. Thus, it is not suitable for creating a classification dataset
Check out the web-app here!
DataTurks
DataTurks is a startup created in 2018 which offers services for labeling images, videos and also text. It was a paid service until it recently became open source (this is probably linked to the fact that Walmart bought it in February 2019). Even though the company barely communicated about it and seems to have ceased all development since, the annotation tool is awesome and now free!
When you are using it, you will still see license notices and mentions of a limited non-commercial edition in various places; don’t pay attention to them. DataTurks is now free and you can use all of its features (I have tried and tested it)!
Strengths:
Like CVAT, it is a web app running in Docker (see here to get the Docker image)
Allows collaborative and asynchronous work: two teammates working on the same dataset won’t get the same image to annotate
Provides an API for creating annotation tasks and retrieving the results
Weaknesses:
DataTurks seems to have stopped working on its product
The UX is OK but small tweaks could make it so much better
Check out an online Demo here
Make-sense
Make-sense was released just two months ago (in June 2019 if you are reading this in the future…) and already has an incredible UX. Getting started with annotation has never been so fast! Go to the website, drag and drop your images and start annotating.
Strengths:
Fast, efficient but most of all, easy!
Really cool UX
As for privacy about the images you load, don’t worry because as they say: “[They] don’t store your images, because [they] don’t send them anywhere in the first place”
Weaknesses:
Does not provide any project management features
Does not provide any API either
Start labeling here!
I hope this article has helped you choose a labeling tool that fits your needs, and don’t hesitate to comment if you have found better ones!
LabelMe, which is developed by MIT. See here for an online version, or there to run it on premises.
Coco-Annotator (here) seems fully featured even though it is quite recent (user authentication system, API endpoints). Check out the demo here (Username: admin, Password: password)