To adapt an old adage, training a machine learning model is a "garbage in, garbage out" process. That is, if you use low-quality, inaccurate annotations, the model you train on that data will not perform well. But not to worry: at CrowdAI, we've spent countless hours curating training data using our own in-house annotation tools, so we know exactly how to approach annotation to avoid feeding any "garbage" into your training process.
Perhaps the most important component of creating high-quality training data (and thus a high-quality model) is the set of instructions you create to manage the annotation process. Here's our step-by-step guide to creating annotation instructions in our platform, as well as some best practices along the way.
Create Project Instructions
Project Instructions consist of the high-level information you want everyone to know when annotating media in a specific Project. The instructions that go in this section are not tied to any one Category (e.g. "cars" or "damage"), but instead apply across the entire Project itself.
For example, you may create a Project using geospatial imagery, but you want to exclude any examples that are in snowy conditions. In your Project Instructions, you could let everyone know to skip or reject annotation tasks that include snowy imagery.
Create instructions for a Category
Categories are the individual objects, features, or characteristics you want to build a computer vision model to identify. When you create a new Category in your account, you'll want to provide specific instructions for how someone should label that Category if they come across it in an image or video.
This information is what we call Category Instructions, and it belongs to the Category itself. That means if you re-use that Category across more than one Project in your account, those instructions will automatically copy over and be ready for use again and again!
Here's a full article on how to create Categories for annotation, and below is a quick walkthrough of how to add instructions to a newly created Category.
First, open your Category Manager from within your Project's Annotate page by clicking Manage Categories. This will pull up the full list of all Categories ever created in your account.
Click +Add Category.
Give your new Category a name, and decide if it's nested under another Category (if you leave nesting blank, we'll treat this as a top-level Category). Pick a label type and color, then scroll down.
The Instructions section has a rich text "what you see is what you get" (WYSIWYG) editor. This is where you should enter the specific instructions for how to label just this Category. Remember: there's a separate field for Project-specific instructions, so make sure what you put in this box is relevant only to this new Category itself.
You can create tables, embed images, and use bulleted or numbered lists to organize your information. We highly recommend using a few visual examples here. After all, computer vision is all about vision!
Once you're satisfied with your instructions, click Create.
Now that you've created Category Instructions, you can attach that Category to a Project by searching for it in the What's Being Annotated section.
CrowdAI will automatically pull in your Project Instructions plus the Category Instructions for any Categories you've attached to the Project. You can click Preview Instructions to see how this information is collated and will be presented to users directly within an annotation task.
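As a rough mental model of that collation, here is a minimal Python sketch. It is not CrowdAI's implementation or API: the `Category` and `Project` classes and the `preview_instructions` function are hypothetical names invented for illustration. It only shows the idea that instructions live on the Category, so the same text follows the Category into every Project it is attached to.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    # Hypothetical model: instructions belong to the Category itself.
    name: str
    instructions: str


@dataclass
class Project:
    # Hypothetical model: Project Instructions apply across the whole Project.
    name: str
    instructions: str
    categories: list = field(default_factory=list)


def preview_instructions(project: Project) -> str:
    """Collate Project Instructions, then each attached Category's instructions."""
    sections = [f"Project: {project.name}\n{project.instructions}"]
    for category in project.categories:
        sections.append(f"Category: {category.name}\n{category.instructions}")
    return "\n\n".join(sections)


# Illustrative data only, loosely based on the examples in this article.
cars = Category("cars", "Label every car. Buses do not count.")
project = Project(
    "Geospatial imagery",
    "Skip or reject tasks that include snowy imagery.",
    [cars],
)
print(preview_instructions(project))
```

Attaching the same `cars` object to a second hypothetical Project would carry its instructions along unchanged, which mirrors how Category Instructions are re-used across Projects.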
And that's it! You're now ready to kick off annotation with robust instructions for each Category in your Project.
Best practices for Category Instructions
Here are some of our own best practices to consider when creating Category Instructions.
Be specific. Remember that someone else may be the one reading these instructions and deciding how to annotate media for you. The more specific you are about what you're looking for in a particular image or video, the more likely someone else will be able to find it!
Think about edge cases. If you're looking to annotate cars that pass by a traffic camera, for example, stop to think about what else might pass by that camera and could be mistaken for a car. Do buses count? What about pickup trucks? Be sure to include directions on how to deal with these edge cases in your instructions.
Include visual examples! As we mentioned above, computer vision is inherently a visual process. Include a picture or two in your Category Instructions to help illustrate your directions for other users.