Pixel-level Semantic Image Segmentation

Semantic image segmentation is a form of image annotation in which every pixel of the image is assigned to a category.

                                                                                                                                                              
Think of it as building a classification model for every pixel in the image, with the class labels as the possible outputs. It is not an easy task.
Annotating with pixel-level precision is challenging, which is why only a handful of image annotation and data labeling companies include it in their offering. While bounding boxes for object detection remain the predominant annotation technique for machine learning models in computer vision, semantic segmentation is a growing practice because it allows more precise image analysis.
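
The "one classification per pixel" view described above can be sketched in a few lines. This is a minimal illustration, not a real model: the scores are made-up numbers, and the function name `segment` is hypothetical. It only shows that the output of semantic segmentation is a label map with one class index per pixel.

```python
def segment(scores):
    """Pick the highest-scoring class index for each pixel.

    scores[y][x] is a list with one score per class (illustrative values).
    Returns a label map with the same height and width as the input.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]

# A hypothetical 2x2 image with three class scores per pixel.
scores = [
    [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]],
    [[0.1, 0.6, 0.3], [0.1, 0.2, 0.7]],
]
label_map = segment(scores)
print(label_map)  # [[0, 1], [1, 2]]
```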
 
One popular use case was the Kaggle Carvana Image Masking Challenge. The idea behind this classic challenge was to mask out the backgrounds and noisy parts of the photos that users upload to a car-sales website. Such photo enhancement strategies can be a game changer for the photography and web industries.
 
So how do you make image segmentation truly pixel-level and ensure that its foundation – polygonal annotation – translates into full masks with no pixels lost?
Case Study: Semantic Segmentation for a Photography Company
 
The interesting part of this particular segmentation task is defining the layers. In most aerial images the classes (buildings, vegetation, roads) do not overlap. In photography segmentation the objects can overlap, so we define the classes using indexing:
  • Background – index 0 class.
  • Plates, packaging, bowls – index 1 (level-1) class.
  • Food – index 2 (level-2) class.
We tested bounding box annotation for the entire background, covering all the pixels, followed by polygonal annotation of the classes, with indexing used to resolve the overlapping points of objects in the image.
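
The layering idea above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the tooling used in the project: the function names `point_in_polygon` and `build_index_map`, the even-odd fill rule, pixel-centre sampling, and the toy polygons are all assumptions. It is written in plain Python to stay self-contained. Every pixel starts as background (index 0), and polygons are then painted from the lowest index to the highest, so a higher-level class overwrites a lower one wherever they overlap.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd rule test of point (px, py) against a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through py, to the right of px?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def build_index_map(width, height, layers):
    """Paint polygons into a label map, lowest class index first.

    layers: list of (class_index, polygon) pairs. Index 0 (background) is
    the initial fill, so every pixel always receives a label.
    """
    label_map = [[0] * width for _ in range(height)]
    for class_index, polygon in sorted(layers, key=lambda layer: layer[0]):
        for y in range(height):
            for x in range(width):
                # Sample each pixel at its centre (an assumption of this sketch).
                if point_in_polygon(x + 0.5, y + 0.5, polygon):
                    label_map[y][x] = class_index
    return label_map

# Hypothetical 6x6 image: a plate (index 1) with food (index 2) on top of it.
plate = [(1, 1), (5, 1), (5, 5), (1, 5)]
food = [(2, 2), (4, 2), (4, 4), (2, 4)]
label_map = build_index_map(6, 6, [(2, food), (1, plate)])
for row in label_map:
    print("".join(str(value) for value in row))
```

Because the layers are sorted by index before painting, the order in which annotators draw the polygons does not affect the result.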
 
By using indexing instead of calculating the x/y coordinates of the objects, any street-level image, 90° angle image, or full aerial view can be annotated, and where necessary the pixels of a higher-level class can be subtracted from the lower-level class.
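
The subtraction step mentioned above can be sketched as follows, assuming each class has already been turned into a full binary mask that may overlap with the others. The function name `visible_masks` and the toy masks are hypothetical. Working from the highest index down, each class keeps only the pixels not already claimed by a higher level.

```python
def visible_masks(masks):
    """Subtract higher-level pixels from each lower-level mask.

    masks: dict mapping class index -> 2D list of 0/1 values (full masks
    of equal size that may overlap).
    Returns a dict in which each pixel belongs to at most one class:
    the highest index that covers it.
    """
    levels = sorted(masks, reverse=True)
    height = len(masks[levels[0]])
    width = len(masks[levels[0]][0])
    covered = [[0] * width for _ in range(height)]  # union of higher levels so far
    result = {}
    for level in levels:
        mask = masks[level]
        result[level] = [
            [1 if mask[y][x] and not covered[y][x] else 0 for x in range(width)]
            for y in range(height)
        ]
        # Mark this level's pixels as taken before moving to lower levels.
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    covered[y][x] = 1
    return result

# Hypothetical 2x4 masks: background everywhere, a plate, and food on the plate.
masks = {
    0: [[1, 1, 1, 1], [1, 1, 1, 1]],
    1: [[0, 1, 1, 1], [0, 1, 1, 1]],
    2: [[0, 0, 1, 0], [0, 0, 1, 0]],
}
separated = visible_masks(masks)
print(separated[1])  # [[0, 1, 0, 1], [0, 1, 0, 1]]
```

In this example the background mask covers the whole image, so after subtraction every pixel belongs to exactly one class and none is lost.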
 
From left to right, the example shows the annotation for the background, the level-1 class, and then the third level of indexing, which together produce the fully segmented, labeled image containing more than 8 different classes across 3 levels of indexing.
If you have an interesting and challenging case in semantic segmentation, reach out.