AI Challenges in Domestic Object Recognition

In search of the best model

Computer vision and convolutional neural networks (CNNs) are powerful tools for solving a wide range of problems and can greatly improve daily life. In this white paper we share our experience testing different hypotheses, with the aim of training the best CNN model: one that outputs representation vectors of images such that images from the same class stay close together in the embedding space, while images from different classes lie farther apart.
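To make the embedding objective concrete, here is a minimal sketch of a triplet-style margin loss, a common way to express "same class close, different class far". It is an illustration only and not the loss defined in the white paper; the function names, the margin value, and the toy 2-D vectors are assumptions chosen for the example.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hypothetical illustration of a margin-based embedding loss.

    The loss is zero once the negative (different class) is at least
    `margin` farther from the anchor than the positive (same class).
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (assumed values, not taken from the study).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # same class, distance 1 from the anchor
negative = [3.0, 4.0]   # different class, distance 5 from the anchor

# Same-class image is close, other class is far: nothing to penalize.
well_separated = triplet_margin_loss(anchor, positive, negative)

# Roles swapped: the "same class" image is far away, so the loss is large.
poorly_separated = triplet_margin_loss(anchor, negative, positive)

print(well_separated, poorly_separated)
```

A model trained against a loss of this kind is pushed toward the property described above: minimizing it shrinks same-class distances relative to cross-class distances.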

We use a transfer learning approach (fig. 1) and try to solve the problem under constraints such as memory and inference time. The dataset used in the study is not publicly accessible and contains different food products. In the analysis, we compare baseline CNN architectures, various input shapes, and cost functions, including our own definition of Quadruplet loss. To measure model performance, we propose a new metric named Mean Percentile Intersection.

Figure 1: CNN architecture for representation vectors
