Quantization is the procedure of assigning a feature vector to the closest visual word of a predefined visual vocabulary. Once the visual dictionary is learnt, each descriptor of an image is quantized, and the histogram of visual-word occurrences serves as a global description of the image. The histogram values are then usually scaled to [0, 1] and fed to the classifier for either training or testing. The efficiency of this part of the system is crucial, since it affects the processing time of both training and testing. The complexity of descriptor quantization depends mainly on the dimensionality of the descriptor and the number of visual words.
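As a minimal sketch of this step (the function and variable names are illustrative, not the paper's implementation), nearest-neighbour assignment and histogram construction can be written as:

```python
import numpy as np

def quantize(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word (Euclidean distance)."""
    # Pairwise squared distances, shape (n_descriptors, n_words)
    d = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def bof_histogram(descriptors, vocabulary):
    """Histogram of visual-word occurrences, scaled to [0, 1]."""
    words = quantize(descriptors, vocabulary)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.max() if hist.max() > 0 else hist
```

A brute-force distance computation like this scales linearly with the vocabulary size, which is why quantization cost grows with both the descriptor dimensionality and the number of visual words.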
The image classification stage is involved in both the training and testing phases. In order to identify the most appropriate classifier for this specific problem, several experiments with three supervised classification methods were conducted.
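The three methods are not named at this point in the text; purely as an illustration of a supervised classifier operating on BoF histograms, a toy nearest-centroid classifier (a hypothetical stand-in, not one of the paper's methods) can be sketched as:

```python
import numpy as np

class NearestCentroid:
    """Toy supervised classifier over BoF histograms: predicts the class
    whose mean training histogram is closest in Euclidean distance."""

    def fit(self, X, y):
        self.labels_ = np.unique(y)
        # One centroid per class: the mean feature vector of its samples
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.labels_[d.argmin(axis=1)]
```

The same fit/predict interface applies whichever supervised method is finally chosen; only the decision rule changes.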
Fig. 1. Design for Admin Work
Fig. 1 illustrates the design for uploading images from the food list dataset and shows the results for the uploaded images.
After uploading the images, key points are extracted using techniques such as interest point detectors, random sampling, and dense sampling. As an interest point detector we use the SIFT (scale-invariant feature transform) technique.
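SIFT detection itself is normally delegated to a library; the two simpler sampling strategies, however, can be sketched in a few lines of NumPy (function names and the grid/seed parameters are assumptions for illustration):

```python
import numpy as np

def dense_keypoints(h, w, step=8):
    """Dense sampling: key points on a regular grid with a fixed step."""
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    return np.stack([ys.ravel(), xs.ravel()], axis=1)

def random_keypoints(h, w, n, seed=0):
    """Random sampling: n key points drawn uniformly over the image."""
    rng = np.random.default_rng(seed)
    ys = rng.integers(0, h, size=n)
    xs = rng.integers(0, w, size=n)
    return np.stack([ys, xs], axis=1)
```

Dense sampling guarantees uniform coverage of the image, while random sampling trades coverage for a controllable number of key points.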
Fig. 2. Key Point Extraction Techniques: a) Design for SIFT Sampling, b) Design for Dense Sampling, c) Design for Random Sampling
Fig. 2 shows the various key point extraction techniques. These key points are the central points of the detected patches; a local image descriptor is then applied to a rectangular area around each key point.
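The "rectangular area around each key point" step can be sketched as a simple patch crop (a minimal sketch with assumed names; edge padding is one possible way to handle key points near the border):

```python
import numpy as np

def extract_patches(image, keypoints, size=16):
    """Crop a (size x size) patch centred on each (y, x) key point; a local
    descriptor would then be computed over each patch."""
    half = size // 2
    # Pad by replicating edge pixels so border key points still get full patches
    padded = np.pad(image, half, mode="edge")
    return np.stack([padded[y:y + size, x:x + size] for y, x in keypoints])
```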
After key point extraction, we perform descriptor quantization. Once the visual dictionary is learnt, each descriptor of an image is quantized, and the histogram of visual-word occurrences serves as a global description of the image. Colour histograms are the most common colour descriptors; they represent the colour distribution of an image. Here we decompose the colour space into three channels: red, green, and blue.
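A per-channel colour histogram of this kind can be sketched as follows (a minimal illustration, assuming an 8-bit RGB image and a configurable bin count):

```python
import numpy as np

def colour_histogram(image, bins=8):
    """Concatenated per-channel (R, G, B) histograms, each scaled to [0, 1]."""
    feats = []
    for c in range(3):  # red, green, blue channels
        h, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        feats.append(h / h.max() if h.max() > 0 else h)
    return np.concatenate(feats)
```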
Fig. 3. Design for Feature Description
Fig. 3 illustrates the design of the feature description module; it contains the RGB values of the uploaded images. Once the descriptors of an image have been computed, the most representative patches need to be identified. The patches are clustered into a predefined number of clusters. The centres of the clusters constitute the dictionary of visual words, whereas the entire clustering procedure is known as dictionary learning.
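The clustering step is typically k-means; a bare-bones version (a sketch with assumed names, not the paper's implementation, which would normally use an optimized library routine) looks like:

```python
import numpy as np

def learn_dictionary(descriptors, k, iters=20, seed=0):
    """Plain k-means over descriptors; the k cluster centres form the
    visual-word dictionary."""
    rng = np.random.default_rng(seed)
    # Initialize centres from k distinct random descriptors
    centres = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest centre
        d = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        # Move each centre to the mean of its assigned descriptors
        new = centres.copy()
        for j in range(k):
            members = descriptors[assign == j]
            if len(members):
                new[j] = members.mean(axis=0)
        centres = new
    return centres
```

The choice of k fixes the vocabulary size, and therefore both the histogram length and the quantization cost discussed earlier.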
Fig. 4. Design for Adding Image in Food List
Fig. 4 shows the design for adding an image to the food list: after classification is complete, the classifier determines the food type and the image is stored in the food list.
The final, optimized system achieved an overall recognition accuracy on the order of 78%, proving the feasibility of a BoF-based image classification system for the food recognition problem.
VII. FUTURE ENHANCEMENT
A hierarchical approach will be investigated by merging visually similar classes at the first levels of the hierarchical model; these can then be distinguished at a later level by exploiting appropriate discriminative features. Moreover, enhancing the visual dataset with more images will improve the classification rates, especially for the classes with high diversity.