Graph-based food ingredient detection

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Graph-based food ingredient detection Ghotbi, Borna

Abstract

In this work, we address the problem of food ingredient detection from meal images, which is an intermediate step for generating cooking instructions. Although image-based object detection is a familiar task in computer vision and has been studied extensively in the last decades, the existing models are not suitable for detecting food ingredients. Normally objects in an image are explicit, but ingredients in food photos are most often invisible (integrated) and hence need to be inferred in a much more contextual manner. To this end, we explore an end-to-end neural framework with the core property of learning the relationships between ingredient pairs. We incorporate a Transformer module followed by a Gated Graph Attention Network (GGAT) to determine the ingredient list for the input dish image. This framework encodes ingredients in a contextual yet order-less manner. Furthermore, we validate our design choices through a series of ablation studies and demonstrate state-of-the-art performance on the Recipe1M dataset.

Item Metadata

Title	Graph-based food ingredient detection
Creator	Ghotbi, Borna
Publisher	University of British Columbia
Date Issued	2019
Description	In this work, we address the problem of food ingredient detection from meal images, which is an intermediate step for generating cooking instructions. Although image-based object detection is a familiar task in computer vision and has been studied extensively in the last decades, the existing models are not suitable for detecting food ingredients. Normally objects in an image are explicit, but ingredients in food photos are most often invisible (integrated) and hence need to be inferred in a much more contextual manner. To this end, we explore an end-to-end neural framework with the core property of learning the relationships between ingredient pairs. We incorporate a Transformer module followed by a Gated Graph Attention Network (GGAT) to determine the ingredient list for the input dish image. This framework encodes ingredients in a contextual yet order-less manner. Furthermore, we validate our design choices through a series of ablation studies and demonstrate state-of-the-art performance on the Recipe1M dataset.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2019-12-16
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0387154
URI	http://hdl.handle.net/2429/72773
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2020-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

Graph-based food ingredient detection Ghotbi, Borna

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights