UBC Theses and Dissertations
Graph-based food ingredient detection
Ghotbi, Borna
Abstract
In this work, we address the problem of food ingredient detection from meal images, an intermediate step toward generating cooking instructions. Although image-based object detection is a familiar task in computer vision and has been studied extensively over the past decades, existing models are not suitable for detecting food ingredients. Objects in an image are normally visually explicit, but ingredients in food photos are most often invisible (integrated into the dish) and hence need to be inferred in a much more contextual manner. To this end, we explore an end-to-end neural framework whose core property is learning the relationships between ingredient pairs. We incorporate a Transformer module followed by a Gated Graph Attention Network (GGAT) to determine the ingredient list for an input dish image. This framework encodes ingredients in a contextual yet order-less manner. Furthermore, we validate our design choices through a series of ablation studies and demonstrate state-of-the-art performance on the Recipe1M dataset.
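As a rough illustration of the graph-attention idea at the heart of the framework, the sketch below runs one gated attention round over a fully connected graph of ingredient nodes. It is a minimal sketch only: the scalar node states, the dot-product attention score, and the function names are illustrative assumptions, not the thesis's actual GGAT implementation (which operates on learned feature vectors with trained weight matrices).

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gated_attention_step(h):
    """One message-passing round over a fully connected ingredient graph.

    h: list of scalar node states (illustrative; a real model would use
    feature vectors).  Each node attends to all other nodes, aggregates
    their states with softmax attention, and a sigmoid gate controls how
    much of the aggregated message updates the node's own state.
    """
    new_h = []
    for i, hi in enumerate(h):
        neighbours = [hj for j, hj in enumerate(h) if j != i]
        # attention score: similarity (product) of the two node states
        alphas = softmax([hi * hj for hj in neighbours])
        message = sum(a * hj for a, hj in zip(alphas, neighbours))
        gate = 1.0 / (1.0 + math.exp(-message))  # sigmoid gate in (0, 1)
        # gated mix of old state and aggregated neighbour message
        new_h.append((1.0 - gate) * hi + gate * message)
    return new_h

states = gated_attention_step([0.2, -0.5, 1.0])
```

Because each updated state is a convex combination of the node's old state and an attention-weighted average of its neighbours, the outputs stay within the range of the inputs; stacking several such rounds lets information about co-occurring ingredients propagate across the graph.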
Item Metadata
Title | Graph-based food ingredient detection
Creator | Ghotbi, Borna
Publisher | University of British Columbia
Date Issued | 2019
Genre |
Type |
Language | eng
Date Available | 2019-12-16
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0387154
URI |
Degree |
Program |
Affiliation |
Degree Grantor | University of British Columbia
Graduation Date | 2020-05
Campus |
Scholarly Level | Graduate
Rights URI |
Aggregated Source Repository | DSpace