UBC Theses and Dissertations

Graph-based food ingredient detection

Ghotbi, Borna

Abstract

In this work, we address the problem of food ingredient detection from meal images, which is an intermediate step for generating cooking instructions. Although image-based object detection is a familiar task in computer vision and has been studied extensively in recent decades, existing models are not suitable for detecting food ingredients. Objects in an image are normally explicit, but ingredients in food photos are most often invisible (integrated into the dish) and hence need to be inferred in a much more contextual manner. To this end, we explore an end-to-end neural framework whose core property is learning the relationships between ingredient pairs. We incorporate a Transformer module followed by a Gated Graph Attention Network (GGAT) to determine the ingredient list for the input dish image. This framework encodes ingredients in a contextual yet orderless manner. Furthermore, we validate our design choices through a series of ablation studies and demonstrate state-of-the-art performance on the Recipe1M dataset.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International