Manipulating Scale-Dependent Perception of Images

by Matthew Trentacoste

B.Sc. Computer Science, Carnegie Mellon University
M.Sc. Computer Science, The University of British Columbia

Doctor of Philosophy in Computer Science
The University of British Columbia (Vancouver)
November 2011

© Matthew Trentacoste, 2011

Abstract

The purpose of most images is to effectively convey information. Implicit in this assumption is the fact that the recipient of that information is a human observer, with a visual system responsible for converting raw sensory inputs into the perceived appearance. The appearance of an image depends not only on the image itself, but also on the conditions under which it is viewed and on the response of the human visual system to those inputs. This thesis examines the scale-dependent nature of image appearance, in which the same stimulus can appear different when viewed at different scales, a phenomenon that arises from the mechanisms responsible for spatial vision in the brain. In particular, this work investigates changes in the perception of blur and contrast that result from the image being represented by different portions of the viewer's visual system as the image scale changes. These methods take inspiration from the fundamental organization of spatial image perception into multiple parallel channels for processing visual information, and employ models of human spatial vision to more accurately control the appearance of images under changing viewing conditions. The result is a series of methods for understanding the blur and contrast present in images and manipulating the appearance of those qualities in a perceptually meaningful way.

Preface

All publications that have resulted from the research presented in this thesis, along with the relative contributions of the collaborating authors, are listed below.

Scale-Dependent Perception of Countershading. M. Trentacoste, R. Mantiuk and W. Heidrich. This work is currently under submission [Trentacoste et al., b] and is presented in Chapter 6. The author had the initial idea, implemented the software for both the perceptual experiment and the algorithm, wrote the majority of the paper and compiled the submission video. Dr. Mantiuk analyzed the results of the perceptual experiment, contributed to the writing of the paper and provided insights in discussions. Dr. Heidrich supervised the project, aided in writing the paper and contributed to discussions.

Synthetic Depth-of-Field for Mobile Devices. M. Trentacoste, R. Mantiuk, W. Heidrich. This work is currently awaiting resubmission [Trentacoste et al., c] and is presented in Chapter 4. The author had the initial idea, implemented the software, and wrote the paper. Dr. Mantiuk provided input in discussions. Dr. Heidrich supervised the project, contributed ideas and aided in drafting the paper.

Blur-Aware Image Downsampling. M. Trentacoste, R. Mantiuk, W. Heidrich. The contents of this paper [Trentacoste et al., a] are split between chapters: the details of the blur estimation algorithm are included in Chapter 4, while Chapter 5 presents the blur-aware downsizing operator. The author had the initial idea, implemented the software for both the perceptual experiment and the algorithm, wrote the majority of the paper, compiled the submission video and presented the work at Eurographics.
Dr. Mantiuk analyzed the results of the perceptual experiment, contributed to the writing of the paper and provided insights in discussions. Dr. Heidrich supervised the project, contributed ideas and aided in writing the paper.

Quality-Preserving Image Downsizing. M. Trentacoste, R. Mantiuk, W. Heidrich. This poster [Trentacoste et al., b] described an earlier version of the work contained in Trentacoste et al. [a,c], and contained additional details on the appearance of noise in downsampled images. The pertinent contributions are also included in Chapter 5. The author had the initial idea, implemented the software, created the poster and presented it at the SIGGRAPH Student Research Competition. Dr. Mantiuk contributed to discussion. Dr. Heidrich supervised and contributed ideas.

Defocus Techniques for Camera Dynamic Range Expansion. M. Trentacoste, C. Lau, M. Rouf, R. Mantiuk, W. Heidrich. The contents of this paper [Trentacoste et al., a] are included in Chapter 7. Dr. Heidrich had the initial concept for the project, aided in the writing of the paper, supervised the project and contributed ideas. Cheryl Lau implemented portions of the software and provided input into the evaluation and during discussions. The author implemented the majority of the software, conducted the evaluation, drafted most of the paper, and presented the work at Electronic Imaging. Dr. Mantiuk and Mushfiqur Rouf provided input in discussions.

Portions of this thesis were conducted under "Perception of image characteristics and differences", Ethics Certificate Number H- of the UBC Research Ethics Board.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgments
1 Introduction
  1.1 Perceived Appearance
  1.2 Scale-Dependence
  1.3 Manipulating Image Appearance
  1.4 Novel Contributions
  1.5 Thesis Organization
2 Visual Perception
  2.1 Foundations of Vision
    2.1.1 Image appearance
    2.1.2 Spatial Vision
    2.1.3 Edges
  2.2 Blur Perception
    2.2.1 Blur Sensitivity
    2.2.2 Blur Appearance
  2.3 Spatial Contrast Perception
    2.3.1 Contrast Sensitivity
    2.3.2 Contrast Appearance
  2.4 Edge Profile Perception
  2.5 Image Appearance Models
3 Related Work
  3.1 Computational Photography
    3.1.1 Computational Image Capture
    3.1.2 Coded Aperture Imaging
    3.1.3 Deconvolution
  3.2 Blur Estimation and Manipulation
    3.2.1 Blur Estimation
    3.2.2 Edge Detection
    3.2.3 Elder and Zucker Minimum Reliable Scale
    3.2.4 Samadani et al. Blur Estimation
    3.2.5 Blur Magnification
  3.3 Countershading Operations
    3.3.1 Manipulating Sharpness
    3.3.2 Manipulating Contrast
  3.4 Related Applications
4 Synthetic Depth-of-Field for Mobile Devices
  4.1 Introduction
  4.2 Blur Estimation
  4.3 Noise-Robust Estimation
  4.4 Efficient Estimation
  4.5 Blur Synthesis
  4.6 Evaluation
  4.7 Conclusion
5 Blur-Aware Image Downsizing
  5.1 Introduction
  5.2 Experiment Design
  5.3 Experiment Results
  5.4 Model for Matching Blur Appearance
  5.5 Perceptually Accurate Blur Synthesis
  5.6 Evaluation
  5.7 Conclusion
6 Scale-Dependent Perception of Countershading
  6.1 Introduction
  6.2 Experiment Design
  6.3 Experimental Results
  6.4 Discussion and Relation to Other Studies
  6.5 Applications
    6.5.1 Avoiding Haloes When Resizing
    6.5.2 Local Tonemapping Operators
    6.5.3 Artifact-Free Unsharp Masking
    6.5.4 Countershading Analysis
    6.5.5 Viewer-Adaptive Display
  6.6 Conclusion
7 Defocus Techniques for Camera Dynamic Range Expansion
  7.1 Introduction
  7.2 Overview
  7.3 Physical Setup
  7.4 Coded Aperture
  7.5 Deconvolution
  7.6 Evaluation
  7.7 Conclusions
8 Discussion and Conclusions
  8.1 Benefits and Limitations
  8.2 Future Work
  8.3 Conclusions
Bibliography

List of Tables

Number of operations in blur estimation
Specifications of the Apple iPhone 3GS camera
Reduction in dynamic range as a function of aperture radius

List of Figures

Image analysis/synthesis framework
Adelson checkersquare illusion
Hybrid images of Oliva et al.
Neurological bases for visual channels
Example of band-pass filters
Sample receptive fields of simple cells
Definition of edge contrast
Examples of different blur filter frequency cutoffs
Blur discrimination threshold from Chen et al.
Ciuffreda et al. conceptual model of blur perception
Simultaneous blur contrast
Perceived object scale of Held et al.
Contrast sensitivity grating
Plot of contrast constancy
Example of simultaneous contrast
Reduction in apparent contrast of blurred regions
Countershading process
Comparison of perceived effect of countershading
Comparison of step and Cornsweet edges
Physical and perceived edge profiles
Strength of the Cornsweet illusion
Example Mondrians of Land and McCann
Plenoptic camera of Ng et al.
Coded aperture pattern for x-ray imaging (image reproduced from Gottesman and Fenimore)
Deconvolution results of Yuan et al.
Examples of different kinds of motion blur
Relationship between motion blur and spatial frequencies
Steps of Canny edge detector
Conceptualization of a Gaussian scale-space
Elder and Zucker minimum reliable scale
Visualization of estimated blur map
Samadani et al. blur magnification
Bae and Durand blur magnification
Amplification of details in high-pass image
Tonemapping operators
Restoration of lost contrast by Krawczyk et al.
Examples of intelligent upsampling
Examples of image retargeting
Comparison of original and synthetic depth-of-field (DOF) images
Flowchart of Samadani et al. blur estimation
Demonstration of calibrating blur estimation
Comparison of actual and estimated edge blur
Flowchart of combined minimum reliable scale and blur estimation
Comparison of formulations of rs(x, y) and ms(x, y)
Example of contrast inversion resulting from uncorrected blur map
Comparison of desired blur map md and modified version m′d
Example results for synthetic DOF
Example of misclassifying texture detail as noise
Blur perception experiment stimuli
Multi-dimensional scaling of image results and selected images
The results of the blur matching experiments
Average matching blur data
Blur model compared with the experiment results
Comparison of conventional downsample and our method
Another comparison of conventional downsample and our method
Comparison of appearance of blur at multiple downsamples
Comparison of naive downsampling, our method and Samadani et al.
Verification study containing thumbnails with different amounts of defocus blur
Comparison of indistinguishable and objectionable contrast countershading
Images used in the countershading perceptual experiment
Generation of the countershading enhancement in an image
Results for individual participants with and without outlier removal
The results averaged over all participants
How resizing changes the appearance of countershading
Comparison between bilateral filter σs specified in Durand and Dorsey and our choice
Comparison of naïve and weighted least-squares (WLS)-based high-pass images
Comparison of Seurat's original Le Bec du Hoc with our adjusted version
Placement of aperture filter in optical setup
Aperture filters evaluated
Sample images used in evaluation (images copyright Frederic Drago)
Comparison of deconvolution algorithms without added noise
Comparison of aperture filters without added noise
Comparison of deconvolution algorithms with added noise
Comparison of aperture filters with added noise

Glossary

CSF: contrast sensitivity function
DOF: depth-of-field
HDR: high dynamic range
HDTV: high-definition television
HVS: human visual system
JND: just-noticeable-difference
LDR: low dynamic range
LGN: lateral geniculate nucleus
MRScale: minimum reliable scale
MURA: Modified Uniformly Redundant Array
PSF: point spread function
PSNR: peak signal-to-noise ratio
SLR: single-lens reflex
SNR: signal-to-noise ratio
UMO: unsharp masking operator
WLS: weighted least-squares

Acknowledgments

Completing my degree has been quite a journey, and there are quite a few people along the way to whom I owe a debt of gratitude. Five years is a long time. Times change, people move apart. I have almost certainly forgotten someone who helped me out at one point or another; this list is by no means complete, and I am sure I owe thanks to many more people than I list here.

To all the people of the Imager and PSM labs who have given me guidance and camaraderie: Cheryl Lau, Brad Atcheson, Nasa Rouf, Tyson Brochu, Allan Rempel, Anika Mahmud, Lukas Ahrenberg. In particular, Gordon Wetzstein, who bore more than his share of my existential angst and provided much guidance on how to work on the right things.

To my friends who have supported me over the years, by making me dinner, buying me a drink, letting me rant or whatever else was necessary at the moment, including (in no particular order) Max Ulis, Hannah Holmes, Keenan O'Conner, Jamie Abugov, Drew Smith, Tru Moreland, Justine Townsend, Tom Schulz, Ross Kakushke, Erin Caton, Jody Marshall, Jen Mackie, Annie Chiavaroli, Kat Southam, Nicole Sanches, Jules Emmerson, Matthew Harris, Marin Patenaude, Nathan Fenwick, Hendrik Kueck, Kim Kowaliuk and Brian Leroux. To my housemates, Adam Barlev and Marlo Carpenter, who have been infinitely patient with my late nights, frantic running to and fro, and the small disasters I generally left in my wake. Finally, to the members of my adoptive West Coast family, Josh Williams, Kelsey Nash, Monica Pearson and Shelley Oliver, to whom I owe more gratitude than I can begin to summarize.

To my supervisory committee, Michiel van de Panne and Bob Woodham, for all their suggestions, input and encouragement. To my collaborator, Rafał Mantiuk, for all the time he spent helping my half-baked ideas come to fruition and all the guidance he gave me on how to be a better researcher. Finally, to my supervisor, Wolfgang Heidrich, for supervising the entirety of my graduate career: for all of his input and insights, his patience through my missteps and mess-ups, and for encouraging me, guiding me and continuing to believe in me even when I wasn't sure if I did myself.

Lastly, I would like to thank my family.
To my sisters, for the encouragement and belief that I could make it through, for always having the time to listen to whatever I needed to complain about, and for taking time to celebrate my minor victories along the way. And last but certainly not least, my parents, Kathy and Michael: for, of all the things they have given me, the realization that success does not depend on intelligence or luck, but on hard work and sticking with the things you believe in, no matter how uncomfortable it may feel at the moment.

Chapter 1
Introduction

1.1 Perceived Appearance

Consider the scenario of a professional photographer documenting a scene of interest: the subject could be candid portraits in low light, fast-moving sports or a macro shot of an insect. In any case, there is the chance of some undesired blur degrading the image, so the photographer diligently reviews the images on the viewfinder to ensure the necessary parts are in focus. After inspecting the captured images and finding them to be satisfactory, the photographer wraps up and heads back to their studio. However, when the photographer views the images at full size on their computer, they appear blurry, despite the smaller versions having appeared sharp. Most digital camera users have encountered this phenomenon on one or more occasions, where the viewfinder on the camera does not accurately represent the full-size image captured. When shown at a small size, the image often appears sharper than its larger-scale counterpart. The pixel content of the image did not change, but the image somehow looks different. While the image itself has remained unaltered, some change in how the image was viewed caused it to be understood differently.

Any discussion about image understanding includes assumptions about the perceptual makeup of the viewer, in this case the visual system of a human observer. One of the most powerful features of the human visual system (HVS) is its ability to operate effectively across a vast array of conditions, adapting to provide not only meaningful but consistent results. This property can easily be seen in our ability to read this page in both direct sunlight and a dimly lit room while the appearance of the page remains nearly constant across such a significant change.

The goal of visual perception is to provide a meaningful representation of the scene being observed. Due to various biological limitations, the neural channels in our visual system are not capable of conveying the vast range spanned by the physical quantities of that scene. To overcome these limitations and provide a representation of the scene that we recognize, the HVS must continually adapt to changing conditions. This adaptation relies upon representing how the image appears in a manner based not on the underlying physical amount of light received by the eye, but on a more general set of perceived quantities. The combination of these perceptual quantities describes the visual appearance of the scene, as understood by the viewer. The adaptation process inherent in perception preserves the visual appearance of the text on this page, allowing it to be equally readable in direct sun or dim lighting. We define the visual appearance of a stimulus as:

Visual appearance: The supra-threshold perception of a stimulus produced by a viewer's visual system, defined in terms of perceptual units like brightness and colorfulness.

Specifically, we distinguish the visual appearance of a stimulus from the measurements of the underlying physical object being observed.
Some aspects of visual appearance, such as perceived contrast, directly relate to an underlying property (physical contrast), while other aspects, such as colorfulness, are more abstract. We refer to the collection of physical properties of an object as the physical appearance, to distinguish between the physical stimuli and the perceived phenomenon.

The concept of visual appearance, and by extension image appearance, is a complex issue. Terms like color and brightness are simultaneously universal and elusive, intuitively understood by everyone yet hard to define rigorously. The HVS performs exceedingly well at the task of providing a consistent visual appearance of our surroundings, and we generally go about our lives with the intuitive notion that objects appear the same regardless of how we happen to be viewing them. However, there are numerous examples in everyday life where this assumption does not hold:

• Car headlights, barely noticeable during the day, are nearly blinding at night.
• Scenes appear more colorful and of higher contrast on a sunny day.
• Different color matte boards cause artwork to take on different appearances.

None of these examples can be explained by physical measurements of the subject matter alone. In all of these cases, we consider the visual appearance of the object of interest to have changed, while the underlying physical representation of that object has generally remained constant. Sometimes that change is consistent with our notions of the world, as in the example of the car headlights. Sometimes we tend not to even consider the change, as with scenes appearing more colorful on a sunny day. In both cases, the visual appearance of the scene changed, and we understood the change and accounted for it. In other cases, the change in visual appearance can be unexpected and even confusing, as is the case with the appearance of the artwork and the apparent change in sharpness of the photographs described at the beginning. In all cases, the visual appearance changed, but in these two examples the appearance changed in a way that we did not expect, and our understanding of the phenomena changed with it. Even after a lifetime of experience, the mind has not completely learned to account for all the nuances of the perceptual mechanisms it relies on to experience the world.

Consider the example of the matted artwork. Part of the reason the change in appearance is so unexpected is that none of the physical properties of the artwork itself changed. If there is a change in the physical properties of an object, we intuitively understand that we will perceive it differently. In this case, a change in one part of the scene (the matte) caused another part of the scene (the artwork) to change in appearance. This particular scenario brings us to an important realization: the perception of a stimulus does not occur in isolation, and the visual appearance of any given subject depends on more than just the inherent properties of the object being observed. This relationship implies that extrinsic factors, beyond the pixel content of the image, affect its perception. The fields of perception and color science have conducted considerable amounts of research into quantifying and correcting these changes in appearance.
The increases in colorfulness and contrast of a scene on a sunny day are respectively known as the Hunt effect and the Stevens effect, while the artwork changing in appearance as a result of the matte is known as the Bartleson-Breneman effect [see Fairchild].

1.2 Scale-Dependence

In terms of extrinsic factors, the field of color science is mostly concerned with the perception of global properties, such as image-wide contrast and luminance. More complete models [Fairchild and Johnson] that address the complexities of image appearance can account for some additional factors, but still fall short of capturing the full complexity of visual perception. None of the current models of image appearance account for what the photographer observed with regard to the sharpness of the images. This thesis investigates how the scale at which an image is viewed can alter the perception of sharpness and contrast features in that image, and how understanding image appearance can improve image processing algorithms.

Any change to the geometry of a given viewing condition, such as the size of the displayed image or the distance from the viewer, will cause a proportional change in the size of the image projected on the retina of the observer. As a result, the physical relation between the viewer and the image can determine the scale at which image features are perceived. Evidence from both psychophysical studies [Blakemore et al.] and neurological recordings [Hubel and Wiesel] suggests that the visual system contains mechanisms that only respond to a narrow range of spatial frequencies. That the threshold of human contrast sensitivity changes as a function of spatial frequency [Blakemore et al.] further supports this hypothesis. Changes in the physical relation of image and viewer can change the visual channel to which a given feature is mapped. As a result, differences in how individual channels represent visual information can cause perceived properties like sharpness and contrast to change in appearance when the image projected onto the retina is altered.

The size of the image formed on the retina is affected by aspects of the display-viewer relationship that include the physical size of the display, the distance between the viewer and the display, and the pixel dimensions of the displayed image. Often, changes in image appearance are the result of a change in more than one of these aspects. Walking three times closer to a monitor, or switching from a screen in a movie theater to an iPhone, results in considerable changes in the size of the image projected on the retina. Knowing how large the display will be and how far away the viewer will be is crucial to conveying the right appearance of that image. The dimensions (in pixels) at which an image is displayed are even more important to consider, since they change even more frequently. Very few cameras remaining on the market, including mobile devices, capture an image at low enough resolution to show on an HDTV display without resizing. As a result, image downsampling has become synonymous with image display.
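To make the display-viewer geometry above concrete, the angular size of a pixel, and hence the number of pixels per degree of visual angle, can be estimated from the display width, its resolution and the viewing distance. The sketch below only illustrates that bookkeeping; the function name and the example numbers are our own, not values used elsewhere in this thesis.

```python
import math

def pixels_per_degree(display_width_m, display_width_px, viewing_distance_m):
    """Approximate pixels subtended by one degree of visual angle at the screen centre."""
    pixel_pitch_m = display_width_m / display_width_px
    # Visual angle subtended by a single pixel, in degrees.
    deg_per_pixel = math.degrees(2.0 * math.atan(pixel_pitch_m / (2.0 * viewing_distance_m)))
    return 1.0 / deg_per_pixel

# A 0.52 m wide, 1920-pixel monitor viewed from 0.6 m ...
print(pixels_per_degree(0.52, 1920, 0.60))   # roughly 39 pixels per degree
# ... versus a 0.06 m wide, 640-pixel phone screen held at 0.3 m.
print(pixels_per_degree(0.06, 640, 0.30))    # roughly 56 pixels per degree
```

A feature that spans a fixed number of pixels therefore occupies a different band of spatial frequencies (in cycles per degree) on each display, which is exactly the change in channel assignment described above.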
1.3 Manipulating Image Appearance

The purpose of most images is to effectively convey information to a human observer. However, most computer graphics research focuses on producing images that are either the most physically correct or the most mathematically optimal. Physical correctness and mathematical optimality do not necessarily capture the complex relations required for an image to best communicate information. Involving elements of human visual perception in the development of image processing algorithms can help ensure that the result appears the same as the original and is understood correctly.

The vast majority of images are produced to be viewed, interpreted and understood by human observers. Given that humans do not directly perceive the underlying physical representation of an image, but some analogous representation of appearance, it follows that most image processing algorithms could produce better results if they focused on producing the image with the desired appearance instead of optimizing another metric. In most cases, the appearance of an image matches the physical representation closely enough that conventional, appearance-naïve image processing methods can produce acceptable results. However, a perceptually focused approach is crucial in the examples above, where the perceived appearance of an image deviates significantly from its physical appearance. In these cases, appearance-aware algorithms can better account for the complex nonlinearities of visual perception and more accurately produce the desired interpretation.

In our work, we begin by quantifying the change in appearance that a viewer observes when the image is altered, building models of how that image is perceived under different conditions. We then use those models to develop image processing algorithms that incorporate perceptual considerations and produce results that may not be the most optimal with respect to simple mathematical metrics, but instead are the most visually appropriate. The result is a series of methods that first produce a perceived image according to some desired criteria and then solve for the physical image that best yields that result. This approach parallels how color perception has informed color management practices. The transformations between device color profiles account not only for differences between display response or gamut, but for changes in image perception. Color management algorithms must account for changes in display intensity or viewing environment altering the appearance of the image. Likewise, the methods we present address issues related to image perception across the vast size range of current displays, covering everything from cinema screens to mobile devices.

1.4 Novel Contributions

The algorithms all share the conceptual framework of computational image analysis/synthesis. At the core of each approach is a method that analyzes an image and produces an estimate of how some specific physical attribute changes across the image. The estimate is recast in another (usually perceptually motivated) representation. We can preserve the perceived appearance when altering an image by solving for the appearance-preserving modifications that restore the perception of the transformed image. Alternatively, we can alter the perceived appearance of an image by transforming the image appearance and then solving for the modifications to the underlying image that will cause the desired change in appearance. Finally, given the original image and these modifications, we synthesize the final result. The figure below depicts this framework.

Figure: Depiction of the image analysis/synthesis framework shared by our contributions. In each case, we analyze some quantity in the image, then synthesize a new result that either preserves the appearance of that quantity or has a novel appearance.
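Read as pseudocode, the framework amounts to an analyze-transform-synthesize loop. The sketch below is purely illustrative: the function names and the smoothing-based stand-ins are our own placeholders, not the estimators and models developed in Chapters 4 through 7.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def analyze(image, sigma=4.0):
    """Estimate a per-pixel quantity (placeholder: a smoothed local detail amplitude)."""
    base = gaussian_filter(image, sigma)
    detail = image - base
    amplitude = np.sqrt(gaussian_filter(detail ** 2, sigma)) + 1e-6
    return base, detail, amplitude

def transform(amplitude, appearance_change=1.0):
    """Recast the estimate to reflect the desired change in appearance (placeholder: uniform scaling)."""
    return amplitude * appearance_change

def synthesize(base, detail, amplitude, target_amplitude):
    """Solve for an image whose analyzed quantity matches the target."""
    return base + detail * (target_amplitude / amplitude)

image = np.random.rand(256, 256)          # stand-in for a photograph
base, detail, amp = analyze(image)
result = synthesize(base, detail, amp, transform(amp, appearance_change=1.2))
```

Each contribution swaps in a different analysis (for example, spatially-variant blur estimation) and a different model for the transform (lens optics, perceived blur, or countershading visibility), but the overall structure stays the same.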
The main contributions of these methods are as follows:

Spatially-Variant Blur Estimation: An efficient technique for robustly estimating the amount of blur present at each pixel in an image.

Synthetic Depth-of-Field for Mobile Devices: An image enhancement technique to overcome the limitations of the optics in mobile devices and create a narrower depth of field than can be physically captured.

Blur-Aware Image Downsampling: A perceptually-based approach for preserving the appearance of blur when downsizing images.

Scale-Dependent Perception of Countershading: A model of the visual appearance of countershading in complex images and of the conditions under which countershading is considered objectionable.

Defocus Dynamic Range Expansion: An investigation into increasing the effective dynamic range of image sensors by optically blurring the captured image and restoring it with deconvolution.

In the case of the Synthetic Depth-of-Field and Blur-Aware Downsampling algorithms, we analyze the image to produce a spatially-variant estimate of the blur present at each pixel. This estimate is transformed according to a model representing the desired change in image appearance to produce the blur present in the desired image. In the case of Synthetic Depth-of-Field the model represents a change in the aperture of the camera, while in the case of Blur-Aware Downsampling the model represents the change in perceived blur. Finally, the resulting image is synthesized from the original image and the result of the model.

With the Scale-Dependent Perception of Countershading paper, we conduct a perceptual experiment to determine which countershading parameter combinations introduce halos and are thus considered objectionable. We demonstrate how this perceptual model applies to a number of image processing algorithms that either introduce or modify the appearance of countershading, including unsharp masking, tone mapping and image resizing. Additionally, we present some novel applications of our model, including artifact-free countershading and viewer-adaptive displays.

In the case of Defocus Dynamic Range Expansion, we investigated several techniques for deconvolving an optically blurred image to synthesize a final sharp image with a dynamic range that exceeds the physical limitations of the imaging sensor. The deconvolution algorithms we employ implicitly generate an estimate of the blur present at each pixel and synthesize the un-blurred result, concurrently adding any expansion in dynamic range. The overall approach is similar to analysis/synthesis, but this method does not explicitly formulate the quantity being estimated.

1.5 Thesis Organization

This thesis begins with an overview of visual perception, image appearance, blur and contrast perception, and how these topics relate to the field of computer graphics. Chapter 2 provides background on the makeup of the human visual system, the mechanisms of spatial vision, blur and contrast perception, and image appearance. We begin with an overview of the vision concepts that form the foundation of our work, and build upon that with a discussion of how blur and contrast are represented in spatial vision. We continue with how these concepts relate to image appearance, discuss specific examples such as the Cornsweet illusion, and conclude with a survey of existing attempts to model image appearance. Chapter 3 covers existing research related to the design of optical systems and image processing algorithms that relate to our work. This survey includes models of blur formation in images, techniques for blur estimation and methods of restoring and altering image appearance.

In the following four chapters, the focus shifts to the topics in computer graphics related to our contributions. Chapter 4 presents the work of Synthetic Depth-of-Field for Mobile Devices, a method of spatially-variant blur estimation that is efficient enough for mobile devices yet robust enough to handle large amounts of noise. We use this blur estimation method to create images with a shallower depth-of-field than can be captured by the camera optics. Chapter 5 discusses how Blur-Aware Image Downsampling uses the same blur estimation method to preserve the appearance of blur in images when downsizing. Replacing the model of lens optics with a perceptual model of blur appearance, this method creates downsampled images that better resemble the appearance of the original. Chapter 6 explains how Scale-Dependent Perception of Countershading extends this concept of preserving appearance to contrast perception. We present a model of how the width and contrast of countershading profiles affect how they are perceived, and use this model to influence novel image processing methods. Chapter 7 examines the feasibility of Defocus Dynamic Range Expansion, using modified optics and deconvolution to improve the effective dynamic range of image sensors. We take a blurry image captured by a lens with an aperture filter and attempt to recover a sharp image with a larger dynamic range than the original. Finally, Chapter 8 concludes the thesis with a summary of all the material presented and discusses future directions of investigation.

Chapter 2
Visual Perception

This chapter presents elements of visual perception related to our work and discusses the physiological basis of, and algorithmic work concerning, how images are perceived. These topics motivate our contributions detailed in Chapters 4 through 7. To begin, Section 2.1 discusses the foundations of visual perception that relate to image appearance. In addition to image appearance, we present details on spatial vision and the band-limited nature of the channels involved in visual perception. Sections 2.2 and 2.3 discuss the mechanisms in the HVS which mediate the perception of blur and contrast, respectively. These sections discuss how the underlying visual channels affect our perception of the phenomena, including sensitivity, and discuss the subtle differences between sensitivity to a perceptual quality and the perceived appearance of that quality. Section 2.4 examines several higher-order visual phenomena arising from more complex edge profiles. These profiles, along with spatial contrast perception, explain the appearance of contrast and blur in complex images. Finally, Section 2.5 surveys image appearance models, computational attempts at modeling elements of visual perception. These algorithms incorporate models of how the HVS processes imagery, and attempt to produce a representation matching our perception of image appearance.
  . Foundations of Vision ..  Image appearance  As stated in the Introduction, we differentiate between the visual appearance of an object, as perceived by a human observer, and the underlying physical properties of that object. Visual perception includes complex properties like depth, occlusion and disparity that must be integrated to make sense of the wide array of visual stimuli present in the real world. In this thesis, we are primarily concerned with image appearance, the perception of two dimensional renditions on displays, printed pages, or other physical media. Before attempting to inform the algorithms we construct to consider image appearance, we must rst consider how the portions of an image can be perceived. In this section, we provide several examples of how the perceived appearance can differ from the underlying physical image. These examples provide context to the details of the  we present in the remainder of this chapter. Within a single image, regions with identical physical properties can be interpreted completely differently, depending on the surrounding context. Adeleson’s now-famous checkerboard [Adelson, ], reproduced in Figure ., is one case where similar portions of an image can be perceived in different ways. In this image, the patches labeled A & B are the same intensity and will produce the same luminance on the retina. However, the other squares around A & B de ne separate contexts that alter the understanding, resulting in the luminance shared by A & B is interpreted as two different brightnesses. Adelson [] demonstrates that surrounding regions can have a larger effect on the perception of brightness than the region itself and the conditions under which an image is viewed strongly determine its appearance. However, once the context is changed by adding bars of the same color to connect the two regions, it becomes obvious that they are in fact the same intensity. Likewise, the perception of the image also depends on the portion of the visual system responsible for conveying its appearance. The visual channels along which the  conveys information are not identical and do not reproduce the exact same response. As a result the appearance of an image is in part determined by which    Figure .: Adelson checkersquare illusion. Patches A & B have the same luminance, but appear different brightnesses. Image copyright Edward Adelson. pathways were responsible for transmitting it to the conscious mind. The sensitivity of the  to a quality of the image appearance differs depending on the conditions under which that image is perceived. We de ne sensitivity as the ability to observe changes in a particular visual stimulus, de ned in terms of the threshold that must be exceeded in order for the stimulus to be detectable. The more sensitive the viewer, the lower the threshold and the more they are able to discern changes in an image. The hybrid images by Oliva et al. [], shown in Figure . are one particularly surprising example of how changes in sensitivity can affect image appearance. Both the large left and small right image are the same, however the content of the image has speci cally been designed to cause some details be visible at one scale while other details be visible at another. The change in contrast sensitivity between different visual channels causes the image to appear like Albert Einstein when viewed at a large size but Marilyn Monroe when viewed at a small size or from far away. 
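The construction behind hybrid images of this kind is straightforward to sketch: keep the low spatial frequencies of one picture and add the high frequencies of another, so that viewing scale decides which one dominates. The following is a generic illustration; the cutoff value and function name are arbitrary choices of ours, not those used by Oliva et al.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(image_far, image_near, sigma=8.0):
    """Low frequencies of image_far plus high frequencies of image_near.

    image_far dominates at small sizes or long viewing distances;
    image_near dominates when the image is viewed up close.
    """
    low = gaussian_filter(image_far, sigma)
    high = image_near - gaussian_filter(image_near, sigma)
    return np.clip(low + high, 0.0, 1.0)

# Two aligned grayscale images with values in [0, 1] and identical shape.
far = np.random.rand(512, 512)
near = np.random.rand(512, 512)
combined = hybrid_image(far, near)
```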
The sensitivity of the  to an image attribute and the appearance of that attribute are clearly related. However, the  is a complex nonlinear system with various elements of the perceptual process continually adapting and compensating for the performance of other elements. A difference in the sensitivity of the  under two sets of conditions does not necessarily imply that the resulting images will appear different. Likewise, a change in image appearance does not necessarily imply that the sensitivity of the visual system has changed. We explore these relations in   Figure .: Hybrid images of Oliva et al.. The image contains the high frequency content of Albert Einstein and the low frequency content of Marylyn Monroe. One image or the other is more apparent depending on which frequency content is closer to the sensitivity range of the . Image copyright Aude Oliva. depth after a brief overview of the related low-level aspects of visual perception.  ..  Spatial Vision  Hybrid images provide another concrete example of how the difference between visual channels representing an image in the  can cause changes in appearance. This section discusses the structures of the visual system that give rise to these visual channels. Spatial Vision by De Valois and De Valois [] provides detailed coverage of all levels of spatial vision. For a general discussion of the psychophysics and neurology related to visual perception, refer to Vision Science by Palmer [] and Principles of Neural Science by Kandel et al. [].   The act of visual perception begins at the retina, where the photons corresponding to the image are converted into neural impulses by photoreceptive rods and cones. The signals produced by these photoreceptive cells are transmitted to retinal ganglion cells, the next level of visual processing in the retina. The connections between photoreceptors and retinal ganglion cells form a complex network, where each photoreceptor sends, and each ganglion cell receives, information from multiple of their respective counterparts. Each portion of the retina has several functionally distinct subsets of ganglion cells conveying signals from the same photoreceptors in parallel [Kandel et al., ]. Even at this very early stage in the visual pathway, information from regions that extend beyond a single cell is being collected and transmitted in neural impulses from individual units. The ganglion cells combine signals from several photoreceptors in ways that depend on precise spatial (and temporal) patterns [Kandel et al., ]. The signals from the ganglion cells travel along the visual pathway to the lateral geniculate nucleus (), where the neurons connect to one of two kinds of cells, either magnocellular (M-cells) or parvocellular (P-cells) [Palmer, ]. The receptive elds of M-cells correspond to larger regions of the retina and respond stronger to stimuli of lower spatial frequencies. On the other hand, the receptive elds of P-cells correspond to smaller regions of the retina and respond stronger to stimuli of higher spatial frequencies. While the exact spatial sensitivities of both M- and P-cells vary and the two types overlap, it is apparent the  continues to partition ranges of possible spatial frequencies in more discrete parallel pathways. The neural signals produced by the M- and P-cells leave the  and continue on to the visual cortex. Within the visual cortex, these signals are received by neurons known as simple cells. 
Similar to the pattern of connections linking photoreceptors, retinal ganglion cells and the cells of the , there are multiple sets of connections between the M- and P-cells and the simple cells of the visual cortex operating in parallel. Within the visual cortex, this pattern of interconnectedness continues as larger and larger structures are linked together: Groups of simple cells are connected to complex cells. Groups of interconnected complex cells form columns in the visual cortex. Groups of interconnected columns form hypercolumns. Groups of adjacent hypercolumns are connected to each other. The increasing interconnectedness at each stage of the visual system allows a   Simple Cell Lateral Geniculate Nucleus  Retina Photoreceptor  P-Cells  M-Cells  Retinal Ganglion Cell  Figure .: Neurological bases for visual channels. The connection patterns of photoreceptors and ganglion cells, as well as the organization of the  are examples of how visual channels are formed in the . greater range of possible scales to be represented. A photoreceptor on the retina will only respond to the light from a single point in the scene. On the other hand, a complex simple cell could respond to a range of different region sizes, depending on how large a group of photoreceptors are connected to it. Some cells will collect responses from only a small region of retinal cells while some cells will collect responses from a larger region. As a result, these cells respond to different-sized features [De Valois and De Valois, ]. Features too small for a given region are averaged out as the signals are summed together and features too large for that region cannot be represented represented by the the underlying mechanisms. This con guration of multiple parallel connections gives rise to visual channels present in the . Each channel acts as a band-pass lter, carrying information about a particular spatial frequency band (and orientation) [De Valois and De Valois, ].   This interpretation is support by neurological measurements by Hubel and Wiesel [] of cells in the V area of the visual cortex that selectively respond to different spatial frequencies and orientations, as well as the existence of horizontal connections between hypercolumns in the visual cortex [Hubel and Wiesel, ]. Further evidence is provided by psychophysical studies regarding selective adaptation of subjects to spatial frequency. In the seminal experiments by Blakemore et al. [], subjects shown gratings of a speci c spatial frequency adapted to that frequency and demonstrated a temporary reduction in sensitivity to that spatial frequency. All of this evidence makes a compelling case for the existence of parallel channels of information processing in the . Figure . shows an image decomposed into a Laplacian pyramid, a series of band-pass images. This representation is one possible interpretation of what information is carried in separate visual channels and discussed in Section ..  Figure .: Example of band-pass lters. Each image of Kurt Vonnegut has been processed by a different band-pass lter and contains separate spatial frequency content. These mechanisms have served as inspiration for a variety of image processing techniques. The concept of local or band-limited contrast inherent in the parallel visual channels representation has proven to be a powerful image processing technique. Gabor lters [Feichtinger and Strohmer, ] best model the responses of the simple cells in the visual cortex. 
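One band-limited decomposition of this kind, the Laplacian pyramid shown in the band-pass figure above and discussed further below, can be sketched in a few lines. This is a generic construction with NumPy and SciPy, not code from the thesis.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(image, levels=4, sigma=2.0):
    """Split an image into band-pass layers plus a coarse low-pass residual."""
    bands, current = [], image.astype(float)
    for _ in range(levels):
        low = gaussian_filter(current, sigma)
        down = zoom(low, 0.5, order=1)
        up = zoom(down, np.array(current.shape) / np.array(down.shape), order=1)
        bands.append(current - up)   # detail removed by blurring and resampling
        current = down               # continue at the next, coarser octave
    bands.append(current)            # low-pass residual
    return bands

def reconstruct(bands):
    """Invert the decomposition by upsampling and summing the layers."""
    image = bands[-1]
    for band in reversed(bands[:-1]):
        image = zoom(image, np.array(band.shape) / np.array(image.shape), order=1) + band
    return image

layers = laplacian_pyramid(np.random.rand(256, 256))
```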
While accurate representations of local spatial contrast and orientation sensitivity, Gabor lters are difficult to use computationally and other representations are often employed. There are a number of sub-band transforms that mimic properties of visual channels. The cortex transform by Watson [] provides a simpler, yet reasonably accurate model of the ability of the  to decompose visual stimuli into separate spatial frequencies and orientations. The Gaussian and Laplacian pyramids [Gonzalez and Woods, ] are more common   methods that ignore orientation yet retain segmentation of spatial frequencies.  ..  Edges  Having presented evidence that the  transmits visual information via multiple channels, we now address the exact nature of the information these channels convey. As we have stated, these parallel channels begin with the connections between the photoreceptors and the retinal ganglion cells. Retinal ganglion cells connect to two separate sets of photoreceptors, an inner region surrounded by an outer region. The pattern in which these cells are connected displays lateral inhibition [De Valois and De Valois, ], where light falling on a given photoreceptor causes that cell to excite a ganglion cell and simultaneously inhibit the signals sent to the same ganglion cell by adjacent photoreceptors. The strength of impulses the cell sends increases with the difference between light falling on the inner and outer regions. The ganglion cells do not respond to changes in the absolute intensity of the light, but to the contrast between adjacent regions of the retina. This relationship is maintained at subsequent stages of the visual system. The simple cells present in the visual cortex provide further evidence. While simple cells combine the information received from the retinal ganglion cells in a variety of con gurations, all share the same three properties: they respond to a speci c retinal position, they contain discrete excitatory and inhibitory zones and have a speci c axis of orientation. Figure . contains several examples of the receptive elds associated with simple cells.  Figure .: Sample receptive elds of simple cells. Each contains discrete excitatory and inhibitory zones and have a speci c axis of orientation. In particular, all simple cells are tuned to respond to speci c combinations of   retinal ganglion cells that represent linear features. Since retinal cells respond primarily to contrast, the simple cells respond almost exclusively to the boundaries of an object. In other words, the simple cells in the visual cortex directly represent image edges. The boundaries between image regions are explicitly coded in the visual system and are a fundamental primitive of the . The importance of which was best stated by David Hubel, neural physiologist [Palmer, ]: Many people, including myself, still have trouble accepting the idea that the interior of a form ... does not itself excite cells in our brain ... that our awareness of the interior as black or white ... depends only on cells’ sensitivity to the borders. The intellectual argument is that perception of an evenly lit interior depends on the activation of the cells having elds at the borders and on the absence of activation of cells whose elds are within the borders, since such activation would indicate the interior is not even lit. So our perception of the interior as black, white, gray or green has nothing to do with cells whose elds are in the interior – hard as that may be to swallow ... 
What happens at the borders is the only information you need to know: the interior is boring.

Edges define regions of images and are the basis of our understanding of visual stimuli. The full characterization of an edge includes orientation, curvature, chromaticity and other attributes. However, we focus on two specific properties, the blur and the contrast of the edge, depicted in the figure below. We define the edge blur as the extent over which a luminance change occurs, and the edge contrast as the magnitude of the luminance change across the image region subtended by the edge. The combination of magnitude (contrast) and extent (blur) of an edge is the primary factor determining the appearance of that edge. The blur of a feature determines which of the visual channels carry the information of that image feature. Likewise, the strength at which any visual channel is activated is representative of the contrast of the associated portion of the image. In the next three sections we will discuss the underlying properties of blur and contrast perception and how they relate to image appearance, first individually and then in combination.

Figure: Definition of edge contrast. Contrast is the magnitude of a transition. Blur is the spatial extent over which that transition occurs.

2.2 Blur Perception

We examine the response of the visual system to changes in physical stimuli, starting with the perception of blur. There are two aspects of the response of the HVS to a stimulus: detection and discrimination. Detection (a threshold effect) is the change required for a given stimulus to be visible as distinct from the surround, while discrimination (a supra-threshold effect) is the difference at which two already-visible stimuli will be considered distinct. Both threshold and supra-threshold responses constitute an inverse relationship, where lower thresholds imply higher sensitivity to visual phenomena.

Blur is the spatial extent over which a change in contrast occurs. For a given magnitude change in contrast, the change of an edge with little blur occurs over a small region, while the change of an edge with a larger blur occurs over a larger region. In both cases the total change in luminance is equal, but the distance over which the change occurs varies. In terms of complex images, blur is the absence or suppression of fine details in the image. Convolution with certain point spread functions (PSFs) results in the attenuation of high frequency components of the image, and the edges cannot contain abrupt changes in contrast. This absence of high frequency information is apparent when an image is analyzed in the Fourier domain. Blur can also be interpreted as the spatial frequency above which there is no significant information,

F(ω) ≈ 0  for ω > ω_c

establishing a rough cut-off frequency ω_c for the content of the image. This representation generalizes the effect of convolving an image with one of many individual PSF kernels, such as a box filter or a Gaussian function, and the figure below demonstrates the change in cutoff frequency for several Gaussian blurs. The exact appearance of the result can vary subtly, depending on exactly which frequencies are attenuated and by how much, but the end result is that the high frequency content of the image is attenuated.

Figure: Examples of different blur filter frequency cutoffs. The upper row contains representative images while the lower row contains the respective plots of spatial frequency content. The lower the cutoff ω_c, the less high frequency information is present in the image.
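The relation between a Gaussian blur and its effective cutoff can be checked numerically. The sketch below, using an arbitrary 1% attenuation criterion of our own choosing, measures the frequency at which the blur's transfer function becomes negligible.

```python
import numpy as np

def gaussian_cutoff(sigma, threshold=0.01, n=4096):
    """Frequency (cycles/pixel) where a Gaussian blur's transfer function drops below threshold."""
    x = np.arange(n) - n // 2
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    mtf = np.abs(np.fft.rfft(np.fft.ifftshift(kernel)))   # transfer function magnitude
    freqs = np.fft.rfftfreq(n)                            # 0 ... 0.5 cycles per pixel
    above = np.nonzero(mtf >= threshold)[0]
    return freqs[above[-1]]

for sigma in (1.0, 2.0, 4.0):
    # Analytically the transfer function is exp(-2 * pi**2 * sigma**2 * f**2),
    # so the 1% point sits near 0.48 / sigma cycles per pixel: doubling the
    # blur radius halves the cutoff frequency.
    print(sigma, gaussian_cutoff(sigma))
```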
The lower the cutoff c, the less high frequency information present in the image.  ..  Blur Sensitivity  Much of the work quantifying human blur sensitivity has come from the optometry community, related to de ciencies in the optics of the visual system, and myopia (nearsightedness) in particular. Hamerly and Dvorak [], and more recently Wang and Ciuffreda [b], report the  is capable of detecting a difference between a sharp edge and a blurred edge width of  seconds of arc. Interestingly, both report   discrimination between two blurred edges is more accurate than discriminating a blurred edge from a sharp one, and the threshold for blur discrimination is roughly half that of detection. Both Hamerly and Dvorak and Wang and Ciuffreda tested blur discrimination on a relatively narrow ranges of edge blurs, corresponding to the rst few justnoticable-differences () of blur. More recently, Chen et al. [] conducted a similar study covering a signi cantly wider range of blur radii covering Gaussian edges with values of σ ranging from -.°. Chen et al. report a blur detection threshold of σ =.° for a sharp edge, increasing to roughly twice the sensitivity at σ =.°, then sharply decreasing in sensitivity past that. The shape of the blur sensitivity plot is shown in Figure ..  Figure .: Blur discrimination threshold from Chen et al.. The sensitivity increases then decreases sharply past a point. Image reproduced from Chen et al.. The sensitivity to blur depends on a number of different factors. Hamerly and Dvorak found that blur sensitivity increased with contrast and subjects were able to detect smaller changes when the stimulus was higher contrast. Both Wang and Ciuffreda [a] and Wang et al. [a] studied how blur sensitivity changed with regards to position on the retina, and found that the threshold of detection increases   slightly for stimuli further from the fovea, corresponding to the decrease in photoreceptor density. At each retinal position, the blur discrimination thresholds were similar to each other, and they were approximately % of the blur detection threshold magnitude. Wang et al. also found that blur detection sensitivity decreased roughly twice as fast as blur discrimination. Wuerger et al. [] studied the difference in blur sensitivity for chromatic edges, including isoluminant red/green and yellow/blue stimuli in addition to luminance. The thresholds for red/green stimuli are similar to those of luminance stimuli, while thresholds for blue/yellow stimuli were signi cantly higher. Ciuffreda et al. conducted research linking low-level aspects of blur sensitivity to higher-order perceptual constructs. This work includes task-speci c blur sensitivity [Ciuffreda et al., ], derived from experiments where subjects rank the point at which blur was detectable, objectionable and signi cantly degraded performance on visual recognition tasks. The study served as the inspiration for the perceptual experiment we conducted in Chapter . The authors also presented [Ciuffreda et al., ] a conceptual model of blur perception unifying a number of these individual ndings. Their work combines the measurements of detection and discrimination thresholds for different retinal positions and constructs volumes in the visual eld, shown in Figure ., where two stimuli would be within a  of blur from each other. 
Much of the sensitivity to blur of the  is determined by neurological factors, not the optics of the eye, suggesting that long-term differences in the visual stimuli a subject is exposed to can affect their ability to resolve blur differences. In particular, myopic (near-sighted) viewers and emmetropic viewers (not requiring correction) should display different sensitivity and adaptation to the presence of blur, given that one regularly experiences blurred stimuli while the other does not. Cufflin et al. [] and Wang et al. [b] studied how the blur discrimination thresholds change when subjects have adapted to out-of-focus stimuli. In their experiments, subjects’ vision was blurred by -. diopters, and their visual acuity was tested immediately after the blur was introduced, then after an adaptation period. Initially, both myopes and emmetropes displayed a loss of visual acuity, but recovered some of the acuity after the adaptation period. George and Rosen eld [] noted that myopes displayed a larger increase in sensitivity after adaptation than em  Figure .: Conceptual model of blur perception of Ciuffreda et al. The thick solid line represents the plane focus. The thin solid line represents the distance corresponding to blur detection while the dashed lines correspond to  of blur discrimination. Image reproduced from Ciuffreda et al.. metropes, possibly due to more exposure to blurred stimuli. However, while myopes were more able to adapt to the presence of blur, even with corrected vision they could not match the visual acuity of emmetropes [Rosen eld et al., ]. Rosen eld and Abraham-Cohen [] noted that this improvement in visual acuity came without a change in the optical refraction associated with accommodation of the eye, implying changes in the neurological gains of different visual channels.  ..  Blur Appearance  In addition to altering the sensitivity of viewers, adaptation to differently blurred stimuli can also change the appearance of images. Webster et al., in a and , found that viewers adapted to the amplitude spectra of blurred and sharp images, causing subsequently viewed natural images with a conventional 1/ f 2 frequency distribution to appear altered. Webster et al.  rst showed subjects images that had  either been blurred or sharpened, giving them time to adapt. Normal images shown   afterwards appeared too sharp or too blurry, respectively. Webster et al. also showed that the same phenomenon existed for simultaneous contrast [see Fairchild, , chap. ], where a normal image surrounded by blurred images would seem too sharp and vice versa, shown in Figure ..  Figure .: Simultaneous blur contrast. The center face is the same in both images, but the surrounding faces affect its appearance. Images copyright Michael Webster. In natural settings, defocus blur most often results from the current focal distance of the observer, and there is strong relation between the amount of blur perceived and the distance between the viewer and the object. Mather and Smith [] investigated how blur discrimination affected the ability of the  to determine depth relationships in a scene and demonstrated blur variation alone is sufficient to determine the apparent depth ordering. 
More recent work by Vishwanath and Blaser [] agrees with Mather and Smith, additionally suggesting the blur is likely used as a quantitative cue to distance consistent with the observer relationship between retinal blur and distance, and not the result of a precise computation. Both authors found that the ability of observers to discriminate different levels of blur was rather poor, consistent with our own results. This lack of sensitivity limits the use of retinal blur to a relatively coarse, qualitative depth cue. Held et al. [] presented a probabilistic model of how viewers may use defocus blur in conjunction with other pictorial cues to estimate the absolute distances to objects in a scene. Their model suggests how the visual system might use a combination of blur pattern and relative depth cues to determine the apparent scale of the content of images. Held et al. validated their model with a series of perceptual experiments and found viewer's estimates of model distance consistent with their predictions. The authors also proposed an algorithm for introducing blur into an image to achieve a novel perceived object scale, as seen in Figure ..

Figure .: Perceived object scale of Held et al. Simulating different defocus patterns can alter the perception of scale and distance of objects. Image copyright Robert Held.

The degree to which a given stimulus is blurred determines the visual channel responsible for processing that stimulus. Differences between the responses of individual visual channels affect the perception of that blur, resulting in changes in blur sensitivity and appearance across the range of possible scales. Many other aspects of image perception are tied to the underlying visual channel, including contrast. The next section details how the contrast sensitivity and the appearance of contrast in image features is determined by spatial scale.

. Spatial Contrast Perception

Contrast, in the most general sense, is the spatial variance in visual properties that causes one element of an image to be distinguishable from the surrounding elements. This thesis is primarily concerned with luminance contrast, how the intensity of light changes between one region and another. Contrast is specified in terms of ratios between light and dark sides of an edge, removing any dependence on absolute magnitude of the stimuli. Features with vastly different magnitudes can have the same contrast as long as the relative difference remains the same:

C_{Ratio} = Y_{max} / Y_{min}.  (.)

Other metrics of contrast have arisen from specific stimuli that have appeared repeatedly in literature. For simple sinusoid gratings with maximum and minimum luminances, Y_{max} and Y_{min}, contrast is measured by Michelson contrast:

C_{Michelson} = (Y_{max} − Y_{min}) / (Y_{max} + Y_{min}).  (.)

For tests concerning the contrast between a simple target and surrounding area, contrast is measured by Weber contrast, and describes the increment or decrement ∆Y from some base level of illumination Y. Weber contrast is defined as the ratio of change in luminance to base luminance:

C_{Weber} = ∆Y / Y.  (.)

Both of these formulations are closely related and each can easily be expressed with the terms of the other:

C′_{Michelson} = ∆Y / (2Y + ∆Y),  (.)
C′_{Weber} = (Y_{max} − Y_{min}) / Y_{min}.  (.)

However, these contrast measures are lacking. Neither measure can address the fact that natural scenes contain a diverse range of spatial scales that vary across the image.
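Because these measures recur throughout this thesis, a minimal illustrative sketch of the three definitions may be helpful; the function names are ours and not part of any cited work.

```python
import numpy as np

def ratio_contrast(y_max, y_min):
    # C_Ratio = Ymax / Ymin
    return y_max / y_min

def michelson_contrast(y_max, y_min):
    # C_Michelson = (Ymax - Ymin) / (Ymax + Ymin), bounded by [0, 1]
    return (y_max - y_min) / (y_max + y_min)

def weber_contrast(delta_y, y_base):
    # C_Weber = dY / Y, an increment (or decrement) over a base luminance
    return delta_y / y_base

# The same increment expressed both ways, matching the conversion above:
y_base, delta_y = 100.0, 20.0                        # illustrative luminances
print(weber_contrast(delta_y, y_base))               # 0.2
print(michelson_contrast(y_base + delta_y, y_base))  # dY / (2Y + dY) ~= 0.091
```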
Contrast is a local phenomenon that changes across the image, where small-scale changes in local contrast correspond to changes in surface texture and larger-scale changes in local contrast correspond to geometry or illumination. A better definition of contrast would adapt to the changes in intensity present at different scales in different regions of an image, similar to the visual channels present in the HVS.

Pyramid approaches [Adelson et al., ] are one of the most common means of processing images that accurately capture differences in scale. The simplest measure of spatial contrast is the Laplacian pyramid [Burt and Adelson, ]. This representation decomposes an image into discrete frequency bands by computing differences between Gaussian-filtered images. Each level of a Laplacian pyramid is defined as the difference between subsequent octaves of a Gaussian pyramid, with the first Laplacian level comprised of the difference between the original image and the topmost Gaussian level. There is an additional base level containing the remaining frequency information. The Laplacian pyramid decomposes an image into n band-pass images b_i and a single low-pass image l, resembling the local spatial processing and band-limited nature of human vision:

I = Σ_{i=0}^{n−1} b_i + l.  (.)

One shortcoming of the Laplacian pyramid is that it is based on differences in luminance, as opposed to ratios of luminance. Peli [] proposed a similar approach, basing the measure of contrast on ratios of luminance. Peli constructs a series of band-pass images of the ratio of luminance to a local average of the following structure

l_i = l_0 + Σ_{j=1}^{i−1} a_j,   I = l_0 + Σ_{i=1}^{n−1} a_i,  (.)

where a_i is the band-pass filtered image, l_i is the local mean consisting of all frequencies below that band, with l_0 being the lowest band. The definition of contrast is then the ratio of the band-pass image to the local average:

C_{Peli} = a_i / l_i.  (.)

Pyramid representations provide efficient approximations of the local band-limited contrast processing of the HVS. If these metrics are insufficient, more accurate measures of local contrast based on models of human contrast sensitivity should be employed.

.. Contrast Sensitivity

One of the most fundamental issues related to contrast perception concerns contrast sensitivity, or how large a luminance difference must be in order to be visible. Perceptual psychologists have dedicated considerable effort to measuring the dependence of contrast threshold on spatial frequency and luminance. The contrast sensitivity function (CSF), shown in Figure ., quantifies the ability of the HVS to detect changes in relative intensity under a specific set of conditions. The full model of contrast sensitivity depends on many factors, but the spatial frequency of a stimulus plays one of the most significant roles in determining the sensitivity of the HVS to the details present. The HVS displays a peak sensitivity between - c/deg, with a sharp drop in sensitivity for higher frequencies and a gentler but still significant decrease for lower frequencies [De Valois and De Valois, ]. The reduced sensitivity to high spatial frequencies is attributed to the optical quality of the lens in the eye and the limited density of photoreceptors in the retina, while the reduced sensitivity to low spatial frequencies is attributed to the limited spatial extent over which low-level cortex cells can merge visual input. The shape of the CSF is directly visible from looking at frequency gratings like Figure .
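The shape of the CSF is often summarized by an analytic fit. As a purely illustrative sketch (the specific fit, the Mannos and Sakrison approximation, is our assumption and not a model used in this thesis), the band-pass behaviour described above can be evaluated directly:

```python
import numpy as np

def csf_mannos_sakrison(f):
    """Approximate contrast sensitivity at spatial frequency f (cycles/degree).

    Analytic fit A(f) = 2.6 (0.0192 + 0.114 f) exp(-(0.114 f)^1.1), used here
    only to illustrate the band-pass shape of the CSF.
    """
    f = np.asarray(f, dtype=float)
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

freqs = np.linspace(0.1, 40.0, 400)
sensitivity = csf_mannos_sakrison(freqs)
peak = freqs[np.argmax(sensitivity)]
# Sensitivity peaks at mid spatial frequencies and falls off toward both
# the low- and high-frequency ends, mirroring the grating figure.
print(f"peak sensitivity near {peak:.1f} c/deg")
```

The grating figure referenced above makes the same shape directly visible.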
Combinations of spatial frequency and contrast that appear gray are below the visibility threshold of the  while combinations that show the grating are in the range of sensitivity. The similarity with blur perception is apparent, and different visual channels do not exhibit exactly the same responses to stimuli of equal contrast. The overall characteristic of the  is a composite of sensitivities of the underlying visual channels and its shape describes the relative performance of those channels. Individual channels are still responsible for the processing of visual stimuli, but the  embodies summation of channel contrast sensitivity.  ..  Contrast Appearance  The , however, only describes the behavior of the visual system to threshold contrasts. Most regularly encountered visual stimuli are supra-threshold and easily detectable. There is no guarantee that the sensitivity of the  to contrast of a certain spatial frequency is representative of how that contrast actually appears.    Figure .: Grating demonstrating the contrast sensitivity of the human visual system. The apparent reduction of contrast at high and low frequencies is a result of our visual system, not the image. Blakemore et al. [] studied the appearance of supra-threshold contrast and reported the relative apparent contrast of sine waves at different contrasts and spatial frequencies. Subjects adjusted the contrast of sinusoidal gratings of various spatial frequencies to match the contrast of a  cpd grating. At threshold contrast, the curves match the . As contrast increases, low frequencies appear less attenuated up to a Michelson contrast of ., at which point they appear equal contrast to the reference. Higher frequencies remain attenuated similar to the . Georgeson and Sullivan [] were the rst to note that high (and to a lesser extent low) spatial frequencies never appear as faint or as fuzzy as mid-spatial frequency gratings. The phenomenon, known as contrast constancy, refers to the ability of the visual system to account for differences in contrast sensitivity between different spatial frequencies. Instead of a stimulus at threshold contast appearing barely detectable, it appears near full contrast. Cannon [] corroborated that two equal physical contrast stimuli have an equal perceived contrast, even if the respective thresholds are quite different, implying the mechanisms mediating threshold detection and supra-    threshold perception may be different. It is as if the visual system has an internal model of the  it uses to partially invert the effects of contrast sensitivity on the perceived result, as seen in Figure .. 0.001 0.01  Contrast  0.03 0.1 0.3 1.0  0.25  0.5  1  2  5  5  10  15  20  25  Spatial frequency (c/deg)  Figure .: Plot of contrast constancy. Even though the threshold (topmost line) increases for high and low frequency, the appearance of contrast appears the same as that of middle frequencies with lower threshold. Image reproduced from Georgeson and Sullivan. Psychophysical studies have shown contrast constancy to hold under a wide range of conditions. Subsequent studies by Peli et al. [; ] have shown that the  is capable of preserving the appearance of contrast across a vast range of visual conditions, reducing in effect only for dim or very high frequency targets. 
Tiippana and Näsänen [], likewise, found that for high contrasts, matches were nearly independent of stimulus spatial-frequency bandwidth up to about  octaves, even though detection thresholds increased with bandwidth. Brady and Field [] derived a model of contrast constancy that, in addition to explaining experimental results, also predicts perceived contrast will be approximately constant across scale for scenes whose spectra fall as /f, as is typical of natural scenes. In some cases, the compensatory mechanisms can be overdriven and frequencies with a lower sensitivity can appear to have more contrast than higher-sensitivity stimuli, referred to as contrast overconstancy by Georgeson []. The perception of contrast in complex images depends on factors beyond the amplitude of the spatial frequency components. While it is convenient to treat the   visual system as a set of quasi-linear operations, numerous exceptions exist. Examples like the contrast constancy phenomenon have demonstrated that the appearance of stimuli is effectively decoupled from the sensitivity of the underlying visual channels responding to a stimulus. The contrast between one region of an image and adjacent regions can also affect the apparent contrast. Apparent contrast is the relative brightness of visual stimuli, where brightness is formally de ned as “the visual sensation according to which an area appears to emit more or less light” [Poynton, ]. Brightness could be thought of as apparent luminance, and much like the ratio between luminances determines physical contrasts, the ratio between brightnesses determines apparent contrast. The most commonly known example of these higher-level phenomena is the shift in color appearance of a stimulus when the color of its background is changed, known as simultaneous contrast. In the case of Figure ., identical gray patches presented on different backgrounds appear distinct. Darker backgrounds cause the grey patch to appear brighter, while lighter backgrounds cause the grey patch to appear dimmer.  Figure .: Example of simultaneous contrast. The internal gray squares appear of different brightness depending on the surrounding area.  . Edge Pro le Perception The amplitudes of spatial frequencies that comprise a stimulus have an obvious effect on the perceived blur and contrast. This fact is especially relevant in the context of edges, which consist of all spatial frequencies. In particular, the edge forming the boundary between regions can have signi cant effects on the perception of those regions. The change in intensity across that boundary, known as the luminance pro le,   can substantially alter the appearance of both the edge and adjacent regions.  Figure .: Reduction in contrast of regions bounded by blurred edges demonstrated, by Kim et al. []. The inside of (a) and (b) are the same color, but the blurred edges in (b) reduce the perceived contrast. Kim et al. also present a method to correct for the loss of contrast, demonstrated in (c). Images copyright Min Kim. The previous section addressed how individual spatial frequencies are perceived, and we now address how groups of spatial frequencies are perceived in the context of edges. By de nition, blurred edges, with attenuated high spatial frequencies, appear less well de ned than sharp edges. Likewise, contrast thresholds increase as the width of a blurred edge increases [Shapley and Tolhurst, ], in accordance with the . 
What is surprising, however, is that blurred edges also appear to have less contrast than step edges of equal Michelson contrast [O'Brien, ], the effects of which were recently studied for complex images by Kim et al. [] and depicted in Figure ..

Edge profiles with amplified high frequencies, known as countershaded edges, are more interesting still. The relative intensity of amplified spatial frequencies can alter the perception of both sharpness and contrast of the edge. Countershading increases local contrast by adding a high-pass image H_σ(Y) to the original image Y:

Y′ = Y + λ H_σ(Y) = (1 + λ)Y − λ g_σ ∗ Y,  (.)

where λ is the contrast of the countershading. The process is illustrated in Figure .. In some cases, the underlying edge may be attenuated, or omitted entirely, and only the high-pass image used.

Figure .: Demonstration of countershading process. A high-pass image, modulated by the gain λ, is added to the underlying image, yielding the enhanced signal with boosted contrast. Image reproduced from Smith.

Depending on the parameters used, the result of countershading causes different changes in the appearance of the image. The choice of a small σ increases the acutance [Neycenssac, ], or apparent sharpness, of the image (Figure .), which is the effect usually referred to as "unsharp masking" [Badamchizadeh and Aghagolzadeh, ]. However, narrow profiles do not significantly alter the perception of contrast [Kingdom and Moulden, ]. The choice of a large σ increases the contrast of the image (Figure .), which is usually referred to as countershading.

Figure .: Comparison of perceived effect of countershading (increased sharpness, normal, increased contrast). Narrow profiles can increase perceived sharpness while wide profiles can increase perceived contrast. Original image copyright Greg Ward.

For both changes in sharpness and contrast appearance, the perception of countershaded edges again falls into two categories: threshold sensitivity and supra-threshold appearance. The threshold, when equal-contrast countershaded edges and step edges can be differentiated, and the supra-threshold appearance, the amount of contrast or sharpness increase that a countershaded edge appears to possess, are determined by the countershading parameters σ and λ. The studies on the Cornsweet illusion by Sullivan and Georgeson [], Campbell et al. [; ] and Burr [] determined the contrast at which a countershaded edge becomes distinguishable from a step edge of equivalent contrast, the so-called scalloping threshold. The thresholds apply only to larger σ-values because the countershading is always distinguishable from a square wave at spatial frequencies above  c/deg [Sullivan and Georgeson, ]. Thus narrow countershading that increases apparent sharpness will always appear distinct from its step edge counterpart. Interestingly, research has almost exclusively focused on the perception of wider, contrast-enhancing countershading profiles, and Lin et al. [] conducted one of the only studies regarding the perceived change in apparent sharpness of countershaded edges. Subjects ranked the perceived quality of images for contrasts of different magnitudes and Lin et al. computed the most desirable and highest tolerated contrast.
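A minimal sketch of the countershading operation in the equation above, using a Gaussian low-pass to form the high-pass term; the parameter values are illustrative only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def countershade(y, sigma, lam):
    """Return Y' = Y + lambda * H_sigma(Y) = (1 + lambda) * Y - lambda * (g_sigma * Y).

    y     : luminance image (2D array)
    sigma : width of the Gaussian low-pass, in pixels
    lam   : contrast (magnitude) of the countershading profile
    """
    high_pass = y - gaussian_filter(y, sigma)  # H_sigma(Y)
    return y + lam * high_pass

# Narrow profiles (small sigma) read as sharpening ("unsharp masking"),
# while wide profiles (large sigma) read as an increase in local contrast.
y = np.tile(np.linspace(0.2, 0.8, 256), (256, 1))
sharper = countershade(y, sigma=1.5, lam=0.5)
higher_contrast = countershade(y, sigma=30.0, lam=0.5)
```

Which of the two percepts dominates, and how strong a profile is tolerated, is precisely what the perceptual studies discussed here attempt to quantify.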
Findings with Laplacian edge-enhancement lters show that the baseline preferred sharpness is about .× that of the local just-noticeable difference, and the actual preferred sharpness is also dependent on the contrast increase in the surrounding areas. On the other hand, a substantial body of perceptual literature exists on the effect of countershading on perceived contrast. The best known and most dramatic example is the Craik-O’Brien-Cornsweet illusion [Cornsweet, ; Craik, ; O’Brien, ], the name commonly given to a family of related illusions that induce an increase in perceived brightness by introducing countershading. The family, often referred to collectively as the Cornsweet illusion, results from a sufficiently-wide countershading pro le. The illusion, shown in Figure ., causes the region on the bright side of a countershaded edge to appears brighter and the region on the dark side of the the edge to appears darker. Tolhurst [] and Shapley and Tolhurst [] found that the threshold contrasts of countershaded edges were the same as that of step edges, provided the width of the high-pass image was above a critical value of .°. Below the critical width, sensitivity to Cornsweet edges became progressively lower than that of step edges. Campbell et al. ;  and Sullivan and Georgeson [] found that the contrast thresholds for the missing fundamental illusion, shown in Figure ., to be the same as those for square waves provided the slope width was less than roughly . c/deg. Differences between the ndings can be attributed to differing luminance pro les used in their respective experiments.   (a) Cornsweet Illusion  (b) Similar contrast step edge  Figure .: Comparison of step and Cornsweet edges. Left: a Cornsweet edge. Right: a standard edge. The regions on either side of the edges appear to have roughly the same brightness, even though they are of the same brightness for the Cornsweet edge. Image reproduced from Smith. Seminal work by Dooley and Green eld [] resulted in the graph in Figure . that relates the apparent contrast of a Cornsweet pro le to the physical contrast of its luminance pro le. The apparent contrast of the edge is shown to be a function of the width of the edge pro le measured in angular degrees, and amplitude measured in Michelson contrast. The strength of the illusion diminishes as the width of the Cornsweet pro le decreases, and as amplitude increases, the apparent contrast of the edge becomes nonlinear and eventually begins to decrease. The choice of a larger σ or a smaller λ results in a larger relative increase in contrast. The Cornsweet illusion is additive, as noted by Burr [], and a countershading pro le can be added to an existing step edge to further increase the apparent contrast of the edge. Dooley and Green eld [] noted that the strength of the Cornsweet illusion decreases as the contrast of the existing step edge increased. Additionally, the Cornsweet illusion is cumulative, and multiple Cornsweet pro les will result in brightness that approximates a set of step edges, where each patch will appears brighter than the previous. Countershading is not limited to changes in perceived brightness; it has been observed to be a property of several aspects of visual perception. 
Of particular interest to image processing algorithms is the fact that the illusion is present in similar edge profiles of color change [van den Brink and Kleemink, ; Ware and Cowan, ], and can be created from changes in hue, saturation and chromaticity [Wachtler and Wehrhahn, ]. Additionally, Anstis and Howard [] presented the existence of the illusion with regards to stereoscopic depth, Loomis and Nakayama [] and Walker and Powell [] described the relation of the effect to motion perception, while Purves et al. [; ] provided some compelling examples of how the effect relates to our perception of 3D scenes. Ihrke et al. [] performed a perceptual evaluation of the work on 3D unsharp masking by Ritschel et al. []. The authors tested the preferred value of their magnitude parameter λ for profiles of several different widths, measured on object surfaces as opposed to in image space.

Figure .: Examples of different physical edge profiles (left) and their perceived result (right): Craik-O'Brien, Cornsweet, repeated Cornsweet, missing fundamental and blurred edge. Image reproduced from Kingdom and Moulden.

. Image Appearance Models

The perceptual basis of the Cornsweet illusion is the subject of ongoing debate and there are competing theories on what aspect of visual processing mediates the illusion. Dooley and Greenfield [] suggest that the effect is due to the shape of the CSF, based on the fact that, given the lack of sensitivity of the HVS to low frequencies, a Cornsweet edge and a step edge have the same response. Another class of theories attempts to explain the effect in terms of lightness constancy, like Retinex [Land and McCann, ], described below. A third interpretation suggests that the effect is the result of local analysis of edge features and uses sets of filters to detect and extract information about edges and uses that information to extrapolate global effects. Davey et al. [] has reported that the illusion is consistent with the effects observed from the "filling in" mechanism of our visual perception, but that has been disputed by Devinck [Devinck et al., ]. Kingdom and Moulden [] provide a comprehensive comparison of all the competing theories.

Figure .: Strength of the Cornsweet illusion as a function of width and contrast (perceived contrast plotted against physical contrast). As width increases, so does the strength of the illusion. Increasing the contrast increases the strength of the effect up to a point, then the illusion fails. Image reproduced from Georgeson and Sullivan.

Attempts to model perceptual processes extend beyond countershading and numerous approaches have been proposed to model how the human visual system understands blur and contrast. These methods share the goal of modifying an image such that it approximates the HVS perception of brightness. One significant class of methods is known as lightness algorithms. The most commonly known version is Retinex [Land and McCann, ], but a number of similar approaches exist [Blake, ; Horn, ]. Hurlbert [] provides a summary of methods and presents several formal connections between their formulations. The key observation these methods make is that human perception of brightness roughly corresponds to the reflectances of surfaces in the scene. However, the observed image is the product of the reflectance and the illumination.
Given this observation, if one separates the contributions of reflectance and illumination from the observed image and only considers the reflectance, they could have an estimate of how the HVS perceives the brightness. The original Retinex paper presented by Land and McCann was based on the observation that changes in illumination correspond to smooth gradients that vary slowly, while changes in reflectance correspond to edges and are abrupt. This assumption leads to the approach shared across many of the lightness algorithms: first differentiate the image to recover the change in luminance, then employ a threshold operation to remove small changes that correspond to smooth gradients, and finally integrate to recover a modified image that approximates the surface reflectance.

Land's original paper focused on 2D patches of uniform reflectance called Mondrians, shown in Figure .. The approach performs random walks between different patches of the 2D Mondrian, summing the thresholded differences along the path to approximate the log of the ratio of surface reflectances at the beginning and end of the path. This operation is repeated along numerous random paths to try and find the region of highest reflectance. The approach was effective at predicting the appearance of the Mondrian patches it was designed for, but did not accurately predict more complex scenes.

Figure .: Example Mondrians of Land and McCann [].

To handle more complex inputs, a second version [Jobson et al., ; Land, ] of Retinex was proposed based on a center-surround formulation. This version keeps the assumption that the illumination is smoothly varying, and concludes that some spatially weighted average of the observed scene will approximate that illumination. If the observed scene S is the product of the illumination I and reflection r, the Retinex output is defined as:

R = log( Ir / \overline{Ir} ),  (.)
R = log( r / \bar{r} ),  (.)

as long as I ≈ \bar{I}, where the bar denotes the spatially weighted local average. Instead of randomly exploring patches of a Mondrian to try and determine the region of maximum reflectance, this version computes the normalization constant from a Gaussian-weighted average of a large surrounding neighborhood. The scale of the Gaussian allows a tradeoff between the amount of reduction of contrast and global correspondence of lightness. Despite additions by Jobson and Woodell []; Rahman [], no single Retinex scale works for all images, or even different portions of a single image. Rahman et al. [a] proposed a multiple-scale version that alleviates some of the issues of the center-surround formulation. Multi-scale Retinex is equivalent to the center-surround form, but the output is a combination of a weighted average of several different scales of the single-scale version of Retinex. Numerous researchers [Barnard and Funt, ; Funt et al., ; Rahman et al., b] have studied and improved on the multi-scale version, but there are still issues. Many variants of the algorithm attempt to solve for changes in luminance and color simultaneously and have trouble separating side effects between the two. The photographic tonemapping operator by Reinhard et al. [] is very similar to luminance multi-scale Retinex, but removes several unnecessary steps and includes automatic selection of the proper scale for each pixel.

The field of color science has made significant progress on mimicking how the HVS converts physical values into perceptually salient units.
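Before turning to those color appearance models, a minimal sketch of the single-scale, center-surround Retinex described above may be useful; the Gaussian surround and the specific scales are illustrative assumptions, not the exact formulations of the cited papers.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(s, sigma_surround, eps=1e-6):
    """R = log(S) - log(Gaussian-weighted local average of S).

    With S = I * r and a slowly varying illumination (I close to its local
    average), R approximates log(r / r_bar): lightness relative to the surround.
    """
    s = np.asarray(s, dtype=float) + eps
    surround = gaussian_filter(s, sigma_surround) + eps
    return np.log(s) - np.log(surround)

def multi_scale_retinex(s, sigmas=(15, 80, 250)):
    # Multi-scale Retinex: an equal-weight average of single-scale outputs,
    # trading local contrast against global correspondence of lightness.
    return sum(single_scale_retinex(s, sg) for sg in sigmas) / len(sigmas)
```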
Perceptually uniform color spaces like L∗ ab space and luminance quantizations [Mantiuk et al., ] make it possible to compare perceived differences in physical quantities of light. Color appearance models like the CIECAM [CIE, ] are capable of exhibiting the complex behavior of our visual system like simultaneous contrast. Image appearance models extend this further to model the spatial and multi-band components of image perception. The s-CIELAB [Zhang and Wandell, ], multi-scale adaptation [Pattanaik et al., ] and iCAM [Fairchild and Johnson, ] models extend color appearance models to incorporate some aspects of the HVS when viewing complex images. Reinhard et al. [] provides a comprehensive overview of these models. While covering a multitude of dimensions of visual perception, these frameworks are still incomplete. None of these models accurately capture the complex scaledependent relationships of blur and contrast perception we have observed. It is this de ciency in the ability of existing models to represent the nuances of the  that has motivated our contributions in Chapters  and .    Chapter   Related Work The previous chapter detailed the signi cance blur/contrast perception plays in image understanding. Unsurprisingly, given this importance, the elds of image processing and computer graphics have investigated the manipulation of blur and contrast in images. This chapter surveys methods and computational approaches to understanding and altering blur and contrast in complex images. Section . covers computational photography, methods for extending camera capabilities of image capture. In particular, we examine the ability of aperture ltering and deconvolution techniques for manipulating blur and contrast when capturing images. Then Section . provides a summary of techniques for estimating the blur and contrast present in images, with an emphasis on local multi-scale operators. This coverage includes the estimation of blur and contrast, as well as how those operators relate to edges and edge detection. Next, Section . describes methods for manipulating blur and contrast in images in spatially-variant manners, including blur magni cation and applications of countershading. Finally, Section . details other applications that relate to the spatial perception of blur and contrast in natural images, including resizing operators and the restoration of detail.    . Computational Photography ..  Computational Image Capture  As digital sensors have become the de facto means of image capture, methods of acquisition have evolved to leverage the processing capabilities inherent in this augmented imaging pipeline. This new envisioning of photography extends beyond the conventional constructs of a single lens focusing the scene onto a static sensor plane. The paradigm of computational photography circumvents the limitations inherent to normal methods of image acquisition by including additional encoding and decoding steps into the conventional imaging process. In this framework, the acquisition setup, including optics, lighting and exposure, is altered to encode additional information in the captured image. The captured image is then decoded by software to recover the conventional pixel values and use the encoded information to synthesize a novel image. Many of these computational techniques have addressed aspects of how the formation of contrast and blur is dictated by the physics inherent to lens optics. 
To overcome the limitations of focus in a conventional lens system Ng introduced the plenoptic camera [Ng et al., ], based on the principles of Fourier-slice photography [Ng, ]. This new camera design involved an array of microlenses over the image sensor that encoded a series of images that could later be used to refocus the image or change the aperture or viewpoint. Subsequent designs by Veeraraghavan et al. [] and Lumsdaine and Georgiev [] have recast the problem in the frequency domain to improve the utility of the original construction. All of these techniques address the removal of defocus blur, and Levin et al. [] analyzed the use of computational cameras for depth-of- eld () extension. The same principles apply equally to the temporal domain, and information can be encoded in the shutter sequence to address motion blur. Raskar et al. [] used a coded exposure pattern, opening and closing the shutter across a single exposure, to better decode motion blur. Similarly, Levin et al. [] devised a parabolic camera motion that represented motion blur in a velocity-independent manner, and Agrawal et al. [] proposed a similar coded approach for video sequences.    Figure .: Plenoptic camera of Ng et al. [] (left). The combination of microlenses and processing allow the image to be refocused after capture Images copyright Ren Ng.  ..  Coded Aperture Imaging  All of these techniques allow the photographer to exert considerable control over the blur present in the image after it has been captured. Similar to how coded exposure patterns address motion blur, coded aperture imaging allows for better removal of defocus blur. Coded aperture imaging rst appeared in the context of astronomical x-ray and gamma ray imaging as a solution to constraints in feasible optical systems for those telescopes. Compared to visible light, the high energy photons simply pass through media without refracting, rendering conventional lenses useless. At rst, the standard practice was to use a pinhole aperture to produce a sharp resulting image, but this approach blocks the majority of the energy from the source and has a very poor signal-to-noise ratio (). The rst improvement over the pinhole aperture was the random aperture arrays proposed by Dicke []. Instead of a single pinhole, they used a two-dimensional array of randomly positioned pinholes, resulting in numerous shifted copies of the image overlaid on top of each other, which they attempted to correct with a decoding step. This process signi cantly improved the  of the imaging system, increasing the amount of light captured by the number of pinholes in the array, but did so at the cost of the resolving power. It is impossible to completely undo the cumulative effects of the random array, and there was always some residual blur. This method was improved upon by structured patterns such as URA [Fenimore and Cannon, ] and MURA [Gottesman and Fenimore, ]. These patterns retain the multiple aperture holes of the random array but are constructed in such a way that the placement of the holes has a unique and complete means of decoding the   signal. This design permits the complete restoration of the original image, and the patterned array effectively acts as a single pinhole with the  of multiple apertures. Aperture Source  Detector  Figure .: Coded aperture pattern for x-ray imaging. Image reproduced from Gottesman and Fenimore. 
More recently, the principles of coded aperture imaging have been employed in visible light photography. Levin et al. [] made use of aperture lters to accurately determine the amount of blur in order to refocus an image. Zhou and Nayar [] analyzed which coded aperture patterns were best suited for deconvolution of defocus blur, providing a comparison of the suitability of each at recovering the original image. In related work, Cossairt et al. [] analyzed the depth invariance of different coded aperture designs and proposed an improved diffuser-based construction that still preserved image spatial frequencies.  ..  Deconvolution  The coded aperture design in visible light optical systems differs from that of systems for x-ray astronomy. Several of the underlying assumptions regarding the optical setup no longer hold, and simple methods of restoring the original image cannot be used. Instead, deconvolution is used to reverse the effects of the blur introduced by the camera optics. In deconvolution, it is assumed that some desired image has been distorted (blurred) by some known function, and the goal is to recover that original image. Mathematically this relation is represented as:   f = f0 ⊗ k + η ,  (.)  where the recorded image f is the result of convolving the original image f0 by some pointspread function () k with additive noise η . Numerous solutions have been proposed over the years, from frequency-space methods such as Wiener ltering [Wiener, ] to iterative methods such as Richardson-Lucy deconvolution [Lucy, ; Richardson, ] and expectation-maximization. It is exceedingly difficult to recover the original image at good quality. The system of equations does not have an exact solution due to the presence of noise and image information is suppressed by the frequency response of the lter. Additionally, the system is ill-posed, resulting in an in nite number of possible solutions. Results often include ringing around edges, ampli ed noise and other artifacts. In the case of coded aperture imaging, the  k is known and speci cally chosen to aid in conditioning the deconvolution. While this design reduces the size of the solution space, it is still no easy task, and additional measures are required to guide the process towards the correct solution. New methods have incorporated natural image statistics into deconvolution algorithms to better recover the original image. The power spectra of images of real world scenes all roughly follow the same distribution. Individual images exhibit variation, but the overall trend carries strongly across all natural images. A number of recent papers have utilized deconvolution algorithms incorporating these assumptions to better recover the original image. Levin et al. [] used a combination of coded aperture and enhanced deconvolution to recover depth and refocus images. More recently, Bando and Nishita [] presented a means of recovering the original image from defocus blur without a coded aperture, while Yuan et al. [] proposed more advanced techniques for removing artifacts. Most deconvolution methods assume the  that degraded the image is already known and use it to invert the blur. Blind deconvolution methods, on the other hand, such as those by Fergus et al. []; Lam and Goodman []; Shan et al. [a] iteratively estimate the shape of the blur while restoring lost image details. 
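For the non-blind case, a minimal frequency-domain sketch in the spirit of Wiener filtering shows how a known PSF is inverted; the scalar regularization standing in for the noise-to-signal ratio is an illustrative assumption.

```python
import numpy as np

def wiener_deconvolve(f, k, nsr=1e-2):
    """Estimate f0 from f = f0 (convolved with) k + noise, for a known PSF k.

    f   : blurred, noisy image (2D array)
    k   : PSF, zero-padded to the image size and centered at pixel [0, 0]
    nsr : scalar noise-to-signal ratio used as regularization
    """
    F = np.fft.fft2(f)
    K = np.fft.fft2(k)
    # conj(K) / (|K|^2 + nsr): approximates 1/K where the PSF passes signal,
    # but avoids amplifying noise where |K| is small.
    W = np.conj(K) / (np.abs(K) ** 2 + nsr)
    return np.real(np.fft.ifft2(W * F))
```

When the PSF is not known in advance, blind methods must estimate k as well.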
To do so, blind methods require additional information about the blur, and most assume that the blur kernel is invariant across the entire image, as is the case with    Figure .: Left: image degraded by motion blur ( shown in insert.) Right: deconvolution results of Yuan et al.. Images copyright Lu Yuan. most camera shake. Any blurring operation spreads the luminance incident on a single pixel over a larger area, averaging the values of neighborhoods of pixels together and suppressing high frequency variations. As adjacent pixels share more of the same light falling upon then, the contrast between them is also reduced. Deconvolution is most commonly employed to restore image sharpness lost when degraded by blurring. Our inquiry in Chapter  investigates the feasibility of using deconvolution methods to restore the contrast lost by blurring.  . Blur Estimation and Manipulation All deconvolution methods restore image detail lost to blurring and thus require an accurate model of the  that degraded the image. Blind deconvolution methods perform the additional step of estimating the shape of that blur kernel. Blur estimation has proven to be of great utility for a number of applications besides the restoration of lost image detail, including computer vision and image quality assessment. This section surveys different techniques of estimating blur in images as well as other applications made possible by blur estimates. Blur is introduced into images by a number of different processes. Object shape, illumination and capture method can all cause different kinds of blur to an image. We focus our discussion on three kinds blurs resulting from the image capture process, each with their own properties: motion blur, camera shake and defocus blur. Motion blur is a linear blur caused by the subject moving in the frame and while it only    applies to that object, tends not to vary spatially. Camera shake is caused by rotation of the camera during the exposure and, depending on the particular rotation, may closely resemble a motion blur or may be a much more complex shape. Camera shake is often uniform, but may vary spatially depending on which axis the rotation is around. Defocus blur is circular blur that depends on the distance of each point in the scene from the focal plane, and each portion of the image is blurred by the scaled shape of the aperture.  Motion blur  Camera shake  Defocus blur  Figure .: Examples of different kinds of motion blur. These three types of blur motivate several approaches to blur estimation. In some cases, like camera shake, the entire image shares the same blur . In cases of motion blur, the blur shape varies across the image, but can be divided into large regions sharing the same . When addressing defocus blur, estimation must consider each pixel individually. Individual algorithms may address one or more of these approaches, depending on the circumstances responsible for introducing the blur into the image. However, estimation techniques can be roughly divided into blur estimation, techniques that estimate the uniform blur of a given region, and edge detection, techniques that estimate the location (and blur) of edges in the image.  ..  Blur Estimation  Blur estimation methods vary in terms of the amount of information they ascertain about the blur. Some routines merely estimate the ne detail absent from the image. Other routines attempt to characterize the parameters of blur, such as direction and   magnitude of motion blur. 
The simplest algorithms perform a global analysis of the blur characteristics, while more advanced methods analyze regions and attempt to segment the image based on the kind of blur present, such as identifying a moving car in front of a stationary background. Any blur estimation procedure must contend with noise present in the image. Both noise and image detail consist of high-frequencies, but the presence of highfrequency information does not imply image detail and blur estimation routines must ensure that high frequency information is not due to noise. Methods such as those by Rooms et al. [] and Tong et al. [] decompose the image using a wavelet transform to separate noise from the rest of the image content and only consider high-frequency information that is consistent with lower-frequency information, as is the case in sharp edges. More recently, Ilic et al. [] compute statistics based on the average cone ratio of wavelet coefficients to improve noise performance further. While resilient to noise, none of these methods estimate the shape of the blur present, only the degree to which it has suppressed image detail. If the shape of the blur is simple, in the case of motion blur, Fourier analysis methods can be used to ascertain blur parameters from how the spatial frequencies of the image deviate from that of natural images. Moghaddam and Jamzad [] observe that a linear motion blur is equivalent to a temporal box- lter, resulting in the spatial frequencies of the image being multiplied by a sinc lter orthogonal to the direction of the blur. Moghaddam and Jamzad use the Radon transform [Deans, ] to determine the orientation of the blur, then compute the distance between the zeros of the sinc function to determine the length. However, this approach is only valid for linear motion blurs with uniform velocity. Ji and Liu [] extended the approach to robustly identify the blur kernel for a number of motion types and accurately determine whether or not the motion is of uniform velocity. Spatially-variant blur estimation schemes operate similar to conventional blur estimation, but have the additional challenge of trying to identify one or more regions of the image degraded by the same blur. Liu et al. [] propose a blur classi cation framework for automatically detecting blurred regions and recognizing the blur types for those regions without performing blur kernel estimation or image deblurring. Chakrabarti et al. [] derive a local measure of the likelihood of a small neighborhood having been blurred by a candidate blur kernel. The authors employ   Small blur  Small blur  Large blur  Large blur  Figure .: Relationship between motion blur and spatial frequencies demonstrated by Moghaddam and Jamzad []. The angle of the zeros in the Fourier transform corresponds to the direction of blur and the spacing of zeros is inversely proportional to the length of the blur. this measure to simultaneously select a motion blur kernel and segment the region that kernel affects for a given image. Dai and Wu [] devise a matte-based blur estimation approach where the object and background are highlighted and the algorithm derives the blur parameters based on the assumption that the blurred outline of the object is a blending of it and the background. Finally, researchers have estimated the motion blur between the sequential frames of a video. 
In this case, the motion of the blur and the motion of the object between frames will be coherent, and the additional frames provide more information for robustly computing the parameters of the blur. Trussell and Fogel [] compare the motion between portions of sequential images to segment the images into regions of consistent blur and estimate the parameters of that blur, subsequently attempting to restore the images of the video sequence. Ben-Ezra and Nayar [] combine a high-speed but low-resolution video camera with a conventional video camera to capture the PSF of the motion blur in higher precision. Like Trussell and Fogel, Ben-Ezra and Nayar segment the image into different blur regions and attempt to restore the original image.

.. Edge Detection

Blur estimation routines consider all the information present in a given region to characterize an estimate of the blur for that region. Edge detection methods, on the other hand, focus on using purely local information to determine the sparse set of positions representing edges present in the image.

The most straightforward definition of an edge is any location with a rapid intensity variation. Mathematically, this construction lends to defining edges in terms of the gradient of pixel intensities:

J(x) = ∇I(x) = ( ∂I/∂x, ∂I/∂y )(x),  (.)

where J indicates the direction of steepest ascent, perpendicular to the edge direction, and the magnitude of J denotes the contrast of the edge. While it is convenient to define edges as locations of perfectly sharp intensity change, a cursory inspection of natural images reveals that the majority of edges present are not perfectly sharp transitions in luminance. Various phenomena, including focal blur inherent in camera optics, penumbral blur of shadows, or shading variations of smooth objects, all lead to the luminance variation of an edge extending beyond a single pixel. As a result, all edge detectors must consider a range of potential widths of edge transitions. Additionally, operators must isolate the edge at a given location within the transition that corresponds to the maximum of edge strength (gradient magnitude).

The Canny edge detector [Canny, ] is one of the earliest and still best-performing edge detection algorithms. At its heart the Canny edge detector consists of the following stages, also depicted in Figure .:

1. A smoothing filter to reduce the degree to which noise and high frequencies affect the subsequent estimation of image gradients.
2. A gradient magnitude operator to estimate the strength of edge contrast at each pixel in the image.
3. A suppression of non-maximum gradient values to determine the precise location of edges.
4. A linking operation to chain individual pixels into edges.

Figure .: Steps of Canny edge detector. The original image (a) is first low-pass filtered to remove noise (b). The gradient magnitude (c) is then computed, followed by the suppression of non-maxima values (d). Images copyright Wikipedia user Simpsons contributor.

This high-level structure of the algorithm has served as a template for a host of later approaches. Subsequent algorithms have refined the method, including differential geometry and variational geometry formulations [Lindeberg, ], but the original Canny algorithm remains one of the most-utilized methods of edge detection. Any choice of smoothing operator used in edge detection requires the selection of a spatial scale σ.
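A minimal sketch of these stages follows (the helper names are ours); the first two stages are written out explicitly, while non-maximum suppression and edge linking are delegated to an existing Canny implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel
from skimage import feature

def gradient_magnitude(image, sigma):
    # Stages 1-2: Gaussian smoothing followed by derivative (Sobel) filters.
    smoothed = gaussian_filter(image.astype(float), sigma)
    gx = sobel(smoothed, axis=1)
    gy = sobel(smoothed, axis=0)
    return np.hypot(gx, gy)

def canny_edges(image, sigma=2.0):
    # Stages 3-4: non-maximum suppression and hysteresis linking, here via
    # scikit-image's Canny detector at the same smoothing scale sigma.
    return feature.canny(image.astype(float), sigma=sigma)
```

The smoothing scale σ is simply fixed in this sketch; how it should be chosen is exactly the question taken up next.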
If only the detection of sharp edges is desired, the choice of σ can be determined from the noise characteristics of the image [Canny, ; Elder and Zucker, ]. Alternatively, a more principled approach employs a scale-space [Koenderink, ; Witkin, ] to reliably detect edges at a variety of scales. Objects in real-world scenes can appear in images at a large range of scales, owing both to the multitude of object sizes and the array of viewing conditions under which the object is observed. The HVS has developed much of the spatial vision adaptations detailed in Chapter  to account for this change in object size, and scale-space frameworks provide an analogous ability to computer vision algorithms. Scale-space methods represent images as a family of smoothed images depending on some spatial parameter, and have the advantage that for a chosen spatial scale σ, features smaller than √σ will be suppressed. The scale-space function is defined as:

N_n(x; σ) = σ^α I(x) ⊗ ∂^n G(x; σ) / ∂x^n,  (.)

where the image I is convolved by an nth-derivative Gaussian of width σ, with a normalization term α such that the function peaks at the scale of the desired feature [Lindeberg, ]. Furthermore, in comparison to partial representations like Gaussian or Laplacian pyramids, complex image features can be recognized by either local maxima, minima, or zero crossings at respective locations across scales. Scale-space methods also have the additional benefit of recovering a spatially-variant estimate of the blur present in an image in the process of adapting the scale according to image content.

Figure .: Conceptualization of a Gaussian scale-space, plotting a signal over position (x) at increasing scale (t). In this figure, a 1D signal is depicted at different levels of the scale-space, t. As the scale increases, more of the high-frequency content is filtered out. Image reproduced from Perona and Malik.

The scale-space framework has proven invaluable for adapting detection and recognition tasks to the different scales at which objects appear in natural images. Detection algorithms for edges, ridges, corners and blobs have been extended by use of scale-spaces to automatically adapt to changing sizes of objects in images. Additionally, this conceptualization provides a hierarchical organization of image features, attaching smaller-scale features to the larger-scale features in which they reside. We have found two particular methods to be of utility in our work: the minimum reliable scale computation of Elder and Zucker [] and the blur estimation method of Samadani et al. [], which we describe below in more detail. This survey of edge detection methods is far from complete and we direct the reader to the comprehensive overview of Basu [].

.. Elder and Zucker Minimum Reliable Scale

The minimum reliable scale (MRS) of Elder and Zucker [] acts as a precursor to their scale-space edge detection operator. The MRS compares the gradient magnitude to the noise level and determines the scale at which edge detection should occur to ensure accuracy. The operator takes advantage of the fact that for any given pixel the portion of the gradient due to noise is independent of adjacent pixels, while the portion of the gradient due to a blurred edge is related to surrounding pixels. This difference in structure implies the portion of the gradient due to noise will decrease faster than the portion due to the edge when the image is blurred, and the signal detection properties will improve when blurring more.
Elder and Zucker use a partial scale-space representation, operating at scales corresponding to powers of two, and estimate the minimum scale at which they can accurately recover the gradient. Their method proceeds as follows: first use local scale control to estimate the gradient and its orientation at each pixel, then use local scale control to estimate the second derivative along the gradient direction at each pixel, and finally localize edges using the second derivative. This information allows them to reliably estimate the location, intensity, and extent of edges in real images containing noise and other artifacts.

Elder and Zucker demonstrate that for a Gaussian edge G of amplitude A, offset B and width σb, captured by a sensor with white noise n(x, y) of variance vn²:

G(σ, x) = (A/2) ( erf( x / (√2 σb) ) + 1 ) + B + n(x, y),   (.)

they can determine a value for a given blur scale s above which they can be confident of the estimated gradients, and define that critical value as:

c(s) = 1.1 vn / s²,   (.)

where the factor 1.1 is determined by their assumptions of the estimation reliability; please see the original paper for more details. Combining this formulation with their edge definition, one can solve for the blur scale ŝ at which the gradient can be reliably detected:

ŝ² = (vn / A) ( 5.4 + √( 28.9 + 15.2 (A σb / vn)² ) ) pixels².   (.)

In practice, when estimating the blur of an edge, the amplitude A is also unknown. Elder and Zucker approximate the MRS by computing gradient magnitude estimates at a series of scales and selecting the smallest scale at which the gradient magnitude exceeds the magnitude of the noise present:

ŝ(x, y) = inf { s : rs(x, y) > c(s) }.   (.)

The result is a map of how much each region of a noisy image must be blurred to allow for accurate estimation of the blur in that region.

Figure .: Example of Elder and Zucker minimum reliable scale. Left: original image. Right: map of minimum reliable scale required to accurately estimate the gradient. Dark regions correspond to smaller scales while light regions correspond to larger scales. Images copyright James Elder.

.. Samadani et al. Blur Estimation

The blur estimation method of Samadani et al. [] produces a spatially-varying map of the blur present at each pixel in the image at the resolution of the thumbnail. The algorithm first generates a standard thumbnail, ts, and produces a scale-space of thumbnails blurred by different amounts. Image features are computed for the high-resolution original as well as for each of the thumbnail images in the scale-space. The amount of blur is determined by the level of the scale-space with feature values most similar to those of the original image features. These features are computed as the maximum absolute difference between a pixel and its eight neighbors. In the case of the original image, the feature values are downsampled using a maximum filter to produce a thumbnail-resolution range map, ro. The levels of the thumbnail scale-space lσj are created by convolving the standard thumbnail with a set of Gaussian kernels of standard deviations σj,

lσj = g(σj) ⊗ ts,   (.)

where lσ0 is the unblurred, original thumbnail. For each of these images lσj, a low-resolution range map rσj is generated. The estimate of the blur present in an image is represented by the blur map m.
Each pixel i of the blur map is determined as

m(i) = min_j { j | rσj(i) ≤ γ ro(i) },   (.)

where γ is a user-specified parameter that controls which rσj most closely matches ro and in turn adjusts the amount of blur added. An example of the blur map is shown in Figure .. The final image is synthesized by selecting the pixel from the lσj that corresponds to m(i). However, the resulting index does not correspond to the σ of the blur in pixels. Our method in Section . calibrates the method to provide an absolute measure of the estimated blur.

.. Blur Magnification

In the context of image processing, blur is mostly considered undesirable and most uses of estimated blur focus on the deconvolution methods described previously to restore lost image details.

Figure .: Input image (left) and the associated blur map (right), spanning estimates from 0 px to 15 px of blur. Note that while the curb is the same distance as the wooden boards, it is estimated as blurrier due to the lack of detail in that area. The blur map is visible only in color.

However, appropriately-applied blur can be used to great artistic effect, including separating foreground from background and conveying a sense of motion. In these circumstances it is advantageous to magnify the blur present in an image. Examples include preserving the appearance of blur that would otherwise be lost to some other image processing operation, and introducing blur that could not be captured, as in the case of small apertures. We detail two uses related to our work: the preservation of the appearance of blur to generate representative thumbnails, and the magnification of blur to simulate a wider aperture than was used to capture the image.

Due to the limited display resolution and the computational expense of displaying many images, lower-resolution thumbnails are regularly viewed in place of full-sized originals, on the viewfinder of a digital camera or in photo management software like iPhoto. However, thumbnail images are not always representative of their high-resolution counterparts. In particular, the sparser pixel sampling of thumbnails makes it impossible to differentiate between sharp and slightly blurred edges. Downsampled images often appear sharper than their respective originals and it is common to encounter sharp and blurred versions of an image that result in identical thumbnails.

Samadani et al. [] utilize their spatially-variant estimate of image blur to reintroduce blur into thumbnails that is lost during the downsampling process. Their approach assumes the thumbnail will be sufficiently small that all blur present in the original image will be removed. Samadani et al. then estimate the blur present in the original image, and reintroduce that blur into the thumbnail. The downsampled image is used to generate a scale-space of increasing amounts of blur. The blur estimated from the full-size image is then used to select the level of the scale-space that best represents the original image blur and produce a thumbnail better representing the blur present in the original.

Depending on the desired application, the defocus blur resulting from the choice of aperture in the lens either enhances or detracts from the image. A wide aperture with large defocus blur is considered desirable for portraits, where it can separate the subject from the background. Conversely, the same would be considered undesirable for photographing a landscape, and a narrow aperture with minimal blur would be more appropriate.
In both cases, the photographer wants as much control of the depth-of-field as possible. However, a large aperture is not feasible in all cases, such as having to choose a small aperture for excessively bright conditions or using a device with a small sensor like a mobile device. In both examples, the limited aperture size prevents the capture of images with a narrow depth-of-field.

While Samadani et al. [] focus on magnifying blur for low-resolution thumbnails, Bae and Durand [] address the high-quality magnification of defocus blur in full-resolution images. Building upon the edge detection of Elder and Zucker [], Bae and Durand obtain an accurate estimate of the scale of blur present at the location of each edge in the image. This estimate is propagated to non-edge pixels using an image colorization scheme [Levin et al., ]. Once they have obtained a full-image map of the amount of blur present, Bae and Durand then use a model of defocus blur resulting from camera optics to solve for the amount of blur to introduce to the image associated with the desired wider aperture. Finally, they synthesize the resulting image with the narrower depth-of-field.

Figure .: Example of Samadani et al. [] blur magnification. The blur in the input image (top) is not visible in the conventional thumbnail (bottom left). The thumbnail created by Samadani et al. (bottom right) retains the appearance of blur. Images copyright Ramin Samadani.

The methods of Samadani et al. [] and Bae and Durand [] are not without limitations. The method of Bae and Durand [] is computationally very expensive and thus not suitable for many applications, such as digital viewfinders. In both cases, the amount of blur is increased by a single scale factor, specified by the user. As our experiment in Section . shows, the perception of blur is more complex than this relationship. Neither method can ensure that the appearance of blur will remain constant if the image is resized, motivating our approach.

. Countershading Operations

In terms of both the estimation and manipulation of image blur, the most versatile methods are those that are able to operate on individual pixels using purely local neighborhood information. This feature is a necessity for complex spatially-variant operations like defocus magnification. The most powerful contrast manipulation techniques operate in a similar fashion, modifying each pixel based on local information from its respective neighborhood.

Figure .: Example of input image (left), estimated blur map (center) and Bae and Durand [] blur magnification (right). Images copyright Soomin Bae.

In terms of the estimation of contrast, two of the methods already presented are capable of determining spatially-variant estimates. Local, band-limited contrast measures (Section .) inherently represent image contrasts in a local sense. Likewise, edge detection methods that rely on scale-spaces (Section .) recover both the width and contrast at pixels corresponding to edge locations. The latter is a sparse set of pixels, but since edges, by definition, are changes in contrast, the representation provides complete coverage of local changes in image contrast. Local contrast enhancement is a powerful image processing technique, fundamental to many aspects of computer graphics such as image editing and tonemapping of high dynamic range (HDR) images.
The same basic techniques can be applied to improve the recognition of objects in a scene, to aid in identifying the brightness of regions, or to accentuate specific image details. In many cases, images with enhanced contrast appear more aesthetically pleasing [Calabria and Fairchild, a,b]. As presented in Section ., one of the most common approaches to enhancing local contrast in images is countershading, where the local edge contrast is increased by adding gradients to either side of the edges. This approach is common across numerous classes of algorithms. Many of these algorithms, whether explicitly or implicitly, resemble the effect of the unsharp masking operator (USM). Unsharp masking increases local contrast by adding a high-pass image Hσ(Y) to the original image Y:

Y′ = Y + λ Hσ(Y),   (.)

where λ is the contrast of the countershading and the high-pass image Hσ(Y) is produced by subtracting a Gaussian-blurred image gσ ∗ Y from the original image Y. Most commonly, σ and λ are determined by the user and do not vary across the image.

This simple technique has proven to be incredibly versatile and, depending on the choice of blur employed in the high-pass filter, can produce a variety of effects. The choice of a small σ increases the acutance [Neycenssac, ], or apparent sharpness, of the image, which is the effect usually referred to as “unsharp masking” [Badamchizadeh and Aghagolzadeh, ]. The choice of a large σ increases the contrast of the image, which is usually referred to as countershading. In this text, we refer to the technique as unsharp masking and the effect of local contrast enhancement it produces as countershading, regardless of the choice of σ. Figure . presents a comparison of different countershading profiles.

.. Manipulating Sharpness

The addition of narrow countershading profiles resulting from a small σ will increase the apparent sharpness of the edge. Increasing the apparent sharpness, known as acutance, by the introduction of countershading differs from the deconvolution techniques of Section ... Deconvolution techniques, in particular those that employ natural image statistics, restore frequency content absent from the image. Narrow countershading profiles, on the other hand, only increase the contrast of the edge, but do not restore lost image details. However, due to the relation between blur and sharpness inherent in the human visual system, higher contrast edges appear sharper than low contrast edges, and the observer’s impression is that the detail of the image has increased. The effect on perceived sharpness can be even stronger than the increase in perceived contrast.

However, this increase in sharpness can also introduce artifacts, and unsharp masking is known to excessively amplify the contrast of small features, especially noise. Neycenssac [] noted that unsharp masking introduces contrast through two separate mechanisms: the addition of countershading profiles around edges and the amplification of existing image features. Lindeberg [a] notes that convolution with a Gaussian of width σ will remove all features smaller than √σ. Thus any feature smaller than √σ, including noise, will be present in the high-pass image Hσ(Y) at its original contrast. When the high-pass image is added to the original image, only features larger than √σ receive countershading profiles, while features smaller than √σ are amplified.
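A minimal sketch of the operator, assuming a grayscale floating-point image; the parameter values and the small demonstration at the end are illustrative rather than taken from the thesis. It makes the amplification behaviour described above easy to reproduce: an isolated noise pixel survives almost entirely in the high-pass image and is therefore boosted by roughly a factor of 1 + λ.

```python
# Minimal unsharp-masking sketch: Y' = Y + lambda * (Y - g_sigma * Y).
# Illustrative only; parameter values are arbitrary.
import numpy as np
from scipy import ndimage

def unsharp_mask(image, sigma=3.0, lam=0.8):
    """Return the countershaded image for a grayscale float array."""
    blurred = ndimage.gaussian_filter(image, sigma)   # g_sigma * Y
    high_pass = image - blurred                       # H_sigma(Y)
    return image + lam * high_pass                    # Y + lambda * H_sigma(Y)

# Small demonstration: a single-pixel "noise" value is amplified by roughly
# (1 + lam), because it is almost entirely contained in the high-pass image.
img = np.zeros((64, 64))
img[:, 32:] = 1.0          # a step edge
img[16, 16] = 0.2          # an isolated noise pixel
out = unsharp_mask(img)
print(out[16, 16] / img[16, 16])   # close to 1 + lam
```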
In the process of increasing edge contrasts, the unsharp masking operator preferentially amplifies high-frequency details in the image. The noise predominantly consists of features smaller than √σ, even for very small σ, and as such will be magnified by unsharp masking more than other features.

Figure .: Amplification of details in the high-pass image. Noise present in the original image is removed when the image is blurred in the process of creating the high-pass image. The absence of high-frequency detail in the blurred image causes the noise to be present in the high-pass image and reintroduced during countershading.

Given the effectiveness of unsharp masking for increasing the apparent detail in images, image processing researchers have dedicated substantial effort to determining optimal parameters that enhance contrast while avoiding unwanted amplification of noise. The most successful approaches have adapted the parameters to local image content. The visual channels in the human visual system are not completely independent of each other, and frequency content at one scale can affect the perception of details at another scale. As a result of this attribute, known as contrast masking, the visual system is less sensitive to noise in regions around edges [De Valois and De Valois, ]. Adaptive unsharp masking operators adjust their parameters to the local gradient magnitude, increasing the strength of countershading near edges where it will increase acutance, and decreasing it further away where it would increase the appearance of noise. Polesel et al. [], Ramponi et al. [] and Kim and Allebach [] have all proposed methods of adaptively determining the optimal countershading contrast, λ. Likewise, Wang et al. [] and Nowak and Baraniuk [] developed techniques for adaptively determining the scale σ. Badamchizadeh and Aghagolzadeh [] evaluated a selection of unsharp masking techniques, comparing their performance at contrast improvement and noise amplification. However, most of this work attempts to address the issue of artifacts resulting from noise amplification, and not the artifacts caused by an objectionable magnitude of countershading (haloes), which is the focus of Chapter .

.. Manipulating Contrast

While the choice of a small unsharp masking σ parameter will increase the perceived sharpness of an image, the choice of a large σ parameter will increase the local contrast of the image. For a given choice of λ, both parameter choices result in an equal increase in contrast at the location of the edge, but the narrower profile is perceived as a change in detail and the wider profile is perceived as a change in contrast. Sufficiently wide unsharp masking profiles can even induce the Cornsweet illusion (Section .), where the entirety of adjacent regions changes in appearance.

In many cases, increasing the perceived contrast of an image increases viewers’ preference [Calabria and Fairchild, a,b], and image processing algorithms have used countershading to increase the aesthetic quality of the image. Additionally, local contrast enhancement can be applied to improve the recognition of objects in a scene, increasing the contrast between them and the surrounding areas [Luft et al., ]. Countershading is frequently employed in local tonemapping operators, although not always intentionally. The operator by Chiu et al. [] explicitly computed a high-pass image with a large σ to reduce global contrast while retaining edge contrast.
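The general shape of such an operator can be sketched as below. This is not Chiu et al.'s exact method, only an illustration of the large-σ high-pass idea: compress a low-pass "base" layer while adding the high-pass detail back at full strength, which implicitly countershades strong edges.

```python
# Generic sketch (not Chiu et al.'s exact operator) of dynamic range
# reduction with a large-sigma high-pass term: compress the low-pass "base"
# while adding back the high-pass detail.
import numpy as np
from scipy import ndimage

def compress_with_highpass(log_luminance, sigma=30.0, base_scale=0.4):
    """Compress dynamic range of a log-luminance image, keeping local contrast."""
    base = ndimage.gaussian_filter(log_luminance, sigma)   # low-pass layer
    detail = log_luminance - base                          # large-sigma high-pass
    return base_scale * base + detail                      # compressed base + detail
```

Because the base layer is produced by a linear Gaussian rather than an edge-preserving filter, strong edges leak into the detail layer, which is precisely the source of the halo artifacts discussed next.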
However, while reducing the dynamic range, this naïve approach introduced significant haloes. More recent operators like Durand and Dorsey [] and Fattal et al. [] can also introduce countershading profiles, though they do not view them as a benefit. In general, countershading in local tonemapping is considered synonymous with halo artifacts or “gradient reversals”. On the other hand, we view the particular combination of contrast and scale of the countershading, rather than the presence of any profiles, as responsible for the loss of image quality.

Figure .: Many tonemapping operators employ countershading to reduce the contrast of an HDR image (left) to make details more visible (right). Images copyright Kevin McCoy and Wikipedia contributor Darxus.

Most related to our work is that of Krawczyk et al. [] on restoring luminance contrast lost during tonemapping. They propose an automated method that introduces countershading to a low dynamic range (LDR) tonemapped image to match the contrast of a reference HDR image. In an attempt to avoid introducing objectionable artifacts, they propose a perceptual model of just-detectable countershading based on the model of Dooley and Greenfield []. However, such a model relates to the detection of countershading profiles, not to whether the countershading is considered objectionable, and is overly conservative. Additionally, the method of Krawczyk et al. is meant to reproduce the contrast of an HDR image rather than enhance contrast without an HDR reference. We discuss their model in more detail in Section ..

Countershading has been used in several other capacities to restore lost contrast and enhance scene understanding. Smith et al. have used countershading to restore color saturation lost during tonemapping [] and while converting from color to greyscale []. Similarly, Luft et al. [] and Ritschel et al. [] added countershading based on depth values to improve the recognition of objects in scenes. Finally, work by Akyüz and Reinhard [] used the Cornsweet illusion as a means of evaluating the perception of contrast alterations introduced by tone-mapping operators.

Figure .: Restoration of lost contrast by Krawczyk et al. []. A tone mapped image (a) doesn’t always reproduce the appearance of contrast of its HDR counterpart. Krawczyk et al. introduce countershading profiles (b) to produce a higher contrast result (c) that more closely matches the original, using a model of human perception (d) to avoid the countershading being deemed objectionable. Images copyright Gregorz Krawczyk.

. Related Applications

Finally, our investigations on image resizing and the creation of representative images relate to a selection of other topics in image processing and computer graphics. These subjects include image upsampling, seam carving, visualization of large images and noise estimation. In the context of image resizing, Fattal [], Kopf et al. [] and Shan et al. [b] developed techniques for intelligently upsampling images. These methods use assumptions about the image statistics to invent additional information based on existing details, providing a more natural rendition than resampling with a reconstruction filter. Fattal [] defines an edge-frame continuity modulus to impose a certain set of statistics on the edges in the upsampled image, retaining sharpness where appropriate. Kopf et al.
[] use existing high-resolution images to aid in the upsampling of the result of a computation performed at lower resolution. Lastly, Shan et al. [b] extended the concept of upsampling across multiple images and to handle video sequences.

Figure .: Intelligent upsampling tries to hallucinate more realistic image details than a reconstruction filter. Images copyright Raanan Fattal.

Seam carving methods, such as the one by Avidan and Shamir [], resize images by inserting or removing pixels in the least important regions of the image, preserving the overall structure. A number of additional methods have been presented, including extensions to video [Krähenbühl et al., ; Pritch et al., ]. Seam carving mostly focuses on adjusting aspect ratios and is combined [Rubinstein et al., ] with regular downsampling operations for extreme changes in resolution. Rubinstein et al. [] provide a comprehensive overview of existing methods, and conduct both perceptual and objective analyses of each, comparing their performance.

Figure .: Image retargeting tries to change the aspect ratio of the image without scaling by removing the least salient portions of the image. Images copyright Wikipedia contributor Newton.

Image resampling is not the only means of generating smaller versions of an image, and depending on the intended task, a different representation of the content present in the image may be preferable. These algorithms produce smaller versions of an image that are not derived from a straight resizing of the original. Suh et al. [] and Santella et al. [] have proposed automatic cropping methods for selecting the most salient region of an image to highlight in a thumbnail. Berkner and Erol [] present a method of creating thumbnails of pages of documents that are more recognizable than simple downsampled versions of the page. They focus on preserving the shape and position of blocks of text, as well as giving added emphasis to the position and content of images.

Lastly, resizing operations can remove additional features besides the blur present in the full-resolution original. When resizing, the full-size image is low-pass filtered to avoid introducing aliasing into the thumbnail. However, this operation will remove any noise present in the large image, and the thumbnails for a pair of low-noise and high-noise images of the same scene will be identical. If the resizing operation is to preserve the appearance of noise in the original, it will need to accurately estimate the quantity present. Any adaptive denoising technique includes some means of estimating the noise level of an image region, and the wavelet denoising techniques by Donoho et al. [] and Simoncelli and Adelson [] are a common example. Other approaches attempt to characterize the noise as a function of the camera’s response, such as methods by Liu et al. [] and Shin et al. []. These techniques have applications beyond denoising and resizing; adjusting the smoothing step of edge detection to the level of noise present is one such example.

Chapter 

Synthetic Depth-of-Field for Mobile Devices

The depth-of-field present in a photograph can significantly alter how that image is perceived. Skillful photographers can use a shallow depth-of-field to great effect and dramatically emphasize the subject matter of the photograph.
Cell phones and similar mobile devices are rapidly becoming the primary camera for many users, and the optical packages of these devices cannot produce a narrow depth-of-field. The size of the optical package that can be constructed in such a confined space prevents the lens from producing a large defocus.

. Introduction

The limited space available for cameras in mobile devices places constraints on the design of the lens and the optics of which it is comprised. One effect of this confined space is that the lens is unable to achieve a significant defocus blur, preventing the camera from capturing images with a narrow depth-of-field. It is very difficult to overcome this constraint with the physical construction of the lens, and instead we propose a method of producing a narrower depth-of-field synthetically. Our method builds on existing approaches to robustly determine a spatially-variant estimate of the defocus blur present in a captured image when significant noise is present, shown in Figure .. We then modify that estimate to produce the blur associated with an image of a narrower depth-of-field and use that new blur to synthesize an image with the desired amount of defocus blur.

Figure .: A comparison of the original image (left), estimated blur map (center), and our synthetic depth-of-field algorithm (right). The depth-of-field is sufficiently wide in the original image to make determining the card most in focus difficult, while the depth-of-field in our result is narrow enough to make it obvious that the third photograph is in focus.

The result, also shown in Figure ., is a means of producing a narrower depth-of-field than could physically be captured, suitable for mobile devices in terms of both computational efficiency and robustness to the noise present in captured images. Since the constraints on the depth-of-field cannot be easily solved with a replacement optical design, we propose to narrow the depth-of-field by synthetically magnifying the defocus blur. While methods for magnifying the blur present in an image already exist, they are either too computationally expensive to be realized on a mobile device [Bae and Durand, ], or are sensitive to the noise levels common in cellphone cameras and produce low-quality results [Samadani et al., ].

Drawing on existing methods, we propose an efficient and robust algorithm for estimating the blur present in an image. Given a noisy image, we first compute how much we must reduce the variance of the noise at each pixel location to accurately estimate the blur present. We estimate the blur at a number of different scales and choose the most reliable estimate given the noise present. We then use that blur estimate to synthesize a new image with magnified defocus blur, simulating a reduced depth-of-field for devices with aperture-limited optical packages.

. Blur Estimation

To create an image with the desired depth-of-field, our algorithm first determines a spatially-variant estimate of the blur present in the original image, then modifies that blur estimate to correspond to a narrower depth-of-field, and finally synthesizes the resulting image. As a starting point for our estimation method we use the algorithm by Samadani et al. [], detailed in Section .., because of its simplicity and computational efficiency. Figure . provides a flowchart of the operation of the algorithm.
Figure .: Flowchart of Samadani et al. blur estimation. Features from the full-size image are compared to corresponding features from the thumbnail scale-space to determine the appropriate blur for each pixel.

While Samadani et al.'s blur estimation provides a means of controlling the relative increase in the amount of blur in the resulting thumbnail, it does not provide an absolute measure of the blur present in the large image. In order to amplify the defocus blur, we need to know the scale of the blur present in the original image. To do so, we extend their local image features to a general relationship between the width of a blurred edge and the corresponding derivatives at different resolutions, which we can use to recover the scale of the original image blur.

In the case of a 1D Gaussian-blurred edge of normalized contrast, the edge profile is the integral of the Gaussian function. The derivatives of this profile follow the Gaussian function, with the peak lying at the center point of the edge. For a Gaussian profile with standard deviation σ to have a contrast of 1, the derivative of the edge cross-section will be:

g(σ, x) = 1/√(2πσ²) · e^(−x²/(2σ²)).   (.)

This scaling factor establishes a relationship between the width of the Gaussian profile and the scale of its derivatives. If the width of the edge profile changes by a factor k, the derivatives must change by a factor of 1/k to retain the same contrast.

The range map operator in Samadani et al. approximates the gradient magnitude. For an edge with blur σo, the corresponding range map will equal a Gaussian distribution at the edge location with an amplitude of 1/√(2πσo²). After downsampling that image to obtain ro, the value at the edge location is still equal to 1/√(2πσo²). Due to downsampling, the effective width of that edge in the thumbnail lσ0 will differ by the downsample factor d. The pixel corresponding to the edge location in the thumbnail range map rσ0 will be

1 / √( 2π (σo/d)² )   (.)

due to the relationship between the width and scale of a Gaussian mentioned above. Additionally, that σo/d will be further altered by the Gaussian filtering that generates the scale-space images lσj. Using the convolution formula for Gaussian functions:

g(n1, σ1) ⊗ g(n2, σ2) = g( n1 + n2, √(σ1² + σ2²) ),   (.)

the width of the edge in thumbnail scale-space image lσj will thus be

√( (σo/d)² + σj² ).   (.)

We construct the scale-space from a series of blurs with a uniform spacing of β, which implies σj = βj. This way, the choice of β along with the maximum j determines the range and quantization of the scale-space. The end result is two different values for corresponding pixels of the two range maps:

ro = 1/√(2πσo²)   vs.   rσj = 1 / √( 2π [ (σo/d)² + (βj)² ] ).   (.)

Figure . depicts the relationship between ro and the scale-space rσj for an edge of increasing blur. For the algorithm to select the correct value for the blur map m(i), the two values of Equation . must be equal.

Figure .: Demonstration of calibrating blur estimation. Top: image of a step edge with blur increasing from 0 . . . 10 along the x-axis. Bottom: range map values along the dotted line for the original image ro (red), and the downsampled scale-space rσj (blue).
Note that the intersection between ro and rσj (black dots) happens at x = j.

In complex images, adjacent image features alter the derivative values at this edge location and a direct solution would misestimate σo. Our approach determines σo based on the correspondence between the range map and the levels of the scale-space. Adjacent features alter the gradient magnitude in both range maps in the same fashion, and the correspondence between them is preserved. Additionally, image features smaller than √σ are suppressed on the scale-space level with a Gaussian blur of σ, eliminating some of the overlapping features [Lindeberg, b].

To determine the value of the blur map m(i), Samadani et al. employ the user-specified parameter γ to bias the selection of values for m(i) towards more or less blurred levels of rσj. Noting that the relation between the two range maps depends on the downsample factor, we instead determine the value of γ that correctly scales the range map of the original image against the range maps of the downsampled scale-space. We solve the following equation between ro and rσj

γ · 1/√(2πσo²) = 1 / √( 2π [ (σo/d)² + (βj)² ] )   (.)

for the value of γ that ensures the index of the blur map selected with Equation . is equal to the width of the blur in the original large image, m(i) = j if σo = j. Canceling terms and solving for γ yields:

γ = 1 / √( (1/d)² + β² ).   (.)

The result is a value of γ automatically chosen for a given downsample factor and scale-space resolution. In our method, we use 25 levels of blur (j = 1, . . . , 25) and a value of β = 0.4. The resulting values for downsamples of 2, 4, and 10 are γd=2 = 1.55, γd=4 = 2.11 and γd=10 = 2.48.

. Noise-Robust Estimation

However, the resulting estimate is highly susceptible to any noise present in the original image. Consider an image that contains a blurred edge and some amount of noise. The larger the blur of the edge, the smaller the corresponding gradient magnitudes will be, until the gradient magnitude, and the resulting blur estimate, is determined more by the noise present than by the edge. This situation can occur for even small amounts of noise, and when it does the estimated blur is biased towards low blur scales or is mis-estimated as perfectly sharp, as seen in Figure .. As a result, Samadani et al. alone cannot accurately estimate the blur in images captured with mobile devices due to the noise levels present.

Figure .: Comparison between actual edge blur and the blur estimated by Samadani et al. for noise levels σ = 0.0, 0.5, 1.0 and 2.0 (estimated blur in pixels plotted against actual blur in pixels). As the noise level increases, the estimated blur is biased towards smaller scales, eventually estimating the entire image as sharp.

To address this shortcoming, we can determine the minimum reliable scale of Elder and Zucker [], described in Section .., at which we can accurately estimate the blur. The MRS of an image region corresponds to the largest amount of blur that can be reliably detected in that image region. This scale is dependent on the signal-to-noise ratio, and hence the image intensities, in that region. Elder and Zucker prove that for a given edge profile and noise intensity, the image can be blurred by some amount to sufficiently reduce the noise and allow accurate estimation of the blur. Combining the methods by Samadani et al. and Elder and Zucker, we can produce a robust estimate of the blur in noisy images as follows, also shown in Figure .:
1. For each scale s of the set of minimum reliable scales, convolve the original image by a Gaussian of scale s to produce a blurred copy Is(x, y).
2. For each Is(x, y), obtain the estimated blur map ms(x, y) using Samadani et al., as described in Section ..
3. For each scale s, compute the gradient magnitude rs(x, y) of the corresponding blurred image Is(x, y).
4. Compare all the gradient magnitudes rs(x, y) to determine the appropriate scale ŝ(x, y) at which to estimate the blur for each pixel.
5. Select the appropriate ms(x, y) for each pixel according to the scale map ŝ(x, y) to produce the final blur estimate m(x, y).

Figure .: Flowchart overview of our combined minimum reliable scale and blur estimation. The gradient magnitude rs(x, y) and blur estimate ms(x, y) are computed at scale s. Then the gradient magnitudes rs(x, y) are compared to produce the minimum reliable scale map ŝ(x, y), which is used to select the appropriate blur estimate ms(x, y) for each pixel to produce the final blur estimate m(x, y). This process is repeated for each minimum reliable scale level being considered.

The blur estimate corresponding to each blurred copy Is(x, y) includes the amount of blur that was added to the image to ensure a reliable estimation. We use the convolution formula for Gaussians from Equation . to remove that added blur and determine the blur present in the original image before creating the final blur estimate m(x, y). Once the reliable blur estimate has been obtained, we modify the final blur estimate to represent the narrower depth-of-field. Finally, we use the modified blur map to synthesize the desired image.

. Efficient Estimation

While conceptually simple, the approach outlined in the previous section is inefficient. Our optimized method exploits redundancies in the intermediate results of each stage of the method. Both the minimum reliable scale computation and the blur estimation corresponding to each scale construct a scale-space [Lindeberg, c] of Gaussian-blurred images used in the computation of their respective final results. We optimize two portions of the method: results shared between a given minimum reliable scale level and the blur estimation associated with that level — rs(x, y) vs. ms(x, y) — and results shared between the blur estimations associated with different scales — ms1(x, y) vs. ms2(x, y).

First, we consider the results shared between the computation of a given minimum reliable scale level and the blur estimation associated with that level. To determine the final map of the minimum reliable scale ŝ(x, y) and the blur estimate m(x, y) for a given image I, two outputs from each scale s of the minimum reliable scale computation are required: the gradient magnitude map rs(x, y) and the estimated blur map ms(x, y) corresponding to that scale. Both the minimum reliable scale computation and the blur estimation require the gradient magnitude rs(x, y) corresponding to that scale. Additionally, the blur estimation requires the original image I convolved by a Gaussian of radius s, g(s). We first blur I by g(s) to obtain the blurred image Is(x, y). We then compute rs(x, y) from the partial differences of Is(x, y). Once the input to the blur estimation Is(x, y) and the gradient magnitude rs(x, y) have been computed, the blur estimation proceeds as normal. Our method is shown in the top half of Figure ..
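The refactored per-scale computation can be outlined as follows. This is an illustrative sketch rather than the thesis implementation; samadani_blur_map is a hypothetical helper standing in for the calibrated estimator of Section ., and the critical-value test follows the Elder and Zucker formulation with the noise standard deviation supplied by the caller.

```python
# Sketch of the shared per-scale computation: blur the image once per scale,
# reuse that blurred copy for both the gradient magnitude (needed by the
# minimum reliable scale test) and the blur estimation.
# `samadani_blur_map` is a hypothetical helper for the calibrated estimator.
import numpy as np
from scipy import ndimage

def estimate_blur_noise_robust(image, noise_sigma,
                               scales=(0.5, 1, 2, 4, 8, 16, 32)):
    grads, blur_maps = [], []
    for s in scales:
        blurred = ndimage.gaussian_filter(image, s)      # I_s(x, y), computed once
        gy, gx = np.gradient(blurred)
        grads.append(np.hypot(gx, gy))                   # r_s(x, y) from partial differences
        blur_maps.append(samadani_blur_map(blurred))     # m_s(x, y)

    # Minimum reliable scale: smallest s whose gradient exceeds the critical
    # value c(s) = 1.1 * v_n / s^2.
    critical = [1.1 * noise_sigma / s ** 2 for s in scales]
    scale_idx = np.full(image.shape, len(scales) - 1, dtype=int)
    for k in range(len(scales) - 1, -1, -1):
        scale_idx[grads[k] > critical[k]] = k

    # Pick the blur estimate at the selected scale, then remove the blur that
    # was added for reliability: sigma_orig = sqrt(max(m^2 - s^2, 0)).
    m = np.choose(scale_idx, blur_maps)
    added = np.array(scales)[scale_idx]
    return np.sqrt(np.maximum(m ** 2 - added ** 2, 0.0))
```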
On the other hand, the naïve formulation computes each step in different ways, as shown in the bottom half of Figure .. The minimum reliable scale computation computes the gradient magnitude from the result of convolving the image with steerable Gaussian basis filters gx(s), gy(s). To compute the same quantity for the blur estimation, the image has to first be blurred by a Gaussian g(s), and then partial differences are computed to determine the gradient magnitude. Our method computes fewer convolutions of the full-size I per scale s.

Figure .: Comparison of our formulation and the original formulation for computing rs(x, y) and ms(x, y). Both the minimum reliable scale computation and the blur estimation share rs(x, y), but compute it in different ways. We rearrange the terms to share the computation between both portions with minimal extra computation.

Second, we consider the results shared between the blur estimations associated with different scales. For each scale s, the image blurred by that scale Is(x, y) is passed to the corresponding instance of blur estimation. In the construction of each of the blur estimation scale-spaces, that input Is(x, y) is further blurred by a range of values σj = 0 . . . j, resulting in a final blur of

Is,σj = I ⊗ g( √(s² + σj²) ),   (.)

by Equation .. A resulting blur of √(s² + σj²) will occur in the blur estimations of multiple scales. We compute all the possible combinations of √(s² + σj²), round each to the nearest integer, and only compute each final blur once, reusing it in all blur estimations that require that value. At worst, this is O(√(s² + σj²)) convolutions, as opposed to (s × j) convolutions.

Finally, there are several minor additional optimizations. All of the Gaussian blurs are computed using the recursive implementation by Young and van Vliet []. We compute all the gradient magnitude maps rs(x, y) using partial differences, compared with the maximum difference between a pixel and its 8 neighbors used by Samadani et al., reducing the number of memory reads by a factor of 4×. While neither of these optimizations seems like much on its own, they add up to a notable difference in required computation. Consider the specifics of an Apple iPhone , with a resolution of  megapixels. We use seven scales in the minimum reliable scale computation, s ∈ {0.5, 1, 2, 4, 8, 16, 32}, and a blur estimation scale-space with a maximum of 30: σj = 0 . . . 30. As with the naïve implementation, all of the rs(x, y) are combined to form the minimum reliable scale map ŝ(x, y). In turn, that map is used to select the appropriate blur estimate ms(x, y) for each pixel of the final blur estimate m(x, y).

operation                                  our method    naïve
full-size image blurs
blur-estimation scale-space blurs
full-size gradient magnitude diffs
downsampled gradient magnitude diffs

Table .: Comparison of the number of operations for different steps in the full blur estimation operation, specifically the number of Gaussian blurs and pixel differences that happen at the original resolution and the downsampled resolution used in Samadani et al.

Lastly, as noted in Section ., the estimated blur is computed on a downsampled image, and we resize the final blur estimate m(x, y) to the size of the original image.
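Returning to the second optimization, the reuse of blurred copies whose combined scales coincide can be sketched as below. The scale values match those listed above, but the code itself is an illustrative outline, not the thesis source.

```python
# Sketch of sharing Gaussian blurs across scales: the combined blur
# sqrt(s^2 + sigma_j^2) is rounded to the nearest integer and each unique
# value is computed only once, then reused by every (s, j) pair needing it.
import numpy as np
from scipy import ndimage

def precompute_shared_blurs(image, scales=(0.5, 1, 2, 4, 8, 16, 32), max_j=30):
    needed = {}
    for s in scales:
        for j in range(max_j + 1):
            combined = int(round(np.sqrt(s ** 2 + j ** 2)))   # rounded total blur
            needed.setdefault(combined, None)
    for sigma in needed:
        # One convolution per unique rounded blur, instead of one per (s, j).
        needed[sigma] = ndimage.gaussian_filter(image, sigma) if sigma > 0 else image
    return needed   # lookup table: rounded blur -> blurred image

def blurred_level(shared, s, j):
    """Fetch the scale-space level for reliable scale s and estimation level j."""
    return shared[int(round(np.sqrt(s ** 2 + j ** 2)))]
```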
. Blur Synthesis

Given an accurate estimate of the blur present in an image, we need to synthesize the final image with the desired narrower depth-of-field. To do so, we scale the estimated blur map to represent the target depth-of-field, md(x, y), and compute the amount of additional blur ma(x, y) we must introduce at each pixel to achieve that amount of blur. Finally, we produce the image with that additional blur to obtain the desired depth-of-field.

In optical terms, the estimated blur map m(x, y) approximates the radius of the circle of confusion due to defocus blur at each point. Decreasing the depth-of-field of the image corresponds to linearly scaling all values of the blur map by a value greater than 1 to increase the circle of confusion. To adjust the depth-of-field to be equal to an f-number different from that of the physical aperture, we use the relationship between the diameter of the circle of confusion c and f-number N:

c = ( |S − D| / S ) · f² / ( N (D − f) ),   (.)

where D is the focus distance, S is the distance of a point in the scene, and f is the focal length of the lens. We are interested in the proportional increase between the circles of confusion of the desired f-number Nd and the original f-number No. The increase b applied is equal to the ratio of cd and co; canceling terms, we can discard all distances, and this quantity is equal to the ratio of the f-numbers:

b = cd / co = No / Nd.   (.)

So the desired blur is given as md(x, y) = b m(x, y).

Even though m(x, y) approximates the blur at each pixel in the image, that blur does not necessarily represent the defocus or depth. If an image region is a flat color, we cannot determine whether it is a detailed region that is out of focus or an in-focus region that simply lacks detail. If a point in the scene is out of focus we know that the corresponding pixel will be estimated as blurred, but if a pixel is estimated as blurred we cannot assume the corresponding point in the scene is out of focus. One implication of this ambiguity is that the estimated blur of adjacent pixels can differ significantly. While synthesizing the final image, the difference in the amount of additional blur those pixels receive will be even larger, due to the scaling b applied to the blur map. This inconsistency between the blurs applied to adjacent neighborhoods can introduce contrast inversions in the final image, as seen in Figure ..

Figure .: Example of the contrast inversion resulting from synthesizing the final image with an uncorrected blur map. The uncorrected blur map can have vastly different blurs adjacent to each other, causing contrast inversions where the blur amount changes, like the eyes, while the corrected blur map does not. Original image copyright Flickr user hpj.

One means of addressing this issue is to solve a sparse linear system [Levin et al., ], like Bae and Durand [], to propagate the estimated blurs until they are consistent. While this approach successfully avoids artifacts for a large range of desired depths-of-field, it is too computationally expensive for our targeted application. Instead, we enforce the simple condition that no pixel in the desired blur md(x, y) may differ from its neighbors by more than some quantity. For each σ = 0 . . . max(md) in the range of desired blurs, we compute the distance transform for the set of pixels with a desired blur of σ. Each pixel in the modified map m′d is the minimum of all the distance values corresponding to that pixel.
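The synthesis-side handling of the blur map can be sketched as follows: scale by the f-number ratio, then limit how quickly the desired blur may grow across the image. The limiting rule used here, where a pixel's blur may exceed a level σ by at most its distance to the nearest pixel of that level, is one plausible reading of the condition described above and not necessarily the exact rule used in the thesis.

```python
# Sketch: scale the estimated blur map by the f-number ratio, then enforce
# spatial consistency so adjacent pixels cannot receive wildly different blur.
# The consistency rule is an interpretation of the text, not the exact
# thesis implementation.
import numpy as np
from scipy import ndimage

def scale_blur_map(m, n_original, n_desired):
    """md = b * m with b = N_o / N_d."""
    return (n_original / n_desired) * m

def limit_blur_growth(md):
    """Cap each pixel's blur at (level + distance to nearest pixel of that level)."""
    md_int = np.round(md).astype(int)
    limited = np.full(md.shape, np.inf)
    for level in range(int(md_int.max()) + 1):
        mask = md_int == level
        if not mask.any():
            continue
        # Distance (in pixels) to the nearest pixel whose desired blur is `level`.
        dist = ndimage.distance_transform_edt(~mask)
        limited = np.minimum(limited, level + dist)
    return np.minimum(md, limited)
```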
The result of this operation is shown in Figure ..

Figure .: False-color image comparing the desired blur map md (left) and the modified version m′d (right). The large estimated blurs on the character’s face and chest are reduced to be consistent with nearby smaller values. Original image copyright Flickr user hpj.

We use that new blur m′d(x, y) and the blur present m(x, y) to compute the additional blur required ma(x, y) at each pixel with Equation .. To produce the final image, we construct a Gaussian scale-space corresponding to the range of additional blurs σa = 0 . . . max(ma). For each level, we blur the original image I by the corresponding amount of additional blur σa, then linearly blend those blurred images together to approximate non-integer values of ma. The Gaussian model differs from the geometric model for defocus blur. However, it has been argued that Gaussian blur better accounts for artifacts in actual cameras, and it has been used widely in computer vision [Pentland, ; Subbarao, ]. While more accurate spatially-variant blur synthesis is possible, such as Popkin et al. [], we have not noticed any artifacts requiring such methods.

. Evaluation

Most of the parameters of our algorithm can be determined from the properties of the imaging system of the mobile device. We use the Apple iPhone GS as the example for our evaluation, the specifications of which are given in Table ..

Resolution       ×
Aperture         f/2.8
Focal length     . mm
Sensor size      . mm (diag)

Table .: Specifications of the Apple iPhone GS camera.

Even though the iPhone GS lens has an aperture of f/2.8, the defocus blur produced by that aperture is the same as an f/28 aperture at the equivalent focal length on a 35 mm camera. Given that circle of confusion, the maximum defocus blur that can be achieved is a disc filter with a radius of  px, by Equation .. For the blur estimation, we use a downsample factor of 4× and a scale-space with levels corresponding to Gaussian scales (j = 0 . . . 15) and a value of β = 0.4. For the minimum reliable scale computation, we estimate the noise present in the image using the method by Ibenthal []. We synthesize the final image with 2× the circle of confusion found in the original image. This increase in circle of confusion equates to an aperture of f/1.4 (f/14 in 35 mm terms), and results in a maximum additional blur of σ = 25. Artifacts begin to become apparent beyond this scale, and a narrower depth-of-field would require better blur synthesis methods such as those used by Bae and Durand []. Figure . contains results of our method.

Figure .: A selection of images produced with our method, comparing the conventional thumbnail (left), estimated blur map (center), and our synthetic depth-of-field algorithm (right). In all images, the desired blur md(x, y) was determined by doubling the estimated blur.

While our method can produce reasonable results in many cases, it has several limitations. In order to synthesize a narrower depth-of-field, there has to be a range of defocus blurs detectable in the image. Unless the foreground is very close to the camera, the amount of defocus blur will not be sufficiently large for our algorithm to distinguish between blurred and sharp edges. In this case, magnifying the estimated blur will have little effect on the depth-of-field. Additionally, our method can accurately estimate the blur of edges when noise is present, but it has difficulty with textured regions. Depending on the estimated noise
level, our method cannot easily distinguish between fine texture detail and noise, as seen in Figure .. Significant noise present in the image causes many textured regions to be estimated as blurry, and all detail is blurred out when synthesizing the final image. This confusion between noise and texture would be avoided in the future by using an analytic noise model for the given camera at the specific exposure and gain levels used to capture the image.

Figure .: Our blur estimation method cannot distinguish between noise and texture detail, and regions with fine texture detail will result in a blurry estimate. The texture in the foam (left) is indistinguishable from the noise in the beer, and both are blurred in the synthesized image (right).

Neither our method nor Samadani et al. can accurately detect motion blur or camera shake. Our method estimates the blur in proportion to the maximum gradient magnitude. Large gradient magnitudes correspond to small blurs, while small gradient magnitudes correspond to large blurs. Both motion blur and camera shake result in a linear blur, which still has large gradient magnitudes orthogonal to the blur direction, and such image regions are estimated as sharp.

. Conclusion

In this chapter we have combined the blur estimation of Samadani et al. with the robustness of Elder and Zucker’s minimum reliable scale to produce an efficient means of estimating spatially-variant blur in noisy images. We have shown that refactoring the implementation of the method can reduce the amount of computation necessary, especially in terms of the number of Gaussian blurs required. The optical design of cameras included in mobile devices makes it impossible to physically capture a narrow depth-of-field. Using our estimate of the blur present in the image, we synthesize a new image with a narrower depth-of-field than the camera could actually capture. While our method has limitations in terms of how accurately it can estimate blur, and how much it can enhance the blur it does find, it is capable of producing compelling results.

Chapter 

Blur-Aware Image Downsizing

Synthesizing a narrower depth-of-field is not the only application of blur magnification. The spatially-variant blur estimation of the previous chapter can be used with models other than camera optics. Resizing to a lower resolution can alter the appearance of an image. In particular, downsampling an image causes blurred regions to appear sharper. It is useful at times to create a downsampled version of the image that gives the same impression as the original, such as for digital camera viewfinders.

To understand the effect of blur on image appearance at different image sizes, we conduct a perceptual study examining how much blur must be present in a downsampled image for it to be perceived the same as the original. We find a complex, but mostly image-independent, relationship between matching blur levels in images at different resolutions. We incorporate this model in a new appearance-preserving downsampling algorithm, which alters blur magnitude locally to create a smaller image that gives the best reproduction of the original image appearance.

. Introduction

One pervasive trend in imaging hardware is the ever-increasing pixel count of image sensors. Today even inexpensive cameras far outperform common display technologies in terms of image resolution.
For example, very few cameras remaining on the market, including those in mobile devices, capture images at a low enough resolution to show on a p high-definition television (HDTV) display without resizing. In an extreme case, the preview screen on one Nikon professional digital single-lens reflex (DSLR) camera can only display 1.5% of the pixels captured by the sensor. The cellphone camera owner and the K cinematographer face the same problem of getting an accurate depiction of the image when they cannot see all the pixels.

While high-resolution images are needed for a number of applications such as on-camera previewing, print output or cropping, the image is often previewed on a display of lower resolution. As a result, image downsampling has become a regular operation when viewing images. Conventional image downsampling methods do not accurately represent the appearance of the original image, and lowering the resolution of an image alters the perceived appearance. In particular, downsampling can cause blurred regions to look sharp, and the resulting image often appears higher quality than its full-size counterpart. While the higher quality images can be desirable for purposes such as web publishing, the change is problematic in cases where the downsampled version is to be used to make decisions about the quality of the full-scale image, for example in digital viewfinders.

In this chapter, we aim to develop an image downsampling operator that preserves the appearance of blurriness in the lower resolution image. This is a potentially complex task — the human visual system’s ability to differentiate blurs is dependent on spatial frequency, and edges blurred by different amounts may be perceived as different at one scale but equal at another. Additionally, there is potential for content-dependent blur perception, where the same amount of blur is perceived differently depending on the type(s) of object(s) shown.

We approach this problem by conducting a perceptual study to understand the relationship between the amount of blur present in an image and the perception of blur at different image sizes. Our study determines how much blur must be present in a downsampled image to have the same appearance as the original. We find a complex and mostly image-independent relationship between matching blur levels in images at different resolutions. The relationship can be explained by a linear model when the blur magnitude is analyzed in terms of spatial frequency. Using the results of this study, we develop a new image resizing operator that amplifies the blur present in the image while downsampling, to ensure it is perceived the same as the original. While our algorithm is compatible with any combination of methods for producing a spatially-variant estimate of image blur and spatially-variant image filtering, we employ the blur estimation scheme presented in Chapter . The result is a fully-automatic method for downsampling images while preserving the appearance of blur, the performance of which we verify with another user study.

. Experiment Design

The basic premise of our work is that the blur in an image is perceived differently when that image is downsampled. In order to create a downsampled image that preserves the appearance of the original image, we must quantify that change in perception. The experiment was intended to measure the amount of blur that needs to be present in a thumbnail image in order to match the appearance of blur in a full-size version of the same image.
This relationship was measured in a blur-matching experiment. Observers were presented with a full-size image, as well as two thumbnail images. They were asked to adjust the amount of blur in both thumbnail images such that the first thumbnail was just noticeably blurrier than the full-size image, and the second thumbnail was just noticeably sharper. An example stimulus is shown in Figure .. We found that this ‘bracketing’ procedure resulted in more accurate measurements than direct matching and was necessary due to the relatively wide range of blur parameters that result in approximately equal appearance. Such a variation of the method of adjustment was used before to measure just-noticeable blur in the context of the depth of focus of the eye [Yi et al., ] and the brightness of the glare illusion [Yoshida et al., ]. The matching blur amount was computed as the mean of the ‘less-’ and ‘more-blurry’ measurements.

An alternative experiment design that would produce more accurate results is the two-alternative-forced-choice procedure, in which observers are asked to select the blurrier/sharper image when presented with the original and the downsampled version, and the amount of blur is randomly added to or removed from the smaller image. Such a procedure, although more accurate, consumes much more time (it is on average – longer) and thus is not effective with a larger group of observers. The objective of our study was to gather data for an ‘average’ observer; thus it was more important to collect a larger number of measurements for a larger population, rather than fewer but more accurate measurements.

Figure .: Screen capture of the stimuli used in the experiment. Subjects adjust the blur in the small images on the right to match the blur in the large image on the left.

Viewing conditions. The images were presented on a ” Dell WFPc display with × resolution. The experiment was run in a dimmed room with no visible display glare. The viewing distance was  m, resulting in a pixel Nyquist frequency of  cycles per visual degree.

Image selection. A pilot study was run to observe how the blur estimates differ between images, and in order to identify a possibly small set of images that would still reflect image-dependent effects. For the pilot experiment we selected  images containing people, faces, animals, man-made objects, indoor and outdoor scenes. The pilot experiment was run with  blur levels and only  observers. The results were averaged for each test condition (blur level × downsampling level) to form a vector value. Then, the Euclidean distance was computed between the vectors for each pair of images, to build a difference matrix. The difference data was then projected onto a 2D space using multi-dimensional scaling [Kruskal and Wish, ] in order to produce the plot in Figure .. The plot reflects image-dependent differences in blur perception for the same blur parameters. To maximize diversity in image content, we selected five images which were located far apart on the plot and thus were likely to be the most different in terms of the produced results.

Figure .: The result of multi-dimensional scaling on the differences between per-image results collected in a pilot study.
Figure .: The result of multi-dimensional scaling on the differences between per-image results collected in a pilot study. The pilot study was intended to identify the representative images that had the highest potential to reveal any image-dependencies in the study. The images selected for use in the main study are shown at the bottom. (Plot axes: scaling dimension 1 versus scaling dimension 2; the points are labelled with the pilot images: Desk, Lab, Factory, Cafe, Teapots, Dolores Park, Coast, Alley, Starfruit, Cheering crowd, Market, Sicily, Rockband, Fireworks, Girls, Break, Subway station, Bike polo, Campus, Cyclists.)

Stimuli. For both the pilot study and the full experiment, differently blurred versions of images were generated by introducing synthetic blur to full-size images with no noticeable blur in them. Since we could not control where in the image users were looking to make their judgements, we introduced uniform blur to completely in-focus images to avoid any ambiguities in response. For this purpose we used a Gaussian kernel of a specified standard deviation ς (in this chapter, we use ς to denote standard deviations of blur kernels expressed in visual degrees, and σ for blur kernels expressed in pixels). Thumbnail images were produced by the same process, except that the convolution of a full-size image was followed by nearest-neighbor resampling. We chose a nearest-neighbor filter for this step in order to not distort the experiments by introducing additional low-pass filtering. However, as a result, some amount of aliasing was present for small blur kernels under large downsampling factors (also see Section .). The reported ς-values are given in visual degrees to make them display-independent. The five selected images were shown at  blur levels, ranging from  to . visual degrees, and at three downsampling factors: 2, 4, and 8.

Observers.  observers ( male and  female) participated in the study. They were paid and unaware of the purpose of the experiment. The observer age varied from  to  with the average . All observers had normal or corrected-to-normal vision.

Experimental procedure. Given a reference image with blur ςr, the observers were asked to adjust the matching blur to be just noticeably stronger in one and just noticeably weaker in the other thumbnail image. Each observer repeated the measurement for each condition three times, but each observer was assigned a random subset of  out of  conditions to reduce workload ( =  downsampling factors ×  blur levels ×  images). In total over , measurements were collected. The experiment was preceded with a training session during which no data were recorded, followed by three main sessions with voluntary breaks between them. The breaks were scheduled so that each session lasted less than  minutes.
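A minimal sketch of the stimulus generation described above, assuming a grayscale image array and a pixels-per-degree factor p for the experimental display; the function names are illustrative.

# Sketch of the stimulus pipeline: uniform Gaussian blur of standard deviation
# sigma_deg (visual degrees) applied to an in-focus image, followed by
# nearest-neighbor resampling for the thumbnail so that no additional low-pass
# filtering is introduced. `p` converts visual degrees to pixels.
import numpy as np
from scipy.ndimage import gaussian_filter

def blurred_full_size(image, sigma_deg, p):
    return gaussian_filter(image, sigma=sigma_deg * p)

def thumbnail(image, sigma_deg, p, d):
    blurred = blurred_full_size(image, sigma_deg, p)
    return blurred[::d, ::d]          # nearest-neighbor downsampling by factor d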
. Experiment Results

The results of the experiment, averaged over the five selected images and for each image separately, are shown in Figure .. The results are very consistent regardless of the image content, but averaging across all images is necessary to reduce variability in the data. Because both ς values are reported with respect to the blur in the full-size image, the y = x line (dashed black line in the plot) is equivalent to retaining the blur of the original image. For all data points the matching blur is larger than the blur in the original images (points above the dashed black line). This is because images look sharper after downsampling and they need to be additionally blurred to match the appearance of full-size images.

Figure .: The results of the blur matching experiments, plotted separately for the averaged data ςm (top-left) and for each individual image. The continuous lines are the expected magnitudes of matching blurs found by computing the average between the two measurements for ‘more blurry’ and ‘less blurry’. The error bars represent a % confidence interval. The edges of the shaded region correspond to the mean measurements for ‘more blurry’ and ‘less blurry’.

The experimental curves also level off at higher downsampling levels and for larger blur amounts. This effect is easy to explain after inspecting actual images, in which the amount of blur is so large that it sufficiently conveys the appearance of the full-size image and no additional blurring is needed. It is important to note that the reported values also include the blurring necessary to remove aliasing artifacts. As mentioned in the previous section, we used a simple nearest-neighbor filter to resample the blurred high-resolution images so that the results are not confounded with an anti-aliasing filter. If the blur was not sufficient to prevent aliasing in the downsampled image, the result appeared sharper than the original. We observed that when no blur was present in the large image, subjects adjusted the amount of blur in the thumbnail to a value close to the optimal low-pass filter for the given downsampling factor.

. Model for Matching Blur Appearance

In this section we introduce a model that can predict our experimental results. The plot curves in Figure . suggest a non-linear relation for matching blur in original and downsampled images. However, we show that the averaged measured data ςm is well explained by the combination of an anti-aliasing filter ςd and a model S, which is linear in spatial frequencies (measured in cycles per visual degree):

ςm = √(ςd² + S²).    (.)

Here ςm is the model prediction of the experimental blur-matching data for an average observer. The term ςd approximates the effect of an ideal anti-aliasing filter. The standard deviation ςd of the Gaussian filter that provides a least-squares fit of the sinc filter is

ςd = (√(3 log 2) / π) · (d / p),    (.)

where d is the downsampling factor, while p is a conversion factor that maps from image units (pixels) to visual degrees, which is equal to the number of pixels per visual degree. In our experiments, we had subjects sit further away from the screen than usual, to prevent limitations in screen resolution from affecting the results. As a result, p = 60 for our experimental data.

To motivate our choice of model, we remove the anti-aliasing component ςd from the experimental data and plot it in terms of spatial frequency 1/ς in Figure .. The plot shows the experimental data expressed as the S component of Equation .. All data points are now well aligned and mostly in a linear relation, except several measurements at high frequencies and for the 2× downsampling factor. We attribute these inaccuracies to the measurement error, which is magnified in this plot because of the f = 1/ς transform. The plot demonstrates that the remaining term S can be modeled as a set of straight lines when expressed in terms of spatial frequencies. Moreover, the lines cross at approximately the same point. The model that provides the best least-squares fit of the experimental data in terms of ς-values is

S(ςr, d) = 1 / ( 2^(−0.893·log₂(d) + 0.197) · (1/ςr − 1.64) + 1.89 ),    (.)

where d is the downsample amount and ςr is the amount of the reference blur in the original image.
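The model can be evaluated directly from these equations; the following sketch transcribes the constants quoted above (ς values in visual degrees) and assumes the natural logarithm in the anti-aliasing term.

# Sketch of the matching-blur model: an anti-aliasing term sigma_d plus the
# component S that is linear in spatial frequency. d is the downsampling factor
# and p the number of pixels per visual degree (60 in the experiment).
import numpy as np

def sigma_d(d, p):
    # Gaussian least-squares fit of an ideal (sinc) anti-aliasing filter.
    return np.sqrt(3.0 * np.log(2.0)) / np.pi * d / p

def S(sigma_r, d):
    # Aliasing-free matching blur for reference blur sigma_r at factor d.
    return 1.0 / (2.0 ** (-0.893 * np.log2(d) + 0.197)
                  * (1.0 / sigma_r - 1.64) + 1.89)

def sigma_m(sigma_r, d, p=60.0):
    # Total matching blur, combining the two terms.
    return np.sqrt(sigma_d(d, p) ** 2 + S(sigma_r, d) ** 2)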
Figure .: Average matching blur data (first panel of Figure .), with the anti-aliasing component ςd removed, replotted as the roll-off frequency. The matching blur follows straight lines, except for the small blur amounts (high-frequency roll-off), where aliasing dominates. The two lowest-value ςr points were omitted from the plot as the values were excessively large due to the 1/ς transform. (Plot axes: full-size image blur cut-off frequency 1/ςr versus downsampled image blur cut-off frequency 1/S, both in cycles per degree, with curves for the ×2, ×4 and ×8 downsampling factors.)

Figure . plots the combined blur model ςm as compared to the results from our experiments. The figure shows that the fitting error is quite acceptable, even for low-ς (high-frequency) points, which did not follow the linear relation in Figure .. When comparing plots, note that the large frequency values correspond to small ς values. While a higher-order function could provide a better fit, our experimental data do not provide enough evidence to justify such a step. Moreover, we believe that a linear model in terms of spatial frequency is more plausible than a higher-order function. Note that the combined model of matching blur ςm is the absolute amount of blur that needs to be present in the full-size image before downsampling, and is expressed in units of visual degrees. In Section . we explain how to compute the amount of blur that needs to be added to a downsampled image.

Figure .: Blur model ςm (dashed lines) compared with the experiment results ςm (continuous lines and error bars). (Plot axes: full-size image blur radius ςr versus downsampled image blur radius ςm, both in visual degrees, with data and model curves for the ×2, ×4 and ×8 downsampling factors.)

. Perceptually Accurate Blur Synthesis

The goal of our algorithm is to use the results of our experiment to automatically produce a downsampled image that preserves the appearance of the original blur. We first compute a spatially-varying estimate of the amount of blur present in the full-size image. Given that estimate, we use the results of our study to determine how much additional blur is needed for the specified downsample. Finally, we synthesize a new downsampled image with the amount of blur required to preserve the appearance of the image.

Our overall approach can work with any method that provides a spatially-variant estimate of image blur. We considered the method by Elder and Zucker [], but it only produces estimates at edge locations and requires the work of Bae and Durand [] to provide a robust estimate of the blur at all pixels. While the approach produces high-quality results, it operates at the resolution of the original image and is computationally intensive. Instead, we rely on the algorithm presented in Section . because of its simplicity and computational efficiency, which is a result of performing most work at the resolution of the downsampled image. The method provides blur estimates at each pixel location in terms of the Gaussian kernel σ before synthesizing the final image using our model of perceived blur. To synthesize the final image, we employ the same process as Section ., replacing the model of camera optics with the perceptual model derived in Section ..

While our method recovers a blur map for the image, it is worth noting that this blur map does not necessarily represent the defocus or depth of pixels. It can better be understood as a map of relative gradient magnitude per image region. While rapidly changing derivatives correspond to sharp regions, there is an ambiguity with slowly changing derivatives. If an image region is a flat color, we cannot determine whether it is a detailed region that is out of focus or it is in focus but lacks any detail. Both situations are equivalent from the viewpoint of this algorithm.
With an accurate estimate of the blur present at each pixel of the large image, we use our model from Section . to compute the amount of blur desired in the downsampled image. To produce the appearance-matching image, we reduce its resolution by downsampling it by a factor of d using the standard technique with an anti-aliasing filter. Because the anti-aliasing is now accounted for, we use the aliasing-free component of the model S(ςr, d) from Equation ., rather than the complete model ςm. Given the existing blur in the full-size image σo, the amount of blur that needs to be added to a downsampled image is expressed as

σa = √( (S(σo·p⁻¹, d)·p / d)² − σo² ).    (.)

The downsampling factor d reduces the blur amount as we work on a lower-resolution downsampled image. The conversion factor p, which is equal to the number of pixels per one visual degree, converts visual degrees used in the model to pixels used in the blur estimation. For a computer monitor seen from a typical distance, p is approximately 30 pix/deg.

To produce the final image, for each level of the scale-space σj we blur the downsampled image by the corresponding amount of additional σa, then linearly blend sequential pairs of those blurred images together to approximate non-integer values of σa. While more accurate spatially-variant blur synthesis is possible, such as that of Popkin et al. [], we haven't noticed any artifacts requiring such methods.
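A small sketch of this conversion, reusing S from the model sketch above; the clamp to zero is an added safeguard for locations where the estimated blur already exceeds the model prediction, and is not part of the printed equation.

# Sketch of converting the per-pixel blur estimate of the full-size image
# (sigma_o, in full-size pixels) into the extra blur sigma_a to synthesize in
# the thumbnail, following the equation above. S() is the aliasing-free model
# component from the earlier sketch.
import numpy as np

def added_blur(sigma_o, d, p=30.0):
    target = S(sigma_o / p, d) * p / d          # model prediction, in thumbnail pixels
    return np.sqrt(np.maximum(target ** 2 - sigma_o ** 2, 0.0))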
Figure .: Comparison of a conventional downsample and our method. The bottom row contains cropped portions of the image at the original resolution (see pink boxes in the conventional thumbnail). Note that the blur present in the eye of the robot sculpture and the cardboard box is visible in our result, but appears sharp in the conventional thumbnail.

. Evaluation

In this section, we provide results of our method and compare our approach to that of Samadani et al. []. We encourage the reader to look at the electronic versions of the images, which represent the fine details better than prints.

Figures . and . compare the results of our algorithm to those of a conventional downsampling method of low-pass filtering the image followed by nearest-neighbor sampling. In both the example of the robot sculpture and the art supplies, objects that appear in focus (such as the head of the robot or the cardboard box) in the conventionally-downsampled image are in fact blurry, as can be seen in the zoomed portions. Our method accurately detects this blur and preserves the appearance in the downsampled image.

Figure .: Comparison of a conventional downsample and our method. The bottom row contains cropped portions of the image at the original resolution (see pink boxes in the conventional thumbnail).

Figure . demonstrates the effectiveness of our algorithm at preserving the appearance of blur in images across multiple downsample factors. In this image, the original images are downsampled by a factor of  and , and the smaller versions retain the same impression of the depth of field.

Figure . compares the results of our method to those of the original method of Samadani et al. []. If the value of γ is manually chosen for the image, their method can approximate our own. However, if the value of γ is incorrectly chosen, their method will either introduce too much blur and remove detail from the branches in the upper left, or not introduce enough blur and retain all the details in the flowers. Even with a correctly chosen value of γ, their method can only linearly scale the amount of blur, and cannot model the more complex relationship between existing blur and desired blur observed in the user study.

Additionally, we conducted a second user study to verify the effectiveness of our method. Previously, Samadani et al. [] performed a preference study to determine whether subjects felt their method was more representative of the original image than a conventional thumbnail. This study showed that users did prefer the method of Samadani et al. over standard thumbnails. We instead chose to conduct a task-based survey to determine the extent to which our method improves users' ability to make accurate comparisons of how objects in a scene are blurred.

Figure .: Comparison of the appearance of blur at multiple downsample levels (panels, for each of the two examples: original, 2× normal, 4× normal, 4× blur-aware, 2× blur-aware). All of our results retain roughly the same amount of blur as the original, while the conventionally downsampled versions appear to get progressively sharper.

In this 2-alternative-forced-choice study, we photographed a series of objects with increasing amounts of defocus blur. Thumbnail versions of these images were created using both our algorithm and the conventional downsample process to downsample by a factor of . Subjects were shown pairs of images with different amounts of defocus blur and asked to specify in which of the thumbnails the object appeared sharper. Figure . contains an example stimulus. A total of  observers participated in the study, performing a total of  trials for each of the downsampling algorithms. Overall, subjects correctly identified the sharper object % of the time when viewing conventional thumbnails, while they correctly identified the sharper object % of the time when viewing the results of our method. Our method outperforms conventional downsampling when the blur is small enough that the object will appear sharp in a standard thumbnail and blurred in our result. However, both methods exhibit similar performance if the blur is small enough for the object to appear sharp at the original resolution, and thus in both thumbnail versions. Likewise, the performance of the two methods will be the same if the blur is large enough that the object will appear blurred in both thumbnails.

Figure .: Comparison of naive downsampling and our method (top row) to Samadani et al. with too little blur (γ = .5) and too much blur (γ = 4) from incorrect choices of γ. The bottom row contains a cropped region of the original image (see pink boxes) for comparison. Original image copyright Ramin Samadani.

Figure .: A screenshot of the verification study containing a pair of thumbnails with different amounts of defocus blur. Subjects had to choose which image contained the object more in focus.
This model is based on a linear relationship between the perceived blur magnitude and the blur present in the image, when analyzed in terms of spatial frequency. We have used that model to create a new image-resizing operator that preserves the perception of blur in images as they are downsampled, ensuring that the new image appears the same as the original. To do so, we modi ed an existing blur estimation algorithm by Samadani et al. [] to provide estimates of the original image in absolute units. More generally, any form of downsampling involves discarding information present in the image. The convention in graphics and image processing is to attempt to produce the highest quality result, which usually involves throwing away higher frequency detail to avoid any aliasing artifacts. Due to the disparity between sensor resolution and display resolution, users often view images and make image assessments based on lower-resolution versions that might not represent their full-size counterparts. We have proposed an approach that considers how the image is perceived, and preserves that appearance rather than producing a higher-quality but less   representative result.    Chapter   Scale-Dependent Perception of Countershading The previous chapter discussed how the perception of image blur changes with the size at which the image is presented. This chapter investigates similar changes in the appearance of edge pro les of different scales. Countershading is a double-edged sword: while correctly chosen parameters for a given viewing condition can signi cantly improve the image sharpness or trick the human visual system into perceiving a higher contrast than physically present in an image, wrong parameters, or different viewing conditions can result in objectionable halo artifacts. In this chapter we analyze the circumstances under which countershading turns from image enhancement to artifact for a range of parameter settings and viewing conditions. Our experimental results can be modeled as a function of the width of the countershading pro le. We employ this empirical function in a range of applications such as image resizing, view dependent tone mapping, and countershading analysis in photographs and works of ne art.  . Introduction Local contrast enhancement is a powerful image processing technique, fundamental to many aspects of computer graphics such as image editing and tonemapping of HDR images. The same basic techniques can be applied to improve the recognition    of objects in a scene, to aid in identifying brightness of regions or to accentuate speci c image details. In many cases, images with enhanced contrast appear more aesthetically pleasing. One of the most common approaches to enhancing local contrast in images is countershading, where the local edge contrast is increased by adding gradients to either side of the edges. This approach is common across numerous classes of algorithms. Many of these algorithms, either explicitly or implicitly, resemble the effect of the unsharp masking (UM) operator. This simple technique has proven to be incredibly versatile and, depending on the choice of blur employed in the high-pass lter, can produce a variety of effects. Unsharp masking with a narrow high-pass lter can increase the acutance, also known as apparent sharpness, of the image, making ne details easier to identify. 
On the other hand, unsharp masking with a wide high-pass filter can increase the contrast of the regions adjacent to the edge, altering the overall impression of contrast in the image. Sufficiently wide unsharp masking profiles can even induce the Cornsweet illusion, where the entirety of the adjacent regions changes in appearance. However, unsharp masking can also introduce objectionable countershading around an edge, frequently referred to as haloes. In these cases, the contrast enhancement detracts from the image, providing neither improved understanding nor aesthetic quality.

We desire a better understanding of what causes local contrast enhancement to be considered objectionable. Our inquiry begins from the simple observation that the same basic operation of unsharp masking, and countershading in general, can lead to multiple, disagreeing interpretations of its effect on the image. In some cases countershading is interpreted as an enhancement, while in others it is interpreted as an artifact. We investigate the acceptable contrast for countershading profiles of different widths and conduct a perceptual study to determine the amount of countershading that can be introduced at different scales without it becoming objectionable. We discover an “uncanny valley” of countershading profiles, where profiles of certain widths are considered unacceptable even if only slightly visible, separating adjacent regions of both wider and narrower countershading profiles with considerably higher levels of tolerated contrast.

Figure . illustrates how the regions of indistinguishable and objectionable countershading vary with the width of the countershading profile and its amplitude. The important observation is that countershading that is indistinguishable from a plain edge provides only very limited contrast enhancement; thus the region below objectionable countershading needs to be used to achieve good-quality results. We find that these regions correspond to the various semantic descriptions of the images resulting from different unsharp masking parameters. The “valley” consists of profiles considered haloes, while narrower profiles were perceived to sharpen image features and wider profiles were perceived to enhance contrast.

Figure .: A square wave enhanced by countershading profiles. The regions of indistinguishable (from a step edge) and objectionable countershading are marked with dotted and dashed lines of different colors. A higher magnitude of countershading produces higher-contrast edges, but if it is too high, the result appears objectionable. The marked regions are approximate and for illustration; the actual regions will depend on the angular resolution of the figure. (Plot axes: spatial frequency versus countershading magnitude.)

The remainder of the chapter is organized as follows. In the next section, we discuss unsharp masking and various applications of countershading in graphics and image processing. Section . describes the design of the perceptual experiment we conducted, while Section . discusses its results. In Section ., we relate our measurements to other studies on the perception of countershading profiles. Finally, we demonstrate the applicability of our results and the understanding they afford with a number of applications, including size-aware image resizing, control of tonemapping parameters, artifact-free unsharp masking, detection of countershading in images, and a viewer-adaptive display.
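For reference in what follows, here is a minimal sketch of the unsharp masking operation with its two parameters, the width σ and magnitude λ; it is the generic operator, not the exact enhancement pipeline used later in the experiment.

# Minimal sketch of unsharp masking: a Gaussian high-pass of width sigma,
# scaled by a magnitude lam and added back to the image. A narrow sigma
# sharpens fine detail; a wide sigma countershades the adjacent regions.
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(Y, sigma, lam):
    highpass = Y - gaussian_filter(Y, sigma)
    return Y + lam * highpass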
. Experiment Design

In our perceptual experiment, we target the most general case of countershading operations: determining the magnitude at which a countershading profile of a given width becomes objectionable. In designing our study, we take inspiration from work by Ciuffreda et al. [] on determining the level of “bothersome blur.” While the parameters σ and λ responsible for the shape of the countershading profile vary smoothly over the space of possible values, a semantic shift from “enhancement” to “artifact” occurs along a boundary within that space. Our goal is to determine the boundary between the region of enhancements and the region of artifacts within the parameter space of the unsharp masking operator. Our study does not attempt to determine any of the other aspects of local contrast perception such as local contrast appearance, detection thresholds of countershading, or user preference.

View setup. The images were presented on a ″ NEC LCDWUXi display with × resolution, with a black level of 0.45 cd/m2 and a peak intensity of 213 cd/m2. The experiment was run in a darkened room with no visible display glare. The viewing distance was 1 m, resulting in a pixel Nyquist frequency of  cycles per visual degree.

Stimuli. The study consisted of six images: three test patterns of a single step edge of various contrasts, and three complex scenes, as shown in Figure .. The contrast of the images of complex scenes was linearly scaled down to retain headroom for the countershading without saturating pixels. The process of adding the countershading enhancement to an image is illustrated in Figure .. Given a linear luminance image, the countershading is applied in the logarithmic domain to produce the profiles with the most symmetric appearance of lightness. To produce a high-pass image for the enhancement, a Gaussian-filtered image is subtracted from an original. However, the enhancement is computed based on an edge template, as opposed to using the original image. The template image contains the edges to be enhanced with smooth regions in between, and can be produced by an edge-preserving filter such as that of Farbman et al. []. Using an edge template ensures a constant increase in contrast along the edge and, more importantly, avoids the amplification of high-frequency detail noted by Neycenssac [], which could distract subjects from evaluating the appearance of the countershading profiles. Ten countershading profile widths were used in the trials, ranging from . to . visual degrees, equivalent to 0.5–256 px at the 1 m viewing distance, increasing by factors of two.

Figure .: Images used in the perceptual experiment:  edges of different contrast and  images of complex scenes: Edge - high (0.59), Edge - med (0.32), Edge - low (0.041), Palm beach (0.68), Coast (0.61), Building (0.39). The semi-transparent red-green color mask is the edge template. The numbers in parentheses denote the Michelson contrast of the edge.

Observers.  observers ( male and  female) participated in the study. They were paid and unaware of the purpose of the experiment. The observer age varied from  to  with the average . All observers had normal or corrected-to-normal vision.

Figure .: Generation of the countershading enhancement in an image.
An edge template is used to precisely mark selected edges in the experiment, but is normally generated with an edge-preserving lter. Procedure. After being presented an image with a countershading pro le of width  σ , the observers were asked to adjust the magnitude λ of the countershading to the maximum level not considered an artifact. Both “artifact” and “objectionable” are subjective terms, and we relied on a no-reference measure of the artifacts, where if the subject saw the image without knowing the original, they would say that it contained undesirable countershading. Each observer repeated the measurement for each of the  conditions ( images ×  pro le widths) twice, for a total of  trials each. In total over  measurements were collected. The experiment was preceded with a training session to familiarize participants with the task. No data were recorded during the training phase, which was followed by three main sessions with voluntary breaks between them. The breaks were scheduled so that each session lasted less than  minutes. Screening and outlier removal. Because of the subjective nature of the experiment, erroneous and inconsistent measurements are likely to be found in the data. To remove   erroneous measurements, we rstly screened the participants using the procedure recommended for magnitude estimation experiments ITU-R-BT.- [, Sec. ..], then we eliminated those measurements for which the intra-observer standard deviation exceeded two times the mean standard deviation of all the measurements. The screening eliminated the data of two participants and the outlier removal removed % of the measurements.  . Experimental Results To illustrate individual variations, the data for individual observers is shown in Figure .. The main inter-observer variation is visible as the vertical shift of curves on the plot, suggesting that different individuals have different notions of ‘justobjectionable’ countershading. The data becomes more consistent once the differences in the mean values are compensated, as shown on the right of Figure .. For the remaining considerations we use the data averaged over all observers. But the strong inter-observer variation indicates that the ‘objectionable’ level is subjective and the algorithms may choose to include a user-de ned scaling parameter for our mean-observer data. Figure . shows the results averaged over all observers.The most salient feature of all plots is the U-shaped characteristic, indicating a reduced tolerance to the halo effect for the medium pro le widths, with the trough around . visual degrees. One of the most interesting observations is the difference in the characteristic between an isolated-edge and complex images, which is shown as the difference between the plots in Figures .a and .b. At large countershading widths a larger magnitude of enhancement was selected for a complex image than for an isolated edge. This could be due to visual masking, which was present in complex images, but not in the case of isolated edges. The difference between the two plots suggests that the measurements for simpli ed stimuli do not generalize to complex images. The results for individual images are better aligned if the countershading pro le is generated irrespectively of the contrast of the underlying edge. The λ -values on the plots represent the magnitude of the countershading generated from an edge template (refer to Figure .) with a xed log10 -contrast of , rather than the contrast of an edge. 
This observation suggests that profiles of the same magnitude should be used regardless of the contrast of the underlying edge. The remaining variation between images can be explained by the small effect of the underlying edge contrast. The effect is mostly visible for complex images and narrow countershading profiles. The data, however, is not sufficiently accurate to model this effect. Moreover, the effect disappears for large σ-values, which are the most relevant for an effective contrast enhancement. For convenience of use, we fit a polynomial function with a linear segment to the values averaged across all complex images. The λ-values for the just-objectionable countershading can be found from:

λ = −.249ς³ − .233ς² + .377ς + .674   if ς ≤ .418
λ = .048ς + .752                       if ς > .418    (.)

where ς = log₁₀(σ). The model fit is shown as a black continuous line in Figure .b.

Figure .: Left: Results for individual participants for the Palm beach image. Right: The same results after outlier removal and compensation for the mean λ-value between the participants. The standard error of the mean is plotted for a single participant for better clarity (in orange). (Plot axes: profile width σ [log₁₀ deg] versus profile magnitude λ; left panel: collected data, right panel: outliers removed with a mean-λ shift.)

Figure .: The results averaged over all participants, (a) for isolated edges, and (b) for complex images. The black line in (b) represents our model fit. (c) Our just-objectionable measurements (edges) compared to the just-distinguishable thresholds from Krawczyk et al. [] (lines with no markers). A log₁₀ scale was used for λ in this plot for better visualization. (Plot axes: profile width σ [log₁₀ deg] versus profile magnitude λ; legends: Edge - high, Edge - med, Edge - low and the scallop threshold in (a); Coast, Palm beach, Building, the model fit and the scallop threshold in (b).)
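A direct transcription of this fit; the constants are taken exactly as printed above, and σ is the profile width in visual degrees.

# Sketch of the just-objectionable countershading magnitude as a function of
# profile width sigma (visual degrees), transcribing the piecewise fit above.
import numpy as np

def lambda_objectionable(sigma_deg):
    s = np.log10(sigma_deg)
    if s <= 0.418:
        return -0.249 * s**3 - 0.233 * s**2 + 0.377 * s + 0.674
    return 0.048 * s + 0.752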
Very low scalloping thresholds clearly show that the countershading is likely to be noticeable for most practical cases; and thus the just-objectionable threshold is more relevant for contrast enhancement applications. For a more complete discussion of scalloping, see Kingdom and Moulden [] for a comprehensive review. Lin et al. [] studied the perceptual impact of edge sharpness. In their study, they processed images using an unsharp masking lter with a xed σ = 1.4 px at edge locations only to avoid noise ampli cation. This width equates to .04 spatial degrees in their viewing setup, roughly corresponding to the third of our tested widths. The lter was applied to all edges in the image, and the amount of countershading introduced was proportional to the underlying edge contrast. Subjects ranked the perceived quality of images for contrasts of different magnitudes and Lin et al. computed the most desirable and highest tolerated contrast. Without knowing the contrast of the original image edges, we cannot compute the equivalent λ to plot a direct comparison. Based on the theoretical model of countershading perception by Dooley and Green eld [], Krawczyk et al. [] proposed a visual model of just-detectable countershading, which was used to adaptively introduce countershading in images. The main assumption behind their algorithm is that the countershading becomes objectionable as soon as the pro les become visible. The comparison of our data with the scalloping thresholds in Figures .a and .b demonstrates that it is not the case. We reproduced their model and computed its predictions for the isolated edges from our experiment, assuming the t.v.i. value equal to % and no masking. The model predictions, plotted in Figure .c, show little correlation with our experiment results and seem to be too conservative even for the scalloping threshold   reported in the literature. Ihrke et al. [] performed a perceptual evaluation of the work on D unsharp masking by Ritschel et al. []. The authors tested the preferred value of their magnitude parameter λ for pro les of several different widths, measured on object surfaces, opposed to image-space. Interestingly, they did not nd their results varied with the width of the countershading pro les. However, they only tested images of complex scenes with a narrow set of pro les (corresponding to the trough of the valley we nd). Additionally, the depth-dependency of their algorithm means that pro les of varying widths were added to their scenes, making it hard to ascertain the magnitude level associated with a single countershading width. Their largest σ was the only width large enough to cause the Cornsweet illusion. Similar to Lin et al., it is difficult for us to plot a direct comparison to our ndings because λ and σ vary across the image.  . Applications We have implemented a number of simple applications of our model of objectionable countershading for use in adjusting image scale and contrast, HDR tone mapping, artifact-free unsharp masking, and countershading pro le analysis. In the rest of this section, we brie y describe these methods and their results. The purpose of these examples is to demonstrate the breadth of topics for which our measurements are relevant. Any of these tools could be made more sophisticated, but are sufficient to demonstrate the associated approach. As noted in Section ., the perception of countershading pro les strongly depends on the spatial frequency at which they appear. 
All the images assume viewing of this thesis as printed on letter-size paper ( .”×” / .×. cm) page viewed from a distance of ” (. cm). When evaluating the images, please ensure you are viewing under similar conditions. Additionally, some of the effects are subtle and we suggest looking at the electronic copy of the document.  ..  Avoiding Haloes When Resizing  As shown in the Section ., increasing the magnitude of a countershading pro le can not only boost the perceived edge contrast, but also move the appearance into   the objectionable region. Similarly, operations that change the width of the pro le can cause the same issue.  b  a c  d  Figure .: Downsizing an image (a) transforms a .° countershading pro le to a smaller one, where the threshold of objectionable magnitude is signi cantly reduced, causing objectionable haloes. These artifacts can be remedied by adjusting the magnitude of the countershading (b) to correspond to the new angular size of the pro les or adjusting the width of the countershading (c) to compensate for the downsample factor, or some combination thereof. Neither image in this simple case appears exactly the same as the full-size, but they do not include objectionable artifacts. The same issue occurs in the opposite case, when a sharpened image is enlarged (d). Resizing an image with acceptable countershaded edges can cause the pro les around those edges to move from the acceptable to objectionable region, as shown in Figure .. Figure .a shows an edge with .° countershading pro les near the just-acceptable contrast. Shrinking that image by a factor of × causes those pro les to move into the objectionable region and appear as haloes. In order to regain acceptable pro les, the current combination of pro le contrast and width must be projected back outside the objectionable region. Fig .b and .c show the result   of projecting along only the contrast axis or only the width axis, respectively. The resulting contrast appearance depends on the projection, where primarily adjusting contrast reduces the overall image contrast enhancement, while primarily adjusting width approximates a global contrast adjustment. Algorithms can choose how to adjust those parameters based on desired image appearance. The same issue also must be considered when enlarging images. Contrasts acceptable for very narrow pro les will become objectionable as they are widened, as shown in Figure .d. While less frequent, this scenario can occur if a user is adjusting the contrast of a sharpening lter in a preview of less than 100% of the full image size.  ..  Local Tonemapping Operators  The same consideration to the scale of countershading from the observer’s perspective can be applied to other operations. Local tonemapping is frequently associated with introducing objectionable haloes. In some cases, poor algorithm performance, regardless of parameters, is responsible for the artifacts. In others, it may be a mismatch between the scale for which the image was produced and the scale at which it is being displayed. We use the Office image from Durand and Dorsey [], which is × px in size. Their algorithm calls for the σs of the bilateral lter to be 2% of the image size, in this case 26 px. Viewing the image -to- pixels on a ” × display from a distance of ”, σs is equivalent to .°. We observe the artifacts to be acceptably low under these conditions. 
However, resizing that image to t within the margins of this document at our ” viewing distance, the same σs is equivalent to .°, a spatial frequency at which we are considerably more sensitive to haloes. The contrast of the countershading pro les depends on the interaction of several algorithm parameters and is difficult to specify. On the other hand, the width of the pro les only depends on the choice σs in the bilateral lter, so we project back out of the objectionable region along the pro le width axis. After xing the σs to be equivalent to .° ( px in this case) the haloes are below the objectionable threshold for the image in this thesis, as can be seen in Figure .. We have not entirely removed the haloes from the image, as can be seen when viewing the image from a large distance. We have simply ensured the pro le   (a)  (b) Figure .: A comparison between (a) the σs speci ed in Durand and Dorsey [] and (b) our σs chosen for the size of the image in this document. The contrast of the countershading has not changed, the pro les have been sufficiently widened so the contrast is no longer objectionable. Office image copyright Fredo Durand. contrasts are below the objectionable threshold for the given viewing conditions. This approach provides an alternative to that of Farbman et al. [] when considering situations involving small displays or requiring computational efficiency, such as mobile devices.    ..  Artifact-Free Unsharp Masking  We can also use the insight from our experimental procedure to develop improved unsharp masking that does not introduce artifacts. Neycenssac [] noted that unsharp masking introduces contrast through  separate mechanisms: the addition of countershading pro les around edges and the ampli cation of existing image features. An ideal countershading operator should introduce acceptable magnitudes of countershading at edges without introducing other artifacts. While the results derived from our experiment provide a model of acceptable countershading magnitudes, we still must address the ampli cation of image features. A naïve unsharp masking operation ampli es small features, especially noise, as visible in Figure .c. Lindeberg [a] notes that convolution with a Gaussian of √ width σ will remove all the features smaller than σ , so those features, including noise, will be present in the high-pass image Hσ (Y ) at their original contrast. When √ the high-pass image is added to the original image, only features larger than σ √ receive countershading pro les, while features smaller than σ are ampli ed. To achieve effective countershading, rather than detail ampli cation, it is necessary to suppress unwanted details in the high-pass image. Many attempts have been made to adaptively scale the unsharp masking parameters λ and σ based on local image content to avoid the unwanted features present in Hσ (Y ) from being introduced into the nal image. Taking inspiration from the edge templates employed our experiment, we consider a different approach: if small features are ampli ed rather than countershaded, they should not be present in the high-pass image. That way the unsharp masking operator does not need to remove them after the fact. 
We replace the conventional high-pass image Hσ (Y ), based on the difference between the image and a Gaussian-blurred copy gσ Hσ (Y ) = Y − gσ ∗Y  (.)  with a modi ed version Hσ′ (Y ) based on a template function Tσ that removes fea√ tures smaller than the σ from the image while retaining high-frequency edges: Hσ′ (Y ) = Tσ (Y ) − gσ ∗ Tσ (Y ).    (.)  a: original Y  b: template Tσ (Y )  c: naïve high-pass Hσ (Y )  d: template high-pass Hσ′ (Y )  e: naïve countershade  f: template countershade  Figure .: Comparison of the original image Y with the result of the  lter Tσ (Y ) for the generated high-pass image and countershaded result. The naïve high-pass image retains the noise of the original and ampli es it in the result, while the template version does not. In this case the countershading magnitude has been chosen to make the difference between operators easily visible. Edge-preserving smoothing lters, especially the  framework of Farbman et al. [], provide a very good approximation of Tσ . In the case of Farbman et al., the parameters must be calibrated such that the frequency response of the lter corresponds to that of gσ used by H . The frequency response of a Gaussian lter of width σ is another Gaussian of width 1/σ : −  Gσ (ω ) = e   ω2 2/σ 2  (.)  From Farbman et al., the frequency response of  for a region without significant edges is Fγ ,α (ω ) =  1 ( )−1 , 1 + γω 2 |∂ ℓ|α + ε  (.)  where γ controls the spatial extent of the function, α is the edge-stopping parameter,  ∂ ℓ is the average partial difference of the log-luminance of the input image and ε is a small term to avoid division by zero. We choose Farbman et al.’s value of α = 1.2 and an average pixel difference of .04 to approximate the gradient magnitude of at regions of images in the range [0, 1] and solve the γ that minimizes the least-squares difference between Gσ and Fγ . For σ = 1, the equivalent value of γ1 = 0.027, given our choice of parameters. From Farbman et al., values equivalent to other σ are determined by γσ = σ 2 γ1 . √ The corresponding  lter Fλ will remove features smaller than σ while preserving edges, the desired behavior of our template function Tσ . Figure . compares conventional unsharp masking to the template image approach, which successfully removes small details from the high-pass image. Finally, we use our model from Section . to ensure the added countershading is acceptable. This operator can be viewed as the complement to using the  lter for multiscale tone manipulation. Because  does not smooth across edges, the differences in contrast between scales only occur in smoothly-changing regions and  tone manipulation predominantly adjusts the contrast of regions without sharp edges. Conversely, our operator predominantly adjusts the contrast of regions around sharp edges.  ..  Countershading Analysis  Many images already contain countershading, including some works of ne art such as Seurat’s Le Bec du Hoc, shown in Figure .. In fact, countershading originated in the ne arts, where artists overcame the limitations of the medium and gave images the appearance of higher contrast than they could otherwise convey. When this painting is seen at a different visual resolution, for example when reproduced on a web-page, the appearance of countershading may be very different from the    Figure .: Comparison of Seurat’s original Le Bec du Hoc (left) with our adjusted version (right). 
..  Countershading Analysis

Many images already contain countershading, including some works of fine art such as Seurat's Le Bec du Hoc, shown in Figure .. In fact, countershading originated in the fine arts, where artists overcame the limitations of the medium and gave images the appearance of higher contrast than they could otherwise convey. When this painting is seen at a different visual resolution, for example when reproduced on a web page, the appearance of countershading may be very different from the artist's intent. However, it is possible to compensate for the difference in viewing conditions by analyzing the countershading present and then reintroducing adjusted countershading profiles.

Figure .: Comparison of Seurat's original Le Bec du Hoc (left) with our adjusted version (right). We performed a multi-scale decomposition of the painting (left) and estimated the countershading profiles on each side of the edges separating the land, sea, and sky portions (bottom), then demonstrate the difference by synthesizing the new image (right) with those profiles removed.

To accomplish this task we use an extreme case of image abstraction followed by an analysis-by-synthesis estimation of the countershading profiles. We observe that if a single iteration of an edge-preserving filter removes texture details, multiple iterations remove any low-amplitude intensity changes, including the countershading profiles. We employ the WLS framework of Farbman et al. [], using the iterative version of their algorithm to obtain a texture-free layer Ds, and then repeat several more iterations to obtain a countershading-free layer Dt. The layer Dt consists of nearly uniform regions of color separated by sharp edges, like the template images described earlier. We then segment the template image and solve for the Gaussian centered at each edge position that best approximates the countershading present: Dt − Ds. Estimation is performed independently for each side of the edge to account for asymmetries in the profiles. The result is an approximation of the profile width and magnitude at each edge location, shown in Figure . (right). Given this representation, we are able to remove existing countershading and synthesize a new set of profiles for the desired conditions. This approach not only provides a better means to reproduce artwork containing countershading, but also may afford a deeper insight into how it is used artistically.

..  Viewer-Adaptive Display

The underlying theme of these applications is that the perception of countershading strongly depends on the width of the profile from the point of view of the observer. Algorithms can easily account for the dimensions and resolution of the display, but fail to account for the fact that the perceived profile width also depends on the distance between the observer and the display. We created a setup with a viewer-adaptive display, which determines the distance of a viewer from the screen using head-tracking, and then adjusts the countershading profiles accordingly (details in the video at http://matttrent.com/research/thesis). The goal is to maximize contrast enhancement without causing haloes to appear, especially at larger viewing distances (refer to Figure .). We can achieve the adjustment in two different ways: either by keeping the width of the profiles constant on the screen and adjusting the magnitude of the distortion, or by changing the width of the profiles on the screen so that their angular size from the viewer's point stays the same. We found the first approach to be less disruptive to image content, while the second provides stronger contrast enhancement.

. Conclusion

In this study we measured the conditions under which countershading profiles are perceived as objectionable. We found a strong effect of the width and magnitude of the profile and a much weaker effect of the underlying image content. Unlike previous studies on the detection of countershading profiles (Sullivan and Georgeson [Sullivan and Georgeson, ], and Krawczyk et al.
[Krawczyk et al., ]) or their matching contrast (Dooley and Green eld [Dooley and Green eld, ]), the focus of our work is the aesthetics of countershading. In particular, the perceived quality is strongly affected by changes in pro le width, including those resulting from changes in viewing conditions such as distance. Image downsampling also affects pro le width and may easily convert acceptable pro les into objectionable ones, implying this observation is very relevant to a large group of image enhancement algorithms. It is our suspicion that a number of algorithms said to introduce haloes may in fact produce acceptable results, but suffer from the disparity in size between authors’ monitors and their printed reproductions. We have shown several applications where our model, combined with edgepreserving smoothing, can be used to improve upon existing countershading approaches, as well as enable some new possibilities. Improvements to existing methods include more accurate resizing of countershading pro les, halo-free local tonemapping, and a new unsharp masking operator that avoids issues of noise ampli cation. We also present a means of estimating and modifying countershading in existing images, including ne art.    Chapter   Defocus Techniques for Camera Dynamic Range Expansion Chapters  and  presented perceptual models relating the appearance of blur and contrast. In this chapter, we investigate the use of optical blur to capture a greater contrast range with conventional sensors. Defocus imaging techniques, involving the capture and reconstruction of purposely out-of-focus images, have recently become feasible due to advances in deconvolution methods. This chapter evaluates the feasibility of defocus imaging as a means of increasing the effective dynamic range of conventional image sensors. Blurring operations spread the energy of each pixel over the surrounding neighborhood; bright regions transfer energy to nearby dark regions, reducing dynamic range. However, there is a trade-off between image quality and dynamic range inherent in all conventional sensors. The approach involves optically blurring the captured image by turning the lens out of focus, modifying that blurred image with a lter inserted into the optical path, then recovering the desired image by deconvolution. We analyze the properties of the setup to determine whether any combination can produce a dynamic range reduction with acceptable image quality. Our analysis considers both properties of the lter to measure local contrast reduction, as well as the distribution of image intensity at different scales as a measure of global contrast reduction. Our results show that while combining state-of-the-art aperture lters and deconvolution   methods can reduce the dynamic range of the defocused image, providing higher image quality than previous methods, rarely does the loss in image delity justify the improvements in dynamic range.  . Introduction The fact that the range of luminances found in the real world greatly exceeds the capabilities of imaging sensors is a fundamental problem encountered during the acquisition of digital images. Real world scenes contain values brighter and darker than the range that can be captured at any one time by conventional image sensors, and as a result, over-exposed and under-exposed pixels commonly occur in photographs. Conventional image sensors cannot match the dynamic range of the scene, and can only capture a subset of the luminances present. 
Although specialized high dynamic range image sensors can capture the range of luminances found in most real world scenes, they suffer from lower signal-to-noise ratios (SNR) or slower read-out speeds [Yang et al., ]. Photographers have contended with this problem since the advent of photography, and the most common solution is the concept of exposure to control the amount of light that falls on the sensor. While controlling the amount of light reaching the sensor by adjusting the aperture and the exposure time, photographers are able to select which subset of the scene luminances they wish to capture without undesirably over- or under-exposing the image. However, adjusting the exposure does not improve the limited dynamic range that can be acquired. The subset of luminances that can be accurately captured can be thought of as a slice through the entire range of luminances found in the scene. Adjusting the exposure can move the slice up and down the range of scene luminances, and pixels with luminances above the top of the slice are recorded as white, while pixels with luminances below the bottom of the slice are recorded as black. A correctly chosen exposure can minimize the number of over- and under-exposed pixels, resulting in a properly exposed image, but it is the dynamic range of the sensor that controls the width of the slice. The fact remains that if the dynamic range of the scene exceeds that of the sensor, some pixels will not be recorded accurately. Most existing techniques capture multiple slices of the luminance range and com-    bine them into a single image representing a wider slice, and all of these methods require some tradeoff to extend the width of the slice. Multi-exposure high dynamic range reconstruction [Debevec and Malik, ; Robertson et al., ] takes a sequence of slices distributed in time, trading off temporal resolution for a larger dynamic range. Similarly, placing an array of different neutral density lters onto the sensor [Nayar and Mitsunaga, ] can trade spatial resolution for a wider slice of the dynamic range. The best option is to develop new sensor technology [AcostaSera ni et al., ; IMS Chips, ; Yang et al., ] that is directly capable of capturing a wider slice of the dynamic range, but these sensors are still some way off from commercial availability and currently suffer from resolution and quality issues.  . Overview The majority of the existing methods attempt to expand the dynamic range of the sensor to match a xed range of real scene luminances incident upon the sensor. We investigate the opposite, reducing the dynamic range of the scene to better t in the limited range of the sensor. The method we investigate attempts this reduction in a two-part, combined optical and software approach. First, we optically blur the image to reduce the dynamic range of the scene incident on sensor. Then we restore the original image detail and dynamic range in software. Blurring is a convolution operation, where the energy that would fall on a single photosite of the sensor is spread over a local neighborhood, and reciprocally, that photosite receives some amount of energy from its neighborhood. This exchange reduces the difference between the pixel and its neighbors, thus producing an image with less local contrast. 
The underlying assumption of this approach is that the most extreme luminance values are sufficiently spatially distributed that the local contrast reduction from convolution will reduce the number of pixels outside the sensor’s dynamic range. A good candidate for this method would be small point sources, such as streetlights at night, while a bad candidate would be large bright areas, such as daytime sky viewed through a window. We analyze, both in terms of possible dynamic range reduction and resulting image quality, the properties of the optical-digital system composed of a blurred image obtained with an aperture filter inserted into the optical path and then restored by a deconvolution method. The results are quantified to determine whether any combination can produce a meaningful reduction in the dynamic range the sensor must capture while maintaining acceptable image quality.

. Physical Setup

While the optical system in a camera lens is complex, the entire collection of lenses can be approximated as a pair of thick lenses with the aperture in between them. This pair of lenses focuses a bundle of rays coming from points in the scene to points on the sensor, while the size and shape of the aperture controls which rays in the bundle reach the sensor. The optical system focuses the image by directing all the rays in a bundle that originate from a point on the focal plane in the scene to converge to the same point on the imaging sensor. If a point lies in front of or behind the focal plane, the bundle of rays does not converge to a point on the sensor. Instead, the sensor plane will intersect the cone of light exiting the rear lens element, resulting in a circular pattern projected on the sensor. The amount of defocus determines the radius of the blur circle: points further from the focal plane are proportionally further from focusing on the sensor and are more blurred, since the sensor intersects a bigger slice of the cone.

As shown in Figure ., the aperture sits in the middle of the imaginary pair of optical elements we treat as the lens, and light rays pass directly through it. The circular blur pattern normally observed in out-of-focus images results from the circular shape of the aperture in a normal lens. With a different aperture shape, the sensor still intersects the cone of out-of-focus rays for a defocused point, but the aperture blocks some of the rays traveling through it, and the shape of the blur matches the pattern of the aperture.

. Coded Aperture

As mentioned in the previous section, defocus blur convolves the image of the scene by the aperture shape. Thus, the Fourier transform of the captured image has its frequency characteristics multiplied with the Fourier transform of the aperture shape.

Figure .: Placement of aperture filter in optical setup.

The goal of inserting an aperture filter into the optical system is to produce a blur pattern that has good frequency preservation properties. Convolution is equivalent to multiplying the Fourier transform of the scene image by the aperture optical transfer function (OTF), the frequency space representation of the aperture shape. Therefore, in a noise-free case, deconvolution can be viewed as dividing the Fourier transform of the captured image by the aperture OTF. Zeros and very small values in the Fourier transform of the filter result in division by zero and excessive error in the deconvolved image; these values are responsible for many of the artifacts observed.
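The consequence of those near-zero values is easy to demonstrate numerically. The sketch below is illustrative only: it uses a random image and a periodic-convolution assumption rather than the actual optical setup, blurs by multiplying the spectrum with the OTF of a disc, adds a small amount of noise, and then compares naive spectral division against a Wiener-style damped division.

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.random((256, 256))                    # stand-in for a scene

# OTF of a normalized disc kernel, padded to the image size.
radius = 8
y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
disc = (x**2 + y**2 <= radius**2).astype(float)
disc /= disc.sum()
otf = np.fft.fft2(disc, s=scene.shape)

# Blurring = multiplication by the OTF in the frequency domain, plus noise.
blurred = np.real(np.fft.ifft2(np.fft.fft2(scene) * otf))
captured = blurred + rng.normal(scale=1e-3, size=scene.shape)

spectrum = np.fft.fft2(captured)
naive = np.real(np.fft.ifft2(spectrum / otf))     # blows up at near-zero OTF bins
eps = 1e-3                                        # Wiener-style damping constant
damped = np.real(np.fft.ifft2(spectrum * np.conj(otf) / (np.abs(otf)**2 + eps)))

for name, rec in (("naive division", naive), ("damped division", damped)):
    rmse = np.sqrt(np.mean((rec - scene)**2))
    print(f"{name}: RMSE = {rmse:.3f}")
```

Even with very little noise, the naive division amplifies whatever falls on the near-zero bins of the disc’s Bessel-like spectrum, which is exactly the ringing described above; the damped division trades a small bias for stability.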
The question of “what constitutes a good coded aperture pattern?” involves a number of criteria. We take this inquiry to specifically consider the aperture pattern that yields the highest quality reconstruction of the original image with the least number of artifacts. As mentioned above, the zeros in the frequency response of the filter prevent information from the scene being captured by the camera. The ideal coded aperture pattern has a frequency spectrum with as few near-zero values as possible, with the existing near-zero values located as far away from the low frequency bins and from each other as possible. Additionally, that property must hold for a number of different scales of the filter, as points at different depths in the scene will be blurred by different amounts. There are several practical issues beyond these theoretical considerations. Sufficiently small holes in the aperture pattern will cause the light passing through them to diffract, causing ringing artifacts in the captured image. Another desired property is that the aperture pattern transmits as much light as possible. Given these considerations, we will now analyze several coded aperture patterns (shown in Figure .) and discuss their relative merits.

Figure .: Aperture filters evaluated (Standard, Gaussian, Veeraraghavan, Zhou, Levin).

Standard (circular) aperture: The circular aperture found in conventional lenses corresponds to a first-order Bessel function in frequency space. The Bessel function resembles a damped cosine, with near-zero values around the numerous zero-crossings of the function. This causes very poor results with conventional deconvolution algorithms, but can be made to yield acceptable results with a better deconvolution algorithm utilizing natural image statistics. However, as far as coded aperture patterns go, the standard aperture is very poor, and its only advantage is that it has the highest light transmission of any pattern.

Circular Gaussian: The Gaussian function has two compelling reasons for its use: the value of the function never reaches zero, and the Fourier transform of a Gaussian is another Gaussian. However, there are two equally significant caveats. First, while the Gaussian never reaches zero, it quickly reaches the noise floor of the camera; for the Fourier transform of a Gaussian to be sufficiently broadband, the aperture pattern is nearly a pinhole. Second, while the Gaussian function has infinite extent, the actual filter shape would be a truncated Gaussian, the Gaussian multiplied by a box filter. The result would have the same zero-crossing issues as the standard aperture.

Veeraraghavan et al. [] started with a Modified Uniformly Redundant Array (MURA) pattern, then improved it with an optimization function. They employed gradient descent to select the binary pattern that best accounts for the difference in configuration between conventional cameras and the x-ray telescopes such patterns were designed for. Modeling such differences as the linear convolution that occurs in the optical systems of conventional cameras, they iteratively searched for the binary pattern with the greatest minimum value of the filter frequency response.

Zhou and Nayar [] improve upon the work of Veeraraghavan et al. and formulate a definition of the quality of the aperture filter based directly on the quality of the deconvolved image, as opposed to the properties of the filter power spectrum alone.
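Before turning to those refinements, the simpler power-spectrum criterion can be illustrated with a toy search. The sketch below is only a stand-in for the published optimizations: it uses brute-force random sampling rather than the gradient descent or genetic algorithms described here, ignores diffraction, and the function and variable names are ours. Each candidate binary mask is scored by the smallest OTF magnitude it attains over several kernel scales.

```python
import numpy as np

def min_otf_magnitude(pattern, scales=(7, 11, 15), size=128):
    """Smallest |OTF| value of a binary aperture mask over several blur scales.

    A crude proxy for the broadband criterion: good masks keep their spectrum
    far from zero at every scale, so they remain invertible after defocus.
    """
    worst = np.inf
    for s in scales:
        idx = np.arange(s) * pattern.shape[0] // s       # nearest-neighbour resize
        kernel = pattern[np.ix_(idx, idx)].astype(float)
        kernel /= kernel.sum()
        otf = np.fft.fft2(kernel, s=(size, size))
        worst = min(worst, np.abs(otf).min())
    return worst

rng = np.random.default_rng(1)
n = 8                                                    # 8x8 binary mask
best_mask, best_score = None, -1.0
for _ in range(2000):                                    # brute-force random search
    candidate = rng.integers(0, 2, size=(n, n))
    if candidate.sum() < n * n // 2:                     # keep light transmission high
        continue
    score = min_otf_magnitude(candidate)
    if score > best_score:
        best_mask, best_score = candidate, score

print("best min |OTF| found:", best_score)
print(best_mask)
```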
Zhou and Nayar assume that the camera will be used to capture natural images, and they make use of the 1/f relation of frequencies in natural images to further refine their filter choice. Additionally, they note that the amount of noise present in the captured image significantly affects the quality of the deconvolution, and they use a genetic algorithm to search for the optimal coded aperture for a series of different noise levels. Both this work and Veeraraghavan et al. assume that the radius of the blur is determined by another method.

Levin et al. [] conducted similar work but optimized a filter for a different set of criteria. In their work, they wanted to accurately determine the amount of defocus present at every pixel and use that to recover the depth. As opposed to the work of Veeraraghavan et al. and Zhou and Nayar, who either directly or indirectly optimized filter patterns with a minimal number of near-zero values, Levin et al. constructed a filter such that there was the maximum possible difference in the positions of the zero-crossings in the frequency spectrum for different filter radii. This allowed them to accurately recover the amount of blur at every pixel by looking at the missing information, which they were able to fill in with their deconvolution algorithm.

. Deconvolution

Deconvolution is used to restore the blurred image recorded by the sensor to its original sharpness and dynamic range. As discussed in Section ., more recent deconvolution routines based on natural image statistics have made significant progress towards restoring the sharpness of blurred images. While the use of a coded aperture filter improves the spatial frequency properties of the captured image, the fact remains that the deconvolution problem is ill-posed. The blurred image could be the result of any one of an infinite number of images that produce the same result when convolved by the chosen filter. However, none of those images will be exactly the same as the perfect (not blurred) photograph of a real world scene. Real world scenes all share some very specific properties that can be used to guide the result of the deconvolution towards a physically-plausible result. Specifically, natural images tend to consist of large areas of nearly-constant values with sharp divisions between them. Described more formally, the derivatives of natural images have a heavy-tailed distribution with a narrower peak and longer tail than a Gaussian function. This distribution implies that most pixels have values close to zero, but some small number of pixels, especially those lying next to edges, have significantly larger values.

Both Bando and Nishita [] and Levin et al. [] have presented deconvolution algorithms that make use of these natural image statistics. They solve a system of equations that includes a weighting term that corresponds to the prior assumption of a natural-image distribution. Bando and Nishita modify the WaveGSM [Bioucas-Dias, ] algorithm to operate in gradient space and perform expectation maximization on the resulting non-linear system of equations using the second-order stationary iterative method. Levin et al. [] approach the problem as finding the maximum likelihood explanation, employing iteratively reweighted least-squares to solve for the non-linear sparse prior term.

. Evaluation

Our goal is to determine whether any combination of aperture filter and deconvolution algorithm can produce a meaningful increase in effective dynamic range while maintaining acceptable image quality.
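As a brief aside before describing the comparison, the heavy-tailed derivative statistics that the natural-image priors above rely on are easy to verify numerically. The sketch below is illustrative only; it assumes scikit-image is available and uses its bundled test photograph as a stand-in for a natural image, comparing the gradient distribution against Gaussian samples of matched standard deviation.

```python
import numpy as np
from skimage import data          # assumed available; any photograph would do

img = data.camera().astype(float) / 255.0
grads = np.diff(img, axis=1).ravel()              # horizontal derivatives

rng = np.random.default_rng(0)
gauss = rng.normal(scale=grads.std(), size=grads.size)

def kurtosis(v):
    """Fourth standardized moment; 3 for a Gaussian, larger for heavy tails."""
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2

def tail_fraction(v, k=4.0):
    """Fraction of samples more than k standard deviations from zero."""
    return np.mean(np.abs(v) > k * v.std())

print("image gradients: kurtosis %.1f, tail fraction %.5f"
      % (kurtosis(grads), tail_fraction(grads)))
print("gaussian sample: kurtosis %.1f, tail fraction %.5f"
      % (kurtosis(gauss), tail_fraction(gauss)))
# Natural-image gradients concentrate near zero but keep a long tail of large
# values at edges, which is what the sparse priors above encode.
```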
Specifically, we compare filters by Veeraraghavan et al. [], Zhou and Nayar [], and Levin et al. [] in addition to the Gaussian and standard aperture. We evaluate deconvolution methods by Bando and Nishita [] and Levin et al. [] in addition to classical Wiener filtering [Wiener, ] and Richardson-Lucy [Lucy, ; Richardson, ] methods.

In our evaluation, we measure the reduction in contrast between the blurred image and the original image, as opposed to the expansion in contrast between the result of the deconvolution and the blurred image. Errors in images produced by deconvolution often appear as ringing artifacts around contrast edges, artificially increasing the contrast in those regions; measuring the reduction avoids mistaking those artifacts for meaningful increases in dynamic range. The quality of the deconvolution algorithm is still determined as the difference between the original image and the reconstructed image. The structure of the image has an impact on the effectiveness of the algorithm: if the size of a bright or dark image feature exceeds the given filter diameter, there will be no reduction in dynamic range, since energy is only exchanged between pixels within the local neighborhood. Concerning dynamic range reduction, our analysis considers both properties of the filter used, as a measure of local contrast reduction, as well as the distribution of intensity values in the image at different scales, as a measure of global contrast reduction. Our primary interest was validating the proof of concept, and we conduct our evaluation using synthetic results instead of real optical systems. This approach introduces less complexity in quantifying the performance of the method while still determining whether the upper bound of the performance is of sufficient quality.

We present two of the images we used to evaluate the performance in this chapter, Atrium Morning and Atrium Night, shown in Figure .. This pair was chosen to demonstrate how the performance of the system depends on the spatial distribution of luminance values in the image. While Atrium Night has a larger dynamic range than that of Atrium Morning, its extreme values are located in small bright light sources, in contrast to the broad skylights in Atrium Morning.

Figure .: Sample images used in evaluation (Atrium Morning, Atrium Night). Images copyright Frederic Drago.

Table . shows the radii of filters used in the evaluation and the change in dynamic range of the image as a result of being blurred by a standard aperture disc filter of different radii. The night image results in a larger reduction in dynamic range, since the point light sources distribute energy over neighborhoods of significantly dimmer values. The tables and plots included in this section all reference the differences in dynamic range in terms of the photographic concept of exposure value (EV) stops. In photography, a change of 1 stop or 1 EV represents a unit change on the log2 scale, where 1 EV = log2(L1) − log2(L2). We evaluated the quality of the images reconstructed by the deconvolution algorithms in terms of peak signal-to-noise ratio (PSNR) [Thomos et al., ]. Typical values for the PSNR in lossy image and video compression are between 30 and 50 dB, where higher is better [Thomos et al., ; Xiangjun and Jianfei, ]. However, the images used in this evaluation have a larger dynamic range than that of conventional 8-bit images.
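Both measures are straightforward to compute from linear luminance values. The sketch below records the definitions assumed in the discussion that follows (dynamic range and its reduction as log2 ratios, PSNR in dB relative to the peak of the reference image); the helper names are ours, not part of the original implementation.

```python
import numpy as np

def dynamic_range_ev(img):
    """Dynamic range of a strictly positive luminance image, in EV stops."""
    return np.log2(img.max()) - np.log2(img.min())

def ev_reduction(original, blurred):
    """How many stops narrower the blurred image is than the original."""
    return dynamic_range_ev(original) - dynamic_range_ev(blurred)

def psnr(reference, reconstruction):
    """Peak signal-to-noise ratio in dB, with the reference's peak as signal."""
    mse = np.mean((reference - reconstruction) ** 2)
    return 10.0 * np.log10(reference.max() ** 2 / mse)
```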
From evaluating the quality of the results we have chosen 35 dB to be the lower bound on acceptable image quality for deconvolved images. These images have visible artifacts, but all of the features are still clearly visible. Similarly, we have chosen a dynamic range reduction of 2 stops to be the minimum acceptable reduction in dynamic range.

Our evaluation proceeded as follows. First, each image was convolved by each of the aperture filters at a number of different radii. The minimum and maximum values of the original image and the blurred image were compared to compute the amount of dynamic range reduction for that size of filter. Next, different amounts of Gaussian noise were added to simulate the random nature of the image acquisition process for different sensor sizes. Then, all of the combinations of filter, radius and noise were deconvolved by each of the deconvolution algorithms. Finally, all results were compared to the original image to compute the PSNR.

Table .: Amount of reduction in dynamic range as a function of the radius of a standard aperture (disc) filter in pixels. The minimum and maximum values are the result of convolving the original image with a disc filter of the specified size, showing how much of the dynamic range reduction was from reducing the intensity of highlight regions versus increasing the intensity of shadow regions. All units are in terms of powers of two, referred to as exposure value (EV) stops.

Figures . and . summarize our results across the combinations of aperture filter and deconvolution method for images without any noise added, while Figures . and . summarize our results for images with Gaussian noise σ = 1 added.

Figure .: Comparison of all four deconvolution algorithms on the Atrium test scenes without any noise added. All results were computed using the aperture filter proposed by Zhou and Nayar []. Values above and to the right of the green bars pass our acceptance criteria.

The deconvolution algorithm by Levin et al. [] was able to produce acceptable results for images with small bright areas, such as Atrium Night, when paired with one of the filters by Levin et al. [], Veeraraghavan et al. [], or Zhou and Nayar []. However, it was only able to do so at noise levels below those found in existing cameras.
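For reference, the classical Richardson-Lucy baseline included in this comparison can be written in a few lines. The sketch below is the textbook multiplicative update for a known, normalized, non-negative blur kernel; it is a stand-in, not the exact implementation evaluated here.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(captured, kernel, iterations=30):
    """Textbook Richardson-Lucy deconvolution for a known, normalized kernel."""
    estimate = np.full_like(captured, captured.mean())
    kernel_mirror = kernel[::-1, ::-1]
    eps = 1e-12                                   # guards against division by zero
    for _ in range(iterations):
        reblurred = fftconvolve(estimate, kernel, mode='same')
        ratio = captured / (reblurred + eps)
        estimate = estimate * fftconvolve(ratio, kernel_mirror, mode='same')
    return estimate
```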
Figure .: Comparison of all aperture filters on the Atrium test scenes without any noise added. All results were computed using the deconvolution algorithm proposed by Levin et al. []. Values above and to the right of the green bars pass our acceptance criteria.

Additionally, the Levin et al. [] method performed worse than Richardson-Lucy for images with a few bright points like Atrium Night at realistic noise levels. While it is able to reconstruct very fine details, the method of Levin et al. [] tends to introduce high-frequency ringing. The amount of ringing associated with a given feature is insignificant relative to its overall magnitude, but if that feature is bright enough, the ringing will destroy detail in the dark regions of the image. Increasing the smoothing parameter of the algorithm might have produced less noise in the dark regions but would not have recovered any additional detail.

Figure .: Comparison of all four deconvolution algorithms on the Atrium test scenes with additive Gaussian noise of σ = 1. All results were computed using the aperture filter proposed by Zhou and Nayar []. Values above and to the right of the green bars pass our acceptance criteria.

Figure .: Comparison of all aperture filters on the Atrium test scenes with additive Gaussian noise of σ = 1. All results were computed using the deconvolution algorithm proposed by Levin et al. []. Values above and to the right of the green bars pass our acceptance criteria.

The deconvolution algorithm by Bando and Nishita [] performed worse than expected, given its success on conventional 8-bit images. This could be caused by the selection of its normalization parameters, which we have not been able to optimize for the given images. However, in all tests on conventional (8-bit) images, it performed at best equal with that of Levin et al. [], and we are confident that it could not do significantly better here. The filters by Levin et al. [], Veeraraghavan et al. [] and Zhou and Nayar [] outperformed the other filters for images with low noise levels.
As the amount of noise increased, the difference between the capabilities of the different filters decreased, and eventually all the filters had the same performance as the conventional aperture. This confirms the observation of Zhou and Nayar [] that the optimized aperture filters they designed became simpler and more like the conventional aperture as the noise level increased. Overall, the results show that no current combination of aperture filter and deconvolution algorithm can deliver an acceptable combination of dynamic range reduction and image quality for images with large bright areas like Atrium Morning at any noise level, while the desired performance was only possible for images with small bright areas at unrealistically low amounts of noise. Additionally, the quality of the final result depended more on the deconvolution method than on the choice of aperture filter for realistic noise levels.

. Conclusions

None of the possible combinations of aperture filter and deconvolution algorithm were able to consistently reduce the dynamic range of the captured image without excessively degrading image quality. The combination of algorithm and filter that did work did so under very controlled conditions. Without advances in either aperture filtering or image reconstruction, the approach is not applicable to general circumstances. The efficiency of this defocus imaging approach is scene-dependent: the method is good for small over-exposed regions that are just above the maximum photosite capacity of a sensor, but performs worse on large over-exposed areas or in recovering exceedingly bright regions. The more complex deconvolution algorithms performed better than traditional methods, but at significant computational cost. These algorithms took on the order of several minutes to produce results for megapixel images. The marginal improvement in dynamic range at acceptable image quality does not justify the amount of computation required by the method.

We could also consider different means of evaluating image quality. High PSNR is rarely the goal of deconvolution. Images may look acceptable, or even better, if they are sharper, despite the fact that the PSNR score would be lower than a different version. We could obtain a more accurate estimate of the image quality using different metrics or subjective user preference as a means of evaluation.

Subsequent collaborative work [Rouf et al., ] took a novel view of high dynamic range capture inspired by this combined blurring-deconvolution approach. In this work, we first optically encode both the reflectance portion of the scene and highlight information into the image captured with a conventional image sensor. This step is achieved using a cross-screen or “star” filter. Second, we decode, in software, both the low dynamic range image and the highlight information. Lastly, these two portions can be combined to form an image of a higher dynamic range than the regular sensor dynamic range.

Chapter

Discussion and Conclusions

This dissertation has presented several methods for manipulating the perception of blur and contrast in images. These methods take inspiration from the fundamental organization of spatial image perception into multiple parallel channels for processing visual information and conveying image appearance. In addition to taking inspiration from this organization, some of the methods employ models of human spatial vision to more accurately control the appearance of images under changing viewing conditions.
. Benefits and Limitations

Beyond the individual contributions of each chapter, there are several other benefits. The first benefit of the work presented in this dissertation is the conceptual framework for understanding scale-dependent effects of image perception. We have demonstrated that the parallel spatial frequency channels present in every viewer’s visual system affect the perceived appearance of displayed images. The resolution of the image on the retina, not in terms of pixels but of angular resolution, determines the mapping of spatial frequency information in the image to specific visual channels. With this understanding, we are able to identify which elements of image display affect the mapping from spatial frequencies to visual channels, the physical arrangement of viewer and display in particular. It is this relationship, along with models of image appearance based on those spatial frequency channels, that allows us to predict how an image is perceived and how that perception will change depending on changes in viewing conditions.

We have used this model to create new methods of manipulating blur and contrast in images. Accurate estimates of spatially-variant image blur allow us to synthesize images with defocus patterns prohibited by the constraints of lens optics design. When coupled with a model of perceived image blur, the same estimation routine allows us to preserve the appearance of images when downsampling. A similar perceptual model of whether countershading is perceived as acceptable or objectionable has provided us with new insights into the use of unsharp masking and similar image processing operations. Our work has also suggested novel ways of introducing countershading that avoid the amplification of high frequency details that plagued previous approaches. Finally, we have provided some initial attempts at the scale-aware display of images, with monitors that present images calibrated not only to the size and resolution of the display, but to changes in the physical distance to the viewer.

The contributions presented in this work are not without limitation. Several of the methods presented rely on accurate estimation of image features when altering image appearance. Regardless of the quality of the image appearance models, if the estimation falters, the result will contain artifacts. The method of blur estimation presented is still not entirely robust to noise: it can confuse noise with fine texture detail. Our means of estimating the noise level in the image could be improved, and more complex approaches would distinguish between noise and texture detail based on statistics of image regions. Currently, we just compare the gradient magnitudes of regions. Patch-based methods that estimate local histograms could better distinguish between random noise and fine image structures. Additionally, our method can only estimate the amount of defocus blur. Motion blur and camera shake do not alter the gradient magnitude in an analogous way and still appear sharp to the estimation algorithm. Extending our model of blur would allow us to enhance the appearance of more than defocus blur effects. There is even more work regarding the estimation of countershading profiles. Our example of Seurat’s Le Bec du Hoc was only a proof of concept and cannot easily be extended to automatically handle arbitrary images. While the sharp transition of a countershading profile is easy to measure in images, the slow-changing gradients are much harder to accurately estimate.
The close proximity of step edges and low-magnitude gradients defies existing multi-scale estimation techniques such as Elder and Zucker [], and more robust means of estimating very small gradients in images need to be developed.

Providing correct-appearing images to the viewer is another challenge. The scale at which the image appears on the viewer’s retina is in part determined by the physical relationship between the viewer and the display. Display-specific attributes such as dimension and resolution can be known in advance and precomputed. The distance of the viewer continually changes, however, and the display must adapt to that change. Our scale-aware display provides a proof of concept capable of adapting the image content to the viewer’s distance, but only for a single viewer, and basing the approach on head-tracking technology is not suitable for everyday use. Better methods of tracking multiple viewers and providing them with specifically-tailored images are necessary for truly scale-aware image display. Some new technology, whether light field display or something else, is necessary to enable this possibility.

. Future Work

There are a number of potential avenues of further investigation. Each individual contribution suggests next possible steps, as well as some other directions that draw on lessons from the body of work as a whole. On the topic of blur estimation, future work includes improving the algorithm’s ability to differentiate between textured and noisy regions and to preserve fine details. In the context of mobile devices, information is available from additional types of sensors, and it may be possible to use the accelerometer found in many devices to more accurately estimate blur resulting from camera shake. Future work in image resizing includes extending the concept of preserving the perceived appearance to other image attributes. From an image-quality perspective, accurately preserving the appearance of noise when downsampling can be as important as blur. We would like to conduct a more comprehensive study of the perception of countershading, including more images and larger variation of the underlying edge contrast. The existing models of perceived contrast resulting from the Cornsweet illusion would enable us to accurately determine the new combinations of λ and σ that preserve the perceived edge contrast while still avoiding haloes.

The investigation of Camera Dynamic Range Expansion did not yield positive results, in part due to the quality of the deconvolution algorithms used, which are not well-tuned for high dynamic range images. The human visual system is sensitive to relative change and can detect small changes in dark regions. However, deconvolution algorithms are linear and minimize absolute error without accounting for how salient that error is to the final observer. An interesting line of research would be attempting to devise a deconvolution algorithm that gives preference to relative error and correctly weights the importance of dark regions. Additionally, while we treat the visual channels in the brain as independent, we know this is not the case. Lateral inhibition, found between adjacent photoreceptors in the retina, applies equally to adjacent spatial frequency channels in the visual system at large. The presence of strong contrast at one frequency inhibits the response of nearby visual channels and can even amplify the response of more distant ones.
We want to extend our models of the perception of blur, contrast and other image attributes to incorporate models of spatial perception accurate enough to capture these subtle effects. Finally, the main theme of this dissertation is centered on the observation that the same image can appear different depending on the scale at which it is viewed. The field of color science has developed models capable of accounting for changes in color perception under different conditions. Similarly, we envision the development of an overarching pipeline that employs accurate models of human spatial vision to account for all changes in image appearance resulting from the diverse scales at which that image can be displayed.

. Conclusions

Over the last decade, the paradigm of computational photography has made great strides increasing the capabilities of image capture. All of these approaches have involved some change in the physical setup to visually encode more information, paired with a change in the processing of that image to recover that information and produce the desired result. The next decade will see a similar increase in the computational abilities available for image display, and it is worth considering what principles would best guide the development of future display algorithms.

At the heart of the issue, we are fundamentally asking ourselves “What are the reasons to display, or even produce, images?” Imaging is synonymous with communication. In the end, images are produced to communicate some notion or concept, no matter how abstract, with the viewer. The implicit assumption is that we are communicating with a human observer possessing a human visual system that will process the visual stimuli, and the resulting perceived qualities will form their interpretation. Computational photography constructs novel physical setups to encode information and designs matching algorithms to decode that information. Likewise, the emerging field of computational display must recognize that, since the visual system is responsible for decoding all information received by the viewer, the means of image display capable of conveying the most information will be matched to its respective decoder, the viewer’s visual system. Accurate and communicative display of images must rely on knowledge of human perception, for the act of perception is the only means of receiving information available to viewers.

The conventional imaging pipeline includes the capture, manipulation and display of image content. In the future, we foresee the development of a new imaging pipeline that extends beyond the production of photons by the monitor. This new pipeline will recognize that any displayed image is viewed under some specific conditions by the visual system of an observer, which responds to both the image and the conditions. Any communicated information will result from that specific response of the visual system to those stimuli. By considering the perceived appearance of images we can develop an image understanding pipeline as the basis of more effective visual communication.

Bibliography

Acosta-Serafini, P., Masaki, I., and Sodini, C. (). Single-chip imager system with programmable dynamic range. U.S. Patent ,,. → pages  Adelson, E. H. (). Checkershadow illusion. → pages  Adelson, E. H. (). Lightness perception and lightness illusions. In Gazzaniga, M., editor, The New Cognitive Neurosciences, chapter , pages –. MIT Press, Cambridge, MA, nd edition. → pages  Adelson, E. H., Anderson, C. H., Bergen, J. R., Burt, P.
J., and Ogden, J. M. (). Pyramid methods in image processing. Engineer, ():–. → pages  Agrawal, A., Xu, Y., and Raskar, R. (). Invertible motion blur in video. ACM Trans. Graph., ::–:. → pages  Akyüz, A. and Reinhard, E. (). Perceptual evaluation of tone-reproduction operators using the Cornsweet-Craik-O’Brien illusion. ACM Transactions on Applied Perception (TAP), (). → pages  Anstis, S. M. and Howard, I. P. (). A Craik-O’Brien-Comsweet illusion for visual depth. Vision Research, :–. → pages  Avidan, S. and Shamir, A. (). Seam carving for content-aware image resizing. ACM Trans. Graph., ():. → pages  Badamchizadeh, M. A. and Aghagolzadeh, A. (). Comparative study of unsharp masking methods for image enhancement. In Proceedings of International Conference on Image and Graphics, pages –, Washington, DC, USA. IEEE Computer Society. → pages , ,  Bae, S. and Durand, F. (). Defocus magni cation. Computer Graphics Forum, ():–. → pages xi, , , , , , ,     Bando, Y. and Nishita, T. (). Towards digital refocusing from a single photograph. In Proceedings of Paci c Conference on Computer Graphics and Applications, pages –. → pages , , ,  Barnard, K. and Funt, B. (). Analysis and improvement of multi-scale retinex. Proceedings of the IS&T/SID Fifth Color Imaging Conference: Color Science, Systems and Applications, pages –. → pages  Basu, M. (). Gaussian-based edge-detection methods - a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C, ():–. → pages  Ben-Ezra, M. and Nayar, S. K. (). Motion-based motion deblurring. IEEE Transactions on Pattern Analysis and Machine Intelligence, :–. → pages  Berkner, K. and Erol, B. (). Adaptation of document images to display constraints. In Human Vision and Electronic Imaging XIII, volume , page C. SPIE. → pages  Bioucas-Dias, J. (). Bayesian wavelet-based image deconvolution: a GEM algorithm exploiting a class of heavy-tailed priors. IEEE Transactions on Image Processing, ():–. → pages  Blake, A. (). On lightness computation in Mondrian world. In Ottoson, T. and Zeki, S., editors, Central and Peripheral Mechanisms of Color Vision, pages –, New York. Macmillan. → pages  Blakemore, C., Carpenter, R. H. S., and Georgeson, M. A. (). Lateral inhibition between orientation detectors in the human visual system. Nature, :–. → pages ,  Blakemore, C., Muncey, J. P., and Ridley, R. M. (). Stimulus speci city in the human visual system. Vision Res., :–. → pages  Brady, N. and Field, D. J. (). What’s constant in contrast constancy? The effects of scaling on the perceived contrast of bandpass patterns. Vision Res., :–. → pages  Burr, D. (). Implications of the Craik-O’Brien illusion for brightness perception. Vision Research, ():–. → pages , ,  Burt, P. J. and Adelson, E. H. (). The laplacian pyramid as a compact image code. IEEE Transactions on Communications, :–. → pages     Calabria, A. J. and Fairchild, M. D. (a). Perceived image contrast and observer preference i: The effects of lightness, chroma, and sharpness manipulations on contrast perception. Journal of Imaging Science & Technology, :–. → pages ,  Calabria, A. J. and Fairchild, M. D. (b). 
Perceived image contrast and observer preference ii: Empirical modeling of perceived image contrast and observer preference data. Journal of Imaging Science & Technology, :–. → pages ,  Campbell, F. W., Howell, E. R., and Johnstone, J. R. (). A comparison of threshold and suprathreshold appearance of gratings with components in the low and high spatial frequency range. J. Physiol. (Lond.), :–. → pages , ,  Campbell, F. W., Howell, E. R., and Robson, J. G. (). The appearance of gratings with and without the fundamental Fourier component. J. Physiol. (Lond.), :P–P. → pages , ,  Cannon, M. W. (). Perceived contrast in the fovea and periphery. J Opt Soc Am A, :–. → pages  Canny, J. (). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages –. → pages ,  Chakrabarti, A., Zickler, T., and Freeman, W. T. (). Analyzing spatially-varying blur. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages –. → pages  Chen, C.-C., Chen, K.-P., Tseng, C.-H., Kuo, S.-T., and Wu, K.-N. (). Constructing a metrics for blur perception with blur discrimination experiments. In Proc. SPIE, Image Quality and System Performance VI, number . → pages x,  Chiu, K., Herf, M., Shirley, P., Swamy, S., Wang, C., and Zimmerman, K. (). Spatially nonuniform scaling functions for high contrast images. In Proc. Graphics Interface, pages –. → pages  CIE (). A colour appearance model for colour management systems: CIECAM. Technical Report Technical Report CIE :, Commision Internationale De L’Eclairage, Vienna. → pages     Ciuffreda, K. J., Selenow, A., Wang, B., Vasudevan, B., Zikos, G., and Ali, S. R. (). ”Bothersome blur”: a functional unit of blur perception. Vision Res., :–. → pages ,  Ciuffreda, K. J., Wang, B., and Vasudevan, B. (). Conceptual model of human blur perception. Vision Research, (): – . → pages x, ,  Cornsweet, T. (). Visual Perception. Academic Press, New York. → pages  Cossairt, O., Zhou, C., and Nayar, S. (). Diffusion coded photography for extended depth of eld. ACM Trans. Graph., ::–:. → pages  Craik, K. J. W. (). The nature of psychology; a selection of papers, essays, and other writings. Cambridge University Press, Cambridge. → pages  Cufflin, M. P., Mankowska, A., and Mallen, E. A. H. (). Effect of blur adaptation on blur sensitivity and discriminations in emmetropes and myopes. Investigative Ophthalmology & Visual Science, ():–. → pages  Dai, S. and Wu, Y. (). Estimating space-variant motion blur without deblurring. In Proceedings of the International Conference on Image Processing, pages –. → pages  Davey, M., Maddess, T., and Srinivasan, M. (). The spatiotemporal properties of the Craik-O’Brien-Cornsweet effect are consistent with ‘ lling-in’. Vision Res, :–. → pages  De Valois, R. and De Valois, K. (). Spatial Vision. Oxford University Press. → pages , , , ,  Deans, S. R. (). The radon transform and some of its applications. → pages  Debevec, P. and Malik, J. (). Recovering high dynamic range radiance maps from photographs. Proceedings of ACM SIGGRAPH. → pages  Devinck, F., Hansen, T., and Gegenfurtner, K. (). Temporal properties of the chromatic and achromatic Craik-O’Brien-Cornsweet effect. Vision Research, ():–. 
→ pages  Dicke, R. (). Scatter-hole cameras for x-rays and gamma rays. Astrophysical Journal, . → pages  Donoho, D., Johnstone, I., and Johnstone, I. M. (). Ideal spatial adaptation by wavelet shrinkage. Biometrika, :–. → pages    Dooley, R. and Green eld, M. (). Measurements of edge-induced visual contrast and a spatial-frequency interaction of the Cornsweet illusion. Journal of the Optical Society of America, ():–. → pages , , , ,  Durand, F. and Dorsey, J. (). Fast bilateral ltering for the display of high-dynamic-range images. ACM Trans. Graph., :–. → pages xii, , ,  Elder, J. and Zucker, S. (). Local scale control for edge detection and blur estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, ():–. → pages xi, , , , , , , ,  Fairchild, M. and Johnson, G. (). iCAM framework for image appearance, differences, and quality. Electronic Imaging. → pages ,  Fairchild, M. D. (). Color Appearance Models. Wiley-IS&T, Chichester, UK, nd edition. → pages ,  Farbman, Z., Fattal, R., Lischinski, D., and Szeliski, R. (). Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Trans. Graph., ::–:. → pages , , , ,  Fattal, R. (). Image upsampling via imposed edge statistics. ACM Trans. Graph., ():. → pages  Fattal, R., Lischinski, D., and Werman, M. (). Gradient domain high dynamic range compression. ACM Transactions on Graphics, ():–. → pages  Feichtinger, H. G. and Strohmer, T. (). Gabor Analysis and Algorithms. Birkhäuser. → pages  Fenimore, E. E. and Cannon, T. M. (). Coded aperture imaging with uniformly redundant arrays. Applied Optics, ():–. → pages  Fergus, R., Singh, B., Hertzmann, A., Roweis, S., and Freeman, W. (). Removing camera shake from a single photograph. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), ():–. → pages  Funt, B., Barnard, K., Brockington, M., and Cardei, V. (). Luminance-based multi-scale retinex. AIC International Color Association. → pages  George, S. and Rosen eld, M. (). Blur adaptation and myopia. Optom Vis Sci, :–. → pages     Georgeson, M. and Sullivan, G. (). Contrast constancy: deblurring in human vision by spatial frequency channels. Journal of Physiology, ():–. → pages , ,  Georgeson, M. A. (). Contrast overconstancy. J Opt Soc Am A, :–. → pages  Gonzalez, R. C. and Woods, R. E. (). Digital Image Processing. Addison-Wesley, nd edition. → pages  Gottesman, S. and Fenimore, E. (). New family of binary arrays for coded aperture imaging. Applied Optics, ():–. → pages xi, ,  Hamerly, J. R. and Dvorak, C. A. (). Detection and discrimination of blur in edges and lines. J. Opt. Soc. Am., ():–. → pages ,  Held, R. T., Cooper, E. A., O’Brien, J. F., and Banks, M. S. (). Using blur to affect perceived distance and size. ACM Trans. on Graph., ()::–. → pages x, ,  Horn, B. (). On lightness. Technical report, MIT Arti cial Intelligence Lab. Membo . Massachusetts Institute of Technology. → pages  Hubel, D. H. and Wiesel, T. N. (). Receptive elds of single neurones in the cat’s striate cortex. Journal of Physiology, :–. → pages ,  Hurlbert, A. (). Formal connections between lightness algorithms. Journal of the Optical Society of America, ():–. 
→ pages  Ibenthal, A. (). Image sensor noise estimation and reduction. Technical Report ITG ., University of Applied Sciences and Arts Hildesheim/Holzminden/Göttingen. → pages  Ihrke, M., Ritschel, T., Smith, K., Grosch, T., Myszkowski, K., and Seidel, H.-P. (). A perceptual evaluation of D unsharp masking. Human Vision and Electronic Imaging, page . → pages ,  Ilic, L., Pizurica, A., Vansteenkiste, E., and Philips, W. (). Image blur estimation based on the average cone of ratio in the wavelet domain. volume , page F. SPIE. → pages  IMS Chips (). HDRC VGAx. http://www.hdrc.com. → pages  ITU-R-BT.- (). Methodology for the subjective assessment of the quality of television pictures. → pages    Ji, H. and Liu, C. (). Motion blur identi cation from image gradients. Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, :–. → pages  Jobson, D., Rahman, Z., Woodell, G., Center, N., and Hampton, V. (). Properties and performance of a center/surround retinex. IEEE Transactions on image processing, ():–. → pages  Jobson, D. and Woodell, G. (). Properties of a center/surround retinex part two: Surround design. Citeseer. → pages  Kandel, E. R., Schwartz, J. H., and Jessell, T. M., editors (). Principles of Neural Science. McGraw-Hill, New York, th edition. → pages ,  Kim, M. H., Ritschel, T., and Kautz, J. (). Edge-aware color appearance. ACM Transactions on Graphics (presented at SIGGRAPH), ()::–. → pages  Kim, S. H. and Allebach, J. P. (). Optimal unsharp mask for image sharpening and noise removal. J. of Electronic Imaging, ():. → pages  Kingdom, F. and Moulden, B. (). Border effects on brightness: A review of ndings, models, issues. Spatial Vision, ():–. → pages , , ,  Koenderink, J. (). The structure of images. Biological Cybernetics, ():–. → pages  Kopf, J., Cohen, M. F., Lischinski, D., and Uyttendaele, M. (). Joint bilateral upsampling. ACM Trans. Graph., . → pages  Krähenbühl, P., Lang, M., Hornung, A., and Gross, M. (). A system for retargeting of streaming video. In ACM SIGGRAPH Asia  papers, SIGGRAPH Asia ’, pages :–:, New York, NY, USA. ACM. → pages  Krawczyk, G., Myszkowski, K., and Seidel, H. P. (). Contrast restoration by adaptive countershading. In The European Association for Computer Graphics Annual Conference (EUROGRAPHICS). → pages xi, , , , ,  Kruskal, J. B. and Wish, M. (). Multidimensional Scaling. Sage Publications. → pages  Lam, E. Y. and Goodman, J. W. (). Iterative statistical approach to blind image deconvolution. J. Opt. Soc. Am. A, ():–. → pages     Land, E. (). An alternative technique for the computation of the designator in the retinex theory of color vision. Proceedings of the National Academy of Sciences, ():–. → pages  Land, E. and McCann, J. (). Lightness and retinex theory. Journal of the Optical Society of America. → pages x, , ,  Levin, A., Fergus, R., Durand, F., and Freeman, W. (). Image and depth from a conventional camera with a coded aperture. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), (). → pages , , , , , , , ,  Levin, A., Hasinoff, S. W., Green, P., Durand, F., and Freeman, W. T. (). d frequency analysis of computational cameras for depth of eld extension. ACM Trans. Graph., ::–:. 
→ pages  Levin, A., Lischinski, D., and Weiss, Y. (). Colorization using optimization. ACM Trans. Graph., :–. → pages ,  Levin, A., Sand, P., Cho, T. S., Durand, F., and Freeman, W. T. (). Motion-invariant photography. ACM Trans. Graph., ::–:. → pages  Lin, W., Gai, Y., and Kassim, A. (). Perceptual impact of edge sharpness in images. Vision, Image and Signal Processing, IEE Proceedings -, (): – . → pages , ,  Lindeberg, T. (a). Scale-space theory: A basic tool for analyzing structures at different scales. Journal of applied statistics, ():–. → pages ,  Lindeberg, T. (b). Scale-space theory in computer vision. → pages  Lindeberg, T. (c). Scale-Space Theory in Computer Vision. Kluwer Academic Publishers. → pages  Lindeberg, T. (). Edge detection and ridge detection with automatic scale selection. IEEE Conference on Computer Vision and Pattern Recognition, pages –. → pages  Lindeberg, T. (). Feature detection with automatic scale selection. International Journal of Computer Vision, ():–. → pages  Liu, C., Freeman, W., Szeliski, R., and Kang, S. B. (). Noise estimation from a single image. In Proc. CVPR, volume , pages –. → pages     Liu, R., Li, Z., and Jia, J. (). Image partial blur detection and classi cation. In Proc. CVPR, pages –. → pages  Loomis, J. M. and Nakayama, K. (). A velocity analogue of brightness contrast. Perception, :–. → pages  Lucy, L. (). An iterative technique for the recti cation of observed distributions. The Astronomical Journal, ():–. → pages ,  Luft, T., Colditz, C., and Deussen, O. (). Image enhancement by unsharp masking the depth buffer. ACM Trans. Graph., ():–. → pages ,  Lumsdaine, A. and Georgiev, T. (). The focused plenoptic camera. In In Proc. IEEE ICCP, pages –. → pages  Mantiuk, R., Krawczyk, G., Myszkowski, K., and Seidel, H. (). Perception-motivated high dynamic range video encoding. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), ():–. → pages  Mather, G. and Smith, D. (). Blur discrimination and its relation to blur-mediated depth perception. Perception, ():–. → pages  Moghaddam, M. E. and Jamzad, M. (). Linear Motion Blur Parameter Estimation in Noisy Images Using Fuzzy Sets and Power Spectrum. EURASIP Journal on Advances in Signal Processing, :–. → pages ,  Nayar, S. and Mitsunaga, T. (). High dynamic range imaging: Spatially varying pixel exposures. International Conference on Computer Vision, . → pages  Neycenssac, F. (). Contrast enhancement using the laplacian-of-a-gaussian lter. CVGIP: Graphical Models and Image Processing, (): – . → pages , , ,  Ng, R. (). Fourier slice photography. ACM Trans. Graph., :–. → pages  Ng, R., Levoy, M., Bredif, M., Duval, G., Horowitz, M., and Hanrahan, P. (). Light eld photography with a hand-held plenoptic camera. Technical Report CSTR -, Stanford University. → pages xi, ,  Nowak, R. D. and Baraniuk, R. G. (). Adaptive weighted highpass lters using multiscale analysis. IEEE Trans Image Process, :–. → pages  O’Brien, V. (). Contour perception, illusion and reality. Journal of the Optical Society of America, :–. → pages ,    Oliva, A., Torralba, A., and Schyns, P. G. (). Hybrid images. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), ():–. 
→ pages x, ,  Palmer, S. E. (). Vision Science: Photons to Phenomenology. MIT Press, Cambridge, Mass. → pages , ,  Pattanaik, S., Ferwerda, J., Fairchild, M., and Greenberg, D. (). A multiscale model of adaptation and spatial vision for realistic image display. In Proceedings of ACM SIGGRAPH, pages –. → pages  Peli, E. (). Contrast in complex images. Journal of the Optical Society of America, ():–. → pages  Peli, E., Arend, L., and Labianca, A. T. (). Contrast perception across changes in luminance and spatial frequency. J Opt Soc Am A Opt Image Sci Vis, :–. → pages  Peli, E., Yang, J. A., Goldstein, R., and Reeves, A. (). Effect of luminance on suprathreshold contrast perception. J Opt Soc Am A, :–. → pages  Pentland, A. P. (). A new sense for depth of eld. IEEE Trans. Pattern Anal. Mach. Intell., :–. → pages  Perona, P. and Malik, J. (). Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, ():–. → pages  Polesel, A., Ramponi, G., and Mathews, V. (). Image enhancement via adaptive unsharp masking. IEEE Trans. Image Proc., (): –. → pages  Popkin, T., Cavallaro, A., and Hands, D. (). Accurate and efficient method for smoothly space-variant Gaussian blurring. IEEE Trans. Img. Proc., ():–. → pages ,  Poynton, C. (). Digital Video and HDTV: Algorithms and Interfaces. Morgan Kaufmann. → pages  Pritch, Y., Kav-Venaki, E., and Peleg, S. (). Shift-map image editing. In ICCV’, pages –, Kyoto. → pages  Purves, D., Shimpi, A., and Lotto, R. (). An empirical explanation of the Cornsweet effect. Journal of Neuroscience. → pages  Purves, D., Williams, S. M., Nundy, S., and Lotto, R. B. (). Perceiving the intensity of light. Psychol Rev, :–. → pages    Rahman, Z. (). Properties of a center/surround retinex part one: Signal processing design. NASA Contractor Report, . → pages  Rahman, Z., Jobson, D., and Woodell, G. (a). Multi-scale retinex for color image enhancement. Image Processing, . Proceedings., International Conference on, . → pages  Rahman, Z., Jobson, D., and Woodell, G. (b). A multiscale retinex for color rendition and dynamic range compression. SPIE Proceedings: Applications of Digital Image Processing XIX, . → pages  Ramponi, G., Strobel, N. K., Mitra, S. K., and Yu, T.-H. (). Nonlinear unsharp masking methods for image contrast enhancement. Journal of Electronic Imaging, ():–. → pages  Raskar, R., Agrawal, A., and Tumblin, J. (). Coded exposure photography: motion deblurring using uttered shutter. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), ():–. → pages  Reinhard, E., Khan, E. A., Akyüz, A. O., and Johnson, G. M. (). Color Imaging: Fundamentals and Applications. A K Peters, Ltd. → pages  Reinhard, E., Stark, M., Shirley, P., and Ferwerda, J. (). Photographic tone reproduction for digital images. ACM Transactions on Graphics (special issue SIGGRAPH ), ():–. → pages  Richardson, W. (). Bayesian-based iterative method of image restoration. Journal of the Optical Society of America, (). → pages ,  Ritschel, T., Smith, K., Ihrke, M., Grosch, T., Myszkowski, K., and Seidel, H.-P. (). D unsharp masking for scene coherent enhancement. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), (). 
→ pages , ,  Robertson, M., Borman, S., and Stevenson, R. (). Dynamic range improvements through multiple exposures. In Proceedings of International Conference on Image Processing (ICIP) , pages –. → pages  Rooms, F., Pizurica, A., and Philips, W. (). Estimating image blur in the wavelet domain. In International Conference on Acoustics, Speech, and Signal Processing. → pages  Rosen eld, M. and Abraham-Cohen, J. A. (). Blur sensitivity in myopes. Optom Vis Sci, :–. → pages    Rosen eld, M., Hong, S. E., and George, S. (). Blur adaptation in myopes. Optom Vis Sci, :–. → pages  Rouf, M., Mantiuk, R., Heidrich, W., Trentacoste, M., and Lau, C. (). Glare encoding of high dynamic range images. In Computer Vision and Pattern Recognition (CVPR). → pages  Rubinstein, M., Gutierrez, D., Sorkine, O., and Shamir, A. (). A comparative study of image retargeting. ACM Trans. Graph., ::–:. → pages  Rubinstein, M., Shamir, A., and Avidan, S. (). Multi-operator media retargeting. ACM Trans. Graph., ::–:. → pages  Samadani, R., Mauer, T. A., Berfanger, D. M., and Clark, J. H. (). Image thumbnails that represent blur and noise. IEEE Trans. Img. Proc., ():–. → pages xi, xii, , , , , , , , , , , , , , ,  Santella, A., Agrawala, M., DeCarlo, D., Salesin, D., and Cohen, M. (). Gaze-based interaction for semi-automatic photo cropping. In Proc. CHI, pages –. → pages  Shan, Q., Jia, J., and Agarwala, A. (a). High-quality motion deblurring from a single image. ACM Trans. Graph., ::–:. → pages  Shan, Q., Li, Z., Jia, J., and Tang, C.-K. (b). Fast image/video upsampling. volume , pages :–:, New York, NY, USA. ACM. → pages ,  Shapley, R. M. and Tolhurst, D. J. (). Edge detectors in human vision. J. Physiol. (Lond.), :–. → pages ,  Shin, D.-H., Park, R.-H., Yang, S., and Jung, J.-H. (). Block-based noise estimation using adaptive gaussian ltering. IEEE Trans. Consumer Electronics, ():–. → pages  Simoncelli, E. P. and Adelson, E. H. (). Noise removal via bayesian wavelet coring. In Proc. ICIP, volume , pages –. → pages  Smith, K. (). Contours and Contrast. PhD thesis, Max-Planck-Institut fur Informatik. → pages ,  Smith, K., Krawczyk, G., Myszkowski, K., and Seidel, H. (). Beyond tone mapping: Enhanced depiction of tone mapped HDR images. Proceedings of Eurographics, ():–. → pages     Smith, K., Landes, P., Thollot, J., and Myszkowski, K. (). Apparent greyscale: A simple and fast conversion to perceptually accurate images and video. Proceedings of Eurographics. → pages  Subbarao, M. (). Radiometry. chapter Parallel depth recovery by changing camera parameters, pages –. Jones and Bartlett Publishers, Inc., , USA. → pages  Suh, B., Ling, H., Bederson, B. B., and Jacobs, D. W. (). Automatic thumbnail cropping and its effectiveness. In Proc. UIST, pages –. → pages  Sullivan, G. and Georgeson, M. (). The missing fundamental illusion: Variation of spatio-temporal characteristics with dark adaptation. Vision Research, ():–. → pages , , ,  Thomos, N., Boulgouris, N. V., and Strintzis, M. G. (). Optimized transmission of JPEG streams over wireless channels. IEEE Transactions on image processing, (). → pages  Tiippana, K. and Näsänen, R. (). Spatial-frequency bandwidth of perceived contrast. 
Vision Resarch, ():–. → pages  Tolhurst, D. J. (). On the possible existance of edge detector neurones in the human visual system. Vision Res., :–. → pages  Tong, H., Li, M., Zhang, H., and Zhang, C. (). Blur detection for digital images using wavelet transform. In Proceedings of the IEEE International Conference on Multimedia and Expo, pages –. → pages  Trentacoste, M., Lau, C., Rouf, M., Mantiuk, R., and Heidrich, W. (a). Defocus techniques for camera dynamic range expansion. In Proceedings of Human Vision and Electronic Imaging XXI, page . → pages iv Trentacoste, M., Mantiuk, R., and Heidrich, W. (b). Quality-preserving image downsizing. SIGGRAPH Student Research Competition Poster. → pages iv Trentacoste, M., Mantiuk, R., and Heidrich, W. (a). Blur-aware image downsampling. Computer Graphics Forum (Eurographics). → pages iii, iv Trentacoste, M., Mantiuk, R., and Heidrich, W. (b). Scale-dependent perception of countershading. In To Appear. → pages iii Trentacoste, M., Mantiuk, R., and Heidrich, W. (c). Synthetic depth-of- eld for mobile devices. In To Appear. → pages iii, iv   Trussell, H. J. and Fogel, S. (). Identi cation and restoration of spatially variant motion blurs in sequential images. IEEE Transactions on Image Processing, :–. → pages  van den Brink, G. and Kleemink, C. J. (). Luminance gradients and edge effects. Vision Research, :–. → pages  Veeraraghavan, A., Raskar, R., Agrawal, A., Mohan, A., and Tumblin, J. (). Dappled photography: mask enhanced cameras for heterodyned light elds and coded aperture refocusing. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH). → pages , , , ,  Vishwanath, D. and Blaser, E. (). Retinal blur and the perception of egocentric distance. J Vis, :. → pages  Wachtler, T. and Wehrhahn, C. (). The Craik-O’Brien-Cornsweet illusion in colour: Quantitative characterisation and comparison with luminance. Perception, :–. → pages  Walker, P. and Powell, D. J. (). Lateral interaction between neural channels sensitive to velocity in the human visual system. Nature, :–. → pages  Wang, B. and Ciuffreda, K. J. (a). Blur discrimination of the human eye in the near retinal periphery. Optom Vis Sci, :–. → pages  Wang, B. and Ciuffreda, K. J. (b). Foveal blur discrimination of the human eye. Ophthalmic Physiol Opt, :–. → pages ,  Wang, B., Ciuffreda, K. J., and Irish, T. (a). Equiblur zones at the fovea and near retinal periphery. Vision Res., :–. → pages ,  Wang, B., Ciuffreda, K. J., and Vasudevan, B. (b). Effect of blur adaptation on blur sensitivity in myopes. Vision Research, (): – . → pages  Wang, Y.-P., Wu, Q., Castleman, K. R., and Xiong, Z. (). Image enhancement using multiscale differential operators. In IEEE Proc. of the Acoustics, Speech, and Signal Processing, pages –. → pages  Ware, C. and Cowan, W. (). The chromatic Cornsweet effect. Vision Res, ():. → pages  Watson, A. (). The cortex transform: Rapid computation of simulated neural images. Computer Vision, Graphics, and Image Processing, :–. → pages     Webster, M. A., Georgeson, M. A., and Webster, S. M. (). Neural adjustments to image blur. Nat. Neurosci., :–. → pages  Webster, M. A., Webster, S. M., MacDonald, J., and Bahradwadj, S. R. (a). Adaptation to blur. In B. E. Rogowitz and T. N. 
Pappas, editor, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, volume  of Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, pages –. → pages  Webster, S. M., Webster, M. A., Taylor, J., Jaikumar, J., and Verma, R. (b). Simultaneous blur contrast. In B. E. Rogowitz and T. N. Pappas, editor, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, volume  of Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, pages –. → pages  Wiener, N. (). Extrapolation, Interpolation, and Smoothing of Stationary Time Series. MIT Press, Cambridge, Mass. → pages ,  Witkin, A. (). Scale-space ltering. In Proceedings of the th International Joint Conference on Arti cial Intelligence. → pages  Wuerger, S. M., Owens, H., and Westland, S. (). Blur tolerance for luminance and chromatic stimuli. J Opt Soc Am A Opt Image Sci Vis, :–. → pages  Xiangjun, L. and Jianfei, C. (). Robust transmission of JPEG encoded images over packet loss channels. In IEEE International Conference on Multimedia, pages –. School of Computer Engineering, Nanyang Technological University. → pages  Yang, D., Gamal, A., Fowler, B., and Tian, H. (). IEEE Journal of Solid-State Circuits, :–. → pages ,  Yi, F., Iskander, D. R., and Collins, M. J. (). Estimation of the depth of focus from wavefront measurements. Journal of Vision, (). → pages  Yoshida, A., Ihrke, M., Mantiuk, R., and Seidel, H. (). Brightness of the glare illusion. In Proc. of Aymposium on Applied Perception in Graphics and Visualization, pages –. ACM. → pages  Young, I. T. and van Vliet, L. J. (). Recursive implementation of the gaussian lter. Signal Process., :–. → pages     Yuan, L., Sun, J., Quan, L., and Shum, H.-Y. (). Progressive inter-scale and intra-scale non-blind image deconvolution. ACM Trans. Graph., ::–:. → pages xi, ,  Zhang, X. M. and Wandell, B. (). A spatial extension to CIELAB for digital color image reproductions. In Proceedings of the SID Symposium Digest, volume , pages –. → pages  Zhou, C. and Nayar, S. (). What are good apertures for defocus deblurring? IEEE Computational Photography. → pages , , , , , ,     
