Improving Perceptual Quality of High Dynamic Range Video

by

Maryam Azimi Hashemi

B.A.Sc., Ferdowsi University, 2009
M.A.Sc., University of British Columbia, 2014

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Doctor of Philosophy

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

October 2019

© Maryam Azimi Hashemi, 2019

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:

Improving Perceptual Quality of High Dynamic Range Video

submitted by Maryam Azimi Hashemi in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering.

Examining Committee:
Panos Nasiopoulos, Electrical and Computer Engineering (Supervisor)
Victor Leung, Electrical and Computer Engineering (Supervisory Committee Member)
Shahriar Mirabbasi, Electrical and Computer Engineering (Supervisory Committee Member)
Ronald Rensink, Computer Science (University Examiner)
Alan Wagner, Computer Science (University Examiner)

Abstract

With the real-life viewing experience of High Dynamic Range (HDR) videos and the growing availability of HDR displays and video content, an efficient HDR video delivery pipeline is required for applications such as broadcasting. The existing pipeline was designed for Standard Dynamic Range (SDR) signals and displays. Using this pipeline for HDR content results in visible quality degradation, as HDR bears fundamental differences from SDR technology, such as higher brightness levels and a wider color gamut (WCG). As a result, various HDR delivery pipelines are under development, supporting varying bitrates and visual quality. In this thesis, we improve the visual quality, and hence the quality of experience (QoE), of delivered HDR videos without increasing the bitrate.

First, we investigate the efficiency of the existing transmission pipelines in delivering HDR through an extensive set of subjective experiments. The unprecedented analysis of each pipeline presented in this work, while considering their backward compatibility with SDR displays, provides valuable information for broadcasters to identify the most efficient pipeline in terms of required bitrate and visual quality for viewers. Next, we evaluate the effect that the identified HDR delivery pipeline has on color accuracy. These evaluations help determine which colors need improvement. By considering certain characteristics of the human visual system (HVS), we propose two processing techniques that improve the perceptual fidelity of these colors. The proposed techniques are shown to outperform the existing methods in terms of maintaining the color information of HDR signals, first subjectively through a set of visual evaluations and second objectively by using color difference evaluation metrics. Additionally, for cases where delivered HDR signals are received by an SDR display, we propose two novel color mapping methods that result in the least perceptual color differences compared to the original HDR signal. The proposed color mapping techniques are compatible with the current pipeline infrastructure with minimal implementation cost.
The work presented in this thesis improves the visual quality of transmitted HDR videos, whether viewed directly on HDR displays or, through a mapping process, on SDR displays, while leaving the transmission bitrate unaffected.

Lay Summary

With the recent introduction of High Dynamic Range (HDR) technology, viewers' quality of experience is highly enriched, as HDR matches the brightness and colors seen by the human eye. For the viewer to benefit from HDR's enhanced quality, such information needs to be maintained throughout transmission processes such as digitization, compression and display adaptation.

This thesis focuses on improving the color quality of transmitted HDR videos without increasing the required bandwidth. More specifically, by considering the characteristics and limitations of the human visual system, HDR color pixels are processed such that color information invisible to human eyes is discarded and visible information is treated with more care. Such a perceptual representation of HDR color pixels results in more efficient transmission and higher visual quality.

Preface

This thesis presents research conducted by Maryam Azimi under the guidance of Dr. Panos Nasiopoulos. A list of publications resulting from the work presented in this thesis is provided below.

A version of Chapter 2 is published in [P1], and part of the results of Chapter 2 are taken from the work we published in [P2]. Three contributions were made to MPEG video compression standardization activities [P3]–[P5] based on this chapter. The work presented in Chapter 2 of this thesis was performed by Maryam Azimi and Dr. Ronan Boitard, including data acquisition, algorithm design, and manuscript writing. Maryam Azimi was the main contributor for implementing the proposed algorithms, conducting the experiments, and analyzing the results. Stelios Ploumis, Basak Oztas and Hamidreza Tohidypour helped with running the experiments in [P2]. Dr. Mahsa T. Pourazad and Dr. Panos Nasiopoulos provided technical guidance and editorial input into the manuscript writing.

A version of Chapter 3 appears in a conference publication [P6], and the work presented in it was performed by Maryam Azimi, including data acquisition, algorithm design, and manuscript writing. Dr. Ronan Boitard helped in forming the research concept. Dr. Ronan Boitard, Dr. Mahsa T. Pourazad and Dr. Panos Nasiopoulos all provided technical guidance, as well as editorial input into the manuscript writing, throughout the project.

The work presented in Section 4.1 has been published in [P7], while versions of Section 4.2 were submitted to [P8] and [P9]. An MPEG contribution [P10] was also submitted based on this section. The work in Chapter 4 was primarily performed by Maryam Azimi, including designing and implementing the proposed algorithms, performing all experiments, analyzing the results, and writing the manuscripts. The work was conducted with the guidance and editorial input of Dr. Panos Nasiopoulos and Dr. Mahsa T. Pourazad. Dr. Ronan Boitard helped with the early stages of forming the idea in Chapter 4.

Versions of Sections 5.1 and 5.2 appear in [P11] and [P12], respectively. The work presented in Chapter 5 was primarily performed by Maryam Azimi. Timothee Bronner provided the dataset and consultation. Maryam Azimi was the main contributor for manuscript writing, analyzing the results and performing the experiments. Dr. Ronan Boitard, Dr. Mahsa T. Pourazad and Dr.
Panos Nasiopoulos provided guidance and editorial input into the manuscript writing.

The subjective studies in this work were covered under the UBC Ethics Board (H12-00308).

List of Publications Based on Work Presented in this Thesis

[P1] M. Azimi, R. Boitard, M. T. Pourazad, and P. Nasiopoulos, "Performance evaluation of single layer HDR video transmission pipelines," IEEE Transactions on Consumer Electronics, vol. 63, no. 3, pp. 267-276, August 2017.
[P2] M. Azimi, R. Boitard, B. Oztas, S. Ploumis, H. S. Tohidypour, M. T. Pourazad, and P. Nasiopoulos, "Compression Efficiency of HDR/LDR Content," Seventh International Workshop on Quality of Multimedia Experience (QoMEX), Costa Navarino, Messinia, Greece, May 2015.
[P3] M. Azimi, R. Boitard, M. T. Pourazad, and P. Nasiopoulos, "Evaluation of Backward-compatible HDR Transmission Pipelines," ISO/IEC JTC1/SC29/WG11, MPEG# W0106, San Diego, USA, February 2016.
[P4] M. Azimi, R. Boitard, M. T. Pourazad, and P. Nasiopoulos, "Evaluation of PQ versus tone-mapping for single layer HDR video compression," ISO/IEC JTC1/SC29/WG11, MPEG# m37318, Geneva, Switzerland, October 2015.
[P5] R. Boitard, M. Azimi, M. T. Pourazad, and P. Nasiopoulos, "Evaluation of Scalable versus Single Layer Compression on Consumer HDR Displays," ISO/IEC JTC1/SC29/WG11, MPEG# m37319, Geneva, Switzerland, October 2015.
[P6] M. Azimi, R. Boitard, M. T. Pourazad, and P. Nasiopoulos, "Visual color difference evaluation of standard color pixel representations for high dynamic range video compression," 25th European Signal Processing Conference (EUSIPCO), Kos Island, Greece, August 2017.
[P7] M. Azimi, M. T. Pourazad, and P. Nasiopoulos, "An Efficient HDR Video Compression Scheme based on a Modified Lab Color Space," The Tenth International Conference on Creative Content, Barcelona, Spain, February 2018.
[P8] M. Azimi, P. Nasiopoulos, and M. T. Pourazad, "Improving Color Accuracy of HDR Video Content," accepted to the 26th IEEE International Conference on Electronics, Circuits and Systems, Genova, Italy.
[P9] M. Azimi and M. T. Pourazad, "A Novel Chroma Processing Scheme for Improved Color Accuracy of HDR Video Content," accepted to IEEE Transactions on Broadcasting.
[P10] M. Azimi, M. T. Pourazad, and P. Nasiopoulos, "A Novel Chroma Processing Scheme for Improved Color Accuracy of HDR Video Content," ISO/IEC JTC1/SC29/WG11, MPEG# m50611, Geneva, Switzerland, October 2019.
[P11] M. Azimi, T. Bronner, M. T. Pourazad, and P. Nasiopoulos, "A Color Gamut Mapping Scheme for Backward Compatible UHD Video Distribution," IEEE International Conference on Communications (ICC), Paris, France, May 2017.
[P12] M. Azimi, T. Bronner, R. Boitard, M. T. Pourazad, and P. Nasiopoulos, "A Hybrid Approach for Efficient Color Gamut Mapping," IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, January 2017.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction and Overview
1.1 High Dynamic Range Technology Overview
1.1.1 HDR Perceptual Transfer Functions
1.1.2 Bit-Depth Requirements for HDR
1.1.3 Color Gamut of HDR
1.1.4 Quality Assessment of HDR Videos
1.2 HDR Video Transmission Pipelines
1.3 Color Representations for HDR Video Content
1.3.1 Non-Constant Luminance (NCL) YCbCr
1.3.2 Constant Luminance (CL) YCbCr
1.3.3 Yu'v'
1.3.4 Yu''v''
1.3.5 YDzDx
1.3.6 ICtCp
1.4 HDR Video Chroma Processing
1.4.1 Chroma Adjustment
1.4.2 Chroma Scaling
1.4.3 Chroma Transform
1.4.4 Chroma Reshaping
1.5 Color Gamut Mapping for HDR Video Content
1.6 Thesis Outline
Chapter 2: Performance Evaluation of HDR Video Transmission Pipelines
2.1 Single Layer Transmission Pipelines Evaluations
2.1.1 Experiments Setup
2.1.2 Performance of the HDR10 and SDR10 Pipelines in terms of HDR Output Quality
2.1.2.1 Display
2.1.2.2 Display Adaptation
2.1.2.3 Subjective Tests
2.1.2.4 Viewers
2.1.2.5 Results and Discussions
2.1.3 Performance of the HDR10 and SDR10 Pipelines in terms of SDR Output Quality
2.1.3.1 Display
2.1.3.2 Subjective Tests
2.1.3.3 Viewers
2.1.3.4 Results and Discussions
2.2 Scalable vs. Single Layer Transmission Pipelines Evaluations
2.2.1 Experiments Setup
2.2.2 Display
2.2.3 Display Adaptation
2.2.4 Subjective Tests
2.2.5 Viewers
2.2.6 Results and Discussions
2.3 Conclusions
Chapter 3: Evaluation of Standard Color Pixel Representations for High Dynamic Range Video Transmission
3.1 Color Difference Evaluation Experiments
3.2 Results and Discussions
3.3 Conclusions
Chapter 4: Chroma Processing Schemes for Improved Color Accuracy of Transmitted HDR Video Content
4.1 Chroma Scaling of CIE LAB Color Space for Efficient HDR Video Content Transmission
4.1.1 Proposed Modifications
4.1.2 Experiments Setup
4.1.3 Results and Discussions
4.2 A Novel Chroma Processing Scheme for Improved Color Accuracy of Transmitted HDR Video Content
4.2.1 Proposed Chroma Processing Scheme
4.2.2 Color Perception Evaluation of the Proposed Method
4.2.2.1 Test Setup
4.2.2.2 Results
4.2.3 Compression Performance of the Proposed Chroma Processing: Objective Evaluation
4.2.3.1 Pre-Processing
4.2.3.2 Video Coding
4.2.3.3 Post-Processing
4.2.3.4 Results and Discussions
4.2.4 Subjective Evaluation of the Proposed Method
4.2.4.1 Test Methodology and Procedures
4.2.4.2 Displays and Viewers
4.2.4.3 Results and Discussions
4.3 Conclusions
Chapter 5: Gamut Mapping for Backward Compatible HDR Video Transmission
5.1 A Hybrid Approach for Efficient Color Gamut Mapping
5.1.1 Proposed Hybrid Gamut Mapping Method
5.1.2 Results and Discussions
5.2 A Color Gamut Mapping Scheme for Backward Compatible HDR Video Transmission
5.2.1 Proposed Method
5.2.1.1 Gamut Compression
5.2.1.2 Gamut Expansion
5.2.1.3 Bit-depth Considerations
5.2.1.4 Results and Discussions
5.3 Conclusions
Chapter 6: Conclusions and Future Work
6.1 Summary of the Contributions
6.2 Significance and Potential Applications of the Research
6.3 Future Work
Bibliography

List of Tables

Table 2-1 HDR Video Dataset
Table 2-2 QPs Used for Compression of Each Test Video
Table 2-3 Cropped Area Horizontal Coordinates for Each Test Video
Table 4-1 Bit-rate Savings of the Proposed CIELAB Compared to NCL YCbCr
Table 4-2 Average DE2000 and Percentage of Pixels with DE2000 Value Greater Than One for the Test Images when Represented with YCbCr10, I10CtCp9 and the Proposed I10Ct*Cp*8 with 4:4:4 and 4:2:0 Chroma
Table 4-3 Average Objective Results of I10CtCp9 and I10CtCp*8 Compared to the Anchor of 10-bit YCbCr
Table 4-4 Average DE2000 and Percentage of Pixels with DE2000 Value Greater than One for the First Frame of the Test Videos when Represented with YCbCr10, I10CtCp9 and the Proposed I10Ct*Cp*8 with 4:4:4 and 4:2:0 Chroma Compared to the Originals
Table 5-1 Results of the Hybrid Gamut Mapping Approach versus TWP-CIELAB
Table 5-2 Results of Gamut Mapping from BT.2020 to BT.709 in terms of the DE2000 Metric
Table 5-3 Results of Inverse Gamut Mapping from the Resulting BT.709 to BT.2020 in terms of the DE2000 Metric

List of Figures

Figure 1-1 The overall dynamic range of brightness seen by the human eye, the range seen at an instance (adaptation), and the range supported by SDR cameras and displays, and HDR cameras and displays
Figure 1-2 Process of capturing and displaying physical light through EOTF and OETF
Figure 1-3 Barten's model of the CSF based on spatial frequencies
Figure 1-4 HDR transfer functions performance in mapping luminance to encoded luma
Figure 1-5 Bit-depth quantization for the SDR luminance range at 8 bits with no visible quantization error, and at 7 and 6 bits with visible shifts
Figure 1-6 Performance of PQ, HLG and Gamma encoding with different bit-depths compared to Barten thresholds. Curves above the Barten threshold result in visible quantization errors while the ones below it result in no visible errors
Figure 1-7 Code-word representation of PQ and Gamma encoding using 10 bits
Figure 1-8 Representation of BT.709, BT.2020 and P3 colors in the Yxy color space
Figure 1-9 SDR video transmission pipeline
Figure 1-10 HDR10 video transmission pipeline
Figure 1-11 Simulcast transmission pipeline
Figure 1-12 Scalable transmission pipeline
Figure 1-13 SDR single layer transmission pipeline (SDR10)
Figure 1-14 HDR single layer transmission pipeline
Figure 1-15 Snapshot of the image used for reporting correlations (tone mapped)
Figure 1-16 Correlation of R' with G' and B' using 10 bits
Figure 1-17 Correlation of NCL Y' with Cb and Cr using 10 bits
Figure 1-18 Chroma subsampling in YCbCr 4:2:0
Figure 1-19 Correlation of Y' with u' and v' using 10 bits
Figure 1-20 Correlation of Y' with Dz and Dx using 10 bits
Figure 1-21 Correlation of I with Ct and Cp using 10 bits
Figure 1-22 Gamut mapping of colors A to D to one point on the border of BT.709, resulting in a single color through clipping
Figure 1-23 Gamut mapping through the compression method
Figure 1-24 Selecting mapped color based on Closest and Towards White Point approaches
Figure 2-1 Coding and decoding chain of HDR10 using HEVC, and its display adaptation processes
Figure 2-2 Coding and decoding chain of SDR10 using HEVC, and its display adaptation processes
Figure 2-3 HDR MOS-rate comparison of the HDR10 pipeline using SMPTE ST 2084 with the SDR10 pipeline using the Cam, HE and PTR TMOs for the tested sequences
Figure 2-4 Mapping of luminance to luma for tested sequences using SMPTE ST 2084 for HDR10, and the Cam, HE and PTR TMOs for SDR10, along with the normalized histogram of each content
Figure 2-5 SDR MOS-rate comparison of the HDR10 pipeline using SMPTE ST 2084 followed by the HE and PTR TMOs for display, with the SDR10 pipeline using the HE and PTR TMOs for tested sequences
Figure 2-6 Tone mapped version of the first frame of Market3 encoded using (a) the HDR10 pipeline in YCbCr 10-bit 4:4:4
Figure 2-7 SDR MOS-rate comparison of HDR10 using SMPTE ST 2084 followed by the HE, PTR and WLS TMOs for display, with the SDR10 pipeline using the HE and PTR TMOs, for tested sequences without reference to original video quality
Figure 2-8 HDR MOS-rate comparison of HDR10 using SMPTE ST 2084 with the scalable pipeline for FireEater2, Tibul2, and BalloonFestival
Figure 2-9 HDR MOS-rate comparison of HDR10 using SMPTE ST 2084 with the scalable pipeline for BikeSparklers and AutoWelding
Figure 3-1 Color difference evaluation experiment workflow
Figure 3-2 Colors represented by BT.2020
Figure 3-3 Color errors of 10-bit NCL YCbCr with PQ in terms of DE2000
Figure 3-4 Color errors of 10-bit CL YCbCr with PQ in terms of DE2000
Figure 3-5 Color errors of 10-bit NCL YCbCr with HLG (reference display peak luminance of 4000 cd/m2) in terms of DE2000
Figure 3-6 Color errors of 10-bit CL YCbCr with HLG (reference display peak luminance of 4000 cd/m2) in terms of DE2000
Figure 3-7 Color errors of 10-bit NCL YCbCr with HLG (reference display peak luminance of 1000 cd/m2) in terms of DE2000
Figure 3-8 Color errors of 10-bit CL YCbCr with HLG (reference display peak luminance of 1000 cd/m2) in terms of DE2000
Figure 3-9 Color errors of 10-bit ICtCp with PQ in terms of DE2000
Figure 3-10 Color errors of 10-bit ICtCp with HLG (reference display peak luminance of 4000 cd/m2) in terms of DE2000
Figure 3-11 Color errors of 10-bit ICtCp with HLG (reference display peak luminance of 1000 cd/m2) in terms of DE2000
Figure 4-1 Snapshots of the first frames of HDR test video sequences (tone-mapped version): (a) FireEater2, (b) Market3, (c) BalloonFestival, and (d) SunRise
Figure 4-2 Pre/post processing steps of the proposed modified CIELAB for HDR video compression
Figure 4-3 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjusted YCbCr, Y'D'zD'x and ICtCp in terms of DE100 (dB), OSNR (dB) and tPSNR (dB) for FireEater2
Figure 4-4 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjusted YCbCr, Y'D'zD'x and ICtCp in terms of DE100 (dB), OSNR (dB) and tPSNR (dB) for Market3
Figure 4-5 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjusted YCbCr, Y'D'zD'x and ICtCp in terms of DE100 (dB), OSNR (dB) and tPSNR (dB) for SunRise
Figure 4-6 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjusted YCbCr, Y'D'zD'x and ICtCp in terms of DE100 (dB), OSNR (dB) and tPSNR (dB) for BalloonFestival
Figure 4-7 Workflow for calculating color errors in terms of DE2000 due to ICtCp quantization on BT.2020 sampled colors
Figure 4-8 Color errors generated due to bit-depth quantization for (a) 10-bit YCbCr, (b) 10-bit ICtCp, (c) I10CtCp9, (d) I10CtCp8, (e) the proposed I10Ct*Cp*9, (f) the proposed I10Ct*Cp*8, and (g) the proposed I10Ct*Cp*7 over the BT.2020 gamut using the error bars on the right. Numbers indicate the used bit-depth
Figure 4-9 Original and transferred chroma code-words with 10 bits
Figure 4-10 Comparison of the chroma channels' code-word distribution with 10 bits using the original and the
Figure 4-11 (a) original test images cut-outs (tone-mapped), (b) color errors generated by I10CtCp9 and (c) color errors generated by I10Ct*Cp*8 using the error bar in
Figure 4-12 End-to-end workflow with the proposed chroma processing scheme shown in purple dotted box
Figure 4-13 Subjective test results of the color difference perception between the images generated by the proposed method and the original images
Figure 4-14 (a) the first frame of Market3 (tone mapped) and the DE2000 value distribution of the same frame represented with (b) 10-bit YCbCr, (c) I10CtCp9 and (d) the proposed I10Ct*Cp*8, shown using the error bar similar to Figure 4-8
Figure 4-15 MOS-bitrate comparison of I10CtCp9 and ICt*Cp*8 using (a) 4:4:4 and (b) 4:2:0 chroma formats for tested sequences
Figure 5-1 Visual comparison of Closest-CIELAB and TWP-CIELAB approaches for gamut mapping from a larger gamut to a smaller one
Figure 5-2 Current HD/SD distribution pipeline with 8-bit BT.709 support
Figure 5-3 Future distribution pipeline with 10-bit BT.2020 support
Figure 5-4 Invertible gamut mapping compatible with the current distribution pipeline
Figure 5-5 Effect of scaling factor, α, on the size of the inner gamut inside the BT.709 gamut
Figure 5-6 Relationship between the larger (BT.2020), smaller (BT.709) and inner (scaled by α) gamut distances and the original and mapped color

List of Abbreviations

ARIB    Association of Radio Industries and Businesses
AVC    Advanced Video Coding
BD    Bjøntegaard Delta
BPP    Bits per Pixel
CfE    Call for Evidence
CIE    International Commission on Illumination
CL    Constant Luminance
CMOS    Complementary Metal-Oxide-Semiconductor
CRT    Cathode-Ray Tube
CSF    Contrast Sensitivity Function
dB    Decibel
DCI    Digital Cinema Initiatives
DE    Delta E (Color Difference)
DSIS    Double-Stimulus Impairment Scale
DSQS    Double-Stimulus Quality Scale
EOTF    Electro-Optical Transfer Function
FPS    Frames per Second
GB    Gigabyte
HD    High Definition
HDMI    High-Definition Multimedia Interface
HDR    High Dynamic Range
HE    Histogram Equalization
HEVC    High Efficiency Video Coding
HLG    Hybrid Log-Gamma
HVS    Human Visual System
IEEE    Institute of Electrical and Electronics Engineers
ITMO    Inverse Tone Mapping Operator
ITU-T    International Telecommunication Union - Telecommunication Standardization Sector
JND    Just Noticeable Difference
LCD    Liquid Crystal Display
LED    Light Emitting Diode
LOG    Logarithm
LUT    Look-up Table
MBPS    Megabytes per Second
MOS    Mean Opinion Score
MPEG    Moving Picture Experts Group
NCL    Non-Constant Luminance
OETF    Opto-Electronic Transfer Function
PQ    Perceptual Quantizer
PSNR    Peak Signal-to-Noise Ratio
PTF    Perceptual Transfer Function
PTR    Photographic Tone Reproduction
QoE    Quality of Experience
QoS    Quality of Service
QP    Quantization Parameter
SDR    Standard Dynamic Range
SMPTE    Society of Motion Picture and Television Engineers
SS    Single-Stimulus
STB    Set-top Box
TF    Transfer Function
TMO    Tone Mapping Operator
TV    Television
TWP    Towards White Point
UHD    Ultra High Definition
VDP    Visual Difference Predictor
WCG    Wide Color Gamut
WLS    Weighted Least Squares

Acknowledgements

First and foremost, I would like to express my most sincere gratitude to my supervisor and mentor, Dr.
Panos Nasiopoulos, for his continuous support throughout my Ph.D. studies. I thank Panos for his encouragement, patience, enthusiasm, immense knowledge in multimedia, and especially for teaching me how to think. He has always been a great mentor, a role model, and a friend.

I would also like to extend my gratitude to Dr. Mahsa Pourazad for her constant guidance, help, feedback, support and, above all, her friendship during my studies. She constantly inspired me to pursue my goals with hard work and dedication.

This thesis would not have been possible without Dr. Ronan Boitard's technical guidance, feedback and support during its different stages. I thank Ronan for his brilliant ideas and his encouragement.

I would also like to thank my Ph.D. committee members, Dr. Victor Leung, Dr. Shahriar Mirabbasi, and Dr. John Madden, for investing their time and energy to provide guidance and help throughout my graduate studies and on my thesis. Their encouragement and insightful comments have been extremely valuable to me.

I also thank my other colleagues at the UBC Digital Multimedia Lab: Pedram Mohammadi, Anahita Shojai, Joseph Khoury, Stelios Ploumis, Fujun Xie, Ilya Ganelin, Hamid Reza Tohidypour, Ahmad Khaldieh, Abrar Wafa, Nusrat Mehajabin, Timothee Bronner and Doris Xiang. I thank you all for creating a vibrant and supportive research environment, for the stimulating discussions, for your friendship, and for all the fun we have had together.

I would also like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC), the Electrical and Computer Engineering department and the University of British Columbia for providing financial support in the form of research grants and tuition awards.

Finally, I thank my mother, my husband Nima, and my siblings Mojgan, Mehran and Mehrnoush for their constant love, kindness, support, and encouragement.

Dedication

To my father.

"A part of good science is to see what everyone else can see but think what no one else has ever said." - Amos Tversky

Chapter 1: Introduction and Overview

High Dynamic Range (HDR) video technology is capable of capturing, processing and displaying a higher dynamic range than conventional Standard Dynamic Range (SDR) technology. Dynamic range refers to the ratio between the maximum and minimum values of measured light. SDR technology's dynamic range reaches only two to three orders of magnitude, while HDR covers four to five orders of magnitude [1], matching the capabilities of the human eye. Since HDR attains what the Human Visual System (HVS) is capable of perceiving, an HDR video viewed on an HDR display appears as real as the original scene to a human observer. To benefit from HDR's real-life quality, the capturing, transmission and display technologies all need to support HDR. The focus of this thesis is to minimize the effects of quantization, chroma subsampling and compression on the overall visual quality of transmitted HDR content.

A transmission pipeline, in general, consists of three parts: pre-processing, compression and post-processing. Pre-processing involves digitizing the captured physical light values and representing the digital color values using a proper color space [2]. Compression involves taking advantage of the spatial and temporal similarities of the frames to reduce the signal's size and required bitrate for transmission with minimal or no effect on visual quality [3].
The post-processing step prepares the decompressed signal for display by conforming it to the display's expectations [4]. It is worth noting that conventional video transmission pipelines were designed for SDR technology and have been the norm for more than twenty years. With the introduction of HDR technology, the traditional transmission pipelines need to be re-evaluated and perhaps improved in terms of their efficiency in delivering HDR content. Moreover, as is the case for any other emerging technology, history has shown us that the shift to the newer technology does not happen overnight; rather, it takes a long time for end users to adopt it. As a result, HDR and SDR technologies are expected to co-exist for the foreseeable future, forcing the new transmission pipeline to support backward compatibility with SDR video content and displays. Additionally, the backward compatible HDR pipeline has to be at least as efficient as the current SDR pipeline for delivering SDR videos, without compromising their quality.

In this thesis, we propose novel pre-processing and post-processing techniques that improve the perceptual quality of the transmitted HDR signal without affecting the required bitrate, while also addressing the backward compatibility requirement. In Chapter 2, we identify the most efficient backward compatible transmission pipeline for HDR/SDR delivery and provide guidelines on how one can achieve the best possible HDR perceptual quality. In Chapter 3, we identify the most efficient color representation for HDR content delivery that maintains color accuracy and is suitable for compression. Chapter 4 presents two color processing methods that improve the color accuracy of transmitted HDR content without increasing the required bitrate. In Chapter 5, we present two post-processing display adaptation techniques that address backward compatibility of HDR content with SDR displays while accurately preserving the original HDR colors.

The following sections of this chapter offer in-depth background information on HDR technology as well as literature reviews of the topics covered in each of the following chapters. Section 1.1 introduces different aspects of HDR technology, including HDR transfer functions, bit-depth considerations, the wider color gamut, and HDR quality metrics. Section 1.2 discusses existing HDR transmission pipelines. Literature reviews on HDR color representation and chroma processing are provided in Sections 1.3 and 1.4, respectively. Section 1.5 presents existing color mapping methods designed to address backward compatibility of HDR content with SDR displays, and Section 1.6 gives a summary of the scientific contributions of this thesis.

1.1 High Dynamic Range Technology Overview

Our eyes are the sensors with which the HVS probes environmental light. One metric for measuring light visible to human eyes is 'luminance'. Luminance is the luminous energy per unit of time (i.e., luminous power) radiating through a unit solid angle per unit of projected area [5]. Luminance is measured in 'nits' or candela per square meter (cd/m2). The candela is the unit for measuring luminous intensity, which is luminous power per unit solid angle. In this thesis we refer to brightness and luminance interchangeably.

The HVS can perceive 14 orders of magnitude of brightness if it has time to adapt. Without adaptation time, it can perceive up to five orders of magnitude (see Figure 1-1) [1].
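Expressed in orders of magnitude, dynamic range is simply the base-10 logarithm of the ratio between the maximum and minimum luminance. The short sketch below (in Python, using the representative SDR and HDR display ranges quoted in the next paragraph) illustrates the figures given above:

```python
import math

def orders_of_magnitude(l_min: float, l_max: float) -> float:
    """Dynamic range expressed as orders of magnitude of the max/min luminance ratio."""
    return math.log10(l_max / l_min)

# Representative display ranges quoted in this chapter (cd/m2):
print(orders_of_magnitude(0.1, 100.0))    # SDR display: 3.0 orders of magnitude
print(orders_of_magnitude(0.01, 1000.0))  # HDR display: 5.0 orders of magnitude
```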
With the introduction of HDR displays [6] that support four to five orders of magnitude of brightness, 1000 times more than their SDR counterparts, the quality of experience (QoE) of digital media drastically improves. Current HDR displays can cover a brightness range of 0.01 to 1000 nits, while the range is usually 0.1 to 100 nits for SDR displays. Therefore, darker black and brighter white shades, as well as richer colors, can be displayed using HDR systems.

Figure 1-1 The overall dynamic range of brightness seen by the human eye, the range seen at an instance (adaptation), and the range supported by SDR cameras and displays, and HDR cameras and displays

Other than the display technology, HDR's life-like viewing quality is also attributable to its capturing process. HDR images/videos are either obtained by capturing several exposures of a scene that are fused together to create an HDR version of the scene [7], or by using a modern CMOS image sensor that can capture a higher dynamic range image/video with a single exposure [8][9].

To store the captured HDR physical absolute luminance values (in cd/m2 or nits), a floating-point representation is used, requiring 32 bits per color channel if a Red (R), Green (G), and Blue (B) representation is used [10]. That equals 96 bits per pixel (bpp), as three channels are involved [11]. This is four times the amount needed to store an SDR pixel, which contains relative luminance values, and is therefore not an efficient representation for transmission. By taking advantage of the correlation between the stored color channels (R, G and B) and certain limitations of the HVS, the OpenEXR file format uses the IEEE 754 16-bit float (half-float) representation [12], covering around 10 orders of magnitude while using 48 bpp [13]. The values stored in an OpenEXR file are still absolute light values. For transmission of HDR signals, these values are transferred to relative light values so that they can efficiently be digitized to integer values for further compression. This process is carried out through 'perceptual transfer functions', as explained below.
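To put the storage figures above in perspective, here is a minimal sketch (assuming NumPy and, purely for illustration, a 3840×2160 frame) of the per-pixel and per-frame cost of the 96 bpp float and 48 bpp half-float representations:

```python
import numpy as np

# One RGB pixel holding absolute linear light values (cd/m2).
pixel_f32 = np.array([120.5, 340.2, 12.8], dtype=np.float32)  # 3 x 32 bits = 96 bpp
pixel_f16 = pixel_f32.astype(np.float16)                      # 3 x 16 bits = 48 bpp (half-float)

print(pixel_f32.nbytes * 8, "bits per pixel")  # 96
print(pixel_f16.nbytes * 8, "bits per pixel")  # 48

# Uncompressed cost of a single 3840x2160 frame, in MiB:
w, h = 3840, 2160
print(w * h * pixel_f32.nbytes / 2**20)  # ~94.9 MiB per frame
print(w * h * pixel_f16.nbytes / 2**20)  # ~47.5 MiB per frame
```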
1.1.1 HDR Perceptual Transfer Functions

HDR floating-point pixel values correspond to absolute light intensities (luminance, in cd/m2 or nits) and cover the entire spectrum of light. To transfer these absolute values into relative light values, certain characteristics of the HVS are considered. The HVS does not perceive light in a linear fashion. Therefore, a non-linear function known as 'perceptual encoding', also referred to as a 'perceptual transfer function' or simply a 'transfer function', is used [14]. The perceptual transfer function is applied to the physical (linear) light values so that information invisible to human eyes is discarded without affecting perceptual quality. The transferred non-linear pixel values of each channel are then scaled to the maximum number of code-words representable by the available number of bits, e.g. 256 values if using 8 bits. At the display side, the transfer function is inverted and applied to the relative light values to obtain absolute light values again, so that they can be emitted by the display device. In the literature and in standard documents, these transfer functions are also referred to as Opto-Electronic Transfer Functions (OETFs), as they transfer optical physical light to digital values. The inverse of these functions, used inside TV sets to transfer the electric signals back into optical ones, is referred to as the Electro-Optical Transfer Function (EOTF). The combination of applying an OETF and then its inverse (EOTF) is referred to as the Opto-Optical Transfer Function (OOTF). In this thesis, we refer to OETFs as transfer functions or perceptual encoding functions, and to EOTFs as inverse transfer functions or inverse perceptual encoding functions. Figure 1-2 summarizes the process of transferring light from a scene in cameras and inverse-transferring it in TVs.

Figure 1-2 Process of capturing and displaying physical light through EOTF and OETF

In SDR technology, the non-linear transfer function was essentially the inverse of the gamma correction applied to the signal on Cathode Ray Tube (CRT) monitors [15][16]. The purpose of gamma correction on CRT monitors was to reduce camera noise and control the current of the electron beam [17]. Gamma correction, or the gamma function, is essentially a power function:

Y = Y'^γ        (1)

The value of γ is between 1.8 and 2.2. The gamma function was standardized in 2011 as BT.1886 [18]. Equation (1) is the EOTF, and its inverse (power of 1/γ) is essentially the perceptual encoding function. Note that Y refers to linear luminance and Y' is the encoded non-linear luminance, which is referred to as 'luma'. Throughout this thesis, whenever a prime is used on a channel, it means that the channel information is perceptually encoded and hence non-linear.

CRT displays are capable of handling luminance values only up to 100 cd/m2. For luminance values up to 100 cd/m2, the inverse of the gamma function correlates well with the HVS response. For the extended brightness range of HDR technology, however, the conventional gamma function does not correlate well with the HVS response. To that end, perceptual encoding functions that consider the HVS response at higher luminance ranges need to be designed.

The first of these functions was proposed in [19] for luminance values up to 1.84 × 10^19 nits and down to 5.44 × 10^-20 nits. This function is a logarithmic function in base 2 and is referred to as log-luminance encoding. A log function correlates well with the HVS response to light, especially at brighter luminance values. This transfer function is used in the 32-bit LogLuv pixel encoding of the TIFF format [20]. However, a logarithmic function assigns more than enough code-words in bright areas, while it does not assign enough code-words in dark areas.

Another HDR transfer function, proposed in [21], combines a gamma correction for dark luminance values with an S-Log correction for bright ones. Using this function, dark values are given more code-words through the gamma segment, while the S-Log segment avoids highlight clipping. This function has the general form:

Y' = a × ln(Y + b) + c,   if Y ≥ 1
Y' = Y^(1/γ),             if Y < 1        (2)

The values of a, b, and c depend on the γ value used and can be found in [21]. A similar Hybrid Log-Gamma (HLG) function was proposed in [22] and was later standardized by the Association of Radio Industries and Businesses (ARIB) as ARIB STD-B67 [23]. HLG is designed for the range of luminance values supported by current reference grading displays, mainly from 0.01 to 1000 cd/m2 (or 4000 cd/m2). Its exact shape therefore varies depending on the maximum peak luminance of the graded content. Since the HLG function uses a gamma function for darker values, it is compatible with legacy devices that can only support a gamma function as the EOTF, such as conventional CRT displays.
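To make the hybrid structure of equation (2) concrete, the sketch below implements a generic log-gamma OETF of that form. The constants a, b and γ are illustrative placeholders rather than the standardized values (which the text above defers to the cited references); c is derived so that the two segments meet at Y = 1:

```python
import math

def log_gamma_oetf(y: float, a: float = 0.25, b: float = 1.0, gamma: float = 2.0) -> float:
    """Generic hybrid OETF in the form of equation (2): a gamma segment for
    dark values (Y < 1) and a logarithmic segment for bright values (Y >= 1).
    a, b and gamma are placeholder parameters, not standardized constants."""
    c = 1.0 - a * math.log(1.0 + b)  # continuity: both segments give 1 at Y = 1
    if y < 1.0:
        return y ** (1.0 / gamma)    # gamma segment: more code-words for darks
    return a * math.log(y + b) + c   # log segment: avoids highlight clipping

# In practice the encoded values are subsequently scaled to the available code-words.
for y in (0.05, 0.5, 1.0, 10.0, 100.0):
    print(f"Y = {y:7.2f} -> Y' = {log_gamma_oetf(y):.4f}")
```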
Another way to derive an optimized transfer function for HDR color pixels is to determine how many "just noticeable difference" (JND) steps are perceived by human vision and to fit a function to those JNDs. A property of the HVS is that the same contrast is not perceived identically by our eyes at different spatial frequencies and/or background luminance levels [24]. Several experiments have been conducted to measure the contrast detection threshold for different spatial frequencies and background luminance values [25][26]. These thresholds are known as contrast sensitivity functions (CSFs) [27]. Figure 1-3 illustrates one of the CSF models, known as Barten's model of the CSF, for several background luminance values (La) [25].

Figure 1-3 Barten's model of the CSF based on spatial frequencies

In [28], the minimum detectable contrast, in other words the just noticeable difference (JND), is derived by tracking the peaks of contrast sensitivity at different luminance levels. A function fitted to these JNDs serves as the transfer function:

Y' = ((c1 + c2 × Y^m1) / (1 + c3 × Y^m1))^m2        (3)

where
m1 = 2610/4096 × 1/4 = 0.1593017578125
m2 = 2523/4096 × 128 = 78.84375
c1 = c3 − c2 + 1 = 3424/4096 = 0.8359375
c2 = 2413/4096 × 32 = 18.8515625
c3 = 2392/4096 × 32 = 18.6875

Note that Y in the above formula is normalized to 1 by dividing it by 10000 (the maximum luminance, in nits, supported by current HDR systems); the corresponding EOTF is simply the inverse of (3). The above transfer function is commonly referred to as the Perceptual Quantizer (PQ), as it considers the contrast sensitivity characteristic of the HVS. PQ has been standardized by the Society of Motion Picture and Television Engineers (SMPTE) as SMPTE ST 2084 [29]. PQ is designed for luminance values ranging from 0.005 to 10,000 cd/m2 and, unlike HLG, its code-word allocation does not change according to the peak luminance of the content. Linear physical light can also be transferred using a look-up table (LUT) based on a CSF, such as the ones in [29][30].

Another transfer function for HDR luminance encoding based on Barten's CSF was proposed in [31] and has the form:

Y' = 2305.9 × ln[Y^(1/γ) × (e^m − 1) + 1] / m        (4)

In (4), m = 4.3365 and γ = 2.0676. As in (3), Y is normalized to 1 by dividing it by 10000. Figure 1-4 compares these transfer functions (TFs) in mapping luminance values up to 10000 cd/m2 to encoded luma.

Figure 1-4 HDR transfer functions performance in mapping luminance to encoded luma
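As a concrete illustration of equation (3), the sketch below (a minimal Python rendering of the PQ encoding, using the constants listed above) maps absolute luminance to a normalized luma value in [0, 1]:

```python
def pq_oetf(luminance_nits: float) -> float:
    """PQ (SMPTE ST 2084) encoding of equation (3): absolute luminance in
    cd/m2 -> normalized luma Y' in [0, 1]."""
    m1 = 2610 / 4096 / 4    # 0.1593017578125
    m2 = 2523 / 4096 * 128  # 78.84375
    c1 = 3424 / 4096        # 0.8359375 = c3 - c2 + 1
    c2 = 2413 / 4096 * 32   # 18.8515625
    c3 = 2392 / 4096 * 32   # 18.6875
    y = max(luminance_nits, 0.0) / 10000.0  # normalize by 10000 nits
    y_m1 = y ** m1
    return ((c1 + c2 * y_m1) / (1 + c3 * y_m1)) ** m2

print(f"{pq_oetf(1.0):.3f}")      # ~0.150: darks receive a large share of the range
print(f"{pq_oetf(100.0):.3f}")    # ~0.508: SDR peak white lands near mid-range
print(f"{pq_oetf(10000.0):.3f}")  # 1.000: PQ peak luminance
```

Note how roughly 15% of the normalized output range, and hence of the full-range code-words, lies below 1 cd/m2, which matches the PQ code-word distribution discussed in the next section.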
1.1.2 Bit-Depth Requirements for HDR

The transfer functions described in Section 1.1.1 are each associated with a minimum bit-depth, meaning the transfer function will not produce visible quantization errors if at least that bit-depth is used to transfer floating-point values to integers. Using a bit-depth lower than this threshold results in visible quantization artifacts. Using a bit-depth higher than the threshold results in invisible transitions between the code-words, but also wastes those code-words, as no visible information is represented by the additional codes.

As an example, gamma encoding needs at least 8 bits to represent the SDR luminance values (0–100 cd/m2) without visible errors. Using a bit-depth lower than 8, however, results in visible quantization errors, as depicted in Figure 1-5. Using more than 8 bits to represent SDR values is visually identical to using 8 bits.

Figure 1-5 Bit-depth quantization for the SDR luminance range at 8 bits with no visible quantization error, and at 7 and 6 bits with visible shifts

Figure 1-6 shows, as a black curve, the minimum contrast steps at different luminance levels derived from Barten's contrast sensitivity function to identify the JNDs. If a transfer function behaves similarly to the Barten threshold curve shown in Figure 1-6, then it is the most perceptually efficient one. Curves below the Barten threshold are perceptually errorless but waste code-words, while curves above the Barten threshold produce perceptual errors due to quantization.

Figure 1-6 also depicts the performance of gamma encoding, PQ and HLG with different bit-depth values compared to the Barten threshold. As can be observed, the performance of the gamma function with 12 bits is similar to PQ with 10 bits up to a luminance value of 0.1 cd/m2. For luminance values above 10 cd/m2, the 12-bit gamma curve lies far below the Barten threshold, wasting code-words. Note that the gamma and HLG curves in Figure 1-6 are shown for maximum luminances of 1000 and 10000 cd/m2. As observed, 11-bit PQ performs close to the Barten threshold curve. It can be concluded that PQ with 11 bits (with a maximum luminance of 10000 cd/m2) is perceptually very similar to gamma with 12 bits (with a maximum luminance of 1000 cd/m2) in representing HDR luminance values, while a whole bit is saved in the case of PQ. When the maximum luminance is 10000 cd/m2, gamma encoding is far from the Barten threshold and hence inefficient even when 12 bits are used (see the yellow curve).

Figure 1-6 Performance of PQ, HLG and Gamma encoding with different bit-depths compared to Barten thresholds. Curves above the Barten threshold result in visible quantization errors while the ones below it result in no visible errors

Figure 1-7 shows how 10-bit code-words are distributed across the HDR luminance range (up to 10000 cd/m2) using the gamma function and PQ. Recall from Figure 1-3 that the HVS is more sensitive to changes in dark areas than to those in bright areas. This characteristic is reflected in how PQ assigns code-words: as much as 15% of the code-words are assigned to luminance levels lower than 1 cd/m2. In contrast, gamma encoding assigns only 1% of its code-words to this range, and hence visible quantization artifacts will appear.

Figure 1-7 Code-word representation of PQ and Gamma encoding using 10 bits

For transmission purposes, it is common practice that, instead of scaling the perceptually encoded signal to the full 0 to 2^bit-depth range of quantized values, an offset is added at the beginning and subtracted from the end of the range:

Y'_quantized = ((Y' × 219) + 16) × 2^(BitDepth − 8)        (5)

Any value lower than the minimum or higher than the maximum code-word is clipped by consumer displays; in the case of 10 bits, the legal range is 64 to 940 for luma (and 64 to 960 for the chroma channels, which use a scaling of 224 instead of 219). This is part of Recommendation ITU-R BT.1361: Worldwide unified colorimetry and related characteristics of future television and imaging systems [31].
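A minimal sketch of equation (5) and the associated clipping, assuming a perceptually encoded luma value Y' normalized to [0, 1]:

```python
def quantize_limited_range(y_prime: float, bit_depth: int = 10) -> int:
    """Limited ('video') range quantization of equation (5): normalized luma
    Y' in [0, 1] -> integer code-word."""
    scale = 2 ** (bit_depth - 8)
    code = round((y_prime * 219.0 + 16.0) * scale)
    # Code-words outside the legal range are clipped by consumer displays;
    # for 10-bit luma this range is [64, 940].
    return min(max(code, 16 * scale), 235 * scale)

print(quantize_limited_range(0.0))    # 64: minimum legal 10-bit code-word
print(quantize_limited_range(0.508))  # 509: PQ-encoded 100 cd/m2 (see above)
print(quantize_limited_range(1.0))    # 940: maximum legal 10-bit luma code-word
```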
The signals quantized using the BT.1361 standard are referred to as 'Limited Range' or 'Video Range', while the ones quantized only by scaling (from 0 to 2^BitDepth − 1) are referred to as 'Full Range' and are mainly used in reference monitors. The current video transmission infrastructure, including consumer displays and compression standards, supports 10 bits [33], while 12 bits are supported only by some professional grading monitors.

Figure 1-7 Code-word representation of PQ and Gamma encoding using 10 bits

1.1.3 Color Gamut of HDR
Another advantage of HDR technology is its wider color gamut. A gamut is the subset of the visible colors that a display can show or a camera can record. In SDR, BT.709 [34] is the color gamut that consumer displays and cameras can represent and capture, as shown in Figure 1-8. BT.709 corresponds to 35.9% of the full visible gamut that human eyes can perceive (the horse-shoe shape in Figure 1-8). Comparatively, the color gamut used by cinema theater projectors, described by the Digital Cinema Initiatives (DCI) P3 standard [35], covers 53.6% of the full visible color gamut. The wider BT.2020 gamut [36] is the new standard whose colors HDR technology can support, corresponding to 75.8% of the full visible gamut (see Figure 1-8). It is worth mentioning that the BT.2020 gamut is also supported by Ultra High Definition (UHD) displays. Since BT.2020 encompasses almost twice the number of colors compared to BT.709, a larger number of bits is required to represent its colors. That is why 10 and 12 bits are the standard bit-depth values for representing the BT.2020 color gamut, while for BT.709, 8 bits are deemed enough.

Figure 1-8 Representation of BT.709, BT.2020 and P3 colors in the Yxy color space

1.1.4 Quality Assessment of HDR videos
The capturing, processing, compression, transmission, and display processes of digital media may each introduce distortions or artifacts to the signal, affecting the final visual quality of the content. That is why quality assessment is essential for the development and evaluation of digital media. Quality assessment methods are generally categorized into objective metrics and subjective evaluations. The latter, which refers to viewers' perceptual experience with the digital media, is the ideal assessment measure and is also referred to as the Quality of Experience (QoE) of the digital media. Although subjective measures are very effective, their collection is time consuming. To overcome this issue with subjective evaluations, objective metrics are designed. They are essentially mathematical functions that model the observers' visual preference. For videos, objective metrics are usually calculated per frame and their average is reported as a single score or a signal-to-noise ratio. Objective metrics have the advantage of being fast and automated but do not necessarily reflect subjective test results [37]. In this thesis, we rely on both subjective and objective evaluation results to benefit from both methods' advantages.

The most traditional and widely used SDR objective metric, due to its simplicity, is the peak signal-to-noise ratio (PSNR). In HDR, the traditional PSNR formula is applied on signals perceptually encoded using a transfer function and hence is referred to as tPSNR [38]. Usually the tPSNR is applied on XYZ channels (instead of the conventional YCbCr) [39].
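The idea behind tPSNR can be sketched as below. This is a simplified illustration (the exact definition used in the MPEG CfE differs in details such as the choice and combination of transfer functions), reusing pq_encode from the earlier sketch:

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Conventional PSNR between two arrays."""
    mse = np.mean((ref - test) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

def tpsnr_xyz(ref_xyz, test_xyz):
    """Simplified tPSNR: PSNR on perceptually (here PQ-) encoded X, Y and Z
    channels, averaged; inputs of shape (..., 3) in cd/m2."""
    return np.mean([psnr(pq_encode(ref_xyz[..., c] / 10000),
                         pq_encode(test_xyz[..., c] / 10000)) for c in range(3)])
```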
Another way to apply the conventional PSNR to HDR color pixels is to calculate the PSNR on several SDR exposures and report their average as the HDR PSNR value [39]. This method is referred to as multi-exposure PSNR, or mPSNR for short. The mPSNR measure works on linear RGB pixel values.

Since color preservation and color perception are also of great importance to HDR's QoE, a color difference predictor metric known as DE2000 is also used to evaluate the color accuracy of the transmitted HDR signal [41]. DE2000 considers how the HVS perceives color and provides a numerical value representing the difference between two colors. The Just Noticeable Difference (JND) threshold in terms of DE2000 is one [42]. In other words, any color difference with a DE2000 value less than one is not perceptible by human eyes. Moreover, the larger the value of the DE2000 metric, the more visually different the two colors are. The DE2000 metric requires a reference white point and is calculated on CIE LAB color values. DE2000 with a reference white point of 100 cd/m2 and 1000 cd/m2 is usually referred to as DE00 (or DE100) and DE1000, respectively. For transmission applications, the DE2000 metric is usually reported in decibels (dB) using the PSNR formula as below:

PSNR_DE (dB) = 10 × log10(10000 / DE2000)          (6)

HDR-VDP-2 is another, more complex objective metric designed based on Daly's visual difference predictor (VDP) [43]. HDR-VDP-2 mimics the human visual system and is designed to predict the visibility of changes caused by artifacts in the test image [44]. The input of the metric includes the reference image, the test image, and some other parameters such as the maximum physical luminance of the display, the angular resolution of the image, and other attributes of the viewing environment. The output of the metric is a probability map that determines the probability of a human observer detecting dissimilarities between the reference and test image in each image region. Then, by using a pooling strategy, the probability map is converted into a quality score between 0 and 100, where 0 represents the lowest quality and 100 stands for the highest quality, meaning the reference and test images are identical in terms of quality.

Despite the availability of objective quality metrics, subjective evaluations are still the most reliable measure of HDR QoE. Typically, in a subjective test, video sequences are shown to a group of viewers and the viewers' ratings on the quality of each video sequence are analyzed. Many subjective video quality methodologies as well as general viewing conditions are described in the ITU-R Recommendation BT.500 [45]. Depending on the objective of the test and the type of the visual content, the details of the subjective test procedures may vary greatly. If a reference video is available, then double stimulus methods are used, in which either the impairments of the test video are evaluated with respect to the reference video, referred to as the Double-Stimulus Impairment Scale (DSIS) method, or the quality of both the test and reference videos is rated, referred to as the Double-Stimulus Quality-Scale (DSQS), for which a continuous or a discrete grading scale can be used. Whenever a reference is not available, Single-Stimulus (SS) methods are used. More details on these tests can be found in [45].

1.2 HDR Video Transmission Pipelines
The existing video transmission pipeline is intended and optimized for SDR videos.
Gamma encoding, the BT.709 color gamut and a limited dynamic range are some characteristics of conventional SDR videos that the pipeline is designed around. This pipeline is shown in Figure 1-9. With the introduction of HDR, its unique characteristics, such as its new perceptual transfer function, higher bit-depth and the wider color gamut of BT.2020, should also be supported by the transmission pipelines.

The Moving Picture Experts Group (MPEG) is the working group in charge of creating standards for video and audio compression and transmission [46]. With the growing interest in distributing HDR content, MPEG investigated whether any changes to the state-of-the-art High Efficiency Video Coding (HEVC) standard [47] [48] are required for HDR video compression. It was shown that by changing the perceptual encoding in the transmission pipeline to an HDR function and the color primaries to BT.2020, the current pipeline can deliver HDR videos with acceptable bitrate and quality [49]. The transfer function and the color gamut of the delivered signal are included in metadata referred to as Supplemental Enhancement Information (SEI) [50] and Video Usability Information (VUI) [49] messages, so that the HDR display can address the signal in the appropriate way (see Figure 1-10). This pipeline is referred to as HDR10 as it uses the HEVC Main10 profile [52].

Figure 1-9 SDR video transmission pipeline

For applications such as Blu-ray [53], HDR10 is an acceptable solution. However, for broadcasting applications, where the display at the viewer side could be SDR or HDR, a practical pipeline should support both HDR and SDR, as these two technologies will co-exist; backward compatibility of the pipeline is therefore critical for its industry adoption. Backward compatibility can refer to any part of the transmission pipeline: acquisition, distribution channel, compression, or display. In this thesis, when we discuss backward compatibility, we imply it in terms of the pipeline addressing SDR displays. That is why, in February 2015, MPEG issued a Call for Evidence (CfE) to look for solutions that improve the HDR/WCG video coding performance over HEVC Main 10 [37].

Backward compatible HDR transmission pipelines can be classified into three categories: simulcast, scalable, and single layer. The simulcast pipeline consists of sending two streams of HDR and SDR using two different transmission channels. This pipeline is shown in Figure 1-11. At the viewer side, the signal that is compatible with the available display is selected and the other one is discarded. This is usually preferred when there is no limitation on the bandwidth, e.g., in Blu-ray applications. For streaming and broadcast applications this approach is not desirable, as almost twice the bandwidth of a single stream is needed.

Figure 1-10 HDR10 video transmission pipeline

In the scalable pipeline, a base layer represents the signal with the smallest color gamut and dynamic range (SDR), while several enhancement layers represent additional information corresponding to displays with increased color gamut and dynamic range. The base and enhancement layers are transmitted in a single stream. The scalable pipeline is illustrated in Figure 1-12. This pipeline is expected to be more efficient than the simulcast one since, instead of sending two different streams separately, the information common to the SDR and HDR signals is transmitted once and the extra information is added through the enhancement layers.
As can be seen from Figure 1-11 and Figure 1-12, both simulcast and scalable pipelines require two versions of the input: an SDR and an HDR version. The two versions can either be acquired during capturing, by recording both SDR and HDR versions of the scene, or through a post-processing method referred to as tone mapping [54]. When an HDR signal is tone mapped to SDR, its dynamic range is reduced to adapt to SDR signal characteristics. This process has been studied extensively in the literature and many tone mapping operators (TMOs) have been developed with different applications and characteristics [55][56][57]. Similarly, when an SDR signal is tone mapped to HDR, its dynamic range is expanded to adapt to HDR capabilities using Inverse Tone Mapping Operators (iTMOs) [58]. iTMOs are generally more challenging processes than TMOs as they estimate information that does not exist in SDR signals. iTMOs have also been studied in depth and many have been developed with different objectives and characteristics [59][60][61].

Figure 1-11 Simulcast transmission pipeline

Other than the luminance adaptation required for HDR-to-SDR or SDR-to-HDR mapping, the color gamut of the original signal also needs to be adapted to that of the destination signal. This process is known as gamut mapping, a.k.a. gamut reduction, referring to converting a wider color gamut to a smaller one [62], while the process of converting a smaller gamut such as BT.709 to a wider one is usually referred to as inverse gamut mapping or sometimes gamut expansion [63]. We will discuss these processes in more detail in Section 1.5.

Figure 1-12 Scalable transmission pipeline

Another transmission pipeline that can support both HDR and SDR is the single layer pipeline. In this case, only one of the HDR or SDR streams is transmitted. With the help of tone/color mapping, based on the characteristics of the display at the viewer's side, the expected signal can be reconstructed. For example, if the pipeline is SDR single layer and the video to be transmitted is HDR, it can be tone mapped to SDR before transmission. For a viewer with an HDR display, the decoded signal can then be inverse tone mapped at the viewer's side to reconstruct the HDR signal. For a tone mapping to be invertible, its parameters need to be transmitted. This information is contained in the metadata sent along with the coded stream. Both tone/gamut mapping and inverse tone/gamut mapping can greatly benefit from metadata to increase the quality of the reconstructed signal. The SDR and HDR single layer pipelines are shown in Figure 1-13 and Figure 1-14, respectively, while addressing both HDR and SDR displays.

Figure 1-13 SDR single layer transmission pipeline (SDR10)

Given the multiple ways HDR can be transmitted to viewers while also addressing backward compatibility with SDR displays, content distributors are faced with the challenge of identifying the most efficient pipeline in terms of required bitrate and subjective quality for both HDR and SDR. In the case of single layer pipelines, the tone mapping and inverse tone mapping operators also affect the required bitrate and subjective quality. That is why an extensive evaluation needs to be performed to identify the pipeline that can transmit video signals such that viewers with HDR displays can benefit from the potential of their displays and the ones with SDR displays will continue to experience the same viewing quality.
All the while, the transmission bitrate should not surpass the limit imposed by the available bandwidth. Chapter 2 of this thesis provides such an evaluation through extensive sets of SDR and HDR subjective experiments.

1.3 Color Representations for HDR Video Content
Humans perceive the visual information of their surroundings using two main receptors in their eyes: rods and cones. While rods are responsible for vision at low light without any color information, cones are responsible for color vision and are active at higher light levels [64]. We have three types of cones in our eyes, receptive to Long, Medium and Short wavelengths, often referred to as L-cones, M-cones and S-cones, corresponding to red, green and blue colors, respectively [65].

Figure 1-14 HDR single layer transmission pipeline

Due to having three types of cone photoreceptors, color pixels are also represented using three channels [66]. One of the main three-channel representations is RGB, with three channels representing Red, Green and Blue. RGB is a device-dependent representation, meaning only colors that are reproducible with a device (mainly a display) can be represented with it. The most prevalent RGB representations are RGB BT.709 [34] and RGB BT.2020 [36]. In contrast, a device-independent color representation can represent every color visible to human eyes (sometimes even ones beyond what humans can see). An example of a device-independent color representation is CIE XYZ [39]. Regardless of the representation used to describe the color, one can change it to any other representation using a 3x3 matrix in a lossless way.

Although the RGB representation can accurately describe all colors representable by current displays, there exists redundant information between its channels. Consider the image in Figure 1-15, an HDR image tone mapped so that it can be displayed here. The correlations between the R and G, and the R and B channels' information for the image shown in Figure 1-15 are illustrated in Figure 1-16. The Pearson Correlation Coefficients (PCCs) [65] are also reported in Figure 1-16.

Figure 1-15 Snapshot of the image used for reporting correlations (tone mapped)

To transmit and compress a color pixel, an efficient color representation is needed such that its channels do not contain correlated information. To that end, it is common practice to use color difference encodings for compression. Color difference encodings consist of one brightness channel extracted from the RGB channels and two chroma channels containing the difference of the color channels from the brightness channel. Since this thesis focuses on the delivery of HDR signals, in what follows we review some of the common color difference encodings that can be used for HDR signal transmission. In Chapter 3 we evaluate these color encodings in terms of their accuracy in representing HDR color pixels.

Figure 1-16 Correlation of R' with G' and B' using 10 bits

1.3.1 Non-Constant Luminance (NCL) YCbCr
The most prevalent color representation employed in video compression is YCbCr, where Y is the luma component (transferred luminance) and Cb and Cr are the chroma components representing the Blue and Red color differences from Y, respectively. These channels are derived from RGB as shown below:

Y' = 0.2627 R' + 0.6780 G' + 0.0593 B'          (7)
C'b = (B' − Y') / 1.8814          (8)
C'r = (R' − Y') / 1.4746          (9)

The primed R, G and B (R', G', B') are the perceptually encoded values and are not linear.
Therefore, the Y' channel is also the perceptually encoded luminance, commonly referred to as luma. Note that equations (7), (8) and (9) are for RGB BT.2020; the corresponding BT.709 conversion can be found in [34]. The correlation of Y' with the C'b and C'r channels is depicted in Figure 1-17 for the same image as in Figure 1-15. As can be observed from Figure 1-17, the correlation between the channels' information is drastically reduced compared to that of the R'G'B' channels shown in Figure 1-16.

It is common practice to subsample the Cb and Cr chroma components to ¼ of their original resolution, as shown in Figure 1-18, to further compress the signal. The subsampled YCbCr, referred to as YCbCr 4:2:0, improves the overall compression efficiency compared to the original YCbCr 4:4:4 without affecting the visual quality, as our eyes are less sensitive to changes in color than to changes in brightness.

Figure 1-17 Correlation of NCL Y' with Cb and Cr using 10 bits

Figure 1-18 Chroma Subsampling in YCbCr 4:2:0

However, YCbCr 4:2:0 has some shortcomings in representing HDR colors [68]. Even before compression, the luma of the converted YCbCr 4:2:0 signals is distorted sufficiently to result in visible color shifts. One reason behind this is the still-existing correlation between the Y' and the Cb, Cr channels' information, as shown in Figure 1-17. When Cb and Cr are subsampled to 4:2:0, some correlated Y information is also discarded through the sampling process. Although this discarded Y information did not affect SDR visual quality, it yields visible color changes in the extended range of color and luminance of HDR.

As an example, let us consider two quite similar colors, RGB1 = (4000, 0, 100) and RGB2 = (4000, 4, 100), both in linear light values. Using the BT.2020 luminance coefficients (0.2627, 0.6780, 0.0593), the luminance values of these triplets are 1056 and 1059 nits for RGB1 and RGB2, respectively. These luminance values are quite close to each other. After applying SMPTE ST 2084 as the transfer function, they transfer to R'G'B'1 = (0.9026, 0, 0.5081) and R'G'B'2 = (0.9026, 0.2324, 0.5081). Following the conversions in (7), (8) and (9), we will have YCbCr1 = (0.2672, 0.1280, 0.4309) and YCbCr2 = (0.4248, 0.0443, 0.3240). After applying the chroma down-sampling, for simplicity through averaging the Cb and Cr samples, we will have: YCbCr1 4:2:0 = (0.2672, 0.0861, 0.3774) and YCbCr2 4:2:0 = (0.4248, 0.0652, 0.3507). To revert these to linear RGB, after upsampling the chromas, we apply the inverse of (7), (8) and (9), resulting in: R'G'B'1 = (0.8238, 0.0373, 0.4293) and R'G'B'2 = (0.9420, 0.2137, 0.5475). After conversion back to linear RGB by applying the inverse of SMPTE ST 2084, we will have: RGB1 = (1934, 0.031, 44.51) with a luminance of ~510 nits, which is far too dark, and RGB2 = (5775, 3.016, 147.34) with a luminance of ~1527 nits, which is far too bright, compared to the original RGB1 and RGB2, respectively. Note that in these conversions no bit-depth quantization was applied, and the values stayed in floating point. Hence, the difference is only due to the chroma subsampling process. As observed in the example above, the luminance of the final RGB triplets has changed compared to the original luminance due to the conversions involved. Such a YCbCr 4:2:0 conversion is referred to as Non-Constant Luminance (NCL) because of the changes it introduces to the luminance channel.
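A minimal sketch of this round trip follows (ours; it reuses pq_encode and pq_decode from the earlier sketch). The 4:2:0 step is emulated here by a plain two-pixel chroma average, so the reconstructed values for the second pixel differ somewhat from those quoted above, but the same luminance distortion is evident:

```python
import numpy as np

# BT.2020 NCL YCbCr, equations (7)-(9)
def rgb_to_ycbcr(rp, gp, bp):
    y = 0.2627 * rp + 0.6780 * gp + 0.0593 * bp
    return y, (bp - y) / 1.8814, (rp - y) / 1.4746

def ycbcr_to_rgb(y, cb, cr):
    rp = y + 1.4746 * cr
    bp = y + 1.8814 * cb
    gp = (y - 0.2627 * rp - 0.0593 * bp) / 0.6780
    return rp, gp, bp

def luminance(rgb):
    return 0.2627 * rgb[0] + 0.6780 * rgb[1] + 0.0593 * rgb[2]

rgb1 = np.array([4000.0, 0.0, 100.0])   # linear light, nits
rgb2 = np.array([4000.0, 4.0, 100.0])

y1, cb1, cr1 = rgb_to_ycbcr(*pq_encode(rgb1 / 10000))
y2, cb2, cr2 = rgb_to_ycbcr(*pq_encode(rgb2 / 10000))

# Emulate 4:2:0: both pixels end up sharing one averaged chroma pair
cb, cr = (cb1 + cb2) / 2, (cr1 + cr2) / 2

for y in (y1, y2):
    rgb_out = pq_decode(np.clip(ycbcr_to_rgb(y, cb, cr), 0, 1)) * 10000
    # Pixel 1 darkens to ~510 nits; pixel 2 becomes far too bright,
    # although the original luminances were ~1056 and ~1059 nits.
    print(luminance(rgb_out))
```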
1.3.2 Constant Luminance (CL) YCbCr
To address the limitation of NCL YCbCr 4:2:0, a Constant-Luminance (CL) YCbCr derivation was added to BT.2020 [36]. In CL YCbCr, the linear RGB values are used to derive the luminance channel, instead of the perceptually encoded values. Perceptual encoding is then applied on the luminance channel (Y) to construct the luma (Y'). CL YCbCr is derived from BT.2020 RGB as follows:

Y' = (0.2627 R + 0.6780 G + 0.0593 B)'          (10)

C'b = (B' − Y') / (−2 N_B)   for N_B ≤ B' − Y' ≤ 0
C'b = (B' − Y') / (2 P_B)    for 0 < B' − Y' ≤ P_B          (11)

C'r = (R' − Y') / (−2 N_R)   for N_R ≤ R' − Y' ≤ 0
C'r = (R' − Y') / (2 P_R)    for 0 < R' − Y' ≤ P_R          (12)

with P_B = 0.7919, N_B = −0.9702, P_R = 0.4969, N_R = −0.8591.

Note that BT.709 does not have a standardized CL derivation, as BT.709 mainly represents SDR signals and the NCL derivation works well for SDR. An in-depth analysis of the NCL and CL methods presented in [69] reports less loss of original luminance information after chroma sub-sampling in the CL scheme. It was further shown that the CL approach has the benefit of higher compression efficiency over the traditional NCL method [69]. However, CL YCbCr is usually avoided in practice due to the complex conversions it involves, mainly the comparisons that need to be performed for each pixel to check the conditions in (11) and (12). Additionally, as the current infrastructure is designed for NCL, updating it to support CL would impose a huge cost and is avoided.

In [70], new equations are derived that take advantage of the CL method's efficiency without increasing hardware complexity for HDR video distribution. The proposed method is designed to be content adaptive. Results indicate that improved color quality can be achieved in terms of the DE2000 metric for sequences with one prominent primary (Red, Green, or Blue), when compared with the NCL approach.

1.3.3 Yu'v'
This color encoding is based on the CIE Lu'v' color space [19], where L' is the perceptually encoded luminance channel and u' and v' are the chromaticity channels. In [19], the transfer function used for L' is a logarithmic function; that is why this color space is also referred to as the LogLuv pixel encoding. For the encoding of the luminance information 16 bits are used, while for the u' and v' channels only 8 bits are allocated. In theory, other than the logarithmic function, any other transfer function can be applied to the luminance channel (Y channel) of Lu'v'. To differentiate from the logarithmic transfer function, we use Yu'v', with Y referring to the perceptually encoded L of Lu'v' using SMPTE ST 2084 (PQ). Conversions from the 1931 CIE (x, y) chromaticities to obtain u' and v' are given in equations (13) and (14) below:

u' = 4x / (−2x + 12y + 3)          (13)
v' = 9y / (−2x + 12y + 3)          (14)
x = X / (X + Y + Z)          (15)
y = Y / (X + Y + Z)          (16)

Y (or L) is essentially the Y channel from CIE XYZ. For visible colors, u' and v' range from 0 to 0.62. It is shown in [71] that Yu'v' is a perceptually uniform color space, meaning any change in the color pixel values will result in the same perceptual change. It is also shown in [71] that the Y channel and the chromaticity channels u' and v' are well de-correlated. We also show the decorrelation of Y from the u' and v' channels in Figure 1-19 for the same image as in Figure 1-15.

Figure 1-19 Correlation of Y' with u' and v' using 10 bits
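A minimal sketch of the Yu'v' derivation from CIE XYZ per equations (13)-(16), assuming XYZ input in cd/m2 and reusing pq_encode from the earlier sketch:

```python
def xyz_to_yuv_prime(X, Y, Z):
    """Equations (13)-(16): after substituting (15) and (16), u' and v' reduce to
    u' = 4X/(X + 15Y + 3Z) and v' = 9Y/(X + 15Y + 3Z). The luma is the
    PQ-encoded Y channel."""
    denom = X + 15 * Y + 3 * Z
    return pq_encode(Y / 10000), 4 * X / denom, 9 * Y / denom
```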
1.3.4 Yu''v''
Yu''v'' is another color space based on Yu'v' as explained in Section 1.3.3, with the exception that the transfer function used for the luminance channel is the one explained in Section 1.1.1 as the Philips TF. This color encoding was first proposed in [71]. The u'' and v'' channels are the u' and v' channels scaled down for dark luminance values (lower than 5 nits). This scaling results in less noisy dark values while it does not affect the color perception at those dark luminance levels. The smaller chroma values due to the scaling improve their coding efficiency compared to Yu'v'. However, the u'' and v'' channels are dependent on the luminance channel, and any distortion in the Y values, such as compression, will affect the color components, which may result in perceptual changes.

1.3.5 YDzDx
The YDzDx color encoding is based on CIE 1931 XYZ with a common Y channel, while Dz is the Z difference and Dx is the X difference from Y [73]. This color space is designed to reduce the redundancies of information between the X, Y and Z channels. The Dz and Dx differences are computed as below:

D'z = (c1 Z' − Y') / 2          (17)
D'x = (X' − c2 Y') / 2          (18)

where c1 = 0.986566 and c2 = 0.991902, and X', Y' and Z' are the perceptually encoded X, Y and Z channels, respectively. The correlation between the Dz and Y channels is depicted in Figure 1-20 for the same image as in Figure 1-15.

Figure 1-20 Correlation of Y' with Dz and Dx using 10 bits

1.3.6 ICtCp
ICtCp is a color space that was designed to further de-correlate the brightness and chroma information by matching the characteristics of the HVS [74]. ICtCp is based on the LMS color space representing the cones' (the photoreceptors of the HVS) responses to Long, Medium and Short wavelengths. To derive ICtCp from XYZ, the pixels are first converted to LMS as below:

[L]   [ 0.3592   0.6976  −0.0358]   [X]
[M] = [−0.1922   1.1004   0.0755] × [Y]          (19)
[S]   [ 0.0070   0.0749   0.8434]   [Z]

Then the LMS channels are perceptually encoded using SMPTE ST 2084, resulting in L'M'S'. ICtCp is then calculated as below:

[I ]   [0.5       0.5       0     ]   [L']
[Ct] = [1.6137   −3.3234    1.7097] × [M']          (20)
[Cp]   [4.3781   −4.2455   −0.1325]   [S']

Ct corresponds to blue-yellow color perception and Cp corresponds to red-green color perception. Figure 1-21 shows the correlation between the Ct and I channels' information for the same image as in Figure 1-15. As can be observed, the correlations between the chroma channels and the brightness channel are reduced compared to those of YCbCr.

Figure 1-21 Correlation of I with Ct and Cp using 10 bits
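The derivation in (19) and (20) can be sketched as follows (reusing pq_encode from the earlier sketch; clipping LMS at zero before encoding is our assumption for out-of-range inputs):

```python
import numpy as np

# XYZ -> LMS and L'M'S' -> ICtCp matrices from equations (19) and (20)
XYZ_TO_LMS = np.array([[ 0.3592, 0.6976, -0.0358],
                       [-0.1922, 1.1004,  0.0755],
                       [ 0.0070, 0.0749,  0.8434]])
LMS_TO_ICTCP = np.array([[0.5,     0.5,     0.0   ],
                         [1.6137, -3.3234,  1.7097],
                         [4.3781, -4.2455, -0.1325]])

def xyz_to_ictcp(xyz):
    """xyz: array of shape (..., 3) in cd/m2."""
    lms = xyz @ XYZ_TO_LMS.T
    lms_prime = pq_encode(np.clip(lms, 0, None) / 10000)  # PQ-encode each LMS channel
    return lms_prime @ LMS_TO_ICTCP.T
```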
1.4 HDR Video Chroma Processing
Since the HVS is more sensitive to changes in luma than to those in chroma, it is common practice that the chroma channels are subsampled at ¼ of their resolution before compression. Figure 1-18 shows a subsampled 4:2:0 color encoding. It is recommended in [36] to use two 3-tap filters to perform the subsampling, once horizontally and once vertically. The details of the recommended vertical and horizontal filters can be found in [36]. At the display side, the discarded chroma information is interpolated through the up-sampling process. The details of the up-sampling filter can also be found in [36]. As stated in Section 1.3.1, chroma sub-sampling introduces visible color shifts to the colors of HDR in the case of YCbCr 4:2:0, which did not exist in the case of SDR. One solution to address the lower quality of the chroma samples is to assign lower quantization parameters (QPs) to the chroma channels than to the luma [75]. The goal is to maintain color accuracy for YCbCr 4:2:0 content. However, although this method addresses the color accuracy loss due to compression, the main problem of close correlation between luma and chroma channels, which results in visible luminance errors, still exists. One solution is to use a better de-correlated color encoding such as Yu'v' or ICtCp. However, since the compression standards are optimized for YCbCr, the bitrate usually increases if any other color encoding is used. To that end, different chroma processing methods have been proposed to improve the compression efficiency of color encodings other than YCbCr. Other chroma processing methods have also been proposed to improve the color accuracy of YCbCr pixels. The existing chroma processing methods can be classified into four main categories: chroma adjustment as in [76], chroma scaling as in [77] and [78], chroma transform as in [79], and chroma reshaping as described in [80]. In Chapter 4 we also propose two novel chroma processing methods which result in significant compression efficiency improvement of HDR coding while the color perception of the HDR content is not affected. We discuss the existing chroma processing methods in detail in this section.

1.4.1 Chroma adjustment
Chroma adjustment is proposed in [76] to further improve the color quality of the luma adjusted NCL YCbCr. Luma adjustment is a pre-processing technique that finds a luma code value through a recursive correction step so that the luminance error caused by NCL YCbCr 4:2:0 is minimized [81] [82]. Chroma adjustment is an addition to luma adjustment which adjusts the chroma channels of the luma adjusted NCL YCbCr through a recursive pre-processing step. This step makes the Cb and Cr channels smoother and hence more suitable for compression. It is ensured that the visual perception of the processed pixels with new Cb and Cr values is the same as that of the original pixels. This chroma adjustment technique improved the compression efficiency by 2.4% and 0.3% in terms of tPSNR Y and DE100, respectively, as reported in [76]. Since this method removes the visible color artifacts, the overall subjective quality of the compressed signal is also improved. The drawback of this method is the several recursions involved and the resulting complexity before compression.
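To illustrate the luma adjustment idea (a simplified sketch, not the exact algorithm of [81] [82]), the recursive correction can be realized as a per-pixel binary search over luma code-words, exploiting the fact that, with chroma fixed, the reconstructed linear luminance grows monotonically with the luma code. This reuses ycbcr_to_rgb and pq_decode from the earlier sketches:

```python
import numpy as np

def adjust_luma(target_luminance, cb, cr, bit_depth=10):
    """Find the luma code-word whose reconstruction, with the (already
    subsampled) chroma fixed, best matches the original linear luminance
    (in cd/m2). Illustration only."""
    lo, hi = 0, 2 ** bit_depth - 1
    while lo < hi:
        mid = (lo + hi) // 2
        rgb_prime = np.clip(ycbcr_to_rgb(mid / (2 ** bit_depth - 1), cb, cr), 0, 1)
        rgb = pq_decode(rgb_prime) * 10000
        lum = 0.2627 * rgb[0] + 0.6780 * rgb[1] + 0.0593 * rgb[2]
        if lum < target_luminance:
            lo = mid + 1      # luminance increases monotonically with the luma code
        else:
            hi = mid
    return lo
```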
1.4.2 Chroma scaling
In [76], it is proposed to use the Yu'v' format for HDR compression. Yu'v' is a highly de-correlated and more perceptually uniform color space compared to YCbCr. However, the u' and v' channels are oversampled, i.e., the code-words for the chromaticity channels are not used efficiently. It is proposed in [76] to scale down the chroma channels u' and v' at darker luminance levels, based on findings in [82], such that they are sampled with just enough bits without affecting the visual quality. Results show that the proposed scaling reduces the Yu'v' bitrate requirements for dark content while preserving its high color accuracy. The drawback of this method is that the conversion function for each pixel is luminance dependent. Based on the decoded luma value, which differs from the original one due to compression, a different inverse function is applied to the chroma pixel values, which may introduce visible color errors in the RGB representation.

1.4.3 Chroma transform
Mahmalat et al. also proposed to use Yu'v' to address the color errors associated with NCL YCbCr [78] in representing HDR color pixels. The oversampling of the u' and v' channels is handled by a content dependent scaling factor to more efficiently quantize the chrominance channels. A chromaticity-only transform is also proposed to convert the visible gamut into a unit square of colors to optimize the representation of the u' and v' channels for compression. The proposed solutions improve the compression efficiency of the Yu'v' color encoding compared to that of YCbCr, while improvements in perceived color quality are also achieved. Both the chroma scaling and the chroma transform proposed in [78] have the advantage of being luminance independent, while the chroma scaling method is content dependent.

1.4.4 Chroma Reshaping
As explained in Section 1.3.6, ICtCp was proposed to solve the color issues of NCL YCbCr. However, since video compression standards such as HEVC [47] are designed for YCbCr, using ICtCp will result in higher bitrates [74]. In [80], an adaptive reshaping for the chroma values of ICtCp is presented. The reshaping function is essentially a linear function which re-distributes the code-words among the chroma channels depending on the characteristics of the content. More specifically, the chroma channels are scaled to the maximum and minimum allowed values of Ct and Cp, which are 0.5 and −0.5, respectively. Based on this method, some subjective quality improvement is reported compared to the original ICtCp while the bitrate is lowered [80]. The main drawback of this method is its content dependency and the extra metadata that needs to be transmitted to signal the scaling value corresponding to the original minimum and maximum values of Ct and Cp.

1.5 Color Gamut Mapping for HDR Video Content
As briefly explained in Section 1.1.3, legacy HD SDR television systems only support a small proportion of the colors that a human observer can perceive. This portion is described in the ITU-R Recommendation BT.709 [34], more commonly referred to as BT.709 or Rec.709. The ITU-R Recommendation BT.2020 [36] covers a larger color gamut and is the standard recommendation for Ultra High Definition (UHD) and HDR television systems. Since our focus in this thesis is not theatrical applications, we do not consider the P3 color gamut [35]. Figure 1-8 showed the range of colors covered by BT.709 and BT.2020 in the CIE 1931 chromaticity plane, in comparison with all the visible colors (depicted by the horse-shoe shape). In this chromaticity plane, BT.709 covers approximately 35.9% of the full visible gamut using 8 bits, while BT.2020 covers 75.8% of that range using 10 bits [84].

To address backward compatibility between UHD/HDR and HD/SDR displays and content, an adaptation process is used. The process of adapting a wider color gamut such as BT.2020 to a smaller color gamut such as BT.709 is referred to as color gamut mapping or color gamut reduction, while the process of converting a smaller gamut to a wider one is referred to as inverse gamut mapping or gamut expansion [85]. The process of gamut mapping inevitably leads to a loss in the mapped video's color information, while the process of inverse gamut mapping extrapolates the missing color information. The goal of gamut mapping is to keep the contrast and overall perception of the original and mapped videos the same. Since the saturation of the mapped colors will inevitably be changed, ideally their hues should be kept similar [86]. The BT.2020 colors are mapped to BT.709 through two general approaches: clipping or compression [86].
Clipping refers to the process of clipping all the colors that are outside the BT.709 triangle (see Figure 1-22) to the boundary of the destination gamut, BT.709. The colors inside the BT.709 gamut are retained without change. Since several out-of-gamut colors are mapped to a single color (on the gamut border), spatial details can be discarded through the mapping process, resulting in degradation of overall quality. Figure 1-22 shows the process of gamut mapping through clipping. As can be seen, the BT.2020 colors A, B, C and D are all mapped to color A' on the border of the BT.709 gamut.

Figure 1-22 Gamut mapping of colors A to D to one point (A') on the border of BT.709 through clipping

The compression gamut mapping technique, on the other hand, affects not only the out-of-gamut colors but also the ones that are inside the destination gamut. Unlike clipping, the compression method preserves the relative distances between the colors of the original content. The result is a better match between the contrast of the mapped content and that of the original content. In compression gamut mapping, usually a limited area of the destination gamut is left unchanged, and only colors outside that area are mapped using some heuristic, or the source gamut is divided into several regions, each one using a different mapping technique. Figure 1-23 shows the former approach.

Regardless of the projection method used, the destination color needs to be specified such that it is as close as possible to the original color. Two general approaches exist to select the destination color: Closest color or Towards-White-Point color. The former minimizes the Euclidean distance between the original color and the border of the destination gamut, while the latter selects the color on the line connecting the original color and the destination gamut's reference white point. Figure 1-24 shows these two approaches.

Figure 1-23 Gamut mapping through compression method

Another aspect that needs attention in gamut mapping is the color space in which the mapping process is performed. A perceptually uniform color space is ideal for this process, as changing the color gamut inside such a space will not result in substantial perceptual hue and saturation changes. In [86], an extensive review of several color spaces and different projection methods showed that mapping the chromaticities toward the reference white point in the CIE LAB color space yielded the smallest color differences between original and mapped colors in terms of DE2000. As explained earlier in Section 1.1.4, DE2000 is a metric that quantifies the perceptual color difference between two colors.

In this thesis, in Chapter 5, we propose a gamut mapping method that outperforms the most accurate gamut mapping identified in [86]. We also propose a backward compatible gamut reduction which can be inverted with the least amount of color difference between original and mapped colors.

1.6 Thesis Outline
In this thesis, we present novel methods for improving the quality of the viewing experience for delivered HDR content while also considering backward compatibility with SDR.
This includes determining which transmission pipeline can efficiently deliver both HDR and SDR with the highest subjective quality, identifying the most efficient HDR color encoding that achieves the best perceptual color accuracy, proposing novel chroma processing to better represent HDR colors, and finally proposing novel techniques for HDR gamut mapping and backward compatible gamut mapping while reducing the color errors caused by this process.

Figure 1-24 Selecting mapped color based on Closest and Towards White point approaches

In Chapter 2, we conduct a comprehensive set of subjective evaluations of the existing transmission pipelines that can address both HDR and SDR. We identify the most efficient pipeline in terms of required bitrate and subjective quality for both HDR and SDR. We also provide guidelines on how HDR chroma information should be processed given the current infrastructure, such that the final quality of the viewed video content is not compromised.

In Chapter 3, we evaluate the existing color encodings and perceptual transfer functions in terms of how accurately they can represent HDR colors, i.e., how much visible color difference they introduce relative to the original colors. Our study encompasses all visible colors representable by the BT.2020 gamut, sampled at different luminance levels. The visibility of the error and the amount of color difference introduced by each color encoding due to quantization are identified. Additionally, the amount of color quality degradation for each color due to the choice of PTF and color pixel representation, without considering compression, is identified.

In Chapter 4, two chroma processing techniques are presented. In Section 4.1, we propose another chroma scaling method that makes better use of the available code-words when a perceptually uniform color space is employed for HDR video compression. The results show that the compression efficiency of the proposed method is significantly improved while the color accuracy is maintained. In Section 4.2, by using the observations in Chapter 3, we propose a novel chroma processing method that reduces the perceptual color differences due to bit-depth quantization. The method is applied on the most efficient pair of PTF and color pixel representation identified in Chapter 3. The proposed method improves the compression efficiency of HDR video compression both in terms of subjective and objective evaluations. An in-depth analysis is also provided on how the proposed method outperforms the existing methods even when chroma subsampling is applied.

In Chapter 5, two gamut mapping approaches are proposed that maintain the contrast and color relationships of the original and the mapped signal. In Section 5.1 we present a hybrid gamut mapping approach which selects one combination of color space and projection method for each BT.2020 color pixel. The selection is based on minimizing the CIE DE2000 metric. Results show improvements over existing methods in terms of CIE DE2000 when comparing original and projected colors. Our method is practical and easy to use in set-top boxes since it can be implemented as a subsampled 3D-LUT. If a new color space, projection technique or color metric is designed, the created LUT would only need to be adjusted to improve its performance. In Section 5.2, we propose an invertible gamut mapping from HDR colors to SDR colors so that HDR displays can reconstruct the colors, while SDR displays are addressed directly using the legacy video delivery pipeline.
The proposed color mapping scheme allows the mapped signal to be converted back to the original signal with minimal perceptual error, so that the viewers' quality of experience (QoE) is preserved. Our method includes a parameter that adjusts the trade-off between the quality of the SDR content and that of the HDR content. Our experimental results provide a guideline on how to strike a balance between color errors in the mapped signal and in the retrieved one.

Chapter 2: Performance Evaluation of HDR Video Transmission Pipelines
HDR video content delivery is possible with three main transmission pipelines: simulcast, scalable, and single layer [84]. All three pipelines can address both SDR and HDR displays, a backward compatibility that is necessary at the introduction phase of HDR technology to the consumer market (where the majority of clients have SDR TVs). Amongst these three, the simulcast solution requires the highest bandwidth since it sends two separate streams (the bitrate will become almost double that of a single layer approach). To identify the most efficient pipeline for HDR/SDR transmission, the single layer and scalable pipelines need to be evaluated in terms of their compression efficiency and the subjective quality of the delivered signal. In what follows, single layer pipelines are evaluated in Section 2.1. The most efficient single layer pipeline is then compared to the scalable pipeline in Section 2.2. Conclusions are drawn in Section 2.3.

2.1 Single Layer Transmission Pipelines Evaluations
As explained in Section 1.2, one single layer approach for HDR content delivery is HDR10 [87]. HDR10 employs the perceptual quantizer (PQ) as the perceptual encoding function, as the pipeline expects absolute light information. HDR10 is the first HDR content delivery pipeline adopted by all the consumer HDR TV manufacturers, as it uses the same pipeline as the one for Ultra High Definition (UHD) content delivery, plus some metadata for HDR signaling. Many service companies started employing HDR10 since it allowed them to be amongst the first in the market to support HDR. In [88], we evaluated the HDR subjective quality of HDR10 compared to that of the SDR10 approach. Subjective test evaluations showed that, for the same bitrate, HDR10 videos were rated higher than those reconstructed from the SDR10 approach. The evaluations in [89] also showed that HDR10 is suitable for HDR viewing purposes while it is also efficient. However, how HDR10 performs in addressing an SDR display using a tone mapping operator (TMO) [90] had never been tested before.

Considering the issue of backward compatibility with legacy SDR displays at the introduction of HDR technology to the market, SDR10 has also attracted the attention of some service companies. SDR10 is another single layer approach for HDR delivery, in which the HDR content is first converted to SDR using a TMO and the resulting SDR signal is transmitted. At the receiver side, this signal is either directly used to support SDR displays, or an inverse tone mapping operator is deployed to convert it to HDR to address HDR displays. SDR10 has mainly been suggested due to its backward compatibility with SDR displays, as it transmits SDR signals. However, SDR10 has not been tested in terms of its HDR viewing quality. Additionally, SDR10 viewing quality on both HDR and SDR displays greatly depends on the choice of TMO, and its compression performance can also vary greatly based on the chosen TMO.
Given that broadcasters have the choice of HDR10 and SDR10 for HDR/SDR delivery, one of the challenging questions they face is which of these single layer approaches results in both high quality SDR and high quality HDR outputs. Existing studies have compared the subjective quality of the HDR output of the two single layer pipelines at different bitrates [88][89]. However, the subjective quality of the SDR outputs was overlooked in these studies. It is worth mentioning here that there are studies that have solely evaluated the objective quality of the SDR output of these pipelines [91][92]. To the best of our knowledge, however, there is no comprehensive study that compares the subjective HDR and SDR quality of the HDR10 and SDR10 pipelines. Another question that needs to be addressed for broadcasters is how different video pre-processing and post-processing
Invertibility is also required as the tone mapped videos need to be inverse tone mapped at the decoding stage so that they can be      Figure 2-1 Coding and decoding chain of HDR10 using HEVC, and its display adaptation processes.   51  viewed on an HDR display. Bearing these characteristics in mind, we select the following three TMOs for our study: 1) Camera TMO (Cam) [56]: perfect temporal coherency (by disabling temporal adaptation), and dynamic range capabilities similar to SDR cameras.  2) Photographic tone reproduction (PTR) [57] in its global version: good preservation of details and coherent with the human visual system. The mapping curve was fixed for each sequence to prevent temporal incoherencies.  3) Histogram equalization (HE) [55]: redistribution of each frame’s tonal levels such that histogram bins with higher pixel densities are given more importance (better preservation of information). Therefore, three versions of the SDR10 pipeline resulting from these TMOs are included in our tests.   Table 2-1 HDR Video Dataset Sequence Frame Rate (frames per second) Number of Frames Scene Type Cropped Area FireEater2 24 240 Outdoor/night light 550 - 1497 BalloonFestival 25 200 Outdoor/day light 1-948 Market3 50 400 Outdoor/day light 100 – 1047 Tibul2 30 240 Computer-generated 800 - 1747   52  For compression of the video content, the High Efficiency Video Coding (HEVC) Main 10 profile (HEVC test model (HM) version 16.2) is used in our experiments [94] . In the HDR10 pipeline case, we encoded the 10-bit PQ HDR videos with four different quantization parameter (QP) levels. Similar QPs were used compress the tone mapped SDR videos resulting from the three TMOs mentioned above in the SDR10 pipeline case. The QPs used for each of the videos are the ones recommended by MPEG, except for the BalloonFestival and Market3 videos for which the used QP values are [18, 26, 34, 38], and [29, 33, 37, 41], respectively. The reason for introducing new QPs is that the lowest QP recommended by MPEG for these videos did not result in noticeably different visual quality levels when viewed on our HDR display (please refer to Section 2.1.2.1 for more details on the HDR display).      Figure 2-2 Coding and decoding chain of SDR10 using HEVC, and its display adaptation processes.   53  2.1.2 Performance of the HDR10 and SDR10 Pipelines in terms of HDR Output Quality This section focuses on comparing the subjective quality of the decoded HDR videos transmitted using the HDR10 and SDR10 pipelines when viewed on an HDR display. The details specific to our experiment and the corresponding results are provided in following subsections. 2.1.2.1 Display A full HD 47” prototype HDR LCD display with individually controlled LED backlight modulation was used for HDR viewing. This HDR display is capable of emitting light at a maximum luminance level of 6000 cd/m2 for a color gamut corresponding to the BT.709. In the experiment only luminance levels up to 4000 cd/m2 were used since this was the maximum level used by the content. 2.1.2.2 Display Adaptation For HDR10 pipeline, since decoded content at the receiver end is HDR, it can be displayed on HDR displays with BT.2020 support. However, since we are using a prototype HDR monitor that only supports BT.709, a gamut conversion from BT.2020 to BT.709 is performed to address the display properly (see Figure 2-1). 
In the case of SDR10 pipeline since the decoded content is SDR, the inverse of the tone mapping operator that has been used before encoding is applied on the decoded signal to retrieve the HDR content (see Figure 2-2) followed by a gamut conversion to BT.709. 54  2.1.2.3 Subjective Tests  Subjective tests were performed to evaluate the quality of the decoded HDR videos using the HDR10 and SDR10 pipelines. This test was composed of four HDR videos coded with SDR10 pipeline using three different types of TMOs (see Section 2.1.1 for the list of TMOs), and with HDR10. Four QP levels were used for each combination resulting in 64 tests videos. For subject validation purposes, we also added the original videos to be compared with themselves. In summary, 68 videos were shown to the participants in each test session. The order of videos was randomized in each session and we ensured that the same sequence is not shown consecutively. We presented both the original and decoded videos at the same time in a side-by-side manner to the viewers based on the double stimulus impairment scale (DSIS) method from the ITU-R Recommendation BT.500.  To show the videos side-by-side without lowering their resolution, we cropped them along their width while keeping their original height (1080), as the TV resolution was 1920x1080 HD.  We selected a cropped area that best represents the motion in the scene. The horizontal coordinates of the cropped area for each video sequence can be found in Table 2-1. The right and left videos were separated with a 25-pixel wide black strip. During the test, the subjects were aware of the original video’s position (left or right). We did not change the position of the original and test videos throughout the test.  Participants were asked to rate the impairments of the test video with reference to the original (uncompressed) video on a discrete impairment rating scale ranging from 1, being very annoying to 10, being imperceptible i.e. identical quality to that of the original video. Subjects were not instructed where to look in each of the videos and they were all considered naïve to the content. 55  The room in which the tests were run was compliant with the BT.500 recommendations for subjective evaluation of visual data.   2.1.2.4 Viewers Eighteen adult subjects including 13 males and 5 females participated in this test. The participants’ age ranged from 23 to 32 years old with an average of 25.7. Prior to the tests, all subjects were screened for color blindness and visual acuity by the Ishihara chart and the Snellen charts, respectively. Subjects that failed the pre-screening did not participate in the test. None of the participants were aware of the test objectives. Oral and written instructions of the test were presented to the subjects prior to the test. In order to familiarize the participants with the test procedure, a training test was conducted. This training consisted of two video sequences, different from the ones in the actual test dataset, but prepared in the same manner. All tests were conducted with three subjects per session. Each HDR subjective test took approximately 15 minutes (excluding training, oral and written instructions). 2.1.2.5 Results and Discussions After collecting the subjective tests results, the outliers were detected according to the BT.500 recommendations[45]. Two outliers, out of 18 subjects, were detected and their scores were discarded from the results. 
The Mean Opinion Score (MOS) for each test video was calculated by averaging the scores over all subjects, with a 95% confidence interval. Figure 2-3 plots the subjective MOS values at different bitrates for both the HDR10 and SDR10 pipelines. The SDR10 pipeline is plotted for the three tested TMOs: Cam, HE and PTR.

It can be observed from Figure 2-3 that the MOS values of the HDR10 pipeline are significantly higher compared to those of the SDR10 pipeline for the FireEater2, Market3 and Tibul2 sequences. However, for the BalloonFestival sequence, both pipelines performed almost similarly. In order to understand the reason behind this, we studied the distribution of luminance values of each sequence and how it affects tone mapping. Figure 2-4 plots, for each sequence, the mapping curves from luminance to luma using SMPTE ST 2084 (PQ), Cam, HE and PTR. The normalized luminance histogram is also plotted on top of the curves. Note that the curve computed for HE varied for each frame, so we only plotted the curve corresponding to the histogram.

It can be observed from Figure 2-3 and Figure 2-4 that for the sequences with similar mapping curves, similar MOS values were also achieved. That is because similar mapping curves will result in visually similar SDR signals. Specifically, for FireEater2 and Tibul2, SMPTE ST 2084 performs quite distinctly compared to the tested TMOs (see Figure 2-4). Recall that the highest differences in MOS values were reported for these two sequences. Similarly, since the SMPTE ST 2084 mapping curve for Market3 is close to that of the HE TMO, their MOS values are also close (see Figure 2-3 and Figure 2-4). The Cam TMO curve for Market3 is distinctly different from the SMPTE ST 2084 curve, which results in much lower MOS values compared to SMPTE ST 2084, as shown in Figure 2-3. As for BalloonFestival, all MOS values are similar and no statistical difference can be noticed between the SDR10 and HDR10 MOS results in Figure 2-3. The reason for this is that, based on Figure 2-4, its luminance is mainly distributed between 50 and 1000 cd/m2. Since this range does not correspond to a very high dynamic range, the amount of information removed from the original HDR signal by tone mapping is insignificant.

Figure 2-3 HDR MOS-rate comparison of HDR10 pipeline using the SMPTE ST 2084, with SDR10 pipeline using Cam, HE and PTR TMOs for the tested sequences

Figure 2-4 Mapping of luminance to luma for tested sequences using the SMPTE ST 2084 for HDR10, and TMOs Cam, HE, and PTR for SDR10, along with the normalized histogram of each content

From these test results, we can conclude that, in general, the HDR10 pipeline outperforms SDR10 in terms of compression efficiency if the transmitted signal is viewed on an HDR display. We also observed that when the TMO curves resemble the SMPTE ST 2084 curve, the corresponding outputs are viewed as visually similar to each other. The more distant these curves are, the greater the differences in their MOS values, and these differences always favor the HDR10 approach.
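For reference, the MOS and confidence-interval computation used for these plots can be sketched generically as below (a standard aggregation assuming a subjects-by-videos score matrix; the BT.500 outlier screening steps are omitted):

```python
import numpy as np

def mos_with_ci(scores):
    """scores: array of shape (num_subjects, num_videos) with ratings on the
    1-10 scale. Returns the per-video MOS and the half-width of a 95%
    confidence interval using the normal approximation (1.96 * standard error)."""
    mos = scores.mean(axis=0)
    se = scores.std(axis=0, ddof=1) / np.sqrt(scores.shape[0])
    return mos, 1.96 * se
```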
The peak luminance of this display is 1000 cd/m2 in HDR mode and 150 cd/m2 in SDR mode. The latter mode was used in these tests. This display is addressed using a 10-bit YCbCr 4:2:0 signal. For the HDR10 pipeline, since the decoded content at the receiver end is HDR, it needs to be tone mapped according to the dynamic range of the SDR display. Figure 2-1 shows the steps needed after the decoder to obtain a 10-bit YCbCr 4:2:0 signal to address the SDR TV (the chroma up-sampling to 4:4:4 and inverse quantization processes are performed inside the SDR TV). Since the display's gamut is BT.709, a gamut conversion from BT.2020 (the content gamut) to BT.709 was performed. On the other hand, since the decoded videos from the SDR10 pipeline are already SDR, only the gamut conversion and the conversion to 10-bit YCbCr 4:2:0 are required to address SDR displays. The decoding chain and the display adaptation steps of the SDR10 pipeline are shown in Figure 2-2.
2.1.3.2 Subjective Tests
The objective of this experiment is to compare the SDR quality of the HDR10 and SDR10 pipelines. We decided to use only the HE and PTR TMOs, which produced the best HDR quality results. The Cam TMO was discarded as it resulted in poor HDR quality, as reported in Section 2.1.2.5. Thus, this test consists of 4 HDR videos encoded with HDR10 (using PQ) and SDR10 (using 2 TMOs: HE and PTR). Note that the same TMOs (HE and PTR) were applied on the decoded HDR signal from the HDR10 pipeline before feeding it to the SDR display. Four QP levels were used for each combination, resulting in 64 tests (4 sequences × (1 (PQ + HE TMO) + 1 (PQ + PTR TMO) + 1 HE TMO + 1 PTR TMO) × 4 QPs). The Double Stimulus Impairment Scale (DSIS) approach was used for these subjective tests. The selected cropped areas and viewing sessions are the same as the ones mentioned for the HDR tests. The entire test lasted approximately 15 minutes, not considering the training and instruction time. For these experiments, the reference videos were the tone mapped versions (before compression), generated using the same TMO as the one applied to the test videos.
Figure 2-5  SDR MOS-rate comparison of HDR10 pipeline using the SMPTE ST 2084 followed by HE and PTR TMOs for display, with SDR10 pipeline using HE and PTR TMOs, for the tested sequences
2.1.3.3 Viewers
Seventeen adult subjects, including 14 males and 3 females, participated in this test. The participants' ages ranged from 20 to 31 years old with an average of 24.3. The same pre-test procedure as the one explained in Section 2.1.2.4 was performed for this test.
2.1.3.4 Results and Discussions
The same outlier detection procedure as in Section 2.1.2.5 was conducted on the SDR test results. No outlier was detected amongst the subjects. Figure 2-5 shows the results of the SDR subjective tests in terms of MOS values for different bitrates using the HDR10 and SDR10 pipelines. We observe from Figure 2-5 that SDR10 yields better subjective quality compared to that of HDR10 at the same bitrate. In other words, with our current test settings, applying the TMO before the compression stage results in higher subjective quality. An underlying reason for HDR10 underperforming SDR10 in terms of SDR quality may be the chroma subsampling processes. Both the HDR (resulting from the HDR10 pipeline) and SDR (resulting from the SDR10 pipeline) videos undergo the chroma down-sampling process twice (see Figures 2-1 and 2-2).
For the HDR10 pipeline, these two chroma subsampling processes are performed in two different domains (once in the PQ domain and once in the tone mapped domain). However, for the SDR10 pipeline both chroma subsampling processes are performed in the SDR domain, thus resulting in less noticeable overall quantization artifacts. In order to validate the above observation, we encoded the videos using the 4:4:4 format at bit-rates similar to those of the HDR10 4:2:0 encodings. Figure 2-6 (a) shows the first frame of the sequence Market3 encoded using HDR10 in 4:4:4, Figure 2-6 (b) shows the same frame encoded in 4:2:0 mode, and Figure 2-6 (c) shows the same frame encoded using SDR10 in 4:2:0 format. As can be observed, the quality of the encoded frame using the SDR10 pipeline in 4:2:0 mode (Figure 2-6 (c)) is visually very close to that of the frame encoded using the HDR10 pipeline in 4:4:4 mode (Figure 2-6 (a)). From this observation we can conclude that applying chroma subsampling in two different domains causes additional SDR visual degradation in the HDR10 pipeline. This is an expected outcome since HDR represents color information at higher brightness levels, where the human visual system is more sensitive to color distortions due to subsampling. This test also points out that compression of HDR content in 4:4:4 (without subsampling) greatly enhances the quality of the reproduced SDR videos at the same bitrates in the HDR10 pipeline. Similar results have been reported regarding the HDR quality of the HDR10 pipeline using 4:4:4 [94][96]. All these results recommend avoiding chroma subsampling in the HDR10 pipeline. However, existing practices and infrastructure only support 10-bit 4:2:0 encoding, which inevitably requires one chroma subsampling step at the encoding stage. Furthermore, most legacy SDR displays and set-top boxes (STBs) can only be addressed using 4:2:0. Thus, having two subsampling steps in the content delivery pipeline will most likely remain an issue in achieving high quality reproduction of SDR content. We also investigated the effect that the chosen TMO has on the quality of the SDR output in the HDR10 pipeline. As discussed in Section 2.1.1, the criterion for choosing TMOs in the SDR10 pipeline was their invertibility. However, in the HDR10 pipeline any TMO with high performance can be deployed. For our experiment, we kept the two invertible TMOs of HE and PTR and added a non-invertible TMO known as weighted-least squares (WLS), which is reported to produce high quality SDR [97]. We did not include this TMO as part of the test in the SDR10 pipeline as it is not invertible and cannot address an HDR display.
Figure 2-6  Tone mapped version of the first frame of Market3 encoded using (a) HDR10 pipeline in Y'CbCr 10-bit 4:4:4, (b) HDR10 pipeline in Y'CbCr 10-bit 4:2:0, and (c) SDR10 pipeline in Y'CbCr 10-bit 4:2:0 format
Similar to the former SDR subjective test, this test consisted of decoded videos using the HDR10 and SDR10 pipelines, for a total of 4 SDR videos × (1 PTR TMO + 1 HE TMO + 1 (PQ + PTR TMO) + 1 (PQ + HE TMO) + 1 (PQ + WLS)) × 4 QPs = 80 SDR videos. The test lasted approximately 20 minutes. The subjects were asked to rate the overall quality of the test videos without any reference to the original video (single stimulus method), since a reference could not be generated in the case of the non-invertible WLS TMO. A discrete rating scale from 1 (worst quality) to 10 (best quality) was provided to the subjects.
The videos were presented at full resolution (without cropping) as only one video was shown at a time. Figure 2-7 shows the results of this experiment. As can be observed, using the non-invertible WLS TMO in the HDR10 pipeline can boost the SDR subjective quality significantly. In fact, the HDR10 pipeline provides TV and STB manufacturers with the freedom to implement a desired TMO on their devices without being concerned about its invertibility. In general, invertibility limits the performance of a TMO and affects its visual quality. Based on the results of our experiments in this study, it is concluded that transmitting an HDR signal using the HDR10 pipeline has the advantage of requiring a lower bitrate compared to its SDR counterpart transmitted by the SDR10 pipeline. Also, by choosing the right TMO in the HDR10 pipeline, meaning a TMO that provides an acceptable SDR quality, broadcasters can guarantee the lowest bitrate without compromising the SDR quality. In summary, by using HDR10, a single transmitted signal can address both HDR and SDR displays with the best achievable quality. Better SDR quality is achieved if the display supports the 4:4:4 format.
Figure 2-7  SDR MOS-rate comparison of HDR10 using the SMPTE ST 2084 followed by HE, PTR, and WLS TMOs for display, with SDR10 pipeline using HE and PTR TMOs, for the tested sequences, without reference to the original video quality
2.1.4 Scalable vs. Single Layer Transmission Pipelines Evaluations
As stated in Section 1.2, another backward compatible pipeline to deliver both HDR and SDR to viewers is the scalable pipeline, which requires both HDR and SDR versions of the content. Among the single layer pipelines, we showed in Section 2.1 that the HDR10 pipeline is more efficient and more suitable for a backward compatible HDR/SDR delivery, while it provides the displays with the liberty to choose a TMO that better suits their needs. In this section, we evaluate the scalable pipeline compared to HDR10 in terms of HDR subjective quality.
2.1.5 Experiments Setup
To perform our tests, we used 5 sequences considered in the Call for Evidence (CfE): FireEater2, Tibul2, AutoWelding, BikeSparklers, and BalloonFestival. These videos are available in HDR in the linear RGB domain. To evaluate these videos with HDR10, the same process as the one shown in Figure 2-1 is performed. Note that the HEVC codec used is HM 16.6. For ease of reference, we will refer to the HDR videos processed using the chain in Figure 2-1 also as HDR10. To evaluate the test videos with the scalable pipeline, their SDR versions are also required. Two different SDR versions of these sequences are provided by MPEG, namely SDR_A10 and SDR_C10, mastered differently; both are quantized to 10 bits and use the BT.709 color gamut. The HEVC scalable encoder software SHM 0.8 was used to code the base (SDR) and enhancement (HDR) layers. Given that we have two SDR versions, three different configurations were used to obtain the compressed content:
1. SM10: HDR10 sources compressed using HEVC (HM 16.6)
2. SCC10_L1: HDR10 and SDR_C10 sources using HEVC (SHM 0.8)
3. SCA10_L1: HDR10 and SDR_A10 sources using HEVC (SHM 0.8)
These videos were encoded using four different QPs, as reported in Table 2-2. In our experiments, we asked the participants to rate the quality of the decoded videos compared to that of the original HDR video (uncompressed).
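As a quick cross-check of the resulting test-set size, the encode matrix implied by these configurations and the per-sequence QPs of Table 2-2 (the two N/A entries omitted) can be enumerated as in the following sketch; the dictionary layout and names are our own.

    # Encode jobs implied by the three configurations and Table 2-2 QPs.
    QPS = {
        "FireEater2":      {"SM10": [20, 23, 26, 29], "SCC10_L1": [20, 23, 26, 31], "SCA10_L1": [20, 23, 26, 29]},
        "Tibul2":          {"SM10": [19, 24, 29, 34], "SCC10_L1": [19, 24, 29, 34], "SCA10_L1": [19, 24, 29, 34]},
        "AutoWelding":     {"SM10": [21, 25, 29, 33], "SCC10_L1": [21, 25, 29, 33]},
        "BikeSparklers":   {"SM10": [23, 25, 29, 33], "SCC10_L1": [23, 25, 31, 35]},
        "BalloonFestival": {"SM10": [18, 22, 26, 30], "SCC10_L1": [18, 22, 26, 30], "SCA10_L1": [18, 22, 26, 30]},
    }

    jobs = [(seq, cfg, qp)
            for seq, cfgs in QPS.items()
            for cfg, qps in cfgs.items()
            for qp in qps]
    print(len(jobs))  # 52, matching the 52 evaluations reported in Section 2.1.8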
2.1.6 Display
The experiment was conducted on a Samsung SUHDTV UN65JS9500 (series 9) of resolution 3840x2160, which is a 65" 10-bit commercial TV with a peak luminance of 1000 nits and a P3 color gamut. The expected input signal is non-linear RGB using the SMPTE ST 2084.

Table 2-2  QPs Used for Compression of Each Test Video

  Sequence name     SM10 QPs      SCC10_L1 QPs   SCA10_L1 QPs
  FireEater2        20,23,26,29   20,23,26,31    20,23,26,29
  Tibul2            19,24,29,34   19,24,29,34    19,24,29,34
  AutoWelding       21,25,29,33   21,25,29,33    N/A
  BikeSparklers     23,25,29,33   23,25,31,35    N/A
  BalloonFestival   18,22,26,30   18,22,26,30    18,22,26,30

2.1.7 Display Adaptation
We showed the videos as 10-bit per channel HD (1920x1080) signals at 30 frames per second (a graphics card limitation). The Scratch player [98] was used to handle 10-bit playback. Since the display resolution cannot encompass two HD videos side by side, we cropped each sequence along the horizontal axis with the horizontal coordinates indicated in Table 2-3, so that the original and test videos could be viewed side-by-side. Note that the peak luminance of the content is higher than that of the TV (4000 nits vs. 1000 nits); thus, we scaled down (divided by 4) each color channel in the linear domain to avoid clipping the highlights. Instead of scaling, a TMO could also be used to lower the dynamic range of the input. However, such an approach might also affect compression, as we observed in Section 2.1. Therefore, in our experiments we opted for a simple linear scaling to address the display.

Table 2-3  Cropped Area Horizontal Coordinates for Each Test Video

  Sequence name     Cropped area
  FireEater2        550 - 1497
  Tibul2            800 - 1747
  AutoWelding       375 - 1322
  BikeSparklers     550 - 1497
  BalloonFestival   1 - 948

2.1.8 Subjective Tests
The participants were asked to evaluate the quality of the decoded videos compared to that of the original ones. A scale from 1 to 10 was used to assess the quality. A training session was organized before the actual experiment to familiarize the subjects with the compression artifacts. The evaluation was composed of 52 tests (5 sequences × 4 QPs × 2 pipelines (SM10 and SCC10_L1) + 3 sequences × 4 QPs × 1 pipeline (SCA10_L1)). Test videos were randomly ordered so that the same sequence was not shown twice in a row. Subjects were given three seconds between two stimuli to vote.
2.1.9 Viewers
Twenty subjects took part in our experiments and all were screened for color blindness and visual acuity. Five of the subjects were detected as outliers and their results were discarded. Five of the remaining 15 subjects were considered experts in the field and familiar with the content.
2.1.10 Results and Discussions
The MOS-bitrate results are provided in Figure 2-8 and Figure 2-9. We observe that for Tibul2, BalloonFestival, and AutoWelding, the single layer approach tends to outperform the scalable one. For FireEater2 and BikeSparklers, the difference in quality between the lower and higher bitrates is not significant enough to reach any conclusion.
This is because the QPs used were not high enough to introduce any visible distortions for these two streams.
Figure 2-8  HDR MOS-rate comparison of HDR10 using the SMPTE ST 2084 with the scalable pipeline for FireEater2, Tibul2, and BalloonFestival
Figure 2-9  HDR MOS-rate comparison of HDR10 using the SMPTE ST 2084 with the scalable pipeline for BikeSparklers and AutoWelding
It can be concluded that, since HDR10 only requires the HDR version of the content while the scalable approach requires both HDR and SDR versions, HDR10 is the favorable pipeline for transmitting HDR with backward compatibility with SDR displays, as shown in Section 2.1.3.
2.2 Conclusions
With the introduction of HDR technology to the consumer market, one of the challenging questions that broadcasters are facing is which transmission pipeline can deliver high quality SDR and HDR content at acceptable bitrates. Another question that broadcasters encounter is how the different video pre-processing and post-processing alternatives (such as TMOs and display adaptation) used in the transmission pipeline affect the quality of the HDR and SDR outputs. In Section 2.1, three sets of subjective tests were designed and performed to evaluate how each processing step in the HDR10 and the SDR10 pipelines affects the subjective quality of the generated HDR and SDR outputs. Our experiment results show that, at the same bitrate, videos transmitted using the HDR10 pipeline yield higher HDR visual quality compared to those transmitted with SDR10. In other words, HDR10 clearly outperforms SDR10 in HDR viewing. We also showed that whenever the tone mapping curves were similar to the PQ curve, the subjective quality of the SDR10 pipeline tended to be similar to that of HDR10. Comparison of the SDR quality of the signal delivered by the HDR10 and SDR10 pipelines showed that if the two pipelines use the same invertible TMOs, the SDR10 pipeline outperforms the HDR10. We showed that the reason why HDR10 underperforms SDR10 is that two chroma subsampling steps are applied in different domains in the HDR10 pipeline, which seems to greatly affect the quality of the decoded videos. Compressing HDR in 4:4:4 (without subsampling) in HDR10 seems to greatly enhance the quality of the reproduced SDR videos at similar bitrates. This is an expected outcome since HDR represents color information at higher brightness levels, where the human visual system is more sensitive to color distortions due to subsampling. As the existing infrastructure relies on chroma subsampling, it is imperative to design better chroma subsampling filters that address the needs of HDR for the foreseeable future. We also investigated the case where a non-invertible TMO is used in the HDR10 pipeline to address SDR displays. We compared the quality of the generated SDR content with that of the SDR10 pipeline. In this case, the results indicate that HDR10 clearly outperforms the tested SDR10 pipeline in addressing SDR displays for three out of the four sequences tested, while it performs similarly to SDR10 for the remaining one. In summary, while it was an expected outcome that HDR10 would address HDR displays more efficiently than the SDR10 pipeline, it was shown in this study that it can also be used as an efficient backward compatible pipeline to address SDR displays, without compromising the visual quality.
It was also shown for the first time in this study that the visual quality of the SDR output of the HDR signal transmitted with the HDR10 pipeline can be further enhanced if the display is addressed in 4:4:4 mode. In Section 2.1.4, another set of subjective tests was performed to evaluate how the scalable and HDR10 pipelines compare in delivering HDR videos. Our experiment results showed that the visual quality of the HDR signals processed with the scalable and HDR10 pipelines was not statistically different, although the HDR10 pipeline was on average rated higher in terms of subjective quality. Additionally, the single layer approach is more straightforward and less complex than the scalable one, which sends two layers of information and reconstructs the needed signal at the decoder side depending on the available display. Furthermore, in contrast to the simulcast and scalable approaches that require both HDR and SDR versions of the content, the HDR10 single layer approach only requires an HDR version of the content. Availability of both HDR and SDR versions of the content is quite rare and a very restrictive requirement for broadcasting applications. Thus, we conclude that the HDR10 single layer approach is the most favorable and efficient choice for backward compatible HDR transmission.
Chapter 3: Evaluation of Standard Color Pixel Representations for High Dynamic Range Video Transmission
As described in Sections 1.1.1 and 1.3, different transfer functions and different color encodings can be employed for HDR color pixel representation for transmission purposes. When quantization is applied, these transfer functions and color encodings affect the HDR colors differently. To understand the effect of these processes on HDR colors, in this chapter we evaluate the perceptual color differences caused by employing the transfer functions and color encodings that are part of the ITU-R BT.2100 recommendation, followed by 10-bit quantization. To evaluate the color errors, we rely on the CIE DE2000 metric, a perceptual objective color difference metric based on HVS characteristics. Figure 3-1 shows the general workflow of the evaluation process; this figure is discussed in more detail in Section 3.1.
3.1 Color Difference Evaluation Experiments
In this work, we investigate how the perceptual transfer functions (PTFs) and color pixel representations recommended in BT.2100, followed by quantization, perceptually alter each HDR color. The PTFs used are PQ and HLG, as described in Section 1.1.1, while the color pixel representations are NCL Y'CbCr, CL Y'CbCr, and ICtCp, as described in Sections 1.3.1, 1.3.2, and 1.3.6, respectively.
Figure 3-1  Color difference evaluation experiment workflow
Since neither compression nor chroma sub-sampling is applied to the signals, the generated errors are due to quantization only (see Figure 3-1). Please note that we only consider the signal transmission application and, therefore, 10-bit BT.2020 colors. The 10-bit quantization performed throughout this test follows the restricted range quantization described in BT.2100. Our test encompasses all visible colors representable with BT.2020, for luminance levels ranging from 0.01 cd/m2 up to maximum levels of 1000 cd/m2 and 4000 cd/m2. To construct these colors, we start with the CIE 1976 Lu'v' color space due to its perceptual uniformity. For each luminance level, while L is constant, the u' and v' values are increased from 0 to 0.62 with a step size of 0.001. According to [19], chromaticity changes lower than 0.45/410 ≈ 0.001 are imperceptible to the human eye.
The tested PTFs and color pixel representations are applied to the constructed colors, followed by 10-bit quantization. Please see Figure 3-1 for the complete workflow. The reason for choosing the two maximum luminance values of 1000 and 4000 cd/m2 is that these values correspond to the peak luminances of currently available reference displays. To evaluate the color deviations between the original signal (blue boxes in Figure 3-1) and the tested signal (green boxes in Figure 3-1), we employ the perceptual objective metric CIE DE2000. This metric is designed to work on CIE 1976 L*a*b* (CIELAB) values. For this reason, the original and the encoded signals are transformed to this color space for comparison (see Figure 3-1). The Just Noticeable Difference (JND) threshold in terms of CIE DE2000 is one; in other words, any color difference less than one is not perceptible by human eyes [42]. Moreover, the larger the value of the CIE DE2000 metric, the more perceptually different the tested colors are.
3.2 Results and Discussions
Figure 3-3 and Figure 3-4 show the errors generated due to 10-bit NCL YCbCr and 10-bit CL YCbCr color encoding, respectively, at luminance levels of 0.01, 0.1, 1, 10, 100, 500 and 1000 cd/m2, with PQ as the PTF. We depict the DE2000 error values using a color error bar system where dark blue corresponds to values less than the JND (below 1). Therefore, as soon as a light blue is shown, it represents a visible color distortion. Please note that the loss of colors at the luminance levels of 0.01 cd/m2 and 1000 cd/m2 is due to the clipping enforced by the supported luminance range, which is 10000 cd/m2 in the case of PQ (refer to the Y derivation formula in BT.2020 for more details). As can be observed, the color errors are mainly around the white point. It is well known that the HVS is more sensitive to changes in brightness. As the colors around the white point are brighter, any change due to quantization is more visible (and hence yields a larger DE2000 value). This observation is consistent throughout our experiment when the color error is measured in the YCbCr color space.
Figure 3-2  Colors represented by BT.2020
Figure 3-3  Color errors of 10-bit NCL YCbCr with PQ in terms of DE2000
Figure 3-4  Color errors of 10-bit CL YCbCr with PQ in terms of DE2000
Figure 3-5  Color errors of 10-bit NCL YCbCr with HLG (reference display peak luminance of 4000 cd/m2) in terms of DE2000
Figure 3-6  Color errors of 10-bit CL YCbCr with HLG (reference display peak luminance of 4000 cd/m2) in terms of DE2000
Figure 3-7  Color errors of 10-bit NCL YCbCr with HLG (reference display peak luminance of 1000 cd/m2) in terms of DE2000
Figure 3-8  Color errors of 10-bit CL YCbCr with HLG (reference display peak luminance of 1000 cd/m2) in terms of DE2000
Figure 3-9  Color errors of 10-bit ICtCp with PQ in terms of DE2000
Figure 3-10  Color errors of 10-bit ICtCp with HLG (reference display peak luminance of 4000 cd/m2) in terms of DE2000
Figure 3-11  Color errors of 10-bit ICtCp with HLG (reference display peak luminance of 1000 cd/m2) in terms of DE2000
Comparing the results in Figure 3-3 and Figure 3-4, we observe that by simply changing from NCL to CL YCbCr, the color errors are reduced and are less noticeable. This reduction in color errors is more evident with red and blue combinations.
That is because the CL Y' is more de-correlated from Cb and Cr (the blue and red differences from Y') compared to NCL Y'. As a result, changing NCL Y' to CL Y' makes the reconstruction of the blue and red channels more error resilient. Figure 3-5 and Figure 3-6 show the 10-bit NCL YCbCr and 10-bit CL YCbCr color pixel representations, respectively, with HLG as the PTF, where the reference display peak luminance is assumed to be 4000 cd/m2. Figure 3-7 and Figure 3-8 are similar to Figure 3-5 and Figure 3-6, with the exception that the reference display peak luminance is assumed to be 1000 cd/m2. The errors that are generated with HLG at high luminance levels (L = 500 and 1000 cd/m2 for Figure 3-5 and Figure 3-6, and L = 100, 500 and 1000 cd/m2 for Figure 3-7 and Figure 3-8) are due to the clipping enforced by the reference display luminance level. Note that the same errors will also happen with PQ if it is assumed that the content was mastered on a grading display before encoding. Comparing the results of CL and NCL (compare Figure 3-5 with Figure 3-6, and Figure 3-7 with Figure 3-8), we found that the color errors are reduced and are less noticeable in the case of CL. This observation is consistent with the one derived when the PQ PTF is used (comparing Figure 3-3 with Figure 3-4). The rest of the errors present in YCbCr encoding, even when using the CL method at the different luminance levels (see Figure 3-6 and Figure 3-8), are due to quantization and the correlation of Y' with Cb and Cr. Quantization errors are the result of the limited number of code words assigned to each luminance level. By comparing HLG and PQ at each luminance level (compare Figure 3-3 with Figure 3-5 and Figure 3-7, and Figure 3-4 with Figure 3-6 and Figure 3-8), it can be observed that PQ outperforms HLG at dark luminance levels (up to 100 cd/m2) in the Y'CbCr color space. This behavior can be explained by the fact that HLG consists of a gamma function for dark areas and a logarithmic one for bright areas. This results in fewer code-words for the dark areas compared to the bright areas. This also explains why HLG produces fewer errors at high luminance levels compared to PQ (compare the luminance levels of 100 cd/m2 and higher in Figure 3-3 and Figure 3-4 with Figure 3-5, Figure 3-6, Figure 3-7, and Figure 3-8). Another noteworthy observation is how HLG performs depending on the peak luminance of the display, seen by comparing Figure 3-6 with Figure 3-8 in the CL case (or Figure 3-5 with Figure 3-7 in the NCL case). With HLG at a reference display peak luminance of 1000 cd/m2, more code words are allocated to dark areas, as the content range is normalized to a smaller value compared to the case of a reference display peak luminance of 4000 cd/m2. This behavior change of HLG at different peak luminance levels does not happen with PQ, as the latter always assumes a peak luminance of 10000 cd/m2. Please note that BT.2100 suggests clipping HLG signals that fall outside the [0, 1] range at the display side. However, since addressing the display is out of the scope of this chapter, we did not clip the encoded signal to the [0, 1] range. Finally, Figure 3-9, Figure 3-10, and Figure 3-11 show the color errors generated by the ICtCp color encoding paired with PQ, HLG with a peak luminance of 4000 cd/m2, and HLG with a peak luminance of 1000 cd/m2, respectively. As can be observed, ICtCp with PQ can represent most of the colors without any visible error at the majority of the luminance levels.
As Figure 3-9 shows, since ICtCp de-correlates the chroma channels from the luminance channel quite well (see [74]), when using PQ the errors are mainly due to quantization and are centered at the white point. When HLG is used with ICtCp, colors at darker luminance levels are represented with more errors compared to the colors at higher luminance levels. The loss of colors due to the clipping enforced by the luminance levels (10000 cd/m2 for Figure 3-9, 4000 cd/m2 for Figure 3-10 and 1000 cd/m2 for Figure 3-11) is also visible in Figure 3-9, Figure 3-10, and Figure 3-11. Please note how the color errors with ICtCp are not oriented only towards the red and blue channels, as they are with YCbCr. This can be explained by the de-correlation of the intensity (I) channel from Ct and Cp. We conclude that, based on the presented results, ICtCp with PQ yields the best performance in terms of preserving HDR colors over the tested luminance levels when only quantization errors are taken into account. These results can be explained by the fact that ICtCp was designed to better de-correlate intensity from the chroma channels. HLG can be beneficial due to its backward-compatibility characteristics, since it also represents HDR colors in bright areas with minimal errors.
3.3 Conclusions
In this chapter, we evaluated the visual color differences caused by different PTFs and color representations followed by 10-bit quantization. It was shown that, even before compression, the choice of PTF and color pixel representation affects visual color perception. Particularly, it was shown in the case of YCbCr that PQ performs better than HLG at dark luminance levels, while HLG performs as well as PQ at bright luminance levels. The performance of HLG according to its reference display peak luminance also showed that the lower this value is, the better HLG performs at both dark and bright luminance levels. It was also shown that 10-bit ICtCp outperforms 10-bit YCbCr, both with CL and NCL derivation, in representing color, due to its better de-correlation of luminance and chrominance. Although ICtCp with PQ represents colors throughout most of the tested luminance levels with minimal errors, there are still large errors in bright areas around the white point due to 10-bit quantization. We will address these errors in Chapter 4.2.
Chapter 4: Chroma Processing Schemes for Improved Color Accuracy of Transmitted HDR Video Content
In this chapter, we propose two techniques that result in a more accurate representation of HDR color pixels for transmission purposes. In Chapter 4.1, a scaling function is proposed for the perceptually uniform color encoding of CIELAB such that chroma code-words are utilized more efficiently for this representation without increasing the bitrate. In Chapter 4.2, we propose a technique that re-distributes the chroma code-words of ICtCp, which was shown in Chapter 3 to outperform other color encodings in representing HDR colors, such that its color accuracy is improved without affecting the compression bitrate in a negative way.
4.1 Chroma Scaling of CIE LAB Color Space for Efficient HDR Video Content Transmission
In this section, the performance of a perceptually uniform color encoding of CIELAB for HDR video compression is investigated. As CIELAB is designed for SDR brightness values up to 100 cd/m2, some adjustments are made to its original form to accommodate HDR signals.
More specifically, a scaling function for the chroma channels a* and b* is proposed to better make use of the available code-words, as presented in what follows.
4.1.1 Proposed Modifications
CIELAB, or L*a*b*, consists of one brightness channel (L*), which covers luminances up to 100 cd/m2, and two color channels, a* and b*, which cover colors from green to red and from blue to yellow, respectively. Each of these channels is constructed as follows:

$$L^{*} = 116\, f\!\left(\frac{Y}{Y_n}\right) - 16 \qquad (21)$$

$$a^{*} = 500\left[ f\!\left(\frac{X}{X_n}\right) - f\!\left(\frac{Y}{Y_n}\right) \right] \qquad (22)$$

$$b^{*} = 200\left[ f\!\left(\frac{Y}{Y_n}\right) - f\!\left(\frac{Z}{Z_n}\right) \right] \qquad (23)$$

$$f(w) = \begin{cases} w^{1/3}, & w > 0.008856 \\ 7.787\,w + 16/116, & w \le 0.008856 \end{cases} \qquad (24)$$

where Xn, Yn, and Zn are the XYZ components of the white point. Since HDR luminance values can go up to 10000 cd/m2, the current CIELAB cannot efficiently handle an HDR signal. To address this issue, an hdr-CIELAB is proposed in [99]. The only change in hdr-CIELAB is the transfer function, chosen to have better performance for shadows and highlights compared to conventional CIELAB; otherwise, all the other derivations are the same as in CIELAB. Yet, the encoded L* in hdr-CIELAB only covers luminances up to 245 cd/m2. In this work, we propose to use SMPTE ST 2084 as the transfer function for HDR luminance values in CIELAB. Therefore, the proposed L*, a*, and b* channels are calculated as follows:

$$L^{*} = Y' \qquad (25)$$

$$a^{*} = \begin{cases} \dfrac{f(X/X_n) - f(Y/Y_n)}{0.1441 \times 2}, & -0.1441 \le x \le 0 \\[6pt] \dfrac{f(X/X_n) - f(Y/Y_n)}{0.1083 \times 2}, & 0 < x \le 0.1083 \end{cases} \qquad (26)$$

$$b^{*} = \begin{cases} \dfrac{f(Y/Y_n) - f(Z/Z_n)}{0.2338 \times 2}, & -0.2338 \le x \le 0 \\[6pt] \dfrac{f(Y/Y_n) - f(Z/Z_n)}{0.6208 \times 2}, & 0 < x \le 0.6208 \end{cases} \qquad (27)$$

where f(·) now denotes the SMPTE ST 2084 perceptual quantizer applied to the normalized components, x denotes the corresponding numerator in each case, and X', Y', and Z' are the perceptually quantized X, Y and Z signals using SMPTE ST 2084. In (26) and (27), the a* and b* channels are scaled to fall within [-0.5, 0.5] so that BT.1361 quantization [21] can be applied to them. The proposed CIELAB for HDR signals is somewhat similar to YDzDx. However, in the proposed modified CIELAB, the color difference channels are scaled differently for positive and negative differences so that code-words are utilized more efficiently.
4.1.2 Experiments Setup
To evaluate the proposed modified CIELAB color encoding for HDR video compression, we use four HDR video sequences from the MPEG HDR video dataset: FireEater2, Market3, BalloonFestival, and SunRise. All of these videos are 1920x1080p and are in the BT.2020 container, although their actual colors fall inside the BT.709 gamut. Figure 4-1 shows tone mapped snapshots of the first frame of each video sequence.
Figure 4-1  Snapshots of the first frames of HDR test video sequences (tone-mapped version): (a) FireEater2, (b) Market3, (c) BalloonFestival, and (d) SunRise
Figure 4-2 shows how the original linear light HDR content is encoded to the modified CIELAB, followed by quantization and chroma down-sampling. It is worth noting that our modified CIELAB-based method uses the original sampling filters designed specifically for YCbCr, and as such they are not optimized for our proposed scheme. For compression, we used the HEVC encoder reference software HM 16.15, Main10 profile. We coded the tested videos at four bit-rate levels using four QPs, as suggested in the MPEG CfE. To compare the color encoded and compressed signals with the original ones in terms of quality, they are de-compressed and converted back to the linear light domain, as shown in Figure 4-2.
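To make the construction in (25)-(27) concrete, the following is a minimal sketch combining the SMPTE ST 2084 (PQ) nonlinearity with the proposed asymmetric chroma scaling. The PQ constants are the standardized ST 2084/BT.2100 values; the white point handling (normalizing XYZ by the white point before PQ encoding) and all function names are our own assumptions, as this excerpt does not pin them down.

    import numpy as np

    # SMPTE ST 2084 (PQ) constants, as standardized in BT.2100.
    M1 = 2610 / 16384
    M2 = 2523 / 4096 * 128
    C1 = 3424 / 4096
    C2 = 2413 / 4096 * 32
    C3 = 2392 / 4096 * 32

    def pq_curve(y):
        """ST 2084 nonlinearity on a normalized [0, 1] input."""
        y = np.clip(np.asarray(y, dtype=float), 0.0, 1.0)
        return ((C1 + C2 * y**M1) / (1.0 + C3 * y**M1)) ** M2

    # Range limits of the PQ-domain differences, as quoted in (26) and (27).
    A_NEG, A_POS = 0.1441, 0.1083
    B_NEG, B_POS = 0.2338, 0.6208

    def scale_asymmetric(diff, neg_max, pos_max):
        """Map a PQ-domain color difference onto [-0.5, 0.5], scaling the
        negative and positive sub-ranges separately (equations (26)-(27))."""
        return np.where(diff <= 0, diff / (neg_max * 2.0), diff / (pos_max * 2.0))

    def modified_cielab(xyz, white=(10000.0, 10000.0, 10000.0)):
        """Sketch of the proposed HDR CIELAB channels from absolute XYZ values."""
        fx = pq_curve(xyz[..., 0] / white[0])
        fy = pq_curve(xyz[..., 1] / white[1])
        fz = pq_curve(xyz[..., 2] / white[2])
        L = fy                                        # (25): L* = Y'
        a = scale_asymmetric(fx - fy, A_NEG, A_POS)   # (26)
        b = scale_asymmetric(fy - fz, B_NEG, B_POS)   # (27)
        return np.stack([L, a, b], axis=-1)

    # Example: a neutral 100 cd/m2 grey maps to a* = b* = 0 and L* of about 0.5.
    print(modified_cielab(np.array([100.0, 100.0, 100.0])))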
4.1.3 Results and Discussions
Figure 4-3, Figure 4-4, Figure 4-5, and Figure 4-6 show the bit-rate versus DE1000, Overall Signal to Noise Ratio (OSNR), and perceptually Transformed Peak Signal to Noise Ratio (tPSNR), in dB, for the proposed modified CIELAB, NCL YCbCr, luma-adjusted NCL YCbCr, ICtCp, and YDzDx for FireEater2, Market3, SunRise, and BalloonFestival, respectively. tPSNR is the average of the PSNR of the X', Y' and Z' channels. OSNR is the overall SNR of X', Y' and Z', computed by calculating the error for each pixel and then averaging the errors. DE1000 is the PSNR-based value of the average error in terms of the CIE DE2000 metric. Table 4-1 also shows the bit-rate savings in terms of the same metrics for the proposed color encoding over NCL YCbCr.
Figure 4-2  Pre/post processing steps of the proposed modified CIELAB for HDR video compression
As can be seen from Figure 4-3, Figure 4-4, Figure 4-5, and Figure 4-6, the proposed modified CIELAB clearly outperforms NCL YCbCr, luma-adjusted NCL Y'CbCr, and YDzDx in terms of DE1000. This shows that the proposed method can maintain the original colors better at any given bitrate. The proposed method performs almost identically to ICtCp in terms of DE1000. Moreover, it can be seen from Figure 4-3, Figure 4-4, Figure 4-5, and Figure 4-6 that the proposed method also outperforms NCL YCbCr, luma-adjusted NCL YCbCr, and YDzDx in terms of OSNR, especially at higher bitrates. All the tested color encoding schemes seem to perform similarly in terms of tPSNR. Please note that the chroma down-sampling filter used for the proposed CIELAB is the same as the one used for YCbCr. However, better performance may be achieved in terms of tPSNR and OSNR if a new sampling filter is designed that better matches the a* and b* characteristics. Although this is not in the scope of this work, it is part of our future work. Moreover, the rate-distortion optimization (RDO) setting inside the encoder was kept the same in all these experiments. Since the current RDO is customized for YCbCr characteristics, it is expected that further improvements may be obtained by modifying the RDO process according to the proposed modified CIELAB color encoding. This step as well is in the scope of future work. Another noteworthy observation from Figure 4-3, Figure 4-4, Figure 4-5, and Figure 4-6 is how YDzDx underperforms all the tested color encodings, although its derivation is very similar to the one proposed here. However, as the proposed scaling of a* and b* employs the available code-words more efficiently, it achieves better compression performance compared to YDzDx.
By using a chroma down-sampling filter that is designed for the proposed space and changing the encoder rate-distortion optimization process, it is expected to improve the performance of the tested color in terms of tPSNR and OSNR Table 4-1 Bit-rate Savings of the Proposed CIELAB Compared to NCL YCbCr Metric  Video tPSNR X   (%) tPSNR Y (%) tPSNR Z (%) tPSNR XYZ (%) tOSNR XYZ (%) DE1000 (%) FireEater2 -7.4 8.0 -3.0 -1.0 -24.0 -32.6 Market3 13.1 17.4 2.7 10.5 6.2 -63.6 SunRise 0.2 9.9 -23.0 -6.3 -19.5 0.0 BalloonFestival 5.4 21.5 -17.7 -0.7 -14.3 -69.2 Average 2.9 14.2 -10.2 0.6 -12.9 -41.3  96   Figure 4-3 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjusted YCbCr, Y’D’zD’x and ICtCp in terms of DE100 (db), OSNR (db) and tPSNR (db) for FireEater2  Figure 4-4 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjusted YCbCr, Y’D’zD’x and ICtCp in terms of DE100 (db), OSNR (db) and tPSNR (db) for Market3 97   Figure 4-5 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjusted YCbCr, Y’D’zD’x and ICtCp in terms of DE100 (db), OSNR (db) and tPSNR (db) for SunRise  Figure 4-6 R-D curves of the proposed color encoding compared to NCL YCbCr, luma-adjuste YCbCr, Y’D’zD’x and ICtCp in terms of DE100 (db), OSNR (db) and tPSNR (db) for BalloonFestival  98  4.2 A Novel Chroma Processing Scheme for Improved Color Accuracy of Transmitted HDR Video Content Our eyes do not perceive all colors the same way, being sensitive to changes in some colors more than others. In this work, by taking advantage of this characteristic of the human eye, we propose a chroma processing scheme that reduces the color errors generated by the quantization and subsampling processes in the HDR video delivery pipeline. To this end, our method uses the de-correlated color space of ICtCp which was shown in Chapter 3 to outperform other color encodings in representing HDR colors and re-distributes chroma code-words in favor of those colors that our eyes are most sensitive to. 4.2.1 Proposed Chroma Processing Scheme Our eyes do not perceive different colors in the same way i.e. our eyes are sensitive to changes in some colors more than the others. In this work, we propose a chroma processing that exploits this characteristic of the human eye to assign more code-words during the bit-depth quantization to those colors that our eyes are more sensitive to. To achieve that, we first need to identify how different colors are distorted due to bit-depth quantization and whether these distortions are visible to our eyes. We use the CIE DE2000 [41], a perceptual-based color difference metric, to calculate the color differences between the original and the 10-bit quantized colors. To construct the original colors, we consider the colors in BT. 2020 starting with the CIE 1976 Lu’v’ color space due to its perceptual uniformity. While L is constant, the u’ and v’ values are increased from 0 to 0.62 with step size of 0.001. According to [19], chromaticity changes lower 99  than 0.001 are imperceptible to the human eye. Since constructing and analyzing all the colors of the BT.2020 gamut (230 of them if we consider 10 bits) is impractical, we sample these colors at different luminance (L) levels. The samples we use for L values are 0.05, 0.5, 5, 50, 100, and 400 nits, resulting in 6 color patches. We stopped at 400 nits, since the range of colors represented at maximum luminance of 1000 nits is significantly reduced.  
Each of these color patches is then converted to ICtCp through the workflow shown in the top branch of Figure 4-7. To understand how chroma quantization affects color perception, the chroma channels Ct and Cp are then quantized to 8, 9 and 10 bits, while the luma channel I is always quantized to 10 bits. Note that instead of multiplying the Ct and Cp channels by 2^8 and 2^9 to obtain 8 and 9-bit chroma channels, we scale them by 1/4 and 1/2, respectively, and then quantize to 10 bits. The resulting quantized I10CtCp10, I10CtCp9 and I10CtCp8 signals (the numbers representing the bit-depth of each channel) are then converted back to linear RGB, as shown in the bottom branch of Figure 4-7, by inverting the steps taken to construct the ICtCp signals. Note that the Just Noticeable Difference (JND) threshold in terms of DE2000 is equal to one, meaning that any color difference less than one is not perceptible by human eyes. Moreover, the larger the value of the DE2000 metric, the more perceptually different the compared colors are. Since the DE2000 metric is designed specifically for the CIELAB color space, the original RGB and the RGB converted from the quantized ICtCp signals are converted to CIELAB.
Figure 4-7  Workflow for calculating color errors in terms of DE2000 due to ICtCp quantization on BT.2020 sampled colors
Note that DE2000 requires the white point of the values to be set to 100 nits for SDR. However, since our experiments are on HDR content, throughout this work the white point for calculating DE2000 is set at 1000 nits. Figure 4-8 (b), (c), and (d) show the color differences due to quantization, in terms of DE2000, at the selected sampled luminance levels and bit-depths using ICtCp relative to the original linear colors, depicted using an error bar. We also included the color differences generated by 10-bit YCbCr as a reference in Figure 4-8 (a). The luminance transfer function for YCbCr was SMPTE ST 2084, also known as the Perceptual Quantizer (PQ); this is also the transfer function used for ICtCp. Note that no chroma subsampling was performed on these colors, so the visible errors are merely due to the quantization (Figure 4-7). Visible color differences are shown with colors other than dark blue in Figure 4-8 (see the error bar on the right). Figure 4-8 implies that 10-bit YCbCr and I10CtCp9 show similar behavior in representing colors, with I10CtCp9 slightly outperforming 10-bit YCbCr. However, reducing the bit-depth of the Ct and Cp channels further to 8 bits, as shown in Figure 4-8 (d), introduces visible color differences that exceed those of the 10-bit YCbCr (Figure 4-8 (a)). Another interesting observation from Figure 4-8 is that the color differences are mainly visible around the white point as the bit-depth is reduced, while saturated colors at the border of the gamut are still intact at 8 bits per chroma channel in terms of DE2000. The latter observation implies that colors around the white point need to be represented more accurately and thus must be assigned more code-words than the ones further away. In contrast, saturated colors can be quantized with fewer bits and still be perceptually similar to the original color. This suggests a non-linear behavior in the chroma channels in terms of required code-words.
Figure 4-8  Color errors generated due to bit-depth quantization for (a) 10-bit YCbCr, (b) 10-bit ICtCp, (c) I10CtCp9, (d) I10CtCp8, (e) the proposed I10Ct*Cp*9, (f) the proposed I10Ct*Cp*8, and (g) the proposed I10Ct*Cp*7 over the BT.2020 gamut, using the error bars on the right.
Numbers indicate the used bit-depth.
Since saturated colors are the ones with Ct and Cp at the beginning and end of the supported [-0.5, +0.5] range, the non-linear behavior of the code-words needed to encode the chroma channels can be modeled with a simple sigmoid function:

$$f(C_t) = C_t^{*} = \frac{20}{1 + e^{-0.45\,C_t}} - 0.5 \qquad (28)$$

The offset of 0.5 is added to maintain the chroma range at [-0.5, +0.5]. Cp is also modeled by the same function as in (28). The transferred Ct* and Cp* channels are then scaled down by 1/4 and quantized as below using limited-range 10-bit code-words:

$$C_t^{*}(10\text{-bit}) = \left(224 \times \left(\tfrac{1}{4}\,C_t^{*}\right) + 16\right) \times 4 \qquad (29)$$

Cp* is also quantized to 10 bits as in (29). Figure 4-9 shows how Ct (or similarly Cp) values are translated into code-words using the proposed transfer function compared to the original linear scaling method, over a 10-bit range of code-words.
Figure 4-9  Original and transferred chroma code-words with 10 bits
Figure 4-10 shows the percentage of assigned code-words for each range of Ct and Cp values, with a step size of 0.1, using our proposed method and the existing linear quantization in the 10-bit case. As can be seen, most of the available code-words are allocated to colors around the white point, with Ct and Cp values between -0.2 and 0.2.
Figure 4-10  Comparison of the chroma channels' code-word distribution with 10 bits using the original and the proposed quantization
To evaluate the performance of the proposed chroma transfer function in (28), we apply it on the Ct and Cp channels before quantization. The transformed Ct* and Cp* channels are then quantized to 9, 8 and 7 bits. Figure 4-8 (e), (f) and (g) show the color differences generated by I10Ct*Cp*9, I10Ct*Cp*8 and I10Ct*Cp*7. It can be observed that the color differences around the white point for the 9 and 8-bit Ct* and Cp* channels are reduced compared to those of their unprocessed counterparts and those of the 10-bit YCbCr. I10Ct*Cp*7 errors are, however, larger than those of 10-bit YCbCr and I10CtCp9. Hence, we choose 8 bits as the minimum number of bits required to represent HDR colors using our method with perceptual differences comparable to 10-bit YCbCr. Based on Figure 4-8 (b), (c) and (d), it is observed that the color errors are more concentrated along the vertical axis, from yellow to blue. This range of colors corresponds to what the Ct channel represents. This observation implies a different behavior for Ct and Cp, indicating that a different code-word redistribution function may be needed for representing each of the two. However, in this work we focus on using a common transfer function for both and leave the study of deriving a specific function for each chroma channel to our future research.
4.2.2 Color Perception Evaluation of the Proposed Method
Although Figure 4-8 (f) clearly shows that the proposed chroma transfer function with 8 bits does not result in visible color changes beyond those of I10CtCp9 or 10-bit YCbCr in terms of DE2000, we conducted a set of subjective tests to validate the objective evaluations. Note that no video compression is performed for this test.
4.2.2.1 Test Setup
To evaluate whether our proposed method results in visible color changes, we applied our method on a set of 10 images, cut-outs of a frame from the existing HDR video dataset in the MPEG CfE. These test images are shown in Figure 4-11 (a). Images were shown as a pair to the viewers, with the original unprocessed image on one side and the chroma processed I10Ct*Cp*8 image on the other side.
We processed the original images with increased and decreased saturation as well, and included them among the test images as control (dummy) stimuli. Note that the participants were aware of the original image's position. All the test images were shown in randomized order. The steps taken to prepare the test images are shown in Figure 4-12 (without the codec). An SMPTE ST 2084 transfer function was applied as the last step to match the display's input expectations in HDR mode. Note that the same step was applied to the original images as well. All images were scaled to 1000 nits before displaying, to correspond to the maximum luminance supported by the display. The images were shown on a 30-inch Sony BVM-X300 display at 10 bits with the BT.2020 color gamut. Note that we did not apply chroma subsampling for this test, to isolate the color differences, if any, to the quantization only. Subjects were asked to rate each pair as 'different' or 'same' in terms of color perception. Sixteen individuals participated in our test, a mix of naïve viewers and experts in the field. All participants were tested for color blindness and visual acuity, and those who failed these tests did not participate.
Figure 4-11  (a) Original test image cut-outs (tone-mapped), (b) color errors generated by I10CtCp9, and (c) color errors generated by I10Ct*Cp*8, using the error bar in Figure 4-8
4.2.2.2 Results
Figure 4-13 shows the results of the subjective test evaluations. The blue bars represent the number of subjects that detected a difference between the image pairs of the original and the ones processed with our proposed method (we do not report the results for the increased and decreased saturation dummy test images). Since the experiment was based on the participants' opinion of the similarity between two images, we did not discard any subjects as outliers. On average, the images processed with the proposed chroma processing were rated identical to the originals by 83% of the subjects. I7 has the lowest similarity score, with 68% of subjects rating it similar to its original counterpart, while I10 reached the 100% mark. Figure 4-11 (b) shows the color errors for each pixel of each test image in terms of DE2000 generated by I10CtCp9, using the same error bar indicator as in Figure 4-8. Figure 4-11 (c) shows the same errors generated by our method. We observe that the proposed method reduces both the number of pixels with visible errors and the magnitude of the errors.
Figure 4-12  End-to-end workflow with the proposed chroma processing scheme shown in the purple dotted box
Table 4-2 reports the percentage of pixels with visible color errors, as well as the average error over the entire frame, comparing the proposed method to I10CtCp9 and 10-bit YCbCr in 4:4:4 and 4:2:0 formats. As shown, the proposed method substantially reduces the percentage of pixels with visible errors in both the 4:4:4 and 4:2:0 formats, while the average error is also reduced. Another interesting observation from Figure 4-11 (b) and Figure 4-11 (c) is that most visible errors in the test images are around white and its shades (see the I3 and I8 errors and how they are reduced using the proposed chroma processing). Although, based on Figure 4-11 (c), I3 and I8 show the most visible errors (represented with red), they are still rated similar by 85% and 95% of the subjects, respectively.
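Before turning to compression, the following is a minimal sketch of the chroma re-mapping idea behind (28) and (29): a steep sigmoid centered at zero spends most of the output range, and hence most of the code-words, on small Ct/Cp magnitudes (colors near the white point), after which the scaled-down channel is quantized with limited-range 10-bit levels. The gain k and the normalization are our own illustrative choices, not the constants printed in (28).

    import numpy as np

    def redistribute_chroma(c, k=9.0):
        """Sigmoid re-mapping of a chroma channel (Ct or Cp) in [-0.5, 0.5].

        A steep sigmoid centered at zero spends most of the output range on
        small |c| (colors near the white point). The result is re-normalized so
        that +/-0.5 still map exactly to +/-0.5; k controls the steepness and
        is an assumed value here, not the thesis's constant.
        """
        c = np.asarray(c, dtype=float)
        s = 1.0 / (1.0 + np.exp(-k * c)) - 0.5       # zero-centered sigmoid
        return 0.5 * s / (1.0 / (1.0 + np.exp(-k * 0.5)) - 0.5)

    def quantize_limited_10bit(c_star, scale=0.25):
        """Scale the transferred chroma down (e.g. by 1/4 for an 8-bit-equivalent
        footprint) and quantize with limited-range 10-bit chroma levels
        (D = 896*C + 512, the 10-bit form of the 224*C + 128 rule)."""
        return np.round(896.0 * (scale * c_star) + 512.0)

    ct = np.linspace(-0.5, 0.5, 5)
    print(quantize_limited_10bit(redistribute_chroma(ct)))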
4.2.3 Compression Performance of the Proposed Chroma Processing – Objective Evaluation
In the previous section we showed that our approach results in a more accurate representation of colors compared to the existing alternatives. In this section, we objectively evaluate the effect of compression on the decoded signal when our chroma processing scheme is used.
Figure 4-13  Subjective test results of the color difference perception between the images generated by the proposed method and the original images
4.2.3.1 Pre-Processing
We processed three 1920x1080 HDR videos, BalloonFestival, FireEater2, and Market3, from the MPEG Call for Evidence video dataset using our proposed chroma processing I10Ct*Cp*8. The videos using the proposed I10Ct*Cp*8 were generated based on the workflow shown in the top row of Figure 4-12. For comparison, we also generated I10CtCp9 by using a scaling value of 0.5 (1/2^(10-9)) for Ct and Cp (see Figure 4-7). To separate the color errors due to quantization from those due to chroma subsampling, we generated both 4:4:4 and 4:2:0 signals and encoded them separately. The 4:2:0 signals were generated by applying the subsampling filter suggested in the MPEG CfE before the quantization step in Figure 4-12. All three channels are quantized to 10 bits using the BT.1361 limited range.

Table 4-2  Average DE2000 and Percentage of Pixels with DE2000 Value Greater Than One for the Test Images when Represented with YCbCr10, I10CtCp9 and the Proposed I10Ct*Cp*8 with 4:4:4 and 4:2:0 Chroma

  % of pixels with DE > 1
  Format  Method       I1     I2     I3     I4     I5     I6     I7     I8     I9
  4:4:4   YCbCr10      60.49  56.11  57.93  60.07  61.45  58.87  43.89  63.44  64.66
          I10CtCp9     28.1   43.73  37.97  40.66  50.36  47.71  44.32  68.94  48.09
          I10Ct*Cp*8   39.38  34.96  30.30  32.04  39.98  33.18  29.75  48.75  39.40
  4:2:0   YCbCr10      64.01  70.62  70.82  75.51  77.77  65.90  52.52  63.95  71.46
          I10CtCp9     33.63  66.16  60.93  62.75  74.47  60.94  66.77  69.79  60.16
          I10Ct*Cp*8   45.82  61.84  56.39  58.35  68.90  50.38  53.60  50.40  55.3

  Average DE of the test images
  Format  Method       I1     I2     I3     I4     I5     I6     I7     I8     I9
  4:4:4   YCbCr10      1.10   1.17   1.15   1.21   1.27   1.19   0.99   1.27   1.28
          I10CtCp9     0.83   0.98   0.91   0.95   1.05   1.03   0.98   1.23   1.03
          I10Ct*Cp*8   0.92   0.88   0.82   0.85   0.92   0.86   0.83   1.04   0.93
  4:2:0   YCbCr10      1.29   1.81   1.86   1.90   1.99   1.46   1.62   1.31   1.62
          I10CtCp9     1.03   1.68   1.67   1.67   1.82   1.38   1.70   1.26   1.40
          I10Ct*Cp*8   1.12   1.60   1.58   1.59   1.71   1.21   1.54   1.07   1.32

4.2.3.2 Video Coding
The proposed I10Ct*Cp*8 and I10CtCp9 were compressed using the HEVC encoder software version HM 16.16. We used the Main10 Range Extension (RExt) and Main10 profiles to compress the 4:4:4 and 4:2:0 signals, respectively. The GOP size for both cases was set to 8, and the Chroma QP Offset was set to 0. To generate four different bitrates, we used the QP values of [20, 23, 26, 29], [18, 22, 26, 30], and [21, 25, 29, 33] for BalloonFestival, FireEater2, and Market3, respectively. The IntraPeriod parameter was set to 24 for BalloonFestival and FireEater2, and 48 for Market3. Video Usability Information (VUI) parameters were used to signal the HDR metadata.
4.2.3.3 Post-Processing
Once the bitstreams are decoded, we convert them back to linear RGB using the steps shown in Figure 4-12 for I10Ct*Cp*8 and in Figure 4-7 for I10CtCp9. HDRTools is used for the objective evaluations [100][101]. The white point was set to 1000 nits for the metrics that are calculated relative to the white point.
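For reference, a minimal sketch of the linear BT.2020 RGB to PQ-encoded ICtCp conversion underlying the pre- and post-processing chains above is given below, following the BT.2100 derivation; the function names are our own.

    import numpy as np

    M1, M2 = 2610 / 16384, 2523 / 4096 * 128
    C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

    def pq(y):
        """ST 2084 nonlinearity on a normalized [0, 1] input."""
        y = np.clip(np.asarray(y, dtype=float), 0.0, 1.0)
        return ((C1 + C2 * y**M1) / (1.0 + C3 * y**M1)) ** M2

    # BT.2100 matrices: linear BT.2020 RGB -> LMS, and L'M'S' -> ICtCp.
    RGB_TO_LMS = np.array([[1688, 2146, 262],
                           [683, 2951, 462],
                           [99, 309, 3688]]) / 4096.0
    LMS_TO_ICTCP = np.array([[2048, 2048, 0],
                             [6610, -13613, 7003],
                             [17933, -17390, -543]]) / 4096.0

    def rgb2020_to_ictcp(rgb, peak=10000.0):
        """Convert linear BT.2020 RGB (cd/m2) to PQ-encoded ICtCp (BT.2100)."""
        lms = np.asarray(rgb, dtype=float) @ RGB_TO_LMS.T   # linear LMS
        return pq(lms / peak) @ LMS_TO_ICTCP.T              # I, Ct, Cp

    # Neutral grey has no chroma: Ct and Cp come out as (numerically) zero.
    print(rgb2020_to_ictcp([100.0, 100.0, 100.0]))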
4.2.3.4 Results and Discussions
Table 4-3 provides the average bitrate saving/loss in terms of the tPSNR Y, tPSNR XYZ, DE1000, MD1000 and PSNR L1000 metrics, computed using the Bjøntegaard average difference [102], for I10Ct*Cp*8 and I10CtCp9 with 10-bit NCL YCbCr as the anchor. For measuring luma quality, we used tPSNR Y and PSNR L1000. Note that DE1000 in the table is the DE2000 with the white point set to 1000 nits to address HDR content, presented in dB.

Table 4-3  Average Objective Results of I10CtCp9 and I10Ct*Cp*8 Compared to the Anchor of 10-bit YCbCr (Negative Values Indicate an Improvement over the Anchor, While Positive Values Indicate a Loss)

  Method             Sequence          tPSNR Y   tPSNR XYZ   DE1000   MD1000   PSNRL1000
  I10Ct*Cp*8 4:4:4   FireEater2        -2.8%     4.0%        -23.7%   15.9%    -5.6%
                     Market3           3.8%      2.3%        -24.1%   -10.9%   3.7%
                     BalloonFestival   0.0%      4.0%        -8.2%    -17.9%   -0.2%
                     Overall           0.4%      3.5%        -18.6%   -4.3%    -0.7%
  I10CtCp9 4:4:4     FireEater2        1.5%      4.9%        -21.1%   11.0%    -1.2%
                     Market3           3.2%      1.7%        -18.9%   -9.0%    3.1%
                     BalloonFestival   2.9%      0.8%        -7.3%    -17.4%   2.7%
                     Overall           2.0%      1.8%        -15.8%   -5.3%    1.5%
  I10Ct*Cp*8 4:2:0   FireEater2        -13.5%    5.2%        -34.3%   9.3%     -13.0%
                     Market3           -0.5%     -3.3%       -33.8%   -57.4%   0.4%
                     BalloonFestival   -5.0%     0.6%        -10.7%   -64.3%   -3.0%
                     Overall           -6.3%     0.9%        -26.2%   -37.5%   -5.2%
  I10CtCp9 4:2:0     FireEater2        -7.6%     6.6%        -29.9%   10.6%    -10.8%
                     Market3           0.4%      -1.4%       -25.4%   -38.8%   0.3%
                     BalloonFestival   -1.4%     2.6%        -8.7%    -18.1%   -1.5%
                     Overall           -2.8%     2.6%        -21.3%   -15.5%   -4.0%

We observe that the compression efficiency in terms of tPSNR Y and PSNR L1000, which consider luma artifacts, is barely affected, with bitrate changes kept to an average of less than 1%. However, the proposed I10Ct*Cp*8 method clearly outperforms the original 10-bit 4:4:4 YCbCr in terms of DE1000, by an average of 18.6%. By comparing the performance of I10CtCp9 in 4:4:4 format in terms of DE1000 with the performance of our method, we conclude that our method is better than I10CtCp9 by an average of 2.8% (18.6% - 15.8%). Regarding the 4:2:0 case, in terms of tPSNR Y, MD1000 and PSNR L1000, the proposed method outperforms 10-bit YCbCr and I10CtCp9, while it slightly underperforms 10-bit YCbCr in terms of tPSNR XYZ (refer to Table 4-3 for the exact numbers). Additionally, the proposed method outperforms YCbCr and I10CtCp9 in compression efficiency in terms of DE1000, by an average of 26.2% and 5%, respectively. Table 4-4 reports the percentage of pixels with visible color errors in a frame and the average error over a frame, in terms of DE2000, using our method as well as I10CtCp9 and 10-bit YCbCr before compression. Recall that DE2000 values larger than one correspond to color errors visible to the human eye. As can be noted from Table 4-4, the percentage of pixels with visible color errors is reduced more by the proposed method for Market3 in 4:4:4 format, compared to the other two videos. This can be associated with the fact that Market3 has more colors around the white point, whose errors are better corrected by our method. In addition to reducing the number of pixels with visible errors, Table 4-4 shows that the magnitude of the average color error is also reduced more for Market3 in 4:4:4 format by the proposed method, compared to the other test sequences. Figure 4-14 shows a visual example of how color errors in terms of DE2000 are distributed for the first frame of Market3 using 10-bit YCbCr, I10CtCp9, and the proposed method in 4:4:4 format (same error bar as used in Figure 4-8). The above observations explain why our method performs better for Market3 in the 4:4:4 format.
Overall, we can conclude that the proposed method represents colors with more accuracy, without affecting the bitrate in 4:2:0 mode, compared to 10-bit YCbCr and I10CtCp9.

4.2.4 Subjective Evaluation of the Proposed Method

In this section, we subjectively evaluate the effect of our chroma processing scheme on the decoded signal, with emphasis on compression artifacts.

4.2.4.1 Test Methodology and Procedures

The methodology used in this test is similar to the Simultaneous Double Stimulus for Continuous Evaluation (SDSCE) method suggested in BT.500, with the exception that we used a discrete grading scale. Viewers were shown the reference (uncompressed) and test videos at the same time on two identical Sony BVM-X300 reference displays (see 4.2.4.2 for details). The test video set consisted of the same 3 HDR videos used in our previous evaluations. We chose to evaluate the performance of I10CtCp9 and the proposed I10Ct*Cp*8, encoded at 4 QP levels in 4:4:4 and in 4:2:0 formats, resulting in a total of 48 test videos. The 4:2:0 and 4:4:4 tests were run separately, with a short break between them. The videos were presented in a random order. Before the actual test, a training video set was presented to the subjects at five quality levels, including the original video itself, to familiarize them with the test procedure, as suggested in BT.500. The training video set was not part of the video test set. After each reference-test video pair, a neutral grey picture was shown for three seconds so the subjects could register their score.

Prior to the tests, all subjects were screened for color blindness and visual acuity using the Ishihara and Snellen charts, respectively. Subjects that failed the pre-screening did not participate in the test. The same verbal and written instructions were presented to the subjects prior to the test. The conditions of the room in which the tests were conducted were compliant with the suggestions in BT.500. The tests were conducted with one participant at a time. The viewers were seated at a distance of 2.5 meters from the displays, to satisfy the preferred viewing distance suggested in [45]. Subjects were aware of the reference video location and were asked to rate the fidelity of the test videos compared to the reference (uncompressed) videos. Sharpness and color were provided as examples of attributes on which the subjects could base their evaluation of a video's fidelity to the source.
Fidelity scores varied from 1 to 10, with 1 being the farthest from (lowest) and 10 the closest to (best) the quality of the original signal.

4.2.4.2 Displays and Viewers

The tests were conducted on two identical 30-inch Sony BVM-X300 displays placed side by side, showing the original and test videos to the viewers at the same time. The resolution of both displays was set to full HD. Since the content was of HD 1920x1080 resolution, the videos were not cropped and were shown at their original resolution. The displays were set to HDR mode, which corresponds to applying SMPTE ST 2084 and the BT.2020 color matrix to the signal. The peak luminance of the displays in HDR mode was 1000 cd/m². The videos were shown as 10-bit at their frame rates of 50 fps, 25 fps, and 24 fps for Market3, BalloonFestival, and FireEater2, respectively. Since all the videos were originally mastered at 4000 nits, we scaled them down to 1000 nits before displaying, to avoid any clipping by the displays. Eighteen individuals participated in our test, a mix of naïve and expert viewers, with ages ranging from 22 to 30 years.

4.2.4.3 Results and Discussions

After collecting the subjective test results, three outliers were identified according to the BT.500 recommendations and their scores were discarded from the results. The Mean Opinion Score (MOS) for each test video was calculated by averaging the scores over all subjects, with a 95% confidence interval. Figure 4-15 (a) and (b) plot the MOS values of I10CtCp9 and I10Ct*Cp*8 at different bitrates in 4:4:4 and 4:2:0 formats, respectively. As can be observed, I10CtCp9 and I10Ct*Cp*8 were rated almost identically at all bitrate levels, with no statistically significant difference. The performance of the proposed I10Ct*Cp*8 in 4:2:0 is slightly better than that of I10CtCp9, mainly attributed to the better color preservation by our method when subsampling is involved. Additionally, it is worth noting that for Market3 and FireEater2, which contain more colors close to the white point than BalloonFestival, the improved color accuracy is more evident, with viewers rating them closer to the original; the other compression artifacts are identical for both tested methods. In conclusion, our chroma processing scheme does not negatively affect the overall compression performance, while it preserves colors through the quantization and subsampling processes much more accurately than the existing methods.

Figure 4-15 MOS-bitrate comparison of I10CtCp9 and I10Ct*Cp*8 using (a) 4:4:4 and (b) 4:2:0 chroma formats for the tested sequences
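The MOS values and confidence intervals plotted in Figure 4-15 follow the standard computation. A sketch, with SciPy assumed available (here `scores` would hold the ratings of the fifteen retained subjects for one test video):

```python
import numpy as np
from scipy import stats

def mos_with_ci(scores, confidence=0.95):
    """Mean Opinion Score and the half-width of its confidence interval."""
    scores = np.asarray(scores, dtype=float)
    mos = scores.mean()
    # Student-t interval, appropriate for the small number of subjects
    # that remains after BT.500 outlier screening.
    t = stats.t.ppf(0.5 + confidence / 2.0, df=scores.size - 1)
    half_width = t * scores.std(ddof=1) / np.sqrt(scores.size)
    return mos, half_width
```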
4.3 Conclusions

In this chapter, we presented two chroma processing schemes for a more accurate representation of HDR color.

In Section 4.1, we presented a modified CIELAB color encoding scheme for efficiently compressing HDR content. Performance evaluations show that the proposed method, even when using the chroma subsampling designed for YCbCr, maintains the original HDR colors better than other existing methods and results in an average of 41% bit-rate savings over four videos in terms of DE1000 (dB). The performance of the proposed modified color space, even without changing the chroma sub-sampling filters of YCbCr, is almost similar to that of ICtCp. The slight underperformance of the proposed approach in terms of tPSNR can be improved by changing the chroma down-sampling filter to one more tailored to the a* and b* characteristics. Furthermore, changing the RDO process to be performed in the proposed space instead of YCbCr may also result in further performance improvement in terms of tPSNR.

In Section 4.2, we presented a chroma processing scheme that is designed to reduce the color errors generated by the quantization and subsampling processes in the HDR video delivery pipeline. The proposed method uses the de-correlated color space of ICtCp and re-distributes chroma code-words in favor of those colors that our eyes are most sensitive to. Performance evaluations of color perception without the effect of compression showed that the proposed method substantially reduces both the number of pixels with visible errors and the magnitude of the errors, compared to 10-bit YCbCr and I10CtCp9, in terms of DE2000. Subjective evaluation also showed that the visual quality of the proposed redistributed chroma information with reduced bit depth is similar to the original signal. When compression is included, objective evaluation in terms of tPSNR Y and PSNRL1000, which consider luma artifacts, showed that overall the compression efficiency stayed the same. For the same scenario, the proposed method was shown to outperform both YCbCr and I10CtCp9 in compression efficiency by an average of 26.2% and 5%, respectively, in terms of DE1000. Additionally, subjective evaluations of the decoded signal showed that the proposed method did not affect the bitrate, while yielding significant improvements in color accuracy for content with colors closer to the white point.

In our future work, we plan to explore improving the proposed chroma processing by deriving different re-distribution functions for Ct and Cp. Additionally, the chroma subsampling filter we used in the experiments is designed for YCbCr, with a linear distribution of code-words in the chroma channels. Since the proposed chroma processing affects the Ct and Cp code-word distribution, an adaptive chroma subsampling that considers the non-linear characteristic of the new chroma channels could further benefit the proposed method.

Chapter 5: Gamut Mapping for Backward Compatible HDR Video Transmission

In this chapter, we propose two gamut mapping techniques that reduce the visible color errors caused by mapping the wider color gamut of HDR to the limited SDR color gamut, in order to address backward compatibility. The first technique is a hybrid approach for mapping BT.2020 colors to those of BT.709, targeting backward compatible HDR-to-SDR (or, similarly, UHD-to-HD) mapping applications. This hybrid approach is described in Section 5.1. The second technique is specifically proposed for current pipelines that can only transmit BT.709 colors. The proposed technique has the advantage of retrieving BT.2020 colors with fewer errors compared to regular clipping techniques. This approach is described in Section 5.2.

5.1 A Hybrid Approach for Efficient Color Gamut Mapping

Distributing and broadcasting UHD/HDR content with a wider color gamut, such as BT.2020, to HD/SDR TVs with a smaller gamut (e.g., BT.709) requires an adaptation process called gamut mapping. The process of gamut mapping inevitably leads to a loss in the mapped video's color information. Therefore, to ensure an acceptable quality on legacy displays, an efficient gamut mapping process is required before transmission of the content, or at the receiver side. The efficiency of gamut mapping depends on the chosen color space and the projection technique, as shown in [86].
To this end, in this study we propose a hybrid approach that, for each color in the BT.2020 gamut, chooses the combination of color space and projection technique that yields the minimum possible color error.

5.1.1 Proposed Hybrid Gamut Mapping Method

Many color spaces exist with different characteristics, such as perceptual uniformity [71], hue linearity [103], etc. Gamut mapping can be performed in any color space and using different projection techniques with different constraints, such as hue linearity [104] or minimizing the Euclidean distance from the boundary of the gamut [105]. The study in [86] shows that, among the tested color spaces and projection techniques, the combination of the CIELAB color space and the Toward White Point (TWP) projection technique, explained in Section 1.5, results in the least average error. TWP projects out-of-gamut colors to the intersection between the gamut boundary and the line that connects the source color value to the white point (refer to Section 1.5). However, as can be observed in Figure 5-1, for some colors the combination of CIELAB and the Closest projection method (denoted Closest-CIELAB) outperforms TWP-CIELAB. The Closest projection method maps out-of-gamut colors to the point on the gamut boundary that yields the minimum Euclidean distance between the source and mapped color values (refer to Section 1.5). In other words, the best overall gamut mapping technique does not yield the lowest distortion for every possible color.

In our hybrid mapping approach, we map each RGB code value of the BT.2020 gamut, represented with 10 bits, to BT.709, represented with 8 bits, using the combination of color space and projection technique that results in the minimum error. This method is hybrid in the sense that each color code value can potentially use a different combination of color space and projection technique. The color spaces used in this implementation are xyY [106], Yu'v' [107], Yuv, CIELUV, CIELAB, and ICaCb [108]. Note that xyY is not designed with perceptual uniformity in mind, contrary to Yu'v', Yuv, CIELUV, and CIELAB. ICaCb is a color space that focuses on keeping hue lines constant. The projection techniques utilized in our hybrid approach are TWP and the Closest projection (refer to Section 1.5 for details). To select the optimized combination, for each of the 2^30 possible 10-bit RGB code values we compute the DE2000 metric for all color space and projection technique pairs. Note that our method is generic, in the sense that any new color space or projection could be included in our minimization process to achieve even higher gamut mapping accuracy.

Figure 5-1 Visual comparison of the Closest-CIELAB and TWP-CIELAB approaches for gamut mapping from a larger gamut to a smaller one

5.1.2 Results and Discussions

Table 5-1 presents the results of the proposed hybrid gamut mapping method. We selected the TWP and CIELAB pair (denoted TWP-CIELAB) as our point of reference, since it was the combination that resulted in the least average error amongst all the tested combinations in [86]. As can be observed, the mean error has been reduced by 0.36 using the proposed hybrid gamut mapping method. Additionally, the percentage of colors with an error value of less than one has increased. This is an important aspect, since the DE2000 metric returns a value greater than one only if the difference between the two tested colors is noticeable. The increase in the number of colors with error less than one means that more colors are mapped below the visibility threshold.

Table 5-1 Results of the hybrid gamut mapping approach versus TWP-CIELAB

Gamut mapping method       Mean error (DE2000)   Number of pixels with DE2000 error < 1
TWP-CIELAB                 4.46                  340,180,257
Proposed hybrid approach   4.08                  358,361,835
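Conceptually, the hybrid selection of Section 5.1.1 is an exhaustive per-color minimization that can be run offline. In the sketch below, `convert_from_bt2020`, `gamut_map` and `de2000` are assumed helpers standing for the color space conversions, the TWP/Closest projections of Section 1.5, and the CIEDE2000 metric; they are illustrative names, not an existing API:

```python
import itertools

SPACES = ['xyY', "Yu'v'", 'Yuv', 'CIELUV', 'CIELAB', 'ICaCb']
PROJECTIONS = ['TWP', 'Closest']

def best_bt709_mapping(rgb_2020):
    """Return the BT.709 color with the lowest DE2000 for one BT.2020 value."""
    best_err, best_color = None, None
    for space, proj in itertools.product(SPACES, PROJECTIONS):
        # convert_from_bt2020 / gamut_map / de2000 are assumed helpers.
        mapped = gamut_map(space, proj, convert_from_bt2020(space, rgb_2020))
        err = de2000(rgb_2020, mapped)
        if best_err is None or err < best_err:
            best_err, best_color = err, mapped
    return best_color

# Run offline over all 10-bit RGB code values and store the winning 8-bit
# BT.709 triplet per entry: 2**30 entries * 3 bytes gives the 3.2 GB figure
# quoted below for the unsubsampled LUT.
```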
Our method can be computed offline and implemented in a Look-Up Table (LUT). Without any subsampling, the size of this LUT would reach 3.2 GB. Further optimization can be achieved by considering only out-of-gamut values in the LUT. Another possibility for reducing the LUT size is to use an octree-forest approach, such as the one in [109], to subsample the number of coefficients used. Such a LUT could be implemented in set-top boxes or TVs to guarantee that each color is mapped to a color resulting in the lowest perceptual distortion possible. Please note that these optimization methods were not used in the obtained results.

5.2 A Color Gamut Mapping Scheme for Backward Compatible HDR Video Transmission

It is essential for service providers and broadcasters to ensure the quality of service (QoS) for their customers regardless of the display technology used by their clients. For the transmitted video's colors to be interpreted and reproduced properly at the display, the color gamut of the content needs to match that of the display. However, challenges arise when the content has a color gamut different from the one the display can show, for example when an SDR signal with BT.709 gamut is transmitted to an HDR display with BT.2020 gamut.

One possibility is that the display's gamut is a subset of the content color gamut. In this case, the source color gamut needs to be compressed into the smaller destination gamut; e.g., if the content is of BT.2020 gamut and the display is of BT.709 gamut, gamut compression such as the one proposed in Section 5.1 should be applied to the content gamut. Another possibility is that the content gamut is smaller than what the display is capable of reproducing. In this case, if no information about the source gamut is present at the viewers' side, the content is viewed directly. However, if the source gamut is known at the viewers' side (through metadata sent with the bitstream), gamut expansion can be applied to the content so that the viewers benefit from the larger gamut display capabilities [84].

Figure 5-2 depicts the commonly used pipeline for 8-bit SDR HD/SD content delivery, which utilizes the H.264/AVC [110] video coding standard. To support both HD and UHD, or similarly SDR and HDR, one solution is to update the existing pipeline with the recent video compression standard, High Efficiency Video Coding (HEVC), and replace all the existing set-top boxes (STBs) with new ones that have an HEVC decoder and gamut mapping functionality, as shown in Figure 5-3. That imposes an update on the broadcasters' distribution pipeline, which is quite costly and time consuming. The other alternative, during the transition phase from HD/SDR to UHD/HDR, is to preprocess the UHD/HDR content using an invertible gamut mapping scheme that compresses the BT.2020 gamut of the UHD/HDR content to the BT.709 gamut, and to use the existing distribution pipeline to support both customers with HD/SDR displays and subscribers to UHD/HDR services. In this case, new STBs with gamut expansion capability are provided to retrieve the original colors of the source gamut (see Figure 5-4). While this solution is quite cost-effective, the end users' quality of experience (QoE) relies on the performance of the gamut mapping scheme.

Figure 5-2 Current HD/SD distribution pipeline with 8-bit BT.709 support
Figure 5-3 Future distribution pipeline with 10-bit BT.2020 support
An effective gamut mapping would preserve the perceptual characteristics of the mapped signal, such as lightness, hue and chroma, through an adaptation process [111], and thus the viewed signal will have the same contrast and the same overall perceptual attributes [112]. In this section, we propose an invertible method for color gamut mapping from BT.2020 to BT.709 with minimal perceptual error. This method involves compressing the source gamut into the destination gamut and then expanding it back to the original one using a scaling factor. The scaling factor adjusts the trade-off between the perceptual quality of the mapped HD signal and that of the retrieved UHD signal, to ensure the same QoE for all viewers regardless of their devices' color gamut.

Figure 5-4 shows a scenario in which gamut mapping should be invertible. Since set-top boxes and TV displays may not be updated in the near future to support gamut mapping, broadcasters may need to send a mapped signal to viewers and let the ones with displays supporting a larger gamut retrieve the original gamut through an invertible gamut mapping process. In this case, clipping cannot be an appropriate method, as a subset of the out-of-gamut colors has been mapped to a single color, making inversion inaccurate. Compression, however, can be an appropriate mapping technique due to its many-to-many mapping relationship. During inversion, more colors of the original source gamut can be recovered with unnoticeable perceptual error [113]. By using compression for invertible gamut mapping, during inversion some colors that were inside the smaller gamut before mapping will be re-mapped to the outside of the smaller gamut (and inside the larger gamut). Therefore, color errors will inevitably be introduced for the inside-gamut colors. In our proposed method, explained in detail below, we try to strike a balance in the trade-off between the inside- and outside-gamut color errors.

Figure 5-4 Invertible gamut mapping compatible with the current distribution pipeline

5.2.1 Proposed Method

In this section, we propose a method for point-wise gamut mapping from BT.2020 to BT.709 based on the compression projection technique. To indicate how far we compress the larger source gamut into the smaller destination gamut, we introduce a scaling factor, α. We test different values of α (α ∈ [0,1]) to find the one that yields the least color error, both when the color is mapped to the smaller gamut and when it is inverse-mapped to the original gamut. Figure 5-5 shows an example of a gamut area inside BT.709 resulting from a specific scaling factor value.

Figure 5-5 Effect of the scaling factor, α, on the size of the inner gamut inside the BT.709 gamut

Our method employs DE2000 as the color error metric to evaluate the perceptual error in the mapped colors. For the DE2000 metric to work, it should be calculated in a perceptually uniform space, where the color components are de-correlated. For this reason, we use the CIE LC*h* color space to perform the gamut mapping process [114]. This color space is commonly used in traditional gamut mapping [85]. The proposed gamut mapping scheme is based on the TWP projection method described in [86], as this method results in less color error than the Closest method, on average. In addition, by using TWP, the direction in which the pixels have been mapped is known.
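The TWP projection itself reduces to a one-dimensional search along the line toward the white point. Below is a generic sketch with the destination gamut abstracted behind an `in_gamut` predicate; the spherical toy gamut at the end is purely illustrative, whereas in our setting the search would run on CIE LC*h* coordinates and the predicate would test for valid BT.709 values:

```python
import numpy as np

def toward_white_point(color, white, in_gamut, iters=40):
    """Toward White Point (TWP) projection.

    Moves `color` along the straight line toward `white` and returns the
    first point on that line accepted by the destination gamut.
    """
    color = np.asarray(color, dtype=float)
    white = np.asarray(white, dtype=float)
    if in_gamut(color):
        return color                      # inside colors are untouched
    lo, hi = 0.0, 1.0                     # t = 1 is the white point itself
    for _ in range(iters):                # bisect the boundary crossing
        mid = 0.5 * (lo + hi)
        if in_gamut(color + mid * (white - color)):
            hi = mid                      # crossing is at or before mid
        else:
            lo = mid                      # still outside, move toward white
    return color + hi * (white - color)

# Toy check with a spherical "gamut" of radius 1 around the white point:
white = np.zeros(3)
inside = lambda p: np.linalg.norm(p - white) <= 1.0
print(toward_white_point(np.array([2.0, 0.0, 0.0]), white, inside))
# -> approximately [1, 0, 0], the boundary point on the line to white
```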
The proposed method only needs the scaling factor to be sent as metadata, if we assume that the source color gamut and the color space in which the mapping was performed are known. The source color gamut is part of the HDR VUI [115].

The application of the proposed method is shown in Figure 5-4. As can be seen, there are two steps involved in the gamut mapping process. First, the BT.2020 gamut, which is the gamut of the original video content, needs to be reduced to BT.709 before encoding. After decoding the video, the BT.709 gamut needs to be expanded to BT.2020 (inverse gamut mapping) so that the signal can be displayed on a device that supports BT.2020. Therefore, the proposed mapping process will result in two mapped signals: one that is compressed into the BT.709 gamut, and one that is expanded back into the BT.2020 gamut from the compressed BT.709. These two processes are explained in more detail below.

5.2.1.1 Gamut Compression

If an out-of-gamut color has a distance x from the white point, then x_m, the distance of the corresponding mapped color from the white point, is:

$$x_m = \begin{cases} x - (D - d)\,\dfrac{x - d_\alpha}{D - d_\alpha}, & x > d_\alpha \\ x, & \text{otherwise} \end{cases} \qquad (30)$$

where D is the distance from the gamut border of BT.2020 to the white point, d is the distance from the gamut border of BT.709 to the white point, and d_α is the distance from the gamut border of the new smaller gamut inside BT.709 to the white point (which is essentially α × d). Figure 5-6 shows these parameters and their relationship. Please note that, for simplicity, we show only an approximation of the three gamuts in Figure 5-6.

5.2.1.2 Gamut Expansion

If the scaling factor α is known at the viewers' side, the mapped colors can be expanded back to the source color gamut. The distance x of the retrieved color in the source gamut, given the distance x_m of the mapped color inside the destination gamut, is:

$$x = \begin{cases} x_m + (D - d)\,\dfrac{x_m - d_\alpha}{d - d_\alpha}, & x_m > d_\alpha \\ x_m, & \text{otherwise} \end{cases} \qquad (31)$$

Figure 5-6 Relationship between the larger (BT.2020), smaller (BT.709) and inner (scaled by α) gamut distances, and the original and mapped color

This process will cause some of the inside-gamut colors of BT.709 to also go through the inverse mapping process, which will inevitably generate some color distortion for these colors.

5.2.1.3 Bit-depth Considerations

In our proposed method, we represent the BT.2020 gamut with 10 bits, while we use 8 bits for representing the colors of the BT.709 gamut. Therefore, by going from BT.2020 to BT.709 and then back to the BT.2020 color gamut using our method, some colors will be lost through quantization.

5.2.1.4 Results and Discussions

We use the DE2000 metric to evaluate the performance of the proposed color mapping scheme. Table 5-3 presents the results of mapping BT.2020 gamut colors into the BT.709 gamut using different α values between 0 and 1. Note that α = 1 results in a gamut essentially the same as BT.709; hence, we only report values of α below one. It can be observed from the results in Table 5-3 that the smaller the α value is, the larger the mean error and the percentage of colors with an error larger than one become. Recall that an error of less than one in terms of DE2000 means that the two tested colors are perceptually similar, while errors equal to or larger than one indicate that we can see the difference between the colors.
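Equations (30) and (31) translate directly into code. The following sketch treats a single direction from the white point; the values of D, d and α at the end are illustrative, and the assertion illustrates that, before quantization, the expansion exactly undoes the compression:

```python
def compress_distance(x, D, d, alpha):
    """Eq. (30): distance from the white point after gamut compression.

    D and d are the BT.2020 and BT.709 border distances along the color's
    direction; colors inside the inner gamut (distance below alpha * d)
    pass through unchanged.
    """
    d_a = alpha * d
    if x <= d_a:
        return x
    return x - (D - d) * (x - d_a) / (D - d_a)

def expand_distance(x_m, D, d, alpha):
    """Eq. (31): the matching expansion back toward the BT.2020 border."""
    d_a = alpha * d
    if x_m <= d_a:
        return x_m
    return x_m + (D - d) * (x_m - d_a) / (d - d_a)

# The round trip is exact in continuous arithmetic; only the 10-bit to
# 8-bit quantization of Section 5.2.1.3 breaks it. Illustrative values:
D, d, alpha = 1.4, 1.0, 0.5
x = 1.3
assert abs(expand_distance(compress_distance(x, D, d, alpha), D, d, alpha) - x) < 1e-12
```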
It is expected from the results of Table 5-3 that, as α gets closer to one, the error decreases and becomes closer to zero, since the inner gamut gets closer to the BT.709 gamut. However, due to the quantization error resulting from going from 10-bit to 8-bit content, the average error and the percentage of pixels with an error of more than 1 in terms of DE2000 are still relatively large.

Table 5-3 Results of gamut mapping from BT.2020 to BT.709 in terms of the DE2000 metric

Color space   α     Mean error   % error > 1   % of mapped pixels
LC*h*         0.1   5.01         91.4          95.6
              0.2   4.80         83.4          89.7
              0.3   4.69         75.4          82.7
              0.4   4.64         67.6          75.3
              0.5   4.63         60.1          67.6
              0.6   4.66         52.9          59.9
              0.7   4.74         46.0          52.2
              0.8   4.86         39.3          44.6
              0.9   5.05         32.9          37.1

Table 5-2 presents the results of the invertible gamut mapping when the signal is mapped back to the original BT.2020 gamut. It can be seen from these results that the parameter α has a large impact on the perceptual color error. If α is small, more out-of-gamut colors can be mapped to inside colors, while the others are mapped to the gamut border. Therefore, inside-mapped colors can be retrieved back to the outside colors again with lower error. For instance, in the case of α = 0.1, 95.6% of the colors are mapped inside and hence the mean error is low, with most of the colors having no perceptible error (DE2000 < 1). However, even in the case of α being as small as 0.1, there is still the quantization error, as the 10-bit colors are mapped to 8 bits and then retrieved to 10-bit colors again during the inverse process.

Table 5-2 Results of the invertible gamut mapping from the resulting BT.709 back to BT.2020 in terms of the DE2000 metric

Color space   α     Mean error   % error > 1   % of mapped pixels
LC*h*         0.1   0.08         0.194         95.6
              0.2   0.08         0.202         89.7
              0.3   0.08         0.240         82.7
              0.4   0.08         0.265         75.3
              0.5   0.08         0.285         67.6
              0.6   0.08         0.323         59.9
              0.7   0.09         0.383         52.2
              0.8   0.10         0.663         44.6
              0.9   0.14         0.773         37.1

By comparing the results of Table 5-3 and Table 5-2, it seems that there is no single value of α that results in the least color error both for mapping BT.2020 to BT.709 and for mapping BT.709 back to BT.2020. However, to find an appropriate α for each video content, an analysis of its color distribution can be performed. If most of the colors of the content lie outside of the BT.709 gamut, then one can choose a smaller α, or in other words a larger distance between the BT.709 gamut and the new inner gamut. Similarly, if most of the colors of the content lie inside the gamut of BT.709, then a larger α may be a more appropriate choice, so that colors inside the BT.709 gamut are preserved more. This will be part of our future work on this topic.
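The trade-off behind Tables 5-3 and 5-2 can be reproduced qualitatively with a one-dimensional experiment that reuses `compress_distance` and `expand_distance` from the sketch above; D = 1.4 and d = 1 are assumed stand-ins for the border distances along one direction:

```python
import numpy as np

D, d, BITS = 1.4, 1.0, 8
x = np.linspace(0.0, D, 4097)     # sampled distances from the white point

for alpha in (0.1, 0.5, 0.9):
    xm = np.array([compress_distance(v, D, d, alpha) for v in x])
    # 8-bit quantization of the compressed signal on [0, d].
    q = np.round(xm / d * (2**BITS - 1)) / (2**BITS - 1) * d
    xr = np.array([expand_distance(v, D, d, alpha) for v in q])
    print(f"alpha={alpha}: mean forward shift {np.abs(xm - x).mean():.4f}, "
          f"mean round-trip error {np.abs(xr - x).mean():.5f}")
```

Smaller α shifts more colors further during compression (a larger forward error), while larger α steepens the expansion slope and thus amplifies the 8-bit quantization error in the round trip, mirroring the trends of the two tables.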
5.3 Conclusions

In Section 5.1 of this chapter, we proposed a hybrid gamut mapping technique to convert BT.2020 color code values to the BT.709 gamut. A specific application of this method is to adapt UHD/HDR content to HD/SDR television systems and set-top boxes. The results showed that our method reduces the overall error introduced by the mandatory gamut conversion. Our method is practical and easy to use in set-top boxes, since it can be implemented as a subsampled 3D-LUT. If a new color space, projection technique or color metric is designed, the created LUT would only need to be updated to improve its performance.

Furthermore, in Section 5.2 we proposed an invertible method for color gamut mapping from BT.2020 to BT.709 and then back to BT.2020. One specific application of this method is the transmission of UHD and/or HDR video content to viewers with both BT.709- and BT.2020-capable displays. A scaling factor is introduced that controls the mapping process: the lower the scaling factor, the larger the distance between an original out-of-gamut color and its in-gamut mapped color. Prior knowledge of the distribution of the content pixels would lead to an "optimum" selection of the scaling factor. If most of the colors lie outside of the BT.709 gamut, then a smaller scaling factor can be chosen; otherwise, a larger scaling factor is more appropriate.

Chapter 6: Conclusions and Future Work

6.1 Summary of the Contributions

In this thesis, we addressed the color accuracy of transmitted HDR video signals, improving the overall perceptual quality, and hence the quality of experience (QoE), without increasing the compression bitrate. We also improved the color accuracy of HDR content when viewed on Standard Dynamic Range (SDR) displays, successfully addressing the backward compatibility requirement.

In Chapter 2, we presented the first ever investigation of delivering HDR video content using the existing pipeline infrastructures, identifying the most efficient one in terms of required bitrate and subjective quality for both HDR and SDR displays. We showed that the HDR10 single-layer approach is the most favorable and efficient choice for backward compatible HDR transmission. Additionally, we provided guidelines on how chroma channels can achieve better overall HDR perceptual quality. In Chapter 3, we investigated how bit-depth quantization, when tied with HDR perceptual transfer functions (PTFs) and HDR color representations suitable for compression, affects the color accuracy of HDR color pixels. Using the findings of Chapters 2 and 3, we designed two processing techniques for HDR color pixels in Chapter 4 that significantly increase the color accuracy of the existing most accurate representations of HDR colors, both objectively and subjectively. The proposed methods in Chapter 4 are designed for HDR video delivery applications, and they are shown not to increase the compression bitrate. Furthermore, in Chapter 5 we presented two post-processing color mapping techniques for addressing the compatibility of delivered HDR videos with SDR displays and pipelines. It was shown that the proposed color mapping methods result in better preservation of the original HDR colors, and hence improved quality of experience, compared to the existing methods.

6.2 Significance and Potential Applications of the Research

The research presented in this thesis aims at improving the visual quality of delivered HDR video content. Unlike the SDR delivery pipeline, which has been verified and matured in terms of required bitrate and visual quality, HDR delivery pipelines are untested and can be implemented in multiple ways. Furthermore, a constraint for a practical HDR delivery pipeline is its backward compatibility with SDR displays, as the installed base of the latter technology is very large and destined to coexist with the emerging HDR technology for the foreseeable future. All these challenges, which are the result of the introduction of HDR technology to the consumer market, are answered in depth in Chapter 2 through a comprehensive study on the subjective quality of the delivered content on both SDR and HDR displays.
The in-depth analysis offered in Chapter 2 of how the different processing steps of each pipeline (such as tone mapping operators, chroma subsampling and display adaptation) contribute to the final visual quality and required bitrate is an extremely helpful resource for service providers and broadcasters to design infrastructures that ensure the quality of service (QoS) for their customers and clients.

One distinctive characteristic of HDR is its wider color gamut. While with SDR the color distortions caused by perceptual transfer functions, color encoding and bit-depth quantization are kept below the visibility threshold, these distortions become visible with the introduction of the HDR wider color range. All studies so far investigated these color errors on HDR videos and captured content, and hence failed to isolate problematic colors with visible color changes from the ones that result in invisible color changes.

In Chapter 3, for the first time, we identify all the HDR colors that are affected by the perceptual transfer function, the choice of color encoding and bit-depth quantization in a way that results in a visible color change. The presented results are useful for broadcasters in choosing a perceptual transfer function and color encoding that preserve the colors more accurately. They also provide insight into whether increasing the bit-depth can benefit the visual color perception of the delivered HDR videos.

Building on the findings of Chapter 3, and by considering some characteristics of the Human Visual System (HVS), in Chapter 4 we design processing methods that significantly improve the color accuracy of the existing methods. The proposed methods can be implemented in set-top boxes and TV displays so that the delivered color pixels can be decoded and displayed with more accuracy and a more appealing final quality. The proposed methods are not computationally expensive and do not introduce an overhead on the content preparation or at the display (viewer) side. Furthermore, in Chapter 4 we study the use of a perceptually uniform color space for HDR video delivery and propose modifications that result in a better representation of HDR colors. The developed color encoding with the proposed adjustments showed better performance in terms of maintaining HDR color information, while not affecting the bitrate. Set-top boxes and TV displays can easily be updated by simply incorporating a conversion matrix that transforms the original color space to the proposed one.

To address the backward compatibility of HDR content with SDR displays, in Section 5.1 we propose a color gamut mapping technique that outperforms all existing methods in preserving the original colors in terms of a color difference predictor metric. The mapping process can be performed offline and implemented in a Look-Up Table (LUT). Such a LUT could be deployed in set-top boxes or TV sets to guarantee that each HDR color is mapped to an SDR color with the lowest perceptual distortion possible. In Section 5.2, a reversible gamut mapping scheme is proposed and studied. One specific application is the transmission of UHD and/or HDR video content using an SDR pipeline to viewers with displays that support BT.2020. The proposed method incorporates a scaling factor in the form of metadata and can easily be implemented in HDR-capable set-top boxes, significantly improving the accuracy of colors expanded at the display.
6.3 Future Work

Regarding the proposed method in Section 4.1, as the rate-distortion optimization (RDO) process inside the encoder is customized for the characteristics of the YCbCr color encoding, future work would include modifying the RDO process according to the proposed modified CIELAB color encoding. Furthermore, the chroma subsampling filters used in our experiments were the ones originally designed for the YCbCr color encoding. By using a chroma down-sampling filter that is designed for the proposed color space, it is expected that the performance of the proposed method will further improve in terms of compression efficiency. We will keep the chroma up-sampling filter intact so as not to impose costly decoder upgrades.

Regarding the proposed method in Section 4.2, in future work we plan to explore improving the proposed chroma processing by deriving different re-distribution functions for Ct and Cp. Additionally, the chroma subsampling filter we used in the experiments is designed for YCbCr, with a linear distribution of code-words in the chroma channels. Since the proposed chroma processing affects the Ct and Cp code-word distribution, an adaptive chroma subsampling that considers the non-linear characteristic of the new chroma channels could further benefit the proposed method.

In Section 5.1, as mentioned, the proposed method can be implemented in an LUT. Without any subsampling, the size of this LUT would reach 3.2 GB. Our future work includes further reduction of the LUT's size by considering only out-of-gamut values in the table. Another possibility for reducing the LUT size is to use an octree-forest approach to subsample the table entries.

The method proposed in Section 5.2 can be further improved by prior knowledge of the distribution of the content pixels, as this information is helpful in leading to an "optimum" selection of the scaling factor. If most of the colors fall outside the BT.709 gamut, then a smaller scaling factor can be chosen; otherwise, a larger scaling factor is more appropriate. Another possible direction for identifying the optimum scaling value is to use machine learning techniques, so that the optimum value can be learnt through an extensive training phase using DE2000 as the error metric.

Bibliography

[1] E. Reinhard, G. Ward, S. Pattanaik, P. Debevec, High Dynamic Range Imaging: Acquisition, Display and Image-Based Lighting. Morgan Kaufmann, 2010.
[2] J. Munkberg, P. Clarberg, J. Hasselgren, and T. Akenine-Möller, "High dynamic range texture compression for graphics hardware," ACM Transactions on Graphics, vol. 25, no. 3, p. 698, Jul. 2006.
[3] R. Boitard, M. T. Pourazad, and P. Nasiopoulos, "High Dynamic Range versus Standard Dynamic Range Compression Efficiency," Proceedings of DMIAF, Santorini, pp. 1-5, 2016.
[4] N. Bonnier, F. Schmitt, M. Hull, and C. Leynadier, "Spatial and color adaptive gamut mapping: A mathematical framework and two new algorithms," Color and Imaging Conference, vol. 2007, no. 1, pp. 267-272, 2007.
[5] G. H. Joblove, "A Tutorial on Photometric Dimensions and Units," SMPTE Motion Imaging Journal, vol. 124, no. 5, pp. 48–55, Jul. 2015.
[6] H. Seetzen et al., "High dynamic range display systems," ACM Transactions on Graphics, vol. 23, no. 3, pp. 760-768, Aug. 2004.
[7] P. E. Debevec, J. Malik, "Recovering high dynamic range radiance maps from photographs," Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '97), pp. 369-378, 1997.
[8] A. Darmont, High Dynamic Range Imaging:
Sensors and Architectures. Bellingham, WA, USA: SPIE Press, 2012.
[9] M. D. Tocci, C. Kiser, N. Tocci, P. Sen, "A versatile HDR video production system," ACM Transactions on Graphics, vol. 30, no. 4, pp. 41:1-41:10, July 2011.
[10] G. Ward-Larson and R. A. Shakespeare, Rendering with Radiance. San Francisco: Morgan Kaufmann, 1998.
[11] G. Ward, "Real pixels," Graphics Gems II, pp. 80-83, 1991.
[12] D. Hough, "Applications of the proposed IEEE 754 standard for floating-point arithmetic," Computer Journal, vol. 14, no. 3, pp. 70-74, 1981.
[13] Industrial Light and Magic, OpenEXR, 2008. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1667288
[14] C. A. Poynton, Digital Video and HD: Algorithms and Interfaces. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2003.
[15] A. Roberts, "Measurement of display transfer characteristic (gamma, γ)," EBU Technical Review 257, pp. 32-40, 1993.
[16] T. Olson, "Behind gamma's disguise," SMPTE Journal, vol. 104, no. 7, pp. 452–458, July 1995.
[17] C. A. Poynton, A Technical Introduction to Digital Video. New York: John Wiley & Sons, 1996.
[18] Reference electro-optical transfer function for flat panel displays used in HDTV studio production, ITU-R BT.1886, 2011.
[19] G. W. Larson, "LogLuv encoding for full-gamut, high-dynamic range images," Journal of Graphics Tools, vol. 3, no. 1, pp. 15-31, 1998.
[20] Adobe, TIFF 6.0 specification, 1992. [Online]. Available: http://partners.adobe.com/asn/tech/tiff/specification.jsp
[21] D. Touzé et al., "HDR video coding based on local LDR quantization," HDRi2014 - Second International Conference and SME Workshop on HDR Imaging, 2014.
[22] T. Borer, A. Cotton, "Display Independent High Dynamic Range Television System," International Broadcasting Convention, 2015.
[23] Essential Parameter Values for the Extended Image Dynamic Range Television (EIDRTV) System for Programme Production, ARIB STD-B67, 2015.
[24] J. Hoekstra, D. P. J. van der Goot, G. van den Brink and F. A. Bilsen, "The influence of the number of cycles upon the visual contrast threshold for spatial sine wave pattern," Vision Research, vol. 14, no. 6, pp. 365-368, June 1974.
[25] P. G. J. Barten, "Physical model for the contrast sensitivity of the human eye," Proceedings of SPIE 1666, Human Vision, Visual Processing, and Digital Display III, pp. 57-72, 1992.
[26] J. G. Robson, "Spatial and temporal contrast sensitivity functions of the visual system," Journal of the Optical Society of America, vol. 56, no. 8, pp. 1141-1142, 1966.
[27] J. J. DePalma and E. M. Lowry, "Sine-wave response of the visual system. II. Sine-wave and square-wave sensitivity," Journal of the Optical Society of America, vol. 52, no. 3, pp. 328-335, March 1962.
[28] S. Miller, M. Nezamabadi, and S. Daly, "Perceptual signal coding for more efficient usage of bit codes," SMPTE Motion Imaging Journal, vol. 122, no. 4, pp. 52-59, May 2013.
[29] High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays, SMPTE Standard ST 2084, 2014.
[30] R. Mantiuk, G. Krawczyk, K. Myszkowski, H. Seidel, "Perception-motivated High Dynamic Range Video Encoding," Proceedings of the 31st Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '04) (special issue of ACM Transactions on Graphics), 2004.
[31] C. Poynton, J. Stessen and R. Nijland, "Deploying Wide Color Gamut and High Dynamic Range in HD and UHD," SMPTE Motion Imaging Journal, vol. 124, no. 3, pp. 37-49, April 2015.
[32] Worldwide unified colorimetry and related characteristics of future television and imaging systems, ITU-R BT.1361, 1998.
[33] Image parameter values for high dynamic range television for use in production and international programme exchange, ITU-R BT.2100-0, 2016.
[34] Parameter values for the HDTV standards for production and international program exchange, ITU-R BT.709-3, 1998.
[35] D-cinema quality - Reference projector and environment, Proc. Society of Motion Picture & Television Engineers Reference Projector and Environment (SMPTE RP), vol. 10607, no. 914, 2011.
[36] Parameter values for ultra-high definition television systems for production and international programme exchange, ITU-R BT.2020, 2012.
[37] M. Azimi, A. Banitalebi-Dehkordi, Y. Dong, M. T. Pourazad, P. Nasiopoulos, "Evaluating the performance of existing full-reference quality metrics on high dynamic range (HDR) video content," Proceedings of ICMSP 2014, International Conference on Multimedia Signal Processing, p. 811, 2014.
[38] A. Luthra, E. Francois, and W. Husak, "Call for Evidence (CfE) for HDR and WCG video coding," ISO/IEC JTC1/SC29/WG11 N15083, Feb. 2015.
[39] CIE, "Commission internationale de l'éclairage proceedings, 1931," 1932.
[40] T. O. Aydin, R. Mantiuk, and H. P. Seidel, "Extending quality metrics to full luminance range images," SPIE, Human Vision and Electronic Imaging XIII, San Jose, USA, Jan. 2008.
[41] M. R. Luo, G. Cui, and B. Rigg, "The development of the CIE 2000 colour-difference formula: CIEDE2000," Color Research and Application, vol. 26, no. 5, pp. 340-350, 2001.
[42] W. S. Mokrzycki and M. Tatol, "Colour difference ∆E - A survey," Machine Graphics and Vision, vol. 20, no. 4, pp. 383-411, 2011.
[43] S. J. Daly, "Visible differences predictor: an algorithm for the assessment of image fidelity," Human Vision, Visual Processing, and Digital Display III, Proceedings of SPIE 1666, 2, Aug. 1992.
[44] R. Mantiuk, K. J. Kim, A. G. Rempel, and W. Heidrich, "HDR-VDP-2: a calibrated visual metric for visibility and quality predictions in all luminance conditions," ACM Transactions on Graphics, Proceedings of SIGGRAPH '11, vol. 30, no. 4, Article 40, 14 pages, July 2011.
[45] Methodology for the subjective assessment of the quality of television pictures, BT series, broadcasting service, ITU-R BT.500-13, 2012.
[46] ISO/IEC JTC 1/SC 29/WG 11 (MPEG). [Online]. Available: http://wg11.sc29.org/
[47] M. T. Pourazad, C. Doutre, M. Azimi and P. Nasiopoulos, "HEVC: the new gold standard for video compression: How does HEVC compare with H.264/AVC?" IEEE Consumer Electronics Magazine, vol. 1, no. 3, pp. 36-46, July 2012.
[48] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[49] M. Azimi et al., "Compression efficiency of HDR/LDR content," 2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX), Pylos-Nestoras, pp. 1-6, 2015.
[50] Dynamic Metadata for Color Volume Transformation, SMPTE Standard ST 2094, 2017.
[51] Signaling, backward compatibility and display adaptation for HDR/WCG video coding, ITU-T Series H Supplement 18, 2017.
[52] D. Le Gall, A. Tourapis, M. Raulet, et al., "High dynamic range with HEVC Main10," JCTVC-U0045, Warsaw, Poland, Jun. 2015.
[53] Blu-ray Disc Association, "Ultra HD Blu-ray Video Parameters Liaison Information," Doc. m36740, Warsaw, Poland, Jun. 2015.
[54] S. Cvetkovic, J.
Klijn, and P. With, "Tone-mapping functions and multiple-exposure techniques for high dynamic-range images," IEEE Transactions on Consumer Electronics, vol. 54, no. 2, pp. 904-911, May 2008.
[55] Z. Mai, H. Mansour, R. Mantiuk, P. Nasiopoulos, R. Ward, and W. Heidrich, "Optimizing a tone curve for backward-compatible high dynamic range image and video compression," IEEE Transactions on Image Processing, vol. 20, no. 6, pp. 1558-1571, June 2011.
[56] J. Petit and R. K. Mantiuk, "Assessment of video tone-mapping: Are cameras S-shaped tone-curves good enough?" Journal of Visual Communication and Image Representation, vol. 24, no. 7, pp. 1020-1030, 2013.
[57] E. Reinhard, M. Stark, P. Shirley and J. Ferwerda, "Photographic tone reproduction for digital images," ACM Transactions on Graphics, vol. 21, no. 3, pp. 267-276, July 2002.
[58] A. O. Akyuz, R. Fleming, B. E. Riecke, E. Reinhard, and H. H. Bulthoff, "Do HDR displays support LDR content?" ACM Transactions on Graphics, vol. 26, p. 38, 2007.
[59] F. Banterle, P. Ledda, K. Debattista, and A. Chalmers, "Expanding low dynamic range videos for high dynamic range applications," Proceedings of the 24th Spring Conference on Computer Graphics, pp. 33-41, 2008.
[60] C. Bist, R. Cozot, G. Madec, and X. Ducloux, "Tone expansion using lighting style aesthetics," Computers and Graphics, vol. 62, pp. 77-86, 2017.
[61] P. Didyk, R. Mantiuk, M. Hein, and H.-P. Seidel, "Enhancement of bright video features for HDR displays," Computer Graphics Forum (Proceedings of Eurographics Symposium on Rendering 2008, Sarajevo, Bosnia and Herzegovina), vol. 27, no. 4, pp. 1265-1274, 2008.
[62] A. Artusi, F. Banterle, T. O. Aydın, D. Panozzo, and O. Sorkine-Hornung, Image Content Retargeting: Maintaining Color, Tone, and Spatial Consistency. Natick, MA, USA: A K Peters Ltd., 2016.
[63] J. Laird, R. Muijs, and J. Kuang, "Development and evaluation of gamut extension algorithms," Color Research and Application, vol. 34, pp. 443-451, 2009.
[64] R. M. Boynton, Human Color Vision. New York: Holt, Rinehart and Winston, 1979.
[65] D. L. Ruderman, T. W. Cronin, and C. C. Chiao, "Statistics of cone responses to natural images: implications for visual coding," Journal of the Optical Society of America, vol. 15, pp. 2036-2045, 1998.
[66] E. Lubbe, Colors in the Mind – Color systems in Reality, Books on Demand: USA, Feb. 2010.
[67] J. Benesty et al., Noise reduction in speech processing, Springer, Berlin, Heidelberg, 2009.
[68] E. Francois, "MPEG HDR AhG: about using a bt.2020 container for bt.709 content (not public)," m35255, 110th MPEG meeting in Strasbourg, Oct. 2014.
[69] S. Y. Choi, K. J. Oh, D. S. Park, and D. K. Nam, "Constant vs. Non-constant Luminance Video Signals for UHDTV," SID Symposium Digest of Technical Papers, vol. 44, no. 1, p. 26, 2013.
[70] F. Xie, Improving non-constant luminance color encoding efficiency for high dynamic range video applications. University of British Columbia, 2017.
[71] M. Safdar, G. Cui, Y. Kim, and M. Luo, "Perceptually uniform color space for image signals including high dynamic range and wide gamut," Optics Express, vol. 25, pp. 15131-15151, 2017.
[72] C. Poynton, J. Stessen and R. Nijland, "Deploying Wide Color Gamut and High Dynamic Range in HD and UHD," SMPTE Motion Imaging Journal, vol. 124, no. 3, pp. 37-49, April 2015.
[73] Y'D'ZD'X Color-difference Computations for High Dynamic Range X'Y'Z' Signals, SMPTE Standard ST 2085, 2015.
[74] T.
Lu et al., "ITP Colour Space and Its Compression Performance for High Dynamic Range and Wide Colour Gamut Video Distribution," ZTE Communications, vol. 14, no. 1, Feb. 2016.
[75] J. Samuelsson, M. Pettersson, J. Strom, K. Andersson, "Using chroma QP offset on HDR sequences," m36581, 111th MPEG meeting, Jun. 2015.
[76] J. Ström and P. Wennersten, "Chroma adjustment for HDR video," 2017 IEEE International Conference on Image Processing (ICIP), Beijing, pp. 11-15, 2017.
[77] R. Boitard, M. T. Pourazad and P. Nasiopoulos, "Chroma scaling for high dynamic range video compression," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, pp. 1387-1391, 2016.
[78] S. Mahmalat, T. O. Aydın and A. Smolic, "Pipelines for HDR Video Coding Based on Luminance Independent Chromaticity Preprocessing," IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 12, pp. 3467-3477, Dec. 2018.
[79] S. Mahmalat, N. Stefanoski, D. Luginbühl, T. O. Aydın and A. Smolic, "Luminance independent chromaticity preprocessing for HDR video coding," 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, pp. 1389-1393, 2016.
[80] F. Pu, T. Lu, P. Yin, T. Chen, W. Husak, "Comments on Reshaping for HDR/WCG compression," m37267, 111th MPEG meeting, Oct. 2015.
[81] J. Ström, J. Samuelsson, and K. Dovstam, "Luma Adjustment for High Dynamic Range Video," Proceedings of the Data Compression Conference (DCC), pp. 319-328, Mar. 2016.
[82] A. Norkin, "Fast algorithm for HDR video pre-processing," 2016 Picture Coding Symposium (PCS), Nuremberg, pp. 1-5, 2016.
[83] R. Boitard, R. K. Mantiuk, and T. Pouli, "Evaluation of color encodings for high dynamic range pixels," Proc. SPIE 9394, Human Vision and Electronic Imaging XX, 2015.
[84] R. Boitard, M. T. Pourazad, P. Nasiopoulos, and J. Slevinsky, "Demystifying High-Dynamic-Range technology: A new evolution in digital media," IEEE Consumer Electronics Magazine, vol. 4, no. 4, pp. 72-86, Oct. 2015.
[85] J. Morovič, Color gamut mapping, John Wiley and Sons Ltd., 2008.
[86] T. Bronner, R. Boitard, M. T. Pourazad, P. Nasiopoulos, and T. Ebrahimi, "Evaluation of Color Mapping Algorithms in Different Color Spaces," Applications of Digital Image Processing XXXIX, San Diego, USA, August 2016.
[87] Ultra HD Forum Draft: Ultra HD Forum Guidelines, April 2016.
[88] M. Azimi et al., "Compression efficiency of HDR/LDR content," Proc. QoMEX, Costa Navarino, pp. 1-6, 2015.
[89] V. Baroncini and P. Topiwala, "An expert subjective evaluation of a reshaper model vs. HDR10 for HDR coding," Proceedings of DMIAF, Santorini, pp. 48-52, 2016.
[90] S. Cvetkovic, J. Klijn, and P. With, "Tone-mapping functions and multiple-exposure techniques for high dynamic-range images," IEEE Transactions on Consumer Electronics, vol. 54, no. 2, pp. 904-911, May 2008.
[91] D. Rusanovskyy, D. B. Sansli, A. Ramasubramonian, S. Lee, J. Sole, and M. Karczewicz, "High dynamic range video coding with backward compatibility," Proceedings of DCC, Snowbird, UT, pp. 289-298, 2016.
[92] J. Praeter, A. Díaz-Honrubia, T. Paridaens, G. Wallendael, and P. Lambert, "Simultaneous encoder for high-dynamic-range and low-dynamic-range video," IEEE Transactions on Consumer Electronics, vol. 62, no. 4, pp. 420-428, Nov. 2016.
[93] R. Boitard, D. Thoreau, R. Cozot and K. Bouatouch, "Impact of temporal coherence-based tone mapping on video compression," Proceedings of EUSIPCO, Marrakech, pp. 1-5, 2013.
[94] HEVC reference software. [Online]. Available: https://hevc.hhi.fraunhofer.de/
[95] R. Boitard, M.
T. Pourazad and P. Nasiopoulos, "Evaluation of chroma subsampling for high dynamic range video compression," JCTVC-W0105, Feb. 2016.
[96] R. Boitard, M. T. Pourazad and P. Nasiopoulos, "Chroma scaling for high dynamic range video compression," Proceedings of IEEE ICASSP, Shanghai, pp. 1387-1391, 2016.
[97] Z. Farbman, R. Fattal, D. Lischinski and R. Szeliski, "Edge-preserving decompositions for multiscale tone and detail manipulation," ACM Transactions on Graphics, vol. 27, no. 3, pp. 1-10, Aug. 2008.
[98] Scratch Player: http://www.assimilateinc.com
[99] M. D. Fairchild, Color Appearance Models, 3rd Edition. Wiley, 2013.
[100] HDRTools package. ITU-T and ISO/IEC. [Online]. Available: https://gitlab.com/standards/HDRTools
[101] A. M. Tourapis and D. Singer, "HDRTools: Software updates," in ISO/IEC JTC1/SC29/WG11 MPEG2015/M35471, IEEE, Ed., Geneva, Switzerland, 2015.
[102] G. Bjøntegaard, "Calculation of Average PSNR Differences between RD-Curves," document VCEG-M33, 2001.
[103] F. Ebner and M. D. Fairchild, "Finding Constant Hue Surfaces in Color Space," Proceedings of SPIE, Color Imaging: Device-Independent Color, Color Hardcopy, and Graphic Arts III, 3300-16, pp. 107-117, 1998.
[104] G. J. Braun, M. D. Fairchild, and F. Ebner, "Color gamut mapping in a hue-linearized CIELAB color space," Color and Imaging Conference, vol. 1998, no. 1, pp. 163-168, Society for Imaging Science and Technology, 1998.
[105] J. Sara, "The automated reproduction of pictures with non-reproducible colors," Ph.D. dissertation, MIT, 1984.
[106] T. Smith and J. Guild, "The C.I.E. colorimetric standards and their use," Transactions of the Optical Society, vol. 33, no. 3, p. 73, 1931.
[107] A. R. Robertson, "The CIE 1976 color-difference formulae," Color Research and Application, vol. 2, no. 1, pp. 7-11, 1977.
[108] J. Froehlich et al., "Encoding color difference signals for high dynamic range and wide gamut imagery," Color and Imaging Conference, vol. 2015, no. 1, Society for Imaging Science and Technology, 2015.
[109] J. Liu, N. Stefanoski, O. Aydın, A. Grundhofer, and A. Smolic, "Chromatic Calibration of an HDR Display Using 3D Octree Forests," IEEE International Conference on Image Processing 2015, 2015.
[110] T. Wiegand, G. J. Sullivan, G. Bjontegaard and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, 2003.
[111] P. Bordes, P. Andrivon, X. Li, Y. Ye, and Y. He, "Overview of Color Gamut Scalability," IEEE Transactions on Circuits and Systems for Video Technology, vol. PP, no. 99, pp. 1-1, 2016.
[112] L. W. MacDonald, "Gamut mapping in perceptual colour space," Color and Imaging Conference, vol. 1993, no. 1, pp. 193-196, Society for Imaging Science and Technology, 1993.
[113] T. P. Sebastien Lasserre, and E. Francois, "Gamut mapping SEI for backward compatibility," MPEG, no. 37368.
[114] CIE (1978) Recommendations on uniform color spaces, color difference equations, psychometric color terms, Supplement 2 to CIE publication 15 (E1.3.1) 1971/(TC1.3). Central Bureau of the Commission Internationale de l'Éclairage (Vienna, Austria).
[115] Mastering display color volume metadata supporting high luminance and wide color gamut images, SMPTE Standard ST 2086, 2014.
