Merging Multiple Light Fields

by Changching Chiu
B.Sc., University of California, Los Angeles, 1995

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF
Master of Science
in
THE FACULTY OF GRADUATE STUDIES
Department of Computer Science

We accept this thesis as conforming to the required standard

The University of British Columbia
October 1998
© Changching Chiu, 1998

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Computer Science
The University of British Columbia
2366 Main Mall
Vancouver, BC Canada V6T 1Z4

ABSTRACT

The light field provides an alternative to describing objects by geometric models. It shares an advantage of geometric models, namely that new views of an object can be generated at run time. The time required to generate a new view is independent of the scene complexity. It lacks, however, the flexibility of geometric models to compose complex scenes. This thesis proposes a method of merging multiple light field objects into a more complex light field. An approximate volume is first reconstructed from the light field. The level of detail can be controlled in the reconstruction process. A test ray is sent into the scene to determine visibility based on the extracted volume. In the merging process, affine and other transformations can be applied to individual objects. The method computes visibility correctly when there is sufficient level of detail in the volume. This thesis shows that light fields can be composed and transformed to increase their usefulness.

CONTENTS

Abstract ii
Contents iii
List Of Figures v
Acknowledgments vii
Chapter 1 — Introduction 1
1.1 Light Field 3
1.2 Thesis Objective 4
Chapter 2 — Background 6
2.1 Two-Plane Parameterization 6
2.2 Sampling Density And Pattern 7
2.3 Multiple Slabs 9
Chapter 3 — Methodology 11
3.1 Overview 11
3.2 Reconstructing Volume 12
3.2.1 Octree Representation 12
3.2.2 Node Type Test 13
3.2.3 Trimming Octree 14
3.2.4 Summary 15
3.3 Interacting With Objects 16
3.3.1 Visualizing Octree 16
3.3.2 Position and Orientation 16
3.4 Merging Light Fields 17
3.4.1 Determining The Closest Object 17
3.4.2 Finding The Corresponding Slab 18
3.4.3 Refining Volume 18
Chapter 4 — Results 19
4.1 Volume Reconstruction 19
4.1.1 Algorithmic Limitations 24
4.1.2 Insufficient Data 25
4.1.3 Workaround 26
4.2 Point Cloud Visualization 26
4.3 Merged Light Field 27
4.3.1 Shortcoming 29
4.4 CSG-Like Operation 31
4.5 System Resources 35
Chapter 5 — Conclusions And Future Work 36
5.1 Summary and Conclusions 36
5.2 Future Work 37
5.2.1 Dynamic and Incremental Reconstruction 37
5.2.2 Better Geometry Representation 38
5.2.3 More Efficient Representation 38
5.2.4 Free-Form Deformation 39
Appendix — Active Measurement (ACME) Facility 41
References 44

LIST OF FIGURES

Figure 1: Data flow pipeline of LightPack software. 5
Figure 2: Data flow pipeline of LightPack with thesis implementation. 5
Figure 3: Sampling pattern of test objects at 32x32 positions. 8
Figure 4: Sampling pattern at 8x8 positions.
9 Figure 5: Off-centred slabs need to be larger to cover the same shaded area. 10 Figure 6: The merging process. 12 Figure 7: A slice of the light field used as input for the reconstruction algorithm. 20 Figure 8: Four views of the cylinder volume reconstructed to level 4. 21 Figure 9: Four views of the cylinder volume reconstructed to level 5. 22 Figure 10: A slice of the light field for the rod. 23 Figure 11: Four views of the rod volume reconstructed to level 4. 24 Figure 12: Example of insufficient data. 26 Figure 13: Point cloud visualization of the cylinder with hole. 27 Figure 14: Three slices of the merged light fields. 28 Figure 15: Screen capture oilijview showing merged objects. 29 Figure 16: A slice of merged light fields using level 4 volume. 30 Figure 17: A slice of merged light field using level 5 volume. 31 v Figure 18: Primitive light fields used to compose the desk lamp. 32 Figure 19: Views of a desk lamp merged from basic primitives. 33 Figure 20: Stanford Dragon in a marble container. 34 Figure 21: Gantry of the ACME facility. 42 Figure 22: Components of the ACME facility. 43 vi ACKNOWLEDGMENTS I would like to thank Dr. Alain Fournier for his supervision and guidance of this thesis. My research would not be possible without the generous funding of the University Graduate Fellowship. Appreciation goes to Dr. Dinesh Pai and the ACME team members for the opportunity to work with them on the project. I am grateful for the assistance, encouragement, and support from members of the Im-ager Lab, colleagues in the department, and friends from the International House. And special thanks to the special people who have crossed my path, for pointing the way through the adventurous and often tortuous road of my life. Changching Chiu The University of British Columbia October 1998 vii C H A P T E R 1 — INTRODUCTION Computer graphics covers a wide range of topics. Expressing or recreating human perception of the real world is one of the many goals. While some may ex-press their perception in an interpretive style, others strive to reproduce the real world as closely as they can discern. Two approaches to recreate our perception are through geometric modeling and image manipulation. Objects in the real world can be described mostly by their mathematical equations, lists of discretized polygonal surfaces, or Boolean operations of solid primitives. These are a few of the methods of specifying the geometry of objects. A rendering step then follows to visualize these objects. However, geometry alone does not capture the richness of details that sur-rounds us. Surface features play a critical role in our recognition of objects. We rarely mistake a shiny red apple for a fuzzy pink peach or a bumpy tangerine, nor do we believe any round object in these shades of colours is a palatable fruit. Illumi-nation models, bump mapping, displacement mapping, and texture mapping are 1 some of the techniques for adding realism to plain geometry. Together in the ren-dering step, they help recreate images from the real world. How convincing these images are depends on how much detail they capture. But it is already difficult to pinpoint and quantify the important features. It is even more so to find a metric of sufficient details. Another approach to realism is to take photographs of the real objects and manipulate these photographs. The advantage of geometric modeling is that a new view can be created simply by changing a few parameters in the rendering step. 
It also allows the computation of physical properties such as volume, area and weight for other purposes. Photographic images, on the other hand, are 2D projections of the real world fixed at a particular time and space. A new view may require information that was blocked in the available views. Finding the correct views and filling in the gap replace capturing sufficient details as the central problem in image-based approaches.

Combining the advantages of geometry with the details from photographic images has been the research topic of many in recent years. Geometric information provides the basis for computation while photographic images provide the details of real-world features. Several techniques of image-based rendering have been published. QuickTime VR [4] is probably one of the best known. It warps images taken at a fixed view point but in different directions to generate new views of the surroundings. Another work by Debevec, Taylor, and Malik augments geometry with images to produce new views [5]. They reconstructed geometric information from a series of photographs and applied the photographs as texture maps specific to the given view on the geometry. The results are realistic images with highly variable view points.

1.1 Light Field

Using photographic images to provide the details closes the gap between computer generated images and reality. The light field is an attempt to capture the observed radiance of objects, and Levoy and Hanrahan presented an efficient method of working with light fields [12]. The Lumigraph is a similar approach presented by Gortler et al. [8]. A light field is a 4-variable function of radiance, the amount of light emitted by objects in the sampled space in a given direction, observed from a region free of occluders. This is a reduced version of the 5-variable plenoptic function introduced by Adelson and Bergen [1], under the assumption that radiance is constant along a line in free space. A view of an object is a 2D sampling of this 4-variable function. Each pixel in the view image is a sampling of the radiance along the line through the pixel in the viewing direction. By sampling the light field function at all pixels in the viewing direction, a specific view can be reconstructed.

One advantage of light fields is that they break the dependency of rendering time on the complexity of the scene. Although a complex object requires more samples in the light field than a simple object, the number of rendered pixels for a given view is the same for either object. Since the rendering time for each pixel is constant, the rendering time for a light field depends only on the view, not on the scene. For polygonal representations, on the other hand, rendering a pixel requires iterating through at least a subset of the polygons in the scene. Therefore, the rendering time for polygonal objects depends on the complexity of the scene even if the number of rendered pixels is the same for all cases. Table 1 compares the different approaches to modeling objects.

Representation                | Polygonal Surfaces | Photographic Images | Light Field
Measure of Complexity         | polygons           | images              | samples
Generation of New Views (1)   | yes                | limited             | yes
Rendering Time Dependency (2) | number of polygons | none                | none
Surface Details               | limited            | yes                 | yes
Geometric Information         | yes                | no                  | no

Table 1: Comparison of representations.

1.2 Thesis Objective

Any representation, including the light field, is limited in usefulness if it cannot be manipulated once it has been determined.
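Before discussing such manipulations, a minimal sketch may make the sampling just described concrete: a single slab stored as a 4D array of radiance samples, and one view pixel reconstructed by intersecting the viewing ray with the two parameterizing planes introduced in Chapter 2. The array layout, the placement of the planes at z = 0 and z = 1, and the nearest-sample lookup are illustrative assumptions only; they do not describe the LightPack data format.

    import numpy as np

    # Minimal sketch (not the LightPack format): one light-field slab stored as
    # a 4D array of radiance samples indexed by (u, v, s, t).  The UV plane is
    # assumed at z = 1 and the ST plane at z = 0, both spanning [0, 1] x [0, 1].
    class LightFieldSlab:
        def __init__(self, samples, uv_z=1.0, st_z=0.0):
            self.samples = samples            # shape (nu, nv, ns, nt, 3)
            self.uv_z = uv_z
            self.st_z = st_z

        def _plane_hit(self, origin, direction, z):
            # Parametric intersection of the ray with the plane z = const.
            t = (z - origin[2]) / direction[2]
            hit = origin + t * direction
            return hit[0], hit[1]

        def radiance(self, origin, direction):
            # A view pixel is one sample of the 4-variable function: intersect
            # the viewing ray with both planes, then read the nearest sample.
            u, v = self._plane_hit(origin, direction, self.uv_z)
            s, t = self._plane_hit(origin, direction, self.st_z)
            nu, nv, ns, nt, _ = self.samples.shape
            ui, vi = int(round(u * (nu - 1))), int(round(v * (nv - 1)))
            si, ti = int(round(s * (ns - 1))), int(round(t * (nt - 1)))
            if not (0 <= ui < nu and 0 <= vi < nv and 0 <= si < ns and 0 <= ti < nt):
                return np.zeros(3)            # ray misses the slab: background
            return self.samples[ui, vi, si, ti]

    # Usage: a tiny synthetic 8x8x16x16 slab and one viewing ray.
    slab = LightFieldSlab(np.random.rand(8, 8, 16, 16, 3))
    print(slab.radiance(np.array([0.5, 0.5, 2.0]), np.array([0.0, 0.0, -1.0])))

A real viewer would interpolate between the nearest samples in all four dimensions rather than rounding, but the lookup path is the same.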
Some operations that would increase the usefulness of light fields include reshading of objects [3], [21], modifying the shape, and composing with other forms of representation. The goal of this thesis is to develop a method for merging multiple light fields into a single representation' and applying affine transformation in the process. ' At run time. 2 In addition to view resolution. 4 The LightPack [13] software developed at Stanford Computer Graphics Laboratory facilitates the authoring and viewing of a single light field. The author-ing software is called lifauth, and the viewer is called lifview. The implementation of this thesis is a preprocess to the authoring step. It takes multiple sets of light field data, where each set is in the format specified by the authoring software, and com-bines them into a single set of data for processing by the software. The viewer then provides a user interface to visualize the merged light field. Figure 1 and Figure 2 shows the data flow pipeline of this process. authoring program Figure 1: Data flow pipeline of LightPack software. thesis implemen-tation authoring program Figure 2: Data flow pipeline of LightPack with thesis implementation. 5 CHAPTER 2 — BACKGROUND 2.1 Two-Plane Parameterization A light field is a 4-variable function (2 spatial variables and 2 directional variables). There are many ways to parameterize this space such as using spherical or cylindrical coordinates. Levoy and Hanrahan chose to parameterize it by the in-tersection of a line with two parallel planes. The UV plane sits between the ST plane and the viewer. Under this condition, the light field function can be interpreted as a set of images on the ST plane viewed from the (u, v) coordinates. When the UV plane is placed at infinity, the images become orthographic projections of the space. This interpretation allows a light field to be captured using a camera sys-tematically placed at known positions. The number of images in the data set will be large, however. A device that will automate this process is the Active Measurement (ACME) project currently under development by Dinesh Pai et al. (See Appendix). A light field can also be synthetically generated using a ray-tracer. The test objects 6 used in this thesis are generated by POV-Ray [18], a ray-tracer widely available as freeware. In addition to the ease of generating light fields, these same images are the input for the volume reconstruction algorithm which will be explained in Chapter 3. The algorithm requires images from different view points around the object with known projection parameters. A light field under the ST-image interpretation pro-vides the silhouette for determining whether a node is inside or outside. 2.2 Sampling Density And Pattern The amount of data available in a light field determines how successfully a new view can be created from it. If the input data does not contain the information, a new view cannot be created no matter how accurate and efficient the algorithm is. Levoy and Hanrahan stated that assuming all views are equally likely to be gener-ated, then all data points from the light field are equally likely to be accessed. Therefore, an ideal light field should contain a uniform sampling of the region. They also noted that two parallel planes with one at infinity leads to a more uniform sampling pattern than other arrangements. To visualize the coverage of the sampling space by a light field, they plotted these parameterizing lines in line space. 
For each line, the perpendicular distance to the origin (r) and the angle it makes with the X-axis (θ) become its coordinates in line space. The sampling pattern for the test objects in the XZ planes is plotted in Figure 3. The sampling density is 32x32 in the UV plane. Figure 4 shows the sampling pattern if a density of 8x8 were used. Notice that the horizontal density is not changed between the plots, but the vertical density is drastically reduced in Figure 4 because the number of different directions is fewer. This affects how accurately a new view can be constructed from the light field.

Figure 3: Sampling pattern of test objects at 32x32 positions.
Figure 4: Sampling pattern at 8x8 positions.

2.3 Multiple Slabs

Each pair of parallel planes defines one slab of a light field. Since a closed surface in 3D space does not map to a single plane nicely, a single slab cannot easily parameterize the free space. Multiple slabs are used to capture the entire set of radiance measurements. Although there are no specific restrictions on the orientation of these slabs, the most convenient and intuitive configuration consists of six slabs parallel to the axial planes similar to a box, with ST planes passing through the object (typically at the origin) and UV planes at infinity. Placing ST planes away from the object requires larger planes to cover the same space, as illustrated in Figure 5.

Figure 5: Off-centred slabs need to be larger to cover the same shaded area (viewing range vs. covered area).

CHAPTER 3 — METHODOLOGY

3.1 Overview

Since a light field contains only radiance but no geometric information, an approximating volume must be first extracted in order to merge multiple light fields. The volume information helps in determining visibility during the merging process. Given the light fields of several objects, approximate shape of each object is extracted using the algorithm described by Szeliski [20]. A visual representation based on the extracted volume is used to position and orient the objects relative to each other as well as to the slabs in the new light field. The new light field is computed, with consideration of the position and orientation of each object and visibility determined from volume information.

Figure 6: The merging process (multiple light fields -> resampling and volume reconstruction -> user composition -> visibility -> single light field).

3.2 Reconstructing Volume

3.2.1 Octree Representation

The volume reconstruction algorithm produces an octree representation of the volume occupied by the object. The octree is built in a top-down fashion and iteratively refined using information from the images. Leaf nodes can be inside or outside. They can also be ambiguous if partly inside and partly outside. This is different from normal octrees where only non-leaf nodes are ambiguous.
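A minimal sketch of such a three-state octree node, assuming an axis-aligned cube stored as a minimum corner and an edge length (the field and type names are illustrative assumptions, not the thesis implementation):

    from dataclasses import dataclass
    from enum import Enum
    from typing import List, Optional

    class NodeType(Enum):
        INSIDE = "inside"
        OUTSIDE = "outside"
        AMBIGUOUS = "ambiguous"     # partly inside, partly outside

    @dataclass
    class OctreeNode:
        corner: tuple               # minimum corner of the axis-aligned cube
        size: float                 # edge length
        ntype: NodeType = NodeType.AMBIGUOUS
        # None means "no child stored"; after trimming this reads as OUTSIDE.
        children: Optional[List[Optional["OctreeNode"]]] = None

        def subdivide(self):
            # Only ambiguous leaves are refined; inside and outside nodes stay leaves.
            half = self.size / 2.0
            x0, y0, z0 = self.corner
            self.children = [
                OctreeNode((x0 + dx * half, y0 + dy * half, z0 + dz * half), half)
                for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)
            ]
            return self.children

    # The reconstruction starts from a single ambiguous root enclosing the object.
    root = OctreeNode(corner=(0.0, 0.0, 0.0), size=1.0)
    print(len(root.subdivide()))    # 8 children, each initially ambiguous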
The root node is initially assigned ambiguous to start the process. During each iteration, am-biguous nodes are subdivided and each child node is classified as one of the three types. 3.2.2 Node Type Test In the classification step, each node is projected onto the images to deter-mine its type. An image is first binarized by thresholding to classify areas of the background and the object. For each pixel in the image, the size of the largest square, with lower left corner fixed at the pixel, that will fit inside the object is com-puted. The value will be zero for the background pixels. These values make up the inside map and will be compared with the node projection to determine whether the projection is inside. The same steps are repeated for the inverse of the binary image to determine the outside map. If the projection is completely inside the silhouette of the object as explained later, the node is considered inside. Similarly, if the projec-tion is completely outside, the node is classified as outside. When the projection is partly inside and partly outside, the node is classified as ambiguous. The decision is postponed until the next iteration. In general the shape of the projection is hexagonal. Testing this hexagonal shape against the image is not trivial. Since the aspect ratio of length to width of the orthographic projection of a node is approximately one3, the hexagonal shape is 3 The maximum occurs when the projection direction is parallel to a face diagonal. The as-pect ratio is V2 :1 or 1.414:1. 13 simplified to its bounding square and tested against the image. The lower left corner of this bounding square is located in the image and the size of the square is com-pared with the pre-computed value in the inside map for that pixel. If the size of the bounding square is less than the value from the inside map, then the projection is inside the silhouette. Likewise, if the size is less than the value from the outside map, the projection is outside the silhouette. The simplification may defer the correct type until later iterations by classifying nodes as ambiguous, but it does not erroneously classify an inside node as outside or vice versa. All nodes are classified using one image first, then the whole process starts again using the next image. Some reclassifications by a later image are not allowed. For example, once a node is classified as ambiguous or outside, it cannot be reclas-sified as inside by a later image even if the projection is completely inside the silhou-ette in the later image. However, an ambiguous node can be reclassified as outside. The allowable reclassifications are summarized in Table 2. 3.2.3 Trimming Octree Octree is a representation of 3D volume. Its space requirement could grow enormously with just a small number of levels. Each additional level subdivides a node in half for each of the three dimensions. Therefore, the demand for space could increase by eight-fold for each new level. It becomes important to trim un-necessary and redundant nodes as soon as possible. During the reconstruction al-gorithm, the octree is trimmed and simplified after each iteration. 14 For the purpose of merging light fields, an outside node is unnecessary. The spatial information it contains is not used during the algorithm, so its resources are released during the simplification process and the child reference of its parent is set to null. In other words, if a node contains a null child reference, then that child is consider to be outside. 
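To make the node type test of Section 3.2.2 concrete, the sketch below precomputes the per-pixel largest-square maps from a binary silhouette and classifies a node's projected bounding square against them. The corner convention (squares grow toward increasing row and column indices), the use of <= in the comparisons, and all names are illustrative assumptions rather than the thesis implementation.

    import numpy as np

    def corner_square_map(mask):
        """For each pixel, the side length of the largest square, anchored at
        that pixel and extending toward increasing indices, that lies entirely
        inside the mask (0 for pixels outside the mask)."""
        h, w = mask.shape
        side = np.zeros((h, w), dtype=int)
        for i in range(h - 1, -1, -1):
            for j in range(w - 1, -1, -1):
                if mask[i, j]:
                    right = side[i, j + 1] if j + 1 < w else 0
                    down = side[i + 1, j] if i + 1 < h else 0
                    diag = side[i + 1, j + 1] if i + 1 < h and j + 1 < w else 0
                    side[i, j] = 1 + min(right, down, diag)
        return side

    def classify_projection(corner, size, inside_map, outside_map):
        """Classify a node's projected bounding square against one silhouette."""
        i, j = corner
        if size <= inside_map[i, j]:
            return "inside"
        if size <= outside_map[i, j]:
            return "outside"
        return "ambiguous"   # decision deferred to the next subdivision level

    # Usage: a toy 8x8 silhouette; the outside map uses the inverted mask.
    silhouette = np.zeros((8, 8), dtype=bool)
    silhouette[2:6, 2:6] = True
    inside_map = corner_square_map(silhouette)
    outside_map = corner_square_map(~silhouette)
    print(classify_projection((2, 2), 3, inside_map, outside_map))   # inside
    print(classify_projection((0, 0), 2, inside_map, outside_map))   # outside
    print(classify_projection((1, 1), 4, inside_map, outside_map))   # ambiguous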
No additional operation should pass beyond such a null branch.

By construction, many nodes are classified as ambiguous, and the true type is delayed until the next level. Therefore, ambiguous nodes with child nodes of the same type (other than ambiguous) are likely occurrences. These child nodes are redundant and should be removed. The parent of these nodes is set to the same type as the children; all children and references to them are removed. The reduction process cascades upward until the root is reached. The invariants for each iteration of the process are 1) there should be no outside nodes, and 2) all inside nodes are leaf nodes.

However, this algorithm does not work on all types of objects. The specification of reconstructible objects is explained in Chapter 4, along with other limitations and examples of problematic objects.

3.2.4 Summary

At each iteration of the refinement process, the ambiguous nodes are subdivided. Their children are classified as one of the three types. The octree is then simplified to remove unnecessary and redundant nodes. Table 2 summarizes these steps.

Step           | Inside Node                       | Ambiguous Node (Internal)     | Ambiguous Node (Leaf) | Outside Node
Subdivision    | no                                | no                            | yes                   | no
Classification | inside, ambiguous or outside      | N/A                           | ambiguous or outside  | outside
Reduction      | removed if siblings are same type | children removed if same type | none                  | removed

Table 2: Reconstruction steps for different node types.

3.3 Interacting With Objects

3.3.1 Visualizing Octree

The octree representation gives an approximate shape to the object. It can be used to position and orient the object. Since the visualization of the octree is part of the user interaction, responsiveness is more important than accuracy. Rather than tessellating the octree to polygonal surfaces, a cloud of "fat points" is shown to represent the object. The user can specify the density level of the cloud as well as the radius of all points. Nodes at the same level as the density level are represented as one point. For each level above the density level, the number of points representing a node increases by eight times. Therefore, an inside root node of the octree would be represented by 8^d points, where d is the density level.

3.3.2 Position and Orientation

The position and orientation of an object is expressed as a 4x4 transformation matrix in homogeneous coordinates. This matrix transforms from object coordinate space to world coordinate space. It is used in the display of the point cloud representation. The inverse of this matrix transforms from world coordinate space to object coordinate space. It is used to transform test rays into object space as described in later sections. Associating a 4x4 matrix with each object is sufficient to specify the position and orientation of the object in world coordinate space, and the relationship among objects can be easily calculated.

3.4 Merging Light Fields

3.4.1 Determining The Closest Object

To create a slab for the merged light field, a test ray is sent out into the space with origin on the ST plane in the direction determined by the UV plane. Intersections both in front of and behind the ray origin are considered; points behind the origin are considered closer to the viewer than those in front of the origin. The test ray is tested for intersection with the octrees of each object, starting at the root node. If the test ray intersects a node, the test is repeated for all its inside and ambiguous children.
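Because each object carries its own object-to-world matrix (Section 3.3.2), the world-space test ray is first mapped into each object's coordinate space before this intersection test. A minimal sketch of that step follows; the transform builder and all names are illustrative assumptions, not the thesis code.

    import numpy as np

    def make_transform(scale=1.0, rz_deg=0.0, translate=(0.0, 0.0, 0.0)):
        """Build a simple object-to-world matrix: scale, rotate about Z, translate."""
        c, s = np.cos(np.radians(rz_deg)), np.sin(np.radians(rz_deg))
        m = np.eye(4)
        m[:3, :3] = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]]) * scale
        m[:3, 3] = translate
        return m

    def ray_to_object_space(obj_to_world, origin, direction):
        """Map a world-space test ray into object space with the inverse matrix.
        The origin is a point (homogeneous w = 1); the direction is a vector
        (w = 0), so it is not affected by the translation part."""
        world_to_obj = np.linalg.inv(obj_to_world)
        o = world_to_obj @ np.append(origin, 1.0)
        d = world_to_obj[:3, :3] @ direction
        return o[:3], d

    # Usage: a rod rotated 90 degrees and shifted; a test ray from the ST plane.
    M = make_transform(scale=1.0, rz_deg=90.0, translate=(0.0, 0.5, 0.0))
    o, d = ray_to_object_space(M, origin=np.array([0.0, 0.5, 2.0]),
                               direction=np.array([0.0, 0.0, -1.0]))
    print(o, d)   # the same ray expressed in the rod's own coordinate space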
If the node is an inside node or an ambiguous leaf node, the clos-est point of intersection with one of node faces is returned. The closest point of in-tersection among the children of a node is the point of intersection for that node. After the test has propagated to all leaf nodes, the closest intersection of the test ray with the object is determined. The object with the closest intersection is the closest object and determines the visibility at the (u, v, s, t) coordinate. 17 3.4.2 Finding The Corresponding Slab Each object consists of one or more slabs. The intersection test only deter-mines whether the test ray passes through the object, but does not specify which slab contains the relevant radiance. For each slab, the test ray is first checked for whether it comes from a direction covered by the slab. Then the intersection of the test ray with the plane containing the slab is determined and verified to be inside the slab. Depending on the arrangement of the slabs, multiple slabs could contain the information. In this case, the slab with the closest intersection is selected although the data should be exactly the same from any slab if the light field is self-consistent4. 3.4.3 Refining Volume In the intersection test, ambiguous nodes are also included in the test in ad-dition to inside nodes. This results in a superset of the true object volume. The dif-ference caused by overestimating object volume is significant, especially for complex objects with low reconstruction level. Steps must be taken to reduce this difference and recover from the lack of volume detail. While determining the closest object, the radiance of a given (u, v, s, t) coordinate is retrieved if the test ray intersects the object. Under the assumption that the radiance of the background is different from the radiance of the object, a radiance value equal to the background value signifies that the object does not intersect the test ray as the octree indicates. Therefore, this object can be safely eliminated from the candidates of closest objects. 4 The exact values obtained from an implementation.may be different due to sampling error and interpolation method. 18 CHAPTER 4 — RESULTS Results of the algorithm described in Chapter 3 operating on sample objects are described and analyzed in this chapter. Each light field object consists of four slabs of 32x32 images. Each image has the resolution of 128x 128. The four slabs contain views from the front, back, left, and right of the object. 4.1 Volume Reconstruction The results from the volume reconstruction algorithm are described in this section. Figure 7 shows a slice of the light field for one of the objects used in the reconstruction algorithm. It is a vertical cylinder with a horizontal section in the shape of a horizontal cylinder taken out. Four views of the reconstructed volume are illustrated in Figure 8. Open Inventor is used to visualize these nodes collected during the reconstruction process. The octree is reconstructed up to level four. The hollow section is correctly removed from the volume. Compare this reconstruction with the reconstruction of a level five in Figure 9. Level five reconstruction shows a better outline of the curves defining the cylinder. The round hollow section is also 19 closer to the real shape. The accuracy of this reconstructed volume will determine the error in the merged light field as explained in later sections. The second object is a horizontal cylindrical rod. A slice of its light field is shown in Figure 10. 
Its reconstructed volume is in Figure 11. The volume is recon-structed up to level 4. A higher level reconstruction of this object is not necessary because the object is fairly simple and convex. The excess volume will be eliminated by the refinement process. Figure 7: A slice of the light field used as input for the reconstruction algorithm. 20 Figure 8: Four views of the cylinder volume reconstructed to level 4. 21 Figure 9: Four views of the cylinder volume reconstructed to level 5. 22 Figure 10: A slice of the light field for the rod. 23 Figure 11: Four views of the rod volume reconstructed to level 4. 4.1.1 Algorithmic Limitations Although the volume reconstruction algorithm integrates nicely with the merging process, it does have some limitations that must be considered. The recon-struction algorithm projects each node onto the image and uses the boundaries in the images to separate the object from the background. For a point on the surface of the object to be identified as such, it must project onto one of the images as a 24 boundary point. Under affine transformations, points are projected along a straight line. A point will be a boundary point in the image if the line of sight is tangent to the object at that point. Otherwise, other surface points will surround it and it be-comes an internal point. As a result of this, not all objects are reconstructible. A point on the object is reconstructible if and only if there exist a tangent line that does not intersect any other parts of the object. An object is reconstructible if and only if all points on the surface are reconstructible. Convex objects fall under this category. A cup is not reconstructible because the inside wall does not have tangent lines that do not inter-sect the bottom of the cup. 4.1.2 Insufficient Data The tangent line test only specifies the type of objects reconstructible under ideal conditions. However, a sampled light field rarely provides the ideal condition. An object may be reconstructible, but there may be simply not enough data to do so. A surface point may not be projected as a boundary point in the available images even though a tangent line exists. Figure 12 shows an example of partial construc-tion due to insufficient data. A vertical circular cylinder under orthographic projec-tion is a rectangle if the projection direction is horizontal. When only four views are available, the reconstructed volume becomes an octahedral cylinder rather than a circular cylinder. The result is obviously incorrect even though the cylinder is re-constructible. 25 Figure 12: Example of insufficient data. 4.1.3 Workaround Although these limitations result in incorrect volumes, there are solutions to these problems. The limitation due to the object shape can be avoided by either physically or computationally dividing the object into reconstructible pieces. The merging process presented in this thesis can then merge the pieces back together along with other objects. The problem caused by insufficient data can be solved by increasing the number of view directions. 4.2 Point Cloud Visualization Point cloud visualization allows quick interaction with the object. It is less demanding for hardware acceleration than other forms of visualization. The pur-pose here is to give relative position and orientation of objects, not necessarily true 26 shape or form. Figure 13 shows two views of the user interface with the point cloud of the cylinder. 
Figure 13: Point cloud visualization of the cylinder with hole (two views of the interface, with translate/rotate controls).

4.3 Merged Light Field

Figure 14 shows three slices of the merged light fields. The rod is positioned inside the vertical cylinder through the hollow section. These 2D slices show that visibility is computed correctly. Merged light fields are then processed by lifauth and viewed by lifview. Figure 15 shows screen captures of lifview displaying the merged light field.

Figure 14: Three slices of the merged light fields.
Figure 15: Screen capture of lifview showing merged objects.

4.3.1 Shortcoming

The lack of inherent geometric information in the light field is the main cause of error in the merged object. Although the volume reconstruction algorithm used for this thesis recovers from some of these errors, it is not perfect. Figure 16 illustrates the problem. The image shows that the rod is broken in two pieces when it should be in one contiguous piece. The octree provides a superset of the true volume. The false region is eliminated if the radiance is that of the background. In the circled area, the octree node containing the cylinder would include excess volume that is not part of the object. However, the radiance of that region comes from another part of the object, not the background. Therefore, the region is not eliminated and the intersection test falsely indicates that the ray intersects the object when it really does not. The algorithm determines that the cylinder is closer than the rod, and chooses the radiance of the cylinder rather than the rod for the new light field. This problem can be reduced, although not completely resolved, by increasing the octree level.

Figure 16: A slice of merged light fields using level 4 volume.

The difference between using a coarser volume and a finer volume can be seen by comparing Figure 16 and Figure 17. Figure 16 is built using the level four volume for the cylinder object, while Figure 17 uses level five. The broken rod is more obvious when the level four volume is used, since ambiguous nodes are bigger, resulting in a greater area of false intersection with the test ray. The level five volume is closer to the true volume, so the area of false intersection is reduced. As seen in Figure 17, the rod does not break until much closer to the cylinder.

Figure 17: A slice of merged light field using level 5 volume.

4.4 CSG-Like Operation

The ability to apply affine transformations in the process of merging light fields allows objects to be built using the constructive solid geometry (CSG) union operation. As mentioned in previous sections, one limitation of volume reconstruction is the class of objects that are reconstructible. The merging process is similar to the union operation of CSG and permits combining smaller pieces of an object that is not reconstructible as a whole. In addition, the union operation allows building objects from a library of light fields. Figure 19 shows a desk lamp composed from two light fields. The base and arms are from the same box light field, while the bulb is from a dome light field. In the merging process, each piece is scaled, rotated, and translated to form the final shape. The individual light fields are shown in Figure 18.

Figure 20 shows the Stanford Dragon light field merged with a marble cylindrical container. The dragon is obtained from the Stanford Light Field Archive, while the marble container is built using POV-Ray.
This illustrates the ability to merge light fields made from different sources, as long as they adhere to the Light-Pack file format. Figure 18: Primitive light fields used to compose the desk lamp. 32 4.5 System Resources Light fields are inherently large data sets. The space requirement for the raw images of each test object is approximately 32 MB. Using Java jar lossless compres-sion, raw images are archived into a single working file of approximately 14 MB in size. More complex objects can have size of 80 MB after jar compression. The working file is then processed by lifauth to produce a quantized light field of about 12 MB for the lifview viewer. The running time for the merging process depends on the arrangements of the object as well as the number of objects. On a dual Pentium II 300 MHz PC5 with 128 MB of main memory running over NFS, the test objects took 3-4 hours for merging two objects and approximately 8 hours for four objects. A significant portion of the processing time is simply moving around 30-80 MB of data over the network. Note that the merging process is done only once for each scene. The implementation source code, test objects, and the results can be found online at http://www.cs.ubc.ca/spider/cchiu/lightfield/ or on the attached CD-ROM. 5 The implementation does not take advantage of multiple processors. 35 CHAPTER 5 — CONCLUSIONS A N D FUTURE WORK 5.1 Summary and Conclusions This thesis describes a method of merging multiple light fields into a single light field. A volume for each object is first reconstructed using a method presented by Szeliski. A user then specifies the position and orientation of these objects by interacting with the point-cloud visualization of the volume. Test rays through the ST plane in the direction specified by the UV plane are tested for intersections with these objects and the closest object is determined. The radiance from the closest object is the value for the (u, v, s, t) coordinate. While the raw images also serve as the input for the volume reconstruction algorithm, the class of reconstructible objects is restricted. An object is reconstruc-tible if and only if for every point on the object, it has at least one tangent line not passing through itself. For objects not in this class, it needs to be subdivided into smaller pieces that are reconstructible. 36 During experimentation, the algorithm is found to be sensitive to the level of detail available in the octree. The intersection test may indicate falsely that the test ray intersects the octree when the ray does not intersect the true volume. This is the result of including ambiguous nodes as part of the volume, effectively producing a superset of the true volume. Errors caused by excessive volume is minimized by eliminating those test rays with background radiance. The method is not foolproof, however. Despite some of the limitations mentioned above, applying affine transfor-mation and merging light fields have been shown as feasible operations. Possible enhancements to the process and future directions are listed in the sections below. 5.2 Future Work 5.2.1 Dynamic and Incremental Reconstruction As indicated in previous sections, an octree with insufficient level of details can cause unacceptable error in the merged light field. However, increasing the re-construction level for the entire octree is expensive in both time and space. Each additional level increases the requirement eight-fold. 
Since the reconstruction algorithm does not require that the octree be built all in one pass, incrementally refining the entire octree based on user interaction or predefined criteria would reduce the time to refine an octree should the current one be insufficient in detail. A better algorithm is to locally refine the octree as needed based on tolerance criteria. It would alleviate not only the space shortage but also the time complexity, since not all nodes are expanded.

5.2.2 Better Geometry Representation

Many hardware platforms that capture light fields, including ACME, can also capture range data. Reconstructing a surface representation from range data has been a research topic for many years with fairly satisfactory results. Some of the works include [2], [10], and [11]. Surface-based objects may have advantages over volumetric objects not explored by this thesis. Combining surface-based objects, using a representation based on the analysis of lines and triangles in the context of the two-plane parameterization by Gu, Gortler, and Cohen [9], with radiance information from a light field into a coherent structure would make it more useful than either one alone. Note that the key point of this thesis, however, is to recover sufficient information when such range data does not exist.

5.2.3 More Efficient Representation

The space requirement to store a light field has not been discussed in this thesis, but it is an important consideration when working with light fields. The simplest way of storing a light field, as used here, is to store it as an array of 2D images. Example light fields presented in this thesis all consist of 32x32 images at a resolution of 128x128 for each slab, and each object is made of four or six slabs. This easily translates to over 20 MB of data for each object. Any of the popular algorithms for compressing images can be applied to individual images; however, they do not exploit the relationship between adjacent images. As multimedia hardware that supports MPEG video [17] becomes a commodity, it may be possible to take advantage of the hardware for compression and decompression of light fields. By arranging images in an order that follows a space-filling curve, these 2D slices of a light field can be stitched together and interpreted as a video. MPEG compression [16] and hardware accelerators can then improve the storage and access of light fields. Other more drastic representations for storing light fields are possible. For example, multidimensional wavelet compression [14] exploits the coherence in all dimensions at once.

5.2.4 Free-Form Deformation

The method proposed here uses the inverse transformation matrix of the object to transform the test ray while leaving the light field in its original configuration. This matrix can describe any affine transformation including scaling, rotation, translation and shear. While this is sufficient for most purposes, it may not satisfy more complex modifications. Free-form deformation, presented by Sederberg [19], facilitates more complex manipulation of objects and can be integrated into the method presented here. Free-form deformation in 3D is described by three trivariate functions

    x_ffd = f_x(x, y, z)
    y_ffd = f_y(x, y, z)
    z_ffd = f_z(x, y, z)

that map points from the normal object space to the deformed object space. The inverse that maps points from the deformed space back to the original space can be computed numerically by trivariate Newton iteration.
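A minimal sketch of that numerical inversion is given below, using an illustrative twist deformation in place of a genuine lattice-based free-form deformation and a finite-difference Jacobian; the deformation, iteration count, and tolerance are assumptions for demonstration only.

    import numpy as np

    def deform(p):
        """An illustrative deformation: a gentle twist about the z-axis.
        (Stands in for the trivariate functions f_x, f_y, f_z above.)"""
        x, y, z = p
        a = 0.5 * z                      # twist angle grows with height
        return np.array([x * np.cos(a) - y * np.sin(a),
                         x * np.sin(a) + y * np.cos(a),
                         z])

    def jacobian(f, p, h=1e-6):
        """Numerical Jacobian of f at p by central differences."""
        J = np.zeros((3, 3))
        for k in range(3):
            dp = np.zeros(3)
            dp[k] = h
            J[:, k] = (f(p + dp) - f(p - dp)) / (2 * h)
        return J

    def invert_deformation(f, target, guess=None, iters=20, tol=1e-10):
        """Newton iteration: find p such that f(p) == target, i.e. map a point
        in the deformed space back to the undeformed object space."""
        p = np.array(target, dtype=float) if guess is None else np.array(guess, dtype=float)
        for _ in range(iters):
            r = f(p) - target
            if np.dot(r, r) < tol:
                break
            p = p - np.linalg.solve(jacobian(f, p), r)
        return p

    # Usage: deform a point, then recover it from its deformed position.
    p0 = np.array([0.3, 0.2, 0.8])
    q = deform(p0)
    print(invert_deformation(deform, q))   # approximately [0.3, 0.2, 0.8]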
Once a point is back in the original space, it can then be tested whether it is inside or outside the octree with recursive methods. Using the inside/outside test, the intersection of the test ray with the deformed object can be computed iteratively. The remaining part of the algo-rithm would proceed as presented here. The inverse computation could be intensive and cause problems, however, if the Jacobian of the deformation changes sign as noted by Sederberg. While some of the future directions for light fields are listed above, this the-sis shows that light fields can be composed and transformed to create more com-plex scenes. These operations increase the usefulness of light fields, adding some dynamism to the static light fields. 40 A P P E N D I X — ACTIVE MEASUREMENT ( A C M E ) FACILITY The ACME facility is a multipurpose platform for obtaining dense meas-urements from objects. Data such as light field, bi-directional reflectance distribu-tion function (BRDF), surface points, and contact forces requires systematic prob-ing of the object at predetermined points and directions. The number of measure-ments could range from 64,000 and up. Such task is greatly simplified and the re-sults have more validity by using automated a device and procedure. ACME is de-signed towards this goal under the direction of Dr. Dinesh Pai at the Computer Science Department of the University of British Columbia. ACME is under development at the time of this writing. It consists of several components for different measurements. The overall structure is a 8'x6'x6' gantry supporting an optical table (Figure 21). Three motorized tracks are mounted on the gantry to allow computer controlled movements in all three X, Y, and Z dimensions. The height of the optical table is adjustable to accommodate different objects. 41 Figure 21: Gantry of the ACME facility. Figure 22 shows the components of A C M E in a separate testing area. A col-our camera (a) with computer controlled parameters will be mounted on the Z-axis of the gantry with pan and tilt capability. Objects are placed on the test station (c) which also has three degrees of freedom. It can move linearly in the two horizontal dimensions and rotate 360°. A laser range finder (b) is aligned with the test station to measure the distance to the object on the test station. A robotic arm (d) is mounted nearby to perform contact measurements such as friction and deforma-tion. The facility can be controlled from any computer over the Internet by an authorized client. The client software is written in Java to accommodate multiple user platforms. A graphical representation of the A C M E facility is available on the 42 client computer using Java 3D. The current state of the components can be moni-tored through this model. The goal of ACME is to build a flexible tool to acquire the dense data sets crucial to many other areas of research. Manually determining data such as light fields, and BRDFs, surface points and contact forces is tedious if not impossible. Devices such as ACME will simplify this task and provide valuable information to other fields of science. Figure 22: Components of the ACME facility. 43 REFERENCES [1] Adelson, E. H. and J. R. Bergen, "The Plenoptic Function and the Elements of Early Vision", Computational Models of Visual Processing, The MIT Press, Cambridge, Massachusetts, 1.991. 
[2] Bajaj, Chandrajit L., Fausto Bernardini, and Guoliang Xu, "Automatic Re-construction Of Surfaces And Scalar Fields From 3D Scans", Computer Graphics Proceedings, Annual Conference Series, 1995, pages 109-118. [3] Blais, Martin, "Excursions en rendu par champ de lumiere: champ de visi-bility et re-illumination", Master Thesis, Departement d'Informatique et Re-cherche Operationnelle, Universite de Montreal, December, 1998. [4] Chen, S.E., "QuickTime VR—An Image-Based Approach to Virtual Envi-ronment Navigation", Computer Graphics Proceedings, Annual Conference Series, 1995, pages 29-38. [5] Debevec, Paul E., Camillo J. Taylor, and Jitendra Malik, "Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach", Computer Graphics Proceedings, Annual Conference Se-ries, 1996, pages 11-20. [6] Foley, James D., Andries van Dam, Steven K. Feiner, and John F. Hughes, Computer Graphics: principles and practice, Addison-Wesley Publishing Company, Reading, Massachusetts, 1995. [7] Glassner, Andrew S, Editor, Graphics Gems, Academic Press, Cambridge, Massachusetts, 1990. [8] Gortler, Steven J., Radek Grzeszczuk, Richard Szeliski, and Michael F. Co-hen, "The Lumigraph", Computer Graphics Proceedings, Annual Confer-ence Series, 1996, pages 43-54. 44 [9] Gu, Xianfeng, Steven J. Gortler, and Michael F. Cohen, "Polyhedral Ge-ometry and the Two-Plane Parameterization", Rendering Techniques '97 (Proceedings of the Eurographics Workshop in Saint Etienne, France, June 16-18, 1997), pages 1-12. [10] Hoppe, Tony DeRose, Tom Duchamp, John McDonald and Werner Stu-etzle, "Surface reconstruction from unorganized points", Computer Graph-ics Proceedings, Annual Conference Series, 1992, pages 71-78. [11] Hoppe, Hughes, Tony DeRose, Tom Duchamp, Mark Halstead, Hubert Jin, John McDonald, Jean Schweitzer, and Werner Stuetzle, "Piecewise Smooth Surface Reconstruction", Computer Graphics Proceedings, Annual Confer-ence Series, 1994, pages 295-302. [12] Levoy, Marc and Pat Hanrahan. "Light Field Rendering", Computer Graphics Proceedings, Annual Conference Series, 1996, pages 31-42. [13] LightPack: Light Field Authoring and Rendering Package (1996) [Online], The Leland Stanford Junior University, http://www-graphics.stanford.edu/software/lightpack [September 6, 1998]. [14] Lalonde, Paul and Alain Fournier^  "Real-time Rendering of Wavelet Com-pressed Light Fields", Imager Technical Report 1998, University of British Columbia. [15] McMillan, Leonard and Gary Bishop, Computer Graphics Proceedings, An-nual Conference Series, 1995, pages 39-46. [16] Moore, Jeffery, William Lee, Scott Dawson, and Brian Smith, "Optimal Parallel MPEG Compression", Technical Report TR96-1584, Cornell Uni-versity, May 6, 1996. [17] MPEG Home Page [Online], The Moving Picture Experts Group, http://drogo.cselt.stet.it/mpeg [September 23, 1998]. [18] POV-Ray - the Persistence of Vision Raytracer (1998) [Online], the POV-Ray Team, http://www.povray.org [September 6, 1998]. [19] Sederberg, Thomas W., and Scott R. Parry. "Free-Form Deformation of Solid Geometric Models", Computer Graphics (SIGGRAPH '86 Conference Proceedings), Volume 20, Number 4, August 1986, pages 151-160. 45 [20] Szeliski, Richard, "Rapid Octree Construction from Image Sequences", CVGIP: Image Understanding, Volume 58, Number 1, July 1993, pages 23-32. 
[21] Wong, Tien-Tsin, Pheng-Ann Heng, Siu-Hang Or, and Wai-Ying Ng, "Image-based Rendering with Controllable Illumination", Rendering Techniques '97 (Proceedings of the Eurographics Workshop in Saint Etienne, France, June 16-18, 1997), pages 13-22.
