Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Bioenergy supply chain optimization - decision making under uncertainty Zamar, David Sebastian 2017

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
24-ubc_2018_february_zamar_david.pdf [ 3.31MB ]
Metadata
JSON: 24-1.0363012.json
JSON-LD: 24-1.0363012-ld.json
RDF/XML (Pretty): 24-1.0363012-rdf.xml
RDF/JSON: 24-1.0363012-rdf.json
Turtle: 24-1.0363012-turtle.txt
N-Triples: 24-1.0363012-rdf-ntriples.txt
Original Record: 24-1.0363012-source.json
Full Text
24-1.0363012-fulltext.txt
Citation
24-1.0363012.ris

Full Text

Bioenergy Supply Chain OptimizationDecision Making under UncertaintybyDavid Sebastian ZamarB.Sc., The University of British Columbia, 2004M.Sc., Simon Fraser University, 2006A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFDOCTOR OF PHILOSOPHYinThe Faculty of Graduate and Postdoctoral Studies(Chemical and Biological Engineering)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)December 2017© David Sebastian Zamar 2017AbstractIn an age of dwindling fossil fuels, increased air pollution, and toxic groundwater, it is time weembrace renewable energy sources and commit to global green initiatives.In principle, biomass could be used to manufacture all the fuels and chemicals currentlybeing manufactured from fossil fuels. Unlike fossil fuels, which take millions of years to reacha usable form, biomass is an energy source that can close the loop on many of our recyclingand hazardous waste problems. The goal of this research is to develop flexible and easy touse mathematical frameworks suitable for the design and planning of biomass supply chains.This thesis deals with the development of discrete-continuous decision support methodologyand algorithms for solving complex optimization problems frequently encountered in the pro-curement of biomass for bioenergy production. Uncertainty and randomness is predominant,although often ignored, throughout the biomass supply chain. Uncertainty in the biomass sup-ply chain may be classified as upstream (supply) uncertainty, internal (process) uncertainty,and downstream (demand) uncertainty. This thesis endeavors to incorporate uncertainty inthe modeling of biomass supply chains. For this purpose, stochastic modeling and scenarioanalysis methodologies are utilized.The main contributions of this thesis are: (i) the development of a novel stochastic opti-mization methodology, called quantile-based scenario analysis (QSA); and (ii) the developmentof optimization algorithms, namely constrained cluster analysis (CCA) and min-min min-maxoptimization algorithm (MMROA), for the collection of bales across multiple adjoining fields.These methodologies are applied to three distinct biomass procurement case studies.Results show that QSA achieves more favorable solutions than those obtained using existingstochastic or deterministic approaches. In addition, QSA is found to be computationally moreefficient. In a case study involving the collection of forest harvest residues for several compet-iiing power plants, QSA achieved an average cost reduction of 11%. In a case study involvingthe collection of sawmill residues, QSA obtained a 6% gain in performance by accounting foruncertainty in the model parameters. In a case study involving the collection of bales, an 8.7%reduction in the total travel distance was obtained by the MMROA.iiiLay SummaryBiomass and the many forms of bioenergy produced from it are anticipated to provide a signif-icant contribution to the global energy supply. However, obtaining reliable and cost-effectivesupplies of biomass, which meet quality specifications and are produced in a sustainable man-ner over time, can prove to be difficult due to various sources of uncertainty.Bioenergy supply chain models tend to ignore uncertainties in the decision-making process.Results from these studies do not provide control mechanisms for variability and unfavorableoutcomes. This research is aimed at filling this gap.The main goal of this research is to develop novel methods for the optimal design andplanning of biomass supply chains that are robust in the presence of uncertainties. Developedmethods consider uncertainty regarding the availability of biomass, transportation, storage,competition for its use, and technologies used for its conversion into bioenergy.ivPrefaceThis dissertation is submitted for the degree of Doctor of Philosophy at the University of BritishColumbia. The research described herein was conducted under the supervision of ProfessorsS. Sokhansanj, B. Gopaluni, and N. Newlands in the Department of Chemical and BiologicalEngineering, University of British Columbia, between May 2014 and December 2017.This work is to the best of my knowledge original, except where acknowledgments and ref-erences are made to previous work. Parts of this research have been published in the followingjournals and peer reviewed conference proceedings:• Zamar, D. S., Gopaluni, B., Sokhansanj, S., and Newlands, N. K. (2015). Robust Optimiza-tion of Competing Biomass Supply Chains Under Feedstock Uncertainty. In 9th Interna-tional Symposium on Advanced Control of Chemical Processes, pages 1223–1228, Whistler,British Columbia. The International Federation of Automatic Control.• Zamar, D. S., Gopaluni, B., Sokhansanj, S., and Ebadian, M. (2016). Economic Opti-mization of Sawmill Residues Collection for Bioenergy Conversion. IFAC-PapersOnLine,49(7):857–862.• Zamar, D. S., Gopaluni, B., and Sokhansanj, S. (2017a). A constrained k-means and near-est neighbor approach for route optimization: With an application to the bale collectionproblem. In The 20th World Congress of the International Federation of Automatic Control,Toulouse, France. The International Federation of Automatic Control.• Zamar, D. S., Gopaluni, B., and Sokhansanj, S. (2017b). Optimization of sawmill residuescollection for bioenergy production. Applied Energy, 202:487–495.v• Zamar, D. S., Gopaluni, B., Sokhansanj, S., and Newlands, N. K. (2017c). A quantile-based scenario analysis approach to biomass supply chain optimization under uncer-tainty. Computers & Chemical Engineering, 97:114–123.viTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivPreface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vTable of Contents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ixList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiList of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xivGlossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .xviiDedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .xviii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Biomass as an alternative energy source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1.1 Overview of biomass resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.1.2 Conversion technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2 The biomass supply chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81.2.1 General structure and unique challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81.3 Problem description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11vii1.4 Review of current approaches to biomass supply chain optimization . . . . . . . . . . . . 121.5 Research objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151.6 Research contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182.1 Scenario analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182.2 Quantile-based scenario analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202.2.1 Mathematical formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.2.2 Stopping rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242.2.3 Bayesian networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252.3 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 Proof of Concept of the QSA Approach. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283.1 Problem description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283.2 Study data and scenario simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303.3 QSA formulation of the portfolio optimization problem . . . . . . . . . . . . . . . . . . . . . . . . . . 313.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364 Forest Harvest Residue Collection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384.1 Problem description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495 Sawmill Residues Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505.1 Problem description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505.2 Literature review. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545.3 Methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555.3.1 Optimization model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555.3.2 Simulation model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61viii5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696 Bale Collection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716.1 Problem description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716.2 Methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736.2.1 Part I: constrained cluster analysis of bales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736.2.2 Part II: within cluster route optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756.3 Stopping rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 806.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 887 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 907.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 907.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97Appendix A Quantiles and their Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110Appendix B Sawmills . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112Appendix C R-code for the Bale Collection Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113C.1 Model the study area and the bale locations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113C.2 Identification of the optimal number of cluster centers . . . . . . . . . . . . . . . . . . . . . . . . . . . 114C.3 Identification of the optimal routes within each cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . 118ixList of Tables3.1 Two-sample t-test of equal mean returns. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364.1 Optimization model notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424.2 Quantiles of the cost distribution of the SA, CC, and QSA solutions. . . . . . . . . . . . . . . 485.1 Optimization model notation - sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585.2 Optimization model notation - parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585.3 Optimization model notation - decision variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585.4 Stochastic input data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645.5 QSA optimal route schedule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676.1 Model parameter values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 866.2 Comparison of the MMROA to the NNH for different wagon capacities. . . . . . . . . . . 86B.1 Average annual sawmill residues produced by site. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112xList of Figures1.1 World energy consumption in 2013 by fuel source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 World net energy consumption by fuel type. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.3 Simplified structure of the biomass supply chain.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.1 Diagram of the quantile-based scenario analysis (QSA) approach. . . . . . . . . . . . . . . . . 223.1 Bayesian network structure of the annual return on assets. . . . . . . . . . . . . . . . . . . . . . . . 303.2 Quantile weighting functions provided to the QSA approach. . . . . . . . . . . . . . . . . . . . . 333.3 CDF of the returns obtained by each method (Markowitz, QSA, and SA). . . . . . . . . . 343.4 Boxplots of the out-of-sample performance of the solutions obtained by eachmethod (SA, Markowitz and QSA). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354.1 FHR availability in units of green ton per forest cell. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394.2 FHR wet basis moisture content per forest cell. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404.3 Catchment areas identified by the QSA, SA, and CC approaches. . . . . . . . . . . . . . . . . . 474.4 CDF of the QSA, SA, and CC solutuons.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485.1 The logic flowchart for truck operations in the simulation model. . . . . . . . . . . . . . . . . 615.2 Map of the study region including the location of sawmills and the depot. . . . . . . . 625.3 Performance distributions of the QSA and MS solutions. . . . . . . . . . . . . . . . . . . . . . . . . . 675.4 Out-of-sample performance of the QSA and MS solutions. . . . . . . . . . . . . . . . . . . . . . . . . 696.1 Obtained tours by the MMROA for a problem previously proposed by Grissoet al. [2007]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796.2 Distribution of wheat bales in a study area composed of 9 sections separated byaccess roads. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82xi6.3 Bale clusters and roadside storage sites identified by the CCA method. . . . . . . . . . . . 836.4 The MMROA solutions for wagon capacities of 8 and 15 bales. . . . . . . . . . . . . . . . . . . . 846.5 The MMROA solutions for wagon capacities of 35 and 45 bales. . . . . . . . . . . . . . . . . . . 856.6 Elbow plot of the number of clusters versus total WSS. . . . . . . . . . . . . . . . . . . . . . . . . . . . 876.7 Wagon capacity versus distance traveled. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87xiiList of Algorithms1 The quantile-based scenario analysis (QSA) algorithm. . . . . . . . . . . . . . . . . 232 The constrained cluster analysis (CCA) algorithm. . . . . . . . . . . . . . . . . . . 753 The min-min min-max route optimization algorithm (MMROA). . . . . . . . . . . 78xiiiList of AbbreviationsBCP Bale collection problemBN Bayesian networkBTU British thermal unitCC Chance constraintsCCA Constrained cluster analysisCDF Cumulative distribution functionCHP Combined heat and powerCVRP Capacitated vehicle routing problemDAG Directed acyclic graphEROEI Energy returned on energy investedFHR Forest harvest residueGLPK GNU linear programming kitGPS Global positioning systemGT Green tonLHV Lower heating valueMC Moisture contentMILP Mixed-integer linear programmingMMBTU One million British thermal unitsMMHC Max-min hill climbingMMROA Min-min min-max optimization algorithmxivMS Mean scenarioNNH Nearest neighbour heuristicPA Precision agricultureQSA Quantile-based scenario analysisR&D Research and developmentSA Scenario analysisSVRP Stochastic vehicle routing problemTSP Traveling salesman problemVRP Vehicle routing problemWSS Within-cluster sum of squaresxvGlossarychance constraint a constraint that is a function of a random variable. Chance constraintsmay be included in stochastic optimization problems to ensure that the probability ofsatisfying the constraint is above a certain level.cumulative distribution function the cumulative distribution function (CDF) for any randomvariable X, denoted by F(x), is the probability that X assumes a value less than or equalto x.directed acyclic graph a collection of vertices connected by edges. Each edge has an orienta-tion, from one vertex to another.NP-hard problems that are intrinsically harder than those that can be solved in polynomialtime.out-of-sample statistical evaluation of a model’s performance obtained by splitting a givendataset into an in-sample period used for model fitting, and an out-of-sample periodused to assess performance.quantile value which divides the domain of the distribution so that a given proportion, q, ofthe values lies below it. For example, the median is the q = 0.5 quantile, which meansthat the proportion 0.5 (half) will be below the median and 0.5 will be above it.scenario a particular realization of all the values of a model’s parameters, both random andfixed.traveling salesman problem a classic mathematical optimization problem that asks: Given acollection of cities and the cost of travel between them, what is the cheapest route thatvisits every city and returns to the starting place?xviAcknowledgmentsI would like to thank Professors S. Sokhansanj, B. Gopaluni, and N. Newlands for their support,enthusiasm, knowledge and friendship. I would also like to express my gratitude to othermembers of my committee, Professors A. Bouchard-Côté and J. Lim for their helpful ideas andsuggestions for improving the work presented in this dissertation. Thanks also to my friendsat the UBC Biomass and Bioenergy Research facility.This research would not have been possible without the financial support from several insti-tutions. This study was funded in part through the University of British Columbia’s GraduateFellowship, the Mitacs-Accelerate Internship, the Natural Sciences and Engineering ResearchCouncil of Canada, Agriculture and Agri-Food Canada, and BioFuelNet.Last but not least, I would like to thank my family for their love, unfailing encouragement,and support. In particular, I would like to thank my wife and son for all the love and joy theybring into my life and for the sacrifices they make daily for me to pursue my dreams.xviiDedicationThis thesis is dedicated to my parents, wife, and son.xviiiChapter 1Introduction1.1 Biomass as an alternative energy sourcePresently, fossil fuels such as oil, coal and natural gas are the prime energy sources of the worldand account for 81% of global energy consumption (see Figure 1.1). It is anticipated that thesesources of energy will be depleted within the next 50 to 100 years [Hughes and Rudolph, 2011;Kerr, 2012; Saidur et al., 2011]. To exasperate the problem, the world net energy consumptionis projected to increase, as illustrated in Figure 1.2, by 69% from 87 quadrillion British thermalunits (BTUs) in 2020 to 131 quadrillion BTUs in 2040 [U.S. Energy Information Administra-tion, 2016]. The BTU is a traditional unit of heat and is defined as the amount of heat requiredto raise the temperature of one pound (0.4 kg) of water by one degree Fahrenheit. The world isconsuming more fossil fuel for energy than is being discovered and the reserves of energy thatcan be cheaply mined have reached peak production [Hughes and Rudolph, 2011; Kerr, 2012;Saidur et al., 2011]. Moreover, the expected environmental damages, such as global warming,acid rain and urban smog from the production of emissions from the combustion of fossil fuelshave compelled the world to reduce carbon emissions and shift towards utilizing sustainableand renewable energy sources [Newlands, 2016, chap. 2; Saidur et al., 2011]. Biomass is de-fined as living or recently dead organisms and any byproducts of those organisms, plant oranimal. The term is generally understood to exclude coal, oil, and other fossilized remnantsof organisms, as well as soils. A unique characteristic of biomass compared to other forms ofrenewable energy is that it can take various forms such as liquids, gases, and solids. Thus,biomass can be used for electricity or mechanical power generation and heat. Biomass hasbeen recognized as a promising alternative feedstock for energy production, since it is bothrenewable and CO2 neutral [Rauch and Gronalt, 2010]. Biomass fuels release no more carbon1Oil 31%Coal 29%Natural gas 21%Nuclear 5%Biofuels 10%Hydro 3%Other 1%Figure 1.1: World energy consumption in 2013 by fuel source [U.S. Energy Information Ad-ministration, 2016].dioxide than would have been produced by natural processes such as crop and plant decay.On the other hand, fossil fuels release carbon into the atmosphere that were trapped for thou-sands of centuries underground. People permit a CO2 indulgence for biomass because of CO2absorption during the growing process. Carbon neutrality cannot be assumed for all biomassenergy a priori because emissions associated with land use change, farming, and transport ofmaterials must be considered [U.S. Environmental Protection Agency, 2012].Biomass material is a large source of renewable energy and an important part of the wastemanagement infrastructure [Callé, 2007, Chapter 1]. It can be used for energy productionat different scales, including large-scale power generation, combined heat and power (CHP),or small-scale thermal heating projects at governmental, educational or other institutions.Biomass comes from both human and natural activities and incorporates by-products from thetimber industry, agricultural crops, raw material from forests, household wastes, and wood.Some of the most common and promising biomass feedstocks are:• Forestry materials (logging residues, forest thinnings, etc.)• Agricultural residues (corn stover, wheat straw, rice straw, etc.)• Energy crops (switchgrass, miscanthus, hybrid poplar, willow, algae, etc.)• Urban and suburban wastes (lawn wastes, urban wood wastes, etc.)20501001502002503003504004505005506006507007508008502012 2020 2025 2030 2035 2040yearquadrillion BTU Fuel TypeLiquidsCoalNatural GasNuclearRenewablesFigure 1.2: World net energy consumption by fuel type [U.S. Energy Information Administra-tion, 2016].The advantages of using biomass over other alternative sources of energy include: (i) uniformdistribution over the world’s surface; (ii) represents a sustainable alternative to fast-depletingfossil fuels; (iii) reduction in green house gas emissions; and (iv) offers opportunities for local,regional, national, and international energy self-sufficiency. The biggest advantage of biomassover alternative sources of renewable energy is that unlike solar or wind energy, it is availableon demand.1.1.1 Overview of biomass resourcesThe processing of biomass yields by-products and waste streams, collectively called residues,that are currently under utilized, but have significant energy potential. An abundance ofbiomass resources are available in natural forests, rural areas and urban centers that can betransformed into energy. Algaculture is a another source of biomass currently in development.3Forest harvest residuesForest harvesting is a major source of biomass for energy and involves cutting trees and deliv-ering them to sawmills, pulp mills and other wood-processing plants. Its practical componentsinclude road construction, logging and log transportation. Harvesting may occur as thinningin young stands, or cutting in older stands for timber or pulp. It may also refer to the re-moval of trees damaged by insects, disease or fire. For example, the millions of acres of dead,downed and diseased timber infected by pine beetles in Western U.S. and Canada could beput to beneficial use by the biomass industry and assist with forest fire mitigation and sup-pression. Harvesting operations usually remove between 25 to 50 percent of the cut volume,leaving forest harvest residue (FHR) as a feedstock for energy markets. For every cubic meterof log removed, a cubic meter of waste remains in the forest. Most of the wood residues areleft in the forest to rot, particularly in sparsely populated areas where local demand for woodfuels is low. This is because forest residues have low density (i.e. low fuel value), which makestransport over long distances economically infeasible. As a result, it is economical to reducethe biomass density at the logging landing sites before transporting over large distances.Agricultural or crop residuesCrop residues, traditionally considered as waste, are increasingly being viewed as a valuableresource with economic value. Agricultural residues include corn stover (stalks and leaves),wheat straw and other leftovers from grain production. Energy derived from agriculturalresidues are generally considered sustainable as they use waste materials from food crop pro-duction, and do not compete with food crops for land. Agricultural residues are a major sourcefor bioenergy applications due to the huge areas dedicated to cultivation of food crops world-wide. Studies on corn stover have shown that, in general, fields with yields above 175 bushelsper acre could remove up to 2 tons of biomass per acre, without any need for additional nitro-gen or phosphorous applications [Johnson et al., 2006].The physical handling of biomass fuels during collection or at a processing plant can bechallenging. Biomass residues vary with density, moisture content and particle size and mayalso be corrosive. An important aspect of biomass residues collection involves the design,4maintenance, and operation of specialized machines (e.g. harvesters, chippers, screening sys-tems, balers, loaders, wagons, and trucks) [Williams et al., 2008]. The correct selection ofconversion technologies is fundamental to enable available biomass resources to be matchedwith a desired end use energy carrier (e.g. heat, electricity or transport fuels).Energy cropsDedicated energy crops are another source of biomass feedstock. These crops include trees orother herbaceous biomass which are harvested exclusively for energy production use. Recentdevelopments in bioengineering have enabled the identification of rapid growing, pest tolerantsite and soil specific crops.Herbaceous energy crops are harvested annually after taking 2 to 3 years to reach full pro-ductivity. These include grasses such as switchgrass, miscanthus, bamboo, sweet sorghum, andwheatgrass. Short rotation woody crops are fast growing hardwood trees harvested within 3 to8 years after planting. These include poplar, willow, silver maple, cottonwood, green ash, blackwalnut, sweetgum, and sycamore. Aquatic resources, such as algae, giant kelp, and seaweedare also considered energy crops.Urban wood wasteUrban waste, such as lawn and tree trimmings, whole tree trunks, wood pallets and any otherconstruction and demolition wastes made from lumber are important biomass feedstocks. Amajor benefit to using wood waste as a biomass feedstock is that it reduces the amount of woodwaste disposed of in landfills or openly burned and stimulate local economic development.AlgacultureAlgaculture is the farming of species of algae. Algae are organisms that grow in aquatic en-vironments and use light and carbon dioxide (CO2) to grow. Algal biomass contains threemain components: carbohydrates, proteins, and lipids/oils. Microalgae have been recognizedas good sources for biodiesel production because of their relatively high oil content (in theform of tricylglycerol) and rapid biomass production [Thomas, 2002]. Algal biomass can also5be burned similar to wood, converted into a biogas to generate heat and electricity, or treatedby pyrolysis to generate crude bio-oil.1.1.2 Conversion technologiesTo make use of the energy available in biomass, technologies exist to either release the energydirectly (e.g. burning of biomass for heat) or to transform it into other more convenient formssuch as solid or liquid fuel. Biomass can be used in its solid form or gasified for heating ap-plications or electricity generation. Alternatively, it can be converted into liquid or gaseousfuels. Various technologies exist to convert biomass into power, heat, and fuels. These includedirectly-fired or conventional steam approach, co-firing, pyrolysis and gasification.Direct fired or conventional steam boilerMost woody biomass to bioenergy plants use a direct-fired system or a conventional steamboiler in which biomass feedstock is burned to produce steam. Three things are required inproper combination before ignition and combustion can take place: heat, oxygen and fuel.There must be heat (ignition temperature) to start and continue the combustion process [Em-mons and Atreya, 1982]. In a direct-fired system, biomass is fed from the bottom of the boilerand air is supplied at the base to sustain combustion. The ensuant hot combustion gases arepassed through a heat exchanger where water is boiled to create steam. The steam is passedthrough a turbine that spins an electrical generator. The heat generated by the exothermic pro-cess of combustion to power the generator can also be used to regulate temperature of the plantand other buildings, making the whole process much more efficient. The combined generationof heat and power provides an economical option, particularly at sawmills or other sites wherea source of biomass waste is already available. For example, wood waste from sawmills aretypically used to produce both electricity and steam at pulp and paper mills.Feedstocks in direct-fired systems undergo various levels of preprocessing, including dry-ing, sizing to smaller fragments, and in some cases pelletization or briquetting for long termstorage and transportation. Pelletization is a process of reducing the bulk volume of biomassfeedstock by mechanical means to improve handling and combustion characteristics of biomass.6Wood pellets are normally produced from dry industrial wood waste, such as shavings, saw-dust, and bark. Pelletization enables several advantages, including: (i) improved handling andconveyance efficiencies; (ii) controlled particle size distribution for improved feedstock unifor-mity and density; and (iii) conformance to predetermined conversion technology and supplysystem specifications [EPA, 2014].Co-firingCo-firing or co-combustion of biomass with coal and other fossil fuels is an effective way tointegrate biomass with energy systems based on fossil fuels. Co-firing involves upgrading anexisting power plant that is fired with coal and displacing a small proportion, typically between3 and 15 percent, of the coal with renewable biomass fuels [EPA, 2014]. Like coal, biomass isplaced into the boilers and burned. The only cost associated with upgrading the system isincurred in buying a boiler capable of burning both the fuels, which is a more cost-effectivethat than building a new plant. Co-firing also has the advantage of avoiding the constructionof a new, dedicated, biomass power plant.Biomass gasificationThe gasification process converts biomass into a combustible gas, which ideally contains allthe energy originally present in the biomass. The gasification process can be either direct orindirect. The direct method uses air or oxygen to generate heat through an exothermic reaction.The indirect method transfers heat to the reactor from the outside. Steam is typically uses forthis purpose, because it is easily produced and increases the hydrogen content of the syngas[Belgiorno et al., 2003]. In practice, conversion efficiencies range from 60% to 90%.The product of gasification is a combustible synthesis gas, or syngas, which contains CO2,CO, H2,CH4,H2O, trace amounts of higher hydrocarbons, inert gases present in the gasifica-tion agent, various contaminants such as small char particles, ash and tars [Bridgwater, 1994].Syngas can be burned to produce industrial or residential heat, to run engines for mechanicalor electrical power, or to make synthetic fuels.7PyrolysisPyrolysis offers a flexible way to convert solid biomass into an easily stored and transportedliquid or gas, which can be successfully used for the production of heat, power and chemicals.In pyrolysis, biomass is subjected to high temperatures in the absence of oxygen resulting inthe production of bio-oil, char or synthesis gas that can be used like petroleum to generateelectricity [Verma et al., 2012]. This process transforms the biomass into high quality fuelwithout creating ash or energy directly.Wood residues, forest residues and bagasse are important short-term feedstocks for pyrol-ysis as they are usually plentiful and inexpensive. Agricultural residues are important in thelonger term, but tend to have higher ash content. Recently, sewage sludge has been utilized asan alternative feedstock, which can be pyrolyzed to produce liquid fuel.Pyrolysis oil offers many advantages over solid biomass and gasification due to its ease ofhandling, storage and combustion in existing power stations.Biochemical conversion to liquid fuelsIn biochemical conversion processes, enzymes are used as catalysts to convert biomass intoliquid fuels. Cellulase and hemicellulase enzymes break down the carbohydrate fractions ofbiomass to form five-carbon and six-carbon sugars in a process known as hydrolysis. Yeast andbacteria are subsequently used to ferment the sugars into products such as ethanol.1.2 The biomass supply chain1.2.1 General structure and unique challengesBioenergy production involves the flow of biomass from the land to its eventual use for heatand power generation. Along the way, biomass passes through a sequence of processes, whichmake up the biomass supply chain. Fundamental stages of the supply chain are growing,harvesting, transporting, storing and conversion, with each stage requiring a unique set ofknowledge, technology and activity. Depending on the biomass feedstock, preprocessing maybe required along the pathway from the land to end use. A typical agricultural biomass supply8Biomass ProducersHarvesting and CollectingStoringHandling and TransportingPreprocessing ConversionBiomass Procurement OperationsFigure 1.3: Simplified structure of the biomass supply chain. Adapted from BISYPLAN [2012].chain encompasses all of the operations from biomass cultivation and harvesting to its deliv-ery at the gate of the conversion facility. The stages of the supply chain are interdependentand interconnected, with changes in productivity and technology in one stage effecting that inothers. As shown in Figure 1.3, transport, storage and handling are key issues throughout thesupply chain and link the various stages to each other.Biomass harvestingBiomass cultivation and harvesting deliver high-quality, stable, and compatible feedstocksfrom diverse agricultural sources to energy conversion and processing facilities. Comparedto the petroleum-to-fuel supply chain, the biomass-to-bioenergy supply chain differs signifi-cantly in raw material acquisition. High energy density petroleum emerges as a liquid via pointsource well-drilling, which can be hauled over long distance with minimum transport costs. Incontrast, raw biomass feedstock is a low-energy density solid that is inherently distributed overa large area requiring considerable collection and transportation efforts [U.S. Deptartment ofEnergy, 2012].In addition, biomass feedstocks have growing cycles and need to be planted, cultivated,and harvested. The growing cycle time and harvesting season vary between feedstocks. The9seasonal nature and annual variability in biomass supply presents an additional challenge inensuring a stable provision of biomass year round. For instance, shrub willow, a type of cellu-losic biomass, is grown on a 3-4 year cropping cycle or one harvest every 3-4 years. Harvestingof shrub willow is done during the winter, after leaf fall and before bud swell in early springAbrahamson et al. [2010]. Harvest timing and frequency affect the yields of biomass feed-stocks, thus careful planning on the use of land and labor is essential to guarantee the quantityand quality of the biomass supply [Sokhansanj et al., 2009].Biomass storing, transport and deliveryThough the supply of biomass may be seasonal, the demand for heat and energy is all-yearround. As a result, the harvested biomass feedstocks must be stored for continuous use in bio-fuel production throughout the year. The storage point could be on the farm, forest or othercollection points, at the energy conversion facility or at an intermediate site. Proper storage de-lays degradation of the biomass and reduces variability in the moisture content. Typical storagesolutions are plastic or net wrapping, temporary post-frame covered structures, reusable tarps,and more costly permanent ambient controlled structures. The presence of high water contentsignificantly decreases the density and increases transportation cost. As such, raw biomassusually goes through a densification procedure, such as chipping, shredding or baling, beforelong-distance transportation. Thermal and chemical treatment may also be applied as addi-tional preprocessing steps to reduce the moisture content, remove contaminants, and improvethe quality, stability and handling characteristics of biomass feedstocks. Major preprocess-ing technologies include torrefaction, pelletization, and ammonia fiber expansion [Uslu et al.,2008].Biomass conversionBiofuel production is typically accomplished in a biorefinery, which is a facility that integratesconversion processes and equipment to produce fuels, power, and chemicals from biomass.Broadly speaking, biomass-to-liquid conversion technologies can be classified into two classes,namely biochemical and thermochemical Anex et al. [2010]. Major biochemical conversion10processes include pretreatment, hydrolysis, biological or chemical processing, and product up-grading and recovery. Thermochemical technologies, such as gasification and pyrolysis, utilizehigh temperature in addition to catalysts to change the physical properties and chemical struc-tures of the biomass resources [U.S. Deptartment of Energy, 2014].1.3 Problem descriptionThe ever growing demand of lignocellulosic (i.e. plant dry matter) biomass for bioenergy pro-duction has resulted in increased transportation distances and costs for procurement [Alamet al., 2012]. Logistics planning of biomass supply become more complicated with increaseddemand due to competition for available biomass over a given space and time. This suggestsa need for developing and using optimization models for short and long term cost effectivedecision planning.Lignocellulosic biomass has been touted as one of the most promising feedstock for nextgeneration biofuel [Isikgor and Becer, 2015], however significant uncertainties regarding theirsupply, price, demand, and technology complicate the decision process and thwart the widespreadadoption of bioenergy by industry and government [Lee, 2014]. The majority of the literatureon biomass supply chain design assumes all the parameters in the system are known a pri-ori [Kazemzadeh and Hu, 2013]. However, there is a high level of uncertainty that can beencountered in practice and it is important to develop approaches to deal with the uncertain-ties associated with the biomass supply chain design [Grossmann and Guillén-Gosálbez, 2010;Shah, 2005].Sources of uncertainty in decision outcomes are typically due to three different conditions:(i) lack of knowledge, such as the quality characteristics of available biomass feedstocks; (ii)noise, such as measurement errors or incomplete data; and (iii) events that have not yet oc-curred, such as future energy demand or feedstock supply shortages and disruptions [New-lands et al., 2014; Ravindran, 2007].A well thought out biomass supply chain will maintain a constant and reliable feedstockfor the conversion process to ensure financial sustainability of the system. Investors are not11likely to back a project that could seemingly have many uncertainties in feedstock availability.However, many important variables regarding biomass logistics planning are not known in ad-vance. An analysis of delivered feedstock cost variability ranked biomass yield (e.g., collectionefficiency, grain yield, storage dry matter loss), biomass materials properties (e.g., bulk density,moisture content), and machine performance (e.g., shredder speed, baler capacity, baler fieldefficiency) as being the most important input parameters to consider when modeling biomasssupply chain processes [Kenney et al., 2012]. Furthermore, unexpected events may occur dur-ing field operations, such as fire, machine breakdowns and delays due to severe weather. It isimportant to account for uncertainties regarding biomass procurement and field operations,since these activities reside at the front end of the feedstock supply chain and their impactsdirectly or indirectly ripple all the way down the line.The optimization criteria of biomass procurement and field operations, such as the min-imization of the non-productive time, fuel consumption, traveled distance, and shortage inavailability or quality, may result in significant economic and environmental benefits. Due tothe inherent uncertainty and complexity of the problem, it becomes necessary to describe agri-cultural biomass procurement operations using probabilistic mathematical models that canaccount for uncertainty in the input parameters. To the best of our knowledge, little researchexists addressing the stochastic nature of the biomass supply chain.1.4 Review of current approaches to biomass supply chainoptimizationGiven the substantial research and development investments that have gone into bioenergytechnologies, significant literature exist on the issue of optimal supply chain design for biomassand biofuel systems. Here, we touch upon a small subset most relevant to the present discus-sion.A proper decision process requires one to model the major uncertainties and examine howthey impact the design decisions and expected performance. Decision models of increasingscope and sophistication have been devised for biomass supply chain optimization, including12some that incorporate uncertainty. Geospatial data analysis have been introduced to biomasssupply chain studies in order to more accurately represent the expected supply of biomass ina given region as well as the transportation distances and related costs [Lee, 2012].Dentcheva [2007] provides an overview of the theory and numerical techniques for op-timization models involving one or more constraints on probability functions, called chanceconstraints (CC). Dunnett et al. [2008] presented a combined production and logistics model inorder to optimize system configurations for various technological and supply and demand sce-narios specific to European agricultural land and population densities. Ekşioğlu et al. [2009]proposed a mathematical programming model to determine the number, sizes and locationsof biorefineries and applied it to the State of Mississippi as a test case. Leduc et al. [2010]presented a mixed-integer linear programming (MILP) to optimize the locations and sizes ofmethanol plants with heat recovery and gas stations in Austria. Osmani and Zhang [2014]proposed a two-stage linear programming approach to maximize the annualized profit of anintegrated dual-feedstock lignocellulosic based bioethanol supply chain.A number of recent studies in this field have considered the uncertainties associated withsupply chain management. Grossmann and Guillén-Gosálbez [2010] reviewed major con-tributions in process synthesis and supply chain management and pointed to the need todevelop more sophisticated optimization and decision-support methodologies. Awudu andZhang [2012] discussed uncertainties in biofuel supply chain management and reviewed re-lated works. Sharma et al. [2013] studied weather uncertainty in field operations when har-vesting agricultural biomass. Sensitivity analysis was done to demonstrate the effect of inputuncertainty in yield, land rent and storage dry matter loss on the model outputs. Ebadian[2013] developed an integrated simulation and optimization model to plan and schedule thesupply of biomass to a commercial-sized cellulosic ethanol plant. Shabani and Sowlati [2015]evaluated the impact of uncertainty and variability on the supply chain optimization of a forestbiomass power plant using Monte Carlo Simulation. Similarly, a global sensitivity analysis ofbiomass supply chain networks has been provided by Kim et al. [2011] using a Monte Carlosimulation over the hypercube formed from the parameter ranges. Gebreslassie et al. [2012]present a bicriterion, multiperiod, stochastic MILP model to address the optimal design of13hydrocarbon biorefinery supply chains under supply and demand uncertainties.Planning models over multiple periods have been formulated and tried in recent years. AMILP model for strategic design and planning of a supply chain in a period of 10 years wasdeveloped by Dal-Mas et al. [2011] that considers uncertainty in biomass production cost andproduct selling price. The model was tested on a real-world case study involving an emergingcorn-based bioethanol production system in Northern Italy. Giarola et al. [2012] developed aMILP model framework to assess the design and planning of a multiperiod and multiechelonbioethanol upstream supply chain under market uncertainty considering economic and envi-ronmental performance. Awudu and Zhang [2013] proposed a stochastic MILP for a biofuelsupply chain under demand and price uncertainties within a single-period planning frame-work to maximize the expected profit. The decisions variables are the amount of raw materialspurchased, the amount of raw materials consumed and the number of products produced.Approaches for solving stochastic optimization problems are scattered throughout the lit-erature on research and development (R&D). A recent review of this area reveals the lack of acomprehensive methodology that can be applied to the general stochastic optimization prob-lem [Berhan et al., 2014]. In most instances surveyed, the stochastic problem of interest istransformed into a deterministic equivalent that can be solved using mathematical program-ming solvers. The predominant approach for solving the stochastic optimization problems is touse “here-and-now-optimization” techniques, where decisions are made based on informationavailable at the time they are taken. When some of the parameters are random, it is no longerpossible to require that all constraints be satisfied for all realizations of the random parame-ters. Therefore, the decision maker may require that the constraints be satisfied with a givenprobability, or the incorporation of corrective actions be taken when a constraint is violated.These techniques generally result in computationally demanding and unscalable models forproblems involving a large number of random parameters.Not surprisingly, the majority of studies on biomass supply chain modeling and optimiza-tion do not explicitly account for uncertainty. Instead, they tend to diagnose the robustness oftheir solutions by means of a sensitivity analysis. Sensitivity analysis is defined as the study ofhow the uncertainty in the output of a mathematical model or system can be apportioned to14uncertainty in its inputs. Strategies for conducting a sensitivity analysis proceed as follows:Before optimization – Construct a mathematical model of the system, including the objec-tive function and all the model constraints. Obtain estimates of the model parametersbased on apriori justification or knowledge.During optimization – Identify the optimal values of the decision variables that simulta-neously maximize the objective function and satisfy all the model constraints.After optimization – Evaluate the robustness of the solution with respect to changes in thevalues of the model parameters.If the performance of a solution is found to be unsatisfactory (i.e. sensitive to small changes inthe values of the input parameters), then the solution must be modified to correct the problem.However, identifying how to tweak the solution is not always straightforward, especially whendealing with biomass supply chains that have numerous model parameters that may interactwith one another.While these approaches consider the stochastic nature of the problem, they do not consideruncertainty during the optimization process. In contrast, our approach explicitly accounts foruncertainty in the model parameters during optimization. This is achieved by examining alarge number of deterministic scenario solutions and selecting the solution, which has the bestoverall performance distribution across the sampled scenarios while satisfying the stochasticproblem constraints with a specified probability.1.5 Research objectivesThe objectives of this research are:1. To develop mathematical models for the optimization of processes involved in the pro-curement of biomass from: (i) forest; (ii) sawmill waste; and (iii) agriculture.2. To explore and highlight the potential benefits of stochastic optimization in the area ofbiomass supply chain planning.3. To develop an efficient and generalizable approach for constrained stochastic optimiza-tion.15The developed approach to stochastic optimization should be easily applicable to biomasssupply chain problems, which are plagued by uncertainty. The solutions obtained by the ap-proach should be statistically robust so that they are not much influenced by extreme eventsthat have a low probability of occurring.Specific aims of this study are as follows:i. To develop a stochastic model to identify the optimal FHR catchment areas for a set ofbiorefineries in a given region. The catchment areas are primarily driven by accessibilityand the spatial yield distributions of potential woody biomass resources.ii. To develop a stochastic model for the collection of sawmill residues in the Lower Main-land Regions of British Columbia, Canada. This is a vehicle routing problem for a net-work composed of a single depot and 25 sawmills. The sawmills serve as potential suppli-ers of biomass residues to the depot, which in turn processes and distributes the residuesto the pulp and paper mills. This problem consists of identifying the best daily routingschedule for a fixed number of trucks. The objective is to maximize the ratio of energyreturned on energy invested, while satisfying a minimum daily amount of dried biomassresidues. There are several random components in the problem, including the availabilityand moisture content of the biomass residues.iii. To develop an efficient algorithm for the collection of bales on the field, which minimizesthe total travel distance consequently reducing the fuel consumption. The problem hastwo main components. The first part involves identifying the optimal roadside storagelocations where bales are to be transported. This translates to a constrained cluster anal-ysis optimization problem as the storage locations must lie on the side of the road. Thesecond part involves identifying efficient collection routes for transporting the bales fromthe field to their respective roadside storage location.1.6 Research contributionsThe contributions of this study are fourfold:i. This study develops a mathematical framework for solving complex constrained stochas-16tic optimization problems that are often encountered in biomass supply chain planning.Conventional optimization based approaches for the design of biomass supply chain net-works assume that the operational characteristics, and hence the design parameters aredeterministic [Alam et al., 2012; Lin et al., 2014].ii. The novel methodology can handle large constrained stochastic optimization problems.This is achieved by expressing the problem as a series of similar deterministic constraintproblems, which can be efficiently solved in parallel. This compilation allows us to useexisting constraint solvers without any modifications, as well as call upon the power ofhybrid solvers for non-conventional optimization problems.iii. This study explores the methodological and practical advantages of addressing the prob-lem of procurement planning for bioenergy production in a stochastic manner in com-parison to a purely deterministic approach. This is illustrated in two distinct case studiesinvolving the collection of forest harvest residues and sawmill residues, respectively.iv. An efficient algorithm is developed and validated for the joint collection of bales acrossneighboring farms. The novelty of the algorithm is that it simultaneously optimizes thetotal travel distance and the locations on the side of the road where the bales should betransported.17Chapter 2Methods2.1 Scenario analysisThis section provides a brief review of scenario analysis. For more details, the reader may referto Birge and Louveaux [2011] and references therein. Systems that need to be controlled oranalyzed involve some level of uncertainty about the values to assign to the model parameters,if not about the model structure itself. Analysis of uncertainty is often neglected in the de-sign of complex systems models. Prediction uncertainty arises from a variety of sources, suchas input error, calibration accuracy, parameter sensitivity and parameter uncertainty. Com-plex system, such as the biomass supply chain, incorporate many parameters, most of whichrequire measurements obtained from field data collection. Regardless of the method used forparameter estimation, one often has little sense of which parameters have the most influence onmodel output. In some situations not so much is lost by selecting a suitable model to representthe system and assigning reasonable values to the model parameters. In other circumstances,ignoring uncertainty may invalidate implications one may wish to draw from the analysis. Un-certainty can be incorporated into a model in many ways. One that has been used successfullyin a wide variety of situations is to model the uncertain parameters as random variables andassign them appropriate probability distributions.As described in the seminal paper by Dembo [1991], the multi-scenario stochastic optimiza-tion problem may be defined in the standard inequality form given in Equation 2.1, where Uis the index set of stochastic constraints and L is the index set of deterministic constraints.Theobjective function, J (x,ω), and the stochastic constraints, Cu (x,ω), depend on the optimiza-tion variables x and the values of the random parameters ω ∈Ω. The constraints Dl(x) do not18depend on the values of the random parameters and are therefore deterministic.maximizexJ(x,ω) (uncertain objective) (2.1a)subject to Cu(x,ω) ≤ 0, u ∈U (uncertain constraints) (2.1b)Dl(x) ≤ 0, l ∈ L (deterministic constraints) (2.1c)A scenario is defined as a particular realization, ωi , of a random scenario ω ∈ Ω, where i ∈ I ,and I is an index set whose elements label (or index) specific members of the set Ω. Note thatfor each realized scenario, ωi ∈Ω, the above problem reduces to a deterministic subproblem.We assume that, in principle, the solution of a single scenario poses no difficulty. On theother hand, solving each subproblem separately does not constitute a solution for the originalstochastic problem, because a single final solution is still required. A fundamental issue inscenario analysis is how to combine the solutions from different scenarios to form a singleoverall best solution for the original stochastic problem. Let (Ω, p) be a probability spacewhere Ω is the set of possible realizations of the random vector ω and p is the associatedprobability function. A common coordination model that combines the scenario solutions intoa single feasible solution is given by:xˆ = argminx∑i∈I||J(x,ωi)− vi ||2p(ωi) +∑i∈I∑u∈U||Cu(x,ωi)− ciu ||2p(ωi) (2.2a)subject to∑l∈LDl(x) ≤ 0 (2.2b)where vi = J(xi ,ωi), ciu = Cu(xi ,ωi) and xi is the optimal solution to the coordination modelgiven in Equations (2.2a) and (2.2b) with ω = ωi [Dembo, 1991]. The term argmin in Equation(2.2a) stands for “argument of the minimum”, and are the points of the domain of x at whichthe function values are minimized. The complementary operator is argmax. The coordinationmodel of Equation (2.2) incorporates the random constraints into the objective function as apenalty. Associated with each scenario, ωi , is a probability p(ωi). This model attempts to trackthe optimal scenario solutions as closely as possible while still maintaining feasibility. In other19words, it searches for a solution, xˆ, such that J(xˆ,ωi) and Cu(xˆ,ωi) are as close as possible toJ(xi ,ωi) and Cu(xi ,ωi) as measured by the coordination model given in Equation (2.2). Wemay think of the solution, xˆ, as a centroid of the various scenario-solutions. In summary, thescenario optimization approach to stochastic optimization involves two stages:(a) Compute a solution to the deterministic problem for each scenario.(b) Solve a coordination model to find a single feasible policy.The problem referred to in stage (a) could be a linear, nonlinear or mixed-integer programmingproblem [Dembo, 1991]. Alternatively, it could consist of a system of equations with stochasticcoefficients or be any function dependent on stochastic parameters [Dembo, 1991]. Neverthe-less, for any assumed scenario, this problem is deterministic and can be solved using knownalgorithms [Dembo, 1991].The scenario analysis approach assumes that the number of scenarios is finite and thattheir corresponding probabilities are known. In practice this may not be the case if the scenar-ios come from a continuous or unknown distribution. If the dimension of parameters beingoptimized is large, the coordination in part (b) of the scenario analysis approach does not scalewell and may become computationally unfeasible. To relieve this computational burden aminimax approach can be applied McLean and Li [2013]. Although the optimization in part(b) above attempts to find a solution that performs well across the scenarios, a more accurateand comprehensive comparison of the performance distribution associated with each candi-date solution is desirable. An alternative coordination approach that addresses these issues ispresented in the next section.2.2 Quantile-based scenario analysisIn a fundamental sense, uncertainty associated with model output may be represented as aprobability distribution or as a specific statistical quantity, such as the expected value or me-dian with associated margins of error. By introducing notions of confidence and probability,our approach accounts for uncertainty in the decision-making process, provides more infor-mation than a single point estimate, and informs policy developers about the degree of risk20associated with particular decisions.2.2.1 Mathematical formulationOur novel stochastic optimization approach, named quantile-based scenario analysis (QSA),examines the performance of a solution xi , for a given scenario, ωi ∈Ω, across other scenariosin Ω. Specifically, we evaluate the performance of xi at a scenario ωj , i, j ∈ I , based on twocriteria: (i) its objective function value J(xi ,ωj ); and (ii) its constraint performance defined asψu(xi ,ωj ) = Cu(xi ,ωj )+, u ∈U (2.3)where U = {1,2, . . . , k} is the index set of stochastic constraints. The exponent (+) indicatesthat only positive values are to be penalized. The constraint performance functions, ψu(xi ,ωj ),express the magnitude of the deficiency of each unsatisfied constraint.Our approach uses the concept of quantile, qt = F−1(t), of a distribution, F(y). A quantile ofa distribution is a value which divides the domain of the distribution so that a given proportiont of the values lies below the quantile. In other words, F(qt) = t. For example, the median is the0.5 quantile, q0.5, representing the central value of the distribution, such that half the values areless than or equal to it (see Appendix A for more details). The proposed approach is as follows.For each scenario solution xi , i ∈ I , we define the random variable J(xi ,ω) with a cumulativedistribution function (CDF) denoted by Fxi (y), and the random variables ψu(xi ,ω), with CDFsdenoted by Gxiu(z), u = 1,2, . . . , k. The CDFs Fxi (y) and Gxiu(z) convey the overall objective andconstraint performance of the solution xi across scenarios, respectively. The inverse functions,F−1xiu(t) and G−1xiu(t), are the corresponding quantile functions. We consider coordination modelsof the form:maximizei∈I f (F−1xi (t)) (2.4a)subject to gu(G−1xiu(t)) ≤ 0, u = 1,2, . . . , k (2.4b)that are based on different functionals of the quantile functions F−1xi (t) andG−1xiu(t). For example,21we may wish to solve the following coordination model:maximizei∈I∫ 10F−1xi (t) ·φ0(t)dt (2.5a)subject to∫ 10G−1xiu(t) ·φu(t)dt ≤ γu , u = 1,2, . . . , k (2.5b)where the functions φ0(t) and φu(t) are weighting functions, which must be positive and inte-grate to unity over the range 0 to 1 and γu are appropriate tolerances. This formulation of theproblem attempts to maximize a weighted average of the quantiles of the objective function,subject to satisfying a weighted average of the quantiles of the constraint performance func-tions, also called a risk spectrum [Stoyanov, 2010]. In practice, the CDFs Fxi (y) and Gxiu(z)are unknown and may be estimated using Monte Carlo simulation. Assuming that we are ableto sample from the probability space, (Ω,p), of scenarios, then Fxi (y) and Gxiu(z) may be esti-mated using the algorithm outlined in Algorithm 1. A visual representation of the procedureused to obtain the distribution of the objective function, Fxi (y), is shown in Figure 2.1. A similarprocedure is applied to obtain the distribution of the constraints, Gxiu(z). Once the determin-istic solutions to the sampled scenario problems have been calculated, the QSA method can beused to obtain solutions with distinct probabilistic characteristics without having to performany additional optimization.ω𝟏ω𝟐ω𝒏⁞𝒙𝟏𝒙𝟐𝒙𝒏⁞𝐽(𝒙𝟏, ω𝟏)𝐽(𝒙𝟐, ω𝟏)𝐽(𝒙𝒏, ω𝟏)𝐽(𝒙𝟏, ω𝟐)𝐽(𝒙𝟐, ω𝟐)𝐽(𝒙𝒏, ω𝟐)𝐽(𝒙𝟏, ω𝒏)𝐽(𝒙𝟐, ω𝒏)𝐽(𝒙𝒏, ω𝒏)⁞⁞⁞⁞ ⁞Scenarios Solutions Objective Function Values𝑭𝒙𝟏(𝒚)𝑭𝒙𝟐(𝒚)𝑭𝒙𝒏(𝒚)⁞Distributions⁞Figure 2.1: Diagram of the quantile-based scenario analysis (QSA) approach. The QSA ap-proach identifies solutions based on the quantiles of their objective distribution, F, and analo-gous constraint distribution, G.22Algorithm 1: The quantile-based scenario analysis (QSA) algorithm.1 for i in 1 to n do2 // sample a scenario from Ω3 ωi ∈Ω4 // solve the deterministic subproblem5 xi = argmaxxJ(x,ωi)6 subject to7 Cu(x,ωi) ≤ 0, and Dl(x,ωi) ≤ 0, u ∈U, l ∈ L8 end9 // evaluate each scenario solution across all sampled scenarios10 for i in 1 to n do11 for j in 1 to n do12 hij = J(xi ,ωj)13 for u in 1 to k do14 ciju = ψu(xi ,ωj)15 end16 end17 end18 // estimate the objective and constraint CDFs for each solution19 for i in 1 to n do20 Fxi (y) =1nn∑j=1I(hij ≤ y)21 for u in 1 to k do22 Gxiu(z) =1nn∑j=1I(ciju ≤ z)23 end24 end25 // obtain the QSA solution, for example26 x∗ = argmaxi∈I∫ 10 F−1xi (t) ·φ0(t)dt27 subject to28∫ 10 G−1xiu(t) ·φu(t)dt ≤ γu , u ∈U23An advantage of the QSA approach is its scalability – in other words its ability to handleproblems of large dimension (i.e. large number of decision variables). The computational com-plexity of QSA is O(mξ +m2log(m)), where m is the number of sampled scenarios required toachieve convergence and ξ is the computational complexity for solving a single deterministicversion of the original stochastic optimization problem. As m does not depend on the dimen-sion of the problem, the computational time needed by QSA is essentially the same as would berequired to solve a single deterministic version of the original stochastic optimization problem.Thus, QSA scales much better than scenario analysis as the dimension of the problem increases.On the other hand, the coordination model of scenario analysis, shown in Equations (2.2a) and(2.2b), results in a quadratic optimization problem of the same dimension as the original one.As a result, the coordination step required by scenario analysis is a computational bottleneck.2.2.2 Stopping ruleA drawback of most heuristics is the absence of an effective stopping criteria [Ribeiro et al.,2011]. Most methods stop after a given number of iterations or a maximum number of consec-utive iterations is performed without improvement in the best known solution. The stabilityof the QSA approach in generating a robust solution is determined by the number of scenariossampled. The number of samples required to achieve stability greatly depends on the numberof random parameters included in the stochastic optimization problem and the sensitivity ofthe solution in response to changes in the values of the random parameters. As these character-istics solely depend on the nature of the stochastic optimization problem, we provide a generalstatistical test (or stopping criteria) that can be applied to determine whether a QSA solutionhas converged to a stable solution or if more scenarios need to be sampled. After sufficientsampling of the scenario distribution the QSA solution warranties near optimal performancesubject to satisfying the stochastic constraints to a specified probability. In this section, wepresent a probabilistic stopping rule for the QSA algorithm when solving problems of the formspecified in Equations (2.4a) and (2.4b).24Let xˆ be the QSA solution obtained using a set, Ω1 ⊂ Ω, of n1 realized scenarios. Let ρ bethe probability that a randomly chosen scenario from the set Ω, whose deterministic solution,x˜, satisfies the constraints in Equation (2.4b) and has a better performance over the set Ω1than that of xˆ. We propose a stopping rule based on statistical evidence that ρ < ρ0, for somespecified value ρ0. We sample an additional set of n2  n1 independently and identicallydistributed scenarios, Ω2 ⊂Ω. Let 0 ≤ N ≤ n2 be the number of sampled scenarios in Ω2 withsolutions that have a better performance over the set Ω1 than xˆ. Suppose that the observedvalue of N is r. Our stopping rule tests the hypothesis H0 : ρ ≥ ρ0 vs. H1 : ρ < ρ0 as follows.We reject H0 at level α and consider that the sample has sufficiently covered the search space,if PH0(N ≤ r) ≤ α, for ρ ≥ ρ0. Under, H0 : ρ ≥ ρ0, N follows the Binomial distribution, Bin(n2,ρ)with ρ ≥ ρ0. It can be shown [Klenke et al., 2010] thatmaxρ≥ρ0P (N ≤ r) = maxρ≥ρ0P (Bin(n2,ρ) ≤ r)= P (Bin(n2,ρ0) ≤ r)=r∑i=0(n2i)ρi0(1− ρ0)n2−i(2.6)and we reject H0 : ρ ≥ ρ0 in favor of H1 : ρ < ρ0 at level α if the RHS of Equation (2.6) is lessthan α, where α is a specified small probability, such as 0.05 corresponding to a 95% confidencelevel.2.2.3 Bayesian networksIn practice, the set of scenarios, denoted by Ω, may be infinite and, as such, cannot be enu-merated. Here, we describe a statistical technique that can be used to model the scenariodistribution, when data is available.The modeling of complex interactions has become extremely important in many researchareas, such as engineering, decision support systems and systems biology. A convenient wayof modeling complex interactions is by using graphs or networks, which correspond to theconditional independence structures in an underlying statistical model. A main class of modelsin this regard are Bayesian network (BN).25A BN is a succession of nodes, which represent uncertain state variables, linked by ar-rows that describe causal or evidential relationships. BNs are interpretable and fairly flexiblemodels for representing probabilistic relationships among interacting variables. The BN de-scribes a system of probabilistic events as nodes in a directed acyclic graph (DAG), in which thelikelihood of an event may be calculated from the likelihoods of its predecessors in the DAG[Shmulevich and Dougherty, 2010].Suppose we have a collection of variables, z1, z2, . . . , zp, which may be discrete or continuous.We represent these visually as a collection of nodes in a graph with arrows connecting some ofthese nodes. As this is an acyclic graph, it is not possible to start at some node zi and follow asequence of edges that eventually loops back to zi again. If an arrow runs from zi to zj , thenzi is a parent of zj . This is an anti-symmetric relationship, thus zj cannot be a parent of zi .We write the set of parents of node zj as pa(zj ). A node zi is a descendant of zj if and only ifthere exists a directed path from zj to zi . The edges of the graph represent the assertion thata variable is conditionally independent of its non-descendants in the graph, given its parents.The underlying assumption of the conditional independence encoded in the graph allows usto calculate the joint probability density asf (z1, z2, . . . , zp) =p∏i=1f (zi |pa(zi)). (2.7)To fully specify the set of conditional density functions, we must (1) select parametric fami-lies for each f (zi |pa(zi)), and (2) determine values for all parameters of each conditional densityfunction. This is typically achieved using a combination of statistical data, literature, and ex-pert knowledge. An application of BNs for scenario modeling and sampling is provided inSection 3.2.3 Concluding remarksThe presented methodology was developed to overcome the limitations of existing techniquesfor optimization of large scale, constrained systems, such as bioenergy supply chains, whichare subject to various sources of uncertainty. We define a robust solution as one that is able to26achieve a required level of performance (in terms of optimizing the objective and satisfying theconstraints) in a majority of plausible scenarios (i.e. realizations of the system). Uncertaintyis an important issue in the design and management of bioenergy supply chains because, forinstance, feedstock availability, procurement cost, and quality are all subject to random varia-tions. Any modeling attempt that does not consider uncertainty and variations in the bioenergysupply chain are likely to fail in identifying a robust solution.27Chapter 3Proof of Concept of the QSA ApproachThe following study was designed to evaluate the effectiveness of the QSA approach to solvingstandard constrained stochastic optimization problems. For this purpose, we chose to examinethe performance of the QSA method to solving the well-known economics and finance problemof portfolio selection (optimization). Using historical market data, we apply the QSA approachto examine: (i) whether it can efficiently identify solutions in terms of maximizing profit andminimizing variance; and (ii) whether our approach is able to outperform classical methods interms of precision and accuracy.3.1 Problem descriptionThe problem of optimizing a portfolio, which consists of a finite number of assets, denoted hereby r, is a classical problem of decision making under uncertainty in theoretical and computa-tional finance [Kolm et al., 2014]. In this study, we use the Markowitz [1952] mean-varianceformulation, which studies how risk-averse investors can construct optimal portfolios takinginto consideration the trade-off between market volatility and the expected return on asset.The return on asset is defined asη =v1 − v0v0,where v0 and v1 are the initial and final values of an asset over a given period. The origi-nal Markowitz model assumes that the expected returns of the assets µ ∈ Rr and their cor-responding variance-covariance matrix Σ ∈ Rr×r are known. A popular formulation of themean-variance portfolio selection problem involves the construction of a portfolio that maxi-mizes the return subject to a prescribed level of risk, κ. The mean-variance portfolio selection28problem can be formulated as follows:maximizexµT xsubject to 12xTΣx ≤ κr∑k=1xk = 1,xk ≥ 0,(3.1)where the decision variables xk , k = 1, . . . , r represent the proportion of capital invested in assetk. The Markowitz model, shown in Equation (3.1), uses the variance of the return, xTΣx, asthe measure of the risk, which is easy to compute and reduces the portfolio selection problemto a parametric quadratic programming problem. In practice, one would like to have a betterunderstanding of the return-risk trade-off since we want to both maximize return while mini-mizing risk. An alternative strategy is to try to balance these two objectives in a single objectivefunction. One way to do this is to solvemaximizexµT x −λ · 12xTΣxr∑k=1xk = 1,xk ≥ 0,(3.2)for a given value of the risk aversion parameter, λ ≥ 0. Note, that if x˜ solves Equation (3.2) fora given value of λ, then it also solves Equation (3.1) for κ = 12 x˜TΣx˜. In the last decade, mucheffort has been devoted to extending the practical usefulness of Markowitz work and makinghis modern portfolio theory more practical [Bonami and Lejeune, 2009]. A limitation of theMarkowidtz mean-variance approach is that it assumes that the model parameters, namely µand Σ , are known exactly. Unfortunately, in practice we never have this information. Instead,we only have estimates of the model parameters for the uncertain future. In this application,we evaluate the effectiveness of the QSA approach in improving realized portfolio return in thepresence of parameter estimation error.293.2 Study data and scenario simulationAn example from the financial market is used to demonstrate the applicability of BNs in mod-eling and generating scenarios for the QSA approach. We obtained real world data on 28 assetsfor which 30 years of annual return data from 1985−2014 were available. The data was down-loaded in R [R Core Team, 2016a], using the quantmod package [Ryan, 2011]. Stocks werechosen from six sectors (Capital Goods, Energy, Health Care, Technology, and Transportation).RVBD DSKY ZIXI KELYAPNTR ELRC CLMT MMPPLUGLREROYTWRES LPG SINOSFL QLTYGMROLLSTRTECSWH WGBSNVSESPRLHCG PETSMCURTFXlllllCapital GoodsEnergyHealth CareTechnologyTransportationFigure 3.1: Bayesian network (BN) structure of the annual return on assets.The max-min hill climbing (MMHC) Bayesian network structure learning algorithm, whichis implemented in the mmhc function of the bnlearn package [Scutari, 2009] in R, was used tolearn the relationship structure between the 28 assets included in this study. The MMHCalgorithm reconstructs the skeleton of a BN and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges, which in this case represent the conditional dependencies30in the annual return on assets. The learned network structure is shown in Figure 3.1. The pa-rameters of the BN were fit conditional on it’s structure and the observed data using maximumlikelihood estimation. The learned network may be sampled to obtain scenarios for the QSAapproach that are representative of the financial market data.3.3 QSA formulation of the portfolio optimization problemGiven a portfolio consisting of r assets, the QSA formulation of the Markowitz mean-varianceportfolio selection problem is given by:maximizei∈I∫ 10F−1xi (t)φ0(t)dtsubject to∫ 10G−1xi (t)φ1(t)dt ≤ δ,r∑s=1xs = 1,xs ≥ 0,xs ≤ τ.(3.3)where xi = (xi1,xi2, . . . ,xir ) is the solution for the ith scenario. Recall, that I is an index setwhose elements label (or index) specific members of the setΩ of scenarios. Each scenario has acorresponding solution, xi , that is obtained by solving the problem specified in Equation (3.1)for a specific realization, ωi ≡ (µi ,Σi) ∈ Ω, of the random model parameters, µ and Σ . For agiven scenario, ωi ∈ Ω, the parameters µi and Σi are the expected returns and the variance-covariance matrix of the r assets, respectively. The functions φ0(t) and φ1(t) are weightingfunctions, which must be positive and integrate to unity over the range 0 to 1. This formula-tion of the problem attempts to maximize a weighted average of the quantiles of the objectivefunction, subject to satisfying a weighted average of the quantiles of the constraint performancefunctions, also called a risk spectrum [Stoyanov, 2010]. Here, we have only one constraint per-formance function, thus m = 1 in Equation (2.5). Therefore, for this problem we may drop thesubscript k in describing the CDF of the constraint performance of a solution, xi ∈ I .31For each solution, xi , i ∈ I , the functions Fxi and Gxi are estimated using Algorithm 1. Forthis problem, the objective and constraint performance functions are given byJ(xi ,ωj ) = µTj xiandψ(xi ,ωj ) = 12xTi Σjxi −κjκj+ ,respectively. Here, κj =12xTj Σjxj is the minimum achievable variance for scenario j, whensolving Equation (3.2) with a given value of λ, typically between 0 and 1. Thus, ψ(xi ,ωj )corresponds to the relative difference in risk of solution xi to that of the optimal solution ofscenario j. The exponent (+) indicates that only positive values of the relative difference areconsidered. Thus, we penalize solutions that have a variance greater than κj in each scenarioωj ∈Ω, j ∈ I .3.4 ResultsTo assess the performance of the QSA approach in solving the portfolio optimization problem,a total of n = 1000 scenarios were sampled from the BN model described in Section 3.2, whichwas fit to the return on asset data. Thus, each sampled scenario has the same number ofobservations (30), and assets (28), as the return on asset data.The Markowitz mean-variance solution was obtained by solving Equation (3.2) with a riskaversion parameter λ = 0.25, and an idiosyncratic risk of τ = 0.15. The expected returns ofthe assets, µ, and the variance-covariance matrix of the expected returns, Σ , were estimatedfrom the return on asset data. The average annual return of each asset over the 30 year periodwas used to estimate the vector of expected returns of the assets, µ. A shrinkage estimator[Schaefer et al., 2010] of the variance-covariance matrix was used to estimate the variance-covariance matrix of the expected returns, Σ . The Markowitz solution for each scenario wereobtained using the quadprog package in R [Weingessel, 2013].The solution obtained using Dembo’s scenario analysis (SA) coordination model solves the32φ0(t) φ1(t)0.51.01.52.00.00 0.25 0.50 0.75 1.00tValueFigure 3.2: Quantile weighting functions provided to the quantile-based scenario analysis(QSA) approach. Notice that for a given solution, φ0 assigns more weight to the lower quantilesof the return performance across scenarios. In contrast, φ1 assigns more weight to the upperquantiles of the constraint performance across scenarios.empirical version of Equation (2.2), where each scenario is given equal weight as they cor-respond to an independent sample from the BN shown in Figure 3.1. Thus, p(ωi) = 1/n fori = 1,2, ...,n. Moreover, J(x,ωi) = xTµi is the return obtained at x and vi = xTi µi is the valueachieved by the Markowitz solution, xi , for scenarioωi . Finally, C(x,ωi) = (x−xi)TΣi(x−xi) andCd(x) correspond to the random and deterministic constraints in Equation 3.2, respectively.The minimization of Dembo’s SA coordination model was solved using the nlminb package inR [R Core Team, 2016b].The QSA solution was obtained by solving Equation (3.3) with the constraint parameter δ =median{κ1,κ2, . . . ,κj}and the weighting functions φ0(x) = 2e−2x(1−e−2)−1 and φ1(x) = 2e2x(e2−1)−1 depicted in Figure 3.2. The parameter δ in Equation (3.3) is the tolerance level of theportfolio risk, whose value was chosen to restrict attention to solutions with a risk spectrumthat is below 50% of the variance achieved by the Markowitz solution of each scenario.The chosen shape of the weighting functions, φ0(x) and φ1(x), assign more weight to thelower quantiles of the expected return, Fxi , and more weight to the upper quantiles of the con-straint performance, Gxi , distributions, respectively. Therefore, we heavily penalize solutions330.000.250.500.751.0010 20 30Return (%)CDFMethodMarkowidtz QSASAFigure 3.3: Cumulative distribution function (CDF) of the returns obtained by each method(Markowitz, QSA, and SA). The quantile-based scenario analysis (QSA) solution consistentlyobtains higher returns than both the Markowitz and scenario analysis (SA) solutions across1000 sampled scenarios.that exhibit extremely poor performance or incur high risk variability in some scenarios. Anidiosyncratic risk of τ = 0.15 was also used to obtain the QSA solution. A computation time of14.28 and 97.96 seconds was required to solve this problem running on a 3.6 GHz Intel Xeonprocessor for the QSA and SA methods, respectively. More importantly, the computation timeof the SA method was found to grow linearly with the number of assets, r, considered in theproblem whereas the QSA method grows logarithmically. To assess if the solution converged,we sampled an additional 200 scenarios and used the stopping rule of Equation (2.6) withρ0 = 0.05. All of the 200 additionally sampled scenarios yielded solutions with lower perfor-mance compared to that of the QSA solution, resulting in a p-value of 3.51×10−5. Therefore, wehave strong evidence that the solution converged and reject the null hypothesis, H0 : ρ ≥ 0.05in favor ofH1 : ρ < 0.05. In other words, we are confident that the QSA solution identified from34the one thousand sampled scenarios is among the top 5% of possible scenario solutions for thisproblem. The CDF of the returns attained for each method (Markowitz, QSA, and SA) acrossthe 1000 sampled scenarios are shown in Figures (3.3).010203040Markowitz QSA   SAMethodReturn (%)Figure 3.4: Boxplots, showing the median and interquartile range, of the out-of-sample per-formance of the solutions obtained by each method (SA, Markowitz and QSA). The black dotsrepresent outlier points.To test the out-of-sample performance of each method, we evaluated them on an addi-tional 1000 sampled scenarios, which were not used earlier in obtaining the solutions. Figure(3.4) shows the box-plot of the returns obtained by each method across the sampled scenarios.It is not surprising that the portfolios selected by the QSA and SA methods are able to out-perform that of Markowitz, since both of these methods account for variability in the modelparameters. However, the QSA method clearly outperforms the SA method, and is able to con-sistently achieve higher returns out-of-sample. A two-sample t-test was performed to comparethe means of the out-of-sample return distributions for the QSA and SA solutions. The results,35Table 3.1: Two-sample t-test of equal mean returns between the QSA and SA solutions.QSA SAMean 25.7 23.9Variance 0.06 0.06Observations 1000 1000P-value 6.11× 10−11listed in Table 3.1, indicate that the difference between the expected return of the QSA solution(23.0%) is statistically significantly higher than that of the SA solution (21.2%). A 95% confi-dence interval for the difference in expected returns between the QSA and SA solutions is 1.4%to 2.2%.3.5 DiscussionStandard approaches for optimization under uncertainty define the best solution as the onewith the best expected value of the performance metric, subject to satisfying the expected valueof a set of constraints. However, when other statistics or attributes of the distribution of theobjective function are more appropriate for comparing solutions (e.g. upper or lower quantiles,inter quartile range, overall dispersion, etc.), then the expected value based approaches are notable to accommodate the selection needs.The QSA approach considers the central tendency and the shape of the objective and con-straint distributions. Moreover, the QSA approach inherits the robust properties of quantilebased estimators, namely that the solution is able to handle variability in the data (or system)and remain effective. In addition, our approach is scalable to large complex stochastic prob-lems. All that is required is that the deterministic scenario subproblems be solvable usingexisting optimization techniques.We applied the QSA approach to the well known constrained stochastic portfolio optimiza-tion problem. It is important to identify solutions that will, with high probability, maximizea portfolio’s expected return contingent on any given amount of risk, with risk measured bythe standard deviation of the portfolio’s rate of return. We compared the performance of the36QSA approach to that of the classical Markowitz mean-variance approach and to the scenarioanalysis (SA) approach for stochastic optimization. Computational results show that quantile-based scenario analysis (QSA) is the best performing approach in terms of the percentage ofscenarios solved optimally that satisfy the stochastic constraints.37Chapter 4Forest Harvest Residue Collection4.1 Problem descriptionTo investigate the performance of the quantile-based scenario analysis (QSA) approach for en-ergy supply chain design and optimization under uncertainty, we analyze a supply chain net-work design problem for supplying forest biomass feedstock to three competing power plantswithin a study region. We assume that the power plant locations are given, and that they uti-lize only one type of feedstock, namely forest harvest residue (FHR), which includes tops andbranches and unmerchantable wood left after stand harvesting. Each power plant is associ-ated with an annual energy demand expressed in terms of one million British thermal units(MMBTU).The area of the study region is 400×400 km and is divided into 625 forest cells representing16 km2 each. The grid cells have an annual technical availability of FHR that is computedas a product of the harvesting factor and the theoretical availability of FHR at each cell. Theharvesting factor is a value between 0 and 1, which evaluates the harvesting capability of a gridcell. The FHR available at each grid cell is also associated with a wet basis moisture content(MC). The wet basis moisture content is used to describe the water content of biomass and isdefined as the percentage equivalent of the ratio of the weight of water to the total weight ofthe biomass. When it comes to biomass, the higher the moisture content the lower the netenergy content. The energy content of a wet ton of FHR was calculated using Equation (4.1),which expresses the relationship between the lower heating value (LHV), Q, and the wet basismoisture content, η, as an average for wood in terms of British thermal units (BTUs) per ton[Roise et al., 2013].Q(η) = (8660− 97.12η)× 2240 (4.1)38The constants 8660 and 97.12 in Equation (4.1) are in units of BTU per lb and represent theestimated average higher heating value of wood (across several species), and the effect of themoisture content on the lower heating value of wood, respectively. The multiplication factor2240 is the conversion factor from lb to long ton (one long ton is equal to 1016.05 kg).The cost of procurement of a green ton (GT) of biomass was calculated based on process-ing and transportation costs. The processing cost includes harvesting, grinding, chipping andpiling of FHR. The transportation cost from a forest cell to a power plant was assumed propor-tional to the distance between them. The vehicles considered for transporting forest biomassfeedstock are tractor-trailers with a 53-foot semitrailer and a payload of 40.55 tons. A costof $5.24 per green ton was included to account for loading and unloading operations. Thecharge-out rate for a biomass truck with operator was fixed at $85 per hour (h). An averagedriving speed of 50 km/h was assumed. With this information, the cost of transporting forestbiomass from each road cell of the study region to the three power plants was established. The01002003004000 100 200 300 400width (km)length (km)14001600180020002200GTFigure 4.1: Forest harvest residues (FHR) availability in units of green ton (GT) per forest cell.Black points represent the power plant locations.3901002003004000 100 200 300 400width (km)length (km)0.400.440.480.520.56MCFigure 4.2: Forest harvest residues (FHR) wet basis moisture content (MC) per forest cell. Blackpoints represent the power plant locations.remaining parameters are subject to random variability: biomass availability per forest cell;harvesting factor per forest cell; processing cost per forest cell; average FHR moisture contentper forest cell; energy demand at each power plant.The random parameters are assumed to be independent and normally distributed withmean and standard deviations as given in Table 4.1. To account for the spatial correlationamong the forest cells, a Gaussian random field with an exponential correlation function wasused to model the spatial dependence in FHR availability and the moisture content. Figures 4.1and 4.2 show the typical spatial distribution of FHR availability and the moisture content andthe location of the three biomass plants. Realizations of the values of the random parametersrepresent specific scenarios.404.2 ModelFor a single scenario, the parameters, indices and variables used for developing the optimiza-tion model are listed in Table 4.1, with random parameters being assigned normal distribu-tions. To use a more compact notation, let x = (x11,x12, . . . ,xmb) represent the vector of decisionvariables and ω be the vector of model parameters. The mathematical optimization model isgiven by:minimizexJ(x;ω) =∑m∈M∑b∈Bxmb ×(βb +λ+2dmb × τν × )(4.2a)subject to∑b∈BQ(ηb)× xmb ≥ δm ∀m ∈M (4.2b)∑m∈Mxmb ≤ θb ×αb ∀b ∈ B (4.2c)xmb ≥ 0 ∀(m,b) ∈M ×B. (4.2d)The objective function, which identifies the FHR catchment area of each power plant that min-imizes the overall procurement cost, is specified in Equation (4.2a). The constraints used in themodel are specified in Equations (4.2b) to (4.2d). Equation (4.2b) enforces that the demand bemet for each power plant. Equation (4.2c) ensures that the amount of FHR procured from aforest cell does not exceed the amount available at that forest cell. Equation (4.2d) states thatthe decision variables are nonnegative.The R package “Rglpk”, which is an R interface to the GNU linear programming kit (GLPK),was used to solve the resulting deterministic linear optimization scenario subproblems [Theussland Hornik, 2016]. GLPK is open source software for solving large-scale linear programming,mixed-integer linear programming (MILP), and other similar problems.41Table 4.1: Optimization model notation.Symbol Description Units Value/DistributionSetsM: set of power plants – {1,2,3}B: set of forest cells – {1,2, . . . ,625}Indicesm: power plant – m ∈Mb: forest cells – b ∈ BParametersλ : load/unload operation cost $ · GT−1 5.24αb: harvesting factor for the bth forestcell% N (61,5)βb: processing cost at roadside for thebth forest cell$ ·GT−1 N (26,3): average driving speed of a truck km ·hr−1 50ν : truck capacity GT 40.55δm: annual energy demand of eachpower plantMMBTU N (850000,16000)N (1100000,20000)N (1300000,24000)θb: availability of FHR in the bth forestcellGT N (4900,280)ηb: wet basis moisture content of FHRin the bth forest cell% N (50,3)dmb: distance from the bth forest cell tothe mth power plantkm Euclidean distanceτ : charge out rate for a truck driver $ ·hr−1 85Variablesxmb : annual amount of FHR to harvestfrom the bth forest cell for the mthpower plantGT –42In accordance with the QSA coordination model described in Equations (2.5a) and (2.5b),we seek a solution that minimizes the overall FHR procurement cost, subject to satisfying atleast (1− γ1)× 100% of the energy demand. We give an exponential increase in importance tothe upper quantiles of the stochastic cost function and define the objective weight function,φ0, in Equation (2.5a) asφ0(t) =et(et−1) if 0 ≤ t ≤ 10 othewise.(4.3)This resembles a minimax optimization, but instead of minimizing the worst-scenario perfor-mance we minimize a weighted average of the performances across all scenarios and assignlarger penalties to less desirable performances. We define the constraint weight function, φ1,as a Dirac delta function:φ1(t) =∞ if t = q10 othewise,(4.4)where the parameter 0 < q1 < 1 represents the probability of satisfying the demand constraintof Equation (4.2b). For this problem, the QSA constraint performance function used in evalu-ating the ith solution in the jth scenario is defined as follows:ψ1(xi ,ωj ) = C1(xi ,ωj)+=133∑m=1(δmi − ymijδmj)+, (4.5)where δmi is the energy demand of the mth power plant in the jth scenario and ymij is theenergy produced by the mth power plant when applying the ith solution to the jth scenario.As mentioned earlier, the exponent (+) indicates that only positive values of the parentheticalexpression are considered. In other words, we do not penalize for exceedance of the demand.Equation (4.5) calculates the overall percentage of unfilled demand across the three powerplants for a given solution and scenario.The FHR availability at each forest cell depends on the scenario. Thus, the solution ob-tained for a given scenario may not satisfy the FHR availability constraint of Equation (4.2d)at each forest cell when applied to a different scenario. When evaluating solutions across sce-43narios, in the event that a solution requests more FHR than is available at a given forest cell,then the available FHR at that forest cell is divided among the power plants in proportion totheir requested amounts. As a result, we do not need to explicitly consider the distribution ofthe availability constraint as it is accounted for when evaluating the objective function of thescenario solutions.Notice that a larger demand will typically incur a greater procurement cost. Hence, itwould be erroneous to compare solutions in terms of the total procurement cost, because thedemand is not fixed between scenarios. A larger demand will typically incur a greater procure-ment cost. If FHR is to be used as a fuel, its value lies in its energy content and not in its weight.In order to make valid comparisons between scenario solutions, the procurement cost shouldbe made relative to the amount of energy produced. To this end, the total procurement costof each solution and scenario combination is divided by the corresponding amount of energyobtained from the procured FHR. Thus, for the proper comparison of solutions evaluated atdifferent scenarios we replace Equation (4.2a) withJ˜(x;ω) =J(x;ω)∑m∈M∑b∈BQ(ηb)× xmb , (4.6)which is in units of dollar per BTU.4.3 ResultsIn this section, we evaluate the performance of the QSA approach for solving the stochasticoptimization problem described in Section 4.1. We examine: (i) its ability to identify optimalsolutions; (ii) its precision and accuracy; and (iii) its computational complexity and scalability.To this effect, we compare the solution obtained by the QSA approach to that obtained byscenario analysis (SA) [Dembo, 1991] and chance constraints (CC) [Dentcheva, 2007].The QSA approach for this problem minimizes a weighted average of the quantiles of theprocurement cost (across scenarios) using the weight function in Equation (4.3). We restrict at-tention to solutions that have a 95% probability of satisfying at least 90% of the demand acrossscenarios. Thus, γ1 = 0.10 and q1 = 0.05 in Equations (2.5b) and (4.4), respectively. In con-44trast, the SA approach minimizes the coordination model given in Equations (2.2a) and (2.2b),where J(xi ,ωi) = J˜(xi ,wj ), Cu(x,ωi) = C1(xi ,ωj), andDl(x) is the deterministic constraint givenin Equation (4.2d). Minimization of the SA coordination model was done using the spectralprojected gradient method for large-scale optimization with simple constraints implementedin the “BB” package in R [Varadhan and Gilbert, 2009].We found that even for this modest stochastic optimization problem, the computationaltime required by the SA method rendered it impractical when the number of scenarios goesbeyond a few hundred. In view of the computing time requirements of the SA method, a totalof n = 500 scenarios were sampled from their underlying Gaussian distribution. A computa-tion time of 119.7 and 1.3 minutes was required by the SA and QSA methods, respectively. Allcomputations were performed on a 3.6 GHz Intel Xeon processor. This tremendous savings incomputational time is an advantage of the QSA approach. To assess if the QSA solution con-verged, we sampled an additional 200 scenarios and used the stopping rule of Equation (2.6)with ρ0 = 0.05. All of the 200 additionally sampled scenarios yielded solutions with lowerperformance compared to that of the QSA solution, resulting in a p-value of 3.51×10−5. There-fore, we have strong evidence that the QSA solution converged and reject the null hypothesis,H0 : ρ ≥ 0.05 in favor of H1 : ρ < 0.05. In other words, we are confident that the QSA solutionidentified from the 500 sampled scenarios is among the top 5% of possible scenario solutionsfor this problem.We also compare the QSA solution with that provided by the rigorous CC approach de-scribed in Dentcheva [2007] whose mathematical formulation is given by:minimizexE∑m∈M∑b∈Bxmb ×(βb +λ+2dmb × τν × ) (4.7a)subject to P∑b∈BQ(ηb)× xmb ≥ δm ×γ1 ≥ 1− q1 ∀m ∈M (4.7b)P∑m∈Mxmb ≤ θb ×αb ≥ 1− q1 ∀b ∈ B (4.7c)xmb ≥ 0 ∀(m,b) ∈M ×B. (4.7d)45Assuming normal distributions for all the random parameters, equations (4.7a) to (4.7d) re-duce to a quadratically constrained linear optimization problem. The fmincon solver in matlab[MATLAB, 2015] was used to solve this optimization problem. The CC approach required over4 hrs of computing time to achieve convergence.The optimal biomass catchment areas identified by the QSA, SA and CC solutions are shownin Figures 4.3a, 4.3b and 4.3c, respectively. The catchment area for the QSA solution is muchmore compact than that of both the SA and CC solutions. The SA and CC solutions attempt toperform well on average. The SA solution tends to over collect biomass in order to compensatefor extreme scenarios with very poor FHR availability. This phenomenon is illustrated veryclearly in Figure 4.3b. Poor yields resulted in several scenarios across the middle belt of thestudy region, which is why the SA solution tries to skip over this area. The CC solution collectsless FHR from this area as well, but it is not as heavily influenced by the outlier scenariosas the SA solution in terms of forest cell selection. However, the outlier scenarios do have anoticeable impact on the chosen amounts to procure from each forest cell by the CC solution.The CC method over estimates the amount of biomass available in the middle belt region ofthe study area, resulting in a substantial overestimation of the probability that it satisfies thedemand. This is because the CC method is highly sensitive to departures from the assumedprobability distributions of the random parameters (e.g. that the biomass availability at eachforest cell is Normally distributed). In contrast, the outlier scenarios have little influence onthe QSA solution, since it optimizes the empirical quantiles of the objective and constraintdistributions, which are much less affected by outliers.The CDF of the procurement cost obtained by the QSA, SA and CC solutions are shown inFigure 4.4a. The x-axis of Figure 4.4a represents the cost of procurement in dollar per MMBTUand the y-axis gives the probability that the cost will be less than or equal to the quantity onthe x-axis. From Figure 4.4a, we see that the QSA solution consistently outperforms the CC andSA solutions in terms of minimizing the cost in each scenario. The SA and CC solutions havesimilar objective function distributions, which are almost indistinguishable in Figure 4.4a.The CDF of the constraint on the unfilled demand is shown in Figure 4.4b. The x-axis ofFigure 4.4b corresponds to the overall percentage of unfilled demand, which was calculated4601002003004000 100 200 300 400width (km)length (km)Plant 12000150010005000Plant 22000150010005000Plant 32000150010005000(a) QSA catchment area (GT).01002003004000 100 200 300 400width (km)length (km)Plant 12000150010005000Plant 22000150010005000Plant 32000150010005000(b) SA catchment area (GT).01002003004000 100 200 300 400width (km)length (km)Plant 110005000Plant 210005000Plant 310005000(c) CC catchment area (GT).Figure 4.3: Catchment areas identified by the QSA, SA, and CC approaches.470.000.250.500.751.006 7 8 9Cost per GJ ($)CDFmethodQSASACC(a) CDF of the cost ($ ·GJ−1).0.000.250.500.751.000 10 20 30Unfilled Demand (%)CDFmethodQSASACC(b) CDF of the percent of unfilled demand.Figure 4.4: Cumulative distribution function (CDF) of the QSA, SA, and CC solutions. (a) CDFof the cost in dollars per GJ. (b) CDF of the unfilled demand as a percentage.using Equation (4.5). The y-axis of Figure 4.4b describes the probability that the overall per-centage of unfilled demand will be less than or equal to the quantity on the x-axis. The 0.90quantile of the unfilled demand for the QSA method is 4.76%, which satisfies the requirementthat it be less than 5.0%. The SA solution was found to collect 22.7% more FHR than the QSAsolution to meet the same energy demand. Whereas the CC solution was found to collect 13%less FHR than the QSA solution. Table 4.2 compares several quantiles of the cost distributionof the QSA, SA and CC solutions. On average, the QSA solution achieves a cost reduction of11% compared to that of the SA and CC solutions.Table 4.2: Quantiles of the cost distribution of the SA, CC, and QSA solutions.Quantile SA CC QSA Relative Improvement of QSA0.10 $7.11 $7.10 $6.28 11.6 %0.25 $7.30 $7.31 $6.47 11.4 %0.50 $7.53 $7.54 $6.71 10.9 %0.75 $7.77 $7.78 $6.91 11.1 %0.90 $8.01 $8.04 $7.19 10.4 %484.4 DiscussionWe applied the quantile-based scenario analysis (QSA) approach to address the problem of an-alyzing a system of biomass supply chains, which are competing for the same feedstock, subjectto stochastic demand and supply in addition to numerous constraints. Here, it is important toidentify solutions that will, with high probability, meet the energy demand and satisfy theproblem constraints, for the vast majority of scenarios. The energy demand and feedstocksupply were considered uncertain, but with known distributions. Our approach was able toprovide a solution that minimizes a weighted average of the procurement cost distribution,while satisfying at least 90% of the energy demand with a probability of 95%.We compared the performance of the QSA approach to that of the classical scenario anal-ysis (SA) approach and the more rigorous chance constraints (CC) approach to stochastic op-timization. We evaluate a solutions’ performance based on its ability to optimize the objectivefunction and satisfy the problem constraints in each sampled scenario. The empirical cumu-lative distribution of both the objective and constraint functions are calculated to facilitatethe assessment and comparison of solutions. The cumulative distribution function (CDF) of asolutions’ constraint function gives the probability that the solution will satisfy the problemconstraints to within a relative percentage α (0 < α < 100). The CDF of a solutions’ objec-tive function gives the probability that the objective function value will be at least as big asy, where y belongs to the domain of the objective function. As the criterion for ranking theperformance of a solution, we use a weighted average of the quantiles of the objective andconstraint functions, respectively. Solutions are ranked based first on their ability to satisfythe problem constraints and second on their ability to optimize the objective function value.The QSA solution was found to achieve a better objective function value across all the sampledscenarios (see Figure 4.4a) while satisfying the problem constraints to within 10% of desired,95% of the time (see Figure 4.4b) as required in the problem setup. Based on this criteria, theQSA approach performed the best among those compared.49Chapter 5Sawmill Residues Collection5.1 Problem descriptionThe deleterious impacts of climate change coupled with the ongoing urbanization of countriesaround the world have led to a global effort to reduce greenhouse gas emissions for energyproduction, especially in the transportation sector. The transport sector is responsible for ap-proximately one quarter of greenhouse gas emissions in both Europe and America, makingit the second largest emitting sector after energy [European Commission, 2009; United StatesEnvironmental Protection Agency, 2015]. A plethora of studies show that urban freight trans-port could be vastly more efficient. According to the European Commission, 24% of commer-cial trucks that operate in Europe are empty [United States Environmental Protection Agency,2015]. In the United States, commercial trucks drive an estimated 19 billion needless mileseach year [Jaffe, 2015]. Thus, significant economic and environmental savings may be achievedby reducing truck transport.This case study presents a modeling framework to optimize the collection of biomass residuesfrom sawmills in the presence of uncertainty. Recognizing that the goal of procuring thebiomass is for the production of energy, we cast the problem as an energy optimization task.Biomass feedstocks that are brought to a combined heat and power (CHP) plant for conver-sion into energy must meet certain specifications in order for the process to be safe and suc-cessful. The wet basis moisture content is used to describe the water content of biomass and isdefined as the percentage equivalent of the ratio of the weight of water to the total weight ofthe biomass. The most important property of biomass feedstocks with regard to transport andenergy conversion is its moisture content. This is for two reasons. First, the biomass feedstockmight have to undergo drying if the moisture content exceeds that accepted by the facilities50boiler, with an ensuing energy expense. Second, the unnecessary transportation of water em-bedded in the biomass feedstock increases the energy utilized for transportation. Hence, ourobjective is to schedule the collection activities and identify collection routes that maximize thetotal energy returned on energy invested (EROEI) taking into consideration the transportationand drying of the biomass feedstock. The EROEI is defined as the ratio of the amount of usableenergy acquired from a particular energy resource to the amount of energy expended to obtainthat energy resource.Many processes in the biomass supply chain may randomly vary each time they are per-formed. For example, the availability and quality of the biomass residues as well as the timerequired to load, unload and transport will vary on a case by case basis. These disturbances canbe caused by spatial and temporal variations in a seemingly unpredictable fashion that can bebest represented in the model with the use of random variables. A main source of uncertaintydriving these random variations can be attributed to weather, such as precipitation, wind, andexposure to the environment. In the context of biomass procurement for bioenergy, variabilityin the biomass quality (i.e. moisture content, calorific value, etc.) and availability can have asignificant impact on the cost of energy production and the energy conversion efficiency. Assuch, the modeling of these activities should consider both the biomass quality and availabil-ity as random parameters with probability distributions. Modeling such variations is oftendone using specified stochastic distributions with mean and variance estimated from histor-ical data. The sampled values of the random variables were obtained using random numbergenerators. All other model input parameters (i.e. number and location of sawmills, traveldistances, number of trucks, truck capacities, etc.) are considered fixed. The values assignedto the fixed parameters were established per the study criteria or derived from the literature.Our motivating case study deals with the planning of truck routes for the collection ofsawmill residues (or waste) in the Lower Mainland region of British Columbia, Canada [Zamaret al., 2016]. A symbiotic relationship exists between sawmills and pulp mills. Approximately50% of a log, by volume, gets turned into lumber at a sawmill. The waste residues from theprocess are utilized by the pulp mills to produce both pulp and excess green energy. Themajority of pulp mills rely on purchased sawmill residue chips for most, if not all, of their51chip supply. Consequently, the sale of residue chips has become an essential revenue streamfor the sawmills. The pulp mills have the necessary expertise, infrastructure and potential tobe future large scale producers of biomass based transportation fuels [Mercer InternationalGroup, 2010].The real-life sawmill residues collection problem under consideration may be describedas follows. There are a total of 25 sawmills in the region and a single depot. The locationof the sawmills and the depot are given along the streets of a defined road network. Thebiomass residues produced by the sawmills must be collected by a fleet of trucks with knowncapacities. The average daily amount of biomass residues produced by each sawmill are subjectto variability. Each truck may collect biomass residues from several sawmills before returningto the depot to unload. Truck drivers work 8 hour shifts that start at 9am and end at 5pm.The trucks leave the depot at the start of the day at 9am and are allowed to make several,potentially different, routes in a single day. A truck must return to the depot to unload aftercompleting a route. In addition, all trucks must return to the depot before the end of the day at5pm. The amount of biomass residues that should be collected on a daily basis is determinedby the energy demand from the pulp mills that are being supplied by the depot.Information regarding the mass, measured in green tons (GT), and energy content, mea-sured in gigajoules per ton (GJ · t−1), of the biomass residues available at each sawmill areunknown and highly variable. The energy content of the biomass residues depends on themoisture content, which is known to play a significant role in the optimization of biomasssupply chains [Sosa et al., 2015].The biomass residues collection problem can be viewed as a stochastic vehicle routing prob-lem (SVRP). The vehicle routing problem (VRP) is a very well known and widely studied prob-lem in combinatorial optimization. The general objective is to route the vehicles, with eachroute starting and ending at the depot, so that all customer supply demands are met and thetotal travel distance is minimized. As this is a computationally very hard problem, whichcannot be solved by exact methods, in practice heuristics are typically used for this purpose[Nuortio et al., 2006]. The SVRP arises when some of the elements of the VRP are not knownexactly, such as the travel times, product availability and customer demand.52Prevalent techniques used to solve the SVRP minimize the expected cost of the assignedroutes, including those incurred by recourse actions, while satisfying a set of chance con-straints. In most cases, simplifying assumptions are used in order to fit probabilistic mod-els that rely on mathematical theorems and lemmas [Buhrkal et al., 2012]. In our experiencethese techniques generally result in computationally demanding and unscalable models [Za-mar et al., 2017c]. Moreover, the explicit use of the expected value in solving the SVRP impliesthat the decision maker is primarily interested in the average performance of the solution andis not concerned with its variance and other features of its random behavior. To avoid thesedrawbacks, the SVRP presented in this paper is solved using the QSA approach [Zamar et al.,2017c]. The QSA approach analyzes the performance of solutions obtained from solving deter-ministic realizations of the stochastic problem and identifies the solution that optimizes chosenquantiles of the stochastic objective function, subject to satisfying conditions on given quan-tiles of the constraint distribution. An advantage of this approach is that it requires only thateach deterministic version (i.e. scenario) of the stochastic problem be solvable. The contribu-tions of the work presented in this chapter are threefold:i. It demonstrates the methodological and practical advantages of addressing the problemof procurement planning for bioenergy production in a stochastic manner in comparisonto a purely deterministic approach. Second, it is shown that the procurement of biomassfor energy production can be presented in a mathematical framework that accounts forboth the energy utilized for procurement and the energy obtained during conversion.Third, it illustrates how randomness can be directly incorporated into any existing deter-ministic VRP, often encountered in energy production, and solved as an SVRP by a directapplication of the QSA approach.ii. It shows that the procurement of biomass for energy production can be presented in amathematical framework that accounts for both the energy utilized for procurement andthe energy obtained during conversion.iii. It illustrates how randomness can be directly incorporated into any existing determin-istic VRP, often encountered in energy production, and solved as an SVRP by a directapplication of the QSA approach.53The remainder of this chapter is organized as follows. A review of the literature on SVRPstudies is discussed in Section 5.2. The sawmill residue collection model and its input require-ments are presented in Section 5.3.1. The heuristic routing and scheduling methodologies areexplained in Section 5.3.2, followed by the presentation and discussion of results in Section5.4. Main conclusions are provided in Section 5.5.5.2 Literature reviewThe deterministic VRP has received some attention in recent years in the area of energy ef-ficiency, production and management. For example, Araujo et al. [2010] studied the collec-tion of waste frying oil for the production of biodiesel in Rio de Janeiro, Brazil. Juan et al.[2016] investigated several open research problems related to the introduction of electric ve-hicles in logistics and transportation, including strategic and planning challenges associatedwith hydrogen-based electric vehicles. Shao et al. [2017] studied an electric vehicle routingproblem in Beijing to address some operational issues such as range limitation and chargingdemand. Jokinen et al. [2015] present a mathematical model that minimizes the total costsassociated with transporting liquefied natural gas. Yagcitekin and Uzunoglu [2016] present asmart charging management algorithm strategy to successfully route electric vehicles to themost suitable charging point, thereby decreasing the charging costs and prevent the overload-ing of transformers. These studies, however, do not address parameter uncertainty in theirrespective optimization problems. While stochastic optimization has been used in the contextof energy, attention has been primarily focused on energy storage and flow characteristics, seefor example the works of Moura et al. [2011] and Kou et al. [2015]. On the other hand, despiteits importance and frequent appearance in energy related problems, SVRP has been mostlyabsent from this literature.Approaches for solving the SVRP may be found throughout the literature on research anddevelopment (R&D). A recent literature review, reveals the lack of a comprehensive methodol-ogy that can be applied to the general SVRP [Berhan et al., 2014]. In most instances surveyed,the SVRP is transformed into a deterministic equivalent that can be solved using mathematical54programming solvers [Smith et al., 2010; Zhang et al., 2013]. The predominant approach forsolving the SVRP is to use “here-and-now-optimization” techniques, where decisions are madebased on information available at the time they are taken. When some of the parameters arerandom, it is no longer possible to require that all constraints be satisfied for all realizationsof the random parameters. Therefore, the decision maker may require that the constraints besatisfied with a given probability [Stewart and Golden, 1983; Laporte et al., 1989; Bastian andRinnooy Kan, 1992] or the incorporation of corrective actions be taken when a constraint isviolated [Laporte et al., 1994; Gendreau et al., 1995].Published studies that investigate different variants of the SVRP include urban publicbus transport systems [Osorio and Bierlaire, 2013], plant to customer distribution [Chep-uri and Homem-de Mello, 2005], waste collection [Pedrag and Miomir, 2011; Benjamin andBeasley, 2010; Buhrkal et al., 2012], collection and delivery of goods [Mata-Pérez and Saucedo-Martínez, 2015], in-field scheduling for machinery fleets in biomass operations [Sørensen andBochtis, 2010; Orfanou et al., 2013], and public transport in adverse weather conditions [Sumaleeet al., 2011]. However, SVRP studies that analyze the collection of biomass feedstock in an ur-ban setting or that focus on the optimization of energy production while minimizing energyutilization during the procurement and preprocessing stages remain mostly untouched areasin the R&D literature.5.3 Methods5.3.1 Optimization modelIn this section we formally define the SVRP model for this study. The problem is defined ona graph G = {V ,A}, where the set of nodes V = Vd ∪ Vs consists of a single depot Vd = {0}, msawmills, Vs = {1, . . . ,m}, and a set of arcs A = {(i, j) | i, j ∈ V ,i , j}. Let K = {1, . . . ,n} be theset of trucks with payload capacities ck , k ∈ K . Let dij and tij be the travel distance and timeassociated with arc (i, j), respectively. In addition, each node, j ∈ Vs, has an inventory ψj ofbiomass residues with an average moisture content ωj and a corresponding mass flow rate βjfor loading a truck. Define the set Rk = {1, . . . , rk}, k ∈ K , where rk denotes the number of routes55assigned to truck k. Let δjkl represent the mass of the biomass residues picked up at nodej ∈ Vs by truck k ∈ K on its l ∈ Rk assigned route. Let ϕijkl represent the sum of both thepayload and curb weight, ζk , of truck k when traveling from node i to node j on its lth assignedroute. The curb weight is simply how much the vehicle weighs on its own, without any cargoor passengers. Thus, for a given truck k and route l,ϕijkl =∑y≺jy,j∈Vsδykl + ζk , (5.1)where y ≺ j if and only if sawmill y is visited before sawmill j. Let θ be the required moisturecontent of the biomass residues for conversion into heat and power at the pulp mills. Let ηrepresent the calorific value of the biomass at a moisture content of θ, and T be the temperatureat which it is stored. Let λ represent the energy required to transport a green ton of biomassa distance of one kilometer (km). The time required to load a truck at node j ∈ Vs is given byαj + βjδjkl , where αj is the truck setup time at node j. Similarly, the time required to unloada truck at the depot is given by α0 + β0ck . The parameters tij , αj , βj , ψj and ωj , ∀(i, j) ∈ A, areunknown quantities, which are assumed to be normally distributed. Thus, tij , αj , βj , ωj andψj are stochastic variables. A summary of the model parameters is provided in Tables 5.1, 5.2and 5.3.For a specific realization of the random variables tij , αj , βj , ωj , and ψj , ∀(i, j) ∈ A, thedeterministic problem can be modeled using the following additional variables. Let uk = 1,k ∈ K , if and only if the driver of truck k has had a lunch break and uk = 0 otherwise. Let γ bethe time alloted for a lunch break. Define the set of arcs between nodes that do not terminateat the depot as As = {(i, j) | i ∈ V ,j ∈ Vs, i , j}. Let xijkl = 1 if and only if truck k ∈ K uses arc (i, j)on its lth assigned route and xijkl = 0 otherwise. The optimization model for the SVRP is givenin Equation (5.2).56max J(x,δ) =∑(i,j)∈As∑k∈K∑l∈Rkδjkl[η −(ωj − θ(1−ωj )1−θ)(η + 4.19× 10−3(100− T ) + 2.26)]xijkl · η∑(i,j)∈A∑k∈K∑l∈Rkxijkl · dij ·ϕijkl ·λ(5.2a)s.t.∑j∈V∑l∈Rkx0jkl = 1, ∀k ∈ K (5.2b)∑i∈V∑l∈Rkxi0kl = 1, ∀k ∈ K (5.2c)∑i∈V∑l∈Rkxijkl =∑i∈V∑l∈Rkxjikl , ∀j ∈ V ,k ∈ K (5.2d)∑k∈K∑l∈Rkδjkl ≤ ψj , ∀j ∈ Vs (5.2e)κ · ck ≤∑j∈Vsδjkl ≤ ck , ∀k ∈ K,l ∈ Rk (5.2f)δjkl ≥ 0, ∀j ∈ Vs, k ∈ K,l ∈ Rk (5.2g)δ0kl = 0, ∀k ∈ K,l ∈ Rk (5.2h)xijkl ∈ {0,1}, ∀(i, j) ∈ A,k ∈ K (5.2i)∑j∈Vs∑k∈K∑l∈Rkδjkl[1−ωj +θ]≥ δ∗ (5.2j)∑l∈Rkukl = 1, ∀k ∈ K (5.2k)∑(i,j)∈As∑l∈Rkxijkl(tij +αj + βjδjkl)+∑i∈V∑l∈Rkxi0kl(ti0 +α0 + β0ck) +γ ≤ τ, ∀k ∈ K(5.2l)57Table 5.1: Optimization model notation - sets.Symbol Description Set NotationV0: depot node V0 = {0}Vs: sawmill nodes Vs = {1,2, . . . ,m}A: arcs connecting the nodes A = {(i, j) | i, j ∈ Vs ∪V0, i , j}K : trucks K = {1,2, . . . ,n}Rk routes assigned to truck k Rk = {1,2, . . . , rk}, k ∈ KTable 5.2: Optimization model notation - parameters.Symbol Description Units Indicesck : truck payload capacity t k ∈ Kζk : truck curb weight t k ∈ Kκ : minimum payload fill percentage per route % –γ : lunch break duration h –τ : maximum daily operating hours for a truck driver h –θ required moisture content of dry biomass %ψj : inventory at node j GT j ∈ Vsωj : wet basis moisture content of residues at sawmill j % j ∈ Vsλ: transport energy GJ · t−1 –η: net calorific value of biomass at a moisture content ofθGJ · t−1 –αj setup time for loading a truck at sawmill j h j ∈ Vsβj mass flow rate for loading a truck at sawmill j GT ·h−1 j ∈ Vsα0 setup time for unloading a truck at the depot h –β0 mass flow rate for unloading a truck at the depot GT ·h−1 –dij travel distance between nodes i and j km i, j ∈ V0 ∪Vstij travel time between nodes i and j h i, j ∈ V0 ∪Vsuk : logical value indicating if truck k had lunch – k ∈ KTable 5.3: Optimization model notation - decision variables.Symbol Description Units Indicesrk : number of routes assigned to truck k – k ∈ Kxijkl : logical value indicating if truck k uses arc(i, j) on its lth assigned route– k ∈ K , i, j ∈ V0 ∪Vs, l ∈ Rkδikl : amount of residues picked up by truck kat node i on its lth assigned routeGT k ∈ K , i ∈ Vs, l ∈ Rk58To simplify the notation, the decision variables are compactly expressed as the route sched-ule, x, and the amount of residues, δ, to pick up from each sawmill visited in each route. Weassume that the energy obtained from a dry ton of biomass residues at a moisture content ofθ is constant. The goal is to maximize the EROEI, denoted by the function J(x,δ) in Equation(5.2a), which is the ratio of the amount of usable energy obtained from the biomass to theamount of energy used in transporting it. As the fuel economy is dependent upon the load ofthe truck [Fontaras et al., 2017], each kilometer traveled is multiplied (weighted) by the loadtransported. Consequently, the EROEI described in Equation (5.2a) is a unitless ratio. The en-ergy required for drying the biomass is also reflected in J(x,δ). This is achieved by subtracting,from the numerator, the quantity of dried biomass that would be required to generate the en-ergy needed to dry the delivered residues to the desired moisture content, θ. Lemma 1 showshow to calculate this quantity.The optimization is done subject to the following constraints. Equations (5.2b) and (5.2c)ensure that all k trucks must leave and return to the depot at the end of each route. Equation(5.2d) ensures that the inflow and outflow must be equal except for the depot node. Equation(5.2e) ensures that the trucks cannot pick up more residues than are available at each sawmillnode. Equation (5.2f) ensures that each truck must come back at least κ × 100% full at the endof each route, where 0 < κ ≤ 1. Equation (5.2f) also ensures that the truck payload capacityck is never exceeded. Equations (5.2g) and (5.2h) enforce non-negativity and that the trucksare empty when leaving the depot at the start of each route. Equation (5.2i) enforces binaryvariables. Equation (5.2j) ensures that the minimum daily amount of required dry tons ofsawmill residues, δ∗, at a moisture content of θ is met. A dry ton refers the mass of the biomassafter it is dried to a relatively low, consistent moisture level. Equation (5.2k) ensures that eachtruck driver takes a lunch break. Finally, Equation (5.2l) ensures that each truck operates amaximum of τ hours. It is convenient to represent the solution to this problem as a set ofcollection runs. The lth collection run represents the ensemble of routes assigned to each truckon their lth route. As such, solutions to the SVRP are presented in this form.Lemma 1. Let δ be a delivered amount of biomass in green tons and ω be its corresponding wetbasis moisture content. Let θ ≤ ω be the required moisture content of the biomass for conversion59into energy and T be the temperature at which the biomass is stored. If the net calorific value of thebiomass at a moisture content of θ is η GJ · t−1, then the net energy gained after drying (assuming100% burner efficiency) is equivalent to that stored inδ[1−(ω − θ(1−ω)1−θ)(1 +(100− T )4.19× 10−3 + 2.26η)](5.3)tons of biomass at a moisture content of θ, where 4.19×10−3 and 2.26 represent the specific heat andlatent heat of vaporization of water in GJ · t−1, respectively [Perrot, 1998].Proof. As the wet basis moisture content, ω, of biomass is the percentage equivalent of the ratioof the weight of water to the total weight of the biomass, we can writeω =δωδω+ δ(1−ω) , (5.4)where δω is the weight of water in the delivered biomass, in tons. Let y be the allowed weight,in tons, of the dried biomass, then y satisfiesyδ(1−ω) + y = θ, (5.5)where θ is the required moisture content. From Equation (5.5) we have:y = δθ(1−ω) +θy (5.6)⇒ y(1−θ) = δθ(1−ω) (5.7)⇒ y = δθ(1−ω)1−θ , (5.8)thus the amount of water, in tons, that must be evaporated from the delivered biomass isδω − y = δω − δθ(1−ω)1−θ . (5.9)If T is the temperature (in degrees Celsius) at which the biomass is stored and η is the netcalorific value (in GJ · t−1) of the biomass at a moisture content of θ, then the number of tons60of biomass that would be required to generate the energy needed to evaporate δω − y tons ofwater is given by:z =(100− T )4.19× 10−3 + 2.26η[δω − δθ(1−ω)1−θ], (5.10)where the first term in the product corresponds to the energy required to heat one ton of waterto its boiling point plus the energy to change its state all divided by the energy stored in oneton of biomass at a moisture content of θ. From Equation (5.9) and (5.10) we have that the netenergy gained (assuming 100% burner efficiency) is equivalent to that stored inδ − (δω − y)− z = δ −(δω − δθ(1−ω)1−θ)− (100− T )4.19× 10−3 + 2.26η(δω − δθ(1−ω)1−θ)= δ[1−(ω − θ(1−ω)1−θ)(1 +(100− T )4.19× 10−3 + 2.26η)](5.11)tons of biomass at a moisture content of θ.5.3.2 Simulation modelScenario analysis approaches to solving complex stochastic problems have become more promi-nent in the literature [Sharma et al., 2013]. We solve the SVRP presented earlier using the QSAStartAt the DepotLunch time?Find the optimal cluster routeEnough time?EndDelay due to lunch breakTake a lunch breakAssign the routeDelay due to transportation, loading and unloadingUpdate the inventory at the depot and at the sawmillsNoYesNoYesFigure 5.1: The logic flowchart for truck operations in the simulation model.61approach described in Zamar et al. [2017c]. A discrete event simulation model based on theheuristic illustrated in Figure 5.1 was implemented to solve each scenario. To make the prob-lem more manageable, we first cluster the sawmills based on their distance matrix and restrictattention to routes within each cluster. Thus, our approach does not consider routes that spanacross clusters, which are unlikely to be selected due to their Hamiltonian distance. The Hamil-tonian distance of a set of nodes is the shortest path, made up from the edges connecting thenodes, which passes through each node. The QSA method samples scenarios from their un-derlying distribution and solves each scenario separately. The solution of each deterministicscenario problem is evaluated across the sampled scenarios to provide an estimate of the objec-tive and constraint satisfaction distribution of each solution. Solutions are then ranked basedlllllllllllllllllllllllS1S2S3S4S5S6S7S8S9S10S11S12S13S14S15S16S17S18S19S20S21S22S23S24 S25DEPFigure 5.2: The study region comprises 25 sawmills (S1–S25) across the Lower Mainland ofBC, and a single depot (DEP) located in Mission. The sawmills have been spatially clusteredinto four distinct color-coded groups. These clusters are used to identify optimal routes in acomputationally efficient fashion.62on their corresponding objective and constraint distribution quantiles. For this application wemaximize the 0.5 quantile (median) of the stochastic objective function subject to satisfying aquantile constraint on the procurement distribution. The objective distribution corresponds tothe EROEI while the constraint distribution represents the fulfilled portion of the demand asdescribed in Equations (5.2a), and (5.2j), respectively. We restrict attention to solutions thathave a 90% probability of satisfying at least 90% of the demand across scenarios and apply ourmodeling framework to the previously described SVRP and solve it using the QSA approach.A map of the study region is shown in Figure 5.2, which identifies the locations of 25sawmills and a single depot. The depot requires a minimum of 180 dry tons of biomass ata moisture content of θ = 0.3 × 100%. The net calorific value of one dry ton of biomass, η, isassumed to be fixed at 12.2 GJ · t−1 [Krajnc, 2015]. Similarly, the energy spent for transport isfixed at 0.0031 GJ per t · km [Eom et al., 2012]. The temperature, T , at which the biomass isstored at the depot is assumed to be fixed at 20 ◦C. There are a limited number of identicaltrucks with a curb weight of ζk = 13.5 tons and a payload capacity of ck = 30 tons, k ∈ K , thatare used to collect residues in the considered region.The random parameters that distinguish the scenarios are categorized as follows: (1) thebiomass residues available at each sawmill, ψj ; (2) the moisture content of the residue avail-able at each sawmill, ωj ; (3) the time required to travel between the nodes (sawmills and thedepot), tij ; (4) the truck setup time at the depot and the sawmills, α0 and αj ; and (5) the loadand unload flow rates at the sawmills and the depot, β0 and βj . The random parameters ψjand ωj are modeled based on published summary data [BC Bioenergy Network and BiomassAvailability Study Working Group, 2012] on biomass residue availability and moisture contentat each of the sawmills. Established conversion factors were used to convert from wet to dryweight and energy content [Briggs, 1994]. The random parameters αj and βj , i ∈ V , describingthe truck setup, load and unload times, are modeled based on the values calculated by FPInno-vations in their assessment of economically viable biomass [MacDonald, 2009]. The expectedtravel time, tij , and the exact travel distance, dij , ∀(i, j) ∈ A, between nodes were calculatedusing the R package “RgoogleMaps” [Loecher and Ropkins, 2015].A summary of the distributions of the random parameters included in the model are shown63in Table 5.4. The average available biomass residue at each sawmill was modeled as a multi-variate normal distribution based on their average annual production (see Appendix Table B.1).Spatial variogram models were fit using the sp package in R to represent the wet basis moisturecontent of the biomass residue available at each sawmill [Bivand et al., 2008]. A variogram isa mathematical description of the spatial continuity of the data and is calculated using a mea-sure of variability between pairs of points at various distances [Gandhi, 2008]. The travel timebetween a pair of nodes was modeled as an exponential random variable with a mean equal tothe expected travel time. The mean and standard deviation of the travel time, averaged acrossall pairs of nodes, are included in Table 5.4. Similarly, the mean and standard deviation of thebiomass residues available at each sawmill, averaged across all the sawmills, are provided inTable 5.4. For completeness, the mean and standard deviation of the total available biomassresidues across all the sawmills are also given in Table 5.4. The number of sawmills,m = 25, thenumber of available trucks, n = 3, and the distance between nodes, dij , ∀(i, j) ∈ A, are all knownand assumed to be fixed. The number of trucks needed was established by adding one truckat a time and repeating the simulation until the required dry tons of daily biomass residues isachieved.Table 5.4: Stochastic input data. The bar above parameters with subscripts indicates that theaverage value is shown (i.e. ψ = 1/j∑ψj ).Parameter Mean Standard deviationTotal available residue (GT)∑ψj 1010.2 31.6Residue per Sawmill (GT) ψ 40.4 10.3Moisture content (%) ω 45.0 3.8Unload setup time (h) α0 0.1 0.01Load setup time (h) α 0.1 0.01Unload time rate (h/GT) β0 0.01 0.001Load time rate (h/GT) β 0.02 0.002Travel time (h) tij 1.5 0.6The simulation model was implemented using the R system for statistical computing [RCore Team, 2016a]. A total of 1000 scenarios were simulated by sampling from the appropriate64distribution of each random parameter included in the model. The parameters were consideredto be statistically independent. To control the amount of biomass procured on a daily basis, werestrict attention to solutions that have a 90% chance of satisfying 90% of the daily demandof 180 dry tons of biomass. Moreover we only consider solutions with a probability of 10% ofexceeding the demand. This will compel the QSA solution to seldom exceed the required dailydemand, but at the same time consistently meet at least 90% of the demand. Exceeding thedaily demand cannot always be avoided as the trucks are constrained to return to the depot atleast 70% full after each route in order to improve the transport efficiency. Thus, κ = 0.70 inEquation (5.2f). Moreover, as the availability of residues at each of the sawmills is unknown,the trucks gather as much residues as they can load at each sawmill they visit.The sawmills were clustered into four groups based on their travel distance matrix usingthe k-medoids method implemented in the package cluster [Maechler et al., 2015]. The numberof clusters was estimated by the method of optimum average silhouette width [Kaufman andRousseeuw, 1990]. The four clusters are depicted in Figure 5.2 and the cluster sizes are 3, 5, 7and 10, respectively. Given the time constraints imposed by the sawmills and the truck driversoperating hours, each truck was able to perform a maximum of 3 routes per day. This fact waslearned from the simulation results.The mean scenario (MS) approach is an alternative method that may serve as a benchmarkfor assessing the performance of the QSA approach. The MS solution is obtained by substitut-ing the random parameters φj , ωj , α0, αj , β0, βj , and tij in Equation (5.2) by their correspond-ing expected values and solving the resulting deterministic constrained optimization problem.This is equivalent to solving the mean scenario, hence the name of the approach. While the MSapproach is helpful for making decisions, its applicability to real world problems is sometimeslimited due to its inability to account for uncertainties in the model parameters.Synthetic data are commonly generated in order to validate mathematical models. Modelfeature calibration for both the QSA and MS methods was done using a synthetic trainingdataset comprised of 1000 scenarios (i.e. realizations of the stochastic model parameters). Aseparate and independent synthetic dataset of the same size as the training dataset was used toperform an out-of-sample validation of the QSA and MS solutions.655.4 ResultsThe optimal routes selected for each truck by the QSA method are shown in Table 5.5. Eachrow in Table 5.5 corresponds to a collection run. From the first row, we see that the first col-lection run consists of sending the first truck to sawmill S22, the second truck to sawmills S15and S18, and the third truck to sawmills S20, S23 and S25, in this order. We denote this routecombination as [S22,(S15,S18),(S20,S23,S25)]. The total distance traveled in the first collec-tion run is 52.0 km and an accumulated time of 5.6 hours is required on average. The averageamount of biomass obtained in the first collection run is 54.9 dry tons. Furthermore, the firstcollection run yields a energy returned on energy invested (EROEI) performance ratio of 96.7.Recall, that the EROEI has been adjusted for the energy spent for transportation and drying ofthe biomass residues. Similarly, the performance of collection runs 2, and 3 are shown in thecorresponding rows of Table 5.5. Notice the drop in the EROEI between collection runs. Thisis attributed to the fact that the sawmills closest to the depot are the first to become depleted,compelling the trucks to travel longer distances to collect the biomass residues. Sawmills S22,S2 and S25 are repeatedly visited as they are relatively close to the depot and have a large pro-duction of residues with a low moisture content. The average accumulated operating hours forthe collection runs are shown in the last column of Table 5.5. The mean of this column givesthe average operating time in hours for each of the three trucks (6.5 h). Evidently, two truckswould not be sufficient to collect the required daily amount of residues in an 8-hour work dayand employing four trucks would be wasteful as they would be underutilized. Therefore, inthis case, using three trucks may be considered optimal. The QSA solution collects an averageof 171.8 dry tons of biomass residues per day, and achieves a median EROEI of 47.3. A compu-tation time of 3.91 minutes was required by the QSA method to solve this problem running onan Intel Xeon 3.6 GHz processor.A comparison of the cumulative distribution function for the EROEI and amount of biomassresidues collected by the QSA and MS solutions are shown in Figures 5.3a and 5.3b, respec-tively. As can be seen in Figure 5.3a and 5.3b, the QSA solution is able to consistently obtain agreater EROEI than that of the MS solution while meeting 90% of the demand (marked by the66Table 5.5: QSA optimal route schedule.Truck 1 Truck 2 Truck 3 Distance (km) Weight (t) EROEI Time (h)S22 S15,S18 S20,S23,S25 52.0 54.9 96.7 5.6S14,S19 S22,S19,S25 S2 118.3 57.1 43.0 6.5S2 S17,S25,S16 S2,S24 218.3 59.8 22.3 7.5vertical dotted line at 162 tons in Figure 5.3b) at least 90% of the time.To validate the QSA solution shown in Table 5.5, we applied it to a new set of 1000 ran-domly generated scenarios. The newly obtained scenario data may be considered test databecause they were not originally used to train (or fit) the QSA solution. In out-of-sample evalu-ation, the QSA solution obtained a 90% probability of collecting at least 89.7% of the demandand a median EROEI of 46.8 with a corresponding interquartile range (IQR) of 44.6−49.0. Onthe other hand, the MS solution obtained a 90% probability of collecting at least 94% of the0.000.250.500.751.0040 45 50EROEICumulative probability(a) CDF of the EROEI.0.000.250.500.751.00130 150 170 190 210Procurement (dry tons)Cumulative probabilityMethodMSQSA(b) CDF of the procurement.Figure 5.3: Performance distributions of the QSA and MS solutions. (a) Cumulative distri-bution function (CDF) of the energy returned on energy invested (EROEI). The QSA solutionis more efficient as it has a consistently larger EROEI ratio than that of the MS solution. (b)Cumulative distribution function of the total amount of biomass procured. The MS solutionconsistently delivers more dry tons of biomass than required thereby producing an undesirablesurplus at the depot. The horizontal dotted line indicates where 90% of the demand (i.e. 162dry tons) is achieved.67demand and has a median EROEI of 44.1 with a corresponding IQR of 42.0− 46.2. A Kruskal-Wallis test [Hollander, 2013] of the difference in medians between the QSA and MS solutionsEROEI values resulted in a p-value of 2.2×10−16. Thus, the median EROEI of the QSA method(46.8) is statistically significantly greater than that of the MS approach (44.1) and results in anenergy savings of approximately 6 GJ per day in transport costs alone with a corresponding95% confidence interval of 5.96− 6.11 GJ. When taking into account the quality (i.e. moisturecontent) of the residues obtained by the QSA method versus those obtained using the MS ap-proach, a net energy gain of 173 GJ is obtained due to a decrease in the transport of water andin the drying of residues. To put this into perspective, a typical household’s monthly electricalbill is approximately 8−9 GJ in Canada [Statistics Canada, 2011]. Overall, an improvement of6% in the EROEI was acheived by the QSA method compared to that of the MS approach.As can be seen in Figure 5.4b, the MS solution tends to over collect and has an estimated69.6% probability of exceeding the required demand of 180 dry tons per day. In contrast theQSA solution collects between 162 and 180 dry tons 75.2% of the time (area enclosed betweenthe dotted lines in Figure 5.4b) and only exceeds the demand 13.6% of the time. This is aconvenient feature of the QSA solution because the depot has a limited storage capacity andthroughput.680.00.10.240 45 50EROEIDensity(a) Probability density function the EROEI.0.000.020.040.06125 150 175 200 225Procurement (dry tons)Density MethodMSQSA(b) Probability density function of the procurement.Figure 5.4: Out-of-sample performance of the quantile-based scenarios analysis (QSA) andmean scenario (MS) solutions. (a) Probability density function of the energy returned on energyinvested (EROEI). The QSA solution is more efficient as it has a consistently larger EROEI ratiothan that of the MS solution. (b) Probability density function of the total amount of biomassprocured. The dotted lines show where 90% and 100% of the daily demand is met. The MSsolution consistently delivers a great deal more than the required 180 dry tons of biomassthereby producing an undesirable surplus at the depot. In contrast, the QSA solution is moreaccurate and precise in delivering the required amount of biomass.5.5 DiscussionIn this case study, a simulation approach was used to solve an stochastic vehicle routing prob-lem (SVRP), where the goal is to efficiently collect biomass residues from a set of sawmills.Several factors, such as the moisture content and the amount of available biomass residues areuncertain and treated as random. Though the mills are located alongside a river, trucks ratherthan large capacity barges were considered for the transport of residues due to the daily na-ture of the demand and limited storage capacity of the depot. The optimal routing schedulewas obtained by maximizing the energy returned on energy invested (EROEI). Use of the QSAmethod was found to be more efficient in terms of minimizing the EROEI and more accurateand precise in terms of meeting the daily demand than by simply solving the expected or meanscenario.69Beyond theoretical consideration on solution feasibility, the presented methodology pro-vides industry leaders and operation research practitioners with a simulation-based frame-work to effectively manage, mitigate and avert risk. The main strength of this methodologyis its applicability to almost any SVRP, where one or more model parameters (i.e. travel time,resource availability, customer demand, etc.) are random or uncertain.Ignoring uncertainty when solving an SVRP will result in less effective solutions that costmore to implement and have more variable performance. Through the use of QSA, the pre-sented approach has the benefit of incorporating distributional information, which leads toless conservative solutions.70Chapter 6Bale Collection6.1 Problem descriptionThe agricultural industry is now capable of collecting comprehensive real-time data regardingtheir field operations. Proper use of this data compels the formulation of novel approachesto help improve the management of tasks involving the coordination of agricultural machinesand vehicles. These technologies can provide accurate information for precision agriculture(PA) decision support systems in farm management.PA is conceptualized by a system approach to reorganize farm management systems to-wards a low-input, high-efficiency, sustainable practice. PA benefits from the use of moderntechnologies, such as global positioning system (GPS), geographic information system, auto-matic control, remote sensing, miniaturized computer components, mobile computing, ad-vanced information processing, and telecommunications [Gracia et al., 2013].Major field operations require the planned coordination of various farm equipment. Thebale collection problem (BCP) appears after harvest and baling operations of a crop and con-sists of defining the sequence in which bales spread over a field are collected. Once the har-vesters have operated throughout the field, the mowed crop is left behind in windrows to becompressed and compacted by balers into bales that are convenient to handle and transport.Bales remain scattered on the field awaiting their collection by loaders and transported to theroadside by wagons (either self-propelled or pull-type).The BCP is concerned with the collaborative operation of several machines and vehicles.Therefore, planned management is required to coordinate the various tasks efficiently. Thecollection strategy is typically established based on the skills and experience of the operator.The inconsistent and subjective nature of decisions based on operator judgment tend to pro-71duce suboptimal solutions [Milkman et al., 2009]. On the other hand, more efficient bale col-lection plans are achievable. Bales, loaders and wagons can be equipped with geo-positioningtechnology that make it possible to know the exact location of bales and track vehicles onpredetermined paths [Amiama et al., 2008]. The BCP can be modeled and solved efficientlyby applying optimization techniques, which can be easily integrated into a farm managementdecision support system [Gracia et al., 2013].Solving the BCP involves determining the optimal roadside storage sites and the bale col-lection routes while taking into account the maximum capacity of the wagons as well as thedistance traveled in transporting the bales to the roadside. The general process involves balesbeing transported from the field to the field’s roadside, where they will be collected and takento more permanent storage locations (i.e. silos, storage bunkers or barns). Hence, it becomesnecessary to describe the BCP using mathematical models that can be efficiently solved to yieldoptimal allocation, route planning and timing solutions. The BCP belongs to a class of opera-tional research problems known as the vehicle routing problem (VRP), which has been widelystudied. The VRP is a combinatorial optimization and integer programming problem and isa generalization of the traveling salesman problem (TSP). The goal of the VRP is to find theoptimal set of routes for a fleet of vehicles delivering goods or services to a set of geograph-ically dispersed locations or customers. Ekşioğlu et al. [2009] provide a taxonomic review ofthe plentiful literature published on VRP related research. Despite the fact that field tasks in-volve a complex routing of vehicles, these concepts have only recently been transferred to theagricultural environment [Bochtis et al., 2013; Gracia et al., 2013]. According to the theory ofcomputational complexity, this class of decision problems is NP-hard.Exact methods for solving the VRP are computationally intensive, and are not guaranteedto find the optimal route in reasonable computing time when the number of customers is large.As such, commonly used techniques for solving VRPs focus on the use of algorithmic methodsbased on the application of heuristics or meta-heuristics. Heuristics are optimization tech-niques that perform a relatively limited, but well-crafted, exploration of the search space andtypically produce good quality solutions within modest computing times. Examples of heuris-tic procedures applied to VRP include particle swarm optimization [Lei et al., 2014], artificial72optimized performance of bees [Szeto et al., 2011], ant colony optimization [Yu and Li, 2012],constraint programming algorithms [Rego, 2006] and genetic algorithms [Gracia et al., 2013].This paper presents a heuristic algorithm to efficiently solve the BCP appearing after mowingand harvesting operations. The proposed algorithm aims to increase the overall field efficiencyof collection operations by providing the basis for a navigation instrument dedicated to loadersand bale wagons. The low computational requirements of our approach makes it feasible forintegration in large scale operations. The proposed approach has two main parts. The firstpart involves identifying the optimal roadside storage sites where bales are to be transported.This translates to a constrained cluster analysis optimization problem as the storage sites mustlie on the side of the road. The second part involves identifying efficient collection routes fortransporting the bales from the field to their respective roadside storage site. This is a capaci-tated vehicle routing problem (CVRP) as the number of bales that a wagon can pick up on anygiven route is limited by the wagons capacity.The nearest neighbour heuristic (NNH) is commonly used to identify solutions for the BCP.It is obtained by sequentially picking up the nearest bale from the current position of theloader and is commonly used by experienced operators for the collection of bales spread overa field. The solutions produced by the NNH approach will be used as a baseline to evaluate theefficiency of the proposed approach.6.2 Methods6.2.1 Part I: constrained cluster analysis of balesIn the context of collecting bales on a field, the identification of roadside storage points wherebales are to be transported can be expressed as a cluster analysis problem, where the aim isto partition the bales into k clusters in which each bale belongs to the cluster with the nearestmean, resulting in a partition of the bales on the field. If the location of the cluster centers werenot constrained to lie on the road, then the k-means algorithm [Hartigan, 1975], originally usedfor signal processing, may be directly applied. Even so, this standard cluster analysis problemis known to be computationally difficultly (NP-hard). The fact that the storage locations are73constrained to lie on the roadside makes this part of the BCP an even greater challenge tosolve. We believe that this is the first work that shows how to optimize this class of constrainedcluster analysis problem.Given a set of observations (x1, . . . ,xn), where each observation is a d-dimensional real vec-tor, k-means clustering aims to partition the n observations into k ≤ n sets S = {S1,S2, . . . ,Sk}so as to minimize the within-cluster sum of squares. The mathematical representation of theconstrained cluster analysis (CCA) problem is as follows:minimizeuk∑i=1∑x∈Si||x −ui ||2 (6.1a)subject tom∏j=1gj(ui) = 0 for i = 1,2, . . . , k (6.1b)S1 ∪ S2 ∪ · · · ∪ Sk = S (6.1c)Si ∩ Sj = ∅ ∀i , j, (6.1d)wheregj(x) = mina∈Aj||x −a||2and Aj , j = 1,2, . . . ,m are compact subsets of R2 corresponding to valid positions for the clustercenters. The constraint given in Equation (6.1b) ensures that each cluster center is located ina valid position. Equations (6.1c) and (6.1d) enforce strict partitioning clustering and a manyto one mapping of bales to clusters, respectively. The functions gj(x), j = 1,2, . . . ,m calculatethe minimum euclidean distance between a given point, x, and any point in the set Aj . TheCCA problem represented in Equations (6.1a) to (6.1d) is a quadratically constrained quadraticprogram that is NP-hard.The back-fitting approach described in Algorithm 2, may be used to solve the CCA problemrepresented in Equations (6.1a) to (6.1d). A sufficiently large constant weight parameter γ mustbe given to ensure the solution satisfies the CCA constraint of Equation (6.1b) as accurately asnecessary.74Algorithm 2: The constrained cluster analysis (CCA) algorithm.Step 1: Choose k random points from S to be the initial cluster centers, u = (u1, . . . ,uk). Therelaxed solution of the k-means algorithm is a good initial starting point.Step 2: Assign points to clusters based on their Euclidean distance to the cluster centers:Si ={x : ||x −ui ||2 < ||x −uj ||2, j , i}.Step 3: Update the cluster centers by solving the following minimization problem:ui = argminνfi(ν) =∑x∈Si||x − ν ||2 +γ · gi(ν)Step 4: Repeat Steps 2 and 3 until there is no significant change in the clustering criteria,∑ki=1 fi(ui).6.2.2 Part II: within cluster route optimizationAs in the VRP, this part of the BCP may be formulated as a graph theory problem. For a givencluster, %, let G = (N,A) be an undirected graph, where N is the set of nodes and A is the setof edges. In our case, N = {0,1, . . . ,n%} is an index set for the n% bales and the roadside storagenode, denoted as 0. A = {(i, j) | i, j ∈ N ; i < j} represents the set of (n% + 1)(n% + 2)/2 existingedges connecting the n% bales and the storage site.A weight, qi , is assigned to each bale i, 1 ≤ i ≤ n% (q0 = 0). Each edge has an associated cost,cij > 0, of sending a vehicle from node i to node j. The cij are assumed to be symmetric andproportional to the Euclidean distance, dij , between any two nodes, thus cij = cji ∝ dij , i, j ∈N .The collection process is to be carried out by a fleet of v vehicles, v ≥ 1, with equal capacity,κ ≥max{qi | 1 ≤ i ≤ n%}. Notice that each vehicle may represent an autoloader trailer or a loaderaccompanied by a transport wagon.The problem is to determine the set of routes that minimize the total travel cost within eachcluster identified by the CCA algorithm in Section 6.2.1. As time is not being considered, thenumber of vehicles does not effect the problem solution as the routes are independent of oneanother and can be completed either in series or in parallel. The following are some additional75constraints associated with the problem: i) roadside storage node can only be visited at thestart and at the end of each route; ii) all routes begin and end at the roadside storage node; iii)no two routes visit the same bale; iv) all bales are visited exactly once; and v) no vehicle can beloaded exceeding its maximum capacity.The decision vector is x = (xijr ), where i, j ∈ N , r ∈ R = {1,2, . . . , τ} and τ = dn%/κe is thenumber of routes needed in order to pick up all of the n% bales:xijr = 1 if route r contains edge (i,j)0 otherwise,Here we assume that, with exception to the last route, vehicles are loaded to their maximumcapacity. The mathematical formulation of the route optimization part of the BCP is as shownin Equations (6.2a) to (6.2e). Equation (6.2b) is to make sure that each bale is assigned to exactlyone route. Equation (6.2c) states capacity constraints, so that the sum of all bales collected ina route is less than or equal to the loading capacity of the vehicle. Finally flow constraints areshown in Equations (6.2d) and (6.2e) to ensure that each route begins and ends at the roadsidestorage site and that the inflow and outflow of edges must be equal for all the nodes.The route optimization component of the BCP is an NP-hard problem, which explains whymost research efforts have focused on heuristic approaches. A variety of methods to solvingthe classical VRP have been proposed over the past decades. The strategies employed rangefrom the use of pure optimization for solving small size problems to the use of heuristics thatprovide near-optimal solutions for medium and large-size problems with complex constraints[Braekers et al., 2016].minimizext∑r∈R∑(i,j)∈Acijxijr (6.2a)subject to∑r∈R∑j∈Nxijr = 1 ∀i ∈N (6.2b)∑i∈N∑j∈Nxijr × qj ≤ κ ∀r ∈ R (6.2c)76∑j∈Nx0jr = 1 ∀r ∈ R (6.2d)∑i∈Nxijr =∑i∈Nxjir ∀j ∈N, r ∈ R (6.2e)We provide a simple, yet efficient heuristic, summarized in Figure 3, for identifying near-optimal solutions for the second part (i.e. within cluster route optimization) of the BCP. Ourheuristic works by identifying the most isolated, and least isolated bales, and choosing one ofthe them with equal probability. Denote the chosen bale as b∗. A bales degree of isolation isdetermined by the number of times it is selected as a nearest neighbor, in Euclidean distance,by its peers.For each bale that selects b∗ as a nearest neighbor, we solve a corresponding TSP that visitsall of that bales κ − 1 nearest neighbors (which must include b∗ by definition) as well as thestorage node. Recall that κ represents the capacity of the wagon. The TSP solution with theshortest Hamiltonian cycle is chosen as a route and its corresponding bales are removed fromthe set of nodes, N . In graph theory, a Hamiltonian path is a traceable path in an undirected ordirected graph that visits each node exactly once. A Hamiltonian cycle is a Hamiltonian paththat is a cycle (i.e. begins and ends at the same node).The process is repeated until all bales have been picked up. The proposed heuristic iscalled the min-min min-max optimization algorithm (MMROA) because at each iteration itselects either the most isolated or least isolated bale with equal probability and minimizes thelength of the route that picks up the chosen bale as described in Step 3 of Algorithm 3. Thisstrategy is employed to avoid premature convergence to a local optima. The MMROA reducesa very large CVRP into several much smaller TSP problems. This is justified as the number ofbales to be collected are typically several orders of magnitude larger than the capacity of thevehicles. In such case, the proposed heuristic will dramatically reduce the complexity of theproblem and the computing time necessary to solve it. As the solution obtained from differentruns of MMROA may differ due to its stochastic nature, several runs of the MMROA shouldbe made and the best solution chosen. A rule for selecting the number of runs of the MMROAalgorithm that are needed is described in Section 6.3.77Algorithm 3: The min-min min-max route optimization algorithm (MMROA).Initialization: Z = ∅Step 1: Let k =min(κ, |N | − 1), where κ is the capacity of the wagon. Compute the set of k − 1nearest neighbors for each bale b ∈N . Denote the set of k − 1 nearest neighbors of bale bas Qb.Step 2:for b ∈N domb =∑i∈NI(b ∈Qi)endb1 = argminb{mb : b ∈N }b2 = argmaxb{mb : b ∈N }u ∼U (0,1)if u ≤ 0.5 thenb∗← b1elseb∗← b2endStep 3: Among the sets:Q = {Qb | 0,b∗ ∈Qb,b ∈N },which contain both b∗ and 0, select the one that has the shortest Hamiltonian cycle.Denote this set as Q∗ and its corresponding shortest Hamiltonian cycle as h.Z← Z ∪ hStep 4:N ←N \Q∗if N , ∅ thengo to Step 1endreturn Z78051015200 5 10 15width (m)length (m)Figure 6.1: Obtained tours by the min-min min-max optimization algorithm (MMROA) for aproblem previously proposed by Grisso et al. [2007].There are almost no previous references to solving this part of the BCP in the literature.Grisso et al. [2007] presented a simple case in which 34 bales scattered over a field should becollected with a vehicle capacity of 6 bales. This problem was subsequently solved by Graciaet al. [2013] with a hybrid genetic algorithm yielding a 6.0% reduction in the total travel dis-tance over the original solution. The solution obtained by the MMROA yields a 6.8% reductionin the total travel distance. Figure 6.1 shows the solution obtained by MMROA. Each tour isdepicted with a different color and the storage site is located at the origin.6.3 Stopping ruleWhen performing simulation-based optimization for solving complex problems, it is often amatter of debate whether the run length (i.e. number of iterations) is sufficient to yield an ac-curate solution. The solution space of the CCA problem may potentially be infinite. Therefore,no limited number of runs is guaranteed to yield the “optimal” solution (if one indeed exists).We present a probabilistic stopping rule for the CCA algorithm that allows one to determine79with a desired level of accuracy, α, the probability that the optimal solution obtained from arun of length, n, is within the top ρ percent of “optimal”.Let xˆ = argminxi{d(xi) : i = 1, . . . ,n}, be the best solution obtained from a set of n indepen-dent CCA solutions for a given problem, where d(xi) represents the attained travel distanceof solution xi . Let ρ be the probability that there exists a solution, x˜, such that d(x˜) < d(xˆ).We propose a stopping rule based on statistical evidence that ρ < ρ0, for some specified valueρ0. We sample an additional set of n2  n1 CCA solutions. Let 0 ≤ N ≤ n2 be the number ofadditionally sampled CCA solutions that have a better performance than xˆ. Suppose that theobserved value of N is r. Our stopping rule tests the hypothesis H0 : ρ ≥ ρ0 vs. H1 : ρ < ρ0 asfollows. We rejectH0 at level α and consider that the sample has sufficiently covered the searchspace, if PH0(N ≤ r) ≤ α, for ρ ≥ ρ0. Under, H0 : ρ ≥ ρ0, N follows the Binomial distribution,Bin(n2,ρ) with ρ ≥ ρ0. It can be shown [Klenke et al., 2010] thatmaxρ≥ρ0P (N ≤ r) = maxρ≥ρ0P (Bin(n2,ρ) ≤ r)= P (Bin(n2,ρ0) ≤ r)=r∑i=0(n2i)ρi0(1− ρ0)n2−i(6.3)and we reject H0 : ρ ≥ ρ0 in favour of H1 : ρ < ρ0 at level α if the RHS of Equation (6.3) is lessthan α, where α is a specified small probability, such as 0.05 corresponding to a 95% confidencelevel.6.4 ResultsIn order to evaluate the proposed algorithm in a realistic manner, we developed a scenariogenerator that is able to produce problem instances of the BCP, each characterized by a setof parameters. More specifically, a problem instance of the BCP is defined by the capacityconstraint of the vehicle, κ, the location of the roads and the n exact locations of the bales inthe field. As such, the problem generator simulates the number and location of bales on thefield. Note that the number of available vehicles does not affect the solution as each tour is80independent of one another and can be done in sequence or in parallel. However, the numberof vehicles required would need to be considered if the total collection time were to be includedas a constraint.If we consider a uniform yield (kg ha−1) throughout the field, the distribution of bales fol-lows a constant distance pattern easily obtained using a Poisson process with rate (or intensity):λ =µ× 10000γ ×ω , (6.4)where λ is the average travel distance in meters (m) by a baler until it packs a bale, µ is the massof one bale in kilograms (kg), γ is the production level of wheat (kg ha−1) and ω is the workingwidth of balers in meters. We focus on farmlands in Western Canada, where the farmlands arecommonly divided into quarter (square) sections of approximately 64 hectares [Friesen andThauberger, 2017]. The hectare is the area of 10,000 m2.810800160024000 800 1600 2400width (m)length (m)Figure 6.2: Distribution of wheat bales in a study area composed of 9 sections separated byaccess roads.820800160024000 800 1600 2400width (m)length (m) Cluster12345Figure 6.3: Bale clusters and roadside storage sites identified by the constrained cluster analysis(CCA) method.830800160024000 800 1600 2400width (m)length (m)Cluster12345(a) Wagon capacity of 8 bales.0800160024000 800 1600 2400width (m)length (m)Cluster12345(b) Wagon capacity of 15 bales.Figure 6.4: The min-min min-max optimization algorithm (MMROA) solutions for wagoncapacities of (a) 8 bales and (b) 15 bales.840800160024000 800 1600 2400width (m)length (m)Cluster12345(a) Wagon capacity of 35 bales.0800160024000 800 1600 2400width (m)length (m)Cluster12345(b) Wagon capacity of 45 bales.Figure 6.5: The min-min min-max optimization algorithm (MMROA) solutions for wagoncapacities of (a) 35 bales and (b) 45 bales.85Table 6.1: Model parameter values.Parameter Value UnitAverage yield 2000 kg ha−1Working width of baler 2.5 mAverage mass of bales 400 kgWagon capacity 8, 15, 35, 45 balesWe consider a study area with a dimension of 2400 m × 2400 m that is composed of nineequal-sized sections of 64 ha each. Figure 6.2 shows the distribution of bales in the study area.Additional input parameters required for the first and second parts of the BCP are comprised inTable 6.1, which includes information on the average yield, bale, and machinery characteristics.Different capacity constraints appear depending on the wagon used. There is a wide range ofeither self-propelled or pull-type bale wagons with different loading capacities depending onthe dimension of the bales. Four different wagon capacities are considered in Table 6.1.Based on an analysis of the within-cluster sum of squares, we chose to divide the bales intofive clusters (see Figure 6.6). As such, we apply the CCA algorithm with k = 5 in order toidentify the bale clusters and their corresponding roadside storage sites, which are constrainedto lie on the horizontal and vertical grid-roads shown in Figure 6.2. The five bale clusters andcorresponding constrained centers identified by the CCA method are shown in Figure 6.3.Table 6.2: Comparison of the min-min min-max optimization algorithm (MMROA) to thenearest-neighbor heuristic (NNH) for different wagon capacities.Capacity (bales) # of routes NNH (m) MMROA (m) Savings (%)8 363 422,314 402,161 4.815 196 278,360 254,552 8.635 85 181,421 157,605 8.745 67 165,375 140,898 8.5The optimal routes identified by the MMROA for the four wagon capacities considered(8, 15, 35, and 45) are shown in Figures 6.4 and 6.5. To validate the proposed approach,it’s performance is compared to the customary assignment rules that an experienced operatorwould follow when collecting bales (i.e. the NNH procedure). A summary of the solutionobtained for each wagon capacity is shown in Table 6.2. Results are compared to the NNH863000040000500001 2 3 4 5 6 7number of clusterstotal within−clusters sum of squaresFigure 6.6: Elbow plot of the number of clusters versus total within-cluster sum of squares(WSS). The goal is to choose the smallest number of clusters, k, that still has a low WSS. Theelbow of the plot indicates where we start to have diminishing returns by increasing k.20030040010 20 30 40wagon capacity (bales)kmFigure 6.7: Wagon capacity versus distance traveled. Larger wagon capacities provided moreoptimal solutions as the total distance traveled was found to decrease logarithmically with anincrease in the wagon capacity.87procedure and the percentage of savings calculated. We implement our algorithms using theR programming language [R Core Team, 2016a]. The R-code for this simulation study may befound in Appendix C. The optimal path for each route is calculated using the TSP packagein R [Hahsler and Hornik, 2007]. As expected, the number of routes required decreases withan increase in the wagon capacity. The travel distance also decreases as the wagon capacityincreases, but at a logarithmic rate. For this application, the recommended wagon capacity is35 bales as the rapid decline in the total distance traveled by loaders and bale wagons reachesa lower plateau near this value (see Figure 6.7). As seen in Table 6.2, an 8.7% reduction in thetotal travel distance is obtained by the MMROA over the NNH approach when using a wagoncapacity of 35 bales. The lower plateau in the total travel distance is also observed through thesimilarity of route densities between Figures 6.5a and 6.5b.6.5 DiscussionWe presented a two-part optimization approach for solving the bale collection problem (BCP).The first part of our approach is to identify the optimal locations for the roadside storage siteswhere bales are to be temporarily piled for future transport to their final destination, such asa silo, silage bunker or barn. We assume that the number of roadside storage sites is given, buttheir locations are to be optimized. This results in a constrained cluster analysis problem asthe storage sites must be located on the roadside. This first part of the BCP is solved using anew constrained k-means constrained cluster analysis (CCA). This is not to be confused withprevious work on constrained k-means procedures where the constraints consist of groupsof points that must be clustered together. On the other hand, our constraints pertain to thelocations of the cluster centers, which must be situated on the roadside. The optimization ofthe number of roadside storage sites should be determined to minimize the total travel distancein the second part of the BCP problem. This will be considered in future research.The second part of our approach to solving the BCP is to identify the optimal collectionroutes for bringing the bales within each cluster to their corresponding roadside storage site,which have already been determined by the CCA algorithm. We developed an algorithm, called88the min-min min-max optimization algorithm (MMROA), which approximately solves this NP-hard problem. The MMROA works by iteratively identifying the most isolated, and least iso-lated bales, choosing one of the them with equal probability, and optimizing its collectionroute. The MMROA was shown to give comparable results in a test case proposed by Grissoet al. [2007], which was subsequently solved by Gracia et al. [2013] with a hybrid genetic al-gorithm yielding a 6% reduction in the total travel distance. In the presented BCP case study,the solution identified by the MMROA achieves an 8.7% reduction in the total travel distancecompared to that obtained by the nearest neighbour heuristic (NNH).The potential benefits of our approach to solving the BCP is its scalability and ease of imple-mentation, which allow it to tackle much larger problems. The MMROA algorithm implementsa divide and conquer strategy that breaks down a complex CVRP into several smaller TSPs thatcan be approximately solved using well-known and efficient heuristic procedures.89Chapter 7Conclusions and Future Work7.1 ConclusionsBiomass and the useful forms of bioenergy produced from it are anticipated to provide a signif-icant contribution to the global energy supply. However, developing a strong and competitivebioenergy industry is a complex and challenging process. In particular, procuring reliableand cost effective supplies of biomass feedstocks, which meet quality specifications and areproduced in a sustainable manner over the operating life of a biomass plant, can prove to bedifficult due to various sources of uncertainty [U.S. Energy Information Administration, 2016].A biomass supply chain model should take into consideration uncertainty regarding thespatial and temporal distributions of biomass, transportation and storage costs, the competi-tion for its use, and the technologies used for its conversion into bioenergy. Moreover, parame-ters contributing to the quality aspects of the biomass, such as moisture content, heating value,and bulk density should be considered as they contribute to the overall costs of transport,storage and conversion of biomass into energy [Yue et al., 2014].The overall goal of this research was to develop flexible and easy to use mathematical frame-works that are suitable for the design and planning of biomass supply chains. This work fo-cused on the upstream processes concerning biomass procurement because the success (andfailure) of biomass supply chains are most sensitive to deficiencies at this stage.In Section 1.4, available approaches for modeling the various stages of biomass supplychains were examined. The literature survey revealed a lack of modeling approaches that ac-count for uncertainty in the input model parameters when performing optimization. This isespecially important at the operational level, which involves decision making and planningto keep the biomass supply chain active. Effective operational level processes are the result90of strong strategical and tactical planning. The monetary and environmental costs associ-ated with transport have an emphatic impact on sustainable production. Combining problem-solving and decision-making strategies such as those developed in this thesis are necessary con-tributions to developing sustainable cost effective logistics practices for biomass procurementoperations. Due to the heterogeneity between biomass supply chains for various products, itwas necessary to develop generalizable, computationally efficient, and easily implementablemethods.Scenario based approaches provide an effective and practical approach for capturing theprobabilistic nature of the biomass supply chain. In Section 2.2, a computationally tractableapproach called quantile-based scenario analysis (QSA) for the optimization of complex stochas-tic problems was developed and a statistical test for convergence of the approach was formu-lated. The first step of the QSA approach is to sample joint realizations of the stochastic inputparameters. Each realization is considered to be a unique scenario, representing mutually con-sistent value combinations of the associated input parameters. When data on the stochasticmodel parameters is available, Bayesian networks may be used for modeling the probabilitydistribution of the scenarios. An illustrative example of the application of the QSA approachfor dealing with parameter uncertainty in the context of portfolio optimization was providedin Section 3. Three unique case study applications, which involve the modeling and optimiza-tion of procurement operations for biomass-to-bioenergy supply chains were presented in thisthesis, as described below.Forest harvest residues collectionSupply chain optimization for biomass-based power plants is an important research area dueto an increased emphasis on green energy sources. In Section 4, the QSA approach was used tojointly analyze a stochastic system of forest harvest residue (FHR) supply chains serving sev-eral bioenergy plants. The model accounts for uncertainty in availability, quality, processingcosts, and demand. As the QSA approach considers the distribution of the problem constraintsas well as that of the objective function, it is able to identify solutions with better performancecompared to those obtained using existing methods. The QSA solution was found to be su-91perior in terms of its ability to minimize the FHR procurement cost across sampled scenarioswhile satisfying the demand constraints to within 10% of desired, with a probability of 95%.The ability to jointly optimize the entire stochastic system of neighboring FHR supply chainsis beneficial because it allows for the identification of more economically sustainable solutionsthan what can be achieved through separate optimization of each individual FHR supply chain.Economic sustainability in this context is concerned with long-term stability and balance. Forthe effective management of FHR supply chain systems, a QSA should be performed period-ically due to changes over time in the distributions of the stochastic model parameters. Inaddition, the values of the deterministic model parameters themselves could change over anextended period of time and may also need to be adjusted accordingly.Sawmill residues collectionThe collection of sawmill residuals is an important logistic activity for the pulp and paper in-dustry, which use the biomass as a source of energy. The sawmill residues are composed of amixture of sawdust, wood shaving and hog fuel feedstock. In Section 5, we presented a vehi-cle routing problem for a network composed of a single depot and 25 nearby sawmills in theLower Mainland region of British Columbia, Canada. The sawmills serve as potential suppliersof residual biomass to the depot, located in Mission BC, which in turn processes and distributesthe sawmill residues to the pulp and paper mills. The main function of the depot is to reduceparticle size variability and moisture content, by screening, size segregation, grinding and dry-ing. This problem consisted of identifying the best daily routing schedule for a fixed numberof vehicles that are used for collecting the residues. The objective was to maximize the en-ergy returned on energy invested (EROEI), while achieving a minimum daily amount of dryresiduals. There are several random components in the problem, including the availability andmoisture content of the residues as well as the time spent on the road to retrieve the residues.This problem was mathematically formulated as a stochastic capacitated vehicle routing prob-lem (CVRP) with the objective of maximizing the EROEI while taking into consideration thetransportation and drying of the residues. A simulation model of the collection process wasdeveloped and subsequently solved using the QSA approach. The QSA solution to this problem92was compared to that of the mean scenario solution, which was obtained by solving the mean(or average) scenario where all the random model parameters have been replaced by their cor-responding expected values. The median EROEI of the QSA solution (46.8) was found to bestatistically significantly greater than that of the mean scenario (MS) solution (44.1) and resultsin an energy savings of approximately 6 GJ per day with a corresponding 95% confidence in-terval of 5.96−6.11 GJ. In addition, the QSA solution was found to be more precise (i.e. correctexpected value) and accurate (i.e. smaller variance) in procuring the required daily amount of180 dry tons of residues when compared to the MS solution.Our problem was to find a single optimal daily solution for the entire season without im-posing any constraints to ensure that each sawmill is visited with any regularity (e.g. at leastonce a month). This constraint can be easily imposed by formulating the problem at a monthlybasis where different collection schedules may be selected for each day.Bale CollectionIncreasing pressures to improve sustainability and production quality requirements in agri-culture are being observed across the globe. To meet these demands, farming tasks need toincorporate more advanced modeling and planning technologies. An important component ofthe agriculture and biomass supply chains is the collection of bales following harvest opera-tions on the field. In Section 6, we presented a novel two-step approach to solving the balecollection problem (BCP) that minimizes the total travel distance of machinery on the field.The methodologies developed make use of data that the agricultural industry are capable ofcollecting, namely the locations of bales on the field and the global positioning system (GPS)tracking of machines and vehicles used in their collection. This information is utilized by ourapproach to jointly optimize bale collection operations across adjoining fields. The efficiency ofour approach was demonstrated on a simulated study area like those found in real situations.The potential benefits of our approach to solving the BCP are its scalability and computationalefficiency, which enable it to be easily integrated into existing bale collection operations.The first part of our approach is to identify the optimal locations for the roadside storagesites where bales are to be temporarily piled for future transport. This is solved using a novel93k-means approach called constrained cluster analysis (CCA), which restricts the cluster cen-ters to lie on the roadside. This form of the constrained k-means problem has not been lookedat in the optimization literature (to the best of our knowledge). Its solution has the potentialto be widely used in many applications, especially in the broad area of sustainable develop-ment where goods are constantly being collected, transported and stored either temporarily orpermanently.The second part of our approach to solving the BCP is to identify the optimal collectionroutes for bales within each cluster. This problem can be formulated as a CVRP. We solveit using a novel approach, called the min-min min-max optimization algorithm (MMROA),that is both computationally fast and reliable in identifying near optimal route solutions. TheCVRP appears regularly in the optimal management of transport of consumables and products.The monetary and environmental costs associated with transport have an emphatic impacton sustainable production. Combining problem-solving and decision-making strategies suchas those developed in the paper are necessary contributions to developing sustainable costeffective logistics practices in farm management.The methodologies developed are tailored for PA farming management systems, which aregeared towards a low-input, high efficiency, sustainable agriculture that integrate modern tech-nology with logistics planning and scheduling operations. Current bale collection strategies aretypically reliant on operator judgment and experience, which tend to produce inconsistent andsuboptimal solutions. The approach presented is expected to lower the consumption of fueland generation of greenhouse gas emissions during bale collection operations by significantlyreducing the travel distance of machinery and vehicles on the field. Bale collection opera-tions are an important stage of both agriculture and biomass procurement operations. Hence,the developed methodology can benefit both the agricultural and bioenergy sectors. Moreover,GPS-based systems in agriculture are already a reality, which make it possible to use such routeplanning optimization techniques within an everyday context.947.2 Future workThe findings presented in the body of this thesis attest to how real-world operations researchproblems involving complex constraints and uncertainties, may be effectively analyzed using acombination of simulation, statistics, heuristics and a hybrid integration of geospatial, nearestneighbor and clustering machine-learning algorithms. The practical innovations of this worklies in its broad potential for transferability across different types of biomass or agriculturalharvest and procurement operations as well the ability to offer both cost and time savings. Afew areas in which the developed methodology could be further explored and expanded uponinclude:i. Adapting the QSA method to handle purely discrete problems: If the set of scenarios arefinite and countable an optimal solution may not necessarily correspond to a distinctscenario. In such case, a genetic algorithm (GA) may be used to combine pairs of scenariosolutions using mutations and crossover routines [Mitchell, 1996]. The hybrid solutionsformed in this fashion may be used to enrich and augment the initial set of candidatesolutions that are used as input to the QSA method.ii. Investigating the utility of the QSA approach in process control systems: This may beachieved by iterative stage-wise application of the QSA approach, where scenarios aremodeled via Bayesian networks at each stage. Bayesian networks would be used to gen-erate scenarios conditional on i) the observed values of the model parameters and ii) theinferred values of latent variables for monitoring the process.iii. Adapting the QSA method to solve multi-objective optimization problems: When more thanone objective is to be optimized at the same time, there is likely no single solution thatsimultaneously optimizes each objective and there exists a (possibly infinite) number ofPareto optimal solutions. A solution is said to be non-dominated Pareto optimal if noneof the objective functions can be improved in value without degrading some of the otherobjective values. Several methods exist to solve deterministic multi-objective optimiza-tion problems [Ehrgott, 2005]. The applicability of the QSA approach to solve stochasticmulti-objective optimization problems could be investigated in future work.95iv. Adapting the QSA method to take into account the joint distribution of the constraints:Currently, the marginal distribution of the constraints are being considered. If the con-straints are independent of one another, then the joint probability that each constraintis satisfied is simply the product of their marginal probabilities. However, in practice,the constraints may be dependent on one another. For example the probability that thedemand is met may be dependent on availability. The QSA method currently generatessamples from the joint distribution of the constraints. The probability that a solutionjointly satisfies all constraints can be estimated by replacing lines 21–23 of Algorithm 1with the following:Gxi (z1, . . . , zk) =1nn∑j=1 k∏u=1I(ciju ≤ zu) ,which calculates the probability that all constraints are satisfied simultaneously for agiven scenario solution, xi .The proposed extensions to the QSA approach should be applied to other real-life cases toinvestigate its effectiveness in identifying optimal solutions. In the presented work, focus wasprimarily given to the procurement aspect of the biomass supply chain problem. There areseveral ways in which the presented optimization models can be broadened. First, the modelscould be expanded to explicitly model the catalytic, biochemical, and thermo-chemical con-version pathways for the production of biofuels from biomass feedstocks. Second, the modelscould be extended to explicitly model the processes involved in the short and long-term stor-age of biomass prior to conversion, such as the degradation of biomass feedstocks over time.Finally, the models could be adapted to account for the delivery of biofuel products to theend-user. The integration these extensions into a single global optimization problem may beinfeasible, especially when accounting for uncertainty in the model parameters. Instead, itera-tive optimization of each system, where the output from one optimized system is used as inputto the next system to be optimized, may prove more tenable. If the modeled systems includestochastic parameters, then the QSA method can be iteratively used for the optimization ofeach system.96BibliographyAbrahamson, L., Volk, T., Smart, L., and Cameron, K. (2010). Shrub willow biomass producer’shandbook. Technical report, Syracuse: SUNY College of Environmental Science and Forestry.http://www.esf.edu/willow/documents/ProducersHandbook.pdf.Alam, M. B., Pulkki, R., Shahi, C., and Upadhyay, T. (2012). Modeling woody biomass procure-ment for bioenergy production at the atikokan generating station in northwestern ontario,canada. Energies, 5(12):5065–5085.Amiama, C., Bueno, J., Álvarez, C. J., and Pereira, J. M. (2008). Design and field test of an auto-matic data acquisition system in a self-propelled forage harvester. Computers and Electronicsin Agriculture, 61(2):192–200.Anex, R. P., Aden, A., Kazi, F. K., Fortman, J., Swanson, R. M., Wright, M. M., Satrio, J. A.,Brown, R. C., Daugaard, D. E., Platon, A., Kothandaraman, G., Hsu, D. D., and Dutta, A.(2010). Techno-economic comparison of biomass-to-transportation fuels via pyrolysis, gasi-fication, and biochemical pathways. Fuel, 89:S29–S35.Araujo, V. K. W. S., Hamacher, S., and Scavarda, L. F. (2010). Economic assessment of biodieselproduction from waste frying oils. Bioresource Technology, 101(12):4415–4422.Awudu, I. and Zhang, J. (2012). Uncertainties and sustainability concepts in biofuel supplychain management: A review. Renewable and Sustainable Energy Reviews, 16(2):1359–1368.Awudu, I. and Zhang, J. (2013). Stochastic production planning for a biofuel supply chainunder demand and price uncertainties. Applied Energy, 103:189–196.Bastian, C. and Rinnooy Kan, A. H. (1992). The stochastic vehicle routing problem revisited.European Journal of Operational Research, 56(3):407–412.97BC Bioenergy Network and Biomass Availability Study Working Group (2012).Biomass availability study for district heating systems. http://bcbioenergy.ca/wp-content/uploads/2012/%2005/Complete-Biomass-Availability-Study-May-3-2012-Final.pdf.Belgiorno, V., De Feo, G., Della Rocca, C., and Napoli, R. M. (2003). Energy from gasificationof solid wastes. Waste Management, 23(1):1–15.Benjamin, A. and Beasley, J. (2010). Metaheuristics for the waste collection vehicle routingproblem with time windows, driver rest period and multiple disposal facilities. Computers& Operations Research, 37(12):2270–2280.Berhan, E., Beshah, B., Kitaw, D., and Abraham, A. (2014). Stochastic Vehicle Routing Problem:A Literature Survey. Journal of Information & Knowledge Management, 13(03):1450022.Birge, J. R. and Louveaux, F. (2011). Introduction to stochastic programming. Springer.BISYPLAN (2012). The bionergy system planners handbook web-based handbook.http://bisyplan.bioenarea.eu/ html-files-en/03-00.html.Bivand, R. S., Pebesma, E. J., and Gómez-Rubio, V. (2008). Applied spatial data analysis with R,Second edition. Springer.Bochtis, D. D., Dogoulis, P., Busato, P., Sørensen, C. G., Berruto, R., and Gemtos, T. (2013). Aflow-shop problem formulation of biomass handling operations scheduling. Computers andElectronics in Agriculture, 91:49–56.Bonami, P. and Lejeune, M. A. (2009). An exact solution approach for portfolio optimizationproblems under stochastic and integer constraints. Operations research, 57(3):650–670.Braekers, K., Ramaekers, K., and Van Nieuwenhuyse, I. (2016). The vehicle routing problem:State of the art classification and review. Computers & Industrial Engineering, 99:300–313.Bridgwater, A. V. (1994). Catalysis in thermal biomass conversion. Applied Catalysis A, General,116(1-2):5–47.98Briggs, D. (1994). Forest products measurements and conversion factors : with special emphasison the U.S. Pacific Northwest. College of Forest Resources University of Washington, SeattleWash.Buhrkal, K., Larsen, A., and Ropke, S. (2012). The Waste Collection Vehicle Routing Problemwith Time Windows in a City Logistics Context. Procedia - Social and Behavioral Sciences,39:241–254.Callé, F. (2007). The biomass assessment handbook : bioenergy for a sustainable environment.Earthscan, London.Chepuri, K. and Homem-de Mello, T. (2005). Solving the vehicle routing problem with stochas-tic demands using the cross-entropy method. Annals of Operations Research, 134(1):153.Dal-Mas, M., Giarola, S., Zamboni, A., and Bezzo, F. (2011). Strategic design and investment ca-pacity planning of the ethanol supply chain under price uncertainty. Biomass and Bioenergy,35(5):2059–2071.Dembo, R. S. (1991). Scenario optimization. Annals of Operations Research, 30(1):63–80.Dentcheva, D. (2007). Optimization models with probabilistic constraints. Springer London.Dunnett, A. J., Adjiman, C. S., and Shah, N. (2008). A spatially explicit whole-system modelof the lignocellulosic bioethanol supply chain: an assessment of decentralised processingpotential. Biotechnology for biofuels, 1(1):13.Ebadian, M. (2013). Design and scheduling of agricultural biomass supply chain for a cellulosicethanol plant. PhD thesis, University of British Columbia.Ehrgott, M. (2005). Multicriteria Optimization. Springer.Ekşioğlu, S. D., Acharya, A., Leightley, L. E., and Arora, S. (2009). Analyzing the design andmanagement of biomass-to-biorefinery supply chain. Computers and Industrial Engineering,57(4):1342–1352.99Emmons, H. W. and Atreya, A. (1982). The science of wood combustion. Proceedings of theIndian Academy of Sciences Section C: Engineering Sciences, 5(4):259.Eom, J., Schipper, L., and Thompson, L. (2012). We keep on truckin’: Trends in freight energyuse and carbon emissions in 11 IEA countries. Energy Policy, 45:327–341.EPA (2014). Biomass combined heat and power catalog of technologies. Tech-nical report, U.S. Environmental Protection Agency (EPA) Combined Heat andPower Partnership. https://www.epa.gov/sites/production/files/2015-07/documents/biomass_combined_heat_and_power_catalog_of_technologies_v.1.1.pdf.European Commission (2009). Commission takes action to make urban travel greener, betterorganised and more userfriendly. [Online; accessed 04-October-2015].Fontaras, G., Zacharof, N.-G., and Ciuffo, B. (2017). Fuel consumption and CO2 emissions frompassenger cars in Europe âĂŞ Laboratory versus real-world emissions. Progress in Energy andCombustion Science, 60:97–131.Friesen, O. and Thauberger, J. (2017). 640 acres, more or less. Technical report, Prairie Agricul-tural Machinery Institute (Canada). http://pami.ca/pdfs/reports_research_updates/(12h)MiscellaneousTopics/213.PDF.Gandhi, V. (2008). Semivariogram Modeling, pages 1042–1046. Springer US, Boston, MA.Gebreslassie, B. H., Yao, Y., and You, F. (2012). Design under uncertainty of hydrocarbonbiorefinery supply chains: Multiobjective stochastic programming models, decompositionalgorithm, and a comparison between cvar and downside risk. AIChE Journal, 58(7):2155–2179.Gendreau, M., Laporte, G., and Séguin, R. (1995). An exact algorithm for the vehicle routingproblem with stochastic demands and customers. Transportation Science, 29(2):143–155.Giarola, S., Shah, N., and Bezzo, F. (2012). A comprehensive approach to the design of ethanolsupply chains including carbon trading effects. Bioresource Technology, 107:175–185.100Gracia, C., Diezma-Iglesias, B., and Barreiro, P. (2013). A hybrid genetic algorithm for route op-timization in the bale collecting problem. Spanish Journal of Agricultural Research, 11(3):603–614.Grisso, R. D., Cundiff, J. S., and Vaughan, D. H. (2007). Investigating Machinery ManagementParameters with Computer Tools. ASABE Conf, Paper 071030.Grossmann, I. E. and Guillén-Gosálbez, G. (2010). Scope for the application of mathematicalprogramming techniques in the synthesis and planning of sustainable processes. Computersand Chemical Engineering, 34(9):1365–1376.Hahsler, M. and Hornik, K. (2007). Tsp – Infrastructure for the traveling salesperson problem.Journal of Statistical Software, 23(2):1–21.Hartigan, J. A. (1975). Clustering algorithms. John Wiley & Sons, New York.Hollander, M. (2013). Nonparametric statistical methods. John Wiley & Sons, Inc, Hoboken, NewJersey.Hughes, L. and Rudolph, J. (2011). Future world oil production: growth, plateau, or peak?Current Opinion in Environmental Sustainability, 3(4):225–234.Isikgor, F. H. and Becer, C. R. (2015). Lignocellulosic biomass: a sustainable platform for theproduction of bio-based chemicals and polymers. Polym. Chem., 6:4497–4559.Jaffe, E. (2015). How the trucking industry could be vastly more efficient.http://www.citylab.com/tech/2015/01/the-trucking-of-tomorrow-is-here-and-its-a-huge-win-for-city-traffic/384672.Johnson, J. M.-f., Reicosky, D., Archer, D., and Wilhelm, W. (2006). A matter of balance: Con-servation and renewable energy. Journal of soil and water conservation, 61.Jokinen, R., Pettersson, F., and Saxén, H. (2015). An MILP model for optimization of a small-scale LNG supply chain along a coastline. Applied Energy, 138:423–431.101Juan, A. A., Mendez, C. A., Faulin, J., De Armas, J., and Grasman, S. E. (2016). Electric ve-hicles in logistics and transportation: A survey on emerging environmental, strategic, andoperational challenges. Energies, 9(2):1–21.Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to ClusterAnalysis. John Wiley.Kazemzadeh, N. and Hu, G. (2013). Optimization models for biorefinery supply chain networkdesign under uncertainty. Journal of Renewable and Sustainable Energy, 5(5).Kenney, K. L., Hess, J. R., Smith, W. a., and Muth, D. J. (2012). Improving biomass logisticscost within agronomic sustainability constraints and biomass quality targets. The Sun GrantInitiative National Feedstock Production and Utilization.Kerr, R. A. (2012). Technology is turning us oil around but not the world’s. Science,335(6068):522–523.Kim, J., Realff, M. J., and Lee, J. H. (2011). Optimal design and global sensitivity analysisof biomass supply chain networks for biofuels under uncertainty. Computers and ChemicalEngineering, 35(9):1738–1751.Klenke, A., Mattner, L., et al. (2010). Stochastic ordering of classical discrete distributions.Advances in Applied probability, 42(2):392–410.Kolm, P. N., Tütüncü, R., and Fabozzi, F. J. (2014). 60 years of portfolio optimization: Practicalchallenges and current trends. European Journal of Operational Research, 234(2):356–371.Kou, P., Gao, F., and Guan, X. (2015). Stochastic predictive control of battery energy storagefor wind farm dispatching: Using probabilistic wind power forecasts. Renewable Energy,80:286–300.Krajnc, N. (2015). Wood Fuels Handbook. Food and agricultural organization of the UnitedNations.Laporte, G., Louveaux, F., and Mercure, H. (1989). Models and exact solutions for a class ofstochastic location-routing problems. European Journal of Operational Research, 39(1):71–78.102Laporte, G., Louveaux, F. V., and Mercure, H. (1994). A Priori Optimization of the ProbabilisticTraveling Salesman Problem. Operations Research, 42(3):543–549.Leduc, S., Lundgren, J., Franklin, O., and Dotzauer, E. (2010). Location of a biomass basedmethanol production plant: A dynamic problem in northern Sweden. Applied Energy,87(1):68–75.Lee, J. H. (2012). Energy supply chain optimization: A challenge for control engineers? IFACProceedings Volumes (IFAC-PapersOnline), 8(PART 1):361–370.Lee, J. H. (2014). Energy supply planning and supply chain optimization under uncertainty.Journal of Process Control, 24(2):323–331.Lei, K., Zhu, X., Hou, J., and Huang, W. (2014). Decision of multimodal transportation schemebased on swarm intelligence. Mathematical Problems in Engineering, 2014:1–10.Lin, T., Rodríguez, L. F., Shastri, Y. N., Hansen, A. C., and Ting, K. (2014). Integrated strategicand tactical biomass–biofuel supply chain optimization. Bioresource technology, 156:256–266.Loecher, M. and Ropkins, K. (2015). RgoogleMaps and loa: Unleashing R graphics power onmap tiles. Journal of Statistical Software, 63(4):1–18.MacDonald, A. (2009). Assessment of economically accessible biomass. Technical report, Que-bec, Canada: FP Innovations.Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., and Hornik, K. (2015). cluster: Clus-ter Analysis Basics and Extensions. R package version 2.0.3 — For new features, see the’Changelog’ file (in the package source).Markowitz, H. (1952). Harry m. markowitz. Portfolio selection, Journal of Finance, 7(1):77–91.Mata-Pérez, M. and Saucedo-Martínez, J. A. (2015). Optimization of the distribution of steelpipes using a mathematical model. Dyna, 82(191):69–75.MATLAB (2015). version 7.10.0 (R2015b). The MathWorks Inc., Natick, Massachusetts.103McLean, K. and Li, X. (2013). Robust scenario formulations for strategic supply chain opti-mization under uncertainty. Industrial & Engineering Chemistry Research, 52(16):5721–5734.Mercer International Group (2010). BC’s Bioenergy Strategy: Examining Progress and Potentialin the Province. http://www.mercerint.com/i/pdf/BCPowerConferencePresentation.pdf.Milkman, K. L., Chugh, D., and Bazerman, M. H. (2009). How Can Decision Making Be Im-proved? Perspectives on Psychological Science, 4(4):379–383.Mitchell, M. (1996). An Introduction to Genetic Algorithms (Complex Adaptive Systems). The MITPress.Moura, S. J., Fathy, H. K., Callaway, D. S., and Stein, J. L. (2011). A Stochastic Optimal ControlApproach for Power Management in Plug-In Hybrid Electric Vehicles. IEEE Transactions onControl Systems Technology, 19(3):545–555.Newlands, N. K. (2016). Future Sustainable Ecosystems: Complexity, Risk, and Uncertainty (Chap-man & Hall/CRC Applied Environmental Statistics). Chapman and Hall/CRC.Newlands, N. K., Zamar, D. S., Kouadio, L. A., Zhang, Y., Chipanshi, A., Potgieter, A., Toure, S.,and Hill, H. S. J. (2014). An integrated, probabilistic model for improved seasonal forecast-ing of agricultural crop yield under environmental uncertainty. Frontiers in EnvironmentalScience, 2:17.Nuortio, T., Kytojoki, J., Niska, H., and Braysy, O. (2006). Improved route planning andscheduling of waste collection and transport. Expert Systems with Applications, 30(2):223–232.Orfanou, A., Busato, P., Bochtis, D. D., Edwards, G., Pavlou, D., Sørensen, C. G., and Berruto,R. (2013). Scheduling for machinery fleets in biomass multiple-field operations. Computersand Electronics in Agriculture, 94(September 2015):12–19.Osmani, A. and Zhang, J. (2014). Optimal grid design and logistic planning for wind andbiomass based renewable electricity supply chains under uncertainties. Energy, 70:514–528.104Osorio, C. and Bierlaire, M. (2013). A simulation-based optimization framework for urbantransportation problems. Operations Research, 61(6):1333–1345.Pedrag, M. and Miomir, J. (2011). The advanced system for dynamic vehicle routing in theprocess of waste collection. Mechanical Engineering, 9(1):127–136.Perrot, P. (1998). A to Z of Thermodynamics. Supplementary Series; 27. Oxford University Press.R Core Team (2016a). R: A Language and Environment for Statistical Computing. R Foundationfor Statistical Computing, Vienna, Austria.R Core Team (2016b). R: A Language and Environment for Statistical Computing. R Foundationfor Statistical Computing, Vienna, Austria.Rauch, P. and Gronalt, M. (2010). The terminal location problem in the forest fuels supplynetwork. International Journal of Forest Engineering, 21(2):32–40.Ravindran, A. R. (2007). Operations research and management science handbook. CRC Press.Rego, C. (2006). Operations Research/Computer Science Interfaces Series: Metaheuristic Optimiza-tion via Memory and Evolution : Tabu Search and Scatter Search. Springer US.Ribeiro, C. C., Rosseti, I., and Souza, R. C. (2011). Effective probabilistic stopping rules for ran-domized metaheuristics: Grasp implementations. In Learning and Intelligent Optimization,pages 146–160. Springer.Roise, J. P., Catts, G., Hazel, D., Hobbs, A., and Hopkins, C. (2013). Balancing biomass har-vesting and drying tactics with delivered payment practice. Technical report, TechnicalReport-Not held in TRLN member libraries,) Greenville, SC: US Endowment for Forestryand Community.Ryan, J. A. (2011). Quantmod: Quantitative financial modelling framework. R package version0.3-17.Saidur, R., Abdelaziz, E., Demirbas, A., Hossain, M., and Mekhilef, S. (2011). A review onbiomass as a fuel for boilers. Renewable and Sustainable Energy Reviews, 15(5):2262–2289.105Schaefer, J., Opgen-Rhein, R., and Strimmer, K. (2010). Corpcor: Efficient estimation of covari-ance and (partial) correlation. R package version 1.5, 7.Scutari, M. (2009). Learning bayesian networks with the bnlearn R package. arXiv preprintarXiv:0908.3817.Shabani, N. and Sowlati, T. (2015). Evaluating the impact of uncertainty and variability onthe value chain optimization of a forest biomass power plant using Monte Carlo Simulation.International Journal of Green Energy, 5075(October):150129124349000.Shah, N. (2005). Process industry supply chains: Advances and challenges. Computers andChemical Engineering, 29(6 SPEC. ISS.):1225–1235.Shao, S., Guan, W., Ran, B., He, Z., and Bi, J. (2017). Electric Vehicle Routing Problem withCharging Time and Variable Travel Time. Mathematical Problems in Engineering, 2017:1–13.Sharma, B., Ingalls, R. G., Jones, C. L., Huhnke, R. L., and Khanchi, A. (2013). Scenario opti-mization modeling approach for design and management of biomass-to-biorefinery supplychain system. Bioresource Technology, 150:163–171.Shmulevich, I. and Dougherty, E. R. (2010). Probabilistic Boolean networks: the modeling andcontrol of gene regulatory networks. siam.Smith, S. L., Pavone, M., Bullo, F., and Frazzoli, E. (2010). Dynamic Vehicle Routing with Pri-ority Classes of Stochastic Demands. SIAM Journal on Control and Optimization, 48(5):3224–3245.Sokhansanj, S., Mani, S., Turhollow, A., Kumar, A., Bransby, D., Lynd, L., and Laser, M. (2009).Large-scale production, harvest and logistics of switchgrass (Panicum virgatum l.) - cur-rent technology and envisioning a mature technology. Biofuels, Bioproducts and Biorefining,3(2):124–141.Sørensen, C. G. and Bochtis, D. D. (2010). Conceptual model of fleet management in agricul-ture. Biosystems Engineering, 105(1):41–50.106Sosa, A., Acuna, M., McDonnell, K., and Devlin, G. (2015). Controlling moisture content andtruck configurations to model and optimise biomass supply chain logistics in Ireland. Ap-plied Energy, 137:338–351.Statistics Canada (2011). Households and the Environment: Energy Use. Retrieved on Decem-ber 21, 2017 from http://www.statcan.gc.ca/pub/11-526-s/11-526-s2013002-eng.pdf.Stewart, W. R. and Golden, B. L. (1983). Stochastic vehicle routing: a comprehensive approach.European Journal of Operational Research, 14(4):371–385.Stoyanov, S. V. (2010). A moment expansion of downside risk measures. Technical report,EDHEC-Risk Institute Publication.Sumalee, A., Uchida, K., and Lam, W. (2011). Stochastic multi-modal transport network un-der demand uncertainties and adverse weather condition. Transportation Research Part C:Emerging Technologies, 19:338–350.Szeto, W. Y., Wu, Y., and Ho, S. C. (2011). An artificial bee colony algorithm for the capacitatedvehicle routing problem. European Journal of Operational Research, 215(1):126–135.Theussl, S. and Hornik, K. (2016). Rglpk: R/GNU Linear Programming Kit Interface. R packageversion 0.6-2.Thomas, D. (2002). Seaweeds (Life). The Natural History Museum.United States Environmental Protection Agency (2015). Sources of greenhouse gas emissions.www3.epa.gov/climatechange/ghgemissions/sources/transportation.html.U.S. Deptartment of Energy (2012). Feedstock supply and logistics: Biomass as a commod-ity. Technical report, Energy Efficiency & Renewable Energy, Bioenergy Technologies Office.https://www1.eere.energy.gov/bioenergy/pdfs/feedstocks_four_pager.pdf.U.S. Deptartment of Energy (2014). Thermochemical conversion: Using heat and catalysis tomake biofuels and bioproducts. Technical report, BioEnergy Technologies Office. https://www.energy.gov/sites/prod/files/2014/04/f14/thermochemical_four_pager.pdf.107U.S. Energy Information Administration (2016). International energy outlook 2016. https://www.eia.gov/outlooks/ieo/pdf/0484(2016).pdf.U.S. Environmental Protection Agency (2012). SAB review of EPA’s accounting framework forbiogenic CO2 emissions from stationary sources. EPA-SAB-12-011.Uslu, A., Faaij, A. P., and Bergman, P. (2008). Pre-treatment technologies, and their effect oninternational bioenergy supply chain logistics. techno-economic evaluation of torrefaction,fast pyrolysis and pelletisation. Energy, 33(8):1206–1223.Varadhan, R. and Gilbert, P. (2009). BB: An R package for solving a large system of nonlinearequations and for optimizing a high-dimensional nonlinear objective function. Journal ofStatistical Software, 32(4):1–26.Verma, M., Godbout, S., Brar, S. K., Solomatnikova, O., Lemay, S. P., and Larouche, J. P. (2012).Biofuels production from biomass by thermochemical conversion technologies. InternationalJournal of Chemical Engineering, 2012.Weingessel, A. (2013). quadprog: Functions to solve Quadratic Programming Problems. R packageversion 1.5-5.Williams, G. D., Jofriet, J. C., and Rosentrater, K. A. (2008). Biomass Storage andHandling: Status and Industry Needs. http://facilityengserv.com/WhitePapers/BiomassStorageandHandling.pdf.Yagcitekin, B. and Uzunoglu, M. (2016). A double-layer smart charging strategy of electricvehicles taking routing and charge scheduling into account. Applied Energy, 167:407–419.Yu, S. P. and Li, Y. P. (2012). An improved ant colony optimization for vrp with time windows.Applied Mechanics and Materials, 263-266:1609.Yue, D., You, F., and Snyder, S. W. (2014). Biomass-to-bioenergy and biofuel supply chain opti-mization: Overview, key issues and challenges. Computers & Chemical Engineering, 66:36–56.108Zamar, D. S., Gopaluni, B., and Sokhansanj, S. (2017a). A constrained k-means and nearestneighbor approach for route optimization: With an application to the bale collection prob-lem. In The 20th World Congress of the International Federation of Automatic Control, Toulouse,France. The International Federation of Automatic Control.Zamar, D. S., Gopaluni, B., and Sokhansanj, S. (2017b). Optimization of sawmill residuescollection for bioenergy production. Applied Energy, 202:487–495.Zamar, D. S., Gopaluni, B., Sokhansanj, S., and Ebadian, M. (2016). Economic Optimization ofSawmill Residues Collection for Bioenergy Conversion. IFAC-PapersOnLine, 49(7):857–862.Zamar, D. S., Gopaluni, B., Sokhansanj, S., and Newlands, N. K. (2015). Robust Optimizationof Competing Biomass Supply Chains Under Feedstock Uncertainty. In 9th InternationalSymposium on Advanced Control of Chemical Processes, pages 1223–1228, Whistler, BritishColumbia. The International Federation of Automatic Control.Zamar, D. S., Gopaluni, B., Sokhansanj, S., and Newlands, N. K. (2017c). A quantile-based sce-nario analysis approach to biomass supply chain optimization under uncertainty. Computers& Chemical Engineering, 97:114–123.Zhang, J., Lam, W. H. K., and Chen, B. Y. (2013). A Stochastic Vehicle Routing Problem withTravel Time Uncertainty: Trade-Off Between Cost and Customer Service. Networks and Spa-tial Economics, 13(4):471–496.109Appendix AQuantiles and their PropertiesSummary statistics such as the median and mean are measures of central tendency. The medianis the middle position of a set of numbers, such that half of the data have values less than themedian. For example, in the set of numbers {5,7,62,47,30,16,36,22,82,70,98}, the median isobtained by first rearranging the numbers in order of magnitude (smallest first):5,7,16,22,30,36,47,62,70,82,98and selecting the middle value. In this example, the median is 36 (shown in bold). If there arean even number of numbers, then the median is the average of the middle two numbers.The mean is simply the arithmetic average of the data points. In the previous example, themean is (5 + 7 + 62 + 47 + 30 + 16 + 36 + 22 + 82 + 70 + 98)/11 = 43.2. A disadvantage of the mean,however, is that it is particularly susceptible to the influence of outliers. Outliers are valuesthat are unusual compared to the rest of the data, because they are much smaller or larger innumerical value. The median is much less affected by outliers. In the above example, if thevalue 98 were changed to 980, the median would still be 36, but the mean would increase to123.4, which is a misleading summary of the observed values.The concept of the median can be generalized. One way to do this is to consider quantiles.The qth quantile is the value x, such that q% of the data is less than x. For instance, the 0.90quantile is the value x, such that 90% percent of the data have values less than x. In the aboveexample, the 0.90 quantile is 82. Note, the median corresponds to the 0.50 quantile.Quantiles may be used to appropriately summarize non-normal or asymmetric distribu-tions. For example, if I were to tell you the mean and standard deviation of tree heights ina given region, you would not know if the tree height distribution was bell-shaped, barbell110shaped, or had some other form. This is because probability distributions can share the samemean and standard deviation, but have very different shapes. On the other hand, if I were totell you the 0.10, 0.25, 0.5, 0.75, and 0.90 sample quantiles, you would have a much betterdepiction of the tree height distribution.111Appendix BSawmillsTable B.1: Average annual sawmill residues produced by site [BC Bioenergy Network andBiomass Availability Study Working Group, 2012] and travel distances from the Mission depot[Loecher and Ropkins, 2015].ID Sawmill name Distance to depot (km) Residues (dry ton/year)S1 S&R Sawmills Ltd. 40.3 75000S2 Interfor-Hammond Mill 31.7 60000S3 J.S. Jones-Small Log Sawmill 43.2 59000S4 Aspen Planers Ltd. 54.1 42500S5 Terminal Forest Products Ltd. 81.6 30000S6 Goldwood Industries Ltd. 81.2 25000S7 CIPA Lumber Co Ltd. 67.9 20000S8 Delta Cedar Sawmill LP 64.3 18750S9 Halo Sawmill Ltd. 38.9 18500S10 Mill & Timber Products Ltd. 53.0 18500S11 Richmond Plywood Corporation Ltd. 79.5 12000S12 Weyerhaeuser Vancouver Plant 68.6 66500S13 Anbrook Industries Ltd. 38.6 2506S14 Best Quality Cedar Products Ltd. 12.7 5199S15 Coast Mountain Cedar Products Ltd. 5.7 3853S16 G & R Cedar Ltd. 11.2 3142S17 Imperial Shake Co Ltd. 24.0 5386S18 J & D Shake & Cedar Mill Ltd. 4.6 2880S19 Premium Cedar Products Ltd. 12.9 4526S20 S & W Shake & Shingle Ltd. 12.8 4488S21 Serpentine Cedar Ltd. 28.6 2169S22 Silver Creek Premium Products 7.0 12568S23 Silvermere Forest Prod Ltd. 12.5 1347S24 Teal Cedar Prod (1977) Ltd. 43.2 12568S25 Waldun Forest Products Ltd. 12.2 7369112Appendix CR-code for the Bale Collection StudyC.1 Model the study area and the bale locationsl i b r a r y ( s p a t s t a t )l i b r a r y ( ggplot2 )l i b r a r y ( cowplot )# Input Parameters :# 1 . s e c t i o n _ width : width o f t h e s tudy r e g i o n in meter s# 2 . s e c t i o n _ h e i g h t : l e n g t h o f t h e s tudy r e g i o n in meter s# 3 . num_ rows : s tudy r e g i o n i s a g r i d o f farms num_ rows by num_ rows# 4 . y i e l d : y i e l d in u n i t s o f kg per h e c t a r e# 5 . b a l e _mass : mass o f each b a l e in u n i t s o f kg# 6 . windrow : t h e width o f each row o f wheat t o be b a l e d in u n i t s o fmete r ss e c t i o n _width = 800 # mete r ss e c t i o n _ height = 800 # mete r s# number o f rows in t h e reg ion , which i s r e p r e s e n t e d as a square g r i dnum_rows = 3lon = c ( 0 ,num_rows * s e c t i o n _width ) # domain ’ s l o n g i t u d el a t = c ( 0 ,num_rows * s e c t i o n _ height ) # domain ’ s l a t i t u d ey i e l d = 2000 # kg / habale _mass = 400 # kg / b a l ewindrow = 2.5 # mete r s# area per windrow ( in mete r s square ) c o n v e r t e d t o hawindrow_ area = windrow * d i f f ( l a t ) / 10000# b a l e s per habales _per_ha = y i e l d / bale _mass# b a l e s per windrow d i v i d e d by windrow area in meter slambda = ( windrow_ area * bales _per_ha ) / ( d i f f ( l a t ) *windrow )lon . vec = seq ( lon [ 1 ] , lon [ 2 ] , by=windrow )113lon . win = cbind ( lon . vec [− length ( lon . vec ) ] , lon . vec [ −1] )lon . win = s p l i t ( lon . win , row ( lon . win ) )l a t . win = matrix ( rep ( l a t , t imes=length ( lon . win ) ) , ncol =2 ,byrow=TRUE)l a t . win = s p l i t ( l a t . win , row ( l a t . win ) )sim = l i s t ( )for ( i in 1 : length ( lon . win ) ) {sim [ [ i ] ] = rpoispp ( lambda , win=as . owin ( l i s t ( xrange=lon . win [ [ i ] ] ,yrange= l a t . win [ [ i ] ] ) ) )}ba les _ dat = c ( )for ( i in 1 : length ( sim ) ) {# g e n e r a t e b a l e l o c a t i o n s a c c o r d i n g t o a p o i s s o n p r o c e s sbales _ dat = rbind ( ba les _dat , cbind ( sim [ [ i ] ] $x , sim [ [ i ] ] $y , rnorm ( length (sim [ [ i ] ] $x ) , bale _mass , bale _mass / 25) ) )}ba les _ dat = as . data . frame ( ba les _ dat )colnames ( ba les _ dat ) = c ( " x " , " y " , " mass " )g0 = ggplot ( ba les _dat , aes ( x , y ) ) +geom_ point ( s i z e =1.0) +s c a l e _x_ continuous ( breaks=seq ( lon [ 1 ] , lon [ 2 ] , by=800) ) +s c a l e _y_ continuous ( breaks=seq ( l a t [ 1 ] , l a t [ 2 ] , by=800) ) +geom_ hl ine ( y i n t e r c e p t=c ( seq ( 0 , s e c t i o n _width *num_rows , by=s e c t i o n _width) ) ) +geom_ v l i n e ( x i n t e r c e p t=c ( seq ( 0 , s e c t i o n _width *num_rows , by=s e c t i o n _width) ) ) +xlab ( " width (m) " ) + ylab ( " length (m) " )dim ( ba les _ dat )# p l o t t h e s i m u l a t e d b a l e splot ( g0 )C.2 Identification of the optimal number of cluster centersl i b r a r y ( dfoptim )# f u n c t i o n t h a t a s s i g n s p o i n t s t o t h e c l u s t e r c e n t e r sassignment _ step = function ( x , y ) {d_ l i s t = l i s t ( )114for ( i in 1 : length ( x ) ) {d_ l i s t [ [ i ] ] = sqrt ( apply ( sweep ( ba les _ dat [ , 1 : 2 ] ,MARGIN=2 , c ( x [ i ] , y [ i ] ) ,FUN="−" ) ^2 ,1 ,sum ) )}d_mat <− data . frame ( matrix ( u n l i s t ( d_ l i s t ) , nrow=nrow ( ba les _ dat ) ,byrow=F ) )d_ res = apply ( d_mat , 1 , sort , index . return=TRUE)group = t ( sapply ( d_ res , " [ [ " , " i x " ) ) [ , 1 ]return ( group )}fpenal ty = function ( x , d ) { t t = ( x / d ) %% 1 ; apply ( cbind ( t t ,1− t t ) ,1 ,min) }# f u n c t i o n t h a t computes new c l u s t e r c e n t e r supdate_ step = function ( dat , k , i n i t ) {f1 = function ( x ) {mat = cbind ( x [ 1 : k ] , x [ ( k+1) : ( 2 *k ) ] )pen = apply ( mat , 1 , fpenalty , s e c t i o n _width )pen = apply ( pen , 2 , ’ prod ’ )acc = 0xx = x [ df1$group ]yy = x [ df1$group+k ]sqrt (sum ( ( df1$x − xx ) ^2 + ( df1$y − yy ) ^2) ) + 10^8*sum( pen )}res = hjkb ( i n i t , f1 , lower=c ( rep ( lon [ 1 ] , k ) , rep ( l a t [ 1 ] , k ) ) , upper=c (rep ( lon [ 2 ] , k ) , rep ( l a t [ 2 ] , k ) ) )mat = cbind ( re s $par [ 1 : k ] , r e s $par [ ( k+1) : ( 2 *k ) ] )return ( re s )}k = 10 # maximum number o f r o a d s i d e l o c a t i o n s ( c l u s t e r s ) t o c o n s i d e rprint ( " I d e n t i f y Optimal Number of Clus ters : " )ptm <− proc . time ( )df_ l i s t = l i s t ( )115# c r e a t e p r o g r e s s barpb <− tx tProgressBar (min = 0 , max = k , s t y l e = 3)# i d e n t i f y t h e op t ima l number o f c l u s t e r s t o usefor ( i in 1 : k ) {df_ l i s t [ [ i ] ] = l i s t ( )kmeans_ res = kmeans ( ba les _ dat [ , 1 : 2 ] , i , n s t a r t = 10)x = kmeans_ res $ center [ , " x " ] # i n i t i a l c l u s t e r c e n t e r s ( x c o o r d i n a t e )y = kmeans_ res $ center [ , " y " ] # i n i t i a l c l u s t e r c e n t e r s ( y c o o r d i n a t e )group_old = kmeans_ res $ c l u s t e r # i n i t i a l groups o b t a i n e d v i a kmeansdf1 = data . frame ( cbind ( ba les _dat , group=group_old ) )# update t h e c l u s t e r c e n t e r supdate_ c e n t e r s = update_ step ( df1 , i , i n i t =c ( x , y ) )value _old = update_ c e n t e r s $ values o l = update_ c e n t e r s $parx=s o l [ 1 : i ]y=s o l [ ( i +1) : ( 2 * i ) ]repeat { # u n t i l c o n v e r g e n c egroup_new = assignment _ step ( x , y ) # a s s i g n p o i n t s t o c l u s t e r c e n t e r s (update memberships )df1 = data . frame ( cbind ( ba les _dat , group=group_new ) )update_ c e n t e r s _new = update_ step ( df1 , i , i n i t =c ( x , y ) ) # update c l u s t e rc e n t e r svalue _new = update_ c e n t e r s _new$ values o l _new = update_ c e n t e r s _new$parx=s o l _new [ 1 : i ]y=s o l _new [ ( i +1) : ( 2 * i ) ]i f ( abs ( value _new−value _old ) / value _old < 0.0001) {value _old = value _newgroup_old = group_newbreak} e l s e {value _old = value _newgroup_old = group_new}}group_new = assignment _ step ( x , y ) # a s s i g n p o i n t s t o c l u s t e r sdf1 = data . frame ( cbind ( ba les _dat , group=group_new ) )df1$group = f a c t o r ( df1$group )116df3 = data . frame ( cbind ( x=x , y=y , shape =1: i ) )df3$shape = f a c t o r ( df3$shape )df_ l i s t [ [ i ] ] [ [ " i " ] ] = idf_ l i s t [ [ i ] ] [ [ " df1 " ] ] = df1df_ l i s t [ [ i ] ] [ [ " df3 " ] ] = df3df_ l i s t [ [ i ] ] [ [ " group " ] ] = group_newdf_ l i s t [ [ i ] ] [ [ " wss " ] ] = value _newse tTxtProgressBar ( pb , i )}close ( pb )proc . time ( ) − ptmelbow_ dat = as . data . frame ( cbind ( 1 : k , u n l i s t ( lapply ( df_ l i s t , " [ [ " , " wss " )) ) )colnames ( elbow_ dat ) = c ( " c l u s t e r s " , " wss " )g5 = ggplot ( data=elbow_dat , aes ( x=c l u s t e r s , y=wss ) ) +geom_ l i n e ( s i z e =1.2 , c o l o r=" darkgrey " ) +geom_ point ( s i z e =2 , c o l o r=" blue " )+s c a l e _x_ continuous ( breaks =1:k ) +ylab ( " t o t a l within− c l u s t e r s sum of squares " ) +xlab ( "number of c l u s t e r s " ) +geom_ v l i n e ( x i n t e r c e p t =5 , l i n e t y p e=" dashed " )ggsave ( fi lename=paste ( " f i g u r e s / Fig5 . pdf " , sep=" " ) , g5 , width =5 , height=5)# opt ima l number o f c l u s t e r s i s 5df1 = df_ l i s t [ [ 5 ] ] [ [ " df1 " ] ]df3 = df_ l i s t [ [ 5 ] ] [ [ " df3 " ] ]lapply ( ( lapply ( df_ l i s t , " [ [ " , " group " ) ) , table )g = ggplot ( data=df1 , aes ( x=x , y=y ) ) + geom_ point ( aes ( c o l o r=group ) , s i z e=1) +labs ( c o l o r=" Cluster " ) +geom_ point ( data=df3 , aes ( x=x , y=y ) , shape =15 , s i z e =2.5 , show . legend=FALSE) +s c a l e _x_ continuous ( breaks=seq ( lon [ 1 ] , lon [ 2 ] , by=800) ) +s c a l e _y_ continuous ( breaks=seq ( l a t [ 1 ] , l a t [ 2 ] , by=800) ) +geom_ hl ine ( y i n t e r c e p t=c ( seq ( 0 , s e c t i o n _width *num_rows , by=s e c t i o n _width) ) ) +geom_ v l i n e ( x i n t e r c e p t=c ( seq ( 0 , s e c t i o n _width *num_rows , by=s e c t i o n _width) ) ) +xlab ( " width (m) " ) + ylab ( " length (m) " )g2 = g117C.3 Identification of the optimal routes within each clusterl i b r a r y ( grid )l i b r a r y ( plyr )ptm <− proc . time ( )# s p e c i f y t h e c a p a c i t y o f t h e b a l e rcap = 45 # {8 , 15 , 35 , 45}l i b r a r y ( "RANN" )route _ l i s t = l i s t ( )df1 = cbind ( df1 , ID=1:nrow ( df1 ) )df2 = rbind . f i l l ( df1 , cbind ( df3 [ , 1 : 2 ] , ID=(nrow ( df1 ) +1) : ( nrow ( df1 )+nrow( df3 ) ) ) )t o t a l _ length = 0num_perms = 100# c r e a t e p r o g r e s s barpb <− tx tProgressBar (min = 0 , max = length ( unique ( df1$group ) ) , s t y l e= 3)# i d e n t i f y t h e op t ima l r o u t e s f o r each c l u s t e r ( t h i s t a k e s a fewminutes )for ( i in as . integer ( s o r t ( unique ( df1$group ) ) ) ) {s o l u t i o n = NULLs o l u t i o n _ length = NA# r e p e a t s e v e r a l t i me s and p i c k t h e b e s t r o u t e ( randomly s e l e c t i n gbetween t h e most and and l e a s t i s o l a t e d b a l e s )for ( perm in 1 :num_perms ) {df = subset ( df1 , group==i )nn_ res = nn2 ( data=df [ , 1 : 2 ] , k=min ( nrow ( df ) , cap ) )route _ l i s t [ [ i ] ] = l i s t ( )s o l _ length = 0j = 1while ( nrow ( df ) > 0) {# count number o f t i me s each b a l e appear s as a n e a r e s t ne ighbour118nn_ counts = s o r t ( table ( nn_ res $nn . idx ) )bale = NULL# randomly s e l e c t between t h e most i s o l a t e d and l e a s t i s o l a t e d b a l e si f ( length ( nn_ counts ) > 1){tmp = prop . table ( table ( c ( rbinom ( 1 , length ( nn_ counts ) −1 ,1 / ( length ( nn_counts ) −1) ) +1 ,rbinom ( 1 , length ( nn_ counts ) −1,1−1 / ( length ( nn_ counts )−1) ) +1) ) )tmp = tmp /sum( tmp )bale = as . numeric ( sample ( names ( tmp ) ,1 , prob=tmp ) )} e l s e {bale = names ( nn_ counts ) [ 1 ]}# i d e n t i f y n e a r e s t ne ighbour c l u s t e r s t h a t i n c l u d e t h e chosen b a l ebale _ routes _ i x = which ( apply ( nn_ res $nn . idx , 1 , function ( r ) any ( r ==bale ) ) )bale _ routes = nn_ res $nn . idx [ bale _ routes _ ix , ]tmp = l i s t ( )# i d e n t i f y b e s t r o u t e f o r each c l u s t e rfor (m in 1 : length ( bale _ routes _ i x ) ) {i f ( length ( bale _ routes _ i x ) > 1) {x = df [ bale _ routes [m, ] , 1 : 2 ]x = rbind ( x , df2 [ nrow ( df1 )+i , 1 : 2 ] )} e l s e {x = df [ bale _ routes , 1 : 2 ]x = rbind ( x , df2 [ nrow ( df1 )+i , 1 : 2 ] )}etsp <− ETSP ( x )tour <− solve _TSP ( etsp )tmp [ [m] ] = l i s t ( tour = tour , dat = x )}tours = lapply ( tmp , " [ [ " , " tour " )bes t _ i x = s o r t ( sapply ( tours , " a t t r " , " tour _ length " ) , index . return = TRUE) $ i x [ 1 ]new_ route = as . integer ( rownames ( tmp [ [ bes t _ i x ] ] $ dat [ as . integer ( tmp [ [bes t _ i x ] ] $ tour ) , ] ) )route _ l i s t [ [ i ] ] [ [ j ] ] = new_ routes o l _ length = s o l _ length + a t t r ( tmp [ [ bes t _ i x ] ] $ tour , " tour _ length " )119df = subset ( df , ! ID %in% new_ route )i f ( nrow ( df ) >0) {nn_ res = nn2 ( data=df [ , 1 : 2 ] , k=min ( nrow ( df ) , cap ) )}j = j +1}i f ( i s . na ( s o l u t i o n _ length ) | | s o l _ length < s o l u t i o n _ length ) {s o l u t i o n = route _ l i s ts o l u t i o n _ length = s o l _ length}}t o t a l _ length = t o t a l _ length + s o l u t i o n _ lengthse tTxtProgressBar ( pb , i )}close ( pb )proc . time ( ) − ptmroute _ l i s t = s o l u t i o ns o l _ length = t o t a l _ lengthgg_ c o l o r _hue <− function ( n ) {hues = seq (15 , 375 , length = n + 1)hcl ( h = hues , l = 65 , c = 100) [ 1 : n ]}c o l o r = gg_ c o l o r _hue ( length ( route _ l i s t ) )c o l o r _ dat = data . frame ( cbind ( colour=color , group =1: length ( route _ l i s t ) ))c o l o r _ dat $ colour = as . character ( c o l o r _ dat $ colour )for ( i in 1 : length ( route _ l i s t ) ) {for ( b in 1 : length ( route _ l i s t [ [ i ] ] ) ) {g2 = g2 + geom_path ( aes ( x=x , y=y ) , c o l o r=c o l o r _ dat [ i , " colour " ] , data=df2 [ c ( route _ l i s t [ [ i ] ] [ [ b ] ] , route _ l i s t [ [ i ] ] [ [ b ] ] [ 1 ] ) , 1 : 2 ] )}}g2 = g2 + geom_ point ( data=df3 , aes ( x=x , y=y ) , shape =15 , s i z e =2.5 , show .legend=FALSE )120g1 = gpar ( ask=TRUE)# g e n e r a t e p l o t sg3 <− lapply ( l i s t ( g0 , g1 , g2 ) , ggplotGrob )widths <− do . c a l l ( unit . pmax , lapply ( g3 , " [ [ " , " widths " ) )he ights <− do . c a l l ( unit . pmax , lapply ( g3 , " [ [ " , " he ights " ) )lg <− lapply ( g3 , function ( g ) { g$ widths <− widths ; g$ heights <−heights ; g } )grid . newpage ( )grid . draw ( lg [ [ 1 ] ] )grid . newpage ( )grid . draw ( lg [ [ 2 ] ] )grid . newpage ( )grid . draw ( lg [ [ 3 ] ] )ggsave ( fi lename=" f i g u r e s / Fig1 . pdf " , lg [ [ 1 ] ] )ggsave ( fi lename=" f i g u r e s / Fig2 . pdf " , lg [ [ 2 ] ] )ggsave ( fi lename=paste ( " f i g u r e s / Fig3 _ " , cap , " . pdf " , sep=" " ) , lg [ [ 3 ] ] )n_ routes = 0for ( i in as . integer ( s o r t ( unique ( df1$group ) ) ) ) {n_ routes = n_ routes + length ( route _ l i s t [ [ i ] ] )}save . image ( paste ( " sim_ " , cap , " _ bales . RData " , sep=" " ) )121

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0363012/manifest

Comment

Related Items