12th International Conference on Applications of Statistics and Probability in Civil Engineering, ICASP12
Vancouver, Canada, July 12-15, 2015

Applications of Dynamic Trees to Sensitivity Analysis

William Becker
Research Associate, Econometrics and Applied Statistics Unit, European Commission - Joint Research Centre, Ispra, Italy

ABSTRACT: A recent approach to surrogate modelling, called dynamic trees, uses regression trees to partition the input space, and fits simple constant or linear models in each “leaf” (region of the input space). This article aims to investigate the applicability of dynamic trees in sensitivity analysis, in particular on high dimensional problems at low sample size, to see whether they can be applied to dimensionalities usually out of the range of surrogate models. Comparisons are made with Gaussian processes, as well as three measures based on a radial sampling scheme: the Monte Carlo estimator of the total sensitivity index, an elementary effects measure, and a derivative-based sensitivity measure. The results show that the radial sampling measures generally outperform the surrogate models tested here, with the exception of response surfaces that feature discontinuities.

Uncertainty analysis (UA) and sensitivity analysis (SA) are now widely acknowledged as essential components of model-based engineering design and risk analysis. However, there are still many practical difficulties associated with accurately quantifying and propagating uncertainties through a model. Not the least of these problems is that of computational expense: given that most complex models cannot be represented in a closed form, UA and SA must be performed by sampling the model by running it a number of times at different values of its input variables.
If it is possible to run the model a fairly large number of times, the Monte Carlo method can be used to estimate the variance of the model output and measures of sensitivity to a reasonable degree of precision.

Unfortunately, complex models may often take hours or even days to run for a single set of input variable values; in such cases the Monte Carlo method cannot provide accurate estimates. Surrogate models, known variously as “emulators” or “metamodels”, have become widely used tools that aim to overcome this problem by building a statistical approximation of the model based on a small number of model runs. The surrogate model can then be used to estimate quantities of interest via the Monte Carlo method (since the surrogate model can be run in a very small amount of time), or using analytical integration if it is sufficiently tractable.

Data modelling approaches may also be used in the setting of “given data”, i.e. when the available data points are arbitrarily placed and cannot be positioned according to a Monte Carlo design, for example. This occurs in the analysis of composite indicators, in which data is available for a fixed number of entities, such as countries, regions or universities, and no further points can be added. In such cases, nonlinear regression, such as local linear regression or penalised splines, can be used to estimate first-order sensitivity by regressing against each variable in turn (Paruolo et al., 2013). In general, however, interactions between variables may be significant and it is not sufficient to know only first-order sensitivity indices. Multidimensional surrogate models have the capability to estimate the effects of higher order interactions and can estimate the total sensitivity indices.

Surrogate models have at least two major drawbacks. First, they introduce further uncertainty in the approximation of the model by its surrogate, although this is acceptable if the uncertainty is sufficiently small.
Second, they tend to scale poorly with dimensionality: as the number of input variables grows, the number of data points required to accurately fit the surrogate can grow beyond a feasible limit.

Given these caveats, there is a substantial amount of research effort devoted to building surrogate models that can emulate possibly strongly nonlinear and nonstationary model responses, at high dimensionalities, for as few model runs as possible. Examples of such emulators include Gaussian processes (Oakley and O’Hagan, 2004), polynomial chaos expansions (Sudret, 2008), and high dimensional model representations (Rabitz and Aliş, 1999). A comparison of some of these methods in the context of sensitivity analysis can be found in Storlie and Helton (2008).

One class of surrogate model that has the potential to be applicable in high dimensions, and can demonstrate considerable flexibility, is the use of regression trees (RTs). RTs take the approach of dividing the input space of the model into a number of complementary regions, known as “leaves”, such that each leaf has a regression model associated with it which approximates a particular region of the input space. The advantage of this approach is that a number of simple, stationary (possibly linear or even constant) regression models can be combined to make a surrogate model that, globally, is able to handle nonlinear and nonstationary model responses, even with discontinuities (Becker et al., 2013).

A Bayesian approach to building RTs was developed by Chipman et al. (1998), which builds a posterior distribution over trees from a prior distribution conditioned on training data (model runs). Inferences can then be made by averaging over a large number of possible tree models.
This concept was extended by Gramacy and Lee (2008) by using Gaussian processes as the regression model at each leaf. Most recently, the idea of “dynamic trees” (DTs) was developed by Gramacy and Polson (2011), which uses a particle learning algorithm to allow sequential updating of the RT. Additionally, an approach for variable selection was proposed based on the variance reduction due to tree nodes using each variable. These approaches are implemented in the R packages tgp (Gramacy, 2007) and dynaTree (Gramacy et al., 2013), which are used as the basis for these experiments.

In this work the use of DTs, and RTs in general, is investigated in the context of sensitivity analysis, in particular to see whether RT surrogate models can be used to perform sensitivity analysis and variable screening in high dimensions, at low sample sizes. RTs are used to estimate variance-based sensitivity indices as well as an alternative sensitivity measure based on variance reduction caused by splits on a particular variable, described in Section 2. The performance is compared with sampling-based measures described in Section 3.

1. BAYESIAN REGRESSION TREES

Consider a model $f$, which has $k$ uncertain inputs denoted $\{X_i\}_{i=1}^{k}$, and a univariate output $Y$, such that $Y = f(X)$. It will be assumed that $X \in \mathcal{X}$, where $\mathcal{X} = [0,1]^k$. A regression tree works by recursively dividing the input space $\mathcal{X}$ into non-overlapping partitions using rules of the form $x_i \le s$, i.e. splitting the data using a single input variable at a time. An example of a simple RT is shown in Figure 1. The data are sorted by the splitting rules into the terminal nodes, or “leaves”. Each leaf defines a region of the input space and has its own regression model assigned to it. This may be as simple as a constant or linear regression, or more sophisticated, such as a Gaussian process.
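
As an illustration of the partitioning idea described above, the following minimal Python sketch evaluates a hand-built regression tree with constant leaf models. The tree structure, split values and leaf constants are hypothetical and chosen only for illustration; they do not reproduce Figure 1 or any fitted model, and the names `Leaf`, `Node` and `predict` are not part of tgp or dynaTree.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: float  # constant regression model for this region


@dataclass
class Node:
    var: int                     # index i of the splitting variable x_i
    split: float                 # threshold s in the rule x_i <= s
    left: Union["Node", Leaf]    # branch taken when x_i <= s
    right: Union["Node", Leaf]   # branch taken when x_i > s


def predict(tree, x):
    """Route the point x through the splitting rules down to its leaf."""
    node = tree
    while isinstance(node, Node):
        node = node.left if x[node.var] <= node.split else node.right
    return node.value


# Hypothetical tree on [0, 1]^2 with three leaves (all values are made up).
tree = Node(var=0, split=0.5,
            left=Leaf(1.0),
            right=Node(var=1, split=0.3, left=Leaf(2.0), right=Leaf(4.0)))

print(predict(tree, [0.2, 0.9]))  # x_0 <= 0.5: first leaf
print(predict(tree, [0.8, 0.1]))  # x_0 > 0.5 and x_1 <= 0.3: second leaf
print(predict(tree, [0.8, 0.7]))  # x_0 > 0.5 and x_1 > 0.3: third leaf
```

The prediction is piecewise constant, so it jumps at each split boundary; this is the discontinuity between leaf regions that averaging over many trees is intended to smooth.
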
Figure 1: An example of a regression tree. [Tree diagram: root node “All data”; splitting rules $x_1 \le 3$, $x_2 \le 5$, $x_1 \le 5.3$, $x_2 \le 2$; leaves A, B, C, D, E.]

Although a single regression tree will have discontinuities between each leaf region, by adopting a Bayesian approach which averages over many possible trees, the discontinuities can potentially be smoothed out. In this sense the Bayesian approach shares some similarities with the random forests method (Breiman, 2001): a large number of “weak learners” which can model complex data when combined. In the following, a very brief overview of Bayesian tree-based models is given.

In order to build regression trees using the Bayesian framework, it is necessary to create a prior distribution over tree models, $p(T, \Theta)$, where $T$ represents the tree-structure random variable, and $\Theta$ the random vector of unknown regression parameters defining the regression models at each leaf of the tree. This can conveniently be divided such that

$$p(\Theta, T) = p(\Theta \mid T)\, p(T) \tag{1}$$

which allows the tree prior to be specified independently of the regression parameters. It is not possible to specify an analytical prior over trees, but a prior distribution may be specified indirectly via a number of rules which dictate the probability of moving from one tree to another, by adding and removing nodes, as well as swapping and changing splitting rules. This fits naturally into the framework of Markov chain Monte Carlo (MCMC) sampling, which is used to sample the posterior distribution.

The prior may now be combined with the model likelihood, $p(Y \mid X, T, \Theta)$, given the data, which is dependent on the type of model used at each leaf. If the parameter prior has a carefully chosen form, it is possible to analytically marginalise the model parameters, i.e.,

$$p(Y \mid X, T) = \int p(Y \mid X, T, \Theta)\, p(\Theta \mid T)\, d\Theta. \tag{2}$$

Now using Bayes’ theorem the posterior distribution over trees can be found up to a constant of proportionality,

$$p(T \mid X, Y) \propto p(Y \mid X, T)\, p(T) \tag{3}$$

which can now be explored via the Metropolis-Hastings (MH) algorithm. When the regression model at the leaves is more complex, for example when it is a Gaussian process, the parameters cannot be analytically marginalised. As a result, the MH algorithm cannot be exclusively used since the dimensionality of the parameter space changes from one step to the next. In this situation, Gramacy and Lee (2008) use reversible-jump MCMC for jumps between tree models, with a combination of Gibbs sampling and the MH algorithm for sampling from the posterior parameter distribution.

Dynamic trees (DTs) are an extension to RTs in general, being designed to dynamically adapt and update on the observation of new data, using a series of rules on how a tree can change with the arrival of new data points, and a particle learning approach. The DTs used here use either constant or linear models at each tree leaf: in this sense they are simpler than the treed Gaussian process models described in Gramacy and Lee (2008). However, in the context of low sample sizes, the simpler leaf models might have an advantage.

The updating rules of a DT change the structure of the tree locally with each new observation. The experiments here do not involve dynamic updating of the model; however, the DT approach can equally be applied to batch data by running the data through the learning algorithm several times in a different order and using the averaged results. Due to space limitations, the details of the construction of the tree prior and the updating rules are left to Gramacy and Polson (2011); the essence of the DT approach is the same as that of any other Bayesian regression tree.

2. SENSITIVITY ANALYSIS WITH REGRESSION TREES

RTs can be used to generate large surrogate-model samples which can be used to estimate measures of sensitivity, specifically the first-order sensitivity indices (Cukier et al., 1973), and the total-order sensitivity indices (Homma and Saltelli, 1996). The experiments in this work will focus on the latter, defined as,

$$S_{Ti} = \frac{E_{X_{\sim i}}\left[ V_{X_i}(Y \mid X_{\sim i}) \right]}{V(Y)} \tag{4}$$

where $V(\cdot)$ is the variance operator, and $\sim i$ denotes the set of indices except $i$. Estimation is performed using the Monte Carlo approach described in Jansen (1999). In fact estimates can be returned as distributions rather than point estimates, since they are estimated at every tree model visited in the MCMC search (after burn-in). These distributions therefore account for the uncertainty in both the tree structure and the model parameters.

A more novel approach to measuring sensitivity was also proposed by Gramacy et al. (2013), in which the “relevance” of a variable $x_i$ is measured as the sum of the reductions in predictive variance due to splits involving $x_i$. Let the reduction in predictive variance for a given node (split) $\eta$ be $\Delta(\eta)$. The relevance is defined as,

$$J_i(T) = \sum_{\eta \in I_T} \Delta(\eta)\, \mathbb{1}_{[\nu(\eta) = i]}, \tag{5}$$

where $\nu(\eta)$ is the variable index of the split of $\eta$, and $I_T$ is the set of all internal (non-terminal) tree nodes. When the regression model is a simple constant at each leaf, dependencies of $y$ on $x_i$ are captured exclusively by splits on $x_i$, and each split contributes to a reduction in variance if $x_i$ affects the output in some way. Intuitively then, if a variable has no influence on the output, any splits on that variable will give no reduction in variance; conversely, splitting on an influential variable will decrease the predictive variance.
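
The bookkeeping behind the relevance measure in (5) can be sketched as follows. This is a simplified stand-in, not the dynaTree algorithm: it grows a single greedy CART-style tree with constant leaves (rather than averaging over a Bayesian posterior of trees), and takes $\Delta(\eta)$ to be the reduction in the sum of squared errors at a split divided by the full sample size. The function names, the toy test function and the settings `max_depth` and `min_leaf` are all assumptions made for illustration.

```python
import random


def sse(ys):
    """Sum of squared deviations from the mean (constant leaf model)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)


def best_split(X, y, min_leaf):
    """Find the rule x_i <= s that gives the largest reduction in SSE."""
    parent = sse(y)
    best_gain, best_var, best_s = 0.0, None, None
    for i in range(len(X[0])):
        vals = sorted(set(row[i] for row in X))
        for lo, hi in zip(vals[:-1], vals[1:]):
            s = 0.5 * (lo + hi)  # candidate threshold between neighbours
            left = [v for row, v in zip(X, y) if row[i] <= s]
            right = [v for row, v in zip(X, y) if row[i] > s]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse(left) - sse(right)
            if gain > best_gain:
                best_gain, best_var, best_s = gain, i, s
    return best_gain, best_var, best_s


def relevance(X, y, J, n_total, depth=0, max_depth=3, min_leaf=5):
    """Add to J[i] the variance reduction Delta(eta) of each split on x_i."""
    if depth >= max_depth:
        return
    gain, i, s = best_split(X, y, min_leaf)
    if i is None:  # no admissible split reduces the variance
        return
    J[i] += gain / n_total
    left = [(r, v) for r, v in zip(X, y) if r[i] <= s]
    right = [(r, v) for r, v in zip(X, y) if r[i] > s]
    for part in (left, right):
        Xp = [r for r, _ in part]
        yp = [v for _, v in part]
        relevance(Xp, yp, J, n_total, depth + 1, max_depth, min_leaf)


random.seed(0)
k, n = 3, 120
X = [[random.random() for _ in range(k)] for _ in range(n)]
y = [3.0 * r[0] ** 2 + r[1] ** 2 for r in X]  # toy model: x_2 has no influence
J = [0.0] * k
relevance(X, y, J, n_total=n)
print(J)
```

In this toy setting the largest relevance is accumulated by the most influential input, while splits on the inert input can only pick up sampling noise.
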
In the case where the leaf model is anything other than constant, the relevance measure cannot be used as a reliable measure of sensitivity because reductions in predictive variance will be due to both splits on variables and the specification of the regression model at the leaf. Although the requirement of a constant leaf model is somewhat restrictive, this work aims to see whether it can be used to perform “low-resolution” screening analyses for models with high dimensionality.

3. SAMPLING-BASED MEASURES

In order to put the performance of the DT approach into perspective, the measures from the DTs ($S_T$ and $J$) are compared here with some other sensitivity measures. To compare against the performance of a more “conventional” surrogate model, the fully Bayesian Gaussian process from the tgp package is used. Additionally, three much simpler “sampling-based” approaches are used which do not use surrogate models. The first is the Monte Carlo estimator of $S_T$, which is based on a so-called “radial” experimental design. Letting $x_j$ and $x_j^{(i')}$ be, respectively, a point in the input space, and a point that differs from $x_j$ only in the value of $x_i$, the estimator of the numerator of $S_{Ti}$ (see (4)) is as follows (Jansen, 1999),

$$\hat{V}_{Ti} = \frac{1}{2N} \sum_{j=1}^{N} \left| f(x_j^{(i')}) - f(x_j) \right|^2. \tag{6}$$

The next measure is the mean of absolute elementary effects, $\hat{\mu}_i^*$, which is estimated as (Campolongo et al., 2011),

$$\hat{\mu}_i^* = \frac{1}{N} \sum_{j=1}^{N} \frac{\left| f(x_j^{(i')}) - f(x_j) \right|}{\left| x_{ji}^{(i')} - x_{ji} \right|}. \tag{7}$$

Here, $x_{ji}$ denotes the $i$th coordinate of $x_j$, so that the denominator of (7) is equal to the difference in $x_i$ between $x_j$ and $x_j^{(i')}$. The final measure used in this study is part of a set of sensitivity measures called “derivative-based global sensitivity measures” (DGSM). The measure is the integral of squared derivatives, i.e. $\nu_i = \int_H (\partial y / \partial x_i)^2 \, dx$. This may be estimated as (Sobol and Kucherenko, 2009),

$$\hat{\nu}_i = \frac{1}{N} \sum_{j=1}^{N} \left( \frac{\left| f(x_j^{(i'')}) - f(x_j) \right|}{\left| x_{ji}^{(i'')} - x_{ji} \right|} \right)^2, \tag{8}$$

where $x_j^{(i'')}$ is a point that differs from $x_j$ only by a small increment $\delta$ of $x_i$, in order to give an estimate of $\partial y / \partial x_i$ at each point $x_j$. This increment is kept fixed for all $j$, and is recommended as $\delta = 1 \times 10^{-5}$ when sampling with respect to the unit hypercube.

4. EXPERIMENTS

In order to assess the performance of the tree models and their associated measures, experiments were performed on test functions rather than physical models. Test functions represent the possible behaviour of complex physical models, but have the advantage that the sensitivity indices are known a priori via analytical expressions. The performance of the methods and measures here is of course conditional on the test function and may be different for other functions; however, the same is true for physical models.

The focus of this work was on the applicability of DTs to problems of high dimensionality, at low sample sizes. In such cases it is unrealistic to expect precise estimates of sensitivity indices; usually one is interested in identifying input variables which have a significant influence on the model output, and similarly identifying those that have little or no effect. This setting is often referred to as “screening”. Accordingly, for each test function, a certain fraction $\gamma$ of the input variables was set to be of higher influence, and the remaining fraction $1 - \gamma$ of variables to be of lower influence (the exact sensitivity is controlled by the parameter values in each case).

The experiments are then set as follows. Let $k_{high} = \lfloor \gamma k \rfloor$, i.e. the number of input variables that are set as high influence, and $k_{low} = k - k_{high}$. In each test function, the variables are set such that $\{S_{T1} = S_{T2} = \dots = S_{Tk_{high}}\} > \{S_{Tk_{high}+1} = S_{Tk_{high}+2} = \dots = S_{Tk}\}$.
In other words, the first $k_{high}$ variables are set to have equal and high sensitivities, and the remainder to have equal and low sensitivities.

Now let $r_i$ be the ranking of the $i$th variable by one of the sensitivity measures defined previously, where ranking runs in descending order, i.e. $r_i = 1$ is ranked as the most influential variable, and $r_i = k$ is the least. The measure of error, $Z$, is as follows,

$$Z = \frac{1}{k_{high}} \sum_{i=1}^{k_{high}} \mathbb{1}(r_i > k_{high}), \tag{9}$$

where $\mathbb{1}(\cdot)$ is the indicator function. This metric therefore measures the fraction of influential variables that are ranked outside the top $k_{high}$ variables by the sensitivity measure. This is purely a measure of sorting the variables into high and low importance groups, and gives no regard to precise rankings or possible cutoff values that might be used to separate high importance from low importance variables, since what is a “high importance” variable is usually subjective and problem-dependent.

In order to capture the average performance, 20 repetitions are made for each test function investigated (this limit was imposed by the significant computational cost of constructing a large number of surrogate models). The experimental designs here are all based on the Sobol’ sequence, which is a low-discrepancy sequence suitable for both Monte Carlo estimation and surrogate model training. The sample is randomised by applying a random shift in each dimension for each replication, following the approach of Owen (1998). Additionally, each function is tested at sample sizes from $N_T = 62$ to $N_T = 248$, representing the sizes of samples that might be available when the model is very computationally expensive. These particular values were chosen because the measures based on radial sampling require a structured sample of size $N_T = N(k+1)$, where $N$ is a positive integer. At the chosen dimensionality of $k = 30$, $N_T = 62$ when $N = 2$, for example. The surrogate models were built at the same sample sizes to make a fair comparison.

4.1. Polynomial additive function

The first function used for comparison was a simple polynomial additive function, of the form,

$$h(x) = \sum_{i=1}^{k} a_i x_i^p, \tag{10}$$

where $p$ is the order of the polynomial, and the $a_i$ are weighting coefficients. In this function there are no interactions between variables, so $\sum S_i = 1$. The parameters were set as follows: $p = 2$, $a_{high} = 3$ and $a_{low} = 1$, and $\gamma = 0.2$, with $k = 30$. This means that 20% of variables are set to have high sensitivities, i.e. by setting $a_1 = a_2 = \dots = a_6 = 3$, and $a_7 = a_8 = \dots = a_{30} = 1$. The results are as shown in Figure 2. The polynomial function is a smooth additive function, which would tend to favour surrogate models which rely on assumptions of smoothness. However, the results show that the performance of the DTs is rather poor. By far the worst performer is the relevance measure of the dynamic trees surrogate model, which does start to converge to a reasonable level of error as the sample size increases above 200 points, but at lower sample sizes is little better than random noise (consider that if random sensitivity measures were assigned to each variable, the value of $Z$ would on average be 0.8).

Figure 2: Results of sensitivity measures applied to polynomial additive function. [Plot of the error $Z$ against sample size $N_T$; series: $S_T$, $\mu^*$, $\nu$, DT $J$, DT $S_T$, GP $S_T$.]

The Monte Carlo estimator of $S_T$ performs better, but still with a considerable margin of error. A much better performance is given by the $S_T$ estimation via the DT surrogate, and even better via the Gaussian process. However, the best performance of all is given by the elementary effects and DGSM measures, which do not rely on surrogate models, and correctly identify the group of significant variables at every sample size, and for every data replication tested.
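
The three sampling-based estimators (6) to (8) and the error measure (9) can be sketched on the polynomial function (10) as follows. This is a minimal illustration under stated assumptions: it draws pseudo-random uniform points instead of the randomised Sobol’ sequence used in the experiments, it spends extra model runs on the small DGSM steps, and it steps backwards by $\delta$ when a forward step would leave the unit hypercube. The function names are hypothetical.

```python
import math
import random


def poly(x, a, p=2):
    """Polynomial additive test function, equation (10)."""
    return sum(ai * xi ** p for ai, xi in zip(a, x))


def radial_measures(f, k, N, rng, delta=1e-5):
    """Estimate V_T (6), mu* (7) and nu (8) from N radial blocks."""
    VT, mu, nu = [0.0] * k, [0.0] * k, [0.0] * k
    for _ in range(N):
        x = [rng.random() for _ in range(k)]    # base point x_j
        aux = [rng.random() for _ in range(k)]  # supplies the new x_i values
        fx = f(x)
        for i in range(k):
            x_big = list(x)
            x_big[i] = aux[i]                   # large step in x_i only
            d = abs(f(x_big) - fx)
            VT[i] += d ** 2 / (2 * N)
            mu[i] += d / abs(aux[i] - x[i]) / N
            # Small step for the DGSM (assumption: step back near the edge).
            h = delta if x[i] + delta <= 1.0 else -delta
            x_small = list(x)
            x_small[i] = x[i] + h
            nu[i] += ((f(x_small) - fx) / h) ** 2 / N
    return VT, mu, nu


def screening_error(measure, k_high):
    """Z of (9): share of the first k_high inputs ranked outside the top k_high."""
    order = sorted(range(len(measure)), key=lambda i: -measure[i])
    top = set(order[:k_high])
    return sum(1 for i in range(k_high) if i not in top) / k_high


k, gamma = 30, 0.2
k_high = math.floor(gamma * k)               # 6 influential inputs
a = [3.0] * k_high + [1.0] * (k - k_high)    # a_high = 3, a_low = 1


def f(x):
    return poly(x, a)


VT, mu, nu = radial_measures(f, k, N=2, rng=random.Random(0))
for name, m in (("V_T", VT), ("mu*", mu), ("nu", nu)):
    print(name, screening_error(m, k_high))
```

With `N=2` the two large-step measures use $N(k+1) = 62$ model runs, matching the smallest sample size considered; the values of $Z$ printed here depend on the random seed and are not the results reported in Figure 2.
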
This seems to suggest that at the sample sizes tested, when the objective is simply to identify important variables, surrogate models do not offer any improvement over measures that estimate directly from the sample (at least on this function).

4.2. G* function

The second test function is a widely used benchmark function in sensitivity analysis studies, the “G* function”. It has the following form:

$$G^* = \prod_{i=1}^{k} g_i^*, \qquad g_i^* = \frac{(1 + \alpha_i)\, \left| 2\left( x_i + \delta_i - I[x_i + \delta_i] \right) - 1 \right|^{\alpha_i} + a_i}{1 + a_i} \tag{11}$$

where $a_i$, $\delta_i$ and $\alpha_i$ are parameters which can be chosen to obtain different behaviours of the function. $I[x_i + \delta_i]$ is the integer part of $(x_i + \delta_i)$. The relative importance of the inputs $(x_1, x_2, \dots, x_k)$ in the G* function is controlled by the magnitude of $a_i$, and the nonlinearity by $\alpha_i$. The parameter $\delta_i$ is a “shift” parameter which moves the position of the function in the input space, without having any effect on the sensitivities. This is set to zero in this work since for each replication, the Sobol’ sample is already randomly shifted, achieving exactly the same effect. In this experiment, the parameters are set as $a_{high} = 1$ and $a_{low} = 2$, which are chosen to result in a function with strong interactions: with $k = 30$ and $\alpha_i = 2$ for all $i$, the sensitivity indices can be analytically calculated as $\sum S_i = 0.151$.

Figure 3: Results of sensitivity measures applied to G* function. The line styles are the same as in Figure 2. [Plot of the error $Z$ against sample size $N_T$.]

The results of the RTs and other measures applied to the G* function show a similar story to that of the polynomial function, but with a clearer division. The G* function is strongly nonlinear and has strong interactions between variables, so is naturally a more challenging subject for sensitivity analysis.
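
The G* function in (11) and the quoted value of $\sum S_i$ can be checked with a short sketch. The closed-form partial variance used below, $V_i = \alpha_i^2 / \left((1 + 2\alpha_i)(1 + a_i)^2\right)$, is not stated in the text; it is an assumption obtained by integrating $g_i^*$ over a uniform input (each $g_i^*$ has mean 1, so the total variance of the product is $\prod_i (1 + V_i) - 1$). The function names are hypothetical.

```python
import math


def g_star(x, a, alpha, delta=None):
    """G* function of equation (11) for a point x in [0, 1]^k."""
    k = len(x)
    delta = delta or [0.0] * k
    out = 1.0
    for xi, ai, al, di in zip(x, a, alpha, delta):
        frac = (xi + di) - math.floor(xi + di)  # x_i + delta_i minus its integer part
        out *= ((1.0 + al) * abs(2.0 * frac - 1.0) ** al + ai) / (1.0 + ai)
    return out


def first_order_sum(a, alpha):
    """Sum of first-order indices from the assumed closed-form partial variances."""
    Vi = [al ** 2 / ((1.0 + 2.0 * al) * (1.0 + ai) ** 2) for ai, al in zip(a, alpha)]
    V = math.prod(1.0 + v for v in Vi) - 1.0    # total variance of the product
    return sum(Vi) / V


k = 30
a = [1.0] * 6 + [2.0] * 24   # a_high = 1 for the first 6 inputs, a_low = 2 otherwise
alpha = [2.0] * k
print(round(first_order_sum(a, alpha), 3))  # 0.151
```

A smaller $a_i$ makes an input more influential, which is why $a_{high} < a_{low}$ here. Since the first-order indices sum to only about 0.15, most of the output variance is carried by interactions.
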
Referring to Figure 3, one can see that the RT surrogate models, and the Gaussian process, all show poor performance at the sample sizes tested, with variable ordering little better than random. The most successful measure is clearly the DGSM measure $\nu$, which effectively orders the variables even at the lowest sample size. Similar to the other measures, the error does not decrease very significantly with increasing sample size.

The DGSM measure appears to have an advantage because it relies on small steps, which capture the partial derivative of the function at each point visited. In the case of the G* function this captures the sensitivity quite well. Measures such as $S_T$ and $\mu^*$, on the other hand, take large steps between samples, which in the case of a non-monotonic function can underestimate sensitivity at low sample sizes. The surrogate models simply do not have enough training data to characterise the function at this sample size.

Figure 4: Results of sensitivity measures applied to step function. [Plot of the error $Z$ against sample size $N_T$; series: $S_T$, $\mu^*$, $\nu$, DT $J$, DT $S_T$, GP $S_T$.]

4.3. Step function

The final test function is a simple function with a near-discontinuity, of the form,

$$s(x) = \sum_{i=1}^{k} a_i\, \mathrm{erf}\left( 15 (x_i - 0.5) \right) \tag{12}$$

where erf is the error function. This function has a gradient of zero in most places, except around $x_i = 0.5$, at which point the gradient is very steep. For the numerical experiments, $a_{high} = 2$ and $a_{low} = 1$, with a fraction $\gamma$ set to 0.2.

The step function was in fact chosen as a counter-example to show the limitations of the DGSM measures. This is clearly shown in Figure 4, where the three DGSM measures perform quite poorly. This is very likely due to the fact that DGSM measures use small steps to approximate the pointwise gradient; however, in the step function the gradient is zero in most places.
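
How little of the input range carries any gradient can be illustrated with a short sketch of (12). The analytical derivative follows from differentiating the error function; the 1% threshold used to call the gradient “active” is an arbitrary assumption made only for this illustration, as are the function names.

```python
import math


def step(x, a):
    """Step test function of equation (12)."""
    return sum(ai * math.erf(15.0 * (xi - 0.5)) for ai, xi in zip(a, x))


def dstep_dxi(xi, ai):
    """Analytical partial derivative of s(x) with respect to x_i."""
    return ai * 15.0 * 2.0 / math.sqrt(math.pi) * math.exp(-(15.0 * (xi - 0.5)) ** 2)


# Share of a fine grid on [0, 1] where the gradient exceeds 1% of its peak value.
grid = [j / 10000.0 for j in range(10001)]
peak = dstep_dxi(0.5, 1.0)
active = sum(1 for x in grid if dstep_dxi(x, 1.0) > 0.01 * peak) / len(grid)
print(round(active, 3))
```

Under this threshold, a little under 30% of the range of each $x_i$ has a non-negligible gradient, so a small-step estimate taken at a randomly placed point will often see almost no change in the output.
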
So the DGSM measures require a fairly large sample size to sample a point at which the gradient is non-zero. On the other hand, the emulator approaches build a response surface from all the points simultaneously, so the large steps in each $x$ direction are identified, even at low sample sizes. The relevance measure $J$ still does not perform well, but the $S_T$ estimate of the dynamic tree model performs the best on average of all the methods considered here. This suggests that in the presence of discontinuities, the dynamic tree approach might be the preferred option.

5. DISCUSSION AND CONCLUSIONS

The conclusion of this work is that surrogate models did not really help in identifying significant variables at low sample sizes, when the dimensionality was reasonably high, with the exception of the near-discontinuous step function. The hope was that dynamic trees, being relatively simple surrogate models, might be fruitfully applied in the screening context. However, it seems that surrogate models (including DTs) tend to be restricted to a particular domain of application: problems with low dimensionality and sufficient sample size. The fact that gradient-based measures easily outperform the surrogates in many of the experiments here demonstrates that when the sample size is low, surrogate models impose assumptions which cannot be justified. This is likely due to the surrogate model’s attempts to extrapolate into unsampled regions of the input space with very little sample data to estimate the behaviour of the true model. Even in the case of the additive polynomial function, a very smooth function which would tend to favour smooth surrogate models such as Gaussian processes, the surrogate models did not perform as well as the simpler elementary effects and DGSM measures.

One case in which the dynamic trees did perform well was that of the discontinuous step function, which is a setting that is unsuitable for gradient-based approaches.
This also favours a model based on linear or constant regressions. However (in the experience of the author), most physical models do not exhibit this kind of behaviour.

In particular, the relevance measure based on dynamic trees was not very successful in the experiments performed here. This could be due to the fact that it relies on regression trees with constants at each leaf, which were unable to effectively model the nonlinear test functions considered here.

On the other hand, it is revealing that DGSM measures can be applied successfully even at very low sample sizes, when the aim is to screen significant variables from insignificant ones. A possible drawback of this approach, however, is that the requirement of small perturbations may present problems in real models, because results may not be available to a sufficient number of significant figures to accurately estimate partial derivatives. This could possibly be overcome by tuning the perturbation size to the smallest value that can result in a reasonable estimate.

Further work that could stem from this study would be to understand under what circumstances surrogate models in general perform better than DGSM and elementary effects measures, and thus to guide practitioners in choosing between a surrogate and sampling-based measures for a particular problem, perhaps based on the dimensionality of the problem and the available number of sample points.

6. REFERENCES

Becker, W., Worden, K., and Rowson, J. (2013). “Bayesian sensitivity analysis of bifurcating nonlinear models.” Mechanical Systems and Signal Processing, 34(1), 57–75.
Breiman, L. (2001). “Random forests.” Machine Learning, 45(1), 5–32.
Campolongo, F., Saltelli, A., and Cariboni, J. (2011). “From screening to quantitative sensitivity analysis. A unified approach.” Computer Physics Communications, 182(4), 978–988.
Chipman, H. A., George, E. I., and McCulloch, R. E. (1998). “Bayesian CART model search.” Journal of the American Statistical Association, 93(443), 935–948.
Cukier, R. I., Fortuin, C., Schuler, K. E., Petschek, A. G., and Schaibly, J. (1973). “Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. I Theory.” The Journal of Chemical Physics, 59, 3873–3878.
Gramacy, R. B. (2007). “tgp: An R package for Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian process models.” Journal of Statistical Software, 19(9).
Gramacy, R. B. and Lee, H. K. H. (2008). “Bayesian treed Gaussian process models with an application to computer modeling.” Journal of the American Statistical Association, 103(483), 1119–1130.
Gramacy, R. B. and Polson, N. G. (2011). “Particle learning of Gaussian process models for sequential design and optimization.” Journal of Computational and Graphical Statistics, 20(1).
Gramacy, R. B., Taddy, M., Wild, S. M., et al. (2013). “Variable selection and sensitivity analysis using dynamic trees, with an application to computer code performance tuning.” The Annals of Applied Statistics, 7(1), 51–80.
Homma, T. and Saltelli, A. (1996). “Importance measures in global sensitivity analysis of nonlinear models.” Reliability Engineering & System Safety, 52(1), 1–17.
Jansen, M. (1999). “Analysis of variance designs for model output.” Computer Physics Communications, 117, 35–43.
Oakley, J. and O’Hagan, A. (2004). “Probabilistic sensitivity analysis of complex models: a Bayesian approach.” Journal of the Royal Statistical Society B, 66, 751–769.
Owen, A. B. (1998). “Monte Carlo, quasi-Monte Carlo, and randomized quasi-Monte Carlo.” Monte Carlo and Quasi-Monte Carlo Methods, 2000, 86–97.
Paruolo, P., Saisana, M., and Saltelli, A. (2013). “Ratings and rankings: voodoo or science?” Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(3), 609–634.
Rabitz, H. and Aliş, Ö. F. (1999). “General foundations of high-dimensional model representations.” Journal of Mathematical Chemistry, 25(2-3), 197–233.
Sobol, I. M. and Kucherenko, S. (2009). “Derivative based global sensitivity measures and their link with global sensitivity indices.” Mathematics and Computers in Simulation, 79(10), 3009–3017.
Storlie, C. and Helton, J. (2008). “Multiple predictor smoothing methods for sensitivity analysis: Description of techniques.” Reliability Engineering & System Safety, 93(1), 28–54.
Sudret, B. (2008). “Global sensitivity analysis using polynomial chaos expansions.” Reliability Engineering & System Safety, 93(7), 964–979.
Applications of dynamic trees to sensitivity analysis Becker, William Jul 31, 2015
Item Metadata
Title | Applications of dynamic trees to sensitivity analysis |
Creator | Becker, William |
Contributor | International Conference on Applications of Statistics and Probability (12th : 2015 : Vancouver, B.C.) |
Date Issued | 2015-07 |
Description | A recent approach to surrogate modelling, called dynamic trees, uses regression trees to partition the input space, and fits simple constant or linear models in each “leaf” (region of the input space). This article aims to investigate the applicability of dynamic trees in sensitivity analysis, in particular on high dimensional problems at low sample size, to see whether they can be applied to dimensionalities usually out of the range of surrogate models. Comparisons are made with Gaussian processes, as well as three measures based on a radial sampling scheme: the Monte Carlo estimator of the total sensitivity index, an elementary effects measure, and a derivative-based sensitivity measure. The results show that the radial sampling measures generally outperform the surrogate models tested here, with the exception of response surfaces that feature discontinuities. |
Genre | Conference Paper |
Type | Text |
Language | eng |
Notes | This collection contains the proceedings of ICASP12, the 12th International Conference on Applications of Statistics and Probability in Civil Engineering held in Vancouver, Canada on July 12-15, 2015. Abstracts were peer-reviewed and authors of accepted abstracts were invited to submit full papers. Also full papers were peer reviewed. The editor for this collection is Professor Terje Haukaas, Department of Civil Engineering, UBC Vancouver. |
Date Available | 2015-05-26 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivs 2.5 Canada |
DOI | 10.14288/1.0076191 |
URI | http://hdl.handle.net/2429/53497 |
Affiliation | Non UBC |
Citation | Haukaas, T. (Ed.) (2015). Proceedings of the 12th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP12), Vancouver, Canada, July 12-15. |
Peer Review Status | Unreviewed |
Scholarly Level | Faculty |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/2.5/ca/ |
Aggregated Source Repository | DSpace |