Quantum Algorithms for Finding Extrema with Unary Predicates and Related Problems

by

Man Hon Chan
B.Eng. (Computer Engineering), The University of Hong Kong, 2005

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of Graduate Studies (Computer Science)

The University of British Columbia
August 2007

© Man Hon Chan 2007

Abstract

We study the problem of finding the maximum or the minimum of a given set S = {x0, x1, ..., xn-1}, each element xi drawn from some finite universe U of real numbers. We assume that the inputs are abstracted within an oracle O where we can only gain information through unary comparisons of the form "Is xi greater than, equal to, or less than some constant k?" Classically, this problem is solved optimally with a runtime of Θ(n + lg |U|), where lg denotes the base-2 logarithm. In the setting of quantum computing, we show that at least Ω(√n + lg |U|) queries are required to solve the problem, even with bounded error. Combining variants of the Grover search algorithm [1, 2] with the optimal classical unary extrema finding algorithm, we derive a series of new quantum algorithms, some running in time as fast as O(√n lg* n + lg |U|). This shows that quantum computers can asymptotically accelerate extrema finding in the unary comparison model. Inspecting our tools, we find convincing arguments that our lower bound is most probably tight, but an entirely new approach may be needed to solve the problem optimally.

The techniques used in our algorithms can also be extended to solve variations of quantum statistics problems. For instance, our result extends directly to the approximation of extrema of real numbers, similar to that of [3]. Moreover, we can solve the quantum k-select problem optimally, in time O(√(kn)) with constant success probability. We hope that our ideas and tools will prove to be useful in other areas.

Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
  1.1 Unary Maximum Finding
    1.1.1 Lower bound of classical unary maximum finding
    1.1.2 Classical unary maximum finding
  1.2 Quantum Unary Maximum Finding
2 A brief introduction to Quantum Computing
  2.1 A brief history of Quantum Computing
  2.2 Qubits
  2.3 Unitary Transform and Entanglement
  2.4 Quantum Circuits and (Approximately) Universal Gate Set
  2.5 Model of Computation
  2.6 Summary
3 Quantum Search and Amplitude Amplification
  3.1 Grover's Algorithm
  3.2 BBHT
  3.3 Other variants
  3.4 FindAll
4 Quantum Unary Extrema Finding
  4.1 Lower bound
  4.2 Extending previous algorithms
  4.3 Budgeted Sampling
  4.4 Geometric Budgeting
  4.5 Progressive Approximation
  4.6 Pseudo Large Sampling
  4.7 Variants
  4.8 Other applications
    4.8.1 Maximum Root Approximation
    4.8.2 Quantum k-select
5 Conclusions
  5.1 Future Directions
Bibliography
A Expected Pruning with Large Samples
B Expected Pruning with Budgeted-Sampling or Failed BBHT
C Properties of Large Pseudo-random Samples

List of Tables

1.1 Bounds on extrema problems
1.2 Runtime of various algorithms
List of Figures

2.1 Differences between classical circuits and quantum circuits
2.2 Classical circuit of y = (x1 ∧ x2) ∨ (x3 ∧ x4)
2.3 Quantum circuit of y = (x1 ∧ x2) ∨ (x3 ∧ x4)
3.1 One Grover iteration
4.1 Summary of different quantum unary maximum finding results

Acknowledgements

The author would like to express his deepest gratitude to his supervisor, David G. Kirkpatrick, for his unfailing support and encouragement. Not only did he provide motivation and ideas, he has helped the author find tremendous joy in doing research and solving challenging problems. The author would also like to thank his second reader, Will Evans, for his suggestions and guidance. The author is also greatly indebted to his family and friends for always being supportive and caring. This research would not have been successful without help from Bartholomew Furrow, Zephyr Liu, Jack Mai, and many others. Their feedback, suggestions and encouragement are what made this work possible.

Chapter 1  Introduction

Quantum computing has become an exciting field over the previous decades. Although significantly younger than most topics in classical algorithms and complexity theory, it is very interesting in its own right. The very notion of quantum computing was introduced at the beginning of the 1980s. In 1994, Shor presented his celebrated algorithm [4] for discrete logarithm and integer factoring, showing how quantum computers could revolutionize the age of computation. There have been other significant contributions, including Grover's search algorithm [1] and several of its extensions (e.g. BBHT [2], BCWZ [5]), which are more general and far more powerful than any classical counterpart. In Furrow's thesis [6], he put together and optimized previously known results, and presented a number of basic quantum algorithmic tools that prove useful in many circumstances, tackling real-life problems with quantum algorithms that bypass classical lower bounds.

One can imagine how a sublinear extrema finding algorithm could change many existing algorithms. For instance, the naive brute-force solution to the maximum submatrix sum problem takes O(n^4) time; with various dynamic programming techniques and optimizations, one can solve it in O(n^3). However, using a maximum finding algorithm that runs in square-root time, one can achieve a runtime of O(n^2). Not only is this asymptotically faster than the current best classical algorithm, the idea behind the solution is much more succinct. In a similar spirit, we wish to tackle problems that are fundamental or frequently reused, so that any improvement cascades to many other algorithms.

The problem of locating the maximum of a given set is well studied. With quantum computers, an optimal O(√n) algorithm was presented in [7]. It works under the assumption that one can take any two elements from the input set and rank them in constant time. This is usually the case in the RAM model, or when we have an external comparison oracle. Arguably, this binary comparison operation may not be feasible if inputs are distributed, or if inputs are not infinitely precise. One such case is the manipulation of real numbers, where each number can only be approximated: directly comparing two real numbers requires approximating them until they can be clearly distinguished. In this dissertation, we study a more restrictive model, where one can only compare an element with a constant.
For instance, suppose we are given n real numbers, each a root of some blackbox monotone function over the range [0, 1). We can easily compare the root of a function f with a constant k simply by evaluating f(k): the sign of f(k) tells us whether the root is to the left or to the right of k. However, there is no way to directly compare the roots of two blackbox functions. As one of the motivating problems of our research, we are given n such blackbox functions and we want to efficiently approximate the maximum root to within a certain absolute error. The notion of approximation is introduced because we assume that we cannot represent the root of any of our functions perfectly. There are also cases where two roots are indistinguishable within the specified error, and either could be the maximum. Under this setting we cannot directly apply binary comparison algorithms; we need different approaches to solve this new class of problems.

In this thesis, we first formalize the problem of unary maximum finding and introduce notation that will be useful throughout our discussion. Maximum finding and minimum finding are the two faces of the same coin, extrema finding; in keeping with the discussion of the optimal classical counterpart [3], we primarily focus on finding the maximum. We then introduce the basics of quantum computing and the model of computation adopted in this thesis. Next, we look into specific quantum search algorithms, which are the heart of our speedup over classical algorithms. These tools include Grover [1] and BBHT [2]. The primary problem they tackle is a variant of the satisfiability problem: given a blackbox Boolean function F over the domain {0, ..., n-1}, find some x such that F(x) is True. Classically, without gaining further information about F, one can only revert to a linear search for such an x. This implies a lower bound of Ω(n) in both the average and the worst case, and this remains true even when we allow the algorithm to err with a fixed constant probability. Using quantum parallelism and amplitude amplification, we can achieve an expected time of O(√n), which breaks the classical lower bound of Ω(n). We study these tools and their theory in order to use them correctly.

After that, we present our new algorithms for finding the maximum using only unary predicates. Classically, this problem is optimally solved by an algorithm that runs in Θ(n + log2 |U|) time. Hereafter, we denote by lg(x) the logarithm of x with respect to base 2; the above runtime is then written Θ(n + lg |U|). Our new algorithm is presented as a series of improvements over previously known algorithms, with occasional new tricks. We achieve an expected runtime of O(√n lg* n + lg |U|). We hope that the techniques used in our algorithms will prove useful in other circumstances. Finally, we present ideas for future exploration.

1.1 Unary Maximum Finding

We now formally define the unary maximum finding problem. As input, we are given three items:

• n, the number of elements.

• U, a finite and totally ordered set, the domain of all input elements.

• O, a blackbox oracle mapping {0, ..., n-1} × U → {'≺', '=', '≻'}. O(i, k) = '≺' (resp. '=', '≻') means that the ith element satisfies xi < (resp. =, >) k (a simulated sketch follows below).
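To make the interface concrete, here is a minimal Python sketch of such an oracle, simulated over hidden integer values; the helper name make_unary_oracle is illustrative and not part of the formal definition. A root-finding instance would instead derive the answer from the sign of f_i(k):

    def make_unary_oracle(xs):
        """Simulate the ternary oracle O(i, k) over hidden values xs[i]."""
        def oracle(i, k):
            if xs[i] < k:
                return '≺'   # x_i lies strictly below k
            if xs[i] == k:
                return '='   # x_i falls exactly on k
            return '≻'       # x_i lies strictly above k
        return oracle

    oracle = make_unary_oracle([3, 7, 7, 2])
    print(oracle(1, 5), oracle(1, 7))   # prints: ≻ =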
With U totally ordered, the oracle must also be internally consistent: for all k1 < k2 ∈ U,

    O(i, k1) = '≺'  ⇒  O(i, k2) = '≺',   and
    O(i, k2) = '≻'  ⇒  O(i, k1) = '≻'.

For simplicity, we write xi ≺ (resp. =, ≻) k interchangeably with O(i, k) = '≺' (resp. '=', '≻'). We can also define an equivalently powerful oracle O' as a threshold predicate: O' is a mapping from {0, ..., n-1} × U → {False, True}, where O'(i, k) is True if and only if xi ≺ k. The two definitions are equivalent up to a constant factor of 2. Given an oracle O of the first kind, we can create a threshold oracle O' simply by

    O'(i, k) = True if O(i, k) = '≺', and False otherwise.

On the other hand, given a threshold oracle O', we can implement O(i, k) as:

    if O'(i, k) then return '≺'
    if k = max(U) or O'(i, succ(k)) then return '='
    return '≻'

Here we define succ(k) to be the successor of k in the universe U, that is, the minimum of all values in U that are greater than k. For a contiguous integral domain, this is simply k + 1. This operation depends only on U and does not involve O. As a result, the two definitions are equivalent; we stick with the ternary version, as it is more intuitive.

In our hypothetical maximum root approximation problem, the global interval is [0, 1). Since we want to approximate the maximum root within an error of ε, it suffices to divide the global interval into buckets of size ε; our universe U is the collection of such buckets. A root is '≻' (resp. '≺') a bucket if it lies to the right (resp. left) of the bucket boundaries, and '=' a bucket if it lies within it. The size of our universe is then ⌈1/ε⌉.

Given the input, we would like to output λ* ∈ U, the maximum of the given set {xi}. More precisely, we want to ensure that:

• for all i ∈ {0, ..., n-1}, O(i, λ*) ≠ '≻', and

• there exists i ∈ {0, ..., n-1} such that O(i, λ*) = '='.

In general, it does not matter whether we return the index or the exact value; we choose to return the value, as it is generally more useful. Consider our maximum root approximation problem with n = 1: it is of little value to return the index, since it provides no further information. Moreover, simply returning the index may give the illusion that the error term ε is irrelevant, which is unfortunately not the case. Degenerate cases aside, the two kinds of output do not affect the intrinsic difficulty of the problem. Given a specific index, we can perform a binary search to retrieve its exact value; conversely, given the value, we can perform a search to find a corresponding index. We shall later see that the costs of a binary search or of a complete search lie below the lower bound for the unary maximum finding problem itself, so an extra conversion step incurs no penalty to our complexity.

There are many ways to account for the cost or runtime of an algorithm, depending heavily on the model of computation in use. In general, we focus either on oracle calls alone or on the total runtime. The rationale behind charging only oracle calls is that information can be extracted only through calls to the oracle: they record the path along which the algorithm evolves and gradually approaches the final answer. Furthermore, an oracle call should be far more expensive and dominant in most situations (e.g. evaluation of blackbox functions).
After all, the hardness of our problem relies on a hard unary oracle where no binary comparison can be effectively simulated. However, counting only oracle calls is certainly an underestimate of the total runtime any algorithm takes. It is only fair to incorporate the cost of all auxiliary work, such as memory management. It is also unrealistic to assume that we can manipulate real numbers arbitrarily or perform complicated arithmetic in constant time. For instance, one may try to approximate our blackbox functions with polynomials and estimate the roots; while this may be useful in many cases, the cost of performing interpolation plus an additional root finding step is simply too large to be practical in our scenario.

In the following discussion, we restrict ourselves to simple elementary operations (e.g. addition, subtraction, averaging two numbers) on the index domain {0, ..., n-1} or the universe domain U. Denoting by T(O) the number of oracle calls made by an algorithm and by T(Elementary) its number of elementary arithmetic operations, we maintain that T(O) = Θ(T(Elementary)); as a result, we can refer to either interchangeably. The reader will be informed when this claim does not hold. We should also take into account the actual runtime of oracle calls and elementary arithmetic operations. In general, the cost of an elementary operation is linear in the description size. Making a call to the oracle requires supplying its arguments, at a cost linear in the description size as well; therefore, the cost of calling the oracle almost always dominates the cost of elementary operations. With T(O) = Ω(T(Elementary)), it suffices to count only oracle calls to derive the complexity of our algorithms.

1.1.1 Lower bound of classical unary maximum finding

The classical lower bound for the unary maximum finding problem is shown to be Ω(n + lg |U|) by Gao et al. [3]. The state of computation of any algorithm can be expressed as an n-tuple of sets (S0, S1, ..., Sn-1), where Si denotes the possible values of xi. The lower bound is proved via a collective adversary strategy, in which we prepare the worst test case for the algorithm. Let G = ∩_{i=0}^{n-1} Si denote the subset of U that is feasible for every input. No algorithm can output the maximum with certainty while |G| > 1. Moreover, if |G| = 2, the collective adversary can set all but one entry to the minimum of G, forcing any correct algorithm to ask Ω(n) questions before reaching any conclusion. Initially we have |G| = |U|, and it takes Ω(lg |U|) queries to reduce |G| to 2. Combining the two, any algorithm needs at least Ω(n + lg |U|) queries to solve this problem correctly.

Theorem 1 (Gao et al.: Theorem 2.1)  Determining max{x0, x1, ..., xn-1} requires n + ⌈lg |U|⌉ − 1 unary predicate evaluations in the worst case, even when it is given that |{x0, x1, ..., xn-1}| ≤ 2.

It is straightforward to extend this idea to unary minimum finding: it takes Ω(n + lg |U|) to compute the extrema in our model. We can also understand this as the combined effort of (i) certifying maximality by looping through all entries, taking Ω(n) time, and (ii) outputting the answer, taking Ω(lg x_max) time. Making lg x_max = Θ(lg |U|), an adversary can easily force our algorithm to run in Ω(n + lg |U|) time, establishing our lower bound.
1.1.2 Classical unary maximum finding

In a binary model, we can tackle the problem simply by iterating through all entries and keeping track of the largest entry seen. If we were given the exact value of each entry, the same approach would also work in the unary model. However, learning all the values requires n binary searches, each at cost Θ(lg |U|); this O(n lg |U|) approach is certainly overkill. Given a current estimate λ, checking whether the next entry is larger takes a single query, and if the next entry is smaller, determining its value is unnecessary. In a randomized setting, we can loop through all entries in a random order (x_{π0}, x_{π1}, ..., x_{πn-1}). During the loop, we only need to perform a binary search on x_{πi} when it is strictly greater than all x_{πk}, k < i, which happens with probability at most 1/i for the ith element inspected. Therefore, the expected runtime of this approach is O(n + Σ_{i=1}^{n} (1/i) lg |U|) = O(n + lg |U| lg n).
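A minimal Python sketch of this randomized strategy over U = [0, m); here cmp(i, k) is assumed to return −1, 0 or +1 as an ASCII stand-in for '≺', '=', '≻', and the function name is illustrative:

    import random

    def find_max_randomized(n, m, cmp):
        """Expected O(n + lg|U| lg n) unary maximum finding over U = [0, m)."""
        order = list(range(n))
        random.shuffle(order)            # visit the entries in random order
        lam = 0                          # best lower bound seen so far
        for i in order:
            if cmp(i, lam) <= 0:         # x_i <= lam: cannot improve the maximum
                continue
            lo, hi = lam + 1, m          # x_i is now known to lie in [lo, hi)
            while hi - lo > 1:           # binary search for its exact value
                mid = (lo + hi) // 2
                if cmp(i, mid) < 0:      # x_i < mid
                    hi = mid
                else:                    # x_i >= mid
                    lo = mid
            lam = lo
        return lam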
The optimal classical unary maximum finding algorithm is attributed to Gao et al. [3]. This algorithm will be reused heavily in later chapters and is presented here for completeness; the reader is nonetheless strongly advised to review the original paper for a thorough and rigorous discussion of the algorithm and its applications.

To begin with, we assume that our universe U is the integer range [0, m); any finite, totally ordered universe can be remapped onto an integral range with an appropriate m. As we only use unary predicates, we can represent our current knowledge of the inputs as a vector of feasible ranges {Si}, denoting each range by its extrema: Si = [λi, βi). The maximum lower bound λmax is defined accordingly as max{λi} and is an underestimate of our desired output λ*. Clearly, for any i with βi ≤ λmax + 1, xi cannot be greater than our current estimate, and such an xi is deemed impotent. Using the idea from the lower bound proof and discarding impotent elements, we have Û = [λmax, min{βi : xi not impotent}). In the same spirit, an algorithm can only make progress by reducing |Û|. To handle all the ranges efficiently, the authors of [3] group elements with the same βi into a bucket, called a block. The blocks are ordered in descending order of their associated β; in the following, Bi refers to the block with βi as its associated upper bound. The pseudocode of the classical optimal unary maximum finding algorithm is:

    B0 ← {0, ..., n−1}
    s ← t ← 0
    β0 ← m
    λmax ← 0
    while βs > λmax + 1 do
        select a random i from Bs and remove it
        j ← ⌊(λmax + βt)/2⌋
        if xi ≺ j then
            βt+1 ← j
            insert i into Bt+1
            t ← t + 1
        else
            λmax ← j
            while βt ≤ λmax + 1 do
                clear Bt
                t ← t − 1
            end while
            insert i into Bt
        end if
        if Bs is empty then
            s ← s + 1
        end if
    end while
    return λmax

By construction, the βi are descending. At completion, βs ≤ λmax + 1 certifies that all remaining elements are impotent, and thus λmax is indeed the desired answer. Furthermore, we discard blocks with βi ≤ λmax; as a result βt, the upper bound of the last block, reflects the minimum of all upper bounds. This implies Û = [λmax, βt). Throughout the algorithm, we either query the median of this range in order to halve |Û|, or we compare with some βi to discard elements. Moreover, one can verify that whenever we discard an element, |Û| grows to no more than twice its original size. Eventually, the algorithm reduces |Û| to 1 while discarding all impotent elements, and thus runs in Θ(n + lg |U|) time. A complete analysis can be found in [3].

The importance of this algorithm is immediate: it proves that the lower bound is tight and that the unary maximum finding problem is solved optimally in Θ(n + lg |U|) time. It is one of the foundations of many of our algorithms.

1.2 Quantum Unary Maximum Finding

The Unary Maximum Finding problem extends naturally to the quantum setting. We still work with the triple (n, U, O) together with all our assumptions, but we add to our oracle the power of quantum parallelism: we can input a superposition of queries and receive an entangled superposition of the corresponding results. In most cases, this requires no more than a straightforward classical-to-quantum transformation; some of the details are covered in the following chapters.

The binary maximum finding problem is well studied in the quantum setting, with an optimal O(√n) algorithm derived in the work of Dürr and Høyer [7]. The unary maximum finding problem is, unfortunately, untouched. The first question is whether we can outperform classical algorithms at all once we have access to quantum parallelism, as demonstrated for binary maximum finding. However, one can show that the binary search term Θ(lg |U|) is optimal in both the classical and the quantum setting: there is an intrinsic lower bound of Ω(√n + lg |U|). Table 1.1 lists our current best understanding of the maximum finding problem in its various flavours.

    Problem            Setting     Lower bound        Upper bound
    Extrema (binary)   Classical   Ω(n)               O(n)
                       Quantum     Ω(√n)              O(√n)
    Extrema (unary)    Classical   Ω(n + lg|U|)       O(n + lg|U|)
                       Quantum     Ω(√n + lg|U|)      O(√n lg* n + lg|U|)

    Table 1.1: Bounds on extrema problems

We derive new algorithms iteratively, in an attempt to narrow the gap between our upper bound and the intrinsic lower bound. Classically, we have already seen that a direct adaptation of the best binary algorithm may not yield a good unary algorithm; that is also the case in the quantum setting. By adapting the optimal binary algorithm we achieve an expected time of O(√n + lg |U| lg n) (c.f. O(n + lg |U|) for the randomized classical algorithm). Further study shows that we can in fact solve the problem in O((√n + lg |U|) lg lg n) or even O(√n + lg |U| lg lg n) expected time. Unfortunately, these algorithms perform worse than the classical algorithm when lg |U| dominates. We then approach the problem from an approximation perspective, yielding algorithms with expected runtime O(√n lg n + lg |U|) or O(√n lg lg n + lg |U|).
These algorithms provide speedup even when lg |U| is considerably large. With large pseudo-random sampling, we push the limit further, to O((√n + lg |U|) lg* n).¹ Table 1.2 lists some of our achieved runtimes for the unary maximum finding problem. Finally, we show how one can reuse our previous results to derive algorithms with expected runtime O(√n lg* n + lg |U|), O(√n + lg |U| lg* n) or O((√n + lg |U|) lg*(lg* n)). The details of each algorithm are covered in Chapter 4.

    Name                          Description                                 Runtime
    Sample and Improve            Adaptation of the optimal algorithm         O(√n + lg|U| lg n)
                                  from [7]
    Budgeted Sample and Improve   Uses Budgeted-Sampling to accelerate        O((√n + lg|U|) lg lg n)
                                  Sample_And_Improve
    Geometric Budgeting           A geometrically decreasing budget           O(√n + lg|U| lg lg n)
                                  isolates the √n term from the
                                  lg lg n factor
    Progressive Approximation     Isolates the lg|U| term from the            O(√n lg n + lg|U|),
                                  extra factor                                O(√n lg lg n + lg|U|)
    Large Sampling                Uses large pseudo-random samples to         O((√n + lg|U|) lg* n)
                                  accelerate the runtime further

    Table 1.2: Runtime of various algorithms

¹ lg* n is defined to be the number of times one must take lg before n becomes less than 1.

Chapter 2  A brief introduction to Quantum Computing

Computing can be viewed as the manipulation of data in order to get from an initial state to a useful final state. Suppose we want to sum n integers: we first create an output register and subsequently add numbers to it, and at the end we find the accumulated sum in the output register. What we have done is gradually transform our input and auxiliary work bits (temporary variables) into a good final state. Classically, all information is stored in bits, and we have various electronic gates at our disposal. In the quantum world, we have access to quantum information and can exploit quantum effects such as superposition and entanglement to aid the process of transforming input data into useful target states. This chapter aims to provide a brief, and necessarily superficial, overview of quantum computing and its notation.

2.1 A brief history of Quantum Computing

It all began with the problem of simulation. In a 1982 article [8], Nobel Prize winner Richard P. Feynman proposed that a quantum physical system with R particles cannot be simulated by ordinary computers without an exponential slowdown, whereas a classical physical system can be simulated efficiently in polynomial time. The major challenge of the quantum simulation problem is the description size of the system: describing a quantum system with R particles involves expressing a function φ(x1, x2, ..., xR, t) across its full domain, which is exponential in R. Feynman went on to suggest that if we could use computers that run according to the laws of quantum mechanics, this problem would become tractable. This suggests that a quantum computer can potentially provide an exponential speedup over classical ones.

The quantum model of computation and the idea of a universal quantum computer were formalized by Deutsch [9] in 1985. The construction of a universal quantum Turing machine was later improved by Bernstein and Vazirani [10], who showed how to construct a universal quantum Turing machine that simulates any given quantum Turing machine with polynomial efficiency.

The field had not gained much attention until Shor published his well-known integer factorization algorithm [4]. The RSA cryptosystem depends on the intractability of factoring arbitrarily large composites; given a quantum computer, we now have a polynomial time algorithm not only to test for primality but to actually retrieve the corresponding factors. This shows that quantum computers could be faster than classical computers on more than hypothetical problems of marginal interest.

In 1996, Grover introduced his famous search algorithm [1]. Under the assumption that there is a unique solution to the satisfiability problem, it finds the satisfying index in expected O(√n) time. The technique was later generalized [2, 7] and applied in different scenarios (e.g. waiving the uniqueness assumption, extrema finding, etc.), providing quadratic speedup over the best known classical algorithms. In the following decade, new algorithms based on these primitive techniques were derived, such as various graph algorithms, element distinctness and collision algorithms. In 2006, Bartholomew Furrow showed how to combine these results and apply these tools to various geometric and combinatorial problems [6]. Some of these results will be used in our algorithms.
The theory is, however, far more developed than the practice. So far, only very small scale quantum computers have been built and tested. Quantum effects are only observable at very fine scales, and they are easily corrupted by external interference. The problem of coping with errors in quantum computation was long considered intractable, mainly because of the No-Cloning theorem [11]: one cannot copy quantum information from one storage location to another. Duplication is essential in most classical error correction schemes, so this theorem appeared to be a huge obstacle. It was not until Shor demonstrated an error correcting scheme [12] that the theory of quantum error-correcting codes was established. Given a modest error probability in low-level components, we can now employ error-correcting techniques to support arbitrarily long quantum computations.

2.2 Qubits

Consider a simple coin flip whose outcome is unknown to us. This can be formalized as a system with two possible outcomes, head and tail, with respective probabilities ph and pt. For simplicity, we express this as a probabilistic mixture ph[head] + pt[tail], where ph + pt = 1. The system remains in an unknown state until we actually observe or measure it; once measured, we know the exact outcome, and the system degenerates to either 1[head] + 0[tail] or 0[head] + 1[tail]. This measurement is destructive, as we cannot return the system to the previous mixed state without another flip. On a similar note, it is debatable whether we can clone the state of an unknown coin: there is no way to measure the probability ph or pt.

Now consider the same experiment where we flip a "coin" the size of an electron. Quantum mechanics tells us that a simple probabilistic mixture is not an adequate representation of the system. The correct formalization takes the form αh[head] + αt[tail], where both coefficients are complex and |αh|² + |αt|² = 1. As before, the system remains in an unknown state until we measure it. The probability of measuring the head (resp. tail) outcome is now given by |αhead|² (resp. |αtail|²). Since the probabilities sum to 1, we always detect some valid outcome whenever we measure. The coefficient α is called the amplitude of the corresponding state; the significance of complex amplitudes will become clear in the next section. This notation extends to any physical system with two orthogonal observables. Without loss of generality, we denote them |0⟩ and |1⟩.
The ket notation |φ⟩ is a mathematical tool for denoting a quantum state and is widely used in quantum-related contexts. A state of this system takes the form |φ⟩ = α0|0⟩ + α1|1⟩. As with orthonormal bases of a vector space, the choice of observables is completely arbitrary: one may exchange the roles of |0⟩ and |1⟩, or choose to operate in the rotated basis {(1/√2)(|0⟩ + |1⟩), (1/√2)(|0⟩ − |1⟩)}. The choice of observables depends only on how we build our measurement device.

We have portrayed a qubit as a generalized binary random variable. However, a qubit cannot be reused: by the No-Cloning theorem [11], one cannot back up a qubit for future use, and measurement collapses the state of a qubit, so subsequent measurements always repeat the same outcome. For instance, suppose we start with a qubit in state |φ⟩ = (1/√3)(√2|0⟩ + |1⟩). With probability two thirds we measure state 0, and with probability one third we measure state 1. However, once we have measured 0, the qubit stays 0: future measurements must always agree with the first. Likewise, if we measure 1 the first time, we have collapsed the state to contain only |1⟩ for the rest of the computation. In some implementations, the measurement operation is so destructive that the qubit itself is destroyed and further measurements are not even possible; we then have to re-prepare the qubit before we can repeat the measurement.
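The Born rule behind the example above is easy to simulate (a numpy sketch; each sample models measuring a freshly prepared qubit):

    import numpy as np

    rng = np.random.default_rng(0)
    state = np.array([np.sqrt(2), 1]) / np.sqrt(3)   # |phi> = (sqrt(2)|0> + |1>)/sqrt(3)
    probs = np.abs(state) ** 2                       # measurement probabilities [2/3, 1/3]
    samples = rng.choice(2, size=10_000, p=probs)    # 10000 independent preparations
    print(probs, np.bincount(samples) / 10_000)      # empirical counts approach 2/3, 1/3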
This concept extends to systems with an arbitrary number (> 1) of levels. A qubit is a device capable of encapsulating a two-level system; a sequence of qubits makes a quantum register. Concatenating qubits yields a system with an exponential number of levels: a k-bit register can represent 2^k integers, and with k qubits we have 2^k different observables, including |0⟩|0⟩...|0⟩, |0⟩|0⟩...|1⟩, and so on. For simplicity, we write |0⟩|0⟩...|0⟩ as |00...0⟩, and similarly |00...1⟩ or |00...10⟩. Since these are simply binary numbers, we also use the notation |0⟩, |1⟩, |2⟩, etc.; in general, with k bits we have |0⟩ up to |2^k − 1⟩. These 2^k states provide a basis for the concatenated multi-level system. A quantum computer refers to a collection of qubits and quantum registers, together with devices to manipulate them.

2.3 Unitary Transform and Entanglement

The state of a probabilistic (quantum) system can be described by a vector of probabilities (amplitudes). Different basis states are isolated and progress independently. A probabilistic system evolves through a series of redistributions of probabilities: an operation (state transition) on an n-level system with state vector c = (c0, c1, ..., cn-1)ᵀ can be expressed as

    c' = P c,   i.e.   c'_j = Σ_i p_{i,j} c_i,

where the transition matrix P = [p_{i,j}] means that a system in state i evolves into state j with probability p_{i,j}. A valid transition matrix must preserve the probability of each state, giving Σ_j p_{i,j} = 1; such a matrix is called a Markov matrix.

The general idea carries over to the quantum setting: we have different basis states, and they evolve independently and linearly. Instead of probabilities, the transition matrix contains complex entries, and a valid transition must preserve the amplitudes, ensuring Σ_i |αi|² = 1 after the transition. It turns out that such a transition must be unitary. For any unitary transition matrix A we have A⁻¹ = A†, where A† denotes the adjoint (conjugate transpose) of A. This also implies that all allowed transitions are inherently invertible, which is clearly not required in the classical case. Some basic single-qubit operators are:

    I = [[1, 0], [0, 1]]    X = [[0, 1], [1, 0]]    Y = [[0, −i], [i, 0]]    Z = [[1, 0], [0, −1]]

Consider the basic operation X. Given the single-qubit state |φ⟩ = |0⟩ = 1|0⟩ + 0|1⟩, the result is

    X (1, 0)ᵀ = (0, 1)ᵀ = |1⟩.

Similarly, X flips |1⟩ into |0⟩; X can thus be thought of as the NOT gate of electronics. The operator Z flips the sign of the amplitude of |1⟩ and leaves the amplitude of |0⟩ unchanged.

One of the most important gates is the Hadamard gate, with matrix

    H = (1/√2) [[1, 1], [1, −1]].

It transforms |0⟩ into (1/√2)(|0⟩ + |1⟩) and |1⟩ into (1/√2)(|0⟩ − |1⟩). For the initial state |0⟩, the probability of measuring |0⟩ is 1; after the transformation, the probability of measuring |0⟩ or |1⟩ is exactly (1/√2)² = 1/2. The same analysis applies to the initial state |1⟩: although the amplitude of |1⟩ becomes −1/√2, the measurement probability is still 1/2 after squaring. Hadamard thus appears to behave like a fair coin toss. There are, however, two subtleties. Firstly, in a probabilistic model, a coin toss operator is usually depicted as

    CoinToss = [[1/2, 1/2], [1/2, 1/2]].

This is a complete random shuffle: it destroys whatever information was stored before, so the transformation is not invertible, hence not unitary, and therefore not valid in the quantum setting. Secondly, the effects of cascading CoinToss and H are completely different. Starting from [0] (resp. |0⟩ quantumly), performing CoinToss twice yields (1/2)[0] + (1/2)[1], a perfect mixture of both states. In the quantum case, performing H once gives (1/√2)(|0⟩ + |1⟩); performing H again gives

    H (1/√2, 1/√2)ᵀ = (1, 0)ᵀ.

Performing H twice surprisingly brings us back to our original state. This is the effect of interference in physics, generally absent from any probabilistic computing model. One can also verify that HH = I; such operators are called self-inverse. Interference is one of the main reasons why quantum computing could be more powerful than probabilistic computing, while the major restriction is that we can only perform reversible operations.

In some literature, the state (1/√2)(|0⟩ + |1⟩) is shorthanded as |+⟩ and (1/√2)(|0⟩ − |1⟩) as |−⟩. Given k qubits at |0^k⟩, one can apply the Hadamard transform to all of them, producing |+^k⟩. Expanding, we have (1/√(2^k))(|00...00⟩ + |00...01⟩ + ... + |11...1⟩) = (1/√(2^k))(|0⟩ + |1⟩ + ... + |2^k − 1⟩), a uniform superposition of all possible k-bit patterns. This is often used as an initialization step. Note that applying it a second time reverts the state back to |0^k⟩.
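These identities are easy to check numerically (a small numpy sketch):

    import numpy as np

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    print(H @ np.array([1, 0]))             # |0> -> (|0> + |1>)/sqrt(2)
    print(np.allclose(H @ H, np.eye(2)))    # True: H is self-inverse

    k = 3
    Hk = H
    for _ in range(k - 1):
        Hk = np.kron(Hk, H)                 # H applied to each of k qubits
    print(Hk @ np.eye(2 ** k)[0])           # uniform amplitudes 1/sqrt(8) on all 8 states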
An important 2-qubit operator is the controlled-not (CX) operator. It is best described as: "If the first qubit is 1, apply X to the second qubit." Its matrix form is

    CX = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 0]].

Consider the following scenario. We start with the state |00⟩ = 1|00⟩ + 0|01⟩ + 0|10⟩ + 0|11⟩, represented as (1, 0, 0, 0)ᵀ in vector form. First, we apply H to the first qubit, reaching the state |+0⟩ = (1/√2)(|00⟩ + |10⟩). We then apply the CX operator to both qubits: CX turns |10⟩ into |11⟩ but leaves |00⟩ unchanged. The final state is thus (1/√2)(|00⟩ + |11⟩). This is a special state in which the two qubits are entangled. Suppose we now measure the first qubit and get a result of 1. The measurement collapses the state of the first qubit to |1⟩; however, the only feasible state for this outcome is precisely |11⟩. By measuring the first qubit and obtaining 1, we have also collapsed the second qubit to 1.

This phenomenon does not hold for all multi-qubit systems. Given the state |φ⟩ = (1/2)(|00⟩ + |01⟩ − |10⟩ − |11⟩), we can factorize it as (1/√2)(|0⟩ − |1⟩) ⊗ (1/√2)(|0⟩ + |1⟩) = |−⟩|+⟩: the measurement of the first qubit is completely independent of the second, and both behave like fair coins. Consider another example, the state |ψ⟩ = (1/√18)(|00⟩ + 2i|01⟩ + 2|10⟩ − 3|11⟩), which can be rewritten as

    √(5/18) |0⟩ ( (1/√5)(|0⟩ + 2i|1⟩) ) + √(13/18) |1⟩ ( (1/√13)(2|0⟩ − 3|1⟩) ).

With probability 5/18 we measure 0 in the first qubit, collapsing the second qubit to (1/√5)(|0⟩ + 2i|1⟩); with probability 13/18 we measure 1, collapsing the second qubit to (1/√13)(2|0⟩ − 3|1⟩).

Entanglement is an extremely important property in quantum computation. As described, quantum states evolve independently and linearly. Suppose we are given a multi-qubit quantum state after a long computation. Without entanglement, each qubit behaves as a separate binary probabilistic system on its own, and we cannot tell whether the answers we read off from each qubit correspond to the same computational path. With entanglement, this guarantee is assured.
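The two-qubit preparation just described can be replayed numerically (a sketch; basis order |00⟩, |01⟩, |10⟩, |11⟩):

    import numpy as np

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    CX = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 1],
                   [0, 0, 1, 0]])
    state = np.kron(H, np.eye(2)) @ np.array([1, 0, 0, 0])   # H on the first qubit
    state = CX @ state                                       # (|00> + |11>)/sqrt(2)
    probs = np.abs(state) ** 2
    print(state)                     # [0.707, 0, 0, 0.707]
    print(probs[0] + probs[3])       # 1.0: the two measurements always agree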
2.4 Quantum Circuits and (Approximately) Universal Gate Set

We have discussed qubits and unitary transformations; however, we have yet to establish how these can be more powerful than classical electronic circuits. To begin, we tackle the question of whether all classical electronic circuits can be realized as quantum circuits. This may sound trivial, but it is not immediate once we consider all the restrictions of a qubit system. Firstly, most classical circuits are not invertible. Consider a circuit that takes n input bits and produces m output bits: it cannot be invertible if n > m, by a simple cardinality argument, and some useful functions are simply not invertible even when n ≤ m. The input bits would be lost, with no way to recover them from the output bits. To overcome this, we make our circuits echo the input as part of the output; the equivalent quantum circuit then takes n bits in and produces n + m bits. However, this is clearly not unitary, as the dimensions do not match, so we additionally supply m dummy bits as input. The standard way to convert a classical circuit into a quantum circuit adds m dummy input qubits, all initialized to |0⟩, into which the m result bits are exclusive-ORed. Since the dummy qubits were initialized to |0⟩ beforehand, they end up containing the result. This quantum circuit is also self-inverse: a second application removes all data written to the dummy qubits. Figure 2.1 shows the schematics of the two circuit models.

[Figure 2.1: Differences between classical circuits and quantum circuits]
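A sketch of this conversion for a one-bit-output function given as a truth table (the helper name is illustrative):

    import numpy as np

    def reversible_embedding(f_table):
        """Permutation matrix for |x>|y> -> |x>|y XOR f(x)>, with f: {0..n-1} -> {0,1}."""
        n = len(f_table)
        U = np.zeros((2 * n, 2 * n))
        for x in range(n):
            for y in (0, 1):
                # basis state |x>|y> has index 2x + y
                U[2 * x + (y ^ f_table[x]), 2 * x + y] = 1
        return U

    U = reversible_embedding([0, 1, 1, 0])    # f(x1, x2) = x1 XOR x2 on two input bits
    print(np.allclose(U @ U, np.eye(8)))      # True: self-inverse, hence unitary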
Now that we have our revised circuit model, we have to establish that for any classical circuit there is a quantum circuit computing it. Classically, we know that the NAND gate is universal: any electronic circuit (e.g. AND, OR, NOT) can be implemented as some combination of NANDs. We have already shown that the X operator is a NOT gate. We now present the Toffoli gate, a 3-qubit unitary gate. Similar to the CX gate, it flips the last qubit only if the first two qubits are both 1; mathematically, it transforms |a⟩|b⟩|c⟩ into |a⟩|b⟩|c ⊕ ab⟩. Together with an X gate, we have access to the NAND equivalent in the quantum setting, establishing that all classical circuits can be implemented with quantum circuits. More importantly, any classical circuit using only {AND, OR, NOT, XOR} can be implemented in the quantum setting with no more than a constant factor of overhead.

There is, unfortunately, a technical detail about assignments to attend to. Consider the function y = (x1 ∧ x2) ∨ (x3 ∧ x4); classically, we can implement it as the circuit in Figure 2.2. However, since assignments (cloning) are not allowed, and we need our echoing, the quantum circuit has to take in extra bits as work space. The number of ancillary qubits required depends on how we implement our circuits; luckily, we never need more ancillary qubits than the original complexity of the circuit. Combining these procedures, our quantum circuit takes n input qubits, m output qubits and r ancillary qubits (initialized to |0⟩). The input qubits are echoed, the result is exclusive-ORed into the m output qubits, and the r ancillary qubits are reset to 0 at the end of the gate so that they can be reused. Figure 2.3 shows one possible implementation of the same expression y = (x1 ∧ x2) ∨ (x3 ∧ x4).

[Figure 2.2: Classical circuit of y = (x1 ∧ x2) ∨ (x3 ∧ x4)]
[Figure 2.3: Quantum circuit of y = (x1 ∧ x2) ∨ (x3 ∧ x4)]

Directly from the study of circuit theory, we know that any Turing machine in P can be expressed as a circuit of polynomial size. Together with the argument above, any classical algorithm can be implemented with quantum circuits. The major question is whether we can retain the same efficiency, and there are two concerns: assignments and branching. Consider the following code fragment:

    y ← 0          // initialization
    y ← y ∨ x0
    y ← y ∨ x1
    y ← y ∧ x2
    return y

Apart from the initialization, the remaining three lines rewrite the content of y and are not invertible. Using the tricks mentioned above, we can rewrite this as:

    y0 ← y1 ← y2 ← y3 ← 0          // initialization
    y1 ← y1 ⊕ (y0 ∨ x0)
    y2 ← y2 ⊕ (y1 ∨ x1)
    y3 ← y3 ⊕ (y2 ∧ x2)
    return y3

Since we "write" those results into different registers, no information is lost and this code fragment is now invertible. Doing so introduces a new ancillary quantum register for each assignment; since k assignments need only k ancillary registers, there is only a constant factor of overhead to prepare the state.

The second problem concerns if statements and branching. Classically, we evaluate a Boolean expression and select which branch to take. However, measurements collapse superpositions and are not unitary. Luckily, we can postpone measurements: after evaluating the expression, we simply wire that qubit to controlled gates, so all remaining calculations are performed only if they satisfy all previous conditions. This is similar to the concept of multiplexers commonly used in electronics. If needed, we measure the expression qubits at the end of our computation to recover the history of the computation; entanglement guarantees that the history and the result always match perfectly. Similarly, since we only need extra resources for each branch, the added overhead is proportional to the actual work of the original algorithm. As a result, all classical algorithms can be implemented quantumly with at most a linear overhead. The potential drawback is that we may need asymptotically more memory than the classical algorithm; in general, we try to free up ancillary qubits for reuse where possible.
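Numerically, wiring a qubit to a controlled gate looks as follows (a sketch of a controlled-X built from projectors):

    import numpy as np

    X = np.array([[0, 1], [1, 0]])
    P0 = np.array([[1, 0], [0, 0]])                # |0><0|: control qubit is 0
    P1 = np.array([[0, 0], [0, 1]])                # |1><1|: control qubit is 1
    CU = np.kron(P0, np.eye(2)) + np.kron(P1, X)   # apply X only when control is |1>
    print(CU @ np.array([0, 0, 1, 0]))             # |10> -> |11>, i.e. [0, 0, 0, 1]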
We have seen that any classical computation can be transformed into an equivalent quantum one with no more than a linear overhead. However, there are more unitary transforms than there are circuits, so the power of quantum computing could be significantly greater than that of classical algorithms or circuits. There are two potential problems:

• We may not be able to specify an arbitrary unitary transformation. After all, if we are using an electronic computer to manage its internal quantum module, we only have limited representative power from classical bits. For instance, we may not be able to specify a transform with irrational coefficients.

• The transform may not be realizable efficiently. An n-qubit unitary transform is a 2^n × 2^n matrix. It is unrealistic to assume that we can get any unitary transform off the shelf; at some point we must construct our own circuits from small pieces. Since the underlying matrix is exponential in size, it is not surprising that some transforms require an exponential number of small pieces to construct.

In this thesis, we stay away from precision or irrational gates by restricting ourselves to the set of primitive gates discussed above.¹ This finite set is approximately universal: combinations of these gates can approximate any unitary transform to within any given error. Moreover, extensive error correcting schemes have been developed over these gates. We believe that our chosen set of unitary transforms is less demanding and more practical, while maintaining most of the power of general unitary transforms.

¹ We also consider the rotational gate R(θ), which transforms |0⟩ to |0⟩ and |1⟩ to e^{iθ}|1⟩.

2.5 Model of Computation

With the foundations of qubits and unitary transforms, we can construct more powerful computational models. Classically, all algorithms can be expressed as Turing machines, and indeed any quantum algorithm can be modeled as a Quantum Turing Machine. Unfortunately, the details of expressing algorithms in this model are unnecessarily tedious. To simplify our discussion, we base our algorithms on a computation model similar to the standard RAM model. We have access to all the standard features, including a set of classical registers and the ability to perform simple arithmetic operations on them in constant time. To add quantum capability, we add another, independent set of quantum registers. We are allowed to perform initialization (to |0⟩) and apply simple gates to any of these registers; for instance, we can apply the Hadamard transform H to each bit in a register. As in the RAM model, we avail ourselves of standard arithmetic in constant time. It is worth mentioning that although the cost of multi-bit arithmetic depends on the number of bits, we still treat it as a constant-cost atomic operation. This mirrors computers today, where adding two 4-bit integers is as costly as adding two 32-bit integers: the underlying arithmetic circuits are highly optimized and parallelized, and it generally takes longer to interpret an instruction than to execute it. With hardware acceleration, basic arithmetic operations such as addition and subtraction on n-bit registers can be as fast as O(lg n), using O(n lg n) gates; complicated operations such as multiplication, division or the Quantum Fourier Transform (QFT) can take as long as O(n lg n) (or even longer). For uniformity, we avail ourselves of an elementary operation if and only if its counterpart is available in the standard RAM model.

In line with standard quantum computing assumptions, we also assume that all quantum computations are error corrected and decoherence free. We do not capture errors in our pseudocode, but one may want to add error checking appropriately in practical implementations.

Many quantum algorithms are not exact: they cannot guarantee the correctness of their output, and may even produce incorrect outputs. Fortunately, most of them guarantee that their output is correct with probability no less than two thirds, and we may rerun an algorithm several times to increase our confidence. Some algorithms, such as factoring, produce positive certificates; correctness is then guaranteed on positive instances and we only have to deal with the negative ones. The problem of incorrect outputs is arguably unavoidable. Firstly, no quantum circuit, including our final measurement, can be error-free. Secondly, one can prove that some of the speedups are only available in a bounded-error model, where we only guarantee a success probability significantly better than one half (generally two thirds). This nondeterminism may impact the running time as well: some algorithms may run forever if they are unlucky in every round, yet have very good expected behaviour. In this thesis, we focus on algorithms that succeed with a constant probability significantly larger than one half and with good expected runtime. This expectation depends only on the probabilistic measurements or coin tosses made during the execution of the algorithm, and is independent of the input; more precisely, we are interested in the worst expected behaviour over all possible inputs, as in [6]. We also point out that quantum computers have the inherent ability to provide pure randomness, so it is unlikely that an adversary could force our algorithm down a worst-case path.
2.6 Summary

We have reviewed the basic notions of quantum information and computation. We have also surveyed a number of useful gates and reviewed the model of computation generally used in the setting of quantum computing. Our model resembles the RAM model with the addition of quantum registers and quantum gates. We are also more restrictive, in that we only allow gates that are primitive (readily implementable), small (with a small unitary transform matrix) and error-recoverable. Since the runtime of some quantum algorithms depends heavily on probabilities, and a run may not even terminate, all our runtime analyses focus on the worst expected runtime over all possible inputs.

Chapter 3  Quantum Search and Amplitude Amplification

Over the past few decades, people have studied different problems and derived efficient quantum algorithms for them. One of the fundamental problems of interest is the Database Search problem: given a collection of elements, we would like to find some good entry subject to a given predicate. One example: given a phone book, search for the company with the telephone number (135)792-4680. This task is easy if the phone book is sorted by phone number; however, most directories store entries by name, with no particular ordering on phone numbers. This is like searching for a record in an unordered database, which generalizes to the satisfiability problem mentioned before.

We now formally define the search problem. As input, we are given two items:

• n, the number of entries.

• F, a predicate function mapping {0, ..., n−1} → {False, True}; F(i) = True if and only if the ith entry satisfies the predicate.

The predicate function is implemented as a black box, and information can only be extracted through actual invocations of the predicate. As output, we have two possibilities:

• some i such that F(i) = True, or

• NotFound.

We further denote by m = |F⁻¹(True)| the number of satisfying indices, and by α = m/n the fraction of domain elements satisfying the predicate.

Linear search through the whole domain provides a natural classical solution, yielding an O(n) algorithm whose runtime is relatively independent of α and m. To understand the difficulty of the problem, we can analyse it with an adversary argument: we take on the role of the oracle and answer queries so as to force the algorithm to run as long as possible. A simple strategy is to provide negative answers until the last m entries. Even in the extreme case α = 1/2, any algorithm would still need to scan through half of the set before hitting a good entry.

On the other hand, we can tackle this problem with a randomized algorithm that probes random entries iteratively. The probability of hitting a good entry is α, so the algorithm terminates with a good entry in expected O(α⁻¹) time; in the extreme case above, it completes in expected constant time. There are, however, three drawbacks to the randomized approach:

• The algorithm may never terminate if our random index generator is flawed.

• With low probability, the algorithm may take many steps before terminating.

• The algorithm can never terminate if m = 0, in which case it should have returned NotFound.
The first issue is related to pseudorandomness. We do not discuss it in depth: with quantum computers we have access to true randomness, and the issue is resolved. The second and third issues mainly concern when we should stop trying, and how we can avoid testing the same entries repeatedly. In principle, we could mark entries already checked and avoid them in later iterations; however, the bookkeeping needed to remember and avoid tested entries could impose severe overhead. An alternative approach is to keep trying for a fixed number of times (e.g. 8n): we return a good index immediately if we find one, and return NotFound if we keep failing until the end. We thus bound our runtime at the price of some success probability. Whenever m ≥ 1, the probability of success within 8n trials is at least 1 − (1 − 1/n)^{8n} ≥ 1 − e⁻⁸, and the error is one-sided: we only err when we output NotFound while m is actually positive.
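A sketch of this capped strategy (the cap constant 8 follows the discussion above; F is the blackbox predicate):

    import random

    def capped_random_search(n, F, cap=8):
        """Randomized search with a bounded number of trials and one-sided error."""
        for _ in range(cap * n):
            i = random.randrange(n)      # a truly random probe
            if F(i):
                return i                 # a positive answer is always correct
        return None                      # NotFound; wrong with probability <= e^-8 when m >= 1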
3.1 Grover's Algorithm

Grover studied the database search problem and derived a quantum algorithm with a quadratic speedup in [1]. We follow the original discussion by assuming that m = 1 and n = 2^k for some k. Furthermore, we assume that the predicate F is given as a quantum black box 𝓕.

With a quantum implementation of the predicate F, we can build a quantum black box 𝓕 which transforms |i⟩ into (−1)^F(i)|i⟩. Recall that any algorithm or circuit can be turned into an equivalent quantum box, taking in an extra output bit into which the result is exclusive-or-ed. In previous sections, we have illustrated how we can feed in a |0⟩ as the output bit and read the result after. To implement 𝓕, we first direct our input |i⟩|0⟩ into F to get |i⟩|F(i)⟩. We then apply the Z transform to our output qubit, which negates the amplitude of the state if and only if F(i) is 1. We then run F again on our state (−1)^F(i)|i⟩|F(i)⟩ to revert the output bit, giving us (−1)^F(i)|i⟩|0⟩. This requires two invocations of F. There is a more clever way by which we can construct our black box 𝓕 with only one invocation of F, as noted in the work of Boyer et al. [2].

With m = 1, we denote the only good index as q. Our goal is to find q among the n possibilities. The first step of Grover's algorithm is to prepare a superposition of all values in the domain, denoted as

|Ψ⟩ = (1/√n)|0⟩ + (1/√n)|1⟩ + ⋯ + (1/√n)|q⟩ + ⋯ + (1/√n)|n−1⟩

Remembering that n = 2^k, we can prepare this state simply by applying Hadamard's gate to each of the k qubits. We denote this operation S = H^⊗k. We further define |A⟩ = |q⟩ and |B⟩ = (1/√(n−1))(|0⟩ + |1⟩ + ⋯ + |q−1⟩ + |q+1⟩ + ⋯ + |n−1⟩). These two states are orthonormal to each other. The state |A⟩ is a uniform mixture of all good indices while |B⟩ is a uniform mixture of all bad indices. Recalling that α = m/n = 1/n, one can verify that the state |Ψ⟩ is equivalent to √α|A⟩ + √(1−α)|B⟩.

Given the state |Ψ⟩, we can perform a measurement and test the outcome against F. The probability of measuring |A⟩ is α. This is essentially our randomized algorithm with a pure random number generator. The main idea of Grover's algorithm is to boost the amplitude associated with |A⟩ to more than half, thereby increasing our probability of success. In fact, we can also think of the linear scan algorithm as a boost to the success probability: at each failed trial, we increase our chance of success from 1/n to 1/(n−1). With only a limited rate of increase, such algorithms take approximately O(α⁻¹) time to boost the success probability to at least a half.

The quantum algorithm consists of repeated applications of the Grover iteration, given by −S𝓕₀S⁻¹𝓕, where 𝓕₀ refers to the predicate that is satisfied only at 0. To better understand the algorithm, we first draw our state |Ψ⟩ as a vector in the 2D plane spanned by |A⟩ and |B⟩. Although we may have complex amplitudes in the general case, we shall see that we stay in the real domain before and after each iteration. As a result, our current state (denoted |φ⟩) can always be written as a vector of the form x|B⟩ + y|A⟩, where x, y are real coefficients. Applying 𝓕 inverts the y amplitude associated with |A⟩ and is effectively a reflection about the x-axis. The remaining term −S𝓕₀S⁻¹ is actually a reflection about the vector |Ψ⟩ = √(1−α)|B⟩ + √α|A⟩. To see this, we can do a few mathematical tricks. We define another coordinate system with orthonormal basis |A′⟩ = |0⟩ and |B′⟩ = (1/√(n−1))(|1⟩ + |2⟩ + ⋯ + |n−1⟩). By definition, we have S|A′⟩ = |Ψ⟩. Since |A′⟩ and |B′⟩ are orthonormal, it follows that S|B′⟩ must also be orthonormal to S|A′⟩ = |Ψ⟩. As a result, |Ψ⟩ = S|A′⟩ and its normal S|B′⟩ also form an orthonormal basis for our vector. We have three operations in succession, −S𝓕₀S⁻¹. The first operator S⁻¹ simply converts the coordinate system from {S|A′⟩, S|B′⟩} to {|A′⟩, |B′⟩}. After that, 𝓕₀ inverts the amplitude associated with |A′⟩. Since we have a global negation after each iteration, this negation along |A′⟩ is undone and the net effect is that the amplitude associated with |B′⟩ is negated. Finally, the final S transforms this negation back to the basis vector S|B′⟩. As we have argued, S|B′⟩ is normal to |Ψ⟩, so the net effect is that our state is reflected about |Ψ⟩. Schematically, the two reflections together rotate our state about the origin by an angle 2θ (Figure 3.1).

[Figure 3.1: One Grover Iteration]

Initially, we have sin(θ) = √α. The probability of getting a good state |A⟩ is sin²(θ) = α. After the rotation, the probability of getting a good state is sin²(θ + 2θ). After g such rotations, our state makes an angle of (2g+1)θ with the x-axis, and the success probability is then sin²((2g+1)θ). Ideally, we want (2g+1)θ ≈ π/2. For small α, we have √α = sin(θ) ≈ θ. We need g to be roughly π/(4θ) − 1/2 for the best result. This translates to an O(√(α⁻¹)) algorithm. Compared with the classical O(α⁻¹) algorithm, this provides a quadratic speedup.
The pseudocode is presented here for completeness:

repeat forever
  |φ⟩ ← |0⟩
  apply H^⊗k to |φ⟩
  θ ← sin⁻¹(√α); θ̂ ← θ
  while θ̂ + 2θ < π/2 do
    apply H^⊗k 𝓕₀ H^⊗k 𝓕 to |φ⟩
    θ̂ ← θ̂ + 2θ
  end while
  x ← measure |φ⟩
  if F(x) = 1 then
    return x
  end if
end repeat

The overall repeat loop is necessary since our state |φ⟩ may never overlap entirely with |A⟩; we may have to do this a few times before we finally get lucky with our measurement. In Grover's setting m = 1 and this program should never return NotFound. The inner Grover iteration makes use of S, which is H^⊗k in our case. Note that H^⊗k is self-inverse, and thus S⁻¹ is simply H^⊗k. Moreover, we have skipped the global negation after each iteration; that does not affect our measurement probabilities and can be neglected.

The total work within the repeat loop is O(√(α⁻¹)). We can see that we always improve |φ⟩ until it has a success probability of no less than a half. The expected running time of this algorithm is thus O(√(α⁻¹)). With m = 1, it translates to an O(√n) algorithm for the database search problem.

Grover's algorithm works by selectively amplifying the amplitude associated with the good term. This is generally referred to as Amplitude Amplification. The technique relies on the destructive and constructive interference effects only available in quantum computers; as a result, there is no parallel in classical computers.

Before Grover published his algorithm, in 1994, Charles Bennett, Ethan Bernstein, Gilles Brassard and Umesh Vazirani had already proved that an unstructured search for a unique solution over a space of size n requires at least Ω(√n) calls to the query/predicate (𝓕 in our case) [13]. Therefore, the database search problem is optimally solved by Grover's algorithm.
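The rotation picture can be checked directly on a classical simulator. The sketch below is our own illustration (not part of the thesis's model of computation): it stores the state as a length-n amplitude vector, applies the sign flip 𝓕 and the diffusion step as a reflection about the mean, and runs roughly π/(4θ) iterations.

import numpy as np

def grover_demo(k=10, q=123):
    n = 2 ** k
    state = np.full(n, 1 / np.sqrt(n))   # |Psi>: uniform superposition
    theta = np.arcsin(1 / np.sqrt(n))    # sin(theta) = sqrt(alpha), here m = 1
    iterations = int(np.pi / (4 * theta))
    for _ in range(iterations):
        state[q] *= -1                   # oracle F: flip the good amplitude
        state = 2 * state.mean() - state # diffusion: reflect about the mean
    return iterations, state[q] ** 2     # success probability of measuring q

if __name__ == "__main__":
    its, p = grover_demo()
    print(f"{its} iterations, success probability ~ {p:.4f}")  # close to 1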
3.2 BBHT

Shortly after Grover's publication, Boyer, Brassard, Høyer and Tapp studied the algorithm in greater depth [2] and proposed different ideas to loosen its constraints. In honour of their work, we shall call their algorithm after their initials, BBHT.

m is fixed to 1 in Grover's scenario. It is not hard to see that by changing the definitions of |A⟩ and |B⟩ accordingly, the algorithm will still proceed in exactly the same way. The state |φ⟩ will still rotate about the origin with an angle of 2 sin⁻¹(√α) per iteration. Should we know α, our previous pseudocode would work seamlessly; the same analysis still applies and we will have an algorithm that runs in O(√(n/m)) steps. Note that if we know that m = 0, we would return NotFound immediately, rather than working through futile iterations.

Another, less binding, constraint they have tackled is that n need not be a power of 2. As a matter of fact, we only exploited this constraint when we prepared the state |Ψ⟩; any unitary transform that brings the state |0⟩ to |Ψ⟩ will work. They have suggested the use of a Quantum Fourier Transform for the task. The Quantum Fourier Transform of order n is given by the transition |i⟩ → (1/√n)·Σ_{j=0}^{n−1} ω_n^{ij}|j⟩, where ω_n denotes the nth root of unity. On the other hand, we can simply stretch our domain to a power of 2: we wrap our predicate so that any index greater than or equal to n is rejected. The new α will be no less than half of what it used to be, and this translates to no more than a small constant factor (≤ √2) on our expected runtime.

In their paper [2], Boyer et al. have also tackled the problem when m is unknown. We know that the success probability of Grover's algorithm depends heavily on how many iterations we take, and this number in turn depends heavily on m. The main idea behind the BBHT algorithm is to repeatedly try a random number of iterations in a controlled manner. For instance, we can guess that m might be very big and simply measure without doing any amplification. After a few failures, we may believe that m could be smaller and only measure after a few iterations. Furrow [6] further studied the BBHT algorithm and provided a tight error bound for it. We now present the version adopted in Furrow's thesis:

procedure BBHT(F)
  factor ← 1.31
  g ← 1
  while g ≤ 2√n do
    |φ⟩ ← |Ψ⟩
    j ← random(g), a random integer in the range [0, g)
    apply j Grover iterations to |φ⟩
    x ← measure |φ⟩
    if F(x) = 1 then
      return x
    end if
    g ← g × factor
  end while
  return NotFound
end procedure

Furrow proves that the probability of failure (returning NotFound even when there is a solution) is less than 0.5·m^(−0.93), and that the total number of calls to F has an expectation of Θ(√(n/m)) when m > 0. When m = 0, all measurements return a bad index. Since the total amount of work we have to do is bounded by g, which follows a geometric sequence bounded by 2√n, the algorithm runs for O(√n) steps before it terminates, returning NotFound. As a result, the runtime is bounded by O(√n) even when m = 0. The actual proof requires a significant amount of math and will not be reproduced in this thesis; interested readers are advised to read the relevant chapter (Appendix A) in Furrow's thesis [6].

Since the success probability is constant and the error is one-sided, we simply assume that we have a BBHT implementation whose success probability is at least 3/4. This can be achieved by no more than a constant number of invocations of any existing implementation.

BBHT is very powerful since it operates without any prior knowledge about the domain. Boyer et al. have also proved that it is optimal as a blackbox database search algorithm. It is widely used in different algorithms. Its error is one-sided, and we can boost the success probability simply by rerunning it a few times. It is also a practical algorithm with only a small constant factor in its expectation.

One may also note that all computations in the BBHT pseudocode are rational. The original Grover's algorithm may require computations of angles, including square roots and inverse trigonometric functions, which cannot be done in absolute precision. Without the information about m and α, BBHT works entirely in the rational domain and can be coordinated and regulated by ordinary computers.

In addition, we also have a nice uniform distribution on the outcome. As we have shown for Grover's algorithm, the state of the computation can always be represented in the 2D plane spanned by the good vector and the bad vector; the amplitudes of all good (and all bad) indices are exactly the same along their respective axes. Boyer et al. have also shown that BBHT is unbiased in its output and any satisfying index is returned with equal probability. As a result, we can treat BBHT as an ideal random sampler.
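Since after j Grover iterations the state makes an angle (2j+1)θ with the bad axis, a measurement succeeds with probability sin²((2j+1)θ). A classical sketch of the BBHT control loop can therefore draw its measurement outcomes from that formula instead of simulating the state; the helper names below are ours, and this is only an illustration of the control structure.

import math, random

NOT_FOUND = None

def bbht(n, good):                       # good: the set of satisfying indices
    m = len(good)
    theta = math.asin(math.sqrt(m / n)) if m else 0.0
    g, factor = 1.0, 1.31
    while g <= 2 * math.sqrt(n):
        j = random.randrange(int(math.ceil(g)))
        p = math.sin((2 * j + 1) * theta) ** 2 if m else 0.0
        if random.random() < p:
            return random.choice(sorted(good))  # outcome uniform over good
        g *= factor
    return NOT_FOUND

if __name__ == "__main__":
    print(bbht(1 << 16, {7, 99, 4096}))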
3.3 Other variants

The technique of Amplitude Amplification can be further fine-tuned. For example, the BCWZ algorithm invented by Buhrman, Cleve, de Wolf and Zalka [5] tackles the trade-off between running time and the probability of error. Suppose we want to run a database search with error probability bounded by ε; it takes Θ(lg ε⁻¹) rounds of BBHT before we can obtain the desired error bound. Buhrman et al. [5] showed how one can achieve this in a total expected time of O(√(n·lg ε⁻¹)). This introduces a quadratic speedup on the error term, but it does not take advantage of m even if the latter is large.

Furrow has combined BBHT and BCWZ to create a general error-bounded search algorithm [6] which takes advantage of both. He has also illustrated how his tools can be used to improve various known algorithms. However, both of these require manipulation of irrational numbers. We will skip over them in this thesis, as some of those techniques require arbitrary manipulation of real numbers. Our ideas only depend on a search algorithm running in square-root time; one can plug in the more elaborate versions in place of our BBHT where they see fit.

3.4 FindAll

Another fundamental problem people have addressed is the FindAll problem. Given a blackbox predicate F, we would like to find all satisfying entries of it. Since we will also make extensive use of it, we rederive it here with the standard BBHT.

In the last paragraph of Quantum Counting [14], Brassard et al. have discussed the possibility of repeatedly invoking BBHT in order to count how many satisfying indices there are. Although this runs with a larger memory overhead in terms of the counting problem, the approach can be generalized to find the whole satisfying set. A naive iterative BBHT approach has two main drawbacks: (i) we may get unnecessary duplicates; (ii) we do not know when to stop. One may try to do a complicated analysis to handle these from a purely probabilistic point of view, or estimate the number of good entries by Quantum Counting [14]. On the other hand, we can simply mark seen entries. For instance, we can keep a Boolean array of size n, where each entry marks whether we have already collected the corresponding item. With this we define our augmented predicate F′ such that i is good if and only if F(i) = 1 and i has not been marked before. With a linear array, the latter condition can be checked in constant time and the running time of our augmented oracle is still dominated by the original predicate. The pseudocode is thus:

procedure FindAll(F)
  seen[*] ← False
  Result ← ∅
  repeat forever
    x ← run BBHT with the predicate F′(i) = 1 if and only if seen[i] = False and F(i) = 1
    if x = NotFound then
      return Result
    end if
    add x to Result; seen[x] ← True
  end repeat
end procedure
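A minimal sketch of this marking strategy, using a Python set in place of the Boolean array and the 8n-trial randomized search from earlier in this chapter as a classical stand-in for BBHT (its cost, not its one-sided error behaviour, is what differs from the quantum routine):

import random

NOT_FOUND = None

def randomized_search(n, predicate, trials):
    """Classical stand-in for BBHT: random probing with one-sided error."""
    for _ in range(trials):
        i = random.randrange(n)
        if predicate(i):
            return i
    return NOT_FOUND

def find_all(n, F):
    seen = set()                   # replaces the Boolean array seen[*]
    result = []
    while True:
        # augmented predicate F': good iff unseen and satisfying
        x = randomized_search(n, lambda i: i not in seen and F(i), 8 * n)
        if x is NOT_FOUND:
            return result          # certified by a failed search
        seen.add(x)
        result.append(x)

if __name__ == "__main__":
    print(sorted(find_all(1024, lambda i: i % 100 == 0)))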
Since we have marked seen entries, BBHT will always return an unseen good entry, or return NotFound when all good entries have been seen. The potential error of this code is also one-sided, in that we may terminate before we have depleted all the good entries. At termination, we must have called BBHT and had it return NotFound; this is a certificate that our algorithm has correctly terminated. As a result, the error rate of this FindAll procedure only depends on how good this certificate is. Since BBHT has a bounded error of at most 1/4, our FindAll procedure also has a bounded error of at most 1/4.

The running time of this procedure depends on both n and m. Initially, we have m good indices and it takes no more than c√(n/m) time to retrieve any one of them, for some appropriate constant c. After marking the first seen entry, we have only m−1 good entries and it takes no more than c√(n/(m−1)) time to retrieve the next one. Finally, we will have explored all good entries, bringing m to 0; BBHT then takes at most c′√n time before returning NotFound, certifying that all good indices have been explored. Continuing this way, the overall runtime is no more than:

Σ_{k=1}^{m} c√(n/k) + c′√n ≤ 2c√(nm) + c′√n = O(√(nm))

This algorithm thus gives us a way to sample all good indices in O(√(nm)) time. When m = n, it takes linear time, which matches the corresponding classical bound. The drawback of this approach is that we need linear memory. (We use a Boolean array of size n to store the marking information. One may use a balanced binary search tree of size O(m) for the same purpose, but it then takes O(lg m) time to check whether something is marked.) It seems that initializing the memory may dominate the total runtime. Luckily, we have many ways to bypass this. For instance, we can reuse our allocated memory across runs. We can also perform zero-initialization using two arrays and several counters; interested readers may consult Exercise 11.1-4 of Introduction to Algorithms [15].

Chapter 4

Quantum Unary Extrema Finding

We have already discussed the problem of looking for the extremum of a set when we are only given a unary predicate. We have reviewed that the problem takes Θ(n + lg|𝒰|) time to solve classically and have described an optimal algorithm that achieves this bound. We now begin our discussion of how quantum algorithms provide an asymptotic speedup which is not possible classically. We begin by showing an intrinsic lower bound for this problem. After that, we study related algorithms and see how we can improve them. Our upper bound is derived through a series of optimizations, and the different techniques applied can be combined to create better results. We will present each technique sequentially with its motivation and effects. We also maintain that our algorithms succeed with probability at least 3/4, on par with BBHT.

4.1 Lower bound

We begin by deriving the intrinsic lower bound for this problem. This gives us an idea of how difficult the problem is and what we are aiming for. Before we jump right into the extrema problem, we first review two lower bound results from [16] and [17].

Theorem 2 (Beals et al. [16]: Theorems 4.8 & 4.10) Suppose we are given a blackbox Boolean oracle function F. Any bounded error quantum algorithm must make at least Ω(√n) oracle queries to verify that there is at least one image with value True.

This also implies that the verification of the predicate "∀x F(x) = False" requires Ω(√n) oracle calls. In our problem, suppose we want to verify the maximality of a candidate value λ; this is equivalent to asking whether the blackbox oracle function F(i) = (𝒪(i, λ) = '≻') is unsatisfiable. On the other hand, given a blackbox Boolean oracle function F, we can construct an equivalent maximum finding problem with 𝒰 = {0, 1} and

𝒪(i, k) = '≺'  if k = 1 and F(i) = False
𝒪(i, k) = '='  if (k = 0 and F(i) = False) or (k = 1 and F(i) = True)
𝒪(i, k) = '≻'  if k = 0 and F(i) = True

We can then run any unary maximum finding algorithm to find the "maximum" of those n entries. By construction, the function F is satisfiable when and only when the maximum of these n numbers is 1.
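A small sketch of this reduction, phrased in Python: given a black-box Boolean predicate F, the wrapped unary oracle answers comparisons over the two-valued universe exactly as in the three cases above (ASCII '<', '=', '>' stand in for '≺', '=', '≻').

def reduction_oracle(F):
    """Turn a Boolean predicate F into a unary comparison oracle over U = {0, 1}."""
    def oracle(i, k):
        v = 1 if F(i) else 0   # the hidden value x_i encodes F(i)
        if v < k:
            return '<'
        if v == k:
            return '='
        return '>'
    return oracle

# The maximum of the x_i is 1 iff F is satisfiable, so any unary maximum
# finder inherits the Omega(sqrt(n)) search lower bound.
if __name__ == "__main__":
    O = reduction_oracle(lambda i: i == 5)
    print(O(5, 0), O(5, 1), O(0, 0))   # '>' '=' '='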
Therefore, the lower bound follows: any bounded error quantum unary maximum finding algorithm must make at least Ω(√n) queries to the blackbox function F. Note that each call to our 𝒪 defined above can replace only a constant number of invocations of F. As a result, any unary maximum finding algorithm must also make at least Ω(√n) calls to our oracle.

In addition, we have another theorem about ordered searching in a domain:

Theorem 3 (Høyer et al. [17]: Theorem 1) Suppose we are given a totally ordered universe 𝒰 and a value u drawn from the universe. Any bounded error quantum algorithm determining the value of u with only a blackbox threshold predicate function requires at least Ω(lg|𝒰|) calls to the oracle.

When n = 1, the maximum finding problem naturally reduces to an ordered search problem. As a result, any quantum algorithm solving the quantum unary extrema finding problem requires at least Ω(lg|𝒰|) calls to the oracle predicate. Combining these two reduction ideas, we have the theorem for the lower bound of the quantum unary extrema finding problem:

Theorem 4 Any bounded error quantum algorithm solving the quantum unary extrema finding problem requires Ω(√n + lg|𝒰|) calls to the predicate oracle.

A full algorithm may contain extra operations other than oracle calls, so this lower bound trivially carries over to the more general case.

4.2 Extending previous algorithms

We start our discussion by reviewing the optimal quantum algorithm [7] for maximum finding in the binary comparison model. We denote the original set S = {x_i | 0 ≤ i < n}. The algorithm starts by picking a random element as candidate λ. It serves as an underestimate of the global maximum and can be treated as a filter: elements that are smaller than λ are certainly not maximal and can be neglected. We refer to elements larger than λ as active elements and their corresponding set S′ = {x_i | x_i > λ} as the active set. We iteratively sample an element from S′ and update λ to it accordingly. When S′ becomes empty, the algorithm terminates with λ being the maximum.

Sampling from S′ is a typical search problem under a predicate and is best solved by BBHT. We denote the size of S′ by m. Each selection expectedly calls the oracle O(√(n/m)) times. In the binary comparison model, λ can be specified by an index: given the index of the candidate, we can verify whether another index corresponds to a bigger entry with a single call to the binary oracle. Each selection thus costs O(√(n/m)). If we restrict ourselves to the unary comparison model, checking whether an index corresponds to a larger entry requires a binary search of cost O(lg|𝒰|). As a result, each selection costs expectedly O(√(n/m)·lg|𝒰|).

Dürr and Høyer proved that their algorithm runs with O(√n) calls to the comparison oracle. In the binary model, this algorithm runs in O(√n) time and is optimal. In the unary model, the straight replacement of the comparison function yields an O(√n·lg|𝒰|) algorithm.

We shall rederive the runtime, as it gives us insight into how this algorithm progresses. Moreover, we will show an alternative adaptation of this algorithm. Initially, we start with a threshold λ = −∞ and m = n. The BBHT algorithm uniformly selects a random sample among the m available larger elements. With probability at least half, we would have selected an entry at least as large as the median of the active set.
Setting λ to that element will halve the active set, reducing m to at most m/2. Such a halving occurs expectedly within 2 iterations, so it takes expectedly 2 lg m + 1 iterations before m is reduced to 0, when the algorithm terminates with the correct answer.

We now divide the execution of the algorithm into different stages. Suppose we start with m = m₀ = n; we divide the execution into lg m₀ stages. We say the algorithm is in stage i if ⌊n/2^i⌋ ≥ m > ⌊n/2^(i+1)⌋. A BBHT selection at stage i requires O(√(n/m)) = O(√(2^(i+1))) calls to the comparison function. The expected total number of comparison calls spent in selection is the sum over all stages, giving us O(Σ_{i=0}^{lg m₀} √(2^(i+1))), a geometric series dominated by the last term √n. In the binary comparison model, each oracle call is O(1) and the expected total selection cost is O(√n). In the unary comparison model, where binary comparisons can be simulated with O(lg|𝒰|) oracle calls, the expected total selection cost is then O(√n·lg|𝒰|). Note that this O(√n) comparison cost depends only on the final √n term and is independent of our initial threshold or m₀.

At the end of each stage, we will also update λ, which takes constant time if we only store the index. This adds an extra O(lg m₀) term to the total runtime. However, with m₀ upper bounded by n, this does not impact the total runtime.

It may already be apparent that we can do more work per update in order to ease the workload of comparisons during BBHT. In fact, we can update λ to the exact value of our chosen index. With the exact value, each comparison can be done in a single call to the oracle. Updating λ to the exact value requires a full binary search at the end of each stage. The pseudocode is listed as follows:

procedure Sample_And_Improve(𝒪, λ₀)
  λ ← λ₀
  do
    t ← BBHT(x_i > λ)
    if t ≠ NotFound then
      λ ← binary_search(t)
    end if
  while t ≠ NotFound
  return λ
end procedure

The BBHT selection is done with respect to a predicate, which is a function on the index i, denoted f(i) = (x_i > λ) = (𝒪(i, λ) = '≻'). Moreover, we pass in λ₀ as our initial guess. This defines m₀ and could potentially improve our performance if set properly.

Although a single binary search is costly, it only occurs an expected O(lg m₀) number of times. Therefore, this unary maximum finding algorithm spends O(√n) time running BBHT for sampling, and O(lg|𝒰|·lg m₀) time updating λ. We now have the following theorem:

Theorem 5 Sample_And_Improve solves the problem of unary maximum finding in expected time O(√n + lg|𝒰|·lg m₀), where m₀ refers to the number of active elements we started with. It succeeds with probability at least half.
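A classical sketch of Sample_And_Improve over a finite universe [0, U), with a random-probing stand-in for BBHT (given a generous trial cap so the demo almost always returns the true maximum; the quantum routine achieves each selection in O(√(n/m)) instead) and an explicit binary search to recover the exact value of the chosen entry. All helper names are ours.

import random

NOT_FOUND = None

def randomized_search(n, predicate):
    for _ in range(8 * n):           # classical stand-in for one BBHT run
        i = random.randrange(n)
        if predicate(i):
            return i
    return NOT_FOUND

def sample_and_improve(xs, U, lam0=-1):
    """Maximum of xs (values in [0, U)), improving a threshold lam."""
    n = len(xs)
    lam = lam0
    while True:
        t = randomized_search(n, lambda i: xs[i] > lam)
        if t is NOT_FOUND:
            return lam               # certified by a failed search
        lo, hi = lam + 1, U - 1      # binary search: ~lg U unary comparisons
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if xs[t] >= mid:         # one unary query "is x_t >= mid?"
                lo = mid
            else:
                hi = mid - 1
        lam = lo                     # lam is now the exact value of x_t

if __name__ == "__main__":
    data = [random.randrange(1 << 16) for _ in range(1000)]
    print(sample_and_improve(data, 1 << 16), max(data))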
We call this term as the m-reduction term. • \g\ty\: The optimal cost in retrieving the value given an index. This term is referred as the binary search term. As we shall see, the term lgmo is far from optimal. Moreover, we can move this m-reduction term from the binary search term to the BBHT term. We will devote the following sections to techniques that achieve these improvements. We begin by showing how we can reduce m to 0 in l g l g m time using larger samples. 4.3 Budgeted Sampling Our Sample_And_Improve algorithm samples one active element at a time and expectedly reduce m t o | each time. We can also sample k > 1 active elements and update A to the maximum of the k elements. The following lemma shows how effective this reduction is. L e m m a 1 Given a sample of k randomly chosen active elements with replace-ment, setting the threshold to the maximum of this sample will expectedly prune Chapter 4. Quantum Unary Extrema Finding 49 the active set to size no more than j ^ y , where m denotes the original size of the active set. The proof is presented in Appendix A . Recall that the cost of selecting an entry is very sensitive to m. When m = n, each selection is essentially free and we can afford a huge sample of size yfn. When m becomes \/n, each selection costs n i Since our target is bounded by y/n, we should restrict our k to 0(ni) to get this close to optimal. It seems that we must choose our k according to m, which is unknown throughout the course of our algorithm. Fortunately, we can perform the calculation in reverse. Our primary concern is that the selection step should not take more than O(yfn) time as that breaks optimality immediately. As a result, we can budget our selection. Within each iteration, we limit ourselves to a fixed budget 1 of 4\/n and continuously run BBHT to sample elements. The following code samples as much as possible given a budget and a predicate by repeated invocation of BBHT. procedure Budgeted_Sampling(S, F) while B > 0 do t <- BBHT(F) where we limit the number of iterations within BBHT by B i f t ^ NotFound then Add t into T end i f Decrease B by the number of inner iterations the previous BBHT call has used, end while return T end procedure We then figure out the maximum of the sample and update our A to it. This is repeated until we fail to sample anything with our budget of 2\/n. Note that 2i /n is generally more than enough for BBHT to sample even if there was only a single element. Failure to sample anything is thus a good certificate that our 'The stated cost \fn is normalized. Suppose BBHT takes cy ^ to sample an element, we should allow ourselves a budget of Acy/n. T <- 0 Chapter 4. Quantum Unary Extrema Finding 50 A is indeed the maximum. We may redo this a number of times to boost our success probability on demand. Similar to the previous algorithm, we simply keep the success probability equivalent to that of BBHT and would not rerun our sampling too many times. We then return A. The following pseudo code captures the main ideas: procedure Budgeted_Sample_And_Improve(^>, Ao) A <— Ao while True do T <— Budgeted_Sampling(4v/n, Xi > A) if T = 0 then break end i f A <- max(T) end while return A end procedure The runtime of this algorithm is simply the number of iterations we need, multiplied by the time spent within each iteration. We have already limited our sampling step to be O(^fn), which also limits the size of the sample to be 0(y/n). 
We have to find the maximum of our sample T in order to update A, and this can be done through the classical algorithm in time 0 ( | T | +lg | '&' | ) = 0(y/n + lg The work within each iteration is thus 0{^/n + lg The number of iterations can be modelled as how fast m approaches 0. As before, we denote the initial size of the active set (induced by Ao) as mo and divide the execution into several stages. We say that our execution is in stage i if m 0 2 ' + 1 < m < m 0 2 ' . Numerically, we start with stage 0, entering stage 1 only when m drops below y/rriQ and entering stage 2 only when m drops below i \/i/mo- B y solving the equation m,Q < 2, we know that m reaches 2 after stage lglgmo At that point, it takes at most 2 iterations to reduce m to 0. As a result, there are 0(lglgmrj) stages. The following lemma tells us how many iterations we have to perform to proceed from stage i to stage i + 1. Chapter 4. Quantum Unary Extrema Finding 51 L e m m a 2 Suppose we devote a budget of B and run BBHT repeatedly to get a sample. We then update our threshold to the maximum of this sample. The expected pruning factor of this process is O(Byfi^). The proof is presented in Appendix B . Wi th a budget of 4-y/n per iteration, it takes expectedly constant number of iterations to transit from state i to state i + 1. We are also spending 0(y/n + lg | ^ | ) per iteration, giving a total runtime of 0((y/n + lg | ^ | ) lglgmo). T h e o r e m 6 Budgeted_Sample_And_Improve solves the problem of unary maxi-mum finding in expected time 0((y/n + \g \fy |) lg lg mo) with a success probability greater than half. Basically, Budgeted_Sampling offers us a way to sample without knowing m, and the returned set has a high pruning ratio. At later stages, Budgeted_Sampling becomes an ordinary BBHT, useful in both sampling and verification. As a re-sult, our algorithm share the same success probability as a BBHT. On the other hand, with our huge initial pruning, we are able to reduce m to 0 in O(lglgmo) iterations. Once again, mo is bounded by n and this algorithm has an up-per bound of 0((\/n + lg 1^ 1) lg lgn) . Moreover, the success probability of BudgetedJ3ample_And_Improve is inherited from BBHT, which is a constant. 4.4 Geometric Budgeting In our first algorithm, the m reduction term is only attached to the binary search term. In our second algorithm our m reduction term is linked with both the BBHT and the binary search term. Depending on the relative size of y/n and lg | ^ | , one algorithm could be better than the other. Luckily, we can apply tricks and optimize the runtime so that the lg lg mo reduction term is only attached to the binary search term. This gives us a definite improvement over Chapter 4. Quantum Unary Extrema Finding 52 the simple Sample_And_Improve. The main idea is that we can apply the concept of Budgeted_Sampling to prune m substantially in time 0(\/n-\-\g \% \ lglgmo). After that, we can com-plete the algorithm using the original SampleJVnd_Improve, taking advantage of the much reduced m. The whole procedure is given as follows: procedure G e o m e t r i c B u d g e t i n g ^ , Ao) B «- 0{y/n) A ^ A 0 whi le True do T <- Budgeted-Sampling(j5, A)) i f T = 0 then break end i f A *- max(T) end whi le r e t u r n Sample_And_Improve(£?, A) end procedure This procedure is almost identical to Budgeted_Sample_And_Improve except that it decreases B over iterations and perform an additional Sample JVnd.Improve at the end of the procedure. 
In our previous version, our sampling step served both as a sampler and as a verifier. However, since we are decreasing B every iteration, B could be substantially less than √n when T becomes empty. More precisely, we exit the loop in Budgeted_Sample_And_Improve when m drops to 0; in this version, we terminate the loop when m becomes too small and a budget of B is no longer enough to even sample an element. Since λ is then unlikely to be the maximum, we follow up by running an extra Sample_And_Improve to retrieve the maximum. This trailing call establishes our correctness.

We begin our runtime analysis by pointing out that the work needed for each iteration is O(B + lg|𝒰|): we pay B to sample a set of size bounded by B and retrieve its maximum with the classical algorithm. There are again two contributors to this runtime, the sampling (B) and the binary search (lg|𝒰|). We already know that B decreases geometrically (hence the name). Therefore, the total amount of work spent in sampling is the sum of a geometric sequence dominated by the first O(√n) term. This essentially decouples the m-reduction term from the sampling term. However, we still pay the full price of lg|𝒰| in each iteration.

We now trace the reduction of m over the iterations, making extensive use of Lemma 2. To simplify the notation, we scale our budgets and all runtimes by the same constant. Given a budget of B, we expectedly reduce m to (√n/B)·√m. We define m_i to be the expected value of m after i iterations, again starting with m₀. We apply a budget of √(n/2) in the first iteration, which expectedly reduces m to √(2m); as a result, m₁ = √(2m₀). In the next iteration, we apply a budget of √(n/4), giving us m₂ = √(4m₁). At the (i+1)-th iteration, we apply a budget of √(n/2^(i+1)) and reduce m from m_i to m_(i+1) = √(2^(i+1)·m_i).

Note that it is not mathematically rigorous to use the expectation directly in this formulation. There is a hidden distribution over what m could be, and our reduction is simply a constant times the square root function. Luckily, we only need an upper bound for m, and since the square root function is concave, by Jensen's inequality we have c·√(E[X]) ≥ E[c·√X] for any constant c. Our m_i thus serve as good upper bounds. Expanding m_(i+1) = √(2^(i+1)·m_i), we have

m_i = 2^((i−1) + 2^(−i)) · m₀^(2^(−i)) ≤ 2^i · m₀^(2^(−i))

We terminate our loop when we fail to sample anything; in that particular iteration, the value of m_i does not change. Intuitively, the factor m₀^(2^(−i)) reaches a constant once i exceeds lg lg m₀, so our pruning halts at around the (lg lg m₀)-th iteration. More formally, after lg lg m₀ + 2 iterations, our m_i becomes O(lg m₀), while our budget becomes √(n/(4 lg m₀)). By the runtime analysis of BBHT, the next BBHT will then fail to sample anything with probability at least half. As a result, the while loop continues expectedly only a constant number of further times before it stops. The total number of iterations of the while loop is therefore O(lg lg m₀), and the total time of this pruning phase is O(√n + lg|𝒰|·lg lg m₀).

We can then analyse the behaviour of the final Sample_And_Improve, which relies heavily on the final m after the pruning process. It is also highly dependent on how many pruning iterations we have done, and how much budget we had before exiting the loop.
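The doubly-logarithmic halting time of this recurrence is easy to see numerically; the toy loop below iterates m_(i+1) = √(2^(i+1)·m_i) and stops once the budget-side bound 2^i overtakes m_i, mirroring the analysis above.

import math

def pruning_trace(m0):
    m, i = float(m0), 0
    while m > 2.0 ** i:                # pruning is effective while m_i > 2^i
        i += 1
        m = math.sqrt((2 ** i) * m)    # m_(i+1) = sqrt(2^(i+1) * m_i)
    return i

if __name__ == "__main__":
    for e in (16, 64, 256, 1024):
        m0 = 2 ** e                    # lg lg m0 = lg e
        print(e, pruning_trace(m0), math.log2(e))  # iterations ~ lg lg m0 + O(1)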
In order to bound the remaining runtime, we make use of the following lemma:

Lemma 3 Suppose we have failed to sample anything with BBHT running against a budget B. The expected number of good elements is bounded by O(n/B²).

The proof is presented in Appendix B.

Suppose we have done t pruning iterations; our budget before exiting the loop would be √(n/2^t). By Lemma 3, the expected number of remaining larger entries is bounded by O(2^t). The runtime of Sample_And_Improve depends on the logarithm of the number of remaining elements. Again, as the lg function is concave, we have E[√n + lg|𝒰|·lg m] ≤ √n + lg|𝒰|·lg E[m] ≤ √n + lg|𝒰|·(t + const).

Intuitively, since a budget of √(n/2^t) does not guarantee maximality, we have to fix the potential error incurred by the insufficient budget. Sample_And_Improve works by repeatedly selecting elements until we hit a budget of √n and certify maximality with higher confidence. Each iteration within Sample_And_Improve improves our confidence in having the maximum entry. As a result, we will likely spend only about twice as many iterations as we spent on our pruning step. Since the expected number of pruning iterations is bounded by O(lg lg m₀), the expected runtime of the final Sample_And_Improve step is also O(√n + lg|𝒰|·lg lg m₀). The total runtime is thus O(√n + lg|𝒰|·lg lg m₀).

Geometric_Budgeting clearly shares its success probability with Sample_And_Improve. As such, we have the following theorem:

Theorem 7 Geometric_Budgeting solves the problem of unary maximum finding in expected time O(√n + lg|𝒰|·lg lg m₀), with a constant success probability shared with Sample_And_Improve.

As we can see, blending in the idea of a geometric budget isolates the BBHT term from the m-reduction term. This also suggests that if we have a faster m-reduction procedure, we can probably apply the same budgeting idea to isolate the BBHT search term from the m-reduction factor.

4.5 Progressive Approximation

Our previous algorithms perform well with small lg|𝒰|. When lg|𝒰| = O(√n / lg lg n), Geometric_Budgeting solves the problem optimally. Unfortunately, it performs worse than the classical optimal algorithm when lg|𝒰|·lg lg n = ω(n + lg|𝒰|). Looking into our algorithms, we see that some of the binary searches are wasted. For instance, we may not need to perform full binary searches in earlier iterations if their results will be overwritten by later candidates. However, it is also necessary to do some of the binary searches and increase our threshold, since that boosts our probability of finding large elements. To balance the two factors, we seek insight from the maximum root approximation problem.

In the maximum root approximation problem, our universe spans the interval [0, 1) and we work with buckets of size ε. There are 1/ε buckets, and our problem corresponds to finding the bucket in which the maximum root lies. Given any universe, we can also normalize it so that it spans the interval [0, 1), with each value in the universe corresponding to a bucket of size 1/|𝒰| in the interval. A solution to either of the two problems can easily be translated to fit the setting of the other.
We can apply the Budgeted_Sample_And_Improve to solve this problem in 0(y/n lglgn) time. Wi th this approximated candidate, we can carry on and approximate the maximum within an error of ^57 -^ A l -though the error is much smaller than that of the previous iteration, the size of the restricted universe remains the same. We start with an interval of size and our final bucket size is ^57^ • I* suffices to divide our starting interval into 2 ^ buckets, translating to an universe with l g | ^ | = y/n. Solving this approximation problem takes another 0(y/n\g\gn) time. In a total run, we have [ ' S y ^ J + 1 approximation problems to solve and each takes 0(^/nlglgn) time. It seems that the algorithm runs in 0(yfnlglgn + l g | ^ | l g l g n ) time. Even if we start with some estimate with a better mo, our runtime is no better than that of BudgetedJ3ample_And_Improve. Moreover, the correctness of this algorithm depends on all subproblems being correct. The success probability of such procedure would not be constant. Before we deal with the runtime, we should first remedy the error incurred by successive invocations of probabilistic algorithms. The error of this algorithm is huge because once our approximation deviates from the actual maximum, we have no correction mechanism to fix that. Each of our previous algorithms is a self contained box that ends with a failed BBHT, with a low probability of error. However, this algorithm has numerous failed BBHT before it terminates. Should any of them err, our approximate will be wrong. Fortunately, we can also detect such errors. Suppose we start with an interval [/, u) and we subdivide the interval into 2^" buckets. Budgeted_Sample_And_Improve would first call Chapter 4. Quantum Unary Extrema Finding 57 BBHT repeatedly to get samples above I. It is easy to also check if any of the sampled elements are larger than u. If that is the case, it means that at least one of our previous approximations is wrong and we should backtrack. This guarantees that errors are checked and corrected over the execution. Indeed, we keep our promise that whenever we return a value, its maximality is backed by a failed BBHT to sample anything bigger and this certificate is at least half correct. The success probability is ensured. We now present the pseudocode of our algorithm and analyse its runtime afterward. We first describe it from the prespective of the maximum root ap-proximation problem. procedure Progress ive-Approximat ion^, e ) A < - 0 ; 5 < - l ; u < - 5 while 5 > e do while True do / / loop (*) T <— Budgeted-Sampling(4y /n, A)) i f T = 0 then u <- X + 5 break end i f while 3a; £ T s.t. x > u do 5 <— 2^5 II backtrack to a larger delta u <— A + 2^5 If Reset the upper bound end while A <— max(T) with error 6 end while end while re turn A end procedure Equivalently, the pseudocode for the general unary maximum finding prob-lem is: Chapter 4. Quantum Unary Extrema Finding 58 procedure P r o g r e s s i v e - A p p r o x i m a t i o n ^ , Ao) A ^ A 0 ; A « - 2 > 8 l * l ; u < - A whi le A > 1 do whi le True do / / loop (*) T <— Budgeted-Sampling^Vn, G(i,X)) i f T = 0 then w «— A + A break end i f whi le 3x € T s.t. x > u do A 2 ^ A / / backtrack to a larger delta •U <— A + 2 v / " A / / Reset the upper bound end whi le A <— max(T) with error A end whi le end whi le r e t u r n A end procedure We now derive the runtime of this algorithm. We first remark that the extra checks for error do not incur any major penalty in the overall complexity. 
We now derive the runtime of this algorithm. We first remark that the extra checks for error do not incur any major penalty in the overall complexity. Our algorithm tries to solve the problem at Δ = |𝒰|/2^(√n), |𝒰|/2^(2√n), and so on. Our error checking may cause each subproblem with a specific Δ to be run more than once. Moreover, the backtracking may cascade, restarting more than one previous iteration. On the other hand, an instance is only restarted in two situations. Firstly, it is restarted if it did not find the actual maximum (there were larger elements but they went undetected). It is also restarted in a cascading backtrack if, instead of it detecting the error, some later iteration detects the error. Furthermore, these two causes only occur when the final BBHT maximality certificate was wrong, which happens at most half the time. As a result, the expected number of times we have to restart an iteration with a specific Δ is no more than a constant, and we only have to focus on the no-error scenario and multiply the expected runtime by 2.

We have already derived a loose bound of O((√n + lg|𝒰|)·lg lg m₀). Fortunately, we can prove a much tighter bound. The total work done within each iteration of loop (*) is clearly O(√n). In the errorless case, each iteration makes progress by either (i) continuing with a smaller Δ or (ii) getting a larger estimate for λ. We call the former an instance of Type I work and the latter an instance of Type II work. The algorithm can only perform Type I work O(lg|𝒰|/√n) times. Moreover, each instance of Type II work contributes to a reduction on m. As we have discussed before, it takes expectedly O(lg lg m₀) instances of Type II work to reduce m to 1, after which all work is of Type I. This more careful charging and counting scheme shows that the total runtime is O(√n·lg lg m₀ + lg|𝒰|). We have finally isolated the m-reduction term from the binary search term. As the penalty from correcting errors is no more than a constant factor, we have the following:

Theorem 8 Progressive_Approximation solves the problem of unary maximum finding in expected time O(√n·lg lg m₀ + lg|𝒰|) with constant success probability.

A major observation is that the reduction on m is shared across all iterations; as a result, we only pay once for each reduction on m. The same approximation and charging technique can be applied to Sample_And_Improve to yield an algorithm that runs in expected time O(√n·lg m₀ + lg|𝒰|). These algorithms are also asymptotically at least as fast as the classical optimal algorithm in all cases.

In addition, we can use the same idea to prove that the problem of unary maximum finding with lg|𝒰| = √n is almost as hard as the general problem. If we can optimally solve the problem in the general setting, having an algorithm that runs in O(√n + lg|𝒰|) time, we can solve the restricted problem in O(√n) time. On the other hand, if we can solve the restricted problem in optimal O(√n) time, applying progressive approximation yields an algorithm that runs in O(√n + lg|𝒰|) time.

Theorem 9 The optimality condition of the general unary maximum finding problem is the same as that of the restricted problem with lg|𝒰| = √n. An optimal algorithm for one of the cases can be modified to solve the other case optimally.

4.6 Pseudo Large Sampling

Geometric_Budgeting and Progressive_Approximation show us how we can shift the m-reduction term between the BBHT term and the binary search term. As much as we want the m-reduction term to be constant, it is also unintuitive why that could be so.
We will devote this section to our current best result in reducing m quickly. We have extensively used the idea of large samples to accelerate the speed at which we reduce m. We have already shown in Lemma 1 that a sample of size k has a pruning factor of k+1. Literally, it is always better to use huge samples. Yet there are two obstacles. Firstly, finding the maximum of the sample may take too long; after all, it is too expensive to sample the whole set and reduce m in "one" step. Secondly, the sampling step itself may take too long: if our sample consists of more than √n + lg|𝒰| elements, merely writing down the sample would take longer than we are willing to invest.

In Budgeted_Sample_And_Improve, the maximum of the sample is retrieved via the classical optimal algorithm, so the size of the sample is bounded by √n + lg|𝒰|. Luckily, with the Progressive_Approximation procedure, we can afford a sample of size n/(lg lg n)²: finding the maximum of such a sample takes no more than O(√(n/(lg lg n)²)·lg lg n + lg|𝒰|) = O(√n + lg|𝒰|) expected time. In general, suppose we have a unary maximum finding algorithm with running time O(f(n) + lg|𝒰|) and we have a target of O(√n + lg|𝒰|); we can reversely calculate how big our sample T should be with the relation f(|T|) = √n. With a much larger size limit, it seems that the second problem is our only obstacle. Surprisingly, we can perform large sampling in constant time.

We start by assuming that n is a prime number. If n is not a prime number, we can pad our input set with 0s until its size is a prime number. By Bertrand's postulate (also known as Chebyshev's theorem), there is always a prime number between n and 2n; thus the padding will not enlarge the input size by more than a factor of 2. This padding can also be done via a wrapper on our original oracle: the wrapper filters out all larger indices and treats their values as 0. To find out how much we should pad until n becomes a prime number, it suffices to search for a prime number in the range [n, 2n). Verifying primality takes only time polynomial in the logarithm of the input: classically, around O((lg n)^12) time with a deterministic algorithm [18] or O((lg n)^6) with a randomized algorithm [19]; quantumly, it takes around O((lg n)^3) time [4]. The density of primes is approximately 1 in O(lg n). Finding one prime in the specified range with BBHT therefore takes at most time polynomial in lg n. Thus, snapping to a prime number boundary can be done well within our desired √n time limit.
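The padding step needs a prime in [n, 2n). A classical sketch using deterministic Miller–Rabin (valid for 64-bit inputs with the fixed witness set below) stands in for the primality tests cited above; by Bertrand's postulate the scan always terminates within the range.

def is_prime(n):
    """Deterministic Miller-Rabin for n < 3.3e24 with fixed witnesses."""
    if n < 2:
        return False
    witnesses = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in witnesses:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in witnesses:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def pad_to_prime(n):
    """Smallest prime in [n, 2n); it exists by Bertrand's postulate."""
    p = n
    while not is_prime(p):
        p += 1
    return p

if __name__ == "__main__":
    print(pad_to_prime(10 ** 6))   # 1000003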
As discussed, writing down a huge sample may take too long even if each element can be sampled in a constant amount of time. Sampling against a threshold would only make the situation worse: BBHT is already optimal if we want to sample anything against a predicate, but it cannot run arbitrarily fast. As a result, we must first discard the concept of a threshold and focus on simply sampling any element (even one less than our threshold λ) from our original set S. Moreover, we cannot afford to sample items one by one.

The idea for solving this sampling problem comes directly from how we can represent our sample. For instance, we can take the first k elements and make them a sample. However, this sample is not at all random, and we cannot apply Lemma 1: if we are working on a sorted set, the maximum of this sample would correspond to the kth smallest element of the set, which does not prune m by much. Extending this idea, we may try to pick an offset a and mark the next k elements as our sample. Generating such a sample requires a single random integer generation and can thus be done in constant time. Unfortunately, there is no guarantee how often we can get a good sample of this sort.

Taking this idea further, we can generate two integers α ∈ [0, n) and β ∈ [1, n). Our sample of size k would be

T(α, β) = {x_α, x_(α+β mod n), x_(α+2β mod n), …, x_(α+(k−1)β mod n)}

where x_i denotes the ith entry in our input set. With this method, we have n(n−1) distinct samples. They are only a small fraction of all n^k possible random samples of size k, so the randomness of our samples is arguably much smaller. Fortunately, the following lemma shows that this pseudo-random sample generator is effective for all k.

Lemma 4 The probability that T(α, β) is good is at least 1/2, where, by definition, a good pseudo-random sample of size k contains at least one of the largest ⌈n/k⌉ elements.

With probability at least half, setting our threshold λ to the maximum element of a pseudo-random sample T(α, β) of size k would therefore reduce m₀ to no more than ⌈n/k⌉. In other words, we can start with a sample of size n/(lg lg n)² and set λ to its maximum. This takes O(√n + lg|𝒰|) time and reduces m to (lg lg n)² immediately. We may then apply any of our previous algorithms to take advantage of the reduced m.

Fortunately, we can push this bar even further. Notice that the runtime of Progressive_Approximation depends on m (more precisely, on the number of active elements in the sample, which is upper bounded by the number of active elements in the whole input set). When m is reduced, we can afford a larger sample. For instance, a sample of size n/(lg lg lg lg n)² would still run within our desired time O(√n + lg|𝒰|), given that m is at most (lg lg n)². Formally, we first define:

lg^(i) n = n if i = 0;  lg^(i) n = lg(lg^(i−1) n) if i > 0

We denote by lg* n the minimum i such that lg^(i) n ≤ 1. Note that lg* is an extremely slowly growing function and is often treated as a constant in most practical settings. We further define the sequence:

d_i = (lg^(2i) n)²

This sequence d has at most ⌈lg* n / 2⌉ + 1 terms before later entries become undefined. Taking in all our previous results, our new algorithm is as follows:

procedure Large_Sampling(𝒪, λ₀)
  i ← 0; λ ← λ₀
  while d_(i+1) is defined do
    generate α and β
    𝒪′ ← the oracle 𝒪 wrapped to address our pseudo-random sample of size n/d_(i+1)
    with a budget of O(√n + lg|𝒰|), run λ ← Progressive_Approximation(𝒪′, λ)
    if timed out then
      i ← max(i − 1, 0)
    else
      i ← i + 1
    end if
  end while
  return λ
end procedure

The basic idea is that we keep taking large samples and finding their maxima. The first iteration reduces m to (lg^(2) n)². Another iteration brings m down to (lg^(4) n)². After the ith iteration, m becomes (lg^(2i) n)². Therefore, m gradually approaches 0. On the other hand, d_(i+1) ≤ 1 in the last iteration: we then sample the whole input set and find the global maximum. Therefore, the correctness and success probability are inherited.
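The pseudo-random sample never needs to be written down: the wrapper below maps a sample index j to the input index α + jβ mod n in constant time, which is how the wrapped oracle 𝒪′ in Large_Sampling can be realized. The Python names are our own.

import random

def pseudo_random_sample_view(n, seed=None):
    """Constant-time view of T(alpha, beta) = {x_(alpha + j*beta mod n)}."""
    rng = random.Random(seed)
    alpha = rng.randrange(n)       # alpha in [0, n)
    beta = rng.randrange(1, n)     # beta in [1, n); n prime => indices distinct
    def index(j):                  # j-th element of the sample
        return (alpha + j * beta) % n
    return index

def wrap_oracle(oracle, n, seed=None):
    """Wrap a unary oracle so it addresses the sample instead of the input."""
    idx = pseudo_random_sample_view(n, seed)
    return lambda j, threshold: oracle(idx(j), threshold)

if __name__ == "__main__":
    n, k = 1009, 16                # n prime, as arranged by the padding step
    idx = pseudo_random_sample_view(n, seed=1)
    print([idx(j) for j in range(k)])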
However, the probability of getting a bad run is still bounded by a constant probability. As a result, each specific iteration would not restart more than a constant number of times expectedly. This is similar to our discussion in Progressive-Approximation. It suffices to analyse the perfect scenario. The complexity with error will follow with a larger constant. The perfect scenario is now easy to calculate. We will loop until di be-comes undefined, which takes 0(lg* n) iterations. Moreover, each iteration takes 0{y/n+ lg\<fc\) time. The whole algorithm takes 0((y/n + lg \<% |) lg* n) time. Theorem 10 Large-Sampling solves the unary maximum finding problem in 0 ( ( i / n + lg | ^ | ) lg* n) expected time with constant success probability. Note that this runtime is not sensitive to mo because we have no good way to estimate it. Should we be able to estimate mo, we would adjust our sample size to take full advantage of a reduced initial m. 4.7 Variants Large-Sampling shows us how we can reduce m to 0 in 0(lg* n) time. As in Budgeted_Sample_And_Improve, the next natural question is whether we can isolate the m-reduction term from either then BBHT search term (\/n) or the binary search term ( l g | ^ | ) . Fortunately, the answer is affirmative. We will briefly go over the ideas and leave the details for those who are interested. Using the idea of Geometric-Budgeting, we can replace the sequence di with d\ = 22tdj. The time required to find the maximum of the sample corresponding to di would become expectedly 0 ( ^ r + lg I^ D- The total work done before di becomes undefined is thus 0(\/n + \g\^C\\g* n). The last sample is of size Chapter 4. Quantum Unary Extrema Finding 65 2 2 „ . The number of remaining active elements would be expectedly 2 2 1 g We can follow this by a simple Sample_And_Improve which runs in expected time 0(\/n + lg | ^ | lg* n). The total time required for this strategy is thus 0 ( v ^ + l g | ^ | l g * n). Similarly, we can apply La rge - S a m p l i n g in Progress ive-Approximat ion. We overlay the loops from both algorithm. We continue doing our budgeted sampling in Progress ive-Approximat ion while maintaining our di separately. After each budgeted sampling, we perform a large sampling and find the max-imum according to the current d,. If this step runs in time, we advance our pointer on di and revert our di should we time out. This extra pruning step is successful if we get a good large sample and find the maximum in time. More-over, this step suceeds with constant probability. Clearly, this pruning step is more effective than the original budgeted sampling. We could picture the execu-tation of the algorithm as the original Progress ive-Approximat ion algorithm. However, the pruning from La rge - S a m p l i n g may kick in with some constant probability, reseting m to the corresponding di. It takes 0(lg* n) extra pruning before m = 1. After that, the whole procedure would boil down to a simple binary search. The correctness of this approach is guaranteed with the underly-ing Progress ive-Approximat ion algorithm, and the extra speedup is provided by the occasionally successful La rge - S a m p l i n g algorithm. The total runtime of this approach is thus 0(\fnlg* n + lg | ^ | ) . The most intriguing result is probably that we can cascade the idea of Large - S a m p l i n g . We have already developed a way to find the maximum in 0(\/n\g* n + time. That means that we can start with a sample of size ( i g " n ) i i to run in expected time 0(y/n + lg \ %f\). 
This set the number of active elements to (lg* n ) 2 instantly. We can repeat and generate a sample of size ,, (2),",—y-5 and run Progress ive-Approximat ion on it. This should also Chapter 4. Quantum Unary Extrema Finding 66 run in expected time 0(\fh~ + \g\ty\). This process can be repeated until we reduce m to 1 in 0(lg*(lg* n)) steps. As a result, this algorithm would run in 0((\/n + lg \ty\) lg*(lg* n))- As in our previous discussion, one could probably work on tricks to isolate the m-reduction term from either the BBHT term or the binary search term. Although it seems that these ideas can be cascaded further, we choose to stop the mechanical process because: • The constant attached to the runtime is probably too big for it to become practical. On the other hand, lg* n is already an extremely slowly growing function that many regard as a constant in any practical setting. • Each LargejSampling requires a wrapper for the oracle so that the internal algorithm can work on the restricted sample. If we cascade the idea of . Large_Sampling r times, we must at least wrap our oracle r times. As a result, a single call to the oracle would no longer be a constant operation even when each wrapping is just a simple arithmetic transformation. Theorectically, the ability to cascade Large_Sampling hints that our lower bound could be optimal. Although we have not succeeded in getting the desired bound of 0(\/n + lg \ty\), it appears that we can get close to the optimum up to any fixed degree with a fixed number of cascading. Moreover, it appears that our techniques can potentially speed up any unary maximum finding algorithm which focuses on m-reduction. As a result, there may not be any tight upper bound for any algorithm working with m-reduction, but we have not formally studied this claim. Chapter 4. Quantum Unary Extrema Finding 67 BBHT term isolated m-reduction oriented Binary Search term isolated Sample _And_Improve Ofv^+lgWlgn) O^+lg |<ar | lglgn) O U / n + lg|®'I Ig 'n) Geometr ic Budget ing Budget ed_Sainple-A nd„Improve 0((vS + lg |*|) lglgn) Large-Sampling 0 « ^ + l g | # | ) l g * n ) P r o g r e -s s i v e Appro -x imat ion 0 (VH lg lgn + l g | * | ) 0 (VHlg*n + l g | ^ | ) OCtv^ + l g l ^ D l g ' t l g ' n ) ) Figure 4.1: Summary on different Quantum Unary Maximum Finding results 4.8 Other applications 4.8.1 Maximum Root Approximation Maximum root approximation is one of the motivating problems of our research. Our P r o g r e s s i v e - A p p r o x i m a t i o n procedure is also modelled after an approxi-mation problem. As illustrated in the work by Gao et al. [3], the unary maxi-mum finding problem can still be applied to real numbers outside of the range [0,1). We only have to find an appropriate universe with the right bucket, and the results follow. For simplicity, we consider the case when all real numbers are non-negative. We cannot directly apply any unary maximum finding algorithm because the universe is infinite. Fortunately, we can probe a good range. The algorithm presented by Gao et al. is as follows: p r o c e d u r e Upper_Bound(n, G, e) u <— e f o r % 6 { 0 , . . . , n — 1} do w h i l e Xi > u do u <— 2u end w h i l e end f o r r e t u r n u Chapter 4. Quantum Unary Extrema Finding 68 This procedure simply loops through all elements and find the right upper bound that is applicable to each element. We start with the minimum error margin e and work our way up in an exponential fashion. 
Note that such an upper bound always exists and is no more than twice the value of the maximum entry. The runtime of this procedure is thus O(n + lg(x_max/ε)). We can then apply the classical unary maximum finding algorithm to approximate the value of the largest real number in time O(n + lg(x_max/ε)).

We can speed up the process considerably in the quantum setting. Suppose we have already gone through several elements and found a non-trivial upper bound u; we would like to skip all later entries whose values are less than u. In other words, given an upper bound u, we are only interested in elements larger than it. As discussed, BBHT is an ideal tool for this purpose. Combining these ideas gives the following:

procedure Upper_Bound(n, 𝒪, ε)
    u ← ε
    while True do
        t ← BBHT(x_i > u)
        if t = NotFound then
            break
        end if
        while x_t > u do
            u ← 2u
        end while
    end while
    return u

This is yet another bounded-error algorithm whose success probability is tied to that of BBHT. We repeatedly sample entries larger than u and update our upper bound to cover them. As in Sample_And_Improve, with probability no less than half we select the median of all entries greater than u, and updating u to cover this entry also takes care of half of the remaining meaningful entries. By the same discussion as for Sample_And_Improve, the expected total selection cost is bounded by O(√n), while the total work done in increasing u is bounded by O(lg(x_max/ε)). As a result, we can find a good upper bound in expected time O(√n + lg(x_max/ε)). We can then apply any of our quantum unary maximum finding algorithms to retrieve the maximum. If we apply the Large_Sampling algorithm, we approximate the maximum of n real numbers given by a comparison oracle in expected time O(√n lg* n + lg(x_max/ε)) with constant success probability. Note that the probability of error is amplified because we rely on at least two failed BBHT runs as certificates, one for the Upper_Bound procedure and one for our unary maximum finding algorithm. As discussed before, we can detect failures because BBHT is one-sided: we simply restart the affected steps when there were errors, and this translates to no more than a constant multiplier in front of our overall runtime (as with Progressive_Approximation and Large_Sampling).

The same idea can also be applied to real approximation with relative errors. Interested readers are referred to the discussion in the work by Gao et al. [3].
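For intuition, here is a classical Python simulation of this loop. The helper bbht_sim is our stand-in, not BBHT itself: it returns a uniformly random index satisfying the predicate (or None when none exists), which is the behaviour BBHT delivers in O(√(n/m)) oracle queries on a quantum machine:

    import random

    def bbht_sim(xs, pred):
        """Classical stand-in for BBHT: uniformly random index whose
        element satisfies pred, or None when no element does."""
        good = [i for i, x in enumerate(xs) if pred(x)]
        return random.choice(good) if good else None

    def upper_bound_quantum_sim(xs, eps):
        """Doubling search driven by (simulated) BBHT: only elements
        still above the current bound are ever touched."""
        u = eps
        while True:
            i = bbht_sim(xs, lambda x: x > u)
            if i is None:  # a failed BBHT certifies u is an upper bound
                return u
            while xs[i] > u:
                u *= 2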
4.8.2 Quantum K-select

We would also like to point out that some of our ideas are useful in a completely different scenario. The problem of finding the kth largest entry in a given set was discussed by Nayak and Wu [20], who also derived lower bounds for this problem in the binary model. The time required to find the kth largest element is proven to be at least Ω(√(kn)), but they only gave an O(√(kn) lg(kn) lglg(kn)) time algorithm. We now present an optimal approach to the quantum k-select problem:

while True do
    T ← a large pseudo-random sample of size n/(3k)
    p ← max(T)    // using the optimal algorithm in the binary comparison model [7]
    with budget O(√(kn)) run S' ← FindAll(x_i > p)
    if we did not time out and |S'| ∈ [k, 3k] then
        break
    end if
end while
return classical-k-select(k, S')

The correctness of the algorithm is guaranteed by the final call to the classical optimal algorithm. The time for generating the large sample and finding its maximum is O(√n). The budget constraint ensures that each iteration takes no more than O(√(kn)) time. We break once we have collected everything above the sample maximum and the size of the reduced set is close to k. Note that if there were 3k elements above our threshold, selecting all of them with FindAll takes expectedly O(√(kn)) time. The following lemma tells us that this happens with at least constant probability.

Lemma 5 Suppose we generate a pseudo-random sample of size n/(3k) and find its maximum. We then count the number of elements above this maximum. With probability at least 1/6, this count falls in the range [k, 3k].

The proof is given in Appendix C. As a result, we expect to go through a constant number of iterations before we invoke classical-k-select on a reduced set of size O(k). The total run time is thus O(1 + √n + √(kn)) + O(k) = O(√(kn)).
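To make the loop concrete, here is a classical Python sketch (assuming distinct values and k well below n; the uniform sample stands in for the pseudo-random sample, and the list filter stands in for FindAll under its O(√(kn)) budget):

    import random

    def k_select_sketch(xs, k):
        """Classical mock-up of the quantum k-select loop above."""
        n = len(xs)
        while True:
            sample = random.sample(xs, max(1, n // (3 * k)))
            p = max(sample)                   # Durr-Hoyer [7] on a quantum machine
            above = [x for x in xs if x > p]  # FindAll(x_i > p)
            if k <= len(above) <= 3 * k:
                # classical k-select on the O(k) surviving elements
                return sorted(above, reverse=True)[k - 1]

Because above contains every element larger than p, the kth largest of the whole input is exactly the kth largest of above whenever |above| ≥ k, which is precisely what the break condition checks.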
This shows that the technique we have developed for the unary maximum finding problem can also be applied to other areas. We hope that our ideas could bring insight to other researchers as well.

Chapter 5

Conclusions

We have reviewed the problem of extrema finding with only unary predicates. The problem is optimally solved in the classical setting [3], but had not previously been addressed in the quantum setting. Although the speedup is not as immediate as for extrema finding with a binary oracle, quantum computers do provide a speedup over classical ones. We have proved that the lower bound for this problem is Ω(√n + lg|U|) and shown a sequence of upper bounds ranging from O(√n + lg|U| lg n) to O(√n lg* n + lg|U|). Our quantum unary maximum finding algorithms perform asymptotically faster than any classical counterpart unless lg|U| dominates, in which case some of them are as fast as the classical optimal one. Furthermore, we have distilled our results into three fundamental techniques that can be reapplied and cascaded to improve existing bounds.

5.1 Future Directions

We believe that our lower bound is in fact tight, but our approach may not yield an optimal algorithm: all our algorithms are off by a factor induced by the speed at which we prune our active set. One may need an entirely new approach to get a tight upper bound.

In addition, we are interested in how our tools may be applied to other problems. In particular, the large pseudo-random sample generation is efficient and fits well in many scenarios. We hope that some of our techniques can provide insights to other researchers.

Bibliography

[1] Lov K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the 28th Annual ACM Symposium on Theory of Computing (STOC), pages 212-219, 1996.

[2] Michel Boyer, Gilles Brassard, Peter Høyer, and Alain Tapp. Tight bounds on quantum searching. Fortschritte der Physik, 46:493-505, 1998.

[3] Feng Gao, Leonidas J. Guibas, David G. Kirkpatrick, William T. Laaser, and James Saxe. Finding extrema with unary predicates. Algorithmica, 9:591-600, 1993.

[4] P. W. Shor. Algorithms for quantum computation: discrete logarithms and factoring. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science (FOCS), 1994.

[5] H. Buhrman, R. Cleve, R. de Wolf, and Ch. Zalka. Bounds for small-error and zero-error quantum algorithms. In Proceedings of the 40th IEEE Symposium on Foundations of Computer Science (FOCS), pages 358-368, 1999.

[6] Bartholomew Furrow. A panoply of quantum algorithms. Master's thesis, The University of British Columbia, September 2006.

[7] Christoph Dürr and Peter Høyer. A quantum algorithm for finding the minimum. quant-ph/9607014, 1996.

[8] R. Feynman. Simulating physics with computers. International Journal of Theoretical Physics, 21(6/7):467-488, 1982.

[9] D. Deutsch. Quantum theory, the Church-Turing principle and the universal quantum computer. Proceedings of the Royal Society of London Ser. A, A400:97-117, 1985.

[10] E. Bernstein and U. Vazirani. Quantum complexity theory. SIAM Journal on Computing, 26(5):1411-1473, 1997.

[11] W. K. Wootters and W. H. Zurek. A single quantum cannot be cloned. Nature, 299:802-803, 1982.

[12] P. W. Shor. Scheme for reducing decoherence in quantum computer memory. Physical Review A, 52(4):2493-2496, 1995.

[13] C. H. Bennett, E. Bernstein, G. Brassard, and U. Vazirani. Strengths and weaknesses of quantum computing. SIAM Journal on Computing, 26(5):1510-1523, 1997.

[14] Gilles Brassard, Peter Høyer, and Alain Tapp. Quantum counting. Lecture Notes in Computer Science, 1443:820, 1998.

[15] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, 2001.

[16] Robert Beals, Harry Buhrman, Richard Cleve, Michele Mosca, and Ronald de Wolf. Quantum lower bounds by polynomials. Journal of the ACM, 48(4):778-797, July 2001.

[17] Peter Høyer, Jan Neerbek, and Yaoyun Shi. Quantum complexities of ordered searching, sorting, and element distinctness. Algorithmica, 34(4):429-448, 2002.

[18] Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. PRIMES is in P. Annals of Mathematics, 160(2):781-793, 2004.

[19] Michael O. Rabin. Probabilistic algorithm for testing primality. Journal of Number Theory, 12(1):128-138, 1980.

[20] Ashwin Nayak and Felix Wu. The quantum query complexity of approximating the median and related statistics. In Proceedings of the Annual ACM Symposium on Theory of Computing (STOC), pages 384-393, 1999.

Appendix A

Expected Pruning with Large Samples

Recall that our active sets are defined by their corresponding threshold λ and contain all elements greater than λ. The only way we can reduce our active set is by raising our threshold while keeping it feasible. One simple way to achieve this is to sample an active element from the active set and update our threshold to it. By definition, we must have increased our threshold and reduced the active set. We have argued that choosing a random element from the active set as the new threshold value halves the size of the active set in expectation. We have also considered choosing a sample with more than one element. To maximize the pruning, we should always update our threshold to the maximum of the sample. Given a sample of size k, we would like to know how much this process can prune our active set. Equivalently, we want to know how large the maximum of the sample is: if the sample maximum is the (i + 1)th largest element in our original set, updating the threshold to it prunes the active set down to i elements. We are particularly interested in the case where k is large and non-constant. For instance, k is about √m in our Budgeted_Sampling algorithm, where m denotes the current size of the active set.

We assume that the original size of the active set is m, and we are given a sample consisting of k active elements chosen randomly with replacement. This sample can be generated by repeated BBHT as described in previous sections.
We define X to be the random variable denoting the size of the active set after the pruning. Note that unless k = 0, X < m, as any sampled active element increases the threshold and removes itself from the active set. As discussed, X > i means that our sample does not contain any of the largest i + 1 elements. The probability of this happening is ((m - i - 1)/m)^k. Similarly,

    Pr[X = i] = Pr[X > i - 1] - Pr[X > i] = (1/m^k)·[(m - i)^k - (m - i - 1)^k].

As such, we have

    E[X] = Σ_{i=0}^{m-1} i·Pr[X = i]
         = (1/m^k) Σ_{i=0}^{m-1} i·[(m - i)^k - (m - i - 1)^k]
         = (1/m^k) Σ_{i=1}^{m-1} (m - i)^k        (summing by parts; the boundary term vanishes)
         = (1/m^k) Σ_{i=1}^{m-1} i^k.

Furthermore, we know that

    Σ_{i=1}^{m-1} i^k ≤ ∫_0^m x^k dx = m^{k+1}/(k + 1).

Combining, we have

    E[X] ≤ (1/m^k) · m^{k+1}/(k + 1) = m/(k + 1).

We have thus proved Lemma 1 (page 48). When k = 1, this lemma says that we expect to select the median and halve our set. When k = √m, it says that we can reduce m to expectedly √m.
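Lemma 1 is easy to sanity-check numerically. The following throwaway Python experiment (our naming, not the thesis's) estimates E[X] for a few (m, k) pairs and compares it against m/(k + 1):

    import random

    def expected_remaining(m, k, trials=20_000):
        """Empirical E[X]: draw k of m ranks with replacement and count
        how many elements lie strictly above the sampled maximum."""
        total = 0
        for _ in range(trials):
            best = max(random.randrange(m) for _ in range(k))
            total += (m - 1) - best  # elements above the sampled maximum
        return total / trials

    for m, k in [(1000, 1), (1000, 31), (10000, 100)]:
        print(m, k, expected_remaining(m, k), m / (k + 1))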
Appendix B

Expected Pruning with Budgeted_Sampling or Failed BBHT

Budgeted_Sampling is used extensively to generate large samples and prune our active set. In Appendix A, we showed that with a sample of size k we can reduce m to expectedly m/(k + 1), which gives us an idea of how we should tune k. However, we have also discussed that the optimal choice of k depends on m, which is usually unknown. In Budgeted_Sampling, we tackle this by setting a fixed budget B and repeatedly running BBHT until we run out of budget. Intuitively, each selection costs about Θ(√(n/m)), so the size of the sample k should be about Θ(B·√(m/n)). However, the size of the sample has now become a random variable (even when m is known). We have already proved that the expected pruning with a sample of size k is m/(k + 1). Defining K to be the random variable of the returned sample size, we want to bound the expectation of m/(K + 1).

Before we begin, note that E[m/(K + 1)] ≥ m/(E[K] + 1), so there is no nontrivial upper bound on the former in terms of E[K] alone. Furthermore, finding E[K] is itself a hard problem that would require a lot of mathematical work. We now present a very loose upper bound for E[m/(K + 1)] which works with most of our algorithms; this takes less derivation than dealing directly with all the different distributions involved. The idea of the proof is to find another variable K', K' ≤ K, which follows some well-studied distribution. The inequality K' ≤ K can be interpreted in several ways. Suppose we are given the log of a previous Budgeted_Sampling, denoted as an outcome o, and K(o) counts the number of successful BBHT calls; with K' ≤ K we have K'(o) ≤ K(o) for any given outcome o. Equivalently, the probability of getting lower values from K' is higher than from K, so the cumulative distribution function of K' dominates that of K. As a result, we have both E[K'] ≤ E[K] and E[1/(K' + 1)] ≥ E[1/(K + 1)]. Thus, to bound E[m/(K + 1)], it suffices to find an upper bound for E[m/(K' + 1)]. This is easier than dealing with K directly because we take control over the distribution of K'.

First of all, we should investigate how the sampling is done. BBHT works by doing several Grover iterations and measuring in a controlled manner. With a budget of B, the execution is similar to running B Grover iterations, each followed by an optional measurement and a reset step. Some of the measurements succeed and return a valid entry; our random variable K is simply the count of successful measurements over these B iterations.

Suppose the expected runtime of BBHT is c·√(n/m) for some appropriate constant c. We partition the B steps into groups of size 2c·√(n/m) each, simply ignoring the last few iterations if 2c·√(n/m) does not divide B. As a result, we have ⌊(B/2c)·√(m/n)⌋ groups. We call a group successful if there is any good measurement within the period bounded by that group, and we define K* to be the random variable counting the number of successful groups. Clearly, for any execution sequence, K* ≤ K. However, K* is still fairly complicated to analyse. Fortunately, we can think of it as a series of Bernoulli experiments and use a binomial distribution to approximate it. The major obstacle is that adjacent groups are not independent experiments and the success probabilities of the experiments are not identical. However, the success probability is still well bounded by Markov's inequality: the probability of an instance of BBHT spanning more than a single group (of size 2c·√(n/m)) is no more than 1/2. We now define K' to be a random variable following a binomial distribution with N = ⌊(B/2c)·√(m/n)⌋ experiments and success probability 1/2. Since we deliberately decrease the success probability and decouple the groups, we have K' ≤ K* ≤ K.

Next, we find an upper bound for E[1/(K' + 1)]. With this setting, we have

    Pr[K' = i] = C(N, i)·(1/2)^N.

The expectation of 1/(K' + 1) is thus

    E[1/(K' + 1)] = Σ_{i=0}^{N} C(N, i)·(1/2)^N · 1/(i + 1)
                  = (1/(2^N·(N + 1))) Σ_{i=0}^{N} (N + 1)!/((i + 1)!·(N - i)!)
                  = (1/(2^N·(N + 1))) Σ_{i=0}^{N} C(N + 1, i + 1)
                  = (2^{N+1} - 1)/(2^N·(N + 1))
                  ≤ 2/(N + 1).

By transitivity, we conclude that E[m/(K + 1)] ≤ 2m/(N + 1) = O(√(mn)/B). This completes the proof of Lemma 2 (page 50). In most of our contexts, we normalize all runtimes so that the constant c is virtually 1. When B is √n, the expected pruning leaves approximately √m active elements; in general, a budget of B leaves O(√(mn)/B) active elements in expectation.
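The closed form above is easy to verify numerically. This small Python check (ours, not the thesis's) confirms E[1/(K' + 1)] = (2^{N+1} - 1)/(2^N·(N + 1)) ≤ 2/(N + 1) for small N:

    from math import comb

    def expected_inverse(n_trials):
        """Exact E[1/(K'+1)] for K' ~ Binomial(n_trials, 1/2)."""
        return sum(comb(n_trials, i) / 2**n_trials / (i + 1)
                   for i in range(n_trials + 1))

    for n_trials in [1, 2, 5, 10, 20]:
        exact = expected_inverse(n_trials)
        closed = (2**(n_trials + 1) - 1) / (2**n_trials * (n_trials + 1))
        assert abs(exact - closed) < 1e-12 and exact <= 2 / (n_trials + 1)
        print(n_trials, exact, closed)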
It is also interesting to know that the same idea can be applied to understand how much information a failed BBHT provides. Suppose we run a BBHT with budget B and it fails to return anything; we would like to bound the expected number of good indices that remain. In the standard BBHT, running with a budget of Θ(√n) is a certificate of whether there are one or more good elements in our set. Suppose we instead have a budget of √(n/t). We expect this to be effective in returning some good index whenever m, the number of good indices, is more than t. Unfortunately, this intuition is not mathematically rigorous. We shall prove that the expectation of m, given that a BBHT with budget √(n/t) failed to return, is bounded by O(t²). This bound is probably not tight but is sufficiently powerful in many of our scenarios.

To simplify the notation, we assume that B is normalized (c = 1) and that T = 2√n/B is an integer. The case where B > 2√n boils down to the error rate of BBHT and will not be reiterated. Moreover, we let M be the random variable of the number of good indices given that a BBHT of budget B = 2√n/T failed. For any given m, we can partition the budget and go through the same transformation as before to get an approximating binomial distribution for the number of successful BBHT calls. Formally, we construct a random variable K' following a binomial distribution with N = ⌊√m/T⌋ experiments and success probability one half. Since K' is an underestimate, the probability of getting no return with K' is larger than with the original K.

Having a failed BBHT is equivalent to K = 0, which gives us

    Pr[M = m | BBHT with budget B failed] ≤ Pr[K' = 0] = 2^{-⌊√m/T⌋}.

As a result, grouping the values i ∈ [j²T², (j+1)²T²), for which ⌊√i/T⌋ = j, we have

    E[M] = Σ_{i≥0} i·Pr[M = i | failed]
         ≤ Σ_{j≥0} 2^{-j} Σ_{i=j²T²}^{(j+1)²T²-1} i
         ≤ Σ_{j≥0} 2^{-j} · ((j + 1)²T²)²/2
         = (T⁴/2) Σ_{j≥0} (j + 1)⁴/2^j.

As the terms (j + 1)⁴/2^j eventually decrease geometrically (by at least a factor of 2/3 for later j), the sum converges, and E[M] is well bounded by a constant (around 150) times T⁴. Together with our definition of T, we have proved Lemma 3 (page 54). Specifically, suppose we have run a BBHT of budget √(n/2^i) and failed to sample anything; the expected number of good elements left is then bounded by O(2^{2i}).

Appendix C

Properties of Large Pseudo-random Samples

We have shown that a pure random sample of size k has an expected pruning factor of k + 1, reducing m to m/(k + 1) in expectation. Unfortunately, we cannot reuse that theorem when working with large pseudo-random samples: the elements within such samples are neither independent nor truly random.

Suppose we start with a set of size n (n is assumed to be a prime) and we want to generate a sample of size k. We generate our pseudo-random sample by randomly picking an offset α and a nonzero gap β. Denoting x_i as the ith element in our set, our sample T consists of the elements x_{α+jβ (mod n)} where j ∈ [0, k). The size of T is clearly k, and it contains no duplicates. There are a total of n(n - 1) such pseudo-random samples.

Ideally, we want our pseudo-random samples to be as effective as purely random samples. For instance, a pseudo-random sample of size k is good if it contains at least one of the largest ⌈n/k⌉ elements: setting our threshold to the maximum of a good pseudo-random sample prunes m directly to about n/k. Note that since we do not sample against a threshold, m_0 can only be trivially n; this corresponds to a pruning factor of around k. In addition, we want our sampling to be efficient, in that the probability of getting a good sample is at least a constant and the time required to generate one is short.

We now look at some of the properties of our sample generator. Given a sample T(α, β), we denote by t_i the ith element in our sample, namely x_{α+iβ (mod n)}.

Lemma 6 For any given position p ∈ [0, n) and any index i ∈ [0, k), x_p = t_i with probability 1/n.

We observe that given any gap β, we can set the offset α to p - iβ (mod n) in order to make t_i = x_p. We have n - 1 possible values of β, and the corresponding choice of α is unique. As a result, we have exactly n - 1 satisfying samples over the entire space of n(n - 1) samples, so the probability of this event is 1/n.

Lemma 7 For any two given positions p, q ∈ [0, n), p ≠ q, and any two indices i, j ∈ [0, k), i ≠ j, the probability that x_p = t_i and x_q = t_j is 1/(n(n - 1)).

The condition can be translated to the following system modulo n:

    ( 1  i ) (α)   (p)
    ( 1  j ) (β) ≡ (q)    (mod n)

With i ≠ j and n prime, the coefficient matrix has full rank and there is exactly one solution (α, β); since p ≠ q, that solution has β ≠ 0. As a result, this event occurs with probability 1/(n(n - 1)) should we choose our sample randomly from our space.
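Since the thesis leans on this generator repeatedly, here is a minimal Python sketch of it (the function name is ours). Note that the sample costs O(1) arithmetic per element and contains no duplicate indices:

    import random

    def pseudo_random_sample(n, k):
        """Indices alpha + j*beta (mod n) for j in [0, k); n must be
        prime so that a nonzero gap beta yields k distinct indices."""
        alpha = random.randrange(n)
        beta = random.randrange(1, n)  # the gap must be nonzero
        return [(alpha + j * beta) % n for j in range(k)]

    # e.g. a sample of 5 indices out of a universe of size 101
    print(pseudo_random_sample(101, 5))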
The combination of these two lemmas gives rise to Lemma 4 (page 62). We first mark the largest ℓ = ⌈n/k⌉ entries and call their respective positions p_0, p_1, ..., p_{ℓ-1}. A sample T is good if

    E = ⋁_{i∈[0,k), j∈[0,ℓ)} (t_i = x_{p_j})

is true. For some random T, inclusion-exclusion together with Lemmas 6 and 7 gives

    Pr[E] ≥ Σ_{i∈[0,k), j∈[0,ℓ)} Pr[t_i = x_{p_j}] - Σ_{(i,j)≠(i',j')} Pr[t_i = x_{p_j} ∧ t_{i'} = x_{p_{j'}}]
          ≥ kℓ·(1/n) - (kℓ(kℓ - 1)/2)·(1/(n(n - 1)))
          ≥ 1 - 1/2 = 1/2,

using kℓ ≈ n. The probability of getting a good pseudo-random sample from our generator is thus no less than half. As a result, it takes expectedly no more than a constant number of repetitions before we actually hit a good pseudo-random sample.

We can also extend this argument to prove Lemma 5 (page 70). What we want to show is that, with good probability, our sample is also representative: it neither contains only large elements nor completely misses them, in the sense that there is a significant probability that the largest sampled element lies between the kth and the (3k)th largest elements. Taking a pseudo-random sample of size n/(3k) and ℓ = 3k above, we have already shown that with probability at least 1/2 the maximum entry in our sample is among the largest 3k elements. It suffices to prove that the probability of the maximum being among the largest k elements is low. Again, we mark the positions of the largest k elements and ask how often our sample hits them. By Lemma 6, each pair of a sample index and a marked position collides with probability 1/n, so by the union bound the probability of hitting any of the k positions is at most (n/(3k))·k·(1/n) = 1/3. The probability of the maximum entry behaving well is thus no less than 1 - 1/2 - 1/3 = 1/6.
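As a final sanity check, a quick Python experiment (ours) can estimate the probability that an arithmetic-progression sample of size k hits one of the largest ⌈n/k⌉ of n values; by Lemma 4 the estimate should come out at roughly 1/2 or more. By the symmetry of the generator over random (α, β), we may identify the marked positions with the top ranks:

    import random

    def is_good_rate(n, k, trials=2000):
        """Estimate Pr[sample hits one of the largest ceil(n/k) ranks],
        identifying element ranks with positions 0..n-1."""
        top = n - -(-n // k)  # ranks >= top are the largest ceil(n/k)
        hits = 0
        for _ in range(trials):
            alpha = random.randrange(n)
            beta = random.randrange(1, n)
            if any((alpha + j * beta) % n >= top for j in range(k)):
                hits += 1
        return hits / trials

    print(is_good_rate(1009, 30))  # 1009 is prime; expect roughly >= 0.5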
