Mathematical Representation of Empirical Phenomena
A Case Study of 18th-Century Mathematical and Mechanical Concepts

by Serban Dragulin

B.Sc. Computer Science, Polytechnic University of Bucharest, 2006
B.A. Philosophy, University of Bucharest, 2009

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Arts in the Faculty of Graduate Studies (Philosophy)

The University of British Columbia (Vancouver)
December 2011
© Serban Dragulin, 2011

Abstract

I suggest a theory of scientific models which considers that a model is composed of three parts: an idealized system, a theoretical description and a set of mathematical equations. Each component is connected to the previous one by bridge principles, i.e. any assertion, whether justified or simply postulated, which establishes a correspondence between two parts of a model. The goal of my research was to find within 18th century mechanics a set of laws that could be considered mediators between the parts of models. The greater part is dedicated to an analysis of the conceptual developments made by Euler and Lagrange in the 18th century both in mathematics and in mechanics. In the conclusion, I show that the relations between the parts of models are too complex to be expressed in a single, unifying assertion.

Table of Contents

Abstract
Table of Contents
List of Figures
Glossary
1 Introduction
2 A Textbook Approach
    2.1 The Simple Linearized Pendulum
        2.1.1 The Damped Linearized Pendulum
    2.2 The Compound Pendulum
    2.3 The Double Pendulum
    2.4 Breaking Down the Formulas
    2.5 The Structure of a Scientific Model
    2.6 Conclusions
3 The Status of Science in the Late 17th, Early 18th Centuries
    3.1 18th-century Physics
    3.2 The Status of Mathematics Before Euler
        3.2.1 Newton
4 Foundational Research in the 18th Century: Euler
    4.1 Foundations of Calculus
        4.1.1 Eulerian Functions
        4.1.2 Infinitely Small Quantities
        4.1.3 Calculus of Finite Differences (cfd) and Infinitesimal Calculus (ci)
    4.2 Foundations of Mechanics
5 Foundational Research in the 18th Century: Lagrange
    5.1 Lagrange's Criticism of His Predecessors
    5.2 Foundations of Lagrangian Calculus
    5.3 Euler and Lagrange on the Calculus of Variations
        5.3.1 Euler
        5.3.2 Lagrange
    5.4 The Project of Mechanics in Lagrange's Méchanique Analitique
6 Conclusions
Bibliography

List of Figures

Figure 2.1 Simple pendulum
Figure 2.2 Compound pendulum
Figure 2.3 Double pendulum
Figure 3.1 Variable quantities of a curve
Figure 3.2 Curve quadrature in Newton's De quadratura
Figure 4.1 Prop. 14, Ch.2, Mechanica
Figure 5.1 Graphical representation of Proposition III of Euler's Methodus
Figure 5.2 Lagrange's illustration of pvv
Figure 5.3 The reduction of a curve to polygons

Glossary

cfd  Calculus of finite differences
ci   Calculus of infinitesimals
el   Euler-Lagrange equation
pcf  Principle of composition of forces
pda  D'Alembert's Principle
pl   Principle of the lever
pla  Principle of least action
pp   Principle of pulleys
pvv  Principle of virtual velocities
pvw  Principle of virtual work

Chapter 1
Introduction

It has become common philosophical practice to construe models as playing a representational role in science. For instance, Ronald Giere claims that What is special about models is that they are designed so that elements of the model can be identified with features of the real world. This is what makes it possible to use models to represent aspects of the world. [Giere, 2004, p. 747] R.I.G Hughes has a weaker approach, suggesting that if we examine a theoretical model [...] we shall achieve some insight into the kind of representation that it provides. [Hughes, 1997, p.
S329] Daniela Bailer-Jones also supports the representational role of models: the term "representation" has been introduced to capture how scientific models can be about empirical phenomena. [Bailer-Jones, 2003, p. 60] while for Mauricio Suarez the aim of a theory of scientific representation is to lay down the general conditions that such disparate models must meet to carry out a representational function: It does not need to stipulate the conditions for accurate, true, or complete, representation. [Suarez, 2004, p. 767]

Despite its importance for philosophical theories of scientific representation, the concept of model has not yet been clearly defined. In simpler terms, we do not know how to answer the question: "What is a scientific model?" Some properties of models are known and have been thoroughly investigated in the literature. For instance, we know that models can be about classes of empirical phenomena, that they involve processes of abstraction and idealization and that they are developed to explain and predict phenomena. However, summing up these characteristics still will not tell us what a model really is.

The main aim of this paper is to suggest and defend a theory of scientific models as representations. In doing so, I will assume that models have a representational role in science and will try to define models focusing on those properties in virtue of which a model can be considered to represent a class of phenomena.

My suggestion is that a model has three components: an idealized system, a theoretical description of this system and mathematical formalisms. The idealized system is defined by a series of sentences expressed in natural language which identify the components of the system and the relations between them. The system is idealized because there need be no perfect real-world correspondent to it, e.g. there are no frictionless pendulums.
The theoretical description is a set of mathematical equations which express the relations between the theoretical entities of the system, e.g. force, kinetic or potential energy, wave amplitudes, and so on. An example would be the equation $\|\vec{F}_f\| = 0$, which simply states that the force of friction, represented by $\vec{F}_f$, is null. By using theoretical terms to describe the idealized system, this layer connects the model with a particular theory. In consequence, the model will necessarily be associated with a theory. This explains the position of Mary Morgan and Margaret Morrison in [Morgan and Morrison, 1999], which views models as mediators between theories and empirical phenomena. The present view also agrees with [Morgan and Morrison, 1999] in that it considers the process of modelling to be independent of a particular theory. For instance, the model of the simple pendulum is not deduced from Newton's laws of motion, but from the laws of motion in conjunction with a series of independently justified principles and the theoretical description of the idealized system.

Finally, the mathematical layer includes the equations of motion of the system which have been deduced following the procedure outlined above, i.e. by applying mathematical methods and principles to the theoretical description and the laws and principles related to the theory. One example of a mathematical principle, belonging to the calculus of variations, is the Euler-Lagrange equation,

$$\frac{\partial L}{\partial q} = \frac{d}{dt}\frac{\partial L}{\partial \dot{q}},$$

which is used to obtain the equations of motion of a system. Only when $L$ is interpreted as the difference of the kinetic and the potential energy of a system can the Euler-Lagrange equation be applied to mechanics.

In the following chapters I will focus on the relations between these three layers in general, and on the relation between the mathematical and the theoretical layer in particular.
I will not address the other important problems concerning models, that is, "Why and how do models represent phenomena?" but only sketch a possible answer, leaving a more in-depth analysis of this topic for further research.

Considering the definition of scientific model just given, and the aim of the present work, an ideal starting point for investigation will be 18th-century mechanics and calculus. There are four reasons why a historical analysis of the development of 18th-century science is a good way to support the assumption that scientific models have a three-layered structure.

First, 18th-century mathematicians moved away from the geometrical procedures of their predecessors. This is witnessed in Euler's Introductio in analysin infinitorum and Lagrange's Théorie des fonctions analytiques, Leçons sur le calcul des fonctions, and Méchanique Analitique, as well as in the works of d'Alembert and Johann and Jakob Bernoulli. The emphasis on the analytical expression of a functional relation, disconnected from its previous geometrical representation, allowed for a deductive formulation of mechanical principles. In turn, these principles were used to represent and solve various problems in mechanics. Finding an answer to the questions "What justified 18th-century mathematicians and physicists in applying the new mathematical methods to physical problems?" and "How did they proceed in this direction?" would provide a good explanation of the connection between the first two layers of a model (the mathematical and the theoretical).

A second reason why an incursion into the history of 18th-century mechanics is relevant to our present purposes is that during this period we see the emergence of new theoretical concepts, such as force, and a reconsideration and systematization of older ones, in particular of the concept of vis viva.
Thirdly, 18th-century scientists addressed foundational issues of both calculus and mechanics, and in doing so they also discussed the role of idealization and abstraction in the scientific process. The interest in foundational issues introduced new concepts to mathematics (functions, derivative, etc.), as well as a higher degree of rigor to a chaotic mathematical domain.

Finally, before this period there was no clear distinction between calculus, mechanics and geometry, and the great achievement of the 18th century was to separate each theory by introducing fundamental algorithms. The domains of these sciences were separated, e.g. it was known what the object of study of geometry was, but geometry still employed mechanical concepts. Finding algorithms to solve classes of problems was the main motivation for the separation of the calculus from mechanics during the 18th century, and by analyzing the gradual separation of the two domains we will better understand the separation of the mathematical and theoretical layers.

The following chapters are organized following the structure of the presentation above. The first of these chapters presents a definition of a model, following the algorithm used to deduce the equations of motion of the simple and double pendulum. Conceptually, the chapter is divided into two parts. In the first part I present the algorithms for deducing the equations of motion, while in the second part I show that the algorithmic procedure follows the three-layered definition of model.

The second chapter presents the status of science before Euler's Mechanica and Introductio in analysin infinitorum. The aim is to introduce a term of comparison with the 18th-century development of the calculus and mechanics. Here I introduce a taxonomy of the domains of science in the pre-Euler era and discuss the status of experimental practices within these domains.
Moreover, in this chapter I analyze, following [Boyer, 1949], [Kline, 1990] and [Grabiner, 2005], the geometrical nature of the differential calculus and mechanics before the ground-breaking work of Euler and Lagrange. As will be shown, 17th-century mathematicians, Newton and Leibniz in particular, had problems proving that the methods of the differential calculus could meet the standards of rigor of Euclidean geometry. The main problem was to show that by using infinitely small quantities, or differentials, the results of the calculus were not approximations of their rigorous geometrical counterparts. The main objections to the differential calculus were those raised in [Berkeley, 2009], reiterated in [Carnot and Browell, 1832] and by Lagrange in [Lagrange, 1806] and [Lagrange, 1797].

With minor differences, all accounts of the differential calculus before Lagrange's work followed the same pattern. The differential calculus is concerned with the relative rates of change of quantities. In other words, suppose that $y$ is a function of $x$; then the problem is to calculate, for an infinitesimal increment $\omega$ of $x$, the ratio between the corresponding increment of $y$ and $\omega$. For instance, if $y = x^2$, then for $x + \omega$, $y$ will be $x^2 + 2x\omega + \omega^2$. The difference between the value of $y$ for $x$ and its value for $x + \omega$ will be $2x\omega + \omega^2$. The ratio $\frac{dy}{dx}$ will be $2x + \omega$. At this point, the approaches to the differential calculus consider $\omega$ a negligible quantity, $0$, or as tending towards $0$. [Berkeley, 2009] pointed out that if $\omega$ is negligible, then the calculus is not rigorous, for these minor errors could add up in an unpredictable manner so that the end result of the methods of differential calculus will be uncertain. If $\omega = 0$ then $dx = \omega = 0$, so that calculating $\frac{dy}{dx}$ will inevitably require a division by zero. If $\omega \to 0$, then it is not clear whether $\omega$ is an assignable or unassignable quantity.
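The arithmetic behind the objection can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function name `difference_quotient` and the sample values are hypothetical, and exact rational arithmetic is used so that no rounding hides the residue.

```python
from fractions import Fraction


def difference_quotient(x, omega):
    """Ratio of the increment of y = x**2 to the increment omega of x."""
    return ((x + omega) ** 2 - x ** 2) / omega


x = Fraction(3)
for omega in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    ratio = difference_quotient(x, omega)
    # In exact arithmetic the ratio is 2x + omega: the residue omega
    # shrinks with the increment but never vanishes.
    print(omega, ratio, ratio - 2 * x)

# Setting omega = 0 outright does not help: the ratio becomes 0/0.
try:
    difference_quotient(x, Fraction(0))
except ZeroDivisionError:
    print("omega = 0: division by zero")
```

However small the increment is chosen, the computed ratio differs from $2x$ by exactly $\omega$, and the only value of $\omega$ that would remove the residue makes the ratio undefined.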
These issues were partly solved by Lagrange's algebraic approach to the foundations of calculus, as pointed out by Craig Fraser in [Fraser, 1987] and argued in chapter 5.

The third chapter investigates Euler's contributions to the foundations of the calculus and mechanics. An important concept in Euler's work, presented in Introductio in analysin infinitorum, was that of function. To appreciate the magnitude of this breakthrough, I briefly present the history of the concept of function, following [Youschkevitch, 1976], [Boyer, 1949], [Kline, 1990]. The history of the concept of function will be better understood if we assume the distinction drawn by Giovanni Ferraro [Ferraro, 2000] between functional relations and functional expressions. The concept of functional relation is nothing other than the perceived correlation between two or more quantities. However, a functional relation can be expressed as a table of values, as a geometrical curve, or as an analytical formula. Euler's breakthrough was to define functions analytically. A function was no longer just a curve expressing the relation between an abscissa and an ordinate, but a mathematical formula expressing the relation between two or more variables.[1] This was a step forward in separating the calculus from its geometrical origins, but Euler's mathematical methods still relied on infinitely small quantities and as such they were still open to Berkeley's objections.

[1] This will be explained in greater detail in chapters 2 and 3.

Moreover, the application of the differential calculus to Euler's mechanics, as presented in Mechanica, partly depends on an analogy between the mathematical concept of infinitesimal and Euler's infinitely small point of mass. Given the conceptual problems of infinitesimals, using them to represent motion would be unjustified. However, there is more to Euler's mechanics than this. I agree with Dieter Suisky's claim in [Suisky, 2009] that the foundation
of Euler's mechanics is the distinction between internal and external principles, i.e. between rest and motion as essential characteristics of bodies and forces as causes of change. To be more precise, Euler wanted to construct his mechanics on a priori principles. These principles were the essential properties of bodies deduced through a purely rational process. He found that the fundamental property of bodies is impenetrability. From this follow the other three essential properties: extent, mobility and persistence. In this sense, impenetrability is an internal principle of mechanics. The external principle is that of force. For Euler, forces are necessarily external and the only causes of motion, since it is in the nature of bodies to remain at rest. From this distinction Euler deduces Newton's equations of motion. In this sense, Euler's mechanics can be viewed as an attempt to provide an a priori justification of Newtonian mechanics.

The fourth chapter is dedicated to Lagrange's work on the foundations of the calculus and mechanics. As Craig Fraser observes in [Fraser, 1985], [Fraser, 1987], Lagrange's views on the foundations of the differential calculus and, implicitly, of the calculus of variations, can be divided into two periods. The first period begins with Lagrange's letters to Euler in the 1750s and ends with his Méchanique Analitique in 1788. During this period Lagrange assumed the views of his predecessors on the differential calculus and developed the axiomatic mechanics of Méchanique Analitique. In his second period, Lagrange constructed the calculus upon a firm algebraic basis, which did not appeal to infinitesimals or limits. He was also the first to introduce functions as the primary object of the differential calculus, and to replace differentials (i.e. infinitesimals) with derivative functions. Although the Méchanique Analitique belongs to Lagrange's "infinitesimal period", he did not reject its findings.
The reason for this was that, for Lagrange, although the methods using infinitesimals were problematic, they nevertheless yielded accurate results, and this accuracy could be proved by applying methods that did not rely on infinitesimals, i.e. his calculus of derivatives. More importantly, in the preface to the second edition of the Méchanique Analitique, published after his algebraic turn, he points out that he used infinitesimals for ease of calculation, and that the results could be justified by algebraic means. This justifies Helmut Pulte's claim in [Pulte, 1998] that Lagrange was an instrumentalist. The claim is further strengthened by Lagrange's own approach to mechanics. The Méchanique Analitique contains no geometric constructions, but also no philosophical reflections on the nature of space, time, mass, etc. These are not the basic concepts of his mechanics. Instead, he founds statics on a set of principles he considers self-evident and which he translates into mathematical equations, and then reduces dynamics to statics by proving d'Alembert's principle. The fundamental principle of mechanics is the principle of virtual velocities (henceforth pvv), which Lagrange proves in a not entirely satisfactory way, as argued in [Pulte, 1998].

In this chapter I also discuss the deduction of the Euler-Lagrange equation, both in Euler's Methodus Inveniendi [Euler, 1743] and in Lagrange's subsequent letters to Euler. This is a particularly important discussion for two reasons. First, because the Euler-Lagrange equation is a mathematical principle used to derive the equations of motion for any dynamical system and, second, because the Euler-Lagrange equations were the result of Euler and Lagrange's efforts to justify the principle of least action (pla).
This principle was the basis of Lagrange's mechanical thought prior to the Méchanique Analitique, as expressed in his treatise, Application de la méthode exposée dans le mémoir précédent à la solution de diffèrentes problèmes de dynamique (1760) [Lagrange, 1867a].

Chapter 2
A Textbook Approach

This chapter is an analysis of a textbook approach to what can be called the pendulum family of models. This is not intended as a presentation of all models within the family, but only as an exposition of the mathematical tools currently used in dynamical systems theory for some members of this hierarchy of models.

The taxonomy of pendulum models is constructed following two criteria. The first is structural and regards the components of the dynamical system. Thus, the root of the hierarchy is the simple pendulum, composed of a bob attached to a massless rod which, in turn, is connected to a frictionless fulcrum. By connecting a second bob to the first via another rod, we obtain the model of a double pendulum. If we take into consideration the mass of the rod, then the resulting model is that of the compound pendulum. The second criterion is given by the degree of abstraction of environmental factors. Thus, if, for any of the previous models, we consider the friction of the fulcrum, we obtain the damped version of each of the previous models, a total of three different models.

However, it should not be understood that this family of models includes replicas of actual pendulums, i.e. models that take into account every single physical attribute of the real-world system, such as colour, volume, etc. Each model defines what it deems relevant for the questions it wants to answer. Insofar as some properties of the real system do not influence the predicted outcome, they are ignored by the mathematical representation of the system.
There is another constraint on mathematical representation, apart from the aforementioned criterion of relevance, dictated by mathematics itself. Just as one cannot use a pickaxe to bring down the Statue of Liberty, the mathematical apparatus proves incapable of handling all possible scenarios with the same ease and elegance. More precisely, if $\theta$ is large enough that $\sin\theta$ can no longer be approximated by $\theta$, then the resulting differential equations will be non-linear. In order to avoid complications, I will address only those cases in which the relation $\sin\theta \approx \theta$ holds.

2.1 The Simple Linearized Pendulum

The simplest model of a pendulum consists of a point of mass attached to a massless rigid rod which is itself tied to a frictionless fulcrum, moving without air resistance.

[Figure 2.1: Illustration of the simple pendulum]

The first step is to identify and label the components of the system. Thus, the mass is $m$, the length of the rod is $l$, and the angle between the rod and the vertical is $\theta$. $\theta$ is the variable property of the system, whose equation of variation must be found. The other two components are constants.

The second step is to identify the forces acting on the elements of the system. These are the gravitational force $G$ and the tension $T$ in the rod acting on the bob. Using the parallelogram rule, we break up $G$ into a component parallel to the rod ($mg\cos\theta$) and another component perpendicular to the rod ($mg\sin\theta$). Since the bob does not detach from the rod, nor does it move upwards, bending or compressing the rod, we can conclude that $T = mg\cos\theta$. Therefore, the only force responsible for the bob's movement is $-mg\sin\theta$. The minus sign indicates the fact that the force points in the negative direction of the x-axis when $\theta > 0$. Next we calculate the bob's acceleration from Newton's second law of mechanics, $F = ma$.[1] From $-mg\sin\theta = ma$ we find the acceleration $a = -g\sin\theta$.
At the same time, the acceleration is $a = \frac{d^2 s}{dt^2}$, where $s$ is the arc from the lowest point of the bob's trajectory to the point at angle $\theta$. The relation between $s$ and $\theta$ is $s = l\theta$, which means that $a = \frac{d^2 s}{dt^2} = l\frac{d^2\theta}{dt^2} = -g\sin\theta$. This gives the equation of motion:

$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\sin\theta = 0 \tag{2.1}$$

Since we are dealing only with small oscillations, i.e. $\sin\theta \approx \theta$, we can linearize the previous equation thus

$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\theta = 0 \tag{2.2}$$

which has the solution

$$\theta = \theta_0 \sin(\omega t + \phi_0) \tag{2.3}$$

where $\omega = \sqrt{g/l}$ is the angular frequency, $\theta_0$ is the angular amplitude and $\phi_0$ is the initial phase, which fixes the initial angle of the pendulum. Since the fulcrum is frictionless and there is no friction with the air, the pendulum will swing indefinitely with the period $T = 2\pi\sqrt{l/g}$. How the solution of the equation of motion was obtained, i.e. how to find the solution of a second-order linear differential equation, will be dealt with in the next section. For now the focus is on the steps that must be followed in order to find the equations of the main parameters of the system.

[1] In the next chapters it will be shown that this is not Newton's formulation, but an 18th century expression.

2.1.1 The Damped Linearized Pendulum

How do the above equations change if we take into account air friction? For low velocities the drag force is directly proportional to the first power of the velocity. This assumption seems to model realistically the behaviour of a damped pendulum with small angular amplitudes, i.e. when we can approximate $\sin\theta$ with $\theta$. The formula of the drag force is

$$F_D = -2\gamma\frac{d\theta}{dt} \tag{2.4}$$

where $\gamma$ is a constant determined by experiment. Since we are still considering the small-oscillations case, we obtain the equation of motion of the damped pendulum by adding $F_D$ to the linearized version of (2.1):

$$\frac{d^2\theta}{dt^2} + 2\gamma\frac{d\theta}{dt} + \frac{g}{l}\theta = 0 \tag{2.5}$$

(2.5) is the result of the principle of composition of forces.
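The behaviour described by (2.2), (2.3) and (2.5) can be explored numerically. The Python sketch below is a minimal illustration, not part of the textbook procedure: it integrates (2.5) with a classical fourth-order Runge-Kutta scheme, which is a technique swapped in here for checking purposes. The function name `integrate_pendulum` and all numerical values are hypothetical. With $\gamma = 0$ the equation reduces to (2.2), so the numerical result can be compared with the closed form (2.3).

```python
import math


def integrate_pendulum(theta, dtheta, t_end, g=9.81, l=1.0, gamma=0.0, dt=1e-3):
    """Integrate theta'' + 2*gamma*theta' + (g/l)*theta = 0 with RK4.

    Returns the angle and angular velocity at time t_end.
    """
    def accel(th, dth):
        return -2.0 * gamma * dth - (g / l) * th

    steps = max(1, round(t_end / dt))
    h = t_end / steps
    for _ in range(steps):
        k1_th, k1_dth = dtheta, accel(theta, dtheta)
        k2_th = dtheta + 0.5 * h * k1_dth
        k2_dth = accel(theta + 0.5 * h * k1_th, dtheta + 0.5 * h * k1_dth)
        k3_th = dtheta + 0.5 * h * k2_dth
        k3_dth = accel(theta + 0.5 * h * k2_th, dtheta + 0.5 * h * k2_dth)
        k4_th = dtheta + h * k3_dth
        k4_dth = accel(theta + h * k3_th, dtheta + h * k3_dth)
        theta += h * (k1_th + 2 * k2_th + 2 * k3_th + k4_th) / 6.0
        dtheta += h * (k1_dth + 2 * k2_dth + 2 * k3_dth + k4_dth) / 6.0
    return theta, dtheta


g, l = 9.81, 1.0                      # assumed values, SI units
omega = math.sqrt(g / l)              # angular frequency of (2.3)
period = 2.0 * math.pi / omega
theta0, phi0 = 0.1, math.pi / 2       # released from rest at 0.1 rad

# Undamped case: compare the integrator with the closed form (2.3).
t = 1.0
numeric, _ = integrate_pendulum(theta0 * math.sin(phi0),
                                theta0 * omega * math.cos(phi0), t, g, l)
analytic = theta0 * math.sin(omega * t + phi0)
print(numeric, analytic)

# Damped case (2.5): after one undamped period the angle is below theta0.
damped, _ = integrate_pendulum(theta0, 0.0, period, g, l, gamma=0.2)
print(damped)
```

The undamped run should agree with (2.3) to within the integration error, while the damped run returns to a smaller angle after one period, which is the qualitative effect the friction term in (2.5) is meant to capture.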
Since the only force responsible for movement is the tangential component of $\vec{G}$, and the force of friction impedes motion, the movement of the bob will be the result of the composition of these two forces.

Let us solve this equation and find the variation in time of $\theta$. The first step in solving a second-order linear differential equation is to substitute the function with $e^{rx}$ or, in our case, $e^{rt}$, since the derivatives are relative to the variable $t$. The choice of $e^{rx}$ as a provisional solution is not arbitrary. The general form of a second-order linear homogeneous equation with constant coefficients is $a\frac{d^2 y}{dx^2} + b\frac{dy}{dx} + cy = 0$, where $a$, $b$ and $c$ are constants and $a \neq 0$ ($a$ could be 0, but this case does not concern us). The derivative of the function $y = e^{rx}$ is a constant multiple of the function itself, i.e. $\frac{dy}{dx} = re^{rx}$. If we replace $y$ with $e^{rx}$ we get

$$ar^2 e^{rx} + bre^{rx} + ce^{rx} = 0 \tag{2.6}$$

$$e^{rx}(ar^2 + br + c) = 0 \tag{2.7}$$

Since $e^{rx}$ is never 0, it follows that $e^{rx}$ is a solution to our equation only if $r$ is a root of $ar^2 + br + c = 0$. This way, we have reduced our initial second-order differential equation to a simple quadratic equation, usually called the auxiliary or characteristic equation. The next step is to find the roots of this equation and determine the solutions of the initial differential equation. We know that $e^{rx}$ is a solution of the equation, but since the characteristic equation has two roots, the solutions will be $y_1 = e^{r_1 x}$ and $y_2 = e^{r_2 x}$. The general solution of the equation is $y(x) = c_1 y_1(x) + c_2 y_2(x)$.[2]

Returning to the equation of motion of the simple damped pendulum, a similar replacement as above gives us the characteristic equation with its two roots

$$r^2 + 2\gamma r + \frac{g}{l} = 0 \tag{2.8}$$

$$r_1 = \frac{1}{2}\left(-2\gamma + \sqrt{4\gamma^2 - 4g/l}\right) = -\gamma + \sqrt{\gamma^2 - \omega_0^2} \tag{2.9}$$

$$r_2 = -\gamma - \sqrt{\gamma^2 - \omega_0^2} \tag{2.10}$$

where $\omega_0 = \sqrt{g/l}$. Depending on the values of $\gamma$ and $\omega_0$, we can identify three distinct cases. First, when $\gamma^2 - \omega_0^2 < 0$ or $\gamma^2 < \omega_0^2$, the pendulum is said to be underdamped, i.e. there is small friction with elements of the environment.
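The case distinction can be sketched numerically from the characteristic equation $r^2 + 2\gamma r + \omega_0^2 = 0$, with $\omega_0 = \sqrt{g/l}$. The following Python fragment is a hedged illustration: the function names `characteristic_roots` and `regime`, and the sample values of $\gamma$ and $\omega_0$, are assumptions made for the example. It computes both roots with complex arithmetic and names the regime from the sign of $\gamma^2 - \omega_0^2$.

```python
import cmath


def characteristic_roots(gamma, omega0):
    """Roots of r**2 + 2*gamma*r + omega0**2 = 0 (complex in general)."""
    root = cmath.sqrt(gamma ** 2 - omega0 ** 2)
    return -gamma + root, -gamma - root


def regime(gamma, omega0):
    """Name the damping regime from the sign of gamma**2 - omega0**2."""
    d = gamma ** 2 - omega0 ** 2
    if d < 0:
        return "underdamped"
    if d == 0:
        return "critically damped"
    return "overdamped"


omega0 = 2.0  # hypothetical value of sqrt(g/l)
for gamma in (0.5, 2.0, 3.0):
    r1, r2 = characteristic_roots(gamma, omega0)
    # Substituting a root back into the polynomial should give (nearly) zero.
    residual = abs(r1 ** 2 + 2 * gamma * r1 + omega0 ** 2)
    print(gamma, regime(gamma, omega0), r1, r2, residual)
```

In the underdamped case the two roots form a complex-conjugate pair with real part $-\gamma$, in the critical case they coincide, and in the overdamped case both are real and negative.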
In the underdamped case the roots of the characteristic equation are $r_1 = -\gamma + i\sqrt{\omega_0^2 - \gamma^2} = -\gamma + i\omega$ and $r_2 = -\gamma - i\omega$, where we write $\omega^2 = \omega_0^2 - \gamma^2$ for convenience. So the general solution of the equation of motion in this case is

$$\theta(t) = c_1 e^{(-\gamma + i\omega)t} + c_2 e^{(-\gamma - i\omega)t} \tag{2.11}$$

where $c_1$ and $c_2$ are arbitrary.[3] We can further simplify this equation using Euler's formula, $e^{ix} = \cos x + i\sin x$:

$$\theta(t) = c_1 e^{-\gamma t}e^{i\omega t} + c_2 e^{-\gamma t}e^{-i\omega t} \tag{2.12}$$

$$\theta(t) = e^{-\gamma t}\left(c_1 e^{i\omega t} + c_2 e^{-i\omega t}\right) = e^{-\gamma t}\left(c_1(\cos\omega t + i\sin\omega t) + c_2(\cos\omega t - i\sin\omega t)\right) \tag{2.13}$$

$$\theta(t) = e^{-\gamma t}(c_1 + c_2)\cos\omega t + ie^{-\gamma t}(c_1 - c_2)\sin\omega t \tag{2.14}$$

But we are interested in real solutions, mainly because it is hard to interpret a complex one. Since the general solution of the differential equation is a linear sum of linearly independent solutions, it follows that $e^{-\gamma t}\cos\omega t$ and $e^{-\gamma t}\sin\omega t$ are also solutions of the second-order linear equation, and so is any real linear combination of them. Such a combination can be written as $\theta(t) = Ce^{-\gamma t}\cos(\omega t + \phi)$, where $C$ is the initial amplitude and $\phi$ is the initial phase shift.

If $\gamma^2 = \omega_0^2$ then the pendulum's motion is critically damped and $r_1 = r_2 = r = -\gamma$. In this case the mathematics dictates that the general solution is

$$\theta(t) = e^{-\gamma t}[A + Bt] \tag{2.15}$$

where the two constants are, as usual, determined by the initial conditions. Finally, the third possibility is $\gamma^2 > \omega_0^2$, in which case the pendulum has an overdamped motion described by

$$\theta(t) = e^{-\gamma t}\left[Ae^{\sqrt{\gamma^2 - \omega_0^2}\,t} + Be^{-\sqrt{\gamma^2 - \omega_0^2}\,t}\right] \tag{2.16}$$

[2] This is a mathematical theorem, but its proof is beyond our purposes.

[3] The mathematical solution takes these constants to be arbitrary. The physical interpretation, however, is different.

2.2 The Compound Pendulum

The mathematical description of the compound pendulum includes the concept of moment of inertia ($I_p$) about the pivot point. The moment of inertia is generally viewed as the rotational equivalent of the mass. It is expressed in kg·m² and depends both on the mass and the shape of the rotating object.
For instance, a cylinder with mass $m$ and radius $r$ rotating around its axis has the moment of inertia on this axis $I_z = \frac{mr^2}{2}$. On the axes $x$ and $y$, in the plane perpendicular to the cylinder's axis, the moment of inertia is $I_x = I_y = \frac{1}{12}m(3r^2 + h^2)$, where $h$ is the height of the cylinder. A simple circular hoop rotating around the z-axis, perpendicular to its plane and passing through its center, has the moments of inertia $I_z = mr^2$ and $I_x = I_y = mr^2/2$.

[Figure 2.2: Diagram of the compound pendulum]

In general, the moment of inertia of a given object rotating around an axis is the sum of the moments of inertia of every particle of the object, i.e. $I = \sum_{i=1}^{n} m_i r_i^2$. The formula $m_i r_i^2$ entails that particles situated closer to the axis of rotation contribute less to the overall result than those further away. This then captures the idea of mass distribution relative to a certain axis. Obviously, this is not a workable formula; we cannot determine the masses and the number of particles that make up an object. But we have a mathematical workaround. If we assume that the object is divided into infinitely small volumes having elementary mass $dm$, then the moment of inertia is $I = \int r^2\,dm$. This allows us to compute the formulas in the previous examples.

Having clarified the notion of moment of inertia, the equation of motion of the compound pendulum is

$$I\frac{d^2\theta}{dt^2} + mgl\sin\theta = 0 \tag{2.17}$$

where $l$ is the distance from the pivot to the center of mass. For small angular displacements this becomes $I\frac{d^2\theta}{dt^2} + mgl\theta = 0$. The period of the compound pendulum is $T = 2\pi\sqrt{\frac{I}{mgl}}$.

2.3 The Double Pendulum

We start with the simple linearized pendulum and tie a second bob to the first one. There is no friction in the fulcrums and no air friction. Because the system is complex and the motion of each bob depends on the motion of the other, a Newtonian approach would be overly difficult, since the values of the forces acting upon each bob would be interdependent.
If the force generated by $m_1$ and acting on $m_2$ is $F_{12}$ and the force generated by $m_2$ and acting on $m_1$ is $F_{21}$, then, when $m_2$ is at the point of maximum height, $F_{21}$ will have a determinate magnitude $x$. However, when $m_2$ begins to descend, the magnitude of the force $F_{21}$ will increase proportionally with the velocity of $m_2$. Moreover, if $F_{21}$ increases, then so will the velocity of $m_1$. In consequence, the magnitude of $F_{12}$ will also increase and further accelerate $m_2$. Therefore, in order to derive the equations of motion following the same steps as in the case of the simple pendulum, we would have to find the relation between $F_{21}$ and $F_{12}$. Instead, I will use the Lagrangian equation.

Figure 2.3: Diagram of the double pendulum

The coordinates of the two bobs are:

\[ x_1 = l_1\sin\theta_1 \tag{2.18} \]

\[ y_1 = -l_1\cos\theta_1 \tag{2.19} \]

\[ x_2 = x_1 + l_2\sin\theta_2 \tag{2.20} \]

\[ y_2 = y_1 - l_2\cos\theta_2 \tag{2.21} \]

Since the energy of the system is conserved, we can use the following formula

\[ L = K - P \tag{2.22} \]

where $K$ is the total kinetic energy of the system, $P$ is the total potential energy and $L$ is the Lagrangian function of the system. Therefore, we must find the equations of the two energies. The potential energies of the two bobs, as given by the potential equation, are:

\[ P_1 = m_1 g y_1 = -m_1 g l_1\cos\theta_1 \tag{2.23} \]

\[ P_2 = m_2 g y_2 = -m_2 g(l_2\cos\theta_2 + l_1\cos\theta_1) \tag{2.24} \]

The general equation of kinetic energy is $K = \frac{mv^2}{2}$.
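By rewriting $v^2 = \dot{x}^2 + \dot{y}^2$, where $\dot{x}$ and $\dot{y}$ are the derivatives of the two coordinates with respect to time, we find the equation of the velocity:

\[ \dot{x}_1^2 = l_1^2\dot{\theta}_1^2\cos^2\theta_1 \tag{2.25} \]

\[ \dot{y}_1^2 = l_1^2\dot{\theta}_1^2\sin^2\theta_1 \tag{2.26} \]

\[ \dot{x}_2^2 = l_2^2\dot{\theta}_2^2\cos^2\theta_2 + l_1^2\dot{\theta}_1^2\cos^2\theta_1 + 2l_1 l_2\dot{\theta}_1\dot{\theta}_2\cos\theta_1\cos\theta_2 \tag{2.27} \]

\[ \dot{y}_2^2 = l_2^2\dot{\theta}_2^2\sin^2\theta_2 + l_1^2\dot{\theta}_1^2\sin^2\theta_1 + 2l_1 l_2\dot{\theta}_1\dot{\theta}_2\sin\theta_1\sin\theta_2 \tag{2.28} \]

\[ K_1 = \frac{m_1}{2} l_1^2\dot{\theta}_1^2 \tag{2.29} \]

\[ K_2 = \frac{m_2}{2}\left(l_2^2\dot{\theta}_2^2 + l_1^2\dot{\theta}_1^2 + 2l_1 l_2\dot{\theta}_2\dot{\theta}_1\cos(\theta_1 - \theta_2)\right) \tag{2.30} \]

Finally, we can calculate the Lagrangian:

\[ L = (K_1 + K_2) - (P_1 + P_2) \tag{2.31} \]

\[ L = \frac{m_1}{2} l_1^2\dot{\theta}_1^2 + \frac{m_2}{2}\left(l_1^2\dot{\theta}_1^2 + l_2^2\dot{\theta}_2^2 + 2l_1 l_2\dot{\theta}_1\dot{\theta}_2\cos(\theta_1 - \theta_2)\right) + m_1 g l_1\cos\theta_1 + m_2 g l_2\cos\theta_2 + m_2 g l_1\cos\theta_1 \tag{2.32} \]

This result is used in the Euler-Lagrange equation which, for the first bob, has the following form:

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\theta}_1}\right) = \frac{\partial L}{\partial\theta_1} \tag{2.33} \]

The component derivatives are

\[ \frac{\partial L}{\partial\dot{\theta}_1} = m_1 l_1^2\dot{\theta}_1 + m_2 l_1^2\dot{\theta}_1 + m_2 l_1 l_2\dot{\theta}_2\cos(\theta_1 - \theta_2) \tag{2.34} \]

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\theta}_1}\right) = (m_1 + m_2) l_1^2\ddot{\theta}_1 + m_2 l_1 l_2\ddot{\theta}_2\cos(\theta_1 - \theta_2) - m_2 l_1 l_2\dot{\theta}_2\sin(\theta_1 - \theta_2)(\dot{\theta}_1 - \dot{\theta}_2) \tag{2.35} \]

\[ \frac{\partial L}{\partial\theta_1} = -l_1 g(m_1 + m_2)\sin\theta_1 - m_2 l_1 l_2\dot{\theta}_1\dot{\theta}_2\sin(\theta_1 - \theta_2) \tag{2.36} \]

Thus, (2.33) for the first bob becomes

\[ (m_1 + m_2) l_1^2\ddot{\theta}_1 + m_2 l_1 l_2\ddot{\theta}_2\cos(\theta_1 - \theta_2) + m_2 l_1 l_2\dot{\theta}_2^2\sin(\theta_1 - \theta_2) + l_1 g(m_1 + m_2)\sin\theta_1 = 0 \tag{2.37} \]

which, divided by $l_1$, is

\[ (m_1 + m_2) l_1\ddot{\theta}_1 + m_2 l_2\ddot{\theta}_2\cos(\theta_1 - \theta_2) + m_2 l_2\dot{\theta}_2^2\sin(\theta_1 - \theta_2) + g(m_1 + m_2)\sin\theta_1 = 0 \tag{2.38} \]

Through a similar process we get the equation of motion for the second bob:

\[ m_2 l_2\ddot{\theta}_2 + m_2 l_1\ddot{\theta}_1\cos(\theta_1 - \theta_2) - m_2 l_1\dot{\theta}_1^2\sin(\theta_1 - \theta_2) + m_2 g\sin\theta_2 = 0 \tag{2.39} \]

2.4 Breaking Down the Formulas

It can be observed from the previous section that there is a structural similarity between the algorithms used to deduce the equations of motion. Consider the case of the simple linearized pendulum. The first step was to describe the overall structure of the system by individuating its components and the relations between them.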
Thus, we have a simple bob connected to a fixed point by a massless rod. Here we also establish the relations of dependency between the components of the system. If we consider $l$ and $m$ variable, then the angle $\theta$ will be a function of time, of $l$ and $m$.

The next step is to identify the forces acting on the system. Obviously, the first force to be taken into account is gravity, $\vec{G}$. Since the bob's trajectory is an arc, it follows that there is a centripetal force acting on the bob and oriented towards the fulcrum, $\vec{T}$, and a force with the same magnitude as $\vec{T}$ but pulling in the opposite direction. This force is the vertical component of $\vec{G}$, $\vec{G}_y$, and is obtained by applying the principle of decomposition of forces. Since this principle is derived from the principle of composition of forces (henceforth pcf), which is fundamental for Lagrangian statics, I will present the latter:

... a body that is set in uniform motion following two different directions at the same time, necessarily moves along the diagonal of the parallelogram whose sides it would travel separately in virtue of each force acting independently.⁴

The pcf should not be confused with the principle of composition of motion. The latter principle was originally proved by Galileo, but the composition of forces is a 17th-century development. The proof of this principle, as given by d'Alembert in Opuscules, rests on two principles: (i) two forces acting on an object are equivalent to a single force dividing the angle between the first two in two equal angles and (ii) multiplying the two forces with the same factor will yield a proportional resultant force.

⁴ “...un corps qu'on fait mouvoir uniformément suivant deux directions différentes à la fois, parcourt nécessairement la diagonale du parallélogramme dont il eût parcouru séparément les côtés en vertu de chacun des deux mouvements.” [Lagrange, 1853, p. 10]
These two principles are not part of Newtonian mechanics, but independently justified principles that use the theoretical concepts of Newton's theory (i.e. forces).

Returning to the pendulum, we observe that the component of $\vec{G}$ whose direction is the tangent to the arc of the trajectory is the force which makes the bob move. Its magnitude is $mg\sin\theta$, and it is this force which allows us to find the acceleration of the bob, $a = g\sin\theta$. The introduction of forces allows us to rephrase the constraints of the system. Thus, the requirement that the system be frictionless can now be expressed mathematically as $\|\vec{F}_f\| = 0$, where $\vec{F}_f$ is the force of friction acting on various components of the system.

The next step of the deductive process leading to the equation of motion is to use the definition of the acceleration as the second derivative of space relative to time, $a = \frac{d^2 s}{dt^2}$, and the geometrical relation between the angle of displacement, the length of the arc and the length of the rod, $s = l\theta$. By making the appropriate substitutions we obtain the equation of motion. The solutions of the equation are obtained by applying purely mathematical methods and are then reinterpreted physically.

In summary, the algorithm by means of which we obtain the equation of motion of a system follows these general steps:

1. Describe the structure of the system and individuate the elements of the structure.
2. Construct the theoretical description of the system; in this case, individuate the forces.
3. Calculate the theoretical magnitudes (forces, acceleration).
4. Deduce the dynamic, i.e. the equation of motion.

The final result of the first step is a set of sentences in natural language which define an abstract system. I say “define” because even though the simple pendulum resembles an actual physical system, it does so only in an idealized manner. After all, there are no frictionless pendulums and no pendulums with massless rods.
Rather than construing the description given in step (1) as an incomplete description of the structure of a physical system which ignores some features of the actual system, I suggest that this description is a deﬁnite description of a postulated system, an idealized entity embroidered within a story world. More precisely, the description in step (1) is not an idealization or an approximation of a real system, but the deﬁnition of a ﬁction which may resemble the real system. The problem however, is that if the model is about an idealized system, then we have no reason to believe that it would apply to the real thing, in this case, an actual pendulum. I will return to this later on, but for now it suﬃces to point out this fundamental assumption, that the equations of motion describe the behavior of an idealized system. The natural-language description of the system must individuate the relevant properties of the system, i.e. those properties that play a causal role in determining a measurable outcome. There are two kinds of properties: those that remain constant and those that vary in time. It is the variation in time of the latter that we are trying to capture in functional form, although the ﬁnal mathematical expression will also depend on the constant properties. The variable properties, e.g. θ denoted by variables in the mathematical form, may depend on invariable physical properties of the system, e.g. g, or on properties that might vary, e.g. l, m. As we have seen in the previous sections, the solutions to the equations of motion depend on these quantities, although there we considered l and m constant. However, the generality of the mathematical representation of the behavior in time of θ is given by the fact that we can replace l and m by any value. In this way the equation applies to any pendulum. Therefore, in this case mathematical functions represent more than just the variation in time of the angle of displacement. 
They also represent the relation between the “provisionally” constant quantities of the system ($l$, $m$) and those quantities that change in time. It follows that before introducing functions as mathematical representations of relations of interdependency between quantities, we must first have the mathematical concepts of variable and constant. As will be shown in chapter 4, Euler's Introductio in analysin infinitorum respects this order, introducing variables and constants before functions. More importantly, in the preface of Institutiones calculi differentialis he gives the intuitive basis for introducing functions:

In order that this difference between constant quantities and variables might be clearly illustrated, let us consider a shot fired from a cannon with a charge of gunpowder. This example seems to be especially appropriate to clarify this matter. There are many quantities involved here: First, there is the quantity of gunpowder; then, the angle of elevation of the cannon above the horizon; third, the distance travelled by the shot; and, fourth, the length of time the shot is in the air. ... Hence, if we always keep the same quantity of powder, the elevation of the barrel will vary continuously with the distance travelled and the shot's duration of time in the air. In this case, the amount of powder, or the force of the explosion, will be the constant quantity. ... Those quantities that depend on others in this way, namely, those that undergo a change when others change, are called functions of these quantities. This definition applies rather widely and includes all ways in which one quantity can be determined by others. Hence, if x designates the variable quantity, all other quantities that in any way depend on x or are determined by it are called its functions.
[Euler, 2000, Preface] This passage resembles the ﬁrst step of the present algorithm, in that it individuates those quantities that vary and those that remain constant and, more importantly, determines the relations of dependency between them. I will call these relations functional relations to distinguish them from functional expressions. A functional relation is any perceived dependency between two physical quantities. On the other hand, a functional expression is any means of representing the functional relation. In Euler’s case, the functional expression is an analytic formula, but a functional relation can be represented through other means: graphs, tables, geometric constructions, etc. An important thing to observe is that functional relations obtain only between quantities. 23 This is true both for Euler’s deﬁnition of function and the modern one. In Euler’s example, the variable quantity is the amount of gunpowder, while the “quantity of space”, i.e. distance, through which the cannonball passes and the “quantity of time” it spends in the air are functions of the amount of powder. Applying Euler’s reasoning to the case of the simple pendulum, we consider l and m to be constant and the angle θ a function of these two. But l, m and θ are quantiﬁable and in virtue of this they can be represented as variables. As it will be shown in chapter 4, Euler understood variables diﬀerently than we do today, and these diﬀerences are reﬂected in his concept of function as well. But functions as representations of a relation of dependency between quantities are static representations in that they express the value of the dependent quantity for particular values of the variables. Derivatives, on the other hand, are dynamical representations of these dependencies. In more concise terms, the derivative represents the rate at which the primitive function changes relative to the variation of one of its variables. 
The expression “change in time of $\theta$” becomes, in mathematical terms, $\frac{d\theta}{dt}$, which also represents “change at a time” or the “instantaneous angular velocity”. Variations could also be represented as ratios of differences, without introducing the concept of derivatives, e.g. $\frac{f(x_2) - f(x_1)}{x_2 - x_1}$. The accuracy of this representation is given by the value of the difference $x_2 - x_1$. For instance, consider the function $\sin x$ and $x_2 = \pi$, $x_1 = -\pi$. The variation of the function is $\sin\pi - \sin(-\pi) = 0 - 0 = 0$. In consequence, the ratio $\frac{\sin x_2 - \sin x_1}{x_2 - x_1}$ will also be 0. This would lead to the wrong conclusion that on the interval $[-\pi, \pi]$ the function $\sin x$ does not vary at all. A more accurate representation would be obtained by making $x_2$ approach $x_1$ or vice versa. An exact representation would be obtained only if the length of the interval $[x_1, x_2]$ is considered infinitely small. Of course, this assumption is problematic, since a perfectly accurate representation could be obtained only if the length of the interval is 0. However, if this is the case, then there would be no variation of the variable at all. This problem and others pertaining to infinitely small quantities will be treated in greater detail in chapters 3, 4 and 5.

It might be objected that derivatives are functions themselves and so it follows that functions can be used to represent both functional relations and their variations. This is true; however, in the previous paragraph I chose to ignore this aspect in order to distinguish between two represented objects: a relation between quantities and the relative variation of two interdependent quantities. Moreover, from a historical perspective the introduction of derivatives as functions is attributed to the last years of Lagrange's career. Throughout the 18th century, relative variations of functions were represented as ratios of differentials or, which amounts to the same thing, of infinitely small quantities.
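The behaviour of the ratio of differences in the $\sin x$ example above can be checked numerically. What follows is only a minimal sketch in Python: the name `difference_quotient`, the base point $x_1 = 0$ and the three interval lengths are my own illustrative choices, not part of the historical material.

```python
import math

def difference_quotient(f, x1, x2):
    """Ratio of differences (f(x2) - f(x1)) / (x2 - x1)."""
    return (f(x2) - f(x1)) / (x2 - x1)

# Over the whole interval [-pi, pi] the ratio is zero up to rounding error,
# although sin x clearly varies on that interval.
wide = difference_quotient(math.sin, -math.pi, math.pi)

# Shrinking the interval around x1 = 0 brings the ratio closer to the
# derivative of sin at 0, which is cos(0) = 1.
x1 = 0.0
narrow = [difference_quotient(math.sin, x1, x1 + h) for h in (1.0, 0.1, 0.001)]
errors = [abs(q - math.cos(x1)) for q in narrow]
```

The list `errors` shrinks as the interval shrinks, but no finite interval makes it vanish, which is the difficulty the appeal to infinitely small intervals was meant to address.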
As will be argued in chapter 3, reliance on these quantities was problematic, especially in the absence of a formal definition of limits. This problem was solved by Cauchy early in the 19th century, an event which also established the firm foundations of calculus.

Another reason for presenting derivatives as ratios of infinitesimals rather than functions is that the concept of infinitely small quantity was used in the deduction of the equation of motion of the compound pendulum in section 2.2. There we assumed that the oscillating object was divided into infinitely small volumes having an infinitely small mass $dm$. The motivation for introducing this concept was given by the impossibility of calculating the total moment of inertia of the oscillating object, i.e. $I = \sum_{i=1}^{n} m_i r_i^2$. This infinitesimal approach is the reverse of the way in which I introduced ratios of infinitesimals in the previous paragraph. Now, instead of subtracting a value of a function from another, we compute the sum of all values possible for $m$, the mass of a particle, in the discrete set $\{m_1, m_2, \ldots, m_n\}$. Obviously, we cannot find out the value of this sum, since we do not know the values of $m_1$, $m_2$, etc. Instead, we assume that the body is divided into infinitely small volumes and apply the rules of the integral calculus. This method has its roots in Euler's Institutiones calculi differentialis, where he considered the differential and integral calculus special cases of the calculus of finite differences and sums, respectively. To the modern reader, the justification of this method is given by the theory of limits. However, at the time Euler wrote the Institutiones, the notion of limit had not been mathematized and was still intuitive. One question that will be addressed in chapter 4 is whether he was justified in assuming that the differential calculus is a special case of the calculus of finite differences.
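The passage from the finite sum $\sum m_i r_i^2$ to the integral $\int r^2\,dm$ can be illustrated with a small numerical sketch. It assumes a uniform disc cut into concentric rings, each ring treated as a single "particle" at its mid-radius; the function name `disc_inertia_sum` and the numerical values are hypothetical choices made only for illustration. The finite sums are compared with the value $mr^2/2$ quoted in section 2.2 for the cylinder about its axis.

```python
def disc_inertia_sum(mass, radius, n):
    """Finite sum of m_i * r_i**2 over n concentric rings of a uniform disc."""
    dr = radius / n
    total = 0.0
    for i in range(n):
        r_i = (i + 0.5) * dr                      # mid-radius of ring i
        m_i = mass * (2 * r_i * dr) / radius**2   # mass share equals area share
        total += m_i * r_i**2
    return total

mass, radius = 2.0, 0.5            # illustrative values (kg, m), not data
exact = mass * radius**2 / 2       # I_z = m r^2 / 2, the result of the integral
coarse = disc_inertia_sum(mass, radius, 10)
fine = disc_inertia_sum(mass, radius, 1000)
```

With ten rings the sum already lies close to the integral value, and with a thousand rings it lies closer still. The finite sum approaches the integral as the rings become thinner, which is the modern, limit-based reading of Euler's claim.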
Returning to the algorithm for the deduction of the equations of motion it can be observed that, in step (2) we introduced theoretical terms in the description of the described structure. In the case of the simple pendulum this implies individuating the forces acting on the components of the system and calculating their magnitudes. This step describes in theoretical terms the abstract system resulted after (1), by individuating the forces and their magnitudes. Using this description and the laws of the overarching theory we deduce the function representing the dependency on time of the angle of displacement. In formulating the laws of mechanics there are two fundamental mathematical concepts: function and derivative. It is because we deﬁne velocity mathematically as inﬁnitesimal variations of space in inﬁnitesimal variations of time that we can further deﬁne the acceleration as variation of velocity in time. The concept of function and further on that of derivative and partial derivative are central to how we represent the (idealized) system. However, the laws of mechanics and the theoretical description are usually insuﬃcient. We also need boundary conditions that represent the physical constraints of the system. In the case of the simple pendulum the boundary condition is T = mg sin θ. Here the boundary condition does not have any particular inﬂuence on the solution, but in diﬀerent scenarios, the boundary conditions are indispensable to the elimination of variables from the system of equations that results after applying the theoretical laws and principles. The boundary 26 conditions are essentially mathematical equations which fulﬁl a double role. On one hand, they are the mathematical expression of the physical limits of the model. This is the case with T = mg sin θ which, as previously mentioned, expresses the physical requirement that the bob’s trajectory is an arc. 
On the other hand, boundary conditions are mathematical constraints on the functions used to represent the behavior of the system. For instance, in the simple case of a quantum potential well, the boundary conditions are that the ﬁrst derivative of the wave function and the wave function itself are smooth and ﬁnite at the boundary points, i.e. the coordinates of the potential well. This is due to the dependency of Schr¨odinger’s equation on the second derivative of the wave function which must be ﬁnite at all points in order to be intelligible. This is possible only if the ﬁrst derivative of the wave function and the wave function are continuous and ﬁnite at all points. Thus, in this case the boundary conditions are mathematical in nature. The boundary conditions are a distinct part of the theoretical description of the system, although they might not be explicitly stated. In general, the mathematical boundary conditions, such as continuity of some functions, are tacitly assumed and invoked when we have to choose between several solutions or when we have to eliminate some unknown variables. The ﬁnal two steps are less interesting conceptually, as they are just applications of mathematical methods to the equations represented in the previous steps. The equations of motion of the double pendulum are deduced following the same algorithm. We begin by laying down the structural description of the system and labelling its components. The constant properties of the system are l1 , l2 , m1 and m2 . The variable properties whose evolution in time must be represented mathematically are θ1 and θ2 . The theoretical description constructed in the second step makes use of the concepts of kinetic and potential energy. In the case of the simple pendulum this description introduced the concept of force which allowed us to apply Newton’s laws. 
The description given is summed up either by the equation $L = K - P$ or by $H = K + P$, where the values of $K$ and $P$ are calculated in the third step of the algorithm by summing the individual energies of each bob, i.e. $K = K_1 + K_2$, $P = P_1 + P_2$.

Deducing the equations of motion of the double pendulum is not as straightforward as deducing the equations for the simple pendulum. In addition to the usual mathematical methods, the Euler-Lagrange equation was used to find the analytical expression of the variation of the two angles $\theta_1$ and $\theta_2$. The physical significance of the Euler-Lagrange equation will be analyzed in chapter 5. For now, it suffices to say that unlike the principle of the composition/decomposition of forces, which is introduced by the physical theory, the Euler-Lagrange equation (el) is mathematical in nature. Both the principle of composition and the el equation belong to what I will call bridge principles, which connect the three components of a model: the idealized system, the theoretical description and the mathematical representation.

2.5 The Structure of a Scientific Model

The idealized system postulated by the model is described by a set of sentences expressed in natural language. The mathematical layer connects with the theoretical description via mathematical bridge principles. In turn, the theoretical description connects with the idealized system via a second kind of bridge principles, which depend on the theory within which we are working. By bridge principle I understand any assertion, whether justified or simply postulated, which establishes a correspondence between the elements of one layer and those of another, e.g. between functions and theoretical concepts, or between concepts and elements of the idealized system.

The mathematical component is the set of equations and their solutions which represent the evolution in time of the relevant properties of the idealized system.
The relation of representation, however, depends on the other two layers of the model. In the absence of the two other layers the variables and constants of the equations of motion would have no meaning; they would be nothing more than mathematical expressions.

To understand the importance of layers (2) and (3) we must take a closer look at the role of mathematics in general in the proposed account of models. It can easily be observed from the deductions of the equations of motion presented in this chapter that mathematics plays a part both in describing the system from a theoretical point of view at the second layer, and in representing the relevant properties of the system. However, the equations of motion and their solutions are obtained through purely mathematical methods and this does not explain why they represent the system. To find an answer, we should turn to the third step of the algorithm presented in the previous section, i.e. calculating the magnitudes of the forces, because this is the point at which the mathematization of the represented phenomenon occurs. Here, theoretical entities which previously have been labelled, e.g. the force of friction by $\|\vec{F}_f\|$, are represented as mathematical functions, e.g. $f(x, y) = xy$, where $x$ is interpreted as the mass of an object and $y$ is the body's acceleration.

As shown in the previous section, the fundamental concepts of mathematical representation, as identified in the pendulum example, are, in logical order, variables, constants, functions, infinitesimals and ratios of infinitesimals. Therefore, to understand how the equations at the mathematical layer function, we must see in greater detail the relation between these concepts and the entities they represent. To illustrate this relation I will draw upon an analogy with art⁵ and consider these mathematical concepts equivalent to the oils and brushes used to create an artistic representation.

⁵ Whether such an analogy is permitted or not is subject to debate. However, I will follow Bas van Fraassen [Van Fraassen, 2008], who argues that this analogy is permissible and informative.

These concepts are the building blocks of mathematical representations, just as the oils and brushes are prerequisites for constructing an artistic representation. Of course, it should not be understood that the two kinds of representation are created following the same constraints, i.e. that a physicist simply arranges symbols on a sheet of paper just as a painter brushes a canvas.⁶ First and foremost, the symbols must be organized following the syntactical rules of mathematics. Second, the symbols represent particular instantiations of the general concepts.⁷ For instance, $dx$ is the infinitely small increment of a particular variable $x$, and $\sin x$ does not represent the general concept of function, but only an instantiation of it. Symbols represent these instances of mathematical concepts which, in combination, represent empirical phenomena. The representational relation between symbols and instances of concepts is established through an act of fiat, or an “initial baptism” to use Kripke's terminology. As such, whether we use Roman or Greek letters to represent a function is simply a matter of convention. Likewise, whether variables are represented by letters from x to the end of the alphabet or by letters from a on is again a matter of convention, provided of course that we can distinguish between variables, constants and functions.

⁶ There are numerous constraints on the process of constructing a scientific representation. For more details see [Van Fraassen, 2008], [Morgan and Morrison, 1999], [Bueno, 2000]. In the following chapters I will focus only on those imposed by the degree of mathematical advancement.

⁷ The relation between the general concept and such an instantiation can be regarded as the same relation between genus and species.

The instances of the general concepts, not their symbols, are analogous to the brush strokes on canvas. Just as the combinations of oils of different colors make up the features and shades of an idealized representation of Socrates' face in Jacques-Louis David's Death of Socrates, so the particular instances of mathematical concepts⁸ combine to construct a part of the theoretical description. This analogy can be further expanded to include the second layer of the model, i.e. the theoretical description, by considering the theoretical entities and the relations between them analogous to parts of the painting. For instance, the sum of brushstrokes that ultimately represent Socrates in David's painting are, in this example, analogous to $T = mg\sin\theta$, which is a part of the theoretical description of the system. Ultimately, the combination of all individual parts of the description is analogous to the painting, and just as the painting represents a particular event, so does the theoretical description as a whole represent a phenomenon. Moreover, David's painting represents an idealized version of the actual event, not the actual event, just as the first two layers of a model represent an idealized phenomenon.

⁸ I take this expression to refer especially to the aforementioned concepts, i.e. variable, constant, function, ratios of infinitesimals and infinitesimals themselves, although any other notion is acceptable, e.g. matrices, groups, sets, etc.

However, if we say that this system is postulated and that its description is connected via bridge principles to its mathematical representation, then we must still explain how this idealized system connects with its real counterpart. In other words, what makes the model a representation of that particular physical system?
My suggestion is that the relation between the idealized and the real world system is that of iterative correlation.⁹ Iterative correlation is the gradual adjustment of the model's components to “fit the empirical data”, prompted by inconsistencies between the predictions made by the model and the measured quantities of the real world system.

⁹ This is just a brief sketch of the concept of iterative correlation, as the focus of the present chapter is to clarify the relations between model-layers.

Given the previously suggested structure of the model, there are two possible sources of inconsistency. First, it might be the case that the postulated system does not resemble the relevant aspects of the physical system. The criteria of relevance are determined on a case-by-case basis, but there is one characteristic which is essential to all criteria: the structural description of the postulated system must express the dependent quantities as variables and the constant quantities as constants. In other words, the variable properties of the idealized system must match the variable properties of the real world counterpart, and the same goes for the constant ones. For instance, if the real system is a double pendulum, but the model considers that one of the masses is fixed, then the resulting model will not give accurate predictions, for it will not capture the structure of the real system.

Second, the postulated system may correctly identify the constant and the variable properties of the real world system and the dependencies between the latter, but the mathematical layer may not express these dependencies accurately. For instance, the relation between the angle of displacement and the length of the rod of a simple pendulum is $\frac{d^2\theta}{dt^2} + \frac{g}{l}\theta = 0$. Now, if instead of this equation the model suggests that this relation is expressed by $\frac{d^2\theta}{dt^2} + \left(\frac{g}{l}\right)^{-1}\theta = 0$, then the predictions made by the model will not match the measured values.
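The mismatch between the two candidate equations just mentioned can be made concrete with a short sketch. Both equations have the form $\ddot{\theta} + k\theta = 0$, whose solutions oscillate with period $2\pi/\sqrt{k}$, so each candidate predicts a period that can be set against a measurement. The function names, the value of $g$ and above all the "measured" period are assumptions introduced for illustration; the measurement is a placeholder, not experimental data.

```python
import math

G = 9.81  # m/s^2, standard approximate value (assumed)

def period(k):
    """Period of the solutions of theta'' + k * theta = 0."""
    return 2 * math.pi / math.sqrt(k)

def candidate_periods(length):
    """Predicted periods under the two candidate equations from the text."""
    return {
        "g/l": period(G / length),        # the relation stated in the text
        "(g/l)^-1": period(length / G),   # the deliberately wrong variant
    }

def best_match(length, measured_period):
    """Keep the candidate whose prediction lies closest to the measurement."""
    preds = candidate_periods(length)
    return min(preds, key=lambda name: abs(preds[name] - measured_period))

# Hypothetical reading for a 1 m pendulum; a placeholder, not experimental data.
choice = best_match(1.0, 2.0)
```

For a rod of one metre the first equation predicts a period of roughly two seconds and the second a period of nearly twenty, so any plausible measurement discriminates between them. Selecting between candidates in this way is a single step of what I call iterative correlation, not a full account of it.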
Thus, even though the idealized system postulated by the model is isomorphic with the physical system, the mathematical representation of the relations between the different magnitudes does not correspond with reality. In this case, the mathematical representation is modified so that the predicted values match the measured ones. In this sense, iterative correlation is less a definite relation between a model and the real world and more an intentional process of “adjusting” the model to reality.

Although a great deal more could and should be said about the notion of correlation, an in-depth analysis of it is not the purpose of the present work. Instead, the focus will be to clarify the relations between the three components of the model, in particular, the notion of bridge principles mediating between the mathematical layer and the postulated system.

2.6 Conclusions

To recapitulate, the algorithms used to deduce the equations of motion of the simple and double pendulum suggest the following structure of what we call a model:

1. Underlying postulated idealized system (e.g. $m_1$, $m_2$, $l_1$, $l_2$, $\theta_1$, $\theta_2$)
2. Theoretical description of the system
3. Mathematical formalisms

Therefore, the answer to the ontological question “What is a scientific model?” is a triplet composed of an idealized physical system, a theoretical description and a set of mathematical equations representing, mathematically, the behavior of this system. The three components are connected by bridge principles, which I have loosely defined as any assertion, whether justified or simply postulated, which establishes a correspondence between the elements of one layer and those of another. To understand the significance of each layer, we must take a closer look at the concepts specific to each layer and at the ways in which they are connected.
This will be the task of the following chapters.

Chapter 3

The Status of Science in the Late 17th, Early 18th Centuries

This chapter is a brief overview of the status of science before Euler and Lagrange. The goal is to illustrate the difficulties the two had to face and, in doing this, to present a basis of comparison for their achievements. Thus, in the first section of this chapter I will try to distinguish between the various types of mathematical and mechanical inquiries developed in the 18th century. The second section will focus on the development of mechanics in connection with the development of calculus in mathematics and the separation of the latter from geometry.

3.1 18th-century Physics

The scientific landscape of the 18th century bears little resemblance to the clearly delineated domains of present-day scientific inquiry. During this period, physics was not an experimental science and its domain bore little resemblance to the one attributed to it today. Eighteenth-century physics was a part of natural philosophy concerned with the general principles of the nature of matter and change. As such, its domain included topics from chemistry and biology to the human sciences. Moreover, physics was not a mathematized discipline. Mathematics played a more important role in optics, astronomy, harmonics, and any other discipline which lay between mathematics and physics. During the 16th century, these intermediary sciences, together with mechanics, progressed more than physics itself.

Physics became more oriented towards the empirical with the development of academic institutions throughout Europe. Two such institutions were the Académie Royale des Sciences in Paris and the Royal Society in London, both founded in the 1660s. The aim of these new institutions and of those similar to them created throughout Europe was to advance knowledge, unlike universities, whose goal was to pass on accumulated knowledge.
[Home, 2008] distinguishes three groups in 18th-century France whose main concern was physics. First, there were university professors whose main concern was the individuation of first causes and the certainty of this knowledge. Their inquiries were made from an Aristotelian standpoint following a non-mathematical and verbal expository method. The second group comprised les physiciens of the Académie Royale des Sciences. Their aim was to expand knowledge through empirical means backed by a mechanistic theory. Finally, there were the Cartesians, who were allowed to join the Académie in 1699, after being denied access for almost forty years for their “dogmatic” views. Their method was also experimental, and their theory mechanistic; like the members of the first group, their aim was certain knowledge of final causes. Around the same time, in the 1660s, a tradition of public speaking on experimental demonstrations was established by Jacques Rohault, and a similar course was introduced into the University of Paris in the 1690s, headed by Pierre Polinière. The lectures of the course were published in 1709 and represented “the first major infiltration of an experimental outlook into the French educational system” [Home, 2008, p. 356].

Things were somewhat different across the Channel. The meetings of the Royal Society of London were opportunities to demonstrate experiments that previously had been conducted privately and to discuss their consequences. Courses focusing on experiments were introduced at Oxford in 1704 by John Keill and at Cambridge in the following years. James Hodgson and Francis Hauksbee also presented courses on experimental demonstrations, which were a way of advertising Hauksbee’s instruments. By the end of the 18th century, physics had become a largely experimental science, whose main subjects were the constitution of matter, heat theory, electrostatics and optics [Grattan-Guinness, 1990].
What is the place of mechanics in this heterogeneous mix of scientific domains? Before answering this question, we must distinguish, for the sake of clarity, between mechanics, Newtonian mechanics and physics. Following [Grattan-Guinness, 1990] I will consider mechanics to refer to “the study of the rest and motion of bodies (which themselves are construed widely to include point-masses, extended solid and fluid objects, instruments, machines, frameworks, and constructs) under the action of normal physical forces (and so excluding, for example, heat, electrical, and magnetic actions)” [Grattan-Guinness, 1990, p. 315]. The main topics of mechanics thus defined are equilibrium and motion, questions regarding inertia, force, energy, momentum, collisions, vibration, rotation and oscillation. Newtonian mechanics was a branch of mechanics, a tradition in which all investigations have as a basis the law of gravitation and Newton’s three laws of motion. Mechanics in general and, implicitly, Newtonian mechanics were mathematized. This feature distinguishes these two domains from physics, which did not benefit from the contemporary development of mathematical tools. Although it addressed topics on the constitution of matter, just as mechanics did, physics had a nonmathematical approach which set it apart from mechanics.

Within mechanics there are two other traditions: variational and energy mechanics. I will not go into details regarding the history of Newtonian mechanics, as the latter two traditions are more relevant for the subject at hand. Moreover, the interaction with these two traditions gave Newtonian mechanics the form it has today, one example being the modern-day expression of Newton’s second law as $F = ma$, a form popularized by Euler in the eighteenth century.
A second reason for focusing on variational and energy mechanics is that the scientists working within these traditions contributed greatly to the advancement of mathematical techniques and to their applications to mechanics. These techniques were presented as principles and were attributed different roles within mechanics by their creators. The most important of these principles, all of which will be discussed further on in more detail, were d’Alembert’s principle (pda), which reduced dynamics to statics, the principle of virtual velocities (pvv), the principle of virtual work (pvw), the principle of least action (pla), the principle of momentum, the principle of the lever (pl), etc.

An example of the heterogeneity of the 18th-century scientific environment is the introduction of the concept of energy. The development of the concept of energy was mainly the result of pressures from the domain of engineering. Although we encounter in the works of Euler and Lagrange, for instance, formulas that modern-day physics uses to express kinetic energy ($\frac{1}{2}mv^2$) and potential energy (e.g. $mgh$), in the eighteenth century these expressions had a different meaning altogether. For instance, Lagrange took them to be purely instrumental devices, useful for calculations. In fact, the concept of energy emerged only early in the 19th century, while the distinction between kinetic and potential energy was introduced even later by Thomson and Tait. For an 18th-century scientist the familiar formula $\frac{1}{2}mv^2$ would mean, at most, half of the vis viva, or force vive. These mathematical expressions got their meaning from the domain of engineering. In Du Calcul de l’Effet des Machines (1829), Gaspard Coriolis introduced the concept of work as the main magnitude of the theory of moving engines, and he changed the expression of living force from $mv^2$ to the more familiar form of $\frac{1}{2}mv^2$. His definition of work as $\int P\,ds$, where $P$ is the force acting in the direction of the displacement $ds$, had been around, in one form or another, in the works of Louis Navier, John Smeaton, Jean Charles Borda and others, who had been using similar formulas to calculate the productivity of water wheels and steam engines, usually in terms of the amount of coal required to raise a known weight to a fixed height. For instance, for Jean Victor Poncelet, in “Cours de mécanique appliquée aux machines” (1829), force vive is work when the only resistance is the body’s inertia [Grattan-Guinness, 1990].

The disputes between the three traditions of mechanics (Newtonian, variational and energy-based) arose mainly on the matter of the logical priority of the principles one tradition held to be foundational for mechanics. The competition did not regard the validity of the principles themselves, as all traditions supported the same principles, although they justified them differently. Instead, the main question was which principles can be considered sufficient to deduce the principles of the opposing traditions. Of course, some researchers can be considered “mavericks” who cannot be placed within a single tradition. One example is Euler, who supported P.L.A Maupertuis’ principle of least action as being quite general, but who subscribed to other traditions when investigating problems in subdomains of mechanics such as hydrodynamics, machines, etc.

3.2 The Status of Mathematics Before Euler

The status of 18th-century calculus is quite accurately described by Giovanni Ferraro as “a nonrigorous corpus of manipulative techniques, which succeeded in anticipating certain modern results thanks to a series of lucky circumstances and fortuitous cases.” [Ferraro, 2000, p.
107] This series of fortunate events led to the development of new branches of mathematics, such as differential equations, infinite series, differential geometry, calculus of variations, etc., to the introduction of new types of functions (hyperbolic, trigonometric, functions of several variables) and, through Euler and Lagrange, to the separation of the calculus from geometry. During this period, more important than this development of the calculus is the growing interest in its foundations, beginning with its creators, Newton and Leibniz, in the late 17th century and ending with Lagrange’s Leçons.

Although historians of mathematics [Boyer, 1949; Kline, 1990; Ferraro, 2004] seem to agree that the interest in the foundation of calculus in the 18th century was deliberate, on closer inspection this view seems misguided. The reason, given in [Grabiner, 2005], is that Lagrange was the first mathematician of his time to make foundational issues subjects of research in their own right. According to Grabiner, Lagrange’s predecessors touched on this subject for presentation purposes only, not because they viewed the topic of foundations of the calculus as particularly important:

Solving problems was important, not proofs about the concepts used in solving them. The attitude of most of these mathematicians, implicit in their usual choice of problems and methods, may be summed up in a remark attributed to d’Alembert: “Go on, go on; the faith will come to you.” [Grabiner, 2005, p. 23]

There are two reasons [Grabiner, 2005] why foundational issues were addressed in the 18th century prior to Lagrange. The first is that the calculus was a new domain of mathematics, so most works on the subject had to begin with a presentation of the basic concepts. Second, these works were also written with pedagogical intentions: “. . . these expositions were intended as textbooks for a growing readership.” [Grabiner, 2005, p.
23] The audience included both members of the scientific community and the general public. The increasing number of scientific journals and societies was instrumental in organizing and expanding the scientific community, while the success of Newtonian physics led to a growing public interest in science “prompting both mathematicians and philosophers to explain the calculus to laymen” [Grabiner, 2005, p. 24]. It can be said, therefore, that Lagrange’s predecessors touched upon foundational issues only to make their exposition of the subject matter, be it calculus or mechanics, clearer. However, Lagrange recognized the importance of discussing the basic concepts, rather than simply defining them. This is why in 1784, at his suggestion, the Berlin Academy offered a prize for an adequate solution to the problem of foundations of the calculus. The competition was ultimately won by Simon L’Huilier, but the Academy was not satisfied. None of the participants justified the results of the calculus. More important than simply recognizing the importance of foundational research is that Lagrange “derived his existing major results from his foundation” [Grabiner, 2005, p. 37].

Nevertheless, even if 18th-century mathematicians touched upon foundational issues only for presentation purposes, an understanding of the concepts they considered fundamental for the calculus is important for two reasons:

1. The calculus loses its geometrical interpretation, and as a result, it is possible to apply analytical tools to other kinds of problems.
2. The explicit definition of fundamental mathematical concepts and methods made possible their correlation with their mechanical counterparts.
To understand why these points are relevant, it must be understood that 17th-century mathematical analysis was a “corpus of analytical tools (algebraic equations and operations, later the differential and the rules of the calculus) for the study of geometric objects, namely curved lines.” [Bos, 1974] The geometrical quantities of mathematical interest, the conceptual precursors of the concept of variable, were defined relative to points on a curve. The relations between these geometrical variables were expressed by equations, after Descartes’ introduction of algebraic methods to geometry, or by descriptions in prose of the geometrical method of constructing the curve, in those cases in which the introduction of an equation was not possible. These descriptions were complemented by graphical methods which “played a role in the early calculus that would later be filled by the function concept” [Fraser, 2008, p. 309]. The tangent ($\tau$) of a curve at a point equals the length of the subtangent ($\sigma$) at that point. The area under the curve was given by an integral. Calculating the curvature at a point required the determination of the radius of the curvature.

[Figure 3.1: The variable quantities of the mathematical study of geometrical curves. x: abscissa, y: ordinate, s: arclength, r: radius, a: polar arc, σ: subtangent, τ: tangent, ν: normal, Q = QPR: area between curve and X-axis, xy: circumscribed rectangle. Source: [Bos, 1974]]

The first mathematician who emphasized the importance of decoupling analysis from geometry was Euler, although the extent to which he achieved this project is debatable [Kline, 1990; Bell, 1992; Boyer, 1949; Fraser, 2008]. Thus, in Methodus Inveniendi Lineas Curvas, he notes that it is possible to give a purely analytical solution of the problem of maxima and minima, i.e. finding in a class of curves the curve which maximizes or minimizes a geometrical property.
This implied that instead of looking for a curve, the focus shifted towards finding, from a class of equations, an equation in x and y such that the quantity of interest is an extremum. An in-depth analysis of Euler’s contributions to the calculus will be given later. For now it suffices to point out that the scientific progress of the 18th century depended on this departure from geometry, which was, essentially, a process of abstraction enabling the application of mathematical tools to a wider domain of problems.

This brings us to the second point, i.e. the relation between the fundamental mathematical concepts and mechanical quantities. Obviously, the nature of this relation depends on what are considered to be the fundamental concepts of the calculus and of mechanics. As will be shown, this was no easy matter to settle, since during the period under investigation there were almost as many “foundations of the calculus” as there were mathematicians. To illustrate this point, as well as to introduce some of the key terms that will be used in the later chapters, I will give a brief overview of the 17th-century predecessors of Euler and Lagrange.

3.2.1 Newton

The two relevant features of 18th-century calculus, i.e. the separation from geometry and the correlation of mathematical concepts with mechanical ones, are hard to separate. The main reason is that the justification of the foundational concepts and methods of the calculus relied on 17th-century geometrical intuitions. This tendency is present, to a certain extent, in Euler’s work as well, and is definitively removed by Lagrange. The relation between geometry and the foundations of the calculus is clearer in the works of Newton and Leibniz on the fundamental principles of the calculus. The dispute between Newtonians and Leibnizians over the precedence in the invention of the calculus is not relevant to our present concerns.
More important are their different sets of fundamental concepts and the justification given for the methods of the calculus. Therefore, the order in which I will present these should not be interpreted as siding with a particular party in the dispute.

Newton’s View on Mechanics and Geometry

There are three theories that play a role in Newton’s foundations of the calculus:

1. infinitesimal quantities
2. fluxions
3. prime and ultimate ratios.

The nature of the relation between the three is disputable. In [Boyer, 1949] they are viewed as different interpretations of the calculus occurring in Newton’s works and presented as equivalent in the Principia. In [Kitcher, 2007] they are sets of concepts that played a determinate role in Newton’s foundations of the calculus:

... the theory of fluxions yielded the heuristic methods of the calculus. Those methods were to be justified rigorously by the theory of ultimate ratios. The theory of infinitesimals was to abbreviate the rigorous proof, and Newton thought that he had shown the abbreviation to be permissible. Rather than competing for the same position, the three theories were designed for quite distinct tasks. [Kitcher, 2007, p. 34]

[Suisky, 2009] considers that Newton views infinitesimal quantities as the geometrical foundation of the calculus, while fluxions are mechanical representations of the process of generating curves. In spite of these differences, all three views have in common the recognition of the geometrical roots of Newtonian calculus.

In the preface of the Principia Newton notes that the term “mechanics” is an abbreviation of the term “practical mechanics” which denotes the “practice of artificers” and it lacks precision, due to the imperfection of the artificers themselves. Therefore, the difference between mechanics and geometry is given by the complete accuracy of the latter and the imprecision of the former.
However, the root of the distinction is not inherent in the “art of mechanics” since the lack of accuracy is introduced by the one practicing the art. In particular, rational mechanics is a perfectly rigorous abstraction of practical mechanics, which studies the mathematical laws governing natural phenomena. More importantly, for Newton geometry is founded upon (rational) mechanics:

... for the description of right lines and circles, upon which geometry is founded, belongs to mechanics. Geometry does not teach us to draw these lines, but requires them to be drawn. For it requires that the learner should first be taught to describe these accurately, before he enters upon geometry; then it shows how by these operations problems may be solved. To describe right lines and circles are problems, but not geometrical problems. The solution of these problems is required from mechanics, and by geometry the use of them, when so solved, is shown. ... Therefore geometry is founded in mechanical practice, and is nothing but that part of universal Mechanics which accurately proposes and demonstrates the art of measuring. [Newton et al., 1729, Preface]

The objects of geometry are, therefore, the curves and figures constructed by mechanics. These figures are postulated entities which represent the starting point of geometrical calculations. Geometry is about curves, tangents, etc., but the description of these figures is the task of mechanics. This means that in order to solve geometrical problems, e.g. maxima and minima, tangents to a curve, etc., one must know how the figures had been kinematically generated. Thus, rational mechanics is just as precise as geometry, but it precedes geometry, as it is a pre-condition for the existence of geometry.
Newton clearly expresses his intent to provide a general theory of the kinematic generation of curves, based on any natural force:

But since the manual arts are chiefly employed in the moving of bodies, it happens that geometry is commonly referred to their magnitude, and mechanics to their motion. In this sense rational mechanics will be the science of motions resulting from any forces whatsoever, and of the forces required to produce any motions, accurately proposed and demonstrated. [Newton et al., 1729, Preface]

Geometry can be applied to natural phenomena precisely because it has its roots in mechanical practice. Since mechanics depends on the notion of artificer, e.g. technician, God, etc., which introduces a certain amount of inaccuracy, the difference between practical and rational mechanics can also be expressed in terms of perfect and imperfect artificers. Rational mechanics is the science of motion of a perfect artificer, therefore it can provide exact descriptions of curves, thus being able to provide the exactness required for geometrical proofs. The question we must answer now is: how does mechanics describe the curves in an exact way?

The Theory of Fluxions

For Newton, the theory of fluxions would be the right answer to the previous question. Newton’s theory of fluxions reflects the kinematic generation of curves and in his De Methodis Serierum et Fluxionum he makes clear that his method of fluxions is applied to geometrical quantities generated by motion: lines generated by the motion of points, surfaces generated by the motion of lines, etc. The basic concepts of the theory of fluxions are fluents, fluxions and moments:

1. Fluents represent the quantity of the flow and are labelled $x$, $y$, $z$, etc.
2. Fluxions represent the instantaneous velocity of the flow and are labelled $\dot{x}$, $\dot{y}$, $\dot{z}$, etc.
3. Moments of fluents are the infinitely small quantities by which the fluents are incremented in an infinitely small time interval and are labelled $o\dot{x}$, $o\dot{y}$, $o\dot{z}$, etc.
A fluent is incremented by its moment. The dependence on the time element $o$ ensures the continuous generation of any curve, by modelling the generative process on the intuitive continuous flow of time. Based on these concepts, the subject of the fluxionary calculus is twofold:

Now it remains, that for an Illustration of the Analytick Art, I should give some Specimens of Problems, especially such as the nature of Curves will supply. But first it may be observed, that all the difficulties of these may be reduced to these two Problems only, which I shall propose concerning a Space described by local Motion, any how accelerated or retarded. I. The Length of the Space described being continually (that is, at all Times) given; to find the Velocity of the Motion at any Time proposed. II. The Velocity of Motion being continually given; to find the Length of the Space described at any Time proposed. [Newton, 1736, Ch.1 §55-58]

This way, the problems of the “analytical art” (i.e. calculus) are reduced to two fundamental problems:

1. the relation between fluents being given, to calculate the relation between their fluxions
2. the relation between fluxions being given, to calculate the relation between fluents.

As an example of Newton’s solution to the first problem, let us take the general equation

$$x^3 - ax^2 + axy - y^3 = 0 \tag{3.1}$$

which expresses the relation holding between the fluents $x$ and $y$, at all times. The moments $o\dot{x}$ and $o\dot{y}$ are infinitely small time increments of the fluents, so the equation will also hold for $x + o\dot{x}$ and $y + o\dot{y}$, thus obtaining

$$(x + o\dot{x})^3 - a(x + o\dot{x})^2 + a(x + o\dot{x})(y + o\dot{y}) - (y + o\dot{y})^3 = 0 \tag{3.2}$$

By applying the binomial theorem we obtain

$$x^3 + 3o\dot{x}x^2 + 3o^2\dot{x}^2x + o^3\dot{x}^3 - ax^2 - 2axo\dot{x} - ao^2\dot{x}^2 + axy + axo\dot{y} + ayo\dot{x} + ao^2\dot{x}\dot{y} - y^3 - 3o\dot{y}y^2 - 3o^2\dot{y}^2y - o^3\dot{y}^3 = 0 \tag{3.3}$$

After subtracting (3.1) to find the difference, the terms not containing $o$ cancel out, then we divide everything by $o$.
The terms still containing $o$ will be neglected, as $o$ is infinitely small. The end result is

$$3\dot{x}x^2 - 2ax\dot{x} + ax\dot{y} + ay\dot{x} - 3\dot{y}y^2 = 0 \tag{3.4}$$

Thus, starting from the relation between fluents we obtained the relation between fluxions.

Infinitely Small Quantities

In his De analysi per aequationes numero terminorum infinitas, written in 1669 and published in 1711, Newton employed the notion of infinitely small quantities and the binomial theorem to find the relation between the abscissa $x$ and the ordinate $y$ of a curve, given the area under the curve. An application of this is presented in [Boyer, 1949, p. 191]. Let the area be $z = \left(\frac{n}{m+n}\right) a x^{\frac{m+n}{n}}$. If we increase the abscissa by the infinitesimal quantity $o$, the area will be increased by $oy$, the area of a rectangle of width $o$. Therefore, $z + oy = \left(\frac{n}{m+n}\right) a (x + o)^{\frac{m+n}{n}}$. The next steps are identical to the ones in the method of fluxions: apply the binomial theorem, divide by $o$, then ignore the terms containing $o$. The final result, the equation of the curve, will be $y = a x^{\frac{m}{n}}$.

This method resembles the one presented in De methodis, in that it rests on the infinitely small increments. However, both the theory of fluxions and the theory of infinitely small quantities in De analysi presuppose the division by an infinitesimal quantity, then the elimination of the terms containing this quantity. We are thus faced with a dilemma:

1. If the infinitely small quantity $o$ is zero, then we cannot divide the equation by $o$.
2. If $o$ is not zero, then by ignoring the terms containing it, we are approximating. However, geometry requires an exact definition of figures, as observed at the beginning of this section.

Newton on Ultimate Ratios

The problems of the method of fluxions and of infinitely small quantities were avoided in the Principia and the Treatise on Quadrature (1693) by the use of ultimate ratios combined with the method of expansion in infinite series.
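The dilemma described for the method of infinitely small quantities can be illustrated numerically. The following Python sketch uses the area formula of the De analysi example with the illustrative values a = 1, m = 1, n = 2; these values and all function names are assumptions chosen for the example and are not taken from Newton or Boyer. For any nonzero increment o, the ratio of the increment of the area to o only approximates the ordinate, and for o = 0 the ratio is undefined. The approximation improves as o shrinks, which is the behavior that a limit-based reading of ultimate ratios is meant to capture.

```python
def area(x, a=1.0, m=1, n=2):
    """Area under the curve: z = n/(m+n) * a * x**((m+n)/n)."""
    return n / (m + n) * a * x ** ((m + n) / n)


def ordinate(x, a=1.0, m=1, n=2):
    """Ordinate obtained by the method: y = a * x**(m/n)."""
    return a * x ** (m / n)


def increment_ratio(x, o, a=1.0, m=1, n=2):
    """Ratio (z(x + o) - z(x)) / o of the area increment to the increment o."""
    if o == 0:
        # First horn of the dilemma: division by a zero increment is impossible.
        raise ZeroDivisionError("o = 0: the ratio of increments is undefined")
    return (area(x + o, a, m, n) - area(x, a, m, n)) / o


if __name__ == "__main__":
    x = 4.0
    for o in (1.0, 0.1, 0.01, 0.001):
        ratio = increment_ratio(x, o)
        # Second horn: for nonzero o the ratio differs from the ordinate.
        error = abs(ratio - ordinate(x))
        print(f"o = {o:<6} ratio = {ratio:.6f}  error = {error:.6f}")
```

Running the script prints a ratio that approaches the ordinate y as o decreases, while no finite o makes the error vanish.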
In the introduction of the Treatise on Quadrature Newton returns to the theory of fluxions:

Fluxions are very nearly as the Augments of the Fluents generated in equal but very small Particles of Time, and, to speak accurately, they are in the first Ratio of the nascent Augments; but they may be expounded by any Lines which are proportional to them. [Newton and Stewart, 1745, p. 2]

As noted in the previous subsection, the results obtained through the methods of fluxions and infinitely small quantities were obtained by what seemed to be an approximation. Here, by relating fluxions to geometrical concepts, Newton is trying to recover the exactitude required for a proper foundation of the calculus. His proof is also geometrical. Consider Figure 3.2.

[Figure 3.2: Geometrical illustration of the concept of ultimate ratios. Source: [Newton and Stewart, 1745]]

Newton considers that if the areas ABC and ABDG are described by the ordinates BC, BD, then the ratio of their fluxions, i.e. the velocity of variation of areas, will be equal to the ratio of the fluxions of the two ordinates. We move BC to the new position bc, construct the parallelogram BCEb and the tangent VTH at point C which intersects bc and BA in points T and V respectively. Cc, Bb and Ec represent the increment of the curve AC, of the abscissa AB and of the ordinate BC. Now, the sides of the triangle CET “are in the first Ratio of these Augments considered as nascent, therefore the Fluxions of AB, BC and AC are as the Sides CE, ET and CT of that Triangle CET, and may be expounded by these same Sides, or, which is the same thing, by the Sides of the Triangle VBC, which is Similar to the Triangle CET.” [Newton and Stewart, 1745, p. 2] The same relations of proportionality can be obtained by taking the ultimate ratios of the fluxions. By moving the ordinate bc over BC, the points C and c will coincide, as CK will coincide with the tangent CH.
The triangle CEc will be similar to CET, so its sides will stand in the same relation as the sides of CET do, which is the same relation as the one existing between the fluxions of AB, BC and AC. To the modern reader, this procedure resembles passing to the limit of the ratio of two quantities. This interpretation is supported by Newton’s statement in the Principia (Scholium of Lemma XI):

For those ultimate ratios with which quantities vanish, are not truly the ratios of ultimate quantities, but limits towards which the ratios of quantities, decreasing without limit, do always converge; and to which they approach nearer than by any given difference, but never go beyond, nor in effect attain to, till the quantities are diminished in infinitum. [Newton et al., 1729, p. 39]

Although Boyer views the three different foundations of Newtonian calculus as equivalent, the outline above suggests something different. The method of fluxions and the use of infinitely small quantities are inaccurate, therefore they cannot be used by mechanics for the construction of geometrical figures. The reason is that the relation Newton envisaged between rational mechanics and geometry requires that the figures constructed by mechanics be accurately described (see subsection 3.2.1). This problem is avoided by the introduction of ultimate (geometrical) ratios to justify the application of the method of fluxions. Instead of focusing on infinitely small quantities, this “new” method of fluxions considers their ratios taken to the limit. However, this also marks the return to geometry, as these ratios, as we have seen, represent ratios of geometrical, not arithmetical, magnitudes.

Chapter 4
Foundational Research in the 18th Century: Euler

Euler’s first book was the Mechanica sive motus scientia analytice exposita (1736) in which he presented the foundations of mechanics and developed a set of algorithms for solving classes of mechanical problems.
The analytical tools used in the Mechanica were developed and justified (in print) only later on in Euler’s career, in particular in Introductio in analysin infinitorum (1748) and Institutiones calculi differentialis cum eius usu in analysi finitorum ac doctrina serierum (1755). Euler’s work on the foundations of the calculus, presented in Introductio in analysin infinitorum and Institutiones calculi differentialis, had to be developed in parallel to that on the foundations of mechanics, since the analytical solutions to mechanical problems could not be justified in the absence of a solid foundation of the calculus itself. Therefore, securing the legitimacy of the calculus as a reliable mathematical discipline has epistemic precedence over its application to mechanics. This is why I will begin with an analysis of Euler’s work on the foundation of the calculus, then investigate the connection between the fundamental mathematical concepts and their mechanical counterparts.

4.1 Foundations of Calculus

Three ideas lie at the basis of Euler’s calculus:

1. Functions as analytical expressions.
2. The nature of infinitely small quantities.
3. Differential calculus as a particular case of the calculus of finite differences.

We examine each in turn.

4.1.1 Eulerian Functions

There are several modern definitions of functions. The following are representative:

At the present time the word “function” is used broadly to mean any determinate correspondence between two classes of objects. [Taylor, 1983, p. 2]

A function F is a set of ordered pairs (x, y), no two of which have the same first member. That is, if (x, y) ∈ F and (x, z) ∈ F then y = z. [Apostol, 1981, p. 34]

Although the definitions differ in form, essentially they say the same thing: $f$ is a function defined on the set $A$ with values in the set $B$, $f : A \to B$, iff for every element $x \in A$ there is a unique value $f(x) \in B$. This means that it is impossible to have, for one value $x \in A$, both $f(x) = y$ and $f(x) = z$ with $y, z \in B$ and $y \neq z$.
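For a finite set of ordered pairs, the uniqueness condition in the definition quoted from Apostol can be checked mechanically. The following Python sketch is only an illustration under that finiteness assumption; the names `is_function` and `is_function_on` are hypothetical and do not come from any of the cited texts.

```python
def is_function(pairs):
    """Return True if no two ordered pairs share a first member
    while having different second members."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # (x, y) and (x, z) with y != z: not a function
        seen[x] = y
    return True


def is_function_on(pairs, domain, codomain):
    """Check f : A -> B for finite sets: every x in A has exactly one
    value, and that value lies in B."""
    pairs = set(pairs)
    if not is_function(pairs):
        return False
    first_members = {x for x, _ in pairs}
    return first_members == set(domain) and all(y in codomain for _, y in pairs)


if __name__ == "__main__":
    print(is_function({(0, 1), (1, 3), (2, 5)}))  # True
    print(is_function({(0, 1), (0, 2)}))          # False
```

The second call fails the test because the first member 0 is paired with two different values, which is the situation the definition rules out.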
The correspondence between $x$ and $f(x)$ is given in the form of an analytical expression, e.g. $f(x) = \sin x + e^x$.

Euler’s definition of function in Introductio in analysin infinitorum is very different from the one just presented, but at that time it was a giant leap forward for mathematics. To be precise, the concept of dependency between quantities, which we call functional relation,[1] had been present in human scientific thought since Babylonian times, but the ways in which this dependency was represented, the functional expression, varied throughout history.[2] [Youschkevitch, 1976] identifies three periods in the evolution of the concept of functional relation:

1. Antiquity: when the notion of variable quantity did not exist, so the study of the functional relation focused only on particular instances of interdependency between measured quantities; during this period the function was usually expressed as a table of particular values.
2. The Middle Ages (14th-16th century): when variable quantities are expressed in geometrical and mechanical terms, but the functional relation is expressed verbally, or graphically.
3. The Modern Period (16th-18th century): when the introduction of analytical expressions of functional relations and the method of expressing functions as sums of infinite power series became common.

The astronomical observations of the Egyptians and the Babylonians were represented as tables of values, which “grasped” the idea of dependence between two or more quantities, but could not do so in a general manner. The concept of variable, fundamental for an abstract representation of functional relations, was lacking, and as a result the relations between the interdependent quantities could only be expressed for particular cases.

[1] The adjective “functional” is introduced to distinguish between correlation and causation. If two quantities are correlated, then the variation of one of them does not necessarily result in a variation of the other. Only quantities whose variation necessarily entails the variation of other quantities can be said to stand in a functional relation with the latter quantities. Mathematically, a relation is defined as a set of ordered pairs, but not all sets of ordered pairs can be defined as functions (see [Apostol, 1981, p. 34]).

[2] The distinction between a functional relation and the way in which this relation is expressed is drawn in [Youschkevitch, 1976] and applied to Euler’s notion of function in [Ferraro, 2000].

An example of a tabular representation of a function can be found in Ptolemy’s Almagest, which contains “numerous astronomical tables of other quantities, equivalent to rational functions and, also, the simplest irrational functions of the sine” [Youschkevitch, 1976, p. 40]. The graphical representation of functional relations took the form of verbal descriptions of methods for constructing curves based on given values of curve parameters, e.g. abscissa, ordinate, subtangent, tangent, normal, etc. For an in-depth analysis of these two periods see Bell [1992], Boyer [1949], Kline [1990] and Youschkevitch [1976].

More relevant to our present purposes is the last period. Here, through the efforts of Euler and Lagrange, but also of Johann, Jakob and Daniel Bernoulli, the concept of functional relation becomes intimately related to that of analytical expression. In the Introductio in analysin infinitorum Euler begins by defining variables and constants:

A constant quantity is a determined quantity which always keeps the same value. [Euler, 1990, §1]

A variable quantity is one which is not determined or is universal, which can take on any value. [Euler, 1990, §2]

But what exactly is a universal quantity? Euler’s own remarks suggest that the relation between the concept of variable and that of quantity is similar to the relation between genus and species [Euler, 1990, §2].
Thus, by variable Euler means the essence of quantities, that set of features that all quantities have in common. The essential property of quantities, as Euler puts it in the Institutiones calculi differentialis, is the capability of being incremented or decremented: “every quantity can be increased or decreased by its own nature indefinitely” [Euler, 2000, Preface]. Modern mathematics has a formal understanding of the concept of variable: a variable is a symbol which stands for a particular element of a set. In contrast, a universal, or abstract, quantity such as the Eulerian variable can “take on” any value whatsoever:

Since all determined values can be expressed as numbers, a variable quantity takes on all possible numbers (all numbers of all types). [Euler, 1990, §2]

The expression “takes on” should not be understood from the point of view of modern mathematics. When we say that a variable takes on any value from a set, we understand that the symbol representing the variable quantity has a determinate value and that this value belongs to the given set. In modern mathematics a variable is necessarily a determinate quantity and the generality of the concept of variable is formal in nature. To illustrate this, consider a simple function defined on the set of natural numbers: f : N → N, f(x) = 2x + 1. We could define f, as in one of the modern definitions presented above, as the set of ordered pairs (0, 1), (1, 3), (2, 5), (3, 7), ... However, the expression 2x + 1 is a formal generalization of this definition. It is formal because, rather than enumerating the particular values of f for every input value, we replace the symbol of the input value with the symbol x which, from then on, is treated as a determinate value. In other words, the change is in notation, not in interpretation, since x is of the same kind as any number. By contrast, for Euler variables are indeterminate quantities that are all possible numbers of all possible types.
They are not “placeholders” as in the modern interpretation. They are different in kind from the values they can take. In different terms, the variable is not the result of a formal generalization, but of a generalization analogous to that from species to genus, which concerns the essence of the concept of quantity.

Hence a variable quantity can be determined in infinitely many ways, since absolutely all numbers can be substituted for it. Nor is the symbol of the variable quantity exhausted until all definite numbers have been assigned to it. Thus a variable quantity encompasses within itself absolutely all numbers, both positive and negative, integers or rational, irrational and complex numbers. Even zero and complex numbers are not excluded from the signification of a variable quantity. [Euler, 1990, §3]

This has several implications for the Eulerian notion of function, to which I will now turn. Euler’s definition of function is:

A function of a variable quantity is an analytic expression composed in any way whatsoever of the variable quantity and numbers or constant quantities. Hence every analytic expression, in which all component quantities except the variable z are constants, will be a function of that z; thus $a + 3z$; $az - 4z^2$; $az + b\sqrt{a^2 - z^2}$; $c^z$; etc. are functions of z. [Euler, 1990, §4]

From this definition it follows that a function will also be a variable quantity (§5), for simple reasons:

Since it is permitted to substitute all determined values for the variable quantity, the function takes on innumerable determined values; nor is any determined value excluded from those which the function can take, since the variable quantity includes complex values. [Euler, 1990, §5]

For Euler, functions are not rules that relate the elements of one set to the elements of another set, but relations expressed analytically.
Implicitly, Euler distinguishes between functional relations and their expressions, although the terms used are different:

The form of a function is changed, either by introducing a different variable or if the same variable is kept, the transformation consists in expressing the same function in a different way. [Euler, 1990, §27]

For instance, if in the function $a^4 - 4a^3z + 6a^2z^2 - 4az^3 + z^4$ we replace $a - z$ with $y$, we get the simpler expression $y^4$. The transformation affects only the way we represent the functional relation between the variable $z$ and the variable $a^4 - 4a^3z + 6a^2z^2 - 4az^3 + z^4$. Therefore, it would seem that what Euler means by function is the actual relation behind the analytic expression. This would be correct were it not for the fact that Euler’s definition of the concept of function implies that the form of the function, i.e. its analytic expression, also plays a part in understanding the concept. To understand the nature of the Eulerian concept we must answer two questions [Ferraro, 2000]:

Q1 Given a functional relation, what conditions must it meet in order to be considered a function?

Q2 Given a string of symbols, what conditions must it meet in order to be considered a function?

The answer to Q1 is that a functional relation is considered a function if “one was able to associate with it an algorithm consisting of symbols and rules of calculation. No function was given without a special calculus concerning it.” [Ferraro, 2000] It seems that for Euler being expressible in an analytic form is not a sufficient condition for a functional relation to become a function. He also requires knowledge of the rules according to which the value of the function is calculated for a given value of the variable. If we consider simple functions such as $x^2$ or $x + 2$, then these rules are the rules of arithmetic.
A different set of rules applies to functions such as the sine and the cosine, the logarithmic and the exponential. To illustrate this, consider the function $f(x) = a + x + x^4 + \sin x$, in which we suppose the sine function unknown. In order to claim that f(x) is a function, we would have to know how to calculate f(x) for a particular value of x. However, without knowing the rules for calculating the sine function, the rules of calculation of f(x) will also be unknown.

Euler classifies functions according to their rules of calculation. He first distinguishes between algebraic and transcendental functions. Algebraic functions relate the variables using basic arithmetical operations, “addition, subtraction, multiplication, division, raising to a power, and extraction of roots” [Euler, 1990], while transcendental functions include exponentials and logarithms. An algebraic function in which the variable is affected by a radical sign is irrational; otherwise it is non-irrational. Irrational functions are further divided into explicit and implicit functions, where an implicit function is given by an equation of the form Z^7 = nz^3 which could not have been solved in Euler’s time. Non-irrational functions too are further divided, into polynomial and rational functions. The final picture is as follows:

1. algebraic
   (a) non-irrational
       i. polynomial
       ii. rational
   (b) irrational
       i. explicit
       ii. implicit
2. transcendental

To recapitulate, a functional relation becomes a function when (a) it is expressed analytically and (b) its specific rules of calculation have been provided. These rules lie at the basis of Euler’s classification of functions, and the usual way of introducing them is through expansion into infinite series. The Introductio in analysin infinitorum contains several examples to choose from, but I will focus on the sine and cosine functions.
Euler develops the rules of calculation of the two functions starting from their geometrical interpretation as functional relations between lines in a circle and from the properties derived from it. He begins by considering a circle of radius 1 and circumference 2π in which the variable quantity is an arc of the circle, labelled z. The next step is to enumerate the properties of sine and cosine known from geometry:

Every sine and cosine lies between +1 and −1. Further, we have $\cos z = \sin\left(\frac{\pi}{2} - z\right)$ and $\sin z = \cos\left(\frac{\pi}{2} - z\right)$. We also have $(\sin z)^2 + (\cos z)^2 = 1$. . . . We note further that if y and z are two arcs, then $\sin(y + z) = \sin y \cos z + \cos y \sin z$ and $\cos(y + z) = \cos y \cos z - \sin y \sin z$. Likewise $\sin(y - z) = \sin y \cos z - \cos y \sin z$ and $\cos(y - z) = \cos y \cos z + \sin y \sin z$. [Euler, 1990, §128]

The formula $(\sin z)^2 + (\cos z)^2 = 1$ can also be written as $(\cos z + i \sin z)(\cos z - i \sin z) = 1$. The introduction of complex terms might seem unjustified; however, “although these factors are complex, still they are quite useful in combining and multiplying arcs.” [Euler, 1990, §132] The introduction of complex terms allows Euler to obtain a general formula for $\cos nz$ and $\sin nz$ through a process of ordinary (not mathematical) induction.
First, he observes that

\[ (\cos z + i \sin z)(\cos y + i \sin y) = \cos y \cos z - \sin y \sin z + (\cos y \sin z + \sin y \cos z)\, i \tag{4.1} \]

From the formulas for $\cos(y + z)$ and $\sin(y + z)$ quoted above, we get by substitution in (4.1):

\[ (\cos z + i \sin z)(\cos y + i \sin y) = \cos(y + z) + i \sin(y + z) \tag{4.2} \]

Likewise

\[ (\cos z - i \sin z)(\cos y - i \sin y) = \cos(y + z) - i \sin(y + z) \tag{4.3} \]

If instead of two variables we had three, then by following the same steps we get

\[ (\cos x \pm i \sin x)(\cos y \pm i \sin y)(\cos z \pm i \sin z) = \cos(x + y + z) \pm i \sin(x + y + z) \tag{4.4} \]

Setting the arcs equal to one another, x = y = z, equations (4.2), (4.3) and (4.4) become:

\[ (\cos x \pm i \sin x)^2 = \cos 2x \pm i \sin 2x \]
\[ (\cos x \pm i \sin x)^3 = \cos 3x \pm i \sin 3x \]

From these two cases Euler concludes that the same relation holds for higher powers, i.e.

\[ (\cos x \pm i \sin x)^n = \cos nx \pm i \sin nx \tag{4.5} \]

The lack of rigor introduced by this inductive inference might make a modern mathematician cringe, but although the justification is not perfect, the conclusion is still correct. In order to meet the more stringent requirements of modern mathematics, we would have to consider (4.5) an assumption rather than a conclusion, and try to prove that if (4.5) is true, then $(\cos x \pm i \sin x)^{n+1} = \cos(n + 1)x \pm i \sin(n + 1)x$ is also true. That Euler’s generalized formula is correct even by modern standards can easily be proven:

\[
\begin{aligned}
(\cos x + i \sin x)^{n+1} &= (\cos x + i \sin x)(\cos x + i \sin x)^n \\
&= (\cos x + i \sin x)(\cos nx + i \sin nx) \\
&= \cos x \cos nx - \sin x \sin nx + i(\cos x \sin nx + \sin x \cos nx) \\
&= \cos(x + nx) + i \sin(x + nx) \\
&= \cos(n + 1)x + i \sin(n + 1)x
\end{aligned}
\]

The generalized formula allows us to express the sine and cosine in a form to which we can apply the binomial theorem:

\[ \cos nz = \frac{(\cos z + i \sin z)^n + (\cos z - i \sin z)^n}{2} \tag{4.6} \]
\[ \sin nz = \frac{(\cos z + i \sin z)^n - (\cos z - i \sin z)^n}{2i} \tag{4.7} \]

Which gives

\[
\begin{aligned}
\cos nz = (\cos z)^n &- \frac{n(n-1)}{1 \cdot 2} (\cos z)^{n-2} (\sin z)^2 \\
&+ \frac{n(n-1)(n-2)(n-3)}{1 \cdot 2 \cdot 3 \cdot 4} (\cos z)^{n-4} (\sin z)^4 \\
&- \frac{n(n-1)(n-2)(n-3)(n-4)(n-5)}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6} (\cos z)^{n-6} (\sin z)^6 + \dots
\end{aligned}
\tag{4.9}
\]

Euler assumes that for an infinitely large n and an infinitely small z, nz will be a finite number, v. Moreover, since z is an infinitely small arc, $\sin z = z$ and $\cos z = 1$.³ An in-depth analysis of Euler’s view on infinite quantities, both large and small, will be given in the next subsection. For now it suffices to draw attention to the connection between the development of the rules of calculation of a function and assumptions regarding infinite quantities. With $z = v/n$ and n infinitely large, the products $n(n-1)$, $n(n-1)(n-2)(n-3)$, ... are treated as equal to $n^2$, $n^4$, ..., so the introduction of the infinitely small leads to a simplified⁴ form of the series:

\[ \cos v = 1 - \frac{v^2}{1 \cdot 2} + \frac{v^4}{1 \cdot 2 \cdot 3 \cdot 4} - \frac{v^6}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6} + \dots \tag{4.13} \]

This formula is the rule of calculation of the cosine. A similar rule can be deduced for the sine.

³ Euler takes these formulas not as approximations, but as actual equalities.

⁴ Since for Euler sin z is equal to z, the result is not an approximation, but a more manageable analytical expression of the series.

To recapitulate, we tried to answer Q1, i.e. given a functional relation, what are the conditions that it must meet so that it can be considered a function? The answer is that a functional relation becomes a function once it is expressed analytically and its rules of calculation have been given, with the expression “rule of calculation” referring to the procedures for calculating the value of a function for any value of the variable, as well as the methods of differentiation and integration of a function. Only when we can calculate the value of the function for a given value of the variable, and deduce its derivative and its integral, can we say that the function is known. One thing to note is that the rules of calculation must refer to variables in Euler’s sense, i.e. universal quantities, as a consequence of the definition of function. As such, the rules for calculating the value of a function abstract away from any particular interpretation of the variables in question.
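The rule of calculation (4.13) lends itself to a numerical illustration. The Python sketch below is a modern reconstruction, not Euler’s own procedure: it sums the first terms of the series, and it also evaluates the formula for cos nz with cos z = 1 and sin z = z = v/n for a large but finite n, where Euler takes n to be infinite. The function names, the floating-point arithmetic and the sample values are assumptions made for illustration.

```python
import math


def cos_series(v, terms=10):
    """Partial sum of 1 - v^2/(1*2) + v^4/(1*2*3*4) - ... with `terms` terms."""
    total = 0.0
    term = 1.0  # the term v^0 / 0!
    for k in range(terms):
        total += term
        # Pass from the term in v^(2k) to the term in v^(2k+2), flipping the sign.
        term *= -v * v / ((2 * k + 1) * (2 * k + 2))
    return total


def euler_limit_cos(v, n):
    """((1 + i*z)^n + (1 - i*z)^n) / 2 with z = v/n, i.e. the formula for
    cos nz after replacing cos z by 1 and sin z by z (finite-n stand-in)."""
    z = v / n
    plus = complex(1.0, z) ** n
    minus = complex(1.0, -z) ** n
    return ((plus + minus) / 2).real


def de_moivre_gap(x, n):
    """Absolute difference between the two sides of (4.5), taken with the plus sign."""
    lhs = complex(math.cos(x), math.sin(x)) ** n
    rhs = complex(math.cos(n * x), math.sin(n * x))
    return abs(lhs - rhs)


v = 1.0
print(cos_series(v), euler_limit_cos(v, 10**6), math.cos(v))
print(de_moivre_gap(0.3, 7))
```

For a finite n the second function only approximates the cosine, and the agreement improves as n grows; this is the modern reading of what Euler states as an exact equality for infinite n.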
This answer is supported first by Euler’s classification of functions, and second by his presentation of the sine and cosine, which begins by assuming the geometrical interpretation of the two functional relations and ends by expressing them in non-geometrical terms. As can be seen in equation (4.13), the cosine is detached from its geometrical background and expressed in an algebraic form. Whereas before Euler the values of the sine and cosine were calculated using geometrical methods which implicitly assumed that the variable was a geometrical quantity, after Euler the two functions (together with tan x and cot x) were generalized. This is in agreement with the note made earlier regarding the process of abstraction presupposed by the rules of calculation.

The answer to Q2 is similar to the answer to Q1: a string of symbols respecting the syntactical rules of algebra was a function if the rules of calculation were known and expressible in an analytical form, knowledge of the functional relation being implicit. To illustrate this, consider the following expression (F): a + x + y + qwerty(x, y, z). (F) is a string of symbols, probably a function, but since we do not know the rules of calculation of the function qwerty(x, y, z), (F) itself will be no more than a pseudo-mathematical expression. Assuming we know that qwerty(x, y, z) = x + y + z, (F) becomes a function, and the functional relation whose analytic expression is (F) is also brought to light, for now we know that the relation holds between a + x + y + qwerty(x, y, z) and x, y, z. More precisely, we know the functional relation exhibited by an analytical expression if we know which quantities are interdependent and how this dependency can be evaluated.

We can get a better picture of Eulerian functions by looking at some of their properties. Since functions are variables, they will have the same properties, i.e. they can be incremented or decremented and they “encompass” within themselves absolutely all numbers of all types.
The modern idea of a function defined on the set of rational numbers is excluded from the outset by considering functions to be Eulerian variables. More importantly, because functions are variables it follows that they are also continuous, but not in the modern mathematical sense of continuity. For Euler an object was continuous if it was taken as an unbroken object. As a variable, a function was necessarily continuous, i.e. an unbroken object, due to the definition of variable as “essence of quantity.” From here it follows that a function has to be defined by a single analytic expression. Functions such as

\[ f(x) = \begin{cases} x - 2 & : x \le 2 \\ 3x - 6 & : x > 2 \end{cases} \tag{4.14} \]

despite being continuous in the modern sense (both expressions give 0 at x = 2) are nevertheless discontinuous for Euler, since the two formulas are not equivalent.

Another important property of functions is their capability to be expanded into infinite series. This property was illustrated above, in the presentation of the sine and cosine functions, but in the fourth chapter of Introductio in analysin infinitorum Euler shows how any kind of function (algebraic or transcendental) can be expressed as a series:

Thus there is no doubt that any function of z can be given the form $Az^\alpha + Bz^\beta + Cz^\gamma + Dz^\delta + \dots$, where the exponents α, β, γ, δ, etc. are any real numbers. [Euler, 1990, §59]

4.1.2 Infinitely Small Quantities

Originally, the main concern of the calculus was finding the value of a geometrical variable⁵ for a given curve. Solutions to these problems had depended on the nature of the particular curve under scrutiny, so the method used to find the subtangent, for instance, of one curve could not be applied to find the subtangent of a different curve. In contrast, the calculus aimed to provide a unitary approach to these problems, i.e. to develop algorithms.
This is illustrated in the Introductio in analysin infinitorum, where Euler emphasizes the role played by the algorithm of expansion into infinite series in the introduction of functions; in the previous subsection we have seen how this was applied to the sine and cosine functions. It is also shown in the Preface of the Institutiones calculi differentialis, where Euler notes the importance of Leibniz’s work on the calculus:

We are no less indebted to Leibniz insofar as this calculus at that time was viewed as individual tricks, while he put it into the form of a discipline, collected its rules into a system, and gave a crystal-clear explanation. From this there followed great aids in the further development of this calculus, and some of the open questions whose answers were sought were pursued through certain definite principles. [Euler, 2000, Preface]

The search for generalization inevitably led to a departure from the geometrical “tricks”, in the form of rephrasing geometrical problems in terms of functions and variables. This paved the way for the introduction of algebraic algorithms to deal with these problems. Not surprisingly, in the Preface of the Institutiones calculi differentialis Euler points out that “everything is kept within the bounds of pure analysis, so that in the explanation of the rules of this calculus there is no need for any geometric figures.” [Euler, 2000, Preface] The algorithms introduced in infinitesimal calculus (henceforth ci) in the Institutiones calculi differentialis are the same as those for the calculus of finite differences (henceforth cfd):

The analysis of the infinite, which we begin to treat now, is nothing but a special case of the method of differences, explained in the first chapter, wherein the differences are infinitely small, while previously the differences were assumed to be finite. [Euler, 2000, §114]

⁵ See Fig. 3.1 on p. 41.

How does Euler justify this claim? There are three notions we must clarify before attempting to answer:

1. The object of ci
2. The nature of infinitely small quantities
3. The algorithms of cfd.

I shall turn now to a detailed analysis of these points.

Object of the Calculus

In the Institutiones calculi differentialis Euler defines the ci as

. . . a method for determining the ratio of the vanishing increments that any functions take on when the variable, of which they are functions, is given a vanishing increment. [Euler, 2000, Preface]

The definition suggests that for Euler ci is not a calculus of differences, but rather a calculus of increments. The difference is that in the calculus of differences we must know the values of the function for two different known values of the variable in order to determine the variation of the function. In the calculus of increments, all we need to know is one value of the variable and the increment. Then the next value of the variable can be easily determined, and so can the corresponding value of the function.

More importantly, Euler’s calculus is not a study of functions, but of ratios of vanishing increments. In Euler’s foundations of calculus, functions play a role insofar as they are variables, i.e. they can be incremented or decremented when the value of their variable is increased. The ratios of vanishing increments are the objects of investigation of the calculus. It is only in Lagrange’s Leçons sur le calcul des fonctions and Théorie des fonctions analytiques that “function” becomes the central concept of the differential calculus, by defining the differential of a function as the functional coefficient of a variable in an expansion in power series.⁶

⁶ See the next chapter on this issue.

Simply put, for Euler the calculus tries to answer this question: for a given function of the variable x, “if the quantity x is increased or decreased, by how much is the function changed, whether it increases or decreases?” [Euler, 2000, Preface] The Institutiones calculi differentialis can be viewed as giving two answers, one in the Preface, another in the fourth chapter. The answer in the Preface considers a determinate increment ω of the variable x and tries to find the corresponding increment of the function $x^2$. In modern notation,

\[
\begin{aligned}
f(x) &= x^2 \\
f(x + \omega) &= (x + \omega)^2 \\
f(x + \omega) &= x^2 + \omega^2 + 2x\omega \\
f(x + \omega) - f(x) &= 2x\omega + \omega^2
\end{aligned}
\]

For an increment ω of the variable the function is incremented by $2x\omega + \omega^2$. “Hence, the increase in x is to the increase of $x^2$ as ω is to $2x\omega + \omega^2$, that is, as 1 is to $2x + \omega$” [Euler, 2000, Preface]. In general, the ratio of the increment of any function to the increment of the variable or variables is, in Euler’s view, the foundation of the whole analysis of the infinite.

So far, the increment ω has been treated as a determinate quantity, be it finite or infinite (small or large). Euler’s next step is to consider ω a vanishing increment, i.e. ω “goes to zero”. Therefore, the ratio $\frac{f(x + \omega) - f(x)}{\omega} = 2x + \omega$ will approach $2x$. Of course, if ω approaches zero both increments vanish, so their ratio should approach $\frac{0}{0}$, but Euler insists that their ratio is nevertheless a determinate quantity.

The approach in the fourth chapter differs from the one in the Preface in that it includes the idea of “going to zero” within the notion of increment.
More precisely, instead of beginning by incrementing the variable with a determinate quantity and calculating the ratio as the increment approaches zero, in the fourth chapter Euler increments the variable by an infinitesimal quantity dx, calculates the corresponding variation of the function dy, and only then calculates the ratio of the two infinitesimals. For [Bos, 1974], this inconsistency shows that Euler’s view on infinitely small quantities does not influence his calculus. In order to analyze this claim we must turn to Euler’s considerations on the nature of the infinitely small.

In the Preface of the Institutiones calculi differentialis Euler makes clear that in reality infinitely small quantities are nothing:

They [vanishing increments] are called differentials, and since they are without quantity, they are also said to be infinitely small. Hence, by their nature they are to be so interpreted as absolutely nothing, or they are considered to be equal to nothing. [Euler, 2000, Preface]

“Absolutely nothing” here means zero. In the 18th century a number was viewed as “a determination of quantity and was generated from the flow of quantities; in particular, zero was generated from a quantity that became nothing.” [Ferraro, 2004, p. 45] Since numbers are quantities, it follows that 0 could not have been conceived as a number. Indeed, in his Elements of Algebra Euler excludes 0 from the enumeration of natural and negative numbers.

As can be observed from Newton’s and Euler’s calculations of the ratios of infinitesimals, these quantities have a double nature. Consider Euler’s calculation of the ratio for $f(x) = x^2$:

\[ f(x) = x^2 \tag{4.15} \]
\[ f(x + \omega) = (x + \omega)^2 \tag{4.16} \]
\[ f(x + \omega) = x^2 + 2x\omega + \omega^2 \tag{4.17} \]
\[ f(x + \omega) - f(x) = 2x\omega + \omega^2 \tag{4.18} \]
\[ \frac{f(x + \omega) - f(x)}{\omega} = 2x + \omega \tag{4.19} \]
\[ \omega \to 0 \tag{4.20} \]
\[ \frac{f(x + \omega) - f(x)}{\omega} = 2x \tag{4.21} \]

In the equations from (4.15) to (4.19) ω is considered a regular variable quantity that could increment or decrement a given quantity.
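The steps (4.15)-(4.19) can be replayed with exact rational arithmetic. The following Python sketch is only a modern illustration under stated assumptions: the names increment_ratio and square, the value x = 3 and the sequence of increments 1/10, 1/100, ... are hypothetical choices, and the increments are ordinary finite numbers, not Eulerian zeros.

```python
from fractions import Fraction


def increment_ratio(f, x, omega):
    """Ratio of the increment of f to the increment omega of the variable,
    as on the left-hand side of (4.19)."""
    return (f(x + omega) - f(x)) / omega


def square(x):
    return x * x


x = Fraction(3)
# Finite increments 1/10, 1/100, ..., 1/100000 (illustrative values).
increments = [Fraction(1, 10 ** k) for k in range(1, 6)]
ratios = [increment_ratio(square, x, omega) for omega in increments]

for omega, ratio in zip(increments, ratios):
    # Each ratio equals 2x + omega exactly, so it differs from 2x by omega.
    print(omega, ratio, ratio - 2 * x)
```

As long as ω is a finite quantity, the ratio differs from 2x by exactly ω; the value 2x is reached only when ω is set aside altogether.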
However, after (4.20), ω is completely ignored, and Euler maintains that this does not affect the accuracy of the result. In general, if differentials were not zeros from a mathematical perspective, then the calculus would fail to meet the same standards of rigor as geometry. The reason is that no matter how small these quantities are, the errors they introduce eventually add up, so that the end result will not be exact:

. . . if these infinitely small quantities, which are neglected in calculus, are not quite nothing, then necessarily an error must result that will be greater the more these quantities are heaped up. [Euler, 2000, Preface]

If ω had been considered 0 before (4.15), then the difference $f(x + \omega) - f(x)$ would be 0 and the left-hand side of (4.19) would be reduced to $\frac{0}{0}$. If all increments are zeros, what is the point of using different symbols to denote them?

Since all nothings are equal, it seems superfluous to have different signs to designate such a quantity. [Euler, 2000, §84]

To solve this conundrum, Euler makes use of the distinction between two means of comparing quantities: arithmetic and geometric. I will talk, following Euler, about arithmetic and geometric ratios, but these expressions do not refer to two different interpretations of a/b. “Arithmetic ratio” and “geometric ratio” should be understood as referring to the comparison of terms of an arithmetic and of a geometric progression, respectively. Thus, the arithmetic ratio of two quantities a and b is a − b, while the geometric ratio is a/b. In more precise terms, the geometric ratio “is found by resolving the question, How many times is one of those numbers greater than the other? This is done by dividing one by the other; and the quotient will express the ratio required.” [Euler et al., 1822, §440] For Euler the ratio of two infinitesimal quantities is geometric.

Everyone knows that when zero is multiplied by any number, the product is zero and that n · 0 = 0, so that n : 1 = 0 : 0.
Hence, it is clear that any two zeros can be in a geometric ratio, although from the perspective of arithmetic, the ratio is always of equals. [Euler, 2000, §85]

From here it follows that we must use different symbols to denote the zeros, even though they are essentially equal. The reason is that $\frac{0}{0}$ can have any value whatsoever, and there would be no means to distinguish between these values (e.g. $\frac{0}{0} = \frac{2}{3}$ and $\frac{0}{0} = \frac{4}{5}$ are indistinguishable if we use 0 to denote differentials). For instance, $dx = a\,dx = 0$, but the ratio $\frac{a\,dx}{dx} = a$ is a determinate quantity.

Newton’s theory of ultimate ratios, which resembles Euler’s view of geometric ratios, had a geometric justification. Ultimate ratios had a finite value because they were proportional to the ratio of the sides of a triangle. This relation was deduced from the similarity between two triangles. However, Euler wanted to reduce the dependency of the calculus on geometry, so he had to justify the finite values of ratios differently.

The intuition behind Euler’s account of ratios of differentials is that of the terms of two series which increase or decrease differently for the same value of the index. Compare the series with general terms $\frac{1}{n^2}$ and $\frac{1}{n^3}$:

n             1      2      3      4      5
S1 = 1/n^2    1/1    1/4    1/9    1/16   1/25
S2 = 1/n^3    1/1    1/8    1/27   1/64   1/125

It can be observed that for the same value of n > 1, S2(n) is smaller than S1(n). The terms of the two series become smaller and smaller and approach 0, but they do so at different rates. The ratio of the rates is analogous to the ratio of infinitesimals. This resembles Newton’s method of fluxions. Just as a fluxion is the velocity with which fluents change in time, the ratio of infinitesimals can also be viewed as a velocity, only not relative to time but to the terms of an arithmetical progression, e.g. n, n + 1, n + 2, n + 3, . . .

To sum up, Eulerian infinitesimals are zero, but from a formal point of view they are treated as regular quantities. This is the “before” interpretation of infinitesimals, i.e. the infinitesimals in equations (4.15)-(4.19). However, infinitesimals also have a “modal” aspect, a disposition to “tend” towards 0. This would be the “after” interpretation, as appears in (4.21). The presentation of the calculus in the Preface of the Institutiones calculi differentialis only seems at odds with the one in the fourth chapter of the same book. In reality they emphasize different aspects of the concept of infinitesimal. The presentation in the Preface focuses on the modal aspect, i.e. on an increment approaching nothing, while the fourth chapter emphasizes the formal aspect of infinitesimals, i.e. zero quantities. By treating differentials formally as zeros, Euler confers geometrical rigor on the calculus. The double nature of the infinitesimal lies at the basis of Euler’s extension of the rules of cfd to ci, which will be the focus of the next subsection.

4.1.3 Calculus of Finite Differences (cfd) and Infinitesimal Calculus (ci)

The first chapter of the Institutiones calculi differentialis presents the cfd. The method employed here is similar to the one used in the Preface to present the ci. Considering y, a function of x, how does y change if x is incremented by a finite quantity ω? For each increment of x there corresponds a different value of y:

x         y
x + ω     y^I
x + 2ω    y^II
x + 3ω    y^III
x + 4ω    y^IV
x + 5ω    y^V

The next step is to define the variation of y for each increment of the variable x:

\[ \Delta y = y^{I} - y, \qquad \Delta y^{I} = y^{II} - y^{I}, \qquad \Delta y^{II} = y^{III} - y^{II}, \qquad \dots \]

To each increment of x there is a corresponding variation of y:

x         y
x + ω     ∆y
x + 2ω    ∆y^I
x + 3ω    ∆y^II
x + 4ω    ∆y^III
x + 5ω    ∆y^IV

From here, we can calculate higher-order differences such as the second ($\Delta\Delta y = \Delta y^{I} - \Delta y$; $\Delta\Delta y^{I} = \Delta y^{II} - \Delta y^{I}$; . . . ), third ($\Delta^3 y$; $\Delta^3 y^{I}$; $\Delta^3 y^{II}$; . . . ) and fourth ($\Delta^4 y$; $\Delta^4 y^{I}$; $\Delta^4 y^{II}$; . . . ) order. Obviously, the values of the increments depend on the nature of y. For instance, if y = x, all increments of y will be constant ($\Delta y = \omega$; $\Delta y^{I} = \omega$; . . . ) and all increments of order higher than one will be zero ($\Delta^2 y = 0$; $\Delta^3 y^{III} = 0$; . . . ). For a function y = p + q, where both p and q are functions of x, for a finite increment of x the increment of y is $y^{I} - y = (p^{I} + q^{I}) - (p + q) = \Delta p + \Delta q$. With equal ease we obtain the rules for the differences of basic algebraic functions:

y = p + q     ∆y = ∆p + ∆q
y = p − q     ∆y = ∆p − ∆q
y = pq        ∆y = p∆q + q∆p + ∆p∆q
y = p/q       ∆y = (q∆p − p∆q) / (q^2 + q∆q)

In general, the increment of order n of a function y for the increment ω of x is given by the relation:

\[ \Delta^n y = P\omega^n + Q\omega^{n+1} + R\omega^{n+2} + S\omega^{n+3} + \dots \tag{4.23} \]

Proving (4.23) would take too long and would not present anything new. It is worth mentioning nonetheless, since it is the starting point of Euler’s presentation of the differential calculus:

In this chapter, and in all of the analysis of the infinite, the increment ω by which we let the variable x increase will be infinitely small, so that it vanishes; that is, it is equal to 0. Hence it is clear that the increase, or the difference, of the function y will also be infinitely small. With this hypothesis, each term of the expression

\[ P\omega + Q\omega^2 + R\omega^3 + S\omega^4 + \dots \tag{4.24} \]

will vanish when compared with its predecessor, so that only Pω will remain. [Euler, 2000, §113]

In this sense the differential calculus is a special case of the calculus of finite differences. From a formal point of view, the change from cfd to ci is expressed by the replacement of ∆ with d and, in (4.23), by replacing the finite differences with the infinitely small differences. If ω = dx, then the terms containing higher powers of the differential will be 0 relative to dx, just as dx is 0 relative to any finite quantity. The proof of this is easy. Since dx = 0, it follows that a finite quantity cannot be incremented by an infinitesimal, i.e. a ± dx = a. Therefore, an infinitesimal “vanishes” in comparison to a finite quantity.
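The difference rules for the product and the quotient listed above hold exactly for any finite increment, which can be checked on a concrete case. The Python sketch below is an illustration under stated assumptions: the functions p and q, the value x = 2 and the increment ω = 1/4 are hypothetical choices made only for the check, and exact rational arithmetic is used so that the equalities are not blurred by rounding.

```python
from fractions import Fraction


def delta(f, x, omega):
    """First finite difference of f at x for the increment omega."""
    return f(x + omega) - f(x)


# Hypothetical sample functions of x (any functions with q nonzero would do).
def p(x):
    return x * x + 1


def q(x):
    return 3 * x + 2


x, omega = Fraction(2), Fraction(1, 4)
dp, dq = delta(p, x, omega), delta(q, x, omega)

# y = pq:  delta y = p*dq + q*dp + dp*dq
product_rule_holds = (
    delta(lambda t: p(t) * q(t), x, omega) == p(x) * dq + q(x) * dp + dp * dq
)

# y = p/q:  delta y = (q*dp - p*dq) / (q^2 + q*dq)
quotient_rule_holds = (
    delta(lambda t: p(t) / q(t), x, omega)
    == (q(x) * dp - p(x) * dq) / (q(x) ** 2 + q(x) * dq)
)

print(product_rule_holds, quotient_rule_holds)
```

The terms ∆p∆q and q∆q are those that, once ω is taken to be infinitely small, vanish in comparison with the remaining terms, leaving the familiar rules of the differential calculus.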
$$a \pm n\,dx = a, \qquad a \pm n\,dx - a = 0, \qquad (a \pm n\,dx) : a = a : a = 1$$

In a similar manner we can prove that differentials of higher powers vanish in comparison to the same differential of power 1:

$$(dx \pm dx^{2}) : dx = \frac{dx \pm dx^{2}}{dx} = 1 \pm dx = 1$$

Because $dx \pm dx^{n} = dx$, the first-order differential of y will be P dx, that is, dy = P dx. For Euler, second-order differentials arise from second-order finite differences, not from first-order differentials:

“These [second-order differentials] arise from second differences ... when we let ω become the infinitely small dx. If we suppose that the variable x increases by equal increments, then the second value $x^{I}$ becomes equal to x + dx, and the following will be $x^{II} = x + 2dx$, $x^{III} = x + 3dx$, … Since the first differences dx are constant, the second differences vanish, and so the second differential of x, that is, $d^{2}x$, is equal to 0.” [Euler, 2000, §124]

4.2 Foundations of Mechanics

Euler developed his foundations of mechanics in parallel with those of the calculus, although the latter were published later. [Suisky, 2009] notes that the Mechanica is an application of Euler’s mathematical methods to mechanical problems. The task at hand is to find Euler’s justification for representing mechanical quantities using mathematics. A key part of this is the analogy between Euler’s view on the essence of material bodies and the nature of the infinitely small.

Euler envisages two ways in which one might be able to find the essential properties of material bodies. The first would be to examine all bodies and individuate all their common properties. This route is impossible to pursue, since the totality of bodies is potentially infinite. The second way would be to focus on a smaller set of bodies and individuate those properties in virtue of which they are bodies.

“For a property which is such that when it is absent in a thing, we would not regard the thing as a body, can justly be regarded as a general property of bodies.”
[Euler, 1862, §4]

This process is purely rational, so the end result will be the logically essential properties of bodies, i.e. those properties without which a thing would not be a material body. Furthermore, the essence of material bodies is not a common property of all bodies (the way shape might be), but a condition of possibility for a thing to be considered a material body. Although Euler does not explicitly mention this, he is assuming the existence of immaterial bodies. This is explainable, considering his religious background, but need not concern us. Following this method, Euler finds impenetrability to be the essence of material bodies:

“Every material body must occupy in space a particular location, and it is impossible for two bodies to be at the same location at the same time.” [Euler, 1862, §85]

This is an a priori truth, although Euler does not use this terminology. Should a body lack this property, it would be possible for two bodies to occupy the same location, but in that case there would be no way to distinguish between the two bodies. Moreover, as explained in An Introduction to Natural Science, a body must be impenetrable, otherwise it would be inaccessible to our senses: rays of light would “pass through” the body without being reflected, so it could not be seen. Likewise, penetrable objects could not be sensed by touch, since our hands would pass through them without meeting any resistance.

From impenetrability three other general properties of bodies follow: extent, mobility and persistence. Moreover, since each property presupposes other properties, the latter will necessarily also be properties of material bodies. Extent necessarily follows from impenetrability since the latter concept is defined by reference to location, which requires that the body should also have the property of extent.
If a body does not have extent, then the property of impenetrability cannot be attributed to it, so it cannot be considered a material body.

1. That extent is a necessary property of material bodies is also a conclusion we can arrive at from experience:

“We are not only convinced by our experience that all material bodies that we know possess extent, but our concept of bodies incorporates extent in such a way, that we can exclude all things without extent from the category of material bodies.” [Euler, 1862, §9]

A number of properties follow with necessity from the idea of bodies having extent:

Divisibility: “Everything that has extent is divisible, and divisibility can be continued ad infinitum; therefore all material bodies are infinitely divisible.” [Euler, 1862, §11]

No atomic components: “Although, in view of its divisibility, a body must be regarded as a composite thing, it is nevertheless not composed of simple things” [Euler, 1862, §13]

Dimensionality: “Every material body . . . is always extended in the so called three dimensions of length.” [Euler, 1862, §15]

Regarding the first, Euler draws upon an analogy with geometry to prove that extent necessarily entails divisibility. Given a segment of length l, any division of the line will result in a number of smaller segments of definite length, l/2, l/5, l/100, etc. This process can be carried on ad infinitum, since every smaller segment would have to have a certain size and extent, and in virtue of having extent, it can be further divided. The argument can also be applied to real material bodies, for if after repeated divisions we reach a part that cannot be divided, then that part has no size. However, it is impossible for this to happen, for there is no reason to believe that a part that is a third or a quarter of the material body has size, and at the same time to deny that the millionth part of the body has size.
Even though a body is infinitely divisible, Euler does not think this entails that bodies are composed of indivisible parts. Of course, they are composed of parts, but there are no “ultimate” parts (atoms), because atoms would have to have extent, which means they could be divided even further.

Finally, saying that material bodies have extent means that material bodies have extent in all three dimensions. There are bodies that have extent, but cannot be considered material bodies. Lines are geometrical bodies which have extent, but only in one dimension. Surfaces too have extent, but only in two dimensions. Material bodies must have three dimensions, and no more. Euler does not explain why this is the case, but merely postulates that

“A body must have threefold extent, in length, in breadth and in depth. There are no further kinds of extent.” [Euler, 1862, §15]

2. Mobility is defined as the possibility for a material body to occupy another location. Just as with the property of extent, there are two ways to prove the necessity of mobility. First, if we assume that impenetrability is the essence of material bodies, then a body cannot be anything but mobile. Assuming the contrary would mean that a body is permanently attached to a location. This can only be the case if an external force acts continuously on the body, keeping it in the same location; the force must be external “for in the thing itself one can find nothing that would prevent it from being moved from this position.” [Euler, 1862, §38] Therefore, even if an immovable body is conceivable, the cause of this immovability is not inherent in the body itself, so we can consider mobility a necessary property of material bodies. Alternatively, mobility is also proved by observing that there is nothing in the concept of space or the concept of body that would “tie down” a body to a particular location. There is no sufficient reason to believe that a body would not move should a force act upon it.

3.
Persistence is the property of bodies to remain in the same state in the absence of any external influences. If the body is in a state of rest, then there is no reason for it to change its position on its own:

“There is however no reason why it should move in one direction rather than in a different one, and in the absence of any sufficient reason for this, we can conclude with certainty that a body, once at rest, will remain in that state, unless an external cause arises that can set it in motion.” [Euler, 1862, §26]

A similar argument applies to the case of bodies in motion. If a body moves in a straight line, then it will keep doing so in the absence of any external interactions, for there are no inherent causes of sudden changes in its behavior. On the other hand, should the body’s trajectory be an arc, then this must be caused by the action of an external force. This does not mean that if a body moves in a straight line then no force is acting upon it. A force may affect two parameters of a body’s motion: the trajectory and the velocity. Curvilinear motion can be brought about only by the action of a force, but the velocity of a body can likewise be altered (increased or decreased) only through the action of an external force.

Persistence is a direct consequence of mobility. If a body is mobile, then unless it also has the property of persistence, its direction, velocity, and state would change without any external cause.

It will turn out that the applicability of the calculus to mechanics rests upon this view of bodies. This is expressed in Euler’s Mechanica sive motus scientia analytice exposita, which contains the analytical representation of mechanics using the mathematical tools of the differential calculus.
His attitude towards the use of the analytical method is expressed in the Preface:

“But as with all writings composed without analysis, and that mainly falls to be the lot of Mechanics, for the reader to be convinced of the very truth of these propositions offered, an examination of these propositions cannot be followed with sufficient clarity and distinction: thus as the same questions, if changed a little, cannot be resolved from what is given, unless one enquires using analysis, and these same propositions are explained by the analytical method. Thus, I always have the same trouble, when I might chance to glance through Newton’s Principia or Hermann’s Phoronomiam, that comes about in using these, that whenever the solutions of problems seem to be sufficiently well understood by me, that yet by making only a small change, I might not be able to solve the new problem using this method.” [Euler, 1736, Preface]

The use of analytical tools for mechanical problems has the same motivation as their use in geometry: generality. The calculus does not apply, like geometrical methods, to particular problems, but to classes of mechanical problems. Obviously, this is possible because the fundamental concepts of the calculus, i.e. variables, constants, functions, have been defined (by Euler) independently of geometry, although not independently of geometrical intuitions. In the Preface Euler also mentions the fundamental role played by infinitely small bodies:

“Moreover I have striven in the classification of the work to distinguish between bodies which can be moved and those which are fixed, as being either free or not. The division supplied by me has taken into account the innate character of the bodies themselves, as in the first case I shall investigate the motion of infinitely small bodies, or as it were of points, then indeed I shall move on to investigate bodies of finite size and these can be either rigid or flexible, or I may proceed to extended bodies which in turn are entirely free. For as in geometry, in which the measurement of bodies is discussed, it is customary to begin with a discussion of points, thus also the motion of bodies of finite sizes cannot be explained, unless first the motion of points, from which the bodies are considered to be composed, has been carefully examined.” [Euler, 1736, Preface]

The existence of infinitely small bodies is a consequence of the property of extent of material bodies. As mentioned above, in virtue of having the property of extent, material bodies also have the property of infinite divisibility. This means that infinitely small bodies are, in theory, possible. The analogy between the concept of infinitely small quantities and that of infinitely small bodies justifies the application of the differential calculus to problems of mechanics. One question we must answer is whether this application is justified by more than this analogy.

The Mechanica begins by defining the fundamental concepts of mechanics: motion, rest, place, and velocity. The place occupied by a body “is a part of the immense or boundless space which constitutes the whole world.” [Euler, 1736, §4] Based on this concept, we define absolute motion as the translation of a body from one place to another, rest as the preservation of place, and velocity as the variation of the distance in time. Relative motion and relative rest, on the other hand, assume a bounded space in which the translation or lack thereof takes place.
Euler seems to focus more on the relative notions than on the absolute ones, since the latter are not easy to manipulate:

“Moreover, because of the immense nature of space and of its unbounded nature ... we are unable to form a fixed idea of this [absolute motion/rest]. Thus, in place of this immense space within which bodies can move, we are accustomed to defining a finite space and the limits within which bodies can move, from which we can indicate the states of motion and of rest of bodies.” [Euler, 1736, §7]

The notion of infinite divisibility is not the only connection between the foundations of mechanics and those of the calculus. The idea of translation of a point of mass from one location to another is analogous to the continuous generation of geometrical curves. Translation has to be continuous; otherwise, if a body were to move from one place to another in discrete “jumps”, it would have to be annihilated at the starting point and then recreated at the destination, “which cannot be done according to the laws of nature, except by agreeing to a miracle” [Euler, 1736, §13]. [Suisky, 2009] points out that Euler describes translation and motion “in terms of finite and infinitesimal increments of the path” [Suisky, 2009, p. 121]. That this is the case can be seen in Proposition 3 of the Mechanica:

“33. In motion with any non-uniformity, the smallest elements of the distance are considered to be traversed by uniform motions” [Euler, 1736, §33].

The demonstration of this proposition draws upon an analogy with the geometrical division of curves into infinitely small segments. Just as a curve can be divided into segments, non-uniform motion can be divided into infinitely small increments of uniform velocities: “For either the elements are actually traversed in a uniform motion, or with the change of the speed by an element of this kind is so small, that the increment or decrement can be ignored without error.” [Euler, 1736, §33].
Mathematically, this means that if the initial velocity is c, the velocity in the second element is c + dc, in the third c + 2dc + ddc, i.e. (c + dc) + d(c + dc), and so on. However, since in the calculus of infinitely small quantities the increment dx of a variable was considered zero, one might be tempted to think that representing the increment of velocity in a similar manner would mean that there is no increment at all, since dc = 0. In consequence, this would be the same as saying that there is no variation of velocity. The reply to this objection is that velocity is the geometrical ratio of an infinitesimal increment of space and an infinitesimal increment of time, so the ratio has a determinate value. Therefore, in the case of non-uniform motion, the increment $dc \neq 0$.

Since the cause of the change of state of a body (from motion to rest and vice versa) is not inherent in the body (due to the property of persistence), it follows that a body changes state only as a result of an external action. This cause is called force:

“A force is an action on a free body that either leads to the motion of the body at rest or changes the motion of that body.” [Euler, 1736, §99]

With the introduction of forces the Mechanica turns towards geometry. Euler proves the principle of composition of forces (pcf) and the principle of the lever (pl) geometrically. Take, for instance, Proposition 14 in the second chapter of the Mechanica:

“For the given effect of an absolute force on a point at rest, to find the effect of the same absolute force on the same point in some kind of motion.”

The solution of this problem is not analytical, but geometrical. Euler assumes that the point is initially at A and that, if it is at rest, the force acting upon it will bring it to C in the time dt. If the force does not act, the moving point will reach position B in dt; the length of AB is c dt. The effect of the force will be the segment BD = AC.
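The construction of Proposition 14 can be sketched with coordinates, on the reading that the moving point ends the interval dt at D. This is an illustration only: Euler's solution is geometrical, and the coordinates, the velocity, the value of dt and the helper names `displace` and `difference` are all hypothetical.

```python
from fractions import Fraction

def displace(point, vector):
    """Move a point by a displacement vector."""
    return tuple(p + v for p, v in zip(point, vector))

def difference(end, start):
    """Displacement vector leading from start to end."""
    return tuple(e - s for e, s in zip(end, start))

dt = Fraction(1, 100)                      # hypothetical small time element
velocity = (Fraction(3), Fraction(0))      # hypothetical velocity c of the point

A = (Fraction(0), Fraction(0))             # initial position
# Without the force, the point covers AB = c*dt.
B = displace(A, tuple(v * dt for v in velocity))
# Acting on the point at rest, the force alone would carry it from A to C
# (the displacement below is an assumed value).
C = displace(A, (Fraction(0), Fraction(-1, 500)))
# With both motion and force, the effect of the force is laid off from B: BD = AC.
D = displace(B, difference(C, A))
```

The four points form a parallelogram, which is why the same figure can serve Euler for the composition of motions.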
Figure 4.1: Graphical representation of Proposition 14, Ch. 2, Mechanica

Euler’s introduction of functions was a giant leap forward in decoupling calculus from geometry. However, when solving various mechanical problems, he still appeals to geometrical figures and methods. Therefore, we can say that although the function came to play a greater role in his calculus, the intuitions behind it were still geometrical.

The relevance of geometry to Euler’s mechanics goes a bit further than this. Geometry connects the calculus to an idealized version of reality. Consider Euler’s remarks on the nature of material bodies and compare them with the concept of point. Both geometrical points and material bodies are infinitely divisible and neither of them is composed of simpler parts. Nevertheless, the two entities differ in kind: an infinitely small body is necessarily impenetrable, while the same does not hold for the point. This explains why Euler uses geometry to represent mechanical problems. Points resemble infinitely small bodies, and on the basis of this relation we can represent the latter through the former. If we wanted to know whether this relation of resemblance is actually accurate, we would have to know Euler’s definition of point. I am assuming that he has in mind the Euclidean definition of point as that which has no parts.

The use of the calculus is also connected to geometry via the concept of the infinitely small. Points and infinitesimals also have properties in common. Just as a point has no length, an infinitesimal is nothing, i.e. it is 0. The infinitesimal too has no parts, just like the point and the material body. This is why analytical expressions can be considered representations of geometrical objects.

The reliance on geometry as an intermediary between mechanics and calculus was completely removed by Lagrange’s work in mechanics and the calculus. This will be the topic of the next chapter.
Chapter 5

Foundational Research in the 18th Century: Lagrange

Euler’s Mechanica marks a turn to analysis from geometry, but it does not go all the way in establishing mechanics as a full-blown analytical domain. The solutions to the problems presented in the Mechanica still depended on geometrical representations. Things are quite different with the Méchanique Analitique. In the preface to the second edition Lagrange says:

“No figures will be found in this work. The methods that I will present do not require constructions, nor geometrical or mechanical justifications, but only algebraic operations.”¹

¹ “On ne trouvera point de Figures dans cet Ouvrage. Les méthodes que j’y expose ne demandent ni constructions, ni raisonnemens géométriques ou mécaniques, mais seulement des opérations algébriques” [Lagrange, 1853, p. III].

The differences between the two treatises go further than just their relation with geometry. First, Euler and Lagrange posit different principles and concepts at the foundations of mechanics: Euler begins with a distinction between internal and external principles derived from the essence of material bodies, while Lagrangian mechanics is founded upon the principle of virtual work. A second difference between the two mechanics is the different emphasis they place on theoretical concepts, such as force and vis viva. While Euler takes “force” to be a theoretical primitive, Lagrange does not try to establish forces as mechanical concepts, and the same applies to other theoretical concepts as well, e.g. mass, vis viva.

On a second level there are the differences between the foundations of the calculus in which Eulerian and Lagrangian mechanics are couched. One such example is Lagrange’s rejection of infinitesimals and his attempt to define derivatives using Taylor’s theorem. Obviously, there are also similarities between the two views on mechanics and the calculus.
Euler and Lagrange share with Jean le Rond d’Alembert and Pierre Louis de Maupertuis the Cartesian “belief in a mathematically structured reality and in the capability of the human mind to condense this reality into a deductive symbolical structure with only a few first, indisputable principles.” [Pulte, 1998, p. 156] Moreover, despite the differences mentioned above, both Euler and Lagrange emphasize the algebraic character of the calculus and its importance in the development of algorithms. Needless to say, they also share the belief that calculus must be separated from geometry and mechanics, although it is applied to solve classes of problems from these domains.

Lagrange’s work on the foundations of the calculus falls into two distinct stages: the one presented in his letters to Euler from 1755-1756, and the one presented in the Théorie des fonctions analytiques and the Leçons sur le calcul des fonctions [Fraser, 1985]. A similar, but unrelated, change of mind is present in the foundations of mechanics, as Lagrange strips the principle of least action (pla) of the fundamental role attributed to it in [Lagrange, 1867a], and places the principle of virtual velocities (pvv), later referred to as the principle of virtual work (pvw), at the basis of mechanics in [Lagrange, 1853].

5.1 Lagrange’s Criticism of His Predecessors

In the first chapter of his Théorie des fonctions analytiques Lagrange makes a definitive critique of previous attempts to provide the calculus with sound foundations. It must be noted that his objections are not original: some of them had been presented in Bishop Berkeley’s The Analyst, and most of them have been mentioned in previous chapters. Thus, the way in which the “geometers” [Lagrange, 1797] Leibniz, the Bernoullis and l’Hôpital employed the concept of infinitesimal made their calculations inaccurate. In their works, infinitely small quantities were quantities after all and, as such, they could increment or decrement another quantity.
They held that the results of the calculus were accurate only if the errors introduced by these small variations cancelled out. However, Lagrange notes that it would be difficult to demonstrate that the errors are always compensated:

“. . . it also seems to me. . . that the genuine metaphysics of this calculus consists in that the error resulting from this false supposition [infinitesimals] is rectified or compensated by the error generated by the procedures of the calculus. . . ”²

Also, in his method of fluxions, Newton tried to avoid the assumption of infinitely small quantities:

“In order to avoid assuming infinitely small quantities, Newton considered mathematical quantities as generated by movement and he searched for a method to determine directly the velocities, or rather the ratio of the variable velocities with which these quantities are generated . . . This method, or this calculus, agrees with the differential calculus in content and in what regards the basic operations, and differs only in its metaphysics which seems to be clearer, since everybody has, or believes he has, an idea of velocity.”³

² “. . . d’ailleurs il me semble que. . . la véritable métaphysique de ce calcul consiste en ce que l’erreur résultant de cette fausse supposition est redressée ou compensée par celle qui naît des procédés mêmes du calcul. . . ” [Lagrange, 1797, §4]

³ “Newton, pour éviter la supposition des infiniment petits, a considéré les quantités mathématiques comme engendrées par le mouvement, et il a cherché une méthode pour determiner directement les vîtesses variables avec lesquelles ces quantités sont produites . . . Cette méthode ou ce calcul s’accorde pour le fond et pour les opérations, avec le calcul différentiel, et n’en différe que par la métaphysique qui paraît en effet plus claire, parce que tout le monde a ou croit avoir une idée de la vîtesse.” [Lagrange, 1797, §5]

However, for Lagrange the problem with the method of fluxions is that while we might have an intuitive idea of velocity, we do not have an idea of instantaneous velocity, which is what the fluxion actually is. Moreover, insofar as the fluxion is interpreted as a velocity, it is essentially a physical concept and as such cannot be included among the foundational concepts of a mathematical discipline such as the calculus. For this reason, as noted in 3.2.1, Newton introduced the method of ultimate ratios.

Apart from their name, there is no real difference between ultimate ratios and the 18th-century concept of limit, as Lagrange and Carnot [Carnot and Browell, 1832, §78] observed. So Lagrange’s objection applies equally to both. To be more precise, the notion of limit as introduced by d’Alembert in 1754 in his Encyclopédie and used in the 18th century is:

“One quantity is the limit of another if the second can approach the first nearer than by a given quantity, so that the difference between them is absolutely inassignable.” [Boyer, 1949, p. 247]

However, when applying this concept we are considering “quantities in the state in which they cease, so to speak, to be quantities”,⁴ because they become nothing (i.e. zero). Since both terms of the ratio become zero simultaneously, the ratio will be equal to $\frac{0}{0}$, which “no longer offers a clear and precise idea to the mind.”⁵

Lagrange’s historical introductions to the calculus in the Théorie des fonctions analytiques and to mechanics in the Méchanique Analitique are an uncommon occurrence in the 18th century. Also peculiar is his changing his views on the foundations of the calculus and mechanics.
⁴ “. . . de considérer les quantités dans l’état où elles cessent, pour ainsi dire, d’être quantités. . . ” [Lagrange, 1797, §5]

⁵ “. . . n’offre plus à l’esprit une idée claire et précise. . . ” [Lagrange, 1797, §5]

5.2 Foundations of Lagrangian Calculus

Although the presentation of Eulerian foundations of calculus began with an analysis of the concept of function, a similar strategy would be redundant here, since Lagrange uses a similar concept. For him, a function is any expression into which constants and variable quantities enter [Lagrange, 1806]. Although Lagrange did not detail the definition any further, he attributes the same properties to functions as Euler, i.e. they are continuous, differentiable and expandable into infinite series. However, functions had a more important role for Lagrange than they had for Euler. Recall that for Euler geometric ratios of infinitely small increments constitute the object of study of the differential calculus. For Lagrange, on the other hand, it is functions themselves that have this role. The calculus is a study of functions, not of geometric ratios:

“But in algebra, we consider functions insofar as they result from the operations of arithmetic, generalized and translated to symbols, whereas in the calculus of functions per se, we consider the functions that result from the algebraical operation of series expansion, when we attribute an indeterminate increment to one or more quantities of the function.”⁶

To get a better picture of Lagrange’s suggestion, we should look at his analysis of the increment of a general function. I will use as a template the second lesson of [Lagrange, 1806], although the same procedure is also used in the Théorie des fonctions analytiques. Consider a function f(x) in which we substitute x + i for x, where i is an indeterminate quantity. The function will become f(x + i), which expanded into series will be

$$f(x + i) = f(x) + pi + qi^{2} + ri^{3} + \ldots \tag{5.1}$$

where p, q and r are functions of x “derived from the primitive function f x, and independent from the quantity i”⁷ and whose analytical expression depends on the expression of f.

⁶ “Mais, en algèbre, on ne considère les fonctions qu’autant qu’elles résultent des opérations de l’arithmétique, généralisées et transportées aux lettres, au lieu que, dans le calcul des fonctions proprement dit, on considère les fonctions qui résultent de l’opération algébrique du développement en série, lorsqu’on attribue à une ou à plusieurs quantités de la fonction, des accroissemens indéterminés.” [Lagrange, 1806, p. 4]

⁷ “dérivées de la fonction primitive de f x et indépendantes de la quantité i.” [Lagrange, 1806, p. 7]

In order to secure the status of derivative functions as substitutes for differentials, without appealing to infinitesimals or geometrical concepts, Lagrange must show that derivative functions exist for any primitive function whatsoever. However, as observed in the previous chapter, Eulerian functions focused on functional relations between variables, where the latter were understood to be any quantity whatsoever, including those quantities for which the value of the function cannot be calculated due to the definition of the algebraic operations applied to the functional variables⁸ or because the function is expressed as a solution to an equation, or of an integral that cannot be calculated.⁹ These cases were accepted as mere peculiarities and were not given too much thought at the time. Similarly, when we say that all functions can be expanded into infinite series, we must be aware that there will be exceptional cases in which the algebraic operation of expanding into series cannot be applied, just as other algebraic operations admit exceptions.
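An exceptional case of this kind can be illustrated with f(x) = 1/x, an example chosen here for simplicity and not taken from Lagrange or from the sources cited. For x ≠ 0 the expansion of f(x + i) in positive integer powers of i is 1/x − i/x² + i²/x³ − …, but at the particular value x = 0 the coefficients are undefined, since f(0 + i) = 1/i is itself a negative power of i. The sketch below assumes the hypothetical helper name `reciprocal_series` and arbitrary sample values.

```python
from fractions import Fraction

def reciprocal_series(x, i, terms):
    """Partial sum of 1/x - i/x**2 + i**2/x**3 - ..., i.e. of 1/(x + i) expanded in powers of i."""
    return sum((-i) ** k / x ** (k + 1) for k in range(terms))

# Generic value of x (hypothetical numbers): the expansion behaves as expected.
x, i, terms = Fraction(2), Fraction(1, 10), 6
exact = 1 / (x + i)
partial = reciprocal_series(x, i, terms)
remainder = exact - partial      # shrinks like (i/x)**terms

# Particular value x = 0: the coefficients 1/x, 1/x**2, ... do not exist.
try:
    reciprocal_series(Fraction(0), i, terms)
    expansion_exists_at_zero = True
except ZeroDivisionError:
    expansion_exists_at_zero = False
```

The general form of the series holds for the indeterminate x and fails only at an isolated value, which is the sense in which such cases were treated as exceptions.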
Lagrange does not seem to consider this an issue:

We are thus assured that since $f(x)$ expresses any algebraic function of $x$, the function $f(x+i)$ can, generally speaking, be developed into a series of this form:
\[ fx + ip + i^2 q + i^3 r + i^4 s + \text{\&c} \]
in which $p$, $q$, $r$, &c, will be new functions of $x$ derived from the primitive function $fx$. If the function $fx$ is not algebraic, we can nevertheless assume that the development of $f(x+i)$ has in general the same form, considering as particular exceptions those cases in which this development contains powers of $i$ other than positive integers.[10]

[8] Consider $f(x) = \frac{1}{1-x}$ for $x = 1$. The value of $f(x)$ at $x = 1$ cannot be calculated due to the definition of the operation of division.

[9] These transcendental functions, as Euler called them, were discussed in the previous chapter.

[10] "Nous sommes donc assurés que $fx$ exprimant une fonction quelconque de $x$ algébrique, la fonction $f(x+i)$ peut, généralement parlant, se développer en une série de cette forme, $fx + ip + i^2 q + i^3 r + i^4 s + \text{\&c}$ dans laquelle, $p$, $q$, $r$, &c, seront de nouvelles fonctions de $x$ dérivées de la fonction primitive $fx$. Si la fonction $fx$ n'est pas algébrique, on peut néanmoins supposer que le développement de $f(x+i)$ soit en général de la même forme, en regardant comme des exceptions particulières les cas où ce développement contiendrait d'autres puissances de $i$ que des puissances positives et entières." [Lagrange, 1806, p. 9]

Therefore, Lagrange's aim is to show that in general algebraic functions can be represented as in (5.1), "and he wishes to construct the demonstration so as to provide a plausible account within his algebraic framework for the possibility of exceptional values" [Fraser, 1987]. To this end, he begins by proving that in the series resulting from the expansion of an algebraic function $f(x+i)$ there are no fractional or negative powers of $i$ unless we give $x$ particular values. For instance, if there is a negative power of $i$ in the formula (5.1), then there is a term $\frac{a}{i^n}$ ($n > 0$; $n \in \mathbb{N}^*$) among the terms of the series. However, this implies that for $i = 0$ the term $\frac{a}{i^n}$ will be infinite; this means that $f(x+i) = f(x) + ip + i^2 q + i^3 r + i^4 s + \text{\&c} + \infty$ will be infinite too. For Lagrange, this is possible only for particular values of the variable $x$, so the validity of his claim is not endangered. Of course, by modern standards, this proof is tentative at best, but recall that, following Euler's lead, 18th-century mathematicians had a different understanding of variables than we do today. As explained in section 4.1.1, the relation between variables and particular values was similar to the one between genus and species. It follows that when trying to determine whether a function $f$ has a property $P$, it does not matter whether there are particular values for which $f$ does not have $P$, as long as the formal relation holds between indeterminate values.[11]

[11] In 4.1.1 I cited Ferraro [Ferraro, 2000] to illustrate this point. I will repeat that citation here: What is legitimate for the variable could not be legitimate for all its occasional values. Consequently, given any property $P$ of $x$, there might exist exceptional values at which the property fails. A proof involving the variables $x, y, \dots$ was valid and rigorous as long as the variables $x, y, \dots$ remained indeterminate; but this was no longer the case if one gave a determinate value to $x, y, \dots$. Thus, if one expanded $f(x)$ into a power series $\sum_{n=1}^{\infty} a_n x^n$ and made no assumptions concerning the individual values of variables, then the equality $\sum_{n=1}^{\infty} a_n x^n = f(x)$ was considered globally valid even if there might exist certain occasional values at which the general relation $\sum_{n=1}^{\infty} a_n x^n = f(x)$ did not furnish a numerical equality: these points were "not significant" [Ferraro, 2000, p. 119].

Note that the expansion of $f(x+i)$ has the form (5.1) only if $f(x)$ is algebraic. Lagrange explicitly restricts his argument to algebraic functions, admitting that for transcendental functions it is not possible to know whether the expansion of $f(x+i)$ has the same form as (5.1). He does show, however, that particular transcendental functions can be expanded into series such as (5.1), but does not generalize this result.[12]

[12] Thus, exponentials and logarithms are discussed in the fourth Leçon and sine and cosine in the fifth.

After showing that algebraic functions can generally be expressed as (5.1), Lagrange tries to show that although the forms of the derivative functions depend on the primitive, they are, nevertheless, governed by a "general law" [Lagrange, 1806, p. 10]. The law is obtained by a series of substitutions and expansions into series. First, if we replace $x$ with $x + o$ in $f(x+i)$ we obtain $f(x+i+o)$. Second, replacing $i$ with $i + o$ in the same function, we also obtain $f(x+i+o)$. From the resulting equality we will obtain the relation between the derivative functions. Replacing $i$ with $i + o$ in (5.1) gives
\[ f(x+i+o) = f(x) + p(i+o) + q(i+o)^2 + r(i+o)^3 + \dots \tag{5.2} \]
By expanding the powers of $(i+o)$ and retaining only the terms of first order in $o$ we obtain
\[ f(x+i+o) = f(x) + pi + qi^2 + ri^3 + si^4 + \dots + op + 2ioq + 3i^2 or + 4i^3 os + \dots \tag{5.3} \]
Replacing $x$ with $x + o$ in $f(x)$ gives
\[ f(x+o) = f(x) + op + o^2 q + o^3 r + \dots \tag{5.4} \]
Since $p, q, r, \dots$ are also functions of $x$, it follows that replacing $x$ with $x + o$ will give
\[ \begin{aligned} p(x+o) &= p(x) + op' + o^2 p'' + \dots \\ q(x+o) &= q(x) + oq' + o^2 q'' + \dots \end{aligned} \]
Then, replacing $x$ with $x + o$ throughout (5.1) and using (5.4), $f(x+i+o)$ will be
\[ f(x+i+o) = f(x) + op + o^2 q + o^3 r + \dots + i(p + op' + o^2 p'' + \dots) + i^2(q + oq' + o^2 q'' + \dots) + \dots \tag{5.5} \]
Now, (5.3) and (5.5) are both equal to $f(x+i+o)$. If the two expressions are equal, then the coefficients of the products of the same powers of $i$ and of $o$ must be equal in (5.3) and in (5.5). Comparing the coefficients of $io$, $i^2 o$, etc. gives $2q = p'$, $3r = q'$, etc., which means that $q = \frac{1}{2}p'$, $r = \frac{1}{3}q'$, $s = \frac{1}{4}r'$, and so on, so that
\[ f(x+i) = f(x) + pi + \frac{1}{2} i^2 p' + \frac{1}{3} i^3 q' + \dots \tag{5.6} \]
At this point Lagrange introduces the well-known notation of derivatives: $f'(x) = p(x)$, $f''(x) = p'(x)$ and $q = \frac{1}{2} f''$, and so on. Replacing in (5.6) we obtain the well-known formula
\[ f(x+i) = f(x) + i f'(x) + \frac{i^2}{2} f''(x) + \frac{i^3}{2 \cdot 3} f'''(x) + \frac{i^4}{2 \cdot 3 \cdot 4} f^{iv}(x) + \dots \tag{5.7} \]
This is the general law expressing the relation between the primitive and the derivative functions.

To illustrate this procedure, consider the third Leçon, in which Lagrange applies the law to find the derivative of any power function $f(x) = x^m$. For $x + i$ we have $f(x+i) = (x+i)^m = x^m \left(1 + \frac{i}{x}\right)^m = x^m (1+\omega)^m$, where $\omega = \frac{i}{x}$. The result of expanding $(1+\omega)^m$ is $1 + \omega F(m) + \dots$. For another exponent $n$, the result of expanding is $1 + \omega F(n) + \dots$. Multiplying the two functions we get
\[ (1+\omega)^{m+n} = 1 + \omega(F(m) + F(n)) + \omega^2 F(m)F(n) + \dots \tag{5.8} \]
But at the same time the expansion of $(1+\omega)^{m+n}$ is $1 + \omega F(m+n) + \dots$, which means that $F(m+n) = F(m) + F(n)$:
\[ \begin{aligned} F(m+n) &= F(m) + F(n) \\ F(m+i+n-i) &= F(m+i) + F(n-i) \\ F(m+i) &= F(m) + iF'(m) + \frac{i^2}{2} F''(m) + \dots \\ F(n-i) &= F(n) - iF'(n) + \frac{i^2}{2} F''(n) + \dots \\ F(m+n) &= F(m) + F(n) + i(F'(m) - F'(n)) + \frac{i^2}{2}(F''(m) + F''(n)) + \dots = F(m) + F(n) \end{aligned} \]
Therefore,
\[ i(F'(m) - F'(n)) + \frac{i^2}{2}(F''(m) + F''(n)) + \dots = 0 \tag{5.9} \]
which holds only if $F'(m) - F'(n) = 0$, $F''(m) + F''(n) = 0$, etc. This can only be the case if $F'(m) = F'(n) = a$ and $F''(m) = F''(n) = 0$. Given the relation (5.7), it follows that higher-order derivatives will also be 0.
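The additivity relation $F(m+n) = F(m) + F(n)$ can be checked on integer exponents, where the expansions are elementary. The following worked instance is an illustration added here, not Lagrange's own example; the exponents 2 and 3 are an arbitrary choice.

```latex
% Illustration only: integer exponents chosen for convenience.
% F(k) denotes the coefficient of \omega in the expansion of (1+\omega)^k.
\begin{align*}
(1+\omega)^{2} &= 1 + 2\omega + \omega^{2}, & F(2) &= 2,\\
(1+\omega)^{3} &= 1 + 3\omega + 3\omega^{2} + \omega^{3}, & F(3) &= 3,\\
(1+\omega)^{2}(1+\omega)^{3} &= 1 + 5\omega + 10\omega^{2} + \dots, & F(5) &= 5 = F(2) + F(3).
\end{align*}
```

In this instance $F$ is linear in the exponent, which agrees with the conclusion that $F'$ is constant and $F''$ vanishes.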
Returning to the expansion of $F(m+i)$,
\[ \begin{aligned} F(m+i) &= F(m) + iF'(m) = F(m) + ai \\ F(m + (-m)) &= F(0) = F(m) - ma \\ F(m) &= F(0) + ma = b + ma \end{aligned} \]
But $(1+\omega)^m = 1 + \omega F(m) + \dots$, that is $(1+\omega)^m = 1 + \omega(am + b) + \dots$. We can find the values of $a$ and $b$ for special values of $m$. Thus, for $m = 0$: $1 = 1 + \omega b + \dots$. Since all terms after 1 contain $\omega$ but no $\omega$ appears on the left-hand side of the equality, it follows that necessarily $b = 0$. If $m = 1$ then $1 + \omega = 1 + a\omega + \dots$, which means that $a = 1$. We conclude that $(1+\omega)^m = 1 + m\omega + \dots$. Since $x^m \left(1 + \frac{i}{x}\right)^m = x^m (1+\omega)^m = (x+i)^m$, it follows that $(x+i)^m = x^m \left(1 + m\frac{i}{x} + \dots\right) = x^m + m x^{m-1} i + \dots$. However, by (5.7), $(x+i)^m = x^m + i(x^m)' + \frac{i^2}{2}(x^m)'' + \dots$; therefore, the derivative of the function $f(x) = x^m$ is $f'(x) = m x^{m-1}$.

After finding this result, Lagrange gives "a demonstration both simple and general, and perhaps the only rigorous one, given as yet to the binomial formula for a variable exponent."[13] The demonstration is simple. If $f'(x) = m x^{m-1}$, then taking $f'$ as primitive and deriving again gives $f'' = m(m-1)x^{m-2}$. Replacing the derivatives in (5.7) gives the general case:
\[ (x+i)^m = x^m + m x^{m-1} i + \frac{m(m-1)}{2} x^{m-2} i^2 + \frac{m(m-1)(m-2)}{2 \cdot 3} x^{m-3} i^3 + \dots \tag{5.10} \]
For $x = a$ and $i = b$ we obtain Newton's binomial:
\[ (a+b)^m = a^m + m a^{m-1} b + \frac{m(m-1)}{2} a^{m-2} b^2 + \frac{m(m-1)(m-2)}{2 \cdot 3} a^{m-3} b^3 + \dots \tag{5.11} \]

[13] "... une démonstration aussi simple que générale, et peut-être la seule rigoureuse qu'on ait encore donnée de la formule du binome pour un exposant quelconque." [Lagrange, 1806, p. 17]

Although Lagrange views the proof of the binomial theorem as a consequence of (5.7), it might be tempting to view it as a verification of the validity of (5.7). This was a common means of validating mathematical procedures during the 18th century. The solutions to geometrical problems obtained through the analytical methods of the calculus were compared with the results obtained through purely geometrical methods, and from successes of this kind the validity of the calculus was inferred, regardless of its foundations. However, under this interpretation Lagrange's argument is open to one of Berkeley's objections:

And yet it should seem that, whatever errors are admitted in the Premises, proportional errors ought to be apprehended in the Conclusion, be they finite or infinitesimal: and that therefore the ἀκρίβεια of Geometry requires nothing should be neglected or rejected. In answer to this you will perhaps say, that the Conclusions are accurately true, and that therefore the Principles and Methods from whence they are derived must be so too. But this inverted way of demonstrating your Principles by your Conclusions, as it would be peculiar to you Gentlemen, so it is contrary to the Rules of Logic. The truth of the Conclusion will not prove either the Form or the Matter of a Syllogism to be true. [Berkeley, 2009, §19]

Lagrange's deduction of the general law of derivation shows that for him the relation between the primitive and its derivative is a formal algebraic relation. This way, Lagrange avoids the complications raised by infinitesimals and the appeal to geometry. Moreover, this method of derivation is completely different from the modern, limit-based approach. However, Lagrange's law is too detached from geometry [Grabiner, 2005, p. 36]. He defines the derivative of a function as the coefficient of the first power of $i$ in the expansion of the function in Taylor series, but does not explain how this concept applies to solving problems of geometry (such as finding the rate of change of a function). Furthermore, Lagrange did not prove that the series obtained from the expansion of a function is unique, i.e. that there are no other functions whose corresponding series are identical to the series of the function under investigation. Cauchy rightly pointed out that the functions $e^{-x^2}$ and $e^{-x^2} + e^{-1/x^2}$ have the same Taylor series around $x = 0$, i.e. $1 - \frac{x^2}{1} + \frac{x^4}{1 \cdot 2} - \frac{x^6}{1 \cdot 2 \cdot 3} + \dots$ [Cauchy, 1974, p. 277].
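The reason the two functions share one series can be made explicit in modern limit notation, which is of course not the framework Lagrange accepted. The sketch below assumes the usual convention $g(0) = 0$ for $g(x) = e^{-1/x^2}$; the symbols $g$, $R_n$, $t$ and $k$ are introduced here for illustration only.

```latex
% Sketch in modern notation: every derivative of g vanishes at 0,
% so adding g to a function leaves its Taylor series at 0 unchanged.
\begin{align*}
g(x) &= e^{-1/x^{2}} \quad (x \neq 0), \qquad g(0) = 0,\\
g^{(n)}(x) &= R_n\!\left(\tfrac{1}{x}\right) e^{-1/x^{2}} \quad (x \neq 0),
  \quad R_n \text{ a polynomial (by induction on } n\text{)},\\
\lim_{x \to 0} \left| \frac{e^{-1/x^{2}}}{x^{k}} \right|
  &= \lim_{t \to \infty} t^{k/2} e^{-t} = 0 \qquad (t = 1/x^{2},\ k \ge 0),\\
g^{(n)}(0) &= 0 \quad \text{for every } n \ge 0.
\end{align*}
```

Since every Taylor coefficient of $g$ at $0$ is zero, the series determines $e^{-x^2}$ no better than it determines $e^{-x^2} + g(x)$, which is the content of Cauchy's objection.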
It might be thought that Lagrange's method resembles Euler's foundations of the calculus in that it uses finite increments. However, in the eighteenth Leçon Lagrange explicitly distinguishes his own foundations from those relying on finite increments. For Lagrange, founding the differential calculus upon the calculus of finite differences can only be done through a process of passing to the limit and, as noted previously, he had already rejected the validity of such an approach. The difference between the calculus of finite differences and Lagrange's own calculus of derivatives is that the former represents the terms of the progression by functions of different quantities, while the latter represents the same terms by different functions of the same quantity. As seen in the previous chapter, to an increment $\omega$ of $x$ there corresponds a second value ($y^{I}$) of a function $y$. In a similar manner we calculate the rest of the series of values of $y$ ($y^{II}$, $y^{III}$, ...). From here we compute the differences $\Delta y = y^{I} - y$. The difference is between the different values of the same function. By contrast, in Lagrange's calculus the derivative function, which corresponds to $\frac{\Delta y}{\Delta x}$ for $\Delta x \to 0$, is a function in its own right, obtained from the primitive function or from a derivative function of a lower order following "fixed and uniform rules" [Lagrange, 1806, p. 292]:

The equations in finite differences are nothing more than a series of similar equations between different unknown quantities, by which we can always successively determine each of these quantities. But the uniform law which governs these equations allows us to consider these unknown quantities as forming a regular series which may have a general term, and the expression of this term gives the general resolution of all equations.[14]

The second problem with founding ci upon the cfd is that in moving from the latter to the former we are inevitably assuming the problematic concept of infinitely small quantities.
Although the problems raised by this concept are provisionally solved by expressing the ratio of infinitesimals as the limit of a ratio of finite differences, in doing so we are introducing the problematic concept of limit. For Lagrange, "passing from the finite to the infinite always demands a kind of leap, more or less forced, which breaks the law of continuity and changes the form of the functions."[15]

[14] "Les équations aux différences finies ne sont autre chose qu'une suite d'équations semblables entre différentes inconnues, par lesquelles on peut toujours déterminer successivement chacune de ces inconnues. Mais la loi uniforme qui règne entre ces équations fait qu'on peut regarder leurs inconnues comme formant une suite régulière et susceptible d'un terme général, et l'expression de ce terme donne alors la résolution générale de toutes les équations." [Lagrange, 1806, p. 292]

[15] "le passage du fini à l'infini exige toujours une espèce de saut, plus ou moins forcé, qui rompt la loi de continuité et change la forme des fonctions." [Lagrange, 1806, p. 293]

In the Théorie des fonctions analytiques Lagrange extends the method of derivation to functions of multiple variables. The procedure is simple. For a function $f(x, y)$ of two variables, first we replace $x$ with $x + i$ and expand the function. After this, we replace $y$ with $y + o$ in the expanded function and perform the appropriate algebraic operations.
\[ \begin{aligned} f(x+i, y) &= f(x,y) + i f'(x,y) + \frac{i^2}{2} f''(x,y) + \frac{i^3}{2 \cdot 3} f'''(x,y) + \dots \\ f(x+i, y+o) &= f(x, y+o) + i f'(x, y+o) + \frac{i^2}{2} f''(x, y+o) + \frac{i^3}{2 \cdot 3} f'''(x, y+o) + \dots \\ f(x, y+o) &= f(x,y) + o f_{\prime}(x,y) + \frac{o^2}{2} f_{\prime\prime}(x,y) + \frac{o^3}{2 \cdot 3} f_{\prime\prime\prime}(x,y) + \dots \\ f'(x, y+o) &= f'(x,y) + o f'_{\prime}(x,y) + \frac{o^2}{2} f'_{\prime\prime}(x,y) + \dots \\ f''(x, y+o) &= f''(x,y) + o f''_{\prime}(x,y) + \dots \\ &\;\;\vdots \\ f(x+i, y+o) &= f(x,y) + i f'(x,y) + o f_{\prime}(x,y) + \frac{i^2}{2} f''(x,y) + io f'_{\prime}(x,y) + \frac{o^2}{2} f_{\prime\prime}(x,y) \\ &\quad + \frac{i^3}{2 \cdot 3} f'''(x,y) + \frac{i^2 o}{2} f''_{\prime}(x,y) + \frac{i o^2}{2} f'_{\prime\prime}(x,y) + \frac{o^3}{2 \cdot 3} f_{\prime\prime\prime}(x,y) + \dots \end{aligned} \]
In the above deduction of the expansion of a function of two variables, $f'$, $f''$, $f'''$, etc. (primes written above) are the derivative functions relative to the variable $x$, while $f_{\prime}$, $f_{\prime\prime}$, $f_{\prime\prime\prime}$, etc. (primes written below) are the derivatives relative to $y$. Likewise, $f'_{\prime\prime}$ is the first derivative relative to $x$ of the second derivative relative to $y$, and so on. The general term of the expansion is
\[ \frac{i^m o^n f^{m}_{n}(x,y)}{(1 \cdot 2 \cdot 3 \cdot 4 \cdots m)(1 \cdot 2 \cdot 3 \cdot 4 \cdots n)} \tag{5.12} \]
where $f^{m}_{n}$ stands for the function carrying $m$ primes above and $n$ primes below. This small digression into functions of several variables is required if we are to understand what is probably the most important discovery of 18th-century calculus: the calculus of variations.

5.3 Euler and Lagrange on the Calculus of Variations

Simply put, the central problem of the calculus of variations is to find the function $y = y(x)$ from the members of a class of functions such that a given definite integral has an extremum. Under this heading fall a series of problems addressed in the 18th century:

1. Brachistochrone problems
2. Problems of geodesics
3. Isoperimetric problems

The first brachistochrone problem appears in the Acta Eruditorum of June 1696 [Kline, 1990] as a challenge to mathematicians raised by John Bernoulli. "The problem is to determine the path down which a particle will slide from one given point to another not directly below in the shortest time." [Kline, 1990, p. 574] If $v_1$ is the initial velocity at point $P_1$ and friction is ignored, then the brachistochrone problem is to minimize the integral
\[ I = \frac{1}{\sqrt{2g}} \int_{x_1}^{x_2} \sqrt{\frac{1 + [y'(x)]^2}{y(x) - \alpha}} \, dx \]
where $I$ represents the time of descent, $g$ is the gravitational acceleration and $\alpha = y_1 - \frac{v_1^2}{2g}$. The solution requires finding the function $y(x)$ for which $I$ is a minimum. The problem was solved independently by Newton, Leibniz, L'Hôpital and John Bernoulli.
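For orientation, the minimizing curve can be written down in modern notation. The sketch below assumes that the particle starts from rest ($v_1 = 0$, hence $\alpha = y_1$) and that $y$ is measured downward; it uses the first integral of the modern Euler-Lagrange equation (the Beltrami identity), which postdates the solutions mentioned above. The symbols $F$, $C$, $c$ and $\theta$ are introduced here for illustration.

```latex
% Sketch under stated assumptions: v_1 = 0, y measured downward.
\begin{align*}
F(y, y') &= \sqrt{\frac{1 + y'^{2}}{y - \alpha}}, \qquad
F - y'\,\frac{\partial F}{\partial y'}
  = \frac{1}{\sqrt{(1 + y'^{2})(y - \alpha)}} = C,\\
(y - \alpha)(1 + y'^{2}) &= \frac{1}{C^{2}} = 2c,\\
x - x_1 &= c\,(\theta - \sin\theta), \qquad y - \alpha = c\,(1 - \cos\theta).
\end{align*}
```

Substituting the parametrization gives $1 + y'^2 = \frac{2}{1 - \cos\theta}$, so the product $(y - \alpha)(1 + y'^2)$ is indeed the constant $2c$: the curve of quickest descent is an arc of a cycloid.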
Problems of geodesics require finding the shortest path between two points on a surface. In the simple case of a planar surface the integral is
\[ I = \int_{x_1}^{x_2} \sqrt{1 + [y'(x)]^2} \, dx \]
Obviously, the solution will be a straight line. However, 18th-century mathematicians were more interested in the shortest paths on curved surfaces, having in mind the practical applications to navigation. This research was impeded by the lack of knowledge of the curvature of the earth.

Isoperimetric problems form a subclass of the problems of geodesics, which require finding the functions $x = x(t)$ and $y = y(t)$, where $t_1 \le t \le t_2$, such that
\[ L = \int_{t_1}^{t_2} \sqrt{(x')^2 + (y')^2} \, dt \]
is a given constant and the area integral
\[ J = \int_{t_1}^{t_2} (xy' - x'y) \, dt \]
is a maximum.

Although extremum problems also appear in the works of Galileo and Newton, the calculus of variations was systematized, in collaboration, by Euler and Lagrange. Euler was first introduced to variational problems in 1728, when John Bernoulli proposed to him "the problem of obtaining geodesics on surfaces by using the property that the osculating planes of geodesics cut the surface at right angles" [Kline, 1990, p. 577]. He found a solution the same year, and in 1734 he generalized the brachistochrone problem to include resisting media and to minimize quantities other than time. After this, he generalized the results further and in 1744 he published the Methodus Inveniendi Lineas Curvas Maximi Minimive Proprietate Gaudentes, in which he presented a collection of methods for solving variational problems. As reported in [Fraser, 1985], these methods had a strong geometrical character and were too complex to be applied effectively. A complete analysis of Euler's Methodus would lead us away from our present purposes. Instead, following [Fraser, 1985], I will focus on the one problem presented in the Methodus that influenced Lagrange's own work on the calculus of variations.
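Before turning to Euler's method, the two integrals of the isoperimetric problem defined above can be evaluated on the simplest closed curve. The circle of radius $R$ and its parametrization are chosen here purely as an illustration.

```latex
% Illustration only: a circle of radius R, parametrized by t in [0, 2\pi].
\begin{align*}
x(t) &= R\cos t, \qquad y(t) = R\sin t, \qquad 0 \le t \le 2\pi,\\
L &= \int_{0}^{2\pi} \sqrt{R^{2}\sin^{2}t + R^{2}\cos^{2}t}\;dt = 2\pi R,\\
J &= \int_{0}^{2\pi} \left(R^{2}\cos^{2}t + R^{2}\sin^{2}t\right)dt = 2\pi R^{2}.
\end{align*}
```

With this normalization $J$ is twice the enclosed area $\pi R^2$, so for the circle $J = \frac{L^2}{2\pi}$, the largest value that $J$ can take for a closed curve of length $L$.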
5.3.1 Euler

The problem is presented in the third proposition of the Methodus:

If $Z$ is a determinate function of $x$, $y$ and $p$, so that $dZ = M\,dx + N\,dy + P\,dp$, discover, among all curves corresponding to the same abscissa, the one that makes $\int Z\,dx$ a maximum or a minimum.[16]

[16] "Si Z fuerit functio ipsarum x, y, & p determinata, ita ut sit $dZ = M\,dx + N\,dy + P\,dp$; invenire, inter omnes curvas eidem abscissae respondentes, eam in qua sit $\int Z\,dx$ maximum vel minimum." [Euler, 1743, p. 42]

Let us illustrate this graphically. In Figure 5.1 the relation ($Z$) between the abscissa $x$ and the ordinate $y$ is represented by the curve $az$. The segments $MN$ and $NO$ are considered infinitely small. Correspondingly, $m$, $n$ and $o$ are the points on the curve $az$ corresponding to $M$, $N$ and $O$ respectively. Let $AM = x$, $AN = x'$, $AO = x''$, $Mm = y$, $Nn = y'$ and $Oo = y''$.

Figure 5.1: Graphical representation of Proposition III of Euler's Methodus (Source: [Fraser, 1985])

$p$ is defined as the ratio between infinitesimals, i.e. $p = \frac{dy}{dx}$. Euler introduces the following relations:
\[ p = \frac{y' - y}{dx}, \qquad p' = \frac{y'' - y'}{dx} \]
Keep in mind that since $MN = dx$ is infinitely small, the difference $y' - y$ will also be infinitely small. The same notation convention applies to the function $Z$, i.e. $Z$ is the value of the function at $x, y, p$, $Z'$ the value at $x', y', p'$ and $Z''$ the value at $x'', y'', p''$. On the interval $AZ$ the integral $\int Z\,dx$ will be $(Z + Z' + Z'' + \dots)\,dx$, since the abscissa $x$ takes on all values from $A$ to $Z$, and to each infinitely small increment of $x$ there corresponds a variation of $y$ and consequently of $p$ and $Z$.

Euler's next step is to move the point $n$ to $v$, thereby incrementing the ordinate $y'$.[17] If $\int Z\,dx$ is a maximum or a minimum, then incrementing the ordinate $y'$ will not bring about any change of the integral, otherwise $\int Z\,dx$ would not be an extremum. Euler then observes that only the values $Z$ and $Z'$ are affected by the variation of $y'$.
Euler calculates the corresponding variations of $Z$ and $Z'$:
\[ \begin{aligned} dZ &= M\,dx + N\,dy + P\,dp \\ dZ' &= M'\,dx + N'\,dy' + P'\,dp' \end{aligned} \tag{5.13} \]
However, we already know that $p = \frac{y' - y}{dx}$ and $p' = \frac{y'' - y'}{dx}$. It follows that when $y'$ is increased by $nv$, $dp = \frac{nv}{dx}$ and $dp' = -\frac{nv}{dx}$. The relations in (5.13) become:
\[ \begin{aligned} dZ &= M\,dx + N\,dy + P\,\frac{nv}{dx} \\ dZ' &= M'\,dx + N'\,dy' - P'\,\frac{nv}{dx} \end{aligned} \tag{5.14} \]

[17] "Totius igitur quantitatis $\int Z\,dx$ valor differentialis ex translatione puncti n in v habebitur, si singulorum illorum terminorum, qui quidem hac translatione afficiuntur, valores differentiales quaerantur & in unam summam addantur." [Euler, 1743, p. 43]

However, if $y'$ is increased by $nv$, then $dy' = nv$ while $dx = 0$ and $dy = 0$, since these infinitely small quantities are not affected. It follows that
\[ \begin{aligned} dZ &= P\,\frac{nv}{dx} \\ dZ' &= N' \cdot nv - P'\,\frac{nv}{dx} \end{aligned} \tag{5.15} \]
Since the change of $\int Z\,dx$ is zero when $y'$ is increased by $nv$ (otherwise the integral would not be an extremum), and since the corresponding variation is $dZ + dZ'$, it follows that
\[ \begin{aligned} dZ + dZ' &= 0 \\ dZ + dZ' &= P\,\frac{nv}{dx} + \left(N' \cdot nv - P'\,\frac{nv}{dx}\right) \\ (dZ + dZ')\,dx &= P\,nv + dx\,N' \cdot nv - P'\,nv \\ (dZ + dZ')\,dx &= nv \cdot (P + N'\,dx - P') = 0 \\ P' - P &= dP \\ (dZ + dZ')\,dx &= nv \cdot (N'\,dx - dP) = 0 \\ N' - \frac{dP}{dx} &= 0 \end{aligned} \]
In the last equation Euler replaces $N'$ with $N$ and obtains his formulation of the Euler-Lagrange equations for the calculus of variations:
\[ N - \frac{dP}{dx} = 0 \tag{5.16} \]
(5.15) is obtained from (5.14) by taking into consideration the values of those infinitesimals which were not affected by the original variation, i.e. $dx = 0$, $dy = 0$. There seems to be a contradiction here with Euler's general view of infinitesimals. In chapter 4 we saw that for Euler infinitely small quantities were ultimately nothing, yet now he considers that some of them could be different from 0, i.e. $dp = \frac{nv}{dx}$, $dp' = -\frac{nv}{dx}$, $dy' = nv$. The values of the infinitesimals in this case should be understood from a modal perspective, i.e. the values of $dp$, $dp'$, $dy'$ are the values the displacements would have if the point $n$ were moved to $v$.
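Equation (5.16) can be tried on the planar geodesic integral mentioned at the beginning of this section, where the integrand is the arc-length element. The worked instance below is an illustration in Euler's symbols $M$, $N$, $P$, $p$, not an example taken from the Methodus; the constants $c$, $a$ and $b$ are introduced here.

```latex
% Illustration only: Euler's equation N - dP/dx = 0 applied to arc length.
\begin{align*}
Z &= \sqrt{1 + p^{2}}, \qquad dZ = M\,dx + N\,dy + P\,dp
  \quad\text{with}\quad M = 0,\ N = 0,\ P = \frac{p}{\sqrt{1 + p^{2}}},\\
N - \frac{dP}{dx} &= 0 \;\Longrightarrow\; \frac{dP}{dx} = 0
  \;\Longrightarrow\; \frac{p}{\sqrt{1 + p^{2}}} = c \quad (|c| < 1),\\
p &= \frac{c}{\sqrt{1 - c^{2}}} = a \;\Longrightarrow\; y = ax + b.
\end{align*}
```

The extremal is a straight line, in agreement with the remark that the solution of the planar problem is obvious.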
To put matters into perspective, consider the overall problem of finding the curve for which $\int Z\,dx$ is an extremum. If $n$ were moved to $v$, then the integrand would change by $dZ + dZ'$. This variation, however, has to be null, and from here we deduce (5.16). The modal interpretation of infinitesimals will be more visible in section 5.4 where, instead of considering counterfactual variations of geometrical curves, we will focus on the application of infinitesimals to mechanics. However, before tackling this problem, I will present Lagrange's calculus of variations, since this is the version used in his Méchanique Analitique.

5.3.2 Lagrange

The problem with Euler's calculus of variations was that it was overly complex and difficult to apply to specific problems. In a series of letters to Euler, Lagrange systematized the calculus of variations and introduced simpler methods for it. However, after the publication of the Méchanique Analitique Lagrange's approach to the calculus of variations changed, as a result of his turn towards a more algebraic differential calculus.

In a letter to Euler sent on August 12th 1755 Lagrange presents his reply to Euler's Methodus Inveniendi and introduces the $\delta$ operator to the infinitesimal calculus. The definition of the operator is operational. Thus, $x$ is constant relative to $\delta$, i.e. $\delta x = 0$; $\delta y$ represents a differential of $y$, but one applicable only to problems of maxima and minima, which distinguishes it from the differential $dy$ appearing in these problems. $\delta F y$ represents the variation of the function $F$ corresponding to an increment $\delta y$ of $y$. Lagrange postulates that
\[ d\,\delta F y = \delta\, d F y, \qquad d^m \delta F y = \delta\, d^m F y \tag{5.17} \]
The reason I call (5.17) a postulate is that Lagrange justifies it by invoking one of Euler's earlier works [Euler, 1740].
However, in [Euler, 1740] Euler looks for a method of finding the equations of curves of the same kind:

Here I call curves of the same kinds such curves that are only different from each other because of the length of a certain constant line, which by assuming some or other values determines these curves. [Euler, 1740, §1]

Euler calls the constant line "modulus", but modern mathematics calls it "parameter". This parameter determines an infinite number of curves:

Thus a parameter is a constant invariable line, by which each one of an infinitude of curves is determined; moreover it has different values and thus it is a variable, if it is to refer to different curves. Thus if in the equation $y^2 = ax$, $a$ is taken for the parameter, from the variability of $a$ innumerable parabolas can arise placed on the same axis and having a common vertical axis. [Euler, 1740, §1]

Thus, the treatise does not seem to have much to do with Lagrange's postulate. The only connection seems to be Euler's differentiation of a function of two variables (in modern parlance, $\frac{\partial^2 z}{\partial a\,\partial x} = \frac{\partial^2 z}{\partial x\,\partial a}$). However, Lagrange does not present any arguments supporting a logical connection between Euler's method of derivation and the relation (5.17).

After postulating (5.17), Lagrange lists several results of the integral calculus, numbered (5.18) to (5.23):
\[ \begin{aligned} \int z\,du &= zu - \int u\,dz \qquad (5.18) \\ \int z\,d^2u &= z\,du - u\,dz + \int u\,d^2z \\ \int z\,d^3u &= z\,d^2u - dz\,du + u\,d^2z - \int u\,d^3z \\ \int u \int z &= \int u \times \int z - \int z \int u \\ \int u \int z &= \int Vz \end{aligned} \]
where $\int u = H$ and $V = H - \int u$. Moreover, $\int u$ is the definite integral of $u$ from an unspecified value $x$ to $x = a$, $\int z \int u$ is the integral of $z \int u$, i.e. first we calculate the value of $\int u$ and then the integral of $z \int u$, and $\int u \times \int z$ is the product of the two integrals.

After introducing these theoretical considerations, Lagrange attacks the first problem discussed above in connection with Euler's Methodus Inveniendi, i.e. to find the relation between $x$ and $y$ that makes the integral $\int Z$ an extremum.
The relation between $Z$ and the variables $x$, $y$, etc. is
\[ \delta Z = N\,\delta y + P\,\delta dy + Q\,\delta d^2y + R\,\delta d^3y + \dots \tag{5.24} \]
With the same ease with which he postulated (5.17), Lagrange introduces the relation
\[ \delta \int Z = \int \delta Z \tag{5.25} \]
Integrating (5.24) gives:
\[ \int \delta Z = \delta \int Z = \int N\,\delta y + \int P\,\delta dy + \int Q\,\delta d^2y + \int R\,\delta d^3y + \dots \]
By applying (5.17) to the previous equation, thus switching the symbols $\delta$ and $d$, Lagrange obtains
\[ \delta \int Z = \int N\,\delta y + \int P\,d\delta y + \int Q\,d^2\delta y + \int R\,d^3\delta y + \dots \]
After applying (5.18) and cancelling out some of the terms, we obtain
\[ \delta \int Z = \int (N - dP + d^2Q - \dots)\,\delta y \tag{5.26} \]
Since the integral has to be an extremum, its variation $\delta \int Z$ must be zero. So,
\[ \begin{aligned} \int (N - dP + d^2Q - \dots)\,\delta y &= 0 \\ N - dP + d^2Q - \dots &= 0 \end{aligned} \]
If $Z$ is a function only of $x$, $dx$, $y$ and $dy$ we obtain Lagrange's formulation of the Euler-Lagrange equations for the calculus of variations:
\[ N - dP = 0 \tag{5.27} \]
As can be seen from the above deduction of (5.27), at this point of his career Lagrange assumes the existence of infinitesimals, presumably interpreted in an Eulerian sense. At the same time, the deduction illustrates the emergence of Lagrange's formalist approach to the calculus. The operator $\delta$ is introduced as a helpful instrument, not as a symbol that stands for something. We observed this formalist tendency in the previous section, but in that case we were dealing with the foundations of the calculus in Lagrange's late work. It is also important to note that this letter is Lagrange's first attempt to address the problems of the calculus. In his subsequent correspondence with Euler, Lagrange changed his view slightly, as a result of his attempts to introduce the $\delta$ operator to solve mechanical problems.
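The step from the integrated form of (5.24) to (5.26) can be spelled out using the integration-by-parts formulas that Lagrange lists after (5.17). The sketch below assumes that the "cancelled" terms are the integrated (boundary) terms and that they vanish because $\delta y$ and its differentials are zero at the limits of integration; this reading is an assumption made here, not a statement found in the letter as summarized above.

```latex
% Sketch: integration by parts with z = P (resp. Q) and u = \delta y.
\begin{align*}
\int P\,d\delta y &= P\,\delta y - \int dP\,\delta y,\\
\int Q\,d^{2}\delta y &= Q\,d\delta y - dQ\,\delta y + \int d^{2}Q\,\delta y,\\
\delta\!\int Z &= \bigl[\,P\,\delta y + Q\,d\delta y - dQ\,\delta y + \dots\bigr]
  + \int \bigl(N - dP + d^{2}Q - \dots\bigr)\,\delta y .
\end{align*}
```

Once the bracketed terms are dropped, only the integral remains, and since $\delta y$ is otherwise arbitrary the factor $N - dP + d^2Q - \dots$ must itself vanish.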
In a letter to Euler written on October 5th 1756, Lagrange tried to apply what he had learned during his mechanical researches to the problem of the brachistochrone, a problem to which he returned later on, in 1760, in his Essai d'une nouvelle méthode pour determiner les maxima et les minima des formules intégrales indéfinies [Lagrange, 1867b]. More important than all these works is the publication of the Méchanique Analitique, in which Lagrange applies the variational calculus as he had developed it up to that point. The next sections will analyze the Méchanique Analitique in greater depth; for now, suffice it to say that the foundations of calculus presented in section 5.2 are a later development in Lagrange's thought. Nevertheless, these "new foundations" of the differential calculus, presented in the Théorie des fonctions analytiques and the Leçons sur le calcul des fonctions, also lie at the basis of the calculus of variations. For instance, in the twenty-second lesson Lagrange derives the Euler-Lagrange equation, only this time he does so on the basis of a calculus defined by means of Taylor power series:

The method of variations, founded upon the use and the combination of the characteristics $d$ and $\delta$, which correspond to different differentiations, left nothing to be desired; but since this method has at its basis, like the differential calculus, the assumption of infinitesimals, it was necessary to present it from another point of view in order to connect it to the calculus of functions ...[18]

[18] "La méthode des variations, fondée sur l'emploi et la combinaison des caractéristiques d et δ qui répondent à des différenciations différentes, ne laissait rien à desirer; mais cette méthode ayant, comme le calcul différentiel, la supposition des infiniment petits pour base, il était nécessaire de la présenter sous un autre point de vue pour la lier au Calcul des fonctions ..." [Lagrange, 1806, p. 441]

The details of Lagrange's deduction of the Euler-Lagrange equation from the "new foundations" are not particularly relevant for our present purposes, since the method used in the Méchanique Analitique is the $\delta$-algorithm.

5.4 The Project of Mechanics in Lagrange's Méchanique Analitique

The Méchanique Analitique marks a change in Lagrange's approach to the foundations of mechanics. More precisely, he strips the Principle of Least Action (pla) of its fundamental role and constructs mechanics in a deductive manner starting with the Principle of Virtual Velocities (pvv). Lagrange constructs statics from the pvv and then uses d'Alembert's principle (pda) to logically connect statics with dynamics. Apart from this axiomatic structure, the Méchanique Analitique is also characterized by its indifference towards theoretical concepts, particularly the concept of force. The first part of the Méchanique Analitique begins with the definition of force and statics:

Statics is the science of the equilibrium of forces. In general, by force or power we understand the cause, whatever that might be, which impresses or tends to impress a movement to the body to which the force is applied; and it is also by the quantity of movement impressed, or ready to be impressed, that the force or power must be estimated.[19]

[19] "La Statique est la science de l'équilibre des forces. On entend, en général, par force ou puissance la cause, quelle qu'elle soit, qui imprime ou tend à imprimer du mouvement au corps auquel on la suppose appliquée; et c'est aussi par la quantité du mouvement imprimé, ou prêt à imprimer, que la force ou puissance doit s'estimer." [Lagrange, 1853, §1]

Equilibrium is attained by the mutual "destruction" of forces, and the aim of statics is to provide the laws governing this process of mutual cancelling out. The laws themselves are derived from three principles:

1. the principle of the lever (pl)
2. the principle of the composition of forces (pcf)
3. the principle of virtual velocities (pvv)

In connection with the pl Lagrange mentions the work of Archimedes, whom he credits as the author of the pl, of Stevin and Galileo, for simplifying Archimedes' proof, and of Huyghens. In Lagrange's formulation, the pl consists in that "if a straight lever is loaded with any two weights placed on either side of the fulcrum, at distances from it inversely proportional to these weights, the lever will be in equilibrium and the fulcrum will be loaded with the sum of the two weights."[20] Archimedes saw the principle as an axiom of mechanics or as a principle justifiable through experience. Lagrange rejects the latter interpretation, focusing instead on the self-evident, axiomatic character of a restricted pl: "The equilibrium of a straight and horizontal lever, whose extremities are loaded with equal weights and whose fulcrum is at the middle of the lever, is a self-evident truth, because there is no reason why one of the weights should prevail over the other, everything being equal on either side of the fulcrum."[21]

The pcf is the parallelogram rule applied to forces: if two forces move a body uniformly following two different directions, they are equivalent to a single force moving the body uniformly following the diagonal of the parallelogram constructed by taking the directions of the two forces as adjacent sides. Lagrange credits Varignon and Stevin rather than Galileo with discovering the principle, and for good reason. Galileo discovered the principle of the composition of motions in order to determine the curve described by a projectile under the action of gravity. Varignon and Stevin, on the other hand, explicitly used the composition of forces.
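The parallelogram rule described above can be written out in modern component notation. This is an illustration added here, not Varignon's, Stevin's or Lagrange's own formulation; the angle $\varphi$ between the two forces, the resultant $R$ and its inclination $\psi$ to the first force are symbols introduced for the example.

```latex
% Illustration only: forces P and Q acting at an angle \varphi,
% with P taken along the first coordinate axis.
\begin{align*}
\vec{P} &= (P,\ 0), \qquad \vec{Q} = (Q\cos\varphi,\ Q\sin\varphi), \qquad
\vec{R} = \vec{P} + \vec{Q} = (P + Q\cos\varphi,\ Q\sin\varphi),\\
R^{2} &= P^{2} + Q^{2} + 2PQ\cos\varphi,\\
P = Q:&\quad R = 2P\cos\tfrac{\varphi}{2}, \qquad
\tan\psi = \frac{\sin\varphi}{1 + \cos\varphi} = \tan\tfrac{\varphi}{2}.
\end{align*}
```

For two equal forces the resultant lies along the bisector of the angle between them, which is the special case from which d'Alembert's proof starts.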
On the question of justifying the pcf, Lagrange turns to d'Alembert's proof, given in the first volume of the Opuscules.

[20] "en ce que si un levier droit est chargé de deux poids quelconques placés, de part et d'autre du point d'appui, à des distances de ce point réciproquement proportionnelles aux mêmes poids, ce levier sera en équilibre, et son appui sera chargé de la somme des deux poids." [Lagrange, 1853, §1]

[21] "L'équilibre d'un levier droit et horizontal, dont les extrémités sont chargées de poids égaux, et dont le point d'appui est au milieu du levier, est une vérité évidente par elle-même, parce qu'il n'y a pas de raison pour que l'un des poids l'emporte sur l'autre, tout étant égal de part et d'autre du point d'appui." [Lagrange, 1853, §2]

D'Alembert's proof is based on two principles: (i) two equal forces acting on an object are equivalent to a single force dividing the angle between the first two into two equal angles, and (ii) if the forces are multiplied by the same number, then the resultant is multiplied by the same factor. Lagrange considers (ii) to be self-evident once forces are regarded as quantities. As to (i), Lagrange explains it by considering the movement impressed on a body by two forces. Since the two forces necessarily make the body move on a unique trajectory, the movement can be attributed to a single force. Moreover, the direction of this force has to be the bisector of the angle between the two forces because, as Lagrange argues, there is no sufficient reason to believe otherwise.

Finally, the most important principle is the pvv.
By virtual velocity Lagrange understands the velocity “that a body in equilibrium is disposed to receive, in case the equilibrium would be broken, that is, the velocity that the body really gains in the first instant of its movement; and the principle of virtual velocities claims that the powers are in equilibrium when they are in inverse ratio of their virtual velocities, estimated following the directions of these forces.”22 Lagrange’s taste for the historical development of mechanics is present here too. He attributes the pvv to Guido Ubaldi, who was led to it after studying the lever and the pulleys. Galileo too reached it after studying the inclined plane and considered it a fundamental principle of mechanics. Lagrange also mentions Descartes and Wallis as positing principles similar to the pvv at the basis of their theories of statics.

22 “... celle qu’un corps en équilibre est disposé à recevoir, en cas que l’équilibre vienne à être rompu, c’est-à-dire la vitesse que ce corps prendrait réellement dans le premier instant de son mouvement; et le principe dont il s’agit consiste en ce que des puissances sont en équilibre quand elles sont en raison inverse de leurs vitesses virtuelles, estimées suivant les directions de ces puissances.” [Lagrange, 1853, §15]

An important concept that Lagrange retains, and will use later on, is Galileo’s concept of momentum:

By momentum of a weight or a force applied to a machine, Galileo understands the action, the energy, the effort, the impetus of this force to move the machine, such that there is an equilibrium between two forces only if their momenta for moving the machine in opposite directions are equal; and he shows that the momentum is always proportional to the force multiplied by the virtual velocity, depending upon the way in which the force acts.23

Nevertheless, Lagrange does not consider the previous formulation of the pvv general enough to have a foundational character.
Moreover, another problem with the above formulation of pvv is that it cannot be expressed mathematically. Compare it with Lagrange’s definition:

If a system of however many bodies or points we desire, each pulled by arbitrary forces, is in a state of equilibrium, and if we impress a small movement on the system, by virtue of which each point travels an infinitely small distance which will express its virtual velocity, then the sum of the forces, each multiplied by the distance that its point of application travels along the direction of that force, will always be zero, by considering as positive the small distances travelled in the sense of the forces, and as negative those distances travelled in the opposite sense.24

Lagrange himself observes that in this formulation the principle can be translated into a general formula that can be applied to solve all problems of statics. In the second section of the first part of the Méchanique Analitique Lagrange constructs the formula of the pvv starting from its natural language formulation. He begins by considering a system of two forces in equilibrium. Let $P$ and $Q$ be two forces oriented along the straight lines $p$ and $q$. The infinitely small quantities $dp$ and $dq$ are the displacements along $p$ and $q$ brought about by the two forces. The virtual velocities of $P$ and $Q$ are proportional to the infinitely small displacements $dp$, $dq$, and Lagrange considers that they can be, for the sake of simplicity, taken to be equal to these displacements.
23 “Galilée entend par moment d’un poids ou d’une puissance appliquée à une machine, l’effort, l’action, l’énergie, l’impetus de cette puissance pour mouvoir la machine, de manière qu’il y ait équilibre entre deux puissances, lorsque leurs moments pour mouvoir la machine en sens contraires sont égaux; et il fait voir que le moment est toujours proportionnel à la puissance multipliée par la vitesse virtuelle, dépendante de la manière dont la puissance agit.” [Lagrange, 1853, §16]
24 “Si un système quelconque de tant de corps ou points que l’on veut, tirés chacun par des puissances quelconques, est en équilibre, et qu’on donne à ce système un petit mouvement quelconque, en vertu duquel chaque point parcoure un espace infiniment petit qui exprimera sa vitesse virtuelle, la somme des puissances multipliées chacune par l’espace que le point où elle est appliquée parcourt suivant la direction de cette même puissance, sera toujours égale à zéro, en regardant comme positifs les petits espaces parcourus dans le sens des puissances, et comme négatifs les espaces parcourus dans un sens opposé.” [Lagrange, 1853, §17]

If the two forces are in equilibrium, then they must be oriented in opposite directions. Moreover, according to pl, if the two forces are in equilibrium, then $P$ and $Q$ are in an inverse ratio with the displacements $dp$ and $dq$, i.e. $\frac{P}{Q} = -\frac{dq}{dp}$, where the minus sign comes from taking into account their different orientations. Therefore,

\[ P\,dp + Q\,dq = 0 \tag{5.28} \]

The product $P\,dp$ is called the momentum of $P$. After reaching the condition of equilibrium in this simple case, Lagrange considers three forces, $P$, $Q$, $R$, on directions $p$, $q$, $r$, causing displacements $dp$, $dq$ and $dr$, and supposes that the three forces are in equilibrium. Lagrange applies the pcf to break down $Q$ into $Q'$ and $Q''$, where $Q'$ and $P$ are opposing each other, just as $Q''$ and $R$ do.
Applying (5.28) to the two pairs of forces, Lagrange obtains the relations:

\[ Q'\,dq + Q''\,dq = Q\,dq \]
\[ P\,dp + Q'\,dq = 0 \tag{5.29} \]
\[ Q''\,dq + R\,dr = 0 \tag{5.30} \]

By adding (5.29) to (5.30) we obtain the pvv for the case of a system acted upon by three forces:

\[ P\,dp + Q\,dq + R\,dr = 0 \tag{5.31} \]

Following the same reasoning, the pvv can be proven for four, five, or more forces. First we break down the forces such that we can form pairs of opposing component forces, then we apply pl to obtain the relations between the forces of each pair. Finally, we add up these formulas to obtain the final condition of equilibrium:

\[ P\,dp + Q\,dq + R\,dr + \dots = 0 \tag{5.32} \]

This is the “general formula of Statics for the equilibrium of any system of forces.”25

The considerations above should not be viewed as an inductive justification from experience of the pvv, but as an illustration of (i) formalizing a principle originally expressed in natural language and (ii) the connection between the three principles, pvv, pl and pcf. Regarding (i), we must observe that in switching from the non-mathematical to the mathematical expression of pvv Lagrange assumes a particular view of infinitely small quantities. In Lagrange’s argument, if $dp$, $dq$, $dr$, ... are considered to represent actual displacements, the system could not be considered in equilibrium, unless these displacements are viewed as zeros. Otherwise, it would mean that the forces are actually moving the system. However, if the displacements are zeros, then (5.32) would be trivial, i.e. $0 + 0 + 0 + \dots = 0$.

It is also worth noticing that in this context the differentials differ from the ones examined in the previous chapter. Here $dp$, $dq$, $dr$, etc. are interrelated, such that when one of them varies, the magnitude of the other differentials will depend on the variation of the first. This dependency can be expressed in a more accurate form as the proportion between the differentials and the corresponding forces, i.e. $\frac{P}{Q} = -\frac{dq}{dp}$.
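The interdependence of the differentials can be made concrete with the straight lever of the pl. The following is only a sketch under stated assumptions: the arm lengths $a$ and $b$ and the infinitely small rotation $d\theta$ are symbols introduced here for illustration, not Lagrange’s.

```latex
% Assumed setup: a straight lever with weights P and Q hanging at distances
% a and b on opposite sides of the fulcrum, rotated by an infinitely small
% angle d\theta (a, b, d\theta are illustrative symbols).
% Displacements, counted as positive in the sense of each force:
\[ dp = a\,d\theta, \qquad dq = -\,b\,d\theta . \]
% Both differentials depend on the single quantity d\theta, so their ratio is fixed:
\[ \frac{dq}{dp} = -\frac{b}{a}. \]
% Substituting into the two-force formula (5.28):
\[ P\,dp + Q\,dq = (P\,a - Q\,b)\,d\theta = 0
   \quad\text{for every } d\theta
   \;\Longleftrightarrow\;
   \frac{P}{Q} = \frac{b}{a} = -\frac{dq}{dp}. \]
```

In this simple case the condition $P\,a = Q\,b$ is the pl itself: the weights are in equilibrium when their distances from the fulcrum are inversely proportional to them.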
In Lagrange’s formulation of pvv, i.e. (5.32), this interdependency is implicit, but we should not lose track of the order of ideas. The pvv follows from $\frac{P}{Q} = -\frac{dq}{dp}$, not the other way around. This shows that a mathematical formula can express more than its mathematical equivalents, i.e. formulas obtained from the initial one following mathematical rules.

25 “. . . la formule générale de la Statique pour l’équilibre d’un système quelconque de puissances.” [Lagrange, 1853, Part I, Section II §2]

It makes sense to say that the infinitesimals in (5.32) have a modal interpretation. They represent the infinitely small displacement of a point if the system were to receive an infinitely small impulse. This allows Lagrange to apply the formal rules of the infinitesimal calculus and at the same time avoid the problems that arise by treating the differentials as zeros. More importantly, Lagrange’s mathematization of pvv shows the dependence between the foundations of the calculus and those of mechanics. The dependence is not logical; mechanics is not deduced from the calculus. Instead, the foundations of the calculus are conditions of expressibility of the foundations of mechanics. In the absence of infinitesimals, the mathematical formulation of what Lagrange considers an axiom of mechanics (pvv) would not have been possible.

Lagrange realized that the pvv was not sufficiently self-evident to be an axiom. For this reason he tried to provide an alternative justification based on what he calls the principle of pulleys (pp). Lagrange represents a system of forces ($P$, $Q$, $R$, . . . ) acting on points of mass by a number of block and pulley systems sharing the same rope, each pulley having a mass attached. Since the points of mass and forces are all parts of the system, Lagrange can represent their interconnectedness by considering the pulleys as sharing the same rope.26 One end of the rope is fixed, while the other is tied to a unit weight (call it $G$).
26 I am indebted to Hepburn [2007] for clarifying this principle.

Moreover, the rope is looped in each block and pulley, the number of loops corresponding to the magnitude of the force it represents, i.e. if the force is 5 then there will be five loops; if the force is $P$, then there will be $P$ loops. Each block and pulley represents a system composed of a force and a point of mass. The overall system of points of mass and forces is represented by all the block and pulley systems connected by the same rope. This means that if one block and pulley system is in equilibrium, then, in virtue of being connected, the next block and pulley is also in equilibrium, and so on. The equilibrium of the whole system can be observed by looking at the state of the weight $G$.

Figure 5.2: Lagrange’s Illustration of the Principle of Virtual Velocities. Source: [Pulte, 1998]

Now, if we consider that one block and pulley system moves the infinitely small distance $dp$, then the other block and pulley systems will also move by $dq$, $dr$, etc. Because of this, the weight $G$ will descend by

\[ P\,dp + Q\,dq + R\,dr + S\,ds + \dots \tag{5.33} \]

Therefore, since $G$ always tends to descend, the system will be in equilibrium only if there is no infinitely small displacement of $G$. Lagrange’s condition of equilibrium will be

\[ P\,dp + Q\,dq + R\,dr + S\,ds + \dots = 0 \tag{5.34} \]

For Lagrange the negative value of the sum of products also indicates a state of equilibrium, since it would be impossible for the weight $G$ to ascend by itself, i.e. without the intervention of an external force.

To recapitulate, Lagrange tried to construct an axiomatic mechanics that would preserve the standards of rigor of ancient geometry. To this end, in the first two sections of the Méchanique Analitique he tried to show the self-evident character of pvv, pl and pcf. While the latter two principles might be considered self-evident, we cannot say the same about pvv.
Lagrange tried to justify pvv and for this reason he introduced the pp, both as a proof and as an example of the mathematization of pvv. However, Lagrange’s pp raises a number of issues. First, it might be objected that justifying the pvv by analyzing the behavior of a mechanical system is a kind of circular reasoning. To sum up this objection, remember that for Lagrange the pvv is the theoretical foundation of our understanding of the equilibrium of a mechanical system, but in order to justify the pvv Lagrange presupposes that the equilibrium of a mechanical system is already understood. This objection would be valid only if Lagrange were to present the pp as a justification of pvv. This, however, is not the case. The pp is an illustration of pvv. More precisely, the role of pp is to make the pvv more intuitive by rephrasing it in terms that are closer to experience.

Two arguments support this interpretation and clarify the role of the pp. First, in pp the concept of force is not mentioned and the characteristics of forces are replaced by physical objects or their attributes. For instance, the magnitude of the force is represented by the number of loops of rope around each block and pulley system, the direction of the force is replaced by the direction of the rope around the block and pulley (see fig. 5.2), and so on. This suggests that the pp should not be viewed as a principle per se, but as an illustration aimed at making pvv more intelligible, playing an expositional role similar to the one played by graphs and diagrams in science in general. Second, Lagrange repeatedly mentions the foundational role of pvv, not of pp. However, if pvv were deduced from pp, then Lagrange should have attributed the foundational role to the latter, not the former.

Although the objection of circularity may be avoided, the pp still has some problems.
I will present here just two, both raised in Carl Jacobi’s Vorlesungen über analytische Mechanik (1848).27 One such problem is that the pp considers only the vertical movement of the weight $G$. Lagrange assumes that gravity is the only force acting upon $G$ and from here he infers that $G$ necessarily descends. However, the pendulum is a clear illustration of the fact that the direction of a force acting on a body and the direction of that body’s movement do not necessarily coincide. Moreover, Lagrange does not seem to distinguish between stable, unstable and median equilibrium. To use Helmut Pulte’s example in [Pulte, 1998, p. 168], suppose the weight $G$ is suspended using a rigid rod and that it is placed perpendicularly above the point of suspension. In this case, $G$ would be in a state of unstable equilibrium and any infinitesimal displacement would change the state of the body.

27 See [Pulte, 1998] for a more detailed analysis.

The formula (5.32) is not general enough, since it is applied to a single body. To reach a general formula, Lagrange applies the method of multipliers. In the following paragraphs I will summarize this method, as it is one of Lagrange’s numerous contributions to analytical mechanics. Let $L = 0$, $M = 0$, $N = 0$, . . . be the mathematical equations representing the constraints acting on the system, where $L$, $M$, $N$, . . . are functions of $x$, $y$, $z$, $x'$, $y'$, $z'$. Through differentiation, we get

\[ dL = 0, \qquad dM = 0, \qquad dN = 0, \qquad \dots \tag{5.35} \]

which express the relations between the differentials of the same variables, insofar as these differentials are linear. The equations of (5.35) are used to eliminate the same number of differentials from (5.32). The coefficients of the remaining differentials of (5.32) will have to be 0 in order to have equilibrium. Obviously, the same result can be obtained if we simply multiply each equation in (5.35) with an unknown coefficient such that when all these
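equations are added to (5.32) the result will still be 0:

\[ P\,dp + Q\,dq + R\,dr + \dots + \lambda\,dL + \mu\,dM + \nu\,dN + \dots = 0 \tag{5.36} \]

This is the general equation of motion. The particular equations, i.e. the equations of motion for a particular coordinate, will be obtained by dividing (5.36) by the corresponding infinitesimal:

\[
\begin{aligned}
P\frac{dp}{dx} + Q\frac{dq}{dx} + R\frac{dr}{dx} + \dots + \lambda\frac{dL}{dx} + \mu\frac{dM}{dx} + \nu\frac{dN}{dx} + \dots &= 0 \\
P\frac{dp}{dy} + Q\frac{dq}{dy} + R\frac{dr}{dy} + \dots + \lambda\frac{dL}{dy} + \mu\frac{dM}{dy} + \nu\frac{dN}{dy} + \dots &= 0 \\
P\frac{dp}{dz} + Q\frac{dq}{dz} + R\frac{dr}{dz} + \dots + \lambda\frac{dL}{dz} + \mu\frac{dM}{dz} + \nu\frac{dN}{dz} + \dots &= 0
\end{aligned}
\tag{5.37}
\]

Lagrange regards the terms $\lambda\,dL$, $\mu\,dM$, $\nu\,dN$, . . . as moments of general forces. However, these forces are not $\lambda$, $\mu$, . . . as it might be thought. Lagrange rewrites $dL$ as $dL = dL' + dL'' + dL''' + \dots$, where $dL'$ has only terms containing $dx'$, $dy'$, $dz'$, $dL''$ only terms containing $dx''$, $dy''$, $dz''$, and so on. It follows that we can write

\[ \lambda\,dL = \lambda\,dL' + \lambda\,dL'' + \lambda\,dL''' + \dots \tag{5.38} \]

Each term of (5.38) can be re-written:

\[
\begin{aligned}
\lambda\,dL' &= \lambda\sqrt{\left(\frac{dL'}{dx'}\right)^{2} + \left(\frac{dL'}{dy'}\right)^{2} + \left(\frac{dL'}{dz'}\right)^{2}} \times \frac{dL'}{\sqrt{\left(\frac{dL'}{dx'}\right)^{2} + \left(\frac{dL'}{dy'}\right)^{2} + \left(\frac{dL'}{dz'}\right)^{2}}} \\
\lambda\,dL'' &= \lambda\sqrt{\left(\frac{dL''}{dx''}\right)^{2} + \left(\frac{dL''}{dy''}\right)^{2} + \left(\frac{dL''}{dz''}\right)^{2}} \times \frac{dL''}{\sqrt{\left(\frac{dL''}{dx''}\right)^{2} + \left(\frac{dL''}{dy''}\right)^{2} + \left(\frac{dL''}{dz''}\right)^{2}}} \\
\lambda\,dL''' &= \lambda\sqrt{\left(\frac{dL'''}{dx'''}\right)^{2} + \left(\frac{dL'''}{dy'''}\right)^{2} + \left(\frac{dL'''}{dz'''}\right)^{2}} \times \frac{dL'''}{\sqrt{\left(\frac{dL'''}{dx'''}\right)^{2} + \left(\frac{dL'''}{dy'''}\right)^{2} + \left(\frac{dL'''}{dz'''}\right)^{2}}} \\
&\;\;\vdots
\end{aligned}
\tag{5.39}
\]

In turn, each one of these equations represents the moment of a force applied to a body and perpendicular to a curved surface. For instance, in the first equation, $\lambda\sqrt{\left(\frac{dL'}{dx'}\right)^{2} + \left(\frac{dL'}{dy'}\right)^{2} + \left(\frac{dL'}{dz'}\right)^{2}}$ is the force acting on the body with coordinates $(x', y', z')$ and perpendicular to the curved surface given by the equation $dL' = 0$. This result can be generalized to $\lambda\,dL$, which can be regarded as the “effect of different forces expressed by $\lambda\sqrt{\left(\frac{dL}{dx'}\right)^{2} + \left(\frac{dL}{dy'}\right)^{2} + \left(\frac{dL}{dz'}\right)^{2}}$, $\lambda\sqrt{\left(\frac{dL}{dx''}\right)^{2} + \left(\frac{dL}{dy''}\right)^{2} + \left(\frac{dL}{dz''}\right)^{2}}$, ... and applied to bodies located at coordinates $x'$, $y'$, $z'$, $x''$, $y''$, $z''$, etc.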
following directions perpendicular to different curved surfaces represented by the equation $dL = 0$, by varying first $x'$, $y'$, $z'$, then $x''$, $y''$, $z''$, and so on the others.” [Lagrange, 1853, §8, Part II, Section IV]

Applying these considerations to the general equation (5.36) and to the particular cases of (5.37), these equations come to represent a different system, i.e. a system which includes, apart from the bodies acted upon by the forces $P$, $Q$, $R$, . . . , a second set of bodies with coordinates $(x', y', z')$, $(x'', y'', z'')$, . . . acted upon by the forces $\lambda\sqrt{\left(\frac{dL}{dx'}\right)^{2} + \left(\frac{dL}{dy'}\right)^{2} + \left(\frac{dL}{dz'}\right)^{2}}$, $\lambda\sqrt{\left(\frac{dL}{dx''}\right)^{2} + \left(\frac{dL}{dy''}\right)^{2} + \left(\frac{dL}{dz''}\right)^{2}}$, etc.

Lagrange interprets this result in a peculiar way, which intertwines mathematical concepts with theoretical ones. He regards the product $\lambda\,dL$ as the moment of a force with magnitude $\lambda$ which tends to vary the function $L$ separately towards the coordinates $(x', y', z')$, $(x'', y'', z'')$, . . . This is at odds with Lagrange’s emphasis in Théorie des fonctions analytiques and Leçons sur le calcul des fonctions on the need to eliminate non-mathematical concepts from the calculus. Here he seems to turn a blind eye to the intrusion of the mechanical concept of force into the calculus. His only justification is an analogy with the effect of the forces $P$, $Q$, $R$, etc., which tend to change the lines $p$, $q$, $r$, etc. by either increasing or decreasing them. In a three dimensional coordinate system these lines are represented as functions of three variables ($p(x, y, z)$, $q(x, y, z)$, $r(x, y, z)$), which allows the interpretation of the concept of force as tending to change a function. However, it must be kept in mind that $p(x, y, z)$, $q(x, y, z)$, $r(x, y, z)$, being the equations of straight lines, cannot contain any variable with a power greater than 1, while there is no such constraint on $L$, $M$, $N$, etc. The multipliers $\lambda$, $\mu$, . . . were introduced as purely mathematical entities, yet Lagrange freely interprets them as forces.
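A minimal worked example may help to see how a multiplier acquires the meaning of a force. It is not taken from Lagrange: the single body of weight $W$, the rigid rod of length $l$ attached to the origin, and the choice of $z$ as the height of the body are all assumptions made here for illustration.

```latex
% Assumed system: one body of weight W, held at distance l from the origin
% by a rigid rod. z is the height, so a displacement in the sense of the
% weight is -dz and the moment of the weight is -W dz.
\[ L = x^{2} + y^{2} + z^{2} - l^{2} = 0,
   \qquad dL = 2x\,dx + 2y\,dy + 2z\,dz . \]
% General equation, in the form of (5.36):
\[ -\,W\,dz + \lambda\,(2x\,dx + 2y\,dy + 2z\,dz) = 0 . \]
% Particular equations, in the form of (5.37):
\[ 2\lambda x = 0, \qquad 2\lambda y = 0, \qquad -\,W + 2\lambda z = 0 . \]
% Since W is not zero, \lambda is not zero; hence x = y = 0, z = \pm l,
% and \lambda = W/(2z).
% Magnitude of the force associated with \lambda dL:
\[ |\lambda|\sqrt{\Big(\frac{dL}{dx}\Big)^{2} + \Big(\frac{dL}{dy}\Big)^{2}
   + \Big(\frac{dL}{dz}\Big)^{2}}
   = \frac{W}{2l}\cdot 2l = W . \]
```

Under these assumptions the force is not $\lambda$ itself but $\lambda$ multiplied by the square root, and it equals the weight that the rod must support. The two solutions $z = -l$ and $z = +l$ correspond to the body hanging below the point of suspension and to the body balanced above it.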
In particular, these forces can be viewed as emerging from the internal interactions arising between parts of the system. Therefore, the products $\lambda\,dL$, $\mu\,dM$, $\nu\,dN$, . . . are the moments of these forces, which try to diminish the values of the functions $L$, $M$, $N$. In this double interpretation of $\lambda\,dL$, $\mu\,dM$, $\nu\,dN$, . . . lies the innovation of Lagrange’s method of multipliers. The reason is that it allows the deduction of the conditions of equilibrium even for systems in which not all forces are known, insofar as their action can be represented as an equation. Moreover, this interpretation does not affect the conditions of equilibrium of the system. Even if $\lambda\,dL$, $\mu\,dM$, $\nu\,dN$, . . . are considered simple equations, the relation between the forces of the system, represented by (5.32), will still hold.

Lagrange’s foundations of statics are not as rigorous as he might have desired. Nevertheless, the project of the Méchanique Analitique is to construct an axiomatic mechanics expressed using the mathematical tools of analysis. Although the principles proposed by Lagrange are not as self-evident as their status of axioms requires, this does not mean that Lagrange’s mechanical project is a failure.

Lagrange defines dynamics as “the science of accelerating or decelerating forces and the different movements they must produce.”28 The accelerating force is, in modern terms, the acceleration. The product of mass and accelerating force is what Lagrange calls the elementary force or the nascent force. However, Lagrange gives two interpretations of the product $ma$. On one hand, the product can be viewed as the pressure the body may exert in virtue of its having that velocity. Under this interpretation, the body is viewed as an active element, able to affect other bodies. According to the second interpretation, the product is the motor force required to accelerate the body from rest to the same velocity. Here, the body is a passive element, i.e.
it is the subject of a force, not a cause in itself.

28 “. . . la science des forces accélératrices ou retardatrices, et des mouvements variés qu’elles doivent produire.” [Lagrange, 1853, p. 207]

The basis of Lagrange’s dynamics is d’Alembert’s principle:

If we impress on several bodies movements which they are forced to change because of their mutual action, it is clear that we can regard these movements as being composed of those that the bodies really have, and other movements which are destroyed; from here it follows that the latter must be such that the bodies animated only by these movements will be in equilibrium.29

29 “Si l’on imprime à plusieurs corps des mouvements qu’ils soient forcés de changer à cause de leur action mutuelle, il est clair qu’on peut regarder ces mouvements comme composés de ceux que les corps prendront réellement, et d’autres mouvements qui sont détruits; d’où il suit que ces derniers doivent être tels, que les corps animés de ces seuls mouvements se fassent équilibre.” [Lagrange, 1853, §10 p.23]

The principle, originally presented in d’Alembert’s Traité de dynamique (1743), was used to solve the problem of the centre of oscillation of a double pendulum. According to Lagrange, the movement of such a system would be “somewhere in between” the movement of the two bobs taken as simple pendulums. At this point of his exposition, Lagrange uses very vague and ambiguous language. For instance, he envisages a certain “compensation and repartition” [Lagrange, 1853, p. 218] of the movements of the two bobs, in the sense that the upper bob will tend to increase the frequency of the oscillations, while the lower one will tend to decrease it. Therefore, concludes Lagrange, there has to be a point between the two bobs where a third bob, if placed there, would be neither accelerated nor decelerated by the other two.
The first step towards finding a solution to the problem of the centre of oscillation had been taken, according to Lagrange, by Jacques Bernoulli. Bernoulli considered the simple case of two masses situated on a rigid rod and observed that the upper mass moves slower than it would move if it were not connected to the lower mass, and that the lower mass moves faster than it would move if it were on its own. Bernoulli concludes that there is a “communication” of movement and, since the two bobs are connected by a rigid rod, thus forming a second class lever, this communication will obey the laws of equilibrium applied to the forces acting on the two masses. For Lagrange, Bernoulli was right to try to relate the problem of the centre of oscillation to statics, but he was mistaken in using in his calculations velocities defined on finite rather than infinitesimal time intervals. D’Alembert carried Bernoulli’s idea further and by introducing pda he managed to “reduce all the laws of the movement of bodies to those of their equilibrium, and [to bring] back Dynamics to Statics.”30 However, the pda cannot be used on its own to solve problems of dynamics. It enables the application of the principles of statics to dynamics, but only for internal forces, i.e. forces that appear between the elements of a system but do not affect the movement of the system as a whole.

30 “. . . réduit toutes les lois du mouvement des corps à celles de leur équilibre, et ramène ainsi la Dynamique à la Statique.” [Lagrange, 1853, p. 223]

The foundation of dynamics is a principle Lagrange develops in the second section of the Dynamics. He begins by considering a system of bodies upon which different forces act. From this system, pick out a body $m$ regarded as a point of mass, with coordinates $x$, $y$, $z$ at time $t$, considered relative to three mutually perpendicular axes. The instantaneous velocities along the three axes are represented as $\frac{dx}{dt}$, $\frac{dy}{dt}$, $\frac{dz}{dt}$.
Lagrange adds that due to the action of forces these velocities will be incremented in an infinitely small amount of time ($dt$) by $d\cdot\frac{dx}{dt}$, $d\cdot\frac{dy}{dt}$, $d\cdot\frac{dz}{dt}$. This gives the accelerating forces $\frac{d^{2}x}{dt^{2}}$, $\frac{d^{2}y}{dt^{2}}$, $\frac{d^{2}z}{dt^{2}}$. At this point of his exposition Lagrange makes an important remark:

We can regard these increments [$d\cdot\frac{dx}{dt}$, $d\cdot\frac{dy}{dt}$, $d\cdot\frac{dz}{dt}$] as new velocities given to each body and, by dividing them by $dt$, we will have the measure of the instantaneous accelerating forces used to produce them; because, however variable the action of a force may be, we can always, in virtue of the nature of the differential calculus, regard it as constant during an infinitely small amount of time, and the velocity generated by this force is then proportional to the force multiplied by the time.31

In this paragraph, Lagrange is talking about the measure of accelerating forces. An important point for our present purposes is the relation between the nature of the calculus and the measure of accelerating force. On the basis of this relation we can further calculate the motor force. This is done by considering $dt$ constant relative to the increments of the velocities along the three axes, which means that the accelerating forces will be $\frac{d^{2}x}{dt^{2}}$, $\frac{d^{2}y}{dt^{2}}$, $\frac{d^{2}z}{dt^{2}}$. By multiplying these values with the mass $m$ we obtain the expressions of the forces acting upon the body $m$ in the interval $dt$. Therefore, we can speak about the relation between the nature of the calculus and the measure of motor force. Essentially, this relation is one of constraint of representation, for Lagrange infers from the nature of the calculus that the action of a force (i.e. the increment of the velocity) can be considered constant over an infinitely small time interval, and from here he concludes that for a constant time interval the accelerating forces are represented as $d\cdot\frac{dx}{dt}$, $d\cdot\frac{dy}{dt}$, $d\cdot\frac{dz}{dt}$.
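The claim that a variable force can be regarded as constant over an infinitely small time can be illustrated with a simple case. The linear law of force $a + bt$ and the initial velocity $v_{0}$ are assumptions chosen here for illustration and do not appear in Lagrange.

```latex
% Assumed accelerating force along x: a + b t, with a, b constants and
% initial velocity v_0 (all illustrative symbols).
\[ \frac{d^{2}x}{dt^{2}} = a + bt
   \quad\Longrightarrow\quad
   \frac{dx}{dt} = v_{0} + at + \tfrac{1}{2}\,b\,t^{2}. \]
% Increment of the velocity between t and t + dt:
\[ d\cdot\frac{dx}{dt} = (a + bt)\,dt + \tfrac{1}{2}\,b\,dt^{2}. \]
% Dividing by dt leaves the force at time t plus an infinitely small term:
\[ \frac{1}{dt}\;d\cdot\frac{dx}{dt} = a + bt + \tfrac{1}{2}\,b\,dt . \]
```

The term $\tfrac{1}{2}\,b\,dt$ is the only trace of the variation of the force during $dt$, and it is infinitely small compared with $a + bt$; neglecting it amounts to treating the force as constant over the interval.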
31 “On peut regarder ces accroissements comme de nouvelles vitesses imprimées à chaque corps, et, en les divisant par dt, on aura la mesure des forces accélératrices employées immédiatement à les produire; car, quelque variable que puisse être l’action d’une force, on peut toujours, par la nature du calcul différentiel, la regarder comme constante pendant un temps infiniment petit, et la vitesse engendrée par cette force est alors proportionnelle à la force multipliée par le temps.” [Lagrange, 1853, p. 232, My emphasis]

More precisely, Lagrange justifies the representation of accelerating forces, and in consequence of motor forces, by referring to the nature of the differential calculus. Whether we understand the differential calculus as a special case of the calculus of finite differences, as Euler does, or as the study of fluents and fluxions, as Newton does, it is nevertheless true that a function can be considered constant if the increment of its variable is sufficiently small. An illustration of this essential aspect of the differential calculus is given in fig. 5.3. As $D$ approaches $C$, $d$ will approach $c$. Eventually, the length of $CD$ will be sufficiently small such that $dc$ will be constant.

Figure 5.3: Reduction of a curve to infinitely small polygons. Source: [Newton et al., 1729, p. 45]

Applied to the case of the increments of velocity, this translates as

\[ d\cdot\frac{dx}{dt} = \alpha \tag{5.40} \]

where $\alpha$ is an arbitrary constant. However, the measure of a force is the velocity it is able to produce, which means that if we regard $d\cdot\frac{dx}{dt}$ as a velocity, then it will be equal to the force that produced it multiplied by $dt$, the time it takes to reach the velocity $d\cdot\frac{dx}{dt}$. Thus, this essential character of the differential calculus makes possible, at least in Lagrange’s view, the mathematical representation of the accelerating force.

Lagrange now generalizes this result to all bodies of the system.
That is, the force acting on each body will be the result of the composition of three forces, each parallel to one of the three axes. More importantly, the sum of the momenta of all these forces must equal the sum of the momenta of all forces acting upon the entire system. Consider that the forces $mP$, $mQ$, $mR$, . . . act upon the entire system, where $P$, $Q$, $R$, . . . are accelerating forces. If we apply forces equal in magnitude to $mP$, $mQ$, $mR$, . . . , but in opposite directions, then the system will be in equilibrium. To make matters easier, Lagrange introduces the following notation convention: $d$ denotes the differentials relative to time, $\delta$ represents the virtual velocities. Thus, the moment of the force $m\frac{d^{2}x}{dt^{2}}$ will be $m\frac{d^{2}x}{dt^{2}}\,\delta x$. The sum of all moments will be:

\[ S\left(\frac{d^{2}x}{dt^{2}}\,\delta x + \frac{d^{2}y}{dt^{2}}\,\delta y + \frac{d^{2}z}{dt^{2}}\,\delta z\right) m + S\left(P\,\delta p + Q\,\delta q + R\,\delta r + \dots\right) m = 0 \tag{5.41} \]

where the symbol $S$ stands for the indefinite integral applied to all points of mass.

In the fourth section of the second part of Méchanique Analitique, in §1, Lagrange explains that applying (5.41) to particular cases amounts to reducing the number of variables by using the equations describing the particular system. Here, just as in the method of multipliers, Lagrange considers that each system is necessarily describable by a set of equations. Moreover, at least in the case of the method of multipliers, these equations can be interpreted as expressing relations between forces. This is not different from the function I attributed to the second layer of models,32 the theoretical description. The equations that expressed the relations between the forces acting on the pendulum’s bob have the same function as the equations $L = 0$, $M = 0$, etc. of the method of multipliers, and as the equations used to reduce the number of variables in (5.41).

For Lagrange, the first part of (5.41), $S\left(\frac{d^{2}x}{dt^{2}}\,\delta x + \frac{d^{2}y}{dt^{2}}\,\delta y + \frac{d^{2}z}{dt^{2}}\,\delta z\right) m$, represents the effect of the forces of inertia.
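More precisely, it is the result of the internal interactions between the components of the system. In the case of the double pendulum this expression would represent the effects the two bobs have on each other. The second part of (5.41), $S(P\delta p + Q\delta q + R\delta r + \dots)m$, represents the effects of the active forces, i.e. forces that move the entire system. In the case of the double pendulum, this expression would represent the forces in virtue of which the rod $l_1$ connected to the fulcrum oscillates.

[32] See section 2.5.

Lagrange’s inclination towards generality is illustrated by his attempt to find a single equation which ignores the particularities of the system to which (5.41) is applied and allows the deduction of all variable quantities of the system. To find this equation, Lagrange considers a function $\Phi$ of the variables $x, y, z, \dots, dx, dy, dz, \dots, d^2x, d^2y, d^2z, \dots$ A change of the coordinate system from $xyz\dots$ to $\xi\psi\varphi\dots$ will transform $\Phi$ into a function of $\xi, \psi, \varphi, \dots, d\xi, d\psi, d\varphi, \dots, d^2\xi, d^2\psi, d^2\varphi, \dots$ Next, he differentiates both functions relative to $\delta$:

$$\Phi(x, y, z, \dots, dx, dy, dz, \dots, d^2x, d^2y, d^2z, \dots) = \Phi(\xi, \psi, \varphi, \dots, d\xi, d\psi, d\varphi, \dots, d^2\xi, d^2\psi, d^2\varphi, \dots)$$

$$\begin{aligned}
\delta\Phi &= \frac{\delta\Phi}{\delta x}\delta x + \frac{\delta\Phi}{\delta dx}\delta dx + \frac{\delta\Phi}{\delta d^2x}\delta d^2x + \dots \\
&\quad + \frac{\delta\Phi}{\delta y}\delta y + \frac{\delta\Phi}{\delta dy}\delta dy + \frac{\delta\Phi}{\delta d^2y}\delta d^2y + \dots \\
&\quad + \dots \\
&= \frac{\delta\Phi}{\delta\xi}\delta\xi + \frac{\delta\Phi}{\delta d\xi}\delta d\xi + \frac{\delta\Phi}{\delta d^2\xi}\delta d^2\xi + \dots \\
&\quad + \frac{\delta\Phi}{\delta\psi}\delta\psi + \frac{\delta\Phi}{\delta d\psi}\delta d\psi + \frac{\delta\Phi}{\delta d^2\psi}\delta d^2\psi + \dots \\
&\quad + \frac{\delta\Phi}{\delta\varphi}\delta\varphi + \frac{\delta\Phi}{\delta d\varphi}\delta d\varphi + \frac{\delta\Phi}{\delta d^2\varphi}\delta d^2\varphi + \dots
\end{aligned} \tag{5.42}$$

Given the relation $d^m\delta Fy = \delta d^m Fy$ (5.17), Lagrange replaces $\delta d$ and $\delta d^2$ with $d\delta$ and $d^2\delta$, respectively, which allows him to integrate (5.42). The result has the form

$$\int(A\delta x + B\delta y + C\delta z + \dots) + Z = \int(A'\delta\xi + B'\delta\psi + C'\delta\varphi + \dots) + Z' \tag{5.43}$$

where the coefficients $A, B, C, A', B', C', \dots$ and the terms $Z$, $Z'$ are calculated from (5.42):

$$\begin{aligned}
A &= \frac{\delta\Phi}{\delta x} - d\,\frac{\delta\Phi}{\delta dx} + d^2\frac{\delta\Phi}{\delta d^2x} - \dots \\
B &= \frac{\delta\Phi}{\delta y} - d\,\frac{\delta\Phi}{\delta dy} + d^2\frac{\delta\Phi}{\delta d^2y} - \dots \\
C &= \frac{\delta\Phi}{\delta z} - d\,\frac{\delta\Phi}{\delta dz} + d^2\frac{\delta\Phi}{\delta d^2z} - \dots \\
A' &= \frac{\delta\Phi}{\delta\xi} - d\,\frac{\delta\Phi}{\delta d\xi} + d^2\frac{\delta\Phi}{\delta d^2\xi} - \dots \\
B' &= \frac{\delta\Phi}{\delta\psi} - d\,\frac{\delta\Phi}{\delta d\psi} + d^2\frac{\delta\Phi}{\delta d^2\psi} - \dots \\
C' &= \frac{\delta\Phi}{\delta\varphi} - d\,\frac{\delta\Phi}{\delta d\varphi} + d^2\frac{\delta\Phi}{\delta d^2\varphi} - \dots \\
Z &= \left(\frac{\delta\Phi}{\delta dx} - d\,\frac{\delta\Phi}{\delta d^2x} + \dots\right)\delta x + \frac{\delta\Phi}{\delta d^2x}\,d\delta x + \dots \\
&\quad + \left(\frac{\delta\Phi}{\delta dy} - d\,\frac{\delta\Phi}{\delta d^2y} + \dots\right)\delta y + \frac{\delta\Phi}{\delta d^2y}\,d\delta y + \dots \\
&\quad + \left(\frac{\delta\Phi}{\delta dz} - d\,\frac{\delta\Phi}{\delta d^2z} + \dots\right)\delta z + \frac{\delta\Phi}{\delta d^2z}\,d\delta z + \dots \\
Z' &= \left(\frac{\delta\Phi}{\delta d\xi} - d\,\frac{\delta\Phi}{\delta d^2\xi} + \dots\right)\delta\xi + \frac{\delta\Phi}{\delta d^2\xi}\,d\delta\xi + \dots \\
&\quad + \left(\frac{\delta\Phi}{\delta d\psi} - d\,\frac{\delta\Phi}{\delta d^2\psi} + \dots\right)\delta\psi + \frac{\delta\Phi}{\delta d^2\psi}\,d\delta\psi + \dots
\end{aligned} \tag{5.44}$$

Differentiating (5.43) Lagrange obtains:

$$A\delta x + B\delta y + C\delta z + \dots - A'\delta\xi - B'\delta\psi - C'\delta\varphi - \dots = dZ' - dZ \tag{5.45}$$

(5.45) should hold for any variation $\delta$, since no particular value of $\delta$ was assumed in deducing it. The right side of the equality is an exact differential of $Z$ and $Z'$, while the left side is independent of $d$. The sides cannot be equal unless they are both zero. Therefore,

$$A\delta x + B\delta y + C\delta z + \dots = A'\delta\xi + B'\delta\psi + C'\delta\varphi + \dots \tag{5.46}$$

$$dZ = dZ' \tag{5.47}$$

In order to obtain $S\left(\frac{d^2x}{dt^2}\delta x + \frac{d^2y}{dt^2}\delta y + \frac{d^2z}{dt^2}\delta z\right)m$, Lagrange considers $\Phi = \frac{1}{2}(dx^2 + dy^2 + dz^2)$, for which

$$\frac{\delta\Phi}{\delta x} = 0, \quad \frac{\delta\Phi}{\delta dx} = dx, \quad \frac{\delta\Phi}{\delta d^2x} = 0, \qquad A = -d^2x, \quad B = -d^2y, \quad C = -d^2z \tag{5.48}$$

Since this $\Phi$ contains only first-order differentials, the expressions for $A', B', C', \dots$ will likewise stop at the terms in the first-order differentials:

$$A' = \frac{\delta\Phi}{\delta\xi} - d\,\frac{\delta\Phi}{\delta d\xi}, \qquad B' = \frac{\delta\Phi}{\delta\psi} - d\,\frac{\delta\Phi}{\delta d\psi}, \qquad C' = \frac{\delta\Phi}{\delta\varphi} - d\,\frac{\delta\Phi}{\delta d\varphi} \tag{5.49}$$

Replacing these values in (5.46) we obtain:

$$-d^2x\,\delta x - d^2y\,\delta y - d^2z\,\delta z = \left(\frac{\delta\Phi}{\delta\xi} - d\,\frac{\delta\Phi}{\delta d\xi}\right)\delta\xi + \left(\frac{\delta\Phi}{\delta\psi} - d\,\frac{\delta\Phi}{\delta d\psi}\right)\delta\psi + \left(\frac{\delta\Phi}{\delta\varphi} - d\,\frac{\delta\Phi}{\delta d\varphi}\right)\delta\varphi + \dots \tag{5.50}$$

The first term of (5.41), $S\left(\frac{d^2x}{dt^2}\delta x + \frac{d^2y}{dt^2}\delta y + \frac{d^2z}{dt^2}\delta z\right)m$, is thus derived from $S\frac{\Phi}{dt^2}m$, i.e. from $S\left(\frac{dx^2 + dy^2 + dz^2}{2dt^2}\right)m$. Since Lagrange is trying to generalize (5.41), he must express it in a form independent of the initial choice of the system of coordinates. Therefore, he must show that both terms of (5.41) can be expressed in terms of ξ, ψ, ϕ, ...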
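As an illustration that is not taken from Lagrange’s text, the step from (5.46) to (5.50) can be checked for a point moving in a plane. The sketch below assumes, hypothetically, that the new coordinates are the polar coordinates $\xi = r$, $\psi = \theta$, with $x = r\cos\theta$ and $y = r\sin\theta$; the choice of coordinates and the restriction to two dimensions are assumptions made only for this example.

```latex
\begin{align*}
\Phi &= \tfrac{1}{2}\,(dx^2 + dy^2) = \tfrac{1}{2}\,(dr^2 + r^2\,d\theta^2),\\
A' &= \frac{\delta\Phi}{\delta r} - d\,\frac{\delta\Phi}{\delta dr}
    = r\,d\theta^2 - d^2 r,\\
B' &= \frac{\delta\Phi}{\delta\theta} - d\,\frac{\delta\Phi}{\delta d\theta}
    = -\,d\,(r^2\,d\theta) = -\,(2r\,dr\,d\theta + r^2\,d^2\theta),\\
-\,d^2x\,\delta x - d^2y\,\delta y
   &= (r\,d\theta^2 - d^2 r)\,\delta r - (2r\,dr\,d\theta + r^2\,d^2\theta)\,\delta\theta .
\end{align*}
```

The last line can be confirmed independently by differentiating $x = r\cos\theta$ and $y = r\sin\theta$ twice and substituting $\delta x = \cos\theta\,\delta r - r\sin\theta\,\delta\theta$, $\delta y = \sin\theta\,\delta r + r\cos\theta\,\delta\theta$; the mixed terms cancel, which is what (5.46) leads one to expect.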
Lagrange defines the function $T$ as the expression of $S\left(\frac{dx^2 + dy^2 + dz^2}{2dt^2}\right)m$ in terms of ξ, ψ, ϕ, ... He then finds it obvious that

$$S\left(\frac{d^2x}{dt^2}\delta x + \frac{d^2y}{dt^2}\delta y + \frac{d^2z}{dt^2}\delta z\right)m = \left(d\cdot\frac{\delta T}{\delta d\xi} - \frac{\delta T}{\delta\xi}\right)\delta\xi + \left(d\cdot\frac{\delta T}{\delta d\psi} - \frac{\delta T}{\delta\psi}\right)\delta\psi + \left(d\cdot\frac{\delta T}{\delta d\varphi} - \frac{\delta T}{\delta\varphi}\right)\delta\varphi + \dots \tag{5.51}$$

Turning to the second term of (5.41), expressing it in terms of ξ, ψ, ϕ, ... is easier, since it amounts to translating the equations of the lines $p, q, r, \dots$ and the forces into a different coordinate system (ξψϕ...). Lagrange’s argument goes as follows:

$$d\Pi = P\,dp + Q\,dq + R\,dr + \dots \tag{5.52}$$

$$\delta\Pi = P\,\delta p + Q\,\delta q + R\,\delta r + \dots \tag{5.53}$$

$$S(P\,\delta p + Q\,\delta q + R\,\delta r + \dots)\,m = S\,\delta\Pi\,m \tag{5.54}$$

$$= \delta S\,\Pi\,m \tag{5.55}$$

(5.54) is obtained from (5.53) by multiplying the latter by $m$ and then summing the result over the masses of all the bodies in the system. Lagrange defines the function $V = S\,\Pi\,m$. This entails that

$$\delta S\,\Pi\,m = \delta V = \frac{\delta V}{\delta\xi}\delta\xi + \frac{\delta V}{\delta\psi}\delta\psi + \frac{\delta V}{\delta\varphi}\delta\varphi + \dots \tag{5.56}$$

Adding up (5.51) and (5.56) Lagrange obtains the general formula of dynamics

$$\Xi\,\delta\xi + \Psi\,\delta\psi + \Phi\,\delta\varphi + \dots = 0 \tag{5.57}$$

where

$$\Xi = d\cdot\frac{\delta T}{\delta d\xi} - \frac{\delta T}{\delta\xi} + \frac{\delta V}{\delta\xi}, \qquad \Psi = d\cdot\frac{\delta T}{\delta d\psi} - \frac{\delta T}{\delta\psi} + \frac{\delta V}{\delta\psi}, \qquad \Phi = d\cdot\frac{\delta T}{\delta d\varphi} - \frac{\delta T}{\delta\varphi} + \frac{\delta V}{\delta\varphi} \tag{5.58}$$

It might be tempting to think of $T$ and $V$ as kinetic and potential energy. However, this would be a misunderstanding of Lagrange’s position. For him, both $T$ and $V$ were nothing but functions in which the variables represented the coordinates of the bodies.

Using (5.57) has two advantages. First, there is no need to individuate the forces acting on every body of the system. We observed this in section 2.3, where the same procedure was used to find the equations of motion of the double pendulum. However, in that case, the theoretical description of the system employed the concept of energy. An implicit assumption in using the Lagrangian in 2.3 was that the energy of the system is conserved. For Lagrange, on the other hand, the conservation of energy is a consequence of the fundamental principle of dynamics (5.57).
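To make the use of (5.57) and (5.58) concrete, here is a minimal sketch for a simple pendulum described by the single coordinate ξ = θ. It reads $T$ and $V$ in the modern way of section 2.3, which, as noted above, is not Lagrange’s own reading. The parameter values, the sign convention for `V`, the function names and the Runge-Kutta integrator are all assumptions introduced for this example. The sketch writes out the coefficient Ξ by hand, solves Ξ = 0 for the angular acceleration, and then checks numerically that $T + V$ stays constant along the resulting motion, which illustrates the claim that conservation follows from (5.57).

```python
import math

# Hypothetical parameters, chosen only for this illustration.
M = 1.0    # mass of the bob
L = 2.0    # length of the rod
G = 9.81   # gravitational acceleration


def T(theta, omega):
    """T written in the single coordinate theta (omega = d theta / dt)."""
    return 0.5 * M * L**2 * omega**2


def V(theta):
    """V written in theta; the sign convention (zero at pivot height) is assumed."""
    return -M * G * L * math.cos(theta)


def xi(theta, omega, alpha):
    """Coefficient Xi of (5.58) for this T and V, with the partials taken by hand.

    alpha is the second time derivative of theta.
    """
    d_dt_dT_domega = M * L**2 * alpha        # time derivative of dT/d(omega) = M L^2 omega
    dT_dtheta = 0.0                          # T does not depend on theta
    dV_dtheta = M * G * L * math.sin(theta)  # derivative of V with respect to theta
    return d_dt_dT_domega - dT_dtheta + dV_dtheta


def alpha_from_xi_zero(theta):
    """Solve Xi = 0 for the angular acceleration."""
    return -(G / L) * math.sin(theta)


def rk4_step(theta, omega, dt):
    """One classical Runge-Kutta step for the motion defined by Xi = 0."""
    k1t, k1o = omega, alpha_from_xi_zero(theta)
    k2t, k2o = omega + 0.5 * dt * k1o, alpha_from_xi_zero(theta + 0.5 * dt * k1t)
    k3t, k3o = omega + 0.5 * dt * k2o, alpha_from_xi_zero(theta + 0.5 * dt * k2t)
    k4t, k4o = omega + dt * k3o, alpha_from_xi_zero(theta + dt * k3t)
    theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
    omega += dt * (k1o + 2 * k2o + 2 * k3o + k4o) / 6
    return theta, omega


def energy_drift(theta0=0.5, omega0=0.0, dt=1e-3, steps=2000):
    """Integrate Xi = 0 and return the change in T + V along the trajectory."""
    theta, omega = theta0, omega0
    e0 = T(theta, omega) + V(theta)
    for _ in range(steps):
        theta, omega = rk4_step(theta, omega, dt)
    return T(theta, omega) + V(theta) - e0
```

Calling `energy_drift()` returns the difference between the final and initial values of $T + V$; with the step size assumed here it should be negligibly small. No force acting on the bob is ever individuated in the code: only `T`, `V` and the coordinate θ appear, which is the first advantage described above.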
The second advantage is that we can choose the coordinate system that best suits our purposes. As long as we can express the relation between the system xyz and the system of our choice as a function, we can always obtain equations of the form (5.58), in which the only variables are the coordinates of the new system.

This chapter provides an overview of Lagrange’s work on the calculus and mechanics. Concerning the calculus, we can say that Lagrange’s works show traces of instrumentalism. This is suggested by his changing view of the foundations of the calculus throughout his career. In the Méchanique Analitique he uses infinitesimals without questioning their accuracy. However, in Théorie des fonctions analytiques and Leçons sur le calcul des fonctions he approaches foundational problems with the intention of providing a solid basis for the construction of the methods of the calculus. This does not mean that he excluded infinitesimals altogether; quite the opposite. Lagrange justified the use of infinitesimals by showing that their results are accurate, despite the conceptual issues they raise.

There are a couple of things to remember about Lagrange’s work. First, he attempts to construct an axiomatic mechanics, based on the principle of the lever (pl), the principle of composition of forces (pcf) and the principle of virtual velocities (pvv). Since the latter is too counter-intuitive to be considered a self-evident truth, Lagrange tries to justify it by applying the principle of the pulleys (pp), which is more an illustration of the concept of equilibrium than a principle. The relation between statics and dynamics is given by d’Alembert’s principle (pda), which does not play a deductive role, i.e. it is not a premise which, taken together with the principles of statics, allows the deduction of the principles of dynamics. Instead, the pda is a methodological principle which justifies the elimination of internal forces, i.e.
forces arising from the interaction between parts of the system. Second, Lagrange is the first mathematician who treated the foundations of mechanics as a subject of inquiry in its own right, rather than a didactic requirement. In his early works, he used infinitesimals without taking into consideration their lack of precision. However, in Théorie des fonctions analytiques and Leçons sur le calcul des fonctions he constructs the differential calculus as a calculus of functions rather than of ratios of infinitesimals, and proves that the results of the calculus of differentials are sound whether it uses infinitesimals or derivatives.

Chapter 6
Conclusions

In the previous pages I suggested a theory of scientific models as representations, according to which a model has three components: an idealized system, a theoretical description of this system and the corresponding mathematical formalisms. The idealized system is defined by a series of sentences expressed in natural language which identify the components of the system and the relations between them. The system is idealized because there need be no perfect real-world correspondent to it, e.g. there are no frictionless pendulums. The second component is the theoretical description, a set of mathematical equations which express the relations between the theoretical entities of the system, e.g. force, kinetic or potential energy, wave amplitudes, and so on. Finally, the mathematical component is the set of equations and their solutions which represent the evolution in time of the relevant properties of the idealized system.

I supported this suggestion by analyzing the algorithms used for deducing the equations of motion of different kinds of pendulums, and tried to answer the following question: How do the three components interact?
Here I introduced the term “bridge principle”, loosely defined in 2.5 as “any assertion, whether justified or simply postulated, which establishes a correspondence between the elements of one layer and those of another”[1]. Obviously, this definition is too vague to be included within a theory of models. This is why I dedicated chapters 3-5 to an in-depth analysis of the separation between the three layers (mathematical, theoretical, idealized description), which largely took place in the 18th century. This analysis aimed at finding within the history of 18th-century science a set of principles that could be considered bridge principles. I admit that the suggested theory of models needs a stronger defense than the one given in chapter 2. However, since the problem of bridging the gaps between layers is the most difficult challenge my proposed theory of models has to face, solving it takes precedence over a more thorough development of the theory.

In chapter 3 I presented the status of 18th-century science, focusing on its conceptual entanglement. During this period, different scientific traditions, which often overlapped, competed against each other, usually over the precedence of their foundational principles. More importantly, although the domains of study were delineated (e.g. mechanics was the study of rest and motion, geometry the study of curves, etc.), at the theoretical layer different sciences used concepts unrelated to their field of study. A case in point is Newton’s theory of fluxions, where he used the non-mathematical concept of velocity to define the rate of variation of a quantity. This problem was ignored up until Lagrange’s Théorie des fonctions analytiques and Leçons sur le calcul des fonctions, which focused on the delineation of sciences at a theoretical level.

During this period, mechanics, although a mathematized science, still relied on geometry to perform its calculations and predictions.
This meant that geometrical proofs could become cumbersome for more complex mechanical problems, which involved several bodies, and that these proofs could not be extended to solve similar problems. The introduction of functions in Euler’s Introductio in analysin infinitorum changed this. Instead of drawing curves and representing forces as lines, solutions to mechanical problems now involved the arithmetical manipulation of symbols. However, explaining how geometry can be used to represent empirical phenomena is easier than explaining how functions, differentials or integrals perform the same task. Using geometry, trajectories are represented by curves, forces by lines, with the magnitude of the force proportional to the line’s length, etc. Then we determine the relations between these geometrical objects (e.g. proportionality). This approach is more intuitive than the manipulation of symbols required by analytical solutions to mechanical problems.

[1] See p. 28.

Therefore, in order to understand how and why functions came into use in science, I looked at Euler’s work on the topic. As it turns out, the 18th-century concept of function, as well as those of variable and constant, are very different from the modern ones. There are two differences between the older and the more recent concepts of function. The first is conceptual. Euler regarded functions as analytical expressions of real relations between quantities, while modern mathematics regards functions as rules of correspondence between sets. The second difference, derived from the first, is that the modern conception does not admit exceptional values of its variables, and is therefore more precise. Eulerian variables were “essences” of quantities, which meant that they were at the same time complex, rational and irrational numbers, regardless of the fact that the function might not be defined for particular values of its variables.
The history of the concept of functional relation, briefly discussed in 4.1, shows that the concept of analytical function is the offspring of geometry. To make things clearer, I used the distinction drawn in [Ferraro, 2000] between functional relations and functional expressions. A functional relation can be expressed both as a geometrical curve (or as a surface or body, if there are several variables) and as an analytical expression. This distinction is presupposed by the graphical representation of functions in modern mathematics. For instance, the second power of a quantity can be expressed both as $f(x) = x^2$ and as a parabola. Therefore, the representational power of functions can be considered a “trait” inherited from geometrical representations of functional relations between quantities.

Geometry expressed relations between quantities as curves in a system of coordinates. This is how velocity and acceleration were represented. However, the coordinates of a point are numbers, which means that we can treat coordinates as analytic variables. In turn, this implies that the function represents the curve in analytical form. Since geometry was used to represent mechanical quantities (forces, velocities, accelerations, etc.), the introduction of functions to represent geometrical curves shifted this burden from geometry to functions and related concepts. The historical sequence can be reduced to: (1) empirical phenomenon → (2) idealization → (3) theoretical description → (4) geometrical representation → (5) functional representation, where “→” stands for “is represented by”. Eventually, the fourth step turned out to be redundant.

In section 3.2.1 I presented Newton’s view on the relation between mechanics and geometry to support my argument concerning the representational role of geometry, i.e. to support the subsequence (1)-(4).
In support of the subsequence (4)-(5) I presented in chapters 3 and 4 a study of the development of the calculus as a means to solve the geometrical problems raised by applying geometry to mechanical problems. Finally, in section 5.1, I showed how Lagrange’s criticism of his predecessors cut the final ties that bound the calculus to geometry.

This holds not only for the concept of function, but also for that of infinitesimal. Infinitely small quantities, or differentials, also had their origins in geometry, as illustrated by Newton’s method of fluxions and ultimate ratios (see section 3.2.1). Euler tried to avoid the problems raised in [Berkeley, 2009] and introduced geometric ratios. Although at first sight geometric ratios are geometric only in name, the underlying intuition is geometric. Euler re-interpreted Newton’s generation of curves by the movement of a point in time as the generation of a series in terms of infinitely small finite increments. Lagrange’s Théorie des fonctions analytiques and Leçons sur le calcul des fonctions cut this tie too.

The connection between geometry and calculus mentioned above is most visible in the second volume of Mechanica, where Euler applies the mechanical concepts developed in the first volume, together with his view on infinitesimals, to some mechanical problems. His solutions begin with a graphical representation of the trajectory and of the forces acting upon an infinitely small body. Here, the forces are represented by lines, and the bodies by points on the curves representing their trajectory. The next step of his solutions is to give a description of the system from a geometrical perspective, i.e. as relations between curves. Infinitesimals are introduced only in the last part to derive the final result. The main characteristic of Euler’s solutions is that they are not applications of the calculus to a problem of mechanics, but to a problem of geometry.
It might be thought that in the second volume of Mechanica geometry plays the role of a redundant middleman. This is not the case, and in the next paragraphs I will explain why. At the basis of Euler’s mechanics lie the concepts of rest and motion. The reason they are given this foundational role lies in Euler’s conception of the nature of bodies (see section 4.2). The essential property of bodies is impenetrability, from which it follows that bodies necessarily have the properties of mobility, persistence and extent. Motion and rest are foundational concepts of mechanics because the disposition to be moved by a force and the disposition to maintain a state of rest are essential properties of bodies (mobility and persistence). Moreover, in virtue of having the property of extent, bodies are also infinitely divisible (see 4.2 for the argument).

Can we now say what fulfills the role of bridge principle between the mathematical and the theoretical in Euler’s work? My answer is yes, but we have to settle for something less rigorous than the term “principle” might lead us to expect. Euler’s conception of bodies as infinitely divisible justifies his treating bodies as analogous to points on a curve. This is illustrated by the solutions given in the second volume of the Mechanica, and is justified by the classical definition of points as “things” without parts. As argued in 4.2, for Euler bodies too are not composed of finite parts or infinitely small parts. So a first bridge principle would be: An infinitely small body can be considered a geometrical point. This would connect the theoretical layer to the idealized system. We must observe three things. First, the principle is justified by the fact that bodies and points share some essential properties. Second, despite this similarity, infinitely small bodies and points differ in kind. The difference is given by the fact that bodies, unlike points, are impenetrable. Third, the principle is postulated.
Since infinitesimal bodies and points differ in kind, they cannot be substituted for one another. In order to represent an infinitely small body geometrically as a point, an intentional act must occur, through which a cognitive agent decides to use points to represent bodies. This decision is justified by the similarity between the two kinds of entities.

The geometric point is also similar to infinitely small (numerical) quantities. Just as a point has no length, an infinitesimal is nothing (see 4.1.2), i.e. it is 0. Also, the infinitesimal does not have any parts, although it too can be divided ad infinitum. This means that we can formulate a second bridge principle: A geometrical point can be considered an infinitely small (numerical) quantity. The three observations made above apply here too. The principle is introduced by an intentional act of a cognitive agent, justified by the resemblance between the two kinds of entities.

Do we find something similar in Lagrange? We saw that in Méchanique Analitique he tried to construct an axiomatic mechanics. The main axiom of this system is the pvv, which he considers the foundation of statics. Dynamics is connected with statics via pda. However, this connection is not logical. Dynamics is not deduced from statics, and the pda is not an axiom of dynamics in the same way that pvv is an axiom of statics. The pda is a methodological principle which justifies the use of the laws and principles of statics. It cannot have the same axiomatic status as pvv because it concerns only the internal interactions between the bodies composing a system, and not the movement of the entire system. However, as Jacobi argued [Pulte, 1998], Lagrange’s mechanics is not a proper axiomatic system for two reasons. First, its fundamental axiom (pvv) is not self-evident. Second, pda does not allow the deduction of dynamics from statics, as it should if mechanics were an axiomatic system.
Nevertheless, I think that although these principles could not serve as axioms, they play the role of bridge principles that connect the three layers of models. Thus, the principle of the lever (pl) connects the first two components of a model, i.e. the idealized system and the theoretical description. As observed in chapter 4 and repeated above, for Euler the cause of a body’s change of state is necessarily external. A body cannot go from rest to motion or vice versa on its own, but only if a force acts upon it. If a body is at rest, then either there are no forces acting upon it, or the forces that do act cancel each other out. Since for Lagrange the aim of statics is to provide the laws by which the forces of a system cancel each other out, only the second case can be considered interesting for statics. However, the equilibrium of a single body is explained by pcf. If $P$, $Q$, $R$ act upon a body simultaneously, then by adding $P$ and $Q$ and then adding the result to $R$, the result should be null.

The principle of the lever applies to systems of several interacting bodies, i.e. systems in which the forces acting upon one component body can be expressed as functions of the forces acting upon the other components. It would be impossible to apply pcf in this case, since it requires that the forces have the same origin. It follows that the simplest system is one containing only two bodies. However, two bodies can be connected in several ways, some of which might be more complex than the lever, while others might not be of much use. For instance, if the two bodies, having different masses, are tied to the two ends of a rope which passes over a pulley, then there would be no way of equilibrating the system. If their masses are equal, then the system will be in equilibrium, even though one of the bodies is higher than the other. By contrast, the lever allows us to “determine” the condition of equilibrium when charged with two weights of different masses ($\frac{P}{Q} = -\frac{p}{q}$).
The mathematical expression of the condition of equilibrium of a lever is then applied to three-body systems, four-body systems, and so on, after reducing the number of forces by applying the pcf. Therefore, we can consider the pl a paradigmatic example of describing an idealized system in theoretical terms.

How does it connect the first two layers of a model? A given idealized system can be divided into groups of two interacting bodies. Each pair of bodies is described in theoretical terms, following the example set by pl. In the case of statics at least, the theoretical description of the pair has the form $P\,dp + Q\,dq = 0$, as shown in Lagrange’s justification of pvv. Here I am not presenting an algorithm for constructing the theoretical description of an idealized system, but the way in which the theoretical description of any idealized system is justified. In this sense, the pl, together with pcf, are bridge principles connecting the first two layers of a model, the theoretical description and the idealized system.

What connects the mathematical layer to the theoretical description? It depends. In statics, the pvv seems to have this function, while in dynamics it is the pvv in conjunction with pda, which makes the application of pvv possible. In Lagrange’s case, it is much harder to identify a bridge principle between the mathematical and the theoretical layer, mainly because he tries to eliminate forces from mechanics completely. Nevertheless, by looking at his demonstration of the general principle of dynamics, we can see that an argument similar to the one made concerning the other bridge principle can also be made here. Lagrange begins by considering the simplest case, that of a single body acted upon by forces, which he then generalizes by adding the moments of all the forces that act on all masses, i.e. $S\left(\frac{d^2x}{dt^2}\delta x + \frac{d^2y}{dt^2}\delta y + \frac{d^2z}{dt^2}\delta z\right)m$. This is guaranteed by pda.
In the previous chapter I noted that the Méchanique Analitique is completely anti-metaphysical, refusing to engage in any debates on the nature of mass, force, body, etc., and this is one point on which Lagrange and Euler differ greatly. Euler tried to build mechanics on a solid basis of a priori principles, and to a certain extent he succeeded. Lagrange, on the other hand, tries to construct an axiomatic system beginning with a few intuitive principles (pp, pl), which can be grasped from experience.[2]

The conclusion is that the bridge principles connecting the parts of a model depend on the nature of the mathematical concepts involved, as illustrated by the part played by infinitesimals both in Euler’s and in Lagrange’s mechanics. Also, these principles depend on the overall theory in which we are working. Although both Euler and Lagrange can be said to have been working within the confines of Newtonian mechanics, their views on the nature of mechanics and the calculus shaped the way in which the two domains combine. For these reasons it would be difficult, perhaps impossible, to provide a definition of bridge principles which would cover any possible case. I would suggest that the three-layered structure of models be viewed more as a framework for the historical analysis of models than as a theory of models. The identification of the different layers of models and the ways in which these layers have been connected could expand our understanding of the interactions between the development of mathematics and that of scientific theories.

[2] What are levers and pulleys if not the most common items on an 18th-century dock?

Bibliography

Apostol, T. (1981). Mathematical Analysis. Addison-Wesley, Reading, Massachusetts.
Bailer-Jones, D. M. (2003). When Scientific Models Represent. International Studies in the Philosophy of Science, 17(1).
Bell, E. T. (1992). The Development of Mathematics.
Berkeley, G. (2009). The Analyst Or, a Discourse. BiblioBazaar.
Bos, H. J. M. (1974). Differentials, higher-order differentials and the derivative in the Leibnizian calculus. Archive for History of Exact Sciences, 14(1):1–90.
Boyer, C. B. (1949). The history of the calculus and its conceptual development. Courier Dover Publications, New York.
Bueno, O. (2000). Empiricism, scientific change and mathematical change. Studies in History and Philosophy of Science Part A, 31.
Carnot, L. and Browell, W. (1832). Reflexions on the metaphysical principles of the infinitesimal analysis. J.H. Parker.
Cauchy, A.-L. (1882-1974). Oeuvres complètes d’Augustin Cauchy. Série 2, tome 2. Gauthier-Villars et fils.
Euler, L. (1736). Mechanica sive motus scientia analytice exposita. In Opera Omnia, volume II.
Euler, L. (1740). De infinitis curvis eiusdem generis seu methodus inveniendi aequationes pro infinitis curvis eiusdem generis. Commentarii academiae scientiarum Petropolitanae, 7.
Euler, L. (1743). Methodus inveniendi lineas curvas maximi minimive proprietate gaudentes, sive Solutio problematis isoperimetrici latissimo sensu accepti. Marcum-Michaelem Bousquet & socios, Lausanne.
Euler, L. (1862). Anleitung zur Naturlehre. In Opera Omnia, volume III.
Euler, L. (1990). Introduction to analysis of the infinite. Number v. 1 in Introduction to Analysis of the Infinite. Springer-Verlag.
Euler, L. (2000). Foundations of differential calculus. Springer.
Euler, L., Hewlett, J., Horner, F., Bernoulli, J., and Lagrange, J. (1822). Elements of algebra. Printed for Longman, Orme.
Ferraro, G. (2000). Functions, Functional Relations, and the Laws of Continuity in Euler. Historia Mathematica, 27(2):107–132.
Ferraro, G. (2004). Differentials and differential coefficients in the Eulerian foundations of the calculus. Historia Mathematica, 31.
Fraser, C. (1985). J. L. Lagrange’s changing approach to the foundations of the calculus of variations. Archive for History of Exact Sciences, 32(2):151–191.
Fraser, C. (2008). Mathematics, volume 4. Cambridge University Press, Cambridge.
Fraser, C. G. (1987). Joseph Louis Lagrange’s algebraic vision of the calculus. Historia Mathematica, 14(1):38–53.
Giere, R. (2004). How Models Are Used to Represent Reality. Philosophy of Science, 71.
Grabiner, J. (2005). The origins of Cauchy’s rigorous calculus. Dover Publishing, New York.
Grattan-Guinness, I. (1990). The varieties of mechanics by 1800. Historia Mathematica, 17(4):313–338.
Hepburn, B. (2007). Equilibrium and explanation in 18th century mechanics. PhD thesis, University of Pittsburgh.
Home, R. (2008). Mechanics and experimental physics. In Porter, R., editor, The Cambridge History of Science, volume 4. Cambridge University Press, Cambridge.
Hughes, R. (1997). Models and representation. Philosophy of Science, 64.
Kitcher, P. (2007). Fluxions, Limits, and Infinite Littlenesse. A Study of Newton’s Presentation of the Calculus. Isis, 64(1):33–49.
Kline, M. (1990). Mathematical thought from ancient to modern times, volume 2. Oxford University Press.
Lagrange, J. (1797). Théorie des fonctions analytiques: contenant les principes du calcul différentiel, dégagés de toute considération d’infiniment petits ou d’évanouissans, de limites ou de fluxions, et réduits à l’analyse algébrique des quantités finies. Impr. de la République.
Lagrange, J. (1806). Leçons sur le calcul des fonctions. Courcier.
Lagrange, J. (1867a). Application de la méthode exposée dans le mémoire précédent à la solution de différents problèmes de dynamique. In Miscellanea, volume 1 of Oeuvres de Lagrange. Gauthier Villars.
Lagrange, J. (1867b). Essai d’une nouvelle méthode pour déterminer les maxima et les minima des formules intégrales indéfinies. In Miscellanea Taurinensia, volume 1 of Oeuvres de Lagrange. Gauthier Villars.
Lagrange, J.-L. (1853). Mécanique Analytique. Mallet-Bachelier, Paris.
Morgan, M. and Morrison, M. (1999). Models as mediators: perspectives on natural and social science. Ideas in context. Cambridge University Press.
Newton, I. (1736). The method of fluxions. London.
Newton, I., Motte, A., and Machin, J. (1729). The mathematical principles of natural philosophy. Number v. 1 in The Mathematical Principles of Natural Philosophy. Printed for B. Motte.
Newton, I. and Stewart, J. (1745). Sir Isaac Newton’s Two treatises: Of the quadrature of curves, and Analysis by equations of an infinite number of terms, explained: containing the treatises themselves, translated into English, with a large commentary; in which the demonstrations are supplied where wanting, the doctrine illustrated, and the whole accommodated to the capacities of beginners, for whom it is chiefly designed. Printed by James Bettenham, at the expense of the Society for the encouragement of learning; and sold by John Nourse and John Whiston.
Pulte, H. (1998). Jacobi’s Criticism of Lagrange: The Changing Role of Mathematics in the Foundations of Classical Mechanics. Historia Mathematica, 25(2):154–184.
Suarez, M. (2004). An inferential conception of scientific representation. Philosophy of Science, 71:767–779.
Suisky, D. (2009). Euler as Physicist. Springer-Verlag, Berlin Heidelberg.
Taylor, A. (1983). Advanced Calculus. John Wiley and Sons.
Van Fraassen, B. (2008). Scientific Representation: Paradoxes of Perspective. Oxford University Press, Oxford.
Youschkevitch, A. P. (1976). The concept of function up to the middle of the 19th century. Archive for History of Exact Sciences, 16:37–85.
Mathematical representation of empirical phenomena : a case study of 18th-century mathematical and mechanical… Dragulin, Serban 2011
Item Metadata
Title | Mathematical representation of empirical phenomena : a case study of 18th-century mathematical and mechanical concepts |
Creator |
Dragulin, Serban |
Publisher | University of British Columbia |
Date Issued | 2011 |
Description | I suggest a theory of scientific models which considers that a model is composed of three parts: an idealized system, a theoretical description and a set of mathematical equations. Each component is connected to the previous one by bridge principles, i.e. any assertion, whether justified or simply postulated, which establishes a correspondence between two parts of a model. The goal of my research was to find within 18th-century mechanics a set of laws that could be considered mediators between the parts of models. The greater part is dedicated to an analysis of the conceptual developments made by Euler and Lagrange in the 18th century both in mathematics and in mechanics. In the conclusion, I show that the relations between the parts of models are too complex to be expressed in a single, unifying assertion. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-12-14 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0072436 |
URI | http://hdl.handle.net/2429/39701 |
Degree |
Master of Arts - MA |
Program |
Philosophy |
Affiliation |
Arts, Faculty of Philosophy, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2012-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
Aggregated Source Repository | DSpace |