UBC Theses and Dissertations

Toward a quantum theory of cognition : history, development, and perspectives Veloz, Tomás 2015

Toward a Quantum Theory of Cognition: History, Development, and Perspectives

by

Tomás Veloz

B.Sc. Physics, Universidad de Chile, 2005
B.Sc. Mathematics, Universidad de Chile, 2007
M.Sc. Computer Science, Universidad de Chile, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
in
THE COLLEGE OF GRADUATE STUDIES
(Interdisciplinary Studies)

THE UNIVERSITY OF BRITISH COLUMBIA
(Okanagan)

December 2015

© Tomás Veloz, 2015

Abstract

The representation and processing of concepts is considered to be one of the hardest challenges in cognitive science. While computer scientists and engineers have focused on developing advances for particular tasks, philosophers and cognitive scientists have focused on elucidating the structural nature of meaning.

A remarkable bridge between these two limited-success approaches can be found in behavioral research, since, in a variety of tasks, humans process information at a conceptual level in a way that is incompatible with classical probability and fuzzy set theory. Recently, this incompatibility has been shown to occur at a deep structural level, and attempts have been made to use mathematical schemes founded on quantum structures as alternative approaches. For this reason, the application of quantum structures to this type of phenomenon has received increasing attention. The quantum approach allows one to faithfully model a number of non-classical deviations observed in experimental data. Moreover, it shows that genuine quantum theoretical notions, such as contextuality, superposition, emergence, and entanglement, are powerful epistemic tools to understand and represent cognitive phenomena.

In this thesis, we identify the limitations of classical theories in handling some important cognitive tasks, and introduce the fundamentals of the quantum cognitive approach to concepts.
Next, we perform a mathematical analysis of current concept combination models and develop an extension that allows for concrete representations of multiple exemplars simultaneously. Our analysis indicates that a superposition of logical reasoning and a specific form of non-logical reasoning, where non-logical reasoning is dominant, allows a faithful representation of the experimental data. Therefore, the non-logical reasoning introduced by this model represents an important but unexplored form of reasoning in humans.

In addition, we develop novel experimental methodologies to identify quantum conceptual structures for concept combinations in the context of natural language processing and psychological experiments. Namely, we present a methodology to build entangled concepts represented as sets of words with respect to a corpus of text, and present a computational and psychological methodology to discern if a collection of concepts behaves statistically as a collection of quantum or classical particles. Using both methodologies we have identified a significant presence of quantum conceptual structure in the context of natural language processing and psychological experiments.

Preface

In this thesis, I performed a systematic study of the quantum-cognitive approach to concepts. First, I made a comprehensive literature review to motivate the use of the quantum-cognitive approach in cognitive science. Next, I developed a mathematical analysis of the current quantum-cognitive models of concepts, and introduced novel mathematical tools to produce concrete representations of experimental data.
Finally, I introduced new experimental methodologies to identify quantum conceptual structures in the context of natural language processing and psychological experiments.

Some of the material presented in the thesis has been published in the following scientific journals (the relative contribution of each author is indicated with a percentage value after the name):

Chapter 2:

Veloz T.(40%), Gabora L.(20%), Eyjolfson M.(20%), Aerts D.(20%) (2011). Toward a Formal Model of the Shifting Relationship between Concepts and Contexts during Associative Thought, Lecture Notes in Computer Science, 2011, Volume 7052/2011, 25-34.
I performed the theoretical and data analysis.

Aerts D.(30%), Broekaert J.(20%), Gabora L.(20%), Veloz T.(20%) (2012). The Guppy Effect as Interference. In Quantum Interaction (pp. 36–47). Springer Berlin Heidelberg.
I performed the data analysis.

Chapter 5:

Veloz T.(50%), Desjardins S.(50%) (2015). Unitary Transformations in the Quantum Model for Conceptual Conjunctions and its Application to Data Representation, Frontiers in Psychology (accepted).
I performed the theoretical analysis.

Chapter 6:

Aerts D.(40%), Sozzo S.(40%), Veloz T.(20%) (2015). Quantum Structure in Cognition and the Foundations of Human Reasoning, International Journal of Theoretical Physics (accepted).
I collaborated on the development of the theoretical analysis. The theoretical and data analyses have been improved in the final version of this thesis.

Aerts D.(40%), Sozzo S.(40%), Veloz T.(20%) (2015). A New Fundamental Evidence of Non-Classical Structure in the Combination of Natural Concepts, Philosophical Transactions of the Royal Society A (accepted).
I performed some of the data analysis. The theoretical and data analyses have been improved in the final version of this thesis.

Chapter 7:

Aerts D.(20%), Sozzo S.(20%), Veloz T.(60%) (2015).
The Quantum Nature of Identity in Human Concepts: Bose-Einstein Statistics for Conceptual Indistinguishability, International Journal of Theoretical Physics, 1-14.
I developed the theoretical analysis, the experiment design and performance, and the data analysis.

Veloz T.(50%), Zhao X.(30%), Aerts D.(20%) (2013). Measuring Conceptual Entanglement in Collections of Documents. In Quantum Interaction (pp. 134–146). Springer Berlin Heidelberg.
I developed the theoretical analysis, the experiment design, and the data analysis.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Chapter 1: Introduction
Chapter 2: Basics of Cognitive Modeling
    2.1 From the Mind-Body Problem to a Theory of Concepts
        2.1.1 Cognitive Science
        2.1.2 Cognitive Modeling
        2.1.3 Theories of Concepts
    2.2 Challenges for a Theory of Concepts
        2.2.1 Vagueness
        2.2.2 Context Dependence
        2.2.3 Non-Compositionality
Chapter 3: The Quantum Approach to Cognitive Modeling
    3.1 Quantum Physics and Quantum Structures
    3.2 Conditions of Possible Experience and Non-classical Statistics
    3.3 The Birth of Quantum Cognition
    3.4 Fundamentals of Quantum Modeling in Cognition
    3.5 Quantum Cognitive Models and Cognitive Challenges
        3.5.1 The Conjunction Fallacy as Incompatibility
        3.5.2 Overextension and Underextension as Interference
        3.5.3 Ellsberg and Machina Paradoxes
    3.6 Entanglement of Conceptual Combinations
        3.6.1 Quantum Entanglement
        3.6.2 Psychological Evidence of Conceptual Entanglement
Chapter 4: Two Quantum Models for the Conjunction and Disjunction of Concepts
    4.1 Modeling on a Hilbert Space
        4.1.1 Scope and Dimensionality of a Hilbert Space Model
    4.2 Modeling in the Tensor Product of Hilbert Spaces
        4.2.1 A Simple Tensor Product Model
        4.2.2 Generalizing the States in the Tensor Product Model
    4.3 Examples and Comparisons
Chapter 5: Fock Space Modeling of Conjunctions and Disjunctions of Concepts
    5.1 The Two-sector Fock Space Model
        5.1.1 Concept Combination in the Hilbert and Tensor Product Models: One or Two Instances in Mind?
        5.1.2 Introduction to Fock Space Modeling
    5.2 Data Representation of Multiple Exemplars
        5.2.1 Hilbert Space Representation
        5.2.2 Tensor Product Model Representation
        5.2.3 Two-sector Fock Space Representation
    5.3 Data Representation Analysis
Chapter 6: Fock Space Modeling of Negations and Conjunctions of Concepts
    6.1 Conditions for a Classical Model
    6.2 Experiment on Conjunctions and Negations of Concepts
        6.2.1 Results
    6.3 Fock Space Modeling of Conjunctions and Negations
        6.3.1 First Sector Analysis
        6.3.2 Second Sector Analysis
        6.3.3 Fock Space Representation of Experimental Data
    6.4 Examples and Data Representation Analysis
Chapter 7: Quantum Structures in Natural Language Processing
    7.1 Language, Concepts, and Quantum Structures
    7.2 Evidence of Quantum Structure in Natural Language Processing
        7.2.1 Quantum Entanglement in Text Corpora
        7.2.2 Indistinguishability of Concepts and Bose-Einstein Statistics
Chapter 8: Conclusion
    8.1 General Conclusion
    8.2 Future Work
        8.2.1 Incompatible Exemplars
        8.2.2 Modeling Concept Combinations for Real-World Applications
        8.2.3 Indistinguishability and Modes of Reasoning
Bibliography
Appendices
Appendix A: Traditional Modeling Tools
    A.1 Classical Logic
    A.2 Fuzzy Logic and Fuzzy Set Theory
    A.3 Probabilistic Approaches
        A.3.1 Interpretations of Probability
        A.3.2 Probability Spaces
Appendix B: Membership of Conjunctions and Negations of Concepts

List of Tables

Table 2.1: Data table of context-dependence typicality experiment for concepts. For each pair of numbers, the first number indicates the average typicality of the exemplar, and the second number indicates the normalized typicality. The total typicality of each context is shown in the last row. Contexts e1 = Is a hat, and e5 = Is not worn by a person are shaded according to their normalized typicality (the larger the number, the darker the shade).
Table 2.2: Payoff table of Machina paradox. In E1, f1 pays 202; in E2, f1 pays 201; and so on.
Table 2.3: Experimental data in borderline contradiction for x = ‘John.’
Table 2.4: Experimental membership weights for exemplars x1 = ‘coffee table,’ and x2 = ‘tree house.’
Table 3.1: Cognitive experiment revealing non-classical statistics.
Table 3.2: Data table of conceptual entanglement experiment in [AS14].
Table 6.1: 95% confidence interval for Λ_A and Λ_B for the data on conjunctions and negations in Tables B.1–B.4, Appendix B.
Table 6.2: 95% confidence interval for I_A, I_B, I_Ā, and I_B̄, for the data on conjunctions and negations in Tables B.1–B.4, Appendix B.
Table 7.1: List of concepts and their respective exemplars for the psychological experiment on indistinguishability.
Table 7.2: Results of statistical fit for the psychological experiment. Each column refers to the 14 collections of concepts introduced in Table 7.1.
Table 7.3: List of singular/plural reference to states used to perform the web-based experiment.
Table 7.4: List of references to numbers used to perform the web-based experiment.
Table 7.5: Results of statistical fit of the web-based experiment. The numbers in bold correspond to the cases where the BE-distribution provides the best fit according to the ΔBIC and R² criteria.
Table B.1: Representation of the membership weights in the case of the concepts Home Furnishing and Furniture.
Table B.2: Representation of the membership weights in the case of the concepts Spices and Herbs.
Table B.3: Representation of the membership weights in the case of the concepts Pets and Farmyard Animals.
Table B.4: Representation of the membership weights in the case of the concepts Fruits and Vegetables.

List of Figures

Figure 2.1: Tversky and Kahneman experiment on conjunction probability estimation. The original experiment contained five other alternatives. We present only three for the sake of simplicity.
Figure 2.2: An example of the Sure-Thing principle.
Figure 2.3: An Ellsberg paradox situation.
Figure 2.4: Normalized typicality estimations of the concepts ‘Furniture,’ ‘Household Appliance,’ and their conjunction with respect to 16 exemplars (on x-axis). The minimum and maximum of the former concepts in the combination are shown in grey lines, the typicality of the conjunction is the black line, and the average formula is the black-dashed line. Double overextended exemplars are marked by red points.
Figure 3.1: Graphical description of the proportion of participants with or without predetermined answers for question U [AA97].
Figure 3.2: Measurement process in the ε-model [AA97]. In a) the state of the participant prior to the question is on the circle, in b) the point falls onto the elastic, in c) the elastic breaks, and in d) the point is attached to one of the extremes, revealing the outcome [AA97].
Figure 3.3: Graphical description of the proportion of participants with or without predetermined answers to the three questions U, V, and W [AA97].
Figure 4.1: Hilbert space model in ℂ² for concept combination with µ(A) + µ(B) = 1.
Figure 4.2: Relative frequency of experimental data that can be represented in the Hilbert space or tensor space models.
Figure 5.1: Fraction of Hampton’s experimental data that can be simultaneously modeled in the two-sector Fock space model for different values of n_AB. The blue and red curves correspond to the fraction of exemplars that can be simultaneously modeled using the zero-type and second-type representations respectively.
Figure 6.1: Representation of intervals I_A and I_B in blue, and of intervals I_Ā and I_B̄ in red.
Figure 6.2: Number of exemplars having a zero- and second-type representation for different values of n_XY.
Figure 6.3: Fraction of experimental data that can be simultaneously represented in the Fock space models for specific values of n.
Figure 7.1: Frequency of the violation of Eq. (7.1) for the 20 most relevant terms. The left plot corresponds to the co-occurrence data for relevance associated to term-frequency score, and the right plot corresponds to the co-occurrence data for relevance associated to tf-idf score. In both plots, the topics were sorted such that p5(T) is decreasing, so that the curves do not cross each other.
Figure 7.2: Each point corresponds to the choice of particular measurements A, A′, B, and B′. The x-axis represents the extent to which equation (7.1) is violated and the y-axis denotes the δ value. We consider three scales for the δ value, and one single scale for the middle term of the CHSH inequality. Points to the right of the red line violate the CHSH inequality.
Acknowledgements

This thesis is the result of a number of collaborations with different research groups. Therefore there are many people I wish to thank.

Before doing so, I would first like to thank my supervisors, Sylvie Desjardins and Diederik Aerts, for engaging closely in all aspects related to the development of this thesis, and for giving me wise insights when needed; Sandro Sozzo for being a collaborator in some of the results presented in the thesis, and for helping structure the thesis; and Liane Gabora for being the first person that supported my application to the PhD program, and for being my supervisor during the first year.

I also want to thank UBC Okanagan for supporting this PhD with various scholarships and awards.

Next, I would like to thank the people I have collaborated with or have had inspirational discussions with at the research groups I have visited:

1. Peter Dittrich, Bashar Ibrahim, Gerd Grunhert, Peter Kreigssig, and Gabi Escuela at the Jena Centre of Bioinformatics, Jena, Germany.
2. Jan Broekaert, Francis Heylighen, and Viktoras Veitas at the Centre Leo Apostel, Brussels, Belgium.
3. Pablo Razeto at Instituto de Filosofía y Ciencias de la Complejidad, Santiago, Chile.
4. Yuexian Hou, Xiaozhao Zhao, Hailin Wang, and Peng Zhang at the Institute for Computational Intelligence and Internet Applications, Tianjin University, China.
5. Yunde Wu at the Mathematics Department, Zhejiang University, China.

I would also like to thank the many friends and colleagues that have helped in one way or another by providing the necessary inspiration and/or by contributing to the development of this project: Alvaro Villanueva, Lautaro Elgueta, Daniel Souza, Alvaro Fuenzalida, Fabián Belmonte, Esteban Tohá, Eduardo Urra, Manuel Quezada, Arnaldo Aravena, César Valdenegro, Michel Barros, Mario Markus, Paz Durán, David Rojas, Pablo Ortiz, Tirso Gonzales, Alejandro Fajardo, Navid Hossaini, Dan Kheila, Jonathan Pino, Laura Belliveau, Cristian Vásquez, Rosa Aguilera, Antonella Legovini, Juan-Pablo Vásquez, Paul Bessone, Nadja Peter, Maria Luisa Dalla Chiara, Alex Broughton, and Camila Rojas.

Finally, I would like to thank my parents, Alejandra González and Patricio Veloz, for their unconditional support throughout my life, and my sister Paula Veloz and her son Nicolás Diaz for multiple conversations that have paved the road that led to this thesis.

Chapter 1: Introduction

A well-established fact in cognitive science is that cognitive phenomena cannot be appropriately modeled using the traditional representational tools [Fod98, Gar90, Daw13]. This fact has serious implications for our understanding of what cognition is, and it is one of the major impediments to the advance of many research areas related to cognitive modeling such as knowledge representation and decision-making [McC].

An alternative approach to cognitive modeling borrows the representational tools of quantum theory to study cases where traditional methods fail. For example, in the field of decision-making, the conjunction fallacy [Fra09] and the Ellsberg paradox [ADS11] are important cases where quantum-inspired models have been used to overcome the limitations of traditional modeling.
Quantum-inspired models have recently been developed for phenomena in multiple areas including psychology [BPFT11, BPB13], economics [Khr10], and computer science [MP13, BKL13]. The research field that applies the mathematical formalism of quantum theory to study cognitive phenomena is known as quantum cognition [BBG13].

One area where quantum cognition has found interesting results is the field of concept modeling. Scholars from a wide range of communities, such as philosophy, linguistics, and psychology, agree that concept combinations cannot in most cases be represented using traditional tools such as logic and probability theory. In fact, it has been shown that the conditions for a logical or probabilistic model for concept combinations are usually violated by data collected in psychological experiments [SO81, Ham88a, Ham88b]. However, quantum-inspired models, with genuine quantum features such as state superposition, interference, and entanglement, provide faithful representations for most concept combinations [Aer09, AGS13].

In view of the promising results provided by the quantum-cognitive approach to concept combinations, we propose to carry out a systematic review of this approach to better understand why quantum cognition provides adequate representations of concept combinations, and to propose a framework that enhances the range of applications of concept combination models.

This thesis is divided into three parts:

1. A systematic review of the structural properties of conceptual phenomena, and an introduction to the quantum approach to cognitive modeling.
2. An analysis of the mathematical framework for quantum-cognitive models of concept combinations, and the development of new models that have broader applications.
3. A philosophical argument and some empirical evidence for the application of quantum cognition in artificial intelligence.

In the first part of the thesis, we introduce concept modeling, and identify three structural problems that prevent the development of an adequate theory of concepts. This is done in part by presenting cognitive phenomena that cannot be represented by using traditional mathematical tools. Next, we introduce the quantum approach to cognitive modeling, and demonstrate how quantum-cognitive models can be used to represent those cognitive phenomena.

In the second part, we give a detailed mathematical analysis of the two most important quantum models of concept combination developed in the literature: the Hilbert space and tensor product models for concept conjunctions and disjunctions. We focus on the conditions required by each model to represent experimental data, and identify the minimal dimension that is required by each model to reach maximal modeling power. We then show that the Hilbert and tensor product models entail two fundamentally different ways to reason about concepts, and combine these two models into a more general model: the two-sector Fock space model.

In the two-sector Fock space model, the Hilbert and tensor product models are recovered as extreme cases. Intermediate cases between these two extremes correspond to superposed modes of thought. These superposed modes of thought can represent instances of concept combinations that do not have a representation in the original two models. In addition, we show that the concrete representations provided by the aforementioned models for concept combinations are not consistent with the quantum cognitive principles that inspire the abstract model: conceptual states must be independent of the exemplar, and measurement operators must be exemplar-dependent. We use unitary transformations in the concrete spaces ℂ³ and ℂ³ ⊗ ℂ³ to construct representations of multiple exemplars in accordance with the quantum modeling principles, and extend this representation method to the two-sector Fock space model.

Next, we extend the two-sector Fock space model of conjunctions to the case of conjunctions and negations. We first develop a theoretical analysis that characterizes classical data for the case of conjunctions and negations. Then, we introduce experimental data showing that concept combinations involving negations of concepts do not satisfy the conditions of classical data, elaborate concrete representations in the space ℂ⁸ ⊕ (ℂ⁸ ⊗ ℂ⁸), and analyze these representations to show that the extended two-sector Fock space model can faithfully represent the experimental data.

In the last part, we consider the limitations of current artificial intelligence methodologies from the perspective of quantum cognition, and provide examples that justify the development of quantum-inspired models in artificial intelligence. In particular, we explain how the problems of vagueness, contextuality, and non-compositionality are relevant to a sub-area of artificial intelligence known as natural language processing. We provide experimental evidence of quantum structures in natural language processing by showing that quantum entanglement can be found in the word co-occurrence statistics of a corpus of text, and that Bose-Einstein statistics can be found in psychological experiments and in the retrieval statistics of a search engine.

Chapter 2: Basics of Cognitive Modeling

2.1 From the Mind-Body Problem to a Theory of Concepts

Human beings have the capacities to observe elements of reality, to identify and represent relations among such elements, and to hypothesize and test unobserved relations.
These capacities have led to the emergence of a number of fields of knowledge that have developed to explain physical reality. Among these fields, the basic sciences occupy a privileged place because their methodologies have led to the development of important technological advances.

All known human cultures have recognized the existence of a second non-physical realm that must be added to the physical realm to complete the picture of the factuality of human existence. This realm, where human manifestations such as ideas, emotions, and self-awareness reside, is known as ‘The Mind’ [Sea04]. Whether or not these two realms exist independently of each other is one of the most fundamental questions in western philosophy. This is known as ‘the mind-body problem’ [Wig61]. In modern science, the interdisciplinary effort toward the study of the realm of the mind is known as cognitive science [Daw13].

2.1.1 Cognitive Science

Cognitive science is defined as the scientific study of the mind and its processes. It examines what cognition is, what it does and how it works, and, like any other science, it aims to develop technologies and tools to advance our understanding of, in this case, the mind. Such investigation includes considerations of multiple aspects of intelligence and behaviour, and focuses on how information is represented, processed, and transformed. In particular, cognitive science investigates cognitive phenomena such as perception, language, memory, reasoning, and emotion.

The majority of cognitive scientists assume that cognition is the product of neurological processes occurring mostly in the brain [Tho85]. Hence, the dominant attitude regarding the mind-body problem is that the mind is a ‘result’ of the body.
This is better understood by noting that most approaches to the study of cognition start from a basic ‘cognitive architecture.’ For example, neuroscience and clinical psychology assume that cognitive phenomena are the output of a nervous system that is controlled by the human brain [Daw13]. For artificial intelligence, cognitive phenomena are assumed to be the output of specific software implemented on a machine [Gar90].

An alternative view, held mainly by a mix of applied mathematicians and cognitive psychologists [Nei76], focuses on understanding the structural aspects of cognitive phenomena from an abstract, and usually mathematical, perspective. This alternative approach, known as cognitive modeling, is the one we follow here. Therefore, we will identify some fundamental structural properties underlying cognitive phenomena, and attempt to represent them using the language of mathematics.

2.1.2 Cognitive Modeling

A cognitive model is an approximation to a cognitive phenomenon for the purpose of comprehension and prediction. Cognitive models normally focus on a single cognitive phenomenon. For example, we could study how a person directs visual attention to certain images, or how a person decides which links to follow on a webpage. Cognitive modeling can also study how two phenomena interact. For example, we can combine the last two phenomena to study the effect of how we direct our visual attention on the choice of links we make in a webpage.

There are many mathematical approaches to cognitive modeling. They range from basic arithmetic operations to highly abstract representations based on category theory. The most popular mathematical tools applied to cognitive modeling are logic, probability theory, linear algebra, and network theory [RN95]. We will cover some of these approaches in Appendix A.
Regardless of the mathematical approach, we can divide the modeling efforts developed within the cognitive modeling community into two main classes.

The first kind consists of ‘ad-hoc’ cognitive models. Here, the purpose is to model a particular phenomenon in a specific domain of application. One example is the model of visual categorization of geometrical shapes based on ontologies presented in [MT08]. The authors introduce a list of features that play an important role in visual categorization and their relations, and an algorithmic procedure, based on Bayesian statistics, to categorize them. Examples of such categorization elements in the ontology are sphere-like, rounded, uniform texture, etc., and an example of a relation is (rounded, uniform texture) → (sphere-like). The algorithms in this model assume three incremental stages: i) knowledge acquisition, ii) learning, and iii) categorization. The model is useful for the task for which it was developed, especially in the case of smooth shapes. However, its design is not meant to represent anything other than visual categorization, nor is it compatible with other models.

The second kind includes the so-called concept theories, which are general representation frameworks for cognitive phenomena. Here, concepts are envisaged as the units that underlie cognitive phenomena. Since understanding the nature of these units leads to a first-principles basis for a theory of cognition [RMG+76, SBZ01, Gär00], the aim of the theories of concepts is to reveal the formal structure of concepts.

In this work, we focus on the second kind of approach. Namely, we are interested in the structural aspects that the notion of concept needs in order to be properly applied to produce cognitive models.
We aim at a characterization that, on the one hand, identifies the fundamental structural aspects of concepts, and on the other, can be represented within a mathematical theory.

2.1.3 Theories of Concepts

Traditional models of concepts concentrate on categories possessing concrete or imaginary instances, such as ‘horse,’ or ‘dragon’ [Bea64, Ros73, Mac09a]. Modern approaches, however, extend to include abstract instances such as topics of discussion [SG07, BL06], music genres [AP03], and images [BHAT05]. In cognitive science, there are three main proposals for a theory of concepts that are mathematically sound: the classical theory [Med89], the prototype theory [RMG+76], and the exemplar theory [Nos86].

The classical theory follows the tradition of classical logic, and assumes that concepts are determined by a fixed set of attributes. Hence, any instance that holds these attributes is a member of the concept. Classical logic or some of its extensions are applied for inferential tasks, and for concept combinations. This theory of concepts thus assigns membership truth values: an instance is or is not associated with a particular concept (Appendix A).

The prototype theory proposes that concepts are not defined by a fixed set of attributes, but instead by one or multiple prototypes that incorporate the most relevant properties. Each exemplar has a degree of membership and, if the membership is positive, a degree of typicality. The prototype has the maximum degree of typicality. Prototype theory, formulated in the language of fuzzy sets (Appendix A.2), is more general than the classical theory of concepts in that it introduces a graded structure for the membership in terms of such things as typicality, similarity, and representativeness [GA02].

The exemplar theory assumes that a concept is defined by a list of stored entities that represent the current understanding of a certain agent concerning the concept in a given context.
One can assess similarity estimations among the instances, and apply logical techniques to infer the similarity to new instances, as well as to combine concepts. Thus, the notion of prototype is recovered in this theory, for one can refer to some instances as more typical. However, the mathematical framework of this theory requires a number of parameters that grows with the number of exemplars, and these parameters do not have a clear interpretation [Nos86].

Throughout this thesis, we will denote concepts with single quotation marks in italic style with the first letter capitalized on each word, and by capital calligraphic letters when denoted in abstract form. For example, let A denote the concept ‘Animal.’ Conceptual instances, also called exemplars, will be denoted between quotation marks without italics, and by lowercase letters when denoted in abstract form. For example, we say p = ‘dog’ is an exemplar of concept A. Properties, also called attributes or features, apply to both concepts and instances. We will denote properties in italics without quotation marks: has four legs is a property of the exemplar ‘dog’ of the concept ‘Animal.’

2.2 Challenges for a Theory of Concepts

There is a particular set of phenomena in concept research that highlights the problems of current cognitive models. These problematic phenomena challenge not only the accuracy of traditional models, but also the philosophical principles these models are built upon. We take a closer look at these phenomena to better understand the traditional models, and to identify their weaknesses.
We can identify three issues that are problematic in cognitive modeling: vagueness, contextuality, and non-compositionality. In this chapter, we introduce these three issues, and present cognitive phenomena that characterize problems associated with them.

2.2.1 Vagueness

Concepts we reason with in our daily life are not sharply defined, either in their boundaries or in their implications [Wit58, Zad65]. Cognitive psychologists, mostly during the seventies and eighties, investigated the imprecise use of concepts in reasoning. They carried out a large number of experiments to characterize how people understand the meaning of concepts we use in daily life, and concluded that the way people estimate the meaning of concepts cannot be modeled using binary systems (‘yes’/‘no’), but requires instead graded relations that reflect their structural vagueness [Ros73, RMG+76, SM81, SL97].

This is illustrated by how people estimate the membership of different exemplars with respect to a concept. For example, consider the concept ‘Pet,’ and suppose we want to estimate the membership of the exemplars ‘dog,’ ‘snake,’ and ‘robot.’ Clearly, we can be more certain about the first instance being a member of ‘Pet’ than about the second instance, and in turn, we can be more certain about the second instance being a member of ‘Pet’ than about the third instance. This suggests that membership should be quantified and that a graded structure is required.

Several cognitive scientists believe that the membership of an exemplar with respect to a concept depends on how much the exemplar resembles the prototype for the concept. Here, the prototype represents the most typical exemplar of a concept. Hence, membership of an exemplar with respect to a concept is measured by the similarity between the prototype and the exemplar [RMG+76]. This idea is one of the milestones of the prototype theory of concepts. In particular, prototypes of concepts can be experimentally obtained by requesting participants to estimate the typicality of a list of exemplars with respect to a concept [RMG+76, Ros99]. Using similar methods, we can also measure experimentally the extent to which an exemplar resembles the prototype of a concept. Experiments confirm that the more similar an exemplar is to the prototype of a concept, the larger its degree of membership is [Ham07].

But the prototype-based approach to membership becomes unclear when a concept has more than one prototype because, after the similarity of an exemplar to each of the prototypes has been assessed, there are different ways to assign membership to the concept as a function of these similarities. Although several similarity measures have been proposed in the literature [Gol94], none of them gives a satisfactory answer to the relation between membership and similarity to prototypes [Tve77, TG82, Ham07]. Furthermore, since prototypes are highly specific, it is difficult to determine a priori which prototypes are required to characterize a concept. Consider for example the concept ‘Pet’ with two prototypes ‘cat’ and ‘dog.’ Neither of these two prototypes is similar enough to the exemplar ‘goldfish’ to provide a membership assessment. Hence, ‘goldfish’ should also be considered a prototype. By applying a similar reasoning to other possible pets such as ‘spider,’ ‘robot,’ and ‘rabbit,’ it becomes clear that similarity-based approaches are inadequate to assess the membership of exemplars.

An alternative way to measure the degree of membership of an exemplar with respect to a concept is to consider the most representative properties of the concept. In fact, the prototypes of a concept can be recovered from the set of most representative properties [Nos87, Bal04].
In the literature, the notions of typicality and similarity have also been assessed using the idea of representative properties of a concept, or of its prototypes [Tve77]. However, the relation between the representative properties and the membership of exemplars to a concept is unclear for at least two reasons. First, since there can be many properties for a given concept, selecting the set of representative properties of a concept is subjective [GA09]. Second, the selection from these properties can mislead our membership estimations. For example, able to fly is generally a representative property of the concept ‘Bird,’ but a ‘penguin’ is a ‘Bird’ that is not able to fly [AG05a].

In conclusion, although researchers have put forward several alternative methods to assess degrees of membership, experimental evidence does not provide conclusive results concerning the relation between these methods and the membership of a concept. Therefore, the way in which typicality, similarity, and representativeness relate to the notion of membership is vague [Fod98]. Throughout the thesis, we will refer to membership, typicality, similarity, and representativeness using the generic term semantic estimation, and refer to the fact that the relation among these forms of semantic estimation is unclear as the “vagueness problem” of concepts.

2.2.2 Context Dependence

Because the meaning of concepts people use in daily life is generally contextual, we can achieve significant improvements by incorporating the notion of context in the study of concepts. People do not think about concepts in isolation, but rather, in an environment that involves both internal and external circumstances. From now on, we refer to this problem as the “contextuality problem” of conceptual structures.

Paradoxically, the notion of context seems harder to define than the notion of concept itself.
Depending on the area of application, different perceptions of what constitutes a context become the focus of the definition. While context is roughly understood as ‘the circumstances in which something occurs’ [Mei12], more than 150 definitions have been proposed in different areas such as linguistics, cognitive science, psychology, and philosophy [BB05].

For concept theories, context entails all the priors at the moment of eliciting a concept. Cognitive psychologists have performed multiple experiments to observe how these priors affect the semantic estimations of concepts. These semantic estimations involve exemplar membership [Ros99, RMG+76, Ham07], typicality [GA02, AG05a, VGEA11], property relevance [MS88, AG05a], and similarity [Nos87, Nos88] among others. For a detailed review of contextual effects on different semantic estimations see [PH14]. In all cases, the studies conclude that context radically affects the meaning of a concept.

In an experiment reported in [VGEA11], ninety-eight University of British Columbia undergraduates who were taking a first-year psychology course estimated the typicality of different exemplars of a concept given different contexts. The concept chosen was ‘Hat,’ and the chosen exemplars were p1 = ‘cowboy hat,’ p2 = ‘baseball cap,’ p3 = ‘helmet,’ p4 = ‘top hat,’ p5 = ‘coonskin cap,’ p6 = ‘toque,’ p7 = ‘pylon,’ and p8 = ‘Medicine Hat.’² Properties of ‘Hat’ were used to create a context. We denote the context by the name of the property in italic letters and capitalize the first letter to differentiate it from a property. The contexts chosen for the experiment were e1 = Is a hat, e2 = Is worn to be funny, e3 = Is worn for protection, e4 = Is worn in the South, and e5 = Is not worn by a person. Typicality estimations were made using a Likert scale ranging from 0 to 7. The average and normalized typicality estimates for each context are shown in Table 2.1.
The normalized typicality corresponds to the ratio between the average exemplar typicality and the sum of the exemplar typicalities for a given context.

Table 2.1: Data table of context-dependence typicality experiment for concepts. For each pair of numbers, the first number indicates the average typicality of the exemplar, and the second number indicates the normalized typicality. The total typicality of each context, N(e), is shown in the last row.

Exp. Data      e1            e2            e3            e4            e5
cowboy hat     (5.44; 0.18)  (3.57; 0.14)  (3.06; 0.13)  (6.24; 0.28)  (0.69; 0.05)
baseball cap   (6.32; 0.21)  (1.67; 0.06)  (3.16; 0.13)  (4.83; 0.21)  (0.64; 0.04)
helmet         (3.45; 0.11)  (2.19; 0.08)  (6.85; 0.28)  (2.85; 0.13)  (0.86; 0.06)
top hat        (5.12; 0.17)  (4.52; 0.17)  (2.00; 0.08)  (2.81; 0.12)  (0.92; 0.06)
coonskin cap   (3.55; 0.11)  (5.10; 0.19)  (2.57; 0.10)  (2.70; 0.12)  (1.38; 0.10)
toque          (4.96; 0.16)  (2.31; 0.09)  (4.11; 0.17)  (1.52; 0.07)  (0.77; 0.05)
pylon          (0.56; 0.02)  (5.46; 0.21)  (1.36; 0.05)  (0.68; 0.03)  (3.95; 0.29)
Medicine Hat   (0.86; 0.02)  (1.14; 0.04)  (0.67; 0.03)  (0.56; 0.02)  (4.25; 0.31)
N(e)           30.30         25.98         23.80         22.22         13.51

² Medicine Hat is a city in Canada.

The data show that the typicality of exemplars for the concept ‘Hat’ is strongly affected by the contexts under consideration. Context e1 was specifically introduced to minimize the contextual influence, so that the conventional typicality of the exemplars with respect to the concept can be determined. The other contexts were chosen to influence the meaning of the concept. Particularly, e5 was chosen to induce a context that is counterintuitive to the meaning of ‘Hat.’ Exemplars were chosen to cover the wide range of uses of the concept ‘Hat.’ For example, the exemplars ‘pylon’ and ‘Medicine Hat’ are hardly members of ‘Hat.’ This can be verified by the extremely small typicality they receive in the context e1.
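The normalization used in Table 2.1 is simple enough to recompute directly from the table. The sketch below is only an illustration and not part of the original analysis: it copies the average typicalities of contexts e1 and e5 from Table 2.1, divides each by the column total N(e) reported in the last row, and also computes the Pearson correlation between the two columns as one way of quantifying how strongly e5 reverses the ordering induced by e1. The names `normalize` and `pearson` are our own.

```python
from math import sqrt

# Average typicalities copied from Table 2.1 (contexts e1 and e5 only).
EXEMPLARS = ["cowboy hat", "baseball cap", "helmet", "top hat",
             "coonskin cap", "toque", "pylon", "Medicine Hat"]
E1 = [5.44, 6.32, 3.45, 5.12, 3.55, 4.96, 0.56, 0.86]
E5 = [0.69, 0.64, 0.86, 0.92, 1.38, 0.77, 3.95, 4.25]
N_E1, N_E5 = 30.30, 13.51  # column totals N(e) as reported in the table


def normalize(averages, total):
    """Normalized typicality: average typicality divided by the context total."""
    return [a / total for a in averages]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    for name, n1, n5 in zip(EXEMPLARS, normalize(E1, N_E1), normalize(E5, N_E5)):
        print(f"{name:13s} e1: {n1:.2f}  e5: {n5:.2f}")
    print(f"correlation(e1, e5) = {pearson(E1, E5):.2f}")
```

Because the normalization only rescales each column by a constant, the correlation is the same whether it is computed on average or on normalized typicalities.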
It is interesting to note that ‘pylon’ and ‘Medicine Hat’ become the most typical exemplars under context e5. Moreover, the correlation coefficient for the typicality estimations between contexts e1 and e5 is ρ = −0.93. This strong anticorrelation is evidence for the possibility that, when contexts of a concept have opposite meanings, the corresponding typicality estimations of the concept are anticorrelated. This suggests that structural comparisons between the typicality estimations obtained for different contexts could be used to characterize semantic relations between contexts and between exemplars [VGEA11].

From a mathematical point of view, there are cognitive experiments showing that the way context influences concepts is incompatible with the assumptions of probability theory (Appendix A.3). In what follows, we present experiments that reveal the incompatibility of probabilistic approaches in three cognitive situations: direct probability estimation, order effects in psychological surveys, and decision-making experiments involving successive bets.

Direct Probability Estimation: The Conjunction Fallacy

In the course of their extremely influential research program on decision making, Amos Tversky and the Nobel laureate Daniel Kahneman³ introduced for the first time the conjunction fallacy [TK83]. The fallacy is that people generally estimate the occurrence of a conjunction of events to be more likely than the occurrence of one of the constituent events alone. Thus, it contradicts probabilistic rules about conjunction. For example, let $E_1$ and $E_2$ be two events, and the probability of their conjunction be given by

\[ P(E_1 \text{ and } E_2) = P(E_1 \cap E_2). \tag{2.1} \]

Then, the following inequality should hold:

\[ P(E_1 \cap E_2) \le \min\bigl(P(E_1), P(E_2)\bigr). \tag{2.2} \]

³ As a historical note, Kahneman recognized in his Nobel Prize acceptance speech that he should have shared the award with Tversky, who died six years before.

Note that in a standard logical setting such as classical and fuzzy logic, the membership for the conjunction of two categories is less than or equal to the minimum of the memberships of the former categories (Appendix A.2). However, experimental data shows that people’s estimations usually violate Eq. (2.2). The example used by Tversky and Kahneman in [TK83] is presented in Fig. 2.1.

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which is more probable?
a) Linda is a bank teller
b) Linda is active in the feminist movement
c) Linda is a bank teller and is active in the feminist movement

Figure 2.1: Tversky and Kahneman experiment on conjunction probability estimation. The original experiment contained five other alternatives. We present only three for the sake of simplicity.

Advocates of Boolean or fuzzy logical approaches to natural language have proposed several possible misunderstandings by participants to explain this effect. Namely, participants might misunderstand the meaning of the words ‘and’ [BHN93] and ‘probable’ [Gig96], or participants might tend to believe that option a) implies that ‘Linda is not a feminist’ [TK83, BTO04]. Measures were taken in subsequent experiments to mitigate these and other possible misunderstandings. They included either the training of participants, or explicitly stating the logical consequences of the possible choices in explanatory text. Although in most cases the percentage of participants committing the fallacy is reduced, the fallacy remains significant (above 30%) in all reasonable experimental settings. For a detailed review of the experiments where the fallacy has been tested, see [Mor09].

The conjunction fallacy has been confirmed in several studies that included hypothetical situations as well as real-life situations like diagnosis and prognosis in clinical settings [DH91, Rao09], forecasts of sports results [NA10], effects of government policies [BTO04], and political outcomes [LGS09]. Moreover, the fact that the same result is confirmed in different experimental settings, ranging from those using children [Agn91] to those using statistics experts [TK81] as participants, and considering different methodologies like choice [TBO04], ranking [SOSS03], and frequency [TC12] among others [TML96, BTO04, WM08], provides powerful empirical evidence for the conjunction fallacy.

Survey Answering: Order Effects

Researchers in psychology know that the order in which questions are presented influences the statistics of the responses. This is because earlier questions can provide context for the questions that follow, and hence can produce non-commutative effects. For example, when people are asked, ‘What is the most important problem facing the nation?,’ the answer participants give becomes the object of focus for their answer to a subsequent question: ‘Do you approve or disapprove of the way the president is handling his job?’ Indeed, most people will tend to judge the president’s performance primarily on the issue they selected in the first question [KK90].

But even though these non-commutative effects are understood, most decision-making models in psychology are based on classical probability, where joint events commute by definition.

In a classical probabilistic setting, answers ‘yes’ to two questions F and H are represented by sets $F_y, H_y \subseteq \Sigma$, where $\Sigma$ is the space of events (Appendix A.3). The event corresponding to answer ‘yes’ to F and H is defined by

\[ F_y \text{ and } H_y = F_y \cap H_y, \tag{2.3} \]

which is commutative. In Bayesian probability, the likelihood that a subject answers ‘yes’ to the question H given that the answer to F is ‘yes’ is represented by the conditional probability $P(H_y \mid F_y)$.
Analogously, $P(F_y \mid H_y)$ represents the conditional probability for the reverse order. The two probabilities are related by Bayes’ rule:

\[ P(F_y)\,P(H_y \mid F_y) = P(F_y \cap H_y) = P(H_y)\,P(F_y \mid H_y). \tag{2.4} \]

Experimental evidence confirms that Eq. (2.4) does not hold in general [BW07, WB13, TB11]. Bayesian models involving more elaborate forms of conditioning can account for order effects. However, these models involve ad hoc assumptions that can be accommodated only a posteriori, and thus have no predictive capacity [WSSB14]. Similarly, Markov models that can account for order effects have been constructed, but they also require the introduction of ad hoc elements to accommodate the different kinds of deviations reported in the literature [WB13]. For extensive reviews of the kinds of experiments and deviations measuring order effects, we refer to [SB74, SP96, HE92, TRR00].

Decision Making: Ellsberg and Machina Paradoxes

In economics, the predominant model of decision making is given by the Expected Utility Theory [VNM07]. A fundamental principle, the so-called Savage’s Sure-Thing Principle, ensures that if the possible outcomes of a variable x do not change the utility of a decision situation S, then the variable x can be neglected in the decision analysis. In [Sav72], the principle is introduced with the story shown in Fig. 2.2.

A businessman contemplates buying a certain piece of property. He considers the outcome of the next presidential election relevant. So, to clarify the matter to himself, he asks whether he would buy if he knew that the Democratic candidate were going to win, and decides that he would. Similarly, he considers whether he would buy if he knew that the Republican candidate were going to win, and again finds that he would. Seeing that he would buy in either event, he decides that he should buy, even though he does not know which event obtains, or will obtain, as we would ordinarily say.

Figure 2.2: An example of the Sure-Thing principle.

Let the events D and R represent the two disjoint possible outcomes of the presidential election, and let B represent the event that the businessman buys the property. We have

\[ P(D) + P(R) = 1, \qquad P(D \cap R) = 0. \tag{2.5} \]

Moreover, the fact that in both possible outcomes of the presidential election the businessman prefers to buy the property implies that

\[ P(B \mid D) \ge 0.5, \qquad P(B \mid R) \ge 0.5. \tag{2.6} \]

By the law of total probability, $P(B) = P(B \mid D)\,P(D) + P(B \mid R)\,P(R)$, so Eqs. (2.5) and (2.6) give $P(B) \ge 0.5$. Therefore, the preference for buying the property does not depend on the outcome of the presidential election, which can be neglected in the decision analysis.

In a well-known study [Ell61], Daniel Ellsberg demonstrated that Savage’s Sure-Thing Principle is inconsistent with the reality of human decision-making. The experiment performed by Ellsberg describes a situation such as the one in Fig. 2.3.

Consider an urn with 30 red balls and 60 balls that are either black or yellow in an unknown proportion. A bet regarding the color of a ball drawn from the urn is proposed under the following rules: when a bet is placed on c, a prize is given if the ball drawn is c; otherwise no prize is given. Here c can be a color or a disjunction of two colors.

Figure 2.3: An Ellsberg paradox situation.

Participants in the experiment are confronted by the following four options: (I) bet on red, (II) bet on black, (III) bet on red or yellow, (IV) bet on black or yellow. Subjects must decide between options (I) and (II), and then decide between options (III) and (IV).

The experimental results presented in [Ell61] show that a high proportion of participants prefer (I) over (II), and (IV) over (III). But this violates the Sure-Thing Principle, which requires that if (I) is preferred over (II), then (III) is preferred over (IV). A possible explanation for this violation could be that people make a mistake in their choice, and that the paradox is caused by an error of reasoning, or by aversion to ambiguity [FT95]. A number of models have tried to account for these possible errors in judgement. Most notable among them are Choquet expected utility [Gil87], max-min expected utility [GS89], variational preferences [MMR06], and second-order probabilities [KMM05]. All of these models are generalizations of the expected utility model based on either Bayesian inference schema, or on a framework that generalizes some specific aspect of a σ-algebra classical probabilistic setting [AST12]. Recently, a new decision situation, the so-called Machina paradox [Mac09b], was shown to be incompatible with all the above models [BLP11, Mac14].

The Machina paradox considers an urn with four kinds of balls, each allocated a number between 1 and 4. The number of balls with the number 1 together with the number of balls with the number 2 is fifty, and the number of balls with the number 3 or 4 is fifty-one. The event Ej indicates that a ball with the number j has been drawn from the urn. In a first stage of the experiment, participants are told that the choices fi, i = 1, ..., 4, have payoffs defined by Table 2.2. Next, participants are asked to decide between betting on f1 or f2.

A sufficiently ambiguity-averse participant will prefer f1 over f2 because, although f2 presents a slight Bayesian advantage, f1 has no ambiguity in its payoffs (if the unknown splits within each group of balls are treated symmetrically, the expected payoff is 151.5 for f2 against 151 for f1). The person is then asked to bet on f3 or f4. In this case, both choices present ambiguity in their payoffs. Thus, a decision maker who values unambiguous information would be indifferent between f3 and f4. On the other hand, f4 benefits from the 51 balls, hence in this case the Bayesian advantage implies that f4 should be preferred over f3 because of the different payoffs for cases E2 and E3 (under the same symmetric treatment, 151 for f4 against 150.5 for f3). However, most participants who prefer f1 over f2 later prefer f4 over f3.

Table 2.2: Payoff table of Machina paradox. In event E1, f1 pays 202, in E2, f1 pays 202, and so on.

Act   E1    E2    E3    E4
f1    202   202   101   101
f2    202   101   202   101
f3    303   202   101   0
f4    303   101   202   0

The paradox appears because none of the existing models can represent the participants’ contextual behaviour [AST12]. Namely, in a first stage, participants are in a context where there is enough information to discern which act has a more ambiguous payoff, and choose f1 over f2, exposing their aversion towards ambiguity. In the second stage, however, participants are in a context where there is not enough information to discern which act has a more ambiguous payoff, and choose f4 over f3, recognizing the Bayesian advantage.

2.2.3 Non-Compositionality

The vagueness and contextuality problems, mentioned in § 2.2.1 and § 2.2.2 respectively, occur for individual concepts. In a general setting, a cognitive situation might include multiple concepts forming aggregated structures [Rip95]. For example, the concepts ‘Fruit’ and ‘Vegetable’ can be combined to form a new concept ‘Fruit or Vegetable’ [Ham88a]. This concept combination is built with the connective ‘or,’ which is also an operation mathematically defined in logic and probability. The question becomes: is it possible to apply the mathematical definition of the connective ‘or’ to build the structure of ‘Fruit or Vegetable’ from the structures of ‘Fruit’ and ‘Vegetable’?

Traditional approaches to the study of cognitive phenomena assume that this question has a positive answer. This assumption, known as the principle of compositionality [Pel94], was first introduced to formalize logical inference, and later applied to linguistics [Gra90] and concept theory [RMG+76].

But modern cognitive psychologists still do not agree on whether or not concepts are compositional [FP88, Fod98].
They have performed several experiments measuring various semantic estimations for concept combinations built with connectives used in logic such as ‘Pet and Fish,’ and ‘Not Sport’ [Ham88b, Ham88a, Ham97a], and adjective-noun compounds such as ‘Red Apple’ [MS88, KP95, Med89] among others [Ham97b]. The evidence collected during two decades of research reveals that concept combinations are not compositional in general, at least in the sense suggested by fuzzy logic and probability theory. From now on, we refer to this problem as the “non-compositionality” problem of conceptual structures.

One of the most illustrative phenomena, called borderline contradiction, considers the gradedness structure of a concept in conjunction with its negation. Namely, a borderline contradiction case is a logical sentence of the type p(x) and Not p(x) that is estimated to be ‘true,’ for a certain predicate p and a borderline exemplar x. For example, in [AP11], participants estimate the truth value of the predicate p(x) = ‘x Is Tall’ for an instantiation x = ‘John,’ whose height assumes the values in Table 2.3.

To fit the evidence found in borderline contradiction, pioneering investigations [BOVW99] assumed the ignorance of participants, and proposed a weakening of logical rules for truth estimations. Other proposals include truth gaps based on pragmatic logic [AP11], slight relaxations of probability theory [Rip11], paraconsistent logic [Rip13], and models inspired by fuzzy logic [Sau11].
None of these approaches has provided faithful modeling of empirical data with a coherent explanation of how to model concept combinations [BPB13, Soz14].

Table 2.3: Experimental data in borderline contradiction for x = ‘John.’

height             5′4″   5′7″   5′11″   6′2″   6′6″
% p(x) = ‘true’    14.5   21.1   44.6    28.9   5.3

For combinations involving any two concepts combined by conjunction or disjunction, the gradedness structure exhibits features that are even less obvious than what has been found in borderline contradiction research. Aerts, in [Aer09], formally states the conditions that characterize the existence of a classical probability model for concept conjunction and disjunction:

Definition 2.1. Let $\mu_x(A)$, $\mu_x(B)$, and $\mu_x(A \text{ and } B)$ be the membership weights of an item x with respect to a pair of concepts A and B and their conjunction ‘A and B.’ We say that these membership weights are classical conjunction data if there exists a Kolmogorovian probability space $(\Omega, \sigma(\Omega), P)$, and events $E_A, E_B \in \sigma(\Omega)$ such that

\[
\begin{aligned}
P(E_A) &= \mu_x(A),\\
P(E_B) &= \mu_x(B),\\
P(E_A \cap E_B) &= \mu_x(A \text{ and } B).
\end{aligned}
\tag{2.7}
\]

Theorem 2.2. The membership weights $\mu_x(A)$, $\mu_x(B)$, and $\mu_x(A \text{ and } B)$ of an item x with respect to concepts A, B and their conjunction ‘A and B’ are classical conjunction data if and only if

\[
\begin{aligned}
0 \le \mu_x(A \text{ and } B) &\le \mu_x(A) \le 1,\\
0 \le \mu_x(A \text{ and } B) &\le \mu_x(B) \le 1,\\
\mu_x(A) + \mu_x(B) - \mu_x(A \text{ and } B) &\le 1.
\end{aligned}
\tag{2.8}
\]

Definition 2.3. Let $\mu_x(A)$, $\mu_x(B)$, and $\mu_x(A \text{ or } B)$ be the membership weights of an item x with respect to a pair of concepts A and B and their disjunction ‘A or B.’ We say that these membership weights are classical disjunction data if there exists a Kolmogorovian probability space $(\Omega, \sigma(\Omega), P)$, and events $E_A, E_B \in \sigma(\Omega)$ such that

\[
\begin{aligned}
P(E_A) &= \mu_x(A),\\
P(E_B) &= \mu_x(B),\\
P(E_A \cup E_B) &= \mu_x(A \text{ or } B).
\end{aligned}
\tag{2.9}
\]

Theorem 2.4. The membership weights $\mu_x(A)$, $\mu_x(B)$, and $\mu_x(A \text{ or } B)$ of an item x with respect to concepts A, B and their disjunction ‘A or B’ are classical disjunction data if and only if

\[
\begin{aligned}
0 \le \mu_x(A) &\le \mu_x(A \text{ or } B) \le 1,\\
0 \le \mu_x(B) &\le \mu_x(A \text{ or } B) \le 1,\\
0 &\le \mu_x(A) + \mu_x(B) - \mu_x(A \text{ or } B).
\end{aligned}
\tag{2.10}
\]

A large body of experimental evidence indicates that the membership weights of exemplars with respect to concept combinations do not form classical conjunction or classical disjunction data. In particular, for the case of conjunction, the membership weight with respect to the conjunction of concepts is generally larger than the membership weight of at least one of the former concepts. This phenomenon is called “single overextension.” When it is larger than both of the former membership weights it is called “double overextension.”

In Table 2.4, we show two cases reported in [Ham88b]. Here the membership weights $\mu_{x_1}(A_1)$, $\mu_{x_1}(B_1)$, and $\mu_{x_1}(A_1B_1)$ of the item x1 = ‘coffee table’ with respect to concepts A1 = ‘Furniture,’ B1 = ‘Household Appliances,’ and their conjunction A1B1 show single overextension, and the membership weights $\mu_{x_2}(A_2)$, $\mu_{x_2}(B_2)$, and $\mu_{x_2}(A_2B_2)$ of the item x2 = ‘tree house’ with respect to concepts A2 = ‘Building,’ B2 = ‘Dwelling,’ and their conjunction A2B2 show double overextension.

The phenomenon of overextension has been demonstrated not only for membership weights, but also for typicality estimations. A famous example, known as the guppy effect, states that the typicality of ‘guppy’ with respect to ‘Pet and Fish’ is larger than its typicalities with respect to ‘Pet’ and to ‘Fish’ [SO81, SDBVMR98, Ham96]. Estimations of the applicability of relevant properties of concepts and their conjunctions exhibit the same effect. For example, talk is not a relevant property for either ‘Pet’ or ‘Bird,’ but it is for ‘Pet and Bird’ [Ham97a, Ham97c, FL96, AG05a, AG05b].
Overextensions have also been observed in experiments considering negated concepts. For example, ‘chess’ is overextended with respect to the concepts ‘Game’ and ‘not Sport’ and their conjunction [ASV14b].

Table 2.4: Experimental membership weights for exemplars x1 = ‘coffee table,’ and x2 = ‘tree house.’

          X = A1   X = B1   X = A1B1   X = A2   X = B2   X = A2B2
µx1(X)    1        0.18     0.35       -        -        -
µx2(X)    -        -        -          0.5      0.9      0.95

For the disjunction of two concepts, the analogous underextension effect also occurs. That is, semantic estimations of an exemplar with respect to the disjunction of two concepts can be smaller than the semantic estimations of the exemplar with respect to the individual concepts [Ham88a, Ham97b, Ham07].

Evidence supports the idea that overextension of conjunction and underextension of disjunction are common traits of conceptual combinations rather than mere cognitive effects. A case study in [ABGV12] reports that all 16 exemplars studied were overextended (Fig. 2.4).

Furthermore, the average seems to be a better estimator for the typicality of the conjunction than the fuzzy minimum rule. In particular, the difference between the normalized typicality estimations of the conjunction and the fuzzy minimum formula is 0.026 on average, while it is 0.011 with the average of the former concepts’ typicalities. Moreover, the correlation between the normalized data and the minimum is 0.795, while it is 0.899 for the average of the former concepts’ typicalities.

From a structural point of view, however, the average formula is still not a solid estimator because of the existence of double overextended exemplars. In Fig. 2.4, the 4th and 14th values on the x-axis are double overextended. These exemplars correspond to ‘hifi’ and ‘desk lamp’ respectively.
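The inequalities of Theorem 2.2 and the two kinds of overextension can be checked mechanically. The following sketch is illustrative only: the function names are ours, and the inputs are the membership weights reported in Table 2.4.

```python
def is_classical_conjunction(mu_a, mu_b, mu_ab):
    """Check the three conditions of Eq. (2.8) for classical conjunction data."""
    return (0 <= mu_ab <= mu_a <= 1
            and 0 <= mu_ab <= mu_b <= 1
            and mu_a + mu_b - mu_ab <= 1)


def overextension(mu_a, mu_b, mu_ab):
    """Classify conjunction data as 'none', 'single', or 'double' overextension."""
    if mu_ab > max(mu_a, mu_b):
        return "double"
    if mu_ab > min(mu_a, mu_b):
        return "single"
    return "none"


# Membership weights from Table 2.4: (mu(A), mu(B), mu(A and B)).
coffee_table = (1.0, 0.18, 0.35)   # 'Furniture', 'Household Appliances'
tree_house = (0.5, 0.9, 0.95)      # 'Building', 'Dwelling'

for name, weights in [("coffee table", coffee_table), ("tree house", tree_house)]:
    print(name, is_classical_conjunction(*weights), overextension(*weights))
```

Both items fail the classicality test, the first through single and the second through double overextension. Note that the average of the two constituent weights always lies between their minimum and maximum, so it can reproduce single overextension but never double overextension.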
Formally speaking, double overextended exemplars cannot be described by t-norms, nor by any convex combination of the former concepts’ typicalities, since both are bounded above by the larger of the two typicalities (Appendix A.2).

Figure 2.4: Normalized typicality estimations of the concepts ‘Furniture,’ ‘Household Appliance,’ and their conjunction with respect to 16 exemplars (on x-axis). The minimum and maximum of the former concepts in the combination are shown in grey lines, the typicality of the conjunction is the black line, and the average formula is the black-dashed line. Double overextended exemplars are marked by red points.

Chapter 3

The Quantum Approach to Cognitive Modeling

3.1 Quantum Physics and Quantum Structures

Quantum theory emerged at the beginning of the 20th century as a theory for microscopic phenomena that could not be explained by the classical theories of the time. These include the radiation profile of black bodies at different temperature levels and the measurement of electric currents in materials exposed to light. These two phenomena are known as the black-body problem and the photoelectric effect, respectively. Quantum theory was able to explain and incorporate these challenging phenomena into a unified representation of the microscopic realm that proposed an entirely different way of thinking.

This new perspective captured the attention of not only physicists, but also of philosophers and mathematicians. Whereas philosophers were concerned with the ontological nature of quantum entities, mathematicians focused on the development of suitable mathematical tools to describe quantum theory and its relation to other theories such as classical physics, relativity, and information theory.
The area of research that lies in the intersection of the physical, philosophical, and mathematical aspects of quantum systems is named quantum structures.

From a philosophical standpoint, the differences between classical and quantum theories are the following: in the classical theory, the outcomes we observe when performing experiments exist as concrete states in the system prior to measurement, and are deterministically obtained from measurements; in the quantum theory, the outcomes we observe when performing experiments exist in potential states prior to measurement, and the measurement acts as a context that co-determines the observed outcome in a non-deterministic manner.

These differences are best illustrated by taking a closer look at the notions used to represent physical systems. While in classical physics systems are described by particles, and are not influenced by measurements, in quantum physics systems are described by waves modeled by complex-valued functions, and measurements influence the system. Specifically, a quantum system is a superposition of waves of different wavelengths. Each wavelength represents a possible energy level for the system. Therefore, the state of superposition of a quantum system does not represent a real physical system, but embodies the potentiality of encountering different physical energy-level configurations. When a measurement is performed, a probabilistic change occurs to the superposition state. This change consists of the collapse of the superposition of waves into only one wave. Thus a measurement acts as a context that destroys the evolution of potentialities of the quantum system, and collapses the quantum system into a realistic physical configuration in which its properties can be observed.

An interesting phenomenon related to the potentiality of quantum systems is quantum interference. The states that make up the superposed state of a quantum system interact prior to measurement.
This interaction changes the probabilistic structure of the wave and, unlike classical interference, quantum interference does not require the existence of an observable flow of particles. So, quantum interference is a phenomenon that occurs for one entity, prior to measurements, and is related to the potentiality of quantum systems.

Another important difference, which comes as a result of this shift in perspective, involves the mathematical representation of a system as a collection of sub-systems. A system formed by two sub-systems A and B with state spaces $S_A$ and $S_B$ respectively is represented in classical physics by an element of the Cartesian product $S_A \times S_B$. In a Cartesian product, each sub-system is separately described within the joint system. In quantum theory, however, the states of the sub-systems are unit vectors in the Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively, and the state of the joint system is a unit vector of the tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$. In a tensor product, it is not always the case that sub-systems are described separately. Specifically, a joint system in quantum theory can exist in a state that is not the composition of the states of the two separate systems, but rather is an entangled state (see Definition 3.8). Thus joint systems in quantum theory are in general non-compositional.

A consequence of non-compositionality in quantum theory is the phenomenon of “quantum entanglement.” Namely, when two or more particles interact, their superposed states may become entangled, and thus evolve as a single emergent entity even if they become spatially separated. In particular, measuring one of the two particles will result in a collapsed state for both regardless of the distance separating them.
This was first conjectured in the celebrated Einstein-Podolsky-Rosen (EPR) paradox [EPR35], then mathematically formalized by the Bell-inequality formulation [B+64], and finally tested experimentally by Alain Aspect’s Bell-test experiment [ADR82].

In summary, quantum theory is a formal theory that is fundamentally different from its classical counterpart. The differences involve what is known as quantum structures. In particular, the central features of quantum structures are: i) they exist in states of potentiality, ii) they acquire concrete features through contextual processes, and iii) they evolve as emergent systems when combined.

3.2 Conditions of Possible Experience and Non-classical Statistics

Non-deterministic cognitive phenomena must be studied using a probabilistic model that describes a system by quantifying its tendency to behave in one way or another (Appendix A.3). The model is generally verified by observing the phenomena a large number of times through some experimental procedure. The behavioral tendencies of the system are reflected in the observed relative frequencies, also called statistics, of the experimental outcomes.

For example, consider an urn containing a large number of balls, and let us define the following measurements: $E_1$ = ‘the ball is red,’ and $E_2$ = ‘the ball is wooden.’ Note that other measurements such as

$$
\begin{aligned}
\text{not } E_1 &= \text{‘the ball is not red,’}\\
\text{not } E_2 &= \text{‘the ball is not wooden,’}\\
E_1 \cap E_2 &= \text{‘the ball is red and wooden,’}\\
E_1 \cup \text{not } E_2 &= \text{‘the ball is red or not wooden,’}
\end{aligned}
\tag{3.1}
$$

can be defined using σ-algebraic constructions.

Draw a ball from the urn, and record the result of one or more measurements. A probabilistic model for this experiment should give a consistent description of the outcomes of these measurements. Consider the probabilities $P(E_1^{\text{yes}})$, $P(E_2^{\text{yes}})$, and $P(E_1^{\text{yes}} \cap E_2^{\text{yes}})$ of obtaining the outcome ‘yes’ for the measurements $E_1$, $E_2$, and $E_1 \cap E_2$.
Then, the following consistency conditions must be satisfied:

$$P(E_1^{\text{yes}} \cap E_2^{\text{yes}}) \le \min\big(P(E_1^{\text{yes}}), P(E_2^{\text{yes}})\big), \tag{3.2}$$

$$P(E_1^{\text{yes}}) + P(E_2^{\text{yes}}) - P(E_1^{\text{yes}} \cap E_2^{\text{yes}}) \le 1. \tag{3.3}$$

Eq. (3.3) is equivalent to requiring that $P(E_1^{\text{yes}} \cup E_2^{\text{yes}})$ be well defined. The consistency conditions given in Eqs. (3.2)-(3.3) are some of the conditions of possible experience derived by George Boole to restrict the possible statistics of an experimental situation to plausible results [Boo54].

Suppose we extract 100 balls and find that 60 are red, 75 are wooden, and 32 are both red and wooden. Then, the estimated probabilities are $P(E_1^{\text{yes}}) = 0.6$, $P(E_2^{\text{yes}}) = 0.75$, and $P(E_1^{\text{yes}} \cap E_2^{\text{yes}}) = 0.32$. Note that Eq. (3.3) is not satisfied since

$$P(E_1^{\text{yes}}) + P(E_2^{\text{yes}}) - P(E_1^{\text{yes}} \cap E_2^{\text{yes}}) = 1.03 > 1. \tag{3.4}$$

Clearly, this example cannot occur for any real urn since these proportions of the balls pose a logical contradiction [Pit94].

If all properties are measurable within a single sample, then the conditions of possible experience cannot be violated, and there exists a classical probabilistic representation [Pit89]. However, not all systems allow all properties to be measured in a single sample. For example, because most measurements in quantum systems involve non-deterministic disturbances, they cannot allow for multiple measurements in a single sample. In these cases measurements are called incompatible. It may still be possible to build a probabilistic representation of the system from a subset of the complete σ-algebra of measurements using the notions of marginal and joint probability (Appendix A.3). But there are statistical situations where a detailed analysis reveals non-trivial violations even though the marginal probabilities appear to satisfy the conditions of possible experience.

The first example of this type was put forward by the mathematician Vorob’ev [Vor62].
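As a quick check of the urn example, the two conditions in Eqs. (3.2)-(3.3) can be evaluated mechanically. The following Python sketch is illustrative only: the function name `boole_conditions` is hypothetical, and exact fractions are used simply to avoid rounding issues when comparing with 1.

```python
from fractions import Fraction


def boole_conditions(p1, p2, p12):
    """Evaluate Boole's conditions (3.2) and (3.3) for two yes/no measurements.

    p1, p2 : probabilities of 'yes' for E1 and E2
    p12    : probability of 'yes' for the conjunction E1 and E2
    Returns a pair of booleans (condition 3.2 holds, condition 3.3 holds).
    """
    cond_32 = p12 <= min(p1, p2)
    cond_33 = p1 + p2 - p12 <= 1
    return cond_32, cond_33


# Counts from the urn example: 100 balls, 60 red, 75 wooden, 32 red and wooden.
n = 100
p_red, p_wooden, p_both = Fraction(60, n), Fraction(75, n), Fraction(32, n)

print(boole_conditions(p_red, p_wooden, p_both))  # (3.2) holds, (3.3) fails
print(float(p_red + p_wooden - p_both))           # left-hand side of Eq. (3.4)
```

With these counts at least 35 balls would have to be both red and wooden for Eq. (3.3) to hold, which is why the reported statistics cannot come from a real urn.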
Vorob’ev considered an abstract system with three experiments $E_1$, $E_2$, and $E_3$, each one having two outcomes $E_i^j$, for $i = 1, 2, 3$ and $j = 1, 2$. For the experiment to involve incompatible measurements, he assumed that only two out of the three experiments can be performed on each sample. We now show that the system violates the conditions of possible experience.

Consider the following marginal probabilities:

$$P(E_1^1, E_2^1) = P(E_1^2, E_2^2) = 1/2, \tag{3.5}$$
$$P(E_1^1, E_3^1) = P(E_1^2, E_3^2) = 1/2, \tag{3.6}$$
$$P(E_2^1, E_3^2) = P(E_2^2, E_3^1) = 1/2. \tag{3.7}$$

Note that Eq. (3.5) implies that

$$P(E_1^2, E_2^2, E_3^1) + P(E_1^2, E_2^2, E_3^2) = 1/2. \tag{3.8}$$

Analogously, we can use Eqs. (3.6) and (3.7) to obtain

$$P(E_1^1, E_2^1, E_3^1) + P(E_1^1, E_2^2, E_3^1) = 1/2, \tag{3.9}$$
$$P(E_1^2, E_2^1, E_3^2) + P(E_1^1, E_2^1, E_3^2) = 1/2, \tag{3.10}$$
$$P(E_1^1, E_2^2, E_3^1) + P(E_1^2, E_2^2, E_3^1) = 1/2. \tag{3.11}$$

Subtracting Eq. (3.11) from Eq. (3.9), we have

$$P(E_1^1, E_2^1, E_3^1) = P(E_1^2, E_2^2, E_3^1). \tag{3.12}$$

Hence, replacing the right-hand side of Eq. (3.12) in Eq. (3.8) yields

$$P(E_1^1, E_2^1, E_3^1) + P(E_1^2, E_2^2, E_3^2) = 1/2. \tag{3.13}$$

Now, to have a well-defined probability for these events, we also require that

$$
\begin{aligned}
&P(E_1^1, E_2^1, E_3^1) + P(E_1^1, E_2^1, E_3^2) + P(E_1^1, E_2^2, E_3^1) + P(E_1^1, E_2^2, E_3^2)\\
&\quad + P(E_1^2, E_2^1, E_3^1) + P(E_1^2, E_2^1, E_3^2) + P(E_1^2, E_2^2, E_3^1) + P(E_1^2, E_2^2, E_3^2) = 1.
\end{aligned}
\tag{3.14}
$$

Substituting Eqs. (3.13), (3.10), and (3.11) in Eq. (3.14) gives

$$P(E_1^2, E_2^1, E_3^1) + P(E_1^1, E_2^2, E_3^2) = -1/2. \tag{3.15}$$

Since probabilities cannot be negative, this shows that the conditions of possible experience are violated. Therefore, this system cannot be represented by a classical probabilistic model.

3.3 The Birth of Quantum Cognition

Some scientists and philosophers, and remarkably, among them the founding fathers of quantum theory such as Bohr [Boh63] and Heisenberg [Sch92], have recognized that the joint measurement of properties is a relevant issue for cognitive phenomena.
Indeed, if properties measured in cognitive phenomena cannot be jointly measured, then non-classical probabilistic models, and particularly quantum-probabilistic modeling, might yield more accurate results [BK99, Bor10, Ama93, Khr10, Smi03]. In these cases, it is possible that cognitive phenomena could exhibit quantum-probabilistic features.

The first example of a cognitive phenomenon exhibiting non-classical probabilistic features was put forward by Aerts in [AA97]. The example consists of an opinion poll that contains three questions, each question having only two possible answers: ‘yes’ or ‘no.’ What brings non-classicality to this situation is the fact that some participants do not have a predefined answer to the questions; rather, their answer is formed at the moment the question is posed.

To draw an analogy between this cognitive situation and the urn example in the previous section, a participant ‘forming his answer at the moment the question is posed’ would correspond to ‘a ball acquiring its color when the ball is extracted from the urn.’ Clearly, balls do not acquire their color when they are extracted from an urn, but our thought process can be influenced by a question. Similarly, quantum systems do acquire their properties when observed. This is exactly what the collapse of the wave function embodies: an enquiry itself influencing an outcome. This is why the formalism of quantum theory provides a reasonable approach to model cognitive phenomena.

The questions for the opinion poll are given in Table 3.1. Consider the case where, for each question, 50% of the participants answer ‘yes,’ and only 30% of the total participants are certain of their answers before the question is posed, with 15% ‘yes’ and 15% ‘no.’ This means that 70% of the participants form their answer when the question is posed.

A probabilistic model, known as the ε-model, was constructed to assign probabilities to the various outcomes (Fig. 3.1).
The ε-model has also been applied to study the relation between classical and quantum probabilities [Aer98, Aer96, AA97]. We can use this model to compute the probabilities for this experiment for question U as follows. Assume that the points on the perimeter of the circle represent all the possible states a participant can be in before the question is asked, and let the points u and −u represent the states where the answers ‘yes’ and ‘no’ are completely deterministic. Imagine an elastic band joining these two points, and assume that this elastic band can break at any point along the unshaded portion. When the question is asked, the point representing the participant falls sideways onto the line determined by the elastic, and the elastic breaks. We say that the participant answered ‘yes’ if the point is on the portion of the elastic that fell toward u, and ‘no’ if it is on the portion that fell toward −u. This process is shown in Fig. 3.2.

Table 3.1: Cognitive experiment revealing non-classical statistics.

    U: Are you a proponent of the use of nuclear energy?
    V: Do you think it would be a good idea to legalize soft drugs?
    W: Do you think it is better for people to live in a capitalistic system?

Figure 3.1: Graphical description of the proportion of participants with or without predetermined answers for question U [AA97].

Note that, because the elastic can only break in the unshaded portion, all points in the shaded region are deterministic. They correspond to participants with predetermined answers. Furthermore, because the elastic can break at any point in the unshaded region, the answer for the participants without a predetermined answer is obtained in a probabilistic manner. Moreover, the closer the point representing a participant is to one of the certainty regions, the more likely the process will lead to the answer the region represents.
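The elastic mechanism described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculation of [AA97]: the coordinate `x` (position of the fallen point along the elastic, with +1 at u and −1 at −u), the half-length `eps` of the breakable segment, and the uniform breaking distribution on `[-eps, eps]` are all hypothetical choices made for the example.

```python
import random


def p_yes(x, eps):
    """Probability of 'yes' for a point that fell onto the elastic at x.

    x   : coordinate in [-1, 1] along the elastic (+1 is u, -1 is -u)
    eps : half-length of the breakable (unshaded) segment [-eps, eps]
    Assumption: the break point is uniformly distributed on [-eps, eps].
    The point is pulled to u whenever the break occurs below x.
    """
    if x >= eps:       # certainty region for 'yes'
        return 1.0
    if x <= -eps:      # certainty region for 'no'
        return 0.0
    return (x + eps) / (2 * eps)


def simulate(x, eps, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability, for comparison."""
    rng = random.Random(seed)
    hits = sum(rng.uniform(-eps, eps) < x for _ in range(trials))
    return hits / trials


for x in (-0.9, -0.25, 0.0, 0.25, 0.9):
    print(x, p_yes(x, 0.5), round(simulate(x, 0.5), 3))
```

In this sketch, points beyond the breakable segment answer deterministically, and the probability of ‘yes’ grows linearly as the point approaches the ‘yes’ certainty region, which matches the qualitative behavior described in the text.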
The case in which 50% of the participants answer ‘yes’ to each question is modeled by assuming that the probability that the elastic breaks at each point is given by a distribution that is symmetric with respect to the midpoint. The probability of a given outcome is the expected value of obtaining that outcome over all the states and all the elastic breaking points. We refer to [AA97] for a detailed description of how to compute the probabilities in the ε-model.

Figure 3.2: Measurement process in the ε-model [AA97]. In a) the state of the participant prior to the question is on the circle, in b) the point falls onto the elastic, in c) the elastic breaks, and in d) the point is attached to one of the extremes, revealing the outcome.

Now consider the full experiment as described in Fig. 3.3. We can compute the relative proportions of participants with or without predetermined answers with respect to the three questions using the graphical description of the ε-model. Region 1 in the figure corresponds to the participants whose answer is ‘yes’ for question U before the question is asked, but who do not have a predetermined answer for questions V and W. Region 2 corresponds to participants with a predetermined answer ‘yes’ for questions U and V, but no predetermined answer for W, and so on.

The conditional probabilities of getting an answer, given that we know the answer to another question, can now be computed. The conditional probability of obtaining the answer U = ‘yes’ given that V = ‘yes,’ denoted by P(U = ‘yes’ | V = ‘yes’), is obtained by estimating how likely it is that participants in regions 2, 3, and 4 are attached to point u after the measurement process. Since they are deterministic, all points in region 2 will be attached to u. We refer to [AA97] for the trigonometric calculations required to compute the likelihood that the points in regions 3 and 4 lead to u after the measurement process.
The conditional probabilities of all other outcomes can be calculated from the respective deterministic and non-deterministic regions as shown in Fig. 3.3. In particular, we have that

$$P(U = \text{‘yes’} \mid V = \text{‘yes’}) = 0.78, \tag{3.16}$$
$$P(U = \text{‘yes’} \mid W = \text{‘yes’}) = 0.22, \tag{3.17}$$
$$P(V = \text{‘yes’} \mid W = \text{‘yes’}) = 0.78. \tag{3.18}$$

Figure 3.3: Graphical description of the proportion of participants with or without predetermined answers to the three questions U, V, and W [AA97].

The conditional probabilities given in Eqs. (3.16)-(3.18) cannot be represented by a classical statistical model [AA97]. To prove this, denote the outcomes U = ‘yes,’ V = ‘yes,’ W = ‘yes’ by $U_+$, $V_+$, and $W_+$ respectively. Analogously, denote U = ‘no,’ V = ‘no,’ W = ‘no’ by $U_-$, $V_-$, and $W_-$ respectively. Since the probability of a ‘yes’ outcome for each question is equal to 0.5, the marginal probabilities are

$$P(U_+) = P(V_+) = P(W_+) = 0.5. \tag{3.19}$$

Bayes’ rule (Eq. (2.4)) gives

$$P(U_+ \mid W_+) = \frac{P(U_+ \cap W_+)}{P(W_+)}. \tag{3.20}$$

Using this fact together with Eqs. (3.17) and (3.19), we obtain

$$P(U_+ \cap W_+) = P(U_+ \cap V_- \cap W_+) + P(U_+ \cap V_+ \cap W_+) = 0.11. \tag{3.21}$$

A similar procedure applied to Eq. (3.18) yields

$$P(V_+ \cap W_+) = P(U_- \cap V_+ \cap W_+) + P(U_+ \cap V_+ \cap W_+) = 0.39. \tag{3.22}$$

Subtracting Eq. (3.21) from Eq. (3.22), we obtain

$$P(U_- \cap V_+ \cap W_+) = 0.28 + P(U_+ \cap V_- \cap W_+). \tag{3.23}$$

On the other hand, Eq. (3.16) implies

$$P(U_- \mid V_+) = 1 - P(U_+ \mid V_+) = 0.22. \tag{3.24}$$

Using Bayes’ rule, and repeating the previous decomposition, yields

$$P(U_- \cap V_+ \cap W_+) = 0.11 - P(U_- \cap V_+ \cap W_-). \tag{3.25}$$

Finally, we combine Eqs. (3.23) and (3.25) to obtain

$$P(U_+ \cap V_- \cap W_+) + P(U_- \cap V_+ \cap W_-) = -0.17. \tag{3.26}$$

Since probabilities cannot be negative, we have shown that this system cannot be represented by a classical probabilistic model. It can, however, be
represented by the non-classical ε-model.

The discovery of non-classical statistical results in this cognitive phenomenon inspired the development of quantum models for multiple cognitive phenomena including decision making [BB12], psychology of categorization [Aer09], human memory [BKNM09], and finance [HK13], among others [Khr10, PB13]. The use of quantum probability, and of quantum-inspired models for cognitive systems, is an emerging area of research known as Quantum Cognition [BBG13].

3.4 Fundamentals of Quantum Modeling in Cognition

This section introduces some mathematical elements of standard quantum theory and shows how they can be applied to cognition.

In quantum theory, the state of a quantum entity is described by a complex-valued vector of unit length. Vectors are denoted using the bra-ket notation introduced by Paul Dirac [Dir39]. In Dirac notation, there are two kinds of vectors: ‘bra’ vectors denoted by $\langle A|$, and ‘ket’ vectors denoted by $|A\rangle$. By convention, the state of a quantum entity is described by a ‘ket’ vector.

Definition 3.1. Let $\alpha, \beta \in \mathbb{C}$. Consider the vectors $\langle A|$ and $|B\rangle$. The bra-ket operation defined by the inner product $\langle A|B\rangle$ is
1. linear in the ket: $\langle A|\big(\alpha|B\rangle + \beta|C\rangle\big) = \alpha\langle A|B\rangle + \beta\langle A|C\rangle$, and
2. anti-linear in the bra: if $|D\rangle = \alpha|A\rangle + \beta|B\rangle$, then $\langle D|C\rangle = \alpha^*\langle A|C\rangle + \beta^*\langle B|C\rangle$.

We say that $|A\rangle$ and $|B\rangle$ are orthogonal if and only if $\langle A|B\rangle = 0$, and we denote this by $|A\rangle \perp |B\rangle$. Additionally, $\langle A|B\rangle$ is the complex conjugate of $\langle B|A\rangle$, that is,

$$\langle A|B\rangle = \langle B|A\rangle^*. \tag{3.27}$$

Definition 3.2. The bra-ket operation induces the norm $\|\cdot\| = \sqrt{\langle \cdot|\cdot\rangle}$.

The space of complex-valued vectors representing the possible states of a quantum entity, equipped with the bra-ket operation and its induced norm, is called a Hilbert space, denoted by $\mathcal{H}$. The formalism of quantum theory
is built upon the mathematics of Hilbert spaces [RS80].

Measurable quantities of a system, known as observables in quantum theory, are represented by self-adjoint operators on the Hilbert space. We focus on a special kind of self-adjoint operator, known as an orthogonal projector, which is used to represent quantum measurements corresponding to questions whose possible outcomes are ‘yes’ and ‘no.’

Definition 3.3. Let $|A\rangle, |B\rangle \in \mathcal{H}$, and let $M : \mathcal{H} \to \mathcal{H}$ be an operator defined by $|A\rangle \mapsto M|A\rangle$. $M$ is an orthogonal projector if and only if it is
1. Linear: for $\alpha, \beta \in \mathbb{C}$ we have $M(\alpha|A\rangle + \beta|B\rangle) = \alpha M|A\rangle + \beta M|B\rangle$,
2. Self-adjoint: $\langle A|M|B\rangle = \langle B|M|A\rangle^*$, and
3. Idempotent: $M \cdot M = M$.

An orthogonal projector induces a subspace of $\mathcal{H}$ representing the states whose outcome is ‘yes.’ This subspace is given by

$$\mathcal{H}_M = \{M|A\rangle : |A\rangle \in \mathcal{H}\}. \tag{3.28}$$

The probability of obtaining an outcome ‘yes’ for a measurement is given by the extent to which the state $|A\rangle$ belongs to $\mathcal{H}_M$. This is formalized by the Born rule of probability [ST85].

Definition 3.4. Let $|A\rangle$ be the state of an entity A, and $M_x$ be an orthogonal projector associated to a question x. The probability of an answer ‘yes’ to the question x is given by

$$\mu_x(A) = \langle A|M_x|A\rangle. \tag{3.29}$$

For the probabilistic structure to remain valid after a measurement, the state vector must be renormalized so that it is still a unit vector.

Definition 3.5. Let $|A\rangle$ be the state of an entity A, and $M$ be an orthogonal projector. The state vector after measurement is given by

$$|A_M\rangle = \frac{M|A\rangle}{\sqrt{\langle A|M|A\rangle}}. \tag{3.30}$$

Eq. (3.30), known as the projection postulate (also known in the literature as the collapse postulate), ensures that when a physical quantity is measured two consecutive times, the result encountered in the first measurement is obtained with probability 1 in the second measurement.

The probability of an outcome ‘no’ is given in quantum theory by the orthogonal projector $M^\perp$ defined by

$$M^\perp = 1 - M. \tag{3.31}$$

Hence, the probability of obtaining the outcome ‘no’ is

$$\mu(\text{Not } A) = \langle A|M^\perp|A\rangle = 1 - \langle A|M|A\rangle. \tag{3.32}$$

Because $M$ is idempotent and $M^\perp = 1 - M$, we also have

$$M M^\perp = M(1 - M) = M - M^2 = 0. \tag{3.33}$$

When we consider two different measurements, the operator representing the successive application of these two measurements is not necessarily a measurement, since the order in which they are performed might lead to different results. In terms of operators, this corresponds to non-commutativity.

Definition 3.6. Let $M_1$ and $M_2$ be two operators. We say that $M_1$ and $M_2$ represent compatible measurements if and only if the commutator satisfies

$$[M_1, M_2] = M_1 M_2 - M_2 M_1 = 0. \tag{3.34}$$

Otherwise, the operators represent incompatible measurements.

So far, we have not considered the internal structure of states. In quantum theory, the set of states is linearly closed. This means that every linear combination of two states that corresponds to a unit vector is also a state. For example, if a quantum system can exist in two different states, $|A\rangle$ and $|B\rangle$, then it can also exist in the superposed state

$$|AB\rangle = z_1|A\rangle + z_2|B\rangle, \tag{3.35}$$

with $z_1, z_2 \in \mathbb{C}$ and $|z_1|^2 + |z_2|^2 = 1$ (for orthogonal $|A\rangle$ and $|B\rangle$, this condition makes $|AB\rangle$ a unit vector). When a measurement $M_x$ is applied to a superposed state, the probability $\mu_x(AB)$ of obtaining an outcome ‘yes’ is given by

$$
\begin{aligned}
\mu_x(AB) &= \big(z_1^*\langle A| + z_2^*\langle B|\big) M_x \big(z_1|A\rangle + z_2|B\rangle\big)\\
&= |z_1|^2\langle A|M_x|A\rangle + |z_2|^2\langle B|M_x|B\rangle + 2\Re(z_1 z_2^*)\langle A|M_x|B\rangle\\
&= |z_1|^2\mu_x(A) + |z_2|^2\mu_x(B) + 2\Re(z_1 z_2^*)\langle A|M_x|B\rangle,
\end{aligned}
\tag{3.36}
$$

where $\Re(z)$ denotes the real part of $z$, and $\langle A|M_x|B\rangle$ is taken to be real (in general, the last term reads $2\Re\big(z_1^* z_2\langle A|M_x|B\rangle\big)$). So, the probability of an outcome ‘yes’ for the measurement represented by $M_x$ is the weighted sum of the probabilities of the former events, ‘yes’ on state $|A\rangle$ and ‘yes’ on state $|B\rangle$, together with an interference term.

To understand why this term is called an “interference,” we rewrite the last equation using the polar notation of complex numbers: $z_i = r_i e^{i\theta_i}$, for $i = 1, 2$. Eq. (3.36) becomes

$$
\begin{aligned}
\mu_x(AB) &= \big(r_1 e^{-i\theta_1}\langle A| + r_2 e^{-i\theta_2}\langle B|\big) M_x \big(r_1 e^{i\theta_1}|A\rangle + r_2 e^{i\theta_2}|B\rangle\big)\\
&= r_1^2 e^{i(\theta_1 - \theta_1)}\mu_x(A) + r_2^2 e^{i(\theta_2 - \theta_2)}\mu_x(B) + 2 r_1 r_2 \cos(\theta_1 - \theta_2)\langle A|M_x|B\rangle\\
&= r_1^2\mu_x(A) + r_2^2\mu_x(B) + 2 r_1 r_2 \cos(\theta_1 - \theta_2)\langle A|M_x|B\rangle.
\end{aligned}
\tag{3.37}
$$

In this representation, the interference term corresponds to the product of the weights $r_1$ and $r_2$, the cosine of the phase difference $\theta_1 - \theta_2$ between the states, and the inner product of $|A\rangle$ and $|B\rangle$ restricted to $\mathcal{H}_{M_x}$. If only one state is under consideration, the phase angle plays no role in the probability of a given measurement. However, when measurements are performed on superposed states, the interplay between the relative phases induces either positive or negative interference. Extreme positive or negative interference is reached when $\theta_2 - \theta_1 = 0$ or $\theta_2 - \theta_1 = \pi$ respectively.

In quantum cognition, cognitive tasks are modeled by representing semantic estimations as probabilistic events [AGS13]. Thus, a quantum cognitive model considers a concept A, whose state is represented by a unit vector $|A\rangle$, and semantic estimations are modeled by orthogonal projections on a Hilbert space. Let $M_x$ be a semantic estimation: $\mathcal{H}_{M_x}$ represents the space of states of the concept whose measurement outcome is ‘yes.’ Therefore, the probability of having an answer ‘yes’ to a certain semantic estimation is obtained by the Born rule of probability:

$$\mu_x(A) = \langle A|M_x|A\rangle. \tag{3.38}$$

These outcome probabilities are generally called “weights” in the cognitive-science literature.
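The Born rule and the interference formula of Eq. (3.37) can be checked numerically. The NumPy sketch below is only an illustration: the states `A` and `B`, the vector `v` spanning the ‘yes’ subspace, and the function names `born` and `superposed_weight` are hypothetical choices, and the projector is picked so that $\langle A|M|B\rangle$ is real, as Eq. (3.37) implicitly assumes.

```python
import numpy as np


def born(state, projector):
    """Born rule, Eq. (3.29): <state|M|state>, real for a self-adjoint M."""
    return float(np.real(np.vdot(state, projector @ state)))


# Hypothetical orthonormal states |A>, |B> in C^3, chosen only for illustration.
A = np.array([1, 0, 0], dtype=complex)
B = np.array([0, 1, 0], dtype=complex)

# Hypothetical 'yes' subspace: the line spanned by v, with M = |v><v|.
v = np.array([1, 1, 1], dtype=complex) / np.sqrt(3)
M = np.outer(v, v.conj())


def superposed_weight(r1, r2, theta1, theta2):
    """Return (direct Born-rule value, value from Eq. (3.37)) for z1|A> + z2|B>."""
    z1, z2 = r1 * np.exp(1j * theta1), r2 * np.exp(1j * theta2)
    AB = z1 * A + z2 * B
    direct = born(AB, M)
    cross = np.vdot(A, M @ B).real  # <A|M|B>, real for this choice of M
    formula = (r1**2 * born(A, M) + r2**2 * born(B, M)
               + 2 * r1 * r2 * np.cos(theta1 - theta2) * cross)
    return direct, float(formula)


r = 1 / np.sqrt(2)
for phase in (0.0, np.pi / 2, np.pi):
    print(phase, superposed_weight(r, r, 0.0, phase))
```

In this example both individual weights equal 1/3. With equal amplitudes and no phase difference the superposed weight rises to 2/3, with a phase difference of π/2 it equals the average 1/3, and with a phase difference of π it vanishes, which illustrates positive, absent, and negative interference.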
For example, if $M_x$ represents a membership estimation for a certain exemplar x, then $\mu_x(A)$ represents the membership weight of x with respect to the concept A being in the state $|A\rangle$ [Aer09].

Philosophers and cognitive scientists have on many occasions proposed that non-logical processes such as intuitive or unconscious thinking could be understood in terms of superposition, interference, or incompatibility, but have not given a formal account of how these notions operate [EF09, Tha97, Kih87, Smi03]. Quantum cognition is the first approach to incorporate these ideas into mathematical models. In particular, superposed states can be used to represent uncertainty [AS11a]; interference is a mechanism for non-logical cognitive coherence [Aer09]; and incompatible measurements correspond to two consecutive cognitive actions where the first action serves as a context for the second action [WB13, BW07].

3.5 Quantum Cognitive Models and Cognitive Challenges

A quantum model for a cognitive phenomenon requires the specification of the entities at play, their state spaces, and the operators used to represent measurements. Once this is determined, the mathematical framework of quantum theory is used to compute the probabilities of the measurement outcomes. In this section, we show how a quantum cognitive framework is successful at modeling phenomena that cannot be explained within traditional approaches. In particular, we outline the quantum models developed to resolve the challenges presented in Chapter 2.

3.5.1 The Conjunction Fallacy as Incompatibility

To model the conjunction fallacy, we use a Hilbert space $\mathcal{H}$ and a state vector $|A\rangle \in \mathcal{H}$ that represents the belief state after reading Linda’s story [Fra09, BPFT11]. Next, the event ‘yes’ to questions a), b), and c) is represented by the subspaces corresponding to the projectors $M_1$, $M_2$, and the operator $M_1 M_2$ respectively.
The key assumption is that $M_1$ does not commute with $M_2$.

First, we expand the term representing the probability of event b) so we can compare it with the probability of event c):

$$
\begin{aligned}
\langle A|M_2|A\rangle &= \langle A(M_1 + M_1^\perp)|M_2|(M_1 + M_1^\perp)A\rangle\\
&= \langle AM_1|M_2|M_1A\rangle + \langle AM_1^\perp|M_2|M_1A\rangle\\
&\quad + \langle AM_1|M_2|M_1^\perp A\rangle + \langle AM_1^\perp|M_2|M_1^\perp A\rangle.
\end{aligned}
\tag{3.39}
$$

If $M_1$ and $M_2$ commute, and since $M_1 M_1^\perp = 0$, we have

$$\langle AM_1^\perp|M_2|M_1A\rangle = \langle AM_1|M_2|M_1^\perp A\rangle = 0. \tag{3.40}$$

Therefore,

$$\langle A|M_2|A\rangle = \langle AM_1|M_2|M_1A\rangle + \langle AM_1^\perp|M_2|M_1^\perp A\rangle. \tag{3.41}$$

Otherwise, set

$$\delta = \langle AM_1|M_2|M_1^\perp A\rangle + \langle AM_1^\perp|M_2|M_1A\rangle. \tag{3.42}$$

The incompatibility term $\delta$ (see footnote 5) accounts for the conjunction fallacy as follows. Note that the story does not suggest that Linda is a bank teller; in fact, it is somewhat likely that she is not a bank teller [BPFT11]. Hence, we can assume

$$\langle AM_1^\perp|M_2|M_1^\perp A\rangle \ge 0. \tag{3.43}$$

Choose $\delta$ to be negative, and such that

$$\delta + \langle AM_1^\perp|M_2|M_1^\perp A\rangle < 0. \tag{3.44}$$

Then, Eq. (3.39) becomes

$$\langle A|M_2|A\rangle = \langle AM_1|M_2|M_1A\rangle + \delta + \langle AM_1^\perp|M_2|M_1^\perp A\rangle < \langle AM_1|M_2|M_1A\rangle. \tag{3.45}$$

This shows that by incorporating a non-commutative term, it is possible to model a situation where the probability of the conjunction of two events is greater than the probability of one of the events. A concrete example for the state $|A\rangle$ and the operators $M_1$ and $M_2$ has been constructed on a complex Hilbert space of dimension 3 to model empirical data collected on the conjunction fallacy [Fra09].

Footnote 5: In [Fra09], this term is called interference, but since it is related to non-commutativity of measurements rather than superposition of states, we refer to it here as an incompatibility term.

The same incompatibility term can account for order effects [WB13]. In particular, the quantum model can be used to derive a property of order effects that has no counterpart in the classical theories.

Let $M_F$ and $M_H$ be projectors representing two questions F and H whose outcomes ‘yes’ and ‘no’ are represented by $F_y$, $F_n$, $H_y$, and $H_n$ respectively.
Let $\mu$ be a function that measures the probability for the outcomes of these questions. For example, $\mu(F_n)$ is the probability of obtaining the outcome ‘no’ to question F, and $\mu(F_yH_n)$ is the probability of obtaining ‘yes’ to question F, and then ‘no’ to question H.

For each question, the order effect reflects how having the other question as a prior influences the statistics of the outcomes. For example, the order effects for F and H with respect to the answer ‘yes’ are respectively measured by

$$
\begin{aligned}
I_F &= \mu(H_yF_y) + \mu(H_nF_y) - \mu(F_y),\\
I_H &= \mu(F_yH_y) + \mu(F_nH_y) - \mu(H_y).
\end{aligned}
\tag{3.46}
$$

To obtain the probability of consecutive measurements in the quantum model, we compute the probability of the outcome ‘yes’ for the first question, renormalize using Eq. (3.30) to obtain the post-measurement state, and then reapply the Born rule to the post-measurement state to compute the probability of obtaining the outcome ‘yes’ for the second question. For example, with F asked first and H second, $\mu(F_yH_y)$ is given by

$$
\begin{aligned}
\mu(F_yH_y) &= \langle A|M_F|A\rangle\,\langle A_{M_F}|M_H|A_{M_F}\rangle\\
&= \langle A|M_F|A\rangle\left(\frac{1}{\langle A|M_F|A\rangle}\langle AM_F|M_H|M_FA\rangle\right)\\
&= \langle AM_F|M_H|M_FA\rangle.
\end{aligned}
\tag{3.47}
$$

Similarly,

$$
\begin{aligned}
\mu(F_nH_y) &= \langle AM_F^\perp|M_H|M_F^\perp A\rangle,\\
\mu(H_yF_y) &= \langle AM_H|M_F|M_HA\rangle,\\
\mu(H_nF_y) &= \langle AM_H^\perp|M_F|M_H^\perp A\rangle.
\end{aligned}
\tag{3.48}
$$

Substituting these in Eq. (3.46) gives the order effects in terms of measurement operators:

$$I_F = \langle AM_H|M_F|M_HA\rangle + \langle AM_H^\perp|M_F|M_H^\perp A\rangle - \langle A|M_F|A\rangle, \tag{3.49}$$
$$I_H = \langle AM_F|M_H|M_FA\rangle + \langle AM_F^\perp|M_H|M_F^\perp A\rangle - \langle A|M_H|A\rangle. \tag{3.50}$$

Note that the probability of the outcome ‘yes’ to a question is experimentally obtained from the statistics of the experiment when the question is asked first. Hence

$$
\begin{aligned}
\langle A|M_F|A\rangle &= \mu(F_yH_y) + \mu(F_yH_n),\\
\langle A|M_H|A\rangle &= \mu(H_yF_y) + \mu(H_yF_n).
\end{aligned}
\tag{3.51}
$$

Since $M_i^\perp = 1 - M_i$, for $i = F, H$, the cross terms involving $M_F$ and $M_F^\perp$ can be written as

$$\langle AM_F^\perp|M_H|M_FA\rangle = \langle A|M_HM_F|A\rangle - \langle AM_F|M_H|M_FA\rangle, \tag{3.52}$$
$$\langle AM_F|M_H|M_F^\perp A\rangle = \langle A|M_FM_H|A\rangle - \langle AM_F|M_H|M_FA\rangle. \tag{3.53}$$

We add Eqs. (3.52) and (3.53), and use the facts that $M_FM_H = (M_H^*M_F^*)^*$ and that $M_F$ and $M_H$ are self-adjoint, to write

$$\langle AM_F^\perp|M_H|M_FA\rangle + \langle AM_F|M_H|M_F^\perp A\rangle = 2\Re\big(\langle A|M_HM_F|A\rangle\big) - 2\langle AM_F|M_H|M_FA\rangle. \tag{3.54}$$

A similar procedure yields

$$\langle AM_H^\perp|M_F|M_HA\rangle + \langle AM_H|M_F|M_H^\perp A\rangle = 2\Re\big(\langle A|M_HM_F|A\rangle\big) - 2\langle AM_H|M_F|M_HA\rangle. \tag{3.55}$$

Since Eqs. (3.54) and (3.55) share the term $2\Re(\langle A|M_HM_F|A\rangle)$, their combination yields

$$
\begin{aligned}
&\langle AM_F^\perp|M_H|M_FA\rangle + \langle AM_F|M_H|M_F^\perp A\rangle + 2\langle AM_F|M_H|M_FA\rangle\\
&\quad = \langle AM_H^\perp|M_F|M_HA\rangle + \langle AM_H|M_F|M_H^\perp A\rangle + 2\langle AM_H|M_F|M_HA\rangle.
\end{aligned}
\tag{3.56}
$$

From the properties of the quantum model, Eq. (3.56) can be used to derive the following identity:

$$\mu(F_yH_n) + \mu(F_nH_y) = \mu(H_yF_n) + \mu(H_nF_y). \tag{3.57}$$

Eq. (3.57) is known as the Quantum Question equality (QQ-equality). This probabilistic identity cannot be obtained from a classical probability theory because it requires the manipulation of the incompatibility terms $\langle AM_F|M_H|M_F^\perp A\rangle$ and $\langle AM_H|M_F|M_H^\perp A\rangle$. Notably, the QQ-equality has been confirmed in a statistical analysis containing more than seventy surveys, each survey having between three hundred and two thousand participants [WSSB14]. Therefore, the QQ-equality presents evidence that quantum models are necessary to accurately represent non-classical aspects of cognitive phenomena.

3.5.2 Overextension and Underextension as Interference

Collected data on concept conjunction and disjunction that is not classical (Defs. 2.1–2.3) can often be explained in terms of state superposition and interference. Consider the concepts A, B, and a concept combination AB, which can be either the conjunction A and B or the disjunction A or B. These are represented by states $|A\rangle$, $|B\rangle$, and $|AB\rangle$ respectively. Let x be an exemplar, and let $M$ represent the semantic estimation that measures the membership of x with respect to the concepts A, B, and their combination AB. Now, assume $|A\rangle \perp |B\rangle$, and choose the following state for the combined concept:

$$|AB\rangle = \frac{1}{\sqrt{2}}\big(|A\rangle + |B\rangle\big). \tag{3.58}$$

With this choice, the membership weight of exemplar x with respect to the conjunction or disjunction of concepts A and B is given by

$$\mu(AB) = \frac{1}{2}\langle A + B|M|A + B\rangle = \frac{\mu(A) + \mu(B)}{2} + \Re\langle A|M|B\rangle. \tag{3.59}$$

The membership weight $\mu(AB)$ corresponds to the sum of the average of $\mu(A)$ and $\mu(B)$, and an interference term that depends on the way vectors $|A\rangle$ and $|B\rangle$ project onto $\mathcal{H}_M$.

The quantum probability formula in Eq. (3.59) has been applied to model overextension and underextension of semantic estimations for concept conjunction and disjunction reported in [Ham88b, Ham96, Ham88a, Ham97b]. It is interesting to note that deviations from classical models found for both connectives can be explained in terms of the same model. Indeed, in the absence of interference, that is when $\langle A|M|B\rangle = 0$, the probability formula is reduced to the average of the former probabilities. Therefore, if the membership weights are not equal, the value given by Eq. (3.59) is singly overextended and singly underextended even in the absence of interference [ABGV12]. However, different phase angles have to be chosen for each connective in order to give a precise account of the experimental data. In general, positive interference is needed to account for overextension, and negative interference is needed to account for underextension [Aer09].

3.5.3 Ellsberg and Machina Paradoxes

Both the Ellsberg and Machina paradoxes can be modeled using similar methods [AS11b, AS11a, AST12]. The idea is to model the subject’s uncertainty about the number of balls of each type in the urn by a superposition of possible urn states. For the Ellsberg paradox, let $|r\rangle$, $|y\rangle$, and $|b\rangle$ be three orthogonal vectors representing the existence of red, yellow, and black balls respectively, and let the projectors $M_r$, $M_y$, and $M_b$ represent the events of extracting a red, yellow, or black ball from the urn.
Consider a pure Ellsberg state reflecting a definite belief about the numbers of balls in the urn, represented by

$$|p\rangle = \frac{1}{\sqrt{3}} e^{i\alpha}|r\rangle + \rho_y e^{i\beta}|y\rangle + \rho_b e^{i\gamma}|b\rangle, \tag{3.60}$$

where $\rho_y^2 + \rho_b^2 = \frac{2}{3}$. Note that the probability that a red ball is extracted given this pure Ellsberg state is

$$\langle p|M_r|p\rangle = \frac{1}{3}. \tag{3.61}$$

Similarly, the probability of extracting either a yellow or a black ball is $\frac{2}{3}$.

To represent the uncertainty about the number of yellow and black balls, we introduce an ambiguous Ellsberg state modeled by the superposition of pure states as follows:

$$|s\rangle = \sum_{i=1}^{n} a_i e^{i\theta_i}|p_i\rangle, \tag{3.62}$$

where $\sum_{i=1}^{n} a_i^2 = 1$. This superposition of states produces an interference term in the probability formula that accounts for the Ellsberg paradox [AST12]. In particular, the simplest superposition choice, involving the superposition of only yellow versus only black balls, is sufficient. Set

$$|s\rangle = a_1 e^{i\theta_1}|p_1\rangle + a_2 e^{i\theta_2}|p_2\rangle, \quad \text{where} \tag{3.63}$$

$$a_1^2 + a_2^2 = 1, \tag{3.64}$$

with $|p_1\rangle$ and $|p_2\rangle$ given by

$$\begin{aligned}
|p_1\rangle &= \frac{1}{\sqrt{3}}\left(e^{i\alpha_1}|r\rangle + \sqrt{2}\,e^{i\beta_1}|y\rangle\right), \\
|p_2\rangle &= \frac{1}{\sqrt{3}}\left(e^{i\alpha_2}|r\rangle + \sqrt{2}\,e^{i\beta_2}|b\rangle\right),
\end{aligned} \tag{3.65}$$

for

$$\cos(\alpha_1 - \alpha_2 + \theta_1 - \theta_2) = 0. \tag{3.66}$$

Eq. (3.64) is set to ensure that $|s\rangle$ is a unit vector, and Eq. (3.66) ensures that the probability to extract a red ball is $\frac{1}{3}$. Indeed,

$$\begin{aligned}
\langle s|M_r|s\rangle &= \left(a_1 e^{-i\theta_1}\langle p_1| + a_2 e^{-i\theta_2}\langle p_2|\right) M_r \left(a_1 e^{i\theta_1}|p_1\rangle + a_2 e^{i\theta_2}|p_2\rangle\right) \\
&= \frac{1}{3}(a_1^2 + a_2^2) + \frac{2}{3} a_1 a_2 \cos(\alpha_1 - \alpha_2 + \theta_1 - \theta_2) \\
&= \frac{1}{3}(a_1^2 + a_2^2) = \frac{1}{3}.
\end{aligned} \tag{3.67}$$

Different choices of $a_1$, $a_2$, and the phase angles $\theta_i$, $\alpha_i$, and $\beta_i$, for $i = 1, 2$, lead to different kinds of reasoning about the Ellsberg paradox. Therefore, interference effects induced by the superposed state explain the deviations from Savage's sure thing principle [AST12].

To model the Machina paradox we introduce orthogonal states $|i\rangle$, for $i = 1, \ldots, 4$, representing the existence of balls of each kind, and orthogonal projectors on each dimension $M_i$, $i = 1, \ldots, 4$, representing the bet on a certain kind of ball.
Next, we introduce a pure Machina state

$$|p\rangle = \sum_{i=1}^{4} \rho_i e^{i\alpha_i}|i\rangle \tag{3.68}$$

to represent a distribution of balls, and impose the following consistency constraints:

$$\rho_1^2 + \rho_2^2 = 50, \quad \text{and} \quad \rho_3^2 + \rho_4^2 = 51. \tag{3.69}$$

Finally, the Machina state

$$|s\rangle = \sum_{k=1}^{n} a_k e^{i\theta_k}|p_k\rangle \tag{3.70}$$

represents the ambiguity in terms of state superposition. The interference produced in the probability formula by a simple superposition that considers only extreme distributions accounts for the results of the Machina paradox. Thus, the quantum model is the only approach that provides a unified view of the Ellsberg and Machina paradoxes [AST12].

3.6 Entanglement of Conceptual Combinations

We have shown in §3.5 that the basic structures of the quantum framework can successfully be applied to cognition. The interesting question is the extent to which the tools developed for quantum theory can be applied to cognition. In particular, can we identify other characteristics of quantum theory in the field of cognition? In this section, we provide evidence for a positive answer. Namely, we show that the phenomenon of entanglement arises in the modeling of concept combinations.

3.6.1 Quantum Entanglement

Quantum formalism assumes that quantum systems may exist in superposed states. If a quantum system is formed by the composition of sub-systems, then each sub-system exists in its own superposed state, but the behavior of the emerging system may correspond to that of a non-decomposable entity. In particular, when we perform measurements on the sub-systems of a composite quantum system, the results reveal that the sub-systems may not behave independently, even if the sub-systems are separated by a large distance. In fact, when the system is analyzed, it is possible to encounter states that exhibit non-trivial correlations in the outcomes of their measurements.
These states, called entangled states, have inspired some of the most important applications of quantum theory [HHHH09].

Consider for example a composite quantum system C obtained by composing two separate quantum systems $C_1$ and $C_2$. Formally, the composition of quantum entities corresponds to an element in the tensor product space⁶ $H \otimes H$.

Definition 3.7. Let $\{|A_i\rangle\}$ form a basis for $H$. Then, a composite vector $|C\rangle \in H \otimes H$ is given by

$$|C\rangle = \sum_{i,j=1}^{n} c_{ij}\,|A_i\rangle \otimes |A_j\rangle. \tag{3.71}$$

Definition 3.8. Let $|C\rangle \in H \otimes H$. If $|C\rangle$ can be factorized as $|C\rangle = |C_1\rangle \otimes |C_2\rangle$, where $|C_1\rangle \in H$ and $|C_2\rangle \in H$, we say that $|C\rangle$ is a separable tensor, also known as a product vector. Otherwise, $|C\rangle$ is a non-separable tensor, representing an entangled state.

Because separable tensors can be represented as ordered pairs, when a measurement is performed on one of the sub-systems, the collapse of the wave function induced by the measurement occurs only at the measured sub-system. The other sub-system remains in its original state. When a measurement is performed on a non-separable tensor, the collapse of the wave function induced by the measurement will affect both sub-systems.

An example of an entangled state is given by the famous Einstein-Podolsky-Rosen state [EPR35]. Let A be an entity whose possible states are $|A_1\rangle$ and $|A_2\rangle$, and let B be another entity whose possible states are $|B_1\rangle$ and $|B_2\rangle$, where $|A_1\rangle \perp |A_2\rangle$ and $|B_1\rangle \perp |B_2\rangle$. Consider the composite entity C represented by

$$|C\rangle = \frac{1}{\sqrt{2}}\left(|A_1\rangle \otimes |B_1\rangle + |A_2\rangle \otimes |B_2\rangle\right). \tag{3.72}$$

The state $|C\rangle$ is non-separable because it cannot be decomposed as the product of two state vectors $|C_1\rangle, |C_2\rangle \in H$.

From a probabilistic perspective, the correlations obtained when measuring entangled states are incompatible with classical probabilistic models that assume independent measurements. We can test whether a statistical situation can be described by a classical probabilistic model using Bell-like inequalities [B+64].
These inequalities are analogous to the conditions of possible experience presented in §3.2, except that they are based on aggregate probabilistic indicators such as correlations and expected values rather than joint probabilities.

⁶ We assume here a particular form of joint quantum system that is useful for our purposes.

It has been proposed that the quantum description of joint entities is a suitable framework to describe concept combinations. In particular, entangled states can be used to model non-trivial semantic correlations between the concepts that form the combination in the combination itself [AS11c, DCGL+10, HS09, VAZ11, WBAP13, SMR13]. We now present an experimental verification of these non-trivial semantic correlations in concept combinations [AGS13].

3.6.2 Psychological Evidence of Conceptual Entanglement

An abstract formulation to test quantum entanglement in concept combinations was presented in [AABG00]. Consider two entities A and B with two measurements each, every measurement having two possible outcomes. We denote these measurements and their outcomes by $M_A = \{A_1, A_2\}$ and $M_{A'} = \{A'_1, A'_2\}$ for entity A, and $M_B = \{B_1, B_2\}$ and $M_{B'} = \{B'_1, B'_2\}$ for entity B. Next, we define the composed measurement $M_{XY} \in \{M_{AB}, M_{A'B}, M_{AB'}, M_{A'B'}\}$ and associate the value $1$ to the outcomes $X_1 Y_1$ and $X_2 Y_2$, and the value $-1$ to the outcomes $X_1 Y_2$ and $X_2 Y_1$.

If we perform the experiment XY a large number of times, we can estimate the expected value $E(M_{XY})$ of each composed experiment. A Bell-like inequality, named the Clauser-Horne-Shimony-Holt (CHSH) inequality, can be used to test the statistics. The CHSH inequality states that if

$$-2 \le E(M_{A'B'}) + E(M_{A'B}) + E(M_{AB'}) - E(M_{AB}) \le 2 \tag{3.73}$$

is violated, then no classical probability model exists for the considered joint experiments [AF82].
Additionally, if the marginal law of probability is satisfied (see Appendix A.3), then the entities are entangled [DK14].

A cognitive experiment in [AS11c, AS14] confirmed that semantic dependencies of concept combinations can violate the CHSH inequality. For example, let the entities A and B refer to the concepts Animal and Acts, respectively. Let $M_A$ and $M_{A'}$ be two measurements for concept A, and $M_B$ and $M_{B'}$ be two measurements for concept B. The outcomes of these measurements are given by:

$$\begin{aligned}
M_A &= \{A_1 = \text{'horse'},\ A_2 = \text{'bear'}\}, \\
M_{A'} &= \{A'_1 = \text{'tiger'},\ A'_2 = \text{'cat'}\}, \\
M_B &= \{B_1 = \text{'growls'},\ B_2 = \text{'whinnies'}\}, \\
M_{B'} &= \{B'_1 = \text{'snorts'},\ B'_2 = \text{'meows'}\}.
\end{aligned} \tag{3.74}$$

In a psychological experiment, 81 participants were asked to choose the combination that best represents the concepts A, B, and the conceptual combination 'The Animal Acts,' with respect to the outcomes of measurements $M_{AB}$, $M_{AB'}$, $M_{A'B}$, and $M_{A'B'}$, and the expected values of these joint measurements were calculated (see Table 3.2). From the data we have

$$\begin{aligned}
E(M_{AB}) &= P(A_1, B_1) + P(A_2, B_2) - P(A_1, B_2) - P(A_2, B_1) = -0.778, \\
E(M_{AB'}) &= P(A_1, B'_1) + P(A_2, B'_2) - P(A_1, B'_2) - P(A_2, B'_1) = 0.3580, \\
E(M_{A'B}) &= P(A'_1, B_1) + P(A'_2, B_2) - P(A'_1, B_2) - P(A'_2, B_1) = 0.6543, \\
E(M_{A'B'}) &= P(A'_1, B'_1) + P(A'_2, B'_2) - P(A'_1, B'_2) - P(A'_2, B'_1) = 0.6296.
\end{aligned} \tag{3.75}$$

Since

$$E(M_{A'B'}) + E(M_{A'B}) + E(M_{AB'}) - E(M_{AB}) = 2.4197, \tag{3.76}$$

inequality (3.73) is violated, and so no classical probability model exists for the considered joint experiments on A and B.
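As a rough cross-check of Eqs. (3.75) and (3.76), the Python sketch below recomputes the four expected values and the CHSH expression from the joint probabilities reported in Table 3.2. The dictionary layout and the function names `expected_value` and `chsh` are illustrative choices made here, not part of [AS14]. Because the table entries are rounded to three decimals, the result should agree with Eq. (3.76) only to about two decimal places.

```python
# Joint probabilities from Table 3.2, keyed by (outcome index of X, outcome index of Y).
# The keys "AB", "AB'", "A'B", "A'B'" name the four composed measurements.
joint = {
    "AB":   {(1, 1): 0.049, (1, 2): 0.630, (2, 1): 0.259, (2, 2): 0.062},
    "AB'":  {(1, 1): 0.593, (1, 2): 0.025, (2, 1): 0.296, (2, 2): 0.086},
    "A'B":  {(1, 1): 0.778, (1, 2): 0.086, (2, 1): 0.086, (2, 2): 0.049},
    "A'B'": {(1, 1): 0.148, (1, 2): 0.086, (2, 1): 0.099, (2, 2): 0.667},
}


def expected_value(probabilities):
    """Value +1 for outcomes X1Y1 and X2Y2, value -1 for X1Y2 and X2Y1."""
    return sum(p if i == j else -p for (i, j), p in probabilities.items())


def chsh(table):
    """Middle expression of the CHSH inequality, Eq. (3.73)."""
    e = {name: expected_value(probs) for name, probs in table.items()}
    return e["A'B'"] + e["A'B"] + e["AB'"] - e["AB"]


value = chsh(joint)
violated = not (-2 <= value <= 2)
print(f"CHSH value = {value:.3f}, inequality violated: {violated}")
```

With the rounded table entries the CHSH expression comes out slightly above 2.42, which lies outside the interval [-2, 2] required by a classical model.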
Quantum models assuming that A and B are entangled concepts can represent this data [AS11c, AS14] even though the marginal law is not satisfied.

Table 3.2: Data table of conceptual entanglement experiment in [AS14].

| Measurement | Outcome 1 | Outcome 2 | Outcome 3 | Outcome 4 |
| 'Animal': $M_X$, $X = A, A'$ | $A_1$ = 'horse', $P(A_1) = 0.5309$ | $A_2$ = 'bear', $P(A_2) = 0.4691$ | $A'_1$ = 'tiger', $P(A'_1) = 0.7284$ | $A'_2$ = 'cat', $P(A'_2) = 0.2716$ |
| 'Acts': $M_Y$, $Y = B, B'$ | $B_1$ = 'growls', $P(B_1) = 0.4815$ | $B_2$ = 'whinnies', $P(B_2) = 0.5815$ | $B'_1$ = 'snorts', $P(B'_1) = 0.321$ | $B'_2$ = 'meows', $P(B'_2) = 0.679$ |
| 'Animal Acts': $M_{AB}$ | 'horse growls', $P(A_1, B_1) = 0.049$ | 'horse whinnies', $P(A_1, B_2) = 0.630$ | 'bear growls', $P(A_2, B_1) = 0.259$ | 'bear whinnies', $P(A_2, B_2) = 0.062$ |
| 'Animal Acts': $M_{AB'}$ | 'horse snorts', $P(A_1, B'_1) = 0.593$ | 'horse meows', $P(A_1, B'_2) = 0.025$ | 'bear snorts', $P(A_2, B'_1) = 0.296$ | 'bear meows', $P(A_2, B'_2) = 0.086$ |
| 'Animal Acts': $M_{A'B}$ | 'tiger growls', $P(A'_1, B_1) = 0.778$ | 'tiger whinnies', $P(A'_1, B_2) = 0.086$ | 'cat growls', $P(A'_2, B_1) = 0.086$ | 'cat whinnies', $P(A'_2, B_2) = 0.049$ |
| 'Animal Acts': $M_{A'B'}$ | 'tiger snorts', $P(A'_1, B'_1) = 0.148$ | 'tiger meows', $P(A'_1, B'_2) = 0.086$ | 'cat snorts', $P(A'_2, B'_1) = 0.099$ | 'cat meows', $P(A'_2, B'_2) = 0.667$ |

Other tests carried out in psychological experiments have confirmed entanglement in conceptual combinations [BKR+12, BKRS13, KRBS10, AG05b]. The conclusion is that concept combinations entail semantic correlations that might not be obtained in a classical probabilistic framework. Since quantum entanglement can handle these semantic correlations, the quantum description of joint entities becomes a suitable mathematical framework to represent concept combinations.

Chapter 4

Two Quantum Models for the Conjunction and Disjunction of Concepts

Most prominent authors in the study of concept combinations believe that the problem of non-compositionality of concepts (§2.2.3) is the consequence of non-trivial semantic interactions between the combined concepts [Ham88a].
This interaction can be explained in terms of salient properties of the concepts [SO81, SO82, Rip95], specialized [CM84, Mur88] or constrained schemata [CK00, CK01], composite prototypes [Ham88b, Ham07], or some combination of these [WL98, Wis96, Gag00]. Therefore, although they differ in their mathematical approaches, most models look for a coherence mechanism that explains the meaning of concept combinations [Tha97]. The experiments carried out by James Hampton [Ham88b, Ham88a] and others [SO81, SO82, SDBVMR98] show that such a coherence mechanism cannot be accounted for using classical or fuzzy logic.

In §3.5.2, we used a simple quantum model to represent some of the cases of overextension and underextension found in experiments with conceptual combinations. This model includes an interference term that depends on the phase angles used to represent the concepts in combination. Phase angles can therefore be interpreted as a mathematical realization of the coherence mechanism sought by cognitive psychologists.

This chapter further explores quantum modeling of concept combinations. In particular, in §4.1 we analyze the Hilbert space model of concept combinations discussed in §3.5.2, and in §4.2 we introduce another type of modeling for concept combinations based on tensor products of Hilbert spaces.⁷ The exploration presented here is theoretical; we focus on the modeling power of the introduced frameworks.

⁷ Tensor products were introduced in §3.6 to model the concept combination 'Animal Acts' expressed by a noun-verb combination. The model presented in §4.2 is used for conjunctions and disjunctions of concepts referred to by nouns, so it is different.

4.1 Modeling on a Hilbert Space

The Hilbert space model introduced in §3.5.2 is useful to represent most cases of overextension of conjunctions and underextension of disjunctions found in experimental data. However, some cases of extreme overextension and underextension, as well as some combinations that correspond to classical probabilistic data (see Defs. 2.1 and 2.3), cannot be modeled by the Hilbert space model. In this section, we determine what conditions are required to find a representation for concept combinations in the Hilbert space model. Rather than focus on a particular combination of concepts, we assume the existence of two concepts A and B, and of a combined concept AB that can represent either conjunction or disjunction.

4.1.1 Scope and Dimensionality of a Hilbert Space Model

To explore the type of conceptual combinations that can be represented in the Hilbert space model, we focus on the dimension n of the Hilbert space $\mathbb{C}^n$ equipped with the standard inner product. First, recall that the Hilbert space model of concept combinations requires vectors $|A\rangle, |B\rangle \in H$, and an orthogonal projector $M : H \to H$, such that the following conditions are satisfied⁸:

$$\langle A|A\rangle = \langle B|B\rangle = 1, \tag{4.1}$$

$$\langle A|B\rangle = 0, \tag{4.2}$$

$$\langle A|M|A\rangle = \mu(A), \quad \langle B|M|B\rangle = \mu(B), \tag{4.3}$$

$$\mu(AB) = \frac{1}{2}\left(\mu(A) + \mu(B)\right) + \Re(\langle A|M|B\rangle). \tag{4.4}$$

⁸ In this chapter we are concerned with the representation of exemplars individually. For this reason, we simplify the notation, denoting the operator $M_x$ by $M$ and the membership weights $\mu_x(\cdot)$ by $\mu(\cdot)$.

Next, we determine the type of membership data that is compatible with conditions (4.1)–(4.4). We look at the particular cases $H = \mathbb{C}^2$ and $\mathbb{C}^3$ separately, then show that the general case, $\mathbb{C}^n$ for $n > 3$, is equivalent to the case $\mathbb{C}^3$.

Theorem 4.1. Let $\mu(A)$, $\mu(B)$, and $\mu(AB)$ denote the membership weights of an exemplar with respect to concepts A, B, and a combination of these concepts denoted by AB. The membership weights are compatible with a complex Hilbert space model $H = \mathbb{C}^2$ if and only if one of the following cases is satisfied:

1. $\mu(A) = \mu(B) = \mu(AB) = 0$,
2. $\mu(A) = \mu(B) = \mu(AB) = 1$,
3. $\mu(A) + \mu(B) = 1$, and $\mu(AB) \in \left[\frac{1}{2} - \sqrt{\mu(A)(1 - \mu(A))},\ \frac{1}{2} + \sqrt{\mu(A)(1 - \mu(A))}\right]$.

Proof.
We use conditions (4.1)–(4.4) to derive the cases stated in the theorem.

⇐: Note that cases 1 and 2 are trivially satisfied by choosing $M$ to be a zero- and two-dimensional projector respectively. Then, conditions (4.1)–(4.4) are satisfied by choosing $|A\rangle$ and $|B\rangle$ to be any two unit vectors that are orthogonal.

⇒: Let $M$ be a one-dimensional projector. Without loss of generality, we set $M(x, y) \to (x, 0)$, and

$$\begin{aligned}
|A\rangle &= (e^{i\alpha_1} a_1,\ e^{i\alpha_2} a_2), \\
|B\rangle &= (e^{i\beta_1} b_1,\ e^{i\beta_2} b_2).
\end{aligned} \tag{4.5}$$

Applying condition (4.3), we obtain

$$a_1 = \sqrt{\mu(A)}, \quad \text{and} \quad b_1 = \sqrt{\mu(B)}. \tag{4.6}$$

Next, we use condition (4.1) to obtain

$$a_2 = \sqrt{1 - \mu(A)}, \quad \text{and} \quad b_2 = \sqrt{1 - \mu(B)}. \tag{4.7}$$

Hence, condition (4.4) becomes

$$\begin{aligned}
\mu(AB) &= \frac{1}{2}\left(\mu(A) + e^{i(\alpha_1 - \beta_1)}\sqrt{\mu(A)\mu(B)} + e^{i(\beta_1 - \alpha_1)}\sqrt{\mu(A)\mu(B)} + \mu(B)\right) \\
&= \frac{1}{2}\left(\mu(A) + \mu(B)\right) + \sqrt{\mu(A)\mu(B)}\cos(\beta_1 - \alpha_1).
\end{aligned} \tag{4.8}$$

Since $|\cos(\beta_1 - \alpha_1)| \le 1$,

$$\mu(AB) \in \left[\frac{1}{2}\left(\mu(A) + \mu(B)\right) - \sqrt{\mu(A)\mu(B)},\ \frac{1}{2}\left(\mu(A) + \mu(B)\right) + \sqrt{\mu(A)\mu(B)}\right]. \tag{4.9}$$

We have considered the conditions given by Eqs. (4.1), (4.3), and (4.4). Next, we apply condition (4.2), whose real and imaginary parts give

$$\sqrt{\mu(A)\mu(B)}\cos(\beta_1 - \alpha_1) = -\sqrt{1 - \mu(A)}\sqrt{1 - \mu(B)}\cos(\beta_2 - \alpha_2), \tag{4.10}$$

$$\sqrt{\mu(A)\mu(B)}\sin(\beta_1 - \alpha_1) = -\sqrt{1 - \mu(A)}\sqrt{1 - \mu(B)}\sin(\beta_2 - \alpha_2). \tag{4.11}$$

We square both sides of Eqs. (4.10) and (4.11) and add them to obtain

$$\mu(A)\mu(B) = (1 - \mu(A))(1 - \mu(B)), \tag{4.12}$$

which is equivalent to

$$\mu(A) + \mu(B) = 1. \tag{4.13}$$

Substituting Eq. (4.13) in Eq. (4.8) yields

$$\mu(AB) = \frac{1}{2} + \sqrt{\mu(A)(1 - \mu(A))}\cos(\beta_1 - \alpha_1). \tag{4.14}$$

Therefore, if $|A\rangle$, $|B\rangle$, and $M$ satisfy conditions (4.1)–(4.4), then

$$\mu(A) + \mu(B) = 1, \quad \text{and} \quad \mu(AB) \in \left[\frac{1}{2} - \sqrt{\mu(A)(1 - \mu(A))},\ \frac{1}{2} + \sqrt{\mu(A)(1 - \mu(A))}\right]. \tag{4.15}$$

We obtain the other side of the implication by choosing $|A\rangle$ and $|B\rangle$ to satisfy Eqs. (4.6) and (4.7), and $\alpha_1$ and $\beta_1$ such that condition (4.4) is satisfied.

Because Theorem 4.1 requires that $\mu(A) + \mu(B) = 1$, the Hilbert space model with $H = \mathbb{C}^2$ is strongly constrained. Nonetheless, this simple model can be used to demonstrate how a Hilbert space model with interference extends classical models of concept combinations.

Consider the case $\mu(A) + \mu(B) = 1$. From a classical perspective, this is equivalent to

$$A = \text{Not } B. \tag{4.16}$$

Therefore, in a classical probabilistic model

$$\mu(A \text{ and } B) = \mu(\text{Not } B \text{ and } B) = 0, \tag{4.17}$$

$$\mu(A \text{ or } B) = \mu(\text{Not } B \text{ or } B) = 1. \tag{4.18}$$

The Hilbert space model is more flexible because $\mu(AB)$ can be either overextended or underextended. For example, if $\mu(A) = \mu(B) = \frac{1}{2}$, then for all $x \in [0, 1]$ there are angles $\alpha_1$ and $\beta_1$ such that $\mu(A \text{ and } B) = x$.

We use Fig. 4.1 to describe the modeling scope of the $\mathbb{C}^2$ model. The membership weight $\mu(A)$ is given on the x-axis, and can also be represented by the identity function plotted on the diagonal red line. The membership weight $\mu(B) = 1 - \mu(A)$ is represented by the antidiagonal red line. The blue curve surrounding the shaded area corresponds to the maximal and minimal membership weight $\mu(AB)$ that the concept combination AB can assume in the $\mathbb{C}^2$ model. Therefore, the shaded area corresponds to the region of overextension and underextension that this model can represent. In particular, 1 denotes the single underextended/overextended region, 2 denotes the double underextended region, and 3 denotes the double overextended region.

Figure 4.1: Hilbert space model in $\mathbb{C}^2$ for concept combination with $\mu(A) + \mu(B) = 1$.

Note that not all underextended, or overextended, cases admit a representation in this model. Also, since the shaded area does not contain the two red curves entirely, this model cannot represent all possible classical cases. For example, when $\mu(A) = 0.9$, $\mu(B) = 0.1$, and $\mu(AB) = 0.1$ there is no representation in the $\mathbb{C}^2$ model.

We now analyze the Hilbert space model in $\mathbb{C}^3$. We introduce the following notation to facilitate the presentation of the mathematical results:

$$\mathrm{ave}(AB) = \frac{\mu(A) + \mu(B)}{2}, \tag{4.19}$$

$$\mathrm{dev}(AB) = \sqrt{\min\left(\mu(A)\mu(B),\ (1 - \mu(A))(1 - \mu(B))\right)}. \tag{4.20}$$

Theorem 4.2. Let $\mu(A)$, $\mu(B)$, and $\mu(AB)$ denote the membership weights of an exemplar with respect to concepts A, B, and a combination of these concepts denoted by AB.
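The membership weights are compatible with a complex Hilbert space model $H = \mathbb{C}^3$ if and only if

$$\mu(AB) \in [\mathrm{ave}(AB) - \mathrm{dev}(AB),\ \mathrm{ave}(AB) + \mathrm{dev}(AB)]. \tag{4.21}$$

Proof. We will show how conditions (4.1)–(4.4) are used to derive Eq. (4.21). First, if $M$ is a zero- or three-dimensional projector, then

$$\mu(A) = \mu(B) = \mu(AB) = 0, \quad \text{and} \quad \mu(A) = \mu(B) = \mu(AB) = 1, \tag{4.22}$$

respectively. Thus Eq. (4.21) is trivially satisfied. Therefore, conditions (4.1)–(4.4) are satisfied by choosing $|A\rangle$ and $|B\rangle$ to be any two unit orthogonal vectors.

The remaining cases are those in which $M$ is a one- or a two-dimensional projector. We apply conditions (4.1)–(4.4) to vectors $|A\rangle$ and $|B\rangle$ in these two cases separately, and then combine the two analyses to derive Eq. (4.21).

If $M$ is a one-dimensional projector, then, without loss of generality, set $M(x, y, z) \to (x, 0, 0)$, and

$$\begin{aligned}
|A\rangle &= (e^{i\alpha_1} a_1,\ e^{i\alpha_2} a_2,\ e^{i\alpha_3} a_3), \\
|B\rangle &= (e^{i\beta_1} b_1,\ e^{i\beta_2} b_2,\ e^{i\beta_3} b_3).
\end{aligned} \tag{4.23}$$

Note that conditions (4.1) and (4.3) are satisfied by choosing the coefficients in $|A\rangle$ and $|B\rangle$ as follows:

$$\begin{aligned}
a_1 &= \sqrt{\mu(A)}, & a_2 &= \sqrt{\lambda}\sqrt{1 - \mu(A)}, & a_3 &= \sqrt{1 - \lambda}\sqrt{1 - \mu(A)}, \\
b_1 &= \sqrt{\mu(B)}, & b_2 &= \sqrt{\kappa}\sqrt{1 - \mu(B)}, & b_3 &= \sqrt{1 - \kappa}\sqrt{1 - \mu(B)},
\end{aligned} \tag{4.24}$$

with $0 \le \lambda \le 1$ and $0 \le \kappa \le 1$. Moreover, condition (4.4) implies that $\mu(AB)$ is given by

$$\mu(AB) = \frac{1}{2}\left(\mu(A) + \mu(B)\right) + \sqrt{\mu(A)\mu(B)}\cos(\alpha_1 - \beta_1). \tag{4.25}$$

We apply condition (4.2) to obtain

$$-\sqrt{\mu(A)\mu(B)}\cos(\gamma_1) = \sqrt{(1 - \mu(A))(1 - \mu(B))}\,F(\lambda, \kappa, \cos(\gamma_2), \cos(\gamma_3)), \tag{4.26}$$

$$-\sqrt{\mu(A)\mu(B)}\sin(\gamma_1) = \sqrt{(1 - \mu(A))(1 - \mu(B))}\,F(\lambda, \kappa, \sin(\gamma_2), \sin(\gamma_3)), \tag{4.27}$$

where $\gamma_i = \beta_i - \alpha_i$ and

$$F(\lambda, \kappa, f(x), f(y)) = \sqrt{\lambda\kappa}\,f(x) + \sqrt{(1 - \lambda)(1 - \kappa)}\,f(y). \tag{4.28}$$

Note that $F(\lambda, \kappa, \cos(\gamma_2), \cos(\gamma_3))$ and $F(\lambda, \kappa, \sin(\gamma_2), \sin(\gamma_3))$ are convex combinations of $\sqrt{\lambda\kappa}$ and $\sqrt{(1 - \lambda)(1 - \kappa)}$. Therefore,

$$\begin{aligned}
|F(\lambda, \kappa, \cos(\gamma_2), \cos(\gamma_3))| &\le |\sqrt{\lambda\kappa}| + |\sqrt{(1 - \lambda)(1 - \kappa)}|, \\
|F(\lambda, \kappa, \sin(\gamma_2), \sin(\gamma_3))| &\le |\sqrt{\lambda\kappa}| + |\sqrt{(1 - \lambda)(1 - \kappa)}|.
\end{aligned} \tag{4.29}$$

Set

$$\sqrt{\lambda} = \cos(\theta_1), \quad \sqrt{\kappa} = \cos(\theta_2), \tag{4.30}$$

for $\theta_1, \theta_2$ in $[0, \frac{\pi}{2}]$. Then

$$\sqrt{1 - \lambda} = \sin(\theta_1), \quad \sqrt{1 - \kappa} = \sin(\theta_2). \tag{4.31}$$

Substituting Eqs. (4.30) and (4.31) in Eq.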
(4.29) we obtain

$$\begin{aligned}
|F(\lambda, \kappa, \cos(\gamma_2), \cos(\gamma_3))| &\le |\cos(\theta_1 - \theta_2)| \le 1, \\
|F(\lambda, \kappa, \sin(\gamma_2), \sin(\gamma_3))| &\le |\sin(\theta_1 - \theta_2)| \le 1.
\end{aligned} \tag{4.32}$$

Since $|F(\lambda, \kappa, \cos(\gamma_2), \cos(\gamma_3))| \le 1$, Eq. (4.26) implies that

$$|\sqrt{\mu(A)\mu(B)}\cos(\gamma_1)| \le \sqrt{(1 - \mu(A))(1 - \mu(B))}. \tag{4.33}$$

Therefore, the interference term is bounded as follows:

$$|\sqrt{\mu(A)\mu(B)}\cos(\gamma_1)| \le \min\left(\sqrt{\mu(A)\mu(B)},\ \sqrt{(1 - \mu(A))(1 - \mu(B))}\right) = \mathrm{dev}(AB). \tag{4.34}$$

Next, combining Eqs. (4.26) and (4.27), we obtain

$$\mu(A)\mu(B) = (1 - \mu(A))(1 - \mu(B))\,\hat{F}(\lambda, \kappa, \gamma_2, \gamma_3), \tag{4.35}$$

where

$$\hat{F}(\lambda, \kappa, \gamma_2, \gamma_3) = F(\lambda, \kappa, \cos(\gamma_2), \cos(\gamma_3))^2 + F(\lambda, \kappa, \sin(\gamma_2), \sin(\gamma_3))^2. \tag{4.36}$$

Thus,

$$\mu(A) + \mu(B) = 1 + \mu(A)\mu(B)\left(1 - \frac{1}{\hat{F}(\lambda, \kappa, \gamma_2, \gamma_3)}\right). \tag{4.37}$$

Using the parametrization for $\lambda$ and $\kappa$ given by Eq. (4.30), and applying Eq. (4.32) to Eq. (4.36), we obtain

$$0 \le \hat{F}(\lambda, \kappa, \gamma_2, \gamma_3) \le \cos(\theta_1 - \theta_2)^2 + \sin(\theta_1 - \theta_2)^2 = 1. \tag{4.38}$$

Next, applying Eq. (4.38) to Eq. (4.37) yields

$$\mu(A) + \mu(B) \le 1. \tag{4.39}$$

Therefore, when $M$ is a one-dimensional projector, conditions (4.1)–(4.4) imply

$$\mu(AB) \in [\mathrm{ave}(AB) - \mathrm{dev}(AB),\ \mathrm{ave}(AB) + \mathrm{dev}(AB)], \quad \text{and} \quad \mu(A) + \mu(B) \le 1. \tag{4.40}$$

Next, consider the case in which $M$ is a two-dimensional projector. Without loss of generality we can assume $M(x, y, z) \to (x, y, 0)$. In this case, we satisfy conditions (4.1) and (4.3) by choosing the coefficients in $|A\rangle$ and $|B\rangle$ as follows:

$$\begin{aligned}
a_1 &= \sqrt{\lambda}\sqrt{\mu(A)}, & a_2 &= \sqrt{1 - \lambda}\sqrt{\mu(A)}, & a_3 &= \sqrt{1 - \mu(A)}, \\
b_1 &= \sqrt{\kappa}\sqrt{\mu(B)}, & b_2 &= \sqrt{1 - \kappa}\sqrt{\mu(B)}, & b_3 &= \sqrt{1 - \mu(B)},
\end{aligned} \tag{4.41}$$

with $0 \le \lambda \le 1$ and $0 \le \kappa \le 1$. Moreover, Eq. (4.4) implies that $\mu(AB)$ is given by

$$\mu(AB) = \frac{1}{2}\left(\mu(A) + \mu(B)\right) + \sqrt{\mu(A)\mu(B)}\,F(\lambda, \kappa, \cos(\gamma_1), \cos(\gamma_2)). \tag{4.42}$$

We apply condition (4.2) to obtain

$$\sqrt{\mu(A)\mu(B)}\,F(\lambda, \kappa, \cos(\gamma_1), \cos(\gamma_2)) = -\sqrt{(1 - \mu(A))(1 - \mu(B))}\cos(\gamma_3). \tag{4.43}$$

Since $|F(\lambda, \kappa, \cos(\gamma_1), \cos(\gamma_2))| \le 1$, Eq. (4.43) implies that

$$|\sqrt{\mu(A)\mu(B)}\,F(\lambda, \kappa, \cos(\gamma_1), \cos(\gamma_2))| \le \min\left(\sqrt{\mu(A)\mu(B)},\ \sqrt{(1 - \mu(A))(1 - \mu(B))}\right) = \mathrm{dev}(AB). \tag{4.44}$$

We repeat the procedure used in the one-dimensional case to deduce

$$\mu(A)\mu(B)\,\hat{F}(\lambda, \kappa, \gamma_1, \gamma_2) = (1 - \mu(A))(1 - \mu(B)). \tag{4.45}$$

Since $0 \le \hat{F}(\lambda, \kappa, \gamma_1, \gamma_2) \le 1$, Eq. (4.45) yields

$$1 \le \mu(A) + \mu(B). \tag{4.46}$$

Therefore, when $M$ is a two-dimensional projector, conditions (4.1)–(4.4) imply

$$\mu(AB) \in [\mathrm{ave}(AB) - \mathrm{dev}(AB),\ \mathrm{ave}(AB) + \mathrm{dev}(AB)], \quad \text{and} \quad 1 \le \mu(A) + \mu(B). \tag{4.47}$$

Thus, merging conditions (4.40) and (4.47) completes the proof.

Theorem 4.2 shows that some of the restrictions of the model in $\mathbb{C}^2$ are removed in the $\mathbb{C}^3$ model. Moreover, the cases with $\mu(A) + \mu(B) \le 1$ are represented by a one-dimensional projector, and the cases with $1 \le \mu(A) + \mu(B)$ are represented by a two-dimensional projector.

Because extending the Hilbert space model from dimension two to dimension three leads to a reduction of constraints in the model, we might suspect that adding dimensions would lead to further reductions. However, we now show that the constraints of the $\mathbb{C}^3$ model cannot be relaxed in $\mathbb{C}^n$ with $n > 3$. To do so, we will prove that the case $\mathbb{C}^n$, for $n > 3$, yields the same constraints on $\mu(AB)$.

Theorem 4.3. Let $\mu(A)$, $\mu(B)$, and $\mu(AB)$ denote the membership weights of an exemplar with respect to concepts A, B, and a combination of these concepts denoted by AB. If the membership weights are compatible with a complex Hilbert space model in $\mathbb{C}^n$, for $n > 3$, then

$$\mu(AB) \in [\mathrm{ave}(AB) - \mathrm{dev}(AB),\ \mathrm{ave}(AB) + \mathrm{dev}(AB)]. \tag{4.48}$$

Proof. We show how conditions (4.1)–(4.4) can be used to derive Eq. (4.48). First note that Eq. (4.48) is trivially satisfied when $M$ is a zero- or n-dimensional projector. Therefore, conditions (4.1)–(4.4) are satisfied when we choose $|A\rangle$ and $|B\rangle$ to be any two unit orthogonal vectors.

Let $M$ be a k-dimensional projector. Without loss of generality, we set $M(x_1, \ldots, x_n) \to (x_1, \ldots, x_k, 0, \ldots, 0)$ with $0 < k < n$, and

$$\begin{aligned}
|A\rangle &= (e^{i\alpha_1} a_1,\ e^{i\alpha_2} a_2,\ \ldots,\ e^{i\alpha_n} a_n), \\
|B\rangle &= (e^{i\beta_1} b_1,\ e^{i\beta_2} b_2,\ \ldots,\ e^{i\beta_n} b_n).
\end{aligned} \tag{4.49}$$

Conditions (4.1) and (4.3) are satisfied by defining the coefficients in $|A\rangle$ and $|B\rangle$ as follows:

$$a_i = \lambda_i\sqrt{\mu(A)}, \quad \text{and} \quad b_i = \kappa_i\sqrt{\mu(B)}, \quad \text{for } i = 1, \ldots, k, \tag{4.50}$$

$$a_i = \lambda_i\sqrt{1 - \mu(A)}, \quad \text{and} \quad b_i = \kappa_i\sqrt{1 - \mu(B)}, \quad \text{for } i = k + 1, \ldots, n. \tag{4.51}$$

Then,

$$\begin{aligned}
|A\rangle &= \left(e^{i\alpha_1}\lambda_1\sqrt{\mu(A)},\ \ldots,\ e^{i\alpha_k}\lambda_k\sqrt{\mu(A)},\ e^{i\alpha_{k+1}}\lambda_{k+1}\sqrt{1 - \mu(A)},\ \ldots,\ e^{i\alpha_n}\lambda_n\sqrt{1 - \mu(A)}\right), \\
|B\rangle &= \left(e^{i\beta_1}\kappa_1\sqrt{\mu(B)},\ \ldots,\ e^{i\beta_k}\kappa_k\sqrt{\mu(B)},\ e^{i\beta_{k+1}}\kappa_{k+1}\sqrt{1 - \mu(B)},\ \ldots,\ e^{i\beta_n}\kappa_n\sqrt{1 - \mu(B)}\right),
\end{aligned} \tag{4.52}$$

and condition (4.1) implies

$$\sum_{i=1}^{k}\lambda_i^2 = \sum_{i=k+1}^{n}\lambda_i^2 = \sum_{i=1}^{k}\kappa_i^2 = \sum_{i=k+1}^{n}\kappa_i^2 = 1. \tag{4.53}$$

Therefore, the interference term in condition (4.4) becomes

$$\Re\langle A|M|B\rangle = \sqrt{\mu(A)\mu(B)}\left(\sum_{i=1}^{k}\lambda_i\kappa_i\cos(\gamma_i)\right), \tag{4.54}$$

where $\gamma_i = \beta_i - \alpha_i$. Note that Eq. (4.53) allows us to apply the Cauchy-Schwarz inequality in Eq. (4.54) to obtain

$$-1 \le \sum_{i=1}^{k}\lambda_i\kappa_i\cos(\gamma_i) \le 1. \tag{4.55}$$

Hence, condition (4.4) gives

$$\mu(AB) \in \left[\mathrm{ave}(AB) - \sqrt{\mu(A)\mu(B)},\ \mathrm{ave}(AB) + \sqrt{\mu(A)\mu(B)}\right]. \tag{4.56}$$

Next, condition (4.2) implies that

$$\sqrt{\mu(A)\mu(B)}\left(\sum_{i=1}^{k}\lambda_i\kappa_i\cos(\gamma_i)\right) = -\sqrt{(1 - \mu(A))(1 - \mu(B))}\left(\sum_{i=k+1}^{n}\lambda_i\kappa_i\cos(\gamma_i)\right), \tag{4.57}$$

and

$$\sqrt{\mu(A)\mu(B)}\left(\sum_{i=1}^{k}\lambda_i\kappa_i\sin(\gamma_i)\right) = -\sqrt{(1 - \mu(A))(1 - \mu(B))}\left(\sum_{i=k+1}^{n}\lambda_i\kappa_i\sin(\gamma_i)\right). \tag{4.58}$$

Applying condition (4.4) in Eq. (4.57) yields

$$\mu(AB) - \mathrm{ave}(AB) + \sqrt{(1 - \mu(A))(1 - \mu(B))}\left(\sum_{i=k+1}^{n}\lambda_i\kappa_i\cos(\gamma_i)\right) = 0. \tag{4.59}$$

We apply the Cauchy-Schwarz inequality to Eq. (4.59) to obtain

$$-1 \le \sum_{i=k+1}^{n}\lambda_i\kappa_i\cos(\beta_i - \alpha_i) \le 1. \tag{4.60}$$

Therefore, Eqs. (4.59) and (4.60) yield

$$\mu(AB) \in \left[\mathrm{ave}(AB) - \sqrt{(1 - \mu(A))(1 - \mu(B))},\ \mathrm{ave}(AB) + \sqrt{(1 - \mu(A))(1 - \mu(B))}\right]. \tag{4.61}$$

Combining the constraints in Eqs. (4.56) and (4.61) completes the proof.

Theorem 4.2 establishes a limit to the Hilbert space modeling approach by restricting the values that $\mu(A)$, $\mu(B)$, and $\mu(AB)$ can assume. Moreover, Theorem 4.3 confirms that a complex Hilbert space of dimension 3 is sufficient to reach the full modeling power of this model.

4.2 Modeling in the Tensor Product of Hilbert Spaces

The idea of applying the tensor product to model concept conjunctions and disjunctions was first proposed in [AG05b], and has since been applied to other types of combinations (see §3.6).
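In order to introduce the notation and probabilistic structure of the tensor product model, we present a simplified version of this model in §4.2.1, and then introduce the general model in §4.2.2.

4.2.1 A Simple Tensor Product Model

We can build a simple tensor product model for the membership weight of an exemplar with respect to concepts A, B, and their combination AB by reusing the elements of the Hilbert space model introduced in §4.1. Namely, we use unit vectors $|A\rangle$ and $|B\rangle$ to represent the state of concepts A and B, and a projector $M : H \to H$ to measure the membership weights. Hence,

$$\mu(A) = \langle A|M|A\rangle, \quad \mu(B) = \langle B|M|B\rangle. \tag{4.62}$$

The state $|C\rangle$, representing the state of the combined concept AB, is given by the tensor product of $|A\rangle$ and $|B\rangle$:

$$|C\rangle = |A\rangle \otimes |B\rangle. \tag{4.63}$$

Note that the membership operator $M$ can be extended to the tensor product $H \otimes H$ by the operators $M_A = M \otimes \mathbf{1}$ and $M_B = \mathbf{1} \otimes M$ respectively. Indeed,

$$\begin{aligned}
\langle C|M_A|C\rangle &= (\langle A| \otimes \langle B|)\,M \otimes \mathbf{1}\,(|A\rangle \otimes |B\rangle) = \langle A|M|A\rangle\langle B|\mathbf{1}|B\rangle = \mu(A), \\
\langle C|M_B|C\rangle &= (\langle A| \otimes \langle B|)\,\mathbf{1} \otimes M\,(|A\rangle \otimes |B\rangle) = \langle A|\mathbf{1}|A\rangle\langle B|M|B\rangle = \mu(B).
\end{aligned} \tag{4.64}$$

If we want to measure the membership weight of an exemplar with respect to the conjunction of concepts A and B, we must determine whether the exemplar is a member of both concepts simultaneously. In this case, the membership operator for the conjunction of two concepts is given by

$$M_{\wedge} = M \otimes M. \tag{4.65}$$

The membership weight of an exemplar with respect to the conjunction of concepts A and B is given by

$$\begin{aligned}
\mu(A \text{ and } B) = \langle C|M_{\wedge}|C\rangle &= (\langle A| \otimes \langle B|)\,M \otimes M\,(|A\rangle \otimes |B\rangle) \\
&= \langle A|M|A\rangle\langle B|M|B\rangle = \mu(A)\mu(B).
\end{aligned} \tag{4.66}$$

Similarly, if we want to measure the membership weight of the exemplar with respect to the disjunction of concepts A or B, we introduce the operator

$$\begin{aligned}
M_{\vee} &= M \otimes M + M \otimes (\mathbf{1} - M) + (\mathbf{1} - M) \otimes M \\
&= \mathbf{1} \otimes \mathbf{1} - (\mathbf{1} - M) \otimes (\mathbf{1} - M).
\end{aligned} \tag{4.67}$$

Hence, the membership weight of the exemplar with respect to the disjunction of the concepts A or B is given by

$$\begin{aligned}
\mu(A \text{ or } B) &= \langle C|M_{\vee}|C\rangle \\
&= (\langle A| \otimes \langle B|)\left(M \otimes M + M \otimes (\mathbf{1} - M) + (\mathbf{1} - M) \otimes M\right)(|A\rangle \otimes |B\rangle) \\
&= \langle A|M|A\rangle\langle B|M|B\rangle + \langle A|M|A\rangle\langle B|\mathbf{1} - M|B\rangle + \langle A|\mathbf{1} - M|A\rangle\langle B|M|B\rangle \\
&= \mu(A)\mu(B) + \mu(A)(1 - \mu(B)) + (1 - \mu(A))\mu(B) \\
&= \mu(A) + \mu(B) - \mu(A)\mu(B).
\end{aligned} \tag{4.68}$$

Note that the formulas for the membership weight of the conjunction and disjunction of two concepts, given by Eqs. (4.66) and (4.68) respectively, are equivalent to the classical probability formulas where the membership estimations for concepts A and B are independent events.

4.2.2 Generalizing the States in the Tensor Product Model

The probabilistic independence of the model presented in §4.2.1 is a consequence of the choice of the state vector representing the concept combination and of the membership operator. Specifically, the state of the combined concept $|AB\rangle$ is given by the tensor product $|C\rangle = |A\rangle \otimes |B\rangle$ of the states of the two former concepts $|A\rangle$ and $|B\rangle$, and the operators $M_A$, $M_B$, $M_{\wedge}$, and $M_{\vee}$ are built from an operator $M$ that acts on the two sides of the tensor product space separately. This choice for the state and operators is a simplified application of the tensor product model because the state and the operators are separable (see Appendix 6.3.2); it means that we can identify the first part of the tensor space with the concept A, and the second part with the concept B. We now assume a general state $|C\rangle$ that is not necessarily separable. This means that we do not know what part of $|C\rangle$ is inherited from the state of the concept A or the state of the concept B.

To obtain the membership weights for the single concepts A and B from the state $|C\rangle$, we require two projection operators $M_A, M_B : H \otimes H \to H \otimes H$ that recover the membership weights $\mu(A)$ and $\mu(B)$ when applied to the vector $|C\rangle$. Therefore

$$\langle C|M_A|C\rangle = \mu(A), \quad \langle C|M_B|C\rangle = \mu(B). \tag{4.69}$$

We also require two membership operators $M_{\wedge}, M_{\vee} : H \otimes H \to H \otimes H$ representing the measurement with respect to concept conjunction and disjunction:

$$\langle C|M_{\wedge}|C\rangle = \mu(A \text{ and } B), \tag{4.70}$$

$$\langle C|M_{\vee}|C\rangle = \mu(A \text{ or } B). \tag{4.71}$$

Therefore, the general tensor product model for concept combinations is given by a four-tuple $(|C\rangle, M_A, M_B, M_{\wedge})$ satisfying conditions (4.69) and (4.70) for conjunction, and by a four-tuple $(|C\rangle, M_A, M_B, M_{\vee})$ satisfying conditions (4.69) and (4.71) for disjunction. For simplicity, we will assume that the membership operators $M_A$, $M_B$, $M_{\wedge}$, and $M_{\vee}$ are built from a measurement operator $M : H \to H$ as in Eqs. (4.62), (4.65), and (4.67).

We now build a concrete representation of this model in a complex tensor space. Let $|C\rangle \in \mathbb{C}^n \otimes \mathbb{C}^n$ and let $\{|i\rangle\}_{i=1}^{n}$ be the canonical basis of $\mathbb{C}^n$. Without loss of generality, let $M$ be the orthogonal projector on the subspace of $\mathbb{C}^n$ spanned by the basis elements $|1\rangle, \ldots, |r\rangle$, for $r < n$:

$$M(x_1, \ldots, x_n) \to (x_1, \ldots, x_r, 0, \ldots, 0).$$

Next, let $|C\rangle$ be a unit vector in $\mathbb{C}^n \otimes \mathbb{C}^n$. That is,

$$|C\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij} e^{i\gamma_{ij}}\,|i\rangle \otimes |j\rangle, \tag{4.72}$$

and

$$\begin{aligned}
\langle C|C\rangle &= \sum_{i,j=1}^{n} c_{ij} e^{-i\gamma_{ij}}\,\langle i| \otimes \langle j| \sum_{k,l=1}^{n} c_{kl} e^{i\gamma_{kl}}\,|k\rangle \otimes |l\rangle \\
&= \sum_{i,j,k,l=1}^{n} c_{ij} c_{kl} e^{i(-\gamma_{ij} + \gamma_{kl})}\langle i|k\rangle\langle j|l\rangle \\
&= \sum_{i,j=1}^{n} c_{ij}^2 = 1.
\end{aligned} \tag{4.73}$$

We can now apply condition (4.69) to the vector $|C\rangle$:

$$\begin{aligned}
\langle C|M_A|C\rangle = \langle C|M \otimes \mathbf{1}|C\rangle &= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij} e^{-i\gamma_{ij}}\,\langle i| \otimes \langle j|\ M \otimes \mathbf{1} \sum_{k=1}^{n}\sum_{l=1}^{n} c_{kl} e^{i\gamma_{kl}}\,|k\rangle \otimes |l\rangle \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij} e^{-i\gamma_{ij}}\,\langle i| \otimes \langle j| \sum_{k=1}^{r}\sum_{l=1}^{n} c_{kl} e^{i\gamma_{kl}}\,|k\rangle \otimes |l\rangle \\
&= \sum_{i,k=1}^{r}\sum_{j,l=1}^{n} c_{ij} c_{kl} e^{i(-\gamma_{ij} + \gamma_{kl})}\langle i|k\rangle\langle j|l\rangle \\
&= \sum_{i=1}^{r}\sum_{j=1}^{n} c_{ij}^2 = \mu(A),
\end{aligned} \tag{4.74}$$

and

$$\begin{aligned}
\langle C|M_B|C\rangle = \langle C|\mathbf{1} \otimes M|C\rangle &= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij} e^{-i\gamma_{ij}}\,\langle i| \otimes \langle j|\ \mathbf{1} \otimes M \sum_{k=1}^{n}\sum_{l=1}^{n} c_{kl} e^{i\gamma_{kl}}\,|k\rangle \otimes |l\rangle \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij} e^{-i\gamma_{ij}}\,\langle i| \otimes \langle j| \sum_{k=1}^{n}\sum_{l=1}^{r} c_{kl} e^{i\gamma_{kl}}\,|k\rangle \otimes |l\rangle \\
&= \sum_{i,k=1}^{n}\sum_{j,l=1}^{r} c_{ij} c_{kl} e^{i(-\gamma_{ij} + \gamma_{kl})}\langle i|k\rangle\langle j|l\rangle \\
&= \sum_{i=1}^{n}\sum_{j=1}^{r} c_{ij}^2 = \mu(B).
\end{aligned} \tag{4.75}$$

For the case of conjunction, we apply condition (4.70):

$$\begin{aligned}
\langle C|M_{\wedge}|C\rangle &= \langle C|M \otimes M|C\rangle \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij} e^{-i\gamma_{ij}}\,\langle i| \otimes \langle j|\ M \otimes M \sum_{k=1}^{n}\sum_{l=1}^{n} c_{kl} e^{i\gamma_{kl}}\,|k\rangle \otimes |l\rangle \\
&= \sum_{i=1}^{r}\sum_{j=1}^{r} c_{ij}^2 = \mu(A \text{ and } B).
\end{aligned} \tag{4.76}$$

Finally, we apply condition (4.71) for the case of disjunction:

$$\begin{aligned}
\langle C|M_{\vee}|C\rangle &= \langle C|M_A + M_B - M_{\wedge}|C\rangle \\
&= \langle C|\left(M \otimes M + M \otimes (\mathbf{1} - M) + (\mathbf{1} - M) \otimes M\right)|C\rangle \\
&= \sum_{i=1}^{r}\sum_{j=1}^{n} c_{ij}^2 + \sum_{i=1}^{n}\sum_{j=1}^{r} c_{ij}^2 - \sum_{i=1}^{r}\sum_{j=1}^{r} c_{ij}^2 \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}^2 - \sum_{i=r+1}^{n}\sum_{j=r+1}^{n} c_{ij}^2 \\
&= 1 - \sum_{i=r+1}^{n}\sum_{j=r+1}^{n} c_{ij}^2 = \mu(A \text{ or } B).
\end{aligned} \tag{4.77}$$

With these results, we can prove that the constraints of the tensor product model are exactly those of the classical probabilistic model.

Definition 4.4. Let $\mu = \{\mu(A), \mu(B), \mu(A \text{ and } B)\}$ be a triplet denoting the membership weights of concepts A, B, and their conjunction A and B. We say that the triplet $\mu$ admits a representation in the tensor product space $\mathbb{C}^n \otimes \mathbb{C}^n$ if there exists a unit vector $|C\rangle \in \mathbb{C}^n \otimes \mathbb{C}^n$, and an operator $M : \mathbb{C}^n \to \mathbb{C}^n$, such that conditions (4.73)–(4.76) are satisfied.

Theorem 4.5. Let $\mu = \{\mu(A), \mu(B), \mu(A \text{ and } B)\}$ be a triplet denoting the membership weights of concepts A, B, and their conjunction A and B. The triplet $\mu$ is classical conjunction data if and only if it admits a representation in a tensor product space $\mathbb{C}^n \otimes \mathbb{C}^n$ with $n = 2$.

Proof. If $\mu$ admits a representation in the tensor product space $\mathbb{C}^2 \otimes \mathbb{C}^2$, there exists $|C\rangle \in \mathbb{C}^2 \otimes \mathbb{C}^2$ and an operator $M$ such that (4.73)–(4.76) are satisfied. If $\mu(A) = \mu(B) = \mu(A \text{ and } B) = 0$ or $1$, we can choose $|C\rangle$ to be any unit vector in $\mathbb{C}^2 \otimes \mathbb{C}^2$ and $M$ to be a zero- or two-dimensional projector respectively. Otherwise, let $\{|1\rangle, |2\rangle\}$ be the canonical basis of $\mathbb{C}^2$. Without loss of generality, we set $|C\rangle$ to be

$$|C\rangle = c_{11} e^{i\gamma_{11}}|1\rangle \otimes |1\rangle + c_{12} e^{i\gamma_{12}}|1\rangle \otimes |2\rangle + c_{21} e^{i\gamma_{21}}|2\rangle \otimes |1\rangle + c_{22} e^{i\gamma_{22}}|2\rangle \otimes |2\rangle, \tag{4.78}$$

and let $M$ be a one-dimensional projector onto the subspace spanned by $|1\rangle$. With this choice,

$$\begin{aligned}
\mu(A) &= \langle C|M \otimes \mathbf{1}|C\rangle = c_{11}^2 + c_{12}^2, \\
\mu(B) &= \langle C|\mathbf{1} \otimes M|C\rangle = c_{11}^2 + c_{21}^2, \\
\mu(A \text{ and } B) &= \langle C|M \otimes M|C\rangle = c_{11}^2.
\end{aligned} \tag{4.79}$$

Then, clearly

$$\mu(A \text{ and } B) \le \mu(A), \tag{4.80}$$

$$\mu(A \text{ and } B) \le \mu(B), \quad \text{and} \tag{4.81}$$

$$\mu(A) + \mu(B) - \mu(A \text{ and } B) = c_{11}^2 + c_{12}^2 + c_{21}^2 \le 1. \tag{4.82}$$

Thus, $\mu$ is classical conjunction data.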
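The other implication is proven by taking $\mathbf{M}$ to be the same one-dimensional projector and $|C\rangle$ such that

\[ \begin{aligned}
c_{11} &= \sqrt{\mu(A \text{ and } B)}, \\
c_{12} &= \sqrt{\mu(A) - \mu(A \text{ and } B)}, \\
c_{21} &= \sqrt{\mu(B) - \mu(A \text{ and } B)}, \\
c_{22} &= \sqrt{1 - \mu(A) - \mu(B) + \mu(A \text{ and } B)},
\end{aligned} \tag{4.83} \]

and $\gamma_{ij} = 0$ for $i, j = 1, 2$. For classical conjunction data, the four quantities under the square roots are nonnegative and sum to one, so $|C\rangle$ is a well-defined unit vector.

Definition 4.6. Let $\mu = \{\mu(A), \mu(B), \mu(A \text{ or } B)\}$ be a triplet denoting the membership weights of concepts $A$, $B$, and their disjunction '$A$ or $B$'. We say that the triplet $\mu$ admits a representation in the tensor product space $\mathbb{C}^n \otimes \mathbb{C}^n$ if there exist a unit vector $|C\rangle \in \mathbb{C}^n \otimes \mathbb{C}^n$ and an operator $\mathbf{M} : \mathbb{C}^n \to \mathbb{C}^n$ such that conditions (4.73)–(4.75) and (4.77) are satisfied.

Theorem 4.7. Let $\mu = \{\mu(A), \mu(B), \mu(A \text{ or } B)\}$ be a triplet denoting the membership weights of concepts $A$, $B$, and their disjunction '$A$ or $B$'. The triplet $\mu$ is classical disjunction data if and only if it admits a representation in a tensor product space $\mathbb{C}^n \otimes \mathbb{C}^n$ with $n = 2$.

Proof. If $\mu$ admits a representation in the tensor product space $\mathbb{C}^2 \otimes \mathbb{C}^2$, there exist $|C\rangle \in \mathbb{C}^2 \otimes \mathbb{C}^2$ and an operator $\mathbf{M}$ such that conditions (4.73)–(4.75) and (4.77) are satisfied. If $\mu(A) = \mu(B) = \mu(A \text{ or } B) = 0$ or $1$, we can choose $|C\rangle$ to be any unit vector in $\mathbb{C}^2 \otimes \mathbb{C}^2$ and $\mathbf{M}$ to be a zero- or two-dimensional projector, respectively. Otherwise, let $\{|1\rangle, |2\rangle\}$ be the canonical basis of $\mathbb{C}^2$. Without loss of generality, set $|C\rangle$ to be

\[ |C\rangle = c_{11} e^{i\gamma_{11}} |1\rangle \otimes |1\rangle + c_{12} e^{i\gamma_{12}} |1\rangle \otimes |2\rangle + c_{21} e^{i\gamma_{21}} |2\rangle \otimes |1\rangle + c_{22} e^{i\gamma_{22}} |2\rangle \otimes |2\rangle, \tag{4.84} \]

and let $\mathbf{M}$ be the one-dimensional projector onto the subspace spanned by $|1\rangle$. With this choice,

\[ \begin{aligned}
\mu(A) &= \langle C|\mathbf{M} \otimes \mathbf{1}|C\rangle = c_{11}^{2} + c_{12}^{2}, \\
\mu(B) &= \langle C|\mathbf{1} \otimes \mathbf{M}|C\rangle = c_{11}^{2} + c_{21}^{2}, \\
\mu(A \text{ or } B) &= \langle C|\big(\mathbf{M} \otimes \mathbf{M} + \mathbf{M} \otimes (\mathbf{1}-\mathbf{M}) + (\mathbf{1}-\mathbf{M}) \otimes \mathbf{M}\big)|C\rangle = c_{11}^{2} + c_{12}^{2} + c_{21}^{2}.
\end{aligned} \tag{4.85} \]

Then, clearly

\[ \begin{aligned}
\mu(A) &\le \mu(A \text{ or } B), \\
\mu(B) &\le \mu(A \text{ or } B), \text{ and} \\
\mu(A) + \mu(B) - \mu(A \text{ or } B) &= c_{11}^{2} \ge 0.
\end{aligned} \tag{4.86} \]

Hence, $\mu$ is classical disjunction data.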
The other implication is proven by taking $\mathbf{M}$ to be the same one-dimensional projector and $|C\rangle$ such that

\[ \begin{aligned}
c_{11} &= \sqrt{\mu(A) + \mu(B) - \mu(A \text{ or } B)}, \\
c_{12} &= \sqrt{\mu(A \text{ or } B) - \mu(B)}, \\
c_{21} &= \sqrt{\mu(A \text{ or } B) - \mu(A)}, \\
c_{22} &= \sqrt{1 - \mu(A \text{ or } B)},
\end{aligned} \tag{4.87} \]

and $\gamma_{ij} = 0$ for $i, j = 1, 2$.

Theorems 4.5 and 4.7 give the strict equivalence between classical conjunction and disjunction data and the models of conjunctions and disjunctions built in $\mathbb{C}^2 \otimes \mathbb{C}^2$.

4.3 Examples and Comparisons

In this section, we compare the scope of the two models developed in §4.1 and §4.2. In particular, we use the experimental data presented in Hampton [Ham88b, Ham88a] to show examples of the two types of representations, and compute the number of conceptual combinations that each model can represent.

There are four different cases. The first case applies when the estimations can be represented by both the Hilbert space and tensor product models, the next two cases when only one of the models can represent the data, and the last case when neither model can represent the data. For simplicity, we show explicit examples only for data on conjunction.

The first example consists of the concepts $A$ = ‘Machine’ and $B$ = ‘Vehicle,’ and the exemplar $p_5$ = ‘sailboat.’ In this case, we have

\[ \begin{aligned}
&\mu_5(A) = 0.56, \quad \mu_5(B) = 0.8, \quad \text{and} \quad \mu_5(A \text{ and } B) = 0.42, \\
&\mathrm{ave}_5(AB) = 0.68, \\
&\mathrm{dev}_5(AB) = 0.297.
\end{aligned} \tag{4.88} \]

By Theorem 4.2, since

\[ \mathrm{ave}_5(AB) - \mathrm{dev}_5(AB) \le \mu_5(A \text{ and } B) \le \mathrm{ave}_5(AB) + \mathrm{dev}_5(AB), \tag{4.89} \]

the membership estimations can be modeled in the Hilbert space model. Because $\mu_5(A) + \mu_5(B) > 1$, $\mathbf{M}$ is a two-dimensional projector. We represent this case by choosing

\[ \begin{aligned}
|A\rangle &= (-0.43 + 0.3i,\; 0.02 - 0.53i,\; 0.58 + 0.32i), \text{ and} \\
|B\rangle &= (0.63,\; 0.63,\; 0.45).
\end{aligned} \tag{4.90} \]

In addition, since

\[ \begin{aligned}
\mu_5(A) - \mu_5(A \text{ and } B) &= 0.14, \\
\mu_5(B) - \mu_5(A \text{ and } B) &= 0.38, \text{ and} \\
1 - \mu_5(A) - \mu_5(B) + \mu_5(A \text{ and } B) &= 0.06
\end{aligned} \tag{4.91} \]

are all nonnegative, by Theorem 4.5 we can also construct a representation in the tensor product model. Here we take $\mathbf{M}$ to be a one-dimensional projector, and

\[ |C\rangle = 0.64\,|1\rangle \otimes |1\rangle + 0.37\,|1\rangle \otimes |2\rangle + 0.62\,|2\rangle \otimes |1\rangle + 0.24\,|2\rangle \otimes |2\rangle. \tag{4.92} \]
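The construction of Theorem 4.5 can be checked numerically. The sketch below is only an illustration, not part of the original analysis: it builds the coefficients of Eq. (4.83) for the ‘sailboat’ data and evaluates the membership weights of Eq. (4.79). The function names `tensor_coefficients` and `membership_weights` are hypothetical helpers introduced here, and all phases $\gamma_{ij}$ are assumed to be zero, as in the proof.

```python
import math


def tensor_coefficients(mu_a, mu_b, mu_ab, tol=1e-12):
    """Return (c11, c12, c21, c22) as in Eq. (4.83), with all phases set to 0.

    Hypothetical helper: raises ValueError when the triplet violates the
    classical conjunction conditions (4.80)-(4.82).
    """
    squares = [
        mu_ab,                      # c11^2
        mu_a - mu_ab,               # c12^2
        mu_b - mu_ab,               # c21^2
        1.0 - mu_a - mu_b + mu_ab,  # c22^2
    ]
    if min(squares) < -tol:
        raise ValueError("not classical conjunction data")
    # Clamp tiny negative rounding errors before taking the square root.
    return tuple(math.sqrt(max(s, 0.0)) for s in squares)


def membership_weights(c):
    """Evaluate Eq. (4.79) for M projecting onto |1>."""
    c11, c12, c21, _ = c
    return c11**2 + c12**2, c11**2 + c21**2, c11**2


# 'sailboat' data: mu(A) = 0.56, mu(B) = 0.8, mu(A and B) = 0.42
sailboat = tensor_coefficients(0.56, 0.80, 0.42)
print([round(x, 3) for x in sailboat])
print([round(x, 2) for x in membership_weights(sailboat)])
```

Up to rounding, the coefficients agree with those quoted in Eq. (4.92), and the three membership weights are recovered from them. Calling `tensor_coefficients` on data with $\mu(A \text{ and } B) > \mu(A)$ raises an error, which is the situation in which no tensor product representation exists.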
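In the second example, the data can be represented only in the Hilbert space model. Consider the concepts $A$ = ‘Machine’ and $B$ = ‘Vehicle,’ and the exemplar $p_{12}$ = ‘skateboard.’ We have

\[ \begin{aligned}
&\mu_{12}(A) = 0.28, \quad \mu_{12}(B) = 0.84, \quad \text{and} \quad \mu_{12}(A \text{ and } B) = 0.34, \\
&\mathrm{ave}_{12}(AB) = 0.56, \\
&\mathrm{dev}_{12}(AB) = 0.339.
\end{aligned} \tag{4.93} \]

By Theorem 4.2, since

\[ \mathrm{ave}_{12}(AB) - \mathrm{dev}_{12}(AB) \le \mu_{12}(A \text{ and } B) \le \mathrm{ave}_{12}(AB) + \mathrm{dev}_{12}(AB), \tag{4.94} \]

the membership estimations can be modeled in the Hilbert space model:

\[ \begin{aligned}
|A\rangle &= (0.034 - 0.37i,\; -0.37 - 0.026i,\; 0.55 + 0.65i), \text{ and} \\
|B\rangle &= (0.65,\; 0.65,\; 0.4).
\end{aligned} \]

However, since $\mu_{12}(A \text{ and } B) > \mu_{12}(A)$, this case cannot be modeled in the tensor product space.

In the third example, the data can be represented only in the tensor product model. Consider the concepts $A$ = ‘Bird’ and $B$ = ‘Pet,’ and the exemplar $p_{14}$ = ‘goldfish.’ We have

\[ \begin{aligned}
&\mu_{14}(A) = 0, \quad \mu_{14}(B) = 1, \quad \text{and} \quad \mu_{14}(A \text{ and } B) = 0, \\
&\mathrm{ave}_{14}(AB) = 0.5, \\
&\mathrm{dev}_{14}(AB) = 0.
\end{aligned} \tag{4.95} \]

In this case, we cannot provide a Hilbert space representation of the data because

\[ 0 = \mu_{14}(A \text{ and } B) < \mathrm{ave}_{14}(AB) - \mathrm{dev}_{14}(AB) = 0.5. \tag{4.96} \]

However, the data is compatible with the tensor product model since

\[ \begin{aligned}
\mu_{14}(A) - \mu_{14}(A \text{ and } B) &= 0, \\
\mu_{14}(B) - \mu_{14}(A \text{ and } B) &= 1, \text{ and} \\
\mu_{14}(A) + \mu_{14}(B) - \mu_{14}(A \text{ and } B) &= 1.
\end{aligned} \tag{4.97} \]

We obtain a representation by setting $\mathbf{M}$ and $|C\rangle$ as follows:

\[ \begin{aligned}
\mathbf{M}(x, y) &\to (x, 0), \\
|C\rangle &= |2\rangle \otimes |1\rangle.
\end{aligned} \tag{4.98} \]

Finally, for data that cannot be modeled by either of the two models, consider again the concepts $A$ = ‘Bird’ and $B$ = ‘Pet,’ and the exemplar $p_6$ = ‘heron.’ We have

\[ \begin{aligned}
&\mu_6(A) = 0.94, \quad \mu_6(B) = 0.15, \quad \text{and} \quad \mu_6(A \text{ and } B) = 0.26, \\
&\mathrm{ave}_6(AB) = 0.545, \\
&\mathrm{dev}_6(AB) = 0.225.
\end{aligned} \tag{4.99} \]

By Theorem 4.2, since

\[ 0.26 = \mu_6(A \text{ and } B) < \mathrm{ave}_6(AB) - \mathrm{dev}_6(AB) = 0.32, \tag{4.100} \]

we cannot provide a Hilbert space representation.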
Moreover, since $\mu_6(A \text{ and } B) > \mu_6(B)$, we cannot represent the data in the tensor product model.

We now compare the performance of the two models by counting the number of membership estimations that allow a representation in both the Hilbert space model and the tensor product model, in only one of the models, or in neither model, for all the concept conjunctions and disjunctions tested by Hampton in [Ham88b, Ham88a]. Fig. 4.2 shows the relative frequency of membership estimations for each of these cases. The histogram on the left gives the relative frequencies for the conjunction data, while the histogram on the right gives the relative frequencies for the disjunction data.

We observe that for both conjunctions and disjunctions approximately half of the cases cannot be modeled by either of the two models (52.4% and 42% respectively). Considering the cases that can be modeled for conjunctions, the Hilbert space model performs better since 41.6% of cases can be modeled by the Hilbert space model, and 19% of cases can be modeled only by the tensor product. Moreover, 12% of the cases can be modeled by both the tensor and Hilbert space models.

For the case of disjunctions, the Hilbert space model provides a representation in 42.7% of cases, and the tensor product model can represent 37% of the cases.

Although in almost half of the cases neither model is capable of providing a representation of the data, overall the Hilbert space model seems to perform better than the tensor product model. Moreover, since Theorems 4.5 and 4.7 establish the equivalence between the tensor product models for conjunctions and disjunctions and their classical probabilistic counterparts, we can conclude that the Hilbert space model is better suited for this type of data than the classical probabilistic models. It is important to note, however, that the tensor product model can represent some cases that cannot be represented by the Hilbert space model.
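The four cases discussed in this section can be reproduced with a short script. The following is a minimal sketch under stated assumptions: the Hilbert space test uses the interval of Theorem 4.2 with $\mathrm{dev}(AB) = \sqrt{\min(\mu(A)\mu(B),\,(1-\mu(A))(1-\mu(B)))}$, a formula assumed here because it reproduces the four dev values quoted above, and the tensor product test uses conditions (4.80)–(4.82). The names `hilbert_ok`, `tensor_ok`, and `classify` are hypothetical.

```python
import math


def hilbert_ok(mu_a, mu_b, mu_ab, tol=1e-9):
    """Interval test of Theorem 4.2 (the dev formula is an assumption, see lead-in)."""
    ave = 0.5 * (mu_a + mu_b)
    dev = math.sqrt(min(mu_a * mu_b, (1.0 - mu_a) * (1.0 - mu_b)))
    return ave - dev - tol <= mu_ab <= ave + dev + tol


def tensor_ok(mu_a, mu_b, mu_ab, tol=1e-9):
    """Classical conjunction conditions (4.80)-(4.82)."""
    return (mu_ab <= mu_a + tol
            and mu_ab <= mu_b + tol
            and mu_a + mu_b - mu_ab <= 1.0 + tol)


def classify(mu_a, mu_b, mu_ab):
    """Return which of the two models can represent the triplet."""
    h, t = hilbert_ok(mu_a, mu_b, mu_ab), tensor_ok(mu_a, mu_b, mu_ab)
    if h and t:
        return "both"
    if h:
        return "Hilbert only"
    if t:
        return "tensor only"
    return "neither"


# The four exemplars of this section: (mu(A), mu(B), mu(A and B))
examples = {
    "sailboat": (0.56, 0.80, 0.42),
    "skateboard": (0.28, 0.84, 0.34),
    "goldfish": (0.0, 1.0, 0.0),
    "heron": (0.94, 0.15, 0.26),
}
for name, triplet in examples.items():
    print(name, "->", classify(*triplet))
```

Under these assumptions the script assigns ‘sailboat’ to both models, ‘skateboard’ to the Hilbert space model only, ‘goldfish’ to the tensor product model only, and ‘heron’ to neither, matching the four cases above.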
This indicates that there is a need for a general model that incorporates both the tensor product and the Hilbert space models.

Figure 4.2: Relative frequency of experimental data that can be represented in the Hilbert space or tensor space models.

Chapter 5
Fock Space Modeling of Conjunctions and Disjunctions of Concepts

The Fock space formalism was developed in quantum mechanics to represent systems composed of a varying or unknown number of entities. In quantum theory, the state of a quantum entity is represented as a vector in a Hilbert space $\mathcal{H}$, and the state of a collection of $k$ quantum entities is represented in the tensor product space $\otimes^{k}\mathcal{H}$. A Fock space $\mathcal{F}^{*}$ consists of a direct sum of these tensor products for all possible values of $k$:

\[ \mathcal{F}^{*} = \bigoplus_{k=1}^{\infty} \otimes^{k} \mathcal{H}. \tag{5.1} \]

We use the Fock space structure to develop a model that brings together the two models of Chapter 4 from both a mathematical and a cognitive perspective.

5.1 The Two-sector Fock Space Model

In quantum cognition, the Fock space is used to represent different modes of reasoning in the modeling of concept combinations. We will show a two-sector Fock space model that is a generalization of the Hilbert space and tensor product models developed in Chapter 4. In fact, both models are obtained as two extreme cases of the two-sector Fock space model, each representing a specific mode of reasoning. But before presenting the mathematical formulation of the two-sector Fock space model of concepts, we first take a closer look at the cognitive interpretation of the two models developed in Chapter 4.

5.1.1 Concept Combination in the Hilbert and Tensor Product Models: One or Two Instances in Mind?

Consider the experimental situation of a participant estimating the membership weight of an exemplar with respect to two concepts and their combination.
For example, take the concepts ‘Fruit’ and ‘Vegetable,’ and their conjunction ‘Fruit and Vegetable,’ and suppose we would like to estimate the membership weight of the exemplar ‘apple’ with respect to the conjunction. When participants estimate the membership weight of ‘apple’ with respect to the concept conjunction ‘Fruit and Vegetable,’ two kinds of reasoning can be identified:

1. ‘apple’ being an exemplar of the concept ‘Fruit and Vegetable,’
2. ‘apple’ being an exemplar of the concept ‘Fruit,’ and the concept ‘Vegetable,’ separately.

In the first case, the membership weight of ‘apple’ is estimated with respect to the meaning of a single concept ‘Fruit and Vegetable.’ Thus, a single instance of ‘apple’ is taken into consideration. In the second case, two instances of ‘apple’ are taken into consideration, one for each estimation. Namely, the first instance is estimated with respect to the meaning of ‘Fruit,’ and the second instance with respect to the meaning of ‘Vegetable.’

To clarify the conceptual distinction between these two kinds of reasoning, note that the first case considers a concept that cannot be decomposed as a combination of ‘Fruit’ and ‘Vegetable,’ but is a single emergent concept. In the second case, the combined concept is decomposed into two concepts, and the membership weights for the two concepts are analyzed separately. Therefore, the second case corresponds to the traditional compositional understanding of the conjunction, where logical rules of concept combination operate.

Likewise, if we consider ‘apple’ with respect to ‘Fruit,’ ‘Vegetable,’ and their disjunction ‘Fruit or Vegetable,’ the same kinds of reasoning can be used to estimate membership weights.
Therefore, we can identify two fundamentally different manners of reasoning about the membership of concept combinations: the first considers a concept combination as an emergent entity that cannot be logically decomposed, and the second considers a concept combination as a logically decomposable entity.

It is interesting that the Hilbert space and the tensor product models presented in Chapter 4 can be identified with these two kinds of reasoning. The first type of reasoning, modeled by the Hilbert space model, creates a concept combination state from the states of the former concepts, and the membership weight of the combined concept is related to the average of the membership weights of the two concepts plus an interference term. This model can represent non-logical effects such as overextension of conjunction and underextension of disjunction. Although it is sometimes compatible with classical data, there are multiple cases of classical data that cannot be represented by this model. The second type of reasoning, modeled by the tensor product model, is of a logical nature. In fact, Theorems 4.5 and 4.7 show that the classical probabilistic model and the tensor product model are equivalent.

5.1.2 Introduction to Fock Space Modeling

Definition 5.1. Let $\mathcal{H}$ be a Hilbert space, and $k$ be an integer. We define the $k$th sector, $\mathcal{F}_k$, of a Fock space by

\[ \mathcal{F}_k = \otimes_{i=1}^{k} \mathcal{H}. \tag{5.2} \]

Since we have identified two modes of reasoning, we use the first two sectors of the Fock space to model concept combinations. Namely, the first sector, $\mathcal{F}_1 = \mathcal{H}$, represents the emergent mode of reasoning previously modeled by the Hilbert space model, and the second sector, $\mathcal{F}_2 = \mathcal{H} \otimes \mathcal{H}$, represents the logical mode of reasoning previously modeled by the tensor product model. Hence, our model represents the concept combination in the space

\[ \mathcal{F} = \mathcal{F}_1 \oplus \mathcal{F}_2 = \mathcal{H} \oplus (\mathcal{H} \otimes \mathcal{H}). \]
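(5.3)

Let $|A\rangle$ and $|B\rangle$ be the states of the concepts $A$ and $B$ in $\mathcal{F}_1$, and let $|C\rangle$ be the state of the combination of concepts $A$ and $B$ in $\mathcal{F}_2$. Also, let $\mathbf{M} : \mathcal{H} \to \mathcal{H}$ be the membership operator associated with an exemplar $x$, and let $\mathbf{M}^{A}$, $\mathbf{M}^{B}$, $\mathbf{M}^{\wedge}$, and $\mathbf{M}^{\vee}$ be given by Eqs. (4.64)–(4.67). We have

\[ \begin{aligned}
\mu(A) &= \langle A|\mathbf{M}|A\rangle = \langle C|\mathbf{M}^{A}|C\rangle, \\
\mu(B) &= \langle B|\mathbf{M}|B\rangle = \langle C|\mathbf{M}^{B}|C\rangle.
\end{aligned} \tag{5.4} \]

The state $|\psi_{AB}\rangle$ representing the concept combination is obtained as a superposition of modes of thought:

\[ |\psi_{AB}\rangle = \frac{n_{AB}}{\sqrt{2}}\big(|A\rangle + |B\rangle\big) \oplus \sqrt{1 - n_{AB}^{2}}\,|C\rangle. \tag{5.5} \]

The first term of $|\psi_{AB}\rangle$ represents the probabilistic structure of the concept combination in the emergent mode of reasoning, which is the contribution from the first sector, $\mathcal{F}_1 = \mathcal{H}$. The second term of $|\psi_{AB}\rangle$ represents the probabilistic structure of the concept combination in the logical mode of reasoning, which is the contribution from the second sector, $\mathcal{F}_2 = \mathcal{H} \otimes \mathcal{H}$.

To estimate the membership weight of a concept combination, we construct membership operators in the two-sector Fock space:

\[ \begin{aligned}
\mathbf{M}^{F\wedge} &= \mathbf{M} \oplus \mathbf{M}^{\wedge}, \\
\mathbf{M}^{F\vee} &= \mathbf{M} \oplus \mathbf{M}^{\vee}.
\end{aligned} \tag{5.6} \]

The membership weights for conjunctions and disjunctions of concepts in the two-sector Fock space model must satisfy the following conditions:

\[ \mu(A \text{ and } B) = \langle \psi_{AB}|\mathbf{M}^{F\wedge}|\psi_{AB}\rangle, \tag{5.7} \]
\[ \mu(A \text{ or } B) = \langle \psi_{AB}|\mathbf{M}^{F\vee}|\psi_{AB}\rangle. \tag{5.8} \]

Therefore, the formula for the membership weight of an exemplar with respect to the concept conjunction is obtained by applying conditions (4.4) and (4.76) to (5.7) as follows:

\[ \begin{aligned}
\mu(A \text{ and } B) &= \langle \psi_{AB}|\mathbf{M}^{F\wedge}|\psi_{AB}\rangle \\
&= \langle \psi_{AB}|\mathbf{M} \oplus (\mathbf{M} \otimes \mathbf{M})|\psi_{AB}\rangle \\
&= \frac{n_{AB}^{2}}{2}\big(\langle A| + \langle B|\big)\mathbf{M}\big(|A\rangle + |B\rangle\big) + (1 - n_{AB}^{2})\,\langle C|\mathbf{M} \otimes \mathbf{M}|C\rangle \\
&= n_{AB}^{2}\,\tilde{\mu}(A \text{ and } B) + (1 - n_{AB}^{2})\,\breve{\mu}(A \text{ and } B).
\end{aligned} \tag{5.9} \]

Similarly, we can measure the disjunction of the two concepts as follows:

\[ \begin{aligned}
\mu(A \text{ or } B) &= \langle \psi_{AB}|\mathbf{M}^{F\vee}|\psi_{AB}\rangle \\
&= \frac{n_{AB}^{2}}{2}\big(\langle A| + \langle B|\big)\mathbf{M}\big(|A\rangle + |B\rangle\big) + (1 - n_{AB}^{2})\,\langle C|\big(\mathbf{M}^{A} + \mathbf{M}^{B} - \mathbf{M}^{\wedge}\big)|C\rangle \\
&= n_{AB}^{2}\,\tilde{\mu}(A \text{ or } B) + (1 - n_{AB}^{2})\,\breve{\mu}(A \text{ or } B).
\end{aligned} \tag{5.10} \]

Here $\tilde{\mu}$ denotes the contribution of the first sector and $\breve{\mu}$ the contribution of the second sector. Eqs. (5.9) and (5.10) show that $\mu(A \text{ and } B)$ and $\mu(A \text{ or } B)$ are given by a convex combination of the membership weight formulas for concept combination corresponding to each mode of reasoning.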
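To make the convex combination concrete, the sketch below evaluates the conjunction formula with weight $n_{AB}^{2}$ on the first sector and $1 - n_{AB}^{2}$ on the second, and solves for the weight that reproduces a given datum. It uses the ‘heron’ data of § 4.3, which neither single model could represent. Two illustrative assumptions are made here and are not prescribed by the text: the interference term is set to zero, so that $\tilde{\mu}$ equals the average of $\mu(A)$ and $\mu(B)$, and the second-sector value is taken to be the product $\mu(A)\mu(B)$. The function names are hypothetical.

```python
import math


def fock_conjunction(n2, mu_tilde, mu_breve):
    """Convex combination of the two sector contributions; n2 stands for n_AB squared."""
    return n2 * mu_tilde + (1.0 - n2) * mu_breve


def solve_weight(mu_ab, mu_tilde, mu_breve):
    """Return the n_AB^2 in [0, 1] that reproduces mu_ab, if one exists."""
    if math.isclose(mu_tilde, mu_breve):
        raise ValueError("sector contributions coincide; weight is undetermined")
    n2 = (mu_ab - mu_breve) / (mu_tilde - mu_breve)
    if not 0.0 <= n2 <= 1.0:
        raise ValueError("mu_ab is not between the two sector contributions")
    return n2


# 'heron' data from Section 4.3
mu_a, mu_b, mu_ab = 0.94, 0.15, 0.26

mu_tilde = 0.5 * (mu_a + mu_b)  # assumption: zero interference term
mu_breve = mu_a * mu_b          # assumption: product value for the second sector

# The product value satisfies the classical conditions (4.80)-(4.82).
assert mu_breve <= min(mu_a, mu_b) and mu_a + mu_b - mu_breve <= 1.0

n2 = solve_weight(mu_ab, mu_tilde, mu_breve)
print(round(n2, 6), round(fock_conjunction(n2, mu_tilde, mu_breve), 6))
```

With these choices the weight comes out at roughly 0.29, so about 29% of the estimate is attributed to the emergent mode of reasoning and the rest to the logical mode. Other admissible choices of $\tilde{\mu}$ and $\breve{\mu}$ would give other weights.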
Since each sector must respect the constraints of its mode of reasoning, the contribution from the first sector is constrained as in Theorem 4.2:

\[ \begin{aligned}
\tilde{\mu}(A \text{ and } B) &\in [\mathrm{ave}(AB) - \mathrm{dev}(AB),\; \mathrm{ave}(AB) + \mathrm{dev}(AB)], \\
\tilde{\mu}(A \text{ or } B) &\in [\mathrm{ave}(AB) - \mathrm{dev}(AB),\; \mathrm{ave}(AB) + \mathrm{dev}(AB)].
\end{aligned} \tag{5.11} \]

The membership weight contribution from the second sector is constrained by Theorems 4.5 and 4.7 as follows:

\[ \breve{\mu}(A \text{ and } B) \in [\max(0, \mu(A) + \mu(B) - 1),\; \min(\mu(A), \mu(B))], \tag{5.12} \]
\[ \breve{\mu}(A \text{ or } B) \in [\max(\mu(A), \mu(B)),\; \min(1, \mu(A) + \mu(B))]. \tag{5.13} \]

The membership formulas in the two-sector Fock space model are convex combinations of membership formulas for two modes of thought. Thus, the two-sector Fock space model not only generalizes the Hilbert space and tensor product models, but can also represent cases that cannot be represented by either model.

For example, consider the concepts $A$ = ‘Bird,’ $B$ = ‘Pet,’ and the exemplar $p_6$ = ‘heron.’ Recall that this case was given in § 4.3 as an example that could not be represented by either of the two models. In this case, we have $\mu(A) = 0.94$, $\mu(B) = 0.15$, and $\mu(A \text{ and } B) = 0.26$. For simplicity, suppose that conditions (4.3) and (4.69) are satisfied. Then, by applying Theorems 4.2 and 4.5, the possible values of $\tilde{\mu}(A \text{ and } B)$ and $\breve{\mu}(A \text{ and } B)$ are bounded by the intervals $I_1$ and $I_2$ respectively:

\[ \begin{aligned}
I_1 &= [\mathrm{ave}(AB) - \mathrm{dev}(AB),\; \mathrm{ave}(AB) + \mathrm{dev}(AB)] = [0.32, 0.77], \\
I_2 &= [\max(0, \mu(A) + \mu(B) - 1),\; \min(\mu(A), \mu(B))] = [0.09, 0.15].
\end{aligned} \tag{5.14} \]

Since $0.26$ lies in neither $I_1$ nor $I_2$, neither sector alone can represent this exemplar. However, this exemplar can be represented in the two-sector Fock space model because $\mu(A \text{ and } B)$ belongs to the convex hull of $I_1$ and $I_2$:

\[ [\min(I_1 \cup I_2),\; \max(I_1 \cup I_2)]. \tag{5.15} \]

In particular, Eq.
(5.9) with the choices $\Re(\langle A|\mathbf{M}|B\rangle) = 0$, $\breve{\mu}(A \text{ and } B) = \mu(A)\mu(B) = 0.141$, and $n_{AB}^{2} = 0.294554$ recovers the membership weight $\mu(A \text{ and } B) = 0.26$.

The two-sector Fock space model provides the extra parameter $n_{AB}$. We will show that this parameter becomes crucial in the construction of representations of multiple exemplars in concrete two-sector Fock spaces.

5.2 Data Representation of Multiple Exemplars

The two-sector Fock space model developed in § 5.1.2 provides an abstract model to represent concept combinations. This model has a broader modeling scope than the Hilbert space and tensor product models. In the current literature, the concrete instantiations of this abstract model have always provided representations that are exemplar-dependent. This means that a different state is given for each exemplar, and a single membership operator represents the semantic estimations of all exemplars [Aer07a, Aer07b, Aer09]. Such concrete representations are useful to explain the way that state vectors, operators, and modes of reasoning operate in the two-sector Fock space model. However, they do not model concepts in accordance with the cognitive principles that have inspired the abstract model.

From a cognitive perspective, a concept is an entity in a state that is independent of the exemplar to be measured. In addition, since semantic estimations are used to compare concepts and exemplars, these estimations should depend on the exemplar being measured. Therefore, a concrete representation in the two-sector Fock space model must satisfy the following two modeling principles of quantum cognition:

1. Concepts are represented by a state that is independent of the exemplar to be measured,
2. Semantic estimations are represented by a measurement operator that depends on the exemplar to be measured.

The following examples show that the concrete representations provided in the literature disagree with these modeling principles.
Consider the exemplars ‘filing cabinet’ and ‘heated waterbed’ with respect to the concepts $A$ = ‘Furniture,’ $B$ = ‘Household Appliances,’ and their conjunction $AB$ = ‘Furniture and Household Appliances’ [Ham88b]. For the first exemplar, we have $\mu(A) = 0.97$, $\mu(B) = 0.31$, and $\mu(A \text{ and } B) = 0.53$. Applying Theorem 4.2, we represent this case in the space $\mathbb{C}^3$ by the vectors

\[ \begin{aligned}
|A\rangle &= (-0.57 + 0.40i,\; 0.29 - 0.63i,\; 0.13 + 0.11i), \text{ and} \\
|B\rangle &= (0.39,\; 0.39,\; 0.83).
\end{aligned} \tag{5.16} \]

For the second exemplar, we have $\mu(A) = 1$, $\mu(B) = 0.49$, and $\mu(A \text{ and } B) = 0.78$. Applying Theorem 4.2 yields

\[ \begin{aligned}
|A\rangle &= (0.71,\; 0.71,\; 0), \text{ and} \\
|B\rangle &= (0.49,\; 0.49,\; 0.71).
\end{aligned} \tag{5.17} \]

Because the state vectors $|A\rangle$ and $|B\rangle$ are different for each exemplar, this concrete representation is exemplar-dependent. Moreover, because $\mu(A) + \mu(B) > 1$ in both cases, $\mathbf{M}$ is the operator that projects onto the first two dimensions of these vectors, so the same operator is used for two different exemplars.

The same situation occurs for the tensor product model. For example, consider the exemplars ‘sailboat’ and ‘roadroller’ with respect to the concepts $A$ = ‘Machine,’ $B$ = ‘Vehicle,’ and their conjunction $AB$ = ‘Machine and Vehicle.’ For the first exemplar, we have $\mu(A) = 0.56$, $\mu(B) = 0.8$, and $\mu(A \text{ and } B) = 0.42$. Applying Theorem 4.5, we represent this case in the space $\mathbb{C}^2 \otimes \mathbb{C}^2$ by the tensor

\[ |C\rangle = 0.64\,|1\rangle \otimes |1\rangle + 0.37\,|1\rangle \otimes |2\rangle + 0.62\,|2\rangle \otimes |1\rangle + 0.24\,|2\rangle \otimes |2\rangle. \tag{5.18} \]

For the second exemplar, we have $\mu(A) = 0.94$, $\mu(B) = 0.91$, and $\mu(A \text{ and } B) = 0.91$. Applying Theorem 4.5 yields

\[ |C\rangle = 0.95\,|1\rangle \otimes |1\rangle + 0.17\,|1\rangle \otimes |2\rangle + 0.24\,|2\rangle \otimes |2\rangle. \tag{5.19} \]

Moreover, $\mathbf{M}$ is a one-dimensional projector in both cases. Thus, the construction of concrete representations for concepts in the tensor product model is also exemplar-dependent. It is, however, possible to develop concrete representations for multiple exemplars that are consistent with the quantum modeling principles.
To do so, we will exploit the linear structure of the Hilbert space and tensor space models and use special linear operators known as unitary transformations.

Unitary transformations embody the notion of isometry. This means that they do not affect the value of the inner product. Formally, given a unitary operator $U$ and two vectors $|x\rangle$ and $|y\rangle$, we have

\[ \langle Ux|Uy\rangle = \langle x|y\rangle. \tag{5.20} \]

An important consequence of Eq. (5.20) is that, when a unitary operator is applied to transform the state of a concept and a membership operator, the membership weight is preserved.

Unitary operators in a Hilbert space are a generalization of rotation matrices in linear algebra; they correspond to a change of basis for the state vectors and for the operators in a Hilbert space. We will apply unitary transformations to the representations obtained from Theorems 4.2, 4.5, and 4.7 to obtain new representations where all the exemplars are represented in the same basis. Next, we will combine these representations to provide a representation in the two-sector Fock space model where the concept is identified with one single state, and the measurement operators are different for each exemplar.

Since the representational problems of conjunctions and disjunctions are similar, we focus on the case of conjunctions of concepts. In what follows, we denote the set of data $\{\mu_i(A), \mu_i(B), \mu_i(A \text{ and } B)\}_{i=1}^{k}$ by $\mu_{i=1}^{k}$. A particular triplet $(\mu_i(A), \mu_i(B), \mu_i(A \text{ and } B))$ will be denoted by $\mu_i$, and the conjunction $\mu_i(A \text{ and } B)$ will be denoted by $\mu_i(AB)$.

5.2.1 Hilbert Space Representation

We now show how to concretely represent multiple exemplars in the Hilbert space model using the space $\mathbb{C}^3$.

Definition 5.2.

Theorem 5.3. The set of data $\mu_{i=1}^{k}$ has a representation in $\mathbb{C}^3$ if and only if for all $i = 1, \dots, k$,

\[ \mu_i(AB) \in [\mathrm{ave}_i(AB) - \mathrm{dev}_i(AB),\; \mathrm{ave}_i(AB) + \mathrm{dev}_i(AB)]. \tag{5.21} \]

Proof. Let $|A\rangle = |1\rangle$, $|B\rangle = |2\rangle$, and $|C\rangle = |3\rangle$ form the canonical basis of $\mathbb{C}^3$.
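We prove that if (5.21) is satisfied for each $i = 1, \dots, k$, then there exists an orthogonal projector $\mathbf{M}_i$ such that conditions (4.1)–(4.4) are satisfied for $|A\rangle$, $|B\rangle$, and $\mathbf{M}_i$.

Let $i \in \{1, \dots, k\}$. Since $\mu_i(A)$, $\mu_i(B)$, and $\mu_i(AB)$ satisfy (5.21), by Theorem 4.2 there exist two vectors

\[ \begin{aligned}
|A_i\rangle &= (e^{i\alpha_1} a_1,\; e^{i\alpha_2} a_2,\; e^{i\alpha_3} a_3), \text{ and} \\
|B_i\rangle &= (e^{i\beta_1} b_1,\; e^{i\beta_2} b_2,\; e^{i\beta_3} b_3),
\end{aligned} \tag{5.22} \]

and an orthogonal projector $\tilde{\mathbf{M}}_i$ such that (4.1)–(4.4) are satisfied.

Let $|C_i\rangle$ be the complex conjugate of the cross product of $|A_i\rangle$ and $|B_i\rangle$:

\[ \begin{aligned}
|C_i\rangle = \overline{|A_i\rangle \times |B_i\rangle}
= \big(\, & a_2 b_3 e^{-i(\beta_3+\alpha_2)} - a_3 b_2 e^{-i(\beta_2+\alpha_3)}, \\
& a_3 b_1 e^{-i(\beta_1+\alpha_3)} - a_1 b_3 e^{-i(\beta_3+\alpha_1)}, \\
& a_1 b_2 e^{-i(\beta_2+\alpha_1)} - a_2 b_1 e^{-i(\beta_1+\alpha_2)} \,\big).
\end{aligned} \tag{5.23} \]

The vector $|C_i\rangle$ is chosen to complete an orthonormal basis for $\mathbb{C}^3$ from $|A_i\rangle$ and $|B_i\rangle$. This ensures that $|C_i\rangle \perp |A_i\rangle$, $|C_i\rangle \perp |B_i\rangle$, and $\| |C_i\rangle \| = 1$. Now we define the operator $U_i$ by

\[ U_i = \begin{pmatrix}
\langle A_i|A\rangle & \langle A_i|B\rangle & \langle A_i|C\rangle \\
\langle B_i|A\rangle & \langle B_i|B\rangle & \langle B_i|C\rangle \\
\langle C_i|A\rangle & \langle C_i|B\rangle & \langle C_i|C\rangle
\end{pmatrix}. \tag{5.24} \]

$U_i$ is a unitary operator whose action induces a change from the basis $(|A_i\rangle, |B_i\rangle, |C_i\rangle)$ to the basis $(|A\rangle, |B\rangle, |C\rangle)$. In fact,

\[ U_i|A_i\rangle = |A\rangle, \quad U_i|B_i\rangle = |B\rangle, \quad \text{and} \quad U_i|C_i\rangle = |C\rangle. \]

We apply the operator $U_i$ to transform $\tilde{\mathbf{M}}_i$, which was constructed for the orthonormal basis $\{|A_i\rangle, |B_i\rangle, |C_i\rangle\}$. Set

\[ \mathbf{M}_i = U_i \tilde{\mathbf{M}}_i U_i^{-1}. \tag{5.25} \]

$\mathbf{M}_i$ is the operator $\tilde{\mathbf{M}}_i$ represented in the basis $(|A\rangle, |B\rangle, |C\rangle)$. Since

\[ \mathbf{1} = U_i^{-1} U_i = U_i U_i^{-1}, \tag{5.26} \]

we obtain

\[ \begin{aligned}
\mu_i(A) &= \langle A_i|\tilde{\mathbf{M}}_i|A_i\rangle = \langle A_i|U_i^{-1}\,\big(U_i \tilde{\mathbf{M}}_i U_i^{-1}\big)\,U_i|A_i\rangle = \langle A|\mathbf{M}_i|A\rangle, \\
\mu_i(B) &= \langle B_i|\tilde{\mathbf{M}}_i|B_i\rangle = \langle B_i|U_i^{-1}\,\big(U_i \tilde{\mathbf{M}}_i U_i^{-1}\big)\,U_i|B_i\rangle = \langle B|\mathbf{M}_i|B\rangle,
\end{aligned} \tag{5.27} \]

and

\[ \begin{aligned}
\mu_i(AB) &= \tfrac{1}{2}\big(\mu_i(A) + \mu_i(B)\big) + \Re\big(\langle A_i|\tilde{\mathbf{M}}_i|B_i\rangle\big) \\
&= \tfrac{1}{2}\big(\mu_i(A) + \mu_i(B)\big) + \Re\big(\langle A_i|U_i^{-1}\,\big(U_i \tilde{\mathbf{M}}_i U_i^{-1}\big)\,U_i|B_i\rangle\big) \\
&= \tfrac{1}{2}\big(\mu_i(A) + \mu_i(B)\big) + \Re\big(\langle A|\mathbf{M}_i|B\rangle\big).
\end{aligned} \tag{5.28} \]

The other side of the implication is a direct consequence of Definition 5.2.

Theorem 5.3 provides a data representation in terms of a single pair of vectors $|A\rangle$ and $|B\rangle$, and a set of projectors $\mathbf{M}_i$, $i = 1, \dots, k$, corresponding to the membership operator for each exemplar.

Recall that the exemplars $p$ = ‘filing cabinet’ and $q$ = ‘heated waterbed’ were represented by different state vectors and the same measurement operator at the beginning of § 5.2.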
We can now apply Theorem 5.3 to obtain a representation consistent with the modeling principles of quantum cognition using the state vectors

\[ \begin{aligned}
|A\rangle &= (1, 0, 0) = |1\rangle, \\
|B\rangle &= (0, 1, 0) = |2\rangle,
\end{aligned} \tag{5.29} \]

and two measurement operators corresponding to the exemplars $p$ and $q$:

\[ \begin{aligned}
\mathbf{M}_p &= \begin{pmatrix}
0.97 & -0.11 + 0.09i & 0.09 + 0.01i \\
-0.11 - 0.09i & 0.31 & 0.28 + 0.34i \\
0.09 - 0.01i & 0.28 - 0.34i & 0.72
\end{pmatrix}, \\
\mathbf{M}_q &= \begin{pmatrix}
1 & 0 & 0 \\
0 & 0.49 & 0.499 \\
0 & 0.499 & 0.51
\end{pmatrix}.
\end{aligned} \tag{5.30} \]

The construction introduced in the proof of Theorem 5.3 is independent of the choice of the vectors $|A\rangle$ and $|B\rangle$.

Corollary 5.4. Let $(|A\rangle, |B\rangle, \{\mathbf{M}_i\}_{i=1}^{k})$ be a representation of $\mu_{i=1}^{k}$ in $\mathbb{C}^3$, and let $|A'\rangle, |B'\rangle \in \mathbb{C}^3$ be two orthogonal unit vectors. Then, there exists a unitary transformation $U$ such that $(|A'\rangle, |B'\rangle, \{U^{-1}\mathbf{M}_i U\}_{i=1}^{k})$ is a representation of $\mu_{i=1}^{k}$.

Corollary 5.4 shows that all representations in $\mathbb{C}^3$ are equivalent up to a unitary transformation.

5.2.2 Tensor Product Model Representation

We now apply unitary transformations in the concrete representations of the tensor product model in $\mathbb{C}^n \otimes \mathbb{C}^n$. We first define different types of representations for multiple exemplars, and then provide explicit representation theorems for the cases $n = 2$ and $3$. These will be useful to study the performance of the two-sector Fock space model.

Definition 5.5. A zero-type representation of $\mu_{i=1}^{k}$ on the tensor product space $\mathbb{C}^n \otimes \mathbb{C}^n$ is a unit vector $|C\rangle \in \mathbb{C}^n \otimes \mathbb{C}^n$, and a collection of orthogonal projectors $\{\mathbf{M}^{A}_i, \mathbf{M}^{B}_i\}_{i=1}^{k}$ from $\mathbb{C}^n \otimes \mathbb{C}^n$ to $\mathbb{C}^n \otimes \mathbb{C}^n$, such that conditions (4.73)–(4.76) are satisfied with $\mathbf{M}^{\wedge}_i = \mathbf{M}^{A}_i \mathbf{M}^{B}_i$, for $i = 1, \dots, k$. We say $(|C\rangle, \{\mathbf{M}^{A}_i, \mathbf{M}^{B}_i\}_{i=1}^{k})$ is a zero-type representation of $\mu_{i=1}^{k}$ in $\mathbb{C}^n \otimes \mathbb{C}^n$.

The zero-type representation is, mathematically speaking, the most general representation in the tensor product model that is consistent with the modeling principles of quantum cognition because it assumes a single concept state $|C\rangle$, and a collection of measurements that represent the membership weight estimations.
However, this representation cannot be appropriately interpreted because $\mathbf{M}^{A}$ and $\mathbf{M}^{B}$ can be entangled measurements.[9]

[9] Entangled measurements appear in non-trivial analysis of entanglement in physics. A possible interpretation of entangled measurements in this model is left for future work.

A more reasonable representation of data assumes that the measurements $\mathbf{M}^{A}$ and $\mathbf{M}^{B}$ act on different sides of $\mathbb{C}^n \otimes \mathbb{C}^n$, so they are not entangled.

Definition 5.6. A first-type representation of $\mu_{i=1}^{k}$ on the tensor product space $\mathbb{C}^n \otimes \mathbb{C}^n$ is a unit vector $|C\rangle \in \mathbb{C}^n \otimes \mathbb{C}^n$, and a collection of orthogonal projectors $\mathbf{M}_i$ from $\mathbb{C}^n$ to $\mathbb{C}^n$, for $i = 1, \dots, k$, such that $(|C\rangle, \{\mathbf{M}_i \otimes \mathbf{1}, \mathbf{1} \otimes \mathbf{M}_i\}_{i=1}^{k})$ is a zero-type representation of $\mu_{i=1}^{k}$ in $\mathbb{C}^n \otimes \mathbb{C}^n$.

The first-type representation is a direct extension of the representation of individual exemplars in Definition 4.4, and thus it is interpreted according to that representation: the state $|C\rangle$ describes the situation having two concepts and their combination, and $\mathbf{M}_i$ represents the semantic estimation of exemplar $p_i$, $i = 1, \dots, k$.

We now introduce another representation that is mathematically simpler, and thus will facilitate the data analysis.

Definition 5.7. A second-type representation of $\mu_{i=1}^{k}$ on the tensor product space $\mathbb{C}^n \otimes \mathbb{C}^n$ is a pair of unit vectors $|A\rangle$ and $|B\rangle \in \mathbb{C}^n$, and a collection of orthogonal projectors $\mathbf{M}_i$ from $\mathbb{C}^n$ to $\mathbb{C}^n$, for $i = 1, \dots, k$, such that $(|A\rangle \otimes |B\rangle, \{\mathbf{M}_i \otimes \mathbf{1}, \mathbf{1} \otimes \mathbf{M}_i\}_{i=1}^{k})$ is a zero-type representation of $\mu_{i=1}^{k}$ in $\mathbb{C}^n \otimes \mathbb{C}^n$.

The zero-, first-, and second-type representations require different conditions to represent a collection of exemplars for a pair of concepts and their conjunction.
While the first-type corresponds to the natural way to represent a pair of systems in quantum physics, and thus is the natural way to define a representation in the tensor product model for concepts, the zero-type provides a general way to build concrete representations because it does not impose a product structure on the concept state or the membership operators for the exemplars. The second-type is a mathematical simplification of the first-type representation that assumes $|C\rangle$ to be a product state. In fact, it follows directly from Definitions 5.5–5.7 that a second-type representation is also a first-type representation, and a first-type representation is also a zero-type representation.

The following theorem characterizes the cases when a set of data has a zero-type representation in $\mathbb{C}^2 \otimes \mathbb{C}^2$.

Theorem 5.8. The set of data $\mu_{i=1}^{k}$ has a zero-type representation in $\mathbb{C}^2 \otimes \mathbb{C}^2$ if and only if $\mu_i$ is classical conjunction data for $i = 1, \dots, k$.

Proof. For each $i = 1, \dots, k$, we use the construction in the proof of Theorem 4.5 to obtain a tensor $|\tilde{C}_i\rangle$ and a one-dimensional projector $\tilde{\mathbf{M}}$ such that $\tilde{\mathbf{M}}^{A}_i = \tilde{\mathbf{M}} \otimes \mathbf{1}$, $\tilde{\mathbf{M}}^{B}_i = \mathbf{1} \otimes \tilde{\mathbf{M}}$, and $\tilde{\mathbf{M}}^{\wedge}_i = \tilde{\mathbf{M}} \otimes \tilde{\mathbf{M}}$. This gives the tensor product representation for $\mu_i$. Next, we use unitary transformations to change this representation so that $|\tilde{C}_i\rangle$ is a vector in the canonical basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$. To facilitate the notation, we will make use of the isomorphism $I$ between $\mathbb{C}^2 \otimes \mathbb{C}^2$ and $\mathbb{C}^4$. Let

\[ \begin{aligned}
(1, 0, 0, 0) &= |e_1\rangle, \\
(0, 1, 0, 0) &= |e_2\rangle, \\
(0, 0, 1, 0) &= |e_3\rangle, \\
(0, 0, 0, 1) &= |e_4\rangle.
\end{aligned} \tag{5.31} \]

We define

\[ \begin{aligned}
I(|1\rangle \otimes |1\rangle) &= |e_1\rangle, \\
I(|1\rangle \otimes |2\rangle) &= |e_2\rangle, \\
I(|2\rangle \otimes |1\rangle) &= |e_3\rangle, \\
I(|2\rangle \otimes |2\rangle) &= |e_4\rangle.
\end{aligned} \tag{5.32} \]

The isomorphism $I$ allows us to represent $|\tilde{C}_i\rangle$ by a vector $|C_i\rangle$ in $\mathbb{C}^4$.

We can prove the theorem by building a unitary transformation that takes $|C_i\rangle$ to one of the canonical basis vectors of $\mathbb{C}^4$, and using this transformation to represent the operators $\tilde{\mathbf{M}}^{A}_i$, $\tilde{\mathbf{M}}^{B}_i$, and $\tilde{\mathbf{M}}^{\wedge}_i$ by the operators $\mathbf{M}^{A}_i$, $\mathbf{M}^{B}_i$, and $\mathbf{M}^{\wedge}_i$ in $\mathbb{C}^4$.
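Next, we apply the inverse isomorphism $I^{-1}$ to map these new representations to $\mathbb{C}^2 \otimes \mathbb{C}^2$.

Let $|D_i\rangle$, $|E_i\rangle$, $|F_i\rangle$ be three vectors in $\mathbb{C}^4$ such that

\[ \begin{aligned}
\langle D_i|D_i\rangle &= \langle E_i|E_i\rangle = \langle F_i|F_i\rangle = 1, \\
\langle C_i|D_i\rangle &= \langle C_i|E_i\rangle = \langle C_i|F_i\rangle = 0, \\
\langle D_i|E_i\rangle &= \langle D_i|F_i\rangle = \langle E_i|F_i\rangle = 0.
\end{aligned} \tag{5.33} \]

The vectors $|C_i\rangle$, $|D_i\rangle$, $|E_i\rangle$, and $|F_i\rangle$ form an orthonormal basis for $\mathbb{C}^4$. Set

\[ U_i = \begin{pmatrix}
\langle C_i|e_1\rangle & \langle C_i|e_2\rangle & \langle C_i|e_3\rangle & \langle C_i|e_4\rangle \\
\langle D_i|e_1\rangle & \langle D_i|e_2\rangle & \langle D_i|e_3\rangle & \langle D_i|e_4\rangle \\
\langle E_i|e_1\rangle & \langle E_i|e_2\rangle & \langle E_i|e_3\rangle & \langle E_i|e_4\rangle \\
\langle F_i|e_1\rangle & \langle F_i|e_2\rangle & \langle F_i|e_3\rangle & \langle F_i|e_4\rangle
\end{pmatrix}. \tag{5.34} \]

Note that $U_i$ is a unitary matrix whose action induces a change from the basis $\{|C_i\rangle, |D_i\rangle, |E_i\rangle, |F_i\rangle\}$ to the basis $\{|e_j\rangle\}_{j=1}^{4}$. In fact,

\[ U_i|C_i\rangle = |e_1\rangle, \quad U_i|D_i\rangle = |e_2\rangle, \quad U_i|E_i\rangle = |e_3\rangle, \quad \text{and} \quad U_i|F_i\rangle = |e_4\rangle. \]

The operator $U_i$ can now be used to change the basis in which $\mathbf{M}^{A}_i$, $\mathbf{M}^{B}_i$, and $\mathbf{M}^{\wedge}_i$ are represented to the basis $\{|e_j\rangle\}_{j=1}^{4}$:

\[ \begin{aligned}
\bar{\mathbf{M}}^{A}_i &= U_i \mathbf{M}^{A}_i U_i^{-1}, \\
\bar{\mathbf{M}}^{B}_i &= U_i \mathbf{M}^{B}_i U_i^{-1}, \\
\bar{\mathbf{M}}^{\wedge}_i &= U_i \mathbf{M}^{\wedge}_i U_i^{-1}.
\end{aligned} \tag{5.35} \]

Since $\mathbf{1} = U_i^{-1} U_i = U_i U_i^{-1}$, we obtain

\[ \begin{aligned}
\mu_i(A) &= \langle C_i|\mathbf{M}^{A}_i|C_i\rangle = \langle C_i|U_i^{-1}\,\big(U_i \mathbf{M}^{A}_i U_i^{-1}\big)\,U_i|C_i\rangle = \langle e_1|\bar{\mathbf{M}}^{A}_i|e_1\rangle, \\
\mu_i(B) &= \langle C_i|\mathbf{M}^{B}_i|C_i\rangle = \langle C_i|U_i^{-1}\,\big(U_i \mathbf{M}^{B}_i U_i^{-1}\big)\,U_i|C_i\rangle = \langle e_1|\bar{\mathbf{M}}^{B}_i|e_1\rangle, \\
\mu_i(AB) &= \langle C_i|\mathbf{M}^{\wedge}_i|C_i\rangle = \langle C_i|U_i^{-1}\,\big(U_i \mathbf{M}^{\wedge}_i U_i^{-1}\big)\,U_i|C_i\rangle = \langle e_1|\bar{\mathbf{M}}^{\wedge}_i|e_1\rangle.
\end{aligned} \tag{5.36} \]

We then use the inverse isomorphism $I^{-1}$ to obtain a zero-type representation in $\mathbb{C}^2 \otimes \mathbb{C}^2$:

\[ \begin{aligned}
|C\rangle &= I^{-1}(|e_1\rangle) = |1\rangle \otimes |1\rangle, \\
\tilde{\mathbf{M}}^{A}_i &= I^{-1} \bar{\mathbf{M}}^{A}_i I, \\
\tilde{\mathbf{M}}^{B}_i &= I^{-1} \bar{\mathbf{M}}^{B}_i I, \\
\tilde{\mathbf{M}}^{\wedge}_i &= I^{-1} \bar{\mathbf{M}}^{\wedge}_i I.
\end{aligned} \tag{5.37} \]

We have constructed a zero-type representation $(|1\rangle \otimes |1\rangle, \{\mathbf{M}^{A}_i, \mathbf{M}^{B}_i\}_{i=1}^{k})$ from a collection of representations $(|C_i\rangle, \mathbf{M})$ for the exemplars $p_i$, with $\mathbf{M}(x, y) \to (x, 0)$ obtained from Theorem 4.5.

In the construction of Theorem 5.8, note that when Eq. (5.37) yields operators $\mathbf{M}^{A}_i$ and $\mathbf{M}^{B}_i$ of the form $\mathbf{M}^{A}_i = \check{\mathbf{M}}_i \otimes \mathbf{1}$ and $\mathbf{M}^{B}_i = \mathbf{1} \otimes \check{\mathbf{M}}_i$, the representation is also of the first type. Stating the necessary and sufficient conditions required for a set of data to have a first-type representation is outside the scope of this thesis.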
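However, since second-type representations are also first-type representations, we can obtain sufficient conditions for the existence of a first-type representation by characterizing the conditions required for the data to have a second-type representation:

Lemma 5.9. The set of data $\mu_{i=1}^{k}$ has a second-type representation in $\mathbb{C}^2 \otimes \mathbb{C}^2$ if and only if for each $i = 1, \dots, k$, there exist $|A_i\rangle$, $|B_i\rangle$, $\check{\mathbf{M}}^{A}_i$, and $\check{\mathbf{M}}^{B}_i$ such that Eqs. (4.62)–(4.66) are satisfied.

Proof. Let $U_i(A)$ and $U_i(B)$ be the unitary transformations that map $|A_i\rangle$ to $|1\rangle$ and $|B_i\rangle$ to $|1\rangle$ for $i = 1, \dots, k$. Then $(|1\rangle \otimes |1\rangle, \{\mathbf{M}^{A}_i \otimes \mathbf{1}, \mathbf{1} \otimes \mathbf{M}^{B}_i\}_{i=1}^{k})$ is a tensor space zero-type representation of $\mu_{i=1}^{k}$ with

\[ \begin{aligned}
\mathbf{M}^{A}_i &= U_i(A)\, \check{\mathbf{M}}^{A}_i\, U_i(A)^{-1}, \\
\mathbf{M}^{B}_i &= U_i(B)\, \check{\mathbf{M}}^{B}_i\, U_i(B)^{-1},
\end{aligned} \tag{5.38} \]

where the order of the factors follows Eq. (5.25), so that $\langle 1|\mathbf{M}^{A}_i|1\rangle = \langle A_i|\check{\mathbf{M}}^{A}_i|A_i\rangle$ and $\langle 1|\mathbf{M}^{B}_i|1\rangle = \langle B_i|\check{\mathbf{M}}^{B}_i|B_i\rangle$.

Note that Theorem 5.8 and Lemma 5.9 characterize the sets of data that have zero- and second-type representations. Since the first-type representation is less general than the zero-type representation, but more general than the second-type representation, Theorem 5.8 and Lemma 5.9 can be applied to obtain an upper and a lower bound on the number of exemplars that have a first-type representation.

We need to extend Theorem 5.8 to $\mathbb{C}^3 \otimes \mathbb{C}^3$ so that the zero-, first-, and second-type representations become compatible with the representation developed in § 5.2.1 for a Hilbert space model in $\mathbb{C}^3$. The next corollary extends the proof of Theorem 5.8 to the space $\mathbb{C}^3 \otimes \mathbb{C}^3$.

Corollary 5.10. If the set of data $\mu_{i=1}^{k}$ has a zero-type representation in $\mathbb{C}^2 \otimes \mathbb{C}^2$, then $\mu_{i=1}^{k}$ has a zero-type representation in $\mathbb{C}^3 \otimes \mathbb{C}^3$.

Proof. Let $(|C\rangle, \{\mathbf{M}^{A}_i, \mathbf{M}^{B}_i\}_{i=1}^{k})$ be a zero-type representation of $\mu_{i=1}^{k}$ in $\mathbb{C}^2 \otimes \mathbb{C}^2$. We can create a vector

\[ |C^{*}\rangle = \sum_{i,j=1}^{3} c^{*}_{ij}\, |i\rangle \otimes |j\rangle \tag{5.39} \]

such that it is the trivial embedding of

\[ |C\rangle = \sum_{i,j=1}^{2} c_{ij}\, |i\rangle \otimes |j\rangle \tag{5.40} \]

in $\mathbb{C}^3 \otimes \mathbb{C}^3$ by choosing

\[ c^{*}_{ij} = \begin{cases} c_{ij} & i, j \in \{1, 2\}, \\ 0 & \text{else.} \end{cases} \tag{5.41} \]

Similarly, we can also create operators $\mathbf{M}^{A*}_i$ and $\mathbf{M}^{B*}_i$ by using the trivial embedding so that the actions of the operators $\mathbf{M}^{A}_i$ and $\mathbf{M}^{B}_i$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$ are preserved.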
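This completes the proof.

Since second-type representations are also first- and zero-type representations, we can apply Corollary 5.10 to obtain first- and second-type representations in $\mathbb{C}^3 \otimes \mathbb{C}^3$.

5.2.3 Two-sector Fock Space Representation

We now combine the representations of multiple exemplars developed in § 5.2.1 and § 5.2.2 to represent sets of data in the two-sector Fock space model in a way that is consistent with the modeling principles of quantum cognition in the concrete space $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$.

Definition 5.11. A zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$ consists of the vectors $|A\rangle, |B\rangle \in \mathbb{C}^3$ and $|C\rangle \in \mathbb{C}^3 \otimes \mathbb{C}^3$, a collection of operators $\{M_i, M_i^A, M_i^B\}_{i=1}^{k}$ from $\mathbb{C}^3 \otimes \mathbb{C}^3$ to $\mathbb{C}^3 \otimes \mathbb{C}^3$, and a coefficient $n_{AB} \in [0, 1]$ such that for all $i = 1, \ldots, k$, condition (5.4) is satisfied, and the vector $|\psi_{AB}\rangle$, defined in Eq. (5.5), and the operator $M_i^{F\wedge}$, defined in Eq. (5.6), satisfy condition (5.7). We say that $(n_{AB}, |A\rangle, |B\rangle, |C\rangle, \{M_i, M_i^A, M_i^B\}_{i=1}^{k})$ is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$.

Definition 5.12. A first-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$ consists of a tensor $|C\rangle \in \mathbb{C}^3 \otimes \mathbb{C}^3$, a collection of operators $\{M_i\}_{i=1}^{k}$ from $\mathbb{C}^3$ to $\mathbb{C}^3$, and a coefficient $n_{AB} \in [0, 1]$ such that $(n_{AB}, |A\rangle, |B\rangle, |C\rangle, \{M_i, M_i \otimes 1, 1 \otimes M_i\}_{i=1}^{k})$ is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$. We say that $(n_{AB}, |A\rangle, |B\rangle, \{M_i\}_{i=1}^{k})$ is a first-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$.

Definition 5.13. A second-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$ consists of the vectors $|A\rangle, |B\rangle \in \mathbb{C}^3$, a collection of operators $\{M_i\}_{i=1}^{k}$ from $\mathbb{C}^3$ to $\mathbb{C}^3$, and a coefficient $n_{AB} \in [0, 1]$ such that $(n_{AB}, |A\rangle, |B\rangle, |A\rangle \otimes |B\rangle, \{M_i, M_i \otimes 1, 1 \otimes M_i\}_{i=1}^{k})$ is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$. We say that $(n_{AB}, |A\rangle, |B\rangle, \{M_i\}_{i=1}^{k})$ is a second-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$.

We now identify the conditions for a zero-type and a second-type representation of a set of data in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$.

Theorem 5.14.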
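The set of data $\{\mu_i\}_{i=1}^{k}$ admits a zero-type representation in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$ if and only if there exists $n_{AB} \in [0, 1]$ such that for all $i = 1, \ldots, k$
\[
\mu_i(AB) = n_{AB}^2\,\tilde{\mu}_i(AB) + (1 - n_{AB}^2)\,\breve{\mu}_i(AB),
\tag{5.42}
\]
with $\{\tilde{\mu}_i\}_{i=1}^{k}$ satisfying conditions (4.1)–(4.4), and $\{\breve{\mu}_i\}_{i=1}^{k}$ being classical conjunction data.

Proof. Since $\tilde{\mu}_i$ satisfies conditions (4.1)–(4.4) for $i = 1, \ldots, k$, we apply Theorem 5.3 and Corollary 5.4 to obtain the representation $(|A\rangle, |B\rangle, \{M_i\}_{i=1}^{k})$ of $\{\tilde{\mu}_i\}_{i=1}^{k}$ in $\mathbb{C}^3$ with $|A\rangle = |1\rangle$ and $|B\rangle = |2\rangle$. Similarly, since $\breve{\mu}_i$ is classical data for $i = 1, \ldots, k$, we apply Theorem 5.8 and Corollary 5.10 to obtain a zero-type representation $(|C\rangle, \{M_i^A, M_i^B\}_{i=1}^{k})$ of $\{\breve{\mu}_i\}_{i=1}^{k}$ in $\mathbb{C}^3 \otimes \mathbb{C}^3$.

Next, let $i = 1, \ldots, k$. We apply Eq. (5.44) to show that the state $|\psi_{AB}\rangle$ satisfies
\[
\begin{aligned}
\langle \psi_{AB}|M_i^{F\wedge}|\psi_{AB}\rangle
&= \frac{n_{AB}^2}{2}\big(\langle A| + \langle B|\big) M_i \big(|A\rangle + |B\rangle\big) + (1 - n_{AB}^2)\,\langle C|M_i^\wedge|C\rangle\\
&= n_{AB}^2\,\tilde{\mu}_i(AB) + (1 - n_{AB}^2)\,\breve{\mu}_i(AB) = \mu_i(AB).
\end{aligned}
\tag{5.43}
\]
This completes the proof.

This result can be similarly obtained for the second-type representation.

Corollary 5.15. The set of data $\{\mu_i\}_{i=1}^{k}$ admits a second-type representation in $\mathbb{C}^3 \oplus \mathbb{C}^3 \otimes \mathbb{C}^3$ if and only if there exists $n_{AB} \in [0, 1]$ such that for all $i = 1, \ldots, k$
\[
\mu_i(AB) = n_{AB}^2\,\tilde{\mu}_i(AB) + (1 - n_{AB}^2)\,\breve{\mu}_i(AB),
\tag{5.44}
\]
with $\tilde{\mu}_i$ satisfying conditions (4.1)–(4.4), and $\breve{\mu}_i(AB) = \mu_i(A)\mu_i(B)$.

Proof.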
The result follows from the proof of Theorem 5.14, replacing $\breve{\mu}_i(AB)$ by $\mu_i(A)\mu_i(B)$.

Theorem 5.14 shows that a zero-type representation of a set of data that respects the modeling principles of quantum cognition requires the existence of a value of $n_{AB}$ such that the convex combination of the membership weights $\tilde{\mu}_i(AB)$, representing the contribution given by $\mathbb{C}^3$, and $\breve{\mu}_i(AB)$, representing the contribution given by $\mathbb{C}^3 \otimes \mathbb{C}^3$, equals the membership weight $\mu_i(AB)$ for $i = 1, \ldots, k$.

The second-type representation additionally imposes the conditions that the membership weight operators for concepts $A$ and $B$ act separately on the two sides of $\mathbb{C}^3 \otimes \mathbb{C}^3$, and that $|C\rangle = |A\rangle \otimes |B\rangle$.

As an example of how to identify whether or not a collection of exemplars can be represented, consider the exemplars $p_1$ = ‘filing cabinet’ and $p_{11}$ = ‘painting’ for the concepts $A$ = ‘Furniture,’ $B$ = ‘Household Appliances,’ and their conjunction $A$ and $B$ = ‘Furniture and Household Appliances’ [Ham88b]. The membership values are
\[
\begin{aligned}
\mu_1(A) &= 0.97, & \mu_1(B) &= 0.31, & \mu_1(AB) &= 0.53, \quad \text{and}\\
\mu_2(A) &= 0.62, & \mu_2(B) &= 0.05, & \mu_2(AB) &= 0.11.
\end{aligned}
\tag{5.45}
\]
A simple calculation shows that we can obtain separate zero-type representations of $\mu_1$ and $\mu_2$ by choosing $n_{AB} \in [0.3215, 1]$ and $n_{AB} \in [0.119, 0.692]$, respectively. Therefore, a zero-type representation of $\{\mu_i\}_{i=1}^{2}$ requires $n_{AB}$ to lie in the intersection of the two intervals, $n_{AB} \in [0.3215, 0.692]$.

5.3 Data Representation Analysis

In this section, we provide an analysis of Hampton’s data on conjunction to compare the two-sector Fock space model to both the Hilbert space and tensor product models presented in Chapter 4. We identify how many exemplars can be represented by the zero-type and second-type representations in the Fock space using Theorem 5.14 and Corollary 5.15, respectively.
This gives us an upper and a lower bound on the number of exemplars that can be simultaneously represented by the first-type representation.

For the representation of individual exemplars, the zero-type and second-type representations in the two-sector Fock space can model 78.1% and 77% of the exemplars in the data set, respectively. This is an improvement over the performance of the Hilbert space model (41%) and of the tensor product model (20%).

For the representation of multiple exemplars, Figure 5.1 shows the fraction of exemplars that can be simultaneously represented for each value of $n_{AB}$ for the set of concept conjunctions tested by Hampton.

We now elaborate on six general statements, inferred from these graphs, that explain how the two-sector Fock space model developed in § 5.1.2 improves on the Hilbert space and tensor product models.

The first statement is that at the extreme values, $n_{AB} = 0$ or $1$, corresponding to the tensor product and Hilbert space models respectively, the model recovers the previous performances, 19% for the tensor product model and 41.6% for the Hilbert space model, obtained in the analysis of Chapter 4.

The second statement is that, as expected, the zero-type representation performs better than the second-type representation. However, the difference is small. Since we know that the first-type representation is more constrained than the zero-type representation, but less constrained than the second-type representation, we conclude that the performance of the first-type representation should be similar to the performance of the second-type representation.

The third statement is that there is a small decrease in performance for $n_{AB}$ between 0 and 0.3.
This implies that the logic-based representation performs better than the superposition of logical and emergent thought when logical thought is dominant.

The fourth statement is that there is a steady improvement for $n_{AB}$ between 0.3 and 0.8. This implies that in this range, the stronger the influence of the emergent mode of thought, the better the performance.

Figure 5.1: Fraction of Hampton’s experimental data that can be simultaneously modeled in the two-sector Fock space model for different values of $n_{AB}$. The blue and red curves correspond to the fraction of exemplars that can be simultaneously modeled using the zero-type and second-type representations, respectively.

The fifth statement is that there is a slight decrease in performance for $n_{AB}$ between 0.8 and 0.9, and that the performance remains stable for $n_{AB} > 0.9$. This implies that the maximal performance of the two-sector Fock space model is reached at a value of $n_{AB}$ close to 0.8.

The sixth statement is that the two-sector Fock space model outperforms the tensor product model for $0.3 \le n_{AB} \le 1$, and outperforms the Hilbert space model for $0.7 \le n_{AB} \le 0.9$.

We conclude that the two-sector Fock space model gives a better performance for the representation of both individual exemplars and multiple exemplars simultaneously, and that the model reaches its best performance when the first sector is dominant, at $n_{AB} \sim 0.8$.

Chapter 6
Fock Space Modeling of Negations and Conjunctions of Concepts

The Fock space model, introduced in Chapter 5, uses the idea of a superposition of modes of thought to model conjunctions and disjunctions. Since conjunction, disjunction, and negation are the primary operations in logic, we now consider how the negations of concepts are represented in the Fock space model.
We restrict our analysis to conjunctions and negations because these are the only operations for which experimental data are available to test the theory.

The notation for this chapter is as follows: Let $A$ and $B$ be two concepts and let $p_i$ be an exemplar. We denote the negation of concept $A$ by $\bar{A}$ = Not $A$, and the conjunctions ‘$A$ and $B$,’ ‘$\bar{A}$ and $B$,’ ‘$A$ and $\bar{B}$,’ and ‘$\bar{A}$ and $\bar{B}$’ by $AB$, $\bar{A}B$, $A\bar{B}$, and $\bar{A}\bar{B}$, respectively. We denote the set of data for the membership weights of $A$, $B$, $\bar{A}$, $\bar{B}$, and their conjunctions by
\[
\mu_i = \{\mu_i(A), \mu_i(B), \mu_i(\bar{A}), \mu_i(\bar{B}), \mu_i(AB), \mu_i(\bar{A}B), \mu_i(A\bar{B}), \mu_i(\bar{A}\bar{B})\},
\tag{6.1}
\]
and the set of data for the exemplars $p_i$, for $i = 1, \ldots, k$, by $\{\mu_i\}_{i=1}^{k}$.

First, we develop a theoretical analysis that characterizes classical data for the case of conjunctions and negations. Next, we introduce experimental data showing that concept combinations involving negations of concepts do not satisfy the conditions of classical data. Finally, we develop an extension of the model presented in Chapter 5 to represent conjunctions and negations of concepts, and we give some examples of data representation.

6.1 Conditions for a Classical Model

We now introduce the conditions for data representation within a classical Kolmogorovian probability model, and use this information to characterize classical data for the case of concept combinations involving conjunctions and negations.

Definition 6.1. The set of data $\mu_i$ is a classical data set, or classical data, if and only if there exists a Kolmogorovian probability space $(\Omega, \sigma(\Omega), P)$ and events $E_A, E_B \in \sigma(\Omega)$ such that
\begin{align}
P(E_A) &= \mu_i(A), \tag{6.2}\\
P(E_B) &= \mu_i(B), \tag{6.3}\\
P(\Omega \setminus E_A) &= \mu_i(\bar{A}), \tag{6.4}\\
P(\Omega \setminus E_B) &= \mu_i(\bar{B}), \tag{6.5}\\
P(E_A \cap E_B) &= \mu_i(AB), \tag{6.6}\\
P(E_A \cap (\Omega \setminus E_B)) &= \mu_i(A\bar{B}), \tag{6.7}\\
P((\Omega \setminus E_A) \cap E_B) &= \mu_i(\bar{A}B), \tag{6.8}\\
P((\Omega \setminus E_A) \cap (\Omega \setminus E_B)) &= \mu_i(\bar{A}\bar{B}). \tag{6.9}
\end{align}

Note that the conditions for classical data for conjunctions and negations contain the conditions for classical data for conjunctions given by Theorem 2.2.
Indeed, because
\[
E_A = (E_A \cap E_B) \cup (E_A \cap (\Omega \setminus E_B)),
\tag{6.10}
\]
and the two sets on the right-hand side are disjoint, combining Eqs. (6.6) and (6.7) yields
\[
\mu_i(A) = \mu_i(AB) + \mu_i(A\bar{B}).
\tag{6.11}
\]
From this, $\mu_i(AB) \le \mu_i(A)$. The other two conditions for classical data for conjunctions can be obtained similarly.

Moreover, note that Eqs. (6.2)–(6.9) imply that a concept and its negation entail ‘opposite’ membership evaluations. For example, from Eqs. (6.2) and (6.4), we have
\[
\mu_i(A) + \mu_i(\bar{A}) = P(E_A) + P(\Omega \setminus E_A) = P(E_A \cup (\Omega \setminus E_A)) = P(\Omega) = 1.
\tag{6.12}
\]
In addition, Eqs. (6.11) and (6.12) provide examples of the marginal probability law (see Eq. (A.4) in Appendix A.3).

We now identify a set of conditions for $\mu_i$ to be classical data.

Theorem 6.2. The set of data $\mu_i$ is classical data if and only if
\begin{align}
\mu_i(A) + \mu_i(\bar{A}) &= 1, \tag{6.13}\\
\mu_i(B) + \mu_i(\bar{B}) &= 1, \tag{6.14}\\
\mu_i(A) &= \mu_i(AB) + \mu_i(A\bar{B}), \tag{6.15}\\
\mu_i(B) &= \mu_i(AB) + \mu_i(\bar{A}B), \tag{6.16}\\
\mu_i(\bar{A}) &= \mu_i(\bar{A}\bar{B}) + \mu_i(\bar{A}B), \tag{6.17}\\
\mu_i(\bar{B}) &= \mu_i(\bar{A}\bar{B}) + \mu_i(A\bar{B}). \tag{6.18}
\end{align}

Proof. Suppose first that $\mu_i$ is classical data, so that Eqs. (6.2)–(6.9) are satisfied. Then the marginal probability formulas, given by conditions (6.15)–(6.18), are directly satisfied. Moreover, since $P(\Omega) = 1$, we add Eqs. (6.2) and (6.4) to obtain condition (6.13), and add Eqs. (6.3) and (6.5) to obtain condition (6.14).

Now suppose that $\mu_i$ satisfies Eqs. (6.13)–(6.18). We need to prove that there exists a probability space $(\Omega, \sigma(\Omega), P)$ that satisfies Eqs. (6.2)–(6.9). Consider the set $\Omega = \{1, 2, 3, 4\}$, and let $\sigma(\Omega) = \mathcal{P}(\Omega)$ be the set of all subsets of $\Omega$. Set
\begin{align}
P(\{1\}) &= \mu_i(AB), \tag{6.19}\\
P(\{2\}) &= \mu_i(A\bar{B}), \tag{6.20}\\
P(\{3\}) &= \mu_i(\bar{A}B), \tag{6.21}\\
P(\{4\}) &= \mu_i(\bar{A}\bar{B}), \tag{6.22}
\end{align}
and for any subset $S \subseteq \{1, 2, 3, 4\}$, define
\[
P(S) = \sum_{a \in S} P(\{a\}).
\tag{6.23}
\]
Guided by Eqs. (6.15)–(6.18), we set
\[
E_A = \{1, 2\}, \qquad E_B = \{1, 3\}, \qquad E_{\bar{A}} = \{3, 4\}, \qquad E_{\bar{B}} = \{2, 4\}.
\tag{6.24}
\]
It is easy to verify that, given these choices, Eqs. (6.2)–(6.9) are satisfied. Since a Kolmogorovian probability space additionally requires that $P(\Omega) = 1$, we apply Eqs. (6.15), (6.17), and (6.13) to obtain
\[
P(\Omega) = P(\{1, 2, 3, 4\}) = \mu_i(A) + \mu_i(\bar{A}) = 1.
\tag{6.25}
\]
This completes the proof.

Theorem 6.2 characterizes classical data using the marginal probability law for estimations of membership weights for individual and combined concepts. We now introduce an alternative form that will be useful to measure the deviations in the experimental data.

Definition 6.3. Let
\begin{align}
\Lambda_A &= 1 - \mu_i(A) - \mu_i(\bar{A}), \tag{6.26}\\
\Lambda_B &= 1 - \mu_i(B) - \mu_i(\bar{B}), \tag{6.27}\\
I_A &= \mu_i(A) - \mu_i(AB) - \mu_i(A\bar{B}), \tag{6.28}\\
I_B &= \mu_i(B) - \mu_i(AB) - \mu_i(\bar{A}B), \tag{6.29}\\
I_{\bar{A}} &= \mu_i(\bar{A}) - \mu_i(\bar{A}\bar{B}) - \mu_i(\bar{A}B), \tag{6.30}\\
I_{\bar{B}} &= \mu_i(\bar{B}) - \mu_i(\bar{A}\bar{B}) - \mu_i(A\bar{B}), \tag{6.31}\\
I_{AB\bar{A}\bar{B}} &= 1 - \mu_i(AB) - \mu_i(A\bar{B}) - \mu_i(\bar{A}B) - \mu_i(\bar{A}\bar{B}). \tag{6.32}
\end{align}

The following result summarizes the conditions for classical data using the parameters defined in Eqs. (6.28)–(6.32).

Corollary 6.4. The set of data $\mu_i$ is classical conjunction data if and only if
\[
I_{AB\bar{A}\bar{B}} = I_A = I_B = I_{\bar{A}} = I_{\bar{B}} = 0.
\tag{6.33}
\]

Proof. We obtain Eq. (6.13) by combining $I_{AB\bar{A}\bar{B}} = 0$ with Eqs. (6.28) and (6.30). Similarly, we obtain Eq. (6.14) by combining $I_{AB\bar{A}\bar{B}} = 0$ with Eqs. (6.29) and (6.31). Next, $I_A = 0$ implies Eq. (6.15). Similarly, $I_B = I_{\bar{A}} = I_{\bar{B}} = 0$ implies Eqs. (6.16)–(6.18).

6.2 Experiment on Conjunctions and Negations of Concepts

We use an experimental data set $\mu_i$ obtained for four pairs of concepts.$^{10}$ In this experiment, participants filled in a questionnaire in which they had to estimate the membership of different exemplars with respect to concepts and concept combinations involving conjunctions and negations. The pairs of concepts considered in the experiment are
\[
\begin{aligned}
(A_1, B_1) &= (\text{‘Home Furnishing’}, \text{‘Furniture’}),\\
(A_2, B_2) &= (\text{‘Spices’}, \text{‘Herbs’}),\\
(A_3, B_3) &= (\text{‘Pets’}, \text{‘Farmyard Animals’}), \text{ and}\\
(A_4, B_4) &= (\text{‘Fruits’}, \text{‘Vegetables’}).
\end{aligned}
\tag{6.34}
\]
The membership of 24 exemplars was tested for each pair of concepts. The data is shown in Appendix B. The choice of exemplars and concepts was inspired by Hampton’s experiments for concept disjunctions [Ham88b].

The methodology for the experiment is that of the “within-subjects” design.
That is, all participants were exposed to the same conditions. The set of 24 exemplars was assigned to all participants. In the experiment, participants were requested to estimate the membership of the 24 exemplars in the following order: i) $A_j$, $B_j$, and $A_jB_j$; ii) $A_j$, $\bar{B}_j$, and $A_j\bar{B}_j$; iii) $\bar{A}_j$, $B_j$, and $\bar{A}_jB_j$; and iv) $\bar{A}_j$, $\bar{B}_j$, and $\bar{A}_j\bar{B}_j$, for $j = 1, \ldots, 4$.

The membership weight of exemplars was estimated using the scale
\[
\{-3, -2, -1, 0, +1, +2, +3\},
\tag{6.35}
\]
where the extreme values $-3$ and $+3$ indicate strong non-membership and strong membership respectively, and zero indicates the inability to decide.

The analysis presented in this chapter uses only the data for membership versus non-membership in the interval $[0, 1]$. Therefore, we average the membership estimations, assigning the value 0 to each negative response, 0.5 to each response equal to zero, and 1 to each positive response.

$^{10}$ The experiment was tested on 40 participants, and was carried out by a collaborator, Sandro Sozzo.

6.2.1 Results

A statistical analysis of the data suggests two strong tendencies:

1. membership estimations satisfy conditions (6.13) and (6.14) for classical data, and
2. membership estimations violate conditions (6.15)–(6.18), and the value of the deviation is approximately constant.

To support the first claim, we give the 95% confidence interval for the deviations $\Lambda_A$ and $\Lambda_B$, defined in Eqs. (6.26) and (6.27) respectively, for the set of exemplars of each concept combination. In all cases, the deviations fall within a narrow band that is very close to zero. In fact, because $\mu_i(A)$ and $\mu_i(\bar{A})$ take values in $[0, 1]$, $\Lambda_A$ and $\Lambda_B$ are contained in an interval of length 2.
However, we see in Table 6.1 that the experimental data fall within an interval of length smaller than 0.05 with 95% certainty. Moreover, the center of the interval is also contained in a narrow region between the values $-0.016$ (for $\Lambda_B$ and $j = 2$) and $-0.105$ (for $\Lambda_B$ and $j = 1$). This result confirms that participants’ reasoning about the membership of exemplars with respect to individual concepts and their negations obeys the rules of classical logic and probability.

Table 6.1: 95% confidence interval for $\Lambda_A$ and $\Lambda_B$ for the data on conjunctions and negations in Tables B.1–B.4, Appendix B.

95% CI        j = 1               j = 2               j = 3               j = 4
$\Lambda_A$   (−0.074, −0.032)    (−0.064, −0.037)    (−0.034, −0.014)    (−0.036, 0.000)
$\Lambda_B$   (−0.125, −0.078)    (−0.038, 0.005)     (−0.041, −0.012)    (−0.047, −0.023)

To support the second claim, we give the 95% confidence interval for $I_A$, $I_B$, $I_{\bar{A}}$, and $I_{\bar{B}}$, defined in Eqs. (6.28)–(6.31), for the set of exemplars of each concept combination.

In all cases, the deviations fall within a narrow band of similar values. In fact, since all the values of the data set $\mu_i$ are contained in the interval $[0, 1]$, $I_A$, $I_B$, $I_{\bar{A}}$, and $I_{\bar{B}}$ must fall in an interval of length 2. However, the experimental data fall within an interval of length smaller than 0.09, whose center is between $-0.471$ ($I_B$ and $j = 4$) and $-0.274$ ($I_{\bar{B}}$ and $j = 4$), with 95% certainty. This result confirms that participants’ reasoning about the membership with respect to concept combinations deviates from the rules of classical logic and probability, and that the value of the deviation is approximately constant.

The pattern of deviations from classicality can be further refined by looking separately at $I_X$ for $X = A, B$, and $I_Y$ for $Y = \bar{A}, \bar{B}$.
If we consider the average and variance of the extremes of the intervals, we obtain the interval
\[
(-0.460 \pm 0.0006,\; -0.395 \pm 0.0001),
\tag{6.36}
\]
for $I_X$ with $X = A, B$, and
\[
(-0.371 \pm 0.0029,\; -0.311 \pm 0.0035),
\tag{6.37}
\]
for $I_Y$ with $Y = \bar{A}, \bar{B}$.

Note that while the length of both average intervals is approximately 0.06, the centers of these intervals are different. In particular, the center for the case $X = A, B$ is $-0.428$, and $-0.341$ for the case $Y = \bar{A}, \bar{B}$. Moreover, although the variances are small in both cases, they are one order of magnitude smaller for $X = A, B$. This means that the violations of the conditions for classical data are larger and more sharply concentrated in the membership estimations of concepts than in the estimations of the negated concepts.

To visualize these patterns, we show the extreme values of the 95% confidence interval for $I_A$, $I_{\bar{A}}$, $I_B$, and $I_{\bar{B}}$ in Fig. 6.1. Blue points denote the intervals for concepts and red points those for the negated concepts. For example, the interval $I_A$ for $j = 1$ is the blue point with coordinates $(-0.469, -0.406)$.

It is easy to observe that the blue points are more concentrated, and further from the origin, than the red points. This visually confirms the analysis above: the deviation from classical data is larger for the membership estimations of concepts than for those of negated concepts.

Table 6.2: 95% confidence interval for $I_A$, $I_B$, $I_{\bar{A}}$, and $I_{\bar{B}}$, for the data on conjunctions and negations in Tables B.1–B.4, Appendix B.

95% CI          j = 1               j = 2               j = 3               j = 4
$I_A$           (−0.469, −0.406)    (−0.476, −0.390)    (−0.426, −0.349)    (−0.463, −0.398)
$I_B$           (−0.482, −0.427)    (−0.443, −0.376)    (−0.429, −0.368)    (−0.495, −0.446)
$I_{\bar{A}}$   (−0.458, −0.393)    (−0.375, −0.326)    (−0.332, −0.259)    (−0.359, −0.302)
$I_{\bar{B}}$   (−0.390, −0.329)    (−0.429, −0.387)    (−0.323, −0.241)    (−0.298, −0.251)

Figure 6.1: Representation of intervals $I_A$ and $I_B$ in blue, and of intervals $I_{\bar{A}}$ and $I_{\bar{B}}$ in red.

6.3 Fock Space Modeling of Conjunctions and Negations

Since the data for concept combinations involving conjunctions and negations does not satisfy the conditions for classical data, a quantum model is necessary. We extend the model for concept conjunctions presented in § 5.1.2 to the negations of concepts. To construct this model extension, we introduce a set of conditions that relate a concept to its negation in the two sectors of the Fock space model, and add them to the conditions obtained for the case of conjunctions.

First, we introduce some notation to facilitate the presentation. Let $X = A$ or $\bar{A}$ and $Y = B$ or $\bar{B}$, and set
\[
\begin{aligned}
h_{\min}(XY) &= \mathrm{ave}(XY) - \mathrm{dev}(XY),\\
h_{\max}(XY) &= \mathrm{ave}(XY) + \mathrm{dev}(XY),\\
t_{\min}(XY) &= \max(0, \mu_i(X) + \mu_i(Y) - 1),\\
t_{\max}(XY) &= \min(\mu_i(X), \mu_i(Y)).
\end{aligned}
\tag{6.38}
\]

6.3.1 First Sector Analysis

Recall that the requirements for a model for the concepts $A$, $B$, and their conjunction $AB$ are
\begin{align}
\langle A|A\rangle &= \langle B|B\rangle = 1, \tag{6.39}\\
\langle A|B\rangle &= 0, \tag{6.40}\\
\mu_i(A) &= \langle A|M|A\rangle, \qquad \mu_i(B) = \langle B|M|B\rangle, \tag{6.41}\\
\mu_i(AB) &= \tfrac{1}{2}\big(\mu_i(A) + \mu_i(B)\big) + \Re\big(\langle A|M|B\rangle\big). \tag{6.42}
\end{align}
To extend the model to negations, we represent the state of the conceptual negations $\bar{A}$ and $\bar{B}$ by the vectors $|\bar{A}\rangle$ and $|\bar{B}\rangle$ respectively, and require that the set $\{|A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle\}$ forms an orthonormal set. Moreover, we require that
\[
\mu_i(\bar{A}) = \langle \bar{A}|M|\bar{A}\rangle, \quad \text{and} \quad \mu_i(\bar{B}) = \langle \bar{B}|M|\bar{B}\rangle.
\tag{6.43}
\]
Next, we build the state for the concept combinations as a superposition of states,
\[
|XY\rangle = \frac{1}{\sqrt{2}}\big(|X\rangle + |Y\rangle\big),
\tag{6.44}
\]
and extend condition (6.42) to the other concept combinations:
\[
\mu_i(XY) = \tfrac{1}{2}\big(\mu_i(X) + \mu_i(Y)\big) + \Re\langle X|M|Y\rangle.
\tag{6.45}
\]
To measure negated concepts, we use the standard negated operator $M^\perp$ from quantum theory [BVN75]:
\[
M^\perp = 1 - M.
\tag{6.46}
\]
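The first-sector conditions can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes an illustrative projector on an 8-dimensional space and four randomly generated orthonormal vectors standing in for $|A\rangle$, $|B\rangle$, $|\bar{A}\rangle$, $|\bar{B}\rangle$ (the names `weight`, `combination_weight`, and `interference_formula` are hypothetical helpers). It verifies that the weight of the superposition state of Eq. (6.44) agrees with the right-hand side of Eq. (6.45), and that the negated operator of Eq. (6.46) yields the complementary weight.

```python
import numpy as np

# Illustrative projector on an 8-dimensional space (an assumption of this sketch):
# it keeps the last four coordinates and discards the first four.
M = np.diag([0, 0, 0, 0, 1, 1, 1, 1]).astype(complex)
M_perp = np.eye(8) - M  # negated operator, Eq. (6.46)

# Four orthonormal vectors standing in for |A>, |B>, |A-bar>, |B-bar>.
rng = np.random.default_rng(0)
Z = rng.normal(size=(8, 4)) + 1j * rng.normal(size=(8, 4))
Q, _ = np.linalg.qr(Z)        # the columns of Q are orthonormal
A, B, A_bar, B_bar = Q.T      # unpack the four columns


def weight(state, operator):
    """Membership weight <state|operator|state> (real for Hermitian operators)."""
    return np.vdot(state, operator @ state).real


def combination_weight(X, Y, operator):
    """Weight of the superposition (|X> + |Y>)/sqrt(2) of Eq. (6.44)."""
    psi = (X + Y) / np.sqrt(2)
    return weight(psi, operator)


def interference_formula(X, Y, operator):
    """Right-hand side of Eq. (6.45): average of the weights plus interference."""
    average = 0.5 * (weight(X, operator) + weight(Y, operator))
    return average + np.vdot(X, operator @ Y).real


for X, Y in [(A, B), (A, B_bar), (A_bar, B), (A_bar, B_bar)]:
    # Both sides agree up to floating-point rounding.
    print(combination_weight(X, Y, M), interference_formula(X, Y, M))
```

The agreement relies on the orthogonality of $|X\rangle$ and $|Y\rangle$, which keeps the superposition normalized; with non-orthogonal vectors the state of Eq. (6.44) would need an extra normalization factor.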
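Because we consider four orthogonal vectors $|A\rangle$, $|\bar{A}\rangle$, $|B\rangle$, and $|\bar{B}\rangle$, and two projection operators $M$ and $1 - M$, the maximal number of subspaces we can obtain is eight.$^{11}$ Therefore, we set $H = \mathbb{C}^8$.

$^{11}$ Since we have shown in § 6.2.1 that $\mu_i(A) + \mu_i(\bar{A})$ and $\mu_i(B) + \mu_i(\bar{B})$ are usually very close to one, a good approximation of $\mu_i$ could be constructed using fewer than eight independent subspaces.

Definition 6.5. The set of data $\mu_i$ has a representation in the Hilbert space $\mathbb{C}^8$ if and only if there exist vectors $|A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle \in \mathbb{C}^8$, and $M\colon \mathbb{C}^8 \to \mathbb{C}^8$ such that Eqs. (6.39)–(6.45) are satisfied.

The following theorem summarizes the type of data that can be represented by this model.

Theorem 6.6. The set of data $\mu_i$ has a representation in the Hilbert space $\mathbb{C}^8$ if and only if
\[
\mu_i(XY) \in [h_{\min}(XY), h_{\max}(XY)],
\tag{6.47}
\]
for $X = A$ or $\bar{A}$, $Y = B$ or $\bar{B}$.

Proof. Let the set $\{|1\rangle, |2\rangle, \ldots, |8\rangle\}$ denote the canonical basis of $\mathbb{C}^8$, and define the operators $M$ and $1 - M$ by
\[
\begin{aligned}
M(x_1, \ldots, x_8) &= (0, 0, 0, 0, x_5, x_6, x_7, x_8), \text{ and}\\
(1 - M)(x_1, \ldots, x_8) &= (x_1, x_2, x_3, x_4, 0, 0, 0, 0).
\end{aligned}
\tag{6.48}
\]
If we set
\begin{align}
|A\rangle &= e^{i\phi_A}(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8), \tag{6.49}\\
|\bar{A}\rangle &= e^{i\phi_{\bar{A}}}(a'_1, a'_2, a'_3, a'_4, a'_5, a'_6, a'_7, a'_8), \tag{6.50}\\
|B\rangle &= e^{i\phi_B}(b_1, b_2, b_3, b_4, b_5, b_6, b_7, b_8), \tag{6.51}\\
|\bar{B}\rangle &= e^{i\phi_{\bar{B}}}(b'_1, b'_2, b'_3, b'_4, b'_5, b'_6, b'_7, b'_8), \tag{6.52}
\end{align}
then, since these vectors must be orthonormal, we have
\begin{align}
\langle A|\bar{A}\rangle &= \textstyle\sum_{i=1}^{8} a_i a'_i = 0, \tag{6.53}\\
\langle B|\bar{B}\rangle &= \textstyle\sum_{i=1}^{8} b_i b'_i = 0, \tag{6.54}\\
\langle A|B\rangle &= \textstyle\sum_{i=1}^{8} a_i b_i = 0, \tag{6.55}\\
\langle A|\bar{B}\rangle &= \textstyle\sum_{i=1}^{8} a_i b'_i = 0, \tag{6.56}\\
\langle \bar{A}|B\rangle &= \textstyle\sum_{i=1}^{8} a'_i b_i = 0, \tag{6.57}\\
\langle \bar{A}|\bar{B}\rangle &= \textstyle\sum_{i=1}^{8} a'_i b'_i = 0, \tag{6.58}\\
\langle A|A\rangle &= \textstyle\sum_{i=1}^{8} a_i a_i = 1, \tag{6.59}\\
\langle B|B\rangle &= \textstyle\sum_{i=1}^{8} b_i b_i = 1, \tag{6.60}\\
\langle \bar{A}|\bar{A}\rangle &= \textstyle\sum_{i=1}^{8} a'_i a'_i = 1, \tag{6.61}\\
\langle \bar{B}|\bar{B}\rangle &= \textstyle\sum_{i=1}^{8} b'_i b'_i = 1. \tag{6.62}
\end{align}
We use Eqs. (6.41) and (6.43) to compute the membership weights:
\begin{align}
\mu_i(A) &= \langle A|M|A\rangle = a_5^2 + a_6^2 + a_7^2 + a_8^2, \tag{6.63}\\
1 - \mu_i(A) &= \langle A|1 - M|A\rangle = a_1^2 + a_2^2 + a_3^2 + a_4^2, \tag{6.64}\\
\mu_i(\bar{A}) &= \langle \bar{A}|M|\bar{A}\rangle = a_5'^2 + a_6'^2 + a_7'^2 + a_8'^2, \tag{6.65}\\
1 - \mu_i(\bar{A}) &= \langle \bar{A}|1 - M|\bar{A}\rangle = a_1'^2 + a_2'^2 + a_3'^2 + a_4'^2, \tag{6.66}\\
\mu_i(B) &= \langle B|M|B\rangle = b_5^2 + b_6^2 + b_7^2 + b_8^2, \tag{6.67}\\
1 - \mu_i(B) &= \langle B|1 - M|B\rangle = b_1^2 + b_2^2 + b_3^2 + b_4^2, \tag{6.68}\\
\mu_i(\bar{B}) &= \langle \bar{B}|M|\bar{B}\rangle = b_5'^2 + b_6'^2 + b_7'^2 + b_8'^2, \tag{6.69}\\
1 - \mu_i(\bar{B}) &= \langle \bar{B}|1 - M|\bar{B}\rangle = b_1'^2 + b_2'^2 + b_3'^2 + b_4'^2. \tag{6.70}
\end{align}
Next, Eqs. (6.42) and (6.45) yield
\begin{align}
\mu_i(AB) &= \tfrac{1}{2}\big(\mu_i(A) + \mu_i(B)\big) + \Re\langle A|M|B\rangle
= \mathrm{ave}(AB) + \sum_{i=5}^{8} a_i b_i \cos(\phi_B - \phi_A), \tag{6.71}\\
\mu_i(A\bar{B}) &= \tfrac{1}{2}\big(\mu_i(A) + \mu_i(\bar{B})\big) + \Re\langle A|M|\bar{B}\rangle
= \mathrm{ave}(A\bar{B}) + \sum_{i=5}^{8} a_i b'_i \cos(\phi_{\bar{B}} - \phi_A), \tag{6.72}\\
\mu_i(\bar{A}B) &= \tfrac{1}{2}\big(\mu_i(\bar{A}) + \mu_i(B)\big) + \Re\langle \bar{A}|M|B\rangle
= \mathrm{ave}(\bar{A}B) + \sum_{i=5}^{8} a'_i b_i \cos(\phi_B - \phi_{\bar{A}}), \tag{6.73}\\
\mu_i(\bar{A}\bar{B}) &= \tfrac{1}{2}\big(\mu_i(\bar{A}) + \mu_i(\bar{B})\big) + \Re\langle \bar{A}|M|\bar{B}\rangle
= \mathrm{ave}(\bar{A}\bar{B}) + \sum_{i=5}^{8} a'_i b'_i \cos(\phi_{\bar{B}} - \phi_{\bar{A}}). \tag{6.74}
\end{align}
To finish the proof, we need to show that the interference terms for the combinations $XY$ are bounded by $\mathrm{dev}(XY)$. We apply the Cauchy–Schwarz inequality to the interference term in Eq. (6.71) to obtain
\[
|\Re(\langle A|M|B\rangle)| \le \sqrt{\mu_i(A)\mu_i(B)}.
\tag{6.75}
\]
Eq. (6.55) implies that
\[
0 = \langle A|B\rangle = \langle A|M|B\rangle + \langle A|1 - M|B\rangle.
\tag{6.76}
\]
Since the real and imaginary parts of Eq. (6.76) must be zero, we obtain from the real part
\[
\Re(\langle A|M|B\rangle)^2 = (a_1b_1 + a_2b_2 + a_3b_3 + a_4b_4)^2 \cos^2(\phi_B - \phi_A),
\tag{6.77}
\]
and apply the Cauchy–Schwarz inequality to obtain
\[
\Re(\langle A|M|B\rangle)^2 \le (a_1^2 + a_2^2 + a_3^2 + a_4^2)(b_1^2 + b_2^2 + b_3^2 + b_4^2)\cos^2(\phi_B - \phi_A).
\tag{6.78}
\]
Thus,
\[
|\Re(\langle A|M|B\rangle)| \le \sqrt{(1 - \mu_i(A))(1 - \mu_i(B))}.
\tag{6.79}
\]
Combining Eqs. (6.75) and (6.79) yields
\[
|\Re(\langle A|M|B\rangle)| \le \sqrt{\min\big(\mu_i(A)\mu_i(B),\, (1 - \mu_i(A))(1 - \mu_i(B))\big)},
\tag{6.80}
\]
and thus
\[
\mu_i(AB) \in [h_{\min}(AB), h_{\max}(AB)].
\tag{6.81}
\]
We repeat this procedure with Eqs. (6.72)–(6.74) to obtain Eq. (6.47). The other side of the implication follows directly from the construction.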
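The bound of Eq. (6.80) can be probed numerically. The sketch below is only an illustration under stated assumptions: it takes $\mathrm{dev}(XY)$ to be the right-hand side of Eq. (6.80) and $\mathrm{ave}(XY)$ to be the mean of the two individual weights, it uses the projector of Eq. (6.48), and it draws random orthonormal pairs of vectors in $\mathbb{C}^8$. The helper names `interference_bound` and `hilbert_interval` are hypothetical.

```python
import numpy as np


def interference_bound(mu_x, mu_y):
    """Right-hand side of Eq. (6.80), taken here as dev(XY)."""
    return np.sqrt(max(0.0, min(mu_x * mu_y, (1 - mu_x) * (1 - mu_y))))


def hilbert_interval(mu_x, mu_y):
    """Interval [h_min, h_max] of Eq. (6.47), assuming ave is the mean weight."""
    ave = 0.5 * (mu_x + mu_y)
    dev = interference_bound(mu_x, mu_y)
    return ave - dev, ave + dev


# Projector onto the last four coordinates of C^8, as in Eq. (6.48).
M = np.diag([0, 0, 0, 0, 1, 1, 1, 1]).astype(complex)

rng = np.random.default_rng(1)
worst_slack = np.inf
for _ in range(200):
    # Random orthonormal pair (|X>, |Y>) obtained from a QR factorization.
    Z = rng.normal(size=(8, 2)) + 1j * rng.normal(size=(8, 2))
    Q, _ = np.linalg.qr(Z)
    X, Y = Q[:, 0], Q[:, 1]
    mu_x = np.vdot(X, M @ X).real
    mu_y = np.vdot(Y, M @ Y).real
    interference = np.vdot(X, M @ Y).real
    # Slack is non-negative whenever the bound of Eq. (6.80) holds.
    slack = interference_bound(mu_x, mu_y) - abs(interference)
    worst_slack = min(worst_slack, slack)

print("smallest slack over all samples:", worst_slack)
```

A non-negative smallest slack is what the Cauchy–Schwarz argument predicts. Random sampling only illustrates the inequality; it does not replace the proof.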
Similarly to § 5.2.1, we can obtain a representation in $\mathbb{C}^8$ that is compatible with the quantum modeling principles by applying unitary transformations to the representations of individual exemplars.

Definition 6.7. A Hilbert space representation of $\{\mu_i\}_{i=1}^{k}$ is a four-tuple of unit vectors $|A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle \in \mathbb{C}^8$, and a collection of orthogonal projectors $M_i\colon \mathbb{C}^8 \to \mathbb{C}^8$, for $i = 1, \ldots, k$, such that conditions (4.1)–(4.4) are satisfied for $i = 1, \ldots, k$. We say $(|A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle, \{M_i\}_{i=1}^{k})$ is a representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8$.

The following is a corollary of Theorem 5.3 for the representation in Definition 6.7.

Corollary 6.8. The set of data $\{\mu_i\}_{i=1}^{k}$ has a representation in $\mathbb{C}^8$ if and only if for all $i = 1, \ldots, k$,
\[
\mu_i(AB) \in [h_{\min}(AB), h_{\max}(AB)].
\tag{6.82}
\]

Proof. The proof follows the same construction as in Theorem 5.3, starting from four vectors in $\mathbb{C}^8$ instead of two vectors in $\mathbb{C}^3$.

6.3.2 Second Sector Analysis

To ensure that the two sectors of the two-sector Fock space model are compatible, we use $\mathbb{C}^8 \otimes \mathbb{C}^8$ for the second sector. In order to represent an exemplar, the vector $|C\rangle$ describes the conceptual situation where concepts and their combinations are jointly represented, and projection operators measure the membership weight for each combination. Let $|C\rangle$ be a unit vector in $\mathbb{C}^8 \otimes \mathbb{C}^8$. That is,
\[
|C\rangle = \sum_{i,j=1}^{8} c_{ij} e^{i\gamma_{ij}}\, |i\rangle \otimes |j\rangle,
\tag{6.83}
\]
and
\[
\begin{aligned}
\langle C|C\rangle
&= \Big(\sum_{k,l=1}^{8} c_{kl} e^{-i\gamma_{kl}} \langle k| \otimes \langle l|\Big)\Big(\sum_{i,j=1}^{8} c_{ij} e^{i\gamma_{ij}} |i\rangle \otimes |j\rangle\Big)\\
&= \sum_{k,l=1}^{8}\sum_{i,j=1}^{8} c_{kl} c_{ij} e^{i(\gamma_{ij} - \gamma_{kl})} \langle k|i\rangle\langle l|j\rangle\\
&= \sum_{i,j=1}^{8} c_{ij} c_{ij} e^{i(\gamma_{ij} - \gamma_{ij})} = \sum_{i,j=1}^{8} c_{ij}^2 = 1.
\end{aligned}
\tag{6.84}
\]
We extend the membership operator $M$, defined in Eq. (6.48), to the tensor product using $M^A = M \otimes 1$, $M^B = 1 \otimes M$, and $M^{AB} = M^A M^B$.
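Likewise, the operators that measure membership for concepts and conjunctions involving negated concepts are defined as follows:
\[
\begin{aligned}
M^{\bar{A}} &= (1 - M) \otimes 1, & M^{\bar{B}} &= 1 \otimes (1 - M),\\
M^{\bar{A}B} &= M^{\bar{A}} M^{B}, & M^{A\bar{B}} &= M^{A} M^{\bar{B}}, & M^{\bar{A}\bar{B}} &= M^{\bar{A}} M^{\bar{B}}.
\end{aligned}
\tag{6.85}
\]
Therefore, the formulas for the membership weight for the concepts $A$ and $B$ are
\begin{align}
\mu_i(A) &= \langle C|M^A|C\rangle = \langle C|M \otimes 1|C\rangle \notag\\
&= \Big(\sum_{k,l=1}^{8} c_{kl} e^{-i\gamma_{kl}} \langle k| \otimes \langle l|\Big)(M \otimes 1)\Big(\sum_{i,j=1}^{8} c_{ij} e^{i\gamma_{ij}} |i\rangle \otimes |j\rangle\Big) \notag\\
&= \sum_{k,l=1}^{8}\sum_{i,j=1}^{8} c_{kl} c_{ij} e^{i(\gamma_{ij} - \gamma_{kl})} \langle k|M|i\rangle\langle l|1|j\rangle \notag\\
&= \sum_{i,j,k=1}^{8} c_{kj} c_{ij} e^{i(\gamma_{ij} - \gamma_{kj})} \langle k|M|i\rangle \notag\\
&= \sum_{i=5}^{8}\sum_{j=1}^{8} c_{ij} c_{ij} e^{i(\gamma_{ij} - \gamma_{ij})} = \sum_{i=5}^{8}\sum_{j=1}^{8} c_{ij}^2, \tag{6.86}
\end{align}
and
\begin{align}
\mu_i(B) &= \langle C|M^B|C\rangle = \langle C|1 \otimes M|C\rangle \notag\\
&= \Big(\sum_{k,l=1}^{8} c_{kl} e^{-i\gamma_{kl}} \langle k| \otimes \langle l|\Big)(1 \otimes M)\Big(\sum_{i,j=1}^{8} c_{ij} e^{i\gamma_{ij}} |i\rangle \otimes |j\rangle\Big) \notag\\
&= \sum_{k,l=1}^{8}\sum_{i,j=1}^{8} c_{kl} c_{ij} e^{i(\gamma_{ij} - \gamma_{kl})} \langle k|1|i\rangle\langle l|M|j\rangle \notag\\
&= \sum_{i,j,l=1}^{8} c_{il} c_{ij} e^{i(\gamma_{ij} - \gamma_{il})} \langle l|M|j\rangle \notag\\
&= \sum_{i=1}^{8}\sum_{j=5}^{8} c_{ij} c_{ij} e^{i(\gamma_{ij} - \gamma_{ij})} = \sum_{i=1}^{8}\sum_{j=5}^{8} c_{ij}^2. \tag{6.87}
\end{align}
The membership weight formulas of the negated concepts $\bar{A}$ and $\bar{B}$ are
\begin{align}
\mu_i(\bar{A}) &= 1 - \mu_i(A) = \langle C|M^{\bar{A}}|C\rangle = \langle C|(1 - M) \otimes 1|C\rangle \notag\\
&= \Big(\sum_{k,l=1}^{8} c_{kl} e^{-i\gamma_{kl}} \langle k| \otimes \langle l|\Big)\big((1 - M) \otimes 1\big)\Big(\sum_{i,j=1}^{8} c_{ij} e^{i\gamma_{ij}} |i\rangle \otimes |j\rangle\Big) \notag\\
&= \sum_{k,l=1}^{8}\sum_{i,j=1}^{8} c_{kl} c_{ij} e^{i(\gamma_{ij} - \gamma_{kl})} \langle k|1 - M|i\rangle\langle l|1|j\rangle \notag\\
&= \sum_{i,j,k=1}^{8} c_{kj} c_{ij} e^{i(\gamma_{ij} - \gamma_{kj})} \langle k|1 - M|i\rangle \notag\\
&= \sum_{i=1}^{4}\sum_{j=1}^{8} c_{ij} c_{ij} e^{i(\gamma_{ij} - \gamma_{ij})} = \sum_{i=1}^{4}\sum_{j=1}^{8} c_{ij}^2, \tag{6.88}
\end{align}
and
\begin{align}
\mu_i(\bar{B}) &= 1 - \mu_i(B) = \langle C|M^{\bar{B}}|C\rangle = \langle C|1 \otimes (1 - M)|C\rangle \notag\\
&= \Big(\sum_{k,l=1}^{8} c_{kl} e^{-i\gamma_{kl}} \langle k| \otimes \langle l|\Big)\big(1 \otimes (1 - M)\big)\Big(\sum_{i,j=1}^{8} c_{ij} e^{i\gamma_{ij}} |i\rangle \otimes |j\rangle\Big) \notag\\
&= \sum_{k,l=1}^{8}\sum_{i,j=1}^{8} c_{kl} c_{ij} e^{i(\gamma_{ij} - \gamma_{kl})} \langle k|1|i\rangle\langle l|1 - M|j\rangle \notag\\
&= \sum_{i,j,l=1}^{8} c_{il} c_{ij} e^{i(\gamma_{ij} - \gamma_{il})} \langle l|1 - M|j\rangle \notag\\
&= \sum_{i=1}^{8}\sum_{j=1}^{4} c_{ij} c_{ij} e^{i(\gamma_{ij} - \gamma_{ij})} = \sum_{i=1}^{8}\sum_{j=1}^{4} c_{ij}^2. \tag{6.89}
\end{align}
The membership weight formulas for concept combinations involving conjunctions and negations are
\begin{align}
\mu_i(AB) &= \langle C|M^{AB}|C\rangle = \langle C|M \otimes M|C\rangle \notag\\
&= \Big(\sum_{k,l=1}^{8} c_{kl} e^{-i\gamma_{kl}} \langle k| \otimes \langle l|\Big)(M \otimes M)\Big(\sum_{i,j=1}^{8} c_{ij} e^{i\gamma_{ij}} |i\rangle \otimes |j\rangle\Big) \notag\\
&= \sum_{k,l=1}^{8}\sum_{i,j=1}^{8} c_{kl} c_{ij} e^{i(\gamma_{ij} - \gamma_{kl})} \langle k|M|i\rangle\langle l|M|j\rangle \notag\\
&= \sum_{i=5}^{8}\sum_{j=5}^{8} c_{ij} c_{ij} e^{i(\gamma_{ij} - \gamma_{ij})} = \sum_{i=5}^{8}\sum_{j=5}^{8} c_{ij}^2, \tag{6.90}\\
\mu_i(A\bar{B}) &= \langle C|M^{A\bar{B}}|C\rangle = \langle C|M \otimes (1 - M)|C\rangle = \sum_{i=5}^{8}\sum_{j=1}^{4} c_{ij}^2, \tag{6.91}\\
\mu_i(\bar{A}B) &= \langle C|M^{\bar{A}B}|C\rangle = \langle C|(1 - M) \otimes M|C\rangle = \sum_{i=1}^{4}\sum_{j=5}^{8} c_{ij}^2, \tag{6.92}\\
\mu_i(\bar{A}\bar{B}) &= \langle C|M^{\bar{A}\bar{B}}|C\rangle = \langle C|(1 - M) \otimes (1 - M)|C\rangle = \sum_{i=1}^{4}\sum_{j=1}^{4} c_{ij}^2. \tag{6.93}
\end{align}

Definition 6.9.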
A representation of $\mu_i$ in the second sector of the Fock space is a pair $(|C\rangle, M)$, where $|C\rangle \in \mathbb{C}^8 \otimes \mathbb{C}^8$ and $M\colon \mathbb{C}^8 \to \mathbb{C}^8$ are such that Eqs. (6.13)–(6.18) are satisfied.

The following theorem characterizes the cases when data involving conjunctions and negations can be represented in the second sector.

Theorem 6.10. The set of data $\mu_i$ has a representation in the second sector of the Fock space if and only if $\mu_i$ is classical data.

Proof. Assume that we have $|C\rangle$ and $M$ such that Eqs. (6.86)–(6.93) are satisfied. Then, it is easy to prove that the classicality conditions (6.13)–(6.18) are satisfied. For example, Eq. (6.17) is proven as follows:
\[
\begin{aligned}
\mu_i(\bar{A}B) + \mu_i(\bar{A}\bar{B})
&= \langle C|(1 - M) \otimes M|C\rangle + \langle C|(1 - M) \otimes (1 - M)|C\rangle\\
&= \langle C|(1 - M) \otimes \big(M + (1 - M)\big)|C\rangle\\
&= \langle C|(1 - M) \otimes 1|C\rangle = \mu_i(\bar{A}).
\end{aligned}
\tag{6.94}
\]
We now prove the other side of the implication. Suppose that $\mu_i$ is classical data and thus satisfies conditions (6.13)–(6.18). If we choose $|C\rangle = \sum_{i,j=1}^{8} c_{ij}\, |i\rangle \otimes |j\rangle$ such that
\[
c_{ij} =
\begin{cases}
\sqrt{\tfrac{1}{16}\mu_i(AB)} & \text{for } 5 \le i \le 8 \text{ and } 5 \le j \le 8,\\[2pt]
\sqrt{\tfrac{1}{16}\mu_i(A\bar{B})} & \text{for } 5 \le i \le 8 \text{ and } 1 \le j \le 4,\\[2pt]
\sqrt{\tfrac{1}{16}\mu_i(\bar{A}B)} & \text{for } 1 \le i \le 4 \text{ and } 5 \le j \le 8,\\[2pt]
\sqrt{\tfrac{1}{16}\mu_i(\bar{A}\bar{B})} & \text{for } 1 \le i \le 4 \text{ and } 1 \le j \le 4,
\end{cases}
\tag{6.95}
\]
and $M$ such that
\[
M(x_1, \ldots, x_8) = (0, 0, 0, 0, x_5, x_6, x_7, x_8),
\tag{6.96}
\]
then Eqs. (6.86)–(6.93) are satisfied, because each of the four blocks of $c_{ij}$ contains 16 entries whose squares add up to the corresponding membership weight. This completes the proof.

Similarly to § 5.2.2, we can obtain representations in $\mathbb{C}^8 \otimes \mathbb{C}^8$ that are compatible with the modeling principles of quantum cognition by applying unitary transformations to the representations of individual exemplars. We introduce the extensions of the zero-, first-, and second-type representations for this case.

Definition 6.11. A zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \otimes \mathbb{C}^8$ is a unit vector $|C\rangle \in \mathbb{C}^8 \otimes \mathbb{C}^8$, and a collection of orthogonal projectors $\{M_i^A, M_i^B\}_{i=1}^{k}$ from $\mathbb{C}^8 \otimes \mathbb{C}^8$ to $\mathbb{C}^8 \otimes \mathbb{C}^8$, such that conditions (6.13)–(6.18) are satisfied with $M_i^\wedge = M_i^A M_i^B$, for $i = 1, \ldots, k$. We say $(|C\rangle, \{M_i^A, M_i^B\}_{i=1}^{k})$ is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \otimes \mathbb{C}^8$.

Definition 6.12.
A first-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \otimes \mathbb{C}^8$ is a unit vector $|C\rangle \in \mathbb{C}^8 \otimes \mathbb{C}^8$, and a collection of orthogonal projectors $\{M_i\}_{i=1}^{k}$ from $\mathbb{C}^8$ to $\mathbb{C}^8$, such that $(|C\rangle, \{M_i \otimes 1, 1 \otimes M_i\}_{i=1}^{k})$ is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \otimes \mathbb{C}^8$.

Definition 6.13. A second-type representation of $\{\mu_i\}_{i=1}^{k}$ on the tensor product space $\mathbb{C}^8 \otimes \mathbb{C}^8$ is a pair of unit vectors $|A\rangle, |B\rangle \in \mathbb{C}^8$, and a collection of orthogonal projectors $M_i$ from $\mathbb{C}^8$ to $\mathbb{C}^8$, for $i = 1, \ldots, k$, such that $(|A\rangle \otimes |B\rangle, \{M_i \otimes 1, 1 \otimes M_i\}_{i=1}^{k})$ is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \otimes \mathbb{C}^8$.

As in the case of conjunction, the first-type representation is in accordance with the modeling principles of quantum cognition, while the second-type and zero-type representations are a mathematical simplification and a generalization, respectively. These two representations will facilitate the data analysis.

Since Definitions 6.11–6.13 are trivial extensions of Definitions 5.5–5.7, obtaining a zero-, first-, and second-type representation by applying unitary transformations to a collection of representations for individual exemplars follows the same procedure presented in Theorem 5.8.

Corollary 6.14. The set of data $\{\mu_i\}_{i=1}^{k}$ has a zero-type representation in $\mathbb{C}^8 \otimes \mathbb{C}^8$ if and only if $\mu_i$ is classical data for $i = 1, \ldots, k$.

Corollary 6.15. The set of data $\{\mu_i\}_{i=1}^{k}$ has a second-type representation in $\mathbb{C}^8 \otimes \mathbb{C}^8$ if and only if for all $i = 1, \ldots, k$
\[
\mu_i(XY) = \mu_i(X)\mu_i(Y)
\tag{6.97}
\]
for $X = A, \bar{A}$, and $Y = B, \bar{B}$.

6.3.3 Fock Space Representation of Experimental Data

We now combine the representations of multiple exemplars developed in § 6.3.1 and § 6.3.2 to represent sets of data in the two-sector Fock space model in a way that is consistent with the modeling principles of quantum cognition in the concrete space $\mathbb{C}^8 \oplus \mathbb{C}^8 \otimes \mathbb{C}^8$.

First, we need to introduce the state vectors and the membership formulas. The model requires state vectors that represent the state of the concept combinations.
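These states correspond to the superposition of the concept combination represented in the first and second sectors. So the state vectors for the concept combinations are given by

\[
|\psi_{XY}\rangle = \frac{n_{XY}\,e^{i\rho}}{\sqrt{2}}\,(|X\rangle + |Y\rangle) + \sqrt{1 - n_{XY}^2}\;e^{i\theta}\,|C\rangle. \tag{6.98}
\]

Hence, the membership weights for the concept combinations are given by

\[
\begin{aligned}
\mu_i(XY) &= \langle\psi_{XY}|\,\mathbf{M} \oplus (\mathbf{M}\otimes\mathbf{M})\,|\psi_{XY}\rangle \\
&= \frac{n_{XY}^2}{2}\,(\langle X| + \langle Y|)\,\mathbf{M}\,(|X\rangle + |Y\rangle) + (1 - n_{XY}^2)\,\langle C|\mathbf{M}\otimes\mathbf{M}|C\rangle \\
&= \frac{n_{XY}^2}{2}\,\bigl(\langle X|\mathbf{M}|X\rangle + \langle Y|\mathbf{M}|Y\rangle + \langle X|\mathbf{M}|Y\rangle + \langle Y|\mathbf{M}|X\rangle\bigr) + (1 - n_{XY}^2)\sum_{i,j=5}^{8} c_{ij}^2 \\
&= n_{XY}^2\left(\frac{\mu_i(X) + \mu_i(Y)}{2} + \operatorname{Re}\langle X|\mathbf{M}|Y\rangle\right) + (1 - n_{XY}^2)\sum_{i,j=5}^{8} c_{ij}^2 \\
&= n_{XY}^2\left(\frac{\mu_i(X) + \mu_i(Y)}{2} + \Bigl(\sum_{i=5}^{8} a_i b_i\Bigr)\cos(\phi_B - \phi_A)\right) + (1 - n_{XY}^2)\,\check{\mu}_i(XY).
\end{aligned}
\tag{6.99}
\]

Eq. (6.99) expresses the membership weights for conjunctions of concepts $A$, $B$, and their negations. We can introduce the representations of multiple exemplars as in § 5.2.3.

Definition 6.16. A zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$ consists of an orthonormal set $\{|A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle\}$ of vectors in $\mathbb{C}^8$, a tensor $|C\rangle \in \mathbb{C}^8 \otimes \mathbb{C}^8$, a collection of operators $\{\mathbf{M}_i, \mathbf{M}^A_i, \mathbf{M}^B_i\}_{i=1}^{k}$ from $\mathbb{C}^8 \otimes \mathbb{C}^8$ to $\mathbb{C}^8 \otimes \mathbb{C}^8$, and coefficients $0 \le n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}} \le 1$ such that, for all $i = 1, \dots, k$, Eqs. (6.41), (6.43), (6.84), (6.86)–(6.89), and (6.99) are satisfied. We say that

\[
(n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}}, |A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle, |C\rangle, \{\mathbf{M}_i, \mathbf{M}^A_i, \mathbf{M}^B_i\}_{i=1}^{k})
\]

is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$.

Definition 6.17. A first-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$ is a zero-type representation

\[
(n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}}, |A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle, |C\rangle, \{\mathbf{M}_i, \mathbf{M}^A_i, \mathbf{M}^B_i\}_{i=1}^{k})
\]

of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$ such that, for all $i = 1, \dots, k$, $\mathbf{M}^A_i = \mathbf{M}_i \otimes \mathbf{1}$ and $\mathbf{M}^B_i = \mathbf{1} \otimes \mathbf{M}_i$.

Definition 6.18. A second-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$ is a zero-type representation

\[
(n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}}, |A\rangle, |B\rangle, |\bar{A}\rangle, |\bar{B}\rangle, |C\rangle, \{\mathbf{M}_i, \mathbf{M}^A_i, \mathbf{M}^B_i\}_{i=1}^{k})
\]

of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$ such that $|C\rangle = |A\rangle \otimes |B\rangle$.

The following result summarizes the cases where the two-sector Fock space model for conjunctions and negations of concepts can represent the membership weights for a collection of exemplars:

Theorem 6.19. There exists a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in the Fock space model if and only if there exist parameters $0 \le n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}} \le 1$ such that, for all $i = 1, \dots, k$,

\[
\mu_i(XY) = n_{XY}^2\,\tilde{\mu}_i(XY) + (1 - n_{XY}^2)\,\check{\mu}_i(XY), \tag{6.100}
\]

with

\[
\begin{aligned}
\tilde{\mu}_i(XY) &\in [h_{\min}(XY), h_{\max}(XY)],\\
\check{\mu}_i(XY) &\in [t_{\min}(XY), t_{\max}(XY)].
\end{aligned}
\tag{6.101}
\]

Proof. Eq. (6.101) implies we can build a representation $(|A\rangle, |B\rangle, \{\mathbf{M}_i\})$ of $\{\tilde{\mu}_i\}_{i=1}^{k}$ in $\mathbb{C}^8$, and a zero-type representation $(|C\rangle, \{\mathbf{M}^A_i, \mathbf{M}^B_i\})$ of $\{\check{\mu}_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \otimes \mathbb{C}^8$. Next, Eq. (6.100) implies there are parameters $n_{AB}$, $n_{\bar{A}B}$, $n_{A\bar{B}}$, and $n_{\bar{A}\bar{B}}$ such that Eq. (6.99) is satisfied by $|A\rangle$, $|B\rangle$, $|C\rangle$, $\mathbf{M}_i$, $\mathbf{M}^A_i$, and $\mathbf{M}^B_i$ for each $i = 1, \dots, k$. Therefore,

\[
(n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}}, |A\rangle, |B\rangle, |C\rangle, \{\mathbf{M}_i, \mathbf{M}^A_i, \mathbf{M}^B_i\}_{i=1}^{k})
\]

is a zero-type representation of $\{\mu_i\}_{i=1}^{k}$ in $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$.

The following corollary characterizes the sets of data that allow for a second-type representation.

Corollary 6.20. There exists a second-type representation of $\{\mu_i\}_{i=1}^{k}$ in the Fock space model if and only if there exist parameters $0 \le n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}} \le 1$ such that, for all $i = 1, \dots, k$,

\[
\mu_i(XY) = n_{XY}^2\,\tilde{\mu}_i(XY) + (1 - n_{XY}^2)\,\mu_i(X)\,\mu_i(Y), \tag{6.102}
\]

with

\[
\tilde{\mu}_i(XY) \in [h_{\min}(XY), h_{\max}(XY)]. \tag{6.103}
\]

Proof. Because we impose the extra constraint $|C\rangle = |A\rangle \otimes |B\rangle$ for the case of second-type representations, we can apply the proof of Theorem 6.19 using $\check{\mu}_i(XY) = \mu_i(X)\,\mu_i(Y)$.

We can now build representations of the data collected in the experiment described in § 6.2. Before representing the data, we revisit the results obtained in § 6.2.1 in light of this model. 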
In particular, we show that this model is capable of describing the experimental deviations from the classicality parameters in Definition 6.3.

Since we know from § 6.2.1 that individual concepts behave classically, we focus our analysis on concept combinations. Therefore, we calculate the value of the parameters $I_A$, $I_B$, $I_{\bar{A}}$, and $I_{\bar{B}}$ in the Fock space model. For simplicity, we assume

\[
n_{XY} = n, \tag{6.104}
\]

and

\[
\begin{aligned}
\operatorname{Re}\langle A|\mathbf{M}|B\rangle + \operatorname{Re}\langle A|\mathbf{M}|\bar{B}\rangle &= 0,\\
\operatorname{Re}\langle A|\mathbf{M}|B\rangle + \operatorname{Re}\langle \bar{A}|\mathbf{M}|B\rangle &= 0,\\
\operatorname{Re}\langle \bar{A}|\mathbf{M}|B\rangle + \operatorname{Re}\langle \bar{A}|\mathbf{M}|\bar{B}\rangle &= 0,\\
\operatorname{Re}\langle A|\mathbf{M}|\bar{B}\rangle + \operatorname{Re}\langle \bar{A}|\mathbf{M}|\bar{B}\rangle &= 0.
\end{aligned}
\tag{6.105}
\]

Then, applying Eq. (6.99) to the definition of $I_A$ in Eq. (6.28) yields

\[
\begin{aligned}
I_A &= \mu_i(A) - \mu_i(AB) - \mu_i(A\bar{B}) \\
&= \mu_i(A) - n^2\left(\mu_i(A) + \frac{\mu_i(B) + \mu_i(\bar{B})}{2}\right) - (1 - n^2)\bigl(\check{\mu}_i(AB) + \check{\mu}_i(A\bar{B})\bigr).
\end{aligned}
\tag{6.106}
\]

From our results in § 6.2.1, we can assume that individual concepts satisfy the classicality condition

\[
\mu_i(B) + \mu_i(\bar{B}) = 1. \tag{6.107}
\]

Theorem 6.10 shows that the second sector also satisfies the classicality conditions. Therefore,

\[
\check{\mu}_i(AB) + \check{\mu}_i(A\bar{B}) = \mu_i(A). \tag{6.108}
\]

Substituting Eqs. (6.107) and (6.108) in Eq. (6.106) yields

\[
I_A = \mu_i(A) - n^2\left(\mu_i(A) + \frac{1}{2}\right) - (1 - n^2)\,\mu_i(A) = -\frac{n^2}{2}. \tag{6.109}
\]

Eq. (6.109) shows that the classicality condition $I_A = 0$ is violated by a term proportional to $n^2$. The same result can be obtained for the parameters $I_B$, $I_{\bar{A}}$, and $I_{\bar{B}}$.

We can compare experimental values of these parameters with the result of Eq. (6.109). We use the experimental values of the parameters $I_A$, $I_B$, $I_{\bar{A}}$, and $I_{\bar{B}}$ (see Table 6.2) to compute $n$. The result shows that $n$ fluctuates between 0.7 and 1. This is consistent with the data analysis of § 5.3. There, we demonstrated that the best performance for the Fock space model for conjunctions is obtained when $n_{AB}$ is approximately 0.8 (see Figure 5.1).

This confirms that the contribution from the first sector of the Fock space is larger than that of the second sector. 
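As a minimal sketch of how Eq. (6.109) can be inverted to recover $n$ from a measured classicality parameter, the following Python function solves $I = -n^2/2$ for $n$. The function name and the input value are illustrative assumptions and are not taken from Table 6.2.

```python
from math import sqrt


def n_from_classicality_parameter(i_param: float) -> float:
    """Solve I = -n^2 / 2 (Eq. (6.109)) for n >= 0.

    `i_param` is a hypothetical measured value of I_A, I_B, etc.
    The relation only admits a real solution when the parameter is
    non-positive, so positive inputs are rejected.
    """
    if i_param > 0:
        raise ValueError("Eq. (6.109) requires a non-positive parameter")
    return sqrt(-2.0 * i_param)


# Illustrative input only: I = -0.32 corresponds to n = 0.8.
n_example = n_from_classicality_parameter(-0.32)
```

Under this relation, values of $n$ between 0.7 and 1 correspond to classicality parameters between roughly $-0.245$ and $-0.5$.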
Since each sector represents a different mode of thought, this suggests that the contribution from the emergent mode of thought is larger than that of the logical mode of thought.

6.4 Examples and Data Representation Analysis

Before analyzing the performance of the model, we give a concrete representation for an example from the experimental data. Consider the exemplar ‘olive’ in Table B.4 (Appendix B). The membership weights are

\[
\begin{aligned}
&\mu_i(A) = 0.53, \quad \mu_i(B) = 0.63, \quad \mu_i(\bar{A}) = 0.47, \quad \mu_i(\bar{B}) = 0.44,\\
&\mu_i(AB) = 0.65, \quad \mu_i(A\bar{B}) = 0.34, \quad \mu_i(\bar{A}B) = 0.51, \quad \mu_i(\bar{A}\bar{B}) = 0.36.
\end{aligned}
\tag{6.110}
\]

This exemplar can be represented in the Fock space $\mathbb{C}^8 \oplus (\mathbb{C}^8 \otimes \mathbb{C}^8)$ by making the choices

\[
\begin{aligned}
n_{AB} &= n_{A\bar{B}} = n_{\bar{A}B} = n_{\bar{A}\bar{B}} = 1,\\
|A\rangle &= e^{i\phi_A}(0.47,\ 0.48,\ 0.26,\ 0.14,\ -0.61,\ -0.20,\ -0.04,\ 0.23),\\
|B\rangle &= e^{i\phi_B}(0.05,\ -0.44,\ -0.64,\ 0.14,\ -0.33,\ -0.44,\ -0.22,\ 0.15),\\
|\bar{A}\rangle &= e^{i\phi_{\bar{A}}}(0.46,\ -0.039,\ -0.42,\ 0.28,\ -0.034,\ 0.62,\ 0.37,\ -0.04),\\
|\bar{B}\rangle &= e^{i\phi_{\bar{B}}}(0.43,\ 0.1,\ 0.047,\ 0.49,\ 0.59,\ -0.33,\ -0.27,\ -0.18),
\end{aligned}
\tag{6.111}
\]

with

\[
\begin{aligned}
\phi_{AB} &= \phi_B - \phi_A = 102.18^\circ,\\
\phi_{A\bar{B}} &= \phi_{\bar{B}} - \phi_A = 116.27^\circ,\\
\phi_{\bar{A}B} &= \phi_B - \phi_{\bar{A}} = 97.28^\circ,\\
\phi_{\bar{A}\bar{B}} &= \phi_{\bar{B}} - \phi_{\bar{A}} = 107.51^\circ,
\end{aligned}
\tag{6.112}
\]

and by characterizing the state $|C\rangle$ as follows:

\[
\sum_{i=5}^{8}\sum_{j=5}^{8} c_{ij}^2 = 0.640, \quad
\sum_{i=5}^{8}\sum_{j=1}^{4} c_{ij}^2 = 0.347, \quad
\sum_{i=1}^{4}\sum_{j=5}^{8} c_{ij}^2 = 0.469, \quad
\sum_{i=1}^{4}\sum_{j=1}^{4} c_{ij}^2 = 0.5.
\tag{6.113}
\]

We estimated the number of exemplars that can individually be represented assuming Eq. (6.104), and found that both the zero- and second-type representations can model 95% of the exemplars in the data set. We also calculated the performance achieved for specific values of $n_{AB}$, $n_{\bar{A}B}$, $n_{A\bar{B}}$, and $n_{\bar{A}\bar{B}}$. In Fig. 6.2, we show the number of exemplars for which Eq. (6.99) is satisfied for $0 \le n_{AB}, n_{\bar{A}B}, n_{A\bar{B}}, n_{\bar{A}\bar{B}} \le 1$, for $j = 1, \dots, 4$. The blue, red, yellow, and green curves in the graphs of the first column correspond to the combinations $A_jB_j$, $\bar{A}_jB_j$, $A_j\bar{B}_j$, and $\bar{A}_j\bar{B}_j$, respectively. The first column assumes that $\check{\mu}_i(XY)$ in Eq. 
(6.99) must satisfy the constraints for classical data, and the second column assumes that $\check{\mu}_i(XY) = \mu_i(X)\,\mu_i(Y)$. Therefore, by Theorem 6.19 and Corollary 6.20, the first and second columns indicate the number of exemplars that have a zero- and a second-type representation, respectively.

We observed a very similar pattern across all the concept combinations. For the first column, the performance remains low for small values of $n_{XY}$, the number of exemplars increases steadily, reaching the highest performance for $n_{XY}$ between 0.6 and 0.8, and remains stable thereafter. The same pattern appears in the second column, except that the performance remains low up to $n_{XY}$ approximately equal to 0.5, and the highest performance is reached for values between 0.7 and 0.9. Therefore, we conclude that the two-sector Fock space model is able to represent almost all exemplars when the first sector is dominant. Interestingly, this result is consistent with our analysis for conjunction in Chapter 5.

Figure 6.2: Number of exemplars having a zero- and second-type representation for different values of $n_{XY}$.

To provide a simpler presentation of our results, we present in Fig. 6.3 the second-type representation when the four parameters $n_{XY}$ are equal to the same value $n$. In this restricted case, we show the fraction of exemplars that simultaneously satisfy Eq. (6.99). The blue, red, yellow, and green curves represent the cases $j = 1, 2, 3,$ and $4$, respectively.

Here we can clearly see the representation pattern explained above. In fact, none of the exemplars can be represented for $n < 0.7$. In addition, the fraction of exemplars that can be modeled increases abruptly from 0 to approximately 0.8 for $0.7 \le n \le 0.9$, and remains stable for $n > 0.9$. Moreover, the maximal performance in all cases does not surpass 90%. 
This implies that there are some exemplars in the data set that cannot be represented for a fixed $n$ since, from our previous analysis, we know that 95.8% of the exemplars can be represented by the model.

Figure 6.3: Fraction of experimental data that can be simultaneously represented in the Fock space models for specific values of $n$.

From our theoretical and experimental analyses, we conclude that when concepts are combined using conjunctions and negations, the emergent mode of thought, represented by the first sector of the Fock space, is predominant over the logical mode of thought, represented by the second sector of the Fock space. In particular, the emergent mode of thought contributes approximately 70% or more to the conceptual combination state.

Chapter 7
Quantum Structures in Natural Language Processing

The advent of the internet and the consequent technological revolution have transformed the field of language processing into a major challenge for science. The area of research that focuses on performing non-trivial information tasks such as translating a document into a foreign language, known as machine translation, or identifying relevant information in a collection of documents, known as information retrieval, is called natural language processing (NLP) [JMK+00].

It is well known in the artificial intelligence community that many information tasks are not easily automated [RN95]. The most famous example of such a task was introduced by one of the founding fathers of the theory of computation, Alan Turing [SCA03]. He proposed the following intelligence test for machines: Suppose we have a chat opened between two computers in different rooms. In one of the rooms there is a human who does not know who he/she is going to talk to, and in the other room there is a machine that has no interaction with or feedback from humans, except for the human on the other side of the chat. 
The ‘Turing test’ consists of asking the human, after a few minutes of conversation, whether the conversation he/she had on the chat corresponds to an interaction with a machine or with a human. If participants tend to believe they are speaking with a human, we say the machine has passed the Turing test. When the test was proposed, most researchers in the field thought that the Turing test would be passed after 10 or 20 years of research. Now, after nearly 70 years, the test remains an open problem, and there is no theory that sheds light on how it could be resolved [McC]. The Turing test illustrates the difficulties associated with automated language processing.

7.1 Language, Concepts, and Quantum Structures

An NLP model operates on ‘linguistic units’ that can correspond to words, sentences, paragraphs, documents, or even collections of documents. The model usually involves a ‘syntactical’ part that is concerned with the grammatical correctness of the linguistic units, and a ‘semantic’ part that focuses on the meaning of such units.

In NLP, syntactical models are generally built upon one of the many mathematically well-grounded theories of syntax. Examples of these theories are generative, dependency, and functionalist grammars [Cho02]. For the case of semantics, however, there are no widely accepted formal theories. In fact, semantic models in NLP are usually built in an ad-hoc manner [MS99]. For example, semantic approaches based on ontologies are often dependent on the topic of discourse that is being processed [Sow00]. 
Since semantics is a fundamental part of NLP models, most researchers believe that the lack of a theory for semantics is one of the most important impediments to the achievement of human-level performance, and that fundamentally new ideas are required to achieve significant progress [McC].

Although philosophers of language have proposed that the meaning of words can be represented using a concept-theoretical framework [Sea04], the dominant research methodology in NLP does not follow this approach. In fact, most researchers in NLP do not look for representational frameworks for the meaning of linguistic units, but rather focus on matching the performance of human annotators’ ‘gold standards’ using ad-hoc models for particular language tasks [Pel06].

From a methodological perspective, concept modeling and NLP are similar because they are generally approached in an ad-hoc manner. Moreover, because both areas are concerned with the study of meaning, they have similar structural problems.

For example, it is a well-known fact in NLP that semantic relations in language are graded [Zad65, Mur03, Tur01], and that the gradedness structure of these relations can be better understood using contextual information [STZ05, BYBM11, Nav09, LC98]. However, it is not clear whether it is better to define semantic relations for words [Fel98], sets of words [BFL98], or grammar structures [Cow98], and whether contexts should be defined in terms of windows of text¹² or grammar structures. As in the case of concept modeling, several theories have been proposed to define the basic units of study and their contexts, but there is no general agreement among researchers [BB05].

There is an open debate in NLP as to whether the meaning of the combination of lexical units can be represented in terms of the meaning of the original linguistic units. 
While several models introduce methods to combine linguistic units using syntactical structures [Cho02], vector-based representations [Bar13], or heuristic approaches [BP03], it is not clear which method has the best performance [ML10]. In fact, various scholars have proposed that the meaning of word combinations is non-compositional [BZL10, Sve08], and that the study of such meaning will require new representational tools [Gra90].

Quantum cognition has been successful in handling similar structural problems for the modeling of concepts. Note that the meaning of a certain piece of text in a document can be associated with a concept, and pieces of text, such as words or paragraphs, can be thought of as exemplars of a concept. Therefore, quantum cognition could offer an alternative approach to represent certain NLP tasks. By studying how words and sentences tend to appear in a document or a collection of documents, we can identify the concepts that give meaning to the document, and thus determine whether or not these concepts exhibit a quantum structure.

In the following section, we present two examples using this methodology. In the first case, we study word co-occurrence in a corpus of text to identify entangled concepts in the corpus and, in the second case, we study a property of quantum particles, called indistinguishability, using statistical information obtained from a web search engine.

7.2 Evidence of Quantum Structure in Natural Language Processing

We present the results of experiments conducted to identify quantum structures in the statistical analysis of natural language data.

¹² A window of text is a sequence of words that contains a linguistic unit in a piece of text and is larger than the linguistic unit.

7.2.1 Quantum Entanglement in Text Corpora

Recall that to test whether two abstract entities $A$ and $B$ are entangled, we need two measurements for each entity, with each measurement having two possible outcomes. 
Then, the concepts are entangled if the Clauser-Horne-Shimony-Holt (CHSH) inequality, Eq. (3.73), is violated. Since we can associate the exemplars of a concept with words, and estimate the elicitation of concepts in a document by counting the number of times their exemplars appear in the document, we propose to test entanglement of concepts in a corpus of text by conducting the following experiment:

Consider a corpus of text $T$ and two concepts $A$ and $B$. Choose eight words $w_1, \dots, w_8$, where $w_1, \dots, w_4$ are exemplars of the concept $A$, and $w_5, \dots, w_8$ are exemplars of the concept $B$. Since entanglement is measured using the statistical co-occurrence of the exemplars of a concept, we partition $T$ into a collection of $n$ consecutive windows of text $\{t_1, \dots, t_n\}$, and estimate how the words representing the exemplars of $A$ and $B$ co-occur in these windows of text.

Let $N \in \mathbb{N}$ and assume that each $t_i$ is a window of $N$ words, for $i = 1, \dots, n$, with $n = n(N)$. Next, let $M^i_A = \{w_1, w_2\}$, $M^i_{A'} = \{w_3, w_4\}$, $M^i_B = \{w_5, w_6\}$, and $M^i_{B'} = \{w_7, w_8\}$ be the measurements whose outcome is $+1$ if the first word is in $t_i$, $-1$ if the second word is in $t_i$, and $0$ if neither or both words are in $t_i$. Finally, let $M^i_{XY}$ be the joint measurements whose outcome is the product of the outcomes of the former measurements, and denote by $E(M_{XY})$ the expected value of such a measurement for $X = A$ or $A'$, and $Y = B$ or $B'$.

We say that the concepts $A$ and $B$ of the corpus of text cannot be represented using a classical probabilistic model whenever the inequality

\[
-2 \le E(M_{AB}) + E(M_{A'B}) + E(M_{AB'}) - E(M_{A'B'}) \le 2 \tag{7.1}
\]

is violated.

Let the frequency matrix $M(A, A', B, B', N)$ be defined by

\[
M(A, A', B, B', N) =
\begin{pmatrix}
F(A_1B_1) & F(A_1B_2) & F(A_1B'_1) & F(A_1B'_2)\\
F(A_2B_1) & F(A_2B_2) & F(A_2B'_1) & F(A_2B'_2)\\
F(A'_1B_1) & F(A'_1B_2) & F(A'_1B'_1) & F(A'_1B'_2)\\
F(A'_2B_1) & F(A'_2B_2) & F(A'_2B'_1) & F(A'_2B'_2)
\end{pmatrix},
\tag{7.2}
\]

where

\[
F(XY) = \sum_{i=1}^{n} M^i_{XY}, \tag{7.3}
\]

with $X = A_j$ or $A'_j$, and $Y = B_j$ or $B'_j$, for $j = 1, 2$.

Note that each quadrant in the matrix corresponds to the frequency table of one of the joint experiments. Also, since

\[
E(M_{AB}) = \frac{F(A_1B_1) + F(A_2B_2) - F(A_1B_2) - F(A_2B_1)}{F(A_1B_1) + F(A_2B_2) + F(A_1B_2) + F(A_2B_1)}, \tag{7.4}
\]

we can estimate Eq. (7.1) from $M(A, A', B, B', N)$. Because we are concerned with the statistics of joint experiments, it is also important to verify whether or not the marginal probability law holds¹³.

The marginal probability law implies that

\[
\begin{aligned}
p(A_1, B) &= \frac{F(A_1B_1) + F(A_1B_2)}{\sum_{i,j=1}^{2} F(A_iB_j)} = \frac{F(A_1B'_1) + F(A_1B'_2)}{\sum_{i,j=1}^{2} F(A_iB'_j)} = p(A_1, B'),\\[4pt]
p(A'_1, B) &= \frac{F(A'_1B_1) + F(A'_1B_2)}{\sum_{i,j=1}^{2} F(A'_iB_j)} = \frac{F(A'_1B'_1) + F(A'_1B'_2)}{\sum_{i,j=1}^{2} F(A'_iB'_j)} = p(A'_1, B'),\\[4pt]
p(A_2, B) &= \frac{F(A_2B_1) + F(A_2B_2)}{\sum_{i,j=1}^{2} F(A_iB_j)} = \frac{F(A_2B'_1) + F(A_2B'_2)}{\sum_{i,j=1}^{2} F(A_iB'_j)} = p(A_2, B'),\\[4pt]
p(A'_2, B) &= \frac{F(A'_2B_1) + F(A'_2B_2)}{\sum_{i,j=1}^{2} F(A'_iB_j)} = \frac{F(A'_2B'_1) + F(A'_2B'_2)}{\sum_{i,j=1}^{2} F(A'_iB'_j)} = p(A'_2, B').
\end{aligned}
\tag{7.5}
\]

Similarly,

\[
\begin{aligned}
p(A, B_1) &= p(A', B_1),\\
p(A, B_2) &= p(A', B_2),\\
p(A, B'_1) &= p(A', B'_1),\\
p(A, B'_2) &= p(A', B'_2).
\end{aligned}
\tag{7.6}
\]

¹³ It has recently been proven that the simultaneous verification of the CHSH inequalities and the marginal probability law is a sufficient test of entanglement (see [DK14, ASV14a]).

We define the vector

\[
\begin{aligned}
r = \bigl(&p(A_1, B) - p(A_1, B'),\ p(A'_1, B) - p(A'_1, B'),\ p(A_2, B) - p(A_2, B'),\\
&p(A'_2, B) - p(A'_2, B'),\ p(B_1, A) - p(B_1, A'),\ p(B'_1, A) - p(B'_1, A'),\\
&p(B_2, A) - p(B_2, A'),\ p(B'_2, A) - p(B'_2, A')\bigr)
\end{aligned}
\tag{7.7}
\]

to record the extent to which the marginal probability law is satisfied by each joint experiment, and quantify the violation of the marginal probability law by

\[
\delta = \sup_{i=1,\dots,8} \|r_i\|_\infty, \tag{7.8}
\]

where $\|\cdot\|_\infty$ is the supremum norm.

The first step in the experiment is to identify two sets of four words that are exemplars of the concepts $A$ and $B$. Since a concept can be associated with multiple sets of words, and a set of words can be associated with multiple concepts [RS10], this is not a trivial procedure. We propose the following methodology:

1. Select a set of statistically relevant words, $W$, in the corpus.
2. Determine concepts $A$ and $B$ using all the possible combinations of four words in $W$.
3. Compute the CHSH inequality and the value of $r$ for each choice of $A$ and $B$.

We applied this methodology to a corpus collection called ‘TREC collection WSJ8792 Lemur 4.12’¹⁴. The corpus was pre-processed by removing stop terms and applying the Porter stemmer. Among TREC topics 151-200, the 32 topics having 70 or more documents were selected. For these topics, we segmented the documents into windows of words of length $N = 5$, $10$, and $20$. For each topic, 2 sets of 10 words were chosen using two popular relevance criteria.

¹⁴ Lemur is an open source project that develops search engines and text analysis tools for research and development of information retrieval and text mining software.

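The windowed CHSH estimate described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the code used for the reported experiments: the function names are hypothetical, the corpus is assumed to be an already pre-processed list of tokens, the windows are taken as non-overlapping, and an expectation with no non-zero joint outcome is set to 0 by convention.

```python
from typing import List, Sequence, Set, Tuple

Pair = Tuple[str, str]


def outcome(window: Set[str], pair: Pair) -> int:
    """+1 if only the first word occurs, -1 if only the second, 0 otherwise."""
    first_present = pair[0] in window
    second_present = pair[1] in window
    if first_present == second_present:  # neither word or both words
        return 0
    return 1 if first_present else -1


def expectation(windows: Sequence[Set[str]], pair_x: Pair, pair_y: Pair) -> float:
    """Estimate E(M_XY) as in Eq. (7.4): mean product over non-zero joint outcomes."""
    products = [outcome(w, pair_x) * outcome(w, pair_y) for w in windows]
    nonzero = [p for p in products if p != 0]
    # Assumed convention: no informative window gives an expectation of 0.
    return sum(nonzero) / len(nonzero) if nonzero else 0.0


def chsh_value(tokens: List[str], words_a: Sequence[str],
               words_b: Sequence[str], window_size: int) -> float:
    """Middle term of Eq. (7.1) for four exemplar words per concept."""
    windows = [set(tokens[k:k + window_size])
               for k in range(0, len(tokens), window_size)]
    a, a_prime = (words_a[0], words_a[1]), (words_a[2], words_a[3])
    b, b_prime = (words_b[0], words_b[1]), (words_b[2], words_b[3])
    return (expectation(windows, a, b) + expectation(windows, a_prime, b)
            + expectation(windows, a, b_prime) - expectation(windows, a_prime, b_prime))
```

A choice of words violates Eq. (7.1) when the absolute value returned by `chsh_value` exceeds 2. The marginal differences in Eq. (7.7) could be computed from the same windows, but are omitted here for brevity.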
The first criterion, known as the tf score, simply ranks words by their frequency in the corpus. The second, called the term frequency-inverse document frequency or tf-idf score, is a numerical statistic that reflects how important a word is to a document in a corpus. The tf-idf score is often used as a weighting factor in information retrieval and text mining [RUUU12]. For each criterion, the first and second sets of 10 words were used to build all possible measurements $M_A$, $M_{A'}$ and $M_B$, $M_{B'}$, respectively.

Our statistical analysis indicates that we can identify entangled concepts from the statistical co-occurrence of words using a corpus of text. Fig. 7.1 shows the proportion, $p_N(T)$, of subsets of words for the 32-topic corpus that violate Eq. (7.1). The black curve corresponds to $N = 20$, the gray dotted curve corresponds to $N = 10$, and the black dashed curve corresponds to $N = 5$. The left plot is based on the word choice using the frequency relevance criterion, and the right plot is based on the word choice using the tf-idf relevance criterion.

We observe that the tf-idf score selects more sets of words violating Eq. (7.1) than the tf score. This is consistent with the fact that tf-idf is a better word-relevance measure than tf. Although some topics exhibit more violations than others, we identify a strong tendency for violations of Eq. (7.1) for most topics. Moreover, the violation decreases when $N$ increases. This is consistent with the fact that word correlations are noisy for large window sizes [iCS01]. For shorter window sizes, it is more likely that only meaningful correlations between words will be kept, and hence the violation of Eq. (7.1) is observed more frequently.

In Fig. 7.2, each point is associated with a set of words. The x-axis corresponds to the CHSH inequality value obtained using Eq. (7.1), and the y-axis corresponds to the $\delta$ value. We plot points whose CHSH value is 1.5 or larger to better visualize the behavior near the violation threshold. 
Since we are interested in observing sets of words that violate the CHSH inequality and satisfy the marginal probability law, the left plot shows $\delta \in [0, 1]$, the middle plot zooms to $\delta \in [0, 0.4]$, and the right plot zooms to $\delta \in [0, 0.1]$. We can visualize the extent to which the CHSH inequality is violated for different values of $\delta$ by the density of points in each region of the plots.

In most cases where the CHSH inequality is violated, the marginal probability law is not preserved. However, it is possible to identify a region where the CHSH inequality is violated and the violation of the marginal probability law is very small. In fact, since $\delta$ is a supremum norm, the right plot shows that there are sets of words that violate the CHSH inequality with $\delta < 0.05$. This indicates that there could be sets of words that violate the CHSH inequality and satisfy the marginal probability law.

Figure 7.1: Frequency of the violation of Eq. (7.1) for the 20 most relevant terms. The left plot corresponds to the co-occurrence data for relevance associated with the term-frequency score, and the right plot corresponds to the co-occurrence data for relevance associated with the tf-idf score. In both plots, the topics were sorted such that $p_5(T)$ is decreasing, so that the curves do not cross each other.

Figure 7.2: Each point corresponds to the choice of particular measurements $A$, $A'$, $B$, and $B'$. The x-axis represents the extent to which Eq. (7.1) is violated and the y-axis denotes the $\delta$ value. We consider three scales for the $\delta$ value, and one single scale for the middle term of the CHSH inequality. Points to the right of the red line violate the CHSH inequality.

7.2.2 Indistinguishability of Concepts and Bose-Einstein Statistics

One of the most profound differences between quantum and classical physics is how identical particles behave statistically. 
While classical particles are distinguishable, and thus governed by the Maxwell-Boltzmann (MB) distribution, quantum particles are indistinguishable. Quantum particles are governed by the Bose-Einstein (BE) distribution in the case of integer-spin particles, and by the Fermi-Dirac (FD) distribution in the case of half-integer spin particles.

Since the statistics of identical particles illustrate a fundamental difference between classical and quantum entities, we propose to study the statistical behaviour of a collection of concepts to determine whether they behave as classical or quantum entities.

Consider for example the linguistic expression ‘eleven animals.’ This expression can be viewed as the combination of concepts ‘Eleven’ and ‘Animals’ into ‘Eleven Animals.’ The concept ‘Eleven Animals’ corresponds to an abstract idea of eleven animals. So the linguistic expression ‘eleven animals’ elicits the thought of eleven indistinguishable entities. However, the same linguistic expression can also elicit the thought of eleven animals as objects existing in space and time, and thus distinguishable from each other. This intuitive difference between reasoning about concepts and reasoning about objects is what motivates the development of a methodology to test which type of elicitation is predominant.

In order to explain the next experiment, we first summarize the statistical differences between the classical and quantum distributions of physical entities. These differences are then analyzed with respect to concepts using experimental data from both a psychological and a computational study.

The Statistics of Indistinguishability

In classical mechanics, the state of an individual particle is represented by a pair $(q, p) \in \Omega$, where $q$ denotes the particle’s position and $p$ its momentum. The set $\Omega$ is called the phase space of the particle. The particle’s evolution is ruled by specific dynamical laws. As the number of particles increases, the dynamical description of the system becomes intractable. In this case, classical statistical mechanics is introduced to describe the properties of the system [Mar12]. For a classical system, the MB distribution estimates the likelihood of finding the system in each of its energy states.

A fundamental assumption in the derivation of the MB distribution is that all particles are ‘distinguishable.’ That is, one can always follow the trajectory of each particle and label the particles differently. Consider a system of $N$ distinguishable particles, and suppose that each particle can be in one of $M$ possible states. Then the total number of possible system configurations is $W_{MB}(N, M) = M^N$. Hence, the probability that a specific configuration $s$ is realized is

\[
P_{MB}(s) = \frac{T_{MB}(s)}{W_{MB}(N, M)}, \tag{7.9}
\]

where $T_{MB}(s)$ is the number of ways in which $s$ can be realized. For example, consider a system of $N = 2$ classical particles that can be distributed in $M = 2$ energy states. If one applies the MB distribution to this simple situation, then the number of possible arrangements is $W_{MB}(2, 2) = 4$, each one with probability $\tfrac{1}{4}$.

The situation is radically different in quantum mechanics, where the state of a system is represented by a probability wave function in a Hilbert space. Since the measurement of a system induces a collapse of the wave function describing the system, we cannot know, post-measurement, which particle collapsed to which state. Indeed, given two identical quantum particles, it is not possible to recognize whether an exchange has occurred between the two particles. More concretely, consider a system of two quantum particles $q_1$ and $q_2$, and suppose that it is represented by the unit vector $|\Psi(q_1, q_2)\rangle$ in a Hilbert space. Then, indistinguishability implies that

\[
|\langle\Psi(q_1, q_2)|\Psi(q_1, q_2)\rangle|^2 = |\langle\Psi(q_2, q_1)|\Psi(q_2, q_1)\rangle|^2. \tag{7.10}
\]

Therefore, we have either

\[
|\Psi(q_2, q_1)\rangle = |\Psi(q_1, q_2)\rangle, \quad \text{or} \quad |\Psi(q_2, q_1)\rangle = -|\Psi(q_1, q_2)\rangle. \tag{7.11}
\]

It follows from the spin-statistics theorem [KB62] that integer-spin particles, called ‘bosons,’ and half-integer spin particles, called ‘fermions,’ correspond to the first and second cases of Eq. (7.11), respectively. Moreover, the spin-statistics theorem implies that fermions are subject to the ‘Pauli exclusion principle,’ which means that only one fermion can occupy a specific quantum state at a specific time. This follows directly from the anti-symmetry of the wave function. For bosons there is no restriction on occupying the same state.

The above difference between fermions and bosons has a dramatic influence on the way both types of particles behave statistically. Let us consider again the situation of $N$ particles that can be distributed in $M$ single-particle states, and suppose that the particles are identical. For a system of $N$ identical bosons, the number of possible configurations is

\[
W_{BE}(N, M) = \frac{(N + M - 1)!}{N!\,(M - 1)!}, \tag{7.12}
\]

where $N! = N(N - 1)(N - 2) \cdots 1$. There are fewer arrangements available due to indistinguishability. In the case of fermions, the Pauli exclusion principle dictates that two fermions cannot be in the same state, which further reduces the number of possible configurations to

\[
W_{FD}(N, M) = \frac{M!}{N!\,(M - N)!}. \tag{7.13}
\]

By considering again the case $N = M = 2$, we have that, for a system of 2 identical bosons, $W_{BE}(2, 2) = (2 + 2 - 1)!/(2!\,(2 - 1)!) = 3$, and the probability for each realization is $1/3$. For a system of 2 identical fermions, only one realization is possible and it occurs with probability 1.

The above differences between distinguishable and indistinguishable particles are statistically significant and can be used to characterize empirical evidence for the indistinguishability of concept combinations. Suppose we consider two states of ‘Animal,’ namely $p =$ ‘cat’ and $q =$ ‘dog.’ Then, the concept ‘Eleven Animals’ gives rise to twelve possible states. 
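The configuration counts $W_{MB}$, $W_{BE}$, and $W_{FD}$ given above can be checked numerically with a short Python sketch. The function names are illustrative, and the binomial-coefficient form used for Eqs. (7.12) and (7.13) is simply a rewriting of the factorial expressions.

```python
from math import comb


def w_mb(n_particles: int, n_states: int) -> int:
    """Distinguishable particles: W_MB(N, M) = M^N."""
    return n_states ** n_particles


def w_be(n_particles: int, n_states: int) -> int:
    """Identical bosons, Eq. (7.12): (N + M - 1)! / (N! (M - 1)!)."""
    return comb(n_particles + n_states - 1, n_particles)


def w_fd(n_particles: int, n_states: int) -> int:
    """Identical fermions, Eq. (7.13): M! / (N! (M - N)!); zero if N > M."""
    return comb(n_states, n_particles)


# The N = M = 2 case discussed in the text.
counts_2_2 = (w_mb(2, 2), w_be(2, 2), w_fd(2, 2))  # (4, 3, 1)

# Eleven indistinguishable entities in two states ('cat' or 'dog').
states_eleven_animals = w_be(11, 2)  # 12
```

The last line recovers the twelve states of ‘Eleven Animals’ as the bosonic count for $N = 11$ and $M = 2$.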
We denote these twelve states by $p_{11,0} =$ ‘eleven cats,’ $p_{10,1} =$ ‘ten cats and one dog,’ and so on. For simplicity, we assume the existence of two probability values $\mu(p)$ and $\mu(q)$ that account for a possible bias towards one of the states. Thus, $\mu(p)$ and $\mu(q)$ are independent probabilities such that $\mu(p) + \mu(q) = 1$.

For the MB statistics, the probability of obtaining a state with $n$ cats and $11 - n$ dogs is given by

\[
P_{MB}(p_{n,11-n}, \mu(p), \mu(q)) = \frac{11!}{n!\,(11 - n)!}\,\mu(p)^{n}\,\mu(q)^{11-n}. \tag{7.14}
\]

For example, if $\mu(q) = \mu(p) = 0.5$, the number of possible arrangements for the state ‘eleven cats’ and for the state ‘eleven dogs’ is 1. Hence, the corresponding probability for these configurations is $P_{MB}(p_{0,11}, 0.5, 0.5) \approx 0.0005$.

Since we consider two states, and the FD distribution is subject to the Pauli exclusion principle, we cannot apply the FD statistics in this case. For the BE statistics, the probability of obtaining a state with $n$ cats and $11 - n$ dogs is given by

\[
P_{BE}(p_{n,11-n}, \mu(p), \mu(q)) = \frac{n\,\mu(p) + (11 - n)\,\mu(q)}{\frac{12 \times 11}{2}}. \tag{7.15}
\]

Since $\mu(p) = 1 - \mu(q)$, $P_{BE}(p_{n,11-n}, \mu(p), \mu(q))$ is linear with respect to $n$. Moreover, when $\mu(p) = \mu(q) = 0.5$, we have that $P_{BE}(p_{n,11-n}, 0.5, 0.5) = \tfrac{1}{12}$ is constant.

The above analysis shows that, if one performs experiments on a collection of concepts to estimate the probability of elicitation for each state, then it is possible to determine which type of distribution, MB or BE, is generated.

Psychological Experiment to Test Indistinguishability

The psychological experiment involved 88 participants. We considered a list of concepts $A^i$, for $i = 1, \dots, 14$, of different nature, both physical and non-physical, and two possible exemplars, $p^i$ and $q^i$, for each concept. Next, we requested participants to choose one exemplar from the combination $N^i A^i$ of concepts for $N^i \in \mathbb{N}$. 
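To make the contrast between Eqs. (7.14) and (7.15) concrete, the following Python sketch evaluates both distributions. The function names are illustrative, and replacing the fixed value 11 by a general total `total` (with normalization $(\text{total}+1)\,\text{total}/2$ in the BE case) is an assumption made here for convenience.

```python
from math import comb


def p_mb(n: int, total: int, mu_p: float) -> float:
    """Maxwell-Boltzmann probability of n items in state p, Eq. (7.14)."""
    mu_q = 1.0 - mu_p
    return comb(total, n) * mu_p ** n * mu_q ** (total - n)


def p_be(n: int, total: int, mu_p: float) -> float:
    """Bose-Einstein probability of n items in state p, Eq. (7.15).

    For total = 11 the denominator is 12 * 11 / 2 = 66, as in the text.
    """
    mu_q = 1.0 - mu_p
    return (n * mu_p + (total - n) * mu_q) / ((total + 1) * total / 2)


# Unbiased case: MB is peaked around the middle, BE is flat at 1/12.
mb_unbiased = [p_mb(n, 11, 0.5) for n in range(12)]
be_unbiased = [p_be(n, 11, 0.5) for n in range(12)]
```

In the unbiased case the MB values follow a binomial shape with extremes equal to $0.5^{11}$, whereas every BE value equals $1/12$; for $\mu(p) \neq 0.5$ the BE values vary linearly in $n$.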
The exemplars of these combinations of concepts are the states $p^i_{k,N^i-k}$ describing the conceptual combination ‘$k$ exemplars in state $p^i$ and $(N^i-k)$ exemplars in state $q^i$’, where $k$ is an integer such that $k = 0, \ldots, N^i$. For example, the first collection of concepts we considered is $N^1 A^1$ corresponding to the compound conceptual entity ‘Eleven Animals,’ with $p^1$ and $q^1$ describing the exemplars ‘cat’ and ‘dog’ of the concept ‘Animal,’ and $N^1 = 11$. The exemplars considered are $p^1_{11,0}$, $p^1_{10,1}$, \ldots, $p^1_{1,10}$, and $p^1_{0,11}$. The collections of concepts used in the experiment and their corresponding exemplars are listed in Table 7.1.

For each $i = 1, \ldots, 14$, we fitted the experimental data using the distributions $P_{MB}(p^i_{n,N^i-n}, \mu(p^i), \mu(q^i))$ and $P_{BE}(p^i_{n,N^i-n}, \mu(p^i), \mu(q^i))$, that is, the analogues of Eqs. (7.14) and (7.15) with 11 replaced by $N^i$, by choosing the value of $\mu(p^i)$ that maximizes the R-squared value of the fit¹⁵.

Next, we used the ‘Bayesian Information Criterion’ (BIC) [KR95] to estimate which model provides the best fit. Table 7.2 summarizes the statistical analysis. The first column of this table identifies the collection of concepts, the second and third columns show the value of the probability parameter $\mu(p^i)$ and the $R^2$ value of the best MB statistical fit, and the fourth and fifth columns show the value of the probability parameter $\mu(p^i)$ and the $R^2$ value of the best BE statistical fit. The sixth column shows the $\Delta_{BIC}$ criterion to discern between $P_{MB}(p^i_{n,N^i-n}, \mu(p^i), \mu(q^i))$ and $P_{BE}(p^i_{n,N^i-n}, \mu(p^i), \mu(q^i))$, and the seventh column identifies the distribution that best represents the data for concept $A^i$, $i = 1, \ldots, 14$.
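The fitting and model-comparison procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the thesis: the generalization of Eqs. (7.14) and (7.15) from 11 to an arbitrary total, the grid search over $\mu(p)$, and the least-squares form of the BIC, $m \ln(\mathrm{RSS}/m) + \ln m$ for $m$ data points and one free parameter, are our choices, and all function names are hypothetical.

```python
from math import comb, log


def p_mb(n, total, mu_p):
    """Maxwell-Boltzmann (binomial) probability, cf. Eq. (7.14)."""
    return comb(total, n) * mu_p ** n * (1 - mu_p) ** (total - n)


def p_be(n, total, mu_p):
    """Bose-Einstein probability, linear in n, cf. Eq. (7.15).

    For total = 11 the normalizer (total + 1) * total / 2 equals 66.
    """
    return (n * mu_p + (total - n) * (1 - mu_p)) / ((total + 1) * total / 2)


def rss(observed, model, mu_p):
    """Residual sum of squares between observed frequencies and a model."""
    total = len(observed) - 1
    return sum((observed[n] - model(n, total, mu_p)) ** 2
               for n in range(total + 1))


def best_mu(observed, model, steps=1000):
    """Grid search for mu(p); minimizing RSS is the same as maximizing R^2."""
    return min((i / steps for i in range(steps + 1)),
               key=lambda mu_p: rss(observed, model, mu_p))


def r_squared(observed, model, mu_p):
    """Coefficient of determination; it can be negative for a poor fit."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1 - rss(observed, model, mu_p) / ss_tot


def delta_bic(observed):
    """BIC(MB) - BIC(BE): positive values favor BE, negative values favor MB."""
    m = len(observed)

    def bic(model):
        # Guard against log(0) when a model fits the data exactly.
        residual = max(rss(observed, model, best_mu(observed, model)), 1e-300)
        return m * log(residual / m) + log(m)  # one free parameter

    return bic(p_mb) - bic(p_be)
```

Here `observed` is the list of relative frequencies of the states with $n = 0, \ldots, N^i$ exemplars in state $p^i$. The sign convention (BIC of MB minus BIC of BE) is chosen only so that the result can be read in the same way as the $\Delta_{BIC}$ column of Table 7.2.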
Negative $\Delta_{BIC}$ values imply that the concept is best fitted by a MB distribution, whereas positive $\Delta_{BIC}$ values imply that the concept $A^i$ is best fitted by a BE distribution. However, these statements are weak for $|\Delta_{BIC}| < 2$, moderate for $2 < |\Delta_{BIC}| < 8$, and strong for $8 < |\Delta_{BIC}|$ [KR95].

¹⁵ Only one of the two values is sufficient since $\mu(p^i) = 1 - \mu(q^i)$.

Table 7.1: List of concepts and their respective exemplars for the psychological experiment on indistinguishability.

 i    N^i   A^i                          p^i            q^i
 1    11    ‘Animals’                    ‘cat’          ‘dog’
 2    9     ‘Humans’                     ‘man’          ‘woman’
 3    8     ‘Expressions of Emotion’     ‘laugh’        ‘cry’
 4    7     ‘Expressions of Affection’   ‘kiss’         ‘hug’
 5    11    ‘Moods’                      ‘happy’        ‘sad’
 6    8     ‘Parts of Face’              ‘nose’         ‘chin’
 7    9     ‘Movements’                  ‘step’         ‘run’
 8    11    ‘Animals’                    ‘whale’        ‘condor’
 9    9     ‘Humans’                     ‘child’        ‘elder’
 10   8     ‘Expressions of Emotion’     ‘sigh’         ‘moan’
 11   7     ‘Expressions of Affection’   ‘caress’       ‘present’
 12   11    ‘Moods’                      ‘thoughtful’   ‘bored’
 13   8     ‘Parts of Face’              ‘eye’          ‘cheek’
 14   9     ‘Movements’                  ‘jump’         ‘crawl’

We see that concepts 2 and 9 show a strong $\Delta_{BIC}$ value towards MB statistics, and that concepts 1, 3, 5, 7, 11, 12, and 14 show a strong $\Delta_{BIC}$ value towards BE statistics. Complementary to the BIC criterion, the $R^2$ value indicates whether the tendency given by $\Delta_{BIC}$ is confirmed by a good fit of the data. The concepts that have a strong indication towards one type of statistics and an $R^2$ value larger than 0.78 have their $R^2$ value in bold text. These cases are confirmed by both statistical indicators. Moreover, in all the cases with a strong tendency towards one type of statistics, the $R^2$ value of the other type of statistics is poor. Interestingly, we can observe that the concepts that exhibit MB behavior are associated with physical entities, and that all the concepts associated with non-physical entities exhibit BE behavior.

We conclude that collections of concepts can behave statistically like quantum entities.

Table 7.2: Results of statistical fit for the psychological experiment. Each row refers to one of the 14 collections of concepts introduced in Table 7.1.

 i    µ(p^i) MB   R^2 MB   µ(p^i) BE   R^2 BE   ∆BIC     Best Model
 1    0.55        -0.05    0.16        0.78     19.31    BE strong
 2    0.57        0.78     0.42        0.44     -9.54    MB strong
 3    0.82        0.29     0.96        0.79     10.81    BE strong
 4    0.71        0.81     0.53        0.77     -1.69    MB weak
 5    0.25        0.79     0.39        0.93     14.27    BE strong
 6    0.62        0.59     0.61        0.57     -0.37    MB weak
 7    0.72        0.41     0.64        0.83     12.66    BE strong
 8    0.63        0.58     0.47        0.73     5.53     BE positive
 9    0.45        0.87     0.26        0.67     -9.69    MB strong
 10   0.59        0.50     0.63        0.77     7.17     BE positive
 11   0.86        0.46     1.00        0.87     11.4     BE strong
 12   0.21        0.77     0.00        0.87     6.68     BE strong
 13   0.62        0.54     0.71        0.67     2.97     BE weak
 14   0.81        0.20     0.91        0.90     20.68    BE strong

Indistinguishability in Concepts on the Web

We have adapted the experimental methodology of § 7.2.2 to study the indistinguishability of concepts on the web. We use a search engine to estimate the number of web pages in which different exemplars of a collection of concepts appear. In this way, we use the relative frequency of the exemplars to verify whether the indistinguishability of concepts, identified in our psychological experiments, is also manifested on the web.

Let $N^i \geq 3$ be an integer number, and consider four pairs of states $(p^j, q^j)$, for each $j = 1, \ldots, 4$. Next, for each number $3 \leq k \leq N^i$ and pair $(p^j, q^j)$ of states, we build a set of sentences $r^j_{k,N^i-k}$ that refer to the state $p^j_{k,N^i-k}$. The states and numbers chosen for this experiment are shown in Tables 7.3 and 7.4. For example, the states $p^1$ and $q^1$ correspond to ‘cat’ and ‘dog,’ and the state $p^1_{3,1}$ describing ‘three cats and one dog’ is referred to by the sentences $r^1_{3,1}$ = {‘three cats and one dog’, ‘one dog and three cats’}.

In this experiment, we counted the total number $n^j_{k,N^i-k}$ of web pages where the sentences of $r^j_{k,N^i-k}$ are found using the Bing search API for web developers¹⁶.
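The construction of the sentence sets and of the relative frequencies can be sketched as follows. This is a hypothetical illustration only: the helper names are ours, a single word form per number is used although several references per number were considered, and no search-engine call is shown because the query interface is not described here. Page counts would have to be supplied by the reader.

```python
# One reference per number, for illustration; an assumption made for brevity.
NUMBER_WORDS = {0: "zero", 1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}


def noun_phrase(count, forms):
    """Build e.g. 'three cats' from a count and a (singular, plural) pair."""
    singular, plural = forms
    noun = singular if count == 1 else plural
    return f"{NUMBER_WORDS[count]} {noun}"


def sentence_set(k, total, p_forms, q_forms):
    """Sentences referring to 'k exemplars of p and total - k exemplars of q'."""
    left = noun_phrase(k, p_forms)
    right = noun_phrase(total - k, q_forms)
    return {f"{left} and {right}", f"{right} and {left}"}


def relative_frequencies(page_counts):
    """Normalize page counts per state into an empirical distribution."""
    grand_total = sum(page_counts.values())
    return {state: count / grand_total for state, count in page_counts.items()}


cats_and_dogs = sentence_set(3, 4, ("cat", "cats"), ("dog", "dogs"))
```

With these assumptions, `cats_and_dogs` contains the two sentences of the ‘three cats and one dog’ example. Each sentence set would then be submitted to the search engine to obtain the page count $n^j_{k,N^i-k}$ for the corresponding state.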
Since $n^j_{k,N^i-k}$ estimates the number of references to the state $p^j_{k,N^i-k}$ on the web, we use the relative frequencies of these counts to estimate a distribution $P(p^j_{k,N^i-k}, \mu(p^j), \mu(q^j))$ of the exemplars on the web. Thus, we can study whether these distributions are best described by the MB or the BE distribution, for different values of $N^i$. We have built the distribution $P(p^j_{k,N^i-k}, \mu(p^j), \mu(q^j))$ for $k = 3, \ldots, N^i$, using $3 \leq N^i \leq 15$.

¹⁶ For more information, see […]

Table 7.3: List of singular/plural references to states used to perform the web-based experiment.

 j   p^j               q^j
 1   “cat”/“cats”      “dog”/“dogs”
 2   “man”/“men”       “woman”/“women”
 3   “win”/“wins”      “loss”/“losses”
 4   “son”/“sons”      “daughter”/“daughters”

Because the sample size is small, this study can only be considered preliminary. Moreover, there are certain technical difficulties, well-known in the field of computational semantics [Pyl84], that affect the result. They include the fact that a state can potentially be referred to by an infinite number of linguistic expressions, or be linked to an infinite number of concepts. Also, due to semantic ambiguities, the linguistic expression might in some cases not refer to the state we assume it to refer to. Even though these are strong limitations, we have found interesting evidence for BE statistics in the data.

We summarize our results in Table 7.5. The first column specifies $N^i$, and the other four columns specify the pair of states used in the experiment. Each entry in the table contains a pair of numbers. The first number is the BIC criterion, $\Delta_{BIC}$, and the second number is the $R^2$ value of the best fit.
As before, negative $\Delta_{BIC}$ values imply that the concept is best fitted by a MB distribution, whereas positive $\Delta_{BIC}$ values imply that the concept is best fitted by a BE distribution. These statements are weak for $|\Delta_{BIC}| < 2$, moderate for $2 < |\Delta_{BIC}| < 8$, and strong for $8 < |\Delta_{BIC}|$ [KR95]. If $R^2 < 0.65$, we omit the value to emphasize that we do not have a significant fit.

Table 7.4: List of references to numbers used to perform the web-based experiment.

 Number   List of references
 0        “0”, “no”, “zero”
 1        “1”, “a”, “one”
 2        “2”, “two”, “a couple of”
 3        “3”, “three”
 4        “4”, “four”
 5        “5”, “five”
 6        “6”, “six”
 7        “7”, “seven”
 8        “8”, “eight”
 9        “9”, “nine”
 10       “10”, “ten”
 11       “11”, “eleven”
 12       “12”, “twelve”
 13       “13”, “thirteen”
 14       “14”, “fourteen”
 15       “15”, “fifteen”
 16       “16”, “sixteen”

We can identify three trends:

(i) when $3 \leq N^i \leq 8$, the majority of pairs of states exhibit MB statistics,
(ii) when $9 \leq N^i \leq 15$, the majority of pairs of states exhibit BE statistics, and
(iii) for $N^i \in \{11, 13, 14, 15\}$, at least two pairs of states show a poor $R^2$ fit.

These results indicate that, for $N^i \leq 8$, the concepts in the combination behave like distinguishable entities, while for $9 \leq N^i$, they become indistinguishable. This suggests that, when numbers are large enough, humans tend to treat collections of concepts as indistinguishable entities. This is consistent with the fact that we cannot generally remember, repeat, or compare collections of more than seven or eight distinguishable entities [CMC07]. However, by not trying to distinguish the entities when elicited in large collections, we make use of language to properly communicate large collections of concepts and reason about them. The third trend shows that for some numbers above ten, the data does not fit BE or MB distributions. This is probably because the data is sparse, which leads to strongly irregular distributions.

Table 7.5: Results of statistical fit of web-based experiment. The numbers in bold correspond to the cases where the BE-distribution provides a best fit according to the $\Delta_{BIC}$ and $R^2$ criteria. Each entry is the pair ($\Delta_{BIC}$, $R^2$); an omitted $R^2$ value is shown as −.

 N^i   j = 1          j = 2          j = 3          j = 4
 3     −3.9, 0.79     −4.4, 0.82     −1.5, −        −3.9, 0.80
 4     −8.9, 0.92     −7.4, 0.91     −4.2, 0.80     −9.2, 0.93
 5     −10.5, 0.94    −3.83, 0.81    −8.20, 0.90    −14.7, 0.97
 6     −5.0, 0.84     −3.6, 0.82     −2.6, 0.80     −15.5, 0.96
 7     −2.0, 0.77     3.1, 0.72      −1.5, 0.75     −4.7, 0.85
 8     2.0, 0.72      −0.1, 0.74     −0.8, 0.77     −1.5, 0.79
 9     5.5, 0.69      7.3, 0.76      6.4, 0.78      −7.4, 0.87
 10    9.0, 0.70      0.5, −         10.5, 0.77     −11.4, 0.89
 11    2.4, −         10.0, −        9.3, 0.73      −5.2, 0.80
 12    10.4, 0.70     7.0, 0.72      11.1, 0.72     −6.4, 0.79
 13    6.6, −         11.1, −        12.7, 0.76     9.4, −
 14    13.6, −        17.3, 0.71     10.8, 0.76     −8.9, −
 15    9.1, −         23.0, 0.79     2.3, −         −17.6, 0.8

We conclude that the statistical behavior of collections of concepts can resemble the statistical behavior of quantum particles in both psychological and NLP experimental settings.
Moreover, since quantum structures can be observed in different NLP settings, this suggests that quantum cognition tools should be applied in the context of NLP.

Chapter 8

Conclusion

8.1 General Conclusion

In this thesis, we performed a systematic review of the quantum-cognitive approach to concepts (part I), proposed a framework that enhances the range of applications of concept combination models (part II), and presented evidence of quantum conceptual structures in the context of natural language processing (part III).

We have elucidated the mathematical structure of the two-sector Fock space model for concept combinations based on either conjunctions or disjunctions, studied how the dimension of the space $H = \mathbb{C}^n$ influences the modeling power on each sector of the model, and concluded that $H = \mathbb{C}^3$ is sufficient for maximal modeling power in both sectors.

Next, we introduced unitary transformations to represent concept combinations for multiple exemplars for each sector separately, and then combined these representations to obtain a representation in the two-sector Fock space $\mathbb{C}^3 \oplus (\mathbb{C}^3 \otimes \mathbb{C}^3)$. This representation is consistent with the cognitive principles of quantum modeling; it also maximizes the modeling power and permits the representation of multiple exemplars simultaneously. Our data analysis shows that, when the first sector is approximately 80% dominant with respect to the second sector, the two-sector Fock space model provides the optimal performance.

We later studied concept combinations built from conjunctions and negations. We first identified the conditions that characterize classical data, and found that these conditions are regularly violated by the experimental data. Moreover, we performed a statistical analysis to characterize this violation, and found that the violation is precisely characterized by a constant value. Next, we extended the representations developed for the two-sector Fock space model for conjunctions to the case of conjunctions and negations. Our data analysis indicates that, when the first sector is approximately 80% dominant with respect to the second sector, the pattern we identified for the violation of classical conditions is reproduced, and moreover, the two-sector Fock space model provides the optimal performance.

The conclusion for the second part of the thesis is that the two-sector Fock space model is not only a powerful tool to represent conceptual combinations, but it also provides a sensible explanation for the fact that humans do not reason logically. In particular, our results indicate that the emergent mode of reasoning modeled by the first sector is 80% dominant with respect to its logical counterpart represented in the second sector.

In the third part of the thesis, we considered the application of quantum cognition in the context of natural language processing. We presented two studies identifying quantum structures in natural language phenomena. In the first, we developed a methodology to identify sets of words that statistically behave as quantum entangled particles in a corpus of text, and showed that in many cases sets of words can behave as entangled entities. In the second study, we demonstrated that references to exemplars of collections of concepts statistically behave as indistinguishable (quantum) entities, using data from psychological and web-based studies. Moreover, we found in this study that there is a tendency for non-physical concepts to follow the statistics of indistinguishable particles, while for physical concepts the tendency is to follow the statistics of classical particles.

The conclusion of this thesis is that quantum cognition provides a suitable framework for a theory of concepts that can be applied to model cognitive phenomena.
In particular, the possibility of modeling non-classical processes by means of superposition, entanglement, and indistinguishability is a fundamental feature that deserves further exploration.

8.2 Future Work

Here, we propose three ideas for future work inspired by our results in Chapters 5, 6, and 7 respectively:

8.2.1 Incompatible Exemplars

In the concrete representations of concepts and their combinations introduced in Chapter 5, all measurements are expressed in the same basis. This enables us to investigate the relation between the exemplars by analyzing the structure of their measurement operators. For example, given two measurements $M_1$ and $M_2$, we have that $M_1$ and $M_2$ represent compatible observables if and only if the commutator operator vanishes:

\[ [M_1, M_2] = M_1 M_2 - M_2 M_1 = 0. \tag{8.1} \]

Otherwise, the operators represent incompatible observables.

The existence of incompatible measurements is one of the most prominent examples of how quantum mechanics differs from the classical world. In particular, the famous Heisenberg uncertainty principle is implied by the existence of incompatible measurements [Hei27]. Therefore, an important question in quantum cognition is to elucidate whether semantic estimations can be incompatible. Indeed, the existence of incompatible measurements would imply that the application of consecutive semantic estimations could create uncontrollable disturbances.

We have performed a preliminary calculation showing that membership operators in the representations derived in Chapter 5 are, in some cases, incompatible. Consider the concepts $A$ = ‘Machine,’ $B$ = ‘Vehicle,’ and the exemplars $p_5$ = ‘sailboat’ and $p_{12}$ = ‘skateboard.’ For the case of conceptual conjunction we have

\[ \begin{aligned} \mu_5(A) &= 0.56, & \mu_5(B) &= 0.8, & \mu_5(AB) &= 0.42, \\ \mu_{12}(A) &= 0.28, & \mu_{12}(B) &= 0.84, & \mu_{12}(AB) &= 0.34. \end{aligned} \tag{8.2} \]

Note that exemplars $p_5$ and $p_{12}$ satisfy the conditions of Theorem 5.3. Thus, we obtain a concrete representation $\{|A\rangle, |B\rangle, \{M_5, M_{12}\}\}$ of these exemplars in $\mathbb{C}^3$.
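The compatibility test of Eq. (8.1) is easy to carry out numerically. The sketch below is a generic illustration in plain Python: the two projectors and the state are hypothetical examples in $\mathbb{C}^3$ chosen by us, not the operators $M_5$ and $M_{12}$ derived in Chapter 5, and the function names are ours. For Hermitian operators the commutator is anti-Hermitian, so its expectation value in any state is purely imaginary.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(a, b):
    """[a, b] = ab - ba, as in Eq. (8.1)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]


def projector(vector):
    """Orthogonal projector |v><v| onto a normalized vector v."""
    return [[vi * complex(vj).conjugate() for vj in vector] for vi in vector]


def expectation(state, operator):
    """Expectation value <state|operator|state>."""
    size = len(state)
    return sum(complex(state[i]).conjugate() * operator[i][j] * state[j]
               for i in range(size) for j in range(size))


def are_compatible(a, b, tol=1e-12):
    """True when every entry of the commutator vanishes up to tol."""
    return all(abs(entry) <= tol for row in commutator(a, b) for entry in row)


S = 2 ** -0.5
M_FIRST = projector([1, 0, 0])    # hypothetical measurement operator
M_SECOND = projector([S, S, 0])   # hypothetical, not orthogonal to the first
STATE = [S, S * 1j, 0]            # hypothetical normalized state

value = expectation(STATE, commutator(M_FIRST, M_SECOND))
```

For this toy example `are_compatible(M_FIRST, M_SECOND)` is false and `value` has zero real part, as expected for an anti-Hermitian commutator. The same check can be applied to the membership operators $M_5$ and $M_{12}$ of the concrete representation in $\mathbb{C}^3$.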
In this representation, we have that

\[ \langle A|[M_5, M_{12}]|A\rangle = 0.084i, \qquad \langle B|[M_5, M_{12}]|B\rangle = 0.097i. \tag{8.3} \]

Therefore, exemplars $p_5$ and $p_{12}$ are incompatible with respect to the states $|A\rangle$ and $|B\rangle$.

Since the data we analyzed in Chapter 5 was collected presenting the exemplars in only one specific order [Ham88b], these computations suggest that it may be possible to predict order effects in membership measurements for exemplars that are incompatible. This result is, however, speculative since there is no experimental data where membership weight estimations have been done presenting exemplars in different orders.

One area of research would be to generate experimental data to check whether the predictions are accurate. If order effects are predictable, then the canonical representation proposed in this thesis could be used to develop Heisenberg-like uncertainty relations in the context of concept combination models.

8.2.2 Modeling Concept Combinations for Real-World Applications

The models for concepts and concept combinations presented in this thesis are not general enough for real-world applications. One of the reasons for this is that we do not have a single model for the conjunction, disjunction, and negation of concepts. In fact, these three connectives are required to build the simplest concept combination structure used in computational applications [RN95]. Therefore, it is important to extend the models presented in this thesis to incorporate these three connectives together. We could achieve a model for this connective structure by incorporating disjunctions into the model of Chapter 6. For the case of two concepts, this involves representations for states and measurements for the case of disjunction of concepts, and of disjunctions and negations.

Real-world applications generally require combinations of more than two concepts.
In logic and computer science, the study of conjunctions, disjunctions, and negations of three or more concepts, usually called propositions, is known as the satisfiability problem [AN96]. Solving this problem is necessary to determine whether or not there is a possible instantiation of a concept combination whose truth value is positive¹⁷.

It would be interesting to study the satisfiability problem from the point of view of quantum cognition. That is, we could test the classical logical satisfiability conditions using psychological experiments and, in those cases where deviations from classical and fuzzy theoretical rules are found, develop a quantum model to handle these deviations.

¹⁷ This means ‘true’ in the context of propositional logic, and a value above a certain threshold in fuzzy logic.

Because real-world applications usually involve a large number of concepts and a large number of exemplars for each concept, it is necessary to test our models against larger datasets. When we consider a large number of exemplars and concepts, it is very likely that some concepts have exemplars in common. This imposes a new type of constraint that has not been studied. The constraints for the representation of exemplars shared across multiple concepts will probably require representations in spaces of larger dimension.

8.2.3 Indistinguishability and Modes of Reasoning

The identification of concepts with the Maxwell-Boltzmann (MB) and Bose-Einstein (BE) statistics in § 7.2.2 assumes that a collection of concepts can be elicited as an exemplar representing a collection of entities existing in space and time, or a collection of indistinguishable entities of abstract nature.
In the former case, the entities corresponding to the exemplar are distinguishable and behave according to the MB statistics, while in the latter case the entities are indistinguishable and behave according to the BE statistics.

In cognitive science, it is well-known that natural categories, usually referred to by nouns in language, can be represented in a ‘hierarchy’ according to their ‘level of abstraction’ [Ros99]. For example, the concepts ‘Puppy,’ ‘Dog,’ and ‘Mammal’ are ordered from lower to higher level of abstraction. This notion of abstraction is useful to explain that a concept at a lower level of abstraction can be an exemplar of a concept at a higher level. In fact, our analysis in § 7.2.2 reveals that concepts at a lower level of abstraction tend to behave as distinguishable entities, while concepts at a higher level of abstraction tend to behave as indistinguishable entities.

A possible extension to this thesis is to consider the notion of abstraction as a mode of reasoning rather than a property of a concept. In fact, although our results indicate that the more concrete the category is, the more distinguishable our reasoning about the category is, we can always consciously induce in ourselves a mode of reasoning that contradicts this tendency. Therefore, it would be interesting to develop experimental settings where the kind of reasoning applied to elicit concepts is controlled. Note that creating such a methodology would allow us to compare concepts at different levels of abstraction for a fixed kind of reasoning and, therefore, would generalize the results of § 7.2.2. Moreover, we could investigate different kinds of reasoning ranging from personal experiences that are constrained by our perceptual limitations and the structure of reality, to hypothetical worlds created by pure imagination where ‘the impossible’ can occur.

Bibliography

[AA97] Diederik Aerts and Sven Aerts.
Applications of quantum statistics in psychological studies of decision processes. In Topics in the Foundation of Statistics, pages 85–97. Springer, 1997.
[AABG00] Diederik Aerts, Sven Aerts, Jan Broekaert, and Liane Gabora. The violation of Bell inequalities in the macroworld. Foundations of Physics, 30(9):1387–1414, 2000.
[ABGV12] Diederik Aerts, Jan Broekaert, Liane Gabora, and Tomas Veloz. The guppy effect as interference. In Quantum Interaction, pages 36–47. Springer, 2012.
[ADR82] Alain Aspect, Jean Dalibard, and Gérard Roger. Experimental test of Bell’s inequalities using time-varying analyzers. Physical Review Letters, 49(25):1804–1807, 1982.
[ADS11] Diederik Aerts, Bart D’Hooghe, and Sandro Sozzo. A quantum cognition analysis of the Ellsberg paradox. In Quantum Interaction, pages 95–104. Springer, 2011.
[Aer96] Sven Aerts. Conditional probabilities with a quantal and a Kolmogorovian limit. International Journal of Theoretical Physics, 35(11):2245–2261, 1996.
[Aer98] Diederik Aerts. The hidden measurement formalism: what can be explained and where quantum paradoxes remain. International Journal of Theoretical Physics, 37(1):291–304, 1998.
[Aer02] D. Aerts. “Being and change: foundations of a realistic operational formalism”, in Probing the Structure of Quantum Mechanics: Nonlinearity, Nonlocality, Computation and Axiomatics. eds. D. Aerts, M. Czachor and T. Durt, World Scientific, Singapore, 2002.
[Aer07a] D. Aerts. General quantum modeling of combining concepts: A quantum field model in Fock space, 2007. Submitted for publication.
[Aer07b] D. Aerts. Quantum interference and superposition in cognition: A theory for the disjunction of concepts. 2007.
[Aer09] D. Aerts. Quantum structure in cognition. Journal of Mathematical Psychology, 53:314–348, 2009.
[AF82] L Accardi and A Fedullo.
On the statistical meaning of complex numbers in quantum mechanics. Lettere al Nuovo Cimento, 34(7):161–172, 1982.
[AG05a] D. Aerts and L. Gabora. A state-context-property model of concepts and their combinations I: The structure of the sets of contexts and properties. Kybernetes, 34(1/2):151–175, 2005.
[AG05b] D. Aerts and L. Gabora. A state-context-property model of concepts and their combinations II: A Hilbert space representation. Kybernetes, 34(1/2):176–204, 2005.
[Agn91] Franca Agnoli. Development of judgmental heuristics and logical reasoning: Training counteracts the representativeness heuristic. Cognitive Development, 6(2):195–217, 1991.
[AGS13] Diederik Aerts, Liane Gabora, and Sandro Sozzo. Concepts and their dynamics: A quantum-theoretic modeling of human thought. Topics in Cognitive Science, 5(4):737–772, 2013.
[Ama93] Anton Amann. The gestalt problem in quantum theory: generation of molecular shape by the environment. Synthese, 97(1):125–156, 1993.
[AN96] A. Arnauld and P. Nicole. Logic, or the Art of Thinking. Cambridge University Press, Cambridge, England, 1996. First published in 1662. Translated and edited by J.V. Buroker.
[AP03] Jean-Julien Aucouturier and Francois Pachet. Representing musical genre: A state of the art. Journal of New Music Research, 32(1):83–93, 2003.
[AP11] Sam Alxatib and Jeff Pelletier. On the psychology of truth-gaps. In Vagueness in Communication, pages 13–36. Springer, 2011.
[Aré03] G. Arévalo. Understanding behavioral dependencies in class hierarchies using concept analysis. L’OBJET, 9(1-2):47–59, 2003.
[AS11a] Diederik Aerts and Sandro Sozzo. A contextual risk model for the Ellsberg paradox. arXiv preprint arXiv:1105.1814, 2011.
[AS11b] Diederik Aerts and Sandro Sozzo. Quantum structure in cognition: Why and how concepts are entangled.
Quantum Interaction, pages 116–127, 2011.
[AS11c] Diederik Aerts and Sandro Sozzo. Quantum structure in cognition: Why and how concepts are entangled. In Quantum Interaction, pages 116–127. Springer, 2011.
[AS14] Diederik Aerts and Sandro Sozzo. Quantum entanglement in concept combinations. International Journal of Theoretical Physics, 53(10):3587–3603, 2014.
[AST12] Diederik Aerts, Sandro Sozzo, and Jocelyn Tapia. A quantum model for the Ellsberg and Machina paradoxes. In Quantum Interaction, pages 48–59. Springer, 2012.
[ASV14a] Diederik Aerts, Sandro Sozzo, and Tomas Veloz. The quantum nature of identity in human thought: Bose-Einstein statistics for conceptual indistinguishability. arXiv preprint arXiv:1410.6854, 2014.
[ASV14b] Diederik Aerts, Sandro Sozzo, and Tomas Veloz. Quantum structure of negation and conjunction in human thought. Journal of Mathematical Psychology, (submitted), 2014.
[B+64] John S Bell et al. On the Einstein-Podolsky-Rosen paradox. Physics, 1(3):195–200, 1964.
[Bal04] Philippe Balbiani. Reasoning about vague concepts in the theory of property systems. Logique & Analyse, 47:445–460, 2004.
[Bar13] Marco Baroni. Composition in distributional semantics. Language and Linguistics Compass, 7(10):511–522, 2013.
[BB05] Mary Bazire and Patrick Brézillon. Understanding context before using it. In Modeling and Using Context, pages 29–40. Springer, 2005.
[BB12] Jerome R Busemeyer and Peter D Bruza. Quantum models of cognition and decision. Cambridge University Press, 2012.
[BBG13] Peter Bruza, Jerome Busemeyer, and Liane Gabora. Introduction to the special issue on quantum cognition. arXiv preprint arXiv:1309.5673, 2013.
[Bea64] L. R. Beach. Recognition, assimilation, and identification of objects. Psychological Monographs, 78(5-6):21–37, 1964.
[BFL98] Collin F Baker, Charles J Fillmore, and John B Lowe. The Berkeley FrameNet project. In Proceedings of the 17th International Conference on Computational Linguistics, Volume 1, pages 86–90. Association for Computational Linguistics, 1998.
[BHAT05] Eugen Barbu, Pierre Heroux, Sebastien Adam, and Eric Trupin. Clustering document images using a bag of symbols representation. In Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on, pages 1216–1220. IEEE, 2005.
[BHN93] Maya Bar-Hillel and Efrat Neter. How alike is it versus how likely is it: A disjunction fallacy in probability judgments. Journal of Personality and Social Psychology, 65(6):1119, 1993.
[BK99] Robert F Bordley and Joseph B Kadane. Experiment-dependent priors in psychology and physics. Theory and Decision, 47(3):213–227, 1999.
[BK11] R. Belohlavek and G. Klir, editors. Concepts and Fuzzy Logic. MIT Press, 2011.
[BKL13] William Blacoe, Elham Kashefi, and Mirella Lapata. A quantum-theoretic approach to distributional semantics. In HLT-NAACL, pages 847–857, 2013.
[BKNM09] Peter Bruza, Kirsty Kitto, Douglas Nelson, and Cathy McEvoy. Is there something quantum-like about the human mental lexicon? Journal of Mathematical Psychology, 53(5):362–377, 2009.
[BKR+12] P. Bruza, K. Kitto, L. Ramm, L. Sitbon, D. Song, and S. Blomberg. Quantum-like non-separability of concept combinations, emergent associates and abduction. Logic Journal of the IGPL, 20(2):445–457, 2012.
[BKRS13] Peter D Bruza, Kirsty Kitto, Brentyn J Ramm, and Laurianne Sitbon. A probabilistic framework for analysing the compositionality of conceptual combinations. arXiv preprint arXiv:1305.5753, 2013.
[BL06] David M Blei and John D Lafferty. Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, pages 113–120. ACM, 2006.
[BLP11] Aurélien Baillon, Olivier L’Haridon, and Laetitia Placido. Ambiguity models and the Machina paradoxes. The American Economic Review, pages 1547–1560, 2011.
[Boe97] G. Storms P. De Boeck. Formal models for intra-categorical structure that can be used for data analysis. The MIT Press, 1997.
[Boh63] Niels Bohr. Essays 1958-1962 on atomic physics and human knowledge. 1963.
[Boo54] George Boole. An Investigation of the Laws of Thought on which are Founded the Mathematical Theories of Logic and Probabilities. Walton and Maberly, 1854.
[Bor10] Robert F Bordley. Non-expected utility theories. Wiley Encyclopedia of Operations Research and Management Science, 2010.
[BOVW99] Nicolao Bonini, Daniel Osherson, Riccardo Viale, and Timothy Williamson. On the psychology of vague predicates. Mind & Language, 14(4):377–393, 1999.
[BP03] Luisa Bentivogli and Emanuele Pianta. Beyond lexical units: Enriching wordnets with phrasets. In Proceedings of the Tenth Conference on European Chapter of the Association for Computational Linguistics, Volume 2, pages 67–70. Association for Computational Linguistics, 2003.
[BPB13] Reinhard Blutner, Emmanuel M Pothos, and Peter Bruza. A quantum probability perspective on borderline vagueness. Topics in Cognitive Science, 5(4):711–736, 2013.
[BPFT11] Jerome R Busemeyer, Emmanuel M Pothos, Riccardo Franco, and Jennifer S Trueblood. A quantum theoretical explanation for probability judgment errors. Psychological Review, 118(2):193, 2011.
[BTO04] Nicolao Bonini, Katya Tentori, and Daniel Osherson. A different conjunction fallacy. Mind & Language, 19(2):199–210, 2004.
[BVN75] Garrett Birkhoff and John Von Neumann. The logic of quantum mechanics. Springer, 1975.
[BW07] J. Busemeyer and Z. Wang.
Quantum information processing explanation for interactions between inferences and decisions. In AAAI Spring Symposium: Quantum Interaction, pages 91–97. AAAI, 2007.
[BYBM11] Ricardo Baeza-Yates, Andrei Broder, and Yoelle Maarek. The new frontier of web search technology: seven challenges. Search Computing, pages 3–9, 2011.
[BZL10] Fan Bu, Xiaoyan Zhu, and Ming Li. Measuring the non-compositionality of multiword expressions. In Proceedings of the 23rd International Conference on Computational Linguistics, pages 116–124. Association for Computational Linguistics, 2010.
[Cha97] Alexander Chagrov. Modal logic. 1997.
[Cho02] Noam Chomsky. Syntactic structures. Walter de Gruyter, 2002.
[CK00] Fintan J Costello and Mark T Keane. Efficient creativity: Constraint-guided conceptual combination. Cognitive Science, 24(2):299–349, 2000.
[CK01] Fintan J Costello and Mark T Keane. Testing two theories of conceptual combination: Alignment versus diagnosticity in the comprehension and production of combined concepts. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(1):255, 2001.
[CM72] Irving M Copi and Richard W Miller. Introduction to Logic: Study Guide. Macmillan, 1972.
[CM84] Benjamin Cohen and Gregory L Murphy. Models of concepts. Cognitive Science, 8(1):27–58, 1984.
[CMC07] Nelson Cowan, C Morey, and Zhijian Chen. The legend of the magical number seven. Tall tales about the brain: Things we think we know about the mind, but ain’t so, ed. S. Della Sala, pages 45–59, 2007.
[Cow98] Anthony Paul Cowie. Phraseology: Theory, Analysis, and Applications. Oxford University Press, 1998.
[Daw13] Michael RW Dawson. Mind, body, world: Foundations of cognitive science, volume 1.
Athabasca University Press, 2013. → pages 1, 4, 5
[DCGL+10] Maria Luisa Dalla Chiara, Roberto Giuntini, Antonio Ledda, Roberto Leporini, and Giuseppe Sergioli. Entanglement as a semantic resource. Foundations of Physics, 40(9-10):1494–1518, 2010. → pages 46
[DH91] Don E Dulany and Denis J Hilton. Conversational implicature, conscious representation, and the conjunction fallacy. Social Cognition, 9(1):85–110, 1991. → pages 14
[Dir39] Paul Adrien Maurice Dirac. A new notation for quantum mechanics. In Mathematical Proceedings of the Cambridge Philosophical Society, volume 35, pages 416–418. Cambridge Univ Press, 1939. → pages 33
[DK14] Ehtibar N Dzhafarov and Janne V Kujala. On selective influences, marginal selectivity, and bell/chsh inequalities. Topics in cognitive science, 6(1):121–128, 2014. → pages 46, 122
[EF09] Jonathan St BT Evans and Keith Ed Frankish. In two minds: Dual processes and beyond. Oxford University Press, 2009. → pages 37
[Ell61] Daniel Ellsberg. Risk, ambiguity, and the savage axioms. The Quarterly Journal of Economics, pages 643–669, 1961. → pages 16
[EPR35] Albert Einstein, Boris Podolsky, and Nathan Rosen. Can quantum-mechanical description of physical reality be considered complete? Physical review, 47(10):777, 1935. → pages 25, 45
[Fel98] Christiane Fellbaum. WordNet. Wiley Online Library, 1998. → pages 119
[FL96] Jerry Fodor and Ernest Lepore. The red herring and the pet fish: why concepts still can’t be prototypes. Cognition, 58(2):253–270, 1996. → pages 21
[Fod98] J. Fodor. Concepts: Where Cognitive Science Went Wrong. Oxford University Press, Oxford, 1998. → pages 1, 10, 18
[FP88] Jerry A Fodor and Zenon W Pylyshyn. Connectionism and cognitive architecture: A critical analysis. Cognition, 28(1):3–71, 1988. → pages 18
[Fra09] Riccardo Franco. The conjunction fallacy and interference effects. Journal of Mathematical Psychology, 53(5):415–422, 2009. → pages 1, 37, 38, 39
[FT95] Craig R Fox and Amos Tversky.
Ambiguity aversion and comparative ignorance. The quarterly journal of economics, pages 585–603, 1995. → pages 16
[GA02] L. Gabora and D. Aerts. Contextualizing concepts using a mathematical generalization of the quantum formalism. JETAI: Journal of Experimental & Theoretical Artificial Intelligence, 14, 2002. → pages 7, 10
[GA09] L. Gabora and D. Aerts. A model of the emergence and evolution of integrated worldviews. Journal of Mathematical Psychology, 53:434–451, 2009. → pages 9
[Gag00] Christina L Gagné. Relation-based combinations versus property-based combinations: A test of the carin theory and the dual-process theory of conceptual combination. Journal of Memory and Language, 42(3):365–389, 2000. → pages 49
[Gar90] Jay L Garfield. Foundations of cognitive science: The essential readings. 1990. → pages 1, 5
[Gär00] Peter Gärdenfors. Conceptual Spaces: The Geometry of Thought. The MIT Press, Cambridge, Massachusetts, 2000. → pages 6
[Gig96] Gerd Gigerenzer. On narrow norms and vague heuristics: a reply to kahneman and tversky. 1996. → pages 13
[Gil87] Itzhak Gilboa. Expected utility with purely subjective non-additive probabilities. Journal of mathematical Economics, 16(1):65–88, 1987. → pages 17
[Gol94] Robert L Goldstone. The role of similarity in categorization: Providing a groundwork. Cognition, 52(2):125–157, 1994. → pages 9
[Gra90] Richard E Grandy. Understanding and the principle of compositionality. Philosophical Perspectives, pages 557–572, 1990. → pages 18, 120
[GS89] Itzhak Gilboa and David Schmeidler. Maxmin expected utility with non-unique prior. Journal of mathematical economics, 18(2):141–153, 1989. → pages 17
[GSW05] Bernhard Ganter, Gerd Stumme, and Rudolf Wille, editors. Formal Concept Analysis, Foundations and Applications, volume 3626 of Lecture Notes in Computer Science. Springer, 2005. → pages 164, 165
[Ham88a] James Hampton. Disjunction of natural concepts. Memory & Cognition, 16(6):579–591, 1988.
→ pages 1, 18, 21, 41, 49, 68, 70
[Ham88b] James A Hampton. Overextension of conjunctive concepts: Evidence for a unitary model of concept typicality and class inclusion. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(1):12, 1988. → pages 1, 18, 20, 41, 49, 68, 70, 77, 89, 96, 138
[Ham96] James A Hampton. Conjunctions of visually based categories: Overextension and compensation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(2):378, 1996. → pages 21, 41
[Ham97a] James Hampton. Inheritance of attributes in natural concept conjunctions. Memory & Cognition, 15:55–71, 1997. → pages 18, 21
[Ham97b] James A Hampton. Conceptual combination. Knowledge, concepts, and categories, pages 133–159, 1997. → pages 18, 21, 41
[Ham97c] James A Hampton. Emergent attributes in combined concepts. Creative thought: An investigation of conceptual structures and processes, pages 83–110, 1997. → pages 21
[Ham07] James Hampton. Typicality, graded membership, and vagueness. Cognitive Science, 31(3):355–384, 2007. → pages 9, 10, 21, 49
[HE92] Robin M Hogarth and Hillel J Einhorn. Order effects in belief updating: The belief-adjustment model. Cognitive psychology, 24(1):1–55, 1992. → pages 15
[Hei27] Werner Heisenberg. Über den anschaulichen inhalt der quantentheoretischen kinematik und mechanik. Zeitschrift für Physik, 43(3-4):172–198, 1927. → pages 138
[HHHH09] Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki. Quantum entanglement. Reviews of Modern Physics, 81(2):865, 2009. → pages 44
[HK13] Emmanuel Haven and Andrei Khrennikov. Quantum social science. Cambridge University Press, 2013. → pages 33
[HPS71] Paul Gerhard Hoel, Sidney C Port, and Charles J Stone. Introduction to probability theory, volume 12. Houghton Mifflin Boston, 1971. → pages 166
[HS09] Yuexian Hou and Dawei Song. Characterizing pure high-order entanglements in lexical semantic spaces via information geometry.
In Quantum Interaction, pages 237–250. Springer, 2009. → pages 46
[iCS01] Ramon Ferrer i Cancho and Richard V Solé. The small world of human language. Proceedings of the Royal Society of London B: Biological Sciences, 268(1482):2261–2265, 2001. → pages 124
[Imm87] Neil Immerman. Languages that capture complexity classes. SIAM Journal on Computing, 16(4):760–778, 1987. → pages 164
[JMK+00] Dan Jurafsky, James H Martin, Andrew Kehler, Keith Vander Linden, and Nigel Ward. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition, volume 2. MIT Press, 2000. → pages 118
[KB62] Leo P Kadanoff and Gordon A Baym. Quantum statistical mechanics. Benjamin, 1962. → pages 128
[Khr99] A. Khrennikov. Interpretations of Probability. Walter de Gruyter Press, 1999. → pages 167
[Khr10] Andrei Y Khrennikov. Ubiquitous quantum structure: from psychology to finance. Springer, 2010. → pages 1, 28, 33
[Kih87] John F Kihlstrom. The cognitive unconscious. Science, 237(4821):1445–1452, 1987. → pages 37
[KK90] Jon A Krosnick and Donald R Kinder. Altering the foundations of support for the president through priming. The American Political Science Review, pages 497–512, 1990. → pages 14
[KMM05] Peter Klibanoff, Massimo Marinacci, and Sujoy Mukerji. A smooth model of decision making under ambiguity. Econometrica, 73(6):1849–1892, 2005. → pages 17
[KP95] Hans Kamp and Barbara Partee. Prototype theory and compositionality. Cognition, 57(2):121–191, 1995. → pages 18
[KR95] Robert E Kass and Adrian E Raftery. Bayes factors. Journal of the american statistical association, 90(430):773–795, 1995. → pages 130, 131, 134
[KRBS10] Kirsty Kitto, Brentyn Ramm, PD Bruza, and Laurianne Sitbon. Testing for the non-separability of bi-ambiguous compounds. In Quantum Informatics for Cognitive, Social, and Semantic Processes: Papers from the AAAI Fall Symposium, pages 62–69, 2010. → pages 48
[Lak73] G. Lakoff.
Hedges: A study in meaning criteria and the logic of fuzzy concepts. Journal of Philosophical Logic, 2(4):458–508, 1973. → pages 166
[LC98] Claudia Leacock and Martin Chodorow. Combining local context and wordnet similarity for word sense identification. WordNet: An electronic lexical database, 49(2):265–283, 1998. → pages 119
[LGS09] Michael D Lee, Emily Grothe, and Mark Steyvers. Conjunction and disjunction fallacies in prediction markets. In Proceedings of the 31th Annual Conference of the Cognitive Science Society. Lawrence Erlbaum, Mahwah, 2009. → pages 14
[Low96] R. Lowen. Fuzzy Set Theory. Kluwer, Dordrecht, 1996. → pages 166
[Mac09a] E. Machery. Doing without concepts. Oxford University Press, 2009. → pages 6
[Mac09b] Mark J Machina. Risk, ambiguity, and the rank-dependence axioms. The American Economic Review, 99(1):385–392, 2009. → pages 17
[Mac14] Mark J Machina. Ambiguity aversion with three or more outcomes. American Economic Review, 104(12):3814–40, 2014. → pages 17
[Mar12] Georgy A Martynov. Classical Statistical Mechanics, volume 89. Springer Science & Business Media, 2012. → pages 127
[McC] John McCarthy. What is artificial intelligence. → pages 1, 118, 119
[Med89] D. Medin. Concepts and conceptual structure. American Psychologist, 44(12):1469–1481, 1989. → pages 6, 18
[Mei12] Jörg Meibauer. What is a context? What is a Context?: Linguistic Approaches and Challenges, 196:9, 2012. → pages 10
[ML10] Jeff Mitchell and Mirella Lapata. Composition in distributional models of semantics. Cognitive science, 34(8):1388–1429, 2010. → pages 120
[MMR06] Fabio Maccheroni, Massimo Marinacci, and Aldo Rustichini. Dynamic variational preferences. Journal of Economic Theory, 128(1):4–44, 2006. → pages 17
[Mor09] Rodrigo Moro. On the nature of the conjunction fallacy. Synthese, 171(1):1–24, 2009. → pages 13
[MP13] Massimo Melucci and Benjamin Piwowarski. Quantum mechanics and information retrieval: From theory to application.
In Proceedings of the 2013 Conference on the Theory of Information Retrieval, page 1. ACM, 2013. → pages 1
[MS88] Douglas L Medin and Edward J Shoben. Context and structure in conceptual combination. Cognitive Psychology, 20(2):158–190, 1988. → pages 10, 18
[MS99] Christopher D Manning and Hinrich Schütze. Foundations of statistical natural language processing, volume 999. MIT Press, 1999. → pages 119
[MT08] N. Maillot and M. Thonnat. Ontology based complex object recognition. Image and Vision Computing, 26(1):102–113, January 01 2008. → pages 6
[Mur88] Gregory L Murphy. Comprehending complex concepts. Cognitive science, 12(4):529–562, 1988. → pages 49
[Mur03] M Lynne Murphy. Semantic relations and the lexicon: Antonymy, synonymy and other paradigms. Cambridge University Press, 2003. → pages 119
[NA10] Håkan Nilsson and Patric Andersson. Making the seemingly impossible appear possible: Effects of conjunction fallacies in evaluations of bets on football games. Journal of Economic Psychology, 31(2):172–180, 2010. → pages 14
[Nav09] Roberto Navigli. Word sense disambiguation: A survey. ACM Computing Surveys (CSUR), 41(2):10, 2009. → pages 119
[Nei76] Ulric Neisser. Cognition and reality: Principles and implications of cognitive psychology. WH Freeman/Times Books/Henry Holt & Co, 1976. → pages 5
[Nos86] R. Nosofsky. Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115(1):39–57, 1986. → pages 6, 7
[Nos87] R. Nosofsky. Attention and learning processes in the identification and categorization of integral stimuli. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13(1):87–108, jan 1987. → pages 9, 10
[Nos88] R. Nosofsky. Similarity, frequency, and category representations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(1):54–65, jan 1988. → pages 10, 167
[PB13] Emmanuel M Pothos and Jerome R Busemeyer.
Can quantum probability provide a new direction for cognitive modeling? Behavioral and Brain Sciences, 36(03):255–274, 2013. → pages 33
[Pel94] Francis Jeffry Pelletier. The principle of semantic compositionality. Topoi, 13(1):11–24, 1994. → pages 18
[Pel06] Francis Jeffry Pelletier. Representation and inference for natural language: A first course in computational semantics. Computational Linguistics, 32(2):283–286, 2006. → pages 119
[PH14] Jeanne E Parker and Debra L Hollister. The cognitive science basis for context. In Context in Computing, pages 205–219. Springer, 2014. → pages 10
[Pit89] Itamar Pitowsky. Quantum probability, quantum logic. Lecture notes in physics, 321, 1989. → pages 26
[Pit94] Itamar Pitowsky. George boole’s conditions of possible experience and the quantum puzzle. The British Journal for the Philosophy of Science, 45(1):95–125, 1994. → pages 26
[Pyl84] Zenon Walter Pylyshyn. Computation and cognition. Cambridge Univ Press, 1984. → pages 133, 164
[Rao09] Goutham Rao. Probability error in diagnosis: The conjunction fallacy among beginning medical students. Fam Med, 41(4):262–5, 2009. → pages 14
[Rip95] Lance J Rips. The current status of research on concept combination. Mind & Language, 10(1-2):72–104, 1995. → pages 18, 49
[Rip11] David Ripley. Contradictions at the borders. In Vagueness in communication, pages 169–188. Springer, 2011. → pages 19
[Rip13] David Ripley. Sorting out the sorites. Springer, 2013. → pages 19
[RMG+76] E. Rosch, C. B. Mervis, W. Gray, D. Johnson, and P. Boyes-Braem. Basic objects in natural categories. Cognitive Psychology, 8:382–439, 1976. → pages 6, 8, 9, 10, 18
[RN95] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 1995. → pages 5, 118, 139, 163
[Ros73] E. Rosch. Natural Categories. Cognitive Psychology, 4:328–350, 1973. → pages 6, 8
[Ros99] Eleanor Rosch. Principles of categorization. Concepts: core readings, pages 189–206, 1999.
→ pages 9, 10, 140, 166
[RS80] Michael Reed and Barry Simon. Methods of modern mathematical physics: Functional analysis, volume 1. Gulf Professional Publishing, 1980. → pages 34
[RS10] Apara Ranjan and Narayanan Srinivasan. Dissimilarity in creative categorization. The Journal of Creative Behavior, 44(2):71–83, 2010. → pages 123
[RUUU12] Anand Rajaraman, Jeffrey D Ullman, Jeffrey David Ullman, and Jeffrey David Ullman. Mining of massive datasets, volume 77. Cambridge University Press Cambridge, 2012. → pages 124
[Sau11] Uli Sauerland. Vagueness in language: the case against fuzzy logic revisited. Understanding vagueness: Logical, philosophical, and linguistic perspectives, Studies in logic, 36:185–198, 2011. → pages 19
[Sav72] Leonard J Savage. The foundations of statistics. Courier Dover Publications, 1972. → pages 15
[SB74] Seymour Sudman and Norman M Bradburn. Response effects in surveys: A review and synthesis. Number 16. Aldine Chicago, 1974. → pages 15
[SBZ01] R. Sternberg and T. Ben-Zeev. Complex Cognition: The Psychology of Human Thought. Oxford University Press, Oxford, 2001. → pages 6
[SCA03] Ayse Pinar Saygin, Ilyas Cicekli, and Varol Akman. Turing test: 50 years later. In The Turing Test, pages 23–78. Springer, 2003. → pages 118
[Sch92] Erwin Schrödinger. What is life?: With mind and matter and autobiographical sketches. Cambridge University Press, 1992. → pages 28
[SDBVMR98] Gert Storms, Paul De Boeck, Iven Van Mechelen, and Wim Ruts. Not guppies, nor goldfish, but tumble dryers, noriega, jesse jackson, panties, car crashes, bird books, and stevie wonder. Memory & cognition, 26(1):143–145, 1998. → pages 21, 49
[Sea04] John R Searle. Mind: a brief introduction, volume 259. Oxford University Press Oxford, 2004. → pages 4, 119
[SG07] Mark Steyvers and Tom Griffiths. Probabilistic topic models. Handbook of latent semantic analysis, 427(7):424–440, 2007. → pages 6
[SL97] David R Shanks and Koen Lamberts. Knowledge, Concepts, and Categories. Psychology Press, 1997.
→ pages 8
[SM81] E. Smith and D. Medin. Categories and Concepts. Harvard University Press, 1981. → pages 8
[Smi03] Quentin Smith. 15. why cognitive scientists cannot ignore quantum mechanics. Consciousness: New Philosophical Perspectives, page 409, 2003. → pages 28, 37
[SMR13] Paul Smolensky, Michael C Mozer, and David E Rumelhart. Mathematical perspectives on neural networks. Psychology Press, 2013. → pages 46
[SO81] E. Smith and D. Osherson. On the adequacy of prototype theory as a theory of concepts. Cognition, 9:35–38, 1981. → pages 1, 21, 49
[SO82] E. Smith and D. Osherson. Gradedness and conceptual combination. Cognition, 12:299–318, 1982. → pages 49
[SOSS03] Steven A Sloman, David Over, Lila Slovak, and Jeffrey M Stibel. Frequency illusions and other fallacies. Organizational Behavior and Human Decision Processes, 91(2):296–309, 2003. → pages 14
[Sow00] John F Sowa. Ontology, metadata, and semiotics. In Conceptual structures: Logical, linguistic, and computational issues, pages 55–81. Springer, 2000. → pages 119
[Soz14] Sandro Sozzo. A quantum probability explanation in fock space for borderline contradictions. Journal of Mathematical Psychology, 58:1–12, 2014. → pages 19
[SP96] Howard Schuman and Stanley Presser. Questions and answers in attitude surveys: Experiments on question form, wording, and context. Sage, 1996. → pages 15
[ST85] Jun John Sakurai and San Fu Tuan. Modern quantum mechanics, volume 1. Addison-Wesley Reading, Massachusetts, 1985. → pages 34
[STZ05] Xuehua Shen, Bin Tan, and ChengXiang Zhai. Context-sensitive information retrieval using implicit feedback. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 43–50. ACM, 2005. → pages 119
[Sve08] María Helena Svensson. A very complex criterion of fixedness: Non-compositionality. Phraseology: An Interdisciplinary Perspective, S. Granger, 81:81–93, 2008. → pages 120
[TB11] J.
Trueblood and J. Busemeyer. A quantum probability account of order effects in inference. Cognitive Science, 35(8):1518–1552, 2011. → pages 15
[TBO04] Katya Tentori, Nicolao Bonini, and Daniel Osherson. The conjunction fallacy: a misunderstanding about conjunction? Cognitive Science, 28(3):467–477, 2004. → pages 14
[TC12] Katya Tentori and Vincenzo Crupi. On the conjunction fallacy and the meaning of ‘and’, yet again: A reply to. Cognition, 122(2):123–134, 2012. → pages 14
[TG82] A. Tversky and I. Gati. Similarity, separability, and the triangle inequality. Psychological Review, 89(2), 1982. → pages 9
[Tha97] Paul Thagard. Coherent and creative conceptual combinations. Creative thought: An investigation of conceptual structures and processes, 1997. → pages 37, 49
[Tho85] Richard Frederick Thompson. The brain: An introduction to neuroscience. WH Freeman/Times Books/Henry Holt & Co, 1985. → pages 5
[TK81] Amos Tversky and Daniel Kahneman. Judgments of and by representativeness. Technical report, DTIC Document, 1981. → pages 14
[TK83] Amos Tversky and Daniel Kahneman. Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological review, 90(4):293, 1983. → pages 12, 13
[TKGG11] J. Tenenbaum, C. Kemp, T Griffiths, and N. Goodman. How to grow a mind: Statistics, structure, and abstraction. Science, 331(6022):1279–1286, 2011. → pages 166
[TML96] Karl Halvor Teigen, Monica Martinussen, and Thorleif Lund. Conjunction errors in the prediction of referendum outcomes: Effects of attitude and realism. Acta Psychologica, 93(1):91–105, 1996. → pages 14
[TRR00] Roger Tourangeau, Lance J Rips, and Kenneth Rasinski. The psychology of survey response. Cambridge University Press, 2000. → pages 15
[Tur01] Peter Turney. Mining the web for synonyms: Pmi-ir versus lsa on toefl. 2001. → pages 119
[Tve77] A. Tversky. Features of similarity. Psychological Review, 44(4), 1977. → pages 9
[VAZ11] Tomas Veloz, Diederik Aerts, and Xiaozhao Zhao.
Measuring conceptual entanglement in collections of documents. Quantum Interaction, pages 116–127, 2011. → pages 46
[VGEA11] T. Veloz, L. Gabora, M. Eyjolfson, and D. Aerts. Toward a formal model of the shifting relationship between concepts and contexts during associative thought. In Dawei Song, Massimo Melucci, Ingo Frommholz, Peng Zhang, Lei Wang, and Sachi Arafat, editors, Quantum Interaction - 5th International Symposium, QI 2011, Aberdeen, UK, June 26-29, 2011, Revised Selected Papers, volume 7052 of Lecture Notes in Computer Science, pages 25–34. Springer, 2011. → pages 10, 12
[VNM07] John Von Neumann and Oskar Morgenstern. Theory of Games and Economic Behavior (60th Anniversary Commemorative Edition). Princeton university press, 2007. → pages 15
[Vor62] NN Vorob’ev. Consistent families of measures and their extensions. Theory of Probability & Its Applications, 7(2):147–163, 1962. → pages 27
[WB13] Zheng Wang and Jerome R Busemeyer. A quantum question order model supported by empirical tests of an a priori and precise prediction. Topics in cognitive science, 5(4):689–710, 2013. → pages 15, 37, 39
[WBAP13] Zheng Wang, Jerome R Busemeyer, Harald Atmanspacher, and Emmanuel M Pothos. The potential of using quantum theory to build models of cognition. Topics in Cognitive Science, 5(4):672–688, 2013. → pages 46
[Wig61] Eugene P Wigner. Remarks on the mind-body problem. 1961. → pages 4
[Wis96] Edward J Wisniewski. Construal and similarity in conceptual combination. Journal of Memory and Language, 35(3):434–453, 1996. → pages 49
[Wit58] L. Wittgenstein. Philosophical Investigations. Blackwell, Oxford, 1958. → pages 8, 167
[WL98] Edward J Wisniewski and Bradley C Love. Relations versus properties in conceptual combination. Journal of Memory and Language, 38(2):177–202, 1998. → pages 49
[WM08] Douglas H Wedell and Rodrigo Moro. Testing boundary conditions for the conjunction fallacy: Effects of response mode, conceptual focus, and problem type.
Cognition, 107(1):105–136, 2008. → pages 14
[WSSB14] Zheng Wang, Tyler Solloway, Richard M Shiffrin, and Jerome R Busemeyer. Context effects produced by question orders reveal quantum nature of human judgments. Proceedings of the National Academy of Sciences, 111(26):9431–9436, 2014. → pages 15, 41
[Zad65] L. Zadeh. Fuzzy sets. Information and Control, 8:338–353, 1965. → pages 8, 119, 165

Appendices

Appendix A
Traditional Modeling Tools

Different mathematical tools have been used to model cognitive phenomena. In what follows, we give a brief overview of the most relevant mathematical structures used in cognitive modelling to date. We refer to [RN95] for a more comprehensive review.

A.1 Classical logic

Classical logic is the first and most explored mathematical structure used to represent and process meaning. It was first formulated by the ancient Greeks in their search for a notion of truth and for deductive procedures [CM72]. The idea behind logic is that a reasoning process, starting from certain ‘true’ basic facts, should allow us to deduce all possible true (or false) facts. The basic elements of any logical approach are i) a set of basic postulates that forms the starting universe of discourse, ii) certain connectives and relations to build new postulates from the basic ones, and iii) deductive rules to reason.

Logic can be used, in principle, to formalize any process where some form of analytic reasoning is present. For example, propositional logic (PL) is defined as a system L = (P, C, R, A), where P is a set of propositional variables, C is a set of connectives and relations that includes ‘and,’ ‘or,’ ‘not,’ and ‘then,’ denoted by ∧, ∨, ¬, and →, R is a set of deduction rules, and A is the set of axioms. A typical element of P is a basic proposition such as p1 = ‘Today is Monday’ or p2 = ‘John goes to a restaurant.’ R contains rules such as modus ponens: from p and p → q, infer q.
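As an illustration of how a deduction rule in R can be checked, the sketch below verifies by truth table that modus ponens never leads from true premises to a false conclusion. It is a minimal Python sketch, not part of the original text; the helper names `implies` and `is_tautology` are assumptions introduced here for illustration.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: a -> b is false only when a is true and b is false."""
    return (not a) or b

def is_tautology(formula, num_vars: int) -> bool:
    """Evaluate a formula (a function of booleans) on every truth assignment."""
    return all(formula(*values) for values in product([False, True], repeat=num_vars))

# Modus ponens written as one formula: (p and (p -> q)) -> q
modus_ponens = lambda p, q: implies(p and implies(p, q), q)

# For contrast, an invalid pattern (affirming the consequent): (q and (p -> q)) -> p
affirming_consequent = lambda p, q: implies(q and implies(p, q), p)

print(is_tautology(modus_ponens, 2))          # True
print(is_tautology(affirming_consequent, 2))  # False
```

Because propositional connectives are finitely evaluated, an exhaustive check over all truth assignments is enough here; the number of assignments doubles with every additional variable.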
Axioms such as p → p and ¬¬p → p are basic truths in the logical system.

A logical system should satisfy at least two basic conditions. The first, called soundness, requires that the deduction rules prove only formulas that are true. The second, called completeness, requires that every true proposition be provable within the logical system. Unfortunately, the latter condition is hard to exploit in practice, since deciding the satisfiability of a propositional formula (even one in conjunctive form) is NP-complete, and no algorithm is known that solves it in time polynomial in the formula’s length [Pyl84].

The set of connectives C is of particular interest since it may limit how much can be proven in a certain logical system. There are different expressivity levels depending on the types of connectives we use. For example, the connectives of PL are always finitely evaluated. However, connectives such as the quantifier ∀, which means ‘for all,’ require evaluating formulas in a potentially infinite number of cases. Consider the simple mathematical statement ∀x (x < 1 → x < 2). Verifying it involves infinitely many numbers (all numbers smaller than 1), so it cannot be stated within a PL system. The use of connectives and relations with different levels of expressivity leads to an extremely fine-grained development of ‘modal’ logics that reveal a hierarchy of logical systems organized according to their degree of expressivity [Cha97].

Logic is also deeply connected with the notion of computation. Indeed, the basic operations that a computer performs correspond to logical operations, and thus all procedures that a computer performs can be reduced to logical formulas. In particular, an interesting connection between logic and computation is found in the area of descriptive complexity [Imm87], where different classes of computational complexity are mapped to different logical languages.

Formal Concept Analysis (FCA) is an example of the application of logic to model concepts.
FCA is based on a particular formalization of the notion of concept, inspired by the view of concepts in traditional logic, which assumes that a concept can be described in terms of a set of attributes, where each exemplar corresponds to a propositional-logic combination of some of these attributes [AN96]. In its basic form, FCA analyses input data consisting of objects, each determined by the set of attributes it is assumed to hold.

The primary aim in FCA is to extract from the input data all consistent or formal concepts, which form a hierarchy called the concept lattice, and a set of particular attribute dependencies known as attribute implications [GSW05]. A formal concept is defined as a pair consisting of two sets: a set of objects to which the concept applies, the concept’s extent, and a set of attributes that characterize the concept, the concept’s intent. An attribute implication is an expression A → B, which states that if an object has all the attributes in A, then it also has all the attributes in B. For example, let A = {drinks-alcohol, smokes} and B = {heart-problems}; the implication A → B holds in a dataset if every object that drinks alcohol and smokes also has heart problems.

The concept lattice is used to derive the minimal set of attribute implications. This set can be used to find hidden relations, possibly causal ones, that are not evident from the dataset. It has been shown that the set of attribute implications has a non-redundant minimal base from which all dependencies can be obtained by applying standard deduction rules [GSW05].

Applications of FCA can be found in many areas, including engineering, natural and social sciences, and mathematics [Aré03, GSW05]. In cognitive modelling, FCA has been proposed as a possible structure to study property correlations that might foster categorization [Boe97].
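The notions of extent, intent, formal concept, and attribute implication can be sketched on a toy dataset. The following Python fragment is only an illustration under stated assumptions: the four objects and their attributes are hypothetical data invented here, and the brute-force enumeration over attribute subsets is chosen for brevity, not as the algorithm used in the FCA literature.

```python
from itertools import combinations

# Hypothetical formal context: object -> set of attributes it holds.
context = {
    "ann":  {"drinks-alcohol", "smokes", "heart-problems"},
    "bob":  {"drinks-alcohol"},
    "cara": {"smokes", "heart-problems"},
    "dan":  {"drinks-alcohol", "smokes", "heart-problems"},
}
all_attributes = set().union(*context.values())

def extent(attrs):
    """Objects that hold every attribute in attrs."""
    return {obj for obj, held in context.items() if attrs <= held}

def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(all_attributes)
    for obj in objs:
        shared &= context[obj]
    return shared

def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    found = set()
    attrs = sorted(all_attributes)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            ext = extent(set(subset))
            found.add((frozenset(ext), frozenset(intent(ext))))
    return found

def implication_holds(premise, conclusion):
    """A -> B holds if every object with all of A also has all of B."""
    return extent(premise) <= extent(premise | conclusion)

print(len(formal_concepts()))                                              # 4
print(implication_holds({"drinks-alcohol", "smokes"}, {"heart-problems"}))  # True
print(implication_holds({"drinks-alcohol"}, {"smokes"}))                    # False
```

In this toy context the implication {drinks-alcohol, smokes} → {heart-problems} holds, while {drinks-alcohol} → {smokes} fails because of the object "bob". Enumerating all attribute subsets is exponential in the number of attributes, so the sketch is suitable only for very small contexts.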
From a structural perspective, however, FCA is a non-graded mathematical formalism, so it cannot account for the membership and typicality functions of a concept theory.

A.2 Fuzzy Logic and Fuzzy Set Theory

Fuzzy logic is an extension of classical logic where propositions can have degrees of truth. Fuzzy set theory is the analogous extension of classical set theory. It allows elements of a set to have degrees of membership ranging from null to total membership. The degree of membership is measured on a [0, 1] scale, where 0 means not a member and 1 means full member. This is useful for settings where it is impossible to assign binary membership values [Zad65]. Fuzzy set theory and fuzzy logic are mathematically related. Here, we elaborate on fuzzy set theory because it has been used as a mathematical framework for the prototype theory of concepts.

Given a set X ≠ ∅, a fuzzy set is a pair (X, f) where f : X → [0, 1] [Zad65]. The value f(x) is usually interpreted as the degree of membership of x in X. One of the important notions in fuzzy set theory is the threshold-dependent set Xα = {x ∈ X : f(x) > α}, the set that contains the elements having a degree of membership above α. This emphasizes the elements that have higher membership and, viewed as a function of α, Xα represents a fingerprint of the structure of the set.

Fuzzy set theory is equipped with algebraic operators that extend the operators of classical set theory, including union, intersection, and complement. However, these operators can be extended in many different ways, and there is no formal criterion to decide which extension is more appropriate [BK11]. An interesting class of extensions, called triangular norms (t-norms) together with their dual t-conorms, provides the most reasonable extension for the binary set-theoretical operators ∧ (‘and’) and ∨ (‘or’).
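A minimal sketch of these definitions, assuming a small finite universe, is given below in Python. The membership values and concept names are hypothetical, and the minimum, maximum, and 1 − f(x) operators are only one common choice among the many possible extensions mentioned above.

```python
# Hypothetical membership degrees of four exemplars in two concepts.
furniture = {"chair": 1.0, "lamp": 0.6, "rug": 0.4, "toaster": 0.1}
household_appliance = {"chair": 0.0, "lamp": 0.5, "rug": 0.0, "toaster": 0.9}

def alpha_cut(fuzzy_set, alpha):
    """Classical set X_alpha: elements whose membership is strictly above alpha."""
    return {x for x, degree in fuzzy_set.items() if degree > alpha}

def fuzzy_and(f, g):
    """Intersection using the minimum t-norm."""
    return {x: min(f[x], g[x]) for x in f}

def fuzzy_or(f, g):
    """Union using the maximum t-conorm."""
    return {x: max(f[x], g[x]) for x in f}

def fuzzy_not(f):
    """Standard complement 1 - f(x)."""
    return {x: 1.0 - f[x] for x in f}

print(sorted(alpha_cut(furniture, 0.5)))          # ['chair', 'lamp']
print(fuzzy_and(furniture, household_appliance))  # {'chair': 0.0, 'lamp': 0.5, 'rug': 0.0, 'toaster': 0.1}
print(fuzzy_or(furniture, household_appliance))   # {'chair': 1.0, 'lamp': 0.6, 'rug': 0.4, 'toaster': 0.9}
```

With the minimum rule, the membership of an exemplar in a conjunction can never exceed its membership in either constituent concept, which is the kind of constraint that experimental membership data can be tested against.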
T-norm operators are built in terms of minimum and maximum membership functions and, in some cases, include parameters that can be used to tailor the norm to better fit experimental data [BK11].

Aggregation operators that depend on the membership of all the elements in the fuzzy set can also be defined. These operators are used to extract information regarding the membership structure of the set. For example, the cardinality of a fuzzy set is defined as the sum of the memberships of all its elements. From here, the average membership of the fuzzy set is defined as the ratio between the fuzzy and the classical cardinality. Aggregation operators are used to develop fuzzy notions of connectives belonging to the high-expressivity formal languages found in modal logic.

Fuzzy set theory has been applied in a broad spectrum of areas related to automation, such as control theory and expert systems [Low96]. In cognitive modelling, fuzzy set theory was first used to frame the prototype theory of concepts [Ros99]. A concept A is defined by a universal set of exemplars U, a membership function f, and a membership threshold α. The function f can also be interpreted as a measure of typicality, or of similarity to a specific prototype [Lak73]. The concept is modeled by the fuzzy set (U, f). To recover truth evaluations, a threshold is imposed on f so that categories can be treated as sets. This approach extends the classical approach to concepts, but still lacks a formal procedure to combine categories into complex categories or sentences.

A.3 Probabilistic Approaches

Probability theory has been applied to almost every possible science-related endeavour [HPS71]. In cognitive modeling, the degree of membership of an exemplar with respect to a concept can be thought of as an estimate of the probability that the exemplar is a member of the category. There are procedures, developed using standard probability theory, to infer the membership probability of some exemplars from the membership probability of others [TKGG11].
Analogously, the notions of typicality, property relevance, and similarity can also be framed in a probabilistic manner [Nos88].

Before showing how probability theory has been applied to concepts research, we consider the foundational aspects of probability as it relates to concept theories.

A.3.1 Interpretations of Probability

There are three main interpretations of probability. The first assumes that probabilities are the relative frequencies with which the possible outcomes of a situation occur [Khr99]. This interpretation is useful when observations are repeatable. For example, when studying the possible outcomes of throwing a fair die, the relative frequency of each outcome tends to 1/6 as the number of repetitions of the experiment grows. We may assume, therefore, that there is an underlying probability that accounts for the relative frequencies of occurrence describing the phenomenon.

Several probability estimations, however, do not correspond to such a view [Khr99]. For example, we know from atomic physics that the decay time of certain radioactive elements is of the order of thousands of years. Clearly, we cannot afford to repeat many experiments, so the decay time is obtained through a formula whose parameters are established by a series of observations.

A second interpretation assumes that the outcome of a situation is ruled by an intrinsic propensity that would generate the observed relative frequencies if the experiment were repeated a large number of times [Khr99]. This interpretation, however, does not account for situations such as the likelihood that the current president will be reelected. This kind of subjective probability estimation can be interpreted neither from a relative-frequency view nor from an intrinsic-propensity view.

The third interpretation of probability involves subjective belief. That is, it is assumed that estimators have a degree of belief concerning the possible outcomes of a certain situation [Khr99].
Because estimations about concepts must be built upon cognitive mechanisms based on the information available to estimators, this is the interpretation most often used in theories of concepts. It is interesting to note that while philosophers are concerned with the relation between the foundations of probability and concepts [Wit58], the modeling community has often ignored this issue [Aer02].

A.3.2 Probability Spaces

Formally, a probability space $\mathcal{P}$ is defined by the triple

\[
\mathcal{P} = (\Omega, \mathcal{F}, P), \tag{A.1}
\]

where $\Omega$ is a set of elementary events $\omega \in \Omega$, $\mathcal{F}$ is a σ-algebra of subsets of $\Omega$, and $P$ is a σ-additive probability measure from $\mathcal{F}$ to $[0, 1]$ such that

\[
\begin{aligned}
P(A) &\geq 0, \quad \text{for } A \in \mathcal{F},\\
P(\Omega) &= 1,\\
P\Big(\bigcup_i A_i\Big) &= \sum_i P(A_i), \quad \text{for disjoint sets } A_i \in \mathcal{F},\ i \in \mathbb{N}.
\end{aligned}
\tag{A.2}
\]

Here we briefly explain how experiments on a system are modeled in a probabilistic setting.

A partition $\mathcal{E}$ of $\Omega$ is a set of subsets of $\Omega$ such that, for all distinct $E_1, E_2 \in \mathcal{E}$,

\[
E_1 \cap E_2 = \emptyset, \quad \text{and} \quad \bigcup_{E \in \mathcal{E}} E = \Omega.
\]

Because the outcomes of each experiment are exclusive, experiments are partitions of $\Omega$: the events representing the outcomes form disjoint sets. Also, every possible post-experimental situation of the system corresponds to an outcome, so the set of outcomes of each experiment is complete, and thus the union of the outcomes of an experiment is $\Omega$.

Denote the experiments by $\mathcal{E}_i$ for $i = 1, ..., n$, and let $\mathcal{E}_i = \{E_i^1, E_i^2, ...\}$ be the set of outcomes of $\mathcal{E}_i$. The set $\mathcal{E}_i$ can be either finite or infinite. For example, let $\mathcal{E}_1$ be the experiment of measuring whether a particle is on the positive or negative side of a reference axis. In this case, we have two outcomes, $E_1^+$ and $E_1^-$, depending on which side the particle is on. Let $\mathcal{E}_2$ be the experiment of measuring the exact position of the particle on a reference axis. In this case the set of outcomes is infinite.

We are interested in the joint probability distribution, $P(\mathcal{E}_1, ..., \mathcal{E}_n)$, which gives the probability of all possible outcome configurations of our experimental setting.
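As a rough illustration of the definitions above, the following Python sketch builds a small finite probability space and checks that two experiments are partitions of it. The four-point sample space, the uniform weights, and the names `OMEGA`, `WEIGHTS`, `prob`, `is_partition`, and `joint` are assumptions introduced only for this example; in particular, the exact-position experiment of the text has infinitely many outcomes, whereas here it is replaced by a finite stand-in.

```python
from fractions import Fraction

# Hypothetical finite sample space: four positions of a particle on an axis.
OMEGA = frozenset({-2, -1, 1, 2})

# Assumed uniform weights on elementary events (chosen only for illustration).
WEIGHTS = {w: Fraction(1, len(OMEGA)) for w in OMEGA}


def prob(event):
    """Return P(event), the sum of the weights of its elementary events."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))


def is_partition(experiment, omega=OMEGA):
    """Check that the outcomes are pairwise disjoint and that their union is omega."""
    outcomes = list(experiment.values())
    disjoint = all(
        a.isdisjoint(b)
        for i, a in enumerate(outcomes)
        for b in outcomes[i + 1:]
    )
    covering = frozenset().union(*outcomes) == omega
    return disjoint and covering


def joint(*outcomes):
    """Return the probability that all the given outcomes occur together."""
    common = OMEGA
    for outcome in outcomes:
        common = common & outcome
    return prob(common)


# Experiment 1: side of the axis. Experiment 2: exact (discretized) position.
E1 = {"+": frozenset({1, 2}), "-": frozenset({-2, -1})}
E2 = {w: frozenset({w}) for w in OMEGA}

if __name__ == "__main__":
    print(is_partition(E1), is_partition(E2))
    print(joint(E1["+"], E2[2]))
    print(joint(E1["+"], E2[-1]))
```

Under these assumptions, `joint(E1["+"], E2[2])` evaluates to 1/4, while `joint(E1["+"], E2[-1])` is 0, because the particle cannot be on the positive side and at position -1 at the same time. Exact fractions are used instead of floats so that the normalization condition P(Ω) = 1 can be checked without rounding error.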
In the joint distribution, a specific outcome configuration $(E_1^{j_1}, ..., E_n^{j_n})$ has probability

\[
P\Big(\bigcap_{i=1}^{n} E_i^{j_i}\Big) = P(E_1^{j_1}, ..., E_n^{j_n}). \tag{A.3}
\]

In many cases, rather than performing all possible experiments on our system, we focus on a few experiments. For simplicity let $n = 2$, and assume we would like to know only the probabilities of the outcomes of experiment $\mathcal{E}_2$. The probability distribution that only considers the outcomes of $\mathcal{E}_2$ is known as the marginal probability distribution, $P(\mathcal{E}_2)$, computed by

\[
P(\mathcal{E}_2) = \sum_{j=1}^{|\mathcal{E}_1|} P(E_1^j, \mathcal{E}_2). \tag{A.4}
\]

This formula, known as the marginal probability law, marginalizes out $\mathcal{E}_1$: it gives the probability of the outcomes of $\mathcal{E}_2$ by summing over all the possible outcomes of $\mathcal{E}_1$. We can also compute the probability of the outcomes of $\mathcal{E}_2$ given that the experiment $\mathcal{E}_1$ has already been performed and has produced the outcome $E_1^k$. This is known as the conditional probability, $P(\mathcal{E}_2 | E_1^k)$, given by

\[
P(\mathcal{E}_2 | E_1^k) = \frac{P(E_1^k \cap \mathcal{E}_2)}{P(E_1^k)}, \quad \text{for } P(E_1^k) > 0. \tag{A.5}
\]

Appendix B

Membership of Conjunctions and Negations of Concepts

Table B.1: Representation of the membership weights in the case of the concepts Home Furnishing and Furniture. A = Home Furnishing, B = Furniture.

Exemplar      µ(A)   µ(B)   µ(A′)  µ(B′)  µ(AB)  µ(AB̄)  µ(ĀB)  µ(ĀB̄)  ∆AB    ∆AB̄    ∆ĀB    ∆ĀB̄    I_ABĀB̄  I_A    I_B    I_Ā    I_B̄
Mantelpiece   0.9    0.61   0.12   0.5    0.71   0.75   0.

Table B.2: Representation of the membership weights in the case of the concepts Spices and Herbs. A = Spices, B = Herbs.

Exemplar      µ(A)   µ(B)   µ(A′)  µ(B′)  µ(AB)  µ(AB̄)  µ(ĀB)  µ(ĀB̄)  ∆AB    ∆AB̄    ∆ĀB    ∆ĀB̄    I_ABĀB̄  I_A    I_B    I_Ā    I_B̄
Molasses      0.36   0.13   0.67   0.84   0.24   0.54   0.25   0.73   0.

Table B.3: Representation of the membership weights in the case of the concepts Pets and Farmyard Animals. A = Pets, B = Farmyard Animals.

Exemplar      µ(A)   µ(B)   µ(A′)  µ(B′)  µ(AB)  µ(AB̄)  µ(ĀB)  µ(ĀB̄)  ∆AB    ∆AB̄    ∆ĀB    ∆ĀB̄    I_ABĀB̄  I_A    I_B    I_Ā    I_B̄
Goldfish      0.93   0.17   0.12   0.81   0.43   0.91   0.18   0.43   0.

Table B.4: Representation of the membership weights in the case of the concepts Fruits and Vegetables. A = Fruits, B = Vegetables.

Exemplar      µ(A)   µ(B)   µ(A′)  µ(B′)  µ(AB)  µ(AB̄)  µ(ĀB)  µ(ĀB̄)  ∆AB    ∆AB̄    ∆ĀB    ∆ĀB̄    I_ABĀB̄  I_A    I_B    I_Ā    I_B̄
Apple         1      0.23   0      0.82   0.6    0.89   0.13   0.18   0.38   0.07   0.13   0.18   -0.79    -0.49  -0.5   -0.3   -0.24
Parsley       0.02   0.78   0.99   0.25   0.45   0.1    0.84   0.44   0.43   0.08   0.06   0.19   -0.83    -0.53  -0.51  -0.29  -0.29
Olive         0.53   0.63   0.47   0.44   0.65   0.34   0.51   0.36   0.12   -0.11  0.04   -0.08  -0.86    -0.46  -0.53  -0.41  -0.26
Chili Pepper  0.19   0.73   0.83   0.35   0.51   0.2    0.68   0.44   0.33   0.01   -0.06  0.09   -0.83    -0.53  -0.46  -0.29  -0.29
Broccoli      0.09   1      0.94   0.06   0.59   0.

