World Sanskrit Conference (WSC) (17th : 2018)

Computational Sanskrit & Digital Humanities: Selected Papers Presented at the 17th World Sanskrit Conference,… Huet, Gérard; Kulkarni, Amba. 2019.


Computational Sanskrit & Digital Humanities:
Selected Papers Presented at the 17th World Sanskrit Conference,
University of British Columbia, Vancouver, Canada, July 9–13, 2018.
Edited by Gérard Huet and Amba Kulkarni
General Editor: Adheesh Sathaye
DOI: 10.14288/1.0391834

Electronic edition published (2020) by the Department of Asian Studies, University of British Columbia, for the International Association for Sanskrit Studies. Hardback edition published in 2018 by D. K. Publishers Distributors Pvt. Ltd., New Delhi (ISBN: 978-93-87212-10-7). © Individual authors, 2020. Content is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license (CC BY-NC-ND 4.0). All papers in this collection have received double-blind peer review.

INTERNATIONAL ASSOCIATION OF SANSKRIT STUDIES

Preface

This volume contains edited versions of papers accepted for presentation at the 17th World Sanskrit Conference, held in July 2018 in Vancouver, Canada. A special track of the conference was reserved for the topic "Computational Sanskrit & Digital Humanities", with the intent not only to cover recent advances in each of the now mature fields of Sanskrit Computational Linguistics and Sanskrit Digital Libraries, but also to encourage cooperative efforts between scholars of the two communities and to prepare the emergence of a grammatically informed digital Sanskrit corpus.
Due to the rather technical nature of the track, contributions were not judged on mere abstracts, but on submitted full papers reviewed by a Program Committee. We would like to thank the Program Committee of our track for their work:

• Dr Tanuja Ajotikar, Belgavi, Karnataka
• Pr Stefan Baums, University of Munich
• Pr Yigal Bronner, Hebrew University of Jerusalem
• Pr Pawan Goyal, IIT Kharagpur
• Dr Oliver Hellwig, Düsseldorf University
• Dr Gérard Huet, Inria Paris (co-chair)
• Pr Girish Nath Jha, JNU, Delhi
• Pr Amba Kulkarni, University of Hyderabad (co-chair)
• Dr Pawan Kumar, Chinmaya Vishwavidyapeeth, Veliyanad
• Pr Andrew Ollett, Harvard University
• Dr Dhaval Patel, I.A.S. Officer, Gujarat
• Pr Srinivasa Varakhedi, KSU, Bengaluru

Fourteen contributions were accepted, revised along the referees' recommendations, and finely edited to form this collection.

The first two papers concern the problem of proper mechanical simulation of Pāṇini's Aṣṭādhyāyī. In "A Functional Core for the Computational Aṣṭādhyāyī", Samir Janardan Sohoni and Malhar A. Kulkarni present an original architecture for such a simulator, based on concepts from functional programming. In their model, each Pāṇinian sūtra translates to a Haskell module, an elegant and effective formalization. They explain an algorithm for sūtra-conflict assessment and resolution, discuss visibility and termination, and exhibit a formal derivation of the word bhavati as a showcase.

A different computational framework for the same problem is offered by Sarada Susarla, Tilak M. Rao and Sai Susarla in their paper "PAIAS: Pāṇini Aṣṭādhyāyī Interpreter As a Service". They explain their development of a Web service usable as a Sanskrit grammatical assistant, implementing the Pāṇinian mechanisms directly. Here sūtras are records in a database in the JSON format, managed by a Python library.
They pay particular attention to the meta-rules of the grammar, and especially to defining sūtras. They refrain from expanding such definitions into operative sūtras, but insist on emulating them along the grammatical processing.

These two papers conceptualize two computational models of Pāṇinian grammar that are strikingly different in their software architecture. However, when one examines examples of sūtra representations in both systems, the information content looks very similar, which may suggest some future inter-operability of these two interesting tools.

In the general area of mechanical analysis of Sanskrit text, we have several contributions at various levels. At the level of semantic role analysis, at the heart of dependency parsing, Sanjeev Panchal and Amba Kulkarni present possible solutions to the complementary problem of ambiguity. In their paper "Yogyatā as an absence of non-congruity" they explain various definitions used by Sanskrit grammarians to express compatibility, and how to use these definitions to reduce ambiguity in dependency analysis.

The next paper, "An 'Ekalavya' Approach to Learning Context Free Grammar Rules for Sanskrit Using Adaptor Grammar", by Amrith Krishna, Bodhisattwa Prasad Majumder, Anil Kumar Boga, and Pawan Goyal, presents an innovative use of adaptor grammars to learn patterns in Sanskrit text definable as context-free languages.
They present applications of their techniques to word reordering tasks in Sanskrit, a preliminary step towards recovering prose ordering from poetry, a crucial problem in Sanskrit.

Concerning meter recognition, we have a contribution by Shreevatsa Rajagopalan on "A user-friendly tool for metrical analysis of Sanskrit verse". The main features of this new metrical analysis tool, available either as a Web service or as a software library, are its robustness and its guidance in error correction.

Two more contributions use statistical techniques (Big Data) for improving various Sanskrit-related tasks. In the field of optical character recognition, Devaraja Adiga, Rohit Saluja, Vaibhav Agrawal, Ganesh Ramakrishnan, Parag Chaudhuri, K. Ramasubramanian and Malhar Kulkarni contribute "Improving the learnability of classifiers for Sanskrit OCR corrections". In the same vein of statistical techniques, Nikhil Chaturvedi and Rahul Garg present "A Tool for Transliteration of Bilingual Texts Involving Sanskrit", which smoothly accommodates text mixing various encodings.

While much of the work in Sanskrit computational linguistics is on Classical Sanskrit, researchers are also applying computational techniques to Vedic Sanskrit. One such effort is a detailed formalization of Vedic recitation phonology by Balasubramanian Ramakrishnan: "Modeling the Phonology of Consonant Duplication and Allied Changes in the Recitation of Tamil Taittirīyaka-s".

From a more theoretical perspective on Sanskrit syntax, Brendan Gillon presents a formalization of Sanskrit complements in terms of the categorial grammar framework.
His paper "Word complementation in Sanskrit treated by a modest generalization of categorial grammar" explains modified versions of the cancellation rules that aim at accommodating free word order. This raises the theoretical problem of the distinction between complements and modifiers in Sanskrit.

Turning to the Digital Humanities theme, we have a number of contributions. In "TEITagger: Raising the standard for digital texts to facilitate interchange with linguistic software", Peter Scharf discusses how fine-grained XML representation of corpus within the Text Encoding Initiative standard allows inter-communication between digital Sanskrit libraries and grammatical tools such as parsers, as well as meter analysis tools.

A complementary proposal is discussed in the paper "Preliminary Design of a Sanskrit Corpus Manager" by Gérard Huet and Idir Lankri. They propose a scheme for a fine-grained representation of Sanskrit corpus allowing inter-textuality phenomena such as the sharing of sections of text, but also variance of readings. They propose to use grammatical analysis tools to help annotators feed digital libraries with grammatical information using modern cooperative work software. They demonstrate a prototype of such a tool in the framework of the Sanskrit Heritage platform.

Moving towards philological concerns such as critical editions, the paper "Enriching the digital edition of the Kāśikāvṛtti by adding variants from the Nyāsa and Padamañjarī", by Tanuja P. Ajotikar, Anuja P. Ajotikar, and Peter M. Scharf, discusses the problem of managing complex information from recensions and variants.
It argues for a disciplined method of using TEI structure to represent this information in machine-manipulable ways, and demonstrates its use in processing variants of the Kāśikāvṛtti, the major commentary on the Aṣṭādhyāyī.

In the same area of software-aided philology, the contribution "From the web to the desktop: IIIF-Pack, a document format for manuscripts using Linked Data standards", by Timothy Bellefleur, presents a proposal for a common format fit to manage complex information about corpus recensions in various formats, including images of manuscripts. This is with a view to facilitating the interchange of such data by various teams using this common format. His proposal uses state-of-the-art hypertext standards. It has already been put to use in an interactive software platform to manage recensions for the critical edition of the Vetālapañcaviṃśati by Pr. Adheesh Sathaye.

The volume concludes with the contribution "New Vistas to study Bhartṛhari: Cognitive NLP" by Jayashree Aanand Gajjam, Diptesh Kanojia, and Malhar Kulkarni, which presents highly original research on cognitive linguistics in Sanskrit, comparing the results of experiments with eye-tracking equipment against theories of linguistic cognition by Bhartṛhari.

We thank the numerous experts who helped us in the review process and all our authors who responded positively to the reviewers' comments and improved their manuscripts accordingly. We thank the entire 17th WSC organizing committee, led by Pr. Adheesh Sathaye, which provided us the necessary logistic support for the organization of this section.

Gérard Huet & Amba Kulkarni

Contributors

Devaraja Adiga
Department of Humanities and Social Sciences,
Indian Institute of Technology Bombay,
Powai, Mumbai, India.

Vaibhav Agrawal
Indian Institute of Technology,
Kharagpur, India.

Anuja Ajotikar
Shan State Buddhist University,
Myanmar.
anujaajotikar@gmail.com

Tanuja Ajotikar
KAHER's Shri B. M.
Kankanwadi Ayurveda Mahavidyalaya,
Belagavi, India.
gtanu30@gmail.com

Timothy Bellefleur
Department of Asian Studies,
University of British Columbia, Vancouver.
tbelle@alumni.ubc.ca

Anil Kumar Boga
Department of Computer Science and Engineering,
Indian Institute of Technology,
Kharagpur, India.
bogaanil.009@gmail.com

Nikhil Chaturvedi
Department of Computer Science and Engineering,
Indian Institute of Technology,
New Delhi, India.

Parag Chaudhuri
Indian Institute of Technology Bombay,
Powai, Mumbai, India.

Jayashree Aanand Gajjam
Department of Humanities and Social Sciences,
Indian Institute of Technology Bombay,
Powai, Mumbai, India.

Rahul Garg
Department of Computer Science and Engineering,
Indian Institute of Technology,
New Delhi, India.

Brendan S. Gillon
McGill University,
Montreal, Quebec H3A 1T7, Canada.
brendan.gillon@mcgill.ca

Pawan Goyal
Department of Computer Science and Engineering,
Indian Institute of Technology,
Kharagpur, India.
pawang@cse.iitkgp.ernet.in

Gérard Huet
Inria Paris Center,
France.
Gerard.Huet@inria.fr

Diptesh Kanojia
IITB-Monash Research Academy,
Powai, Mumbai, India.

Amrith Krishna
Department of Computer Science and Engineering,
Indian Institute of Technology,
Kharagpur, India.
amrith.krishna@cse.iitkgp.ernet.in

Amba Kulkarni
Department of Sanskrit Studies,
University of Hyderabad,
Hyderabad, India.
apksh@uohyd.ernet.in

Malhar Kulkarni
Department of Humanities and Social Sciences,
Indian Institute of Technology Bombay,
Powai, Mumbai, India.

Idir Lankri
Université Paris Diderot,
Paris.
lankri.idir@gmail.com

Bodhisattwa Prasad Majumder
Walmart.

Sanjeev Panchal
Department of Sanskrit Studies,
University of Hyderabad,
Hyderabad, India.
snjvpnchl@gmail.com

Shreevatsa Rajagopalan
Independent Scholar,
1035 Aster Ave 1107,
Sunnyvale, CA 94086, USA.
shreevatsa.public@gmail.com

Balasubramanian Ramakrishnan
Independent Scholar,
145 Littleton Rd,
Harvard, MA 01451, USA.
balasr@acm.org

Ganesh Ramakrishnan
Indian Institute of Technology Bombay,
Powai, Mumbai, India.

K. Ramasubramanian
Department of Humanities and Social
Sciences,
Indian Institute of Technology Bombay,
Powai, Mumbai, India.

Tilak M Rao
School of Vedic Sciences,
MIT-ADT University,
Pune, India.
rao.tilak@gmail.com

Rohit Saluja
IITB-Monash Research Academy,
Powai, Mumbai, India.

Peter Scharf
The Sanskrit Library,
Providence, Rhode Island, U.S.A.
and
Language Technologies Research Center,
International Institute of Information Technology,
Hyderabad, India.
scharf@sanskritlibrary.org

Samir Janardan Sohoni
Department of Humanities and Social Sciences,
Indian Institute of Technology Bombay,
Powai, Mumbai, India.
sohoni@hotmail.com

Sarada Susarla
Karnataka Sanskrit University,
Bangalore, India.
sarada.susarla@gmail.com

Sai Susarla
School of Vedic Sciences,
MIT-ADT University,
Pune, India.
sai.susarla@gmail.com

Contents

Preface (p. i)
Contributors (p. v)

A Functional Core for the Computational Aṣṭādhyāyī (p. 1)
Samir Sohoni and Malhar A. Kulkarni

PAIAS: Pāṇini Aṣṭādhyāyī Interpreter As a Service (p. 31)
Sarada Susarla, Tilak M. Rao and Sai Susarla

Yogyatā as an absence of non-congruity (p. 59)
Sanjeev Panchal and Amba Kulkarni

An 'Ekalavya' Approach to Learning Context Free Grammar Rules for Sanskrit Using Adaptor Grammar (p. 83)
Amrith Krishna, Bodhisattwa Prasad Majumder, Anil Kumar Boga, and Pawan Goyal

A user-friendly tool for metrical analysis of Sanskrit verse (p. 113)
Shreevatsa Rajagopalan

Improving the learnability of classifiers for Sanskrit OCR corrections (p. 143)
Devaraja Adiga, Rohit Saluja, Vaibhav Agrawal, Ganesh Ramakrishnan, Parag Chaudhuri, K. Ramasubramanian and Malhar Kulkarni

A Tool for Transliteration of Bilingual Texts Involving Sanskrit (p. 163)
Nikhil Chaturvedi and Rahul Garg

Modeling the Phonology of Consonant Duplication and Allied Changes in the Recitation of Tamil Taittirīyaka-s (p. 181)
Balasubramanian Ramakrishnan

Word complementation in Classical Sanskrit (p. 217)
Brendan Gillon

TEITagger: Raising the standard for digital texts to facilitate interchange with linguistic software (p. 229)
Peter M.
Scharf

Preliminary Design of a Sanskrit Corpus Manager (p. 259)
Gérard Huet and Idir Lankri

Enriching the digital edition of the Kāśikāvr̥tti by adding variants from the Nyāsa and Padamañjarī (p. 277)
Tanuja P. Ajotikar, Anuja P. Ajotikar, and Peter M. Scharf

From the Web to the desktop: IIIF-Pack, a document format for manuscripts using Linked Data standards (p. 295)
Timothy Bellefleur

New Vistas to study Bhartṛhari: Cognitive NLP (p. 311)
Jayashree Aanand Gajjam, Diptesh Kanojia and Malhar Kulkarni

A Functional Core for the Computational Aṣṭādhyāyī

Samir Janardan Sohoni and Malhar A. Kulkarni

Abstract: There have been several efforts to produce computational models of concepts from Pāṇini's Aṣṭādhyāyī. These implementations targeted certain subsections of the Aṣṭādhyāyī, such as the visibility of rules, resolving rule conflicts, producing sandhi, etc. Extrapolating such efforts will eventually give us a much-coveted computational Aṣṭādhyāyī. A computational Aṣṭādhyāyī must produce an acceptable derivation of words showing the order in which sūtras are applied. We have developed a mini computational Aṣṭādhyāyī which purports to derive accented verb forms of the root bhū in the laṭ lakāra. An engine repeatedly chooses, prioritizes and applies sūtras to an input, given in the form of a vivakṣā, until an utterance is derived. Among other things, this paper describes the structure of sūtras, the visibility of sūtras in the sapādasaptādhyāyī and tripādī sections, the phasing of sūtras, and the conflict resolution mechanisms.

We found that the saṃjñā and vidhi sūtras are relatively simple to implement due to overt conditional clues. The adhikāra and paribhāṣā sūtras are too general to be implemented on their own, but can be bootstrapped into the vidhi sūtras. The para-nitya-antaraṅga-apavāda method of resolving sūtra conflicts was extended to suit the computational Aṣṭādhyāyī.
Phasing can be used as a device to defer certain sūtras to a later stage in the derivation.

This paper is the first part of a series. We intend to write more as we implement more of the Aṣṭādhyāyī.

Keywords: computational Ashtadhyayi, derivation, conflict resolution, sutra, visibility, phase

1 Introduction

Accent is a key feature of the Sanskrit language. While the ubiquitous accent of Vedic has fallen out of use in Classical Sanskrit, Pāṇini's grammatical mechanisms are capable of producing accented speech. We aim to derive an accented instance by using a computer implementation of Pāṇinian methods. Our system can produce the output shown in Listing 1.

Our implementation uses a representation of vivakṣā (see Listing 11) to drive the derivation. We model Aṣṭādhyāyī sūtras as requiring a set of preconditions to produce an effect. The sūtras look for their preconditions in an input environment. The effects produced by sūtras become part of an ever-evolving environment which may trigger other sūtras. To resolve rule conflicts, we have made a provision for a harness which is based on the paribhāṣā पूर्वपरनित्यान्तरङ्गापवादानाम् उत्तरोत्तरं बलीयः.

We have used Haskell, a lazy functional programming language, to build the prototype. Our implementation uses a phonetic encoding of characters for accentuation and Pāṇinian operations (see Sohoni and M. A. Kulkarni (2016)). The phonetic encoding allows for faster phonetic modification and testing operations, something that seems to happen very frequently across most of the sūtras.

The following is an outline of the rest of this paper. Previous work related to the computational Aṣṭādhyāyī is reviewed in Section 2. Section 3 discusses how the phonetic components produced and consumed by sūtras are represented; it also discusses tracing the antecedents of components. The implementation of sūtras is discussed in Section 4. Intermediate steps of a derivation, known as frames, are discussed in Section 5.1.
Section 5.2 also discusses the environment which is used to check the triggering conditions of sūtras. Deferring the application of sūtras by arranging them into phases is discussed in Section 6. The process of derivation is explained in Section 7. Prioritization and conflict resolution of sūtras is discussed in Section 8. Section 9 discusses how visible frames in a derivation are made available to a sūtra. Some conclusions and future work are discussed in Section 10.

Wiwakshaa ---> [("gana",Just "1"),("purusha",Just "1"),("wachana",Just "1"),
                ("lakaara",Just "wartamaana"),("prayoga",Just "kartari")]
Initial ---> भू॒
[]
***> (6.1.162) ---> भू
[]
***> (3.2.123) ---> भूलँट्
[(1.4.13) wins (1.4.13) vs (1.3.9) by SCARE]
***> (1.4.13) ---> भूलँट्
[]
***> (1.3.9) ---> भूल्
[]
***> (3.4.78) ---> भूति॒प्
[(1.4.13) wins (1.4.13) vs (1.4.104) by SCARE,
 (1.4.13) wins (1.4.13) vs (1.3.9) by SCARE,
 (1.4.13) wins (1.4.13) vs (3.1.68) by SCARE,
 (1.4.13) wins (1.4.13) vs (7.3.84) by SCARE]
***> (1.4.13) ---> भूति॒प्
[(1.4.104) wins (1.4.104) vs (1.3.9) by SCARE,
 (1.4.104) wins (1.4.104) vs (3.1.68) by SCARE,
 (1.4.104) wins (1.4.104) vs (7.3.84) by SCARE]
***> (1.4.104) ---> भूति॒प्
[(3.1.68) wins (1.3.9) vs (3.1.68) by paratwa,
 (7.3.84) wins (3.1.68) vs (7.3.84) by paratwa]
***> (7.3.84) ---> भोति॒प्
[(3.1.68) wins (1.3.9) vs (3.1.68) by paratwa]
***> (3.1.68) ---> भोश॒प्ति॒प्
[(1.4.13) wins (1.4.13) vs (1.3.9) by SCARE]
***> (1.4.13) ---> भोश॒प्ति॒प्
[]
***> (1.3.9) ---> भोअ॒ति॒
[]
***> (1.4.14) ---> भोअ॒ति॒
[]
***> (1.4.109) ---> भोअ॒ति॒
[(8.4.66) wins (6.1.78) vs (8.4.66) by paratwa]
***> (8.4.66) ---> भोअ॑ति॒
[]
***> (6.1.78) ---> भव॒ति॒
[]
***> (8.4.66) ---> भव॑ति॒

Listing 1: A derivation of the pada भव॑ति॒

2 Review of Literature

Formal foundations of a computational Aṣṭādhyāyī can be seen in Mishra (2008, 2009, 2010). The general approach in Mishra's work is to take a linguistic form such as bhavati and apply heuristics to carve out some grammatical decompositions.
The decompositions are used to drive analytical processes that may yield more decompositions along the boundaries of sandhis to produce seed-forms.¹ This part is an analysis done in a top-down manner. The second phase is a bottom-up synthesis, wherein each of the seed-forms is processed by a synthesizer to produce finalized linguistic expressions that must match the original input. To support analysis, Mishra's implementation relies upon a database which contains partial orders of morphological entities, mutually exclusive morphemes, and other such artifacts.² In the synthesis phase, Mishra (2010) also implements a conflict resolver using the siddha principle.³

Goyal, A. P. Kulkarni, and Behera (2008) have also created a computational Aṣṭādhyāyī which focuses on ordering sūtras in the regions governed by पूर्वत्रासिद्धम् (A. 8.2.1), षत्वतुकोरसिद्धः (A. 6.1.86) and असिद्धवदत्राभात् (A. 6.4.22). Input, in the form of a prakṛti along with attributes, is passed through a set of modules that have thematically grouped rules. The implementation embodies the notion of data spaces. Rules are able to see various data spaces in order to take input. Results produced by the rules are put back into the appropriate data spaces. Conflict resolution is based on paribhāṣā-driven concepts such as the principle of apavāda, as well as ad-hoc techniques.⁴

Goyal, A. P. Kulkarni, and Behera (2008) §4 mention the features of a computational Aṣṭādhyāyī. A computational Aṣṭādhyāyī should seamlessly glue together grammatical concepts just like the traditional Aṣṭādhyāyī. It should not add any side effects, nor should it lack any part of the traditional Aṣṭādhyāyī.
Above all, a computational Aṣṭādhyāyī must produce an acceptable derivation of words.

The system described in Mishra (2010) does not use traditional building blocks such as the Māheśvara Sūtras or the Dhātupāṭha, but can be made to do so.⁵ We believe that canonical building blocks such as the Māheśvara Sūtras and the Aṣṭādhyāyī sūtrapāṭha should strongly influence the computational Aṣṭādhyāyī.

¹ See Mishra (2009), Section 4.2 for a description of the general process.
² See Mishra (2009), Section 6.1 for details of the database.
³ Mishra (2010): 255.
⁴ Goyal, A. P. Kulkarni, and Behera (2008), cf. 'Module for Conflict Resolution' in §4.4.

Peter M. Scharf (2016) discusses the need for faithfully translating Pāṇinian rules into the realm of computation and shows an elaborate XMLization of Pāṇini's rules. Bringing Pāṇini's rules into the area of computation uncovers some problems that need to be solved. T. Ajotikar, A. Ajotikar, and Peter M. Scharf (2016) discuss some of those issues.

XML is useful for describing structured data, and therefore XMLization of the Aṣṭādhyāyī is a step in the right direction. However, processing Pāṇinian derivations in XML would be fraught with performance issues. XML is good for the specification of data, but it cannot be used as a programming language. A good deal of designing a computational Aṣṭādhyāyī will have to focus on questions such as "How to implement (not merely specify) the notion X", where X may refer to things such as run-time evaluation of apavādas, dynamically creating any pratyāhāra from the Māheśvara Sūtras, or dealing with an atideśa sūtra so that a proper target rule is triggered. The power of a real, feature-rich programming language will be indispensable in such work.

Patel and Katuri (2016) have demonstrated the use of programming languages to derive subanta forms. Patel and Katuri have discovered a manual way to order rules (NLP ordering) for producing subantas according to Bhattojī Dikṣita's Vaiyākaraṇa Siddhāntakaumudī.
It is conceivable that as more rules are added to the system to derive other types of words, the NLP ordering may undergo a lot of change, and it may ultimately approach the order that comes about due to the Pāṇinian paribhāṣās and those compiled by Nāgojībhaṭṭa (see Kielhorn (1985)).

In the present paper we describe the construction of rules, the progress of a derivation, and the resolution of conflicts by modeling competitions between the rules in the ambit of the paribhāṣā पूर्वपरनित्यान्तरङ्गापवादानाम् उत्तरोत्तरं बलीयः and other such concepts.

⁵ Mishra (2010): 256, §4, "There is, however, a possibility to make the system aware of these divisions."

3 Phonetic Components

The phonetic payload, which comprises phonemes, is known as a Component. Listing 2 shows the implementation.⁶ A Component is made up of phonetically encoded Warnas. Some name-value pairs known as Tags give meta information about the Components. Usually, the tags contain saṃjñās. Over the course of a derivation, Components can undergo changes. At times, sūtras are required to test previous incarnations of a sthānī (substituend), so a list of previous forms of the Components is also retained. The current yield of the derivation at any step is in the State, which is a list of Components.

type Attribute v = (String, Maybe v)
type Tag = Attribute String

data Component = Component {
    cmpWarnas :: [Warna],
    cmpAttrs  :: [Tag],
    cmpOrigin :: [Component]
}

type State = [Component]

Listing 2: State

Listing 3 shows how भू + तिप् can be represented as an intermediate phonetic State. The Devanāgarī representations of भू and तिप् are converted into an internal representation of Warnas using the encode function. Suitable tags are applied to the components bhu and tip, and they are strung together in a list to create a State.

3.1 Tracing Components to Their Origins

The sūtra वर्तमाने लट् (A. 3.2.123) inserts a लँट् pratyaya after a dhātu.
This pratyaya will undergo changes and ultimately become ल् due to the application of the it-sūtras A. 1.3.2–9. लशक्वतद्धिते (A. 1.3.8) will mark the ल् as an इत्, causing its removal. However, A. 1.3.8 should not mark the ल् of a लँट् pratyaya as an इत्: the ल् in the ten lakāras is not an इत्. These lakāras should figure in A. 1.3.8 as an exception list so that A. 1.3.8 does not apply to them. However, other sūtras like A. 1.3.2, A. 1.3.3 and A. 1.3.9 may still apply, leaving back only ल्. If a list of the ten lakāras were kept as an exception list in A. 1.3.8, this remaining ल् would not match any one of them and would be liable to be dropped. Somehow, the ल् which remains from a lakāra needs to be traced back to the original lakāra.

egState = let bhu = Component (encode "भू")       -- phonemes
                              [("dhaatu",Nothing)] -- tags
                              []                   -- no previous history
              tip = Component (encode "तिप्")      -- phonemes
                              [("wibhakti",Nothing) -- tags
                              ,("parasmaipada",Nothing)
                              ,("ekawachana",Nothing)
                              ,("saarwadhaatuka",Nothing)
                              ,("pratyaya",Nothing)]
                              []                   -- no previous history
          in [bhu, tip]

Listing 3: An example of State

As shown in Listing 2, the datatype Component recursively contains a list of Components. The purpose of this list is to keep around previous forms of a Component. As a Component changes, its previous form, along with all attributes, is stored at the head of the list. This makes it easy to recover any previous form of a component and examine it. Listing 4 shows the traceOrigin function.

⁶ Excerpts of implementation details are shown in Haskell. References to variable names in computer code are shown in bold teletype font. In code listings the letter 'w' is used for व्. In other places the Roman transliteration, which prefers 'v' instead of 'w', is used.
In the case of A. 1.3.8, if calling traceOrigin on a ल् produces one of the ten lakāras, A. 1.3.8 does not mark such a ल् as an इत्. This way of tracing back Components to their previous forms can also help in the determination of a sthānī and its attributes under the influence of an atideśa sūtra like स्थानिवदादेशोऽनल्विधौ (A. 1.1.56).

traceOrigin :: Component -> [Component]
traceOrigin (Component ws as []) = []
traceOrigin (Component ws as os) =
    nub $ (concat . foldr getOrig [os]) os
  where getOrig c os = (traceOrigin c) : os

Listing 4: Tracing origin of a Component

4 Sūtras

According to one opinion in the Pāṇinian tradition, there are six different types of sūtras. The following verse enumerates them:⁷

संज्ञा च परिभाषा च विधिर्नियम एव च ।
अतिदेशोऽधिकारश्च षड्विधं सूत्रलक्षणम् ॥

⁷ Vedantakeshari (2001): 9–11.

The saṃjñā sūtras apply specific saṃjñās to grammatical entities based on certain indicatory marks found in the input. They help the vidhi sūtras bring about changes.

The real executive power of the Aṣṭādhyāyī lies in the vidhi, niṣedha and niyama sūtras. The vidhi sūtras bring about changes in the state of the derivation. The niṣedha and niyama sūtras are devices that prevent over-generation by the vidhi sūtras. They are strongly associated with specific vidhi sūtras and also share some of their conditioning information.

The paribhāṣā sūtras are subservient to the vidhi sūtras. They can be thought of as algorithmic helper functions which are called from many places in a computer program. In the spirit of the kāryakālapakṣa, the paribhāṣā sūtras are supposed to unite with vidhi sūtras to create a complete sūtra which produces a certain effect. The paribhāṣā sūtras need not be explicitly implemented, because their logic can be embedded into the vidhi sūtras.

The adhikāra sūtras create a context for the vidhi sūtras to operate in. From an implementation perspective, the context of an adhikāra can be built into the body of the vidhi, niṣedha or niyama sūtras, and therefore adhikāra sūtras need not be explicitly implemented.
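Returning to the origin-tracing device of Section 3.1, the provenance idea behind traceOrigin can be illustrated with a small self-contained sketch. The type Comp, its field names, and descendsFrom below are hypothetical simplifications for exposition (the paper's Warna encoding and Tag pairs are omitted); this is not the paper's actual code.

```haskell
-- Hypothetical, stripped-down analogue of the paper's Component:
-- a surface string, a list of tag names, and previous incarnations.
data Comp = Comp
  { cText   :: String
  , cTags   :: [String]
  , cOrigin :: [Comp]
  }

-- Collect every previous incarnation of a component, transitively.
traceOrigin :: Comp -> [Comp]
traceOrigin (Comp _ _ os) = os ++ concatMap traceOrigin os

-- Did any ancestor of this component carry the given tag?
descendsFrom :: String -> Comp -> Bool
descendsFrom t c = any (elem t . cTags) (traceOrigin c)

-- A laT pratyaya reduced to a bare l by the it-sūtras: the surviving
-- l still remembers, via its origin list, the lakāra it came from.
laT, justL :: Comp
laT   = Comp "laT" ["lakaara"] []
justL = Comp "l"   []          [laT]
```

With such provenance, a rule playing the role of A. 1.3.8 can check descendsFrom "lakaara" and refrain from marking this ल् as an इत्.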
(According to other opinions there are seven or even as many as eight types of sūtras, if the niṣedha and upavidhi types are also counted.)

The atideśa sūtras create situational analogies. By forming analogies, atideśa sūtras cause other vidhi sūtras to trigger. In this implementation we implement saṃjñā, vidhi and niyama sūtras. We have not implemented atideśa sūtras.

In traditional learning, every paribhāṣā sūtra is expected to be known in the place where it is taught. The effective meaning of a vidhi sūtra is known by resorting to the methods of yathoddeśapakṣa or kāryakālapakṣa. In one opinion, in the yathoddeśapakṣa the boundary of the sapādasaptādhyāyī and the tripādī presents an ideological barrier which cannot be crossed by the paribhāṣā sūtras, for reasons of being invisible. The kāryakālapakṣa has no such problem.⁸

We are inclined towards an implementation based on the kāryakālapakṣa, as it allows us to escape having to implement each and every paribhāṣā sūtra explicitly, and yet to enlist the numbers of the paribhāṣā sūtras which go into creating ekavākyatā (the fully expanded meaning). This choice allows for swift development with less clutter. Therefore, the paribhāṣā and adhikāra sūtras are not explicitly implemented.

Sūtras are defined as shown in Listing 5. In the derivation of the word भव॑ति॒,⁹ no niyama sūtras were encountered, so they are not implemented in this effort, but they could be implemented by adding a Niyama value constructor. The Widhi value constructor is used to represent all types of sūtras other than saṃjñā sūtras. The Samjnyaa value constructor is used to make saṃjñā sūtras. Both value constructors appear to be the same in terms of their parameters; only the name of the constructor differentiates them.
This is useful in pattern matching on sūtra values in the SCARE model (Section 8.4), which treats saṃjñā sūtras specially.

data Sutra = Widhi    { number :: SutraNumber, testing :: Testing }
           | Samjnyaa { number :: SutraNumber, testing :: Testing }

Listing 5: Definition of sūtra

⁸ See Kielhorn (1985), paribhāṣās 2 & 3: कार्यकालपक्षे तु त्रिपाद्यामप्युपस्थितिरिति विशेषः
⁹ The Ṛgvedic convention is used to show accent marks.

4.1 Testing the Conditions for the Application of Sūtras

In a computational Aṣṭādhyāyī, a sūtra must be able to sense certain conditions that exist in the input, and it should also be able to produce an effect. These are the two basic requirements any implementable sūtra must satisfy. The Testing field in Listing 5 refers to a datatype that has the testing functions slfTest and condTest. A sūtra will be able to produce its effect provided that slfTest returns True and condTest returns a function which can produce effects. More on this is explained in Section 7.1. Listing 6 shows the datatype Testing.

The function slfTest is used to prevent a sūtra from applying ad infinitum. Some sūtras produce an effect without any conditions. For example, परः संनिकर्षः संहिता (A. 1.4.109) defines the saṃjñā saṁhitā (the mode of continuous speech), which is not preconditioned by anything that can be sensed in the input. This sūtra could get applied and reapplied continuously, had it not been for the function slfTest: the slfTest function in sūtra A. 1.4.109 allows its application only if it was not applied earlier. Unlike A. 1.4.109, some sūtras produce effects which are conditioned upon things found in the input. For example, उदात्तादनुदात्तस्य स्वरितः (A. 8.4.66) will look for an udātta syllable followed by an anudātta one in saṁhitā and convert the anudātta into a svarita syllable. As long as there is no udātta followed by an anudātta in the input, A. 8.4.66 will not apply. A. 8.4.66 does not run the risk of being applied ad infinitum, because it is conditioned on things which can be sensed in the input.
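As an illustration of the kind of check condTest must encode, the condition and effect of A. 8.4.66 can be sketched on a toy accent sequence. This is a hedged simplification: accents are plain labels here rather than the paper's phonetically encoded Warnas, and the function names applicable8466 and apply8466 are invented for this sketch.

```haskell
-- Toy accent labels standing in for the paper's phonetic encoding.
data Accent = Udaatta | Anudaatta | Swarita
  deriving (Eq, Show)

-- condTest-style check for A. 8.4.66: is an udātta immediately
-- followed by an anudātta anywhere in the input?
applicable8466 :: [Accent] -> Bool
applicable8466 xs = any isPair (zip xs (drop 1 xs))
  where isPair (a, b) = a == Udaatta && b == Anudaatta

-- Effect-style rewrite: the first such anudātta becomes svarita.
apply8466 :: [Accent] -> [Accent]
apply8466 (Udaatta : Anudaatta : rest) = Udaatta : Swarita : rest
apply8466 (x : rest)                   = x : apply8466 rest
apply8466 []                           = []
```

Once the rewrite has happened, applicable8466 no longer holds on the result, which mirrors why a rule conditioned on something sensible in the input cannot loop.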
Therefore, function slfTest in A. 8.4.66 always returns True. Function condTest should test for the condition that an udātta is followed by an anudātta in samhitā, in which case it returns an effect function. Listing 6 shows the functions which each sūtra is expected to implement.

If a sūtra is inapplicable, condTest returns Nothing, which means no effects can be produced. In case a sūtra is applicable, condTest returns an effect function. Simply calling the effect function with the correct Environment parameter will produce an effect as part of a new Environment. Effect is simply a function that takes an Environment and produces a newer Environment. Each Sutra is expected to implement the functions slfTest and condTest.

    type Effect = TheEnv -> TheEnv

    data Testing = TestFuncs
        { slfTest  :: Environment Trace -> Bool
        , condTest :: Environment Trace ->
                          ([Attribute String], Maybe Effect)
        }

Listing 6: Definition of testing functions

4.2 Organization of sūtras

The sūtras are implemented as Haskell modules. Every sūtra module exports a details function. The details function gives access to the definition of the sūtra and also to the slfTest and condTest functions, which are required in other parts of the code. Listing 7 shows a rough sketch of sūtra A. 1.3.9. In addition, every sūtra will have to implement its own effects function.

    module S1_3_9 where

    -- imports omitted for brevity

    details = Widhi (SutraNumber 1 3 9)
                    (TestFuncs selfTest condTest)

    selfTest :: TheEnv -> Bool
    selfTest _ = True

    condTest :: TheEnv -> ([Attribute String], Maybe Effect)
    condTest env = -- details omitted for brevity

    effects :: Effect
    effects env = -- details omitted for brevity

Listing 7: Sūtra module

5 The Ecosystem for Execution of Sūtras

The next sūtra applicable in a derivation takes its input from the result produced by the previous one.
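The slfTest/condTest contract of Section 4.1 can be illustrated with a toy rule outside the grammar proper. The following is a hedged sketch, not code from the paper: the environment is simplified to a plain String, and the "rule" uppercases the first lowercase character it finds, so that the two-stage check and the returned effect function can be seen in isolation.

```haskell
import Data.Char (isLower, toUpper)

-- Simplified stand-ins for the paper's types: the "environment"
-- is just a String, and an Effect rewrites it.
type Env    = String
type Effect = Env -> Env

-- slfTest: a self-check independent of the triggering conditions.
-- Here the rule refuses to run on an empty environment.
slfTest :: Env -> Bool
slfTest = not . null

-- condTest: senses a triggering condition in the environment and,
-- when it holds, returns the effect to apply (Nothing otherwise).
condTest :: Env -> Maybe Effect
condTest env
  | any isLower env = Just upperFirst
  | otherwise       = Nothing
  where
    upperFirst e = case break isLower e of
                     (pre, c:rest) -> pre ++ toUpper c : rest
                     (pre, [])     -> pre

-- One application step, mirroring the two-stage check.
step :: Env -> Env
step env
  | slfTest env = maybe env ($ env) (condTest env)
  | otherwise   = env

main :: IO ()
main = putStrLn (step "abc")  -- the condition holds, prints Abc
```

Once no lowercase character remains, condTest returns Nothing and the rule stops firing, which is the behaviour the paper attributes to condition-driven sūtras such as A. 8.4.66.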
Due to the siddha/asiddha notion holding between certain sūtras, it can generally be said that the input for the next sūtra may come from any of the previously generated results. This section discusses the ecosystem in which sūtras get their inputs.

5.1 Frames

As each sūtra applies in a derivation, some information about it is captured in a record known as a Frame. Some of the information, such as all the Conflicts, is captured for reporting. The two important constituents of a Frame are the Sutra which was applied and the State it produced. As a derivation progresses, the output State from one sūtra becomes the input State of another.

Listing 8 shows Frame as an abstract datatype which expects two types, conflict and sutra, to create a concrete datatype. Frame (Conflict Sutra) Sutra is the realization of a concrete type which is, for convenience, called TheFrame. The Trace is merely a list of TheFrames. It is meant to show a step-by-step account of the derivation.

    data Frame conflict sutra = Frame
        { frConflicts :: [conflict]
        , frSutra     :: Maybe sutra
        , frOutput    :: State
        }

    type TheFrame = Frame (Conflict Sutra) Sutra

    type Trace = [TheFrame]

Listing 8: The Trace

5.2 Environment

To produce an effect, some sūtras look at indicators in what is fed as input. A sūtra such as सार्वधातुकार्धधातुकयोः (A. 7.3.84) is expected to convert an इक् letter at the end of the aṅga into a guṇa letter, provided that a sārvadhātuka or ārdhadhātuka pratyaya follows.

    -- input to A. 7.3.84
    [Component (encode "भू") [("dhaatu",Nothing)] [],
     Component (encode "तिप्")
               [("saarwadhaatuka",Nothing),
                ("pratyaya",Nothing)] []]

Listing 9: An input to sūtra A. 7.3.84

    -- output from A. 7.3.84
    [Component (encode "भो") [("dhaatu",Nothing)] [],
     Component (encode "तिप्")
               [("saarwadhaatuka",Nothing),
                ("pratyaya",Nothing)] []]

Listing 10: An output from sūtra A. 7.3.84

If this sūtra is fed an input such as the one shown in Listing 9, all the necessary conditions can be found in this input, viz.
there is an इक् at the end of the aṅga, followed by the sārvadhātuka pratyaya तिप्. Now that its required conditions have been fulfilled, A. 7.3.84 produces an effect such as the one shown in Listing 10. Thus, the input becomes an environment that is looked at by the sūtras to check for any triggering conditions. A sūtra may need to look past its input into the input of some previously triggered sūtras. Generalizing this, an environment consists of the outputs produced by all sūtras in the derivation thus far. The Trace data structure (see Section 5.1) therefore becomes a very important constituent of Environment.

There are sūtras which produce an effect conditioned by what the speaker intends to say. वर्तमाने लट् (A. 3.2.123), for example, will be fed an input which may contain entities like a dhātu along with other saṃjñās associated with it. However, the specific lakāra, into which the dhātu must be cast, can be known only from the intention of the speaker. Unless it can be sensed that the speaker wishes to express a vartamāna form, A. 3.2.123 cannot be applied. Thus, some sūtras are conditioned on what is in the vivakṣā. As shown in Listing 11, Wiwakshaa is modelled as a list of name-value Tags. The attr function is a helper which creates a Tag. This listing shows a vivakṣā for creating a 3rd person, singular, present tense, active voice form of some dhātu in samhitā mode.

    type Wiwakshaa = [Tag]

    egWiwakshaa = [attr gana "1",
                   attr purusha "1",
                   attr wachana "1",
                   attr lakaara "wartamaana",
                   attr prayoga "kartari",
                   attr samhitaa "yes"]

Listing 11: Tags to describe vivakṣā

In the case of certain sūtras the triggering conditions remain intact forever. Such sūtras tend to get applied repeatedly. To allow application only once, housekeeping attributes have to be maintained. The housekeeping attributes may be checked by the slfTest function in the sūtras, and reapplication can be prevented. Consider वर्तमाने लट् (A. 3.2.123) once again.
The vivakṣā will continue to have vartamāna in it. As such, A. 3.2.123 can get reapplied continuously. While producing the effect of inserting laṭ, A. 3.2.123 could create a housekeeping tag, say "(3.2.123)". If A. 3.2.123 were to apply only in the absence of the housekeeping attribute "(3.2.123)", the reapplication could be controlled using a suitably coded slfTest function. Such Housekeeping is also part of the environment. Just like Wiwakshaa, it is represented as a list of Tags. The entire representation of Environment is shown in Listing 12. It is a parameterized type that expects a type t to create an environment from. Environment Trace is a concrete type which is given the alias TheEnv.

    data Environment t = Env
        { envWiwakshaa :: Wiwakshaa
        , envHsekpg    :: Housekeeping
        , envTrace     :: t
        }

    type TheEnv = Environment Trace

Listing 12: The Environment

6 Phasing

Samuel Johnson said that language is the dress of thought. Indeed, Pāṇini's generative grammar derives a correct utterance from an initial thought pattern. The seeds of the finished linguistic forms are sown very early in the process of derivation. Morphemes are gradually introduced depending on certain conditions and are ultimately transformed into final speech forms. It seems that linguistic forms pass through definite stages. This is a crude approximation of the derivation process: laying down the seed form from semantic information in the vivakṣā, producing aṅgas, producing padas, and finally making some adjustments for the samhitā mode of utterance. At each step along the way, there could be several sūtras that may apply. Grammarians call this situation a prasaṅga.10 However, only one sūtra can be applied in a prasaṅga. When the most suitable sūtra gets applied, it is said to have become pravṛtta.
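The attr helper used in Listing 11 and the housekeeping check on A. 3.2.123 described in Section 5.2 are not spelled out in the paper. The following is a hypothetical reconstruction, under the simplifying assumption that a Tag is a plain name-value string pair (the paper's own Tag representation may differ):

```haskell
-- Hypothetical reconstruction; the paper does not define Tag, attr,
-- or the housekeeping-based self-test. Names here are illustrative.
type Tag          = (String, String)
type Housekeeping = [Tag]

-- attr builds a name-value Tag.
attr :: String -> String -> Tag
attr name value = (name, value)

-- In the spirit of A. 3.2.123: the sutra may apply only if its
-- own marker tag is absent from the housekeeping list.
slfTest_3_2_123 :: Housekeeping -> Bool
slfTest_3_2_123 hk = attr "(3.2.123)" "applied" `notElem` hk

-- Applying the sutra records the marker, blocking reapplication.
recordApplication :: Housekeeping -> Housekeeping
recordApplication hk = attr "(3.2.123)" "applied" : hk

main :: IO ()
main = do
  print (slfTest_3_2_123 [])                      -- True: may apply
  print (slfTest_3_2_123 (recordApplication []))  -- False: blocked
```

The point of the sketch is only that a one-shot sūtra can be throttled by consulting the Housekeeping part of the environment, exactly as the prose describes.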
To make the resolution of a prasaṅga relatively simple, sūtras apparently belonging to the later stages should not get applied earlier in the derivation, even if they have scope to apply.

Phasing is a method to minimize the number of sūtras that participate in a prasaṅga. Those saṃjñā sūtras which form the basis of certain adhikāra sūtras are deferred until later in the derivation process. For instance, परः संनिकर्षः संहिता (A. 1.4.109) creates a basis for the samhitāyām adhikāra, which begins from तयोर्य्वावचि संहितायाम् (A. 8.2.108). Similarly, यस्मात् प्रत्ययविधिस्तदादि प्रत्ययेऽङ्गम् (A. 1.4.13) applies the saṃjñā aṅga to something when there is a pratyaya after it. The saṃjñā aṅga creates a basis for the adhikāra sūtra अङ्गस्य (A. 6.4.1). If the sūtras which apply certain saṃjñās are suppressed in the beginning of the derivation and are released subsequently, other vidhi sūtras operating within certain adhikāras will not participate in an untimely prasaṅga. For example, phasing परः संनिकर्षः संहिता (A. 1.4.109) will defer sūtras like उदात्तादनुदात्तस्य स्वरितः (A. 8.4.66) till a later time.

A Phase has a name and contains a list of sūtras which make up that phase. Listing 13 defines a Phase and creates a list called phases containing two phases: "pada" and "samhitaa". The way phases is defined, the pada formation phase (due to A. 1.4.14) and the samhitā formation phase (due to A. 1.4.109) will be deferred till a later time.

    data Phase a = Phase
        { phsName :: String  -- name of phase
        , phsNums :: [a]     -- sutras in the phase
        }

    phases = [Phase "pada"     [SutraNumber 1 4 14],
              Phase "samhitaa" [SutraNumber 1 4 109]]

Listing 13: Definition of Phase

10 See Abhyankar and Shukla (1961), pages 271 and 273.

7 The Process of Derivation

The details of Sutras are collected in a list called the ashtadhyayi.
For brevity, a small representation is shown in Listing 14.

    ashtadhyayi :: [Sutra]
    ashtadhyayi = [S1_3_9.details,
                   S1_3_78.details,
                   S1_4_99.details,
                   S1_4_100.details]

Listing 14: ashtadhyayi, a list of sūtras

Before the process of derivation begins, sūtras which are part of some phase are removed from the ashtadhyayi. The generation of derived forms will continue as long as applicable sūtras are found in the ashtadhyayi. When sūtras are no longer applicable, the sūtras from a phase are added to the ashtadhyayi. Adding one phase back to the ashtadhyayi holds out the possibility of new sūtras becoming applicable. The process of derivation then continues once again until no more sūtras are applicable. The process of adding back sūtras from other phases continues until there are no more phases left to add. Listing 15 shows this way of using phases to generate the derived form.

    -- remove phases from the ashtadhyayi
    ashtWithoutPhases = filterSutras
                            (predByPhases allPhases)
                            ashtadhyayi

    -- generate the word form using phases
    generateUsingPhases :: TheEnv -> [Sutra] -> [Phase Sutra]
                        -> TheEnv
    generateUsingPhases env sutras phases =
        foldl' (gen sutras) newEnv phases
      where
        gen sutras env phase = generate env (phsNums phase ++ sutras)
        newEnv = generate env sutras

    -- the returned environment will contain the derivation
    finalEnv = generateUsingPhases env ashtWithoutPhases allPhases

Listing 15: Generation using phases

Function generateUsingPhases uses the generate function to actually advance the derivation. Given an Environment and a set of Sutras, the process of generating a linguistic form consists of picking out all the applicable Sutras that have prasaṅga. The Sutras then get prioritized so that only one Sutra can become pravṛtta. The chosen Sutra is invoked to produce a new Environment.
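The helpers filterSutras and predByPhases used in Listing 15 are not defined in the excerpt. A plausible sketch, under the assumption that a sūtra can be identified here by its (adhyāya, pāda, number) triple alone, is:

```haskell
-- Hypothetical reconstruction of the phase-filtering helpers;
-- a sutra is reduced to its number for the purpose of this sketch.
data SutraNumber = SutraNumber Int Int Int deriving (Eq, Show)

data Phase a = Phase { phsName :: String, phsNums :: [a] }

-- A sutra passes the predicate if it belongs to no phase.
predByPhases :: [Phase SutraNumber] -> SutraNumber -> Bool
predByPhases ps n = n `notElem` concatMap phsNums ps

-- Keep only the sutras satisfying the predicate.
filterSutras :: (SutraNumber -> Bool) -> [SutraNumber] -> [SutraNumber]
filterSutras = filter

allPhases :: [Phase SutraNumber]
allPhases = [ Phase "pada"     [SutraNumber 1 4 14]
            , Phase "samhitaa" [SutraNumber 1 4 109] ]

main :: IO ()
main = print (filterSutras (predByPhases allPhases)
                           [SutraNumber 1 3 9, SutraNumber 1 4 109])
-- A. 1.4.109 is held back for the "samhitaa" phase; A. 1.3.9 remains.
```

This matches the prose: phased sūtras are withheld from the working ashtadhyayi at the start and released phase by phase as the derivation stalls.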
This process can continue until no more Sutras apply, in which case the derivation steps are shown using traceShow. The function generate embodies this logic, as shown in Listing 16.

    generate :: TheEnv -> [Sutra] -> TheEnv
    generate env sutras
        | null (envWiwakshaa env) || null (envTrace env)
          || null sutras = env
        | otherwise =
            if isJust chosen
                then generate newEnv sutras
                else traceShow sutras env
      where
        list   = choose env sutras
        chosen = prioritize env (testBattery env) list
        newEnv = invoke env chosen

Listing 16: Generation of a derived form

7.1 Choosing the Applicable sūtras

Given the list ashtadhyayi as defined in Section 7 and a starter Environment, each and every Sutra in the ashtadhyayi is tested for applicability. The applicability test is shown in Listing 17. A Sutra will be chosen if it clears a two-stage condition check. In the initial stage, slfTest checks whether any Housekeeping attributes in the Environment prevent the Sutra from applying. If the initial stage is cleared, the second stage invokes the condTest function of the Sutra. condTest checks the Trace in the Environment for the existence of conditions specific to the sūtra. In case the conditions exist, condTest returns a collection of Tags and a function, say eff, to produce the effects. See Section 4.1 to read more about slfTest and condTest.

The function choose, shown in Listing 17, uses the test described above. All Sutras in the ashtadhyayi for which test returns an effects function eff are collected and returned as a list. All sūtras in the list are applicable and have a prasaṅga.
This list of sūtras has to be prioritized so that only one sūtra can be invoked.

    choose :: TheEnv -> [Sutra] -> [(Sutra, [Tag], Effect)]
    choose env ss =
        [fromJust r | r <- res, isJust r]
      where
        res = map appDetails ss

        appDetails :: Sutra -> Maybe (Sutra, [Tag], Effect)
        appDetails sut = case test env sut of
            (_, Nothing)      -> Nothing
            (conds, Just eff) -> Just (sut, conds, eff)

    test :: TheEnv -> Sutra -> ([Tag], Maybe Effect)
    test e s | null (envTrace e) = ([], Nothing)
             | otherwise = if slfTest testIfc e
                               then condTest testIfc e
                               else ([], Nothing)
      where testIfc = testing s

Listing 17: Choosing the applicable sūtras

8 Prioritizing sūtras

As the derivation progresses, many sūtras can form a prasaṅga, for their triggering conditions are satisfied in the environment. It is the responsibility of the grammar to decide which sūtra becomes pravṛtta, while ensuring that the derivation does not loop continuously.

8.1 Avoiding Cycles

It may so happen that of all the sūtras in a prasaṅga, the one that has been chosen to become pravṛtta, say Sc, produces an Environment, say Ei, that already exists in the Trace. In case Ei is reproduced, a cycle will be introduced which will cause the derivation not to terminate. While prioritizing, such sūtras, which would reproduce an already produced Environment, must be filtered out.

8.2 Competitions for Conflict Resolution

The chosen sūtras can be thought of as competing with one another. If there are n sūtras, there will be (n-1) competitions. The first and the second sūtra will compete against one another. The winner among the two will compete against the third, and so on, until we are left with only one sūtra. The sūtra which triumphs becomes pravṛtta, for it is the strongest among all those which had a prasaṅga. This view of conflict resolution is shown in Figure 1.

[Figure 1: Competitions among sūtras. S1 to Sn are the competing sūtras.]

We model a competition as a match between two entities. The match can end in a draw or produce a winner.
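The chain of (n-1) competitions described in Section 8.2 is essentially a left fold with the current winner as the accumulator. The following is a sketch of that idea, not the paper's code: the full Resolution machinery is collapsed into a single comparison function which, for illustration, applies only the paratva bias.

```haskell
-- Sketch of the (n-1)-competition tournament as a left fold.
-- 'match' stands in for a full Competition; here it simply
-- prefers the later sutra number (the paratva bias alone).
data SutraNumber = SutraNumber Int Int Int deriving (Eq, Ord, Show)

match :: SutraNumber -> SutraNumber -> SutraNumber
match a b
  | b > a     = b   -- the later sutra wins (paratva)
  | otherwise = a   -- on a draw, the current winner stays in the fray

-- (n-1) successive matches over n competing sutras.
tournament :: [SutraNumber] -> Maybe SutraNumber
tournament []     = Nothing
tournament (s:ss) = Just (foldl match s ss)

main :: IO ()
main = print (tournament [ SutraNumber 1 3 9
                         , SutraNumber 8 4 66
                         , SutraNumber 1 4 109 ])
-- prints: Just (SutraNumber 8 4 66)
```

In the paper's design the single comparison would instead consult the whole biased battery of Section 8.3, but the fold structure of the tournament is the same.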
As shown in Listing 18, Conflict is defined as an abstract type which contains a Resolution. The Resolution gives the Result and has a provision to note a Reason for that specific outcome of the match.

    {- Following are abstract types.
       Concrete realizations such as 'Conflict Sutra' and
       'Resolution Sutra' are used in code. -}

    data Conflict a   = Conflict a a (Resolution a)
    data Resolution a = Resolution (Result a) Reason
    data Result a     = Draw | Winner a
    type Reason       = String

Listing 18: The Conflict between sūtras

The actual competition is represented as a function which takes two Sutras and produces a Resolution, as shown in Listing 19.

    type Competition a = a -> a -> Resolution a

Listing 19: Match between sūtras

8.3 Competitions and Biases

A sūtra Si is eliminated as soon as it loses out to another sūtra Sj, and it never participates in any other competition in the prasaṅga. One might object to this methodology of conducting competitions by suggesting that Si could have debarred another sūtra Sk later on, and that it is therefore important to keep Si in the fray. In fact, the objector could claim that all sūtras must compete with one another before a sūtra can become pravṛtta. We note that the objector's method of holding competitions would be useful only if the competition between Si and Sk produced a random winner every time. In fact, the competitions in this grammar are biased. They do not give both sūtras an equal chance of winning, and this is intentional in the design of Pāṇini's grammar. In the presence of biases, what is the use of conducting fair competitions? Therefore the proposed method of holding competitions should be acceptable.

The biases are introduced by the maxim पूर्वपरनित्यान्तरङ्गापवादानाम् उत्तरोत्तरं बलीयः.11 One sūtra is stronger than another by way of four tests, namely paratva, nityatva, antaraṅgatva and apavādatva. The paratva test says that, among any two sūtras, the one which is placed later in the Aṣṭādhyāyī wins.
According to the nityatva test, among two competing sūtras, the one that has prasaṅga in spite of the other being applied first is the winner. The principle of antaraṅgatva dictates that of two conflicting sūtras, the one that wins relies on relatively fewer nimittas (conditions), or on nimittas which are internal to something. Finally, the test of apavādatva teaches that, among two conflicting sūtras, a special sūtra wins over a general one, to prevent niravakāśatva (total inapplicability) of the special sūtra. The four tests are such that a later one is a stronger determiner of the winning sūtra than the prior ones. The test of apavādatva has the highest priority and paratva the lowest. If the test of apavādatva produces a winner, the other three tests need not be applied. If antaraṅgatva produces a winner, the other two tests need not be administered. If nityatva produces a winner, we need not check paratva.

    testBattery :: TheEnv -> [Competition Sutra]
    testBattery e | null (envTrace e) = []
                  | otherwise = [ scareTest e
                                , apawaada e
                                , antaranga e
                                , nitya e
                                , para e
                                ]

    -- various tests; details not shown for brevity
    scareTest :: TheEnv -> Competition Sutra
    para      :: TheEnv -> Competition Sutra
    nitya     :: TheEnv -> Competition Sutra
    antaranga :: TheEnv -> Competition Sutra
    apawaada  :: TheEnv -> Competition Sutra

Listing 20: A battery of tests to choose a winning sūtra

11 See Kielhorn (1985): १९, paribhāṣā 38.
In the worst-case scenario, all four tests have to be applied one after another to a pair of conflicting sūtras.

To seek a winning sūtra among two that compete with each other, a prioritization function administers a battery of tests, beginning with apavādatva (see Listing 20).

8.4 Sūtra-Conflict Assessment and Resolution Extension (SCARE)

The maxim पूर्वपरनित्यान्तरङ्गापवादानाम् उत्तरोत्तरं बलीयः introduces four methods of conflict resolution, as explained in Section 8.3. In tradition, these methods expect that saṃjñās have already been applied by resorting to either kāryakālapakṣa or yathoddeśapakṣa. However, computation differs from tradition in that the assumptions made in the traditional approach need to be explicitly executed in computation. In a computational Aṣṭādhyāyī, the methods introduced by the maxim will work provided that all saṃjñās have already applied. Somehow, saṃjñās need to apply before any of the methods in the maxim apply. SCARE prioritizes saṃjñā sūtras higher than any other type of sūtra. When any other type of sūtra competes with a saṃjñā sūtra, the saṃjñā sūtra wins. In case saṃjñā sūtras compete with one another, all of them will eventually get a chance to apply. Since SCARE is required to differentiate between saṃjñā and non-saṃjñā sūtras, the Sutra datatype has an explicit value constructor called Samjnyaa. Listing 21 shows the implementation of scareTest.

    scareTest :: TheEnv -> Sutra -> Sutra -> Resolution Sutra
    scareTest e s1@(Samjnyaa _ _) _ =
        Resolution (Winner s1) "SCARE"
    scareTest e _ s2@(Samjnyaa _ _) =
        Resolution (Winner s2) "SCARE"
    scareTest e s1 s2 = Resolution Draw "SCARE"

Listing 21: SCARE Test

8.5 Determination of apavāda sūtras

An apavāda relationship holds between a pair of sūtras when the general provision of one sūtra is overridden by the special provision of another sūtra. Arbitrary pairs of sūtras may not always have an apavāda relation between them. There has to be a reason for an apavāda relationship to exist between two sūtras.
The niṣedha and niyama sūtras suggest something which can run counter to what other sūtras say. Therefore they are called apavādas of other sūtras.

A niṣedha sūtra, say न विभक्तौ तुस्माः (A. 1.3.4), is considered an apavāda of a general one such as हलन्त्यम् (A. 1.3.3). A niṣedha sūtra directly advises against taking an action suggested by another sūtra.12 Another type of rule, the niyama sūtra, does the work of regulating some operation laid down by another rule.13 A niyama sūtra such as धातोस्तन्निमित्तस्यैव (A. 6.1.80) is considered an apavāda of a more general rule such as वान्तो यि प्रत्यये (A. 6.1.79).

Wherever an apavāda relationship holds between two sūtras, it is static and unidirectional. Also, there can be apavādas of apavādas. Since the corpus of Pāṇini's rules is well known, the apavāda relationships can be worked out manually to build a mapping of the apavādas. Thus, A. 1.3.4 is considered an apavāda of A. 1.3.3, and A. 6.1.80 is considered an apavāda of A. 6.1.79.

Listing 22 shows how the apavādas are set up. The datatype Apawaada records a main sūtra and notes all the apavādas of the main sūtra. allApawaadas is a list of Apawaadas which is converted into a map, apawaadaMap, for faster access. Function isApawaada is used to check if sūtra s1 is an apavāda of sūtra s2. The apawaada function in Listing 20 uses the isApawaada function to return an apavāda, or returns a draw if an apavāda relationship does not exist between the sūtras.

9 Visibility

The canonical term siddha means that the output from one sūtra A is visible to, and can possibly trigger, another sūtra B. The effects of A are visible to B. The sūtras in the tripādī are not siddha in the sapādasaptādhyāyī. This means that even if sūtras from the tripādī have applied in a derivation, the results produced are not visible to the sūtras in the sapādasaptādhyāyī.
In other words, the State produced by certain sūtras cannot trigger sūtras in a specific region of the Aṣṭādhyāyī.

    data Apawaada = Apawaada
        { apwOf        :: SutraNumber
        , apwApawaadas :: [SutraNumber]
        }

    allApawaadas = [Apawaada (SutraNumber 1 3 3)
                             [SutraNumber 1 3 4]]

    apawaadaMap = M.fromList [(apwOf a, apwApawaadas a)
                             | a <- allApawaadas]

    -- | Is sutra s1 an apawaada of s2?
    isApawaada :: SutraNumber -> SutraNumber -> Bool
    isApawaada s1 s2 = case M.lookup s2 apawaadaMap of
        Just sns -> s1 `elem` sns
        _        -> False

Listing 22: Setting up apavādas

Listing 23 shows the visibility as dictated by पूर्वत्रासिद्धम् (A. 8.2.1). The entire derivation thus far is captured as a Trace containing frames Fn through F1. The problem is this: given a sūtra, say Si, the latest Frame Fj is required such that Fj contains a sūtra Sk whose output State is visible to Si. If Si is from the sapādasaptādhyāyī, frames from the head of the Trace are skipped as long as they contain sūtras from the tripādī. Such frames will be asiddha for Si. If, however, Si is from the tripādī, frames from the trace are skipped so long as the sūtra in the frame has a number higher than Si. This is because in the tripādī later sūtras are asiddha to prior ones. The foregoing logic is implemented in the function visibleFrame.

    visibleFrame :: Sutra -> Trace -> Maybe TheFrame
    visibleFrame _ [] = Nothing
    visibleFrame s (f@(Frame _ (Just s1) _):fs) =
        if curr < s8_2_1
            then if top < s8_2_1
                     then Just f
                     else visibleFrame s fs
            else if top > curr
                     then visibleFrame s fs
                     else Just f
      where
        s8_2_1 = SutraNumber 8 2 1
        curr   = number s
        top    = number s1

Listing 23: Visibility

12 पूर्वसूत्रकार्यनिषेधकसूत्रं निषेधसूत्रम्।
13 सिद्धे सति आरभ्यमाणो विधिः नियमाय कल्पते।

10 Conclusion and Future Work

A computational Aṣṭādhyāyī can potentially become a good pedagogical resource for teaching the grammatical aspects of Sanskrit. As a building block, a
As a building block, acomputational Aṣṭādhyāyī can be used to build other systems like morpho-logical analyzers and dependency parsers.From an implementation perspective, resorting to kāryakālapakṣa allowsparibhāṣā and adhikāra sūtras to be merged into the logic of vidhi or niyamasūtras. While displaying the derivation after it is completed, the concernedvidhi and niyama sūtras can always enlist numbers of the paribhāṣā andadhikāra sūtras which they have united with. This allows for faster de-28 Sohoni and Kulkarnivelopment of the computational Aṣṭādhyāyī without having to implementseemingly trivial sūtras in the paradigm used to implement sūtras as notedin Section 4.2.It could be suboptimal to represent all grammatical entities asComponents having Warnas and Tags. To a certain extent, it increasesthe number of Tags applied to Components. For example, since pratyayasare expressed as a Components, a ‘pratyaya’ Tag has to be applied. Func-tional languages have extremely powerful type systems. To leverage thetype system, Components can be implemented as typed grammatical en-tities. For instance, a Component can have more value constructors forUpasarga, Dhaatu and Pratyaya, to name a few.Abstracting the input as vivakṣā does away with the need of applyingheuristics to determine what needs to be derived. However, our choice ofrepresenting Wiwakshaa as a simple list of Tags is an oversimplification.The vivakṣā could be a complex psycholinguistic artifact which may containelements such as the kārakas, hints for using specific dhātus, argument struc-ture of dhātu etc. It may have a sophisticated data structure. A thoroughstudy of semantic aspects of Aṣṭādhyāyī is necessary to know what vivakṣāmay look like in its entirety.In the बाधबीजूकरणम ्, Kielhorn (1985) discusses many variations undereach of the four methods introduced in the para-nitya paribhāṣā. Thosevariations should be plugged into the framework discussed in Section 8. 
Yet, there may be instances of derivations where the maxim पूर्वपरनित्यान्तरङ्गापवादानाम् उत्तरोत्तरं बलीयः may not be honoured, and a better way is required to resolve sūtra conflicts in totality. Effects such as vipratiṣedha and pūrvavipratiṣedha also need to be included in SCARE.

References

Abhyankar, K. V. and J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Oriental Institute, Baroda.

Ajotikar, Tanuja, Anuja Ajotikar, and Peter M. Scharf. 2016. "Some issues in formalizing the Aṣṭādhyāyī". In: Sanskrit and Computational Linguistics, Select papers presented in the 'Sanskrit and the IT World' Section at the 16th World Sanskrit Conference, (June 28 - 2 July 2015) Bangkok, Thailand. Ed. by Amba Kulkarni. DK Publishers Distributors Pvt. Ltd (New Delhi), pp. 103–124. ISBN: 978-81-932319-0-6.

Goyal, Pawan, Amba P. Kulkarni, and Laxmidhar Behera. 2008. "Computer Simulation of Astadhyayi: Some Insights". In: Sanskrit Computational Linguistics, First and Second International Symposia, Rocquencourt, France, October 29-31, 2007, Providence, RI, USA, May 15-17, 2008, Revised Selected and Invited Papers. Ed. by Gérard P. Huet, Amba P. Kulkarni, and Peter M. Scharf. Springer, pp. 139–161. DOI: 10.1007/978-3-642-00155-0_5.

Hyman, Malcolm D. 2009. "From Pāṇinian sandhi to finite state calculus". In: Sanskrit Computational Linguistics. Springer, pp. 253–265.

Kielhorn, F. 1985. Paribhāṣenduśekhara of Nāgojībhaṭṭa. Parimala Publications, Delhi.

Mishra, Anand. 2008. "Simulating the Paninian System of Sanskrit Grammar". In: Sanskrit Computational Linguistics, First and Second International Symposia, Rocquencourt, France, October 29-31, 2007, Providence, RI, USA, May 15-17, 2008, Revised Selected and Invited Papers. Ed. by Gérard P. Huet, Amba P. Kulkarni, and Peter M. Scharf. Springer, pp. 127–138.

— 2009. "Modelling the Grammatical Circle of the Paninian System of Sanskrit Grammar". In: Sanskrit Computational Linguistics, Third International Symposium, Hyderabad, India, January 15-17, 2009.
Proceedings. Ed. by Amba P. Kulkarni and Gérard P. Huet. Springer, pp. 40–55.

— 2010. "Modelling Astadhyayi: An Approach Based on the Methodology of Ancillary Disciplines (Vedanga)". In: Sanskrit Computational Linguistics - 4th International Symposium, New Delhi, India, December 10-12, 2010. Proceedings. Ed. by Girish Nath Jha. Springer, pp. 239–258.

Patel, Dhaval and Shivakumari Katuri. 2016. "Prakriyāpradarśinī - An open source subanta generator". In: Sanskrit and Computational Linguistics, Select papers presented in the 'Sanskrit and the IT World' Section at the 16th World Sanskrit Conference, (June 28 - 2 July 2015) Bangkok, Thailand. Ed. by Amba Kulkarni. DK Publishers Distributors Pvt. Ltd (New Delhi), pp. 195–221. ISBN: 978-81-932319-0-6.

Scharf, P. 2009. "Rule selection in the Aṣṭādhyāyī or Is Pāṇini's grammar mechanistic". In: Proceedings of the 14th World Sanskrit Conference, Kyoto University, Kyoto.

Scharf, Peter M. 2009. "Modeling pāṇinian grammar". In: Sanskrit Computational Linguistics. Springer, pp. 95–126.

Scharf, Peter M. 2016. "An XML formalization of the Aṣṭādhyāyī". In: Sanskrit and Computational Linguistics, Select papers presented in the 'Sanskrit and the IT World' Section at the 16th World Sanskrit Conference, (June 28 - 2 July 2015) Bangkok, Thailand. Ed. by Amba Kulkarni. DK Publishers Distributors Pvt. Ltd (New Delhi), pp. 77–102. ISBN: 978-81-932319-0-6.

Sohoni, Samir Janardan and Malhar A. Kulkarni. 2016. "Character Encoding for Computational Aṣṭādhyāyī". In: Sanskrit and Computational Linguistics, Select papers presented in the 'Sanskrit and the IT World' Section at the 16th World Sanskrit Conference, (June 28 - 2 July 2015) Bangkok, Thailand. Ed. by Amba Kulkarni. DK Publishers Distributors Pvt. Ltd (New Delhi), pp. 125–155. ISBN: 978-81-932319-0-6.

Vedantakeshari, Swami Prahlad Giri. 2001. Pāṇiniya Aṣṭādhyāyī Sūtrapāṭha. Krishnadas Academy, Varanasi.
2nd edition.

PAIAS: Pāṇini Aṣṭādhyāyī Interpreter As a Service

Sarada Susarla, Tilak M. Rao and Sai Susarla

Abstract: It is widely believed that Pāṇini's Aṣṭādhyāyī is the most accurate grammar and word-generation scheme for a natural language there is. Several researchers have attempted to validate this hypothesis by analyzing Aṣṭādhyāyī's sūtra system from a computational / algorithmic angle. Many have attempted to emulate Aṣṭādhyāyī's word generation scheme. However, prior work has succeeded in taking only small subsets of the Aṣṭādhyāyī pertaining to specific constructs and manually coding their logic for linguistic analysis.

However, there is another school of thought that Aṣṭādhyāyī itself (along with its associated corrective texts) constitutes a complete, unified, self-describing solution for word generation (kṛt, taddhita), compounding (samāsa) and conjugation (sandhi). In this paper, we describe our ongoing effort to directly compile and interpret Aṣṭādhyāyī's sūtra corpus (with its associated data sets) to automate its prakṛti-pratyaya-based word transformation methodology, leaving out kārakas. We have created a custom machine-interpretable language in JSON for Aṣṭādhyāyī, a Python-based compiler to automatically convert Aṣṭādhyāyī sūtras into that language, and an interpreter to reproduce Aṣṭādhyāyī's prakriyā for term definitions, meta-rules and vidhis. Such an interpreter has great value in analyzing the generative capability of Pāṇinian grammar, assessing its completeness or anomalies, and the contributions of various commentaries to the original methodology. We avoid manually supplying any data derivable directly from Aṣṭādhyāyī. Unlike existing work that aimed at fast interpretation of rules, we focus initially on fidelity to Aṣṭādhyāyī. We have started with a well-annotated online Aṣṭādhyāyī resource. We are able to automatically enumerate the character sequences denoted by saṃjñās defined in Aṣṭādhyāyī, and determine which paribhāṣā sūtras apply to which vidhi sūtras. We are in the process of developing a generic rūpa-siddhi engine starting from a prakṛti-pratyaya sequence. Our service named PAIAS1 provides programmatic access to Aṣṭādhyāyī, its data sets and their interpretation via an open RESTful API for third-party tool development.

1 Introduction

There is growing interest and activity in applying computing technology to unearth the knowledge content of India's heritage literature, especially in the Saṃskṛt language. This has led to several research efforts to produce analysis tools for Saṃskṛt language content at various levels of text, syntax, semantics and meaning: Goyal, Huet, et al. (2012), Oliver Hellwig (2009), Huet (2002), Kulkarni (2016), and Kumar (2012). The word-generating flexibility and modular nature of the Saṃskṛt grammar make it at once both simpler and more difficult to produce a comprehensive dictionary for the language: simpler because it allows auto-generation of numerous variants of words, and more difficult because the unbounded nature of the Saṃskṛt vocabulary makes a comprehensive static dictionary impractical. Yet, a dictionary is essential for linguistic analysis of Saṃskṛt documents. Pāṇini's Aṣṭādhyāyī comes to the rescue for Saṃskṛt linguistic analysis by offering a procedural basis for word generation and compounding to produce a dynamic, semi-automated dictionary. The Aṣṭādhyāyī is considered a monumental work in terms of its ability to codify the conventions governing the usage of a natural language into precise, self-contained generative rules. Ever since the advent of computing, researchers have been trying to automate the rule engine of the Aṣṭādhyāyī to realize its potential. However, due to the sheer size of the rule corpus and its complexity, to date only specific subsets of its rule base have been digested manually to produce word-generation tools pertaining to specific grammar constructs: Goyal, Huet, et al.
(2012), Krishna and Goyal (2015), Patel and Katuri (2016), and Scharf and Hyman (2009).

However, this approach limits the tools' coverage of numerous word forms and hence their usefulness for syntax analysis of the vast Saṃskṛt corpus. Interpreting Pāṇini's Aṣṭādhyāyī as separate subsets is complex and unnatural due to intricate interdependencies among rules and their triggering conditions. Pāṇini's Aṣṭādhyāyī has a more modular, unified mechanism (prakriyā) for word generation via rules for joining prakṛti (stems) with numerous pratyayas based on the word sense required. (In this paper, we use the IAST convention for Sanskrit words.) Most aspects of the joining mechanism are common across euphonic combination (sandhi), compounding (samāsa) and new word derivation (e.g., kṛt and taddhita forms). However, commentaries such as the Siddhānta Kaumudī (SK) have arranged the rules for the purpose of human understanding of specific grammatical constructs. For the purpose of computational tools, we believe the direct interpretation of Pāṇini's Aṣṭādhyāyī offers a more natural and automatable approach than SK-based approaches.

With this view, we have taken up the problem of relying solely on the Aṣṭādhyāyī and its associated sūtras for deriving all Saṃskṛt grammatical operations of word transformation (or rūpa-siddhi). Our approach is to compile the Aṣṭādhyāyī sūtra text into executable rules automatically (incorporating interpretations made by commentaries), and to minimize the manual coding work to be done per sūtra. We have built a web-based service called PAIAS with a RESTful API for programmatic access to the Aṣṭādhyāyī engine (i.e., the sūtra corpus and its associated data sets) and to enable its execution for word transformation and other purposes.
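As an illustration of how a third-party tool might consume such a service, the sketch below builds a request URL for one sūtra record. This is our own hypothetical client code: the endpoint path, port and `sutra_url` helper are assumptions for illustration, not the documented PAIAS API.

```python
# Hypothetical PAIAS client sketch. The base URL and the
# /api/sutra/<id> path are our assumptions, not the published API.
from urllib.parse import urljoin

BASE = "http://localhost:5000/"  # assumed local PAIAS instance

def sutra_url(sutra_id: str) -> str:
    """Build the request URL for one sutra record, e.g. '1.2.10'."""
    return urljoin(BASE, f"api/sutra/{sutra_id}")

# A client would then do e.g. requests.get(sutra_url("1.2.10")).json()
# and receive a pada-level JSON description of the sutra.
```

A REST interface of this shape keeps clients independent of the engine's implementation language, which is the interoperability goal stated above.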
We adopted a service-oriented architecture to cleanly isolate functionality from end-user presentation, so that numerous tools and presentation interfaces can evolve for Saṃskṛt grammar employing appropriate programming languages.

In this paper, we describe our ongoing work and its approach, and the specific results obtained so far. In Section 2, we set the context by contrasting our approach with relevant earlier work. In Section 3, we give an overview of the project's goals and guiding design principles. In Section 4, we describe how we prepared the source Aṣṭādhyāyī for use in PAIAS. In Section 5, we explain our methodology for Aṣṭādhyāyī interpretation, including the high-level workflow, the sūtra compilation scheme and the rule interpreter. In Section 6, we outline how PAIAS enumerates several entity sets referred to throughout the Aṣṭādhyāyī. In Section 7.3, we describe our methodology for interpreting paribhāṣā sūtras. In Section 8, we provide details of our implementation and its status. In Section 9, we illustrate the operation of the interpreter by showing how our engine automatically expands pratyāhāras. Finally, we conclude and outline future work in Section 10.

2 Related Work

The Aṣṭādhyāyī and its interpretation for Saṃskṛt grammatical analysis and word synthesis have been studied extensively Goyal, Huet, et al. (2012), Goyal, Kulkarni, and Behera (2008), Krishna and Goyal (2015), Patel and Katuri (2016), Satuluri and Kulkarni (2013), Scharf and Hyman (2009), and Subbanna and Varakhedi (2010). For the purpose of this paper, we assume the reader is familiar with Pāṇini's Aṣṭādhyāyī and its various concepts relevant to computational modeling. For a good overview of those concepts, the reader is referred to earlier publications Goyal, Kulkarni, and Behera (2008) and Petersen and Oliver Hellwig (2016).
In their Aṣṭādhyāyī 2.0 project, Petersen and Oliver Hellwig (2016) have developed a richly annotated electronic representation of the Aṣṭādhyāyī that makes it amenable to research and machine-processing. We have achieved a similar objective via manual splitting of sandhis and word-separation within compounds, and by developing a custom vibhakti analyzer for detecting word recurrence across vibhakti and vacana variations.

Petersen and Soubusta (2013) have developed a digital edition of the Aṣṭādhyāyī. They have created a relational database schema and a web interface to support custom views and sophisticated queries. We opted for a hierarchical key-value structure (JSON) to represent the Aṣṭādhyāyī, as it enables a more natural navigational exploration of the text, unlike a relational model. We feel that the Aṣṭādhyāyī is small enough to fit in the DRAM of modern computers, making the efficiency benefits of the relational model less relevant. We used a document database (MongoDB) due to the schema flexibility and extensibility it offers, along with powerful navigational queries. Scharf (2016) developed perhaps the most comprehensive formalization to date of Pāṇini's grammar system, including the Aṣṭādhyāyī in XML format, with the express purpose of assisting the development of automated interpreters. In his formalization, the rules are manually encoded in a pre-interpreted form by spelling out the conditions, contexts, and actions of each rule. In contrast, our attempt is to derive those from the rule's text itself. Scharf's encoded rule information enables validation of our rule interpretation. We could also leverage its other databases that form part of Pāṇini's grammar ecosystem.

The first step in interpreting the Aṣṭādhyāyī is to understand the terms and metarules that Pāṇini defines in the text itself. T. Ajotikar, A. Ajotikar, and Scharf (2015) explain some of Pāṇini's techniques that an interpreter needs to incorporate, and illustrate how Scharf (2016) captures them.
Unlike earlier efforts at processing the Aṣṭādhyāyī that have manually enumerated the terms and their definitions, including pratyāhāras Mishra (2008), our approach is to extract them from the text itself automatically.

Several earlier efforts attempted to highlight and emulate various techniques used in the Aṣṭādhyāyī for specific grammatical purposes. They typically select a particular subset of the Aṣṭādhyāyī's engine and code its semantics manually to reproduce a specific prakriyā. For brevity, we only discuss the most recent work that comes close to ours. Krishna and Goyal (2015) have built an object-oriented class hierarchy to mimic the inheritance structure of Aṣṭādhyāyī rules. They have demonstrated this approach for generating derivative nouns. Our goal, and hence our approach, differ in two ways, namely, to interpret Aṣṭādhyāyī sūtras faithfully as opposed to achieving specific noun and verb forms, and to mechanize the process of converting sūtras into executable code to the extent possible. However, the learnings and insights from earlier work on interpreting the Aṣṭādhyāyī Mishra (2008) apply to our work as well, and hence can be incorporated into our engine.

Patel and Katuri (2016) have built a subanta generator that imitates the method given by the Siddhānta Kaumudī. Their unique contribution is a way to order the sūtras for more efficient interpretation. However, they also encode the semantic meaning of individual sūtras manually, and do not suggest a method to mechanize sūtra interpretation from its text directly. Satuluri and Kulkarni (2013) have attempted to generate samāsa compounds by emulating the relevant subset of the Aṣṭādhyāyī. Subbanna and Varakhedi (2010) have emulated the exception model used in the Aṣṭādhyāyī.
For this, they have emulated a small subset of its sūtras relevant to that aspect.

3 Design Goals and Scope

The objective of our project is to develop a working interpreter for Pāṇini's Aṣṭādhyāyī that emulates its methodology faithfully by mechanizing the interpretation of sūtra text as much as possible. To guide our design, we set the following principles:

Fidelity: We focus on reproducing the prakriyā of the Aṣṭādhyāyī sūtra corpus (taking the semantic adjustments from relevant vyākhyānas as appropriate). We do not focus on optimizing the interpretation engine for speedy execution as of now.

Reuse: We would like to provide a powerful query interface to the Aṣṭādhyāyī and its data sets to enable sophisticated analytics and learning aids.

Extensibility: We would also like to promote the development of an extensible and interoperable framework for the Aṣṭādhyāyī by providing a programmatic interface to its interpreter engine. This framework should support plugging in functionality developed by third parties in multiple programming languages and methodologies.

The specific contributions of this paper include:

• A programmatic interface to the Aṣṭādhyāyī with a powerful query language for sophisticated search,

• A mechanism to automatically extract definitions of saṃjñās (both statically and dynamically defined) by interpreting their sūtra text,

• A machine-processable language and its interpreter to transform the bulk of the Aṣṭādhyāyī sūtra text into executable code, and a mechanism of interpretation that tracks word transformation state persistently, and

• An extensible framework that supports interoperability among techniques for Aṣṭādhyāyī sūtra interpretation developed by multiple researchers, to accelerate tool development.

4 Preparing the Aṣṭādhyāyī for machine-processing

We have started with a well-annotated and curated online resource for the Aṣṭādhyāyī Sarada Susarla and Sai Susarla (2012), available as a spreadsheet due to its amenability to augmentation and scripted
manipulation. Table 1 outlines its schema. The spreadsheet assigns each sūtra a canonical ID string in the format APSSS (e.g., 11076 to denote the 76th sūtra in the 1st pāda of the 1st adhyāya). To enable machine-processing, each sūtra is provided with its words split by sandhi and the individual words in a samāsa separated by hyphens, and tagged with simple morphological attributes such as type (subanta, tiṅanta or avyaya), vibhakti and vacana to enable auto-extraction. The adhikāra sūtras are also explicitly tagged with their influence, given as a sūtra range. For each sūtra, the padas that are inherited from earlier sūtras through anuvṛtti are listed along with their source sūtra ID and the vibhakti modification in the anuvṛtta form, if any.

We auto-convert this spreadsheet into a JSON dictionary JSON (2000) and use it as the basis for the PAIAS service. Table 2 shows an example JSON description of sūtra 1.2.10 with all the above-mentioned features illustrated.

    {
      "Adhyaaya": "Adhyaaya # adhyAyaH",
      "Paada": "Paada # pAdaH",
      "sutra_num": "sutra_num sU. saM.",
      "sutra_krama": "sutra_krama sU. kra. saM.",
      "Akaaraadi_krama": "Akaaraadi_krama akArAdi kra. saM.",
      "Kaumudi_krama": "Kaumudi_krama kaumudI kra. saM.",
      "sutra_id": "sutra_id pUrNa sU. saM.",
      "sutra_type": "sutra_type sutralakShaNam",
      "Term": "Term saMj~nA",
      "Metarule": "Metarule paribhAShA",
      "Special_case": "Special_case atideshaH",
      "Influence": "Influence adhikAraH",
      "Commentary": "Commentary vyAkhyAnam",
      "sutra_text": "sutra_text sutram",
      "PadacCheda": "PadacCheda padachChedaH",
      "SamasacCheda": "SamasacCheda samAsachChedaH",
      "Anuvrtti": "Anuvrtti pada sutra # anuvRRitti-padam sutra-sa~NkhyA",
      "PadacCheda_notes": "PadacCheda_notes"
    }

Table 1: Aṣṭādhyāyī database schema.

    "12010": {
      "Adhyaaya": 1,
      "Paada": 2, "sutra_num": 10,
      "sutra_krama": 12010, "Akaaraadi_krama": 3913,
      "Kaumudi_krama": 2613,
      "sutra_id": "1.2.10",
      "sutra_type": ["atideshaH"], "Commentary": "...",
      "sutra_text": "halantAchcha |",
      "PadacCheda": [
        { "pada": "halantAt", "pada_split": "hal-antAt",
          "type": "subanta", "vachana": 1, "vibhakti": 5 },
        { "pada": "cha", "type": "avyaya",
          "vachana": 0, "vibhakti": 0 }
      ],
      "Anuvrtti": [
        { "sutra": 12005, "padas": ["kit"] },
        { "sutra": 12008, "padas": ["san"] },
        { "sutra": 12009, "padas": ["ikaH", "jhal"] }
      ]
    }

Table 2: Example JSON description of sūtra 1.2.10.

5 Aṣṭādhyāyī Interpreter: High-level Workflow

The input to our Aṣṭādhyāyī engine is a sequence of tagged lexemes that we call pada descriptions or pada_descs, and its output is one or more alternate sequences of tagged lexemes denoting possible word transformations. A pada_desc is a dictionary of tag-value pairs in JSON format. The tag values can be user-supplied (in the case of human-assisted analysis), system-inferred or user-endorsed. Table 2 shows a sūtra description where the PadacCheda section represents the sūtra as a sequence of pada_descs.
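Table 2 carries two forms of the same identifier: the numeric sutra_krama (12010) and the dotted sutra_id ("1.2.10"), following the APSSS layout. The conversion between the two is mechanical; the sketch below (our own helper names, not PAIAS code) shows it.

```python
# Illustrative sketch of the APSSS canonical ID convention described
# above: A = adhyaya, P = pada, SSS = zero-padded sutra number.
# Function names are ours, not part of the PAIAS library.

def sutra_id_to_ref(sutra_id: int) -> str:
    """11076 -> '1.1.76' (adhyaya, pada, sutra number)."""
    s = f"{sutra_id:05d}"
    return f"{int(s[0])}.{int(s[1])}.{int(s[2:])}"

def ref_to_sutra_id(ref: str) -> int:
    """'1.2.10' -> 12010."""
    a, p, n = (int(x) for x in ref.split("."))
    return a * 10000 + p * 1000 + n

# sutra_id_to_ref(12010) -> "1.2.10", matching Table 2.
```

The zero-padding matters: sūtra numbers run into the hundreds, so the last three digits are reserved for them.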
An example tag is a pada 'type' such as subanta, tiṅanta, nipāta, avyaya, pratyaya, saṃjñā, etc. Each application of an Aṣṭādhyāyī sūtra, referred to in this paper as a 'prakriyā', modifies the input pada_desc sequence by adding/editing/removing pada_descs to denote word-splitting, morphing or merging operations based on the semantics of the sūtra.

For instance, when applying the saṃjñā sūtra for the saṃjñā 'it', we tag a given input word with a tag called 'it_varnas' whose value is the offset of the 'it' varṇas found in the word. Such tags can also be used to store intermediary states of grammar transformations for reference by subsequent operations. This persistent tracking of the transformation state of words offers the power required for interpreting Aṣṭādhyāyī sūtras faithfully. The need for such a facility to carry over internal state from one sūtra to another has been identified by Patel and Katuri (2016) for their subanta generator tool.

In order to transform tagged lexemes, the first step is to identify the occurrence of pre-determined patterns in input lexemes, which are denoted by explicit terms (saṃjñās) in Aṣṭādhyāyī sūtras. Instead of handcoding those pattern definitions into the interpreter, our approach is to automatically extract them from the sūtras themselves and interpret them at prakriyā time. To accomplish this, we have devised a machine-processable representation scheme for various sūtras, which we elucidate in Section 6. Likewise, the Aṣṭādhyāyī provides a set of 23 paribhāṣā sūtras or metarules (augmented with approximately 100 more metarules in the Paribhāṣenduśekhara treatise). The purpose of these metarules is to modify the operation of the vidhi sūtras. In Section 7.3, we describe how we manually encode metarules as (condition, action) pairs such that we can mechanically determine which paribhāṣās apply to a given Aṣṭādhyāyī sūtra.
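The 'it_varnas' tagging described earlier in this section can be made concrete with a toy sketch. Here each lexeme is a dict of tag-value pairs, and we hand-code only one condition, the 'halantyam' case (the final consonant of an upadeśa is an 'it'); the real engine derives such rules from the sūtra text rather than hard-coding them, so everything below is our illustration.

```python
# Toy pada_desc tagging sketch (ours, not PAIAS code): applying a
# samjna rule adds tags that later rules can consult, giving the
# persistent transformation state described in the text.

VOWELS = set("aiueo")  # toy inventory; IAST diacritics omitted

def tag_it_varnas(pada_desc: dict) -> dict:
    """Hand-coded 'halantyam' case: final consonant of an upadesha
    is an 'it'; record its offset under the 'it_varnas' tag."""
    word = pada_desc["pada"]
    if pada_desc.get("upadesha") and word[-1] not in VOWELS:
        pada_desc.setdefault("it_varnas", []).append(len(word) - 1)
    return pada_desc

desc = tag_it_varnas({"pada": "sup", "upadesha": True})
# desc["it_varnas"] == [2]  (the final 'p' is the 'it' varna)
```

Because the tag lives in the pada_desc itself, a later sūtra (e.g., one deleting 'it' varṇas by lopa) can read it without re-deriving the condition.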
Since paribhāṣās operate on vidhi sūtra texts, their applicability can be determined a priori instead of at prakriyā time.

At a high level, our approach to Aṣṭādhyāyī interpretation involves the following manual steps:

• Splitting of sandhis and samāsas in the sūtra text to facilitate detection of word recurrences.

• Enumerating the anuvṛtta padas of each sūtra (from earlier sūtras).

• Coding each of the 23 paribhāṣā sūtras into condition-action pairs.

• Preparation of a vibhakti suffix table that covers the subantas of the Aṣṭādhyāyī for use in morphological analysis of sūtra words.

• Coding of custom functions to interpret the meaning of some technical words used in the Aṣṭādhyāyī but not defined therein (e.g., adarśanam, ādiḥ, antyam, etc.).

• Adding a special-case interpretation of the sūtra 'halantyam' as 'hali antyam' to break the cyclic dependency in pratyāhāra generation (as explained in Section 9).

In the next section, we outline the preprocessing steps needed for Aṣṭādhyāyī interpretation.

5.1 Preparing the Aṣṭādhyāyī Interpreter

To prepare the Aṣṭādhyāyī engine for rule interpretation, we automatically preprocess the Aṣṭādhyāyī database as follows.

1. We first perform morphological analysis of each word of every sūtra to extract its prātipadikam. This is required to identify recurrences of a word in the Aṣṭādhyāyī regardless of vibhakti and vacana variations. We describe this step in Section 5.2.

2. For each sūtra, we generate a canonical sūtra text that we refer to as its 'mahāvākya' as follows. We expand the sūtra's text to include all anuvṛtta padas inherited from earlier sūtras. We represent a mahāvākya as a list of pada descriptions, each with its morphological analysis output.

3. We auto-extract the definitions of all terms (saṃjñās) used in the Aṣṭādhyāyī. These come in different forms and need to be handled differently. We describe this step in Section 6.

4. We compile saṃjñā and vidhi sūtras into rules to be interpreted at prakriyā time.

5.
We determine the vidhi sūtras where each of the paribhāṣā sūtras applies, by checking their preconditions. Then we modify those vidhi sūtras accordingly.

6. Finally, we create an optimized condition hierarchy for rule-checking by factoring the preconditions of all the Aṣṭādhyāyī sūtras into a decision tree. This step is still work in progress and is out of the scope of this paper.

5.2 Morphological Analysis of Aṣṭādhyāyī Words

To detect recurrences of a sūtra word at different locations in the Aṣṭādhyāyī (e.g., through anuvṛtti or embedded references) despite their vibhakti and vacana variations, we need the prātipadikam of each word. Since most Aṣṭādhyāyī words are subantas specific to the treatise and not found in typical Saṃskṛt dictionaries, we developed a simple suffix-based vibhakti analyzer for this purpose. Since our Aṣṭādhyāyī spreadsheet already has words tagged with their vibhakti and vacana, our vibhakti analyzer takes them as hints and finds possible matches in predefined vibhakti tables based on various common word-endings. Once a match is found, it emits an analysis that includes possible alternative prātipadikas along with their liṅga and word-ending. We compute the subanta analysis for each Aṣṭādhyāyī word and store it in the PadacCheda section of the sūtra's JSON entry for ready reference.

With this technique, we are able to determine the prātipadikam accurately for all the technical terms used in the Aṣṭādhyāyī and use it for detecting word recurrences. Though the tool generates multiple options for liṅga, that ambiguity does not hurt for our purpose of detecting word recurrence, since the prātipadikam is unique.

We then extract term (saṃjñā) definitions from the saṃjñā sūtras as described in Section 6.

6 Extracting Saṃjñā Definitions from the Aṣṭādhyāyī

The Aṣṭādhyāyī's word transformation method consists of detecting pre-defined patterns denoted by saṃjñās (terms) and performing the associated transformations. These terms denote either a set of explicitly enumerated member elements or conditional expressions to be dynamically checked at prakriyā
These terms denote either a set of explicitly enumerated memberelements or conditional expressions to be dynamically checked at prakriyā42 Susarla et altime. Hence during preprocessing stage, we create a term definition databasewhere each term is defined as a list of member elements or as a compiledrule. The Aṣṭādhyāyī itself defines four types of terms (in increasing orderof extraction complexity):1. Terms defined in an adhikāra cum saṃj nā sūtra denoting a set of ele-ments enumerated explicitly in subsequent vidhi sūtras (e.g., pratyaya,taddhita, nipāta). The term itself becomes an anuvṛtta pada in all vidhisūtras in its adhikāra. Moreover, those sūtras refer to both the termand its member elements in prathamā vibhakti. Hence, to extract thedefinition, we pick prathamā vibhakti terms excluding (a) terms definedin saṃj nā sūtras of Aṣṭādhyāyī and (b) a manually prohibited list ofwords meant to convey colloquial meaning. With this method, wewere able to successfully extract all the pratyayas, taddhitas, nipātas,samāsas from Aṣṭādhyāyī automatically, and verify their authenticitywith those identified in Dikshita (2010).2. Terms with explicit name defined in saṃj nā sūtras denoting a setof elements enumerated explicitly (e.g., vṛddhi). In this case, the el-ements are listed in prathamā vibhakti and hence can be extracteddirectly from the sūtra.3. Terms with an explicit name defined in saṃj nā sūtras, and denotinga condition to be computed at prayoga time (e.g., ‘it’).4. Terms whose name (saṃj nā) and its members (saṃj ni) are bothdynamically computed quantities (e.g., pratyāhāras such as ‘ac’ and‘hal’)The last two variants require interpreting the sūtra text in different waysas described in Section 7. During Aṣṭādhyāyī prakriyā, when an inputpada_desc needs to be checked for match with a saṃj nā, we have twooptions. 
If the saṃjñā is represented as a list of member elements, then all padas in the pada_desc that appear as members of the list are annotated with the saṃjñā name. For instance, when checking the word 'rāma' against the 'guṇa' saṃjñā, the pada_desc of the word 'rāma' is augmented with a property named 'guṇa' whose value is the index of the last 'a' character in the word 'rāma', i.e., 3.

7 Compiling Rules from Sūtras

In this section, we describe a mechanism we have devised for transforming Aṣṭādhyāyī sūtras into machine-interpretable rules. This is a core contribution of our work, as it enables direct interpretation of sūtras. We have implemented this mechanism for saṃjñā and paribhāṣā sūtras first because (i) they form a crucial prerequisite to the rest of the engine, and (ii) they have not been studied by earlier work as systematically as the interpretation of vidhi sūtras. Moreover, the Aṣṭādhyāyī's paribhāṣā sūtras state the mechanism for interpreting vidhi sūtras explicitly. Our vidhi sūtra interpretation is work in progress and will not be discussed further.

Our sūtra interpretation scheme is based on some grammatical conventions we have observed in the sūtra text. First, the bulk of Aṣṭādhyāyī sūtras employ subanta padas and avyayas, and use tiṅanta padas sparingly. Second, saptamī vibhakti is used to indicate the context/condition in which a sūtra applies. Third, each sūtra word denotes either a saṃjñā (or its negation), a predefined function (e.g., ādiḥ, antyam, etc.), a set of terms or characters (e.g., cuṭū), or joining avyayas (e.g., saha, ca, vā, etc.). Finally, whenever multiple words of the same vibhakti occur, one of them is a viśeṣya and the others are its viśeṣaṇas.

Hence we compile each Aṣṭādhyāyī sūtra into a hierarchical expression built from specially defined operators, called a rule.
The rule is either atomic, i.e., a pada_desc describing a sūtra word, or composite, coded as a JSON list. If it is a pada_desc, it is interpreted via a special operator called INTERPRET, described below. If it is a list, its first element is a predefined operator, and the rest are its arguments. The arguments can in turn be pada_descs or sub-rules.

7.1 Special Operators

This section describes the special operators that we have defined to form rules.

INTERPRET: This operator interprets a single sūtra word on the input tagged lexemes. It checks whether the pattern it denotes (e.g., 'it') applies to any of the input lexemes (e.g., 'hal'). If the sūtra word is a saṃjñā (e.g., 'it'), the interpreter interprets its rule recursively and tags the lexemes with the result (e.g., the locations of 'it' varṇas). If the sūtra word is one of a predefined set of words with special meaning, the interpreter invokes its detector function. For instance, the ādiḥ of the lexeme 'hal' is 'h'. Otherwise, the sūtra word denotes a set of terms or characters, in which case the interpreter returns whether the lexeme is a member of that set. For instance, if the sūtra word 'cuṭū' is interpreted against the input lexeme 'c', it returns True because cuṭū denotes the consonants of the ca-varga and ṭa-varga, i.e., {ca, cha, ja, jha, ña, ṭa, ṭha, ḍa, ḍha, ṇa}. If the sūtra word is a negation such as ataddhita or apratyayaḥ, INTERPRET applies the negation before returning the result.

The following conjunct operators are used to compose larger rules:

PIPE: This operator takes a sequence of rules and invokes them by feeding the output of each rule invocation as input to the subsequent rule. The pipe exits when one of the stages returns empty, and returns the output of the last rule. This is used to process all sūtra padas of the same vibhakti.
For instance, when interpreting the pipe ['PIPE', 'ādiḥ', 'cuṭū'] against the lexeme 'hal', the output of 'ādiḥ', namely 'h', is checked for 'cuṭū' membership, which returns None.

IF: This operator takes a list of rules. If all of them evaluate to something other than None, it returns the input tagged lexeme set as is; otherwise it returns None. This is used to encapsulate saptamī vibhakti padas that indicate the enabling context for a sūtra to apply (e.g., upadeśe). It is also used to encapsulate a ṣaṣṭhī vibhakti pada in a saṃjñā sūtra, which indicates the saṃjñi (the definition of a saṃjñā).

PAIR: This operator represents a pair of elements mentioned in a sūtra along with the avyaya 'saha'. The prathamā vibhakti pada sequence describes the first element and the tṛtīyā vibhakti pada sequence denotes the last element. An example is shown in Figure 3 for the sūtra 'ādirantyena sahetā'. If the pair denotes a sequence, then it describes the first and last elements of that sequence.

GEN_SAMJNA: This operator handles a saṃjñā defined as a computed expression, such as 'ak', 'hal', 'sup', etc. It matches the input tagged lexeme against the rule for the saṃjñā. Upon a match, it invokes the rule for the saṃjñi by passing the saṃjñā as a parameter. For instance, the sūtra 'ādirantyena sahetā' gets compiled into the following rule:

• [GEN_SAMJNA, {'saṃjñi' : None, 'saṃjñā' : [PAIR, 'ādiḥ', [PIPE, 'antyam', 'it']]}]

Since there is no explicit saṃjñi in this sūtra, we apply a special pratyāhāra expander function to generate the character sequence from the input pair. Figure 3 shows the hierarchical representation of the sūtra text that leads to the above rule.

PROHIBIT: This operator prohibits applying a sūtra under a matched sub-condition. It is not the same as the negation of a match condition. It is used to process the sūtra word 'na' in a sūtra.
For instance,when processing the ‘it’ saṃj nā sūtra ‘na vibhaktau tusmāh’, as shownin Figure 1, this function removes any ‘it’ varṇa tagging done whileprocessing its sub-conditions denoted by the words ‘hal’, ‘antyam’ and‘tusmāh’.Figure 1Rule Hierarchy for sūtra ‘na vibhaktau tusmāh’.7.2 Compiling Rules from Saṃj nā sūtrasSaṃj nā sūtras come in two flavors:1. those that explicitly list a term and its definition in prathamā vibhaktiwith some other conditions e.g., (upadeśe pratyayasya ādiḥ it) cuṭū,and46 Susarla et alFigure 2Rule Hierarchy for sūtra ‘cuṭū’.Figure 3Rule Hierarchy for sūtra ‘ādirantyena sahetā’.PAIAS 472. those that describe the saṃj nā name as a computed expression andits denoted items in ṣaṣṭhīi vibhakti, e.g., ādiḥ antyena itā saha (svasyarūpasya).In the above representation of the sūtra texts, we denote words that areinherited by anuvṛtti in parentheses.Figure 2 shows a tree representation for a sūtra of the first flavor. Asaṃj nā sūtra has three components: the term denoted by the edge labeled‘kA’, its definition denoted by ‘kasya’, and the context in which the defini-tion applies, denoted by ‘kutra’. saptamī vibhakti padas in the sūtra denotethe context. The saṃj nā term, if explicitly present in the sūtra will be inprathamā vibhakti with its defining words also in prathamā. In that case, anexecutable version of the sūtra is a representation of the tree as a hierarchi-cal list. All words in the same vibhakti in the sūtra have viśeṣaṇa-viśeṣyarelation.Figure 3 shows the tree representation for a sūtra of the second flavor. Inthis, the ṣaṣṭhī vibhakti word should be interpreted as a filter or qualifier forthe prathamā vibhakti words, not as the saṃj ni (the definition). This sūtraalso has tṛtīyā vibhakti padas joined by ‘saha’, which can be interpreted assequence generation operator. 
This operator takes prathamā vibhakti padas to indicate the start of the sequence and tṛtīyā vibhakti padas to indicate its end.

Figure 4 shows another sūtra of the second flavor, where the saṃjñā and the saṃjñi definition are both parameterized. It has a pada that is a negation of pratyaya. Matching this sūtra requires a pre-defined function that checks whether a given word is a pratyaya. The saṃjñi in this case is the set of savarṇas of x.

Figure 4: Rule hierarchy for the sūtra 'aṇudit savarṇasya cāpratyayaḥ'.

7.3 Interpreting Paribhāṣā Sūtras

A paribhāṣā sūtra describes how to interpret sūtras whose text matches a given condition. It can be represented as a set of actions guarded by conditions. It is applied to transform vidhi sūtras prior to compiling them into rules. The condition indicates the sūtras to which the paribhāṣā applies, expressed in terms of properties of the words in the sūtra text. The actions indicate how a matching sūtra should be transformed prior to interpretation.

For instance, consider the sūtra 'ādyantau ṭakitau'. It states that if a vidhi sūtra contains a 'ṭit' or 'kit' pada (i.e., one that has the varṇa 'ṭ' or 'k' as 'it'), then the sūtra should be expanded to add the extra words 'ṣaṣṭhyantasya ādiḥ' or 'ṣaṣṭhyantasya antaḥ' respectively to the sūtra text. We express this logic in our interpreter by coding the paribhāṣā as shown in Algorithm 1. Here, the condition is expressed as a rule-matching query with the new operators SAMJNA, PRATYAYA, IT_ENDING, AND and NOT. It applies if a vidhi sūtra has an individual pada (in its PadacCheda) which is not a saṃjñā or pratyaya word, but has 'ṭ' as its 'it' varṇa. In that case, the sūtra's text must be augmented with the two additional words 'ṣaṣṭhyantasya ādiḥ'. The matching vidhi sūtra will then be compiled into a rule and interpreted during word transformation prakriyā time.
Similarly, if the sūtra word has 'k' as its 'it' varṇa, then the additional words will be 'ṣaṣṭhyantasya antaḥ'. As another example, the paribhāṣā sūtra 'mid aco'ntyāt paraḥ' has the following effect: if a vidhi sūtra has 'm' as 'it' (other than in a pratyaya or saṃjñā word), then the words 'ṣaṣṭhyantasya antyāt acaḥ paraḥ' should be added to the sūtra text.

We manually define the condition-action pairs for each of the 23 paribhāṣā sūtras, as shown in Algorithm 1. At initiation time, the Aṣṭādhyāyī engine checks these conditions on each of the vidhi sūtras of the Aṣṭādhyāyī and transforms them accordingly, prior to the rule compilation step. In our current implementation, we have hand-coded the condition-action pairs for about half of the paribhāṣās of the Aṣṭādhyāyī, and are able to successfully identify the sūtras to which they apply. This is because of our ability to identify the various listable saṃjñās, which are needed in formulating the conditions. However, the processing of the vidhi sūtras themselves is future work.

Algorithm 1 Codifying paribhāṣā sūtra 'ādyantau ṭakitau'.

    paribhasa_defs = {
        ...
        str(11046): [
            {"cond": {"PadacCheda":
                          [AND, [[NOT, SAMJNA], [NOT, PRATYAYA],
                                 [IT_ENDING, {"varna": "T"}]]],
                      "sutra_type": ["vidhiH"]},
             "action": [
                 sutra_add_pada, {"pada": "ShaShThyantasya",
                                  "vibhakti": 6, "type": "subanta"},
                 sutra_add_pada, {"pada": "AdiH",
                                  "vibhakti": 1, "type": "subanta"}]},
            {"cond": {"PadacCheda":
                          [AND, [[NOT, SAMJNA], [NOT, PRATYAYA],
                                 [IT_ENDING, {"varna": "k"}]]],
                      "sutra_type": ["vidhiH"]},
             "action": [
                 sutra_add_pada, {"pada": "ShaShThyantasya",
                                  "vibhakti": 6, "type": "subanta"},
                 sutra_add_pada, {"pada": "antaH",
                                  "vibhakti": 1, "type": "subanta"}]}
        ],
        ...
    }

8 Implementation

We have implemented PAIAS as a Python library and a Flask web microservice that provides RESTful API access to its functionality. The API-based interface provides a flexible, reliable and reusable foundation for the open, collaborative development of higher-level tools and user interfaces in multiple programming languages, to accelerate research on the Aṣṭādhyāyī while ensuring the interoperability of those tools. The code is available on GitHub and will soon be available as a pip-installable module.

The module comes bundled with the Aṣṭādhyāyī spreadsheet along with the dhātupāṭha and other associated data sets. Upon first invocation after a clean install, the Aṣṭādhyāyī module computes mahāvākyas for all sūtras, compiles the sūtras into machine-executable rules, builds saṃjñā definitions, extracts listable terms such as pratyayas, and transforms the vidhi sūtras by applying the matching paribhāṣā sūtras. It then stores all this derived state persistently in JSON format in a MongoDB database, which enables fast access to the Aṣṭādhyāyī engine subsequently. Our current implementation does not yet handle the transformation and interpretation of vidhi sūtras.

Figure 2 shows an example Python script using the Aṣṭādhyāyī library. We have also devised a powerful query interface to the Aṣṭādhyāyī for sophisticated search. Figure 3 shows a Python script to find the unique words that occur in the Aṣṭādhyāyī, grouped by vibhakti. The query condition can be specified as a JSON dictionary supporting a hierarchical specification of the desired attributes, as shown in this example.

Algorithm 2 Example usage of the Aṣṭādhyāyī Service.

    from ashtadhyayi.utils import *
    from ashtadhyayi import *

    def a():
        return ashtadhyayi()

    # Provide mahaavaakya of given sutra as individual words
    def mahavakya(sutra_id):
        s = a().sutra(sutra_id)
        return s['mahavakya_padacCheda']

    # Show all vidhi sutras where given paribhasha sutra applies
    def paribhasha(sutra_id):
        p = get_paribhasha(sutra_id)
        if not p:
            print "Error: Paribhasha description not found for", sutra_id
            return []
        matches = []
        for s_id in p.matching_sutras():
            s = a().sutra(s_id)
            out = dict((k, s[k]) for k in
                       ('sutra_krama', 'sutra_text', 'sutra_type'))
            matches.append(out)
        return matches

    # Return praatipadikam of given pada taking vibhakti and vachana hints.
    def praatipadika(pada, vibhakti=1, vachana=1):
        pada = sanscript.transliterate(pada, sanscript.SLP1,
                                       sanscript.DEVANAGARI)
        return Subanta.analyze({'pada': pada,
                                'vibhakti': vibhakti,
                                'vachana': vachana})

Algorithm 3 Example script to extract the unique words of the various vibhaktis in the Aṣṭādhyāyī.

    from ashtadhyayi.cmdline import *

    a = ashtadhyayi()
    myfilter = {'PadacCheda': {'vibhakti': 1}}
    result = {}
    for v in [0, 1, 2, 3, 4, 5, 6, 7]:
        myfilter['PadacCheda']['vibhakti'] = v
        v_padas = []
        for s_id in a.sutras(myfilter):
            s = a.sutra(s_id)
            for p in s['PadacCheda']:
                if 'vibhakti' not in p:
                    continue
                if p['vibhakti'] != v:
                    continue
                v_padas.append(p['pada'])
        result[v] = sorted(set(v_padas))
    print_dict(result)

9 Evaluation: Putting it all together

In this section, we illustrate the automated operation of the PAIAS engine by showing the expansion of the pratyāhāra 'ac' into its denoted varṇas. Pratyāhāra expansion is an essential step in enabling the rest of Aṣṭādhyāyī interpretation.

Our engine accomplishes the 'ac' expansion as follows. First, it compiles all saṃjñā sūtras into mahāvākyas and then into machine-interpretable rules. Table 3 shows the mahāvākya representations and the corresponding rules generated by our engine for the specific sūtras relevant to pratyāhāra expansion, namely 'it' and the dynamically computed saṃjñā names.

sūtra# (APSSS) | Mahāvākya Representation | Generated rule
13002   | it = upadeśe(7) ac(1) anunāsikaḥ(1) | [PIPE, [IF, "upadeśa"], [PIPE, "ac", "anunāsikaḥ"]]
13003.1 | it = upadeśe(7) hali(7) antyam(1) | [PIPE, [IF, "upadeśa", "hal"], [PIPE, "antyam"]]
13003.2 | it = upadeśe(7) hal(1) antyam(1) | [PIPE, [IF, "upadeśa"], [PIPE, "hal", "antyam"]]
13004   | it = upadeśe(7) na(0) vibhaktau(7) tusmāḥ(1) | [PIPE, [IF, "upadeśa", "vibhakti"], [PROHIBIT, "tu-s-ma"]]
13005   | it = upadeśe(7) ādiḥ(1) ṇiṭuḍavaḥ(1) | [PIPE, [IF, "upadeśa"], [PIPE, "ādiḥ", "ṇi-ṭu-ḍu"]]
13006   | it = upadeśe(7) ādiḥ(1) ṣaḥ(1) pratyayasya(6) | [PIPE, [IF, "upadeśa"], [PIPE, "pratyaya", [PIPE, "ādiḥ", "ṣaḥ"]]]
13007   | it = upadeśe(7) ādiḥ(1) pratyayasya(6) cuṭū(1) | [PIPE, [IF, "upadeśa"], [PIPE, "pratyaya", [PIPE, "ādiḥ", "cu-ṭu"]]]
13008   | it = upadeśe(7) ādiḥ(1) pratyayasya(6) laśaku(1) ataddhite(7) | [PIPE, [IF, "upadeśa", [NOT, "taddhita"]], [PIPE, "pratyaya", [PIPE, "ādiḥ", "la-śa-ku"]]]
11071   | pratyāhāra = svasya(6) rūpasya(6) ādiḥ(1) antyena(3) saha(0) itā(3) | [GEN_SAMJNA, {'saṃjñi': None, 'saṃjñā': [PAIR, 'ādiḥ', [PIPE, 'antyam', 'it']]}]
11060   | lopa = iti(0) adarśanam(1) | [PIPE, "adarśanam"]

Table 3: Mahāvākya representations produced by the PAIAS engine from the saṃjñā sūtras relevant to pratyāhāra expansion.
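To make the mechanics of pratyāhāra expansion concrete, the following standalone sketch (our own illustration, not the PAIAS code; the roman transliteration and the data layout are assumptions) computes a pratyāhāra's varṇa set directly from the fourteen māheśvara sūtras, in the spirit of sūtra 11071 ('ādiḥ antyena saha itā'):

```python
# Each entry pairs one māheśvara sūtra's varṇas with its final it marker.
# Transliteration scheme (with inherent 'a' on consonants) is our choice.
SIVA_SUTRAS = [
    (["a", "i", "u"], "ṇ"),
    (["ṛ", "ḷ"], "k"),
    (["e", "o"], "ṅ"),
    (["ai", "au"], "c"),
    (["ha", "ya", "va", "ra"], "ṭ"),
    (["la"], "ṇ"),
    (["ña", "ma", "ṅa", "ṇa", "na"], "m"),
    (["jha", "bha"], "ñ"),
    (["gha", "ḍha", "dha"], "ṣ"),
    (["ja", "ba", "ga", "ḍa", "da"], "ś"),
    (["kha", "pha", "cha", "ṭha", "tha", "ca", "ṭa", "ta"], "v"),
    (["ka", "pa"], "y"),
    (["śa", "ṣa", "sa"], "r"),
    (["ha"], "l"),
]

def expand_pratyahara(pratyahara):
    """Collect varṇas from the pratyāhāra's first varṇa up to (but not
    including) its final it marker, e.g. 'ik' -> i u ṛ ḷ."""
    start, it_marker = pratyahara[:-1], pratyahara[-1]
    collecting, result = False, []
    for varnas, it in SIVA_SUTRAS:
        for v in varnas:
            if not collecting and v == start:
                collecting = True          # found the pratyāhāra's ādi
            if collecting:
                result.append(v)
        if collecting and it == it_marker:  # reached the antya it marker
            return result
    raise ValueError("not a pratyāhāra: " + pratyahara)

print(expand_pratyahara("ik"))  # ['i', 'u', 'ṛ', 'ḷ']
```

Running expand_pratyahara("hal") collects every consonant between the initial 'ha' and the final it marker 'l', which is the set the engine caches in terms_db before re-interpreting 'halantyam'.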
The engine maintains a terms_db database that caches the set of lexemes denoted by each saṃjñā. At start time, the engine resets this database and populates it with the lexemes of all saṃjñās that are listed explicitly in the Aṣṭādhyāyī. During pada_desc tagging, whenever the occurrence of a saṃjñā needs to be detected, the engine checks the terms_db cache first before attempting to interpret the saṃjñā's rules.

9.1 Expansion of pratyāhāra 'hal'

Pratyāhāras are computed saṃjñā names. To expand them, the engine should be able to interpret the saṃjñā sūtras for 'it', especially the sūtra 'halantyam'. However, to break its cyclic dependency on the expansion of the pratyāhāra 'hal', this sūtra must be interpreted twice: first as a samāsa with vigraha 'hali antyam', where 'hali' is the saptamī vibhakti form of the māheśvara sūtra 'hal', and second as 'hal antyam'. As a result of the first interpretation (sūtra 13003.1), the engine adds the varṇa 'l' as the definition of the 'it' saṃjñā in the terms_db cache. During the second interpretation of 'it' as 'upadeśe hal antyam', the engine recursively checks whether 'hal' is a saṃjñā. This in turn matches sūtra 11071, because the 'l' in 'hal' gets tagged as 'it_varṇa'. Hence 'hal' gets detected as a computed saṃjñā, which denotes the set of varṇas from the 'ādi' of 'hal', i.e. 'h', up to but not including the last 'l' in the 'upadeśa', i.e. the māheśvara sūtra character sequence. The saṃjñā 'hal' and its denoted character sequence, i.e. all the consonants of the Saṃskṛt alphabet, thus get added to the terms_db cache. When unwinding the recursion back to continue the second interpretation of sūtra 13003, this time the ending hal varṇa in each māheśvara sūtra gets an 'it_varṇa' tag.

9.2 Expansion of pratyāhāra 'ac'

Next, when trying to interpret the input lexeme 'ac', the engine looks to tag the lexeme's constituent parts by matching them with the definitions of known saṃjñās.
This time, sūtra 13003.2 applies, causing the 'c' to be tagged as an 'it_varṇa', because 'c' is a member of the 'hal' set in the terms_db cache. Since there is no explicit saṃjñā called 'ac', the engine checks whether 'ac' is a dynamically computed saṃjñā name by applying sūtra 11071 'ādiḥ antyena saha itā'.

This time, when computing the varṇa set as part of sūtra 11071, the last 'hal' varṇa in each māheśvara sūtra needs to be suppressed. To accomplish this, we had to manually code the interpretation of a single vidhi sūtra, 'tasya lopaḥ'. To do so, we had to manually rewrite it as 'itaḥ lopaḥ', because our engine does not yet have the logic to interpret vidhi sūtras automatically. The engine reduces the definition of 'lopaḥ' to 'adarśanam' from sūtra 11060. We manually wrote a function interpreting the Aṣṭādhyāyī word 'adarśanam' so as to suppress the emission of its referent lexeme, here the one with the 'it_varṇa' tag.

Thus the engine is able to generate the varṇa sequence for the 'ac' pratyāhāra as 'a i u e o ai au'. The terms_db cache serves two purposes: i) to break the infinite recursions that are possible in the expansion of saṃjñā definitions in the Aṣṭādhyāyī, and ii) to speed up subsequent processing of a saṃjñā once it has been expanded.

10 Future Directions

We recognize that generating an automated interpretation engine for the Aṣṭādhyāyī is a complex and long-term task, due to the need to validate and adjust the methodology manually and to the thousands of sūtras involved. However, our attempt is to rely on the precision of the Aṣṭādhyāyī's exposition to mechanise large parts of the work. Our second objective is to provide a robust foundational platform so that multiple researchers can work collaboratively and leverage each other's innovations to accelerate the task. To this end, we would like to work closely with other researchers to incorporate existing approaches to Aṣṭādhyāyī interpretation and its validation.
This is especially true for the vidhi sūtras, which constitute the bulk of the Aṣṭādhyāyī.

We hope that our programmatic interface to the Aṣṭādhyāyī and its semantic functionality enables interoperable applications and a deeper exploration of the grammatical structure of Saṃskṛt literature by the larger computer science community. Possible directions include data-driven analysis of the relative usage of Saṃskṛt grammar constructs in Saṃskṛt literature, of vocabulary and its evolution over time, and kāraka analysis via a mix of data-driven and first-principles approaches. A robust grammar engine provides a sound basis for such projects. Another area of future research would be to explore the engine's applicability to modeling other natural languages.

11 Conclusion

In this paper, we have presented a programmatic interface to the celebrated Saṃskṛt grammar treatise Aṣṭādhyāyī, with the goal of evolving a direct interpreter of its sūtras for Saṃskṛt word generation and transformation in all its variations. Our initial experience indicates that the consistent structure and conventions of the Aṣṭādhyāyī's sūtras make them amenable to mechanized interpretation with fidelity. However, much more work needs to be done to fully validate this hypothesis. Having a flexible, reusable and extensible interface to the Aṣṭādhyāyī provides a sound basis for collaborative research and application development.

References

Ajotikar, Tanuja, Anuja Ajotikar, and Peter Scharf. 2015. "Some Issues in the Computational Implementation of the Ashtadhyayi". In: Sanskrit and Computational Linguistics, select papers from the 'Sanskrit and IT World' section of the 16th World Sanskrit Conference. Ed. by Amba Kulkarni. Bangkok, Thailand, pp. 103–124.

Dikshita, Pushpa. 2010. "Ashtadhyayi Sutra Pathah". In: Samskrita Bharati. Chap. 4.

Goyal, Pawan, Gérard Huet, Amba Kulkarni, Peter Scharf, and Ralph Bunker. 2012. "A Distributed Platform for Sanskrit Processing".
In: 24th International Conference on Computational Linguistics (COLING), Mumbai.

Goyal, Pawan, Amba Kulkarni, and Laxmidhar Behera. 2008. "Computer Simulation of Ashtadhyayi: Some Insights". In: 2nd International Symposium on Sanskrit Computational Linguistics. Providence, USA.

Hellwig, Oliver. 2009. "Extracting dependency trees from Sanskrit texts". In: Sanskrit Computational Linguistics 3, LNAI 5406, pp. 106–115.

Huet, Gérard. 2002. "The Zen Computational Linguistics Toolkit: Lexicon Structures and Morphology Computations using a Modular Functional Programming Language". In: Tutorial, Language Engineering Conference LEC'2002. Hyderabad.

JSON. 2000. Introducing JSON.

Krishna, Amrit and Pawan Goyal. 2015. "Towards automating the generation of derivative nouns in Sanskrit by simulating Panini". In: Sanskrit and Computational Linguistics, select papers from the 'Sanskrit and IT World' section of the 16th World Sanskrit Conference. Ed. by Amba Kulkarni. Bangkok, Thailand.

Kulkarni, Amba. 2016. Samsaadhanii: A Sanskrit Computational Toolkit.

Kumar, Anil. 2012. "Automatic Sanskrit Compound Processing". PhD thesis. University of Hyderabad.

Mishra, Anand. 2008. "Simulating the Paninian System of Sanskrit Grammar". In: 1st and 2nd International Symposium on Sanskrit Computational Linguistics. Providence, USA.

Patel, Dhaval and Shivakumari Katuri. 2016. "Prakriyāpradarśinī - an open source subanta generator". In: Sanskrit and Computational Linguistics - 16th World Sanskrit Conference, Bangkok, Thailand, 2015.

Petersen, Wiebke and Oliver Hellwig. 2016. "Annotating and Analyzing the Ashtadhyayi". In: Input a Word, Analyse the World: Selected Approaches to Corpus Linguistics. Newcastle upon Tyne: Cambridge Scholars Publishing.

Petersen, Wiebke and Simone Soubusta. 2013. "Structure and implementation of a digital edition of the Ashtadhyayi".
In: Recent Researches in Sanskrit Computational Linguistics - Fifth International Symposium, IIT Mumbai, India, January 2013, Proceedings.

Satuluri, Pavankumar and Amba Kulkarni. 2013. "Generation of Sanskrit Compounds". In: International Conference on Natural Language Processing.

Scharf, Peter. 2016. "An XML formalization of the Ashtadhyayi". In: Sanskrit and Computational Linguistics - 16th World Sanskrit Conference, Bangkok, Thailand, 2015.

Scharf, Peter and Malcolm Hyman. 2009. Linguistic Issues in Encoding Sanskrit. Motilal Banarsidass, Delhi.

Subbanna, Sridhar and Srinivasa Varakhedi. 2010. "Asiddhatva Principle in Computational Model of Ashtadhyayi". In: 4th International Sanskrit and Computational Linguistics Symposium. New Delhi.

Susarla, Sarada and Sai Susarla. 2012. Panini Ashtadhyayi Sutras with Commentaries: Sortable Index.

Yogyatā as an absence of non-congruity

Sanjeev Panchal and Amba Kulkarni

Abstract: Yogyatā, or mutual congruity between the meanings of related words, is an important factor in the process of verbal cognition. In this paper, we present the computational modeling of yogyatā for the automatic parsing of Sanskrit sentences. Among the several definitions of yogyatā, we model it as an absence of non-congruity, and we discuss the reasons behind this choice.

Due to the lack of any syntactic criterion for viśeṣaṇas (adjectives) in Sanskrit, parsing Sanskrit texts with adjectives resulted in a high number of false positives. Hints from the vyākaraṇa texts helped us formulate a criterion for viśeṣaṇa with syntactic and ontological constraints, which provided us a clue for deciding the absence of non-congruity between two words with respect to the adjectival relation. A simple two-way classification of nouns into dravya and guṇa, with a further sub-classification of guṇas into guṇavacanas, was found to be necessary for handling adjectives. The same criterion was also necessary to handle the ambiguities between kāraka and non-kāraka relations.
These criteria, together with modeling yogyatā as an absence of non-congruity, resulted in an 81% improvement in precision.

1 Introduction

Three factors, viz. ākāṅkṣā (expectancy), yogyatā (congruity) and sannidhi (proximity), play a crucial role in the process of śābdabodha (verbal cognition). These factors have been found to be useful in the development of a Sanskrit parser as well. The concept of subcategorization in modern Linguistics comes close to the concept of ākāṅkṣā. Subcategorization structures provide syntactic frames to capture the different syntactic behaviors of verbs. Sanskrit being an inflectional language, the information about various relations is encoded in suffixes rather than in positions. These suffixes express the expectancy, termed ākāṅkṣā in the Sanskrit literature. Kulkarni, Pokar, and Shukl (2010) describe how ākāṅkṣā is useful in proposing the possible relations between words. Sannidhi has been found to be equivalent to the weak non-projectivity principle (Kulkarni, P. Shukla, et al. 2013c). In this paper, we discuss the role of the third factor, viz. yogyatā, in building a Sanskrit parser.

The concept of selection restriction is similar to the concept of yogyatā. The expectancy, or ākāṅkṣā, proposes a possible relation between the words in a sentence. Such a relation holds between two words only if they are meaning-wise compatible. It is the selection restriction, or yogyatā, which then comes into force to prune out incongruent relations, keeping only the congruent ones. Katz and Fodor (1963) proposed a model of selection restrictions as necessary and sufficient conditions for the semantic acceptability of the arguments to a predicate. Identifying a selection restriction that is both necessary and sufficient is a very difficult task. Hence there have been attempts to propose alternatives.
One such alternative was proposed by Wilks (1975), who viewed these restrictions as preferences rather than necessary and sufficient conditions. After the development of WordNet, Resnik (1993) modeled the problem of the induction of selectional preferences using the semantic class hierarchy of WordNet. Since then there has been an upsurge in the field of computational models for the automated treatment of selectional preferences, with a variety of statistical models and machine learning techniques. In recent times, one of the most ambitious projects to represent world knowledge was taken up under the banner of Cyc. This knowledge base contains over five hundred thousand terms, including about seventeen thousand types of relations, and about seven million assertions relating these terms.¹ In spite of the availability of such a huge knowledge base, we rarely find Cyc being used in NLP applications.

¹ Accessed on 30th August, 2017.

The first attempt to use the concept of yogyatā in the field of Machine Translation was by the Akshar Bharati group (Bhanumati 1989) in the Telugu-Hindi Machine Translation system. Selectional restrictions were used in defining the Kāraka Charts, which provided a subcategorization frame as well as semantic constraints over the arguments of the verbs. On similar lines, Noun Lakṣaṇa Charts and Verb Lakṣaṇa Charts were also used for the disambiguation of noun and verb meanings. These charts expressed selectional restrictions using both ontological concepts as well as semantic properties. An example Kāraka chart for the Hindi verb jānā (to go) is given in Table 1.

relation              | necessity | case marker | semantic constraint
apādānam (source)     | desirable | se          | not (upādhi:vehicle)
karaṇam (instrument)  | desirable | se          | (upādhi:vehicle)
karma (object)        | mandatory | 0/ko        | -
kartā (agent)         | mandatory | 0           | -

Table 1: Kāraka Chart for the verb jānā (to go)

Here upādhi is an imposed property.
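Such a chart translates directly into data. The sketch below is our own rendering of Table 1 (the attribute names, the upādhi tags and the helper function are ours, not the cited system's API); it shows how the semantic constraint prunes the kāraka roles proposed for the Hindi case marker se:

```python
# Table 1 as data: each row gives a kāraka role, its necessity, the case
# marker(s) it attaches to, and a semantic (upādhi) constraint on the noun.
KARAKA_CHART_JANA = [
    {"role": "apādānam", "necessity": "desirable", "markers": {"se"},
     "constraint": lambda noun: noun.get("upādhi") != "vehicle"},
    {"role": "karaṇam", "necessity": "desirable", "markers": {"se"},
     "constraint": lambda noun: noun.get("upādhi") == "vehicle"},
    {"role": "karma", "necessity": "mandatory", "markers": {"0", "ko"},
     "constraint": lambda noun: True},
    {"role": "kartā", "necessity": "mandatory", "markers": {"0"},
     "constraint": lambda noun: True},
]

def possible_roles(noun, case_marker, chart=KARAKA_CHART_JANA):
    """Kāraka roles compatible with the noun under the chart's constraints."""
    return [row["role"] for row in chart
            if case_marker in row["markers"] and row["constraint"](noun)]

# 'bas se' (by bus) can only be karaṇam; 'ghar se' (from home) only apādānam
print(possible_roles({"upādhi": "vehicle"}, "se"))  # ['karaṇam']
print(possible_roles({"upādhi": None}, "se"))       # ['apādānam']
```

The two rows for se are disjoint by construction, so the imposed property decides between the source and instrument readings without any further machinery.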
The first row in Table 1 states the constraint that a noun with the case marker se has the kāraka role of apādānam (source) provided it is not a vehicle. The ontological classification was inspired by the ontology originating from the vaiśeṣika school of philosophy. The parsers for Indian languages were further improved. Bharati, Chaitanya, and Sangal (1995) mention the importance of two semantic factors, viz. animacy and humanity, in parsing, which remove the ambiguity between the kartā and the karma (roughly, subject and object). This hypothesis was further strengthened with experimental verification by Bharati, Husain, et al. (2008).

In the next section, we first state the importance of yogyatā in parsing, as a filter to prune out meaningless parses. Since yogyatā deals with the compatibility between meanings, and a word expresses meanings at different levels, we also discuss the mutual hierarchy among these various meanings. In the third section, we look at the various definitions of yogyatā offered in the tradition, and decide on the one that is suitable for implementation. In the same section, we evolve strategies to disambiguate relations based on yogyatā. Finally, the criteria evolved for disambiguation are evaluated. The evaluation results are discussed in section four, followed by the conclusion.

2 Yogyatā as a filter

A necessary condition for understanding a sentence is that a word having an expectancy for another word should become nirākāṅkṣa (having no further expectancy) once a relation is established between them. Further, such related words should also have mutual compatibility from the point of view of the proposed relation. If they do not, then the expectancy of such words will not be put to rest and there will not be any verbal cognition. Therefore the role of yogyatā in verbal cognition is very important.
The purpose of using yogyatā in parsing is not to make a computer 'understand' the text, but to rule out incompatible solutions from among the solutions that fulfill the ākāṅkṣās. For example, consider the sentence

Skt: yānam vanam gacchati.
Gloss: vehicle{neut., sg., nom./acc.} forest{neut., sg., nom./acc.} go{present, 3rd per., sg.}

There are 6 possible analyses, based on the ākāṅkṣā. They are:

1. yānam is the kartā and vanam is the karma of the verb gam,
2. yānam is the karma and vanam is the kartā of the verb gam,
3. yānam is the kartā of the verb gam and vanam is the viśeṣaṇa of yānam,
4. yānam is the karma of the verb gam and vanam is the viśeṣaṇa of yānam,
5. yānam is the viśeṣaṇa of vanam, which is the kartā of the verb gam,
6. yānam is the viśeṣaṇa of vanam, which is the karma of the verb gam.

If the machine knows that the kartā of an action of going should be movable, and that the denotation of yāna is movable while that of vana is not, then it can mechanically rule out the second analysis. The words yānam and vanam, on account of the agreement between them, have the potential to be viśeṣaṇas of each other. But the semantic incompatibility between the meanings of these words rules out the last four possibilities, leaving only the first, correct analysis.

As another example, look at the sentence

Skt: Rāmeṇa bāṇena Vālī hanyate.
Gloss: Rama{ins.} arrow{ins.} Vali{nom.} is_killed.

Rāma and bāṇa, both being in the instrumental case, can potentially be the kartā as well as the karaṇam of the verb han (to kill). If the machine knows that a bāṇa can be used as an instrument in the act of killing, while Rāma, being the name of a person, can not be a potential instrument in the act of killing, it can then filter out the incompatible solution: Rāma as the karaṇam and bāṇa as the kartā.

Look at another sentence, payasā siñcati (He wets with water). Here payas (water) is in the instrumental case, and is a liquid, and hence is compatible with the action of siñc (to wet).
But in the sentence vahninā siñcati (He wets with fire), vahni (fire) is not fit to be an instrument of the action of wetting, and as such it fails to satisfy the yogyatā. But now imagine a situation where a person is in a bad mood, and his friend, without knowing it, starts accusing him further for some fault of his, instead of uttering some soothing words of consolation. A third person watching this utters kim vahninā siñcasi (Why are you pouring fire?), a perfect verbalization of the situation. The words here are like fire to the person who is already in a bad mood. This meaning of vahni is its extended meaning. Thus, even if a relation between the primary meanings does not make sense, if the relation between the extended meanings makes sense, we need to produce the parse. Therefore, in addition to the primary meanings, the machine sometimes also needs access to the secondary or extended meanings of the words.

2.1 Word and its Meanings

Every word has a significative power that denotes its meaning. In Indian theories of meaning, this significative power is classified into three types, viz. abhidhā (the primary meaning), lakṣaṇā (the secondary or metaphoric meaning) and vyañjanā (the suggestive meaning). In order to use the concept of yogyatā in designing a parser, we should know the role of each of these meanings in the process of interpretation.

The secondary meaning comes into play when the primary meaning is incompatible with the meanings of the other words in a sentence. The absence of yogyatā is the basic cause for this signification. Indian rhetoricians accept three conditions as necessary for a word to denote this extended or metaphoric sense. These three conditions are:²

1. inapplicability / unsuitability of the primary meaning,
2. some relation between the primary meaning and the extended meaning, and
3. a definite motive justifying the extension.

² mukhyārthabādhe tadyoge rūḍhito'tha prayojanāt |
anyo'rtho lakṣyate yat sā lakṣaṇāropitā kriyā || (KP II 9)

In addition to these two meanings, there is one more meaning, called vyañjanā or the suggestive meaning. This corresponds to the inner meaning of a text, or the speaker's intention. In order to understand this meaning, consider the sentence gato'stam arkaḥ, which literally means 'the sun has set'. Every listener gets this meaning. In addition, it may also convey different signals to different listeners. For a child playing on the ground, it may mean 'now it is getting dark and it is time to stop playing and go home'; for a Brahmin, it may mean 'it is time to do the sandhyāvandana'; and for a young man, it may mean 'it is time to meet his lover'. This extra meaning co-exists with the primary meaning. It does not block the primary meaning. Therefore the vyaṅgyārtha (suggestive meaning) exists in parallel with the primary/secondary meaning.

Since the suggestive meaning is in addition to the primary/secondary meaning, is optional, and also differs from listener to listener, processing it involves subjectivity. Hence it is not possible to objectively process this meaning for any utterance.³ This also puts an upper limit on the meaning one can get from a linguistic utterance without the interference of subjective judgments. In summary, we observe that these three meanings are not on the same plane. Lakṣaṇā comes into play only when abhidhā fails to provide a suitable meaning for a congruent interpretation. And the suggestive meaning can co-exist with the abhidhā as well as the lakṣaṇā, and as such is outside the scope of automatic processing.

3 Modeling Yogyatā

Yogyatā is the compatibility between the meanings of related words. This meaning, as we saw above, can be either a primary or a metaphoric one. The absence of any hindrance in the understanding of a sentence implies there is yogyatā or congruity among the meanings. There have been different views among scholars about what yogyatā is.
According to one definition, yogyatā is artha-abādhaḥ⁴ (that which is not a hindrance to meaning). It is further elaborated as bādhaka-pramā-virahaḥ or bādhaka-niścaya-abhāvaḥ (absence of the decisive knowledge of incompatibility). There are other attempts to define it as an existing qualifying property. One such definition is sambandha-arhatvam (eligibility for mutual association), and another is paraspara-anvaya-prayojaka-dharmavattvam (a property of promoting mutual association). The first set of definitions presents yogyatā as an absence of incompatibility, whereas the second set presents it as the presence of compatibility between the meanings.

³ One of the reviewers commented that, taking into account the advances in Big Data and Machine Learning techniques, it may even be possible to process such meanings by machines in the future. However, we are of the opinion that a machine would need a semantically annotated corpus for learning, which does not yet exist.
⁴ All the meanings we will be discussing below are found in NK p. 675.

Let us see the implications of modeling yogyatā through these two lenses.

1. We establish a relation only if the two morphemes are mutually congruous. In this case, we need to take care not only of the congruity between the primary meanings but also of that between the metaphoric/secondary meanings.

2. We establish a relation if there is no incongruity between the two meanings.

The first possibility ensures that the precision is high and there is less chance of Type-1 error, i.e. of allowing wrong solutions. The second possibility, on the other hand, ensures that the recall is high and there is less chance of Type-2 error, viz. the possibility of missing a correct solution. But there is a chance that we allow some unmeaningful solutions as well.
If we decide to go for the first possibility, we need to handle both the primary as well as the secondary meanings, and we need to state precisely under what conditions the meanings are congruous. This means modeling congruity for each verb and for each relation. This is a gigantic task, and there is a possibility of missing correct solutions if we do not take into account all the possible extensions of meanings. Therefore, we decided to go for the second choice, allowing the machine to make some mistakes by choosing incongruous solutions, but never throwing away a correct solution. This decision is in favor of our philosophy of sharing the load between man and machine. Our aim is to provide access to the original text by reducing the language learning load, so we can not afford to miss a possible solution. Thus, at the risk of providing more solutions than the actually possible ones, we decided to pass on to the reader some of the load of pruning out irrelevant solutions manually.

In the first step, we decided to use yogyatā only in those cases where a case marker is ambiguous between more than one relation. We noticed the following three cases of ambiguity with reference to the relations.

1. viśeṣya-viśeṣaṇa-bhāva (adjectival relation)
Here both the viśeṣya and the viśeṣaṇa agree in gender, number and case, and hence on the basis of the word form alone we can not tell which one is the viśeṣya and which one the viśeṣaṇa.

2. a kāraka and a non-kāraka relation, as in
a. karaṇam (instrument) and hetu (cause), with an instrumental case marker,
b. sampradānam (beneficiary), prayojanam (purpose) and tādarthya (being intended for), with a dative case marker,
c. apādānam (source) and hetu (cause), with an ablative case marker.

3.
ṣaṣṭhī sambandha (a genitive relation) and a viśeṣaṇa (an adjective)
When two words are in the genitive case, it is not clear whether there is an adjectival relation between them, or a genitive relation.

We now discuss each of these three cases below.

3.1 Viśeṣya-viśeṣaṇa-bhāva (Adjectival relation)

We come across the term samānādhikaraṇa (co-reference) in Pāṇini to denote an adjective (Joshi and Roodbergen 1998, p. 6). One of the contexts in which the term samānādhikaraṇa is used is that of the agreement between an adjective and a noun.⁵ For example, dhāvantaṁ mṛgaṁ (a running deer), or sundaraḥ aśvaḥ (a beautiful horse). Pāṇini has not defined the term samānādhikaraṇa, either. The term samānādhikaraṇa (co-reference) literally means 'having the same locus'. Patañjali, in the Samartha-āhnika, discusses the term sāmānādhikaraṇya (co-referentiality; literally, the property of being in the same locus). In the example sundaraḥ aśvaḥ (a beautiful horse), both the qualities of saundarya (beauty) and aśvatva (horse-ness) reside in an aśva (horse), which is the common locus. Similarly, in the case of ācāryaḥ droṇaḥ, or agne gṛhapate (O Agni! house-holder), both the words ācārya as well as droṇa refer to the same individual, and so do agni and gṛhapati. This is true of various other relation-denoting terms such as guru, śiṣya, pitā, putra, etc., and of upādhis (imposed / acquired properties) such as rājā, mantrī, vaidya, etc. From all this discussion, we may say that sāmānādhikaraṇya (the property of having the same locus) is the semantic characterisation of a viśeṣaṇa.

⁵ sāmānādhikaraṇyam ekavibhaktitvam ca. dvayoścaitad bhavati. kayoḥ. viśeṣaṇa-viśeṣyayoḥ vā sañjñā-sañjñinorvā (MBh 1.1.1)

In Sanskrit, there is no syntactic / morphological category of viśeṣaṇa (adjective). The gender, number and case of a viśeṣaṇa follow those of the viśeṣya (the head). From the point of view of analysis, this provides a
From the point of view of analysis this provides asyntactic clue for a possible viśeṣya-viśeṣaṇa-bhāva between two words suchas in śuklaḥ paṭaḥ (a white cloth). This agreement is just a necessary con-dition, and not sufficient. Because, a viśeṣaṇa, in addition to agreeing withthe viśeṣya should also be semantically fit to be a qualifier of the viśeṣya.For example, there can be two words say yānam (a vehicle) and vanam (aforest), that match perfectly in gender, number and case, but we can notimagine a viśeṣya-viśeṣaṇa-bhāva between yāna and vana. Is it only thesemantics that rules out such a relation or are there any clues, especiallysyntactic ones, that help us to rule out a viśeṣya-viśeṣaṇa-bhāva betweensuch words?In search of clues:Pāṇini has not defined the terms viśeṣya and viśeṣaṇa. Patañjali usestwo terms dravya (substance) and guṇa (quality) while commenting on theagreement between a viśeṣya and a viśeṣaṇa.yad asau dravyaṁ śrito bhavati guṇaḥ tasya yat liṅgam vacanamca tad guṇasya api bhavati. (MBh under A4.1.3 Vt VI.)A quality assumes the gender and number of the substance inwhich it resides.But then what is this guṇa?We come across the description of guṇa by Kaiyyaṭa.sattve niviśate apaiti pṛthag jātiṣu dṛśyateādheyaḥ -ca-akriyājaḥ-ca saḥ asattva-prakṛti-guṇaḥ(MBh A4.1.44)Guṇa is something which is found in things / substances (sattve68 Sanjeev Panchal and Amba Kulkarniniviśate), which can cease to be there (apaiti), which is foundin different kinds of substances (pṛthag jātiṣu), which is some-times an effect of an action and sometimes not so (ādheyaḥ-ca-akriyājaḥ-ca), and whose nature is not that of a substance(asattva-prakṛti).Thus guṇa is something which is not a substance since it resides in otherthings. It is not universal since it is found in different kinds of substances.It is not an action, since guṇa is sometimes an effect of an action, as in thecase of the color of a jar and sometimes not, as in the case of the magnitudeof a substance. 
This characterisation of guṇa is very close to the vaiśeṣika concept of guṇa (Raja 1963).

Then, is this vaiśeṣika guṇa a viśeṣaṇa?

Patañjali, commenting on the word guṇa under A2.2.11, provides an example contrasting two types of guṇas. While both śukla and gandha are qualities (guṇa) according to the vaiśeṣika ontology, the usage śuklaḥ paṭaḥ (a white cloth) is possible, while gandham candanam (fragrance sandalwood) is not. Thus, only some of the vaiśeṣika guṇas have the potential to be a viśeṣaṇa, and not all.

If a viśeṣaṇa is not a vaiśeṣika guṇa, what is it?

The characterisation of guṇa by Bhartṛhari in the Guṇa-samuddeśa includes bhedakam (being a differentiator) as one of the characteristics of guṇa. But in addition, guṇa, according to him, is also capable of expressing the degree of a quality in a substance through a suffix. He defines guṇa as

saṁsargi bhedakaṁ yad yad savyāpāraṁ pratīyate
guṇatvaṁ paratantratvāt tasya śāstra udāhṛtam (VP III.5.1)

Whatever rests on something else (saṁsargi), differentiates it (bhedaka), and is understood in that function (savyāpāra) is, being dependent, called quality in the śāstra. (Iyer 1971)

According to Bhartṛhari, apart from being a differentiator, a guṇa has another important characteristic, viz. that such a distinguishing quality can also express the degree of excellence through some suffix (such as the comparative suffix tarap, or the superlative suffix tamap). This concept of guṇa of Bhartṛhari is thus different from that of the vaiśeṣikas. It definitely rules out the case of gandha, since we can not have gandhatara, but we can have śuklatara to distinguish the whiteness of two white cloths.

Another clue from Pāṇini

We have another hint from Pāṇini through Patañjali. While in A4.1.3,
While in A4.1.3 Patañjali uses the terms dravya and guṇa in connection with agreement, in A1.2.52 he uses the term guṇavacana while describing a viśeṣaṇa:

guṇavacanānāṁ śabdānām āśrayataḥ liṅgavacanāni bhavanti iti (A1.2.52).

The words which are guṇavacanas take the gender and number of the substance in which they reside.

The term guṇavacana is used for those words which designate a quality and then a substance in which this quality resides (Cardona 2009). In the example śuklaḥ paṭaḥ, śukla, in addition to being a quality (white color), can also designate a substance, such as a paṭa (cloth) which is white in color, and so it is a guṇavacana word. But gandha (fragrance) designates only a quality and cannot be used to designate a substance that has fragrance; hence it is not a guṇavacana.

Is guṇavacana necessary and sufficient to describe a viśeṣaṇa?

Let us look at the examples above. It definitely rules out yānam and vanam as qualifiers of each other, since neither of them is a quality. But then what about dhāvan (the one who is running) in dhāvan bālakaḥ (a running boy)? Is dhāvan a guṇavacana?

Guṇavacana is a technical term, used by Pāṇini to define an operation of elision of the matup suffix in certain quality-denoting words such as śukla etc. So technically, a word such as dhāvan, though it designates a substance, is not a guṇavacana. This is clear from Patañjali's commentary on A1.4.16, where he[6] states that compounds (samāsa), primary derivatives (kṛdantas), secondary derivatives (taddhitāntas), indeclinables (avyaya), pronouns (sarvanāma), words referring to universals (jāti), and numerals (saṁkhyā) cannot get the designation guṇavacana, since the latter saṁjñās (technical terms) supersede the previous ones.[7]

[6] The vārttika guṇavacanam ca is followed by several other vārttikas, of which the following two are relevant: samāsa-kṛt-taddhita-avyaya-sarvanāma-asarvaliṅgā jātiḥ ||41|| saṁkhyā ca ||42||

The very fact that Kātyāyana had to mention that words belonging to all the latter categories are not guṇavacana indicates that all these categories of words had the potential to get the guṇavacana designation, but Pāṇini did not intend to assign this saṁjñā to them. Whatever the reason may be, this list of various categories in fact provides us a morphological clue for a word to be a viśeṣaṇa.

Here are some examples of viśeṣaṇas belonging to these different grammatical categories.

1. Samāsa (a compound): Bahuvrīhi (exo-centric) compounds refer to an object different from the components of the compound, and thus typically act as adjectives. For example, pītāmbaraḥ is made up of two components, pīta (yellow) and ambara (cloth), but it refers to the 'one wearing a yellow cloth' (and is conventionally restricted to Viṣṇu). An example of a tatpuruṣa (endo-centric) compound as a viśeṣaṇa is parama-udāraḥ (extremely noble).

2. Kṛdanta (an adjectival participle): Nouns derived from verbs act as qualifiers of a noun. For example, in the expression dhāvantam mṛgam (a running deer), dhāvantam, a verbal noun, is a viśeṣaṇa. Only certain kṛdanta suffixes, such as śatṛ, śānac, kta, etc., produce nouns that can be viśeṣaṇas, and not all.

3. Taddhita (a secondary derivative): Taddhitas with certain suffixes derive new nouns such as bhāratīya (Indian), dhanavān (wealthy), guṇin (possessing good qualities), etc. that denote a substance, as against certain other taddhita words such as manuṣyatā (humanity), vārddhakya (senility), etc.
which derive new words designating qualities.

4. Sarvanāma (a pronoun): Pronouns also act as qualifiers. For example, in the expression idam pustakam (this book), idam is a viśeṣaṇa.

5. Jāti (a universal): In the expression āmraḥ vṛkṣaḥ (a mango tree), both the words āmraḥ and vṛkṣaḥ are common nouns, but one is special and the other general: the designation of āmra is a subset of the designation of vṛkṣa. Only in such cases, where there is a parājāti-aparājāti (hypernym-hyponym) relation, does the one denoting an aparājāti (hyponym) qualify to be a viśeṣaṇa of the other.

6. Saṁkhyā (a numeral): In the expression ekaḥ puruṣaḥ (one man), the word ekaḥ designates a number, which is a viśeṣaṇa of puruṣa.

[7] guṇavacanasaṁjñāyāḥ ca etābhiḥ bādhanaṁ yathā syāt iti

There are still two more classes of words that are not covered in the above list but which can be viśeṣaṇas: words denoting an acquired or an imposed property, and relation-denoting terms. For example, ācāryaḥ in ācāryaḥ droṇaḥ is an imposed property, and putraḥ in daśarathasya putraḥ rāmaḥ is a relation-denoting term.

In summary, samastapadas, certain kṛdantas, certain taddhitāntas, saṁkhyās, sarvanāmas, ontological categories such as parā-aparā jātis, the semantico-syntactic property guṇavacana, and finally semantic properties such as relation-denoting terms and upādhis all serve as characterisations of a viśeṣaṇa. This characterisation is only a necessary condition, and not a sufficient one, since it does not involve any mutual compatibility between the words. However, it brings more precision into the necessary conditions for two words to be in viśeṣya-viśeṣaṇa-bhāva.

3.1.1 Deciding a Viśeṣya

Once we have identified the words that are mutually compatible with regard to an adjectival relation, the next thing is to decide the viśeṣya (head) among them.
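As a rough illustration, the candidate-identification step just described, agreement plus a category-based viśeṣaṇa test, can be sketched as follows. This is our own sketch, not the paper's implementation; the feature representation and category labels are illustrative assumptions.

```python
# Hypothetical sketch of the necessary (not sufficient) conditions for a
# viśeṣya-viśeṣaṇa candidate pair. Feature names and category labels are
# our own illustrative assumptions, not the authors' data format.

# Categories that, per the discussion above, can be viśeṣaṇas.
VISESANA_CATEGORIES = {
    "samasa", "krdanta", "taddhitanta", "sarvanama",
    "jati", "samkhya", "gunavacana", "upadhi", "relational",
}

def agrees(w1, w2):
    """Agreement in gender, number and case: a necessary condition."""
    return all(w1[f] == w2[f] for f in ("gender", "number", "case"))

def may_be_visesana(word):
    """Morphological/ontological clue: the word's category must be one
    that the vārttikas allow as a qualifier."""
    return word["category"] in VISESANA_CATEGORIES

def candidate_pair(w1, w2):
    """Necessary condition for viśeṣya-viśeṣaṇa-bhāva between w1 and w2."""
    return agrees(w1, w2) and (may_be_visesana(w1) or may_be_visesana(w2))

suklah = {"stem": "sukla", "gender": "m", "number": "sg", "case": 1,
          "category": "gunavacana"}
patah = {"stem": "pata", "gender": "m", "number": "sg", "case": 1,
         "category": "dravya"}
yanam = {"stem": "yana", "gender": "n", "number": "sg", "case": 1,
         "category": "dravya"}
vanam = {"stem": "vana", "gender": "n", "number": "sg", "case": 1,
         "category": "dravya"}

print(candidate_pair(suklah, patah))  # True: agreement plus a gunavacana
print(candidate_pair(yanam, vanam))   # False: neither word can qualify
```

The check mirrors the text: yānam and vanam agree morphologically, but since neither belongs to a viśeṣaṇa-capable category, the pair is pruned.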
The commentary on A2.1.57 is useful in deciding the viśeṣya. This sūtra deals with the compound formation of two words that are in viśeṣya-viśeṣaṇa-bhāva. In Sanskrit compound formation, the subordinate member gets the designation upasarjana. This provides us a clue about which word classes are subordinate to which others. A noun may refer to a substance through an expression expressing the class character (jāti), such as utpalam (a flower), through an action associated with it (kriyāvacana), as in dhāvan (running), or through a guṇavācaka such as nīlam. If there are two words designating common nouns, one denoting a special and the other a general one, then the one which denotes the special type of common noun is subordinate.[8] For example, in āmraḥ vṛkṣaḥ, āmra is a special kind of tree, and hence is a viśeṣaṇa, and vṛkṣa is its viśeṣya. If one word designates a common noun and the other either a guṇavacana or a kriyāvacana, then the word denoting the common noun becomes the viśeṣya.[9] Thus in nīlam utpalam, utpalam is the viśeṣya. In pācakaḥ brāhmaṇaḥ (a Brahmin cook), brāhmaṇaḥ is the viśeṣya. When one of the words designates a guṇavacana and the other a kriyāvacana, or both the words designate either guṇavacanas or kriyāvacanas, then either of them can be the viśeṣya, as in khañjaḥ kubjaḥ (a hump-backed person who is limping) or kubjaḥ khañjaḥ (a limping person with a hump-back), and similarly in khañjaḥ pācakaḥ (a limping cook) or pācakaḥ khañjaḥ (a limping person who is a cook), etc.

On the basis of the above discussion, we have the following preferential order for the viśeṣya:

jātivācaka > {guṇavacana, kṛdanta}.

We saw earlier that a viśeṣaṇa can be any one of the following: a pronoun, a numeral, a kṛdanta, a taddhitānta, a samasta-pada, a guṇavācaka, a jāti, a relation-denoting term, and an upādhi.
So adding all these categories to the above preferential order, we get

jātivācaka > upādhi > taddhitānta > guṇavacana > numeral > kṛdanta > pronoun.[10]

3.1.2 Flat or Hierarchical Structure?

After we identify all the words that have a samānādhikaraṇa relation between them, and mark the viśeṣya (the head) among them, the next task is to know whether a viśeṣaṇa is related to this viśeṣya directly, or through other viśeṣaṇas.

If there are n viśeṣaṇas and all of them are related to the viśeṣya directly, then the result is a flat structure. But if a viśeṣaṇa may be related to the viśeṣya through other viśeṣaṇas, then there is an exponentially large number of ways in which the n viśeṣaṇas can relate to the viśeṣya. For example, suppose there are three words a, b and c, of which c is the viśeṣya. Then, computationally, there are three ways in which the other two words may relate to c:

1. Both a and b are viśeṣaṇas of c. (This results in a flat structure.)
2. a is a viśeṣaṇa of b, and b that of c.
3. b is a viśeṣaṇa of a, and a that of c.

In positional languages like English, only the first two cases are possible. For example, consider the phrase 'light red car', which may either mean a car which is red in color and light in weight, or a car which is light-red in color. In the second case, light-red is a compound.

Sanskrit being a free word order language, one can imagine, computationally, a possibility for the third type as well. However, the relation between the adjectival terms being that of sāmānādhikaraṇya (co-reference), semantically only a flat structure is possible with adjectives.

[8] sāmānyajāti-viśeṣajātiśabdayoḥ samabhivyāhāre tu viśeṣajātireva viśeṣaṇam, under A2.1.57, in BM
[9] jātiśabdo guṇakriyāśabdasamabhivyāhāre viśeṣyasamarpaka eva na tu viśeṣaṇasamarpakaḥ, svabhāvāt, under A2.1.57, in BM
[10] This preferential order is purely based on some observations of the corpus, and needs further theoretical support, if there is any.
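The combinatorial blow-up described above can be made concrete with a small enumeration. The sketch below is our own illustration, not part of the authors' system: each viśeṣaṇa picks a head among the viśeṣya and the other viśeṣaṇas, and acyclic assignments are counted.

```python
# Sketch (our illustration): count the dependency structures available
# when n viśeṣaṇas may attach either to the viśeṣya or to one another.
from itertools import product

def attachment_structures(visesanas, visesya):
    """Enumerate acyclic parent assignments: every viśeṣaṇa picks a head
    among the viśeṣya and the other viśeṣaṇas."""
    structures = []
    for parents in product([visesya] + visesanas, repeat=len(visesanas)):
        assignment = dict(zip(visesanas, parents))
        # Reject self-attachment outright.
        if any(w == p for w, p in assignment.items()):
            continue
        # Reject cyclic assignments by walking each chain up to the head.
        ok = True
        for w in visesanas:
            seen = set()
            while w in assignment:
                if w in seen:
                    ok = False
                    break
                seen.add(w)
                w = assignment[w]
            if not ok:
                break
        if ok:
            structures.append(assignment)
    return structures

trees = attachment_structures(["a", "b"], "c")
print(len(trees))  # 3, the three cases enumerated above; only the flat
                   # one ({'a': 'c', 'b': 'c'}) is semantically licensed
```

For n viśeṣaṇas the count grows as (n+1)^(n-1) (Cayley's formula for rooted labelled trees), e.g. 16 structures already for n = 3, which is why the flat-structure result below is such a practical relief.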
The other two cases, the hierarchical structures, result in compound formation in Sanskrit. This is also supported by Jaimini's Mīmāṁsā sūtra:

guṇānām ca parārthatvāt asambandhaḥ samatvāt syāt. (MS 3.1.22)

In as much as all subsidiaries are subservient to something else and are equal in that respect, there can be no connection among themselves. (Jha 1933)

Thus, a viśeṣaṇa is not connected to another viśeṣaṇa. The associated structure is a flat one, with all the viśeṣaṇas connected to the viśeṣya.

3.2 Distinguishing a kāraka from a non-kāraka

In Sanskrit, some case markers denote both a kāraka relation and a non-kāraka relation, as we saw earlier. In a sentence, if a verb denotes an action, then nouns denote the participants in that action. These participants, which are classified into six types, viz. kartā, karma, karaṇam, sampradānam, apādānam, and adhikaraṇam, are collectively called kārakas. Other nouns in the sentence, which do not participate directly in the action, express non-kāraka relations such as hetu (cause), prayojanam (purpose), etc. We get a clue to distinguish between the nouns which are related by a kāraka relation and those which are related by a non-kāraka one in the Aruṇādhikāra of the Śābara bhāṣya. There it is mentioned that

na ca amūrta-arthaḥ kriyātāḥ sādhanaṁ bhavatīti (SB; p. 654)

No unsubstantial object can ever be the means of accomplishing an act.

Thus anything other than a dravya cannot be a kāraka. As we saw earlier, the guṇavacanas can also designate a dravya. Thus, all the dravyas and the guṇavacanas are qualified to be a kāraka. And the rest, i.e.
nouns which denote either a guṇa which is not a guṇavacana, or a kriyā (verbal nouns), may have a non-kāraka relation with a verb.

Let us see some examples.

Skt: rāmaḥ daśarathasya ājñayā rathena vanam gacchati.
Gloss: Rama{nom.} Dasharatha{gen.} order{ins.} ratha{ins.} forest{acc.} goes.
Eng: On Dasharatha's order, Rama goes to the forest by a chariot.

Skt: rāmaḥ adhyayanena atra vasati.
Gloss: Rama{nom.} study{ins.} here lives.
Eng: Rama lives here in order to study.

In the first sentence, ājñā (order) is the cause of Rama's going to the forest and ratha (chariot) is the instrument (or vehicle) of his going; in the second sentence, adhyayana is the cause of Rama's stay.

Since both hetu and karaṇam demand a third (instrumental) case suffix, ākāṅkṣā would establish a relation of karaṇam between ājñayā and gacchati,[11] between rathena and gacchati, and also between adhyayanena and gacchati. Now, with the above definition of a kāraka, adhyayana, being a verbal noun (a kṛdanta) in the sense of bhāva, represents an abstract concept and therefore does not designate a dravya (a substance). Hence it cannot be a karaṇam. Similarly ājñā, which is a guṇa (being a śabda, according to the vaiśeṣika ontology), cannot be a karaṇam. Thus the use of congruity helps in pruning out impossible relations.

On the same grounds, the establishment of apādānam and sampradānam relations between a non-dravya[12] denoting noun and a verb can also be prevented.

[11] To be precise, the relation is between the meaning denoted by the nominal stem ājñā and the one denoted by the verbal root gam.

3.3 Congruous substantive for a ṣaṣṭhī (genitive)

Pāṇini has not given any semantic criterion for the use of the genitive relation. His rule is ṣaṣṭhī śeṣe (A2.3.50), which means: in all other cases that are not covered so far, the genitive case suffix is to be used. The relation marked by the ṣaṣṭhī (genitive) case marker falls under the utthāpya (aroused) ākāṁkṣā. This is a case of uni-directional expectancy.
Thus, there is no syntactic clue as to which noun a word in the genitive case would get attached; all other nouns in the sentence are potential candidates for a genitive relation to join with. The clue, however, is semantic. Patañjali, in the Mahābhāṣya on A2.3.50, provides some semantic clues. He says there are hundreds of meanings of the ṣaṣṭhī. Some of them are sva-svāmi-bhāva, as in rājñaḥ puruṣaḥ (a king's man), avayava-avayavī-bhāva, as in vṛkṣasya śākhā (a branch of a tree), etc. So in order to establish a genitive relation, we need semantic inputs. However, there are certain constraints:

1. A genitive connecting a verbal noun expressing bhāva, such as one formed with lyuṭ etc., expresses a kāraka[13] relation and not the genitive one, as in rāmasya gamanam.

2. A genitive always connects with a viśeṣya, and never with a viśeṣaṇa, since there is a samānādhikaraṇa relation between the viśeṣya and the viśeṣaṇa. For example, in the expression rāmasya vīreṇa putreṇa, the genitive relation of rāmasya is with putreṇa and not with vīreṇa.

Lexical resources such as the Sanskrit WordNet[14] and the Amarakośa[15] that are marked with semantic information such as the part-whole relation, janya-janaka-bhāva, the ājīvikā relation, etc. help in identifying the genitive relations with confidence. When both the words refer to dravyas (substantives), then also there is a possibility of a genitive relation. Note that, while for other relations we look for the absence of non-congruity to rule out relations, in the case of genitives we instead look for the presence of congruity to prune out impossible relations. We took this decision since we found it difficult to describe non-congruity in the case of genitive relations.

[12] To be precise, a non-dravya and non-guṇavacana.
[13] kartṛkarmaṇoḥ kṛti (A2.3.65)

Ambiguity between a genitive and an adjectival relation

Further, we come across an ambiguity in the genitive relation in the presence of adjectives.
Look at the following two examples.

Skt: vīrasya Rāmasya bāṇam
Gloss: brave{gen.} Rama{gen.} arrow
Eng: an arrow of brave Rama

and

Skt: Rāmasya putrasya pustakam
Gloss: Rama{gen.} son{gen.} book
Eng: a book of Rama's son

In the first example, vīra, being a guṇavacana, would, by the earlier characterisation of an adjective, be marked as an adjective, while in the second example there is a kinship relation.

4 Evaluation

As stated earlier, ākāṅkṣā states the possibility of relations between two words. The mutual compatibility between the meanings further helps in pruning out the incompatible relations. We classified the content nouns into two classes, dravya and guṇa, guṇas being further marked if they are guṇavacanas. We tested the mutual compatibility only when the suffix is ambiguous. To be precise, yogyatā is used only to disambiguate between a kāraka and a non-kāraka relation, to establish the viśeṣya-viśeṣaṇa-bhāva, and to establish a genitive relation. This ensured that we do not miss metaphoric meanings. In the case of kāraka relations, if the noun denotes a guṇa which is not a guṇavacana, then the possible kāraka relation proposed on the basis of expectancy is
The statistics showing the size ofvarious texts, the average word length and the average sentence length isgiven in Table 2.Type Sents Words characters avg sntlen avg wrd lenText books 260 1,295 9,591 4.98 7.40Syntax 937 3,339 25,410 3.56 7.61Māgha’s SPV 66 623 5,851 9.40 9.39Bhagvadgītā 940 5,698 42,251 6.06 7.41Total 2,203 10,955 83,103 3.77 7.58Table 2Corpus CharacteristicsAll these sentences were run through a parser, first without using theconditions of yogyatā and second times using the conditions of yogyatā. Inboth cases, the parser produced all possible parses. We also ensured thatthe correct parse is present among the produced solutions. Table 3 showsthe statistics providing the number of solutions with and without using thefilter of yogyatā. The number of parses produced was reduced drastically.This improved the precision by 63% in textbook stories, by 67% in thegrammatical constructs, and by 81% in case of the text from Bhagvadgītāand Māgha’s kāvya. Better results in the case of these texts pertains to thefact that these texts have more usage of adjectives and non-kāraka relationsas against the textbook sentences, and artificial grammatical constructs.16All the ślokas were presented in their anvita form, following the traditional Daṇḍān-vaya method, where the verb typically is at the end, and viśeṣaṇas precede the viśeṣyas.78 Sanjeev Panchal and Amba KulkarniCorpus type Sents avg sols avg sols improvementwithout with inyogyata yogyata precisionText books 260 39.76 14.56 63%Syntax 937 19.5 6.33 67%Literary 66 11,199 2,107 81%BhG 940 2,557 478 81%Total 2203 1439.54 268.85 81%Table 3Improvement5 ConclusionYogyatā or mutual congruity between the meanings of the related words isan important factor in the process of verbal cognition. In this paper, wepresented the computational modeling of yogyatā for automatic parsing ofSanskrit sentences. 
Among the several definitions of yogyatā, we modeled it as an absence of non-congruity.

Due to the lack of any syntactic criterion for viśeṣaṇas (adjectives) in Sanskrit, parsing Sanskrit texts with adjectives resulted in a high number of false positives. Hints from the vyākaraṇa texts helped us in the formulation of a criterion for viśeṣaṇa with syntactic and ontological constraints, which provided us a way to decide the absence of non-congruity between two words with respect to the adjectival relation. A simple two-way classification of nouns into dravya (substance) and guṇa (quality), with a further classification of guṇas into guṇavacanas, was found to be necessary for handling adjectives. The same criterion was also found useful to handle the ambiguities between kāraka and non-kāraka relations. These criteria, together with the modeling of yogyatā as an absence of non-congruity, resulted in an 81% improvement in precision.

Finally, given that there cannot be an adjective of an adjective, once a viśeṣya is identified there is only one way all the viśeṣaṇas can connect with it. This theoretical input provided much relief from a practical point of view; in its absence, the number of possible solutions would have been exponential.

6 Abbreviations

A: Pāṇini's Aṣṭādhyāyī, see Pande, 2004
Aa.b.c: adhyāya (chapter), pāda (quarter), sūtra number in the Aṣṭādhyāyī
BM: Bālamanoramā, see Pande, 2012
MBh: Patañjali's Mahābhāṣya, see Mīmāṁsaka
KP: Kāvyaprakāśa, see Jhalakikar
MS: Mīmāṁsā sūtra, through SB
NK: Nyāyakośa, see Jhalakikar
PM: Padamañjarī, see Mishra
SB: Śābara Bhāṣya, see Mīmāṁsaka, 1990
VP: Vākyapadīyam, see Sharma, 1974

References

Bhanumati, B. 1989. An Approach to Machine Translation among Indian Languages. Tech. rep. Dept. of CSE, IIT Kanpur.

Bharati, Akshar, Vineet Chaitanya, and Rajeev Sangal. 1995. Natural Language Processing: A Paninian Perspective.
Prentice-Hall, New Delhi.

Bharati, Akshar, Samar Husain, Bharat Ambati, Sambhav Jain, Dipti M. Sharma, and Rajeev Sangal. 2008. "Two semantic features make all the difference in parsing accuracy". In: Proceedings of the 6th International Conference on Natural Language Processing (ICON-08). C-DAC, Pune.

Cardona, George. 2007. Pāṇini and Pāṇinīyas on Śeṣa Relations. Kunjunni Raja Academy of Indological Research, Kochi.

— 2009. "On the structure of Pāṇini's system". In: Sanskrit Computational Linguistics 1 & 2. Ed. by Gérard Huet, Amba Kulkarni, and Peter Scharf. Springer-Verlag LNAI 5402.

Devasthali, G. V. 1959. Mīmāṁsā: The vākya śāstra of Ancient India. Booksellers' Publishing Co., Bombay.

Huet, Gérard, Amba Kulkarni, and Peter Scharf, eds. 2009. Sanskrit Computational Linguistics 1 & 2. Springer-Verlag LNAI 5402.

Iyer, K. A. Subramania. 1969. Bhartṛhari: A study of Vākyapadīya in the light of Ancient Commentaries. Deccan College, Poona.

— 1971. The Vākyapadīya of Bhartṛhari, Chapter III, pt. i, English Translation. Deccan College, Poona.

Jha, Ganganatha. 1933. Śābara Bhāṣya. Oriental Institute, Baroda.

Jhalakikar, V. R. 1920 (7th edition). Kāvyaprakāśa of Mammaṭa with the Bālabodhinī. Bhandarkar Oriental Research Institute, Pune.

— 1928. Nyāyakośa. Bombay Sanskrit and Prakrit Series 49, Poona.

Jijñāsu, Brahmadatta. 1979. Aṣṭādhyāyī (Bhāṣya) Prathamāvṛtti (in Hindi). Ramlal Kapoor Trust, Bahalgadh, Sonepat, Haryana, India.

Joshi, S. D. 1968. Patañjali's Vyākaraṇa Mahābhāṣya Samarthāhnika (P 2.1.1), Edited with Translation and Explanatory Notes. Center of Advanced Study in Sanskrit, University of Poona, Poona.

Joshi, S. D. and J. A. F. Roodbergen. 1975. Patañjali's Vyākaraṇa Mahābhāṣya Kārakāhnikam (P 1.4.23–1.4.55). Pune: Center of Advanced Study in Sanskrit.

— 1998. The Aṣṭādhyāyī of Pāṇini with Translation and Explanatory Notes, Volume 7. Sahitya Akademi, New Delhi.

Katz, J. J. and J. A. Fodor. 1963. "The structure of a semantic theory". Language 39, pp.
170–210.

Kiparsky, Paul. 2009. "On the Architecture of Pāṇini's Grammar". In: Sanskrit Computational Linguistics 1 & 2. Ed. by Gérard Huet, Amba Kulkarni, and Peter Scharf. Springer-Verlag LNAI 5402, pp. 33–94.

Kulkarni, Amba. 2013b. "A Deterministic Dependency Parser with Dynamic Programming for Sanskrit". In: Proceedings of the Second International Conference on Dependency Linguistics (DepLing 2013). Prague, Czech Republic: Charles University in Prague, Matfyzpress, pp. 157–166.

Kulkarni, Amba and Gérard Huet, eds. 2009. Sanskrit Computational Linguistics 3. Springer-Verlag LNAI 5406.

Kulkarni, Amba, Sheetal Pokar, and Devanand Shukl. 2010. "Designing a Constraint Based Parser for Sanskrit". In: Fourth International Sanskrit Computational Linguistics Symposium. Ed. by G. N. Jha. Springer-Verlag LNAI 6465, pp. 70–90.

Kulkarni, Amba and K. V. Ramakrishnamacharyulu. 2013a. "Parsing Sanskrit texts: Some relation specific issues". In: Proceedings of the 5th International Sanskrit Computational Linguistics Symposium. Ed. by Malhar Kulkarni. D. K. Printworld (P) Ltd.

Kulkarni, Amba, Preeti Shukla, Pavankumar Satuluri, and Devanand Shukl. 2013c. "How 'free' is the free word order in Sanskrit". In: Sanskrit Syntax. Ed. by Peter Scharf. Sanskrit Library, pp. 269–304.

Mishra, Sri Narayana. 1985. Kāśikāvṛttiḥ along with commentaries Nyāsa of Jinendrabuddhi and Padamañjarī of Haradattamiśra. Ratna Publications, Varanasi.

Mīmāṃsakaḥ, Yudhiṣṭhira. 1990. Mīmāṁsā Śābara Bhāṣya. Ramlal Kapoor Trust, Sonipat, Haryana.

— 1993. Mahābhāṣyam, Patañjalimuniviracitam. Ramlal Kapoor Trust, Sonipat, Haryana.

Pande, Gopaldatta. 2000 (reprint edition). Vaiyākaraṇa Siddhāntakaumudī of Bhaṭṭojidīkṣita (text only). Chowkhamba Vidyabhavan, Varanasi.

— 2012 (reprint edition). Vaiyākaraṇa Siddhāntakaumudī of Bhaṭṭojidīkṣita containing Bālamanoramā of Śrī Vāsudevadīkṣita. Chowkhamba Surabharati Prakashan, Varanasi.

Pande, Gopaldatta. 2004.
Aṣṭādhyāyī of Pāṇini elaborated by M. M. Panditraj Dr. Gopal Shastri. Chowkhamba Surabharati Prakashan, Varanasi.

Pataskar, Bhagyalata A. 2006. "Semantic Analysis of the technical terms in the Aṣṭādhyāyī meaning 'Adjective'". Annals of the Bhandarkar Oriental Research Institute 87, pp. 59–70.

Raja, K. Kunjunni. 1963. Indian Theories of Meaning. Adyar Library and Research Center, Madras.

Ramakrishnamacaryulu, K. V. 2009. "Annotating Sanskrit Texts Based on Śābdabodha Systems". In: Proceedings of the Third International Sanskrit Computational Linguistics Symposium. Ed. by Amba Kulkarni and Gérard Huet. Hyderabad, India: Springer-Verlag LNAI 5406, pp. 26–39.

Ramanujatatacharya, N. S. 2005. Śābdabodha Mīmāṁsā. Institut Français de Pondichéry.

Resnik, Philip. 1993. "Semantic classes and syntactic ambiguity". In: ARPA Workshop on Human Language Technology. Princeton.

Sharma, Pandit Shivadatta. 2007. Vyākaraṇamahābhāṣyam. Chaukhamba Sanskrit Pratishthan, Varanasi.

Sharma, Raghunath. 1974. Vākyapadīyam Part III with commentary Prakāśa by Helārāja and Ambākartrī. Varanaseya Sanskrit Visvavidyalaya, Varanasi.

Shastri, Swami Dwarikadas and Pt. Kalika Prasad Shukla. 1965. Kāśikāvṛttiḥ with the Nyāsa and Padamañjarī. Varanasi: Chaukhamba Sanskrit Pratishthan.

Wilks, Yorick. 1975. "A preferential, pattern-seeking, semantics for Natural Language Interface". Artificial Intelligence 6, pp. 53–74.

An 'Ekalavya' Approach to Learning Context Free Grammar Rules for Sanskrit Using Adaptor Grammar

Amrith Krishna, Bodhisattwa Prasad Majumder, Anil Kumar Boga, and Pawan Goyal

Abstract: This work presents the use of Adaptor Grammar, a non-parametric Bayesian approach for learning (Probabilistic) Context-Free Grammar productions from data.
In Adaptor Grammar, we provide a set of non-terminals followed by a skeletal grammar that establishes the relations between the non-terminals in the grammar. The productions and the probabilities associated with the productions are automatically learnt by the system from the usages of words or sentences, i.e., the dataset. This facilitates the encoding of prior linguistic knowledge through the skeletal grammar, and yet the tiresome task of finding the productions is delegated to the system. The system learns the grammar structure entirely by observing the data. We call this approach the 'Ekalavya' approach. In this work, we discuss the effect of using Adaptor Grammars for Sanskrit on word-level supervised tasks such as compound type identification, and also on identifying the source and derived words from corpora for derivational nouns. In both works, we show the use of sub-word patterns learned using Adaptor Grammar as effective features for the corresponding supervised tasks. We also present our novel approach of using Adaptor Grammars for handling structured prediction tasks in Sanskrit. We present preliminary results for the word reordering task in Sanskrit, and we outline our plan for the use of Adaptor Grammars for dependency parsing and poetry-to-prose conversion tasks.

1 Introduction

Recent trends in the Natural Language Processing (NLP) community suggest an increased application of black-box statistical approaches such as deep learning. Such systems are preferred as they have increased the performance of several NLP tasks such as machine translation, sentiment analysis, word sense disambiguation, etc. (Manning 2016). MIT Technology Review reported the following regarding Noam Chomsky's opinion about the extensive use of 'purely statistical methods' in AI.
The report says that Chomsky "derided researchers in machine learning who use purely statistical methods to produce behavior that mimics something in the world, but who don't try to understand the meaning of that behavior" (Cass 2011).

Chomsky says: "It's true there's been a lot of work on trying to apply statistical models to various linguistic problems. I think there have been some successes, but a lot of failures. There is a notion of success ... which I think is novel in the history of science. It interprets success as approximating un-analyzed data." (Pinker et al. 2011). Norvig (2011), in his reply to Chomsky, comes to the defense of the statistical approaches used in the community. Norvig lays emphasis on the engineering aspects of the problems that the community deals with and the performance gains achieved by using such approaches. He rightly observes that, while the generative aspects of a language can be deterministic, the analysis of a language construct can lead to ambiguity. As probabilistic models are tolerant to noise in the data, the use of such approaches is often necessary for engineering success. It is often the case that the speakers of a language deviate in usage from the laid-out linguistic rules. This can be seen as noise in the dataset, and the systems we intend to build should be tolerant to such issues as well. The use of statistical approaches provides a convenient means of achieving this. But the use of statistical approaches does not imply discarding the linguistic knowledge that we possess. Manning (2016) cites the work of Paul Smolensky: "Work by Paul Smolensky on how basically categorical systems can emerge and be represented in a neural substrate (Smolensky and Legendre 2006).
Indeed, Paul Smolensky arguably went too far down the rabbit hole, devoting a large part of his career to developing a new categorical model of phonology, Optimality Theory (Prince and Smolensky 1993)." This is an example where linguistics and statistical computational models had a successful synergy, fruitful for both domains.

Probabilistic Context-Free Grammars (PCFGs) provide a convenient platform for expressing linguistic structures with probabilistic prioritization of the structures they accept. It has been shown that PCFGs can be learned automatically using statistical approaches (Horning 1969). In this work, we look into Adaptor Grammar (Johnson, T. L. Griffiths, and Goldwater 2007), a non-parametric Bayesian approach for learning a grammar from observations, say, sentences or word usages in the language. When given a skeletal grammar along with a fixed set of non-terminals, Adaptor Grammar learns the right-hand sides of the productions and the probabilities associated with them. The grammar does so just by observing the dataset provided to it, hence the name 'Ekalavya' approach.

The use of Adaptor Grammars for linguistic tasks provides the following advantages for a learning task.

1. Adaptor Grammars in effect output valid PCFGs, which in turn are context-free grammars, and thus are valid for linguistic representations.

2. They help to encode linguistic information which is already described in various formalisms via the skeletal grammars. Thus domain knowledge can effectively be used. The only restriction here might be that the expressive power of the grammar is limited to that of a context-free grammar.

3. By leveraging the power of statistics, we can obtain the likelihood of the various possible parses in case of structural ambiguity during the analysis of a sentence.

4.
While the proposed structures might not be as competitive in performance as black-box statistical approaches such as deep learning, the interpretability of Adaptor Grammar-based systems is a big plus. Grammar experts can look into the individual production rules learned by the system. This frees the experts from having to come up with the rules in the first place. Additionally, by looking into the production rules, which are understandable to any domain expert with a knowledge of context-free grammars, it can be validated whether or not the system has learned patterns that are relevant to the task.

In Section 2, we discuss the preliminaries regarding Context-Free Grammars, Probabilistic CFGs, and Adaptor Grammar. In Section 3, we discuss the use of Adaptor Grammars in various NLP tasks for different languages. We then describe the work performed in Sanskrit with Adaptor Grammars in Section 4. Finally, we discuss future directions for Sanskrit tasks, specifically for multiple structured prediction tasks.

2 Preliminaries - CFG and Probabilistic CFG

Context-Free Grammar was proposed by Noam Chomsky, who initially termed it phrase structure grammar. Formally, a Context-Free Grammar G is a 4-tuple (N, Σ, R, S), where N is a set of non-terminals, Σ is a finite set of terminals, R is the set of productions from N to (N ∪ Σ)*, where * is the Kleene star operation, and S is an element of N which is treated as the start symbol, forming the root of the parse trees for every string accepted by the grammar. Using the notation L_X for the language generated by a non-terminal X, the language generated by the grammar G is L_S.

[Figure 1: An example of a Context-Free Grammar]

The productions in Context-Free Grammars are often handcrafted by expert linguists, and it is common to have large CFGs for many real-life NLP tasks. It is also common for a given string to have multiple possible parses under a given grammar.
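A toy illustration of such ambiguity (our own example, not from the paper): under a tiny grammar in Chomsky normal form, even a three-token string admits two distinct parse trees, which a small recursive counter makes explicit.

```python
# Toy illustration (ours): a small CFG in Chomsky normal form under which
# one string has two distinct parse trees, i.e. structural ambiguity.
from functools import lru_cache

# S -> S S | 'a'  (a classic ambiguous grammar over strings of a's)
binary_rules = {"S": [("S", "S")]}
lexical_rules = {"S": ["a"]}

def count_parses(symbol, tokens):
    """Count the distinct parse trees deriving tokens from symbol."""
    @lru_cache(maxsize=None)
    def count(sym, i, j):
        total = 0
        # A single token may be derived directly by a lexical rule.
        if j - i == 1 and tokens[i] in lexical_rules.get(sym, []):
            total += 1
        # Otherwise split the span at every point k for each binary rule.
        for left, right in binary_rules.get(sym, []):
            for k in range(i + 1, j):
                total += count(left, i, k) * count(right, k, j)
        return total
    return count(symbol, 0, len(tokens))

print(count_parses("S", ("a", "a", "a")))  # 2: (a (a a)) and ((a a) a)
```

The count grows as the Catalan numbers with string length, which is exactly the blow-up that the probabilities of a PCFG are introduced to arbitrate.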
This is due to the fact that a Context-Free Grammar contains all possible choices that can be produced from a given non-terminal (O'Donnell 2015). The grammar neither provides a deterministic parse nor prioritizes the parses. This leads to structural ambiguity in the grammar. Probabilistic Context-Free Grammars (PCFGs) have been introduced to weigh the probable trees when such ambiguity arises, and thus provide a means for prioritizing the desired rules. A PCFG is a 5-tuple (N, Σ, R, S, θ), where θ denotes a vector of real numbers in the range [0, 1] indexed by the productions of R. Writing R_X for the set of productions of X in R, for all X in N we require

∑_{r ∈ R_X} θ_r = 1

Figure 2: Example of a Probabilistic Context-Free Grammar corresponding to the CFG shown in Figure 1

The probabilities associated with all the productions of a given non-terminal should add up to 1. The probability of a given tree is nothing but the product of the probabilities associated with the rules used to construct the tree. The vector θ_X denotes the parameters of a multinomial distribution over the productions that have the non-terminal X on their left-hand side (LHS) (O'Donnell 2015).

Note that PCFGs make two strong conditional independence assumptions (O'Donnell 2015):

1. The decision about expanding a non-terminal depends only on the non-terminal and the given distribution for that non-terminal. No other assumptions can be made.

2. Following from the first assumption, a generated expression is independent of other expressions.

There are numerous techniques suggested for estimating the weights of the productions in a PCFG. The Inside-Outside algorithm is a maximum likelihood estimation approach based on the unsupervised Expectation-Maximization parameter estimation method. Summarily, the algorithm starts by initializing the parameters with a random set of values and then iteratively modifies the parameter values such that the likelihood of the training corpus is increased.
The process continues until the parameter values converge, i.e., no more improvement of the likelihood over the corpus is possible.

Another way of estimating the parameters is through the Bayesian inference approach (Johnson, T. Griffiths, and Goldwater 2007). Given a corpus of strings s = s₁, s₂, …, s_n, we assume a CFG G generates all the strings in the corpus. We take the dataset s and infer the parameters θ using Bayes' theorem

P(θ | s) ∝ P_G(s | θ) P(θ)

where

P_G(s | θ) = ∏_{i=1}^{n} P_G(s_i | θ)

Now, the joint posterior distribution over the set of possible trees t and the parameters θ can be obtained by

P(t, θ | s) ∝ P(s | t) P(t | θ) P(θ) = (∏_{i=1}^{n} P(s_i | t_i) P(t_i | θ)) P(θ)

To calculate the posterior distribution, we assume that the parameters in θ are drawn from a known distribution termed the prior. We assume that each non-terminal in the grammar has a given distribution, which need not be the same for all. For a non-terminal, the multinomial distribution is indexed by the respective productions, and since we use a Dirichlet prior here, each production probability θ_{X→β} has a corresponding Dirichlet parameter α_{X→β}. Now, either through Markov Chain Monte Carlo sampling approaches (Johnson, T. Griffiths, and Goldwater 2007), through variational inference, or through a hybrid approach, the parameters are learnt (Zhai, Boyd-Graber, and Cohen 2014).

However, this approach as well does not deal with the real bottleneck, which is to come up with the relevant rules which can solve a task for a given corpus. For large datasets, the CFGs could have a large set of rules, and it is often cumbersome for experts to come up with the rules on their own. Non-parametric Bayesian approaches have been proposed as modifications of PCFGs. Roughly, non-parametric Bayesian approaches can be seen as learning a single model that can adapt its complexity to the data (Gershman and David M Blei 2012).
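The Dirichlet-multinomial machinery above has a closed-form posterior mean, which is the conjugacy that the samplers exploit. A minimal sketch, with hypothetical rule counts for a single non-terminal standing in for counts gathered from sampled parses:

```python
from collections import Counter

# Hypothetical observed rule counts for one non-terminal NP,
# e.g. tallied from sampled parse trees of the corpus.
rule_counts = Counter({("NP", ("Det", "N")): 8, ("NP", ("N",)): 2})

def dirichlet_posterior_mean(counts, alpha=1.0):
    """Posterior mean of the production probabilities theta under a symmetric
    Dirichlet(alpha) prior, for the k productions sharing one left-hand side:
    (count + alpha) / (total + k * alpha)."""
    total = sum(counts.values())
    k = len(counts)
    return {rule: (c + alpha) / (total + k * alpha) for rule, c in counts.items()}

theta = dirichlet_posterior_mean(rule_counts)
# The estimated probabilities over NP's productions sum to one.
```

The prior acts as pseudo-counts: with alpha = 1 the estimate above smooths the raw frequencies 8/10 and 2/10 towards 9/12 and 3/12.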
The term non-parametric does not imply that there are no parameters associated with the learning algorithm; rather, it implies that the number of parameters is not fixed, and increases with an increase in data or observations.

The most general version of learning PCFGs goes by the name of the infinite HMM or infinite PCFG (Johnson 2010). In an infinite PCFG, say for the model described in Liang et al. (2007), we are provided with a set of atomic categories and combinations of these categories as rules. Now, depending on the data, the learning algorithm learns the productions and the number of possible non-terminals, along with the probabilities associated with them (Johnson 2010). Another variation that is popular among the non-parametric grammar induction models is the Adaptor grammar (Johnson, T. L. Griffiths, and Goldwater 2007). Here, the number of non-terminals remains fixed and is set manually, but the production rules and their corresponding probabilities are obtained by inference. The productions are obtained for a subset of non-terminals which are ‘adapted’, and a skeletal grammar is used to obtain the linguistic structures.

An Adaptor Grammar is a 7-tuple G = (N, Σ, R, S, θ, A, C). Here A ⊆ N denotes the non-terminals which are adapted, i.e., the productions for the non-terminals in A will automatically be learnt from data. C is the adaptor set, where C_X is a function that maps a distribution over trees T_X to a distribution over distributions over T_X (Johnson 2010).

Figure 3: Example of an Adaptor Grammar. The non-terminals marked with an ‘@’ are adapted. The productions will be learnt from data, where each production is a variable-length permutation of a subset of the elements in the alphabet set.

The independence assumptions that exist for PCFGs are no longer valid in the case of Adaptor Grammars (Zhai, Boyd-Graber, and Cohen 2014). Here the non-terminal X is defined in terms of another distribution H_X.
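Operationally, an adaptor can be pictured as stochastic memoization: each draw either reuses a previously generated tree (with probability growing in its past usage) or falls back to the base distribution H_X for a fresh one. The following is a minimal, hypothetical sketch of such a Pitman-Yor style sampler, with trees abstracted behind a `draw_base` callback:

```python
import random
from itertools import count

# Stochastic-memoization sketch (hypothetical, not the authors' code):
# a Pitman-Yor "Chinese restaurant" over values drawn from a base
# distribution, here standing in for trees drawn from H_X.
def make_pitman_yor(draw_base, discount=0.5, concentration=1.0):
    values, counts = [], []  # one entry per cached value ("table")
    def draw():
        n = sum(counts)
        # Reuse table i with prob (counts[i] - discount) / (n + concentration);
        # otherwise draw fresh with the leftover probability mass.
        r = random.uniform(0.0, n + concentration)
        acc = 0.0
        for i, c in enumerate(counts):
            acc += c - discount
            if r < acc:
                counts[i] += 1
                return values[i]
        values.append(draw_base())
        counts.append(1)
        return values[-1]
    return draw

random.seed(0)
fresh = count()  # stand-in base: every fresh draw yields a new value
draw = make_pitman_yor(lambda: next(fresh))
samples = [draw() for _ in range(200)]
# Frequently drawn values get reused, so far fewer than 200 distinct ones appear.
```

The rich-get-richer reuse is exactly what lets an adapted non-terminal cache whole subtrees instead of re-deriving them independently each time.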
Now, the adaptor for each non-terminal X, C_X, can be based on the Dirichlet Process or a generalisation of the same, termed the Pitman-Yor Process. Here TD_X(G_{Y₁}, G_{Y₂}, …, G_{Y_m}) is a distribution over all the trees rooted in the non-terminal X:

H_X = ∑_{X→Y₁…Y_m ∈ R} θ_{X→Y₁…Y_m} TD_X(G_{Y₁}, G_{Y₂}, …, G_{Y_m})

G_X ∼ C_X(H_X)

3 Adaptor Grammar in Computational Linguistics

Adaptor Grammars have been widely used in multiple morphological and syntactic tasks for various languages. The Adaptor Grammar was initially demonstrated on the word segmentation task in English (Johnson, T. L. Griffiths, and Goldwater 2007). Sentences with no explicit word boundaries were given as observations, and the task was to predict the actual words in each sentence. The task is similar to variable-length motif identification tasks.

Adaptor Grammars were introduced by Johnson, T. L. Griffiths, and Goldwater (2007) as a non-parametric Bayesian framework for performing inference of the syntactic grammar of a language over parse trees. A PCFG (Probabilistic Context-Free Grammar) and an adaptor function jointly define an Adaptor grammar. The PCFG learns the grammar rules behind the data generation process, and the adaptor function maps the probabilities of the generated parse trees to substantially larger values than those of the same trees under the conditionally independent PCFG model.

Adaptor grammars have been very effectively used in numerous NLP-related tasks. Johnson (2010) has drawn connections between topic models and PCFGs, and then proposed a model combining insights from adaptor grammars and topic models. While LDA defines topics by projecting documents to a lower-dimensional space, an Adaptor grammar defines a distribution over trees. The author also proposes a hybrid model to identify topical collocations using the power of PCFG-encoded topic models. Adaptor grammars are also used in named entity structure learning. Zhai, Kozareva, et al.
(2016) have used adaptor grammars for identifying entities from shopping-related queries in an unsupervised manner.

The word segmentation task is essentially identifying the individual words in a continuous sequence of characters. This is seen as a challenging task in computational cognitive science as well. Johnson (2008a) used Adaptor Grammars for word segmentation on the Bantu language ‘Sesotho’. The author specifically showed how a grammar with additional syllable structure yields a better F-score for the word segmentation task than the usual collocation grammar. A similar study has been carried out by Kumar, Padró, and Oliver González (2015). The authors present a mechanism to learn complex agglutinative morphology, with specific examples from three of the four Dravidian languages: Tamil, Malayalam, and Kannada. Furthermore, the authors specifically stress the task of dealing with sandhi using finite-state transducers after producing morphological segmentations using Adaptor grammars. The Adaptor grammar succeeds in leveraging the knowledge about the agglutinative nature of the Dravidian languages, but refrains from modeling the specific morphotactic regularities of a particular language. Johnson also demonstrates the effect of syllabification on the word segmentation task using PCFGs (Johnson 2008b). Johnson further motivates the usability of the aforementioned unsupervised approaches for word segmentation and grammar induction tasks by extracting the collocational dependencies between words (Johnson and Demuth 2010).

Due to their generalizability, Adaptor grammars have been used extensively in NLP. Hardisty, Boyd-Graber, and Resnik (2010) achieve state-of-the-art accuracy in perspective classification using an adaptive Naïve Bayes model – an adaptor grammar-based non-parametric Bayesian model. Besides this, the adaptor grammar has been proven to be effective in grammar induction (Cohen, David M Blei, and Smith 2010).
Grammar induction is an unsupervised syntax learning task. The authors achieved considerable results, along with the finding that the variational inference algorithm (David M. Blei, Kucukelbir, and McAuliffe 2017) can be extended to the logistic normal prior instead of the Dirichlet prior. Neubig et al. (2011) proposed an unsupervised model for phrase alignment and extraction where they claimed that their method can be thought of as an adaptor grammar over two languages. Zhai, Kozareva, et al. (2016) have presented a work in which the authors attempted to identify relevant suggestive keywords for a typed query so as to improve the search results in an e-commerce site. The authors had previously presented a new variational inference approach through a hybrid of Markov chain Monte Carlo and variational inference. It has been reported that the hybrid scheme improved scalability without compromising the performance on typical common tasks of grammar induction.

Botha and Blunsom (2013) presented a new probabilistic model that extends the Adaptor grammar to make it learn word segmentation and morpheme lexicons in an unsupervised manner. Stem derivation in Semitic languages such as Arabic achieves better performance using this mildly context-sensitive grammar formalism. Again, Eskander, Rambow, and Yang (2016) recently investigated Adaptor Grammars for unsupervised morphological segmentation to establish a claim of language-independence. Compared against other baselines, such as morphological knowledge input from external sources and other cascaded architectures, the adaptor grammar proved to outperform in a majority of the cases.

Another use of the Adaptor grammar has been seen in the identification of native language (Wong, Dras, and Johnson 2012). The authors used an adaptor grammar to identify n-gram collocations of arbitrary length over a mix of Parts-of-Speech tags and words, to feed them as features into the classifier.
By modeling the task with syntactic language models, the authors showed that the extracted collocations efficiently represent the native language. Besides grammar induction, Huang, Zhang, and Tan (2011) further use the Adaptor grammar for machine transliteration. The PCFG framework helps to learn syllable equivalents in both languages and hence aids automatic phonetic translation. Furthermore, Feldman et al. (2013) recently explored a Bayesian model to understand how feedback from segmented words can alter the phonetic category learning of infants, due to access to the knowledge of the joint occurrence of word-pairs.

As an extension to the standard Adaptor Grammar, O'Donnell (2015) presented Fragment Grammars, which were built as a generalization of Adaptor Grammars. They generalize Adaptor Grammars by allowing productivity and abstraction to occur at any point within individual stored structures. The specific model adopts ‘stochastic memoization’ from the Adaptor grammar framework as an efficient substructure-storing mechanism. It further memoizes partial internal computations via a lazy-evaluation version of the original storage mechanism given by the Adaptor Grammar.

4 Adaptor Grammar for Sanskrit

Adaptor Grammars have been used for Sanskrit as well, mainly as a means of obtaining variable-length character n-grams to be used as features for classification tasks. Below, we describe two different applications: compound type identification, and identifying the Taddhita suffix for derivational nouns.

4.1 Variable-length character n-grams for compound type identification1

Krishna, Satuluri, Sharma, et al. (2016) used adaptor grammars for identifying patterns present in different types of compound words. The underlying task was, given a compound word in Sanskrit, to identify the type of the compound. The problem was a multi-class classification problem.
The classifier needed to classify a given compound into one of the four broad classes, namely, Avyayībhāva, Dvandva, Bahuvrīhi, and Tatpuruṣa.

The system is developed as an ensemble-based supervised classifier. We used the Random Forests classifier with an easy-ensemble approach to handle the class imbalance problem persisting in the data. The majority of the labels belonged to Tatpuruṣa, while Avyayībhāva was the least represented. The classifier incorporated rich features from multiple sources. The rules from the Aṣṭādhyāyī pertaining to compounds that are of a conditional nature, i.e., those containing selectional constraints, were encoded as features. This was done by applying those selectional restrictions over the input compounds. Variable-length character n-grams for each class of compounds were obtained from the adaptor grammar. Each filtered production from the compound class-specific grammar was used as a feature. We also incorporated noun pairs that follow the knowledge structure in the Amarakośa, as mentioned in Nair and A. Kulkarni (2010), using a selected subset of relations from that work.

We capture semantic class-specific linguistic regularities present in our dataset using variable-length character n-grams and character n-gram collocations shared between compounds, obtained using adaptor grammars.

We learn 3 separate grammars, namely G1, G2, and G3, with the same skeletal structure as Figure 4a, but with different data samples belonging to Tatpuruṣa, Bahuvrīhi, and Dvandva respectively. We did not learn a grammar for Avyayībhāva, due to insufficient data samples for learning the patterns. We use a ‘$’ marker to indicate the word boundary between the components, where the components are in sandhi-split form. A ‘#’ symbol was added to mark the beginning of the first component and the end of the final component, respectively.
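The boundary-marking scheme just described is mechanical enough to sketch as a one-line helper (a hypothetical illustration, not the authors' preprocessing code; the component pair is likewise illustrative):

```python
def encode_compound(components):
    """Encode sandhi-split compound components with boundary markers:
    '$' between components, '#' at the start of the first and the end
    of the last component (sketch of the marking scheme in the text)."""
    return "#" + "$".join(components) + "#"

encode_compound(["sa", "phala"])  # -> '#sa$phala#'
```

With this encoding, a learned production such as ‘#sa’ can only ever match at the start of the first component, which is what makes the markers useful as anchors for the n-gram features.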
We also learn a grammar G4, where the entire dataset is taken together along with an additional 4000 random pairs of words from the Digital Corpus of Sanskrit, where neither word appeared as a compound component in the corpus. The co-occurrence, or the absence of it, was taken as a proxy for the compatibility between the components. The skeletal grammar in Figure 4b has two adapted non-terminals, both marked by ‘@’. Also, the adapted non-terminal ‘Word’ appears in a production of the adapted non-terminal ‘Collocation’. The ‘+’ symbol indicates one or more occurrences of ‘Word’, as used in regular expressions. This notation is not standard in productions of a context-free grammar; it is ideally achieved using recursive grammars in CFGs with additional non-terminals. But, in order to present a simpler representation of the skeletal grammar, we followed this scheme. In subsequent representations, we will be using recursion instead of the ‘+’ notation.

1 The work has been done as part of the compound type identification work published in Krishna, Satuluri, Sharma, et al. (2016). Please refer to the aforementioned work for a detailed explanation of the concepts described here.

Figure 4: a) Skeletal grammar for the adaptor grammar. b) Derivation tree for an instance of a production ‘#sa$ śa’ for the non-terminal @Collocation

Every production in the learned grammars has a probability of being invoked, where the likelihoods of all the productions of a non-terminal sum to one. To obtain discriminative productions from G1, G2, and G3, we find the conditional entropy of the productions with respect to those of G4 and keep only those productions above a threshold. We also consider all the unique productions in each of the grammars G1 to G3.
We further restrict the productions based on the frequency of the production in the data and the length of the substring produced by the production; both thresholds were kept at the value of three.

We show an instance of one such production for a variable-length character n-gram collocation. Here, for the adapted non-terminal @Collocation, we find that one of the productions finally derives ‘#sa$ śa’, which is actually derived as two @Word derivations, as shown in Figure 4b. We use this as a regular expression, which captures some properties that need to be satisfied by the concatenated components. This particular production mandates that the first component must be exactly sa, as it is sandwiched between the symbols # and $. Now, since śa occurs after the previous substring, which contains $, the boundary marker between the components, śa should belong to the second component. Since, as per the grammar, both substrings are independent @Word productions, we relax the constraint that the two substrings should occur immediately one after the other. We treat the production as a regular expression, such that śa should occur after sa, and any number of characters can come in between the two substrings. For this particular pattern, we had 22 compounds satisfying the criteria, all of them belonging to Bahuvrīhi. Now, compounds whose first component is ‘sa’ are mostly Bahuvrīhi compounds, and this is obvious to Sanskrit linguists. But here, the system was not provided with any such prior information or possible patterns; it learned the pattern from the data. Incidentally, our dataset also contained a few compound samples belonging to other classes where the first component was ‘sa’.

4.1.1 Experiments

Dataset - We obtained a labeled dataset of compounds and the decomposed pairs of components from the Sanskrit studies department, UoHyd2. The dataset contains more than 32,000 unique compounds.
The compounds were obtained from ancient digitised texts, including the Śrīmad Bhagavad Gītā and the Caraka Saṃhitā, among others. The dataset contains the sandhi-split components along with the compounds. With more than 75% of the dataset consisting of Tatpuruṣa compounds, we down-sample the Tatpuruṣa compounds to a count of 4000, to match the second-largest class, Bahuvrīhi. We find that the Avyayībhāva compounds are severely under-represented in the dataset, at about 5% of the size of the Bahuvrīhi class. From the dataset, we filtered 9,952 data points, split into 7,957 data points for training, with the remainder as a held-out dataset.

Result - To measure the impact of the different types of features we incorporated, we train the classifier incrementally with different feature types. We report the results over the held-out data. At first, we train the system with only Aṣṭādhyāyī rules and some additional hand-crafted rules. We find that the overall accuracy of the system is about 59.34%. We then augment the classifier by adding features from the Amarakośa, which increases the overall accuracy to 63.81%. We then finally add the adaptor grammar-based features, which increase the overall accuracy of the system from 63.81% to 74.98%. The effect of adding the adaptor grammar features was most visible in the improvement of the performance on Dvandva and Bahuvrīhi. Notably, the precision for Dvandva and Bahuvrīhi increased by absolute values of 0.15 and 0.06 respectively, when compared to the results before adding the adaptor grammar-based features. Table 1 presents the results of the system with the entire feature set, per compound class.

Class  P     R     F
A      0.92  0.43  0.58
B      0.85  0.74  0.79
D      0.69  0.39  0.49
T      0.68  0.88  0.77

Table 1: Classwise performance of the Random Forests classifier.
The patterns for the adaptor grammar were learned using only the data from the training set; the held-out data was not used. This was done so as to ensure that no over-fitting takes place. Also, we filtered out the productions with a length of less than 3 and those which do not occur often in the grammar.

4.2 Distinctive patterns in derivational nouns in Taddhita3

Derivational nouns are a means of vocabulary expansion in a language. A new word is created in a language when an existing word is modified by an affix. Taddhita is a category of such derivational affixes which are used to derive a prātipadika from another prātipadika. The challenge here is to identify Taddhita prātipadikas from corpora in Sanskrit and also to identify their source words.

3 The work has been done as part of the derivational noun word pair identification work published in Krishna, Satuluri, Ponnada, et al. (2017). Please refer to the aforementioned work for a detailed explanation of the concepts described here.

Pattern-based approaches often result in false positives. The edit distance, a popular distance metric for comparing the similarity of two given strings, between the source and derived words tends to vary from 1 to 6 across the patterns. For example, consider the word ‘rāvaṇi’ derived from ‘rāvaṇa’, where the edit distance between the words is just 1, whereas ‘Āśvalāyana’ derived from ‘aśvala’ has an edit distance of 6. Also, the word ‘kālaśa’ is derived from the word ‘kalaśa’, but ‘kāraṇa’ is not derived from ‘karaṇa’. Similarly, ‘stutya’ is derived from ‘stu’, but using a kṛt affix, whereas dakṣiṇā (the southern direction) derives dākṣiṇātya (southern) with a taddhita affix. If we were to use vṛddhi as an indicator, which is the only difference between the two examples, then there are cases such as kāraka derived from kṛ with a kṛt affix, and aṣvaka derived from aṣva using taddhita.
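The edit distances quoted for such pairs can be checked with a standard Levenshtein implementation; note that the exact numbers depend on the transliteration and Unicode normalization used (the sketch below assumes precomposed IAST characters, so each of ā, ś, ṇ counts as one symbol):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic program (row-by-row)."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        row = [i]
        for j, cb in enumerate(b, 1):
            row.append(min(prev_row[j] + 1,                 # deletion
                           row[j - 1] + 1,                  # insertion
                           prev_row[j - 1] + (ca != cb)))   # substitution / match
        prev_row = row
    return prev_row[-1]

edit_distance("rāvaṇa", "rāvaṇi")  # -> 1
edit_distance("kalaśa", "kālaśa")  # -> 1
```

As the text notes, a small distance is no guarantee of a derivational relation (kalaśa → kālaśa holds, karaṇa → kāraṇa does not), which is precisely why distance alone cannot decide the word pairs.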
All these instances show the level of ambiguity that can arise in deciding the pairs of source and derived words using taddhita. All the aforementioned examples show the need for the knowledge of the Aṣṭādhyāyī (or the knowledge of affixes), the semantic relation between the word pairs, or a combination of these, to resolve the right set of word pairs.

The approach proposed in Krishna, Satuluri, Ponnada, et al. (2017) first identifies a high-recall, low-precision set of word pairs from multiple Sanskrit corpora, based on the pattern similarities exhibited by the 137 affixes in Taddhita. Once the patterns are obtained, we look for various similarities between the word pairs in order to group them together. We use rules from the Aṣṭādhyāyī, especially from the Taddhita section. But since we could not incorporate rules of a semantic and pragmatic nature, to compensate for the missing rules we tried to identify patterns from the word pairs, specifically the source words. We use the Adaptor Grammar for this purpose.

Currently, we do not identify the exact affix that leads to the derivation of the word. Since the affixes are distinguished not just by the visible pattern, but also by the ‘it’ markers, it is challenging to identify the exact affix. So, we group all those affixes that result in similar patterns into a single group. All the word pairs that follow the same pattern belong to one group. To further increase the group size, we also group those entries that differ by vṛddhi and guṇa into the same group; such distinctions are not considered while forming a group. Effectively, we only look into the pattern at the end of the ‘derived word’. We call the collection of all such pattern-based groups our ‘candidate set’.

For every distinct pattern in our candidate set, we first identify the word pairs and then create a graph with the given word pairs. A word pair is a node, and edges are formed between nodes where they match different sets of similarities.
The first set of similarities is based on rules directly from the Aṣṭādhyāyī, while the second set of node similarities uses character n-grams obtained with Adaptor grammars. Once the similarities are found, we apply the Modified Adsorption approach (Talukdar and Crammer 2009) on the graph. Modified Adsorption is a semi-supervised label propagation approach, where labels are provided to a subset of nodes and then propagated to the remaining nodes based on the similarity they share with other nodes.

Figure 5 shows a sample construction of the graph for the word pairs where the words differ by the pattern ‘ya’. Here, every pair obtained by pattern matching is a node. Now, Modified Adsorption is a semi-supervised approach, so we need a limited number of labeled nodes. The nodes marked in grey are labeled nodes, called seed nodes. The label here is binary, i.e., a word pair can either be a true Taddhita pair or not. Now, edges are formed between the word pairs. Modified Adsorption provides a means of designing the graph explicitly, while many of its predecessors relied more on nearest-neighbour-based approaches (Zhu and Ghahramani 2002). Also, the edges can be weighted based on the closeness between different nodes. Once the graph structure is defined, we perform the Modified Adsorption. In this approach, the labels from the seed nodes are propagated through the edges to the unlabelled nodes. Highly similar nodes should be given similar labels, and the optimization function penalizes any other label assignment. We use three different means of obtaining similarities between the nodes. The first set of similarities comprises the rules in the Aṣṭādhyāyī that the pair of nodes match. The second set of similarities is the sum of the probabilities of the productions from the adaptor grammar which are matched by a pair of nodes.
The third is the word vector similarity between the source words in the node pairs. For a detailed working of the system and a detailed explanation of each set of features, please refer to Krishna, Satuluri, Ponnada, et al. (2017). Here, we republish the working of the second set of features, obtained using the Adaptor grammar, and the results of the model thereafter.

Character n-gram similarity by Adaptor Grammar - Pāṇini had an obligation to maintain brevity, as his grammar treatise was supposed to be memorized and recited orally by humans (Kiparsky 1994). In the Aṣṭādhyāyī, Pāṇini uses character substrings of varying lengths as conditions in rules for checking the suitability of the application of an affix. We examine whether there are more such regularities, in the form of variable-length character n-grams, that can be observed from the data, as brevity is not a concern for us. We also assume this would compensate for the loss of some of the information which Pāṇini originally encoded using pragmatic rules. In order to identify these regularities in the patterns in the words, we use the Adaptor grammar.

Figure 5: Graph structure for the group of words where the derived words end in ‘ya’. Nodes in grey denote seed nodes, which are marked with their class label. The nodes in white are unlabelled nodes.

In Listing 1, ‘Word’ and ‘Stem’ are non-terminals which are adapted. The non-terminal ‘Suffix’ consists of the set of various end-patterns.

Word → Stem Suffix
Word → Stem
Stem → Chars
Suffix → a | ya | … | āyana

Listing 1: Skeletal CFG for the Adaptor grammar

The set A2 captures all the variable-length character n-grams learned as productions by the grammar, along with the probability score associated with each production. We form an edge between two nodes in G_{i2} if there exists an entry in A2 which is present in both the nodes. We sum the probability values associated with all such character n-grams common to the pair of nodes v_j, v_k ∈ V_i, and calculate the edge score σ_{j,k}.
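This conditional sum and the subsequent sigmoid weighting can be sketched as follows (a hypothetical representation: `grammar_probs` maps each learned n-gram production in A2 to its probability, and each node carries the set of n-grams it matches; summing over the shared n-grams realises the Iverson-bracket sum, since unmatched n-grams contribute zero on both sides):

```python
import math

def edge_weight(ngrams_j, ngrams_k, grammar_probs):
    """Edge score = sum of production probabilities of the n-grams shared by
    both nodes; the edge weight is its sigmoid, or 0 when nothing is shared."""
    score = sum(grammar_probs[g] for g in ngrams_j & ngrams_k)
    return 1.0 / (1.0 + math.exp(-score)) if score > 0 else 0.0

probs = {"ya": 0.3, "tya": 0.2, "in": 0.1}           # illustrative values
w = edge_weight({"ya", "tya"}, {"ya", "in"}, probs)  # sigmoid(0.3), ~0.574
```

Nodes sharing no learned n-gram get weight 0, i.e., no edge under this similarity set.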
If the edge score is greater than zero, we take the sigmoid of the value so obtained to assign the weight to the edge. The expression for calculating σ_{j,k} in the equations given below uses the Iverson bracket (Knuth 1992) to denote the conditional sum operation. The equation essentially makes sure that only the probabilities associated with those character n-grams which are present in both the nodes get summed. We define the edge score σ_{j,k}, the weight set W_{i2}, and the edge set E_{i2} as follows:

σ_{j,k} = ∑_{l=1}^{|A2|} v^k_{2,l} [v^k_{2,l} = v^j_{2,l}]

E^{v_k,v_j}_{i2} = 1 if σ_{j,k} > 0, and 0 if σ_{j,k} = 0

W^{v_k,v_j}_{i2} = sigmoid(σ_{j,k}) if σ_{j,k} > 0, and 0 if σ_{j,k} = 0

As mentioned, we use the label distribution per node obtained from phase 1 as the seed labels in this setting.

4.2.1 Experiments

As mentioned, we use three different similarity sets for weighting the edges. But in Modified Adsorption (MAD) the edge weight is required to be a scalar. This implies that the similarity score from only one similarity function can be used between a pair of nodes at a time. Hence, we chose to apply the similarity weights sequentially on the graph. An alternative would have been to take a weighted average of the different similarity scores. But our pipeline approach can be seen as a means of bootstrapping our seed set. In Modified Adsorption, we need to provide seed labels, which are labels for some of the nodes. In reality, the seed nodes do not have a binary assignment of the labels, but rather a distribution over the labels (Talukdar and Crammer 2009). So after the run over each similarity set, we get a label distribution for each of the nodes in the graph. This label distribution is used to generate the seed nodes for the subsequent run of the Modified Adsorption. The seed nodes also get modified during the run of the algorithm.

Dataset - We use multiple lexicons and corpora to obtain our vocabulary C. We use IndoWordNet (M. Kulkarni et al.
2010), the Digital Corpus of Sanskrit4, a digitized version of the Monier-Williams5 Sanskrit-English dictionary, and a digitized version of the Apte Sanskrit-Sanskrit dictionary (Goyal, G. P. Huet, et al. 2012), and we also utilize the lexicon employed in the Sanskrit Heritage Engine (Goyal and G. Huet 2016). We obtained close to 170,000 unique word lemmas from the combined resources.

Results - In Krishna, Satuluri, Ponnada, et al. (2017), we report results for 11 of the patterns, from a total of more than 80 patterns we initially obtained. Due to the lack of enough evidence in the form of data points, we did not attempt the procedure for the others. Here, we only show results for 5 of the patterns, which were selected based on the amount of evidence obtained from the corpora. Since we use each of the similarity sets sequentially, we have outputs at each phase of the sequence. The result of the system after incorporating the Aṣṭādhyāyī rules is MAD_B1, that after incorporating the Adaptor grammar n-grams is MAD_B2, and the final result after the word vector similarity is MAD. Now, since we have 5 different patterns, we subscript an index i to the system names to denote the corresponding patterns. We additionally use a baseline called Label Propagation (LP), based on the algorithm by Zhu and Ghahramani (2002). The systems which incorporate the adaptor grammar are MAD and MAD_B2; they are the best and second-best performing systems respectively.

Table 2 shows the results of our system. We compare the performance on 5 different patterns, selected based on the number of candidate word pairs available for the pattern. The system proposed in the work, MAD_i, performs the best for all the 5 patterns. Interestingly, MAD_B2,i is the second-best performing system in all cases. The system uses 3 kinds of similarity measures in a sequential pipeline, of which the adaptor grammar comes as the second feature set.
To understand the impact of adding adaptor grammar-based features, we can compare the results with those of MADB1i, which shows the result for each of the patterns before the adaptor grammar-based features are used.

A baseline using the label propagation algorithm was also used; the motive behind it was to measure the effect of Modified Adsorption on the task. In Label Propagation, we experimented with different values of the parameter K, K ∈ {10, 20, 30, 40, 50, 60}, and found that K = 40 provides the best results for 3 of the 5 end-patterns.

Pattern   System   P      R      A
a         MAD      0.72   0.77   73.86
          MADB2    0.68   0.68   68.18
          MADB1    0.49   0.52   48.86
          LP       0.55   0.59   55.68
aka       MAD      0.77   0.67   73.33
          MADB2    0.71   0.67   70
          MADB1    0.43   0.4    43.33
          LP       0.75   0.6    70
in        MAD      0.74   0.82   76.47
          MADB2    0.67   0.70   67.65
          MADB1    0.51   0.56   51.47
          LP       0.63   0.65   63.23
ya        MAD      0.7    0.72   70.31
          MADB2    0.61   0.62   60.94
          MADB1    0.53   0.59   53.12
          LP       0.56   0.63   56.25
i         MAD      0.55   0.52   54.76
          MADB2    0.44   0.38   45.24
          MADB1    0.3    0.29   30.95
          LP       0.37   0.33   38.09

Table 2: Comparative performance of the four competing models.

The values for K are set by empirical observation. We find that for those 3 patterns (‘a’, ‘in’, ‘i’), the entire vertex set has the vṛddhi attribute set to the same value. For the other two (‘ya’, ‘aka’), K = 50 gave the best results; here, the vertex set has nodes where the vṛddhi attribute is set to either of the values. For better insight into this finding, the notion of pattern that we use in the design of the system needs to be elaborated. A pattern is effectively the pair of substrings that remain in the source word and the derived word after removing the portions which are common to both; it is the visible change that happens in the derivation of a word. To reduce the number of distinct patterns, we did not treat the pattern changes that occur due to vṛddhi and guṇa as distinct patterns, but rather abstracted them out. Now, multiple affixes may lead to the generation of the same set of patterns.
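The notion of a pattern described above (the residue of the source and derived words after stripping the portion common to both) can be illustrated with a small sketch. The function and the example pair below are our own illustration; in particular, this toy version strips only a common prefix and does not perform the vṛddhi/guṇa abstraction the system uses.

```python
def end_pattern(source, derived):
    """Toy sketch: strip the longest common prefix of the two words and
    return what remains of each. The real system additionally abstracts
    out vrddhi and guna changes before comparing patterns."""
    i = 0
    while i < min(len(source), len(derived)) and source[i] == derived[i]:
        i += 1
    return source[i:], derived[i:]

# e.g. bala -> balin exhibits the end-pattern 'in':
# end_pattern("bala", "balin") == ("a", "in")
```

Different affixes that leave the same visible residue thus collapse into one end-pattern, which is exactly why a single pattern such as ‘a’ can correspond to several affixes.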
In the case of the pattern, or rather end-pattern (Krishna, Satuluri, Ponnada, et al. 2017), ‘a’, the effect may be the result of the application of one of several affixes such as aṇ, añ, etc. Here, all the affixes of pattern ‘a’ lead to vṛddhi, but for the pattern ‘ya’, the affixes may or may not lead to vṛddhi. We report the best result for each of the systems in Table 2.

5 Inference of Syntactic Structure in Sanskrit

In this section, we report ongoing work in which we investigate the effectiveness of using Adaptor grammars for the inference of syntactic structures in Sanskrit. We experiment with the effect of Adaptor Grammars in capturing the ‘natural order’, i.e., the word order followed in prose. For this task, we use a dataset of Sanskrit sentences which are in prose order. The dataset consists of 2000 sentences from the Pañcākhyānaka and more than 600 sentences from the Mahābhārata. For this experiment, we only consider the morphological classes of the words involved in the sentences. Currently, we use the morphological tags as used in the Sanskrit Library.6 We keep 500 of the sentences for testing, and the remaining 2000 are used for identifying the patterns. Some of the constructs had only one or two words; these we ignore for the experiment.

We learn the necessary productions in the grammar and then evaluate the grammar on the 500 test sentences, calculating the likelihood of generating each of the sentences. In order to test the likelihood of the correct sentence, we also generate all possible permutations of the morphological tags in each of the test sentences. For sentences of length > 5, we break them into sub-sequences of 5, find the permutations of the sub-sequences, and concatenate them again. This is used as a means of sampling the possible combinations, as the explicit enumeration of all the permutations is computationally costly. From the generated candidate set we find the likelihood of the ground truth sentence and rank them.
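The sampling scheme just described (permuting sub-sequences of length 5 independently and concatenating, instead of enumerating all permutations) can be sketched as follows; the function name and chunk handling are our own illustration.

```python
from itertools import permutations, product

def candidate_orders(tags, chunk=5):
    """Sketch of the candidate generation: split the tag sequence into
    sub-sequences of at most `chunk` tags, permute each sub-sequence
    independently, and concatenate, so that a long sentence yields far
    fewer candidates than the full factorial number of permutations."""
    chunks = [tags[i:i + chunk] for i in range(0, len(tags), chunk)]
    for combo in product(*(permutations(c) for c in chunks)):
        yield [tag for piece in combo for tag in piece]
```

For a 6-tag sentence, for example, this yields 5! x 1! = 120 candidates instead of 6! = 720 full permutations.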
We report our results based on two measures.

1. Edit Distance (ED) - The edit distance between the top-ranked sentence in the candidate set and the ground truth. Edit distance is roughly described as the minimum number of operations required to convert one string to another, based on a fixed set of operations with predefined costs. We use the standard Levenshtein distance (Levenshtein 1966), where the three operations are ‘insert’, ‘delete’ and ‘substitute’, each with a cost of 1. We compare the ground truth sentence with the predicted sentence that has the highest likelihood to obtain the measure. A lower edit distance implies a better result.

2. Mean Reciprocal Rank (MRR) - Mean Reciprocal Rank is the average of the reciprocal ranks over the queries. Here a test sentence is treated as a query, and the different permutations are the retrieved results for that query. From the ranked retrieved list, we find the inverse of the rank of the gold standard sentence. The higher the MRR score, the better the result:

$$\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$$

We first attempt the same skeletal grammars as proposed by Johnson, T. L. Griffiths, and Goldwater (2007) for capturing the syntactic regularities. We used both the ‘unigram’ and ‘collocation’ grammars as mentioned in that work. Figures 6 and 7 show the first two grammars that we have used for the task.

Figure 6: Unigram grammar as used in Johnson, T. L. Griffiths, and Goldwater (2007).

With these grammars, we experimented with various hyper-parameter settings. Since both the grammars are right-recursive, the length of the productions learnt from the grammar varied greatly. Though this is beneficial for identifying the word lengths, the associations with the morphological tags cannot be much longer. Secondly, the number of productions to be learnt is a user-defined hyper-parameter.
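For concreteness, the two evaluation measures above can be sketched in Python. These are textbook implementations (unit-cost Levenshtein distance and MRR computed from gold-standard ranks), not code from our system.

```python
def levenshtein(a, b):
    """Unit-cost Levenshtein distance: the minimum number of insertions,
    deletions and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def mean_reciprocal_rank(gold_ranks):
    """MRR: average of 1/rank of the gold-standard sentence, one rank
    per query (here, per test sentence)."""
    return sum(1.0 / r for r in gold_ranks) / len(gold_ranks)
```

Both functions operate on sequences, so the edit distance can be computed directly over tag sequences rather than character strings.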
We find that, due to the possible variation in string lengths and the small number of observations, the main morphological patterns that were learnt as productions were not repeated often enough in the observations to be statistically significant.

Figure 7: Collocation grammar as used in Johnson, T. L. Griffiths, and Goldwater (2007).

Figure 8: Modified grammar, obtained by eliminating the recursiveness in the adapted non-terminal ‘@Word’.

We modified both the grammars to restrict the length of the productions to a maximum of 4 and limited the number of productions to be learnt. We show the modification done to the adapted non-terminal ‘Word’ in both the grammars; this restricts the number of productions that ‘Word’ can learn. The modified portion can be seen in Figure 8.

The results for all four grammars are shown in Table 3. It can be seen that there is considerable improvement in the Mean Reciprocal Rank and the edit distance measures for the task with the restricted grammars. On our manual inspection of the patterns learnt from all the grammars, it was observed that the initial skeletal grammars were essentially over-fitting the training instances due to the longer production lengths. The modified grammars could reduce the edit distance to almost half and double the Mean Reciprocal Rank for the task.

Grammar               MRR     ED
Unigram               0.2923  4.87
Collocation           0.3016  4.66
Modified Unigram      0.4025  3.21
Modified Collocation  0.5671  2.20

Table 3: Results for the word reordering task.

For example, consider the sentence ‘tatra budhaḥ vrata caryā samāptau āgacchat (ā agacchat)’ from the Mahābhārata, with the corresponding sequence of morphological tags ‘i m1s iic f3s f7s i ipf[1]_a3s’.7 We filter out the ‘iic’ tags, as the ‘iic’ tag stands for a compound component and can be seen as part of the immediately following noun tag. We do not filter out the ‘i’ tags as of now, where ‘i’ stands for an indeclinable. So in effect the tag sequence is ‘i m1s f3s f7s i ipf[1]_a3s’.
The ‘Collocation’ Grammar had the following sequence as the most likely output, ‘i f7s i m1s f3s ipf[1]_a3s’, with an edit distance of 4. In the ‘Modified Collocation’ Grammar the predicted sequence is ‘i m1s f3s i f7s ipf[1]_a3s’, whose edit distance is 2. Here, it can be seen that just 2 tags have swapped their positions: the tags ‘i’ and ‘f7s’ have changed their positions, but are still adjacent to each other. The fourth and fifth words in the original sentence have become the fifth and fourth words in the predicted sentence.

The results shown here are preliminary in nature. What excites us the most is the provision this framework offers to incorporate syntactic knowledge that is explicitly defined in our grammar formalisms. We plan to extend this work to two immediate tasks. First, we plan to extend the word-reordering task to the poetry-to-prose conversion task. Currently, the task is to convert a bag of words into its corresponding prose, or ‘natural’, order; but we will investigate the regularities involved in poetry, apart from the aspects of metre, and incorporate those regularities to guide the grammar in picking up such patterns. We can also attempt to learn the conditional probabilities for the syntactic patterns in both poetry and prose. Second, we will be performing dependency parse analysis of given sentences at a morphological level. Goyal and A. Kulkarni (2014) present a scheme for converting Sanskrit constructs in constituency parse structure to dependency parse structure. Headden III, Johnson, and McClosky (2009) provide some insights into the use of PCFGs and lexical evidence for unsupervised dependency parsing. Currently, we will be working only on projective dependency parsing.

7 We follow the notations from the Sanskrit Library.
We will be relying on the Dependency Model with Valence to define our PCFG formalism for dependency parsing.

6 Conclusion

The primary goal of this work was to look into the applicability of Adaptor Grammars, a non-parametric Bayesian approach, for learning syntactic structures from observations. In this work, we introduced the basic concepts of Adaptor Grammars and various NLP applications in which they are used. We provided detailed descriptions of how adaptor grammars are used in word-level vocabulary expansion tasks in Sanskrit: the adaptor grammars were used as effective sub-word n-gram features for both compound type identification and derivational noun pair identification. We further showed the feasibility of using adaptor grammars for syntactic-level analysis of sentences in Sanskrit. We plan to investigate the feasibility of using Adaptor Grammars for dependency parsing and poetry-to-prose conversion tasks at the sentence level.

Acknowledgements

The authors acknowledge the use of the morphologically tagged database of the Pañcākhyānaka and Mahābhārata produced under the direction of Professor Peter M. Scharf while laureate of a Blaise Pascal Research Chair at the Université Paris Diderot 2012–2013, and maintained by The Sanskrit Library.

References

Blei, David M., Alp Kucukelbir, and Jon D. McAuliffe. 2017. “Variational Inference: A Review for Statisticians”. Journal of the American Statistical Association 112.518, pp. 859–877. doi: 10.1080/01621459.2017.1285773.

Botha, Jan A and Phil Blunsom. 2013. “Adaptor Grammars for Learning Non-Concatenative Morphology”. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics.

Cass, Stephen. 2011. Unthinking Machines: Artificial intelligence needs a reboot, say experts. url:

Cohen, Shay B, David M Blei, and Noah A Smith. 2010. “Variational inference for adaptor grammars”.
In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 564–572.

Eskander, Ramy, Owen Rambow, and Tianchun Yang. 2016. “Extending the Use of Adaptor Grammars for Unsupervised Morphological Segmentation of Unseen Languages.” In: COLING, pp. 900–910.

Feldman, Naomi H, Thomas L Griffiths, Sharon Goldwater, and James L Morgan. 2013. “A role for the developing lexicon in phonetic category acquisition.” Psychological Review 120.4, p. 751.

Gershman, Samuel J and David M Blei. 2012. “A tutorial on Bayesian nonparametric models”. Journal of Mathematical Psychology 56.1, pp. 1–12.

Goyal, Pawan and Gérard Huet. 2016. “Design and analysis of a lean interface for Sanskrit corpus annotation”. Journal of Language Modelling 4.2, pp. 145–182.

Goyal, Pawan, Gérard P Huet, Amba P Kulkarni, Peter M Scharf, and Ralph Bunker. 2012. “A Distributed Platform for Sanskrit Processing.” In: COLING, pp. 1011–1028.

Goyal, Pawan and Amba Kulkarni. 2014. “Converting Phrase Structures to Dependency Structures in Sanskrit”. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 1834–1843.

Hardisty, Eric A, Jordan Boyd-Graber, and Philip Resnik. 2010. “Modeling perspective using adaptor grammars”. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 284–292.

Headden III, William P, Mark Johnson, and David McClosky. 2009. “Improving unsupervised dependency parsing with richer contexts and smoothing”. In: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 101–109.

Horning, James Jay. 1969. A study of grammatical inference. Tech.
rep. Stanford University, Department of Computer Science.

Huang, Yun, Min Zhang, and Chew Lim Tan. 2011. “Nonparametric Bayesian machine transliteration with synchronous adaptor grammars”. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers, Volume 2. Association for Computational Linguistics, pp. 534–539.

Johnson, Mark. 2008a. “Unsupervised word segmentation for Sesotho using adaptor grammars”. In: Proceedings of the Tenth Meeting of the ACL Special Interest Group on Computational Morphology and Phonology. Association for Computational Linguistics, pp. 20–27.

— 2008b. “Using Adaptor Grammars to Identify Synergies in the Unsupervised Acquisition of Linguistic Structure.” In: ACL, pp. 398–406.

— 2010. “PCFGs, topic models, adaptor grammars and learning topical collocations and the structure of proper names”. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 1148–1157.

Johnson, Mark and Katherine Demuth. 2010. “Unsupervised phonemic Chinese word segmentation using Adaptor Grammars”. In: Proceedings of the 23rd International Conference on Computational Linguistics. Association for Computational Linguistics, pp. 528–536.

Johnson, Mark, Thomas L Griffiths, and Sharon Goldwater. 2007. “Adaptor grammars: A framework for specifying compositional nonparametric Bayesian models”. In: Advances in Neural Information Processing Systems, pp. 641–648.

Johnson, Mark, Thomas Griffiths, and Sharon Goldwater. 2007. “Bayesian inference for PCFGs via Markov chain Monte Carlo”. In: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pp. 139–146.

Kiparsky, Paul. 1994. “Paninian linguistics”. The Encyclopedia of Language and Linguistics 6, pp. 2918–2923.

Knuth, Donald E. 1992. “Two notes on notation”.
The American Mathematical Monthly 99.5, pp. 403–422.

Krishna, Amrith, Pavankumar Satuluri, Harshavardhan Ponnada, Muneeb Ahmed, Gulab Arora, Kaustubh Hiware, and Pawan Goyal. 2017. “A Graph Based Semi-Supervised Approach for Analysis of Derivational Nouns in Sanskrit”. In: Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing. Vancouver, Canada: Association for Computational Linguistics, pp. 66–75. url:

Krishna, Amrith, Pavankumar Satuluri, Shubham Sharma, Apurv Kumar, and Pawan Goyal. 2016. “Compound Type Identification in Sanskrit: What Roles do the Corpus and Grammar Play?” In: Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016). Osaka, Japan: The COLING 2016 Organizing Committee, pp. 1–10.

Kulkarni, Malhar, Chaitali Dangarikar, Irawati Kulkarni, Abhishek Nanda, and Pushpak Bhattacharyya. 2010. “Introducing Sanskrit Wordnet”. In: Proceedings of the 5th Global Wordnet Conference (GWC 2010), Narosa, Mumbai, pp. 287–294.

Kumar, Arun, Lluís Padró, and Antoni Oliver González. 2015. “Joint Bayesian Morphology learning of Dravidian Languages”. In: RICTA 2015: Proceedings of the Joint Workshop on Language Technology for Closely Related Languages, Varieties and Dialects, Hissar, Bulgaria, September 10, 2015.

Levenshtein, Vladimir I. 1966. “Binary codes capable of correcting deletions, insertions, and reversals”. In: Soviet Physics Doklady. Vol. 10, pp. 707–710.

Liang, Percy, Slav Petrov, Michael I Jordan, and Dan Klein. 2007. “The Infinite PCFG Using Hierarchical Dirichlet Processes.” In: EMNLP-CoNLL, pp. 688–697.

Manning, Christopher D. 2016. “Computational linguistics and deep learning”. Computational Linguistics.

Nair, Sivaja S and Amba Kulkarni. 2010. “The Knowledge Structure in Amarakosa.” In: Sanskrit Computational Linguistics. Springer, pp. 173–189.

Neubig, Graham, Taro Watanabe, Eiichiro Sumita, Shinsuke Mori, and Tatsuya Kawahara. 2011.
“An unsupervised model for joint phrase alignment and extraction”. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1. Association for Computational Linguistics, pp. 632–641.

Norvig, Peter. 2011. On Chomsky and the Two Cultures of Statistical Learning. url:

O’Donnell, Timothy J. 2015. Productivity and reuse in language: A theory of linguistic computation and storage. MIT Press.

Pinker, Steven, Emilio Bizzi, Sydney Brenner, Noam Chomsky, Marvin Minsky, and Barbara H. Partee. 2011. Keynote Panel: The Golden Age: A Look at the Original Roots of Artificial Intelligence, Cognitive Science, and Neuroscience. url:

Prince, Alan and Paul Smolensky. 1993. Optimality Theory: Constraint interaction in generative grammar. John Wiley & Sons; the version published in 2008.

Smolensky, Paul and Géraldine Legendre. 2006. The harmonic mind: From neural computation to optimality-theoretic grammar (Cognitive architecture), Vol. 1. MIT Press.

Talukdar, Partha and Koby Crammer. 2009. “New regularized algorithms for transductive learning”. Machine Learning and Knowledge Discovery in Databases, pp. 442–457.

Wong, Sze-Meng Jojo, Mark Dras, and Mark Johnson. 2012. “Exploring Adaptor Grammars for Native Language Identification”. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 699–709.

Zhai, Ke, Jordan Boyd-Graber, and Shay B Cohen. 2014. “Online adaptor grammars with hybrid inference”. Transactions of the Association for Computational Linguistics 2, pp. 465–476.

Zhai, Ke, Zornitsa Kozareva, Yuening Hu, Qi Li, and Weiwei Guo. 2016. “Query to Knowledge: Unsupervised Entity Extraction from Shopping Queries using Adaptor Grammars”. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, pp. 255–264.

Zhu, Xiaojin and Zoubin Ghahramani. 2002.
Learning from Labeled and Unlabeled Data with Label Propagation. Tech. rep.

A user-friendly tool for metrical analysis of Sanskrit verse

Shreevatsa Rajagopalan

Abstract: This paper describes the design and implementation of a tool that assists readers of metrical verse in Sanskrit (and other languages/literatures with similar prosody). It is open-source, and available online as a web application, as a command-line tool and as a software library. It handles both varṇavṛtta and mātrāvṛtta metres. It has many features for usability without placing strict demands on its users. These include allowing input in a wide variety of transliteration schemes, being fairly robust against typographical or metrical errors in the input, and “aligning” the given verse in light of the recognized metre.

This paper describes the various components of the system and its user interface, and details of interest such as the heuristics used in the identifier and the dynamic-programming algorithm used for displaying results. Although originally and primarily designed to help readers, the tool can also be used for additional applications such as detecting metrical errors in digital texts (its very first version identified 23 errors in a Sanskrit text from an online corpus), and generating statistics about metres found in a larger text or corpus. These applications are illustrated here, along with plans for future improvements.

1 Introduction

1.1 Demo

As a software tool is being discussed, it seems best to start with a demonstration of a potential user interaction with the tool. Suppose I wish to learn about the metre of the following subhāṣita (which occurs in the Pratijñāyaugandharāyaṇa attributed to Bhāsa):

kāṣṭhād agnir jāyate mathya-mānād
bhūmis toyaṃ khanya-mānā dadāti |
sotsāhānāṃ nāstyasādhyaṃ narāṇāṃ
mārgārabdhāḥ sarva-yatnāḥ phalanti ||

Then I can visit the tool’s website, enter the above verse (exactly as above), and correctly learn that it is in the metre Śālinī.
More interestingly, suppose I do not have the verse correctly: perhaps I am quoting it from memory (possibly having misheard it, and unaware of the line breaks), or I have found the text on a (not very reliable) blog, or some errors have crept into the digital text, or possibly I just make some mistakes while typing. In such a case, possibly even with an unreasonable number of mistakes present, I can still use the tool in the same way. Thus, I can enter the following error-ridden input (which, for illustration, is encoded this time in the ITRANS convention):

kaaShThaad agni jaayate
mathyamaanaad bhuumistoya khanyamaanaa /
daati sotsaahaanaaM naastyasaadhyaM
naraaNaaM maargaabdhaaH savayatnaaH phalantiihi //

Here, some syllables have the wrong prosodic weight (laghu instead of guru and vice versa), some syllables are missing, some have been introduced extraneously, not a single line of the input is correct, and even the total number of syllables is wrong. Despite this, the tool identifies the metre as Śālinī. The output from the tool, indicating the identified metre, and highlighting the extent to which the given verse corresponds to that metre, is shown in Figure 1. The rest of this paper explains how this is done, among other things.

Figure 1: A screenshot of the output from the tool for highly erroneous input. Despite the errors, the metre is correctly identified as Śālinī. The guru syllables are marked in bold, and the deviations from the expected metrical pattern (syllables with the wrong weight, or missing or superfluous syllables) are underlined (and highlighted in red).

1.2 Background

A large part of Sanskrit literature, in kāvya, śāstra and other genres, is in verse (padya) rather than prose (gadya). A verse in Sanskrit (not counting some modern Sanskrit poets’ experimentation with “free verse” and the like) is invariably in metre.

Computer tools to recognize the metre of a Sanskrit verse are not new. A script in the Perl programming language, called sscan, written by John Smith, is distributed among other utilities at the website, and although the exact date is unknown, the timestamp in the ZIP file suggests a date of 1998 or earlier for this file (Smith 1998?). This script, only 61 lines long (38 source lines not including comments and description), was the spark of inspiration that initiated the writing of the tool being described in the current paper, in 2013. Other software or programs include those by Murthy (2003?), by A. Mishra (2007) and by Melnad, Goyal, and P. M. Scharf (2015). A general introduction to metre and Sanskrit prosody is omitted in this paper for reasons of space, as the last of these papers (Melnad, Goyal, and P. M. Scharf 2015) quite excellently covers the topic.

Like these other tools, the tool being discussed in this paper recognizes the metre given a Sanskrit verse. It is available in several forms: as a web application hosted online, as a command-line tool, and as a Python library; all are available in source-code form. It is being described here for two reasons:

1. It has some new features that I think will be interesting (see Section 1.4), some of which distinguish it from other tools. The development of this tool has thrown up a few insights (see Section 4) which may be useful to others who would like to develop better tools in the future.

2. A question was raised about this tool (P. Scharf 2016), namely:

“An open-source web archive of metrically related software and data can be found at [...] with an interface at [...]. The author and contributors to this archive and data were unknown at the time and not included in our literature review.
No description of the extent, comprehensiveness, and effectiveness of the software has been found.”

I took this as encouragement that such a description may be desirable and of interest to others.

1.3 The intended user

The tool can be useful for someone trying to read or compose Sanskrit verses, and for someone checking a text for metrical errors. In other words, the tool can be used by different kinds of users: a curious learner, an editor working with a text (checking verses for metrical correctness), a scholar investigating the metrical signature of a text, or an aspiring poet. To make these concrete, consider the following “user stories” as motivating examples.

• Devadatta is learning Sanskrit. He knows that Sanskrit verse is written in metre and that this is supposed to make it easier to chant or recite. But he knows very little about the various metres, so that when he looks at a verse, especially one in a longer metre like Śārdūla-vikrīḍitam or Sragdharā, he cannot quickly recognize the metre. All he sees is a string of syllables, and he has no idea where to pause (yati), how to recite, or even where to break at pādas if they are not indicated clearly in the text he is reading. With this tool, these problems are solved, and he can focus on understanding and appreciating the poetry, now that he can read it aloud rhythmically and melodically and savor its sounds.

• Chitralekha is a scholar. She works with digital texts that, though useful to have, are sometimes of questionable provenance and do not always meet her standards of critical editions. Errors might have crept into the texts, and she has the idea that some instances of scribal emendation or typographical errors (such as omitted letters, extraneous letters, or transposed letters) are likely to cause metrical errors as well. With this tool, she can catch a great many of them (see Section 3).
Sometimes, she is interested in questions about prosody itself, such as: What are all the metres used in this text? Which ones are the most common? How frequently does the poet X use a particular “poetic license” of scansion? What are the rules governing Anuṣṭubh (Śloka), typically? This tool can help her with such questions too.

• Kamban would like to write poetry, like his famous namesake. He has a good command of vocabulary and grammar and has some poetic imagination, but when he writes a verse, especially in an “exotic” (to him) metre, he is sometimes unsure whether he has got all the syllables right. With this tool, he enters his tentative attempt and sees whether anything is off. He knows that the metres will soon become second nature to him and he will not need the tool anymore, but still he wishes he could have more help, such as choosing his desired metre, and knowing what he needs to add to his partially composed verse.

With the names of these users as mnemonics, we can say that the tool can be used to discover, check, and compose metrical verse and facts about them.

1.4 User-friendly features

As mentioned earlier, the tool has several features for easing the user’s job:

1. It accepts a wide variety of input scripts (transliteration schemes). Unlike most tools, it does not force the input to be in any particular input scheme or system of transliteration. Instead, it accepts IAST, Harvard-Kyoto, and ITRANS transliteration, and Unicode Devanāgarī and Kannada scripts, without the user having to indicate which input scheme is used. The tool is agnostic to the input method used, as it converts all input to an internal representation based on SLP1 (P. M. Scharf and Hyman 2011). It is straightforward to extend to other scripts or transliteration methods, such as SLP1 or other Indic scripts.

2. It is highly robust against typographical errors or metrical errors in the verse that is input.
This is perhaps the most interesting feature of the tool, and is useful because text in the “wild” is not always error-free.

3. It can detect the metre even from partial verses, even if the user is not aware that the verse being looked up is incomplete.

4. Informative “display” of a verse in relation to the identified metre, by aligning the verse to the metre using a dynamic-programming algorithm to find the best alignment.

5. It supports learning more about a metre, by pointing to other examples of the metre, and to audio recordings of the metre being recited in several styles (where available).

6. A quick link to provide feedback (by creating an issue on GitHub), specific to the input verse being processed on the page.

Figure 2: A “data flow diagram” of the system’s operation. The rectangles denote different forms taken by the data; the ovals denote code that transforms (or uses, or generates) the data.

2 How it works

This section describes how the system works. At a high level, there are the following steps/components:

1. Metrical data, about many known metres. This has been entered into the system.

2. Building the index (pre-processing): from the metrical data, various indices are generated.

3. Detection and transliteration: the input supplied by the user is examined, the input scheme detected, and the input transliterated into SLP1.

4. Scansion: the SLP1 text (denoting a sequence of phonemes) is translated into a metrical signature (a pattern of laghus and gurus).

5. Matching: the metrical signature is compared against the index, to identify the metre (or metres).

6.
Display: The user’s input is displayed back to the user, appropriately reformatted to fit the identified metre(s), and with highlighting of any deviations from the metrical ideal.

These steps are depicted in Figure 2, and described in more detail in the following subsections.

2.1 Metrical data

This is the raw data about all the metres known to the system. It is stored in the JSON format, so that it can be used by other programs too. In what follows, a metrical pattern is defined as a string over the alphabet {L, G}, i.e., a sequence of symbols each of which is either L (denoting laghu, a light syllable) or G (denoting guru, a heavy syllable). As described elsewhere (Melnad, Goyal, and P. M. Scharf 2015), there are two main types of metres, varṇavṛtta and mātrāvṛtta (Murthy (2003) points out that the Śloka metre constitutes a third type by itself), with the former having three subtypes:

1. samavṛtta metres, in which all four pādas of a verse have the same metrical pattern,

2. ardhasamavṛtta metres, in which odd pādas have one pattern and even pādas another (so that the two halves of the verse have the same metrical pattern),

3. viṣamavṛtta metres, in which potentially all four pādas have different metrical patterns.

Correspondingly, each metre’s characteristics are indicated in this system with the minimal amount of data necessary:

1. samavṛtta metres are represented by a list of length one (or, for convenience, simply a string), containing the pattern of each of their pādas,

2. ardhasamavṛtta metres are represented by a list of length two, containing the pattern of the odd pādas followed by the pattern of the even pādas,

3. viṣamavṛtta metres are represented by a list of length four, containing the pattern for each of the four pādas.

Additionally, yati can be indicated with the pattern; spaces can also be added, which are ignored. The yati is ignored for identification, but used later for displaying information about the metre. Here are some lines, as examples:
Here are some lines, as examples:

    {
      # ...
      ['Śālinī', 'GGGG—GLGGLGG'],
      ['Praharṣiṇī', 'GGGLLLLGLGLGG'],
      ['Bhujaṅgaprayātam', 'LGG LGG LGG LGG'],
      # ...
      ['Viyoginī', ['LLGLLGLGLG', 'LLGGLLGLGLG']],
      # ...
      ['Udgatā', ['LLGLGLLLGL', 'LLLLLGLGLG', 'GLLLLLLGLLG', 'LLGLGLLLGLGLG']],
      # ...
    }

For mātrāvṛtta metres (those based on the number of morae, or mātrās), the constraints are more subtle; as not every syllable's weight is fixed, there are so many patterns that fit each metre that it may not be efficient to generate and store each pattern separately. Instead, the system represents them using a certain conventional notation, which expands to regular expressions. This notation is inspired by the elegant notation described in another paper (Melnad, Goyal, and P. M. Scharf 2015), and uses a particularly useful description of the Āryā and related metres available in a paper by Ollett (2012).

    # ...
    ["Āryā", ["22 4 22", "4 22 121 22 .", "22 4 22", "4 22 1 22 ."]],
    ["Gīti", ["22 4 22", "4 22 121 22 ."]],
    ["Upagīti", ["22 4 22", "4 22 L 22 ."]],
    ["Udgīti", ["22 4 22", "4 22 L 22 .", "22 4 22", "4 22 121 22 ."]],
    ["Āryāgīti", ["22 4 22", "4 22 121 22 (4|2L)"]],
    # ...

Here, 2 will be interpreted as the regular expression (G|LL), and 4 as the regular expression (GG|LLG|GLL|LLLL|LGL): all possible sequences of laghus and gurus that are exactly 4 mātrās long. Note that with this notation, the frequently mentioned rule of “any combination of 4 mātrās except LGL (ja-gaṇa)” is simply denoted as 22, expanding to the regular expression (G|LL)(G|LL), which covers precisely the 4 sequences of laghus and gurus of total duration 4 other than LGL.

    Type of metre     Number
    samavṛtta           1242
    ardhasamavṛtta       132
    viṣamavṛtta           19
    mātrāvṛtta             5
    Total               1398

Table 1: The number of metres “known” to the current system.
Not too much should be read into the raw numbers, as a larger number isn't necessarily better; see Section 4.1.3 for why.

The data in the system was started with a hand-curated list of popular metres (Ganesh 2013). It was greatly extended with the contributions of Dhaval Patel, which drew from the Vṛttaratnākara and the work of Mishra (A. Mishra 2007). A few metres from these contributions are yet to be incorporated, for reasons described in Section 4.1.3. Overall, as a result of all this, at the moment we have a large number of known metres, shown in Table 1.

2.2 Metrical index

The data described in the previous section is not used directly by the rest of the program. Instead, it is first processed into data structures (which we can consider a sort of “index”) that allow for efficient lookup, even when the number of metres is huge. These enable the robustness to errors that is one of the most important features of the system. The indices are called pāda1, pāda2, pāda3, pāda4, ardha1, ardha2, and full. Each of these indices consists of an associative array (a Python dict) that maps a pattern (a “pattern” is a string over the alphabet {L, G}) to a list¹ of metres that contain that pattern (at the position indicated by the name of the index), and similarly an array that maps a regular expression to the list of metres that contain it. For instance, ardha2 maps the second half of each known metre to that metre's name. It is at this point that we also introduce laghu-ending variants for many metres (see more in 4.1.2).

¹Why a list? Because different metres can share the same pāda, for instance. And there can even be multiple names for the same metre. See Section 4.1.3 later.
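To make the index construction concrete, here is a minimal sketch of how such indices might be built from data in the format shown earlier. This is not the tool's actual code; the function and variable names are assumptions, and regex entries are omitted for brevity.

```python
from collections import defaultdict

# A hypothetical, simplified sketch of building the pattern indices
# described above from entries like ['Bhujaṅgaprayātam', 'LGG LGG LGG LGG'].
def build_indexes(metres):
    index_names = ('pada1', 'pada2', 'pada3', 'pada4', 'ardha1', 'ardha2', 'full')
    idx = {name: defaultdict(list) for name in index_names}
    for name, patterns in metres:
        if isinstance(patterns, str):      # samavrtta given as a plain string
            patterns = [patterns]
        # Strip spaces and yati marks; keep only the L/G symbols.
        patterns = [''.join(c for c in p if c in 'LG') for p in patterns]
        if len(patterns) == 1:             # samavrtta: same pattern 4 times
            padas = patterns * 4
        elif len(patterns) == 2:           # ardhasamavrtta: odd/even padas
            padas = patterns * 2
        else:                              # visamavrtta: all 4 padas given
            padas = patterns
        for i, p in enumerate(padas, start=1):
            idx['pada%d' % i][p].append(name)
        idx['ardha1'][padas[0] + padas[1]].append(name)
        idx['ardha2'][padas[2] + padas[3]].append(name)
        idx['full'][''.join(padas)].append(name)
    return idx
```

Because the values are lists, two metres sharing a pāda simply both appear under that pattern, which is exactly the behaviour the footnote above calls for.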
Section 2.5 describes how these indices are used.

Although this index is generated automatically and not written down in code, the following hypothetical code illustrates some sample entries in the ardha2 index:

    ardha2_patterns = {
        # ...
        'GGGGGLGGLGGGGGGGLGGLGG': ['Śālinī'],
        # laghu variants for illustration.
        # In reality we don't add them for Śālinī…
        'GGGGGLGGLGLGGGGGLGGLGG': ['Śālinī'],
        'GGGGGLGGLGGGGGGGLGGLGL': ['Śālinī'],
        'GGGGGLGGLGLGGGGGLGGLGL': ['Śālinī'],
        # ...
    }
    ardha2_regexes = {
        # ...
        "22 4 22" + "4 22 L 22 .": ['Āryā', 'Upagīti'],
        # ...
    }

2.3 Transliteration

The first step after users enter their input is automatic transliteration. Detecting the input scheme is based on a few heuristics. Among the input schemes initially supported (Devanāgarī, Kannada, IAST, ITRANS, and Harvard-Kyoto), the detection is done as follows:

• If the input contains any Kannada consonants or vowels, treat it as Kannada.
• If the input contains (m)any Devanāgarī consonants or vowels, treat it as Devanāgarī. Note that this should not be applied to other characters from the Devanāgarī Unicode block, such as the daṇḍa symbol, which are often used with other scripts too, as encouraged in the Unicode standard.
• If the input contains any of the characters āīūṛṝḷḹṃḥṅñṭḍśṣ, treat it as IAST.
• If the input matches the regular expression
      aa|ii|uu|[RrLl]\^[Ii]|RR[Ii]|LL[Ii]|~N|Ch|~n|N\^|Sh|sh
  treat it as ITRANS. Here, the Sh and sh might seem dangerous, but the corresponding consonant cluster is unlikely in Sanskrit.
• Else, treat the input as Harvard-Kyoto.

An option to explicitly indicate the input scheme (bypassing the automatic inference) could be added, but has not seemed necessary so far. The input is transliterated into (a subset of) the encoding SLP1 (P. M. Scharf and Hyman 2011), which is used internally, as it has many properties suitable for computer representation of Sanskrit text.
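The detection cascade above can be sketched roughly as follows. This is a simplified approximation, not the tool's actual code; the exact Unicode ranges and the ordering of checks are assumptions.

```python
import re

# ITRANS-specific sequences, as listed in the heuristic above.
ITRANS_RE = re.compile(r"aa|ii|uu|[RrLl]\^[Ii]|RR[Ii]|LL[Ii]|~N|Ch|~n|N\^|Sh|sh")

def detect_scheme(text):
    # Kannada block (U+0C80..U+0CFF).
    if any('\u0c80' <= c <= '\u0cff' for c in text):
        return 'kannada'
    # Devanagari letters only (roughly U+0904..U+0939), so that shared
    # punctuation like the danda (U+0964) does not trigger this branch.
    if any('\u0904' <= c <= '\u0939' for c in text):
        return 'devanagari'
    if any(c in 'āīūṛṝḷḹṃḥṅñṭḍśṣ' for c in text):
        return 'iast'
    if ITRANS_RE.search(text):
        return 'itrans'
    return 'hk'   # Harvard-Kyoto as the fallback
```

The order matters: script checks come first, then diacritic-bearing IAST, then ITRANS digraphs, with Harvard-Kyoto as the default when nothing else matches.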
While the input is being transliterated according to the detected scheme, known punctuation marks (and line breaks) are retained, while all “unknown” characters that have not been programmed into the transliterator (such as control characters and accent marks in Devanāgarī) are ignored.

The exact details of how the transliteration is done are omitted here, as transliteration may be regarded as a reasonably well-solved problem by now. One point worth mentioning is that there are no strict input conventions. In other work (Melnad, Goyal, and P. M. Scharf 2015), a convention like the following is adopted:

    If the input text lacks line-end markers, it is assumed to be a single pāda and to belong to the samavṛtta type of metre

Such a scheme may be interesting to explore. For now, as much as possible, the system tries to assume an untrained user, and therefore to infer all such things, or try all possibilities.

2.4 Scan

The transliteration into SLP1 can be thought of as having generated a sequence of Sanskrit phonemes (this close relationship between phonemes and the textual representation is the primary strength of the SLP1 encoding). From these phonemes, scansion into a pattern of laghus and gurus can proceed directly, without bothering with syllabification (syllabification is still done, however, for the sake of the “alignment” described later in Section 2.6). The rule for scansion is mechanical: initial consonants are dropped, and each vowel is considered as a set along with all the non-vowels that follow it before the next vowel (or the end of the text). If the vowel is long, or if there are multiple consonants in this set (treating anusvāra and visarga as consonants here, for scansion only), then we have a guru; else we have a laghu. The validity of this method of scansion, with reference to the traditional Sanskrit grammatical and metrical texts, is skipped in this paper, as something similar has been treated elsewhere (Melnad, Goyal, and P. M. Scharf 2015).
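The mechanical rule just described can be sketched as follows. This is a simplification assuming clean SLP1 input, not the tool's actual code; anusvāra (M) and visarga (H) simply fall into the consonant case, as described above.

```python
import re

# SLP1 vowels; f/F are vocalic r, x/X vocalic l, E/O are ai/au.
VOWELS = set('aAiIuUfFxXeEoO')
LONG = set('AIUFXeEoO')   # prosodically long vowels in SLP1

def scan(slp1_text):
    """Scan SLP1 text into a pattern of laghus (L) and gurus (G):
    drop initial consonants, then group each vowel with the consonants
    that follow it; long vowel or 2+ following consonants => guru."""
    text = re.sub(r'[^a-zA-Z]', '', slp1_text)   # ignore spaces/punctuation
    pattern = []
    i = 0
    while i < len(text) and text[i] not in VOWELS:
        i += 1                                    # drop initial consonants
    while i < len(text):
        vowel = text[i]
        j = i + 1
        while j < len(text) and text[j] not in VOWELS:
            j += 1                                # consonants after the vowel
        n_consonants = j - (i + 1)                # M and H count as consonants
        pattern.append('G' if (vowel in LONG or n_consonants >= 2) else 'L')
        i = j
    return ''.join(pattern)
```

For example, scanning the SLP1 text `Darmakzetre kurukzetre` yields the eight syllables dhar-ma-kṣe-tre ku-ru-kṣe-tre as GGGG LGGG.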
However, note that this is the “purist” version of Sanskrit scansion. There is an issue of śithila-dvitva, or poetic licence, which is treated in more detail in Section 4.3.

2.5 Identification

The core of the tool's robust metre identification is an algorithm that tries many possibilities for identifying the metre of the input text. Identifying the metre given a metrical pattern (the result of scansion) is done in two steps: (1) first the input is broken into several “parts” in various ways, and then (2) each of these parts is matched against the appropriate indices.

2.5.1 Parts

Given the metrical pattern corresponding to the input text, which may be either a full verse, a half-verse, or a single quarter-verse (pāda), we try to break it into parts in multiple ways. One way of breaking the input, which should not be ignored, is already given by the user, in the form of line breaks in the input. If there are 4 lines, for example, it is a strong possibility that these are the 4 pādas of the verse. If there are 2 lines, each line may contain two pādas. But what if there are 3 lines, or 5? Another way of breaking the input is by counting syllables. If the number of syllables is a multiple of 4 (say 4n), it is possible that every n syllables constitute a pāda of a samavṛtta metre. But what if the number of syllables is not a multiple of 4?

The solution adopted here is to consider all ways of breaking a pattern into k parts even when its length (say l) may not be a multiple of k. Although this would apply to any positive k, we only care about k = 4 and k = 2, so let's focus on the k = 4 case for illustration. In that case, suppose that the length l leaves a remainder r when divided by 4, that is,

    l ≡ r (mod 4), 0 ≤ r < 4,

or in other words, l can be written as l = 4n + r for some integer n, where 0 ≤ r < 4.
Then, as ⌊l/4⌋ = n (here ⌊·⌋ denotes the “floor function”, or integer division with rounding down), we can consider all the ways of breaking the string of length l into 4 parts of lengths (n+a, n+b, n+c, n+d) where a + b + c + d = r (in words: we consider all ways of distributing the remainder r among the 4 parts). For example, when r = 2, we say that a string of length 4n + 2 can be broken into 4 parts in 10 ways:

    (n, n, n, n+2)
    (n, n, n+1, n+1)
    (n, n, n+2, n)
    (n, n+1, n, n+1)
    (n, n+1, n+1, n)
    (n, n+2, n, n)
    (n+1, n, n, n+1)
    (n+1, n, n+1, n)
    (n+1, n+1, n, n)
    (n+2, n, n, n)

Similarly, there are 4 ways when r = 1, 20 ways when r = 3, and of course there is exactly one way, (n, n, n, n), when r = 0.

In this way, we can break the given string into 4 parts (in 1, 4, 10, or 20 ways) or into 2 parts (in 1 or 2 ways), either by lines or by syllables. For instance, if we are given an input of 5 lines, then there are 4 ways we can break it into 4 parts, by lines. What we do with these parts is explained next.

2.5.2 Lookup/match

Once we have the input broken into the appropriate number of parts (based on whether we're treating it as a full verse, a half-verse, or a pāda), we look up each part in the appropriate index. For a particular index, matching against patterns is a direct lookup (we do not have to loop through all patterns in the index). To match against regexes, we do indeed loop through all regexes, which are fewer in number than the patterns. If needed,
If needed,we can trade-off time and memory here; for instance, we could have indexeda large number of instantiated patterns instead of regexes even for mātrāMetrical analysis of Sanskrit verse 127treating input askind of index full verse half verse single pādapāda1 first part of 4 first part of 2 the full inputpāda2 second part of 4 second part of 2 the full inputpāda3 third part of 4 first part of 2 the full inputpāda4 fourth part of 4 second part of 2 the full inputardha1 first part of 2 the full input -ardha2 second part of 2 the full input -full the full input - -Table 2What to match or look up, depending on how the input is being treated.Everywhere in the table above, phrases like “first part of 4” mean both bylines and by syllables. For instance, when treating the input as a full verse,the first )/4 part by lines and the first )/4 part by syllables are bothmatched against the pāda1 index.metres. Note that in this way, to match an ardhasamavṛtta or a viṣamavṛttathat has been input perfectly, we search directly for the full pattern (of theentire verse) in the index. We do not have to run a loop for breaking aline into pādas in all possible ways, as in (Melnad, Goyal, and P. M. Scharf2015). Details of which indices are looked up are in Table 2.2.6 Align/DisplayThe metre identifier, from the previous section, results in a list of metresthat are potential matches to the input text. Not all of them may matchthe input verse perfectly; some may have been detected based on partialmatches. Whatever the reason for this imperfect match (an over-eagermatching on the part of the metre identifier, or errors in the input text),it would be useful for the user to see how closely their input matches agiven metre. 
And even when the match is perfect, aligning the verse to the metre can help highlight the pāda breaks, the location of yati, and so on. This is done by the tool, using a simple dynamic-programming algorithm very similar to the standard algorithm for the longest-common-subsequence problem: in effect, we simply align both strings (the metrical pattern of the input verse, and that of the known metre) along their longest common subsequence.

What this means is that given two strings s and t, we use a dynamic programming algorithm to find the minimal set of “gap” characters to insert in each string, such that the resulting strings match wherever both have a non-gap character (and never have a gap character in both). For example:

    ('abcab', 'bca')     => ('abcab', '-bca-')
    ('hello', 'hello')   => ('hello', 'hello')
    ('hello', 'hell')    => ('hello', 'hell-')
    ('hello', 'ohell')   => ('-hello', 'ohell-')
    ('abcdabcd', 'abcd') => ('abcdabcd', 'abcd----')
    ('abcab', 'acb')     => ('abcab', 'a-c-b')
    ('abcab', 'acbd')    => ('abcab-', 'a-c-bd')

We use this algorithm on the verse pattern and the metre's pattern, to decide how to align them. Then, using this alignment, we display the user's input verse in its display version (transliterated into IAST, and with some recognized punctuation retained). Here, laghu and guru syllables are styled differently in the web application (styling customizable with CSS). The display also highlights each location of yati or caesura (if known and stored for the metre), so that the user can see if their verse violates any of the subtler rules, such as words straddling yati boundaries.

This algorithm could also be used for ranking the results, based on the degree of match between the input and each result (metre identified).

3 Text analysis and results

As part of testing the tool (and as part of pursuing the interest in literature and prosody that led to the tool in the first place), a large number of texts, such as those from GRETIL,² were examined.
Although primarily designed to help readers, the tool can also be used to analyze a metrical text, to catch errors or generate statistics about the metres used. In the very first version of the tool, the first metre added was Mandākrāntā, and the tool was run on a text of the Meghadūta from GRETIL, the online corpus of Sanskrit texts. This text was chosen because the Meghadūta is well known to be entirely in the Mandākrāntā metre, so the “gold standard” to use as a reference to compare against was straightforward. Surprisingly, the tool identified 23 errors in the 122 verses!³ These were communicated to the GRETIL maintainer.

Similarly, testing of the tool on other texts highlighted many errors. Errors identified in the GRETIL text of Bhartṛhari's Śatakatraya were carefully compared against the critical edition by D. D. Kosambi.⁴ In this text, as in Nīlakaṇṭha's Kali-viḍambana⁵ and Bhallaṭa's Bhallaṭa-śataka,⁶ and in almost all cases, the testing highlighted errors in the text, rather than any in the metre recognizer. This constitutes evidence that the recognizer has a high accuracy, approaching 100%, though the lack of a reliable (and nontrivial) “gold standard” hinders attaching a numeric value to the accuracy. In the terminology of “precision and recall”, the recognizer has a recall of 100% on the examples tested (for example, no verse that is properly in Śārdūlavikrīḍitam fails to be recognized as that metre), while the precision was lower and harder to measure because of errors in the input (sufficiently many errors can make a verse partially match a different metre).

After sufficiently fixing the tool and the text so that the Meghadūta was recognized as being 100% in the Mandākrāntā metre, other texts were examined.

²Göttingen Register of Electronic Texts in Indian Languages and related Indological materials from Central and Southeast Asia, http://gretil.sub.uni-goettingen.de
These statistics⁷ confirmed that, for example, the most common metres in the Amaruśataka are Śārdūlavikrīḍitam (57%), Hariṇī (13%) and Śikhariṇī (10%), while those in Kālidāsa's Raghuvaṃśa are Śloka, Upajāti and Rathoddhatā. And so on. Once errors in the texts are fixed, this sort of analysis can give insights into the way different poets use metre. It can also be used by students to know which are the most common metres to focus on learning, at least for a given corpus. Other sources of online texts, like TITUS, SARIT⁸ or The Sanskrit Library,⁹ could also be used for testing the system.

³See a list of 23 errors and 3 instances of broken sandhi detected in one of the GRETIL texts of the Meghadūta (October 2013).
⁴See a list of errors found, in diff format, with comments referring to the location of the verse in Kosambi.

4 Interesting issues and computational experience

Some insights and lessons learned as a result of this project are worth highlighting, as are some of the design decisions that were made either intentionally or unconsciously.

4.1 Metrical data

4.1.1 The gaṇa-s

For representing the characteristics of a given metre, a popular scheme used by all Sanskrit authors of works on prosody is the use of the 8 gaṇas. Each possible laghu-guru combination of three syllables (trika), namely each of the 2³ = 8 possibilities LLL, LLG, LGL, LGG, GLL, GLG, GGL, GGG, is given a distinct name (na, sa, ja, ya, bha, ra, ta, ma respectively), so that a long pattern of laghus and gurus can be concisely stated in groups of three. This is an excellent mnemonic and space-saving device, akin to writing in octal instead of binary.
For instance, the binary number 110110010101₂ can be written more concisely as the octal number 6625₈, and the translation between them is immediately apparent: simply treat each group of three binary digits (bits) as an octal digit, or conversely expand each octal digit to its three-bit representation. Similarly, the pattern GGLGGLLGLGLG of Indravaṃśa can be expressed more concisely by the description “ta ta ja ra”. Moreover, another mnemonic device of unknown origin uses a string “yamātārājabhānasalagaṃ” that traverses all the 8 gaṇas (and the names la and ga used for any “leftover” laghus and gurus respectively), assigning them syllable weights (via vowel lengths) such that the three syllables starting at each of the 8 consonants are themselves in the gaṇa named by that consonant.¹⁰

Thus we can see that the gaṇa names are a useful mnemonic and space-saving device, and yet at the same time, from an information-theoretic point of view, they contain absolutely no information that is not present in the expanded string (the pattern of Ls and Gs). Moreover, for a typical reader who is not trying to memorize the definitions of metres (either in the GGLGGLLGLGLG form or the “ta ta ja ra” form), the gaṇas add no value and serve only to further mystify and obscure the topic. They can also be misleading as to the nature of yati breaks in the metre, as the metre being described is rarely grouped into threes, except for certain specific metres (popularly used in stotras) such as भुजङ्गप्रयातम्, तोटकम्, and स्रग्विणी. One can as easily (and more enjoyably) learn the pattern of a metre by committing a representative example (a good verse in that metre) to memory, rather than the definition using gaṇas, as the author and others know from personal experience.

⁸http://sarit.indology.info
⁹http://sanskritlibrary.org
¹⁰In the modern terminology of combinatorics, this is a de Bruijn sequence.
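The gaṇa encoding itself is mechanical, exactly like the octal grouping above, and can be sketched in a few lines (a hypothetical helper, not part of the tool):

```python
# The gana names and their three-syllable patterns, as listed above.
GANAS = {'LLL': 'na', 'LLG': 'sa', 'LGL': 'ja', 'LGG': 'ya',
         'GLL': 'bha', 'GLG': 'ra', 'GGL': 'ta', 'GGG': 'ma'}

def to_ganas(pattern):
    """Encode an L/G pattern in groups of three, naming any leftover
    syllables la/ga, analogous to grouping bits into octal digits."""
    out = []
    for i in range(0, len(pattern), 3):
        chunk = pattern[i:i + 3]
        if len(chunk) == 3:
            out.append(GANAS[chunk])
        else:
            out.extend('la' if c == 'L' else 'ga' for c in chunk)
    return ' '.join(out)
```

For instance, the Indravaṃśa pattern GGLGGLLGLGLG encodes to “ta ta ja ra”, and the Bhujaṅgaprayātam pattern LGGLGGLGGLGG to “ya ya ya ya”.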
For these reasons, the gaṇa information is de-emphasized in the tool described in this paper.

4.1.2 pādānta-laghu

Sanskrit poetic convention is that the very last syllable in a verse can be laghu even if the metre requires it to be guru. Consider, for instance, the very first verse of Kālidāsa's Meghadūta, in the Mandākrāntā metre:

    kaścit kāntā-viraha-guruṇā svādhikārāt pramattaḥ
    śāpenāstaṃgamita-mahimā varṣa-bhogyeṇa bhartuḥ
    yakṣaś cakre janaka-tanayā-snāna-puṇyodakeṣu
    snigdhacchāyā-taruṣu vasatiṃ rāmagiryāśrameṣu

Even though the Mandākrāntā requires in each pāda a final syllable that is guru, the final syllable of the verse above is allowed to be ṣu, which, if it occurred in another position (and not followed by a consonant cluster), would be treated as a laghu syllable. A similar convention, though not always stated as clearly in texts on prosody, more or less applies at the end of each half (ardha, or pair of pādas) of the verse (for an example, see the kāṣṭhād agnir… verse in Śālinī from Section 1.1).

The question of such a laghu at the end of odd pādas (viṣama-pādānta-laghu) is a thorny one, with no clear answers. Even the word of someone like Kedārabhaṭṭa cannot be taken as final on this matter, as it needs to hold up to actual usage and to what is pleasing to the trained ear. Certainly we see such laghus being used liberally in metres like Śloka, Upajāti and Vasantatilakā. At the same time, there are metres like Śālinī where this would be unusual. The summary from those well versed in the metrical tradition¹¹ is that such laghus are best avoided (and are therefore unusual in the works of the master poets) in yati-prabala metres, those where the yati is prominent. This is why Śālinī, with 11 syllables to a pāda, requires a stricter observance of guru at the end of odd pādas than a metre like Vasantatilakā with 14. As a general rule of thumb, though, such viṣama-pādānta-laghus can be regarded as incorrect in metres longer than Vasantatilakā.
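Mechanically, the convention can be accommodated by indexing laghu-final variants alongside the canonical pattern, as mentioned in Section 2.2. The sketch below (assumed names, not the tool's code) relaxes every pāda-final guru unconditionally, ignoring the subjective question of which metres actually permit it:

```python
from itertools import product

def pada_endings(pada):
    """A pada pattern, plus its laghu-final variant if it ends in a guru."""
    return [pada, pada[:-1] + 'L'] if pada.endswith('G') else [pada]

def ardha_variants(odd_pada, even_pada):
    """All half-verse patterns obtained by optionally relaxing the
    final guru of each of the two padas (padanta-laghu)."""
    return [a + b for a, b in product(pada_endings(odd_pada),
                                      pada_endings(even_pada))]
```

Two guru-final pādas thus yield four half-verse patterns, which is exactly the shape of the Śālinī illustration in Section 2.2.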
It is not clear how a computer could automatically make such subjective decisions, so something like the idea (Melnad, Goyal, and P. M. Scharf 2015) of storing a boolean parameter recording which metres allow this option seems desirable. Still, the question of how that boolean parameter is to be chosen remains open.

4.1.3 Is more data always better?

It seems natural that having data about more metres would lead to better decisions and better results, but in practice, some care is needed. A common problem is that when there are too many metres in our database, the likelihood of false positives increases. To see this more clearly, imagine a hypothetical case in which every possible combination of laghu and guru syllables was given its own name as a metre: in that case, a verse intended to be in the metre Śārdūlavikrīḍitam, say, with even a single error, would perfectly match some other named metre, and we would be misled as to the truth. A specific case where this happens easily is when a user inputs a single pāda but the system tries to treat it as a full verse. In this case, the quarters of the input, as they are much shorter, are more likely to match some metre accidentally. The solution of returning multiple results (a list of results rather than a single result) alleviates this problem (cf. the idea of list decoding from computer science).

A related problem is the over-precise naming of metres. We know that Indravajrā and Upendravajrā differ only in the weight of the first syllable, and that the Upajāti metre consists of free alternation between them across the four pādas of a verse, as for this particular metre, the weight of the first syllable does not matter too much. However, there exist theorists of prosody who have, to each of the 2⁴ = 16 possibilities (all the ways of combining Indravajrā and Upendravajrā), given names like Māyā, Premā, Mālā, Ṛddhiḥ and so on (A. Mishra 2007).
This is not very useful to a reader, as in such cases, the metre in question is, in essence, really more common than such precise naming would make it seem. Velankar (1949) even considers the name Upajāti as arising from the “displeasure” of the “methodically inclined prosodist”.

Another issue is that data compiled from multiple works on prosody (or sometimes even from the same source) can have inconsistencies. It can happen that the same metre is given different names in different sources (Velankar 1949, p. 59). This is very common with noun endings that mark gender, such as -ā versus -aṃ, but we also see cases where completely different names are used. It can also happen that the same name is used for entirely different metres (see also the confusion about Upajāti mentioned below in Section 4.4). For these reasons, instead of storing each metre as a (name, pattern) pair as mentioned earlier, or as the (better) (name, pattern, bool) triple (Melnad, Goyal, and P. M. Scharf 2015), it seems best to store a (pattern, bool, name, source for name) tuple. I started naively, thinking the names of metres were objective truth; as a result of this project I realized that names are assigned with some degree of arbitrariness.

Finally, a couple more points: (1) There exist metres that end with laghu syllables, and the code should be capable of handling them. (2) It is better to keep metrical data as data files, rather than code. This was a mistake made in the initial design of the system. Although it did not deter helpful contributors like Dhaval Patel from contributing code-like definitions for each metre, it is still a hindrance that is best avoided.

¹¹Śatāvadhānī R. Ganesh, personal communication
Keeping data in data files is language-agnostic and would allow it to be used by other tools.

Overall, however, despite these issues, the situation is not too bad, because it is mostly a small set of metres that is used by most poets. Although the repertoire of Sanskrit metres is vast (Deo 2007), and even the set of commonly used metres is larger in Sanskrit than in other languages, nevertheless, as with rāgas in music, although names can be and have been given to a great many combinations, not every mathematical possibility is an aesthetic possibility.¹²

4.2 Transliteration

It appears that accepting input in various input schemes is one of the features of the tool that users enjoy. Although the differences between the various input schemes are mostly superficial and easily learned, it appears that many people have a preferred scheme that they would like to employ wherever possible. These are fortunately easy for computers to handle.

As pointed out elsewhere in detail (P. M. Scharf and Hyman 2011), the set of graphemes or phonemes one might encounter in putatively Sanskrit input is larger than that supported by common systems of transliteration like Harvard-Kyoto or IAST. Characters like chandrabindu and ळ will occur in the input, especially with modern poetry or verse from other languages. The system must be capable of doing something reasonable in such cases.

A perhaps unusual choice is that the system does not currently accept input in SLP1, even though SLP1 is used internally. The simple reason is that no one has asked for it, and it does not seem that many people type in SLP1. SLP1 is a great internal format and can be a good choice for interoperability between separate tools, but it seems that the average user does not prefer typing kfzRaH for कृष्णः.

¹²This remark comes from Śatāvadhānī Ganesh, who has pointed this out multiple times.
Nevertheless, this is a minor point, as this input method can easily be added if anyone wants it.

In an earlier paper (Melnad, Goyal, and P. M. Scharf 2015), two of the deficiencies stated about the tool by Mishra (A. Mishra 2007) are that:

1. By supporting only Harvard-Kyoto input, that tool requires special treatment of words with consecutive a-i or a-u vowels (such as the word प्रउग). In this tool, as Devanāgarī input is accepted, such words can be input (besides, of course, by simply inserting a space).
2. That tool does not support accented input, which (Melnad, Goyal, and P. M. Scharf 2015) do, because they accept input in SLP1. In this tool, accented input is accepted if input as Devanāgarī. However, as neither this tool nor the one by (Melnad, Goyal, and P. M. Scharf 2015) supports Vedic metre, this point seems moot: Sanskrit poetry in the classical (non-Vedic) metres is not often accompanied by accent markers! In this tool, accent marks in Devanāgarī are accepted but ignored.

4.3 Scansion

As a coding shortcut when the program was first being written, I decided to treat anusvāra and visarga as consonants for scansion, instead of handling them specially. To my surprise, I have not had to revise this and eliminate the shortcut, because, in every instance, the result of scansion is the same. I am not aware of any text on prosody treating anusvāra and visarga as consonants, but their identical treatment is valid for Sanskrit prosody. This is a curious insight that the technological constraints (or laziness) have given us!

As mentioned in earlier work (Melnad, Goyal, and P. M. Scharf 2015), in later centuries of the Sanskrit tradition, there evolved an option of considering certain guru syllables as laghu, as a sort of poetic licence, in certain cases. Specifically, certain consonant clusters, especially those containing r, like pr and hr, were allowed to be treated as if they were single consonants at the start of a word.
This rule is stated by Kedārabhaṭṭa too, and seems to be freely used in the Telugu tradition even today. A further trend is to allow this option everywhere, based on how “effortlessly” or “quickly” certain consonant clusters can be pronounced, compared with others. A nuanced understanding of this matter comes from a practising poet and scholar of Sanskrit literature, Śatāvadhānī R. Ganesh:¹³ this practice arose from the influence of Prākṛta and Deśya (regional) languages (for instance, it is well codified as a rule in Kannada and Telugu, under the name of śithila-dvitva). It was also influenced by music; Ganesh cites the treatise चतुर्दण्डीप्रकाशिका. He concludes that, as a conscientious poet, he will follow poets like Kālidāsa, Bhāravi, Māgha, Śrīharṣa and Viśākhadatta in not using this exception when composing Sanskrit, but will use it sparingly when composing in languages like Kannada, where prior poets have used it freely.

With this understanding,¹⁴ the question arises whether the system needs to encode this exception, especially for dealing with later or modern poetry. This could be done, but as a result of the system's robustness to errors, in practice, it turns out to be less necessary. Any single verse is unlikely to exploit this poetic licence in every single pāda, so the occasional use of this exception does not prevent the metre from being detected. The only caveat is that this already counts as an error, so verses that exploit this exception would have slightly lower robustness to further additional errors.

¹³Personal communication, but see also corroboration at another summary.

4.4 Identification

It is not enough for a verse to have the correct scansion (the correct pattern of laghu and guru syllables) for it to be a perfect specimen of a given metre. There are additional constraints, such as yati: because a pause is indicated at each yati-sthāna (caesura), a word must not cross such a boundary, although separate lexical components of a compound word (samāsa) may.
Previously (Melnad, Goyal, and P. M. Scharf 2015), an approach has been suggested of using a text segmentation tool such as the Sanskrit Heritage Reader (Huet 2005; Huet and Goyal 2013) for detecting when such a constraint is violated. This would indeed be ideal, but the tool described in this paper alleviates the problem by displaying the user's input verse aligned to the metre, with each yati-sthāna indicated. Thus, any instance of a word crossing a yati boundary will be apparent in the display.

Note that we can provide information on all kinds of Upajāti, even if they are not explicitly added to our database, a problem mentioned previously (Melnad, Goyal, and P. M. Scharf 2015). Upajāti just means “mixture”; the common Upajāti of Indravajrā and Upendravajrā, as a metre, has nothing to do with the Upajāti of Vaṃśastha and Indravaṃśa (Velankar 1949). In fact, the latter is sometimes known by the more specific name of Karambajāti,¹⁵ among other names. Whenever an Upajāti of two different metres is used and input correctly, each of the two metres will be recognized and shown to the user, because different pādas will match different patterns in our index. So without us doing any special work of adding all the kinds of Upajāti to the data, the user can see, in any given instance, that their verse contains elements of both metres, and in exactly what way. Of course, adding the “mixed” metre explicitly to the data would be more informative to the user, if the mixture is a common one.

4.5 Display

Once a metre is identified, for some users, telling the user the name of the metre may be enough. However, if we envision this tool being used by anyone reading any Sanskrit verse (such as Devadatta from Section 1.3), then for many users, being told the name of the metre (or even the metre's pattern) carries mainly the information that the verse is in some metre, but does not substantially improve the reader's enjoyment of the verse. Seeing the verse

¹⁵Śatāvadhānī R.
aligned to the metre, with line breaks introduced in the appropriate places and yati locations highlighted, helps a great amount. What would help the most, however, is a further introduction to the metre, along with familiar examples that happen to be in the same metre, and audio recordings of typical ways of reciting the metre.

The tool does this for popular metres (see Figure 1), drawing on another resource (Ganesh 2013). In these audio recordings made in 2013, Śatāvadhānī R. Ganesh describes several popular metres, with well-chosen examples (most recited from memory and some composed extempore for the sake of the recordings). Some interesting background, such as the metre's usage in the tradition (a brief "biography" of the metre), is also added for some metres. Although they were not created for the sake of this tool, it was the same interest in Sanskrit prosody that led both to the creation of this tool and to my request for these recordings. Showing the user's verse accompanied by examples of recitation of other verses in the same metre helps the user read aloud and savor the verse they input.

Incidentally, an introduction to metres via popular examples and accompanying audio recordings is also the approach taken by the book Chandovallarī (S. Mishra 1999). The examples chosen there are mostly from the stotra literature, and so are most likely to be familiar to an Indian audience. In this way, it can complement the recordings mentioned in the previous paragraph, in which the examples were often chosen for their literary quality or illustrative features.
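The aligned display described in this section can be sketched minimally as follows. The syllable split (in SLP1), the all-guru pattern, and the yati position are chosen purely for illustration; the real tool's display is much richer (highlighting, introductions, audio).

```python
# Render a pada aligned to its metre pattern, marking the yati
# (caesura) with '|' so that a word crossing it is visually apparent.
def align_pada(syllables, pattern, yati_after):
    """Return two display lines: the syllables, and under each one its
    expected weight, with the yati boundary marked in both lines."""
    assert len(syllables) == len(pattern)
    widths = [len(s) for s in syllables]
    top, bottom = [], []
    for i, (syl, weight) in enumerate(zip(syllables, pattern)):
        top.append(syl)
        bottom.append(weight.ljust(widths[i]))
        if i + 1 == yati_after and i + 1 < len(syllables):
            top.append("|")
            bottom.append("|")
    return [" ".join(top).rstrip(), " ".join(bottom).rstrip()]

# Illustrative input: the syllables of "kascit kanta" in SLP1, with an
# invented yati after the second syllable.
for line in align_pada(["kaS", "cit", "kAn", "tA"], "GGGG", yati_after=2):
    print(line)
# kaS cit | kAn tA
# G   G   | G   G
```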
Write systems that are "liberal" in what they accept, but are nevertheless conservative enough to avoid making errors (Postel's law).

There exist users who may not have much computer science or programming knowledge, but are nevertheless scholars who are experts in a specific subject. For example, India's tech penetration is low; even many Sanskrit scholars aren't trained or inclined to enter verse in standard transliteration formats. The very fact that they are visiting your tool and using it means that they constitute a self-selecting sample. It would be a shame not to use their expertise. Their contributions and suggestions can help improve the system. In the case of this tool, the link to GitHub discussion pages, and making it easy with a quick link to report issues encountered during any particular interaction, have generated a lot of improvements, both in terms of usability and correctness. A minor example of a usability improvement is setting things up so that the text area is automatically focused when a user visits the web page: this is trivial to set up, but not something that had occurred to me as desirable. In this case, a user asked for it.

Though user feedback guided many design decisions, gathering and acting on more of the user feedback would lead to further improvements.

5 Conclusions and future work

This paper has described a tool for metre recognition that takes various measures to be as useful to users as possible. In this section, we list the current limitations of the tool and improvements that can be (and are planned to be) made.

In terms of transliteration, though many transliteration schemes are supported, even the requirement to be in a specific transliteration scheme is too onerous; instead, the tool must let the user type, and in real time display its understanding of the user's input, while offering convenient input methods (such as a character picker16) that do not require prior knowledge of how to produce specific characters.
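One concrete form of such liberality is auto-detecting the user's input scheme instead of demanding one. The following is a minimal sketch of the idea under simplified assumptions (only Devanagari and diacritic-based romanization are distinguished); the actual tool recognizes many more schemes, such as ITRANS, SLP1, Harvard-Kyoto, and other Indic scripts.

```python
# Guess the script/scheme of the user's input from character ranges
# and diacritics, rather than requiring the user to declare it
# (liberal in what we accept). Deliberately simplified sketch.
def detect_scheme(text):
    """Return a rough guess at the transliteration scheme of text."""
    # Any character in the Devanagari Unicode block (U+0900-U+097F)?
    if any("\u0900" <= ch <= "\u097f" for ch in text):
        return "devanagari"
    # IAST is recognizable by its diacritic letters.
    if any(ch in "āīūṛṝḷṃḥśṣṭḍṇṅñ" for ch in text):
        return "iast"
    # Plain ASCII input: could be ITRANS, SLP1, Harvard-Kyoto, ...
    return "unknown-roman"

print(detect_scheme("धर्मक्षेत्रे कुरुक्षेत्रे"))  # devanagari
print(detect_scheme("dharmakṣetre kurukṣetre"))   # iast
print(detect_scheme("dharmakShetre"))             # unknown-roman
```

In practice such a guess would feed a real-time preview of the tool's reading of the input, so a wrong guess is immediately visible and correctable.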
Similarly, on the output side, a user's preferred script for reading Sanskrit (which may not be the same as their input script) should be used and remembered for future sessions, so that, for instance, a user can use the tool entirely with all Sanskrit text shown in the Kannada script. There may even exist users who prefer to read everything in SLP1!

Very few mātrā metres are currently supported (only members of the Āryā family have been added). There are many simple mātrā metres used in stotras, such as a metre consisting of alternating groups of 3 and 4 mātrās. More examples for each