UBC Theses and Dissertations
PSS : a phonetic search system for short text documents
Zhang, Jerry Jiaer
Abstract
Finding the right information in the growing amount of data on the Internet is not easy. Most people therefore rely on search engines, which make searching less difficult through a variety of techniques. In this thesis, we address one of them, called phonetic matching. The idea is to look for documents in a document set based not only on the spelling of words but on their pronunciation as well. It is useful when a query contains spelling mistakes or when a correctly spelled query does not return enough results. In these cases, phonetic matching can fix or tune the original query by replacing some or all query words with new ones that are phonetically similar, and hopefully achieve more hits. We propose the design of such a search system for short text documents. It allows single- and multiple-word queries to be matched to similar-sounding words or phrases contained in a document set, and sorts the results by their relevance to the original queries. Our design differs from many existing systems in that, instead of relying heavily on an extensive set of prior user query logs, our system makes search decisions mostly based on a relatively small dictionary consisting of organized metadata. Our goal is to give start-up document sets phonetic search ability comparable to that of bigger databases, without having to wait until enough historical user queries have accumulated.
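To make the idea of phonetic matching concrete, here is a minimal sketch in Python. It is not the system proposed in the thesis: the abstract does not say which phonetic algorithm or ranking scheme PSS uses, so the sketch substitutes the classic Soundex code as a stand-in, and ranks documents simply by how many query words have a sound-alike in them. The names `soundex`, `build_index`, and `search` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Soundex digit for each consonant; vowels, h, w and y have no digit.
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Return the 4-character American Soundex code of a word ("" if no letters)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    first = word[0]
    digits = []
    prev = _CODES.get(first, "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a digit
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated digit is kept
    return (first.upper() + "".join(digits) + "000")[:4]


def build_index(docs):
    """Map each Soundex code to the ids of the short documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.split():
            code = soundex(word)
            if code:
                index[code].add(doc_id)
    return index


def search(query, docs, index):
    """Return documents ordered by how many query words they match by sound."""
    scores = defaultdict(int)
    for word in query.split():
        for doc_id in index.get(soundex(word), ()):
            scores[doc_id] += 1
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return [docs[i] for i in ranked]


docs = ["Robert Smith quartet", "Rupert Smyth live", "Jazz trio"]
index = build_index(docs)
print(search("Robbert Smithe", docs, index))
```

With these assumptions, the misspelled query "Robbert Smithe" still retrieves the first two documents, because "Robbert", "Robert", and "Rupert" share one Soundex code and "Smithe", "Smith", and "Smyth" share another. A dictionary-driven design like the one the abstract describes would presumably replace both the encoding and the scoring used here.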
Item Metadata
Title | PSS : a phonetic search system for short text documents |
Creator | |
Publisher | University of British Columbia |
Date Issued | 2008 |
Extent | 1031338 bytes |
Genre | |
Type | |
File Format | application/pdf |
Language | eng |
Date Available | 2009-03-05 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0051236 |
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia |
Graduation Date | 2008-11 |
Campus | |
Scholarly Level | Graduate |
Rights URI | |
Aggregated Source Repository | DSpace |