Abstract: With the rapid growth of online information, a simple search query may return thousands or even millions of results. There is a need to help users access and identify relevant information in a flexible way. This paper describes a methodology that automatically maps web search results into user-defined categories. This allows users to focus on the categories that interest them, helping them find relevant information in less time. A text classification algorithm is used to map search results into categories. This paper focuses on the feature selection method and the term weighting measure in order to train an optimal and simple category model from a relatively small number of training texts. Experimental evaluations on real-world data collected from the web show that our classification algorithm gives promising results and can potentially be used to classify search results returned by search engines.
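The abstract above does not say which feature selection method, term weighting measure, or classifier the authors use. As a rough illustration only, the sketch below assumes TF-IDF weighting, selection of the top-weighted terms per category, and a nearest-centroid classifier with cosine similarity. The function names (`tokenize`, `train`, `classify`), the `top_k` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; keeps purely alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]


def train(labeled_texts, top_k=20):
    """Build one weighted term vector per category.

    labeled_texts: list of (category, text) pairs.
    top_k: assumed feature-selection cutoff (terms kept per category).
    """
    docs = [(cat, Counter(tokenize(text))) for cat, text in labeled_texts]
    n_docs = len(docs)

    # Document frequency of each term across all training texts.
    df = Counter()
    for _, tf in docs:
        df.update(tf.keys())

    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {w: math.log((1 + n_docs) / (1 + c)) + 1.0 for w, c in df.items()}

    # Sum TF-IDF weights per category.
    sums = defaultdict(Counter)
    for cat, tf in docs:
        for w, count in tf.items():
            sums[cat][w] += count * idf[w]

    # Feature selection: keep only the top_k heaviest terms per category.
    model = {cat: dict(weights.most_common(top_k)) for cat, weights in sums.items()}
    return model, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(snippet, model, idf):
    # Weight the snippet with the training IDF, then pick the closest category.
    tf = Counter(tokenize(snippet))
    vec = {w: count * idf[w] for w, count in tf.items() if w in idf}
    return max(model, key=lambda cat: cosine(vec, model[cat]))


# Toy training set (made up for illustration).
training = [
    ("sports", "football match goal team"),
    ("sports", "tennis match player score"),
    ("finance", "stock market price bank"),
    ("finance", "bank loan interest rate"),
]
model, idf = train(training)
print(classify("team wins the match", model, idf))
print(classify("bank interest rate", model, idf))
```

Keeping only a handful of high-weight terms per category is one way to obtain a compact category model from a small training set, which is the goal the abstract states; other selection criteria, such as chi-square or information gain, would slot into the same place.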
A restricted-domain speech application requires a synthesizer whose output quality is as high as that of the 'slot-filler' approach but which retains at least a minimal degree of the flexibility of a 'genuine' speech synthesizer. In this study, we therefore propose an alternative approach to building a speech synthesizer for restricted-domain speech applications. In our approach, the word is the primary unit and the speech corpus is represented by syntax-prosody tree structures. Speech synthesis is performed by constructing a syntax-prosody tree for the target input sentence. The tree is constructed by adapting an example-based syntactic parsing approach, and the synthesized utterance is the concatenation of the synthesis units taken from the nodes of the constructed tree. For evaluation, we conducted a subjective MOS evaluation comparing our speech synthesizer with natural speech and two other Malay TTS systems. Based on ANOVA and t-test analyses, the overall MOS scores were as follows: our speech synthesizer output, sound B (mean = 3.34, sd = 1.10); the other two Malay TTS systems, C (mean = 1.95, sd = 0.72) and D (mean = 1.80, sd = 1.04); and natural speech, A (mean = 4.71, sd = 0.21). We conclude that our Malay speech synthesizer sounded more natural, easier to listen to, more pleasant and more fluent than the other two Malay TTS systems. As expected, the recorded speech was perceived as more natural than the output of our Malay speech synthesizer.
In this paper, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is introduced. To describe the correspondence between different languages, we propose a variant of SSTC called synchronous SSTC (S-SSTC). We also describe how S-SSTC provides the flexibility to treat some non-standard cases that are problematic for other synchronous formalisms. The proposed S-SSTC schema is well suited to describing the correspondence between different languages, in particular to relating a text to its translation in another language (i.e. in Machine Translation). It can also be used as an annotation for translation systems that automatically extract transfer mappings (rules or examples) from bilingual corpora. S-SSTC is very well suited to the construction of a Bilingual Knowledge Bank (BKB), where the examples are kept in the form of S-SSTCs.
Search over structured web resources, such as XML data and services, still lacks a method of its own and relies on contemporary search systems. This paper presents a method that learns semantics from the structured information of these resources. Instead of committing the semantic meaning of resources to strict, formal vocabularies such as an ontology or a data dictionary, we are interested in interpreting the meaning based on the natural context of the resources. The semantics are used in the search process, i.e. in query reasoning and resource selection, to provide better answers in terms of contextual relevance and clearer result descriptions.