Siti Khaotijah Mohammad scite author profile

Siti Khaotijah Mohammad

5 Publications

24 Citation Statements Received

16 Citation Statements Given

Affiliations

Universiti Sains Malaysia, Hospital Universiti Sains Malaysia

Publications

A Malay Text Corpus Analysis for Sentence Compression Using Pattern-Growth Method

et al. 2016

An extractive text summary serves as a condensed representation of a written input source in which important and salient information is kept. However, the condensed representation itself suffers from a lack of semantics and coherence if the summary is produced verbatim from the input. Sentence compression is a technique in which unimportant details are eliminated from a sentence while preserving the sentence's grammatical pattern. In this study, we analyzed our Malay text corpus to discover the rules and patterns by which human summarizers compress sentences and eliminate unimportant constituents to construct a summary. A pattern-growth-based model named Frequent Eliminated Pattern (FASPe) is introduced to represent the text using a set of sequences of adjacent words that are frequently eliminated across the document collection. From the rules obtained, some heuristic knowledge about sentence compression is presented, with confidence values as high as 85%, that can be used for further reference in text summarization for the Malay language.
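The idea of counting adjacent-word sequences that are frequently eliminated can be illustrated with a small sketch. This is not the FASPe algorithm from the paper, whose details the abstract does not give. It is a simplified stand-in that assumes each compressed sentence is obtained purely by deleting words from the original, counts every contiguous n-gram inside the deleted spans, and reports support (number of sentence pairs in which the n-gram was eliminated) and confidence (that support divided by the number of pairs whose original contains the n-gram). The function names, the `min_support` and `max_len` parameters, and the toy English sentences are all hypothetical.

```python
from collections import Counter


def eliminated_spans(original, compressed):
    """Return contiguous runs of tokens dropped from `original`.

    Assumes `compressed` is a subsequence of `original` (deletion-only
    compression); tokens are aligned greedily from left to right.
    """
    spans, current, j = [], [], 0
    for token in original:
        if j < len(compressed) and token == compressed[j]:
            j += 1
            if current:
                spans.append(tuple(current))
                current = []
        else:
            current.append(token)
    if current:
        spans.append(tuple(current))
    if j != len(compressed):
        raise ValueError("compressed must be a subsequence of original")
    return spans


def ngrams(tokens, max_len):
    """Yield every contiguous n-gram of length 1..max_len."""
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def frequent_eliminated_patterns(pairs, min_support=2, max_len=3):
    """Map each frequently eliminated n-gram to (support, confidence)."""
    eliminated = Counter()  # pairs in which the n-gram was deleted
    seen = Counter()        # pairs whose original contains the n-gram
    for original, compressed in pairs:
        seen.update(set(ngrams(original, max_len)))
        dropped = set()
        for span in eliminated_spans(original, compressed):
            dropped.update(ngrams(span, max_len))
        eliminated.update(dropped)
    return {
        pattern: (count, count / seen[pattern])
        for pattern, count in eliminated.items()
        if count >= min_support
    }


# Toy data (hypothetical): (original tokens, human-compressed tokens).
pairs = [
    ("he said that prices will rise".split(), "prices will rise".split()),
    ("she said that rain is likely".split(), "rain is likely".split()),
    ("they said that was enough".split(), "they said that was enough".split()),
]
patterns = frequent_eliminated_patterns(pairs)
for pattern, (support, confidence) in sorted(patterns.items()):
    print(" ".join(pattern), support, round(confidence, 2))
```

In this toy collection the sequence "said that" is eliminated in two of the three sentences that contain it, so it is reported with support 2 and confidence of about 0.67. A rule with confidence near the 85% mentioned in the abstract would be one whose word sequence is dropped in most of the sentences where it appears.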

A Malay text summarizer using pattern-growth method with sentence compression rules

Alias, Mohammad, Gan et al. 2016

A text representation model using Sequential Pattern-Growth method

Alias, Mohammad, Gan et al. 2017

Pattern Anal Applic

Using Dictionary and Lemmatizer to Improve Low Resource English-Malay Statistical Machine Translation System

Yeong, Tan, Mohammad 2016

Procedia Computer Science

Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation

Yeong, Tan, Gan et al. 2018

Int. J. Adv. Sci. Eng. Inf. Technol.

Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are the state-of-the-art approaches in machine translation (MT). The translation produced by SMT is based on the statistical analysis of text corpora, while NMT uses a deep neural network to model and generate a translation. SMT and NMT each have strengths and weaknesses. SMT may produce a better translation than NMT when only a small parallel text corpus is available. Nevertheless, when the amount of parallel text available is large, the quality of the translation produced by NMT is often higher than that of SMT. Studies have also shown that the translation produced by SMT is better than that of NMT when there is a domain mismatch between training and testing. SMT also has an advantage on long sentences. In addition, when a translation produced by NMT is wrong, it is very difficult to find the error. In this paper, we investigate a hybrid approach that combines SMT and NMT to perform English-to-Malay translation. The motivation for using hybrid machine translation is to combine the strengths of both approaches to produce a more accurate translation. Our approach uses the multi-source encoder-decoder long short-term memory (LSTM) architecture. The architecture uses two encoders: one to embed the sentence to be translated, and another to embed the initial translation produced by SMT. The translation from the SMT can be viewed as a "suggestion translation" to the neural MT. Our experiments show that the hybrid MT increases the BLEU scores of our best baseline machine translation in the computer science domain and the news domain from 21.21 and 48.35 to 35.97 and 61.81, respectively.
