Probabilistic Models for Text Mining

Sun, Yizhou; Deng, Hongbo

doi:10.1007/978-1-4614-3223-4_8

Cited by 10 publications

(6 citation statements)

References 49 publications


“…In this field, words appearing in documents are related to discrete latent variables, which in turn are called topics. Comprehensive descriptions of topic models and typical applications can be found in the text mining literature (see, e.g., Blei, Ng, and Jordan 2003; Steyvers and Griffiths 2007; Sun, Deng, and Han 2012).…”

Section: Latent Dirichlet Allocation (mentioning)

confidence: 99%

Analyzing the Browsing Basket: A Latent Interests-Based Segmentation Tool

Schröder

Falke

Hruschka

et al. 2019

Journal of Interactive Marketing


The increasing importance of online distribution channels is paralleled by a rising interest in gaining insights into the customer journey to online purchases. In this paper we propose an easy-to-implement two-step procedure that enables online marketing managers to disentangle the complex interrelationships hidden behind observed Internet browsing behavior across websites. The procedure helps managers understand why Internet users visit their website(s) and how these visits are related to purchases. In the first step, the procedure uncovers latent interests underlying online users’ browsing behavior. In the second step, we segment the online users based on their uncovered latent interests. This way, online marketers may understand how segment-specific combinations of latent interests are linked to purchase behavior. We apply the procedure to ComScore clickstream data across 472 websites. We show that there is considerable heterogeneity among online users regarding online browsing habits, combinations of latent interests, and their conversion into online purchases. For example, some users are interested in apparel and travel services, as opposed to users who are interested in entertainment tickets. Our empirical analysis confirms that a relatively small fraction of online users accounts for 70% of online spending. In addition, we detect substantial segment-specific differences in shopping behavior across categories, the most important product categories being apparel as well as food & beverages. Our descriptive perspective reveals surprising associations among the websites that may be of interest to online marketers.


“…Among probabilistic models, mixture models can represent subpopulations within a population without explicating to which data aggrupation (or their observed samples) a point belongs (Marin et al., ; Sun et al., ). These models represent the population's pdf and they are useful to make estimations.…”

Section: Mathematical Model (mentioning)

confidence: 99%

Waiting‐time estimation in walk‐in clinics

Montecinos

Ouhimmou

Chauhan

2016

Int Trans Operational Res


Medical assistance is offered by walk-in clinics (WC). These clinics must keep track of patients' turn in line. Some private companies offer an extra follow-up service to WC patients, which notifies them when their consultation approaches, so patients can use their free time elsewhere than in the waiting room. This paper aims to develop an applied forecasting approach for consultation service time estimation and waiting-time estimation. A model based on particle filters and mixture models helps to estimate the waiting time for each consultation, using historical and new incoming data from patient consultations. The system considers two types of patients, namely, regular and follow-up. Our method gives an estimate of the waiting time for consultation better than simple statistics.


“…This is accomplished by a combination of tools and insights from natural language processing and computational linguistics, augmented by computational intelligence. They involve both rule-based and probability-based approaches and a detailed survey of different tools that are employed in text mining can be found in Nenkova and McKeown (2012), Aggarwal and Zhai (2012b), and Sun, Deng and Han (2012). The potential applications are diverse, ranging from extracting information regarding new discoveries in biomedical research, gathering information and outlooks that may be useful for finance professionals, to crucial information gathering for intelligence and security services.…”

Section: Text Mining and Sentiment Analysis (mentioning)

confidence: 99%

Information aggregation and computational intelligence

Chen

Venkatachalam

2016

Evolut Inst Econ Rev


This paper examines the possibility that the computational intelligence (CI) inspired tools can effectively aggregate the rich information generated from the Web 2.0 economy, and thereby enhance the quality of decision-making. Despite many advancements and commendable applications of CI in recent years, this issue has not been well addressed. We argue that this question is intimately related to the central issue of the socialist calculation debate since the time of Friedrich Hayek. In terms of information aggregation, we examine whether there is a better engineering than the market mechanism. More precisely, we focus on whether the CI-driven sentiment analysis can generate signals like prices and whether CI can process the unstructured text data better than the market. We argue that Web 2.0 economy may not be able to set us free from information overload problems that have long co-existed with the presence of markets. We attribute this to the tacitness and subjectivity of knowledge and the recursive (feedback) characteristic of the sentiments. In this sense, Hayek's fundamental assertion that the effectiveness of the market mechanism may not be so much conditioned on the information and communication technology still applies. The first author and the second author are grateful for the research support in the form of Ministry of Science and Technology (MOST) grants, MOST 103-2410-H-004-009-MY3 and MOST 104-2811-H-004-003, respectively. We thank the two anonymous referees for their valuable suggestions that helped to improve the paper. All remaining errors are our own.
