Abstract. Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
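The incremental update described in this abstract can be illustrated with a minimal sketch of tabular Q-learning. Everything specific here is an assumption made for illustration and is not taken from the paper: the five-state chain environment, the function name `q_learning`, the uniform random exploration, and the step-size schedule `visits ** -0.6`. The sketch only shows the general idea of keeping a discrete table of action-values and sampling every action in every state repeatedly.

```python
import random


def q_learning(n_states=5, gamma=0.9, episodes=2000, seed=0):
    """Tabular Q-learning on a hypothetical deterministic chain.

    States are 0..n_states-1. Action 0 moves left (staying put at state 0),
    action 1 moves right. Entering the last state gives reward 1 and ends
    the episode; every other transition gives reward 0.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]      # discrete action-value table
    visits = [[0, 0] for _ in range(n_states)]     # per-pair visit counts

    for _ in range(episodes):
        s = rng.randrange(goal)                    # start in any non-goal state
        while s != goal:
            a = rng.randrange(2)                   # sample both actions everywhere
            s_next = max(s - 1, 0) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0

            # Decaying step size (an illustrative choice, not the paper's).
            visits[s][a] += 1
            alpha = visits[s][a] ** -0.6

            # Move Q(s, a) toward the one-step lookahead target.
            # q[goal] is never updated, so it stays at 0 for the terminal state.
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    table = q_learning()
    for state, (left, right) in enumerate(table[:-1]):
        best = "right" if right > left else "left"
        print(f"state {state}: Q(left)={left:.3f} Q(right)={right:.3f} greedy={best}")
```

In this toy chain the greedy action in every non-goal state should end up being "move right", and the learned values for that action approach powers of the discount factor `gamma`.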
When a disease breaks out in a human population, changes in behavior in response to the outbreak can alter the progression of the infectious agent. In particular, people aware of a disease in their proximity can take measures to reduce their susceptibility. Even if no centralized information is provided about the presence of a disease, such awareness can arise through first-hand observation and word of mouth. To understand the effects this can have on the spread of a disease, we formulate and analyze a mathematical model for the spread of awareness in a host population, and then link this to an epidemiological model by having more informed hosts reduce their susceptibility. We find that, in a well-mixed population, this can result in a lower size of the outbreak, but does not affect the epidemic threshold. If, however, the behavioral response is treated as a local effect arising in the proximity of an outbreak, it can completely stop a disease from spreading, although only if the infection rate is below a threshold. We show that the impact of locally spreading awareness is amplified if the social network of potential infection events and the network over which individuals communicate overlap, especially so if the networks have a high level of clustering. These findings suggest that care needs to be taken both in the interpretation of disease parameters, as well as in the prediction of the fate of future outbreaks.
mathematical model | rumor spread | behavioral response | social networks
Human reactions to the presence of disease abound, yet they have rarely been systematically investigated (1). Such reactions can range from avoiding social contact with infected individuals (social distancing) to wearing protective masks, vaccination, or more creative precautions. It has been shown, for instance, that local measles outbreaks are correlated with the demand for measles, mumps, and rubella vaccines (2).
Similarly, the demand for condoms rises in areas where AIDS is prevalent (3), and condom use has been linked to the knowledge of someone who has died of AIDS (4). Behavior that is responsive to the presence of a disease can potentially reduce the size of an epidemic outbreak. On closer inspection, it is not so much the presence of the disease itself that will prompt humans to change their behavior, as awareness of the presence of the disease. A change in behavior can be prompted without witnessing the disease first hand, but by being informed about it through others. This information in itself will spread through the population and have its own dynamic. For example, according to the Chinese Southern Weekend newspaper, the text message "There is a fatal flu in Guangzhou" was sent 126 million times in Guangzhou alone during the 2003 severe acute respiratory syndrome (SARS) outbreak (5), causing people to stay home or wear face masks when going outside. This figure stands in stark contrast to the comparatively low number of 5,327 cases recorded in the whole of China (6). It is not clear how much the individual behaviora...
Background: Concerns over online health information–seeking behavior point to the potential harm incorrect, incomplete, or biased information may cause. However, systematic reviews of health information have found few examples of documented harm that can be directly attributed to poor quality information found online.
Objective: The aim of this study was to improve our understanding of the quality and quality characteristics of information found in online discussion forum websites so that their likely value as a peer-to-peer health information–sharing platform could be assessed.
Methods: A total of 25 health discussion threads were selected across 3 websites (Reddit, Mumsnet, and Patient) covering 3 health conditions (human immunodeficiency virus [HIV], diabetes, and chickenpox). Assessors were asked to rate information found in the discussion threads according to 5 criteria: accuracy, completeness, how sensible the replies were, how they thought the questioner would act, and how useful they thought the questioner would find the replies.
Results: In all, 78 fully completed assessments were returned by 17 individuals (8 were qualified medical doctors, 9 were not). When the ratings awarded in the assessments were analyzed, 25 of the assessments placed the discussion threads in the highest possible score band, rating them between 5 and 10 overall; 38 rated them between 11 and 15; 12 rated them between 16 and 20; and 3 placed the discussion thread they assessed in the lowest rating band (21-25). This suggests that health threads on Internet discussion forum websites are more likely than not (by a factor of 4:1) to contain information of high or reasonably high quality. Extremely poor information is rare; the lowest available assessment rating was awarded only 11 times out of a possible 353, whereas the highest was awarded 54 times. Only 3 of 78 fully completed assessments rated a discussion thread in the lowest possible overall band of 21 to 25, whereas 25 of 78 rated it in the highest of 5 to 10. Quality assessments differed depending on the health condition (chickenpox appeared 17 times in the 20 lowest-rated threads, HIV twice, and diabetes once). Although assessors tended to agree on which discussion threads contained good quality information, what constituted poor quality information appeared to be more subjective.
Conclusions: Most of the information assessed in this study was considered by qualified medical doctors and nonmedically qualified respondents to be of reasonably good quality. Although a small amount of information was assessed as poor, not all respondents agreed that the original questioner would have been led to act inappropriately based on the information presented. This suggests that discussion forum websites may be a useful platform through which people can ask health-related questions and receive answers of acceptable quality.
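The band counts reported in the results can be checked with a few lines of arithmetic. The sketch below assumes, as one plausible reading rather than something the abstract states, that the "4:1" factor compares the two better score bands (5-10 and 11-15) with the two worse ones (16-20 and 21-25). The variable names are illustrative.

```python
# Number of assessments per overall score band, as reported in the abstract
# (lower scores indicate higher-rated threads).
band_counts = {"5-10": 25, "11-15": 38, "16-20": 12, "21-25": 3}

total = sum(band_counts.values())

# Assumed grouping: the two better bands versus the two worse bands.
better = band_counts["5-10"] + band_counts["11-15"]
worse = band_counts["16-20"] + band_counts["21-25"]
ratio = better / worse

print(f"total={total}, better={better}, worse={worse}, ratio={ratio:.1f}:1")
```

Under this grouping the four bands account for all 78 completed assessments, and the ratio of better-rated to worse-rated assessments comes out slightly above 4:1.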