Fifty years after Damerau compiled his statistics on the distribution of errors in typed texts, his findings are still used across a range of languages. Because these statistics were derived from English texts, the question of whether they actually apply to other languages has been raised. We address this issue by analyzing a set of typed texts in Brazilian Portuguese and deriving statistics tailored to this language. Results show that diacritical marks play a major role, as indicated by the frequency of mistakes involving them, which renders Damerau's original findings mostly unfit for spelling correction systems, although they remain useful if such marks are set aside. Furthermore, a comparison between these results and those published for Spanish shows no statistically significant differences between the two languages, an indication that the distribution of spelling errors depends on the adopted character set rather than on the language itself.
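As a rough illustration of the kind of tally such a study relies on, the sketch below labels a typed/intended word pair with one of Damerau's four single-error types (insertion, deletion, substitution, transposition), after first checking whether the two words differ only in diacritical marks. The function names and the separate "diacritic" label are assumptions made for this example and are not taken from the paper.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition, e.g. 'você' -> 'voce'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_single_error(typed: str, intended: str) -> str:
    """Label a (typed, intended) pair.

    Returns 'none', 'diacritic' (hypothetical extra label), one of Damerau's
    four single-error types, or 'other' for anything else.
    """
    if typed == intended:
        return "none"
    # Assumption for this sketch: a pair that matches once marks are removed
    # is counted as a diacritic error rather than a substitution.
    if strip_diacritics(typed) == strip_diacritics(intended):
        return "diacritic"

    # Find the first position where the two words differ.
    i = 0
    while i < min(len(typed), len(intended)) and typed[i] == intended[i]:
        i += 1

    if len(typed) == len(intended):
        if typed[i + 1:] == intended[i + 1:]:
            return "substitution"
        if (
            i + 1 < len(typed)
            and typed[i] == intended[i + 1]
            and typed[i + 1] == intended[i]
            and typed[i + 2:] == intended[i + 2:]
        ):
            return "transposition"
    elif len(typed) == len(intended) + 1 and typed[i + 1:] == intended[i:]:
        return "insertion"
    elif len(typed) + 1 == len(intended) and typed[i:] == intended[i + 1:]:
        return "deletion"
    return "other"
```

Counting these labels over a corpus of corrected typos, once with and once without the diacritic check, would give two error distributions that can be compared against Damerau's figures.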
This chapter describes two empirical pilot studies on the role of politeness in dialogue summarization. The studies used a collection of four dialogues. Each dialogue was automatically generated by the NECA system, and the politeness of the dialogue participants was systematically manipulated. Subjects were divided into groups that had to summarize the dialogues from the point of view of either a particular dialogue participant or an impartial observer. In the first study, there were no other constraints. In the second study, the summarizers were restricted to summaries whose length did not exceed 10% of the number of words in the dialogue being summarized. Amongst other things, it was found that the politeness of the interaction is included more often in summaries of dialogues that deviate from what would be considered normal or unmarked. A comparison of the results of the two studies suggests that the extent to which politeness is reported is not affected by how long a summary is allowed to be. It was also found that the point of view of the summarizer influences which information is included in the summary and how it is presented. This finding did not seem to be affected by the summary-length constraint in the second study.
The Bachelor's program in Information Systems at the Universidade de São Paulo works toward continuous improvement of the education it offers its students, which requires ongoing innovation and refinement of the teaching and learning process carried out by its faculty and students. In pursuit of this improvement, faculty and students have undertaken a number of initiatives, among them the experiences presented in this article: the Desafios de Programação (Programming Challenges) courses and the Campeonato de Programação para Calouros (Freshman Programming Championship). Both focus on complementing the learning of programming logic, algorithms, and data structures, subjects that are difficult from a didactic standpoint but indispensable to quality technical training. This article revisits and extends analyses of these experiences.
As stock trading has become a popular topic on Twitter, many researchers have proposed approaches to making stock market predictions based on the emotions found in messages. However, detailed studies require a reasonably sized corpus with emotions properly annotated. In this work, we introduce a corpus of tweets in Brazilian Portuguese annotated with emotions. Comprising 4,277 tweets, this is, to the best of our knowledge, the largest annotated corpus available in the stock market domain for this language. Amongst its possible uses, the corpus lends itself to the application of machine learning models for automatic emotion identification, as well as to the study of correlations between emotions and stock price movements.