Detection and recognition of traffic panels and their textual information are important applications of advanced driving assistance systems (ADAS). They can contribute significantly to enhancing road safety by optimizing the driving experience. However, these tasks face two major challenges: first, the lack of suitable, good-quality datasets; second, the infeasibility of globally standardizing traffic panels in terms of shape, color, and language of the written text. Current research is directed intensively toward Latin text-based panels, while other scripts such as Arabic remain quite undervalued. In this paper, we address this issue by introducing ATTICA, a new open-source multi-task dataset consisting of two main sub-datasets: ATTICA_Sign for traffic sign/panel detection and ATTICA_Text for Arabic text extraction/recognition. Our dataset gathers 1215 images with 3173 traffic panel boxes, 870 traffic sign boxes, and 7293 Arabic text boxes. To examine the utility and advantages of our dataset, we adopt state-of-the-art deep learning-based approaches. The conducted experiments show promising results, confirming that our dataset is a valuable addition to this field of research.
Accurate and timely traffic information is a vital element of intelligent transportation systems and urban management, and is important for both road users and government agencies. However, existing traffic prediction approaches are primarily based on standard machine learning, which requires sharing raw data directly with a global server for model training. Because user data may contain sensitive personal information, sharing raw data directly risks leaking and exposing users' private data. To address these challenges, we introduce a new hybrid framework that leverages Federated Learning with Local Differential Privacy (FL-LDP) to share model updates rather than raw data among users. Our FL-LDP approach is designed to coordinate users to train the model collaboratively without compromising data privacy. We evaluate our scheme using a real-world public dataset and implement different deep neural networks. We perform a comprehensive evaluation of our approach with state-of-the-art models. The experimental results confirm that the proposed scheme is capable of producing accurate traffic predictions, improving privacy preservation, and preventing data recovery attacks.
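The idea of sharing privatized model updates instead of raw data can be illustrated with a minimal sketch. The abstract does not specify the noise mechanism, the clipping rule, or the aggregation scheme, so everything below is an assumption made for illustration: each client clips its update to a maximum L2 norm and adds Gaussian noise locally, and the server averages the noisy updates. The names `clip`, `privatize`, `aggregate`, and `federated_round` and the parameters `max_norm` and `sigma` are hypothetical, and calibrating `sigma` to a formal privacy budget is not shown.

```python
import random


def clip(update, max_norm):
    """Scale the update so that its L2 norm is at most max_norm."""
    norm = sum(x * x for x in update) ** 0.5
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def privatize(update, max_norm, sigma, rng):
    """Client side: clip, then add Gaussian noise before the update leaves the device.

    Assumption: Gaussian noise with standard deviation sigma; the paper's
    actual mechanism is not stated in the abstract.
    """
    clipped = clip(update, max_norm)
    return [x + rng.gauss(0.0, sigma) for x in clipped]


def aggregate(updates):
    """Server side: coordinate-wise average of the (already noisy) updates."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def federated_round(global_weights, client_updates, max_norm=1.0, sigma=0.1, seed=0):
    """One round: the server only ever sees privatized updates, never raw data."""
    rng = random.Random(seed)
    noisy = [privatize(u, max_norm, sigma, rng) for u in client_updates]
    mean_update = aggregate(noisy)
    return [w + d for w, d in zip(global_weights, mean_update)]


if __name__ == "__main__":
    weights = [0.0, 0.0]
    updates = [[3.0, 4.0], [0.5, -0.5], [0.0, 0.2]]
    print(federated_round(weights, updates))
```

In this sketch the noise is added on the client, which is what distinguishes a local differential privacy setup from one where a trusted server adds noise after aggregation.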
Building real-world complex Named Entity Recognition (NER) systems is a challenging task. This is due to the complexity and ambiguity of named entities that appear in various contexts, such as short input sentences, emerging entities, and complex entities. Moreover, real-world queries are mostly malformed, as they can be code-mixed or multilingual, among other scenarios. In this paper, we introduce the system we submitted to the Multilingual Complex Named Entity Recognition (MultiCoNER) shared task. We approach complex NER for multilingual and code-mixed queries by relying on the contextualized representations provided by the multilingual Transformer XLM-RoBERTa. In addition to the CRF-based token classification layer, we incorporate a span classification loss to recognize named entity spans. Furthermore, we use a self-training mechanism to generate weakly annotated data from a large unlabeled dataset. Our proposed system is ranked 6th and 8th in the multilingual and code-mixed MultiCoNER tracks, respectively.
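The self-training step can be sketched in a few lines. The abstract does not say how weakly annotated sentences are selected, so the confidence-threshold filter below is an assumption, not the submitted system's procedure. The function `select_pseudo_labels`, the `predict` callback, and the `threshold` parameter are hypothetical names introduced for illustration; `predict` stands in for any trained tagger that returns a tag and a confidence for each token.

```python
from typing import Callable, List, Tuple

Tagged = List[Tuple[str, float]]  # one (tag, confidence) pair per token


def select_pseudo_labels(
    sentences: List[List[str]],
    predict: Callable[[List[str]], Tagged],
    threshold: float = 0.9,
) -> List[Tuple[List[str], List[str]]]:
    """Keep only sentences where every token is tagged with confidence >= threshold.

    Assumption: sentence-level filtering on the minimum token confidence.
    The kept (tokens, tags) pairs can be added to the training set for the
    next training round.
    """
    selected = []
    for tokens in sentences:
        tagged = predict(tokens)
        if tagged and min(conf for _, conf in tagged) >= threshold:
            selected.append((tokens, [tag for tag, _ in tagged]))
    return selected


def toy_predict(tokens: List[str]) -> Tagged:
    """Stand-in tagger used only to make the example runnable."""
    result = []
    for token in tokens:
        if token == "Paris":
            result.append(("B-LOC", 0.95))
        elif token == "maybe":
            result.append(("O", 0.5))
        else:
            result.append(("O", 0.99))
    return result


if __name__ == "__main__":
    unlabeled = [["Paris", "is", "nice"], ["maybe", "Paris"]]
    print(select_pseudo_labels(unlabeled, toy_predict))
```

A stricter or looser `threshold` trades the amount of weakly annotated data against its noise level.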