It is widely accepted that processing very large amounts of data with state-of-the-art Natural Language Processing (NLP) practices (i.e., Machine Learning (ML) and language-agnostic approaches) has resulted in a dramatic improvement in the speed and efficiency of systems and applications. However, these developments are accompanied by several challenges and difficulties that have been voiced in recent years. Specifically, with regard to NLP, the evident improvement in the speed and efficiency of systems and applications with Generative AI (GenAI) also entails aspects that may be problematic, especially where particular text types, languages and/or user groups are concerned.
State-of-the-art NLP approaches that automatically process vast amounts of data in GenAI are related to seven observed problematic aspects (Aspects 1-7). The first two are (1) Underrepresentation and (2) Standardization. These result in (3) Barriers in Text Understanding, (4) Discouragement of HCI Usage for Special Text Types and/or User Groups, (5) Barriers in Accessing Information, (6) Likelihood of Errors and False Assumptions and (7) Difficulties in Error Detection and Recovery. These problems are compounded in typical cases such as less-resourced languages (A), less experienced users (B) and less agile users (C).
A hybrid approach that re-introduces traditional concepts and integrates them into state-of-the-art processing approaches, whether automatic or interactive, pursues the following three targets:
(i) making more types of information accessible to more types of recipients and user groups; (ii) making more types of services accessible and user-friendly to more types of user groups; and (iii) making more types of feelings, opinions, voices and reactions visible from more types of user groups.
Specifically, in the cases presented above, traditional and classical theories, principles and models are re-introduced and can be integrated into state-of-the-art data-driven approaches involving Machine Learning and neural networks, functioning as training data and seed data in Natural Language Processing applications where user requirements and customization are of particular interest and importance. A hybrid approach may be considered a compromise between speed and correctness / user-friendliness in (types of) NLP applications where achieving this balance plays a crucial role. In other words, a hybrid approach and the examples presented here aim to prevent mechanisms from adopting human biases, ensuring fairness, socially responsible outcomes and responsible social media. They also aim to customize content for different linguistic and cultural groups, ensuring equitable information distribution.
Here, we present characteristic example cases that re-introduce four typical types of traditional concepts drawn from classical theories, principles and models. These four classical theories, principles and models are not considered flawless; however, they can be transformed into practical strategies that can be integrated into evaluation modules, neural networks and training data (including knowledge graphs), and dialogue design. The proposed re-introduction of traditional concepts is not limited to the particular models, principles and theories presented here.
The first example concerns the application of a classic principle from Theoretical Linguistics. The second example employs a model from the field of Linguistics and Translation. The third and fourth examples demonstrate the interdisciplinary application of models and theoretical frameworks from the fields of Linguistics-Cognitive Science and Linguistics-Psychology, respectively.