In the paper, we present a software pipeline for speech recognition to automate the creation of training datasets, based on desired unlabeled audios, for low resource languages and domain-specific area. Considering the commoditizing of speech recognition, more teams build domain-specific models as well as models for local languages. At the same time, lack of training datasets for low to middle resource languages significantly decreases possibilities to exploit last achievements and frameworks in the Speech Recognition area and limits the wide range of software engineers to work on speech recognition problems. This problem is even more critical for domain-specific datasets. The pipeline was tested for building Ukrainian language recognition and confirmed that the created design is adaptable to different data source formats and expandable to integrate with existing frameworks.
The combination of cyber security systems and artificial intelligence is a logical step at this stage of information technology development. Today, many cybersecurity vendors are incorporating machine learning and artificial intelligence into their products or services. However, the effectiveness of investments in advanced machine learning and deep learning technologies in terms of generating meaningful measurable results from these products is a matter of debate. When designing such systems, there are problems with achieving accuracy and scaling. The article considers the classification of artificial intelligence systems, artificial intelligence models used by security products, their capabilities, recommendations that should be taken into account when using generative artificial intelligence technologies for cyber protection systems are given. ChatGPT's NLP capabilities can be used to simplify the configuration of policies in security products. An approach that considers both short-term and long-term metrics to measure progress, differentiation, and customer value through AI is appropriate. The issue of using generative AI based on platform solutions, which allows aggregating various user data, exchanging ideas and experience among a large community, and processing high-quality telemetry data, is also considered. Thanks to the network effect, there is an opportunity to retrain AI models and improve the effectiveness of cyber defense for all users. These benefits lead to a virtual cycle of increased user engagement and improved cyber security outcomes, making platform-based security solutions an attractive choice for businesses and individuals alike. When conducting a cyber security audit of any IT infrastructure using AI, the limits and depth of the audit are established taking into account previous experience.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.