In the classical leave-one-out procedure for outlier detection in regression analysis, we exclude an observation and then construct a model on the remaining data. If the difference between the predicted and the observed value is large, we declare the observation an outlier. As a rule, such procedures rely on single comparison testing. The problem becomes much harder when each observation is associated with a degree of membership to an underlying population, so that outlier detection must be generalized to operate over fuzzy data. We present a new approach for outlier detection that operates over fuzzy data using two inter-related algorithms. Because of the way outliers enter the observation sample, they may be of various orders of magnitude. To account for this, we divide the outlier detection procedure into cycles, each consisting of two phases. In Phase 1, we apply a leave-one-out procedure to each non-outlier in the data set. In Phase 2, all previously declared outliers are subjected to the Benjamini-Hochberg step-up multiple testing procedure, which controls the false discovery rate, and the non-confirmed outliers can return to the data set. Finally, we construct a regression model over the resulting set of non-outliers. In this way, we ensure that a reliable, high-quality regression model is obtained in Phase 1, because the leave-one-out procedure purges dubious observations comparatively easily owing to its single comparison testing. At the same time, confirming outlier status against the newly obtained high-quality regression model is much harder because of the multiple testing procedure applied, hence only the true outliers remain outside the data sample. The two phases in each cycle offer a good trade-off between the desire to construct a high-quality model (i.e., over informative data points) and the desire to use as many data points as possible (thus leaving as many observations as possible in the data sample).
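To make the Phase 2 confirmation step concrete, the following is a minimal sketch of the standard Benjamini-Hochberg step-up rule in plain Python. It is an illustration only, not the authors' implementation: the function name `benjamini_hochberg`, the level `q`, and the assumption that each previously declared outlier already has a p-value against the current regression model are all hypothetical choices made for this example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at false discovery rate level q.

    Returns a list of booleans aligned with p_values: True means the null
    hypothesis is rejected. In the setting described above, True would mean
    "outlier status confirmed" (assumption: the null is "not an outlier").
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up: find the LARGEST rank k (1-based) with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max smallest p-values, even those above their own threshold.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

Under these assumptions, observations whose entry is `False` would be the non-confirmed outliers that return to the data set. Because the rule looks for the largest passing rank, a single large rank that meets its threshold confirms every smaller p-value as well.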
The number of cycles is user-defined, but the procedure can terminate early if a cycle detects no new outliers. We offer one illustrative example and two practical case studies (from real-life thrombosis studies) that demonstrate the application and strengths of our algorithms. In the concluding section, we discuss several limitations of our approach and offer directions for future research.
Keywords: regression analysis, leave-one-out method, degree of membership, multiple testing, Benjamini-Hochberg step-up multiple testing, false discovery rate
Highlights:
- We develop algorithms for outlier rejection over fuzzy samples using weighted least squares that operate in a given number of cycles
- Each cycle has two phases: a single testing leave-one-out procedure for initial purging of the data, then confirmation of the previous outlier status with multiple testing
- We offer one illustrative example and two examples from a case study in thrombosis research to show the strength of our cycle-based approach
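The leave-one-out idea over a fuzzy sample can be sketched as follows, assuming a simple straight-line model and treating each degree of membership as a weight in weighted least squares. This is a rough illustration and not the algorithm of the paper: the function names, the weighted root-mean-square scaling of the prediction error, and the idea of comparing the score against a fixed cutoff are assumptions introduced here in place of a proper single comparison test.

```python
import math


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b).

    w holds the degrees of membership, used here as regression weights
    (an assumption made for this sketch).
    """
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def leave_one_out_scores(x, y, w):
    """For each observation, fit on the others and scale its prediction error.

    The score is the prediction error divided by the weighted RMS residual
    of the model built without that observation (illustrative statistic).
    """
    scores = []
    for i in range(len(x)):
        # Drop observation i (x, y, w are plain Python lists).
        xs, ys, ws = x[:i] + x[i + 1:], y[:i] + y[i + 1:], w[:i] + w[i + 1:]
        a, b = weighted_line_fit(xs, ys, ws)
        s2 = sum(wj * (yj - a - b * xj) ** 2
                 for xj, yj, wj in zip(xs, ys, ws)) / sum(ws)
        err = y[i] - (a + b * x[i])
        s = math.sqrt(s2)
        if s > 0:
            scores.append(err / s)
        else:
            # Perfect fit on the remaining data: any error is infinitely large.
            scores.append(math.copysign(math.inf, err) if err else 0.0)
    return scores
```

An observation with a large absolute score would be a candidate outlier for the first phase. In a cycle-based scheme such as the one described above, the scores would be recomputed on the reduced data set in each cycle.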
CONTEXT: Innovative economies require a workforce with a high level of technical skills and scientific awareness, yet worldwide there is a decline in the number of students participating in pre-university science, technology, engineering and mathematics (STEM). Australia's graduation rates in STEM fields are low by international comparison, providing challenges in meeting qualified workforce needs. Australia's future in the next three to five years depends on a stronger workforce with more qualified engineers and associated professionals with high-level skills who are capable of meeting the needs of growing industries such as advanced manufacturing and the maritime sector.
PURPOSE: This project identified the mismatch between current skills and future needs from a literature review and through interviews with Tasmanian industry stakeholders. It reflected on existing pathways and the changes required for ensuring that future skills needs are met.
APPROACH: Qualitative data on current skills and future skill needs were collected through semi-structured interviews with individual companies in the manufacturing, advanced manufacturing and maritime/marine industries in Tasmania. The companies selected for interview were either members of the Tasmanian Maritime Network or considered to be in growth industries or industries of importance for Tasmania. Companies were selected to ensure a mix of size, age of company and diversity within the industry.
RESULTS: A major learning from this project was that there are common needs amongst the manufacturing, advanced manufacturing and maritime/marine industries for future skills despite the diversity of industries. The fundamental skills identified by industry for continued growth and effective management included basic skills such as literacy and numeracy, problem-solving, work ethic, IT, leadership and management, including the need for staff to be multi-skilled.
Technology is ever-changing, and technology-based skills for specific industries will also drive training needs for the future. Issues raised by industry included: retirement of the ageing workforce in these industries, which will create a skills gap if industry does not address training and progression of existing staff; training providers not necessarily offering the required training, so that all companies offered some form of in-house training for specialty skills; and the lack of higher-level Vocational Education and Training (VET) in manufacturing, advanced manufacturing and engineering, which has left a gap of skilled staff in Tasmania.
DISCUSSION AND CONCLUSIONS: The results of this study clearly indicate that there is a need for VET and Higher Education (HE) to be flexible in their course offerings, and maintain a close relationship with industry (and with each other) to promote skills transfer between the sectors. This will ensure that the education and training sector remains relevant to meet the needs of employers, delivering consistent and quality learning outcomes. In addition, a close relationship will creat...