2021 · DOI: 10.1016/j.ascom.2021.100460

Modelling the projected separation of microlensing events using systematic time-series feature engineering

Cited by 16 publications (5 citation statements) · References 30 publications

Citation statements:
“…Loading and integrating these data can be challenging, and it is an ongoing process, even after the machine learning model has been deployed and an initial master database has been constructed (continuous integration). All these data are in raw format and require transformation (cleaning), i.e., treating missing values (i.e., imputation [53,54], complete entry removal, matrix completion [55][56][57][58][59] depending on the randomness in the missing data pattern or by producing the missing data with simulations [60]), correcting for data formats (mathematical representation of numbers, time stamps, inconsistent text format, size and resolution of images, etc.), de-duplicating [61] (with or without a unique key per real-world entry) or removing redundant data, dealing with contradicting data, removing outliers (point outliers, collective or contextual outliers), etc.…”
Section: Data Pipeline: Extraction, Loading and Transformation
Confidence: 99%
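As a concrete illustration of the cleaning steps this excerpt lists, here is a minimal pandas sketch covering format correction, imputation, de-duplication, and point-outlier removal. The column names, the toy data, and the choice of mean imputation are assumptions for illustration, not details from the cited work.

```python
# A minimal cleaning sketch, assuming hypothetical light-curve columns;
# mean imputation stands in for the cited imputation / matrix-completion
# methods, chosen here purely for brevity.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "time": ["2021-01-01", "2021-01-02", "2021-01-02", "2021-01-03"],
    "flux": [1.02, np.nan, np.nan, 25.0],
    "mag":  [18.1, 18.3, 18.3, 18.2],
})

# Correct data formats: parse time stamps into one consistent representation.
df["time"] = pd.to_datetime(df["time"])

# Treat missing values: simple mean imputation (complete entry removal
# would be df.dropna() instead, depending on the missingness pattern).
df["flux"] = df["flux"].fillna(df["flux"].mean())

# De-duplicate rows that repeat the same real-world entry.
df = df.drop_duplicates()

# Remove point outliers: keep flux values within 3 standard deviations.
z = (df["flux"] - df["flux"].mean()) / df["flux"].std()
df = df[z.abs() < 3]
print(df)
```

As the excerpt notes, matrix completion or simulation-based infilling would replace the `fillna` step when the missing-data pattern calls for it.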
“…treating missing values (i.e. imputation [52,53], complete entry removal, matrix completion [54][55][56][57][58] depending on the randomness in the missing data pattern or by producing the missing data with simulations [59]), correcting for data formats (mathematical representation of numbers, time stamps, inconsistent text format, size and resolution of images, etc.), de-duplicating [60] (with or without a unique key per real-world entry) or removing redundant data, dealing with contradicting data, removing outliers (point outliers, collective or contextual outliers), etc. Current industry practices for data engineering include ELT tools (Extract, Load, Transform) [61], which offer data extraction, data integration, and data transformation functions via some Application Programming Interface (API).…”
Section: Data Pipeline: Extraction, Loading and Transformation
Confidence: 99%
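The ELT pattern this excerpt names can be sketched as a load-first, transform-after flow. The in-memory sqlite store, the table names, and the toy payload below are hypothetical stand-ins; a real pipeline would extract from a survey or instrument API and load into a production database.

```python
# A rough ELT sketch, assuming an in-memory sqlite store and hypothetical
# table names; the raw batch stands in for records pulled from some
# extraction API.
import sqlite3
import pandas as pd

# Extract: a toy payload in place of an API response.
raw = pd.DataFrame({
    "id":    [1, 2, 2, 3],
    "value": ["1.0", "2.5", "2.5", "bad"],
})

con = sqlite3.connect(":memory:")

# Load: raw data lands in the store unmodified (load precedes transform).
raw.to_sql("raw_events", con, if_exists="replace", index=False)

# Transform: cleaning runs afterwards on the stored copy, here
# de-duplication plus coercing malformed numeric strings to NULL.
clean = pd.read_sql("SELECT DISTINCT id, value FROM raw_events", con)
clean["value"] = pd.to_numeric(clean["value"], errors="coerce")
clean.to_sql("clean_events", con, if_exists="replace", index=False)
con.close()
```

Transforming after loading, rather than before (ETL), is what lets the raw copy be kept and re-transformed as cleaning rules evolve.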
“…Random Forest has generated great interest in recent astrophysical and cosmological analyses (e.g. Bonjean et al. 2019; Hernandez Vivanco et al. 2020; Kennedy et al. 2021), because it achieves high accuracy and efficiency with large data sets, whilst being quick to implement. Additionally, at the training stage, RF is able to quickly learn highly non-linear relations between the inputs and the outputs it is given.…”
Section: Machine Learning Approach
Confidence: 99%
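To make the Random Forest usage described above concrete, here is a minimal scikit-learn sketch fitting a highly non-linear input/output relation. The synthetic target function and the hyperparameters are assumptions for illustration, not the configuration used in the paper.

```python
# A minimal Random Forest sketch in scikit-learn; the synthetic non-linear
# target below is an assumption for illustration, not the paper's data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5000, 2))
y = np.sin(X[:, 0]) * np.exp(-X[:, 1] ** 2) + rng.normal(0, 0.05, 5000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Little tuning is needed to capture the non-linear relation, which is
# the "quick to implement" appeal the excerpt notes.
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
print(f"Held-out R^2: {rf.score(X_test, y_test):.3f}")
```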