2022
DOI: 10.3390/en15145085

Physics-Based Method for Generating Fully Synthetic IV Curve Training Datasets for Machine Learning Classification of PV Failures

Abstract: Classification machine learning models require high-quality labeled datasets for training. Among the most useful datasets for photovoltaic array fault detection and diagnosis are module or string current-voltage (IV) curves. Unfortunately, such datasets are rarely collected due to the cost of high-fidelity monitoring, and the data that are available are generally not ideal, often consisting of unbalanced classes, noisy data due to environmental conditions, and few samples. In this paper, we propose an alternate …

Cited by 9 publications (4 citation statements)
References 45 publications
“…However, after a rigorous filtering process adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement (PRISMA) guidelines [6], 31 papers were retained for detailed examination. We found that many studies used I-V curves alongside ML for fault identification in PV arrays [7]. Techniques like Principal Component Analysis (PCA) [8] were commonly used for feature extraction, while complex PV systems adopted advanced methods, such as Recurrent Neural Networks (RNNs) [9] with satellite data and Convolutional Neural Networks (CNNs) [10], for analyzing voltage and current.…”
Section: Data From Satellite (mentioning)
confidence: 99%
“…The package documentation for pvOps provides thorough examples exploring the various capabilities of each module. Additional details about the iv module capabilities are captured in (Hopwood et al, 2020; Hopwood, Stein, et al, 2022) while more information about the design and development of the text, timeseries, and text2time modules are captured in (Mendoza et al, 2021). Key package dependencies of pvOps include pandas (The pandas development team, 2020), sklearn (Pedregosa et al, 2011), nltk (Bird et al, 2009), and keras (Chollet & others, 2015) for analysis and matplotlib (Hunter, 2007), seaborn (Waskom, 2021), and plotly (Plotly Technologies Inc., 2015) for visualization.…”
Section: Package Overview (mentioning)
confidence: 99%
“…By their definition, faults or anomalous behaviour occurs much less frequently than normal operating conditions. 1 Because of this, data re-balancing efforts are usually required. However with faulty conditions there is also the issue that to collect data about faulty conditions, they need to occur, and this is far from ideal as faulty conditions usually lead to machine damage, or faulty components being manufactured.…”
Section: Introduction (mentioning)
confidence: 99%