Proceedings of the Second (2015) ACM Conference on Learning @ Scale 2015
DOI: 10.1145/2724660.2728696

Feature Factory

Abstract: We examine the process of engineering features for developing models that improve our understanding of learners' online behavior in MOOCs. Because feature engineering relies so heavily on human insight, we engage the crowd for feature proposals and guidance on how to operationalize them. When we examined our crowd-sourced features in the context of predicting stopout, not only were they impressively nuanced, but they also integrated more than one interaction mode between the learner and platform and described …

Cited by 3 publications (3 citation statements)
References 0 publications
“…A direction of improvement for the COPPY module is dependence modeling with Vine copula which has been recently a tool of high interest in the machine learning community see, e.g., [Lopez-Paz et al, 2013, Veeramachaneni et al, 2015, Carrera et al, 2016, Gonçalves et al, 2016 or [Sun et al, 2019]. Therefore, it strengthens the need of dependence modeling with copulae in Python as a not negligeable part of the machine learning community use this language.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…In EDM, as well as in other areas of data mining and data science, transforming raw and inchoate data streams into meaningful variables is the first major challenge in the process. Often data come in forms (and formats) that are not ready for analysis; the data not only need to be transformed into a more meaningful format, but in addition meaningful variables need to be engineered (see section 3.3 in Baker, 2015, or Veeramachaneni, Adl, & O’Reilly, 2015, for a more thorough discussion of this process). In addition, data often need to be cleaned to remove cases and values that are not simply outliers but actively incorrect (i.e., cases where time stamps have impossible values, instructor test accounts in learning system data, etc.).…”
Section: Overview of the Important EDM/LA Tools
Citation type: mentioning
confidence: 99%
“…A researcher may be interested in utilizing durations between actions to identify off-task students (e.g., Baker, 2007; Cetintas, Si, Xin, & Hord, 2010) but only have access to raw time stamps. In these situations, new variables must be created in order to conduct the desired analyses, a process termed feature engineering (Baker, 2015; Veeramachaneni et al, 2015).…”
Section: Overview of the Important EDM/LA Tools
Citation type: mentioning
confidence: 99%
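
The last statement describes deriving durations between actions when only raw time stamps are available. A minimal sketch of that kind of derived variable follows. It is not the method of any cited paper. The `(student_id, timestamp)` event layout, the ISO 8601 time stamp format, and the function name `action_durations` are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def action_durations(events):
    """Return {student_id: [seconds between consecutive actions]}.

    `events` is an iterable of (student_id, iso_timestamp) pairs.
    This layout is hypothetical; real learning-platform logs differ.
    """
    by_student = defaultdict(list)
    for student_id, stamp in events:
        by_student[student_id].append(datetime.fromisoformat(stamp))

    durations = {}
    for student_id, stamps in by_student.items():
        stamps.sort()  # logs are not guaranteed to arrive in time order
        durations[student_id] = [
            (later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return durations


# Made-up sample events, for illustration only.
events = [
    ("s1", "2015-03-14T10:00:00"),
    ("s1", "2015-03-14T10:00:30"),
    ("s1", "2015-03-14T10:05:30"),
    ("s2", "2015-03-14T11:00:00"),
]
durations = action_durations(events)
```

A student with a single logged action gets an empty list, since no gap can be computed. Long gaps in the resulting lists could then feed an off-task indicator, with the threshold left as an analyst's choice.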