Archetypal Analysis

Cutler, Adele; Breiman, Leo

doi:10.2307/1269949

Cited by 183 publications

(213 citation statements)

References 0 publications


“…On the other hand, constraint 2) means that archetypes $z_j$ are convex combinations of the data points. To solve AA, Cutler and Breiman (1994) proposed an algorithm using an alternating minimization algorithm, where each step involves solving several convex least squares. According to the previous definition, archetypes are not necessarily real observed cases.…”

Section: Methods

mentioning

confidence: 99%
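The alternating scheme described in the statement above can be sketched as follows. This is a minimal illustration, not Cutler and Breiman's original solver: each convex least-squares step is approximated here by projected gradient descent onto the probability simplex, which is a swapped-in technique chosen to keep the sketch NumPy-only. The names `archetypal_analysis`, `project_rows_to_simplex`, `n_outer` and `n_inner` are hypothetical, and the data matrix `X` is assumed to hold one observation per row.

```python
import numpy as np

def project_rows_to_simplex(M):
    """Euclidean projection of each row of M onto the probability simplex."""
    n, d = M.shape
    U = np.sort(M, axis=1)[:, ::-1]              # each row sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    cond = U - css / np.arange(1, d + 1) > 0     # True on a prefix of every row
    rho = cond.sum(axis=1)                       # size of that prefix (always >= 1)
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(M - theta[:, None], 0.0)

def archetypal_analysis(X, k, n_outer=50, n_inner=20, seed=0):
    """Sketch of AA: minimize ||X - A B X||_F^2 with the rows of A and B on the simplex.

    Assumption: projected gradient replaces the constrained least-squares
    solver of the original paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = project_rows_to_simplex(rng.random((n, k)))   # mixture weights per observation
    B = project_rows_to_simplex(rng.random((k, n)))   # archetypes as mixtures of data
    for _ in range(n_outer):
        Z = B @ X                                     # current archetypes (k x features)
        # A-step: convex least squares in A with Z fixed.
        L = np.linalg.norm(Z @ Z.T, 2) + 1e-12        # Lipschitz constant of the gradient
        for _ in range(n_inner):
            A = project_rows_to_simplex(A - ((A @ Z - X) @ Z.T) / L)
        # B-step: convex least squares in B with A fixed.
        L = np.linalg.norm(A.T @ A, 2) * np.linalg.norm(X @ X.T, 2) + 1e-12
        for _ in range(n_inner):
            grad = A.T @ (A @ B @ X - X) @ X.T
            B = project_rows_to_simplex(B - grad / L)
    return A, B

# Usage on synthetic data: 60 points in the plane, 3 archetypes.
X = np.random.default_rng(1).random((60, 2))
A, B = archetypal_analysis(X, k=3)
archetypes = B @ X                                    # convex combinations of data points
print("reconstruction error:", np.linalg.norm(X - A @ archetypes))
```

Because the rows of `B` lie on the simplex, each archetype is a convex combination of observations and need not coincide with any single observed case, which matches the remark in the quoted statement.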


Archetypoid analysis for sports analytics

Vinué

Epifanio

2017

Data Min Knowl Disc


We intend to understand the growing amount of sports performance data by finding extreme data points, which makes human interpretation easier. In archetypoid analysis each datum is expressed as a mixture of actual observations (archetypoids). Therefore, it allows us to identify not only extreme athletes and teams, but also the composition of other athletes (or teams) according to the archetypoid athletes, and to establish a ranking. The utility of archetypoids in sports is illustrated with basketball and soccer data in three scenarios. Firstly, with multivariate data, where they are compared with other alternatives, showing their best results. Secondly, despite the fact that functional data are common in sports (time series or trajectories), functional data analysis has not been exploited until now, due to the sparseness of functions. In the second scenario, we extend archetypoid analysis for sparse functional data, furthermore showing the potential of functional data analysis in sports analytics. Finally, in the third scenario, features are not available, so we use proximities. We extend archetypoid analysis when asymmetric relations are present in data. This study provides information that will provide valuable knowledge about player/team/league performance so that we can analyze athlete's careers.


Section: Methods

mentioning

confidence: 99%

“…AA was first proposed by Cutler and Breiman (1994). Its aim is to find pure types (the archetypes) in such a way that the other observations are a mixture of them.…”

mentioning

confidence: 99%

Archetypoid analysis for sports analytics

Vinué

Epifanio

2017

Data Min Knowl Disc


“…The performance of each model was described, and sets of 4-5 playstyles identified across each model. The authors concluded that Archetype Analysis (AA) [13], [12] performs best in terms of developing clearly separated and explainable profiles, the latter forming a key quality criteria in games-based behavioral profiling as argued by Drachen et al [10].…”

Section: Related Work

mentioning

confidence: 99%

“…However, research on systems for recommending products or behaviors to users are comparatively rare. The first major academic-based inroads towards using recommender systems Sifa et al [15] focused on recommendation game titles to players based on the games they had played previously, introducing an AA [13] based recommender system for game recommendation across a 3000+ game dataset from the game distribution platform Steam. Around the same time, Valve, the company behind Steam, introduced a recommender system to their storefront (the two projects being unrelated).…”

Section: Related Work

mentioning

confidence: 99%

Controlling the crucible

Sifa

Pawlakos

Zhai

et al. 2018

Proceedings of the Australasian Computer Science Week Multiconference


“…Another example of a constrained MF method is archetypal analysis (AA) as introduced by [3]. It considers the NMF problem where $W \in \mathbb{R}^{n \times k}$ and $H \in \mathbb{R}^{k \times n}$ are additionally required to be column stochastic matrices, i.e., they are to be non-negative and each of their columns is to sum to 1.…”

Section: Interpretable Matrix Factorization

mentioning

confidence: 99%
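To make the column-stochastic constraint concrete, the sketch below builds factors of the stated shapes and checks the constraint numerically. It assumes the AA factorization takes the form V ≈ V W H with an m × n data matrix V whose columns are the data points; that form is not spelled out in the quoted statement and is an assumption here. The helpers `is_column_stochastic` and `random_column_stochastic` are hypothetical names introduced for illustration.

```python
import numpy as np

def is_column_stochastic(M, tol=1e-9):
    """True if M is non-negative and every column sums to 1 (within tol)."""
    return bool((M >= -tol).all() and np.allclose(M.sum(axis=0), 1.0, atol=tol))

def random_column_stochastic(rows, cols, rng):
    """Hypothetical helper: random non-negative matrix with columns normalized to sum to 1."""
    M = rng.random((rows, cols)) + 1e-12
    return M / M.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
m, n, k = 3, 8, 2                         # features, data points, archetypes
V = rng.random((m, n))                    # columns are data points
W = random_column_stochastic(n, k, rng)   # n x k: archetypes as mixtures of data points
H = random_column_stochastic(k, n, rng)   # k x n: data points as mixtures of archetypes

Z = V @ W                                 # m x k archetypes (assumed form V W)
V_hat = Z @ H                             # m x n reconstruction (assumed form V W H)

print(is_column_stochastic(W), is_column_stochastic(H))
```

Under these constraints every column of `V @ W` and of `V @ W @ H` is a convex combination of data columns, so each reconstructed feature value stays within the range of that feature in `V`.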

Matrix Factorization as Search

Kersting

Bauckhage

Thurau

et al. 2012

Machine Learning and Knowledge Discovery in Databases


Abstract. Simplex Volume Maximization (SiVM) exploits distance geometry for efficiently factorizing gigantic matrices. It was proven successful in game, social media, and plant mining. Here, we review the distance geometry approach and argue that it generally suggests to factorize gigantic matrices using search-based instead of optimization techniques.

Interpretable Matrix Factorization

Many modern data sets are available in form of a real-valued $m \times n$ matrix $V$ of rank $r \le \min(m, n)$. The columns $v_1, \ldots, v_n$ of such a data matrix encode information about $n$ objects each of which is characterized by $m$ features. Typical examples of objects include text documents, digital images, genomes, stocks, or social groups. Examples of corresponding features are measurements such as term frequency counts, intensity gradient magnitudes, or incidence relations among the nodes of a graph. In most modern settings, the dimensions of the data matrix are large so that it is useful to determine a compressed representation that may be easier to analyze and interpret in light of domain-specific knowledge.

Formally, compressing a data matrix $V \in \mathbb{R}^{m \times n}$ can be cast as a matrix factorization (MF) task. The idea is to determine factor matrices $W \in \mathbb{R}^{m \times k}$ and $H \in \mathbb{R}^{k \times n}$ whose product is a low-rank approximation of $V$. Formally, this amounts to a minimization problem $\min_{W, H} \lVert V - WH \rVert^2$ where $\lVert \cdot \rVert$ denotes a suitable matrix norm, and one typically assumes $k \ll r$. A common way of obtaining a low-rank approximation stems from truncating the singular value decomposition (SVD) where $V = W S U^T = WH$. The SVD is popular for it can be solved analytically and has significant statistical properties. The column vectors $w_i$ of $W$ are orthogonal basis vectors that coincide with the directions of largest variance in the data.

Although there are many successful applications of the SVD, for instance in information retrieval, it has been criticized because the $w_i$ may lack interpretability with respect to the field from which the data are drawn [6]. For example, the $w_i$ may point in the direction of negative orthants even though the data itself is strictly non-negative. Nevertheless, data analysts are often tempted to reify, i.e., to assign a "physical"…

The authors would like to thank the anonymous reviewers for their comments.
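The truncated-SVD factorization described in the abstract can be sketched with NumPy as below. The function name `truncated_svd_factors` and the synthetic matrix are illustrative assumptions; the Frobenius norm is used as the "suitable matrix norm", for which the truncation error equals the root of the sum of the squared discarded singular values.

```python
import numpy as np

def truncated_svd_factors(V, k):
    """Rank-k factors W (m x k) and H (k x n) from the SVD V = W S U^T."""
    W_full, s, Ut = np.linalg.svd(V, full_matrices=False)
    W = W_full[:, :k]                  # first k left singular vectors (orthonormal columns)
    H = s[:k, None] * Ut[:k, :]        # S U^T restricted to the k largest singular values
    return W, H, s

rng = np.random.default_rng(0)
V = rng.random((6, 10))                # strictly non-negative data, m = 6, n = 10
k = 2
W, H, s = truncated_svd_factors(V, k)

err = np.linalg.norm(V - W @ H)        # Frobenius norm of the residual
print("residual:", err)
print("any negative entries in W:", bool((W < 0).any()))
```

For non-negative data the basis vectors beyond the first typically contain entries of mixed sign, which is the interpretability concern the abstract raises.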

