“…The first part of the tutorial covers methods that align neurons to human-interpretable concepts or study the most salient neurons in the network. We cluster these methods into four groups: (i) Visualization Methods (Karpathy et al., 2015; Li et al., 2016a), (ii) Corpus Selection (Kádár et al., 2017; Poerner et al., 2018; Na et al., 2019; Mu and Andreas, 2020b), (iii) Neuron Probing (Dalvi et al., 2019a; Lakretz et al., 2019; Valipour et al., 2019; Durrani et al., 2020), and (iv) Unsupervised Methods (Bau et al., 2019; Torroba Hennigen et al., 2020; Michael et al., 2020). We will discuss evaluation methods used to measure the effectiveness of an interpretation method, such as accuracy, control tasks (Hewitt and Liang, 2019), and ablation studies (Li et al., 2016b; Lillian et al., 2018; Dalvi et al., 2019a; Lakretz et al., 2019).…”