2022
DOI: 10.48550/arxiv.2207.08466
Preprint

What does Transformer learn about source code?

Abstract: In the field of source code processing, transformer-based representation models have shown great power and achieved state-of-the-art (SOTA) performance on many tasks. Although transformer models process source code as a sequence, evidence suggests that they may also capture structural information (e.g., from the syntax tree, data flow, and control flow). We propose the aggregated attention score, a method for investigating the structural information learned by the transformer. We…

Cited by 1 publication (1 citation statement)
References 12 publications
“…They find that CAT-scores for source code models are correlated with their performance on code summarization, and that the CAT-scores vary per layer and per language, with a tendency for higher scores in the earlier layers. Zhang et al. [43] define a similar metric over the attention matrix, an aggregated attention score, with which they can derive a graph of relationships between tokens.…”
Section: Related Work (mentioning, confidence: 99%)
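
The citing statement describes the aggregated attention score only at a high level: a metric computed over the attention matrix from which a graph of relationships between tokens can be derived. As a minimal sketch of that idea, and not the paper's exact definition, one could average attention weights over layers and heads and keep strongly attended token pairs as graph edges. The aggregation choice, threshold, and model name below are illustrative assumptions.

import torch

def aggregated_attention_graph(attentions, threshold=0.3):
    """Derive a token-relationship graph from a transformer's attention.

    `attentions` is the tuple of per-layer attention tensors returned by a
    Hugging Face model called with output_attentions=True; each tensor has
    shape (batch, heads, seq_len, seq_len). Averaging over layers and heads
    and then thresholding is an illustrative aggregation, not the paper's
    exact definition of the aggregated attention score.
    """
    # Stack layers -> (layers, batch, heads, seq, seq), then average away
    # the layer and head dimensions to get one score per token pair.
    stacked = torch.stack(attentions, dim=0)
    scores = stacked.mean(dim=(0, 2))          # (batch, seq, seq)

    # Keep only strongly attended token pairs as edges of the graph.
    edges = []
    batch_scores = scores[0]                   # single example
    seq_len = batch_scores.size(0)
    for i in range(seq_len):
        for j in range(seq_len):
            if i != j and batch_scores[i, j] >= threshold:
                edges.append((i, j, batch_scores[i, j].item()))
    return edges

# Example usage with a source-code model (model name is illustrative):
# from transformers import AutoTokenizer, AutoModel
# tok = AutoTokenizer.from_pretrained("microsoft/codebert-base")
# model = AutoModel.from_pretrained("microsoft/codebert-base")
# inputs = tok("def add(a, b): return a + b", return_tensors="pt")
# out = model(**inputs, output_attentions=True)
# graph = aggregated_attention_graph(out.attentions)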