Background
Increased transcription of the human endogenous retrovirus group HERV-K (HML-2) is often seen during disease. Although the mechanism of its tissue-specific activation is unclear, research shows that LTR CpG hypomethylation alone is not sufficient to induce its promoter activity and that the transcriptional milieu of a malignant cell contributes, at least in part, to differential HML-2 expression.

Results
We analyzed the relationship between LTR sequence variation and promoter expression patterns in human breast cancer cell lines and found the two to be positively correlated. In particular, two proviruses (3q12.3 and 11p15.4) displayed increased activity in almost all tumorigenic cell lines sampled. Using a transcription factor binding site prediction algorithm, we identified two unique binding sites in each 5′ LTR that appeared to be associated with induction of promoter activity during neoplasia. Genomic analysis of the homologous proviruses in several non-human primates indicated post-integration genetic drift in two transcription factor binding sites, away from the ancestral sequence and towards the active form. Based on the sequences of 2504 individuals from the 1000 Genomes Project, the active form of the 11p15.4 site was found to be polymorphic within the human population, with an allele frequency of 51%, whereas the activating mutation in the 3q12.3 provirus was fixed in humans but absent from the orthologous provirus in chimpanzees and gorillas.

Conclusions
These data suggest that stage-specific transcription factors contribute, at least in part, to LTR promoter activity during transformation and that, in some cases, transcription factor binding site polymorphisms may be responsible for the differential HML-2 expression often seen between individuals.

Electronic supplementary material
The online version of this article (10.1186/s12977-018-0441-2) contains supplementary material, which is available to authorized users.