Traditional ways of job seeking have become less popular than digital methods. Recruitment websites attract job seekers because they provide easy, convenient access to a larger number of job vacancies. Their main drawback, however, is that vacancies published online are often unstructured and confusing. Studies of online job vacancies usually cover a short period and a small number of recruitment websites, and they frequently use proxies for skills and occupations or aggregate them into wider groups. The aim of our research is to provide the full educational characteristics of job vacancies in Poland and to calculate a complete list of educational mismatches. We introduce an approach comprising source selection, data collection, and the extraction of occupations, qualifications, and skills. We describe the difficulties encountered in data scraping and ways to overcome them. Thanks to our large dataset, we are able to determine and describe labour demand. We also present the results of a survey that estimates the educational traits of labour supply. To measure the educational mismatch between labour supply and demand, we use structural compliance indices. The paper also offers a case study of selected occupational groups. Our findings reveal that the greatest mismatch is in education and job-related skills, while the least mismatch occurs between geographic regions.