Semantic video annotation using ontologies has received considerable attention from the scientific community in recent years. Ontologies are regarded as an appropriate tool to bridge the semantic gap. In this paper we present an overview of state-of-the-art approaches and algorithms that exploit ontologies to perform semantic video annotation, and we propose an approach that automatically learns rules describing high-level concepts by exploiting the domain knowledge embedded in an ontology. The proposed technique is an adaptation of the First Order Inductive Learner (FOIL) technique to the Semantic Web Rule Language (SWRL) standard. Experiments have been performed in two different video domains: (i) the TRECVID 2005 broadcast news collection, to detect events related to airplanes, such as taxiing, flying, landing and taking off; (ii) surveillance videos, to detect whether a person enters or exits a specific area. The promising experimental results demonstrate the effectiveness and flexibility of the proposed framework.