“…UCF101 [36], HMDB [22] and Kinetics [19] have been widely used for recognizing actions in video clips [40,29,45,8,35,44,7,28,26,38,18,41]; THUMOS [17], ActivityNet [4] and AVA [13] were introduced for temporal and spatio-temporal action localization [33,48,27,37,52,53,3,5,24]. Recently, significant attention has been drawn to modeling human-human [13] and human-object interactions in daily actions [31,34,42]. In contrast to these datasets, which were designed to evaluate motion and appearance modeling or human-object interactions, our Agent-in-Place Action (APA) dataset is the first to focus on actions defined with respect to scene layouts, including interactions with places and moving directions.…”