2020 IEEE High Performance Extreme Computing Conference (HPEC)
DOI: 10.1109/hpec43674.2020.9286182
Beyond Floating-Point Ops: CNN Performance Prediction with Critical Datapath Length

Cited by 7 publications (3 citation statements) · References 14 publications
“…FLOPs is a widely used indicator of a model's computational complexity. FLOPs are defined as the total number of network operations that can be reduced to a single floating-point hardware operation (Langerman et al., 2020; Molchanov et al., 2016). The original ResNeSt is much lighter than Res2Net due to fewer parameters (37.28 M) and fewer computations (211.15 G).…”
Section: Results
confidence: 99%
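To make the FLOPs definition above concrete, the following is a minimal sketch (not taken from the cited papers) of how FLOPs are commonly counted for a single 2-D convolution layer, using the usual convention that one multiply-accumulate equals two floating-point operations; the layer dimensions in the example are illustrative assumptions.

# Minimal sketch: FLOP count for one Conv2D forward pass.
# Convention assumed here: 1 multiply-accumulate (MAC) = 2 FLOPs.

def conv2d_flops(c_in, c_out, k_h, k_w, h_out, w_out):
    """Total floating-point operations for one Conv2D forward pass."""
    macs_per_output = c_in * k_h * k_w      # MACs needed per output element
    outputs = c_out * h_out * w_out         # number of output elements
    return 2 * macs_per_output * outputs    # 1 MAC = 1 multiply + 1 add

# Illustrative example: a 3x3 convolution mapping 64 -> 128 channels
# on a 56x56 output feature map (~0.46 GFLOPs).
print(conv2d_flops(64, 128, 3, 3, 56, 56) / 1e9, "GFLOPs")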
“…Previous works examining the relationship between efficiency measures showed that different cost indicators do not correlate well with each other during neural network training (Dehghani et al., 2021). In particular, it has been hypothesized that discrepancies between FLOPs and wall-clock inference latency arise primarily because kernels are either compute-bound by kernel execution or memory-bound by data movement, as opposed to being limited by framework bottlenecks (Langerman et al., 2020). These previous works have largely focused on convolutional neural networks (CNNs) in computer vision.…”
Section: Efficiency Metrics and Cost Indicators
confidence: 99%
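The compute-bound versus memory-bound distinction referenced above can be illustrated with a simple roofline-style estimate. The sketch below is an assumption-laden toy model, not the cited paper's method: the peak throughput and memory bandwidth figures are made-up hardware parameters, chosen only to show how two layers with identical FLOPs can have very different latencies depending on data movement.

# Minimal roofline-style sketch: latency is bounded either by compute
# (kernel execution) or by memory bandwidth (data movement).
# peak_flops and peak_bw are illustrative, assumed hardware numbers.

def roofline_latency(flops, bytes_moved, peak_flops=10e12, peak_bw=500e9):
    compute_time = flops / peak_flops      # time if purely compute-bound (s)
    memory_time = bytes_moved / peak_bw    # time if purely memory-bound (s)
    bound = "compute" if compute_time >= memory_time else "memory"
    return max(compute_time, memory_time), bound

# Two hypothetical layers with the same 1 GFLOP cost but different data movement:
print(roofline_latency(1e9, 10e6))    # little data moved  -> compute-bound
print(roofline_latency(1e9, 200e6))   # heavy data movement -> memory-bound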
“…Transformers have been shown to be more efficient in terms of inference time when compared to convolutional networks [56]. This effect is not directly correlated with the number of parameters, but is rather more influenced by the network structure [79].…”
Section: Inference Time
confidence: 99%