A sketch is a black-and-white, 2-D graphical representation of an object and contains fewer visual details as compared to a colored image. Despite fewer details, humans can recognize a sketch and its context very efficiently and consistently across languages, cultures, and age groups, but it is a difficult task for computers to recognize such low-detail sketches and get context out of them. With the tremendous increase in popularity of IoT devices such as smartphones and smart cameras, etc., it has become more critical to recognize free hand-drawn sketches in computer vision and human-computer interaction in order to build a successful artificial intelligence of things (AIoT) system that can first recognize the sketches and then understand the context of multiple drawings. Earlier models which addressed this problem are scale-invariant feature transform (SIFT) and bag-of-words (BoW). Both SIFT and BoW used hand-crafted features and scale-invariant algorithms to address this issue. But these models are complex and time-consuming due to the manual process of features setup. The deep neural networks (DNNs) performed well with object recognition on many large-scale datasets such as ImageNet and CIFAR-10. However, the DDN approach cannot be carried out for hand-drawn sketches problems. The reason is that the data source is images, and all sketches in the images are, for example, ‘birds’ instead of their specific category (e.g., ‘sparrow’). Some deep learning approaches for sketch recognition problems exist in the literature, but the results are not promising because there is still room for improvement. This article proposed a convolutional neural network (CNN) architecture called Sketch-DeepNet for the sketch recognition task. The proposed Sketch-DeepNet architecture used the TU-Berlin dataset for classification. The experimental results show that the proposed method beats the performance of the state-of-the-art sketch classification methods. The proposed model achieved 95.05% accuracy as compared to existing models DeformNet (62.6%), Sketch-DNN (72.2%), Sketch-a-Net (77.95%), SketchNet (80.42%), Thinning-DNN (74.3%), CNN-PCA-SVM (72.5%), Hybrid-CNN (84.42%), and human recognition accuracy of 73% on the TU-Berlin dataset.