Point cloud classification plays an important role in a wide range of airborne light detection and ranging (LiDAR) applications, such as topographic mapping, forest monitoring, power line detection, and road detection. However, due to the sensor noise, high redundancy, incompleteness, and complexity of airborne LiDAR systems, point cloud classification is challenging. Traditional point cloud classification methods mostly focus on the development of handcrafted point geometry features and employ machine learning-based classification models to conduct point classification. In recent years, the advances of deep learning models have caused researchers to shift their focus towards machine learning-based models, specifically deep neural networks, to classify airborne LiDAR point clouds. These learning-based methods start by transforming the unstructured 3D point sets to regular 2D representations, such as collections of feature images, and then employ a 2D CNN for point classification. Moreover, these methods usually need to calculate additional local geometry features, such as planarity, sphericity and roughness, to make use of the local structural information in the original 3D space. Nonetheless, the 3D to 2D conversion results in information loss. In this paper, we proposed a directionally constrained fully convolutional neural network (D-FCN) that can take the original 3D coordinates and LiDAR intensity as input; thus, it can directly apply to unstructured 3D point clouds for semantic labeling. Specifically, we first introduce a novel directionally constrained point convolution (D-Conv) module to extract locally representative features of 3D point sets from the projected 2D receptive fields. To make full use of the orientation information of neighborhood points, the proposed D-Conv module performs convolution in an orientation-aware manner by using a directionally constrained nearest neighborhood search. Then, we designed a multiscale fully convolutional neural network with downsampling and upsampling blocks to enable multiscale point feature learning. The proposed D-FCN model can therefore process input point cloud with arbitrary sizes and directly predict the semantic labels for all the input points in an end-to-end manner. Without involving additional geometry features as input, the proposed method has demonstrated superior performance on the International Society for Photogrammetry and Remote Sensing (ISPRS) 3D labeling benchmark dataset. The results show that our model has achieved a new state-of-the-art level of performance with an average F1 score of 70.7%, and it has improved the performance by a large margin on categories with a small number of points (such as powerline, car, and facade).