Variational quantum circuits (VQCs) have shown great potential in near-term applications. However, how the discriminative power of a VQC depends on its circuit architecture and depth is not well understood. To unleash the genuine discriminative power of a VQC, we propose a VQC system with the optimal classical post-processing: maximum-likelihood estimation on the measurement outcomes of all VQC output qubits. Via extensive numerical simulations, we find that the error of VQC quantum data classification typically decays exponentially with the circuit depth when the VQC architecture is extensive, meaning that the number of gates does not shrink with the circuit depth. This fast error suppression ends when the error saturates at the ultimate Helstrom limit of quantum state discrimination. On the other hand, non-extensive VQCs such as quantum convolutional neural networks are sub-optimal and fail to achieve the Helstrom limit, demonstrating a general trade-off between ansatz complexity and classification performance. To achieve the best performance for a given VQC, the optimal classical post-processing is crucial even for a binary classification problem. To simplify VQCs for near-term implementations, we find that properly utilizing the symmetry of the input can improve the performance, while oversimplification can degrade it.
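
The two quantities compared above, the Helstrom limit and the error of maximum-likelihood post-processing on all output qubits, can be illustrated with a minimal NumPy sketch. This is not the implementation used in this work: the function names `helstrom_error` and `ml_error`, the single-qubit example states, and the use of a fixed unitary in place of a trained VQC are all assumptions made for illustration. The maximum-likelihood rule is written with prior weights, and the example uses equal priors.

```python
import numpy as np

def helstrom_error(rho0, rho1, p0=0.5):
    """Helstrom limit: minimum error probability for discriminating
    rho0 (prior p0) from rho1 (prior 1 - p0)."""
    gamma = p0 * rho0 - (1.0 - p0) * rho1
    eigvals = np.linalg.eigvalsh(gamma)      # gamma is Hermitian
    trace_norm = np.sum(np.abs(eigvals))
    return 0.5 * (1.0 - trace_norm)

def ml_error(rho0, rho1, unitary=None, p0=0.5):
    """Error of maximum-likelihood post-processing when every output qubit
    is measured in the computational basis after applying `unitary`
    (a hypothetical stand-in for a trained VQC)."""
    dim = rho0.shape[0]
    u = np.eye(dim) if unitary is None else unitary
    # Outcome distributions over all 2^n bit strings for each hypothesis.
    q0 = np.real(np.diag(u @ rho0 @ u.conj().T))
    q1 = np.real(np.diag(u @ rho1 @ u.conj().T))
    # For each outcome, decide for the hypothesis with the larger weighted
    # likelihood; the error contribution is the smaller of the two weights.
    return float(np.sum(np.minimum(p0 * q0, (1.0 - p0) * q1)))

# Illustrative single-qubit example (assumed): discriminate |0> from |+>.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2.0)
rho0 = np.outer(ket0, ket0.conj())
rho1 = np.outer(ket_plus, ket_plus.conj())

limit = helstrom_error(rho0, rho1)
err_no_circuit = ml_error(rho0, rho1)        # measure directly, no circuit

# Rotating into the eigenbasis of p0*rho0 - p1*rho1 saturates the limit.
_, eigvecs = np.linalg.eigh(0.5 * rho0 - 0.5 * rho1)
err_optimal = ml_error(rho0, rho1, unitary=eigvecs.conj().T)

print(limit, err_no_circuit, err_optimal)
```

In this toy case, measuring without any circuit leaves the error above the Helstrom limit, while the rotation into the optimal basis reaches it. In the setting described above, a VQC of sufficient depth would play the role of that rotation.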