As semiconductor devices are scaled into nanoscale regime, first velocity saturation starts to limit the carrier mobility due to pronounced intervalley scattering, and when the device dimensions are scaled to 100 nm and below, velocity overshoot (which is a positive effect) starts to dominate the device behavior leading to larger ON-state currents. Alongside with the developments in the semiconductor nanotechnology, in recent years there has been significant progress in physical based modeling of semiconductor devices. First, for devices for which gradual channel approximation can not be used due to the two-dimensional nature of the electrostatic potential and the electric fields driving the carriers from source to drain, drift-diffusion models have been exploited. These models are valid, in general, for large devices in which the fields are not that high so that there is no degradation of the mobility due to the electric field. The validity of the drift-diffusion models can be extended to take into account the velocity saturation effect with the introduction of field-dependent mobility and diffusion coefficients. When velocity overshoot becomes important, drift diffusion model is no longer valid and hydrodynamic model must be used. The hydrodynamic model has been the workhorse for technology development and several high-end commercial device simulators have appeared including Silvaco, Synopsys, Crosslight, etc. The advantages of the hydrodynamic model are that it allows quick simulation runs but the problem is that the amount of the velocity overshoot depends upon the choice of the energy relaxation time. The smaller is the device, the larger is the deviation when using the same set of energy relaxation times. A standard way in calculating the energy relaxation times is to use bulk Monte Carlo simulations. However, the energy relaxation times are material, device geometry and doping dependent parameters, so their determination ahead of time is not possible. To avoid the problem of the proper choice of the energy relaxation times, a direct solution of the Boltzmann Transport Equation (BTE) using the Monte Carlo method is the best method of choice. That is why the focus of this review paper is on explaining basic Monte Carlo device simulator and then the focus will be shifted on the inclusion of various higher order effects that explain particular physical phenomena or processes.The Monte Carlo book chapter is organized as follows. First, the idea behind the Monte Carlo technique is outlined by revoking the path integral method for the solution of the BTE. This approach naturally leads to the free-flight-scatter sequence that is used in solving the BTE using the Monte Carlo method. Various scattering mechanisms relevant for different materials are given to completely specify the collision integral in the BTE. A discussion followed with the presentation of a generic flow-chart for implementing bulk Monte Carlo code is presented. Note that bulk Monte Carlo approach is suitable for the characterization of ma...