Orthogonal frequency-division multiplexing (OFDM)-based massive multiuser (MU) multiple-input multiple-output (MIMO) has been shown to be one of the most promising technologies for next-generation wireless communication systems, as it significantly enhances spectral, energy, and hardware efficiency. However, massive MU-MIMO-OFDM transmitters produce signals with a high peak-to-average power ratio (PAPR). At the same time, the radio frequency (RF) power amplifier (PA) is expected to be a low-cost and energy-efficient component in order to enable cost- and energy-efficient massive MU-MIMO-OFDM base station (BS) deployments. The nonlinearity of such a PA, which causes the most severe hardware impairment, generates harmful in-band distortion and out-of-band (OOB) emissions when the PA is driven by high-PAPR signals. In this paper, we develop a PAPR-aware downlink transmission scheme for OFDM-based large-scale MU-MIMO systems. Linear precoding of data and peak-canceling signals (PCSs) is employed to reduce the PAPRs of the transmitted signals by exploiting the excess degrees of freedom (DoFs) provided by equipping the BS with a large number of transmit antennas. Specifically, we design PCSs to be added to the frequency-domain precoded data signals, with the goal of reducing the PAPRs of their time-domain counterparts. Most importantly, the added PCSs must lie in the null spaces of their associated MIMO channel matrices so that they cause neither MU interference (MUI) nor OOB radiation. To this end, an efficient algorithm is developed that uses different data and PCS precoders, and the corresponding achievable PAPR reduction and bit error rate (BER) performance are analyzed. Moreover, to optimize the tradeoff between performance and complexity, linear precoders based on matrix polynomials (M-POLYs) and gradient-iterative approaches are studied for both data and PCS precoding. 
Simulation results reveal that the M-POLY and gradient-iterative precoders achieve performance similar to that of the regularized zero-forcing (RZF) and orthogonal projection null space (OPNS)-based data and PCS precoders, while requiring much lower computational complexity. The substantial PAPR reduction provided by the proposed algorithm offers useful insights for the design of energy-efficient massive MU-MIMO-OFDM systems.
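
To make the null-space constraint concrete, the following is a minimal sketch of one clip-and-project step, not the algorithm developed in the paper. It assumes perfect channel knowledge, treats every subcarrier as a data subcarrier (so guard bands and OOB control are not modeled), and uses the exact projector I - H^H (H H^H)^{-1} H on each subcarrier. The function names, the array shapes, and the parameter `clip_ratio` are hypothetical choices made for illustration only.

```python
import numpy as np


def papr_db(x):
    """PAPR in dB of each antenna's time-domain signal; x has shape (N, M)."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max(axis=0) / power.mean(axis=0))


def null_space_pcs_step(H, S, clip_ratio=1.5):
    """One illustrative clip-and-project step (assumed, not the paper's method).

    H: channels, shape (N, K, M) for N subcarriers, K users, M BS antennas.
    S: frequency-domain precoded data, shape (N, M).
    Returns (S + C, C), where C is a PCS satisfying H[n] @ C[n] = 0 for all n.
    """
    N = H.shape[0]
    # Time-domain signal of every antenna (unitary IFFT along subcarriers).
    x = np.fft.ifft(S, axis=0) * np.sqrt(N)
    # Clip the envelope at clip_ratio times the RMS amplitude.
    threshold = clip_ratio * np.sqrt(np.mean(np.abs(x) ** 2))
    scale = np.minimum(1.0, threshold / np.maximum(np.abs(x), 1e-12))
    d = x * scale - x  # time-domain signal that would cancel the peaks
    D = np.fft.fft(d, axis=0) / np.sqrt(N)  # back to the frequency domain
    # Project D onto the null space of H on each subcarrier:
    # C = D - H^H (H H^H)^{-1} H D.
    Hh = np.conj(np.transpose(H, (0, 2, 1)))  # shape (N, M, K)
    HD = np.einsum("nkm,nm->nk", H, D)  # shape (N, K)
    coeff = np.linalg.solve(H @ Hh, HD[..., None])[..., 0]
    C = D - np.einsum("nmk,nk->nm", Hh, coeff)
    return S + C, C


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, K, M = 64, 4, 32  # hypothetical sizes: subcarriers, users, antennas
    shape = (N, K, M)
    H = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    S = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
    S_new, C = null_space_pcs_step(H, S)
    leak = np.abs(np.einsum("nkm,nm->nk", H, C)).max()
    print("max |H C| (should be near zero):", leak)
    print("mean PAPR before [dB]:", papr_db(np.fft.ifft(S, axis=0)).mean())
    print("mean PAPR after  [dB]:", papr_db(np.fft.ifft(S_new, axis=0)).mean())
```

Because the correction is projected onto the null space of each channel matrix, the signals seen by the users are unchanged, which is the sense in which a PCS causes no MUI. The explicit inverse in the projector is the kind of cost that lower-complexity precoders such as the M-POLY and gradient-iterative approaches aim to avoid.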