We theoretically study the generation of attosecond XUV pulses via high-order frequency mixing (HFM) of two intense generating fields and compare this process with the more common high-order harmonic generation (HHG). We calculate the macroscopic XUV signal by numerically integrating the 1D propagation equation coupled with the 3D time-dependent Schrödinger equation. We analytically find the length scales that limit the quadratic growth of the macroscopic HFM signal with propagation length. For a group of HFM components, with orders defined by the frequencies of the generating fields, these length scales are much longer than for HHG. This results in a higher macroscopic HFM signal even though the microscopic response is lower than for HHG. In our numerical simulations for a gas target, the intensity of the HFM signal is several times higher than that for HHG, while for a plasma target the HFM generation efficiency is up to three orders of magnitude higher than for HHG. HFM with relatively long generating pulses provides very narrow XUV lines (δω/ω = 4.6 × 10⁻⁴) with well-defined frequencies, thus allowing for a simple extension of optical frequency standards to the XUV range. Finally, we show that the group of HFM components that is generated efficiently owing to macroscopic effects provides a train of attosecond pulses, and that the carrier-envelope phase of an individual attosecond pulse can be easily controlled by tuning the phase of one of the generating fields.