The complex of scandium triflate with pybox is an effective enantioselective catalyst for the reaction between methyl (E)‐4‐aryl‐2‐oxo‐3‐butenoates (1) and enol silyl ethers (2). With (4′S)‐isopropyl‐, (4′S)‐phenyl‐, and (4′S,5′S)‐4‐TIPS‐pybox, the main products are methyl (4S,4aS,8aS)‐4‐aryl‐8a‐trialkylsiloxy‐hexahydro‐4H‐chromene‐2‐carboxylates (7), which are obtained in good yields and with enantioselectivities of more than 95 % ee. Because the reaction gives products with a trans ring junction, which cannot arise from a conventional concerted hetero‐Diels–Alder pathway, the mechanism of the enantioselective catalytic cycle and the origin of the stereoinduction were investigated. The structures of two desilylated products, 8b and 9b, were determined by X‐ray crystallographic analysis, and their absolute configurations were then related to that of 7 by stereospecific isomerization and/or desilylation reactions. Assuming that aryl‐butenoates 1 coordinate to the catalyst to form reacting complexes characterized by the five‐membered structure 11, these rigid intermediates provide a face discrimination that is determined by the configuration of the pybox 4′‐substituent. The resulting face‐selective attack of enol silyl ethers on coordinated 1 leads to an enantioselective tandem Mukaiyama–Michael addition/intramolecular ring closure, that is, an enantioselective formal hetero‐Diels–Alder reaction, and this pathway rationalizes the absolute configuration of the reaction products.