“…With advances in computing power, data availability, and algorithms, there has been significant interest in developing machine learning (ML) models to assist with a variety of organic reaction-related tasks [1–4], including reaction product prediction [5–18], retrosynthesis [9,14,17,19–37], reaction condition optimization [38–41], reaction yield prediction [42–54], and reaction type classification [38,51,55–57]. These ML-based, data-driven approaches to organic synthesis can be classified into descriptor-based models [5–10,19–26,38–41,51,55,57], graph-based models [11–13,27,28,52], and sequence-based models [14–18,29–37,53,54,56], depending on how molecules are represented as input for machine learning. Descriptor-based models use hand-crafted features as molecular representations and often need feature engineering or template extraction for each reaction prediction task, which limits their generalizability.…”
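
As an illustration of the three input representations named in the excerpt, the following minimal sketch encodes a single molecule, ethanol (SMILES `CCO`), in each form. It is not taken from the cited work: the variable names, the `handcrafted_descriptor` helper, and the choice of features are all hypothetical, and the structure is written out by hand rather than parsed with a cheminformatics library.

```python
# Illustrative sketch only: ethanol in three input representations.
# All names are hypothetical; the molecule is hand-specified, not parsed.

smiles = "CCO"  # ethanol, hydrogens implicit

# Sequence-based: the SMILES string as a list of tokens.
# Character-level splitting is only valid here because every atom
# symbol in this string is a single character (it would break on "Cl").
sequence_tokens = list(smiles)

# Graph-based: atoms as nodes, bonds as edges (pairs of node indices).
graph_nodes = ["C", "C", "O"]
graph_edges = [(0, 1), (1, 2)]  # C-C and C-O single bonds


def handcrafted_descriptor(nodes, edges):
    """Descriptor-based: a fixed set of features chosen by the modeller."""
    return {
        "n_heavy_atoms": len(nodes),
        "n_carbon": nodes.count("C"),
        "n_oxygen": nodes.count("O"),
        "n_bonds": len(edges),
    }


descriptor = handcrafted_descriptor(graph_nodes, graph_edges)
```

The descriptor depends entirely on which features the modeller decides to compute, whereas the token list and the node/edge lists follow directly from the structure. This is one way to read the excerpt's point that hand-crafted features can limit generalizability.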