“…Subsequent attempts to model Marcus et al.'s (1999) human data using variable-free network models have met with varying degrees of success. This work has shown that model performance is influenced by several factors: pretraining (whether the model has any prior knowledge of phonemes, syllables, or abstract relations that helps it solve the task at hand) (Seidenberg & Elman, 1999a, 1999b; Altmann, 2002); encoding assumptions (whether the model is trained on input vectors that represent phonetic features, place of articulation, vowel height, primary/secondary stress, or non-featural random vectors) (Negishi, 1999; Christiansen & Curtin, 1999; Christiansen, Conway, & Curtin, 2000; Dienes, Altmann, & Gao, 1999; Altmann & Dienes, 1999; Shultz & Bale, 2001; Geiger et al., 2022); model type (whether the model is a neural network, an autoencoder trained with cascade-correlation, an auto-associator, a Bayesian model, an Echo State Network, or a Seq2Seq model) (Shultz, 1999; Sirois, Buckingham, & Shultz, 2000; Frank & Tenenbaum, 2011; Alhama & Zuidema, 2018; Prickett et al., 2022); and task (whether the task is to predict or identify rules, words, syllables, or patterns, or to segment syllable sequences into “words”) (Seidenberg & Elman, 1999a, 1999b; Christiansen & Curtin, 1999) (see Alhama & Zuidema, 2019, for a detailed review of these computational models). These factors have made it challenging to draw direct comparisons with human behavior, further fueling the ongoing discussion.…”
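The encoding-assumption factor can be made concrete with a minimal sketch. The syllables and the four-dimensional feature set below are invented for illustration and are not taken from any of the cited models; the sketch shows only the general point at issue, namely that under a featural encoding a novel test syllable can overlap with familiar training syllables, whereas under a non-featural random encoding it shares no structure with them.

```python
import numpy as np

rng = np.random.default_rng(0)

# "wo" is treated as a novel test syllable; the rest are training items.
SYLLABLES = ["ga", "ti", "na", "wo"]

# Non-featural encoding: each syllable is an arbitrary random vector,
# so a novel syllable has no systematic overlap with familiar ones.
random_codes = {s: rng.standard_normal(8) for s in SYLLABLES}

# Featural encoding: syllables share dimensions (invented features here:
# [voiced, nasal, high_vowel, back_vowel]), so a novel syllable can be
# close, or even identical, in feature space to a familiar one.
FEATURES = {
    "ga": [1, 0, 0, 1],
    "ti": [0, 0, 1, 0],
    "na": [1, 1, 0, 1],
    "wo": [1, 0, 0, 1],  # same invented features as "ga"
}
feature_codes = {s: np.array(v, dtype=float) for s, v in FEATURES.items()}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Under the featural encoding, the novel "wo" is maximally similar to the
# familiar "ga", so part of the generalization is "built in" by the input
# representation; under the random encoding it is not.
print(cosine(feature_codes["wo"], feature_codes["ga"]))  # 1.0
print(cosine(random_codes["wo"], random_codes["ga"]))    # typically small
```

This is why encoding choices complicate comparisons across models: a network credited with "learning the rule" may instead be exploiting similarity structure supplied by the featural input code.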