“…Hence, we do not expect perfect agreement between model-predicted salience and human eye position. In particular, our bottom-up model as used here does not yet account, among others, for how the rapid identification of the gist (semantic category) of a scene may provide contextual priors to more efficiently guide attention towards target objects of interest (Biederman, Teitelbaum, & Mezzanotte, 1983; Friedman, 1979; Hollingworth & Henderson, 1998; Oliva & Schyns, 1997; Potter & Levy, 1969; Torralba, 2003); how search for a specific target might be guided top-down, for example by boosting visual neurons tuned to the attributes of the target (Ito & Gilbert, 1999; Moran & Desimone, 1985; Motter, 1994; Müller, Reimann, & Krummenacher, 2003; Reynolds, Pasternak, & Desimone, 2000; Treue & Maunsell, 1996; Treue & Trujillo, 1999; Wolfe, 1994, 1998; Wolfe, Cave, & Franzel, 1989; Yeshurun & Carrasco, 1998); or how task, expertise, and internal scene models may influence eye movements (Henderson & Hollingworth, 2003; Moreno, Reina, Luis, & Sabido, 2002; Nodine & Krupinski, 1998; Noton & Stark, 1971; Peebles & Cheng, 2003; Savelsbergh, Williams, van der Kamp, & Ward, 2002; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995; Yarbus, 1967). Nevertheless, our hypothesis for this study is that a more realistic simulation framework might yield better agreement between human and model than a less realistic one.…”