2015
DOI: 10.1111/cogs.12311

Archaeology Through Computational Linguistics: Inscription Statistics Predict Excavation Sites of Indus Valley Artifacts

Abstract: Computational techniques comparing co-occurrences of city names in texts allow the relative longitudes and latitudes of cities to be estimated algorithmically. However, these techniques have not been applied to estimate the provenance of artifacts with unknown origins. Here, we estimate the geographic origin of artifacts from the Indus Valley Civilization, applying methods commonly used in cognitive science to the Indus script. We show that these methods can accurately predict the relative locations of archeol…

Cited by 6 publications (8 citation statements)
References 30 publications
“…This hypothesis receives support from the observation that distributional vectors actually encode surprising amounts of information about the surrounding world: For example, Louwerse and Zwaan (2009) applied a multidimensional scaling technique to the similarities between distributional vectors for city names and projected them onto a two-dimensional space. They found that the coordinates of the cities within this space were correlated with the actual geographical positions of these cities, in the real world (Louwerse & Zwaan, 2009; Recchia & Louwerse, 2016) as well as in fictional worlds such as Middle-earth from The Lord of the Rings (Louwerse & Benesh, 2012). Other studies have shown that the similarity structure between distributional vectors also reflects other real-world similarity structures, such as the relative position of the days of the week or months of the year (Louwerse, Raisig, Tillman, & Hutchinson, 2015).…”
Section: Embodiment and The Role Of Nonlinguistic Experience (mentioning)
confidence: 99%
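The first step described in the statement above, computing similarities between distributional vectors for place names, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy documents, the place names `alpha` to `delta`, and the helper names `context_vectors` and `cosine` are invented, and the cited studies worked with far larger corpora and their own procedures. The multidimensional scaling step that projects the similarities onto two dimensions is not shown.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the place names are invented for illustration.
DOCUMENTS = [
    "alpha and beta share the river market",
    "boats leave alpha for beta daily",
    "gamma and delta guard the mountain pass",
    "beta sends grain to gamma",
]
NAMES = ["alpha", "beta", "gamma", "delta"]


def context_vectors(documents, names):
    """For each name, count the other words in every document mentioning it."""
    vectors = {name: Counter() for name in names}
    for doc in documents:
        tokens = doc.lower().split()
        for name in names:
            if name in tokens:
                vectors[name].update(t for t in tokens if t != name)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = context_vectors(DOCUMENTS, NAMES)
sim_near = cosine(vectors["alpha"], vectors["beta"])
sim_far = cosine(vectors["alpha"], vectors["delta"])
print(f"alpha-beta: {sim_near:.3f}, alpha-delta: {sim_far:.3f}")
```

In this toy corpus, names that are mentioned together in similar contexts end up with a higher cosine similarity than names that are not. A full pipeline would convert such similarities to dissimilarities and pass them to a scaling method to obtain two-dimensional coordinates.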
“…Previous research has shown that the type of knowledge that can be extracted from natural language data—even from its surface-level statistical structure alone, without semantic analyses of its content—is surprisingly extensive. For example, word frequencies are positively correlated with the population sizes of cities [10], and statistical analyses of natural language data even reveal the real geographical distances between places [10, 11] or the typical spatial arrangement of objects [12]. Even more striking evidence in this respect comes from congenitally blind individuals who never had any visual experience but can exploit linguistic information, which enables them to linguistically categorize colors and correctly assign colors to objects [13] and to differentiate different kinds of “seeing”, such as peeking versus staring [14].…”
Section: Introduction (mentioning)
confidence: 99%
“…At a first glance, this opens an interesting hypothesis on the type of information encoded in language data: Does natural language allow us to re-construct or at least make informed guesses about some proprieties of the (physical) world surrounding us, since speakers use language to communicate about this world? If natural language data allows us to predict the location of archeological sites [11], can our hypothetical alien scientist use statistical analyses to paint a picture of how Earth and the beings inhabiting it looked like? What is actually encoded in this data?…”
Section: Introduction (mentioning)
confidence: 99%
“…As a consequence, there are systematic redundancies between the structure of the outside world and the statistical structure of language (Johns & Jones, 2012a). This is supported by computational evidence showing that language statistics replicate the real geographical distances between places (Louwerse & Zwaan, 2009; Recchia & Louwerse, 2016), or the typical spatial arrangement of objects (Louwerse, 2008). One can thus draw inferences about the outside world from the statistical structure of language alone, even without direct access to it (Rinaldi & Marelli, 2020a).…”
Section: Cortical Maps Recovered From Language Statistics (mentioning)
confidence: 90%
“…Word frequency, the number of occurrences of a word in a given corpus of natural language, is the central measure of a word's prominence in language (van Heuven, Mandera, Keuleers, & Brysbaert, 2014). To probe such a relationship we reverted to the geographical domain (Louwerse & Zwaan, 2009; Recchia & Louwerse, 2016) and selected three different test cases: River lengths, city sizes, and country sizes. In all cases, we investigated the English (UK), French, German, and Italian languages, in order to rule out any influence of idiosyncratic patterns specific to each language.…”
Section: Study 1: Word Frequencies Encode Magnitude Information (mentioning)
confidence: 99%