Ground investigations on construction sites often use trial pits and borehole cores to determine the strata likely to be encountered at various depths. The data obtained from trial pits can be coded into a form suitable for use as sample observations input to a grammatical inference machine. A grammatical inference machine is a black box which, when presented with a sample of observations from some unknown source language, produces a grammar that is compatible with the sample. This article presents a heuristic model for a grammatical inference machine which takes as data sentences and non-sentences, identified as such, and is capable of inferring grammars in the class of context-free grammars expressed in Chomsky Normal Form. An algorithm and its corresponding software implementation have been developed based on this model. The software takes as input coded representations of ground investigation data and produces as output a grammar that describes and classifies the geotechnical data observed in the area; it also offers the prospect of predicting the likely configuration of strata across the site.
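To make the notion of a grammar being "compatible with the sample" concrete, the following minimal sketch checks a candidate grammar in Chomsky Normal Form against labelled sentences and non-sentences, using the standard CYK membership test. It is not the article's inference algorithm, only an illustration of the acceptance criterion such a machine must satisfy. The single-letter coding of strata (`t` for topsoil, `c` for clay, `g` for gravel), the example grammar, and the function names `cyk_accepts` and `is_compatible` are all hypothetical and chosen purely for illustration.

```python
# Hypothetical CNF grammar for coded trial-pit logs (an assumption, not from the article):
#   S -> T R      R -> C R | C G      T -> 't'   C -> 'c'   G -> 'g'
# It generates logs of the form: topsoil, one or more clay layers, then gravel.
UNARY = [("T", "t"), ("C", "c"), ("G", "g")]               # rules A -> a
BINARY = [("S", "T", "R"), ("R", "C", "R"), ("R", "C", "G")]  # rules A -> B C


def cyk_accepts(tokens, unary, binary, start="S"):
    """Return True if the CNF grammar derives the token sequence (CYK test)."""
    n = len(tokens)
    if n == 0:
        return False  # this sketch does not model the empty sentence
    # table[i][k] holds the non-terminals deriving tokens[i : i + k + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = {lhs for lhs, term in unary if term == tok}
    for length in range(2, n + 1):            # span length
        for i in range(n - length + 1):       # span start
            for split in range(1, length):    # size of the left part
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for lhs, b, c in binary:
                    if b in left and c in right:
                        table[i][length - 1].add(lhs)
    return start in table[0][n - 1]


def is_compatible(unary, binary, sentences, non_sentences):
    """A grammar is compatible if it accepts every sentence and no non-sentence."""
    accepts_all = all(cyk_accepts(s, unary, binary) for s in sentences)
    rejects_all = not any(cyk_accepts(s, unary, binary) for s in non_sentences)
    return accepts_all and rejects_all


# Hypothetical labelled sample: each string is one coded trial-pit log.
SENTENCES = ["tcg", "tccg"]
NON_SENTENCES = ["tg", "tc", "ctg"]
COMPATIBLE = is_compatible(UNARY, BINARY, SENTENCES, NON_SENTENCES)
```

An inference procedure of the kind described would search over candidate rule sets and could use a check like `is_compatible` to discard any grammar that rejects an observed sentence or accepts a known non-sentence.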