In the initial phase of a pentest, known as Open Source Intelligence (OSINT), we perform passive reconnaissance with Google Hacking, a practice that relies on search strings called Dorks. To support this practice, the Google Hacking Database (GHDB) makes thousands of Dorks available. However, the GHDB contains only a small number of attributes, all with textual values, which prevents the direct application of Machine Learning techniques. One way to address this is to enrich the GHDB with new attributes through Natural Language Processing and to transform its textual values into numeric ones by converting the characters of each Dork to their ASCII codes. The objective of this work was therefore to apply Natural Language Processing to enrich the GHDB with attributes and to convert its textual values to ASCII, so that Machine Learning techniques can be applied. The computational experiments were conducted in seven steps: Selection of the GHDB Base, Removal of Hyperlinks and Deletion of Attributes, Removal of the Site Parameter from Dorks, Removal of Outliers and Stopwords, Enrichment with Natural Language Processing, Base Transformation, and Application of the Self-Organizing Map (SOM). The results obtained with the SOM were considered good, based on the values of the metrics used to evaluate the network. The objective of this paper is thus considered to have been achieved.
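As an illustration of two of the steps named above (removal of the Site parameter and conversion of Dork characters to ASCII), the following is a minimal sketch rather than the procedure actually used in the experiments. The helper names `strip_site` and `dork_to_ascii`, the regular expression, the fixed vector length, and the zero-padding are all assumptions introduced for this example; the abstract does not specify them.

```python
import re

# Assumed pattern: a `site:` operator followed by its value (hypothetical choice).
SITE_PARAM = re.compile(r"\bsite:\S+\s*", re.IGNORECASE)


def strip_site(dork: str) -> str:
    """Remove any `site:` operator and its value from a Dork."""
    return SITE_PARAM.sub("", dork).strip()


def dork_to_ascii(dork: str, length: int = 16) -> list[int]:
    """Map each character to its ASCII code, then truncate or zero-pad to `length`.

    The fixed length and zero-padding are assumptions made so that every
    Dork yields a numeric vector of the same size.
    """
    codes = [ord(ch) for ch in dork if ord(ch) < 128]  # skip non-ASCII characters
    codes = codes[:length]
    return codes + [0] * (length - len(codes))


if __name__ == "__main__":
    example = "site:example.com inurl:admin"  # made-up Dork for illustration
    cleaned = strip_site(example)
    print(cleaned)
    print(dork_to_ascii(cleaned, length=12))
```

Fixed-length numeric vectors of this kind are one plausible input format for a Self-Organizing Map, which expects every sample to have the same dimensionality.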