TREK had decades of geotechnical investigations stored in many different formats and locations, making it hard to reuse past work or see everything known about a site in one place. The goal of this project was to create a unified geotechnical data platform and a geospatial view that engineers could rely on for planning, design and field work, while steadily improving data quality over time.
I started by designing a relational data model in Postgres/PostGIS to represent boreholes, samples, tests and related attributes in a consistent way. Thousands of legacy files and spreadsheets were then migrated into this structure using automated ETL pipelines.
Where possible, I used LLM- and NLP-assisted extraction to read semi-structured documents and map their fields into the new schema. The aim was not perfect automation but less manual effort, with records that needed human review clearly highlighted.
For elevation, I compared the values stored in the database against the Google Maps Elevation API and nearby LiDAR data, then filled in missing values and corrected inaccurate ones, improving the vertical accuracy of borehole locations.
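The elevation comparison can be sketched roughly as follows. This is a minimal illustration, not the project's actual logic: the function name, the status labels, the 2 m tolerance and the choice to prefer LiDAR over the API value are all assumptions, and the reference elevations are passed in as plain numbers rather than fetched from any service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElevationCheck:
    recorded_m: Optional[float]
    reference_m: Optional[float]
    reference_source: Optional[str]  # "lidar" or "elevation_api"
    status: str  # "ok", "filled", "needs_review" or "no_reference"


def reconcile_elevation(recorded_m, lidar_m, api_m, tolerance_m=2.0):
    """Compare a recorded elevation against external reference values.

    Assumption: LiDAR is preferred when available, with the API value as
    a fallback. The tolerance is an illustrative default, not a project value.
    """
    if lidar_m is not None:
        reference, source = lidar_m, "lidar"
    elif api_m is not None:
        reference, source = api_m, "elevation_api"
    else:
        # Nothing to compare against, so the record is left untouched.
        return ElevationCheck(recorded_m, None, None, "no_reference")

    if recorded_m is None:
        # Missing value: the reference can be used to fill the gap.
        return ElevationCheck(recorded_m, reference, source, "filled")

    if abs(recorded_m - reference) <= tolerance_m:
        return ElevationCheck(recorded_m, reference, source, "ok")

    # Large disagreement: route to a person rather than overwrite silently.
    return ElevationCheck(recorded_m, reference, source, "needs_review")
```

For example, `reconcile_elevation(None, 12.4, 13.0)` reports a value that can be filled from LiDAR, while `reconcile_elevation(30.0, None, 12.0)` is marked for review because the recorded value is far from the only available reference.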
On top of the database, I implemented a Google Earth-based decision view. Engineers can zoom to an area of interest and immediately see all known boreholes and investigations, along with links back to the underlying records, reports and tests.
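One common way to get points into Google Earth is a KML file, so the sketch below shows how borehole records might be turned into KML placemarks. Whether the project used KML, and the dictionary keys used here (`name`, `lon`, `lat`, `elevation_m`, `report_url`), are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def boreholes_to_kml(boreholes):
    """Build a KML document string with one Placemark per borehole.

    Each borehole is a dict with hypothetical keys: name, lon, lat,
    elevation_m (optional) and report_url.
    """
    # Serialize KML elements without a namespace prefix.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")

    for bh in boreholes:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = bh["name"]
        # The description carries the link back to the underlying report.
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = (
            f"Report: {bh['report_url']}"
        )
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinate order is longitude,latitude,altitude.
        elevation = bh.get("elevation_m") or 0.0
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{bh['lon']},{bh['lat']},{elevation}"
        )

    body = ET.tostring(kml, encoding="unicode")
    # The caller is expected to write this string to disk as UTF-8.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Writing the returned string to a `.kml` file gives a layer that can be opened in Google Earth, with each placemark description pointing back to its source report.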
To address data quality concerns, I designed a validation workflow and feedback loop:
• Each record includes a simple indication of location and attribute confidence
• Field and design teams can flag questionable points directly from the map
• Suggested corrections and comments are collected
• Managers review and approve updates, which are then applied to the central database
• Over time, this process improves both location accuracy and attribute quality
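The flag, review and apply steps above can be sketched as a small state machine. The class names, field names and statuses are hypothetical, and a plain dictionary stands in for the central database.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class FlagStatus(Enum):
    OPEN = "open"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Flag:
    """A questionable value raised from the map, with a suggested fix."""
    record_id: int
    field_name: str
    suggested_value: Any
    comment: str
    status: FlagStatus = FlagStatus.OPEN


def review_flag(records, flag, approve):
    """Apply a manager's decision to an open flag.

    `records` maps record_id to a dict of attributes and stands in for
    the central database. Only approved suggestions change the data.
    """
    if flag.status is not FlagStatus.OPEN:
        raise ValueError("flag has already been reviewed")
    if approve:
        records[flag.record_id][flag.field_name] = flag.suggested_value
        flag.status = FlagStatus.APPROVED
    else:
        flag.status = FlagStatus.REJECTED
    return flag
```

Keeping the suggestion separate from the record until a manager approves it means field feedback is never lost, yet nothing reaches the shared data without review.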
The platform architecture was kept modular so that new data sources or AI components (such as the geotechnical AI assistant) could be added without redesigning the whole system.
By consolidating fragmented geotechnical data into a single, geospatially aware platform, the project made it much easier for engineers and managers to reuse existing investigations, spot gaps and plan new work. The Google Earth-based view gave teams an intuitive way to explore the data and quickly see what was known about a site before committing time and budget.
The feedback loop turned data quality from a one-off clean-up into an ongoing collaborative process. As field and design teams used the platform and corrected records, the underlying database became steadily more reliable, providing a stronger foundation for future analytics and AI initiatives at TREK.