Release of a new report on the application of advanced data analysis and machine learning in southern British Columbia, Canada
A recently completed application of advanced data analysis and machine learning to regional stream-sediment geochemical data from British Columbia, Canada, demonstrates that random forests can generate predictive mineral exploration maps. See Grunsky, E.C. & Arne, D., 2020, Mineral-Resource Prediction Using Advanced Data Analytics and Machine Learning of the QUEST-South Stream-sediment Geochemical Data, Southwestern British Columbia (Parts of NTS 082, 092), for a copy of the report and associated data files.
Summary
In this study we apply multivariate statistical and predictive classification methods to interpret geochemical data from 8545 stream-sediment samples collected in southern British Columbia, Canada. Data for 35 elements were levelled for laboratory bias and adjusted for values reported below the lower limit of detection. Each sample site was attributed with the closest British Columbia MINFILE occurrence within 2.5 km. MINFILE occurrences were grouped into “GroupModels” based on similarities between the British Columbia Geological Survey mineral deposit models and geochemical signatures. These data were used to create a training data set of 474 observations, including 100 samples not attributed with a MINFILE occurrence. The training set was used to generate predictions for the mineral deposit models, from which posterior probabilities were estimated for the remaining 8071 samples. The data underwent a log-centred transform and were then characterized using either principal component analysis (PCA) or t-distributed stochastic neighbour embedding (t-SNE) using 9 dimensions, prior to classification by random forests. The posterior probabilities generated from the t-SNE metric provide a slightly higher level of prediction accuracy than those obtained using the PCA metric. The results are comparable to those obtained using a conventional catchment analysis approach and expert-driven model.
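Two of the preprocessing steps described in the summary can be illustrated with a minimal Python sketch. It is not the report's implementation. It assumes that the "log-centred transform" refers to the centred log-ratio (clr) transform commonly applied to compositional data, and that sample sites and occurrences are given in a projected coordinate system with units of metres. The function names, coordinates, and GroupModel labels are hypothetical.

```python
import math


def clr(values):
    """Centred log-ratio transform of one sample (assumed meaning of 'log-centred').

    values must be strictly positive concentrations, i.e. already adjusted
    for results below the detection limit.
    """
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing by the geometric mean.
    return [x - mean_log for x in logs]


def nearest_occurrence(site, occurrences, max_dist_m=2500.0):
    """Return the label of the closest occurrence within max_dist_m, else None.

    site is (easting, northing); occurrences is a list of
    (easting, northing, label) tuples in the same projected system.
    """
    best_label, best_dist = None, max_dist_m
    for x, y, label in occurrences:
        d = math.hypot(x - site[0], y - site[1])
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical data: one site and two occurrences with made-up labels.
occurrences = [(1000.0, 0.0, "GroupModel_A"), (3000.0, 0.0, "GroupModel_B")]
site_label = nearest_occurrence((0.0, 0.0), occurrences)
transformed = clr([1.0, 10.0, 100.0])
```

In this sketch a site with no occurrence within 2.5 km receives `None`, which corresponds to the samples "not attributed with a MINFILE occurrence". The clr-transformed values of each sample sum to zero by construction.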