Conventional implicit modelling: RBF or Kriging
To illustrate, some background on what happens to data points used for modelling geological features before they are fitted with RBFs: each point is assigned an attribute, either a distance value or an indicator value. In either case, points with a zero value lie on the boundary to be modelled, points with positive values lie on one side of that boundary, and points with negative values on the other. Fitting an RBF then operates very similarly to Kriging in estimating values for locations that are not in the data set. That is, for each location it estimates how far that location is from the boundary to be modelled, i.e., it predicts a signed distance value everywhere in space.
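The idea above can be sketched in a few lines. This is an illustrative toy example, not any particular product's implementation: we invent sample points around a circular boundary of radius 1, assign each its signed distance (negative inside, positive outside), fit an RBF to those values, and then ask the fitted function about locations that are not in the data set.

```python
# Toy sketch: fit an RBF to signed-distance samples around a circular
# boundary of radius 1 (synthetic, made-up data).
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
points = rng.uniform(-2.0, 2.0, size=(200, 2))
# Signed distance to the circle |p| = 1: negative inside, positive outside.
signed_dist = np.linalg.norm(points, axis=1) - 1.0

rbf = RBFInterpolator(points, signed_dist)

# Evaluate at unseen locations: the sign tells us which side of the
# boundary each query is on, the magnitude roughly how far from it.
queries = np.array([[0.0, 0.0], [1.5, 0.0]])
estimates = rbf(queries)
```

Here the origin should come back negative (inside the boundary) and the point at radius 1.5 positive (outside), which is exactly the "estimate the distance to the boundary" behaviour described above.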
Modelling with SDFs
SDFs cover a whole variety of functions that can be used for this purpose. The simplest is the (signed) distance to a point or a line. In those instances you cannot really speak of a signed distance, as they simply return the distance to the point or polyline: a lone point or an open polyline has no inside or outside. Things become slightly different when the points and lines are part of a surface, or, in the case of a polyline, when it is closed, but I'll leave that for the reader to explore.
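The two basic cases mentioned, distance to a point and distance to a line (segment), can be written down directly. As noted above, these are unsigned: without a closed boundary there is no inside or outside to give the distance a sign. The function names here are illustrative.

```python
# Minimal distance functions: distance to a point and to a line segment.
# These are unsigned -- a lone point or open polyline has no inside/outside.
import numpy as np

def dist_to_point(p, c):
    """Distance from query location p to a single point c."""
    return float(np.linalg.norm(np.asarray(p, float) - np.asarray(c, float)))

def dist_to_segment(p, a, b):
    """Distance from query location p to the segment from a to b."""
    p, a, b = (np.asarray(v, float) for v in (p, a, b))
    ab = b - a
    # Project p onto the infinite line, then clamp to the segment.
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))
```

For example, the query (0, 1) sits 1 unit above the segment from (-1, 0) to (1, 0), and a query beyond an endpoint is measured to that endpoint rather than to the infinite line.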
In the case where we have existing surfaces, we can truly speak of an SDF. For other data, where multiple points together make up a surface, such as point clouds from laser scanners, a direct distance estimate does not suffice, and neighbouring points are needed to estimate the SDF. RBFs cannot be applied directly in this case, as there are no 'inside' and 'outside' points. So, a first step is to add so-called "off-surface" points: points that carry the sign, estimated from the surface points. For RBFs there is no need to add off-surface points at every point of the point cloud, and optimizations can be used to estimate how many are needed. Without them, however, the RBF does not work: all the contact points sit at zero distance, so every location in space that is not part of the data set would also be estimated to be at zero distance.
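A common way to build such off-surface points, sketched below under simplifying assumptions, is to step a small distance along each surface point's normal (estimated from neighbouring points in practice; taken as given here): the offset point on the outside gets a positive value, the one on the inside a negative value, and the original point stays at zero. The function name and step size are illustrative, not any specific package's API.

```python
# Hedged sketch: generate "off-surface" points by offsetting surface
# points along their normals. Normals are assumed given; in practice
# they would be estimated from neighbouring points in the cloud.
import numpy as np

def add_off_surface_points(points, normals, step=0.1):
    """Return (locations, signed_values) augmented with off-surface points."""
    points = np.asarray(points, float)
    normals = np.asarray(normals, float)
    outside = points + step * normals   # assigned value +step
    inside = points - step * normals    # assigned value -step
    locs = np.vstack([points, outside, inside])
    vals = np.concatenate([
        np.zeros(len(points)),          # on-surface points: distance 0
        np.full(len(points), step),
        np.full(len(points), -step),
    ])
    return locs, vals

pts = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
nrm = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
locs, vals = add_off_surface_points(pts, nrm)
```

With the signed values in place, an RBF can be fitted exactly as in the surface case above; without them, every fitted value would collapse to zero.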
Using an SDF is implicit modelling
Hopefully by now, you are starting to realize that Implicit Modelling is actually a way to estimate distances from a boundary. Now you might ask: "How does that work for grade data, or other numeric data?". This is a valid question that doesn't really change the idea described above. For the answer, consider that you are trying to model a grade shell. This shell is created by using a cut-off grade, let's say 5 g/t of some mineral. To create the boundary at 5 g/t we can apply a threshold or, better, subtract the cut-off from all values, so that all values below 5 g/t become negative and all values above 5 g/t stay positive. This is a bit like indicator modelling, but you should also see the similarity with the SDFs we mentioned for modelling domains from categorical data, as we are not changing the values to indicators.
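The cut-off subtraction is a one-liner; the made-up grades below just show the sign behaviour around the 5 g/t boundary.

```python
# Turn grades into signed values around a 5 g/t cut-off: subtract the
# cut-off, so the boundary sits at zero, higher grades are positive
# and lower grades negative. Grade values are invented for illustration.
import numpy as np

grades = np.array([1.2, 4.9, 5.0, 6.3, 12.0])  # g/t, made-up samples
cutoff = 5.0
signed = grades - cutoff  # approx. [-3.8, -0.1, 0.0, 1.3, 7.0]
```

The zero level of `signed` is now the grade shell, directly analogous to the zero level of the signed distances used for categorical domains.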
There is one more thing to consider though. In everyday life, we understand the concept of distance, e.g., how 1 cm or 1 m is defined, and how far that is from our boundary. For numeric data, this is not as straightforward. We cannot say that at 1 m from the boundary the grade will be 3 g/t. How the grade falls or rises away from the boundary is usually highly non-linear and rarely constant around all parts of our cut-off. That is where the similarity to our SDFs ends, and why a slightly different type of function is needed. However, it can generally be stated that methods used for numeric interpolation and estimation can also be used for modelling boundaries, whereas methods used to estimate boundaries cannot always be used to estimate numeric values like grades.
Why does it all matter?
Why did we mention all this stuff above, you might ask? What's the point? That is what we will get to now. RBFs have served a great purpose in geological modelling, but innovation there seems to have stalled, with no further developments in the last 20+ years. In those same 20+ years, however, developments in other areas have not been idle. This has led to a whole set of algorithms optimized for similar problems that are generally branded under Machine Learning (ML). We won't go into that too much here as it is part of another blog post. We just want to highlight how ML has resulted in other SDFs specifically focused on the issue of estimating boundaries from categorical data.
Great speed increase through clever math
No need to worry, we will not go into the math here. We just want to highlight a principle that underpins the enormously fast methods mentioned earlier.
Typically, signed data points are created from the drilling data, although other ways are possible as well. For now, just consider using drilling data to generate many points inside the unit and many points outside. Many of these points lie increasingly far from the boundary that we want to model, which in turn means that many of them are actually redundant.
To illustrate, let's consider a plane, which in a 2D projection becomes a line. Suppose we have one point 1 m from this line on one side and another point 1 m away on the other side. It is easy to see where the boundary will be: exactly in the middle between the two points (assuming, of course, they are exactly mirrored). If we then add another pair of points, 5 m away on either side, do they help us locate the boundary? Not really: the original two points already defined where the boundary is, so these additional points are actually redundant.
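The redundancy argument can be checked numerically. In this toy 1D version, the signed value at each point equals its offset from the boundary at x = 0; a least-squares line through the near pair already crosses zero at the boundary, and adding the mirrored far pair does not move that crossing at all.

```python
# Toy check: mirrored points 1 m either side of a boundary already pin
# down its location; adding mirrored points 5 m away changes nothing.
import numpy as np

def zero_crossing(x, d):
    """x-position where a least-squares line through (x, d) hits zero."""
    slope, intercept = np.polyfit(x, d, 1)
    return -intercept / slope

# Signed values equal the offset from the true boundary at x = 0.
near_only = zero_crossing(np.array([-1.0, 1.0]),
                          np.array([-1.0, 1.0]))
with_far = zero_crossing(np.array([-5.0, -1.0, 1.0, 5.0]),
                         np.array([-5.0, -1.0, 1.0, 5.0]))
# Both fits place the boundary at x = 0; the far points are redundant.
```

This is the property the fast methods exploit: points far from the boundary can be dropped without changing the fitted boundary.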
Application from Machine Learning
It is exactly this principle that has led to the incredible speeds observed in ML, not only in the fitting process (called training in ML, analogous to fitting an RBF), but also in the evaluations afterwards. This speed-up is only possible by rethinking the modelling process, using SDFs rather than traditional RBF-based implicit modelling. The distance values themselves carry a lot of intrinsic value that, when used properly, can greatly speed up the modelling process.
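One well-known ML method that embodies this principle, offered here as an illustrative analogue rather than the specific technique the post alludes to, is the support-vector machine: after training on inside/outside labels, only the points near the boundary survive as "support vectors", and every later evaluation involves just that smaller set instead of the full data.

```python
# Illustrative analogue: an SVM keeps only boundary-adjacent points
# ("support vectors"); far, redundant points are discarded, so both
# the model and its evaluation cost shrink. Synthetic circular data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(300, 2))
y = (np.linalg.norm(X, axis=1) > 1.0).astype(int)  # 1 = outside circle

clf = SVC(kernel="rbf").fit(X, y)
n_support = len(clf.support_)  # typically far fewer than the 300 inputs

inside_label = clf.predict([[0.0, 0.0]])[0]
outside_label = clf.predict([[1.8, 0.0]])[0]
```

The fitted model still separates inside from outside correctly, while carrying only a fraction of the original points, which is the redundancy-pruning idea from the plane example in action.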