Geo-spatial Data and Information

 

The first benefit of investigation is not increased confidence, but removal of false confidence.

Geo-spatial data is collected and transformed into information...

Geo-spatial information is attributed, organised and interrogated...

This information, properly presented, allows the visualisation and comprehension of complex 3D and temporal situations.

For Geo-spatial data to provide meaningful information or insights it must be:

  • Observed

  • Preserved

  • Processed

  • Analysed

  • Communicated

Not all geo-spatial data is created equal. The reliability of geo-spatial data therefore depends on:

  • The quality of the sampling.

  • The quality of the position.

  • The quality of the data attribution.

  • The relevance of the data.

Note: To visualise and measure your geo-spatial data within the wider context of the system domain, it must be accurately positioned in the correct coordinate reference system.
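As a small illustration of why the coordinate reference system matters, the sketch below (plain Python, standard library only) shows that a "degree" is not a fixed distance: treating latitude/longitude as planar coordinates distorts measurements away from the equator. The function name and the spherical-Earth radius are illustrative simplifications.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, r=6371000.0):
    """Great-circle distance in metres between two lat/lon points
    (spherical Earth approximation, mean radius ~6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# One degree of longitude spans ~111 km at the equator but only ~56 km at 60° N,
# so distances computed directly in degrees are not comparable across a site
print(round(haversine_m(0, 0, 0, 1) / 1000))    # -> 111
print(round(haversine_m(60, 0, 60, 1) / 1000))  # -> 56
```

This is why data should be projected into an appropriate metric coordinate reference system before distances, areas or interpolation weights are computed.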

Assumptions About Your Data

Everything is related to everything else, but near things are more related than distant things. (Tobler, 1970)

But....

Your data is rarely collected with the aim of providing good spatial statistics for the geo-statistician…

  • Budgetary constraints

  • Minimum required (for representative sample)

  • Scale of observations/measurements

  • Targeted with bias

Errors WILL propagate through the computational and analytical process unless your data:

  • Have been collected and analysed/measured in a fit and proper manner.

  • Have been pre-screened to identify blunders and outliers. When screening, consider:

    • Different sampling campaigns

    • Consistent sampling procedures

      • Signs: differences in reporting precision, sample numbering, missing samples

    • Have positions been verified? (Plot them on a map.)


Site Characterisation and its Variability

If the area has near-uniform data values, then you can expect accurate estimates/predictions.

If the area has highly variable data values, the chances of locally accurate estimates and predictions are poor.

This will affect all estimation methods used.


Site Characterisation:
How many samples are needed?


Each sample will have a cost impact:

  • Collection, analysis complexity, time, errors

  • Avoid unnecessary samples - a waste of resources

Population Size

  • Smaller populations allow uncertainty (outliers) to propagate into results

  • Larger populations increase complexity, time and confounding factors

What is the aim?

  • A larger MoE (margin of error) will require fewer samples

  • Higher confidence will require more samples
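The trade-off between margin of error, confidence level and sample count can be sketched with the standard sample-size formula n = (z·σ/MoE)². All figures below are purely hypothetical.

```python
import math

def samples_needed(sigma, moe, z=1.96):
    """Independent samples needed so the half-width of the confidence
    interval on the mean is <= moe (z = 1.96 for ~95% confidence)."""
    return math.ceil((z * sigma / moe) ** 2)

# Hypothetical site: standard deviation of 12 mg/kg in the measured variable
print(samples_needed(12, 3))           # MoE 3 mg/kg at 95% -> 62 samples
print(samples_needed(12, 6))           # doubling the MoE quarters the count -> 16
print(samples_needed(12, 3, z=2.58))   # ~99% confidence needs more -> 107
```

Note that the formula assumes independent observations; spatially autocorrelated samples carry less information each, so the effective sample size is smaller than the raw count.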

Site Characterisation:
Displaying and Prediction of the Results


How good are your map/results?

  • ‘A helpful qualitative display with questionable quantitative significance’ - Isaaks (1989)

  • What contouring method was used?

  • Calibration points (but consider sample costs)

Distribution and number of sample points

  • Type of sampling, targeted or random

  • Where is the seed point for sampling?

  • Understanding sampling bias

Do the results fit the expectations of geological and environmental understanding?

  • Limits of the data:

    • Better to interpolate than extrapolate the results

  • Are the anomalies real, or processing artefacts?

Contouring results using same data (red dots) but different methods

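A minimal sketch of why the gridding method changes the map: the inverse-distance-weighted (IDW) estimator below (hypothetical sample points and values) gives visibly different estimates at the same location when only its power parameter changes.

```python
import numpy as np

def idw(xy_known, z_known, xy_query, power=2.0):
    """Inverse-distance-weighted estimate at query points; `power`
    controls how quickly a sample's influence decays with distance."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)          # avoid division by zero at sample points
    w = 1.0 / d ** power
    return (w @ z_known) / w.sum(axis=1)

# Hypothetical samples (the "red dots") and a single query location
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vals = np.array([10.0, 20.0, 30.0, 40.0])
q = np.array([[0.25, 0.25]])

est_p1 = idw(pts, vals, q, power=1)[0]   # gentle weighting  -> ~20.5
est_p4 = idw(pts, vals, q, power=4)[0]   # nearest dominates -> ~11.4
print(round(est_p1, 1), round(est_p4, 1))
```

The same data, with the same honest intentions, yields different surfaces; the contouring method is itself an interpretive choice that should be reported.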

Basic Analysis
Univariate and Bivariate


Univariate Analysis

Provides simple statistical information

Visual representation

No spatial understanding (depiction)

Bivariate Analysis

Simple linear regression and more complex relationships

Are two variables correlated?

  • Correlation is not causation!
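The point can be sketched numerically. In the synthetic example below (all variable names and figures are hypothetical), rainfall and erosion are strongly correlated only because both respond to a shared driver, elevation, not because one causes the other.

```python
import numpy as np

rng = np.random.default_rng(0)
elevation = rng.uniform(0, 500, 200)                       # shared driver
rainfall = 800 + 1.2 * elevation + rng.normal(0, 30, 200)  # driven by elevation
erosion = 5.0 + 0.02 * elevation + rng.normal(0, 0.5, 200) # also driven by elevation

# Strong correlation, yet neither variable causes the other
r = np.corrcoef(rainfall, erosion)[0, 1]
print(round(r, 2))
```

A regression of erosion on rainfall here would fit well and predict badly outside the sampled conditions, because the true control (elevation) is absent from the model.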

Multivariate Analysis

Natural systems are expected to have more than one independent input

  • Rain + Geology + Time + Topography + ...

Use residual plots

In a regression model we should not be able to predict the error in any given observation

  • By analysing our residuals we can determine if they are consistent with random error or there is a systematic bias
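A minimal residual check in Python (synthetic data throughout): a straight line is fitted to data with mild curvature, and the lag-1 correlation of the residuals exposes the systematic bias that pure random error would not show.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
y = 2.0 + 0.5 * x + 0.05 * x**2 + rng.normal(0, 0.1, 100)  # mild curvature

b, a = np.polyfit(x, y, 1)          # straight-line fit ignores the x² term
resid = y - (a + b * x)

# Random error leaves residuals with no structure; a smooth pattern
# (high lag-1 correlation) flags systematic bias - the missing x² term
lag1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(round(lag1, 2))
```

A lag-1 correlation near zero is consistent with random error; a value near one, as here, says the model is systematically missing something.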


Autocorrelation

Autocorrelation in spatial data refers to the correlation of a variable with itself through space. It describes how similar data values are based on the distance and direction between them.

Why it is important:

  • Identify clustering, gradients, or randomness in spatial distributions

  • Strong positive autocorrelation implies nearby data points are good predictors for unknown values - a core idea behind kriging and contouring.

  • If autocorrelation is unexpectedly high or low, it may point to issues like:

    • Measurement bias

    • Systematic environmental variation

    • Over-sampling (duplicate or near-duplicate points)

Model Assumptions
Many statistical models assume observations are independent. Spatial autocorrelation violates this — meaning traditional stats (like regression) may give misleading results.


Geo-spatial Data
A Project Schema & Geospatial Uncertainty Curve

Geospatial projects often exist in complex 3D environments, both at and beneath the ground surface.

 

From project conception through execution to eventual completion, geo-spatial data and its transformation into credible information are poorly understood and often a side-lined facet of the project.

Because many projects begin with unjustified certainty, the first benefit of investigation is not increased confidence, but the removal of false confidence.

For successful outcomes, good-quality geo-spatial data will reduce uncertainty, dispel incorrect preconceptions and add value to projects that require this data as a key foundation element:

  • Knowledge gaps are identified and rectified through a single data measurement or series of data measurements.

  • This data is organised, attributed and transformed into information

  • This information yields insights from which informed decisions based on spatial knowledge can be made...

  • Reducing overruns and costly changes to plans that were originally based on false preconceptions


Acknowledgement is made to Pyrcz, Isaaks, Deutsch, Smith and others, on whose work many of the above themes are based.


A conceptual look at knowledge gained in an investigation

 

The idea is to treat the curve as a knowledge-gain function, where:

  • x = investigative input, effort, time, cost, or integrated investigation intensity

  • y = usable spatial knowledge, or defensible confidence in site understanding

Then dy/dx is the marginal knowledge gain per unit of investigative effort.

Interpretation by project stage

Early stage: low dy/dx

  • Desk study, initial assumptions, rough conceptual model.

  • At this stage, effort may not immediately produce much reliable spatial knowledge. Some effort is spent identifying what is not known. There may even be confusion reduction rather than true knowledge expansion.

  • This is an important point: early effort is still valuable, but the apparent rate of gain in defensible knowledge may be small.

Transitional stage: increasing dy/dx

  • Knowledge gaps are identified, investigation becomes targeted, measurements begin to answer the right questions.

  • This is where each added unit of effort may yield large reductions in uncertainty. This is often the most efficient part of the investigation.

Mature stage: peak dy/dx

  • The investigation is well-designed, the main controls on variability are understood, and data are being transformed into robust spatial interpretation.

  • This is the zone of maximum return on investigative effort.

Late stage: declining dy/dx

  • Additional sampling, modelling refinement, and monitoring still help, but each new increment contributes less than before.

  • This is the diminishing-returns zone.

A useful alternative perspective is that dy/dx does not depend merely on the quantity of effort but on its quality. More rigorously, dy/dx is high when the investigation is:

  • Correctly targeted

  • Spatially representative

  • Well-attributed

  • Properly analysed

  • Linked to the governing uncertainties

and low when effort is:

  • Ad hoc

  • Biased

  • Redundant

  • Poorly located

  • Not tied to the key conceptual uncertainties

This is probably the most important insight. The derivative is not just “how much work is being done,” but “how much useful understanding is being extracted from that work.”

The second derivative

A further useful idea is the second derivative, d²y/dx².

This indicates whether the rate of knowledge gain is accelerating or decelerating.

Conceptually:

  • d²y/dx² > 0: investigation is becoming more effective, often because the conceptual model is improving and sampling is becoming better targeted

  • d²y/dx² = 0: the inflection point, where the marginal gain dy/dx is at its maximum

  • d²y/dx² < 0: diminishing returns have begun

This can be very helpful if the aim is to argue for targeted investigation design rather than simply more investigation.
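These stages can be sketched numerically. Assuming, for illustration only, a logistic knowledge-gain curve, numpy's gradient function recovers dy/dx and d²y/dx², and the peak of dy/dx sits at the inflection point where d²y/dx² changes sign.

```python
import numpy as np

# Illustrative knowledge-gain curve: logistic in investigative effort x
x = np.linspace(0.0, 10.0, 1001)
y = 1.0 / (1.0 + np.exp(-(x - 5.0)))   # usable spatial knowledge, scaled 0..1

dy = np.gradient(y, x)     # dy/dx: marginal knowledge gain per unit effort
d2y = np.gradient(dy, x)   # d²y/dx²: is the gain accelerating or decelerating?

peak = x[np.argmax(dy)]    # effort level of maximum return
print(round(peak, 1))                 # -> 5.0, the inflection point
print(d2y[400] > 0, d2y[600] < 0)     # accelerating before it, decelerating after
```

The shape and the position of the inflection are assumptions of this sketch; on a real project the curve is qualitative, but the same logic identifies where added effort pays best.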

Area under the curve

The Spatial Certainty Curve can be viewed as a cumulative expression of spatial knowledge developed through investigation. Its gradient, dy/dx, represents the rate at which useful knowledge is gained as investigative effort increases.

The area under this rate curve represents the cumulative increase in spatial knowledge over a given range of effort, rather than the total possible knowledge.

Where spatial knowledge is expressed qualitatively rather than as a measured index, this relationship should be understood as a conceptual guide rather than a strict mathematical quantity.
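Continuing the illustrative logistic curve, the trapezoidal sum below shows that the area under the rate curve between two effort levels equals the knowledge gained over that span, y(b) - y(a), rather than the total possible knowledge.

```python
import numpy as np

# The same illustrative logistic curve; its gradient is the rate curve dy/dx
x = np.linspace(0.0, 10.0, 1001)
y = 1.0 / (1.0 + np.exp(-(x - 5.0)))
rate = np.gradient(y, x)

# Area under the rate curve between efforts a and b = knowledge gained there
a, b = 2.0, 8.0
m = (x >= a) & (x <= b)
gained = np.sum(np.diff(x[m]) * (rate[m][1:] + rate[m][:-1]) / 2)

print(round(gained, 2))   # matches y(8) - y(2), about 0.9 of the scaled range
```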


Important caution

A common assumption would be that knowledge always increases smoothly with effort. In reality that is not always true.

Sometimes there may be temporary negative effects in perceived confidence:

  • Early investigation may reveal that prior assumptions were wrong

  • Confidence may drop before reliable knowledge rises

  • The curve may therefore include a “false confidence collapse” before the main rise

This is actually a very valuable idea for site investigation, because many projects begin with unjustified certainty. In those cases the first benefit of investigation is not increased confidence, but removal of false confidence.

So in a more realistic conceptual model:

  • Apparent confidence may fall first

  • Defensible knowledge then rises

  • Later gains diminish

Albian Geo FZ LLC (47006383) Registered in the United Arab Emirates.  Updated April 2026
