
Geo-spatial Data and Information
The first benefit of investigation is not increased confidence, but removal of false confidence.
Geo-spatial data is collected and transformed into information...
Geo-spatial information is attributed, organised and interrogated...
This information, properly presented, allows the visualisation and comprehension of complex 3D and temporal situations.
For Geo-spatial data to provide meaningful information or insights it must be:
- Observed
- Preserved
- Processed
- Analysed
- Communicated
Not all geo-spatial data is created equal. The reliability of geo-spatial data therefore depends upon:
- The quality of the sampling
- The quality of the positioning
- The quality of the data attribution
- The relevance of the data
Note: To visualise and measure your geo-spatial data within the wider context of the system domain, it must be accurately positioned in the correct coordinate reference system.
Assumptions About Your Data
Everything is related to everything, but near things are more related. (Tobler 1970)
But...
Your data is rarely collected with the aim of providing good spatial statistics for the geo-statistician:
- Budgetary constraints
- Minimum required (for a representative sample)
- Scale of observations/measurements
- Targeted with bias
Errors WILL propagate throughout the computational and analytical process unless your data:
- Have been collected and analysed/measured in a fit and proper manner.
- Have been pre-screened to identify blunders and outliers. When screening, consider:
  - Different sampling campaigns
  - Consistent sampling procedures
  - Warning signs: differences in reporting precision, sample numbering, missing samples
- Have had their positions verified (plot them on a map).

Site Characterisation and its Variability
If the area has near-uniform data values, then you can expect accurate estimates/predictions.
If the area has highly variable data values, the chances of locally accurate estimates and predictions are poor.
This will affect all estimation methods used.

Site Characterisation:
How many samples are needed?

Each sample will have a cost impact:
- Collection, analysis complexity, time, errors
- Avoid unnecessary samples: a waste of resources
Population size:
- Smaller populations allow uncertainty (outliers) to migrate into results
- Larger populations increase complexity, time and confounding factors
What is the aim?
- A larger margin of error (MoE) will require fewer samples
- Higher confidence will require more samples
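
The MoE/confidence trade-off above can be sketched with the standard normal-approximation formula n = (z * sigma / MoE)^2. The sigma, MoE and z figures below are illustrative assumptions, not project values:

```python
import math

def required_sample_size(sigma, moe, z=1.96):
    """Minimum n for the sample mean's margin of error to be <= moe.

    Normal approximation n = (z * sigma / moe)**2, where sigma is an
    assumed population standard deviation and z is the critical value
    (1.96 ~ 95% confidence, 2.576 ~ 99% confidence).
    """
    return math.ceil((z * sigma / moe) ** 2)

# Illustrative figures only: sigma = 10 units, varying MoE and confidence.
print(required_sample_size(sigma=10.0, moe=5.0))           # larger MoE -> fewer samples
print(required_sample_size(sigma=10.0, moe=2.0))           # tighter MoE -> more samples
print(required_sample_size(sigma=10.0, moe=5.0, z=2.576))  # higher confidence -> more samples
```

Halving the acceptable MoE roughly quadruples the required sample count, which is why the target MoE should be agreed before the sampling budget is set.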
Site Characterisation:
Displaying and Predicting the Results

How good are your map/results?
- 'A helpful qualitative display with questionable quantitative significance' (Isaaks 1989)
- What contouring method was used?
- Calibration points (but consider sample costs)
Distribution and number of sample points:
- Type of sampling: targeted or random
- Where is the seed point for sampling?
- Understanding sampling bias
Do the results fit the expectations of geological and environmental understanding?
- Limits of the data: better to interpolate than to extrapolate the results
- Are the anomalies real or processing artefacts?
Contouring results using the same data (red dots) but different methods.
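
To illustrate how the choice of gridding method shapes a contour map, here is a minimal inverse-distance-weighting (IDW) sketch, one of many possible interpolators (alongside splines, triangulation and kriging). The sample coordinates and values are invented for illustration:

```python
def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (xi, yi, vi) tuples."""
    num = den = 0.0
    for xi, yi, vi in samples:
        d2 = (x - xi) ** 2 + (y - yi) ** 2
        if d2 == 0.0:
            return vi  # exact hit: honour the measured value
        w = d2 ** (-power / 2.0)  # weight = 1 / distance**power
        num += w * vi
        den += w
    return num / den

# Invented sample points (the "red dots"): (easting, northing, value)
samples = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 30.0)]
print(idw(5, 0, samples))  # estimate pulled towards the two nearest values
```

Changing the `power` parameter alone changes every contour, which is why the method (and its settings) should always be reported with the map.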

Basic Analysis
Univariate, Bivariate and Multivariate



Univariate Analysis
- Provides simple statistical information
- Visual representation
- No spatial understanding (depiction)
Bivariate Analysis
- Simple linear regression and more complex relationships
- Are two variables correlated?
  - Correlation is not causation!
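
A quick bivariate check is Pearson's r; remember that even r near 1 demonstrates association, not a causal mechanism. A self-contained sketch with invented rainfall/runoff readings:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented readings: near-linear relationship, so r is close to +1.
rain = [1, 2, 3, 4, 5]
runoff = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(pearson_r(rain, runoff), 3))
```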
Multivariate Analysis
- Natural systems are expected to have more than one independent input
  - Rain + geology + time + topography + ...
- Use residual plots
- In a regression model we should not be able to predict the error in any given observation
  - By analysing the residuals we can determine whether they are consistent with random error or reveal a systematic bias
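
The residual check above can be sketched in a few lines: fit a straight line by ordinary least squares, then look for structure in the residuals. The data below are invented with a curved trend, so the residual signs form a systematic pattern rather than random noise:

```python
def fit_line(xs, ys):
    """Ordinary least squares slope and intercept for y = m*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return m, my - m * mx

# Invented observations following a roughly quadratic trend.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.2, 3.9, 9.1, 15.8, 25.2, 36.1]
m, b = fit_line(xs, ys)
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
# The sign pattern (+, -, -, -, -, +) reveals curvature the straight line
# cannot capture: the residuals are not consistent with random error.
print([round(r, 1) for r in residuals])
```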

Autocorrelation
Autocorrelation in spatial data refers to the correlation of a variable with itself through space. It describes how similar data values are based on the distance and direction between them.
Why it is important:
- It identifies clustering, gradients, or randomness in spatial distributions.
- Strong positive autocorrelation implies nearby data points are good predictors for unknown values, a core idea behind kriging and contouring.
- If autocorrelation is unexpectedly high or low, it may point to issues such as:
  - Measurement bias
  - Systematic environmental variation
  - Over-sampling (duplicate or near-duplicate points)
- Model assumptions: many statistical models assume observations are independent. Spatial autocorrelation violates this assumption, meaning traditional statistics (such as regression) may give misleading results.
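
Spatial autocorrelation is commonly summarised with Moran's I: positive for smooth gradients or clustering, near zero for spatial randomness, negative for checkerboard-like alternation. A minimal sketch with an invented four-point transect and a simple chain-adjacency weight matrix:

```python
def morans_i(values, weights):
    """Moran's I for values at n locations with an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Invented four-point transect; weights encode neighbours along a line.
weights = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]
print(morans_i([1.0, 2.0, 3.0, 4.0], weights))  # smooth gradient: positive I
print(morans_i([1.0, 2.0, 1.0, 2.0], weights))  # alternating values: negative I
```

Real analyses use richer weight schemes (distance bands, k-nearest neighbours) via libraries such as PySAL, but the statistic itself is this simple.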


Geo-spatial Data
A Project Schema & Geospatial Uncertainty Curve
Geospatial projects often exist in complex 3D environments both at, and beneath, the ground surface.
From project conception through to execution and eventual completion, geo-spatial data and its transformation into credible information is poorly understood and often a side-lined facet of the project.
Because many projects begin with unjustified certainty, the first benefit of investigation is not increased confidence but the removal of false confidence.
For successful outcomes, good-quality geo-spatial data will reduce uncertainty, dispel incorrect preconceptions and add value to projects that require this data as a key foundation element:
- Knowledge gaps are identified and rectified through a single data measurement or a series of data measurements.
- These data are organised, attributed and transformed into information.
- This information yields insights from which informed decisions based on spatial knowledge can be made.
- This reduces overruns and costly changes to plans that were originally based on false preconceptions.

Acknowledgement is made to Pyrcz, Isaaks, Deutsch, Smith and others, on whose work many of the above themes are based.

A conceptual look at knowledge gained in an investigation
The idea is to treat the curve as a knowledge-gain function, where:
- x = investigative input: effort, time, cost, or integrated investigation intensity
- y = usable spatial knowledge, or defensible confidence in site understanding
Then dy/dx is the marginal knowledge gain per unit of investigative effort.
Interpretation by project stage
Early stage: low dy/dx
- Desk study, initial assumptions, rough conceptual model.
- At this stage, effort may not immediately produce much reliable spatial knowledge. Some effort is spent identifying what is not known. There may even be confusion reduction rather than true knowledge expansion.
- This is an important point: early effort is still valuable, but the apparent rate of gain in defensible knowledge may be small.
Transitional stage: increasing dy/dx
- Knowledge gaps are identified, investigation becomes targeted, and measurements begin to answer the right questions.
- Here each added unit of effort may yield large reductions in uncertainty; this is often the most efficient part of the investigation.
Mature stage: peak dy/dx
- The investigation is well designed, the main controls on variability are understood, and data are being transformed into robust spatial interpretation.
- This is the zone of maximum return on investigative effort.
Late stage: declining dy/dx
- Additional sampling, modelling refinement and monitoring still help, but each new increment contributes less than before.
- This is the diminishing-returns zone.
A useful alternative perspective is that dy/dx does not depend merely on the quantity of effort but on its quality. More rigorously, dy/dx is high when investigation is:
- Correctly targeted
- Spatially representative
- Well-attributed
- Properly analysed
- Linked to the governing uncertainties
and low when effort is:
- Ad hoc
- Biased
- Redundant
- Poorly located
- Not tied to the key conceptual uncertainties
This is probably the most important insight: the derivative is not just "how much work is being done" but "how much useful understanding is being extracted from that work."
The Second Derivative
A further useful idea is the second derivative, d2y/dx2, which indicates whether the rate of knowledge gain is accelerating or decelerating.
Conceptually:
- d2y/dx2 > 0: investigation is becoming more effective, often because the conceptual model is improving and sampling is becoming better targeted
- d2y/dx2 = 0: the point of maximum rate of knowledge gain, at the inflection region
- d2y/dx2 < 0: diminishing returns have begun
This can be very helpful if the aim is to argue for targeted investigation design rather than simply more investigation.
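
Taking an illustrative logistic curve as a stand-in for the knowledge-gain function (an assumption, since the real curve is qualitative), the sign of the second derivative separates the stages described above:

```python
import math

def knowledge(x, k=1.0, x0=5.0):
    """Illustrative logistic stand-in for the certainty curve (rises 0 -> 1)."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

def dydx(x, h=1e-5):
    """Central-difference estimate of the marginal knowledge gain."""
    return (knowledge(x + h) - knowledge(x - h)) / (2 * h)

def d2ydx2(x, h=1e-4):
    """Central-difference estimate of the second derivative."""
    return (knowledge(x + h) - 2 * knowledge(x) + knowledge(x - h)) / h ** 2

print(d2ydx2(2.0) > 0)   # early/transitional stage: gain accelerating
print(d2ydx2(8.0) < 0)   # late stage: diminishing returns
print(dydx(5.0))         # marginal gain peaks at the inflection point x0
```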
Area under the curve
The Spatial Certainty Curve can be viewed as a cumulative expression of spatial knowledge developed through investigation. Its gradient, dy/dx, represents the rate at which useful knowledge is gained as investigative effort increases.
The area under this rate curve represents the cumulative increase in spatial knowledge over a given range of effort, rather than the total possible knowledge.
Where spatial knowledge is expressed qualitatively rather than as a measured index, this relationship should be understood as a conceptual guide rather than a strict mathematical quantity.
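
Under the same illustrative logistic assumption, the area under the rate curve dy/dx between two effort levels equals the knowledge gained over that range, which a simple trapezoidal integration confirms:

```python
import math

def rate(x, k=1.0, x0=5.0):
    """dy/dx of an illustrative logistic curve y = 1/(1 + exp(-k*(x - x0)))."""
    s = 1.0 / (1.0 + math.exp(-k * (x - x0)))
    return k * s * (1.0 - s)

def trapz(f, a, b, n=1000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

# Area under the rate curve over an effort range = knowledge gained in that
# range, i.e. the difference in curve height y(10) - y(0).
gain = trapz(rate, 0.0, 10.0)
print(round(gain, 4))
```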
Important caution
A common assumption is that knowledge always increases smoothly with effort. In reality, that is not always true.
Sometimes there may be temporary negative effects on perceived confidence:
- Early investigation may reveal that prior assumptions were wrong
- Confidence may drop before reliable knowledge rises
- The curve may therefore include a "false confidence collapse" before the main rise
This is a very valuable idea for site investigation, because many projects begin with unjustified certainty. In those cases the first benefit of investigation is not increased confidence but the removal of false confidence.
So in a more realistic conceptual model:
- Apparent confidence may fall first
- Defensible knowledge then rises
- Later gains diminish
