Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save maxcollombin/70a9fd6d22fdc39f2c5b896c07137c49 to your computer and use it in GitHub Desktop.
Save maxcollombin/70a9fd6d22fdc39f2c5b896c07137c49 to your computer and use it in GitHub Desktop.
Spatial SQL to return the id of the closest geometry from table_b to each geometry in table_a
SELECT
a.id,
-- Note that on line 13 the results are grouped by a.id so if you want to return the geometry from table_a,
-- here you can use the function ANY_VALUE(a.geom) (as you cannot group by geometry in BigQuery)
ARRAY_AGG(b.id ORDER BY ST_Distance(a.geog, b.geog) LIMIT 1) [ORDINAL(1)] AS neighbor_id
-- Here we return the id of the closest geometry from table_b to each geometry in table_a.
-- If you want to include more fields from table_b here you can use STRUCT(b.id, b.second_field, b.third_field) etc.
FROM
table_a a
JOIN
table_b b
ON
ST_DWithin(a.geog, b.geom, 100) -- Here we use ST_DWithin to limit the results to within 100m.
-- The smaller the search radius, the faster the query will run, however the higher the liklihood
-- that we end up with geometries with no nearest neighbour. This can be optimised with prior knowledge of the dataset.
GROUP BY
a.id
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment