Skip to content

Wrong result qualify in Tasty Bytes - Zero to Snowflake - Geospatial  #121

@arcuellar88

Description

@arcuellar88

I am following the tutorial Tasty Bytes - Zero to Snowflake - Geospatial with my own data and I am not getting the right answer in Step 4. I am missing one comparison. I have 5 places and I am only getting 9 distances. I should get 10.

https://github.com/Snowflake-Labs/sf-samples/blame/1ca85b477fa4beb3131c41cc1cfb6286df8a1c20/samples/tasty_bytes/FY25_Zero_To_Snowflake/tb_geospatial.sql#L157

Qualify window function filter doesn't work for the first element as Lag is null (something <> null is not true)

WITH _top_10_locations AS
(
SELECT TOP 10
location_id,
ST_MAKEPOINT(longitude, latitude) AS geo_point,
SUM(price) AS total_sales_usd
FROM analytics.orders_v
WHERE primary_city = 'Paris'
GROUP BY location_id, latitude, longitude
ORDER BY total_sales_usd DESC
)
SELECT
a.location_id,
b.location_id,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1609,2) AS geography_distance_miles,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1000,2) AS geography_distance_kilometers
FROM _top_10_locations a
JOIN _top_10_locations b
ON a.location_id <> b.location_id -- avoid calculating the distance between the point itself
QUALIFY a.location_id <> LAG(b.location_id) OVER (ORDER BY geography_distance_miles) -- avoid duplicate: a to b, b to a distances
ORDER BY geography_distance_miles;

My suggestion:

WITH _top_10_locations AS
(
SELECT TOP 10
location_id,
ST_MAKEPOINT(longitude, latitude) AS geo_point,
SUM(price) AS total_sales_usd
FROM analytics.orders_v
WHERE primary_city = 'Paris'
GROUP BY location_id, latitude, longitude
ORDER BY total_sales_usd DESC
)
SELECT
a.location_id,
b.location_id,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1609,2) AS geography_distance_miles,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1000,2) AS geography_distance_kilometers
FROM _top_10_locations a
JOIN _top_10_locations b
ON a.location_id < b.location_id -- avoid calculating the distance between the point itself. Also avoid duplicate: a to b, b to a
ORDER BY geography_distance_miles;

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions