BlueriverQuantLabs/unique-id

 
 

unique-id

OVERVIEW

The goal is to create an accurate, atomic unique identifier for every physical structure in the world. Unique ID aims to be the single source of truth on property identity.

ROADMAP

Phase 1

During Phase 1 of the project we will identify relevant address data sources and integrate them into a single source of truth data store that represents validated, real-world addresses across the United States. First we will build the data infrastructure to handle addresses across the U.S. Once a proof of concept on a representative sample of addresses succeeds, we will build out the system for the entire U.S. (and eventually, the world).
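As a rough illustration of what an identifier keyed to a validated address could look like, the sketch below derives a deterministic UUID from a normalized address. The ID scheme, the namespace value, and the function names `normalize_address` and `address_id` are assumptions made for this example only; the project has not settled on a scheme, and real address normalization needs far more than whitespace and case folding.

```python
import re
import uuid

# Hypothetical namespace for this sketch; any fixed UUID would serve.
UNIQUE_ID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "unique-id.example")


def normalize_address(number, street, city, region, postcode):
    """Collapse whitespace and case so trivially different spellings match."""
    parts = [number, street, city, region, postcode]
    cleaned = [re.sub(r"\s+", " ", p.strip()).upper() for p in parts]
    return "|".join(cleaned)


def address_id(number, street, city, region, postcode):
    """Derive a deterministic UUID (version 5) from the normalized address."""
    key = normalize_address(number, street, city, region, postcode)
    return uuid.uuid5(UNIQUE_ID_NAMESPACE, key)
```

Because the ID is a pure function of the normalized address, re-running the pipeline on the same input yields the same ID. The trade-off is that any change to the normalization rules changes every ID, so a production scheme would likely need versioning or a lookup table.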

Phase 2

In Phase 2, we will use the single source of truth database created during Phase 1 to pair address data with satellite and other non-traditional data sources.

DATA SOURCES

OpenAddresses has 478M addresses globally, freely available for download. Crowd-sourced.

Potential Use

OpenAddresses is a good starting data source.

To Do

  • Load data files into "raw" data sources db
  • Write ETL script to load into "raw" data sources DB, scrub/clean data as needed, and load into single source of truth
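The two To Do items above can be sketched as a small two-stage load. This is a minimal sketch, not the project's ETL: it uses SQLite from the standard library as a stand-in for both databases, and the table names, the column names (`NUMBER`, `STREET`, `CITY`, `REGION`, `POSTCODE`), and the cleaning rules are all assumptions that should be checked against the actual download.

```python
import csv
import sqlite3


def load_raw(conn, csv_file, source):
    """Stage 1: copy every CSV row into the raw table without cleaning it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_addresses "
        "(source TEXT, number TEXT, street TEXT, city TEXT, region TEXT, postcode TEXT)"
    )
    rows = [
        (source, r.get("NUMBER", ""), r.get("STREET", ""), r.get("CITY", ""),
         r.get("REGION", ""), r.get("POSTCODE", ""))
        for r in csv.DictReader(csv_file)
    ]
    conn.executemany("INSERT INTO raw_addresses VALUES (?, ?, ?, ?, ?, ?)", rows)
    return len(rows)


def clean(value):
    """Trim, collapse internal whitespace, and upper-case a field."""
    return " ".join((value or "").split()).upper()


def load_truth(conn):
    """Stage 2: clean raw rows, skip incomplete ones, and de-duplicate."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS addresses "
        "(number TEXT, street TEXT, city TEXT, region TEXT, postcode TEXT, "
        "UNIQUE (number, street, city, region, postcode))"
    )
    raw = conn.execute(
        "SELECT number, street, city, region, postcode FROM raw_addresses"
    ).fetchall()
    for row in raw:
        cleaned = tuple(clean(v) for v in row)
        if not cleaned[0] or not cleaned[1]:
            continue  # assumption: a usable address needs a number and a street
        conn.execute("INSERT OR IGNORE INTO addresses VALUES (?, ?, ?, ?, ?)", cleaned)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM addresses").fetchone()[0]
```

Keeping the raw table untouched means the cleaning rules can be changed and re-run without downloading the source files again.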

Potential Use

Primary usage will be to store addresses in the single source of truth DB. The geocoding services seem cheaper than Google's, so they might be a good commercial option to fill in the gaps.

Data Download

Commercial Options

To Do

  • Download data from openstreetmap.org and review for features, completeness, and accuracy.
  • Write ETL script to load into "raw" data sources DB, scrub/clean data as needed, and load into single source of truth  

TBD # of data points

Potential Use

  • Store Places API data as a data source
  • Geocoding & Reverse Geocoding API to fill in gaps in other datasets

Documentation

To Do

  • Model out pricing/cost for use case
  • If cost effective compared to free/open data sources, negotiate Enterprise License so we can cache more than 30 days of data
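One way to start the pricing model is a small function that turns an expected request volume into a monthly bill and compares providers. This is only a sketch: the pricing structure (a flat price per 1,000 requests plus an optional free allowance) and the names `monthly_cost` and `cheapest` are assumptions, and every number passed in is a placeholder to be replaced with figures from the providers' current price sheets.

```python
def monthly_cost(requests, price_per_1000, free_requests=0):
    """Estimate a monthly bill for a pay-per-request API.

    All inputs are placeholders; real tiered pricing may need a richer model.
    """
    billable = max(0, requests - free_requests)
    return billable / 1000 * price_per_1000


def cheapest(requests, providers):
    """Return (name, cost) for the cheapest provider at the given volume.

    `providers` maps a provider name to keyword arguments for monthly_cost.
    """
    costs = {
        name: monthly_cost(requests, **terms)
        for name, terms in providers.items()
    }
    name = min(costs, key=costs.get)
    return name, costs[name]
```

For example, `cheapest(150_000, {"a": {"price_per_1000": 5.0, "free_requests": 50_000}, "b": {"price_per_1000": 4.0}})` compares two made-up price plans at 150,000 requests per month. Free and open data sources can be modeled the same way with a price of zero plus whatever hosting cost applies.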

TBD # of data points

Potential Use

  • Store listing address data for single source of truth db.
  • Store reviews and other unstructured data as a potential indicator of misclassified buildings, etc.

Documentation

To Do

  • Review documentation/Yelp TOS to understand how much data we can legally cache
  • Model out pricing/cost for use case

A listing database of launched satellites

TBD # of data points

Potential Use

Find inexpensive and suitable satellite data sources for use in Phase 2.

To Do

Research which satellite data will provide the most value to Phase 2 of the project.

MISC TOOLS
