RIFFAI-org/Universal-platform-architecture-and-plan-for-large-scale-data-handling---solar-data-focused

Universal-platform-architecture-and-plan-for-large-scale-data-handling---solar-data-focused

Full-stack design for platform development, using the tools most familiar to the sat data team. The goal is to create a scalable solution for visualizing big datasets (over 10 million records).

The design is broken into multiple parts.

Back end

  1. Batching
  2. Processing
  3. Layer creation
  4. Storage

Front end

  1. Web hosting
  2. Map control
  3. Map design
  4. Website

Universal platform architecture and plan for large-scale data handling

Backend

1 and 2. Batching and Processing

Batching is key to keeping the project scope in check. Right now there are around 10 million polygons in the dataset. If each has 10 attributes, that is 100 million attribute values that all need to be calculated. If that scales up to a whole region or to multiple regions (for example all of Japan and Southeast Asia, which is not unrealistic), that could be over 1 billion attribute values. So the data needs to be split into batches of 500k records, all with a key ID to connect them back together and a standard layout.

Any processing or change to the underlying data (adding attributes and features) should be conducted on the batch layer, where each unit is a fixed size. Calculate one batch at a time until the process is completed. Do not add or remove features, or adjust the model, across the whole dataset at once. (It is also recommended to use the Polars library and a high-core-count CPU when modifying this layer.) This should keep the model scalable.
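The batching rule above can be sketched in a few lines. This is only a minimal illustration using the Python standard library, not the project's pipeline: the record layout, the `key_id` and `area_m2` field names, and the `add_area_class` step are all assumptions made up for the example. It shows the three ideas from the text: fixed-size batches, processing one batch at a time, and rejoining results on the key ID.

```python
from typing import Callable, Dict, Iterable, Iterator, List

BATCH_SIZE = 500_000  # batch size named in the text


def split_into_batches(records: Iterable[dict],
                       batch_size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size records."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # last, possibly smaller, batch
        yield batch


def process_in_batches(records: Iterable[dict],
                       step: Callable[[List[dict]], List[dict]],
                       batch_size: int = BATCH_SIZE) -> Dict[str, dict]:
    """Run step on one batch at a time, then rejoin the results on key_id."""
    merged: Dict[str, dict] = {}
    for batch in split_into_batches(records, batch_size):
        for record in step(batch):
            merged[record["key_id"]] = record
    return merged


def number_of_batches(total_records: int, batch_size: int = BATCH_SIZE) -> int:
    """Ceiling division: how many batches a dataset needs."""
    return -(-total_records // batch_size)


def add_area_class(batch: List[dict]) -> List[dict]:
    """Hypothetical processing step: add one derived attribute per record."""
    return [
        {**r, "area_class": "large" if r["area_m2"] >= 100 else "small"}
        for r in batch
    ]


# Tiny demo with a batch size of 2 instead of 500k.
demo_records = [{"key_id": f"rec-{i}", "area_m2": 50 * i} for i in range(5)]
demo_result = process_in_batches(demo_records, add_area_class, batch_size=2)
print(number_of_batches(10_000_000))  # 20 batches for the current dataset
```

With 10 million polygons and 500k records per batch, the current dataset would need 20 batches, and each processing step only ever holds one of them in memory.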

3 and 4. Layer Creation and Storage

Because of the detail needed to see certain features, such as rooftops and trees, there are a lot of features to visualize that will only be visible when zoomed in. Other features, such as building details or asset details (like tree type), will crowd the screen unless only one or two are shown at a time. At the zoomed-out level, only general trends need to be seen (for example, where most buildings are located). To prevent wasting processing time and to maximize speed, the data will be split into layers.

  a. Layer 3 (heat map): aggregates all polygons into simple points, then aggregates those points further, to show the rough intensity of where the assets are located. (Will be vectorized.)
  b. Layer 2 (polygons): only shown when individual assets are visible, for example at 1,000 m above the surface or lower. (Will be vectorized.)
  c. Layer 1: triggered once the user is below 1,000 m and clicks on one asset.
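The heat-map step (polygons reduced to points, then aggregated further) could look roughly like the sketch below. It is an assumption-heavy illustration: polygons are plain lists of `(x, y)` tuples, the point for each polygon is the mean of its vertices (an approximation, not a true area-weighted centroid), and the second aggregation is a simple square grid whose `cell_size` is a made-up parameter.

```python
import math
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_to_point(vertices: List[Point]) -> Point:
    """Reduce a polygon to one point: the mean of its vertices.

    This approximates the centroid; it is not area-weighted.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def heat_map_counts(polygons: List[List[Point]], cell_size: float) -> Counter:
    """Count how many polygon points fall into each square grid cell."""
    counts: Counter = Counter()
    for vertices in polygons:
        x, y = polygon_to_point(vertices)
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts


# Three small square "buildings"; two are close together.
demo_polygons = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(1, 1), (3, 1), (3, 3), (1, 3)],
    [(10, 10), (12, 10), (12, 12), (10, 12)],
]
demo_counts = heat_map_counts(demo_polygons, cell_size=5)
print(dict(demo_counts))  # {(0, 0): 2, (2, 2): 1}
```

Each grid cell then carries a single count, so the zoomed-out layer stays small no matter how many polygons sit underneath it.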

All three of these layers will be stored in S3 after processing. The map and website can then call on them when needed. This keeps all of the processing in the batching phase.

S3 storage guidelines for database name, asset ID name (key ID), and batch order

Because of the amount of data that needs to be queried and the amount of data that needs to be split in the batch process, this system is designed to be fast to query, to give each ID a unique structure, and to be human-readable for filing purposes. A query first looks at the B00-0000 part of the ID (type, region, and version), which points to the correct database. It then looks at the 000 part, which points to the correct batch. That leaves only one dataset of 500k records to search through, instead of tens of millions, hundreds of millions, or billions.

Database ID naming conventions

Overall structure: B00-0000-000-0000000

  • B = type
  • 00 = region
  • 0000 = version
  • 000 = batch
  • 0000000 = individual building/asset number
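As a hedged illustration of this structure, the sketch below parses and formats an ID with a regular expression. The `AssetId` class and `parse_asset_id` function are hypothetical names, and allowing any uppercase letter as the type code is an assumption (the text only shows `B`).

```python
import re
from dataclasses import dataclass

# Type letter, 2-digit region, 4-digit version, 3-digit batch, 7-digit asset.
# Assumption: the type code is a single uppercase letter.
_ID_PATTERN = re.compile(r"([A-Z])(\d{2})-(\d{4})-(\d{3})-(\d{7})")


@dataclass(frozen=True)
class AssetId:
    type_code: str
    region: int
    version: int
    batch: int
    asset: int

    def __str__(self) -> str:
        # Zero-pad every field back to its fixed width.
        return (f"{self.type_code}{self.region:02d}-{self.version:04d}"
                f"-{self.batch:03d}-{self.asset:07d}")


def parse_asset_id(text: str) -> AssetId:
    """Split an ID such as 'B01-0002-003-0000456' into its five fields."""
    match = _ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid asset ID: {text!r}")
    type_code, region, version, batch, asset = match.groups()
    return AssetId(type_code, int(region), int(version), int(batch), int(asset))


demo_id = parse_asset_id("B01-0002-003-0000456")
print(demo_id.region, demo_id.version, demo_id.batch, demo_id.asset)  # 1 2 3 456
print(str(demo_id))  # B01-0002-003-0000456
```

Rejecting malformed IDs at parse time is one way to keep duplicates and mis-filed records out of the batches.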

File naming convention

The second purpose is to avoid overlapping the same data. We do not want duplicates, so the structure needs to be clear and each name unique.

  • Group name = (human readable text)-B00-0000
  • File name = (Batch000)-000
  • Asset ID (key ID) = B00-0000-000-0000000
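The two-step lookup (database first, then batch) and the naming templates can be sketched as below. This is one possible reading of the templates, not a confirmed implementation: the function names, the `label` arguments, and the nested-dictionary `store` standing in for S3 are all assumptions for illustration.

```python
from typing import Dict, Tuple


def route(asset_id: str) -> Tuple[str, str]:
    """Split an asset ID into (database code, batch code).

    'B01-0002-003-0000456' -> ('B01-0002', '003')
    """
    type_region, version, batch, _asset = asset_id.split("-")
    return f"{type_region}-{version}", batch


def group_name(label: str, database_code: str) -> str:
    """Group name = (human readable text)-B00-0000."""
    return f"{label}-{database_code}"


def file_name(label: str, batch_code: str) -> str:
    """File name = (human readable batch label)-000. Assumed reading."""
    return f"{label}-{batch_code}"


def find_asset(store: Dict[str, Dict[str, Dict[str, dict]]],
               asset_id: str) -> dict:
    """Look up one record by touching only its own database and batch."""
    database_code, batch_code = route(asset_id)
    batch = store[database_code][batch_code]  # one batch of up to 500k records
    return batch[asset_id]


# In-memory stand-in for the stored batches (hypothetical data).
demo_store = {
    "B01-0002": {
        "003": {"B01-0002-003-0000456": {"roof_area_m2": 120.0}},
    },
}
print(route("B01-0002-003-0000456"))  # ('B01-0002', '003')
print(find_asset(demo_store, "B01-0002-003-0000456"))
```

Because the database and batch codes are read straight from the ID, the lookup never has to scan other batches.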


Front end

  1. Web hosting
    • Manages requests, and operates only between the map, the user, the website, and S3 data storage.

2 and 3. Map control and design

  • Calls all data back from S3. The data that is shown is dictated by the level of zoom each user selects. This is to save memory.

  • Use open map as the base map.
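The zoom rule for the map control can be sketched as a small function. This is only an illustration of the 1,000 m threshold described for the layers: `select_layer` and its parameters are hypothetical names, and treating exactly 1,000 m as "zoomed in" is an assumption.

```python
from typing import Optional

POLYGON_MAX_ALTITUDE_M = 1000.0  # threshold named in the layer description


def select_layer(altitude_m: float,
                 clicked_asset_id: Optional[str] = None) -> int:
    """Return which layer (3, 2, or 1) the map should request.

    3 = heat map, 2 = polygons, 1 = detail for a single clicked asset.
    Assumption: an altitude of exactly 1,000 m counts as zoomed in.
    """
    if altitude_m > POLYGON_MAX_ALTITUDE_M:
        return 3  # zoomed out: clicks do not open asset detail
    if clicked_asset_id is not None:
        return 1  # zoomed in and an asset was clicked
    return 2      # zoomed in, nothing clicked


print(select_layer(5000))                          # 3
print(select_layer(800))                           # 2
print(select_layer(800, "B01-0002-003-0000456"))   # 1
```

Only the selected layer needs to be fetched, which is what keeps memory use low on the client.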

  4. Website
    • Contains the map, as well as a description of how to operate it.
