From 069dfe84667372a84cf2ea7424e4035f380e7710 Mon Sep 17 00:00:00 2001
From: Jin Whan Bae
Date: Wed, 13 Sep 2017 11:16:23 -0500
Subject: [PATCH 1/4] cep27 rough draft

---
 source/cep/cep27.rst | 260 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 260 insertions(+)
 create mode 100644 source/cep/cep27.rst

diff --git a/source/cep/cep27.rst b/source/cep/cep27.rst
new file mode 100644
index 000000000..0975de3d6
--- /dev/null
+++ b/source/cep/cep27.rst
@@ -0,0 +1,260 @@
+CEP 27 - |Cyclus| Database Restructuring
+********************************************
+
+:CEP: 27
+:Title: |Cyclus| Database Restructuring
+:Last-Modified: 2017-09-11
+:Author: Dr. Tony Scopes & Jin whan Bae
+:Status: Draft
+:Type: Standards Track
+:Created: 2013-09-11
+
+Abstract
+============
+
+This CEP proposes to restructure the |cyclus| output database structure in order to
+reduce the number of tables and redundancy of data, and ultimately reduce the number
+of `joins' required for data analysis. Doing so would reduce the computing time
+for end-user analysis, and allow for a clearer, more concise output database.
+
+
+Motivation
+==========
+The current output database requires the user to join multiple tables to acquire
+meaningful material data, such as quantity and composition. This causes long
+analysis computing times and confusion for the user.
+
+
+Rationale
+=========
+The proposed restructure aims to reduce the number of tables the user has to query
+for analysis. This can be done by two methods:
+1. Combine redundant tables
+2. Reduce a table (`Compositions` table) into a column with variable-type map.
+
+
+
+Specification \& Implementation
+===============================
+The following tables that are currently in output are considered for editing:
+1. Compositions
+2. Transactions
+3. Recipes
+4. ExplicitInventory
+5. ExplicitInventoryCompact
+6. Info
+7. InfoExplicitInv
+8. ResCreators
+9. Resources
+
+
+Material and Product
+--------------------
+
+Currently, both **Material** and **Product** are in the Resources Table.
+The internal state of **Material** is stored in **Compositions**, and
+the internal state of **Product** is stored in **Products** table.
+This requires the user to make joins to acquire the internal state
+of the resources.
+
+We can avoid unnecessary joins by creating a **Materials** and
+**Products** table, with the internal state (composition and quality)
+as a column.
+
+In short, we propose to replace **Compositions**, **Products**, and
+**Resources** table with **Materials** and **Products** Table. In the
+process, the **QualID** column would be removed.
+
+Currently:
+
+============ ==========
+       Resources
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Resourceid   int
+ObjId        int
+Type         string
+TimeCreated  int
+Quantity     double
+Units        string
+QualId       int
+Parent1      int
+Parent2      int
+============ ==========
+
+
+
+============ ==========
+       Products
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Qualid       int
+Quality      string
+============ ==========
+
+
+
+
+============ ==========
+      Compositions
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Qualid       int
+NucId        int
+MassFrac     double
+============ ==========
+
+Would be restructured to:
+
+
+============ ==========
+       Materials
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Resourceid   int
+ObjId        int
+TimeCreated  int
+Parent1      int
+Parent2      int
+Units        string
+Quantity     double
+Composition  map
+============ ==========
+Where map would have
+
+============ ==========
+       Products
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Resourceid   int
+ObjId        int
+TimeCreated  int
+Parent1      int
+Parent2      int
+Units        string
+Quantity     double
+Quality      string
+============ ==========
+
+
+Also, since **QualID** is removed, the **Recipes** Table
+also needs to be edited:
+
+============ ==========
+        Recipes
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Recipes      string
+Composition  map
+============ ==========
+
+
+Transactions
+------------
+The transactions table would be modified
+to have a flag for weather the commodity is
+a material or a product.
+
+IS THIS NEEDED??
+
+============ ==========
+      Transactions
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+TransactionId int
+SenderId     int
+ReceiverId   int
+Resourceid   int
+Commodity    string
+Time         int
+============ ==========
+
+============ ==========
+      Transactions
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+TransactionId int
+SenderId     int
+ReceiverId   int
+ResourceType int(bool)
+Resourceid   int
+Commodity    string
+Time         int
+============ ==========
+
+
+Rescreators
+-----------
+Along with **Transactions**, the **Rescreators**
+table would need another column, ResourceType:
+
+============ ==========
+      Rescreators
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Resourceid   int
+AgentId      int
+ResourceType int(bool)
+============ ==========
+
+
+Merge ExplicitInventory & ExplicitInventoryCompct
+-------------------------------------------
+The **ExplicitInventory** table and **ExplicitInventoryCompact**
+table should be merged to a single table, called **Inventories**,
+with the following columns:
+
+============ ==========
+      Inventories
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Agentid      int
+Time         int
+InventoryName string
+Quantity     double
+Composition  map
+============ ==========
+
+
+Merge Info & InfoExplicitInv
+----------------------------
+We saw little reason to separate the two tables.
+Combining them would not significantly improve anything,
+but would reduce the number of tables created.
+
+
+
+
+Backwards Compatibility
+=======================
+This CEP is not backwards compatible.
+
+Document History
+================
+This document is released under the CC-BY 3.0 license.
+
+References and Footnotes
+========================
+
+.. rubric:: References
+

From 3230670ac90a744fb1b413e28f7f16e4341f3eea Mon Sep 17 00:00:00 2001
From: Anthony Scopatz
Date: Thu, 14 Sep 2017 13:55:43 -0400
Subject: [PATCH 2/4] updates to CEP27

---
 source/cep/cep27.rst | 147 +++++++++++++++++++++++++------------------
 1 file changed, 87 insertions(+), 60 deletions(-)

diff --git a/source/cep/cep27.rst b/source/cep/cep27.rst
index 0975de3d6..65b54b12b 100644
--- a/source/cep/cep27.rst
+++ b/source/cep/cep27.rst
@@ -4,24 +4,23 @@ CEP 27 - |Cyclus| Database Restructuring
 :CEP: 27
 :Title: |Cyclus| Database Restructuring
 :Last-Modified: 2017-09-11
-:Author: Dr. Tony Scopes & Jin whan Bae
+:Author: Jin whan Bae & Anthony Scopatz
 :Status: Draft
 :Type: Standards Track
 :Created: 2013-09-11
 
 Abstract
 ============
-
 This CEP proposes to restructure the |cyclus| output database structure in order to
 reduce the number of tables and redundancy of data, and ultimately reduce the number
-of `joins' required for data analysis. Doing so would reduce the computing time
+of ``joins`` required for data analysis. Doing so would reduce the computing time
 for end-user analysis, and allow for a clearer, more concise output database.
 
 
 Motivation
 ==========
 The current output database requires the user to join multiple tables to acquire
-meaningful material data, such as quantity and composition. This causes long
+meaningful material data, such as quantity and composition. This causes long
 analysis computing times and confusion for the user.
 
 
@@ -29,14 +28,33 @@ Rationale
 =========
 The proposed restructure aims to reduce the number of tables the user has to query
 for analysis. This can be done by two methods:
+
 1. Combine redundant tables
-2. Reduce a table (`Compositions` table) into a column with variable-type map.
+2. Reduce a table (``Compositions`` table) into a column with variable-type map.
+
+Additionally, this CEP proposes to store both **Inventories** and **Transactions**
+by default. Either table may be backed out of the other (with additional
+information coming from **Materials** etc). However, this backing out process has proven
+extrodinarily expensive, exploding the number of operations needed to back out non-present
+by millions to billions. Even for small databases, this has proven prohibitive.
+
+While storing both **Inventories** and **Transactions** may seem inefficient, consider
+that:
+
+* Data storage is cheap,
+* Material inventories are what most analysis tasks require, and
+* This is precisely double-entry bookkeeping, as applied to the nuclear fuel cycle.
 
+Double-entry bookkeeping was huge innovation in accounting systems. When implemented
+correctly and without fraud, it leads to a self-consisent system. This enables errors
+to be discovered and corrected earlier. This CEP argues that |Cyclus| should provide
+the information needed to verify the mass balances, if requested.
 
 
 Specification \& Implementation
 ===============================
 The following tables that are currently in output are considered for editing:
+
 1. Compositions
 2. Transactions
 3. Recipes
@@ -57,23 +75,23 @@ the internal state of **Product** is stored in **Products** table.
 This requires the user to make joins to acquire the internal state
 of the resources.
 
-We can avoid unnecessary joins by creating a **Materials** and
+We can avoid unnecessary joins by creating a **Materials** and
 **Products** table, with the internal state (composition and quality)
 as a column.
 
 In short, we propose to replace **Compositions**, **Products**, and
 **Resources** table with **Materials** and **Products** Table. In the
-process, the **QualID** column would be removed.
+process, the **QualId** column would be removed.
 
 Currently:
 
 ============ ==========
        Resources
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
-Simid        uuid
-Resourceid   int
+SimId        uuid
+ResourceId   int
 ObjId        int
 Type         string
 TimeCreated  int
@@ -89,10 +107,10 @@ Parent2 int
 ============ ==========
        Products
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
-Simid        uuid
-Qualid       int
+SimId        uuid
+QualId       int
 Quality      string
 ============ ==========
 
@@ -102,10 +120,10 @@ Quality string
 ============ ==========
       Compositions
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
 Simid        uuid
-Qualid       int
+QualId       int
 NucId        int
 MassFrac     double
 ============ ==========
@@ -116,10 +134,10 @@ Would be restructured to:
 ============ ==========
        Materials
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
-Simid        uuid
-Resourceid   int
+SimId        uuid
+ResourceId   int
 ObjId        int
 TimeCreated  int
 Parent1      int
@@ -128,15 +146,16 @@ Units string
 Quantity     double
 Composition  map
 ============ ==========
-Where map would have
+
+Where the composition column would map
 
 ============ ==========
        Products
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
-Simid        uuid
-Resourceid   int
+SimId        uuid
+ResourceId   int
 ObjId        int
 TimeCreated  int
 Parent1      int
@@ -146,16 +165,15 @@ Quantity double
 Quality      string
 ============ ==========
 
-
-Also, since **QualID** is removed, the **Recipes** Table
+Also, since **QualId** is removed, the **Recipes** Table
 also needs to be edited:
 
 ============ ==========
         Recipes
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
-Simid        uuid
+SimId        uuid
 Recipes      string
 Composition  map
 ============ ==========
@@ -163,61 +181,67 @@ Composition map
 
 Transactions
 ------------
-The transactions table would be modified
-to have a flag for weather the commodity is
-a material or a product.
+The transactions table would be modified to have an integer flag for whether
+the commodity is a material or a product. This flag let's anyone inspecting
+the transaction table know which resource table (either **Materials** or
+**Products**) to go to to find the actual concrete resource.
 
-IS THIS NEEDED??
+**Current:**
 
 ============ ==========
       Transactions
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
-Simid        uuid
+SimId        uuid
 TransactionId int
 SenderId     int
 ReceiverId   int
-Resourceid   int
-Commodity    string
-Time         int
-============ ==========
-
-============ ==========
-      Transactions
-------------------------
-Column       Type
-============ ==========
-Simid        uuid
-TransactionId int
-SenderId     int
-ReceiverId   int
-ResourceType int(bool)
-Resourceid   int
+ResourceId   int
 Commodity    string
 Time         int
 ============ ==========
 
+**Proposed**
 
-Rescreators
+================ ==========
+        Transactions
+----------------------------
+Column           Type
+================ ==========
+SimId            uuid
+TransactionId    int
+SenderId         int
+ReceiverId       int
+**ResourceType** **int**
+ResourceId       int
+Commodity        string
+Time             int
+================ ==========
+
+This table will now be optionally written to the database. The default will be to
+write this table (true).
+
+
+ResCreators
 -----------
-Along with **Transactions**, the **Rescreators**
+Along with **Transactions**, the **ResCreators**
 table would need another column, ResourceType:
 
 ============ ==========
-      Rescreators
+      ResCreators
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
 Simid        uuid
 Resourceid   int
 AgentId      int
-ResourceType int(bool)
+ResourceType int
 ============ ==========
 
 
-Merge ExplicitInventory & ExplicitInventoryCompct
--------------------------------------------
+Merge ExplicitInventory & ExplicitInventoryCompact
+----------------------------------------------------
 The **ExplicitInventory** table and **ExplicitInventoryCompact**
 table should be merged to a single table, called **Inventories**,
 with the following columns:
@@ -225,7 +249,7 @@ with the following columns:
 ============ ==========
       Inventories
 ------------------------
-Column       Type
+Column       Type
 ============ ==========
 Simid        uuid
 Agentid      int
@@ -235,14 +259,17 @@ Quantity double
 Composition  map
 ============ ==========
 
+This table will be optionally written to the database. The default will be to
+write this table (true).
+
 
 Merge Info & InfoExplicitInv
 ----------------------------
-We saw little reason to separate the two tables.
-Combining them would not significantly improve anything,
-but would reduce the number of tables created.
-
+We saw little reason to separate the two tables. Combining them is a matter of cleanliness.
+Additionallty, the single **Info** table will have to contain an extra column, **RecordTransactions**.
+Furthermore, the **RecordInventory** column is no longer needed and will be removed.
+Other informational tables may also be merged into the single table.
 
 
 
 Backwards Compatibility

From 1d67bca664c80e700638511eb6058adc8b628a80 Mon Sep 17 00:00:00 2001
From: Jin Whan Bae
Date: Mon, 18 Sep 2017 08:52:40 -0500
Subject: [PATCH 3/4] added explicit inventory tables for reference

---
 source/cep/cep27.rst | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/source/cep/cep27.rst b/source/cep/cep27.rst
index 0975de3d6..a14e74870 100644
--- a/source/cep/cep27.rst
+++ b/source/cep/cep27.rst
@@ -219,11 +219,24 @@ ResourceType int(bool)
 Merge ExplicitInventory & ExplicitInventoryCompct
 -------------------------------------------
 The **ExplicitInventory** table and **ExplicitInventoryCompact**
-table should be merged to a single table, called **Inventories**,
-with the following columns:
+table should be merged to a single table, called **Inventories**.
+The current **ExplicitInventory** table and **ExplicitInventoryCompact**
+table has a structure as such:
 
+============ ==========
+   ExplicitInventory
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Agentid      int
+Time         int
+InventoryName string
+NucId        int
+Quantity     double
+============ ==========
 ============ ==========
-      Inventories
+ ExplicitInventoryCompact
 ------------------------
 Column       Type
 ============ ==========
@@ -235,6 +248,19 @@ Quantity double
 Composition  map
 ============ ==========
 
+============ ==========
+      Inventories
+------------------------
+Column       Type
+============ ==========
+Simid        uuid
+Agentid      int
+Time         int
+InventoryName string
+Quantity     double
+Composition  int
+============ ==========
+
 
 Merge Info & InfoExplicitInv
 ----------------------------

From b06a9418da4394ed0f5af2370daf0758f93ff7cb Mon Sep 17 00:00:00 2001
From: Jin Whan Bae
Date: Mon, 18 Sep 2017 09:03:59 -0500
Subject: [PATCH 4/4] spelling..

---
 source/cep/cep27.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/source/cep/cep27.rst b/source/cep/cep27.rst
index b707569f4..81e0c75ca 100644
--- a/source/cep/cep27.rst
+++ b/source/cep/cep27.rst
@@ -11,7 +11,7 @@ CEP 27 - |Cyclus| Database Restructuring
 
 Abstract
 ============
-This CEP proposes to restructure the |cyclus| output database structure in order to
+This CEP proposes to restructure the |Cyclus| output database structure in order to
 reduce the number of tables and redundancy of data, and ultimately reduce the number
 of ``joins`` required for data analysis. Doing so would reduce the computing time
 for end-user analysis, and allow for a clearer, more concise output database.
@@ -35,7 +35,7 @@ for analysis. This can be done by two methods:
 Additionally, this CEP proposes to store both **Inventories** and **Transactions**
 by default. Either table may be backed out of the other (with additional
 information coming from **Materials** etc). However, this backing out process has proven
-extrodinarily expensive, exploding the number of operations needed to back out non-present
+extraordinarily expensive, exploding the number of operations needed to back out non-present
 by millions to billions. Even for small databases, this has proven prohibitive.
 While storing both **Inventories** and **Transactions** may seem inefficient, consider
 that:
@@ -46,7 +46,7 @@ that:
 * This is precisely double-entry bookkeeping, as applied to the nuclear fuel cycle.
 
 Double-entry bookkeeping was huge innovation in accounting systems. When implemented
-correctly and without fraud, it leads to a self-consisent system. This enables errors
+correctly and without fraud, it leads to a self-consistent system. This enables errors
 to be discovered and corrected earlier. This CEP argues that |Cyclus| should provide
 the information needed to verify the mass balances, if requested.
 
@@ -292,7 +292,7 @@ write this table (true).
 Merge Info & InfoExplicitInv
 ----------------------------
 We saw little reason to separate the two tables. Combining them is a matter of cleanliness.
-Additionallty, the single **Info** table will have to contain an extra column, **RecordTransactions**.
+Additionally, the single **Info** table will have to contain an extra column, **RecordTransactions**.
 Furthermore, the **RecordInventory** column is no longer needed and will be removed.
 Other informational tables may also be merged into the single table.
 
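
Note on the "Material and Product" change in this series: the sketch below illustrates the join that the current **Resources**/**Compositions** layout requires and the single-table lookup the proposed **Materials** table would allow. It is a minimal sketch under stated assumptions, not the Cyclus backend: it uses an in-memory SQLite database, keeps only a subset of the columns listed in the CEP tables, stores ``SimId`` as plain text, and encodes the ``Composition`` map as JSON text. The CEP only says "map", so the JSON encoding and the helper names ``nuclide_mass_current`` and ``nuclide_mass_proposed`` are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Current layout: composition lives in a separate table keyed by QualId.
cur.execute(
    "CREATE TABLE Resources (SimId TEXT, ResourceId INTEGER, Type TEXT, "
    "Quantity REAL, Units TEXT, QualId INTEGER)"
)
cur.execute(
    "CREATE TABLE Compositions (SimId TEXT, QualId INTEGER, "
    "NucId INTEGER, MassFrac REAL)"
)
cur.execute("INSERT INTO Resources VALUES ('sim', 1, 'Material', 10.0, 'kg', 7)")
cur.executemany(
    "INSERT INTO Compositions VALUES ('sim', 7, ?, ?)",
    [(922350000, 0.25), (922380000, 0.75)],
)

# Proposed layout: composition is a column of Materials.
# Assumption: the map is serialized as JSON text for this sketch.
cur.execute(
    "CREATE TABLE Materials (SimId TEXT, ResourceId INTEGER, "
    "Quantity REAL, Units TEXT, Composition TEXT)"
)
cur.execute(
    "INSERT INTO Materials VALUES ('sim', 1, 10.0, 'kg', ?)",
    (json.dumps({"922350000": 0.25, "922380000": 0.75}),),
)


def nuclide_mass_current(cur, resource_id):
    """Mass per nuclide using the join the current schema requires."""
    rows = cur.execute(
        "SELECT c.NucId, r.Quantity * c.MassFrac "
        "FROM Resources r JOIN Compositions c "
        "ON r.SimId = c.SimId AND r.QualId = c.QualId "
        "WHERE r.ResourceId = ?",
        (resource_id,),
    )
    return dict(rows.fetchall())


def nuclide_mass_proposed(cur, resource_id):
    """Mass per nuclide from one row of the proposed Materials table."""
    quantity, composition = cur.execute(
        "SELECT Quantity, Composition FROM Materials WHERE ResourceId = ?",
        (resource_id,),
    ).fetchone()
    # JSON object keys are strings, so convert them back to integer NucIds.
    return {int(nuc): quantity * frac for nuc, frac in json.loads(composition).items()}


print(nuclide_mass_current(cur, 1))
print(nuclide_mass_proposed(cur, 1))
```

Both helpers return the same per-nuclide masses for the sample resource. The second needs no join, at the cost of decoding the map on the client side.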