-
Notifications
You must be signed in to change notification settings - Fork 0
Enron Email Database UseCase Doc
- V-Query in Haskell is located in
/Users/qiaoran/Desktop/VDBMS/src/VDB/Example/EnronUseCase/EnronQuery- Each xxxyyy.hs file contains one v-query for interaction of xxx and yyy.
- The context of this query and the translated query are encode as comments in each file.
- SQL Queries for each interaction are located in
/Users/qiaoran/Desktop/VDBMS/databases/enronEmailDB/enronDbSQLqueries- The query about interaction of xxx and yyy are located in the directory named xxxyyy
Enron Data is From here
-
employee(eid, firstname, lastname, email_id, email2, email3, email4, folder, status)
- size: 149 tuples
-
message(mid, sender, date, message_id, subject, body, folder)
- size: 252759 tuples
-
recipientinfo(rid, mid, rtype, rvalue, dater)
- size: 2064442 tuples
-
referenceinfo(rfid, mid, reference)
- size: 54778
-
v_employee(eid, firstname, lastname, email_id, folder, status, sign, public_key, presCond)
- sign is responded to signature
- public_key is responded to encrypt
-
v_message(mid, sender, date, message_id, subject, body, folder, is_system_notification, is_signed, is_encrypted, is_from_remailer, is_autoresponse, is_forward_msg, presCond)
- is_signed is responded to signature
- is_encrypted is responded to encrypt
- is_from_remailer is responded to remailmsg
-
v_recipientinfo(rid, mid, rtype, rvalue, presCond)
-
v_referenceinfo(rfid, mid, reference,presCond)
-
v_auto_msg(eid, subject, body, presCond)
- whole table will depend on autoresponder
-
v_forward_msg(eid, forwardaddr, presCond)
- whole table is responded to forwardmessage
-
v_remail_msg(eid, pseudonym, presCond)
- whole table is responded to remailmessage
-
v_filter_msg(eid, suffix, presCond)
- whole table is responded to filtermessages
-
v_mailhost(eid, username, mailhost, presCond)
- whole table is responded to mailhost
-
v_alias(eid, email, nickname, presCond)
- whole table is responded to addressbook
- signature
- addressbook
- filtermsg
- autoresponder
- forwardmsg
- mailhost
- encrypt
- remailmsg
-
- Products for daily use include features:
- Features:
- forwardmsg
- filtermsg
- Presence Condition(p1):
- forwardmsg AND filtermsg AND NOT (addressbook OR encrypt OR remailmsg OR autoresponder OR signature OR mailhost)
- Features:
- Products for daily use include features:
-
- Products that focus on Privacy
- Features:
- signature
- encrypt
- remailmsg
- Presence Condition(p2):
- signature AND encrypt AND remailmsg AND NOT (addressbook OR filtermsg OR autoresponder OR forwardmsg OR mailhost)
- Features:
- Products that focus on Privacy
-
- Products that focus on Group Usage
- features:
- addressbook
- autoresponder
- mailhost
- Presence Condition(p3):
- addressbook AND autoresponder AND mailhost AND NOT (forwardmsg OR encrypt OR remailmsg OR filtermsg OR signature)
- features:
- Products that focus on Group Usage
-
- Products that enables all features
- Presence Condition(p4):
- signature AND addressbook AND filtermsg AND autoresponder AND forwardmsg AND mailhost AND encrypt AND remailmsg
- Presence Condition(p4):
- Products that enables all features
-
- Products that disables all features
- Presence Condition(p5):
- (NOT signature) AND (NOT addressbook) AND (NOT filtermsg) AND (NOT autoresponder) AND (NOT forwardmsg) AND (NOT mailhost) AND (NOT encrypt) AND (NOT remailmsg)
- Presence Condition(p5):
- Products that disables all features
We have 150 employees in Enron database and 5 product variants for email system, we therefore decide to break 150 employees into 5 groups (A,B,C,D,E), and let those 5 groups people be user of the corresponding 5 different product variants. Assign employee to different group based on employee number (eid) as follows:
- Product for daily usage:
0 < eid <= 30 - Product for privacy:
30 < eid <= 60 - Product for group usage:
60 < eid <= 90 - Product for all enabled features:
90 < eid <= 120 - Product for all disabled features:
120 < eid <= 150
v_employee(eid, firstname, lastname, email_id, folder, status, sign, public_key, presCond)
| - ---- | eid | firstname | lastname | email_id | folder | status | sign | public_key | presCond |
|---|---|---|---|---|---|---|---|---|---|
| Daily | xx | xx | xx | xx | xx | xx | NULL | NULL | p1 |
| Privacy | xx | xx | xx | xx | xx | xx | xx | xx | p2 |
| Group | xx | xx | xx | xx | xx | xx | NULL | NULL | p3 |
| Enable | xx | xx | xx | xx | xx | xx | xx | xx | p4 |
| Disable | xx | xx | xx | xx | xx | xx | NULL | NULL | p5 |
v_message(mid, sender, date, message_id, subject, body, folder, is_system_notification, is_signed, is_encrypted, is_from_remailer, is_autoresponse, is_forward_msg, presCond)
| mid | sender | date | message_id | subject | body | folde | is_system_notification | is_signed | is_encrypted | is_from_remailer | is_autoresponse | is_forward_msg | presCond | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| xx | xx | xx | xx | xx | xx | xx | xx | NULL | NULL | NULL | NULL | xx | p1 | |
| xx | xx | xx | xx | xx | xx | xx | xx | xx | xx | xx | NULL | NULL | p2 | |
| xx | xx | xx | xx | xx | xx | xx | xx | NULL | NULL | NULL | xx | NULL | p3 | |
| xx | xx | xx | xx | xx | xx | xx | xx | xx | xx | xx | xx | xx | p4 | |
| xx | xx | xx | xx | xx | xx | xx | xx | NULL | NULL | NULL | NULL | NULL | p5 |
v_recipientinfo(rid, mid, rtype, rvalue, presCond)
| rid | mid | rtype | rtype | rvalue | presCond | |
|---|---|---|---|---|---|---|
| xx | xx | xx | xx | xx | p1 | |
| xx | xx | xx | xx | xx | p2 | |
| xx | xx | xx | xx | xx | p3 | |
| xx | xx | xx | xx | xx | p4 | |
| xx | xx | xx | xx | xx | p5 |
v_referenceinfo(rfid, mid, reference,presCond)
Just like the table v_recipientinfo, we insert presCond for corresponding product data.
For that rest of tables, since the whole table depends on a specific features, we will populate the table with user's email data based on which product group they belong to.
- Import the enron data into a new database. eg.
mysql -u #username# -p #database# < #dump_file#