-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy path02_S3.txt
More file actions
177 lines (134 loc) · 9.72 KB
/
02_S3.txt
File metadata and controls
177 lines (134 loc) · 9.72 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
S3 (awslagi)
#@@ It's very hot topic for the exam.
Intro:
------
S3 stands for, Simple Storage Service (SSS -> S3) It's secure, highly scalable, durable object storage. It is very easy to use, with a Web interface to retrieve or store any amount of DATA.
* Object Base storage, so you store pictures, videos, documents (but it's not where you storage OS for example)
* Chars:
Files can be between 0 -> 5 TB
Unlimited storage (paying by the gigs)
Files are stored in BUCKETS
S3 Names are Universal namespace, meaning that names should be unique GLOBALLY, when generating a bucket a DOMAIN name will be generated, so if domain is taking we won't be able to set up a new bucket with the same name.
When successfully uploading a file to S3 you will receive a HTTP CODE 200,
* Bucket: you can take as a folder in the Cloud you can have folders inside the bucket
The bucket name comes in this way: https://s3-eu-west-1.amazonaws.com/bucket_name -> where we can see this S3, then the region EU-WEST-1, then the amazonaws.com and the last the bucketname #@@ this can be an exam question
DATA CONSISTENCY MODEL ##@@ Popular topic for the exam
----------------------
Read after write consistency for PUTS of new objects (immediatelly after we upload a file is read, so we'll abe to read the content of the file immediatelly)
Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate).We update a file, a file in the S3, so after reading it as Read after write consistency we could get both versions, the one prior to the changes or the other one, but eventually we'll get the new file but it will take some time to propagate, even more if there are on different availability zones. Meaning that updating/deleting a file, you wont be able to read inmediatelly
S3 as a simple Key-value store
------------------------------
S3 is object base consisting of the following:
key (simply the name of the object)
Value (data, simple the sequence of bytes)
VersionID (important to versioning)
Meta-data (data about the data itself, like tags and so)
Subresources:
*Access control list: for setting permissions of the files.
*Torrent: For torrenting the file #&& not in the exam topic
S3 - Basic Info:
-----------------
99.99 % of availability (Designed for it and Guaranteed)
Guarantees also 99.99999999999 (11x9's) DURABILITY (it's warranty that you wont loose more than 100-99.99999999999% files)
It is Tiered(escalonado) storage available
LifeCycle management (for instance we can move older files from one storage to Glacyer for example)
Versioning (we can have versions for a file for example)
Encryption (there are different encryptions for the S3)
Secure data using Access Control List and bucket policies: We can lock down a bucket using bucket policies at the bucket level, and Access Control is on the invidual file.
S3 Storage Tiers/Classes:
-------------------------
S3 Standard:
Designed for 99.99 % of availability and 99.99999999999% Durability, redundacy across multiple disks in different facilities (availability zones), sustaining the loss of 2 facilities.
S3 IA:(Infrequently Accessed)
For data which is used less frequently, but required fast access when needed (faster than Glacier), it's cheaper than S3 Standard but they charge something for every time you access the data. Again they have redundancy across multiple disks in different availability zones.
S3 One Zone IA:
Low cost, infrequently Access data, but do it is just in one availability zone.
Glacier:
Cheapest storage, it's for archieving only having 3 modes: Expedidited (to be restored in a few minutes but more $$$), Standard(it will take 3-5 hours to restore) and Bulk(from 5-12 hours to restore).
S3 Charges:
------------
Storage: In Gb basis
Requests: How many times is the object pulled
Storage Management Pricing: You can be charged for the tags,metadata
Data Transfer Pricing: For example transferring data from one region to another. (Cross-Region replication)
Transfer Acceleration: which enable fast,easy and secure transfers of files over long distances between users and S3 buckets, it's taking advantage of CloudFront and Edge locations.
======================
# @@ EXAM TIPS
======================
* S3 is object-based
* Files can be from 0-5TB
* Unlimited Storage
* Files are stored in buckets
* S3 is an universal namespace, so it should be unique globally
* Example of a bucket name: https://s3-region.amazonaws.com/bucketName
* Read after Write consistency of PUTS of new Object
* Eventual Consistency for overwrite PUTS and DELETES (so it can take some time to propagate)
* S3 Storage Classes: S3, S3-IA, One Zone S3-IA, Glacier (data archieving, std retrival time about 3-5 hours)
* Core fundamental of the S3 object: Key, Value, VersionID, Metadata, Subresources: ACL(access Control list), Torrent
* IT IS NOT SUITABLE FOR ANYTHING ELSE THAN FILES (so no OS, no DB)
* When successfully upload a file you will get a message "HTTP 200"
* When versioning is active you cannot disabled. (just put it innactive)
* Versioning is storing all the copies of the object.
* Versioning integrates Lifecycle rules
* MFA Delete: is used in Versioning to prevent to delet of bucket accidentally
* For activate Replication is needed that both of the S3 have active versions
* Regions must be different for the S3 Replication
* Existing files prior to a replication are not replicated automatically so you have to copy them
* No replication of multiple buckets like daisy chain allowed still
* Delete markers are replicated
* Deletion of "Delete markers" are NOT replicated, also deleting versions of a file are not replicated either
* LifeCycle can be used in conjuntion with versioning
* LifeCycle can be applied to Current versions and previous versions.
* Following actions can be done as part of LifeCycle rules:
Transition to Standard S3-IA (default after a minimum of 30days of the creation date)
Transition to Glacier after being 30 days in S3-IA by default
Also can be used to permanently delete object
* #@@ BEFORE THE EXAM READ THE S3 FAQs ;)
------------------------------------
Additional Information from the Labs
-------------------------------------
* By default buckets and the object in the bucket are private.
* MFA DELETE: We can set up the use of MFA for deleting or modifiying versions on buckets. Which can help to avoid accidentally deletion of S3
* Encryption:
Client side encryption
Server side encryption:
With Amazon S3 Managed keys(SSE-S3)
With KMS (SSE-KMS)
With Customer Private keys (SSE-C)
Permissions for objects:
Owner Access
Access for other AWS accounts: So we can add another AWS account for the permissions of the object(file for instance)
Public Access: The rest, meaning the whole inet :D
Properties for Objects:
Storage Class: We can change the class of storage for the object as S3 std, S3-IA, one zone S3-IA ...
Encryption: Server side encryption
Metadata: We can add more metadata
Tags: Even if we tag de bucket, #@@ the object are NOT inheriting the Tags from the buckets.
* The S3 are managed globally (meaning that we see all S3 from all locations at once)
Permissions for Bucket:
For the ACL we have:
Owner Access
Access for other AWS accounts:
Public Acess
S3 Delivery group:
Policies:
We can create policies as we want using Policy Generator :) Just to remember the "Principal" is the one receiving the policy.. but it's a bit out of the scope of the exam
CORS Configuration: (In another lecture)
Properties for Bucket:
Versioning: Once enabled, we can only suspend it. So bear in mind that if versioning is enabled, it will keep various versions of files from the original ones once modified, so think on the amount of data which will be needed for big files for example.
* Check https://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html
Cross Region Replication:
* First for CrossRegion replication to work we need to have Versioning active for both buckets.
* When doing the replication you can do the whole bucket or Prefix (folders)
* But when we do the replication, all the files which were existing on the original bucket prior to the replication won't be automatically copied, but we can do it using the AWSCli as for example: ```aws s3 cp --recursive s3://carlinhosbucket s3://carlinhosbucketsydney```
* After we have successfully replicated the S3, if we delete one file, so it will delete the file and create the "deletion-mark" in the Origin S3, and also it will also delete the file and create the "deletion-mark" in the replication. BUT if we delete the "deletion-mark" in the origin S3 this will NOT be replicated into the other S3... S
* Also for example if we return to a previous version (delete the current version) of an object in the S3, in the replicated S3 we will NOT get either this change.This is the Replication
* Attention: when we do the replication of a bucket to another is not copiying the existing files but the new/modified ones.
Lifecycle Management: (S3-IA - Glacier)
For example it will good that if we have some objects in S3 which are not accessed in 30 days you can move them to the cheaper S3-IA and if you later do not access after 60 days it can be move to Glacier which is even cheaper, or you can "expire" the object entyrely after some retention period we configure. This for example can be configured with rules from a service called LifeCycle Management
* LifeCycle can be used in conjuntion with versioning
* LifeCycle can be applied to Current versions and previous versions.
* Following actions can be done as part of LifeCycle rules:
Transition to Standard S3-IA (default after a minimum of 30days of the creation date)
Transition to Glacier after being 30 days in S3-IA by default
Also can be used to permanently delete object