Commit e358958

authored
Merge pull request #123 from skytable/0.8.3/backup-and-restore
Add docs on backup and restore, fix inconsistencies
2 parents d00fe96 + ebd765d commit e358958

17 files changed

+140
-39
lines changed
File renamed without changes.
File renamed without changes.

docs/4.architecture.md renamed to docs/c.architecture.md

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -3,16 +3,20 @@ id: architecture
33
title: Architecture
44
---
55

6-
Skytable is a modern NoSQL database that prioritises performance, scalability and reliability while providing a rich and powerful querying interface. We are generally targetting an audience that wants to build high performance, large-scale, low latency applications, such as social networking services, auth services, adtech and such. Skytable is designed to work with
7-
both **structured and semi-structured data**.
6+
Skytable is a modern NoSQL database that prioritises performance, scalability and reliability while providing a rich and powerful querying interface.
7+
We are generally targeting an audience that wants to build high-performance, large-scale, low-latency applications, such as social networking services,
8+
auth services, adtech and such. Skytable is designed to work with both **structured and semi-structured data**.
89

9-
Our goal is to provide you with a powerful and solid foundation for your application with no gimmicks — just a solid core. That's why, every component in Skytable has been engineered from the ground up, from scratch.
10+
Our goal is to provide you with a powerful and solid foundation for your application with no gimmicks, just a solid core. That's why every component in
11+
Skytable has been engineered from the ground up, from scratch.
1012

1113
And all of that, without you having to be an expert, and with the least maintenance that you can expect.
1214

1315
## Fundamental differences from relational systems
1416

15-
BlueQL kind of looks and feels like using SQL with a relational database but that doesn't make Skytable's internals the same, with the most important distinction being the fact that Skytable has a NoSQL engine! But Skytable's evaluation and execution of queries is fundamentally different from SQL counterparts and even NoSQL engines. Here are some key differences:
17+
BlueQL kind of looks and feels like using SQL with a relational database but that doesn't make Skytable's internals the same, with the most important
18+
distinction being the fact that Skytable has a NoSQL engine! But Skytable's evaluation and execution of queries is fundamentally different from SQL
19+
counterparts and even NoSQL engines. Here are some key differences:
1620

1721
- All DML queries are point queries and **not** range queries:
1822
- This means that they will either return at least one row or error
@@ -64,7 +68,7 @@ A `model` in Skytable is like a `table` in SQL but is vastly different because o
6468

6569
## Query language
6670

67-
Skytable has it's own query language BlueQL<sup>TM</sup> which takes a lot of inspiration from SQL but makes several different (and sometimes vastly different) design choices, focused on clarity, speed, simplicity and most importantly, security.
71+
Skytable has its own query language BlueQL<sup>TM</sup> which takes a lot of inspiration from SQL but makes several different (and sometimes vastly different) design choices, focused on clarity, speed, simplicity and most importantly, security.
6872

6973
For example, Skytable's BlueQL<sup>TM</sup> *only* allows the parameterization of queries. All the queries you ran previously with strings and numbers directly were only possible because the REPL client smartly does the parameterization behind the scenes. This is done for security. You'll learn more about BlueQL next.
7074

@@ -99,12 +103,15 @@ Skytable will use atleast as many threads as the number of logical CPUs present
99103

100104
## Networking
101105

102-
Skytable its own in-house Skyhash protocol that is built on top of TCP enabling any programming language that has a TCP client to use it without issues. There are three phases in the connection:
106+
Skytable uses its own in-house Skyhash protocol for client-server communication. It is built on top of TCP, enabling any programming language that has a
107+
TCP client to use it without issues. There are three phases in the connection:
103108
- Handshake: All auth data, compatibility information and other data is exchanged at this step
104109
- Connection mode selection: based on the handshake parameters a connection mode is chosen and the server responds with the chosen exchange mode
105110
- Data exchange: This is where the real querying happens
106111
- Termination: there is no special step; just a `TCP FIN`
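The connection phases listed above can be sketched as a small state machine. This is only an illustration of the ordering: the names `Phase`, `advance` and `terminate` are hypothetical and not part of Skyhash or any Skytable client, and the sketch assumes that termination may follow any phase because it is just a `TCP FIN`.

```python
from enum import Enum, auto


class Phase(Enum):
    """Hypothetical labels for the connection phases described above."""
    HANDSHAKE = auto()
    MODE_SELECTION = auto()
    DATA_EXCHANGE = auto()
    TERMINATED = auto()


# Forward transitions, in the order the phases are listed.
_NEXT = {
    Phase.HANDSHAKE: Phase.MODE_SELECTION,
    Phase.MODE_SELECTION: Phase.DATA_EXCHANGE,
}


def advance(current: Phase) -> Phase:
    """Move to the next phase; data exchange repeats until the socket closes."""
    if current is Phase.TERMINATED:
        raise ValueError("connection already terminated")
    return _NEXT.get(current, Phase.DATA_EXCHANGE)


def terminate(current: Phase) -> Phase:
    """Assumption: a TCP FIN can end the connection from any phase."""
    return Phase.TERMINATED
```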
107112

113+
You can [read more about the protocol here](protocol).
114+
108115
## Backwards compatibility
109116

110117
We make the promise to you that no matter what changes in Skytable, you will always be able to:
@@ -115,6 +122,7 @@ More technically:
115122
- **For minor/patch releases**: The minor/patch is just in the name but it indicates that no data migration effort is needed. **No minor releases ever need data migration, and any migration is done automatically**
116123
- **For major releases**: Major releases generally introduce breaking changes (just like the upgrade from `0.7.x` to `0.8.0` is a largely breaking change). **Major releases will either automatically upgrade the data files or require you to use a migration tool that is shipped with the bundle**.
117124
- Definitions (closely following semantic versioning):
118-
- **A major release** is something like `1.0.0` to `2.0.0` or `0.8.0` to `0.9.0` (in development versions, 0.8.0 to 0.9.0 is a major version bump)
125+
- **A major release** is something like `1.0.0` to `2.0.0` or `0.8.0` to `0.9.0` (in development versions, 0.8.0 to 0.9.0 is considered a major version
126+
bump)
119127
- **A minor release** is something like `1.0.0` to `1.1.0` or `0.8.0` to `0.8.1`
120128
- **A patch release** is something like `1.0.0` to `1.0.1` or `0.8.0` to `0.8.1` (note that in development versions there is no distinction between a minor and patch release)
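As a rough illustration of the definitions above, the following sketch classifies a version change. The function name `classify_release` is hypothetical, and the sketch assumes that a "development version" means a `0.x.y` version, which is how the `0.8.0` to `0.9.0` example reads.

```python
def classify_release(old: str, new: str) -> str:
    """Classify an upgrade as 'major', 'minor', 'patch' or 'minor/patch'."""
    o = tuple(int(part) for part in old.split("."))
    n = tuple(int(part) for part in new.split("."))
    if len(o) != 3 or len(n) != 3:
        raise ValueError("expected versions of the form X.Y.Z")
    if n <= o:
        raise ValueError("new version must be greater than the old one")

    # Assumption: 0.x.y versions are the "development versions".
    if o[0] == 0 and n[0] == 0:
        # In development versions the second component acts as the major
        # number, and minor and patch releases are not distinguished.
        return "major" if n[1] != o[1] else "minor/patch"

    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"
```

Under the promise stated above, only a result of `"major"` would ever call for a migration tool; the other results indicate that no manual data migration is needed.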
File renamed without changes.

docs/index.md

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -19,10 +19,12 @@ To develop using Skytable and maintain your deployment you will want to learn ab
1919
- [**DCL**](blueql/dcl): Data control with BlueQL
2020
- [**Querying**](querying): Introduces different query modes and when to choose a specific query mode
2121
- [**System administration**](system):
22-
- [**Configuration**](system/configuration): Information to help you configure Skytable with custom settings such as custom ports, hosts, TLS, and etc.
23-
- [**User management**](system/user-management): Information on access control, user and other administration features
24-
- [**Global management**](system/global-management): Global settings management
25-
- [**Operations**](system/operations): Learn about administration operations
22+
- [**Configuration**](system/configuration): Configuration modes (CLI, environment variables, configuration files) and options
23+
- [**User management**](system/user-management): Account types, permissions, creating and managing multiple users
24+
- [**Global management**](system/global-management): Learn how to check system health and manage the global state of your database instances
25+
- [**Disk usage**](system/disk-usage): Understand disk usage and compaction
26+
- [**Backup and restore**](system/backup-and-restore): Backing up data and restoring data from backups
27+
- [**Data recovery**](system/recovery): Understanding data loss, mitigation and recovery options
2628
- **Resources**:
2729
- [**Useful links**](resources/useful-links): Links to helpful resources
2830
- [**Migration**](resources/migration): For old or returning Skytable users who are coming from older versions
File renamed without changes.
File renamed without changes.
File renamed without changes.

docs/system/1.configuration.md renamed to docs/system/a.configuration.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,7 @@ To start the server with a configuration file, simply run `skyd --config <path t
4444
Here's an explanation of all the keys:
4545
- `system`:
4646
- `mode`: set to either `dev` / `prod` mode. `prod` mode will generally make some things stricter (such as background services)
47-
- `rs_window`: **This is a very important setting!** It is set to `300` by default and is called the "reliability service window" which ensures that if any changes are observed in `300` (or whatever value you set) seconds, then they reach the disk as soon as that time elapses. For example, in the default configuration the system checks for changes every 5 minutes and if there are any dataset changes, they are immediately synced. [Read more here](operations#understanding-data-loss)
47+
- `rs_window`: **This is a very important setting!** It is set to `300` by default and is called the "reliability service window" which ensures that if any changes are observed in `300` (or whatever value you set) seconds, then they reach the disk as soon as that time elapses. For example, in the default configuration the system checks for changes every 5 minutes and if there are any dataset changes, they are immediately synced. [Read more here](recovery#understanding-data-loss)
4848
- `auth`:
4949
- `plugin`: this is the authentication plugin. We currently only have `pwd`, a simple password-based authentication system where the password is stored as an [`rcrypt` hash](https://github.com/ohsayan/rcrypt) on disk. More `plugin` options are set to be implemented for more advanced authentication, especially in enterprise settings
5050
- `root_pass`: this is the root account password. **It must have at least 16 characters**
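The constraints on the keys listed above can be checked before starting the server. The following is a minimal sketch, not Skytable's own validation: `validate_config` is a hypothetical helper, it takes an already parsed configuration as a plain dictionary, and it assumes that `rs_window` must be a positive whole number of seconds and that every key other than `rs_window` is required.

```python
def validate_config(cfg: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    system = cfg.get("system", {})
    auth = cfg.get("auth", {})

    # `mode` is either `dev` or `prod`.
    if system.get("mode") not in ("dev", "prod"):
        problems.append("system.mode must be 'dev' or 'prod'")

    # `rs_window` defaults to 300 seconds.
    # Assumption: it must be a positive integer.
    rs_window = system.get("rs_window", 300)
    if isinstance(rs_window, bool) or not isinstance(rs_window, int) or rs_window <= 0:
        problems.append("system.rs_window must be a positive number of seconds")

    # `pwd` is currently the only authentication plugin.
    if auth.get("plugin") != "pwd":
        problems.append("auth.plugin must be 'pwd'")

    # The root password must have at least 16 characters.
    if len(str(auth.get("root_pass", ""))) < 16:
        problems.append("auth.root_pass must have at least 16 characters")

    return problems
```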
File renamed without changes.

docs/system/3.global-management.md renamed to docs/system/c.global-management.md

Lines changed: 13 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -11,17 +11,23 @@ The following query returns an `Empty` response or an error code depending on th
1111
SYSCTL REPORT STATUS
1212
```
1313

14-
If you receive an error code, we recommend you to connect to the host and check logs. If the server has crashed, you may need to [recover the database](operations#data-recovery).
14+
If you receive an error code, we recommend that you connect to the host and check the logs. If the server has crashed, you may need to [recover the database](recovery).
1515

16-
## Inspecting all spaces
16+
## Inspecting global state
17+
18+
The following query provides a quick overview of the global system state, including users, spaces and settings:
19+
20+
```sql
21+
INSPECT GLOBAL
22+
```
23+
24+
This will return a JSON string like this:
1725

18-
The single DDL query that lets you do a "sneak peek" into the status of the entire system is the `INSPECT GLOBAL` query. It
19-
returns a JSON string like this:
2026
```json
2127
{
22-
"spaces:"["space1", "space2"],
23-
"users":["root", "staging_server"],
24-
"settings:{},
28+
"spaces": ["prodApp1", "prodApp2"],
29+
"users": ["root", "staging_app_server", "prod_app_server"],
30+
"settings": {}
2531
}
2632
```
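Because the response is a JSON string, it can be handled with any standard JSON parser. The sketch below summarizes a response shaped like the example above. The helper name `summarize_global_state` is hypothetical, and how the string is obtained from a client is left out on purpose.

```python
import json


def summarize_global_state(raw: str) -> dict:
    """Count the spaces, users and settings in an INSPECT GLOBAL response."""
    state = json.loads(raw)
    return {
        "space_count": len(state["spaces"]),
        "user_count": len(state["users"]),
        "settings_count": len(state["settings"]),
    }


# Sample payload copied from the example response above.
sample = """
{
    "spaces": ["prodApp1", "prodApp2"],
    "users": ["root", "staging_app_server", "prod_app_server"],
    "settings": {}
}
"""
summary = summarize_global_state(sample)
```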
2733

docs/system/d.disk-usage.md

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
---
2+
id: disk-usage
3+
title: Disk usage
4+
---
5+
6+
## Directory structure
7+
8+
This is the general directory structure (subdirectories omitted):
9+
```
10+
├── data
11+
├── gns.db-tlog
12+
└── .sky_pid
13+
```
14+
15+
- `gns.db-tlog` (file): This is a very important file that stores system tables and other data
16+
- `data` (directory): This directory contains subdirectories with all the spaces (which in turn contain all the data for each space)
17+
- `.sky_pid` (file): This is a temporary PID file that is created whenever the database is started. If the database crashes, then you may have to remove
18+
it manually
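A quick way to see which of the entries described above are present in an installation directory is sketched below. `inspect_install_dir` is a hypothetical helper. It only reports what exists and deliberately does not delete `.sky_pid`, since whether removal is safe depends on whether the server is really stopped.

```python
from pathlib import Path


def inspect_install_dir(root) -> dict:
    """Report which of the top-level entries exist under `root`."""
    root = Path(root)
    return {
        "data_dir": (root / "data").is_dir(),
        "gns_tlog": (root / "gns.db-tlog").is_file(),
        # A leftover PID file after a crash may need manual removal.
        "pid_file": (root / ".sky_pid").is_file(),
    }
```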
19+
20+
21+
## Managing disk usage
22+
23+
Over time, as you continue to use your database, its files will grow in size, as you would expect. However, database files may sometimes grow beyond an efficient size, resulting in high memory usage or slowdowns. To counter this, Skytable uses internal heuristics to determine when a database file is "larger than needed" and automatically compacts such files at startup.
24+
25+
However, in some cases you may wish to perform a compaction regardless in order to reduce the file size. In order to do this you will have to run:
26+
27+
```sh
28+
skyd compact
29+
```
30+
31+
The server will then compact all files (even if a compaction wasn't triggered by internal heuristics) to their optimum size.

docs/system/e.backup-and-restore.md

Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
---
2+
id: backup-and-restore
3+
title: Backup and restore
4+
---
5+
6+
## Backing up data
7+
8+
To back data up, you can use the subcommand `skyd backup` as follows:
9+
10+
```sh
11+
skyd backup --type=direct --to=<path to backup> [--from <directory>]
12+
```
13+
14+
- `--type=direct`: This specifies the kind of backup created. The `direct` type indicates that it's a simple copy of the data files and directories
15+
- `--to=<path to backup>`: This specifies where this backup is to be created
16+
- `--from <path to installation>` *(optional)*: When this is not provided, the `backup` subcommand assumes that the current working directory is the installation directory. If you're running it from a different directory then set this option.
17+
18+
**Example**:
19+
20+
```sh
21+
skyd backup \
22+
--type=direct \
23+
--from=/var/lib/skytable \
24+
--to=/mnt/backupnfsdrive/quick-backup-before-migration
25+
```
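If backups are triggered from a script, the same invocation can be assembled programmatically. This is a minimal sketch that uses only the options documented above. `build_backup_command` and `run_backup` are hypothetical helper names, and the sketch assumes `skyd` is on the `PATH`.

```python
import subprocess
from typing import List, Optional


def build_backup_command(to: str, from_dir: Optional[str] = None) -> List[str]:
    """Assemble the argument list for `skyd backup`."""
    cmd = ["skyd", "backup", "--type=direct", f"--to={to}"]
    if from_dir is not None:
        # Only needed when not running from the installation directory.
        cmd.append(f"--from={from_dir}")
    return cmd


def run_backup(to: str, from_dir: Optional[str] = None) -> None:
    """Run the backup and raise CalledProcessError on a nonzero exit code."""
    subprocess.run(build_backup_command(to, from_dir), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.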
26+
27+
:::info Backup types
28+
Note that in the future we may add more backup types, including compressed archives or other modes. Currently, the only type of backup (specified using `--type`) is `direct`, which clones the data files and directories. You do not need to worry about this, as the restore subcommand will take care of determining what kind of backup is being pointed to.
29+
:::
30+
31+
### Backup protections
32+
33+
The `backup` subcommand includes some protections to create consistent and valid backups. These include not allowing backups if the database is currently using the data files and some other parameters. If you need to override any of these parameters, then please check the help menu with `skyd backup --help`.
34+
35+
## Restoring data
36+
37+
To restore data from a backup, you can use the subcommand `skyd restore` as follows:
38+
39+
```sh
40+
skyd restore --from=<path to backup> [--to <installation directory>]
41+
```
42+
43+
- `--from=<path to backup>`: Specifies the path to the backup
44+
- `--to <installation directory>` *(optional)*: By default, it is assumed that the current directory is the installation directory. If not, set this option.
45+
46+
**Example**:
47+
48+
```sh
49+
skyd restore \
50+
--from=/mnt/backupnfsdrive/quick-backup-before-migration \
51+
--to=/var/lib/skytable
52+
```
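For runbooks or logs it can help to render the exact restore command before running it. The sketch below builds the argument list from the two documented options and prints it as a shell-safe string. `build_restore_command` is a hypothetical helper name and nothing is executed.

```python
import shlex
from typing import List, Optional


def build_restore_command(from_backup: str, to_dir: Optional[str] = None) -> List[str]:
    """Assemble the argument list for `skyd restore`."""
    cmd = ["skyd", "restore", f"--from={from_backup}"]
    if to_dir is not None:
        # Omit when the current directory is the installation directory.
        cmd.append(f"--to={to_dir}")
    return cmd


# Render the command from the example above as a single quoted string.
printable = shlex.join(
    build_restore_command(
        "/mnt/backupnfsdrive/quick-backup-before-migration",
        "/var/lib/skytable",
    )
)
print(printable)
```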
53+
54+
### Data restore protections
55+
56+
The `restore` subcommand also has some safeguards in place that prevent you from accidentally restoring incorrect data. Some of these safeguards include:
57+
58+
- **Backup has correct time signatures**
59+
- **Backup is compatible**
60+
- **Was created by the same host:** you will obviously need to override this when recovering from a crash and this should be okay to do. The reason this protection exists is in a situation where you're running a cluster and have multiple backups and accidentally restore from the wrong backup.
61+
62+
If you need to override any of these conditions in special cases, then please check the help menu with `skyd restore --help`.

docs/system/operations.md renamed to docs/system/f.recovery.md

Lines changed: 2 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,20 +1,8 @@
11
---
2-
title: Operations
2+
id: recovery
3+
title: Data recovery
34
---
45

5-
## Managing disk usage
6-
7-
Over time, as you continue to use your database your database files will grow in size, as you would expect. However, sometimes database files may grow beyond an efficient size resulting in high memory usage or slowdowns. To counter this, Skytable uses internal heuristics to determine when a database file is "larger than needed" and automatically compacts them at startup.
8-
9-
However, in some cases you may wish to perform a compaction regardless in order to reduce the file size. In order to do this you will have to run:
10-
11-
```sh
12-
skyd compact
13-
```
14-
15-
The server will then compact all files (even if a compaction wasn't triggered by internal heuristics) to their optimum size.
16-
17-
## Data recovery
186

197
In the unforeseen event that a power failure or other catastrophic system failure causes the database to crash, the Skytable server will fail to start normally. Usually it will exit with a nonzero code and an error message such as "journal-corrupted." In such cases, you will need to recover the journal(s) and/or any other corrupted file(s).
208

docs/system/index.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,9 @@ In the following sections, we explore general system administration options with
88

99
Here's an overview of the different administration guides:
1010

11-
- [**Configuration**](configuration): Understand how Skytable can be configured using command-line arguments, environment variables or a configuration file and what all configuration options are available
12-
- [**User management**](user-management): Learn about account types, permissions and how you can manage multiple users
11+
- [**Configuration**](configuration): Configuration modes (CLI, environment variables, configuration files) and options
12+
- [**User management**](user-management): Account types, permissions, creating and managing multiple users
1313
- [**Global management**](global-management): Learn how to check system health and manage the global state of your database instances
14-
- [**Operations**](operations): Understand administrator operations tasks such as backups, recovery and more
14+
- [**Disk usage**](disk-usage): Understand disk usage and compaction
15+
- [**Backup and restore**](backup-and-restore): Backing up data and restoring data from backups
16+
- [**Data recovery**](recovery): Understanding data loss, mitigation and recovery options

docusaurus.config.js

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -163,8 +163,8 @@ module.exports = {
163163
to: '/protocol/specification'
164164
},
165165
{
166-
from: '/system/recovery',
167-
to: '/system/operations#data-recovery'
166+
from: '/system/operations',
167+
to: '/system',
168168
}
169169
]
170170
}]

sidebars.ts

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,9 @@ module.exports = {
2727
"system/configuration",
2828
"system/user-management",
2929
"system/global-management",
30-
"system/operations",
30+
"system/disk-usage",
31+
"system/backup-and-restore",
32+
"system/recovery",
3133
],
3234
link: {
3335
type: 'doc',
