
Commit 5bde791

Updates to Pulse, Install on Premise sections. (#1082)
* Updates to formatting and other doc fixes.
* Update topic Create Table for default behavior of the USING option.
* Edited Using SnappyData Cloudbuilder > Quick Start Steps > Deploying SnappyData Cloud Cluster on AWS using Scripts to include the link to the EC2 scripts release page. Edited the "Here" link in the Programming Guide. Minor spelling edits for the term YARN.
* Minor edits for typos.
* Edited the Install On-Premise section for all items in Multiple Host Installation.
* Updates to SnappyData Pulse content and addition of corresponding images.
* Add chapter for Kubernetes. Update the mkdocs.yml file to include an index entry for the Getting Started with SnappyData on Kubernetes chapter.
* Some edits to the Kubernetes chapter content.
* Corrected a minor error and added content for troubleshooting member startup issues due to missing disk stores.
* Incorporated review comments for SnappyData Pulse.
* Incorporated review comments for the Kubernetes documentation.
* Minor changes to the Kubernetes documentation content and corresponding changes to the yml file.
* Changes to publish-site.sh for serving the published content.
* Included changes suggested by Sumedh in Best Practices > Operating System Settings > OS Cache Size.
* Updates to the Troubleshooting > Member Startup Problems section.
* Minor edits to the Troubleshooting > Member Startup Problems section.
* Review comment from Sumedh incorporated.
* Based on Pierce's email, created a subsection for configuring the secondary lead and provided a link to it from the best practices section. Added content in Configuring the Cluster > Configuring Secondary Lead and linked to it from Best Practices > HA Considerations > Lead.
* Incorporated review comments and included content for Accessing Logs and Configuring Log Level.
* Final review comment incorporation and edits for SnappyData Pulse.
* Updates done for SNAP-2454, SNAP-227-, SNAP-2256.
* For SNAP-2454, removed links to Create Synonym and Drop Synonym as suggested by Sonal/Shirish.
* Changes/corrections to the Configuring the Cluster > Configuring Data Servers section suggested by Rishi.
* Review incorporation for Kubernetes content from the tabletop review with Amogh and Shirish.
* Minor edits to SnappyData Pulse content. Added an updated image for Pulse.
* Minor edits to the Install On-Premise section.
* Edited some grammatical errors.
* Added content in the Troubleshooting section for Resolving Catalog Inconsistency Issues.
* Added catalog_inconsistency.md.
* Added a Securing SnappyData Pulse UI Connection section under Configuring the Cluster. Added a link to it in Managing and Monitoring.
* Incorporated review comments for the SnappyData Pulse UI Secure Connection and Troubleshooting sections.
* Minor link correction.
1 parent be094ef commit 5bde791

40 files changed (+829 / -220 lines)

docs/Images/CPU-Usage-graph.png (10.5 KB)
docs/Images/CPU_cores.png (2.42 KB)
docs/Images/CPU_cores1.png (5.98 KB)
docs/Images/Dashboard-Ext-Tables.png (25.3 KB)
docs/Images/Dashboard-MembersList.png (89.8 KB)
docs/Images/Dashboard-TablesList.png (56.7 KB)
docs/Images/Dashboard-Trends.png (37.2 KB)
docs/Images/Dashboard.png (120 KB)
docs/Images/Heap-Usage-graph.png (11.6 KB)
docs/Images/OffHeap-Usage-graph.png (12.3 KB)
docs/Images/memberdetailview.png (241 KB)
19.2 KB
docs/Images/services.png (24.1 KB)
24.5 KB
24.6 KB
docs/Images/snapshot.png (22.4 KB)

docs/best_practices/ha_considerations.md

Lines changed: 3 additions & 3 deletions

@@ -5,12 +5,12 @@ High availability options are available for all the SnappyData components.

**Lead** </br>
SnappyData supports secondary lead nodes. If the primary lead becomes unavailable, one of the secondary lead nodes takes over immediately.
-Setting up the secondary lead node is highly recommended because the system cannot function if the lead node is unavailable. Currently, the queries and jobs that are executing when the primary lead becomes unavailable, are not re-tried and have to be resubmitted.
+[Setting up the secondary lead node](../configuring_cluster/configuring_cluster.md#confsecondarylead) is highly recommended because the system cannot function if the lead node is unavailable. Currently, the queries and jobs that are executing when the primary lead becomes unavailable are not re-tried and have to be resubmitted.

**Locator**</br>
SnappyData supports multiple locators in the cluster for high availability.
-It is recommended to set up multiple locators (ideally two) as, if a locator becomes unavailable, the cluster continues to be available. New members can however not join the cluster.</br>
-With multiple locators, clients notice nothing and the fail over recovery is completely transparent.
+It is recommended to set up multiple locators (ideally two). If a locator becomes unavailable, the cluster continues to be available. However, new members cannot join the cluster.</br>
+With multiple locators, there is no impact on the clients, and the failover recovery is completely transparent.

**DataServer**</br>
SnappyData supports redundant copies of data for fault tolerance. A table can be configured to store redundant copies of the data. So, if a server is unavailable and a redundant copy is available on some other server, the tasks are automatically retried on those servers. This is completely transparent to the user.
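
As an illustration of the secondary-lead recommendation above, here is a minimal sketch of a **conf/leads** file with two entries; the hostnames and the `-heap-size` value are hypothetical, and the exact properties to use are described in the linked Configuring the Cluster section:

```
# conf/leads -- one line per lead member (hostnames are illustrative)
# With two entries, one lead runs as the primary and the other as a secondary.
lead-host1 -heap-size=4096m
lead-host2 -heap-size=4096m
```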

docs/best_practices/important_settings.md

Lines changed: 9 additions & 5 deletions

@@ -67,14 +67,18 @@ ec2-user soft sigpending 524288

**OS Cache Size**</br>
When there is a lot of disk activity, especially during table joins and during eviction, the process may experience GC pauses. To avoid such situations, it is recommended to reduce the OS cache size by specifying a lower dirty ratio and a shorter expiry time for the dirty pages.</br>
-The following are the typical configuration to be done on the machines that are running SnappyData processes.
+Add the following to */etc/sysctl.conf* using the command `sudo vim /etc/sysctl.conf` or `sudo gedit /etc/sysctl.conf`, or by using an editor of your choice:</br>

```
-sudo sysctl -w vm.dirty_background_ratio=2
-sudo sysctl -w vm.dirty_ratio=4
-sudo sysctl -w vm.dirty_expire_centisecs=2000
-sudo sysctl -w vm.dirty_writeback_centisecs=300
+vm.dirty_background_ratio=2
+vm.dirty_ratio=4
+vm.dirty_expire_centisecs=2000
+vm.dirty_writeback_centisecs=300
```
+Then apply the settings to the current session using the command `sudo sysctl -p`.
+
+These settings lower the OS cache buffer sizes, which reduces long GC pauses during disk flush but can decrease overall disk write throughput. This is especially true for slower magnetic disks, where the bulk insert throughput can see a noticeable drop (such as 20%), while the duration of GC pauses should reduce significantly (such as 50% or more). If long GC pauses, for example in the range of tens of seconds, during bulk inserts, updates, or deletes are not a problem, then these settings can be skipped.

**Swap File** </br>
Since modern operating systems perform lazy allocation, it has been observed that despite the `-Xmx` and `-Xms` settings, at runtime, the operating system may fail to allocate new pages to the JVM. This can result in the process going down.</br>
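
As a quick check for the OS Cache Size settings above, a minimal sketch that only re-applies `/etc/sysctl.conf` and reads the same four kernel variables back:

```
# Apply /etc/sysctl.conf to the running kernel; each applied key = value is echoed
sudo sysctl -p

# Read the current values back to confirm they match the configured ones
sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_expire_centisecs vm.dirty_writeback_centisecs
```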

docs/configuration.md

Lines changed: 2 additions & 0 deletions

@@ -28,6 +28,8 @@ The following topics are covered in this section:

* [SSL Setup for Client-Server](configuring_cluster/ssl_setup.md)

+* [Securing SnappyData Pulse UI Connection](./configuring_cluster/securinguiconnection.md)
+
* [List of Properties](configuring_cluster/property_description.md)

* [Firewalls and Connections](configuring_cluster/firewalls_connections.md)
docs/configuring_cluster/configuring_cluster.md

Lines changed: 29 additions & 15 deletions
Large diffs are not rendered by default.
docs/configuring_cluster/securinguiconnection.md

Lines changed: 30 additions & 0 deletions

@@ -0,0 +1,30 @@
# Securing SnappyData Pulse UI Connection

You can secure the SnappyData Pulse UI with SSL authentication so that the UI can be accessed only over HTTPS. The following configuration is needed to set up SSL-enabled connections for the SnappyData Pulse UI:

**To set up SSL-enabled connections for the SnappyData Pulse UI:**

1. Make sure that you have a valid SSL certificate imported into the truststore.
2. Provide the following Spark configuration in the lead member's conf file (**conf/leads**):

        localhost -spark.ssl.enabled=true \
        -spark.ssl.protocol=<ssl-protocol> \
        -spark.ssl.enabledAlgorithms=<comma-separated-list-of-ciphers> \
        -spark.ssl.keyPassword=<key-password> \
        -spark.ssl.keyStore=<path-to-key-store> \
        -spark.ssl.keyStorePassword=<key-store-password> \
        -spark.ssl.keyStoreType=<key-store-type> \
        -spark.ssl.trustStore=<path-to-trust-store> \
        -spark.ssl.trustStorePassword=<trust-store-password> \
        -spark.ssl.trustStoreType=<trust-store-type>

    !!!Note
        - If using the TLS SSL protocol, the enabledAlgorithms can be TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA
        - Store types can be JKS or PKCS12.

3. Launch the Snappy cluster:

        ./sbin/snappy-start-all.sh

4. Launch the SnappyData Pulse UI in your web browser. You are directed to the HTTPS site.

!!!Note
    Users are automatically redirected to HTTPS (on port 5450) even if the SnappyData Pulse UI is accessed with the HTTP protocol.
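
As a companion to step 1 above, a hedged sketch of creating a self-signed keystore and a matching truststore with the JDK `keytool`; the alias, file paths, passwords, and the distinguished name are placeholders, and a production setup would normally import a CA-signed certificate instead:

```
# Generate a self-signed key pair in a JKS keystore (all values are illustrative)
keytool -genkeypair -alias snappy-pulse -keyalg RSA -keysize 2048 -validity 365 \
  -keystore /opt/snappydata/conf/keystore.jks -storepass <key-store-password> \
  -keypass <key-password> -dname "CN=<lead-hostname>, OU=IT, O=Example, C=US"

# Export the certificate and import it into a truststore
keytool -exportcert -alias snappy-pulse -file snappy-pulse.cer \
  -keystore /opt/snappydata/conf/keystore.jks -storepass <key-store-password>
keytool -importcert -alias snappy-pulse -file snappy-pulse.cer \
  -keystore /opt/snappydata/conf/truststore.jks -storepass <trust-store-password> -noprompt
```

The resulting keystore and truststore paths, passwords, and store type (`JKS` in this sketch) are what you would plug into the `spark.ssl.keyStore*` and `spark.ssl.trustStore*` properties shown in step 2.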

docs/install/install_on_premise.md

Lines changed: 70 additions & 14 deletions

@@ -5,6 +5,7 @@ SnappyData runs on UNIX-like systems (for example, Linux, Mac OS). With on-premi

<a id="singlehost"></a>
## Single-Host Installation
+
This is the simplest form of deployment and can be used for testing and POCs.

Open the command prompt, go to the location of the downloaded SnappyData file, and run the following command to extract the archive file.

@@ -20,21 +21,32 @@ Start a basic cluster with one data node, one lead, and one locator:
./sbin/snappy-start-all.sh
```

-For custom configuration and to start more nodes, see the section on [configuring the SnappyData cluster](../configuring_cluster/configuring_cluster.md).
+For custom configuration and to start more nodes, refer to [configuring the SnappyData cluster](../configuring_cluster/configuring_cluster.md).

+<a id="multihost"></a>
## Multi-Host Installation
-For real-life use cases, you need multiple machines on which SnappyData can be deployed. You can start one or more SnappyData node on a single machine based on your machine size.

-## Machines with a Shared Path
-If all your machines can share a path over an NFS or similar protocol, then follow the steps below:
+For real-life use cases, you require multiple machines on which SnappyData must be deployed. You can start one or more SnappyData nodes on a single machine based on its size.
+
+Where multiple machines are involved, you can deploy SnappyData on:
+
+* [Machines With a Shared Path](#sharedpath)
+
+* [Machines Without a Shared Path](#machine-shared-path)

-#### Prerequisites
+* [Machines Without Passwordless SSH](#without_passwordless)
+
+<a id="sharedpath"></a>
+### Machines With a Shared Path
+If all the machines in your cluster can share a path over an NFS or similar protocol, then use the following instructions:
+
+**Prerequisites**

* Ensure that the **/etc/hosts** correctly configures the host and IP address of each SnappyData member machine.

-* Ensure that SSH is supported and you have configured all machines to be accessed by [passwordless SSH](../reference/misc/passwordless_ssh.md).
+* Ensure that SSH is supported and you have configured all the machines to be accessed by [passwordless SSH](../reference/misc/passwordless_ssh.md). If SSH is not supported, then follow the instructions in the [Machines Without Passwordless SSH](#without_passwordless) section.

-**To set up the cluster:**
+**To set up the cluster for machines with a shared path:**

1. Copy the downloaded binaries to the shared folder.

@@ -45,20 +57,64 @@ If all your machines can share a path over an NFS or similar protocol, then foll

3. Configure the cluster as described in [Configuring the Cluster](../configuring_cluster/configuring_cluster.md).

-4. After configuring each of the components, run the `snappy-start-all.sh` script:
+4. After configuring each of the members in the cluster, run the `snappy-start-all.sh` script:

        ./sbin/snappy-start-all.sh

    This creates a default folder named **work** and stores all SnappyData members' artifacts separately. The folder is identified by the name of the node.

-If SSH is not supported then follow the instructions in the [Machines without a Shared Path](#machine-shared-path) section.
+!!!Tip
+    For optimum performance, configure **-dir** to point to a local directory and not to a network directory. When the **-dir** property is configured for each member in the cluster, the artifacts of the respective members are created in the **-dir** folder, as illustrated in the sketch below.
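
As an illustration of the tip above, a minimal sketch of a **conf/servers** file that points each server's **-dir** at a local working directory; the hostnames and paths are hypothetical:

```
# conf/servers -- one line per data server; -dir names a local (not network-mounted) working directory
server1.example.com -dir=/data/snappydata/work/server1
server2.example.com -dir=/data/snappydata/work/server2
```

The same **-dir** option can be set for the entries in **conf/locators** and **conf/leads**.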

<a id="machine-shared-path"></a>
-## Machines without a Shared Path
+### Machines Without a Shared Path
+
+If the machines in your cluster do not share a path over an NFS or similar protocol, then use the following instructions:
+
+**Prerequisites**
+
+* Ensure that **/etc/hosts** correctly configures the host and IP address of each SnappyData member machine.
+
+* Ensure that SSH is supported and you have configured all the machines to be accessed by [passwordless SSH](../reference/misc/passwordless_ssh.md). If SSH is not supported, then follow the instructions in the [Machines Without Passwordless SSH](#without_passwordless) section.
+
+**To set up the cluster for machines without a shared path:**
+
+1. Copy and extract the downloaded binaries on each machine. Maintain the same directory structure on all the machines. For example, if you copy the binaries to **/opt/snappydata/** on the first machine, then you must copy them to **/opt/snappydata/** on the rest of the machines.
+
+2. Configure the cluster as described in [Configuring the Cluster](../configuring_cluster/configuring_cluster.md). Maintain one node as the controller node, where you configure your cluster; usually this is the lead node. On that machine, you can edit files such as servers, locators, and leads, which are in the **$SNAPPY_HOME/conf/** directory.
+
+3. Create a working directory on every machine for each of the SnappyData members that you want to run. <br> The member's working directory provides a default location for the logs, persistence, and status files of that member. <br>For example, if you want to run both a locator and a server member on the local machine, create separate directories for each member.
+
+4. Run the `snappy-start-all.sh` script:
+
+        ./sbin/snappy-start-all.sh
+
+<a id="without_passwordless"></a>
+### Machines Without Passwordless SSH
+
+If the machines in your cluster do not share a common path and also cannot be accessed by [passwordless SSH](../reference/misc/passwordless_ssh.md), then you can use the following instructions to deploy SnappyData:
+
+**To set up the cluster for machines without passwordless SSH:**
+
+1. Copy and extract the downloaded binaries on each machine. The binaries can be placed in different directory structures.
+
+2. [Configure](../configuring_cluster/configuring_cluster.md) each member separately.
+
+3. Start the members in the cluster one at a time. Start the locator first, then the servers, and finally the leads. Use the following scripts to start the members:
+
+    * `$SNAPPY_HOME/sbin/snappy-locator.sh`
+
+    * `$SNAPPY_HOME/sbin/snappy-server.sh`
+
+    * `$SNAPPY_HOME/sbin/snappy-lead.sh`
+
+!!!Note
+    These scripts start the members with the default configuration properties. To override the default configuration, you can pass the properties as arguments to the above scripts, as in the sketch below.

-* Ensure that the **/etc/hosts** correctly configures the host and IP Address of each SnappyData member machine.

-* Copy and extract the downloaded binaries on each machine. Ensure that the directory structure is the same on all machines.

-* On each host, create a working directory for each SnappyData member, that you want to run on the host. <br> The member working directory provides a default location for the log, persistence, and status files for that member.
-<br>For example, if you want to run both a locator and server member on the local machine, create separate directories for each member.
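
As an illustration of the per-member start scripts in the Machines Without Passwordless SSH section above, a hedged sketch of starting one locator, one server, and one lead by hand; the hostnames, port, and directories are placeholders, and the exact `-locators` value should match your own configuration:

```
# On the locator machine
$SNAPPY_HOME/sbin/snappy-locator.sh start -dir=/opt/snappydata/work/locator1

# On each server machine, point -locators at the running locator
$SNAPPY_HOME/sbin/snappy-server.sh start -dir=/opt/snappydata/work/server1 -locators=<locator-host>:10334

# On the lead machine
$SNAPPY_HOME/sbin/snappy-lead.sh start -dir=/opt/snappydata/work/lead1 -locators=<locator-host>:10334
```

The scripts typically also accept a `stop` argument, applied in the reverse order (leads, then servers, then the locator).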

docs/isight/quick_start_steps.md

Lines changed: 1 addition & 1 deletion

@@ -157,7 +157,7 @@ For more information, refer to the [Apache Zeppelin](#LoggingZeppelin) section o

SnappyData provides a script (`snappy-ec2`) that allows you to launch and manage SnappyData clusters on Amazon Elastic Compute Cloud (EC2).

-The scripts are available on GitHub in the [snappy-cloud-tools repository](https://github.com/SnappyDataInc/snappy-cloud-tools/tree/master/aws/ec2) and also from the latest [SnappyData Release page](https://github.com/SnappyDataInc/snappydata/releases). The package is available in compressed files (**snappydata-ec2-`<version>`.tar.gz**). Extract the contents to a location on your computer.
+The scripts are available on GitHub in the [snappy-cloud-tools repository](https://github.com/SnappyDataInc/snappy-cloud-tools/tree/master/aws/ec2) and also on the [Release page](https://github.com/SnappyDataInc/snappy-cloud-tools/releases). The package is available as a compressed file (**snappydata-ec2-`<version>`.tar.gz**). Extract the contents to a location on your computer.

For more information on setting up the cluster using the EC2 script, refer to [Using SnappyData EC2 scripts](../install/setting_up_cluster_on_amazon_web_services.md#EC2).
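
As a hedged illustration of the `snappy-ec2` script mentioned in this hunk, a minimal launch command; the key-pair name, identity file, and cluster name are placeholders, and the full option list is in the linked EC2 documentation:

```
# Launch a SnappyData cluster on EC2 using an existing AWS key pair (names are illustrative)
./snappy-ec2 -k my-ec2-keypair -i ~/.ssh/my-ec2-keypair.pem launch my-snappydata-cluster
```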
