* docs: Updated the installation doc to include a section on cloning the repo, added docs on authentication, and updated the Terraform instructions so that they work
* docs: Reformatted sections to be more reader-friendly, added clearer language, and linked to the Google Cloud SDK installation guide
* docs: Specified the `terraform` folder

`docs/installation.md` (35 additions, 17 deletions):

# Data Validation Tool Installation Guide

The tool natively supports BigQuery connections. If you need to connect to other databases such as Teradata or Oracle, you will need to install the appropriate connection libraries. (See the [Connections](connections.md) page for details)

This tool can be natively installed on your machine or can be containerized and run with Docker.

## Prerequisites

- Any machine with Python 3.6+ installed.
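
To confirm that your interpreter meets this requirement, a quick check (assuming `python3` is on your `PATH`) is:

```
python3 --version  # should report Python 3.6 or higher
```
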

## Setup

By default, the data validation tool writes validation results to `stdout`. However, we recommend storing the results in a BigQuery table in order to standardize the process and share results across a team. To allow the tool to write to BigQuery, you need a table created with a specific schema, which brings a couple of requirements:

- A Google Cloud Platform project with the BigQuery API enabled.
- A Google user account with appropriate permissions. If you plan to run this tool in production, it's recommended that you create a service account specifically for running the tool. See our [guide](https://cloud.google.com/docs/authentication/production) on how to authenticate with your service account. If you use a service account, grant it the appropriate roles on your project so that it has permission to create and read resources.
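
As a sketch of the two common authentication flows (these are standard `gcloud` and Application Default Credentials mechanics, not commands specific to this tool):

```
# Authenticate as your user account via Application Default Credentials:
gcloud auth application-default login

# Or point client libraries at a service account key file
# (the path below is a hypothetical placeholder):
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
```
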
Clone the repository onto your machine and navigate inside the directory:
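
For example (a minimal sketch, using the repository URL referenced at the end of this guide):

```
git clone https://github.com/GoogleCloudPlatform/professional-services-data-validator.git
cd professional-services-data-validator
```
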
There are two methods of creating the BigQuery output table for the tool: via *Terraform* or the *Cloud SDK*.

### Cloud Resource Creation - Terraform

By default, Terraform is run inside a test environment and needs to be directed to your project. Perform the following steps to direct the creation of the BigQuery table to your project:

1. Delete the `testenv.tf` file inside the `terraform` folder
2. Open `variables.tf` inside the `terraform` folder and replace `default = "pso-kokoro-resources"` with `default = "YOUR_PROJECT_ID"` (a scripted version of both steps follows)
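
A scripted equivalent of these two steps, assuming GNU `sed` and that you run it from the repo root, might look like:

```
rm terraform/testenv.tf
sed -i 's/pso-kokoro-resources/YOUR_PROJECT_ID/' terraform/variables.tf
```
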
After installing the [terraform CLI tool](https://learn.hashicorp.com/tutorials/terraform/install-cli) and completing the steps above, run the following commands from inside the root of the repo:

```
cd terraform
terraform init
terraform apply
```

### Cloud Resource Creation - Cloud SDK (gcloud)
Install the [Google Cloud SDK](https://cloud.google.com/sdk/docs/install) if necessary.
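
After installing, make sure the SDK points at the project that should hold the results table (a standard SDK setup step; `YOUR_PROJECT_ID` is a placeholder):

```
gcloud config set project YOUR_PROJECT_ID
```
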
Create a dataset for validation results:

```
bq mk pso_data_validator
```

Create a table:

```
bq mk --table \
...
terraform/results_schema.json
```

### Cloud Resource Creation - After success

You should see a dataset named `pso_data_validator` and a table named `results` created inside of your project.
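
To verify from the command line, the standard `bq` commands below should list both resources (assuming the Cloud SDK is installed and authenticated):

```
bq ls                               # the pso_data_validator dataset should appear
bq ls pso_data_validator            # the results table should appear
bq show pso_data_validator.results  # prints the table metadata and schema
```
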
After installing the CLI tool using the instructions below, you will be ready to run data validation commands and output the results to BigQuery. See an example [here](https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/develop/docs/examples.md#store-results-in-a-bigquery-table).