|
| 1 | +# TA-dataset |
| 2 | +The DataSet Add-on for Splunk provides integration with [DataSet](https://www.dataset.com) by [SentinelOne](https://sentinelone.com). The key functions allow two-way integration: |
| 3 | +- SPL custom command to query DataSet directly from the Splunk UI without having to reindex data to Splunk. |
| 4 | +- Inputs to index CIM-compliant alerts, or user-defined query results, from DataSet into Splunk. |
| 5 | +- Alert action to send events from Splunk to DataSet. |
| 6 | + |
| 7 | +## Installation |
| 8 | +Refer to the Splunk documentation for [installing add-ons](https://docs.splunk.com/Documentation/AddOns/released/Overview/Installingadd-ons). |
| 9 | +### Splunk Enterprise |
| 10 | +| Splunk component | Required | Comments | |
| 11 | +| ------ | ------ | ------ | |
| 12 | +| Search heads | Yes | Required to use the custom search command. | |
| 13 | +| Indexers | No | Parsing is performed during data collection. | |
| 14 | +| Forwarders | Yes | For distributed deployments, this add-on requires heavy forwarders for modular inputs. | |
| 15 | + |
| 16 | +### Splunk Cloud |
| 17 | +| Splunk component | Required | Comments | |
| 18 | +| ------ | ------ | ------ | |
| 19 | +| Search heads | Yes | Required to use the custom search command. Splunk Cloud Victoria Experience also handles modular inputs on the search heads. | |
| 20 | +| Indexers | No | Parsing is performed during data collection. | |
| 21 | +| Inputs Data Manager | Yes | For Splunk Cloud Classic Experience, this add-on requires an IDM for modular inputs. | |
| 22 | + |
| 23 | +## Configuration |
| 24 | +### DataSet |
| 25 | +1. Navigate to https://app.scalyr.com/keys |
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | +2. Click Add Key > Add Read Key (required for inputs and search command). |
| 30 | +3. Click Add Key > Add Write Key (required for alert action). |
| 31 | +4. Optionally, click the pencil icon to rename the keys. |
| 32 | + |
| 33 | +### Splunk |
| 34 | +1. In Splunk, open the Add-on. |
| 35 | + |
| 36 | + |
| 37 | + |
| 38 | +2. On the Configuration page, in the DataSet Account tab: |
| 39 | +- Select the environment. |
| 40 | +- Enter the DataSet read key. |
| 41 | +- Enter the DataSet write key. |
| 42 | + |
| 43 | +3. Optionally, configure logging level and proxy information on the associated tabs. |
| 44 | +4. Click Save. |
| 45 | + |
| 46 | + |
| 47 | + |
| 48 | +5. On the Inputs page, click Create New Input and select the desired input. |
| 49 | + |
| 50 | + |
| 51 | + |
| 52 | +6. For DataSet alerts, enter: |
| 53 | +- A name for the input. |
| 54 | +- Interval, in seconds. A good starting point is `300` seconds to collect every five minutes. |
| 55 | +- Splunk index name. |
| 56 | +- Start time, in relative shorthand form, e.g.: `24h` for 24 hours before input execution time. |
| 57 | +7. Click Save. |
| 58 | + |
| 59 | + |
| 60 | + |
| 61 | +8. For DataSet queries, enter: |
| 62 | +- A name for the input. |
| 63 | +- Interval, in seconds. A good starting point is `300` seconds to collect every five minutes. |
| 64 | +- Splunk index name. |
| 65 | +- Start time, in relative shorthand form, e.g.: `24h` for 24 hours before input execution time. |
| 66 | +- *(optional)* End time, in relative shorthand form, e.g.: `5m` for 5 minutes before input execution time. |
| 67 | +- *(optional)* Query string used to return matching events. |
| 68 | +- *(optional)* Maximum number of events to return. |
| 69 | + |
| 70 | +## Using |
| 71 | + |
| 72 | +### Inputs |
| 73 | +The DataSet Add-on for Splunk collects the following inputs utilizing time-based checkpointing to prevent reindexing the same data: |
| 74 | + |
| 75 | +| Source Type | Description | CIM Data Model | |
| 76 | +| ------ | ------ | ------ | |
| 77 | +| dataset:alerts | Predefined Power Query API call to index [alert state change records](https://app.scalyr.com/help/alerts#logging) | [Alerts](https://docs.splunk.com/Documentation/CIM/latest/User/Alerts) | |
| 78 | +| dataset:query | User-defined standard [query](https://app.scalyr.com/help/api#query) API call to index events | - | |
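The time-based checkpointing mentioned above can be pictured with a minimal Python sketch. This is not the add-on's actual implementation: the function names, the JSON file layout, and the `last_end` key are all hypothetical, chosen only to illustrate how recording the end of each collection window avoids fetching the same time range twice.

```python
import json
import time
from pathlib import Path
from typing import Optional, Tuple


def next_window(checkpoint: Path, lookback_s: int,
                now: Optional[float] = None) -> Tuple[float, float]:
    """Return the (start, end) epoch range the next collection run should cover."""
    end = time.time() if now is None else now
    if checkpoint.exists():
        # Resume exactly where the previous run stopped.
        start = json.loads(checkpoint.read_text())["last_end"]
    else:
        # First run: fall back to the configured start time (e.g. 24h = 86400 s).
        start = end - lookback_s
    return start, end


def save_checkpoint(checkpoint: Path, end: float) -> None:
    """Persist the end of the window that was just collected (hypothetical format)."""
    checkpoint.write_text(json.dumps({"last_end": end}))
```

In this sketch the configured start time only matters on the first run; every later run begins at the previously saved end time.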
| 79 | + |
| 80 | +## SPL Command |
| 81 | +The `| dataset` command allows queries against the DataSet API directly from Splunk's search bar. Five optional parameters are supported: |
| 82 | + |
| 83 | +- **method** - Define `query` or `powerQuery` to call the appropriate REST endpoint. Default is query. |
| 84 | +- **query** - The DataSet [query](https://app.scalyr.com/help/query-language) or Power Query []() used to filter events. Default is no filter (return all events limited by maxCount). |
| 85 | +- **maxCount** - Number of events to return from DataSet. Default is 100. |
| 86 | +- **startTime** - The Splunk time picker can be used (not "All Time"), or startTime is an alternative to define the [start time](https://app.scalyr.com/help/time-reference) for DataSet events to return. Use epoch time or relative shorthand in the form of a number followed by d, h, m or s (for days, hours, minutes or seconds), e.g.: `24h`. Default is 24h. Note the |
| 87 | +- **endTime** - The Splunk time picker can be used (not "All Time"), or endTime is an alternative to define the [end time](https://app.scalyr.com/help/time-reference) for DataSet events to return. Use epoch time or relative shorthand in the form of a number followed by d, h, m or s (for days, hours, minutes or seconds), e.g.: `5m`. Default is search time. |
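To make the relative shorthand concrete, here is a small Python sketch that converts values such as `24h` or `5m` into seconds, following the "number followed by d, h, m or s" rule described in the parameter list. The function name is hypothetical and the add-on may parse these values differently; epoch times are deliberately not handled here.

```python
import re

# Unit suffixes named in the parameter descriptions: days, hours, minutes, seconds.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def shorthand_to_seconds(value: str) -> int:
    """Convert relative shorthand such as '24h' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([dhms])", value.strip())
    if match is None:
        raise ValueError(f"not relative shorthand: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

For example, `shorthand_to_seconds("24h")` gives 86400, meaning the window starts 24 hours before the search runs.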
| 88 | + |
| 89 | +For all queries, be sure to `"`wrap the entire query in double quotes`"`, and inside use `'`single quotes`'` or double quotes `\"`escaped with a backslash`\"`, as shown in the following examples. |
| 90 | + |
| 91 | +Query Example: |
| 92 | +`| dataset method=query search="serverHost = * AND Action = 'allow'" maxCount=50 startTime=10m endTime=1m` |
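When search strings are generated programmatically, the quoting rule can be applied with a small helper. The Python sketch below is only an illustration: the function name is hypothetical, and it uses the `search=` argument as written in the query example above. It wraps the query in double quotes and backslash-escapes any double quotes inside it.

```python
def build_dataset_search(query: str, method: str = "query", **options) -> str:
    """Return a `| dataset` search string with the query wrapped in double quotes."""
    # Single quotes pass through unchanged; embedded double quotes are escaped.
    escaped = query.replace('"', '\\"')
    parts = [f"| dataset method={method}", f'search="{escaped}"']
    # Remaining options (maxCount, startTime, endTime) are appended as name=value.
    parts += [f"{name}={value}" for name, value in options.items()]
    return " ".join(parts)


print(build_dataset_search("serverHost = * AND Action = 'allow'",
                           maxCount=50, startTime="10m", endTime="1m"))
```

The printed string can be pasted into the Splunk search bar as-is.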
| 93 | + |
| 94 | +Power Query Example 1: `| dataset method=powerQuery search="dataset = \"accesslog\" |
| 95 | +| group requests = count(), errors = count(status == 404) by uriPath |
| 96 | +| let rate = errors / requests |
| 97 | +| filter rate > 0.01 |
| 98 | +| sort -rate"` |
| 99 | + |
| 100 | + |
| 101 | + |
| 102 | +Power Query Example 2: `| dataset method=powerQuery search="$serverHost == 'cloudWatchLogs' |
| 103 | +| parse 'RequestId: $RID$ Duration: $DUR$ ms Billed Duration: $BDUR$ ms Memory Size: $MEM$ MB Max Memory Used: $UMEM$ MB' |
| 104 | +| let deltaDUR= BDUR - DUR, deltaMEM = MEM - UMEM |
| 105 | +| sort -DUR |
| 106 | +| columns 'Request ID' = RID, 'Duration(ms)' = DUR, 'Charged delta (ms)' = deltaDUR, 'Used Memory (MB)' = UMEM, 'Charged delta Memory (MB)' = deltaMEM" startTime=5m` |
| 107 | + |
| 108 | +Since events are returned in JSON format, the Splunk [spath command](https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchReference/Spath) is useful. Additionally, the Splunk [collect command](https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/collect) can be used to add the events to a summary index: |
| 109 | + |
| 110 | +``` |
| 111 | +| dataset query="serverHost = * AND Action = 'allow'" maxCount=50 startTime=10m endTime=1m |
| 112 | +| spath |
| 113 | +| collect index=dataset |
| 114 | +``` |
| 115 | + |
| 116 | +## Alert Action |
| 117 | +An alert action allows sending an event to the DataSet [addEvents API](https://app.scalyr.com/help/api#addEvents). |
| 118 | + |
| 119 | +##### Note |
| 120 | +This add-on was built with the [Splunk Add-on UCC framework](https://splunk.github.io/addonfactory-ucc-generator/). |
| 121 | +Splunk is a trademark or registered trademark of Splunk Inc. in the United States and other countries. |