Commit bc6a5d3

Update documentation on writing queries
* Bump gaarf-py to 1.9.0

Change-Id: I1fa82662df79b20066f52d1ea787a502a15ff427
1 parent 8df0b70 commit bc6a5d3

3 files changed

+178
-203
lines changed

README.md

Lines changed: 29 additions & 201 deletions
@@ -4,6 +4,7 @@ Google Ads API Report Fetcher (gaarf)
44
[![Downloads npm](https://img.shields.io/npm/dw/google-ads-api-report-fetcher?logo=npm)](https://www.npmjs.com/package/google-ads-api-report-fetcher)
55
[![PyPI](https://img.shields.io/pypi/v/google-ads-api-report-fetcher?logo=pypi&logoColor=white&style=flat-square)](https://pypi.org/project/google-ads-api-report-fetcher/)
66
[![Downloads PyPI](https://img.shields.io/pypi/dw/google-ads-api-report-fetcher?logo=pypi)](https://pypi.org/project/google-ads-api-report-fetcher/)
7+
[![GitHub Workflow CI](https://img.shields.io/github/actions/workflow/status/google/ads-api-report-fetcher/pytest.yaml?branch=main&label=pytest&logo=python&logoColor=white&style=flat-square)](https://github.com/google/ads-api-report-fetcher/actions/workflows/pytest.yaml?branch=main)
78

89

910
## Table of content
@@ -75,7 +76,7 @@ Options:
7576
* `account` - Ads account id, aka customer id, it can contain multiple ids separated with comma, also can be specified in google-ads.yaml as 'customer-id' (as string or list)
7677
* `input` - input type - where queries are coming from (Python only). Supports the following values:
7778
* `file` - (default) local or remote (GCS, S3, Azure, etc.) files
78-
* `console` - data are read from standard output
79+
* `console` - data are read from standard input
7980
* `output` - output type, Supports the following values:
8081
* `csv` - write data to CSV files
8182
* `bq` or `bigquery` - write data to BigQuery
@@ -124,24 +125,14 @@ Options specific for SqlAlchemy writer (*Python version only*):
124125
* `sqldb.connection-string` to specify where to write the data (see [more](https://docs.sqlalchemy.org/en/14/core/engines.html))
125126
* `sqldb.if-exists` - specify how to behave if the table already exists (see [more](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html))
126127

128+
#### Query specific options
127129

128-
All parameters whose names start with the `macro.` prefix are passed to queries as params object.
129-
For example if we pass parameters: `--macro.start_date=2021-12-01 --macro.end_date=2022-02-28`
130-
then inside sql we can use `start_date` and `end_date` parameters in curly brackets:
131-
```sql
132-
AND segments.date >= "{start_date}"
133-
AND segments.date <= "{end_date}"
134-
```
130+
If your query contains macros, templates, or sql, you need to pass `--macro.`, `--template.`, or `--sql.` CLI flags to `gaarf`.
131+
Learn more about each of those in the [How to write queries](docs/how-to-write-queries.md) document:
132+
* [Macros](docs/how-to-write-queries.md#macros)
133+
* [Templates](docs/how-to-write-queries.md#templates)
134+
* [Sql](docs/how-to-write-queries.md#sql)
135135

136-
Full example:
137-
```
138-
gaarf google_ads_queries/*.sql --ads-config=google-ads.yaml \
139-
--account=1234567890 --output=bq \
140-
--macro.start_date=2021-12-01 \
141-
--macro.end_date=2022-02-28 \
142-
--bq.project=my_project \
143-
--bq.dataset=my_dataset
144-
```
145136

146137
If you run Python version of `gaarf` you can provide query directly from console:
147138

@@ -166,12 +157,12 @@ gaarf-bq <files> [options]
166157
gaarf-sql <files> [options]
167158
```
168159

169-
Options:
170-
* `sql.*` - named SQL parameters to be used in queries as `@param`. E.g. a parameter 'date' supplied via cli as `--sql.date=2022-06-01` can be used in query as `@date` in query.
171-
* `macro.*` - macro parameters to substitute into queries as `{param}`. E.g. a parameter 'dataset' supplied via cli as `--macro.dataset=myds` can be used as `{dataset}` in query's text.
172-
* `template.*` - parameters for templates, strings with "," will be converted to lists/arrays
160+
If your query contains macros, templates, or sql, you need to pass `--macro.`, `--template.`, or `--sql.` CLI flags to `gaarf-bq` or `gaarf-sql`.
161+
Learn more about each of those in the [How to write queries](docs/how-to-write-queries.md) document:
162+
* [Macros](docs/how-to-write-queries.md#macros)
163+
* [Templates](docs/how-to-write-queries.md#templates)
164+
* [Sql](docs/how-to-write-queries.md#sql)
173165

174-
175166
The tool assumes that scripts you provide are DDL, i.e. contains statements like create table or create view.
176167

177168
In general it's recommended to separate tables with data from Ads API and final tables/views created by your post-processing queries.
@@ -182,15 +173,15 @@ In general it's recommended to separate tables with data from Ads API and final
182173
* `dataset-location` - BigQuery [locations](https://cloud.google.com/bigquery/docs/locations) for newly created dataset(s)
183174

184175
So it's likely that your final tables will be in a separate dataset (or datasets). To allow the tool to create those datasets for you, make sure that macro for your datasets contains the word "dataset".
185-
In that case gaarf-bq will check that a dataset exists and create it if not.
176+
In that case `gaarf-bq` will check that the dataset exists and create it if not.
186177

187178

188179
For example:
189180
```
190181
CREATE OR REPLACE TABLE `{dst_dataset}.my_dashboard_table` AS
191182
SELECT * FROM {ads_ds}.{campaign}
192183
```
193-
In this case gaarf-bq will check for existance of a dataset specified as 'dst_dataset' macro.
184+
In this case `gaarf-bq` will check for existence of a dataset specified as 'dst_dataset' macro.
194185

195186
**SqlAlchemy specific options [Python only]:**
196187
* `connection-string` - specific connection to the selected DB (see [more](https://docs.sqlalchemy.org/en/14/core/engines.html))
@@ -210,181 +201,6 @@ export GAARF_DB_PORT=12345
210201
export GAARF_DB_NAME=test
211202
```
212203

213-
**Common options**
214-
215-
There are three type of parameters that you can pass to a script: `macro`, `sql`, and `template`.
216-
217-
*Macro*
218-
219-
Macro is just a substitution in script text.
220-
For example:
221-
```
222-
SELECT *
223-
FROM {dst_dataset}.{table-src}
224-
```
225-
Here `dst_dataset` and `table-src` are macros that can be supplied as:
226-
```
227-
gaarf-bq --macro.table-src=table1 --macro.dst_dataset=dataset1
228-
```
229-
230-
*SQL*
231-
232-
You can also use normal sql type parameters with `sql` argument:
233-
```
234-
SELECT *
235-
FROM {dst_dataset}.{table-src}
236-
WHERE name LIKE @name
237-
```
238-
and to execute:
239-
`gaarf-bq --macro.table-src=table1 --macro.dst_dataset=dataset1 --sql.name='myname%'`
240-
241-
it will create a parameterized query to run in BQ:
242-
```
243-
SELECT *
244-
FROM dataset1.table1
245-
WHERE name LIKE @name
246-
```
247-
248-
*Template*
249-
250-
Your SQL scripts can be templates using a template engine: [Jinja](https://jinja.palletsprojects.com) for Python and [Nunjucks](https://mozilla.github.io/nunjucks/) for NodeJS.
251-
A script will be processed as a template if and only if you supplied `template` argument.
252-
253-
Inside templates you can use appropriate syntax and control structues of a template engine (Jinja/Nunjucks).
254-
They are mostly compatible but please consult the documentations if you migrate between platforms (Python <-> NodeJS).
255-
256-
Usually inside template blocks you use some variable (in if-else/for-loop). To pass their values you use `--template` arguments.
257-
258-
Example:
259-
```
260-
SELECT
261-
customer_id AS
262-
{% if level == "0" %}
263-
root_account_id
264-
{% else %}
265-
leaf_account_id
266-
{% endif %}
267-
FROM dataset1.table1
268-
WHERE name LIKE @name
269-
```
270-
and to execute:
271-
272-
`gaarf-bq path/to/query.sql --template.level=0`
273-
274-
This will create a column named either `root_account_id` since the specified level is 0.
275-
276-
Please note that all values passed through CLI arguments are strings. But there's a special case - a value containing ","
277-
then it's treated as an array - see the following example.
278-
279-
Template are great when you need to create multiple column based on condition:
280-
281-
```
282-
SELECT
283-
{% for day in cohort_days %}
284-
SUM(GetCohort(lag_data.installs, {{day}})) AS installs_{{day}}_day,
285-
{% endfor %}
286-
FROM asset_performance
287-
```
288-
and to execute:
289-
290-
`gaarf-bq path/to/query.sql --template.cohort_days=0,1,3,4,5,10,30`
291-
292-
It will create 7 columns (named `installs_0_day`, `installs_1_day`, etc) because the cohort_days argument was processed as a list.
293-
294-
ATTENTION: passing macros into sql queries is vulnerable to sql-injection so be very careful where you're taking values from.
295-
296-
297-
## Expressions and Macros
298-
> *Note*: currently expressions are supported only in NodeJS version.
299-
300-
As noted earlier both Ads queries and BigQuery queries support macros. They are named values than can be passed alongside
301-
parameters (e.g. command line, config files) and substituted into queries. Their syntax is `{name}`.
302-
On top of this queries can contain expressions. The syntax for expressions is `${expression}`.
303-
They will be executed right after macros substitution. So macros can contain expressions inside.
304-
Both expressions and macros deal with query text before submitting it for execution.
305-
Inside expression block we can do anything that the MathJS library supports - see https://mathjs.org/docs/index.html,
306-
plus work with date and time. It's all sort of arithmetic operations, strings and dates manipulations.
307-
308-
One typical use-case - evaluate date/time expressions to get dynamic date conditions in queries. These are when you don't provide
309-
a specific date but evaluate it right in the query. For example, applying a condition for date range for last month,
310-
which can be expressed as a range from today minus 1 month to today (or yesterday):
311-
```
312-
WHERE start_date >= '${today()-period('P1M')}' AND end_date <= '${today()}'
313-
```
314-
will be evaluated to:
315-
`WHERE start_date >= '2022-06-20 AND end_date <= '2022-07-20'`
316-
if today is 2022 July 20th.
317-
318-
Also you can use expressions for making table names dynamic (in BQ scripts), e.g.
319-
```
320-
CREATE OR REPLACE TABLE `{bq_dataset}_bq.assetssnapshots_${format(yesterday(),'yyyyMMdd')}` AS
321-
```
322-
323-
Supported functions:
324-
* `datetime` - factory function to create a DateTime object, by default in ISO format (`datetime('2022-12-31T23:59:59')`) or in a specified format in the second argument (`datetime('12/31/2022 23:59','M/d/yyyy hh:mm')`)
325-
* `date` - factory function to create a Date object, supported formats: `date(2022,12,31)`, `date('2022-12-31')`, `date('12/31/2022','M/d/yyyy')`
326-
* `duration` - returns a Duration object for a string in [ISO_8601](https://en.wikipedia.org/wiki/ISO_8601#Durations) format (PnYnMnDTnHnMnS)
327-
* `period` - returns a Period object for a string in [ISO_8601](https://en.wikipedia.org/wiki/ISO_8601#Durations) format (PnYnMnD)
328-
* `today` - returns a Date object for today date
329-
* `yesterday` - returns a Date object for yesterday date
330-
* `tomorrow` - returns a Date object for tomorrow date
331-
* `now` - returns a DateTime object for current timestamp (date and time)
332-
* `format` - formats Date or DateTime using a provided format, e.g. `${format(date('2022-07-01'), 'yyyyMMdd')}` returns '20220701'
333-
334-
Please note functions without arguments still should called with brackets (e.g. `today()`)
335-
336-
For dates and datetimes the following operations are supported:
337-
* add or subtract Date and Period, e.g. `today()-period('P1D')` - subtract 1 day from today (i.e. yesterday)
338-
* add or subtract DateTime and Duration, e.g. `now()-duration('PT12H')` - subtract 12 hours from the current datetime
339-
* for both Date and DateTime add or subtract a number meaning it's a number of days, e.g. `today()-1`
340-
* subtract two Dates to get a Period, e.g. `tomorrow()-today()` - subtract today from tomorrow and get 1 day, i.e. 'P1D'
341-
* subtract two DateTimes to get a Duration - similar to subtracting dates but get a duration, i.e. a period with time (e.g. PT10H for 10 hours)
342-
343-
By default all dates will be parsed and converted from/to strings in [ISO format]((https://en.wikipedia.org/wiki/ISO_8601)
344-
(yyyy-mm-dd for dates and yyyy-mm-ddThh:mm:ss.SSS for datetimes).
345-
But additionally you can specify a format explicitly (for parsing with `datetime` and `date` function and formatting with `format` function)
346-
using standard [Java Date and Time Patterns](https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html):
347-
348-
* G Era designator
349-
* y Year
350-
* Y Week year
351-
* M Month in year (1-based)
352-
* w Week in year
353-
* W Week in month
354-
* D Day in year
355-
* d Day in month
356-
* F Day of week in month
357-
* E Day name in week (e.g. Tuesday)
358-
* u Day number of week (1 = Monday, ..., 7 = Sunday)
359-
* a Am/pm marker
360-
* H Hour in day (0-23)
361-
* k Hour in day (1-24)
362-
* K Hour in am/pm (0-11)
363-
* h Hour in am/pm (1-12)
364-
* m Minute in hour
365-
* s Second in minute
366-
* S Millisecond
367-
* z Time zone - General time zone (e.g. Pacific Standard Time; PST; GMT-08:00)
368-
* Z Time zone - RFC 822 time zone (e.g. -0800)
369-
* X Time zone - ISO 8601 time zone (e.g. -08; -0800; -08:00)
370-
371-
Examples:
372-
```
373-
${today() - period('P2D')}
374-
```
375-
output: today minus 2 days, e.g. '2022-07-19' if today is 2022-07-21
376-
377-
```
378-
${today()+1}
379-
```
380-
output: today plus 1 days, e.g. '2022-07-22' if today is 2022-07-21
381-
382-
```
383-
${date(2022,7,20).plusMonths(1)}
384-
```
385-
output: "2022-08-20"
386-
387-
388204
### Dynamic dates
389205
Macro values can contain a special syntax for dynamic dates. If a macro value starts with *:YYYY* it will be processed
390206
as a dynamic expression to calculate a date based on the current date.
@@ -418,11 +234,23 @@ But you can override it via arguments if needed (e.g. `--macro.date_iso=:YYYYMMD
418234

419235

420236
## Docker
421-
You can run Gaarf as a Docker container. At the moment we don't publish container images so you'll need to build it on your own.
237+
238+
You can run Gaarf as a Docker container.
239+
240+
```
241+
export GAARF_ACCOUNT=123456
242+
docker run \
243+
-v $HOME/google-ads.yaml:/root/google-ads.yaml \
244+
ghcr.io/google/gaarf-py:latest \
245+
gaarf "SELECT customer.id AS account_id FROM customer" \
246+
--input=console --output=console \
247+
--account=$GAARF_ACCOUNT --ads_config=/root/google-ads.yaml
248+
```
249+
250+
### Build a container image
422251
The repository contains sample `Dockerfile`'s for both versions ([Node](js/Dockerfile)/[Python](py/Dockerfile))
423252
that you can use to build a Docker image.
424253

425-
### Build a container image
426254
If you cloned the repo then you can just run `docker build` (see below) inside it (in js/py folders) with the local [Dockerfile](js/Dockerfile).
427255
Otherwise you can just download `Dockerfile` into an empty folder:
428256
```
