Commit 66a5d2e

Explain header handling, AKA the Ferris arc.
1 parent a270e4f commit 66a5d2e

4 files changed

+129
-17
lines changed

docs/formats.dj

Lines changed: 94 additions & 17 deletions
@@ -461,31 +461,108 @@ $ jaq -n '["a","b"], [1,2], [3,4] | @csv' --raw-output
461461
3,4
462462
```
463463

464+
### Headers
465+
464466
jaq currently does not provide any special handling for table headers,
465467
where the first row of a table contains field names.
466-
However, using existing filters, you can handle table headers yourself.
468+
However, there are a few filters in `examples/table.jq`
469+
that aim to simplify handling of tables with headers.
470+
Let us see their usage.
471+
472+
For example, let us look at
473+
[Ferris Bueller](https://en.wikipedia.org/wiki/Ferris_Bueller%27s_Day_Off)'s student record:
474+
475+
```
476+
$ cat examples/ferris.csv
477+
ID,Name,Home phone,Days absent
478+
38-106,"Bueller, Ferris",555-6452,9
479+
```
480+
481+
Hmmm, this is a bit hard to read, and Edward R. Rooney, Dean of Students, is a busy man.
482+
Let us help him out a bit:
483+
484+
```
485+
$ jaq -L examples 'include "table"; from_table' -s examples/ferris.csv
486+
{
487+
"ID": "38-106",
488+
"Name": "Bueller, Ferris",
489+
"Home phone": "555-6452",
490+
"Days absent": 9
491+
}
492+
```
493+
494+
Ah, much better. So what does Edward R. Rooney have to say about Ferris to Ferris's mom?
467495

468-
One useful pattern is to use
496+
> "So far this semester, he has been absent nine times."
497+
> --- "Nine times?"
498+
> --- "Nine times."
499+
> --- "I don't remember him being sick nine times."
500+
> --- "That's probably because he wasn't sick. He was skipping school. Wake up and smell the coffee, Mrs. Bueller. It's a fool's paradise. He's just leading you down the primrose path."
501+
> --- "I can't believe it."
502+
> --- "I got it right here in front of me. He has missed nine days."
503+
504+
Now it's time for Ferris to shine:
505+
506+
```
507+
$ jaq -L examples 'include "table"; from_table | .["Days absent"] -= 7' -s examples/ferris.csv
508+
{
509+
"ID": "38-106",
510+
"Name": "Bueller, Ferris",
511+
"Home phone": "555-6452",
512+
"Days absent": 2
513+
}
514+
```
515+
516+
> "Grace! GRACE!!!"
517+
518+
Next, let's have a look at Ferris's courses:
519+
520+
```
521+
$ cat examples/ferris-courses.tsv
522+
Perds Days Course Teacher Rm Grade 1 Grade 2 Grade 3 Grade EX
523+
01-04 MTWF Eng Comp Hollndr 221 B+ A
524+
05-08 MWTF Calculus McMurry 309 B A-
525+
09-10 TWTF Chemistry Gunner 260 A- A
526+
11-13 All Lunch Caf
527+
14-19 MTWF Gym Carlyle 127 B+ A-
528+
23-25 TWTF Computr Sc Cohen 114 A- A
529+
26-29 MWTF Utpian Scy Jardin 241 A A
530+
30-32 MTWF Euro Hist Rice 334 B+ B+
531+
```
532+
533+
Let's see ... in which courses did Ferris get a B+ as first grade? Anyone? Anyone?
534+
535+
```
536+
$ jaq -L examples 'include "table"; [from_table | select(.["Grade 1"] == "B+")] | to_table' \
537+
-s examples/ferris-courses.tsv --to tsv
538+
Perds Days Course Teacher Rm Grade 1 Grade 2 Grade 3 Grade EX
539+
01-04 MTWF Eng Comp Hollndr 221 B+ A
540+
14-19 MTWF Gym Carlyle 127 B+ A-
541+
30-32 MTWF Euro Hist Rice 334 B+ B+
542+
```
543+
544+
Yeah, Ferris ... you still got a bit of work ahead of you on the database.
545+
546+
So far, we slurped in all tables before processing them, using [`-s`](#--slurp).
547+
This can be wasteful, especially when processing large tables, because
548+
slurping reads the whole table into memory.
549+
We can also read tables row by row.
550+
One useful pattern there is to use
469551
[`inputs`](#inputs) *without* the habitual [`-n`](#--null-input) option.
470552
That way, the first row (= header) becomes the filter input,
471553
and the remaining rows can then be retrieved with `inputs`.
472-
We can find the indices of fields with [`index`](#indices).
473-
Putting this together, we can extract only certain fields from a table:
554+
Putting this together, we can rewrite the previous example:
474555

475556
```
476-
$ printf "last,first,birthyear\nBach,J. S.,1685\nMozart,W. A.,1756\nvan Beethoven,L.,1770" \
477-
| jaq -c --from csv 'inputs as $row | [$row[index("last", "birthyear")]]'
478-
["Bach",1685]
479-
["Mozart",1756]
480-
["van Beethoven",1770]
557+
$ jaq -L examples 'include "table"; ., . as $header | from_row(inputs) | select(.["Grade 1"] == "B+") | to_row($header)' \
558+
examples/ferris-courses.tsv --to tsv
559+
Perds Days Course Teacher Rm Grade 1 Grade 2 Grade 3 Grade EX
560+
01-04 MTWF Eng Comp Hollndr 221 B+ A
561+
14-19 MTWF Gym Carlyle 127 B+ A-
562+
30-32 MTWF Euro Hist Rice 334 B+ B+
481563
```
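
For readers who think more easily in a general-purpose language, the row-by-row pattern can be sketched in Python. This is only a rough model under stated assumptions, not how jaq works internally: the function name `select_rows` and the sample data are made up for illustration, and Python's standard `csv` module stands in for jaq's TSV reader.

```python
import csv
import io

def select_rows(lines, column, wanted, delimiter="\t"):
    """Yield the header, then every row whose `column` equals `wanted`.

    `lines` is any iterable of text lines (for example an open file),
    so only one row needs to be held in memory at a time.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader, None)    # first row = header, like the filter input `.`
    if header is None:             # empty input: nothing to yield
        return
    index = header.index(column)   # position of the column to test
    yield header
    for row in reader:             # remaining rows, like `inputs`
        if len(row) > index and row[index] == wanted:
            yield row

# Made-up sample data (hypothetical, not the contents of examples/ferris-courses.tsv).
sample = io.StringIO("Course\tGrade 1\nGym\tB+\nCalculus\tB\nEuro Hist\tB+\n")
selected = list(select_rows(sample, "Grade 1", "B+"))
for row in selected:
    print("\t".join(row))
```

The header is consumed once and then reused for every later row, which is the same role that `$header` plays in the jaq command above.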
482564

483-
You can also convert rows to objects via:
565+
This concludes our little Ferris Bueller arc.
566+
If you do not know the film yet, I highly recommend that you change that.
567+
And remember: "Life moves pretty fast. If you don't stop and look around once in a while, you could miss it."
484568

485-
```
486-
$ printf "last,first,birthyear\nBach,J. S.,1685\nMozart,W. A.,1756\nvan Beethoven,L.,1770" \
487-
| jaq -c --from csv 'to_entries | inputs as $row | map({(.value): $row[.key]}) | add'
488-
{"last":"Bach","first":"J. S.","birthyear":1685}
489-
{"last":"Mozart","first":"W. A.","birthyear":1756}
490-
{"last":"van Beethoven","first":"L.","birthyear":1770}
491-
```

examples/ferris-courses.tsv

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
Perds Days Course Teacher Rm Grade 1 Grade 2 Grade 3 Grade EX
2+
01-04 MTWF Eng Comp Hollndr 221 B+ A
3+
05-08 MWTF Calculus McMurry 309 B A-
4+
09-10 TWTF Chemistry Gunner 260 A- A
5+
11-13 All Lunch Caf
6+
14-19 MTWF Gym Carlyle 127 B+ A-
7+
23-25 TWTF Computr Sc Cohen 114 A- A
8+
26-29 MWTF Utpian Scy Jardin 241 A A
9+
30-32 MTWF Euro Hist Rice 334 B+ B+

examples/ferris.csv

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
ID,Name,Home phone,Days absent
2+
38-106,"Bueller, Ferris",555-6452,9

examples/table.jq

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
# Convert between rows/tables and objects/arrays of objects.
2+
#
3+
# A row is an array of values and a table is an array of rows.
4+
# The first row of a table is usually a *header*, containing field names.
5+
#
6+
# Properties:
7+
# - [from_table] | to_table === .
8+
# - $keys | from_row(rows) | to_row($keys) === rows
9+
#
10+
# Examples:
11+
# - ["first", "last"] as $keys | $keys | from_row([1, 2]) | to_row($keys) --> [1, 2]
12+
# - [["first", "last"], [1, 2]] | [from_table] | to_table --> ["first", "last"], [1, 2]
13+
14+
# Take a header as input and a row as argument, yield an object.
15+
def from_row($row): to_entries | .[] |= {(.value): $row[.key]} | add;
16+
# Take a table as input, yield a stream of objects (one per non-header row).
17+
def from_table: .[1:][] as $row | .[0] | from_row($row);
18+
19+
# Take an object as input and a header as argument, yield a row.
20+
def to_row($keys): [.[$keys[]]];
21+
# Take an array of objects as input and a header as argument, yield the rows of a table (header first).
22+
def to_table($keys): $keys, (.[] | to_row($keys));
23+
# Take an array of objects as input, yield the rows of a table (header first).
24+
def to_table: to_table(.[0] | keys_unsorted);
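
The conversions in `examples/table.jq` can also be modeled in Python, which may help when checking the round-trip properties stated in the module's header comment. This is a hedged sketch, not part of the commit: the Python functions merely borrow the names of the jq filters, jq streams are modeled as generators, and `None` stands in for jq's `null`.

```python
# Hypothetical Python model of the filters in examples/table.jq.
# A row is a list of values; a table is a list of rows whose first row is the header.

def from_row(header, row):
    """Pair each header field with the value at the same index in the row."""
    # As in jq, indexing past the end of the row gives null (here: None).
    return {key: (row[i] if i < len(row) else None) for i, key in enumerate(header)}

def from_table(table):
    """Yield one object (dict) per non-header row."""
    if not table:      # an empty table has no rows to convert
        return
    header, *rows = table
    for row in rows:
        yield from_row(header, row)

def to_row(obj, keys):
    """Look up each key in order; missing keys become None."""
    return [obj.get(key) for key in keys]

def to_table(objs, keys=None):
    """Yield the header, then one row per object."""
    if keys is None:
        keys = list(objs[0])   # like `keys_unsorted` on the first object
    yield keys
    for obj in objs:
        yield to_row(obj, keys)

table = [["first", "last"], [1, 2], [3, 4]]
objects = list(from_table(table))
round_trip = list(to_table(objects))
print(objects)
print(round_trip == table)
```

Collecting the output of `to_table` into a list gives back the original table, which mirrors the property `[from_table] | to_table === .` for tables with at least one data row.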
