Commit 28bbe76

Merge pull request #400 from rpinterKX/bin
major rewrite of bin with lots of added examples
2 parents 749d7ae + 9d37848 commit 28bbe76

1 file changed

+79
-18
lines changed

docs/ref/bin.md

Lines changed: 79 additions & 18 deletions
@@ -16,20 +16,48 @@ x bin y bin[x;y]
1616
x binr y binr[x;y]
1717
```
1818

19+
## Lists
20+
1921
Where
2022

2123
- `x` is a sorted list
22-
- `y` is a list or atom of exactly the same type (no type promotion)
24+
- `y` is an atom of exactly the same type (no type promotion)
2325

24-
returns the index of the _last_ item in `x` which is ≤`y`. The result is `-1` for `y` less than the first item of `x`.
26+
returns the index of the _last_ item in `x` which is ≤`y`. The result is `-1` for `y` less than the first item of `x`. If `x` is a simple list, `bin` is [atomic](../basics/atomic.md) in `y`. (For higher ranks of either argument, `bin` works the same way as [`?` (Find)](find.md/#type-specific).)
2527
`binr` _binary search right_, introduced in V3.0 2012.07.26, gives the index of the _first_ item in `x` which is ≥`y`.
2628

27-
They use a binary-search algorithm, which is generally more efficient on large data than the linear-search algorithm used by `?` ([Find](find.md)).
29+
```q
30+
q)0 2 4 6 8 10 bin 5
31+
2
32+
q)0 2 4 6 8 10 bin -10 0 4 5 6 20
33+
-1 0 2 2 3 5
34+
35+
q)0 1 1 2 bin 0 1 2
36+
0 2 3
37+
q)0 1 1 2 binr 0 1 2
38+
0 1 3
39+
```
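
For readers less familiar with q, the list semantics above can be modelled in Python with the standard `bisect` module. This is only a sketch of the documented behaviour, not the kdb+ implementation; the names `q_bin` and `q_binr` are hypothetical, and the value returned by `q_binr` when no item is ≥`y` is a property of this model rather than something stated above.

```python
from bisect import bisect_left, bisect_right

def q_bin(x, y):
    """Index of the last item of sorted list x that is <= y, or -1 if there is none."""
    return bisect_right(x, y) - 1

def q_binr(x, y):
    """Index of the first item of sorted list x that is >= y.

    In this model the result is len(x) when no item qualifies.
    """
    return bisect_left(x, y)

# Reproduce the q examples above, applying the atom version to each item of y.
x = [0, 2, 4, 6, 8, 10]
print([q_bin(x, y) for y in [-10, 0, 4, 5, 6, 20]])  # [-1, 0, 2, 2, 3, 5]

dup = [0, 1, 1, 2]
print([q_bin(dup, y) for y in [0, 1, 2]])   # [0, 2, 3]
print([q_binr(dup, y) for y in [0, 1, 2]])  # [0, 1, 3]
```

With duplicates in `x`, the two functions differ only on the repeated value: `q_bin` lands on the last `1`, `q_binr` on the first.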
40+
41+
`bin` uses a binary search algorithm, which is generally more efficient on large data than the linear-search algorithm used by [`?` (Find)](find.md).
2842

29-
The items of `x` should be sorted ascending although `bin` does not verify this property.
43+
The items of `x` must be sorted ascending although `bin` does not verify this property.
3044

3145
!!! danger "If `x` is not sorted the result is undefined."
3246

47+
`bin` can also be used if `x` is a dictionary with its values sorted.
48+
49+
```q
50+
q)(`a`b`c!0 2 4) bin -1 3
51+
``b
52+
```
53+
54+
Non-simple lists can also be used. In this case, items are compared lexicographically.
55+
56+
```q
57+
q)("apple";"banana";"coffee") bin ("anise";"berry";"curry")
58+
-1 1 2
59+
```
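
The dictionary and string examples above can be mimicked in Python as a rough model. It assumes that a q dictionary behaves like an ordered mapping whose values are searched and whose key is returned, and that the null symbol corresponds to `None`; `q_bin` and `dict_bin` are hypothetical helper names, not part of q.

```python
from bisect import bisect_right

def q_bin(x, y):
    """Index of the last item of sorted list x that is <= y, or -1 if there is none."""
    return bisect_right(x, y) - 1

def dict_bin(d, y):
    """Search the sorted values of d and return the matching key, or None if y is too small."""
    keys = list(d)
    i = q_bin(list(d.values()), y)
    return keys[i] if i >= 0 else None

# Dictionary with sorted values: -1 is below every value, 3 falls in the bucket of key "b".
d = {"a": 0, "b": 2, "c": 4}
print([dict_bin(d, y) for y in [-1, 3]])  # [None, 'b']

# Python strings compare lexicographically, as the character lists do in the example above.
fruit = ["apple", "banana", "coffee"]
print([q_bin(fruit, y) for y in ["anise", "berry", "curry"]])  # [-1, 1, 2]
```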
60+
3361
The result `r` can be interpreted as follows: for an atom `y`, `r` is an integer atom whose value is either a valid index of `x` or `-1`. In general:
3462

3563
```txt
@@ -44,31 +72,64 @@ and
4472
r[j]=x bin y[j] for all j in index of y
4573
```
4674

47-
Essentially `bin` gives a half-open interval on the left.
75+
`bin` is the function used in [`aj`](aj.md) and [`lj`](lj.md).
4876

49-
`bin` and `binr` are right-atomic: their results have the same count as `y`.
77+
`bin` and `binr` are [multithreaded primitives](../kb/mt-primitives.md).
5078

51-
`bin` also operates on tuples and table columns and is the function used in [`aj`](aj.md) and [`lj`](lj.md).
79+
## Tables
5280

53-
`bin` and `binr` are [multithreaded primitives](../kb/mt-primitives.md).
81+
Where
82+
83+
- `x` is a table of `n` columns
84+
- `y` is a table row with the same schema (e.g. a list with `n` elements or a dictionary with the same keys as the columns of `x`)
85+
86+
returns the index of the last row of `x` for which
87+
88+
- the first `n-1` values each match the first `n-1` values of `y`, and
89+
- the last value is not greater than the last value of `y`.
90+
91+
(For higher ranks, see the examples below as well as the documentation for [`?` (Find)](find.md/#type-specific).)
92+
93+
If no items match the criteria, either because there are no rows that match in the first `n-1` columns, or because the last value is smaller than the last value in the first such row, `0N` is returned.
5494

5595
```q
56-
q)0 2 4 6 8 10 bin 5
57-
2
58-
q)0 2 4 6 8 10 bin -10 0 4 5 6 20
59-
-1 0 2 2 3 5
96+
q)t:([]a:`p`p`p`q`q`q;b:0 2 4 0 2 4)
97+
q)t bin `a`b!(`p;3)
98+
1
99+
q)t bin ([]a:`q;b:-1 1 3 5)
100+
0N 3 4 5
101+
q)t bin `a`b!(`r;2)
102+
0N
60103
```
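
The table rule can be spelled out as a small Python model. It is a sketch under stated assumptions: it uses a plain linear scan rather than whatever search kdb+ performs, it represents each row as a tuple, and it returns `None` where q returns `0N`. The name `table_bin` is hypothetical.

```python
def table_bin(rows, y):
    """Index of the last row whose first n-1 values equal those of y
    and whose last value is <= the last value of y; None if no row qualifies."""
    *prefix, last = y
    found = None
    for i, row in enumerate(rows):
        if list(row[:-1]) == prefix and row[-1] <= last:
            found = i  # keep overwriting so the last qualifying row wins
    return found

# Same data as table t above: column a, then column b.
t = list(zip("pppqqq", [0, 2, 4, 0, 2, 4]))

print(table_bin(t, ("p", 3)))                          # 1
print([table_bin(t, ("q", b)) for b in [-1, 1, 3, 5]])  # [None, 3, 4, 5]
print(table_bin(t, ("r", 2)))                          # None
```

Because the scan does not rely on global ordering, it also shows why the last column only has to be sorted within each group of matching leading columns.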
61104

62-
If the left argument items are not distinct the result is not the same as would be obtained with `?`:
105+
To use `bin` with a table, the last column need not be sorted overall, but it must be sorted within the equivalence classes defined by the first `n-1` columns (as shown in the previous example).
106+
107+
`bin` can also be used with keyed tables. Here, `y` needs to contain all value columns, and it is the keys that are returned (as a table).
63108

64109
```q
65-
q)1 2 3 3 4 bin 2 3
66-
1 3
67-
q)1 2 3 3 4 ? 2 3
68-
1 2
110+
q)kt:([k:`c`d`e`f`g`h`j`l]a:`p`p`q`q`p`p`q`q;b:0 1 0 1 0 1 0 1;c:3 3 3 3 7 7 7 7)
111+
q)kt
112+
k| a b c
113+
-| -----
114+
c| p 0 3
115+
d| p 1 3
116+
e| q 0 3
117+
f| q 1 3
118+
g| p 0 7
119+
h| p 1 7
120+
j| q 0 7
121+
l| q 1 7
122+
q)kt bin ([]a:`p`q`q`r;b:1;c:4 8 2 4)
123+
k
124+
-
125+
d
126+
l
127+
128+
129+
q)(kt bin ([]a:`p`q`q`r;b:1;c:4 8 2 4))`k
130+
`d`l``
69131
```
70132

71-
72133
## Sorted third column
73134

74135
`bin` detects the special case of three columns with the third column having a sorted attribute. The search is initially constrained by the first column, then by the sorted third column, and then by a linear search through the remaining second column. The performance difference is visible in this example:
