Skip to content

feat: Add array length function, closes #3308 #3317

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 10 additions & 3 deletions docs/4_secondary_admin_controls/expressions/functions.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,16 @@ There are various supported functions, and we are willing to add more. Let us kn

The currently supported functions are:

##### General operations

**length(val)**

Find the length of the item passed in.
* For a strings it will return the number of unicode graphemes
* For arrays, the number of elements
* For JSON or other objects, it will return the number of properties
* For numbers it will return the length of the string representation

##### Numeric operations

**round(val)**
Expand Down Expand Up @@ -110,7 +120,6 @@ Find the index of the first occurrence of a value within the provided string.

Optionally provide an offset to start the search from.


**lastIndexOf(val, find, offset)**

Find the index of the last occurrence of a value within the provided string.
Expand Down Expand Up @@ -141,7 +150,6 @@ eg `encode("Companion","hex")` gives `"436f6d70616e696f6e"`

Decode a string from the requested format ('hex','base64'). If `enc` is missing, `latin1` will be used.


eg `decode("436f6d70616e696f6e","hex")` gives `"Companion"`

**parseVariables(string)**
Expand Down Expand Up @@ -204,7 +212,6 @@ eg `00:10:15` gives 615

You can do the reverse of this with `secondsToTimestamp(str)`


**secondsToTimestamp(seconds, format)**

Convert a number of seconds into a timestamp of format 'HH:mm:ss'.
Expand Down
28 changes: 27 additions & 1 deletion shared-lib/lib/Expression/ExpressionFunctions.ts
Original file line number Diff line number Diff line change
@@ -1,8 +1,34 @@
import { pad } from '../Util.js'
import { JSONPath } from 'jsonpath-plus'
import { countGraphemes } from 'unicode-segmenter/grapheme';

// Note: when adding new functions, make sure to update the docs!
export const ExpressionFunctions: Record<string, (...args: any[]) => any> = {
// General operations
length: (v) => {
let len = 0
if (v === undefined || v === null) {
len = 0
} else if (Array.isArray(v)) {
len = v.length
} else if (typeof v === 'number') {
len = (v + '').length
} else if (typeof v === 'bigint') {
len = v.toString().length
} else if (typeof v === 'string') {
// So we handle UTF graphemes correctly
len = countGraphemes(v)
} else if (v instanceof RegExp) {
len = v.toString().length
} else if (typeof v === 'object') {
len = Object.keys(v).length
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all uncatched types have a lenght property, e.g. BIGINT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added BigInt. What others are you aware (RegExp maybe) of/what do you think I should do in the default case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added RegExp too now...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think string should have its own case and then it is not part of the unknown rest any more. The unknown rest could maybe return NaN.
From the JS perspective there are also iterators and generators that are interesting for length() but currently we don't use them in variables, so if you want to implement them it would be a bonus for possible future use. Also we don't use regex data type currently and I'm not sure about the value of knowing the length of a regex.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree, a string case would be good and the fallback should always return NaN.

Another note is that typeof new Regexp() is object, so this will need to be above the object check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I've added/sorted those others, useful or not and made it return NaN as the default otherwise.

I guess ideally we need yet another type of object if we wish to test the NaN case...

// If it's got to here, we don't know how to handle it
len = NaN
}
return len
},

// Number operations
// TODO: round to fractionals, without fp issues
round: (v) => Math.round(v),
Expand Down Expand Up @@ -74,7 +100,7 @@ export const ExpressionFunctions: Record<string, (...args: any[]) => any> = {
// Bool operations
bool: (v) => !!v && v !== 'false' && v !== '0',

// Object operations
// Object/array operations
jsonpath: (obj, path) => {
const shouldParseInput = typeof obj === 'string'
if (shouldParseInput) {
Expand Down
3 changes: 2 additions & 1 deletion shared-lib/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,8 @@
"jsep": "^1.4.0",
"jsonpath-plus": "^10.3.0",
"ps-tree": "^1.2.0",
"semver": "^7.7.1"
"semver": "^7.7.1",
"unicode-segmenter": "^0.11.3"
},
"scripts": {
"build": "run build:ts",
Expand Down
32 changes: 32 additions & 0 deletions shared-lib/test/expressions-functions.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,38 @@
import { ExpressionFunctions } from '../lib/Expression/ExpressionFunctions.js'

describe('functions', () => {
describe('general', () => {
it('length', () => {
expect(ExpressionFunctions.length()).toBe(0)
expect(ExpressionFunctions.length('')).toBe(0)
expect(ExpressionFunctions.length('a')).toBe(1)
expect(ExpressionFunctions.length('abc')).toBe(3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect(ExpressionFunctions.length('ä')).toBe(1) // codepoint U+00E4, one grapheme
expect(ExpressionFunctions.length('̈a')).toBe(1) // codepoints U+0308 U+0061, one grapheme
expect(ExpressionFunctions.length('́̈a')).toBe(1) // codepoints U+0301 U+0308 U+0061, one grapheme

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this is pretty hard to do entirely properly, but do you think of these is "good enough"?
https://mathiasbynens.be/notes/javascript-unicode

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually node.js has the Intl.segmenter() since version 16 but that is not working 100% with combining marks.
https://www.npmjs.com/package/unicode-segmenter should do the trick and is even faster than the built in Intl.segmenter()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've got unicode-segmenter going now, but it doesn't return the right result for all cases for some reason (unless I've made a stupid error somewhere)?

expect(ExpressionFunctions.length('ä')).toBe(1) // codepoint U+00E4, one grapheme
expect(ExpressionFunctions.length('̈a')).toBe(1) // codepoints U+0308 U+0061, one grapheme

Check failure on line 12 in shared-lib/test/expressions-functions.test.ts

View workflow job for this annotation

GitHub Actions / test

shared-lib/test/expressions-functions.test.ts > functions > general > length

AssertionError: expected 2 to be 1 // Object.is equality - Expected + Received - 1 + 2 ❯ shared-lib/test/expressions-functions.test.ts:12:45
expect(ExpressionFunctions.length('́̈a')).toBe(1) // codepoints U+0301 U+0308 U+0061, one grapheme
expect(ExpressionFunctions.length(9)).toBe(1)
expect(ExpressionFunctions.length(99)).toBe(2)
expect(ExpressionFunctions.length(-123)).toBe(4)
expect(ExpressionFunctions.length(3.14)).toBe(4)
expect(ExpressionFunctions.length(BigInt(1024))).toBe(4)
expect(ExpressionFunctions.length(BigInt(9007199254740991))).toBe(16)
expect(ExpressionFunctions.length(new RegExp('ab+c', 'i'))).toBe(7)
expect(ExpressionFunctions.length([])).toBe(0)
expect(ExpressionFunctions.length([9])).toBe(1)
expect(ExpressionFunctions.length([99])).toBe(1)
expect(ExpressionFunctions.length(['abc'])).toBe(1)
expect(ExpressionFunctions.length([9, 'a'])).toBe(2)
expect(ExpressionFunctions.length(['a', 'c'])).toBe(2)
expect(ExpressionFunctions.length(['ab', ''])).toBe(2)
expect(ExpressionFunctions.length([1, , 3])).toBe(3)
expect(ExpressionFunctions.length(['a', 'b', 'c'])).toBe(3)
expect(ExpressionFunctions.length(['a', ['b', 'b'], 'c'])).toBe(3)
expect(ExpressionFunctions.length({ a: 1 })).toBe(1)
expect(ExpressionFunctions.length({ a: 1, b: { c: 5 } })).toBe(2)
expect(ExpressionFunctions.length({ a: ['a', 'c'], b: { c: 5 } })).toBe(2)
})
})

describe('number', () => {
it('round', () => {
expect(ExpressionFunctions.round(9.99)).toBe(10)
Expand Down
8 changes: 8 additions & 0 deletions yarn.lock
Original file line number Diff line number Diff line change
Expand Up @@ -1244,6 +1244,7 @@ __metadata:
ps-tree: "npm:^1.2.0"
semver: "npm:^7.7.1"
typescript: "npm:~5.7.3"
unicode-segmenter: "npm:^0.11.3"
languageName: unknown
linkType: soft

Expand Down Expand Up @@ -16052,6 +16053,13 @@ asn1@evs-broadcast/node-asn1:
languageName: node
linkType: hard

"unicode-segmenter@npm:^0.11.3":
version: 0.11.3
resolution: "unicode-segmenter@npm:0.11.3"
checksum: 10c0/e5f6c16ebd2112eb6e447436ebaa525deb69705a609e6e19161c779ae738f1c2325b96cd986d5099c0f9b8c92538940d41d77237dd5a169230bb8405ec5f4835
languageName: node
linkType: hard

"unified@npm:^11.0.0":
version: 11.0.5
resolution: "unified@npm:11.0.5"
Expand Down
Loading