todo

  • store operators in precedence order

    • indicates unary, binary or ternary
    • directly map calltree nodes [op, ...args] (not reduce heuristic) - see the sketch after this list
    • makes op fns useful on their own (like dlv)
    • there are shortcuts or extensions for ||, && etc
    • makes unary handling simpler
    • same unaries/binaries ops can reside in different precedence
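A minimal sketch of what a precedence-ordered operator store and direct `[op, ...args]` calltree evaluation could look like (the table layout and names here are illustrative assumptions, not the actual subscript internals):

```js
// Hypothetical precedence-ordered operator table: index = precedence level.
const operators = [
  { ',': (...args) => args[args.length - 1] },                  // lowest precedence
  { '||': (a, b) => a || b },
  { '&&': (a, b) => a && b },
  { '+': (a, b) => b === undefined ? +a : a + b,                // same symbol serves unary & binary
    '-': (a, b) => b === undefined ? -a : a - b },
  { '*': (a, b) => a * b, '/': (a, b) => a / b }                // highest precedence
]

// Flat lookup for evaluation: [op, ...args] nodes map directly to a function.
const lookup = Object.assign({}, ...operators)

// Evaluate a lispy calltree node against a context.
const evaluate = (node, ctx) =>
  Array.isArray(node) ? lookup[node[0]](...node.slice(1).map(n => evaluate(n, ctx)))
  : typeof node === 'string' ? ctx[node]
  : node

// a + c * 2  →  ['+', 'a', ['*', 'c', 2]]
evaluate(['+', 'a', ['*', 'c', 2]], { a: 1, c: 3 }) // 7
```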
  • think about flattening groups: a["b"]["c"] → a.#0.#1, a(b,c)(d,e) → a ( #bc ( #de

    • that allows correct precedence detection, eg. a.#0.#1(#2 can be split as [(, [., a, #0, #1], #2]
    • that allows overloading these operators by user ~ ( operator implies [(, a, args] === [a, ...args]
    • that allows removing .\w shadowing ↑ ~ maybe problematic to separate a[b] from just [b], a(b,c) from just (b,c) . There's a problem a.b(c).d → a.#b(c.d → [(, [., a, b], [., c, d]], but must be [.,[[.,a,b],c] d]
      1. maybe we don't need to take .,( as operators. Just use dotprop after.
      • not clear how a.b(c).d looks: should ['.', ['a.b', c], 'd'], but how?
      1. make () a splitting operator, mb end-bracket
      • any splitting attempt creates structure either [call, [.,a,b], c, [.,d,e]] or [[.,a],[]] → strategy of unwrapping brackets worked, but it's not generic operator but an exceptional cases for {, [, (
    • the actual problem of a.b(c.d,e.f).g.h+1 is that ( and . have the same precedence and are handled simultaneously, not in order . correct (non-order) grouping would be #a(#b,#c).g.h+1 → #abc.g.h+1 → #abcg+1 → #abcg1 . then unwrap → [+,#abcg,1] → [+,[.,#abc,g,h],1] → [+,[.,[#a,#b,#c],g,h],1] → ... → we cannot organize a single object with the order of operators; these two ( / . must therefore be handled at the same time
      • ok, with precedence split it seems to be possible now a ( #bc → [(, a, #bc, #de], needs unwrapping only
  • eval JSON, array objects: it's incompatible, non-lispy syntax . [, {, : are not operators, they're built-in . parser must return a ready object/array, not lispy json ? Array vs fn call (options compared below)

    1. Array → [['a','b','c']], call → ['a', 'b', 'c'] (frisk) + escape any value by just [{}], [[]], [Date] ... - [[a,1],2] → a(1)(2), therefore [[a,1]] → a(1)()
    2. Array → ['list', 'a', 'b', 'c'], call → ['a', 'b', 'c'] (lisp) + → list(a,b,c) - list can be an external function that we supersede
      2.1 ['Array', 'a', 'b', 'c'] + → Array(a,b,c)
        • lucky match w/o new operator - [1,2,3] → ['Array',[',',1,2,3]], which is redundant
        2.1.1 ★ [Array, 'a', 'b', 'c']
          • since our notation implies fn call, and the first arg is the fn name, operator or token to call, then we just call that token
      2.2 ['...', 'a', 'b', 'c'] + unary operator generates a list anyway - can be super confusing with the real ... operator, like ...[a,b,c] → ['...',['...', a,b,c]]
      2.3 ['[', 1, 2, 3] + keeps reference to operator + [ as prop access is transfused into . - still makes nested ',' - suppose comma should not create a separate group [',', a, b, c] but should instead just push arguments to the current one
    3. Array → [,'a','b','c'], call → ['a', 'b', 'c'] - undefined(a,b,c)
    4. Array → ['a', 'b', 'c'], call → ['a', 'b', 'c'] if 'a' is a function (frisk)
    5. Prohibit arrays/JSONs? + not cross-compatible - every lang has its own way defining them, but property access is same + expected to pass via context . not really good idea to mix lisp with arrays/objects → better provide lispy constructors like ['[]', [',',1,2,3]] or ['{}', [',',[':',a,1],[':',b,2],[':',c,3]]] . that gives a hint for end-catch operators, hm. like ? '...[...]':(a,b)=>, '...(...)':(a,b)=>, '...?...:...':(a,b)=>
      • lengthy
      • impossible to approach in the same loop as regular operators. Asc creates [.,a,b,[c,args]], desc creates [[[+,a,[.,b,c]], d, .e]
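As a side-by-side illustration, here is a tiny evaluator sketch for option 1 (frisk-style double wrap), showing both how it separates arrays from calls and the collision noted in that item (names and semantics are assumptions for illustration only):

```js
// A nested array as the sole element means "array literal",
// otherwise the head is evaluated and called with the remaining args.
const evaluate = (node, ctx) =>
  !Array.isArray(node) ? (typeof node === 'string' ? ctx[node] : node)
  : node.length === 1 && Array.isArray(node[0]) ? node[0].map(n => evaluate(n, ctx)) // [[1,2,'a']] → array
  : evaluate(node[0], ctx)(...node.slice(1).map(n => evaluate(n, ctx)))              // ['f',1,2] → f(1,2)

evaluate([[1, 2, 'a']], { a: 3 })               // [1, 2, 3]
evaluate(['f', 1, 2], { f: (x, y) => x + y })   // 3
evaluate([['f', 1], 2], { f: x => y => x + y }) // f(1)(2) → 3
// the con from the list above: [['f', 1]] hits the array branch here,
// although by consistency with [['f', 1], 2] it could equally mean f(1)()
```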
  • take over jsep-strip performance

  • word operators

  • subscriptexp → no need for tpl tag unless fields have special meaning

  • ternaries: a?b:c, a|name>b, a,b in c, a to b by c,

    • ‽ what if we decompose these into op groups (just postfix unaries), it's totally fine and even useful (see the tree sketch after this list):
      • [?,a], [?,a,[:,b,c]], [in,[,a,b],c], [to,a,b], [to,a,[by,b,c]], [if,a,[else,b,c]] . for that we would need to create transform groups
      • that would enable Justin extension
      • that would allow flattening fn calls by default
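A sketch of those decomposed shapes as calltrees (hypothetical node layouts following the bullet above):

```js
// a ? b : c — two cascaded operators instead of one ternary:
// `:` groups its branches first, then `?` consumes the condition plus that group
const ternary = ['?', 'a', [':', 'b', 'c']]

// a to b by c — same pattern with word operators
const range = ['to', 'a', ['by', 'b', 'c']]

// a, b in c — the `in` head receives the comma group as its left operand
const inExpr = ['in', [',', 'a', 'b'], 'c']
```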
  • ;

  • comments

  • # operators overloaded (python comments for example)

  • justin

  • infinite unaries? -+-+a

  • postfix unaries: 7! (factorial), 5s (units), exist?, arrᵀ - transpose . Lisp tree doesn't distinguish unary from binary operators . Seems that a postfix operator === a binary operator without a second argument (sketched after this list).

    • We don't have cases in pure subscript for postfix operators: a--, b++ don't make sense in an evaluator. ? What if we make the 1st group prefix unaries and the 2nd postfix unaries?
      • a way to implement unaries
      • removes arguments limitation
      • allows accessing unaries directly
      • can be complicated extension, unless we reference group immediately, not by index
      • allows faster search for unaries
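A minimal sketch of treating a postfix operator as "a binary operator without a second argument", with factorial as the example (names are illustrative, not subscript's API):

```js
// Postfix `!` (factorial) and a toy unit suffix, represented as unary nodes ['!', a] / ['s', a].
const factorial = n => n <= 1 ? 1 : n * factorial(n - 1)

const postfix = {
  '!': a => factorial(a),   // 7!  → ['!', 7]
  's': a => a * 1000        // 5s  → ['s', 5], e.g. seconds to milliseconds
}

const evalNode = (node, ctx = {}) =>
  Array.isArray(node) ? postfix[node[0]](evalNode(node[1], ctx))
  : typeof node === 'string' ? ctx[node]
  : node

evalNode(['!', 7]) // 5040
```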
  • . operator is not compatible with others, like a.+b.+c

    • it's not directly evaluable either: in the handler '.':(a,b)=>a[b], b has the meaning of a string key, not a value taken from the context by that key (see the sketch below)
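The operand-semantics mismatch can be seen in a toy evaluator (a hedged sketch, not the subscript source):

```js
// Naive evaluator: every string operand is looked up in the context...
const ops = { '+': (a, b) => a + b, '.': (a, b) => a[b] }
const naive = (node, ctx) =>
  Array.isArray(node) ? ops[node[0]](...node.slice(1).map(n => naive(n, ctx)))
  : typeof node === 'string' ? ctx[node] : node

// ...so in ['.', 'a', 'b'] the right operand 'b' resolves via ctx['b'] (undefined)
// instead of acting as the literal property name "b":
naive(['.', 'a', 'b'], { a: { b: 1 } }) // a[undefined] → undefined, not 1
```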
  • extension: Justin (+JSONs) ? make transforms for strings?

  • Bench

  • unary word

  • Optimizations

    • parser is mode-less: for each character it attempts to check every possible condition. Smart guys don't overcheck and just consume according to current mode. Eg. for s
    • prepare jsep - [x] remove noise, [x] flatten, [x] lispify
      • 10 times faster than justin
      • size is only 300 bytes more
      • supports errors better
      • make generic extensions as subscript
      • fold gobbling groups with gobbling arguments/properties
      • make simpler identifiers consumption
      • fix subscript tests for jsep stripped
      • simplify eval: no need for first arg cases, just op or fn decided to have long evals
        • no flattening: makes precedence more clear, ops reduce-less: in js there's still binary ops
          • can be hard to organize right-assoc operators like ** → long calls (multiple args) allow easier manual evals, frisk-compatible, fns still require args, enables shortcuts
    • try handling unaries in advance → direct parser solves that ? turn . operator into a transform ? a.b(c.d).e.f . dislex version (operators split) was faster than the current one.
    • seems many redundant checks come from the operator call at the point where we just consume a token . it makes sense to track perf tests from the beginning of development
  • Main slowdown reasons:

    1. Operators lookup - it must be precedence dict
    2. Unknown 20% lost in the recursive algorithm: jsep-strip is faster . it has direct sequences
      • streamlined sequences: doesn't seem to speed up . direct props/calls . single-token expression shortcuts . flattened recursion.
      • it does speed up, by up to 5-10%. Added hand-to-hand implementations. ! Found! Recursion just causes extra consumeOp checks (twice as many) ↑ Something of that makes things faster, although less pure and less extensible (like, no {} literals). . Logically, gobbleExpression doesn't check every token against the end token, so maybe there are just fewer checks? → seems that we're going to have slower perf.
  • Passing array literals is a problem

    • no way to pass Array literals to calltree. Internally they're marked in literals store, so evals are guaranteed. But direct eval(['a', ['+',1,2,3]]) marks an array as evaluable. ? Maybe we should prohibit evaluate exports to simplify internal logic? → transform keeping raw literal or turn into constructors. → not literals anymore, but tree operators
  • Minifications

    • ( [ . can be folded to operators, can they?...
    • generalize parsing identifiers: parseFloat works on any string, things like 1.1.2 can be folded, although mb not as performant. Maybe make digits+numbers parseable as single chunks?
      • 2a can be parsed as 2*a, but likely with '' operator
        • that also allows defining either units or coefficients (units can be done as postfix too)
      • Maybe create separate parse literal function
        • can detect booleans, null, undefined
        • can detect any other types of literals, like versions, units, hashes, urls, regexes etc
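A sketch of such a separate literal parser, working on an already-sliced token (purely illustrative; the real tokenizer works character by character on the source):

```js
// Try to recognize a token as a literal; return undefined for a plain identifier.
const parseLiteral = token => {
  if (token === 'true') return true
  if (token === 'false') return false
  if (token === 'null') return null
  const n = parseFloat(token)
  // crude whole-token check: '1.2.3' and '2a' stay identifiers (exponents etc. would need more care)
  if (!Number.isNaN(n) && String(n).length === token.length) return n
  // versions, units, hashes, urls, regexes etc. could be matched here as well
  return undefined
}

parseLiteral('1.5')   // 1.5
parseLiteral('true')  // true
parseLiteral('1.2.3') // undefined → treat as identifier
```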
  • https://github.com/gajus/liqe nah

  • Flatten binaries: [,[,a,b],c] → [,a,b,c] (helper sketched after this list)

    • many exceptions lead to flat form (args, sequence, ternaries)
    • it matches reducers
    • formulas are still meaningful this way
    • good for performance
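A sketch of the flattening step as a standalone helper (an assumed utility, not part of the source):

```js
// Collapse left-nested same-operator nodes: ['+', ['+', a, b], c] → ['+', a, b, c]
const flatten = node => {
  if (!Array.isArray(node)) return node
  const [op, ...args] = node, flat = args.map(flatten)
  return Array.isArray(flat[0]) && flat[0][0] === op
    ? [op, ...flat[0].slice(1), ...flat.slice(1)]
    : [op, ...flat]
}

flatten(['+', ['+', 'a', 'b'], 'c']) // ['+', 'a', 'b', 'c']
```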
  • Transforms for literals/tokens. → done as parsers, just implement per-token parsers

    • We need to interpolate strings a${}b
    • We need to generalize tokens 2a, https://..., ./a/b/c, [a,b,c], {a,b,c}, hash, /abc/ig, 1.2.0
  • Process sequences separately

    • Now expression loop checks for groups besides normal operators, which is op tax
    • Now commas are almost useless
    • Braces are still special case of operator
    • Comma-operator creates many problematic transforms, that can be reduced
    • It doesn't gain desired performance, still ~17-20% slower.
      • → due to excessive lookups
    • It still has problems with calls/properties - it must check if consumed operator is a group or not. → can be checked on expression level for redirection
    • With memoized op lookup it allows faster braces lookups (no closing braces)
    • Sequence is useful for consuming same-precedence ops as well like a + b + c ...
    • Passing precedence over sequence is tedious → Maybe pass groupinfo, like [operator, precedence, start, end, index]?
    • consumeGroup from previous impl is almost identical (gravitates) to consumeSequence
      • we may just address operator memo, current group collection (that simplifies lookups)
    • [.( are 3 special transforms and also they need special handling in expressions...
    • also , still makes no sense on its own, so mb they can be handled as postfixes actually → Ok, handled in id parser as single token, same as jsep - that reduces unnecessary declarative API
  • Optimizations 2

    • Operator lookup can be simplified: look for short operators first, and increase length until none matches ? so do we go until max length, or until the found operator loses? (sketched after this list)
      • it's faster for 1-char ops, slower for 2+ char ops
    • curOp can expect itself first, unless it's not met do lookup
      • allows better node construction as well
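A sketch of that incremental lookup (hypothetical helper, fixed max operator length assumed):

```js
// Grow the candidate operator one character at a time and remember the longest hit,
// so `>`, `>=`, `>>`, `>>>` resolve without slicing the whole remaining input.
const lookupOp = (ops, src, i) => {
  let found, s = ''
  for (let l = 1; l <= 3 && i + l <= src.length; l++) {
    s += src[i + l - 1]
    if (ops[s]) found = s   // keep extending: a longer operator may still match
  }
  return found
}

const ops = { '>': 1, '>=': 1, '>>': 1, '>>>': 1 }
lookupOp(ops, 'a >>> b', 2) // '>>>'
```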
  • Separating binary/unary and operators is good: +clear pertinence to eval/parse, +faster & simpler base, ...

  • Should consolidate binary as parse.binary, parse.prefix, evaluate.operator?

    • makes sense semantically
    • better association parse.literal = ...
    • reduces amount of exports
    • no plurals needed
  • Error cases from jsep (from source)

  • Improve perf formula 1 + (a | b ^ c & d) + 2.0 + -3e-3 * +4.4e4 / a.b["c"] - d.e(true)(false)

  • Make literals direct (passing wrapped numbers to functions isn't good)

  • ? Is that possible to build parser from set of test/consume functions, rather than declarative config? (see note below).

    • allows defining per-operator consumers
    • allows better tests (no need for generic operator lookups)
    • allows probabilistic order of operator checks
    • some operators can consume as much nodes as they need
  • Optimizations 3

    • Think if it's worth hardcoding subscript case, opposed to generalization
      • apparently that's faster, esp if done on numbers;
      • maybe that's smaller, since generalization is not required;
      • it can take a faster routes for numbers, sequences (no global closing-bracket case);
    • It can be completely built on recursion, without while loops. . Take space: space = (str='') => isSpace(c=char()) ? (index++, space(str+c)) : str → recursions are slower than loops
  • Move token parsers out: that would allow simplifying & speeding up comment, quote, interpolated string, float, even group token, and maybe even unary

  • Will that eventually evolve into dict of parsing tokens/arrays of tokens? We may need this dict to be passed to subparsers, eg. string interpolator may need parse.expr. ? maybe just make it a valid class? parser = new Parser(); parser.group(), parser.char() etc.

    • exposes internal sub-parsers
    • naturally exposes static variables
    • passes instance to custom subparsers
    • standard method
    • mb problematic minifications ? alternatively, a set of top-level parsers → Done as flat directly exported tokens
  • Parsers scheme can usually be generalized to just a set of test → consume pairs (dispatch sketched below):

    • isNumber: consumeNumber
    • isIdentifier: consumeIdentifier
    • isUnaryGroup: consumeUnaryGroup, like (a)
    • isUnaryOperator: consumeUnary, like +a
    • isQuote: consumeString
    • isComment: consumeComment
    • isAnythingElse: consumeAnythingElse
    • isBinaryOperator: consumeBinary, like a+b, maybe even a+b+c
    • isBinaryGroup: consumeBinaryGroup, like a(b), maybe even a,b,c
    • isTernaryStart: consumeTernary, like a?b:

    . Each test is fastest as a simple number comparison, worse - string comparison, worst - regex . Each consume is a flow with its own simpler checks
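A minimal test/consume dispatch sketch, assuming a shared source string and cursor (all names hypothetical):

```js
let src = '', idx = 0
const char = () => src.charCodeAt(idx)

// test: cheap numeric check on the current char code; consume: reads the whole token
const isNumber = () => char() >= 48 && char() <= 57                 // '0'..'9'
const consumeNumber = () => { let s = ''; while (isNumber()) s += src[idx++]; return parseFloat(s) }

const isQuote = () => char() === 34                                 // '"'
const consumeString = () => { let s = ''; idx++; while (char() !== 34) s += src[idx++]; idx++; return s }

// the main loop just routes: the first passing test wins, its consumer produces the node
const parsers = [[isNumber, consumeNumber], [isQuote, consumeString]]
const token = () => { for (const [test, consume] of parsers) if (test()) return consume() }

src = '123', idx = 0;  token() // 123
src = '"hi"', idx = 0; token() // 'hi'
```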

  • Move fragment parsers into own module

    • makes shorter & cleaner main exports, extension is possible by importing from src/
    • requires bundling
  • Node tests

  • Hardcode subscript parser for smaller size & better perf

  • Make parse/eval first-level

  • ?: operator is problematic with postfix parsers, since they expect highest precedence and a||b?c:d needs only 4.

    • also fn checkers unfortunately not as fast as desired.
    1. ‽ group/binary can be merged into a single dict: precedences come under 32, after 32 - ascii chars, denoting character to catch. This way ternary can be organized as {?: 58}, and groups as (: 41, [: 93. ? How do we detect tail-parsing?
    2. Make postfix parsing take in account precedence somehow ? Parse postfix after every expression group as well, not only after token
    3. Merge binary and postfix somehow: binary is also a kind of postfix extension, isn't it?
    4. ✔ We can regroup binary dict to be a set of parsers: each parser anyways checks for first character to match some condition;
    • This way we can customly map .[(
    • This way we can customly map ?:
    • This way we can customly map a+b+c+...
    • This way we can map
    • We could limit lookup by slicing the precedence array, not by comparing the fact of a found lookup and discarding it → We have to store parsers as [,comma,some,every,or,xor,and,eqneq,comp,shift,sum,mult,unary,propcall] (sketched after this list)
      • that's going to be slow to call this set of functions, compared to object lookup
        • that's faster in fact than descending dict lookup: . we first don't search what we should not search by precedence
      • it can be easier to organize sub-search (we need to parse precedence only higher than the current one) ! there can be a lookup booster checker as well, eg. if code is +, then it's either binary or unary, just fast-forward . Seems routing is necessary
      1. ✔ lookup is through all precedences until null returned for all of them. + very fast method
      2. That can be done as global routing function or object findOperator(prec,c1)()
      3. That can be done as a per-precedence function sum=x=>pass(x)?mapped:mult() - extending would imply changing the prev function in the chain - ridiculous
        2.1. We can pass the next function to levels: sum=(x,next)=>pass(x)?mapped:next()
      • Cannot use precedence constructor: it creates fn call stack for condition, besides internals are too different: either number of op chars to skip varies, or unary/non-unary case, overall not worth it
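A tiny self-contained sketch of the precedence-indexed parser array idea (one parser per level, lookup descending only through levels at or above the current precedence; all names and layout are illustrative):

```js
// Parse `1+2*3` into a calltree using one parser per precedence level.
let src = '', idx = 0
const token = () => { let s = ''; while (/\d/.test(src[idx])) s += src[idx++]; return +s }
const eat = op => src[idx] === op ? (idx++, true) : false

const levels = [
  node => eat('+') ? ['+', node, expr(1)] : null,   // level 0: sum — right side parsed at higher precedence
  node => eat('*') ? ['*', node, expr(2)] : null,   // level 1: product
]

// expr(prec) only consults levels[prec..]: lower-precedence parsers are never even tried.
const expr = (prec = 0) => {
  let node = token(), next
  for (let p = prec; p < levels.length; )
    (next = levels[p](node)) ? (node = next, p = prec) : p++   // restart from `prec` after a match
  return node
}

src = '1+2*3', idx = 0
expr() // ['+', 1, ['*', 2, 3]]
```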
  • remove first char check into lookup table

  • Try moving [(. into token? nope, splitting to lookup table is better

  • is that possible to merge tokens with operators lookup table?

    • it seems we uniquely identify tokens by first character as well (with fallback to non-spacy UTF for ids)
    • that's faster
    • that supposedly allows merging tokens/ops loop
    • that would allow to avoid confusion of not operator detected as token
    • also that would shrink parse.token to a bunch of lookups
    • also space can be done via lookup as well
    • tokens lookup introduce separate rules dict, which is almost identical to tokens that we expose now, it is even technically parsed differently
    • cognitive load it adds is sort of impersonal - hard intellectual grasp, opposed to easy token meaning
    • not operator can be worked around by checking lookup table in tokens parsing as well
    • interpolated strings eval can act as operator a, - not token.
  • pass end character as expr argument? nope - it requires idx++, which is beyond expr's purpose

  • make group an operator, not token

  • token must wrap found result in an object, otherwise ignore falsish stuff - that will align method with operators, no nil ? How do we handle it then? . With strings - we return them wrapped as "something". We don't return actual strings. . We need truish wrapper objects with correct valueOf returning undefined etc. . Real primitives are actually available, we can still unwrap them into tree ? Maybe first merge expr loop, then find out if single-point unwrapping is available

  • ? Separate subscript setup from core eval/parse?

    • Some languages have precedence other than JS. Subscript forces JS precedence with no way to redefine it.
    • That would compactify justin, although with some duplication. Or maybe not, if we build from subscript.
    • that leaves parse unworkable and meaningless. Minimal level thing maybe?
  • Test low-precedence unary, like -2*a^3 + -a^2

  • Unknown operator test case, like <<< or >==

  • expr can skip end, if passed: low-hanging fruit

  • Make eval-only tests

  • Remove spaced check: word operators are exceptional and better be conditioned manually

  • ? externalize operator lookup descent?

    • fast op lookups are single functions, not stack of meanings: it can boost perf and shorten size
    • descent can be implemented manually
    • makes no point to host pre/postfix index then
    • it's nice to have it a separate feature
    • blocks >>> vs >> vs > vs >= detection
  • Eval optimizations:

    • calltree nodes can stash static values (as in HTM)
      • doesn't give much perf, but increases size - that's 1 fn call + 1 check
    • node can pre-figure out what function must be used on it (done as detecting reducer)
  • Eval: cache static parts

  • string token:

    1. new String(abc)
    • shorter eval code
    • correlates with val for literals (any literal can be a wrapper)
    • natural wrapper parser as well: new Number - parses string
    • dirty tree
    • uneasy manual eval
    2. '@abc'
    • frisk-compatible
    • eval is not completely frisk-compatible, eg. [1,2,3] node is not valid in subscript
    • it's unnatural notation
    3. '"abc"', "'abc'"
    • not generic: need support for each type of token
    4. 'str:abc', 'data:abc'
    • URL schema-compatible
    • 'int:1.12', 'float:2.25', 'bool:true', 'literal:this'
    5. ? merging with literals somehow - like [value]?
    • would save space from val function
    • would allow static pre-eval
    • would let evaluators decide how to handle literals (no need for explicit unwrapping strings for . operator)
    • would let evaluators implement reducers logic
    • if we store literals as [value], that'd be also compatible with frisk
      • array is confusable with, say [0] or [null] - this will attempt to get these values from context
        • we can facilitate that by making sure value is not a string
      • strings as ['abc'] would be confusable with fn call . note that a(b,c) is ( operator with b, c, ... args (comma acts as flattener)
    6. Functional types/evaluators, like id={valueOf(){ return ctx[id] }}, lit={valueOf(){ return literal[lit] }}
    • allows to get rid of evaluator ad-hocs
    • all of 5.'s pros . in fact .evaluate can be a node's method, so the evaluator would be trivial
    • that can be meld with 1. ? simply string can be meant ref to context, other tokens are all valueOfs
    7. Relegate types to cast operators, eg. int 123 → ['int', '123'], ['bool', 'true'], ['', null]
    • same as functional wrapper, but via built-in method
    • lispy logic, consistent ? how to differentiate statics? Anything that's not node?
    • if operator evaluators were built-in into nodes, we wouldn't have to keep cast type evaluators at all.
    8. ✔ Direct evaluators instead of nodes (see the sketch after this list)
    • merges config into single definition . types of evaluators: binary op, prop access, fn call, group, array, object, literal, context variable → there are common types of evaluators, like, binary, unary, multi, but can be custom . We keep ctx access as direct strings, allowing defining prop getter evaluator
    • We save space on node operator definitions ! fn can have with valueOf, calling itself. ? How do we detect static parts? Mark parsed tokens?
      • via parse.literals returning evaluator with 0 args! ? How do we define function call and id? . As a couple of parsers, instead of evaluators? . direct fn reducer is extra call, can slow down evaluation. Defining token-based calls would add some code, but make evals faster . a(b,c) operator is the main blocker against non-reducer binaries. If we somehow managed to handle , via binaries, we'd reduce flat-reducers code, even static optimizers would flatten (+faster +shorter) . . and in operators treat args as literals, not ids. Is there a way to parse them as literals? . it should not be a hack, it should be conceptually right . seems that operator decides not only how to evaluate operands, but also what semantic they have. Ie. they're responsible for what meaning token parts have. . or either - a.b is not an operator but a kind of identifier access.
        • still a,b in c is special construct. Same as let a,b or a of b → can be solved via checking in id if ctx is passed and returning self otherwise . v6 is less flexible, in sense that there's no easy way to redefine parsing → ok, passing map is the most flexible and generic:
        • allows redefining calc which is -1 stack call
        • allows flexible parser
        • although it adds 20 commas and redundant parsing - we usually deal with operators on expressions - mb just a case of evaluator?
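A sketch of what "direct evaluators instead of nodes" could mean in practice: the parser returns a closure per node, literals become zero-arg evaluators, and plain strings stand for context access (hypothetical and heavily simplified, not the v6 source):

```js
// The parser builds functions, not arrays: evaluation is just calling the root with a context.
const num = value => () => value                          // literal → evaluator with 0 free args
const id = name => ctx => ctx[name]                       // identifier → context access
const prop = (obj, key) => ctx => obj(ctx)[key]           // '.' keeps its key as a raw string
const binary = (op, a, b) => ctx => op(a(ctx), b(ctx))    // operator node closes over its operands

// a.b + 1  →  binary(add, prop(id('a'), 'b'), num(1))
const add = (x, y) => x + y
const tree = binary(add, prop(id('a'), 'b'), num(1))
tree({ a: { b: 2 } }) // 3
```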
  • inside-expression skip can be combined into operator, if we're able to indicate what operator we're defining ? by precedence we can detect what type of function is passed: unary, binary or postfix

    • no: idx+=l 3 times is less than condition to detect it from outside
    • yes: we do from=idx, idx=from, and also it outpays on extensions
  • test a ++ postfix spacing

  • try to get rid of skip, code, char from configurators: parse literals, tokens somehow differently

    • no: we better implement custom parsers instead
  • flat into single subscript.js file, unminified: saving redundant exports and conventions

    • no: that's the opposite, makes redundant exports for internal values - how would we extend justin otherwise?
  • Sequences in expr vs , as reducer

    • modifying engine is verbose and slow, compared to direct lookups...
    • fake operators isn't good also
    • seq creates an args array - we have to reduce it for eval anyways, which is natural via operator
    • operator doesn't need modified code() call → utilize custom parsers, where args are needed.
  • Declarable multiarg operator like (a,...b)=>x vs separator ['(',',',')'] vs manual sequences handling

    • separator doesn't account for a,b,c in d
    • (...a,b)=>x is neither possible
    • [',','in'] is impossible - end operator is 1-char now → no: just use custom parsers, they occupy less space in total than unnecessary generalizations
  • extend end operator to multiple characters? → no: end operator is a custom parser case, not generalized

  • make ternaries support as ['?',':'],(a,b,c)=>{} → no: make custom parser

  • Radical nanoscript?

    • remove descent; no: descent is useful for standard operators and overloading
    • binaries-only defs;
    • eval binaries only;
    • no char, no code, no err; → err and code are heavily needed, char is rare in code, can be substr instead
    • space via skip; no: too slow
    • no word operators; no: too easy to do and useful to have
  • Make mapper configurable:

    • binaries-only vs flat nodes must be a feature of configurator, not built-in.
      • for fn arguments / arrays we have to parse , as flat sequence, unless we provide a special reducer in ( parser - that doesn't save that much space
    • As well as word operators. → trivially solved as custom mapper with next-char check
    • As well as reducer in evaluator. → not needed anymore as direct evals supercede v5 scheme
  • reorganize as conditional eval getting, instead of props in fn can be partially done by passing custom precedence, but not sustainable

  • justin, all tests, publish v6

  • Make better error explanations

  • remove tagged literal, since it doesn't make much sense without caching by callsite. That can easily be implemented user-side.

  • collect ids: needed for element-params

  • simpler way to extend: ideally we should be able to extend subscript without internals for: . strings . a?.b

    • somewhat possible by passing ids as arguments → Just expose direct parser. Requiring it from /src is unreliable, but extending is needed
  • make operator, token more external methods with plain API

  • numbers must not be part of core:

    • they are valid ids
    • a different config may want to parse them differently, like versions 1.2.3
    • different langs have different number capabilities
  • identifier parser can be configurable:

    • we may want to collect all used ids → can be done via AST
    • we may want it to return different target (function, string, etc) → can be done via separate eval / AST
    • or make a.b.c identifiers, not operators.
  • don't collect arguments? it slows down parsing and can be done as separate analyzing routine in target DSL.

    • maybe we just need better interop instead (see below)
    • since ids can be collected externally now, it's better to outsource that indeed, to keep point of performance/size. ~ same time, since subscript is not just a thing in itself, it would be useful to expose ast.
  • Should we retain subscript.eval and subscript.parse for different purpose uses?

    • Alternatively, we can come up with node constructor that can either create an eval function or generate AST
    • Or we can still stuff ast into eval, like .id, .of etc.
    • Wasm is still faster for parsing and evaluating. Keeping ast is useful.
    • returning an AST from the parse method is both fast and useful, rather than a single-purpose eval, which can be split out → eval, as well as compile, map, JSONify, ESify, WASM-compile etc. - are different targets indeed.
  • Would be nice to provide actual analyzable tree, not just eval function.

    • ast enables easier access to underlying tokens (no need to fake id function), no need to store .of, .id etc.
      • that would solve collecting arguments case
    • that would allow different targets by user demand
    • ast makes many custom operators direct ones, like . or
    • ast is possible to eval in wasm, since it's declaratively defined
    • that would allow swizzles, pre-eval and various node optimizations like a++ → a+=1, a,b = 1 → a=1, b=1
    • stuffing tree into subscript seems to be dissolving main point: terse, fast expressions. → parse is faster and even smaller, eval is on par via separate evaluate target
  • AST format

    • It can be strictly binary or with multiple args. Multiargs case makes more sense (?:, a(b,c,d), [a,b,c,d], a;b;c;d)
      • multiargs allow SIMD and other optimizations, better reflect actual execution and don't enforce redundant ordering (compared below)
      • AST should be able to be validly codegenerated back, therefore redundant ordering imposes redundant parens
        • Precedences belong to codegenerator, not to parser, to match against.
      • enables simd
    • AST can reflect either: visual syntactic structure (like exact parens) OR semantic order of execution.
    • AST can be language-independent and just reflect the order of command execution, convertible to any language. ~ may be challenging if the target lang has different precedences → precedences must be defined by the target codegenerator
    • Parens operator may have semantic sense as either grouping or destructuring
    • Can be converted to xml as <mul><a/><b/><c/></mul> → safest is source fragments ordered as nested arrays indicating order of evaluation.
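The multi-arg vs strictly-binary distinction from the bullets above, written out as trees (illustrative shapes only):

```js
// a + b + c + d
const binaryOnly = ['+', ['+', ['+', 'a', 'b'], 'c'], 'd'] // enforces one nesting/evaluation order
const multiArg   = ['+', 'a', 'b', 'c', 'd']               // flat: SIMD-friendly, codegens back without extra parens

// ternary, call and sequence fit the same flat shape:
const ternary = ['?:', 'a', 'b', 'c']   // a ? b : c
const call    = ['(',  'a', 'b', 'c']   // a(b, c)
const seq     = [';',  'a', 'b', 'c']   // a; b; c
```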
  • Number, primitives parsing: is that part of evaluator or parser? What about mapping/simplifying nodes?

    • we organically may have [value] syntax for primitives/literals
      • there are 0 operand operators: ;;;, a,,,b taking that signature
    • some literals can be complicated to parse, like ` or 1.2e+3 - they need to be detected at parse time;
      • postponing them is extra parse work
      • There are many non-standard number signatures: 0x00, 0b00, 00017, 0123n, 1.23e+5, 1_100_200
      • these literals can be parsed from AST also via custom mappers ['.', '1', '2e'] ~ that can be parsed specifically, but AST should be very safe
    • Numbers are in eg. XML - we'd still store untyped strings there
    • Type detection therefore can be done on stage of interpreter (eg. XML).
      • so that is sort-of preparing arguments for eval...
    • optimization mapping can be done on level of evaluator (pre-eval), not before - we need reducers to do that
    • null, undefined, false jeopardize expr eval: they need to be strings → As a gift for God it should be lightweight, generic and minimal. → It should be JSON-compatible, therefore types should be abstracted as much as possible (arrays + strings). ~ although json allows booleans and numbers and even null → we don't need a mappers layer: the parser already includes mapping, no need for any extra, it should create the calltree. → it's safest to convert the initial string to ordered substrings, without particular type detection. → Evaluating a value (converting from string to value) is part of the evaluator ~ at the same time, on the parsing level we can point to a type function to parse initially - it will save time on reparsing in eval.
  • Mapping layer?

    • map various numeric constructs
    • map swizzles from sonr
    • nah, mapping is part of parsing. Just handle whatever structures needed in parse.
  • Generalize operator registering

  • Make less cryptic

    • Provide explicit binary, unary, sequence helpers
    • Make main clean single entry, without commonscript setup. DSLs may not need shifts or binary by default
    • Make explicit lookup.id, lookup.space - parsing should not be cryptic, it should be apparent
    • Display better error message with context and underline
  • Sometimes nary returns last element as null, other times as undefined. Find out why.

  • Make able to parse sequence of identifiers: s-expressions or html may allow that. nah, subscript is not for that

  • [?] remove a.1 and a.true from subscript

  • make a?.valueOf() context work

  • make default difference of a() as ['()', 'a'] like in lino, rather than here

  • streamline samples (at price of larger codebase - that's fine)

  • Hide skip, expr, compile as args for defining tokens functions? -> nah, passing as args is messy

    • shorter exports
    • easier extending dialects - no need to import parse/eval internals
  • Modularize, make pluggable groups

    • standardizes base for various languages
  • Recognize false, 0 and '' as results of successfully parsed expressions. -> nah, too much trouble with checks for undefined, null, false, 0, '' as primitives

    • Make [,'path'] for prop access, and 'value' a direct primitive -> can't easily parse direct primitives without wrapping
  • Flip increments [+=, a, 1] for ++a, and [++, a] for a++ -> nah, likely that's harder than ++a-1 (dealing with object cases)

  • Untangle prefix/postfix order not existent anymore

  • See if we can get rid of set -> yes, it makes code smaller & simpler

  • add assign operators

  • consolidate obj access cases in one function: +assign, +inc/dec, +=/-=, +call, ?. reuse similar chunk

  • Make literals via [,val] (no need for empty string)

Backlog

  • Better interop. Maybe we're too narrow atm. Practice shows some syntax info is missing, like collecting ids etc. -> not clear what that is

    • different targets: lispy calltree, wasm binaries, regular ast, transform to wat, unswizzle
    • collecting ids via tree traversal -> impl in dependent
    • sonr transforms like a,b,c = d,e,f → a=d, b=e, c=f -> do client transform
    • more direct API: prefix operator, id - may not require low-level extension → that belongs to custom langs, not core
  • (a ? ^-a; 123) error points to (a ┃? ^-a; 123) instead of (a ? ┃^-a; 123)

  • language building tool: create own language with a set of features

    • Make operator groups, importable; build subscript by including op groups.
  • ideas snippets

    • !keyed arrays? [a:1, b:2, c:3]
    • parser examples as chunks
    • string interpolation ${} 1 ${}
    • readme ideas
    • double.js scripting
    • js-based glsl evaluator
    • language simulators
  • Demo

  • complex groups detector: ax + by + c

  • compile groups/complex groups to wasm: ax + by + c

    • wasm doesn't generically support any type of argument.
  • WASMify https://youtu.be/awe7swqFOOw?t=778

    • before interface types it's very problematic for wasm to deal with slicing/passing strings/substrings. ~ in fact we can initialize lookup tokens in JS and run actual parser in WASM by passing table . and return multiple values from skip as ranges to slice