Application code largely consists of:
- Business logic.
- Code that sends data to, and retrieves data from, databases and external APIs.
jubbly is a data-first applications language (as opposed to a systems language).
It's also not remotely complete.
- Vaguely mypy/TypeScript-like typing, but much simplified and with comptime-like (importtime) capabilities. Types themselves are just Structs, meaning we don't need a special type-level language to transform types during typechecking, and that types themselves are easily serializable.
- Immutable data by default. Everything is sortable, and all values have a canonical hash. Immutability is made ergonomic with the novel `alt` keyword - effectively a generalised `+=`.
- Serialize everything to/from bytes quickly and canonically for inserting into DBs/sending over the wire. Natively handle versioning and backwards compatibility of stored data and API data.
- Interpreted language with an aim for optional compilation. Very small core AST with sugary CST on top. Deep Rust interop à la PyO3, but typed-ier. Consistent syntax.
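
The "canonical bytes, canonical hash" goal in the list above can be illustrated outside jubbly. This is a minimal Python sketch, not jubbly's implementation: it assumes JSON with sorted keys as a stand-in encoding, and `canonical_bytes` and `canonical_hash` are hypothetical names chosen for this example.

```python
import hashlib
import json


def canonical_bytes(value):
    """Encode nested dicts/lists/scalars so equal values give equal bytes."""
    # sort_keys removes any dependence on dict insertion order;
    # fixed separators remove incidental whitespace differences.
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def canonical_hash(value):
    """Hash the canonical encoding, so the hash is stable across processes."""
    return hashlib.sha256(canonical_bytes(value)).hexdigest()


a = {"name": "Oli", "age": 33}
b = {"age": 33, "name": "Oli"}  # same data, different insertion order

print(canonical_bytes(a) == canonical_bytes(b))
print(canonical_hash(a) == canonical_hash(b))
```

Because the bytes are canonical, they can double as a database key or a cache key without a separate normalisation step.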
pub fib = fn[n: I64]: I64 -> {
    if n <= 2 n - 1 else fib(n - 1) + fib(n - 2)
}
let Person = *[
    name: String,
    age: I64,
]
let oli = Person["Oli", age=33]

There are Atoms, and there are composite types (for now, these are based on types from `im`).
None, Bool, I64, F64, String
Vector[T], OrdSet[T], OrdMap[K, T], Struct

We can construct composite types like:
[1, 2, 3]
OrdSet[_][4, 5, 6]
[1 => "one", 2 => "two"]

Structs are defined like:

pub Person = *[
    name: String,
    age: I64 = 42,
]

We construct Structs like:

Person["Oli", age=33]

Struct definitions are themselves just structs - the definition of `Person` above is actually just sugar for:
pub Person = Struct[
    "some/namespace:Person",
    [
        Field[0, "name", String, no_default],
        Field[1, "age", I64, 42],
    ],
]

Namespaces are just bags of functions. To get postfix/infix syntax, we use tildes.
One tilde passes the value on the left as the first argument to any function:
" foo "~string:trim()
// "foo"

Two tildes apply the values on either side to any binary function:
[1, 2, 3]~map~fn[v] -> v + 1
// [2, 3, 4]

As everything is immutable, we need a cute way of making new values with deeply nested values altered - enter `alt`. You can think of `alt` as `+=` generalised over any binary function, over any level of nesting. The equivalent to `+=` is:
let x = 5
alt x + 1
// x = 6

But we can use it over arbitrary functions/nesting:
pub A = *[x: I64]
let l = [1, [2, A[3]], 4]
alt l.[1].[0]= 99 // note `]=` and `.=` are binary functions that return a new value
alt l.[1].[1].x + 5
alt l~push~5
// l = [1, [99, A[8]], 4, 5]

There is an early return operator similar to Rust's `?` - but it is generic, so it can take an argument. The following pairs are syntactically equivalent:
bar? // missing argument defaults to `Error`
if is_instance(let _ = bar, Error) return _ else _
foo:bar()?RuntimeError + 4
(if is_instance(let _ = foo:bar(), RuntimeError) return _ else _) + 4

Blocks are what `let`s are lexically scoped to. They evaluate to the last expression.
{
    let a = 4
    a = 5
    a
}

Given no trailing `else`, `if` expressions default to `none`. Blocks are optional:
let is_gt_zero = if i > 0 true else false
if foo return 42

Namespaces are files containing values; each value is declared with `let` or `pub`.
In $JUBBLY_PATH/foo/bar.jub:
pub x = 42

In another file (note the only top-level things we can do in namespaces are `use`, `let`|`pub`, `firstly`):
use foo/bar as b
firstly {
    print(b:x)
}

There are no relative imports, as aliases are encouraged. Aliases are just a transform at the CST -> AST level, eg. `b:x` -> `foo/bar:x`.
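
The alias rewrite described above is purely syntactic, so it can be sketched as a small string transform. This Python sketch is an illustration only: `resolve_alias` and the `aliases` mapping are hypothetical, and the real transform works on CST/AST nodes, not strings.

```python
def resolve_alias(qualified_name, aliases):
    """Rewrite `alias:member` to `full/namespace:member`.

    `aliases` maps alias -> namespace path, as collected from
    `use ... as ...` lines. Names without a known alias prefix are
    returned unchanged.
    """
    prefix, sep, member = qualified_name.partition(":")
    if sep and prefix in aliases:
        return f"{aliases[prefix]}:{member}"
    return qualified_name


aliases = {"b": "foo/bar"}  # from `use foo/bar as b`

print(resolve_alias("b:x", aliases))
```

Since the alias is gone once this step has run, later stages only ever see fully qualified names.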
Generics are basically just functions - the keyword is gn not fn and calling is like f<arg1, arg2, ...> not f(arg1, arg2, ...).
The main difference vs functions is that at runtime, we just pass through to the return value - they are only "called" at importtime for typechecking.
Consider the identity function in the prelude:
pub identity = gn[T] -> fn[x: T, /]: T -> x // note, args before `/` are positional

As opposed to a function, if you don't directly call a generic with `<>` - eg. `identity(42)` - it desugars during typechecking to `identity<_>(42)` and we try to infer the `_` type value from the rest of the expression.
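
The idea that a generic is "just a function" from types to a function can be mimicked in Python. This is a rough sketch under loud assumptions: `identity_gn` and `infer_and_call` are hypothetical names, the `isinstance` check stands in for importtime typechecking (which, per the text above, would not run at runtime in jubbly), and inference here simply uses the argument's runtime type.

```python
def identity_gn(T):
    """A 'generic': a function from a type to a function."""

    def identity(x):
        # Stand-in for the importtime typecheck; jubbly would not do this at runtime.
        if not isinstance(x, T):
            raise TypeError(f"expected {T.__name__}, got {type(x).__name__}")
        return x

    return identity


def infer_and_call(generic, arg):
    """Mimic `identity(42)` desugaring to `identity<_>(42)` with `_` inferred."""
    inferred = type(arg)  # simplest possible inference: the argument's own type
    return generic(inferred)(arg)


print(identity_gn(int)(42))               # explicit: identity<I64>(42)
print(infer_and_call(identity_gn, "hi"))  # inferred: identity<_>("hi")
```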
If you need something object-like, use a closure. This is hopefully ugly enough to discourage frequent usage.
pub object_like = fn[] -> {
    let x = 0
    *[
        get_x = fn[] -> x,
        inc_x = fn[] -> {
            x = x + 1
            none
        },
    ][]
}
let obj = object_like()
obj.inc_x()
obj.inc_x()
obj.get_x()
// 2

The CST nodes are:
Parens
Int
Float
String_
Name
Call
Callgn
Callprefix
Callinfix
Callinfixtilde
Callpostfixtilde
Dotget
Dotset
Itemget
Itemset
Let
Set
Setpub
Alt
Builtin
Fn
Gn
And
Or
If
While
Block
Return
Returnif
Returniferror
Construct
Constructpair
Constructtype
Template
Refliteral
Raw source code, including comments, transforms directly to and from CST nodes.
CST nodes are transformed to a much smaller set of AST nodes, and it is these that we interpret/compile. The AST nodes are:
Atom
Name
Call
Callgn
Let
Set
Setpub
Builtin
Fn
Gn
And
Or
If
While
Block
Return
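
The two node lists suggest that sugar such as `Callpostfixtilde` and `Callinfixtilde` is lowered to plain `Call` nodes, and that literals such as `Int` become `Atom`s. Here is a minimal Python sketch of that lowering. The node fields (`fn`, `args`, `value`, `rest`, `left`, `right`) and the `desugar` function are assumptions made for illustration, not the interpreter's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    value: str


@dataclass(frozen=True)
class Int:  # CST only
    value: int


@dataclass(frozen=True)
class Atom:  # AST only
    value: object


@dataclass(frozen=True)
class Call:  # appears in both the CST and AST lists
    fn: object
    args: tuple


@dataclass(frozen=True)
class Callpostfixtilde:  # CST only: value~f(rest...)
    value: object
    fn: object
    rest: tuple


@dataclass(frozen=True)
class Callinfixtilde:  # CST only: left~f~right
    left: object
    fn: object
    right: object


def desugar(node):
    """Lower tilde sugar to Call and Int to Atom, recursing into children."""
    if isinstance(node, Int):
        return Atom(node.value)
    if isinstance(node, Callpostfixtilde):
        # The value on the left becomes the first argument.
        args = (desugar(node.value),) + tuple(desugar(a) for a in node.rest)
        return Call(desugar(node.fn), args)
    if isinstance(node, Callinfixtilde):
        # The values on either side become the two arguments.
        return Call(desugar(node.fn), (desugar(node.left), desugar(node.right)))
    if isinstance(node, Call):
        return Call(desugar(node.fn), tuple(desugar(a) for a in node.args))
    return node  # Name is shared between CST and AST


print(desugar(Callinfixtilde(Int(1), Name("add"), Int(2))))
```

Keeping the AST this small means the interpreter only needs one call path, whichever surface syntax the source used.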
- Plan typechecking.
- Need to parse (and interpret?) the prelude before anything such that names get set correctly.
- Iron out some of the path stuff now - how do we handle versioning, is there a JUBBLYPATH, etc.
- Developer Experience
  - Flip traceback.
  - CLI, debug()
  - `Block` at top level.
  - VSCode highlighting - LSP - https://code.visualstudio.com/api/language-extensions/semantic-highlight-guide (or TextMate).
  - Separate `repr` from `print`, revisit `fmt` and `template`.
- Typing
  - Generics, including filling in `_`s. Do we need a named `_` per `<A, B, ...>`?
  - Type aliases that give nice errors.
  - `NewType`.
  - Do we need `Interface[message: String]` or can we just pass in `Fn[T]: String`?
  - Typechecking CLI.
  - Make `OrdSet[T]` etc `N[T]` that are all covariant. `Annotated`?
  - Check that we don't overwrite non-`reassignable` namespace variables. Check we don't double `let`(?). Check `:=` lines up. Check recursively that `:=` only gets called by `firstly` - see also `reassignable`.
  - `fn` AST nodes should say which values are in their closure.
  - `FnWithoutClosure` AKA `FnPure`(?) type. Also, `FnWith1Arg`... for passing to `.map` like js.
  - `PartialEq for Fn` can be more clever with block scopes, checking if they actually contain any values that could get mutated. Add `is_dynamic` to fns/structs. Maybe we should add referenced vars instead of just count.
  - `impure!`? / Some kind of typing for effects?
  - `Stream`/`Iterator` - is this just any function with a closure? Can we get away without one? Eg. for db rows, we could just do like: `db:cursor(config, statement)~db:loop~fn[row] -> f(row)`
  - Does the type system eg. render `TypeVarTuple`s irrelevant? Is `@overload` just some special case of something else?
- Performance
  - Serialization currently aims to be on top of msgpack, look more into zerovec, rkyv, bincode, bitcode, wit-bindgen. See benchmarks.
  - Any function call that references nothing mutable can be cached.
  - Unsafe `resolve`?
  - Can we statically know if we need mutable scopes? Think about all the different flavours of functions. Pure, with closure, with named args etc. If a function doesn't leak its scope, we can just use a static scope that we create at the time we create the function (note we have to create scopes for the blocks as well).
  - Should `Value`s from scopes and as returned from `evaluate(...)` be references?
  - Should we use `unsafe {&self.scope.get_unchecked(i)}`?
  - Can `Context` have a lifetime so we don't have to copy in and out of the scope?
  - Can we preserve the performance characteristics of `im`? ie. Things remain mutable until they aren't.
  - Look more seriously at string interning.
  - Can we compile away dot access to [i] access?
  - Consider doing like: https://www.cs.cornell.edu/~asampson/blog/flattening.html
  - Can we speed up startup by AOT serializing the prelude?
- Rust interop
  - `rkyv`, `extism`.
  - Write a macro for the builtin args to check the types - codegen based off of Jubbly types.
  - Start with just a vec of ints
  - Benchmark `fib()`
  - Compile all the way to Rust with shitload of `RC`s? see. Starts at some entry point and monomorphizes?
- Search
  - fastest map
  - matklad parsing
  - wanabethatguy lsp
  - domenicquirl cstree
  - rust-langdev
  - lalrpop
- Serialization/Versioning
  - If we've stored a nested value without a name and subsequently change the type to a union, it's not immediately clear how we should deserialize this old data - we use the order of the union to decide (and make changing the first value of the union a backwards incompatible change).
  - Namespace versioning? Default to currently installed `use pydantic//fields` or explicit major version `use pydantic/2/fields`. The whole namespace thing needs a bit more thought - eg. how would one implement multiple services.
  - Our `hash`/`__eq` functions need to ignore values where the value is the default value, or the value is missing. If we add a default value where there was none before, we'll need to distinguish these from normal defaults.
  - Our "check_versions_compatible_over_time" function needs to check that we never update the default values.
- DB
  - Example polymorphic `JOIN` across types a la `meta` to `elec` or `gas`.
  - It is possible to write a custom sorting function for Postgres indexes according to chatgpt.
  - `ZSet` - `indexes: Vec[String]` # remember `ZSet[T, Index[K]] -> Grouped[K, ZSet[T]]`
  - Is DBSP just f(schemas, queries) -> [steps that update cache tables]..? See: https://arxiv.org/pdf/2404.16486. Think about Postgres equivalents to ZSet ops.
  - Write an ORM.
  - Write a FastAPI.
- Sugar/Context
  - `s/String/Str/` `s/Vector/Vec` `OrdMap` `Set`...?
  - Allow duplicate `let`s - should be pretty easy with naming stuff.
  - `#[]` for sets.
  - `Literal` literals.
  - Syntax for `let one = 1~uptype~Literal[1]` (note this won't work as one is not a subtype of the other).
  - Datetime literals - do we even need a prefix? Do we `Instant`?
  - Decimal literals.
  - Obvious features from other langs:
    - Unpacking.
    - JSX styley - `#<div>...<>` - `#<` enters jsx parser mode.
    - Pandas-like interface for common operations.
    - Allow template functions. Potentially, just using backticks should return the template vector itself.
    - Comprehensions?
    - `/`, `*` like Python to specify args.
    - Converting `match` to `if` expressions.
  - "Blessed" subsets of the language, for eg: configuration, nodejs compatibility.
  - Code that fails because of a type error that wasn't caught by the typechecker is considered a bug.
- Refactor
  - Can we roll `fn`, `and`, `or`, `else` into `Quoted`, `EvaluateQuoted`? Or just make `and`, `or`, `else` desugar to functions?
  - Only have one `Arena` per `ns`?
  - Try get rid of `register_name` and just use `scope`.
  - Look again at all type weirdness, remove a load of `.into()`s.
  - Look at other commonly used traits.
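
The Serialization/Versioning note in the list above, that old unnamed data is resolved by the order of the union, can be sketched as "first member that accepts the data wins". This Python sketch is one possible reading, not the implemented rule: `deserialize_union` and `struct_parser` are hypothetical, and structs are modelled as plain dicts.

```python
def struct_parser(field_names):
    """Build a parser that accepts dicts with exactly these field names."""
    expected = set(field_names)

    def parse(raw):
        if not isinstance(raw, dict) or set(raw) != expected:
            raise ValueError("shape mismatch")
        return dict(raw)

    return parse


def deserialize_union(raw, members):
    """Return (type_name, value) for the first union member that accepts `raw`.

    `members` is an ordered list of (name, parse) pairs; order is the
    tie-breaker when stored data carries no type name.
    """
    for name, parse in members:
        try:
            return name, parse(raw)
        except ValueError:
            continue
    raise ValueError(f"no union member accepts {raw!r}")


# Person and Pet have the same shape, so only the order can tell them apart.
union = [
    ("Person", struct_parser(["name", "age"])),
    ("Pet", struct_parser(["name", "age"])),
    ("Company", struct_parser(["name"])),
]
old_row = {"name": "Oli", "age": 33}  # stored before the field became a union

print(deserialize_union(old_row, union)[0])
```

Reordering the union changes how `old_row` is read, which is why changing the first member has to count as a backwards incompatible change.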
cd interpreter
cargo test
cargo run -- test
cargo clippy -- -Wclippy::pedantic
cargo flamegraph --root -- run std/tests/test_basic:run_fib