Scripting API for Ghostty #2353
Replies: 40 comments 56 replies
-
|
Just dropping down some thoughts. I'm leaning towards a single-line text protocol in the format of memcached/redis/etc. This has various benefits:
-
|
I know GitHub links it, but I want to explicitly note @LordMZTE's proposal for dynamically linked plugins in #1358. See my comment there: I don't think these are mutually exclusive, but I did want to centralize all potential plugin discussion in one place for now, until we spin out actual work on some of these. And I'm not ready to commit to working on these yet.
-
|
I'd like to propose another way to interact: escape sequences. We could write another standard that specifies how they are used. The way it would work is simple: you echo text in your terminal, and Ghostty (and other implementors of the standard) parses it and applies the settings. I don't know how the Kitty keyboard protocol works, but it would be similar: print your desired setting and it gets applied: echo $"($escape_start)background($escape_sep)#800000($escape_end)" This would be really useful over SSH, and could prevent you from accidentally deleting the primary database instead of the backup because you selected the wrong terminal emulator window. This isn't a replacement for the socket approach, btw. Another idea is that background-color and other common settings could be documented in the standard, so that a single shell command added to the shellrc works in every terminal emulator that implements it. For emulators that don't support it, the escape sequences could be picked carefully so that they don't change the output and just show up as a normal echo statement. Like |
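To make the escape-sequence idea concrete, here is a minimal sketch of how such a setting could be framed. It assumes OSC-style framing (ESC ] ... ESC \), a made-up OSC number (7777), and a `key=value` payload. None of these come from the proposal, which leaves the start, separator, and end bytes unspecified.

```python
import sys

ESC = "\x1b"
OSC = ESC + "]"   # Operating System Command introducer
ST = ESC + "\\"   # String Terminator

# Hypothetical OSC number, used here only for illustration.
HYPOTHETICAL_CODE = 7777


def build_setting_sequence(key: str, value: str) -> str:
    """Frame a key/value setting as an OSC-style escape sequence."""
    for part in (key, value):
        # Control characters inside the payload would end or corrupt the sequence.
        if any(ord(ch) < 0x20 or ch == "\x7f" for ch in part):
            raise ValueError("control characters are not allowed in the payload")
    if "=" in key or ";" in key:
        raise ValueError("key must not contain '=' or ';'")
    return f"{OSC}{HYPOTHETICAL_CODE};{key}={value}{ST}"


def apply_setting(key: str, value: str) -> None:
    """Write the sequence to stdout, where the terminal would intercept it."""
    sys.stdout.write(build_setting_sequence(key, value))
    sys.stdout.flush()
```

Calling `apply_setting("background", "#800000")` from a shell startup script would be the equivalent of the `echo` in the comment above. Because the bytes travel in-band, the same call would also work over SSH.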
-
|
Just building on the conversation about making a text protocol. Would it be cursed to do something like A few problems/questions that this kind of syntax raises:
I chose something overly simple because, as mentioned above, it can be easily debugged, and building an API client for it is as simple as writing a parser in the language of your choice and using a normal TCP connection. I think at the very least following the naming scheme of keybinds is the best choice, because then you don't need to reinvent the wheel and the two stay consistent. The one downside is that we already run into consistency issues, because in a keybind you would do |
-
|
Why invent a new protocol? Why not use HTTP & JSON over Unix sockets? Every language worth mentioning has HTTP & JSON support already (although it may take a bit of work to talk to a Unix socket) and it's basically infinitely extensible. That also means that we don't need to spend weeks bikeshedding a new protocol. Even cURL works over Unix sockets.
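As a rough illustration of the "bit of work" needed to talk HTTP over a Unix socket, the sketch below subclasses Python's standard `http.client.HTTPConnection` and swaps the TCP connect for an `AF_UNIX` one. The socket path, the `/actions` endpoint, and the JSON body are hypothetical; Ghostty exposes no such API today.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        # The host is only used for the Host header; no DNS lookup happens.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def call(socket_path: str, method: str, path: str, body=None):
    """Send one JSON request and return (status, decoded JSON response)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        payload = None if body is None else json.dumps(body)
        headers = {"Content-Type": "application/json"} if payload else {}
        conn.request(method, path, body=payload, headers=headers)
        response = conn.getresponse()
        return response.status, json.loads(response.read() or b"null")
    finally:
        conn.close()
```

A call might look like `call("/path/to/ghostty.sock", "POST", "/actions", {"action": "new_window"})`, with the path and endpoint again being placeholders. From a shell, curl's `--unix-socket` option covers the same case.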
-
I understand the use of JSON here for a uniform and simple standard, but HTTP doesn't seem logical to me, mostly because we could just use a streaming JSON parser (Zig can be made to do this IIRC) to directly transfer JSON over the socket. It would also seem to me that most HTTP clients aren't exactly easy to convince to connect to a Unix socket, as it's simply not something HTTP was ever intended to do, as further made evident by the lack of a way to express this in a URL. Your example could translate to something as simple as:
(Note: the JSON here is obviously just a placeholder and isn't meant to suggest a possible way a request could look.) |
-
|
JSON makes sense. Honestly, it wasn't my first choice because I was looking at Redis, which is just raw commands. I need to confirm, but is there overhead to JSON parsing? Do we even need the power of JSON? I think following the keybind syntax is a good place to start because it's simple and shares a mental model across two different places. Introducing JSON basically opens a can of worms for super complicated API bodies, which I personally think is just a sign of bad design.
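For comparison, a keybind-style command line needs very little parsing. The sketch below assumes one command per line in the `action` or `action:parameter` form that Ghostty keybinds use (for example `new_split:right`). Treating that as a wire format is an assumption for illustration, not something the thread has settled on.

```python
from typing import NamedTuple, Optional


class Command(NamedTuple):
    action: str
    parameter: Optional[str]


def parse_command(line: str) -> Command:
    """Parse a single 'action' or 'action:parameter' line."""
    text = line.strip()
    if not text:
        raise ValueError("empty command")
    # Split on the first colon only, so parameters may contain colons themselves.
    action, separator, parameter = text.partition(":")
    if not action:
        raise ValueError(f"missing action in {line!r}")
    return Command(action, parameter if separator else None)
```

The whole parser is a single `partition` call, which is the debugging and client-writing advantage argued for above. The cost is that anything beyond a flat parameter string needs its own ad hoc encoding.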
-
|
Everyone hates JSON, because everyone has had to deal with JSON. It's the lowest common denominator for passing data between systems. I think that the overhead of JSON parsing is irrelevant unless you're expecting hundreds of API calls per second. Yes, JSON does open you up to the possibility of some truly horrific data structures, but the same could be said for trying to cram everything into a bespoke Redis-like API. The point of using HTTP & JSON is that the best code is the code that you don't have to write. Each language's HTTP & JSON code is thoroughly vetted by a large community, whereas a bespoke protocol will probably only ever be looked at by a very small number of people. Plus, cURL & wget mean that you can access the API from a simple shell script as well.
-
|
The use of HTTP makes things simpler, as it's a very common design pattern for such things. I suggested using Unix sockets as a security measure, but binding to a random port on localhost is an option as well. On Linux it would theoretically be possible to use cgroups and network namespaces to further limit access to the localhost port, but that might be more complication than necessary.
-
|
Why not use D-Bus? |
-
This needs to be cross platform. Additionally, I’ve found dbus pretty complicated to use compared to a simple text proto. |
-
|
Technically, we're already using D-Bus since it's required by GTK. I wouldn't recommend it as the primary API endpoint. At some point we'll need to make more use of D-Bus to achieve deeper integration with GNOME (such as being able to link into the GNOME search interface), so at that point it may make sense to expose the Ghostty API over D-Bus in some way.
-
|
Is the plan also to expose this functionality via Ghostty's cli? I find myself wanting to port stuff like this wezterm/helix integration to Ghostty - i.e. making it easy to use the API from scripts would be great. |
-
|
I would start with a simple text API over a Unix socket. The question is whether it should be a stream socket or a datagram socket. It could also be exposed within the Ghostty session itself via an environment variable like Usage of Unix sockets would also allow us to share extra data via |
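The stream-versus-datagram question mostly comes down to message framing. This small sketch uses `socket.socketpair` to show the difference: a datagram socket hands back each send as its own message, while a stream socket only guarantees the bytes arrive in order, so the protocol would need its own delimiter (such as a newline). The command strings are placeholders.

```python
import socket


def send_two_commands(kind: int) -> list[bytes]:
    """Send two commands over a Unix socket pair and return what is received."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, kind)
    with sender, receiver:
        sender.send(b"new_tab")
        sender.send(b"goto_tab:2")
        if kind == socket.SOCK_DGRAM:
            # Each recv() returns exactly one datagram, boundaries intact.
            return [receiver.recv(4096), receiver.recv(4096)]
        # Stream sockets deliver a byte stream; read until the sender is done.
        sender.shutdown(socket.SHUT_WR)
        chunks = []
        while chunk := receiver.recv(4096):
            chunks.append(chunk)
        return chunks
```

With `socket.SOCK_DGRAM` the two commands come back as two separate items. With `socket.SOCK_STREAM` they may come back as one run of bytes with no boundary between them, which is why a stream-based text protocol would need a line terminator.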
-
|
The main use case for me would be to use smart splits in nvim |
-
Would love to see this implemented. I started working on this feature but hit a wall when I realized a lot of commands aren't available from the CLI. I can't ditch tmux until I can easily navigate between ghostty and neovim panes seamlessly. |
-
|
To me, a large part of why this issue hasn't seen much progress is because it's too broad. Even after rounds of discussion among maintainers it's unclear how to proceed, because we've been trying to develop one Big solution that's trying to cater to all needs. Specifically, there are multiple problems that we're trying to solve here:
These problems all demand somewhat different solutions because of different use cases and different access methods.
For the first use case the obvious solution is an escape code sequence, either via OSC sequences like iTerm2's OSC 1337 or DCS sequences like tmux's control mode or Kitty's remote control protocol. We could use control sequences for all remote control capabilities like Kitty, but then we run into the problem of how we get them into a running Ghostty instance. If the app is running inside Ghostty and wishes to control it, that can be very easily done. But if some other program that isn't attached to Ghostty wants to control it, it would require some external transport in order to speak to it: Kitty solves this by using Unix sockets, which, as the name implies, aren't available on Windows, and I don't think we want to limit ourselves to Unix platforms just for this. Other transports are even more cumbersome: other people have proposed hosting an HTTP server or using telnet, but none of these options are really convenient to use without using specific libraries, which just makes the feature harder to use by downstream consumers.
The other main approach is to use platform-native IPC capabilities, such as AppleScript on macOS, D-Bus on Linux, etc. This IMHO is a much better idea for when other desktop apps need to talk to Ghostty, especially since native apps often already have the libraries needed to use these IPC apparatuses; even on Linux, all GTK apps are built with gdbus support by default, and Qt apps can use qdbus as well. What's better is that native IPC is also often used by the desktop environment to communicate to apps: in fact, ever since #7679 we already use D-Bus to start new windows, so there's no reason why we shouldn't lean in on it harder, especially considering our goal to be native. The caveat of course is that there's no standardized way to talk to Ghostty across all operating systems.
I think I would be fine with that drawback, especially since our macOS and GTK apps do have different feature sets, and multiplatform apps aiming to control Ghostty should be aware of those differences. IMHO we would only see whether this is a problem once someone actually writes said multiplatform app.
Using D-Bus/AppleScript would also be quite annoying for apps running inside Ghostty or for CLI usage. Given that these two solutions are good for a subset of the use cases proposed here, I would say... why not both? On the GTK side at least we already use D-Bus, and it's just a matter of making all of our keybind actions available via D-Bus (which GTK already supports via the Action API). Adding a new control sequence would also be enormously simpler than trying to embed a Telnet/HTTP/Varlink/whatever API in Ghostty that terminal consumers would only find hard to use.
Being native for Ghostty doesn't just mean that it should only follow conventions for the OS, it should also mean that it should follow the conventions set out by other terminals. Trying to squash these two different sides of a terminal emulator (which, philosophically, is the connection between the OS world and the terminal world) together into one solution makes neither side happy.
On a community administration side I think it would be much more productive to split this discussion in two: one discussing the design for a control sequence-based API for terminal apps, and one discussing the design for a desktop IPC-based API for desktop apps. The CLI in my mind should be able to intelligently choose between the two depending on whether it's running inside Ghostty (similar to |
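The "CLI picks the transport" idea can be sketched as a small decision function. The version below is only an illustration: it assumes that a process running inside Ghostty can be detected through `TERM_PROGRAM=ghostty`, and the transport names are hypothetical labels rather than existing Ghostty features.

```python
import os
import sys
from typing import Mapping


def choose_transport(env: Mapping[str, str], platform: str) -> str:
    """Pick a control channel based on where the CLI is running."""
    # Assumption: Ghostty marks its child processes with TERM_PROGRAM=ghostty.
    if env.get("TERM_PROGRAM") == "ghostty":
        return "control-sequence"  # in-band, written to the controlling terminal
    if platform == "darwin":
        return "applescript"       # platform-native IPC on macOS
    if platform.startswith("linux"):
        return "dbus"              # platform-native IPC on Linux desktops
    raise RuntimeError(f"no known transport for platform {platform!r}")


def current_transport() -> str:
    """Convenience wrapper using the real environment and platform."""
    return choose_transport(os.environ, sys.platform)
```

An environment check like this would not survive an SSH hop unless the variable is forwarded, so a real implementation would likely need a terminal query rather than an environment check for the in-band case.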
-
So I'm trying to stop using tmux and use native ghostty stuff to support my existing tmux workflow. I have something like this:
I'd really love to be able to do something like this with native Ghostty (I've had trouble running inside tmux for a while, and I find the idea of removing this middle layer compelling). I guess that means I should look into AppleScript, if I want to do this on macOS? Would the equivalent be enabling something like querying for open windows/tabs, and enabling the selection of a particular window/tab via something that can take advantage of AppleScript, like Spotlight, Alfred, or Raycast? I think I find the CLI approach that tmux takes really appealing here, because it allows for really easy composition with other tools like fzf, though I understand that this is really because tmux is a shell application, so it is quite different from running natively: we would need to tell the OS to focus the thing, rather than merely changing the program that the shell is running.
-
|
Scripting is a complex task. Perhaps development towards scripting could start with smaller improvements? For example, we can add a new action that would allow changing settings by pressing shortcuts. This is one of the use cases for scripting. For me, that's the only thing I'm missing right now. |
-
|
Just to chime in with my use case, I'm looking for a way to set up a vim-dispatch
I do this currently with Kitty with the launch kitten in a vim function (simplified):
let kitty = 'kitty @ launch --copy-env --keep-focus --title='.shellescape(a:request.title).' '.'--cwd='.shellescape(a:request.directory)
call system(kitty.' zsh -c '.shellescape(command).redir)
Having something similar for |
-
|
I saw a digression from the starting premise of a single-line text protocol, and I wondered if it's worth pointing out that sending text over TCP/UDP with netcat is pretty usable. Here's a case where it would seem more complicated if it were all HTTP and JSON (or control sequences): https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_zkCommands
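For readers without netcat at hand, the same interaction is a few lines of socket code. This sketch sends one line over TCP and reads the reply until the server closes the connection, which is how ZooKeeper's `ruok` command (answered with `imok`) behaves. The host and port in the usage note are placeholders, and nothing here is a Ghostty API.

```python
import socket


def send_command(host: str, port: int, command: str, timeout: float = 2.0) -> str:
    """Send a single-line command over TCP and return the full reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("utf-8") + b"\n")
        # Signal that no more data will be sent, like netcat reaching EOF on stdin.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")
```

Against a ZooKeeper server, `send_command("localhost", 2181, "ruok")` would play the role of `echo ruok | nc localhost 2181`.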
-
|
I'm interested in this for project-specific terminal layouts - similar to what Zellij provides with its layout files. My current setup defines tabs for a Rails development environment: shell, console, web server, background jobs, and a split pane for asset watchers. Each tab has its own command. Being able to launch this with a single command is incredibly useful. The appeal of having this natively in Ghostty is twofold:
-
|
I built a working version of this, and added support for IPC over a unix socket which works nicely on my Mac. I also made an MCP server so Claude Code can test my TUI apps for me! I'm interested in upstreaming the implementation for sure and would love feedback on the approach. I used a pretty simple JSON-RPC protocol for the IPC socket. It's still mostly AI and I've got a bunch to do before submitting. My branch: https://github.com/hyperb1iss/ghostty/tree/stef/remote-control ghostty-automation-2x.mov |
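For readers unfamiliar with JSON-RPC, the request and response envelope is small. The sketch below assumes the JSON-RPC 2.0 message shape with newline-delimited framing. The method name and parameters in the usage note are hypothetical, and the linked branch's actual wire format may differ.

```python
import itertools
import json
from typing import Any, Optional

_request_ids = itertools.count(1)


def make_request(method: str, params: Optional[dict] = None) -> bytes:
    """Encode one JSON-RPC 2.0 request as a newline-terminated line."""
    message: dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
    }
    if params is not None:
        message["params"] = params
    return json.dumps(message).encode("utf-8") + b"\n"


def parse_response(raw: bytes) -> Any:
    """Return the result of a JSON-RPC 2.0 response, or raise on an error reply."""
    message = json.loads(raw)
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"{error['code']}: {error['message']}")
    return message["result"]
```

A client would write something like `make_request("send_text", {"text": "ls\n"})` to the socket and pass the reply line to `parse_response`. The `id` field is what lets a client match replies to requests when several are in flight.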
-
|
This is also macOS only, but I added AppleScript wrappers around the existing App Intents framework so that you can open a new window or tab, split in any direction, send text, get contents, or send a command. It can also maintain a persistent link between a Ghostty terminal and the automation sender via UUID. I will probably end up building a SwiftUI menubar app so that I can search all open Ghostty terminals for the content I'm looking for and then switch to it. I always end up with way too many windows. Anyway, I digress. I know there's hesitancy to build out an automation API without a "unified" scripting API, but since the App Intents are already there, and this is literally what AppleScript is made for, this is a dead simple addition. Branch: https://github.com/kkilchrist/ghostty-applescript (The idea is mine after significant digging; Claude did the implementation while referring to another app of mine that had working AppleScript support.)
-
|
For what it's worth, I just want a |
-
Some inspiration: