📎 for your 📋.
Sends images on your clipboard to a vision AI model for transcription into text.
- Compile it (an exercise for the reader, but having devbox and looking in the justfile would help)
- Install it (in WSL I alias `clippy-clippy` to the Windows executable)
- Run it once to see if everything's working with your PATH. It'll output some information about how to configure it, and the default config file is easy to understand if you've ever worked with an OpenAI-compatible API
- Copy an image that has some text in it onto your clipboard
- Run it, and it'll output the text

  If you want to copy that text back to your clipboard, you can overwrite it with something like `clippy-clippy | pbcopy` on macOS or `clippy-clippy | clip.exe` on Windows.

  You might want to alias that to something, but I wanted the default `clippy-clippy` functionality to not be "destructive" to what is on your clipboard.
- Try running it with the `-m` or `--markdown` flag to output GitHub Flavored Markdown. It's super nice for tables and things like that.
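If you do want the copy-back behavior as a single command, the pipe into pbcopy or clip.exe can be wrapped in a small shell function. This is only a sketch: the name `clippy_copy` is made up for illustration and is not part of the tool, and it assumes `clippy-clippy` is already on your PATH along with either `pbcopy` or `clip.exe`.

```shell
# Hypothetical helper (not shipped with clippy-clippy): transcribe the image
# on the clipboard, then overwrite the clipboard with the resulting text.
clippy_copy() {
    # Pick whichever clipboard writer is available on this machine.
    if command -v pbcopy >/dev/null 2>&1; then
        clippy-clippy "$@" | pbcopy      # macOS
    elif command -v clip.exe >/dev/null 2>&1; then
        clippy-clippy "$@" | clip.exe    # Windows, or WSL with clip.exe on PATH
    else
        echo "no clipboard writer found (tried pbcopy and clip.exe)" >&2
        return 1
    fi
}
```

Any arguments are passed straight through, so `clippy_copy -m` should put the markdown version on the clipboard. Plain `clippy-clippy` stays non-destructive either way.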
Use at your own risk: this was almost entirely vibe-coded. I had to tweak a bunch of things to get it to work on macOS, but after that was done it compiled for Windows on WSL on the first try 🙀. I made several adjustments to the system prompt and modified the flags a bit, but it all seems to work well.
I don't know Rust very well, and I like learning programming languages by modifying existing codebases. This is also a tool I'll likely use frequently, which increases the probability that I'll be delving into the source and making adjustments.
The configuration system is ALMOST what I want, but it works well enough. It's currently totally usable as a tool, so I'll use it for a while before I decide what other improvements it might need.