A Neovim plugin that provides local edit completions and cursor predictions. Currently supports custom models as well as models from Zeta (Zed) and SweepAI.
> **Warning:** This is an early-stage beta project. Expect bugs, incomplete features, and breaking changes.
- Go 1.24.2+ (for building the server component)
- Neovim 0.8+ (for the plugin)
Using lazy.nvim:

```lua
{
  "leonardcser/cursortab.nvim",
  -- version = "*", -- Use latest tagged version for more stability
  build = "cd server && go build",
  config = function()
    require("cursortab").setup()
  end,
}
```

Using packer.nvim:
```lua
use {
  "leonardcser/cursortab.nvim",
  -- tag = "*", -- Use latest tagged version for more stability
  run = "cd server && go build",
  config = function()
    require("cursortab").setup()
  end,
}
```

Default configuration:

```lua
require("cursortab").setup({
  enabled = true,
  log_level = "info", -- "trace", "debug", "info", "warn", "error"
  state_dir = vim.fn.stdpath("state") .. "/cursortab", -- Directory for runtime files (log, socket, pid)
  keymaps = {
    accept = "<Tab>", -- Keymap to accept completion, or false to disable
    partial_accept = "<S-Tab>", -- Keymap to partially accept, or false to disable
    trigger = false, -- Keymap to manually trigger completion, or false to disable
  },
  ui = {
    colors = {
      deletion = "#4f2f2f", -- Background color for deletions
      addition = "#394f2f", -- Background color for additions
      modification = "#282e38", -- Background color for modifications
      completion = "#80899c", -- Foreground color for completions
    },
    jump = {
      symbol = "", -- Symbol shown for jump points
      text = " TAB ", -- Text displayed after jump symbol
      show_distance = true, -- Show line distance for off-screen jumps
      bg_color = "#373b45", -- Jump text background color
      fg_color = "#bac1d1", -- Jump text foreground color
    },
  },
  behavior = {
    idle_completion_delay = 50, -- Delay in ms after idle to trigger completion (-1 to disable)
    text_change_debounce = 50, -- Debounce in ms after text change to trigger completion (-1 to disable)
    max_visible_lines = 0, -- Max visible lines per completion (0 to disable)
    cursor_prediction = {
      enabled = true, -- Show jump indicators after completions
      auto_advance = true, -- When no changes, show cursor jump to last line
      proximity_threshold = 2, -- Min lines apart to show cursor jump (0 to disable)
    },
  },
  provider = {
    type = "inline", -- Provider: "inline", "fim", "sweep", "sweepapi", or "zeta"
    url = "http://localhost:8000", -- URL of the provider server
    api_key_env = "", -- Env var name for API key (e.g., "OPENAI_API_KEY")
    model = "", -- Model name
    temperature = 0.0, -- Sampling temperature
    max_tokens = 512, -- Max tokens to generate
    top_k = 50, -- Top-k sampling
    completion_timeout = 5000, -- Timeout in ms for completion requests
    max_diff_history_tokens = 512, -- Max tokens for diff history (0 = no limit)
    completion_path = "/v1/completions", -- API endpoint path
    fim_tokens = { -- FIM tokens (for the FIM provider)
      prefix = "<|fim_prefix|>",
      suffix = "<|fim_suffix|>",
      middle = "<|fim_middle|>",
    },
    privacy_mode = true, -- Don't send telemetry to the provider
  },
  blink = {
    enabled = false, -- Enable blink source
    ghost_text = true, -- Show native ghost text alongside blink menu
  },
  debug = {
    immediate_shutdown = false, -- Shut down the daemon immediately when no clients remain
  },
})
```

For detailed configuration documentation, see `:help cursortab-config`.
The plugin supports five AI provider backends: Inline, FIM, Sweep, Sweep API, and Zeta.
| Provider | Multi-line | Multi-edit | Cursor Prediction | Model |
|---|---|---|---|---|
| Inline | | | | Any base model |
| FIM | ✓ | | | Any FIM-capable |
| Sweep | ✓ | ✓ | ✓ | sweep-next-edit-1.5b |
| Sweep API | ✓ | ✓ | ✓ | sweep-next-edit-7b (hosted) |
| Zeta | ✓ | ✓ | ✓ | zeta |
End-of-line completion that works with any OpenAI-compatible `/v1/completions` endpoint.
Requirements:
- An OpenAI-compatible completions endpoint
Example Configuration:

```lua
require("cursortab").setup({
  provider = {
    type = "inline",
    url = "http://localhost:8000",
  },
})
```

Example Setup:

```sh
# Using llama.cpp
llama-server -hf ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF --port 8000
```

Fill-in-the-Middle completion using standard FIM tokens. Uses both prefix
(before cursor) and suffix (after cursor) context. Compatible with Qwen,
DeepSeek-Coder, and similar models. Works with any OpenAI-compatible
/v1/completions endpoint.
Requirements:
- An OpenAI-compatible completions endpoint with a FIM-capable model
Example Configuration:

```lua
require("cursortab").setup({
  provider = {
    type = "fim",
    url = "http://localhost:8000",
  },
})
```

Example Setup:

```sh
# Using llama.cpp with Qwen2.5-Coder 1.5B
llama-server -hf ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF --port 8000

# Or with Qwen2.5-Coder 14B + 0.5B draft for speculative decoding
llama-server \
  -hf ggml-org/Qwen2.5-Coder-14B-Q8_0-GGUF:q8_0 \
  -hfd ggml-org/Qwen2.5-Coder-0.5B-Q8_0-GGUF:q8_0 \
  --port 8012 \
  -b 1024 \
  -ub 1024 \
  --cache-reuse 256
```

Sweep Next-Edit 1.5B model for fast, accurate next-edit predictions. Sends the full file for small files and a window trimmed around the cursor for large files.
Requirements:
- vLLM or compatible inference server
- Sweep Next-Edit model downloaded from Hugging Face
Example Configuration:

```lua
require("cursortab").setup({
  provider = {
    type = "sweep",
    url = "http://localhost:8000",
  },
})
```

Example Setup:

```sh
# Using llama.cpp
llama-server -hf sweepai/sweep-next-edit-1.5b-GGUF --port 8000

# Or with a local GGUF file
llama-server -m sweep-next-edit-1.5b.q8_0.v2.gguf --port 8000
```

Sweep's hosted API for Next-Edit predictions. No local model setup required.
> **Note:** The hosted API runs `sweep-next-edit-7b` for better-quality predictions.
Requirements:
- Create an account at sweep.dev and get your API token
- Set the `SWEEPAPI_TOKEN` environment variable with your token
Example Configuration:

```sh
# In your shell config (.bashrc, .zshrc, etc.)
export SWEEPAPI_TOKEN="your-api-token-here"
```

```lua
require("cursortab").setup({
  provider = {
    type = "sweepapi",
    api_key_env = "SWEEPAPI_TOKEN",
  },
})
```

Zed's Zeta model: a Qwen2.5-Coder-7B fine-tuned for edit prediction using DPO and SFT.
Requirements:
- vLLM or compatible inference server
- Zeta model downloaded from Hugging Face
Example Configuration:

```lua
require("cursortab").setup({
  provider = {
    type = "zeta",
    url = "http://localhost:8000",
    model = "zeta",
  },
})
```

Example Setup:

```sh
# Using vLLM
vllm serve zed-industries/zeta --served-model-name zeta --port 8000
# See the Hugging Face page for optimized deployment options
```

This integration exposes a minimal blink source that only consumes
append_chars (end-of-line ghost text). Complex diffs (multi-line edits,
replacements, deletions, cursor prediction UI) still render via the native UI.
```lua
require("cursortab").setup({
  keymaps = {
    accept = false, -- Let blink manage <Tab>
  },
  blink = {
    enabled = true,
    ghost_text = false, -- Disable native ghost text
  },
})

require("blink.cmp").setup({
  sources = {
    providers = {
      cursortab = {
        module = "cursortab.blink",
        name = "cursortab",
        async = true,
        -- Should match provider.completion_timeout in cursortab config
        timeout_ms = 5000,
        score_offset = 50, -- Higher priority among suggestions
      },
    },
  },
})
```

- Tab Key: Navigate to cursor predictions or accept completions
- Shift-Tab Key: Partially accept completions (word-by-word for inline, line-by-line for multi-line)
- Esc Key: Reject current completions
- The plugin automatically shows jump indicators for predicted cursor positions
- Visual indicators appear for additions, deletions, and completions
- Off-screen jump targets show directional arrows with distance information
- `:CursortabToggle`: Toggle the plugin on/off
- `:CursortabShowLog`: Show the cursortab log file in a new buffer
- `:CursortabClearLog`: Clear the cursortab log file
- `:CursortabStatus`: Show detailed status information about the plugin and daemon
- `:CursortabRestart`: Restart the cursortab daemon process
To build the server component:

```sh
cd server && go build
```

To run tests:

```sh
cd server && go test ./...
```

Which provider should I use?

See the provider feature comparison table for capabilities. For the best experience:
- If you have a consumer GPU and want to run locally, use Sweep with the `sweep-next-edit-1.5b` model for fast local inference
- Otherwise, use Sweep API for the best quality with the hosted `sweep-next-edit-7b` model
Why are completions slow?
- Use a smaller or more heavily quantized model (e.g., Q4 instead of Q8)
- Decrease `provider.max_tokens` to reduce output length (this also limits input context)
Why are completions not working?
- Update to the latest version and restart the daemon with `:CursortabRestart`
- Increase `provider.completion_timeout` (default: 5000 ms) to 10000 or more if your model is slow
- Increase `provider.max_tokens` to give the model more surrounding context (trade-off: slower completions)
How do I update the plugin?
Use your Neovim plugin manager to pull the latest changes, then run
`:CursortabRestart` to restart the daemon.
Why isn't my API key or environment variable being picked up?
The plugin runs a background daemon that persists after Neovim closes.
Environment variables are only loaded when the daemon starts. If you add or
change an environment variable (e.g., `SWEEPAPI_TOKEN` in your `.zshrc`), simply
restarting Neovim or your shell won't update the daemon.

Solution: Run `:CursortabRestart` to restart the daemon with the new
environment variables.
Contributions are welcome! Feel free to open an issue or a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
