Python LSP Benchmark Comparison

Generated from results/bench-servers/summary-20260530T065651Z.json

  • Generated at: 20260530T065651Z
  • Config: github-releases
  • Servers: pyright, ty, pyrefly, pylsp-mypy
  • Baseline server: Pyright (pyright)
  • Benchmarks: data_science, django, pandas, sqlalchemy, transformers, web, tsp_core, tsp_semantic

Server Versions

| Server | Version | Source |
| --- | --- | --- |
| Pyright | 1.1.410 | /home/runner/work/python-lsp-compare/python-lsp-compare/.python-lsp-compare/servers/pyright/1.1.410/package/dist/pyright-langserver.js |
| Ty | 0.0.40 | /home/runner/work/python-lsp-compare/python-lsp-compare/.python-lsp-compare/servers/ty/0.0.40/ty-x86_64-unknown-linux-gnu/ty |
| Pyrefly | 1.0.0 | /home/runner/work/python-lsp-compare/python-lsp-compare/.python-lsp-compare/servers/pyrefly/venv/bin/pyrefly |
| pylsp-mypy | 1.14.0 | /home/runner/work/python-lsp-compare/python-lsp-compare/.python-lsp-compare/servers/pylsp-mypy/venv/bin/pylsp |

Server Notes

  • Pyright: Requires Node.js to be installed.
  • Pyrefly: Installed from PyPI into an isolated venv because GitHub release binaries are no longer published.
  • pylsp-mypy: Uses python-lsp-server (pylsp) with the pylsp-mypy plugin.
  • pylsp-mypy: LSP features like hover and completion are provided by pylsp/jedi, not mypy.
  • pylsp-mypy: mypy contributes diagnostics only.

Overview

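| Server | Success | Benchmarks | Wall clock ms | Avg measured ms | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 6 | 3693.96 | 3.25 | 150 | 100% | 0 |
| Pyrefly | yes | 8 | 9482.25 | 18.92 | 205 | 98% | 0 |
| Pyright | yes | 6 | 31881.38 | 54.17 | 150 | 97% | 0 |
| pylsp-mypy | no | 6 | 156508.28 | 257.68 | 150 | 80% | 5 |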

Wall clock ms includes server startup, warmup iterations, and shutdown, but excludes one-time environment creation and dependency installation.
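The Overview row for each server appears to be a roll-up of its per-benchmark rows. The following is a minimal sketch of such a roll-up, not the report generator's actual code: the `BenchmarkRow` class and `aggregate` function are hypothetical names, and the input values are the pylsp-mypy rows from the per-benchmark tables in this report.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRow:
    """One per-benchmark summary row (hypothetical structure)."""
    name: str
    wall_clock_ms: float
    measured_requests: int
    non_empty_pct: float  # share of measured requests with a non-empty result
    failed_points: int


def aggregate(rows):
    """Roll per-benchmark rows up into one overview row."""
    total_requests = sum(r.measured_requests for r in rows)
    # Weight each benchmark's non-empty percentage by its request count.
    non_empty = sum(r.measured_requests * r.non_empty_pct / 100 for r in rows)
    return {
        "benchmarks": len(rows),
        "wall_clock_ms": round(sum(r.wall_clock_ms for r in rows), 2),
        "measured_requests": total_requests,
        "non_empty_pct": round(100 * non_empty / total_requests),
        "failed_points": sum(r.failed_points for r in rows),
    }


# pylsp-mypy rows copied from the per-benchmark tables in this report.
pylsp_mypy = [
    BenchmarkRow("data_science", 6243.76, 25, 80, 1),
    BenchmarkRow("django", 6824.00, 25, 100, 0),
    BenchmarkRow("pandas", 6899.21, 25, 100, 0),
    BenchmarkRow("sqlalchemy", 5898.13, 25, 60, 2),
    BenchmarkRow("transformers", 126905.79, 25, 40, 2),
    BenchmarkRow("web", 3737.40, 25, 100, 0),
]

summary = aggregate(pylsp_mypy)
print(summary)
```

Under these assumptions the roll-up gives 150 measured requests, 80% non-empty, and 5 failed points, matching the Overview row. The summed wall clock agrees with the reported 156508.28 ms to within rounding of the per-benchmark values.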

Benchmark: data_science

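| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 537.16 | 3.51 | 5 | 25 | 100% | 0 |
| Pyrefly | yes | 683.13 | 9.66 | 5 | 25 | 100% | 0 |
| Pyright | yes | 3065.58 | 44.47 | 5 | 25 | 100% | 0 |
| pylsp-mypy | no | 6243.76 | 78.12 | 5 | 25 | 80% | 1 |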

dataframe completion

Method: textDocument/completion

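| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 1.72 | 1.91 | 100% | 225.00 | +24.00 | pass |
| Pyright | yes | 5.74 | 9.42 | 100% | 201.00 | 0.00 | pass |
| Pyrefly | yes | 35.46 | 137.59 | 100% | 250.00 | +49.00 | pass |
| pylsp-mypy | yes | 46.11 | 51.98 | 100% | 181.00 | -20.00 | pass |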
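Each point in this report names the LSP method it measures. For orientation, here is a minimal sketch of how a `textDocument/completion` request is framed on the wire under the LSP base protocol (a `Content-Length` header followed by a JSON-RPC body). The `frame_request` helper, the file URI, and the cursor position are hypothetical and are not taken from the benchmark harness.

```python
import json


def frame_request(request_id, method, params):
    """Encode one JSON-RPC request with the LSP Content-Length header."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Hypothetical document URI and cursor position, for illustration only.
message = frame_request(
    1,
    "textDocument/completion",
    {
        "textDocument": {"uri": "file:///workspace/example.py"},
        "position": {"line": 10, "character": 4},
    },
)
print(message.decode("utf-8"))
```

The same framing applies to `textDocument/hover`, `textDocument/definition`, and `textDocument/references`; only the method name and result shape differ.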

dataframe describe hover

Method: textDocument/hover

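| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.24 | 0.27 | 100% | 4244.00 | +225.00 | pass |
| Pyright | yes | 1.02 | 1.14 | 100% | 4019.00 | 0.00 | pass |
| Pyrefly | yes | 3.04 | 3.23 | 100% | 3604.00 | -415.00 | pass |
| pylsp-mypy | yes | 172.85 | 174.29 | 100% | 4134.00 | +115.00 | pass |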

summarize definition

Method: textDocument/definition

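| Server | Success | Mean ms | P95 ms | Non-empty % | Definitions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.15 | 0.16 | 100% | 1.00 | 0.00 | pass |
| Pyrefly | yes | 0.17 | 0.19 | 100% | 1.00 | 0.00 | pass |
| Pyright | yes | 0.86 | 2.26 | 100% | 1.00 | 0.00 | pass |
| pylsp-mypy | yes | 0.89 | 0.97 | 100% | 1.00 | 0.00 | pass |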

edit array then complete (edit+completion)

Method: textDocument/completion

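| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pylsp-mypy | no | 4.32 | 5.64 | 0% | 0.00 | -169.00 | fail (10) |
| Pyrefly | yes | 9.18 | 9.41 | 100% | 149.00 | -20.00 | pass |
| Ty | yes | 13.56 | 13.76 | 100% | 167.00 | -2.00 | pass |
| Pyright | yes | 186.53 | 234.04 | 100% | 169.00 | 0.00 | pass |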
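The report does not describe how the "edit+completion" points are driven. One plausible shape, sketched here as an assumption rather than the harness's actual behavior, is a `textDocument/didChange` notification carrying the new document text, followed immediately by a `textDocument/completion` request. The helper names, URI, and document text are hypothetical; the message shapes follow the LSP specification for full-document sync.

```python
import json


def frame(payload):
    """Wrap a JSON-RPC payload in the LSP Content-Length framing."""
    body = json.dumps(payload).encode("utf-8")
    return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body


def edit_then_complete(uri, version, new_text, line, character, request_id):
    """Build the two messages for one edit-then-complete round trip."""
    # Notification (no "id"): replace the whole document with new_text.
    did_change = {
        "jsonrpc": "2.0",
        "method": "textDocument/didChange",
        "params": {
            "textDocument": {"uri": uri, "version": version},
            "contentChanges": [{"text": new_text}],
        },
    }
    # Request (has "id"): ask for completions at the edited position.
    completion = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/completion",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }
    return [frame(did_change), frame(completion)]


# Hypothetical document and cursor position.
messages = edit_then_complete(
    "file:///workspace/example.py", 2, "import numpy as np\nnp.", 1, 3, 7
)
```

If the harness works this way, the measured latency includes the server's re-analysis of the edited document, which would be consistent with edit points being slower than their non-edit counterparts for some servers.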

edit array then hover (edit+hover)

Method: textDocument/hover

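| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.42 | 0.47 | 100% | 2075.00 | +1797.00 | pass |
| Ty | yes | 1.88 | 1.97 | 100% | 376.00 | +98.00 | pass |
| Pyright | yes | 28.21 | 33.99 | 100% | 278.00 | 0.00 | pass |
| pylsp-mypy | yes | 166.44 | 167.70 | 100% | 5644.00 | +5366.00 | pass |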

Result Differences

  • dataframe completion: result differences detected (181.00, 201.00, 225.00, 250.00).
  • dataframe describe hover: result differences detected (3604.00, 4019.00, 4134.00, 4244.00).
  • edit array then complete (edit+completion): result differences detected (0.00, 149.00, 167.00, 169.00).
  • edit array then hover (edit+hover): result differences detected (278.00, 376.00, 2075.00, 5644.00).
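The "Delta vs Pyright" column and the "Result Differences" list can both be derived from the per-server result sizes. A minimal sketch, using the "edit array then hover" hover lengths from this benchmark; the function names are hypothetical and the distinct values are sorted numerically here for readability.

```python
def deltas_vs_baseline(results, baseline="pyright"):
    """Subtract the baseline server's result size from every server's."""
    base = results[baseline]
    return {server: round(value - base, 2) for server, value in results.items()}


def distinct_results(results):
    """Return the distinct result sizes in ascending numeric order."""
    return sorted(set(results.values()))


# Hover lengths for "edit array then hover", copied from the table above.
hover_lengths = {
    "pyrefly": 2075.0,
    "ty": 376.0,
    "pyright": 278.0,
    "pylsp-mypy": 5644.0,
}

deltas = deltas_vs_baseline(hover_lengths)
values = distinct_results(hover_lengths)
has_differences = len(values) > 1

print(deltas)  # {'pyrefly': 1797.0, 'ty': 98.0, 'pyright': 0.0, 'pylsp-mypy': 5366.0}
print(values)  # [278.0, 376.0, 2075.0, 5644.0]
```

A difference in result size does not by itself mean a server is wrong: every point listed here still passed validation.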

Benchmark: django

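| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 231.45 | 2.07 | 5 | 25 | 100% | 0 |
| Pyrefly | yes | 352.29 | 6.34 | 5 | 25 | 100% | 0 |
| Pyright | yes | 1190.57 | 13.55 | 5 | 25 | 100% | 0 |
| pylsp-mypy | yes | 6824.00 | 150.47 | 5 | 25 | 100% | 0 |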

queryset completion

Method: textDocument/completion

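| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyright | yes | 4.57 | 6.74 | 100% | 10.00 | 0.00 | pass |
| Ty | yes | 5.36 | 7.82 | 100% | 259.00 | +249.00 | pass |
| Pyrefly | yes | 26.94 | 98.60 | 100% | 38.00 | +28.00 | pass |
| pylsp-mypy | yes | 165.35 | 523.93 | 100% | 2.00 | -8.00 | pass |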
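Several rows show a P95 far above the mean (for example Pyrefly at 26.94 ms mean and 98.60 ms P95 in the table above). With 25 measured requests across 5 points, each point has about 5 samples, so a single slow request can dominate the tail. The sketch below illustrates this with a nearest-rank percentile; the report does not state which percentile method it uses, and the sample values are hypothetical.

```python
import math
from statistics import mean


def p95_nearest_rank(samples):
    """Nearest-rank 95th percentile: smallest value covering 95% of samples."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in ms: four fast requests and one slow one.
samples_ms = [1.2, 1.4, 1.3, 9.8, 1.5]

mean_ms = mean(samples_ms)
p95_ms = p95_nearest_rank(samples_ms)
print(f"mean={mean_ms:.2f} ms, p95={p95_ms:.2f} ms")
```

With only five samples, the nearest-rank P95 is simply the slowest sample, so it is best read as a worst-case indicator rather than a stable tail estimate.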

queryset filter hover

Method: textDocument/hover

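| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.20 | 0.22 | 100% | 46.00 | -11.00 | pass |
| Pyright | yes | 0.48 | 0.51 | 100% | 57.00 | 0.00 | pass |
| Pyrefly | yes | 0.55 | 1.48 | 100% | 298.00 | +241.00 | pass |
| pylsp-mypy | yes | 156.65 | 157.22 | 100% | 57.00 | 0.00 | pass |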

model definition

Method: textDocument/definition

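| Server | Success | Mean ms | P95 ms | Non-empty % | Definitions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.15 | 0.16 | 100% | 1.00 | 0.00 | pass |
| Pyrefly | yes | 0.16 | 0.17 | 100% | 1.00 | 0.00 | pass |
| Pyright | yes | 0.41 | 0.44 | 100% | 1.00 | 0.00 | pass |
| pylsp-mypy | yes | 0.89 | 0.97 | 100% | 1.00 | 0.00 | pass |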

edit queryset then complete (edit+completion)

Method: textDocument/completion

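| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 3.04 | 3.34 | 100% | 104.00 | -1.00 | pass |
| Pyrefly | yes | 3.39 | 10.28 | 100% | 83.00 | -22.00 | pass |
| Pyright | yes | 21.35 | 23.78 | 100% | 105.00 | 0.00 | pass |
| pylsp-mypy | yes | 242.86 | 272.31 | 100% | 143.00 | +38.00 | pass |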

edit queryset then hover (edit+hover)

Method: textDocument/hover

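| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.67 | 0.70 | 100% | 1190.00 | +1107.00 | pass |
| Ty | yes | 1.62 | 1.67 | 100% | 100.00 | +17.00 | pass |
| Pyright | yes | 40.94 | 47.16 | 100% | 83.00 | 0.00 | pass |
| pylsp-mypy | yes | 186.58 | 190.38 | 100% | 71.00 | -12.00 | pass |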

Result Differences

  • queryset completion: result differences detected (2.00, 10.00, 38.00, 259.00).
  • queryset filter hover: result differences detected (46.00, 57.00, 298.00).
  • edit queryset then complete (edit+completion): result differences detected (83.00, 104.00, 105.00, 143.00).
  • edit queryset then hover (edit+hover): result differences detected (71.00, 83.00, 100.00, 1190.00).

Benchmark: pandas

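| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 830.00 | 6.28 | 5 | 25 | 100% | 0 |
| Pyrefly | yes | 680.66 | 12.50 | 5 | 25 | 100% | 0 |
| Pyright | yes | 7293.84 | 86.23 | 5 | 25 | 100% | 0 |
| pylsp-mypy | yes | 6899.21 | 129.41 | 5 | 25 | 100% | 0 |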

report dataframe completion

Method: textDocument/completion

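| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 17.83 | 21.38 | 100% | 1000.00 | +725.80 | pass |
| Pyrefly | yes | 38.51 | 152.79 | 100% | 39.00 | -235.20 | pass |
| Pyright | yes | 70.11 | 239.25 | 100% | 274.20 | 0.00 | pass |
| pylsp-mypy | yes | 87.25 | 264.13 | 100% | 6.00 | -268.20 | pass |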

dataframe groupby hover

Method: textDocument/hover

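| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.22 | 0.27 | 100% | 308.00 | -42.00 | pass |
| Pyright | yes | 0.90 | 1.04 | 100% | 350.00 | 0.00 | pass |
| Pyrefly | yes | 6.24 | 14.79 | 100% | 3120.00 | +2770.00 | pass |
| pylsp-mypy | yes | 182.07 | 186.75 | 100% | 301.00 | -49.00 | pass |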

build report definition

Method: textDocument/definition

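| Server | Success | Mean ms | P95 ms | Non-empty % | Definitions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.16 | 0.17 | 100% | 1.00 | 0.00 | pass |
| Pyrefly | yes | 0.18 | 0.21 | 100% | 1.00 | 0.00 | pass |
| Pyright | yes | 0.43 | 0.47 | 100% | 1.00 | 0.00 | pass |
| pylsp-mypy | yes | 0.88 | 0.94 | 100% | 1.00 | 0.00 | pass |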

edit dataframe then complete (edit+completion)

Method: textDocument/completion

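| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 11.76 | 12.23 | 100% | 448.00 | +7.00 | pass |
| Pyrefly | yes | 14.44 | 15.14 | 100% | 256.00 | -185.00 | pass |
| pylsp-mypy | yes | 194.73 | 197.95 | 100% | 442.00 | +1.00 | pass |
| Pyright | yes | 349.56 | 772.74 | 100% | 441.00 | 0.00 | pass |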

edit dataframe then hover (edit+hover)

Method: textDocument/hover

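| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 1.43 | 1.49 | 100% | 4378.00 | +86.00 | pass |
| Pyrefly | yes | 3.14 | 9.78 | 100% | 2481.00 | -1811.00 | pass |
| Pyright | yes | 10.16 | 12.42 | 100% | 4292.00 | 0.00 | pass |
| pylsp-mypy | yes | 182.12 | 197.33 | 100% | 232.00 | -4060.00 | pass |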

Result Differences

  • report dataframe completion: result differences detected (6.00, 39.00, 274.20, 1000.00).
  • dataframe groupby hover: result differences detected (301.00, 308.00, 350.00, 3120.00).
  • edit dataframe then complete (edit+completion): result differences detected (256.00, 441.00, 442.00, 448.00).
  • edit dataframe then hover (edit+hover): result differences detected (232.00, 2481.00, 4292.00, 4378.00).

Benchmark: sqlalchemy

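| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 320.64 | 1.64 | 5 | 25 | 100% | 0 |
| Pyrefly | yes | 852.09 | 17.67 | 5 | 25 | 100% | 0 |
| Pyright | yes | 3030.71 | 43.40 | 5 | 25 | 100% | 0 |
| pylsp-mypy | no | 5898.13 | 74.43 | 5 | 25 | 60% | 2 |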

query completion

Method: textDocument/completion

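| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 4.05 | 8.99 | 100% | 1.00 | 0.00 | pass |
| Pyright | yes | 8.21 | 12.62 | 100% | 1.00 | 0.00 | pass |
| pylsp-mypy | yes | 32.31 | 50.51 | 100% | 1.00 | 0.00 | pass |
| Pyrefly | yes | 72.42 | 277.95 | 100% | 38.00 | +37.00 | pass |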

sessionmaker hover

Method: textDocument/hover

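| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.33 | 0.36 | 100% | 10628.00 | +56.00 | pass |
| Pyright | yes | 1.30 | 1.60 | 100% | 10572.00 | 0.00 | pass |
| Pyrefly | yes | 1.35 | 3.44 | 100% | 13682.00 | +3110.00 | pass |
| pylsp-mypy | yes | 292.89 | 295.60 | 100% | 10498.00 | -74.00 | pass |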

mapped class definition

Method: textDocument/definition

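| Server | Success | Mean ms | P95 ms | Non-empty % | Definitions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.15 | 0.16 | 100% | 2.00 | +1.00 | pass |
| Pyrefly | yes | 0.21 | 0.22 | 100% | 1.00 | 0.00 | pass |
| Pyright | yes | 0.32 | 0.34 | 100% | 1.00 | 0.00 | pass |
| pylsp-mypy | yes | 0.87 | 0.92 | 100% | 1.00 | 0.00 | pass |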

edit query then complete (edit+completion)

Method: textDocument/completion

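| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.50 | 0.53 | 100% | 17.00 | -22.00 | pass |
| Ty | yes | 2.04 | 2.23 | 100% | 23.00 | -16.00 | pass |
| pylsp-mypy | no | 23.44 | 24.00 | 0% | 0.00 | -39.00 | fail (10) |
| Pyright | yes | 126.59 | 162.87 | 100% | 39.00 | 0.00 | pass |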

edit session then hover (edit+hover)

Method: textDocument/hover

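| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 1.63 | 1.81 | 100% | 958.00 | +58.00 | pass |
| Pyrefly | yes | 13.89 | 21.81 | 100% | 1869.00 | +969.00 | pass |
| pylsp-mypy | no | 22.65 | 22.88 | 0% | 0.00 | -900.00 | fail (10) |
| Pyright | yes | 80.58 | 91.12 | 100% | 900.00 | 0.00 | pass |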

Result Differences

  • query completion: result differences detected (1.00, 38.00).
  • sessionmaker hover: result differences detected (10498.00, 10572.00, 10628.00, 13682.00).
  • mapped class definition: result differences detected (1.00, 2.00).
  • edit query then complete (edit+completion): result differences detected (0.00, 17.00, 23.00, 39.00).
  • edit session then hover (edit+hover): result differences detected (0.00, 900.00, 958.00, 1869.00).

Benchmark: transformers

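| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 1428.94 | 3.72 | 5 | 25 | 100% | 0 |
| Pyrefly | yes | 3348.94 | 77.33 | 5 | 25 | 80% | 0 |
| Pyright | yes | 16075.35 | 129.35 | 5 | 25 | 80% | 0 |
| pylsp-mypy | no | 126905.79 | 1055.68 | 5 | 25 | 40% | 2 |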

classifier pipeline completion

Method: textDocument/completion

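| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 11.84 | 13.48 | 100% | 771.00 | +648.00 | pass |
| Pyright | yes | 52.56 | 82.79 | 100% | 123.00 | 0.00 | pass |
| pylsp-mypy | yes | 118.97 | 126.68 | 100% | 2.00 | -121.00 | pass |
| Pyrefly | yes | 372.76 | 1489.89 | 100% | 38.00 | -85.00 | pass |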

pipeline hover

Method: textDocument/hover

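| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.16 | 0.20 | 100% | 7.00 | -27.00 | pass |
| Pyrefly | yes | 0.18 | 0.24 | 100% | 48.00 | +14.00 | pass |
| Pyright | yes | 0.48 | 0.56 | 100% | 34.00 | 0.00 | pass |
| pylsp-mypy | no | 1795.33 | 1831.52 | 0% | 0.00 | -34.00 | fail (10) |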

auto tokenizer definition

Method: textDocument/definition

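| Server | Success | Mean ms | P95 ms | Non-empty % | Definitions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.20 | 0.28 | 100% | 1.00 | 0.00 | pass |
| Pyright | yes | 0.40 | 0.45 | 100% | 1.00 | 0.00 | pass |
| Pyrefly | yes | 0.60 | 1.89 | 100% | 1.00 | 0.00 | pass |
| pylsp-mypy | yes | 1585.80 | 1644.14 | 100% | 1.00 | 0.00 | pass |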

edit prediction then complete (edit+completion)

Method: textDocument/completion

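| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pylsp-mypy | yes | 2.26 | 2.38 | 0% | 0.00 | 0.00 | pass |
| Ty | yes | 3.45 | 3.77 | 100% | 23.00 | +23.00 | pass |
| Pyright | yes | 6.26 | 6.48 | 0% | 0.00 | 0.00 | pass |
| Pyrefly | yes | 12.57 | 26.35 | 0% | 0.00 | 0.00 | pass |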

edit tokenizer then hover (edit+hover)

Method: textDocument/hover

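| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.54 | 0.59 | 100% | 33.00 | +3.00 | pass |
| Ty | yes | 2.93 | 2.99 | 100% | 7.00 | -23.00 | pass |
| Pyright | yes | 587.05 | 618.65 | 100% | 30.00 | 0.00 | pass |
| pylsp-mypy | no | 1776.04 | 1841.94 | 0% | 0.00 | -30.00 | fail (10) |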

Result Differences

  • classifier pipeline completion: result differences detected (2.00, 38.00, 123.00, 771.00).
  • pipeline hover: result differences detected (0.00, 7.00, 34.00, 48.00).
  • edit prediction then complete (edit+completion): result differences detected (0.00, 23.00).
  • edit tokenizer then hover (edit+hover): result differences detected (0.00, 7.00, 30.00, 33.00).

Benchmark: web

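| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 345.78 | 2.28 | 5 | 25 | 100% | 0 |
| Pyright | yes | 1225.33 | 8.03 | 5 | 25 | 100% | 0 |
| Pyrefly | yes | 686.92 | 10.62 | 5 | 25 | 100% | 0 |
| pylsp-mypy | yes | 3737.40 | 57.99 | 5 | 25 | 100% | 0 |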

request args completion

Method: textDocument/completion

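| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyright | yes | 4.74 | 7.92 | 100% | 14.00 | 0.00 | pass |
| Ty | yes | 6.35 | 9.54 | 100% | 453.00 | +439.00 | pass |
| pylsp-mypy | yes | 20.74 | 26.52 | 100% | 1.00 | -13.00 | pass |
| Pyrefly | yes | 44.28 | 148.40 | 100% | 275.40 | +261.40 | pass |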

client session hover

Method: textDocument/hover

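| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 0.16 | 0.19 | 100% | 7.00 | -19.00 | pass |
| Pyright | yes | 0.61 | 0.71 | 100% | 26.00 | 0.00 | pass |
| Pyrefly | yes | 4.61 | 13.16 | 100% | 149.00 | +123.00 | pass |
| pylsp-mypy | yes | 14.92 | 24.00 | 100% | 359.00 | +333.00 | pass |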

client references

Method: textDocument/references

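| Server | Success | Mean ms | P95 ms | Non-empty % | References found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.30 | 0.34 | 100% | 2.00 | 0.00 | pass |
| Ty | yes | 0.53 | 0.72 | 100% | 2.00 | 0.00 | pass |
| Pyright | yes | 0.88 | 1.00 | 100% | 2.00 | 0.00 | pass |
| pylsp-mypy | yes | 1.91 | 1.96 | 100% | 2.00 | 0.00 | pass |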

edit response then complete (edit+completion)

Method: textDocument/completion

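| Server | Success | Mean ms | P95 ms | Non-empty % | Completions found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 1.45 | 4.08 | 100% | 32.00 | -173.00 | pass |
| Ty | yes | 2.96 | 3.17 | 100% | 227.00 | +22.00 | pass |
| Pyright | yes | 4.96 | 6.02 | 100% | 205.00 | 0.00 | pass |
| pylsp-mypy | yes | 84.43 | 85.18 | 100% | 57.00 | -148.00 | pass |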

edit response then hover (edit+hover)

Method: textDocument/hover

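| Server | Success | Mean ms | P95 ms | Non-empty % | Hover length | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ty | yes | 1.39 | 1.42 | 100% | 1650.00 | +1230.00 | pass |
| Pyrefly | yes | 2.46 | 5.06 | 100% | 3606.00 | +3186.00 | pass |
| Pyright | yes | 28.98 | 34.78 | 100% | 420.00 | 0.00 | pass |
| pylsp-mypy | yes | 167.93 | 169.92 | 100% | 363.00 | -57.00 | pass |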

Result Differences

  • request args completion: result differences detected (1.00, 14.00, 275.40, 453.00).
  • client session hover: result differences detected (7.00, 26.00, 149.00, 359.00).
  • edit response then complete (edit+completion): result differences detected (32.00, 57.00, 205.00, 227.00).
  • edit response then hover (edit+hover): result differences detected (363.00, 420.00, 1650.00, 3606.00).

Benchmark: tsp_core

| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 284.35 | 0.27 | 8 | 40 | 100% | 0 |

builtins semantic tokens

Method: semantic token impl using typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Semantic tokens found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.97 | 1.14 | 100% | 30.00 | 0.00 | pass |

builtin int computed type

Method: typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Results found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.16 | 0.19 | 100% | 7.00 | 0.00 | pass |

list declared type

Method: typeServer/getDeclaredType

| Server | Success | Mean ms | P95 ms | Non-empty % | Results found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.15 | 0.16 | 100% | 7.00 | 0.00 | pass |

generic specialization computed type

Method: typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Results found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.27 | 0.46 | 100% | 8.00 | 0.00 | pass |

flow narrowed branch type

Method: typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Results found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.16 | 0.17 | 100% | 8.00 | 0.00 | pass |

stdlib path computed type

Method: typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Results found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.14 | 0.15 | 100% | 7.00 | 0.00 | pass |

function argument expected type

Method: typeServer/getExpectedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Results found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.15 | 0.16 | 100% | 7.00 | 0.00 | pass |

edited narrowing recomputes type (edit+getComputedType)

Method: typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Results found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 0.20 | 0.22 | 100% | 5.00 | 0.00 | pass |

Benchmark: tsp_semantic

| Server | Success | Wall clock ms | Avg measured ms | Points | Measured requests | Non-empty % | Failed points |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 2593.87 | 34.31 | 3 | 15 | 100% | 0 |

django semantic tokens

Method: semantic token impl using typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Semantic tokens found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 46.31 | 74.66 | 100% | 126.00 | 0.00 | pass |

transformers semantic tokens

Method: semantic token impl using typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Semantic tokens found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 29.30 | 51.50 | 100% | 74.00 | 0.00 | pass |

stdlib semantic tokens

Method: semantic token impl using typeServer/getComputedType

| Server | Success | Mean ms | P95 ms | Non-empty % | Semantic tokens found | Delta vs Pyright | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pyrefly | yes | 27.32 | 47.51 | 100% | 75.00 | 0.00 | pass |