Add "WFGY Core 2.0" – free system prompt + 60s self-test for more stable reasoning #46

hi, thanks for maintaining TheBigPromptLibrary and this prompt collection 🙏

i’m an indie dev (PSBigBig) working on a long-term open source project called WFGY.
before my main repo passed ~1.4k stars, i spent about one year on a very simple idea:

instead of building another tool / agent / framework,
i tried to write a small “reasoning core” in plain text, so any strong LLM can use it
just by putting a txt block in the system prompt.

i call it WFGY Core 2.0.

it is:

  • not a new model, not a fine-tune
  • just one text block in the system / pre-prompt area
  • goal: less random hallucination, more stable multi-step reasoning
  • works with “any strong LLM”, text-only, no tools, no external infra
  • MIT license, full repo here:
    WFGY · All Principles Return to One (MIT, text only):
    https://github.com/onestardao/WFGY

i think it fits this repo because it is closer to a math-based “reasoning bumper”
than a normal style prompt. it also comes with a tiny 60s self-test so people can
feel the effect inside the chat window without code.


1. what problem this system prompt tries to solve

most prompts in the wild focus on:

  • persona
  • style, tone, formatting
  • specific task patterns (CoT, rephrasing, etc.)

WFGY Core 2.0 is a bit different:

  • it defines a tension metric delta_s = 1 − cos(I, G) between intention and goal
  • it tracks zones (safe / transit / risk / danger) and uses that to decide when to
    rebalance attention or change path
  • it adds a simple hysteresis coupler to avoid jitter from tiny changes
  • it explicitly encourages the model to:
    • admit uncertainty instead of hallucinating details
    • keep long answers structurally consistent across follow-ups
    • only “bridge” to a new path when the tension actually goes down

in short: it is a math-flavoured system prompt that tries to stabilize reasoning,
not a jailbreak or style trick. (a rough code sketch of the core math follows below.)
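
for readers who like code more than prose, here is a tiny python sketch of the
tension metric and the zone mapping. everything here is illustrative: the core
itself is plain text and the model estimates these quantities internally, so the
vectors, bucket scores, and function names below are my made-up inputs, not part
of the prompt.

```python
# minimal sketch of delta_s and the zone mapping (illustrative only;
# WFGY is text-only, the model estimates these quantities internally)
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def delta_s(intention_vec, goal_vec):
    # delta_s = 1 - cos(I, G): 0 = aligned, higher = more tension
    return 1.0 - cosine(intention_vec, goal_vec)

def delta_s_anchored(sim_entities, sim_relations, sim_constraints,
                     w=(0.5, 0.3, 0.2)):
    # anchor-based variant: 1 - sim_est, with the default weights from the core
    sim_est = w[0] * sim_entities + w[1] * sim_relations + w[2] * sim_constraints
    return 1.0 - sim_est

def zone(ds):
    # zone boundaries copied from the core text
    if ds < 0.40:
        return "safe"
    if ds <= 0.60:
        return "transit"
    if ds <= 0.85:
        return "risk"
    return "danger"

# toy example: two slightly misaligned vectors for intention and goal
ds = delta_s([1.0, 0.2, 0.0], [1.0, 0.0, 0.1])
print(f"delta_s = {ds:.3f} -> zone = {zone(ds)}")
```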


2. quick user-facing effects (rough feel)

from my own tests (and from people who tried it), typical changes look like:

  • follow-up questions drift less from the original intent
  • long explanations keep the structure more coherent
  • the model is more willing to say “i am not sure” instead of inventing fake facts
  • when used to write image prompts, the generated prompts often have a clearer
    structure and story, so many users feel the pictures look more intentional and
    less random

of course this depends on the base model and tasks, so i also attach a small
A/B/C self-test below.


3. the system prompt itself (WFGY Core 2.0)

below is the raw txt block. the idea is very simple: the user opens a new chat,
puts this in the system / pre-prompt field, then uses the model normally.

(when you publish it in your repo, you can wrap it in a proper ```text code block.)

=== SYSTEM PROMPT START ===
WFGY Core Flagship v2.0 (text-only; no tools). Works in any chat.
[Similarity / Tension]
delta_s = 1 − cos(I, G). If anchors exist use 1 − sim_est, where
sim_est = w_e*sim(entities) + w_r*sim(relations) + w_c*sim(constraints),
with default w={0.5,0.3,0.2}. sim_est ∈ [0,1], renormalize if bucketed.
[Zones & Memory]
Zones: safe < 0.40 | transit 0.40–0.60 | risk 0.60–0.85 | danger > 0.85.
Memory: record(hard) if delta_s > 0.60; record(exemplar) if delta_s < 0.35.
Soft memory in transit when lambda_observe ∈ {divergent, recursive}.
[Defaults]
B_c=0.85, gamma=0.618, theta_c=0.75, zeta_min=0.10, alpha_blend=0.50,
a_ref=uniform_attention, m=0, c=1, omega=1.0, phi_delta=0.15, epsilon=0.0, k_c=0.25.
[Coupler (with hysteresis)]
Let B_s := delta_s. Progression: at t=1, prog=zeta_min; else
prog = max(zeta_min, delta_s_prev − delta_s_now). Set P = pow(prog, omega).
Reversal term: Phi = phi_delta * alt + epsilon, where alt ∈ {+1,−1} flips
only when an anchor flips truth across consecutive Nodes AND |Δanchor| ≥ h.
Use h=0.02; if |Δanchor| < h then keep previous alt to avoid jitter.
Coupler output: W_c = clip(B_s * P + Phi, −theta_c, +theta_c).
[Progression & Guards]
BBPF bridge is allowed only if (delta_s decreases) AND (W_c < 0.5 * theta_c).
When bridging, emit: Bridge=[reason/prior_delta_s/new_path].
[BBAM (attention rebalance)]
alpha_blend = clip(0.50 + k_c*tanh(W_c), 0.35, 0.65); blend with a_ref.
[Lambda update]
Delta := delta_s_t − delta_s_{t−1}; E_resonance = rolling_mean(delta_s, window=min(t,5)).
lambda_observe is: convergent if Delta ≤ −0.02 and E_resonance non-increasing;
recursive if |Delta| < 0.02 and E_resonance flat; divergent if Delta ∈ (−0.02, +0.04] with oscillation;
chaotic if Delta > +0.04 or anchors conflict.
[DT micro-rules]
=== SYSTEM PROMPT END ===
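
as a companion to the txt block, here is a minimal python sketch of how i read
the coupler + guard rules above. the constants come straight from the [Defaults]
block; the function signature and the anchor-flip inputs are my own assumptions
about how the model tracks state, since the real core runs purely in text.

```python
import math

# constants copied from the [Defaults] block of the core text
THETA_C, ZETA_MIN, OMEGA = 0.75, 0.10, 1.0
PHI_DELTA, EPSILON, K_C, H = 0.15, 0.0, 0.25, 0.02

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def coupler_step(ds_now, ds_prev, alt_prev, anchor_flipped, anchor_delta, t):
    # progression: floor of zeta_min at t=1, otherwise the drop in delta_s
    prog = ZETA_MIN if t == 1 else max(ZETA_MIN, ds_prev - ds_now)
    P = prog ** OMEGA
    # hysteresis: alt only flips on a real anchor flip with |delta anchor| >= h
    alt = -alt_prev if (anchor_flipped and abs(anchor_delta) >= H) else alt_prev
    phi = PHI_DELTA * alt + EPSILON
    w_c = clip(ds_now * P + phi, -THETA_C, THETA_C)  # B_s := delta_s
    # BBPF bridge guard and BBAM attention blend, both from the core text
    bridge_ok = (ds_now < ds_prev) and (w_c < 0.5 * THETA_C)
    alpha_blend = clip(0.50 + K_C * math.tanh(w_c), 0.35, 0.65)
    return w_c, alt, bridge_ok, alpha_blend

# toy trace: tension drops from 0.55 to 0.42, no anchor flip
w_c, alt, bridge_ok, alpha = coupler_step(0.42, 0.55, +1, False, 0.0, t=2)
print(f"W_c={w_c:.3f} bridge_ok={bridge_ok} alpha_blend={alpha:.3f}")
```

the [Lambda update] classification would follow the same pattern (keep a short
rolling window of delta_s and compare Delta against the ±0.02 / +0.04 thresholds);
i left it out to keep the sketch small.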


4. 60s self-test prompt (A/B/C comparison)

to keep things beginner-friendly but still a bit structured, i also use this small
prompt that lets the model “imagine” the A/B/C modes and output a small comparison table.

again, you can optionally include this in your repo as a “quick self-test” snippet.

=== SELF-TEST PROMPT START ===
SYSTEM:
You are evaluating the effect of a mathematical reasoning core called “WFGY Core 2.0”.

You will compare three modes of yourself:

A = Baseline
No WFGY core text is loaded. Normal chat, no extra math rules.

B = Silent Core
Assume the WFGY core text is loaded in system and active in the background,
but the user never calls it by name. You quietly follow its rules while answering.

C = Explicit Core
Same as B, but you are allowed to slow down, make your reasoning steps explicit,
and consciously follow the core logic when you solve problems.

Use the SAME small task set for all three modes, across 5 domains:

  1. math word problems
  2. small coding tasks
  3. factual QA with tricky details
  4. multi-step planning
  5. long-context coherence (summary + follow-up question)

For each domain:

  • design 2–3 short but non-trivial tasks
  • imagine how A would answer
  • imagine how B would answer
  • imagine how C would answer
  • give rough scores from 0–100 for:
    • Semantic accuracy
    • Reasoning quality
    • Stability / drift (how consistent across follow-ups)

Important:

  • Be honest even if the uplift is small.
  • This is only a quick self-estimate, not a real benchmark.
  • If you feel unsure, say so in the comments.

USER:
Run the test now on the five domains and then output:

  1. One table with A/B/C scores per domain.
  2. A short bullet list of the biggest differences you noticed.
  3. One overall 0–100 “WFGY uplift guess” and 3 lines of rationale.
=== SELF-TEST PROMPT END ===

5. proposal for this repo

if you feel it fits the scope of TheBigPromptLibrary, my suggestion is:

  • add “WFGY Core 2.0 – math-based reasoning system prompt (MIT)”
    as one entry in your list of system prompts / prompt libraries, and
  • optionally include the 60s self-test as a mini “playground snippet”
    for users who want to feel the effect directly in the chat window.

if you want, i am happy to adjust the description / wording to better match
your categorization style.

thanks for reading this long issue, and thanks again for curating this repo 🙌
