Welcome to xalpha. This document defines the operational rules for AI agents and contributors.
Core Identity: xalpha is not just a quantitative finance Python library—it is an AI Agent Platform. Agents are expected to use natural language instructions to automatically write xalpha code, perform financial data mining, backtest strategies, and generate analytical reports.
When a user asks for financial analysis or data mining via natural language:
- Understand the Domain: Use `xalpha.universal`, `xalpha.fundinfo`, and `xalpha.policy` as your primary tools.
- Write Scripts: Do not just explain how to do it; write and execute Python scripts utilizing `xalpha` to fetch real data, compute metrics (e.g., XIRR, volatility, correlation), and save results.
- Be Proactive: If a data source (like Investing.com or Xueqiu) throws an error or requires an ID mapping, autonomously debug and ask the user for the fix plan.
- Synthesize: Present the final financial analysis clearly to the user, backed by the data you mined.
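As one illustration of the metrics named above, the sketch below shows what an XIRR calculation does, using only the standard library. The names `xnpv` and `xirr`, the bisection bounds, and the sample cash flows are illustrative assumptions, not xalpha's API; in a real script, prefer the tools xalpha itself provides.

```python
from datetime import date


def xnpv(rate, cashflows):
    """Net present value of dated cash flows, discounted on an actual/365 basis."""
    start = cashflows[0][0]
    return sum(
        amount / (1.0 + rate) ** ((day - start).days / 365.0)
        for day, amount in cashflows
    )


def xirr(cashflows, low=-0.99, high=10.0, tol=1e-9):
    """Find the annualized rate where xnpv is zero, by bisection on [low, high]."""
    f_low = xnpv(low, cashflows)
    if f_low * xnpv(high, cashflows) > 0:
        raise ValueError("XIRR is not bracketed by [low, high]")
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = xnpv(mid, cashflows)
        if f_low * f_mid <= 0:
            high = mid  # the root lies in the lower half
        else:
            low, f_low = mid, f_mid  # the root lies in the upper half
    return (low + high) / 2.0


# Illustrative cash flows: invest 1000, receive 1100 exactly 365 days later.
flows = [(date(2023, 1, 1), -1000.0), (date(2024, 1, 1), 1100.0)]
rate = xirr(flows)  # approximately 0.10
```

Bisection is used here only because it is easy to verify by hand; it needs a bracket in which the net present value changes sign.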
Code written or modified by agents MUST be broadly compatible across the scientific Python ecosystem:
- Pandas 1.x up to 3.x: Handle frequency format changes (`"M"` vs `"ME"`). Always wrap HTML strings in `io.StringIO()` before `pd.read_html()`. Use explicitly strict type casting (`.astype(float)`) to avoid `LossySetitemError`.
- NumPy 1.x through 2.x: Avoid deprecated aliases like `np.float` (removed in NumPy 1.24). Use `float` or `np.float64`.
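The compatibility rules above can be sketched as follows. The helper names (`read_first_table`, `month_end_values`, `set_value`) are hypothetical, and the version check assumes the `"ME"` alias is available from pandas 2.2 onward.

```python
import io

import numpy as np
import pandas as pd

# Assumption: "ME" is accepted from pandas 2.2; older releases only know "M".
_PANDAS_VERSION = tuple(int(part) for part in pd.__version__.split(".")[:2])
MONTH_END = "ME" if _PANDAS_VERSION >= (2, 2) else "M"


def read_first_table(html):
    """Hypothetical helper: parse the first <table> found in an HTML string.

    Wrapping in StringIO avoids passing literal HTML to read_html.
    Requires an HTML parser backend such as lxml to be installed.
    """
    return pd.read_html(io.StringIO(html))[0]


def month_end_values(series):
    """Last observation per calendar month, on old and new pandas alike."""
    return series.resample(MONTH_END).last()


def set_value(frame, row, column, value):
    """Cast the column to float first, so a float never lands in an int column."""
    frame[column] = frame[column].astype(float)
    frame.loc[row, column] = value
    return frame


# Use np.float64 (or the builtin float), never the removed np.float alias.
index = pd.date_range("2024-01-01", "2024-02-29", freq="D")
daily = pd.Series(np.arange(len(index), dtype=np.float64), index=index)
monthly = month_end_values(daily)
```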
xalpha heavily relies on web scraping (beautifulsoup4) and API endpoints.
- Robust Parsing: Upstream HTML changes frequently. Avoid fragile exact string matches such as `soup.find(string="text")`. Use iterative tag searching and `get_text(strip=True)`.
- Graceful Fallbacks: If an endpoint fails (e.g., anti-scraping on Investing.com), agents should implement or utilize fallback logic (e.g., JSON APIs vs HTML parsing) and use the `rget` decorator for network resilience.
- Never Break the DataFrame: Ensure that any updated scraping logic exactly restores the original DataFrame schema expected by `xalpha`.
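A minimal sketch of the three rules above, under stated assumptions: the sample HTML, the column names in `EXPECTED_COLUMNS`, and the helpers `find_value_by_label`, `restore_schema`, and `first_successful` are all hypothetical and do not come from xalpha. Network access (and therefore `rget`) is deliberately left out so the example stays self-contained.

```python
import pandas as pd
from bs4 import BeautifulSoup

EXPECTED_COLUMNS = ["date", "netvalue"]  # hypothetical downstream schema

# Nested tags and stray whitespace defeat soup.find(string="Net value:").
SAMPLE_HTML = "<table><tr><td> <b>Net value</b>: </td><td> 1.2345 </td></tr></table>"


def find_value_by_label(soup, label):
    """Walk the <td> cells and return the text of the cell after the label."""
    for cell in soup.find_all("td"):
        if label in cell.get_text(strip=True):
            sibling = cell.find_next_sibling("td")
            if sibling is not None:
                return sibling.get_text(strip=True)
    raise ValueError(f"label {label!r} not found in page")


def restore_schema(frame):
    """Return exactly the expected columns, in order, with the expected dtypes."""
    missing = [col for col in EXPECTED_COLUMNS if col not in frame.columns]
    if missing:
        raise ValueError(f"source is missing columns: {missing}")
    out = frame[EXPECTED_COLUMNS].copy()
    out["date"] = pd.to_datetime(out["date"])
    out["netvalue"] = out["netvalue"].astype(float)
    return out


def first_successful(fetchers):
    """Try each source in order; raise with every collected error if all fail."""
    errors = []
    for fetch in fetchers:
        try:
            return restore_schema(fetch())
        except (KeyError, ValueError) as exc:  # deliberately narrow
            errors.append(f"{fetch.__name__}: {exc!r}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
net_value = find_value_by_label(soup, "Net value")
```

Callers would pass `first_successful` a list such as a JSON fetcher followed by an HTML fetcher. Whichever source succeeds, the result has the same columns and dtypes.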
xalpha uses local caching (CSV/SQL) for performance.
- When expanding data classes (e.g., adding a new attribute to `fundinfo`), agents MUST update both `_save_csv/_sql` and `_fetch_csv/_sql`.
- Handle legacy caches defensively using `.get("new_key", "default")`.
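The save/fetch pairing and the legacy-cache default can be sketched with the standard library alone. This is not xalpha's actual cache code: `save_csv`, `fetch_csv`, the `manager` attribute, and the sample rows are hypothetical stand-ins for the real `_save_csv` and `_fetch_csv` methods.

```python
import csv
import io

# "manager" plays the role of the newly added attribute (hypothetical).
FIELDS = ["code", "name", "manager"]


def save_csv(info, handle):
    """Write side: the new attribute must be persisted too."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow({key: info[key] for key in FIELDS})


def fetch_csv(handle):
    """Read side: tolerate caches written before the attribute existed."""
    row = next(csv.DictReader(handle))
    return {
        "code": row["code"],
        "name": row["name"],
        # A legacy cache has no "manager" column, so fall back to a default.
        "manager": row.get("manager", "unknown"),
    }


# A cache file written by an older version, without the new column.
legacy_cache = io.StringIO("code,name\r\n000001,Fund A\r\n")
legacy_info = fetch_csv(legacy_cache)
```

Only the new key gets a default. The long-standing keys are still read with `row["code"]`, so a truly corrupt cache fails loudly.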
When generating HTML reports or dashboards (e.g., QDII prediction pages):
- Rich Aesthetics: Use modern, light-themed layouts, DataTables, and CSS variables. Keep Python focused on data; offload rendering logic to JS/CSS.
- Testing: Ensure tests pass using `pytest`. Use `pytest.importorskip` for optional dependencies.
- Linting: Enforce `black` formatting and strict adherence to a 10.00/10 Pylint score for the `xalpha/` directory.
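A short sketch of the testing rule, assuming `sqlalchemy` stands in for an optional dependency; the helper `to_float_or_none` and both test names are hypothetical. Calling `pytest.importorskip` inside the test means a missing package marks that test as skipped rather than failed.

```python
import pytest


def to_float_or_none(text):
    """Hypothetical helper under test: '--' and empty strings mean missing."""
    cleaned = text.replace(",", "").strip()
    return None if cleaned in {"", "--"} else float(cleaned)


def test_to_float_or_none():
    # No optional dependency needed, so this test always runs.
    assert to_float_or_none(" 1,234.50 ") == 1234.5
    assert to_float_or_none("--") is None


def test_sql_roundtrip():
    # Skipped, not failed, when the optional package is not installed.
    sqlalchemy = pytest.importorskip("sqlalchemy")
    engine = sqlalchemy.create_engine("sqlite://")  # in-memory database
    with engine.connect() as conn:
        assert conn.execute(sqlalchemy.text("SELECT 1")).scalar() == 1
```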
- Atomic & Precise Changes: When fixing bugs in the library itself, make the smallest possible change. Avoid unnecessary refactoring of legacy code.
- Data-Driven: When asked to analyze, write the code, run it, and let the data speak.
- Self-Healing & Fail-Fast: If you encounter a `KeyError` or a `NoneType` error during data fetching, investigate the upstream response and patch the parsers or input normalizers autonomously. Avoid over-protective code (e.g., blanket try-except or returning empty DataFrames) that swallows original errors. Let it fail naturally so the root cause is visible, then fix it at the source.
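The contrast in the last rule can be sketched as below. The payload shape and the upstream rename from `price` to `last` are invented for illustration, as are the three function names.

```python
def parse_quote_swallowing(payload):
    """Anti-pattern: the blanket except hides which key went missing."""
    try:
        return float(payload["data"]["price"])
    except Exception:
        return None


def parse_quote(payload):
    """Fail-fast: a changed payload raises a KeyError that names the key."""
    return float(payload["data"]["price"])


def parse_quote_patched(payload):
    """Fix at the source: handle the (hypothetical) upstream rename explicitly."""
    data = payload["data"]
    key = "last" if "last" in data else "price"
    return float(data[key])


# Hypothetical upstream change: "price" was renamed to "last".
changed_payload = {"data": {"last": "1.2345"}}

silent = parse_quote_swallowing(changed_payload)  # None, root cause hidden
try:
    parse_quote(changed_payload)
except KeyError as exc:
    visible = exc.args[0]  # "price", which points straight at the parser
fixed = parse_quote_patched(changed_payload)
```

The `try`/`except` around `parse_quote` is there only to display the error in this demo. In library code the `KeyError` would simply propagate.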