diff --git a/.editorconfig b/.editorconfig index 0169eed951..5230b76950 100644 --- a/.editorconfig +++ b/.editorconfig @@ -11,5 +11,5 @@ indent_size = 4 [*.rst] indent_size = 3 -[*.yml] +[*.{css,yml}] indent_size = 2 diff --git a/.flake8 b/.flake8 deleted file mode 100644 index f4546adb41..0000000000 --- a/.flake8 +++ /dev/null @@ -1,2 +0,0 @@ -[flake8] -max_line_length = 88 diff --git a/.github/CODE_OF_CONDUCT.md b/.github/CODE_OF_CONDUCT.md index 45402ea7f8..2353822768 100644 --- a/.github/CODE_OF_CONDUCT.md +++ b/.github/CODE_OF_CONDUCT.md @@ -4,9 +4,9 @@ Code of Conduct Please note that all interactions on [Python Software Foundation](https://www.python.org/psf-landing/)-supported infrastructure is [covered](https://www.python.org/psf/records/board/minutes/2014-01-06/#management-of-the-psfs-web-properties) -by the [PSF Code of Conduct](https://www.python.org/psf/conduct/), +by the [PSF Code of Conduct](https://policies.python.org/python.org/code-of-conduct/), which includes all infrastructure used in the development of Python itself -(e.g. mailing lists, issue trackers, GitHub, etc.). +(for example, mailing lists, issue trackers, GitHub, etc.). In general this means everyone is expected to be open, considerate, and respectful of others no matter what their position is within the project. diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index c54abaedce..0287153194 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -29,17 +29,17 @@ our workflow that are not covered by a bot or status check are: ## Setting Expectations -Due to the fact that this project is entirely volunteer-run (i.e. no one is paid +Due to the fact that this project is entirely volunteer-run (that is, no one is paid to work on Python full-time), we unfortunately can make no guarantees as to if or when a core developer will get around to reviewing your pull request. 
If no core developer has done a review or responded to changes made because of a -"changes requested" review, please feel free to email [python-dev](https://mail.python.org/mailman3/lists/python-dev.python.org/) to ask if -someone could take a look at your pull request. +"changes requested" review, please consider seeking assistance through the +[Core Development Discourse category](https://discuss.python.org/c/core-dev/23). ## Code of Conduct All interactions for this project are covered by the -[PSF Code of Conduct](https://www.python.org/psf/conduct/). Everyone is +[PSF Code of Conduct](https://policies.python.org/python.org/code-of-conduct/). Everyone is expected to be open, considerate, and respectful of others no matter their position within the project. diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md deleted file mode 100644 index 78aa34d6bb..0000000000 --- a/.github/ISSUE_TEMPLATE/bug_report.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -name: Bug report -about: Create a report to help us improve -title: '' -labels: '' -assignees: '' - ---- - -> Note: This repo is for the Python devguide. If you are requesting an -enhancement for the Python language or CPython interpreter, -then the CPython issue tracker is better -suited for this report: https://github.com/python/cpython/issues - -**Describe the bug** -A clear and concise description of what the bug is. - -**Expected behavior** -A clear and concise description of what you expected to happen. - -**Screenshots** -If applicable, add screenshots to help explain your problem. - -**Additional context** -Add any other context about the problem here. 
diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml new file mode 100644 index 0000000000..b160c6ea11 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -0,0 +1,39 @@ +name: "Bug report" +description: Create a report to help us improve the Python devguide +title: "Bug: " +labels: ["bug"] +assignees: [] + +body: + - type: markdown + attributes: + value: | + > [!NOTE] + > This repo is for the [Python developer's guide](https://devguide.python.org/). + > If you are reporting a bug for the Python language or + > CPython interpreter, then use the + > [CPython issue tracker](https://github.com/python/cpython/issues) instead. + + - type: textarea + id: bug_description + attributes: + label: "Describe the bug" + description: A clear and concise description of what the bug is and, optionally, what you expected to happen. + validations: + required: true + + - type: textarea + id: screenshots + attributes: + label: "Screenshots" + description: If applicable, add screenshots to help explain your problem. + validations: + required: false + + - type: textarea + id: additional_context + attributes: + label: "Additional context" + description: Add any other context about the problem here. + validations: + required: false diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 0000000000..cd8c31d2a9 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,14 @@ +blank_issues_enabled: false +contact_links: + - name: CPython Documentation + url: https://docs.python.org/ + about: Official CPython documentation - please check here before opening an issue. + - name: Python Website + url: https://python.org/ + about: For all things Python + - name: PyPI Issues / Support + url: https://github.com/pypi/support + about: For issues with PyPI itself, PyPI accounts, or with packages hosted on PyPI. 
+ - name: CPython Issues + url: https://github.com/python/cpython/issues + about: For issues with the CPython interpreter itself. diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md deleted file mode 100644 index eff8cb8f7a..0000000000 --- a/.github/ISSUE_TEMPLATE/feature_request.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -name: Feature request -about: Suggest an idea for this project -title: '' -labels: '' -assignees: '' - ---- - -> Note: This repo is for the Python devguide. If you are requesting an -enhancement for the Python language or CPython interpreter, -then the CPython issue tracker is better -suited for this report: https://github.com/python/cpython/issues - -**Describe the enhancement or feature you'd like** -A clear and concise description of what you want to happen. - -**Describe alternatives you've considered** -A clear and concise description of any alternative solutions or features you've considered. - -**Additional context** -Add any other context or screenshots about the feature request here. diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml new file mode 100644 index 0000000000..a4413c137a --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -0,0 +1,39 @@ +name: "Feature request" +description: Suggest an idea for the Python devguide +title: "Feature: <title>" +labels: ["enhancement"] +assignees: [] + +body: + - type: markdown + attributes: + value: | + > [!NOTE] + > This repo is for the [Python developer's guide](https://devguide.python.org/). + > If you are requesting an enhancement for the Python language or + > CPython interpreter, then use the + > [CPython issue tracker](https://github.com/python/cpython/issues) instead. + + - type: textarea + id: feature_description + attributes: + label: "Describe the enhancement or feature you would like" + description: A clear and concise description of what you want to happen. 
+ validations: + required: true + + - type: textarea + id: alternatives + attributes: + label: "Describe alternatives you have considered" + description: A clear and concise description of any alternative solutions or features you have considered. + validations: + required: false + + - type: textarea + id: additional_context + attributes: + label: "Additional context" + description: Add any other context or screenshots about the feature request here. + validations: + required: false diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 3f798d0748..22ad254ebf 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -13,10 +13,11 @@ jobs: steps: - uses: actions/checkout@v4 - - uses: actions/setup-python@v4 + - uses: actions/setup-python@v5 with: python-version: "3" - cache: pip + - name: Install uv + uses: hynek/setup-cached-uv@v2 - name: Build docs run: make html - name: Link check diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 60ea4bc4bf..ae27fd1f23 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,33 +1,13 @@ repos: - - repo: https://github.com/asottile/pyupgrade - rev: v3.15.0 - hooks: - - id: pyupgrade - args: [--py38-plus] - - - repo: https://github.com/psf/black-pre-commit-mirror - rev: 23.12.1 - hooks: - - id: black - args: [--skip-string-normalization] - - - repo: https://github.com/PyCQA/isort - rev: 5.13.2 - hooks: - - id: isort - args: [--profile=black] - - - repo: https://github.com/PyCQA/flake8 - rev: 6.1.0 - hooks: - - id: flake8 - additional_dependencies: - [flake8-2020, flake8-implicit-str-concat] - - - repo: https://github.com/pre-commit/pygrep-hooks - rev: v1.10.0 - hooks: - - id: python-check-blanket-noqa + - repo: https://github.com/astral-sh/ruff-pre-commit + rev: v0.5.7 + hooks: + - id: ruff + name: Run Ruff (lint) + args: [--exit-non-zero-on-fix] + - id: ruff-format + name: Run Ruff (format) + args: [--check] - repo: https://github.com/pre-commit/pre-commit-hooks rev: 
v4.5.0 diff --git a/.readthedocs.yml b/.readthedocs.yml index 5d88845843..26e5be9672 100644 --- a/.readthedocs.yml +++ b/.readthedocs.yml @@ -14,5 +14,8 @@ build: python: "3" commands: + - asdf plugin add uv + - asdf install uv latest + - asdf global uv latest - make dirhtml BUILDDIR=_readthedocs - mv _readthedocs/dirhtml _readthedocs/html diff --git a/.ruff.toml b/.ruff.toml new file mode 100644 index 0000000000..af448e5b6e --- /dev/null +++ b/.ruff.toml @@ -0,0 +1,35 @@ +target-version = "py310" +fix = true +output-format = "full" +line-length = 88 + +[lint] +preview = true +select = [ + "C4", # flake8-comprehensions + "B", # flake8-bugbear + "E", # pycodestyle + "F", # pyflakes + "FA", # flake8-future-annotations + "FLY", # flynt + "FURB", # refurb + "G", # flake8-logging-format + "I", # isort + "ISC", # flake8-implicit-str-concat + "LOG", # flake8-logging + "PERF", # perflint + "PGH", # pygrep-hooks + "PT", # flake8-pytest-style + "TCH", # flake8-type-checking + "UP", # pyupgrade + "W", # pycodestyle + "YTT", # flake8-2020 +] +ignore = [ + "E501", # Ignore line length errors (we use auto-formatting) +] + +[format] +preview = true +quote-style = "preserve" +docstring-code-format = true diff --git a/Makefile b/Makefile index 76f5b9df54..5a33d50897 100644 --- a/Makefile +++ b/Makefile @@ -2,56 +2,44 @@ # # You can set these variables from the command line. -PYTHON = python3 -VENVDIR = ./venv -SPHINXBUILD = $(VENVDIR)/bin/sphinx-build -SPHINXOPTS = -W --keep-going -BUILDDIR = _build -BUILDER = html -JOBS = auto -PAPER = -SPHINXLINT = $(VENVDIR)/bin/sphinx-lint +PYTHON = python3 +VENVDIR = ./venv +UV = uv +SPHINXBUILD = $(VENVDIR)/bin/sphinx-build +# Temporary: while we are using ..include:: to show the reorganization, +# there are duplicate labels. These cause warnings, which prevent the +# build from finishing. Turn off --fail-on-warning so we can see the +# finished results. 
+#SPHINXOPTS = --fail-on-warning --keep-going +SPHINXOPTS = --keep-going +BUILDDIR = _build +BUILDER = html +JOBS = auto +SPHINXLINT = $(VENVDIR)/bin/sphinx-lint +REQUIREMENTS = requirements.txt # Internal variables. -PAPEROPT_a4 = -D latex_paper_size=a4 -PAPEROPT_letter = -D latex_paper_size=letter -ALLSPHINXOPTS = -b $(BUILDER) \ - -d $(BUILDDIR)/doctrees \ - -j $(JOBS) \ - $(PAPEROPT_$(PAPER)) \ - $(SPHINXOPTS) \ - . $(BUILDDIR)/$(BUILDER) +_ALL_SPHINX_OPTS = --jobs $(JOBS) $(SPHINXOPTS) +_RELEASE_CYCLE = include/branches.csv \ + include/end-of-life.csv \ + include/release-cycle.svg .PHONY: help help: @echo "Please use \`make <target>' where <target> is one of" @echo " venv to create a venv with necessary tools" @echo " html to make standalone HTML files" + @echo " linkcheck to check all external links for integrity" @echo " htmlview to open the index page built by the html target in your browser" @echo " htmllive to rebuild and reload HTML files in your browser" @echo " clean to remove the venv and build files" - @echo " dirhtml to make HTML files named index.html in directories" - @echo " singlehtml to make a single large HTML file" - @echo " pickle to make pickle files" - @echo " json to make JSON files" - @echo " htmlhelp to make HTML files and a HTML help project" - @echo " qthelp to make HTML files and a qthelp project" - @echo " devhelp to make HTML files and a Devhelp project" - @echo " epub to make an epub" - @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" - @echo " latexpdf to make LaTeX files and run them through pdflatex" - @echo " text to make text files" - @echo " man to make manual pages" - @echo " changes to make an overview of all changed/added/deprecated items" - @echo " linkcheck to check all external links for integrity" - @echo " doctest to run all doctests embedded in the documentation (if enabled)" @echo " check to run a check for frequent markup errors" @echo " lint to lint all the files" - @echo " versions to 
update release cycle after changing release-cycle.json" .PHONY: clean clean: clean-venv -rm -rf $(BUILDDIR)/* + -rm -rf $(_RELEASE_CYCLE) .PHONY: clean-venv clean-venv: @@ -70,102 +58,26 @@ venv: ensure-venv: @if [ ! -d $(VENVDIR) ] ; then \ echo "Creating venv in $(VENVDIR)"; \ - $(PYTHON) -m venv $(VENVDIR); \ - $(VENVDIR)/bin/python3 -m pip install --upgrade pip; \ - $(VENVDIR)/bin/python3 -m pip install -r requirements.txt; \ + if $(UV) --version >/dev/null 2>&1; then \ + $(UV) venv --python=$(PYTHON) $(VENVDIR); \ + VIRTUAL_ENV=$(VENVDIR) $(UV) pip install -r $(REQUIREMENTS); \ + else \ + $(PYTHON) -m venv $(VENVDIR); \ + $(VENVDIR)/bin/python3 -m pip install --upgrade pip; \ + $(VENVDIR)/bin/python3 -m pip install -r $(REQUIREMENTS); \ + fi; \ echo "The venv has been created in the $(VENVDIR) directory"; \ fi -.PHONY: html -html: ensure-venv versions - $(SPHINXBUILD) $(ALLSPHINXOPTS) - -.PHONY: dirhtml -dirhtml: BUILDER = dirhtml -dirhtml: html - -.PHONY: singlehtml -singlehtml: BUILDER = singlehtml -singlehtml: html - -.PHONY: pickle -pickle: BUILDER = pickle -pickle: html - @echo - @echo "Build finished; now you can process the pickle files." - -.PHONY: json -json: BUILDER = json -json: html - @echo - @echo "Build finished; now you can process the JSON files." - -.PHONY: htmlhelp -htmlhelp: BUILDER = htmlhelp -htmlhelp: html - @echo - @echo "Build finished; now you can run HTML Help Workshop with the" \ - ".hhp project file in $(BUILDDIR)/$(BUILDER)." - -.PHONY: qthelp -qthelp: BUILDER = qthelp -qthelp: html - -.PHONY: devhelp -devhelp: BUILDER = devhelp -devhelp: html - -.PHONY: epub -epub: BUILDER = epub -epub: html - @echo - @echo "Build finished. The epub file is in $(BUILDDIR)/$(BUILDER)." - -.PHONY: latex -latex: BUILDER = latex -latex: html - -.PHONY: latexpdf -latexpdf: BUILDER = latex -latexpdf: html - @echo "Running LaTeX files through pdflatex..." 
- make -C $(BUILDDIR)/latex all-pdf - @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/$(BUILDER)." - -.PHONY: text -text: BUILDER = text -text: html - -.PHONY: man -man: BUILDER = man -man: html - @echo - @echo "Build finished. The manual pages are in $(BUILDDIR)/$(BUILDER)." - -.PHONY: changes -changes: BUILDER = changes -changes: html - -.PHONY: linkcheck -linkcheck: BUILDER = linkcheck -linkcheck: html - @echo - @echo "Link check complete; look for any errors in the above output " \ - "or in $(BUILDDIR)/$(BUILDER)/output.txt." - -.PHONY: doctest -doctest: BUILDER = doctest -doctest: html - @echo "Testing of doctests in the sources finished, look at the " \ - "results in $(BUILDDIR)/$(BUILDER)/output.txt." - .PHONY: htmlview htmlview: html $(PYTHON) -c "import os, webbrowser; webbrowser.open('file://' + os.path.realpath('_build/html/index.html'))" .PHONY: htmllive htmllive: SPHINXBUILD = $(VENVDIR)/bin/sphinx-autobuild -htmllive: SPHINXOPTS = --re-ignore="/\.idea/|/venv/" --open-browser --delay 0 +# Arbitrarily selected ephemeral port between 49152–65535 +# to avoid conflicts with other processes: +htmllive: SPHINXOPTS = --open-browser --delay 0 --port 55301 htmllive: html .PHONY: check @@ -173,25 +85,33 @@ check: ensure-venv # Ignore the tools and venv dirs and check that the default role is not used. 
$(SPHINXLINT) -i tools -i $(VENVDIR) --enable default-role -.PHONY: lint -lint: venv - $(VENVDIR)/bin/python3 -m pre_commit --version > /dev/null || $(VENVDIR)/bin/python3 -m pip install pre-commit - $(VENVDIR)/bin/python3 -m pre_commit run --all-files +.PHONY: _ensure-package +_ensure-package: venv + if $(UV) --version >/dev/null 2>&1; then \ + VIRTUAL_ENV=$(VENVDIR) $(UV) pip install $(PACKAGE); \ + else \ + $(VENVDIR)/bin/python3 -m pip install $(PACKAGE); \ + fi -.PHONY: serve -serve: - @echo "The 'serve' target was removed, use 'htmlview' instead" \ - "(see https://github.com/python/cpython/issues/80510)" +.PHONY: _ensure-pre-commit +_ensure-pre-commit: + make _ensure-package PACKAGE=pre-commit -include/branches.csv: include/release-cycle.json - $(VENVDIR)/bin/python3 _tools/generate_release_cycle.py +.PHONY: lint +lint: _ensure-pre-commit + $(VENVDIR)/bin/python3 -m pre_commit run --all-files -include/end-of-life.csv: include/release-cycle.json - $(VENVDIR)/bin/python3 _tools/generate_release_cycle.py +# Defined so that "include/release-cycle.json" +# doesn't fall through to the catch-all target. +include/release-cycle.json: + @exit -include/release-cycle.svg: include/release-cycle.json +$(_RELEASE_CYCLE): include/release-cycle.json $(VENVDIR)/bin/python3 _tools/generate_release_cycle.py - -.PHONY: versions -versions: venv include/branches.csv include/end-of-life.csv include/release-cycle.svg @echo Release cycle data generated. + +# Catch-all target: route all unknown targets to Sphinx using the new +# "make mode" option. +.PHONY: Makefile +%: Makefile ensure-venv $(_RELEASE_CYCLE) + $(SPHINXBUILD) -M $@ "." "$(BUILDDIR)" $(_ALL_SPHINX_OPTS) diff --git a/_extensions/custom_roles.py b/_extensions/custom_roles.py deleted file mode 100644 index f8c9bb8951..0000000000 --- a/_extensions/custom_roles.py +++ /dev/null @@ -1,43 +0,0 @@ -"""Sphinx extension to add custom roles. 
- -Based on https://protips.readthedocs.io/link-roles.html -""" -import urllib.parse - -from docutils import nodes - - -def setup(app): - # role to link to cpython files - app.add_role( - "cpy-file", - autolink("https://github.com/python/cpython/blob/main/{}"), - ) - # role to link to cpython labels - app.add_role( - "gh-label", - autolink("https://github.com/python/cpython/labels/{}"), - ) - # Parallel safety: - # https://www.sphinx-doc.org/en/master/extdev/index.html#extension-metadata - return {"parallel_read_safe": True, "parallel_write_safe": True} - - -def autolink(pattern): - def role(name, rawtext, text, lineno, inliner, _options=None, _content=None): - """Combine literal + reference (unless the text is prefixed by a !).""" - if " " in text: - url_text = urllib.parse.quote(text) - else: - url_text = text - url = pattern.format(url_text) - # don't create a reference if the text starts with ! - if text.startswith('!'): - node = nodes.literal(rawtext, text[1:]) - else: - node = nodes.reference( - rawtext, '', nodes.literal(rawtext, text), refuri=url, internal=False - ) - return [node], [] - - return role diff --git a/_static/devguide_overrides.css b/_static/devguide_overrides.css index e86b6c1776..8e2c7c6fca 100644 --- a/_static/devguide_overrides.css +++ b/_static/devguide_overrides.css @@ -38,7 +38,7 @@ .release-cycle-chart .release-cycle-blob { stroke-width: 1.6px; - /* default colours, overriden below for individual statuses */ + /* default colours, overridden below for individual statuses */ fill: var(--color-background-primary); stroke: var(--color-foreground-primary); } @@ -85,3 +85,18 @@ .bad pre { border-left: 3px solid rgb(244, 76, 78); } + +.extlink-cpy-file, +.extlink-gh-label { + border: 1px solid var(--color-background-border); + border-radius: .2em; + font-family: var(--font-stack--monospace); + font-size: var(--font-size--small--2); + padding: .1em .2em; +} + +/* Table cells should always top-align */ + +table.docutils td { + vertical-align: 
top; +} diff --git a/_tools/generate_release_cycle.py b/_tools/generate_release_cycle.py index 27b5cc3ec0..3a8fefec02 100644 --- a/_tools/generate_release_cycle.py +++ b/_tools/generate_release_cycle.py @@ -1,11 +1,11 @@ """Read in a JSON and generate two CSVs and an SVG file.""" + from __future__ import annotations import argparse import csv import datetime as dt import json -import sys import jinja2 @@ -45,10 +45,7 @@ def __init__(self) -> None: def write_csv(self) -> None: """Output CSV files.""" - if sys.version_info >= (3, 11): - now_str = str(dt.datetime.now(dt.UTC)) - else: - now_str = str(dt.datetime.utcnow()) + now_str = str(dt.datetime.now(dt.timezone.utc)) versions_by_category = {"branches": {}, "end-of-life": {}} headers = None diff --git a/conf.py b/conf.py index 6de594b235..0319de4ddf 100644 --- a/conf.py +++ b/conf.py @@ -1,12 +1,4 @@ -import os -import sys -import time - -# Location of custom extensions. -sys.path.insert(0, os.path.abspath(".") + "/_extensions") - extensions = [ - 'custom_roles', 'notfound.extension', 'sphinx.ext.extlinks', 'sphinx.ext.intersphinx', @@ -22,7 +14,7 @@ # General information about the project. project = "Python Developer's Guide" -copyright = f'2011-{time.strftime("%Y")}, Python Software Foundation' +copyright = '2011 Python Software Foundation' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. 
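The `_tools/generate_release_cycle.py` hunk above drops the `sys.version_info` branch: `dt.datetime.now(dt.timezone.utc)` works on every Python version the script supports, whereas `dt.UTC` only exists on 3.11+ and `utcnow()` returns a naive datetime and is deprecated since 3.12. A minimal sketch of what the surviving call guarantees:

```python
import datetime as dt

# Aware timestamp, as the updated script now produces on all versions:
aware = dt.datetime.now(dt.timezone.utc)
assert aware.tzinfo is not None            # carries timezone info
assert aware.utcoffset() == dt.timedelta(0)  # and that timezone is UTC

# The replaced dt.datetime.utcnow() call returned a naive datetime
# (tzinfo is None), which is why it was deprecated in favour of the
# aware form used here.
```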
@@ -69,6 +61,7 @@ # Login page r"https://github.com/python/buildmaster-config/issues/new.*": r"https://github.com/login.*", # noqa: E501 r"https://github.com/python/core-workflow/issues/new.*": r"https://github.com/login.*", # noqa: E501 + r"https://github.com/orgs/python/teams.*": r"https://github.com/login.*", # noqa: E501 # Archive redirect r"https://github.com/python/cpython/archive/main.zip": r"https://codeload.github.com/python/cpython/zip/refs/heads/main", # noqa: E501 # Blob to tree @@ -91,6 +84,13 @@ r'\/.*', ] +# Check the link itself, but ignore anchors that are added by JS +# https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-linkcheck_anchors_ignore_for_url +linkcheck_anchors_ignore_for_url = [ + # GitHub + r'https://github.com/.*', +] + linkcheck_ignore = [ # The voters repo is private and appears as a 404 'https://github.com/python/voters', @@ -100,25 +100,22 @@ 'https://discuss.python.org/groups/staff', 'https://discuss.python.org/groups/moderators', 'https://discuss.python.org/groups/admins', - # The crawler gets "Anchor not found" for GitHub anchors - r'https://github.com.+?#L\d+', - r'https://github.com/cli/cli#installation', - r'https://github.com/github/renaming#renaming-existing-branches', - r'https://github.com/python/bedevere/#pr-state-machine', # "Anchor not found": r'https://packaging.python.org/.*#', + # "-rate limited-", causing a timeout + r'https://stackoverflow.com/.*', # Discord doesn't allow robot crawlers: "403 Client Error: Forbidden" r'https://support.discord.com/hc/en-us/articles/219070107-Server-Nicknames', + # Patreon also gives 403 to the GHA linkcheck runner + r'https://www.patreon.com/.*', ] rediraffe_redirects = { # Development Tools "clang.rst": "development-tools/clang.rst", - "coverity.rst": "development-tools/coverity.rst", "gdb.rst": "development-tools/gdb.rst", # Advanced Tools was renamed Development Tools in gh-1149 "advanced-tools/clang.rst": "development-tools/clang.rst", - 
"advanced-tools/coverity.rst": "development-tools/coverity.rst", "advanced-tools/gdb.rst": "development-tools/gdb.rst", # Core Developers "coredev.rst": "core-developers/become-core-developer.rst", @@ -172,14 +169,38 @@ # sphinx-notfound-page notfound_urls_prefix = "/" +# prolog and epilogs +rst_prolog = """ +.. |draft| replace:: + This is part of a **Draft** of the Python Contributor's Guide. + Text in square brackets are notes about content to fill in. + Currently, the devguide and this new Contributor's Guide co-exist in the + repo. We are using Sphinx include directives to demonstrate the re-organization. + The final Contributor's Guide will replace the devguide with content in only one + place. + We welcome help with this! + +.. |purpose| replace:: + The :ref:`contrib-plan` page has more details about the current state of this draft + and **how you can help**. See more info about the Contributor Guide in the + discussion forum: `Refactoring the DevGuide`_. + +.. _Refactoring the DevGuide: https://discuss.python.org/t/refactoring-the-devguide-into-a-contribution-guide/63409 + +""" + # sphinx.ext.extlinks # This config is a dictionary of external sites, # mapping unique short aliases to a base URL and a prefix. # https://www.sphinx-doc.org/en/master/usage/extensions/extlinks.html +_repo = "https://github.com/python/cpython" extlinks = { + "cpy-file": (f"{_repo}/blob/main/%s", "%s"), + "gh-label": (f"{_repo}/labels/%s", "%s"), "github": ("https://github.com/%s", "%s"), "github-user": ("https://github.com/%s", "@%s"), "pypi": ("https://pypi.org/project/%s/", "%s"), + "pypi-org": ("https://pypi.org/org/%s/", "%s"), } # sphinxext-opengraph config diff --git a/contrib/code/developer-workflow.rst b/contrib/code/developer-workflow.rst new file mode 100644 index 0000000000..416ca2c022 --- /dev/null +++ b/contrib/code/developer-workflow.rst @@ -0,0 +1,25 @@ +==================== +Development workflow +==================== + +.. 
important:: + + |draft| + + |purpose| + +[This is the existing :ref:`dev-workflow` page from the devguide] + +.. toctree:: + :maxdepth: 5 + + ../../developer-workflow/communication-channels + ../../developer-workflow/development-cycle + ../../developer-workflow/stdlib + ../../developer-workflow/extension-modules + ../../developer-workflow/c-api + ../../developer-workflow/lang-changes + ../../developer-workflow/grammar + ../../developer-workflow/porting + ../../developer-workflow/sbom + ../../developer-workflow/psrt diff --git a/contrib/code/development-tools.rst b/contrib/code/development-tools.rst new file mode 100644 index 0000000000..348ceb95ac --- /dev/null +++ b/contrib/code/development-tools.rst @@ -0,0 +1,19 @@ +================= +Development tools +================= + +.. important:: + + |draft| + + |purpose| + +[This is the existing :ref:`development-tools` page from the devguide.] + +.. toctree:: + :maxdepth: 5 + + ../../development-tools/clinic + ../../development-tools/gdb + ../../development-tools/clang + ../../development-tools/warnings diff --git a/contrib/code/git.rst b/contrib/code/git.rst new file mode 100644 index 0000000000..7c7aaa57b1 --- /dev/null +++ b/contrib/code/git.rst @@ -0,0 +1,11 @@ +======== +Git tips +======== + +.. important:: + + |draft| + + |purpose| + +[More git help for advanced things needed by code contributors.] diff --git a/contrib/code/index.rst b/contrib/code/index.rst new file mode 100644 index 0000000000..7680950663 --- /dev/null +++ b/contrib/code/index.rst @@ -0,0 +1,30 @@ +.. _c_code: + +================== +Code contributions +================== + +.. important:: + + |draft| + + |purpose| + +[The main page for code contributors.] + +[We'll include code-focused content from the :ref:`main devguide page <devguide-main>`: Quick +reference, Quick links, Proposing changes, and so on.] + +[The existing :ref:`internals` section of the devguide will be fully +migrated into the Python repo.] + + +.. 
toctree:: + :maxdepth: 5 + + setup + git + pull-request-lifecycle + developer-workflow + testing + development-tools diff --git a/contrib/code/pull-request-lifecycle.rst b/contrib/code/pull-request-lifecycle.rst new file mode 100644 index 0000000000..30c0fd5903 --- /dev/null +++ b/contrib/code/pull-request-lifecycle.rst @@ -0,0 +1,21 @@ +.. _code-pull-request-lifecycle: + +====================== +Pull request lifecycle +====================== + +.. important:: + + |draft| + + |purpose| + + +[Details of pull requests for code contributions. The existing +:ref:`pull-request-lifecycle` page is long and includes many details. +Some only apply to code contributions, but many are common to all +contributions. Should we keep a common page, with extra steps here, or +should this page have all of the details even if they are duplicated +elsewhere?] + +[See :ref:`docs-pull-request-lifecycle` for the documentation half of this conundrum.] diff --git a/contrib/code/setup.rst b/contrib/code/setup.rst new file mode 100644 index 0000000000..2d14bb0d91 --- /dev/null +++ b/contrib/code/setup.rst @@ -0,0 +1,12 @@ +================== +Setup and building +================== + +.. important:: + + |draft| + + |purpose| + +[More setup and build instructions specifically for code contributors, building +on the basics from the :ref:`Getting Started <getting-started>` section.] diff --git a/contrib/code/testing.rst b/contrib/code/testing.rst new file mode 100644 index 0000000000..575d1477a4 --- /dev/null +++ b/contrib/code/testing.rst @@ -0,0 +1,20 @@ +===================== +Testing and buildbots +===================== + +.. important:: + + |draft| + + |purpose| + +[This is the existing :ref:`testing` page from the devguide.] + +.. 
toctree:: + :maxdepth: 5 + + ../../testing/run-write-tests + ../../testing/silence-warnings + ../../testing/coverage + ../../testing/buildbots + ../../testing/new-buildbot-worker diff --git a/contrib/contrib-plan.rst b/contrib/contrib-plan.rst new file mode 100644 index 0000000000..36a171bf14 --- /dev/null +++ b/contrib/contrib-plan.rst @@ -0,0 +1,47 @@ +.. _contrib-plan: + +================================== +[Plan for the Contributor's Guide] +================================== + +.. important:: + + |draft| + + |purpose| + +We are in the process of updating and refactoring the devguide to be a +Contributor's Guide. It will highlight the different kinds of contribution +possible, and how to succeed at each kind. + +Currently, the Contributor's Guide is a draft in this new last section of the +devguide. We welcome feedback, but please understand that some of the current +content is moving or skeletal. + +Repo structure +============== + +While the reorganization is happening, we are keeping the old devguide as it +is. The new Contributor's Guide is represented in this last section, but will +eventually be the only content in the guide. To avoid copying content, we're +using Sphinx include directives to display existing devguide content in its new +Contributor's Guide location. That is not how the eventual Guide will be +built. Once we are ready to make the Contributor's Guide real, we will +rearrange content into its new location. + +How to help +=========== + +To help, you can: + +- `Write an issue`_ detailing a change you'd like to see here. +- `Make a pull request`_ in this repo to add content. +- Join us in the `Python Docs Discord`_ to collaborate with other docs-minded + community members. +- Get in touch with the `Docs Editorial Board`_ to discuss larger documentation + concerns. + +.. _Write an issue: https://github.com/python/devguide/issues +.. _Make a pull request: https://github.com/python/devguide/pulls +.. 
_Python Docs Discord: https://discord.gg/NeGgyhUZ +.. _Docs Editorial Board: https://python.github.io/editorial-board/ diff --git a/contrib/core-team/committing.rst b/contrib/core-team/committing.rst new file mode 100644 index 0000000000..59cf7c1af2 --- /dev/null +++ b/contrib/core-team/committing.rst @@ -0,0 +1,11 @@ +.. important:: + + |draft| + + |purpose| + + +[This is the existing core developers :ref:`committing` page from the devguide. We'll +adjust "core developer" to "core team" where appropriate.] + +.. include:: ../../core-developers/committing.rst diff --git a/contrib/core-team/developer-log.rst b/contrib/core-team/developer-log.rst new file mode 100644 index 0000000000..473cd3c6c6 --- /dev/null +++ b/contrib/core-team/developer-log.rst @@ -0,0 +1,11 @@ +.. important:: + + |draft| + + |purpose| + + +[This is the existing core developers :ref:`developer-log` page from the devguide. We'll +adjust "core developer" to "core team" where appropriate.] + +.. include:: ../../core-developers/developer-log.rst diff --git a/contrib/core-team/experts.rst b/contrib/core-team/experts.rst new file mode 100644 index 0000000000..7f2a103cd5 --- /dev/null +++ b/contrib/core-team/experts.rst @@ -0,0 +1,11 @@ +.. important:: + + |draft| + + |purpose| + + +[This is the existing core developers :ref:`experts` page from the devguide. We'll +adjust "core developer" to "core team" where appropriate.] + +.. include:: ../../core-developers/experts.rst diff --git a/contrib/core-team/index.rst b/contrib/core-team/index.rst new file mode 100644 index 0000000000..281ed0f479 --- /dev/null +++ b/contrib/core-team/index.rst @@ -0,0 +1,23 @@ +.. important:: + + |draft| + + |purpose| + + +========= +Core team +========= + +[This is mostly re-organized from the :ref:`core-dev` section of the devguide, +but with "core developer" language changed to "core team" where possible.] + +.. 
toctree:: + :maxdepth: 5 + + responsibilities + committing + experts + developer-log + motivations + join-team diff --git a/contrib/core-team/join-team.rst b/contrib/core-team/join-team.rst new file mode 100644 index 0000000000..0c893ae08d --- /dev/null +++ b/contrib/core-team/join-team.rst @@ -0,0 +1,16 @@ +.. important:: + + |draft| + + |purpose| + + +[This is the existing core developers :ref:`become-core-developer` page from the devguide with the title changed. We'll +adjust "core developer" to "core team" where appropriate.] + +========================= +How to join the core team +========================= + +.. include:: ../../core-developers/become-core-developer.rst + :start-line: 7 diff --git a/contrib/core-team/motivations.rst b/contrib/core-team/motivations.rst new file mode 100644 index 0000000000..c9e0281b6f --- /dev/null +++ b/contrib/core-team/motivations.rst @@ -0,0 +1,11 @@ +.. important:: + + |draft| + + |purpose| + + +[This is the existing core developers :ref:`motivations` page from the devguide. We'll +adjust "core developer" to "core team" where appropriate.] + +.. include:: ../../core-developers/motivations.rst diff --git a/contrib/core-team/responsibilities.rst b/contrib/core-team/responsibilities.rst new file mode 100644 index 0000000000..a3de329561 --- /dev/null +++ b/contrib/core-team/responsibilities.rst @@ -0,0 +1,11 @@ +.. important:: + + |draft| + + |purpose| + + +[This is the existing core developers :ref:`responsibilities` page from the devguide. We'll +adjust "core developer" to "core team" where appropriate.] + +.. include:: ../../core-developers/responsibilities.rst diff --git a/contrib/doc/devguide.rst b/contrib/doc/devguide.rst new file mode 100644 index 0000000000..2c83e52003 --- /dev/null +++ b/contrib/doc/devguide.rst @@ -0,0 +1,12 @@ +================================== +Helping with the Developer's Guide +================================== + +.. 
important:: + + |draft| + + |purpose| + + +[This is the existing :ref:`devguide` page from the devguide.] diff --git a/contrib/doc/help-documenting.rst b/contrib/doc/help-documenting.rst new file mode 100644 index 0000000000..befb4b2461 --- /dev/null +++ b/contrib/doc/help-documenting.rst @@ -0,0 +1,12 @@ +========================== +Helping with documentation +========================== + +.. important:: + + |draft| + + |purpose| + + +[This is the existing :ref:`help-documenting` page from the devguide.] diff --git a/contrib/doc/index.rst b/contrib/doc/index.rst new file mode 100644 index 0000000000..dc8ec93074 --- /dev/null +++ b/contrib/doc/index.rst @@ -0,0 +1,29 @@ +.. _c_docs: + +=========================== +Documentation contributions +=========================== + +.. important:: + + |draft| + + |purpose| + + +[The main page for documentation contributors.] + +[We'll include docs-focused content from the :ref:`main devguide page <devguide-main>`: Quick +reference, Quick links, and so on.] + + +.. toctree:: + :maxdepth: 5 + + start-documenting + help-documenting + style-guide + markup + pull-request-lifecycle + translating + devguide diff --git a/contrib/doc/markup.rst b/contrib/doc/markup.rst new file mode 100644 index 0000000000..96b9faad5e --- /dev/null +++ b/contrib/doc/markup.rst @@ -0,0 +1,12 @@ +======================= +reStructuredText markup +======================= + +.. important:: + + |draft| + + |purpose| + + +[This is the existing :ref:`markup` page from the devguide.] diff --git a/contrib/doc/pull-request-lifecycle.rst b/contrib/doc/pull-request-lifecycle.rst new file mode 100644 index 0000000000..a62e637283 --- /dev/null +++ b/contrib/doc/pull-request-lifecycle.rst @@ -0,0 +1,21 @@ +.. _docs-pull-request-lifecycle: + +====================== +Pull request lifecycle +====================== + +.. important:: + + |draft| + + |purpose| + + +[Details of pull requests for documentation contributions. 
The existing +:ref:`pull-request-lifecycle` page is long and includes many details. +Some only apply to code contributions, but many are common to all +contributions. Should we keep a common page, with documentation tweaks here, or +should this page have only the documentation details even if they are duplicated +elsewhere?] + +[See :ref:`code-pull-request-lifecycle` for the code half of this conundrum.] diff --git a/contrib/doc/start-documenting.rst b/contrib/doc/start-documenting.rst new file mode 100644 index 0000000000..c5cf96161b --- /dev/null +++ b/contrib/doc/start-documenting.rst @@ -0,0 +1,12 @@ +=============== +Getting started +=============== + +.. important:: + + |draft| + + |purpose| + + +[This is the existing documentation :ref:`start-documenting` page from the devguide.] diff --git a/contrib/doc/style-guide.rst b/contrib/doc/style-guide.rst new file mode 100644 index 0000000000..87762f3e03 --- /dev/null +++ b/contrib/doc/style-guide.rst @@ -0,0 +1,12 @@ +=========== +Style guide +=========== + +.. important:: + + |draft| + + |purpose| + + +[This is the existing documentation :ref:`style-guide` page from the devguide.] diff --git a/contrib/doc/translating.rst b/contrib/doc/translating.rst new file mode 100644 index 0000000000..baface2f0d --- /dev/null +++ b/contrib/doc/translating.rst @@ -0,0 +1,12 @@ +=========== +Translating +=========== + +.. important:: + + |draft| + + |purpose| + + +[This is the existing :ref:`translating` page from the devguide.] diff --git a/contrib/get-started/index.rst b/contrib/get-started/index.rst new file mode 100644 index 0000000000..70e61b1b1b --- /dev/null +++ b/contrib/get-started/index.rst @@ -0,0 +1,15 @@ +.. _c_gettingstarted: + +=============== +Getting started +=============== + +.. 
important:: + + |draft| + + |purpose| + + +* Basic setup +* Git bootcamp (simplified for everyone to use) diff --git a/contrib/index.rst b/contrib/index.rst new file mode 100644 index 0000000000..b3ef0d992a --- /dev/null +++ b/contrib/index.rst @@ -0,0 +1,116 @@ +.. _c_root: + +================================== +Python Contributor's Guide (draft) +================================== + +.. raw:: html + + <script> + document.addEventListener('DOMContentLoaded', function() { + activateTab(getOS()); + }); + </script> + + +.. important:: + + |draft| + + |purpose| + + +[Open question: how to divide content between this Introduction and the +:ref:`introduction <c_intro>`?] + +This guide is a comprehensive resource for :ref:`contributing <contributing>` +to Python_ -- for both new and experienced contributors. It is :ref:`maintained +<devguide>` by the same community that maintains Python. We welcome your +contributions to Python! + +We encourage everyone to contribute to Python. This guide should have +everything you need to get started and be productive. If you still have +questions after reviewing the material in this guide, the `Core Python +Mentorship`_ group is available to help you through the process. + +There are a number of ways to contribute including code, documentation, and +triaging issues. We've organized this guide to provide specifics based on the +type of activity you'll be engaged in. + + +Using this guide +================ + +We recommend reading this guide as needed. You can stop where you feel +comfortable and begin contributing immediately without reading and +understanding everything. If you do choose to skip around this guide, be aware +that it is written assuming preceding sections have been read so you may need +to backtrack to fill in missing concepts and terminology. 
+ +No matter what kind of contribution you'll be making, you should start with +these common sections: + +* :ref:`c_intro` +* :ref:`c_project` +* :ref:`c_gettingstarted` + +Then choose a path based on your type of activity: + +*[The original table on the devguide home had a fourth column for Core +Developers. That made the table wider and more confusing. I don't think core +team members need a quick intro path since they will have been through the +devguide before.]* + +*[I haven't adjusted the links in the table yet other than to add a link to the +major section at the top of each column.]* + +.. list-table:: + :widths: 10 10 10 + :header-rows: 1 + + * - :ref:`Triaging <c_triage>` + - :ref:`Documentation <c_docs>` + - :ref:`Code <c_code>` + * - + * :ref:`tracker` + * :ref:`triaging` + * :ref:`helptriage` + * :ref:`experts` + * :ref:`labels` + * :ref:`gh-faq` + * :ref:`triage-team` + - + * :ref:`docquality` + * :ref:`documenting` + * :ref:`style-guide` + * :ref:`rst-primer` + * :ref:`translating` + * :ref:`devguide` + - + * :ref:`setup` + * :ref:`help` + * :ref:`pullrequest` + * :ref:`runtests` + * :ref:`fixingissues` + * :ref:`communication` + * :ref:`gitbootcamp` + * :ref:`devcycle` + + +.. toctree:: + :maxdepth: 3 + + contrib-plan + intro/index + project/index + get-started/index + triage/index + code/index + doc/index + core-team/index + user-success + security + + +.. _Python: https://www.python.org/ +.. _Core Python Mentorship: https://www.python.org/dev/core-mentorship/ diff --git a/contrib/intro/index.rst b/contrib/intro/index.rst new file mode 100644 index 0000000000..c5ba303dfd --- /dev/null +++ b/contrib/intro/index.rst @@ -0,0 +1,53 @@ +.. _c_intro: + +============ +Introduction +============ + +.. important:: + + |draft| + + |purpose| + + + +[Open question: how to divide content between this Introduction and the +:ref:`home page <c_root>`?] + +Welcome! + +New to open source? 
+=================== + +Python is an open source project, with culture and techniques from the broader +open source world. You might find it helpful to read about open source in +general. A number of individuals from the Python community have contributed to +a series of excellent guides at `Open Source Guides +<https://opensource.guide/>`_. + +Anyone will find the following guides useful: + +* `How to Contribute to Open Source <https://opensource.guide/how-to-contribute/>`_ +* `Building Welcoming Communities <https://opensource.guide/building-community/>`_ + + +Healthy collaboration +===================== + +[Importance of healthy inclusive collaboration] + +[While code is a large part of the project's success, project management, documentation, governance, sprint outreach, etc. matter.] + +[We respect the individual skills people bring to the project and strive to create and maintain a culture of inclusion.] + +About this guide +================ + +Types of contribution +===================== + +[Pathways for contributors] + +Helping with this guide +======================= diff --git a/contrib/project/channels.rst b/contrib/project/channels.rst new file mode 100644 index 0000000000..711dbe5879 --- /dev/null +++ b/contrib/project/channels.rst @@ -0,0 +1,16 @@ +.. important:: + + |draft| + + |purpose| + + +====================== +Communication channels +====================== + +* Repos +* Discourse +* Discord +* Mailing lists (deprioritize) +* Where to get help diff --git a/contrib/project/conduct.rst b/contrib/project/conduct.rst new file mode 100644 index 0000000000..37fe3bbfa7 --- /dev/null +++ b/contrib/project/conduct.rst @@ -0,0 +1,16 @@ +=============== +Code of Conduct +=============== + +.. important:: + + |draft| + + |purpose| + + +[Brief summary of the code of conduct, with links to official source.] 
+ +* Standard for communication +* How to report +* Enforcement details diff --git a/contrib/project/generative-ai.rst b/contrib/project/generative-ai.rst new file mode 100644 index 0000000000..6cb5b62ffc --- /dev/null +++ b/contrib/project/generative-ai.rst @@ -0,0 +1,10 @@ +.. important:: + + |draft| + + |purpose| + + +[This is the existing :ref:`generative-ai` page from the devguide.] + +.. include:: ../../getting-started/generative-ai.rst diff --git a/contrib/project/github.rst b/contrib/project/github.rst new file mode 100644 index 0000000000..fe45c6b8b1 --- /dev/null +++ b/contrib/project/github.rst @@ -0,0 +1,15 @@ +.. important:: + + |draft| + + |purpose| + +====== +GitHub +====== + +[Where are the actual artifacts?] + +* Main CPython repos +* Core workflow repos +* Infrastructure repos diff --git a/contrib/project/governance.rst b/contrib/project/governance.rst new file mode 100644 index 0000000000..a4bc66ff1b --- /dev/null +++ b/contrib/project/governance.rst @@ -0,0 +1,25 @@ +.. important:: + + |draft| + + |purpose| + + +========== +Governance +========== + +[How decisions are made, who is involved, how to participate.] + +Steering Council +================ + +Documentation Editorial Board +============================= + +Typing Council +============== + + +Others? +======= diff --git a/contrib/project/index.rst b/contrib/project/index.rst new file mode 100644 index 0000000000..5d26b15aab --- /dev/null +++ b/contrib/project/index.rst @@ -0,0 +1,28 @@ +.. _c_project: + +=================== +The CPython project +=================== + +.. important:: + + |draft| + + |purpose| + + +[Give the reader an understanding of the project as a whole. What are the +moving parts, who is involved, how do they interact?] + +* Structure + +.. 
toctree:: + :maxdepth: 5 + + conduct + roles + governance + generative-ai.rst + github + channels + outreach diff --git a/contrib/project/outreach.rst b/contrib/project/outreach.rst new file mode 100644 index 0000000000..d43aa8e9de --- /dev/null +++ b/contrib/project/outreach.rst @@ -0,0 +1,12 @@ +======== +Outreach +======== + +.. important:: + + |draft| + + |purpose| + + +* Sprints diff --git a/contrib/project/roles.rst b/contrib/project/roles.rst new file mode 100644 index 0000000000..8336fe4651 --- /dev/null +++ b/contrib/project/roles.rst @@ -0,0 +1,17 @@ +===== +Roles +===== + +.. important:: + + |draft| + + |purpose| + + +[Quick overview of the roles people play. Core team has its own section.] + +* Core team +* Triager +* Contributors + * types of contributions diff --git a/contrib/security.rst b/contrib/security.rst new file mode 100644 index 0000000000..db40b4a167 --- /dev/null +++ b/contrib/security.rst @@ -0,0 +1,13 @@ +========================================= +Security and infrastructure contributions +========================================= + +.. important:: + + |draft| + + |purpose| + +* Security +* Infrastructure +* Core workflow diff --git a/contrib/triage/index.rst b/contrib/triage/index.rst new file mode 100644 index 0000000000..0a547d9d77 --- /dev/null +++ b/contrib/triage/index.rst @@ -0,0 +1,14 @@ +.. _c_triage: + +=================== +Issues and triaging +=================== + +.. toctree:: + :maxdepth: 5 + + issue-tracker + triaging + labels + reviewing + triage-team diff --git a/contrib/triage/issue-tracker.rst b/contrib/triage/issue-tracker.rst new file mode 100644 index 0000000000..a5777bc81d --- /dev/null +++ b/contrib/triage/issue-tracker.rst @@ -0,0 +1,9 @@ +.. important:: + + |draft| + + |purpose| + +[This is the existing :ref:`issue-tracker` page from the devguide] + +.. 
include:: ../../triage/issue-tracker.rst diff --git a/contrib/triage/labels.rst b/contrib/triage/labels.rst new file mode 100644 index 0000000000..c364817333 --- /dev/null +++ b/contrib/triage/labels.rst @@ -0,0 +1,9 @@ +.. important:: + + |draft| + + |purpose| + +[This is the existing :ref:`labels` page from the devguide] + +.. include:: ../../triage/labels.rst diff --git a/contrib/triage/reviewing.rst b/contrib/triage/reviewing.rst new file mode 100644 index 0000000000..060f6b78dd --- /dev/null +++ b/contrib/triage/reviewing.rst @@ -0,0 +1,13 @@ +.. important:: + + |draft| + + |purpose| + + +========= +Reviewing +========= + +* How? Etiquette? +* How to request a review? diff --git a/contrib/triage/triage-team.rst b/contrib/triage/triage-team.rst new file mode 100644 index 0000000000..a9b59056a9 --- /dev/null +++ b/contrib/triage/triage-team.rst @@ -0,0 +1,9 @@ +.. important:: + + |draft| + + |purpose| + +[This is the existing :ref:`triage-team` page from the devguide] + +.. include:: ../../triage/triage-team.rst diff --git a/contrib/triage/triaging.rst b/contrib/triage/triaging.rst new file mode 100644 index 0000000000..22e1ccc657 --- /dev/null +++ b/contrib/triage/triaging.rst @@ -0,0 +1,9 @@ +.. important:: + + |draft| + + |purpose| + +[This is the existing :ref:`triaging` page from the devguide] + +.. include:: ../../triage/triaging.rst diff --git a/contrib/user-success.rst b/contrib/user-success.rst new file mode 100644 index 0000000000..2a9ef5d4e5 --- /dev/null +++ b/contrib/user-success.rst @@ -0,0 +1,14 @@ +======================================= +Accessibility, design, and user success +======================================= + +.. 
important:: + + |draft| + + |purpose| + + +* Accessibility +* Design +* User success diff --git a/core-developers/become-core-developer.rst b/core-developers/become-core-developer.rst index 4c02519337..739792b481 100644 --- a/core-developers/become-core-developer.rst +++ b/core-developers/become-core-developer.rst @@ -8,7 +8,7 @@ How to become a core developer What it takes ============= -When you have consistently contributed patches which meet quality standards +When you have consistently made contributions which meet quality standards without requiring extensive rewrites prior to being committed, you may qualify for commit privileges and become a core developer of Python. You must also work well with other core developers (and people in general) @@ -28,11 +28,12 @@ Gaining commit privileges After a candidate has demonstrated consistent contributions, commit privileges are granted through these steps: -#. A core developer (submitter, usually the mentor) starts a poll in the - `Committers category`_ on the `Python Discourse`_. +#. A core developer (submitter, usually the mentor) starts a poll + (see the :ref:`template <coredev-template>` below) in + the `Committers category`_ on the `Python Discourse`_. - open for 7 days - - results shown upon close + - results shown only upon closing #. If the candidate receives at least two-thirds positive votes when the poll closes (as per :pep:`13`), the submitter `emails the steering council @@ -59,6 +60,52 @@ are granted through these steps: were in the form of a separate post on the already open topic with the poll. -.. _Code of Conduct: https://www.python.org/psf/conduct/ +Getting a python.org email address +---------------------------------- + +Members of the core team can get an email address on the python.org domain. +For more details refer to the `python.org email policy +<https://www.python.org/psf/records/board/policies/email/>`_. + + +Poll template +============= + +.. 
_coredev-template: + +While Discourse uses Markdown for formatting, the poll functionality is +custom and somewhat resembles BBcode. There's a creator for polls in the +UI (click the cog icon in the edit box toolbar and choose "Build Poll"). +Here's what it outputs; you can copy and paste it for your poll: + +.. code-block:: bbcode + + [poll type=regular results=on_close public=false chartType=bar groups=committers close=2024-07-15T21:15:00.000Z] + * Promote Basil Fawlty + * Do not promote + [/poll] + +The important options to set in the poll builder to get this result: + - Show who voted: **disabled** (``public=false``) + - Limit voting to these groups: **committers** (``groups=committers``) + - Automatically close poll: **in 7 days** (``close=...``) + - Show results: **When poll is closed** (``results=on_close``) + +.. raw:: html + + <script> + for (let span of document.querySelectorAll('span')) { + if (span.textContent === '2024-07-15T21:15:00.000Z') { + const nextWeek = new Date(); + nextWeek.setDate(nextWeek.getDate() + 7); + nextWeek.setSeconds(0); + nextWeek.setMilliseconds(0); + span.textContent = nextWeek.toISOString(); + break; + } + } + </script> + +.. _Code of Conduct: https://policies.python.org/python.org/code-of-conduct/ .. _Committers category: https://discuss.python.org/c/committers/5 .. _Python Discourse: https://discuss.python.org diff --git a/core-developers/committing.rst b/core-developers/committing.rst index 3206d991ae..326578c0b3 100644 --- a/core-developers/committing.rst +++ b/core-developers/committing.rst @@ -31,11 +31,11 @@ to enter the public source tree. Ask yourself the following questions: * **Do the checks on the pull request show that the test suite passes?** Make sure that all of the status checks are passing. -* **Is the patch in a good state?** - Check :ref:`patch` and :ref:`helptriage` to review what is expected of - a patch. 
+* **Is the pull request in a good state?** + Check :ref:`pull-request-lifecycle` and :ref:`helptriage` to review what + is expected of a pull request. -* **Does the patch break backwards-compatibility without a strong reason?** +* **Does the change break backwards-compatibility without a strong reason?** :ref:`Run the entire test suite <runtests>` to make sure that everything still passes. If there is a change to the semantics, then there needs to be a strong reason, because it will cause some peoples' code to break. @@ -44,7 +44,7 @@ to enter the public source tree. Ask yourself the following questions: <https://discuss.python.org/c/core-dev/23>`__. * **Does documentation need to be updated?** - If the pull request introduces backwards-incompatible changes (e.g. + If the pull request introduces backwards-incompatible changes (for example, deprecating or removing a feature), then make sure that those changes are reflected in the documentation before you merge the pull request. @@ -61,14 +61,14 @@ to enter the public source tree. Ask yourself the following questions: Make sure that the contributor has signed a `Contributor Licensing Agreement <https://www.python.org/psf/contrib/contrib-form/>`_ (CLA), unless their change has no possible intellectual property - associated with it (e.g. fixing a spelling mistake in documentation). + associated with it (for example, fixing a spelling mistake in documentation). The `CPython CLA Bot <https://github.com/apps/cpython-cla-bot/>`_ checks whether the author has signed the CLA, and replies in the PR if they haven't. For further questions about the CLA process, write to contributors@python.org. * **Were** ``What's New in Python`` **and** ``Misc/NEWS.d/next`` **updated?** - If the change is particularly interesting for end users (e.g. 
new features, + If the change is particularly interesting for end users (for example, new features, significant improvements, or backwards-incompatible changes), then an entry in the ``What's New in Python`` document (in ``Doc/whatsnew/``) should be added as well. Changes that affect only documentation generally do not @@ -97,7 +97,7 @@ For the last two, note the following: #. **If a change is reverted prior to release**, then the corresponding entry is simply removed. Otherwise, a new entry must be added noting - that the change has been reverted (e.g. when a feature is released in + that the change has been reverted (for example, when a feature is released in an alpha and then cut prior to the first beta). #. **If a change is a fix (or other adjustment) to an earlier unreleased @@ -107,7 +107,7 @@ For the last two, note the following: Changes that require "What's New in Python" entries ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -If a change is particularly interesting for end users (e.g. new features, +If a change is particularly interesting for end users (for example, new features, significant improvements, or backwards-incompatible changes), add an entry in the "What's New in Python" document (in :cpy-file:`Doc/whatsnew/`) in addition to the ``NEWS`` entry. @@ -130,16 +130,16 @@ or the :pypi:`blurb` tool and its ``blurb add`` command. If you are unable to use the tool, then you can create the ``NEWS`` entry file manually. The ``Misc/NEWS.d`` directory contains a sub-directory named ``next``, which contains various sub-directories representing classifications -for what was affected (e.g. ``Misc/NEWS.d/next/Library`` for changes relating +for what was affected (for example, ``Misc/NEWS.d/next/Library`` for changes relating to the standard library). 
The file name itself should be in the format ``<datetime>.gh-issue-<issue-number>.<nonce>.rst``: * ``<datetime>`` is today's date joined with a hyphen (``-``) to your current - local time, in the ``YYYY-MM-DD-hh-mm-ss`` format (e.g. ``2017-05-27-16-46-23``). -* ``<issue-number>`` is the issue number the change is for (e.g. ``12345`` + local time, in the ``YYYY-MM-DD-hh-mm-ss`` format (for example, ``2017-05-27-16-46-23``). +* ``<issue-number>`` is the issue number the change is for (for example, ``12345`` for ``gh-issue-12345``). * ``<nonce>`` is a unique string to guarantee that the file name is - unique across branches (e.g. ``Yl4gI2``). It is typically six characters + unique across branches (for example, ``Yl4gI2``). It is typically six characters long, but it can be any length of letters and numbers. Its uniqueness can be satisfied by typing random characters on your keyboard. @@ -159,13 +159,13 @@ the reader to have read the actual diff for the change. The contents of a ``NEWS`` file should be valid reStructuredText. An 80 character column width should be used. There is no indentation or leading marker in the -file (e.g. ``-``). There is also no need to start the entry with the issue +file (for example, ``-``). There is also no need to start the entry with the issue number since it is part of the file name. You can use :ref:`inline markups <rest-inline-markup>` too. Here is an example of a ``NEWS`` entry:: Fix warning message when :func:`os.chdir` fails inside - :func:`test.support.temp_cwd`. Patch by Chris Jerdonek. + :func:`test.support.temp_cwd`. Contributed by Chris Jerdonek. The inline Sphinx roles like ``:func:`` can be used help readers find more information. 
You can build HTML and verify that the @@ -182,7 +182,7 @@ As a core developer, you have the ability to push changes to the official Python repositories, so you need to be careful with your workflow: * **You should not push new branches to the main repository.** You can - still use them in the fork that you use for the development of patches. + still use them in the fork that you use for your own development. You can also push these branches to a separate public repository for maintenance work before it is integrated into the main repository. @@ -260,7 +260,7 @@ can apply labels to GitHub pull requests). Reverting a merged pull request ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -To revert a merged pull request, press the ``Revert`` button at the +To revert a merged pull request, press the :guilabel:`Revert` button at the bottom of the pull request. That will bring up the page to create a new pull request where the commit can be reverted. It will also create a new branch on the main CPython repository. 
Delete the branch once diff --git a/core-developers/developers.csv b/core-developers/developers.csv index d05354b4af..d89fc2abe3 100644 --- a/core-developers/developers.csv +++ b/core-developers/developers.csv @@ -1,3 +1,11 @@ +Bénédikt Tran,picnixz,2025-01-10,, +Savannah Ostrowski,savannahostrowski,2024-11-13,, +Matt Page,mpage,2024-10-10,, +Kirill Podoprigora,Eclips4,2024-09-20,, +Ned Batchelder,nedbat,2024-07-16,, +Tian Gao,gaogaotiantian,2024-06-06,, +Michael Droettboom,mdboom,2024-06-06,, +Russell Keith-Magee,freakboy3742,2024-05-30,, Sam Gross,colesbury,2024-02-06,, Nikita Sobolev,sobolevn,2024-02-06,, Adam Turner,AA-Turner,2023-10-10,, @@ -23,7 +31,7 @@ Kyle Stanley,aeros,2020-04-14,, Donghee Na,corona10,2020-04-08,, Karthikeyan Singaravelan,tirkarthi,2019-12-31,, Joannah Nanjekye,nanjekyejoannah,2019-09-23,, -Abhilash Raj,maxking,2019-08-06,, +Abhilash Raj,maxking,2019-08-06,2022-11-30,"Privileges relinquished on 2022-11-30" Paul Ganssle,pganssle,2019-06-15,, Stéphane Wirtel,matrixise,2019-04-08,, Stefan Behnel,scoder,2019-04-08,, @@ -44,7 +52,7 @@ Xavier de Gaye,xdegaye,2016-06-03,2018-01-25,Privileges relinquished on 2018-01- Davin Potts,applio,2016-03-06,, Martin Panter,vadmium,2015-08-10,2020-11-26, Paul Moore,pfmoore,2015-03-15,, -Robert Collins,rbtcollins,2014-10-16,,To work on unittest +Robert Collins,rbtcollins,2014-10-16,2021-11-30,To work on unittest; privileges relinquished on 2021-11-30 Berker Peksağ,berkerpeksag,2014-06-26,, Steve Dower,zooba,2014-05-10,, Kushal Das,kushaldas,2014-04-14,, @@ -61,7 +69,7 @@ Hynek Schlawack,hynek,2012-05-14,, Richard Oudkerk,,2012-04-29,2017-02-10,For multiprocessing module; did not make GitHub transition Andrew Svetlov,asvetlov,2012-03-13,,At PyCon sprint Petri Lehtinen,akheron,2011-10-22,2020-11-12, -Meador Inge,meadori,2011-09-19,2020-11-26, +Meador Inge,meadori,2011-09-19,, Jeremy Kloth,jkloth,2011-09-12,, Sandro Tosi,sandrotosi,2011-08-01,, Alex Gaynor,alex,2011-07-18,,For PyPy compatibility (since expanded 
scope) @@ -70,7 +78,7 @@ Nadeem Vawda,,2011-04-10,2017-02-10,Did not make GitHub transition Carl Friedrich Bolz-Tereick,cfbolz,2011-03-21,,for stdlib compatibility work for PyPy Jason R. Coombs,jaraco,2011-03-14,,For sprinting on distutils2 Ross Lagerwall,,2011-03-13,2017-02-10,Did not make GitHub transition -Eli Bendersky,eliben,2011-01-11,2020-11-26, +Eli Bendersky,eliben,2011-01-11,2020-11-26,Relinquished privileges on 2020-11-26 Ned Deily,ned-deily,2011-01-09,, David Malcolm,davidmalcolm,2010-10-27,2020-11-12,relinquished privileges on 2020-11-12 Tal Einat,taleinat,2010-10-04,,Initially for IDLE @@ -92,12 +100,12 @@ Doug Hellmann,dhellmann,2009-09-20,2020-11-11,For documentation; relinquished pr Frank Wierzbicki,,2009-08-02,2017-02-10,For Jython compatibility; did not make GitHub transition Ezio Melotti,ezio-melotti,2009-06-07,,For documentation Philip Jenvey,pjenvey,2009-05-07,2020-11-26,For Jython compatibility -Michael Foord,voidspace,2009-04-01,,For IronPython compatibility +Michael Foord,voidspace,2009-04-01,2025-01-24,For IronPython compatibility; deceased R\. 
David Murray,bitdancer,2009-03-30,, Chris Withers,cjw296,2009-03-08,, Tarek Ziadé,tarekziade,2008-12-21,2017-02-10,For distutils module Hirokazu Yamamoto,,2008-08-12,2017-02-10,For Windows build; did not make GitHub transition -Armin Ronacher,mitsuhiko,2008-07-23,2020-11-26,For documentation toolset and ast module +Armin Ronacher,mitsuhiko,2008-07-23,,For documentation toolset and ast module Antoine Pitrou,pitrou,2008-07-16,, Senthil Kumaran,orsenthil,2008-06-16,, Jesse Noller,,2008-06-16,2017-02-10,For multiprocessing module; did not make GitHub transition @@ -106,9 +114,9 @@ Guilherme Polo,,2008-04-24,2017-02-10,Did not make GitHub transition Jeroen Ruigrok van der Werven,,2008-04-12,2017-02-10,For documentation; did not make GitHub transition Benjamin Peterson,benjaminp,2008-03-25,,For bug triage David Wolever,wolever,2008-03-17,2020-11-21,For 2to3 module -Trent Nelson,tpn,2008-03-17,2020-11-26, -Mark Dickinson,mdickinson,2008-01-06,,For maths-related work -Amaury Forgeot d'Arc,amauryfa,2007-11-09,2020-11-26, +Trent Nelson,tpn,2008-03-17,, +Mark Dickinson,mdickinson,2008-01-06,2024-08-13,For maths-related work +Amaury Forgeot d'Arc,amauryfa,2007-11-09,2020-11-26,"Relinquished privileges on 2020-11-26" Christian Heimes,tiran,2007-10-31,, Bill Janssen,,2007-08-28,2017-02-10,For ssl module; did not make GitHub transition Jeffrey Yasskin,,2007-08-09,2017-02-10,Did not make GitHub transition @@ -134,7 +142,7 @@ Facundo Batista,facundobatista,2004-10-16,, Sean Reifschneider,,2004-09-17,2017-02-10,Did not make GitHub transition Johannes Gijsbers,,2004-08-14,2005-07-27,Privileges relinquished on 2005-07-27 Matthias Klose,doko42,2004-08-04,, -PJ Eby,pjeby,2004-03-24,2020-11-26, +PJ Eby,pjeby,2004-03-24,2020-11-26,"Relinquished privileges on 2020-11-26" Vinay Sajip,vsajip,2004-02-20,, Hye-Shik Chang,hyeshik,2003-12-10,, Armin Rigo,,2003-10-24,2012-06-01,Privileges relinquished in 2012 @@ -145,11 +153,11 @@ Brett Cannon,brettcannon,2003-04-18,, David 
Goodger,,2003-01-02,2017-02-10,Did not make GitHub transition Gustavo Niemeyer,,2002-11-05,2017-02-10,Did not make GitHub transition Tony Lownds,,2002-09-22,2017-02-10,Did not make GitHub transition -Steve Holden,holdenweb,2002-06-14,2017-02-10,"Relinquished privileges on 2005-04-07, +Steve Holden,holdenweb,2002-06-14,2017-02-10,"Relinquished privileges on 2005-04-07, but granted again for Need for Speed sprint; did not make GitHub transition" Christian Tismer,ctismer,2002-05-17,,For Need for Speed sprint Jason Tishler,,2002-05-15,2017-02-10,Did not make GitHub transition -Walter Dörwald,doerwalter,2002-03-21,, +Walter Dörwald,doerwalter,2002-03-21,2021-11-16,"Relinquished privileges on 2021-11-16" Andrew MacIntyre,,2002-02-17,2016-01-02,Privileges relinquished 2016-01-02 Gregory P. Smith,gpshead,2002-01-08,, Anthony Baxter,,2001-12-21,2017-02-10,Did not make GitHub transition @@ -182,9 +190,9 @@ Eric S. Raymond,,2000-06-02,2017-02-10,Did not make GitHub transition Greg Stein,,1999-11-07,2017-02-10,Did not make GitHub transition Just van Rossum,,1999-01-22,2017-02-10,Did not make GitHub transition Greg Ward,,1998-12-18,2017-02-10,Did not make GitHub transition -Andrew Kuchling,akuchling,1998-04-09,, +Andrew Kuchling,akuchling,1998-04-09,2022-11-09,Privileges relinquished 2022-11-09 Ken Manheimer,,1998-03-03,2005-04-08,Privileges relinquished on 2005-04-08 -Jeremy Hylton,jeremyhylton,1997-08-13,2020-11-26, +Jeremy Hylton,jeremyhylton,1997-08-13,, Roger E. 
Masse,,1996-12-09,2017-02-10,Did not make GitHub transition Fred Drake,freddrake,1996-07-23,, Barry Warsaw,warsaw,1994-07-25,, diff --git a/core-developers/experts.rst b/core-developers/experts.rst index d79d5cf8d6..683f354625 100644 --- a/core-developers/experts.rst +++ b/core-developers/experts.rst @@ -55,15 +55,12 @@ Module Maintainers __future__ __main__ gvanrossum, ncoghlan _thread -_testbuffer abc -aifc bitdancer -argparse +argparse savannahostrowski* array -ast benjaminp, pablogsal, isidentical +ast benjaminp, pablogsal, isidentical, JelleZijlstra, eclips4 asyncio 1st1, asvetlov, gvanrossum, graingert, kumaraditya303, willingc atexit -audioop serhiy-storchaka base64 bdb binascii @@ -71,10 +68,7 @@ bisect rhettinger* builtins bz2 calendar -cgi ethanfurman* -cgitb ethanfurman* -chunk -cmath mdickinson +cmath cmd code codecs malemburg, doerwalter @@ -90,14 +84,13 @@ contextvars copy avassalotti copyreg avassalotti cProfile -crypt jafo^* csv smontanaro (inactive) ctypes theller (inactive), abalkin, amauryfa, meadori curses Yhg1s dataclasses ericvsmith*, carljm datetime abalkin, pganssle dbm -decimal facundobatista, rhettinger, mdickinson +decimal facundobatista, rhettinger difflib tim-one (inactive) dis 1st1 doctest tim-one (inactive) @@ -111,19 +104,20 @@ fcntl Yhg1s filecmp fileinput fnmatch -fractions mdickinson +fractions ftplib giampaolo* functools rhettinger* gc pitrou, pablogsal getopt +getpath FFY00 getpass gettext glob grp gzip -hashlib tiran, gpshead* +hashlib tiran, gpshead*, picnixz heapq rhettinger*, stutzbach^ -hmac tiran, gpshead* +hmac tiran, gpshead*, picnixz html ezio-melotti* http idlelib kbkaiser (inactive), terryjreedy*, serwy (inactive), @@ -137,16 +131,14 @@ ipaddress pmoody^ itertools rhettinger* json etrepum (inactive), ezio-melotti, rhettinger keyword -lib2to3 benjaminp libmpdec linecache locale malemburg logging vsajip lzma mailbox -mailcap marshal -math mdickinson, rhettinger, stutzbach^ +math rhettinger, stutzbach^ mimetypes mmap 
Yhg1s modulefinder theller (inactive), jvr^ @@ -164,7 +156,7 @@ os.path serhiy-storchaka ossaudiodev parser pablogsal pathlib barneygale* -pdb +pdb gaogaotiantian pickle avassalotti pickletools avassalotti pipes @@ -183,7 +175,7 @@ pyclbr isidentical pydoc AA-Turner queue rhettinger* quopri -random rhettinger, mdickinson +random rhettinger re ezio-melotti, serhiy-storchaka readline Yhg1s reprlib @@ -210,16 +202,14 @@ stat tiran statistics stevendaprano, rhettinger string stringprep -struct mdickinson, meadori +struct meadori subprocess astrand^ (inactive), giampaolo, gpshead* -sunau symtable benjaminp sys sysconfig FFY00 syslog jafo^* tabnanny tim-one (inactive) tarfile gustaebel -telnetlib tempfile termios Yhg1s test ezio-melotti @@ -236,15 +226,16 @@ traceback iritkatriel tracemalloc vstinner tty Yhg1s* turtle gregorlingl^, willingc +turtledemo terryjreedy* types 1st1 typing gvanrossum, JelleZijlstra*, AlexWaygood*, carljm, sobolevn* unicodedata malemburg, ezio-melotti -unittest voidspace*, ezio-melotti, rbtcollins, gpshead -unittest.mock voidspace* +unittest ezio-melotti, rbtcollins, gpshead +unittest.mock urllib orsenthil uu uuid -venv vsajip +venv vsajip, FFY00 warnings wave weakref freddrake @@ -296,16 +287,18 @@ for “their” platform as a third-party project. 
=================== =========== Platform Maintainers =================== =========== -AIX David.Edelsohn^ +AIX edelsohn, ayappanec +Android mhsmith Cygwin jlt63^, stutzbach^ Emscripten hoodmane, pmp-p, rdb, rth, ryanking13 FreeBSD HP-UX +iOS freakboy3742, ned-deily Linux -macOS ronaldoussoren, ned-deily +macOS ronaldoussoren, ned-deily, freakboy3742 NetBSD1 OS2/EMX aimacintyre^ -Solaris/OpenIndiana jcea +Solaris/OpenIndiana jcea, kulikjak Windows tjguk, zware, zooba, pfmoore JVM/Java frank.wierzbicki^ =================== =========== @@ -327,9 +320,8 @@ buildbots zware, pablogsal bytecode benjaminp, 1st1, markshannon, brandtbucher, carljm, iritkatriel context managers ncoghlan core workflow Mariatta, ezio-melotti, hugovk, AA-Turner -coverity scan tiran, Yhg1s -cryptography gpshead, dstufft -data formats mdickinson +cryptography gpshead, dstufft, picnixz +data formats database malemburg devguide merwok, ezio-melotti, willingc, Mariatta, hugovk, AA-Turner @@ -342,11 +334,12 @@ frozen modules ericsnowcurrently, gvanrossum, kumaraditya303 f-strings ericvsmith* GUI i18n malemburg, merwok -import machinery brettcannon, ncoghlan, ericsnowcurrently +import machinery brettcannon, ncoghlan, ericsnowcurrently, FFY00 +initialization FFY00 io benjaminp, stutzbach^, gpshead -JIT brandtbucher* +JIT brandtbucher*, savannahostrowski* locale malemburg -mathematics mdickinson, malemburg, stutzbach^, rhettinger +mathematics malemburg, stutzbach^, rhettinger memory management tim-one, malemburg, Yhg1s memoryview networking giampaolo, gpshead @@ -364,7 +357,8 @@ release management tarekziade, malemburg, benjaminp, warsaw, runtime lifecycle ericsnowcurrently, kumaraditya303, zooba str.format ericvsmith* subinterpreters ericsnowcurrently, kumaraditya303 -testing voidspace, ezio-melotti +symbol table JelleZijlstra, carljm +testing ezio-melotti test coverage threads gpshead time and dates malemburg, abalkin, pganssle diff --git a/core-developers/index.rst b/core-developers/index.rst index 
2b3ac7b799..2e6db104f4 100644 --- a/core-developers/index.rst +++ b/core-developers/index.rst @@ -1,3 +1,5 @@ +.. _core-dev: + =============== Core developers =============== @@ -11,3 +13,4 @@ Core developers developer-log motivations become-core-developer + memorialization diff --git a/core-developers/memorialization.rst b/core-developers/memorialization.rst new file mode 100644 index 0000000000..61ec0560c8 --- /dev/null +++ b/core-developers/memorialization.rst @@ -0,0 +1,154 @@ +.. _memorialize-core-developer: + +=============== +Memorialization +=============== + +Rationale +========= + +When a core developer passes away, memorializing accounts helps create +a space for remembering the contributor and protects against attempted +logins and fraudulent activity. + +The process +=========== + +The memorialization process is performed by a member of the PSF staff +with administrative access to current and historical systems where +core developers have access. + +After the status of the core developer in question is confirmed, +access to the systems listed below is revoked and some changes are +made to how the user displays to others. + +To respect the choices that someone made while alive, we aim to preserve +the content of their accounts without changes after they've passed away. +To support the bereaved, in some instances, we may remove or change +certain content when the legacy contact or family members request it. + +GitHub +------ + +* The user is removed from the `python/ <https://github.com/orgs/python/>`_ + organization on GitHub; +* The user is removed from the `psf/ <https://github.com/orgs/psf/>`_ + organization on GitHub; +* The user is removed from the `pypa/ <https://github.com/orgs/pypa/>`_ + organization on GitHub. + +The PSF staff does not follow up with GitHub with regard to GitHub account +cancellation as this action is reserved for next-of-kin or a person designated by +the deceased GitHub user to act as an account successor.
+ +The general policy regarding deceased users on GitHub is described +`here <https://docs.github.com/en/site-policy/other-site-policies/github-deceased-user-policy>`_. + +Repositories in the organization +-------------------------------- + +* The user's GitHub handle is removed from ``/.github/CODEOWNERS``. + To see all that need action, perform + `this query <https://github.com/search?q=org%3Apython+path%3A**%2F.github%2FCODEOWNERS+USERNAME&type=code>`_. +* The user is marked as deceased in the private + `voters/python-core.toml <https://github.com/python/voters/blob/main/python-core.toml>`_ + file with the ``left=`` field set to the day of passing, if known. + +discuss.python.org +------------------ + +* The user's "custom status" is set to 🕊 ``in memoriam``; +* The user's "about me" is amended with ``$firstname passed away on $date. [In memoriam.]($in_memoriam_post_url)``; +* In the user's security "recently used devices" the staff member + chooses "Log out all"; +* In the user's permissions the staff member chooses "Deactivate account"; +* The user's trust level is reset to ``1: basic user`` (trust level 0 + doesn't allow links in "About Me"); +* The user's "associated accounts" (like GitHub) that provide an + alternative login method are all disconnected; +* The user's API keys are revoked; +* The user's admin or moderator rights are revoked; +* The user's primary email address is reset to + ``USERNAME@in-memoriam.invalid`` and secondary email addresses are + removed (this step requires the administrator to contact Discourse.org + staff via ``team@discourse.org``). + +The "in memoriam" Discourse topic mentioned above is best created by +a community member close to the deceased. + +The general best practice for deceased community members on +Discourse-powered forums is described `here <https://meta.discourse.org/t/best-practices-for-deceased-community-members/146210>`_.
+ +python.org email account +------------------------ + +The PSF staff member emails ``postmaster@python.org`` to ask the email +administrator to: + +* remove SMTP access from ``USERNAME@python.org``; +* reset the password to POP3/IMAP for ``USERNAME@python.org``; +* disable email forwarding, if set up, for ``USERNAME@python.org`` and + leave a record permanently as "in memoriam" to avoid future account + name reuse; +* remove this email from all mailing lists under ``@python.org``; +* remove any known alternate emails for the same user from all mailing + lists under ``@python.org``. + +In case the email shutdown causes issues for the estate executors, the +PSF will reasonably try to help if contacted directly. + +python.org admin +---------------- + +* The user's account (``/admin/users/user``) is deactivated (NOT deleted) + and their staff and superuser status is unchecked; +* The user's password is reset to a long random string; +* The user's primary email address is set to + ``USERNAME@in-memoriam.invalid`` and set as unverified; +* The user's secondary email addresses are deleted; +* The user's API keys (both on the account and ``tastypie``) are deleted; +* The user's "I would like to be a PSF Voting Member" field is cleared. + +devguide.python.org +------------------- + +* The user is marked as deceased in `developers.csv <https://github.com/python/devguide/blob/main/core-developers/developers.csv>`_; +* The user is removed from the `Experts Index <https://github.com/python/devguide/blob/main/core-developers/experts.rst>`_. + +bugs.python.org +--------------- + +While the issue tracker was migrated to GitHub, the Roundup instance +is still up for historical purposes. 
+ +* the PSF staff member logs into ``bugs.nyc1.psf.io``; +* the PSF staff member runs ``roundup-admin`` to set the user's email + address to ``USERNAME@in-memoriam.invalid``; +* the user's alternate emails are removed; +* the user's password is reset to a long random string; +* the PSF staff member removes any active login sessions from Postgres. + +Other PSF-related infrastructure +-------------------------------- + +* The PSF staff member notifies administrators of the Python Core Devs + Discord server to remove the user from the server. The PSF staff + does not follow up with Discord with regard to Discord account + cancellation. The general policy regarding deceased users on Discord + is available `here <https://support.discord.com/hc/en-us/articles/19872987802263--Deceased-or-Incapacitated-Users>`_. + +* The user is removed from Salt configuration for the PSF infrastructure + in `/pillar/base/users <https://github.com/python/psf-salt/tree/main/pillar/base/users>`_ + that allows SSH access to PSF-controlled servers. + +* The user might have run a buildbot worker. The PSF staff member will + look for that in the + `buildmaster-config <https://github.com/search?q=repo%3Apython%2Fbuildmaster-config%20USERNAME&type=code>`_ + repository. + +PyPI +---- + +* The PSF staff member notifies PyPI admins by emailing them at + ``admin@pypi.org`` to mark the user as inactive, remove their email + addresses, prohibit their password resets, and revoke all API keys. diff --git a/core-developers/motivations.rst index c32304962f..b19c3062b8 100644 --- a/core-developers/motivations.rst +++ b/core-developers/motivations.rst @@ -80,7 +80,7 @@ participating in the CPython core development process: country of residence. Include a "Crowdfunding" bullet point with a link if you'd like to highlight - crowdfunding services (e.g.
Patreon) that folks can use to support your core + crowdfunding services (for example, Patreon) that folks can use to support your core development work. Include additional bullet points (without links) for any other affiliations @@ -106,9 +106,11 @@ participating in the CPython core development process: * Personal site: `Curious Efficiency <https://www.curiousefficiency.org/>`_ * `Extended bio <https://www.curiousefficiency.org/pages/about>`__ * Python Software Foundation (Fellow, Packaging Working Group) + * Element Labs/LM Studio (Python deployment engineer) Alyssa began using Python as a testing and prototyping language while working - for Boeing Defence Australia, and continues to use it for that purpose today. + for Boeing Defence Australia. She now primarily uses it as the lead project + maintainer for the open source ``venvstacks`` Python deployment utility. As a core developer, she is primarily interested in helping to ensure Python's continued suitability for educational, testing and data analysis use cases, @@ -116,7 +118,7 @@ participating in the CPython core development process: applications and test harnesses from open source components. Note: prior to August 2023, Alyssa used her birth name (Nick Coghlan). Some records - (e.g. mailing list archives, version control history) will still reference that name. + (for example, mailing list archives, version control history) will still reference that name. .. topic:: Steve Dower (United States/Australia) @@ -186,7 +188,7 @@ participating in the CPython core development process: .. topic:: Antoine Pitrou (France) * LinkedIn: `<https://www.linkedin.com/in/pitrou/>`_ (Senior Software Engineer) - * Voltron Data + * QuantStack * Python Software Foundation (Fellow) * Email address: antoine@python.org @@ -197,12 +199,12 @@ participating in the CPython core development process: world, and by the concrete roadblocks he was hitting in professional settings. 
Topics of choice have included interpreter optimizations, garbage collection, network programming, system programming and - concurrent programming (such as maintaining ``multiprocessing``). + concurrent programming. As a professional, Antoine has been first specializing in network programming, and more lately in open source data science infrastructure. - He is currently working full time on Apache Arrow as a technical leader - for Voltron Data. + He has made numerous contributions to Numba, Dask and is currently working + full time on Apache Arrow as a technical leader at QuantStack. .. topic:: Victor Stinner (France) @@ -221,25 +223,23 @@ participating in the CPython core development process: .. topic:: Barry Warsaw (United States) - * `LinkedIn: <https://www.linkedin.com/in/barry-warsaw/>`_ (Senior Staff - Software Engineer - Python Foundation team) + * NVIDIA, Principal System Software Engineer, Open Source Python Ecosystem * Personal site: `barry.warsaw.us <https://barry.warsaw.us/>`_ * Blog: `We Fear Change <https://www.wefearchange.org/>`_ + * `LinkedIn <https://www.linkedin.com/in/barry-warsaw/>`_ + * `Bluesky <https://bsky.app/profile/pumpichank.bsky.social>`_ * Email address: barry@python.org * Python Software Foundation (Fellow) Barry has been working in, with, and on Python since 1994. He attended the - first Python workshop at NBS (now `NIST <https://www.nist.gov/>`_) in - Gaithersburg, MD in 1994, where he met Guido and several other early Python - adopters. Barry subsequently worked with Guido for 8 years while at `CNRI - <http://cnri.reston.va.us/>`_. From 2007 until 2017, Barry worked for - `Canonical <https://canonical.com/>`_, corporate sponsor of `Ubuntu - <https://ubuntu.com/>`_ Linux, primarily on the Python ecosystem, and - is both an Ubuntu and a `Debian <https://www.debian.org/>`_ uploading - developer. 
Barry has served as Python's postmaster, webmaster, release - manager, Language Summit co-chair, `Jython <https://www.jython.org/>`_ - project leader, `GNU Mailman <https://www.list.org/>`_ project leader, and - probably lots of other things he shouldn't admit to. + first Python workshop at `NIST <https://www.nist.gov/>`_ in Gaithersburg, + MD in 1994, where he met Guido and several other early Python adopters. + Barry subsequently worked with Guido for 8 years while at `CNRI + <http://cnri.reston.va.us/>`_. Barry has served as Python's postmaster, + webmaster, release manager, Language Summit co-chair, `Jython + <https://www.jython.org/>`_ project leader, `GNU Mailman + <https://www.list.org/>`_ project leader, and Python Steering Council + member in 2019, 2020, 2021, 2024, and 2025. .. topic:: Eric Snow (United States) @@ -261,7 +261,7 @@ participating in the CPython core development process: .. topic:: Carol Willing (United States) - * Noteable: `<https://noteable.io/about-us/>`__ (VP Engineering) + * Noteable (VP Engineering) * Personal site: `Willing Consulting <https://www.willingconsulting.com/>`_ * `Extended bio <https://www.willingconsulting.com/about/>`__ * Project Jupyter (Software Council, Core Team for JupyterHub/Binder) @@ -279,11 +279,11 @@ Goals of this page The `issue metrics`_ automatically collected by the CPython issue tracker strongly suggest that the current core development process is bottlenecked on -core developer time - this is most clearly indicated in the first metrics graph, -which shows both the number of open issues and the number of patches awaiting +core developer time. This is most clearly indicated in the first metrics graph, +which shows both the number of open issues and the number of pull requests awaiting review growing steadily over time, despite CPython being one of the most active open source projects in the world. 
This bottleneck then impacts not only -resolving open issues and applying submitted patches, but also the process of +resolving open issues and accepting submitted pull requests, but also the process of identifying, nominating and mentoring new core developers. The core commit statistics monitored by sites like `OpenHub`_ provide a good diff --git a/core-developers/responsibilities.rst b/core-developers/responsibilities.rst index 0638a967e6..5cd5ed7bdb 100644 --- a/core-developers/responsibilities.rst +++ b/core-developers/responsibilities.rst @@ -7,8 +7,8 @@ Responsibilities As contributors to the CPython project, our shared responsibility is to collaborate constructively with other contributors, including core developers. This responsibility covers all forms of contribution, whether that's submitting -patches to the implementation or documentation, reviewing other peoples' -patches, triaging issues on the issue tracker, or discussing design and +pull requests to the implementation or documentation, reviewing other peoples' +pull requests, triaging issues on the issue tracker, or discussing design and development ideas on the core :ref:`communication channels <communication-channels>`. @@ -68,7 +68,7 @@ the ability to license your code means it can be put under the PSF license so it can be legally distributed with Python. This is a very important step! Hopefully you have already submitted a -contributor agreement if you have been submitting patches. But if you have not +contributor agreement if you have been submitting pull requests. But if you have not done this yet, it is best to do this ASAP, probably before you even do your first commit so as to not forget. Also do not forget to enter your GitHub username into your details on the issue tracker. @@ -127,4 +127,4 @@ And finally, enjoy yourself! Contributing to open source software should be fun (overall). 
If you find yourself no longer enjoying the work then either take a break or figure out what you need to do to make it enjoyable again. -.. _PSF Code of Conduct: https://www.python.org/psf/conduct/ +.. _PSF Code of Conduct: https://policies.python.org/python.org/code-of-conduct/ diff --git a/developer-workflow/c-api.rst b/developer-workflow/c-api.rst index 8097354c38..3f8c03e92c 100644 --- a/developer-workflow/c-api.rst +++ b/developer-workflow/c-api.rst @@ -174,7 +174,7 @@ Unstable C API The unstable C API tier is meant for extensions that need tight integration with the interpreter, like debuggers and JIT compilers. -Users of this tier may need to change their code with every minor release. +Users of this tier may need to change their code with every feature release. In many ways, this tier is like the general C API: @@ -189,7 +189,7 @@ The differences are: - Names of functions structs, macros, etc. start with the ``PyUnstable_`` prefix. This defines what's in the unstable tier. -- The unstable API can change in minor versions, without any deprecation +- The unstable API can change in feature releases, without any deprecation period. - A stability note appears in the docs. This happens automatically, based on the name @@ -198,7 +198,7 @@ The differences are: Despite being “unstable”, there are rules to make sure third-party code can use this API reliably: -* Changes and removals can be done in minor releases +* Changes and removals can be done in feature releases (:samp:`3.{x}.0`, including Alphas and Betas for :samp:`3.{x}.0`). * Adding a new unstable API *for an existing feature* is allowed even after Beta feature freeze, up until the first Release Candidate. @@ -219,7 +219,7 @@ Moving an API from the public tier to Unstable * Expose the API under its new name, with the ``PyUnstable_`` prefix. The ``PyUnstable_`` prefix must be used for all symbols (functions, macros, variables, etc.). -* Make the old name an alias (e.g. 
a ``static inline`` function calling the +* Make the old name an alias (for example, a ``static inline`` function calling the new function). * Deprecate the old name, typically using :c:macro:`Py_DEPRECATED`. * Announce the change in the "What's New". @@ -255,7 +255,7 @@ Moving an API from unstable to public ------------------------------------- * Expose the API under its new name, without the ``PyUnstable_`` prefix. -* Make the old ``PyUnstable_*`` name be an alias (e.g. a ``static inline`` +* Make the old ``PyUnstable_*`` name be an alias (for example, a ``static inline`` function calling the new function). * Announce the change in What's New. @@ -393,7 +393,7 @@ Adding a new definition to the Limited API #if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x03yy0000 - with the ``yy`` corresponding to the target CPython version, e.g. + with the ``yy`` corresponding to the target CPython version, for example, ``0x030A0000`` for Python 3.10. - Append an entry to the Stable ABI manifest, ``Misc/stable_abi.toml`` - Regenerate the autogenerated files using ``make regen-limited-abi``. @@ -426,7 +426,7 @@ To add a test file: - Add a C file ``Modules/_testcapi/yourfeature_limited.c``. If that file already exists but its ``Py_LIMITED_API`` version is too low, add a version - postfix, e.g. ``yourfeature_limited_3_12.c`` for Python 3.12+. + postfix, for example, ``yourfeature_limited_3_12.c`` for Python 3.12+. - ``#define Py_LIMITED_API`` to the minimum limited API version needed. - ``#include "parts.h"`` after the ``Py_LIMITED_API`` definition - Enclose the entire rest of the file in ``#ifdef LIMITED_API_AVAILABLE``, diff --git a/developer-workflow/communication-channels.rst b/developer-workflow/communication-channels.rst index 0f7970fade..00c569178d 100644 --- a/developer-workflow/communication-channels.rst +++ b/developer-workflow/communication-channels.rst @@ -27,12 +27,13 @@ in return. Mailing lists ============= -.. 
note:: Some mailing lists have been supplanted by categories in the - Python `Discourse`_. Specifically, +.. note:: + + Mailing lists have generally been replaced by the `Discourse`_ forum. + Specifically, * The python-dev list is superseded by the `Core Development`_ and `PEPs`_ categories on Discourse. - * The python-ideas list is superseded by posts in the `Ideas`_ category on Discourse. @@ -42,17 +43,21 @@ Mailing lists - Ideas about new functionality should **not** start here, and instead should be discussed in `Ideas`_. - Technical support questions should also not be asked here, and instead - should go to the python-list_ or python-help_ mailing lists, or the - `Python Help`_ category on Discourse. + should go to the `Python Help`_ category on Discourse or the python-list_. + + Previous threads on the python-dev_, python-committers_, and python-ideas_ + mailing lists can be accessed through the `online archive + <https://mail.python.org/archives/>`__. -Existing threads on the python-dev_, python-committers_, and python-ideas_ mailing lists -can be accessed through the `online archive <web gateway_>`__. + .. _python-committers: https://mail.python.org/mailman3/lists/python-committers.python.org/ + .. _python-dev: https://mail.python.org/mailman3/lists/python-dev.python.org/ + .. _python-ideas: https://mail.python.org/mailman3/lists/python-ideas.python.org General Python questions should go to `python-list`_ or `tutor`_ -or similar resources, such as StackOverflow_ or the ``#python`` IRC channel +or similar resources, such as `Stack Overflow`_ or the ``#python`` IRC channel on Libera.Chat_. -`The core-workflow <https://github.com/python/core-workflow/issues>`_ +The `core-workflow <https://github.com/python/core-workflow/issues>`__ issue tracker is the place to discuss and work on improvements to the CPython core development workflow. @@ -62,16 +67,10 @@ https://mail.python.org/mailman3/ (newer lists, using Mailman3). 
Some lists may be mirrored at `GMANE <https://gmane.io/>`_ and can be read and posted to in various ways, including via web browsers, NNTP newsreaders, and RSS feed readers. -.. _issue tracker: https://github.com/python/cpython/issues -.. _python-committers: https://mail.python.org/mailman3/lists/python-committers.python.org/ -.. _python-dev: https://mail.python.org/mailman3/lists/python-dev.python.org/ -.. _python-help: https://mail.python.org/mailman/listinfo/python-help -.. _python-ideas: https://mail.python.org/mailman3/lists/python-ideas.python.org .. _python-list: https://mail.python.org/mailman/listinfo/python-list .. _tutor: https://mail.python.org/mailman/listinfo/tutor -.. _StackOverflow: https://stackoverflow.com/ +.. _Stack Overflow: https://stackoverflow.com/ .. _Libera.Chat: https://libera.chat/ -.. _web gateway: https://mail.python.org/archives/ .. _communication-discourse: @@ -79,11 +78,8 @@ ways, including via web browsers, NNTP newsreaders, and RSS feed readers. Discourse (discuss.python.org web forum) ======================================== -We have our own `Discourse`_ forum for both developers and users. This forum -complements the `python-dev`_, `python-ideas`_, `python-help`_, and -`python-list`_ mailing lists. - -This forum has different categories and most core development discussions +We have our own `Discourse`_ forum for both developers and users. +It has different categories and most core development discussions take place in the open forum categories for `PEPs`_ and `Core Development`_ (these are the Discourse equivalents to the python-dev mailing list). All categories are open for users to read and post with the exception of @@ -98,8 +94,8 @@ Tutorials for new users To start a topic or participate in any discussions in the forum, sign up and create an account using an email address or GitHub account. You can do so by -clicking the "Sign Up" button on the top right hand corner of the `Discourse`_ -main page. 
+clicking the :guilabel:`Sign Up` button on the top right hand corner of the +`Discourse`_ main page. The Python Discourse `Quick Start <https://discuss.python.org/t/python-discourse-quick-start/116>`_ compiled by `Carol Willing <https://discuss.python.org/u/willingc/>`_ gives you @@ -110,15 +106,18 @@ These tutorials can be activated by replying to a welcome message from "discours Greetings!" received under Notifications and Messages in your user account. * Click on your personal account found on the top right hand corner of the page. -* The dropdown menu will show four different icons: 🔔 (Notifications), - 🔖 (Bookmarks), ✉️ (Messages), and 👤 (Preferences). +* The dropdown menu will show four different icons: + :guilabel:`🔔` (Notifications), + :guilabel:`🔖` (Bookmarks), + :guilabel:`✉️` (Messages), and + :guilabel:`👤` (Preferences). * Select either Notifications or Messages. * Open the "Greetings!" message sent by Discobot to start the tutorial. Ensure that you read through the `Python Code of Conduct <https://discuss.python.org/faq>`_. We are to be open, considerate and respectful to all users in the community. You can report messages that don't respect the CoC by clicking on the three -dots under the message and then on the ⚐ icon. You can also mention the +dots under the message and then on the :guilabel:`⚐` icon. You can also mention the `@staff <https://discuss.python.org/groups/staff>`_, `@moderators <https://discuss.python.org/groups/moderators>`_, or `@admins <https://discuss.python.org/groups/admins>`_ groups in a message. @@ -126,7 +125,8 @@ dots under the message and then on the ⚐ icon. You can also mention the Reading topics ------------------ +-------------- + Click a topic title and read down the list of replies in chronological order, following links or previewing replies and quotes as you go. 
Use your mouse to scroll the screen, or use the timeline scroll bar on the right which also shows @@ -142,10 +142,11 @@ Following categories (category notifications) Notifications can be set for individual categories and topics. To change any of these defaults, you can either go to your user preferences, or visit the category -page, and use the notification button 🔔 above the topic list, -on the top right hand corner of the category page beside the "+ New Topic" button. +page, and use the notification button :guilabel:`🔔` above the topic list, +on the top right hand corner of the category page beside the +:guilabel:`+ New Topic` button. -Clicking on the Notification control 🔔 will show a drop-down panel with 5 +Clicking on the notification control :guilabel:`🔔` will show a drop-down panel with 5 different options: Watching, Tracking, Watching First Post, Normal, and Muted. All categories are set by default in Normal mode where you will only be notified if someone mentions your @name or replies to you. @@ -154,7 +155,7 @@ Following individual threads (topic notifications) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ To follow any individual topics or threads, you can adjust your notifications -through the notification button 🔔 found on the right of the topic at the end +through the notification button :guilabel:`🔔` found on the right of the topic at the end of the timeline. You can also do so at the bottom of each topic. Select "Watching" and you will be notified when there is any new updated reply from that particular thread. @@ -181,7 +182,7 @@ mailing list mode" and save changes. .. _Core Development: https://discuss.python.org/c/core-dev/23 .. _Committers: https://discuss.python.org/c/committers/5 .. _Ideas: https://discuss.python.org/c/ideas/6 -.. _Python Help: https://discuss.python.org/c/users/7 +.. 
_Python Help: https://discuss.python.org/c/help/7 Discord (private chat server) @@ -235,7 +236,7 @@ Blogs Several core developers are active bloggers and discuss Python's development that way. You can find their blogs (and various other developers who use Python) -at https://planetpython.org/. +at `Planet Python <https://planetpython.org/>`__. Setting expectations for open source participation @@ -258,7 +259,7 @@ Other core workflow tools are: * `blurb_it`_ * `miss-islington`_ * `cla-bot`_ -* `cpython-emailer-webhook`_ +* `webhook-mailer`_ Python `Performance Benchmark`_ project is intended to be an authoritative source of benchmarks for all Python implementations. @@ -269,5 +270,5 @@ source of benchmarks for all Python implementations. .. _blurb_it: https://github.com/python/blurb_it .. _miss-islington: https://github.com/python/miss-islington .. _cla-bot: https://github.com/ambv/cla-bot -.. _cpython-emailer-webhook: https://github.com/berkerpeksag/cpython-emailer-webhook +.. _webhook-mailer: https://github.com/python/webhook-mailer .. _Performance Benchmark: https://github.com/python/pyperformance diff --git a/developer-workflow/development-cycle.rst b/developer-workflow/development-cycle.rst index 79a9dced48..8a37692ad6 100644 --- a/developer-workflow/development-cycle.rst +++ b/developer-workflow/development-cycle.rst @@ -36,8 +36,8 @@ Some examples of release tags: ``v3.7.0a1``, ``v3.6.3``, ``v2.7.14rc1``. Branches -------- -There is a branch for each *feature version*, whether released or not (e.g. -3.7, 3.8). +There is a branch for each *feature version*, whether released or not (for +example, 3.12, 3.13). .. _indevbranch: @@ -51,13 +51,11 @@ changes, performance improvements, bug fixes. At some point during the life-cycle of a release, a new :ref:`maintenance branch <maintbranch>` is created to host all bug fixing -activity for further micro versions in a feature version (3.8.1, 3.8.2, etc.). 
+activity for further micro versions in a feature version (3.12.1, 3.12.2, and so +on). -For versions 3.4 and before, this was conventionally done when the final -release was cut (for example, 3.4.0 final). - -Starting with the 3.5 release, we create the release maintenance branch -(e.g. 3.5) at the time we enter beta (3.5.0 beta 1). This allows +We create the release maintenance branch +(``3.14``) at the time we enter beta (3.14.0 beta 1). This allows feature development for the release 3.n+1 to occur within the main branch alongside the beta and release candidate stabilization periods for release 3.n. @@ -79,7 +77,7 @@ releases; the terms are used interchangeably. These releases have a The only changes allowed to occur in a maintenance branch without debate are bug fixes, test improvements, and edits to the documentation. Also, a general rule for maintenance branches is that compatibility -must not be broken at any point between sibling micro releases (3.5.1, 3.5.2, +must not be broken at any point between sibling micro releases (3.12.1, 3.12.2, etc.). For both rules, only rare exceptions are accepted and **must** be discussed first. @@ -89,7 +87,7 @@ since most readers access the `stable documentation <https://docs.python.org/3/> rather than the `development documentation <https://docs.python.org/dev/>`__. A new maintenance branch is normally created when the next feature release -cycle reaches feature freeze, i.e. at its first beta pre-release. +cycle reaches feature freeze, that is, at its first beta pre-release. From that point on, changes intended for remaining pre-releases, the final release (3.x.0), and subsequent bugfix releases are merged to that maintenance branch. @@ -97,9 +95,9 @@ that maintenance branch. Sometime following the final release (3.x.0), the maintenance branch for the previous minor version will go into :ref:`security mode <secbranch>`, usually after at least one more bugfix release at the discretion of the -release manager. 
For example, the 3.4 maintenance branch was put into -:ref:`security mode <secbranch>` after the 3.4.4 bugfix release -which followed the release of 3.5.1. +release manager. For example, the 3.11 maintenance branch was put into +:ref:`security mode <secbranch>` after the 3.11.9 bugfix release +which followed the release of 3.12.2. .. _secbranch: @@ -120,7 +118,7 @@ Commits to security branches are to be coordinated with the release manager for the corresponding feature version, as listed in the :ref:`branchstatus`. Merging of pull requests to security branches is restricted to release managers. Any release made from a security branch is source-only and done only when actual -security patches have been applied to the branch. These releases have a +security fixes have been applied to the branch. These releases have a **micro version** number greater than the last **bugfix** release. .. _eolbranch: @@ -131,7 +129,7 @@ End-of-life branches The code base for a release cycle which has reached end-of-life status is frozen and no longer has a branch in the repo. The final state of the end-of-lifed branch is recorded as a tag with the same name as the -former branch, e.g. ``3.3`` or ``2.6``. +former branch, for example, ``3.8`` or ``2.7``. The :ref:`versions` page contains list of active and end-of-life branches. @@ -153,8 +151,8 @@ Pre-alpha The branch is in this stage when no official release has been done since the latest final release. There are no special restrictions placed on -commits, although the usual advice applies (getting patches reviewed, avoiding -breaking the buildbots). +commits, although the usual advice applies (getting pull requests reviewed, +avoiding breaking the buildbots). .. _alpha: @@ -191,7 +189,7 @@ Release Candidate (RC) A branch preparing for an RC release can only have bugfixes applied that have been reviewed by other core developers. Generally, these issues must be -severe enough (e.g. 
crashes) that they deserve fixing before the final release. +severe enough (for example, crashes) that they deserve fixing before the final release. All other issues should be deferred to the next development cycle, since stability is the strongest concern at this point. @@ -227,13 +225,13 @@ repositories are expected to relate to the Python language, the CPython reference implementation, their documentation and their development workflow. This includes, for example: -* The reference implementation of Python and related repositories (i.e. `CPython <https://github.com/python/cpython>`_) -* Tooling and support around CPython development (e.g. `pyperformance <https://github.com/python/pyperformance>`_, `Bedevere <https://github.com/python/bedevere>`_) -* Helpers and backports for Python/CPython features (e.g. `typing_extensions <https://github.com/python/typing_extensions>`_, `typeshed <https://github.com/python/typeshed>`_, `tzdata <https://github.com/python/tzdata>`_, `pythoncapi-compat <https://github.com/python/pythoncapi-compat>`_) -* Organization-related repositories (e.g. the `Code of Conduct <https://github.com/python/pycon-code-of-conduct>`_, `.github <https://github.com/python/.github>`_) -* Documentation and websites for all the above (e.g. `python.org repository <https://github.com/python/pythondotorg>`_, `PEPs <https://github.com/python/peps>`_, `Devguide <https://github.com/python/devguide>`_, docs translations) -* Infrastructure for all the above (e.g. `docsbuild-scripts <https://github.com/python/docsbuild-scripts>`_, `buildmaster-config <https://github.com/python/buildmaster-config>`_) -* Discussions and notes around official development-related processes and events (e.g. `steering-council <https://github.com/python/steering-council>`_, `core-sprint <https://github.com/python/core-sprint>`_) +* The reference implementation of Python and related repositories: `CPython <https://github.com/python/cpython>`_. 
+* Tooling and support around CPython development: `pyperformance <https://github.com/python/pyperformance>`_, `Bedevere <https://github.com/python/bedevere>`_. +* Helpers and backports for Python/CPython features: `typing_extensions <https://github.com/python/typing_extensions>`_, `typeshed <https://github.com/python/typeshed>`_, `tzdata <https://github.com/python/tzdata>`_, `pythoncapi-compat <https://github.com/python/pythoncapi-compat>`_. +* Organization-related repositories: the `Code of Conduct <https://github.com/python/pycon-code-of-conduct>`_, `.github <https://github.com/python/.github>`_. +* Documentation and websites for all the above: `python.org repository <https://github.com/python/pythondotorg>`_, `PEPs <https://github.com/python/peps>`_, `Devguide <https://github.com/python/devguide>`_, docs translations. +* Infrastructure for all the above: `docsbuild-scripts <https://github.com/python/docsbuild-scripts>`_, `buildmaster-config <https://github.com/python/buildmaster-config>`_. +* Discussions and notes around official development-related processes and events: `steering-council <https://github.com/python/steering-council>`_, `core-sprint <https://github.com/python/core-sprint>`_. Before adding a new repository to the organization, open a discussion to seek consensus in the `Committers Discourse category <https://discuss.python.org/c/committers/5>`_. @@ -248,13 +246,13 @@ accounts or other GitHub orgs. It is relatively easy to move a repository to the organization once it is mature. For example, this would now apply to experimental features like `asyncio <https://github.com/python/asyncio>`_, `exceptiongroups <https://github.com/python/exceptiongroups>`_, -and drafts of new guides and other documentation (e.g. `redistributor-guide +and drafts of new guides and other documentation (for example, `redistributor-guide <https://github.com/python/redistributor-guide>`_). -General-use tools and libraries (e.g. 
`mypy <https://github.com/python/mypy>`_ +General-use tools and libraries (for example, `mypy <https://github.com/python/mypy>`_ or `Black <https://github.com/psf/black>`_) should also be developed outside the ``python`` organization, unless core devs (as represented by the SC) -specifically want to “bless” one implementation (as with e.g. +specifically want to “bless” one implementation (as with `typeshed <https://github.com/python/typeshed>`_, `tzdata <https://github.com/python/tzdata>`_, or `pythoncapi-compat <https://github.com/python/pythoncapi-compat>`_). @@ -301,7 +299,7 @@ Current owners +----------------------+--------------------------------+-----------------+ | Ee Durbin | PSF Director of Infrastructure | ewdurbin | +----------------------+--------------------------------+-----------------+ -| Van Lindberg | PSF General Counsel | VanL | +| Jacob Coffee | PSF Infrastructure Engineer | JacobCoffee | +----------------------+--------------------------------+-----------------+ | Łukasz Langa | CPython Developer in Residence | ambv | +----------------------+--------------------------------+-----------------+ @@ -340,17 +338,15 @@ Current administrators +-------------------+----------------------------------------------------------+-----------------+ | Name | Role | GitHub Username | +===================+==========================================================+=================+ -| Pablo Galindo | Python 3.10 and 3.11 Release Manager, | pablogsal | -| | Maintainer of buildbot.python.org | | -+-------------------+----------------------------------------------------------+-----------------+ -| Łukasz Langa | Python 3.8 and 3.9 Release Manager, | ambv | -| | PSF CPython Developer in Residence 2021-2022 | | +| Hugo van Kemenade | Python 3.14 and 3.15 Release Manager | hugovk | +-------------------+----------------------------------------------------------+-----------------+ -| Ned Deily | Python 3.6 and 3.7 Release Manager | ned-deily | +| Thomas Wouters | Python 
3.12 and 3.13 Release Manager | Yhg1s | +-------------------+----------------------------------------------------------+-----------------+ -| Larry Hastings | Retired Release Manager (for Python 3.4 and 3.5) | larryhastings | +| Pablo Galindo | Python 3.10 and 3.11 Release Manager, | pablogsal | +| | Maintainer of buildbot.python.org | | +-------------------+----------------------------------------------------------+-----------------+ -| Berker Peksag | Maintainer of bpo-linkify and cpython-emailer-webhook | berkerpeksag | +| Łukasz Langa | Python 3.9 Release Manager, | ambv | +| | PSF CPython Developer in Residence 2021-present | | +-------------------+----------------------------------------------------------+-----------------+ | Brett Cannon | | brettcannon | +-------------------+----------------------------------------------------------+-----------------+ @@ -358,6 +354,8 @@ Current administrators +-------------------+----------------------------------------------------------+-----------------+ | Mariatta Wijaya | Maintainer of bedevere, blurb_it and miss-islington | Mariatta | +-------------------+----------------------------------------------------------+-----------------+ +| Seth Larson | PSF Security Developer-in-Residence | sethmlarson | ++-------------------+----------------------------------------------------------+-----------------+ Repository release manager role policy ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -365,13 +363,37 @@ Repository release manager role policy Release Managers for :ref:`in-development <indevbranch>`, :ref:`maintenance <maintbranch>`, and :ref:`security mode <secbranch>` Python releases are granted Administrator privileges on the repository. Once a release branch has -entered :ref:`end-of-life <eolbranch>`, the Release Manager for that branch is -removed as an Administrator and granted sole privileges (out side of repository -administrators) to merge changes to that branch. 
+entered :ref:`end-of-life <eolbranch>`, the Release Manager for that branch
+creates a final tag and deletes the branch. After this, they are
+removed as an Administrator.
 
 Multi-Factor Authentication must be enabled by the user in order to retain
 access as a Release Manager of the branch.
 
+PyPI organization policy
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Python core team owns the :pypi-org:`cpython` and :pypi-org:`python`
+organizations on PyPI for publishing packages.
+The main benefits of adding packages to these organizations:
+
+* Visibility: we can see our packages under a PyPI org page
+* Maintainability: we can share granular PyPI access to improve the bus factor
+
+The general policy on which organization to use:
+
+* :pypi-org:`cpython`:
+  for development tools that are tied fairly closely to CPython development.
+  For example, :pypi:`blurb` and :pypi:`cherry-picker`.
+  Users generally shouldn’t have to care except for developing CPython itself
+  (although that doesn’t mean the tools necessarily have to be unusable for
+  anyone else).
+* :pypi-org:`python`:
+  for general-audience projects that are maintained by the Python core team.
+  For example, :pypi:`pyperformance`, :pypi:`python-docs-theme` and
+  :pypi:`tzdata`.
+
+
 Governance
 ----------
diff --git a/developer-workflow/extension-modules.rst b/developer-workflow/extension-modules.rst
index 0384c2b382..61c1ff08af 100644
--- a/developer-workflow/extension-modules.rst
+++ b/developer-workflow/extension-modules.rst
@@ -5,13 +5,662 @@ Standard library extension modules
 ==================================
 
-In this section, we could explain how to write a CPython extension with the C language, but the topic can take a complete book.
-
-For this reason, we prefer to give you some links where you can read a very good documentation.
-
-Read the following references:
+In this section, we explain how to configure and compile the CPython project
+with a C :term:`extension module`. We will not explain how to write a
+C extension module itself, but instead refer you to the following good
+documentation:
 
 * https://docs.python.org/dev/c-api/
 * https://docs.python.org/dev/extending/
 * :pep:`399`
 * https://pythonextensionpatterns.readthedocs.io/en/latest/
+
+Some modules in the standard library, such as :mod:`datetime` or :mod:`pickle`,
+have identical implementations in C and Python; the C implementation, when
+available, is expected to improve performance (such extension modules are
+commonly referred to as *accelerator modules*).
+
+Other modules mainly implemented in Python may import a C helper extension
+providing implementation details (for instance, the :mod:`csv` module uses
+the internal :mod:`!_csv` module defined in :cpy-file:`Modules/_csv.c`).
+
+Classifying extension modules
+=============================
+
+Extension modules can be classified into two categories:
+
+* A *built-in* extension module is a module built and shipped with
+  the Python interpreter. A built-in module is *statically* linked
+  into the interpreter, thereby lacking a :attr:`!__file__` attribute.
+
+  .. seealso:: :data:`sys.builtin_module_names` --- names of built-in modules.
+
+  Built-in modules are built with the :c:macro:`!Py_BUILD_CORE_BUILTIN`
+  macro defined.
+
+* A *shared* (or *dynamic*) extension module is built as a shared library
+  (``.so`` or ``.dll`` file) and is *dynamically* linked into the interpreter.
+
+  In particular, the module's :attr:`!__file__` attribute contains the path
+  to the ``.so`` or ``.dll`` file.
+
+  Shared modules are built with the :c:macro:`!Py_BUILD_CORE_MODULE`
+  macro defined. Using the :c:macro:`!Py_BUILD_CORE_BUILTIN` macro
+  instead causes an :exc:`ImportError` when importing the module.
+
+.. note::
+
+   Informally, built-in extension modules can be regarded as *required*
+   while shared extension modules are *optional* in the sense that they
+   might be supplied, overridden or disabled externally.
+
+   Usually, accelerator modules are built as *shared* extension modules,
+   especially if they already have a pure Python implementation.
+
+According to :pep:`399`, *new* extension modules MUST provide a working and
+tested pure Python implementation, unless a special dispensation from
+the :github:`Steering Council <python/steering-council>` is given.
+
+Adding an extension module to CPython
+=====================================
+
+Assume that the standard library contains a pure Python module :mod:`!foo`
+with the following :func:`!foo.greet` function:
+
+.. code-block:: python
+   :caption: Lib/foo.py
+
+   def greet():
+       return "Hello World!"
+
+Instead of using the Python implementation of :func:`!foo.greet`, we want to
+use its corresponding C extension implementation exposed in the :mod:`!_foo`
+module. Ideally, we want to modify ``Lib/foo.py`` as follows:
+
+.. code-block:: python
+   :caption: Lib/foo.py
+
+   try:
+       # use the C implementation if possible
+       from _foo import greet
+   except ImportError:
+       # fall back to the pure Python implementation
+       def greet():
+           return "Hello World!"
+
+.. note::
+
+   Accelerator modules should *never* be imported directly. The convention is
+   to mark them as private implementation details with the underscore prefix
+   (namely, :mod:`!_foo` in this example).
+
+In order to incorporate the accelerator module, we need to determine:
+
+- where to update the CPython project tree with the extension module source code,
+- which files to modify to configure and compile the CPython project, and
+- which ``Makefile`` rules to invoke at the end.
+
+Updating the CPython project tree
+---------------------------------
+
+Usually, accelerator modules are added in the :cpy-file:`Modules` directory of
+the CPython project. If more than one file is needed for the extension module,
+it is more convenient to create a sub-directory in :cpy-file:`Modules`.
+
+In the simplest example where the extension module consists of one file, it may
+be placed in :cpy-file:`Modules` as ``Modules/_foomodule.c``. For a non-trivial
+example of the extension module :mod:`!_foo`, we consider the following working
+tree:
+
+- :ref:`Modules/_foo/_foomodule.c` --- the extension module implementation.
+- :ref:`Modules/_foo/helper.h` --- the extension helper declarations.
+- :ref:`Modules/_foo/helper.c` --- the extension helper implementations.
+
+By convention, the source file containing the extension module implementation
+is called ``<NAME>module.c``, where ``<NAME>`` is the name of the module that
+will be later imported (in our case :mod:`!_foo`). In addition, the directory
+containing the implementation should also be named similarly.
+
+.. code-block:: c
+   :caption: Modules/_foo/helper.h
+   :name: Modules/_foo/helper.h
+
+   #ifndef _FOO_HELPER_H
+   #define _FOO_HELPER_H
+
+   #include "Python.h"
+
+   typedef struct {
+       /* ... */
+   } foomodule_state;
+
+   static inline foomodule_state *
+   get_foomodule_state(PyObject *module)
+   {
+       void *state = PyModule_GetState(module);
+       assert(state != NULL);
+       return (foomodule_state *)state;
+   }
+
+   /* Helper used in Modules/_foo/_foomodule.c
+    * but implemented in Modules/_foo/helper.c.
+    */
+   extern PyObject *
+   _Py_greet_fast(void);
+
+   #endif // _FOO_HELPER_H
+
+.. tip::
+
+   Functions or data that do not need to be shared across different C source
+   files should be declared ``static`` to avoid exporting their symbols from
+   ``libpython``.
+
+   If symbols need to be exported, their names must start with ``Py`` or
+   ``_Py``. This can be verified by ``make smelly``. For more details,
+   please refer to the section on :ref:`Changing Python's C API <c-api>`.
+
+.. code-block:: c
+   :caption: Modules/_foo/helper.c
+   :name: Modules/_foo/helper.c
+
+   #include "helper.h"
+
+   PyObject *_Py_greet_fast(void) {
+       return PyUnicode_FromString("Hello World!");
+   }
+
+.. code-block:: c
+   :caption: Modules/_foo/_foomodule.c
+   :name: Modules/_foo/_foomodule.c
+
+   #include "helper.h"
+   #include "clinic/_foomodule.c.h"
+
+   /* Functions for the extension module's state */
+   static int
+   foomodule_exec(PyObject *module)
+   {
+       // imports, static attributes, exported classes, etc
+       return 0;
+   }
+
+   static int
+   foomodule_traverse(PyObject *m, visitproc visit, void *arg)
+   {
+       foomodule_state *st = get_foomodule_state(m);
+       // call Py_VISIT() on the state attributes
+       return 0;
+   }
+
+   static int
+   foomodule_clear(PyObject *m)
+   {
+       foomodule_state *st = get_foomodule_state(m);
+       // call Py_CLEAR() on the state attributes
+       return 0;
+   }
+
+   static void
+   foomodule_free(void *m) {
+       (void)foomodule_clear((PyObject *)m);
+   }
+
+   /* Implementation of publicly exported functions. */
+
+   /*[clinic input]
+   module foo
+   [clinic start generated code]*/
+   /*[clinic end generated code: output=... input=...]*/
+
+   /*[clinic input]
+   foo.greet -> object
+
+   [clinic start generated code]*/
+
+   static PyObject *
+   foo_greet_impl(PyObject *module)
+   /*[clinic end generated code: output=... input=...]*/
+   {
+       return _Py_greet_fast();
+   }
+
+   /* Exported module's data */
+
+   static PyMethodDef foomodule_methods[] = {
+       // macro in 'clinic/_foomodule.c.h' after running 'make clinic'
+       FOO_GREET_METHODDEF
+       {NULL, NULL}
+   };
+
+   static struct PyModuleDef_Slot foomodule_slots[] = {
+       // 'foomodule_exec' may be NULL if the state is trivial
+       {Py_mod_exec, foomodule_exec},
+       {Py_mod_multiple_interpreters, Py_MOD_PER_INTERPRETER_GIL_SUPPORTED},
+       {Py_mod_gil, Py_MOD_GIL_NOT_USED},
+       {0, NULL},
+   };
+
+   static struct PyModuleDef foomodule = {
+       PyModuleDef_HEAD_INIT,
+       .m_name = "_foo",
+       .m_doc = "some doc",              // or NULL if not needed
+       .m_size = sizeof(foomodule_state),
+       .m_methods = foomodule_methods,
+       .m_slots = foomodule_slots,
+       .m_traverse = foomodule_traverse, // or NULL if the state is trivial
+       .m_clear = foomodule_clear,       // or NULL if the state is trivial
+       .m_free = foomodule_free,         // or NULL if the state is trivial
+   };
+
+   PyMODINIT_FUNC
+   PyInit__foo(void)
+   {
+       return PyModuleDef_Init(&foomodule);
+   }
+
+.. tip::
+
+   Recall that the ``PyInit_<NAME>`` function must be suffixed by the
+   module name ``<NAME>`` used in import statements (here ``_foo``),
+   which usually coincides with :c:member:`PyModuleDef.m_name`.
+
+   Other identifiers such as those used in :ref:`Argument Clinic <clinic>`
+   inputs do not have such naming requirements.
+
+Configuring the CPython project
+-------------------------------
+
+Now that we have added our extension module to the CPython source tree,
+we need to update some configuration files in order to compile the CPython
+project on different platforms.
+
+Updating ``Modules/Setup.{bootstrap,stdlib}.in``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Depending on whether the extension module is required to get a functioning
+interpreter or not, we update :cpy-file:`Modules/Setup.bootstrap.in` or
+:cpy-file:`Modules/Setup.stdlib.in`. In the former case, the extension
+module is necessarily built as a built-in extension module.
+
+.. tip::
+
+   For accelerator modules, :cpy-file:`Modules/Setup.stdlib.in` should be
+   preferred over :cpy-file:`Modules/Setup.bootstrap.in`.
+
+For built-in extension modules, update :cpy-file:`Modules/Setup.bootstrap.in`
+by adding the following line after the ``*static*`` marker:
+
+.. code-block:: text
+   :caption: :cpy-file:`Modules/Setup.bootstrap.in`
+   :emphasize-lines: 3
+
+   *static*
+   ...
+   _foo _foo/_foomodule.c _foo/helper.c
+   ...
+
+The syntax is ``<NAME> <SOURCES>`` where ``<NAME>`` is the name of the
+module used in :keyword:`import` statements and ``<SOURCES>`` is the list
+of space-separated source files.
+
+For other extension modules, update :cpy-file:`Modules/Setup.stdlib.in`
+by adding the following line after the ``*@MODULE_BUILDTYPE@*`` marker
+but before the ``*shared*`` marker:
+
+.. code-block:: text
+   :caption: :cpy-file:`Modules/Setup.stdlib.in`
+   :emphasize-lines: 3
+
+   *@MODULE_BUILDTYPE@*
+   ...
+   @MODULE__FOO_TRUE@_foo _foo/_foomodule.c _foo/helper.c
+   ...
+   *shared*
+
+The ``@MODULE_<NAME_UPPER>_TRUE@<NAME>`` marker expects ``<NAME_UPPER>`` to
+be the upper-cased form of ``<NAME>``, where ``<NAME>`` has the same meaning
+as before (in our case, ``<NAME_UPPER>`` and ``<NAME>`` are ``_FOO`` and
+``_foo`` respectively). The marker is followed by the list of source files.
+
+If the extension module must be built as a *shared* module, put the
+``@MODULE__FOO_TRUE@_foo`` line after the ``*shared*`` marker:
+
+.. code-block:: text
+   :caption: :cpy-file:`Modules/Setup.stdlib.in`
+   :emphasize-lines: 4
+
+   ...
+   *shared*
+   ...
+   @MODULE__FOO_TRUE@_foo _foo/_foomodule.c _foo/helper.c
+
+Updating :cpy-file:`configure.ac`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. add section about configuration variable afterwards
+
+* Locate the ``SRCDIRS`` variable and add the following line:
+
+  .. code-block:: text
+     :caption: :cpy-file:`configure.ac`
+     :emphasize-lines: 4
+
+     AC_SUBST([SRCDIRS])
+     SRCDIRS="\
+     ...
+     Modules/_foo \
+     ..."
+
+  .. note::
+
+     This step is only needed when adding new source directories to
+     the CPython project.
+
+* Find the section containing ``PY_STDLIB_MOD`` and ``PY_STDLIB_MOD_SIMPLE``
+  usages and add the following line:
+
+  .. code-block:: text
+     :caption: :cpy-file:`configure.ac`
+     :emphasize-lines: 3
+
+     dnl always enabled extension modules
+     ...
+     PY_STDLIB_MOD_SIMPLE([_foo], [-I\$(srcdir)/Modules/_foo], [])
+     ...
+
+  The ``PY_STDLIB_MOD_SIMPLE`` macro takes as arguments:
+
+  * the module name ``<NAME>`` used in :keyword:`import` statements,
+  * the compiler flags (CFLAGS), and
+  * the linker flags (LDFLAGS).
+
+  If the extension module may not be enabled or supported depending on the
+  host configuration, use the ``PY_STDLIB_MOD`` macro instead, which takes
+  as arguments:
+
+  * the module name ``<NAME>`` used in :keyword:`import` statements,
+  * a boolean indicating whether the extension is **enabled** or not,
+  * a boolean indicating whether the extension is **supported** or not,
+  * the compiler flags (CFLAGS), and
+  * the linker flags (LDFLAGS).
+
+  For instance, enabling the :mod:`!_foo` extension on Linux platforms, but
+  only providing support for 32-bit architecture, is achieved as follows:
+
+  .. code-block:: text
+     :caption: :cpy-file:`configure.ac`
+     :emphasize-lines: 2, 3
+
+     PY_STDLIB_MOD([_foo],
+                   [test "$ac_sys_system" = "Linux"],
+                   [test "$ARCH_RUN_32BIT" = "true"],
+                   [-I\$(srcdir)/Modules/_foo], [])
+
+  More generally, the host's configuration status of the extension is
+  determined as follows:
+
+  +-----------+-----------------+----------+
+  | Enabled   | Supported       | Status   |
+  +===========+=================+==========+
+  | true      | true            | yes      |
+  +-----------+-----------------+----------+
+  | true      | false           | missing  |
+  +-----------+-----------------+----------+
+  | false     | true or false   | disabled |
+  +-----------+-----------------+----------+
+
+  The extension status is ``n/a`` if the extension is marked unavailable
+  by the ``PY_STDLIB_MOD_SET_NA`` macro. To mark an extension as unavailable,
+  find the usages of ``PY_STDLIB_MOD_SET_NA`` in :cpy-file:`configure.ac` and
+  add the following line:
+
+  .. code-block:: text
+     :caption: :cpy-file:`configure.ac`
+     :emphasize-lines: 4
+
+     dnl Modules that are not available on some platforms
+     AS_CASE([$ac_sys_system],
+       ...
+       [PLATFORM_NAME], [PY_STDLIB_MOD_SET_NA([_foo])],
+       ...
+     )
+
+.. tip::
+
+   Consider reading the comments and configurations for existing modules
+   in :cpy-file:`configure.ac` for guidance on adding new external build
+   dependencies for extension modules that need them.
+
+Updating :cpy-file:`Makefile.pre.in`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If needed, add the following line to the section for module dependencies:
+
+.. code-block:: text
+   :caption: :cpy-file:`Makefile.pre.in`
+   :emphasize-lines: 4
+
+   ##########################################################################
+   # Module dependencies and platform-specific files
+   ...
+   MODULE__FOO_DEPS=$(srcdir)/Modules/_foo/helper.h
+   ...
+
+The ``MODULE_<NAME_UPPER>_DEPS`` variable follows the same naming
+requirements as the ``@MODULE_<NAME_UPPER>_TRUE@<NAME>`` marker.
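+
+As a quick sanity check of the configuration steps above (a sketch; the
+``MODULE__FOO`` variable names assume our hypothetical :mod:`!_foo` module),
+the generated ``Makefile`` records the resulting state of each extension
+module once ``./configure`` has been run:
+
+.. code-block:: shell
+
+   ./configure
+   grep '^MODULE__FOO' Makefile
+   # expected: MODULE__FOO_STATE=yes (or missing, disabled, n/a)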
+
+Updating MSVC project files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+We describe the minimal steps for compiling on Windows using MSVC.
+
+* Update :cpy-file:`PC/config.c`:
+
+  .. code-block:: c
+     :caption: :cpy-file:`PC/config.c`
+     :emphasize-lines: 3, 8
+
+     ...
+     // add the entry point prototype
+     extern PyObject* PyInit__foo(void);
+     ...
+     // update the entry points table
+     struct _inittab _PyImport_Inittab[] = {
+         ...
+         {"_foo", PyInit__foo},
+         ...
+         {0, 0}
+     };
+     ...
+
+  Each item in ``_PyImport_Inittab`` consists of the module name to import,
+  here :mod:`!_foo`, with the corresponding ``PyInit_*`` function correctly
+  suffixed.
+
+* Update :cpy-file:`PCbuild/pythoncore.vcxproj`:
+
+  .. code-block:: xml
+     :caption: :cpy-file:`PCbuild/pythoncore.vcxproj`
+     :emphasize-lines: 4, 11-12
+
+     <!-- group with header files ..\Modules\<MODULE>.h -->
+     <ItemGroup>
+       ...
+       <ClInclude Include="..\Modules\_foo\helper.h" />
+       ...
+     </ItemGroup>
+
+     <!-- group with source files ..\Modules\<MODULE>.c -->
+     <ItemGroup>
+       ...
+       <ClCompile Include="..\Modules\_foo\_foomodule.c" />
+       <ClCompile Include="..\Modules\_foo\helper.c" />
+       ...
+     </ItemGroup>
+
+* Update :cpy-file:`PCbuild/pythoncore.vcxproj.filters`:
+
+  .. code-block:: xml
+     :caption: :cpy-file:`PCbuild/pythoncore.vcxproj.filters`
+     :emphasize-lines: 4-6, 13-18
+
+     <!-- group with header files ..\Modules\<MODULE>.h -->
+     <ItemGroup>
+       ...
+       <ClInclude Include="..\Modules\_foo\helper.h">
+         <Filter>Modules\_foo</Filter>
+       </ClInclude>
+       ...
+     </ItemGroup>
+
+     <!-- group with source files ..\Modules\<MODULE>.c -->
+     <ItemGroup>
+       ...
+       <ClCompile Include="..\Modules\_foo\_foomodule.c">
+         <Filter>Modules\_foo</Filter>
+       </ClCompile>
+       <ClCompile Include="..\Modules\_foo\helper.c">
+         <Filter>Modules\_foo</Filter>
+       </ClCompile>
+       ...
+     </ItemGroup>
+
+.. tip::
+
+   Header files use ``<ClInclude>`` tags, whereas
+   source files use ``<ClCompile>`` tags.
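+
+Once the project files are updated, a Windows build can then be driven from
+the command line as a sketch of the overall workflow (see
+:cpy-file:`PCbuild/readme.txt` for the full set of options accepted by the
+build script):
+
+.. code-block:: shell
+
+   PCbuild\build.bat -c Debug -p x64
+   PCbuild\amd64\python_d.exe -c "import _foo; print(_foo.greet())"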
+ + +Compiling the CPython project +----------------------------- + +Now that the configuration is in place, it remains to compile the project: + +.. code-block:: shell + + make regen-configure + ./configure + make regen-all + make regen-stdlib-module-names + make + +.. tip:: + + Use ``make -j`` to speed-up compilation by utilizing as many CPU cores + as possible or ``make -jN`` to allow at most *N* concurrent jobs. + +* ``make regen-configure`` updates the :cpy-file:`configure` script. + + The :cpy-file:`configure` script must be generated using a specific version + of ``autoconf``. To that end, the :cpy-file:`Tools/build/regen-configure.sh` + script which the ``regen-configure`` rule is based on either requires Docker + or Podman, the latter being assumed by default. + + .. tip:: + + We recommend installing `Podman <https://podman.io/docs/installation>`_ + instead of Docker since the former does not require a background service + and avoids creating files owned by the ``root`` user in some cases. + +* ``make regen-all`` is responsible for regenerating header files and + invoking other scripts, such as :ref:`Argument Clinic <clinic>`. + Execute this rule if you do not know which files should be updated. + +* ``make regen-stdlib-module-names`` updates the standard module names, making + :mod:`!_foo` discoverable and importable via ``import _foo``. + +* The final ``make`` step is generally not needed since the previous ``make`` + invokations may completely rebuild the project, but it could be needed in + some specific cases. + +Troubleshooting +--------------- + +This section addresses common issues that you may face when following +this example of adding an extension module. + +No rule to make target ``regen-configure`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This usually happens after running ``make distclean`` (which removes +the ``Makefile``). The solution is to regenerate the :cpy-file:`configure` +script as follows: + +.. 
code-block:: shell + + ./configure # for creating the 'Makefile' file + make regen-configure # for updating the 'configure' script + ./configure # for updating the 'Makefile' file + +If missing, the :cpy-file:`configure` script can be regenerated +by executing :cpy-file:`Tools/build/regen-configure.sh`: + +.. code-block:: shell + + ./Tools/build/regen-configure.sh # create an up-to-date 'configure' + ./configure # create an up-to-date 'Makefile' + +``make regen-configure`` and missing permissions with Docker +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If Docker complains about missing permissions, this Stack Overflow post +could be useful in solving the issue: `How to fix docker: permission denied +<https://stackoverflow.com/q/48957195/9579194>`_. Alternatively, you may try +using `Podman <https://podman.io/docs/installation>`_. + +Missing ``Py_BUILD_CORE`` define when using internal headers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +By default, the CPython :ref:`Stable ABI <stable-abi>` is exposed via +:code:`#include "Python.h"`. In some cases, this may be insufficient +and internal headers from :cpy-file:`Include/internal` are needed; +in particular, those headers require the :c:macro:`!Py_BUILD_CORE` +macro to be defined. + +To that end, one should define the :c:macro:`!Py_BUILD_CORE_BUILTIN` +or the :c:macro:`!Py_BUILD_CORE_MODULE` macro depending on whether the +extension module is built-in or shared. Using either of the two macros +implies :c:macro:`!Py_BUILD_CORE` and gives access to CPython internals: + +.. code-block:: c + :caption: Definition of :c:macro:`!Py_BUILD_CORE_BUILTIN` + + #ifndef Py_BUILD_CORE_MODULE + # define Py_BUILD_CORE_BUILTIN 1 + #endif + +.. 
code-block:: c
+   :caption: Definition of :c:macro:`!Py_BUILD_CORE_MODULE`
+
+   #ifndef Py_BUILD_CORE_BUILTIN
+   #  define Py_BUILD_CORE_MODULE 1
+   #endif
+
+Tips
+----
+
+In this section, we give some tips for improving the quality of
+extension modules meant to be included in the standard library.
+
+Restricting to the Limited API
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In order for non-CPython implementations to benefit from new extension modules,
+it is recommended to use the :ref:`Limited API <limited-c-api>`. Instead of
+exposing the entire Stable ABI, define the :c:macro:`Py_LIMITED_API` macro
+*before* the :code:`#include "Python.h"` directive:
+
+.. code-block:: c
+   :caption: Using the 3.13 Limited API.
+   :emphasize-lines: 3, 6
+
+   #include "pyconfig.h"  // Py_GIL_DISABLED
+   #ifndef Py_GIL_DISABLED
+   #  define Py_LIMITED_API 0x030d0000
+   #endif
+
+   #include "Python.h"
+
+This makes the extension module friendly to non-CPython implementations by
+removing its dependencies on CPython internals.
diff --git a/developer-workflow/grammar.rst b/developer-workflow/grammar.rst
index ee6bdbaa40..d574dfed7d 100644
--- a/developer-workflow/grammar.rst
+++ b/developer-workflow/grammar.rst
@@ -4,67 +4,5 @@ Changing CPython's grammar
 ==========================
 
-Abstract
-========
-
-There's more to changing Python's grammar than editing
-:cpy-file:`Grammar/python.gram`. Here's a checklist.
-
-.. note::
-   These instructions are for Python 3.9 and beyond. Earlier
-   versions use a different parser technology. You probably shouldn't
-   try to change the grammar of earlier Python versions, but if you
-   really want to, use GitHub to track down the earlier version of this
-   file in the devguide.
-
-For more information on how to use the new parser, check the
-:ref:`section on how to use CPython's parser <parser>`.
-
-Checklist
-=========
-
-Note: sometimes things mysteriously don't work. Before giving up, try ``make clean``.
- -* :cpy-file:`Grammar/python.gram`: The grammar, with actions that build AST nodes. - After changing it, run ``make regen-pegen`` (or ``build.bat --regen`` on Windows), - to regenerate :cpy-file:`Parser/parser.c`. - (This runs Python's parser generator, :cpy-file:`Tools/peg_generator`). - -* :cpy-file:`Grammar/Tokens` is a place for adding new token types. After - changing it, run ``make regen-token`` to regenerate - :cpy-file:`Include/internal/pycore_token.h`, :cpy-file:`Parser/token.c`, - :cpy-file:`Lib/token.py` and :cpy-file:`Doc/library/token-list.inc`. - If you change both ``python.gram`` and ``Tokens``, - run ``make regen-token`` before ``make regen-pegen``. - On Windows, ``build.bat --regen`` will regenerate both at the same time. - -* :cpy-file:`Parser/Python.asdl` may need changes to match the grammar. - Then run ``make regen-ast`` to regenerate - :cpy-file:`Include/internal/pycore_ast.h` and :cpy-file:`Python/Python-ast.c`. - -* :cpy-file:`Parser/lexer/` contains the tokenization code. - This is where you would add a new type of comment or string literal, for example. - -* :cpy-file:`Python/ast.c` will need changes to validate AST objects - involved with the grammar change. - -* :cpy-file:`Python/ast_unparse.c` will need changes to unparse AST - involved with the grammar change ("unparsing" is used to turn annotations - into strings per :pep:`563`). - -* The :ref:`compiler` has its own page. - -* ``_Unparser`` in the :cpy-file:`Lib/ast.py` file may need changes - to accommodate any modifications in the AST nodes. - -* :cpy-file:`Doc/library/ast.rst` may need to be updated to reflect changes - to AST nodes. - -* Add some usage of your new syntax to ``test_grammar.py``. - -* Certain changes may require tweaks to the library module :mod:`pyclbr`. - -* :cpy-file:`Lib/tokenize.py` needs changes to match changes to the tokenizer. - -* Documentation must be written! 
Specifically, one or more of the pages in - :cpy-file:`Doc/reference/` will need to be updated. +This document is now part of the +`CPython Internals Docs <https://github.com/python/cpython/blob/main/InternalDocs/changing_grammar.md>`_. diff --git a/developer-workflow/index.rst b/developer-workflow/index.rst index 7b069021b6..e73927f1dd 100644 --- a/developer-workflow/index.rst +++ b/developer-workflow/index.rst @@ -1,3 +1,5 @@ +.. _dev-workflow: + ==================== Development workflow ==================== @@ -14,3 +16,4 @@ Development workflow grammar porting sbom + psrt diff --git a/developer-workflow/lang-changes.rst b/developer-workflow/lang-changes.rst index 70ecd679d9..52aabb15dd 100644 --- a/developer-workflow/lang-changes.rst +++ b/developer-workflow/lang-changes.rst @@ -45,7 +45,7 @@ The `Ideas Discourse category`_ is specifically intended for discussion of new features and language changes. Please don't be disappointed if your idea isn't met with universal approval: as the :pep:`long list of Withdrawn and Rejected PEPs -<0#abandoned-withdrawn-and-rejected-peps>` +<0#rejected-superseded-and-withdrawn-peps>` in the :pep:`PEP Index <0>` attests, and as befits a reasonably mature programming language, getting significant changes into Python isn't a simple task. diff --git a/developer-workflow/porting.rst b/developer-workflow/porting.rst index 26756bc8fb..f308e6c14b 100644 --- a/developer-workflow/porting.rst +++ b/developer-workflow/porting.rst @@ -13,11 +13,11 @@ which it has already been ported; preferably Unix, but Windows will do, too. The build process for Python, in particular the ``Makefile`` in the source distribution, will give you a hint on which files to compile for Python. Not all source files are relevant: some are platform-specific, -and others are only used in emergencies (e.g. ``getopt.c``). +and others are only used in emergencies (for example, ``getopt.c``). 
It is not recommended to start porting Python without at least a medium-level
-understanding of your target platform; i.e. how it is generally used, how to
-write platform-specific apps, etc. Also, some Python knowledge is required, or
+understanding of your target platform; how it is generally used, how to
+write platform-specific apps, and so on. Also, some Python knowledge is required, or
 you will be unable to verify that your port is working correctly.
 
 You will need a ``pyconfig.h`` file tailored for your platform. You can
diff --git a/developer-workflow/psrt.rst b/developer-workflow/psrt.rst
new file mode 100644
index 0000000000..f469f68d12
--- /dev/null
+++ b/developer-workflow/psrt.rst
@@ -0,0 +1,160 @@
+Python Security Response Team (PSRT)
+====================================
+
+The Python Security Response Team (PSRT) is responsible for handling
+vulnerability reports for CPython and pip.
+
+Vulnerability report triage
+---------------------------
+
+Vulnerability reports are sent either to the long-standing
+``security@python.org`` mailing list or through the private
+vulnerability reporting feature of GitHub Security Advisories (GHSA).
+
+For reports sent to ``security@python.org``, a PSRT admin
+will triage the report and, if the report seems plausible
+(that is, not spam and for the correct project), will reply with
+instructions on how to report the vulnerability on GitHub.
+
+If the reporter doesn't want to use GitHub's Security Advisories feature,
+then the PSRT admins can create a draft report on behalf of the reporter.
+
+Coordinating a vulnerability report
+-----------------------------------
+
+Each report will have a member of the PSRT assigned as the "coordinator".
+The coordinator will be responsible for following the process below and
+will be publicly credited on vulnerability records post-publication.
+
+If a coordinator can't complete the process for any reason (time obligation,
+vacation, etc.)
they must find a replacement coordinator in the PSRT
+and reassign the vulnerability report appropriately.
+
+Coordinators are expected to collaborate with other PSRT members and core developers
+when needed for guidance on whether the report is an actual vulnerability,
+severity, advisory text, and fixes.
+
+**The vulnerability coordination process is:**
+
+* The coordinator will determine whether the report constitutes a vulnerability. If the report isn't a vulnerability,
+  the reporter should be notified appropriately. Close the GHSA report; it can be reopened if
+  sufficient evidence is later obtained that the report is a vulnerability.
+
+* After a vulnerability report is accepted, a Common Vulnerabilities and Exposures (CVE) ID must be assigned. If this is not done
+  automatically, then a CVE ID can be obtained by the coordinator sending an email to ``cna@python.org``.
+  No details about the vulnerability report need to be shared with the PSF CVE Numbering Authority (CNA) for a CVE ID to be reserved.
+
+* If the report is a vulnerability, the coordinator will determine the severity of the vulnerability. Severity is one of:
+  **Low**, **Medium**, **High**, and **Critical**. Coordinators can use their knowledge of the code, how the code is likely used,
+  or another mechanism like the Common Vulnerability Scoring System (CVSS) for determining a severity. Add this information to the GitHub Security Advisory.
+
+* Once a CVE ID is assigned, the coordinator will share the acceptance and CVE ID with the reporter.
+  Use this CVE ID for referencing the vulnerability. The coordinator will ask the reporter
+  if the reporter would like to be credited publicly for the report and, if so, how they would like to be credited.
+  Add this information to the GitHub Security Advisory.
+
+* The coordinator authors the vulnerability advisory text.
The advisory must include the following information:
+
+  * Title should be a brief description of the vulnerability and affected component
+    (for example, "Buffer over-read in SSLContext.set_npn_protocols()")
+
+  * Short description of the vulnerability, impact, and the conditions where the affected component is vulnerable, if applicable.
+
+  * Affected versions. This could be "all versions", but if the vulnerability exists in a new feature
+    or removed feature then this could be different. Include versions that are end-of-life in this calculation
+    (for example, "Python 3.9 and earlier", "Python 3.10 and later", "all versions of Python").
+
+  * Affected components and APIs. The module, function, class, or method must be specified so users can
+    search their codebase for usage. For issues affecting the entire project, this can be omitted.
+
+  * Mitigations for the vulnerability beyond upgrading to a fixed version, if applicable.
+
+  This can all be done within the GitHub Security Advisory UI for easier collaboration between reporter and coordinator.
+
+* The coordinator determines the fix approach and who will provide a fix.
+  Some reporters are willing to provide or collaborate to create a fix;
+  otherwise, relevant core developers can be invited to collaborate by
+  the coordinator.
+
+  * For **Low** and **Medium** severity vulnerabilities, it is acceptable
+    to develop a fix in public.
+    The pull request must be marked with the ``security`` and ``release-blocker``
+    labels so that a release is not created without including the fix.
+
+  * For **High** and **Critical** severity vulnerabilities, the fix must be
+    developed privately using GitHub Security Advisories' "Private Forks" feature.
+    Core developers can be added to the GitHub Security Advisory via "collaborators"
+    to work on the fix together. Once a fix is approved privately and tested,
+    a public issue and pull request can be created with
+    the ``security`` and ``release-blocker`` labels.
+
+* Once the pull request is merged, the advisory can be published. The coordinator will send the advisory by email
+  to ``security-announce@python.org`` using the template below. Backport labels must be added as appropriate.
+  After the advisory is published, a CVE record can be created.
+
+Template responses
+------------------
+
+These template responses should be used as guidance for messaging
+at various points in the process above. They are not required to be sent as-is;
+please feel free to adapt them as needed for the current context.
+
+**Directing to GitHub Security Advisories:**
+
+.. highlight:: none
+
+::
+
+   Thanks for submitting this report.
+   We use GitHub Security Advisories for triaging vulnerability reports;
+   are you able to submit your report directly to GitHub?
+
+   https://github.com/python/cpython/security/advisories/new
+
+   If you're unable to submit a report to GitHub (due to not having a GitHub
+   account or something else), let me know and I will create a GitHub Security
+   Advisory on your behalf, although you won't be able to participate directly
+   in discussions.
+
+**Rejecting a vulnerability report:**
+
+::
+
+   Thanks for your report. We've determined that the report doesn't constitute
+   a vulnerability. Let us know if you disagree with this determination.
+   If you are interested in working on this further, you can optionally open a
+   public issue on GitHub.
+
+**Accepting a vulnerability report:**
+
+::
+
+   Thanks for your report. We've determined that the report
+   is a vulnerability. We've assigned {CVE-YYYY-XXXX} and determined
+   a severity of {Low,Medium,High,Critical}. Let us know if you disagree
+   with the determined severity.
+
+   If you would like to be publicly credited for this vulnerability as the
+   reporter, please indicate that, along with how you would like to be
+   credited (name or organization).
+
+   Please keep this vulnerability report private until we've published
+   an advisory to ``security-announce@python.org``.
+ +**Advisory email:** + +:: + + Title: [{CVE-YYYY-XXXX}] {title} + + There is a {LOW, MEDIUM, HIGH, CRITICAL} severity vulnerability + affecting {project}. + + {description} + + Please see the linked CVE ID for the latest information on + affected versions: + + * https://www.cve.org/CVERecord?id={CVE-YYYY-XXXX} + * {pull request URL} diff --git a/developer-workflow/stdlib.rst b/developer-workflow/stdlib.rst index ae0f7208ee..60112d6d3e 100644 --- a/developer-workflow/stdlib.rst +++ b/developer-workflow/stdlib.rst @@ -30,13 +30,10 @@ You have a several options for this: * Open a new thread in the `Ideas Discourse category`_ to gather feedback directly from the Python core developers and community. * Write a blog post about the code, which may also help gather useful feedback. -* Post it to the `Python Cookbook`_. - Based on feedback and reviews of the recipe, - you can see if others find the functionality as useful as you do. If you have found general acceptance and usefulness for your code from people, you can open an issue on the `issue tracker`_ with the code attached as a -:ref:`pull request <patch>`. If possible, also submit a +:ref:`pull request <pullrequest>`. If possible, also submit a :ref:`contributor agreement <contributor_agreement>`. If a core developer decides that your code would be useful to the general @@ -47,7 +44,6 @@ for it you at least can know that others will come across it who may find it useful. .. _Ideas Discourse category: https://discuss.python.org/c/ideas/6 -.. _Python Cookbook: https://code.activestate.com/recipes/langs/python/ Adding a new module @@ -91,7 +87,7 @@ In order for a module to even be considered for inclusion into the stdlib, a couple of requirements must be met. The most basic is that the code must meet -:ref:`standard patch requirements <patch>`. For code that has +:ref:`standard pull request requirements <pullrequest>`. 
For code that has been developed outside the stdlib typically this means making sure the coding style guides are followed and that the proper tests have been written. @@ -108,11 +104,11 @@ year, a module needs to have established itself as (one of) the top choices by the community for solving the problem the module is intended for. The development of the module must move into Python's -infrastructure (i.e., the module is no longer directly maintained outside of +infrastructure (that is, the module is no longer directly maintained outside of Python). This prevents a divergence between the code that is included in the stdlib and that which is released outside the stdlib (typically done to provide the module to older versions of Python). It also removes the burden of forcing -core developers to have to redirect bug reports or patches to an external issue +core developers to have to redirect bug reports or changes to an external issue tracker and :abbr:`VCS (version control system)`. Someone involved with the development of the diff --git a/development-tools/clang.rst b/development-tools/clang.rst index 14040dd8bc..f06834731a 100644 --- a/development-tools/clang.rst +++ b/development-tools/clang.rst @@ -7,9 +7,7 @@ Dynamic analysis with Clang .. highlight:: bash This document describes how to use Clang to perform analysis on Python and its -libraries. In addition to performing the analysis, the document will cover -downloading, building and installing the latest Clang/LLVM combination (which -is currently 3.4). +libraries. This document does not cover interpreting the findings. For a discussion of interpreting results, see Marshall Clow's `Testing libc++ with @@ -17,6 +15,13 @@ interpreting results, see Marshall Clow's `Testing libc++ with blog posting is a detailed examinations of issues uncovered by Clang in ``libc++``. +The document focuses on Clang, although most techniques should generally apply +to GCC's sanitizers as well. 
+ +The instructions were tested on Linux, but they should work on macOS as well. +Instructions for Windows are incomplete. + + What is Clang? ============== @@ -49,177 +54,95 @@ A complete list of sanitizers can be found at `Controlling Code Generation Clang and its sanitizers have strengths (and weaknesses). Its just one tool in the war chest to uncovering bugs and improving code quality. Clang should be -used to compliment other methods, including Code Reviews, Valgrind, Coverity, +used to complement other methods, including Code Reviews, `Valgrind`_, etc. Clang/LLVM setup ================ -This portion of the document covers downloading, building and installing Clang -and LLVM. There are three components to download and build. They are the LLVM -compiler, the compiler front end and the compiler runtime library. - -In preparation you should create a scratch directory. Also ensure you are using -Python 2 and not Python 3. Python 3 will cause the build to fail. - -Download, build and install ---------------------------- - -Perform the following to download, build and install the Clang/LLVM 3.4. :: - - # Download - wget https://llvm.org/releases/3.4/llvm-3.4.src.tar.gz - wget https://llvm.org/releases/3.4/clang-3.4.src.tar.gz - wget https://llvm.org/releases/3.4/compiler-rt-3.4.src.tar.gz - - # LLVM - tar xvf llvm-3.4.src.tar.gz - cd llvm-3.4/tools +Pre-built Clang builds are available for most platforms: - # Clang Front End - tar xvf ../../clang-3.4.src.tar.gz - mv clang-3.4 clang +- On macOS, Clang is the default compiler. +- For mainstream Linux distros, you can install a ``clang`` package. + In some cases, you also need to install ``llvm`` separately, otherwise + some tools are not available. +- On Windows, the installer for Visual Studio (not Code) + includes the "C++ clang tools for windows" feature. 
- # Compiler RT - cd ../projects - tar xvf ../../compiler-rt-3.4.src.tar.gz - mv compiler-rt-3.4/ compiler-rt +You can also build ``clang`` from source; refer to +`the clang documentation <https://clang.llvm.org/>`_ for details. - # Build - cd .. - ./configure --enable-optimized --prefix=/usr/local - make -j4 - sudo make install - -.. note:: - - If you receive an error ``'LibraryDependencies.inc' file not found``, then - ensure you are utilizing Python 2 and not Python 3. If you encounter the - error after switching to Python 2, then delete everything and start over. - -After ``make install`` executes, the compilers will be installed in -``/usr/local/bin`` and the various libraries will be installed in -``/usr/local/lib/clang/3.4/lib/linux/``: - -.. code-block:: console +The installer does not install all the components needed on occasion. For +example, you might want to run a ``scan-build`` or examine the results with +``scan-view``. If this is your case, you can build Clang from source and +copy tools from ``tools/clang/tools`` to a directory on your ``PATH``. - $ ls /usr/local/lib/clang/3.4/lib/linux/ - libclang_rt.asan-x86_64.a libclang_rt.profile-x86_64.a - libclang_rt.dfsan-x86_64.a libclang_rt.san-x86_64.a - libclang_rt.full-x86_64.a libclang_rt.tsan-x86_64.a - libclang_rt.lsan-x86_64.a libclang_rt.ubsan_cxx-x86_64.a - libclang_rt.msan-x86_64.a libclang_rt.ubsan-x86_64.a +Another reason to build from source is to get the latest version of Clang/LLVM, +if your platform's channels don't provide it yet. +Newer versions of Clang/LLVM introduce new sanitizer checks. -On macOS, the libraries are installed in -``/usr/local/lib/clang/3.3/lib/darwin/``: -.. 
code-block:: console
+Python build setup
+==================

-   $ ls /usr/local/lib/clang/3.3/lib/darwin/
-   libclang_rt.10.4.a                  libclang_rt.ios.a
-   libclang_rt.asan_osx.a              libclang_rt.osx.a
-   libclang_rt.asan_osx_dynamic.dylib  libclang_rt.profile_ios.a
-   libclang_rt.cc_kext.a               libclang_rt.profile_osx.a
-   libclang_rt.cc_kext_ios5.a          libclang_rt.ubsan_osx.a
-   libclang_rt.eprintf.a

+This portion of the document covers invoking Clang and LLVM with the options
+required so the sanitizers analyze Python under its test suite.

-.. note::

+Set the compiler to Clang, in case it's not the default::

-   You should never have to add the libraries to a project. Clang will handle
-   it for you. If you find you cannot pass the ``-fsanitize=XXX`` flag through
-   ``make``'s implicit variables (``CFLAGS``, ``CXXFLAGS``, ``CC``,
-   ``CXXFLAGS``, ``LDFLAGS``) during ``configure``, then you should modify the
-   makefile after configuring to ensure the flag is passed through the
-   compiler.

+   export CC="clang"

-The installer does not install all the components needed on occasion. For
-example, you might want to run a ``scan-build`` or examine the results with
-``scan-view``. You can copy the components by hand with: ::

+If you want to use additional sanitizer options (found in Clang documentation),
+add them to the ``CFLAGS`` variable.
+For example, you may want the checked process to exit after the first failure::

-   sudo mkdir /usr/local/bin/scan-build
-   sudo cp -r llvm-3.4/tools/clang/tools/scan-build /usr/local/bin
-   sudo mkdir /usr/local/bin/scan-view
-   sudo cp -r llvm-3.4/tools/clang/tools/scan-view /usr/local/bin

+   export CFLAGS="-fno-sanitize-recover"

-.. note::
+
+Then, run ``./configure`` with the relevant flags:

-   Because the installer does not install all the components needed on
-   occasion, you should not delete the scratch directory until you are sure
-   things work as expected. If a library is missing, then you should search for
-   it in the Clang/LLVM build directory.
+* ASan: ``--with-address-sanitizer --without-pymalloc`` +* UBsan: ``--with-undefined-behavior-sanitizer`` -Python build setup -================== +It is OK to specify both sanitizers. -This portion of the document covers invoking Clang and LLVM with the options -required so the sanitizers analyze Python with under its test suite. Two -checkers are used - ASan and UBSan. +After that, run ``make`` and ``make test`` as usual. +Note that ``make`` itself may fail with a sanitizer failure, +since the just-compiled Python runs during later stages of the build. -Because the sanitizers are runtime checkers, its best to have as many positive -and negative self tests as possible. You can never have enough self tests. -The general idea is to compile and link with the sanitizer flags. At link time, -Clang will include the needed runtime libraries. However, you can't use -``CFLAGS`` and ``CXXFLAGS`` to pass the options through the compiler to the -linker because the makefile rules for ``BUILDPYTHON``, ``_testembed`` and -``_freeze_importlib`` don't use the implicit variables. +Build setup for enabling sanitizers for all code +------------------------------------------------ -As a workaround to the absence of flags to the linker, you can pass the -sanitizer options by way of the compilers - ``CC`` and ``CXX``. Passing the -flags though the compiler is used below, but passing them through ``LDFLAGS`` is -also reported to work. +Some parts of Python (for example, ``_testembed``, ``_freeze_importlib``, +``test_cppext``) may not use the variables set by ``configure``, +and with the above settings they'll be compiled without sanitization. -Building Python ---------------- +As a workaround, you can pass the sanitizer options by way of the *compilers*, +``CC`` (for C) and ``CXX`` (for C++). This is used below. +Passing the options through ``LDFLAGS`` is also reported to work. -To begin, export the variables of interest with the desired sanitizers. 
Its OK -to specify both sanitizers: :: +For ASan, use:: # ASan - export CC="/usr/local/bin/clang -fsanitize=address" - export CXX="/usr/local/bin/clang++ -fsanitize=address -fno-sanitize=vptr" + export CC="clang -fsanitize=address" + export CXX="clang++ -fsanitize=address -fno-sanitize=vptr" -Or: :: +And for UBSan:: # UBSan - export CC="/usr/local/bin/clang -fsanitize=undefined" - export CXX="/usr/local/bin/clang++ -fsanitize=undefined -fno-sanitize=vptr" - -The ``-fno-sanitize=vptr`` removes vtable checks that are part of UBSan from C++ -projects due to noise. Its not needed with Python, but you will likely need it -for other C++ projects. - -After exporting ``CC`` and ``CXX``, ``configure`` as normal: - -.. code-block:: console - - $ ./configure - checking build system type... x86_64-unknown-linux-gnu - checking host system type... x86_64-unknown-linux-gnu - checking for --enable-universalsdk... no - checking for --with-universal-archs... 32-bit - checking MACHDEP... linux - checking for --without-gcc... no - checking for gcc... /usr/local/bin/clang -fsanitize=undefined - checking whether the C compiler works... yes - ... + export CC="clang -fsanitize=undefined" + export CXX="clang++ -fsanitize=undefined -fno-sanitize=vptr" -Next is a standard ``make`` (formatting added for clarity): +It's OK to specify both sanitizers. -.. code-block:: console +After this, run ``./configure``, ``make`` and ``make test`` as usual. - $ make - /usr/local/bin/clang -fsanitize=undefined -c -Wno-unused-result - -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. - -IInclude -I./Include -DPy_BUILD_CORE -o Modules/python.o - ./Modules/python.c - /usr/local/bin/clang -fsanitize=undefined -c -Wno-unused-result - -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. - -IInclude -I./Include -DPy_BUILD_CORE -o Parser/acceler.o - Parser/acceler.c - ... 
-Finally is ``make test`` (formatting added for clarity): +Analyzing the output +==================== + +Sanitizer failures will make the process fail and output a diagnostic, +for example: .. code-block:: none @@ -233,8 +156,12 @@ Finally is ``make test`` (formatting added for clarity): ^ ... -If you are using the address sanitizer, its important to pipe the output through -``asan_symbolize.py`` to get a good trace. For example, from Issue 20953 during +If you are using the address sanitizer, an additional tool is needed to +get good traces. Usually, this happens automatically through the +``llvm-symbolizer`` tool. If this tool is not installed on your ``PATH``, +you can set ``ASAN_SYMBOLIZER_PATH`` to the location of the tool, +or pipe test output through ``asan_symbolize.py`` script from the +Clang distribution. For example, from Issue 20953 during compile (formatting added for clarity): .. code-block:: none @@ -302,25 +229,25 @@ compile (formatting added for clarity): .. note:: - ``asan_symbolize.py`` is supposed to be installed during ``make install``. - If its not installed, then look in the Clang/LLVM build directory for it and - copy it to ``/usr/local/bin``. + If ``asan_symbolize.py`` is not installed, build Clang from source, then + look in the Clang/LLVM build directory for it and use it directly or copy + it to a directory on ``PATH``. -Blacklisting (ignoring) findings --------------------------------- +Ignoring findings +----------------- .. highlight:: none Clang allows you to alter the behavior of sanitizer tools for certain -source-level by providing a special blacklist file at compile-time. The -blacklist is needed because it reports every instance of an issue, even if the +source-level by providing a special ignorelist file at compile-time. The +ignorelist is needed because it reports every instance of an issue, even if the issue is reported 10's of thousands of time in un-managed library code. 
-You specify the blacklist with ``-fsanitize-blacklist=XXX``. For example:: +You specify the ignorelist with ``-fsanitize-ignorelist=XXX``. For example:: - -fsanitize-blacklist=my_blacklist.txt + -fsanitize-ignorelist=my_ignorelist.txt -``my_blacklist.txt`` would then contain entries such as the following. The entry +``my_ignorelist.txt`` would then contain entries such as the following. The entry will ignore a bug in ``libc++``'s ``ios`` formatting functions:: fun:_Ios_Fmtflags @@ -342,7 +269,7 @@ findings:: ... One of the function of interest is ``audioop_getsample_impl`` (flagged at line -422), and the blacklist entry would include:: +422), and the ignorelist entry would include:: fun:audioop_getsample_imp @@ -350,7 +277,9 @@ Or, you could ignore the entire file with:: src:Modules/audioop.c -Unfortunately, you won't know what to blacklist until you run the sanitizer. +Unfortunately, you won't know what to ignorelist until you run the sanitizer. The documentation is available at `Sanitizer special case list <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html>`_. + +.. _Valgrind: https://github.com/python/cpython/blob/main/Misc/README.valgrind diff --git a/development-tools/clinic.rst b/development-tools/clinic.rst index 910de404ac..642f40dce9 100644 --- a/development-tools/clinic.rst +++ b/development-tools/clinic.rst @@ -213,7 +213,7 @@ Classes for extending Argument Clinic The C type to use for this variable. :attr:`!type` should be a Python string specifying the type, - e.g. ``'int'``. + for example, ``'int'``. If this is a pointer type, the type string should end with ``' *'``. .. attribute:: default diff --git a/development-tools/coverity.rst b/development-tools/coverity.rst deleted file mode 100644 index 7c165a3126..0000000000 --- a/development-tools/coverity.rst +++ /dev/null @@ -1,141 +0,0 @@ -.. _coverity: - -============= -Coverity Scan -============= - -Coverity Scan is a free service for static code analysis of Open Source -projects. 
It is based on Coverity's commercial product and is able to analyze -C, C++ and Java code. - -Coverity's static code analysis doesn't run the code. Instead of that it uses -abstract interpretation to gain information about the code's control flow and -data flow. It's able to follow all possible code paths that a program may -take. For example the analyzer understands that ``malloc()`` returns a memory -that must be freed with ``free()`` later. It follows all branches and function -calls to see if all possible combinations free the memory. The analyzer is -able to detect all sorts of issues like resource leaks (memory, file -descriptors), NULL dereferencing, use after free, unchecked return values, -dead code, buffer overflows, integer overflows, uninitialized variables, and -many more. - - -Access to analysis reports -========================== - -The results are available on the `Coverity Scan`_ website. In order to -access the results you have to create an account yourself. Then go to -*Projects using Scan* and add yourself to the Python project. New members must -be approved by an admin (see `Contact`_). - -Access is restricted to Python core developers only. Other individuals may be -given access at our own discretion, too. Every now and then Coverity detects a -critical issue in Python's code -- new analyzers may even find new bugs in -mature code. We don't want to disclose issues prematurely. - - -Building and uploading analysis -=============================== - -The process is automated. A script checks out the code, runs -``cov-build`` and uploads the latest analysis to Coverity. Since Coverity has -limited the maximum number of builds per week Python is analyzed every second -day. The build runs on a dedicated virtual machine on PSF's infrastructure at -OSU Open Source Labs. The process is maintained by Christian Heimes (see -`Contact`_). At present only the tip is analyzed with the 64bit Linux tools. 
- - -Known limitations -================= - -Some aspects of Python's C code are not yet understood by Coverity. - -False positives ---------------- - -``Py_BuildValue("N", PyObject*)`` - Coverity doesn't understand that ``N`` format char passes the object along - without touching its reference count. On this ground the analyzer detects - a resource leak. CID 719685 - -``PyLong_FromLong()`` for negative values - Coverity claims that ``PyLong_FromLong()`` and other ``PyLong_From*()`` - functions cannot handle a negative value because the value might be used as - an array index in ``get_small_int()``. CID 486783 - -``PyLong_FromLong()`` for n in [-5 ... +255] - For integers in the range of Python's small int cache the ``PyLong_From*()`` - function can never fail and never returns NULL. CID 1058291 - -``PyArg_ParseTupleAndKeywords(args, kwargs, "s#", &data, &length)`` - Some functions use the format char combination such as ``s#``, ``u#`` or - ``z#`` to get data and length of a character array. Coverity doesn't - recognize the relation between data and length. Sometimes it detects a buffer - overflow if data is written to a fixed size buffer although - ``length <= sizeof(buffer)``. CID 486613 - -``path_converter()`` dereferencing after null check - The ``path_converter()`` function in ``posixmodule.c`` makes sure that - either ``path_t.narrow`` or ``path_t.wide`` is filled unless - ``path_t.nullable`` is explicitly enabled. CID 719648 - - -Modeling -======== - -Modeling is explained in the *Coverity Help Center* which is available in -the help menu of `Coverity Connect`_. `coverity_model.c`_ contains a copy of -Python's modeling file for Coverity. Please keep the copy in sync with the -model file in *Analysis Settings* of `Coverity Scan`_. 
- - -Workflow -======== - -False positive and intentional issues -------------------------------------- - -If the problem is listed under `Known limitations`_ then please set the -classification to either "False positive" or "Intentional", the action to -"Ignore", owner to your own account and add a comment why the issue -is considered false positive or intentional. - -If you think it's a new false positive or intentional then please contact an -admin. The first step should be an updated to Python's `Modeling`_ file. - - -Positive issues ---------------- - -You should always create an issue unless it's really a trivial case. Please -add the full url to the ticket under *Ext. Reference* and add the CID -(Coverity ID) to both the ticket and the checkin message. It makes it much -easier to understand the relation between tickets, fixes and Coverity issues. - - -Contact -======= - -Please include both Brett and Christian in any mail regarding Coverity. Mails -to Coverity should go through Brett or Christian, too. - -Christian Heimes <christian (at) python (dot) org> - admin, maintainer of build machine, intermediary between Python and Coverity - -Brett Cannon <brett (at) python (dot) org> - co-admin - -Dakshesh Vyas <scan-admin@coverity.com> - Technical Manager - Coverity Scan - - -.. seealso:: - - `Coverity Scan FAQ <https://scan.coverity.com/faq/>`_ - - -.. _Coverity Scan: https://scan.coverity.com/ - -.. _Coverity Connect: https://scan.coverity.com/projects/python - -.. _coverity_model.c: https://github.com/python/cpython/blob/main/Misc/coverity_model.c diff --git a/development-tools/gdb.rst b/development-tools/gdb.rst index e85e826a01..8f89ea1360 100644 --- a/development-tools/gdb.rst +++ b/development-tools/gdb.rst @@ -33,7 +33,7 @@ Fortunately, among the `many ways to set breakpoints you can break at C labels, such as those generated for computed gotos. 
If you are debugging an interpreter compiled with computed goto support (generally true, certainly when using GCC), each instruction will be -prefaced with a label named ``TARGET_<instruction>``, e.g., +prefaced with a label named ``TARGET_<instruction>``, for example, ``TARGET_LOAD_CONST``. You can then set a breakpoint with a command like:: diff --git a/development-tools/index.rst b/development-tools/index.rst index d7b88bf6d9..5031227a18 100644 --- a/development-tools/index.rst +++ b/development-tools/index.rst @@ -1,3 +1,5 @@ +.. _development-tools: + ================= Development tools ================= @@ -8,4 +10,4 @@ Development tools clinic gdb clang - coverity + warnings diff --git a/development-tools/warnings.rst b/development-tools/warnings.rst new file mode 100644 index 0000000000..b6448f3979 --- /dev/null +++ b/development-tools/warnings.rst @@ -0,0 +1,66 @@ +.. _warnings: +
+Tools for tracking compiler warnings
+====================================
+
+.. highlight:: bash
+
+The compiler warning tracking tooling is intended to alert developers about new
+compiler warnings introduced by their contributions. The tooling consists of
+a Python script which is run by the following GitHub workflows:
+
+* Ubuntu/build and test (:cpy-file:`.github/workflows/reusable-ubuntu.yml`)
+* macOS/build and test (:cpy-file:`.github/workflows/reusable-macos.yml`)
+
+You can check the documentation for the :cpy-file:`Tools/build/check_warnings.py` tool
+by running::
+
+   python Tools/build/check_warnings.py --help
+
+The script can be run locally by providing the compiler output file
+(where the output is saved) and the compiler output type
+(either ``gcc`` or ``clang``) to see a list of unique warnings::
+
+   python Tools/build/check_warnings.py --compiler-output-file-path=compiler_output.txt --compiler-output-type=gcc

.. 
_warning-check-failure:

+What to do if a warning check fails in GitHub CI
+------------------------------------------------
+
+The :cpy-file:`Tools/build/check_warnings.py` tool will fail if the compiler generates
+more or fewer warnings than expected for a given source file as defined in the
+platform-specific warning ignore file. The warning ignore file is either
+:cpy-file:`Tools/build/.warningignore_ubuntu` or
+:cpy-file:`Tools/build/.warningignore_macos` depending on the platform.
+
+If a warning check fails with:
+
+* Unexpected warnings
+    * Attempt to refactor the code to avoid the warning.
+    * If it is not possible to avoid the warning, document in the PR why it is
+      reasonable to ignore it, and add the warning to the platform-specific
+      warning ignore file. If the file already exists in the warning ignore file,
+      increment the count by the number of newly introduced warnings.
+* Unexpected improvements (fewer warnings)
+    * Document in the PR that the change reduces the number of compiler
+      warnings. Decrement the count in the platform-specific warning
+      ignore file, or remove the file's entry if the count is now zero.
+
+.. _updating-warning-ignore-file:
+
+Updating the warning ignore file
+--------------------------------
+
+The warning ignore files can be found in the :cpy-file:`Tools/build/` directory.
+Both files and directories can be added to the ignore file. Files can have an explicit warning count or a wildcard count.
+Directories must be followed by a wildcard count. Wildcards indicate that 0 or more warnings will be ignored.
+The following is an example of the warning ignore file format::
+
+    Modules/_ctypes/_ctypes_test_generated.c.h *
+    Objects/longobject.c 46
+    Objects/methodobject.c 1
+    Objects/mimalloc/ *
+
+Using wildcards is reserved for code that is not maintained by CPython, or for test code.
+Keep lines in warning ignore files sorted lexicographically.
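The ignore-file format described above is simple enough to sketch in a few lines. The snippet below is an illustrative sketch only, not the actual ``Tools/build/check_warnings.py`` logic (whose details may differ): each non-empty line is parsed into a path plus either an explicit count or a ``*`` wildcard.

```python
# Illustrative sketch only -- not the real Tools/build/check_warnings.py
# implementation. Each non-empty line of the ignore file is a path followed
# by an explicit warning count or a "*" wildcard (any number of warnings).
def parse_ignore_file(text):
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        path, _, count = line.rpartition(" ")
        rules[path] = None if count == "*" else int(count)
    return rules

example = (
    "Modules/_ctypes/_ctypes_test_generated.c.h *\n"
    "Objects/longobject.c 46\n"
    "Objects/methodobject.c 1\n"
)
rules = parse_ignore_file(example)
assert rules["Objects/longobject.c"] == 46
assert rules["Modules/_ctypes/_ctypes_test_generated.c.h"] is None
```

Here ``None`` stands in for the wildcard, meaning any warning count is accepted for that path.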
diff --git a/documentation/devguide.rst b/documentation/devguide.rst index a17ed342a9..7c53d054e1 100644 --- a/documentation/devguide.rst +++ b/documentation/devguide.rst @@ -19,10 +19,12 @@ main Python documentation, except for some small differences. The source lives in a `separate repository`_ and bug reports should be submitted to the `devguide GitHub tracker`_. -Our devguide workflow uses continuous integration and deployment so changes to -the devguide are normally published when the pull request is merged. Changes -to CPython documentation follow the workflow of a CPython release and are -published in the release. +Changes to the Developer's Guide are published when pull requests are merged. + +Changes to the Python documentation are published regularly, +usually within 48 hours of the change being committed. +The documentation is also `published for each release <https://docs.python.org/release/>`_, +which may be used by redistributors. Developer's Guide workflow diff --git a/documentation/help-documenting.rst b/documentation/help-documenting.rst index 06e44549c8..0b287df928 100644 --- a/documentation/help-documenting.rst +++ b/documentation/help-documenting.rst @@ -74,7 +74,7 @@ Proofreading While an issue filed on the `issue tracker`_ means there is a known issue somewhere, that does not mean there are not other issues lurking about in the documentation. Proofreading a part of the documentation, such as a "How to" or -OS specific document, can often uncover problems (e.g., documentation that +OS specific document, can often uncover problems (for example, documentation that needs updating for Python 3). 
If you decide to proofread, read a section of the documentation from start diff --git a/documentation/markup.rst b/documentation/markup.rst index b72e1a9522..fc8deb4bb3 100644 --- a/documentation/markup.rst +++ b/documentation/markup.rst @@ -22,8 +22,8 @@ Element Markup See also arguments/parameters ``*arg*`` :ref:`inline-markup` variables/literals/code ````foo````, ````42````, ````len(s) - 1```` :ref:`inline-markup` True/False/None ````True````, ````False````, ````None```` :ref:`inline-markup` -functions definitions ``.. function:: print(*args)`` :ref:`directives` -functions references ``:func:`print``` :ref:`roles` +function definitions ``.. function:: print(*args)`` :ref:`directives` +function references ``:func:`print``` :ref:`roles` attribute definitions ``.. attribute: `attr-name``` :ref:`information-units` attribute references ``:attr:`attr-name``` :ref:`roles` reference labels ``.. _label-name:`` :ref:`doc-ref-role` @@ -385,7 +385,7 @@ As you can see, the module-specific markup consists of two directives, the .. describe:: module This directive marks the beginning of the description of a module, package, - or submodule. The name should be fully qualified (i.e. including the + or submodule. The name should be fully qualified (that is, including the package name for submodules). The ``platform`` option, if present, is a comma-separated list of the @@ -443,7 +443,7 @@ The directives are: .. describe:: c:function - Describes a C function. The signature should be given as in C, e.g.:: + Describes a C function. The signature should be given as in C, for example:: .. c:function:: PyObject* PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems) @@ -683,7 +683,7 @@ Syntax highlighting is handled in a smart way: encountered. * The ``code-block`` directive can be used to specify the highlight language - of a single code block, e.g.:: + of a single code block, for example:: .. 
code-block:: c @@ -759,7 +759,7 @@ versatile: ``:meth:`~Queue.Queue.get``` will refer to ``Queue.Queue.get`` but only display ``get`` as the link text. - In HTML output, the link's ``title`` attribute (that is e.g. shown as a + In HTML output, the link's ``title`` attribute (that might be shown as a tool-tip on mouse-hover) will always be the full target name. * Combining ``~`` and ``!`` (for example, ``:meth:`~!Queue.Queue.get```) is not @@ -949,7 +949,7 @@ in a different style: .. describe:: manpage A reference to a Unix manual page including the section, - e.g. ``:manpage:`ls(1)```. + for example, ``:manpage:`ls(1)```. .. describe:: menuselection @@ -1083,8 +1083,11 @@ units as well as normal text: (``class``, ``attribute``, ``function``, ``method``, ``c:type``, etc), a ``versionadded`` should be included at the end of its description block. - The first argument must be given and is the version in question. The second - argument is optional and can be used to describe the details of the feature. + The first argument must be given and is the version in question. + Instead of a specific version number, you can---and should---use + the word ``next``, indicating that the API will first appear in the + upcoming release. + The second argument is optional and can be used to describe the details of the feature. Example:: @@ -1092,7 +1095,26 @@ units as well as normal text: Return foo and bar. - .. versionadded:: 3.5 + .. versionadded:: next + + When a release is made, the release manager will change the ``next`` to + the just-released version. For example, if ``func`` in the above example is + released in 3.14, the snippet will be changed to:: + + .. function:: func() + + Return foo and bar. + + .. versionadded:: 3.14 + + The tool to do this replacement is `update_version_next.py`_ + in the release-tools repository. + + .. 
_update_version_next.py: https://github.com/python/release-tools/blob/master/update_version_next.py + + When adding documentation for a function that existed in a past version, + but wasn't documented yet, use the version number where the function was + added instead of ``next``. .. describe:: versionchanged @@ -1106,7 +1128,7 @@ units as well as normal text: Return foo and bar, optionally with *spam* applied. - .. versionchanged:: 3.6 + .. versionchanged:: next Added the *spam* parameter. Note that there should be no blank line between the directive head and the @@ -1118,10 +1140,12 @@ units as well as normal text: There is one required argument: the version from which the feature is deprecated. + Similarly to ``versionadded``, you should use the word ``next`` to indicate + the API will be first deprecated in the upcoming release. Example:: - .. deprecated:: 3.8 + .. deprecated:: next .. describe:: deprecated-removed @@ -1129,16 +1153,17 @@ units as well as normal text: removed. There are two required arguments: the version from which the feature is - deprecated, and the version in which the feature is removed. + deprecated (usually ``next``), and the version in which the feature + is removed, which must be a specific version number (*not* ``next``). Example:: - .. deprecated-removed:: 3.8 4.0 + .. deprecated-removed:: next 4.0 .. describe:: impl-detail This directive is used to mark CPython-specific information. Use either with - a block content or a single sentence as an argument, i.e. either :: + a block content or a single sentence as an argument, that is, either :: .. impl-detail:: @@ -1304,7 +1329,7 @@ the definition of the symbol. There is this directive: Blank lines are not allowed within ``productionlist`` directive arguments. The definition can contain token names which are marked as interpreted text - (e.g. 
``unaryneg ::= "-" `integer```) -- this generates cross-references + (for example, ``unaryneg ::= "-" `integer```) -- this generates cross-references to the productions of these tokens. Note that no further reST parsing is done in the production, so that you @@ -1334,12 +1359,12 @@ default. They are set in the build configuration file :file:`conf.py`. .. describe:: |release| Replaced by the Python release the documentation refers to. This is the full - version string including alpha/beta/release candidate tags, e.g. ``2.5.2b3``. + version string including alpha/beta/release candidate tags, for example, ``2.5.2b3``. .. describe:: |version| Replaced by the Python version the documentation refers to. This consists - only of the major and minor version parts, e.g. ``2.5``, even for version + only of the major and minor version parts, for example, ``2.5``, even for version 2.5.1. .. describe:: |today| diff --git a/documentation/style-guide.rst b/documentation/style-guide.rst index 1e94e518d1..49bd15b1d3 100644 --- a/documentation/style-guide.rst +++ b/documentation/style-guide.rst @@ -26,6 +26,7 @@ the footnote reference. Footnotes may appear in the middle of sentences where appropriate. + Capitalization ============== @@ -54,10 +55,15 @@ starting it with a lowercase letter should be avoided. Many special names are used in the Python documentation, including the names of operating systems, programming languages, standards bodies, and the like. Most of these entities are not assigned any special markup, but the preferred -spellings are given here to aid authors in maintaining the consistency of -presentation in the Python documentation. +spellings are given in :ref:`specific words` to aid authors in maintaining the +consistency of presentation in the Python documentation. + +.. _specific words: + +Specific words +============== -Other terms and words deserve special mention as well; these conventions should +Some terms and words deserve special mention. 
These conventions should be used to ensure consistency throughout the documentation: C API @@ -79,6 +85,12 @@ reST used to produce Python documentation. When spelled out, it is always one word and both forms start with a lowercase 'r'. +time zone + When referring to a Python term like a module, class, or argument spell it + as one word with appropriate markup (for example, ``:mod:`timezone```). + When talking about the real-world concept spell it as two words with no + markup. + Unicode The name of a character coding system. This is always written capitalized. @@ -88,7 +100,18 @@ Unix 1970s. +Use simple language +=================== + +Avoid esoteric phrasing where possible. Our audience is world-wide and may not +be native English speakers. + +Don't use Latin abbreviations like "e.g." or "i.e." where English words will do, +such as "for example" or "that is." + + .. index:: diataxis +.. _diataxis: Diátaxis ======== @@ -110,7 +133,7 @@ explanation. designed to guide a user through a problem-field. Both tutorials and how-to guides are instructional rather than explanatory and should provide logical steps on how to complete a task. However, - how-to guides make more assumptions about the user's knoweldge and + how-to guides make more assumptions about the user's knowledge and focus on the user finding the best way to solve their own particular problem. @@ -132,6 +155,7 @@ explanation. Please consult the `Diátaxis <https://diataxis.fr/>`_ guide for more detail. + Links ===== @@ -160,6 +184,7 @@ documentation for ``map``. You can suppress the link while keeping the semantic presentation of the function name by adding an exclamation point prefix: ``:func:`!map```. See :ref:`roles` for more details. + Affirmative tone ================ @@ -185,6 +210,29 @@ language): achieve the same effect. This assures that files are flushed and file descriptor resources are released in a timely manner. 
+ +Author attribution +================== + +For new documentation, do not use a byline (naming the author of the document). +Explicit attribution tends to discourage other users from updating community +documentation. + +Existing documentation with bylines will not be changed unless the author +decides to do so. This is subject to change in the future. + + +Pronunciation of dunder names +============================= + +"Dunder names" like ``__init__`` can be awkward in running prose: is it "an +init" or "a dunder init"? Our recommendation is to ignore the underscores and +use the article that is appropriate for the word in the name. A `quick poll`__ +backs this up: "an __init__." + +__ https://hachyderm.io/@nedbat/112129685322594689 + + Economy of expression ===================== @@ -196,12 +244,13 @@ to understanding and can result in even more ways to misread or misinterpret the text. Long descriptions full of corner cases and caveats can create the impression that a function is more complex or harder to use than it actually is. + Security considerations (and other concerns) ============================================ Some modules provided with Python are inherently exposed to security issues -(e.g. shell injection vulnerabilities) due to the purpose of the module -(e.g. :mod:`ssl`). Littering the documentation of these modules with red +(for example, shell injection vulnerabilities) due to the purpose of the module +(for example, :mod:`ssl`). Littering the documentation of these modules with red warning boxes for problems that are due to the task at hand, rather than specifically to Python's support for that task, doesn't make for a good reading experience. @@ -213,10 +262,11 @@ similar to :samp:`"Please refer to the :ref:\`{security-considerations}\` section for important information on how to avoid common mistakes."`. Similarly, if there is a common error that affects many interfaces in a -module (e.g. 
OS level pipe buffers filling up and stalling child processes), +module (for example, OS level pipe buffers filling up and stalling child processes), these can be documented in a "Common Errors" section and cross-referenced rather than repeated for every affected interface. + .. _code-examples: Code examples @@ -237,6 +287,7 @@ lines and output lines. Besides contributing visual clutter, it makes it difficult for readers to cut-and-paste examples so they can experiment with variations. + Code equivalents ================ @@ -261,6 +312,7 @@ An example of when not to use a code equivalent is for the :func:`oct` function. The exact steps in converting a number to octal doesn't add value for a user trying to learn what the function does. + Audience ======== @@ -281,3 +333,19 @@ errors ("I made a mistake, therefore the docs must be wrong ..."). Typically, the documentation wasn't consulted until after the error was made. It is unfortunate, but typically no documentation edit would have saved the user from making false assumptions about the language ("I was surprised by ..."). + + +Function signatures +=================== + +These are the evolving guidelines for how to include function signatures in the +reference guide. As outlined in :ref:`diataxis`, reference material should +prioritize precision and completeness. + +- If a function accepts positional-only or keyword-only arguments, include the + slash and the star in the signature as appropriate:: + + .. function:: some_function(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2): + + Although the syntax is terse, it is precise about the allowable ways to call + the function and is taken from Python itself. diff --git a/documentation/translating.rst b/documentation/translating.rst index 93c295034a..ea3cbaf2d4 100644 --- a/documentation/translating.rst +++ b/documentation/translating.rst @@ -10,7 +10,9 @@ Python documentation translations are governed by :PEP:`545`. 
They are built by `docsbuild-scripts <https://github.com/python/docsbuild-scripts/>`__ and hosted on docs.python.org. There are several documentation translations already -in production; others are works in progress. +in production; others are works in progress. See `the dashboard +<https://python-docs-translations.github.io/dashboard/>`__ for +details. .. list-table:: :header-rows: 1 @@ -29,7 +31,8 @@ in production; others are works in progress. - :github:`GitHub <python/python-docs-fr>` * - Greek (gr) - Lysandros Nikolaou (:github-user:`lysnikolaou`), - Fanis Petkos (:github-user:`thepetk`) + Fanis Petkos (:github-user:`thepetk`), + Panagiotis Skias (:github-user:`skpanagiotis`) - :github:`GitHub <pygreece/python-docs-gr>` * - Hindi as spoken in India (hi_IN) - Sanyam Khurana (:github-user:`CuriousLearner`) @@ -43,7 +46,8 @@ in production; others are works in progress. - :github:`GitHub <python/python-docs-id>` * - Italian (it) - Alessandro Cucci (`email <mailto:alessandro.cucci@gmail.com>`__) - - `Original mail <https://mail.python.org/pipermail/doc-sig/2019-April/004114.html>`__ + - :github:`GitHub <python/python-docs-it>`, + `original mail <https://mail.python.org/pipermail/doc-sig/2019-April/004114.html>`__ * - `Japanese (ja) <https://docs.python.org/ja/>`__ - Kinebuchi Tomohiko (:github-user:`cocoatomo`), Atsuo Ishimoto (:github-user:`atsuoishimoto`) @@ -58,8 +62,8 @@ in production; others are works in progress. 
- Albertas Gimbutas (:github-user:`albertas`, `email <mailto:albertasgim@gmail.com>`__) - `Original mail <https://mail.python.org/pipermail/doc-sig/2019-July/004138.html>`__ * - Persian (fa) - - Komeil Parseh (:github-user:`mmdbalkhi`) - - :github:`GitHub <mmdbalkhi/python-docs-fa>` + - Alireza Shabani (:github-user:`revisto`) + - :github:`GitHub <revisto/python-docs-fa>` * - `Polish (pl) <https://docs.python.org/pl/>`__ - Maciej Olko (:github-user:`m-aciek`) - :github:`GitHub <python/python-docs-pl>`, @@ -69,7 +73,8 @@ in production; others are works in progress. - Gustavo Toffo - * - `Portuguese as spoken in Brasil (pt-br) <https://docs.python.org/pt-br/>`__ - - Marco Rougeth + - Rafael Fontenelle (:github-user:`rffontenelle`), + Marco Rougeth (:github-user:`rougeth`) - :github:`GitHub <python/python-docs-pt-br>`, `wiki <https://python.org.br/traducao/>`__, `Telegram <https://t.me/pybr_i18n>`__, @@ -108,22 +113,23 @@ First subscribe to the `translation mailing list <translation_ml_>`_, and introduce yourself and the translation you're starting. Translations fall under the aegis of the `PSF Translation Workgroup <translation_wg_>`_ -Then you can bootstrap your new translation by using our `cookiecutter -<https://github.com/JulienPalard/python-docs-cookiecutter>`__. +Then you can bootstrap your new translation by using `cookiecutter +<https://github.com/JulienPalard/python-docs-cookiecutter>`__ or +`bootstrapper <https://github.com/python-docs-translations/python-docs-bootstrapper>`__. The important steps look like this: -- Create the GitHub repo (anywhere) with the right hierarchy (using the - cookiecutter). +- Create the GitHub repo (anywhere) with the right hierarchy (using one + of the bootstrappers). - Gather people to help you translate. You can't do it alone. - You can use any tool to translate, as long as you can synchronize with Git. Some use Transifex, and some use only GitHub. You can choose another way if you like; it's up to you. 
- Ensure we update this page to reflect your work and progress, either via a PR or by asking on the `translation mailing list <translation_ml_>`_. -- When ``bugs.html``, ``tutorial``, and ``library/functions`` are 100% +- When ``bugs``, ``tutorial``, and ``library/functions`` are 100% completed, ask on the `translation mailing list <translation_ml_>`_ for - your language to be added in the language picker on docs.python.org. + your language to be added in the language switcher on docs.python.org. PEP 545 summary @@ -151,9 +157,10 @@ Here are the essential points of :PEP:`545`: How to get help =============== -Discussions about translations occur on the `translation mailing list <translation_ml_>`_, -and there's a `Libera.Chat IRC <https://libera.chat/>`_ channel, -``#python-doc``. +Discussions about translations occur on the Python Docs Discord +`#translations channel <https://discord.gg/h3qDwgyzga>`_, `translation +mailing list <translation_ml_>`_, and there's a `Libera.Chat IRC +<https://libera.chat/>`_ channel, ``#python-doc``. Translation FAQ @@ -162,12 +169,12 @@ Translation FAQ Which version of the Python documentation should be translated? --------------------------------------------------------------- -Consensus is to work on current stable. You can then propagate your +Consensus is to work on the current stable version. You can then propagate your translation from one branch to another using :pypi:`pomerge`. -Are there some tools to help in managing the repo? --------------------------------------------------- +Are there tools to help in managing the repo? +--------------------------------------------- Here's what we're using: @@ -178,6 +185,10 @@ Here's what we're using: - :pypi:`potodo` to list what needs to be translated. - :pypi:`sphinx-lint` to validate reST syntax in translation files. +More related tools and projects can be found in the +`python-docs-translations`__ organisation on GitHub. 
+ +__ https://github.com/python-docs-translations How is a coordinator elected? ----------------------------- @@ -229,5 +240,13 @@ As for every project, we have a *branch* per version. We store ``.po`` files in the root of the repository using the ``gettext_compact=0`` style. + +How should I translate code examples? +------------------------------------- + +Translate values in code examples (that is, string literals) and comments. +Don't translate keywords or names, +including variable, function, class, argument, and attribute names. + .. _translation_wg: https://wiki.python.org/psf/TranslationWG/Charter .. _translation_ml: https://mail.python.org/mailman3/lists/translation.python.org/ diff --git a/getting-started/fixing-issues.rst b/getting-started/fixing-issues.rst index 6161f4aa60..f277cbf60d 100644 --- a/getting-started/fixing-issues.rst +++ b/getting-started/fixing-issues.rst @@ -6,7 +6,7 @@ Fixing "easy" issues (and beyond) ================================= When you feel comfortable enough to want to help tackle issues by trying to -create a patch to fix an issue, you can start by looking at the `"easy" +create a pull request to fix an issue, you can start by looking at the `"easy" issues`_. These issues *should* be ones where it should take no longer than a day or weekend to fix. But because the "easy" classification is typically done at triage time it can turn out to be inaccurate, so do feel free to leave a @@ -17,11 +17,9 @@ are not considered easy and try to fix those. It must be warned, though, that it is quite possible that a bug that has been left open has been left into that state because of the difficulty compared to the benefit of the fix. It could also still be open because no consensus has been reached on how to fix the -issue (although having a patch that proposes a fix can turn the tides of the -discussion to help bring it to a close). 
Regardless of why the issue is open, +issue, although having a pull request that proposes a fix can turn the tides of the +discussion to help bring it to a close. Regardless of why the issue is open, you can also always provide useful comments if you do attempt a fix, successful or not. .. _"easy" issues: https://github.com/python/cpython/issues?q=is%3Aissue+is%3Aopen+label%3Aeasy - -.. TODO: add something about no active core developer for the area? diff --git a/getting-started/generative-ai.rst b/getting-started/generative-ai.rst new file mode 100644 index 0000000000..90fe020f3f --- /dev/null +++ b/getting-started/generative-ai.rst @@ -0,0 +1,26 @@ +.. _generative-ai: +
+=============
+Generative AI
+=============
+
+Generative AI has evolved rapidly over the past decade and will continue to evolve in the future.
+Generative AI and large language models (LLMs) can be helpful tools for contributors.
+Their overuse can also be problematic, for example through generation of incorrect code, inaccurate documentation, and unneeded code churn.
+Discretion, good judgement, and critical thinking **must** be used when opening issues and pull requests.
+
+Acceptable uses
+===============
+
+Some of the acceptable uses of generative AI include:
+
+- Assistance with writing comments, especially in a non-native language
+- Gaining understanding of existing code
+- Supplementing contributor knowledge for code, tests, and documentation
+
+Unacceptable uses
+=================
+
+Maintainers may close issues and PRs that are not useful or productive, including
+those that are fully generated by AI. If a contributor repeatedly opens unproductive
+issues or PRs, they may be blocked. diff --git a/getting-started/getting-help.rst b/getting-started/getting-help.rst index 4e5a02c39b..50b7583e79 100644 --- a/getting-started/getting-help.rst +++ b/getting-started/getting-help.rst @@ -37,25 +37,6 @@ Those particularly relevant for help contributing to Python itself include: .. 
_Ideas: https://discuss.python.org/c/ideas/6 -.. _help-mailing-lists: - -Mailing lists -------------- - -Further options for seeking assistance include the -`python-ideas`_ and `python-dev`_ mailing lists, -which correspond to the `Ideas`_ and `Core Development`_ -:ref:`help-discourse` categories, respectively. -The Discourse categories are generally more active -and are the preferred venue for new discussions, -but the mailing lists are still monitored and responded to. -These mailing lists are for questions involving the -development *of* Python, **not** for development *with* Python. - -.. _python-ideas: https://mail.python.org/mailman3/lists/python-ideas.python.org -.. _python-dev: https://mail.python.org/mailman3/lists/python-dev.python.org/ - - Ask #python-dev --------------- @@ -86,25 +67,6 @@ welcomed and encouraged to contribute. .. _Python Mentors: https://www.python.org/dev/core-mentorship/ -.. _office hour: - -Core developers office hours ----------------------------- - -Several core developers have set aside time to host mentorship office hours. -During the office hour, core developers are available to help contributors with -our process, answer questions, and help lower the barrier of contributing and -becoming Python core developers. - -The PSF's code of conduct applies for interactions with core developers -during office hours. 
- -+------------------+-------------------------------+------------------------------------------------+ -| Core Developer | Schedule | Details | -+==================+===============================+================================================+ -| Zachary Ware | See details link | Schedule at https://calendly.com/zware | -+------------------+-------------------------------+------------------------------------------------+ - File a bug ---------- diff --git a/getting-started/git-boot-camp.rst b/getting-started/git-boot-camp.rst index 9b60f0e1c5..87177840cb 100644 --- a/getting-started/git-boot-camp.rst +++ b/getting-started/git-boot-camp.rst @@ -44,11 +44,13 @@ You will only need to do this once. 1. Go to https://github.com/python/cpython. -2. Press ``Fork`` on the top right. +2. Press :guilabel:`Fork` located near the top right of the page. -3. When asked where to fork the repository, choose to fork it to your username. +3. Uncheck "Copy the ``main`` branch only". -4. Your forked CPython repository will be created at https://github.com/<username>/cpython. +4. Press the :guilabel:`Create fork` button. + +5. Your forked CPython repository will be created at ``https://github.com/<username>/cpython``. .. _clone-your-fork: @@ -105,6 +107,10 @@ To verify the upstream for ``main``:: It should emit ``upstream``, indicating to track/pull changes for ``main`` from the ``upstream`` remote. +Once this is verified, update your local clone with the upstream branches:: + + $ git fetch upstream + .. _set-up-name-email: @@ -126,7 +132,7 @@ Enabling ``autocrlf`` on Windows The ``autocrlf`` option will fix automatically any Windows-specific line endings. This should be enabled on Windows, since the public repository has a hook which -will reject all changesets having the wrong line endings:: +will reject all commits having the wrong line endings:: $ git config --global core.autocrlf input @@ -304,7 +310,7 @@ Creating a pull request 1. Go to https://github.com/python/cpython. -2. 
Press the ``New pull request`` button. +2. Press the :guilabel:`New pull request` button. 3. Click the ``compare across forks`` link. @@ -313,7 +319,7 @@ Creating a pull request 5. Select the head repository: ``<username>/cpython`` and head branch: the branch containing your changes. -6. Press the ``Create pull request`` button. +6. Press the :guilabel:`Create pull request` button. You should include the issue number in the title of the PR, in the format ``gh-NNNNN: <PR Title>``. @@ -330,12 +336,12 @@ will automatically add a link to the issue in the first message. In addition, pull requests support `special keywords`_ that can be used to link to an issue and automatically close it when the PR is merged. -However, issues often require multiple PRs before they can be closed (e.g. -backports to other branches), so this features is only useful if +However, issues often require multiple PRs before they can be closed (for +example, backports to other branches), so this feature is only useful if you know for sure that a single PR is enough to address and close the issue. .. _bedevere: https://github.com/python/bedevere -.. _special keywords: https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword +.. _special keywords: https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword Updating your CPython fork -------------------------- @@ -350,7 +356,7 @@ Scenario: the upstream CPython repository. Please do not try to solve this by creating a pull request from -``python:main`` to ``<username>:main`` as the authors of the patches will +``python:main`` to ``<username>:main`` as the authors of the pull requests will get notified unnecessarily. Solution:: @@ -429,8 +435,8 @@ Solution: .. 
_git_pr: -Downloading other's patches ---------------------------- +Checking out others' pull requests +---------------------------------- Scenario: @@ -480,14 +486,14 @@ can be merged. See :ref:`"Keeping CI green" <keeping-ci-green>` for some simple things you can do to help the checks turn green. At any point, a core developer can schedule an automatic merge of the change -by clicking the gray ``Enable auto-merge (squash)`` button. You will find +by clicking the gray :guilabel:`Enable auto-merge (squash)` button. You will find it at the bottom of the pull request page. The auto-merge will only happen if all the required checks pass, but the PR does not need to have been approved for a successful auto-merge to take place. If all required checks are already finished on a PR you're reviewing, -in place of the gray ``Enable auto-merge`` button you will find a green -``Squash and merge`` button. +in place of the gray :guilabel:`Enable auto-merge` button you will find a green +:guilabel:`Squash and merge` button. In either case, adjust and clean up the commit message. @@ -520,7 +526,7 @@ PR life cycle, while being irrelevant to the final change. `How to Write a Git Commit Message <https://cbea.ms/git-commit/>`_ is a nice article describing how to write a good commit message. -Finally, press the ``Confirm squash and merge`` button. +Finally, press the :guilabel:`Confirm squash and merge` button. Cancelling an automatic merge ----------------------------- @@ -529,7 +535,7 @@ If you notice a problem with a pull request that was accepted and where auto-merge was enabled, you can still cancel the workflow before GitHub automatically merges the change. -Press the gray "Disable auto-merge" button on the bottom of the +Press the gray :guilabel:`Disable auto-merge` button on the bottom of the pull request page to disable automatic merging entirely. This is the recommended approach. 
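The origin/upstream remote layout that the git-boot-camp changes above describe can be sketched end to end. This is a minimal sketch, not part of the devguide diff itself: a local bare repository stands in for ``github.com/python/cpython``, and all paths are illustrative.

```shell
# Sketch of the origin/upstream remote layout described above.
# A local bare repository stands in for python/cpython on GitHub;
# all paths here are illustrative.
set -e
tmp=$(mktemp -d)

git init -q --bare "$tmp/upstream-cpython.git"        # stands in for python/cpython
git clone -q "$tmp/upstream-cpython.git" "$tmp/cpython" 2>/dev/null
cd "$tmp/cpython"                                     # 'origin' now points at the "fork"

# Add the main repository as a second remote, as the boot camp describes:
git remote add upstream "$tmp/upstream-cpython.git"

git remote                                            # lists both origin and upstream
git fetch -q upstream                                 # update local clone with upstream branches
```

With real URLs, ``origin`` would be your fork (``git@github.com:<username>/cpython.git``) and ``upstream`` would be ``https://github.com/python/cpython``; the mechanics are the same.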
diff --git a/getting-started/index.rst b/getting-started/index.rst index 18f7d5cd55..05ee67a3bc 100644 --- a/getting-started/index.rst +++ b/getting-started/index.rst @@ -1,3 +1,5 @@ +.. _getting-started: + =============== Getting started =============== @@ -10,3 +12,4 @@ Getting started git-boot-camp pull-request-lifecycle getting-help + generative-ai diff --git a/getting-started/pull-request-lifecycle.rst b/getting-started/pull-request-lifecycle.rst index 2381211ab4..59242f13f0 100644 --- a/getting-started/pull-request-lifecycle.rst +++ b/getting-started/pull-request-lifecycle.rst @@ -32,7 +32,7 @@ Here is a quick overview of how you can contribute to CPython: #. :ref:`Create a new branch in Git <pullrequest-steps>` from the ``main`` branch -#. Work on changes (e.g. fix a bug or add a new feature) +#. Work on changes: fix a bug or add a new feature #. :ref:`Run tests <runtests>` and ``make patchcheck`` @@ -42,7 +42,7 @@ Here is a quick overview of how you can contribute to CPython: #. `Create Pull Request`_ on GitHub to merge a branch from your fork #. Make sure the :ref:`continuous integration checks on your Pull Request - are green <keeping-ci-green>` (i.e. successful) + are green <keeping-ci-green>` (successful) #. Review and address `comments on your Pull Request`_ @@ -51,7 +51,7 @@ Here is a quick overview of how you can contribute to CPython: #. Celebrate contributing to CPython! :) -.. [*] If an issue is trivial (e.g. typo fixes), or if an issue already exists, +.. [*] If an issue is trivial (for example, typo fixes), or if an issue already exists, you can skip this step. .. note:: @@ -238,13 +238,32 @@ should do to help ensure that your pull request is accepted. #. Proper :ref:`documentation <documenting>` additions/changes should be included. +Copyrights +========== + +Copyright notices are optional and informational, as international treaties +have abolished the requirement for them to protect copyrights. +However, they still serve an informative role. 
+ +According to the US Copyright Office, valid copyright notices include the year +of first publication of the work. For example: + + Copyright (C) 2001 Python Software Foundation. + +Updating notices to add subsequent years is unnecessary and such PRs will be +closed. + +See also `python/cpython#126133 +<https://github.com/python/cpython/issues/126133#issuecomment-2460824052>`__. + + .. _patchcheck: ``patchcheck`` ============== -``patchcheck`` is a simple automated patch checklist that guides a developer -through the common patch generation checks. To run ``patchcheck``: +``patchcheck`` is a simple automated checklist for changes in progress that +guides a developer through common checks. To run ``patchcheck``: On *Unix* (including macOS):: @@ -256,7 +275,7 @@ On *Windows* (after any successful build): .\python.bat Tools\patchcheck\patchcheck.py -The automated patch checklist runs through: +The automated checklist runs through: * Are there any whitespace problems in Python files? (using :cpy-file:`Tools/patchcheck/reindent.py`) @@ -271,10 +290,10 @@ The automated patch checklist runs through: * Has ``configure`` been regenerated, if necessary? * Has ``pyconfig.h.in`` been regenerated, if necessary? -The automated patch check doesn't actually *answer* all of these +The automated checks don't actually *answer* all of these questions. Aside from the whitespace checks, the tool is a memory aid for the various elements that can go into -making a complete patch. +making a complete pull request. .. _good-commits: @@ -358,7 +377,7 @@ changes to your branch. In general you can run ``git commit -a`` and that will commit everything. You can always run ``git status`` to see what changes are outstanding. -When all of your changes are committed (i.e. 
``git status`` doesn't +When all of your changes are committed (that is, ``git status`` doesn't list anything), you will want to push your branch to your fork:: git push origin <branch name> @@ -379,7 +398,7 @@ relevant detail as possible to prevent reviewers from having to delay reviewing your pull request because of lack of information. If this issue is so simple that there's no need for an issue to track -any discussion of what the pull request is trying to solve (e.g. fixing a +any discussion of what the pull request is trying to solve (for example, fixing a spelling mistake), then the pull request needs to have the "skip issue" label added to it by someone with commit access. @@ -419,7 +438,7 @@ your pull request. Getting your pull request reviewed requires a reviewer to have the spare time and motivation to look at your pull request (we cannot force anyone to review pull requests and no one is employed to look at pull requests). If your pull request has not -received any notice from reviewers (i.e., no comment made) after one +received any notice from reviewers (that is, no comment made) after one month, first "ping" the issue on the `issue tracker`_ to remind the subscribers that the pull request needs a review. If you don't get a response within a week after pinging the issue, @@ -531,9 +550,9 @@ will merge in the latest changes from the base branch into the PR. If this still doesn't help with the failure on the PR, you can try to re-run that particular failed check. Go to the red GitHub Action job, -click on the "Re-run jobs" button on the top right, and select -"Re-run failed jobs". The button will only be present when all other jobs -finished running. +click on the :guilabel:`Re-run jobs` button on the top right, and select +:guilabel:`Re-run failed jobs`. The button will only be present when all other +jobs finished running. 
Re-running failed jobs shouldn't be your first instinct but it is occasionally helpful because distributed systems can have intermittent failures, and @@ -542,6 +561,22 @@ If you identify such flaky behavior, look for an issue in the `issue tracker`_ that describes this particular flakiness. Create a new issue if you can't find one. +:guilabel:`Update branch` button +================================ + +You can click on the :guilabel:`Update branch` button to merge the latest +changes from the base branch (usually ``main``) into the PR. +This is useful to :ref:`keep the CI green <keeping-ci-green>` for old PRs, +or to check if a CI failure has been fixed in the base branch. + +If the PR is very old, it may be useful to update the branch before merging to +ensure that the PR does not fail any CI checks that were added or changed since +CI last ran. + +Do not click :guilabel:`Update branch` without a good reason because it notifies +everyone watching the PR that there are new changes, when there are not, +and it uses up limited CI resources. + Committing/rejecting ==================== @@ -553,7 +588,7 @@ Python is tricky and we simply cannot accept everyone's contributions. But if your pull request is merged it will then go into Python's :abbr:`VCS (version control system)` to be released -with the next major release of Python. It may also be backported to older +with the next feature release of Python. It may also be backported to older versions of Python as a bugfix if the core developer doing the merge believes it is warranted. diff --git a/getting-started/setup-building.rst b/getting-started/setup-building.rst index f653164d38..03b2777b8e 100644 --- a/getting-started/setup-building.rst +++ b/getting-started/setup-building.rst @@ -43,11 +43,11 @@ itself. Git is easily available for all common operating systems. 
- **Install** As the CPython repo is hosted on GitHub, please refer to either the - `GitHub setup instructions <https://docs.github.com/en/get-started/quickstart/set-up-git>`_ + `GitHub setup instructions <https://docs.github.com/en/get-started/getting-started-with-git/set-up-git>`_ or the `Git project instructions <https://git-scm.com>`_ for step-by-step installation directions. You may also want to consider a graphical client such as `TortoiseGit <https://tortoisegit.org/>`_ or - `GitHub Desktop <https://desktop.github.com/>`_. + `GitHub Desktop <https://github.com/apps/desktop>`_. - **Configure** @@ -115,9 +115,9 @@ in the ``cpython`` directory and two remotes that refer to your own GitHub fork .. XXX move the text below in pullrequest If you want a working copy of an already-released version of Python, -i.e., a version in :ref:`maintenance mode <maintbranch>`, you can checkout -a release branch. For instance, to checkout a working copy of Python 3.8, -do ``git switch 3.8``. +that is, a version in :ref:`maintenance mode <maintbranch>`, you can check out +a release branch. For instance, to check out a working copy of Python 3.13, +do ``git switch 3.13``. You will need to re-compile CPython when you do such an update. @@ -127,7 +127,7 @@ changes to Python code will be picked up by the interpreter for immediate use and testing. (If you change C code, you will need to recompile the affected files as described below.) -Patches for the documentation can be made from the same repository; see +Changes for the documentation can be made from the same repository; see :ref:`documenting`. .. _install-pre-commit: @@ -226,7 +226,7 @@ If you decide to :ref:`build-dependencies`, you will need to re-run both Once CPython is done building you will then have a working build that can be run in-place; ``./python`` on most machines (and what is used in all examples), ``./python.exe`` wherever a case-insensitive filesystem is used -(e.g. 
on macOS by default), in order to avoid conflicts with the ``Python`` +(for example, on macOS by default), in order to avoid conflicts with the ``Python`` directory. There is normally no need to install your built copy of Python! The interpreter will realize where it is being run from and thus use the files found in the working copy. If you are worried @@ -286,7 +286,7 @@ Windows :ref:`clone the repository <checkout>` from a native Windows shell program like PowerShell or the ``cmd.exe`` command prompt, and use a build of Git targeted for Windows, - e.g. the `Git for Windows download from the official Git website`_. + for example, the `Git for Windows download from the official Git website`_. Otherwise, Visual Studio will not be able to find all the project's files and will fail the build. @@ -375,8 +375,8 @@ host/runtime as a *guest*. To build for WASI, you will need to cross-compile CPython. This requires a C compiler just like building for :ref:`Unix <unix-compiling>` as well as: -1. A C compiler that can target WebAssembly (e.g. `WASI SDK`_) -2. A WASI host/runtime (e.g. Wasmtime_) +1. A C compiler that can target WebAssembly (for example, `WASI SDK`_) +2. A WASI host/runtime (for example, Wasmtime_) All of this is provided in the :ref:`devcontainer <using-codespaces>`. You can also use what's installed in the container as a reference of what versions of @@ -394,7 +394,7 @@ to help produce a WASI build of CPython (technically it's a "host x host" cross-build because the build Python is also the target Python while the host build is the WASI build). This means you effectively build CPython twice: once to have a version of Python for the build system to use and another that's the -build you ultimately care about (i.e. the build Python is not meant for use by +build you ultimately care about (that is, the build Python is not meant for use by you directly, only the build system). 
The easiest way to get a debug build of CPython for WASI is to use the @@ -458,6 +458,12 @@ used in ``python.sh``: .. _wasmtime: https://wasmtime.dev .. _WebAssembly: https://webassembly.org +Android +------- + +Build and test instructions for Android are maintained in the CPython repository +at :cpy-file:`Android/README.md`. + iOS --- @@ -608,8 +614,8 @@ for details. Install dependencies ==================== -This section explains how to install additional extensions (e.g. ``zlib``) -on Linux, macOS and iOS. +This section explains how to install libraries which are needed to compile +some of CPython's modules (for example, ``zlib``). .. tab:: Linux @@ -621,9 +627,19 @@ on Linux, macOS and iOS. On **Fedora**, **RHEL**, **CentOS** and other ``dnf``-based systems:: + $ sudo dnf install git pkg-config $ sudo dnf install dnf-plugins-core # install this to use 'dnf builddep' $ sudo dnf builddep python3 + Some optional development dependencies are not included in the above. + To install some additional dependencies for optional build and test components:: + + $ sudo dnf install \ + gcc gcc-c++ gdb lzma glibc-devel libstdc++-devel openssl-devel \ + readline-devel zlib-devel libffi-devel bzip2-devel xz-devel \ + sqlite sqlite-devel sqlite-libs libuuid-devel gdbm-libs perf \ + expat expat-devel mpdecimal python3-pip + On **Debian**, **Ubuntu**, and other ``apt``-based systems, try to get the dependencies for the Python you're working on by using the ``apt`` command. @@ -635,7 +651,8 @@ on Linux, macOS and iOS. $ deb-src http://archive.ubuntu.com/ubuntu/ jammy main - Alternatively, uncomment lines with ``deb-src`` using an editor, e.g.:: + Alternatively, uncomment lines with ``deb-src`` using an editor, for + example:: $ sudo nano /etc/apt/sources.list @@ -659,6 +676,8 @@ on Linux, macOS and iOS. 
libncurses5-dev libreadline6-dev libsqlite3-dev libssl-dev \ lzma lzma-dev tk-dev uuid-dev zlib1g-dev libmpdec-dev + Note that Debian 12 and Ubuntu 24.04 do not have the ``libmpdec-dev`` package. You can safely + remove it from the install list above and the Python build will use a bundled version. .. tab:: macOS @@ -690,7 +709,7 @@ on Linux, macOS and iOS. For **Homebrew**, install dependencies using ``brew``:: - $ brew install pkg-config openssl@3.0 xz gdbm tcl-tk mpdecimal + $ brew install pkg-config openssl@3 xz gdbm tcl-tk mpdecimal .. tab:: Python 3.13+ @@ -700,7 +719,7 @@ on Linux, macOS and iOS. GDBM_LIBS="-L$(brew --prefix gdbm)/lib -lgdbm" \ ./configure --with-pydebug \ --with-system-libmpdec \ - --with-openssl="$(brew --prefix openssl@3.0)" + --with-openssl="$(brew --prefix openssl@3)" .. tab:: Python 3.11-3.12 @@ -709,18 +728,23 @@ on Linux, macOS and iOS. $ GDBM_CFLAGS="-I$(brew --prefix gdbm)/include" \ GDBM_LIBS="-L$(brew --prefix gdbm)/lib -lgdbm" \ ./configure --with-pydebug \ - --with-openssl="$(brew --prefix openssl@3.0)" + --with-openssl="$(brew --prefix openssl@3)" - .. tab:: Python 3.8-3.10 + .. tab:: Python 3.9-3.10 - For Python 3.8, 3.9, and 3.10:: + For Python 3.9 and 3.10:: $ CPPFLAGS="-I$(brew --prefix gdbm)/include -I$(brew --prefix xz)/include" \ LDFLAGS="-L$(brew --prefix gdbm)/lib -L$(brew --prefix xz)/lib" \ ./configure --with-pydebug \ - --with-openssl="$(brew --prefix openssl@3.0)" \ + --with-openssl="$(brew --prefix openssl@3)" \ --with-tcltk-libs="$(pkg-config --libs tcl tk)" \ - --with-tcltk-includes="$(pkg-config --cflags tcl tk)" + --with-tcltk-includes="$(pkg-config --cflags tcl tk)" \ + --with-dbmliborder=gdbm:ndbm + + (``--with-dbmliborder`` is a workaround for a Homebrew-specific change + to ``gdbm``; see `#89452 <https://github.com/python/cpython/issues/89452>`_ + for details.) .. tab:: MacPorts @@ -772,11 +796,21 @@ on Linux, macOS and iOS. On Windows, extensions are already included and built automatically. +.. 
tab:: Android + + The BeeWare project maintains `scripts for building Android dependencies`_, + and distributes `pre-compiled binaries`_ for each of them. + These binaries are automatically downloaded and used by the CPython + build script at :cpy-file:`Android/android.py`. + + .. _scripts for building Android dependencies: https://github.com/beeware/cpython-android-source-deps + .. _pre-compiled binaries: https://github.com/beeware/cpython-android-source-deps/releases + .. tab:: iOS As with CPython itself, the dependencies for CPython must be compiled for each of the hardware architectures that iOS supports. Consult the - documentation for `XZ <https://xz.tukaani.org/xz-utils/>`__, `bzip2 + documentation for `XZ <https://tukaani.org/xz/>`__, `bzip2 <https://sourceware.org/bzip2/>`__, `OpenSSL <https://www.openssl.org>`__ and `libffi <https://github.com/libffi/libffi>`__ for details on how to configure the project for cross-platform iOS builds. @@ -931,7 +965,7 @@ every rule. The part of the standard library implemented in pure Python. ``Mac`` - Mac-specific code (e.g., using IDLE as a macOS application). + Mac-specific code (for example, using IDLE as a macOS application). ``Misc`` Things that do not belong elsewhere. Typically this is varying kinds of @@ -999,7 +1033,7 @@ you'd prefer to use that directly. Create a CPython codespace -------------------------- -Here are the basic steps needed to contribute a patch using Codespaces. +Here are the basic steps needed to contribute a pull request using Codespaces. You first need to navigate to the `CPython repo <https://github.com/python/cpython>`_ hosted on GitHub. 
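A quick way to confirm that the optional libraries installed in the hunks above were actually picked up by a build is to import the extension modules they enable. This is a sketch, not part of the devguide diff: the module list is a representative sample, and the ``PYTHON`` variable is an illustrative convenience defaulting to a system ``python3`` (point it at the in-place ``./python`` to check your own build).

```shell
# Check that optional modules enabled by the libraries above were compiled.
# PYTHON defaults to a system python3; set PYTHON=./python after building
# to check the in-place build instead. Module list is a sample, not exhaustive.
PYTHON="${PYTHON:-python3}"
"$PYTHON" -c "import bz2, sqlite3, ssl, zlib; print('optional modules OK')"
```

If a library was missing at configure time, the corresponding import fails with ``ModuleNotFoundError``, which tells you which development package to install before rebuilding.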
diff --git a/include/release-cycle.json b/include/release-cycle.json index fdb574b119..b77e904879 100644 --- a/include/release-cycle.json +++ b/include/release-cycle.json @@ -10,8 +10,8 @@ "3.13": { "branch": "3.13", "pep": 719, - "status": "prerelease", - "first_release": "2024-10-01", + "status": "bugfix", + "first_release": "2024-10-07", "end_of_life": "2029-10", "release_manager": "Thomas Wouters" }, @@ -50,9 +50,9 @@ "3.8": { "branch": "3.8", "pep": 569, - "status": "security", + "status": "end-of-life", "first_release": "2019-10-14", - "end_of_life": "2024-10", + "end_of_life": "2024-10-07", "release_manager": "Łukasz Langa" }, "3.7": { diff --git a/index.rst b/index.rst index 8b060fb877..9dbc06908b 100644 --- a/index.rst +++ b/index.rst @@ -1,3 +1,5 @@ +.. _devguide-main: + ======================== Python Developer's Guide ======================== @@ -22,7 +24,7 @@ community that maintains Python. We welcome your contributions to Python! Quick reference --------------- -Here are the basic steps needed to get set up and contribute a patch. +Here are the basic steps needed to get set up and contribute a pull request. This is meant as a checklist, once you know the basics. For complete instructions please see the :ref:`setup guide <setup>`. @@ -84,12 +86,12 @@ instructions please see the :ref:`setup guide <setup>`. .\python.bat -m test -j3 -5. Create a new branch where your work for the issue will go, e.g.:: +5. Create a new branch where your work for the issue will go, for example:: git checkout -b fix-issue-12345 main If an issue does not already exist, please `create it - <https://github.com/python/cpython/issues>`_. Trivial issues (e.g. typo fixes) do + <https://github.com/python/cpython/issues>`_. Trivial issues (for example, typo fixes) do not require any issue to be created. 6. 
Once you fixed the issue, run the tests, and the patchcheck: @@ -174,8 +176,8 @@ Contributors Documentarians Triagers Core Develo :ref:`pullrequest` :ref:`style-guide` :ref:`helptriage` :ref:`committing` :ref:`runtests` :ref:`rst-primer` :ref:`experts` :ref:`devcycle` :ref:`fixingissues` :ref:`translating` :ref:`labels` :ref:`motivations` -:ref:`communication` :ref:`devguide` :ref:`gh-faq` :ref:`office hour` -:ref:`gitbootcamp` :ref:`triage-team` :ref:`experts` +:ref:`communication` :ref:`devguide` :ref:`gh-faq` :ref:`experts` +:ref:`gitbootcamp` :ref:`triage-team` :ref:`devcycle` ======================== =================== ======================= ======================= @@ -287,9 +289,9 @@ Please note that all interactions on `Python Software Foundation <https://www.python.org/psf-landing/>`__-supported infrastructure is `covered <https://www.python.org/psf/records/board/minutes/2014-01-06/#management-of-the-psfs-web-properties>`__ -by the `PSF Code of Conduct <https://www.python.org/psf/conduct/>`__, +by the `PSF Code of Conduct <https://policies.python.org/python.org/code-of-conduct/>`__, which includes all infrastructure used in the development of Python itself -(e.g. mailing lists, issue trackers, GitHub, etc.). +(for example, mailing lists, issue trackers, GitHub, etc.). In general this means everyone is expected to be open, considerate, and respectful of others no matter what their position is within the project. @@ -315,6 +317,7 @@ Full table of contents core-developers/index internals/index versions + contrib/index .. _Buildbot status: https://www.python.org/dev/buildbot/ .. _Misc directory: https://github.com/python/cpython/tree/main/Misc @@ -322,7 +325,7 @@ Full table of contents .. _python.org maintenance: https://pythondotorg.readthedocs.io/ .. _Python: https://www.python.org/ .. _Core Python Mentorship: https://www.python.org/dev/core-mentorship/ -.. _PyPy: https://www.pypy.org +.. _PyPy: https://pypy.org .. _Jython: https://www.jython.org/ .. 
_IronPython: https://ironpython.net/ .. _Stackless: https://github.com/stackless-dev/stackless/wiki/ diff --git a/internals/compiler.rst b/internals/compiler.rst index a4e8457c3c..5b43e1e6dc 100644 --- a/internals/compiler.rst +++ b/internals/compiler.rst @@ -6,597 +6,5 @@ Compiler design .. highlight:: none -Abstract -======== - -In CPython, the compilation from source code to bytecode involves several steps: - -1. Tokenize the source code (:cpy-file:`Parser/lexer/` and :cpy-file:`Parser/tokenizer/`). -2. Parse the stream of tokens into an Abstract Syntax Tree - (:cpy-file:`Parser/parser.c`). -3. Transform AST into an instruction sequence (:cpy-file:`Python/compile.c`). -4. Construct a Control Flow Graph and apply optimizations to it (:cpy-file:`Python/flowgraph.c`). -5. Emit bytecode based on the Control Flow Graph (:cpy-file:`Python/assemble.c`). - -This document outlines how these steps of the process work. - -This document only describes parsing in enough depth to explain what is needed -for understanding compilation. This document provides a detailed, though not -exhaustive, view of the how the entire system works. You will most likely need -to read some source code to have an exact understanding of all details. - - -Parsing -======= - -As of Python 3.9, Python's parser is a PEG parser of a somewhat -unusual design. It is unusual in the sense that the parser's input is a stream -of tokens rather than a stream of characters which is more common with PEG -parsers. - -The grammar file for Python can be found in -:cpy-file:`Grammar/python.gram`. The definitions for literal tokens -(such as ``:``, numbers, etc.) can be found in :cpy-file:`Grammar/Tokens`. -Various C files, including :cpy-file:`Parser/parser.c` are generated from -these. - -.. seealso:: - - :ref:`parser` for a detailed description of the parser. - - :ref:`grammar` for a detailed description of the grammar. - - -Abstract syntax trees (AST) -=========================== - -.. 
_compiler-ast-trees: - -.. sidebar:: Green Tree Snakes - - See also `Green Tree Snakes - the missing Python AST docs - <https://greentreesnakes.readthedocs.io/en/latest/>`_ by Thomas Kluyver. - -The abstract syntax tree (AST) is a high-level representation of the -program structure without the necessity of containing the source code; -it can be thought of as an abstract representation of the source code. The -specification of the AST nodes is specified using the Zephyr Abstract -Syntax Definition Language (ASDL) [Wang97]_. - -The definition of the AST nodes for Python is found in the file -:cpy-file:`Parser/Python.asdl`. - -Each AST node (representing statements, expressions, and several -specialized types, like list comprehensions and exception handlers) is -defined by the ASDL. Most definitions in the AST correspond to a -particular source construct, such as an 'if' statement or an attribute -lookup. The definition is independent of its realization in any -particular programming language. - -The following fragment of the Python ASDL construct demonstrates the -approach and syntax:: - - module Python - { - stmt = FunctionDef(identifier name, arguments args, stmt* body, - expr* decorators) - | Return(expr? value) | Yield(expr? value) - attributes (int lineno) - } - -The preceding example describes two different kinds of statements and an -expression: function definitions, return statements, and yield expressions. -All three kinds are considered of type ``stmt`` as shown by ``|`` separating -the various kinds. They all take arguments of various kinds and amounts. - -Modifiers on the argument type specify the number of values needed; ``?`` -means it is optional, ``*`` means 0 or more, while no modifier means only one -value for the argument and it is required. ``FunctionDef``, for instance, -takes an ``identifier`` for the *name*, ``arguments`` for *args*, zero or more -``stmt`` arguments for *body*, and zero or more ``expr`` arguments for -*decorators*. 
- -Do notice that something like 'arguments', which is a node type, is -represented as a single AST node and not as a sequence of nodes as with -stmt as one might expect. - -All three kinds also have an 'attributes' argument; this is shown by the -fact that 'attributes' lacks a '|' before it. - -The statement definitions above generate the following C structure type: - -.. code-block:: c - - typedef struct _stmt *stmt_ty; - - struct _stmt { - enum { FunctionDef_kind=1, Return_kind=2, Yield_kind=3 } kind; - union { - struct { - identifier name; - arguments_ty args; - asdl_seq *body; - } FunctionDef; - - struct { - expr_ty value; - } Return; - - struct { - expr_ty value; - } Yield; - } v; - int lineno; - } - -Also generated are a series of constructor functions that allocate (in -this case) a ``stmt_ty`` struct with the appropriate initialization. The -``kind`` field specifies which component of the union is initialized. The -``FunctionDef()`` constructor function sets 'kind' to ``FunctionDef_kind`` and -initializes the *name*, *args*, *body*, and *attributes* fields. - - -Memory management -================= - -Before discussing the actual implementation of the compiler, a discussion of -how memory is handled is in order. To make memory management simple, an **arena** -is used that pools memory in a single location for easy -allocation and removal. This enables the removal of explicit memory -deallocation. Because memory allocation for all needed memory in the compiler -registers that memory with the arena, a single call to free the arena is all -that is needed to completely free all memory used by the compiler. - -In general, unless you are working on the critical core of the compiler, memory -management can be completely ignored. But if you are working at either the -very beginning of the compiler or the end, you need to care about how the arena -works. 
All code relating to the arena is in either
:cpy-file:`Include/internal/pycore_pyarena.h` or :cpy-file:`Python/pyarena.c`.

``PyArena_New()`` will create a new arena. The returned ``PyArena`` structure
will store pointers to all memory given to it. This does the bookkeeping of
what memory needs to be freed when the compiler is finished with the memory it
used. That freeing is done with ``PyArena_Free()``. This only needs to be
called in strategic areas where the compiler exits.

As stated above, in general you should not have to worry about memory
management when working on the compiler. The technical details of memory
management have been designed to be hidden from you for most cases.

The only exception comes about when managing a PyObject. Since the rest
of Python uses reference counting, there is extra support added
to the arena to clean up each PyObject that was allocated. These cases
are very rare. However, if you've allocated a PyObject, you must tell
the arena about it by calling ``PyArena_AddPyObject()``.


Source code to AST
==================

The AST is generated from source code using the function
``_PyParser_ASTFromString()`` or ``_PyParser_ASTFromFile()``
(from :cpy-file:`Parser/peg_api.c`) depending on the input type.

After some checks, a helper function in :cpy-file:`Parser/parser.c` begins applying
production rules on the source code it receives, converting source code to
tokens and matching these tokens recursively to their corresponding rule. The
production rule's corresponding rule function is called on every match. These rule
functions follow the format :samp:`{xx}_rule`, where *xx* is the grammar rule
that the function handles and is automatically derived from
:cpy-file:`Grammar/python.gram` by
:cpy-file:`Tools/peg_generator/pegen/c_generator.py`.

Each rule function in turn creates an AST node as it goes along.
It does this -by allocating all the new nodes it needs, calling the proper AST node creation -functions for any required supporting functions and connecting them as needed. -This continues until all nonterminal symbols are replaced with terminals. If an -error occurs, the rule functions backtrack and try another rule function. If -there are no more rules, an error is set and the parsing ends. - -The AST node creation helper functions have the name :samp:`_PyAST_{xx}` -where *xx* is the AST node that the function creates. These are defined by the -ASDL grammar and contained in :cpy-file:`Python/Python-ast.c` (which is -generated by :cpy-file:`Parser/asdl_c.py` from :cpy-file:`Parser/Python.asdl`). -This all leads to a sequence of AST nodes stored in ``asdl_seq`` structs. - -To demonstrate everything explained so far, here's the -rule function responsible for a simple named import statement such as -``import sys``. Note that error-checking and debugging code has been -omitted. Removed parts are represented by ``...``. -Furthermore, some comments have been added for explanation. These comments -may not be present in the actual code. - -.. code-block:: c - - // This is the production rule (from python.gram) the rule function - // corresponds to: - // import_name: 'import' dotted_as_names - static stmt_ty - import_name_rule(Parser *p) - { - ... - stmt_ty _res = NULL; - { // 'import' dotted_as_names - ... - Token * _keyword; - asdl_alias_seq* a; - // The tokenizing steps. - if ( - (_keyword = _PyPegen_expect_token(p, 513)) // token='import' - && - (a = dotted_as_names_rule(p)) // dotted_as_names - ) - { - ... - // Generate an AST for the import statement. - _res = _PyAST_Import ( a , ...); - ... - goto done; - } - ... - } - _res = NULL; - done: - ... - return _res; - } - - -To improve backtracking performance, some rules (chosen by applying a -``(memo)`` flag in the grammar file) are memoized. 
Each rule function checks if -a memoized version exists and returns that if so, else it continues in the -manner stated in the previous paragraphs. - -There are macros for creating and using ``asdl_xx_seq *`` types, where *xx* is -a type of the ASDL sequence. Three main types are defined -manually -- ``generic``, ``identifier`` and ``int``. These types are found in -:cpy-file:`Python/asdl.c` and its corresponding header file -:cpy-file:`Include/internal/pycore_asdl.h`. Functions and macros -for creating ``asdl_xx_seq *`` types are as follows: - -``_Py_asdl_generic_seq_new(Py_ssize_t, PyArena *)`` - Allocate memory for an ``asdl_generic_seq`` of the specified length -``_Py_asdl_identifier_seq_new(Py_ssize_t, PyArena *)`` - Allocate memory for an ``asdl_identifier_seq`` of the specified length -``_Py_asdl_int_seq_new(Py_ssize_t, PyArena *)`` - Allocate memory for an ``asdl_int_seq`` of the specified length - -In addition to the three types mentioned above, some ASDL sequence types are -automatically generated by :cpy-file:`Parser/asdl_c.py` and found in -:cpy-file:`Include/internal/pycore_ast.h`. Macros for using both manually -defined and automatically generated ASDL sequence types are as follows: - -``asdl_seq_GET(asdl_xx_seq *, int)`` - Get item held at a specific position in an ``asdl_xx_seq`` -``asdl_seq_SET(asdl_xx_seq *, int, stmt_ty)`` - Set a specific index in an ``asdl_xx_seq`` to the specified value - -Untyped counterparts exist for some of the typed macros. These are useful -when a function needs to manipulate a generic ASDL sequence: - -``asdl_seq_GET_UNTYPED(asdl_seq *, int)`` - Get item held at a specific position in an ``asdl_seq`` -``asdl_seq_SET_UNTYPED(asdl_seq *, int, stmt_ty)`` - Set a specific index in an ``asdl_seq`` to the specified value -``asdl_seq_LEN(asdl_seq *)`` - Return the length of an ``asdl_seq`` or ``asdl_xx_seq`` - -Note that typed macros and functions are recommended over their untyped -counterparts. 
Typed macros carry out checks in debug mode and aid -debugging errors caused by incorrectly casting from ``void *``. - -If you are working with statements, you must also worry about keeping -track of what line number generated the statement. Currently the line -number is passed as the last parameter to each ``stmt_ty`` function. - -.. versionchanged:: 3.9 - The new PEG parser generates an AST directly without creating a - parse tree. ``Python/ast.c`` is now only used to validate the AST for - debugging purposes. - -.. seealso:: :pep:`617` (PEP 617 -- New PEG parser for CPython) - - -Control flow graphs -=================== - -A **control flow graph** (often referenced by its acronym, **CFG**) is a -directed graph that models the flow of a program. A node of a CFG is -not an individual bytecode instruction, but instead represents a -sequence of bytecode instructions that always execute sequentially. -Each node is called a *basic block* and must always execute from -start to finish, with a single entry point at the beginning and a -single exit point at the end. If some bytecode instruction *a* needs -to jump to some other bytecode instruction *b*, then *a* must occur at -the end of its basic block, and *b* must occur at the start of its -basic block. - -As an example, consider the following code snippet: - -.. code-block:: Python - - if x < 10: - f1() - f2() - else: - g() - end() - -The ``x < 10`` guard is represented by its own basic block that -compares ``x`` with ``10`` and then ends in a conditional jump based on -the result of the comparison. This conditional jump allows the block -to point to both the body of the ``if`` and the body of the ``else``. The -``if`` basic block contains the ``f1()`` and ``f2()`` calls and points to -the ``end()`` basic block. The ``else`` basic block contains the ``g()`` -call and similarly points to the ``end()`` block. 
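The basic blocks of this snippet can be made visible from Python with the ``dis`` module. Exact opcode names vary across CPython versions, so this is only illustrative; the guard's block ends in a conditional jump, and the ``if`` body ends in an unconditional jump over the ``else`` block:

```python
import dis

src = (
    "if x < 10:\n"
    "    f1()\n"
    "    f2()\n"
    "else:\n"
    "    g()\n"
    "end()\n"
)
code = compile(src, "<cfg-demo>", "exec")

# Print each instruction; look for the conditional jump ending the guard's
# basic block and the unconditional jump ending the "if" body's block.
for instr in dis.get_instructions(code):
    print(instr.offset, instr.opname)
```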

Note that more complex code in the guard, the ``if`` body, or the ``else``
body may be represented by multiple basic blocks. For instance,
short-circuiting boolean logic in a guard like ``if x or y:``
will produce one basic block that tests the truth value of ``x``
and then points both (1) to the start of the ``if`` body and (2) to
a different basic block that tests the truth value of ``y``.

CFGs are usually one step away from final code output. Code is directly
generated from the basic blocks (with jump targets adjusted based on the
output order) by doing a post-order depth-first search on the CFG
following the edges.


AST to CFG to bytecode
======================

With the AST created, the next step is to create the CFG. The first step
is to convert the AST to Python bytecode without having jump targets
resolved to specific offsets (this is calculated when the CFG goes to
final bytecode). Essentially, this transforms the AST into Python
bytecode with control flow represented by the edges of the CFG.

Conversion is done in two passes. The first creates the namespace
(variables can be classified as local, free/cell for closures, or
global). With that done, the second pass essentially flattens the CFG
into a list and calculates jump offsets for final output of bytecode.

The conversion process is initiated by a call to the function
``_PyAST_Compile()`` in :cpy-file:`Python/compile.c`. This function does both
the conversion of the AST to a CFG and outputting final bytecode from the CFG.
The AST to CFG step is handled mostly by two functions called by
``_PyAST_Compile()``: ``_PySymtable_Build()`` and ``compiler_mod()``.
The former is in :cpy-file:`Python/symtable.c` while the latter is in
:cpy-file:`Python/compile.c`.

``_PySymtable_Build()`` begins by entering the starting code block for the
AST (passed-in) and then calling the proper :samp:`symtable_visit_{xx}` function
(with *xx* being the AST node type).
Next, the AST is walked, and the code blocks that delineate the
scope of a local variable are tracked as blocks are entered and exited
using ``symtable_enter_block()`` and ``symtable_exit_block()``, respectively.

Once the symbol table is created, it is time for CFG creation, whose
code is in :cpy-file:`Python/compile.c`. This is handled by several functions
that break the task down by various AST node types. The functions are
all named :samp:`compiler_visit_{xx}` where *xx* is the name of the node type (such
as ``stmt``, ``expr``, etc.). Each function receives a ``struct compiler *``
and :samp:`{xx}_ty` where *xx* is the AST node type. Typically these functions
consist of a large 'switch' statement, branching based on the kind of
node type passed to it. Simple things are handled inline in the
'switch' statement with more complex transformations farmed out to other
functions named :samp:`compiler_{xx}` with *xx* being a descriptive name of what is
being handled.

When transforming an arbitrary AST node, use the ``VISIT()`` macro.
The appropriate :samp:`compiler_visit_{xx}` function is called, based on the value
passed in for the node type (so :samp:`VISIT({c}, expr, {node})` calls
:samp:`compiler_visit_expr({c}, {node})`). The ``VISIT_SEQ()`` macro is very similar,
but is called on AST node sequences (those values that were created as
arguments to a node that used the '*' modifier). There is also
``VISIT_SLICE()`` just for handling slices.
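The scope information computed by ``_PySymtable_Build()`` is also exposed at the Python level through the ``symtable`` module, which can be handy for checking what the first pass decides for a given piece of source:

```python
import symtable

src = "def f(x):\n    y = x + 1\n    return y"
table = symtable.symtable(src, "<demo>", "exec")

# The child block is the one entered for the function "f"; both the
# parameter x and the assigned variable y are classified as local.
f_block = table.get_children()[0]
for sym in f_block.get_symbols():
    print(sym.get_name(), sym.is_local())
```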

Emission of bytecode is handled by the following macros:

``ADDOP(struct compiler *, int)``
    add a specified opcode
``ADDOP_NOLINE(struct compiler *, int)``
    like ``ADDOP`` without a line number; used for artificial opcodes with
    no corresponding token in the source code
``ADDOP_IN_SCOPE(struct compiler *, int)``
    like ``ADDOP``, but also exits current scope; used for adding return value
    opcodes in lambdas and closures
``ADDOP_I(struct compiler *, int, Py_ssize_t)``
    add an opcode that takes an integer argument
``ADDOP_O(struct compiler *, int, PyObject *, TYPE)``
    add an opcode with the proper argument based on the position of the
    specified PyObject in PyObject sequence object, but with no handling of
    mangled names; used for when you
    need to do named lookups of objects such as globals, consts, or
    parameters where name mangling is not possible and the scope of the
    name is known; *TYPE* is the name of PyObject sequence
    (``names`` or ``varnames``)
``ADDOP_N(struct compiler *, int, PyObject *, TYPE)``
    just like ``ADDOP_O``, but steals a reference to PyObject
``ADDOP_NAME(struct compiler *, int, PyObject *, TYPE)``
    just like ``ADDOP_O``, but name mangling is also handled; used for
    attribute loading or importing based on name
``ADDOP_LOAD_CONST(struct compiler *, PyObject *)``
    add the ``LOAD_CONST`` opcode with the proper argument based on the
    position of the specified PyObject in the consts table.
``ADDOP_LOAD_CONST_NEW(struct compiler *, PyObject *)``
    just like ``ADDOP_LOAD_CONST``, but steals a reference to PyObject
``ADDOP_JUMP(struct compiler *, int, basicblock *)``
    create a jump to a basic block
``ADDOP_JUMP_NOLINE(struct compiler *, int, basicblock *)``
    like ``ADDOP_JUMP`` without a line number; used for artificial jumps
    with no corresponding token in the source code.
``ADDOP_JUMP_COMPARE(struct compiler *, cmpop_ty)``
    depending on the second argument, add an ``ADDOP_I`` with either an
    ``IS_OP``, ``CONTAINS_OP``, or ``COMPARE_OP`` opcode.

There are also several helper functions that emit bytecode; they are named
:samp:`compiler_{xx}()` where *xx* is what the function helps with (``list``,
``boolop``, etc.). A rather useful one is ``compiler_nameop()``.
This function looks up the scope of a variable and, based on the
expression context, emits the proper opcode to load, store, or delete
the variable.

As for handling the line number on which a statement is defined, this is
handled by ``compiler_visit_stmt()`` and thus is not a worry.

Once the CFG is created, it must be flattened and then final emission of
bytecode occurs. Flattening is handled using a post-order depth-first
search. Once flattened, jump offsets are backpatched based on the
flattening and then a ``PyCodeObject`` is created. All of this is
handled by calling ``assemble()``.


Code objects
============

The result of ``_PyAST_Compile()`` is a ``PyCodeObject``, which is defined in
:cpy-file:`Include/cpython/code.h`. And with that you now have executable
Python bytecode!

The code objects (bytecode) are executed in :cpy-file:`Python/ceval.c`. This file
will also need a new case statement for the new opcode in the big switch
statement in ``_PyEval_EvalFrameDefault()``.


Important files
===============

* :cpy-file:`Parser/`

  * :cpy-file:`Parser/Python.asdl`: ASDL syntax file.

  * :cpy-file:`Parser/asdl.py`: Parser for ASDL definition files.
    Reads in an ASDL description and parses it into an AST that describes it.

  * :cpy-file:`Parser/asdl_c.py`: Generate C code from an ASDL description.
    Generates :cpy-file:`Python/Python-ast.c` and
    :cpy-file:`Include/internal/pycore_ast.h`.

  * :cpy-file:`Parser/parser.c`: The new PEG parser introduced in Python 3.9.
- Generated by :cpy-file:`Tools/peg_generator/pegen/c_generator.py` - from the grammar :cpy-file:`Grammar/python.gram`. Creates the AST from - source code. Rule functions for their corresponding production rules - are found here. - - * :cpy-file:`Parser/peg_api.c`: Contains high-level functions which are - used by the interpreter to create an AST from source code. - - * :cpy-file:`Parser/pegen.c`: Contains helper functions which are used - by functions in :cpy-file:`Parser/parser.c` to construct the AST. - Also contains helper functions which help raise better error messages - when parsing source code. - - * :cpy-file:`Parser/pegen.h`: Header file for the corresponding - :cpy-file:`Parser/pegen.c`. Also contains definitions of the ``Parser`` - and ``Token`` structs. - -* :cpy-file:`Python/` - - * :cpy-file:`Python/Python-ast.c`: Creates C structs corresponding to - the ASDL types. Also contains code for marshalling AST nodes (core - ASDL types have marshalling code in :cpy-file:`Python/asdl.c`). - "File automatically generated by :cpy-file:`Parser/asdl_c.py`". - This file must be committed separately after every grammar change - is committed since the ``__version__`` value is set to the latest - grammar change revision number. - - * :cpy-file:`Python/asdl.c`: Contains code to handle the ASDL sequence type. - Also has code to handle marshalling the core ASDL types, such as number - and identifier. Used by :cpy-file:`Python/Python-ast.c` for marshalling - AST nodes. - - * :cpy-file:`Python/ast.c`: Used for validating the AST. - - * :cpy-file:`Python/ast_opt.c`: Optimizes the AST. - - * :cpy-file:`Python/ast_unparse.c`: Converts the AST expression node - back into a string (for string annotations). - - * :cpy-file:`Python/ceval.c`: Executes byte code (aka, eval loop). - - * :cpy-file:`Python/compile.c`: Emits bytecode based on the AST. - - * :cpy-file:`Python/symtable.c`: Generates a symbol table from AST. 

  * :cpy-file:`Python/pyarena.c`: Implementation of the arena memory manager.

  * :cpy-file:`Python/opcode_targets.h`: One of the files that must be
    modified if :cpy-file:`Lib/opcode.py` is.

* :cpy-file:`Include/`

  * :cpy-file:`Include/cpython/code.h`: Header file for
    :cpy-file:`Objects/codeobject.c`; contains definition of ``PyCodeObject``.

  * :cpy-file:`Include/opcode.h`: One of the files that must be modified if
    :cpy-file:`Lib/opcode.py` is.

  * :cpy-file:`Include/internal/pycore_ast.h`: Contains the actual definitions
    of the C structs as generated by :cpy-file:`Python/Python-ast.c`, and
    declares ``_PyAST_Validate()`` (defined in :cpy-file:`Python/ast.c`).
    "Automatically generated by :cpy-file:`Parser/asdl_c.py`".

  * :cpy-file:`Include/internal/pycore_asdl.h`: Header for the corresponding
    :cpy-file:`Python/asdl.c`.

  * :cpy-file:`Include/internal/pycore_symtable.h`: Header for
    :cpy-file:`Python/symtable.c`. ``struct symtable`` and ``PySTEntryObject``
    are defined here.

  * :cpy-file:`Include/internal/pycore_parser.h`: Header for the
    corresponding :cpy-file:`Parser/peg_api.c`.

  * :cpy-file:`Include/internal/pycore_pyarena.h`: Header file for the
    corresponding :cpy-file:`Python/pyarena.c`.

* :cpy-file:`Objects/`

  * :cpy-file:`Objects/codeobject.c`: Contains PyCodeObject-related code
    (originally in :cpy-file:`Python/compile.c`).

  * :cpy-file:`Objects/frameobject.c`: Contains the ``frame_setlineno()``
    function, which determines whether it is allowed to make a jump
    between two points in the bytecode.

* :cpy-file:`Lib/`

  * :cpy-file:`Lib/opcode.py`: Master list of bytecode; if this file is
    modified you must modify several other files accordingly.

  * :cpy-file:`Lib/importlib/_bootstrap_external.py`: Home of the magic number
    (named ``MAGIC_NUMBER``) for bytecode versioning.
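The ``PyCodeObject`` produced by the pipeline described above surfaces in Python as the ``code`` type, so its tables can be inspected directly:

```python
src = "answer = 42"
code = compile(src, "<demo>", "exec")

print(type(code).__name__)  # 'code' -- a PyCodeObject at the C level
print(code.co_names)        # ('answer',) -- names stored or looked up
print(code.co_consts)       # contains 42, plus the implicit None return value
```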


Objects
=======

* :cpy-file:`Objects/locations.md`: Describes the location table
* :cpy-file:`Objects/frame_layout.md`: Describes the frame stack
* :cpy-file:`Objects/object_layout.md`: Describes object layout for 3.11 and later
* :cpy-file:`Objects/exception_handling_notes.txt`: Exception handling notes


Specializing Adaptive Interpreter
=================================

Adding a specializing, adaptive interpreter to CPython will bring significant
performance improvements. These documents provide more information:

* :pep:`659`: Specializing Adaptive Interpreter
* :cpy-file:`Python/adaptive.md`: Adding or extending a family of adaptive instructions


References
==========

.. [Wang97] Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Chris
   S. Serra. `The Zephyr Abstract Syntax Description Language.`_
   In Proceedings of the Conference on Domain-Specific Languages, pp.
   213--227, 1997.

.. _The Zephyr Abstract Syntax Description Language.:
   https://www.cs.princeton.edu/research/techreps/TR-554-97

This document is now part of the
`CPython Internals Docs <https://github.com/python/cpython/blob/main/InternalDocs/compiler.md>`_.
diff --git a/internals/exploring.rst b/internals/exploring.rst
index 8f4a565fb0..0ae8337e8c 100644
--- a/internals/exploring.rst
+++ b/internals/exploring.rst
@@ -60,7 +60,7 @@ building your understanding of CPython internals and its evolution:
    "`Green Tree Snakes`_", "The missing Python AST docs", Thomas Kluyver, 3.6
    "`Yet another guided tour of CPython`_", "A guide for how CPython REPL works", Guido van Rossum, 3.5
    "`Python Asynchronous I/O Walkthrough`_", "How CPython async I/O, generator and coroutine works", Philip Guo, 3.5
-   "`Coding Patterns for Python Extensions`_", "Reliable patterns of coding Python Extensions in C", Paul Ross, 3.4
+   "`Coding Patterns for Python Extensions`_", "Reliable patterns of coding Python Extensions in C", Paul Ross, 3.9+
    "`Your Guide to the CPython Source Code`_", "Your Guide to the CPython Source Code", Anthony Shaw, 3.8

 .. csv-table:: **Historical references**
diff --git a/internals/garbage-collector.rst b/internals/garbage-collector.rst
index 7459b23e8d..acbcedf0e8 100644
--- a/internals/garbage-collector.rst
+++ b/internals/garbage-collector.rst
@@ -8,597 +8,5 @@ Garbage collector design

 .. highlight:: none

Abstract
========

The main garbage collection algorithm used by CPython is reference counting. The basic idea is
that CPython counts how many different places there are that have a reference to an
object. Such a place could be another object, or a global (or static) C variable, or
a local variable in some C function. When an object's reference count becomes zero,
the object is deallocated. If it contains references to other objects, their
reference counts are decremented. Those other objects may be deallocated in turn, if
this decrement makes their reference count become zero, and so on. The reference
count field can be examined using the ``sys.getrefcount`` function (notice that the
value returned by this function is always 1 more than you might expect, because the
function itself also holds a reference to the object while it is called):

..
code-block:: python - - >>> x = object() - >>> sys.getrefcount(x) - 2 - >>> y = x - >>> sys.getrefcount(x) - 3 - >>> del y - >>> sys.getrefcount(x) - 2 - -The main problem with the reference counting scheme is that it does not handle reference -cycles. For instance, consider this code: - -.. code-block:: python - - >>> container = [] - >>> container.append(container) - >>> sys.getrefcount(container) - 3 - >>> del container - -In this example, ``container`` holds a reference to itself, so even when we remove -our reference to it (the variable "container") the reference count never falls to 0 -because it still has its own internal reference. Therefore it would never be -cleaned just by simple reference counting. For this reason some additional machinery -is needed to clean these reference cycles between objects once they become -unreachable. This is the cyclic garbage collector, usually called just Garbage -Collector (GC), even though reference counting is also a form of garbage collection. - -Starting in version 3.13, CPython contains two GC implementations: - -* The default build implementation relies on the :term:`global interpreter - lock` for thread safety. -* The free-threaded build implementation pauses other executing threads when - performing a collection for thread safety. - -Both implementations use the same basic algorithms, but operate on different -data structures. The :ref:`gc-differences` section summarizes the -differences between the two GC implementations. - - -Memory layout and object structure -================================== - -The garbage collector requires additional fields in Python objects to support -garbage collection. These extra fields are different in the default and the -free-threaded builds. - - -GC for the default build ------------------------- - -Normally the C structure supporting a regular Python object looks as follows: - -.. 
code-block:: none - - object -----> +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \ - | ob_refcnt | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD - | *ob_type | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ / - | ... | - - -In order to support the garbage collector, the memory layout of objects is altered -to accommodate extra information **before** the normal layout: - -.. code-block:: none - - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \ - | *_gc_next | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyGC_Head - | *_gc_prev | | - object -----> +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ / - | ob_refcnt | \ - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD - | *ob_type | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ / - | ... | - - -In this way the object can be treated as a normal python object and when the extra -information associated to the GC is needed the previous fields can be accessed by a -simple type cast from the original object: :code:`((PyGC_Head *)(the_object)-1)`. - -As is explained later in the `Optimization: reusing fields to save memory`_ section, -these two extra fields are normally used to keep doubly linked lists of all the -objects tracked by the garbage collector (these lists are the GC generations, more on -that in the `Optimization: generations`_ section), but they are also -reused to fulfill other purposes when the full doubly linked list structure is not -needed as a memory optimization. - -Doubly linked lists are used because they efficiently support most frequently required operations. In -general, the collection of all objects tracked by GC are partitioned into disjoint sets, each in its own -doubly linked list. Between collections, objects are partitioned into "generations", reflecting how -often they've survived collection attempts. 
During collections, the generation(s) being collected -are further partitioned into, e.g., sets of reachable and unreachable objects. Doubly linked lists -support moving an object from one partition to another, adding a new object, removing an object -entirely (objects tracked by GC are most often reclaimed by the refcounting system when GC -isn't running at all!), and merging partitions, all with a small constant number of pointer updates. -With care, they also support iterating over a partition while objects are being added to - and -removed from - it, which is frequently required while GC is running. - -GC for the free-threaded build ------------------------------- - -In the free-threaded build, Python objects contain a 1-byte field -``ob_gc_bits`` that is used to track garbage collection related state. The -field exists in all objects, including ones that do not support cyclic -garbage collection. The field is used to identify objects that are tracked -by the collector, ensure that finalizers are called only once per object, -and, during garbage collection, differentiate reachable vs. unreachable objects. - -.. code-block:: none - - object -----> +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \ - | ob_tid | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | - | pad | ob_mutex | ob_gc_bits | ob_ref_local | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD - | ob_ref_shared | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | - | *ob_type | | - +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ / - | ... | - - -Note that not all fields are to scale. ``pad`` is two bytes, ``ob_mutex`` and -``ob_gc_bits`` are each one byte, and ``ob_ref_local`` is four bytes. The -other fields, ``ob_tid``, ``ob_ref_shared``, and ``ob_type``, are all -pointer-sized (i.e., eight bytes on a 64-bit platform). 
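Whether a particular object is currently linked into the collector's bookkeeping at all can be checked from pure Python with ``gc.is_tracked()``. The results below are for typical CPython builds; some types (such as dicts) may be tracked lazily:

```python
import gc

class Widget:
    pass

print(gc.is_tracked(42))        # False: an int cannot reference other objects
print(gc.is_tracked("text"))    # False: strings are atomic too
print(gc.is_tracked([]))        # True: lists are container objects
print(gc.is_tracked(Widget()))  # True: instances can participate in cycles
```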


The garbage collector also temporarily repurposes the ``ob_tid`` (thread ID)
and ``ob_ref_local`` (local reference count) fields for other purposes during
collections.


C APIs
------

Specific APIs are offered to allocate, deallocate, initialize, track, and untrack
objects with GC support. These APIs can be found in the `Garbage Collector C API
documentation <https://docs.python.org/3.8/c-api/gcsupport.html>`_.

Apart from this object structure, the type object for objects supporting garbage
collection must include the ``Py_TPFLAGS_HAVE_GC`` flag in its ``tp_flags`` slot and
provide an implementation of the ``tp_traverse`` handler. Unless it can be proven
that the objects cannot form reference cycles with only objects of their type, or unless
the type is immutable, a ``tp_clear`` implementation must also be provided.


Identifying reference cycles
============================

The algorithm that CPython uses to detect those reference cycles is
implemented in the ``gc`` module. The garbage collector **only focuses**
on cleaning container objects (i.e. objects that can contain a reference
to one or more objects). These can be arrays, dictionaries, lists, custom
class instances, classes in extension modules, etc. One could think that
cycles are uncommon but the truth is that many internal references needed by
the interpreter create cycles everywhere. Some notable examples:

* Exceptions contain traceback objects that contain a list of frames that
  contain the exception itself.
* Module-level functions reference the module's dict (which is needed to resolve globals),
  which in turn contains entries for the module-level functions.
* Instances have references to their class which itself references its module, and the module
  contains references to everything that is inside (and maybe other modules)
  and this can lead back to the original instance.
* When representing data structures like graphs, it is very typical for them to
  have internal links to themselves.

To correctly dispose of these objects once they become unreachable, they need
to be identified first. To understand how the algorithm works, let's take
the case of a circular linked list which has one link referenced by a
variable ``A``, and one self-referencing object which is completely
unreachable:

.. code-block:: python

   >>> import gc

   >>> class Link:
   ...    def __init__(self, next_link=None):
   ...        self.next_link = next_link

   >>> link_3 = Link()
   >>> link_2 = Link(link_3)
   >>> link_1 = Link(link_2)
   >>> link_3.next_link = link_1
   >>> A = link_1
   >>> del link_1, link_2, link_3

   >>> link_4 = Link()
   >>> link_4.next_link = link_4
   >>> del link_4

   # Collect the unreachable Link object (and its .__dict__ dict).
   >>> gc.collect()
   2

The GC starts with a set of candidate objects it wants to scan. In the
default build, these "objects to scan" might be all container objects or a
smaller subset (or "generation"). In the free-threaded build, the collector
always scans all container objects.

The objective is to identify all the unreachable objects. The collector does
this by identifying reachable objects; the remaining objects must be
unreachable. The first step is to identify all of the "to scan" objects that
are **directly** reachable from outside the set of candidate objects. These
objects have a refcount larger than the number of incoming references from
within the candidate set.

Every object that supports garbage collection will have an extra reference
count field initialized to the reference count (``gc_ref`` in the figures)
of that object when the algorithm starts. This is because the algorithm needs
to modify the reference count to do the computations and in this way the
interpreter will not modify the real reference count field.

..
figure:: /_static/python-cyclic-gc-1-new-page.png

The GC then iterates over all containers in the first list and decrements by one the
``gc_ref`` field of any other object that container is referencing. Doing
this makes use of the ``tp_traverse`` slot in the container class (implemented
using the C API or inherited by a superclass) to know what objects are referenced by
each container. After all the objects have been scanned, only the objects that have
references from outside the "objects to scan" list will have ``gc_ref > 0``.

.. figure:: /_static/python-cyclic-gc-2-new-page.png

Notice that having ``gc_ref == 0`` does not imply that the object is unreachable.
This is because another object that is reachable from the outside (``gc_ref > 0``)
can still have references to it. For instance, the ``link_2`` object in our example
ended up with ``gc_ref == 0`` but is still referenced by the ``link_1`` object, which
is reachable from the outside. To obtain the set of objects that are really
unreachable, the garbage collector re-scans the container objects using the
``tp_traverse`` slot; this time with a different traverse function that marks objects with
``gc_ref == 0`` as "tentatively unreachable" and then moves them to the
tentatively unreachable list. The following image depicts the state of the lists at a
moment when the GC has processed the ``link_3`` and ``link_4`` objects but has not
processed ``link_1`` and ``link_2`` yet.

.. figure:: /_static/python-cyclic-gc-3-new-page.png

Then the GC scans the next ``link_1`` object. Because it has ``gc_ref == 1``,
the GC does not do anything special because it knows it has to be reachable (and is
already in what will become the reachable list):

..
figure:: /_static/python-cyclic-gc-4-new-page.png
-
-When the GC encounters an object which is reachable (``gc_ref > 0``), it traverses
-its references using the ``tp_traverse`` slot to find all the objects that are
-reachable from it, moving them to the end of the list of reachable objects (where
-they started originally) and setting their ``gc_ref`` fields to 1. This is what happens
-to ``link_2`` and ``link_3`` below as they are reachable from ``link_1``. From the
-state in the previous image and after examining the objects referred to by ``link_1``
-the GC knows that ``link_3`` is reachable after all, so it is moved back to the
-original list and its ``gc_ref`` field is set to 1 so that if the GC visits it again,
-it will know that it's reachable. To avoid visiting an object twice, the GC marks all
-objects that have already been visited once (by unsetting the ``PREV_MASK_COLLECTING``
-flag) so that if an object that has already been processed is referenced by some other
-object, the GC does not process it twice.
-
-.. figure:: /_static/python-cyclic-gc-5-new-page.png
-
-Notice that an object that was marked as "tentatively unreachable" and was later
-moved back to the reachable list will be visited again by the garbage collector
-because all the references it holds now need to be processed as well. This
-process is really a breadth-first search over the object graph. Once all the objects
-are scanned, the GC knows that all container objects in the tentatively unreachable
-list are really unreachable and can thus be garbage collected.
-
-Pragmatically, it's important to note that no recursion is required by any of this,
-and neither does it in any other way require additional memory proportional to the
-number of objects, number of pointers, or the lengths of pointer chains. Apart from
-``O(1)`` storage for internal C needs, the objects themselves contain all the storage
-the GC algorithms require.
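
The two phases just described (subtract internal references, then a breadth-first
sweep from the externally reachable roots) can be modeled in a few lines of Python.
This is a toy sketch of the idea, not CPython's implementation: objects are plain
names, the ``refs`` mapping stands in for ``tp_traverse``, and ``external`` plays
the role of references coming from outside the candidate set (all three names are
made up for this illustration).

```python
from collections import deque

def find_unreachable(refs, external):
    """Toy model of the cycle-detection algorithm described above."""
    # gc_ref starts as a copy of the real reference count: references
    # from outside plus references from other candidate objects.
    gc_ref = {o: external.get(o, 0) for o in refs}
    for o in refs:
        for target in refs[o]:
            gc_ref[target] += 1

    # Step 1: traverse every container and subtract the internal
    # references; what remains counts only references from outside.
    for o in refs:
        for target in refs[o]:
            gc_ref[target] -= 1

    # Step 2: objects with gc_ref > 0 are directly reachable from the
    # outside; everything they can reach (a breadth-first search, as in
    # the text) is reachable too. Whatever is left is cyclic garbage.
    reachable = {o for o in refs if gc_ref[o] > 0}
    queue = deque(reachable)
    while queue:
        o = queue.popleft()
        for target in refs[o]:
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    return set(refs) - reachable
```

Run on the linked-list example from the text (one external reference to
``link_1``, and ``link_4`` referencing itself), it reports ``link_4`` as the only
unreachable object, just as in the example.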
-
-Why moving unreachable objects is better
-----------------------------------------
-
-Moving the unreachable objects sounds logical under the premise that most objects
-are usually reachable, but the reason it pays off isn't obvious until you think
-it through.
-
-Suppose we create objects A, B, C in that order. They appear in the young generation
-in the same order. If B points to A, and C to B, and C is reachable from outside,
-then the adjusted refcounts after the first step of the algorithm runs will be 0, 0,
-and 1 respectively, because the only object reachable from the outside is C.
-
-When the next step of the algorithm finds A, A is moved to the unreachable list. The
-same happens to B when it is first encountered. Then C is traversed, and B is moved
-*back* to the reachable list. B is eventually traversed, and then A is moved back to
-the reachable list.
-
-So instead of not moving at all, the reachable objects B and A are each moved twice.
-Why is this a win? A straightforward algorithm to move the reachable objects instead
-would move A, B, and C once each. The key is that this dance leaves the objects in
-order C, B, A, reversed from the original order. On all *subsequent* scans,
-none of them will move. Since most objects aren't in cycles, this can save an
-unbounded number of moves across an unbounded number of later collections. The only
-time the cost can be higher is the first time the chain is scanned.
-
-Destroying unreachable objects
-==============================
-
-Once the GC knows the list of unreachable objects, a very delicate process starts
-with the objective of completely destroying these objects. Roughly, the process
-follows these steps in order:
-
-1. Handle and clear weak references (if any). Weak references to unreachable objects
-   are set to ``None``. If the weak reference has an associated callback, the callback
-   is enqueued to be called once the clearing of weak references is finished. 
We only - invoke callbacks for weak references that are themselves reachable. If both the weak - reference and the pointed-to object are unreachable we do not execute the callback. - This is partly for historical reasons: the callback could resurrect an unreachable - object and support for weak references predates support for object resurrection. - Ignoring the weak reference's callback is fine because both the object and the weakref - are going away, so it's legitimate to say the weak reference is going away first. -2. If an object has legacy finalizers (``tp_del`` slot) move it to the - ``gc.garbage`` list. -3. Call the finalizers (``tp_finalize`` slot) and mark the objects as already - finalized to avoid calling finalizers twice if the objects are resurrected or - if other finalizers have removed the object first. -4. Deal with resurrected objects. If some objects have been resurrected, the GC - finds the new subset of objects that are still unreachable by running the cycle - detection algorithm again and continues with them. -5. Call the ``tp_clear`` slot of every object so all internal links are broken and - the reference counts fall to 0, triggering the destruction of all unreachable - objects. - -Optimization: generations -========================= - -In order to limit the time each garbage collection takes, the GC -implementation for the default build uses a popular optimization: -generations. The main idea behind this concept is the assumption that most -objects have a very short lifespan and can thus be collected soon after their -creation. This has proven to be very close to the reality of many Python -programs as many temporary objects are created and destroyed very quickly. - -To take advantage of this fact, all container objects are segregated into -three spaces/generations. Every new -object starts in the first generation (generation 0). 
The previous algorithm is -executed only over the objects of a particular generation and if an object -survives a collection of its generation it will be moved to the next one -(generation 1), where it will be surveyed for collection less often. If -the same object survives another GC round in this new generation (generation 1) -it will be moved to the last generation (generation 2) where it will be -surveyed the least often. - -The GC implementation for the free-threaded build does not use multiple -generations. Every collection operates on the entire heap. - -In order to decide when to run, the collector keeps track of the number of object -allocations and deallocations since the last collection. When the number of -allocations minus the number of deallocations exceeds ``threshold_0``, -collection starts. Initially only generation 0 is examined. If generation 0 has -been examined more than ``threshold_1`` times since generation 1 has been -examined, then generation 1 is examined as well. With generation 2, -things are a bit more complicated; see :ref:`gc-oldest-generation` for -more information. These thresholds can be examined using the -:func:`gc.get_threshold` function: - -.. code-block:: python - - >>> import gc - >>> gc.get_threshold() - (700, 10, 10) - - -The content of these generations can be examined using the -``gc.get_objects(generation=NUM)`` function and collections can be triggered -specifically in a generation by calling ``gc.collect(generation=NUM)``. - -.. code-block:: python - - >>> import gc - >>> class MyObj: - ... pass - ... - - # Move everything to the last generation so it's easier to inspect - # the younger generations. - - >>> gc.collect() - 0 - - # Create a reference cycle. - - >>> x = MyObj() - >>> x.self = x - - # Initially the object is in the youngest generation. - - >>> gc.get_objects(generation=0) - [..., <__main__.MyObj object at 0x7fbcc12a3400>, ...] 
- - # After a collection of the youngest generation the object - # moves to the next generation. - - >>> gc.collect(generation=0) - 0 - >>> gc.get_objects(generation=0) - [] - >>> gc.get_objects(generation=1) - [..., <__main__.MyObj object at 0x7fbcc12a3400>, ...] - - -.. _gc-oldest-generation: - -Collecting the oldest generation --------------------------------- - -In addition to the various configurable thresholds, the GC only triggers a full -collection of the oldest generation if the ratio ``long_lived_pending / long_lived_total`` -is above a given value (hardwired to 25%). The reason is that, while "non-full" -collections (i.e., collections of the young and middle generations) will always -examine roughly the same number of objects (determined by the aforementioned -thresholds) the cost of a full collection is proportional to the total -number of long-lived objects, which is virtually unbounded. Indeed, it has -been remarked that doing a full collection every <constant number> of object -creations entails a dramatic performance degradation in workloads which consist -of creating and storing lots of long-lived objects (e.g. building a large list -of GC-tracked objects would show quadratic performance, instead of linear as -expected). Using the above ratio, instead, yields amortized linear performance -in the total number of objects (the effect of which can be summarized thusly: -"each full garbage collection is more and more costly as the number of objects -grows, but we do fewer and fewer of them"). - -Optimization: reusing fields to save memory -=========================================== - -In order to save memory, the two linked list pointers in every object with GC -support are reused for several purposes. 
This is a common optimization known
-as "fat pointers" or "tagged pointers": pointers that carry additional data,
-"folded" into the pointer, meaning stored inline in the data representing the
-address, taking advantage of certain properties of memory addressing. This is
-possible because most architectures align certain types of data
-to the size of the data, often a word or multiple thereof. This discrepancy
-leaves a few of the least significant bits of the pointer unused, which can be
-used for tags or to keep other information – most often as a bit field (each
-bit a separate tag) – as long as code that uses the pointer masks out these
-bits before accessing memory. For example, on a 32-bit architecture (for both
-addresses and word size), a word is 32 bits = 4 bytes, so word-aligned
-addresses are always a multiple of 4, hence end in ``00``, leaving the last 2 bits
-available; while on a 64-bit architecture, a word is 64 bits = 8 bytes, so
-word-aligned addresses end in ``000``, leaving the last 3 bits available.
-
-The CPython GC makes use of two fat pointers that correspond to the extra fields
-of ``PyGC_Head`` discussed in the `Memory layout and object structure`_ section:
-
-.. warning::
-
-   Because of the presence of extra information, "tagged" or "fat" pointers cannot be
-   dereferenced directly and the extra information must be stripped off before
-   obtaining the real memory address. Special care needs to be taken with
-   functions that directly manipulate the linked lists, as these functions
-   normally assume the pointers inside the lists are in a consistent state.
-
-
-* The ``_gc_prev`` field is normally used as the "previous" pointer to maintain the
-  doubly linked list but its lowest two bits are used to keep the flags
-  ``PREV_MASK_COLLECTING`` and ``_PyGC_PREV_MASK_FINALIZED``. Between collections,
-  the only flag that can be present is ``_PyGC_PREV_MASK_FINALIZED``, which indicates
-  whether an object has already been finalized. 
During collections ``_gc_prev`` is - temporarily used for storing a copy of the reference count (``gc_ref``), in - addition to two flags, and the GC linked list becomes a singly linked list until - ``_gc_prev`` is restored. - -* The ``_gc_next`` field is used as the "next" pointer to maintain the doubly linked - list but during collection its lowest bit is used to keep the - ``NEXT_MASK_UNREACHABLE`` flag that indicates if an object is tentatively - unreachable during the cycle detection algorithm. This is a drawback to using only - doubly linked lists to implement partitions: while most needed operations are - constant-time, there is no efficient way to determine which partition an object is - currently in. Instead, when that's needed, ad hoc tricks (like the - ``NEXT_MASK_UNREACHABLE`` flag) are employed. - -Optimization: delay tracking containers -======================================= - -Certain types of containers cannot participate in a reference cycle, and so do -not need to be tracked by the garbage collector. Untracking these objects -reduces the cost of garbage collection. However, determining which objects may -be untracked is not free, and the costs must be weighed against the benefits -for garbage collection. There are two possible strategies for when to untrack -a container: - -1. When the container is created. -2. When the container is examined by the garbage collector. - -As a general rule, instances of atomic types aren't tracked and instances of -non-atomic types (containers, user-defined objects...) are. However, some -type-specific optimizations can be present in order to suppress the garbage -collector footprint of simple instances. Some examples of native types that -benefit from delayed tracking: - -* Tuples containing only immutable objects (integers, strings etc, - and recursively, tuples of immutable objects) do not need to be tracked. 
The - interpreter creates a large number of tuples, many of which will not survive - until garbage collection. It is therefore not worthwhile to untrack eligible - tuples at creation time. Instead, all tuples except the empty tuple are tracked - when created. During garbage collection it is determined whether any surviving - tuples can be untracked. A tuple can be untracked if all of its contents are - already not tracked. Tuples are examined for untracking in all garbage collection - cycles. It may take more than one cycle to untrack a tuple. - -* Dictionaries containing only immutable objects also do not need to be tracked. - Dictionaries are untracked when created. If a tracked item is inserted into a - dictionary (either as a key or value), the dictionary becomes tracked. During a - full garbage collection (all generations), the collector will untrack any dictionaries - whose contents are not tracked. - -The garbage collector module provides the Python function ``is_tracked(obj)``, which returns -the current tracking status of the object. Subsequent garbage collections may change the -tracking status of the object. - -.. code-block:: python - - >>> gc.is_tracked(0) - False - >>> gc.is_tracked("a") - False - >>> gc.is_tracked([]) - True - >>> gc.is_tracked({}) - False - >>> gc.is_tracked({"a": 1}) - False - >>> gc.is_tracked({"a": []}) - True - - -.. _gc-differences: - -Differences between GC implementations -====================================== - -This section summarizes the differences between the GC implementation in the -default build and the implementation in the free-threaded build. - -The default build implementation makes extensive use of the ``PyGC_Head`` data -structure, while the free-threaded build implementation does not use that -data structure. - -* The default build implementation stores all tracked objects in a doubly - linked list using ``PyGC_Head``. 
The free-threaded build implementation
-  instead relies on the embedded mimalloc memory allocator to scan the heap
-  for tracked objects.
-* The default build implementation uses ``PyGC_Head`` for the unreachable
-  object list. The free-threaded build implementation repurposes the
-  ``ob_tid`` field to store a linked list of unreachable objects.
-* The default build implementation stores flags in the ``_gc_prev`` field of
-  ``PyGC_Head``. The free-threaded build implementation stores these flags
-  in ``ob_gc_bits``.
-
-
-The default build implementation relies on the :term:`global interpreter lock`
-for thread safety. The free-threaded build implementation has two "stop the
-world" pauses, in which all other executing threads are temporarily paused so
-that the GC can safely access reference counts and object attributes.
-
-The default build implementation is a generational collector. The
-free-threaded build is non-generational; each collection scans the entire
-heap.
-
-* Keeping track of object generations is simple and inexpensive in the default
-  build. The free-threaded build relies on mimalloc for finding tracked
-  objects; identifying "young" objects without scanning the entire heap would
-  be more difficult.
-
-
-.. admonition:: Document History
-   :class: note
-
-   Pablo Galindo Salgado - Original Author
+This document is now part of the
+`CPython Internals Docs <https://github.com/python/cpython/blob/main/InternalDocs/garbage_collector.md>`_.
diff --git a/internals/index.rst b/internals/index.rst
index 999e427b2d..05723f4824 100644
--- a/internals/index.rst
+++ b/internals/index.rst
@@ -1,3 +1,5 @@
+.. _internals:
+
 ===================
 CPython's internals
 ===================
diff --git a/internals/interpreter.rst b/internals/interpreter.rst
index a53b6283cd..a7ae39c120 100644
--- a/internals/interpreter.rst
+++ b/internals/interpreter.rst
@@ -1,373 +1,8 @@
 .. 
_interpreter: -=============================== -The bytecode interpreter (3.11) -=============================== - -.. highlight:: c - -Preface -======= - -The CPython 3.11 bytecode interpreter (a.k.a. virtual machine) has a number of improvements over 3.10. -We describe the inner workings of the 3.11 interpreter here, with an emphasis on understanding not just the code but its design. -While the interpreter is forever evolving, and the 3.12 design will undoubtedly be different again, knowing the 3.11 design will help you understand future improvements to the interpreter. - -Other sources -------------- - -* Brandt Bucher's talk about the specializing interpreter at PyCon US 2023. - `Slides <https://github.com/brandtbucher/brandtbucher/blob/master/2023/04/21/inside_cpython_311s_new_specializing_adaptive_interpreter.pdf>`_ - `Video <https://www.youtube.com/watch?v=PGZPSWZSkJI&t=1470s>`_ - -Introduction -============ - -The job of the bytecode interpreter, in :cpy-file:`Python/ceval.c`, is to execute Python code. -Its main input is a code object, although this is not a direct argument to the interpreter. -The interpreter is structured as a (recursive) function taking a thread state (``tstate``) and a stack frame (``frame``). -The function also takes an integer ``throwflag``, which is used by the implementation of ``generator.throw``. -It returns a new reference to a Python object (``PyObject *``) or an error indicator, ``NULL``. -Per :pep:`523`, this function is configurable by setting ``interp->eval_frame``; we describe only the default function, ``_PyEval_EvalFrameDefault()``. -(This function's signature has evolved and no longer matches what PEP 523 specifies; the thread state argument is added and the stack frame argument is no longer an object.) - -The interpreter finds the code object by looking in the stack frame (``frame->f_code``). -Various other items needed by the interpreter (e.g. globals and builtins) are also accessed via the stack frame. 
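
The same structure is visible from pure Python, where frame objects expose the
code object and the namespaces the interpreter needs. A small illustration
(``peek`` is a hypothetical helper name, not a CPython function):

```python
import sys

def peek():
    # sys._getframe() returns the frame currently being executed; the
    # C-level interpreter reaches the same data through its frame argument.
    frame = sys._getframe()
    return (frame.f_code.co_name,          # the code object's name (f_code)
            frame.f_globals is globals(),  # the globals namespace
            "len" in frame.f_builtins)     # the builtins namespace

name, sees_module_globals, sees_builtins = peek()
# name == "peek"; both namespace checks are True
```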
-The thread state stores exception information and a variety of other information, such as the recursion depth. -The thread state is also used to access per-interpreter state (``tstate->interp``) and per-runtime (i.e., truly global) state (``tstate->interp->runtime``). - -Note the slightly confusing terminology here. -"Interpreter" refers to the bytecode interpreter, a recursive function. -"Interpreter state" refers to state shared by threads, each of which may be running its own bytecode interpreter. -A single process may even host multiple interpreters, each with their own interpreter state, but sharing runtime state. -The topic of multiple interpreters is covered by several PEPs, notably :pep:`684`, :pep:`630`, and :pep:`554` (with more coming). -The current document focuses on the bytecode interpreter. - -Code objects -============ - -The interpreter uses a code object (``frame->f_code``) as its starting point. -Code objects contain many fields used by the interpreter, as well as some for use by debuggers and other tools. -In 3.11, the final field of a code object is an array of indeterminate length containing the bytecode, ``code->co_code_adaptive``. -(In previous versions the code object was a :class:`bytes` object, ``code->co_code``; it was changed to save an allocation and to allow it to be mutated.) - -Code objects are typically produced by the bytecode :ref:`compiler <compiler>`, although they are often written to disk by one process and read back in by another. -The disk version of a code object is serialized using the :mod:`marshal` protocol. -Some code objects are pre-loaded into the interpreter using ``Tools/scripts/deepfreeze.py``, which writes ``Python/deepfreeze/deepfreeze.c``. - -Code objects are nominally immutable. -Some fields (including ``co_code_adaptive``) are mutable, but mutable fields are not included when code objects are hashed or compared. 
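
A quick way to see some of this from Python (the stable Python-level attributes,
not the C struct layout; ``f`` is just a throwaway example function):

```python
def f(x):
    return x + 1

code = f.__code__        # the code object produced by the bytecode compiler

raw = code.co_code       # the bytecode, materialized as an immutable bytes object
consts = code.co_consts  # constants referenced by the bytecode (contains 1)

# Code objects can be hashed and used as dict keys precisely because the
# mutable fields are excluded from hashing and comparison.
by_code = {code: "seen"}
```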
- -Instruction decoding -==================== - -The first task of the interpreter is to decode the bytecode instructions. -Bytecode is stored as an array of 16-bit code units (``_Py_CODEUNIT``). -Each code unit contains an 8-bit ``opcode`` and an 8-bit argument (``oparg``), both unsigned. -In order to make the bytecode format independent of the machine byte order when stored on disk, ``opcode`` is always the first byte and ``oparg`` is always the second byte. -Macros are used to extract the ``opcode`` and ``oparg`` from a code unit (``_Py_OPCODE(word)`` and ``_Py_OPARG(word)``). -Some instructions (e.g. ``NOP`` or ``POP_TOP``) have no argument -- in this case we ignore ``oparg``. - -A simple instruction decoding loop would look like this: - -.. code-block:: c - - _Py_CODEUNIT *first_instr = code->co_code_adaptive; - _Py_CODEUNIT *next_instr = first_instr; - while (1) { - _Py_CODEUNIT word = *next_instr++; - unsigned char opcode = _Py_OPCODE(word); - unsigned int oparg = _Py_OPARG(word); - switch (opcode) { - // ... A case for each opcode ... - } - } - -This format supports 256 different opcodes, which is sufficient. -However, it also limits ``oparg`` to 8-bit values, which is not. -To overcome this, the ``EXTENDED_ARG`` opcode allows us to prefix any instruction with one or more additional data bytes. -For example, this sequence of code units:: - - EXTENDED_ARG 1 - EXTENDED_ARG 0 - LOAD_CONST 2 - -would set ``opcode`` to ``LOAD_CONST`` and ``oparg`` to ``65538`` (i.e., ``0x1_00_02``). -The compiler should limit itself to at most three ``EXTENDED_ARG`` prefixes, to allow the resulting ``oparg`` to fit in 32 bits, but the interpreter does not check this. -A series of code units starting with zero to three ``EXTENDED_ARG`` opcodes followed by a primary opcode is called a complete instruction, to distinguish it from a single code unit, which is always two bytes. 
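
The accumulation rule for ``EXTENDED_ARG`` can be cross-checked with a small
Python model. This is a sketch only: opcodes are represented as strings, and
``decode_complete_instruction`` is a made-up helper, not a CPython function.

```python
EXTENDED_ARG = "EXTENDED_ARG"  # stand-in for the real numeric opcode

def decode_complete_instruction(units):
    # Each unit is an (opcode, arg_byte) pair, mirroring _Py_CODEUNIT.
    # Every EXTENDED_ARG prefix shifts the accumulated oparg left 8 bits.
    oparg = 0
    for opcode, arg in units:
        oparg = (oparg << 8) | arg
        if opcode != EXTENDED_ARG:
            return opcode, oparg
    raise ValueError("code ended in the middle of an instruction")

# The example from the text: EXTENDED_ARG 1, EXTENDED_ARG 0, LOAD_CONST 2
opcode, oparg = decode_complete_instruction(
    [(EXTENDED_ARG, 1), (EXTENDED_ARG, 0), ("LOAD_CONST", 2)])
# opcode == "LOAD_CONST", oparg == 65538 (0x1_00_02)
```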
-The following loop, to be inserted just above the ``switch`` statement, will make the above snippet decode a complete instruction: - -.. code-block:: c - - while (opcode == EXTENDED_ARG) { - word = *next_instr++; - opcode = _Py_OPCODE(word); - oparg = (oparg << 8) | _Py_OPARG(word); - } - -For various reasons we'll get to later (mostly efficiency, given that ``EXTENDED_ARG`` is rare) the actual code is different. - -Jumps -===== - -Note that when the ``switch`` statement is reached, ``next_instr`` (the "instruction offset") already points to the next instruction. -Thus, jump instructions can be implemented by manipulating ``next_instr``: - -- An absolute jump (``JUMP_ABSOLUTE``) sets ``next_instr = first_instr + oparg``. -- A relative jump forward (``JUMP_FORWARD``) sets ``next_instr += oparg``. -- A relative jump backward sets ``next_instr -= oparg``. - -A relative jump whose ``oparg`` is zero is a no-op. - -Inline cache entries -==================== - -Some (specialized or specializable) instructions have an associated "inline cache". -The inline cache consists of one or more two-byte entries included in the bytecode array as additional words following the ``opcode`` /``oparg`` pair. -The size of the inline cache for a particular instruction is fixed by its ``opcode`` alone. -Moreover, the inline cache size for a family of specialized/specializable instructions (e.g., ``LOAD_ATTR``, ``LOAD_ATTR_SLOT``, ``LOAD_ATTR_MODULE``) must all be the same. -Cache entries are reserved by the compiler and initialized with zeros. -If an instruction has an inline cache, the layout of its cache can be described by a ``struct`` definition and the address of the cache is given by casting ``next_instr`` to a pointer to the cache ``struct``. -The size of such a ``struct`` must be independent of the machine architecture, word size and alignment requirements. -For 32-bit fields, the ``struct`` should use ``_Py_CODEUNIT field[2]``. 
-Even though inline cache entries are represented by code units, they do not have to conform to the ``opcode`` / ``oparg`` format. - -The instruction implementation is responsible for advancing ``next_instr`` past the inline cache. -For example, if an instruction's inline cache is four bytes (i.e., two code units) in size, the code for the instruction must contain ``next_instr += 2;``. -This is equivalent to a relative forward jump by that many code units. -(The proper way to code this is ``JUMPBY(n)``, where ``n`` is the number of code units to jump, typically given as a named constant.) - -Serializing non-zero cache entries would present a problem because the serialization (:mod:`marshal`) format must be independent of the machine byte order. - -More information about the use of inline caches :pep:`can be found in PEP 659 <659#ancillary-data>`. - -The evaluation stack -==================== - -Apart from unconditional jumps, almost all instructions read or write some data in the form of object references (``PyObject *``). -The CPython 3.11 bytecode interpreter is a stack machine, meaning that it operates by pushing data onto and popping it off the stack. -The stack is a pre-allocated array of object references. -For example, the "add" instruction (which used to be called ``BINARY_ADD`` in 3.10 but is now ``BINARY_OP 0``) pops two objects off the stack and pushes the result back onto the stack. -An interesting property of the CPython bytecode interpreter is that the stack size required to evaluate a given function is known in advance. -The stack size is computed by the bytecode compiler and is stored in ``code->co_stacksize``. -The interpreter uses this information to allocate stack. - -The stack grows up in memory; the operation ``PUSH(x)`` is equivalent to ``*stack_pointer++ = x``, whereas ``x = POP()`` means ``x = *--stack_pointer``. 
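
As a toy illustration of this discipline (not CPython's evaluator: opcode names
are strings, dispatch is a plain ``if`` chain, and ``run`` is a made-up helper),
a preallocated list plus an index mimic the stack and ``stack_pointer``:

```python
def run(code, stacksize):
    stack = [None] * stacksize   # preallocated, like code->co_stacksize
    sp = 0                       # plays the role of stack_pointer
    for opcode, oparg in code:
        if opcode == "LOAD_CONST":
            stack[sp] = oparg; sp += 1            # PUSH(x)
        elif opcode == "BINARY_OP" and oparg == 0:  # 0 selects "add" in 3.11
            right = stack[sp - 1]; sp -= 1        # POP()
            left = stack[sp - 1]; sp -= 1         # POP()
            stack[sp] = left + right; sp += 1     # PUSH(result)
        elif opcode == "RETURN_VALUE":
            sp -= 1
            return stack[sp]

result = run([("LOAD_CONST", 2), ("LOAD_CONST", 3),
              ("BINARY_OP", 0), ("RETURN_VALUE", None)], stacksize=2)
# result == 5
```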
-There is no overflow or underflow check (except when compiled in debug mode) -- it would be too expensive, so we really trust the compiler.
-
-At any point during execution, the stack level is knowable based on the instruction pointer alone, and some properties of each item on the stack are also known.
-In particular, only a few instructions may push a ``NULL`` onto the stack, and the positions that may be ``NULL`` are known.
-A few other instructions (``GET_ITER``, ``FOR_ITER``) push or pop an object that is known to be an iterator.
-
-Instruction sequences that do not allow statically knowing the stack depth are deemed illegal.
-The bytecode compiler never generates such sequences.
-For example, the following sequence is illegal, because it keeps pushing items on the stack::
-
-   LOAD_FAST 0
-   JUMP_BACKWARD 2
-
-Do not confuse the evaluation stack with the call stack, which is used to implement calling and returning from functions.
-
-Error handling
-==============
-
-When an instruction like ``BINARY_OP`` encounters an error, an exception is raised.
-At this point, a traceback entry is added to the exception (by ``PyTraceBack_Here()``) and cleanup is performed.
-In the simplest case (absent any ``try`` blocks), this results in the remaining objects being popped off the evaluation stack and their reference counts decremented (if not ``NULL``).
-Then the interpreter function (``_PyEval_EvalFrameDefault()``) returns ``NULL``.
-
-However, if an exception is raised in a ``try`` block, the interpreter must jump to the corresponding ``except`` or ``finally`` block.
-In 3.10 and before, there was a separate "block stack" which was used to keep track of nested ``try`` blocks.
-In 3.11, this mechanism has been replaced by a statically generated table, ``code->co_exceptiontable``.
-The advantage of this approach is that entering and leaving a ``try`` block normally does not execute any code, making execution faster.
-But of course, this table needs to be generated by the compiler, and decoded (by ``get_exception_handler``) when an exception happens. - -Exception table format ----------------------- - -The table is conceptually a list of records, each containing four variable-length integer fields (in a unique format, see below): - -- start: start of ``try`` block, in code units from the start of the bytecode -- length: size of the ``try`` block, in code units -- target: start of the first instruction of the ``except`` or ``finally`` block, in code units from the start of the bytecode -- depth_and_lasti: the low bit gives the "lasti" flag, the remaining bits give the stack depth - -The stack depth is used to clean up evaluation stack entries above this depth. -The "lasti" flag indicates whether, after stack cleanup, the instruction offset of the raising instruction should be pushed (as a ``PyLongObject *``). -For more information on the design, see :cpy-file:`Objects/exception_handling_notes.txt`. - -Each varint is encoded as one or more bytes. -The high bit (bit 7) is reserved for random access -- it is set for the first varint of a record. -The second bit (bit 6) indicates whether this is the last byte or not -- it is set for all but the last bytes of a varint. -The low 6 bits (bits 0-5) are used for the integer value, in big-endian order. - -To find the table entry (if any) for a given instruction offset, we can use bisection without decoding the whole table. -We bisect the raw bytes, at each probe finding the start of the record by scanning back for a byte with the high bit set, and then decode the first varint. -See ``get_exception_handler()`` in :cpy-file:`Python/ceval.c` for the exact code (like all bisection algorithms, the code is a bit subtle). - -The locations table -------------------- - -Whenever an exception is raised, we add a traceback entry to the exception. 
-The ``tb_lineno`` field of a traceback entry must be set to the line number of the instruction that raised it. -This field is computed from the locations table, ``co_linetable`` (this name is an understatement), using :c:func:`PyCode_Addr2Line`. -This table has an entry for every instruction rather than for every ``try`` block, so a compact format is very important. - -The full design of the 3.11 locations table is written up in :cpy-file:`Objects/locations.md`. -While there are rumors that this file is slightly out of date, it is still the best reference we have. -Don't be confused by :cpy-file:`Objects/lnotab_notes.txt`, which describes the 3.10 format. -For backwards compatibility this format is still supported by the ``co_lnotab`` property. - -The 3.11 location table format is different because it stores not just the starting line number for each instruction, but also the end line number, *and* the start and end column numbers. -Note that traceback objects don't store all this information -- they store the start line number, for backward compatibility, and the "last instruction" value. -The rest can be computed from the last instruction (``tb_lasti``) with the help of the locations table. -For Python code, a convenient method exists, :meth:`~codeobject.co_positions`, which returns an iterator of :samp:`({line}, {endline}, {column}, {endcolumn})` tuples, one per instruction. -There is also ``co_lines()`` which returns an iterator of :samp:`({start}, {end}, {line})` tuples, where :samp:`{start}` and :samp:`{end}` are bytecode offsets. -The latter is described by :pep:`626`; it is more compact, but doesn't return end line numbers or column offsets. -From C code, you have to call :c:func:`PyCode_Addr2Location`. - -Fortunately, the locations table is only consulted by exception handling (to set ``tb_lineno``) and by tracing (to pass the line number to the tracing function). 
-In order to reduce the overhead during tracing, the mapping from instruction offset to line number is cached in the ``_co_linearray`` field. - -Exception chaining ------------------- - -When an exception is raised during exception handling, the new exception is chained to the old one. -This is done by making the ``__context__`` field of the new exception point to the old one. -This is the responsibility of ``_PyErr_SetObject()`` in :cpy-file:`Python/errors.c` (which is ultimately called by all ``PyErr_Set*()`` functions). -Separately, if a statement of the form :samp:`raise {X} from {Y}` is executed, the ``__cause__`` field of the raised exception (:samp:`{X}`) is set to :samp:`{Y}`. -This is done by :c:func:`PyException_SetCause`, called in response to all ``RAISE_VARARGS`` instructions. -A special case is :samp:`raise {X} from None`, which sets the ``__cause__`` field to ``None`` (at the C level, it sets ``cause`` to ``NULL``). - -(TODO: Other exception details.) - -Python-to-Python calls -====================== - -The ``_PyEval_EvalFrameDefault()`` function is recursive, because sometimes the interpreter calls some C function that calls back into the interpreter. -In 3.10 and before, this was the case even when a Python function called another Python function: -The ``CALL`` instruction would call the ``tp_call`` dispatch function of the callee, which would extract the code object, create a new frame for the call stack, and then call back into the interpreter. -This approach is very general but consumes several C stack frames for each nested Python call, thereby increasing the risk of an (unrecoverable) C stack overflow. - -In 3.11, the ``CALL`` instruction special-cases function objects to "inline" the call. -When a call gets inlined, a new frame gets pushed onto the call stack and the interpreter "jumps" to the start of the callee's bytecode. 
-When an inlined callee executes a ``RETURN_VALUE`` instruction, the frame is popped off
-the call stack and the interpreter returns to its caller by "jumping" to the return address.
-There is a flag in the frame (``frame->is_entry``) that indicates whether the frame was inlined (set if it wasn't).
-If ``RETURN_VALUE`` finds this flag set, it performs the usual cleanup and returns from ``_PyEval_EvalFrameDefault()`` altogether, to a C caller.
-
-A similar check is performed when an unhandled exception occurs.
-
-The call stack
-==============
-
-Up through 3.10, the call stack was implemented as a singly-linked list of :c:type:`PyFrameObject` objects.
-This was expensive because each call would require a heap allocation for the stack frame.
-(There was some optimization using a free list, but this was not always effective, because frames are variable length.)
-
-In 3.11, frames are no longer fully-fledged objects.
-Instead, a leaner internal ``_PyInterpreterFrame`` structure is used, which is allocated using a custom allocator, ``_PyThreadState_BumpFramePointer()``.
-Usually a frame allocation is just a pointer bump, which improves memory locality.
-The function ``_PyEvalFramePushAndInit()`` allocates and initializes a frame structure.
-
-Sometimes an actual ``PyFrameObject`` is needed, usually because some Python code calls :func:`sys._getframe` or an extension module calls :c:func:`PyEval_GetFrame`.
-In this case we allocate a proper ``PyFrameObject`` and initialize it from the ``_PyInterpreterFrame``.
-This is a pessimization, but fortunately happens rarely (as introspecting frames is not a common operation).
-
-Things get more complicated when generators are involved, since those don't follow the push/pop model.
-(The same applies to async functions, which are implemented using the same infrastructure.)
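Some of this is observable from Python: accessing a generator's ``gi_frame`` materializes a proper frame object from the internal structure, and the frame goes away once the generator finishes. A small sketch (behavior as on recent CPython):

```python
def gen():
    yield 1
    yield 2

g = gen()
# Accessing gi_frame materializes a frame object for the generator's
# internal frame, even before the generator has started running.
assert g.gi_frame is not None
next(g)
assert g.gi_frame.f_lasti >= 0   # suspended at the first yield
list(g)                          # run the generator to exhaustion
assert g.gi_frame is None        # the frame is cleared once it finishes
```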
-A generator object has space for a ``_PyInterpreterFrame`` structure, including the variable-size part (used for locals and eval stack). -When a generator (or async) function is first called, a special opcode ``RETURN_GENERATOR`` is executed, which is responsible for creating the generator object. -The generator object's ``_PyInterpreterFrame`` is initialized with a copy of the current stack frame. -The current stack frame is then popped off the stack and the generator object is returned. -(Details differ depending on the ``is_entry`` flag.) -When the generator is resumed, the interpreter pushes the ``_PyInterpreterFrame`` onto the stack and resumes execution. -(There is more hairiness for generators and their ilk; we'll discuss these in a later section in more detail.) - -(TODO: Also frame layout and use, and "locals plus".) - -All sorts of variables -====================== - -The bytecode compiler determines the scope in which each variable name is defined, and generates instructions accordingly. -For example, loading a local variable onto the stack is done using ``LOAD_FAST``, while loading a global is done using ``LOAD_GLOBAL``. -The key types of variables are: - -- fast locals: used in functions -- (slow or regular) locals: used in classes and at the top level -- globals and builtins: the compiler does not distinguish between globals and builtins (though the specializing interpreter does) -- cells: used for nonlocal references - -(TODO: Write the rest of this section. Alas, the author got distracted and won't have time to continue this for a while.) - -Other topics -============ - -(TODO: Each of the following probably deserves its own section.) 
-
-- co_consts, co_names, co_varnames, and their ilk
-- How calls work (how args are transferred, return, exceptions)
-- Generators, async functions, async generators, and ``yield from`` (next, send, throw, close; and await; and how this code breaks the interpreter abstraction)
-- Eval breaker (interrupts, GIL)
-- Tracing
-- Setting the current lineno (debugger-induced jumps)
-- Specialization, inline caches etc.
-
-
-Introducing new bytecode
+========================
+The bytecode interpreter
 ========================
-.. note::
-
-   This section is relevant if you are adding a new bytecode to the interpreter.
-
-
-Sometimes a new feature requires a new opcode. But adding new bytecode is
-not as simple as just suddenly introducing new bytecode in the AST ->
-bytecode step of the compiler. Several pieces of code throughout Python depend
-on having correct information about what bytecode exists.
-
-First, you must choose a name, implement the bytecode in
-:cpy-file:`Python/bytecodes.c`, and add a documentation entry in
-:cpy-file:`Doc/library/dis.rst`. Then run ``make regen-cases`` to
-assign a number for it (see :cpy-file:`Include/opcode_ids.h`) and
-regenerate a number of files with the actual implementation of the
-bytecodes (:cpy-file:`Python/generated_cases.c.h`) and additional
-files with metadata about them.
-
-With a new bytecode you must also change what is called the magic number for
-.pyc files. The variable ``MAGIC_NUMBER`` in
-:cpy-file:`Lib/importlib/_bootstrap_external.py` contains the number.
-Changing this number will cause all .pyc files with the old ``MAGIC_NUMBER``
-to be recompiled by the interpreter on import. Whenever ``MAGIC_NUMBER`` is
-changed, the ranges in the ``magic_values`` array in :cpy-file:`PC/launcher.c`
-must also be updated. Changes to :cpy-file:`Lib/importlib/_bootstrap_external.py`
-will take effect only after running ``make regen-importlib``. Running this
-command before adding the new bytecode target to :cpy-file:`Python/bytecodes.c`
-(followed by ``make regen-cases``) will result in an error. You should only run
-``make regen-importlib`` after the new bytecode target has been added.
-
-.. note:: On Windows, running the ``./build.bat`` script will automatically
-   regenerate the required files without requiring additional arguments.
-
-Finally, you need to introduce the use of the new bytecode. The primary
-places to change are :cpy-file:`Python/compile.c` and
-:cpy-file:`Python/bytecodes.c`. Optimizations in :cpy-file:`Python/flowgraph.c`
-may also need to be updated.
-If the new opcode affects control flow or the block stack, you may have
-to update the ``frame_setlineno()`` function in :cpy-file:`Objects/frameobject.c`.
-:cpy-file:`Lib/dis.py` may need an update if the new opcode interprets its
-argument in a special way (like ``FORMAT_VALUE`` or ``MAKE_FUNCTION``).
-
-If you make a change here that can affect the output of bytecode that
-already exists, make sure to delete your old .py(c|o) files!
-Even though you will eventually change the magic number along with the new
-bytecode, while you are debugging your work you will be changing the bytecode
-output without bumping the magic number each time. This means you end up with
-stale .pyc files that will not be recreated.
-Running ``find . -name '*.py[co]' -exec rm -f '{}' +`` should delete all .pyc
-files you have, forcing new ones to be created and thus allowing you to test
-your new bytecode properly. Run ``make regen-importlib`` to update the
-bytecode of frozen importlib files. You have to run ``make`` again after this
-to recompile the generated C files.
+This document is now part of the
+`CPython Internals Docs <https://github.com/python/cpython/blob/main/InternalDocs/interpreter.md>`_.
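The magic number discussed above can be inspected from Python; for example:

```python
import importlib.util

magic = importlib.util.MAGIC_NUMBER
# Four bytes: a two-byte little-endian version counter, then b'\r\n'
# (the trailing bytes catch .pyc files corrupted by newline translation).
assert len(magic) == 4
assert magic[2:] == b"\r\n"
```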
diff --git a/internals/parser.rst b/internals/parser.rst
index ac5f9ba49d..688ad61e77 100644
--- a/internals/parser.rst
+++ b/internals/parser.rst
@@ -6,921 +6,5 @@ Guide to the parser
 
 .. highlight:: none
 
-Abstract
-========
-
-The Parser in CPython is currently a `PEG (Parser Expression Grammar)
-<https://en.wikipedia.org/wiki/Parsing_expression_grammar>`_ parser. The first
-version of the parser was an `LL(1)
-<https://en.wikipedia.org/wiki/LL_parser>`_ based parser, one of the oldest
-parts of CPython, until it was replaced by :pep:`617`. In
-particular, both the current parser and the old LL(1) parser are the output of a
-`parser generator <https://en.wikipedia.org/wiki/Compiler-compiler>`_. This
-means that the parser is written by feeding a description of the
-grammar of the Python language to a special program (the parser generator), which
-outputs the parser. The Python language is therefore changed by
-modifying the grammar file, and developers rarely need to interact with the
-parser generator itself other than using it to generate the parser.
-
-How PEG parsers work
-====================
-
-.. _how-peg-parsers-work:
-
-A PEG (Parsing Expression Grammar) grammar (like the current one) differs from a
-context-free grammar in that the way it is written more closely
-reflects how the parser will operate when parsing it. The fundamental technical
-difference is that the choice operator is ordered. This means that when writing::
-
-    rule: A | B | C
-
-a context-free-grammar parser (like an LL(1) parser) will generate constructions
-that, given an input string, will *deduce* which alternative (``A``, ``B`` or ``C``)
-must be expanded, while a PEG parser will check whether the first alternative succeeds
-and only if it fails will it continue with the second or the third one, in the
-order in which they are written. This makes the choice operator non-commutative.
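As a toy illustration (hypothetical helper functions, not pegen code), ordered choice can be modeled like this:

```python
def literal(lit):
    """Return a parsing function that matches an exact string."""
    def rule(text, pos):
        if text.startswith(lit, pos):
            return pos + len(lit)   # success: the new position
        return None                  # failure: no input consumed
    return rule

def choice(*alternatives):
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def rule(text, pos):
        for alt in alternatives:     # try alternatives in written order
            result = alt(text, pos)
            if result is not None:
                return result
        return None
    return rule

# 'a' | 'ab' consumes just "a", while 'ab' | 'a' consumes "ab":
# the choice operator is not commutative.
assert choice(literal("a"), literal("ab"))("ab", 0) == 1
assert choice(literal("ab"), literal("a"))("ab", 0) == 2
```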
-
-Unlike LL(1) parsers, PEG-based parsers cannot be ambiguous: if a string parses,
-it has exactly one valid parse tree. This means that a PEG-based parser cannot
-suffer from the ambiguity problems that can arise with LL(1) parsers and with
-context-free grammars in general.
-
-PEG parsers are usually constructed as a recursive descent parser in which every
-rule in the grammar corresponds to a function in the program implementing the
-parser and the parsing expression (the "expansion" or "definition" of the rule)
-represents the "code" in said function. Each parsing function conceptually takes
-an input string as its argument, and yields one of the following results:
-
-* A "success" result. This result indicates that the expression can be parsed by
-  that rule and the function may optionally move forward or consume one or more
-  characters of the input string supplied to it.
-* A "failure" result, in which case no input is consumed.
-
-Notice that "failure" results do not imply that the program is incorrect, nor do
-they necessarily mean that the parsing has failed. Since the choice operator is
-ordered, a failure very often merely indicates "try the following option". A
-direct implementation of a PEG parser as a recursive descent parser will present
-exponential time performance in the worst case, because PEG parsers have
-infinite lookahead (this means that they can consider an arbitrary number of
-tokens before deciding on a rule). Usually, PEG parsers avoid this exponential
-time complexity with a technique called "packrat parsing" [1]_ which not only
-loads the entire program in memory before parsing it but also allows the parser
-to backtrack arbitrarily. This is made efficient by memoizing the rules already
-matched for each position. The cost of the memoization cache is that the parser
-will naturally use more memory than a simple LL(1) parser, which is normally
-table-based.
-
-
-Key ideas
----------
-
-.. important::
-   Don't try to reason about a PEG grammar in the same way you would with an EBNF
-   or context-free grammar. PEG is optimized to describe **how** input strings will
-   be parsed, while context-free grammars are optimized to generate strings of the
-   language they describe (in EBNF, to know if a given string is in the language, you need
-   to do work to find out, as it is not immediately obvious from the grammar).
-
-* Alternatives are ordered ( ``A | B`` is not the same as ``B | A`` ).
-* If a rule returns a failure, it doesn't mean that the parsing has failed,
-  it just means "try something else".
-* By default PEG parsers run in exponential time, which can be optimized to linear by
-  using memoization.
-* If parsing fails completely (no rule succeeds in parsing all the input text), the
-  PEG parser doesn't have a concept of "where the :exc:`SyntaxError` is".
-
-
-.. _consequences-of-ordered-choice:
-
-Consequences of the ordered choice operator
--------------------------------------------
-
-Although PEG may look like EBNF, its meaning is quite different. The fact
-that in PEG parsers alternatives are ordered (which is at the core of how PEG
-parsers work) has deep consequences, other than removing ambiguity.
-
-If a rule has two alternatives and the first of them succeeds, the second one is
-**not** attempted even if the caller rule fails to parse the rest of the input.
-Thus the parser is said to be "eager". To illustrate this, consider
-the following two rules (in these examples, a token is an individual character): ::
-
-    first_rule: ( 'a' | 'aa' ) 'a'
-    second_rule: ('aa' | 'a' ) 'a'
-
-In a regular EBNF grammar, both rules specify the language ``{aa, aaa}`` but
-in PEG, one of these two rules accepts the string ``aaa`` but not the string
-``aa``. The other does the opposite -- it accepts the string ``aa``
-but not the string ``aaa``.
The rule ``('a'|'aa')'a'`` does -not accept ``aaa`` because ``'a'|'aa'`` consumes the first ``a``, letting the -final ``a`` in the rule consume the second, and leaving out the third ``a``. -As the rule has succeeded, no attempt is ever made to go back and let -``'a'|'aa'`` try the second alternative. The expression ``('aa'|'a')'a'`` does -not accept ``aa`` because ``'aa'|'a'`` accepts all of ``aa``, leaving nothing -for the final ``a``. Again, the second alternative of ``'aa'|'a'`` is not -tried. - -.. caution:: - - The effects of ordered choice, such as the ones illustrated above, may be hidden by many levels of rules. - -For this reason, writing rules where an alternative is contained in the next one is in almost all cases a mistake, -for example: :: - - my_rule: - | 'if' expression 'then' block - | 'if' expression 'then' block 'else' block - -In this example, the second alternative will never be tried because the first one will -succeed first (even if the input string has an ``'else' block`` that follows). To correctly -write this rule you can simply alter the order: :: - - my_rule: - | 'if' expression 'then' block 'else' block - | 'if' expression 'then' block - -In this case, if the input string doesn't have an ``'else' block``, the first alternative -will fail and the second will be attempted without said part. - -Syntax -====== - -The grammar consists of a sequence of rules of the form: :: - - rule_name: expression - -Optionally, a type can be included right after the rule name, which -specifies the return type of the C or Python function corresponding to -the rule: :: - - rule_name[return_type]: expression - -If the return type is omitted, then a ``void *`` is returned in C and an -``Any`` in Python. - -Grammar expressions -------------------- - -``# comment`` -^^^^^^^^^^^^^ - -Python-style comments. - -``e1 e2`` -^^^^^^^^^ - -Match ``e1``, then match ``e2``. - -:: - - rule_name: first_rule second_rule - -``e1 | e2`` -^^^^^^^^^^^ - -Match ``e1`` or ``e2``. 
- -The first alternative can also appear on the line after the rule name -for formatting purposes. In that case, a \| must be used before the -first alternative, like so: - -:: - - rule_name[return_type]: - | first_alt - | second_alt - -``( e )`` -^^^^^^^^^ - -Match ``e``. - -:: - - rule_name: (e) - -A slightly more complex and useful example includes using the grouping -operator together with the repeat operators: - -:: - - rule_name: (e1 e2)* - -``[ e ] or e?`` -^^^^^^^^^^^^^^^ - -Optionally match ``e``. - -:: - - rule_name: [e] - -A more useful example includes defining that a trailing comma is -optional: - -:: - - rule_name: e (',' e)* [','] - -``e*`` -^^^^^^ - -Match zero or more occurrences of ``e``. - -:: - - rule_name: (e1 e2)* - -``e+`` -^^^^^^ - -Match one or more occurrences of ``e``. - -:: - - rule_name: (e1 e2)+ - -``s.e+`` -^^^^^^^^ - -Match one or more occurrences of ``e``, separated by ``s``. The generated parse -tree does not include the separator. This is otherwise identical to -``(e (s e)*)``. - -:: - - rule_name: ','.e+ - -``&e`` -^^^^^^ - -.. _peg-positive-lookahead: - -Succeed if ``e`` can be parsed, without consuming any input. - -``!e`` -^^^^^^ - -.. _peg-negative-lookahead: - -Fail if ``e`` can be parsed, without consuming any input. - -An example taken from the Python grammar specifies that a primary -consists of an atom, which is not followed by a ``.`` or a ``(`` or a -``[``: - -:: - - primary: atom !'.' !'(' !'[' - -``~`` -^^^^^ - -Commit to the current alternative, even if it fails to parse (this is called -the "cut"). - -:: - - rule_name: '(' ~ some_rule ')' | some_alt - -In this example, if a left parenthesis is parsed, then the other -alternative won’t be considered, even if some_rule or ``)`` fail to be -parsed. - -Left recursion --------------- - -PEG parsers normally do not support left recursion but CPython's parser -generator implements a technique similar to the one described in Medeiros et al. 
-[2]_ but using the memoization cache instead of static variables. This approach -is closer to the one described in Warth et al. [3]_. This allows us to write not -only simple left-recursive rules but also more complicated rules that involve -indirect left-recursion like:: - - rule1: rule2 | 'a' - rule2: rule3 | 'b' - rule3: rule1 | 'c' - -and "hidden left-recursion" like:: - - rule: 'optional'? rule '@' some_other_rule - -Variables in the grammar ------------------------- - -A sub-expression can be named by preceding it with an identifier and an -``=`` sign. The name can then be used in the action (see below), like this: :: - - rule_name[return_type]: '(' a=some_other_rule ')' { a } - -Grammar actions ---------------- - -.. _peg-grammar-actions: - -To avoid the intermediate steps that obscure the relationship between the -grammar and the AST generation the PEG parser allows directly generating AST -nodes for a rule via grammar actions. Grammar actions are language-specific -expressions that are evaluated when a grammar rule is successfully parsed. These -expressions can be written in Python or C depending on the desired output of the -parser generator. This means that if one would want to generate a parser in -Python and another in C, two grammar files should be written, each one with a -different set of actions, keeping everything else apart from said actions -identical in both files. As an example of a grammar with Python actions, the -piece of the parser generator that parses grammar files is bootstrapped from a -meta-grammar file with Python actions that generate the grammar tree as a result -of the parsing. - -In the specific case of the PEG grammar for Python, having actions allows -directly describing how the AST is composed in the grammar itself, making it -more clear and maintainable. 
This AST generation process is supported by the use -of some helper functions that factor out common AST object manipulations and -some other required operations that are not directly related to the grammar. - -To indicate these actions each alternative can be followed by the action code -inside curly-braces, which specifies the return value of the alternative:: - - rule_name[return_type]: - | first_alt1 first_alt2 { first_alt1 } - | second_alt1 second_alt2 { second_alt1 } - -If the action is omitted, a default action is generated: - -* If there's a single name in the rule, it gets returned. - -* If there is more than one name in the rule, a collection with all parsed - expressions gets returned (the type of the collection will be different - in C and Python). - -This default behaviour is primarily made for very simple situations and for -debugging purposes. - -.. warning:: - - It's important that the actions don't mutate any AST nodes that are passed - into them via variables referring to other rules. The reason for mutation - being not allowed is that the AST nodes are cached by memoization and could - potentially be reused in a different context, where the mutation would be - invalid. If an action needs to change an AST node, it should instead make a - new copy of the node and change that. 
- -The full meta-grammar for the grammars supported by the PEG generator is: - -:: - - start[Grammar]: grammar ENDMARKER { grammar } - - grammar[Grammar]: - | metas rules { Grammar(rules, metas) } - | rules { Grammar(rules, []) } - - metas[MetaList]: - | meta metas { [meta] + metas } - | meta { [meta] } - - meta[MetaTuple]: - | "@" NAME NEWLINE { (name.string, None) } - | "@" a=NAME b=NAME NEWLINE { (a.string, b.string) } - | "@" NAME STRING NEWLINE { (name.string, literal_eval(string.string)) } - - rules[RuleList]: - | rule rules { [rule] + rules } - | rule { [rule] } - - rule[Rule]: - | rulename ":" alts NEWLINE INDENT more_alts DEDENT { - Rule(rulename[0], rulename[1], Rhs(alts.alts + more_alts.alts)) } - | rulename ":" NEWLINE INDENT more_alts DEDENT { Rule(rulename[0], rulename[1], more_alts) } - | rulename ":" alts NEWLINE { Rule(rulename[0], rulename[1], alts) } - - rulename[RuleName]: - | NAME '[' type=NAME '*' ']' {(name.string, type.string+"*")} - | NAME '[' type=NAME ']' {(name.string, type.string)} - | NAME {(name.string, None)} - - alts[Rhs]: - | alt "|" alts { Rhs([alt] + alts.alts)} - | alt { Rhs([alt]) } - - more_alts[Rhs]: - | "|" alts NEWLINE more_alts { Rhs(alts.alts + more_alts.alts) } - | "|" alts NEWLINE { Rhs(alts.alts) } - - alt[Alt]: - | items '$' action { Alt(items + [NamedItem(None, NameLeaf('ENDMARKER'))], action=action) } - | items '$' { Alt(items + [NamedItem(None, NameLeaf('ENDMARKER'))], action=None) } - | items action { Alt(items, action=action) } - | items { Alt(items, action=None) } - - items[NamedItemList]: - | named_item items { [named_item] + items } - | named_item { [named_item] } - - named_item[NamedItem]: - | NAME '=' ~ item {NamedItem(name.string, item)} - | item {NamedItem(None, item)} - | it=lookahead {NamedItem(None, it)} - - lookahead[LookaheadOrCut]: - | '&' ~ atom {PositiveLookahead(atom)} - | '!' ~ atom {NegativeLookahead(atom)} - | '~' {Cut()} - - item[Item]: - | '[' ~ alts ']' {Opt(alts)} - | atom '?' 
{Opt(atom)} - | atom '*' {Repeat0(atom)} - | atom '+' {Repeat1(atom)} - | sep=atom '.' node=atom '+' {Gather(sep, node)} - | atom {atom} - - atom[Plain]: - | '(' ~ alts ')' {Group(alts)} - | NAME {NameLeaf(name.string) } - | STRING {StringLeaf(string.string)} - - # Mini-grammar for the actions - - action[str]: "{" ~ target_atoms "}" { target_atoms } - - target_atoms[str]: - | target_atom target_atoms { target_atom + " " + target_atoms } - | target_atom { target_atom } - - target_atom[str]: - | "{" ~ target_atoms "}" { "{" + target_atoms + "}" } - | NAME { name.string } - | NUMBER { number.string } - | STRING { string.string } - | "?" { "?" } - | ":" { ":" } - -As an illustrative example this simple grammar file allows directly -generating a full parser that can parse simple arithmetic expressions and that -returns a valid C-based Python AST: - -:: - - start[mod_ty]: a=expr_stmt* ENDMARKER { _PyAST_Module(a, NULL, p->arena) } - expr_stmt[stmt_ty]: a=expr NEWLINE { _PyAST_Expr(a, EXTRA) } - - expr[expr_ty]: - | l=expr '+' r=term { _PyAST_BinOp(l, Add, r, EXTRA) } - | l=expr '-' r=term { _PyAST_BinOp(l, Sub, r, EXTRA) } - | term - - term[expr_ty]: - | l=term '*' r=factor { _PyAST_BinOp(l, Mult, r, EXTRA) } - | l=term '/' r=factor { _PyAST_BinOp(l, Div, r, EXTRA) } - | factor - - factor[expr_ty]: - | '(' e=expr ')' { e } - | atom - - atom[expr_ty]: - | NAME - | NUMBER - -Here ``EXTRA`` is a macro that expands to ``start_lineno, start_col_offset, -end_lineno, end_col_offset, p->arena``, those being variables automatically -injected by the parser; ``p`` points to an object that holds on to all state -for the parser. 
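For comparison, CPython's real parser builds the same shape of tree for input accepted by this toy grammar; from Python it can be inspected with the :mod:`ast` module:

```python
import ast

tree = ast.parse("1 + 2 * 3", mode="eval")
expr = tree.body
# '+' ends up at the root because '*' binds tighter via the
# expr/term/factor rule structure.
assert isinstance(expr, ast.BinOp) and isinstance(expr.op, ast.Add)
assert isinstance(expr.right, ast.BinOp) and isinstance(expr.right.op, ast.Mult)
```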
- -A similar grammar written to target Python AST objects: - -:: - - start[ast.Module]: a=expr_stmt* ENDMARKER { ast.Module(body=a or [] } - expr_stmt: a=expr NEWLINE { ast.Expr(value=a, EXTRA) } - - expr: - | l=expr '+' r=term { ast.BinOp(left=l, op=ast.Add(), right=r, EXTRA) } - | l=expr '-' r=term { ast.BinOp(left=l, op=ast.Sub(), right=r, EXTRA) } - | term - - term: - | l=term '*' r=factor { ast.BinOp(left=l, op=ast.Mult(), right=r, EXTRA) } - | l=term '/' r=factor { ast.BinOp(left=l, op=ast.Div(), right=r, EXTRA) } - | factor - - factor: - | '(' e=expr ')' { e } - | atom - - atom: - | NAME - | NUMBER - - -Pegen -===== - -Pegen is the parser generator used in CPython to produce the final PEG parser used by the interpreter. It is the -program that can be used to read the python grammar located in :cpy-file:`Grammar/python.gram` and produce the final C -parser. It contains the following pieces: - -* A parser generator that can read a grammar file and produce a PEG parser written in Python or C that can parse - said grammar. The generator is located at :cpy-file:`Tools/peg_generator/pegen`. -* A PEG meta-grammar that automatically generates a Python parser that is used for the parser generator itself - (this means that there are no manually-written parsers). The meta-grammar is - located at :cpy-file:`Tools/peg_generator/pegen/metagrammar.gram`. -* A generated parser (using the parser generator) that can directly produce C and Python AST objects. - -The source code for Pegen lives at :cpy-file:`Tools/peg_generator/pegen` but normally all typical commands to interact -with the parser generator are executed from the main makefile. - -How to regenerate the parser ----------------------------- - -Once you have made the changes to the grammar files, to regenerate the ``C`` -parser (the one used by the interpreter) just execute: :: - - make regen-pegen - -using the :cpy-file:`!Makefile` in the main directory. 
If you are on Windows you can -use the Visual Studio project files to regenerate the parser or to execute: :: - - ./PCbuild/build.bat --regen - -The generated parser file is located at :cpy-file:`Parser/parser.c`. - -How to regenerate the meta-parser ---------------------------------- - -The meta-grammar (the grammar that describes the grammar for the grammar files -themselves) is located at :cpy-file:`Tools/peg_generator/pegen/metagrammar.gram`. -Although it is very unlikely that you will ever need to modify it, if you make any modifications -to this file (in order to implement new Pegen features) you will need to regenerate -the meta-parser (the parser that parses the grammar files). To do so just execute: :: - - make regen-pegen-metaparser - -If you are on Windows you can use the Visual Studio project files -to regenerate the parser or to execute: :: - - ./PCbuild/build.bat --regen - - -Grammatical elements and rules ------------------------------- - -Pegen has some special grammatical elements and rules: - -* Strings with single quotes (') (e.g. ``'class'``) denote KEYWORDS. -* Strings with double quotes (") (e.g. ``"match"``) denote SOFT KEYWORDS. -* Uppercase names (e.g. ``NAME``) denote tokens in the :cpy-file:`Grammar/Tokens` file. -* Rule names starting with ``invalid_`` are used for specialized syntax errors. - - - These rules are NOT used in the first pass of the parser. - - Only if the first pass fails to parse, a second pass including the invalid - rules will be executed. - - If the parser fails in the second phase with a generic syntax error, the - location of the generic failure of the first pass will be used (this avoids - reporting incorrect locations due to the invalid rules). - - The order of the alternatives involving invalid rules matter - (like any rule in PEG). - -Tokenization ------------- - -It is common among PEG parser frameworks that the parser does both the parsing and the tokenization, -but this does not happen in Pegen. 
The reason is that the Python language needs a custom tokenizer
-to handle things like indentation boundaries, some special keywords like ``ASYNC`` and ``AWAIT``
-(for compatibility purposes), backtracking errors (such as unclosed parentheses), dealing with encoding,
-interactive mode and much more. Some of these requirements exist for historical
-reasons, while others remain useful today.
-
-The list of tokens (all uppercase names in the grammar) that you can use can be found in the :cpy-file:`Grammar/Tokens`
-file. If you change this file to add new tokens, make sure to regenerate the files by executing: ::
-
-    make regen-token
-
-If you are on Windows you can use the Visual Studio project files to regenerate the tokens or to execute: ::
-
-    ./PCbuild/build.bat --regen
-
-How tokens are generated and the rules governing this are completely up to the tokenizer
-(:cpy-file:`Parser/lexer/` and :cpy-file:`Parser/tokenizer/`);
-the parser just receives tokens from it.
-
-Memoization
------------
-
-As described previously, to avoid exponential time complexity in the parser, memoization is used.
-
-The C parser used by Python is highly optimized and memoization can be expensive both in memory and time. Although
-the memory cost is obvious (the parser needs memory for storing previous results in the cache), the execution time
-cost comes from continuously checking if the given rule has a cache hit or not. In many situations, just parsing it
-again can be faster. Pegen **disables memoization by default** except for rules with the special marker ``memo`` after
-the rule name (and type, if present): ::
-
-    rule_name[type] (memo):
-        ...
-
-By selectively turning on memoization for a handful of rules, the parser becomes faster and uses less memory.
-
-.. note::
-   Left-recursive rules always use memoization, since the implementation of left-recursion depends on it.
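The effect of the ``memo`` marker can be sketched with a position-keyed cache (illustrative Python, not the generated C code):

```python
def memoize(rule):
    """Cache a rule's result per input position (packrat memoization)."""
    cache = {}
    def memoized(text, pos):
        if pos not in cache:
            cache[pos] = rule(text, pos)
        return cache[pos]
    return memoized

calls = []

def atom(text, pos):
    calls.append(pos)               # record each real invocation
    return pos + 1 if text[pos:pos + 1] == "a" else None

atom = memoize(atom)
atom("aa", 0); atom("aa", 0); atom("aa", 1); atom("aa", 1)
assert calls == [0, 1]              # each position parsed only once
```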
- -To know if a new rule needs memoization or not, benchmarking is required -(comparing execution times and memory usage of some considerably big files with -and without memoization). There is a very simple instrumentation API available -in the generated C parse code that allows to measure how much each rule uses -memoization (check the :cpy-file:`Parser/pegen.c` file for more information) but it -needs to be manually activated. - -Automatic variables -------------------- - -To make writing actions easier, Pegen injects some automatic variables in the namespace available -when writing actions. In the C parser, some of these automatic variable names are: - -* ``p``: The parser structure. -* ``EXTRA``: This is a macro that expands to ``(_start_lineno, _start_col_offset, _end_lineno, _end_col_offset, p->arena)``, - which is normally used to create AST nodes as almost all constructors need these attributes to be provided. All of the - location variables are taken from the location information of the current token. - -Hard and soft keywords ----------------------- - -.. note:: - In the grammar files, keywords are defined using **single quotes** (e.g. ``'class'``) while soft - keywords are defined using **double quotes** (e.g. ``"match"``). - -There are two kinds of keywords allowed in pegen grammars: *hard* and *soft* -keywords. The difference between hard and soft keywords is that hard keywords -are always reserved words, even in positions where they make no sense (e.g. ``x = class + 1``), -while soft keywords only get a special meaning in context. Trying to use a hard -keyword as a variable will always fail: - -.. code-block:: - - >>> class = 3 - File "<stdin>", line 1 - class = 3 - ^ - SyntaxError: invalid syntax - >>> foo(class=3) - File "<stdin>", line 1 - foo(class=3) - ^^^^^ - SyntaxError: invalid syntax - -While soft keywords don't have this limitation if used in a context other the one where they -are defined as keywords: - -.. 
code-block:: python - - >>> match = 45 - >>> foo(match="Yeah!") - -The ``match`` and ``case`` keywords are soft keywords, so they are recognized as -keywords at the beginning of a match statement or case block respectively, but are -allowed to be used in other places as variable or argument names. - -You can get a list of all keywords defined in the grammar from Python: - -.. code-block:: python - - >>> import keyword - >>> keyword.kwlist - ['False', 'None', 'True', 'and', 'as', 'assert', 'async', 'await', 'break', - 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', - 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', - 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield'] - -as well as soft keywords: - -.. code-block:: python - - >>> import keyword - >>> keyword.softkwlist - ['_', 'case', 'match'] - -.. caution:: - Soft keywords can be a bit challenging to manage as they can be accepted in - places you don't intend to, given how ordered alternatives behave in PEG - parsers (see the :ref:`consequences of ordered choice section - <consequences-of-ordered-choice>` for some background on this). In general, - try to define them in places where there are not many alternatives. - -Error handling -------------- - -When a pegen-generated parser detects that an exception is raised, it will -**automatically stop parsing**, no matter what the current state of the parser -is; it will unwind the stack and report the exception. This means that if a -:ref:`rule action <peg-grammar-actions>` raises an exception, all parsing will -stop at that exact point. This is done so that any exception set by calling -Python's C API functions can propagate correctly. This also includes :exc:`SyntaxError` -exceptions, and this is the main mechanism the parser uses to report custom syntax -error messages. - -..
note:: - Tokenizer errors are normally reported by raising exceptions, but some special - tokenizer errors such as unclosed parentheses will be reported only after the - parser finishes without returning anything. - -How syntax errors are reported ------------------------------ - -As described previously in the :ref:`how PEG parsers work section -<how-peg-parsers-work>`, PEG parsers don't have a defined concept of where -errors happened in the grammar, because a rule failure doesn't imply a -parsing failure as in context-free grammars. This means that some heuristic -has to be used to report generic errors unless something is explicitly declared -as an error in the grammar. - -To report generic syntax errors, pegen uses a common heuristic in PEG parsers: -the location of *generic* syntax errors is reported at the furthest token that -the parser attempted to match but failed. This is only done if parsing has failed -(the parser returns ``NULL`` in C or ``None`` in Python) but no exception has -been raised. - -.. caution:: - Positive and negative lookaheads will try to match a token, so they will affect - the location of generic syntax errors. Use them carefully at boundaries - between rules. - -As the Python grammar was originally written as an LL(1) grammar, this heuristic -has an extremely high success rate, but some PEG features can have small effects, -such as :ref:`positive lookaheads <peg-positive-lookahead>` and -:ref:`negative lookaheads <peg-negative-lookahead>`. - -To generate more precise syntax errors, custom rules are used. This is also a common practice -in context-free grammars: the parser will try to accept some construct that is known -to be incorrect just to report a specific syntax error for that construct. In pegen grammars, -these rules start with the ``invalid_`` prefix.
This is because trying to match these rules -normally has a performance impact on parsing (and can also affect the 'correct' grammar itself -in some tricky cases, depending on the ordering of the rules), so the generated parser acts in -two phases: - -1. The first phase will try to parse the input stream without taking into account rules that - start with the ``invalid_`` prefix. If the parsing succeeds, it will return the generated AST - and the second phase will not be attempted. - -2. If the first phase failed, a second parsing attempt is done including the rules that start - with an ``invalid_`` prefix. By design this attempt **cannot succeed** and is only executed - to give the invalid rules a chance to detect specific situations where custom, more precise, - syntax errors can be raised. This also allows trading a bit of performance for precision when - reporting errors: given that we know that the input text is invalid, there is no need to be fast because - the interpreter is going to stop anyway. - -.. important:: - When defining invalid rules: - - * Make sure all custom invalid rules raise :exc:`SyntaxError` exceptions (or a subclass of it). - * Make sure **all** invalid rules start with the ``invalid_`` prefix so as not to - impact the performance of parsing correct Python code. - * Make sure the parser doesn't behave differently for regular rules when you introduce invalid rules - (see the :ref:`how PEG parsers work section <how-peg-parsers-work>` for more information). - -You can find a collection of macros to raise specialized syntax errors in the -:cpy-file:`Parser/pegen.h` header file. These macros also allow reporting ranges for -the custom errors, which will be highlighted in the tracebacks displayed -when the error is reported. - -.. tip:: - A good way to check that an invalid rule is triggered when you expect it to be is to test whether introducing - a syntax error **after** valid code triggers the rule.
For example: :: - - <valid python code> $ 42 - - This should trigger the syntax error at the ``$`` character. If your rule is not correctly defined, this - won't happen. For example, if you try to define a rule to match Python 2 style ``print`` statements - to make a better error message and you define it as: :: - - invalid_print: "print" expression - - This will **seem** to work because the parser will correctly parse ``print(something)``, because it is valid - code, and the second phase will never execute. But if you try to parse ``print(something) $ 3``, the first pass - of the parser will fail (because of the ``$``) and, in the second phase, the rule will match - ``print(something)`` as ``print`` followed by the variable ``something`` between parentheses, and the error - will be reported there instead of at the ``$`` character. - -Generating AST objects ---------------------- - -The output of the C parser used by CPython that is generated by the -:cpy-file:`Grammar/python.gram` grammar file is a Python AST object (using C -structures). This means that the actions in the grammar file generate AST objects -when they succeed. Constructing these objects can be quite cumbersome (see -the :ref:`AST compiler section <compiler-ast-trees>` for more information -on how these objects are constructed and how they are used by the compiler), so -special helper functions are used. These functions are declared in the -:cpy-file:`Parser/pegen.h` header file and defined in the :cpy-file:`Parser/action_helpers.c` -file. These functions allow you to join AST sequences, get specific elements -from them, or do extra processing on the generated tree. - -.. caution:: - Actions must **never** be used to accept or reject rules.
It may be tempting - in some situations to write a very generic rule and then check the generated - AST to decide if it is valid or not, but this will render the `official grammar - <https://docs.python.org/3/reference/grammar.html>`_ partially incorrect - (because actions are not included) and will make it more difficult for other - Python implementations to adapt the grammar to their own needs. - -As a general rule, if an action spans multiple lines or requires something more -complicated than a single expression of C code, it is normally better to create a -custom helper in :cpy-file:`Parser/action_helpers.c` and expose it in the -:cpy-file:`Parser/pegen.h` header file so it can be used from the grammar. - -If the parsing succeeds, the parser **must** return a **valid** AST object. - -Testing -======= - -There are three files that contain tests for the grammar and the parser: - -* :cpy-file:`Lib/test/test_grammar.py` -* :cpy-file:`Lib/test/test_syntax.py` -* :cpy-file:`Lib/test/test_exceptions.py` - -Check the contents of these files to know the best place to add new tests, depending -on the nature of the new feature you are adding. - -Tests for the parser generator itself can be found in the :cpy-file:`Lib/test/test_peg_generator` directory. - - -Debugging generated parsers -=========================== - -Making experiments ------------------ - -As the generated C parser is the one used by Python, if something goes wrong when adding some -new rules to the grammar, you cannot correctly compile and execute Python anymore. This makes it a bit challenging -to debug when something goes wrong, especially when making experiments. - -For this reason it is a good idea to experiment first by generating a Python parser. To do this, you can go to the -:cpy-file:`Tools/peg_generator/` directory in the CPython repository and manually call the parser generator by executing: - -..
code-block:: shell - - $ python -m pegen python <PATH TO YOUR GRAMMAR FILE> - -This will generate a file called :file:`parse.py` in the same directory, which you can use to parse some input: - -.. code-block:: shell - - $ python parse.py file_with_source_code_to_test.py - -As the generated :file:`parse.py` file is just Python code, you can modify it and add breakpoints to debug or -better understand some complex situations. - - -Verbose mode ------------ - -When Python is compiled in debug mode (by adding ``--with-pydebug`` when running the configure step on Linux or by -adding ``-d`` when calling the :cpy-file:`PCbuild/build.bat` script on Windows), it is possible to activate a **very** verbose -mode in the generated parser. This is very useful for debugging the generated parser and understanding how it works, but it -can be a bit hard to understand at first. - -.. note:: - - When activating verbose mode in the Python parser, it is better not to use interactive mode as it can be much harder to - understand, because interactive mode involves some special steps compared to regular parsing. - -To activate verbose mode you can add the ``-d`` flag when executing Python: - -.. code-block:: shell - - $ python -d file_to_test.py - -This will print **a lot** of output to ``stderr``, so it is probably better to dump it to a file for further analysis. The output -consists of trace lines with the following structure:: - - <indentation> ('>'|'-'|'+'|'!') <rule_name>[<token_location>]: <alternative> ... - -Every line is indented by a different amount (``<indentation>``) depending on how deep the call stack is. The next -character marks the type of the trace: - -* ``>`` indicates that an attempt to parse a rule is starting. - -* ``-`` indicates that a rule has failed to be parsed. -* ``+`` indicates that a rule has been parsed correctly. -* ``!`` indicates that an exception or an error has been detected and the parser is unwinding.
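When sifting through a large trace dump, it can help to slice each line mechanically. The following is a hypothetical helper, not part of CPython; the regular expression only encodes the trace-line shape documented above, and the sample line used to exercise it is made up for illustration rather than copied from real parser output:

```python
import re

# Matches the documented trace-line shape:
#   <indentation> ('>'|'-'|'+'|'!') <rule_name>[<token_location>]: <alternative> ...
TRACE_LINE = re.compile(
    r"^(?P<indent>\s*)"        # call-stack depth is encoded as indentation
    r"(?P<kind>[>\-+!])"       # attempt / failure / success / unwinding marker
    r"\s*(?P<rule>\w+)"        # rule name
    r"\[(?P<location>\d+)\]:"  # current index in the token array
    r"\s*(?P<alternative>.*)$" # alternative being attempted
)

def parse_trace_line(line):
    """Split one trace line into its parts, or return None if it doesn't match."""
    m = TRACE_LINE.match(line)
    if m is None:
        return None
    info = m.groupdict()
    info["depth"] = len(info.pop("indent"))
    info["location"] = int(info["location"])
    return info
```

Grouping the parsed lines by ``depth`` or filtering on ``kind == '-'`` makes it easier to spot where backtracking is happening.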
- -The ``<token_location>`` part indicates the current index in the token array, -the ``<rule_name>`` part indicates what rule is being parsed and -the ``<alternative>`` part indicates what alternative within that rule -is being attempted. - - -References -========== - -.. [1] Ford, Bryan - https://pdos.csail.mit.edu/~baford/packrat/thesis/ - -.. [2] Medeiros et al. - https://arxiv.org/pdf/1207.0443.pdf - -.. [3] Warth et al. - http://web.cs.ucla.edu/~todd/research/pepm08.pdf - - -.. admonition:: Document history - :class: note - - Pablo Galindo Salgado - Original author +This document is now part of the +`CPython Internals Docs <https://github.com/python/cpython/blob/main/InternalDocs/parser.md>`_. diff --git a/make.bat b/make.bat index 432b7f361f..b486783665 100644 --- a/make.bat +++ b/make.bat @@ -10,23 +10,9 @@ if "%PYTHON%" == "" ( set PYTHON=py -3 ) -if not defined SPHINXLINT ( - %PYTHON% -c "import sphinxlint" > nul 2> nul - if errorlevel 1 ( - echo Installing sphinx-lint with %PYTHON% - rem Should have been installed with Sphinx earlier - %PYTHON% -m pip install "sphinx-lint<1" - if errorlevel 1 exit /B - ) - set SPHINXLINT=%PYTHON% -m sphinxlint -) - set BUILDDIR=_build -set SPHINXOPTS=-W --keep-going -n -set ALLSPHINXOPTS=-d %BUILDDIR%/doctrees %SPHINXOPTS% . -if NOT "%PAPER%" == "" ( - set ALLSPHINXOPTS=-D latex_paper_size=%PAPER% %ALLSPHINXOPTS% -) +set SPHINXOPTS=--fail-on-warning --keep-going +set _ALL_SPHINX_OPTS=%SPHINXOPTS% if "%1" == "check" goto check @@ -60,6 +46,14 @@ if "%1" == "clean" ( goto end ) +if "%1" == "versions" ( + %PYTHON% _tools/generate_release_cycle.py + if errorlevel 1 exit /b 1 + echo. + echo Release cycle data generated. + goto end +) + rem Targets other than "clean", "check", "help", or "" need the rem Sphinx build command, which the user may define via SPHINXBUILD. 
@@ -73,164 +67,46 @@ if not defined SPHINXBUILD ( ) set PYTHON=venv\Scripts\python set SPHINXBUILD=venv\Scripts\sphinx-build -) - -if "%1" == "html" ( - %SPHINXBUILD% -b html %ALLSPHINXOPTS% %BUILDDIR%/html - if errorlevel 1 exit /b 1 - echo. - echo.Build finished. The HTML pages are in %BUILDDIR%/html. - goto end + set SPHINXAUTOBUILD=venv\Scripts\sphinx-autobuild ) if "%1" == "htmlview" ( - cmd /C %this% html - - if EXIST "%BUILDDIR%\html\index.html" ( - echo.Opening "%BUILDDIR%\html\index.html" in the default web browser... - start "" "%BUILDDIR%\html\index.html" - ) - - goto end -) - -if "%1" == "dirhtml" ( - %SPHINXBUILD% -b dirhtml %ALLSPHINXOPTS% %BUILDDIR%/dirhtml - if errorlevel 1 exit /b 1 - echo. - echo.Build finished. The HTML pages are in %BUILDDIR%/dirhtml. - goto end -) - -if "%1" == "singlehtml" ( - %SPHINXBUILD% -b singlehtml %ALLSPHINXOPTS% %BUILDDIR%/singlehtml - if errorlevel 1 exit /b 1 - echo. - echo.Build finished. The HTML pages are in %BUILDDIR%/singlehtml. - goto end -) - -if "%1" == "pickle" ( - %SPHINXBUILD% -b pickle %ALLSPHINXOPTS% %BUILDDIR%/pickle - if errorlevel 1 exit /b 1 - echo. - echo.Build finished; now you can process the pickle files. - goto end -) - -if "%1" == "json" ( - %SPHINXBUILD% -b json %ALLSPHINXOPTS% %BUILDDIR%/json - if errorlevel 1 exit /b 1 - echo. - echo.Build finished; now you can process the JSON files. - goto end -) - -if "%1" == "htmlhelp" ( - %SPHINXBUILD% -b htmlhelp %ALLSPHINXOPTS% %BUILDDIR%/htmlhelp - if errorlevel 1 exit /b 1 - echo. - echo.Build finished; now you can run HTML Help Workshop with the ^ -.hhp project file in %BUILDDIR%/htmlhelp. - goto end -) - -if "%1" == "qthelp" ( - %SPHINXBUILD% -b qthelp %ALLSPHINXOPTS% %BUILDDIR%/qthelp - if errorlevel 1 exit /b 1 - echo. 
- echo.Build finished; now you can run "qcollectiongenerator" with the ^ -.qhcp project file in %BUILDDIR%/qthelp, like this: - echo.^> qcollectiongenerator %BUILDDIR%\qthelp\PythonDevelopersGuide.qhcp - echo.To view the help file: - echo.^> assistant -collectionFile %BUILDDIR%\qthelp\PythonDevelopersGuide.ghc - goto end -) - -if "%1" == "devhelp" ( - %SPHINXBUILD% -b devhelp %ALLSPHINXOPTS% %BUILDDIR%/devhelp - if errorlevel 1 exit /b 1 - echo. - echo.Build finished. - goto end -) - -if "%1" == "epub" ( - %SPHINXBUILD% -b epub %ALLSPHINXOPTS% %BUILDDIR%/epub - if errorlevel 1 exit /b 1 - echo. - echo.Build finished. The epub file is in %BUILDDIR%/epub. - goto end -) + cmd /C %this% html -if "%1" == "latex" ( - %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex - if errorlevel 1 exit /b 1 - echo. - echo.Build finished; the LaTeX files are in %BUILDDIR%/latex. - goto end -) - -if "%1" == "text" ( - %SPHINXBUILD% -b text %ALLSPHINXOPTS% %BUILDDIR%/text - if errorlevel 1 exit /b 1 - echo. - echo.Build finished. The text files are in %BUILDDIR%/text. - goto end -) + if EXIST "%BUILDDIR%\html\index.html" ( + echo.Opening "%BUILDDIR%\html\index.html" in the default web browser... + start "" "%BUILDDIR%\html\index.html" + ) -if "%1" == "man" ( - %SPHINXBUILD% -b man %ALLSPHINXOPTS% %BUILDDIR%/man - if errorlevel 1 exit /b 1 - echo. - echo.Build finished. The manual pages are in %BUILDDIR%/man. goto end ) -if "%1" == "changes" ( - %SPHINXBUILD% -b changes %ALLSPHINXOPTS% %BUILDDIR%/changes - if errorlevel 1 exit /b 1 - echo. - echo.The overview file is in %BUILDDIR%/changes. - goto end +if "%1" == "htmllive" ( + %SPHINXAUTOBUILD% --re-ignore="/\.idea/|/venv/" --open-browser --delay 0 --port 55301 . %BUILDDIR%/html + if errorlevel 1 exit /b 1 + goto end ) -if "%1" == "linkcheck" ( - %SPHINXBUILD% -b linkcheck %ALLSPHINXOPTS% %BUILDDIR%/linkcheck - if errorlevel 1 exit /b 1 - echo. 
- echo.Link check complete; look for any errors in the above output ^ -or in %BUILDDIR%/linkcheck/output.txt. - goto end -) +%SPHINXBUILD% -M %1 "." %BUILDDIR% %_ALL_SPHINX_OPTS% +goto end -if "%1" == "doctest" ( - %SPHINXBUILD% -b doctest %ALLSPHINXOPTS% %BUILDDIR%/doctest - if errorlevel 1 exit /b 1 - echo. - echo.Testing of doctests in the sources finished, look at the ^ -results in %BUILDDIR%/doctest/output.txt. - goto end +:check +if not defined SPHINXLINT ( + rem If it is not defined, we build in a virtual environment + if not exist venv ( + echo. Setting up the virtual environment + %PYTHON% -m venv venv + echo. Installing requirements + venv\Scripts\python -m pip install -r requirements.txt + ) + set PYTHON=venv\Scripts\python + set SPHINXLINT=%PYTHON% -m sphinxlint ) -:check rem Ignore the tools and venv dirs and check that the default role is not used. cmd /S /C "%SPHINXLINT% -i tools -i venv --enable default-role" goto end -:serve - echo.The serve target was removed, use htmlview instead ^ -(see https://github.com/python/cpython/issues/80510) -goto end - -if "%1" == "versions" ( - %PYTHON% _tools/generate_release_cycle.py - if errorlevel 1 exit /b 1 - echo. - echo Release cycle data generated. - goto end -) - :end popd endlocal diff --git a/make.ps1 b/make.ps1 new file mode 100644 index 0000000000..71a8f56f4c --- /dev/null +++ b/make.ps1 @@ -0,0 +1,99 @@ +# Command file for Sphinx documentation + +param ( + [string]$target = "help" +) + +Set-StrictMode -Version 3.0 +$ErrorActionPreference = "Stop" + +$BUILDDIR = "_build" +$SPHINXOPTS = "--fail-on-warning --keep-going" +$_ALL_SPHINX_OPTS = $SPHINXOPTS + +$_PYTHON = $Env:PYTHON ?? "py -3" +$_SPHINX_BUILD = $Env:SPHINXBUILD ?? ".\venv\Scripts\sphinx-build" +$_SPHINX_LINT = $Env:SPHINXLINT ?? 
".\venv\Scripts\sphinx-lint" +$_VENV_DIR = "venv" + +function New-VirtualEnviromnent +{ + Write-Host "Creating venv in $_VENV_DIR" + if (Get-Command "uv" -ErrorAction SilentlyContinue) { + & uv venv $_VENV_DIR + $Env:VIRTUAL_ENV = $_VENV_DIR + & uv pip install -r requirements.txt + Remove-Item Env:VIRTUAL_ENV + } else { + & $_PYTHON -m venv venv + Write-Host "Installing requirements" + & venv\Scripts\python -m pip install -r requirements.txt + $Script:_PYTHON = "venv\Scripts\python" + } +} + +function Invoke-SphinxBuild +{ + param ( + [string]$BuilderName, + [string]$BuildDir, + [string]$Options + ) + if (-Not (Test-Path -Path $_VENV_DIR)) { New-VirtualEnviromnent } + & $_SPHINX_BUILD -M $BuilderName "." $BuildDir $Options.Split(" ") +} + +function Invoke-Check { + if (-Not (Test-Path -Path $_VENV_DIR)) { New-VirtualEnviromnent } + & $_SPHINX_LINT -i tools -i venv --enable default-role +} + +if ($target -Eq "help") { + Write-Host "Please use `make <target>` where <target> is one of" + Write-Host " venv to create a venv with necessary tools" + Write-Host " html to make standalone HTML files" + Write-Host " linkcheck to check all external links for integrity" + Write-Host " htmlview to open the index page built by the html target in your browser" + Write-Host " clean to remove the venv and build files" + Write-Host " check to check for stylistic and formal issues using sphinx-lint" + Write-Host " versions to update release cycle after changing release-cycle.json" + Exit +} + +if ($target -Eq "clean") { + $ToClean = @( + $BUILDDIR, + $_VENV_DIR, + "include/branches.csv", "include/end-of-life.csv", "include/release-cycle.svg" + ) + foreach ($item in $ToClean) { + if (Test-Path -Path $item) { + Remove-Item -Path $item -Force -Recurse + } + } + Exit $LASTEXITCODE +} + +if ($target -Eq "check") { + Invoke-Check + Exit $LASTEXITCODE +} + +if ($target -Eq "versions") { + & $_PYTHON _tools/generate_release_cycle.py + if ($LASTEXITCODE -Ne 0) { exit 1 } + Write-Host "Release 
cycle data generated." + Exit $LASTEXITCODE +} + +if ($target -Eq "htmlview") { + Invoke-SphinxBuild "html" "$BUILDDIR" "$_ALL_SPHINX_OPTS" + if (Test-Path -Path "$BUILDDIR\html\index.html") { + Write-Host "Opening $BUILDDIR\html\index.html in the default web browser..." + Start-Process "$BUILDDIR\html\index.html" + } + Exit $LASTEXITCODE +} + +Invoke-SphinxBuild "$target" "$BUILDDIR" "$_ALL_SPHINX_OPTS" +Exit $LASTEXITCODE diff --git a/requirements.txt b/requirements.txt index 7b98f183dd..bad565b8c9 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,10 +1,10 @@ furo>=2022.6.4 jinja2 -sphinx-autobuild +sphinx-autobuild>=2024.9.19 sphinx-inline-tabs>=2023.4.21 -sphinx-lint==0.9.1 +sphinx-lint==1.0.0 sphinx-notfound-page>=1.0.0 sphinx_copybutton>=0.3.3 sphinxext-opengraph>=0.7.1 sphinxext-rediraffe -Sphinx~=7.3.7 +Sphinx~=8.2.1 diff --git a/testing/buildbots.rst b/testing/buildbots.rst index 38e6063647..97856f7132 100644 --- a/testing/buildbots.rst +++ b/testing/buildbots.rst @@ -16,7 +16,7 @@ will schedule a new build to be run as soon as possible. The build steps run by the buildbots are the following: -* Check out the source tree for the changeset which triggered the build +* Check out the source tree for the change which triggered the build * Compile Python * Run the test suite using :ref:`strenuous settings <strenuous_testing>` * Clean up the build tree @@ -56,7 +56,7 @@ There are three ways of visualizing recent build results: * The Web interface for each branch at https://www.python.org/dev/buildbot/, where the so-called "waterfall" view presents a vertical rundown of recent builds for each builder. When interested in one build, you'll have to - click on it to know which changesets it corresponds to. Note that + click on it to know which commits it corresponds to. Note that the buildbot web pages are often slow to load, be patient. 
* The command-line ``bbreport.py`` client, which you can get from @@ -78,16 +78,16 @@ There are three ways of visualizing recent build results: If you like IRC, having an IRC client open to the #python-dev-notifs channel on irc.libera.chat is useful. Any time a builder changes state (last build passed and this one didn't, or vice versa), a message is posted to the channel. -Keeping an eye on the channel after pushing a changeset is a simple way to get +Keeping an eye on the channel after pushing a commit is a simple way to get notified that there is something you should look in to. Some buildbots are much faster than others. Over time, you will learn which ones produce the quickest results after a build, and which ones take the longest time. -Also, when several changesets are pushed in a quick succession in the same +Also, when several commits are pushed in quick succession in the same branch, it often happens that a single build is scheduled for all these -changesets. +commits. Stability ========= @@ -223,37 +223,5 @@ and unpredictable, the issue should be reported on the bug tracker; even better if it can be diagnosed and suppressed by fixing the test's implementation, or by making its parameters - such as a timeout - more robust. - -Custom builders -=============== - -.. highlight:: console - -When working on a platform-specific issue, you may want to test your changes on -the buildbot fleet rather than just on GitHub Actions and Azure Pipelines. To do so, you can -make use of the `custom builders -<https://buildbot.python.org/all/#/builders?tags=%2Bcustom>`_. -These builders track the ``buildbot-custom`` short-lived branch of the -``python/cpython`` repository, which is only accessible to core developers.
- -To start a build on the custom builders, push the commit you want to test to -the ``buildbot-custom`` branch:: - - $ git push upstream <local_branch_name>:buildbot-custom - -You may run into conflicts if another developer is currently using the custom -builders or forgot to delete the branch when they finished. In that case, make -sure the other developer is finished and either delete the branch or force-push -(add the ``-f`` option) over it. - -When you have gotten the results of your tests, delete the branch:: - - $ git push upstream :buildbot-custom # or use the GitHub UI - -If you are interested in the results of a specific test file only, we -recommend you change (temporarily, of course) the contents of the -``buildbottest`` clause in ``Makefile.pre.in``; or, for Windows builders, -the ``Tools/buildbot/test.bat`` script. - .. seealso:: :ref:`buildworker` diff --git a/testing/coverage.rst b/testing/coverage.rst index 93273793a3..d01141a7dc 100644 --- a/testing/coverage.rst +++ b/testing/coverage.rst @@ -55,10 +55,10 @@ statements have been covered. In these instances you can ignore the global statement coverage and simply focus on the local statement coverage. When writing new tests to increase coverage, do take note of the style of tests -already provided for a module (e.g., whitebox, blackbox, etc.). As +already provided for a module (for example, whitebox, blackbox, etc.). As some modules are primarily maintained by a single core developer they may have -a specific preference as to what kind of test is used (e.g., whitebox) and -prefer that other types of tests not be used (e.g., blackbox). When in doubt, +a specific preference as to what kind of test is used (for example, whitebox) and +prefer that other types of tests not be used (for example, blackbox). When in doubt, stick with whitebox testing in order to properly exercise the code. 
@@ -68,9 +68,9 @@ Measuring coverage It should be noted that a quirk of running coverage over Python's own stdlib is that certain modules are imported as part of interpreter startup. Those modules required by Python itself will not be viewed as executed by the coverage tools -and thus look like they have very poor coverage (e.g., the :py:mod:`stat` +and thus look like they have very poor coverage (for example, the :py:mod:`stat` module). In these instances the module will appear to not have any coverage of -global statements but will have proper coverage of local statements (e.g., +global statements but will have proper coverage of local statements (for example, function definitions will not be traced, but the function bodies will). Calculating the coverage of modules in this situation will simply require manually looking at what local statements were not executed. @@ -146,7 +146,7 @@ Basic usage ^^^^^^^^^^^ The following command will tell you if your copy of coverage works (substitute -``COVERAGEDIR`` with the directory where your clone exists, e.g. +``COVERAGEDIR`` with the directory where your clone exists, for example, ``../coveragepy``):: ./python COVERAGEDIR @@ -189,7 +189,7 @@ you visually see what lines of code were not tested:: This will generate an HTML report in a directory named ``htmlcov`` which ignores any errors that may arise and ignores modules for which test coverage is -unimportant (e.g. tests, temp files, etc.). You can then open the +unimportant (for example, tests, temp files, etc.). You can then open the ``htmlcov/index.html`` file in a web browser to view the coverage results along with pages that visibly show what lines of code were or were not executed. @@ -306,5 +306,5 @@ about 20 to 30 minutes on a modern computer. .. _issue tracker: https://github.com/python/cpython/issues .. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html -.. _lcov: https://ltp.sourceforge.net/coverage/lcov.php +.. 
_lcov: https://github.com/linux-test-project/lcov .. _coverage.py: https://coverage.readthedocs.io/en/latest/ diff --git a/testing/index.rst b/testing/index.rst index 770c258000..55bdd3d08b 100644 --- a/testing/index.rst +++ b/testing/index.rst @@ -1,3 +1,5 @@ +.. _testing: + ===================== Testing and buildbots ===================== diff --git a/testing/new-buildbot-worker.rst b/testing/new-buildbot-worker.rst index 39742669ef..cac7bb63d5 100644 --- a/testing/new-buildbot-worker.rst +++ b/testing/new-buildbot-worker.rst @@ -72,8 +72,8 @@ For Linux: * If your package manager provides the buildbot worker software, that is probably the best way to install it; it may create the buildbot user for - you, in which case you can skip that step. Otherwise, do ``pip install - buildbot-worker``. + you, in which case you can skip the next step. Otherwise, do ``pip install + buildbot-worker`` or ``pip3 install buildbot-worker``. * Create a ``buildbot`` user (using, eg: ``useradd``) if necessary. * Log in as the buildbot user. @@ -102,6 +102,18 @@ can put the ``buildarea`` wherever you want to):: (Note that on Windows, the ``buildbot-worker`` command will be in the :file:`Scripts` directory of your Python installation.) +On Windows, `the maximum length for a path is limited +<https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation>`_. +This might cause some tests to fail, unless long paths support is enabled. 
+ +Use this PowerShell command to check whether long paths are enabled:: + + Get-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" + +If the value is not "1", you can enable long paths using this PowerShell command:: + + New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force + Once this initial worker setup completes, you should edit the files ``buildarea/info/admin`` and ``buildarea/info/host`` to provide your contact info and information on the host configuration, respectively. This information @@ -113,7 +125,70 @@ machine reboots: For Linux: -* Add the following line to ``/etc/crontab``:: +* For systemd based distributions, you can create a systemd unit file in order + to manage the service. Create the unit file named ``buildbot-worker.service`` + under ``/home/buildbot/.config/systemd/user/`` and change the paths according to where the + buildbot-worker binary resides. You can verify its location by running + ``which buildbot-worker``. 
+ If you installed the buildbot-worker through + your package manager, it would be:: + + [Unit] + Description=Buildbot Worker + Wants=network-online.target + After=network-online.target local-fs.target + + [Service] + Type=forking + PIDFile=/home/buildbot/buildarea/twistd.pid + WorkingDirectory=/home/buildbot/buildarea + ExecStart=/usr/bin/buildbot-worker start + ExecReload=/usr/bin/buildbot-worker restart + ExecStop=/usr/bin/buildbot-worker stop + Restart=always + User=buildbot + + [Install] + WantedBy=multi-user.target + + + If you installed the buildbot-worker through pip, the systemd unit + file should look like this:: + + [Unit] + Description=Buildbot Worker + Wants=network-online.target + After=network-online.target local-fs.target + + [Service] + Type=forking + PIDFile=/home/buildbot/buildarea/twistd.pid + WorkingDirectory=/home/buildbot/buildarea + ExecStart=/usr/local/bin/buildbot-worker start + ExecReload=/usr/local/bin/buildbot-worker restart + ExecStop=/usr/local/bin/buildbot-worker stop + Restart=always + User=buildbot + + [Install] + WantedBy=multi-user.target + + + Then enable lingering for the buildbot user via the + ``loginctl enable-linger buildbot`` command and you can start + the service through a login shell of the buildbot user + via the ``systemctl --user enable --now buildbot-worker.service`` + command. + + Note that using a systemd unit file might produce some SELinux warnings on systems + where the enforcing mode is enabled, usually related to the twistd.pid file. + If the service fails to start, you should check the output of + ``systemctl status buildbot-worker.service`` as well as the + ``/var/log/audit/audit.log`` file (e.g. through + ``sealert -a /var/log/audit/audit.log``) for potential issues and remedies. + + +* Alternatively, you can create a cronjob.
Add the following line to ``/etc/crontab``:: @reboot buildbot-worker restart /path/to/buildarea @@ -179,7 +254,7 @@ For Windows: * Alternatively (note: don't do both!), set up the worker service as described in the `buildbot documentation - <https://docs.buildbot.net/current/manual/installation/requirements.html#windows-support>`_. + <https://docs.buildbot.net/current/manual/installation/misc.html#launching-worker-as-windows-service>`_. To start the worker running for your initial testing, you can do:: @@ -199,7 +274,7 @@ idea. If your buildbot worker is disconnecting regularly, it may be a symptom of the default ``keepalive`` value (``600`` for 10 minutes) being `set <https://docs.buildbot.net/latest/manual/installation/worker.html#cmdoption-buildbot-worker-create-worker-keepalive>`_ - too high. You can change it to a lower value (e.g. ``180`` for 3 minutes) + too high. You can change it to a lower value (for example, ``180`` for 3 minutes) in the ``buildbot.tac`` file found in your build area. @@ -207,7 +282,7 @@ Latent workers -------------- We also support running `latent workers -<http://docs.buildbot.net/current/manual/configuration/workers.html#latent-workers>`_ +<https://docs.buildbot.net/current/manual/configuration/workers.html#latent-workers>`_ on the AWS EC2 service. To set up such a worker: * Start an instance of your chosen base AMI and set it up as a diff --git a/testing/run-write-tests.rst b/testing/run-write-tests.rst index c902a99f5b..83a4a28610 100644 --- a/testing/run-write-tests.rst +++ b/testing/run-write-tests.rst @@ -198,12 +198,23 @@ a more random order which helps to check that the various tests do not interfere with each other. The ``-w`` flag causes failing tests to be run again to see if the failures are transient or consistent. The ``-uall`` flag allows the use of all available -resources so as to not skip tests requiring, e.g., Internet access. +resources so as to not skip tests requiring, for example, Internet access. 
-To check for reference leaks (only needed if you modified C code), use the
-``-R`` flag. For example, ``-R 3:2`` will first run the test 3 times to settle
-down the reference count, and then run it 2 more times to verify if there are
-any leaks.
+To check for reference leaks (only needed if you modified C code),
+enable leak checking during the test run with the ``-R`` flag.
+For example, use the command::
+
+   python -m test <test_name> -R :
+
+This default setting performs a few initial warm-up runs to stabilize the
+reference count, followed by additional runs to check for leaks.
+
+If you want more control over the number of runs, you can specify ``warmups``
+and ``repeats`` explicitly::
+
+   python -m test <test_name> -R <warmups>:<repeats>
+
+For instance, ``-R 3:2`` will first run the test 3 times to settle down the
+reference count, and then run it 2 more times to check for leaks.

 You can also execute the ``Tools/scripts/run_tests.py`` script as found in a
 CPython checkout. The script tries to balance speed with thoroughness. But if
diff --git a/testing/silence-warnings.rst b/testing/silence-warnings.rst
index e46a11a022..81de500bfc 100644
--- a/testing/silence-warnings.rst
+++ b/testing/silence-warnings.rst
@@ -9,7 +9,7 @@ When running Python's test suite, no warnings should result when you run it
 under :ref:`strenuous testing conditions <strenuous_testing>` (you can ignore
 the extra flags passed to ``test`` that cause randomness and parallel execution
 if you want). Unfortunately new warnings are added to Python on occasion which
-take some time to eliminate (e.g., ``ResourceWarning``). Typically the easy
+take some time to eliminate (for example, ``ResourceWarning``). Typically the easy
 warnings are dealt with quickly, but the more difficult ones that require some
 thought and work do not get fixed immediately.
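The warning-free-suite advice above can be illustrated with a small ``unittest`` sketch (the function and test names here are hypothetical, not part of CPython's suite): rather than letting a known warning leak into the test output, a test asserts that the warning is emitted, which both verifies the behavior and keeps the run quiet.

```python
import unittest
import warnings


def legacy_api():
    # Hypothetical deprecated function that emits a DeprecationWarning.
    warnings.warn("legacy_api is deprecated", DeprecationWarning, stacklevel=2)
    return 42


class TestLegacyAPI(unittest.TestCase):
    def test_warns(self):
        # assertWarns verifies the warning is emitted and absorbs it,
        # so the test run stays warning-free.
        with self.assertWarns(DeprecationWarning):
            self.assertEqual(legacy_api(), 42)
```

Running the whole suite with warnings turned into errors (for example, ``python -W error -m unittest``) then catches any warning that is *not* wrapped this way.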
diff --git a/triage/github-bpo-faq.rst b/triage/github-bpo-faq.rst index 1385be052a..8c21a17fea 100644 --- a/triage/github-bpo-faq.rst +++ b/triage/github-bpo-faq.rst @@ -57,12 +57,12 @@ Where is the "nosy list"? Subscribe another person to the issue by tagging them in the comment with ``@username``. -If you want to subscribe yourself to an issue, click the *🔔 Subscribe* -button in the sidebar. +If you want to subscribe yourself to an issue, click the +:guilabel:`🔔 Subscribe` button in the sidebar. Similarly, if you were tagged by somebody else but -decided this issue is not for you, you might click the *🔕 Unsubscribe* -button in the sidebar. +decided this issue is not for you, you might click the +:guilabel:`🔕 Unsubscribe` button in the sidebar. There is no exact equivalent of the "nosy list" feature, so to preserve this information during the transfer, we list the previous members of diff --git a/triage/issue-tracker.rst b/triage/issue-tracker.rst index c139f5d361..4dd0815e4c 100644 --- a/triage/issue-tracker.rst +++ b/triage/issue-tracker.rst @@ -50,23 +50,17 @@ Reporting an issue ------------------ If the problem you're reporting is not already in the `issue tracker`_, you -can report it using the green "New issue" button on the right of the search +can report it using the green :guilabel:`New issue` button on the right of the search box above the list of bugs. If you're not already signed in to GitHub, it will ask you to do so now. First you need to select what kind of problem you want to report. 
The
-available choices are:
-
-* **Bug report**: an existing feature isn't working as expected;
-* **Documentation**: there is missing, invalid, or misleading documentation;
-* **Enhancement**: suggest a new feature for Python;
-* **Performance**: something should work faster;
-* **Security**: there is a specific kind of weakness open to exploitation
-  through the points of vulnerability;
-* **Tests**: something is wrong with CPython's suite of regression tests;
-* **Discuss**: you'd like to learn more about Python, discuss ideas for
-  possible changes to future Python versions, track core development
-  discussions, or join a specific special-interest group.
+available choices include:
+
+* **Bug report**: an existing feature isn't working as expected.
+* **Documentation**: there is missing, invalid, or misleading documentation.
+* **Feature or enhancement**: suggest a new feature for Python.
+* **Report a security vulnerability**: privately report a vulnerability so
+  it can be addressed before public disclosure.

 Depending on your choice, a dedicated form template will appear.
 In particular, you'll notice that the last button actually takes you to
@@ -159,6 +153,6 @@ reason either as ``complete`` or ``not planned``.

 .. _Python Discourse: https://discuss.python.org/
 .. _autolinks: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/autolinked-references-and-urls
 .. _checklists: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/about-task-lists
-.. _duplicates: https://docs.github.com/en/issues/tracking-your-work-with-issues/marking-issues-or-pull-requests-as-a-duplicate
+.. _duplicates: https://docs.github.com/en/issues/tracking-your-work-with-issues/administering-issues/marking-issues-or-pull-requests-as-a-duplicate
 .. _Core Development Discourse category: https://discuss.python.org/c/core-dev/23
 ..
_old bug tracker: https://bugs.python.org/
diff --git a/triage/labels.rst b/triage/labels.rst
index 95719615b3..d78f6b4727 100644
--- a/triage/labels.rst
+++ b/triage/labels.rst
@@ -62,7 +62,12 @@ OS labels
 These labels are used to specify which operating systems are affected.
 Since most issues either affect all systems or are specific to Unix, we
 don't have a dedicated Unix label.
-Use :gh-label:`OS-windows`, :gh-label:`OS-mac`, and :gh-label:`OS-freebsd`.
+
+* :gh-label:`OS-android`
+* :gh-label:`OS-freebsd`
+* :gh-label:`OS-ios`
+* :gh-label:`OS-linux`
+* :gh-label:`OS-windows`

 Use the :gh-label:`OS-unsupported` label for issues on platforms outside
 the support tiers defined in :pep:`11`. Applying this label adds the issue to
@@ -95,7 +100,7 @@ Version labels

 These labels are used to indicate which versions of Python are affected.
 The available version labels (with the form :samp:`3.{N}`) are updated
-whenever new major releases are created or retired.
+whenever new feature releases are created or retired.
 See also :ref:`the branch status page <branchstatus>`
 for a list of active branches.
@@ -109,21 +114,23 @@ for a list of active branches.
 Other labels
 ============

-* :gh-label:`triaged`: for issue has been accepted as valid by a triager.
-* :gh-label:`easy`: for issues that are considered easy.
 * :gh-label:`build`/:gh-label:`performance`: for issues related to the build
   process or performance, respectively.
+* :gh-label:`easy`: for issues that are considered easy.
+* :gh-label:`infra`: for issues related to the infrastructure of the
+  project (for example, GitHub Actions, dependabot, the buildbots).
+* :gh-label:`pending`: for issues/PRs that will be closed unless further
+  feedback is provided.
 * :gh-label:`release-blocker`/:gh-label:`deferred-blocker`: for issues/PRs
   that, unless fixed, will hold the current or next release respectively.
   Triagers may set these labels for issues that must be fixed before a release,
-  and the :ref:`branch's release manager <branchstatus>` will review them and determine if they indeed qualify,
-  removing or retaining the label as appropriate.
-* :gh-label:`pending`: for issues/PRs that will be closed unless further
-  feedback is provided.
-* :gh-label:`stale`: for issues/PRs that have been inactive for a while.
+  and the :ref:`branch's release manager <branchstatus>`
+  will review them and determine if they indeed qualify,
+  removing or retaining the label as appropriate.
 * :gh-label:`sprint`: for easier filtering of issues/PRs being worked on
   during official sprints.
+* :gh-label:`stale`: for issues/PRs that have been inactive for a while.
+* :gh-label:`triaged`: for issues that have been accepted as valid by a triager.

.. _GitHub Labels for PRs:

@@ -145,9 +152,10 @@ to trigger specific bot behaviors.
   by these labels. See also :ref:`the status of the Python branches
   <branchstatus>` for a list of branches and the type of PRs that can be
   backported to them.
-* :gh-label:`skip issue`: for trivial changes (such as typo fixes, comment
+* :gh-label:`skip issue <skip%20issue>`: for trivial changes (such as
+  typo fixes, comment
   changes, and section rephrases) that don't require a corresponding issue.
-* :gh-label:`skip news`: for PRs that don't need a NEWS entry.
+* :gh-label:`skip news <skip%20news>`: for PRs that don't need a NEWS entry.
   The :ref:`news-entry` section covers in detail in which cases the NEWS
   entry can be skipped.
 * :gh-label:`test-with-buildbots`: used to test the latest commit with
diff --git a/triage/triage-team.rst b/triage/triage-team.rst
index 6255ea429d..68a88457e4 100644
--- a/triage/triage-team.rst
+++ b/triage/triage-team.rst
@@ -38,7 +38,7 @@ following:

 - PRs proposing fixes for bugs that can no longer be reproduced

 - PRs proposing changes that have been rejected by Python core developers
-  elsewhere (e.g.
in an issue or a PEP rejection notice) + elsewhere (for example, in an issue or a PEP rejection notice) If a triager has any doubt about whether to close a PR, they should consult a core developer before taking any action. @@ -53,7 +53,7 @@ or a veteran core developer, they're actively choosing to voluntarily donate the time towards the improvement of Python. As is the case with any member of the Python Software Foundation, always follow the `PSF Code of Conduct`_. -.. _PSF Code of Conduct: https://www.python.org/psf/conduct/ +.. _PSF Code of Conduct: https://policies.python.org/python.org/code-of-conduct/ Becoming a member of the Python triage team diff --git a/triage/triaging.rst b/triage/triaging.rst index 1d9e8809fe..c560d8c1d5 100644 --- a/triage/triaging.rst +++ b/triage/triaging.rst @@ -31,7 +31,7 @@ This field indicates who is expected to take the next step in resolving the issue. It is acceptable to assign an issue to someone if the issue cannot move -forward without their help; e.g., they need to make a technical decision on +forward without their help; for example, they need to make a technical decision on how to proceed. Also consult the :ref:`experts` as certain stdlib modules should always be assigned to a specific person. 
@@ -92,7 +92,7 @@ you can help by making sure the pull request:

 * includes proper tests
 * includes proper documentation changes
 * includes a :ref:`NEWS entry <news-entry>` (if needed)
-* includes the author in ``Misc/ACKS``, either already or the patch adds them
+* includes the author in ``Misc/ACKS``, either already present or added by the pull request
 * doesn't have conflicts with the ``main`` branch
 * :ref:`doesn't have failing CI checks <keeping-ci-green>`
diff --git a/versions.rst b/versions.rst
index 6ca0d440d6..db7f946829 100644
--- a/versions.rst
+++ b/versions.rst
@@ -5,7 +5,7 @@
 Status of Python versions
 =========================

-The ``main`` branch is currently the future Python 3.13, and is the only
+The ``main`` branch is currently the future Python 3.14, and is the only
 branch that accepts new features. The latest release for each Python version
 can be found on the `download page <https://www.python.org/downloads/>`_.
@@ -16,6 +16,8 @@ Python release cycle

 .. raw:: html
    :file: include/release-cycle.svg

+Another useful visualization is `endoflife.date/python <https://endoflife.date/python>`_.
+
 Supported versions
 ==================