Broken code block generated by unparse in telethon.extensions.markdown and telethon.extensions.html on received message #4714

@immanelg

Description

Code that causes the issue

Send a message containing a code block to your Saved Messages (one block is enough; this example uses two):

class AbstractPrinter:
    def print(s):
        pass
main()
package main

var g int
if err != nil:
    return err, false

Run the script

import asyncio
import os

from telethon import TelegramClient
from telethon.extensions.html import unparse as html_unparse
from telethon.extensions.markdown import unparse as markdown_unparse

# Credentials are read from the environment.
phone: str = os.environ.get("TELEGRAM_PHONE")
password: str = os.environ.get("TELEGRAM_PASSWORD")
api_id: int = int(os.environ.get("TELEGRAM_API_ID"))
api_hash: str = os.environ.get("TELEGRAM_API_HASH")

async def main():
    client = TelegramClient(session="./session", api_id=api_id, api_hash=api_hash)

    await client.start(phone=phone, password=password, force_sms=False)
    me = await client.get_me()
    # Take the latest message in Saved Messages and unparse it both ways.
    async for message in client.iter_messages(me, limit=1):
        # print(message.message)
        # print("entities:")
        # for entity in message.entities:
        #     print(type(entity).__name__, entity.offset, entity.length, entity.language if hasattr(entity, 'language') else '')
        print("---")
        print(html_unparse(message.message, message.entities))
        print("---")
        print(markdown_unparse(message.message, message.entities))

asyncio.run(main())

Run it and capture the output:

uv run main.py >example.txt

Expected behavior

---
<pre>
    <code class='language-python'>
class AbstractPrinter:
    def print(s):
        pass
main()
    </code>
</pre>
<pre>
    <code class='language-go'>
package main

var g int
if err != nil:
    return err, false
    </code>
</pre>
---
\```python
class AbstractPrinter:
    def print(s):
        pass
main()
\```
\```go
package main

var g int
if err != nil:
    return err, false
\```

Actual behavior

unparse() produces broken output in both the Markdown and the HTML variant.

html:

<pre>
    <code class='language-python'>
        class AbstractPrinter:
    def print(s):
        pass
main(){}
    </code>
</pre>
<pre>
    <code class='language-go'>
        package main

var g int
if err != nil:
    return err, false{}
    </code>
</pre>

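Issues:

  • a literal {} is appended to the end of each code block
  • broken indentation: the first line of code gets eight extra leading spaces, so it no longer lines up with the rest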
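
Both problems matter because whitespace and stray characters inside <pre> are rendered literally. As a rough sketch of output that avoids them, the helper below wraps a block of code without adding any characters of its own inside <pre>. The name render_pre_html is hypothetical and this is not Telethon's implementation; the compact tag layout is also an assumption and differs from the layout shown under Expected behavior.

```python
from html import escape


def render_pre_html(code: str, language: str = "") -> str:
    """Hypothetical helper: wrap code in <pre>/<code> without extra whitespace.

    Nothing is inserted between the tags and the code itself, so the
    original indentation survives and no stray characters are appended.
    """
    body = escape(code, quote=False)  # escape &, < and > only
    if language:
        lang = escape(language)  # also escapes quotes for the attribute
        return f'<pre><code class="language-{lang}">{body}</code></pre>'
    return f"<pre>{body}</pre>"


if __name__ == "__main__":
    sample = "class AbstractPrinter:\n    def print(s):\n        pass\nmain()"
    print(render_pre_html(sample, "python"))
```

With this shape, the text between the tags is exactly the escaped code, which makes it easy to compare against message.message when checking a fix.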

markdown:

\```class AbstractPrinter:
    def print(s):
        pass
main()\```
\```package main

var g int
if err != nil:
    return err, false\```

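Issues:

  • the language is missing after the opening backticks
  • broken indentation
  • the code starts directly after the opening backticks, without a newline
  • the closing backticks follow the last line of code, without a newline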

Taken together, these problems mean the output is, at the very least, syntactically broken.
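
For comparison, here is a minimal sketch of how the expected Markdown could be produced from a message text and its pre-formatted entities. PreEntity and render_fenced are hypothetical names, not Telethon APIs; PreEntity only mimics the offset, length and language fields of MessageEntityPre. The sketch assumes entities do not overlap, and it treats offsets as UTF-16 code units, which is how Telegram counts them.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


@dataclass
class PreEntity:
    """Hypothetical stand-in for MessageEntityPre."""
    offset: int  # start of the block, in UTF-16 code units
    length: int  # size of the block, in UTF-16 code units
    language: str = ""


def render_fenced(text: str, entities: list[PreEntity]) -> str:
    """Wrap each entity's text in a fenced block with language and newlines."""
    raw = text.encode("utf-16-le")  # 2 bytes per UTF-16 code unit
    parts = []
    cursor = 0  # current position, in UTF-16 code units
    for ent in sorted(entities, key=lambda e: e.offset):
        end = ent.offset + ent.length
        # Text between the previous block and this one is copied unchanged.
        parts.append(raw[cursor * 2:ent.offset * 2].decode("utf-16-le"))
        code = raw[ent.offset * 2:end * 2].decode("utf-16-le")
        # Language goes on the opening line; the code gets its own lines.
        parts.append(f"{FENCE}{ent.language}\n{code}\n{FENCE}")
        cursor = end
    parts.append(raw[cursor * 2:].decode("utf-16-le"))
    return "".join(parts)


if __name__ == "__main__":
    py = "class AbstractPrinter:\n    def print(s):\n        pass\nmain()"
    go = "package main\n\nvar g int"
    message = py + "\n" + go
    blocks = [
        PreEntity(0, len(py), "python"),
        PreEntity(len(py) + 1, len(go), "go"),
    ]
    print(render_fenced(message, blocks))
```

Each fence here sits on its own line and carries the language, which addresses the points in the list above. Whether Telethon's markdown.parse would round-trip this exact shape is not something this sketch checks.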

Traceback

No response

Telethon version

1.40.0

Python version

3.14.0rc2

Operating system (including distribution name and version)

Arch Linux

Other details

No response

Checklist

  • The error is in the library's code, and not in my own.
  • I have searched for this issue before posting it and there isn't an open duplicate.
  • I ran pip install -U https://github.com/LonamiWebs/Telethon/archive/v1.zip and triggered the bug in the latest version.
