fix: Properly render code snippets from Medium/RSS posts (#53) #103
aditya-pandey-dev wants to merge 1 commit into OpenAstronomy:main from
So the fix is to use BS4 to correctly render the incoming text?

Yes, I used BeautifulSoup4 to parse and clean up the HTML from the incoming Medium/RSS content so that code snippets and formatting render properly. Is there anything specific you'd like me to explain or adjust?
if html:
    content = convert_html_to_markdown(content)
    print("\n------------MARKDOWN OUTPUT---------------\n")
    print(content)
    print("\n------------------END---------------------\n")
Those prints were only for local debugging to inspect the converted markdown output. I’ll remove them in the next commit so the script stays clean.
sample_html = '''
<h1>Test Heading</h1>
<ul>
<li>Point 1</li>
<li>Point 2</li>
</ul>
<b>Bold Demo</b>
<pre><code>print("Hello world")</code></pre>
'''

print("\n------MARKDOWN OUTPUT------\n")
print(convert_html_to_markdown(sample_html))
print("\n------END------\n")
Those prints were only for local debugging to inspect the converted markdown output. I’ll also remove them in the next commit so the script stays clean.

No, I was just surprised at how simple the fix is.

I had one small question: right now the HTML → markdown conversion happens inside grab.py before the content is passed to Nikola. Do you think this logic is in the right place, or would you prefer moving it into a separate helper or module so it's easier to test and reuse for other feeds in the future?

Seems fine to me, will have to let David provide the final word.
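
For reference, here is a rough sketch of what a standalone, testable helper could look like if the conversion were moved out of grab.py. It is not the code in this PR: it uses the standard library's `html.parser` as a stand-in for the markdownify/BeautifulSoup4 conversion so it runs without extra dependencies, it only handles the tags from the sample above, and the class name `_SnippetConverter` is hypothetical. Only the function name `convert_html_to_markdown` is taken from the diff.

```python
from html.parser import HTMLParser

FENCE = "`" * 3  # markdown code fence, built here to keep the source readable


class _SnippetConverter(HTMLParser):
    """Stand-in converter (hypothetical): h1, li, b/strong, and pre only."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &lt; into plain text.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.in_pre = True
            self.parts.append("\n\n" + FENCE + "\n")
        elif tag == "h1":
            self.parts.append("\n# ")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag in ("b", "strong"):
            self.parts.append("**")

    def handle_endtag(self, tag):
        if tag == "pre":
            self.in_pre = False
            self.parts.append("\n" + FENCE + "\n")
        elif tag in ("b", "strong"):
            self.parts.append("**")
        elif tag == "h1":
            self.parts.append("\n")
        elif tag == "ul":
            self.parts.append("\n\n")

    def handle_data(self, data):
        if self.in_pre:
            # Code must keep its whitespace exactly as written.
            self.parts.append(data)
            return
        text = " ".join(data.split())
        if text:
            # Keep one boundary space so inline tags do not glue words together.
            lead = " " if data[0].isspace() else ""
            trail = " " if data[-1].isspace() else ""
            self.parts.append(lead + text + trail)


def convert_html_to_markdown(html: str) -> str:
    """Convert a small HTML fragment to markdown, fencing <pre> blocks."""
    parser = _SnippetConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

With the conversion isolated like this, the debug prints could be replaced by assertions in a small test, for example checking that `<pre><code>` content comes back wrapped in a fenced block.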
This pull request addresses issue #53 by improving how Medium and RSS feed HTML content is processed and rendered on the Universe_OA site.
Key changes:
The grab.py script now converts Medium/RSS HTML content into markdown format using the markdownify library.
Code blocks, which are enclosed in triple backticks in markdown, are rendered properly on the site. On inspection, these appear correctly wrapped in