@cnkang
Owner

@cnkang cnkang commented Aug 11, 2025

Summary

  • pass CLI options correctly and support template conversion helpers
  • expose a classmethod to create DOCX templates and fine-tune heading styles
  • adjust tests to exercise CLI and multilingual features from project root

Testing

  • ./scripts/setup-env.sh --group dev
  • uv run pytest

https://chatgpt.com/codex/tasks/task_e_6897d6cb8a60832b81b11a8e82b373d5

Summary by Sourcery

Add environment setup script with documentation, improve template and converter APIs, refine CLI options, and update tests and CI to support new workflow.

New Features:

  • Add scripts/setup-env.sh for installing project and Pandoc dependencies
  • Introduce DocxTemplateManager.create_modern_template classmethod with support for custom style parameters
  • Add MarkdownToDocxConverter.convert_with_template convenience method

Enhancements:

  • Warn instead of error on missing reference DOCX and unify Pandoc argument construction via a helper
  • Configure converter logging and adjust template heading sizes for improved defaults
  • Refine CLI to accept named options consistently and run commands from repository root

CI:

  • Simplify GitHub Actions to invoke setup-env.sh for environment provisioning

Documentation:

  • Update README and README_zh to document setup script usage and showcase updated option syntax

Tests:

  • Adjust existing tests to use add_sample flag, cover custom heading styles, missing template warnings, multilingual content extraction, convert_with_template, and CLI invocation behavior

@amazon-q-developer
Contributor

⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done

@sourcery-ai

sourcery-ai bot commented Aug 11, 2025

Reviewer's Guide

This PR introduces a cross-platform environment setup script, refactors the DOCX template and converter APIs for improved classmethod usage and argument handling, enhances CLI options consistency, updates documentation and CI to integrate the setup, and aligns tests with these API changes.

Sequence diagram for CLI option parsing and conversion

sequenceDiagram
    actor User
    participant CLI
    participant MarkdownToDocxConverter
    User->>CLI: Run markdown2docx CLI with --toc and --toc-depth/--toc_depth
    CLI->>CLI: Parse options (toc, toc_depth)
    CLI->>MarkdownToDocxConverter: convert(input, output, toc=True, toc_depth=3)
    MarkdownToDocxConverter->>MarkdownToDocxConverter: Build Pandoc args
    MarkdownToDocxConverter->>MarkdownToDocxConverter: Perform conversion
    MarkdownToDocxConverter-->>CLI: Return output path
    CLI-->>User: Output DOCX file path

Class diagram for updated DocxTemplateManager and MarkdownToDocxConverter

classDiagram
    class DocxTemplateManager {
        +heading_font
        +code_font
        +create_modern_template(output_path, add_sample=False, **kwargs) Path
        +_create_modern_template(output_path, add_sample=False) Path
        +_configure_core_styles(doc)
        +_set_compatibility_mode_xml(doc, mode="16")
        +create_default_template(output_path) Path
    }
    class MarkdownToDocxConverter {
        +reference_doc
        +min_pandoc
        +convert(input_path, output_path=None, **options) Path
        +convert_with_template(input_path, template_path, output_path=None, **kwargs) Path
        +_get_modern_docx_args() list[str]
        +_build_args(toc, toc_depth, extra_args) list[str]
        +_validate_docx(path)
    }
    DocxTemplateManager <.. MarkdownToDocxConverter : uses template

File-Level Changes

Change Details Files
Cross-platform environment setup and documentation integration
  • Add scripts/setup-env.sh to install dependencies and Pandoc across platforms
  • Document usage in README.md and README_zh.md
  • Use setup script in GitHub Actions test workflow
scripts/setup-env.sh
README.md
README_zh.md
.github/workflows/test.yml
Refactor DOCX template manager API
  • Expose create_modern_template as a classmethod delegating to an internal helper
  • Allow custom heading styles via kwargs and adjust default heading sizes
  • Update tests to call create_modern_template with add_sample flag
src/markdown2docx/templates.py
tests/test_templates.py
Enhance Markdown-to-DOCX converter
  • Centralize Pandoc argument construction in _get_modern_docx_args
  • Log warning instead of error for missing reference docs
  • Add convert_with_template convenience method
  • Configure module-level logging
src/markdown2docx/converter.py
tests/test_converter.py
Improve CLI option handling
  • Support --toc-depth alias (--toc_depth) and map CLI flags to keyword args
  • Standardize run_cli_command to always use repo root as cwd
src/markdown2docx/cli.py
tests/test_cli.py
Align and extend tests for new features
  • Switch tests to keyword args for toc and toc_depth options
  • Add tests for missing template handling and custom heading styles
  • Enhance multilingual tests to include table cell content
tests/test_integration.py
tests/test_templates.py
tests/test_converter.py
tests/test_cli.py

@sourcery-ai sourcery-ai bot left a comment

Hey @cnkang - I've reviewed your changes - here's some feedback:

  • The log.warning call in converter._build_args isn’t accompanied by a log import or logger definition—ensure you import/configure your logger to avoid runtime errors.
  • Renaming the CLI option keys from dashed ("--toc") to underscored ("toc") is a breaking API change; consider aliasing the old keys or clearly bumping the major version.
  • The new setup-env.sh script is POSIX‐only and won’t run on Windows—if cross‐platform support is required, consider adding a PowerShell alternative or noting the platform limitation in the docs.
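One way to keep the old dashed keys working, as the second comment suggests, is to normalize option names before they reach the converter. The following is a minimal sketch, assuming options arrive as a plain dict; `normalize_option_keys` is a hypothetical helper for illustration and is not part of markdown2docx.

```python
def normalize_option_keys(options: dict) -> dict:
    """Map legacy keys such as "--toc" or "--toc-depth" to "toc" / "toc_depth".

    Hypothetical helper for illustration; not the project's actual API.
    """
    normalized = {}
    for key, value in options.items():
        # Drop leading dashes, then turn the remaining dashes into underscores.
        new_key = key.lstrip("-").replace("-", "_")
        normalized[new_key] = value
    return normalized


legacy = {"--toc": True, "--toc-depth": 2}
print(normalize_option_keys(legacy))  # {'toc': True, 'toc_depth': 2}
```

Keys that already use the underscored form pass through unchanged, so both spellings could be accepted during a deprecation period.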
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `scripts/setup-env.sh:1` </location>
<code_context>
+#!/usr/bin/env bash
+set -e
+
+# Install Python dependencies using uv, allowing additional arguments (e.g., --group dev)
+uv sync "$@"
+
+# Ensure pandoc is installed
+if ! command -v pandoc >/dev/null 2>&1; then
+  echo "Pandoc not found. Attempting to install..."
+  if command -v apt-get >/dev/null 2>&1; then
+    sudo apt-get update && sudo apt-get install -y pandoc
+  elif command -v brew >/dev/null 2>&1; then
+    brew install pandoc
+  elif command -v yum >/dev/null 2>&1; then
+    sudo yum install -y pandoc
+  elif command -v dnf >/dev/null 2>&1; then
+    sudo dnf install -y pandoc
+  elif command -v pacman >/dev/null 2>&1; then
+    sudo pacman -Sy --noconfirm pandoc
+  else
+    echo "Could not determine package manager. Please install pandoc manually."
+    exit 1
+  fi
+else
+  echo "Pandoc already installed."
</code_context>

<issue_to_address>
The script attempts to install Pandoc using various package managers, but may fail on systems without sudo or with non-standard setups.

The script should check for 'sudo' before attempting installation, and offer guidance for manual installation if 'sudo' is unavailable or permissions are limited.
</issue_to_address>


Comment on lines +225 to +228
for table in doc.tables:
    for row in table.rows:
        for cell in row.cells:
            full_text += '\n' + cell.text
issue (code-quality): Avoid loops in tests. (no-loop-in-tests)

Explanation: Avoid complex code, like loops, in test functions.

Google's software engineering guidelines say:
"Clear tests are trivially correct upon inspection"
To achieve that, avoid complex code in tests:

  • loops
  • conditionals

Some ways to fix this:

  • Use parametrized tests to get rid of the loop.
  • Move the complex logic into helpers.
  • Move the complex part into pytest fixtures.

Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / Don't Put Logic in Tests
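The "move the complex logic into helpers" option could look roughly like the sketch below. `collect_table_text` is a hypothetical helper name, and the stand-in document built from `SimpleNamespace` is only there so the example runs without python-docx; it assumes the same `tables -> rows -> cells -> text` structure used in the flagged lines.

```python
from types import SimpleNamespace


def collect_table_text(doc) -> str:
    """Join the text of every table cell in `doc`, one cell per line.

    Hypothetical helper; `doc` only needs a python-docx-like
    `tables -> rows -> cells -> text` structure.
    """
    return "\n".join(
        cell.text
        for table in doc.tables
        for row in table.rows
        for cell in row.cells
    )


def _cell(text: str) -> SimpleNamespace:
    # Stand-in for a table cell so the sketch runs without python-docx.
    return SimpleNamespace(text=text)


fake_row = SimpleNamespace(cells=[_cell("你好"), _cell("Hello")])
fake_doc = SimpleNamespace(tables=[SimpleNamespace(rows=[fake_row])])

# The test body itself stays loop-free: one call, one assertion.
assert "你好" in collect_table_text(fake_doc)
```

With a helper like this, the test would append `collect_table_text(doc)` to `full_text` in a single statement instead of nesting three loops.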

@amazon-q-developer
Contributor

✅ I finished the code review, and left comments with the issues I found. I will now generate code fix suggestions.

cnkang and others added 6 commits August 11, 2025 10:34
Co-authored-by: amazon-q-developer[bot] <208079219+amazon-q-developer[bot]@users.noreply.github.com>
Co-authored-by: amazon-q-developer[bot] <208079219+amazon-q-developer[bot]@users.noreply.github.com>
Add basic logging configuration to ensure logger warnings
work properly when reference doc is missing
Add --toc_depth alias for --toc-depth to maintain API
compatibility and prevent breaking changes
- Add sudo availability check before package manager operations
- Prioritize brew (no sudo required) over other package managers
- Provide clear manual installation instructions when sudo unavailable
- Include direct link to Pandoc installation guide
Add note that setup script is designed for Unix-like systems
and Windows users should install dependencies manually
@cnkang
Owner Author

cnkang commented Aug 11, 2025

@sourcery-ai review

Remove duplicated code and fix unclosed parentheses in
run_cli_command function that was causing test collection to fail

@sourcery-ai sourcery-ai bot left a comment

Hey @cnkang - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `scripts/setup-env.sh:5` </location>
<code_context>
+set -e
+
+# Install Python dependencies using uv, allowing additional arguments (e.g., --group dev)
+uv sync "$@"
+
+# Ensure pandoc is installed
</code_context>

<issue_to_address>
The script assumes 'uv' is installed and available in PATH.

Add a check to verify 'uv' is installed, and display a clear error or installation guidance if not found.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
set -e

# Install Python dependencies using uv, allowing additional arguments (e.g., --group dev)
uv sync "$@"
=======
set -e

# Check if 'uv' is installed
if ! command -v uv >/dev/null 2>&1; then
  echo "Error: 'uv' is not installed or not found in your PATH."
  echo "Please install 'uv' by following instructions at https://github.com/astral-sh/uv or run:"
  echo "  pip install uv"
  exit 1
fi

# Install Python dependencies using uv, allowing additional arguments (e.g., --group dev)
uv sync "$@"
>>>>>>> REPLACE

</suggested_fix>


Verify 'uv' is installed before attempting to use it and provide
clear installation guidance if not found
@cnkang
Owner Author

cnkang commented Aug 11, 2025

@sourcery-ai review


@sourcery-ai sourcery-ai bot left a comment

Hey @cnkang - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `tests/test_templates.py` </location>
<code_context>
-def test_template_sample_content():
</code_context>

<issue_to_address>
Consider testing template creation with custom heading styles via kwargs.

Add a test that creates a template using custom heading fonts or sizes through kwargs, and verify these customizations appear in the generated DOCX file.
</issue_to_address>

### Comment 2
<location> `tests/test_integration.py` </location>
<code_context>
-def test_toc_generation(complex_markdown):
</code_context>

<issue_to_address>
Consider adding integration tests for missing or invalid template files.

An integration test should verify that when a non-existent template is used, a warning is logged and conversion completes successfully.
</issue_to_address>
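A test of this kind might be shaped like the sketch below. It does not call the real converter: `resolve_reference_doc` is a hypothetical stand-in for the "warn and continue" behavior described in the PR summary, the logger name and warning text are assumptions, and `caplog` is pytest's built-in log-capturing fixture.

```python
import logging
from pathlib import Path
from typing import Optional

log = logging.getLogger("markdown2docx.converter")  # logger name is an assumption


def resolve_reference_doc(path: Optional[str]) -> Optional[Path]:
    """Return the template path if it exists; otherwise warn and return None.

    Hypothetical stand-in for the converter's handling of a missing reference DOCX.
    """
    if path is None:
        return None
    candidate = Path(path)
    if not candidate.is_file():
        log.warning("Reference DOCX not found, using Pandoc defaults: %s", candidate)
        return None
    return candidate


def test_missing_template_logs_warning(caplog):
    # Capture WARNING-level records emitted while resolving the template.
    with caplog.at_level(logging.WARNING):
        result = resolve_reference_doc("does-not-exist.docx")
    assert result is None
    assert "Reference DOCX not found" in caplog.text
```

An integration variant would call the converter with a non-existent template and additionally assert that the output file exists.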

### Comment 3
<location> `tests/test_converter.py:107` </location>
<code_context>

 # Conversion with options
 output_path = converter.convert(
-    "input.md", 
+    "input.md",
</code_context>

<issue_to_address>
Consider adding tests for the new convert_with_template method.

Directly testing convert_with_template will help verify its correctness and maintainability.
</issue_to_address>


…ndling

- Add test for template creation with custom heading styles via kwargs
- Add integration test for missing template file handling with logging verification
- Add direct test for convert_with_template method to ensure correctness
@cnkang
Owner Author

cnkang commented Aug 11, 2025

@sourcery-ai review


@sourcery-ai sourcery-ai bot left a comment

Hey @cnkang - I've reviewed your changes - here's some feedback:

  • Move the logging.basicConfig call out of the converter module into the CLI entrypoint (or a dedicated setup function) so that importing the library doesn’t reconfigure the root logger unexpectedly.
  • You can simplify the CLI option mapping by relying on argparse’s built-in alias handling for --toc-depth/--toc_depth and drop the getattr(...) fallback logic in main().
  • Since setup-env.sh is promoted as cross-platform but only handles Unix-style package managers, consider adding Windows support (e.g. Chocolatey) or clearly documenting that Windows users must install Pandoc manually.
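The argparse suggestion in the second point can be sketched as follows. argparse accepts several option strings for one argument, so both spellings land on the same attribute without any `getattr(...)` fallback. The parser layout here (positional `input`, default depth of 3) is an assumption for illustration, not the project's actual CLI definition.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser layout; only the alias technique is the point here.
    parser = argparse.ArgumentParser(prog="markdown2docx")
    parser.add_argument("input")
    parser.add_argument("--toc", action="store_true", help="include a table of contents")
    # Both spellings are registered on one argument; `dest` gives them a single attribute.
    parser.add_argument("--toc-depth", "--toc_depth", dest="toc_depth", type=int, default=3)
    return parser


args = build_parser().parse_args(["notes.md", "--toc", "--toc_depth", "2"])
print(args.toc, args.toc_depth)  # True 2
```

The parsed namespace can then be passed on as keyword arguments, for example `convert(args.input, toc=args.toc, toc_depth=args.toc_depth)`.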
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `src/markdown2docx/converter.py:18` </location>
<code_context>
     Version = None  # Converter still works without packaging; only strict compare is skipped

+# Configure logger
+logging.basicConfig(level=logging.INFO)
 log = logging.getLogger(__name__)

</code_context>

<issue_to_address>
Setting logging.basicConfig at module level may interfere with application-wide logging configuration.

This can cause conflicts with user-defined logging. It's best to configure logging only in the main entry point or CLI, not in library modules.
</issue_to_address>


Use actual DocxTemplateManager parameters (heading_font, code_font)
instead of non-existent heading1_size parameter
@cnkang cnkang merged commit a342607 into main Aug 11, 2025
2 checks passed
@cnkang cnkang deleted the codex/create-setup-env.sh-script-for-dependencies branch August 11, 2025 02:54