Add DocuForge API — Document processing (Excel, CSV, PDF)#2518

Open
rnhowcla wants to merge 1 commit into APIs-guru:main from rnhowcla:main

@rnhowcla

Add DocuForge API OpenAPI 3.0 spec.

Document processing REST API for Excel, CSV, and PDF files.
Free tier: 50 calls/month.

GitHub: https://github.com/rnhowcla/docuforge-api

@gemini-code-assist (bot) left a comment

Code Review

This pull request adds the initial OpenAPI specification for the DocuForge API, with endpoints for Excel, CSV, and PDF processing. The review raises three points. First, the production server URL uses plain HTTP and a raw IPv6 address, which is a critical security issue; it should use HTTPS and a domain name. Second, several endpoints omit the 'content' definition in their responses, which documentation tools and client generators need. Third, the 'created_at' field in the account endpoint should declare the 'date-time' format for a more precise schema.

    - "tools"
    - "developer_tools"
servers:
  - url: http://[2409:893d:dcd:8c60:3058:eb94:1d9:c7fa]:5000

high (security)

The production server is configured with a raw IPv6 address and uses the insecure http protocol. For a production API that requires authentication via X-API-Key, it is critical to use https to prevent credentials from being transmitted in cleartext. Additionally, using a domain name instead of a static IP is standard practice for API stability and discoverability.
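
As a rough illustration of the change requested above, the `servers` entry could look something like the following. This is only a sketch: the hostname `api.docuforge.example` and the scheme name `ApiKeyAuth` are placeholders invented for this example, not values taken from the pull request. Only the `X-API-Key` header name comes from the review comment.

```yaml
# Sketch only: api.docuforge.example is a placeholder hostname,
# not the project's real domain.
servers:
  - url: https://api.docuforge.example
    description: Production (hypothetical hostname, served over TLS)

# The API key travels in a request header, so it is only protected
# when the server URL above uses https.
components:
  securitySchemes:
    ApiKeyAuth:            # hypothetical scheme name
      type: apiKey
      in: header
      name: X-API-Key
```

With an `https` URL the key is encrypted in transit, and a stable hostname lets the backing address change without breaking clients generated from the spec.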

                - file
      responses:
        "200":
          description: CSV file

high

The response for a successful CSV conversion is missing a content definition. Without specifying the media type (e.g., text/csv) and schema, client generators and documentation tools cannot correctly handle the output.

          description: CSV file
          content:
            text/csv:
              schema:
                type: string
                format: binary

                - file
      responses:
        "200":
          description: Formatted Excel file

high

The response for the auto-format endpoint is missing the content definition. It should specify the Excel media type to ensure proper handling by clients.

          description: Formatted Excel file
          content:
            application/vnd.openxmlformats-officedocument.spreadsheetml.sheet:
              schema:
                type: string
                format: binary

                - file
      responses:
        "200":
          description: Excel file

high

The response for the CSV-to-Excel conversion is missing the content definition. It should specify the Excel media type as seen in the /excel/clean endpoint.

          description: Excel file
          content:
            application/vnd.openxmlformats-officedocument.spreadsheetml.sheet:
              schema:
                type: string
                format: binary

                - file
      responses:
        "200":
          description: Cleaned CSV file

high

The response for the CSV cleaning endpoint is missing the content definition. It should specify the text/csv media type.

          description: Cleaned CSV file
          content:
            text/csv:
              schema:
                type: string
                format: binary

                - file
      responses:
        "200":
          description: Extracted text

high

The text extraction endpoint is missing a content definition for the response. It should likely return text/plain with a string schema.

          description: Extracted text
          content:
            text/plain:
              schema:
                type: string

                - file
      responses:
        "200":
          description: PDF metadata JSON

high

The PDF metadata endpoint is missing the content definition. Since the description mentions JSON, it should define application/json along with the expected schema for the metadata object (e.g., properties for pages, author, title, etc.).
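
Unlike the other comments, this one carries no suggested change, so here is a minimal sketch of what the response might look like. The property names `pages`, `author`, and `title` are taken from the examples in the comment; their types are assumptions and should be matched to what the endpoint actually returns.

```yaml
"200":
  description: PDF metadata JSON
  content:
    application/json:
      schema:
        type: object
        # Property names follow the examples in the review comment;
        # the types are guesses, not confirmed by the spec.
        properties:
          pages:
            type: integer
          author:
            type: string
          title:
            type: string
```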

                - files
      responses:
        "200":
          description: Merged PDF

high

The PDF merge endpoint is missing the content definition for the resulting PDF file. This is necessary for tools to recognize the binary output as a PDF.

          description: Merged PDF
          content:
            application/pdf:
              schema:
                type: string
                format: binary

                  call_count:
                    type: integer
                  created_at:
                    type: string

medium

The created_at property should include a format: date-time to provide better information for client generators and ensure consistent parsing of the timestamp.

                    type: string
                    format: date-time
