huynxtb/office2pdf

office2pdf

Web API that converts Excel/Word files to PDF using headless LibreOffice. Written in Go (standard library only, no dependencies), runs in Docker.

Run

API_KEY=my-secret docker compose up --build

API

POST /convert

| Part | Description |
| --- | --- |
| Header X-API-KEY | Required. Compared against the API_KEY env var in constant time. |
| Form field file | Required. Excel (.xlsx .xls .xlsm .xlsb .ods .csv) or Word (.docx .doc .docm .odt .rtf). |
| Form field filename | Optional. The PDF filename returned to the user (no .pdf needed). Leave empty to use the uploaded file's name. Invalid characters are replaced with _. |

Internal processing name: YYYYMMDD_hhmmss_FileName (unique; each request gets its own job dir).
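The naming rules above can be sketched as follows. This is a minimal illustration, not the code in internal/filename/filename.go: the function names (sanitize, downloadName, internalName), the allow-list of characters, the "document" fallback, and keeping the extension in the internal name are all assumptions, since the README does not say which characters count as invalid.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"time"
	"unicode"
)

// sanitize replaces every rune outside an assumed allow-list with '_'.
// Assumption: letters and digits of any script, '-', '_' and '.' are valid.
func sanitize(name string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) || unicode.IsDigit(r) || r == '-' || r == '_' || r == '.' {
			return r
		}
		return '_'
	}, name)
}

// downloadName picks the PDF name sent back to the client: the optional
// "filename" form field if present, otherwise the uploaded file's name
// without its extension.
func downloadName(requested, uploaded string) string {
	base := strings.TrimSpace(requested)
	if base == "" {
		base = strings.TrimSuffix(uploaded, filepath.Ext(uploaded))
	}
	base = strings.TrimSuffix(base, ".pdf") // the caller may or may not add .pdf
	base = sanitize(base)
	if base == "" {
		base = "document" // hypothetical fallback, not from the README
	}
	return base + ".pdf"
}

// internalName builds the processing name YYYYMMDD_hhmmss_FileName.
func internalName(t time.Time, uploaded string) string {
	return t.Format("20060102_150405") + "_" + sanitize(uploaded)
}

func main() {
	fmt.Println(downloadName("", "report.xlsx"))
	fmt.Println(downloadName("report-q2", "report.xlsx"))
	fmt.Println(internalName(time.Now(), "report.xlsx"))
}
```

Two requests arriving in the same second would produce the same timestamp prefix, so a sketch like this still relies on the per-request job dir for uniqueness.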

Response: a PDF stream (Content-Disposition: attachment). Temporary files are deleted as soon as the download completes (the job dir is removed via defer RemoveAll).

curl -f -H "X-API-KEY: my-secret" \
  -F "file=@report.xlsx" \
  -F "filename=report-q2" \
  -o report-q2.pdf \
  http://localhost:8080/convert
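The same request can be made from Go with only the standard library. The sketch below mirrors the curl call; the helper name newConvertRequest and the hard-coded paths, key, and URL are illustrative assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

// newConvertRequest builds the multipart POST for /convert.
// pdfName is optional; pass "" to let the server use the uploaded name.
func newConvertRequest(url, apiKey, uploadName string, data []byte, pdfName string) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, err := w.CreateFormFile("file", uploadName)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(data); err != nil {
		return nil, err
	}
	if pdfName != "" {
		if err := w.WriteField("filename", pdfName); err != nil {
			return nil, err
		}
	}
	// Close writes the trailing boundary; it must happen before sending.
	if err := w.Close(); err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	req.Header.Set("X-API-KEY", apiKey)
	return req, nil
}

func run() error {
	data, err := os.ReadFile("report.xlsx")
	if err != nil {
		return err
	}
	req, err := newConvertRequest("http://localhost:8080/convert", "my-secret", "report.xlsx", data, "report-q2")
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		msg, _ := io.ReadAll(resp.Body) // error body is {"error": "..."}
		return fmt.Errorf("convert failed: %s: %s", resp.Status, msg)
	}
	out, err := os.Create("report-q2.pdf")
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, resp.Body)
	return err
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```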

GET /healthz

Returns 200 ok.

Errors

| Code | When |
| --- | --- |
| 400 | Missing file, or unsupported format |
| 401 | Invalid/missing X-API-KEY |
| 413 | Upload exceeds MAX_UPLOAD_MB |
| 500 | LibreOffice conversion error / timeout |

Error body: {"error": "..."}.
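A constant-time key check combined with the JSON error body might look like the sketch below. It is not the code in internal/server/middleware.go; the names writeError, keyMatches, and requireAPIKey and the error message text are assumptions. Note that subtle.ConstantTimeCompare returns immediately when the lengths differ, so the key length is not hidden; hashing both values first would be one way to avoid that.

```go
package main

import (
	"crypto/subtle"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeError sends the documented error shape: {"error": "..."}.
func writeError(w http.ResponseWriter, status int, msg string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(map[string]string{"error": msg})
}

// keyMatches compares the presented key with the configured one without
// short-circuiting on the first differing byte. An empty configured key
// never matches, so a missing header cannot pass by accident.
func keyMatches(got, want string) bool {
	if want == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// requireAPIKey rejects requests whose X-API-KEY header does not match.
func requireAPIKey(want string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !keyMatches(r.Header.Get("X-API-KEY"), want) {
			writeError(w, http.StatusUnauthorized, "invalid or missing X-API-KEY")
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	h := requireAPIKey("my-secret", ok)

	// Exercise the middleware in-process with a request that has no key.
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/convert", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```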

Configuration (env)

| Variable | Default | Description |
| --- | --- | --- |
| API_KEY | (none, required) | Key for the X-API-KEY header |
| PORT | 8080 | HTTP port |
| LO_CONCURRENCY | 2 | Number of parallel soffice processes (extra requests queue) |
| CONVERT_TIMEOUT_SECONDS | 120 | Timeout per conversion |
| MAX_UPLOAD_MB | 50 | Upload size limit |
| WORK_DIR | /tmp/office2pdf | Temporary job directory (compose mounts a tmpfs) |
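Reading these variables with their defaults can be sketched as below. The Config struct, the load and envInt helpers, and the choice to treat 1 MB as 2^20 bytes are assumptions for illustration, not the contents of internal/config/config.go.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config mirrors the documented environment variables.
type Config struct {
	APIKey         string
	Port           string
	Concurrency    int
	ConvertTimeout time.Duration
	MaxUploadBytes int64
	WorkDir        string
}

// orDefault returns def when the variable is unset or empty.
func orDefault(v, def string) string {
	if v == "" {
		return def
	}
	return v
}

// envInt parses a positive integer variable, falling back to def when unset.
func envInt(getenv func(string) string, key string, def int) (int, error) {
	v := getenv(key)
	if v == "" {
		return def, nil
	}
	n, err := strconv.Atoi(v)
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("%s must be a positive integer, got %q", key, v)
	}
	return n, nil
}

// load takes a getenv function so it can be tested without touching the
// real environment; pass os.Getenv in production.
func load(getenv func(string) string) (Config, error) {
	cfg := Config{
		APIKey:  getenv("API_KEY"),
		Port:    orDefault(getenv("PORT"), "8080"),
		WorkDir: orDefault(getenv("WORK_DIR"), "/tmp/office2pdf"),
	}
	if cfg.APIKey == "" {
		return Config{}, errors.New("API_KEY is required")
	}
	conc, err := envInt(getenv, "LO_CONCURRENCY", 2)
	if err != nil {
		return Config{}, err
	}
	secs, err := envInt(getenv, "CONVERT_TIMEOUT_SECONDS", 120)
	if err != nil {
		return Config{}, err
	}
	mb, err := envInt(getenv, "MAX_UPLOAD_MB", 50)
	if err != nil {
		return Config{}, err
	}
	cfg.Concurrency = conc
	cfg.ConvertTimeout = time.Duration(secs) * time.Second
	cfg.MaxUploadBytes = int64(mb) << 20 // assumption: 1 MB = 2^20 bytes
	return cfg, nil
}

func main() {
	cfg, err := load(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config:", err)
		os.Exit(1)
	}
	fmt.Printf("port=%s concurrency=%d timeout=%s\n", cfg.Port, cfg.Concurrency, cfg.ConvertTimeout)
}
```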

Source layout

| File | Responsibility |
| --- | --- |
| main.go | Startup: load config, create work dir, run server |
| internal/config/config.go | Read configuration from environment variables |
| internal/convert/convert.go | Call LibreOffice (soffice) to convert |
| internal/filename/filename.go | Sanitize filenames, build the Content-Disposition header |
| internal/server/server.go | App + route registration |
| internal/server/handler.go | /convert flow: receive file → name it → convert → return → delete |
| internal/server/middleware.go | Check X-API-KEY, JSON error helper |

Design

  • Each request gets its own job dir (input + output + LO profile) — no collisions, everything is cleaned up after the response.
  • A separate -env:UserInstallation per job lets LibreOffice run in parallel (by default LO locks its profile and allows only one instance).
  • A semaphore caps the number of concurrent soffice processes — LO is RAM/CPU hungry, so it shouldn't fork freely.
  • The internal filename is YYYYMMDD_hhmmss_FileName; the user-supplied filename only determines the download name.
  • The PDF filter is chosen by file type: calc_pdf_Export (Excel), writer_pdf_Export (Word).
  • Image: debian:bookworm-slim + libreoffice-{calc,writer}-nogui (no GUI deps) + Noto CJK/IPA/Liberation fonts (covers Japanese + Vietnamese). Runs as non-root.
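
The per-job profile, the filter choice, and the semaphore can be sketched together as below. This is an illustration under assumptions rather than the code in internal/convert/convert.go: the names filterFor, sofficeArgs, limiter, and convert, and the "profile" subdirectory inside the job dir, are hypothetical. The soffice flags used (--headless, -env:UserInstallation, --convert-to, --outdir) are standard LibreOffice command-line options.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"path/filepath"
	"strings"
	"time"
)

// filterFor picks the PDF export filter from the input file's extension.
func filterFor(name string) (string, error) {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".xlsx", ".xls", ".xlsm", ".xlsb", ".ods", ".csv":
		return "calc_pdf_Export", nil
	case ".docx", ".doc", ".docm", ".odt", ".rtf":
		return "writer_pdf_Export", nil
	}
	return "", fmt.Errorf("unsupported format: %q", name)
}

// sofficeArgs builds the command line for one job. jobDir must be an
// absolute POSIX path so that "file://" + path is a valid file URL.
func sofficeArgs(jobDir, input string) ([]string, error) {
	filter, err := filterFor(input)
	if err != nil {
		return nil, err
	}
	profile := filepath.Join(jobDir, "profile") // hypothetical layout
	return []string{
		"--headless",
		"-env:UserInstallation=file://" + profile, // private profile per job
		"--convert-to", "pdf:" + filter,
		"--outdir", jobDir,
		input,
	}, nil
}

// limiter is a counting semaphore backed by a buffered channel.
type limiter chan struct{}

func (l limiter) acquire(ctx context.Context) error {
	select {
	case l <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err() // caller gave up while queued
	}
}

func (l limiter) release() { <-l }

// convert waits for a free slot, then runs soffice with a timeout.
func convert(ctx context.Context, sem limiter, jobDir, input string, timeout time.Duration) error {
	args, err := sofficeArgs(jobDir, input)
	if err != nil {
		return err
	}
	if err := sem.acquire(ctx); err != nil {
		return err
	}
	defer sem.release()

	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, "soffice", args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("soffice: %w: %s", err, out)
	}
	return nil
}

func main() {
	args, err := sofficeArgs("/tmp/office2pdf/job1", "/tmp/office2pdf/job1/report.xlsx")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("soffice", strings.Join(args, " "))
}
```

A limiter created with make(limiter, 2) matches the default LO_CONCURRENCY of 2; requests beyond that block in acquire until a slot frees up or their context is cancelled.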

