Dev

u-ashish · web-flow · commit 676c6b92c67a · 2025-09-07T13:04:09.000-04:00
- Add modal deployment example
- Update README
diff --git a/README.md b/README.md
@@ -11,6 +11,8 @@ Marker converts documents to markdown, JSON, chunks, and HTML quickly and accura
 - Optionally boost accuracy with LLMs (and your own prompt)
 - Works on GPU, CPU, or MPS
 
+For our managed API or on-prem document intelligence solution, check out [our platform here](https://datalab.to?utm_source=gh-marker).
+
 ## Performance
 
 <img src="data/images/overall.png" width="800px"/>
@@ -41,14 +43,15 @@ As you can see, the use_llm mode offers higher accuracy than marker or gemini al
 
 # Commercial usage
 
-Our model weights use a modified AI Pubs Open Rail-M license (free for research, personal use, and startups under $2M funding/revenue) and our code is GPL. For broader commercial licensing or to remove GPL requirements, visit our pricing page [here](https://www.datalab.to).
+Our model weights use a modified AI Pubs Open Rail-M license (free for research, personal use, and startups under $2M funding/revenue) and our code is GPL. For broader commercial licensing or to remove GPL requirements, visit our pricing page [here](https://www.datalab.to/pricing?utm_source=gh-marker).
 
-# Hosted API
+# Hosted API & On-prem
 
-There's a hosted API for marker available [here](https://www.datalab.to/):
+There's a [hosted API](https://www.datalab.to?utm_source=gh-marker) and a [painless on-prem solution](https://www.datalab.to/blog/self-serve-on-prem-licensing) for marker. It's free to sign up, and we'll throw in credits for you to test it out.
 
+The API:
 - Supports PDF, image, PPT, PPTX, DOC, DOCX, XLS, XLSX, HTML, EPUB files
-- 1/4th the price of leading cloud-based competitors
+- Is 1/4th the price of leading cloud-based competitors
 - Fast - ~15s for a 250 page PDF
 - Supports LLM mode
 - High uptime (99.99%)
@@ -102,7 +105,7 @@ Options:
 - `--page_range TEXT`: Specify which pages to process. Accepts comma-separated page numbers and ranges. Example: `--page_range "0,5-10,20"` will process pages 0, 5 through 10, and page 20.
 - `--output_format [markdown|json|html|chunks]`: Specify the format for the output results.
 - `--output_dir PATH`: Directory where output files will be saved. Defaults to the value specified in settings.OUTPUT_DIR.
-- `--paginate_output`: Paginates the output, using `\n\n{PAGE_NUMBER}` followed by `-` * 48, then `\n\n` 
+- `--paginate_output`: Paginates the output, using `\n\n{PAGE_NUMBER}` followed by `-` * 48, then `\n\n`
 - `--use_llm`: Uses an LLM to improve accuracy.  You will need to configure the LLM backend - see [below](#llm-services).
 - `--force_ocr`: Force OCR processing on the entire document, even for pages that might contain extractable text.  This will also format inline math properly.
 - `--block_correction_prompt`: if LLM mode is active, an optional prompt that will be used to correct the output of marker.  This is useful for custom formatting or logic that you want to apply to the output.
@@ -182,7 +185,7 @@ rendered = converter("FILEPATH")
 
 ### Extract blocks
 
-Each document consists of one or more pages.  Pages contain blocks, which can themselves contain other blocks.  It's possible to programmatically manipulate these blocks.  
+Each document consists of one or more pages.  Pages contain blocks, which can themselves contain other blocks.  It's possible to programmatically manipulate these blocks.
 
 Here's an example of extracting all forms from a document:
 
@@ -222,7 +225,7 @@ text, _, images = text_from_rendered(rendered)
 
 This takes all the same configuration as the PdfConverter.  You can specify the configuration `force_layout_block=Table` to avoid layout detection and instead assume every page is a table.  Set `output_format=json` to also get cell bounding boxes.
 
-You can also run this via the CLI with 
+You can also run this via the CLI with
 ```shell
 marker_single FILENAME --use_llm --force_layout_block Table --converter_cls marker.converters.table.TableConverter --output_format json
 ```
@@ -243,7 +246,7 @@ rendered = converter("FILEPATH")
 
 This takes all the same configuration as the PdfConverter.
 
-You can also run this via the CLI with 
+You can also run this via the CLI with
 ```shell
 marker_single FILENAME --converter_cls marker.converters.ocr.OCRConverter
 ```
@@ -260,7 +263,7 @@ from pydantic import BaseModel
 
 class Links(BaseModel):
     links: list[str]
-    
+
 schema = Links.model_json_schema()
 config_parser = ConfigParser({
     "page_schema": schema
@@ -300,7 +303,7 @@ HTML output is similar to markdown output:
 
 JSON output will be organized in a tree-like structure, with the leaf nodes being blocks.  Examples of leaf nodes are a single list item, a paragraph of text, or an image.
 
-The output will be a list, with each list item representing a page.  Each page is considered a block in the internal marker schema.  There are different types of blocks to represent different elements.  
+The output will be a list, with each list item representing a page.  Each page is considered a block in the internal marker schema.  There are different types of blocks to represent different elements.
 
 Pages have the keys:
 
@@ -366,7 +369,7 @@ All output formats will return a metadata dictionary, with the following fields:
     ], // computed PDF table of contents
     "page_stats": [
       {
-        "page_id":  0, 
+        "page_id":  0,
         "text_extraction_method": "pdftext",
         "block_counts": [("Span", 200), ...]
       },
@@ -553,4 +556,11 @@ PDF is a tricky format, so marker will not always work perfectly.  Here are some
 - Very complex layouts, with nested tables and forms, may not work
 - Forms may not be rendered well
 
-Note: Passing the `--use_llm` and `--force_ocr` flags will mostly solve these issues.
+Note: Passing the `--use_llm` and `--force_ocr` flags will mostly solve these issues.
+
+# Usage and Deployment Examples
+
+You can always run `marker` locally, but if you want to expose it as an API, we have a few options:
+- A [deployment example with Modal](./examples/README_MODAL.md) that shows how to deploy and access `marker` through a web endpoint using [`Modal`](https://modal.com), which makes compute easy to provision and scale.
+- Our platform API, which is also powered by `marker` and `surya` and is easy to test out: it's free to sign up, and we'll include credits. [Try it out here](https://datalab.to).
+- Our painless on-prem solution for commercial use, which you can [read about here](https://www.datalab.to/blog/self-serve-on-prem-licensing)
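For reference, the `--page_range` syntax quoted in the options hunk above (`"0,5-10,20"`) can be expanded as in the sketch below. `parse_page_range` is a hypothetical helper written for illustration, not marker's own parser. It assumes ranges are inclusive, as the README example describes, and that duplicate pages should be collapsed.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a spec such as "0,5-10,20" into a sorted list of page numbers.

    Hypothetical helper for illustration only; not part of marker.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            # "5-10" means pages 5 through 10, inclusive
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)
```

With the README's example, `parse_page_range("0,5-10,20")` yields page 0, pages 5 through 10, and page 20.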
diff --git a/examples/README_MODAL.md b/examples/README_MODAL.md
@@ -0,0 +1,94 @@
+## Usage Examples
+
+This directory contains examples of running `marker` in different contexts.
+
+### Usage with Modal
+
+We have a [self-contained example](./marker_modal_deployment.py) that shows how to deploy `marker` quickly with [Modal](https://modal.com): it provisions a container with a GPU and exposes it through an API, so you can submit PDFs for conversion into Markdown, HTML, or JSON.
+
+It's a limited example that you can extend into different use cases.
+
+#### Prerequisites
+
+Make sure you have the `modal` client installed by [following their instructions here](https://modal.com/docs/guide#getting-started).
+
+Modal's [Starter Plan](https://modal.com/pricing) includes $30 of free compute each month.
+Modal is [serverless](https://arxiv.org/abs/1902.03383), so you only pay for resources when you are using them.
+
+#### Running the example
+
+Once `modal` is configured, you can deploy it to your workspace by running:
+
+> modal deploy marker_modal_deployment.py
+
+Notes:
+- `marker` relies on a few models. By default, the endpoint checks whether these models are loaded and downloads them if not, so the first request will be slow. You can avoid this by running:
+
+> modal run marker_modal_deployment.py::download_models
+
+This creates a [`Modal Volume`](https://modal.com/docs/guide/Volumes) to store the models for re-use.
+
+Once the deploy is finished, you can:
+- Test a file upload locally through your CLI using an `invoke_conversion` command we expose through Modal's [`local_entrypoint`](https://modal.com/docs/reference/modal.App#local_entrypoint)
+- Get the URL of your endpoint and make a request through a client of your choice.
+
+**Test from your CLI with `invoke_conversion`**
+
+If your endpoint is live, simply run this command:
+
+```
+$ modal run marker_modal_deployment.py::invoke_conversion --pdf-file <PDF_FILE_PATH> --output-format markdown
+```
+
+This automatically detects the URL of your new endpoint using [`.get_web_url()`](https://modal.com/docs/guide/webhook-urls#determine-the-url-of-a-web-endpoint-from-code), checks that it's healthy, submits your file, and stores the output on your machine (in the same directory).
+
+**Making a request using your own client**
+
+If you want to make requests from elsewhere, e.g. with cURL or a client like Insomnia, you'll need the endpoint URL.
+
+When your `modal deploy` command from earlier finishes, it'll include your endpoint URL at the end. For example:
+
+```
+$ modal deploy marker_modal_deployment.py
+...
+✓ Created objects.
+├── 🔨 Created mount /marker/examples/marker_modal_deployment.py
+├── 🔨 Created function download_models.
+├── 🔨 Created function MarkerModalDemoService.*.
+└── 🔨 Created web endpoint for MarkerModalDemoService.fastapi_app => <YOUR_ENDPOINT_URL>
+✓ App deployed in 149.877s! 🎉
+```
+
+If you accidentally close your terminal session, you can always go to Modal's dashboard and:
+  - Find the app (default name: `datalab-marker-modal-demo`)
+  - Click on `MarkerModalDemoService`
+  - Find your endpoint URL
+
+Once you have your URL, make a request to `{YOUR_ENDPOINT_URL}/convert` like this (you can also use Insomnia, etc.):
+```shell
+curl --request POST \
+  --url {YOUR_ENDPOINT_URL}/convert \
+  --header 'Content-Type: multipart/form-data' \
+  --form file=@/Users/cooldev/sample.pdf \
+  --form output_format=html
+```
+
+You should get a response like this:
+
+```
+{
+	"success": true,
+	"filename": "sample.pdf",
+	"output_format": "html",
+	"json": null,
+	"html": "<YOUR_RESPONSE_CONTENT>",
+	"markdown": null,
+	"images": {},
+	"metadata": {... page level metadata ...},
+	"page_count": 2
+}
+```
+
+[Modal](https://modal.com) makes deploying and scaling models and inference workloads much easier.
+
+If you're interested in Datalab's managed API or on-prem document intelligence solution, check out [our platform here](https://datalab.to/?utm_source=gh-marker).
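The cURL request in `README_MODAL.md` above could also be made from Python. The sketch below uses only the standard library and is not part of the example deployment. The `/convert` route and the `file` and `output_format` form fields come from the cURL example. The helper names, the `application/pdf` part type, and the timeout are assumptions made for illustration.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # The file part; the application/pdf type is an assumption.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def convert(base_url, pdf_path, output_format="html", timeout=600):
    """POST a PDF to {base_url}/convert and return the decoded JSON response."""
    path = Path(pdf_path)
    body, content_type = build_multipart(
        {"output_format": output_format}, "file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/convert",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

Calling `convert("<YOUR_ENDPOINT_URL>", "sample.pdf")` should return a dictionary shaped like the sample response above, with the converted document under the key matching `output_format`.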
diff --git a/examples/marker_modal_deployment.py b/examples/marker_modal_deployment.py