Commit bb2377e

readme updates (#115)
Signed-off-by: Mandana Vaziri <[email protected]>
1 parent 27e224e commit bb2377e

2 files changed: +60 -32 lines changed

README.md

+30 -16
@@ -2,16 +2,14 @@
22

33
LLMs will continue to change the way we build software systems. They are not only useful as coding assistants, providing snippets of code, explanations, and code transformations, but they can also help replace components that could only previously be achieved with rule-based systems. Whether LLMs are used as coding assistants or software components, reliability remains an important concern. LLMs have a textual interface and the structure of useful prompts is not captured formally. Programming frameworks do not enforce or validate such structures since they are not specified in a machine-consumable way. The purpose of the Prompt Declaration Language (PDL) is to allow developers to specify the structure of prompts and to enforce it, while providing a unified programming framework for composing LLMs with rule-based systems.
44

5-
PDL is based on the premise that interactions between users, LLMs and rule-based systems form a *document*. Consider for example the interactions between a user and a chatbot. At each interaction, the exchanges form a document that gets longer and longer. Similarly, chaining models together or using tools for specific tasks result in outputs that together form a document. PDL allows users to specify the shape and contents of such documents in a declarative way (in YAML or JSON), and is agnostic of any programming language. Because of its document-oriented nature, it can be used to easily express a variety of data generation tasks (inference, data synthesis, data generation for model training, etc...). Moreover, PDL programs themselves are structured data (YAML) as opposed to traditional code, so they make good targets for LLM generation as well.
6-
5+
PDL is based on the premise that interactions between users, LLMs and rule-based systems form a *document*. Consider for example the interactions between a user and a chatbot. At each interaction, the exchanges form a document that gets longer and longer. Similarly, chaining models together or using tools for specific tasks result in outputs that together form a document. PDL allows users to specify the shape and contents of such documents in a declarative way (in YAML), and is agnostic of any programming language. Because of its document-oriented nature, it can be used to easily express a variety of data generation tasks (inference, data synthesis, data generation for model training, etc...).
76

87
PDL provides the following features:
98
- Ability to use any LLM locally or remotely via [LiteLLM](https://www.litellm.ai/), including [IBM's Watsonx](https://www.ibm.com/watsonx)
10-
- Ability to templatize not only prompts for one LLM call, but also composition of LLMs with tools (code and APIs). Templates can encompass tasks of larger granularity than a single LLM call (unlike many prompt programming languages)
9+
- Ability to templatize not only prompts for one LLM call, but also composition of LLMs with tools (code and APIs). Templates can encompass tasks of larger granularity than a single LLM call
1110
- Control structures: variable definitions and use, conditionals, loops, functions
12-
- Ability to read from files, including JSON data
13-
- Ability to call out to code. At the moment only Python is supported, but this could be any other programming language in principle
14-
- Ability to call out to REST APIs with Python code
11+
- Ability to read from files and stdin, including JSON data
12+
- Ability to call out to code and call REST APIs (Python)
1513
- Type checking input and output of model calls
1614
- Python SDK
1715
- Support for chat APIs and chat templates
@@ -24,21 +22,21 @@ See below for installation notes, followed by an [overview](#overview) of the la
2422

2523
## Interpreter Installation
2624

27-
The interpreter has been tested with Python version **3.12**.
25+
The interpreter has been tested with Python version **3.11 and 3.12**.
2826

2927
To install the requirements for `pdl`, execute the command:
3028

3129
```
32-
pip3 install prompt-declaration-language
30+
pip install prompt-declaration-language
3331
```
3432

3533
To install the dependencies for developing PDL and running all the examples, execute the commands:
3634
```
37-
pip3 install 'prompt-declaration-language[all]'
35+
pip install 'prompt-declaration-language[dev]'
36+
pip install 'prompt-declaration-language[examples]'
37+
pip install 'prompt-declaration-language[docs]'
3838
```
3939

40-
41-
4240
In order to run the examples that use foundation models hosted on [Watsonx](https://www.ibm.com/watsonx) via LiteLLM, you need a WatsonX account (a free plan is available) and set up the following environment variables:
4341
- `WATSONX_URL`, the API url (set to `https://{region}.ml.cloud.ibm.com`) of your WatsonX instance
4442
- `WATSONX_APIKEY`, the API key (see information on [key creation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui#create_user_key))
@@ -49,12 +47,28 @@ For more information, see [documentation](https://docs.litellm.ai/docs/providers
4947
To run the interpreter:
5048

5149
```
52-
pdl <path/to/example.yaml>
50+
pdl <path/to/example.pdl>
5351
```
5452

5553
The folder `examples` contains many examples of PDL programs. Several of these examples have been adapted from the LMQL [paper](https://arxiv.org/abs/2212.06094) by Beurer-Kellner et al. The examples cover a variety of prompting patterns such as CoT, RAG, ReAct, and tool use.
5654

57-
We highly recommend using VSCode to edit PDL YAML files. This project has been configured so that every YAML file is associated with the PDL grammar JSONSchema (see [settings](https://github.com/IBM/prompt-declaration-language/blob/main/.vscode/settings.json) and [schema](https://github.com/IBM/prompt-declaration-language/blob/main/pdl-schema.json)). This enables the editor to display error messages when the yaml deviates from the PDL syntax and grammar. It also provides code completion. You can set up your own VSCode PDL projects similarly using this settings and schema files. The PDL interpreter also provides similar error messages.
55+
We highly recommend using VSCode to edit PDL YAML files. This project has been configured so that every YAML file is associated with the PDL grammar JSONSchema (see [settings](https://github.com/IBM/prompt-declaration-language/blob/main/.vscode/settings.json) and [schema](https://github.com/IBM/prompt-declaration-language/blob/main/pdl-schema.json)). This enables the editor to display error messages when the yaml deviates from the PDL syntax and grammar. It also provides code completion. You can set up your own VSCode PDL projects similarly using the following `./vscode/settings.json` file:
56+
57+
```
58+
{
59+
"yaml.schemas": {
60+
"https://ibm.github.io/prompt-declaration-language/dist/pdl-schema.json": "*.pdl"
61+
},
62+
"files.associations": {
63+
"*.pdl": "yaml",
64+
}
65+
}
66+
```
67+
68+
The interpreter executes Python code specified in PDL code blocks. To sandbox the interpreter for safe execution,
69+
you can use the `--sandbox` flag which runs the interpreter in a docker container. Without this flag, the interpreter
70+
and all code is executed locally. To use the `--sandbox` flag, you need to have a docker daemon running, such as
71+
[Rancher Desktop](https://rancherdesktop.io).
5872

5973
The interpreter prints out a log by default in the file `log.txt`. This log contains the details of inputs and outputs to every block in the program. It is useful to examine this file when the program is behaving differently than expected. The log displays the exact prompts submitted to models by LiteLLM (after applying chat templates), which can be
6074
useful for debugging.
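
The invocations touched by this hunk (`pdl <path/to/example.pdl>` and the `--sandbox` flag) can be assembled programmatically. The following is a minimal sketch, not part of PDL: `build_pdl_command` is a hypothetical helper, the example path is made up, and the `--trace <file.json>` form is taken from a later hunk of this commit. Whether the two flags can be combined, and in which order, is an assumption.

```python
import shlex
from typing import List, Optional


def build_pdl_command(program: str,
                      trace_file: Optional[str] = None,
                      sandbox: bool = False) -> List[str]:
    """Assemble an argv list for the `pdl` CLI (hypothetical helper)."""
    argv = ["pdl"]
    if sandbox:
        # Per the README, --sandbox runs the interpreter in a docker container.
        argv.append("--sandbox")
    if trace_file is not None:
        # Form shown in the updated README: pdl --trace <file.json> <my-example>
        argv += ["--trace", trace_file]
    # The program path comes last, as in the README examples.
    argv.append(program)
    return argv


if __name__ == "__main__":
    # "examples/hello.pdl" is a placeholder path, not a file named in this diff.
    cmd = build_pdl_command("examples/hello.pdl", trace_file="hello_trace.json")
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` once `pdl` is installed. Building a list rather than a shell string avoids quoting problems with paths that contain spaces.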
@@ -233,7 +247,7 @@ The function `deserializeOffsetMap` takes a string as input and returns a map. I
233247
The `@SuppressWarnings("unchecked")` annotation is used to suppress the warning that the type of the parsed map is not checked. This is because the Jackson library is used to parse the input string into a map, but the specific type of the map is not known at compile time. Therefore, the warning is suppressed to avoid potential issues.
234248
```
235249

236-
Notice that in PDL variables are used to templatize any entity in the document, not just textual prompts to LLMs. We can add a block to this document to evaluate the quality of the output using a similarity metric with respect to our [ground truth](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/ground_truth.txt). See [file](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/code-eval.yaml):
250+
Notice that in PDL variables are used to templatize any entity in the document, not just textual prompts to LLMs. We can add a block to this document to evaluate the quality of the output using a similarity metric with respect to our [ground truth](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/ground_truth.txt). See [file](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/code-eval.pdl):
237251

238252
```yaml
239253
description: Code explanation example
@@ -368,7 +382,7 @@ PDL has a Live Document visualizer to help in program understanding given an exe
368382
To produce an execution trace consumable by the Live Document, you can run the interpreter with the `--trace` argument:
369383
370384
```
371-
pdl <my-example> --trace
385+
pdl --trace <file.json> <my-example>
372386
```
373387
374388
This produces an additional file named `my-example_trace.json` that can be uploaded to the [Live Document](https://ibm.github.io/prompt-declaration-language/viewer/) visualizer tool. Clicking on different parts of the Live Document will show the PDL code that produced that part
@@ -379,7 +393,7 @@ This is similar to a spreadsheet for tabular data, where data is in the forefron
379393
380394
## Additional Notes
381395
382-
When using Granite models on Watsonx, we use the following defaults for model parameters:
396+
When using Granite models on Watsonx, we use the following defaults for model parameters (except `granite-20b-code-instruct-r1.1`):
383397
- `decoding_method`: `greedy`
384398
- `max_new_tokens`: 1024
385399
- `min_new_tokens`: 1
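
As an illustration of what "defaults" means here, the sketch below merges the three values visible in this hunk with caller-supplied parameters. It is not PDL's implementation: `GRANITE_DEFAULTS` and `with_granite_defaults` are hypothetical names, the README list may continue beyond the diff context, and the exception for `granite-20b-code-instruct-r1.1` is not modelled.

```python
from typing import Any, Dict, Optional

# Only the defaults visible in this diff hunk; the full README may list more.
GRANITE_DEFAULTS: Dict[str, Any] = {
    "decoding_method": "greedy",
    "max_new_tokens": 1024,
    "min_new_tokens": 1,
}


def with_granite_defaults(params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a new dict where explicit parameters override the defaults."""
    merged = dict(GRANITE_DEFAULTS)   # copy so the module-level defaults stay untouched
    merged.update(params or {})       # explicit values win over defaults
    return merged
```

Calling `with_granite_defaults({"max_new_tokens": 256})` keeps `decoding_method` and `min_new_tokens` at their default values and replaces only `max_new_tokens`.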

docs/README.md

+30 -16
@@ -7,16 +7,14 @@ hide:
77

88
LLMs will continue to change the way we build software systems. They are not only useful as coding assistants, providing snippets of code, explanations, and code transformations, but they can also help replace components that could only previously be achieved with rule-based systems. Whether LLMs are used as coding assistants or software components, reliability remains an important concern. LLMs have a textual interface and the structure of useful prompts is not captured formally. Programming frameworks do not enforce or validate such structures since they are not specified in a machine-consumable way. The purpose of the Prompt Declaration Language (PDL) is to allow developers to specify the structure of prompts and to enforce it, while providing a unified programming framework for composing LLMs with rule-based systems.
99

10-
PDL is based on the premise that interactions between users, LLMs and rule-based systems form a *document*. Consider for example the interactions between a user and a chatbot. At each interaction, the exchanges form a document that gets longer and longer. Similarly, chaining models together or using tools for specific tasks result in outputs that together form a document. PDL allows users to specify the shape and contents of such documents in a declarative way (in YAML or JSON), and is agnostic of any programming language. Because of its document-oriented nature, it can be used to easily express a variety of data generation tasks (inference, data synthesis, data generation for model training, etc...). Moreover, PDL programs themselves are structured data (YAML) as opposed to traditional code, so they make good targets for LLM generation as well.
11-
10+
PDL is based on the premise that interactions between users, LLMs and rule-based systems form a *document*. Consider for example the interactions between a user and a chatbot. At each interaction, the exchanges form a document that gets longer and longer. Similarly, chaining models together or using tools for specific tasks result in outputs that together form a document. PDL allows users to specify the shape and contents of such documents in a declarative way (in YAML), and is agnostic of any programming language. Because of its document-oriented nature, it can be used to easily express a variety of data generation tasks (inference, data synthesis, data generation for model training, etc...).
1211

1312
PDL provides the following features:
1413
- Ability to use any LLM locally or remotely via [LiteLLM](https://www.litellm.ai/), including [IBM's Watsonx](https://www.ibm.com/watsonx)
15-
- Ability to templatize not only prompts for one LLM call, but also composition of LLMs with tools (code and APIs). Templates can encompass tasks of larger granularity than a single LLM call (unlike many prompt programming languages)
14+
- Ability to templatize not only prompts for one LLM call, but also composition of LLMs with tools (code and APIs). Templates can encompass tasks of larger granularity than a single LLM call
1615
- Control structures: variable definitions and use, conditionals, loops, functions
17-
- Ability to read from files, including JSON data
18-
- Ability to call out to code. At the moment only Python is supported, but this could be any other programming language in principle
19-
- Ability to call out to REST APIs with Python code
16+
- Ability to read from files and stdin, including JSON data
17+
- Ability to call out to code and call REST APIs (Python)
2018
- Type checking input and output of model calls
2119
- Python SDK
2220
- Support for chat APIs and chat templates
@@ -29,21 +27,21 @@ See below for installation notes, followed by an [overview](#overview) of the la
2927

3028
## Interpreter Installation
3129

32-
The interpreter has been tested with Python version **3.12**.
30+
The interpreter has been tested with Python version **3.11 and 3.12**.
3331

3432
To install the requirements for `pdl`, execute the command:
3533

3634
```
37-
pip3 install prompt-declaration-language
35+
pip install prompt-declaration-language
3836
```
3937

4038
To install the dependencies for developing PDL and running all the examples, execute the commands:
4139
```
42-
pip3 install 'prompt-declaration-language[all]'
40+
pip install 'prompt-declaration-language[dev]'
41+
pip install 'prompt-declaration-language[examples]'
42+
pip install 'prompt-declaration-language[docs]'
4343
```
4444

45-
46-
4745
In order to run the examples that use foundation models hosted on [Watsonx](https://www.ibm.com/watsonx) via LiteLLM, you need a WatsonX account (a free plan is available) and set up the following environment variables:
4846
- `WATSONX_URL`, the API url (set to `https://{region}.ml.cloud.ibm.com`) of your WatsonX instance
4947
- `WATSONX_APIKEY`, the API key (see information on [key creation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui#create_user_key))
@@ -54,12 +52,28 @@ For more information, see [documentation](https://docs.litellm.ai/docs/providers
5452
To run the interpreter:
5553

5654
```
57-
pdl <path/to/example.yaml>
55+
pdl <path/to/example.pdl>
5856
```
5957

6058
The folder `examples` contains many examples of PDL programs. Several of these examples have been adapted from the LMQL [paper](https://arxiv.org/abs/2212.06094) by Beurer-Kellner et al. The examples cover a variety of prompting patterns such as CoT, RAG, ReAct, and tool use.
6159

62-
We highly recommend using VSCode to edit PDL YAML files. This project has been configured so that every YAML file is associated with the PDL grammar JSONSchema (see [settings](https://github.com/IBM/prompt-declaration-language/blob/main/.vscode/settings.json) and [schema](https://github.com/IBM/prompt-declaration-language/blob/main/pdl-schema.json)). This enables the editor to display error messages when the yaml deviates from the PDL syntax and grammar. It also provides code completion. You can set up your own VSCode PDL projects similarly using this settings and schema files. The PDL interpreter also provides similar error messages.
60+
We highly recommend using VSCode to edit PDL YAML files. This project has been configured so that every YAML file is associated with the PDL grammar JSONSchema (see [settings](https://github.com/IBM/prompt-declaration-language/blob/main/.vscode/settings.json) and [schema](https://github.com/IBM/prompt-declaration-language/blob/main/pdl-schema.json)). This enables the editor to display error messages when the yaml deviates from the PDL syntax and grammar. It also provides code completion. You can set up your own VSCode PDL projects similarly using the following `./vscode/settings.json` file:
61+
62+
```
63+
{
64+
"yaml.schemas": {
65+
"https://ibm.github.io/prompt-declaration-language/dist/pdl-schema.json": "*.pdl"
66+
},
67+
"files.associations": {
68+
"*.pdl": "yaml",
69+
}
70+
}
71+
```
72+
73+
The interpreter executes Python code specified in PDL code blocks. To sandbox the interpreter for safe execution,
74+
you can use the `--sandbox` flag which runs the interpreter in a docker container. Without this flag, the interpreter
75+
and all code is executed locally. To use the `--sandbox` flag, you need to have a docker daemon running, such as
76+
[Rancher Desktop](https://rancherdesktop.io).
6377

6478
The interpreter prints out a log by default in the file `log.txt`. This log contains the details of inputs and outputs to every block in the program. It is useful to examine this file when the program is behaving differently than expected. The log displays the exact prompts submitted to models by LiteLLM (after applying chat templates), which can be
6579
useful for debugging.
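
The VSCode settings added in this hunk can also be generated with a short script. This is a sketch under stated assumptions: `write_pdl_vscode_settings` is a hypothetical helper, the file is written to `.vscode/settings.json` (the path used by the repository's own settings link), and any existing settings file at that location is overwritten rather than merged.

```python
import json
from pathlib import Path

# Schema URL and glob copied from the settings snippet in this commit.
PDL_SCHEMA_URL = "https://ibm.github.io/prompt-declaration-language/dist/pdl-schema.json"


def pdl_vscode_settings() -> dict:
    """Settings that associate *.pdl files with YAML and the PDL JSON schema."""
    return {
        "yaml.schemas": {PDL_SCHEMA_URL: "*.pdl"},
        "files.associations": {"*.pdl": "yaml"},
    }


def write_pdl_vscode_settings(project_dir: str) -> Path:
    """Write the settings file into <project_dir>/.vscode/ (overwrites any existing file)."""
    target = Path(project_dir) / ".vscode" / "settings.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(pdl_vscode_settings(), indent=2) + "\n", encoding="utf-8")
    return target
```

Serializing with `json.dumps` produces strict JSON, so the output has no trailing comma after the last entry.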
@@ -238,7 +252,7 @@ The function `deserializeOffsetMap` takes a string as input and returns a map. I
238252
The `@SuppressWarnings("unchecked")` annotation is used to suppress the warning that the type of the parsed map is not checked. This is because the Jackson library is used to parse the input string into a map, but the specific type of the map is not known at compile time. Therefore, the warning is suppressed to avoid potential issues.
239253
```
240254

241-
Notice that in PDL variables are used to templatize any entity in the document, not just textual prompts to LLMs. We can add a block to this document to evaluate the quality of the output using a similarity metric with respect to our [ground truth](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/ground_truth.txt). See [file](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/code-eval.yaml):
255+
Notice that in PDL variables are used to templatize any entity in the document, not just textual prompts to LLMs. We can add a block to this document to evaluate the quality of the output using a similarity metric with respect to our [ground truth](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/ground_truth.txt). See [file](https://github.com/IBM/prompt-declaration-language/blob/main/examples/code/code-eval.pdl):
242256

243257
```yaml
244258
description: Code explanation example
@@ -373,7 +387,7 @@ PDL has a Live Document visualizer to help in program understanding given an exe
373387
To produce an execution trace consumable by the Live Document, you can run the interpreter with the `--trace` argument:
374388
375389
```
376-
pdl <my-example> --trace
390+
pdl --trace <file.json> <my-example>
377391
```
378392
379393
This produces an additional file named `my-example_trace.json` that can be uploaded to the [Live Document](https://ibm.github.io/prompt-declaration-language/viewer/) visualizer tool. Clicking on different parts of the Live Document will show the PDL code that produced that part
@@ -384,7 +398,7 @@ This is similar to a spreadsheet for tabular data, where data is in the forefron
384398
385399
## Additional Notes
386400
387-
When using Granite models on Watsonx, we use the following defaults for model parameters:
401+
When using Granite models on Watsonx, we use the following defaults for model parameters (except `granite-20b-code-instruct-r1.1`):
388402
- `decoding_method`: `greedy`
389403
- `max_new_tokens`: 1024
390404
- `min_new_tokens`: 1
