22 changes: 11 additions & 11 deletions docs/tutorial.md
@@ -36,7 +36,7 @@ Hello, world!
--8<-- "./examples/tutorial/calling_llm.pdl"
```

In this program ([file](https://github.com/IBM/prompt-declaration-language//blob/main/examples/tutorial/calling_llm.pdl)), the `text` starts with the string `"Hello\n"`, and we call a model (`replicate/ibm-granite/granite-3.1-8b-instruct`) with this as the input prompt.
The model is passed a parameter `stop_sequences`.

A PDL program computes two data structures. The first is a JSON value corresponding to the result of the overall program, obtained by aggregating the results of each block. This is what is printed by default when we run the interpreter. The second is a conversational background context: a list of role/content pairs, where we implicitly keep track of roles and content for the purpose of communicating with models that support chat APIs. The contents in the latter correspond to the results of each block. The conversational background context is what is used to make calls to LLMs via LiteLLM.
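The distinction can be sketched with a small program (the model name is illustrative):

```
text:
- "Hello\n"
- model: ollama/granite-code:8b
  def: GEN
```

The result is the concatenation of `"Hello\n"` and the model's output, while the background context accumulates the same contents as role/content pairs (the literal string typically under the `user` role and the model output under `assistant`), and it is this context that is sent on subsequent model calls.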
@@ -81,7 +81,7 @@ When using Granite models, we use the following defaults for model parameters (e
- `max_new_tokens`: 1024
- `min_new_tokens`: 1
- `repetition_penalty`: 1.05

Also if the `decoding_method` is `sample`, then the following defaults are used:
- `temperature`: 0.7
- `top_p`: 0.85
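Any of these defaults can be overridden on a given call via `parameters`; a sketch (the model name is illustrative):

```
- model: ollama/granite-code:8b
  parameters:
    decoding_method: sample
    temperature: 0.9
    top_p: 0.9
    max_new_tokens: 256
```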
@@ -199,7 +199,7 @@ Here are its possible values:

## Specifying Data

In PDL, the user specifies step by step the shape of data they wish to generate. A `text` block takes a list of blocks, stringifies the result of each block,
and concatenates them.
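For example, this sketch of a `text` block produces the single string `Hello world!`:

```
text:
- "Hello"
- " world"
- "!"
```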

An `array` takes a list of blocks and creates an array of the results of each block:
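A minimal sketch:

```
array:
- "red"
- "green"
- "blue"
```

which yields the JSON array `["red", "green", "blue"]`.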
@@ -253,7 +253,7 @@ Notice that block types that require lists (`repeat`, `for`, `if-then-else`) hav
on this see [this section](#conditionals-and-loops).

The PDL interpreter raises a warning for a list item inside a `lastOf` block that does not capture its result in a variable definition, meaning that the result is implicitly ignored.
If this is intended, for example because the block contributes to the context or performs a side effect, the warning can be turned off by including `contribute: [context]` or `contribute: []`.
On the other hand, if this was a mistake, capture the result of the block in a variable definition by adding `def`.
You could also turn the list into a text or an array by surrounding it with a `text` or `array` block so that no result is lost.
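As an illustration (the model name is hypothetical), the first item below deliberately contributes only to the background context, so the warning is silenced and the overall result is that of the last block:

```
lastOf:
- model: ollama/granite-code:8b
  contribute: [context]
- "All done."
```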

@@ -387,15 +387,15 @@ PDL supports conditionals and loops as illustrated in the following example ([fi

The first block prompts the user for a query, and this is contributed to the background context. The next
block is a `repeat-until`, which repeats the contained `text` block until the condition in the `until` becomes
true. The field `repeat` can contain a string, a block, or a list. If it contains a list, the list is
interpreted as a `lastOf` block: all the blocks in the list are executed, and the result of the body is that of the last block.
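A condensed sketch of the pattern (the model name and the exact `read` fields are illustrative):

```
repeat:
  text:
  - model: ollama/granite-code:8b
  - def: eval
    read:
    message: "Is this a good answer? (yes/no)\n"
until: ${ eval == 'yes' }
```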

The example also shows the use of an `if-then-else` block. The `if` field contains a condition; the `then` field
can also contain a string, a block, or a list (and similarly for `else`). If it contains a list,
the list is interpreted as a `lastOf` block, so again the blocks in the list are executed and the result is that
of the last block.

The chatbot keeps looping by making a call to a model, asking the user if the generated text is a good answer,
and asking `why not?` if the answer (stored in variable `eval`) is `no`. The loop ends when `eval` becomes `yes`. This is specified with a Jinja expression on line 18.

Notice that the `repeat` and `then` blocks are followed by `text`. This is because of the semantics of lists in PDL. If we want to aggregate the result by stringifying every element in the list and collating them together, then the list must be preceded by the keyword `text`. If this is omitted, the list is treated as a programmatic sequence where all the blocks are executed in order, but the result of the overall list is the result of the *last* block in the sequence. This behavior can be made explicit with a `lastOf` block.
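The difference can be sketched side by side:

```
# result is "ab": each element is stringified and collated
text:
- "a"
- "b"
```

```
# result is "b": the blocks run in sequence and only the last result is kept
lastOf:
- "a"
- "b"
```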
@@ -477,7 +477,7 @@ join:
```

meaning that the result of each iteration is stringified and concatenated with that of the other iterations. When using `with`,
`as: text` can be elided.
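For instance, a sketch of a `for` loop whose iteration results are joined with a comma:

```
for:
  number: [1, 2, 3]
repeat: ${ number }
join:
  with: ", "
```

which would produce `1, 2, 3`.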

Note that `join` can be added to any looping construct (`repeat`), not just `for` loops.

@@ -651,7 +651,7 @@ This is similar to a spreadsheet for tabular data, where data is in the forefron
## Using Ollama models

1. Install Ollama e.g., `brew install --cask ollama`
- 2. Run a model e.g., `ollama run granite-code:34b-instruct-q5_K_M`. See [the Ollama library for more models](https://ollama.com/library/granite-code/tags)
+ 2. Run a model e.g., `ollama run granite-code:8b`. See [the Ollama library for more models](https://ollama.com/library/granite-code/tags)
3. An OpenAI-style server is then running locally at [http://localhost:11434/](http://localhost:11434/); see [the Ollama blog](https://ollama.com/blog/openai-compatibility) for more details.


@@ -660,7 +660,7 @@ Example:
```
text:
- Hello,
- - model: ollama_chat/granite-code:34b-instruct-q5_K_M
+ - model: ollama_chat/granite-code:8b
parameters:
stop:
- '!'
2 changes: 1 addition & 1 deletion examples/callback/repair_prompt.pdl
@@ -9,7 +9,7 @@ lastOf:
Please repair the code!

- def: raw_output
-   model: replicate/ibm-granite/granite-3.1-8b-instruct
+   model: ollama/granite-code:8b
parameters:
#stop_sequences: "\n\n"
temperature: 0
2 changes: 1 addition & 1 deletion examples/chatbot/chatbot.pdl
@@ -6,7 +6,7 @@ text:
- repeat:
text:
# Send context to Granite model hosted at replicate.com
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
# Allow the user to type 'yes', 'no', or anything else, storing
# the input into a variable named `eval`. The input is also implicitly
# added to the context.
6 changes: 3 additions & 3 deletions examples/code/code-eval.pdl
@@ -10,13 +10,13 @@ defs:
text:
# Print the source code to the console
- "\n${ CODE.source_code }\n"
- # Use replicate.com to invoke a Granite model with a prompt. Output AND
+ # Use ollama to invoke a Granite model with a prompt. Output AND
# set the variable `EXPLANATION` to the output.
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
def: EXPLANATION
input: |
Here is some info about the location of the function in the repo.
repo:
${ CODE.repo_info.repo }
path: ${ CODE.repo_info.path }
Function_name: ${ CODE.repo_info.function_name }
6 changes: 3 additions & 3 deletions examples/code/code-json.pdl
@@ -6,13 +6,13 @@ defs:
TRUTH:
read: ./ground_truth.txt
text:
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
def: EXPLANATION
contribute: []
input:
|
Here is some info about the location of the function in the repo.
repo:
${ CODE.repo_info.repo }
path: ${ CODE.repo_info.path }
Function_name: ${ CODE.repo_info.function_name }
@@ -37,7 +37,7 @@ text:
"""
# (In PDL, set `result` to the output you wish for your code block.)
result = textdistance.levenshtein.normalized_similarity(expl, truth)
- data:
input: ${ CODE }
output: ${ EXPLANATION }
metric: ${ EVAL }
8 changes: 4 additions & 4 deletions examples/code/code.pdl
@@ -7,16 +7,16 @@ defs:
text:
# Output the `source_code:` of the YAML to the console
- "\n${ CODE.source_code }\n"
- # Use replicate.com to invoke a Granite model with a prompt
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ # Use ollama to invoke a Granite model with a prompt
+ - model: ollama/granite-code:8b
input: |
Here is some info about the location of the function in the repo.
repo:
${ CODE.repo_info.repo }
path: ${ CODE.repo_info.path }
Function_name: ${ CODE.repo_info.function_name }


Explain the following code:
```
${ CODE.source_code }```
8 changes: 4 additions & 4 deletions examples/demo/3-weather.pdl
@@ -2,7 +2,7 @@ description: Using a weather API and LLM to make a small weather app
text:
- def: QUERY
text: "What is the weather in Madrid?\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
input: |
Extract the location from the question.
Question: What is the weather in London?
@@ -19,14 +19,14 @@ text:
- lang: python
code: |
import requests
#result = requests.get('https://api.weatherapi.com/v1/current.json?key==XYZ=${ LOCATION }')
#Mock result:
result = '{"location": {"name": "Madrid", "region": "Madrid", "country": "Spain", "lat": 40.4, "lon": -3.6833, "tz_id": "Europe/Madrid", "localtime_epoch": 1732543839, "localtime": "2024-11-25 15:10"}, "current": {"last_updated_epoch": 1732543200, "last_updated": "2024-11-25 15:00", "temp_c": 14.4, "temp_f": 57.9, "is_day": 1, "condition": {"text": "Partly cloudy", "icon": "//cdn.weatherapi.com/weather/64x64/day/116.png", "code": 1003}, "wind_mph": 13.2, "wind_kph": 21.2, "wind_degree": 265, "wind_dir": "W", "pressure_mb": 1017.0, "pressure_in": 30.03, "precip_mm": 0.01, "precip_in": 0.0, "humidity": 77, "cloud": 75, "feelslike_c": 12.8, "feelslike_f": 55.1, "windchill_c": 13.0, "windchill_f": 55.4, "heatindex_c": 14.5, "heatindex_f": 58.2, "dewpoint_c": 7.3, "dewpoint_f": 45.2, "vis_km": 10.0, "vis_miles": 6.0, "uv": 1.4, "gust_mph": 15.2, "gust_kph": 24.4}}'
def: WEATHER
parser: json
contribute: []
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
input: |
Explain the weather from the following JSON:
${ WEATHER }

4 changes: 2 additions & 2 deletions examples/demo/4-translator.pdl
@@ -1,7 +1,7 @@
description: PDL program
text:
- "What is APR?\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
- repeat:
text:
- read:
@@ -11,5 +11,5 @@ text:
then:
text:
- "\n\nTranslate the above to ${ language }\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
until: ${ language == 'stop' }
5 changes: 2 additions & 3 deletions examples/fibonacci/fib.pdl
@@ -6,7 +6,7 @@ text:
# Use IBM Granite to author a program that computes the Nth Fibonacci number,
# storing the generated program into the variable `CODE`.
- def: CODE
-   model: replicate/ibm-granite/granite-3.1-8b-instruct
+   model: ollama/granite-code:8b
input: "Write a Python function to compute the Fibonacci sequence. Do not include a doc string.\n\n"
parameters:
# Request no randomness when generating code
@@ -42,5 +42,4 @@ text:

# Invoke the LLM again to explain the PDL context
- "\n\nExplain what the above code does and what the result means\n\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
4 changes: 2 additions & 2 deletions examples/hello/hello-def-use.pdl
@@ -1,8 +1,8 @@
description: Hello world with variable use
text:
- "Hello\n"
- # Define GEN to be the result of a Granite LLM using replicate.com
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ # Define GEN to be the result of a Granite LLM using ollama
+ - model: ollama/granite-code:8b
parameters:
# "greedy" sampling tells the LLM to use the most likely token at each step
decoding_method: greedy
5 changes: 2 additions & 3 deletions examples/hello/hello-model-chaining.pdl
@@ -1,16 +1,15 @@
description: Hello world showing model chaining
text:
- "Hello\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
parameters:
# "greedy" sampling tells the LLM to use the most likely token at each step
decoding_method: greedy
# Tell the LLM to stop after generating an exclamation point.
stop_sequences: '!'
def: GEN
- "\nDid you say ${ GEN }?\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
parameters:
decoding_method: greedy
stop_sequences: '.'

2 changes: 1 addition & 1 deletion examples/hello/hello-model-input.pdl
@@ -1,6 +1,6 @@
description: Hello world with model input
text:
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
input: "Hello,"
parameters:
# Tell the LLM to stop after generating an exclamation point.
9 changes: 4 additions & 5 deletions examples/hello/hello-parser-json.pdl
@@ -6,25 +6,24 @@ defs:
parser: yaml
spec: { questions: [str], answers: [obj] }
text:
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
def: model_output
spec: {name: str, age: int}
input:
array:
- role: user
content:
text:
- for:
question: ${ data.questions }
answer: ${ data.answers }
repeat: |
${ question }
${ answer }
- >
Question: Generate only a JSON object with fields 'name' and 'age' and set them appropriately. Write the age all in letters. Only generate a single JSON object and nothing else.
parser: yaml
parameters:
stop_sequences: "Question"
temperature: 0


2 changes: 1 addition & 1 deletion examples/hello/hello-parser-regex.pdl
@@ -1,6 +1,6 @@
description: Hello world with parser using regex
text:
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
input: "Hello,"
parameters:
# Tell the LLM to stop after generating an exclamation point.
2 changes: 1 addition & 1 deletion examples/hello/hello-roles-array.pdl
@@ -7,6 +7,6 @@ text:
- role: user
content: Write a Python function that implement merge sort.
contribute: []
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
input: ${ prompt }

2 changes: 1 addition & 1 deletion examples/hello/hello-type.pdl
@@ -11,7 +11,7 @@ text:
return:
lastOf:
- "\nTranslate the sentence '${ sentence }' to ${ language }.\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b
parameters:
stop_sequences: "\n"
- call: ${ translate }
2 changes: 1 addition & 1 deletion examples/hello/hello.pdl
@@ -1,4 +1,4 @@
description: Hello world
text:
- "Hello\n"
- - model: replicate/ibm-granite/granite-3.1-8b-instruct
+ - model: ollama/granite-code:8b