
Commit 335bf1b

prabod authored and DevinTDHa committed
[SPARKNLP-1256] - Introducing AutoGGUFReranker (#14649)
* [SPARKNLP-1256] Add AutoGGUFReranker annotator and its tests
* add resource downloader
  Signed-off-by: Prabod Rathnayaka <[email protected]>
* Add example notebook for AutoGGUFReranker model integration in Spark NLP
* Add documentation for AutoGGUFReranker annotator and update annotators list
* Changes requested
  Signed-off-by: Prabod Rathnayaka <[email protected]>

---------

Signed-off-by: Prabod Rathnayaka <[email protected]>
1 parent 713c3a0 commit 335bf1b


10 files changed: +1568 −2 lines changed

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
````markdown
{%- capture title -%}
AutoGGUFReranker
{%- endcapture -%}

{%- capture description -%}
Annotator that uses the llama.cpp library to rerank text documents based on their relevance to
a given query using GGUF-format reranking models.

This annotator is specifically designed for text reranking tasks, where multiple documents or
text passages are ranked according to their relevance to a query. It uses specialized
reranking models in GGUF format that output relevance scores for each input document.

The reranker takes a query (set via `setQuery`) and a list of documents, then returns the same
documents with added metadata containing relevance scores. The documents are processed in
batches and each receives a `relevance_score` in its metadata indicating how relevant it is to
the provided query.

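As an illustration of how this metadata might be consumed downstream, the following sketch
(assuming a `result` DataFrame produced by a fitted pipeline, as in the examples below)
extracts each document's text together with its `relevance_score` and orders the documents by
score; the cast to float and the descending sort are assumptions made for illustration only:

```scala
import org.apache.spark.sql.functions.{col, element_at, explode}

// Flatten the annotation array and read the relevance score out of each metadata map.
// "reranked_documents" follows the output column used in the examples below.
val ranked = result
  .select(explode(col("reranked_documents")).as("doc"))
  .select(
    col("doc.result").as("text"),
    element_at(col("doc.metadata"), "relevance_score").cast("float").as("relevance_score"))
  .orderBy(col("relevance_score").desc)

ranked.show(truncate = false)
```
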
For settable parameters and their explanations, see [HasLlamaCppInferenceProperties](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/HasLlamaCppInferenceProperties.scala), [HasLlamaCppModelProperties](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/HasLlamaCppModelProperties.scala), and refer to
the llama.cpp documentation of
[server.cpp](https://github.com/ggerganov/llama.cpp/tree/7d5e8777ae1d21af99d4f95be10db4870720da91/examples/server)
for more information.

If the parameters are not set, the annotator defaults to the parameters provided by the model.

Pretrained models can be loaded with `pretrained` of the companion object:

```scala
val reranker = AutoGGUFReranker.pretrained()
  .setInputCols("document")
  .setOutputCol("reranked_documents")
  .setQuery("A man is eating pasta.")
```

The default model is `"bge-reranker-v2-m3-Q4_K_M"`, if no name is provided.

For available pretrained models, please see the [Models Hub](https://sparknlp.org/models).

For extended examples of usage, see the
[AutoGGUFRerankerTest](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/test/scala/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFRerankerTest.scala)
and the
[example notebook](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/llama.cpp/llama.cpp_in_Spark_NLP_AutoGGUFReranker.ipynb).

**Note**: This annotator is designed for reranking tasks and requires setting a query using `setQuery`.
The query represents the search intent against which documents will be ranked. Each input
document receives a relevance score in the output metadata.

To use GPU inference with this annotator, make sure to use the Spark NLP GPU package and set
the number of GPU layers with the `setNGpuLayers` method.

When using larger models, we recommend adjusting GPU usage with `setNCtx` and `setNGpuLayers`
according to your hardware to avoid out-of-memory errors.
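
For example, a GPU-oriented configuration might look like the following sketch; the context
size of 4096 and the 99 GPU layers are illustrative values only and should be tuned to your
model and hardware:

```scala
// A rough GPU-oriented configuration (illustrative values, not recommendations):
// setNCtx caps the context window llama.cpp allocates memory for, and
// setNGpuLayers controls how many layers are offloaded to the GPU.
val gpuReranker = AutoGGUFReranker.pretrained()
  .setInputCols("document")
  .setOutputCol("reranked_documents")
  .setQuery("A man is eating pasta.")
  .setNCtx(4096)
  .setNGpuLayers(99)
```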
{%- endcapture -%}

{%- capture input_anno -%}
DOCUMENT
{%- endcapture -%}

{%- capture output_anno -%}
DOCUMENT
{%- endcapture -%}

{%- capture python_example -%}
>>> import sparknlp
>>> from sparknlp.base import *
>>> from sparknlp.annotator import *
>>> from pyspark.ml import Pipeline
>>> document = DocumentAssembler() \
...     .setInputCol("text") \
...     .setOutputCol("document")
>>> reranker = AutoGGUFReranker.pretrained() \
...     .setInputCols(["document"]) \
...     .setOutputCol("reranked_documents") \
...     .setBatchSize(4) \
...     .setQuery("A man is eating pasta.") \
...     .setNGpuLayers(99)
>>> pipeline = Pipeline().setStages([document, reranker])
>>> data = spark.createDataFrame([
...     ["A man is eating food."],
...     ["A man is eating a piece of bread."],
...     ["The girl is carrying a baby."],
...     ["A man is riding a horse."]
... ]).toDF("text")
>>> result = pipeline.fit(data).transform(data)
>>> result.select("reranked_documents").show(truncate = False)
+-------------------------------------------------------------------------------------------+
|reranked_documents                                                                          |
+-------------------------------------------------------------------------------------------+
|[{document, 0, 20, A man is eating food., {query -> A man is eating pasta., relevance_...}]|
|[{document, 0, 31, A man is eating a piece of bread., {query -> A man is eating pasta.,...}]|
|[{document, 0, 27, The girl is carrying a baby., {query -> A man is eating pasta., rel...}]|
|[{document, 0, 22, A man is riding a horse., {query -> A man is eating pasta., relevan...}]|
+-------------------------------------------------------------------------------------------+
{%- endcapture -%}

{%- capture scala_example -%}
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val document = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val reranker = AutoGGUFReranker
  .pretrained("bge-reranker-v2-m3-Q4_K_M")
  .setInputCols("document")
  .setOutputCol("reranked_documents")
  .setBatchSize(4)
  .setQuery("A man is eating pasta.")
  .setNGpuLayers(99)

val pipeline = new Pipeline().setStages(Array(document, reranker))

val data = Seq(
  "A man is eating food.",
  "A man is eating a piece of bread.",
  "The girl is carrying a baby.",
  "A man is riding a horse."
).toDF("text")
val result = pipeline.fit(data).transform(data)
result.select("reranked_documents").show(truncate = false)
+-------------------------------------------------------------------------------------------+
|reranked_documents                                                                          |
+-------------------------------------------------------------------------------------------+
|[{document, 0, 20, A man is eating food., {query -> A man is eating pasta., relevance_...}]|
|[{document, 0, 31, A man is eating a piece of bread., {query -> A man is eating pasta.,...}]|
|[{document, 0, 27, The girl is carrying a baby., {query -> A man is eating pasta., rel...}]|
|[{document, 0, 22, A man is riding a horse., {query -> A man is eating pasta., relevan...}]|
+-------------------------------------------------------------------------------------------+

{%- endcapture -%}

{%- capture api_link -%}
[AutoGGUFReranker](/api/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFReranker)
{%- endcapture -%}

{%- capture python_api_link -%}
[AutoGGUFReranker](/api/python/reference/autosummary/sparknlp/annotator/seq2seq/auto_gguf_reranker/index.html)
{%- endcapture -%}

{%- capture source_link -%}
[AutoGGUFReranker](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFReranker.scala)
{%- endcapture -%}

{% include templates/anno_template.md
title=title
description=description
input_anno=input_anno
output_anno=output_anno
python_example=python_example
scala_example=scala_example
api_link=api_link
python_api_link=python_api_link
source_link=source_link
%}
````

docs/en/annotators.md

Lines changed: 1 addition & 0 deletions
@@ -47,6 +47,7 @@ There are two types of Annotators:
 |---|---|---|
 {% include templates/anno_table_entry.md path="" name="AutoGGUFEmbeddings" summary="Annotator that uses the llama.cpp library to generate text embeddings with large language models."%}
 {% include templates/anno_table_entry.md path="" name="AutoGGUFModel" summary="Annotator that uses the llama.cpp library to generate text completions with large language models."%}
+{% include templates/anno_table_entry.md path="" name="AutoGGUFReranker" summary="Annotator that uses the llama.cpp library to rerank text documents based on their relevance to a given query using GGUF-format reranking models."%}
 {% include templates/anno_table_entry.md path="" name="AutoGGUFVisionModel" summary="Multimodal annotator that uses the llama.cpp library to generate text completions with large language models."%}
 {% include templates/anno_table_entry.md path="" name="BGEEmbeddings" summary="Sentence embeddings using BGE."%}
 {% include templates/anno_table_entry.md path="" name="BigTextMatcher" summary="Annotator to match exact phrases (by token) provided in a file against a Document."%}
