How to fine-tune or prompt engineer these available models? #138534
Replies: 7 comments
-
You can pass the context as part of a user message, e.g. {"role": "user", "content": "Answer the following question {query} based on the given {context}"}.
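For instance, here is a minimal sketch of that pattern using the openai Python client; the base URL, token, model name, and retrieved_context value are placeholders you would swap for your own setup:

```python
from openai import OpenAI

# Placeholder endpoint and credentials; point these at whichever
# OpenAI-compatible inference endpoint you are actually using.
client = OpenAI(base_url="https://your-inference-endpoint/v1", api_key="YOUR_TOKEN")

query = "What does the refund policy say about digital purchases?"
retrieved_context = "Refunds for digital purchases are available within 14 days."  # from your retriever

response = client.chat.completions.create(
    model="your-model-name",  # placeholder
    messages=[
        {
            "role": "user",
            "content": f"Answer the following question {query} based on the given context:\n{retrieved_context}",
        }
    ],
)
print(response.choices[0].message.content)
```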
-
To fine-tune or prompt engineer models, look into two approaches: fine-tuning (adapting the model weights on your own data) and prompt engineering (shaping the input to get better outputs without retraining). Check the model's documentation and community resources for more details. Good luck!
-
To implement RAG with the models you have, check out the model documentation and community resources for more detailed steps. Good luck!
-
Hey @FaheemOnHub! Thanks for the question! RAG support is coming soon. I'd love to chat more with you about what you're building and how both RAG and fine-tuning tie into it. If you're open to a 30-minute Zoom call, you can book me here! We have a ton of work planned for expanding GitHub Models, and I really want to learn more about how we can make it great for users like you.
-
Great question! To implement RAG, you'll need to combine retrieval (for example, using a vector database) with generation. Fine-tuning isn't always needed; prompt engineering with a proper context window and retrieved documents can go a long way.
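As a rough illustration of the retrieval half, here is a toy sketch; the embed() function and the in-memory document list are stand-ins, and a real setup would call an embedding model and a vector database:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for an embedding model; in practice call a real embedding API."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(64)

# Tiny in-memory "vector store"
documents = [
    "Refunds for digital purchases are available within 14 days.",
    "Shipping takes 3-5 business days for physical goods.",
]
doc_vectors = np.array([embed(d) for d in documents])

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    # Cosine similarity between the query vector and every stored document vector
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

question = "What is the refund policy?"
context = "\n".join(retrieve(question))
# The assembled prompt is then sent to the model as the user message.
prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {question}"
```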
-
Yes, you can fine-tune these models fairly easily with the LoRA technique: quantize the model from a higher-precision format to a lower-memory one, adjust hyperparameters such as the learning rate and weight decay, provide your own custom dataset, and train; the result is your fine-tuned model. For prompting, instead of asking directly "What is Python?", you can frame it as "You are a Python expert, guide me through and solve my doubt", which tends to produce better output. Similarly, you can build a RAG pipeline with your custom dataset and model, adding a system-style prompt such as "You are a helpful guide" and passing the retrieved material in a user message like {"role": "user", "content": "Answer the following question {query} based on the given {context}"}, framed properly. A sketch of the LoRA setup follows.
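Here is a minimal LoRA sketch using the Hugging Face transformers, peft, and datasets libraries; the base model name, the my_custom_data.jsonl file, and the hyperparameters are all placeholders, and it assumes you are training a local open-weights model on your own GPU rather than a hosted endpoint:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-3.2-1B"  # placeholder: any open-weights causal LM
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA: train small low-rank adapter matrices instead of the full model weights
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

# Placeholder dataset: a JSONL file with one {"text": ...} record per example
dataset = load_dataset("json", data_files="my_custom_data.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", learning_rate=2e-4,
                           num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the small adapter weights
```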
-
RAG (Retrieval-Augmented Generation) is a technique to make LLMs smarter and more factual. Here's how it works:
1. Retrieve information: the system fetches relevant data from an external knowledge base (your up-to-date information).
2. Combine and generate: the retrieved information is combined with your query, and the LLM uses this combined context to generate an accurate and relevant answer.
You can implement RAG using:
1. Prompt engineering: structure the LLM's prompt to include the retrieved context, without needing to retrain the model. This is the most common and efficient method.
2. Fine-tuning: retrain parts of the pipeline (e.g. the embedding model used for retrieval, or the LLM itself) to better handle your specific context retrieval and usage. This is more complex but can improve performance in niche domains; a sketch of the embedding-model route is below.
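As a rough sketch of that second option, the embedding model used for retrieval can be adapted with the sentence-transformers library; the base model name, the query/passage pairs, and the hyperparameters below are illustrative only:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Start from a small general-purpose embedding model
model = SentenceTransformer("all-MiniLM-L6-v2")

# Pairs of (query, relevant passage) drawn from your own domain data
train_examples = [
    InputExample(texts=["How do I reset my password?",
                        "Passwords can be reset from the account settings page."]),
    InputExample(texts=["What is the refund window?",
                        "Refunds are accepted within 14 days of purchase."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Contrastive loss that pulls each query toward its matching passage
train_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
model.save("my-finetuned-embedder")
```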
-
Having access to these models is great, but how do we implement RAG on them?