How to fine-tune or prompt engineer these available models? #138534
Replies: 7 comments
-
You can pass the context as part of a user message, e.g. {"role": "user", "content": "Answer the following question {query} based on the given {context}"}.
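For instance, here is a minimal sketch of that pattern using the openai Python client; the base URL, token, model name, and retrieved_context value are placeholders you would swap for your own setup:

```python
from openai import OpenAI

# Placeholder endpoint and credentials; point these at whichever
# OpenAI-compatible inference endpoint you are actually using.
client = OpenAI(base_url="https://your-inference-endpoint/v1", api_key="YOUR_TOKEN")

query = "What does the refund policy say about digital purchases?"
retrieved_context = "Refunds for digital purchases are available within 14 days."  # from your retriever

response = client.chat.completions.create(
    model="your-model-name",  # placeholder
    messages=[
        {
            "role": "user",
            "content": f"Answer the following question {query} based on the given context:\n{retrieved_context}",
        }
    ],
)
print(response.choices[0].message.content)
```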
-
To fine-tune or prompt engineer models, look into two approaches: fine-tuning (adapting the model weights on your own data) and prompt engineering (shaping the input to get better outputs without retraining). Check the model's documentation and community resources for more details. Good luck!
-
To implement RAG with the models you have, check out the model documentation and community resources for more detailed steps. Good luck!
-
Hey @FaheemOnHub! Thanks for the question! RAG support is coming soon. I'd love to chat more with you about what you're building and how both RAG and fine-tuning tie into it. If you're open to a 30-minute Zoom call, you can book me here! We have a ton of work planned for expanding GitHub Models, and I really want to learn more about how we can make it great for users like you.
-
Great question! To implement RAG, you'll need to combine retrieval (for example, using a vector database) with generation. Fine-tuning isn't always needed; prompt engineering with a proper context window and retrieved documents can go a long way.
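As a rough illustration of the retrieval half, here is a toy sketch; the embed() function and the in-memory document list are stand-ins, and a real setup would call an embedding model and a vector database:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for an embedding model; in practice call a real embedding API."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(64)

# Tiny in-memory "vector store"
documents = [
    "Refunds for digital purchases are available within 14 days.",
    "Shipping takes 3-5 business days for physical goods.",
]
doc_vectors = np.array([embed(d) for d in documents])

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    # Cosine similarity between the query vector and every stored document vector
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

question = "What is the refund policy?"
context = "\n".join(retrieve(question))
# The assembled prompt is then sent to the model as the user message.
prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {question}"
```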
-
Yes, you can fine-tune these models fairly easily with the LoRA technique: quantize the model from a higher-precision format to a lower-memory one, adjust hyperparameters such as the learning rate and weight decay, provide your own custom dataset, and train; the result is your fine-tuned model. For prompting, instead of asking directly "What is Python?", you can frame it as "You are a Python expert, guide me through and solve my doubt", which tends to produce better output. Similarly, you can build a RAG pipeline with your custom dataset and model, adding a system-style prompt such as "You are a helpful guide" and passing the retrieved material in a user message like {"role": "user", "content": "Answer the following question {query} based on the given {context}"}, framed properly. A sketch of the LoRA setup follows.
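Here is a minimal LoRA sketch using the Hugging Face transformers, peft, and datasets libraries; the base model name, the my_custom_data.jsonl file, and the hyperparameters are all placeholders, and it assumes you are training a local open-weights model on your own GPU rather than a hosted endpoint:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-3.2-1B"  # placeholder: any open-weights causal LM
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA: train small low-rank adapter matrices instead of the full model weights
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

# Placeholder dataset: a JSONL file with one {"text": ...} record per example
dataset = load_dataset("json", data_files="my_custom_data.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", learning_rate=2e-4,
                           num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the small adapter weights
```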
-
RAG (Retrieval-Augmented Generation) is a technique to make LLMs smarter and more factual. Here's how it works:
1. Retrieve information: the system fetches relevant data from an external knowledge base (your up-to-date information).
2. Combine and generate: the retrieved information is combined with your query, and the LLM uses this combined context to generate an accurate and relevant answer.
You can implement RAG using:
1. Prompt engineering: structure the LLM's prompt to include the retrieved context, without needing to retrain the model. This is the most common and efficient method.
2. Fine-tuning: retrain parts of the pipeline (e.g. the embedding model used for retrieval, or the LLM itself) to better handle your specific context retrieval and usage. This is more complex but can improve performance in niche domains; a sketch of the embedding-model route is below.
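As a rough sketch of that second option, the embedding model used for retrieval can be adapted with the sentence-transformers library; the base model name, the query/passage pairs, and the hyperparameters below are illustrative only:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Start from a small general-purpose embedding model
model = SentenceTransformer("all-MiniLM-L6-v2")

# Pairs of (query, relevant passage) drawn from your own domain data
train_examples = [
    InputExample(texts=["How do I reset my password?",
                        "Passwords can be reset from the account settings page."]),
    InputExample(texts=["What is the refund window?",
                        "Refunds are accepted within 14 days of purchase."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Contrastive loss that pulls each query toward its matching passage
train_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
model.save("my-finetuned-embedder")
```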
-
Having access to these models is great, but how do we implement RAG on them?