Open
Description
- Would training LLMs with Dr. GRPO (which is meant to guard against both overthinking and more frequent wrong answers) enable them to handle premises and context better? https://github.com/sail-sg/understand-r1-zero
- Should the LLM fill in the missing premise with a variable and turn the answer to the "broken" question into a function of that variable? That way, it would not only check for dead ends but also show how it answers "it depends" questions.
- What about logic puzzles and other linguistic problems? ZebraLogic, for example, seems like a good candidate for this: https://github.com/WildEval/ZeroEval
- How do you treat false or inaccurate information? https://github.com/dannyallover/overthinking_the_truth
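
To make the "variable and function" idea from the second question concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical and only for illustration: the `ConditionalAnswer` class, the `travel_time_hours` helper, and the example question are my own assumptions, not something taken from the linked repositories.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConditionalAnswer:
    """Answer to a question whose premise is missing a value (hypothetical type)."""
    missing_premise: str                          # name of the unknown quantity
    resolve: Callable[[float], Optional[float]]   # answer as a function of the unknown


def travel_time_hours(distance_km: float) -> ConditionalAnswer:
    # Hypothetical "broken" question: "How long does a 120 km trip take?"
    # The speed is never stated, so instead of guessing a value we return
    # the answer as a function of the missing premise.
    def resolve(speed_kmh: float) -> Optional[float]:
        if speed_kmh <= 0:
            return None  # dead end: this premise admits no valid answer
        return distance_km / speed_kmh

    return ConditionalAnswer(missing_premise="speed_kmh", resolve=resolve)


answer = travel_time_hours(120.0)
print(answer.missing_premise)   # speed_kmh
print(answer.resolve(60.0))     # 2.0
print(answer.resolve(0.0))      # None
```

The point of the design is that the dead-end check (`None`) and the "it depends" answer (the function itself) come out of the same object, so the model would not have to choose between refusing and guessing.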