@CaralHsi CaralHsi commented Jul 24, 2025

Description

Summary: This PR enhances the Xinyu-based internet retrieval module with improved chunking, deduplication, and integration logic. It also simplifies the retrieval flow in "fine" mode and improves configurability.

  1. Added Sentence-Level Chunking + Embedding
    • Integrates SentenceChunker to split result content into semantically meaningful sentence chunks.
    • Each chunk is independently embedded and returned as a TextualMemoryItem.
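
A minimal sketch of the chunk-then-embed flow described in item 1, for reviewers who want the shape without reading the diff. It does not use the real SentenceChunker or embedder: `split_sentences`, `toy_embed`, and `MemoryItemSketch` are hypothetical stand-ins, and the fields shown are assumptions rather than the actual TextualMemoryItem schema.

```python
from __future__ import annotations

import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class MemoryItemSketch:
    """Hypothetical stand-in for TextualMemoryItem (fields are assumptions)."""
    memory: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)


def split_sentences(text: str) -> list[str]:
    """Naive splitter standing in for SentenceChunker (assumption)."""
    # Split on whitespace that follows sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_embed(text: str, dim: int = 4) -> list[float]:
    """Deterministic placeholder embedding; a real embedder would go here."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def chunk_and_embed(content: str) -> list[MemoryItemSketch]:
    """Turn one result's content into one item per sentence chunk."""
    return [
        MemoryItemSketch(memory=chunk, embedding=toy_embed(chunk))
        for chunk in split_sentences(content)
    ]


items = chunk_and_embed("First sentence. Second one! Third?")
```

Each chunk is embedded on its own, so a long page yields several independently retrievable items instead of one large one.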

  2. Multi-chunk Result Processing
    • Each search result may now generate multiple memory items (based on content chunking).
    • The _process_result method handles one result at a time; results are processed in parallel via ThreadPoolExecutor.
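
The fan-out in item 2 can be sketched roughly as follows. This is an illustration only, not the code in this PR: `process_result` is a hypothetical worker, and the `url` and `content` keys on a search result are assumptions.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def process_result(result: dict) -> list[dict]:
    """Hypothetical per-result worker: one search result -> several items."""
    sentences = re.split(r"(?<=[.!?])\s+", result["content"].strip())
    return [{"memory": s, "source": result["url"]} for s in sentences if s]


def process_all(results: list, max_workers: int = 4) -> list:
    """Process results in parallel and flatten the per-result lists."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields outputs in the same order as the inputs.
        per_result = pool.map(process_result, results)
        return [item for items in per_result for item in items]


results = [
    {"url": "a", "content": "One. Two."},
    {"url": "b", "content": "Three."},
]
flat_items = process_all(results)
```

Threads suit this step when the per-result work is dominated by I/O such as embedding or network calls; the flattening step is what lets one result contribute multiple memory items.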

  3. Memory-Level Deduplication
    • Duplicates are removed based on the memory content field to ensure uniqueness.
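
For reference, deduplication keyed on the memory content could look like the sketch below. It assumes exact string matching and keeps the first occurrence; whether the actual implementation normalizes text or includes other fields in the key is not shown here. `dedupe_by_memory` is a hypothetical helper name.

```python
def dedupe_by_memory(items: list) -> list:
    """Keep the first item for each distinct memory string, preserving order."""
    seen = set()
    unique = []
    for item in items:
        key = item["memory"]  # assumption: exact-match key on the content
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


raw_items = [
    {"memory": "Bing is a search engine.", "source": "a"},
    {"memory": "Bing is a search engine.", "source": "b"},
    {"memory": "Chunks are embedded separately.", "source": "b"},
]
unique_items = dedupe_by_memory(raw_items)
```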

  4. Introduced OuterMemory Type
    • All retrieved internet items are labeled with memory_type="OuterMemory" for better downstream control and separation.
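
One way downstream code might use the label is to separate internet-sourced items from the rest before ranking or storage. The sketch below is illustrative: the `metadata["memory_type"]` layout and the "LongTermMemory" value are assumptions, and `group_by_memory_type` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_memory_type(items: list) -> dict:
    """Bucket items by their memory_type label."""
    groups = defaultdict(list)
    for item in items:
        groups[item["metadata"]["memory_type"]].append(item)
    return dict(groups)


labeled = [
    {"memory": "from the web", "metadata": {"memory_type": "OuterMemory"}},
    {"memory": "from the store", "metadata": {"memory_type": "LongTermMemory"}},
    {"memory": "also from the web", "metadata": {"memory_type": "OuterMemory"}},
]
groups = group_by_memory_type(labeled)
```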

  5. Extended Config Support
    • XinyuSearchConfig now accepts a chunker config (via ChunkerConfigFactory).
    • The default search engine is now Bing; Baidu is disabled.

  6. Enhanced Example Script
    • tree_textual_memory.py updated with a usage example for mode="fine" and internet-based retrieval.

  7. Removed Legacy Reasoning Step
    • Removed the reasoner.reason() call from "fine" mode to simplify the logic; the mode is now pure retrieval + ranking.

Fix: #153

Docs Issue/PR: (docs-issue-or-pr-link)

Reviewer: @fridayL

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have created related documentation issue/PR in MemOS-Docs (if applicable)
  • I have linked the issue to this PR (if applicable)
  • I have mentioned the person who will review this PR

@CaralHsi CaralHsi requested a review from fridayL July 24, 2025 10:36
@CaralHsi CaralHsi marked this pull request as ready for review July 24, 2025 10:46
@fridayL fridayL merged commit 5e60d89 into MemTensor:dev Jul 24, 2025
20 checks passed
tangg555 pushed a commit to tangg555/MemOS that referenced this pull request Jul 29, 2025
* feat: modify internet search

* feat: modify internet search

* feat: modify internet search

* feat: modify internet search

* fix: unittest for tree_searcher

* feat: add source to memory key