-
Notifications
You must be signed in to change notification settings - Fork 234
refactor: improve architecture and configurations for memory scheduler #142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ors and web logs, and refactor the code structure
…mples; some bugs due to refactoring have been fixed
It is recommended to improve the PR description, especially the checklist. The current PR modifications may involve updates to the MemOS-Docs documentation, so it is advised to add links to related issues/PRs in the PR description. A well-filled example of a PR title & description can be found in PR #120. 建议完善PR描述,尤其是其中的检查清单。当前的PR修改可能涉及MemOS-Docs文档的更新,建议在PR描述中添加相关issue/PR的链接。一个填写完善的PR title&description示例如 PR #120. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the memory scheduler with improved architecture and configurations. The refactoring reorganizes scheduler modules into clear schemas (general, message, monitor), enhances configuration parameters, and improves memory monitoring with better scoring and retention logic.
Key changes include:
- Split large schema file into specialized modules for better maintainability
- Enhanced memory monitoring with keyword-based scoring and importance ranking
- Improved filter utilities with better similarity detection and length validation
- Updated examples to demonstrate new schema organization and web logging capabilities
Reviewed Changes
Copilot reviewed 35 out of 37 changed files in this pull request and generated 8 comments.
Show a summary per file
File | Description |
---|---|
src/memos/mem_scheduler/schemas/ | New schema organization split into general, message, and monitor modules |
src/memos/mem_scheduler/utils/ | Refactored utilities split into filter and misc utilities |
src/memos/mem_scheduler/modules/ | Updated modules to use new schema imports and enhanced functionality |
tests/mem_scheduler/ | Updated test imports and removed obsolete test methods |
examples/ | Updated examples to use new schemas and demonstrate web logs |
|
||
def parse_yaml(yaml_file): | ||
yaml_path = Path(yaml_file) | ||
yaml_path = Path(yaml_file) |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate line detected - yaml_path = Path(yaml_file)
is assigned twice consecutively (lines 18-19). Remove the duplicate assignment.
yaml_path = Path(yaml_file) |
Copilot uses AI. Check for mistakes.
def search( | ||
self, query: str, mem_cube: GeneralMemCube, top_k: int, method=TreeTextMemory_SEARCH_METHOD | ||
): | ||
) -> list[TextualMemoryItem]: |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type annotation is inconsistent with the actual return statement. The method can return both list[TextualMemoryItem]
and an empty list []
in the exception case, but the type annotation only specifies the success case.
Copilot uses AI. Check for mistakes.
logger.error("memory_base is not supported") | ||
|
||
return memories_with_new_order | ||
return memories_with_new_order, success_flag |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type annotation specifies list[TextualMemoryItem] | None
but the method always returns a tuple (list[TextualMemoryItem], bool)
. Update the return type annotation to match the actual return value.
Copilot uses AI. Check for mistakes.
def generate_mapping_key(cls, v, values): # noqa: N805 | ||
if v is None and "memory_text" in values: | ||
return transform_name_to_key(values["memory_text"]) |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field validator accesses values
dictionary without proper validation. Use info.data
instead of values
for Pydantic v2 compatibility and add proper error handling for missing keys.
def generate_mapping_key(cls, v, values): # noqa: N805 | |
if v is None and "memory_text" in values: | |
return transform_name_to_key(values["memory_text"]) | |
def generate_mapping_key(cls, v, info): # noqa: N805 | |
memory_text = info.data.get("memory_text") | |
if v is None and memory_text is not None: | |
return transform_name_to_key(memory_text) |
Copilot uses AI. Check for mistakes.
f"Removed: {len(memories_to_remove)} (excluding top {partial_retention_number} by recording_count)" | ||
) | ||
|
||
return self.memories |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method update_memories
has inconsistent return type. The return type annotation specifies MemoryMonitorItem
but the method returns self.memories
which is list[MemoryMonitorItem]
. Either update the return type annotation or change the return statement.
Copilot uses AI. Check for mistakes.
data_copy = data.copy() # Avoid modifying original dictionary | ||
if "timestamp" in data_copy and isinstance(data_copy["timestamp"], str): | ||
try: | ||
data_copy["timestamp"] = datetime.fromisoformat(data_copy["timestamp"]) | ||
except ValueError: | ||
# Handle invalid time formats - adjust as needed (e.g., log warning or set to None) | ||
data_copy["timestamp"] = None | ||
|
||
return cls(**data_copy) |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Creating a copy of the entire dictionary for every conversion may be inefficient for large dictionaries. Consider modifying only the timestamp field in-place or using a more targeted approach.
data_copy = data.copy() # Avoid modifying original dictionary | |
if "timestamp" in data_copy and isinstance(data_copy["timestamp"], str): | |
try: | |
data_copy["timestamp"] = datetime.fromisoformat(data_copy["timestamp"]) | |
except ValueError: | |
# Handle invalid time formats - adjust as needed (e.g., log warning or set to None) | |
data_copy["timestamp"] = None | |
return cls(**data_copy) | |
if "timestamp" in data and isinstance(data["timestamp"], str): | |
try: | |
data["timestamp"] = datetime.fromisoformat(data["timestamp"]) | |
except ValueError: | |
# Handle invalid time formats - adjust as needed (e.g., log warning or set to None) | |
data["timestamp"] = None | |
return cls(**data) |
Copilot uses AI. Check for mistakes.
print(f"Received message: {body.decode()}\n") | ||
self.rabbitmq_message_cache.put({"properties": properties, "body": body}) | ||
print(f"message delivery_tag: {method.delivery_tag}\n") |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using print()
statements for logging in production code is not recommended. Use the logger instance that's already available in the class for consistent logging behavior.
print(f"Received message: {body.decode()}\n") | |
self.rabbitmq_message_cache.put({"properties": properties, "body": body}) | |
print(f"message delivery_tag: {method.delivery_tag}\n") | |
logger.debug(f"Received message: {body.decode()}") | |
self.rabbitmq_message_cache.put({"properties": properties, "body": body}) | |
logger.debug(f"Message delivery_tag: {method.delivery_tag}") |
Copilot uses AI. Check for mistakes.
|
||
if (not intent_result["trigger_retrieval"]) and (not time_trigger_flag): | ||
logger.info(f"Query schedule not triggered. Intent_result: {intent_result}") | ||
return |
Copilot
AI
Jul 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method process_session_turn
has inconsistent return behavior. It can return None
(line 222) or a tuple (line 251), but the return type annotation specifies only the tuple. Update the return type to include None
or ensure consistent return values.
Copilot uses AI. Check for mistakes.
Looks good overall. Might want to incorporate Copilot's suggested changes @tangg555. Also, I suggest having @fridayL and @J1awei-Yang review this as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many changes, they're all closely related to the demo, need merge and deploy online .
MemTensor#142) * fix bugs: modify mos_for_test_scheduler.py and fix bugs of scheduler dispatch * feat & fix bugs & refactor: mem scheudler add more functions of monitors and web logs, and refactor the code structure * new feat & fix bugs: mem_scheduler update the logging feature and examples; some bugs due to refactoring have been fixed * fix ruff problem * fix UTC problem
Description
Summary: Refactor memory scheduler with improved architecture and configurations
Fix: #5
Docs Issue/PR: (docs-issue-or-pr-link)
Reviewer: @fridayL @Ki-Seki @CaralHsi
Checklist: