Amitverma0509

Describe your change:

This PR refactors build_directory_md.py to follow PEP 8, add type hints, and improve readability, documentation, and maintainability. The core functionality, generating a Markdown table of contents from Python source files, is unchanged.

Explanation of code:
#!/usr/bin/env python3
"""
A utility script to generate a Markdown-formatted table of contents
for Python-related files (.py, .ipynb) in a specified directory structure.
It excludes common project utility folders and hidden directories.
"""
import os
from typing import Iterator, Set

# Define directories to exclude during os.walk traversal
EXCLUDED_DIRS: Set[str] = {"scripts", "venv", "__pycache__"}

def good_file_paths(top_dir: str = ".") -> Iterator[str]:
    """
    Recursively walks the directory structure, yielding file paths for
    Python files (.py, .ipynb), while skipping excluded directories and
    __init__.py files.

    Directory names are filtered in-place (dir_names[:]) to prevent
    traversal into excluded directories.

    Args:
        top_dir: The starting directory for the walk. Defaults to the
            current directory.

    Yields:
        Relative file path strings (e.g., 'src/module/file.py').
    """
    for dir_path, dir_names, filenames in os.walk(top_dir):
        # 1. Prune directory names (in-place modification of dir_names).
        # Excludes hidden directories (starting with '.'), directories
        # starting with '_', and the names listed in EXCLUDED_DIRS.
        dir_names[:] = [
            d
            for d in dir_names
            if d[0] not in "._" and d not in EXCLUDED_DIRS
        ]

        # 2. Iterate over files and yield good paths
        for filename in filenames:
            # Skip module initialization files
            if filename == "__init__.py":
                continue

            # Check for valid extensions
            _, ext = os.path.splitext(filename)
            if ext in (".py", ".ipynb"):
                # Drop a leading './' without touching the rest of the path.
                # normpath is used instead of lstrip("./"), which strips any
                # leading '.' or '/' characters rather than the './' prefix.
                full_path = os.path.join(dir_path, filename)
                yield os.path.normpath(full_path)

def _generate_markdown_prefix(indent_level: int) -> str:
    """
    Generates the appropriate Markdown prefix for directory levels or file entries.

    Level 0 (no indent) generates a top-level header.
    Level > 0 generates a bullet point indented by 2 spaces per level.

    Args:
        indent_level: The depth of the current item (0 for root, 1 for first level).

    Returns:
        The markdown prefix string.
    """
    if indent_level == 0:
        # Markdown H2 header for root-level entries
        return "\n##"
    # Indent with 2 spaces per level for nested lists
    # (e.g., 2 spaces for level 1, 4 for level 2)
    return f"{' ' * (indent_level * 2)}*"

def _format_name(name: str) -> str:
    """
    Converts a file/directory name (e.g., 'my_awesome_file') into a Title Case
    string, replacing underscores with spaces (e.g., 'My Awesome File').
    """
    return name.replace("_", " ").title()

def print_directory_md(top_dir: str = ".") -> None:
    """
    Generates and prints a Markdown table of contents for the file structure
    starting at top_dir.

    Args:
        top_dir: The root directory to start the scan.
    """
    # Stores the path of the last directory printed to determine new structure levels
    last_printed_path = ""

    # Get and sort all file paths to ensure consistent output order
    sorted_file_paths = sorted(good_file_paths(top_dir))

    for filepath in sorted_file_paths:
        # Separate the directory path from the filename
        current_dir_path, filename = os.path.split(filepath)

        # --- Directory Structure Printing ---
        if current_dir_path != last_printed_path:
            # Path segments of the previous and current directory
            old_parts = last_printed_path.split(os.sep)
            new_parts = current_dir_path.split(os.sep)

            # Determine where the new path structure begins
            # 'i' tracks the indent level and the common prefix length
            i = 0
            while (
                i < len(old_parts)
                and i < len(new_parts)
                and old_parts[i] == new_parts[i]
            ):
                i += 1

            # Print the new directory segments
            for indent, new_part in enumerate(new_parts[i:], start=i):
                if new_part:  # Ensure we don't print empty segments
                    prefix = _generate_markdown_prefix(indent)
                    print(f"{prefix} {_format_name(new_part)}")

            # Update the last printed path for the next comparison
            last_printed_path = current_dir_path

        # --- File Entry Printing ---
        # The indent for the file is one level deeper than its parent directory.
        indent = (current_dir_path.count(os.sep) + 1) if current_dir_path else 0

        # Create the URL-encoded path for the Markdown link
        url = filepath.replace(" ", "%20")

        # Format the filename for display (excluding extension)
        display_name = os.path.splitext(_format_name(filename))[0]

        prefix = _generate_markdown_prefix(indent)
        print(f"{prefix} [{display_name}]({url})")

if __name__ == "__main__":
    print_directory_md()
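
A note on the path cleanup step in good_file_paths: `str.lstrip` treats its argument as a set of characters rather than a literal prefix, so stripping "./" can also remove leading dots that belong to a name. The sketch below compares it with `normpath`; the sample paths and both helper names are made up for illustration, and `posixpath` is used only so the result does not depend on the platform.

```python
import posixpath


def strip_with_lstrip(path: str) -> str:
    """Hypothetical helper: remove ANY leading '.' or '/' characters."""
    return path.lstrip("./")


def strip_with_normpath(path: str) -> str:
    """Hypothetical helper: collapse only the redundant './' prefix."""
    return posixpath.normpath(path)


# Made-up sample paths in the shape os.walk(".") produces on POSIX.
samples = ["./sorts/quick_sort.py", "./.hidden_notes.py", "./..data/load.py"]

for path in samples:
    print(f"{path!r}: lstrip -> {strip_with_lstrip(path)!r}, "
          f"normpath -> {strip_with_normpath(path)!r}")
```

For ordinary paths both approaches agree. They differ only when a file or directory name itself starts with a dot, where `lstrip` silently drops that dot.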

### Reviewer Notes

Please verify that the output structure remains consistent with the expected hierarchical Markdown list and that file exclusion logic is correct.
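
One possible way to check the exclusion logic is to run the same pruning rules against a throwaway directory tree. This is a minimal sketch, not part of the PR: `collect_paths` is a hypothetical stand-in that mirrors the filtering described for good_file_paths, `build_sample_tree` and the file names are invented, and the excluded set is assumed to be "scripts", "venv", and "__pycache__".

```python
import os
import tempfile

# Assumed to match the script's excluded directory names.
EXCLUDED = {"scripts", "venv", "__pycache__"}


def collect_paths(top_dir):
    """Hypothetical stand-in mirroring the pruning rules of good_file_paths."""
    found = []
    for dir_path, dir_names, filenames in os.walk(top_dir):
        # Slice assignment mutates the list os.walk holds, so pruned
        # directories are never visited.
        dir_names[:] = [
            d for d in dir_names if d[0] not in "._" and d not in EXCLUDED
        ]
        for filename in filenames:
            if filename == "__init__.py":
                continue
            if os.path.splitext(filename)[1] in (".py", ".ipynb"):
                rel = os.path.relpath(os.path.join(dir_path, filename), top_dir)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def build_sample_tree(root):
    """Create a small, made-up tree of empty files under root."""
    files = [
        "sorts/quick_sort.py",
        "sorts/__init__.py",
        "scripts/helper.py",
        ".git/hook.py",
        "venv/lib.py",
        "_private/x.py",
        "notebooks/demo.ipynb",
        "README.md",
    ]
    for rel in files:
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


with tempfile.TemporaryDirectory() as root:
    build_sample_tree(root)
    print(collect_paths(root))
```

With this sample tree only `notebooks/demo.ipynb` and `sorts/quick_sort.py` should survive the filtering. Everything under `scripts`, `venv`, `.git`, and `_private` is pruned, and `__init__.py` and `README.md` are skipped.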

@algorithms-keeper

Closing this pull request as invalid

@Amitverma0509, this pull request is being closed as none of the checkboxes have been marked. It is important that you go through the checklist and mark the ones relevant to this pull request. Please read the Contributing guidelines.

If you're having trouble marking a checkbox, please read the following instructions:

  • Read the points one at a time and decide whether each is relevant to the pull request.
  • If it is, mark it by putting an x between the square brackets, like so: [x]

NOTE: Only [x] is supported, so if you have put any other letter or symbol between the brackets, it will be marked as invalid. If that is the case, please open a new pull request with the appropriate changes.

@algorithms-keeper algorithms-keeper bot closed this Oct 5, 2025
@algorithms-keeper algorithms-keeper bot added the "awaiting reviews" label (This PR is ready to be reviewed) Oct 5, 2025
@Amitverma0509 Amitverma0509 deleted the patch-1 branch October 5, 2025 21:20
Labels
awaiting reviews (This PR is ready to be reviewed), invalid