Same distribution can appear multiple times on Python 3.8 #91

@jaraco

Description

In GitLab by @jaraco on Sep 30, 2019, 03:09

On Python 3.8, importing importlib_metadata leaves both the PathFinder and the MetadataPathFinder on sys.meta_path, each with a find_distributions method, so every package is reported twice. This behavior is fine for calls like find_distribution(name), which just return the first match, but less desirable for calls like find_distributions(path=['.']), where one might expect to find exactly one distribution (the one on the current path) but instead finds two (the same one twice).
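
A quick way to check whether a given interpreter is affected is to list the entries on sys.meta_path that expose find_distributions. The helper below is only a diagnostic sketch; the name `distribution_finders` is made up for this example and is not part of importlib_metadata or importlib.metadata.

```python
import sys


def distribution_finders(meta_path=None):
    """Return the names of meta path finders that expose find_distributions.

    Hypothetical helper for diagnosis only. Finders may be classes
    (like PathFinder) or instances, so fall back to the type name.
    """
    finders = sys.meta_path if meta_path is None else meta_path
    return [
        getattr(finder, "__name__", type(finder).__name__)
        for finder in finders
        if hasattr(finder, "find_distributions")
    ]


if __name__ == "__main__":
    # More than one entry here means a distribution may be reported
    # once per finder.
    print(distribution_finders())
```

If the description above holds, running this on Python 3.8 after `import importlib_metadata` should list both PathFinder and MetadataPathFinder.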

Here are some of the goals that led us here:

  • there should be an extensible interface whereby other package providers (like pyinstaller or some tool that imports from a database or the network but not from a typical file system) can provide metadata for those packages
  • the importlib_metadata behavior should be available on Python 3.8 and later even though importlib.metadata is available (possibly providing forward compatibility and feature backports).

I'm not sure what should happen here, but there are a few options that seem possible to me:

  1. Discourage users from using importlib_metadata on Python 3.8 or later. That would prevent importlib_metadata from installing its duplicate finder. This approach goes directly counter to the second goal above... and doesn't scale well. If one library imports importlib_metadata, they all get the new global state (MetadataPathFinder on sys.meta_path).
  2. In importlib_metadata, don't install the sys.meta_path hook on Python 3.8 or later (always rely on the system-provided one). That's probably suitable but limits the scope of backward compatibility that can be provided.
  3. If using importlib_metadata, bypass the system PathFinder for find_distributions. This approach would allow the backport to avoid interactions with finders as found in CPython but would still honor other distribution finders. This approach would address the issue when using importlib_metadata, but not when using importlib.metadata while some other library has imported importlib_metadata.
  4. Don't do anything and require all the clients to be aware that a single package might be advertised twice by different distribution finders.
  5. Always de-duplicate packages by name when returning multiple results, allowing duplicates to be advertised but giving precedence to the first one found.
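
To make option 5 concrete, here is a minimal sketch of de-duplicating by name with first-found precedence. It is not a proposed patch: `unique_by_name` and its `name_of` parameter are hypothetical names, and it assumes each distribution exposes its name as `metadata["Name"]`. It compares names exactly as reported, with no normalization.

```python
def unique_by_name(distributions, name_of=lambda dist: dist.metadata["Name"]):
    """Yield each distribution once, keeping the first one seen per name.

    Hypothetical helper: later distributions advertising a name that
    was already seen are dropped, so finder order decides precedence.
    """
    seen = set()
    for dist in distributions:
        name = name_of(dist)
        if name in seen:
            continue
        seen.add(name)
        yield dist
```

A caller could wrap the result of a multi-result query, for example `list(unique_by_name(importlib.metadata.distributions(path=['.'])))`. One tradeoff is that this hides legitimately distinct distributions that happen to share a name across finders.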
