Copilot AI commented May 25, 2025

This PR adds support for nested data types (lists and dictionaries) in the pandas-pyarrow library. With this enhancement, columns containing Python lists or dictionaries will be properly converted to their corresponding PyArrow types.

Changes

  • Added a new nested_mapper.py module to handle mappings for nested data types
  • Enhanced the PandasArrowConverter to detect and properly handle nested types:
    • List columns are converted to list[pyarrow]
    • Dictionary columns are converted to struct[pyarrow]
  • Implemented a registry-based approach to store nested type information
  • Added proper reverse conversion to ensure nested types can be converted back to pandas/numpy
  • Created comprehensive test cases to verify the functionality
  • Updated documentation in README.md with examples
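The detection step and the registry described above can be sketched roughly as follows. This is an illustration only, not the code in nested_mapper.py: the helper names `infer_nested_dtype` and `build_nested_registry` are hypothetical, the dtype labels are copied from the strings in this description, and the sketch works on plain Python lists rather than pandas columns so that it has no dependencies.

```python
from typing import Any, Dict, Iterable, List, Optional

# Labels copied from the PR description; treating them as plain strings
# is an assumption of this sketch.
LIST_DTYPE = "list[pyarrow]"
STRUCT_DTYPE = "struct[pyarrow]"


def infer_nested_dtype(values: Iterable[Any]) -> Optional[str]:
    """Return a nested dtype label if every non-null value is a list or a dict.

    Hypothetical helper: returns None for empty, all-null, scalar,
    or mixed columns.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    if all(isinstance(v, list) for v in non_null):
        return LIST_DTYPE
    if all(isinstance(v, dict) for v in non_null):
        return STRUCT_DTYPE
    return None


def build_nested_registry(columns: Dict[str, List[Any]]) -> Dict[str, str]:
    """Map each nested column name to its dtype label (hypothetical registry)."""
    registry: Dict[str, str] = {}
    for name, values in columns.items():
        label = infer_nested_dtype(values)
        if label is not None:
            registry[name] = label
    return registry


columns = {
    "list_col": [[1, 2, 3], [4, 5], None],
    "dict_col": [{"a": 1, "b": 2}, {"c": 3}],
    "plain_col": [1, 2, 3],
}
print(build_nested_registry(columns))
# {'list_col': 'list[pyarrow]', 'dict_col': 'struct[pyarrow]'}
```

Keeping the labels in a separate mapping is one way the reverse conversion could know which columns to turn back into Python lists and dictionaries; whether the PR stores them this way is not stated here.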

Example usage

import pandas as pd
from pandas_pyarrow import PandasArrowConverter, convert_to_numpy, convert_to_pyarrow

# Create a DataFrame with list and dictionary columns
df = pd.DataFrame({
    'list_col': [[1, 2, 3], [4, 5], [6, 7, 8, 9]],
    'dict_col': [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4, 'e': 5}]
})

# Convert to PyArrow-backed DataFrame
adf = convert_to_pyarrow(df)

# Access nested type information
converter = PandasArrowConverter()
nested_types = converter.get_nested_dtypes(adf)
print(nested_types)  # {'list_col': 'list[pyarrow]', 'dict_col': 'struct[pyarrow]'}

# Convert back to pandas/numpy
rdf = convert_to_numpy(adf)

Fixes #1.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • cdn.fwupd.org
    • Triggering command: /usr/bin/fwupdmgr refresh (dns block)

Copilot AI changed the title [WIP] Add support for nested data types (lists and dictionaries) #281 Add support for nested data types (lists and dictionaries) May 25, 2025
Copilot AI requested a review from eh-main-bot May 25, 2025 13:36
Copilot finished work on behalf of eh-main-bot May 25, 2025 13:36
Development

Successfully merging this pull request may close these issues.

Add support for nested data types (lists and dictionaries) #281

2 participants