NDJsonExec doesn't properly apply predicates on partitioned tables.

### Describe the bug

A SQL query against an NDJSON table with partition columns fails with the following error when it filters on any of the partition columns. In this case my partition column is a timestamp, but the same failure occurs with other types as well.

> ArrowError(JsonError("Encountered unmasked nulls in non-nullable StructArray child: Field { name: \"hourly_timestamp\", data_type: Timestamp(Second, None), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }"))

It correctly prunes the files; however, it doesn't populate the partition predicate correctly. This is in contrast to `ParquetExec`, which adds an extra predicate to populate the partition column. Compare the two physical plans:

> JsonExec: file_groups={1 group: [[Users/taylorbeever/git/theelderbeever/df-test/data/ndjson/hourly_timestamp=2023-09-25T20:00:00/data.ndjson]]}, projection=[id, timestamp, value, hourly_timestamp]

> ParquetExec: file_groups={1 group: [[Users/taylorbeever/git/theelderbeever/df-test/data/parquet/hourly_timestamp=2023-09-25T20:00:00/data.zstd.parquet]]}, projection=[id, timestamp, value, hourly_timestamp], predicate=hourly_timestamp@3 = 1695672000
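
For reference, the literal `1695672000` in the Parquet predicate is the partition directory value `2023-09-25T20:00:00` read as UTC seconds since the Unix epoch. Below is a minimal, standard-library-only sketch of that conversion. The helpers `partition_value`, `days_from_civil`, and `epoch_seconds` are hypothetical names for illustration, not DataFusion APIs, and the sketch assumes a Hive-style `key=value` path segment and a timezone-less timestamp treated as UTC.

```rust
/// Extract the value of a Hive-style `key=value` segment from a path.
/// Hypothetical helper for illustration; not part of DataFusion.
fn partition_value<'a>(path: &'a str, key: &str) -> Option<&'a str> {
    path.split('/')
        .find_map(|segment| segment.strip_prefix(key)?.strip_prefix('='))
}

/// Days between 1970-01-01 and the given proleptic Gregorian date
/// (the widely used "days from civil" arithmetic).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let shifted_month = if m > 2 { m - 3 } else { m + 9 }; // March = 0
    let doy = (153 * shifted_month + 2) / 5 + d - 1; // day of year, 0..=365
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Parse `YYYY-MM-DDTHH:MM:SS` as UTC and return seconds since the Unix epoch.
/// Assumption: no timezone suffix and no fractional seconds.
fn epoch_seconds(ts: &str) -> Option<i64> {
    let (date, time) = ts.split_once('T')?;
    let d = date
        .split('-')
        .map(|p| p.parse::<i64>().ok())
        .collect::<Option<Vec<i64>>>()?;
    let t = time
        .split(':')
        .map(|p| p.parse::<i64>().ok())
        .collect::<Option<Vec<i64>>>()?;
    if d.len() != 3 || t.len() != 3 {
        return None;
    }
    Some(days_from_civil(d[0], d[1], d[2]) * 86_400 + t[0] * 3_600 + t[1] * 60 + t[2])
}

fn main() {
    // Relative tail of the path shown in the plans above.
    let path = "data/ndjson/hourly_timestamp=2023-09-25T20:00:00/data.ndjson";
    let value = partition_value(path, "hourly_timestamp").expect("partition segment");
    let secs = epoch_seconds(value).expect("well-formed timestamp");
    println!("{value} -> {secs}"); // prints "2023-09-25T20:00:00 -> 1695672000"
}
```

So the Parquet plan carries the filter value through as expected, while the JSON plan shows no predicate at all for the same partition.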

Attempted workarounds, all of which fail:
- Adding the partition columns to each JSON file.
- Defining the schema for the table explicitly.
- Using other data types for the partition column.

### To Reproduce

I created an example repo [here](https://github.com/theelderbeever/datafusion-ndjson-issue).

Example data is included in the repo. All code is contained in `src/main.rs`. The Parquet files contain the same data as the NDJSON files. As written, the files do not contain a column for the partition column.

To run:

First, the Parquet one, which will succeed:
```console
RUST_LOG=debug cargo run -- parquet
```

Then the NDJSON one, which will fail:
```console
RUST_LOG=debug cargo run -- ndjson
```


### Expected behavior

Partitioned table reads shouldn't fail when filtering on a partition column.

Additionally, the default `file_extension` for `NDJsonReadOptions` is `.json`, which is a little misleading. It should be one of `.ndjson` or `.jsonl`.

### Additional context

_No response_

NDJsonExec doesn't properly apply predicates on partitioned tables. #7686
