Closed
Labels: bug (Something isn't working)
Description
Describe the bug
I'm using the DataFrame API to perform a join. I can build the join without issue; however, attempting to add an additional column to the joined DataFrame fails. This is the logical plan:
DataFrame {
    session_state: SessionState {
        session_id: "56e65554-2665-46a7-8f3f-6839b25e542c",
    },
    plan: Full Join: Filter: target.id = source.id
      Projection: source.id, source.value, source.modified, Boolean(true) AS __delta_rs_source
        TableScan: source
      Projection: target.id, target.value, target.modified, Boolean(true) AS __delta_rs_target
        TableScan: target,
}
Adding the column fails with the following error:
called `Result::unwrap()` on an `Err` value: Generic("Schema error: Ambiguous reference to unqualified field id")
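To illustrate what the message seems to be describing, here is a toy model in plain Rust. It is only a sketch: `QualifiedField`, `join_schema`, and `resolve` are hypothetical names invented for this example, not DataFusion types or functions, and whether `with_column` actually performs an unqualified lookup like this is an assumption on my part, not something confirmed above. The model shows that a bare `id` matches two fields in the join output, while `source.id` and `target.id` each match exactly one.

```rust
/// Toy stand-in for one column of a plan's output schema.
/// Hypothetical type for illustration only; this is not DataFusion's schema type.
#[derive(Debug, Clone, PartialEq)]
pub struct QualifiedField {
    pub qualifier: Option<&'static str>,
    pub name: &'static str,
}

/// Hand-written model of the join output in the plan above:
/// three qualified columns per side plus one unqualified marker column.
pub fn join_schema() -> Vec<QualifiedField> {
    let mut fields = Vec::new();
    let sides = [
        ("source", "__delta_rs_source"),
        ("target", "__delta_rs_target"),
    ];
    for &(table, marker) in sides.iter() {
        for &name in ["id", "value", "modified"].iter() {
            fields.push(QualifiedField { qualifier: Some(table), name });
        }
        // Aliased expressions are modeled as carrying no table qualifier (assumption).
        fields.push(QualifiedField { qualifier: None, name: marker });
    }
    fields
}

/// Resolve a column reference to a field index.
/// With `qualifier == None`, every field with a matching name is a candidate,
/// so a name shared by both join sides cannot be resolved.
pub fn resolve(
    fields: &[QualifiedField],
    qualifier: Option<&str>,
    name: &str,
) -> Result<usize, String> {
    let matches: Vec<usize> = fields
        .iter()
        .enumerate()
        .filter(|(_, f)| f.name == name && (qualifier.is_none() || f.qualifier == qualifier))
        .map(|(i, _)| i)
        .collect();
    match matches.as_slice() {
        [only] => Ok(*only),
        [] => Err(format!("No field named {}", name)),
        _ if qualifier.is_none() => {
            Err(format!("Ambiguous reference to unqualified field {}", name))
        }
        _ => Err(format!("Ambiguous reference to field {}", name)),
    }
}
```

In this model, `resolve(&join_schema(), None, "id")` returns the ambiguity error, while `resolve(&join_schema(), Some("source"), "id")` succeeds. If something similar happens internally, the new column name `test123` would not be the culprit; the existing `id` columns would be.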
To Reproduce
Original code that caused this issue is here: https://github.com/Blajda/delta-rs/blob/merge-logical/rust/src/operations/merge.rs#L649
Code that reproduces the issue:
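// Imports are assumed (not part of the original snippet); paths may differ between DataFusion versions.
use std::sync::Arc;

use arrow::datatypes::{DataType, Field, Schema as ArrowSchema};
use arrow::record_batch::RecordBatch;
use datafusion::datasource::provider_as_source;
use datafusion::logical_expr::LogicalPlanBuilder;
use datafusion::prelude::*;
use datafusion_common::TableReference;

// The statements below must run inside an async function (for example under #[tokio::main]).

// Both tables share the same three-column schema.
let schema = Arc::new(ArrowSchema::new(vec![
    Field::new("id", DataType::Utf8, true),
    Field::new("value", DataType::Int32, true),
    Field::new("modified", DataType::Utf8, true),
]));
let ctx = SessionContext::new();

// Source rows.
let batch = RecordBatch::try_new(
    Arc::clone(&schema),
    vec![
        Arc::new(arrow::array::StringArray::from(vec!["B", "C", "X"])),
        Arc::new(arrow::array::Int32Array::from(vec![10, 20, 30])),
        Arc::new(arrow::array::StringArray::from(vec![
            "2021-02-02",
            "2023-07-04",
            "2023-07-04",
        ])),
    ],
)
.unwrap();
let source = ctx.read_batch(batch).unwrap();

// Target rows.
let batch = RecordBatch::try_new(
    Arc::clone(&schema),
    vec![
        Arc::new(arrow::array::StringArray::from(vec!["B", "D", "X"])),
        Arc::new(arrow::array::Int32Array::from(vec![10, 20, 30])),
        Arc::new(arrow::array::StringArray::from(vec![
            "2021-02-02",
            "2023-07-04",
            "2023-07-04",
        ])),
    ],
)
.unwrap();
let target = ctx.read_batch(batch).unwrap();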
// Re-scan each DataFrame under an explicit table name so its columns are qualified.
let source_name = TableReference::bare("source");
let source =
    LogicalPlanBuilder::scan(source_name, provider_as_source(source.into_view()), None)
        .unwrap()
        .build()
        .unwrap();
let source = DataFrame::new(ctx.state(), source);

// Named "target" to match the join filter and the logical plan shown above.
let target_name = TableReference::bare("target");
let target =
    LogicalPlanBuilder::scan(target_name, provider_as_source(target.into_view()), None)
        .unwrap()
        .build()
        .unwrap();
let target = DataFrame::new(ctx.state(), target);
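// The full join on source.id = target.id builds without error.
let join = source
    .join(
        target,
        datafusion_common::JoinType::Full,
        &[],
        &[],
        Some(col("source.id").eq(col("target.id"))),
    )
    .unwrap();

// Adding a new column to the joined DataFrame is the call that fails.
let proj = join.with_column("test123", lit(true)).unwrap();
proj.show().await.unwrap();
Expected behavior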
I should be able to add a new, uniquely named column to this DataFrame.
Additional context
No response