Skip to content

Conversation

@eregon
Copy link
Member

@eregon eregon commented Nov 5, 2025

  • rb_struct_initialize() does not accept a Hash, and it's very brittle to pass [{...}] and to rely on that C function using rb_keyword_given_p(). It basically worked accidentally, by having **members in the caller of the caller. Such logic when Struct#initialize is defined in Ruby (as in TruffleRuby) is basically impossible to implement, because it's incorrectly treating positional arguments as keyword arguments.
  • rb_struct_initialize() is used in CRuby to set members of Data instances in marshal.c (there is no rb_data_initialize() yet). There, the code passes an Array of members values for Data (and for Struct which are not keyword_init: true): https://github.com/ruby/ruby/blob/48c7f349f68846e10d60ae77ad299a38ee014479/marshal.c#L2150-L2176 So we should do the same in psych.
  • Rename to init_data since it's only used for Data.
  • See Data object encoding #692 (comment).

…s and not a Hash

* rb_struct_initialize() does not accept a Hash, and it's very brittle
  to pass `[{...}]` and to rely on that C function using rb_keyword_given_p().
  It basically worked accidentally, by having **members in the caller of the caller.
  Such logic when Struct#initialize is defined in Ruby (as in TruffleRuby) is basically impossible to implement,
  because it's incorrectly treating positional arguments as keyword arguments.
* rb_struct_initialize() is used in CRuby to set members of Data instances in marshal.c (there is no rb_data_initialize() yet).
  There, the code passes an Array of members values for Data (and for Struct which are not `keyword_init: true`):
  https://github.com/ruby/ruby/blob/48c7f349f68846e10d60ae77ad299a38ee014479/marshal.c#L2150-L2176
  So we should do the same in psych.
* Rename to init_data since it's only used for Data.
* See ruby#692 (comment).
@eregon eregon requested a review from tenderlove November 5, 2025 10:31
@eregon eregon mentioned this pull request Nov 5, 2025
@eregon eregon changed the title Fix usage of rb_struct_initialize() to pass an Array of members values and not a Hash Fix usage of rb_struct_initialize() to pass an Array of members values and not a Hash and add TruffleRuby in CI Nov 5, 2025
@eregon
Copy link
Member Author

eregon commented Nov 5, 2025

I added TruffleRuby in CI in this PR since it just needs a tiny extra fix to be fully green.
This will help to prevent issues like truffleruby/truffleruby#3997 (or make them much faster to notice & fix)

end
data ||= allocate_anon_data(o, members)
init_struct(data, **members)
values = data.members.map { |m| members[m] }
Copy link
Member Author

@eregon eregon Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also use members.values but then we should check if the data_class.members == members.keys, and raise an exception if not. If so, which exception class should I use then, do we have one for "incompatible change in loaded classes"?
Otherwise if e.g. one changes the order of fields from Foo = Data.define(:a, :b) to Data.define(:b, :a) then it would load the instance incorrectly in a rather surprising manner.

If there are extra members in Data.define, the current approach sets them to nil, like before:

irb(main):002> Psych::VERSION
=> "5.2.6"
irb(main):011> Foo = Data.define(:a,:b)
=> Foo
irb(main):012> d=Psych.dump Foo.new(1,2)
=> "--- !ruby/data:Foo\na: 1\nb: 2\n"
irb(main):013> puts d
--- !ruby/data:Foo
a: 1
b: 2
=> nil
irb(main):014> Foo = Data.define(:a,:b, :c)
(irb):14: warning: already initialized constant Foo
(irb):11: warning: previous definition of Foo was here
=> Foo
irb(main):015> Psych.unsafe_load d
=> #<data Foo a=1, b=2, c=nil>

This is rb_struct_initialize()-specific behavior BTW, Data#initialize doesn't allow that:

> Data.define(:a,:b).new(a: 1)
(irb):3:in 'Data#initialize': missing keyword: :b (ArgumentError)

Checking members would raise, unless we get fancy and only compare the first N members where N is members.size.

Basically this seems a good approach and it's compatible with what was done before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant