Multi insert only inserts columns present in the first tuple

We recently hit an issue doing a multi insert via a repository’s create command where some columns are not written.

Specifically, the columns that are not represented in the first hash passed to create are ignored. i.e. in the following call, the private column of the posts table will not be set to true for the second post.

post_repo.create([
  {title: "First post"}, 
  {title: "Second post", private: true}
])

The reason for that is this method in Sequel, which documents the fact that all hashes should have the same keys (the columns are determined by the first hash):

# This is a front end for import that allows you to submit an array of
# hashes instead of arrays of columns and values:
#
#   DB[:table].multi_insert([{x: 1}, {x: 2}])
#   # INSERT INTO table (x) VALUES (1)
#   # INSERT INTO table (x) VALUES (2)
#
# Be aware that all hashes should have the same keys if you use this calling method,
# otherwise some columns could be missed or set to null instead of to default
# values.
#
# This respects the same options as #import.
def multi_insert(hashes, opts=OPTS)
  return if hashes.empty?
  columns = hashes.first.keys
  import(columns, hashes.map{|h| columns.map{|c| h[c]}}, opts)
end

As this behaviour could be the cause of application bugs where columns are unexpectedly not written, I wanted to open a discussion about whether some behaviour should change here?

For example, should Sequel check all keys of all hashes being inserted? Should rom-sql do that instead? Or should rom-sql raise an error if it encounters a multi-insert where the hashes have different key sets, to prevent developers encountering this situation?

Hey Andrew!

I think this should be handled better by rom-sql. I :100: agree with you that this can easily hide bugs and cause confusion. We could generate an explicit hash schema based on the first tuple and then apply it to all tuples prior to sending it down to the multi_insert. It would crash with an exception before touching the database if there’s any tuple that does not match the hash schema.

Thanks @solnic :slight_smile: That sounds like good solution for avoiding confusion. I’ll create an issue on rom-sql for this? Happy to have a go at a PR for it also.

1 Like

@andrewcroome yes please report an issue and we can take it from there. One thing to investigate is if Relation#input_schema is applied to these tuples and if that’s the case, then why it’s not crashing when tuples have keys missing.