Today I ran into an issue while reading a fairly large MySQL table. I needed to fetch the records in batches, so the good old find_in_batches method from Rails' ActiveRecord came into the picture.

Then the actual problem surfaced: the lookup table had no ID column of any sort, so find_in_batches kept throwing an Invalid Statement error. After some debugging it came to light that find_in_batches by default only works with an integer primary key, because it orders the records by that integer primary key column.
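To make the ordering dependency concrete, here is a plain-Ruby sketch of the general idea behind primary-key batching: sort by the key, take a batch, then continue from the last key seen. It runs on an in-memory array rather than a database. `ROWS` and `keyset_batches` are made-up names for illustration, and this is not Rails' actual implementation.

```ruby
# Hypothetical in-memory stand-in for a table with an integer primary key.
ROWS = [
  { id: 7, name: "g" },
  { id: 2, name: "b" },
  { id: 5, name: "e" },
  { id: 1, name: "a" },
  { id: 9, name: "i" }
].freeze

# Yields batches ordered by :id, resuming after the last id seen.
# Roughly: WHERE id > last_id ORDER BY id LIMIT batch_size
def keyset_batches(rows, batch_size)
  last_id = nil
  loop do
    candidates = rows.select { |row| last_id.nil? || row[:id] > last_id }
    batch = candidates.sort_by { |row| row[:id] }.first(batch_size)
    break if batch.empty?

    yield batch
    break if batch.size < batch_size

    # Remember where this batch ended so the next one starts after it
    last_id = batch.last[:id]
  end
end

keyset_batches(ROWS, 2) { |batch| puts batch.map { |row| row[:id] }.inspect }
# prints [1, 2], then [5, 7], then [9]
```

Without a column to sort and compare on, there is nothing to resume from, which is why a table with no usable primary key trips this kind of batching up.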

Thanks to http://monkeyandcrow.com/blog/reading_rails_how_do_batched_queries_work/

So I came up with this little hack that seemed to work for me:

module CustomFindInBatches

  # Yields records in batches, ordered by the model's primary key.
  # Note: :start is a row offset here, not a primary key value.
  def custom_find_in_batches(options = {})
    return unless block_given?

    batch_size = options[:batch_size] || 1000
    # begin at the requested offset (if any) and keep advancing from there
    offset     = options[:start] || 0

    # use custom primary key set in the model
    relation = reorder(primary_key).limit(batch_size)
    records  = relation.offset(offset).to_a

    while records.any?
      yield(records)

      # a short batch means we have reached the end of the table
      break if records.size < batch_size
      offset += batch_size

      # fetch batch_size records based on offset
      records = relation.offset(offset).to_a
    end
  end

end

This worked by setting Model.primary_key on the model and then reordering the relation by that primary key, which can now be of any data type (varchar in my case).
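To see the offset-based loop in isolation, here is a plain-Ruby sketch that applies the same idea to an in-memory array keyed by a string column. `LOOKUP_ROWS`, `offset_batches`, and the `:code` column are made up for illustration; the array slicing only imitates what ordering by the key with a limit and offset would do in SQL.

```ruby
# Hypothetical rows keyed by a string column, standing in for the lookup table.
LOOKUP_ROWS = [
  { code: "delta" },
  { code: "alpha" },
  { code: "echo" },
  { code: "charlie" },
  { code: "bravo" }
].freeze

# Imitates ORDER BY key LIMIT batch_size OFFSET offset on an array.
def offset_batches(rows, key, batch_size, start = 0)
  ordered = rows.sort_by { |row| row[key] }
  offset  = start

  loop do
    # Array#[] returns nil when offset is past the end, so fall back to []
    batch = ordered[offset, batch_size] || []
    break if batch.empty?

    yield batch
    break if batch.size < batch_size

    offset += batch_size
  end
end

offset_batches(LOOKUP_ROWS, :code, 2) do |batch|
  puts batch.map { |row| row[:code] }.inspect
end
# prints ["alpha", "bravo"], then ["charlie", "delta"], then ["echo"]
```

One trade-off worth keeping in mind: with OFFSET the database still has to walk past all the skipped rows, so later batches get slower on big tables, and rows inserted or deleted while the loop runs can shift the pages.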