August 29, 2019

Upcoming Features in Shrine 3.0

For the last couple of months I’ve been working hard to prepare for Shrine 3.0, which I expect will be released by the end of October. A lot of work has gone into it, including some big but much needed rewrites. I feel the API has stabilized now, so I thought it would be a good time to share with you some of the new features and improvements that will be coming to 3.0.

For those who don’t know, Shrine is a versatile file attachment library for Ruby applications. It was born out of frustration at not being able to achieve the desired user experience with existing solutions. Tomorrow it will be turning 4 years old.

Before we start, here is a little refresher on Shrine’s core classes:

Class	Description
`Shrine`	performs uploads
`Shrine::UploadedFile`	represents uploaded files
`Shrine::Attacher`	handles attaching
`Shrine::Attachment`	model wrapper for `Shrine::Attacher`

Decoupling from models

Currently, when attaching files, Shrine requires you to provide a mutable struct (aka “model”) which the attached file data will be written to:

class Photo < Struct.new(:image_data)
end

photo = Photo.new

attacher = Shrine::Attacher.new(photo, :image)
attacher.assign(file) # uploads file

photo.image_data #=> '{"id":"abc123.jpg", "storage":"cache", "metadata":{...}}'

This works nicely with Active Record, Sequel, Mongoid and all other database libraries that implement the Active Record pattern.

However, it so happens that not all Ruby database libraries implement this pattern. ROM and Hanami::Model implement the Repository pattern, which separates data from persistence. Using this pattern, record objects (aka “entities”) are represented with immutable structs:

class Photo < Hanami::Entity
end

photo = Photo.new(image_data: nil)

attacher = Shrine::Attacher.new(photo, :image)
attacher.assign(file) #~> NoMethodError: undefined method `image_data=' for #<Photo>
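To see why the write fails, here is a minimal plain-Ruby sketch. `MutablePhoto`, `ImmutablePhoto` and `write_attachment` are hypothetical stand-ins made up for illustration, not Shrine or Hanami code; the only assumption is that a model-style attacher writes through a `<name>_data=` setter, as in the example above.

```ruby
# Hypothetical stand-ins, not Hanami or Shrine classes.
MutablePhoto = Struct.new(:image_data)

class ImmutablePhoto
  attr_reader :image_data # reader only, no writer

  def initialize(image_data: nil)
    @image_data = image_data
    freeze
  end
end

# A model-style write: assigns serialized data through the "<name>_data=" setter.
def write_attachment(record, name, data)
  record.public_send(:"#{name}_data=", data)
end

mutable = MutablePhoto.new
write_attachment(mutable, :image, '{"id":"abc123.jpg"}')

immutable = ImmutablePhoto.new
error = begin
  write_attachment(immutable, :image, '{"id":"abc123.jpg"}')
  nil
rescue NoMethodError => e
  e # the setter does not exist on the immutable object
end
```

The immutable object simply has no `image_data=` writer to call.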

You could somehow hack your way around it, but this is far from ideal. Even if I didn’t care about ROM and Hanami::Model, I did feel that coupling to model instances made the Shrine::Attacher implementation more difficult to reason about. When that reached a point where we couldn’t implement the new derivatives feature, I knew that attaching logic needed to be rewritten.

The result of that rewrite is that the Shrine::Attacher API is now layered into base, column, entity, and model.

Base

The core Shrine::Attacher is now instantiated standalone and maintains its own state:

attacher = Shrine::Attacher.new
attacher.assign(file) # uploads file
attacher.file #=> #<Shrine::UploadedFile id="abc123.jpg" storage=:store ...>

It provides the #data method which returns attached file data as a serializable Hash, suitable for persisting:

attacher.data #=>
# {
#   "id" => "abc123.jpg",
#   "storage" => "store",
#   "metadata" => {
#     "size" => 9534842,
#     "filename" => "nature.jpg",
#     "mime_type" => "image/jpeg",
#   }
# }

The attachment can then be loaded back from this data:

attacher = Shrine::Attacher.from_data(data)
attacher.file #=> #<Shrine::UploadedFile @id="abc123.jpg" @storage_key=:store ...>

Column

Now, if you want to persist the attached file data to a text database column, you’ll need to serialize the data hash into a string (e.g. JSON). For that you can use the column plugin:

Shrine.plugin :column

data = attacher.column_data # dump JSON string
#=> '{"id":"abc123.jpg","storage":"store","metadata":{...}}'

attacher = Shrine::Attacher.from_column(data) # load JSON string
attacher.file #=> #<Shrine::UploadedFile id="abc123.jpg" storage=:store ...>
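Conceptually, this is a dump/load round trip between a Hash and a String. The sketch below shows that idea with Ruby's standard `json` library; `dump_column` and `load_column` are hypothetical helper names, and this is not the plugin's actual implementation.

```ruby
require "json"

# Hypothetical helpers illustrating the dump/load idea; not Shrine's code.
def dump_column(data)
  data.nil? ? nil : JSON.generate(data)
end

def load_column(string)
  string.nil? ? nil : JSON.parse(string)
end

data = {
  "id"       => "abc123.jpg",
  "storage"  => "store",
  "metadata" => {
    "size"      => 9534842,
    "filename"  => "nature.jpg",
    "mime_type" => "image/jpeg",
  },
}

column   = dump_column(data)   # String, suitable for a text column
restored = load_column(column) # Hash, equal to the original data
```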

Entity

The entity plugin builds upon the column plugin, providing integration for immutable structs:

class Photo < Hanami::Entity
end

photo = Photo.new(image_data: nil)

# allows instantiating attacher from an entity
attacher = Shrine::Attacher.from_entity(photo, :image)
attacher.assign(file) # does not attempt to write to the entity attribute

# provides the hash of attributes for you to persist
attacher.column_values #=> { :image_data => '{"id":"abc123.jpg","storage":"cache","metadata":{...}}' }

You can remove some of the boilerplate with the Shrine::Attachment module:

class Photo < Hanami::Entity
  include Shrine::Attachment(:image)
end

photo = Photo.new(image_data: '{"id":"abc123.jpg","storage":"store","metadata":{...}}')
photo.image #=> #<Shrine::UploadedFile id="abc123.jpg" storage=:store ...>
photo.image_attacher # shorthand for `Shrine::Attacher.from_entity(photo, :image)`

The upcoming shrine-rom gem will build upon the entity plugin, as well as the hanami-shrine gem.
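Since the entity itself is never mutated, persisting means handing the hash of column values to whatever writes your records. Here is a rough sketch of that flow in plain Ruby; `PhotoEntity` and `PhotoRepository` are hypothetical, in-memory stand-ins, and a real repository would issue an UPDATE instead.

```ruby
# Hypothetical immutable entity; stands in for a Hanami::Entity or ROM struct.
class PhotoEntity
  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes.dup.freeze
    freeze
  end

  def image_data
    attributes[:image_data]
  end
end

# Hypothetical in-memory repository: returns a new entity instead of mutating.
class PhotoRepository
  def update(entity, values)
    PhotoEntity.new(entity.attributes.merge(values))
  end
end

photo = PhotoEntity.new(id: 1, image_data: nil)

# Same shape as the hash returned by attacher.column_values above.
column_values = { image_data: '{"id":"abc123.jpg","storage":"cache","metadata":{}}' }

updated = PhotoRepository.new.update(photo, column_values)
```

The original `photo` stays untouched, and `updated` carries the new attachment data.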

Model

With the entity plugin providing the reads, the new model plugin adds the writes, which is convenient for mutable structs:

class Photo < Struct.new(:image_data)
end

photo = Photo.new

# allows instantiating attacher from a model
attacher = Shrine::Attacher.from_model(photo, :image)
attacher.assign(file) # writes uploaded file data to the model attribute

photo.image_data #=> '{"id":"abc123.jpg", "storage":"cache", "metadata":{...}}'

Or with the Shrine::Attachment module:

class Photo < Struct.new(:image_data)
  include Shrine::Attachment(:image)
end

photo = Photo.new
photo.image = file
photo.image #=> #<Shrine::UploadedFile id="abc123.jpg" storage=:cache ...>
photo.image_data #=> '{"id":"abc123.jpg", "storage":"cache", "metadata":{...}}'

The existing activerecord and sequel plugins are then built on top of the model plugin.

Derivatives

The Shrine::Attacher rewrite also enabled us to implement the main new feature – the derivatives plugin. It is a reimplementation of the existing versions plugin, but with a proper API and much needed flexibility.

Problems with versions

With the versions plugin, you register a processing block which receives the original cached file and needs to return the set of files that should be saved. This block is automatically triggered when the cached file is being uploaded to permanent storage.

class ImageUploader < Shrine
  plugin :processing
  plugin :versions

  process(:store) do |io|
    processor = ImageProcessing::MiniMagick.source(io.download)

    { original: io,
      large:    processor.resize_to_limit!(800, 800),
      medium:   processor.resize_to_limit!(500, 500),
      small:    processor.resize_to_limit!(300, 300) }
  end
end

photo.image = file
photo.save  # triggers processing of versions
photo.image #=>
# {
#   original: <Shrine::UploadedFile @id="original.jpg" ...>,
#   large:    <Shrine::UploadedFile @id="large.jpg" ...>,
#   medium:   <Shrine::UploadedFile @id="medium.jpg" ...>,
#   small:    <Shrine::UploadedFile @id="small.jpg" ...>,
# }

One problem with this design is that you need to change how you access your original file after the versions have been processed. This is especially problematic when processing in a background job, as then you need to handle both attachment states, with and without versions.

# how we access the original file...
photo.image #=> #<Shrine::UploadedFile @id="original.jpg" ...>
photo.image.mime_type #=> "image/jpeg"

photo.save

# ...now needs to be changed
photo.image[:original] #=> #<Shrine::UploadedFile @id="original.jpg" ...>
photo.image[:original].mime_type #=> "image/jpeg"
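In practice, handling both states tends to mean a small guard like the one sketched below. `UploadedFile` is a hypothetical stand-in struct and `original_file` is a made-up helper name, so treat this as an illustration of the workaround rather than anything Shrine provides.

```ruby
# Hypothetical stand-in for Shrine::UploadedFile.
UploadedFile = Struct.new(:id, :mime_type)

# Returns the original file whether or not versions have been processed.
def original_file(attachment)
  attachment.is_a?(Hash) ? attachment.fetch(:original) : attachment
end

original = UploadedFile.new("original.jpg", "image/jpeg")

before_processing = original
after_processing  = {
  original: original,
  large:    UploadedFile.new("large.jpg", "image/jpeg"),
}

mime_before = original_file(before_processing).mime_type
mime_after  = original_file(after_processing).mime_type
```

Every caller that touches the attachment has to go through a check like this, which is exactly the kind of leakage you don't want.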

The fact that processing versions was tied to promotion made other things difficult as well:

uploading versions to a different storage than the original file
adding new versions to an existing attachment
reprocessing existing versions

Solution

With the new derivatives plugin, you trigger processing explicitly when you want, and processed files are retrieved separately from the original file:

class ImageUploader < Shrine
  plugin :derivatives

  Attacher.derivatives_processor do |original|
    processor = ImageProcessing::MiniMagick.source(original)

    {
      large:  processor.resize_to_limit!(800, 800),
      medium: processor.resize_to_limit!(500, 500),
      small:  processor.resize_to_limit!(300, 300),
    }
  end
end

photo.image = file
photo.image_derivatives! # triggers processing and uploads results
photo.save

# ...

photo.image_derivatives #=>
# {
#   large:  #<Shrine::UploadedFile id="large.jpg" storage=:store>,
#   medium: #<Shrine::UploadedFile id="medium.jpg" storage=:store>,
#   small:  #<Shrine::UploadedFile id="small.jpg" storage=:store>,
# }

# original file is still accessed in the same way
photo.image #=> #<Shrine::UploadedFile @id="original.jpg" ...>

The processing block is just a convention; we can also add files directly using Attacher#add_derivative(s):

attacher = photo.image_attacher
attacher.derivatives #=> { small: ..., medium: ..., large: ... }
attacher.add_derivative(:extra_large, extra_large_file) # uploads file and merges result
attacher.derivatives #=> { small: ..., medium: ..., large: ..., extra_large: ... }

The storage where processed files will be uploaded to can now be changed as well:

# upload all derivatives to :thumbnail_store
plugin :derivatives, storage: :thumbnail_store

# upload different derivatives to different storage
plugin :derivatives, storage: -> (name) { ... }
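What goes inside the lambda is up to you. As one purely hypothetical example, assuming the lambda receives the derivative name and returns a storage key, you could route by a lookup table; the `STORAGE_BY_DERIVATIVE` mapping and the choice of which derivatives go where are made up for illustration.

```ruby
# Hypothetical routing table: which derivative goes to which storage key.
STORAGE_BY_DERIVATIVE = {
  small:  :thumbnail_store,
  medium: :thumbnail_store,
}.freeze

# Assumed contract: called with the derivative name, returns a storage key.
derivative_storage = ->(name) { STORAGE_BY_DERIVATIVE.fetch(name, :store) }

small_storage = derivative_storage.call(:small)
large_storage = derivative_storage.call(:large) # falls back to :store
```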

Backgrounding

Shrine’s backgrounding plugin allows you to move promotion (uploading the cached file to permanent storage) and file processing into a background job. Previously, it tried to do everything for you (fetching the record in the background job, performing processing, reloading the record to check that the attachment hasn’t changed), which meant that when something didn’t work, you had very little visibility into why.

Shrine.plugin :backgrounding
Shrine::Attacher.promote_block { |data| PromoteJob.perform_async(data) } # magic hash

class PromoteJob
  include Sidekiq::Worker

  def perform(data)
    Shrine::Attacher.promote(data) # use the magic hash to do magic things
  end
end

For Shrine 3.0, the backgrounding feature has been completely rewritten to be more explicit and flexible:

Shrine.plugin :backgrounding
Shrine::Attacher.promote_block do
  PromoteJob.perform_async(self.class.name, record.class.name, record.id, name, file_data)
end

class PromoteJob
  include Sidekiq::Worker

  def perform(attacher_class, record_class, record_id, name, file_data)
    attacher_class = Object.const_get(attacher_class)
    record         = Object.const_get(record_class).find(record_id)

    attacher = attacher_class.retrieve(model: record, name: name, file: file_data)
    attacher.atomic_promote
  rescue Shrine::AttachmentChanged, ActiveRecord::RecordNotFound
    # attachment has changed or the record has been deleted, nothing to do
  end
end

You can now see what’s going on:

record, attachment name, and current attached file are passed to the background job
background job fetches the database record
you retrieve the attacher as it was before the background job was spawned
- if attachment has changed, Shrine::AttachmentChanged is raised
you upload the cached attached file to permanent storage
- if attachment has changed during upload, Shrine::AttachmentChanged is raised
- if record has been deleted, ActiveRecord::RecordNotFound is raised
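The checks in the list above can be modelled in a few lines of plain Ruby. This is a simplified sketch of the idea only: `AttachmentChanged`, `Record`, `DB` and `atomic_promote` here are hypothetical, in-memory stand-ins, and a `KeyError` plays the role of `ActiveRecord::RecordNotFound`.

```ruby
# Hypothetical error and in-memory "database"; not Shrine's implementation.
class AttachmentChanged < StandardError; end

Record = Struct.new(:id, :image_data)
DB = {}

def atomic_promote(record, expected_data)
  # 1. the attachment must still be the one the job was spawned for
  raise AttachmentChanged if record.image_data != expected_data

  # 2. "upload" to permanent storage (simulated by changing the storage name)
  promoted = expected_data.merge("storage" => "store")
  yield if block_given? # lets the example simulate a concurrent update

  # 3. re-check the persisted row before writing (KeyError if it was deleted)
  current = DB.fetch(record.id)
  raise AttachmentChanged if current.image_data != expected_data

  current.image_data = promoted
  record.image_data  = promoted
end

cached = { "id" => "abc123.jpg", "storage" => "cache" }

# Happy path: nothing changed, so the promoted data is persisted.
DB[1] = Record.new(1, cached)
atomic_promote(DB[1].dup, cached)

# The attachment was replaced before the job started.
DB[2] = Record.new(2, { "id" => "new.jpg", "storage" => "cache" })
stale = begin
  atomic_promote(DB[2].dup, cached)
  false
rescue AttachmentChanged
  true
end

# The attachment was replaced while the upload was in progress.
DB[3] = Record.new(3, cached)
raced = begin
  atomic_promote(DB[3].dup, cached) do
    DB[3].image_data = { "id" => "other.jpg", "storage" => "cache" }
  end
  false
rescue AttachmentChanged
  true
end
```

In both failure cases the newer attachment is left alone, which is the whole point of comparing against the file data captured when the job was spawned.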

It’s now easy, for example, to add processing derivatives into the mix:

def perform(attacher_class, record_class, record_id, name, file_data)
  # ...
  attacher = attacher_class.retrieve(model: record, name: name, file: file_data)
  attacher.create_derivatives # process derivatives and store results
  attacher.atomic_promote
  # ...
end

People have also wanted to pass additional parameters from the controller into the background job. You can now do this with instance-level hooks:

class PhotosController < ApplicationController
  def create
    photo = Photo.new(photo_params)

    photo.image_attacher.promote_block do |attacher|
      # explicit style without instance eval
      PromoteJob.perform_async(
        attacher.class.name, # pass the name, the job resolves it with Object.const_get
        attacher.record.class.name,
        attacher.record.id,
        attacher.name,
        attacher.file_data,
        current_user.id, # pass data from the controller
      )
    end

    photo.save # background job is spawned
  end
end
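A note on the job arguments: Sidekiq serializes them as JSON, so it's safest to pass only strings, numbers and hashes, and to resolve classes by name inside the job as `PromoteJob` does. The sketch below simulates that round trip with the standard `json` library; the `Demo` module and its classes are hypothetical stand-ins for your attacher and model classes.

```ruby
require "json"

# Hypothetical classes standing in for an attacher class and a model.
module Demo
  class ImageAttacher; end
  class Photo; end
end

# What the enqueuing side passes: plain strings and integers only.
args = [Demo::ImageAttacher.name, Demo::Photo.name, 42, "image"]

# Simulate the trip through a JSON-based job queue.
attacher_class_name, record_class_name, record_id, name =
  JSON.parse(JSON.generate(args))

# What the job does on the other side: resolve the constants by name.
attacher_class = Object.const_get(attacher_class_name)
record_class   = Object.const_get(record_class_name)
```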

Other improvements

In addition to these big rewrites, there have been many other notable improvements in areas of usability, performance and design. Here are some of the highlights:

Skipping temporary storage

Shrine uses a temporary storage for storing files that have not been attached yet. This enables features such as retaining uploads on validation errors and direct uploads.

However, if you’re attaching files from a background job or a script, you don’t need the temporary storage. Starting from Shrine 3.0, you won’t need to have temporary storage defined, and you can configure the attachment writer method to upload directly to permanent storage:

Shrine.plugin :model, cache: false

photo.image = file
photo.image.storage_key #=> :store (permanent storage)

Faster file retrieval

Currently, whenever the attached file is accessed, it’s parsed from the attachment data attribute on the record instance. This can add up if you’re storing lots of processed files or metadata.

photo.image # parses `image_data` column attribute
photo.image # parses `image_data` column attribute again

Starting from Shrine 3.0, the attached file will be loaded from the data attribute only on first access. Additionally, you’ll be able to switch to a faster JSON parser if you want to.

require "oj" # https://github.com/ohler55/oj

Shrine.plugin :column, serializer: Oj

photo.image # parses `image_data` using Oj, and memoizes the result
photo.image # returns memoized file
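The combination of a swappable serializer and memoization can be pictured with a small plain-Ruby sketch. `CountingSerializer` and `MemoizingReader` are hypothetical classes written for this illustration (the serializer is assumed to be any object responding to `dump` and `load`); they are not part of Shrine.

```ruby
require "json"

# Hypothetical serializer that counts how often parsing actually happens.
class CountingSerializer
  attr_reader :loads

  def initialize
    @loads = 0
  end

  def dump(data)
    JSON.generate(data)
  end

  def load(string)
    @loads += 1
    JSON.parse(string)
  end
end

# Hypothetical attacher-like reader: parses the column once, then reuses it.
class MemoizingReader
  def initialize(column_value, serializer:)
    @column_value = column_value
    @serializer   = serializer
  end

  def file_data
    @file_data ||= @serializer.load(@column_value)
  end
end

serializer = CountingSerializer.new
reader     = MemoizingReader.new('{"id":"abc123.jpg","storage":"store"}', serializer: serializer)

first  = reader.file_data # parses the JSON string
second = reader.file_data # returns the memoized Hash, no parsing
```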

Standardized persistence API

The persistence API has now been standardized across different persistence plugins:

Method	Description
`Attacher#persist`	persists attachment data
`Attacher#atomic_persist`	persists attachment data if attachment hasn’t changed
`Attacher#atomic_promote`	promotes cached file and atomically persists changes

All persistence plugins will now share this API:

activerecord
sequel
mongoid
rom
hanami
…

This will make it easier for 3rd-party plugins to be agnostic to your persistence backend, as they’ll be able to just call Attacher#persist and know it will do the right thing. It even works if the user has multiple persistence plugins loaded simultaneously.
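From the point of view of a plugin author, this is plain duck typing. In the sketch below, `OrmBackedAttacher`, `RepositoryBackedAttacher` and `touch_and_persist` are hypothetical names invented for illustration; the only thing taken from the table above is that every attacher responds to `persist`.

```ruby
# Hypothetical attachers, each backed by a different (fake) persistence layer.
class OrmBackedAttacher
  attr_reader :saved

  def initialize
    @saved = []
  end

  def persist
    @saved << :orm_save
  end
end

class RepositoryBackedAttacher
  attr_reader :saved

  def initialize
    @saved = []
  end

  def persist
    @saved << :repository_update
  end
end

# Hypothetical third-party plugin code: it relies only on #persist.
def touch_and_persist(attacher)
  # ... change the attachment here ...
  attacher.persist
end

attachers = [OrmBackedAttacher.new, RepositoryBackedAttacher.new]
attachers.each { |attacher| touch_and_persist(attacher) }
```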

Conclusion

I still stand firmly by the decision that Shrine will never paint you into a corner, regardless of which database library and web framework you’re using, or which design patterns you prefer. At times following this philosophy can be very challenging, but ultimately it’s very rewarding, because you know you’ve ended up with a design that’s really solid.

I’ve released 3.0.0.beta with these new changes, so if you want you can already use them:

$ gem install --pre shrine
Successfully installed shrine-3.0.0.beta

I still need to finish up the backwards compatibility layer, write the upgrading guide and full release notes, as well as finish updating the documentation and other shrine-* gems. Once that’s done, Shrine 3.0 should be out the door.