Better File Uploads with Shrine: Metadata
Shrine has very flexible and customizable support for saving file metadata.
Whenever Shrine is about to upload a file, it extracts available metadata from
the file and adds it to the returned Shrine::UploadedFile object.
uploaded_file = uploader.upload(file)
uploaded_file #=> #<Shrine::UploadedFile>
uploaded_file.metadata #=>
# {
# "filename" => "nature.jpg",
# "mime_type" => "image/jpeg",
# "size" => 2859343
# }
uploaded_file.original_filename #=> "nature.jpg"
uploaded_file.extension #=> "jpg"
uploaded_file.mime_type #=> "image/jpeg"
uploaded_file.size #=> 2859343
Most file attachment libraries support saving file metadata into additional columns. However, that means you need a separate database column for each piece of metadata you want to save.
# This is lame
add_column :photos, :image_filename, :string
add_column :photos, :image_type, :string
add_column :photos, :image_size, :integer
add_column :photos, :image_width, :integer
add_column :photos, :image_height, :integer
# ...
Shrine takes a much simpler approach here. Since it uses a single <attachment>_data
database column to save the serialized Shrine::UploadedFile object, any metadata
included in this object will also get saved to the same column.
photo = Photo.create(image: file)
photo.image_data #=>
# {
# "id": "ee08af.jpg",
# "storage": "store",
# "metadata": {
# "filename": "nature.jpg",
# "mime_type": "image/jpeg",
# "size": 2859343
# }
# }
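To make the single-column idea concrete, here is a minimal plain-Ruby sketch that needs no Shrine at all. The JSON string stands in for a hypothetical `image_data` column value. Parsing it shows that every piece of metadata travels with the serialized file, with no extra columns.

```ruby
require "json"

# Hypothetical contents of an `image_data` column (assumed JSON text).
image_data = <<~JSON
  {
    "id": "ee08af.jpg",
    "storage": "store",
    "metadata": {
      "filename": "nature.jpg",
      "mime_type": "image/jpeg",
      "size": 2859343
    }
  }
JSON

# Parsing the column yields a plain hash; metadata lives under one key.
data     = JSON.parse(image_data)
metadata = data.fetch("metadata")

puts metadata["filename"]
puts metadata["size"]
```

Adding a new metadata key later only changes the contents of this hash, so no migration would be needed.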
Furthermore, when you’re storing processed files alongside the main file, Shrine automatically extracts and saves metadata of each file:
class ImageUploader < Shrine
  plugin :derivatives
  Attacher.derivatives_processor do |original|
    magick = ImageProcessing::MiniMagick.source(original)
    {
      small: magick.resize_to_limit!(300, 300),
      medium: magick.resize_to_limit!(500, 500),
      large: magick.resize_to_limit!(800, 800),
    }
  end
end
photo.image_derivatives!
photo.image_data
# {
# "id": "ee08af.jpg",
# "storage": "store",
# "metadata": { ... },
# "derivatives": {
# "small": { "id": "f9f67a.jpg", "storage": "store", "metadata": { ... } },
# "medium": { "id": "5e27cf.jpg", "storage": "store", "metadata": { ... } },
# "large": { "id": "ff1da8.jpg", "storage": "store", "metadata": { ... } },
# }
# }
photo.image(:small).size #=> 21496
photo.image(:medium).size #=> 98237
photo.image(:large).size #=> 150383
MIME type
Shrine doesn’t have any mandatory dependency for extracting MIME type, so by
default it is inherited from #content_type of the input file if available.
However, this attribute on uploaded files is set by Rack from the Content-Type
request header, which was set by the browser solely based on the file extension.
This means that by default Shrine’s mime_type metadata is not guaranteed to
hold the actual MIME type of the file (e.g. the user can upload a PHP file with
a .jpg extension). This might sound like Shrine is not secure by default, but
you do get a warning in the console when #content_type is used. And in some
scenarios this might be exactly what you want.
Shrine comes with a determine_mime_type plugin, which determines MIME type
from file content. It uses tools that know how to read “magic headers” of
files, and saves the result into mime_type (which can then be validated).
# Gemfile
gem "marcel"
Shrine.plugin :determine_mime_type, analyzer: :marcel
File.write("image.png", "<?php ... ?>") # PHP file with a .png extension
uploaded_file = uploader.upload(File.open("image.png"))
uploaded_file.mime_type #=> "text/x-php"
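As a rough illustration of what reading “magic headers” means, the sketch below guesses a MIME type from the first bytes of the content and ignores the file name. The `MAGIC_SIGNATURES` table and the `sniff_mime_type` helper are hypothetical and cover only a few formats. Real analyzers such as Marcel recognise far more and work differently internally.

```ruby
require "stringio"

# Illustrative signature table (an assumption, not any library's data).
MAGIC_SIGNATURES = {
  "\x89PNG\r\n\x1A\n".b => "image/png",
  "\xFF\xD8\xFF".b      => "image/jpeg",
  "%PDF-".b             => "application/pdf",
  "<?php".b             => "text/x-php",
}.freeze

# Hypothetical helper: returns a MIME type guessed from leading bytes, or nil.
def sniff_mime_type(io)
  header = io.read(16).to_s.b # read a small prefix as binary
  io.rewind                   # leave the IO ready for the actual upload
  MAGIC_SIGNATURES.each do |signature, mime_type|
    return mime_type if header.start_with?(signature)
  end
  nil
end

# Content that claims to be a PNG by name only; the name never enters the check.
php_as_png = StringIO.new("<?php ... ?>")
puts sniff_mime_type(php_as_png)
```

Because only the bytes are inspected, renaming the file does not change the result.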
If the :marcel analyzer doesn’t suit you, you can choose a different
analyzer, or even combine them:
Shrine.plugin :determine_mime_type, analyzer: -> (io, analyzers) do
  analyzers[:mimemagic].call(io) || analyzers[:file].call(io)
end
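The combination above is a fallback chain: ask one analyzer, and use the next only if the first returns nothing. A plain-Ruby sketch of that pattern follows. The two lambdas are hypothetical stand-ins, not Shrine's built-in analyzers, and `first_mime_type` is a made-up helper name.

```ruby
require "stringio"

# Hypothetical analyzers: each takes an IO and returns a MIME type or nil.
analyzers = {
  signature: ->(io) { io.read(5) == "%PDF-" ? "application/pdf" : nil },
  fallback:  ->(_io) { "application/octet-stream" },
}

# Try analyzers in insertion order and keep the first non-nil answer.
def first_mime_type(io, analyzers)
  analyzers.each_value do |analyzer|
    io.rewind # every analyzer starts reading from the beginning
    result = analyzer.call(io)
    return result if result
  end
  nil
end

puts first_mime_type(StringIO.new("%PDF-1.7 ..."), analyzers)
puts first_mime_type(StringIO.new("plain text"), analyzers)
```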
Image dimensions
Image dimensions can be extracted by loading the store_dimensions plugin:
# Gemfile
gem "fastimage" # default analyzer
class ImageUploader < Shrine
  plugin :store_dimensions
end
image = image_uploader.upload(file)
image.metadata #=>
# {
# "filename" => "nature.jpg",
# "mime_type" => "image/jpeg",
# "size" => 90423,
# "width" => 500,
# "height" => 400,
# }
image.width #=> 500
image.height #=> 400
image.dimensions #=> [500, 400]
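Dimension analyzers generally read only a file's header, not the whole image. As a minimal sketch of the idea for one format, a PNG stores width and height as two big-endian 32-bit integers in its IHDR chunk, right after the 8-byte signature. The `png_dimensions` helper below is hypothetical and is not how FastImage is implemented.

```ruby
require "stringio"

PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b

# Hypothetical helper: returns [width, height] for PNG content, or nil.
def png_dimensions(io)
  header = io.read(24).to_s.b
  io.rewind
  return nil unless header.bytesize == 24 && header.start_with?(PNG_SIGNATURE)
  return nil unless header[12, 4] == "IHDR"
  header[16, 8].unpack("NN") # two big-endian 32-bit integers
end

# Build just enough of a PNG header to describe a 500x400 image.
fake_png = PNG_SIGNATURE + [13].pack("N") + "IHDR".b + [500, 400].pack("NN")

width, height = png_dimensions(StringIO.new(fake_png))
puts "#{width}x#{height}"
```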
Custom metadata
In addition to built-in metadata, Shrine allows you to easily extract and save
custom metadata, with the add_metadata plugin.
class DocumentUploader < Shrine
  plugin :add_metadata
  add_metadata :pages do |io|
    PDF::Reader.new(io.path).page_count
  end
end
pdf = document_uploader.upload(cv)
pdf.metadata #=>
# {
# "filename" => "curriculum-vitae.pdf",
# "mime_type" => "application/pdf",
# "size" => 49234,
# "pages" => 5
# }
pdf.pages #=> 5
Notice that it also generated a #pages reader method on the Shrine::UploadedFile
object. I think it’s nice to be able to extend Shrine objects with methods that
fit your domain.
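One way such reader methods could be generated is with `define_method`, as in the sketch below. `MiniUploadedFile` is a hypothetical miniature written for illustration. It is not Shrine's class and does not claim to mirror its internals.

```ruby
# Hypothetical miniature of an uploaded-file object; not Shrine's actual class.
class MiniUploadedFile
  attr_reader :metadata

  def initialize(metadata)
    @metadata = metadata
  end

  # Define one reader per metadata key, e.g. `metadata_method :pages`.
  def self.metadata_method(*names)
    names.each do |name|
      define_method(name) { metadata[name.to_s] }
    end
  end
end

MiniUploadedFile.metadata_method :pages

pdf = MiniUploadedFile.new({ "filename" => "curriculum-vitae.pdf", "pages" => 5 })
puts pdf.pages
```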
If you’re using a tool which extracts multiple metadata values at once, the
add_metadata plugin supports returning a hash as well.
class VideoUploader < Shrine
  plugin :add_metadata
  add_metadata do |io|
    movie = FFMPEG::Movie.new(io.path)
    { "duration" => movie.duration,
      "bitrate" => movie.bitrate,
      "resolution" => movie.resolution,
      "frame_rate" => movie.frame_rate }
  end
  metadata_method :duration, :bitrate, :resolution, :frame_rate
end
video = video_uploader.upload(file)
video.duration #=> 7.5
video.bitrate #=> 481
video.resolution #=> "640x480"
video.frame_rate #=> 16.72
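Conceptually, a block that returns a hash contributes several keys to the metadata in one pass. The sketch below shows that merging step in plain Ruby. The `extractors` list and the `extract_metadata` helper are hypothetical names, and text statistics stand in for video attributes so the example runs without FFmpeg.

```ruby
require "stringio"

# Hypothetical extractors: each returns a hash of one or more metadata values.
extractors = [
  ->(io) { { "size" => io.size } },
  ->(io) {
    text = io.read
    { "lines" => text.lines.count, "words" => text.split.size }
  },
]

# Merge the hash returned by every extractor into a single metadata hash.
def extract_metadata(io, extractors)
  extractors.each_with_object({}) do |extractor, metadata|
    io.rewind
    metadata.merge!(extractor.call(io))
  end
end

metadata = extract_metadata(StringIO.new("one two\nthree\n"), extractors)
p metadata
```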
Storage metadata
In addition to extracting the metadata on your side, Shrine also gives the storage itself the ability to update the metadata after uploading. Some storages like filesystem and Amazon S3 won’t use this, but many other storage services extract file metadata during uploading.
For example, when you’re uploading images to Cloudinary, shrine-cloudinary
will automatically update the size, mime_type, width and height metadata
values. This is especially useful if you’re processing the image on upload
with Cloudinary, because then the metadata that Shrine extracted won’t match
the uploaded file, since it was extracted before the upload.
uploaded_file = cloudinary_uploader.upload(image, upload_options: {
format: "png",
width: 800,
height: 800,
crop: :limit,
})
uploaded_file.metadata #=>
# {
# "filename" => "nature.jpg",
# "mime_type" => "image/png",
# "size" => 8584,
# "width" => 800,
# "height" => 600,
# }
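The mechanism can be pictured as the storage receiving the metadata hash and mutating it during upload. The sketch below is a toy in-memory storage under that assumption. `TrimmingStorage` is hypothetical, the `shrine_metadata:` keyword is how this sketch assumes a storage receives the hash, and whitespace trimming stands in for remote image processing.

```ruby
require "stringio"

# Hypothetical in-memory storage that "processes" content while uploading.
class TrimmingStorage
  attr_reader :files

  def initialize
    @files = {}
  end

  # `shrine_metadata:` is an assumed keyword for the mutable metadata hash.
  def upload(io, id, shrine_metadata: {}, **_options)
    content = io.read.strip # stand-in for processing done by a remote service
    @files[id] = content
    # Overwrite the value that no longer describes the stored file.
    shrine_metadata["size"] = content.bytesize
  end
end

# Metadata extracted before the upload describes the unprocessed input.
metadata = { "filename" => "notes.txt", "size" => 11 }
storage  = TrimmingStorage.new
storage.upload(StringIO.new("  notes   \n"), "abc.txt", shrine_metadata: metadata)

puts metadata["size"]
```

After the call, the hash reflects the stored content rather than the original input.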
Cloudinary also has the ability to automatically generate responsive breakpoints, and the ability to update the metadata allows the storage to store the information about the generated breakpoints.
uploaded_file = cloudinary_uploader.upload(image, upload_options: {
responsive_breakpoints: {
bytes_step: 20000,
min_width: 200,
max_width: 1000,
max_images: 20,
}
})
uploaded_file.metadata["cloudinary"]["responsive_breakpoints"] #=>
# [{
# "breakpoints": [
# {
# "width": 1000,
# "height": 667,
# "bytes": 79821,
# "url": "http://res.cloudinary.com/demo/image/upload/c_scale,w_1000/v1453637947/dog.jpg",
# "secure_url": "https://res.cloudinary.com/demo/image/upload/c_scale,w_1000/v1453637947/dog.jpg"
# },
# ...
# ]
# }]
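Once the breakpoints are stored in metadata, they can be used without calling the service again, for instance to build an HTML `srcset` value. The sketch below assumes the stored data is an array of entries, each with a "breakpoints" array of hashes containing "width" and "secure_url". The URLs are made-up placeholders.

```ruby
# Hypothetical breakpoint data in the assumed shape (placeholder URLs).
responsive_breakpoints = [
  { "breakpoints" => [
    { "width" => 1000, "secure_url" => "https://example.com/w_1000/dog.jpg" },
    { "width" => 600,  "secure_url" => "https://example.com/w_600/dog.jpg" },
  ] },
]

# Flatten all entries, order by width, and format as "<url> <width>w" pairs.
srcset = responsive_breakpoints
  .flat_map { |entry| entry["breakpoints"] }
  .sort_by { |breakpoint| breakpoint["width"] }
  .map { |breakpoint| "#{breakpoint["secure_url"]} #{breakpoint["width"]}w" }
  .join(", ")

puts srcset
```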
Some other storages that use the ability to update metadata include shrine-flickr, shrine-transloadit and shrine-uploadcare.
Summary
We’ve seen how Shrine automatically extracts metadata before upload, which is then stored into the same database column. You can determine MIME type and extract image dimensions from file content using a variety of tools, just by loading a corresponding plugin. You can also extract any custom metadata, and storage services can add their own metadata as well.