My journey with the Symfony MIME type guesser
Symfony provides a lot of helpful components, and among them is the Mime component.
The Mime component allows manipulating the MIME messages used to send emails and provides utilities related to MIME types.
“and provides utilities related to MIME types”
Ok, but what is a MIME type ❓
📖 A MIME type is a standard way to classify files so that computers know how to handle them. A file with the MIME type application/pdf will be opened in a PDF reader, a file with the MIME type text/html will be opened in a browser, a file with the MIME type image/jpeg will be opened as an image…
💡 A MIME type has 2 parts separated by a “/”: the type and the subtype.
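To make the two parts concrete, here is a minimal sketch in plain PHP. The function name splitMimeType is hypothetical and only there for illustration, it is not part of Symfony.

```php
<?php

declare(strict_types=1);

/**
 * Splits a MIME type such as "application/pdf" into [type, subtype].
 * Hypothetical helper for illustration; not part of Symfony.
 */
function splitMimeType(string $mimeType): array
{
    // Drop optional parameters such as "; charset=utf-8".
    $essence = trim(explode(';', $mimeType, 2)[0]);
    $parts = explode('/', $essence, 2);

    if (2 !== count($parts) || '' === $parts[0] || '' === $parts[1]) {
        throw new InvalidArgumentException(sprintf('"%s" is not a valid MIME type.', $mimeType));
    }

    return $parts;
}

[$type, $subtype] = splitMimeType('application/pdf');
echo $type, ' / ', $subtype, PHP_EOL; // prints "application / pdf"
```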
I can hear you: 🗣️ “Thanks dude, but why should I know this stuff as a developer?”
Me: 🗣️ “When you want to validate a file, you use the MIME type to assert that the file corresponds to what you want: an image for an avatar, a PDF for an invoice, and so on...”
Whatever the kind of file you expect, you should validate it, otherwise you’ll have a surprise 💥.
To do this, you have probably already used the File constraint:
// src/Entity/Author.php
namespace App\Entity;

use Symfony\Component\Validator\Constraints as Assert;

class Author
{
    /**
     * @Assert\File(
     *     maxSize = "1024k",
     *     mimeTypes = {"application/pdf", "application/x-pdf"},
     *     mimeTypesMessage = "Please upload a valid PDF"
     * )
     */
    protected $bioFile;
}
In the mimeTypes option, you pass the allowed MIME types for this property. Behind the scenes, the FileValidator uses the MIME type guesser to guess the file’s MIME type and compares it to these values. If it’s not allowed, it adds a violation.
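To picture that comparison, here is a simplified sketch in plain PHP. It is not the FileValidator code: the function name isMimeTypeAllowed is hypothetical, and the real validator also handles sizes, messages and more. It only shows the idea of checking a guessed MIME type against an allow list, including wildcard entries such as image/*, which the File constraint accepts.

```php
<?php

declare(strict_types=1);

/**
 * Simplified illustration of comparing a guessed MIME type with an allow list.
 * Hypothetical helper; this is NOT the actual FileValidator implementation.
 */
function isMimeTypeAllowed(string $guessed, array $allowed): bool
{
    foreach ($allowed as $candidate) {
        // Exact match, e.g. "application/pdf".
        if ($candidate === $guessed) {
            return true;
        }

        // Wildcard match, e.g. "image/*" accepts "image/png".
        $discrete = strstr($candidate, '/*', true);
        if (false !== $discrete && strstr($guessed, '/', true) === $discrete) {
            return true;
        }
    }

    return false;
}

var_dump(isMimeTypeAllowed('application/pdf', ['application/pdf', 'application/x-pdf'])); // bool(true)
var_dump(isMimeTypeAllowed('application/zip', ['application/pdf', 'application/x-pdf'])); // bool(false)
```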
How Symfony guesses the MIME type 🔍
To guess the MIME type, Symfony has a helper: the class Symfony\Component\Mime\MimeTypes.
Honestly, the code is well documented and commented, but let’s explain what we need to know:
📚 The class uses 2 classes called “guessers” that are registered in the constructor: the FileBinaryMimeTypeGuesser (based on the file command-line binary) and the FileinfoMimeTypeGuesser (based on the PHP fileinfo extension). When you guess the MIME type, the code iterates over the guessers and returns the first non-null result.
By default it uses FileinfoMimeTypeGuesser first, but you can change the order by calling the method registerGuesser with a FileBinaryMimeTypeGuesser instance as a parameter, or you can register a custom guesser.
💻 Now that we know how it works, let’s see an example. Suppose you want to create a PDF file from a path. We will use it like this:
I will add a dump in the method guessMimeType so we can see the order of the guessers:
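A minimal sketch of such a call could look like this. It assumes symfony/mime is installed with Composer, and the path invoice.pdf is hypothetical.

```php
<?php

// Assumption: symfony/mime is installed via Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Mime\MimeTypes;

$mimeTypes = new MimeTypes();

// Hypothetical path: replace it with a real, readable file.
$path = __DIR__.'/invoice.pdf';

// Returns the guessed MIME type, or null if no guesser could identify the file.
$mimeType = $mimeTypes->guessMimeType($path);

var_dump($mimeType);
```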
// MimeTypes.php
public function guessMimeType(string $path): ?string
{
    dd($this->guessers); // temporary dump to inspect the order of the guessers

    foreach ($this->guessers as $guesser) {
        if (!$guesser->isGuesserSupported()) {
            continue;
        }

        if (null !== $mimeType = $guesser->guessMimeType($path)) {
            return $mimeType;
        }
    }

    // ...
}
👆 As you can see, MimeTypes will guess the path with Fileinfo first and then with FileBinary.
💡 As its name says, a guesser tries to guess, and unfortunately it can guess wrong. From my experience, the FileBinaryMimeTypeGuesser gives better results, so let’s use it first.
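Here is a minimal sketch of that reordering, again assuming symfony/mime is installed with Composer and using a hypothetical path. Keep in mind that FileBinaryMimeTypeGuesser relies on the file command being available on the system; when it is not, isGuesserSupported() returns false and MimeTypes moves on to the next guesser.

```php
<?php

// Assumption: symfony/mime is installed via Composer in this directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Mime\FileBinaryMimeTypeGuesser;
use Symfony\Component\Mime\MimeTypes;

$mimeTypes = new MimeTypes();

// registerGuesser() puts the guesser at the front of the list,
// so FileBinary is now tried before Fileinfo on this instance.
$mimeTypes->registerGuesser(new FileBinaryMimeTypeGuesser());

// Hypothetical path: replace it with a real, readable file.
var_dump($mimeTypes->guessMimeType(__DIR__.'/invoice.pdf'));
```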
👀 Let’s dump again and see our registered guessers.
Now we use FileBinaryMimeTypeGuesser first, but it’s not optimal: we instantiate the helper ourselves and then reorder the guessers, which means I would have to do this everywhere I need the helper in my codebase 🙁
Let’s improve this by using the helper as a service. The good news is that Symfony already registers MimeTypes as a service with the id mime_types, aliased to the interface MimeTypesInterface. We just have to reorder the guessers in the service definition and we’re done: everywhere we need it, we just have to inject the interface 🎉.
With a compiler pass, I will do exactly what we did to register the guesser, but on the mime_types service definition 👇
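A sketch of what this can look like, assuming the kernel itself acts as the compiler pass (a kernel that implements CompilerPassInterface is registered as a pass automatically). The rest of the kernel is omitted, and the inline Definition is my choice for illustration, not necessarily the only way to do it.

```php
<?php
// src/Kernel.php (sketch: other kernel code omitted)

namespace App;

use Symfony\Bundle\FrameworkBundle\Kernel\MicroKernelTrait;
use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Definition;
use Symfony\Component\HttpKernel\Kernel as BaseKernel;
use Symfony\Component\Mime\FileBinaryMimeTypeGuesser;

class Kernel extends BaseKernel implements CompilerPassInterface
{
    use MicroKernelTrait;

    public function process(ContainerBuilder $container): void
    {
        if (!$container->hasDefinition('mime_types')) {
            return;
        }

        // Same call as on a manual instance: registerGuesser() puts the
        // guesser at the front of the list of the shared service.
        $container->getDefinition('mime_types')
            ->addMethodCall('registerGuesser', [new Definition(FileBinaryMimeTypeGuesser::class)]);
    }
}
```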
Now we’re good: anywhere I inject MimeTypesInterface, the MIME type will be guessed with FileBinary first. Let’s check that everything is OK with my dump:
It looks pretty good 🎉
Add a Custom MimeTypeGuesser
⚠️ Depending on the MIME type you want to validate, this configuration can be enough, but sometimes the guessers (FileBinary or Fileinfo) are unable to guess the real MIME type.
For Docx files, you can get wrong results, and there are a ton of issues on the web related to this. Sometimes the guesser returns application/octet-stream (the MIME type used to indicate that the body contains arbitrary data), which is too vague to be trusted, and sometimes it returns application/zip, because a Docx is a zip archive (you can open it), but a zip file is not necessarily a Docx.
ℹ️ The MIME type for a Docx is application/vnd.openxmlformats-officedocument.wordprocessingml.document
From my research, the only way to be sure that a file is a Docx is to inspect it 😅.
🔧 Let’s do this only when the guesser returns application/octet-stream or application/zip. If the guesser returns application/vnd.openxmlformats-officedocument.wordprocessingml.document, we are good and there is nothing to do.
🗺️ The plan is to use the FileBinaryMimeTypeGuesser and to extend its behavior to inspect the file when it could be a Docx. We will use the proxy design pattern.
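Here is a sketch of such a proxy. The class name, its namespace and the inspection heuristic are my assumptions: the check looks inside the archive for [Content_Types].xml and word/document.xml, which a typical Docx contains, and it needs the PHP zip extension. Treat it as a heuristic, not a full validation of the document.

```php
<?php
// src/Mime/DocxAwareMimeTypeGuesser.php (hypothetical name and location)

namespace App\Mime;

use Symfony\Component\Mime\FileBinaryMimeTypeGuesser;
use Symfony\Component\Mime\MimeTypeGuesserInterface;

final class DocxAwareMimeTypeGuesser implements MimeTypeGuesserInterface
{
    private const DOCX = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';
    private const AMBIGUOUS = ['application/octet-stream', 'application/zip'];

    private FileBinaryMimeTypeGuesser $inner;

    public function __construct()
    {
        // Created here on purpose, so that autowiring has nothing to resolve.
        $this->inner = new FileBinaryMimeTypeGuesser();
    }

    public function isGuesserSupported(): bool
    {
        return $this->inner->isGuesserSupported() && class_exists(\ZipArchive::class);
    }

    public function guessMimeType(string $path): ?string
    {
        $mimeType = $this->inner->guessMimeType($path);

        // Only inspect the file when the inner result is too vague to be trusted.
        if (\in_array($mimeType, self::AMBIGUOUS, true) && $this->looksLikeDocx($path)) {
            return self::DOCX;
        }

        return $mimeType;
    }

    private function looksLikeDocx(string $path): bool
    {
        $zip = new \ZipArchive();
        if (true !== $zip->open($path)) {
            return false; // not a readable zip archive
        }

        // Heuristic (assumption): these two entries are present in a typical Docx.
        $isDocx = false !== $zip->locateName('[Content_Types].xml')
            && false !== $zip->locateName('word/document.xml');
        $zip->close();

        return $isDocx;
    }
}
```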
Thanks to autoconfiguration, as my proxy implements MimeTypeGuesserInterface, it will be automatically registered as a guesser and will be used first, so we can remove all our stuff in the kernel.
As we see in the dump, our decorator is correctly registered and comes first 🎉
🌟 The other cool thing is that Symfony sets the mime_types service, which contains our decorator, as the default instance of the MimeTypes class (see snippet below).
// src/Symfony/Bundle/FrameworkBundle/Resources/config/mime_type.php
return static function (ContainerConfigurator $container) {
    $container->services()
        ->set('mime_types', MimeTypes::class)
            ->call('setDefault', [service('mime_types')])

        ->alias(MimeTypesInterface::class, 'mime_types')
        ->alias(MimeTypeGuesserInterface::class, 'mime_types')
    ;
};
It means that even if you use the static accessor MimeTypes::getDefault(), as the FileValidator and the HttpFoundation\File do, you will have the decorator in first position.
🔥 So your custom guesser is used by the service MimeTypesInterface, by the HttpFoundation\File and by the FileValidator.
💻 Let’s check this:
dd(MimeTypes::getDefault());
👀 Output:
And voilaaaaa! 🚀
👏 👏 Congratulations! Now you know what a MIME type is, how Symfony guesses a MIME type to validate a file, and how to create a custom guesser to fix edge cases 😄.