My journey with the Symfony MIME type guesser

Smaine Milianni
5 min read · Jul 3, 2022

Symfony provides a lot of helpful components. Among them is the Mime component.

The Mime component allows manipulating the MIME messages used to send emails and provides utilities related to MIME types.

Ok, but what is a MIME type ❓

📖 A MIME type is a standard way to classify files so that computers know how to handle them. A file with the MIME type application/pdf will be opened in a PDF reader, a file with the MIME type text/html will be opened in a browser, a file with the MIME type image/jpeg will be opened as an image…

💡 A MIME type has two parts separated by a “/”: the type and the subtype.
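To make the two parts concrete, here is a minimal sketch in plain PHP. The function name splitMimeType is made up for this illustration and is not part of Symfony.

```php
<?php

/**
 * Splits a MIME type such as "application/pdf" into its type and subtype.
 * Hypothetical helper, written only to illustrate the "type/subtype" layout.
 */
function splitMimeType(string $mimeType): array
{
    // Limit to 2 parts so anything after the first "/" stays in the subtype.
    $parts = explode('/', $mimeType, 2);

    if (2 !== count($parts) || '' === $parts[0] || '' === $parts[1]) {
        throw new InvalidArgumentException(sprintf('"%s" is not a valid MIME type.', $mimeType));
    }

    return ['type' => $parts[0], 'subtype' => $parts[1]];
}

$pdf = splitMimeType('application/pdf');
echo $pdf['type'], ' / ', $pdf['subtype'], PHP_EOL;
```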

I can hear you: 🗣️ “Thanks dude, but why should I know this stuff as a developer?”

Me: 🗣️ “When you want to validate a file, you use the MIME type to assert that the file is what you expect: an image for an avatar, a PDF for an invoice, and so on...”

Whatever kind of file you expect, you should validate it, otherwise you’ll have a surprise 💥.

To do this, you have probably already used the File constraint:

// src/Entity/Author.php
namespace App\Entity;

use Symfony\Component\Validator\Constraints as Assert;

class Author
{
    /**
     * @Assert\File(
     *     maxSize = "1024k",
     *     mimeTypes = {"application/pdf", "application/x-pdf"},
     *     mimeTypesMessage = "Please upload a valid PDF"
     * )
     */
    protected $bioFile;
}

In the mimeTypes option, you pass the MIME types allowed for this property. Behind the scenes, the FileValidator uses the MIME type guesser to guess the file's MIME type and compares it to these values. If the guessed type is not allowed, it adds a violation.

How Symfony guesses the MIME type 🔍
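The comparison step can be pictured roughly as follows. This is a simplified sketch and not the FileValidator source: the function name isMimeTypeAllowed is made up for the illustration, and the wildcard branch is there because the mimeTypes option also accepts values such as image/*.

```php
<?php

/**
 * Returns true when the guessed MIME type matches one of the allowed values.
 * Hypothetical helper; the real validator adds a violation instead of returning a bool.
 */
function isMimeTypeAllowed(string $guessed, array $allowed): bool
{
    foreach ($allowed as $candidate) {
        // Exact match, e.g. "application/pdf".
        if ($candidate === $guessed) {
            return true;
        }

        // Wildcard match, e.g. "image/*" accepts "image/png".
        $discreteType = strstr($candidate, '/*', true);
        if (false !== $discreteType && strstr($guessed, '/', true) === $discreteType) {
            return true;
        }
    }

    return false;
}

$allowed = ['application/pdf', 'application/x-pdf'];
var_dump(isMimeTypeAllowed('application/pdf', $allowed));
var_dump(isMimeTypeAllowed('application/zip', $allowed));
```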

To guess the MIME type Symfony has a helper, the class Symfony\Component\Mime\MimeTypes.

Honestly, the code is well documented and commented, but let’s explain what we need to know:

📚 The class uses two classes called “guessers” that are registered in the constructor: the FileBinaryMimeTypeGuesser (based on the file command-line binary) and the FileinfoMimeTypeGuesser (based on the PHP Fileinfo extension). When you guess a MIME type, the code iterates over the guessers and returns the first non-null result.

By default it uses the FileinfoMimeTypeGuesser first, but you can change the order by calling the method registerGuesser with a FileBinaryMimeTypeGuesser as a parameter, or you can register a custom guesser.

💻 Now that we know how it works, let’s see an example. Suppose you want to guess the MIME type of a PDF file from its path. We will use it like this:
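A minimal sketch of that usage, assuming symfony/mime is installed and Composer's autoloader is already loaded (as it is inside a Symfony app). The temporary file with a PDF header is only there so the snippet has a real file to inspect; in your code you would pass the path of your own file.

```php
<?php

use Symfony\Component\Mime\MimeTypes;

// Illustration only: create a small file that starts with a PDF header.
$path = tempnam(sys_get_temp_dir(), 'mime');
file_put_contents($path, "%PDF-1.4\n");

// The helper registers its two built-in guessers in its constructor.
$mimeTypes = new MimeTypes();

// Returns the first non-null guess, or null when no guesser recognises the file.
$mimeType = $mimeTypes->guessMimeType($path);

var_dump($mimeType);

unlink($path);
```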

I will add a dump in the method guessMimeType so we can see the order of guessers:

// MimeTypes.php
public function guessMimeType(string $path): ?string
{
    dd($this->guessers); // temporary dump to inspect the order of the guessers
    foreach ($this->guessers as $guesser) {
        if (!$guesser->isGuesserSupported()) {
            continue;
        }
        if (null !== $mimeType = $guesser->guessMimeType($path)) {
            return $mimeType;
        }
    }
    // ...
}

👆 As you can see, MimeTypes will guess the path with Fileinfo first and then with FileBinary.

💡 As the name suggests, a guesser tries to guess and, unfortunately, it can guess wrong. In my experience, the FileBinaryMimeTypeGuesser gives better results, so let’s use it first.
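A minimal sketch of the reordering, again assuming symfony/mime is installed and autoloaded. It relies on registerGuesser putting the newly registered guesser at the front of the list, which is the behaviour described above.

```php
<?php

use Symfony\Component\Mime\FileBinaryMimeTypeGuesser;
use Symfony\Component\Mime\MimeTypes;

$mimeTypes = new MimeTypes();

// The guesser registered last is consulted first,
// so FileBinary now runs before Fileinfo.
$mimeTypes->registerGuesser(new FileBinaryMimeTypeGuesser());
```

Note that the built-in FileBinaryMimeTypeGuesser registered by the constructor is still in the list, so this guesser now appears twice: once at the front and once at the back.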

👀 Let’s dump again and see our registered guessers.

Now we use the FileBinaryMimeTypeGuesser first, but it’s not optimal because we instantiate the helper ourselves and then reorder the guessers. It means that everywhere I need this helper in my codebase, I have to do this again 🙁

Let’s improve this by using the helper as a service. The good news is that Symfony already registers MimeTypes as a service, with the id mime_types and an alias for the interface MimeTypesInterface. We just have to reorder the guessers in the service definition and we’re done: everywhere we need it, we just inject the interface 🎉.

With a compiler pass, I will do exactly what we did to register the guesser, but on the MimeTypes service 👇

Now we’re good: anywhere I inject MimeTypesInterface, the MIME type will be guessed with FileBinary first. Let’s check that everything is OK with my dump:
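One way to write such a compiler pass is to let the application kernel implement CompilerPassInterface. This is a sketch under assumptions: it targets a standard Symfony app whose kernel uses MicroKernelTrait, and passing the guesser as an inline Definition is my choice for brevity, you could equally reference a dedicated service.

```php
<?php
// src/Kernel.php

namespace App;

use Symfony\Bundle\FrameworkBundle\Kernel\MicroKernelTrait;
use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Definition;
use Symfony\Component\HttpKernel\Kernel as BaseKernel;
use Symfony\Component\Mime\FileBinaryMimeTypeGuesser;

class Kernel extends BaseKernel implements CompilerPassInterface
{
    use MicroKernelTrait;

    public function process(ContainerBuilder $container): void
    {
        // Nothing to do if the Mime component is not wired in this app.
        if (!$container->hasDefinition('mime_types')) {
            return;
        }

        // Same call as before, but applied when the mime_types service is built.
        $container->getDefinition('mime_types')
            ->addMethodCall('registerGuesser', [new Definition(FileBinaryMimeTypeGuesser::class)]);
    }
}
```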

It looks pretty good 🎉

Add a Custom MimeTypeGuesser

⚠️ Depending on the MIME type you want to validate, this configuration can be enough, but sometimes the guessers (FileBinary or Fileinfo) are unable to guess the real MIME type.

Docx files are a good example of wrong results, and there are plenty of issues on the web about them. Sometimes the guesser returns application/octet-stream (the MIME type used to indicate that the body contains arbitrary data), which is too vague to be trusted. Sometimes it returns application/zip, because a Docx is a zip archive (you can open it as one), but a zip file is not necessarily a Docx.

ℹ️ The MIME type for a Docx is application/vnd.openxmlformats-officedocument.wordprocessingml.document

From my research the only way to be sure that a file is a Docx is to inspect it 😅.

🔧 Let’s do this only when the guesser returns application/octet-stream or application/zip. If the guesser returns application/vnd.openxmlformats-officedocument.wordprocessingml.document, we are good and there is nothing to do.

🗺️ The plan is to use the FileBinaryMimeTypeGuesser and extend its behavior to inspect the file when it could be a Docx. We will use the proxy design pattern.

Thanks to autoconfiguration, as my proxy implements MimeTypeGuesserInterface, it will be automatically registered as a guesser and will be used first, so we can remove all our stuff in the kernel.

As we see in the dump, our custom guesser is registered and comes first 🎉
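Here is a sketch of such a proxy. The class name, its namespace and the inspection rule are assumptions made for this illustration: it treats a zip archive that contains word/document.xml as a Docx, and it needs the PHP zip extension. The inner guesser is created in the constructor on purpose, so that autowiring does not try to inject the mime_types service into it.

```php
<?php
// src/Mime/DocxAwareMimeTypeGuesser.php (hypothetical name and location)

namespace App\Mime;

use Symfony\Component\Mime\FileBinaryMimeTypeGuesser;
use Symfony\Component\Mime\MimeTypeGuesserInterface;

final class DocxAwareMimeTypeGuesser implements MimeTypeGuesserInterface
{
    private const DOCX = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';

    // Results that are too vague to trust and deserve a closer look.
    private const AMBIGUOUS = ['application/octet-stream', 'application/zip'];

    private FileBinaryMimeTypeGuesser $inner;

    public function __construct()
    {
        $this->inner = new FileBinaryMimeTypeGuesser();
    }

    public function isGuesserSupported(): bool
    {
        return $this->inner->isGuesserSupported() && class_exists(\ZipArchive::class);
    }

    public function guessMimeType(string $path): ?string
    {
        $mimeType = $this->inner->guessMimeType($path);

        // A precise answer (including the Docx MIME type itself) is returned as-is.
        if (!in_array($mimeType, self::AMBIGUOUS, true)) {
            return $mimeType;
        }

        return $this->containsWordDocument($path) ? self::DOCX : $mimeType;
    }

    // Assumption: a Docx is a zip archive that holds a word/document.xml entry.
    private function containsWordDocument(string $path): bool
    {
        $zip = new \ZipArchive();
        if (true !== $zip->open($path)) {
            return false; // not a readable zip archive
        }

        $found = false !== $zip->locateName('word/document.xml');
        $zip->close();

        return $found;
    }
}
```

A plain zip file still comes back as application/zip, because the inspection only upgrades the result when the Word entry is present.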

🌟 The other cool thing is that Symfony sets the mime_types service, with our custom guesser registered on it, as the default MimeTypes instance (see the snippet below).

// src/Symfony/Bundle/FrameworkBundle/Resources/config/mime_type.php
return static function (ContainerConfigurator $container) {
    $container->services()
        ->set('mime_types', MimeTypes::class)
            ->call('setDefault', [service('mime_types')])

        ->alias(MimeTypesInterface::class, 'mime_types')
        ->alias(MimeTypeGuesserInterface::class, 'mime_types')
    ;
};

It means that even if you go through the static MimeTypes::getDefault() call, as the FileValidator and HttpFoundation\File do, your custom guesser will be in the first position.

🔥 So your custom guesser is used by the service MimeTypesInterface, by the HttpFoundation\File and by the FileValidator.

💻 Let’s check this:

dd(MimeTypes::getDefault());

👀 Output:

And voilaaaaa! 🚀

👏 👏 Congratulations! Now you know what a MIME type is, how Symfony guesses a MIME type to validate a file, and how to create a custom guesser to fix edge cases 😄.

Smaine Milianni

Fullstack Developer- certified Symfony 4,5 and certified AWS Solution Architect - Freelancer - Remote Worker