mime - library for parsing shared MIME database

FreeSlave via Digitalmars-d-announce digitalmars-d-announce at puremagic.com
Sun Aug 16 00:04:05 PDT 2015


On Sunday, 16 August 2015 at 03:56:45 UTC, Rikki Cattermole wrote:
> On 16/08/2015 6:30 a.m., FreeSlave wrote:
>> Currently I'm working on mime library for D. Dub page:
>> http://code.dlang.org/packages/mime
>> It can parse MIME database files, including binary ones, like
>> mime.cache. It also has algorithms for mime type detecting by 
>> file name.
>>
>> It's not fully implemented yet and does not have stable API. Issues and
>> goals are listed on the github page: https://github.com/MyLittleRobo/mime
>>
>> If someone is interested in the project, I would be glad to discuss
>> interface and implementation details of the library.
>>
>> If you don't know what is shared MIME database and why does it matter
>> read this:
>> http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html#idm140625831778224
>
> I had an mime implementation in Cmsed that basically was a hard 
> coded file with a whole bunch of mime types along with file 
> extensions.
>
> I would be interested in seeing if this can match it 1:1 for 
> features, while not allocating. Say give me the mime type for 
> payload.
>
> Possibly with its own override/addition csv files.

This library focuses on the shared MIME database used on 
freedesktop systems, usually by file managers to detect file 
types so they can display an appropriate icon and choose the 
default application for a file (well, that part is another 
spec). I don't think it suits the web world.

Your "hardcoded" approach is what is usually used on the web. If 
I remember correctly, mime types are hardcoded in Chromium too.
But a mime type is not only about the extension. In general a 
pattern can be any glob pattern. That's why file managers can 
detect a Makefile as text/x-makefile even though it has no 
extension. The same goes for CMakeLists.txt: the preferred type 
is text/x-cmake, not just text/plain.
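
To illustrate the glob idea, here is a minimal sketch in Python (not the D library's API). The GLOBS table and the mime_by_name function are made up for the example, and the "longest matching pattern wins" rule is a simplification: the real database also assigns weights to patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical, hand-written table; a real shared MIME database is much larger.
GLOBS = [
    ("*.txt", "text/plain"),
    ("*.py", "text/x-python"),
    ("Makefile", "text/x-makefile"),
    ("CMakeLists.txt", "text/x-cmake"),
]

def mime_by_name(filename):
    """Return the mime type of the longest glob that matches, or None."""
    matches = [(pattern, mime) for pattern, mime in GLOBS
               if fnmatchcase(filename, pattern)]
    if not matches:
        return None
    # Simplification: prefer the most specific (longest) pattern.
    return max(matches, key=lambda m: len(m[0]))[1]
```

With this table, CMakeLists.txt matches both "*.txt" and "CMakeLists.txt", and the longer pattern gives text/x-cmake.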

Patterns are not the only mechanism. Magic rules come to the 
rescue when the mime type can't be detected from the file name. 
That's how Linux file managers tell a shell script from a Python 
script even if neither has an extension (they start with a 
leading line like #!/bin/sh or #!/usr/bin/python). That's also 
how file managers detect a file with the unknown pk3 extension 
(used in Quake III based games) as a zip file, because by its 
contents it really is just a zip file.
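
A rough sketch of content-based detection, again in Python and not the library's API. The MAGIC table and mime_by_content are hypothetical, and the mime type names are my assumption. Real magic rules also carry offsets, masks and priorities; this only checks fixed prefixes at the start of the data.

```python
# Hypothetical prefix table; real magic rules are richer than this.
MAGIC = [
    (b"PK\x03\x04", "application/zip"),          # zip local file header
    (b"#!/bin/sh", "application/x-shellscript"),
    (b"#!/usr/bin/python", "text/x-python"),
]

def mime_by_content(data):
    """Return the mime type of the first matching prefix, or None."""
    for prefix, mime in MAGIC:
        if data.startswith(prefix):
            return mime
    return None
```

A pk3 file begins with the same bytes as any zip archive, so it is reported as application/zip no matter what its extension says.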

The whole shared MIME database is system and user dependent, so 
again it's probably not what you want for the web, unless you 
manage the database yourself on the server. Even in that case 
patterns and magic rules are just hints. You can't rely on them 
to check that an uploaded file is of the required type. For 
example, if you want to validate an image file, the only way to 
do it is to parse the whole file. Still, you can use the hints to 
reject obviously invalid files early.
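
As an example of using a hint only as a cheap pre-filter, here is a small Python sketch for PNG uploads. The function name looks_like_png is made up. A True result means only "worth handing to a real image parser", not "valid image".

```python
# The fixed 8-byte signature every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def looks_like_png(data):
    """Cheap hint: False means the data is certainly not a PNG."""
    return data.startswith(PNG_SIGNATURE)
```

Anything that fails this check can be rejected immediately; anything that passes still has to be parsed completely before it is trusted.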

I have not targeted non-allocating code yet, though I believe 
it's possible to make MimeCache not allocate when detecting a 
file type.

