Advanced email parser for Node.js. Everything is handled as a stream, which should allow it to parse even very large messages (100MB+) with relatively low overhead.
The module exposes two separate modes: a lower-level MailParser class and a simpleParser function. The latter is simpler to use (hence the name) but is less resource-efficient, as it buffers attachment contents in memory.
npm install mailparser --save
simpleParser is the easiest way to parse emails. You only need to provide a message source to get a parsed email structure in return. As an additional bonus, all embedded images in HTML (e.g. images that point to attachments using cid: URIs) are replaced with base64-encoded data URIs, so the message can be displayed without any additional processing. Be aware, though, that this module does not do any security cleansing (e.g. removing JavaScript); this is left to your own application.
const simpleParser = require('mailparser').simpleParser;
simpleParser(source, options, (err, parsed) => {});
See the MailParser options list for the available options.
or as a Promise:
simpleParser(source, options)
    .then(parsed => {})
    .catch(err => {});
or even with async..await:
let parsed = await simpleParser(source);
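Where source is the message source (for example, a string containing the RFC 822 message), options is an optional configuration object (see the MailParser options list), and callback receives either an error or the parsed mail object.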
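As a minimal sketch of the async form, the function below parses a message and reads its Subject header from the headers Map described further down. The name readSubject and the injectable parse argument are assumptions made for illustration and are not part of mailparser; parse defaults to simpleParser and is a parameter only so that a stand-in parser can be supplied.

```javascript
// Hypothetical helper: parse a raw message and return its Subject header,
// or null when the header is missing.
// `parse` defaults to mailparser's simpleParser; the parameter exists only so
// that a stand-in parser can be injected (an assumption of this sketch).
async function readSubject(source, parse = require('mailparser').simpleParser) {
    const parsed = await parse(source);
    // headers is a Map keyed by lowercase header name
    return parsed.headers.has('subject') ? parsed.headers.get('subject') : null;
}
```

With mailparser installed, readSubject(rawMessage) resolves to the subject value, or to null if the message has no Subject header.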
The parsed mail object has the following properties:
Address objects have the following structure:
value is an array with address details
text is a formatted address string for plaintext context
html is a formatted address string for HTML context
Example
{
    value: [
        {
            address: 'andris+123@kreata.ee',
            name: 'Andris Reinman'
        },
        {
            address: 'andris.reinman@gmail.com',
            name: ''
        }
    ],
    html: '<span class="mp_address_name">Andris Reinman</span> &lt;<a href="mailto:andris+123@kreata.ee" class="mp_address_email">andris+123@kreata.ee</a>&gt;, <a href="mailto:andris.reinman@gmail.com" class="mp_address_email">andris.reinman@gmail.com</a>',
    text: 'Andris Reinman <andris+123@kreata.ee>, andris.reinman@gmail.com'
}
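As a small sketch of how this structure might be consumed, the helper below pulls the bare email addresses out of an address object. The name listAddresses is hypothetical and not part of mailparser; it relies only on the value array shown in the example.

```javascript
// Hypothetical helper: return the bare email addresses from an address object,
// skipping entries that have no usable address string.
function listAddresses(addressObject) {
    if (!addressObject || !Array.isArray(addressObject.value)) {
        return [];
    }
    return addressObject.value
        .map(entry => entry.address)
        .filter(address => typeof address === 'string' && address.length > 0);
}

// Sample input shaped like the address object in the example
const sampleAddresses = {
    value: [
        { address: 'andris+123@kreata.ee', name: 'Andris Reinman' },
        { address: 'andris.reinman@gmail.com', name: '' }
    ],
    text: 'Andris Reinman <andris+123@kreata.ee>, andris.reinman@gmail.com'
};

console.log(listAddresses(sampleAddresses));
// prints [ 'andris+123@kreata.ee', 'andris.reinman@gmail.com' ]
```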
headers is a Map with lowercase header keys. So if you want to check for the Subject: header then you can do it like this:
if (mail.headers.has('subject')) {
    console.log(mail.headers.get('subject'));
}
The format of a header depends on the specific key. For most header keys the value is either a string (a single header) or an array of strings (multiple headers with the same key were found).
Special header keys are the following:
references is a string if only a single reference-id exists or an array if multiple ids exist
date value is a Date object
The following headers are parsed into structured objects, where the value property holds the main value as a string and the params property holds an object of additional arguments as key-value pairs
Some headers are also automatically MIME-word decoded
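Because a header value may be a single value or an array, calling code often has to branch on the shape. One way to avoid that, sketched here under the assumption that a uniform array is acceptable, is a small normalizing helper. The name headerValues is hypothetical and not part of mailparser.

```javascript
// Hypothetical helper: always return header values as an array.
// A missing header yields [], a single value yields a one-element array,
// and an existing array is returned unchanged.
function headerValues(headers, key) {
    const value = headers.get(key.toLowerCase());
    if (value === undefined) {
        return [];
    }
    return Array.isArray(value) ? value : [value];
}

// Sample Map shaped like the headers described above
const sampleHeaders = new Map([
    ['subject', 'Hello'],
    ['references', ['<a@example.com>', '<b@example.com>']]
]);

console.log(headerValues(sampleHeaders, 'Subject'));    // prints [ 'Hello' ]
console.log(headerValues(sampleHeaders, 'references')); // prints [ '<a@example.com>', '<b@example.com>' ]
console.log(headerValues(sampleHeaders, 'x-missing'));  // prints []
```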
Attachment objects have the following structure:
MailParser is a lower-level email parsing class. It is a transform stream that takes the email source as a byte stream on its input and emits data objects for attachments and text contents.
const MailParser = require('mailparser').MailParser;
let parser = new MailParser();
The constructor accepts an optional options object with the following settings:
boolean: Don’t generate plaintext from HTML. Defaults to undefined (falsy).
number: The maximum amount of HTML to parse in bytes. Defaults to undefined (Infinity).
function: Provide a custom formatting function. Defaults to undefined.
boolean: Skip converting CID attachments to data URL images. Defaults to undefined (falsy).
boolean: Don’t generate HTML from plaintext message. Defaults to undefined (falsy).
boolean: Do not linkify links in plaintext content. Defaults to undefined (falsy).
boolean: Treat attachments of type message/delivery-status as separate attachments rather than as text. Useful for parsing delivery status notifications. Defaults to undefined (falsy).
object: Defaults to iconv-lite.
boolean: simpleParser-only option. Sets skipImageLinks to true.
The parser emits ‘headers’ once message headers have been processed. The headers object is a Map. Different header keys have different kinds of values; for example, address headers have the address object/array as the value, while the subject value is a string.
Header keys in the Map are lowercase.
parser.on('headers', headers => {
console.log(headers.get('subject'));
});
Event ‘data’ or ‘readable’ emits message content objects. The type of the object can be determined by the type property. Currently there are two kinds of data objects: attachment and text.
The attachment object is the same as in simpleParser, except that content is not a buffer but a stream. Additionally, there’s a release() method that must be called once you have processed the attachment. The related property is set after message processing has ended, so it is not yet available at the data event.
parser.on('data', data => {
    if (data.type === 'attachment') {
        console.log(data.filename);
        data.content.pipe(process.stdout);
        data.content.on('end', () => data.release());
    }
});
If you do not call release() then the message processing is paused.
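When an attachment is small enough to hold in memory, one option is to collect its stream into a Buffer and release the attachment once the stream ends. This is only a sketch: the name bufferAttachment is hypothetical, and it relies solely on the content stream and release() method described above.

```javascript
// Hypothetical helper: read an attachment's content stream into a single Buffer.
// release() is called when the stream ends so that message processing can continue.
function bufferAttachment(attachment) {
    return new Promise((resolve, reject) => {
        const chunks = [];
        attachment.content.on('data', chunk => chunks.push(chunk));
        attachment.content.on('error', reject);
        attachment.content.on('end', () => {
            attachment.release();
            resolve(Buffer.concat(chunks));
        });
    });
}
```

Inside a ‘data’ handler this could be used as bufferAttachment(data).then(buffer => { /* store buffer */ }). For large attachments, piping the stream to its destination avoids holding the whole content in memory.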
Text object has the following keys:
parser.on('data', data => {
    if (data.type === 'text') {
        console.log(data.html);
    }
});
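The ‘headers’ and ‘data’ handlers can be combined into a single Promise-returning wrapper. The sketch below is one possible arrangement, not the library's own API: the name collectMessage and the shape of the result object are assumptions, and attachment contents are simply discarded. The parser is passed in by the caller, so the function depends only on the events described above.

```javascript
// Hypothetical wrapper: pipe an input stream into a MailParser-like transform
// stream and resolve with the headers, the text object and attachment filenames.
function collectMessage(input, parser) {
    return new Promise((resolve, reject) => {
        const result = { headers: null, text: null, attachments: [] };

        parser.on('headers', headers => {
            result.headers = headers;
        });

        parser.on('data', data => {
            if (data.type === 'text') {
                result.text = data;
            } else if (data.type === 'attachment') {
                result.attachments.push(data.filename);
                // Release the attachment once its content has been drained,
                // otherwise message processing stays paused
                data.content.on('end', () => data.release());
                data.content.resume(); // discard the content
            }
        });

        parser.on('error', reject);
        parser.on('end', () => resolve(result));

        input.pipe(parser);
    });
}
```

A call might look like collectMessage(fs.createReadStream('message.eml'), new MailParser()), where the file name is a placeholder.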
Charset decoding is handled using iconv-lite, except for ISO-2022-JP and EUCJP, which are handled by encoding-japanese. Alternatively, you can use the node-iconv module for all charset decoding. This module is not included in the mailparser package; you would have to provide it to MailParser or simpleParser as a configuration option.
const Iconv = require('iconv').Iconv;
const MailParser = require('mailparser').MailParser;
let parser = new MailParser({ Iconv });
or
const Iconv = require('iconv').Iconv;
const simpleParser = require('mailparser').simpleParser;
simpleParser('rfc822 message', { Iconv }, callback);
Dual licensed under MIT or EUPLv1.1+