We were tasked with finding a Markdown parser written in JavaScript around the time the CommonMark drama was well underway. We each had requirements for our ideal parser:
- As fast as possible
- Renders GitHub Flavored Markdown, at least supporting the tables
- Works with the CommonMark spec
- Extensible, has to support YouTube cat videos
I looked into Marked first and found its extension support difficult, to the point where I forked it and rewrote some functions to allow new rules to be created, at the cost of a couple hundred milliseconds. I had planned to turn that work into a Marked wrapper, but after some team discussion we decided to try another parser.
Remarkable had only recently been released at the time, but development was moving quickly, it passed the CommonMark test suite, and it seemed to fit a more modern style of JavaScript. It was also faster than Marked, though I still haven’t fully understood how, even after reading the code for quite a while. It also supported extensions: adding and manipulating rules would be far easier to handle and maintain than an entire fork. There were no clear examples of how to create new plugins or extensions for Remarkable, though. Having written a few with some measure of success, I think it is time to share what I learned.
A Simple Extension: Open Links in a New Window
The simplest way to modify the Markdown parser is to change the rendered output. Open the rules file in the Remarkable source and find the renderer rule you want to modify. In this case we are looking for link_open. That is the rule we want to extend with our own code; note that its output string is the opening part of an anchor tag. These rules are exposed through the renderer property of a Remarkable instance.
var markdownParser = new Remarkable();
// open links in new windows
markdownParser.renderer.rules.link_open = (function() {
var original = markdownParser.renderer.rules.link_open;
return function() {
var link = original.apply(this, arguments);
return link.substring(0, link.length - 1) + ' target="_blank">';
};
})();
Through a bit of JavaScript closure we have saved the original link_open function in the variable original, which we then call to get the normal Remarkable anchor tag output. Then we strip the closing > and append our target attribute so that all links open in new windows. This function replaces the one on our current instance of the Remarkable object. For my purposes I have extended a single Remarkable object and reuse it throughout my application.
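If you end up wrapping several renderer rules this way, the closure pattern can be pulled into a small helper. This is only a sketch: wrapRule is a name I made up, not part of Remarkable, and fakeRules stands in for markdownParser.renderer.rules so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of Remarkable): replace rules[name] with a
// function that calls the original renderer and passes its output through
// transform.
function wrapRule(rules, name, transform) {
  var original = rules[name];
  rules[name] = function () {
    var output = original.apply(this, arguments);
    return transform(output, arguments);
  };
}

// Stand-in for markdownParser.renderer.rules; the real rule produces the
// opening anchor tag in a similar shape.
var fakeRules = {
  link_open: function (tokens, idx) {
    return '<a href="' + tokens[idx].href + '">';
  }
};

// Same behaviour as the extension above: drop the trailing ">" and close the
// tag again with a target attribute added.
wrapRule(fakeRules, 'link_open', function (html) {
  return html.substring(0, html.length - 1) + ' target="_blank">';
});

var linkHtml = fakeRules.link_open([{ href: 'http://github.com' }], 0);
```

With a real parser you would pass markdownParser.renderer.rules as the first argument instead of fakeRules.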
Another Simple Extension: Adding Images to Lightbox
This is another rule rendering extension, where we want to put any Markdown images into a lightbox. The lightbox library I use looks for anchor tags with a data-lightbox attribute wrapping an img element. Again we look into the rules file from Remarkable and find that rules.image is the renderer we want to extend to add this functionality.
var markdownParser = new Remarkable();
// add images to lightbox
markdownParser.renderer.rules.image = (function() {
var original = markdownParser.renderer.rules.image;
return function(tokens, idx) {
var href = Remarkable.utils.escapeHtml(tokens[idx].src);
var imgOutput = original.apply(this, arguments);
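// The anchor markup was lost from this listing. It is rebuilt here from the
// description: href points at the image and the tag carries data-lightbox.
// Reusing href as the data-lightbox value is an assumption.
var anchorStart = '<a href="' + href + '" data-lightbox="' + href + '">';
return anchorStart + imgOutput + "</a>";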
};
})();
We’ve had to do a bit more work here, since the anchor tag’s href is the link to the image that the lightbox library uses. This exposes a little of the Remarkable inner workings. We get an array of tokens and our current index, which we can use to get information gathered in earlier steps. Specifically, we want the image src. We leverage the existing Remarkable utility functions exposed in the utils property to get a safe href string. Then we get the original image function output and return it wrapped in our new anchor tag.
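To see the token handling in isolation, here is a rough sketch that runs without Remarkable. Everything in it is a stand-in: escapeHtml is a minimal replacement for Remarkable.utils.escapeHtml, originalImage imitates the built-in image renderer, the alt field on the token is assumed (only src appears in the code above), and reusing the href as the data-lightbox value is my own choice.

```javascript
// Minimal stand-in for Remarkable.utils.escapeHtml, written for this sketch.
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Assumed token shape: only src is confirmed by the extension above.
var tokens = [
  { type: 'image', src: 'cats/tom&jerry.png', alt: 'cat' }
];

// Stand-in for the original rules.image renderer.
function originalImage(tokens, idx) {
  return '<img src="' + escapeHtml(tokens[idx].src) +
    '" alt="' + escapeHtml(tokens[idx].alt) + '">';
}

// Wrap the image output in an anchor the lightbox library can pick up.
function lightboxImage(tokens, idx) {
  var href = escapeHtml(tokens[idx].src);
  return '<a href="' + href + '" data-lightbox="' + href + '">' +
    originalImage(tokens, idx) + '</a>';
}

var imageHtml = lightboxImage(tokens, 0);
```

Note that the ampersand in the file name comes out escaped in both the href and the src, which is the reason for running the value through the escape function before building the tag.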
A True Remarkable Plugin
Remarkable has a use function for including new plugins. The architecture isn’t well documented, however, and a plugin requires a great deal more work. By no means am I suggesting my example is best practice, but it does work. The use function expects a function that takes the current Remarkable instance and an optional options parameter. That function adds itself to the rules list and to the parsing functionality. So far we’ve only looked at rendering. The parser goes character by character to match a complex rule set. For example, you can add emphasis to text by using asterisks or underscores. The emphasis parser looks for a sequence of either character, no longer than 3 characters, then continues down the string it has been given looking for a matching closing sequence. This type of code isn’t often seen outside of parsers, so it may look alien.
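To give a feel for that character-by-character style, here is a simplified illustration of counting an emphasis delimiter run. It is not Remarkable’s actual emphasis rule, and scanDelimiterRun is a name I made up; it only shows the idea of reading char codes and rejecting runs longer than 3.

```javascript
// Simplified illustration, not Remarkable's real emphasis rule: count how many
// times the marker at `start` repeats, and reject runs longer than 3.
function scanDelimiterRun(src, start) {
  var marker = src.charCodeAt(start);
  if (marker !== 0x2A /* * */ && marker !== 0x5F /* _ */) {
    return 0;
  }
  var pos = start;
  while (pos < src.length && src.charCodeAt(pos) === marker) {
    pos++;
  }
  var count = pos - start;
  return count > 3 ? 0 : count;
}

var strongRun = scanDelimiterRun('**bold**', 0);
var tooLong = scanDelimiterRun('****text', 0);
var notMarker = scanDelimiterRun('plain', 0);
```

A real rule would go on to search for the matching closing run and decide between em and strong; this sketch stops at measuring the opening run.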
Let’s make a smiley emoticon plugin as an example. We’ll define the syntax as a colon, followed by an exclamation mark, followed by a colon (:!:). That Markdown will be rendered as a smiley GIF. Our parser function checks each character in the sequence to make certain we have a complete smiley before adding it to the token list to be rendered.
var parse = function(state) {
// I get my character hex codes from the console using
// "0x"+"[".charCodeAt(0).toString(16).toUpperCase()
var pos = state.pos;
var marker = state.src.charCodeAt(state.pos);
// pos starts at the parser's current position in state.src
// First character of :!: must be a colon
if (marker !== 0x3A/* : */) {
return false;
}
pos++;
marker = state.src.charCodeAt(pos);
// Second character of :!: must be an exclamation mark
if (marker !== 0x21/* ! */) {
return false;
}
pos++;
marker = state.src.charCodeAt(pos);
// Third character of :!: must be a colon
if (marker !== 0x3A/* : */) {
return false;
}
state.pos = pos+1;
if (state.pos > state.posMax) {
state.pos = state.posMax;
}
// Having matched all three characters we add a token to the state list
var token = {
type: "bangSmiley",
level: state.level,
content: marker
};
state.push(token);
return true;
};
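Since the smiley is a fixed string, the same check can be written more compactly by comparing a slice of the source against the literal. The sketch below is one way to do that. Both literalRule and fakeState are hypothetical names of mine, and fakeState models only the fields the parse function above touches (src, pos, posMax, level and push), not the full state object Remarkable passes in.

```javascript
// Hypothetical factory: build an inline rule that matches a fixed literal and
// pushes a token of the given type.
function literalRule(literal, tokenType) {
  return function (state) {
    // Compare the source text at the current position with the literal.
    var candidate = state.src.slice(state.pos, state.pos + literal.length);
    if (candidate !== literal) {
      return false;
    }
    // Move past the literal, never beyond the end of the input.
    state.pos = Math.min(state.pos + literal.length, state.posMax);
    state.push({ type: tokenType, level: state.level, content: literal });
    return true;
  };
}

// Stand-in for the state object handed to inline rules.
function fakeState(src) {
  var state = { src: src, pos: 0, posMax: src.length, level: 0, tokens: [] };
  state.push = function (token) {
    state.tokens.push(token);
  };
  return state;
}

var parseSmiley = literalRule(':!:', 'bangSmiley');
var hit = fakeState(':!: hello');
var miss = fakeState(':) hello');
var hitResult = parseSmiley(hit);
var missResult = parseSmiley(miss);
```

The character-code version above avoids creating a substring on every call, which may matter in a hot parsing loop; the slice version trades that for readability.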
Making our renderer is much more straightforward. The render function is called when the token comes up in the rendering process, and it passes back the string to be rendered. The check at the end for options.xhtmlOut complies with the Remarkable option for outputting XHTML.
var render = function(tokens, idx, options) {
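// The img markup was lost from this listing; smiley.gif is a placeholder path.
var smileyString = '<img src="smiley.gif" alt=":!:"' + (options.xhtmlOut ? ' />' : '>');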
return smileyString;
};
Finally we combine our new parse and render functions into a single plugin function and pass it to the use function of our Remarkable instance to complete the plugin.
// var parse = code above
// var render = code above
var bangSmiley = function(md) {
md.inline.ruler.push('bangSmiley', parse);
md.renderer.rules.bangSmiley = render;
};
var markdownParser = new Remarkable();
markdownParser.use(bangSmiley);
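The optional options parameter mentioned earlier can be used to make a plugin configurable. Here is a rough sketch of that idea. FakeParser is a stand-in I wrote so the snippet runs without Remarkable; it models only inline.ruler.push, renderer.rules and a use method that calls the plugin with the instance and the options. The src option and the smiley.gif default are hypothetical.

```javascript
// Stand-in for a Remarkable instance, modelling only what the plugin touches.
function FakeParser() {
  this.inline = {
    ruler: {
      rules: [],
      push: function (name, fn) {
        this.rules.push({ name: name, fn: fn });
      }
    }
  };
  this.renderer = { rules: {} };
}

// Calls the plugin with the instance and the optional options.
FakeParser.prototype.use = function (plugin, options) {
  plugin(this, options);
  return this;
};

// Placeholder for the parse function defined earlier.
var parse = function () {
  return false;
};

// Plugin that reads a hypothetical src option for the image path.
var configurableSmiley = function (md, options) {
  var src = (options && options.src) || 'smiley.gif';
  md.inline.ruler.push('bangSmiley', parse);
  md.renderer.rules.bangSmiley = function (tokens, idx, renderOptions) {
    return '<img src="' + src + '" alt=":!:"' +
      (renderOptions.xhtmlOut ? ' />' : '>');
  };
};

var parser = new FakeParser().use(configurableSmiley, { src: '/img/happy.gif' });
var smileyHtml = parser.renderer.rules.bangSmiley([], 0, { xhtmlOut: true });
```

Keeping the path in the plugin options means the same plugin can be reused across projects without editing the render function.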
This Markdown:
[new window](http://github.com)
![lightbox](https://assets-cdn.github.com/images/icons/emoji/unicode/1f44d.png)
:!:
Becomes:
This is just scratching the surface of writing plugins for Remarkable, and of writing a parser in general. The smiley example above is an inline rule; there are also block and core rules. Hopefully these examples get you started and you can come up with new and creative ways to extend Remarkable and Markdown syntax to suit your own needs.
Here is a Gist of the full code: https://gist.github.com/barretts/8677348c6e77c2b3ea80