With some solid refactoring work completed, it was time for me to write the first set of rules based on all that hard work. To start things off, I planned to work on 2-3 rules that would verify that I had created an acceptable framework for writing and evaluating Markdown linting rules. As David Anson’s MarkDownLint plugin for VSCode already has a number of rules defined, it made sense to pick a handful of rules from there as a starting point. This choice was also fitting, as David’s NPM-based project was the inspiration for my Python-based project.

But how to pick the rules to start with? My first criterion was that whatever set of rules I picked, I wanted to be able to extend that set naturally once I had proven that the first few rules could be written clearly. My second criterion was that those initial rules should be a good cross-section of what to expect in the other rules.

Based on those criteria and some quick thinking on my part, it took me less than 2 minutes to realize that the first three rules in David’s checklist would do just fine!

Why Is This Article So Long?

To be honest, this article got away from me. Short of breaking it down into 3 articles, each one focusing on its own rule, I couldn’t see any intermediate solution for splitting it up. These are the first three rules I wrote for the project, and as such, they laid the foundation for all the other rules. It just did not seem right to artificially break up the article.

What Is the Audience for This Article?

While detailed more eloquently in this article, my goal for this technical article is to focus on the reasoning behind my solutions, rather than the solutions themselves. For a full record of the solutions presented in this article, please go to this project’s GitHub repository and consult the commits between 17 April 2020 and 23 April 2020, except for the commit from 18 April 2020.

Why the First Three Rules?

Given that David Anson’s work was the inspiration for this project, it made sense to start with those rules, outlined here. The first three of those rules easily meet the first criterion, as all three deal with examining the headings in a Markdown document. A quick count of all the rules that deal with headings turned up 15 such rules. That definitely qualifies as extensible.

As for the second criterion, the first rule is a standard rule, the second rule is a disabled rule, and the third rule has configuration that affects how the rule is measured. Between the three of them, I had a lot of confidence that together they would represent a good cross-section of all the rules, and therefore satisfy the second criterion nicely.

With the three rules selected and a confirmation that these three rules satisfied my criteria, it was time to move forward with implementation.

A Quick Aside

Are they headings or headers? While some of my cheat sheet resources like the Markdown Guide’s Cheat Sheet refer to them as headings, other resources like Adam Pritchard’s Cheatsheet refer to them as headers. Even the CommonMark specification refers to them as headings, but then includes a couple of places where the term header is used instead of heading. So which one is right?1

To keep things simple, I am going to use the term heading in this article and my other articles that deal with headings going forward. That term seems to be the one that is most dominant in the specification, and I believe that the authors of the specification had a good reason for specifically using the term heading. Even if that reason is not documented.

Rule MD001 - Incrementing Heading Levels

This section describes the initial implementation of PyMarkdown’s Rule MD001. Feel free to examine the code at your convenience.

Why Does This Rule Make Sense?

This rule is simple: when using headings, the heading level should at most increase by one. Based largely on the W3C’s Accessibility Guidelines, this rule just makes sense even without those accessibility guidelines. If you have a heading, and you want subsections under that heading, you use a heading level that is one below the current one. Consider this example:

# What I Did On My Summer Vacation

I did a lot of things.

## July

July is when it started.

### The Pool

I went to the pool every day.

#### Swim Classes

Mom made me do this, again!

## August

This is when everything seemed repetitive.

### The Pool

I got really sick of the pool.

While it does accurately detail most of my summer vacations as a kid, I believe that it is also a decent example of a report with the various highlights of each section organized under their own headings. It makes sense to begin the document with the level 1 heading describing what the document is all about. From there, it logically follows that to break the document up, there should be level 2 headings with the month names, each heading containing text specific to that month. As going to the local pool was the major part of each summer, it therefore follows that each of the months has its own level 3 heading under which I talk about what I did at the pool during that month. Finally, swim classes were deemed mandatory by my mother, so my siblings and I took at least one session of swim classes each summer. Detailing those classes in the month they happened, under a level 4 heading, just seems to be the right thing to do. As a matter of fact, this example reminds me a lot of the “returning to school” report that my mother made us write just before school started, just to ensure that our waterlogged brains still remembered how to write properly. Thanks Mom!

Putting my reminiscing about summers as a kid aside, take another look at the headings and the text, observing the natural progression of heading levels, as detailed in the last paragraph. For me, the headings and their levels just feel right; their flow is natural and not jarring. When I needed to get more specific with information in a section, I used a new heading one level down from the current heading and added that more specific information under the new heading. From my point of view, it just worked, and I really did not need to think about why it worked… it just did.

Specifically looking at the heading levels themselves, while there is a case where the heading level decreases by more than 1, there are no cases where the heading level increases by more than 1. As an author, it just made sense to write the report like that, adding more detail in a deeper heading and its subsection, and then popping back up to the right level to continue. While there may have been a scenario in which the Swim Classes section was followed by a level 3 heading and more text, it was not required. In fact, I believe that my freedom to not follow up that section with a “garbage” level 3 heading and section text is what makes the flow of the headings work as it does.

While I might have taken the long way around in describing my theory behind this rule, to me it simply just makes sense both as an author and as a reader.

Adding the Rule

This was my first rule using the built-in parser, so I wanted to make sure to lay down some good patterns for myself to repeat going forward.

Pattern: Test Format

The first pattern I wanted to set in stone is the pattern to specify how to execute tests for a given rule. After experimenting with different formats and combinations, it was the proven test format that I chose back in November that won out.

def test_md047_good_description():
    """
    Test to make sure...
    """

    # Arrange
    scanner = MarkdownScanner()
    supplied_arguments = ["test/resources/rules/md047/"]

    expected_return_code = 0
    expected_output = ""
    expected_error = ""

    # Act
    execute_results = scanner.invoke_main(arguments=supplied_arguments)

    # Assert
    execute_results.assert_results(
        expected_output, expected_error, expected_return_code
    )

This format keeps things simple: a descriptive function name, a decent function description, and simple boilerplate code for the function that can be applied to most tests. Even in cases where I had to add some extra code, such as adding a configuration file for the linter to read in and use, those changes were always applied on top of this template code, not instead of it. And except for those additions, only the variables supplied_arguments, expected_return_code, expected_output, and expected_error were ever changed for any of the tests, even to this day.

Basically, my plan was to create the template once, put it through a trial of fire to test it, and then reuse it endlessly once proven. Keep it effective by keeping it simple. As Isaac Newton said:

Truth is ever to be found in the simplicity, and not in the multiplicity and confusion of things.

Pattern: Creating a New Rule

Translating the logic described under the section Why Does This Rule Make Sense? into Python code was easy. First, I created the test cases described above, and examined the debug information output for those cases, specifically looking at what the final sequence of tokens was. As all the information for the rule was contained within the instances of the AtxHeaderMarkdownToken or the SetextHeaderMarkdownToken2, those were the only two tokens I had to worry about.

With that knowledge in hand, it was time to move to the second pattern that I wanted to repeat: creating a new rule. While the content of each rule changes, this process is always consistent. First, I pick an existing rule to clone from and copy its contents into a new file for the new rule. Any initialization for each rule is performed in the starting_new_file function, so that function is cleaned out except for the comment. Finally, the token handler function is cleaned out in the same way, to provide a clean slate to start writing the rule with. In this one case, the rule I cloned from used the next_line function, so that needed to be changed to override the next_token function instead, as this rule is specifically token based. Except for that one special case, this pattern has now been replicated for each rule in the project.
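As a sketch of that cloning pattern, a freshly cleaned-out rule might look something like the following. The method names starting_new_file and next_token come from the description above, while the base class shown here is a hypothetical stand-in for the project's actual plugin base class.

```python
class RulePlugin:
    """Hypothetical, simplified stand-in for the project's plugin base class."""

    def starting_new_file(self):
        """Reset any per-document state before scanning a new file."""

    def next_token(self, token):
        """Examine the next token emitted by the Markdown parser."""


class RuleMd001(RulePlugin):
    """A cleaned-out clone, ready for the rule's logic to be written."""

    def starting_new_file(self):
        # Clean slate for each new document.
        self.last_heading_level = None

    def next_token(self, token):
        # Rule-specific logic is added here.
        pass


rule = RuleMd001()
rule.starting_new_file()
```

The point of the pattern is that every rule starts from the same empty shell, so the only thing that varies from rule to rule is the logic inside these two functions.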

Pattern: Creating the Initial Tests

The third pattern that I wanted to get in place was specifying a good set of initial test cases for the rule, prior to writing the rule itself. A strong proponent of test-driven development, I believe that before writing the source code, at least a rough outline of the test code and test data should be written.

A common misconception of test-driven development is that before you write any code, you write all the tests. The process is an iterative process, one that grows over time. While this entire process is the pattern that I want to enshrine in this project, the important part that I want to tackle at this point is coming up with a good set of tests and test data to start with.

For this project, this initial set of tests and test data is made easy by the existing MarkdownLint rules outlined here. While the rules outlined there do not always have a comprehensive set of Markdown documents to test each rule, they always document at least one good Markdown document and one bad Markdown document. If there is something obvious missing, I also try to add it at this point, just to save iterations later. But just in case I miss something, I have another pattern, Thorough Rule Validation, that I will talk about later to try and catch those missed cases.

Implementing the Rule

Once things were set up, it was time to add the logic. A false failure is not desired when a new document is started, so I added a reset of the last_header_count class variable in the starting_new_file function. In the next_token function, I then added some simple code to test whether the token was one of the two heading tokens, and if so, set the hash_count variable to a non-None value. At this point, I added some debug output to the function and ran each of the test cases against the rule, checking to see whether what I expected to happen, did happen.

As this rule implements simple logic, the initial logic was validated on my first try. Removing the debug statements, I added some logic to filter out any cases where last_header_count was not set (initial case) or where header_count was not greater than last_header_count (not increasing). With those cases filtered out, it was simple to check for an increase of 1 and to fail if the increase was more than 1. A quick call to report_next_token_error to report the failure, and the basic case was completed.
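Boiled down to its essentials, the filtering logic just described can be sketched like this. The list of heading levels stands in for the stream of heading tokens, and returning the failing positions stands in for the real rule's call to report_next_token_error; the function name is mine, not the project's.

```python
def check_md001(heading_levels):
    """Return the positions at which a heading level increments by more
    than one; the real rule would call report_next_token_error instead."""
    failures = []
    last_header_count = None
    for index, header_count in enumerate(heading_levels):
        # Filter out the initial case and any non-increasing headings.
        if last_header_count is not None and header_count > last_header_count:
            if header_count > last_header_count + 1:
                failures.append(index)
        last_header_count = header_count
    return failures


# The summer-vacation example (1, 2, 3, 4, 2, 3) passes cleanly...
assert check_md001([1, 2, 3, 4, 2, 3]) == []
# ...while jumping from a level 1 heading to a level 3 heading fails.
assert check_md001([1, 3]) == [1]
```

Note that decreasing by more than one level, as the August heading does after Swim Classes, is deliberately not a failure.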

From there, I circled back to the test data, and looked to see if there were any obvious cases that I was missing. It was then that I noticed that the rule’s description specified headings, but the test data covered only Atx headings, with none for SetExt headings. I quickly crafted some data that mixed a SetExt heading followed by a valid and an invalid Atx heading, and iterated through the process again. It was only after I was sure that I had not missed anything obvious that I proceeded to the next pattern: Thorough Rule Validation.

Pattern: Thorough Rule Validation

The final pattern that I wanted to put into place was to be thorough in my rule validation, using both internal data sources and external data sources.

The validation against the internal data sources was easy, as I had just finished the source code and the test code for that rule. However, I put that aside and instead ignored the source code in favor of the definition of the rule’s scenario along with the test data. Based on those two factors alone, I predicted what the outcome of the test should be, then executed the test to verify that prediction.

If I encountered an error, the first step I took was to recheck the expected output more rigorously against the rule’s scenario and the test data. Whenever doing this, I rechecked the output multiple times just to make sure I had the right answer. From my experience, unless I try very hard, it is unlikely that I will make the same mistake twice. If there was an error in the output, I corrected the error and executed the tests again, as if from the beginning of this section. If there was an error in the rule itself, I would add some debug to the rule and run further tests to check what refinements I needed to make, before restarting all the tests in this section.

For this initial rule, I had errors in the test data and the rule, and this attention to detail helped me spot them quickly. After a few iterations, I was confident that the validation against internal data sources was completed, and I needed to move on to an external data source. As the MarkDownLint implementation of the rules was done as a VSCode plugin, it made sense to use VSCode + MarkDownLint as the external validation source.

It was with great confidence that I loaded up the sample files into VSCode to check against MarkDownLint. It was when I looked at VSCode’s Problems tab that I got a big surprise. I had forgotten something important: line numbers and column numbers.

Line Numbers and Column Numbers

To say this hit me like a ton of bricks would not do it justice. I was floored. I was so focused on getting the parsing of the tokens done accurately that I completely forgot to design and implement a way to place the location of each element of the Markdown document into its token. It was hard to miss the issue:

  Heading levels should only increment by one level at a time
  [Expected: h3; Actual: h4] markdownlint(MD001) [4,1]

Honestly, it took me a bit to get over this. It was so obvious to me that I should have seen this ahead of time. When you are reporting any type of linting failure, you need to specify the location of that failure so that the user can fix it. Without that information, the rule is somewhat useless.

In the present, as I am writing this article, I can better understand what happened and how I missed those numbers. However, at the time I struggled to find an interim solution until I could start to tackle this properly. I needed to focus on the first cohort of rules, so I tried to put this mishap out of my mind. It was after a couple of frustrating hours that I added two fields to track the line number and column number of the tokens, setting both to 0.
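The interim fix amounted to something like the following sketch, where MarkdownToken is a hypothetical stand-in for the project's token class: two placeholder fields, both hard-wired to 0 until real position tracking could be designed.

```python
class MarkdownToken:
    """Hypothetical stand-in for the project's Markdown token class."""

    def __init__(self, token_name):
        self.token_name = token_name
        # Placeholders only: real line/column tracking came later.
        self.line_number = 0
        self.column_number = 0


token = MarkdownToken("atx-heading")
```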

After some additional “yelling” at my monitor, I decided to close out that first rule, and moved on to the second rule, confident that I had (mostly) set up some solid patterns to focus on for the future rules.

Rule MD002 - (Deprecated) First Heading Should Be Top Level

This section describes the initial implementation of PyMarkdown’s Rule MD002.

Why Does This Rule (Not) Make Sense?

Note the slightly different wording of the heading. As documented on the MarkdownLint site, this rule has been replaced by rule MD041. The important difference between the two rules is that MD002 looks at Atx headings and SetExt headings, while rule MD041 also looks at the metadata at the start of the document, often referred to as YAML front matter. Because there is an improved rule, this rule is disabled by default in favor of that rule.

Whether in rule MD002 or rule MD041, the reasoning for both rules is consistent: each document should contain a clear title. For both rules, this is achieved by looking at the first Atx heading or SetExt heading in the document and verifying that it is a level 1 heading. For rule MD041, the only difference from rule MD002 is that it additionally looks for a specific metadata field that can take the place of an explicit level 1 heading.
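Reduced to its core, the shared check can be sketched as follows; the list of heading levels stands in for the stream of heading tokens, and the function name is my own, not the project's.

```python
def check_first_heading(heading_levels, required_level=1):
    """Return True if the document's first heading is at the required
    level, or if the document has no headings at all."""
    if not heading_levels:
        return True
    return heading_levels[0] == required_level


# A document that opens with a level 1 heading passes...
assert check_first_heading([1, 2, 3])
# ...while one that opens with a level 2 heading fails.
assert not check_first_heading([2, 3])
```

Only the first heading matters; everything after it is the territory of other rules, such as MD001.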

A good example of rule MD002 is the standard file that I usually add at the base of a GitHub project. Typically, I start with a file that looks like this:

# ReadMe

This file describes what the project does, who to contact, etc.

While the title is simplistic, it does present a clear indication of what the title and purpose of the document is. If, on the other hand, you see a Markdown document like:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus felis lacus, finibus
eget gravida eget, dapibus vel lacus. Phasellus placerat nisi enim, eu maximus ipsum
congue nec. Integer sollicitudin metus urna, quis iaculis ligula condimentum eu.

it is hard to figure out what this document is for. Simply by adding a title heading, this can be cleared up:

# Random Test Paragraphs

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus felis lacus, finibus
eget gravida eget, dapibus vel lacus. Phasellus placerat nisi enim, eu maximus ipsum
congue nec. Integer sollicitudin metus urna, quis iaculis ligula condimentum eu.

Adding the Rule

Having previously explained my process for adding new rules, I’ll leave that content out from here on, and just concentrate on what has changed.

The logic for this rule is almost exactly the same as for the previously implemented rule MD001, except that instead of checking for an increase in the heading level, it simply looks to see if the first heading it encounters has a heading level of 1. The implementation of this rule introduced two new concepts: rule configuration and disabling a rule by default. While the next rule, MD003, will properly deal with the configuration aspect, the main focus of this rule is the ability to add a rule that is disabled by default.

Adding a rule that is disabled by default has merit as a way of providing options to the user. In this case, a new rule was added that is more comprehensive than this rule. However, rather than removing this rule and possibly breaking the configuration of some users, the original rule was preserved for users that are not comfortable updating their linting to use the more comprehensive rule. In addition, there is also a good argument to be made for new rules to be added in a disabled state, allowing people who are upgrading to a new version of the project to control which new features they want. As both examples illustrate, having a rule be disabled by default is a useful feature to have.

The code to allow a rule to be disabled by default was added back in November when I was testing out the plugin manager concept. While there was a good test of the disable feature at that time, it was now time to really test its functionality with this rule. This testing was achieved by looking back at the test data for MD001 and noticing that the first heading in the file is a level 2 heading. That means that if rule MD002 were enabled and executed for that Markdown file, I would expect a failure to occur. Therefore, when I executed the MD001 tests again, with rule MD002 in its default disabled state, I expected that no additional failures would be reported. I executed those tests, and that behavior is exactly what I saw in the results. For me, that was enough proof that the rule was disabled by default.

To properly test this disabled rule, only a slight change to my normal process of testing rules was required. In addition to the normal information supplied by the test in the supplied_arguments variable of the test, the start of that array was modified to include the elements -e and MD002. As PyMarkdown allows for rules to be enabled and disabled on the command line, those two additions simply told the command line to enable (-e) rule MD002 (MD002).
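In the test template, that change looks something like this; the file path here is illustrative rather than the project's exact test path.

```python
# Enable rule MD002 (disabled by default) from the command line before
# pointing the scanner at the test data; the path is illustrative.
supplied_arguments = [
    "-e",       # enable the rule named next...
    "MD002",    # ...which is rule MD002
    "test/resources/rules/md002/",
]
```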

With those changes made to the tests, the rest of the testing went by without incident.

Rule MD003 - Heading Style

This section describes the initial implementation of PyMarkdown’s Rule MD003.

Why Does This Rule Make Sense?

In a single word: consistency. I am sure I would not want to read a Markdown document that had multiple heading styles, such as:

# ATX style H1

## Closed ATX style H2 ##

Setext style H1
===============

It would be a confusing mess! If I were reviewing that document for someone, I would tell them to keep it simple, pick a heading style, and stick to it. By picking a single, simple style, it would help any readers know what to expect.

In terms of styles, there are 6 styles to choose from. The obvious style, the default consistent style, simply looks at the first heading and assumes that its style, whether atx, atx_closed, or setext, will be used for the entire document. If the user wants to be more specific about the style, there are two variations on the Atx heading style, and three variations on the SetExt heading style, to choose from. They take a bit of getting used to, so let me walk through them.

Atx Headings vs Atx_Closed Headings

For Atx headings, the two variations that are available are atx and atx_closed, demonstrated in the following example:

## Atx

## Atx Closed ##

The only difference between these two variations is that the atx_closed style includes # characters at the end of the heading as well as at the beginning. While the GFM specification has strict requirements that there can only be 1 to 6 # characters at the start of an Atx heading, the only requirement for the closing # characters is that there is at least one valid # character. To enforce anything stricter on the closing # characters, a new rule would have to be written for Markdown documents that use the atx_closed style.

SetExt Headings vs SetExt_With_* Headings

For documents that use SetExt headings, the obvious issue is what to do if the document requires a level 3 heading, as SetExt headings only support a level 1 and a level 2 heading. To handle this case, the setext_with_atx style is used to specify that level 1 and level 2 headings remain SetExt, while level 3 to 6 headings are to use Atx headings, as in the following example:

Setext style H1
===============

Setext style H2
---------------

### ATX style H3

Without specifying a heading style of setext_with_atx, and relying instead on an implicit or defaulted setting of setext, the rule would fail on the heading labeled ATX style H3. To round things out, there is also a style variation setext_with_atx_closed which has the same behavior as the above example, except using the atx_closed style instead of the atx style.

Adding the Rule

Back in the description for rule MD002, I mentioned that I would cover the configuration aspect later, focusing at that time on the disabling of rules by default. Having completed the discussion about that rule, it is now time to talk configuration. For any readers following along with the commits in the project’s repository, note that the work for rule configuration was performed in changes introduced in the commits for both rules MD002 and MD003.


Configuration was the last component of the rules that I needed to have implemented and tested thoroughly before implementing more rules. It was important to me to get configuration right, as just over half of the initial rules contain some element of configuration.

The main part of accepting configuration was to change the command line interface to accept a configuration file in the JSON format, verifying that it was a proper JSON file by parsing it into a simple Python dict object before continuing. Just before the files were scanned, a call was introduced to the new __load_configuration_and_apply_to_plugins function, with that function performing the required work to call the initialize_from_config function in each plugin. At that point, if a plugin requires any configuration, it calls the get_configuration_value function to see if a value of the requested type is present in the map.

That might seem like a lot of work, but it can be summarized as: load the configuration, let the plugins know about it, and then let each plugin retrieve its configuration if required. Almost everything else surrounding those actions was either making sure errors were handled properly or making sure that the correct information was passed properly to the plugin manager.
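That three-step summary can be sketched as follows. The two function names echo the description above, but the bodies are simplified illustrations of the idea, not the project's actual implementation.

```python
import json


def load_configuration(configuration_path):
    """Load the JSON configuration file into a simple Python dict."""
    with open(configuration_path, encoding="utf-8") as config_file:
        return json.load(config_file)


def get_configuration_value(configuration_map, key, value_type, default):
    """Return the configured value only if it is of the requested type,
    falling back to the default otherwise."""
    value = configuration_map.get(key, default)
    return value if isinstance(value, value_type) else default


# A plugin asking for a string-valued setting from an in-memory map.
style = get_configuration_value(
    {"heading_style": "atx"}, "heading_style", str, "consistent"
)
```

Falling back to the default on a type mismatch, rather than raising an exception, mirrors the simplification described below for rule MD003.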

Before checking in the configuration code for MD002, and then again for MD003, I changed various parts of the plugins to request different information from the configuration store. It was this extra testing that allowed me to simplify rule MD003. The initial code for the rule raised an exception if the configuration value did not match one of the required values. Based on that testing, I changed it to use the default value in that case instead. It just seemed like the right thing to do.

The Rule and The Tests

Once the configuration was in place, the rest of the development went smoothly. A slight change to the Atx heading token was required to report the number of trailing # characters, but other than that, the core rules engine was stabilizing with no other changes.

The rule itself was somewhat simple, but reducing the complexity of the check was a daunting task. At first, I wrote the rule with everything in the next_token function, which worked decently well. The first part of that function was a block of code that figured out two important attributes: the type of heading (atx, atx_closed, or setext) if the token was a heading token, and whether that heading was a level 1 or level 2 heading.

Based on that information, the rest of the code worked out cleanly. If it was not a heading token, exit quickly. If the style was consistent and this was the first heading token to be seen, set the style to the heading type of the current token. With all that out of the way, the actual checking of the styles started. If the style was one of the 3 basic styles, a simple comparison determined if the rule failed or not. For the setext_with_atx and setext_with_atx_closed variations, the logic was a little more complicated, mostly dealing with checking the level of the heading. A certain amount of playing around with the code was required to get the rule validating the Markdown in a clean and simple manner.
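The decision table just described can be sketched as a single function. Each heading is modeled as a (style, level) pair; the style names match the configuration values discussed above, while the function itself is a simplified stand-in for the rule's next_token logic.

```python
def check_md003(headings, configured_style="consistent"):
    """Return the positions of headings that violate the configured style.
    Each heading is a (style, level) pair."""
    failures = []
    expected_style = configured_style
    for index, (style, level) in enumerate(headings):
        if expected_style == "consistent":
            # The first heading seen sets the style for the whole document.
            expected_style = style
        if expected_style == "setext_with_atx":
            required = "setext" if level <= 2 else "atx"
        elif expected_style == "setext_with_atx_closed":
            required = "setext" if level <= 2 else "atx_closed"
        else:
            required = expected_style
        if style != required:
            failures.append(index)
    return failures


# SetExt levels 1 and 2 followed by an Atx level 3 satisfies setext_with_atx...
assert check_md003([("setext", 1), ("setext", 2), ("atx", 3)],
                   "setext_with_atx") == []
# ...but the same Atx heading fails a plain setext configuration.
assert check_md003([("setext", 1), ("setext", 2), ("atx", 3)],
                   "setext") == [2]
```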

The tests themselves were simple as well. Before starting on the rule, I had created one test input file with a positive example for each of the style types. By changing the configured style type to apply to the rule, I was able to cover all combinations very quickly. What were negative cases for some tests became positive tests for other cases, and vice versa. The reusability of the data for testing this rule ended up being a big win. Instead of 3-5 test documents for each style, the tests only use a total of 5 documents, not including the file. Pretty efficient!

It was with that battery of tests in place that I worked to reduce the complexity of the rule. I won’t try and say I got everything right on the first try. I didn’t. But having those tests in place helped me figure out where I went wrong and helped me determine the next changes to make. Having a good set of tests is pivotal in being able to refactor any algorithm, and that includes one of the project’s rules.

Resolving Conflicts Between Rule Test Data

The one thing that I had to start watching out for with this rule was the test data for one rule raising a failure from a previously written rule. In each of the tests that I wrote, I specifically wanted to narrow down the testing to that specific rule, to keep the test more relevant to the rule. With rules MD001 and MD002 being in the same area, it was only luck that they did not cause any interference with each other. Rule MD003, however, caused interference with the test data for the previous 2 rules, where a consistent style for the input data was not a priority.

To remove the interference, PyMarkdown’s disable-rules feature was used, the opposite of the enable feature used in the testing of rule MD002. Instead of adding the -e and MD002 values to the supplied_arguments variable, the values --disable-rules and MD003 were added. In the tests for rule MD002, rule MD002 was enabled from the command line at the same time that rule MD003 was disabled from the command line.

By applying --disable-rules and MD003 to the tests for MD001 and MD002, I was able to resolve the interference from rule MD003 cleanly, getting nice consistent test results.
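Applied to the MD001 tests, the change is the mirror image of the -e and MD002 arguments described earlier; again, the file path is illustrative.

```python
# Disable rule MD003 from the command line so that heading-style failures
# do not interfere with the rule under test; the path is illustrative.
supplied_arguments = [
    "--disable-rules",  # turn off the named rule for this run...
    "MD003",            # ...here, the heading-style rule
    "test/resources/rules/md001/",
]
```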

What Was My Experience So Far?

The first part of my experience that I want to talk about is change and how I handle it. Specifically, I want to talk about the line and column numbers. While it was painfully obvious after comparing the output in VSCode with my output, it really hadn’t crossed my mind before then. I was more concerned with the ability to write a solid rule, and not concerned with the content that would be displayed when that rule detected a violation. Sure, I felt like I should have caught that in the design process, and I gave myself somewhere between 5 minutes and 5 hours to deal with that.

After that, I noted it in my “to do” document as technical debt, and I just put it behind me. That was not an easy thing to do, and my failure to account for that in my design haunted me for a bit. In the end, what helped me get over it was looking at what I did accomplish. I know it sounds cliché, but in this case, it helped me get a better perspective on the issue. What I forgot to do was add support for line numbers and column numbers. What I did not forget was to build a strong parser that is passing over 800 test cases without fail. That parser also has a simple translation layer that can be plugged in to the parser to generate GFM specification compliant HTML code. Accomplishing those two feats was not easy.

On top of accomplishing those two feats was another, more obvious one. I started in November 2019 with an idea of writing a Markdown linter in Python. With the three initial rules that I had just created, I proved to myself that I had made the right choice in deciding to build a parser for Markdown; the successful rules were the proof. The third feat was writing a parser-based linter for Markdown in Python.

Yeah, I still feel a bit like a fool for missing something that was obvious, but with those things in my head, it became easier to let it go. Instead of focusing on the 1% that was that design flaw, I made sure to refocus myself on the 99% of the project that was not a design flaw… and moved forward.

What is Next?

Having completed the first set of rules, I decided that it was more important for me to keep my momentum in creating new rules than to add line numbers and column numbers to the parser. Truth be told, I thought that getting some distance from that problem would help me see it more clearly. With both reasons in place, I started work on the next group of heading-based rules.

  1. To avoid the same issue with “cheat sheet” versus “cheatsheet”: the correct answer there is “cheat sheet”.

  2. It was not until the writing of this article that I formally decided to go with heading over header. There is now an item in my backlog to make this change throughout the source code. 


So what do you think? Did I miss something? Is any part unclear? Leave your comments below.
