Summary

In my last article, I talked about getting the documentation cleaned up and ready for release. In this article, I talk about the final changes that I needed to make to get the project ready for its initial release.

Introduction

This was it! After many hours of work, countless debugging sessions, and too many brain cells burnt along the way, the project was on the cusp of being ready. I just had a few tasks to complete before I felt confident announcing the release of the project. Nothing spectacular, just some (hopefully) small tasks to make sure everything was in place.

What Is the Audience for This Article?

While detailed more eloquently in this article, my goal for this technical article is to focus on the reasoning behind my solutions, rather than the solutions themselves. For a full record of the solutions presented in this article, please go to this project’s GitHub repository and consult the commits between 27 May 2021 and 31 May 2021.

The True Initial Release Was Quiet

The release of PyMarkdown 0.5.0, or pymarkdownlnt-0.5.0 on PyPi, was done very quietly and without much fanfare. Seriously quiet. I mean, if you twist my arm, I might admit to doing enough of a victory dance that my wife asked what the noise in my office was. But other than that, nothing. It was just a normal night, and the release was performed by following some simple instructions and using this command line:

pipenv run python -m twine upload --config-file .pypirc --repository pypi dist/*

And The Work Goes On

After taking a minute to appreciate what I had done, I got to work on finding out if there were any small tasks in the release that I had missed. The first thing I did was bump the project version to 0.5.1. After that, I double-checked that the setup.py file contained the PyPi name of the project, pymarkdownlnt, and committed that change.
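
For context, the relevant portion of setup.py looks something like this minimal sketch; the project’s actual file contains more metadata, so treat the fields below as illustrative:

from setuptools import setup, find_packages

# A minimal sketch of a setup.py; the name field must be the PyPi package
# name, pymarkdownlnt, rather than the friendlier project name, PyMarkdown.
setup(
    name="pymarkdownlnt",
    version="0.5.1",
    packages=find_packages(),
)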

Having passed my 48-hour documentation cool-off period, I went through each of the documentation files carefully and found a handful of small changes that I felt needed to be made. Except for adding the missing advanced_plugins.md file to the repository, most of the remaining changes were simple wording or organizational fixes.

The main organizational change I made was to provide a brief summary for each section. While I am not sure it is the right approach, I decided to start each important section with a simple table containing the important concepts from that section. My goal with these tables is to give readers a quick summary to help them decide whether they want to read that section. I am not sure if I have the right information or the right format yet, but I wanted to give it a try and see how it works out.

The other organizational change I made was along the same lines as the summary tables. But instead of starting each section with a table, I decided to start the readme.md file with some relevant badges.

What Are Badges?

Badges (or shields) are an interesting concept: small, automatically generated images that use webservices to surface one piece of project information at a glance. A good example of this is the badge Markdown that I use for displaying the version number associated with the package uploaded to PyPi:

[![Version](https://img.shields.io/pypi/v/pymarkdownlnt.svg)](https://pypi.org/project/pymarkdownlnt)

The most important part of the badge is the badge image link, which is the Image element specified inside of the Link’s label. In this case, I use the provider img.shields.io to provide a badge displaying the version (the /v) from PyPi (the /pypi) for this project (the pymarkdownlnt) in the SVG image format (the .svg). This is all that is required to display the above image with the version information. Because nothing is hardcoded, when I upload a new version of the project to PyPi, img.shields.io will eventually expire its cache and retrieve the new version. When that happens, the badge image automatically updates.
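
To make that URL anatomy concrete, here is a small sketch in Python; the helper function is hypothetical and just mirrors how the pieces of the badge image link fit together:

# Hypothetical helper that assembles a shields.io version badge URL from
# its parts: provider, metric (/pypi/v), package name, and image format.
def pypi_version_badge_url(package_name: str) -> str:
    provider = "https://img.shields.io"
    metric = "pypi/v"  # the version metric, served from PyPi data
    return f"{provider}/{metric}/{package_name}.svg"  # SVG image format

print(pypi_version_badge_url("pymarkdownlnt"))
# https://img.shields.io/pypi/v/pymarkdownlnt.svg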

The second part of the badge is the optional outer Link element. While it is optional, it is convenient in that it links to something associated with the badge itself. In the above case, as the badge shows the version of the package on PyPi, it made sense to include a link to the package page on PyPi. For the badge that displays the project’s license agreement, I specify the location of the license.txt file in the project. And for badges where there is no commonsense page to go to, such as the badge for the version of Black used to format the project’s Python code, I just provide the image with no link.

While I initially viewed badges as frivolous, I quickly decided that they are useful for two things: quick links for me and quick information for others. Instead of maintaining multiple project links in my browser’s toolbar, I can just go to the project’s main page and reference them from there. From other people’s perspective, I can hopefully provide anyone interested in the project with information in a quick, easy-to-digest summary format.

And That Completed The Work on Version 0.5.1

Having cleaned up what I wanted to, I went ahead and packaged up the code and documentation and uploaded version 0.5.1 of the project to PyPi.

I then started on one more thing that I wanted to get out of the way before the initial release: a good solid CI/CD pipeline.

CI/CD Pipelines With GitHub Actions

For those who are not in the know, CI/CD pipelines are all the rage these days, and with good reason. The full name for them is Continuous Integration and Continuous Deployment pipelines, which is a mouthful. And while I want a pipeline in place before the initial release, technically it will only be a Continuous Integration pipeline.

The difference between the two is simple: one is for integration, and one is for deployment. Before sending me an email to let me know that I am responsible for the facepalm you just performed, let me explain: common sense isn’t always common. A lot of people that I know in the industry confuse those two parts of the pipeline, or treat them as two different pipelines, so let me try to clarify what I mean.

Continuous Integration is the part of the pipeline that most people implement; it takes care of updating the project’s main repository with the latest changes submitted by developers. Once submitted, those changes are subjected to various checks and balances to ensure that they do not negatively affect the project. In the case of the pipeline that I am setting up, I am more concerned about running extensive checks after I have finished my own subset of checks. However, many pipelines are set up to mandate that all checks pass before any changes are accepted into the repository. As these checks happen with every change submitted, and not on a schedule, the integration is considered continuous.

Continuous Deployment is similar, but different. For this example, in addition to the PyMarkdown linter, assume that we also provide a simple webservice. This webservice allows people to submit Markdown documents, with the webservice reporting any standard violations back to the submitter. With that assumption in place, we can set up another pipeline to trigger once the Continuous Integration pipeline produces a new artifact. This new pipeline will probably run extra checks to verify that the new artifact works properly. If those checks succeed, the pipeline starts deploying that new artifact to the location where the webservice is hosted. Once again, only if all the checks pass is the new webservice deployed to the specified environment. So, while the concepts of integration and deployment are different, the continuous prefix means the same thing for both: they happen automatically once something is produced.

For this project, I do not have anything that I need to deploy, so I am just implementing a CI pipeline. For now, when I submit any changes, I want to ensure that the formatting of the changes is correct, and I want to run the full set of scenario tests on all three common platforms. I do not want to have to ask for these checks to be kicked off; I want them to “just happen”. And that is where GitHub Actions comes into play.

GitHub Actions

While GitHub Actions have only been around for a short while, they have made quite the splash in the software community. As the main repository for the PyMarkdown project is a GitHub repository, I can associate specific actions with specific triggers that occur within the repository itself.

Setting The Initial Context

This is what I specified in the prefix of my main.yml file, located at .github/workflows/main.yml in the repository:

name: Main

on:
  push:
  pull_request:
  schedule:
    - cron: '0 0 * * *'  # daily

Breaking it down, this workflow has the name Main and will be executed if one of three criteria are met: a push occurs, a pull request is created, or the scheduled time occurs. In this case, that scheduled time is 00:00, or midnight server time. It is important to note that GitHub Actions evaluates that schedule in UTC, not in the viewer’s local time zone.

After that initial context was established, I started working on the jobs section, which initially looked like this:

jobs:
  install-test:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2 # Checking out the repo

    - name: Install dependencies
      uses: VaultVulp/action-pipenv@v2.0.1
      with:
        command: install -d # Install all dependencies, including development ones

    - name: Test
      uses: VaultVulp/action-pipenv@v2.0.1
      with:
        command: run pytest # Run custom `pytest` command defined in the `[scripts]` block of Pipfile

Looking around for something that would work with the project, I read the documentation on the VaultVulp/action-pipenv action and thought it fit the bill. However, after a few attempts, I wasn’t getting anywhere. I then moved on to the dschep/install-pipenv-action@v1 action:

test-2:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@master
    - uses: actions/setup-python@v1
    - uses: dschep/install-pipenv-action@v1
    - run: pipenv run pytest

and got some manner of response right away. Thinking about that response, I quickly changed the last line to:

    - run: pipenv --help

allowing me to make some progress. I then realized that I had tried to jump right to executing the tests instead of first verifying that pipenv was working for the project. By executing only pipenv --help, I was able to make sure that Pipenv was working properly without worrying about any other component, which was the right call at that point in time. And while it wasn’t exactly where I wanted to be, that simplified workflow worked right away.

Thinking things through with my use of pipenv, I figured out that pipenv run pytest by itself was never going to work. I needed to ensure that pipenv synced itself with the project before running the tests. Therefore, I changed those steps to reflect that, as follows:

test-2:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout Repository
      uses: actions/checkout@master
    - name: Setup Python 3.8
      uses: actions/setup-python@v1
      with:
        python-version: 3.8
    - name: Install PipEnv
      uses: dschep/install-pipenv-action@v1
    - name: Sync With Repository
      run: pipenv sync
    - name: Execute Tests
      run: pipenv run pytest

and it worked right away! Before, I had tried to create a workflow that jumped right to the last step, and it failed. But now I was taking the time to set pipenv up properly using run: pipenv sync. I still ran the tests as before, but by inserting the run: pipenv sync line into the workflow, I ensured that Pipenv synced itself up with the repository and its Pipfile.lock file. That helped a lot!

Also, it helped to give each step a distinct name. While different things work for different people, having a good name for each step helped me properly identify what each step was doing. And seeing as I maintain this project, that was important to me.

Adding Platform Support

As I stated earlier in the article, my main goal for the pipeline was to run more extensive tests than I can run locally. While I do have an Ubuntu subsystem on my machine, I do not have any Apple system that I can install for testing. As such, I did some research and quickly ran into the GitHub Actions strategy element, which looks like this:

    strategy:
      matrix:
        #python: [3.8, 3.9]
        #platform: [ubuntu-latest, macos-latest, windows-latest]
        python: [3.8]
        platform: [windows-latest]
    runs-on: ${{ matrix.platform }}

The strategy section replaces the runs-on: ubuntu-latest line in the previous examples. The power of this construct is that it allows me to specify multiple platforms and Python versions to test against, which is exactly what I wanted: more extensive testing.

Code Coverage

I could have focused on more extensive testing at this stage, but I decided instead to duplicate the tests that I already run in my local development environment. At that moment, I was more interested in demonstrating the code coverage for the project than in running the project on multiple platforms. After doing some research on CodeCov, I created a new account on their site and added this code to the workflow file:

    - name: Report Coverage
      uses: codecov/codecov-action@v1
      if: github.event_name != 'schedule'
      with:
        file: ./report/coverage.xml
        name: ${{ matrix.python }} - ${{ matrix.platform }}
        fail_ci_if_error: true 

After adding an environment secret to the project, the coverage information was relayed to CodeCov when I ran the workflow. It was then available for examination on the project’s page at codecov.io, and soon after, that information appeared in a new code coverage badge that I added. While I got a lot of benefit for a relatively small amount of work, it was only this easy because of my ongoing measurement of code coverage. But a win is a win, so I took it.
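
For completeness, the Report Coverage step above expects an XML coverage report to already exist at ./report/coverage.xml. I am not claiming this is exactly how the project produces it, but with the pytest-cov plugin the idea looks roughly like this:

import os

import pytest

# Make sure the output directory exists, then run the test suite with
# coverage enabled (requires the pytest-cov plugin), writing the XML
# report where the CodeCov step expects to find it.
os.makedirs("report", exist_ok=True)
pytest.main(["--cov=pymarkdown", "--cov-report=xml:report/coverage.xml"])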

Cleaning Up Documentation

Now that I was getting tangibly close to releasing, I wanted to start tracking changes properly. As such, when I noticed that every example in the project documentation was using python main.py instead of pymarkdown, I knew it was a good time to start. So, I created Issue 3 to track the project work needed to make the change. While the change was small, it was a habit that I needed to get into.

Adding Lint Support

Before circling back and adding extra platforms to the workflow, there was one last thing that I needed to add: lint support. Literally cutting and pasting from the other job in the workflow and from my clean.cmd script, I quickly came up with this job:

  lint:
    strategy:
      matrix:
        python: [3.8]
        platform: [windows-latest]
    runs-on: ${{ matrix.platform }}
    steps:
    - name: Checkout Repository
      uses: actions/checkout@master
    - name: Setup Python 3.8
      uses: actions/setup-python@v1
      with:
        python-version: 3.8
    - name: Install PipEnv
      uses: dschep/install-pipenv-action@v1
    - name: Sync With Repository
      run: pipenv sync
    - name: Execute Flake8
      run: pipenv run flake8 --exclude dist,build
    - name: Execute PyLint on Source
      run: pipenv run pylint --rcfile=setup.cfg ./pymarkdown ./pymarkdown/extensions ./pymarkdown/plugins
    - name: Execute PyLint on Tests
      run: pipenv run pylint --rcfile=setup.cfg ./test

Since linting should work equally well on any given platform, I simply picked the Windows platform to run this job on. Then, using my clean.cmd script as a guide, I created new steps after the run: pipenv sync step to execute the lint commands exactly the way I execute them in my script.

While it doesn’t happen that often, I was able to get that new job up and running in only one try. I recognize that this was due to previous debugging sessions, but I was still grateful.

Figuring Out The Linux Tests

Having left the Linux tests turned off from the night before, on Thursday night I was eager to get those tests working. Enabling the Linux tests was as simple as adding the ubuntu-latest tag to the platform configuration for the tests:

    platform: [windows-latest, ubuntu-latest]

and committing it to the repository. That is when the fun began!

I created Issue 4 to track the Linux build issues and Issue 5 to track the macOS build issues. These issues didn’t need tons of documentation, just enough information to help me figure out what to do.

Thinking that it would be an easy fix, I looked at the output and determined that it was a problem with the temporary files that I use in the tests. To ensure that configuration options are tested properly, for any scenario test that requires it, I create a temporary file with the configuration dictionary serialized into it and pass the path to that temporary file into PyMarkdown using the --config command line setting. But for some reason it wasn’t working.
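
As a rough sketch of that pattern, with illustrative configuration values rather than the project’s actual test fixtures (and using NamedTemporaryFile here for clarity, since the exact function choice turned out to be the heart of the problem, as described below):

import json
import tempfile

# Serialize a configuration dictionary to a temporary file, then build
# the command line that hands that file to PyMarkdown via --config.
configuration = {"plugins": {"md999": {"enabled": False}}}  # illustrative values
with tempfile.NamedTemporaryFile("wt", suffix=".json", delete=False) as config_file:
    json.dump(configuration, config_file)
    configuration_path = config_file.name

arguments = ["--config", configuration_path, "scan", "document.md"]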

At first, the output reported a problem with the delete=False parameter that was passed into the TemporaryFile function, but removing that parameter didn’t fix the problem. So I then added --log-level DEBUG to the arguments for one of the tests, but I didn’t see any immediate difference. After four more debug commits, I still wasn’t any closer to getting any information that would help me figure things out.

I figured that the only way I was going to diagnose the issue properly was to install the Ubuntu subsystem on my machine. Without it, I was going to be guessing and guessing, with not much progress. After kicking off the install of the Ubuntu subsystem, I decided it was time to retire for the night. I had no idea what was in store for me next.

Serendipity Strikes

Having some family commitments to attend to, I was out of the house for about 24 hours from Friday to Saturday. When I got home, I was shocked. My system had died. It really died. After three or four hours of updates, resets, and such, I was convinced that it had died to the extent that only a complete re-install of the operating system would get things back in working order. It was deflating.

But I had an option. My son is very hardware oriented, and before this year’s computer chip scarcity started, he built himself a new system from scratch. To help him on his journey and get a newer system in the process, I purchased his old system from him. My current system was at least six years old and purchased from Fry’s, whereas his older system was around three years old and built from scratch for his gaming needs. It was going to be a clear upgrade.

The issue? Time. Between my various hobbies, yardwork, professional work, and the PyMarkdown project, I had never found the time required to get his older system set up. Now that my system had died, I had a great forcing function to get it set up, and fast.

Thankfully, I back up most of what I need to a backup server, and I was able to retain almost everything that I thought I might have lost. Installing the new operating system on the older machine was a lot easier than I remembered it being, and it was quickly completed. Going through the list of programs that I needed took a bit, but by focusing on it, I made quick work of it. Once that was done, I cloned my GitHub repositories to my new system, and I was back in business.

Painful, but worth it. The new system is clean, uncluttered, and fast.

Getting Back To Work

After sleeping in on Sunday morning (having been setting up the system until sometime in the early morning), I started to work on the Linux test issue in my system’s Ubuntu subsystem. I had installed Python, Git, and everything else the night before, so it was just a matter of working things out from my usual baseline.

With everything in place, I was able to quickly diagnose the problem. In my haste to get things working, I had created temporary files on my Windows system using the TemporaryFile Python function. While it worked fine on my Windows machine, it was not working at all on my Ubuntu subsystem. After a bit of research, I determined that I needed to change that function to the NamedTemporaryFile function and add the delete=False parameter back to the call. In hindsight, this behavior makes sense: on Windows, Python’s tempfile module aliases TemporaryFile to NamedTemporaryFile, so my original call quietly accepted the delete=False parameter and produced a file with a usable path, while on Linux, TemporaryFile is a different function that does neither. After removing the print debug statements that I had previously added, that one change alone brought the number of failing tests from 70 down to 5 in a matter of minutes. After more than three days of trying to figure it out, it was solved, and it felt good.

The next three issues were also easy to fix. When I wrote the tests that handle the entities.json file, I erroneously used simple string concatenation to build the path names. Now that I was dealing with more than one operating system, that practice fell apart. Replacing those concatenations with calls to os.path.join and doing some extra testing, I was able to put that issue to bed. With a reminder to myself not to shortcut code that deals with operating systems and their artifacts, it was on to the next set of tests.
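
A quick sketch of the difference, with illustrative paths:

import os

resource_directory = "pymarkdown/resources"  # illustrative path

# Fragile: hardcodes a separator that is wrong on at least one platform.
entities_path = resource_directory + "\\" + "entities.json"

# Portable: os.path.join selects the correct separator for the host OS.
entities_path = os.path.join(resource_directory, "entities.json")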

There were only two stubborn tests left. Looking at the output, those test failures appeared to be simple ordering issues. When the pymarkdown plugins list command was submitted on the Linux systems, the order in which the plugins were reported back seemed arbitrary. To fix that, I simply replaced:

    for next_plugin_id in self.all_plugin_ids:

with the code:

    ids = self.all_plugin_ids
    ids.sort()
    for next_plugin_id in ids:

to sort the list before using it, and I was then down to zero failures. After ensuring that everything was working properly in both the Windows and Ubuntu environments, I ran my script to clean up the code and committed those changes. A few anxious minutes later, all the GitHub Actions jobs completed successfully!
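
As an aside, Python’s built-in sorted function achieves the same result in a single expression and avoids mutating the underlying list in place; the plugin ids below are placeholders:

all_plugin_ids = ["md047", "md019", "md003"]  # placeholder plugin ids

# sorted() returns a new sorted list and leaves the original untouched,
# unlike list.sort(), which sorts in place.
for next_plugin_id in sorted(all_plugin_ids):
    print(next_plugin_id)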

And Then… Issue 5

And after all that work to get tests running on Linux machines, Issue 5 was addressed by simply including the macos-latest platform:

    platform: [windows-latest, ubuntu-latest, macos-latest]

I fully expected something to break, but nothing did. After everything it took to get to that point, it was a nice change.

And Finally… The Initial Release

And with as much fanfare as the release of PyMarkdown 0.5.0, version 0.8.0, the official initial release, went out the door. All scenario tests are passing on all platforms. The entire project is being linted, and everything looks fine. It just feels right.

What Was My Experience So Far?

It was a long time getting to this point, but I was now here. I have a project that I can be proud of, one that I believe has been architected, designed, and implemented properly. It has a healthy set of scenario tests and unit tests, and it follows known good coding practices. More than that, I gave myself the time to do things properly instead of rushing ahead, even when my instinct was to do just that.

With the initial release taken care of, I know I have some issues to take care of before I start adding new features, but I feel okay about that. I have a good set of things that I know I must look at, and a great set of tests that will make sure that any changes I make don’t break anything else. Basically, because I took my time to do things “right”, I have the utmost confidence that I can quickly deal with most of the issues in the Issues List.

And because of what I have learned getting to this point, the experience has been priceless.

What is Next?

To be honest, I am not sure. I still need to work on the project, but I am not sure if I should get some utility stuff dealt with or focus on the project before moving forward. Stay tuned!
