Summary¶
In my last article, I talked about whether the painstaking work that I am doing to enumerate every use case is worth it. In this article, I talk about taking a break from scenario testing the nested containers and getting back to fixing simple issues.
Introduction¶
Add north of fifty scenario tests to address a slice of the scenarios possible with three-level nested containers. Work through each one and see if it passes or fails. Categorize the failures and work through them in groups. Rinse and repeat for a month or so. It was just getting a bit monotonous and boring. That meant that it was starting to feel more like a chore than a good project to work on.
To combat that feeling, I decided to work on a few of the easier issues in the issues list. If nothing else, I could get issues off the backlog and off my plate. But I just wanted to change things up for a bit.
What Is the Audience for This Article?¶
While detailed more eloquently in this article, my goal for this technical article is to focus on the reasoning behind my solutions, rather than the solutions themselves. For a full record of the solutions presented in this article, please consult the commits that occurred between 07 Feb 2022 and 13 Feb 2022.
Version 0.9.5¶
Well, right off the mark, the first thing to happen this week is that I released version 0.9.5 of the PyMarkdown project. It had been roughly a month since the last release and having completed a solid number of fixes for issues with nested containers, I thought it was a good time to publish them.
Not really anything exciting to report there. Just some bug fix goodness.
Fixing Some Easy Issues - Issue 95¶
Having just released an updated version of PyMarkdown, I decided to look at the project’s Issues List and find and fix a few of the more visible issues in the list. The first one that I grabbed was an issue that had been in the list for about a month that I knew was going to be a tricky issue to resolve. But as I had the time to take care of it, it was one that I definitely wanted to resolve.
It ended up taking as much time to resolve as two or three nested container issues, but it was well worth it. By the time I was finished, I had made changes in the Container Block Processor module, the List Block Processor module, the TransformToMarkdown module and the code for Rule Md027. There was a lot of head scratching along the way, but I persevered through it, and figured out all the changes.
Why was it so complicated? The Markdown document itself was relatively simple:
> + list
> this
> > good
> > item
> + that
But it was the + character on the last line that caused issues. The first issue was that, up until this point, I had not exhaustively tested three-level nested container blocks. This document is essentially a Block Quote/Unordered List/Block Quote element nesting, with the two remaining issues being two of the big concepts I have left to test in my nesting scenarios: ending container support and new List Item element support.
To break it down properly, ending container support is when something in a later line causes one or more of the container elements to end. In this case, the + character signifies the start of a new List Item within the level-two List element. As a level-three Block Quote element was in effect, that level-three element is closed to allow the new List Item to start. While these do not usually cause issues, it did in this case, and I had to mitigate that. The other concept is new List Item element support. While I have orchestrated all the combinations of Block Quote characters and their “missing” cases, I have not yet started on the same process for new List Item elements. As some of the List element information is stored in the List Item tokens, it just needed to be worked out.
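To make the ending container idea concrete, here is a minimal sketch in Python. It is not the PyMarkdown implementation: the function name, the string labels, and the plain list used as a container stack are all assumptions made for illustration. It only shows the idea that a new List Item for the level-two List forces any deeper containers to close first.

```python
from typing import List


def close_for_new_list_item(container_stack: List[str], list_depth: int) -> List[str]:
    """Close every container nested deeper than the list at `list_depth`.

    Hypothetical helper: `container_stack` holds the open containers,
    outermost first, and `list_depth` is the zero-based index of the list
    that receives the new List Item. Returns the closed containers,
    innermost first, and mutates `container_stack` in place.
    """
    closed = []
    while len(container_stack) > list_depth + 1:
        closed.append(container_stack.pop())
    return closed


# The nesting in effect just before the final "> + that" line.
stack = ["block-quote", "unordered-list", "block-quote"]

# The "+" starts a new List Item in the level-two list (index 1).
closed = close_for_new_list_item(stack, list_depth=1)

print(closed)  # ['block-quote']
print(stack)   # ['block-quote', 'unordered-list']
```

The real parser also has to carry information between the List Item tokens, which is the part this sketch leaves out entirely.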
None of these were unexpected issues, but I did not expect them to happen at the same time. I just buckled down and worked through all the issues. Since this issue had annoyed me for at least a couple of months, it was good to get it out of the way and cleaned up.
User Request: Slight Change To Rule Md003¶
One of the other issues that leapt out at me was a user request regarding Rule Md003. In its default style of consistent, that rule searches for the first use of a heading, either an Atx Heading or a SetExt Heading. As the main purpose of this rule is to check for consistent use of headings throughout the Markdown document, the rule uses that first heading as the basis for its evaluation of the rest of the document.
For example, given the Markdown document:
This is a heading
=================
and a setting of consistent, the rule rightfully assumes that the document author wants to use SetExt Headings in the document. The problem with that is the following document:
This is a heading
=================
### This is another heading
As there are only two SetExt Heading sequences, = for level one headings and - for level two headings, an added style of setext_with_atx was created to allow for this mashup of level one and level two SetExt Headings and level three plus Atx Headings. But from the consistent style point of view, there is one problem: there is no way to distinguish between a style of setext and setext_with_atx using only the first heading of the document. As such, it is safest to assume a style of setext without any other context being available.
To address the user’s request, I decided to add a new configuration value for this rule: allow-setext-update. When enabled, the original assumed style is still setext. However, when a level three plus Atx Heading is encountered and the allow-setext-update value is set to True, the style is updated from setext to setext_with_atx. In this way, the new behavior of the rule is backwards compatible while allowing a consistent style to upgrade itself to the setext_with_atx style if needed.
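The upgrade logic can be sketched in a few lines of Python. This is a simplified illustration and not the code in Rule Md003: the function name, the (kind, level) tuples, and the omission of the closed Atx styles are all assumptions made to keep the example short.

```python
from typing import Iterable, Optional, Tuple


def resolve_consistent_style(
    headings: Iterable[Tuple[str, int]],
    allow_setext_update: bool = False,
) -> Optional[str]:
    """Work out the effective style for a `consistent` setting.

    Hypothetical helper: `headings` yields (kind, level) pairs in document
    order, where kind is "atx" or "setext".
    """
    style = None
    for kind, level in headings:
        if style is None:
            # The first heading decides the assumed style.
            style = kind
        elif (
            allow_setext_update
            and style == "setext"
            and kind == "atx"
            and level >= 3
        ):
            # SetExt has no level three plus form, so upgrade the style.
            style = "setext_with_atx"
    return style


document = [("setext", 1), ("atx", 3)]
print(resolve_consistent_style(document))                            # setext
print(resolve_consistent_style(document, allow_setext_update=True))  # setext_with_atx
```

With the value left at its default, the second heading would still be measured against the setext style, which is what keeps the change backwards compatible.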
Multiple Issues in One: Issue 189¶
This was a fun issue that took me a couple of tries to fix correctly. Pared down to its base components, the Markdown document that caused the failures was:
# Document
## Solution B
> :exclamation:
>
## Metadata
- Jira issue:
When I looked at this issue at the beginning of January, it looked simple enough. As such, after about an hour, I had a potential fix as well as a workaround to use until the next time that I published a release. All looked good.
So it was a shock when I went to confirm this issue’s resolution and found that it was raising an exception. Thinking about it for a bit, it made sense that it could raise an exception because I had tested documents like the above document, but not that exact combination. As such, it fell between the cracks.
Adding both the full example attached to the issue and a condensed version of that same document, I created a proper set of scenario tests for it this time. With the information from both new tests, I was able to create the one-line change needed to fix this issue. I had a simple off-by-one error in the code for Rule Md027, and the exception went away.
Remember To Keep Things Simple: Issue 161¶
The final issue that I wanted to get done this week was Issue 161. For this issue, the rule itself was firing properly, but it was reporting the wrong actual and expected numbers when the rule was fired. After an hour of reading and debugging, I could tell why.
Instead of trying to do something clean, I had decided to try something smart. Maybe there was something preventing me from taking the clean approach before, I am not sure. But as I looked at the code for the __report_issue function, I was confused. Given that the detection logic was working properly, why did I try and make the code recompute both values instead of reusing the one value from the detection logic? And did the other value’s computation have to be so difficult?
After another half-hour of trying to figure things out, I gave up. Instead, I added an if True: pass else: block before the confusing code and started to experiment with a cleaner approach to calculating those values. Once I got the cobwebs of the first approach out of my head, I was able to get both calculations working properly within the next hour. Add in extra time for cleanup, and I was done.
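For anyone who has not used that trick, here is a rough sketch of the shape it takes. The function name, its parameters, and the stand-in "old" computation are all made up for illustration and are not the real __report_issue code. The point is only that the old code stays in the file for reference, but never runs, while the replacement is written underneath it.

```python
def describe_values(detected_actual: int, expected: int) -> str:
    """Hypothetical reporting helper showing the `if True: pass else:` trick."""
    if True:
        pass  # Skip the old code below while keeping it visible for reference.
    else:
        # Stand-in for the confusing original computation (never executed).
        detected_actual = detected_actual * 2 - expected
        expected = expected + 1

    # Cleaner replacement: reuse the value the detection logic already found.
    return f"Expected: {expected}; Actual: {detected_actual}"


print(describe_values(3, 2))  # Expected: 2; Actual: 3
```

Once the new approach passes its tests, the whole if block gets deleted during cleanup.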
Time spent writing the old approach? I would have to guess at least five to six hours. Time spent trying to understand it? At least two hours, give or take. Time spent to replace it with something cleaner, including tidying the new approach up? Less than two hours.
Sometimes I wonder if I write the more complicated code to confuse myself, or if there was something else preventing me from writing the clean code. I wish I knew so I could avoid the nastier approach from the start.
What Was My Experience So Far?¶
Sometimes I feel that getting the PyMarkdown project to a good state is further away than it probably is. Part of my brain sees the issues that I am finding with nested containers and worries that I am going to keep on finding those bugs for years. The other part of my brain knows that I cannot discover every issue before it is reported by a user, but I am making good progress in identifying those issues myself. It is a balancing act that is never boring.
But generally speaking, I am okay with it. I think one thing that this week has reminded me of is that change is good. I do have a couple of smaller projects that I can work on, and I think I am going to try and spend a bit more time on those projects to help me rest up before the push on the PyMarkdown project later in the week. Is that the right thing to do? Not sure. But I do know that I must treat the PyMarkdown project like the marathon it is, and not a sprint.
What is Next?¶
Released the fixed code from the past four weeks. Done. Picked up low-hanging-fruit issues and resolved them. Check. What is next? Not sure. Stay tuned!