In my last article, I continued in my quest to reduce the size of the issues list. In this article, I make good progress in removing items from the issues list.
This article is probably going to be a lot shorter than the others in this series. It is not the case that I did a lot less work, just that there is just a lot less of that work to talk about. The bulk of that work was an effort by me to get the extra groups of scenario tests more organized. Sometimes the work I do on the project is easy to describe and takes a lot of words to explain, as usual. But in this case, it is just a lot of work that has a short description.
What Is the Audience for This Article?¶
While detailed more eloquently in this article, my goal for this technical article is to focus on the reasoning behind my solutions, rather that the solutions themselves. For a full record of the solutions presented in this article, please go to this project’s GitHub repository and consult the commits between 23 Oct 2020 and 25 Oct 2020.
Keeping Things Organized¶
For most of the time that I have spent on this project, I have maintained a very good grasp of where things are and what needs to be done. Sure, I use the issues list to keep track of all the little things that I need to fix or check, but I rarely forget about items on the list. Usually, within seconds of reading one of the items on the issues list, I remember a good amount of information about what the problem is and why I added it. But since the project has been going on for a while now, I sometimes forget when I added the item to the list.
When it comes to the scenario tests, it used to be the case that I had an equally solid hold on the tests and what they covered. At first it was easy, one example from the GFM specification equals one scenario test. Then, as I was working through the validation of the parser, I added some extra scenarios by adding a letter to the end of the parent scenario and documenting the change. At that point, there was still a good link between the example that spawned the new scenario test and the parent of that test.
And then came the “extras” scenario tests. Most of them were added to provide additional testing for the line/column numbers needed for the linter than for the parser itself. As the example provided for in the GFM specification are targeted for parsers, I felt it was totally reasonable to add extra scenario tests to ensure that the different in testing between the needs of a parser and the needs of a linter were covered.
But what started out as a few scenario tests, rapidly grew to a set of tests that numbered over 140. The tests are their contents started to get jumbled up in my head on a frequent basis. What group of tests was the test I was just looking at a part of? Did I cover all the cases within in the group? Did I miss one? If I missed one, how could I tell? It just got more difficult to keep things clear with each case I added.
Finding A Solution¶
Recently, I had added an item to the issues list to start looking at this:
- need comprehensive table with tests that qualify for each test case i.e. para_extra with different groups, links and images
Now that I had a few weeks to think about it, I believed a solution was in order. The first big decision I need to make was to figure out the medium for this table, with Markdown being the natural selection.
Creating the Markdown document
scenario-case.md in the root folder, I tried four
or five formats before settling down on the final format. The first column contains a
unique identifier for that test. Starting with the letter id for the group of tests,
I made sure that the rest of the identifier clearly described what the scenario test
contained using a common id schema. Following that identifier column are the columns
that contain a short description of the scenario test, the relevant Markdown from the
test, and the actual function name of the scenario test. Starting by enumerating the
scenario tests for the
test_markdown_paragraph_extra.py module, I left space for a trailing column that
would also include the function name of the scenario test for the
I Really Needed This¶
When I finished creating this document as the result of this multi-day task, it was extremely obvious why I needed it: there was a lot of information to process. There were a solid core of groups in the initial document, and by documenting each group, I easily found things that I had missed or not completed. By clearly delineating each group of tests, it also became a lot easier to see the patterns in the groups and why they were important.
First, there were the basic groups, Series A to Series E. These were simple foundational tests for the complete set of inline elements. Series A tested the inline element at the start of a line, followed by text. Series B and Series C followed those tests by testing the same inline elements with text around the inline element and text followed by the inline element. While they would not be seen that way naturally, Series D provided tests for each inline element alone in a document by itself. Finally, Series E extended the Series D tests by testing those inline elements that can contain a newline character, rounding out the foundational tests.
From there there were 4 groups of tests that I had added that all dealt with link elements and image elements. The Series F tests provided for a newline character in each part of all 4 link types. The Series G tests provided for the same type of tests but tested for the inclusion of a backslash escape sequence and character reference sequences. Due to some issues I encountered with Code Span elements and Raw Html elements inside of Link Labels, the Series H tests were added to provide a thorough testing of those combinations. Rounding out those tests, the Series J tests provided for various combinations of Inline elements with Link elements and Image elements, once again added due to some issues that I ran into.
The Proof Is In The Pudding¶
As they used to say “the proof is in the pudding”. Basically, the only way that I am going to find out if this works or not is by using it and judging the results. However, I can say that this initial task of putting the table together has already yielded positive results. There were tests that were duplicated, and there were new distinct tests that were child tests of existing tests. Putting that table together helped me clean up the extra tests by fixing up those cases. In addition, it found a missing scenario that dealt with an inline link type that did not have a title but had whitespace after the URI. That scenario is one that I would not have found otherwise.
I feel the balance point of this work needs to be mentioned, as the cost of putting this table together was a couple of days’ worth of work. That cost would have be spread out over numerous issues if it was started at the beginning, but at that point, I am not sure if the benefit of putting this table together is something that I thought would justify the cost. As a matter of personal sanity, I try not to do the “what ifs” too often. From my point of view, while it might have been better to do this earlier, it was at this point that I started seeing that it needed to be done. After I made that observation, it was only a couple of weeks before I took the time to create the table. And that time was spent thinking about how I wanted to organize the table, so it was not wasted time.
I guess the thing I am trying to say is this: do not kick yourself and do not rush things. If you can see something ahead of time, and the benefit is worth the cost to mitigate the issue, then mitigate. Otherwise, wait until the benefit becomes worth the cost, plan it out to make sure you can do it cleanly and clearly, and then work on it.
Is A Character Entity the Same As A Character?¶
In the section of the GFM on Entity References, the specification is very clear about how Character Entities are to be used:
Entity and character references cannot stand in place of special characters that define structural elements in [Markdown]. For example, although * can be used in place of a literal * character, * cannot replace * in emphasis delimiters, bullet list markers, or thematic breaks.
To be blunt, you cannot get clearer than that. If you want the Markdown document to be interpreted properly, your document must use the actual character used to invoke the behavior you want, not a character reference.
For the most part, I was very comfortable with that, and I was sure that I had adhered to that rule throughout the parser. When it came to the newline character, I was not as confident. I was pretty sure that if I tested all the inline elements with the character entity newline, that the HTML output would look correct. However, while I was sure that I properly handled most of the cases, I was not as sure that I had properly handled all those cases. Therefore, enter Series K.
To get this series started, I began with the Link elements. Starting with the inline
Link element, I enumerated the 7 places where the character entity
inserted, followed by the 2 places for the full Link element and 1 a piece for the
shortcut Link element and the Collapsed Link element. Once that was in place, I
duplicated those scenarios, transforming the Link elements into their equivalent
Image elements. Finally, I added cases for the Emphasis element, the Code Span element,
the Raw Html element, and both Autolink elements.
With a total of 27 new scenarios to cover, I started to work on creating these scenario tests, one at a time. As with other tests I have documented in previous articles, I was very precise and meticulous with the creation of those tests. When each test was created, I gave it a manual check before testing it against the parser’s HTML output, but only after running the Markdown against BabelMark.
In the end, 11 of those new scenario tests resulted in failures, with 16 of them passing right away. But at that point, it had taken me most of the day to add and verify the tests, along with all the things that life threw at me that Saturday. A bit reluctantly, I committed those tests as-is, after adding 11 new items to the issue list to track those tests I needed to finish. Then I went to sleep.
Finishing Up the Series K Tests¶
After a good night’s sleep, I took a look at that week’s article notes. I figured that the article notes were in a good enough state for me to look at those 11 new items I added the night before. Planning out my work to address those issues, a couple of things leapt out at me.
The first thing that I noticed was that while I had added some good tests for the link types, I had not varied the type of character entity or numeric entity that I used in place of the newline character. While I was confident it had been covered in the base GFM Specification, I believed that for completeness it would be good if I also added it in this group of tests. When I was done fixing that, there were 14 new tests in the table, 7 new tests for Link elements and 7 new tests for Image elements. In addition, there was 3 extra scenario tests that I added to complete the coverage for the other inline elements. When that work was done, the Series K group contained 42 tests.
The other thing that I did not realize the day before was that the failures were nicely
bucketed into three groups: one that failed in the
InlineProcessor class, ones that
failed in the
__verify_next_inline_handle_previous_end function, and ones that failed
__verify_next_inline_text function. While it would have been nice if they
all had the same root cause, three different causes to examine was still better than
Attacking the Parse Failure Cases¶
As luck would have it, this was the easiest problem to fix. In the code for the
__calculate_inline_deltas function, there was an assert statement as follows:
assert link_part_index > -2, "Newline in link token not accounted for."
When I originally added this code, I was being defensive, thinking that I had not properly handled a newline character occurring in the link label part of the Link element and the Image element. It took me a bit of time and some extra debug scenarios, but I was able to conclusively prove that newline character in link labels were already been handled by another part of the code. As such, I was able to comment out that assert and resolve two out of the eleven failures. On the the next one!
Properly Handling Split Lines¶
After a bit of looking at the remaining tests, one pattern leapt out at me. Immediately, it looked like the line/column numbers being calculated by the parser were correct, but the same calculation for the consistency checks was off. With only a small amount of looking at the problem, the cause for that result became obvious to me almost immediately.
When any of the character entities or numeric entities are used in a normally
processed part of the document, a replacement sequence is placed within the token
to represent that sequence. For the Markdown generator, the original entity is
persisted, and for the HTML generator, the replacement text is persisted. This
is performed by using a replacement sequence such as
\a\n\a, making it
clear which part is “what is replaced” and which part is “what it is replaced with”.
And therein lied the problem. In the
__verify_next_inline_text function, the
current_line variable is split as follows:
split_current_line = current_line.split("\n")
From there, each part of that split line is processed. But due to the above
each entity sequence like
generates another entry in that list. Now, if the
HTML output were being verified, that would work well. But as the consistency checks
are used to verify the line/column numbers in the Markdown document, those newline
characters confuse the issue.
I realize that this may seem confusing, so consider a small Markdown document like:
When this document is parsed, the above Python code will generate an array with two values in it:
If that does look right, it is because it is not right. It is a faithful interpretation
of the request made to split the string
\a\n\ab at its newlines, but it
does not understand that it should only consider the first part of the replacement
sequence, not both the first and the second parts.
I needed to come up with a way to deal with this issue.
Fixing the Issue With Split Lines¶
That is when I started putting together the
function. Given that array of split lines, this function specifically looks for
any entries that start with the replacement character (
\a), and the previous entry
ends with the first part of the replacement sequence. Basically, as the data is split
on the newline character (
\n), we want to look for cases where one entry ends with
\a part of the sequence and the following entry starts with the
the end of that sequence.
When it was all said and done, the function looked like:
def __handle_newline_character_entity_split(split_current_line): try_again = True while try_again: try_again = False for search_index in range(1, len(split_current_line)): if split_current_line[search_index].startswith("\a") and split_current_line[ search_index - 1 ].endswith(__create_newline_tuple()): combined_line = ( split_current_line[search_index - 1] + "\n" + split_current_line[search_index] ) split_current_line[search_index - 1] = combined_line del split_current_line[search_index] try_again = True break return split_current_line
After testing this new function with a single scenario test, I execute all the failed tests again, and was rewarded with only four failing tests. Time to buckle down and get the last ones taken care of.
With four tests remaining, I was ready to put in a lot of work to figure out what the problem was. I made sure I had a good snack and a big glass of water, turned on the debug logging output for some of the tests, and proceeded to look at the failed scenario tests and the scenario tests that were around them and were passing.
The first issues I noticed in the failed tests were in the
__verify_next_inline_handle_current_end function, where the number of newlines
characters are counted. Looking at the difference between the reported line numbers
and the calculated line numbers, there was a direct correlation between the number
of entity sequences used and how much the reported difference in line numbers. Having
seen this pattern many times before,
it was easy for me to see that instead of using fields in this manner:
newline_count4 = ParserHelper.count_newlines_in_text( current_inline_token.start_markdown_token.link_title )
I needed to change that to:
pre_link_title = current_inline_token.start_markdown_token.link_title if current_inline_token.start_markdown_token.pre_link_title: pre_link_title = current_inline_token.start_markdown_token.pre_link_title ... newline_count4 = ParserHelper.count_newlines_in_text(pre_link_title)
After a quick check against the tests, it was obvious that there were also some
issues with the
__verify_next_inline_handle_previous_end function, of the same
type. Looking at that function, I needed to do the same with the
pre_link_uri field that I did in the above example with the
field and the
While verifying those changes, I did notice that there was an omission that was not
tested. In cases where an inline link contains an URI that is inside of “angle
>), I was not adjusting the counts to accommodate for that.
After adding an extra test or two, the solution to this was easy, adding
if parent_cur_token.did_use_angle_start: part_3 += 2
What Was My Experience So Far?¶
There were a couple of times during the creation and validation of the scenarios table that I wanted to give up. It was a large body of work with no immediate validation of its correctness. It was also a body of work that could easily fall out of sync with the actual scenario tests themselves, so there was the future work to maintain and revalidate the table to consider.
But still, that table was, and still is worth it! I emphatically believe this with few reservations, if any. I do agree that I need to be careful in concluding about whether to add a scenario test or two versus adding a new scenario test group. That is a no-brainer. But this is another tool that I have in my toolbelt to help me make sure the quality of the PyMarkdown project is where I want it to be.
Take the case with the group for the newline character entity sequence. If I had added only a couple of tests, I would have wondered if I had captured all the cases. By adding the tests as a group of cases, I carefully documented and tested the cases that I could come up with, but left the door open for more cases to be added at a future date. For me, that is a winning proposition for the right kind of scenario tests.
And as I mentioned above, the proof is in the pudding. I have already cleaned up an old assert statement in the Inline Processor, something I would have overlooked without that group of tests. In addition, it found at least three issues with the consistency checks for the line/column numbers, making that watchdog code more complete. That small step towards completeness of the consistency checks means that I have more confidence that it will detect any issues that arise when something is changed.
I know I am getting close to the end of the project, and my goal is to add both single tests and groups of tests with no changes to either the parser itself or the consistency checks. And this work brought me one step closer to that goal!
What is Next?¶
Now that I had the master table of scenario tests and their groups together, it was time to leverage that information and add to it. That was the focus of the next week’s work, work that was indeed satisfying.
So what do you think? Did I miss something? Is any part unclear? Leave your comments below.