Summary
In my last article, I talked about how I was getting back to work on my project after taking a few weeks to recover from a crash. In this article, I talk about the refactoring work I did in the last week.
Introduction
During my recovery phase from my crash, I decided to do some low-cost refactoring of the PyMarkdown project. And while the refactoring was easy to do, it did make me think about the way in which I was refactoring the class.
But First, An Aside
Any avid readers of these articles will notice that I am writing this and posting this on a Tuesday. I was a victim of “National Scare The Crap Out Of Your Pets” Day. For any Canadian readers, Happy Canada Day and for any American readers, Happy Independence Day. That means one thing: a lot of partying and a lot of fireworks.
From a pet owner’s point of view, it was just an exceptionally long weekend. Our dog Bruce is a lovable doofus, but he fears loud noises. By the time Sunday rolled around, people in our neighborhood were already starting to set off fireworks. That meant that Bruce was unsettled and looking for the places in our house with good soundproofing. That meant either our master bathroom shower (with the bathroom fan running of course) or the basement with the door closed. On Sunday night, Bruce hid in those locations when he could, but on Monday night, hiding in one of those two locations was a necessity.
Why do I mention this? Because to a certain extent, I was doing the same thing as Bruce. My office is a wonderful place for working, but not a good place for blocking out outside noise. On most days, I can play my music and it covers most of the outside noise. But the past Sunday night and Monday night were not about normal outside noise. Even with my music turned up, I was still hearing the fireworks that were going off outside. Combining that with my Autism meant that I was losing focus every time a firework went off outside.
And while I am a bit tired from last night (see the earlier mention of Bruce being anxious about the noise), it is peacefully quiet outside. Birds chirping, the odd car driving by the house, and no fireworks. That means I found my space to write this week’s article. Sorry for the delay!
Simple Refactoring Is Not Always Simple
Now, back to the focus of this article: grab bags. I am not sure what the actual name of these objects is, but I have always heard them referred to as grab bags. In a physical sense, a grab bag or a go bag is a short form for a grab-and-go bag. These grab-and-go bags are actual bags, usually a large purse or a backpack, that disaster preparers keep ready for emergencies. The general idea is that with one of these bags, a person has enough of their basic needs met to keep them going through at least 72 hours of an emergency.
Back To Basics
From my years of experience, one of the development paradigms that I find useful is the object-oriented development paradigm. Without going too far into the explanation of what object-oriented programming (OOP) is, one of the underlying facets of this type of programming is that common elements are grouped together in objects, with the definitions of those objects usually being referred to as classes. Therefore, an object that deals with a position on a map should be represented by a class with either two or three numeric values specifying a relative location. If other concerns about that object need to be handled, then OOP allows for a new class to be created with those concerns, inheriting the base elements from the original class.
What does this look like? Using a simple Python data class, the original class would look something like:
from dataclasses import dataclass

@dataclass
class MyPosition:
    x_location: int
    y_location: int
    z_location: int
Using inheritance, if I want to add extra concerns to that class, such as a name, I can create a new class:
@dataclass
class MyNamedPosition(MyPosition):
    name: str
Here I am using the @dataclass decorator to simplify things, but if I write it out in long form, the same rules still apply. The class MyNamedPosition contains four fields, three inherited from the MyPosition class and one that it defines itself.
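As a rough illustration of that long-form equivalence, the sketch below defines the two data classes next to hand-written versions. The LongFormPosition and LongFormNamedPosition names are hypothetical and exist only for this example, and the hand-written classes cover just the constructor, not the other helpers that @dataclass generates.

```python
from dataclasses import dataclass, fields


@dataclass
class MyPosition:
    x_location: int
    y_location: int
    z_location: int


@dataclass
class MyNamedPosition(MyPosition):
    name: str


class LongFormPosition:
    """Hand-written counterpart of MyPosition (constructor only)."""

    def __init__(self, x_location: int, y_location: int, z_location: int) -> None:
        self.x_location = x_location
        self.y_location = y_location
        self.z_location = z_location


class LongFormNamedPosition(LongFormPosition):
    """Hand-written counterpart of MyNamedPosition (constructor only)."""

    def __init__(
        self, x_location: int, y_location: int, z_location: int, name: str
    ) -> None:
        # The three inherited values are handled by the base class.
        super().__init__(x_location, y_location, z_location)
        self.name = name


# Inherited fields are listed first, followed by the field the subclass adds.
field_names = [field.name for field in fields(MyNamedPosition)]
named = MyNamedPosition(1, 2, 3, "home")
long_form = LongFormNamedPosition(1, 2, 3, "home")
```

Either way, the derived class ends up carrying the three location values plus the name.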
The important thing here is that there is cohesion between the data elements and the functions that use them. When I look at either of those two classes, the collection of data elements I see in each class is a cohesive group that works together. This is enough of a grounded concept that there is an existing metric for it called Lack of Cohesion of Methods. In the documentation for that metric, one of the recommendations for objects that have low cohesion is:
Low cohesion indicates inappropriate design and high complexity. It has also been found to indicate a high likelihood of errors. The class should probably be split into two or more smaller classes.
And for the most part, I sincerely agree with their arguments and try to keep my classes cohesive, with a single responsibility if possible.
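To give a feel for how such a metric can be computed, here is a minimal sketch of one common formulation, the Chidamber and Kemerer variant: count the pairs of methods that share no fields, subtract the pairs that share at least one, and floor the result at zero. This is an assumption about the flavor of the metric, as tools differ in which variant they report, and the method and field names below are made up for the example.

```python
from itertools import combinations


def lack_of_cohesion(method_fields: dict[str, set[str]]) -> int:
    """Return max(P - Q, 0), where P counts method pairs that share no
    fields and Q counts method pairs that share at least one field."""
    disjoint_pairs = 0
    sharing_pairs = 0
    for first, second in combinations(method_fields.values(), 2):
        if first & second:
            sharing_pairs += 1
        else:
            disjoint_pairs += 1
    return max(disjoint_pairs - sharing_pairs, 0)


# Hypothetical classes: each method name maps to the fields that method uses.
cohesive_class = {
    "move": {"x_location", "y_location", "z_location"},
    "distance_to": {"x_location", "y_location", "z_location"},
    "describe": {"x_location", "name"},
}
grab_bag_like_class = {
    "record_indent": {"indent_level"},
    "record_parse": {"did_parse"},
    "record_text": {"text"},
}

cohesive_score = lack_of_cohesion(cohesive_class)
grab_bag_score = lack_of_cohesion(grab_bag_like_class)
```

A class where every method touches overlapping fields scores zero, while a class whose methods each touch their own field scores higher, which is the pattern the quoted recommendation warns about.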
Enter The Container Block Processor and the Grab Bag
In trying to simplify and refactor the ContainerBlockProcessor class, I was faced with an interesting dilemma. While there are a handful of variables used by the class that can be grouped together, most of the variables denote a distinct action that was undertaken or a distinct measurement that was performed. As such, any attempt on my part to clean up the arguments being passed between functions would result in almost as many new classes as there were existing common arguments.
As someone who believes in using Best Common Practices, I believe that the Single Responsibility Principle and Object-Oriented Development are solid ways of creating and refactoring code. I did not like all the arguments being passed between the various functions in the class, but I could not find enough common responsibility between the variables to have a manageable number of classes that I could pass around instead of those arguments.
Enter the programming grab bag. In the physical world, a grab bag is a bag that can be grabbed that holds a mix of things that are probably not related, except for them being needed in an emergency. In the development world, the normal practice of maintaining a single responsibility for the class and cohesion within that class is suspended in favor of having one location for all variables related to the parent class. In this case, I created the ContainerGrabBag class to hold the various variables I collected from the arguments of the functions of the ContainerBlockProcessor class.
This decision was somewhat dangerous from a maintenance point of view for one simple reason: multiple responsibilities and low cohesion mean that understanding the flow of the parent object is going to be more difficult than it should be. However, since I was starting with arguments that were being passed up and down the function chain, I decided that the grab bag approach was going to be the better approach. To further enhance the maintainability of the grab bag, I made sure to log initial states and every change of state of any of the elements within the grab bag. I figured out that while I cannot reduce the count of elements in the grab bag, I can improve maintainability by clearly noting when any of the states change.
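The logging idea can be sketched roughly as follows. This is not the project's actual ContainerGrabBag class: the ExampleGrabBag name and both of its fields are hypothetical stand-ins, and only the pattern of logging the initial state and each later change is taken from the description above.

```python
import logging

LOGGER = logging.getLogger(__name__)


class ExampleGrabBag:
    """Grab bag sketch; both fields are hypothetical stand-ins."""

    def __init__(self) -> None:
        self.__indent_level = 0
        self.__did_process = False
        # Log the initial state once, so later changes have a baseline.
        LOGGER.debug("indent_level initial: %s", self.__indent_level)
        LOGGER.debug("did_process initial: %s", self.__did_process)

    @property
    def indent_level(self) -> int:
        return self.__indent_level

    @indent_level.setter
    def indent_level(self, value: int) -> None:
        # Only a real change of state is worth a log line.
        if value != self.__indent_level:
            LOGGER.debug("indent_level: %s -> %s", self.__indent_level, value)
        self.__indent_level = value

    @property
    def did_process(self) -> bool:
        return self.__did_process

    @did_process.setter
    def did_process(self, value: bool) -> None:
        if value != self.__did_process:
            LOGGER.debug("did_process: %s -> %s", self.__did_process, value)
        self.__did_process = value


bag = ExampleGrabBag()
bag.indent_level = 4  # logged as a change when debug logging is enabled
bag.indent_level = 4  # same value again, so nothing new is logged
```

Routing every write through a property setter means the log line cannot be forgotten at an individual call site, which is the main appeal of this kind of layout.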
But when I sat back and thought about it, the refactoring was needed. Based on my development principles, I would not have taken this route from the start, as I believed that I could find simple responsibilities that I could factor out from the arguments. I had that belief right up until the point when I decided that using a grab bag was the only way to solve the issue. In the end, it was a calculated move that one class with many variables and logging of any changes in those variables would be more maintainable than passing arguments around.
The Refactoring Took Many Weeks
Even factoring in my recovery over the last few weeks, this type of refactoring takes a long time. Adding the new variable to the ContainerGrabBag class was easy. That part of the refactoring took less than five minutes. I then had to scan for that variable throughout the ContainerBlockProcessor class and figure out whether each reference was referring to the “global” variable being passed around, or if it was a special case. Most of the extracted variables just referred to the “global” variable, but the ones that did not caused me enough concern that I took things slowly.
Slowly meant making a small set of changes, running ptest -m to execute the scenario tests, and then waiting for those tests to complete. If everything was fine, it was on to the next change in the search results. If not, I had to go back and figure out why the change failed and adjust for those results. Guessing the amount of time taken for each iteration of that loop, I would say it averaged about 3-4 minutes across the successes and failures. Multiply that time by the number of variables in the ContainerGrabBag class and the number of times that they occurred in the original ContainerBlockProcessor class, and that is a substantial duration of time. If I had to guess, that duration would be days, not hours.
Once that was all cleaned up, the other parts of the refactoring were less time-consuming. Since any state change was being logged as part of the ContainerGrabBag class, removing any lines that were in the original class to trace values was a simple operation. Then, going through the search results for the transferred object, I was able to quickly isolate arguments and return values that were no longer needed now that the value was in the grab bag. I usually cleaned up two or three functions at once, so the overhead of executing the scenario tests was not too expensive.
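A small before-and-after sketch may help show what removing those arguments and return values looks like. Every function, field, and class name here is hypothetical and chosen only for illustration; none of them come from the PyMarkdown code base.

```python
from dataclasses import dataclass


# Before: each value travels down as an argument and back up as a return value.
def handle_line_before(
    line: str, indent_level: int, did_process: bool
) -> tuple[int, bool]:
    indent_level, did_process = measure_indent_before(line, indent_level, did_process)
    return indent_level, did_process


def measure_indent_before(
    line: str, indent_level: int, did_process: bool
) -> tuple[int, bool]:
    stripped = line.lstrip(" ")
    if stripped:
        indent_level = len(line) - len(stripped)
    return indent_level, did_process or bool(stripped)


# After: the same values live in one shared object that is passed along.
@dataclass
class SimpleGrabBag:
    indent_level: int = 0
    did_process: bool = False


def handle_line_after(line: str, grab_bag: SimpleGrabBag) -> None:
    measure_indent_after(line, grab_bag)


def measure_indent_after(line: str, grab_bag: SimpleGrabBag) -> None:
    stripped = line.lstrip(" ")
    if stripped:
        grab_bag.indent_level = len(line) - len(stripped)
    grab_bag.did_process = grab_bag.did_process or bool(stripped)


before_result = handle_line_before("    text", 0, False)
shared_bag = SimpleGrabBag()
handle_line_after("    text", shared_bag)
```

Both versions compute the same values, but the second keeps the function signatures stable when another piece of state needs to be tracked.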
Finally, I was left with two sets of operations on the variable that I was working on: those that did actual work and those that were setting the variable to its default value. While that distinction may appear simple to make, it was not always like that. In cases where the variable was set in multiple locations, I had to comment out that set statement and verify that it was indeed setting the variable to the same value. And yes, that meant another set of scenario test runs.
What Is The End Result?
Based on a couple of attempts at debugging a couple of minor issues, I can verify that the newly reworked code is indeed easier to maintain. While I know distinct objects would be easier for me to model in my head while debugging, the logging of any state change to the log file helps mitigate that negative. It is a somewhat weird balance that I need to get used to, but I am getting used to it. Instead of keeping that information in my head, I am getting used to checking the previous lines in the logs to figure out when the states change and what they changed to.
It is an ongoing process, and it is working better, which is what my primary goal was. From that point of view, even an incremental improvement is a success.
Release 0.9.7
With the refactoring of the Container Block Processor class completed, I looked at the project history and realized that it had been over three months since I created a new release. While a month of that was covered by the crash and recovering from the crash, I had finished cleaning up the remaining nested container scenarios that I had added back in the February-March period. As such, I thought it was high time that I created a release.
There was not anything fancy added in the release, but for me it was important to release a more stable version of the project. I still have two more classes of scenarios to cover, but I was proud to have eradicated all the issues that I had found to this point. I also have faith that the remaining issues that I have found through random testing will be covered by the next groups of test scenarios that I will add.
How Am I Feeling?
To be blunt… almost back to normal. One of the things that my crash illustrated for me is that I need to work more diligently on a good balance in my life. Without that balance, I know it is only a matter of time before I get into another situation like the one that caused my crash. I need to feel okay about stepping away from my projects for a bit to clear my head.
But on the other side, I also need to make sure I am not taking too much time away from them either. I enjoy working on my projects, and they do require a certain level of focus to maintain my interest in them. After taking some time off, I am finding that it is more difficult to find that balance between too much project time and too little project time… with the focus on the too little side.
But other than that, I am feeling better physically and mentally, and I did enjoy spending time working on the projects this weekend. Well, before the fireworks started going off that is!