Summary
With almost a year of Python development under my belt, I wanted to start talking about the Python elements that I feel have made my development of the PyMarkdown project successful. When thinking about where to start, the first thing that came to mind was my use of PipEnv to maintain the project environment.
Introduction
While I am relatively new to Python, I am an old hat at trying to maintain a cohesive way to keep my projects usable yet portable. From my early days in C and Pascal to my current days in Go and Java, I have always tried to have a cohesive story around how to package my project’s source code to be used by multiple developers. In the end, I found that the main goals were always the same: how do I keep a project usable by all interested developers, while keeping the maintenance of that project to a minimum?
From my point of view, PipEnv meets those goals for Python by solving a lot of the common issues that project maintainers and developers have. This in turn makes two of the harder parts of developing programs mostly disappear: dependency management and portability. Installed into any Python system with the simple command pip install --user pipenv, PipEnv quickly became a useful tool in my toolbox and has stayed there. In this article, I will talk about why PipEnv keeps that position in my toolbox.
Quick Note
While I realize some projects may still be on Python version 2, this article is targeted at people developing on Python 3 and above. More precisely, these examples were all tested with Python 3.7.
What Is PipEnv?
While the full answer can be seen at the top of the PipEnv home page, my summary of that answer is as follows:
PipEnv is a Python module that cleanly manages your Python project and its dependencies, ensuring that the project can be easily rebuilt on other systems.
While that might seem like an easy thing to accomplish, PipEnv or tools like it are usually only employed after a journey through other, less efficient solutions.
To understand those solutions and the problems that they present, let’s start at the beginning with dependency management.
Starting with Dependencies
In Python, as with most modern languages, a lot of the heavy lifting is done in libraries. For Python, these libraries must be installed directly into the current instance of Python for those libraries to be visible to Python programs. The most common way of installing and managing those libraries locally is to use the pip tool. For example, to add the latest version of the colorama library to the local Python installation, the following command is used:

pip install colorama
But that command has a problem: it installs the specified library into the global instance of Python. Even when that problem is addressed using a user-specific variation of the command:

pip install --user colorama

the installed libraries are still shared by every Python program that is run as that user. That is a bit better, but still not optimal.
The big issue is that it is hard to replicate which libraries are installed for any given project without maintaining a per-project script with many lines in it, one for each library. With multiple projects sharing the same global cache of libraries, a clear way to enforce the libraries and versions needed for a specific project is needed.
Enter the requirements.txt file.
Explicitly Specifying Requirements as Code
The first step that many people take on their Python project journey is to create a requirements.txt file to hold all the library requirements for their project. The file has a format of:

colorama==0.4.3
Pygments==2.4.2

with each library being listed on its own line, optionally followed by the version of the library to install. To apply these library requirements to a given Python environment, the command:

pip install --user -r requirements.txt
is used. This usage of pip is a shortcut for calling pip install once for every library needed in the project. By this point in the project process, most developers understand that nailing down a specific version of the library being used is critical to the project's success. By specifying an exact version of the library to be referenced, the pip tool guarantees that it will always reference that specific version of the library, locking down the behavior of the library and making it predictable.
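For completeness, exact pins are not the only thing pip accepts in a requirements.txt file; looser version specifiers exist as well. The lines below are illustrative, not from any real project:

colorama==0.4.3     # exactly this version
Pygments>=2.4.2     # this version or anything newer
requests~=2.23      # any 2.x release that is at least 2.23

Anything other than == trades away that predictability, which is why pinning an exact version is the common advice.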
While the tool usage is simple enough, there are problems with using the pip tool in this way. The first problem is that to ensure that I have the right libraries for my program, I need to re-run that

pip install --user -r requirements.txt

command every time before I run my program. If I do not run that command, I risk the chance that another program has changed the libraries underneath my program, either causing the program to fail or rendering its output questionable. Even when I created a script to run the above command and my program together, the combination often felt slow, bothersome, and inelegant.
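For reference, that kind of wrapper script is nothing more than the two commands back to back; both file names here are hypothetical:

REM run.cmd: install the pinned libraries, then run the program
pip install --user -r requirements.txt
python my_program.py

It works, but it pays the cost of a full dependency check on every single run.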
The second problem is that of “phantom” dependencies. For argument’s sake, let’s assume that I am maintaining 2 projects and their dependencies. Project A has a dependency on Library 1 and Project B has a dependency on Library 2, with Library 1 and Library 2 being separate libraries. Furthermore, let’s assume that both projects use a requirements.txt file and the above pip install method to manage their project dependencies. Because of the way these files are applied, if Project B is used after Project A, it retains access to Library 1 that was installed by Project A. After all, with nothing to remove Project A’s dependencies, they stay in the user’s global cache of libraries. This means Project B’s dependencies are not clearly defined and may prove difficult to replicate on someone else’s machine.
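To make the scenario concrete, here is a minimal sketch using colorama and Pygments as stand-ins for Library 1 and Library 2; the file layout is hypothetical:

# project_a/requirements.txt
colorama==0.4.3

# project_b/requirements.txt
Pygments==2.4.2

# project_b/main.py
import colorama  # succeeds only because Project A installed colorama into
                 # the shared user cache; on a clean machine, this line
                 # raises ModuleNotFoundError

Nothing in Project B declares the colorama dependency, so the problem only shows up when Project B moves to a machine that never ran Project A’s install.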
Given those problems, how can a project move away from using global library caches?
Another Step Forward: Virtual Environments
The next step forward is to use the tools accumulated so far and to add virtual environments into that group of tools. Installed using the following command:
pip install --user virtualenv
the user can then run the following command to create a virtual environment under the current directory:
virtualenv venv
This command may take some time to complete, but once it is done, there will be a local venv directory (named after the argument passed to virtualenv) that contains a completely distinct copy of the Python installation that it was created from.
Even with that environment in place, there is some work to do before executing the project’s Python program. To use the virtual environment, an activation script must be executed to point the shell’s environment at the correct environment. Located in either the venv\Scripts directory (on Windows systems) or the venv/bin directory (on POSIX systems) is a group of scripts with the name activate or matching the pattern activate.*. Executing the correct script activates the virtual environment for a given shell, isolating any changes to the Python environment into that virtual environment.
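Assuming the environment was created with the name venv as above, activation and deactivation look roughly like this; pick the line that matches your shell:

venv\Scripts\activate          (Windows shells)
source venv/bin/activate       (POSIX shells, such as bash)
deactivate                     (either system, once you are done)

The deactivate command is defined by the activate script itself, which is why it is only available inside an activated environment.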
The usage of this tool seems beneficial so far, so what are the issues with it? The big issue for me is that you must remember to deactivate the environment before leaving the project folder. If you do not deactivate the environment before leaving the project’s directories, you can be doing something else in another directory and forget which directory you anchored that environment to. Without noticing it, a simple pip install command will then alter the requirements of that still-active environment, and not of the environment in the current directory. This is a realistic scenario. In my early days of Python development, I did this numerous times! And each time, it took a while to figure out what I had done and how to reverse it.
A smaller issue with these environments is that they are specific to the local system but are anchored within the project. This means that when dealing with version control systems such as Git, the project needs to explicitly ignore the files in the project’s venv directory to prevent local environment files from being committed. However, even with that directory ignored, the project requires extra scripts as part of its project code that specify how developers need to create the required virtual environment.
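For Git, that ignoring is a single entry in the project’s .gitignore file, assuming the environment was created as venv:

venv/

The extra creation scripts are the part that cannot be reduced to one line like that.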
Given those issues, how do we take the best parts of pip and virtualenv and merge them together?
Enter PipEnv
Installed using the following command:
pip install --user pipenv
PipEnv combines the best aspects of the previous concepts while getting rid of a fair number of the problems. To illustrate this, I created a sample project from scratch to say, “Hello world!” Setting up the project was easy. I created a new directory, changed my current directory to that directory, and entered the command:
pipenv --three
It took a while for the command to complete, but upon its completion I had a directory that contained the file Pipfile and a “hidden”1 virtual environment. To see the location of that virtual environment, I entered the command:

pipenv --venv
and PipEnv returned the path to the virtual environment. To be specific, it created a virtual environment that was not in my project directory, but in my local system’s user space. This meant that the virtual environment directory did not show up in any scans of the project directory, so I did not have to tell any version control system to ignore it. That was definitely a win!
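On my Windows machine, the path that came back looked something like the following; the name combines the project directory’s name with a hash derived from its path, so yours will differ:

C:\Users\{user}\.virtualenvs\sample-AbCd1234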
From there, I decided I wanted to add a splash of color to the program, brightening up a normally dull Hello World program. While a simple Hello World program would look like this:
print("Hello World!")
I decided to go with using the colorama library to add that color. Installing the colorama library into the current project was easy, using the following command:
pipenv install colorama
That command looked for the latest version of the colorama library, installed it in the virtual environment, updated the Pipfile, and generated a new Pipfile.lock file.
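For reference, the Pipfile that PipEnv maintains for this sample looks roughly like this; the python_version value depends on the interpreter PipEnv found:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]
colorama = "*"

[requires]
python_version = "3.7"

The "*" records that the project depends on colorama without pinning a version; the exact version that was installed is what gets pinned in the Pipfile.lock file.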
Once that was completed, I created the file main.py with the following contents:
from colorama import init, Fore, Back, Style
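# init() wraps stdout so the ANSI color codes also work on Windows consoles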
init()
print("Hello " + Fore.BLACK + Back.GREEN + "World!" + Style.RESET_ALL)
and saved that file. After a quick check for spelling and grammar mistakes, I executed that program with the command:
pipenv run python main.py
and I was greeted with this response:

(Screenshot of the output: “Hello ” in the terminal’s default colors, followed by “World!” in black text on a green background.)
To be clear, with 3 commands and 3 lines of Python code, I was able to create a simple program that uses a library to colorize its output and writes a single line of text to the console. An even bigger win for me was that I knew that if I needed this project to be portable, I could easily bundle up the source in the directory and recreate the project elsewhere. Having used PipEnv for months, this was not a theory for me; it was something that I had done in practice multiple times.
But unless you have performed that action, it may be hard to appreciate. So let’s prove it!
Proving That the Project Really Is Portable
To prove that the project is portable, I created a separate directory and copied the contents of my sample directory into that directory. While that is not exactly what happens when I clone a project from Git, I believe it is a close enough approximation for the purpose of this example. To be specific, there were only 3 files in the source directory for my sample project, and all of them were copied over: main.py, Pipfile, and Pipfile.lock.
To properly set up the project, I entered the directory and executed the command:
pipenv sync
After a while, control returned to my shell, with the pipenv output clearly detailing that it had created a new virtual environment and downloaded the needed libraries.
From there, entering the command:
pipenv run python main.py
yielded the same output as the example project in the original directory.
While the typical project will be more complicated than this small project, the underlying principles are the same. If a project uses PipEnv to manage library dependencies, the Pipfile and Pipfile.lock files become part of the project’s source and allow the project’s dependencies to be replicated in another directory or on another system.
Satisfied with the portability test passing, and not wanting to be a bad consumer of system resources, I then used the following command to remove the virtual environment from my local machine:
pipenv --rm
By using the pipenv --venv command both before and after this command, I was able to verify that the directory containing the virtual environment was removed from my local system.
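That check is short enough to show in full; the wording of the final message is paraphrased from memory, not exact output:

pipenv --venv    (before: prints the path to the virtual environment)
pipenv --rm      (removes the virtual environment)
pipenv --venv    (after: reports that no virtualenv has been created)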
Why Do I Like It?
I like using PipEnv because it is simple to set up, simple to maintain, and easy to use. I am lazy in that I want to have the right tools at my fingertips to do the simple and easy stuff for me. Having to remember to activate a virtual environment when I go into a directory and deactivate it when I leave was a headache. There were many times I forgot to do just that, and it caused a bit of chaos.
I feel that PipEnv keeps everything I need together and keeps it in the right place for me: the base of the project. If I need to run a Python script from the project, I use pipenv run python {script-name}.py, a format that makes sense to me. If I need to check the project’s dependencies, the pipenv graph command is there, with an intuitive output format. If I need to recreate the project in a new directory, the pipenv sync command is there, and it completes its job properly.
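As an illustration of that output format, pipenv graph prints each top-level dependency with its own dependencies indented beneath it. For a project that depends on requests, the output looks something like this, with illustrative version numbers:

requests==2.23.0
  - certifi [required: >=2017.4.17, installed: 2020.4.5.1]
  - chardet [required: >=3.0.2,<4, installed: 3.0.4]
  - idna [required: >=2.5,<3, installed: 2.9]
  - urllib3 [required: >=1.21.1,<1.26, installed: 1.25.9]

For the sample project above, the output is a single colorama line, as colorama has no dependencies of its own.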
Basically, it is a tool that I find useful and that I think makes sense.
Wrap Up
While there are many options out there for maintaining Python projects, the one that I prefer is PipEnv. Newer options exist, such as Poetry, but for projects that I rely on, I want something that has been thoroughly battle tested.
I feel that PipEnv is the best option that easily meets my requirements.
1. You’ll have to take my word that the output was the same. I am using the same picture for both, but that is only because the output was the same.
So what do you think? Did I miss something? Is any part unclear? Leave your comments below.