The use of virtual Python environments is highly recommended, but why? What is the link between the venv module and the virtualenv utility? And why are pip and setuptools playing hide and seek as I try to install them? Indeed, should they be installed at all?
This posting is about virtual Python environments on three different Linux systems.
- Ubuntu 17.10 to Ubuntu 18.04 (Mint 20.1) on a relatively powerful desktop computer.
- Raspbian 9.1 Lite (Stretch) to Raspberry Pi OS (Buster) on a Raspberry Pi 3.
- Armbian 5.31 (DietPi) on an Orange Pi Zero.
Table of Contents
- Python Virtual Environments
- Python 2 and 3 Installed in Ubuntu 17.10
- Installing Python 3 in Raspbian (DietPi)
- Creating Python 3 Virtual Environments
- Updating Python 3 Virtual Environments
- Automating with a Script
- Using a Virtual Environment
- Examples
- Installing pip
Python Virtual Environments
Most Python projects are not standalone monoliths. They use standard
libraries provided with Python itself. Often they also use third party
libraries. But that can be tricky. Python project 1 could require the
installation of version 2.18.4 of requests, a commonly used
HTTP library. What happens if Python project 2 decides to install the
older 1.2.1 version of requests? It could clobber the newer
version 2.18.4 and then Python project 1 might be broken. Virtual environments
are meant to solve this problem. Each virtual environment contains all
needed libraries for a Python project in a "box" that is independent of
all other virtual environments.
The "box" is just a directory that contains a few utilities, directories and copies of or links to the Python interpreter. In the current versions of Python (version 3.5 and newer) it is created with the venv module; in older versions of Python (prior to version 3.3) that was done with virtualenv. In between (Python 3.3 and 3.4) the script was called pyvenv.
At this point, reading A non-magical introduction to Pip and Virtualenv for Python beginners by Jamie Matthews would be a good idea. Just remember that the text was written in 2013 and that since then venv has become preferred to virtualenv in Python 3.x (among other things, virtualenv copies the interpreter binaries to the virtual environment while venv creates links). Also, I will not be installing pip et al. globally, but only in virtual environments. More about that later.
The Python package management system is called pip (pip installs packages). It is to Python what APT (Advanced Package Tool) is to Debian. Funny thing though, pip and apt-get can get all tangled up. Like apt, pip installs Python modules from an "official" repository, PyPI, nicknamed the Cheese Shop after a well-known Monty Python sketch. Pip also installs packages found elsewhere, such as GitHub repositories.
The package installation tool uses Setuptools. For the most part this will be transparent and we do not have to look at the details.
The third of the basic elements is Wheel. It is a "built-package format for Python". Essentially, a wheel is an appropriately named archive containing all the files needed to install a package, set up in a specific hierarchy. The wheel project also created a tool that is included in setuptools and which is used by pip to actually perform the installation of a package supplied as a wheel.
Python packages themselves are divided into two categories: system and site. The first refers to packages that are part of the standard Python library, while the latter refers to packages from third parties. These will be installed in the activated virtual environment automatically by pip.
Python 2 and 3 Installed on Ubuntu 17.10 and Raspbian 9.1
Ubuntu 17.10 comes with two versions of Python installed: 2.7.14 and 3.6.3. To quote the python.org wiki, "Python 2.x is legacy, Python 3.x is the present and future of the language". One would think that version 3 can handle any version 2 code. Generally speaking, that is true, but not always. Version 3 of Python introduces new keywords. If a version 2 script used any of these words as variable names, then these will have to be changed. Also, some version 2 libraries may not have been ported to version 3. I am new to Python, so I have no investment in the older version and will be using version 3.
In /usr/bin, python and python2 are symbolic links (aliases if you prefer) to the executable python2.7. Similarly, python3 is a symbolic link to the executable python3.6.
The virtual environment utility virtualenv, which is compatible with versions 2 and 3 of Python, is not available, nor is venv, which is the version 3 replacement. Similarly, the package managers pip and pip3 are not installed by default.
Strangely, there is another Python executable, python3.6m (alias python3m), which is exactly the same size as python3.6. A binary comparison of the two reveals no difference in content, yet they are distinct files since they point to different inodes. Python Enhancement Proposal (PEP) 3149 provides a partial explanation:
- --with-pydebug (flag: d)
- --with-pymalloc (flag: m)
- --with-wide-unicode (flag: u)
By default in Python 3.2, configure enables --with-pymalloc so shared library file names would appear as foo.cpython-32m.so. When the other two flags are also enabled, the file names would be foo.cpython-32dmu.so.
I interpret that to mean that by default version 3.6 is compiled using pymalloc for heap memory allocation instead of the system malloc. Accordingly, shared library files should end in 36m.so. Actually, this naming convention is even more complex. Hopefully, this will not prove important in the future.
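The flags and the full suffix can be read from a running interpreter. This is a minimal sketch using only the standard library; the example suffix in the comment is what I would expect on a 64-bit Ubuntu with Python 3.6, and newer interpreters (3.8 and later) no longer carry the m flag.

```python
import sys
import sysconfig

# ABI flags compiled into this interpreter ("m" on a Python 3.6 built with
# pymalloc). The attribute only exists on POSIX builds, hence getattr.
abi_flags = getattr(sys, "abiflags", "")

# Full suffix used for compiled extension modules, for example
# ".cpython-36m-x86_64-linux-gnu.so" on a 64-bit Linux system.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

print("ABI flags:", repr(abi_flags))
print("Extension suffix:", ext_suffix)
```

The longer suffix, with the platform triplet added, is the extra complexity alluded to above.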
The situation is almost the same on Raspbian. Both legacy Python and a current Python interpreter are included in the distribution, although they are slightly older versions. Also, here it seems that python3.5m and python3.5 are hard links to the same inode.
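The two situations (identical copies on Ubuntu, hard links on Raspbian) can be told apart with a few lines of Python. The helper below is a sketch of my own; the function name describe_pair is made up for the illustration.

```python
import filecmp
import os


def describe_pair(path_a, path_b):
    """Say how two paths relate: same inode, identical copies, or different."""
    # os.stat follows symbolic links, so a link and its target also
    # count as the same inode here.
    stat_a, stat_b = os.stat(path_a), os.stat(path_b)
    if (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino):
        return "same inode"
    # shallow=False forces a byte-by-byte comparison of the contents.
    if filecmp.cmp(path_a, path_b, shallow=False):
        return "identical content, different inodes"
    return "different content"
```

On the systems described here it would be called as describe_pair("/usr/bin/python3.6", "/usr/bin/python3.6m") or with the python3.5 pair on Raspbian.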
On both of these systems, a virtual environment cannot be created out of the box. A package, python3-venv, must be installed first. python3-dev will be installed as well since it will be needed for the example to follow.
Both packages were installed with apt-get on the Raspberry Pi. The same procedure was followed on Ubuntu.
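Whether the pieces needed for virtual environments are present can be checked from Python itself. This is only a sketch: it assumes that the missing package shows up as a missing venv or ensurepip module, which is my understanding of how Debian-derived distributions split the standard library.

```python
import importlib.util


def missing_modules(names=("venv", "ensurepip")):
    """Return the modules from names that the interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Not available:", ", ".join(missing))
else:
    print("venv and ensurepip are both available")
```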
Installing Python 3 in Raspbian (DietPi)
DietPi is a pared-down Linux server distribution (Armbian) available for many single board computers, such as the different models of the Raspberry Pi, Orange Pi and Banana Pi, that are meant to be used as servers. I am using it on an Orange Pi Zero. This distribution does not contain Python, which must be installed with the apt-get utility.
It is simple to verify that Python 3.4.2 is installed in the /usr/bin directory and that a legacy version of Python is not present. As with Ubuntu, pip is not installed, which I guess is the norm for Debian-based distributions.
Creating Python 3 Virtual Environments
Creating a virtual environment is quite simple. Use the venv module, giving it the name of the virtual environment. There will not be a confirmation; the command prompt will simply reappear after a while.
It is possible to create an environment in a pre-existing directory. It looks like the venv module checks for the presence of the file pyvenv.cfg when asked to do so. My cursory test shows that if the cfg file is present, venv will do nothing, but if the cfg file is not found, then the virtual environment will be created without disturbing files that may have existed in directories named bin, include, etc.
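The venv module can also be driven from Python instead of the command line. The sketch below makes the pyvenv.cfg check explicit in its own code rather than relying on what venv does internally; create_if_absent is a name I made up, and with_pip=False is used only to keep the example quick.

```python
import tempfile
import venv
from pathlib import Path


def create_if_absent(env_dir):
    """Create a virtual environment unless env_dir already holds pyvenv.cfg."""
    env_dir = Path(env_dir)
    if (env_dir / "pyvenv.cfg").exists():
        return False  # looks like an existing environment, leave it alone
    # with_pip=False keeps the sketch fast and independent of ensurepip.
    venv.create(str(env_dir), with_pip=False)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "env"
    first = create_if_absent(target)    # creates the environment
    second = create_if_absent(target)   # finds pyvenv.cfg and skips
    cfg_text = (target / "pyvenv.cfg").read_text()

print(first, second)
print(cfg_text)
```

The pyvenv.cfg file printed at the end records, among other things, the home directory of the interpreter used to create the environment.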
The site packages installed in the virtual environment by Python 3.6 in Ubuntu 17.10 include pip and setuptools.
The virtual environment created by Python 3.5 in Raspbian is quite similar, but strangely the version of setuptools seems newer.
The virtual environment created by Python 3.4 in Armbian is slightly different and the versions of pip and setuptools are considerably older.
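For comparing environments like this, the installed site packages and their versions can be listed from within Python. This sketch relies on importlib.metadata, which only exists in Python 3.8 and newer, so it would not run on the interpreters discussed here; installed_packages is a hypothetical helper name.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_packages():
    """Return sorted (name, version) pairs for the current environment."""
    pairs = {
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
    }
    return sorted(pairs, key=lambda pair: str(pair[0]).lower())


for name, version in installed_packages():
    print(name, version)
```

Run with the Python interpreter of an activated environment, it reports what that environment contains rather than what the system has.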
Updating Python 3 Virtual Environments
The Python Software Foundation recommends updating the pip and setuptools modules installed in a newly created virtual environment. It also suggests installing wheel. All this can be done with pip itself.
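As a sketch of that update step, the command can be assembled in Python. The Linux layout (bin/python inside the environment) is assumed and upgrade_command is a name invented for the example.

```python
def upgrade_command(env_dir, packages=("pip", "setuptools", "wheel")):
    """Build the command that upgrades the basic tools inside env_dir."""
    # Running pip as "python -m pip" lets pip replace its own files safely.
    return [f"{env_dir}/bin/python", "-m", "pip", "install", "--upgrade", *packages]


print(" ".join(upgrade_command("env")))
```

The resulting list can be handed to subprocess.run(..., check=True) to actually perform the update; it is only printed here so that nothing is downloaded.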
On Ubuntu and Raspbian, which use version 9.0 of pip, only setuptools will be updated as pip is already the most recent version available. Both wheel and the newer version of setuptools will be downloaded for installation.
After that first download, pip will use cached copies of the whl archives and will not need to download them again.
The downloaded whl archives are saved in the pip cache found in the default directory ~/.cache/pip. It may be elsewhere on a Linux system if the environment variable XDG_CACHE_HOME is defined.
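The lookup rule just described can be written down as a small function. This is a sketch of the rule only, not pip's own code; it ignores other ways of moving the cache, such as pip's --cache-dir option.

```python
import os
from pathlib import Path


def pip_cache_dir(environ=None):
    """Return pip's default cache directory on Linux, per the XDG convention."""
    environ = os.environ if environ is None else environ
    base = environ.get("XDG_CACHE_HOME")
    if base:
        return Path(base) / "pip"
    return Path.home() / ".cache" / "pip"


print(pip_cache_dir())
```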
The behaviour is exactly the same in Raspbian. Unfortunately, there is no caching in Armbian, which is probably related to the older installed versions of Python, pip and setuptools. This is easily verified by creating and updating a virtual environment and then doing it a second time.
There is a way to avoid all those downloads. First use the updated pip to download the most recent pip, setuptools and wheel archives to a local directory.
Now let's create a virtual environment and update it from these local sources. The computer could be disconnected from the network.
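The two steps (download once, then install from the local directory) might be sketched as follows. The function names are hypothetical and the directory .local/venvtools is borrowed from the mkvenv script in the next section; only the pip options themselves (download --dest, install --no-index --find-links) are real.

```python
def download_command(dest, packages=("pip", "setuptools", "wheel")):
    """Command that saves the newest archives of packages into dest."""
    return ["pip", "download", "--dest", dest, *packages]


def offline_upgrade_command(env_dir, source,
                            packages=("pip", "setuptools", "wheel")):
    """Command that upgrades env_dir using only the archives in source."""
    return [
        f"{env_dir}/bin/pip", "install", "--upgrade",
        "--no-index",               # never contact the package index
        f"--find-links={source}",   # look for archives in this directory only
        *packages,
    ]


print(" ".join(download_command(".local/venvtools")))
print(" ".join(offline_upgrade_command("env", ".local/venvtools")))
```

The first command needs a network connection and an already updated pip; the second does not touch the network at all.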
There is one problem with this approach. As far as I know, there is no check for newer versions of any of the three packages in PyPI. Every now and then it may be worthwhile to either reload the locally cached versions or to update from the Cheese Shop.
Automating with a Script
Since it would be best to always update a newly created virtual environment, it seemed like a good idea to create a simple script to do both steps at once. Accordingly, I created the directory .local/bin in my home directory (in my case, the full path is /home/michel/.local/bin). I then added that directory to the search path in the .profile file in my home directory.
This will only take effect on the next login to my session. I then created the simple mkvenv script in the newly created directory.
Then the file is made executable.
After that, a one-line command will create and update a virtual environment (don't forget you have to restart the current session for the search path to be updated).
I repeated the same steps in Armbian except for the addition of an option in the line that performs the update in mkvenv.
#!/bin/bash
python3 -m venv $1
$1/bin/pip install -U --no-index --find-links=.local/venvtools pip setuptools wheel
Of course, that was too simple. I modified these scripts so that they can create multiple virtual environments at once. I also added a help option (-h or --help) and, for the Armbian version, an option (-g or --get) to ignore the cache and get the packages from PyPI. To make things as simple as possible, the option can be used but it will be ignored in the Ubuntu version. These can be downloaded from these links.
The first time these scripts are used, a "Cache entry deserialization failed, entry ignored" warning may be issued. It should not appear thereafter.
Following a suggestion by AndyG, I added a shell function and alias in my .bashrc file.
If you look at the topic, you will see what rookie mistake I made. With this shell function I can activate any virtual environment in my present working directory with a simple command.
Where ve stands for (activating a) virtual environment, the reversed letters ev perform the opposite action: deactivate.
Before I continue adding stuff like this I should look into virtualenvwrapper.
Using a Virtual Environment
With all that precedes, the final bit is something of an anticlimax. There really is not much to using virtual Python environments. As seen above, a virtual environment named env is a directory named env created in the working directory when venv was invoked. When activated, it changes the prompt to indicate its presence.
A virtual environment is deactivated with the deactivate command. The environment prefix is then removed.
Activating the virtual environment means that the command python will now invoke python3. This is done with symbolic links and modification of the search path and probably more tricks. This is how it becomes possible to have other versions of Python installed in other virtual environments and to use any version without interference.
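From inside Python, it is possible to check which interpreter is running and whether it belongs to a virtual environment. This sketch uses the sys.prefix and sys.base_prefix attributes that venv relies on; it does not detect environments made by old versions of virtualenv, which used a different mechanism.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv environment."""
    # venv leaves base_prefix pointing at the original installation while
    # prefix points at the environment directory.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)
print("in a venv:  ", in_virtual_environment())
```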
Removing a virtual environment is as simple as deleting the directory.
Do not rename a virtual environment or any directory that contains it. The reason is that absolute paths are used. See Don’t Rename Your Virtualenv Projects by Justin Iso and read the discussion Renaming a virtualenv folder without breaking it at Stack Overflow.
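The absolute paths are easy to see for oneself. The sketch below creates a throwaway environment in a temporary directory (without pip, to keep it quick) and prints the lines of its activate script that contain the full path of the environment; a Linux layout with bin/activate is assumed.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "env"
    venv.create(str(env_dir), with_pip=False)
    activate_text = (env_dir / "bin" / "activate").read_text()

# Lines of the activate script that mention the environment's full path.
hard_coded = [line.strip() for line in activate_text.splitlines()
              if str(env_dir) in line]
for line in hard_coded:
    print(line)
```

Once the directory is renamed, a line such as the VIRTUAL_ENV assignment points to a place that no longer exists, which is why activation breaks.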
Someone at the University of Washington has written An Introduction To Venv, which I have found useful. I will again mention the introduction by Jamie Matthews, A non-magical introduction to Pip and Virtualenv for Python beginners.
Examples
As an example of using a virtual environment, I will install Pocketsphinx, which is part of CMUSphinx, an Open Source Speech Recognition Toolkit by a team at Carnegie Mellon University. There are many computer language bindings. The Python binding pocketsphinx-python by Dmitry Prazdnichnov (bambocher) is easily installed with pip from PyPI. The source is available from a GitHub repository.
There are a few requirements that need to be installed before setting up the virtual environment: python3-dev (if not already done), swig and libpulse-dev. All this is done with apt-get.
Time to create the virtual environment that will contain our project. With remarkable imagination, I named it pocketsphinx.
Now we are ready to install the Python package pocketsphinx from the Cheese Shop.
Two site packages are installed in the lib/python3.6/site-packages directory: sphinxbase and pocketsphinx. These include two shared libraries and two Python scripts.
There is a model directory in pocketsphinx that contains the en_US language model. No other language model is installed although Pocket Sphinx is multilingual.
Following the LiveSpeech instructions by Dmitry Prazdnichnov, the livespeech.py script is created in ~/pocketsphinx.
I made the script executable (chmod +x livespeech.py) and then I connected a USB headset to the computer and checked in Settings that it was the default source and output sound device and that it worked correctly. Then I ran the script, which printed out more or less what I was saying.
Then, on a lark, I tried talking in French, saying
-- Bonjour, quelle est la température?
-- Bonjour
-- Bonjour?
and got the following
Is that what French sounds like to a unilingual anglophone with no prior exposure to la langue de Molière?
Presumably, I could have called the virtual environment something like env and I could have installed pocketsphinx in a separate directory, with the env activated. Right now, I am working on the hypothesis that this would be a mistake. At the very least, I would have to add a text file in the two directories to remember how they are linked. Why not keep both the Python virtual environment and the Python project in the same directory?
As a second example, I installed the speech recognition module for Python by Anthony Zhang (Uberi). It is available from PyPI so it could be installed just like pocketsphinx above. Instead, I cloned the git repository into the ~/Development/python directory.
Before doing that, there was a new system requirement to fulfill.
I then installed a virtual Python environment in that same directory.
As recommended by Jamie Matthews, I edited the .gitignore file by adding the virtual environment directories to the list of items to be ignored by git.
The next step, installing PyAudio, which contains the Python bindings for PortAudio, needed to be done in the virtual environment.
PocketSphinx-Python is another requirement that needs to be installed. The installation went very quickly, probably because it was already built in the previous virtual environment.
Finally, it was time to install Speech Recognition itself.
Unfortunately, the scripts found in the examples directory did not work even though all the unit tests passed, including the two Sphinx tests and the Google Speech Recognition test.
This was surprising as I had tried installing the library from the PyPI repository and it had worked. Looking at the git log, it became obvious that there had been changes made since the upload to the Cheese Shop. So I rolled back to older versions until I found one where the library worked.
I will discuss installing these two packages in Armbian in a future post as it is not quite as straightforward.
The instructions about installing other languages are not up to date, in the sense that the "simple Bash script" does not have the correct address for the language and, even if it had, I do not think it would extract a downloaded language pack to the correct directory. The links are working, so download any desired language pack. Then extract the enclosed directory with the typical locale identifier, such as fr-FR or it-IT, to the following directory
speech_recognition/lib/python3.6/site-packages/speech_recognition/pocketsphinx-data
The new language pack should now appear alongside the default en-US language pack.
To use the new language, add the parameter language="fr-FR" in the r.recognize_xxxx(audio, ...) calls. Here are two examples from the microphone_recognition.py script:
By the way, adding the parameter before extracting the language pack is a simple way of finding out where the pack should go. Run the modified script and it will complain.
I was able to check that the French language pack can be used with Sphinx, Google Speech Recognition and Bing Voice Recognition. That does not mean it does not work with the other engines, I simply have not tested.
Installing pip
By default pip is included with Python if the latter is installed from source. However, in Debian it is not present by default. It can be installed with the apt-get utility. The latest version available from the Ubuntu 17.10 repositories is quite recent.
I no longer have Ubuntu 14.04 running, but I distinctly recall that only a much older version was available. That was a problem because when an apt-get installed version of pip is used to update itself, it gives up saying the package belongs to the operating system.
According to the Python Software Foundation (Ensure you can run pip from the command line) it may be possible to "bootstrap" an installation using ensurepip.
That module is available in both installed versions of Python on Ubuntu 17.10 and Raspbian 9.1. To be more accurate, ensurepip for Python 2.7 was installed by default; it was not there initially in Python 3 and was probably added along with the python3-dev package.
The commands would be something like this.
Actually, I would assume that the sudo prefix will be necessary as root is the owner of /usr/bin where at least pip should live. I would install the Python 3.6 version first and change the name of pip to pip3 if necessary, to ensure that the latter does not get clobbered when pip for Python 2.7 is installed.
I have not done this, so there is no guarantee that it would work. It looks as if it may be possible to add pip, setuptools and wheel to the system Python but I decided not to do it for the time being. I want to force myself to use virtual environments in a systematic fashion. As a side benefit, not having pip installed in the system means that it is impossible to install a Python module as a system package by mistake. Just think about how similar the two look when pip is found in the /usr/bin directory (or any other directory in the search path).