How Not to Use Python Environments
Why This Topic?
When I first began learning Python, it was not with the intention of using it for development. I used it as a flexible platform for computing and data visualization. Since these were projects where sharing code with other people wasn’t required, I stayed pretty ignorant about environments.
However, working in a professional setting where code is commonly shared does require environment sharing and isolation. I’ve been finding myself slipping back into my old ways of global library installs out of simplicity and lack of comprehension regarding computer science concepts. So, this post will outline what I’m working on learning with Python environments and how I’ve tried to implement it.
Obviously, I am no expert. So if I say anything incorrect, corrections are massively appreciated. I’d also like to thank Fox for his availability for questions and input on the topic.
Notice: Do not read this journey of learning if you’re looking for a happy ending; it’s a work in progress.
Notice 2: The terminal commands listed are for Mac. Windows users beware: they don’t translate over exactly.
Notice 3: Fun drinking game, take a drink every time I write “environment.”
Basics - Definitions
As noted above, I can have a hard time grasping computer science concepts quickly. So, I started at the very basics in my pursuit of understanding environments by looking at general definitions.
Definition of dependency: This is a file that needs to be accessible in order to run the piece of code.(1) Working with Python, this is usually a library like NumPy or Pandas.
Definition of package: A collection of dependencies that are grouped together for ease of use.(1)
Definition of dependency/package manager: A tool that helps to manage and organize libraries/packages in a logical way. It helps keep libraries updated and provides an easy way to add/remove packages.(1) In Python, the main package manager is called pip.
Definition of development environment: “…the development environment is the set of processes and programming tools used to create the program or software product.”(2) So, things like the version of the coding language and any libraries installed and their versions. The sum of the parts.
Definition of integrated development environment: An environment “…in which the processes and tools are coordinated to provide developers an orderly interface to and convenient view of the development process.”(2) An example of one would be Eclipse which has a debugger, compiler, auto-completion for code, and keeps track of installed libraries.
Cite 1: https://teamtreehouse.com/community/what-is-a-dependency-or-package-manager
Cite 2: http://searchsoftwarequality.techtarget.com/definition/development-environment
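To make the “development environment” definition a little more concrete, here is a minimal sketch that prints the two things it mentions: the Python version and the installed libraries with their versions. It assumes Python 3.8 or newer (for `importlib.metadata`), and the function name `describe_environment` is made up for illustration.

```python
import sys
from importlib import metadata


def describe_environment():
    """Return the interpreter version and the installed distributions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that have no recorded name
            packages[name] = dist.version
    return {
        "python_version": sys.version.split()[0],
        "prefix": sys.prefix,  # root folder of the active environment
        "packages": packages,
    }


env = describe_environment()
print("Python", env["python_version"], "at", env["prefix"])
for name in sorted(env["packages"], key=str.lower):
    print(f"{name}=={env['packages'][name]}")
```

Running this in two different environments and comparing the output is a quick way to see what “the sum of the parts” means in practice.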
Python Tools to Isolate Environments
There are two main tools in Python that can be used to isolate environments: virtualenv and conda.
virtualenv
The older of the two, virtualenv has the appeal of being tried and true. It’s used in the vast majority of Python tutorials (the ones that mention environments, that is), and its export file, requirements.txt, is widely supported when deploying applications.
conda
The new kid on the block. Conda is the isolation tool of choice for Anaconda, a very popular Python and R distribution. It gets points for being easy to use, having easy-to-remember commands, and coming with basic data science libraries installed automatically. This is the tool I used below to disastrous results. However, that is likely due to me using it incorrectly and should not be considered a stain on its record.
Initial cleaning of global libraries
To start my journey, I figured that since I had heavily polluted my global environment by installing every Python package under the sun, I would try to clean it up a bit by uninstalling some libraries.
After some googling, I found that this is a bad thing to attempt without experience and seems to have led to nothing but harm and sorrow for many people. In fact, using isolated environments is recommended as a solution to a polluted global environment. So, for now, I will leave it polluted.
Attempt to run a Jupyter Notebook in a fresh environment.
I thought a good starting project would be to take a Jupyter notebook and test it in a new environment. The idea was that I could tell the environment was isolated if the notebook failed to run because its required libraries weren’t installed. A form of test-driven development.
As stated above, I decided on conda as my environment tool of choice. First, I navigated to the directory of the notebook and called the command to create a conda environment.
```
$ conda create --no-default-packages -n coolEnv
$ conda activate coolEnv
(coolEnv) $
```
This created an environment without the anaconda packages that are normally automatically installed. This was done to make sure the notebook would fail. Then, I activated the environment so that I was in it.
Cool cool, all that's left is to try to run jupyter notebooks. Because it is not installed in the current environment, jupyter notebooks shouldn't even open!
(coolEnv) $ jupyter notebook
But it did. Loaded right up. Okay, maybe I'm missing something and that's normal? The cells definitely shouldn't run because they contain libraries that aren't installed in this environment. The cell of libraries I tried to run is listed below.
```Python
from heapq import nlargest

import numpy as np
import pandas as pd
from scipy.stats import zscore
from vincent.colors import brews

# Silence the pandas warning about chained assignment.
pd.options.mode.chained_assignment = None
```
So I ran the cell. Ran just fine. Huh. I exported the environment to the conda env export file, environment.yml, just to be sure it didn’t have the libraries.
(coolEnv) $ conda env export > environment.yml
name: coolEnv
channels:
- anaconda-fusion
- defaults
prefix: /Users/rferguson/anaconda/envs/coolEnv
It did not. It had nothing.
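A check that might save some head-scratching at this point is asking the running notebook which interpreter it is using. The sketch below is meant to be pasted into a notebook cell; `locate` is a hypothetical helper name. If the first path printed is not inside the environment’s folder (the `prefix` in the export above), the cell is running on a different Python than the one activated in the terminal.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


print("interpreter:", sys.executable)
print("environment:", sys.prefix)
for name in ("pandas", "numpy", "scipy"):
    print(f"{name}: {locate(name) or 'not installed here'}")
```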
Jupyter NB Try 2: This time, it’s kernels
After more of the google machine, I ran across a couple of people with the same problem I was having: Jupyter Notebook (JN) always seemed to access a global copy. JN was apparently a bad application to start doing virtual environments with because of the interesting way it is set up. It doesn’t seem to care about environments, only about which kernels are available to it. To use an environment as a kernel, one can install ipykernel into the isolated environment and then call a command to point the current kernel to that environment. Sounds like a plan!
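The point about kernels can be checked directly. Each kernel Jupyter knows about is described by a `kernel.json` file whose `argv` entry starts with the Python executable that the kernel launches. The sketch below reads those files. It assumes user-level kernels live in `~/Library/Jupyter/kernels` (the usual location on a Mac; other systems differ), and `kernel_interpreters` is a name made up for this example.

```python
import json
from pathlib import Path


def kernel_interpreters(kernels_dir):
    """Map each kernel name to the executable its kernel.json launches."""
    result = {}
    root = Path(kernels_dir)
    if not root.is_dir():
        return result
    for spec_file in sorted(root.glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        argv = spec.get("argv", [])
        result[spec_file.parent.name] = argv[0] if argv else None
    return result


# Assumed macOS location for kernels installed with --user.
user_kernels = Path.home() / "Library" / "Jupyter" / "kernels"
for name, executable in kernel_interpreters(user_kernels).items():
    print(f"{name}: {executable}")
```

If the executable listed for the default kernel points into one particular environment, every notebook using that kernel runs there, whichever environment the terminal has active.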
First, I installed jupyter in the conda environment to lessen the chance of changing my global copy. Then, I installed and called the ipykernel, setting the current kernel to the coolEnv.
(coolEnv) $ conda install jupyter
(coolEnv) $ conda install ipykernel
(coolEnv) $ python -m ipykernel install --user
(coolEnv) $ jupyter notebook
Ready to go! Once notebooks is up, I go to run the cell and *gasp* I get a library import error for pandas! Hurrah! A quick look at the new environment.yml file shows that pandas is indeed not installed.
name: coolEnv
channels:
- anaconda-fusion
- defaults
dependencies:
- appnope=0.1.0=py36hf537a9a_0
- bleach=2.1.1=py36h27c13d8_0
...
- ncurses=6.0=hd04f020_2
- notebook=5.2.2=py36h124cd7f_0
- openssl=1.0.2n=hdbc3d79_0
- pandoc=1.19.2.1=ha5e8f32_1
- pandocfilters=1.4.2=py36h3b0b094_1
...
- pip:
  - flake8==3.5.0
  - mccabe==0.6.1
  - pew==1.1.0
  - pipenv==8.3.2
  - pycodestyle==2.3.1
  - virtualenv-clone==0.2.6
prefix: /Users/rferguson/anaconda/envs/coolEnv
That’s awesome! Now, just a quick change back out of the environment just to be sure that nothing was changed globally.
(coolEnv) $ source deactivate
$ jupyter notebook
Aaaand, another library failure. The kernel is still set to the coolEnv. Cool. Changing the kernel in the environment changed the application globally. I’m sure I’m missing something here. I quickly changed the kernel back to normal to avoid any failures later.
$ python -m ipykernel install --user
I’m guessing this happened because I didn’t use the word conda in the call to set the kernel. But, still not sure.
Searching further for phrases like “all environments accessing global jupyter” turned up a couple of unsolved questions about it.
So, this method kind of works, if you want to call the above command every time you switch environments.
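One possible explanation, offered as an assumption rather than a confirmed diagnosis of the setup above: when no name is given, `ipykernel install --user` registers the kernel under the default name (`python3` on Python 3), so each call replaces the same user-wide entry. Passing `--name` and `--display-name` registers a separate kernel per environment instead. The sketch below only builds the command; `kernel_install_command` is a hypothetical helper, and running the result is left to the reader.

```python
import sys


def kernel_install_command(env_name, display_name=None):
    """Build an ipykernel install command that registers a named kernel."""
    return [
        sys.executable, "-m", "ipykernel", "install",
        "--user",
        "--name", env_name,
        "--display-name", display_name or f"Python ({env_name})",
    ]


command = kernel_install_command("coolEnv")
print(" ".join(command))
# To actually register the kernel, from inside the activated environment:
#   import subprocess; subprocess.run(command, check=True)
```

With a named kernel per environment, the choice is made from the kernel menu in the notebook rather than by re-running the install command on every switch.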
Jupyter NB Try 3: Gettin’ Desperate
As my last conda try for notebooks, I installed the nb_conda_kernels library. This library displays all available environments in Jupyter notebooks as kernels that you can select. However, its support was mainly for Python 2.x and is being phased out. Worth an attempt though.
$ conda activate coolEnv
(coolEnv) $ conda install nb_conda_kernels
(coolEnv) $ jupyter notebook
Hurray, all conda environments on my computer are now visible on jupyter notebooks when I go to change kernels! This means I can run any environment that I have created in conda.
Downside? I have created a lot. Right now, from testing and previous projects, I have nine, and they are only labelled by name, not by the libraries they have available or their Python versions. So, there would likely be a lot of environments to sift through to find the one that I currently have activated.
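Since the kernel list shows only names, a small script can fill in the missing detail. This sketch relies on conda recording each installed package as a JSON file in the environment’s `conda-meta` folder. The `envs` path is taken from the `prefix` line in the export above, and `env_python_versions` is an illustrative name, not part of conda.

```python
import json
from pathlib import Path


def env_python_versions(envs_dir):
    """Map each conda environment name to its recorded Python version."""
    versions = {}
    root = Path(envs_dir)
    if not root.is_dir():
        return versions
    for env in sorted(p for p in root.iterdir() if p.is_dir()):
        versions[env.name] = None  # stays None if no python package is recorded
        meta_dir = env / "conda-meta"
        if not meta_dir.is_dir():
            continue
        for record in meta_dir.glob("python-*.json"):
            meta = json.loads(record.read_text())
            # The glob also matches packages such as python-dateutil.
            if meta.get("name") == "python":
                versions[env.name] = meta.get("version")
    return versions


# Path assumed from the export above; adjust for your own install.
for name, version in env_python_versions("/Users/rferguson/anaconda/envs").items():
    print(f"{name}: Python {version or 'not installed'}")
```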
Upside? After deactivating the environment and viewing JN, the only kernels available were my standard kernels. So, the install did not globally pollute JN, which is fantastic!!
Jupyter NB Try 4: How about we forget conda
Moving on to conda’s older sister: virtualenv is worth a quick shot to wrap up this exciting journey.
(coolEnv) $ source deactivate
$ virtualenv coolVE
$ source coolVE/bin/activate
(coolVE) $
From what I can tell now, one of the ways virtualenv differs is that it creates an environment directory in your current folder, named after the environment. Also, it relies solely on pip for library installs and management.
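A quick way to confirm that the Python you are running really is the virtualenv’s copy, sketched under the assumption of a standard virtualenv or venv setup: older virtualenv releases set `sys.real_prefix`, while newer ones (and the built-in `venv`) make `sys.prefix` differ from `sys.base_prefix`. Note that this check does not detect conda environments. `in_virtualenv` is a name chosen for this example.

```python
import sys


def in_virtualenv():
    """Return True when this interpreter belongs to a virtualenv or venv."""
    if hasattr(sys, "real_prefix"):  # set by older virtualenv releases
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("inside a virtualenv:", in_virtualenv())
print("environment folder:", sys.prefix)
```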
The export from this environment isolation tool should be stored in a file called requirements.txt.
(coolVE) $ pip freeze > requirements.txt
The initial export is empty.
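Once packages are installed, the file fills up with `name==version` lines. As a rough sketch of how such a file can be used to check an environment, the code below parses pinned lines and reports the ones that are missing, or installed at a different version, in the current interpreter. It assumes Python 3.8 or newer and only handles the simple `name==version` form that `pip freeze` usually writes. The helper names are invented for illustration.

```python
from importlib import metadata


def parse_pins(text):
    """Extract {name: version} from simple 'name==version' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def unmet_pins(pins):
    """Return pins that are not installed, or installed at another version."""
    unmet = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            unmet[name] = installed
    return unmet


example = "flake8==3.5.0\npycodestyle==2.3.1  # style checker\n"
print(unmet_pins(parse_pins(example)))
```

An empty result would mean every pinned package is present at the pinned version in whichever Python ran the check.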
(coolVE) $ jupyter notebook
The notebook still opened and ran all libraries with no problems. Same as conda. I figured I would be in for a world of more hurt if I continued trying to use virtualenv, so I decided to stop here to keep this journal bearably long.
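One plausible reason the notebook still opened, stated as an assumption since it was not verified on this setup: `jupyter` is not installed in the new environment, so the shell falls back to the first `jupyter` it finds elsewhere on `PATH`, which is the global copy. The sketch below checks where a command resolves; `command_in_current_env` is a hypothetical helper name.

```python
import shutil
import sys
from pathlib import Path


def command_in_current_env(command):
    """Return True or False depending on whether the command resolves inside
    sys.prefix, or None if the command is not on PATH at all."""
    found = shutil.which(command)
    if found is None:
        return None
    env_root = Path(sys.prefix).resolve()
    # Resolve only the containing folder so symlinked executables still count.
    bin_dir = Path(found).parent.resolve()
    return bin_dir == env_root or env_root in bin_dir.parents


print("jupyter found at:", shutil.which("jupyter"))
print("inside this environment:", command_in_current_env("jupyter"))
```

Run it with the environment’s own `python` after activating. A `False` here would suggest the `jupyter` being launched belongs to a different install.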
Conclusions
Learned a lot about what should not be done. Honestly, I find messing up and getting errors to be the best way for me to learn. I think frustration raises my stress levels enough to burn the knowledge into my mind.
If anyone has had similar experiences with JN or an idea as to what could be going on here, I’m very open to ideas.
I’ll keep working on getting isolated environments with Python and try other applications to see if they’re easier to use. Here’s to hoping you didn’t take my advice on the drinking game. If you did, congratulations on surviving.