.. _modules_and_namespaces:
#######################################
Code Structure, Modules, and Namespaces
#######################################
**How to get what you want when you want it.**
Code Structure
==============
In Python, the structure of your code is determined by whitespace. This is nicely clear, and you've probably already figured it out, but we'll formally spell it out here:
How you *indent* your code determines how it is structured

::

    block statement:
        some code body
        some more code body
        another block statement:
            code body in
            that block
            end of "another" block statement
        still in the first block
    outside of the block statement

The colon that terminates a block statement is also important...
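
To make the schematic above concrete, here is a small sketch in real Python. The function name ``classify`` and the threshold of 4 are made up for illustration; the point is only how the colon plus indentation opens each block and how dedenting closes it.

```python
def classify(numbers):
    """Label each number as 'small' or 'big' using nested blocks."""
    labels = []
    for n in numbers:              # colon + indent opens the "for" block
        if n > 4:                  # colon + indent opens a nested "if" block
            labels.append("big")
        else:
            labels.append("small")
        # dedented one level: still inside "for", outside "if"/"else"
    # dedented again: the "for" block has ended
    return labels


print(classify([1, 12, 3]))  # prints ['small', 'big', 'small']
```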
One-liners
----------
You can put a one-liner after the colon:
.. code-block:: ipython
In [167]: x = 12
In [168]: if x > 4: print(x)
12
But this should only be done if it makes your code *more* readable. And that is rare.
So you need both the colon and the indentation to start a new block. But the end of the indented section is the only indication of the end of the block.
Spaces vs. Tabs
---------------
Whitespace is important in Python.
An indent *could* be:
* Any number of spaces
* A tab
* A mix of tabs and spaces
If you want anyone to take you seriously as a Python developer:
**Always use four spaces -- really!**
(PEP 8)
.. note::
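If you *do* use tabs (and really, don't do that!), Python treats each tab as advancing to the next multiple of *eight* columns. Text editors can display tabs as any number of spaces, and most modern editors default to four, so this can be *very* confusing! Python 3 also refuses to run code whose block structure depends on how wide a tab is, and raises a ``TabError`` instead. So again: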
**Never mix tabs and spaces in Python code**
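
If you are curious what mixing looks like in practice, the following sketch compiles a tiny snippet whose two body lines are indented differently (eight spaces on one, a single tab on the other). The snippet and the ``<tab-demo>`` filename are made up for illustration; it is built as a string so that your editor cannot silently convert the tab.

```python
# Line 2 of the snippet is indented with eight spaces, line 3 with one tab.
source = "if True:\n        x = 1\n\ty = 2\n"

try:
    compile(source, "<tab-demo>", "exec")
except TabError as err:
    # TabError is a subclass of IndentationError (and so of SyntaxError).
    outcome = f"TabError: {err.msg}"
else:
    outcome = "compiled without complaint"

print(outcome)
```

On Python 3 this should report a ``TabError`` rather than guessing which block the tab-indented line belongs to.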
Spaces Elsewhere
----------------
Other than indenting -- space doesn't matter, technically.
.. code-block:: python
x = 3*4+12/func(x,y,z)
x = 3*4 + 12 / func (x, y, z)
These will give the exact same results.
But you should strive for proper style. Isn't this easier to read?
.. code-block:: python
x = (3 * 4) + (12 / func(x, y, z))
**Read PEP 8 and install a linter in your editor.**
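
One way to convince yourself that the spacing really is cosmetic is to ask Python how it parses each version. This sketch uses the standard library ``ast`` module to compare the three spellings shown above; nothing is executed, so ``func``, ``y`` and ``z`` do not need to exist.

```python
import ast

compact = "x = 3*4+12/func(x,y,z)"
spaced = "x = 3*4 + 12 / func (x, y, z)"
pep8 = "x = (3 * 4) + (12 / func(x, y, z))"

# ast.dump() leaves out line and column positions by default, so two
# sources that differ only in spacing or redundant parentheses dump alike.
trees = [ast.dump(ast.parse(src)) for src in (compact, spaced, pep8)]

print(trees[0] == trees[1] == trees[2])  # prints True
```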
Modules and Packages
====================
Python is all about *namespaces* -- the "dots"
``name.another_name``
The "dot" indicates that you are looking for a name in the *namespace* of the given object. It could be:
* a name in a module
* a module in a package
* an attribute of an object
* a method of an object
The only way to know is to know what type of object the name refers to. But in all cases, it is looking up a name in the namespace of the object.
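
Here is a quick sketch of those four cases using only the standard library. The particular modules and values (``math``, ``json.decoder``, a ``complex`` number) are just convenient picks for illustration.

```python
import json.decoder
import math

print(type(math.pi).__name__)       # a name in a module: prints float
print(type(json.decoder).__name__)  # a module in a package: prints module

z = complex(3, 4)
print(z.real)                       # an attribute of an object: prints 3.0
print(z.conjugate())                # a method of an object: prints (3-4j)
```

The syntax is identical every time; only the type of the thing to the left of the dot differs.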
So what *are* all these different types of namespaces?
Modules
-------
A module is simply a namespace. But a module more or less maps to a file with python code in it.
It might be a single file, or it could be a collection of files that define a shared API.
But in the common and simplest case, a single file is a single module.
So you can think of the files you write that end in ``.py`` as modules.
When a module is imported, the code in that file is run, and any names defined in that file are now available in the module namespace.
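
You can watch this happen with a throwaway module. The sketch below writes a hypothetical ``demo_module.py`` into a temporary directory, puts that directory on ``sys.path``, and imports it; the file name and its contents are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module source: one statement with a visible effect, one name.
SOURCE = 'print("demo_module is running")\nanswer = 42\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_module.py").write_text(SOURCE)
    sys.path.insert(0, tmp)            # make the directory importable
    importlib.invalidate_caches()
    try:
        # Importing runs the file, so the message above is printed here.
        demo_module = importlib.import_module("demo_module")
        print(demo_module.answer)      # prints 42
    finally:
        # Tidy up so the demo leaves no trace in this interpreter.
        sys.path.remove(tmp)
        sys.modules.pop("demo_module", None)
```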
Packages
--------
A package is a module with other modules in it.
On a filesystem, this is represented as a directory that contains one or more ``.py`` files, one of which, for a regular package, **must** be called ``__init__.py``.
When you have a package, you can import only the package, or any of the modules inside it. When a package is imported, the code in the ``__init__.py`` file is run, and any names defined in that file are available in the *package namespace*.
Here we define about the simplest package possible:
Create a directory (folder) for your package:
.. code-block:: bash
mkdir my_package
Save a file in that package, called ``__init__.py``, and put this in it:
.. code-block:: python
name1 = "Fred"
name2 = "Bob"
Save another file in your my_package dir called ``a_module.py``, and put this in it:
.. code-block:: python

    name3 = "Mary"
    name4 = "Jane"

    def a_function():
        print("a_function has been called")

You now have about the simplest package you can have. Make sure your current working dir is the dir that ``my_package`` is in, and start Python or IPython. Then try this code:
.. code-block:: ipython
In [1]: import my_package
In [2]: my_package.name1
Out[2]: 'Fred'
In [3]: my_package.name2
Out[3]: 'Bob'
The names you've defined are available in the package namespace.
What about the module?
.. code-block:: ipython
In [4]: my_package.a_module
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in ()
----> 1 my_package.a_module
AttributeError: module 'my_package' has no attribute 'a_module'
The ``a_module`` name does not exist. It must be imported explicitly:
.. code-block:: ipython
In [1]: import my_package.a_module
Now the names defined in the ``a_module.py`` file are all there:
.. code-block:: ipython
In [2]: my_package.a_module.name3
Out[2]: 'Mary'
In [3]: my_package.a_module.name4
Out[3]: 'Jane'
In [4]: my_package.a_module.a_function()
a_function has been called
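
If you would rather see the whole experiment in one script, here is a sketch that builds the same ``my_package`` in a temporary directory and checks when ``a_module`` shows up in the package namespace. The temporary directory and the clean-up at the end are additions for the sake of a self-contained demo; the package contents are trimmed versions of the files above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "my_package")
    pkg.mkdir()
    (pkg / "__init__.py").write_text('name1 = "Fred"\nname2 = "Bob"\n')
    (pkg / "a_module.py").write_text('name3 = "Mary"\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import my_package
        before = hasattr(my_package, "a_module")   # submodule not loaded yet

        import my_package.a_module
        after = hasattr(my_package, "a_module")    # now bound on the package
        name3 = my_package.a_module.name3
    finally:
        sys.path.remove(tmp)
        for name in ("my_package.a_module", "my_package"):
            sys.modules.pop(name, None)

print(before, after, name3)  # prints False True Mary
```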
Note that you can also put a package inside a package. So you can create an arbitrarily deeply nested hierarchy of packages. This can be helpful for a large, complex collection of related code, such as an entire Web Framework. But from the *Zen of Python*:
"Flat is better than nested."
So don't overdo it: only go as deep as you really need to keep your code organized.
Importing modules
-----------------
You have probably imported a module or two already:
.. code-block:: python
import sys
import math
But there are a handful of ways to import modules and packages.
.. code-block:: python
import modulename
Is the simplest way: this adds the name of the module to the global namespace, and lets you access the names defined in that module:
.. code-block:: python
modulename.a_name_in_the_module
If you want only a few names in a module, and don't want to type the module name each time, you can import only the names you want:
.. code-block:: python
from modulename import this, that
This brings only the names specified (``this``, ``that``) into the global namespace. All the code in the module is run, but the module's name is not available. But the explicitly imported names are directly available.
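
To see exactly which names a ``from ... import`` creates, you can run the statement against an empty dictionary standing in for a fresh global namespace. This is only an inspection trick for illustration (``exec`` is not how you would normally import anything).

```python
import sys

namespace = {}  # stands in for the global namespace of a fresh module
exec("from math import sin, cos", namespace)

print("sin" in namespace)     # prints True: the requested name is bound
print("math" in namespace)    # prints False: the module name is not
print("math" in sys.modules)  # prints True: the module itself was still loaded
```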
.. code-block:: python
import modulename as a_new_name
This imports the module, and gives it a new name in the global namespace. This is done to avoid a name conflict, or to give the module a shorter name. For example, the numpy module is usually imported as:
.. code-block:: python
import numpy as np
Because numpy has a LOT of names, some of which may conflict with builtins or other modules, and users want to be able to reference them without too much typing.
.. code-block:: python
from modulename import this as that
This imports only one name from a module, while also giving it a new name in the global namespace.
Examples
--------
You can play with some of this with the standard library:
.. code-block:: ipython
In [1]: import math
In [2]: math.sin(1.2)
Out[2]: 0.9320390859672263
In [3]: from math import cos
In [4]: cos(1.2)
Out[4]: 0.3623577544766736
In [5]: import math as m
In [6]: m.sin(1)
Out[6]: 0.8414709848078965
In [7]: from math import cos as cosine
In [8]: cosine(1.2)
Out[8]: 0.3623577544766736
My rules of thumb
-----------------
If you only need a few names from a module, import only those:
.. code-block:: python
from math import sin, cos, tan
If you need a lot of names from that module, just import the module:
.. code-block:: python
import math
math.cos(2 * math.pi)
Or import it with a nice short name:
.. code-block:: python
import math as m
m.cos(2 * m.pi)
import \* ?
-----------
**Warning:**
You can also import all the names in a module with:
.. code-block:: python
from modulename import *
But this leads to name conflicts, and a cluttered namespace. It is NOT recommended practice.
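
Here is one way such a conflict can play out, sketched with two standard library modules that both define ``sqrt``. The star imports are run against a plain dictionary so you can inspect the result; in a real file the same thing would happen silently to your module's globals.

```python
import cmath
import math

namespace = {}  # stands in for your module's global namespace

exec("from math import *", namespace)
first_sqrt = namespace["sqrt"]

exec("from cmath import *", namespace)   # quietly rebinds sqrt, pi, e, ...
second_sqrt = namespace["sqrt"]

print(first_sqrt is math.sqrt)    # prints True
print(second_sqrt is cmath.sqrt)  # prints True
print(second_sqrt(-1))            # prints 1j
```

Whichever star import comes last wins, and nothing warns you about it.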
Importing from packages
-----------------------
Packages can contain modules, which can be nested -- ideally not very deeply.
In that case, you can simply add more "dots" and follow the same rules as above.
.. code-block:: python
from packagename.my_funcs import this_func
Here's a nice reference for more detail:
http://effbot.org/zone/import-confusion.htm
And :ref:`packaging` goes into more detail about creating (and distributing!) your own package.
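
As a concrete illustration of "adding more dots", here are three ways to reach the same function in the standard library ``urllib`` package. The URL is just a placeholder.

```python
import urllib.parse                # binds the package name "urllib"
from urllib import parse           # binds the submodule as "parse"
from urllib.parse import urlsplit  # binds one function from the submodule

url = "https://example.com/docs/index.html"

print(urllib.parse.urlsplit(url).netloc)  # prints example.com
print(parse.urlsplit(url).path)           # prints /docs/index.html
print(urlsplit(url).scheme)               # prints https
```

Note that the dotted path goes between ``from`` and ``import``; what follows ``import`` is always a single name.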
What does ``import`` actually do?
---------------------------------
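When you import a module, or a symbol from a module, the Python code is *compiled* to **bytecode**, unless an up-to-date compiled copy is already cached.
By default, the result is saved as a ``.pyc`` file in a ``__pycache__`` directory next to the source file.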
Then after compiling, all the code in the module is run **at the module scope**.
For this reason, it is good to avoid module-scope statements that have global side-effects.
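
If you want to know where the compiled bytecode for a given source file would be cached, the standard library can tell you. In this sketch ``my_module.py`` is a hypothetical file name; the file does not need to exist, because the function only computes a path.

```python
import importlib.util

# Compute where the bytecode for a source file would be cached.
cache_path = importlib.util.cache_from_source("my_module.py")
print(cache_path)
```

The exact file name includes a tag for the interpreter version, so it will differ from one Python installation to another.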
Re-import
----------
The code in a module is NOT re-run when imported again. This makes it efficient to import the same module in multiple places in a program. But it means that if you change the code in a module after importing it, that change will not be reflected when it is imported again.
If you DO want a change to be reflected, you can explicitly reload a module:
.. code-block:: python
import importlib
importlib.reload(modulename)
This is rarely needed (which is why it's a bit buried in the ``importlib`` module), but is good to keep in mind when you are interactively working on code under development.
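
The difference between importing again and reloading can be sketched with a throwaway module. The module name ``reload_demo`` and its contents are made up; the script edits the file between imports, which is the part you would normally do by hand in your editor.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "reload_demo.py")
    source.write_text('version = "first"\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import reload_demo
        seen = [reload_demo.version]

        # Edit the file on disk. The new text has a different length, so the
        # cached bytecode cannot be mistaken for current even if both writes
        # land within the same second.
        source.write_text('version = "second"\n')

        import reload_demo                 # already imported: not re-run
        seen.append(reload_demo.version)

        importlib.reload(reload_demo)      # explicitly re-run the module code
        seen.append(reload_demo.version)
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo", None)

print(seen)  # prints ['first', 'first', 'second']
```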
Import Interactions
-------------------
Another key point to keep in mind is that all code files in a given python program are sharing the same modules. So if you change a value in a module, that value's change will be reflected in other parts of the code that have imported that same module.
This can create dangerous side effects and hard to find bugs if you change anything in an imported module, but it can also be used as a handy way to store truly global state, like application preferences, for instance.
A rule of thumb for managing global state is to have only *one* part of your code change the values, and to treat them as read-only everywhere else. You can't enforce this, but you can structure your own code that way.
Let's take a look at an example of this.
Create three modules (python files):
``mod1.py``, ``mod2.py``, ``mod3.py``
``mod1.py`` is very simple -- one name declared:
.. code-block:: python
x = 5
``mod2.py`` is where a bit actually goes on:
.. code-block:: python
#!/usr/bin/env python3
import mod1
print(f"In mod2: mod1.x = {mod1.x}")
input("pausing (hit enter to continue >")
print("importing mod3")
import mod3
print(f"Still in mod2: mod1.x = {mod1.x}")
print("mod3 changed the value in mod1, and that change shows up in mod2")
Here, we import ``mod1``, and we can now see the names defined in it, and print the value of ``x``. Then it pauses, waiting for input. After the user hits the Enter key, it then imports ``mod3``, and again prints the value of ``x`` that is in ``mod1``. Let's now look at ``mod3.py``:
.. code-block:: python
import mod1
print("In mod3 -- changing the value of mod1.x")
mod1.x = 555
Other than the print -- all ``mod3`` does is re-set the value of ``x`` that is on ``mod1``.
Running ``mod2.py`` results in::
$ python mod2.py
In mod2: mod1.x = 5
pausing (hit enter to continue >
importing mod3
In mod3 -- changing the value of mod1.x
Still in mod2: mod1.x = 555
mod3 changed the value in mod1, and that change shows up in mod2
You can see that when ``mod3`` changed the value of ``mod1.x``, that changed the value everywhere that ``mod1`` is imported. You want to be very careful about this.
If you are writing ``mod2.py``, and did not write ``mod3`` (or wrote it long enough ago that you don't remember its details), you might be very surprised that a value in ``mod1`` changes simply because you imported ``mod3``. This is known as a "side effect", and you generally want to avoid them!
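
The reason this happens is that every import of a given module hands back the *same* module object. Here is a single-file sketch of that idea; ``shared_settings`` is a hypothetical stand-in for ``mod1``, built in memory and registered in ``sys.modules`` so that no extra files are needed.

```python
import sys
import types

# Hypothetical stand-in for mod1.py, created in memory for the demo.
shared = types.ModuleType("shared_settings")
shared.x = 5
sys.modules["shared_settings"] = shared

# Two separate import statements, as if they lived in mod2.py and mod3.py.
import shared_settings as seen_by_mod2
import shared_settings as seen_by_mod3

seen_by_mod3.x = 555                   # "mod3" changes the value...

print(seen_by_mod2 is seen_by_mod3)    # prints True: one shared module object
print(seen_by_mod2.x)                  # prints 555: ..."mod2" sees the change

sys.modules.pop("shared_settings", None)  # clean up after the demo
```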