Python Tutorial - 2.7

( Modules, Packages, and Classes )


This is part two of our Python 2.7 tutorial. This section covers modules, packages, and classes.

Modules

This is a very simple example of a module.

module1.py

# example module

a = 1
b = 2

def function1(n): 
   pass

def function2(n): 
   result = 5
   result += a
   return result

This example shows how to import and use a module.

test1.py

import module1


module1.function1(5)
module1.function2(8)

print(module1.__name__)

f1 = module1.function1      # assign a local name
f1(5)

When a module is imported, its top-level executable statements are run. This only happens the first time the module is imported; later imports reuse the module that is already loaded. These statements are generally meant to initialize the module.
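
As a rough sketch of the "only the first time" behavior, the snippet below writes a throwaway module to a temporary directory and imports it twice. The module name demo_once and its contents are made up for this illustration.

```python
import os
import sys
import tempfile

# Create a throwaway module (demo_once is a hypothetical name) in a temp dir.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_once.py"), "w") as handle:
    handle.write("import_log = []\n")
    handle.write("import_log.append('module body ran')\n")

# Make the temp dir searchable, then import the module twice.
sys.path.insert(0, tmp_dir)

import demo_once    # first import: the module body executes
import demo_once    # second import: the cached module is reused

# Loaded modules are cached in sys.modules under their name.
cached = sys.modules["demo_once"]
```

If the body had run twice, import_log would have been recreated on the second import; instead both imports refer to the same module object.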

Each module has a symbol table. This basically consists of all global variables from within the module.


module1.a             # access the module's global variables
module1.b             # access the module's global variables

You can show which names are defined in a module using dir(). This will list variables, functions, modules, etc. It doesn't list built-in functions and variables. There is a module named __builtin__ that can be used to show built-in names.


import module1, sys
dir(module1)         # show names in module1
dir(sys)             # show names in sys
dir()                # show currently defined names (default)


import __builtin__
dir(__builtin__)     # show built-in names

Module / Script - Dual Use

It is possible to check the name of the current module. It is stored in a global variable called __name__.

When a module is run directly, the variable __name__ is set to "__main__". Using this, you can use the module as a script and still import it as a module. The code inside the check below will only run when the module is executed as a script. This is great for testing when developing a module.


if __name__ == "__main__":
    pass
    # more code here
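
Here is a minimal sketch of a dual-use module. The function names add and main are placeholders chosen for this example.

```python
def add(a, b):
    """Return the sum of a and b."""
    return a + b


def main():
    # Quick self-check that only makes sense when run as a script.
    result = add(2, 3)
    print(result)
    return result


if __name__ == "__main__":
    # Skipped when the file is imported, executed when it is run directly.
    main()
```

Importing this file gives access to add() without printing anything, while running it directly also calls main().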

Importing Modules


# import names directly into the symbol table 
#  does NOT import the actual module name
from module1 import function1, function2 

                           
# import all names except ones starting with an underscore                   
# avoid doing this as it leads to messy code
from module1 import *      
                                         

# import module or function using a different name
import module1 as m1                            
from module1 import function1 as f1i     


# reload a module in case it was changed 
# while the script was running
reload(modulename)       

Import Path

When importing a module ( ex: module1 ) the system will search for the module in the following places, in this order:

  • built-in module
  • file named module1.py in the dirs from sys.path
sys.path is
  • list of strings for module search path
  • initialized when the interpreter starts
sys.path is constructed from:
  • the directory containing the script being run ( or the current directory in interactive mode )
  • dirs in PYTHONPATH
  • default for the python installation

You can change the path with list operations if you want. For example:


import sys
sys.path.append('/opt/custom/lib')

Compiled and Optimized Files

Compiled .pyc files:
  • already byte-compiled
  • .pyc file is created when a .py file is imported as a module ( a script that is run directly is not cached )
  • The modification time of the .py file is stored in the .pyc file. If it matches the current modification time of the .py file, the .pyc file is used. This way it is only used if it is up to date.
Compiled and Optimized .pyo files
  • running python with the -O flag ( capital letter O ) writes .pyo files instead of .pyc files for the modules it imports
  • slightly more optimized (assert statements removed)
  • running python with the -OO flag also writes .pyo files
    • more compact(__doc__ strings are removed)
    • can result in problems for programs that rely on docstrings

pyc and pyo files load faster but run at the same speed as py files. pyc and pyo files can be distributed by themselves without a .py file. This makes the source code harder for other people to read.


# compile all modules in the current directory ( recursively )
python -m compileall .
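
The same thing can be done from inside Python with the standard compileall module. This is only a sketch: the directory is a temporary one and the module name sample_mod is invented for the example.

```python
import compileall
import os
import tempfile

# Put one small source file (sample_mod is a hypothetical name) in a temp dir.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "sample_mod.py"), "w") as handle:
    handle.write("value = 42\n")

# Byte-compile every .py file under tmp_dir; a true result means success.
ok = compileall.compile_dir(tmp_dir, quiet=1)

# Collect the compiled files. Python 2 writes them next to the source,
# newer versions place them in a __pycache__ subdirectory.
compiled = []
for folder, _subdirs, files in os.walk(tmp_dir):
    compiled.extend(name for name in files if name.endswith(".pyc"))
```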

Extra Diversion

It is possible to view and update the primary and secondary prompts for interactive mode.


import sys
sys.ps1 = "]"
sys.ps2 = ")"

Packages

Packages can be used to group modules together. They can also be organized into subpackages.

Structure

To create a package, you will need to create a file named __init__.py. The directory containing this file will be treated as a package.

This file can:
  • be empty
  • contain initialization code
  • set the __all__ variable

A package can contain subpackages. Package and subpackage names must be valid Python identifiers, so they can't contain hyphens. This is what an example file tree might look like for a package with subpackages.


package1/
    __init__.py
    subpackage1/
        __init__.py
        module1.py
        module2.py
    subpackage2/
        __init__.py
        module1.py
        module2.py

Importing

Here are some examples showing how to import things from packages.

You can import an individual module using its full path. You can't import functions or vars this way. This will require you to reference functions and variables with the full package prefix.


# import module and call with package prefix
import package1.subpackage1.module1
package1.subpackage1.module1.function1()

You can also import an individual module so that you don't need to use the full package prefix to reference variables and functions. You will still need the module name to reference things though.


# import module and call without package prefix
from package1.subpackage1 import module1
module1.function1()

It is also possible to import functions and variables directly from a module within a package. Here is how it is done.


# import function directly and run it directly
from package1.subpackage1.module1 import function1
function1()

You can use a wildcard to import all modules from a package.


from package1.subpackage1 import *

This will do the following:
  • the package package1.subpackage1 is imported, which runs any init code from __init__.py
  • if __all__ is defined in __init__.py, the modules and names it lists are imported
  • if __all__ is not defined, only the names from __init__.py that don't start with an underscore are imported; no modules are imported automatically
  • in that case, any submodules already loaded by earlier import statements are included too

The __init__.py file may contain a variable named __all__. This controls which modules will be imported when the wildcard is used as described above. It looks like this:


__all__ = ["module1", "module2", "module3"]
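
To see the effect of __all__ without touching a real project, the sketch below builds a tiny package in a temporary directory. The names demo_pkg, alpha, and beta are invented for this example, and exec() into a separate dictionary is used only so the imported names are easy to inspect.

```python
import os
import sys
import tempfile


def write_file(path, text):
    """Helper for this example: create a small source file."""
    with open(path, "w") as handle:
        handle.write(text)


# Build demo_pkg/ with two modules, but list only one of them in __all__.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
write_file(os.path.join(pkg_dir, "__init__.py"), '__all__ = ["alpha"]\n')
write_file(os.path.join(pkg_dir, "alpha.py"), "value = 1\n")
write_file(os.path.join(pkg_dir, "beta.py"), "value = 2\n")

sys.path.insert(0, root)

# Run the wildcard import inside its own namespace dictionary.
namespace = {}
exec("from demo_pkg import *", namespace)

alpha_imported = "alpha" in namespace    # listed in __all__
beta_imported = "beta" in namespace      # not listed, so not imported
```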

When you have two modules that are both inside the same package, they don't need to specify the package name when importing. For example module1.py can just import module2 like this:


import module2

In Python 2.5 and up you can also use explicit relative imports. Modules can be imported relative to the package that contains the current module. This won't work if the module is being run as a script, because its module name is then __main__. Here is how explicit relative imports look:


from . import echo
from .. import formats
from ..filters import equalizer

Pulling Modules From An Alternate Path

A package can define another special variable in the __init__.py file. This is the __path__ variable. It is a list, and by default it contains the directory of the package. It is possible to modify this variable. This will change the search path for subpackages and modules going forward.

Namespaces

A namespace is a mapping from names to objects. Each module, function call, and class has its own namespace. Namespaces are used to group and organize code logically, and they help to prevent name collisions. This is especially important when you start working with different libraries which may or may not define variables with identical names.

scope - area of a program where a namespace is directly accessible

Example

Here is an example showing how namespaces and scoping work. Notice that the variable x defined outside of the function is different from the variable x that is defined within the function.


x = 5               # create var 'x'
def test1():
   x = 8            # create new var 'x' 
                    # (different scope)
   x = x + 7
   print x          # prints incremented value (15)
test1()
print x             # prints 5

How It Works

  • Each module has its own global scope
  • Global variables (module global namespace)
    • names that are declared global
    • names defined outside of a function ( local scope and global scope are the same here )
  • Names go into the innermost scope unless declared global.
  • Assigning to a variable inside a function won't overwrite a global variable of the same name unless it is declared global.
  • Even import statements could go inside a function where they will only have local scope. Don't do this though.
  • You can modify and del names that were imported from a module.
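
The difference between a plain assignment and one declared global can be sketched like this. The names counter, set_local, and bump_global are made up for the example.

```python
counter = 0                  # module-level ( global ) name


def set_local():
    counter = 100            # creates a new local name; the global is untouched
    return counter


def bump_global():
    global counter           # assignments below now target the module-level name
    counter = counter + 1
    return counter


local_result = set_local()
after_local = counter        # still 0
global_result = bump_global()
after_global = counter       # now 1
```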

Scopes

Scope order follows the LEGB Rule:
  • L - Local - names defined within a function ( inner most )
  • E - Enclosing-function locals - any names in an enclosing function from inner to outer
  • G - Global - defined at the top level outside any function or declared global
  • B - Built-in - names that are built in to Python, never deleted, created when the Python interpreter starts up
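
A small sketch of the lookup order follows. The function names are placeholders, and each function finds its name at a different level of the LEGB chain.

```python
label = "global"


def outer():
    label = "enclosing"

    def inner():
        # No local 'label' here, so the enclosing function's value is used.
        return label

    return inner()


def uses_global():
    # No local or enclosing 'label', so the module-level value is used.
    return label


def uses_builtin():
    # 'len' is not defined in this module; it comes from the built-in scope.
    return len("abc")
```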

View Special Namespaces

Python built-in names can be viewed by importing the __builtin__ module.


import __builtin__
print(__builtin__.__dict__)

Names in the top level namespace ( from script or interactive ) can be viewed by importing the __main__ module.


import __main__
print(__main__.__dict__)

Classes

  • each class has its own namespace
  • class needs to be defined before it can be used
  • A class is like a blueprint and an object is an instance of that blueprint. In Python, classes are also objects.
    • class type: <type 'classobj'>
    • instance type: <type 'instance'>
  • method - function belonging to an object
    • method type: <type 'instancemethod'>
  • attributes:
    • data attribute ( variables )
    • method attributes ( functions )

Every value is actually an object. You can check what class or type it is like this:


x = 5
x.__class__

New Style and Old Style Classes

Python includes new and old style classes. We will mostly be covering old-style classes in this tutorial since this is a Python 2 tutorial. We will cover new-style classes for the Python 3 tutorial. Here is a bit about each that you should be aware of.

old-style (or classic)
  • default in Python 2
  • don't exist in Python 3
  • all one type: <type 'instance'>
new-style
  • created by inheriting from an existing new style class ( such as object )
  • python 2.2 and up (optional)
  • all classes are new-style in Python 3
  • each class is a type
  • built in types are classes and can be subclassed

Syntax and How It Works

Most basic class definition possible:


class ClassName:
    pass

Really simple example:

class MyClass:
   """A simple example class"""
   i = 12345

   def f(self):
       return 'hello world'


x = MyClass()  # create instance of class 
               #  ( an object)


x.i          # access a variable
x.i = 45     # write to instance variable
del x.i      # delete instance variable
x.f()        # call a method
y = x.f      # save a method
x.__doc__    # access the docstring 


x.z = 0      # add new var to an instance dynamically
             # z was not defined in the class
             # only added to this instance


MyClass.u = 0  # adds new var to the class itself
               # visible through every instance

Include a method named __init__ to initialize an object as soon as it is created.


class MyClass:
    def __init__(self):
        self.data = []

When you create an instance of a class you can pass arguments to it. These are then passed to the __init__ function. Note that 'self' is a reference to the current instance object and doesn't correspond to any of the passed arguments. It is passed automatically and doesn't need to be included in the call.

NOTE - Using the name self is just a convention. Self could actually be named something else. Don't do this though.

class Tree:
    def __init__(self, a, b):
        self.x = a
        self.y = b


t = Tree(1, 2)  # self is passed automatically
MyClass.f(x)    # same as x.f(), with the instance passed explicitly as self

Class and Instance Variables

WARNING - if you use a class variable it will share the same value for all instances. Make sure that this is what you want.


class Tree:

   location = 'forest'     # shared by all instances 
                           # ( class variable )

   def __init__(self, name):
       self.name = name    # unique to each instance 
                           # ( instance variable )


a = Tree('Greg')
b = Tree('Steve')
a.location                # shared between instances
b.location                # shared between instances
a.name                    # unique to a
b.name                    # unique to b
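
The sharing can be made visible with a short experiment. This sketch reuses the Tree class from above; the values 'park' and 'jungle' are arbitrary.

```python
class Tree:
    location = 'forest'          # class variable

    def __init__(self, name):
        self.name = name         # instance variable


a = Tree('Greg')
b = Tree('Steve')

# Assigning through an instance creates an instance variable on that
# instance only. It hides the class variable for 'a' but not for 'b'.
a.location = 'park'

# Assigning through the class changes what every instance without its
# own 'location' sees.
Tree.location = 'jungle'
```

After this runs, a.location is 'park' while b.location follows the class and is 'jungle'.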

WARNING - Instance variables and methods can have the same name. The variable takes precedence and hides the method. Be careful as this can lead to bugs.

Function defined outside the class ( Don't do this ):


def outside(self, x):
   return x + 1
class MyClass:
   def b(self):
       return 'test'
   e = b              # assign method to a second class attribute
   a = outside        # assign externally defined function 

You can call one method from another method of the same class. Note that the call is made through the instance ( self ).


class MyClass:
   def __init__(self):
       self.a = []

   def store_data(self, i):
       self.a.append(i)

   def process_store(self, x, g):
       g = g + 5
       x = x * 45
       #  call another method of this instance
       self.store_data(x * g)    

Inheritance

One class can inherit from a base class. The syntax looks like this:


class MyDerivedClass(BaseClass):
   pass


class MyDerivedClass(module1.MyBaseClass):
    pass

If an attribute is not found in a class, the base class is checked next, then the class above that, and the one above that, and so on.

A method from a base class can be overridden in a derived class:


class MyBaseClass:              # base class ( defined in module1 )
    def process_data(self, x):
        return x + 5
    def store_data(self):
        return self.process_data(4)


class MyDerivedClass(module1.MyBaseClass):
    def process_data(self, x):  # override method 
        return x + 10           # change functionality

If a method is overridden, other methods from the base class can still call this method but will get your overridden copy. Basically all Python methods are effectively virtual. For example, if you call store_data() from an instance of MyDerivedClass, it will then call the overridden version of process_data.
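
A self-contained sketch of this behavior is shown below. The class names Base and Derived are placeholders for this example.

```python
class Base:
    def process_data(self, x):
        return x + 5

    def store_data(self):
        # Looked up on the instance, so a subclass override is used if present.
        return self.process_data(4)


class Derived(Base):
    def process_data(self, x):   # override method
        return x + 10


base_result = Base().store_data()         # uses Base.process_data
derived_result = Derived().store_data()   # uses Derived.process_data
```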

Extend functionality of an Overridden Method

Let's say you want to extend the functionality of an overridden method instead of just replacing it completely. You can do that by calling the base class version of the method from the overridden version. The base class version of the method can be called by explicitly specifying the class. See the example below:


class MyDerivedClass(MyBaseClass):
    def do_stuff(self, arguments):
        MyBaseClass.do_stuff(self, arguments)
        # extra functionality goes here
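
For a version that can be run as is, here is a small sketch with invented class and method names ( Report, DetailedReport, lines ).

```python
class Report:
    def lines(self):
        return ['header']


class DetailedReport(Report):
    def lines(self):
        # Start from the base class result, then add to it.
        result = Report.lines(self)
        result.append('details')
        return result


output = DetailedReport().lines()
```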

Two Neat Tools

Here are two neat tools you can use if you want to test if something is an instance of a particular class or if something is a subclass of another.


print isinstance(a, MyClass)
print isinstance(a, int)


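issubclass(bool, int)              # True
issubclass(unicode, str)           # False ( both derive from basestring )
issubclass(a.__class__, MyClass)   # first argument must be a class, not an instance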

Multiple Inheritance

It is possible to inherit from more than one base class. Here is an example of inheriting from multiple base classes:


class MyDerivedClass(Base1, Base2, Base3):
   pass

What if two base classes have a name conflict?

  • when searching for a name, base classes are searched in a specific order
    • old-style classes are depth-first (not breadth first), left-to-right
    • new-style classes use the C3 method resolution order ( stored in the class's __mro__ attribute )
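
The search order can be inspected directly on new-style classes. The sketch below uses a diamond-shaped hierarchy with made-up class names and inherits from object so that it behaves the same on Python 2 and 3. With old-style classes the depth-first rule would reach A before C, so who() would return 'A' instead.

```python
class A(object):
    def who(self):
        return 'A'


class B(A):
    pass


class C(A):
    def who(self):
        return 'C'


class D(B, C):
    pass


# The order in which classes are searched for an attribute.
search_order = [klass.__name__ for klass in D.__mro__]

# B has no who(), so the search continues to C before reaching A.
winner = D().who()
```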

Private Variables and Class-local References

  • There are no real private variables in Python.
  • Variables, methods, and classes that start with an underscore are considered private by convention. They could be written to but shouldn't be.

Name Mangling

If you want to make sure that you have unique names, you can use name mangling. Python will rename your variables for you automatically if they follow this format. An identifier with two leading underscores but one or fewer trailing underscores will be automatically renamed by Python to include the class name. For example if you have a variable named __data inside a class named MyClass, it will be renamed to _MyClass__data.

Using name mangling, two methods within a superclass can call each other without being affected by a subclass that overrides one of them.

Name mangling example:

class MyClass:
   def __init__(self, a):
       # call the mangled version internally
       self.__process_data(a)  

   # method we might override
   def process_data(self, a):  
       pass

   # private copy with mangled name
   __process_data = process_data  

class MySubclass(MyClass):
   # override the method
   def process_data(self, a, b):  
       pass
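
To check what the mangling actually does, here is a runnable sketch along the same lines. The names Base, Child, record, and log are invented for this example.

```python
class Base:
    def __init__(self):
        self.log = []
        self.__record('init')       # mangled, so always Base's version

    def record(self, item):
        self.log.append(item)

    __record = record               # private copy, stored as _Base__record


class Child(Base):
    # Different signature; Base.__init__ would fail if it called this one.
    def record(self, item, extra):
        self.log.append((item, extra))


child = Child()                     # Base.__init__ still works
child.record('later', 'info')       # the override is used for normal calls
```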

More Stuff

Methods can actually have attributes of their own. Here is an example:


class Test:
   def junk(self):
       pass
t = Test()

# refers to the instance object that this is a method of
print t.junk.im_self    

# refers to the corresponding function object
print t.junk.im_func    

Exceptions with Classes

You can raise an exception using classes like this:


raise Class, instance
raise instance
raise instance.__class__, instance

An except clause is compatible with an exception if it names the same class or one of its base classes. The first compatible clause handles the exception. For example:


class X:
   pass
class Y(X):
   pass
class Z(Y):
   pass


try:
    raise Y
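except X:     # yes ( base class of Y; this clause runs )
    pass
except Y:     # yes ( compatible, but not reached because X matched first )
    pass
except Z:     # no ( Z is a subclass of Y, not a base class )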
    pass
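
Because the first compatible clause wins, the most derived classes are usually listed first. The sketch below shows this; the classes derive from Exception ( an assumption made here so the code also runs on newer Python versions ) and first_match is a helper invented for the example.

```python
class X(Exception):
    pass


class Y(X):
    pass


class Z(Y):
    pass


def first_match(exc_class):
    """Raise exc_class and report which except clause handled it."""
    try:
        raise exc_class()
    except Z:          # most derived class first
        return 'Z'
    except Y:
        return 'Y'
    except X:          # base class last, catches whatever is left
        return 'X'
```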

Iterators

Iterators allow things to be treated as collections and looped over.


x = 'abcdefghijklmnopqrstuvwxyz'
i = iter(x)  # returns an iterator
i.next()     # get the next item
i.next()     # get the next item
             # raises a StopIteration 
             # exception when done

If you want a class to behave as an iterator, you need to do the following in your class:

  • define an __iter__() method
  • that method will return an object with a method named next()
  • if the class itself defines a next() method, __iter__() should return self.

Here is an example taken from the official Python tutorial at python.org:


class Reverse:
   def __init__(self, data):
       self.data = data
       self.index = len(data)
   def __iter__(self):
       return self
   def next(self):
       if self.index == 0:
           raise StopIteration
       self.index = self.index - 1
       return self.data[self.index]


rev = Reverse('spam')
for char in rev:
   print char
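
Another small sketch of the same protocol is a countdown. The class name Countdown is made up, and the __next__ alias is an addition that is not needed on Python 2.7; it only lets the same class work on Python 3, where the method is named __next__().

```python
class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self              # the object is its own iterator

    def next(self):
        if self.current <= 0:
            raise StopIteration  # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value

    __next__ = next              # Python 3 name for the same method


values = list(Countdown(3))
```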

Generators

A generator basically just creates an iterator.

  • the __iter__() and next() methods are created automatically.
  • data/state is tracked automatically so you don't need self.index and self.data.
  • raises StopIteration for you
  • in a loop, you can just call a generator function instead of using a list

See the following example mostly taken from the official Python tutorial at python.org:


def reverse(data):
   for index in range(len(data)-1, -1, -1):
       yield data[index]      # return data


for char in reverse('golf'):
   print char
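
The automatic state tracking can be seen by stepping through a generator by hand with the built-in next() function. The generator name countdown is a placeholder for this sketch.

```python
def countdown(start):
    while start > 0:
        yield start          # execution pauses here until the next request
        start -= 1


gen = countdown(2)
first = next(gen)            # runs until the first yield
second = next(gen)           # resumes where it left off

try:
    next(gen)                # nothing left to yield
    finished = False
except StopIteration:
    finished = True          # raised automatically when the function ends
```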

Generator Expressions

A generator expression is a really compact way to use a generator. It is kind of like a list comprehension except that it is placed inside parentheses instead of square brackets. It is usually passed directly to a function that accepts any iterable. Generator expressions tend to use less memory than the equivalent list comprehension because the items are produced one at a time instead of being stored in a list.

Here are a few examples taken from python.org:


sum(i*i for i in range(10))   # sum of squares

xvec = [10, 20, 30]
yvec = [7, 5, 3]
sum(x*y for x,y in zip(xvec, yvec))   # dot product

uw = set(word  for line in page  for word in line.split())
vv = max((student.gpa, student.name) for student in graduates)

data = 'golf'
list(data[i] for i in range(len(data)-1,-1,-1))
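
One practical difference from a list comprehension is that a generator expression can only be consumed once. A minimal sketch, with variable names chosen for this example:

```python
squares_list = [n * n for n in range(5)]   # built in full, can be reused
squares_gen = (n * n for n in range(5))    # produces values on demand

total = sum(squares_gen)       # consumes the generator
leftover = list(squares_gen)   # nothing is left the second time
again = sum(squares_list)      # the list can be summed again
```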