Python Tutorial 2.7 - Part 2
( Modules, Packages, and Classes )
This is part two of our Python 2.7 tutorial. In this section we will be covering modules, packages, and classes.
Modules
This is a very simple example of a module.
module1.py
# example module
a = 1
b = 2
def function1(n):
    pass
def function2(n):
    result = 5
    result += a
    return result
This example shows how to import and use a module.
test1.py
import module1
module1.function1(5)
module1.function2(8)
print(module1.__name__)
f1 = module1.function1 # assign a local name
f1(5)
When a module is imported, any executable statements at its top level are run. This only happens the first time the module is imported. These statements are generally meant to initialize the module.
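The run-once behavior can be checked with a small experiment. This is only a sketch: it writes a throwaway module to a temporary directory (the module name demo_once and its value variable are made up for this illustration) and imports it twice. It is written so that it runs on Python 2.7 as well as Python 3.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module; "demo_once" is a hypothetical name.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_once.py"), "w") as handle:
    handle.write("value = 1\n")

sys.path.insert(0, tmp_dir)

first = importlib.import_module("demo_once")    # module body runs here
first.value = 99                                # change the module's global

second = importlib.import_module("demo_once")   # cached copy from sys.modules

same_module = first is second    # both names refer to one module object
value_after = second.value       # still 99, so "value = 1" did not run again
print(same_module)
print(value_after)
```

If the module body had run a second time, value would have been reset to 1.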
Each module has a symbol table. This basically consists of all global variables from within the module.
module1.a # access the module's global variables
module1.b # access the module's global variables
You can show which names are defined in a module using dir(). This will list variables, functions, modules, etc. It doesn't list built-in functions and variables. There is a module named __builtin__ that can be used to show built-in names.
import module1, sys
dir(module1) # show names in module1
dir(sys) # show names in sys
dir() # show currently defined names (default)
import __builtin__
dir(__builtin__) # show built-in names
Module / Script - Dual Use
It is possible to check the name of the current module. It is stored in a global variable called __name__.
When a module is run directly, the variable __name__ is set to "__main__". Using this, you can use the module as a script and still import it as a module. The code inside the check will only run when the module is executed as a script. This is great for testing when developing a module.
if __name__ == "__main__":
    pass
    # more code here
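As a rough illustration of the dual-use pattern, the sketch below is a tiny module that carries its own quick checks. The function names add and self_test are invented for this example.

```python
def add(a, b):
    """The function this module exists to provide."""
    return a + b


def self_test():
    # Quick checks that are handy while developing the module.
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
    return "all tests passed"


if __name__ == "__main__":
    # Runs when the file is executed as a script,
    # skipped when the file is imported as a module.
    print(self_test())
```

Importing this file only defines add and self_test. Running it directly also executes the checks.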
Importing Modules
# import names directly into the symbol table
# does NOT import the actual module name
from module1 import function1, function2
# import all names except ones starting with an underscore
# avoid doing this as it leads to messy code
from module1 import *
# import module or function using a different name
import module1 as m1
from module1 import function1 as f1i
# reload a module in case it was changed
# while the script was running
reload(modulename)
Import Path
When importing a module ( ex: module1 ) the system will search for the module in the following places:
- built-in module
- file named module1.py in the dirs from sys.path
sys.path is
- a list of strings for the module search path
- initialized in part from PYTHONPATH
sys.path is constructed from:
- the directory containing the script being run ( or the current directory in interactive mode )
- dirs in PYTHONPATH
- the default for the Python installation
You can change the path with list operations if you want. For example:
import sys
sys.path.append('/opt/custom/lib')
Compiled and Optimized Files
Compiled .pyc files:
- already byte-compiled
- a .pyc file is created when a .py module is imported and byte-compiled
- The modification time of the .py file is stored in the .pyc file. If the stored time matches the current modification time of the .py file, the .pyc file is used. This way it is only used if it is up to date.
Compiled and Optimized .pyo files
- running Python with the -O flag ( python -O ) generates .pyo files for imported modules
- slightly more optimized (assert statements removed)
- running Python with the -OO flag ( python -OO ) also generates .pyo files
- more compact (__doc__ strings are removed)
- can result in problems if a program relies on its docstrings
.pyc and .pyo files load faster but run at the same speed as .py files. .pyc and .pyo files can be distributed by themselves without a .py file. This is handy if you want to make your code harder for other people to read or modify.
# compile all modules in a directory
python -m compileall
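Byte-compiling can also be triggered from code. The following is a minimal sketch using the standard py_compile module. The file name my_module.py and the temporary directory are placeholders chosen for this illustration.

```python
import os
import py_compile
import tempfile

# Write a small source file to compile (placeholder name and contents).
tmp_dir = tempfile.mkdtemp()
source_path = os.path.join(tmp_dir, "my_module.py")
with open(source_path, "w") as handle:
    handle.write("a = 1\n")

# Ask for the byte-compiled file to be written next to the source.
compiled_path = source_path + "c"    # .../my_module.pyc
py_compile.compile(source_path, cfile=compiled_path, doraise=True)

compiled_exists = os.path.exists(compiled_path)
print(compiled_exists)
```

Passing doraise=True makes a syntax error in the source raise an exception instead of only printing a message.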
Extra Diversion
It is possible to view and update the primary and secondary prompts for interactive mode.
import sys
sys.ps1 = "]"
sys.ps2 = ")"
Packages
Packages can be used to group modules together. They can also be organized into subpackages.
Structure
To create a package, you will need to create a file named __init__.py. The directory containing this file will be treated as a package.
This file can:
- be empty
- contain initialization code
- set the __all__ variable
A package can contain subpackages. This is what an example file tree might look like for a package with subpackages. Note that package and module names must be valid Python identifiers, so they can't contain hyphens.
package1/
    __init__.py
    subpackage1/
        __init__.py
        module1.py
        module2.py
    subpackage2/
        __init__.py
        module1.py
        module2.py
Importing
Here are some examples showing how to import things from packages.
You can import an individual module using its full path. You can't import functions or vars this way. This will require you to reference functions and variables with the full package prefix.
# import module and call with package prefix
import package1.subpackage1.module1
package1.subpackage1.module1.function1()
You can also import an individual module so that you don't need to use the full package prefix to reference variables and functions. You will still need the module name to reference things though.
# import module and call without package prefix
from package1.subpackage1 import module1
module1.function1()
It is also possible to import functions and variables directly from a module within a package. Here is how it is done.
# import function directly and run it directly
from package1.subpackage1.module1 import function1
function1()
You can use a wildcard to import all modules from a package.
from package1.subpackage1 import *
This will do the following:
- imports modules in the __all__ list from the __init__.py file
- if __all__ is empty, no modules are imported
- the package package1.subpackage1 is imported
- runs any init code from __init__.py, imports any names from this file
- includes any modules from import statements
The __init__.py file may contain a variable named __all__. This controls which modules will be imported when the wildcard is used as described above. It looks like this:
__all__ = ["module1", "module2", "module3"]
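The effect of __all__ on a wildcard import can be tried out with a throwaway package. The sketch below builds one in a temporary directory; every name in it (demo_pkg, subpackage1, module1, module2, function1) is hypothetical and only serves the illustration.

```python
import os
import sys
import tempfile


def write_file(path, text):
    with open(path, "w") as handle:
        handle.write(text)


# Build demo_pkg/subpackage1/ with two modules (all names are hypothetical).
root = tempfile.mkdtemp()
sub_dir = os.path.join(root, "demo_pkg", "subpackage1")
os.makedirs(sub_dir)
write_file(os.path.join(root, "demo_pkg", "__init__.py"), "")
write_file(os.path.join(sub_dir, "__init__.py"), '__all__ = ["module1"]\n')
write_file(os.path.join(sub_dir, "module1.py"),
           "def function1():\n    return 'hello from module1'\n")
write_file(os.path.join(sub_dir, "module2.py"), "b = 2\n")

sys.path.insert(0, root)

# Run the wildcard import in its own namespace so the result is easy to inspect.
namespace = {}
exec("from demo_pkg.subpackage1 import *", namespace)

imported = sorted(name for name in namespace if not name.startswith("__"))
greeting = namespace["module1"].function1()
module2_loaded = "demo_pkg.subpackage1.module2" in sys.modules
print(imported)
```

Only module1 is pulled in, because module2 is not listed in __all__ and nothing else imported it.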
When you have two modules that are both inside the same package, they don't need to specify the package name when importing ( this is an implicit relative import, which only works in Python 2 ). For example module1.py can just import module2 like this:
import module2
In Python 2.5 and up you can also use explicit relative imports. Modules can be imported relative to the current package. This won't work if the module is being run as a script, because its module name is then __main__. Here is how explicit relative imports look:
from . import echo
from .. import formats
from ..filters import equalizer
Pulling Modules From An Alternate Path
A package can define another special variable in the __init__.py file. This is the __path__ variable. It is a list and, by default, it contains the directory of the package. It is possible to modify this variable. This will change the search path for subpackages and modules going forward.
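Here is a minimal sketch of what modifying __path__ does. The package name path_demo, the directory extra_modules, and the module plugin are all invented for this illustration. The example modifies __path__ from outside the package for brevity; the same list operation could go in __init__.py.

```python
import importlib
import os
import sys
import tempfile


def write_file(path, text):
    with open(path, "w") as handle:
        handle.write(text)


# Hypothetical layout: the package lives in path_demo/, while plugin.py
# sits in an unrelated directory called extra_modules/.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "path_demo")
extra_dir = os.path.join(root, "extra_modules")
os.makedirs(pkg_dir)
os.makedirs(extra_dir)
write_file(os.path.join(pkg_dir, "__init__.py"), "")
write_file(os.path.join(extra_dir, "plugin.py"), "name = 'plugin'\n")

sys.path.insert(0, root)
path_demo = importlib.import_module("path_demo")

default_entries = len(path_demo.__path__)    # just the package directory
path_demo.__path__.append(extra_dir)         # extend the search path

# plugin.py is now found even though it is not inside path_demo/.
plugin = importlib.import_module("path_demo.plugin")
plugin_name = plugin.name
print(plugin_name)
```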
Namespaces
A namespace is a region of code that provides scope to identifiers. They are used to group and organize code logically. Namespaces help to prevent name collisions. This is especially important when you start working with different libraries which may or may not define variables with identical names.
scope - area of a program where a namespace is directly accessible
Example
Here is an example showing how namespaces and scoping work. Notice that the variable x defined outside of the function is different from the variable x that is defined within the function.
x = 5            # create var 'x'
def test1():
    x = 8        # create new var 'x'
                 # (different scope)
    x = x + 7
    print x      # prints incremented value (15)
test1()
print x          # prints 5
How It Works
- Each module has its own global scope
- Global variables (module global namespace)
- names that are declared global
- names defined outside of a function ( local scope and global scope are the same here )
- Names go into the innermost scope unless declared global.
- Variables assigned inside a function won't overwrite global variables of the same name.
- Even import statements could go inside a function where they will only have local scope. Don't do this though.
- You can modify and del names that were imported from a module.
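The difference between a plain assignment and one made under a global declaration can be sketched as follows. The function names shadow and rebind are made up for this example.

```python
x = 5


def shadow():
    x = 8          # new local name, unrelated to the global x
    x = x + 7
    return x


def rebind():
    global x       # from here on, x means the module-level name
    x = x + 7
    return x


shadow_result = shadow()
x_after_shadow = x       # the global is untouched
rebind_result = rebind()
x_after_rebind = x       # the global was changed by rebind()
print(x_after_shadow)
print(x_after_rebind)
```

Only rebind() changes the module-level x, because it is the only function that declares the name global.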
Scopes
Scope order follows the LEGB Rule:
- L - Local - names defined within a function ( innermost )
- E - Enclosing-function locals - any names in an enclosing function from inner to outer
- G - Global - defined at the top level outside any function or declared global
- B - Built-in - names that are built in to Python, never deleted, created when the Python interpreter starts up
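The lookup order can be traced with one name defined at several levels. This is a small sketch; the variable label and the function names are chosen only for the illustration.

```python
label = "global"


def outer():
    label = "enclosing"

    def uses_local():
        label = "local"
        return label      # L: found in the function's own scope

    def uses_enclosing():
        return label      # E: not local, so the enclosing function is checked

    return uses_local(), uses_enclosing()


def uses_global():
    return label          # G: not local or enclosing, so the module is checked


def uses_builtin():
    return len("abc")     # B: len is only defined in the built-in scope


local_result, enclosing_result = outer()
global_result = uses_global()
builtin_result = uses_builtin()
print(local_result)
print(enclosing_result)
print(global_result)
```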
View Special Namespaces
Python built-in names can be viewed by importing the __builtin__ module.
import __builtin__
print(__builtin__.__dict__)
Names in the top level namespace ( from script or interactive ) can be viewed by importing the __main__ module.
import __main__
print(__main__.__dict__)
Classes
- each class has its own namespace
- class needs to be defined before it can be used
- A class is like a blueprint and an object is an instance of that blueprint. In Python, classes are also objects.
- class type: <type 'classobj'>
- instance type: <type 'instance'>
- method - function belonging to an object
- method type: <type 'instancemethod'>
- attributes:
- data attribute ( variables )
- method attributes ( functions )
Every value is actually an object. You can check what class or type it is like this:
x = 5
x.__class__
New Style and Old Style Classes
Python includes new and old style classes. We will mostly be covering old-style classes in this tutorial since this is a Python 2 tutorial. We will cover new-style classes for the Python 3 tutorial. Here is a bit about each that you should be aware of.
old-style (or classic)
- default in Python 2
- don't exist in Python 3
- all instances are one type: <type 'instance'>
new-style
- created by inheriting from an existing new-style class ( such as object )
- Python 2.2 and up (optional)
- all classes are new-style in Python 3
- each class is a type
- built-in types are classes and can be subclassed
Syntax and How It Works
Most basic class definition possible:
class ClassName:
    pass
Really simple example:
class MyClass:
    """A simple example class"""
    i = 12345
    def f(self):
        return 'hello world'
x = MyClass()    # create instance of class
                 # ( an object)
x.i              # access a variable
x.i = 45         # write to instance variable
del x.i          # delete instance variable
x.f()            # call a method
y = x.f          # save a method
x.__doc__        # access the docstring
x.z = 0          # add new var to an instance dynamically
                 # z was not defined in the class
                 # only added to this instance
MyClass.u = 0    # add new var to the class dynamically
                 # visible through the class and all of its instances
Include a method named __init__ to initialize an object as soon as it is created.
class MyClass:
    def __init__(self):
        self.data = []
When you create an instance of a class you can pass arguments to it. These are then passed to the __init__ function. Note that 'self' is a reference to the current instance object and doesn't correspond to any of the passed arguments. It is passed automatically and doesn't need to be included as an argument.
NOTE - Using the name self is just a convention. Self could actually be named something else. Don't do this though.
class Tree:
    def __init__(self, a, b):
        self.x = a
        self.y = b
t = Tree(1, 2)    # self is passed automatically
MyClass.f(x)      # same as x.f(), passing the instance explicitly
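To make the role of self concrete, the sketch below calls the same method in two ways. The class Greeter and its greet method are invented for this example.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, punctuation):
        return 'hello ' + self.name + punctuation


g = Greeter('world')

# Normal call: Python fills in self with the instance g.
via_instance = g.greet('!')

# Explicit call through the class: the instance is passed by hand.
via_class = Greeter.greet(g, '!')

print(via_instance)
print(via_class)
```

Both calls run the same function with the same instance, so they return the same string.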
Class and Instance Variables
WARNING - if you use a class variable it will share the same value for all instances. Make sure that this is what you want.
class Tree:
    location = 'forest'     # shared by all instances
                            # ( class variable )
    def __init__(self, name):
        self.name = name    # unique to each instance
                            # ( instance variable )
a = Tree('Greg')
b = Tree('Steve')
a.location    # shared between instances
b.location    # shared between instances
a.name        # unique to a
b.name        # unique to b
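Sharing matters most when the class variable is mutable. The following sketch extends the Tree idea with two list attributes; visitors and leaves are hypothetical names added for the illustration.

```python
class Tree:
    visitors = []               # class variable: ONE list for every Tree

    def __init__(self, name):
        self.name = name
        self.leaves = []        # instance variable: a new list per Tree


a = Tree('Greg')
b = Tree('Steve')

a.visitors.append('owl')        # modifies the single shared list
a.leaves.append('leaf')         # modifies only a's own list

shared = b.visitors             # b sees the change made through a
separate = b.leaves             # b's own list is still empty
print(shared)
print(separate)
```

If each instance should have its own list, create it inside __init__ as done for leaves.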
WARNING - Variables and methods can have the same name. The variable takes precedence. Be careful as this can lead to bugs.
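A short sketch of that pitfall, using a made-up class named Report:

```python
class Report:
    def total(self):
        return 42


r = Report()
before = r.total()                    # the method works as expected

r.total = 7                           # instance variable with the same name
after = r.total                       # now the plain integer, not the method
still_callable = callable(r.total)    # calling r.total() would now fail

other = Report().total()              # other instances are unaffected
print(before)
print(after)
print(still_callable)
```

The method still exists on the class. It is only hidden on the one instance that stored a variable under the same name.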
Function defined outside the class ( Don't do this ):
def outside(self, x):
    return x + 1
class MyClass:
    def b(self):
        return 'test'
    e = b          # assign method to a second name in the class
    a = outside    # assign externally defined function
You can call a method from within a class. Note that these are basically getting called using the instance.
class MyClass:
    def __init__(self):
        self.a = []
    def store_data(self, i):
        self.a.append(i)
    def process_store(self, x, g):
        g = g + 5
        x = x * 45
        # call another method of this instance
        self.store_data(x * g)
Inheritance
One class can inherit from a base class. The syntax looks like this:
class MyDerivedClass(BaseClass):
    pass
class MyDerivedClass(module1.MyBaseClass):
    pass
If an attribute is not found in a class, the base class is checked next, then the class above that, and the one above that, and so on.
A method from a base class can be overridden in a derived class:
class MyBaseClass:
    def process_data(self, x):
        return x + 5
    def store_data(self):
        self.process_data(4)
class MyDerivedClass(MyBaseClass):
    def process_data(self, x):    # override method
        return x + 10             # change functionality
If a method is overridden, other methods from the base class can still call this method but will get your overridden copy. Basically all Python methods are effectively virtual. For example, if you call store_data() from an instance of MyDerivedClass, it will then call the overridden version of process_data.
Extend functionality of an Overridden Method
Let's say you want to extend the functionality of an overridden method instead of just replacing it completely. You can do that by calling the base class version of the method from the overridden version. The base class version of the method can be called by explicitly specifying the class. See the example below:
class MyDerivedClass(MyBaseClass):
    def do_stuff(self, arguments):
        MyBaseClass.do_stuff(self, arguments)
        pass
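Here is a runnable variation on the same pattern, as a sketch only. What do_stuff actually does (copying a list and appending an item) is invented so that the effect of the extension is visible.

```python
class MyBaseClass:
    def do_stuff(self, items):
        # Base behavior: return a copy of the input as a list.
        return list(items)


class MyDerivedClass(MyBaseClass):
    def do_stuff(self, items):
        # Run the base class version first by naming the class explicitly...
        result = MyBaseClass.do_stuff(self, items)
        # ...then add the extra behavior on top.
        result.append('extra')
        return result


base_result = MyBaseClass().do_stuff(['a', 'b'])
derived_result = MyDerivedClass().do_stuff(['a', 'b'])
print(base_result)
print(derived_result)
```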
Two Neat Tools
Here are two neat tools you can use if you want to test if something is an instance of a particular class or if something is a subclass of another.
print isinstance(a, MyClass)
print isinstance(a, int)
issubclass(bool, int)               # True
issubclass(unicode, str)            # False
issubclass(a.__class__, MyClass)    # first argument must be a class, not an instance
Multiple Inheritance
It is possible to inherit from more than one base class. Here is an example of inheriting from multiple base classes:
class MyDerivedClass(Base1, Base2, Base3):
    pass
What if two base classes have a name conflict?
- when searching for a name, base classes are searched in a specific order
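- old-style classes are searched depth-first (not breadth-first), left-to-right
- new-style classes use a dynamically computed method resolution order ( the C3 linearization ), which can be viewed with ClassName.__mro__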
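The search order for new-style classes can be inspected directly. The sketch below uses a diamond-shaped hierarchy with made-up class names; all classes inherit from object so that they are new-style on Python 2.7 as well.

```python
class Base(object):
    def who(self):
        return 'Base'


class Left(Base):
    pass


class Right(Base):
    def who(self):
        return 'Right'


class Child(Left, Right):
    pass


# The order in which classes are searched for an attribute.
mro_names = [cls.__name__ for cls in Child.__mro__]

# Left has no who(), so the search continues with Right before Base.
answer = Child().who()
print(mro_names)
print(answer)
```

With old-style classes the depth-first search would go from Left straight up to Base, so the same call would return 'Base' instead.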
Private Variables and Class-local References
- There are no real private variables in Python.
- Variables and classes that start with an underscore are considered private by convention. They could be written to but shouldn't be.
Name Mangling
If you want to make sure that you have unique names, you can use name mangling. Python will rename your variables for you automatically if they follow this format. An identifier with two leading underscores but one or fewer trailing underscores will be automatically renamed by Python to include the class name. For example if you have a variable named __data inside a class named MyClass, it will be renamed to _MyClass__data.
Using name mangling, two methods within a superclass can call each other without being affected by a subclass that overrides one of them.
Name mangling example:
class MyClass:
    def __init__(self, a):
        # call the mangled version internally
        self.__process_data(a)
    # method we might override
    def process_data(self, a):
        pass
    # private copy with mangled name
    __process_data = process_data
class MySubclass(MyClass):
    # override the method
    def process_data(self, a, b):
        pass
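To see the mangled name at work, here is a sketch of the same structure with return values filled in. The arithmetic and the result attribute are invented for the illustration.

```python
class MyClass:
    def __init__(self, a):
        # Mangled to self._MyClass__process_data(a), so a subclass
        # override of process_data cannot affect this call.
        self.result = self.__process_data(a)

    def process_data(self, a):
        return a + 1

    # Private copy; stored on the class as _MyClass__process_data.
    __process_data = process_data


class MySubclass(MyClass):
    # Override with a different signature.
    def process_data(self, a, b):
        return a + b


sub = MySubclass(10)            # __init__ still uses MyClass's private copy
init_result = sub.result
override_result = sub.process_data(1, 2)
has_mangled = hasattr(sub, '_MyClass__process_data')
has_plain = hasattr(sub, '__process_data')
print(init_result)
print(override_result)
```

Without the private copy, __init__ would call the two-argument override with one argument and fail.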
More Stuff
Methods can actually have attributes of their own. Here is an example:
class Test:
    def junk(self):
        pass
t = Test()
# refers to the instance object that this is a method of
print t.junk.im_self
# refers to the corresponding function object
print t.junk.im_func
Exceptions with Classes
You can raise an exception using classes like this:
raise Class, instance
raise instance
raise instance.__class__, instance
A matching class or base class is compatible when placed in an except clause. For example:
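class X:
    pass
class Y(X):
    pass
class Z(Y):
    pass
try:
    raise Y
except X:    # yes - Y derives from X ( first match, so this clause runs )
    pass
except Y:    # yes - same class
    pass
except Z:    # no - Z is a subclass of Y, not a base class
    pass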
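Because the first compatible except clause wins, the order of the clauses matters. The sketch below shows this with two helper functions whose names are made up. As an assumption for portability, the classes here derive from Exception, which Python 3 requires and Python 2.7 also accepts.

```python
class X(Exception):
    pass


class Y(X):
    pass


class Z(Y):
    pass


def most_specific_first(exc_class):
    """Return the name of the except clause that catches exc_class."""
    try:
        raise exc_class()
    except Z:
        return 'Z'
    except Y:
        return 'Y'
    except X:
        return 'X'


def base_class_first(exc_class):
    try:
        raise exc_class()
    except X:
        return 'X'      # catches X and every subclass of X
    except Z:
        return 'Z'      # never reached


specific = [most_specific_first(c) for c in (X, Y, Z)]
general = [base_class_first(c) for c in (X, Y, Z)]
print(specific)
print(general)
```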
Iterators
Iterators allow things to be treated as collections and looped over.
x = 'abcdefghijklmnopqrstuvwxyz'
i = iter(x)    # returns an iterator
i.next()       # get the next item
i.next()       # get the next item
               # raises a StopIteration
               # exception when done
If you want a class to behave as an iterator, you need to do the following in your class:
- define an __iter__() method
- that method will return an object with a method named next()
- if the class itself defines a next() method, __iter__() should return self.
Here is an example taken from the official Python tutorial at python.org:
class Reverse:
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def next(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]
rev = Reverse('spam')
for char in rev:
    print char
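The same three steps can be sketched with another small class. Countdown is a made-up example. The extra __next__ alias is an addition that is not needed on Python 2.7; it is included only so the sketch also runs on Python 3, where the method is spelled __next__.

```python
class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The class defines next() itself, so it returns self.
        return self

    def next(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    __next__ = next     # Python 3 spelling of the same method


values = list(Countdown(3))      # list() drives the iterator to the end

total = 0
for number in Countdown(4):      # a for loop does the same
    total += number

print(values)
print(total)
```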
Generators
A generator basically just creates an iterator.
- the __iter__() and next() methods are created automatically.
- data/state is tracked automatically so you don't need self.index and self.data.
- raises StopIteration for you
- in a loop, you can just call a generator function instead of using a list
See the following example mostly taken from the official Python tutorial at python.org:
def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]    # return data
for char in reverse('golf'):
    print char
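The points listed above can be observed on a generator object directly. This is a minimal sketch with a made-up generator named countdown. It uses the built-in next() function, which is available in Python 2.6 and later.

```python
def countdown(start):
    # Local state (start) is kept between yields automatically.
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)

is_own_iterator = iter(gen) is gen    # __iter__() was created for us
first = next(gen)                     # runs the body up to the first yield
rest = list(gen)                      # resumes where it left off

# Once exhausted, asking for more raises StopIteration.
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True

print(first)
print(rest)
print(finished)
```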
Generator Expressions
A generator expression is a really compact way to use a generator. It is kind of like a list comprehension except that it is placed inside parentheses instead of square brackets and is typically passed straight to a function that consumes an iterable. Generator expressions tend to use less memory than a list comprehension because the items are produced one at a time instead of being stored in a list.
Here are a few examples taken from python.org:
sum(i*i for i in range(10))    # sum of squares
xvec = [10, 20, 30]
yvec = [7, 5, 3]
sum(x*y for x,y in zip(xvec, yvec))    # dot product
uw = set(word for line in page for word in line.split())
vv = max((student.gpa, student.name) for student in graduates)
data = 'golf'
list(data[i] for i in range(len(data)-1,-1,-1))
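The difference from a list comprehension can be sketched side by side. The variable names below are chosen only for this illustration.

```python
# A list comprehension builds the whole list in memory right away.
squares_list = [i * i for i in range(10)]

# A generator expression produces the items one at a time, on demand.
squares_gen = (i * i for i in range(10))

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)

# The list can be reused; the generator is used up after one pass.
list_again = sum(squares_list)
gen_leftover = list(squares_gen)

print(total_from_list)
print(total_from_gen)
print(gen_leftover)
```

Both forms give the same sum, but only the list can be iterated a second time.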