paulgorman.org

Python notes

Whitespace

In Python, whitespace is significant. Indentation defines code blocks.
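
A minimal sketch of indentation defining blocks. The function name classify is made up for this illustration:

```python
def classify(n):
    # The indented lines below belong to the function body.
    if n < 0:
        # This deeper indentation belongs to the if block.
        return 'negative'
    # Back at function-body level: reached only when the if block did not return.
    return 'non-negative'

result = classify(-3)
```

Dedenting a line is what ends a block; there are no closing braces or end keywords.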

Documentation and help

Comments

Python comments start with a pound sign:

# This is a comment
x = 'foo' # ...and so is this.

Functions can be documented with triple-quoted docstrings, like:

def destroy_the_earth():
    """This function will destroy the planet. Not recommended."""
    pass

Docstrings can span multiple lines. If they do, the final """ should be on a new line.
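
A sketch of a multi-line docstring, using a hypothetical function destroy_the_moon. The docstring is stored on the function's __doc__ attribute, which is also what help() displays:

```python
def destroy_the_moon():
    """Destroy the moon.

    A multi-line docstring: a one-line summary, a blank line,
    then the details. The closing quotes sit on their own line.
    """
    pass

# The first line of the docstring is the summary.
summary = destroy_the_moon.__doc__.splitlines()[0]
```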

Interactive interpreter (REPL)

You can invoke an interactive Python interpreter on the command line with python. Quit the interpreter by typing quit(). Get help with help().

Variables

Numbers can be bare: x = 2.5 + 5 or print 57.

Variables must be defined before they can be used.

Values can be assigned to several variables in one line: x = y = z = 0.

Strings

Strings can be enclosed in single, double, or triple (single or double) quotes.

x = 'A string'
x = "A string"
x = "A string with the continuation backslash to break up very, very " \
    "long lines, so the code reads better"
x = """A string which preserves line breaks
so you don't need continuation characters"""
x = 'A quote inside a string shouldn\'t be left unescaped'
x = "New lines \n will be interpreted"
x = r"But a raw string will not \n interpret new lines, although you can " \
    r"still use the line continuation backslash between two literals."

The string concatenation operator is +, like x = "foo" + "bar".

Note that strings are immutable, so you can't change them in place. Operations that appear to modify a string actually build a new one.
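
A small sketch of what immutability means in practice. The variable names here are made up for the example:

```python
greeting = 'hello'
try:
    greeting[0] = 'J'          # Strings do not support item assignment.
except TypeError:
    changed_in_place = False
else:
    changed_in_place = True

# Build a new string instead of modifying the old one.
new_greeting = 'J' + greeting[1:]
shouted = greeting.upper()     # String methods also return new strings.
```

The original greeting is left untouched by both operations.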

Lists

Python has lists like birds = ['robin', 'sparrow', 'lark'].

Unlike strings, lists are mutable, so you can:

birds.append('eagle')
birds.pop()   # Pops 'eagle'
birds.pop(0)  # Pops 'robin'

Declaring and using a nested list:

pages = [
    [1, 5, 9],
    [5, 25, 28],
    [234, 9, 45]
]
for row in pages:
    for page in row:
        print page

List comprehensions

A list comprehension puts a for loop inside a list expression: it performs an operation on every member of an existing list, and the results become the members of a new list.

insults = [ 'tosser', 'neer-do-well', 'jackanape' ]
shouted_insults = [ insult.upper() for insult in insults ]
# Results in shouted_insults: ['TOSSER', 'NEER-DO-WELL', 'JACKANAPE']
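
A comprehension can also filter with a trailing if clause. A minimal sketch (the name long_insults and the six-character cutoff are just for illustration):

```python
insults = ['tosser', 'neer-do-well', 'jackanape']

# Keep only the insults longer than six characters, upper-cased.
long_insults = [insult.upper() for insult in insults if len(insult) > 6]
```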

Dictionaries

Python has associative arrays called dictionaries, like user_ids = { 'fred':10, 'ralph':32, 'barny':40 }.

You can do all kinds of stuff to dictionaries, like getting values by key, testing whether a key exists, adding and deleting entries, and looping over the whole dictionary:

print user_ids['fred']      # Prints 10
print 'ralph' in user_ids   # Prints True
user_ids['bob'] = 99        # Adds bob to dictionary
del user_ids['bob']         # Removes bob
for name, user_id in user_ids.iteritems():
    print name, user_id

Note that dictionaries are not numerically indexed, so you can't count on getting items out in any particular order. Of course, you can use the built-in sorted() function to get an ordered list of keys, for example.
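
A short sketch of getting dictionary keys out in a predictable order with the built-in sorted(). The names by_id and ordered_pairs are made up for the example:

```python
user_ids = {'fred': 10, 'ralph': 32, 'barny': 40}

names = sorted(user_ids)                     # Keys in alphabetical order
by_id = sorted(user_ids, key=user_ids.get)   # Keys ordered by their values
ordered_pairs = [(name, user_ids[name]) for name in names]
```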

Tuples and sets

Tuples are like lists but immutable. This can be desirable in certain situations, like when efficiency is a major concern. Tuples are declared with parens instead of square brackets: my_tuple = ('Roy', 'Bill', 'Sandy').
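
A sketch of tuple immutability and two things it is handy for: dictionary keys and unpacking. The grid dictionary is invented for this example:

```python
my_tuple = ('Roy', 'Bill', 'Sandy')
try:
    my_tuple[0] = 'Ann'        # Tuples do not support item assignment.
except TypeError:
    modified = False
else:
    modified = True

# Tuples of hashable items can serve as dictionary keys; lists cannot.
grid = {(0, 0): 'start', (2, 3): 'goal'}

# Tuples unpack into separate names.
first, second, third = my_tuple
```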

Another list variant is the set. I don't use sets that often, but they're incredibly useful in the rare cases when you do need them. A set is an unordered collection that does not contain duplicates. Sets support operations like union, intersection, and difference.

shopping = ['apple', 'orange', 'banana', 'tomato', 'apple', 'banana']
fruit = set(shopping)     # Shopping minus duplicates
'orange' in fruit         # Returns True
vegetables = set(['carrots', 'celery', 'tomato'])
fruit - vegetables        # Returns fruit contents without 'tomato'
fruit & vegetables        # Returns 'tomato'
fruit ^ vegetables        # Returns everything except 'tomato'
fruit | vegetables        # Returns items in either (all unique items)

Conditionals and loops

If/else conditional:

if x > 0:
    print 'x is greater than zero'
elif x == 0:
    print 'x equals zero'
else:
    print 'x is less than zero'

While loop:

x = 0
while x < 10:
    print x
    x += 1

For loop:

colors = [ 'red', 'blue', 'green', 'yellow' ]
for c in colors:
    print c

Note that you shouldn't modify the list you're iterating over in a for loop, because the loop can then skip or repeat items. To modify the list, iterate over a copy made with the slice notation:

for c in colors[:]:
    if c == 'green':
        colors.insert(0, 'lime')

If you need the numeric index, you can use the range() and len() functions:

for i in range(len(colors)):
    print 'The index of', colors[i], 'is', i
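
The built-in enumerate() is another way to get the index alongside each item. A minimal sketch that collects the messages in a list instead of printing them:

```python
colors = ['red', 'blue', 'green', 'yellow']

messages = []
for i, c in enumerate(colors):
    # enumerate() yields (index, item) pairs, starting at index 0.
    messages.append('The index of %s is %d' % (c, i))
```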

Break, continue, and else in for loops:

for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            print n, 'equals', x, '*', n/x
            break
    else:
        # loop fell through without finding a factor
        print n, 'is a prime number'

Functions

Defining a function:

def fib(n = 89): # Sets a default value for n if none is supplied.
    """Return Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b
    return result

Lambda functions are anonymous functions:

foo = lambda x: x**2
print foo(4)

would print 16.

Lambda functions are particularly useful in conjunction with filter(), map(), and reduce(), like:

print map(lambda word: len(word), "All the king's horses".split())

which would print [3, 3, 6, 6].
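
A sketch of all three functions with lambdas. It is written to run on both Python 2 and 3: in Python 3, map() and filter() return iterators (hence the list() calls) and reduce() lives in functools. The variable names are made up for the example:

```python
from functools import reduce   # A builtin in Python 2; imported from functools in Python 3.

words = "All the king's horses".split()

lengths = list(map(lambda word: len(word), words))
long_words = list(filter(lambda word: len(word) > 3, words))
total = reduce(lambda a, b: a + b, lengths)   # Folds the list into one value.
```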

Files

Reading a file:

f = open('/home/me/foo.txt')
# Use one of the following; each reads from the current file position.
text = f.read()        # Slurps entire file as string. Careful; limit by f.read(bytes)
lines = f.readlines()  # Slurps entire file, each line as list element
line = f.readline()    # Read one line at a time
while line:
    print line
    line = f.readline()
# ...alternately...
for line in f:
    print line
f.close()

Writing to a file:

f = open('/home/me/foo.txt', 'w') # Overwrites file; use 'a' to append
f.write('Some text for the file')
f.close()
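
An alternative to calling f.close() yourself is the with statement, which closes the file when the block ends, even if an error occurs. A minimal sketch; it writes to a throwaway temporary directory (an assumption made so the example doesn't touch real files):

```python
import os
import tempfile

# A throwaway path so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')

with open(path, 'w') as f:          # File is closed when the block ends.
    f.write('first line\nsecond line\n')

with open(path) as f:
    lines = [line.rstrip('\n') for line in f]

closed_after_block = f.closed
```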

Regular expressions

Regular expressions in Python are provided by the re module. Also see the regular expression howto.

import re
pattern = re.compile(r'\bt[a-z]*i\b', re.IGNORECASE | re.MULTILINE)
pattern.findall('O Tite tute Tati tibi tanta tyranne tulisti!')
# Matches 'Tati', 'tibi', 'tulisti'

The major matching methods are match(), search(), findall(), and finditer(). match() returns a MatchObject if the pattern matches at the beginning of the string, and None otherwise. search() does the same, but matches anywhere in the string, not just the beginning. findall() returns a list of substrings that match the pattern. finditer() returns an iterator that yields a MatchObject for each match.

A MatchObject, as returned by match() or search(), has methods including group(), start(), end(), and span(). group() returns the matched string. start() and end() return, respectively, the start and end position of the match within the string. span() returns both the start and end positions as a tuple.

p = re.compile(...)
m = p.match('string goes here')
if m:
    print 'Match found: ', m.group()
else:
    print 'No match'
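
A concrete sketch of the match object methods, reusing the pattern from the earlier example on a shortened version of the same text. The variable names are made up for the illustration:

```python
import re

p = re.compile(r'\bt[a-z]*i\b', re.IGNORECASE)
text = 'O Tite tute Tati tibi'

m = p.search(text)         # First match anywhere in the string
found = m.group()          # The matched substring
position = m.span()        # (start, end) of the match
at_start = p.match(text)   # None: the text does not begin with a match

# finditer() yields one match object per match.
spans = [(mo.group(), mo.start()) for mo in p.finditer(text)]
```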

If you're only going to use the pattern once, re provides top-level shortcuts like match(), search(), findall(), split(), and sub(). Without having to explicitly compile a pattern and examine the match, you can do:

print re.match(r'\bt[a-z]*imus\b', 'O Tite tute Tati tibi tanta tyranne tulisti!') # Prints None
re.split(r'[\W]+', 'Words, words, words.')    # ['Words', 'words', 'words', '']
re.split(r'([\W]+)', 'Words, words, words.')  # ['Words', ', ', 'words', ', ', 'words', '.', '']
re.split(r'[\W]+', 'Words, words, words.', 1) # ['Words', 'words, words.']

Miscellaneous and useful

The map() function applies a function to every item of a list and returns a list of the results:

def double(n):
    return n * 2
map(double, [1, 2, 3]) # Returns [2, 4, 6]

The filter() function takes a function and a list as arguments, and returns a list that contains the items for which the function returned True:

def greater_than_9(n):
    return n > 9
two_digit_numbers = filter(greater_than_9, [1, 4, 44, 7, 92, 33])
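
List comprehensions can express the same mapping and filtering without a helper function. A sketch of the equivalents (the name doubled is made up):

```python
numbers = [1, 4, 44, 7, 92, 33]

doubled = [n * 2 for n in numbers]                  # Like map() with a doubling function
two_digit_numbers = [n for n in numbers if n > 9]   # Like filter() with a n > 9 test
```

Unlike map() and filter(), which return iterators in Python 3, a comprehension always produces a list.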

Get any command line arguments supplied by the user:

import sys
print sys.argv # Prints ['myscript.py', 'arg1', 'arg2', 'arg3']

Save Python data structures to a file using the pickle module:

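import pickle
pickle.dump(my_data, f)   # f is a file opened for writing in binary mode ('wb')
my_data = pickle.load(f)  # f is a file opened for reading in binary mode ('rb')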
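
A fuller round-trip sketch. The sample data, the file name my_data.pickle, and the temporary directory are assumptions made for the example:

```python
import os
import pickle
import tempfile

my_data = {'birds': ['robin', 'sparrow'], 'count': 2}
path = os.path.join(tempfile.mkdtemp(), 'my_data.pickle')

with open(path, 'wb') as f:       # Pickle files are opened in binary mode.
    pickle.dump(my_data, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)     # A new object equal to the original
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.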

You can write to STDERR:

sys.stderr.write('Warning! Error!\n')

If you pipe some input to a Python script from the shell, read it like:

input = sys.stdin.read()

Classes and objects

Python supports object oriented programming. Define a new class:

class MyClass:
    """ This class is just an example. """
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def addup(self):
        return self.a + self.b

Instantiate an object and use it:

x = MyClass(2, 4)
print x.addup() # Prints 6

Child classes can inherit from parent classes like class DerivedClass(BaseClass):. Python also supports multiple inheritance like class DerivedClass(Base1, Base2, Base3):.
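
A minimal sketch of single inheritance with an overridden method. The describe() method and the name attribute are invented for the illustration:

```python
class BaseClass(object):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return 'I am ' + self.name


class DerivedClass(BaseClass):
    # __init__ is inherited unchanged from BaseClass.
    def describe(self):
        # Call the parent's version, then extend its result.
        return BaseClass.describe(self) + ', a derived object'


d = DerivedClass('Ned')
description = d.describe()
```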

Use empty classes like C structs

Empty class definitions can be used to make a C-struct-like bundle of data:

class Car:
    pass
c = Car()
c.model = 'Camry'
c.driver = 'Jane'
c.passengers = [ 'Terry', 'John', 'Ned' ]
c.miles = 60300

Iterators, generators, and generator expressions

Give a class an __iter__() method and a next() method, so that it can be iterated over using the for x in foo syntax.

class Reverse:
    "Iterator for looping over a sequence backwards"
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def next(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]
for character in Reverse('spam'):
    print character # Prints m, a, p, s

Generators are more succinct than hand-written iterator classes, in that they automatically create their own __iter__() and next() methods and you don't need to worry about instance variables like self.index and self.data.

def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index] # yield is the magic of generators.
for char in reverse('golf'):
    print char # Prints f, l, o, g

Simple generators can be written as one expression:

sum(i * i for i in range(10)) # Gives 285

Exception handling

There's a list of built-in exceptions. Here's how to handle exceptions:

import sys
try:
    f = open('file.txt')
    line = f.readline()
    i = int(line.strip())
except IOError as e:
    sys.stderr.write("I/O error({0}): {1}".format(e.errno, e.strerror))
except ValueError:
    sys.stderr.write("Can't convert line to an integer.")
except:
    sys.stderr.write("Unexpected error: {0}".format(sys.exc_info()[0]))
    raise

An else clause can be added after all except clauses, if there's code that must be executed if the try clause does not raise an exception.

for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except IOError:
        print 'cannot open', arg
    else:
        print arg, 'has', len(f.readlines()), 'lines'
        f.close()

There is also a finally clause, which is always executed, regardless of the outcome of the try clause.
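
A small sketch of the finally clause. The function parse_number and the events list are made up to record the order in which the clauses run:

```python
events = []

def parse_number(text):
    try:
        events.append('try')
        return int(text)
    except ValueError:
        events.append('except')
        return None
    finally:
        # Runs whether or not int() raised, even though both paths return.
        events.append('finally')

good = parse_number('42')
bad = parse_number('forty-two')
```

After both calls, events holds 'try', 'finally' for the successful call followed by 'try', 'except', 'finally' for the failed one.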

You can create your own custom exceptions by defining a new class that inherits from Exception:

class MyException(Exception):
    def __init__(self, value):
        self.value = value
    def __str__(self):
        return repr(self.value)
try:
    raise MyException(2 + 2)
except MyException as e:
    print 'My exception occurred: ', e.value

Coding style

There is a Style Guide for Python Code, which includes:

* Personally, I think spam(ham[1], {eggs: 2}) is more readable.