<?php include('HEADER.php'); ?>

<h1>Python notes</h1>

<ul>
    <li><a href="#whitespace">Whitespace</a></li>
    <li><a href="#help">Documentation and help</a></li>
    <li><a href="#comments">Comments</a></li>
    <li><a href="#interpreter">Interactive interpreter</a></li>
    <li><a href="#variables">Variables</a></li>
    <li><a href="#conditionals">Conditionals and loops</a></li>
    <li><a href="#functions">Functions</a></li>
    <li><a href="#files">Files</a></li>
    <li><a href="#regex">Regular expressions</a></li>
    <li><a href="#misc">Miscellaneous and useful</a></li>
    <li><a href="#objects">Classes and objects</a></li>
    <li><a href="#exceptions">Exception handling</a></li>
    <li><a href="#codingstyle">Coding style</a></li>
</ul>

<h2 id="whitespace">Whitespace</h2>

<p>In Python, whitespace is significant. Indentation defines code blocks.</p>

<h2 id="help">Documentation and help</h2>

<ul>
    <li><a href="http://docs.python.org/tutorial/">The Python Tutorial</a></li>
    <li><a href="http://docs.python.org/library/">Python standard library docs</a></li>
    <li><a href="http://www.python.org/doc/">Official Python docs</a></li>
    <li><a href="http://www.pasteur.fr/formation/infobio/python/">Introduction to Programming using Python</a></li>
</ul>

<h2 id="comments">Comments</h2>

<p>Python comments start with a pound sign:</p>

<code class="prettyprint"># This is a comment
x = 'foo' # ...and so is this.</code>

<p>Functions can be documented with triple-quoted docstrings, like:</p>

<code class="prettyprint">def destroy_the_earth():
    """This function will destroy the planet. Not recommended."""
    pass</code>

<p>Docstrings can span multiple lines. If they do, the final """ should be on a new line.</p>
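
<p>A multi-line docstring might look like the sketch below. The <code class="prettyprint">confirm</code> parameter and the return values are hypothetical, added only so the example does something observable.</p>

```python
def destroy_the_earth(confirm=False):
    """Destroy the planet. Not recommended.

    The confirm parameter is a hypothetical safety switch added for
    this illustration; nothing happens unless it is True.
    """
    return 'boom' if confirm else 'planet spared'
```

<p>The docstring is then available as <code class="prettyprint">destroy_the_earth.__doc__</code> and through <code class="prettyprint">help(destroy_the_earth)</code>.</p>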

<h2 id="interpreter">Interactive interpreter (REPL)</h2>

<p>You can invoke an interactive Python interpreter on the command line with <code class="prettyprint">python</code>. Quit the interpreter by typing <code class="prettyprint">quit()</code>. Get help with <code class="prettyprint">help()</code>.</p>

<h2 id="variables">Variables</h2>

<p>Numeric literals are written bare, without quotes: <code class="prettyprint">x = 2.5 + 5</code> or <code class="prettyprint">print 57</code>.</p>

<p>Variables must be defined before they can be used.</p>

<p>Values can be assigned to several variables in one line: <code class="prettyprint">x = y = z = 0</code>.</p>

<h3 id="strings">Strings</h3>

<p>Strings can be enclosed in single, double, or triple (single or double) quotes.</p>

<code class="prettyprint">x = 'A string'
x = "A string"
x = "A string with the continuation backslash to break up very, very" \
    "long lines, so the code reads better"
x = """A string which preserves 
line breaks so you don't need 
continuation characters"""
x = 'An apostrophe shouldn\'t be left unescaped inside single quotes'
x = "New lines \n will be interpreted"
x = r"But a raw string will not interpret \n as a new line"
# A backslash at the end of a line inside a raw string is kept literally, along with the line break.</code>

<p>The string concatenation operator is <code class="prettyprint">+</code>, like <code class="prettyprint">x = "foo" + "bar"</code>.</p>

<p>Note that strings are immutable: you can't change one in place, so any operation that appears to modify a string actually returns a new string.</p>
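
<p>A minimal sketch of what immutability means in practice. The variable names are made up for illustration, and print is left out so the snippet runs unchanged on Python 2 or 3.</p>

```python
greeting = 'hello world'

# Item assignment such as greeting[0] = 'H' raises TypeError,
# so build a new string from pieces of the old one instead.
capitalized = 'H' + greeting[1:]

# String methods also return a new string and leave the original alone.
shouted = greeting.upper()
```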

<h3 id="lists">Lists</h3>

<p>Python has lists like <code class="prettyprint">birds = ['robin', 'sparrow', 'lark']</code>.</p>

<p>Unlike strings, lists are mutable, so you can:</p>

<code class="prettyprint">birds.append('eagle')
birds.pop() # Pops 'eagle'
birds.pop(0) # Pops 'robin'</code>

<p>Declaring and using a nested list:</p>

<code class="prettyprint">pages = [
    [1, 5, 9],
    [5, 25, 28],
    [234, 9, 45]
]
for row in pages:
    for page in row:
        print page</code>

<h4>List comprehensions</h4>

<p>A list comprehension nests a for loop in a list declaration. It applies an expression to every member of an existing list, and the results become the members of the new list.</p>

<code class="prettyprint">insults = [ 'tosser', 'neer-do-well', 'jackanape' ]
shouted_insults = [ insult.upper() for insult in insults ]
# Results in shouted_insults: TOSSER, NEER-DO-WELL, JACKANAPE</code>
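
<p>A comprehension can also filter as it goes by adding an <code class="prettyprint">if</code> clause. A small sketch, reusing the list above; the filter condition is arbitrary and chosen only for illustration.</p>

```python
insults = ['tosser', 'neer-do-well', 'jackanape']

# Skip the hyphenated insults, then shout the rest.
short_insults = [insult.upper() for insult in insults if '-' not in insult]
# short_insults is now ['TOSSER', 'JACKANAPE']
```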

<h3 id="dictionaries">Dictionaries</h3>

<p>Python has associative arrays called dictionaries, like <code class="prettyprint">user_ids = { 'fred':10, 'ralph':32, 'barny':40 }</code>.</p>

<p>You can look up values by key, test whether a key exists, add and remove entries, and loop over dictionaries:</p>

<code class="prettyprint">print user_ids['fred']  # Prints 10
print 'ralph' in user_ids  # Prints True
user_ids['bob'] = 99  # Adds bob to dictionary
del user_ids['bob']  # Removes bob
for name, id in user_ids.iteritems():
    print name, id
</code>

<p>Note that dictionaries are not numerically indexed, so you can't count on getting items in or out in any particular order. Of course, you can use a function like sorted() to get an ordered list of keys, for example.</p>
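
<p>One way to walk a dictionary in a predictable order is to sort its keys first. A minimal sketch using the built-in <code class="prettyprint">sorted()</code>; the variable names are illustrative.</p>

```python
user_ids = {'fred': 10, 'ralph': 32, 'barny': 40}

# sorted() on a dictionary returns a sorted list of its keys.
names_in_order = sorted(user_ids)

# Look each value up in key order.
ids_in_name_order = [user_ids[name] for name in names_in_order]
```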

<h4>Tuples and sets</h4>

<p>Tuples are like lists but immutable. This can be desirable in certain situations, like when efficiency is a major concern. Tuples are declared with parens instead of square brackets: <code class="prettyprint">my_tuple = ('Roy', 'Bill', 'Sandy')</code>.</p>

<p>Another list variant is the <a href="http://docs.python.org/library/stdtypes.html#set-types-set-frozenset">set</a>. I don't use sets that often, but they're incredibly useful in the rare cases when you do need them. A set is an unordered collection that does not contain duplicates. Sets support operations like union, intersection, and difference.</p>

<code class="prettyprint">shopping = ['apple', 'orange', 'banana', 'tomato', 'apple', 'banana']
fruit = set(shopping) # Shopping minus duplicates
'orange' in fruit # Returns True
vegetables = set(['carrots', 'celery', 'tomato'])
fruit - vegetables # Returns fruit contents without 'tomato'
fruit &amp; vegetables # Returns 'tomato'
fruit ^ vegetables # Returns everything except 'tomato'
fruit | vegetables # Returns items in either (all unique items)</code>

<h2 id="conditionals">Conditionals and loops</h2>

<p>If/else conditional:</p>

<code class="prettyprint">if x &gt; 0:
    print 'x is greater than zero'
elif x == 0:
    print 'x equals zero'
else:
    print 'x is less than zero'</code>

<p>While loop:</p>

<code class="prettyprint">x = 0
while x &lt; 10:
    print x
    x += 1</code>

<p>For loop:</p>

<code class="prettyprint">colors = [ 'red', 'blue', 'green', 'yellow' ]
for c in colors:
    print c</code>

<p>Note that it is unsafe to modify the list you're iterating over in a for loop. To modify the list, iterate over a copy made with the slice notation:</p>

<code class="prettyprint">for c in colors[:]:
    if c == 'green': colors.insert(0, 'lime')  # Inserts 'lime' at the front</code>

<p>If you need the numeric index, you can use the range() and len() functions:</p>

<code class="prettyprint">for i in range(len(colors)):
    print 'The index of', colors[i], 'is', i</code>
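
<p>The built-in <code class="prettyprint">enumerate()</code> is a common alternative that yields the index and the item together. A minimal sketch; it collects the pairs in a list rather than printing them, so it runs unchanged on Python 2 or 3.</p>

```python
colors = ['red', 'blue', 'green', 'yellow']

indexed = []
for i, color in enumerate(colors):
    # i is the position, color is the item at that position.
    indexed.append((i, color))
```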

<p>Break, continue, and else in for loops:</p>

<code class="prettyprint">for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            print n, 'equals', x, '*', n/x
            break
    else:
        # loop fell through without finding a factor
        print n, 'is a prime number'</code>

<h2 id="functions">Functions</h2>

<p>Defining a function:</p>

<code class="prettyprint">def fib(n = 89): # Sets a default value for n if none is supplied.
    """Return Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a &lt; n:
        result.append(a)
        a, b = b, a + b
    return result</code>

<p>Lambda functions are anonymous functions:</p>

<code class="prettyprint">foo = lambda x: x**2
print foo(4)</code>

<p>would print 16.</p>

<p>Lambda functions are particularly useful in conjunction with filter(), map(), and reduce(), like:</p>

<code class="prettyprint">print map(lambda word: len(word), "All the king's horses".split() )</code>

<p>which would print 3, 3, 6, 6.</p>
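
<p>A sketch of the same idea with filter() and reduce() added. The import from functools and the list() wrappers are assumptions made for portability: reduce() is a built-in on Python 2, and on Python 3 map() and filter() return iterators rather than lists.</p>

```python
from functools import reduce  # Also a built-in on Python 2

words = "All the king's horses".split()

# Length of each word.
lengths = list(map(lambda word: len(word), words))

# Only the words that are exactly six characters long.
six_letter_words = list(filter(lambda word: len(word) == 6, words))

# Fold the lengths into a single total.
total_length = reduce(lambda total, n: total + n, lengths)
```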

<h2 id="files">Files</h2>

<p>Reading a file:</p>

<code class="prettyprint">f = open('/home/me/foo.txt')
text = f.read()  # Slurps entire file as string. Careful; limit by f.read(<i>bytes</i>)
f.seek(0)  # Rewind; each approach below consumes the file
lines = f.readlines() # Slurp entire file, each line as list element
f.seek(0)
line = f.readline() # Read one line at a time
while line:
    print line
    line = f.readline()
f.seek(0)
# ...alternately...
for line in f:
    print line
f.close()</code>

<p>Writing to a file:</p>

<code class="prettyprint">f = open('/home/me/foo.txt', 'w') # Overwrites file; use 'a' to append
f.write('Some text for the file')
f.close()</code>
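
<p>On Python 2.6 and later, the <code class="prettyprint">with</code> statement closes the file for you, even if an error occurs partway through. A minimal sketch; the temporary directory and file name are hypothetical, used so the example doesn't touch real data.</p>

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')

with open(path, 'w') as f:  # Closed automatically when the block ends
    f.write('first line\nsecond line\n')

with open(path) as f:
    # Iterate line by line, dropping the trailing newline from each.
    lines = [line.rstrip('\n') for line in f]
```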

<h2 id="regex">Regular expressions</h2>

<p>Regular expressions in Python are provided by the <a href="http://docs.python.org/library/re.html#module-re">re module</a>. Also see the <a href="http://docs.python.org/howto/regex.html#regex-howto">regular expression howto</a>.</p>

<code class="prettyprint">import re
pattern = re.compile(r'\bt[a-z]*i\b', re.IGNORECASE | re.MULTILINE)
pattern.findall('O Tite tute Tati tibi tanta tyranne tulisti!')
# Matches 'Tati', 'tibi', 'tulisti'</code>

<p>The major matching methods are match(), search(), findall(), and finditer(). <code class="prettyprint">match()</code> returns a MatchObject if the pattern matches at the beginning of the string, and None otherwise. <code class="prettyprint">search()</code> does the same, but matches anywhere in the string, not just at the beginning. <code class="prettyprint">findall()</code> returns a list of substrings that match the pattern. <code class="prettyprint">finditer()</code> returns an iterator that yields a MatchObject for each match.</p>

<p>A MatchObject, as returned by match() or search(), has methods including group(), start(), end(), and span(). <code class="prettyprint">group()</code> returns the matched string. <code class="prettyprint">start()</code> and <code class="prettyprint">end()</code> return, respectively, the start and end position of the match within the string. <code class="prettyprint">span()</code> returns both the start and end positions as a tuple.</p>

<code class="prettyprint">p = re.compile(...)
m = p.match('string goes here')
if m:
    print 'Match found: ', m.group()
else:
    print 'No match'</code>
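
<p>The MatchObject methods described above can be sketched like this. The pattern and the shortened test string are borrowed from the earlier findall() example; the positions in the comments are character offsets into that string.</p>

```python
import re

m = re.search(r'\bt[a-z]*i\b', 'O Tite tute Tati tibi', re.IGNORECASE)

matched = m.group()   # 'Tati', the first word starting with t and ending in i
start = m.start()     # 12
end = m.end()         # 16
positions = m.span()  # (12, 16)
```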

<p>If you're only going to use the pattern once, re provides top-level shortcuts like match(), search(), findall(), split(), and sub(). Without having to explicitly compile a pattern and examine the match, you can do:</p>

<code class="prettyprint">print re.match(r'\bt[a-z]*imus\b', 'O Tite tute Tati tibi tanta tyranne tulisti!')
# Prints None
re.split(r'[\W]+', 'Words, words, words.')
# ['Words', 'words', 'words', '']
re.split(r'([\W]+)', 'Words, words, words.')
# ['Words', ', ', 'words', ', ', 'words', '.', '']
re.split(r'[\W]+', 'Words, words, words.', 1)
# ['Words', 'words, words.']</code>

<h2 id="misc">Miscellaneous and useful</h2>

<p>The map() function applies a function to every item of a list and returns a list of the results:</p>

<code class="prettyprint">def double(n):
    return n * 2
map(double, [1, 2, 3]) # Returns 2, 4, 6</code>

<p>The filter() function takes a function and a list as arguments, and returns a list that contains the items for which the function returned True:</p>

<code class="prettyprint">def greater_than_9(n):
    return n &gt; 9
two_digit_numbers = filter(greater_than_9, [1, 4, 44, 7, 92, 33])  # [44, 92, 33]</code>

<p>Get any command line arguments supplied by the user:</p>

<code class="prettyprint">import sys
print sys.argv
# Prints 'myscript.py', 'arg1', 'arg2', 'arg3'</code>

<p>Save Python data structures to a file using the pickle module:</p>

<code class="prettyprint">import pickle
pickle.dump(my_data, f)  # f is a file object opened with mode 'wb'
my_data = pickle.load(f)  # f is a file object opened with mode 'rb'</code>
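
<p>A fuller round trip might look like the sketch below. The sample data, the temporary directory, and the file name are all hypothetical.</p>

```python
import os
import pickle
import tempfile

my_data = {'fred': 10, 'birds': ['robin', 'sparrow', 'lark']}

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'data.pkl')

with open(path, 'wb') as f:  # Pickle files should be opened in binary mode
    pickle.dump(my_data, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)  # A new object equal to the original
```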

<p>You can write to STDERR:</p>

<code class="prettyprint">sys.stderr.write('Warning! Error!\n')</code>

<p>If you cat some input to a python script from the shell, read it like:</p>

<code class="prettyprint">data = sys.stdin.read()  # Named data to avoid shadowing the built-in input()</code>

<h2 id="objects">Classes and objects</h2>

<p>Python supports object oriented programming. Define a new class:</p>

<code class="prettyprint">class MyClass:
    """ This class is just an example. """
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def addup(self):
        return self.a + self.b</code>

<p>Instantiate an object and use it:</p>

<code class="prettyprint">x = MyClass(2, 4)
print x.addup()  # Prints 6</code>

<p>Child classes can inherit from parent classes like <code class="prettyprint">class DerivedClass(BaseClass):</code>. Python also supports multiple inheritance like <code class="prettyprint">class DerivedClass(Base1, Base2, Base3):</code>.</p>
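
<p>A minimal sketch of single inheritance, in which a child class overrides a method and reuses the parent's version. The Animal and Dog classes are hypothetical names invented for this example.</p>

```python
class Animal(object):
    def __init__(self, name):
        self.name = name

    def speak(self):
        return self.name + ' makes a sound'


class Dog(Animal):
    def speak(self):
        # Call the parent implementation, then extend its result.
        return Animal.speak(self) + ': woof'


rex = Dog('Rex')  # __init__ is inherited from Animal
message = rex.speak()
```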

<h3>Use empty classes like C structs</h3>

<p>Empty class definitions can be used to make a C-struct-like bundle of data:</p>

<code class="prettyprint">class Car:
    pass
c = Car()
c.model = 'Camry'
c.driver = 'Jane'
c.passengers = [ 'Terry', 'John', 'Ned' ]
c.miles = 60300</code>

<h3>Iterators, generators, and generator expressions</h3>

<p>Give a class an <code class="prettyprint">__iter__()</code> method, plus a <code class="prettyprint">next()</code> method on the iterator it returns, so that it can be iterated over using the <code class="prettyprint">for x in foo</code> syntax.</p>

<code class="prettyprint">class Reverse:
    "Iterator for looping over a sequence backwards"
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def next(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]
for character in Reverse('spam'):
    print character # Prints maps</code>

<p>Generators are more succinct than iterator classes, in that they automatically create their own __iter__() and next() methods and you don't need to worry about instance variables like self.index and self.data.</p>

<code class="prettyprint">def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]  # The yield statement is the magic of generators.
for char in reverse('golf'):
    print char  # Prints flog</code>

<p>Simple generators can be written as one expression:</p>

<code class="prettyprint">sum(i * i for i in range(10))  # Gives 285</code>

<h2 id="exceptions">Exception handling</h2>

<p>There's a list of <a href="http://docs.python.org/library/exceptions.html#bltin-exceptions">built-in exceptions</a>. Here's how to handle exceptions:</p>

<code class="prettyprint">import sys
try:
    f = open('file.txt')
    line = f.readline()
    i = int(line.strip())
except IOError as (errno, strerror):
    sys.stderr.write("I/O error({0}): {1}".format(errno, strerror))
except ValueError:
    sys.stderr.write("Can't convert line to an integer.")
except:
    sys.stderr.write("Unexpected error: {0}".format(sys.exc_info()[0]))
    raise</code>

<p>An else clause can be added after all except clauses, if there's code that must be executed if the try clause does <em>not</em> raise an exception.</p>

<code class="prettyprint">for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except IOError:
        print 'cannot open', arg
    else:
        print arg, 'has', len(f.readlines()), 'lines'
        f.close()</code>

<p>There is also a <code class="prettyprint">finally</code> clause, which is always executed, regardless of the outcome of the try clause.</p>
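
<p>A small sketch of <code class="prettyprint">finally</code> in action. The parse_number function and the events list are hypothetical, used only to record the order in which the clauses run.</p>

```python
events = []

def parse_number(text):
    try:
        return int(text)
    except ValueError:
        events.append('bad input')
        return None
    finally:
        # Runs on the way out whether or not an exception was raised.
        events.append('done')

good = parse_number('42')
bad = parse_number('forty-two')
# events is now ['done', 'bad input', 'done']
```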

<p>You can create your own custom exceptions by defining a new class that inherits from Exception:</p>

<code class="prettyprint">class MyException(Exception):
    def __init__(self, value):
        self.value = value
    def __str__(self):
        return repr(self.value)
try:
    raise MyException(2 + 2)
except MyException as e:
    print 'My exception occurred: ', e.value</code>

<h2 id="codingstyle">Coding style</h2>

<p>There is a <a href="http://www.python.org/dev/peps/pep-0008/">Style Guide for Python Code</a>, which includes:</p>

<ul>
    <li>Use 4 spaces for each indentation level (not tabs). Never mix tabs and spaces.</li>
    <li>Wrap lines at 79 characters where possible.</li>
    <li>Use two blank lines between top level functions and class definitions. Use one blank line between method definitions. A blank line may be used ("sparingly") between logical sections of code.</li>
    <li>Each import declaration should be on its own line.</li>
    <li>Leave spaces around operators, like <code class="prettyprint">x = a + b</code>.</li>
    <li>Don't put spaces before or after brackets, like <code class="prettyprint">spam(ham[1], {eggs: 2})</code>, not anything like <code class="prettyprint">spam (ham [ 1 ], {eggs:2})</code>. *</li>
    <li>Modules should have short, all_lowercase_names, though preferably short enough that underscores are unnecessary. Classes should be named in CapitalizedWords. Functions, methods, and instance variables should be lowercase_words_separated_by_underscores. Constants should be ALL_CAPS.</li>
</ul>

<p style="font-style:oblique;">* Personally, I think <code class="prettyprint">spam(ham[1], {eggs: 2})</code> is more readable.</p>


<?php include('../FOOTER.php'); ?>
