paulgorman.org/technical

Python as a Glue Language

(July 2016)

Get command line arguments

import sys
script_name = sys.argv[0]
first_argument = sys.argv[1]
second_argument = sys.argv[2]
final_argument = sys.argv[-1]

Read a file line by line

with open('/tmp/myfile', 'r') as f:
	for line in f:
		print(line.rstrip('\n'))	# each line already ends with a newline

Slurp in an entire (not too large) file at once

with open('/tmp/myfile', 'r') as f:
	stuff = f.read()

Use readlines() rather than read() if we want a list of lines rather than one string containing the whole file.
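
As a rough sketch of the difference, the following writes a small scratch file (the temporary path is only an assumption for the demo, not part of the recipe) and reads it back both ways:

```python
import os
import tempfile

# Create a scratch file so the example does not depend on /tmp/myfile existing.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as f:
    f.write('one\ntwo\n')

with open(path, 'r') as f:
    as_string = f.read()        # one string holding the whole file

with open(path, 'r') as f:
    as_list = f.readlines()     # a list with one string per line, newlines kept

os.remove(path)
```

Each element returned by readlines() keeps its trailing newline, so we may want to strip() the lines before using them.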

Check if a file exists

import os.path
os.path.isfile('/tmp/myfile')

Unlike isfile(), os.path.exists() also returns True for directories.
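
A small sketch of how the two checks differ, using a scratch directory and file (both names are made up for the demo):

```python
import os.path
import shutil
import tempfile

scratch_dir = tempfile.mkdtemp()                    # a directory that certainly exists
scratch_file = os.path.join(scratch_dir, 'myfile')
open(scratch_file, 'w').close()                     # create an empty regular file

# (isfile, exists) for the regular file
file_checks = (os.path.isfile(scratch_file), os.path.exists(scratch_file))
# (isfile, exists, isdir) for the directory
dir_checks = (os.path.isfile(scratch_dir),
              os.path.exists(scratch_dir),
              os.path.isdir(scratch_dir))

shutil.rmtree(scratch_dir)
```

If we specifically want a directory, os.path.isdir() is the counterpart to isfile().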

Write to a file

with open('/tmp/myfile', 'w') as f:
	f.write('Hello, world\n')

The ‘w’ argument overwrites the existing content; ‘a’ appends. https://docs.python.org/3/library/functions.html#open
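
To see the two modes side by side, here is a minimal sketch against a scratch file (the temporary path is an assumption for the demo):

```python
import os
import tempfile

fd, path = tempfile.mkstemp(text=True)
os.close(fd)

with open(path, 'w') as f:
    f.write('first\n')
with open(path, 'a') as f:      # 'a' keeps 'first' and adds to the end
    f.write('second\n')
with open(path, 'r') as f:
    after_append = f.read()

with open(path, 'w') as f:      # 'w' truncates the file before writing
    f.write('third\n')
with open(path, 'r') as f:
    after_overwrite = f.read()

os.remove(path)
```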

Recurse through directory tree

import os
for thisdir, subdirs, files in os.walk('/tmp'):
	for f in files:
		print(os.path.join(thisdir, f))

Or, to list just the entries directly under a directory (glob does not recurse by default):

import glob
for f in glob.glob('/tmp/*'):
	print(f)

All text files in a directory non-recursively

import glob
for f in glob.glob('/tmp/*.txt'):
	print(f)

Python 3.5+ adds ** to glob, so it can recurse: glob.iglob('/tmp/foo/**/*.txt', recursive=True).
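
A sketch of the recursive form, run against a throwaway directory tree (the file and directory names are invented for the demo):

```python
import glob
import os
import shutil
import tempfile

# Build a small tree: top.txt, a/mid.txt, a/skip.log, a/b/deep.txt
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'a', 'b'))
for rel in ('top.txt',
            os.path.join('a', 'mid.txt'),
            os.path.join('a', 'skip.log'),
            os.path.join('a', 'b', 'deep.txt')):
    open(os.path.join(root, rel), 'w').close()

# '**' matches zero or more directory levels when recursive=True.
pattern = os.path.join(root, '**', '*.txt')
found = sorted(os.path.relpath(p, root)
               for p in glob.iglob(pattern, recursive=True))

shutil.rmtree(root)
```

Without recursive=True, ** behaves like a plain * and matches only one directory level.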

Write to standard error

import sys
sys.stderr.write("ERROR: line printer on fire!!!\n")

Note that sys.stderr.write() expects a string, so (unlike with print()) we may need to convert some values manually with str(foo).
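
A short sketch of both approaches on Python 3; the two helper names are made up for illustration:

```python
import sys

def warn_with_write(value):
    # write() accepts only a string, so convert and add the newline ourselves.
    sys.stderr.write('ERROR: ' + str(value) + '\n')

def warn_with_print(value):
    # print() converts its arguments to strings and appends the newline.
    print('ERROR:', value, file=sys.stderr)

warn_with_write(42)
warn_with_print(42)
```

On Python 3, print(..., file=sys.stderr) is often the more convenient of the two.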

Escape a string for the shell

Arguments passed as list elements to the subprocess.* functions need no escaping, so use those unless we’re doing some kind of code generation (in which case, look at shlex.quote for Python 3.3+ and pipes.quote for earlier Python versions).
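
For the code generation case, a minimal sketch with shlex.quote (the file name is a deliberately nasty, made-up example):

```python
import shlex

filename = "my file; rm -rf ~.txt"      # hostile-looking name, for illustration only

# quote() wraps the string so a POSIX shell treats it as one literal word.
command_line = 'ls -l ' + shlex.quote(filename)

# Splitting the line the way a shell would gives back the original name.
round_trip = shlex.split(command_line)
```

The quoting rules are those of POSIX shells, so this is not a safe way to build command lines for cmd.exe.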

Run a shell command

import subprocess
exit_code = subprocess.call(['touch', '/tmp/foo'])

As we would expect, an exit code of zero is good. An alternative is subprocess.check_call(), which raises CalledProcessError for a non-zero return value.
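
A sketch of the difference between the two calls. To keep it runnable anywhere, sys.executable stands in for a real command here (an assumption of the demo):

```python
import subprocess
import sys

# A child process that always exits with status 3.
failing = [sys.executable, '-c', 'import sys; sys.exit(3)']

exit_code = subprocess.call(failing)        # just returns the exit status

try:
    subprocess.check_call(failing)
    caught = None
except subprocess.CalledProcessError as e:
    caught = e.returncode                   # the non-zero exit status
```

check_call() suits scripts that should stop at the first failed command, since an unhandled CalledProcessError ends the script.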

The old and less safe way:

import os
exit_code = os.system('touch /tmp/foo')

Get output of a shell command

import subprocess
foo = subprocess.check_output(['awk', '-F:', '{ print $6, $7 }', '/etc/passwd'])

The advantage of passing the command and its arguments as a list rather than one long string is that we don’t need to worry about shell escapes.
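
To illustrate, this sketch passes an argument full of shell metacharacters to a child process, which echoes it back untouched. The child is a Python one-liner standing in for a real command (an assumption of the demo):

```python
import subprocess
import sys

tricky = 'two words; $HOME "quoted"'

# The child prints its first argument; no shell ever interprets the string.
out = subprocess.check_output(
    [sys.executable, '-c', 'import sys; print(sys.argv[1])', tricky])

# On Python 3, check_output() returns bytes, so decode before comparing.
echoed = out.decode().strip()
```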

Send unix mail

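import subprocess
p = subprocess.Popen(
	# No shell is involved, so the subject needs no extra quoting.
	['mail', '-s', 'Error in system!', 'root'],
	stdout=subprocess.PIPE,
	stdin=subprocess.PIPE,
	stderr=subprocess.STDOUT
)
# communicate() expects bytes on Python 3.
p.communicate(b'This is the message body.\n')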

Hook up to the shell’s standard handles

import subprocess
p = subprocess.Popen(
	['bc'],
	stdout = subprocess.PIPE,
	stderr = subprocess.PIPE,
	stdin = subprocess.PIPE)
p.stdin.write(b'2 + 2\n')	# pipes carry bytes on Python 3
stdout, stderr = p.communicate()
print(stdout.decode())

Note that communicate() sends EOF and waits for the subprocess to exit, so we can only call it once. We could make repeated calls to p.stdin.write() and p.stdout.read(), but we’d have to figure out how long to wait for the subprocess to respond each time and avoid deadlocks. If we want to interact with a subprocess repeatedly, try the pexpect module.
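
For the one-shot case, communicate() can also take the input directly. A sketch, with a small Python child standing in for bc (an assumption so the example runs without bc installed):

```python
import subprocess
import sys

# The child reads all of stdin and writes it back upper-cased.
child = [sys.executable, '-c',
         'import sys; sys.stdout.write(sys.stdin.read().upper())']

p = subprocess.Popen(child,
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)

# Sends the input, closes stdin (EOF), reads both pipes, and waits for exit.
stdout, stderr = p.communicate(b'hello\n')
exit_code = p.returncode
```

Because communicate() reads stdout and stderr together while feeding stdin, it avoids the deadlock we risk when juggling the pipes by hand.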

Essentially, there’s no good, built-in equivalent of Ruby’s open3.

Spawn a new process, and don’t wait for it to exit

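import subprocess
# Popen returns immediately; tar keeps running in the background.
p = subprocess.Popen(['tar', 'xf', 'foo.tar'],
	# DEVNULL (Python 3.3+) avoids unread pipes filling up and stalling the child.
	stdout = subprocess.DEVNULL,
	stderr = subprocess.DEVNULL,
	stdin = subprocess.DEVNULL)
pid = p.pid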

Fork a copy and communicate via Unix sockets

import socket
import os
parent, child = socket.socketpair()
pid = os.fork()
if pid:
	print('Parent sending message...')
	child.close()
	parent.sendall(b'ping')	# sockets carry bytes
	response = parent.recv(1024)
	print('response from child:', response)
	parent.close()
	os.waitpid(pid, 0)	# reap the child so it does not linger as a zombie
else:
	print('Child waiting for message...')
	parent.close()
	message = child.recv(1024)
	print('message from parent:', message)
	child.sendall(b'pong')
	child.close()

Read from and write to a Unix named pipe (FIFO)

Writer:

import os
path = '/tmp/myfifo'
if not os.path.exists(path):
	os.mkfifo(path)
# open() blocks until a reader opens the other end.
with open(path, 'w') as f:
	f.write("Hello, fifo!\n")

Reader:

with open('/tmp/myfifo', 'r') as f:
	for line in f:
		print('Read: ' + line)

Web serve the current directory

From the command line:

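% python3 -m http.server 8888

(On Python 2, the equivalent is python -m SimpleHTTPServer 8888.)

Or:

import http.server
import socketserver
# Python 2 called these modules SimpleHTTPServer and SocketServer.
PORT = 8888
Handler = http.server.SimpleHTTPRequestHandler
socketserver.TCPServer(('', PORT), Handler).serve_forever()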

Web serve a simple message

import http.server	# BaseHTTPServer on Python 2
PORT = 8888
class MyHandler(http.server.BaseHTTPRequestHandler):
	def do_GET(self):
		self.send_response(200)
		self.send_header('Content-type', 'text/html')
		self.end_headers()
		self.wfile.write(b'Hello, web!')	# the response body must be bytes
http.server.HTTPServer(('', PORT), MyHandler).serve_forever()

Fetch URL (GET)

from urllib.request import urlopen	# Python 2: from urllib import urlopen
from urllib.parse import urlencode	# Python 2: from urllib import urlencode
url = 'http://httpbin.org/get'
query = urlencode({
	'name0': 'value0',
	'name1': 'value1'
})
response = urlopen(url + '?' + query).read()

Fetch URL (POST)

from urllib.request import urlopen	# Python 2: from urllib import urlopen
from urllib.parse import urlencode	# Python 2: from urllib import urlencode
url = 'http://httpbin.org/post'
query = urlencode({
	'name0': 'value0',
	'name1': 'value1'
}).encode('ascii')
response = urlopen(url, query).read()

Die!

import sys
sys.exit('Oh, crap!')

Note that sys.exit() raises SystemExit, which ends the interpreter if nothing catches it. If our script was run from another with execfile() (or exec() on Python 3), the parent script dies too unless it catches the SystemExit exception around that call.
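
A minimal sketch of catching SystemExit so the caller survives; risky() is a made-up stand-in for whatever code calls sys.exit():

```python
import sys

def risky():
    sys.exit('Oh, crap!')       # raises SystemExit

try:
    risky()
    survived = False            # not reached
except SystemExit as e:
    survived = True
    exit_message = e.code       # the argument that was passed to sys.exit()
```

When a string passed to sys.exit() goes uncaught, Python prints it to stderr and exits with status 1.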

Regular Expressions