Python as a Glue Language
=========================

(July 2016)

## Get command line arguments ##

    import sys
    script_name = sys.argv[0]
    first_argument = sys.argv[1]
    second_argument = sys.argv[2]
    final_argument = sys.argv[-1]

## Reading a file line by line ##

    with open('/tmp/myfile', 'r') as f:
        for line in f:
            print(line)

## Slurp in an entire (not too large) file at once ##

    with open('/tmp/myfile', 'r') as f:
        stuff = f.read()

Use `readlines()` rather than `read()` if we want a list of lines rather than one string containing all lines.

## Check if a file exists ##

    import os.path
    os.path.isfile('/tmp/myfile')

Unlike `isfile()`, the `exists()` function also returns true for directories.

## Write to a file ##

    with open('/tmp/myfile', 'w') as f:
        f.write('Hello, world\n')

The 'w' argument overwrites the existing content; 'a' appends.
https://docs.python.org/3/library/functions.html#open

## Recurse through directory tree ##

    import os
    for thisdir, subdirs, files in os.walk('/tmp'):
        for f in files:
            print(os.path.join(thisdir, f))

Or, to list just the top level without recursing:

    import glob
    for f in glob.glob('/tmp/*'):
        print(f)

## All text files in a directory non-recursively ##

    import glob
    for f in glob.glob('/tmp/*.txt'):
        print(f)

Python 3.5+ adds ** to glob, so it can be used like `glob.iglob('/tmp/foo/**/*.txt', recursive=True)`.

## Print to STDERR ##

    import sys
    sys.stderr.write("ERROR: line printer on fire!!!\n")

Note that `sys.stderr.write()` expects a string, so (unlike with `print()`) we may need to manually `str(foo)` some things.

## Escape a string for the shell ##

It's unnecessary to escape list elements passed to any of the subprocess.* functions, so use those unless we're doing some kind of code generation (in which case, look at shlex.quote for Python 3.3+ and pipes.quote for earlier Python versions).

## Run a shell command ##

    import subprocess
    exit_code = subprocess.call(['touch', '/tmp/foo'])

As we would expect, an exit code of zero is good.
An alternative is `subprocess.check_call()`, which raises CalledProcessError for a non-zero return value.

The old and less safe way:

    import os
    exit_code = os.system('touch /tmp/foo')

## Get output of a shell command ##

    import subprocess
    foo = subprocess.check_output(['awk', '-F:', '{ print $6, $7 }', '/etc/passwd'])

The advantage of passing the command and arguments as a list rather than one long string is that we don't need to worry about shell escapes.

## Send unix mail ##

    import subprocess
    p = subprocess.Popen(
        # No shell is involved, so the subject needs no extra quoting.
        ['mail', '-s', 'Error in system!', 'root'],
        stdout=subprocess.PIPE,
        stdin=subprocess.PIPE,
        stderr=subprocess.STDOUT
    )
    # The pipes carry bytes, so pass the body as a bytes literal.
    p.communicate(b'This is the message body.')

## Hook up to the shell's standard handles ##

    import subprocess
    p = subprocess.Popen(
        ['bc'],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        stdin=subprocess.PIPE)
    p.stdin.write(b'2 + 2\n')
    stdout, stderr = p.communicate()
    print(stdout.decode())

Note that communicate() sends EOF and waits for the subprocess to exit, so we can only call it once. We could make repeated calls to p.stdin.write() and p.stdout.read(), but we'd have to figure out how long to wait for the subprocess to finish each time and avoid deadlocks. If we want to interact with a subprocess repeatedly, try the pexpect module. Essentially, there's no good, built-in equivalent of Ruby's open3.
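
As a rough sketch of the one-shot `communicate()` pattern described above, the following wraps it in a small helper. To stay runnable on machines without `bc`, the child here is a throwaway Python one-liner that upper-cases its input; `CHILD_CODE` and `run_once()` are names made up for this illustration, not part of any library, and Python 3 is assumed.

```python
import subprocess
import sys

# Hypothetical child program (an assumption for this sketch):
# it reads lines from stdin and echoes them back upper-cased.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)


def run_once(text):
    """Send text to the child's stdin and collect its exit code and output."""
    p = subprocess.Popen(
        [sys.executable, '-c', CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() writes the input, sends EOF, and waits for the child
    # to exit. The pipes carry bytes here, so encode on the way in and
    # decode on the way out.
    stdout, stderr = p.communicate(text.encode('utf-8'))
    return p.returncode, stdout.decode('utf-8'), stderr.decode('utf-8')


if __name__ == '__main__':
    code, out, err = run_once('hello, child\n')
    print(code, out, end='')
```

Because the helper creates a fresh process per call, we sidestep the "only call communicate() once" restriction at the cost of losing any state the child would have kept between requests.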
## Spawn a new process, and don't wait for it to exit ##

    import subprocess
    # Popen returns immediately; p.pid holds the child's process ID.
    p = subprocess.Popen(['tar', 'xf', 'foo.tar'],
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         stdin=subprocess.PIPE)

## Fork a Copy and Communicate via Unix Sockets ##

    import socket
    import os

    parent, child = socket.socketpair()
    pid = os.fork()

    if pid:
        print('Parent sending message...')
        child.close()
        parent.sendall(b'ping')  # sockets carry bytes
        response = parent.recv(1024)
        print('response from child:', response)
        parent.close()
    else:
        print('Child waiting for message...')
        parent.close()
        message = child.recv(1024)
        print('message from parent:', message)
        child.sendall(b'pong')
        child.close()

## Read from and Write to a Unix Named Pipe (FIFO) ##

Writer:

    import os
    path = '/tmp/myfifo'
    os.mkfifo(path)
    with open(path, 'w') as f:
        f.write("Hello, fifo!\n")

Reader:

    with open('/tmp/myfifo', 'r') as f:
        for line in f:
            print('Read: ' + line)

## Web serve the current directory ##

From the command line:

    % python3 -m http.server 8888

Or:

    # Python 3 module names (Python 2: SimpleHTTPServer, SocketServer)
    import http.server
    import socketserver
    PORT = 8888
    Handler = http.server.SimpleHTTPRequestHandler
    socketserver.TCPServer(('', PORT), Handler).serve_forever()

## Web serve a simple message ##

    # Python 3 module name (Python 2: BaseHTTPServer)
    import http.server
    PORT = 8888
    class MyHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header('Content-type', 'text/html')
            self.end_headers()
            self.wfile.write(b'Hello, web!')  # the response body is bytes
            return
    http.server.HTTPServer(('', PORT), MyHandler).serve_forever()

## Fetch URL (GET) ##

    # Python 3 locations (Python 2: both lived in urllib)
    from urllib.request import urlopen
    from urllib.parse import urlencode
    url = 'http://httpbin.org/get'
    query = urlencode({ 'name0': 'value0', 'name1': 'value1' })
    response = urlopen(url + '?' + query).read()

## Fetch URL (POST) ##

    from urllib.request import urlopen
    from urllib.parse import urlencode
    url = 'http://httpbin.org/post'
    query = urlencode({ 'name0': 'value0', 'name1': 'value1' }).encode('ascii')
    response = urlopen(url, query).read()

## Die! ##

    import sys
    sys.exit('Oh, crap!')

Note that this raises SystemExit, which _kills_ the python interpreter unless something catches it. So if our script was run from another one (with execfile() in Python 2, or exec() in Python 3), we kill the parent script too unless its call catches the SystemExit exception.

## Regular Expressions ##

## Links ##