Preventing Directory Traversal in Python

Consider the following use case:

import os

PREFIX = '/home/user/files/'
full_path = os.path.join(PREFIX, filepath)
f = open(full_path, 'rb')
...

Assuming that filepath is user-controlled, a malicious user might attempt a directory traversal (like setting filepath to ../../../etc/passwd). How can we make sure that filepath cannot traverse “above” our prefix? There are, of course, numerous solutions to sanitizing input against directory traversal. The easiest way (that I came up with) to do so in Python is:

filepath = os.path.normpath('/' + filepath).lstrip('/')

It works because it turns the path into an absolute path, normalizes it, and makes it relative again. As one cannot traverse above /, it effectively ensures that filepath cannot go outside of PREFIX.
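As a minimal sketch of that idea, the one-liner can be wrapped in a small helper. The name safe_join is hypothetical and not part of any library. posixpath is used instead of os.path so the behaviour does not depend on the host OS (on Linux and macOS, os.path is the same module).

```python
import posixpath

PREFIX = '/home/user/files/'


def safe_join(prefix, filepath):
    """Join a user-supplied path onto prefix without escaping it (hypothetical helper)."""
    # Anchor at '/', collapse any '..' segments, then make the path relative again.
    cleaned = posixpath.normpath('/' + filepath).lstrip('/')
    return posixpath.join(prefix, cleaned)


print(safe_join(PREFIX, '../../../etc/passwd'))
print(safe_join(PREFIX, 'docs/../report.txt'))
```

Note that this check is purely lexical: normpath does not touch the filesystem, so symbolic links that already exist inside PREFIX are not resolved.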

Post updated: See the comments below for an explanation of the changes.

URL-Safe Timestamps Using Base64

Passing around timestamps in URLs is a common task. We usually want our URLs to be as short as possible. I’ve found Base64 to result in the shortest URL-safe representation: just 6 chars. This compares with the 12 chars of the naive way, and 8 chars when using a hex representation.

The following Python functions allow you to build and read these 6-char URL-safe timestamps:
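The original functions are not reproduced in this excerpt. One way such a pair might look, assuming the timestamp is a whole number of seconds that fits in an unsigned 32-bit integer, is sketched here; encode_timestamp and decode_timestamp are hypothetical names.

```python
import base64
import struct
import time


def encode_timestamp(ts=None):
    """Return a 6-char URL-safe token for a Unix timestamp (hypothetical helper)."""
    if ts is None:
        ts = int(time.time())
    raw = struct.pack('>I', ts)  # assumes 0 <= ts < 2**32
    # 4 bytes encode to 8 Base64 chars, the last two being '=' padding; drop them.
    return base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')


def decode_timestamp(token):
    """Inverse of encode_timestamp: restore the padding and unpack the integer."""
    raw = base64.urlsafe_b64decode(token + '==')
    return struct.unpack('>I', raw)[0]


token = encode_timestamp(1700000000)
print(token, decode_timestamp(token))
```

The URL-safe Base64 alphabet replaces + and / with - and _, so the token can be placed in a URL without percent-encoding.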