Top 10ish Python standard library modules

When interviewing Python programming candidates, my wife always likes to ask the simple question, “can you name ten Python standard library modules?” This is harder than most think, as many people will completely blank out and others will be dead wrong. “Requests?” one poor soul answered. It’s a good interview question, as it gives insight into what people are familiar with and may use regularly. So I sat down and thought about which ones I use and enjoy the most. Here are my top ten(ish) useful, favorite, and unordered standard library modules.

pathlib

Back in the dark days, you would have to store your path as a string, and call obscure functions under os.path to figure anything out about it. Pathlib removes the headache.

from pathlib import Path

my_path = Path('text_file.txt')
if not my_path.exists():
    my_path.write_text('File Content')
assert my_path.exists()
assert my_path.is_file()

Read more at the pathlib python docs.
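
A few more everyday operations, as a rough sketch: joining paths with the / operator, pulling a path apart, and finding files in a directory. The directory and file names are made up for illustration, and everything happens inside a throwaway temporary directory.

```python
from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as temp_dir:
    # The / operator joins path segments, no os.path.join needed
    notes = Path(temp_dir) / 'docs' / 'notes.txt'  # hypothetical names

    # Create the parent directory, then write and read text in one call each
    notes.parent.mkdir(parents=True, exist_ok=True)
    notes.write_text('File Content')
    content = notes.read_text()

    # Pieces of the path are plain attributes
    name, stem, suffix = notes.name, notes.stem, notes.suffix

    # glob yields matching paths; sorted makes the order predictable
    text_files = sorted(p.name for p in notes.parent.glob('*.txt'))

print(name, stem, suffix, text_files)
```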

tempfile

There are a boatload of uses for a temporary file or directory, which is why it’s in the standard library. More often than not, I find myself using both together inside context managers.

from pathlib import Path
from tempfile import NamedTemporaryFile, TemporaryDirectory


with TemporaryDirectory(prefix='Code_', suffix='_Calamity') as temp_dir:
    # A plain TemporaryFile is not guaranteed to have a visible path,
    # so use NamedTemporaryFile when the file name is needed
    with NamedTemporaryFile(dir=temp_dir) as temp_file:
        temp_file.write(b'Test')

        temp_file_path = Path(temp_file.name)
        assert temp_file_path.exists()

# Make sure file only exists within the context
assert not temp_file_path.exists()

I usually end up using this when a tool or library wants to work with a file rather than standard input, so short-lived files in a context manager make life a lot easier. Read more at the tempfile python docs.
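
Here is a minimal sketch of that "tool wants a file" situation. The count_lines function is a hypothetical stand-in for whatever tool or library insists on a path, and the file name is made up.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def count_lines(file_path):
    """Hypothetical stand-in for a tool that only accepts a file path."""
    with open(file_path) as f:
        return sum(1 for _ in f)


data = 'first\nsecond\nthird\n'

with TemporaryDirectory() as temp_dir:
    # Write the in-memory data to a short-lived file the tool can open
    input_path = Path(temp_dir) / 'input.txt'
    input_path.write_text(data)
    line_count = count_lines(input_path)

# The directory and everything in it are gone once the block exits
file_still_exists = input_path.exists()

print(line_count, file_still_exists)
```

Writing a regular file inside a TemporaryDirectory, rather than reopening a NamedTemporaryFile by name, sidesteps the Windows restriction on opening a temporary file a second time while it is still open.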

subprocess

Python is pretty amazing, but sometimes you do need to call other programs. Subprocess makes it easy to execute and interact with other executables across operating systems. Check out my other post on it!

from subprocess import run, PIPE
 
response = run("echo 'Join the Dark Side!'", shell=True, stdout=PIPE)

print(response.stdout.decode('utf-8'))
# 'Join the Dark Side!'
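
When the shell isn't needed, passing the command as a list is usually the safer habit. The sketch below assumes Python 3.7+ (for capture_output and text) and uses sys.executable, the current interpreter, as a command that should exist on any platform.

```python
import subprocess
import sys

# A list of arguments avoids shell quoting problems entirely
result = subprocess.run(
    [sys.executable, '-c', "print('Join the Dark Side!')"],
    capture_output=True,  # collect both stdout and stderr
    text=True,            # decode the output to str
    check=True,           # raise CalledProcessError on a non-zero exit
)

output = result.stdout.strip()
print(output)
```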

logging

This is probably the most useful built-in library for debugging, and I see it go unused or misused more than anything else.

import logging
import sys
 
logger = logging.getLogger(__name__)
my_stream = logging.StreamHandler(stream=sys.stdout)
my_stream.setLevel(logging.DEBUG)
my_stream.setFormatter(
    logging.Formatter("%(asctime)s - %(name)-12s  "
                      "%(levelname)-8s %(message)s"))
logger.addHandler(my_stream)
logger.setLevel(logging.DEBUG)
 
logger.info("We the people")

If you haven’t already, go on your first date with python logging! It’s also possible to put all the configuration details into a separate ini or json file. Learn more from the logging python docs.
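
As a rough sketch of the json route: logging.config.dictConfig accepts a dictionary, so a json file only needs to be loaded first. The config is inlined as a string here to keep the example self-contained, and the names 'simple', 'console' and 'my_app' are placeholders.

```python
import json
import logging
import logging.config

# In practice this JSON would live in its own file
config_text = '''
{
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
        "simple": {
            "format": "%(asctime)s - %(name)-12s  %(levelname)-8s %(message)s"
        }
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "simple",
            "stream": "ext://sys.stdout"
        }
    },
    "loggers": {
        "my_app": {"level": "DEBUG", "handlers": ["console"]}
    }
}
'''

logging.config.dictConfig(json.loads(config_text))

app_logger = logging.getLogger('my_app')
app_logger.info("We the people")
```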

threading and multiprocessing

Two very different things for widely different uses, but they have very similar interfaces and are easy to talk about at the same time. Quick and dirty difference: use threading for IO-heavy tasks (writing to files, reading websites, etc.) and multiprocessing for CPU-heavy tasks.

from multiprocessing.pool import ThreadPool, Pool


def square_it(x):
    return x * x


# On Windows, make sure that multiprocessing doesn't start
# until after "if __name__ == '__main__'"
if __name__ == '__main__':
    # Pool and ThreadPool are interchangeable in this example
    with Pool(processes=5) as pool:
        results = pool.map(square_it, [5, 4, 3, 2, 1])

    print(results)
    # [25, 16, 9, 4, 1]

I did a post on ThreadPools and Multiprocessing Pools, as I find them the easiest way to work with threading and multiprocessing in Python.
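
For the IO-heavy side, here is a small sketch with ThreadPool. The fake_download function is a made-up stand-in that just sleeps, the way a thread would while waiting on a network or disk.

```python
import time
from multiprocessing.pool import ThreadPool


def fake_download(name):
    """Stand-in for an IO-bound call such as fetching a web page."""
    time.sleep(0.2)  # the thread only waits, so others can run meanwhile
    return f'{name} done'


start = time.perf_counter()
with ThreadPool(processes=5) as pool:
    results = pool.map(fake_download, ['a', 'b', 'c', 'd', 'e'])
elapsed = time.perf_counter() - start

print(results)
print(f'{elapsed:.2f}s')
```

Because the five waits overlap, the elapsed time should land much closer to a single 0.2 second sleep than to the full second a plain loop would need.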

os and sys

The Python world would not exist if we didn’t have all the power and functionality these built-ins bring. You really haven’t coded in Python if you haven’t used these yet, so I won’t even bother elaborating on them here.

random and uuid

Maybe you’re making a game…

import random
random.choice(['Sneak Attack', 'High Kick', 'Low Kick'])

Or debugging a webserver…

import logging
from uuid import uuid4

logger = logging.getLogger(__name__)


# Bad example, but not writing out a whole webserver to prove a point
def get(*args):
    request = uuid4()
    logger.info(f'Request {request} called with args {args}')

Or turning your webserver into your own type of game…

import random
import time
if user_name == 'My Boss':
    time.sleep(random.randint(1, 5))

No matter which you are doing, it’s always handy to have randomly generated or unique numbers.
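
One more trick worth knowing, sketched below: a seeded random.Random instance replays the same sequence, which helps when testing code that relies on randomness. The seed value 42 is arbitrary.

```python
import random
import uuid

moves = ['Sneak Attack', 'High Kick', 'Low Kick']

# Two generators with the same seed produce the same choices
rng_a = random.Random(42)
rng_b = random.Random(42)
first_run = [rng_a.choice(moves) for _ in range(5)]
second_run = [rng_b.choice(moves) for _ in range(5)]

# uuid4 builds a random version 4 UUID; its string form is 36 characters
request_id = uuid.uuid4()

print(first_run == second_run)
print(request_id.version, len(str(request_id)))
```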

socket

The internet runs because of sockets. Modern technology exists because of sockets. Sockets are life, sockets are… annoyingly low-level at times, but it’s good to know the basics so you can appreciate everything written on top of them.

Thankfully the Python docs have good examples of them in use.
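
To give a taste of those basics, here is a minimal echo round trip on localhost. It is only a sketch: it assumes Python 3.8+ for socket.create_server, handles exactly one connection, and skips the error handling real code would need.

```python
import socket
import threading

# Port 0 asks the operating system for any free port
server = socket.create_server(('127.0.0.1', 0))
host, port = server.getsockname()


def echo_once():
    """Accept a single connection and send back whatever arrives."""
    connection, _address = server.accept()
    with connection:
        data = connection.recv(1024)
        connection.sendall(data)


server_thread = threading.Thread(target=echo_once)
server_thread.start()

# The client connects, sends a message and reads the echo
with socket.create_connection((host, port), timeout=5) as client:
    client.sendall(b'Sockets are life')
    reply = client.recv(1024)

server_thread.join()
server.close()

print(reply)
```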

hashlib

Need to check a file’s integrity? Hashlib is there with md5 and sha hashes! I created a reusable function that I can easily reference whenever I need to do it.
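
For readers without that function handy, a file hash helper could look roughly like the sketch below. This is an illustration rather than the function mentioned above; the name file_hash, the default algorithm and the chunk size are all assumptions.

```python
import hashlib
from pathlib import Path
from tempfile import TemporaryDirectory


def file_hash(file_path, algorithm='sha256', chunk_size=65536):
    """Hash a file in chunks so large files never sit fully in memory."""
    hasher = hashlib.new(algorithm)
    with open(file_path, 'rb') as f:
        # iter() keeps calling f.read until it returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b''):
            hasher.update(chunk)
    return hasher.hexdigest()


with TemporaryDirectory() as temp_dir:
    sample = Path(temp_dir) / 'sample.bin'
    sample.write_bytes(b'File Content')
    sha256_digest = file_hash(sample)
    sha1_digest = file_hash(sample, algorithm='sha1')

print(sha256_digest)
```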

Need to securely store people’s password hashes for a website? Hashlib now has scrypt support! Heck, here is my own function I always use to generate the scrypt hashes.

from collections import namedtuple
import hashlib
import os


Hashed = namedtuple('Hashed', ['hash', 'salt', 'n', 'r', 'p', 'key_length'])


def secure_hash(value: bytes, salt: bytes = None, key_length: int = 128,
                n: int = 2 ** 16, r: int = 8, p: int = 1):
    # Allow about twice scrypt's ~128 * r * (n + p) byte working set,
    # so the memory check passes no matter what key_length is
    maxmem = 256 * r * (n + p)
    salt = salt or os.urandom(16)
    hashed = hashlib.scrypt(value, salt=salt, n=n, r=r, p=p,
                            maxmem=maxmem, dklen=key_length)
    return Hashed(hash=hashed.hex(), salt=salt.hex(), n=n, r=r, p=p,
                  key_length=key_length)
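
Storing a hash is only half the job, since a login attempt has to be checked against it later. The sketch below is one way that check might look. It is self-contained rather than built on the function above, the helper names are made up, and n is deliberately kept small so the demo runs quickly.

```python
import hashlib
import hmac
import os

# Assumed parameters for this sketch; n is kept small so the demo runs fast
N, R, P, KEY_LENGTH = 2 ** 14, 8, 1, 64


def scrypt_hex(password: bytes, salt: bytes) -> str:
    """Derive a hex-encoded scrypt hash using the fixed parameters above."""
    derived = hashlib.scrypt(password, salt=salt, n=N, r=R, p=P,
                             maxmem=256 * R * (N + P), dklen=KEY_LENGTH)
    return derived.hex()


def verify_password(attempt: bytes, stored_hash: str, stored_salt: str) -> bool:
    """Re-hash the attempt with the stored salt and compare the results."""
    candidate = scrypt_hex(attempt, bytes.fromhex(stored_salt))
    # compare_digest is designed to avoid leaking timing information
    return hmac.compare_digest(candidate, stored_hash)


# "Sign up": store the hash and salt, never the password itself
salt = os.urandom(16)
stored_hash = scrypt_hex(b'hunter2', salt)
stored_salt = salt.hex()

good_login = verify_password(b'hunter2', stored_hash, stored_salt)
bad_login = verify_password(b'hunter3', stored_hash, stored_salt)

print(good_login, bad_login)
```

In real use the n, r, p and key length values need to be stored alongside the hash and salt, so that old hashes can still be verified after the defaults change.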

venv

You probably only think of it as a command when you run python -m venv python_virtual_env to create your environments, but it’s run that way because it’s a standard library module. Every new project you start or Python program you install should be using this module, so it is used a lot!
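
Since it is a regular module, it can also be called from code. A minimal sketch, creating the environment inside a temporary directory so nothing is left behind:

```python
import venv
from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as temp_dir:
    env_dir = Path(temp_dir) / 'python_virtual_env'

    # Roughly `python -m venv python_virtual_env --without-pip`;
    # with_pip=False skips bootstrapping pip, which keeps this quick
    venv.create(env_dir, with_pip=False)

    # Every virtual environment gets a pyvenv.cfg file at its root
    config_exists = (env_dir / 'pyvenv.cfg').is_file()

print(config_exists)
```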

Summary

There ya go, ten or so can’t-live-without standard library modules! Isn’t it so nice that Python comes “batteries included”?