learning

First steps into GUI design with Kivy

Boring Background

I have always had in interest in making a local photo organization tool, that supports albums and tagging the way I want to do it. Maybe down the line allowing others to connect and use it too. So I started on one a few years ago, using what I knew best, Python REST back-end with Angular JS web based front end.

And it worked decently well.

However I was never happy with it’s performance past 10K images (and I am dealing with hundreds of thousands) and I tried halfway through development to switch to Angular 2. And everything blew up.

Now that I am older and wiser less stupid, I can face the fact that local data is best served with a desktop native application. It also removes the need for trying to keep up to date with ever changing web standards and libraries, in exchange for ones that might last through an entire development process (crazy thought, I know.)

Choosing Kivy

To be honest, I didn’t even think about using Kivy for a desktop application at first. I went straight to PySide, as it’s less bad licensing compared to PyQT. However, at the time, it didn’t want to play ball with a halfway modern python version, 3.5, the oldest I will create new content on, 3.7-dev is preferred or 3.6 for stable. I then looked into alternatives, such as wx and Kivy. Lets just say a quick view at both of their home pages tells you plenty about who you trust more with front end design off the bat.

Even thought Kivy seems to be aimed at more mobile development, including touch capabilities, I have found it to be extremely capable as a desktop application. It is also very appealing to me as it is no BS, MIT licensed.

First Steps

Just like any good python GUI, Kivy was a pain in the arse to install. On Windows, I just ended up grabbing the pre-compiled binaries from a well-loved unofficial binaries page. On my Linux VirtualBox, ended up having to disable 3D acceleration for it to start.

However, once I was able to start coding, it became stupid simple to be able to get content on the screen fast, and manipulate them. After about four hours of coding in one night, I had this:

An image viewer app that could display any directory of images. It included a preview bar that you could select past or upcoming images from, and control with either mouse clicks or keyboard presses. The actual code for it can be found here (just be aware it isn’t ready for the lime light yet, plenty of fixes to go.)

Application Layout

The application is composed of six widget classes overall. Here is a simple visual diagram of what they look like, and the Kivy widget classes they are using. (The padding is for visual ease only, not part of the program or to scale).

First Lessons learned

The two biggest issues I ran into were proper widget alignment, and the fact there is no built-in ‘on_hover’ like I am used to with CSS. I wrote up a quick class to do the latter, and including it here as it should be a drop-in for other’s kivy projects.

Kivy Mouse Over Code

from kivy.factory import Factory
from kivy.properties import BooleanProperty, ObjectProperty
from kivy.core.window import Window
from kivy.uix.widget import Widget


class MouseOver(Widget):

    def __init__(self, **kwargs):
        Window.bind(mouse_pos=self._mouse_move)
        self.hovering = BooleanProperty(False)
        self.poi = ObjectProperty(None)
        self.register_event_type('on_hover')
        self.register_event_type('on_exit')
        super(MouseOver, self).__init__(**kwargs)

    def _mouse_move(self, *args):
        if not self.get_root_window():
            return
        is_collide = self.collide_point(*self.to_widget(*args[1]))
        if self.hovering == is_collide:
            return
        self.poi = args[1]
        self.hovering = is_collide
        self.dispatch('on_hover' if is_collide else 'on_exit')

    def on_hover(self):
        """Mouse over"""

    def on_exit(self):
        """Mouse leaves"""


Factory.register('MouseOver', MouseOver)

Then you simply have to subclass it as well as the original widget class you want your object to be, and override the on_hover and on_exit methods.

class Selector(Button, MouseOver):
    """ Base class for Prev and Next Image Buttons"""

    def on_hover(self):
        self.opacity = .8

    def on_exit(self):
        self.opacity = 0

Widget alignment

Proper widget alignment was fun to figure out, because Kivy has both hard set properties width and height for objects, as well as a more lose size_hint feature. By default, everything has a size_hint of (1, 1)  , think of this as a percentage of the screen,  so (1,1) == (100% height, 100% width).

Simple right? Well, now add two of those objects in base layout. Some layouts, like BoxLayout and GridLayout will then make each of them 50% of the screen. Others, like FloatLayout will let each of them take up the entire layer, so one will be hidden. Even more fun, is to disable it, you manually have to set size_hint to (None, None).

Then on top of that, you can start mix and matching, hinting and absolutes with different elements on the screen. I chose to do that in this case, because I always wanted the preview bar of images at the bottom to be only 100px tall. But I wanted the large image to expand to the full height.  But if you set the bar to 100px tall, and leave the image (1,1), with just raising it’s starting position 100px from the bottom, it now is 100px off the top of the page. So now on top of the size_hint you need to add an on_size method, that will reduce the height of the image object every time the window is resized, like so:

    def on_size(self, obj, size):
        """Make sure all children sizes adjust properly"""
        self.image.height = self.height - 100
        self.next_button.pos = (self.width - 99, self.next_button.hpos)

You can notice we also have make sure that the right hand button always looks attached to the right hand of the screen. (Which I later learned could probably be accomplished with the RelitiveLayout, but am still unsure how that would work compared to everything else I already have with my FloatLayout.)

So the takeaway for me was, even without CSS, you’re going to have fun with alignment issues.

Next Steps

So I have am image viewer, great! But you want to organize images into more manageable groups. I want to create a folder / album view to be able to select at a higher level what you want to view.

After that I am sure I will want to create a database system to store tags, album info and thumbnails (for faster preview loading).

Creating a successful project – Part 2: Project Organization

Ok, so you’ve familiarized yourself with the rules.  Good.  This isn’t Fight Club, but they are important rules. You want to be a professional in all the projects on which you want to participate.  If properly managed, your GitHub account should be a functional part of your ever-evolving resume.

Code organization:

So, code itself really isn’t hard to organize.  But you’d be surprised how many people just don’t get it.  Since I usually develop in python, here is my standard project directory structure:

./new_project.
 ├── CONTRIBUTING.md
 ├── doc
 ├── LICENSE.md
 ├── README.md
 ├── scripts
 ├── setup.py
 ├── src
 └── tests

If you code in a different language which has different structural requirements, make changes.  This is how I do it.

First, the markdown files:

I’m not going to worry over the preference for ReStructuredText, Markdown, LaTeX, or basic text for these files.  Feel free to do your own research and come to your own conclusions.

CONTRIBUTING.md

This document is optional for simple projects.  It is possible to have this as part of the README.  In this document, you should spell out how people can contribute to your project.  Should they fork and submit pull requests?  How should they report issues and/or request new features?  These are important questions.  Not just for potential compatriots, but for you.

LICENSE.md

This should have the text of the license under which you are placing the project.  You can cut and paste this from any number of reputable licensing sources.  I recommend reading about the different licenses here. I will also be going over the primary players in a future post.

README.md

I feel that the project README is non-optional.  This single document should have key information about the project.  It should (ideally) address the following:

  1. What is the purpose of this project?  Think of it as the two-to-three sentence description of the project.  What problem were you trying to solve?
  2. Where to find more documentation.
  3. Installation instructions
  4. Simple configuration settings – if appropriate.
  5. Possibly simple use-cases and examples.

Optional Docs:

DONATIONS.md/SPONSOR.md – Where/how to send any monetary donations.  A good example of what might be found in a donations/sponsorship doc can be found here.

THANKS.md – If you want to thank anyone for help or encouragement, this would be alright to add.  Completely optional.

Now the directories:

doc

This is where your project documentation should reside.  If you do this correctly, you can easily get them posted to readthedocs.org.

scripts

This is optional for projects which have no need for command-line scripts.  But, if your project has a command-line tool associated with it, you should really consider having them in this directory.

src

This is where your code should reside.

tests

This is where the tests for your code should be found.  Having tests is a big indicator of the health and maturity of the project.  The tests here should show that you take the writing and maintaining of the code very seriously.

Optional Directories:

examples – Some people like to house some usage examples here.  If appropriate, feel free.

 

 

 

Creating a successful project – Part 1: The Rules

So, you want to create a project.  That’s great!  Welcome to the party.  I’d like to pass along some general knowledge about how best to approach creating/running a project.  Also, I’m not going to say that I know everything there is to know about running a successful project.  I have run a few projects in my time, some successful, some less so;  I also have some hard-fought lessons learned that I am going to share.  Perhaps you’ll use some of this knowledge to improve your projects.

One side note on what I am teaching/advocating here:  I am not telling you some secret to making your project the next docker or requests.  I’m showing you best practices for success.  These ideas and concepts are good for any size or purpose of a software project.  They have been successfully used on single-person projects all the way up to projects with 70+ contributors.  Good ideas and buy-in from your team are critical to success, but they are never going to be enough to replace plain old elbow grease to write the code, docs, and tests for a project.  That stuff takes discipline and time.

General Project Structure Rules

  1. If you want you (and your project) to be taken seriously?  Be organized.
  2. Tests are non-optional.
  3. Documentation is also non-optional.
  4. Documentation should read like it was written by someone with knowledge who cares about informing a newcomer to the project.
  5. You should understand contributor roles if any.
  6. You MUST understand the licenses that are in regular use and decide which license to use.  This is more about long-term CYA than anything else.

Well, there’s the first entry in this series.  I hope people find it interesting.

 

Reusables – Part 1: Overview and File Management

Reusables 0.8 has just been released, and it’s about time I give it a proper introduction.

I started this project three years ago, with a simple goal of keeping code that I inevitably end up reusing grouped into a single library. It’s for the stuff that’s too small to do well as it’s own library, but common enough it’s handy to reuse rather than rewrite each time.

It is designed to make the developer’s life easier in a number of ways. First, it requires no external modules, it’s possible to supplement some functionality with the modules specified in the requreiments.txt file, but are only required for specific use cases; for example: rarfile is only used to extract, you guessed it, rar files.

Second, everything is tested on both Python 2.6+ and Python 3.3+, also tested on pypy. It is cross platform compatible Windows/Linux, unless a specific function or class specifies otherwise.

Third, everything is documented via docstrings, so they are available at readthedocs, or through the built-in help() command in python.

Lastly, all functions and classes are all available at the root level (except CLI helpers), and can be broadly categorized as follows:

  • File Management
    • Functions that deal with file system operations.
  • Logging
    • Functions to help setup and modify logging capabilities.
  • Multiprocessing
    • Fast and dynamic multiprocessing or threading tools.
  • Web
    • Things related to dealing with networking, urls, downloading, serving, etc.
  • Wrappers
    • Function wrappers.
  • Namespace
    • Custom class to expand the usability of python dictionaries as objects.
  • DateTime
    • Custom datetime class primarily for easier formatting.
  • Browser Cookie Management
    • Find, extract or modify cookies of Firefox and Chrome on a system.
  • Command Line Helpers
    • Bash analogues to help system admins perform faster operations from inside an interactive python shell.

In this overview, we will cover:

  1. Installation
  2. Getting Started
  3. File, Folder and String Management
    1. Find Files Fast
    2. Archives (Extraction and Compression)
    3. Run Command
    4. File Hashing
    5. Finding Duplicate Files
    6. Safe File and Folder Names
    7. Touch (ing a file)
    8. Simple JSON and CSV
    9. Cut (ing a string into equal lengths)
    10. Config to dictionary

Installation

Very straightforward install, just do a simple pip or easy_install from PyPI.

pip install reusables

OR

easy_install reusables

If you need to install it on an offline computer, grab the appropriate Python 2.x or 3.x wheel from PyPI, and just pip install it directly.

There are no additional modules required for install, so if either of those don’t work, please open an issue at github.

Getting Started

import reusables 

reusables.add_stream_handler('reusables', level=10)

The logger’s name is ‘reusables’, and by default does not have any handlers associated with it. For these examples we will have logging on debug level, if you aren’t familiar with logging, please read my post about logging.

File, Folder and String Management

Everything here deals with managing something on the disk, or strings that relate to files. From checking for safe filenames to saving data files.

I’m going to start the show off with my most reused function, that is also one of the most versatile and powerful, find_files. It is basically an advanced implementation of os.walk.

Find Files Fast

reusables.find_files_list("F:\\Pictures",
                              ext=reusables.exts.pictures, 
                              name="sam", depth=3)

# ['F:\\Pictures\\Family\\SAM.JPG', 
# 'F:\\Pictures\\Family\\Family pictures - assorted\\Sam in 2009.jpg']

With a single line, we are able to search a directory for files by a case insensitive name, a list (or single string) of extensions and even specify a depth.  It’s also really fast, taking under five seconds to search through 70,000 files and 30,000 folders, taking just half a second longer than using the windows built in equivalent dir /s *sam* | findstr /i "\.jpg \.png \.jpeg \.gif \.bmp \.tif \.tiff \.ico \.mng \.tga \.xcf \.svg".

If you don’t need it as a list, use the generator itself.

for pic in reusables.find_files("F:\\Pictures", name="*chris *.jpg"):
    print(pic)

# F:\Pictures\Family\Family pictures - assorted\Chris 1st grade.jpg
# F:\Pictures\Family\Family pictures - assorted\Chris 6th grade.jpg
# F:\Pictures\Family\Family pictures - assorted\Chris at 3.jpg

That’s right, it also supports glob wildcards. It even supports using the external module scandir for older versions of Python that don’t have it nativity (only if enable_scandir=True is specified of course, its one of those supplemental modules). Check out the full documentation and more examples at readthedocs.

Archives

Dealing with the idiosyncrasies between the compression libraries provided by Python can be a real pain. I set out to make a super simple and straight forward way to archive and extract folders.

reusables.archive(['reusables',    # Folder with files 
                   'tests',        # Folder with subfolders
                   'AUTHORS.rst'], # Standalone file
                   name="my_archive.bz2")

# 'C:\Users\Me\Reusables\my_archive.bz2'

It will compress everything, store it, and keep folder structure in the archives.

To extract files, it is very similar behavior. Given a ‘wallpapers.zip’ file like this:

It is trivial to extract it to a location without having to specify it’s archive type.

reusables.extract("wallpapers.zip",
                  path="C:\\Users\\Me\\Desktop\\New Folder 6\\")
# ... DEBUG File wallpapers.zip detected as a zip file
# ... DEBUG Extracting files to C:\Users\Me\Desktop\New Folder 6\
# 'C:\\Users\\Me\\Desktop\\New Folder 6'

We can see that it extracted everything and again kept it’s folder structure.

The only support difference between the two is that you can extract rar files if you have installed rarfile and dependencies (and specified enable_rar=True), but cannot archive them due to licensing.

Run Command

Ok, so it many not always deal with the file system, but it’s better here than anywhere else. As you may or may not know, in Python 3.5 they released the excellent subprocess.run which is a convenient wrapper around Popen that returns a clean CompletedProcess class instance. reusables.run is designed to be a version agnostic clone, and will even directly run subprocess.run on Python 3.5 and higher.

reusables.run("cat setup.cfg", shell=True)

# CompletedProcess(args='cat setup.cfg', returncode=0, 
#                 stdout=b'[metadata]\ndescription-file = README.rst')

It does have a few subtle differences that I want to highlight:

  • By default, sets stdout and stderr to subprocess.PIPE, that way the result is always is in the returned CompletedProcess instance.
  • Has an additional copy_local_env argument, which will copy your current shell environment to the subprocess if True.
  • Timeout is accepted, buy will raise a NotImplimentedError if set on Python 2.x.
  • It doesn’t take positional Popen arguments, only keyword args (2.6 limitation).
  • It returns the same output as Popen, so on Python 2.x stdout and stderr are strings, and on 3.x they are bytes.

Here you can see an example of copy_local_env  in action running on Python 2.6.

import os

os.environ['MYVAR'] = 'Butterfly'

reusables.run("echo $MYVAR", copy_local_env=True, shell=True)

# CompletedProcess(args='echo $MYVAR', returncode=0, 
#                 stdout='Butterfly\n')

File Hashing

Python already has nice hashing capabilities through hashlib, but it’s a pain to rewrite the custom code for being able to handle large files without a large memory impact.  Consisting of opening a file and iterating over it in chunks and updating the hash. Instead, here is a convenient function.

reusables.file_hash("reusables\\reusables.py", hash_type="sha")

# '50c5425f9780d5adb60a137528b916011ed09b06'

By default it returns an md5 hash, but can be set to anything available on that system, and returns it in the hexdigest format, if the kwargs hex_digest is set to false, it will be returned as bytes.

reusables.file_hash("reusables\\reusables.py", hex_digest=False)

# b'4\xe6\x03zPs\xf5\xe9\x8dX\x9c/=/<\x94'

Starting with python 2.7.9, you can quickly view the available hashes directly from hashlib via hashlib.algorithms_available.

# CPython 3.6
import hashlib

print(f"{hashlib.algorithms_available}")
# {'sha3_256', 'MD4', 'sha512', 'sha3_512', 'DSA-SHA', 'md4', ...

reusables.file_hash("wallpapers.zip", "sha3_256")

# 'b7c357d582f8932977d785a24f728b267cef1de87537076aadac5049f4e4fa70'

Duplicate Files

You know you’ve seen this picture  before, you shouldn’t have to safe it again, where did that sucker go? Wonder no more, find it!

list(reusables.dup_finder("F:\\Pictures\\20131005_212718.jpg", 
                          directory="F:\\Pictures"))

# ['F:\\Pictures\\20131005_212718.jpg',
#  'F:\\Pictures\\Me\\20131005_212718.jpg',
#  'F:\\Pictures\\Personal Favorite\\20131005_212718.jpg']

dup_finder is a generator that will search for a given file at a directory, and all sub-directories. This is a very fast function, as it does a three step escalation to detect duplicates, if a step does not match, it will not continue with the other checks, they are verified in this order:

  1. File size
  2. First twenty bytes
  3. Full SHA256 compare

That is excellent for finding a single file, but how about all duplicates in a directory? The traditional option is to create a dictionary of hashes of all the files to compares against. It works, but is slow. Reusables has directory_duplicates function, which first does a file size comparison first, and only moves onto hash comparisons if the size matches.

reusables.directory_duplicates(".")

# [['.\\.git\\refs\\heads\\master', '.\\.git\\refs\\tags\\0.5.2'], 
#  ['.\\test\\empty', '.\\test\\fake_dir']]

It returns a list of lists, each internal list is a group of matching files.  (To be clear “empty” and “fake_dir” are both empty files used for testing.)

Just how much faster is it this way? Here’s a benchmark on my system of searching through over sixty-six thousand (66,000)  files in thirty thousand (30,000) directories.

The comparison code (the Reusables duplicate finder is refereed to as ‘size map’)

import reusables

@reusables.time_it(message="hash map took {seconds:.2f} seconds")
def hash_map(directory):
    hashes = {}
    for file in reusables.find_files(directory):
        file_hash = reusables.file_hash(file)
        hashes.setdefault(file_hash, []).append(file)

    return [v for v in hashes.values() if len(v) > 1]


@reusables.time_it(message="size map took {seconds:.2f} seconds")
def size_map(directory):
    return reusables.directory_duplicates(directory)


if __name__ == '__main__':
    directory = "F:\\Pictures"

    size_map_run = size_map(directory)
    print(f"size map returned {len(size_map_run)} duplicates")

    hash_map_run = hash_map(directory)
    print(f"hash map returned {len(hash_map_run)} duplicates")

The speed up of checking size first in our scenario is significant, over 16 times faster.

size map took 40.23 seconds
size map returned 3511 duplicates

hash map took 642.68 seconds
hash map returned 3511 duplicates

It jumps from under a minute for using reusables.directory_duplicates to over ten minutes when using a traditional hash map. This is the fastest pure Python method I have found, if you find faster, let me know!

Safe File Names

There are plenty of instances that you want to save a meaningful filename supplied by a user, say for a file transfer program or web upload service, but what if they are trying to crash your system?

Reusables has three functions to help you out.

  • check_filename: returns true if safe to use, else false
  • safe_filename: returns a pruned filename
  • safe_path: returns a safe path

These are designed not off of all legally allowed characters per system, but a restricted set of letters, numbers, spaces, hyphens, underscores and periods.

reusables.check_filename("safeFile?.text")
# False

reusables.safe_filename("safeFile?.txt")
# 'safeFile_.txt'

reusables.safe_path("C:\\test'\\%my_file%\\;'1 OR 1\\filename.txt")
# 'C:\\test_\\_my_file_\\__1 OR 1\\filename.txt'

Touch

Designed to be same as Linux touch command. It will create the file if it does not exist, and updates the access and modified times to now.

time.time()
# 1484450442.2250443

reusables.touch("new_file")

os.path.getmtime("new_file")
# 1484450443.804158

Simple JSON and CSV save and restore

These are already super simple to implement in pure python with the standard library, and are just here for convince of not having to remember conventions.

List of lists to CSV file and back

my_list = [["Name", "Location"],
           ["Chris", "South Pole"],
           ["Harry", "Depth of Winter"],
           ["Bob", "Skull"]]

reusables.list_to_csv(my_list, "example.csv")

# example.csv
#
# "Name","Location"
# "Chris","South Pole"
# "Harry","Depth of Winter"
# "Bob","Skull"


reusables.csv_to_list("example.csv")

# [['Name', 'Location'], ['Chris', 'South Pole'], ['Harry', 'Depth of Winter'], ['Bob', 'Skull']]

Save JSON with default indent of 4

my_dict = {"key_1": "val_1",
           "key_for_dict": {"sub_dict_key": 8}}

reusables.save_json(my_dict,"example.json")

# example.json
# 
# {
#     "key_1": "val_1",
#     "key_for_dict": {
#         "sub_dict_key": 8
#     }
# }

reusables.load_json("example.json")

# {'key_1': 'val_1', 'key_for_dict': {'sub_dict_key': 8}}

Cut a string into equal lengths

Ok, I admit, this one has absolutely nothing to do with the file system, but it’s just to handy to not mention right now (and doesn’t really fit anywhere else). One of the features I was most surprised to not be included in the standard library was to a have a function that could cut strings into even sections.

I haven’t seen any PEPs about it either way, but I wouldn’t be surprised if one of the reasons is ‘why do to with leftover characters?’. Instead of forcing you to stick with one, Reusables has four different ways it can behave for your requirement.

By default, it will simply cut everything into even segments, and not worry if the last one has matching length.

reusables.cut("abcdefghi")
# ['ab', 'cd', 'ef', 'gh', 'i']

The other options are to remove it entirely, combine it into the previous grouping (still uneven but now last item is longer than rest instead of shorter) or raise an IndexError exception.

reusables.cut("abcdefghi", 2, "remove")
# ['ab', 'cd', 'ef', 'gh']

reusables.cut("abcdefghi", 2, "combine")
# ['ab', 'cd', 'ef', 'ghi']

reusables.cut("abcdefghi", 2, "error")
# Traceback (most recent call last):
#     ...
# IndexError: String of length 9 not divisible by 2 to splice

Config to Dictionary

Everybody and their co-worker has written a ‘better’ config file handler of some sort, this isn’t trying to add to that pile, I swear. This is simply a very quick converter using the built in parser directly to dictionary format, or to a python object  I call a Namespace (more on that in future post.)

Just to make clear, this only reads configs, not writes any changes. So given an example config.ini file:

[General]
example=A regular string

[Section 2]
my_bool=yes
anint=234
exampleList=234,123,234,543
floatly=4.4

It reads it as is into a dictionary. Notice there is no automatic parsing or anything fancy going on at all.

reusables.config_dict("config.ini")
# {'General': {'example': 'A regular string'},
#  'Section 2': {'anint': '234',
#                'examplelist': '234,123,234,543',
#                'floatly': '4.4',
#                'my_bool': 'yes'}}

You can also take it into a ConfigNamespace.

config = reusables.config_namespace("config.ini")
# <ConfigNamespace: {'General': {'example': 'A regular string'}, 'Section 2': ...

Namespaces are special dictionaries that allow for dot notation, similar to Bunch but recursively convert dictionaries into Namespaces.

config.General
# <ConfigNamespace: {'example': 'A regular string'}>

ConfigNamespace has handy built-in type specific retrieval.  Notice that dot notation will not work if item have spaces in them, but the regular dictionary key notation works as well.

config['Section 2'].bool("my_bool")
# True

config['Section 2'].bool("bool_that_doesn't_exist", default=False)
# False
# If no default specified, will raise AttributeError

config['Section 2'].float('floatly')
# 4.4

It supports booleans, floats, ints, and unlike the default config parser, lists. Which even accepts a modifier function.

config['Section 2'].list('examplelist', mod=int)
# [234, 123, 234, 543]

Finale

That’s all for this first overview,. hope you found something useful and will make your life easier!

Related links:

Exception! Exception! Read all about it!

Exceptions in Python aren’t new, and pretty easy to wrap your head around. Which means you should have no trouble following what this example will output.

import warnings

def bad_func():
    try:
        assert False and True or False
    except AssertionError as outer_err:
        try:
            assert None is False
        except AssertionError as inner_err:
            warnings.warn("None is not False", UserWarning)
            raise inner_err from None
        else:
            return None
    else:
        return True
    finally:
        return False


print(bad_func())

Simple, right?  It raises both except clauses, triggering the warning and so your screen will obviously print False and the  warning.

False
C:/python_file.py:15: UserWarning: None is not False
  warnings.warn("None is not False", UserWarning)

So if you followed that without issue and understand why its happening, go ahead and check out another post, because today we are diving into the depths of Python 3 exception handling.

The basic try except block

If you have coded any semi-resilient Python code, you have used at least the basic try except.

try:
    something()
except:
    pass

If something breaks in the try block, the code in the except block is execute instead of allowing the exception to be raised. However the example code above does something unforgivable, it captures everything. “But that’s what I want it to do!” No, you do not.

Catching everything includes both KeyboardInterrupt and SystemExit. Those Exceptions are outside the normal Exception scope. They are special, because they are cased from external sources. Everything else is caused by something in your code.

That means if you have the following code, it would only exit if it had a SIGKILL sent to it, instead of the regular, and more polite SIGTERM or a user pressing Ctrl+C .

while True:
    try:
        something()
    except:
        pass

Now it’s fine, and even recommend to catch KeyboardInterrupt and SystemExit as needed. A common use case is to do a ‘soft’ exit, where your program tries to finish up any working tasks before exiting. However, in any case you use it, it’s better to be explicit. There should almost always be a separate path for internal vs external errors.

try:
    something()
except Exception as err:
    print(err)
    raise  # A blank except rasies the last exception
except (SystemError, KeyboardInterrupt) as err:
    print('Shutting down safely, please wait until process exits')
    soft_exit()

This way it is clear to the user when either an internal error happened or if it was a user generated event. This is usually only used on the very external event loop or function of the program. For most of your code, the bare minimum you should use in a try except block is:

try:
    something()
except Exception as err:
    print(err)

That way you are capturing all internal exceptions, but none of the system level ones. Also very important, adding a message (in this simple example a print line) will help show what the error was. However, str(err) will only output the error message, not the error type or traceback. We will cover how to get the most of the exception next.

Error messages, custom exceptions and scope

So why does Exception catch everything but the external errors?

Scope

The Exception class is inherited by all the other internal errors. Whereas something like FileNotFoundError is at the lowest point on the totem pole of built-in exception hierarchy. That means it’s as specific as possible, and if you catch that error, anything else will still be raised. 

| Exception
+--| OSError
   +-- FileNotFoundError
   +-- PermissionError

So the further –> you go, the less encapsulation and more specific of error you get.

try:
    raise FileNotFoundError("These aren't the driods you're looking for")
except FileNotFoundError:
    print("It's locked, let's try the next one")
except OSError:
    print("We've got em")

# Output: It's locked, let's try the next one

It works as expected, the exception is caught, it’s message is not printed, rather the code in the except FileNotFoundError is executed. Now keep in mind that other exceptions at the same level will not catch errors on the same level.

try:
    raise FileNotFoundError("These aren't the driods you're looking for")
except PermissionError:
    print("It's locked, let's try the next one")
except OSError:
    print("We've got em")

# Output: We've got em

Only that specific error or it’s parent (or parent’s parent, etc..) will work to capture it. Also keep in mind, with try except blocks, order of operations matter, it won’t sort through all the possible exceptions, it will execute only the first matching block.

try:
    raise FileNotFoundError("These aren't the driods you're looking for")
except OSError:
    print("We've got em")
except FileNotFoundError as err:
    print(err)

# Output: We've got em

That means even though FileNotFoundError was a more precise match, it won’t be executed if an except block above it also matches.

In practice, it’s best to capture the known possible lowest level (highest precision) exceptions. That is because if you have a known common failure, there is generally a different path you want to take instead of complete failure. We will use that in this example, where we want to create a file, but don’t want to overwrite existing ones.

index = 0 
while True:
    index += 1
    try:
        create_a_file(f'file_{index}')
        break
    except FileExistsError:
        continue
    except Exception as err:
        print(f"Unexpected error occurred while creating a file: {err}")

Here you can see, if that file already exists, and create_a_file raises a FileExistsError the loop will increase the index, and try again for file_2 and that will continue until a file doesn’t exist and it break out of the loop.

Custom Exceptions

What if you have code that needs to raise an exception under certain conditions? Well, it’s good to use the built-ins only when exactly applicable,  otherwise it is better to create your own. And you should always start with creating that error off of Exception.

class CustomException(Exception):
    """Something went wrong in my program"""

raise CustomException("It's dead Jim")

All user-defined exceptions should also be derived from [Exception] -Python Standard Library

Don’t be fooled into using the more appropriate sounding BaseException, that is the parent of all exceptions, including SystemExit.

You can of course create your own hierarchical system of exceptions, which will behave the same way as the built-in.

class CustomException(Exception):
    """Something went wrong in my program"""

class FileRelatedError(CustomException):
    """Couldn't access or update a file"""

class IORelatedError(CustomException):
    """Couldn't read from it"""


try:
    raise FileRelatedError("That file should NOT be there")
except CustomException as err:
    print(err)

# Output: That file should NOT be there

Traceback and Messages

So far we have been printing messages in the except blocks. Which is fine for small scripts, and I highly recommended including instead of just pass (Unless that is a planned safe route). However, if there is an exception, you probably want to know a bit more. That’s where the traceback comes in, it lets you know some of the most recent calls leading to the exception.

import traceback
import reusables

try:
    reusables.extract('not_a_real_file')
except Exception as err:
    traceback.print_exc()

Using the built-in traceback module, we can use traceback.print_exc() to directly print the traceback to the screen. (Use my_var = traceback.format_exc() if you plan to capture it to a variable instead of printing it.)

Traceback (most recent call last):
  File "C:\python_file.py", line 11, in <module>
    reusables.extract('not_a_real_file')
  File "C:\reusables.py", line 594, in extract
    raise OSError("File does not exist or has zero size")
OSError: File does not exist or has zero size

This becomes a lifesaver when trying to figure out what went wrong where. Here we can see what line of our code executed to cause the exception reusables.extract and where in that library the exception was raised. That way, if we need to dig deeper, or it’s a more complex issue, everything is nicely laid out for us.

“So what, it prints to my screen when it errors out and exits my program.” True, but what if you don’t want your entire program to fail over a trivial section of code, and want to be able to go back and figure out what erred. An even better example of this is with logging.

The standard library logging module logger has a beautiful feature built-in, .exception. Where it will log the full traceback of the last exception.

import logging
log = logging.getLogger(__name__)

try:
    assert False
except Exception as err:
    log.exception("Could not assert to True")

This will log the usual message you provide at the ERROR level, but will then include the full traceback of the last exception. Extremely valuable for services or background programs.

2017-01-20 22:28:34,784 - __main__      ERROR    Could not assert to True
Traceback (most recent call last):
  File "C:/python_file.py", line 12, in <module>
    assert False
AssertionError

Traceback and logging both have a lot of wonderful abilities that I won’t go into detail with here, but make sure to take the time to look into them.

Finally

try except blocks actually have two additional optional clauses, else and finally sections. Finally is very straightforward, no matter what happens in the rest of try except, that code WILL execute. It supersedes all other returns, it will suppress other block’s raises if it has it’s own raise or return.  (The only case when it won’t run is because the program is terminated during it’s execution.)

That’s why in the very top example code block, even though there is a raise being called, the False is still returned.

    finally:
        return False

The Finally section is invaluable when you are dealing with stuff that is critical to handle for clean up.  For example, safely closing database connections.

try:
    something()
except CustomExpectedError:
    relaunch_program()
except Exception:
    log.exception("Something broke")
except (SystemError, KeyboardInterrupt):
    wait_for_db_transactions()
finally:
    stop_db_transactions()
    db_commit()
    db_close()

Now that is a rather beautiful try except, it catches lowest case first, will log unexpected errors, has a soft_exit if the user presses Ctrl+C, and will safely close the database connection not matter what, even if Ctrl+C is pressed again while wait_for_db_transactions() is being run. (Notice this isn’t a perfect example, as it will still wait to close the database connection after relauch_program is ended, which means that function should do something like fork off so the finally block isn’t wanting  forever to run.)

What Else?

The final possible section of the try except is else, which is the optional success path. Meaning that code will only run if no exceptions are raised. (Remember that code in the finally section will still run as well if present.)

try:
    a, b, *c = "123,456,789,10,11,12".split(",")
except ValueError:
    a, b, c = '1', '2', '3'
else:
    print(",".join(num for num in a))

# Output: 1,2,3

This helps mainly with flow control in the program. For example, if you have a function that tries multiple ways to interpret a string, you could add an else clause to each of the try except blocks that returns on success.  (Dear goodness don’t actually use this code it’s just an example for flow.)

def what_is_it(incoming):
    try:
        float(incoming)
    except ValueError:
        pass  # This is fine as it's a precise, expected exception
    else:
        return 'number'

    try:
        len(str(incoming).split(",")) > 1
    except Exception:
        traceback.print_exc() # Unsure of possible errors, find out
    else:
        return 'list'

From What?

As you might have noticed in the initial example, there  is a line:

raise inner_err from None

This is new in Python 3 and from my experience, hardly ever used, even though very useful. It sets the context of the exception. Ever seen a page log exception with plenty of During handling of the above exception, another exception occurred: 

This can set it to an exact exception above it, so it only displays the relevant ones.

try:
    [] + {}
except TypeError as type_err:
    try:
        assert False
    except AssertionError as assert_err:
        try:
            "" + []
        except TypeError as type_two_err:
            raise type_two_err from type_err

Without the from the above code would show all three tracebacks, but by setting the last exception’s context to the first one, we get only those two, with a new message between them.

Traceback (most recent call last):
  File "C:/python_file.py", line 10, in <module>
    [] + {}
TypeError: can only concatenate list (not "dict") to list

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/python_file.py", line 18, in <module>
    raise type_two_err from type_err
  File "C:/python_file.py", line 16, in <module>
    "" + []
TypeError: Can't convert 'list' object to str implicitly

Pretty handy stuff, and setting it to from None will remove all other exceptions from the context, so it only displays that exception’s traceback.

try:
    [] + {}
except TypeError as type_err:
    try:
        assert False
    except AssertionError as assert_err:
        try:
            "" + []
        except TypeError as type_two_err:
            raise type_two_err from None
Traceback (most recent call last):
  File "C:/Users/Chris/PycharmProjects/Reusables/ww.py", line 18, in <module>
    raise type_two_err from None
  File "C:/Users/Chris/PycharmProjects/Reusables/ww.py", line 16, in <module>
    "" + []
TypeError: Can't convert 'list' object to str implicitly

Now you and the users of the program know where those exceptions are raised from. 

Warnings

Another often neglected part of the python standard library is warnings. I will also not delve too deeply into it, but at least I’m trying to raise awareness.

I find the most useful feature of warnings as run once reminders. For example, if you have some code that you think might change in the next release, you can put:

import warnings

def my_func():
    warnings.warn("my_func is changing it's name to master_func",
                  FutureWarning)
    master_func()

This will print the warning to stderr, but only the first time by default. So if someone’s code is using this function a thousand times, it doesn’t blast their terminal with warning messages.

To learn more about them, check out Python Module of the Week‘s post on it.

Best Practices

  1. Never ever have a blank except, always catch Exception or more precise exceptions for internal errors, and have a separate except block for external errors.
  2. Only have the code that may cause a precise error in the try block. It’s better to have multiple try except blocks with limited scope rather than a long section of code in the try block with a generic Exception.
  3. Have a meaningful message to users (or yourself) in except blocks.
  4. Create your own exceptions instead of just raising Exception.