The Documentation Dilemma

Every coder will come across the same aggravating issue one day, finding a script or library you think you need, and it has zero instructions. Or the examples don’t work, or there is so much writing and so little substance you can’t hope to dig through it all to discover how it’s meant to be used.

Don’t be like that project, document your code smartly. And document to the level of project you are writing. If you have a little script that’s self-contained as a single file, it would be stupid to add a documentation directory and set up Sphinx. Whereas if you were creating a library for other coders, it would be stupid if you didn’t have extensive documentation.

Acceptable Documentation

I used to be like every other college grad, “your code should document itself”, “An idiot could decipher my code”, “if they can’t bother reading through the code, why would they run it on their system?” That is the sound of inexperience and ignorance. For your code to be useful to other people, it needs at minimum to describe what it does and have expected input and outputs, which is easily shown in examples.

The reality is,  professionals simply do a quick search (either company internal or web) and determine if a program or script already exists to do what they want, by reading a short blurb about them and moving on. No one stops to read the code, don’t do your project the disservice of hiding it from others that would find it useful. Not only does it help others, it’s a much better self promoter, showing that you care about your own work and want others to use it.

Here is a quick breakdown of what I view as requirements of documentation:

Single script

  • Readme: can be the docstring at top of file
    • Description of what the script is designed to do
    • Examples, including a sample run with common input / output
  • Code should have meaningful messages on failure

Multiple scripts

  • Readme: standalone file
    • Description of project as a whole
    • Description of each script
    • Example of each script
    • Description and examples of any interconnection between scripts
  • Code needs meaningful messages on failure

Library

  • Readme: standalone file
    • Description of project
    • Installation guide
    • Basic examples
  • Changelog
  • License
  • Documentation directory
    • Stored in a human readable markup language (.md, .rst, .latex) that can be generated into PDF or HTML
    • Python projects should use Sphinx that pulls in docstrings
    • Index with contents of Readme, links to other files for each aspect of library
  • Code should use logging

Library with API

  • Everything in Library
  • API documents for every interface
    • preferably in a common format such as RAML

Program

  • Readme: standalone file
    • Description of project
    • Installation guide
    • Tutorials
      • Including detailed examples / use cases
    • FAQ / troubleshooting
  • License
  • Code should use logging
  • Changelog

Obviously I can’t go into every possible scenario, but I hope this clears up what is a common expectation of a project. And if you intend for others to help with the project, it’s a good idea to include developer docs for contributors that go over code conventions and any particular styles you want to adhere too. If other people are helping, make sure to have a section dedicated thanks to them, such as an Authors file.

Writing the Documentation

Here’s a quick baseline of a README in Markdown you can feel free to copy and reuse.

Project Name
============

Description
-----------

The is an example project that is designed 
to teach documentation. The project will search
imdb and find a movie based off the input given.

Supported Systems:

* Windows XP or newer
* Linux 
 
Tested on:

* Python 2.6+
* Python 3.3+
* Pypy


Examples
--------

Find the movie

```
$ python find_imdb_movie.py 300 

Title: 300 (2006)
Summary: King Leonidas of Sparta and a
         force of 300 men fight the Persians at 
         Thermopylae in 480 B.C.
Rating: 7.7 / 10
```

Get more details about it

```
$ python find_imdb_movie.py --director 300 

Title: 300 (2006)
Director: Zack Snyder
```

License
-------

This code is MIT Licensed, see LICENSE 

If you have a script, and it already has good error messages, you’re good to go! Feel free to add more example cases, FAQ or the full usage output (aka the details given when do you my_project -h, and yes you need that too.)

For larger projects, this is not enough. As it would be very difficult to fit enough examples onto a single readable page, and very easy for them to go stale. To avoid that issue in code libraries, I recommend including the examples of usage in the Python docstrings themselves, and have them auto included in the documentation with Sphinx. That way, when you update a function, you’re much more likely to remember to update the example beside it.

def now(utc=False, tz=None):
"""
Get a current DateTime object. By default is local.

.. code:: python

reusables.now()
# DateTime(2016, 12, 8, 22, 5, 2, 517000)

reusables.now().format("It's {24-hour}:{min}")
# "It's 22:05"

:param utc: bool, default False, UTC time not local
:param tz: TimeZone as specified by the datetime module
:return: reusables.DateTime
"""

Here’s what that will look like when rendered with Sphinx (in this case hosted with Read the Docs )

Reusables.now() documentation example

This is very convenient if you have a library and are exposing all your functions, but isn’t an option for full-fledged programs or APIs. I have seen a few fancy tricks, such as pulling the examples out of the Readme’s and using them as functional tests, or parsing the RAML API files and testing them.  It’s a good way to ensure that all the examples are up to date, but now the documentation is dual purpose, and has to be both machine and human readable, possibly taking too much away from the human readability for the good that the tests do to make sure they are up to date.

So what’s the answer? Well, there isn’t just one. If it’s just you and you’re not going to be updating the project that often, write the examples when it is near completion and try your best to update them if the project updates. If it’s an ever evolving project with multiple contributors, it would probably be best to have some sort of automation to ensure they are not going stale.

The best thing to do is have a standard across your team / company. Even a stupid standard implemented well can result in better documentation than no standards. That way everyone knows exactly where to look for docs, and what they are expected to contribute.