Python is awesome, and can pretty much do everything you ever wanted, but on rare occasion you may want to call an external program. The original way to do this with Python was to use os.system.
import os

return_code = os.system("echo 'May the force be with you'")
The message “May the force be with you” would be printed to the terminal via stdout, and the return code variable would be 0 as it did not error. Great for running a program, not so great if you need to capture its output.
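If you did need the output back then, one of the old workarounds was os.popen, which wraps the command’s stdout in a file-like object; a minimal sketch (not part of this article’s approach):

import os

# os.system only gives back the exit status; os.popen lets you read the
# command's stdout as a string instead.
output = os.popen("echo 'May the force be with you'").read()
print(output)  # May the force be with you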
So the Secret Order of the Pythonic Brotherhood* met, performed the required rituals to appease our Benevolent Dictator for Life**, and brought forth subprocess.

* not real
** real
Subprocess is a module dedicated to running other processes. You’ve probably already used or encountered it in its many forms: subprocess.call, subprocess.check_call, subprocess.check_output, or even a direct call to the process constructor, subprocess.Popen.
These made life a lot easier, as you could now have easy interaction with the process pipes, run it as a direct call or open a new shell first, and more. The downside was the inconvenience of having to switch between them depending on your needs, as they all returned different things depending on how you interacted with them.
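To get a feel for those differences, here is a rough sketch of the same idea expressed three ways (Linux assumed, since "echo" is a real file there):

import subprocess

# call: runs the program, output goes to the terminal, returns the exit code
code = subprocess.call(["echo", "hi"])          # prints "hi", code == 0

# check_call: same, but raises CalledProcessError on a non-zero exit code
subprocess.check_call(["echo", "hi"])

# check_output: captures and returns stdout as bytes instead of printing it
out = subprocess.check_output(["echo", "hi"])   # out == b'hi\n'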
That all changed in Python 3.5 with the introduction of subprocess.run (for older versions check out reusables.run). It is simply the only subprocess command you should ever need! Let’s look at a quick example.
import subprocess

response = subprocess.run("echo 'Join the Dark Side!' 1>&2", shell=True, stderr=subprocess.PIPE)
# CompletedProcess(args="echo 'Join the Dark Side!' 1>&2",
#                  returncode=0,
#                  stderr=b"'Join the Dark Side!' \r\n")
Now check that response out. It’s an organized class that stores the args you sent to start the subprocess, the returncode, as well as stdout and/or stderr if a pipe was specified for them. (If something was sent to stdout or stderr and there wasn’t a pipe specified, it would be sent to the current terminal.) As the return value is a class, you can access any of those attributes as normal.
print(response.stderr)
# Join the Dark Side!

print(response.returncode)
# 0

response.check_returncode()
# Would return None in this case
It also includes a check_returncode function that will raise subprocess.CalledProcessError if the return code is not 0.
Basically, you should use subprocess.run and never look back. Its only real limitation is that it is equivalent to using Popen(...).communicate(), which means you cannot provide multiple inputs, wait for certain output, or behave interactively in any manner.
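If you do need that kind of control, the escape hatch is dropping down to Popen and talking to the pipes yourself; a bare-bones sketch (feeding a python3 child process is purely for illustration):

import subprocess

# Manually driving the pipes: write to stdin, close it, then read stdout.
# For truly interactive back-and-forth you would loop over the pipes
# (or reach for something like pexpect instead).
proc = subprocess.Popen(["python3"],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)
proc.stdin.write(b"print(1 + 1)\n")
proc.stdin.close()
print(proc.stdout.read())  # b'2\n'
proc.wait()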
There are plenty of additional capabilities that are good to know; this article will cover:
- Timeouts
- Shell
- Passing arguments as string or list
- Pipes and Buffers
- Input
- Working Directory
- Environment Variables
Timeout
In Python 2 it’s a real pain to have a timeout for a subprocess. You could potentially poll for a max amount of time before calling it quits, but if you had input it was much harder. On Linux you could use signals, but Windows required either a forever-running background thread or running in a separate process.
Thankfully, in the modern world we can simply pass one to the run command.
subprocess.run("ping 127.0.0.1", shell=True, timeout=1)
You’ll see some ping responses being printed to the terminal (as we didn’t send it to a pipe) then in a second (literally) see a traceback.
subprocess.TimeoutExpired: Command 'ping 127.0.0.1' timed out after 1 seconds
No crazy multiprocessing or signaling needed; you only need to pass a keyword argument.
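And if you would rather handle it than let the traceback bubble up, the exception is easy to catch:

import subprocess

# TimeoutExpired carries the command and the timeout that was exceeded
try:
    subprocess.run("ping 127.0.0.1", shell=True, timeout=1)
except subprocess.TimeoutExpired as err:
    print("Gave up after", err.timeout, "seconds")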
Shell
I see shell being overused and misunderstood a lot, so I want to define its behavior very clearly here. When shell=False is set, there is no system shell started up, so the first argument must be a path to an executable file or else it will fail.
Setting shell=True will first spin up a system-dependent shell process (commonly /bin/sh on Linux or cmd.exe on Windows) and run the command within it. With a shell you can use environment variables, shell built-in commands, and glob “*” expansion.
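A quick sketch of the difference on Linux (ls and the .py files are just stand-ins):

import subprocess

# With a shell, the "*" is expanded before ls ever sees it
subprocess.run("ls *.py", shell=True)

# Without a shell, ls receives the literal string "*.py" and will complain
# that no such file exists (unless you really have a file named "*.py")
subprocess.run(["ls", "*.py"])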
Also keep in mind that a lot of programs are actual files on Linux, whereas they are shell built-ins on Windows. That’s why “echo” with shell=False will work on Linux but will break on Windows:
subprocess.run(["echo", "hi"]) # Linux: CompletedProcess(args=['echo', 'hi'], returncode=0) # Windows: FileNotFoundError: # [WinError 2] The system cannot find the file specified
“So, just always use shell?” Nope, it’s actually better to avoid it whenever possible. It’s costly (aka slower) to spin up a new shell, and it’s susceptible to shell injection vulnerabilities.
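To make the injection risk concrete, here is a contrived sketch of what happens when untrusted text ends up in a shell=True command:

import subprocess

# Pretend this filename came from a user or a web form
filename = "notes.txt; echo 'anything after the semicolon runs too'"

# With shell=True the shell happily executes the injected command
subprocess.run("cat " + filename, shell=True)

# With a list and shell=False, the whole string is just one (weird) argument
# to cat, so the worst case is a "No such file or directory" error
subprocess.run(["cat", filename])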
If you are going to be calling an executable file, it’s best to always keep shell=False unless you need one of the shell’s features.
Arguments as string or list
There is some odd-seeming behavior with the first argument passed to subprocess functions, including .run: it changes from a list to a string depending on whether you use shell=True. In general, if shell=False (the default behavior), pass in a list of arguments. If shell=True, pass in a string.
subprocess.run(['echo', 'howdy'])           # List when shell=False
subprocess.run('echo "howdy"', shell=True)  # String when shell=True
However, it’s important to know why you should do that. It’s because of how Python has to create the new process and send it information, which differs across operating systems.
On Windows you can get away with murder and pass either a list or a string in either case. That’s because when creating a new process with shell=False, Python has to join the list of arguments into a string anyway; when shell=True, the string is sent directly to the shell as-is.
On Linux, the same thing happens when shell=True: the string will be passed directly to the newly spawned shell as-is, so it can expand globs and environment variables. However, if a list is passed, the items are sent as positional arguments to the shell. So if you have:
subprocess.run(['echo', 'howdy'], shell=True)
It is not sending “howdy” as an argument to echo, but rather to /bin/sh.
/bin/sh -c "echo" "howdy"
This results in the confusing behavior of nothing being returned to stdout and no error being raised.
And going the other direction can be a pain on Linux. When shell=False and a string is provided, the entire thing is treated as the path to the program. That is helpful if you want to run something without passing any arguments, but can be confusing at first when it returns a FileNotFoundError.
subprocess.run('echo "howdy"') # FileNotFoundError: [Errno 2] No such file or directory: 'echo "howdy"'
So to be safe, simply remember:
subprocess.run(['echo', 'howdy'])           # List when shell=False
subprocess.run('echo "howdy"', shell=True)  # String when shell=True
You can also “cheat” by always building a string, then using shlex.split on it when you don’t need a shell.
import shlex

args = shlex.split("conquer --who 'mine enemy' "
                   "--when 'Sometime in the next, eh, \"6\" minutes'")
print(args)
# ['conquer', '--who', 'mine enemy',
#  '--when', 'Sometime in the next, eh, "6" minutes']

subprocess.run(args)
(Note that shlex.split should also be passed posix=False when being used on Windows.)
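For example, a small sketch of what that non-POSIX mode does with a Windows-style path (the dir command is just an illustration):

import shlex

# posix=False keeps the quotes and leaves backslashes alone, which is what
# you want for Windows-style paths and quoting rules
print(shlex.split(r'dir "C:\Program Files"', posix=False))
# ['dir', '"C:\\Program Files"']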
Streams, Pipes and Buffers
Pipes and buffers are both sections of memory used for the storage and exchange of data between processes. Pipes are designed to transfer and hold the data, while buffers act as temporary vessels to transfer data to files.
Setting the stdout or stderr streams to subprocess.PIPE will save any output from the program to memory, which is then stored on the CompletedProcess class under the corresponding attribute name. If you do not set them, they will write to the corresponding default file descriptors, which are the same as sys.stdout (aka file descriptor 1) and so on. So if you redirect sys.stdout, the subprocess stdout will also be redirected there.
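You can also merge the two streams into a single pipe with subprocess.STDOUT, so everything the program printed ends up under one attribute; a small sketch (assuming a POSIX shell for the redirection):

import subprocess

response = subprocess.run(
    "echo 'to stdout'; echo 'to stderr' 1>&2",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,   # send stderr into stdout's pipe
)
print(response.stdout)  # b'to stdout\nto stderr\n'
print(response.stderr)  # None, nothing was piped separately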
Another common use case is to send the output to a file, i.e.:
subprocess.run('sh info_gathering.sh',
               stdout=open('comp_info.txt', 'w'),
               shell=True,
               encoding='utf-8',
               bufsize=4096)
That way the output is stored in a file, using a buffer to temporarily hold information in memory until a large enough section is worth writing to the file.
If encoding is specified (Python 3.6+), the incoming bytes will be decoded and the buffer will be treated as text, aka “text mode”. This also happens if either the errors or universal_newlines keyword arguments are specified.
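In practice that means the captured attributes come back as str instead of bytes, with no manual .decode() step; a quick sketch (the python3 --version call is just a stand-in for any program that prints something):

import subprocess

response = subprocess.run(
    ["python3", "--version"],
    stdout=subprocess.PIPE,
    encoding="utf-8",          # Python 3.6+: pipes are decoded to text
)
print(repr(response.stdout))   # e.g. 'Python 3.6.4\n' -- a str, not bytes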
There are multiple different ways to use buffering:
| bufsize | shorthand | description |
|---|---|---|
| 0 | unbuffered | Data will be directly written to file |
| 1 | line buffered | Text mode only, will write out buffer on `\n` |
| -1 | system default | Line buffered if text mode, otherwise will generally be 4096 or 8192 |
| >= 1 | sized buffer | Write out when (approx) that amount of bytes are in the buffer |
Input
With run there is a simple keyword argument, input, the same as Popen().communicate(input). It is a one-time dump to the standard input of the new process, which can read any of it at its choosing. However, it is not possible to wait for certain output or events before sending input; that is more suited to pexpect or similar.
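A short sketch of that one-time dump, piping some unsorted lines into sort (a Linux-flavored example):

import subprocess

response = subprocess.run(
    ["sort"],
    input=b"banana\napple\ncherry\n",  # bytes; use a str if encoding is set
    stdout=subprocess.PIPE,
)
print(response.stdout)  # b'apple\nbanana\ncherry\n'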
Check
The check keyword argument allows run to be a drop-in replacement for check_call. It makes sure the return code is 0, or will raise subprocess.CalledProcessError.
subprocess.run('exit 1', check=True, shell=True)
# subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1.

subprocess.check_call('exit 1', shell=True)
# subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1.
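As with the timeout, you can catch the error if a failure is something you expect to handle:

import subprocess

try:
    subprocess.run("exit 1", check=True, shell=True)
except subprocess.CalledProcessError as err:
    # The exception keeps the return code (and any captured output) around
    print("Command failed with return code", err.returncode)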
Working directory
To start the process from a certain directory, pass in the cwd argument with the directory you want to start in.
subprocess.run('dir', shell=True, cwd="C:\\")
#  Volume in drive C has no label.
#  Volume Serial Number is 58F1-C44C
#
#  Directory of C:\
Environment Variables
To send environment variables to the program, pass them as a dictionary with the env argument. To have them expanded in the command itself, as below, it needs to be run in a shell.
subprocess.run('echo "hey sport, here is a %TEST_VAR%"', shell=True, env={'TEST_VAR': 'fun toy'}) "hey sport, here is a fun toy" # CompletedProcess(args='echo "hey sport, here is a %TEST_VAR%"', # returncode=0)
It is not uncommon to want to pass the current environment variables as well as your own.
subprocess.run('echo "hey sport, here is a %TEST_VAR%. Being run on %OS%"', shell=True, env=dict(os.environ, TEST_VAR='fun toy')) "hey sport, here is a fun toy. Being run on Windows_NT" # CompletedProcess(args='echo "hey sport, here is a %TEST_VAR%. Being run on %OS%"', # returncode=0)
Comment: But how do you pass a find with -exec openSSL to subprocess.run and capture the output? The find reports the files OK, but I keep getting ‘stdin is not tty’ from the SSL part. Note: in Python the trailing ; should not be escaped. e.g. on the Windows cmd line: find . -name xxxx -printf "%p\n" -exec winpty openssl x509 -noout -in {} \;

Reply: I would not recommend using subprocess / exec directly when using SSH in Python due to a lot of different issues it causes; I would use Paramiko. To actually answer the question, I think it may just be looking for some form of input, so make sure that you are passing stdin=PIPE to the run function (from subprocess import PIPE, run for the imports). If it’s more than that, I am sorry; I specifically avoid using Popen for SSH, so I do not know.

Comment: Thanks, but I am limited to what I can use. Paramiko is unavailable to me.