
Stop using plus signs to concatenate strings!

In Python, using plus signs to concatenate strings is one of the first things you learn, e.g. print("hello" + "world!"), and it should be one of the first things you stop using. Using plus signs to “add” strings together is inherently more error-prone, messier, and less professional. Instead, you should be using .format() or f-strings.

Hunter – Artwork by Clara Griffith

Before diving into what’s really wrong with + plus sign concatenation, we are going to take a quick step back and look at the different ways to merge strings in Python, so we can get a better understanding of where to use what.

Concatenating strings

Method     When to use                                    When to avoid
+          Never                                          Always
%          Legacy code, logging module                    Python 3+
format     Everywhere
f-string   Python 3.6+                                    When you need to escape characters inside the {}s
join       On an iterable (list, tuple, etc) of strings
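The logging module entry in the table deserves a quick illustration. The standard library's logging calls accept a %-style message with its values passed as separate arguments, and the two are only merged if the record is actually emitted. Here is a minimal sketch; the logger name "demo" and the in-memory stream are assumptions made purely for this example.

```python
import io
import logging

# Hypothetical setup: send log output to an in-memory buffer so we can inspect it
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger = logging.getLogger("demo")  # "demo" is an arbitrary name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output out of the root logger

wait_time = 0.1
time_amount = "seconds"

# Pass the values as arguments; logging applies the % formatting itself
logger.info("We are going to wait %s %s", wait_time, time_amount)

# DEBUG is below the INFO threshold, so this message is never formatted
logger.debug("Skipped details: %s", {"have": "feelings"})

print(stream.getvalue(), end="")
# INFO: We are going to wait 0.1 seconds
```

That deferred formatting is the main reason the % style still shows up in otherwise modern code.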

Here is a quick demo of each of those methods in action using the same tuple of strings. For an already existing iterable of strings, join makes the most sense if you want the same character(s) between all of them. However, in most other cases join won’t be applicable, so we are going to ignore it for the rest of this post.

variables = ("these", "are", "strings")

print(" ".join(variables))
print("%s %s %s" % variables)
print("{} {} {}".format(*variables))
print(f"{variables[0]} {variables[1]} {variables[2]}")
print(variables[0] + " " + variables[1] + " " + variables[2])

# They all print "these are strings"

In many cases the words or strings you are concatenating will not live in the same structure, so even though f-strings look more cumbersome than the others here, they win out in simplicity in other scenarios. I honestly use f-strings more than anything else, but .format does have advantages we will look at later. Anyway, back to why using plus signs with strings is bad.
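As a rough illustration of that simpler case, here is a sketch with separate variables of different types. The names and values (user, attempts, elapsed) are made up for this example and do not come from anywhere else in the post.

```python
# Hypothetical values, invented for this illustration
user = "Ada"
attempts = 3
elapsed = 1.23456

# Each value is formatted in place; :.2f rounds the float to two decimals
message = f"{user} finished after {attempts} attempts in {elapsed:.2f} seconds"
print(message)
# Ada finished after 3 attempts in 1.23 seconds
```

The whole sentence stays readable as one string, and the format spec sits right next to the value it applies to.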

Errors lurking in the shadows

Consider the following code, which has four different perfectly working examples of string concatenation.

wait_time = "0.1"
time_amount = "seconds"

print("We are going to wait {} {}".format(wait_time, time_amount))

print(f"We are going to wait {wait_time} {time_amount}")

print("We are going to wait %s %s" % (wait_time, time_amount))

print("We are going to wait " + wait_time + " " + time_amount)

# We are going to wait 0.1 seconds
# We are going to wait 0.1 seconds
# We are going to wait 0.1 seconds
# We are going to wait 0.1 seconds

Everything works as expected, but wait, if we are going to put a time.sleep in there, it takes the wait time as a float. Let’s update that and add the sleep.

Concatenation TypeErrors

import time

wait_time = 0.1 # Changed from string to float
time_amount = "seconds"

print("We are going to wait {} {}".format(wait_time, time_amount))

print(f"We are going to wait {wait_time} {time_amount}")

print("We are going to wait %s %s" % (wait_time, time_amount))

print("We are going to wait " + wait_time + " " + time_amount)

time.sleep(wait_time)

print("All done!")


# We are going to wait 0.1 seconds
# We are going to wait 0.1 seconds
# We are going to wait 0.1 seconds
# Traceback (most recent call last):
#    print("We are going to wait " + wait_time + " " + time_amount)
# TypeError: can only concatenate str (not "float") to str

That’s right, the only method of string concatenation to break our code was using + plus signs. Now here it was very obvious it was going to happen. But what about going back to your code a few weeks or months later? Or even worse, if you are using someone else’s code as a library and they do this. It can become quite an avoidable headache.
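To make the library scenario a bit more concrete, here is a small sketch. Both describe_wait_plus and describe_wait_fstring are hypothetical helpers invented for this illustration; the first one works only as long as every caller passes strings.

```python
def describe_wait_plus(wait_time, time_amount):
    # Hypothetical helper: raises TypeError as soon as wait_time is not a str
    return "We are going to wait " + wait_time + " " + time_amount


def describe_wait_fstring(wait_time, time_amount):
    # Hypothetical helper: the f-string formats whatever value it is given
    return f"We are going to wait {wait_time} {time_amount}"


for value in ("0.1", 0.1, 2):
    try:
        print(describe_wait_plus(value, "seconds"))
    except TypeError as err:
        print(f"plus signs failed for {value!r}: {err}")
    print(describe_wait_fstring(value, "seconds"))
```

Only the string "0.1" makes it through the plus sign version, while the f-string version handles all three values without the caller needing to know how the message is built.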

Formatting issues

Another common issue that you will run into frequently when using plus signs is unclear formatting. It’s very easy to forget to add whitespace around variables when you aren’t writing a single string with replacement fields, as every other method does. Two lines that look very similar will yield two different results:

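wait_time = "0.1"  # a string again, otherwise the plus sign line raises a TypeError
time_amount = "seconds"

print(f"{wait_time} {time_amount}")
print(wait_time + time_amount)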

# 0.1 seconds
# 0.1seconds

Did you even notice we had that issue in the very first paragraph’s code? print("hello" + "world!")

Messy

This is the most subjective of my reasons to avoid plus signs, but I personally think the code becomes much harder to read than with any of the other methods, as the following example shows.

mixed_type_vars = {
    "a": "My",
    "b": 2056,
    "c": "bodyguards",
    "d": {"have": "feelings"}
}


def plus_string(variables):
    return variables["a"] + " " + str(variables["b"]) + \
           " " + variables["c"] + " " + str(variables["d"])


def format_string(variables):
    return "{a} {b} {c} {d}".format(**variables)


def percent_string(variables):
    return "%s %d %s %s" % (variables["a"], variables["b"], 
                            variables["c"], variables["d"])

print(plus_string(mixed_type_vars))
print(format_string(mixed_type_vars))
print(percent_string(mixed_type_vars))

String format is very powerful because it is a method call: it can take positional or keyword arguments and substitute them into the string accordingly. In the example above, .format(**variables) is equivalent to

.format(a="My", b=2056, c="bodyguards", d={"have": "feelings"})

That way in the string you can reference them by their keywords (in this case single characters a through d).

"This string is {opinion} formatted".format(opinion="very nicely")

This means format gives you a lot of options to make the string more readable, and you can easily reuse positional or named arguments.
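One of those options is keeping the template separate from its values, which an f-string cannot do because it is evaluated right where it is written. A minimal sketch, where TEMPLATE and the settings list are hypothetical names made up for this example:

```python
# Hypothetical template, defined once and filled in later
TEMPLATE = "We are going to wait {wait_time} {time_amount}"

# Hypothetical values that might arrive from a config file or user input
settings = [
    {"wait_time": 0.1, "time_amount": "seconds"},
    {"wait_time": 5, "time_amount": "minutes"},
]

# The same template is reused for every set of values
messages = [TEMPLATE.format(**setting) for setting in settings]
for message in messages:
    print(message)

# We are going to wait 0.1 seconds
# We are going to wait 5 minutes
```

A template can also repeat a field, positional or named, as many times as it likes: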

print("{0} is not {1} but it is {0} just like "
      "{fruit} is not a {vegetable} but is a {fruit}"
      "".format(1, 2, fruit="apple", vegetable="potato"))

Slower string conversion

Using the functions from the Messy section, we can see that plus signs are also slower when concatenating a mix of types.

import timeit
plus = timeit.timeit('plus_string(mixed_type_vars)',
                     number=1000000,
                     setup='from __main__ import mixed_type_vars, plus_string')

form = timeit.timeit('format_string(mixed_type_vars)',
                     number=1000000,
                     setup='from __main__ import mixed_type_vars, format_string')

percent = timeit.timeit('percent_string(mixed_type_vars)',
                        number=1000000,
                        setup='from __main__ import mixed_type_vars, percent_string')

print("Concatenating a mix of types into a string one million times:")
print(f"{plus:.04f} seconds - plus signs")
print(f"{form:.04f} seconds - string format")
print(f"{percent:.04f} seconds - percent signs")

# Concatenating a mix of types into a string one million times:
# 1.9958 seconds - plus signs
# 1.3123 seconds - string format
# 1.0439 seconds - percent signs

On my machine, percent signs were slightly faster than string format, but both smoked using plus signs and explicit conversion.
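The benchmark above leaves out f-strings. If you want to compare them on your own machine, something like the following sketch should work. Note that fstring_string is a hypothetical function added for this illustration, the run count is lowered to keep it quick, and I am deliberately not quoting any timings for it.

```python
import timeit

mixed_type_vars = {
    "a": "My",
    "b": 2056,
    "c": "bodyguards",
    "d": {"have": "feelings"}
}


def fstring_string(variables):
    # Hypothetical f-string counterpart to the functions in the Messy section
    return f"{variables['a']} {variables['b']} {variables['c']} {variables['d']}"


def format_string(variables):
    return "{a} {b} {c} {d}".format(**variables)


# Sanity check: both approaches must build the same string
assert fstring_string(mixed_type_vars) == format_string(mixed_type_vars)

# globals=globals() lets timeit see the names defined above
fstr = timeit.timeit("fstring_string(mixed_type_vars)",
                     number=100000,
                     globals=globals())

print(f"{fstr:.04f} seconds - f-string (100,000 runs)")
```

Results will vary by Python version and hardware, so treat any single run as a rough indication only.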

Unprofessional

This isn’t only something to call out teammates on during code review; it can even negatively impact you if you’re applying for Python jobs. Using “+” everywhere for strings is a red flag that you are still a novice. I don’t personally know anyone who has been turned away because of something so trivial, but it does show that you are unfamiliar with Python’s awesome feature-rich strings and haven’t had a lot of experience in group coding.

If you ever saw Batman or James Bond coding in Python, they wouldn’t be using +s in their string concatenation, and nor should you!

Summary

"If" + "👏" + "you" + "👏" + "use" + "👏" + "plus signs" + "👏" + "to" + "👏" + "concatenate" + "👏" + "your" + "👏" + "strings" + "👏" + "you" + "👏" + "are" + "👏" + "more" + "👏" + "annoying" + "👏" + "than" + "👏" + "this" + "👏" + "meme!"