A Gentle Introduction to Programming Using Python
Welcome to our Intro to Python Programming guide! It is designed for people without much experience in either programming generally or using Python specifically.
It is also written for people from a campaign/movement background. The guide’s examples are taken from progressive data work and make reference to Parsons. But you do not need to know anything about Parsons to use this guide!
If you want to show thanks for the free guide, please give feedback. Feel free to add questions and suggestions to this guide by leaving comments.
There are many different ways to set up and work with code. Some people write and run Python code via a hosted platform like Civis, for instance, or via Jupyter notebooks.
Today we’re going to focus on two very common and useful ways to work with code: from the command line, and from an Integrated Development Environment (IDE).
Command line interfaces let you do a lot of different things on your computer, including installing and running programs and navigating the directory structure of your computer.
On Macs/Linux the default command line interface is called a Terminal and on Windows it is called the Command Prompt. Command line interfaces are also sometimes called shells.
Open up a command line window and type: python. This should launch the Python interpreter, which allows you to write Python code directly.
Note how the Python interpreter replaces the command line, rather than opening up in a separate tab or window. The differences between the command line prompt and the Python interpreter prompt can be subtle.
This image shows me opening a command line and writing text using echo. I then open the Python interpreter using python. Finally, I write text within the interpreter using print (Python’s print and the shell’s echo are roughly equivalent).
The Python interpreter prompt typically ends with >>> while the command line/shell prompt typically ends with $.
You can leave the interpreter by typing quit(). This will bring you back to the command line.
You want to be using Python 3. Some older computers/operating systems may make Python 2 the default. You can see which Python version is your default by typing python --version on your command line. If you see a number that starts with 2 (Python 2.7 is common) you’ll want to either work within a Python 3 virtual environment or use the command python3 instead of python in all the examples below.
Now that we’ve got all that explained, let’s make a super simple file to run from the command line. Open up a new file with the command nano test_script.py. (All Python files end with .py. Nano is just a simple text editing program.)
Write the following code:
print("Hi!")
Save and close the file (see the bottom of the Nano screen for instructions - on Linux to save and quit you press Ctrl+O, then Enter to confirm the file name, then Ctrl+X).
You can now run your file with the command python test_script.py. You should see the output Hi! printed to the command line.
Using Nano is fine for very simple scripts, but if you write complicated code you’ll want to use a program that helps you do so. These are called Integrated Development Environments or IDEs. IDEs can have a variety of features, including syntax highlighting, checking for missed imports, updating reference calls, etc.
I use Visual Studio Code. VS Code is nice because you can add lots of features to it, but you can also keep things pretty simple to start. You can explore the other things VS Code has to offer on a future day.
Using an IDE is especially helpful for Python because Python has meaningful white space. That means the empty places in the code - spaces and indentations - really matter. Many other languages use curly braces to mark where a block of code, such as a function or method, starts and ends. In Python, we use indentation. An IDE can help you make sure you’re using indentation correctly. If you mix tabs and spaces inconsistently, for instance, Python will throw an error.
Let’s open our test_script.py file in our IDE. If you’re not sure where you made the file, go back to the command line and type pwd (Mac/Linux) or dir (Windows). This will show you where in your directory tree your test script is.
Make a change in your script from the IDE, for instance, have the script say “Bye!” instead of “Hi!”. Save the file in the IDE and run it from the command line. The new output text should show up.
Let’s go ahead and put an error in our script. For instance, let’s take the quotation marks away from our “Hi!” or “Bye!”
When we run the script again, we’ll get an error. It should look like this:
A SyntaxError is one of many kinds of built-in Python errors (gotta catch ‘em all!). Looking at this error, we can see it has a few different parts. It tells us what type of error it is (a SyntaxError), a clarifying message (“invalid syntax”), and it also points to where the error arose (on line 1). Learning to read these kinds of messages will serve you very well. We’ll run across a few different errors in our training today.
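If you’d like to poke at those parts from Python itself, here is a minimal sketch. It assumes the broken line is print(Hi!) and uses the built-in compile function to check the code without running it. The try/except wrapper is only there so the example keeps going after the error, and the exact message wording can differ between Python versions.

```python
# The broken line from our script, with the quotation marks removed.
broken_code = "print(Hi!)"

error_type = None
error_line = None

try:
    # compile() checks the syntax of the code without running it.
    compile(broken_code, "test_script.py", "exec")
except SyntaxError as error:
    error_type = type(error).__name__  # what kind of error it is
    error_line = error.lineno          # the line where the error arose
    print(error_type, "on line", error_line)
    print("Message:", error.msg)       # wording varies by Python version
```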
When we write Python, we don’t want to have to build everything from scratch. We re-use existing code by importing it. You can do so with the import statement. For example, if you want to use the Parsons project, you’d write import parsons.
Not everything has to be imported. Some Python functions and types called built-ins are, you guessed it, built in to the language. You can call them without importing them. For example, our print("Hi!") call above uses the print built-in.
Imports can come from three places:
The Python Standard Library. The Python Standard Library is installed when you install Python, and you don’t have to do anything else to access it. Just import its modules with a command like import datetime or import random.
Third party packages (i.e., PyPI). Other developers may offer packages for you to download and install. The most common way to offer packages is to put them on PyPI, the Python Package Index. That’s how many people get Parsons. These packages can be installed with commands like pip install parsons. Once that’s done, you can import using a command like import parsons.
Your own local code. You can write code locally and import it by referencing the file. For instance, this is a snippet of some Parsons code. We can see that Parsons is importing code from elsewhere in Parsons (the two lines that start with from parsons), alongside a few standard library modules:
import json
import time
from parsons.etl import Table
from parsons.utilities import check_env
import logging
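As a small sketch of the first kind of import, here are the two standard library modules mentioned above in action. The variable names today and roll are just made up for this example.

```python
# Standard library modules: nothing extra to install.
import datetime
import random

today = datetime.date.today()  # today's date
roll = random.randint(1, 6)    # a random whole number from 1 to 6

print(today)
print(roll)
```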
Now that we’re done setting up, let’s take a look at some basic Python data structures. Let’s open up the Python interpreter with the command python (or python3).
We can create variables of any type by assigning them to a name using the = operator. Python is what’s known as a dynamically typed language. By that, we mean that a variable’s type is not declared when you create it, as it would be in a statically typed language. Java, a statically typed language, makes you write things like String foo = "hello!"; whereas in Python, you can just write foo = "hello!". This makes code easier to write but can make it hard to track what type a variable is.
(It’s okay if you immediately forget the difference between dynamic and statically typed variables. It’s one of those things that often takes a few explanations to remember, and it’s not immediately relevant to your life anyway.)
One helpful trick to determine the type of a variable when you’re unsure is type():
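Here is a quick sketch of what that looks like. The variable names are just examples.

```python
foo = "hello!"
count = 3

print(type(foo))    # <class 'str'>
print(type(count))  # <class 'int'>
```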
Two of the most basic types in Python are integers and strings. Integers are whole numbers, and strings are characters within single or double quotation marks.
I’m not going to go deep into data types, but it’s worth mentioning this is just the beginning of working with numbers and text in Python. For instance, what if you want to work with decimal numbers instead of just integers? You can use a data type known as a float.
There are some interesting rules when doing math with these numbers - for instance, when adding or subtracting numbers. If they’re both integers, the result will be an integer; if they’re both floats, the result will be a float. What if you add an integer and a float? The result is a float. This is because a float contains more information than an integer does. Python assumes you don’t want to lose that info, and keeps the result as a float by default.
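You can try these rules out for yourself. This small sketch adds each combination and checks the type of the result:

```python
whole = 2 + 3        # integer + integer
decimal = 2.5 + 0.5  # float + float
mixed = 2 + 0.5      # integer + float

print(whole, type(whole))      # 5 <class 'int'>
print(decimal, type(decimal))  # 3.0 <class 'float'>
print(mixed, type(mixed))      # 2.5 <class 'float'>
```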
Lists or arrays are just zero or more things encased in square brackets [].
A list can be full of integers: [1, 2, 3, 4]
It can be full of strings: ["Hi", "Bye", "Hi again"]
It can be a mix of things: [1, "Hi", 3.0]
It can even contain other lists: [200, "Okay", ["A", "B"]]
Like any other Python type, you do not need to declare ahead of time what your list will contain. You can add whatever you want with wild abandon!
You can get an item in a list by supplying its index number. Python is zero-indexed, which means the index starts with 0:
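For instance, with a made-up list of three states, the first item lives at index 0 and the third at index 2:

```python
states = ["NJ", "OH", "TX"]

first = states[0]  # "NJ"
third = states[2]  # "TX"

print(first, third)  # NJ TX
```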
You will probably forget this fact at some point and cause an IndexError:
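Here is a sketch of that mistake, using a made-up list of states. In the interpreter you would simply see the error. The try/except wrapper is only there so the example keeps running after the error happens.

```python
states = ["NJ", "OH", "TX"]  # valid indexes are 0, 1 and 2

try:
    states[3]  # there is no fourth item
except IndexError as error:
    message = str(error)
    print("IndexError:", message)  # IndexError: list index out of range
```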
Another error for our collection!
You can add new items to the end of a list using “append”:
You can add items anywhere in a list using insert. The first number you give to insert is the index where you’d like the item placed. The second variable (which can be a number but doesn’t have to be) is the item to be added:
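Both ways of adding items look something like this (the list contents are just an example):

```python
states = ["NJ", "OH"]

states.append("TX")     # add "TX" to the end of the list
states.insert(1, "NY")  # place "NY" at index 1, shifting later items right

print(states)  # ['NJ', 'NY', 'OH', 'TX']
```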
Dictionaries are a bit like lists except instead of just values you have key-value pairs. Dictionaries are encased in curly brackets {}.
Dictionaries can also be full of many different types, including other dictionaries:
{"A": 1, "B": 2, "C": [10, 100, 1000], "D": {"KY": "Kentucky", "DC": "District of Columbia"}}
You cannot index into a dictionary. Instead, you must provide a key:
Sometimes you ask for a key that’s not there, leading to a KeyError:
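Here is a sketch of both situations, assuming a small states dictionary. As before, the try/except wrapper is only there so the example keeps running after the error.

```python
states = {"KY": "Kentucky", "DC": "District of Columbia"}

# Looking up a key that exists gives you its value.
print(states["KY"])  # Kentucky

# Looking up a key that does not exist raises a KeyError.
try:
    states["NY"]
except KeyError as error:
    missing_key = error.args[0]
    print("KeyError:", missing_key)  # KeyError: NY
```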
To avoid this, many people use a slightly more complex syntax:
state = states.get("NY")
With this syntax, asking for a key that’s not there gives you either None or, if supplied, a default value:
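A short sketch, again assuming a small states dictionary. The default value "Unknown" is just an example.

```python
states = {"KY": "Kentucky", "DC": "District of Columbia"}

state = states.get("NY")                # key is missing, so we get None
fallback = states.get("NY", "Unknown")  # key is missing, so we get the default
found = states.get("KY")                # key exists, so we get its value

print(state, fallback, found)  # None Unknown Kentucky
```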
Dictionaries are looked up by key rather than by position. Since Python 3.7 they do remember the order in which keys were added, and older code sometimes uses a special kind of dictionary called an OrderedDict when order is important. Either way, it doesn’t matter “where” you add a key-value pair. All that matters is whether it’s part of the dictionary or not.
You can add items to a dictionary by assigning to the key directly, or by using update. You can delete a key with the del statement, as in del states["NY"].
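All three operations together might look like this (the dictionary contents are just an example):

```python
states = {"KY": "Kentucky"}

states["DC"] = "District of Columbia"                  # assign to a key directly
states.update({"NY": "New York", "NJ": "New Jersey"})  # add several pairs at once
del states["KY"]                                       # remove a key and its value

print(states)
```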
Booleans are values that are either True or False. The basic Booleans are, of course, True and False, but many other things can also be used as Booleans. You will sometimes hear people referring to these things as “truthy” and “falsy”. That just means that, when used as Booleans, they evaluate to True or False.
Generally, empty things are considered falsy. So 0, an empty string, an empty list and an empty dictionary all evaluate to False. Any non-zero number, or a non-empty string, list or dictionary, will evaluate to True.
Another falsy value is None. None is a special value that means, well, nothing. We’ll see it again in a bit when we get to functions. When a function isn’t given an explicit value to return, it returns None. For instance, the print function isn’t given a value to return, and returns None, which is a falsy value:
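You can check how Python treats a value by passing it to the built-in bool function. A quick sketch:

```python
# Empty things and zero are falsy.
print(bool(0), bool(""), bool([]), bool({}))  # False False False False

# Non-zero numbers and non-empty things are truthy.
print(bool(7), bool("hi"), bool([0]), bool({"a": 1}))  # True True True True

# print shows its text on screen but returns None.
result = print("hi!")
print(result)        # None
print(bool(result))  # False
```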
Booleans are used throughout Python. One common place to find them is in if statements.
If statements help you to create branches in your code. If something is true, you do one thing. If something is false, you do another:
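Here is a minimal sketch of such a branch. The functions gotv and register are hypothetical stand-ins that just return a message. A real script would do something more useful in each branch.

```python
def gotv():
    # Hypothetical stand-in for get-out-the-vote outreach.
    return "Send a get-out-the-vote reminder"

def register():
    # Hypothetical stand-in for voter registration outreach.
    return "Send a voter registration form"

voter = "registered"

if voter == "registered":
    action = gotv()
else:
    action = register()

print(action)  # Send a get-out-the-vote reminder
```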
If statements take a Boolean or anything that evaluates to a Boolean. In this example, we’re using a “comparison operator”. There are six common comparison operators:
== is equal to
!= is not equal to
> is greater than
>= is greater than or equal to
< is less than
<= is less than or equal to
So the example above can be read as “If the variable voter is equal to “registered”, call the function gotv, otherwise call the function register”.
Comparison operators are very common, but you can also use truthy or falsy values directly. For instance, maybe in the example above, we get our voter variable from a function and we know that voter will be a name if registered and a None if not registered. Then we could just write:
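A sketch of that shorter version, with gotv and register again as hypothetical stand-ins. Here voter is None, so the else branch runs.

```python
def gotv():
    # Hypothetical stand-in for get-out-the-vote outreach.
    return "Send a get-out-the-vote reminder"

def register():
    # Hypothetical stand-in for voter registration outreach.
    return "Send a voter registration form"

voter = None  # would be a name like "Anna" if the person were registered

if voter:  # a name is truthy, None is falsy
    action = gotv()
else:
    action = register()

print(action)  # Send a voter registration form
```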
You may want to chain your if statements. Perhaps your person has three states they could be in: unregistered, registered, and already voted.
You can check for multiple states using elif, short for “else if”:
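Here is a sketch with all three states. The action messages are made up for this example.

```python
voter = "already voted"

if voter == "registered":
    action = "Remind them to vote"
elif voter == "already voted":
    action = "Thank them for voting"
else:
    action = "Help them register"

print(action)  # Thank them for voting
```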
Note that you don’t have to include the else statement. The only required part is the “if”.
For loops allow you to run code multiple times. In English, you’d express it as, “For each of…” So the following code you might describe as “For each state in the list of NJ, OH and TX, print that state.”
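That description corresponds to code along these lines:

```python
states = ["NJ", "OH", "TX"]

# For each state in the list, print that state.
for state in states:
    print(state)
```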
You can give anything that’s iterable to a for loop. An iterable is something you can iterate through; that is, something you can go through item by item. Lists, dictionaries, and strings are all iterable.
You might expect an integer to be iterable, but it’s not:
An integer isn’t inherently iterable, but you can pass it to a function like range, which produces the sequence of numbers from 0 up to (but not including) that integer:
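Here is a sketch of both cases. The try/except wrapper is only there so the example keeps running after the error, and the list called numbers is just a way to collect what the loop sees.

```python
# Looping over a bare integer raises a TypeError.
try:
    for number in 3:
        print(number)
except TypeError as error:
    message = str(error)
    print("TypeError:", message)  # TypeError: 'int' object is not iterable

# Looping over range(3) goes through 0, 1 and 2.
numbers = []
for number in range(3):
    numbers.append(number)

print(numbers)  # [0, 1, 2]
```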
Anything can be put in a for loop as long as it iterates. You can make custom functions or objects that iterate! But that’s a pretty advanced Python topic, so we won’t go over that here.
One way to make code re-usable is to encapsulate it in a function. Let’s take our last example and put it in a function in test_script.py:
def print_numbers(last_number):
    for number in range(0, last_number):
        print(number)
As you can see, you define a function using the special syntax of def then the function name, then parentheses, then a colon. Any code the function runs is indented a level.
You can now use this code in another script by importing it. (Remember importing?) Create a second file in the same place and add this code:
from test_script import print_numbers
print("This is the calling script!")
print_numbers(5)
You can then run the code from the command line:
Functions sometimes return things. Let’s go back to our test script and make a dictionary of voters.
voters = {
    "Anna": "registered",
    "Bob": "not registered",
    "Carlos": "registered",
    "Divya": "already voted"
}
Now, let’s write a little function that finds registered voters and returns them:
def get_registered_voters(voting_dict):
    registered_voters = []
    for name, status in voting_dict.items():
        if status == "registered":
            registered_voters.append(name)
    return registered_voters
There are a few things going on here. First, we’re creating an empty list to store our registered voters. Then, we’re iterating through our dictionary. We call .items() on the dictionary to get both the key and the value because by default, when you iterate through a dictionary, you only get the key.
An alternative, less readable but perfectly valid way to get the item’s value looks like this:
def get_registered_voters(voting_dict):
    registered_voters = []
    for key in voting_dict:
        if voting_dict[key] == "registered":
            registered_voters.append(key)
    return registered_voters
Here, instead of getting key and value as part of the iteration, we only get key and use it to look up the value.
In both cases, we return our list of registered voters via the return syntax. You can return multiple things from a function, but here we just return the list of interest.
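We can go back to our calling script, then import and call our new function:
from test_script import get_registered_voters, voters
print(get_registered_voters(voters))
The voters dictionary lives in test_script.py, so we import it alongside the function. If you define voters in the calling script instead, put the print line below the dictionary so it can be passed into the function.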
You’ll notice that our get_registered_voters function accepts a single input parameter or argument, voting_dict. There are two types of arguments in Python: positional arguments and keyword arguments.
You may have spotted that our function calls the input parameter “voting_dict” while the calling script passes in something called “voters”. Those don’t match! But it doesn’t matter, because that input parameter is a positional argument, which is matched by its position rather than by its name. Positional arguments without default values are always required, and you have to be careful when passing them in to make sure they’re in the correct order.
For example, a function may be defined:
def get_data(name, date):
But you may accidentally call it with:
get_data(date, name)
Python itself will not catch this error. A type checker can catch it if you use type hints (an advanced topic), and sometimes not even then. If you’re lucky, your code will break because it expects a date and got a name. If you’re unlucky, the code will still work and you won’t notice you made this mistake until a user reports problems. (This is why testing and peer review are both important!)
To prevent these kinds of errors, I prefer to use keyword arguments. Keyword arguments have the following syntax:
def get_data(name="default_name", date="default_date"):
Note that defaults for keyword arguments are often supplied. If you want to have keyword arguments without defaults, you can use the asterisk syntax:
def get_data(*, name, date):
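To see why keyword arguments help, here is a small sketch. The body of get_data is made up for illustration (it just builds a sentence), and the name and date are example values.

```python
def get_data(name="default_name", date="default_date"):
    # Hypothetical body: just describe what was asked for.
    return f"{name} on {date}"

# Passing by position: swapping the order silently swaps the meaning.
positional = get_data("2024-11-05", "Anna")

# Passing by keyword: the order no longer matters.
keyword = get_data(date="2024-11-05", name="Anna")

print(positional)  # 2024-11-05 on Anna
print(keyword)     # Anna on 2024-11-05
```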
We can turn our get_registered_voters function into something that accepts keyword arguments, like so:
def get_registered_voters(*, voting_dict):
Now when we call the function the old way, we get an error:
We can fix this by calling the function with a keyword argument instead:
print(get_registered_voters(voting_dict=voters))
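Putting those last steps together, here is a sketch you can run as a single file. The try/except wrapper is only there so the example keeps running after the error from the old-style call.

```python
def get_registered_voters(*, voting_dict):
    registered_voters = []
    for name, status in voting_dict.items():
        if status == "registered":
            registered_voters.append(name)
    return registered_voters

voters = {
    "Anna": "registered",
    "Bob": "not registered",
    "Carlos": "registered",
    "Divya": "already voted",
}

# The old, positional way now raises a TypeError.
old_way_failed = False
try:
    get_registered_voters(voters)
except TypeError as error:
    old_way_failed = True
    print("TypeError:", error)

# The keyword way works.
registered = get_registered_voters(voting_dict=voters)
print(registered)  # ['Anna', 'Carlos']
```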
That’s it for our introduction to Python programming. Would you like to see additional topics in this guide? Let us know!