In this passage, we're continuing to learn the most popular loop in Python: the for loop.
The for loop tells the program to perform an action multiple times with a single command.
The first challenge we face when using this loop is determining the number of repetitions. Sometimes we have to decide for ourselves how many times the loop should run, but more often the number follows from the data: we simply want to iterate through all the items in a collection.
Let's consider an example. Starting from a list of first names written in lower case, we want to create a new list in which each name begins with a capital letter:
['ava', 'amelia', 'alexander', 'aiden', 'avery', 'abigail', 'asher', 'anthony', 'aria', 'andrew', 'adrian', 'aurora', 'angel', 'aaron', 'axel', 'addison', 'austin', 'aubrey', 'adam', 'audrey', 'aaliyah', 'anna', 'alice', 'amir', 'allison', 'ariana', 'autumn', 'ayden', 'ashton', 'august', 'adeline', 'adriel', 'athena', 'archer', 'adalynn', 'arthur', 'alex']
So how many of these names are there in the collection? Will we count manually? Or (what might be easier) – will we use predefined functions to count elements in the collection[1]? Or maybe… we will not be counting at all?
Foreach loop
The answer to the last question is: if we want to iterate over all the elements of a collection, counting them is, from a human perspective, completely unnecessary work. Unfortunately, from the computer's perspective it is different. The computer cannot take in a collection of data at a glance and see its beginning, its end, and the logical step from one element to the next, as for example in this chain:
house → sequence of letters 'h', 'o', 'u', 's', 'e'
Many programming languages therefore require that elements be indexed, and that those indexes be used to state how, and how many times, the loop is to run.
The creators of Python decided to free us from this obligation and built the for loop on iterators, thanks to which the computer sees the collection like a human: it recognizes the first and last element and is able to perform the magic "next", a jump to the next element, until the last element in the collection is reached.
You will find quite a lot of articles and studies on iterators on the web. In our opinion, when learning the basics of programming, it makes no sense to explore them. Iterators were designed as "good spirits": they are invisible themselves and do their job in an invisible way. It is enough for us that thanks to them we can comfortably traverse[2] collections.
Let's have a look.
The classic loop (known from C or C++) looks something like this:
for (int i = 0; i < 10; i++)
    …do something with A[i]
where A[i] is the i-th element of collection A.
As you can see, instead of iterating over the elements directly, we iterate over their indexes.
We need to make sure that there are exactly 10 elements and then limit the range of the loop to that number (i < 10).
In contrast, the loop in Python
for element in A:
    do something with element
has the same effect without indexes and without ranges.
The advantage of the Python for loop is its simplicity.
You don't need to count items
In Python, the for loop actually works like a foreach loop. As the name suggests, a foreach loop visits each item in the collection in turn. Determining the number of items in the collection becomes redundant.
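Returning to the list of names from the beginning, a minimal sketch of the task might look like this. It uses only the first three names from that list and the built-in string method capitalize(); the variable names names and capitalized are our own choice for illustration.

```python
# A shortened version of the list of names (first three items only)
names = ['ava', 'amelia', 'alexander']

capitalized = []
for name in names:
    # capitalize() returns a copy with the first character in upper case
    # and the remaining characters in lower case
    capitalized.append(name.capitalize())

print(capitalized)  # ['Ava', 'Amelia', 'Alexander']
```

Nowhere did we have to say how many names there are; the loop stops by itself after the last one.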
Direct access to collection items
While in the classic for loop you access the elements of a sequence indirectly — through their indexes — the foreach loop gives us direct access to the elements in a collection. Indexes become redundant. The annoying "hedgehog" — that is, the piece of code "riddled with" square and round brackets — disappears.
Let's compare the two mentioned types of loops
While Python favors its elegant indexless loop (which is essentially a foreach loop), we can also easily build a for loop that works like the classic indexed loop. We use the functions range() and len(); len() returns the length of the sequence (the number of elements in it), so if you want to go through the whole sequence A, just write for i in range(len(A)).
This pattern is used very eagerly by students who come to Python with experience in Java or C++. Driven by habit, they try to access successive elements of a collection using indexes. They forget that Python offers a better and simpler way to access the elements of the collection that the loop iterates over. Let us look at some examples.
Iterating over indexes (range and len in action)
How many of you often write code like this?
animals = ['sheep', 'cow', 'pig', 'horse', 'goat']
for i in range(len(animals)):
    print(animals[i])
It makes sense, doesn't it? You check the length of the animals list, create a "range" object containing the index collection, and then iterate over that collection.
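To see what the loop actually walks over in this case, we can convert the range object to a list. This is only a small illustrative sketch; the name indexes is our own.

```python
animals = ['sheep', 'cow', 'pig', 'horse', 'goat']

print(len(animals))  # 5

# range(len(animals)) yields the indexes 0..4, not the animals themselves
indexes = list(range(len(animals)))
print(indexes)  # [0, 1, 2, 3, 4]
```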
As a result, with the help of these indexes, you get access to the names of the animals. A bit of a roundabout road to the goal, isn't it? And you don't have to do it in such a "roundabout" way.
Python gives you a nice shortcut: the "indexless" Python for loop — which is essentially a foreach loop.
Unlike the classic for loop, the foreach loop gives you direct access to the items in the collection. So we can assign these elements directly to the loop variable, instead of looking for help in indices as we did in the example above.
Thus the more Pythonic way of writing a loop might look like this:
animals = ['sheep', 'cow', 'pig', 'horse', 'goat']
for animal in animals:
    print(animal)
Here, instead of the abstract variable name i, we have a nice descriptive variable name, and the code is shorter and less complex.
Please note that using the range() function to "generate" indexes, and then accessing elements using those indexes, is not in the spirit of Python. Iterating using indexes is not recommended if we can iterate over the elements directly.
Let's discuss the pros and cons …
Iterating over two lists at once
OK, now comes the question: is it better to iterate directly over the elements of a collection or over their indexes? Well, the answer is not simple. Each solution has its advantages and disadvantages.
For example, a plain indexless loop over a single list cannot, on its own, walk through two lists at the same time, whereas with an indexed loop this is possible!
Look at a typical example: list A holds net prices and list B holds the corresponding 10% tax. We iterate over A and B simultaneously and create a new list C with the sums (net price + tax):
A = [12, 14, 16]
B = [1.2, 1.4, 1.6]
C = []
for i in range(len(A)):
    C.append(A[i] + B[i])
print(C)
Output:
[13.2, 15.4, 17.6]
As you might have guessed, the Python developers found a way to iterate over two lists at once without using indexes: the zip function.
A = [12, 14, 16]
B = [1.2, 1.4, 1.6]
C = []
for price, tax in zip(A, B):
    C.append(price + tax)
print(C)
Output:
[13.2, 15.4, 17.6]
zip is a simple tool for parallel iteration. It is nice and convenient to use. But supporters of index methods will surely wince: another function to remember … And it works less intuitively than the variant using indexes …
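If zip seems less intuitive, it may help to look at what it produces. The sketch below (the names pairs and short are ours) turns the result of zip into a list: each item is a tuple holding one element from each list. It also shows a detail worth knowing: when the lists differ in length, zip stops at the end of the shorter one.

```python
A = [12, 14, 16]
B = [1.2, 1.4, 1.6]

# zip pairs up the elements that share the same position
pairs = list(zip(A, B))
print(pairs)  # [(12, 1.2), (14, 1.4), (16, 1.6)]

# With lists of unequal length, the extra items are silently dropped
short = list(zip([1, 2, 3], ['a', 'b']))
print(short)  # [(1, 'a'), (2, 'b')]
```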
Changing when iterating
Another consideration when deciding between an indexed loop and an indexless loop is the loop's ability to modify the collection it iterates over. In particular, the ability to change the value of individual elements of the collection.
The indexed for loop can replace the elements of the collection it iterates over. An indexless loop cannot do so by assigning to its loop variable. Let's explain why.
The indexed loop allows access to the list items "by index", so when we have a list:
A = ['cat', 'dog', 'mouse']
the program views it as follows:
A = [A[0] = 'cat', A[1] = 'dog', A[2] = 'mouse']
thus it sees the list items as a collection of variables (labels) with assigned values[3]. Each of these "variables" is identified by the index under which the value of the element is stored.
However, in the case of an indexless loop, the values, which are successive elements of the list, are assigned to a variable outside the list.
This is the crux of the matter. When the for loop addresses the elements of the list through these quasi-variables, it can change their values in the list itself. When the loop merely hands us each value from the list, assigning to the loop variable does not change the list. At most, we change the value outside the list.
Let's check with examples whether it works as described:
Indexed for loop:
A = ['cat', 'dog', 'mouse']
for i in range(len(A)):
    A[i] = 'parrot'
print(A)
Output:
['parrot', 'parrot', 'parrot']
Indexless for loop:
A = ['cat', 'dog', 'mouse']
for elem in A:
    elem = 'parrot'
print(A)
Output:
['cat', 'dog', 'mouse']
Here you go. Everything turned out to be true — as announced. The indexed loop has changed the list items. An indexless loop left the list elements unchanged[4].
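If you want to keep direct access to the elements and still change the list, one common option (not used in the examples above, so treat it as a suggestion) is the built-in enumerate function, which hands the loop both the index and the element. A minimal sketch:

```python
A = ['cat', 'dog', 'mouse']

for i, elem in enumerate(A):
    # elem gives direct access to the value; A[i] writes back into the list
    A[i] = elem.upper()

print(A)  # ['CAT', 'DOG', 'MOUSE']
```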
Finally: be very careful when modifying a list you are iterating over with a for loop. Avoid removing or adding items during iteration, because the loop may then skip items and your code may trick you.
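A small sketch of what can go wrong when items are removed during iteration. The lists and the names numbers and cleaned are made up for illustration. Removing an item from the list we are looping over shifts the remaining items to the left, so the loop skips one of them. Looping over a copy (made here with the slice [:]) avoids the problem.

```python
numbers = [1, 2, 2, 3]
for n in numbers:
    if n == 2:
        numbers.remove(n)  # the list shrinks while we are still iterating
print(numbers)  # [1, 2, 3] - one of the 2s was skipped

cleaned = [1, 2, 2, 3]
for n in cleaned[:]:  # iterate over a copy, modify the original
    if n == 2:
        cleaned.remove(n)
print(cleaned)  # [1, 3]
```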
[1] for example, in C++ it is typically the member function size() (or length() for strings), and in Python it is the function len()
[2] traversing is the process by which you access – one by one – every element present in a data structure, such as an array.
[3] note: the above notation is for illustrative purpose only; do not write list items this way
[4] another old rule, which you should always remember: in Python, a string (like a number) is an immutable object. You can assign such a value to a variable, and the variable may later be reassigned a new value, but the existing value itself cannot be modified. Assigning to the loop variable therefore only makes that variable refer to a new value; the list still holds the old one.