Splitting a list in Python might seem intimidating if you don’t have much experience with the language, and it can be a fiddly task in many other programming languages. In Python, though, it’s fairly straightforward. In the following examples we’ll split a Python string into a list and then split that list into smaller pieces. We begin by creating a string, then split it so that each word becomes its own element in a list.
aString = "one two three four five six"
bList = aString.split()
print(bList)     # the whole list produced by split
print(bList[1])  # a single element of that list
The first print function outputs the list created by the split string. There are a few things to take note of. The first is how split knew where to cut the string. When split is called, it uses the first argument passed as the delimiter string. If no argument is passed, it splits on whitespace, and any run of consecutive whitespace counts as a single separator. In fact, any non-empty string can be passed to split as the delimiter on which the elements are cut.
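As a minimal sketch of how the delimiter changes the result (the strings and variable names below are illustrative and not part of the original example):

```python
# Illustrative inputs; these names are assumptions made for this sketch.
spaced = "one   two \t three\nfour"
dashed = "one-two--three"

# No argument: any run of whitespace (spaces, tabs, newlines) is one separator.
by_whitespace = spaced.split()

# Explicit single space: every space is a cut, so empty strings can appear.
by_space = "one  two".split(" ")

# Any non-empty string can act as the delimiter, including multi-character ones.
by_dash = dashed.split("-")
by_double_dash = dashed.split("--")

print(by_whitespace)   # ['one', 'two', 'three', 'four']
print(by_space)        # ['one', '', 'two']
print(by_dash)         # ['one', 'two', '', 'three']
print(by_double_dash)  # ['one-two', 'three']
```

Note the difference between calling split() with no argument and split(" "): only the former collapses consecutive whitespace.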
Next, we use print to output a single element from within the new list. Keep in mind that these print statements are just in the code to illustrate specific points. You wouldn’t need to use them in actual production code.
The result of printing bList[1] might come as a surprise if you’re new to Python. Asking for element 1 might suggest the word one, but instead we see two on our screen. This is because Python counts list positions from zero. So if we wanted the first item in the list we would print bList[0].
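A short sketch of zero-based indexing, using the same words as above (the variable names are illustrative; the negative index is an extra not covered in the text):

```python
words = "one two three four five six".split()

first = words[0]    # position 0 is the first element
second = words[1]   # position 1 is the second element
last = words[-1]    # negative indexes count back from the end

print(first)   # one
print(second)  # two
print(last)    # six
```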
Splitting a List
With those basics down we can move on to actually splitting a list in Python. We’ll use the following code.
aString = "one two three four five six"
aList = aString.split()
print(aList)
aList = aList[1:4]
print(aList)
The first line of the example creates the original string, consisting of words separated by whitespace. Next, we take that string and split it into a Python list. Note that the code passes no arguments to split. That’s because whitespace is the default separator, so we don’t need to specify it.
The first argument passed to split determines the delimiter used to cut the string into individual list elements. For example, we could pass a comma to split and it would break the string apart at every comma. This is especially useful when converting comma-separated text, which we see a lot when loading CSV spreadsheets. These files are essentially plain text files with a .csv extension in which each line holds a list of comma-separated values. But no matter the source, the data all works in a similar way once it’s stored in a Python list.
In this instance, our first print(aList) shows the current content of the list. From here we can see that our list is now composed of the individual words found within the string aString.
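The comma-separated case mentioned above might look like the following sketch. The sample lines are made up for illustration, and stripping the surrounding spaces is an optional extra step:

```python
# Hypothetical comma-separated lines, as might appear in a simple CSV file.
csv_line = "apple,banana,cherry"
padded_line = "apple, banana , cherry"

# Passing "," as the delimiter cuts the string at every comma.
fields = csv_line.split(",")

# split leaves surrounding spaces in place, so strip each field if needed.
cleaned = [field.strip() for field in padded_line.split(",")]

print(fields)   # ['apple', 'banana', 'cherry']
print(cleaned)  # ['apple', 'banana', 'cherry']
```

For real CSV files with quoted fields or embedded commas, the standard library’s csv module is usually a safer choice than a plain split.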
Next, we redefine aList as a subset of itself. We can select part of any list with a slice, using two integers to mark the start and end points. The start index is included and the end index is not, so aList[1:4] keeps the elements at positions 1, 2 and 3, and the second print shows ['two', 'three', 'four']. Keep in mind that this syntax works no matter what data type the list holds. We could even do this with a list of lists, that is, a nested list.
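Here is a small sketch of a few slicing variations, including a nested list. The variable names and the nested data are assumptions made for illustration:

```python
words = "one two three four five six".split()

middle = words[1:4]   # start index included, end index excluded
left = words[:3]      # omitting the start means "from the beginning"
right = words[3:]     # omitting the end means "through the last element"

# Slicing works the same way when the elements are themselves lists.
nested = [[1, 2], [3, 4], [5, 6]]
nested_part = nested[1:3]

print(middle)       # ['two', 'three', 'four']
print(left)         # ['one', 'two', 'three']
print(right)        # ['four', 'five', 'six']
print(nested_part)  # [[3, 4], [5, 6]]
```

Taking words[:3] and words[3:] together is one simple way to split a list into two halves without losing or repeating any element.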
Doing More
But what if we wanted to be even more selective with our criteria? For example, what if we only wanted a list of the words that contain the letter t? We could accomplish that using something called a list comprehension.
aString = "one two three four five six"
aList = aString.split()
tList = [x for x in aList if "t" in x]
print(tList)
Here we go about things in a similar way to the previous example. But once we’ve created aList from our split string, we create a new list using a list comprehension. This amounts to a simple loop that applies a single condition when populating the new list. If the item contains the letter t then it’s included in the new list, or sublist. Finally, we print the result on screen. From there the list could also be loaded into something like a dataframe or NumPy array.
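To make the "simple loop" comparison concrete, the sketch below writes the same filter as an explicit loop and then shows a stricter variant that keeps only words beginning with t. The extra word list and all variable names are illustrative assumptions:

```python
words = "one two three four five six".split()

# The list comprehension [x for x in words if "t" in x] is equivalent to this loop.
contains_t = []
for word in words:
    if "t" in word:          # true when the letter appears anywhere in the word
        contains_t.append(word)

# str.startswith is stricter: the word must begin with the letter.
extra_words = ["eight", "ten"]
extra_contains_t = [w for w in extra_words if "t" in w]
extra_starts_with_t = [w for w in extra_words if w.startswith("t")]

print(contains_t)           # ['two', 'three']
print(extra_contains_t)     # ['eight', 'ten']
print(extra_starts_with_t)  # ['ten']
```

With the original six words both tests give the same answer, which is why the difference only shows up on a word like eight.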