In this article, let us see how to check if a substring is present inside a given string. Strings are used everywhere; even the article you are reading right now is made up of a bunch of strings. “String” is just a fancy name for “Text”!
Python is exceptionally good at analyzing text-based data. Few programming languages can compete with Python when it comes to how quickly and easily you can write a script to analyze your text inputs!
Making yourself familiar with the Toolkit provided by the programming language is a must-do step on the path to becoming a “Master Python Craftsman”. This article is all about wielding the power of Python to quickly check if a substring is present inside a string!
For those of you who came here just to refresh your memories, here is the short version of the answer!
Checking If a Substring Is Present in a Given String: The Short Answer
We can figure out if a substring is in a given string using the “in” operator in Python. Let us see a short example.
>>> "ap" in "apple"
True
>>> "pl" in "apple"
True
>>> "ae" in "apple"
False
Here the main string is “apple” and the substrings which we checked are “ap”, “pl” and “ae”.
- “ap” being the first 2 letters of “apple” we get True
- “pl” being the 3rd and 4th letters of “apple” we get True again
- “ae” has 2 letters from “apple”, the 1st and the 5th (last) letters, but the string “ae” (as in an ‘a’ immediately followed by an ‘e’) is not in “apple” and hence we get False!
If you think you have learned everything there is to know about this topic, then hold your horses!
In the rest of this article, the following questions are answered:
- how the “in” operator works under the hood with strings and
- how this differs when the same “in” operator works with other sequences and collections.
As a bonus, we will also see
- how to use this super-power in our programs “The Right Way”!
How the “in” operator works under the hood with strings
In order to understand how things work, let us try to achieve the same result as above, this time using code we write ourselves instead of the “in” operator.
def is_substring_in_string(substring: str, string: str) -> bool:
    # an empty substring counts as present, just like "" in "apple"
    if not substring:
        return True
    # iterate over the string to see if the 1st character is present
    for i in range(len(string)):
        if string[i] == substring[0]:
            # 1st character matches, let us check the next few characters!
            for j in range(1, len(substring)):
                if i + j >= len(string) or string[i + j] != substring[j]:
                    # ran past the end of the string or found a mismatch
                    break
            else:
                # the inner loop finished without a break: total match
                return True
    return False
This function can be used as follows.
>>> is_substring_in_string("ap", "apple")
True
>>> is_substring_in_string("pl", "apple")
True
>>> is_substring_in_string("ae", "apple")
False
As you can see the function works the way we want it to!
A Challenge for you!
Before reading the next paragraph I want you to look at the code above and first come up with your own explanation on why the code does what it does. The practice of “Reading code with intent” will get you from being a “noob” to an “expert” in no time!
This might take some time, but think of this as a workout for the coding muscles of your brain!
Are you ready?
Okay, time for the explanation!
- there are 2 for-loops nested within each other
- the outer for-loop iterates over the main string and checks if any of its characters match the 1st character of the substring.
- if there is a match for the 1st character, the inner for-loop checks whether the rest of the characters match.
- at any point in the inner for-loop, if there is a mismatch, the control breaks out of the inner for-loop.
- if there is a total match, then True is returned!
All of this complicated yet powerful low-level stuff is taken care of by Python internally and we are left with the awesomeness of the “in” operator!
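As a quick sanity check, here is a minimal sketch of the same idea written with slicing instead of nested loops. The helper name `contains_by_slicing` is made up for this illustration, and it is not how Python implements the check internally (the interpreter uses a more optimized search). The sketch only shows that the result agrees with the “in” operator, which for strings ends up calling `str.__contains__`.

```python
def contains_by_slicing(substring: str, string: str) -> bool:
    """Hypothetical helper: compare every window of the right length."""
    width = len(substring)
    # slide a window of `width` characters over the main string
    for start in range(len(string) - width + 1):
        if string[start:start + width] == substring:
            return True
    return False


for candidate in ["ap", "pl", "ae", "e", ""]:
    by_hand = contains_by_slicing(candidate, "apple")
    built_in = candidate in "apple"
    # the "in" operator on strings calls str.__contains__ behind the scenes
    dunder = "apple".__contains__(candidate)
    print(repr(candidate), by_hand, built_in, dunder)
```

All three columns should agree on every row, including the empty string, which Python treats as a substring of every string.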
If you are not familiar with the “==” operator, I suggest you read my other article mentioned below first and then come back here!
There I have explained
- How to use the “==” operator in the interpreter and in programs
- How the “==” operator is implemented inside Python
- How to redefine the behavior of the “==” operator for the classes that we define.
So go ahead and read that article (at least till LEVEL#2) and then come back here!
How this differs from the way the “in” operator works with other sequences and collections
Okay, time for a shocker!
Have a look at the example below.
>>> apple_list = ['a', 'p', 'p', 'l', 'e']
>>> ['a', 'p'] in apple_list
False
>>> 'a' in apple_list
True
Time for another Challenge!
Why does the first check return False? Why does the second check return True?
Open your Python interpreter and experiment!
That is the best way to master Python!
Okay, let us see the answer, shall we?
In essence, everything we learned so far about the “in” operator applies to strings, where it looks for a run of consecutive characters!
The behavior of the “in” operator is quite different for containers such as lists, sets and dictionaries!
For Non-String datatypes
The “in” operator is used to check if a given item is present inside a container.
The container can be a list, set, dictionary, or anything in the collections module, or even a user-defined class that is derived from one of the above containers.
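The difference is easiest to see side by side. The short sketch below uses made-up example data (`nested_list`, `fruit_set` and `fruit_prices` are names invented for this illustration) to show that a container checks for a single matching item rather than a run of consecutive items.

```python
apple_list = ['a', 'p', 'p', 'l', 'e']

# a list checks whether any single element equals the item on the left
print('a' in apple_list)           # True: 'a' is one of the elements
print(['a', 'p'] in apple_list)    # False: no element is the list ['a', 'p']

# the check succeeds once that list is itself an element of the container
nested_list = [['a', 'p'], 'p', 'l', 'e']
print(['a', 'p'] in nested_list)   # True

# sets work the same way, and dictionaries look at their keys only
fruit_set = {"apple", "mango"}
fruit_prices = {"apple": 3, "mango": 5}   # made-up example data
print("apple" in fruit_set)        # True
print("apple" in fruit_prices)     # True: "apple" is a key
print(3 in fruit_prices)           # False: 3 is a value, not a key
```

This is why `['a', 'p'] in apple_list` is False even though 'a' and 'p' sit next to each other in the list: none of the five elements is itself equal to `['a', 'p']`.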
I have written an entire article explaining everything you need to know about the “in” operator and its sibling, the “not in” operator, which you can find in the link below.
There I have explained
- what containers are
- what the “in” and “not in” operators do
- when and where to use “in” and “not in” operators and
- how these 2 operators are implemented under the hood
So I suggest you give that article a go when you get some time!
Okay, time to get back to this article! In the next section, let us learn how to use these 2 operators in our programs!
Example Use-Cases of This Powerful Tool
Use-Case#1: Finding Needle in a Haystack
A good real-world example is finding strings in a log file using a Python script. I have personally used this one plenty of times while doing some automated data processing!
Imagine you have a gazillion lines in a log file to sort through and you are looking for a particular string in that file. You can do so as follows.
file_name = "log.txt"  # name of the file with the gazillion lines!

# open the log file and read its entire contents into one string
with open(file_name) as file:
    log_string = file.read()

if "error occurred" in log_string:
    print("There is a bug in your code :(")
else:
    print("Hurray! your code is bug free!")
How do you find a needle in a haystack?
You bring along a really powerful magnet!
When your “haystack” is a huge log file, Python is your magnet!
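If the log file is too big to read into memory comfortably, one option is to apply the same “in” check to one line at a time. The sketch below is only an illustration under a few assumptions: the helper name `find_lines_containing` is hypothetical, and the tiny temporary log file is made-up data so that the example can run on its own.

```python
import os
import tempfile


def find_lines_containing(file_name: str, needle: str) -> list:
    """Hypothetical helper: return the 1-based numbers of lines containing needle."""
    matching_line_numbers = []
    with open(file_name) as log_file:
        # iterating over the file object yields one line at a time
        for line_number, line in enumerate(log_file, start=1):
            if needle in line:
                matching_line_numbers.append(line_number)
    return matching_line_numbers


# build a tiny made-up log file so the sketch can run on its own
with tempfile.TemporaryDirectory() as temp_dir:
    demo_path = os.path.join(temp_dir, "log.txt")
    with open(demo_path, "w") as demo_file:
        demo_file.write("starting up\nerror occurred in step 2\nall done\n")
    hits = find_lines_containing(demo_path, "error occurred")
    print(hits)  # [2]
```

Besides keeping memory use low, this variant also tells you where each needle is hiding, not just whether there is one.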
Use-case#2: Confirming the absence of any needles in a haystack
Say you wish to make sure that there is not a single needle in a given haystack, how can you do that?
This is as simple as replacing “in” with “not in“!
>>> "el" not in "apple"
True
>>> "ap" not in "apple"
False
>>> "ae" not in "apple"
True
You might be thinking, if they are simply the opposite then why do we even have “not in“?
The answer to that question is the readability and maintainability of code!
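To make the readability point concrete, here is a small sketch comparing the two spellings. The variable name `fruit` is just an example; both forms give the same result, since Python evaluates `not "ae" in fruit` as `not ("ae" in fruit)`.

```python
fruit = "apple"

# both lines ask the same question and always give the same answer
print("ae" not in fruit)     # True
print(not "ae" in fruit)     # True, evaluated as not ("ae" in fruit)

# the first spelling reads closer to plain English inside a condition
if "ae" not in fruit:
    print("no 'ae' here")
```

The “not in” spelling keeps the whole test in one place, so a reader does not have to stop and work out what the leading `not` applies to.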
The log-file example from Use-Case#1 above is a good place to use the “not in” operator instead of the “in” operator when our interest is in confirming the absence of the substring “error occurred”.
file_name = "log.txt"  # name of the file with the gazillion lines!

# open the log file and read its entire contents into one string
with open(file_name) as file:
    log_string = file.read()

if "error occurred" not in log_string:
    print("Hurray! your code is bug free!")
else:
    print("There is a bug in your code :(")
The best code is easily readable code, because it survives the longest without getting rewritten!
There used to be a time when performance mattered more than readability, but with the advancement in processors, we have reached a stage where “Readability” is just as important as “Performance”.
Where To Go From Here?
If you have made it this far, then “Bravo!” to you! I admire your spirit and hunger for knowledge!
Again, to learn further about the usage of “in” and “not in” operators with other sequences and collections, refer to the article below!
Once you have mastered these membership operators, go ahead and check out the “Related Articles” section below for more cool articles!
I hope you found this article useful!
Feel free to share this article with your friends and colleagues!