Find The Index Of The First Occurrence In A String Leetcode Problem 28 [Python]
Are you ready to dive into another exciting LeetCode problem, the Find The Index Of The First Occurrence In A String Leetcode Problem 28?
We’ll provide you with a Python solution to tackle this problem efficiently.
So, let’s get started!
Problem Overview
The problem statement is quite simple but crucial.
Given two strings, needle
and haystack
, your task is to return the index of the first occurrence of the needle
in the haystack
.
If the needle
is not part of the haystack
, return -1. Let’s illustrate this with an example.
Example 1:
- Input:
haystack = "sadbutsad", needle = "sad"
- Output:
0
- Explanation: In the
haystack
, “sad” occurs at index 0 and 6. The first occurrence is at index 0, so we return 0.
Example 2:
- Input:
haystack = "leetcode", needle = "leeto"
- Output:
-1
- Explanation: “leeto” did not occur in “leetcode,” so we return -1.
Understanding Constraints
Before diving into the solution, let’s understand the constraints set for this problem:
- 1 <=
haystack.length
,needle.length
<= 10^4 haystack
andneedle
consist of only lowercase English characters.
Now, let’s move on to explore our Python solution.
Python Solution
We’ll start with the efficient Python solution for this problem.
This approach leverages the Knuth-Morris-Pratt (KMP) string matching algorithm to find the first occurrence of needle
in haystack
.
Let’s take a look at the code:
def strStr(self, haystack: str, needle: str) -> int:
if needle == "":
return 0
lps = [0] * len(needle)
prevLPS, i = 0, 1
while i < len(needle):
if needle[i] == needle[prevLPS]:
lps[i] = prevLPS + 1
prevLPS += 1
i += 1
elif prevLPS == 0:
lps[i] = 0
i += 1
else:
prevLPS = lps[prevLPS - 1]
i = 0 # Pointer for haystack
j = 0 # Pointer for needle
while i < len(haystack):
if haystack[i] == needle[j]:
i, j = i + 1, j + 1
else:
if j == 0:
i += 1
else:
j = lps[j - 1]
if j == len(needle):
return i - len(needle)
return -1
Now, let’s break down the solution into sections to better understand it.
1. Bruteforce Approach
The brute-force approach for this problem involves checking each possible substring of haystack
with the same length as needle
to find a match.
While this approach can be implemented using a nested loop, it is highly inefficient, especially for large strings.
It would have a time complexity of O(N*M)
, where N is the length of haystack
and M is the length of needle
.
So, for a more efficient solution, we turn to the KMP algorithm.
2. An Efficient Approach with KMP Algorithm
The Knuth-Morris-Pratt (KMP) algorithm is a powerful tool for efficiently finding occurrences of a substring within a longer string.
Let’s see how it works.
- Building the Longest Prefix Suffix (LPS) Table First, we construct the LPS table for the
needle
.
The LPS table is an array that stores the length of the longest proper prefix that is also a suffix of the string ending at the current character.
- We initialize
lps
as an array of zeros with the same length asneedle
. - We use two pointers,
prevLPS
andi
, to iterate throughneedle
. - If the characters at
needle[i]
andneedle[prevLPS]
match, we increaseprevLPS
by 1 and assignlps[i]
withprevLPS + 1
.
We continue to the next character.
- If the characters do not match, and
prevLPS
is 0, we setlps[i]
to 0 and move to the next character. - If the characters do not match and
prevLPS
is greater than 0, we updateprevLPS
tolps[prevLPS - 1]
.
This process helps us identify the longest common prefix and suffix for each position in needle
.
- Finding the First Occurrence With the LPS table constructed, we use two pointers,
i
andj
, to traversehaystack
andneedle
, respectively.
We use the LPS table to optimize our search.
- While iterating through
haystack
:- If
haystack[i]
matchesneedle[j]
, we increment bothi
andj
. - If there’s a mismatch and
j
is 0, we incrementi
to check the next character. - If there’s a mismatch and
j
is not 0, we updatej
tolps[j - 1]
to skip the already matched characters.
- If
- If
j
becomes equal to the length ofneedle
, it means we’ve found the first occurrence, and we return the starting index by subtractingj
fromi
. - If the loop finishes without finding a match, we return -1. This efficient approach significantly reduces the number of character comparisons and provides a time complexity of
O(N)
for finding the first occurrence ofneedle
inhaystack
.
Time and Space Complexity
Now, let’s discuss the time and space complexity of our solution.
Time Complexity: The time complexity of our KMP-based solution is O(N)
, where N is the length of the haystack
.
This is because we only traverse the haystack
once while efficiently matching the needle
.
Space Complexity: The space complexity is O(M)
, where M is the length of the needle
.
We create the LPS table of size needle
, and the additional space used for other variables is constant.
Reasoning Behind Our Efficient Approach
The reason we chose the KMP algorithm to solve this problem is its ability to optimize the search for the first occurrence of a substring in a larger string.
By constructing the LPS table, we can efficiently skip unnecessary comparisons, reducing the time complexity from O(N*M)
to O(N)
.
This approach becomes particularly valuable when dealing with longer strings.
In summary, the KMP algorithm is a powerful technique for string matching problems.
Understanding the concept of LPS and how to use it to optimize string matching can be a valuable addition to your algorithmic
toolkit.
Related Interview Questions By Company:
- Min Cost Climbing Stairs LeetCode
- Squares Of A Sorted Array LeetCode
- Binary Tree Preorder Traversal LeetCode
Related Interview Questions By Difficulty:
- Insert Delete Get Random O(1) LeetCode
- Non Decreasing Array LeetCode
- Maximum Product Of The Length Of Two Palindromic Subsequences
Related Interview Questions By Category:
Conclusion
In this blog post, we explored the Find The Index Of The First Occurrence In A String problem (LeetCode Problem 28) and provided an efficient Python solution that leverages the Knuth-Morris-Pratt (KMP) algorithm.
We discussed the problem statement, constraints, and presented both the brute-force approach and the optimized KMP algorithm solution.
We hope this guide has been helpful in understanding this problem and the efficient algorithm for solving it.
If you have any questions, suggestions, or want to share your thoughts, please feel free to comment below.
Happy coding!