Find The Index Of The First Occurrence In A String Leetcode Problem 28 [Python]

Are you ready to dive into another exciting LeetCode problem, the Find The Index Of The First Occurrence In A String Leetcode Problem 28?

We’ll provide you with a Python solution to tackle this problem efficiently.

So, let’s get started!

Problem Overview

The problem statement is quite simple but crucial.

Given two strings, needle and haystack, your task is to return the index of the first occurrence of the needle in the haystack.

If the needle is not part of the haystack, return -1. Let’s illustrate this with an example.

Example 1:

Input: haystack = "sadbutsad", needle = "sad"
Output: 0
Explanation: In the haystack, “sad” occurs at index 0 and 6. The first occurrence is at index 0, so we return 0.

Example 2:

Input: haystack = "leetcode", needle = "leeto"
Output: -1
Explanation: “leeto” did not occur in “leetcode,” so we return -1.

Understanding Constraints

Before diving into the solution, let’s understand the constraints set for this problem:

1 <= haystack.length, needle.length <= 10^4
haystack and needle consist of only lowercase English characters.

Now, let’s move on to explore our Python solution.

Python Solution

We’ll start with the efficient Python solution for this problem.

This approach leverages the Knuth-Morris-Pratt (KMP) string matching algorithm to find the first occurrence of needle in haystack.

Let’s take a look at the code:

View on GitHub ➹

def strStr(self, haystack: str, needle: str) -&gt; int:
    if needle == "":
        return 0
    lps = [0] * len(needle)

    prevLPS, i = 0, 1
    while i &lt; len(needle):
        if needle[i] == needle[prevLPS]:
            lps[i] = prevLPS + 1
            prevLPS += 1
            i += 1
        elif prevLPS == 0:
            lps[i] = 0
            i += 1
        else:
            prevLPS = lps[prevLPS - 1]

    i = 0  # Pointer for haystack
    j = 0  # Pointer for needle
    while i &lt; len(haystack):
        if haystack[i] == needle[j]:
            i, j = i + 1, j + 1
        else:
            if j == 0:
                i += 1
            else:
                j = lps[j - 1]
        if j == len(needle):
            return i - len(needle)
    return -1

Now, let’s break down the solution into sections to better understand it.

1. Bruteforce Approach

The brute-force approach for this problem involves checking each possible substring of haystack with the same length as needle to find a match.

While this approach can be implemented using a nested loop, it is highly inefficient, especially for large strings.

It would have a time complexity of O(N*M), where N is the length of haystack and M is the length of needle.

So, for a more efficient solution, we turn to the KMP algorithm.

2. An Efficient Approach with KMP Algorithm

The Knuth-Morris-Pratt (KMP) algorithm is a powerful tool for efficiently finding occurrences of a substring within a longer string.

Let’s see how it works.

Building the Longest Prefix Suffix (LPS) Table First, we construct the LPS table for the needle.

The LPS table is an array that stores the length of the longest proper prefix that is also a suffix of the string ending at the current character.

We initialize lps as an array of zeros with the same length as needle.
We use two pointers, prevLPS and i, to iterate through needle.
If the characters at needle[i] and needle[prevLPS] match, we increase prevLPS by 1 and assign lps[i] with prevLPS + 1.

We continue to the next character.

If the characters do not match, and prevLPS is 0, we set lps[i] to 0 and move to the next character.
If the characters do not match and prevLPS is greater than 0, we update prevLPS to lps[prevLPS - 1].

This process helps us identify the longest common prefix and suffix for each position in needle.

Finding the First Occurrence With the LPS table constructed, we use two pointers, i and j, to traverse haystack and needle, respectively.

We use the LPS table to optimize our search.

While iterating through haystack:
- If haystack[i] matches needle[j], we increment both i and j.
- If there’s a mismatch and j is 0, we increment i to check the next character.
- If there’s a mismatch and j is not 0, we update j to lps[j - 1] to skip the already matched characters.
If j becomes equal to the length of needle, it means we’ve found the first occurrence, and we return the starting index by subtracting j from i.
If the loop finishes without finding a match, we return -1. This efficient approach significantly reduces the number of character comparisons and provides a time complexity of O(N) for finding the first occurrence of needle in haystack.

Time and Space Complexity

Now, let’s discuss the time and space complexity of our solution.

Time Complexity: The time complexity of our KMP-based solution is O(N), where N is the length of the haystack.

This is because we only traverse the haystack once while efficiently matching the needle.

Space Complexity: The space complexity is O(M), where M is the length of the needle.

We create the LPS table of size needle, and the additional space used for other variables is constant.

Reasoning Behind Our Efficient Approach

The reason we chose the KMP algorithm to solve this problem is its ability to optimize the search for the first occurrence of a substring in a larger string.

By constructing the LPS table, we can efficiently skip unnecessary comparisons, reducing the time complexity from O(N*M) to O(N).

This approach becomes particularly valuable when dealing with longer strings.

In summary, the KMP algorithm is a powerful technique for string matching problems.

Understanding the concept of LPS and how to use it to optimize string matching can be a valuable addition to your algorithmic

toolkit.

Conclusion

In this blog post, we explored the Find The Index Of The First Occurrence In A String problem (LeetCode Problem 28) and provided an efficient Python solution that leverages the Knuth-Morris-Pratt (KMP) algorithm.

We discussed the problem statement, constraints, and presented both the brute-force approach and the optimized KMP algorithm solution.

We hope this guide has been helpful in understanding this problem and the efficient algorithm for solving it.

If you have any questions, suggestions, or want to share your thoughts, please feel free to comment below.

Happy coding!

Question Link

Find The Index Of The First Occurrence In A String Leetcode Problem 28 [Python]

Problem Overview

Example 1:

Example 2:

Understanding Constraints

Python Solution

1. Bruteforce Approach

2. An Efficient Approach with KMP Algorithm

Time and Space Complexity

Reasoning Behind Our Efficient Approach

Related Interview Questions By Company:

Related Interview Questions By Difficulty:

Related Interview Questions By Category:

Conclusion

Written By

CID

Tags