Press enter to see results or esc to cancel.

Find Median From Data Stream Leetcode Problem 295 [Python Solution]

LeetCode problem #295, Find Median From Data Stream problem, shall we? We will provide a Python solution for this problem using two heaps (a max heap and a min heap).

This problem falls under the category of Heap and Priority Queue, and it is considered a hard problem on LeetCode.

Let’s start by understanding the problem and its constraints.

Problem Overview

The median is defined as the middle value in an ordered integer list.

If the size of the list is even, there is no single middle value, and the median is calculated as the mean of the two middle values.

For example, if we have an array arr = [2, 3, 4], the median is 3. If the array is arr = [2, 3], the median is (2 + 3) / 2, which equals 2.5. The task is to implement the MedianFinder class, which has the following methods:

  1. MedianFinder(): Initializes the MedianFinder object.
  2. addNum(int num): Adds the integer num from the data stream to the data structure.
  3. findMedian(): Returns the median of all elements added so far.

The answer should be accurate within 10^-5 of the actual answer.

Example 1:

Input:

["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]

Output:

[null, null, null, 1.5, null, 2.0]

Explanation:

medianFinder = MedianFinder()
medianFinder.addNum(1)  # arr = [1]
medianFinder.addNum(2)  # arr = [1, 2]
medianFinder.findMedian()  # Returns 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3)  # arr = [1, 2, 3]
medianFinder.findMedian()  # Returns 2.0

Understanding the Constraints

Before diving into the solution, let’s discuss the constraints:

  • The value of num falls in the range -10^5 to 10^5.
  • There will be at least one element in the data structure before calling findMedian.
  • There can be a maximum of 5 * 10^4 calls to both addNum and findMedian.

Now that we have a clear understanding of the problem and its constraints, let’s explore the Python solution.

Find Median From Data Stream LeetCode Problem Solution

To solve this problem, we’ll use two heaps: a max heap (small) and a min heap (large).

We’ll maintain these two heaps to ensure that the elements are roughly balanced, and the median can be efficiently calculated.

The small heap will store the smaller half of the numbers, while the large heap will store the larger half.

Let’s break down the solution into several components:

1. Initialization

We’ll initialize the small and large heaps as empty lists using Python’s heapq module.

The small heap will be a max heap, and the large heap will be a min heap.

These heaps will ensure that we can efficiently access the maximum and minimum values in constant time.

import heapq

class MedianFinder:
    def __init__(self):
        self.small = []  # Max heap for the smaller half of elements
        self.large = []  # Min heap for the larger half of elements

2. Adding Elements to the Heaps

The addNum method is responsible for adding an integer num to the data stream.

We need to ensure that the heaps remain balanced while doing so.

Here’s how it works:

  • If the large heap is not empty and num is greater than the smallest element in the large heap, we add num to the large heap.
  • Otherwise, we add -num to the small heap after negating it.

This negation ensures that the small heap effectively acts as a max heap for the original numbers.

  • We then check if the size of the small heap is greater than the size of the large heap by more than one.

If this is the case, we need to rebalance the heaps.

  • If the small heap has more elements, we pop the largest element from it (after negating it back to its original value) and push it into the large heap.

This balancing step ensures that the sizes of the two heaps remain roughly equal.

def addNum(self, num: int) -> None:
    if self.large and num > self.large[0]:
        heapq.heappush(self.large, num)
    else:
        heapq.heappush(self.small, -num)

    if len(self.small) > len(self.large) + 1:
        val = -heapq.heappop(self.small)
        heapq.heappush(self.large, val)
    if len(self.large) > len(self.small) + 1:
        val = heapq.heappop(self.large)
        heapq.heappush(self.small, -val)

3. Finding the Median

The findMedian method is responsible for calculating and returning the median of the elements added so far.

It handles both odd and even-length lists.

Here’s how it works:

  • If the length of the small heap is greater than the length of the large heap, we return the maximum value from the small heap (after negating it to its original value).

This handles the case when there are an odd number of elements.

  • If the length of the large heap is greater than the length of the small heap, we return the minimum value from the large heap.
  • If the lengths of both heaps are equal, we find the maximum from the small heap and the minimum from the large heap, add them together, and divide by 2 to calculate the median for even-length lists.
def findMedian(self) -> float:
    if len(self.small) > len(self.large):
        return -self.small[0]
    elif len(self.large) > len(self.small):
        return self.large[0]
    return (-self.small[0] + self.large[0]) / 2.0

Now that we have a clear understanding of the solution, including initialization, adding elements, and finding the median, let’s explore the time and space complexity of this algorithm.

Find Median From Data Stream Leetcode Solution [Python Full Code]

class MedianFinder:

    def __init__(self):
        self.small = []  # Max heap for the smaller half of elements
        self.large = []  # Min heap for the larger half of elements

    def addNum(self, num: int) -> None:
        if self.large and num > self.large[0]:
            heapq.heappush(self.large, num)
        else:
            heapq.heappush(self.small, -num)

        if len(self.small) > len(self.large) + 1:
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        if len(self.large) > len(self.small) + 1:
            val = heapq.heappop(self.large)
            heapq.heappush(self.small, -val)

    def addNum(self, num: int) -> None:
        if self.large and num > self.large[0]:
            heapq.heappush(self.large, num)
        else:
            heapq.heappush(self.small, -num)

        if len(self.small) > len(self.large) + 1:
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        if len(self.large) > len(self.small) + 1:
            val = heapq.heappop(self.large)
            heapq.heappush(self.small, -val)

    def findMedian(self) -> float:
        if len(self.small) > len(self.large):
            return -self.small[0]
        elif len(self.large) > len(self.small):
            return self.large[0]
        return (-self.small[0] + self.large[0]) / 2.0

Time and Space Complexity

Time Complexity

  • Adding an element to the heaps (in the addNum method) takes O(log n) time, where n is the number of elements in the heap.
  • Finding the median (in the findMedian method) takes O(1) time, as we can access the maximum and minimum elements in the heaps in constant time.

The overall time complexity of the solution is O(log n) for adding elements and O(1) for finding the median.

Space Complexity

The space complexity of this solution is O(n), where n is the number of elements added to the data stream.

The heaps store all the elements.

Related Interview Questions By Company:

Related Interview Questions By Difficulty:

Related Interview Questions By Category:

Conclusion

In this blog post, we explored the Find Median From Data Stream problem on LeetCode.

We provided a Python solution that efficiently calculates the median of a data stream using two heaps (a max heap and a min heap).

The solution balances the heaps and handles both odd and even-length lists, providing an accurate median within the specified constraints.

I hope this explanation helps you understand and implement this solution.

If you have any questions or need further clarification, feel free to ask!

>