Find Median From Data Stream Leetcode Problem 295 [Python Solution]
LeetCode problem #295, Find Median From Data Stream problem, shall we? We will provide a Python solution for this problem using two heaps (a max heap and a min heap).
This problem falls under the category of Heap and Priority Queue, and it is considered a hard problem on LeetCode.
Let’s start by understanding the problem and its constraints.
Problem Overview
The median is defined as the middle value in an ordered integer list.
If the size of the list is even, there is no single middle value, and the median is calculated as the mean of the two middle values.
For example, if we have an array arr = [2, 3, 4], the median is 3. If the array is arr = [2, 3], the median is (2 + 3) / 2, which equals 2.5. The task is to implement the MedianFinder class, which has the following methods:
MedianFinder(): Initializes the MedianFinder object.addNum(int num): Adds the integernumfrom the data stream to the data structure.findMedian(): Returns the median of all elements added so far.
The answer should be accurate within 10^-5 of the actual answer.
Example 1:
Input:
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output:
[null, null, null, 1.5, null, 2.0]
Explanation:
medianFinder = MedianFinder()
medianFinder.addNum(1) # arr = [1]
medianFinder.addNum(2) # arr = [1, 2]
medianFinder.findMedian() # Returns 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3) # arr = [1, 2, 3]
medianFinder.findMedian() # Returns 2.0
Understanding the Constraints
Before diving into the solution, let’s discuss the constraints:
- The value of
numfalls in the range -10^5 to 10^5. - There will be at least one element in the data structure before calling
findMedian. - There can be a maximum of 5 * 10^4 calls to both
addNumandfindMedian.
Now that we have a clear understanding of the problem and its constraints, let’s explore the Python solution.
Find Median From Data Stream LeetCode Problem Solution
To solve this problem, we’ll use two heaps: a max heap (small) and a min heap (large).
We’ll maintain these two heaps to ensure that the elements are roughly balanced, and the median can be efficiently calculated.
The small heap will store the smaller half of the numbers, while the large heap will store the larger half.
Let’s break down the solution into several components:
1. Initialization
We’ll initialize the small and large heaps as empty lists using Python’s heapq module.
The small heap will be a max heap, and the large heap will be a min heap.
These heaps will ensure that we can efficiently access the maximum and minimum values in constant time.
import heapq
class MedianFinder:
def __init__(self):
self.small = [] # Max heap for the smaller half of elements
self.large = [] # Min heap for the larger half of elements
2. Adding Elements to the Heaps
The addNum method is responsible for adding an integer num to the data stream.
We need to ensure that the heaps remain balanced while doing so.
Here’s how it works:
- If the
largeheap is not empty andnumis greater than the smallest element in thelargeheap, we addnumto thelargeheap. - Otherwise, we add
-numto thesmallheap after negating it.
This negation ensures that the small heap effectively acts as a max heap for the original numbers.
- We then check if the size of the
smallheap is greater than the size of thelargeheap by more than one.
If this is the case, we need to rebalance the heaps.
- If the
smallheap has more elements, we pop the largest element from it (after negating it back to its original value) and push it into thelargeheap.
This balancing step ensures that the sizes of the two heaps remain roughly equal.
def addNum(self, num: int) -> None:
if self.large and num > self.large[0]:
heapq.heappush(self.large, num)
else:
heapq.heappush(self.small, -num)
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
3. Finding the Median
The findMedian method is responsible for calculating and returning the median of the elements added so far.
It handles both odd and even-length lists.
Here’s how it works:
- If the length of the
smallheap is greater than the length of thelargeheap, we return the maximum value from thesmallheap (after negating it to its original value).
This handles the case when there are an odd number of elements.
- If the length of the
largeheap is greater than the length of thesmallheap, we return the minimum value from thelargeheap. - If the lengths of both heaps are equal, we find the maximum from the
smallheap and the minimum from thelargeheap, add them together, and divide by 2 to calculate the median for even-length lists.
def findMedian(self) -> float:
if len(self.small) > len(self.large):
return -self.small[0]
elif len(self.large) > len(self.small):
return self.large[0]
return (-self.small[0] + self.large[0]) / 2.0
Now that we have a clear understanding of the solution, including initialization, adding elements, and finding the median, let’s explore the time and space complexity of this algorithm.
Find Median From Data Stream Leetcode Solution [Python Full Code]
class MedianFinder:
def __init__(self):
self.small = [] # Max heap for the smaller half of elements
self.large = [] # Min heap for the larger half of elements
def addNum(self, num: int) -> None:
if self.large and num > self.large[0]:
heapq.heappush(self.large, num)
else:
heapq.heappush(self.small, -num)
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
def addNum(self, num: int) -> None:
if self.large and num > self.large[0]:
heapq.heappush(self.large, num)
else:
heapq.heappush(self.small, -num)
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
def findMedian(self) -> float:
if len(self.small) > len(self.large):
return -self.small[0]
elif len(self.large) > len(self.small):
return self.large[0]
return (-self.small[0] + self.large[0]) / 2.0
Time and Space Complexity
Time Complexity
- Adding an element to the heaps (in the
addNummethod) takesO(log n)time, wherenis the number of elements in the heap. - Finding the median (in the
findMedianmethod) takesO(1)time, as we can access the maximum and minimum elements in the heaps in constant time.
The overall time complexity of the solution is O(log n) for adding elements and O(1) for finding the median.
Space Complexity
The space complexity of this solution is O(n), where n is the number of elements added to the data stream.
The heaps store all the elements.
Related Interview Questions By Company:
Related Interview Questions By Difficulty:
- K Closest Points To Origin LeetCode
- Reorganize String LeetCode
- Find The Kth Largest Integer In The Array LeetCode
Related Interview Questions By Category:
- Reverse String LeetCode
- Partition To K Equal Sum Subsets LeetCode
- Middle Of The Linked List LeetCode
Conclusion
In this blog post, we explored the Find Median From Data Stream problem on LeetCode.
We provided a Python solution that efficiently calculates the median of a data stream using two heaps (a max heap and a min heap).
The solution balances the heaps and handles both odd and even-length lists, providing an accurate median within the specified constraints.
I hope this explanation helps you understand and implement this solution.
If you have any questions or need further clarification, feel free to ask!