Find Median From Data Stream Leetcode Problem 295 [Python Solution]
LeetCode problem #295, Find Median From Data Stream problem, shall we? We will provide a Python solution for this problem using two heaps (a max heap and a min heap).
This problem falls under the category of Heap and Priority Queue, and it is considered a hard problem on LeetCode.
Let’s start by understanding the problem and its constraints.
Problem Overview
The median is defined as the middle value in an ordered integer list.
If the size of the list is even, there is no single middle value, and the median is calculated as the mean of the two middle values.
For example, if we have an array arr = [2, 3, 4]
, the median is 3. If the array is arr = [2, 3]
, the median is (2 + 3) / 2, which equals 2.5. The task is to implement the MedianFinder
class, which has the following methods:
MedianFinder()
: Initializes the MedianFinder object.addNum(int num)
: Adds the integernum
from the data stream to the data structure.findMedian()
: Returns the median of all elements added so far.
The answer should be accurate within 10^-5 of the actual answer.
Example 1:
Input:
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output:
[null, null, null, 1.5, null, 2.0]
Explanation:
medianFinder = MedianFinder()
medianFinder.addNum(1) # arr = [1]
medianFinder.addNum(2) # arr = [1, 2]
medianFinder.findMedian() # Returns 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3) # arr = [1, 2, 3]
medianFinder.findMedian() # Returns 2.0
Understanding the Constraints
Before diving into the solution, let’s discuss the constraints:
- The value of
num
falls in the range -10^5 to 10^5. - There will be at least one element in the data structure before calling
findMedian
. - There can be a maximum of 5 * 10^4 calls to both
addNum
andfindMedian
.
Now that we have a clear understanding of the problem and its constraints, let’s explore the Python solution.
Find Median From Data Stream LeetCode Problem Solution
To solve this problem, we’ll use two heaps: a max heap (small) and a min heap (large).
We’ll maintain these two heaps to ensure that the elements are roughly balanced, and the median can be efficiently calculated.
The small
heap will store the smaller half of the numbers, while the large
heap will store the larger half.
Let’s break down the solution into several components:
1. Initialization
We’ll initialize the small
and large
heaps as empty lists using Python’s heapq
module.
The small
heap will be a max heap, and the large
heap will be a min heap.
These heaps will ensure that we can efficiently access the maximum and minimum values in constant time.
import heapq
class MedianFinder:
def __init__(self):
self.small = [] # Max heap for the smaller half of elements
self.large = [] # Min heap for the larger half of elements
2. Adding Elements to the Heaps
The addNum
method is responsible for adding an integer num
to the data stream.
We need to ensure that the heaps remain balanced while doing so.
Here’s how it works:
- If the
large
heap is not empty andnum
is greater than the smallest element in thelarge
heap, we addnum
to thelarge
heap. - Otherwise, we add
-num
to thesmall
heap after negating it.
This negation ensures that the small
heap effectively acts as a max heap for the original numbers.
- We then check if the size of the
small
heap is greater than the size of thelarge
heap by more than one.
If this is the case, we need to rebalance the heaps.
- If the
small
heap has more elements, we pop the largest element from it (after negating it back to its original value) and push it into thelarge
heap.
This balancing step ensures that the sizes of the two heaps remain roughly equal.
def addNum(self, num: int) -> None:
if self.large and num > self.large[0]:
heapq.heappush(self.large, num)
else:
heapq.heappush(self.small, -num)
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
3. Finding the Median
The findMedian
method is responsible for calculating and returning the median of the elements added so far.
It handles both odd and even-length lists.
Here’s how it works:
- If the length of the
small
heap is greater than the length of thelarge
heap, we return the maximum value from thesmall
heap (after negating it to its original value).
This handles the case when there are an odd number of elements.
- If the length of the
large
heap is greater than the length of thesmall
heap, we return the minimum value from thelarge
heap. - If the lengths of both heaps are equal, we find the maximum from the
small
heap and the minimum from thelarge
heap, add them together, and divide by 2 to calculate the median for even-length lists.
def findMedian(self) -> float:
if len(self.small) > len(self.large):
return -self.small[0]
elif len(self.large) > len(self.small):
return self.large[0]
return (-self.small[0] + self.large[0]) / 2.0
Now that we have a clear understanding of the solution, including initialization, adding elements, and finding the median, let’s explore the time and space complexity of this algorithm.
Find Median From Data Stream Leetcode Solution [Python Full Code]
class MedianFinder:
def __init__(self):
self.small = [] # Max heap for the smaller half of elements
self.large = [] # Min heap for the larger half of elements
def addNum(self, num: int) -> None:
if self.large and num > self.large[0]:
heapq.heappush(self.large, num)
else:
heapq.heappush(self.small, -num)
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
def addNum(self, num: int) -> None:
if self.large and num > self.large[0]:
heapq.heappush(self.large, num)
else:
heapq.heappush(self.small, -num)
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
def findMedian(self) -> float:
if len(self.small) > len(self.large):
return -self.small[0]
elif len(self.large) > len(self.small):
return self.large[0]
return (-self.small[0] + self.large[0]) / 2.0
Time and Space Complexity
Time Complexity
- Adding an element to the heaps (in the
addNum
method) takesO(log n)
time, wheren
is the number of elements in the heap. - Finding the median (in the
findMedian
method) takesO(1)
time, as we can access the maximum and minimum elements in the heaps in constant time.
The overall time complexity of the solution is O(log n)
for adding elements and O(1)
for finding the median.
Space Complexity
The space complexity of this solution is O(n)
, where n
is the number of elements added to the data stream.
The heaps store all the elements.
Related Interview Questions By Company:
Related Interview Questions By Difficulty:
- K Closest Points To Origin LeetCode
- Reorganize String LeetCode
- Find The Kth Largest Integer In The Array LeetCode
Related Interview Questions By Category:
- Reverse String LeetCode
- Partition To K Equal Sum Subsets LeetCode
- Middle Of The Linked List LeetCode
Conclusion
In this blog post, we explored the Find Median From Data Stream problem on LeetCode.
We provided a Python solution that efficiently calculates the median of a data stream using two heaps (a max heap and a min heap).
The solution balances the heaps and handles both odd and even-length lists, providing an accurate median within the specified constraints.
I hope this explanation helps you understand and implement this solution.
If you have any questions or need further clarification, feel free to ask!