LeetCode 2405 - Optimal Partition of String

Open Table of contents

Problem
Explanation/Approach
- Greedy Approach:
Solution: Greedy Approach
Solution: Using Last Seen Indices

Problem

Given a string s, partition the string into one or more substrings such that the characters in each substring are unique. The goal is to minimize the number of substrings in such a partition.

Example 1:

Input: s = "abacaba"
Output: 4
Explanation:
Possible partitions are ("a","ba","cab","a") and ("ab","a","ca","ba"). Minimum number of substrings is 4.

Example 2:

Input: s = "ssssss"
Output: 6
Explanation:
The only valid partition is ("s","s","s","s","s","s").

Constraints:

1 <= s.length <= 10^5
s consists of only English lowercase letters.

Explanation/Approach

The problem can be approached using a greedy strategy. We need to traverse the string and create substrings, ensuring that each character appears only once in each substring.

Greedy Approach:

Initialization: Start with an empty substring and a set to track characters in the current substring.
Traversing the String: For each character in s, check if it is already in the current substring (using the set).
Creating New Substrings: If the character is already in the set, start a new substring and clear the set.
Counting Substrings: Keep track of the number of substrings formed during the traversal.

Solution: Greedy Approach

class Solution:
    def partitionString(self, s: str) -> int:
        # Set to store unique characters of the current substring
        unique_chars = set()
        # Count of substrings
        substring_count = 0

        for char in s:
            # If char is already in the current substring, start a new one
            # Increment the counter, this means a substring has been found
            if char in unique_chars:
                unique_chars.clear()
                substring_count += 1

            unique_chars.add(char)

        # Increment count for the last substring
        substring_count += 1

        return substring_count

Time and Memory Complexity

The time complexity of this solution is O(n), where n is the length of the string s. This is because we traverse the string once. The space complexity is O(1), as the set unique_chars will have at most 26 characters (all lowercase English letters).

Explanation

This approach uses a greedy strategy, where we continuously form substrings with unique characters. When we encounter a character that is already in the current substring, we start a new substring. The count of substrings is incremented each time we start a new one.

Visual Representation

Consider the input s = "abacaba".

Initialization:
- Substring count = 0
- Unique characters set = {}
Processing Each Character:
- Process ‘a’: Substring count = 1, Set = {a}
- Process ‘b’: Substring count = 1, Set = {a, b}
- Process ‘a’: Substring count = 2, Start new substring, Set = {a}
- Process ‘c’: Substring count = 2, Set = {a, c}
- Process ‘a’: Substring count = 3, Start new substring, Set = {a}
- Process ‘b’: Substring count = 3, Set = {a, b}
- Process ‘a’: Substring count = 4, Start new substring, Set = {a}
Final Substring Count:
- Total substrings = 4 (The last increment happens after the loop).

The process of creating and tracking substrings can be visualized as follows:

Solution: Using Last Seen Indices

class Solution:
  def partitionString(self, s):
    # Array to track each characters and the last index it was seen
    last_seen = [-1]*26
    count = 1
    # The index of current substring start
    sub_start = 0

    for i, char in enumerate(s):
      # If this char was seen at any index after `sub_start` it means it is a duplicate
      if last_seen[ord(char) - ord("a")] >= sub_start:
        count += 1
        # Update substring start index to the current index
        sub_start = i
      # Update last time the char has been seen
      last_seen[ord(char) - ord("a")] = i
    return count

Time and Memory Complexity

Time: O(n)
Space: O(1)

Explanation

This optimized approach keeps track of the last seen index of each character in the string. As we traverse the string, we update these indices. We also keep track of the farthest last seen index. When our current index in the string matches this farthest index, it signals the end of a substring with unique characters, prompting us to start a new substring.

Visual Representation

Consider the input s = "abacaba".

Tracking Last Seen Indices:
- Process ‘a’: last_seen[‘a’] = 0, sub_start = 0
- Process ‘b’: last_seen[‘b’] = 1, sub_start = 0
- Process ‘a’: last_seen[‘a’] = 2, sub_start = 2
- Process ‘c’: last_seen[‘c’] = 3, sub_start = 2
- Process ‘a’: last_seen[‘a’] = 4, sub_start = 4
- Process ‘b’: last_seen[‘b’] = 5, sub_start = 4
- Process ‘a’: last_seen[‘a’] = 6, sub_start = 6
Incrementing Substring Count:
- Increment count at indices 2, 4 and 6.
Final Substring Count:
- Total substrings = 4.