Algorithm: Grouping anagrams

49. Group Anagrams

Given an array of strings strs, group the anagrams together. You can return the answer in any order.
An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Example 1:

Input: strs = ["eat","tea","tan","ate","nat","bat"]
Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Example 2:

Input: strs = [""]
Output: [[""]]

Example 3:

Input: strs = ["a"]
Output: [["a"]]
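
Because the answer may be returned in any order, comparing a result against the expected outputs above needs an order-insensitive check. The following is a minimal sketch of one way to do that; the helper name normalize_groups is hypothetical and not part of the original problem or solutions.

```python
from typing import Iterable, List


def normalize_groups(groups: Iterable[Iterable[str]]) -> List[List[str]]:
    """Sort the strings inside each group, then sort the groups themselves."""
    return sorted(sorted(group) for group in groups)


expected = [["bat"], ["nat", "tan"], ["ate", "eat", "tea"]]
candidate = [["tea", "eat", "ate"], ["tan", "nat"], ["bat"]]

# Same grouping in a different order: the normalized forms are equal.
print(normalize_groups(expected) == normalize_groups(candidate))  # True
```

The helper accepts any iterable of groups, so it also works on the values of a dictionary.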

Constraints:

  • 1 <= strs.length <= 10^4
  • 0 <= strs[i].length <= 100
  • strs[i] consists of lowercase English letters.

1. Counting solution: encode each string as a tuple of letter counts and group by that tuple

The goal is to group a set of strings by anagram. Anagrams are words made up of the same letters in a different order. For example, "eat", "tea", and "ate" form one group of anagrams.


import collections
from typing import List

class Solution:
    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        # Map each key to the list of strings that share it
        ans = collections.defaultdict(list)

        # Walk through the input strings
        for s in strs:
            # 26 counters, one per lowercase English letter
            count = [0] * 26

            # Tally each character of the string
            for c in s:
                # ord(c) - ord('a') is the letter's 0-based position in the alphabet
                count[ord(c) - ord('a')] += 1

            # Lists are not hashable, so convert the counts to a tuple to use
            # as the dictionary key, then append the string to that key's group
            ans[tuple(count)].append(s)

        # Return the groups as a list, matching the declared return type
        return list(ans.values())

Complexity analysis:

  1. Time complexity: O(NK), where N is the number of strings and K is the maximum string length. Each string is scanned once to count its letters, and building the 26-element key adds only a constant amount of work per string.

  2. Space complexity: O(NK), mainly used to store the output.

The core idea of this code is to represent each string as a fixed-length list of counts, where each element is the number of times the corresponding letter occurs in the string. Because anagrams contain the same letters the same number of times, they map to the same key in the dictionary. In the end, each value in the dictionary is one group of anagrams.
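
To see the key in isolation, the sketch below extracts the counting step into a standalone function. The name count_key is hypothetical; it is a refactoring for illustration, assuming lowercase English input as the constraints state.

```python
from typing import Tuple


def count_key(s: str) -> Tuple[int, ...]:
    """Return a 26-slot tuple of letter counts for a lowercase string."""
    count = [0] * 26
    for c in s:
        count[ord(c) - ord('a')] += 1
    return tuple(count)


print(count_key("eat") == count_key("tea"))  # True: anagrams share a key
print(count_key("eat") == count_key("bat"))  # False: different letters
print(count_key("") == (0,) * 26)            # True: empty string is all zeros
```

The empty string gets a valid key of 26 zeros, which is why Example 2 produces a single group containing "".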

In Python, ord() is a built-in function that returns the Unicode code point of a character, that is, the integer that represents the character in the Unicode character set. Its argument is a string containing exactly one Unicode character.

For example, ord('a') returns 97 because the character 'a' has code point 97 in Unicode. Likewise, ord('b') returns 98, ord('1') returns 49, and so on.
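
These values can be checked directly in an interpreter. The short sketch below also shows chr(), the built-in inverse of ord(), which the original code does not use; the variable name code_points is only for illustration.

```python
# Collect the code points mentioned in the text.
code_points = {ch: ord(ch) for ch in ("a", "b", "1")}
print(code_points)  # {'a': 97, 'b': 98, '1': 49}

# chr() maps a code point back to its character.
print(chr(97))  # a

# ord() raises TypeError for anything that is not exactly one character.
try:
    ord("ab")
except TypeError as exc:
    print("TypeError:", exc)
```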

In the given code, ord(c) - ord('a') computes the position of the character c in the alphabet. For example, if c is 'a', then ord('a') - ord('a') is 0, indicating that 'a' is the first letter of the alphabet. If c is 'b', then ord('b') - ord('a') is 1, indicating that 'b' is the second letter, and so on.

This value determines which slot of the count list is incremented for character c. For example, if c is 'a', then count[0] is incremented by 1; if c is 'b', then count[1] is incremented by 1, and so on.
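
The index arithmetic relies on the constraint that every character is a lowercase English letter. The sketch below makes that assumption explicit with a range check, which the original code does not perform; the function name letter_index is hypothetical.

```python
def letter_index(c: str) -> int:
    """Map 'a'..'z' to 0..25 and reject anything else."""
    index = ord(c) - ord('a')
    if not 0 <= index < 26:
        raise ValueError(f"expected a lowercase English letter, got {c!r}")
    return index


count = [0] * 26
for c in "tea":
    count[letter_index(c)] += 1

# Non-zero slots, reported as (letter, count) pairs.
print([(chr(i + ord('a')), n) for i, n in enumerate(count) if n])
# [('a', 1), ('e', 1), ('t', 1)]
```

Without the check, a character such as 'Z' would yield a negative index (-7), which Python treats as counting from the end of the list, so the wrong slot would be incremented silently.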

2. Sorting solution

This approach also groups the strings in strs by anagram, but it uses each string's sorted characters as the key. Anagrams are made up of the same letters in a different order, so sorting them yields the same sequence. For example, "eat", "tea", and "ate" all sort to "aet".

import collections
from typing import List

class Solution:
    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        # Map each key to the list of strings that share it;
        # this dictionary holds the grouped anagrams
        ans = collections.defaultdict(list)

        # Walk through the input strings
        for s in strs:
            # Sort the characters of the string and convert the result to a tuple.
            # Anagrams produce the same tuple, so they land in the same group.
            ans[tuple(sorted(s))].append(s)

        # Return the groups as a list, matching the declared return type
        return list(ans.values())

Complexity analysis:

  1. Time complexity: O(NK log K), where N is the number of strings and K is the maximum string length. Each of the N strings is visited once, and sorting the characters of a string of length K takes O(K log K).

  2. Space complexity: O(NK), mainly used to store output results.

The core idea of this code is that anagrams become equal once their characters are sorted. The sorted characters of each string, stored as a tuple, serve as the dictionary key, and the original string is appended to that key's list. In the end, each value in the dictionary is one group of anagrams.
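
The sorted key can be inspected on its own. In the sketch below, the function name sorted_key is hypothetical. The joined-string variant at the end is an alternative key that is also hashable; it is not what the code above uses.

```python
from typing import Tuple


def sorted_key(s: str) -> Tuple[str, ...]:
    """Return the characters of s in sorted order as a hashable tuple."""
    return tuple(sorted(s))


print(sorted_key("eat"))                       # ('a', 'e', 't')
print(sorted_key("tea") == sorted_key("ate"))  # True
print(sorted_key("tan") == sorted_key("bat"))  # False

# Alternative key: join the sorted characters back into a string.
print("".join(sorted("tea")))                  # aet
```

Either form works as a dictionary key because both tuples and strings are immutable and hashable, whereas the list returned by sorted() is not.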


Origin blog.csdn.net/zgpeace/article/details/133410578