472. Concatenated Words

难度: Hard

刷题内容

原题连接

https://leetcode.com/problems/concatenated-words/

内容描述

Given a list of words (without duplicates), please write a program that returns all concatenated words in the given list of words.
A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array.

Example:
Input: ["cat","cats","catsdogcats","dog","dogcatsdog","hippopotamuses","rat","ratcatdogcat"]

Output: ["catsdogcats","dogcatsdog","ratcatdogcat"]

Explanation: "catsdogcats" can be concatenated by "cats", "dog" and "cats"; 
 "dogcatsdog" can be concatenated by "dog", "cats" and "dog"; 
"ratcatdogcat" can be concatenated by "rat", "cat", "dog" and "cat".
Note:
The number of elements of the given array will not exceed 10,000
The length sum of elements in the given array will not exceed 600,000.
All the input string will only include lower case letters.
The returned elements order does not matter.

解题方案

思路 1 - 时间复杂度: O(words_num * word_len)- 空间复杂度: O(words_num * word_len)******

Trie 树，参考GraceMeng

A concatenated word is a word that satisfies:
  - exist in words;
  - is combination of at least 2 words in words;

In other words, a concatenated word is a word add other word(words) as prefix. 
That's natual to prefix tree(trie). 
We can build a trie using words and search for concatenated words in the trie. 

We have to make a decision when we meet a node that meets the end of a word (with en in the example below). We can
  - either take the current node's associated word as prefix (and restart at root for another word) 
  - or not take the  current node's associated word as prefix (i.e. move further within the same branch).

beats 15.24%

class TrieNode(object):
    
    def __init__(self):
        self.children = dict()
        self.isWordEnd = False
        
        
class Solution:
    def findAllConcatenatedWordsInADict(self, words):
        """
        :type words: List[str]
        :rtype: List[str]
        """
        if not words or len(words) == 0:
            return []
        res = []
        self.root = TrieNode()
        self.buildTrie(words)
        for word in words:
            if self.isConcatenation(0, word, 0):
                res.append(word)
        return res
    
    def isConcatenation(self, idx, word, count): # idx代表当前正在比较word的第几个char
        ptr = self.root
        for i in range(idx, len(word)):
            if word[i] not in ptr.children: # 如果当前word不存在，肯定不行
                return False
            ptr = ptr.children[word[i]]
            if ptr.isWordEnd:
                if i == len(word) - 1: # 如果比较到了word的最后一个字符，那么看看是否是由2个以上的word组成的
                    return count >= 1
                if self.isConcatenation(i+1, word, count+1): # 递归判断
                    return True
        return False
    
    def buildTrie(self, words):
        ptr = TrieNode()
        for word in words:
            ptr = self.root
            for c in word:
                if c not in ptr.children:
                    ptr.children[c] = TrieNode()
                ptr = ptr.children[c]
            ptr.isWordEnd = True

思路 2 - 时间复杂度: O(words_num * word_len^2)- 空间复杂度: O(words_num * word_len)******

将这道题转化为第139题的解法，一个word肯定只能由比它更短的word组成，因此我们先按照word长度将words排个序

然后遍历，看当前的word是否能由它之前的words组成，这个地方就是第139题的解法了，但是稍微有一点点区别，那就是我们这里如果当前word必须由两个及两个以上word组成才可以，也就是说如果当前word的长度小于2的话，肯定不能由两个及两个以上word组成

beats 8.57%

class Solution:
    def findAllConcatenatedWordsInADict(self, words):
        """
        :type words: List[str]
        :rtype: List[str]
        """
        if not words or len(words) == 0:
            return []
        res = []
        words.sort(key = lambda s: len(s))
        pre_words_lookup = set()
        for i in range(len(words)):
            if self.canConcatenated(words[i], pre_words_lookup):
                res.append(words[i])
            pre_words_lookup.add(words[i])
        return res
    
    def canConcatenated(self, s, wordDict):
        if len(s) <= 1:
            return False
        ok = [True]
        for i in range(1, len(s)+1):
            ok += [any(ok[j] and s[j:i] in wordDict for j in range(i))]
        return ok[-1]

思路 3 - 时间复杂度: O(words_num * word_len^2)- 空间复杂度: O(max(words_num, word_len))******

Let's discuss whether a word should be included in our answer.
Consider the word as a topologically sorted directed graph, where each node is a letter, 
and an edge exists from i to j if word[i:j] is in our wordlist, 
[and there is no edge from i=0 to j=len(word)-1]. 
We want to know if there is a path from beginning to end. 
If there is, then the word can be broken into concatenated parts from our wordlist.
To answer this question, we DFS over this graph.

参考awice

beats 57.27%

class Solution:
    def findAllConcatenatedWordsInADict(self, words):
        """
        :type words: List[str]
        :rtype: List[str]
        """
        S = set(words)
        res = []
        for word in words:
            if not word:
                continue
            stack = [0]
            seen = {0}
            while stack:
                idx = stack.pop()
                if idx == len(word):
                    res.append(word)
                    break
                for j in range(len(word) - idx + 1):
                    if (word[idx:idx + j] in S and
                                    idx + j not in seen and
                            not (idx == 0 and idx + j == len(word))):
                        stack.append(idx + j)
                        seen.add(idx + j)
        return res

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!