Skip to content

Latest commit

 

History

History
109 lines (51 loc) · 1.64 KB

0187._Repeated_DNA_Sequences.md

File metadata and controls

109 lines (51 loc) · 1.64 KB

187. Repeated DNA Sequences

难度: Medium

刷题内容

原题连接

内容描述


All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

解题方案

思路 1 - 时间复杂度: O(N)- 空间复杂度: O(N)******

就直接长度为10的窗口不停向右边移动呗,计算frequency

beast 22.97%

class Solution:
    def findRepeatedDnaSequences(self, s):
        """
        :type s: str
        :rtype: List[str]
        """
        if not s or len(s) < 10:
            return []
        
        lookup = collections.Counter()
        for i in range(10, len(s)+1):
            lookup[s[i-10:i]] += 1
        return [k for k, v in lookup.items() if v > 1]

其实不全部计算会稍微快一点, beats 94.59%

class Solution:
    def findRepeatedDnaSequences(self, s):
        """
        :type s: str
        :rtype: List[str]
        """
        if not s or len(s) < 10:
            return []

        res, lookup = set(), set()
        for i in range(10, len(s)+1):
            seq = s[i-10:i]
            if seq in lookup:
                res.add(seq)
            else:
                lookup.add(seq)
        return list(res)