难度: Medium
原题连接
内容描述
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Example:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC", "CCCCCAAAAA"]
思路 1 - 时间复杂度: O(N)- 空间复杂度: O(N)******
就直接长度为10的窗口不停向右边移动呗,计算frequency
beast 22.97%
class Solution:
def findRepeatedDnaSequences(self, s):
"""
:type s: str
:rtype: List[str]
"""
if not s or len(s) < 10:
return []
lookup = collections.Counter()
for i in range(10, len(s)+1):
lookup[s[i-10:i]] += 1
return [k for k, v in lookup.items() if v > 1]
其实不全部计算会稍微快一点, beats 94.59%
class Solution:
def findRepeatedDnaSequences(self, s):
"""
:type s: str
:rtype: List[str]
"""
if not s or len(s) < 10:
return []
res, lookup = set(), set()
for i in range(10, len(s)+1):
seq = s[i-10:i]
if seq in lookup:
res.add(seq)
else:
lookup.add(seq)
return list(res)