All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within it.
Write a function to find all 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Example:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC", "CCCCCAAAAA"]
Source: LeetCode (力扣)
Link: https://leetcode-cn.com/problems/repeated-dna-sequences
Copyright belongs to LeetCode (领扣网络). For commercial reprints, please contact the official site for authorization; for non-commercial reprints, please cite the source.
Algorithm 1: brute-force enumeration
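The brute-force idea can be sketched roughly as follows: take each 10-letter window and scan every later window for an exact match. This is only a minimal illustration, not code from the original post; the function name findRepeatedBruteForce is hypothetical, and the check against already reported substrings is an assumption added so that each repeat appears once in the result.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: compares every 10-letter window with every later window.
std::vector<std::string> findRepeatedBruteForce(const std::string& s) {
    const std::size_t len = 10;
    std::vector<std::string> res;
    for (std::size_t i = 0; i + len <= s.size(); ++i) {
        std::string cur = s.substr(i, len);
        // Skip windows that were already reported as repeats.
        if (std::find(res.begin(), res.end(), cur) != res.end()) continue;
        // Second loop: look for the same 10 letters at a later position.
        for (std::size_t j = i + 1; j + len <= s.size(); ++j) {
            if (s.compare(j, len, cur) == 0) {
                res.push_back(cur);
                break;
            }
        }
    }
    return res;
}
```

With n the length of s, the two nested loops examine on the order of n² window pairs, which is the cost the hash table approach avoids.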
Algorithm 2: a hash table replaces the inner loop of the brute-force search. Store every 10-letter substring seen so far together with its count, so a repeat can be detected with a single lookup.
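#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        unordered_map<string, int> hash;  // substring -> number of times seen so far
        vector<string> res;
        // Slide a window of length 10 over s.
        for (size_t i = 0; i + 10 <= s.size(); ++i) {
            string now = s.substr(i, 10);
            // Report a substring exactly once: when it is seen for the second time.
            if (hash[now] == 1) res.push_back(now);
            hash[now]++;
        }
        return res;
    }
};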