Problem Statement : Given two Strings let’s say S and P, return the array of all start indices of P’s anagram in S.
An Anagram is a word or phrase or name formed by rearranging the letters of another word or phrase or name.
Note: S and P contains only Lower Case Letters
Examples:
Example 1: Input: S = “abcdefcbagbac”, P = “abc” Output: [0,6,10] Explanation: substrings with starting indexes 0,6,10 and having length equal to length of P are “abc” , “cba”, “bac”. These substrings can be formed by arranging the characters in string P. Hence they are the anagrams of string P. Example 2: Input: S = “tufuftfut”, P = “tuf” Output: [0,3,5,6] Explanation: substrings with starting indexes 0,3,5,6 and having length equal to length of P are “tuf”, “uft”, ,”tfu”,”fut”. These substrings can be formed by arranging the characters in string P. Hence they are the anagrams of string P.
Disclaimer: Don’t jump directly to the solution, try it out yourself first.
Solution 1: Sorting
Let’s take the first example, string P is “abc”, Anagrams of string P in S are “abc”, ”cab”, “bac”. Let’s sort the string P and its Anagrams. String P becomes “abc” and Anagrams becomes “abc”, “abc”,”abc”. We can observe that Upon sorting String P becomes equal to Anagrams.
So if the sorted version of any substring in S is equal to Sorted String P . Then that particular substring is an Anagram of P. As there are the same characters in Anagrams and String P, sorting them yields the same result.
Code :
C++ Code
#include <bits/stdc++.h>
using namespace std;
vector<int> FindAnagrams(string &s, string &p)
{
vector<int> ans;
int n = s.length(), m = p.length();
// sortedP stores the sorted version of string P
string sortedp = p;
sort(sortedp.begin(), sortedp.end());
string temp;
for (int i = 0; i <= (n - m); i++)
{
temp = "";
// temp stores the every substring of length m
for (int k = i; k < m + i; k++)
temp.push_back(s[k]);
sort(temp.begin(), temp.end());
// checking sorted version of substring equal to sorted P are not
if (temp == sortedp)
ans.push_back(i);
}
return ans;
}
int main()
{
string s = "tufuftfut", p = "tuf";
vector<int> arr = FindAnagrams(s, p);
if (arr.empty())
cout << "There are No Anagrams of " << p << " in " << s << endl;
else
{
cout << "starting Indices of Anagrams are " << endl;
for (auto &val : arr)
cout << val << " ";
}
return 0;
}
Output:
starting Indices of Anagrams are
0 3 5 6
Time Complexity: O(mlogm) + O( (n-m+1)*(m+mlogm+m) )
mlogm for sorting string P and the for loop runs for n-m+1 times in each iteration we building string temp, which takes O(m) time and sorting temp, which takes O(mlogm) time and comparing sorted strings which takes O(m). So time complexity is O( (n-m+1)*(m+mlogm+m) )
In General Time Complexity is: O(n*m)
Space Complexity: O(m) we are using string temp and sortedp
Java Code
import java.util.*;
class TUF{
static ArrayList<Integer> FindAnagrams(String s, String p)
{
ArrayList<Integer>ans =new ArrayList<>();
int n = s.length(), m = p.length();
// sortedP stores the sorted version of string P
String sortedp = p;
char str[] = sortedp.toCharArray();
Arrays.sort(str);
sortedp = new String(str);
String temp;
for (int i = 0; i <= (n - m); i++)
{
temp = "";
// temp stores the every substring of length m
for (int k = i; k < m + i; k++)
temp+=s.charAt(k);
char str2[] = temp.toCharArray();
Arrays.sort(str2);
temp=new String(str2);
// checking sorted version of substring equal to sorted P are not
if (temp.equals(sortedp))
ans.add(i);
}
return ans;
}
public static void main(String args[])
{
String s = "tufuftfut", p = "tuf";
ArrayList<Integer> arr = FindAnagrams(s, p);
if (arr.isEmpty())
System.out.println("There are No Anagrams of "+p+" in "+s);
else
{
System.out.println("starting Indices of Anagrams are ");
for (int val : arr)
System.out.print(val+" ");
}
}
}
Output:
starting Indices of Anagrams are
0 3 5 6
Time Complexity: O(mlogm) + O( (n-m+1)*(m+mlogm+m) )
mlogm for sorting string P and the for loop runs for n-m+1 times in each iteration we building string temp, which takes O(m) time and sorting temp, which takes O(mlogm) time and comparing sorted strings which takes O(m). So time complexity is O( (n-m+1)*(m+mlogm+m) )
In General Time Complexity is: O(n*m)
Space Complexity: O(m) we are using string temp and sortedp
Solution2: Hashing + Sliding Window
Intuition:
By Definition, we know that string and its Anagrams contain the same characters with equal frequency because anagrams are formed by rearranging characters in the string.
So if we consider every substring of length equal to string P in S and check whether there are same characters with the same frequency that of characters in P then that particular substring is an Anagram of string P.
Approach:
Maintain two frequency arrays one for the string P and the other for the substring in S. IF frequency arrays are equal for a substring, then the substring is an Anagram of string P. For finding the substrings of length equal to string P, Maintain a window of size equal to the length of string P. In each iteration add a new character to the window and remove the first character in window. This makes sure that window length remains constant in every iteration.
Code:
C++ Code
#include <bits/stdc++.h>
using namespace std;
vector<int> FindAnagrams(string &s, string &p)
{
vector<int> ans;
// Frequency arrays
vector<int> arr1(26, 0), arr2(26, 0);
int n = s.length(), m = p.length();
if (m > n)
return ans;
// First window
for (int i = 0; i < m; i++)
arr1[p[i] - 'a']++, arr2[s[i] - 'a']++;
if (arr1 == arr2)
ans.push_back(0);
// subsequent windows
for (int i = m; i < n; i++)
{
arr2[s[i] - 'a']++;
arr2[s[i - m] - 'a']--;
if (arr1 == arr2)
ans.push_back(i - m + 1);
}
return ans;
}
int main()
{
string s = "tufuftfut", p = "tuf";
vector<int> arr = FindAnagrams(s, p);
if (arr.empty())
cout << "There are No Anagrams of " << p << " in " << s << endl;
else
{
cout << "starting Indices of Anagrams are " << endl;
for (auto &val : arr)
cout << val << " ";
}
return 0;
}
Output :
starting Indices of Anagrams are
0 3 5 6
Time Complexity: O((n-m+1)*26)
For loop is running for n-m+1 times and with each iteration, we are comparing frequency arrays.
In General O(n)
Space Complexity: O(1) Sizes of frequency arrays are taken as constant as they have fixed size.
Java Code
import java.util.*;
class TUF{
static ArrayList<Integer> FindAnagrams(String s, String p)
{
ArrayList<Integer> ans=new ArrayList<>();
// Frequency arrays
int arr1[]=new int[26];
int arr2[]=new int[26];
int n = s.length(), m = p.length();
if (m > n)
return ans;
// First window
for (int i = 0; i < m; i++)
{
arr1[p.charAt(i) - 'a']++;
arr2[s.charAt(i) - 'a']++;
}
if (Arrays.equals(arr1,arr2))
ans.add(0);
// subsequent windows
for (int i = m; i < n; i++)
{
arr2[s.charAt(i) - 'a']++;
arr2[s.charAt(i - m) - 'a']--;
if (Arrays.equals(arr1,arr2))
ans.add(i - m + 1);
}
return ans;
}
public static void main(String args[])
{
String s = "tufuftfut", p = "tuf";
ArrayList<Integer> arr = FindAnagrams(s, p);
if (arr.isEmpty())
System.out.println("There are No Anagrams of "+p+" in "+ s);
else
{
System.out.println("starting Indices of Anagrams are ");
for (int val : arr)
System.out.print(val+" ");
}
}
}
Output :
starting Indices of Anagrams are
0 3 5 6
Time Complexity: O((n-m+1)*26)
For loop is running for n-m+1 times and with each iteration, we are comparing frequency arrays.
In General O(n)
Space Complexity: O(1) Sizes of frequency arrays are taken as constant as they have fixed size.
Special thanks to SaiSri Angajala for contributing to this article on takeUforward. If you also wish to share your knowledge with the takeUforward fam, please check out this article