Count the semiprime numbers in the given range [a..b]

oleg.cherednik :

I am solving the Codility problem CountSemiprimes: Count the semiprime numbers in the given range [a..b].

Task description

A prime is a positive integer X that has exactly two distinct divisors: 1 and X. The first few prime integers are 2, 3, 5, 7, 11 and 13.

A semiprime is a natural number that is the product of two (not necessarily distinct) prime numbers. The first few semiprimes are 4, 6, 9, 10, 14, 15, 21, 22, 25, 26.

You are given two non-empty arrays P and Q, each consisting of M integers. These arrays represent queries about the number of semiprimes within specified ranges.

Query K requires you to find the number of semiprimes within the range (P[K], Q[K]), where 1 ≤ P[K] ≤ Q[K] ≤ N.

Write an efficient algorithm for the following assumptions:

  • N is an integer within the range [1..50,000];
  • M is an integer within the range [1..30,000];
  • each element of arrays P, Q is an integer within the range [1..N]; P[i] ≤ Q[i].

My solution

My current score is 66%, and the problem is performance on large data sets:

  • large random, length = ~30,000
  • all max ranges

The test report says it should take about 2 sec, but my solution takes over 7 sec.

This is my current solution

import java.util.ArrayList;
import java.util.List;

class Solution {
    private static List<Integer> getPrimes(int max) {
        List<Integer> primes = new ArrayList<>(max / 2);

        for (int i = 0; i < max; i++)
            if (isPrime(i))
                primes.add(i);

        return primes;
    }

    private static boolean isPrime(int val) {
        if (val <= 1)
            return false;
        if (val <= 3)
            return true;

        for (int i = 2, sqrt = (int)Math.sqrt(val); i <= sqrt; i++)
            if (val % i == 0)
                return false;

        return true;
    }

    private static boolean[] getSemiPrimes(int N) {
        List<Integer> primes = getPrimes(N);
        boolean[] semiPrimes = new boolean[N + 1];

        for (int i = 0; i < primes.size(); i++) {
            if (primes.get(i) > N)
                break;

            for (int j = i; j < primes.size(); j++) {
                if (primes.get(j) > N || N / primes.get(i) < primes.get(j))
                    break;

                int semiPrime = primes.get(i) * primes.get(j);

                if (semiPrime <= N)
                    semiPrimes[semiPrime] = true;
            }
        }

        return semiPrimes;
    }

    public static int[] solution(int N, int[] P, int[] Q) {
        boolean[] semiPrimes = getSemiPrimes(N);
        int[] res = new int[P.length];

        for (int i = 0; i < res.length; i++)
            for (int j = P[i]; j <= Q[i]; j++)
                if (semiPrimes[j])
                    res[i]++;

        return res;
    }
}

Any ideas about improving performance? My last change was to replace the Set holding the semiprimes with an array. It helped me pass a couple of the performance tests.

Paul Hankin :

You can precompute an array A of size N+1, which stores at A[i] the number of semiprimes less than or equal to i. Then a query p, q can be computed immediately: the number of semiprimes between p and q (inclusive) is A[q] - A[p-1].
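As a minimal sketch of the query step only, assuming the semiprimes have already been marked in a boolean array (the class and method names here are illustrative, not part of the Codility API), the cumulative array and the A[q] - A[p-1] lookup could look like this:

```java
class SemiprimePrefixSums {
    // Builds a where a[i] = number of marked values in [0..i].
    static int[] prefixCounts(boolean[] isSemiprime) {
        int[] a = new int[isSemiprime.length];
        int running = 0;
        for (int i = 0; i < isSemiprime.length; i++) {
            if (isSemiprime[i]) running++;
            a[i] = running;
        }
        return a;
    }

    // Number of semiprimes in [p..q] inclusive, assuming 1 <= p <= q < a.length.
    static int countInRange(int[] a, int p, int q) {
        return a[q] - a[p - 1];
    }

    public static void main(String[] args) {
        // Semiprimes up to 26, marked by hand from the list in the task description.
        boolean[] isSemiprime = new boolean[27];
        for (int s : new int[] {4, 6, 9, 10, 14, 15, 21, 22, 25, 26}) isSemiprime[s] = true;

        int[] a = prefixCounts(isSemiprime);
        System.out.println(countInRange(a, 1, 26));  // 10
        System.out.println(countInRange(a, 4, 10));  // 4
        System.out.println(countInRange(a, 16, 20)); // 0
    }
}
```

Each query is then a single subtraction, so M queries cost O(M) after the one-time precomputation, instead of rescanning the range for every query as in the question's solution method.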

This array can be computed efficiently: let P be an array of primes less than or equal to N/2. Then (in java-like pseudocode):

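int[] A = new int[N+1];
for (int p : P) {
  for (int q : P) {
      // P is sorted ascending, so once q > p or p*q > N no later q can qualify
      if (p*q > N || q > p) break;
      A[p*q] = 1;   // mark p*q as a semiprime
  }
}

// cumulative sum: A[i] becomes the number of semiprimes <= i
for (int i = 1; i <= N; i++)
    A[i] += A[i-1];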

This works by marking the semiprimes with a 1 in the array and then taking a cumulative sum. It runs in better than O(N^2) and worse than O(N) time: there are about N/(2 log N) primes in P, so the first part is O((N/log N)^2), and the summing-up is O(N). [Note: I guess the first part has better complexity than O((N/log N)^2) because of the early termination of the inner loop, but I've not proved that.] Computing the primes in P is O(N log log N) using the sieve of Eratosthenes.
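For the precomputation itself, here is a rough Java sketch that sieves the primes up to N/2, marks the semiprimes, and takes the cumulative sum. The class name `SemiprimeTable` and its method names are made up for illustration, and this is not the Python program that was timed:

```java
import java.util.ArrayList;
import java.util.List;

class SemiprimeTable {
    // Sieve of Eratosthenes: all primes <= limit, in ascending order.
    static List<Integer> primesUpTo(int limit) {
        List<Integer> primes = new ArrayList<>();
        if (limit < 2) return primes;
        boolean[] composite = new boolean[limit + 1];
        for (int i = 2; i <= limit; i++) {
            if (composite[i]) continue;
            primes.add(i);
            // Start at i*i: smaller multiples were already crossed out by smaller primes.
            for (long j = (long) i * i; j <= limit; j += i) composite[(int) j] = true;
        }
        return primes;
    }

    // Returns a where a[i] = number of semiprimes <= i, for 0 <= i <= n.
    static int[] cumulativeSemiprimes(int n) {
        List<Integer> primes = primesUpTo(n / 2);
        int[] a = new int[n + 1];
        for (int p : primes) {
            for (int q : primes) {
                // The long product is only a guard for n larger than the task allows.
                if (q > p || (long) p * q > n) break;
                a[p * q] = 1; // mark p*q as a semiprime
            }
        }
        for (int i = 1; i <= n; i++) a[i] += a[i - 1];
        return a;
    }

    public static void main(String[] args) {
        int[] a = cumulativeSemiprimes(26);
        System.out.println(a[26]);        // 10
        System.out.println(a[10] - a[3]); // 4
    }
}
```

Restricting the inner loop to q <= p visits each unordered pair of primes once, and the break on p*q > n is safe only because the sieve returns the primes in ascending order.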

A Python version of this program takes 0.07s to precompute A for N=50000 and to perform 30000 queries. It gets a perfect score (100) when run on Codility, and Codility reports that it detects the code to have complexity O(N log(log(N)) + M).
