JAVA Regex - How do I exclude a certain email extension?

B Noonen :

My program sorts through emails to find the ones without the proper extension. I am experimenting with regex for this, and I can detect when an email has the extension or has no extension at all, but I cannot get the program to detect when a line has an extension that simply is not one of the specific ones I want to exclude.

I have tried using constructs such as ?! in the pattern, with no results. I do not have much experience with regex, so my attempts have been limited.

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Emails {
    public static void main(String args[]) throws IOException {
        Scanner scanner = new Scanner(new File("rajeev.dat"));

        ArrayList<String[]> lines = new ArrayList<>();

        Pattern regex = Pattern.compile("(?!^([A-Za-z0-9.]+(@Google.org)|[A-Za-z0-9.]+(@yahoo.net)))");
        Pattern findComma = Pattern.compile(",");

        while (scanner.hasNextLine()) {
            lines.add(scanner.nextLine().split(","));
        }

        for (String[] s : lines) {
            for (String s1 : s) {
                System.out.println(s1);
            }
            System.out.println();
        }

        String temp = "";
        String output = "";

        output += lines.get(0)[0] + ":" + lines.get(0)[1] + ":";

        for (int i = 2; i < lines.get(0).length; i++) {
            temp += lines.get(0)[i] + " ";
        }

        System.out.println(temp);

        Matcher match = regex.matcher(temp);
        String temp2 = "";
        boolean nofail = false;

        while (match.find()) {
            output += match.group().trim() + ":";
            nofail = true;
        }

        if (nofail) {
            System.out.println(output);
        }

    }
}

The program is expected to pick out any email whose extension is not @Google.org or @yahoo.net.

Instead, the program finds no matches.

The fourth bird :

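You could use a negative lookahead, (?!Google\.org|Yahoo\.net), to assert that what is directly to the right of the @ is neither Google.org nor Yahoo.net. Remember to escape the dot so that it matches literally. The match is also case-sensitive by default, so write the domains in the same case as they appear in your data.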
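To see the lookahead on its own, here is a minimal sketch that only checks what follows the @ and does not validate the rest of the address. The class name LookaheadDemo, the method hasOtherDomain and the sample addresses are made up for illustration.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // Matches an '@' only when it is NOT immediately followed by an excluded domain.
    // The dots are escaped so they match a literal '.' rather than any character.
    private static final Pattern NOT_EXCLUDED =
            Pattern.compile("@(?!Google\\.org|Yahoo\\.net)");

    // Hypothetical helper: true when the address contains an '@' that is
    // followed by something other than the excluded domains.
    static boolean hasOtherDomain(String email) {
        return NOT_EXCLUDED.matcher(email).find();
    }

    public static void main(String[] args) {
        System.out.println(hasOtherDomain("alice@Google.org"));  // false
        System.out.println(hasOtherDomain("bob@Yahoo.net"));     // false
        System.out.println(hasOtherDomain("carol@example.com")); // true
    }
}
```

If the data mixes cases (for example @yahoo.net and @Yahoo.net), passing Pattern.CASE_INSENSITIVE as the second argument to Pattern.compile is one way to treat them alike.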

If the only characters you want to allow are listed in your character class [A-Za-z0-9.], you might use a regex which first matches the character class without the dot using [A-Za-z0-9]+

Then repeat, zero or more times, a group that starts with a dot, using (?:\.[A-Za-z0-9]+)*, so that the email cannot start or end with a dot.

Note that you can extend the character classes to allow more characters.

^[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*@(?!Google\.org|Yahoo\.net)[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*\.\w+$

In Java:

String regex = "^[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)*@(?!Google\\.org|Yahoo\\.net)[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)*\\.\\w+$";
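As a rough sketch of how such a pattern could be applied to one comma-separated line like those in the question, the example below tests each field separately with matches(). The class name EmailFilter, the method withOtherDomain and the sample line are assumptions for illustration, not part of the original program. The repeated dot group is written here as (?:\.[A-Za-z0-9]+)* so that labels longer than one character, such as the doe in jane.doe, are accepted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class EmailFilter {
    // Local part, '@', a negative lookahead rejecting the two excluded domains,
    // then a dotted domain name ending in a final ".word" segment.
    private static final Pattern OTHER_DOMAIN = Pattern.compile(
            "^[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)*@(?!Google\\.org|Yahoo\\.net)"
            + "[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)*\\.\\w+$");

    // Hypothetical helper: returns the fields of a comma-separated line that
    // look like emails whose domain is not one of the excluded ones.
    static List<String> withOtherDomain(String line) {
        List<String> result = new ArrayList<>();
        for (String field : line.split(",")) {
            String candidate = field.trim();
            if (OTHER_DOMAIN.matcher(candidate).matches()) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data in the same comma-separated shape as the question.
        String line = "Jane,Doe,jane.doe@Google.org,jd@example.com,"
                + "j.d@Yahoo.net,jane@mail.example.org";
        System.out.println(withOtherDomain(line));
        // prints [jd@example.com, jane@mail.example.org]
    }
}
```

Testing each field on its own avoids having the ^ and $ anchors apply to a whole space-joined line, where they would only ever match if the line held a single address.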
