Regex or Functions for validating a file?

Liger :

I need to validate the schema of a text file that holds the following data:

50 entries in the following format:

A serial number from 1 to 50, followed by a tab, followed by a random number n with 100 <= n <= 500

e.g. 1 <tab> 256

Since a regular expression makes it easier to check the schema of the file and is more maintainable, I would prefer to use a regex rather than a class that parses and validates each string.

The output should look like this:

Line 1 formatted correctly
Invalid format on line 2 (51 1000) + (Error message that can be set using a custom exception class)

My question is: is regex powerful enough to give me the desired output, i.e. can I raise an exception that carries the appropriate error message?

My attempt is below:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class TestOutput {

    private static final int MAX_LINES_TO_READ = 50;

    // Placeholder: the actual expression is what this question asks for
    private static final String REGEX = "RAWREGEX";

    public void testFile(String fileName) {
        int lineCounter = 1;

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line = br.readLine();

            while ((line != null) && (lineCounter <= MAX_LINES_TO_READ)) {
                // Validate that the line is formatted correctly based on the regular expression
                if (line.matches(REGEX)) {
                    System.out.println("Line " + lineCounter + " formatted correctly");
                } else {
                    System.out.println("Invalid format on line " + lineCounter + " (" + line + ")");
                }

                line = br.readLine();
                lineCounter++;
            }
        } catch (IOException ex) {
            System.out.println("Exception occurred: " + ex);
        }
    }

    public static void main(String[] args) {
        TestOutput vtf = new TestOutput();
        vtf.testFile("transactions.txt");
    }
}

Here are my questions:

  1. What should the optimal design look like (regex or not)?
  2. If regex, which expression should I use?

Bohemian :

Use this regex:

String REGEX = "([1-9]|[1-4]\\d|50)\\t([1-4]\\d\\d|500)";


To explain...

[1-9]|[1-4]\\d|50 means “any number 1-50”, built from three alternatives: 1-9, 10-49 and 50.

Similarly, [1-4]\\d\\d|500 means “100-500”, built from two alternatives: 100-499 and 500.
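
To make the boundaries concrete, here is a minimal sketch that runs the expression against a few edge cases. The class name LineFormatCheck, the helper isValid and the sample strings are illustrative assumptions, not part of the original code.

```java
import java.util.regex.Pattern;

class LineFormatCheck {
    // The expression from this answer, compiled once so it can be reused for every line.
    static final Pattern LINE = Pattern.compile("([1-9]|[1-4]\\d|50)\\t([1-4]\\d\\d|500)");

    // Hypothetical helper: true only if the whole line matches the expression.
    static boolean isValid(String line) {
        // matches() tests the entire input, so no ^ or $ anchors are needed.
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        // Edge cases around both ranges, plus a line that uses a space instead of a tab.
        String[] samples = {"1\t100", "50\t500", "0\t100", "51\t100", "1\t99", "1\t501", "1 100"};
        for (String s : samples) {
            System.out.println(s.replace("\t", "<tab>") + " -> " + isValid(s));
        }
    }
}
```

Only the first two samples fall inside both ranges and use a tab, so only those should be reported as valid.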

With only 50 lines, “performance” is irrelevant (unless you’re doing it hundreds of times per second) - pick the approach that is most readable and understandable. If you can use regex, it usually results in less code, and it performs well enough.
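
For comparison, the non-regex route could look roughly like the sketch below, so you can judge readability for yourself. The class name ParsedLineCheck and the helper isValid are hypothetical. Note that this version is slightly more lenient than the regex: Integer.parseInt also accepts leading zeros and a leading sign, so "01" would pass as a serial number here.

```java
class ParsedLineCheck {
    // Hypothetical helper: split on the tab, parse both fields, then range-check them.
    static boolean isValid(String line) {
        // The -1 limit keeps trailing empty fields, so "1<tab>256<tab>" is rejected.
        String[] parts = line.split("\t", -1);
        if (parts.length != 2) {
            return false;
        }
        try {
            int serial = Integer.parseInt(parts[0]);
            int value = Integer.parseInt(parts[1]);
            return serial >= 1 && serial <= 50 && value >= 100 && value <= 500;
        } catch (NumberFormatException e) {
            // One of the fields was not an integer.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("1\t256"));
        System.out.println(isValid("51\t1000"));
    }
}
```

It is more code than the one-line regex, but the two ranges are spelled out as plain numbers, which some readers may find easier to change later.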


Test code:

private static final String REGEX = "([1-9]|[1-4]\\d|50)\\t([1-4]\\d\\d|500)";

// Replaces testFile in the question's class; MAX_LINES_TO_READ is defined there.
public void testFile(String fileName) {
    int lineCounter = 1;
    // try-with-resources closes the reader even if an exception is thrown
    try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
        String line = br.readLine();
        while ((line != null) && (lineCounter <= MAX_LINES_TO_READ)) {
            if (line.matches(REGEX)) {
                System.out.println("Line " + lineCounter + " formatted correctly");
            } else {
                System.out.println("Invalid format on line " + lineCounter + " (" + line + ")");
            }
            line = br.readLine();
            lineCounter++;
        }
    } catch (Exception ex) {
        System.out.println("Exception occurred: " + ex);
    }
}

Test file (the two columns are separated by a single tab character):

1   123
50  346
23  145
68  455
1   535

Output:

Line 1 formatted correctly
Line 2 formatted correctly
Line 3 formatted correctly
Invalid format on line 4 (68    455)
Invalid format on line 5 (1 535)
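
The question also asks for an error message set through a custom exception class. A single all-or-nothing regex cannot say which field was wrong, so one possible approach is to use a looser expression with capture groups and do the range checks in code. This is only a sketch under assumptions: LineValidator, InvalidLineException and the message wording are hypothetical names chosen for illustration, and unlike the strict regex above this version accepts leading zeros.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineValidator {
    // Hypothetical custom exception; the message format mirrors the output requested in the question.
    static class InvalidLineException extends Exception {
        InvalidLineException(int lineNumber, String line, String reason) {
            super("Invalid format on line " + lineNumber + " (" + line + "): " + reason);
        }
    }

    // Loose shape check: digits, one tab, digits. At most 9 digits so parseInt cannot overflow.
    private static final Pattern SHAPE = Pattern.compile("(\\d{1,9})\\t(\\d{1,9})");

    static void validate(int lineNumber, String line) throws InvalidLineException {
        Matcher m = SHAPE.matcher(line);
        if (!m.matches()) {
            throw new InvalidLineException(lineNumber, line, "expected <number><tab><number>");
        }
        int serial = Integer.parseInt(m.group(1));
        int value = Integer.parseInt(m.group(2));
        if (serial < 1 || serial > 50) {
            throw new InvalidLineException(lineNumber, line, "serial number " + serial + " is outside 1-50");
        }
        if (value < 100 || value > 500) {
            throw new InvalidLineException(lineNumber, line, "value " + value + " is outside 100-500");
        }
    }

    public static void main(String[] args) {
        String[] lines = {"1\t123", "51\t1000", "7 300"};
        for (int i = 0; i < lines.length; i++) {
            try {
                validate(i + 1, lines[i]);
                System.out.println("Line " + (i + 1) + " formatted correctly");
            } catch (InvalidLineException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```

The trade-off is a little more code in exchange for messages that name the offending field rather than just flagging the line.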
