Remove elements from Date Format String using a Regular Expression

Jakg :

I want to remove elements from a supplied Date Format String - for example convert the format "dd/MM/yyyy" to "MM/yyyy" by removing any non-M/y element.

What I'm trying to do is create a localised month/year format based on the existing day/month/year format provided for the Locale.

I've done this using regular expressions, but the solution seems longer than I'd expect.

An example is below:

public static void main(final String[] args) {
 System.out.println(filterDateFormat("dd/MM/yyyy HH:mm:ss", 'M', 'y'));
 System.out.println(filterDateFormat("MM/yyyy/dd", 'M', 'y'));
 System.out.println(filterDateFormat("yyyy-MMM-dd", 'M', 'y'));
}

/**
 * Retains only the runs of {@code charsToRetain} in {@code format}, dropping
 * every other element and any redundant separators.
 */
private static String filterDateFormat(final String format, final char...charsToRetain) {
 // Match a run of retained characters plus an optional trailing separator, e.g. "MM/"
 final Pattern pattern = Pattern.compile("[" + new String(charsToRetain) + "]+\\p{Punct}?");
 final Matcher matcher = pattern.matcher(format);

 final StringBuilder builder = new StringBuilder();

 while (matcher.find()) {
  // Append each match
  builder.append(matcher.group());
 }

 // If the last match is e.g. "MMM-", remove the trailing punctuation symbol
 return builder.toString().replaceFirst("\\p{Punct}$", "");
}
Avi :

Let's try a solution for the following date format strings:

String[] formatStrings = { "dd/MM/yyyy HH:mm:ss", 
                           "MM/yyyy/dd", 
                           "yyyy-MMM-dd", 
                           "MM/yy - yy/dd", 
                           "yyabbadabbadooMM" };

The following loop tests each string against a regex (REGEX, defined further down) and prints the first capturing group whenever the whole string matches.

Pattern p = Pattern.compile(REGEX);
for(String formatStr : formatStrings) {
    Matcher m = p.matcher(formatStr);
    if(m.matches()) {
        System.out.println(m.group(1));
    }
    else {
        System.out.println("Didn't match!");
    }
}

Now, there are two separate regular expressions I've tried. First:

final String REGEX = "(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)";

With program output:

MM/yyyy
MM/yyyy
yyyy-MMM
Didn't match!
Didn't match!

Second:

final String REGEX = "(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*)";

With program output:

MM/yyyy
MM/yyyy
yyyy-MMM
MM/yy - yy
Didn't match!

Now, let's see what the first regex actually matches:

(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*) First regex =
(?:[^My]*)                              Any amount of non-Ms and non-ys (non-capturing)
          ([My]+                        followed by one or more Ms and ys
                [^\\w]*                 optionally separated by non-word characters
                                        (implying they are also not Ms or ys)
                       [My]+)           followed by one or more Ms and ys
                             (?:[^My]*) finished by any number of non-Ms and non-ys
                                        (non-capturing)

This means a string needs at least two M/y characters to match the regex. Be careful, though: something like MM-dd or yy-DD will match as well (capturing just MM or yy), because each [My]+ can match a single character, so one two-letter run satisfies both. You can avoid getting into trouble here by adding a sanity check on your date format string, such as:

if(formatStr.contains("y") && formatStr.contains("M") && m.matches())
{
    String yMString = m.group(1);
    // ... other logic
}
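
To see why the check matters, here is a minimal sketch that runs the first regex with and without it. The class and method names are made up for illustration. Note that String.contains takes a CharSequence, so the letters are passed as strings rather than chars:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SanityCheckDemo {
    // The first regex from above, compiled once.
    private static final Pattern FIRST =
            Pattern.compile("(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)");

    /** Group 1 of the first regex with no extra check, or null if there is no match. */
    static String unchecked(String format) {
        Matcher m = FIRST.matcher(format);
        return m.matches() ? m.group(1) : null;
    }

    /** Same as unchecked, but only when the string contains both 'y' and 'M'. */
    static String checked(String format) {
        boolean hasBoth = format.contains("y") && format.contains("M");
        return hasBoth ? unchecked(format) : null;
    }

    public static void main(String[] args) {
        for (String format : new String[] { "MM-dd", "yy-DD", "dd/MM/yyyy" }) {
            System.out.println(format + ": unchecked=" + unchecked(format)
                    + ", checked=" + checked(format));
        }
        // MM-dd: unchecked=MM, checked=null
        // yy-DD: unchecked=yy, checked=null
        // dd/MM/yyyy: unchecked=MM/yyyy, checked=MM/yyyy
    }
}
```

Without the check, "MM-dd" and "yy-DD" yield "MM" and "yy"; with it, both are rejected.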

As for the second regex, here's what it means:

(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*) Second regex =
(?:[^My]*)                                   Any amount of non-Ms and non-ys 
                                             (non-capturing)
          (                      )           followed by
           (?:[My]+       )+[My]+            at least two text segments consisting of
                                             one or more Ms or ys, where each segment is
                   [^\\w]*                   optionally separated by non-word characters
                                  (?:[^My]*) finished by any number of non-Ms and non-ys
                                             (non-capturing)

This regex matches a slightly broader set of strings, but it still requires that anything separating runs of Ms and ys consist only of non-word characters ([^a-zA-Z_0-9]). Keep in mind that it will still match "yy", "MM", or similar strings like "yyy", "yyyy"..., so the sanity check described for the previous regular expression is useful here too.
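
A few probe strings make those limits concrete. This is only a sketch, and the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SecondRegexProbe {
    // The second regex from above.
    private static final Pattern SECOND =
            Pattern.compile("(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*)");

    /** Returns group 1 if the whole string matches, otherwise null. */
    static String capture(String format) {
        Matcher m = SECOND.matcher(format);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] probes = { "yy", "dd/yyyy", "y", "MM/yy - yy/dd", "MM/dd/yyyy" };
        for (String probe : probes) {
            System.out.println("\"" + probe + "\" -> " + capture(probe));
        }
        // "yy" -> yy                      (a single two-letter run is enough)
        // "dd/yyyy" -> yyyy               (no month at all, still a match)
        // "y" -> null                     (fewer than two M/y characters)
        // "MM/yy - yy/dd" -> MM/yy - yy
        // "MM/dd/yyyy" -> null            (a word character sits between M and y)
    }
}
```

Note the last probe: a pattern with the day field between month and year, such as "MM/dd/yyyy", is rejected by both regexes, so formats like that need a different approach (for example, the find-and-append loop from the question).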

Finally, here's a quick example of how one might use the first regex to shorten a single date format string:

LocalDateTime date = LocalDateTime.now();
String dateFormatString = "dd/MM/yyyy H:m:s";
System.out.println("Old Format: \"" + dateFormatString + "\" = " + 
    date.format(DateTimeFormatter.ofPattern(dateFormatString)));
Pattern p = Pattern.compile("(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)");
Matcher m = p.matcher(dateFormatString);
if(dateFormatString.contains("y") && dateFormatString.contains("M") && m.matches())
{
    dateFormatString = m.group(1);
    System.out.println("New Format: \"" + dateFormatString + "\" = " + 
        date.format(DateTimeFormatter.ofPattern(dateFormatString)));
}
else
{
    throw new IllegalArgumentException("Couldn't shorten date format string!");
}

Output:

Old Format: "dd/MM/yyyy H:m:s" = 14/08/2019 16:55:45
New Format: "MM/yyyy" = 08/2019
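
Tying this back to the original goal, here's a rough sketch that starts from a Locale's own short date pattern. It assumes the pattern is obtained with DateTimeFormatterBuilder.getLocalizedDateTimePattern, which the question doesn't specify. The class name and the shorten helper are hypothetical. The pattern text returned for each locale depends on the JDK's locale data, so no output is shown:

```java
import java.time.LocalDate;
import java.time.chrono.IsoChronology;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.FormatStyle;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MonthYearPatterns {
    // The first regex from above.
    private static final Pattern MONTH_YEAR =
            Pattern.compile("(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)");

    /** Hypothetical helper: the month/year part of a pattern, if the regex finds one. */
    static Optional<String> shorten(String pattern) {
        Matcher m = MONTH_YEAR.matcher(pattern);
        boolean hasBoth = pattern.contains("y") && pattern.contains("M");
        return hasBoth && m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2019, 8, 14);
        for (Locale locale : new Locale[] { Locale.UK, Locale.GERMANY, Locale.US }) {
            // Assumption: the locale's SHORT date style is the format to start from.
            String full = DateTimeFormatterBuilder.getLocalizedDateTimePattern(
                    FormatStyle.SHORT, null, IsoChronology.INSTANCE, locale);
            // Fall back to a message when month and year are not adjacent.
            String result = shorten(full)
                    .map(p -> date.format(DateTimeFormatter.ofPattern(p, locale)))
                    .orElse("(no adjacent month/year part)");
            System.out.println(locale + ": \"" + full + "\" -> " + result);
        }
    }
}
```

Returning an Optional rather than throwing lets the caller decide what to do for locales whose pattern puts the day between month and year.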
