Java Regex decoding treating multiple delimiters as same not working

GLMills :

Thank you for your help.

I am trying to write a regex that decodes a string using either a comma or a semicolon as the anchor (delimiter), but I can't get it to work for commas, or for a mix of both. Please tell me what I'm missing or doing wrong. Thanks!

^(?<FADECID>\d{6})?(?<MSG>([a-z A-Z 0-9 ()-:]*[;,]{1}+){8,}+)?(?<ANCH>\w*[;,])?(?<TIME>\d{4})?(?<FM>\d{2})?[;,]?(?<CON>.*)$.*

Inbound strings to decode (I need to treat the comma and the semicolon the same):

383154VSC    X1;;;;;;;BOTH WASTE DRAIN VLV    NOT CLSD (135MG;35MG);HARD;093502
282151FCMC1  1;;;;;;;FUEL MAIN PUMP1 (121QA1);HARD;093502
732112EEC2B  1;;;;;;;FMU(E2-4071KS)WRG:EEC  J12 TO FMV LVDT POS,HARD;
383154VSC    X1,,,,,,,BOTH WASTE DRAIN VLV    NOT CLSD (135MG,35MG),HARD,093502
282151FCMC1  1,,,,,,,FUEL MAIN PUMP1 (121QA1);HARD;093502
732112EEC2B  1,,,,,,,FMU(E2-4071KS)WRG:EEC  J12 TO FMV LVDT POS,HARD,
383154VSC    X1,,,,,,,BOTH WASTE DRAIN VLV    NOT CLSD (135MG;35MG);HARD;093502
282151FCMC1  1;;;;;;;FUEL MAIN PUMP1 (121QA1),HARD,093502
732112EEC2B  1,,,,,,,FMU(E2-4071KS)WRG:EEC  J12 TO FMV LVDT POS;HARD;

This string can contain multiple text messages separated by [;,]:

ABC;DEF;;HIJ;NNN;JJJ;XXX;EEX;HARD;

This manages that: (?<MSG>([a-z A-Z 0-9 ()-:]*[;,]{1}+){8,}+)? but it doesn't observe commas.

This works for ; but not for commas or for a mix of both. My problem is that the delimiter can be either a semicolon or a comma. If I make the regex comma-only, it works for comma strings. I know I'm missing a quantifier or something like that.
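One detail that may be relevant here (an observation, not something confirmed in this thread): inside a character class, `)-:` is read as a range from `)` to `:`, and that range contains the comma but not the semicolon. A minimal Java check of this, where the class name `RangeCheck`, the helper `inClass`, and the second pattern with the hyphen moved to the end are all made up for illustration:

```java
import java.util.regex.Pattern;

final class RangeCheck {
    // Character class copied from the question: ")-:" is parsed as a range.
    static final Pattern QUESTION_CLASS = Pattern.compile("[a-z A-Z 0-9 ()-:]");
    // Hypothetical variant with the hyphen moved last, where it is a literal.
    static final Pattern LITERAL_HYPHEN = Pattern.compile("[a-zA-Z0-9 ():-]");

    // True when the single character c is a member of the character class.
    static boolean inClass(Pattern cls, char c) {
        return cls.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        System.out.println(inClass(QUESTION_CLASS, ',')); // true: ',' lies between ')' and ':'
        System.out.println(inClass(QUESTION_CLASS, ';')); // false: ';' is just past ':'
        System.out.println(inClass(LITERAL_HYPHEN, ',')); // false
        System.out.println(inClass(LITERAL_HYPHEN, '-')); // true
    }
}
```

If the class swallows commas, the message part and the delimiter part compete for the same character, which could explain why only the semicolon behaves as a delimiter.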

                            if ( null != MORE && ! MORE.isEmpty() ) {

                                while ( null != MORE && ! MORE.isEmpty() || MORE.trim().equals("EOR")) {

                                    LOG.info("MORE CONTINUE: " + MORE);
                                    if ( MORE.trim().equals("EOR") ) {
                                        break;
                                    }
                                    String patternMoreString = "^(?<FADECID>\\d{6})?(?<MSG>([a-z A-Z 0-9 ()-:()]*[;,]{1}+){8,}+)+?(?<ANCH>\\w*[;,])?(?<TIME>\\d{4})?(?<FM>\\d{2})?[;,]?(?<CON>.*)$.*";
                                    Pattern patternMore = Pattern.compile(patternMoreString, Pattern.DOTALL);
                                    Matcher matcherMore = patternMore.matcher(MORE);

                                    while ( matcherMore.find() ) {

                                        MORE = matcherMore.group("CON");

                                        summary.setReportId("FLR");
                                        summary.setAreg(Areg);
                                        summary.setShip(Ship);
                                        summary.setOrig(Orig);
                                        summary.setDest(Dest);
                                        summary.setTimestamp(Ts);
                                        summary.setAta(matcherMore.group("FADECID"));
                                        summary.setTime(matcherMore.group("TIME"));
                                        summary.setFm(matcherMore.group("FM"));
                                        summary.setMsg(matcherMore.group("MSG"));

                                        serviceRecords.add(summary);

                                        LOG.info("*** A330 MPF MORE Record ***");
                                        LOG.info(summary.getReportId());
                                        LOG.info(summary.getAreg());
                                        LOG.info(summary.getShip());
                                        LOG.info(summary.getOrig());
                                        LOG.info(summary.getDest());
                                        LOG.info(summary.getTimestamp());
                                        LOG.info(summary.getAta());
                                        LOG.info(summary.getTime());
                                        LOG.info(summary.getFm());
                                        LOG.info(summary.getMsg());

                                        summary = new A330PostFlightReportRecord();
                                    }
                                }
                            }
                        }
                        //---

In all cases I need group 2 (MSG), plus TIME and FM if they exist.

The fourth bird :

You could make use of a capturing group and a backreference using the number of that group to get consistent delimiters.

In this case the capturing group is ([;,]), which is the fourth group. It matches either ; or , and is referenced as \4.

If you only need group 2, plus TIME and FM when they are present, you can omit the ANCH group:

^(?<FADECID>\d{6})(?<MSG>([a-zA-Z0-9() -]*([;,])){7,})(?<TIME>\d{4})?(?<FM>\d{2})?\4?(?<CON>.*)$

Explanation

  • ^ Start of string
  • (?<FADECID>\d{6}) Named group FADECID, match 6 digits
  • (?<MSG> Named group MSG
    • ( Capture group 3
      • [a-zA-Z0-9() -]* Match 0+ times any of the listed characters
      • ([;,]) Capture group 4, used as backreference to get consistent delimiters
    • ){7,} Close group and repeat 7+ times
  • ) Close group MSG
  • (?<TIME>\d{4})? Optional named group TIME, match 4 digits
  • (?<FM>\d{2})? Optional named group FM, match 2 digits
  • \4? Optional backreference to capture group 4
  • (?<CON>.*) Named group CON, match any char except a newline 0+ times
  • $ End of string
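As a rough sketch of how the pattern explained above could be used from Java, the snippet below applies it to one of the mixed-delimiter lines from the question. The class name `FaultLineDemo` and the helper `parse` are made up for illustration; only the pattern itself comes from this answer (with backslashes doubled for a Java string literal).

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FaultLineDemo {
    // Pattern from the answer; named groups are still numbered, so \4 is ([;,]).
    static final Pattern FAULT = Pattern.compile(
        "^(?<FADECID>\\d{6})(?<MSG>([a-zA-Z0-9() -]*([;,])){7,})"
        + "(?<TIME>\\d{4})?(?<FM>\\d{2})?\\4?(?<CON>.*)$");

    // Returns {FADECID, MSG, TIME, FM, CON}, or null when the line does not match.
    // TIME and FM are null when the optional groups did not take part in the match.
    static String[] parse(String line) {
        Matcher m = FAULT.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            m.group("FADECID"), m.group("MSG"), m.group("TIME"),
            m.group("FM"), m.group("CON")
        };
    }

    public static void main(String[] args) {
        // Sample line from the question: commas first, semicolons later.
        String mixed = "383154VSC    X1,,,,,,,BOTH WASTE DRAIN VLV    NOT CLSD (135MG;35MG);HARD;093502";
        System.out.println(Arrays.toString(parse(mixed)));
    }
}
```

For this line, FADECID is 383154, TIME is 0935, FM is 02, and MSG runs up to and including the delimiter after HARD. Note that the character class has no colon, so on the sample lines containing `WRG:EEC` the MSG group stops after the first seven delimiters and the rest of the line ends up in CON; adding `:` to the class would be one way to cover those lines.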

Note that group 3, the capture group itself, is repeated, so it only holds the value of the last iteration, which will be HARD followed by its delimiter (group 3 contains group 4).
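The repeated-group behaviour can be checked with a short sketch. It also shows one possible way to recover every message, by splitting the MSG group on `[;,]`; that split step is an assumption about what the caller wants, not part of the answer. The class name `RepeatedGroupDemo` and its two helpers are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RepeatedGroupDemo {
    // Same pattern as in the answer, escaped for a Java string literal.
    static final Pattern FAULT = Pattern.compile(
        "^(?<FADECID>\\d{6})(?<MSG>([a-zA-Z0-9() -]*([;,])){7,})"
        + "(?<TIME>\\d{4})?(?<FM>\\d{2})?\\4?(?<CON>.*)$");

    // Group 3 is repeated, so after a match it holds only its last iteration.
    static String lastIteration(String line) {
        Matcher m = FAULT.matcher(line);
        return m.find() ? m.group(3) : null;
    }

    // Splits MSG on either delimiter; String.split drops trailing empty strings.
    static String[] messages(String line) {
        Matcher m = FAULT.matcher(line);
        return m.find() ? m.group("MSG").split("[;,]") : new String[0];
    }

    public static void main(String[] args) {
        // Sample line from the question.
        String line = "282151FCMC1  1;;;;;;;FUEL MAIN PUMP1 (121QA1);HARD;093502";
        System.out.println(lastIteration(line)); // HARD;
        for (String part : messages(line)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Empty fields between consecutive delimiters come back as empty strings, so the position of each message within the line is preserved.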

Origin http://43.154.161.224:23101/article/api/json?id=195479&siteId=1