SC.Lab3: the construction process of the graph Factory (from HIT)

  The Factory design pattern basically means obtaining an object by calling a method of the Factory, with or without parameters. The method is usually static so that the Factory itself never has to be instantiated, and this is the most common way a Factory is used. For the Vertex and Edge factories, you pass in the information needed to construct the object and the object is created for you, which avoids repeatedly calling Vertex/Edge's own methods to add information/properties to the object.
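  As a minimal sketch of the static-factory idea: DemoVertex and DemoVertexFactory are hypothetical names invented for this example, not the lab's actual classes, and the constructor signature is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical vertex type for illustration; the lab's real Vertex class is not shown here.
class DemoVertex {
    private final String label;
    private final String type;
    private final List<String> attributes;

    DemoVertex(String label, String type, List<String> attributes) {
        this.label = label;
        this.type = type;
        this.attributes = new ArrayList<>(attributes); // defensive copy
    }

    String getLabel() { return label; }
    String getType() { return type; }
    List<String> getAttributes() { return new ArrayList<>(attributes); }
}

// Hypothetical factory: the method is static, so the factory is never instantiated.
final class DemoVertexFactory {
    private DemoVertexFactory() { }

    // One call supplies everything needed to build a fully populated vertex.
    static DemoVertex createVertex(String label, String type, String... attributes) {
        return new DemoVertex(label, type, Arrays.asList(attributes));
    }

    public static void main(String[] args) {
        DemoVertex v = createVertex("TheShawshankRedemption", "Movie", "1994", "USA", "9.3");
        System.out.println(v.getLabel() + " " + v.getType() + " " + v.getAttributes());
    }
}
```

  The caller gets a finished object from one call instead of creating an empty vertex and then setting its properties one by one.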

  For graphs, instantiating a graph only requires a String label (depending on the Graph constructor you design), but to build a complete graph you must also add vertices and edges, which means calling the graph's methods over and over. It is therefore worth writing the information needed to construct the graph into a file and reading that file when the graph is built, which simplifies construction and avoids a lot of repetitive operations. In other words, you only pass a String filePath to the graph Factory's createGraph, which opens the file at that path and reads the information from it. The method returns a reference to the graph, so it does the whole job of constructing the graph.
  This way of constructing a graph therefore has two parts: reading the graph's data, and using the data by calling the graph's methods (it may also be necessary to check whether the incoming information is legal).
  When reading the data, use regular expressions to do the matching, much as in the previous lab: each time a line is read from the file, parse it to obtain and store the required information. How to read files in Java may be covered later in the course, but there are many approaches in the textbook and on the Internet, and the first two labs used them as well. In general it comes down to Java's I/O streams: create a File object for the file, create a reader (input stream) object on that File, wrap the reader in a buffered reader, and finally call the buffered reader's read method to get the data as a String. So the saying "everything in Java is an object" really shows here.
  My process of reading a file goes like this. Calling the buffered reader's line-reading method returns the data line by line as a String. At the end of the file this method returns null, so the String reference that temporarily holds the line becomes null, which tells you the whole file has been read. You should also remember to close the buffered reader by calling its close method when you are done, so that the underlying file is released (there are other reasons too, security among them; just remember to do it).
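  The reading loop can be sketched as follows. LineReaderDemo and its method names are made up for this example. It uses the standard BufferedReader.readLine(), and try-with-resources stands in for the explicit close() call. Wrapping IOException in UncheckedIOException is just a convenience chosen for the sketch.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class LineReaderDemo {
    private LineReaderDemo() { }

    // Reads all lines from any Reader; readLine() returns null at end of input.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        // try-with-resources calls close() on the buffered reader automatically.
        try (BufferedReader buffered = new BufferedReader(source)) {
            String line;
            while ((line = buffered.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // File -> FileReader -> BufferedReader, the chain described in the text.
    static List<String> readLines(String filePath) {
        try {
            return readLines(new FileReader(new File(filePath)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

  Each element of the returned list is one line of the file, ready to be handed to the regular expressions.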
  At this point a line of data from the file is available. To match it, several patterns must be designed for the incoming string. The file format has only a few line styles; see the earlier blog post: http://www.cnblogs.com/stevenshen123/p/8973413.html.

  As you can see, each line of the file is basically a word, then ' = ', then an information string delimited by "" or <>. So there is no need to worry much about matching the first word; the same pattern works for every line, namely the regular expression \\w+. Here \\w is a word character, i.e. [a-zA-Z0-9_], and + means one or more of them. For the first line this matches "GraphType"; the match stops at the space, and the matched part, GraphType, can then be read out. After that, match space = space, and use English quotation marks or angle brackets to match the rest. Roughly speaking, for each line only the part before = and the part after = are needed. So the idea is to capture as much of the string after = as possible first, and then process that string in finer detail. We therefore design two regular expressions:
    pattern1 = "(\\w+) = (\".+\")", pattern2 = "(\\w+) = <(.+)>";
  In both patterns the parentheses mark the data to be extracted later, that is, the part inside each pair of parentheses is what we want to obtain.
  pattern1 matches /word (one or more word characters) = "string"/, so the lines of the file it matches are
    GraphType = "MovieGraph"
    GraphName = "MyFavoriteMovies"
    VertexType = "Movie", "Actor", "Director"
    EdgeType = "MovieActorRelation", "MovieDirectorRelation", "SameMovieHyperEdge"
  Looking at where the parentheses sit, the data obtained is
    /GraphType/, /"MovieGraph"/
    /GraphName/, /"MyFavoriteMovies"/
    /VertexType/, /"Movie", "Actor", "Director"/
    /EdgeType/, /"MovieActorRelation", "MovieDirectorRelation", "SameMovieHyperEdge"/
  That is, the first pair of parentheses captures the word at the beginning of the line, and the second captures the string from the first quotation mark to the last quotation mark (including the quotation marks themselves; the second group uses the greedy, maximal matching mentioned earlier).
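  To check the behaviour of pattern1 on one of these lines, here is a small sketch. Pattern1Demo and split are names invented for the example; the regular expression is the one given above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Pattern1Demo {
    // Same expression as pattern1 in the text.
    static final Pattern PATTERN1 = Pattern.compile("(\\w+) = (\".+\")");

    // Returns {keyword, quoted part}, or null when the line does not match.
    static String[] split(String line) {
        Matcher m = PATTERN1.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("VertexType = \"Movie\", \"Actor\", \"Director\"");
        System.out.println(parts[0]); // VertexType
        System.out.println(parts[1]); // "Movie", "Actor", "Director"
    }
}
```

  Because .+ is greedy, the second group runs to the last quotation mark on the line, so all three type names land in one string.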
  pattern2 is similar to pattern1 but matches lines containing angle brackets, /word (one or more word characters) = <string>/. The lines of the file it matches include:
    Vertex = <"TheShawshankRedemption", "Movie", <"1994", "USA", "9.3">>
    Edge = <"SRFD", "MovieDirectorRelation", "-1", "TheShawshankRedemption", "FrankDarabont", "No">
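    HyperEdge = <"ActorsInSR","SameMovieHyperEdge",{"TimRobbins", "MorganFreeman"}>
  and the obtained data is
    /Vertex/, /"TheShawshankRedemption", "Movie", <"1994", "USA", "9.3">/
    /Edge/, /"SRFD", "MovieDirectorRelation", "-1", "TheShawshankRedemption", "FrankDarabont", "No"/
    /HyperEdge/, /"ActorsInSR","SameMovieHyperEdge",{"TimRobbins", "MorganFreeman"}/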
  That is, the first pair of parentheses captures the word at the beginning of the line, and the second captures the string from the first angle bracket to the last angle bracket (excluding the angle brackets themselves; the second group again uses the greedy, maximal matching mentioned earlier).
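  Why the maximal (greedy) match matters for nested angle brackets can be seen by comparing it with a reluctant quantifier. This is only an illustration: GreedyBracketDemo, payload and the RELUCTANT pattern are names and variants introduced here, not part of the lab code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyBracketDemo {
    // pattern2 from the text: .+ is greedy and runs to the LAST '>'.
    static final Pattern GREEDY = Pattern.compile("(\\w+) = <(.+)>");
    // Comparison only: .+? is reluctant and stops at the FIRST '>'.
    static final Pattern RELUCTANT = Pattern.compile("(\\w+) = <(.+?)>");

    // Returns the content of the second group, or null if the line does not match.
    static String payload(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "Vertex = <\"TheShawshankRedemption\", \"Movie\", <\"1994\", \"USA\", \"9.3\">>";
        System.out.println(payload(GREEDY, line));    // keeps the inner <...> intact
        System.out.println(payload(RELUCTANT, line)); // cut off before the inner '>'
    }
}
```

  With the greedy form the inner attribute list stays complete, which is what the later word-by-word matching needs.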
  At this point the first word of each line has been obtained, and its value decides how to process the matched string that follows it.
  Looking at the data in the second group of each line, everything we want sits inside "" (the <> can be ignored, because the specific data to obtain is always a quoted "word"). So we also construct a pattern3 to pick out each word:
    pattern3 = ",? ?\"([^\"]+)\"";
  The ,? ? part is there only to make the pattern easier to read: it says that an optional comma and an optional space may come before the word. It is not actually needed, so pattern3 can also be written as pattern3 = "\"([^\"]+)\"";
  ^ means "not": inside square brackets, [^\"] is a single character that is not a quotation mark, and [^\"]+ is one or more such characters. Matching with pattern3 gives:
    /"MovieGraph"/ --> /MovieGraph/
     *Note the position of the parentheses in pattern3: the captured content does not include the quotation marks around the word.
    /"MyFavoriteMovies"/ --> /MyFavoriteMovies/
    /"MovieActorRelation", "MovieDirectorRelation", "SameMovieHyperEdge"/ --> /MovieActorRelation/
     *Note that this is a single match: each match only covers one pair of quotation marks and extracts the string between them. For this line, match again to see whether there is a next match. There is one, so extract again, /"MovieDirectorRelation"/ --> /MovieDirectorRelation/, and a further match gives /"SameMovieHyperEdge"/ --> /SameMovieHyperEdge/
  Using pattern3 on the other group of strings in the same way, looping until no further match is found, gives
    /"TheShawshankRedemption", "Movie", <"1994", "USA", "9.3">/ --> /TheShawshankRedemption/ /Movie/ /1994/ /USA/ /9.3/
    /"SRFD", "MovieDirectorRelation", "-1", "TheShawshankRedemption", "FrankDarabont", "No"/ --> /SRFD/ /MovieDirectorRelation/ /-1/ /TheShawshankRedemption/ /FrankDarabont/ /No/
    /"ActorsInSR","SameMovieHyperEdge",{"TimRobbins", "MorganFreeman"}/ --> /ActorsInSR/ /SameMovieHyperEdge/ /TimRobbins/ /MorganFreeman/
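  The repeated match-and-extract step with pattern3 amounts to a while loop around find(). A sketch, using the shorter form of pattern3 (Pattern3Demo and words are names invented for the example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Pattern3Demo {
    // Shorter form of pattern3: one quoted word, captured without its quotes.
    static final Pattern PATTERN3 = Pattern.compile("\"([^\"]+)\"");

    // Collects every quoted word in the text, in order of appearance.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = PATTERN3.matcher(text);
        while (m.find()) {           // each find() continues after the previous match
            result.add(m.group(1));  // group(1) excludes the surrounding quotes
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("\"ActorsInSR\",\"SameMovieHyperEdge\",{\"TimRobbins\", \"MorganFreeman\"}"));
    }
}
```

  Characters such as commas, spaces, < > and { } between the quoted words are simply skipped, because find() searches forward for the next opening quotation mark.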
  Since the data is read line by line, the word at the start of each line tells you how many times the string after = should be matched. For example, when the word is Vertex, there are 5 matches: the first two are used to construct a Vertex (or subclass) object, and the remaining three are the attributes of that vertex. The choice of class follows from the matched words: if the word Movie is matched, a Movie object can be created with label TheShawshankRedemption and attributes 1994, USA, 9.3, and this vertex can then be added to the graph by calling the relevant method. Other kinds of lines are handled the same way; just let the keyword decide how many matches to expect and what to do with each one. You can also see here that because pattern3 only matches the quoted "words" minimally, it is insensitive to characters such as <>{}: they do not affect pattern3's matching at all.
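  Putting the keyword check and the word extraction together might look roughly like this. It is a sketch under assumptions: LineDispatchDemo and summarize are hypothetical, the expected word counts (5 for Vertex, 6 for Edge) are taken from the sample lines above, and the method only returns a description string where real code would create the object and add it to the graph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineDispatchDemo {
    static final Pattern PATTERN2 = Pattern.compile("(\\w+) = <(.+)>");
    static final Pattern PATTERN3 = Pattern.compile("\"([^\"]+)\"");

    // Parses one line and describes what would be built from it.
    static String summarize(String line) {
        Matcher outer = PATTERN2.matcher(line);
        if (!outer.find()) {
            throw new IllegalArgumentException("unrecognized line: " + line);
        }
        String keyword = outer.group(1);

        // Extract every quoted word from the part inside the angle brackets.
        List<String> words = new ArrayList<>();
        Matcher inner = PATTERN3.matcher(outer.group(2));
        while (inner.find()) {
            words.add(inner.group(1));
        }

        // The keyword decides how many words are expected and how they are used.
        switch (keyword) {
            case "Vertex": // label, type, then three attributes (per the sample line)
                if (words.size() != 5) {
                    throw new IllegalArgumentException("Vertex needs 5 words: " + line);
                }
                return "Vertex " + words.get(0) + " (" + words.get(1) + ") " + words.subList(2, 5);
            case "Edge": // label, type, then four further fields (per the sample line)
                if (words.size() != 6) {
                    throw new IllegalArgumentException("Edge needs 6 words: " + line);
                }
                return "Edge " + words.get(0) + " (" + words.get(1) + ") " + words.subList(2, 6);
            default:
                throw new IllegalArgumentException("unsupported keyword: " + keyword);
        }
    }
}
```

  The size checks double as the legality check mentioned earlier: a line with too few or too many quoted words is rejected before anything touches the graph.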

  Now for how to match and extract strings in code.
    Pattern p = Pattern.compile(pattern1);
    Matcher parse1 = p.matcher("");
    p = Pattern.compile(pattern2);
    Matcher parse2 = p.matcher("");
  The first line compiles pattern1 into the Pattern p. The second line creates the Matcher parse1 from it, so parse1 matches strings against the format of pattern1. Likewise, the third and fourth lines make parse2 match against pattern2. To point parse1/parse2 at a particular string, call reset: parse1.reset(line) makes parse1 work on the string line. To test whether it matches, call parse1.find(), which returns true on a match and false otherwise. parse1.groupCount() returns the number of capturing groups (pairs of parentheses) in the pattern, i.e. the parts we want to extract. pattern1 contains two pairs of parentheses, so the value is 2. Each group corresponds to one pair of parentheses: after a successful find(), parse1.group(1) returns the string captured by the first pair and parse1.group(2) the string captured by the second, and in this way the required strings can be extracted.
  Note that find() must be used as the condition before anything else: if no match was found, parse1 holds no match result, and calling group on it throws an exception (an IllegalStateException)... you know, much like what happens when a null pointer is used in C/C++.
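  A short sketch of reusing one Matcher with reset() and guarding group() with find(). MatcherReuseDemo and its method names are invented for the example; reset, find, group and groupCount are the standard java.util.regex.Matcher methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherReuseDemo {
    static final Pattern PATTERN1 = Pattern.compile("(\\w+) = (\".+\")");

    // Points the reusable matcher at a new line and returns the keyword,
    // or null when the line does not match pattern1.
    static String keywordOrNull(Matcher reusable, String line) {
        reusable.reset(line);
        return reusable.find() ? reusable.group(1) : null;
    }

    // Shows that calling group() after a failed find() throws IllegalStateException.
    static boolean groupAfterFailedFindThrows() {
        Matcher m = PATTERN1.matcher("no assignment here");
        boolean found = m.find(); // false: the text has no ' = "' part
        try {
            m.group(1);
            return false;
        } catch (IllegalStateException expected) {
            return !found;
        }
    }

    public static void main(String[] args) {
        Matcher parse1 = PATTERN1.matcher(""); // created once, reused per line
        System.out.println(parse1.groupCount());
        System.out.println(keywordOrNull(parse1, "GraphName = \"MyFavoriteMovies\""));
        System.out.println(groupAfterFailedFindThrows());
    }
}
```

  Note that the Matcher object itself is never null here; it is the missing match result that makes group() fail.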
  In general, this part is all about the details: how to match each line of data, how many times to match, and how to process the strings to get the data you need. Skilled use of if/else, or even switch, is particularly important.

  To make the pattern process described in this part easier to compare, and to discourage copying the code, the following figure is attached:

