Perl split function

A very useful function in Perl is split, which splits a string and returns the pieces as a list, usually stored in an array. The function takes a regular expression (RE) as the delimiter and works on the $_ variable if no string is specified.

The split function can be used like this:
$info = "Caine:Michael:Actor:14, Leafy Drive";
@personal = split(/:/, $info);
The result is:
@personal = ("Caine", "Michael", "Actor", "14, Leafy Drive");

If the information is stored in the $_ variable, the call can be shortened to:
@personal = split(/:/);

If the fields are separated by one or more colons, they can be split with the RE /:+/:
$_ = "Capes:Geoff::Shot putter:::Big Avenue";
@personal = split(/:+/);

The result is:
@personal = ("Capes", "Geoff", "Shot putter", "Big Avenue");

But the following code:
$_ = "Capes:Geoff::Shot putter:::Big Avenue";
@personal = split(/:/);

produces:
@personal = ("Capes", "Geoff", "", "Shot putter", "", "", "Big Avenue");

Words can be split into characters, sentences can be split into words, and paragraphs can be split into sentences:

@chars = split(//, $word);
@words = split(/ /, $sentence);
@sentences = split(/\./, $paragraph);

In the first statement, the empty pattern matches between every pair of characters, so @chars is an array of single characters.
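As a small illustration of the three statements above, here is a sketch with made-up sample strings (the values of $word, $sentence and $paragraph are assumptions, not from the original). It also shows that / / splits on every single space, so a double space produces an empty field, whereas the special pattern ' ' splits on runs of whitespace.

```perl
use strict;
use warnings;

# Sample inputs (hypothetical, chosen only for illustration).
my $word      = "split";
my $sentence  = "the  quick brown fox";    # note the double space
my $paragraph = "One. Two. Three.";

my @chars     = split(//, $word);          # one element per character
my @words     = split(/ /, $sentence);     # splits on each single space
my @words_ws  = split(' ', $sentence);     # splits on runs of whitespace
my @sentences = split(/\./, $paragraph);   # the periods are removed

print join("|", @chars), "\n";             # s|p|l|i|t
print join("|", @words), "\n";             # the||quick|brown|fox
print join("|", @words_ws), "\n";          # the|quick|brown|fox
print join("|", @sentences), "\n";         # One| Two| Three
```

Note that each sentence after the first keeps its leading space, because only the period is treated as the delimiter.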

=======================================================

I have only just learned Perl. A few days ago I stumbled across several ways of using split, so I tested how efficient each of them is.

First, a quote from the split manual:
split /PATTERN/, EXPR, LIMIT
split /PATTERN/, EXPR
split /PATTERN/
split

This function scans the string given by EXPR for delimiters and splits the string into a list of substrings, returning the resulting list in list context, or the number of substrings in scalar context. (Note: scalar context also causes split to write its result to @_, but this usage is now deprecated.) The delimiters are found by repeated pattern matching, using the regular expression given in PATTERN, so a delimiter can be of any size and does not have to be the same string on every match. (The delimiters are not ordinarily returned; we discuss exceptions later in this section.) If PATTERN does not match the string at all, split returns the original string as a single substring. If it matches once, you get two substrings, and so on. You can use regular expression modifiers in PATTERN, such as /PATTERN/i, /PATTERN/x, etc. If you split with the pattern /^/, then the //m modifier is assumed.
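The behavior described in the quoted paragraph can be sketched as follows. The sample string and variable names are assumptions made for illustration, and the scalar-context line assumes Perl 5.12 or later, where split no longer writes to @_.

```perl
use strict;
use warnings;

my $line = "alpha, beta;gamma";   # hypothetical sample data

# PATTERN never matches: the whole string comes back as one field.
my @none = split(/:/, $line);

# The delimiter may differ in content and length from match to match.
my @parts = split(/\s*[,;]\s*/, $line);

# Scalar context yields the number of fields (assumes Perl 5.12+).
my $count = split(/\s*[,;]\s*/, $line);

# Modifiers work as in any other regular expression.
my @nums = split(/x/i, "1X2x3");

print scalar(@none), " field: @none\n";   # 1 field: alpha, beta;gamma
print "$count fields: @parts\n";          # 3 fields: alpha beta gamma
print "@nums\n";                          # 1 2 3
```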

If LIMIT is specified and positive, the function splits into no more than that many fields (if it runs out of delimiters, it may of course return fewer). If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified. If LIMIT is omitted or zero, trailing empty fields are removed from the result (which potential pop users should keep in mind). If EXPR is omitted, the function splits the $_ string. If PATTERN is also omitted, or is a literal space, " ", the function splits on whitespace, /\s+/, but ignores any leading whitespace.
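A short sketch of the LIMIT rules and the " " special case described above, using invented sample strings ($rec and $text are assumptions for illustration only):

```perl
use strict;
use warnings;

my $rec = "a:b:c:::";                 # hypothetical record with trailing empty fields

my @default = split(/:/, $rec);       # LIMIT omitted: trailing empty fields dropped
my @two     = split(/:/, $rec, 2);    # positive LIMIT: at most two fields
my @all     = split(/:/, $rec, -1);   # negative LIMIT: trailing empty fields kept

my $text = "  leading and   trailing  ";
my @awk  = split(' ', $text);         # " " pattern: leading whitespace is skipped
my @re   = split(/\s+/, $text);       # /\s+/ pattern: leading empty field is kept

print scalar(@default), "\n";         # 3
print join("|", @two), "\n";          # a|b:c:::
print scalar(@all), "\n";             # 6
print scalar(@awk), "\n";             # 3
print scalar(@re), "\n";              # 4
```

With a positive LIMIT, the last field holds the unsplit remainder of the string, delimiters included.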

One day I came across a file in which each line has 18 items separated by \t, and only the item at index 6 (the seventh) was needed. Feeling curious that day, I tried several ways of extracting it:
1. my @array = split("\t", $_); my $var = $array[6]; Average time over the test file: 8.2s
2. my($var) = (split("\t", $_))[6]; Average time: 5.1s
3. my(undef, undef, undef, undef, undef, undef, $var) = split("\t", $_); Average time: 3.53s
4. my(undef, undef, undef, undef, undef, undef, $var) = split("\t", $_, 7); Average time: 3.52s
5. my $var = (split("\t", $_, 7))[6]; Average time: 3.53s

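It seems the last three are the way to go. If you need more than one item, adapt them accordingly. If the items you need are far apart, 3 and 4 should be good choices, while 5 would have to go through an intermediate array. Note that with an explicit LIMIT of 7, as in 4 and 5, the seventh field holds the entire rest of the line; a LIMIT of 8 returns only the item at index 6.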
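A comparison like this can be reproduced with the core Benchmark module. The following is only a sketch under stated assumptions: the test line is synthetic (18 tab-separated fields named field0 to field17), the variant names are invented here, and the explicit LIMIT is 8 rather than 7 so that every variant returns the same value. Absolute timings depend on the machine and the Perl version, so they will not match the figures above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input (an assumption): 18 tab-separated fields.
my $line = join("\t", map { "field$_" } 0 .. 17);

# Hypothetical names for the five approaches; each returns the item at index 6.
my %variants = (
    full_array => sub { my @array = split("\t", $line); my $var = $array[6]; $var },
    list_slice => sub { my ($var) = (split("\t", $line))[6]; $var },
    undef_list => sub {
        my (undef, undef, undef, undef, undef, undef, $var) = split("\t", $line);
        $var;
    },
    undef_limit => sub {
        my (undef, undef, undef, undef, undef, undef, $var) = split("\t", $line, 8);
        $var;
    },
    slice_limit => sub { my $var = (split("\t", $line, 8))[6]; $var },
);

# Run each variant 200,000 times and print a comparison table.
cmpthese(200_000, \%variants);
```

When split is assigned to a list of variables without a LIMIT, Perl supplies one that is one larger than the number of variables, which is a plausible reason why variants 3 and 4 perform almost identically.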
