Linux shell command notes: how to remove the serial numbers from citations [regular expressions]

Preface: I know the reference list could simply be re-exported, but it was already in this state when I received it, and going back to check hundreds of articles is not realistic.
In short, an entry such as
[1] A. Ben Hameda, S. Elosta, J. Havel Optimization of the capillary zone electrophoresis method for Huperzine A determination using experimental design and artificial neural networks[J]. Elsevier BV,2004,1084(1).
should be batch-converted to
A. Ben Hameda,S. Elosta,J. Havel. Optimization of the capillary zone electrophoresis method for Huperzine A determination using experimental design and artificial neural networks[J]. Elsevier BV,2004,1084(1).

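First, find the serial numbers [1] … [999]. The following commands are used:

grep: only prints the matching lines; it does not modify them

  1. grep "\[[0-9]\{1,\}]" x.txt
    Matches "[", at least one digit, then "]"
  2. grep "\[[0-9]\{1,3\}]" x.txt
    Matches "[", one to three digits, then "]"
  3. grep "\[[0-9]*]" x.txt
    Matches "[", any number of digits, then "]". This is the easiest to type, but it also matches an empty "[]"

The opening "[" has to be escaped as "\[", otherwise it starts a bracket expression. A "]" outside a bracket expression is an ordinary character.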
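As a quick check, the search can be tried on a few sample lines before running it on the real file. This is only a minimal sketch: the function name `find_numbered` and the sample text are made up for illustration, and it uses `grep -E` (extended syntax), where the interval is written `{1,3}` without backslashes.

```bash
# Hypothetical helper: print only the lines that contain a bracketed
# serial number of one to three digits, such as [1] or [999].
find_numbered() {
  grep -E '\[[0-9]{1,3}]'
}

printf '%s\n' '[1] First entry[J].' 'no serial number here' '[42] Second entry[J].' |
  find_numbered
```

Only the first and third sample lines are printed; `[J]` alone does not match because it contains no digit. Adding `-n` prefixes each match with its line number, and `-c` prints only the count of matching lines.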

cut: select fields and output them

  1. cut -d']' -f 2,3 x.txt >>y.txt
    Split each line on "]" and append fields 2 and 3 (the text after the serial number) to y.txt
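To see what the cut command actually keeps, it helps to run it on a single line. The sketch below also shows an anchored sed substitution as one possible alternative; this is my own suggestion rather than part of the original notes. The function names and the sample citation are hypothetical.

```bash
# Hypothetical helpers; both read citation lines on standard input.

# cut: split on "]" and keep fields 2 and 3 (rejoined with "]").
strip_with_cut() {
  cut -d ']' -f 2,3
}

# sed: delete a leading "[digits]" and the whitespace after it.
strip_with_sed() {
  sed -E 's/^\[[0-9]+][[:space:]]*//'
}

line='[1] A. Smith. A short title[J]. Some Journal,2004,1(1).'
printf '%s\n' "$line" | strip_with_cut
printf '%s\n' "$line" | strip_with_sed
```

The cut version leaves the space that followed the serial number at the start of the line, and it would lose everything after a third "]" if a line contained one. The sed version only touches a serial number at the very start of the line.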

The following script extracts all the titles in list.txt into title.txt, and writes the citations without serial numbers to rmnum.txt:

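#!/bin/bash
# Extract all titles from list.txt
if [ -f "./list.txt" ]; then
    # Drop the leading "[N]" by keeping the fields after the first "]"
    cut -d ']' -f 2,3 list.txt >> rmnum.txt
fi
if [ -f "./rmnum.txt" ]; then
    # Drop the text before the first "."
    cut -d "." -f 2,3,4,5,6 rmnum.txt > title_mid.txt
fi
if [ -f "./title_mid.txt" ]; then
    # Keep only the text before the first "[" (the "[J]" tag)
    cut -d "[" -f 1 title_mid.txt > title.txt
    echo "GOOD JOB!"
fi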

Revised code: the following script extracts all the titles in $1 into $2, and writes the citations without serial numbers to rmnum.txt:

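#!/bin/bash
# Extract all titles from the file given as $1 into the file given as $2
if [ -f "$1" ]; then
    # Drop the leading "[N]" by keeping the fields after the first "]"
    cut -d ']' -f 2,3 "$1" > rmnum.txt
fi
if [ -f "./rmnum.txt" ]; then
    # Drop the text before the first "."
    cut -d "." -f 2,3,4,5,6 rmnum.txt > title_mid.txt
fi
if [ -f "./title_mid.txt" ]; then
    # Keep only the text before the first "[" (the "[J]" tag)
    cut -d "[" -f 1 title_mid.txt > "$2"
    echo "GOOD JOB!"
fi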
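Running the three cut stages on one line shows what ends up in the "title". The sketch below chains them into a single pipeline instead of using intermediate files. The function name `extract_title` and the sample citation are hypothetical.

```bash
# Hypothetical helper: the same three cut stages as one pipeline,
# reading from standard input instead of intermediate files.
extract_title() {
  cut -d ']' -f 2,3 |          # drop the leading "[N"
    cut -d '.' -f 2,3,4,5,6 |  # drop everything up to the first "."
    cut -d '[' -f 1            # drop "[J" and whatever follows
}

printf '%s\n' '[7] A. Smith. A short title[J]. Some Journal,2004,1(1).' | extract_title
# prints " Smith. A short title"
```

Because author initials also contain periods, splitting on "." removes only the first initial here and leaves the surname in front of the title. With several authors, more of the names would remain.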

Remove leading whitespace from each line
sed 's/^[ \t]*//' x.txt >> y.txt

Remove trailing periods from each line
sed 's/[.]*$//' x.txt >> y.txt

Convert uppercase to lowercase
tr "[:upper:]" "[:lower:]" < x.txt > y.txt
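The three cleanup steps above can also be chained so that no intermediate file is needed. This is a minimal sketch, assuming the order trim, strip periods, lowercase. The function name `normalize_lines` and the sample text are hypothetical, and `[[:space:]]` is used instead of `[ \t]` because it does not depend on GNU sed.

```bash
# Hypothetical helper: trim leading whitespace, drop trailing periods,
# then convert to lowercase.
normalize_lines() {
  sed -e 's/^[[:space:]]*//' -e 's/[.]*$//' |
    tr '[:upper:]' '[:lower:]'
}

printf '  \tSome Title[J]. Some Journal,2004...\n' | normalize_lines
# prints "some title[j]. some journal,2004"
```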

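Merge two files and remove duplicate lines
sort ./x1.txt ./x2.txt | uniq > y.txt
(uniq -u on its own would keep only the lines that occur exactly once and drop every duplicated line entirely.)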
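The difference between plain `uniq`, `uniq -u` and `uniq -d` is easy to mix up, so here is a small comparison on made-up input. The function names are hypothetical; each one sorts first because uniq only compares adjacent lines.

```bash
# Hypothetical helpers; each reads lines on standard input.
dedupe()        { sort | uniq; }      # one copy of every distinct line
only_unique()   { sort | uniq -u; }   # lines that occur exactly once
only_repeated() { sort | uniq -d; }   # one copy of each line that repeats

printf '%s\n' b a b c | dedupe          # a, b, c (one per line)
printf '%s\n' b a b c | only_unique     # a, c
printf '%s\n' b a b c | only_repeated   # b
```

Taken together, the output of `uniq -u` and `uniq -d` contains the same lines as plain `uniq`, only in a different order.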

--------------------
I gave up on the approach above: many entries include several author names, which makes it hard to separate the names from the titles. So the script below keeps the author names and the titles together:

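#!/bin/bash
# For every entry in $1, extract the year plus the authors and title

# Convert everything to lowercase
tr "[:upper:]" "[:lower:]" < "$1" > "$1.x"
if [ -f "$1.x" ]; then
    cut -d ']' -f 2,3 "$1.x" > 1.x    # drop the leading "[N]"
    sed 's/^[ \t]*//' 1.x > 2.x       # trim leading whitespace
    sed 's/[.]*$//' 2.x > 3.x         # trim trailing periods

    sed -i '/^\s*$/d' 3.x             # delete blank lines (GNU sed)

    # Year: four-digit matches in the field after the first "["
    cut -d "[" -f 2 3.x | grep -o '[12][890][0-9][0-9]' > 4.x
    # Authors and title: the text before the first "["
    cut -d "[" -f 1 3.x > 5.x
    paste 4.x 5.x > 6.x

    # uniq -u (lines occurring once) plus uniq -d (one copy of each repeated
    # line) together give every distinct line once; the step is repeated
    # after appending so that "all" has no duplicates across runs
    sort 6.x | uniq -u >> all
    sort 6.x | uniq -d >> all
    sort all | uniq -u > allnew
    sort all | uniq -d >> allnew
    sort allnew | uniq -u > all
    sort allnew | uniq -d >> all
    echo "success"
fi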
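One thing to watch in the script above: `grep -o` prints every match on its own line and prints nothing for a line without a match, so the year file and the text file can drift out of step before `paste` joins them. A possible way around this is to take the year and the text from the same line in one pass. The sketch below uses awk for that, which is my own substitution and not the original method. The function name `year_and_entry` is hypothetical, and unlike the cut version it skips lines that contain no "[" and looks for the year anywhere after the first "[".

```bash
# Hypothetical helper: for each input line that contains "[", print
# "<year><TAB><text before the first [>". The year is the first
# four-digit match after that "[" (left empty if none is found).
year_and_entry() {
  awk '{
    i = index($0, "[")
    if (i == 0) next
    head = substr($0, 1, i - 1)
    rest = substr($0, i + 1)
    year = ""
    if (match(rest, /[12][890][0-9][0-9]/)) year = substr(rest, RSTART, RLENGTH)
    print year "\t" head
  }'
}

printf '%s\n' 'a. smith. a short title[j]. some journal,2004,1(1)' | year_and_entry
```

For the sample line this prints `2004`, a tab, and `a. smith. a short title`.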


Origin blog.csdn.net/mushroom234/article/details/109019450