I have a file called dat.txt with a few lines in it, and I want to count and remove the duplicate lines.
$ cat dat.txt
xml.1-lkj<mn1-1lkjmg1-1w13lg.rec
xml.1-CCJGL1-CCJGL1-CCJGL.rec
xml.1-BSDF0Q1-BW;LKJ1-BWP30Q.rec
xml.1-BSDF0Q1-BW;LKJ1-BWP30Q.rec
xml.1-LKJ<MN1-1LKJMG1-1W13LG.rec
xml.1-LKJ<MN1-1LKJMG1-1W13LG.rec
xml.1-2<MBMV1-NVNBVKJH21HMRE.rec
xml.1-2EW*&Y1-(878761-2AJKGY.rec
uniq is the command used to find duplicates. Because uniq only compares adjacent lines, you have to pass it a sorted file.
$ sort dat.txt|uniq
xml.1-2EW*&Y1-(878761-2AJKGY.rec
xml.1-2<MBMV1-NVNBVKJH21HMRE.rec
xml.1-BSDF0Q1-BW;LKJ1-BWP30Q.rec
xml.1-CCJGL1-CCJGL1-CCJGL.rec
xml.1-lkj<mn1-1lkjmg1-1w13lg.rec
xml.1-LKJ<MN1-1LKJMG1-1W13LG.rec
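Besides collapsing duplicates, uniq can also report them: -d prints one copy of each line that repeats, and -u prints only the lines that never repeat. A quick sketch on a small throwaway file (the file name and contents here are illustrative, not the original dat.txt):

```shell
# Build a small sample file
printf 'a.rec\nb.rec\nb.rec\nc.rec\n' > sample.txt

# Only the duplicated lines (one copy each)
sort sample.txt | uniq -d    # prints: b.rec

# Only the lines that occur exactly once
sort sample.txt | uniq -u    # prints: a.rec and c.rec
```

Note that `sort -u sample.txt` gives the same result as `sort sample.txt | uniq` in one step.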
Ignore case using the -i option:
$ sort dat.txt|uniq -i
xml.1-2EW*&Y1-(878761-2AJKGY.rec
xml.1-2<MBMV1-NVNBVKJH21HMRE.rec
xml.1-BSDF0Q1-BW;LKJ1-BWP30Q.rec
xml.1-CCJGL1-CCJGL1-CCJGL.rec
xml.1-lkj<mn1-1lkjmg1-1w13lg.rec
Count the occurrences of each line using the -c option:
$ sort dat.txt|uniq -ic
      1 xml.1-2EW*&Y1-(878761-2AJKGY.rec
      1 xml.1-2<MBMV1-NVNBVKJH21HMRE.rec
      2 xml.1-BSDF0Q1-BW;LKJ1-BWP30Q.rec
      1 xml.1-CCJGL1-CCJGL1-CCJGL.rec
      3 xml.1-lkj<mn1-1lkjmg1-1w13lg.rec
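The counts from uniq -c can be fed back into sort -rn to rank lines by how often they occur, which is handy for spotting the worst offenders in a large file. A sketch on illustrative data:

```shell
# Sample data: y occurs 3 times, z twice, x once
printf 'x\ny\ny\ny\nz\nz\n' > sample.txt

# uniq -c prepends a count; sort -rn puts the highest count first
sort sample.txt | uniq -c | sort -rn
# most frequent line (y) first, least frequent (x) last
```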
An awk one-liner performs the same task without sorting, preserving the original line order:
$ awk '!x[$0]++' dat.txt
xml.1-lkj<mn1-1lkjmg1-1w13lg.rec
xml.1-CCJGL1-CCJGL1-CCJGL.rec
xml.1-BSDF0Q1-BW;LKJ1-BWP30Q.rec
xml.1-LKJ<MN1-1LKJMG1-1W13LG.rec
xml.1-2<MBMV1-NVNBVKJH21HMRE.rec
xml.1-2EW*&Y1-(878761-2AJKGY.rec
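The trick here is that `x[$0]++` post-increments a counter in the array `x` keyed on the whole line, so `!x[$0]++` is true only the first time a line is seen; later occurrences are filtered out. A case-insensitive variant (a sketch, keying the array on `tolower($0)` instead of `$0`; file and data are illustrative):

```shell
printf 'A.rec\na.rec\nB.rec\n' > sample.txt

# Keep the first occurrence of each line, comparing case-insensitively
awk '!x[tolower($0)]++' sample.txt    # keeps A.rec and B.rec, drops a.rec
```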