The uniq command reports or removes repeated lines in a text file. It only compares adjacent lines, so it is usually combined with the sort command.


uniq [-c/d/D/u/i] [-f Fields] [-s N] [-w N] [InFile] [OutFile]


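-c: prefix each line with the number of times it occurred.
-d: print only repeated lines, one line per group.
-D: print all repeated lines, as many times as they occur.
-u: print only lines that are not repeated.
-i: ignore differences in case when comparing.
-f Fields: skip the first Fields whitespace-separated fields when comparing.
-s N: skip the first N characters when comparing.
-w N: compare no more than the first N characters of each line.
[InFile]: the input text file, which should already be sorted. If omitted, uniq reads from standard input.
[OutFile]: the output file. If omitted, uniq writes to standard output (the terminal).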
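
The InFile and OutFile operands can be illustrated with a minimal sketch. The scratch directory and the file names in.txt and out.txt are assumptions made for this example only:

```shell
# Scratch directory and file names are hypothetical.
tmp=$(mktemp -d)
printf 'a\na\nb\n' > "$tmp/in.txt"

# InFile only: the result goes to standard output.
from_file=$(uniq "$tmp/in.txt")

# No operands: uniq reads standard input.
from_stdin=$(printf 'a\na\nb\n' | uniq)

# InFile and OutFile: the result is written to the second file.
uniq "$tmp/in.txt" "$tmp/out.txt"
to_file=$(cat "$tmp/out.txt")

printf '%s\n' "$from_file"
rm -rf "$tmp"
```

All three variables end up holding the same two lines. Note that OutFile is overwritten if it already exists.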



uniq usage examples


vim uniq.txt

# cat uniq.txt
My name is Delav
My name is Delav
My name is Delav
I'm learning Java
I'm learning Java
I'm learning Java
who am i
Who am i
Python is so simple
My name is Delav
That's good
That's good
And studying Golang



1. Plain deduplication

uniq uniq.txt


# uniq uniq.txt 
My name is Delav
I'm learning Java
who am i
Who am i
Python is so simple
My name is Delav
That's good
And studying Golang


2. Show how many times each line occurs

uniq -c uniq.txt


# uniq -c uniq.txt
      3 My name is Delav
      3 I'm learning Java
      1 who am i
      1 Who am i
      1 Python is so simple
      1 My name is Delav
      2 That's good
      1 And studying Golang

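Notice that "My name is Delav" still appears on two separate lines above.

In other words, uniq has no effect on repeated lines that are not adjacent.

For that reason sort and uniq are usually used together; for details see the 米扑博客 article "Linux 删除重复行":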

sort uniq.txt | uniq -c


# sort uniq.txt | uniq -c
      1 And studying Golang
      3 I'm learning Java
      4 My name is Delav
      1 Python is so simple
      2 That's good
      1 who am i
      1 Who am i
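
Building on the sort + uniq -c pipeline above, a common follow-up is to rank lines by frequency with a second, numeric reverse sort. This is a sketch on hypothetical one-letter input, not part of the original example file:

```shell
# Hypothetical input: "a" occurs 3 times, "c" twice, "b" once, in mixed order.
ranked=$(printf 'c\na\nb\na\nc\na\n' \
  | sort \
  | uniq -c \
  | sort -rn \
  | awk '{print $1, $2}')   # drop the padding that uniq -c adds

printf '%s\n' "$ranked"
```

The first sort makes equal lines adjacent so that uniq -c can count them; the second orders the counted lines from most to least frequent.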



Merging two files and dropping repeated lines (union):

sort file1 file2 | uniq


3. Show only repeated lines, with their counts (intersection)

uniq -cd uniq.txt


# uniq -cd uniq.txt
      3 My name is Delav
      3 I'm learning Java
      2 That's good


Show every repeated line with -D; this option cannot be combined with -c

uniq -D uniq.txt


# uniq -d uniq.txt 
My name is Delav
I'm learning Java
That's good
# uniq -D uniq.txt
My name is Delav
My name is Delav
My name is Delav
I'm learning Java
I'm learning Java
I'm learning Java
That's good
That's good
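
When -D prints several groups back to back, it can be hard to see where one ends. GNU uniq offers the long form --all-repeated=separate, which inserts an empty line between groups. The sketch below assumes GNU coreutils (neither -D nor this long option is part of POSIX) and uses hypothetical input:

```shell
# Assumes GNU uniq; the input letters are made up for illustration.
groups=$(printf 'a\na\nb\nc\nc\n' | uniq --all-repeated=separate)

printf '%s\n' "$groups"
```

The unrepeated line "b" is dropped, and the "a" and "c" groups are separated by one empty line.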



sort file1 file2 | uniq -d
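
Applied to two files, these pipelines behave like set operations: plain uniq yields the union and uniq -d the intersection. A minimal sketch, assuming neither file contains duplicates of its own; the file contents and scratch directory are hypothetical:

```shell
# Two hypothetical files, each free of internal duplicates.
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/file1"
printf 'banana\ncherry\ndate\n'  > "$tmp/file2"

# Union: every line that appears in either file, once.
union=$(sort "$tmp/file1" "$tmp/file2" | uniq)

# Intersection: lines that appear in both files.
intersection=$(sort "$tmp/file1" "$tmp/file2" | uniq -d)

printf 'union:\n%s\n' "$union"
printf 'intersection:\n%s\n' "$intersection"
rm -rf "$tmp"
```

If a file repeats a line internally, uniq -d reports it even when the other file lacks it, so deduplicate each file first in that case.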


4. Show only lines that are not repeated; repeated lines are dropped entirely (removes the intersection)

sort uniq.txt | uniq -cu


# sort uniq.txt | uniq -cu
      1 And studying Golang
      1 Python is so simple
      1 who am i
      1 Who am i



sort file1 file2 | uniq -u
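
With two files, uniq -u keeps the lines found in only one of them. Listing one file twice is a well-known trick for a one-sided difference. The sketch assumes neither file has internal duplicates; the file contents are hypothetical:

```shell
# Two hypothetical files, each free of internal duplicates.
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/file1"
printf 'banana\ncherry\ndate\n'  > "$tmp/file2"

# Lines that occur in exactly one of the two files.
sym_diff=$(sort "$tmp/file1" "$tmp/file2" | uniq -u)

# Lines only in file1: file2 is listed twice, so each of its lines is
# guaranteed to be repeated and is therefore removed by -u.
only_in_file1=$(sort "$tmp/file1" "$tmp/file2" "$tmp/file2" | uniq -u)

printf 'either, not both:\n%s\n' "$sym_diff"
printf 'only in file1:\n%s\n' "$only_in_file1"
rm -rf "$tmp"
```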


5. Skip leading fields

Here -f 1 skips the first field of each line, so "who am i" and "Who am i" are treated as duplicates

uniq -c -f 1 uniq.txt


# uniq -c -f 1 uniq.txt
      3 My name is Delav
      3 I'm learning Java
      2 who am i
      1 Python is so simple
      1 My name is Delav
      2 That's good
      1 And studying Golang
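
A typical use of -f is collapsing repeated log messages whose first field is a timestamp. The log lines below are invented for illustration:

```shell
# Hypothetical log: field 1 is a timestamp, the rest is the message.
log=$(printf '%s\n' \
  '10:00:01 disk full' \
  '10:00:02 disk full' \
  '10:00:03 disk full' \
  '10:00:04 link up' \
  '10:00:05 disk full')

# Skip the timestamp when comparing; strip the padding added by -c.
collapsed=$(printf '%s\n' "$log" | uniq -c -f 1 | sed 's/^ *//')

printf '%s\n' "$collapsed"
```

Each group is represented by its first line, so the earliest timestamp is kept. The last "disk full" is counted separately because it is not adjacent to the first three.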


6. Ignore case

Here -i ignores case, so "who am i" and "Who am i" are treated as duplicates

uniq -c -i uniq.txt


# uniq -c -i uniq.txt
      3 My name is Delav
      3 I'm learning Java
      2 who am i
      1 Python is so simple
      1 My name is Delav
      2 That's good
      1 And studying Golang
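
Because uniq -i still only compares adjacent lines, unsorted input is usually passed through sort -f (fold case) first so that case variants end up next to each other. A sketch on hypothetical input:

```shell
# Hypothetical input with mixed-case variants in scattered positions.
counts=$(printf 'Apple\nbanana\napple\nBanana\nAPPLE\n' \
  | sort -f \
  | uniq -ci \
  | awk '{print $1, tolower($2)}')   # lowercase the surviving spelling

printf '%s\n' "$counts"
```

Which spelling uniq keeps for each group depends on the locale's sort order, which is why the sketch lowercases the result before printing.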


7. Skip the first N characters

Here -s 4 skips the first four characters of each line, so "who am i" and "Who am i" are treated as duplicates

uniq -c -s 4 uniq.txt


# uniq -c -s 4 uniq.txt
      3 My name is Delav
      3 I'm learning Java
      2 who am i
      1 Python is so simple
      1 My name is Delav
      2 That's good
      1 And studying Golang
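
The -s option is handy when each line starts with a fixed-width prefix that should not take part in the comparison. The sketch below assumes a made-up three-digit sequence number followed by a space:

```shell
# Hypothetical records: a 3-digit sequence number, a space, then the payload.
deduped=$(printf '001 foo\n002 foo\n003 bar\n' | uniq -s 4)

printf '%s\n' "$deduped"
```

Only the payload after the first four characters is compared, and the first line of each group is the one that is printed.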


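8. Ignore everything after the Nth character

Here -w 2 compares only the first two characters of each line. The first letters of "who am i" and "Who am i" differ, so they are not treated as duplicates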

uniq -c -w 2 uniq.txt


# uniq -c -w 2 uniq.txt
      3 My name is Delav
      3 I'm learning Java
      1 who am i
      1 Who am i
      1 Python is so simple
      1 My name is Delav
      2 That's good
      1 And studying Golang
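
Limiting the comparison width is a simple way to group lines by a fixed-width prefix, for example counting entries per day. The sketch assumes GNU uniq (-w is not part of POSIX) and hypothetical records that begin with a 10-character date:

```shell
# Hypothetical records that start with a YYYY-MM-DD date (10 characters).
per_day=$(printf '%s\n' \
  '2024-01-01 login' \
  '2024-01-01 logout' \
  '2024-01-02 login' \
  | uniq -c -w 10 \
  | awk '{print $2, $1}')   # print "date count"

printf '%s\n' "$per_day"
```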



uniq file.txt
sort file.txt | uniq 
sort -u file.txt 

uniq -u file.txt 
sort file.txt | uniq -u 

sort file.txt | uniq -c 

sort file.txt | uniq -d
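
For plain deduplication, sort piped into uniq and sort -u give the same result, which the following sketch checks on hypothetical input. The pipeline form is still needed whenever a uniq option such as -c, -d or -u is wanted:

```shell
# Hypothetical input with non-adjacent repeats.
via_uniq=$(printf 'b\na\nb\nc\na\n' | sort | uniq)
via_sort_u=$(printf 'b\na\nb\nc\na\n' | sort -u)

if [ "$via_uniq" = "$via_sort_u" ]; then
  echo "same result"
fi
```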




