介绍

sed流编辑器是一种文本编辑器,可对来自标准输入或文件的信息执行编辑操作。 Sed逐行且以非交互方式进行编辑。

This means that you make all of the editing decisions as you are calling the command and sed will execute the directions automatically. This may seem confusing or unintuitive, but it is a very powerful and fast way to transform text.

本教程将介绍一些基本操作,并向您介绍操作该编辑器所需的语法。几乎可以肯定,您永远不会用sed替换常规的文本编辑器,但是它可能会成为文本编辑工具箱中受欢迎的一部分。

基本用法

通常,sed对它从标准输入或文件中读取的文本流进行操作。

这意味着您可以将另一个命令的输出直接发送到sed中进行编辑,或者可以处理已经创建的文件。

您还应该知道sed默认情况下会将所有内容输出到标准输出。这意味着,除非重定向,否则sed会将其输出打印到屏幕上,而不是将其保存在文件中。

基本用法是:

sed [options] commands [file-to-edit]

我们将一些文件复制到主目录中以进行一些编辑。

cd
cp /usr/share/common-licenses/BSD .
cp /usr/share/common-licenses/GPL-3 .

让我们使用sed查看我们复制的BSD许可证文件的内容。

由于我们知道sed默认情况下会将其结果发送到屏幕,因此我们可以通过不向其传递任何编辑命令来将其用作文件读取器。让我们尝试一下:

sed '' BSD
Copyright (c) The Regents of the University of California.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
...
...

这是可行的,因为单引号包含了我们传递给sed的编辑命令。我们没有传递任何东西,因此它只是将接收到的每一行打印到标准输出中。

我们将通过将“ cat”命令的输出传递到sed中以产生相同的结果,来演示sed如何使用标准输入。

cat BSD | sed ''
Copyright (c) The Regents of the University of California.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
. . .
. . .

如您所见,我们可以轻松地对文件或文本流(如管道输出带有“ |”字符时产生的文本)进行操作。

印刷线

在前面的示例中,我们看到无需任何操作即可将输入传递给sed的结果会直接将结果打印到标准输出中。

现在,我们将探讨sed的显式“ print”命令,该命令通过在单引号内使用“ p”字符指定。

sed 'p' BSD
Copyright (c) The Regents of the University of California.
Copyright (c) The Regents of the University of California.
All rights reserved.
All rights reserved.


Redistribution and use in source and binary forms, with or without
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
modification, are permitted provided that the following conditions
are met:
are met:
. . .
. . .

您可以看到sed现在每行打印两次。这是因为它会自动打印每行,然后我们告诉它使用“ p”命令显式打印。

如果检查输出如何使第一行两次,然后使第二行两次,依此类推,您将看到sed逐行运行。它接受一行,对其进行操作,然后输出结果文本,然后在下一行重复该过程。

我们可以通过将“ -n”选项传递给sed来清除结果,从而禁止自动打印:

sed -n 'p' BSD
Copyright (c) The Regents of the University of California.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
. . .
. . .

现在,我们将每行打印一次。

地址范围

到目前为止,几乎都不能认为这些示例是编辑的(除非我们想将每行打印两次...)。让我们仅通过sed打印第一行来修改输出。

sed -n '1p' BSD
Copyright (c) The Regents of the University of California.
By placing the number "1" before the print command, we have told sed the line number to operate on. We can just as easily print five lines (don't forget the "-n"). 通过在打印命令前放置数字“ 1”,我们告诉sed行号可以进行操作。 我们可以轻松地打印五行(别忘了“ -n”)。
sed -n '1,5p' BSD
Copyright (c) The Regents of the University of California.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
We've just given an address range to sed. If we give sed an address, it will only perform the commands that follow on those lines. In this example, we've told sed to print line 1 through line 5. We could have specified this in a different way with by giving the first address and then using an offset to tell sed how many additional lines to travel: 我们刚刚给了sed一个地址范围。如果我们给sed一个地址,它将仅执行这些行之后的命令。在此示例中,我们告诉sed从第5行打印第1行。 我们可以通过给出第一个地址,然后使用偏移量来告诉sed要移动多少行,从而以其他方式指定了此地址:
sed -n '1,+4p' BSD

这将导致相同的输出,因为我们告诉sed从第1行开始,然后在接下来的4行中进行操作。

如果要每隔一行打印一次,可以在“〜”字符后指定间隔。下一行将从行1开始每隔一行打印一次:

sed -n '1~2p' BSD
Copyright (c) The Regents of the University of California.

modification, are permitted provided that the following conditions
1. Redistributions of source code must retain the above copyright
2. Redistributions in binary form must reproduce the above copyright
   documentation and/or other materials provided with the distribution.
   may be used to endorse or promote products derived from this software
. . .
. . .

删除文字

通过将“ p”命令更改为“ d”命令,我们可以轻松地执行以前删除文本打印的文本删除操作。

我们不再需要“ -n”命令,因为使用delete命令,sed将打印所有未删除的内容,这将帮助我们了解发生了什么。

我们可以修改上一节中的最后一条命令,使其从第一行开始每隔一行删除一次。结果是应该给我们上次没有得到的每一行。

sed '1~2d' BSD
All rights reserved.
Redistribution and use in source and binary forms, with or without
are met:
   notice, this list of conditions and the following disclaimer.
   notice, this list of conditions and the following disclaimer in the
3. Neither the name of the University nor the names of its contributors
   without specific prior written permission.
. . .
. . .

在此必须注意,我们的源文件未受到影响。它仍然完好无损。编辑将输出到我们的屏幕。

如果要保存所做的编辑,可以将标准输出重定向到如下文件:

sed '1~2d' BSD > everyother.txt

如果使用cat打开文件,我们将看到与以前在屏幕上看到的相同的输出。为了安全起见,Sed默认不会编辑源文件。

通过传递sed的“ -i”选项,我们可以更改此行为,这意味着就地执行编辑。这将编辑源文件。

让我们通过就地编辑刚刚创建的“ every other.text”文件来进行尝试。让我们通过再次删除其他所有行来进一步减少文件:

sed -i '1~2d' everyother.txt

如果再次使用cat,则可以看到该文件已被编辑。

如前所述,“-i”选项可能很危险!幸运的是,sed使我们能够在编辑之前创建备份文件。

要在编辑之前创建备份文件,请在“ -i”选项后直接添加备份扩展名:

sed -i.bak '1~2d' everyother.txt

这将创建扩展名为“ .bak”的备份文件,然后就地编辑常规文件。

替代文字

Perhaps the most well-known use for sed is substituting text. Sed has the ability to search for text patterns using regular expressions, and then replace the found text.

如果需要,请单击此处了解正则表达式。

以最简单的形式,您可以使用以下语法将一个单词更改为另一个单词:

's/old_word/new_word/'

“ s”是替代命令。三个斜杠(/)用于分隔不同的文本字段。如果更有用,则可以使用其他字符来分隔字段。

例如,如果我们试图更改网站名称,则使用另一个定界符将很有帮助,因为URL包含斜杠。在示例中,我们将使用echo进行管道传输:

echo "http://www.example.com/index.html" | sed 's_com/index_org/home_'
http://www.example.org/home.html

不要忘记最后的定界符,否则sed会抱怨。

echo "http://www.example.com/index.html" | sed 's_com/index_org/home'
sed: -e expression #1, char 22: unterminated `s' command

让我们创建一个文件来实践以下内容:

echo "this is the song that never ends
yes, it goes on and on, my friend
some people started singing it
not knowing what it was
and they'll continue singing it forever
just because..." > annoying.txt

现在,将表达式“ on”替换为“ forward”。

sed 's/on/forward/' annoying.txt
this is the sforwardg that never ends
yes, it goes forward and on, my friend
some people started singing it
not knowing what it was
and they'll cforwardtinue singing it forever
just because...

您可以在这里看到一些值得注意的事情。首先,是我们要替换模式,而不是文字。 “歌曲”中的“打开”更改为“前进”。

还要注意的另一件事是,在第2行上,第二个“ on”未更改为“ forward”。

这是因为默认情况下,“ s”命令在一行的第一个匹配项上操作,然后移至下一行。

为了使sed替换“ on”的每个实例而不是仅替换每行的第一个实例,我们可以将一个可选标志传递给替代命令。

我们将通过将其放置在替换集之后来为替换命令提供“ g”标志。

sed 's/on/forward/g' annoying.txt
this is the sforwardg that never ends
yes, it goes forward and forward, my friend
some people started singing it
not knowing what it was
and they'll cforwardtinue singing it forever
just because...

现在,替换命令将更改每个实例。

如果我们只想更改sed在每一行中找到的“ on”的第二个实例,那么我们可以使用数字“ 2”代替“ g”。

sed 's/on/forward/2' annoying.txt
this is the song that never ends
yes, it goes on and forward, my friend
some people started singing it
not knowing what it was
and they'll continue singing it forever
just because...

如果我们只想查看替换了哪些行,则可以再次使用“ -n”选项来禁止自动打印。

然后,我们可以将“ p”标志传递给替代命令,以打印发生替代的行。

sed -n 's/on/forward/2p' annoying.text
yes, it goes on and forward, my friend

如您所见,我们可以在命令末尾组合标志。

如果我们希望搜索过程忽略大小写,可以将其传递给“ i”标志。

sed 's/SINGING/saying/i' annoying.txt
this is the song that never ends
yes, it goes on and on, my friend
some people started saying it
not knowing what it was
and they'll continue saying it forever
just because...

引用匹配的文本

如果希望使用正则表达式查找更复杂的模式,则可以使用多种不同的方法在替换文本中引用匹配的模式。

例如,如果我们要从行首匹配“ at”,则可以使用以下表达式:

sed 's/^.*at/REPLACED/' annoying.txt
REPLACED never ends
yes, it goes on and on, my friend
some people started singing it
REPLACED it was
and they'll continue singing it forever
just because...

您可以看到通配符表达式从行的开头到“ at”的最后一个实例匹配。

由于您不知道搜索字符串中将匹配的确切短语,因此可以使用“&”字符来表示替换字符串中的匹配文本。

本示例说明如何在匹配的文本周围加上括号:

sed 's/^.*at/(&)/' annoying.txt
(this is the song that) never ends
yes, it goes on and on, my friend
some people started singing it
(not knowing what) it was
and they'll continue singing it forever
just because...

引用匹配文本的一种更灵活的方法是使用转义括号对匹配文本的各个部分进行分组。

带有括号的每组搜索文本都可以通过转义的参考数字来参考。例如,第一个括号组可以用“ \ 1”引用,第二个括号组可以用“ \ 2”引用,依此类推。

在此示例中,我们将切换每行的前两个单词:

sed 's/\([a-zA-Z0-9][a-zA-Z0-9]*\) \([a-zA-Z0-9][a-zA-Z0-9]*\)/\2 \1/' annoying.txt
is this the song that never ends
yes, goes it on and on, my friend
people some started singing it
knowing not what it was
they and'll continue singing it forever
because just...

如您所见,结果并不完美。例如,第二行跳过第一个单词,因为它的字符集未列出该字符。同样,它在第五行将“他们”视为两个单词。

让我们改进正则表达式以使其更准确:

sed 's/\([^ ][^ ]*\) \([^ ][^ ]*\)/\2 \1/' annoying.txt
is this the song that never ends
it yes, goes on and on, my friend
people some started singing it
knowing not what it was
they'll and continue singing it forever
because... just

这比上一次要好得多。这会将标点符号与相关单词组合在一起。

注意我们如何在括号内重复该表达式(一次不带“ *”字符,然后一次)。这是因为“ *”字符与零次或多次出现的字符集匹配。

This means that the match with the wildcard would be considered a "match" even if the pattern is not found.

为确保至少找到一次,我们必须在使用通配符之前将其匹配一次且不带通配符。

结论

我们仅介绍了sed的一些基础知识。您应该已经能够看到如何使用正确构造的sed命令快速转换文本文档。

在本系列的下一篇文章中,我们将介绍sed的一些更高级的功能。