I have a file called hostlist.txt that contains text like this:
host1.mydomain.com
host2.mydomain.com
anotherhost
www.mydomain.com
login.mydomain.com
somehost
host3.mydomain.com
I have the following small script:
#!/usr/local/bin/bash
while read host; do
    dig +search @ns1.mydomain.com $host ALL | sed -n '/;; ANSWER SECTION:/{n;p;}';
done <hostlist.txt | gawk '{print $1","$NF}' >fqdn-ip.csv
Which outputs to fqdn-ip.csv:
host1.mydomain.com.,10.0.0.1
host2.mydomain.com.,10.0.0.2
anotherhost.internal.mydomain.com.,10.0.0.11
www.mydomain.com.,10.0.0.10
login.mydomain.com.,10.0.0.12
somehost.internal.mydomain.com.,10.0.0.13
host3.mydomain.com.,10.0.0.3
My question is: how do I remove the . just before the comma without invoking sed or gawk again? Is there a step I can perform in the existing sed or gawk calls that will strip the dot?
hostlist.txt will contain thousands of hosts, so I want my script to be fast and efficient.
Answer
The sed command, the awk command, and the removal of the trailing period can all be combined into a single awk command:
while read -r host; do dig +search "$host" ALL; done <hostlist.txt | awk 'f{sub(/.$/,"",$1); print $1", "$NF; f=0} /ANSWER SECTION/{f=1}'
Or, as spread out over multiple lines:
while read -r host
do
    dig +search "$host" ALL
done <hostlist.txt | awk 'f{sub(/.$/,"",$1); print $1", "$NF; f=0} /ANSWER SECTION/{f=1}'
Because the awk command follows the done statement, only one awk process is invoked for the whole loop. Although efficiency may not matter here, this is more efficient than creating a new sed or awk process on each iteration of the loop.
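The single-process point can be tried offline with a small sketch. Here fake_dig is a hypothetical stand-in for dig (it is not a real command, and its canned reply with a 192.0.2.1 address is made up), and the END block is added only to show that one awk process receives the lines from every iteration:

```shell
# fake_dig is a hypothetical stand-in for dig so the sketch runs without DNS.
# It prints a header line, one answer line, and a blank line per host.
fake_dig() {
  printf ';; ANSWER SECTION:\n%s. 300 IN A 192.0.2.1\n\n' "$1"
}

result=$(
  printf '%s\n' host1.example.com host2.example.com |
  while read -r host; do
    fake_dig "$host"
  done |
  awk 'f { sub(/.$/, "", $1); print $1 ", " $NF; f = 0 }
       /ANSWER SECTION/ { f = 1 }
       END { print "lines seen by this one awk process: " NR }'
)
printf '%s\n' "$result"
```

Since NR keeps counting across both hosts, the final line reports the combined number of input lines rather than restarting per host.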
Example
With this test file:
$ cat hostlist.txt
www.google.com
fd-fp3.wg1.b.yahoo.com
The command produces:
$ while read -r host; do dig +search "$host" ALL; done <hostlist.txt | awk 'f{sub(/.$/,"",$1); print $1", "$NF; f=0} /ANSWER SECTION/{f=1}'
www.google.com, 216.58.193.196
fd-fp3.wg1.b.yahoo.com, 206.190.36.45
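The CSV shown in the question has no space after the comma, while the command above prints a comma followed by a space. A minimal sketch of a variant follows, assuming the space-free format is what is wanted and that only a literal trailing dot should be stripped (hence the escaped \. in the pattern). The sample text is hand-written to resemble part of a dig reply so the snippet runs without a DNS server; real dig output contains more lines.

```shell
# Hand-written stand-in for part of one dig reply (an assumption, not captured output).
sample=';; QUESTION SECTION:
;host1.mydomain.com.    IN  A

;; ANSWER SECTION:
host1.mydomain.com.  3600  IN  A  10.0.0.1

;; Query time: 1 msec'

# Same flag logic as the answer, with two small changes:
#   /\.$/   matches only a literal dot at the end of the first field
#   ","     joins the fields without a space
csv_line=$(printf '%s\n' "$sample" |
  awk 'f { sub(/\.$/, "", $1); print $1 "," $NF; f = 0 }
       /ANSWER SECTION/ { f = 1 }')

printf '%s\n' "$csv_line"
```

In the real loop, the same awk program would replace the one after done, with >fqdn-ip.csv appended to write the file.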
How it works
awk implicitly reads its input one record (line) at a time. This awk script uses a single variable, f, which signals whether the previous line was an answer section header or not.
f{sub(/.$/,"",$1); print $1", "$NF; f=0}
If the previous line was an answer section header, then f will be true and the commands in curly braces are executed. The first removes the last character of the first field, which in dig output is the trailing period. The second prints the first field, followed by ", " (a comma and a space), followed by the last field. The third statement resets f to zero (false).
In other words, f here functions as a logical condition. The commands in curly braces are executed if f is nonzero (which, in awk, means 'true').
/ANSWER SECTION/{f=1}
If the current line contains the string ANSWER SECTION, then the variable f is set to 1 (true).
Here, /ANSWER SECTION/ serves as a logical condition. It evaluates to true if the current line matches the regular expression ANSWER SECTION. If it does, then the command in curly braces is executed.
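To watch the flag change from line to line, the following sketch prints the value of f as each line arrives. The four input lines are made up for illustration (www.example.com and 192.0.2.10 are placeholders), and the printing and field editing from the real script are left out so that only the flag logic remains:

```shell
trace=$(printf '%s\n' \
    ';; QUESTION SECTION:' \
    ';; ANSWER SECTION:' \
    'www.example.com. 300 IN A 192.0.2.10' \
    ';; Query time: 1 msec' |
  awk '{ printf "line %d: f=%d\n", NR, f }   # value of f when the line is read
       f { f = 0 }                           # reset after the line following the header
       /ANSWER SECTION/ { f = 1 }            # header seen: flag the next line
      ')

printf '%s\n' "$trace"
```

Only the line directly after the header is read with f set to 1, which is why the real script prints exactly one record per answer section.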