extract the contents in between the tags

Friday, December 28, 2012


<?xml version="1.0" encoding="utf-8"?>
<job xmlns="http://www.sample.com/">programming</job>

I need a way to extract what is inside the <job ...> </job> tags, "programming" in this case.
This should be done at the Linux command prompt, using grep/sed/awk.

solution-1
----------
grep '<job' file_name | cut -f2 -d">"|cut -f1 -d"<"
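
A minimal sketch of solution-1 run end to end, assuming the sample document is saved in a scratch file (sample.xml is a hypothetical name) and that the whole element sits on one line, as in the sample:

```shell
# Recreate the sample document in a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/sample.xml" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<job xmlns="http://www.sample.com/">programming</job>
EOF

# grep keeps only the line containing "<job";
# the first cut takes the text after the first ">",
# the second cut drops everything from the next "<" onwards.
result=$(grep '<job' "$tmp/sample.xml" | cut -f2 -d">" | cut -f1 -d"<")
echo "$result"

# Clean up the scratch directory.
rm -rf "$tmp"
```

Because cut splits on single characters, this only works when the opening tag, the text and the closing tag share a line and the attributes contain no ">" character.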

solution-2
----------
sed -ne '/<\/job>/ { s/<[^>]*>\(.*\)<\/job>/\1/; p; }' file_name
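notes:
-n stops sed from printing every line automatically;
-e introduces the script given on the command line (as opposed to a script file);
/<\/job>/ is an address that acts like a grep, selecting only lines that contain </job>;
s strips the opening tag (with its attributes) and the closing tag, keeping the text in between;
; separates one command from the next;
p prints the line;
{} groups the two commands so that the address applies to both of them as one unit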
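
A small sketch of solution-2 applied to the sample, assuming the XML is fed in on standard input from a shell variable (the variable names are made up for illustration). A ";" is placed before the closing brace, which some sed implementations require:

```shell
# The sample document, held in a variable for the demo.
xml='<?xml version="1.0" encoding="utf-8"?>
<job xmlns="http://www.sample.com/">programming</job>'

# Only lines containing </job> are processed: the s command removes the
# opening tag (with attributes) and the closing tag, then p prints the rest.
sed_result=$(printf '%s\n' "$xml" | sed -ne '/<\/job>/ { s/<[^>]*>\(.*\)<\/job>/\1/; p; }')
echo "$sed_result"
```

The XML declaration line is skipped because it does not match the /<\/job>/ address, so only the text between the job tags is printed.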
