Hello, everybody on this great Linux forum.
My problem is as follows:
In the bash shell I have to create a file called file.txt, so I type these commands:
[emailprotected]:> cat > file.txt
A aba aabb aabbb aaabbb
Aa baa abab ababa bababa
Ab bbb baaa abbba bbaabb
Ba aaaa baba bbaab bbabaa
aaa aaba bbbb bbbaa bbbabb
[emailprotected]:>
The problem is that, using the 'grep' command, I must filter the two lines that contain a word made up only of the letter 'a'. In the file, the fourth and fifth lines have the words aaaa and aaa, but I have tried every grep command and option I could think of and it has been impossible for me to do this. The most I have managed is to filter the fourth line, which contains the word aaaa, but not both lines.
After this, I must filter the lines that have a word starting with 'b' followed by 'a' letters and no further 'b', for example the line with the word 'baa' (the second line) and the one with 'baaa' (the third line)...
I know that this isn't easy, but I have spent four hours trying again and again to solve these two questions and I feel worn out.
Thanks for reading my two questions.

So the first one: you want to find a word that is preceded by whitespace ([[:space:]]) or is at the start of the line (^) and contains only 'a's - in other words is followed by whitespace or is at the end of the line ($).
Using the -E command line switch makes regular expressions a bit easier to use (e.g. + is acceptable instead of \+).
grep -E "(^|[[:space:]])a+($|[[:space:]])" file.txt
For the second one, how about trying "ba+([[:space:]]|$)"?
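For anyone who wants to reproduce this, here is a minimal sketch. It assumes GNU grep and that file.txt holds the five lines from the question, five words per line; the scratch directory and the variable names are only for illustration. For the second question it adds the same left-hand anchor as in the first command, which is an addition to the hint: without it, the pattern can also hit the tail of a longer word such as 'ababa'.

```shell
#!/bin/sh
# Work in a scratch directory so an existing file.txt is not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the file from the question (assumed layout: five words per line).
cat > file.txt <<'EOF'
A aba aabb aabbb aaabbb
Aa baa abab ababa bababa
Ab bbb baaa abbba bbaabb
Ba aaaa baba bbaab bbabaa
aaa aaba bbbb bbbaa bbbabb
EOF

# First question: lines that contain a word made only of the letter 'a'.
# Single quotes keep the shell from interpreting '$' or the parentheses.
only_a=$(grep -E '(^|[[:space:]])a+($|[[:space:]])' file.txt)
printf 'only a:\n%s\n' "$only_a"

# Second question: a word that is 'b' followed only by 'a's.
# The (^|[[:space:]]) on the left is an addition to the hint.
b_then_a=$(grep -E '(^|[[:space:]])ba+([[:space:]]|$)' file.txt)
printf 'b then a:\n%s\n' "$b_then_a"
```

The first command should print the fourth and fifth lines, the second one the second and third lines.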
Please demonstrate that you understand this, otherwise I will not be happy helping you with your homework again.

Thanks very much. I'm going to try your command lines, and after that I will try to show you how they work. Thanks. I'll be back in a few minutes...

Well, I'm back... These command lines work well. I think you are a Linux "WIZARD". I couldn't imagine how complicated these lines would be.
I must give you a million thanks for them and for your help.
THANKS. THANKS. THANKS. THANKS. THANKS a lot.
Now I must pay my debt, so I am going to study these command lines and try to explain how they work.
I'll be back in a few more minutes...

Let's go!
The expression to describe is:
grep -E "(^|[[:space:]])a+($|[[:space:]])" file.txt
grep -E is the same as egrep. So the preceding command line is the same as: egrep "(^|[[:space:]])a+($|[[:space:]])" file.txt
-E is an option to interpret the pattern as an extended regular expression.
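As a small check of what -E changes, the sketch below runs one pattern in both syntaxes. It assumes GNU grep, where \+ is accepted in basic mode as an extension; the three sample lines and the variable names are made up for the illustration.

```shell
#!/bin/sh
# Made-up sample input, one case per line.
sample=$(printf '%s\n' 'aab' 'a+b' 'b')

# Extended syntax (-E): + means "one or more of the preceding item".
extended=$(printf '%s\n' "$sample" | grep -E 'a+b')

# Basic syntax: the same operator has to be written as \+ (GNU extension).
basic_escaped=$(printf '%s\n' "$sample" | grep 'a\+b')

# Basic syntax with a bare +: the plus sign is an ordinary character.
basic_plain=$(printf '%s\n' "$sample" | grep 'a+b')

printf 'extended: %s\nbasic with \\+: %s\nbasic with +: %s\n' \
    "$extended" "$basic_escaped" "$basic_plain"
```

The first two commands should select 'aab', while the last one should select only the line that literally contains 'a+b'.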
A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.
Thus, this is the complete expression: "(^|[[:space:]])a+($|[[:space:]])" It is composed of smaller subexpressions: ^|[[:space:]] and a+ and $|[[:space:]]
A whole subexpression may be enclosed between parentheses to override the precedence rules.
So the preceding command line is the same as: grep -E "(^|[[:space:]])(a+)($|[[:space:]])" file.txt (note the new parentheses around a+).
Regular expressions may be joined by the infix operator |. The resulting expression matches any string matching either subexpression.
If the first character of the subexpression is the caret ^, then it matches any character not in the list. (But I must confess that I don't understand how the caret works in this subexpression, because if I remove the caret ^ and the operator | that follows it and run the command line, then only the fourth line of file.txt appears, that is: Ba aaaa baba bbaab bbabaa. If possible, I would like you to explain this detail to me.)
Certain named classes of characters are predefined within bracket expressions. Their names are self-explanatory, and one is [:space:] (this does not mean the Universe, only the space between words). These brackets are part of the symbolic name, and must be included in addition to the brackets delimiting the bracket list. This is the reason the subexpression is written [[:space:]]
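To see why the class is written [[:space:]] rather than a plain space, here is a small sketch with made-up input in which the word 'aaa' is separated by tabs on one line and by spaces on the other. The input and the variable names are assumptions for the illustration only.

```shell
#!/bin/sh
# Made-up input: tabs around 'aaa' on the first line, spaces on the second.
sample=$(printf 'x\taaa\ty\nx aaa y\n')

# [[:space:]] accepts tabs as well as spaces, so both lines are selected.
with_class=$(printf '%s\n' "$sample" | grep -E '(^|[[:space:]])a+($|[[:space:]])')

# A literal space in the pattern only selects the space-separated line.
with_blank=$(printf '%s\n' "$sample" | grep -E '(^| )a+($| )')

printf 'with [[:space:]]:\n%s\nwith a literal space:\n%s\n' "$with_class" "$with_blank"
```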
a+ means the preceding item 'a' will be matched one or more times, because of the '+'. (This is simple.)
$ isn't explained in "man grep" (I think the author of "man grep" forgot to explain it). I know that $ means the last character of the string, but I have always used it with grep after a matched character, something like a$ (to say that 'a' is the last character in the string I'm looking for). If possible, I would like you to explain this detail to me.
Because of these details that I can't understand from the manuals, I understand the subexpression (^|[[:space:]]) as "look for the line which contains a string that starts with a space", and ($|[[:space:]]) as "look for the line which contains a string that ends with a space".
Well, Robpomeroy... I want to know whether I have done my homework well now....

You're almost there. ^ and $ are causing you confusion. That's forgivable because they mean different things in different contexts. In my example, ^ means the start of a line and $ means the end of a line. I'll just break down one, to help you:
(^|[[:space:]])
^: start of the line
|: OR
[[:space:]]: any whitespace - tabs, spaces, etc.
(): bundle up the 'OR'
Hence in full: ( the start of the line OR any whitespace )
You can now work out what ($|[[:space:]]) means.

Hi Robpomeroy, and thanks again for all the practical lessons I can learn from the information you give me.
Thanks.
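The breakdown of (^|[[:space:]]) can also be checked by running each branch of the OR separately. This is only a sketch, assuming GNU grep; the two input lines are the fourth and fifth lines of file.txt from the question, and the variable names are made up. The last command shows the other meaning of the caret, inside brackets.

```shell
#!/bin/sh
# Fourth and fifth lines of file.txt from the question.
sample=$(printf '%s\n' 'Ba aaaa baba bbaab bbabaa' 'aaa aaba bbbb bbbaa bbbabb')

# Only the '^' branch: the all-'a' word has to open the line.
at_start=$(printf '%s\n' "$sample" | grep -E '^a+($|[[:space:]])')

# Only the whitespace branch: the word has to follow a space or a tab.
after_space=$(printf '%s\n' "$sample" | grep -E '[[:space:]]a+($|[[:space:]])')

# Both branches joined with |, as in the full command.
either=$(printf '%s\n' "$sample" | grep -E '(^|[[:space:]])a+($|[[:space:]])')

# Inside brackets the caret negates instead: [^a] is "any character except a".
negated=$(printf '%s\n' 'aaa' 'aba' | grep -E '[^a]')

printf 'start only: %s\nafter space only: %s\n' "$at_start" "$after_space"
printf 'either:\n%s\nnegated: %s\n' "$either" "$negated"
```

With only the whitespace branch, the fifth line is lost because its 'aaa' has nothing in front of it, which is the behaviour described earlier in the thread when the ^ and the | were removed.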
Harlequin

Good!