Inspecting

Use your browser to open any web page and save the source as page.html in your home directory.
View the file's contents:
```
$ cat page.html
```
That usually makes the contents whiz by so that you only see the very end. This variation allows you to page through the file one screen at a time:
```
$ more page.html
```
Press the space bar to page forward, and press q to quit. Traditionally the more command only pages forward, but less lets you page back and forth with the additional commands b (back) and f (forward):
```
$ less page.html
```
- page
See the start or end of the file:
```
$ head page.html
$ tail page.html
```
For either command, specify the number of lines to display:
```
$ head -10 page.html
```
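The same option works with tail. As a minimal sketch, assuming a scratch file named numbered.txt (a made-up name, created here only so the output is predictable), this shows both commands with an explicit line count. The -n 2 spelling is the standard form of the -2 shorthand:

```shell
# Create a six-line scratch file (numbered.txt is a hypothetical name).
printf 'line %s\n' 1 2 3 4 5 6 > numbered.txt

# Show the first two lines.
head -n 2 numbered.txt

# Show the last two lines.
tail -n 2 numbered.txt
```

The first command prints line 1 and line 2, and the second prints line 5 and line 6.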
Optionally clear the screen before displaying the output:
```
$ clear ; head -10 page.html
```
Look for stuff in the file:
```
$ grep body page.html
```
It displays every line of text that contains body anywhere in it. In Unix, lines are a very important unit of content, even though each line can be very long. A line that is longer than the terminal is wide (traditionally 80 characters) wraps onto the following row of the display, but it still counts as a single line.
- line
A case-insensitive search for a multi-word string:
```
$ grep -i "BODY class" page.html
```
Note the need for the quotes. Without them, the shell passes BODY and class to grep as two separate arguments, so grep searches for BODY alone and treats class as the name of a file to search. (If this doesn't work, try it with any other two adjacent words you see in the file.) A string is any sequence of characters, and can include letters, digits, whitespace, and punctuation characters.
- string
- case-insensitive
- line
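To see the effect of the quotes, here is a small sketch that assumes a scratch file named quoted.html (a hypothetical name) and that no file called class exists in the current directory:

```shell
# Create a five-line scratch file (quoted.html is a made-up name).
printf '%s\n' '<html>' '<body class="home">' '<p>Hello</p>' '</body>' '</html>' > quoted.html

# Quoted: grep receives one pattern, "BODY class", and one file name.
grep -i "BODY class" quoted.html

# Unquoted: grep receives the pattern BODY plus two file names, class and
# quoted.html. It reports a problem with class, then searches quoted.html
# for BODY alone. The || part prints grep's exit status if it is nonzero.
grep -i BODY class quoted.html || echo "grep exited with status $?"
```

The quoted search prints only the line containing body class, while the unquoted search matches every line containing body, including the closing tag.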
See more context in the output:
```
$ grep -n -C 1 "body" page.html
```
The -n option shows the line number on which each match appears, which tells you how far down from the top of the file it is. The -C (context) option shows one extra line before and after each match, with separate chunks of output divided by a line containing --. Try changing the 1 to 2 or more to see more of the file that surrounds each match.
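A short sketch of what that output looks like, assuming a scratch file named context.txt (a hypothetical name) with two matches far enough apart to land in separate chunks:

```shell
# Create an eight-line scratch file (context.txt is a made-up name).
printf '%s\n' 'one' 'body A' 'two' 'three' 'four' 'five' 'body B' 'six' > context.txt

# Matching lines are marked with a colon after the line number,
# context lines with a hyphen, and chunks are divided by --.
grep -n -C 1 "body" context.txt
```

This prints:

```
1-one
2:body A
3-two
--
6-five
7:body B
8-six
```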
The name grep comes from global regular expression print (after g/re/p, a command in the ed editor), a fancy way of saying the stuff within the quotes is a pattern rather than plain text. Regular expressions (aka regex or regexps) offer a system of matching patterns. These patterns resemble shell wildcards, but work differently. Suppose the word body appears all over the place and you only want to see it when it's used as an HTML tag. You could do one command each to match the open and close tags:
```
$ grep -n "<body" page.html
$ grep -n "</body" page.html
```
But in this variation, the * specifies zero or more of the preceding / character, so it matches both scenarios:
```
$ grep -n "</*body" page.html
```
This is a simple regular expression, but regular expressions can do very powerful and complex things. You'll encounter slight variations in support for them in three kinds of environment: in simple line-based Unix utilities such as grep and sed, in more complex text editors such as Emacs, and in full programming languages such as Python, JavaScript, and Perl.
- pattern
- regex, regexp
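To check that the </*body pattern matches tags but not the plain word, here is a minimal sketch assuming a scratch file named tags.html (a hypothetical name):

```shell
# Create a three-line scratch file (tags.html is a made-up name).
printf '%s\n' '<body class="home">' 'The body of the page.' '</body>' > tags.html

# < must be followed by zero or more / characters and then body,
# so line 2 (no angle bracket) is skipped.
grep -n "</*body" tags.html
```

This prints lines 1 and 3, the opening and closing tags.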
Maybe you just want to count (-c) how many lines match in each of a bunch of files. Perhaps you also want to search recursively (-r) through all nested subdirectories. This is a powerful way to inspect a directory structure:
```
$ grep -cr "body" *
```
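As a small sketch of counting across a directory tree, assuming a scratch directory named site (a hypothetical name) with one file at the top and one in a subdirectory:

```shell
# Build a tiny directory tree (site and its file names are made up).
mkdir -p site/sub
printf '%s\n' '<body>' 'body body' '</body>' > site/index.html
printf '%s\n' 'no match here' > site/sub/empty.html

# Print each file name followed by its count of matching lines.
# The line "body body" counts once: -c counts lines, not occurrences.
grep -cr "body" site
```

The output lists site/index.html:3 and site/sub/empty.html:0, in whatever order grep visits the files.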
The -v option inverts the match, so this prints every line that doesn't contain an opening angle bracket, the character that starts an HTML tag:
```
$ grep -v "<" *.html
```
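Here is the same idea on a single scratch file, assuming the hypothetical name mixed.html, so that it is easy to see which lines survive:

```shell
# Create a four-line scratch file (mixed.html is a made-up name).
printf '%s\n' '<h1>Title</h1>' 'Plain text line' '<p>Tagged</p>' 'Another plain line' > mixed.html

# Print only the lines that do NOT contain <.
grep -v "<" mixed.html
```

Only the two plain lines are printed. With a single file name, grep leaves off the file name prefix that it adds when searching several files.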
Find out how big the file is in lines, words, and bytes (for plain ASCII text, the number of bytes equals the number of characters):
```
$ wc page.html
```
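Each of the three counts can also be requested on its own. A minimal sketch, assuming a scratch file named counts.txt (a hypothetical name) that holds 2 lines, 3 words, and 14 bytes:

```shell
# Create a two-line scratch file (counts.txt is a made-up name).
printf 'one two\nthree\n' > counts.txt

wc counts.txt      # lines, words, and bytes together
wc -l counts.txt   # lines only
wc -w counts.txt   # words only
wc -c counts.txt   # bytes only
```

The exact spacing of the output varies between systems, but the numbers are the same.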
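Copy the file, make some random edits to the copy, and save it. (The open -e command is specific to macOS, where it opens the file in TextEdit. On other systems, use any text editor.)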
```
$ cp page.html page2.html
$ open -e page2.html
```
Now compare to the original:
```
$ diff page.html page2.html
```
Diff output appears throughout version-control systems such as Git, usually in the closely related unified format (diff -u), and is the main way you compare one version of a file to another and track changes. Diffs are much easier to read when each line of text doesn't exceed 80 characters, the traditional width of a terminal.
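To see how a diff reads on a known change, here is a small sketch assuming two scratch files named old.txt and new.txt (hypothetical names) that differ only on line 2:

```shell
# Create two three-line scratch files (old.txt and new.txt are made-up names).
printf '%s\n' 'apple' 'banana' 'cherry' > old.txt
printf '%s\n' 'apple' 'blueberry' 'cherry' > new.txt

# Default format: 2c2 means line 2 of the first file was changed into
# line 2 of the second. Lines from the first file start with <, lines
# from the second with >. diff exits with status 1 when the files differ,
# so the || part reports that instead of treating it as a failure.
diff old.txt new.txt || echo "diff exited with status $?"

# Unified format, the style Git uses: removed lines start with -,
# added lines with +, and unchanged context lines with a space.
diff -u old.txt new.txt || echo "diff exited with status $?"
```

The first command prints:

```
2c2
< banana
---
> blueberry
diff exited with status 1
```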
