Kinky linux command-line

From Gender and Tech Resources

Revision as of 21:24, 10 July 2015 by Lilith2 (Talk | contribs) (Process management (job control))

Graphical user interfaces (GUIs) are helpful for many tasks, but they box you into the tasks the designer built the GUI for. This is true to a certain extent for the command-line too, as it relies on the commands available. Still, some commands are so basic (close to the kernel), come with so many flags and options, and can be so easily built on and combined with other commands in shell scripts, that knowing the command-line and shell scripting is well worth the effort.

When I first discovered the power to delete the file in my OpenBSD terminal that the OSX finder could not trash, I felt I was no longer a prisoner inside my machine. Only possessing knowledge of a GUI, I was formerly stuck in a holding pattern. Using *nix you keep moving all the time, always discovering new executable codes sensitive to commands. In the shell I find a marvelous mess of constellations, nebulae, interstellar gaps, awesome gullies, that provokes in me an indescribable sense of vertigo, as if I am hanging from earth upside down on the brink of infinite space, with terrestrial gravity still holding me by the heels but about to release me any moment. An example is /dev/null, a special *nix file that swallows any unwanted data flow you redirect into it. [1]

Farce of the Penguins: A mockumentary that illuminates penguin survival and mating rituals, as well as one bird's search for love while on a 70-mile trek with his hedonistic buddies http://www.imdb.com/title/tt0488539/

Command Line Culture (CLC)

Some people use a Command Line Interface (CLI) extensively, and like it more than a GUI. After a ten-step program, they will admit something like, "I am a command line junkie, I like it far better than pointing and clicking. I have become addicted to the bash command, and the basic linux utilities. I find myself installing the basic GNU tools on any system I use. Heck I even installed cygnus-win on my windows gaming box. Mmmm... Command completion... Tasty!"

An only somewhat saner version of that seems to be running a GUI and a command line at the same time and switching between the two depending on what needs doing. Usually things can be done faster with the command line, but there are situations, such as doing something with multiple directories, when a GUI is more efficient.

$ cd /insanely/long/directory/path/and/you/thought/you/were/there/yet/but/no/muhhahahaaa/aaaaah

Typing that tends to waste time, even when using that yummy command completion. When doing that same thing regularly with the GUI, that may get annoying too and scripting ensues. Goodbye ten step plan. :D Well, except for geany.

Getting started

  • Case sensitivity is very important and a common source of problems for people new to Linux. Other systems such as M$ windows are case insensitive when it comes to referring to files. Linux is not like this. You can have two or more files and directories with the same name but letters of different case.
  • A lot of commands in linux are named as an abbreviation of a word or words describing them. This makes it easier to remember them.

Man

Linux systems come with a very useful utility called man, short for manual pages. It gives a standardised format for documenting the purpose and usage of most of the utilities, libraries, and system calls https://www.kernel.org/doc/man-pages/. For documentation other than man pages, see the Linux Documentation Project site http://www.tldp.org/.

The manual pages are a set of pages that explain every command available on your system including what they do, the specifics of how you run them and what command line arguments they accept. They are fairly consistent in their structure so you can easily get the hang of it. Start up a console or terminal and invoke the manual pages with the following command:

$ man [command]

For example:

$ man grep

Everything is a file

Everything in linux can be viewed as a file:

  • regular files are documents, images, archives, recordings, directories (just a file containing names of other files) …
  • (character and block) device files give you access to hardware components
  • named pipes and sockets give access points for processes to communicate with each other
  • (hard and soft) links make a file accessible from different locations

Navigation

With pwd (print working directory) you can see your location in the file structure.

$ pwd 
/home/user
$

With ls (list) you can see what is in a location:

$ ls [options] [location]

For example:

$ ls -l /home/user
total 20
drwxr-xr-x 2 user user 4096 Jun 17 14:39 Desktop
drwxr-xr-x 2 user user 4096 Jul  2 00:45 Documents
drwxr-xr-x 4 user user 4096 Jul  2 00:46 Pictures

The result lines explained

  • The first character on a result line indicates whether it is a normal file (-) or a directory (d). In the above example all are directories.
  • The next 9 characters are permissions for the file or directory. More on that in file permissions below.
  • The number after that is the count of hard links to the file or directory.
  • The field following that is the owner of the file or directory (user in this case).
  • The group the file or directory belongs to (user)
  • File size
  • File modification time
  • Name of the file or directory

For more explanation on and examples of using ls do:

$ man ls

When referring to either a file or directory on the command line, like with /home/user in the ls example, we are referring to a path, a description of a route to get to a particular file or directory on the system. The linux file system is hierarchical, with at the very top of the structure a directory called the root directory, denoted by a single slash /. It has subdirectories and the subdirectories have subdirectories and so on. Files may reside in any of these directories.

Paths can be absolute or relative:

  • Absolute paths specify a location (file or directory) in relation to the root directory and begin with a /
  • Relative paths specify a location (file or directory) in relation to where you currently are in the system and do not begin with a /

More building blocks:

  • ~ (tilde), a shortcut for your home directory. For example /home/user/Pictures and ~/Pictures both refer to the Pictures folder in the home directory of user.
  • . (dot), a reference to your current directory. For example, ./Pictures refers to the same directory again.
  • .. (dotdot), a reference to the parent directory. You can use this several times in a path to keep going up the hierarchy. If you are in the location the path /home/user refers to, you could run the command ls ../../ and this would return a listing of the root directory.

In order to move around in the system you can use a command called cd (change directory):

$ cd [location]
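
To get a feel for absolute and relative paths without touching your real home directory, here is a small sketch that builds a scratch tree and walks through it. The directory names (playground, Pictures, Documents) and the use of mktemp for the scratch location are assumptions made for the example.

```shell
# Build a small scratch tree; the names are made up for this sketch.
base=$(mktemp -d)
mkdir -p "$base/playground/Pictures" "$base/playground/Documents"

cd "$base/playground/Pictures"   # absolute path: starts with /
here=$(pwd)

cd ../Documents                  # relative path: up to the parent, then down again
sibling=$(pwd)

cd ../..                         # two levels up: back at the scratch base
top=$(pwd)

cd ./playground                  # ./ means "starting from where I am now"
again=$(pwd)

echo "$here"
echo "$sibling"
echo "$top"
echo "$again"
```

Each echo prints the absolute path that the relative cd ended up in, which is a handy way to check where a path with several .. parts really leads.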

Typing out these paths can become tedious. Not to mention my typos. Yessss. Tab Completion. When you start typing a path and hit the Tab key on your keyboard at any time you will invoke an auto complete action. If nothing happens then that means there are several possibilities. If you hit Tab again it will show you those possibilities. You can continue typing and hit Tab again and it will again try to auto complete for you.

Expansion

When a tilde is used at the beginning of a word, it expands into the name of the home directory of the named user, or if no user is named, the home directory of the current user:

$ echo ~ 
/home/user 

If user "foo" has an account, then:

$ echo ~foo 
/home/foo

The shell allows arithmetic to be performed by expansion, which makes it easy to use the shell prompt as a calculator. Arithmetic expansion only supports integers (whole numbers, no decimals), but can perform quite a number of different operations:

$ echo $((2 + 2)) 
4

It allows for nesting of expressions (5**2 means 5 to the power of 2, that is 25):

$ echo $(($((5**2)) * 3)) 
75 

Single parentheses may be used to group multiple subexpressions:

$ echo $(((5**2) * 3)) 
75 

Here is an example using the division and remainder operators (integer division):

$ echo Five divided by two equals $((5/2)) 
Five divided by two equals 2 
$ echo with $((5%2)) left over. 
with 1 left over. 

Perhaps the strangest expansion is called brace expansion. You can create multiple text strings from a pattern containing braces:

$ echo last{mce,boot,xorg}.log
lastmce.log lastboot.log lastxorg.log

Patterns to be brace expanded may contain a leading portion called a preamble and a trailing portion called a postscript. The brace expression itself may contain either a comma-separated list of strings, or a range of integers or single characters. The pattern may not contain embedded whitespace.

Using a range of integers in reverse order:

$ echo Number_{5..1} 
Number_5 Number_4 Number_3 Number_2 Number_1 

Brace expansions may also be nested:

$ echo a{A{1,2},B{3,4}}b
aA1b aA2b aB3b aB4b

The most common application of brace expansion is to easily make files or directories:

$ mkdir {2011..2013}-0{1..9} {2011..2013}-{10..12} 
$ ls 
2011-01 2011-07 2012-01 2012-07 2013-01 2013-07
2011-02 2011-08 2012-02 2012-08 2013-02 2013-08
2011-03 2011-09 2012-03 2012-09 2013-03 2013-09
2011-04 2011-10 2012-04 2012-10 2013-04 2013-10
2011-05 2011-11 2012-05 2012-11 2013-05 2013-11
2011-06 2011-12 2012-06 2012-12 2013-06 2013-12

File manipulation

From the command line, there are many ways to create, find and list different types of files.

In systems such as M$ Windows the extension is important and the system uses it to determine what type of file it is. In linux the system ignores extensions and looks inside the file to determine what type of file it is. So sometimes it can be hard to know for certain what type of file a particular file is. You can determine the type of a file with the file command:

$ file privatelyinvestigating.wordpress.2015-05-02.xml 
privatelyinvestigating.wordpress.2015-05-02.xml: XML document text

With cp (copy) you can copy files and directories:

$ cp [options] [source] [destination]

For example:

$ cp -u *.png /home/user/Pictures/

Will copy all files in the current directory with extension .png to the Pictures directory in the home directory of user. The -u (update) option skips files that already exist there and are not older than the source.

With mv (move) you can move or rename files and directories. To rename a file, use like this:

$ mv [filename1] [filename2]

To move a file, use like this:

$ mv [filename1] [directory]

To move files, use like this:

$ mv [filename1] [filename2] [directory]

With rm (remove) you can remove files and directories. Linux does not have an undelete command. Once you delete something with rm, it's gone. You can inflict horrifying damage on your system with rm if you are not careful, particularly with wildcards such as *.

To remove a file:

$ rm [filename]

To remove directories and everything in them:

$ rm -r [directory]

And with mkdir (make directory) you can create directories:

$ mkdir [directory]
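
Because rm has no undo, it can help to rehearse these commands in a throwaway directory and to look at what a wildcard expands to before deleting. The sketch below does that; the file names and the mktemp scratch directory are assumptions made for the example.

```shell
# Work in a throwaway directory; all file names are made up for this sketch.
scratch=$(mktemp -d)
cd "$scratch"

mkdir archive                      # create a directory
touch notes.txt draft1.tmp draft2.tmp
cp notes.txt notes.bak             # copy: notes.txt stays, notes.bak is new
mv notes.bak archive/              # move a file into a directory
mv draft1.tmp first-draft.tmp      # rename a file in place

# Preview what the wildcard expands to BEFORE handing it to rm.
echo rm -- *.tmp

# Only then delete for real.
rm -- *.tmp
rm -r archive                      # -r is needed for directories

remaining=$(ls)
echo "$remaining"
```

Putting echo in front of a dangerous command is a cheap dry run: the shell expands the * exactly as it would for rm, but nothing gets deleted yet.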

File permissions

Unix-like operating systems differ from other computing systems in that they are not only multitasking but also multi-user. The multi-user capability of Unix-like systems is a feature that is deeply ingrained into the design of the *nix operating system. In the environment in which Unix was created, this makes perfect sense, and now, with the internet, this makes perfect sense again. In the beginning, computers were large, expensive, and centralised; access was by terminals and The Computer would support many users at the same time, as does the internet. A method had to be devised to protect users from each other.

In linux, each file and directory is assigned access rights for the owner of the file, the members of a group of related users, and everybody else. Rights can be assigned to read a file, to write a file, and to execute a file (run the file as a program). There are two ways to specify the permissions.

For the first, to see the permission settings for a file or directory, use the ls -l command (see above in Navigation). Taking one line of the results:

drwxr-xr-x 4 user user 4096 Jul  2 00:46 Pictures

The ls -l output line starts with a d indicating it is a directory, and the next nine characters are for permissions. These are three groups of three characters each. The first set of three characters rwx is for owner, the owner of the file. Owner has read r, write w and execute x permissions on that directory. The second set of characters is for group. Users in the group have r-x permissions and can only read and execute (for a directory: list it and enter it). Other (the rest of the world) have those permissions too in this case.

The conversion to the other permissions representation goes like this:

1) Convert the three sets rwx r-x r-x to three groups of binary code using 1's for "turned on" indicated by r, w and x and 0's as "turned off" indicated by a -, like so:

rwx = (111)_2

r-x = (101)_2

r-- = (100)_2

--x = (001)_2

The example then looks like (111 101 101)_2

2) Convert binary code to octal code. If not familiar with number conversions, a decent tutorial can be found in http://www.cstutoringcenter.com/tutorials/general/convert.php

(111)_2 = 2^0 + 2^1 + 2^2 = 1 + 2 + 2*2 = 7_8

(101)_2 = 2^0 + 2^2 = 1 + 2*2 = 5_8

(100)_2 = 2^2 = 2*2 = 4_8

(001)_2 = 2^0 = 1_8

The example then looks like (7 5 5)_8. So the permissions for this Pictures directory in octal notation are 755_8; the base, 8, is often not mentioned.
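
The two conversion steps can also be left to the shell. Below is a minimal sketch of a helper that turns a nine-character permission string into octal by adding 4 for r, 2 for w and 1 for x in each group of three. The function name perm_to_octal is made up for this example, and special bits (such as s or t in place of x) are not handled.

```shell
# perm_to_octal: convert a permission string such as rwxr-xr-x to octal.
# The function name is made up for this sketch; s and t bits are ignored.
perm_to_octal() {
    perms=$1
    result=""
    while [ -n "$perms" ]; do
        triplet=$(printf '%s' "$perms" | cut -c1-3)   # next group: owner, group or other
        perms=$(printf '%s' "$perms" | cut -c4-)      # what is left to convert
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # r is worth 2^2
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # w is worth 2^1
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # x is worth 2^0
        result="$result$digit"
    done
    printf '%s\n' "$result"
}

perm_to_octal rwxr-xr-x
perm_to_octal rw-r--r--
perm_to_octal rwx------
```

The three calls print 755, 644 and 700, matching the shortcuts listed further down.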

Handy file permissions mental shortcuts:

  • 777 (rwxrwxrwx) No restrictions on permissions. Anybody may do anything. Generally not a desirable setting.
  • 755 (rwxr-xr-x) The file's owner may read, write, and execute the file. All others may read and execute the file. This setting is common for programs that are used by all users.
  • 700 (rwx------) The file's owner may read, write, and execute the file. Nobody else has any rights. This setting is useful for programs that only the owner may use and must be kept private from others.
  • 666 (rw-rw-rw-) All users may read and write the file. But not execute.
  • 644 (rw-r--r--) The owner may read and write a file, while all others may only read the file. A common setting for data files that everybody may read, but only the owner may change.
  • 600 (rw-------) The owner may read and write a file. All others have no rights. A common setting for data files that the owner wants to keep private.

Handy directory permissions mental shortcuts:

  • 777 (rwxrwxrwx) No restrictions on permissions. Anybody may list files, create new files in the directory and delete files in the directory. Generally not a good setting.
  • 755 (rwxr-xr-x) The directory owner has full access. All others may list the directory, but cannot create files nor delete them. This setting is common for directories that you wish to share with other users.
  • 700 (rwx------) The directory owner has full access. Nobody else has any rights. This setting is useful for directories that only the owner may use and must be kept private from others.

With chmod you can modify access rights to a file:

$ chmod [permissions] [filename]
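
A short sketch of chmod at work on a scratch file, checking the result with ls -l after each change. The file name hello.sh and the mktemp scratch directory are assumptions for the example. Besides the octal form described above, chmod also accepts a symbolic form such as u+x (add execute for the owner), which is used once here.

```shell
# Scratch file; hello.sh is a made-up name for this sketch.
scratch=$(mktemp -d)
cd "$scratch"
printf '#!/bin/sh\necho hello\n' > hello.sh

chmod 644 hello.sh                       # octal form: rw-r--r--
before=$(ls -l hello.sh | cut -c1-10)    # first ten characters: type + permissions

chmod u+x hello.sh                       # symbolic form: add execute for the owner only
after=$(ls -l hello.sh | cut -c1-10)

chmod 700 hello.sh                       # octal again: rwx------
private=$(ls -l hello.sh | cut -c1-10)

echo "$before"
echo "$after"
echo "$private"
```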

With su (substitute user) or sudo (substitute user do) you can temporarily become the superuser. Doing su on debian you will be asked for the root password.

$ su 

A new shell owned by root is started, indicated by a # instead of a $ as prompt. You can kill that shell and return to your previous user shell with exit:

# exit

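In mint and ubuntu the root account is locked by default, so su will not work out of the box, but you can put sudo in front of the command you want to run as superuser. You will be asked for your user password.

$ sudo [command]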

In order to change the owner of a file, you must be the superuser. With chown (change ownership) you can then change file ownership:

# chown [username] [filename] 

For changing the group ownership of a file you do not need superuser, but you do need to be owner of the file and a member of the group you are changing it to. With chgrp (change group) you can change a file's group ownership:

$ chgrp [groupname] [filename]

Regular expressions

Regular expressions are strings that describe a collection of strings, written in a small language created for that purpose. That probably reads like garble, but a few examples can help. Regular expressions are useful for expansion, static code source analysis, reverse engineering, malware fingerprinting, vulnerability assessment, and exploit development. Many of the tools for working with text enable you to use regular expressions, sometimes referred to as regex, to identify the text you are looking for based on some pattern. You can use these strings to find text within a text editor or use them with search commands to scan multiple files for the strings of text you want.

Matching using regex

Expression Matches
a.* a, ab, abc, abs, absolutely, ... (an a followed by zero or more of any character)
^a Any "a" appearing at the beginning of a line
a$ Any "a" appearing at the end of a line
a.c Three character strings that begin with a and end with c
[bcf]at bat, cat, or fat
[a-d]at aat, bat, cat, dat (but not Aat, Bat, Cat or Dat)
[A-D]at Aat, Bat, Cat, Dat (but not aat, bat, cat or dat)
1[3-5]7 137, 147 and 157
\tHello A tab character preceding the word Hello
\.[tT][xX][Tt] .txt, .TXT, .TxT, and all other case combinations

Regular expressions are not completely consistent from program to program. For example, the meaning of the asterisk * in the shell's filename expansion is different from that used by grep and other programs which support regular expressions. In addition, other versions of grep (like fgrep and egrep) support additional features. Programming languages also have many additional extensions to regular expressions. The online man pages can be consulted to resolve any discrepancies.

Try this terminal: http://uni.xkcd.com/

Search and replace with regular expressions in vi

The most common editor is still vi and can be found on any *nix (unless it has been removed). Knowledge about how to make minor file edits is critical for administrators. On minimalist systems or when trying to bring a foreign server back online, vi will almost certainly be there. Vim is an enhanced vi editor that may be there (for vim regex see http://www.vimregex.com/ and for vim macros see http://vimdoc.sourceforge.net/htmldoc/usr_10.html#10.1).

You can use regular expressions to find patterns in files from inside editors like vi:

Expression Matches
. (dot) Any single character except newline
* Zero or more occurrences of the preceding character
[...] Any single character specified in the set
[^...] Any single character not specified in the set
^ Anchor - beginning of the line
$ Anchor - end of line
\< Anchor - beginning of word
\> Anchor - end of word
\(...\) Grouping - usually used to group conditions
\n Contents of nth grouping

Examples of sets:

Expression Matches
[A-Z] The set from Capital A to Capital Z
[a-z] The set from lowercase a to lowercase z
[0-9] The set from 0 to 9 (All numerals)
[./=+] The set containing . (dot), / (slash), =, and +
[-A-F] The set from Capital A to Capital F and the dash (dashes must be specified first)
[0-9 A-Z] The set containing all capital letters and digits and a space
[A-Z][a-zA-Z] In the first position the set from capital A to Z, in the second position the set of all letters

Examples of expressions:

Expression Matches
/Hello/ Line containing the value Hello
/^TEST$/ Line containing TEST by itself
/^[a-zA-Z]/ Line starts with any letter
/^[a-z].*/ First character of the line is a-z, followed by zero or more of any character
/2134$/ Line ends with 2134
/[0-9]*/ Zero or more digits in the line
\<00* A number with leading zeroes
/^[^#]/ The first character is not a # in the line

The search and replace function in vi is done with the :%s command:

:%s/pattern/string/flags	

This command replaces pattern with string according to flags:

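  • Flags can be g for replacing all occurrences of pattern on a line instead of only the first one (the % in front of s is what applies the command to every line of the open file) and c for confirming each replacement.
  • With & you can repeat the last substitution on the current line.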

Suppose we have a text file with this content:

- I mount my soul at /dev/null
- Those who do not understand Unix are condemned to reinvent it, poorly.
- Unix was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- Unix is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
- Try this terminal: http://uni.xkcd.com/

Use the Escape key to get into command mode and type :%s (your cursor will jump to the bottom and show you what you are typing), followed by the pattern and the replacement string, separated by slashes. (For a plain search without replacing, /pattern searches forward and ?pattern searches backward.) In the above file try:

:%s/U.i/linu/g 

And see what happens:

- I mount my soul at /dev/null
- Those who do not understand linux are condemned to reinvent it, poorly.
- linux was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
- Try this terminal: http://uni.xkcd.com/
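
The same substitution can be tried outside the editor. The sketch below uses sed, a different tool from vi that understands the same s/pattern/string/flags form and works on every line by itself. The file name fun.txt follows the later examples on this page; the scratch directory, the shortened file content and the output file name replaced.txt are assumptions for this example.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A shortened copy of the example file.
cat > fun.txt <<'EOF'
- I mount my soul at /dev/null
- Those who do not understand Unix are condemned to reinvent it, poorly.
- Try this terminal: http://uni.xkcd.com/
EOF

# U.i matches a capital U, any single character, then i: "Uni" but not "uni".
sed 's/U.i/linu/g' fun.txt > replaced.txt
cat replaced.txt
```

The lowercase uni in the URL is left alone, because regular expressions are case sensitive unless told otherwise.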

Extracting columns of text with awk

Awk is a programming language which allows easy manipulation of structured data and the generation of formatted reports. Awk stands for the names of its authors: Aho, Weinberger, and Kernighan. Awk is used for pattern scanning and processing. It searches one or more files to see if they contain lines that match the specified patterns and then performs the associated actions.

Key features of awk are:

  • Awk views a text file as records and fields.
  • Like most programming languages, awk has variables, conditionals and loops.
  • Awk has arithmetic and string operators.
  • Awk can generate formatted reports.
  • Awk reads from a file or standard input, and outputs to standard output.
  • Awk does not get along with non-text files.

$ awk '/pattern/ {actions} /pattern/ {actions}' [inputfile]

Pattern is a regular expression and the single quotes are used to make sure the shell does not interpret any of the enclosed special characters.

$ awk '{print;}' fun.txt

will print all the lines in fun.txt.

$ awk '/null/' fun.txt
- I mount my soul at /dev/null

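You can continue on a next line. When you enter a return inside the quotes, the > appears as prompt. Once the closing quote and the file name are typed, the return will be taken as "go". Patterns on separate lines are tried independently, so a line is printed when it matches either of them.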

$ awk '/sexy/
> /null/' fun.txt
- I mount my soul at /dev/null
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep

The examples below use a pipe |, see I/O redirection for more on using pipes and Process management for more on ps.

Showing you the processes of user:

$ ps auwx | awk '/user/ {print $11}'

or:

$ ps auwx | grep user | awk '{print $11}'

Both display the contents of the 11th column (command name) from currently running processes output from the ps command ps auwx. In the first example awk is used and in the second grep to find all processes owned by the user named user. In each case, when processes owned by user are found, column 11 (command name) is displayed for each of those processes.
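
One thing to watch: /user/ (and grep user) match the text anywhere on the line, not only in the owner column. The sketch below shows the difference on a made-up sample in the shape of ps auwx output, so it does not depend on what is running on your machine. The process names and numbers are invented for the example.

```shell
# Made-up sample in the shape of ps auwx output (11 columns, command last).
sample='USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  29432  5416 ?        Ss   10:14   0:01 /sbin/init
user      4711  0.0  0.1  23456  4321 pts/0    Ss   18:30   0:00 bash
root      4800  0.0  0.0  12345  2222 ?        S    18:31   0:00 /usr/sbin/useradd-helper
user      4999  0.0  0.0  12724  2092 pts/0    S+   18:32   0:00 vi'

# /user/ matches the text anywhere on the line, including the command column.
anywhere=$(printf '%s\n' "$sample" | awk '/user/ {print $11}')

# $1 == "user" only looks at the first column, the owner.
owned=$(printf '%s\n' "$sample" | awk '$1 == "user" {print $11}')

echo "matched anywhere:"
echo "$anywhere"
echo "owned by user:"
echo "$owned"
```

The first listing includes the root-owned process because its command happens to contain the letters user; comparing on $1 avoids that.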

By default awk uses whitespace (runs of spaces or tabs) as delimiter between columns. You can specify a different delimiter with the -F option:

$ awk -F: '{print $1,$5}' /etc/passwd

or:

$ cut -d: -f1,5 /etc/passwd

In both cases the colon : is specified as delimiter. Changing the comma to a dash prints columns 1 through 5:

$ cut -d: -f1-5 /etc/passwd

When there is a varying number of spaces between columns, such as in the output of ps, awk is recommended. When files are delimited by commas or colons, as is the case in the /etc/passwd file, cut is recommended.
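
Why the recommendation? A small sketch on made-up input shows it: awk treats a run of spaces as one separator, while cut treats every single space as a separator. The sample line and the passwd-style record are invented for the example.

```shell
# A made-up line with irregular spacing, like ps output.
line='user      4999   0.0  vi'

# awk treats any run of spaces or tabs as one separator.
with_awk=$(printf '%s\n' "$line" | awk '{print $2}')

# cut treats every single space as a separator, so field 2 is empty here.
with_cut=$(printf '%s\n' "$line" | cut -d' ' -f2)

# With a strict one-character delimiter both find the same fields.
# The record is made up in the style of /etc/passwd.
record='user:x:1000:1000:Example User:/home/user:/bin/bash'
awk_fields=$(printf '%s\n' "$record" | awk -F: '{print $1, $5}')
cut_fields=$(printf '%s\n' "$record" | cut -d: -f1,5)

echo "awk: [$with_awk]  cut: [$with_cut]"
echo "$awk_fields"
echo "$cut_fields"
```

Note the small difference in the last two lines of output: awk joins the printed fields with a space, cut keeps the original delimiter between them.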

Searching for text with grep

The grep command comes in handy when performing more advanced string searches in a file. By now it's a verb. Grep's regex flavor is limited. An enhanced version of grep is called egrep (the same as grep -E). Both use a text-directed engine. Since neither grep nor egrep support any of the special features such as lazy repetition or lookaround, and because grep and egrep only indicate whether a match was found on a particular line or not, this distinction does not matter, except that the text-directed engine is faster. On POSIX systems, egrep uses POSIX Extended Regular Expressions http://www.regular-expressions.info/posix.html#bre. For more on POSIX see wikipedia https://en.wikipedia.org/wiki/POSIX. Despite the name "extended", egrep is almost the same as grep. It just uses a slightly different regex syntax and adds support for alternation, but, per POSIX, loses support for backreferences (GNU egrep still accepts them as an extension).

The usual suspects with some differences, for example * is not a wildcard:

Expression Matches
. (dot) Any character except the end of the line character.
$ The expression at the end of a line.
* Zero or more occurrence of the previous character.

Character classes, to be used inside a bracket expression (for example [[:digit:]] or [[:alpha:]_]):

Expression Matches
[:alnum:] Alphanumeric characters.
[:alpha:] Alphabetic characters.
[:blank:] Blank characters: space and tab.
[:lower:] Lower-case letters: 'a b c d e f g h i j k l m n o p q r s t u v w x y z'.
[:digit:] Digits: '0 1 2 3 4 5 6 7 8 9'.
[:space:] Space characters: tab, newline, vertical tab, form feed, carriage return, and space.
[:upper:] Upper-case letters: 'A B C D E F G H I J K L M N O P Q R S T U V W X Y Z'.

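A regular expression for grep may be followed by one of several repetition operators. egrep takes them as written below; with plain grep the ?, + and braces need a backslash in front (for ? and + that is a GNU extension):

Expression Matches
? The preceding item is optional and matched at most once.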
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.
{n} The preceding item is matched exactly n times.
{n,} The preceding item is matched n or more times.
{,m} The preceding item is matched at most m times.
{n,m} The preceding item is matched at least n times, but not more than m times.
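
To try the repetition operators on something small, the sketch below feeds a made-up word list to grep -E (the standard spelling of egrep) and uses -x so that the whole line has to match. The word list is invented for the example.

```shell
# Made-up word list, one word per line.
words='color
colour
colouur
ct
cat
caat
caaat'

# ? : the u is optional, so both spellings match.
optional=$(printf '%s\n' "$words" | grep -E -x 'colou?r')

# + : one or more a between c and t, so ct is left out.
one_or_more=$(printf '%s\n' "$words" | grep -E -x 'ca+t')

# {2,3} : two or three a, so cat is left out as well.
bounded=$(printf '%s\n' "$words" | grep -E -x 'ca{2,3}t')

echo "$optional"
echo "$one_or_more"
echo "$bounded"
```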

For showing lines containing linux in a file:

$ grep linux fun.txt
- Those who do not understand linux are condemned to reinvent it, poorly.
- linux was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
$ 

For counting empty lines in a file (-c prints the number of matching lines instead of the lines themselves):

$ grep -c "^$" [filename]

Searching for the pattern “kernel: *.”, that is kernel: followed by zero or more space characters and then any one character, in all files in the current directory:

$ grep "kernel: *." *
grep: Desktop: Is a directory
grep: eepsite: Is a directory
grep: Music: Is a directory
grep: scripts: Is a directory
grep: Templates: Is a directory

Use of bracket expressions:

$ grep '[[:upper:]]' filename

Wildcards, matching every three-character word starting with "b" and ending in "t":

$ grep '\<b.t\>' filename

Print all lines with exactly two characters:

$ grep '^..$' filename

The following regex to find the IP address 192.168.1.254 matches more than intended, because an unescaped dot stands for any character:

$ grep '192.168.1.254' /etc/hosts

All three dots need to be escaped:

$ grep '192\.168\.1\.254' /etc/hosts
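
The difference is easy to see on a few made-up lines, only the first of which holds the address we are after. This sketch counts matches with grep -c; the sample lines are invented for the example.

```shell
# Made-up hosts-style lines: only the first one holds the address we want.
hosts='192.168.1.254 gateway
192x168y1z254 not-an-address
192.168.11254 also-wrong'

# Unescaped: every dot matches any character.
loose=$(printf '%s\n' "$hosts" | grep -c '192.168.1.254')

# Escaped: every dot has to be a literal dot.
strict=$(printf '%s\n' "$hosts" | grep -c '192\.168\.1\.254')

echo "unescaped dots match $loose lines, escaped dots match $strict"
```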

An IP address with egrep:

$ egrep '[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}' filename

The examples below use a pipe |, see I/O redirection for more on using pipes, Process management for more on ps and Network exploitation and monitoring for more on tcpdump.

For showing init lines from ps output:

$ ps auwx | grep init
root         1  0.0  0.0  29432  5416 ?        Ss   10:14   0:01 /sbin/init
user      4999  0.0  0.0  12724  2092 pts/0    S+   18:32   0:00 grep init

Using grep to search for specific network traffic with tcpdump:

$ sudo tcpdump -n -A | grep -e 'POST'
[sudo] password for user: 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
E...=.@.@......e@.H..'.P(.o%~...P.9.PN..POST /blog/wp-admin/admin-ajax.php HTTP/1.1
E...c_@.@..=...e@.H..*.PfC<....wP.9.PN..POST /blog/wp-admin/admin-ajax.php HTTP/1.1
E.....@.@......e@.H...."g;.(.-,WP.9.Nj..POST /login/?login_only=1 HTTP/1.1

Sniffing passwords using egrep:

$ tcpdump port http or port ftp or port smtp or port imap or port pop3 -l -A | egrep -i 'pass=|pwd=|log=|login=|user=|username=|pw=|passw=|passwd=|password=|pass:|user:|username:|password:|login:|pass|user' --color=auto --line-buffered -B20

Input/Output redirection

I/O redirection is one of the easiest things to master. It allows for combining different utilities effectively. For example, you may want to search through the output from nmap or tcpdump or a key-logger by feeding its output to another file or program for further analysis.

Every running program starts with three files (data streams) already opened:

  • STDIN (0) - Standard input (data fed into the program, defaults to keyboard)
  • STDOUT (1) - Standard output (data printed by the program, defaults to terminal/console)
  • STDERR (2) - Standard error (for error messages, also defaults to the terminal/console)

That "open file"? The value returned by an open call is called a file descriptor and it is an index into an array of open files kept by the kernel, making the file-descriptor the gateway into the kernel's abstractions of underlying hardware.
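
Descriptors 0, 1 and 2 are opened for you, but the shell lets you open more yourself with exec. The sketch below opens descriptor 3 for writing and descriptor 4 for reading on a scratch file. The file name log.txt, the choice of numbers 3 and 4, and the mktemp scratch directory are assumptions for this example.

```shell
scratch=$(mktemp -d)
cd "$scratch"

exec 3> log.txt          # open log.txt for writing as file descriptor 3
echo "first line" >&3    # write to descriptor 3 instead of stdout (1)
echo "second line" >&3
exec 3>&-                # close descriptor 3 again

exec 4< log.txt          # open the same file for reading as descriptor 4
read -r first <&4        # each read continues where the previous one stopped
read -r second <&4
exec 4<&-                # close descriptor 4

echo "$first / $second"
```

The second read picks up after the first one because the position in the file belongs to the open descriptor, not to the command using it.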


For more on device drivers see Linux Device Drivers, Third Edition By Jonathan Corbet, Alessandro Rubini, Greg Kroah-Hartman http://www.oreilly.com/openbook/linuxdrive3/book/index.html It's a bit outdated (2005) but a good reference.

Piping and redirection are the means by which we may connect these streams between programs and files to direct data in interesting and useful ways.

Redirecting to a file

Take all output from standard out (stdout) and place it into filename (Using >> will append to the file, rather than overwrite it):

$ ls > filename

You do not have to create the file named filename in the example first. The way the mechanism works, the file filename is created first (if it did not exist already) and then the program is run and output saved into the file. If we save into a file which already exists, however, then its contents will be cleared, and then the new output saved to it.

Reading from a file

Copy all data from the file to the standard input (stdin) of the program:

$ cat < filename

Most programs allow us to input a file. So why is this redirection handy? An example of that using wc (word count, the -l is for printing newline counts):

$ wc -l fun.txt
5 fun.txt

while:

$ wc -l < fun.txt
5

When wc is supplied with the file to process as a command line argument, the output from the program includes the name of the file that was processed. When redirecting the contents of fun.txt into wc the file name is not printed. When using redirection or piping, the data is sent anonymously. This mechanism is useful for keeping ancillary data from being printed.

We can use the sort command to process the contents of fun.txt. We can combine the two forms of redirection like this:

$ sort < fun.txt > sorted_fun.txt
$ cat sorted_fun.txt 
- I mount my soul at /dev/null
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
- linux was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- Those who do not understand linux are condemned to reinvent it, poorly.
- Try this terminal: http://uni.xkcd.com/

The three streams have numbers associated with them (0, 1 and 2). STDERR is stream number 2 and we can use these numbers to identify the streams. If we place a number before the > operator then it will redirect that stream (if we don't use a number, then it defaults to stream 1).

$ [command] 2> errors.txt

We can save both normal output and error messages into a single file by redirecting the STDERR stream to the STDOUT stream and redirecting STDOUT to a file. We redirect to a file first then redirect the error stream. We identify the redirection to a stream by placing an & in front of the stream number (otherwise it would redirect to a file called 1).

$ [command] > commandoutput 2>&1
$ cat commandoutput
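
The order of the two redirections matters, which is easy to get wrong. The sketch below uses a small stand-in for [command] that writes one line to each stream; the function name noisy, the file names and the mktemp scratch directory are assumptions for this example.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A stand-in for [command]: one line to stdout, one line to stderr.
noisy() {
    echo "normal output"
    echo "error message" >&2
}

# Only the error stream goes to the file; normal output is captured in $shown.
shown=$(noisy 2> errors.txt)

# File first, then 2>&1: both streams end up in both.txt.
noisy > both.txt 2>&1

# The other way round: 2>&1 copies where stdout points right now (here the
# command substitution), and only afterwards is stdout sent to the file.
leaked=$(noisy 2>&1 > only_out.txt)

echo "shown: $shown"
echo "leaked: $leaked"
cat both.txt
```

In the last case the error message escapes the file, because at the moment 2>&1 was processed stdout had not been pointed at the file yet.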

Piping

Take everything from standard out (stdout) of program1 and pass it to standard input (stdin) of program2:

$ ls | more

We can pipe as many programs together as we like. In the below example the output of ls is piped to head to give us the first three lines of the output of the ls command, and that is piped to tail so as to get only the third file:

$ ls
commandoutput filename firstfile foo1 fun.txt funny.png
$ ls | head -3
commandoutput
filename
firstfile
$ ls | head -3 | tail -1
firstfile

To make debugging of huge piped commands easier, build your pipes up incrementally. Run the first program and make sure it provides the output you were expecting. Then add the second program and check again before adding the third and so on. This can save you a lot of frustration. :D
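
Building up a pipe step by step could look like the following sketch. It assumes a made-up file fruit.txt in a scratch directory and looks for the line that occurs most often:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'pear\napple\npear\nfig\napple\npear\n' > fruit.txt

# Step 1: check that sorting puts equal lines next to each other.
sort fruit.txt

# Step 2: add uniq -c to count each group, and check again.
sort fruit.txt | uniq -c

# Step 3: add a numeric reverse sort so the biggest count comes first.
sort fruit.txt | uniq -c | sort -rn

# Step 4: keep only the top line.
sort fruit.txt | uniq -c | sort -rn | head -n 1
```

If a step prints something unexpected, the fault is in the program that was added last.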

When piping and redirecting, the actual data will always be the same, but the formatting of that data may be slightly different to what is normally printed to the screen.
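
One reason for such differences is that a program can check whether its stdout is a terminal and adapt its layout; ls, for instance, prints names in columns on a terminal but one name per line into a pipe. A sketch of the same check in the shell, with a function name (where_to) made up for this example:

```shell
# Hypothetical helper: report whether stdout is connected to a terminal.
where_to() {
    if [ -t 1 ]; then
        echo "stdout is a terminal"
    else
        echo "stdout is not a terminal"
    fi
}

where_to          # typed at a prompt, stdout is normally the terminal
where_to | cat    # here stdout is a pipe to cat
```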

Process management (job control)

One of the most powerful aspects of linux is its ability not only to keep many processes in memory at once but also to switch between them fast enough to make it appear as though they were all running at the same time. This is called multitasking. Much of the linux code refers to tasks, not to processes. Because the term process seems to be more common in *nix literature and I am used to that term, I will be using process.

First some concepts:

  • A process is a single sequence of events utilizing memory and files. A new process is created by forking: the parent process makes a copy of itself, the child. The two are nearly identical; they are distinguished mainly by their process IDs and by the parent being able to wait for the child process to finish. A process may replace itself by another program to be executed.
  • Control of the multitasking is maintained in a preemptive or timesliced way. With timeslicing, after a certain amount of time (in ms) the operating system passes operation over from one process to the next, more deserving process. It is the scheduler which chooses the most appropriate process to run next, and linux uses a number of scheduling strategies to ensure fairness. Prior to version 2.5.4, the linux kernel was non-preemptive, which means a process running in kernel mode could not be moved off the processor until it left the processor of its own accord or waited for an input/output operation to complete. Generally a process in user mode can enter kernel mode using system calls. When the kernel was non-preemptive, a lower priority process could priority invert a higher priority process by denying it access to the processor, repeatedly making system calls and remaining in kernel mode. Even if the lower priority process' timeslice expired, it would continue running until it completed its work in the kernel or voluntarily relinquished control. If the higher priority process waiting to run is a text editor in which the user is typing or an MP3 player ready to refill its audio buffer, the result is very poor interactive performance. The kernel is now preemptive: http://www.informit.com/articles/article.aspx?p=414983&seqNum=2
  • An image is a computer execution environment which includes the program, associated data, status of open files (i.e. file descriptor table and system file table), and the default directory. Some image attributes such as the user-id are accessible directly but other attributes such as the list of child processes can only be accessed through system calls.
  • A process is the execution of an image. During execution it has four parts to its execution space: program code segment (read only and sharable), program data segment (writable, non-sharable), runtime stack segment, and system segment (system data localized to process).
  • A system call is a standardized way for user programs to request a service from the kernel. The process management system uses four main system calls:
    • fork creates two copies (parent and child) of an image.
    • wait allows a parent to pause until the child process completes.
    • exec allows overlaying of the calling program with a new one.
    • exit is a voluntary completion of the process.
  • Processes can communicate with each other using signals.
  • A process table maintains records for each process on the system. These processes are in a tree structure of ownership.
  • Processes are normally but not necessarily associated with a terminal device. This is done automatically on creation.
  • Daemons are processes that are NOT associated with a terminal. An example is the print spooler. These are identified in the process table as ? in the tty column.
  • Processes may be run in the 'background' (often by putting an ampersand (&) at the end of the command that starts the process). The shell does not wait for programs running in the background to complete. Mechanisms for identifying (the PID number that is returned), checking status (the ps command) and terminating (the kill command) are provided to maintain control of background processes.
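
The four system calls can be watched at work from the shell, which relies on them for the commands it runs. A rough sketch, assuming nothing beyond a POSIX sh (the variable names are picked for this example):

```shell
# fork: & makes the shell fork a child that runs the commands in ( ).
# exit: the child ends itself with exit status 3.
( sleep 1; exit 3 ) &
child=$!                      # PID of the background child
echo "started child $child"

# wait: the parent pauses until the child completes and collects its status.
status=0
wait "$child" || status=$?
echo "child exited with status $status"

# exec: the first sh prints its PID, then replaces itself with a second sh.
# No new process is created, so the second sh reports the same PID.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
echo $pids
```

The last line should print the same number twice, since exec overlays the running process instead of starting a new one.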

When an executable program starts up, it runs as a process recorded in the process table. The ps and top commands can be used to look at running processes; nice and renice to lower or raise the priority of a process; bg and fg to move a process to the background or to the foreground; kill and killall to send signals to a process; stop, start and restart to manage the running of a process; and cron to run commands at a scheduled time.
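
A few of these commands in action on a harmless process of your own. This is a sketch, assuming the usual nice, renice, ps and kill utilities are installed; sleep 300 merely stands in for a long-running program, and the nice values are arbitrary:

```shell
# Start a stand-in long-running process in the background, at lower priority.
nice -n 10 sleep 300 &
pid=$!
sleep 1                              # give it a moment to start

# Show its PID, nice value and command name.
ps -o pid,ni,comm -p "$pid"

# Lower its priority further (regular users can normally only raise
# the nice value, not bring it back down).
renice -n 15 -p "$pid"

# Send it the TERM signal and collect its exit status.
kill -TERM "$pid"
status=0
wait "$pid" || status=$?

# In bash and dash the status is 128 plus the signal number (15 for TERM).
echo "sleep ended with status $status"
```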

Listing processes of current user at current shell:

$ ps
 PID TTY          TIME CMD
2446 pts/1    00:00:00 bash
5348 pts/1    00:00:00 ps
PID = Process ID (number)
TTY = Controlling TTY (terminal)
TIME = Total CPU time in [DD-]HH:MM:SS format
CMD = Command

Show all of a user's running processes (with CPU/MEM):

$ ps -u user u
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user      2040  0.0  0.1 367732 13540 ?        Ssl  09:04   0:00 x-session-manag
and a long list ...
%CPU = CPU utilisation over the process's lifetime in 00.0 format
%MEM = Process's resident set size as a percentage of the machine's physical memory
VSZ = Process's virtual memory size (1024-byte units)
RSS = Non-swapped physical memory (resident set size) in kilobytes
STAT = Multi-character state: one-character "s" state plus other state characters
START = Start time of the command in HH:MM

For a list of the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process on your machine:

$ man ps | grep -A 20 'output specifiers'

Grep again! :D And the command pgrep looks through the currently running processes and lists the process IDs matching the selection criteria to stdout. All criteria have to match.

For listing all processes named ssh AND owned by root:

$ pgrep -u root ssh

Listing processes owned by root OR daemon:

$ pgrep -u root,daemon
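
To see the matching at work without touching system processes, you can start a process of your own and look it up. A sketch, assuming pgrep from procps is installed; sleep 300 is a stand-in, and -x asks for an exact name match:

```shell
sleep 300 &
pid=$!
sleep 1                                  # let sleep replace the forked shell

# Both criteria must hold: owned by the current user AND named exactly sleep.
found=$(pgrep -u "$(id -u)" -x sleep)
echo "$found"

# Clean up the stand-in process.
kill "$pid"
```

The PID stored in pid should be among the numbers printed.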

And more fun stuff like that. Some processes start up other processes. A webserver, for example, will spin off multiple httpd daemons to wait for requests to your webserver. You can view the hierarchy of processes in a tree view with ps -ejH, or in BSD style ps axjf, forest format ps -ef --forest or with pstree.

Changing process priority

Adding jobs to cron

Network connections

Connecting to a network from a linux box is easy, and on occasion not. If a network interface does not come up or requires manual setup, there are many commands for configuring interfaces, checking network connections and setting up special routing. Once a connection is up there are more commands for getting information about the networks your machine is connected to.

Monitoring network connections

Configuring network interfaces

Reconnaissance

Querying DNS servers

The whois system is used by system administrators to obtain contact information for IP address assignments or domain name administrators. Dig is a networking tool that can query DNS servers for information. It can be very helpful for diagnosing problems with domain pointing and is a good way to verify that your server configuration is working. An alternative to dig is a command called host. This command functions in a very similar way to dig, with many of the same options. And if dig and whois do not provide you with enough information, tools like dnsmap and dnsenum can be handy.

Enumerating targets

Enumerating targets on your local network can be done with nmap, arping, hping and fping. arping probes with ARP requests and fping with ICMP echo requests, while hping allows for constructing custom TCP, UDP and ICMP packets, for analysis of replies.

Reverse engineering

Learn about reverse engineering and backdooring hosts, discover memory corruption, code injection, and general data- or file-handling flaws that may be used to instantiate arbitrary code execution vulnerabilities.

Metasploit

First some preps that make life a little easier. Metasploit can be used in the environment of the bash shell.

Disassembly

Disassembly is the process of reversing the effect of code compilation as much as possible. It does not make sense at all if you know nothing about the parts of your processor that are made visible to machine instructions. Minimally you need to know about its registers (which can be bit-vector/integer, floating point, machine address), how Arithmetic Logic Units work, how clocking circuits work and why some instructions take more than one clock cycle, how first and second level caches work, how Memory Management Units and Direct Memory Access work, etc.

Network exploitation and monitoring

Warning: Do not execute these on a network or system that you do not own. Execute only on your own network or system for learning purposes. Do not execute these on any production network or system.

Spoofing

Questioning servers

Brute-forcing authentication

Traffic filtering

Testing SSL implementation

Related

References

  1. Linux for Theatre Makers: Embodiment and *nix modus operandi http://networkcultures.org/blog/2007/04/23/linux-for-theatre-makers-embodiment-and-nix-modus-operandi/