Kinky linux command-line
From Gender and Tech Resources
Graphical user interfaces (GUIs) are helpful for many tasks, but they box you into the tasks the designer built the GUI for. This is true to a certain extent for the command line too, as it relies on the commands available. Still, some commands are so basic (close to the kernel), come with so many flags and options, or can be built on so easily and combined with other commands in shell scripts, that knowing the command line and shell scripting is well worth the effort.
When I first discovered the power to delete the file in my OpenBSD terminal that the OSX finder could not trash I felt was no longer a prisoner inside my machine, only possessing knowledge of a GUI, I was formerly stuck in a holding pattern. Using *nix you keep moving all the time, discovering always new executable codes sensitive to commands. In the shell I find a marvelous mess of constellations, nebulae, interstellar gaps, awesome gullies, that provokes in me an indescribable sense of vertigo, as if I am hanging from earth upside down on the brink of infinite space, with terrestrial gravity still holding me by the heels but about to release me any moment. An example is /dev/null – a special *nix file where you pipe your unwanted data flow through this output. [1]
Command Line Culture (CLC)
Some people use a Command Line Interface (CLI) extensively, and like it more than a GUI. After a ten-step program, they will admit something like, "I am a command line junkie, I like it far better than pointing and clicking. I have become addicted to the bash command, and the basic linux utilities. I find myself installing the basic GNU tools on any system I use. Heck I even installed cygnus-win on my windows gaming box. Mmmm... Command completion... Tasty!"
An only somewhat more sane version of that seems to be running a GUI and a command line at the same time and switching between the two depending on what needs doing. Usually things can be done faster with the command line, but there are situations, such as doing something with multiple directories, when a GUI is more efficient.
$ cd /insanely/long/directory/path/and/you/thought/you/were/there/yet/but/no/muhhahahaaa/aaaaah
Typing that tends to waste time, even when using that yummy command completion. When doing that same thing regularly with the GUI, that may get annoying too and scripting ensues. Goodbye ten step plan. :D Well, except for geany. My cup of tea.
Getting started
- Case sensitivity is very important and a common source of problems for people new to Linux. Other systems such as M$ windows are case insensitive when it comes to referring to files. Linux is not like this. You can have two or more files and directories with the same name but letters of different case.
- A lot of commands in linux are named as an abbreviation of a word or words describing them. This makes it easier to remember them.
Man
Linux systems come with a very useful utility called man, short for manual pages. It gives a standardised format for documenting the purpose and usage of most of the utilities, libraries, and system calls https://www.kernel.org/doc/man-pages/. For documentation other than man pages, see the Linux Documentation Project site http://www.tldp.org/.
The manual pages are a set of pages that explain every command available on your system including what they do, the specifics of how you run them and what command line arguments they accept. They are fairly consistent in their structure so you can easily get the hang of it. Start up a console or terminal and invoke the manual pages with the following command:
$ man [command]
For example:
$ man grep
Everything is a file
Everything in linux can be viewed as a file:
- regular files are documents, images, archives, recordings, directories (just a file containing names of other files) …
- (character and block) device files give you access to hardware components
- named pipes and sockets give access points for processes to communicate with each other
- (hard and soft) links make a file accessible from different locations
With pwd (print working directory) you can see your location in the file structure.
$ pwd
/home/user
$
With ls (list) you can see what is in a location:
$ ls [options] [location]
For example:
$ ls -l /home/user
total 20
drwxr-xr-x 2 user user 4096 Jun 17 14:39 Desktop
drwxr-xr-x 2 user user 4096 Jul 2 00:45 Documents
drwxr-xr-x 4 user user 4096 Jul 2 00:46 Pictures
The result lines explained
- The first character on a result line indicates whether it is a normal file (-) or a directory (d). In the above example all are directories.
- The next 9 characters are permissions for the file or directory. More on that in file permissions below.
- A number giving the count of hard links to the file or directory.
- The field following that is the owner of the file or directory (user in this case).
- The group the file or directory belongs to (user)
- File size
- File modification time
- Name of the file or directory
For more explanation on and examples of using ls do:
$ man ls
When referring to either a file or directory on the command line, like with /home/user in the ls example, we are referring to a path, a description of a route to get to a particular file or directory on the system. The linux file system is hierarchical, with at the very top of the structure a directory called the root directory, denoted by a single slash /. It has subdirectories and the subdirectories have subdirectories and so on. Files may reside in any of these directories.
Paths can be absolute or relative:
- Absolute paths specify a location (file or directory) in relation to the root directory and begin with a /
- Relative paths specify a location (file or directory) in relation to where you currently are in the system and do not begin with a /
More building blocks:
- ~ (tilde), a shortcut for your home directory. For example /home/user/Pictures and ~/Pictures both refer to the Pictures folder in the home directory of user.
- . (dot), a reference to your current directory. For example, ./Pictures refers to the same directory again.
- .. (dotdot), a reference to the parent directory. You can use this several times in a path to keep going up the hierarchy. If you are in the location the path /home/user refers to, you could run the command ls ../../ and this would return a listing of the root directory.
In order to move around in the system you can use a command called cd (change directory):
$ cd [location]
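The path building blocks can be tried out safely in a throwaway directory. What follows is a minimal sketch, assuming a POSIX shell and the mktemp utility; the directory names Pictures and Documents are only there for illustration.

```shell
# Work in a scratch directory so nothing in your real home is touched.
# pwd -P prints the physical path, so the comparisons below are reliable.
base=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$base/Pictures" "$base/Documents"

cd "$base/Pictures"          # absolute path: starts with /
here=$(pwd -P)

cd ../Documents              # relative path: up one level, then down again
sibling=$(pwd -P)

cd ./..                      # . is the current directory, .. its parent
parent=$(pwd -P)

echo "$here"
echo "$sibling"
echo "$parent"

cd / && rm -r "$base"        # clean up the scratch directory
```

The three lines printed are the Pictures directory, the Documents directory next to it, and the scratch directory that contains both.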
Typing out these paths can become tedious. Not to mention my typos. Yessss. Tab Completion. When you start typing a path and hit the Tab key on your keyboard at any time you will invoke an auto complete action. If nothing happens then that means there are several possibilities. If you hit Tab again it will show you those possibilities. You can continue typing and hit Tab again and it will again try to auto complete for you.
Expansion
When a tilde is used at the beginning of a word, it expands into the name of the home directory of the named user, or if no user is named, the home directory of the current user:
$ echo ~
/home/user
If user "foo" has an account, then:
$ echo ~foo
/home/foo
The shell allows arithmetic to be performed by expansion, which makes it easy to use the shell prompt as a calculator. Arithmetic expansion only supports integers (whole numbers, no decimals), but can perform quite a number of different operations:
$ echo $((2 + 2))
4
It allows for nesting of expressions (5**2 means 5 to the power of 2, so 25):
$ echo $(($((5**2)) * 3))
75
Single parentheses may be used to group multiple subexpressions:
$ echo $(((5**2) * 3))
75
Here is an example using the division and remainder operators (integer division):
$ echo Five divided by two equals $((5/2))
Five divided by two equals 2
$ echo with $((5%2)) left over.
with 1 left over.
Perhaps the strangest expansion is called brace expansion. You can create multiple text strings from a pattern containing braces:
$ echo last{mce,boot,xorg}.log
lastmce.log lastboot.log lastxorg.log
Patterns to be brace expanded may contain a leading portion called a preamble and a trailing portion called a postscript. The brace expression itself may contain either a comma-separated list of strings, or a range of integers or single characters. The pattern may not contain embedded whitespace.
Using a range of integers in reverse order:
$ echo Number_{5..1}
Number_5 Number_4 Number_3 Number_2 Number_1
Brace expansions may also be nested:
$ echo a{A{1,2},B{3,4}}b
aA1b aA2b aB3b aB4b
The most common application of brace expansion is to easily make files or directories:
$ mkdir {2011..2013}-0{1..9} {2011..2013}-{10..12}
$ ls
2011-01 2011-07 2012-01 2012-07 2013-01 2013-07
2011-02 2011-08 2012-02 2012-08 2013-02 2013-08
2011-03 2011-09 2012-03 2012-09 2013-03 2013-09
2011-04 2011-10 2012-04 2012-10 2013-04 2013-10
2011-05 2011-11 2012-05 2012-11 2013-05 2013-11
2011-06 2011-12 2012-06 2012-12 2013-06 2013-12
File manipulation
From the command line, there are many ways to create, find and list different types of files.
In systems such as M$ Windows the extension is important and the system uses it to determine what type of file it is. In linux the system ignores extensions and looks inside the file to determine what type of file it is. So sometimes it can be hard to know for certain what type of file a particular file is. You can determine the type of a file with the file command:
$ file privatelyinvestigating.wordpress.2015-05-02.xml
privatelyinvestigating.wordpress.2015-05-02.xml: XML document text
With cp (copy) you can copy files and directories:
$ cp [options] [source] [destination]
For example:
$ cp -u *.png /home/user/Pictures/
This copies all files in the current directory with extension .png to the Pictures directory in the home directory of user. The -u (update) option skips files that already exist there and are not older than the source.
With mv (move) you can move or rename files and directories. To rename a file, use it like this:
$ mv [filename1] [filename2]
To move a file, use it like this:
$ mv [filename1] [directory]
To move several files at once, use it like this:
$ mv [filename1] [filename2] [directory]
With rm (remove) you can remove files and directories. Linux does not have an undelete command. Once you delete something with rm, it's gone. You can inflict horrifying damage on your system with rm if you are not careful, particularly with wildcards such as *.
To remove a file:
$ rm [filename]
To remove directories:
$ rm -r [directory]
And with mkdir (make directory) you can create directories:
$ mkdir [directory]
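Because rm has no undo, it helps to practise these commands somewhere harmless first. The following is a small sketch, assuming a POSIX shell with mktemp; the file and directory names (notes.txt, photos) are made up for the exercise.

```shell
# Practise in a scratch directory; rm has no undo.
work=$(mktemp -d)
cd "$work"

mkdir photos                      # create a directory
echo "draft" > notes.txt          # create a small file
cp notes.txt notes.bak            # copy:   notes.txt and notes.bak both exist
mv notes.bak photos/              # move:   notes.bak now lives in photos/
mv notes.txt readme.txt           # rename: notes.txt becomes readme.txt

listing=$(ls -R)                  # list recursively to see the result
echo "$listing"

rm readme.txt                     # remove a single file
rm -r photos                      # remove a directory and its contents
remaining=$(ls -A)                # -A also shows hidden files; nothing is left

cd / && rm -r "$work"             # remove the scratch directory itself
```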
File permissions
Unix-like operating systems differ from other computing systems in that they are not only multitasking but also multi-user. The multi-user capability of Unix-like systems is a feature that is deeply ingrained into the design of the *nix operating system. In the environment in which Unix was created, this makes perfect sense, and now, with the internet, this makes perfect sense again. In the beginning, computers were large, expensive, and centralised, access was by terminals and The Computer would support many users at the same time, as does the internet. A method had to be devised to protect users from each other.
In linux, each file and directory is assigned access rights for the owner of the file, the members of a group of related users, and everybody else. Rights can be assigned to read a file, to write a file, and to execute a file (run the file as a program). There are two ways to specify the permissions.
For the first, to see the permission settings for a file or directory, use the ls -l command (see above). Taking one line of the results:
drwxr-xr-x 4 user user 4096 Jul 2 00:46 Pictures
The ls -l output line starts with a d indicating it is a directory, and the next nine characters are for permissions. These are three groups of three characters each. The first set of three characters rwx is for owner, the owner of the file. Owner has read r, write w and execute x permissions on that directory. The second set of characters is for group. Users in the group have r-x permissions and can only read and execute the file. Other (the rest of the world) have those permissions too in this case.
The conversion to the other permissions representation goes like this:
1) Convert the three sets rwx r-x r-x to three groups of binary code using 1's for "turned on" indicated by r, w and x and 0's as "turned off" indicated by a -, like so:
rwx = (111)_2
r-x = (101)_2
r-- = (100)_2
--x = (001)_2
The example then looks like (111 101 101)_2.
2) Convert binary code to octal code. If not familiar with number conversions, a decent tutorial can be found in http://www.cstutoringcenter.com/tutorials/general/convert.php
(111)_2 = 2^0 + 2^1 + 2^2 = 1 + 2 + 2*2 = (7)_8
(101)_2 = 2^0 + 2^2 = 1 + 2*2 = (5)_8
(100)_2 = 2^2 = 2*2 = (4)_8
(001)_2 = 2^0 = (1)_8
The example then looks like (7 5 5)_8. So the permissions for this Pictures directory in octal notation are (755)_8, and the base, 8, is often not mentioned.
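The conversion can also be done mechanically: within each group of three, r counts for 4, w for 2 and x for 1. The sketch below assumes a POSIX shell; to_octal is a hypothetical helper name, not a standard command, and it expects exactly the nine permission characters without the leading file type character.

```shell
# Convert a 9-character permission string such as rwxr-xr-x to octal.
# to_octal is a hypothetical helper, not a standard command.
to_octal() {
    perms=$1
    [ "${#perms}" -eq 9 ] || return 1     # expect exactly three groups of three
    result=""
    while [ -n "$perms" ]; do
        triplet=${perms%"${perms#???}"}   # first three characters
        perms=${perms#???}                # whatever is left after them
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # r = 2^2
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # w = 2^1
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # x = 2^0
        result=$result$digit
    done
    echo "$result"
}

to_octal rwxr-xr-x    # prints 755
to_octal rw-r--r--    # prints 644
```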
Handy file permissions mental shortcuts:
- 777 (rwxrwxrwx) No restrictions on permissions. Anybody may do anything. Generally not a desirable setting.
- 755 (rwxr-xr-x) The file's owner may read, write, and execute the file. All others may read and execute the file. This setting is common for programs that are used by all users.
- 700 (rwx------) The file's owner may read, write, and execute the file. Nobody else has any rights. This setting is useful for programs that only the owner may use and must be kept private from others.
- 666 (rw-rw-rw-) All users may read and write the file. But not execute.
- 644 (rw-r--r--) The owner may read and write a file, while all others may only read the file. A common setting for data files that everybody may read, but only the owner may change.
- 600 (rw-------) The owner may read and write a file. All others have no rights. A common setting for data files that the owner wants to keep private.
Handy directory permissions mental shortcuts:
- 777 (rwxrwxrwx) No restrictions on permissions. Anybody may list files, create new files in the directory and delete files in the directory. Generally not a good setting.
- 755 (rwxr-xr-x) The directory owner has full access. All others may list the directory, but cannot create files nor delete them. This setting is common for directories that you wish to share with other users.
- 700 (rwx------) The directory owner has full access. Nobody else has any rights. This setting is useful for directories that only the owner may use and must be kept private from others.
With chmod you can modify access rights to a file:
$ chmod [permissions] [filename]
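As a quick check of the octal notation, the sketch below sets permissions on a temporary file and reads them back. It assumes GNU coreutils, where stat accepts -c with the %a (octal) and %A (symbolic) format codes; BSD and macOS stat use different options.

```shell
# Set permissions in octal and read them back.
# Assumes GNU coreutils stat; the -c option differs on BSD/macOS.
f=$(mktemp)

chmod 644 "$f"                    # owner read/write, everybody else read only
before=$(stat -c '%a %A' "$f")    # octal form, then symbolic form
echo "$before"

chmod 600 "$f"                    # owner read/write, nobody else
after=$(stat -c '%a %A' "$f")
echo "$after"

rm "$f"
```

The first field of each printed line is the octal value just set, the second is the same permissions in the rwx form that ls -l shows.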
With su (substitute user, often read as super user) or sudo (superuser do) you can temporarily become the superuser. Doing su on debian you will be asked for the root password.
$ su
A new shell owned by root is started, indicated by a # instead of a $ as prompt. You can leave that shell and return to your previous user shell with exit:
# exit
In mint and ubuntu the root account has no password by default, so su to root does not work out of the box, but you can use sudo to run a single command as superuser. You will be asked for your own user password.
$ sudo [command]
In order to change the owner of a file, you must be the superuser. With chown (change owner) you can then change file ownership:
# chown [username] [filename]
For changing the group ownership of a file you do not need superuser, but you do need to be owner of the file and a member of the new group. With chgrp (change group) you can change a file's group ownership:
$ chgrp [groupname] [filename]
Regular expressions
Regular expressions are strings that describe a collection of strings, written in a small language created for that purpose. That probably reads like garble, but a few examples can help. Regular expressions are useful for expansion, static source code analysis, reverse engineering, malware fingerprinting, vulnerability assessment, and exploit development. Many of the tools for working with text enable you to use regular expressions, sometimes referred to as regex, to identify the text you are looking for based on some pattern. You can use these strings to find text within a text editor or use them with search commands to scan multiple files for the strings of text you want.
Matching using regex
Expression | Matches |
---|---|
a.* | a, ab, abc, abs, absolutely, ... (an "a" followed by zero or more of any character) |
^a | Any "a" appearing at the beginning of a line |
a$ | Any "a" appearing at the end of a line |
a.c | Three character strings that begin with a and end with c |
[bcf]at | bat, cat, or fat |
[a-d]at | aat, bat, cat, dat (but not Aat, Bat, Cat or Dat) |
[A-D]at | Aat, Bat, Cat, Dat (but not aat, bat, cat or dat) |
1[3-5]7 | 137, 147 and 157 |
\tHello | A tab character preceding the word Hello |
\.[tT][xX][Tt] | .txt, .TXT, .TxT, and all other case combinations |
Regular expressions are not completely consistent from program to program. For example, the meaning of the asterisk * in the shell's filename expansion is different from that used by grep and other programs which support regular expressions. In addition, other versions of grep behave differently: fgrep searches for fixed strings only and egrep supports additional features. Programming languages also have many additional extensions to regular expressions. The online man pages can be consulted to resolve any discrepancies.
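The difference in meaning of the asterisk is easy to see side by side. This is a small sketch, assuming a POSIX shell and grep; the file names a, ab, abc and bca are throwaway examples created in a scratch directory.

```shell
work=$(mktemp -d)
cd "$work"
touch a ab abc bca

# Shell filename expansion: a* means "names starting with a".
globbed=$(echo a*)
echo "$globbed"                    # a ab abc

# grep regular expression: ^a*$ means "lines made of zero or more a's only".
matched=$(printf '%s\n' a aa ab "" | grep -c '^a*$')
echo "$matched"                    # 3 (a, aa and the empty line)

cd / && rm -r "$work"
```

In the shell the asterisk stands on its own for "anything"; in a regular expression it only repeats whatever comes directly before it.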
Searching and replacing text with regular expressions in vi
The most common editor is still vi and can be found on any *nix (unless it has been removed). Knowledge about how to make minor file edits is critical for administrators. On minimalist systems or when trying to bring a foreign server back online, vi will almost certainly be there. Vim is an enhanced vi editor that may be there (for vim regex see http://www.vimregex.com/ and for vim macros see http://vimdoc.sourceforge.net/htmldoc/usr_10.html#10.1).
You can use regular expressions to find patterns in files from inside editors like vi:
Expression | Matches |
---|---|
. (dot) | Any single character except newline |
* | Zero or more occurrences of the preceding character |
[...] | Any single character specified in the set |
[^...] | Any single character not specified in the set |
^ | Anchor - beginning of the line |
$ | Anchor - end of line |
\< | Anchor - beginning of word |
\> | Anchor - end of word |
\(...\) | Grouping - usually used to group conditions |
\n | Contents of nth grouping |
Examples of sets:
Expression | Matches |
---|---|
[A-Z] | The set from Capital A to Capital Z |
[a-z] | The set from lowercase a to lowercase z |
[0-9] | The set from 0 to 9 (All numerals) |
[./=+] | The set containing . (dot), / (slash), =, and + |
[-A-F] | The set from Capital A to Capital F and the dash (a literal dash must be specified first or last) |
[0-9 A-Z] | The set containing all capital letters and digits and a space |
[A-Z][a-zA-Z] | In the first position the set from capital A to Z, in the second position the set of all letters |
Examples of expressions:
Expression | Matches |
---|---|
/Hello/ | Line containing the value Hello |
/^TEST$/ | Line containing TEST by itself |
/^[a-zA-Z]/ | Line starts with any letter |
/^[a-z].*/ | First character of the line is a-z, followed by zero or more of any character |
/2134$/ | Line ends with 2134 |
/[0-9]*/ | Zero or more numbers in the line |
\<00* | A number with leading zeroes |
/^[^#]/ | The first character is not a # in the line |
The search and replace function in vi is done with the :%s command:
:%s/pattern/string/flags
This command replaces pattern with string according to flags:
- Flags can be g for replacing all occurrences of pattern on each line (without it only the first occurrence on a line is replaced) and c for confirming replacements.
- With & you can repeat the last :s command.
Suppose we have a text file with this content:
- I mount my soul at /dev/null
- Those who do not understand Unix are condemned to reinvent it, poorly.
- Unix was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- Unix is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
- Try this terminal: http://uni.xkcd.com/
Use the Escape key to get into command mode, type :%s (your cursor will jump to the bottom and show you what you are typing) and separate your regular expression and the replacement string with slashes /. (When only searching, / starts a forward search and ? a backward search.) In the above file (with the cursor on the first line) try:
:%s/U.i/linu/g
And see what happens:
- I mount my soul at /dev/null
- Those who do not understand linux are condemned to reinvent it, poorly.
- linux was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
- Try this terminal: http://uni.xkcd.com/
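The same substitution can be tried without opening an editor. The sketch below swaps in sed, a different tool than the vi used in this section, because it accepts the same s/pattern/string/flags form on the command line; only two of the lines from the example file are used.

```shell
f=$(mktemp)
printf '%s\n' \
    '- Those who do not understand Unix are condemned to reinvent it, poorly.' \
    '- Try this terminal: http://uni.xkcd.com/' > "$f"

# Same pattern, replacement and g flag as in the vi example.
# U.i matches "Uni" in "Unix" but not the lowercase "uni" in the URL.
replaced=$(sed 's/U.i/linu/g' "$f")
echo "$replaced"

rm "$f"
```

The first line comes back with linux in place of Unix, while the URL on the second line is left alone because regular expressions are case sensitive.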
Extracting columns of text with awk
Awk is a programming language which allows easy manipulation of structured data and the generation of formatted reports. Awk stands for the names of its authors "Aho, Weinberger, and Kernighan". Awk is used for pattern scanning and processing. It searches one or more files to see if they contain lines that match the specified patterns and then performs the associated actions.
Key features of awk are:
- Awk views a text file as records and fields.
- Like most programming languages, awk has variables, conditionals and loops
- Awk has arithmetic and string operators.
- Awk can generate formatted reports.
- Awk reads from a file or standard input, and outputs to standard output.
- Awk does not get along with non-text files.
$ awk '/[pattern]/ {[actions]} /[pattern]/ {[actions]}' [inputfile]
Pattern is a regular expression, the actions go between braces, and the single quotes are used to make sure the shell does not interpret any of the enclosed special characters.
$ awk '{print;}' fun.txt
will print all the lines in fun.txt.
$ awk '/null/' fun.txt
- I mount my soul at /dev/null
You can continue on a next line. When you press return inside the quotes, > appears as prompt. Once the closing quote completes the awk program, the next return is taken as "go".
$ awk '/sexy/
> /null/' fun.txt
- I mount my soul at /dev/null
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
The examples below use a pipe |, see I/O redirection for more on using pipes and Process management for more on ps.
Showing you the processes of user:
$ ps auwx | awk '/user/ {print $11}'
or:
$ ps auwx | grep user | awk '{print $11}'
Both display the contents of the 11th column (command name) of the output of ps auwx. In the first example awk is used and in the second grep to find all processes owned by the user named user (strictly, every line that contains the string user). In each case, when processes owned by user are found, column 11 (command name) is displayed for each of those processes.
By default awk uses whitespace (spaces and tabs) as delimiter between columns. You can specify a different delimiter with the -F option:
$ awk -F: '{print $1,$5}' /etc/passwd
or:
$ cut -d: -f1,5 /etc/passwd
In both cases the colon : is specified as delimiter. Changing the comma to a dash prints columns 1 through 5:
$ cut -d: -f1-5 /etc/passwd
When there is a varying number of spaces between columns, such as in the output of ps, awk is recommended. When files are delimited by commas or colons, as is the case with the /etc/passwd file, cut is recommended.
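Why awk copes better with a varying number of spaces can be seen in a small experiment. The sketch assumes a POSIX shell with awk and cut; the two sample lines are made up, the second in the style of an /etc/passwd record.

```shell
# Three columns separated by a varying number of spaces.
line='root      1   /sbin/init'

with_awk=$(echo "$line" | awk '{print $3}')     # awk collapses runs of whitespace
with_cut=$(echo "$line" | cut -d' ' -f3)        # cut treats every single space as a delimiter

echo "awk: [$with_awk]"     # awk: [/sbin/init]
echo "cut: [$with_cut]"     # cut: []

# With a single-character delimiter both tools pick the same fields.
record='user:x:1000:1000:Example User:/home/user:/bin/bash'
from_awk=$(echo "$record" | awk -F: '{print $1,$5}')
from_cut=$(echo "$record" | cut -d: -f1,5)
echo "$from_awk"            # user Example User
echo "$from_cut"            # user:Example User
```

For cut the third field is the empty string between the second and third space, which is why nothing shows up between the brackets.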
Searching for text with grep
The grep command comes in handy when performing more advanced string searches in a file. By now it's a verb. Grep's regex flavor is limited. An enhanced version of grep is called egrep. It uses a text-directed engine. Since neither grep nor egrep support any of the special features such as lazy repetition or lookaround, and because grep and egrep only indicate whether a match was found on a particular line or not, this distinction does not matter, except that the text-directed engine is faster. On POSIX systems, egrep uses POSIX Extended Regular Expressions http://www.regular-expressions.info/posix.html#bre. For more on POSIX see wikipedia https://en.wikipedia.org/wiki/POSIX. Despite the name "extended", egrep is almost the same as grep. It just uses a slightly different regex syntax and adds support for alternation, but loses support for backreferences.
The usual suspects with some differences, for example * is not a wildcard:
Expression | Matches |
---|---|
. (dot) | Any character except the end of the line character. |
$ | The expression at the end of a line. |
* | Zero or more occurrences of the previous character. |
Character classes (written inside a bracket expression, for example [[:digit:]]):
Expression | Matches |
---|---|
[:alnum:] | Alphanumeric characters. |
[:alpha:] | Alphabetic characters. |
[:blank:] | Blank characters: space and tab. |
[:lower:] | Lower-case letters: 'a b c d e f g h i j k l m n o p q r s t u v w x y z'. |
[:digit:] | Digits: '0 1 2 3 4 5 6 7 8 9'. |
[:space:] | Space characters: tab, newline, vertical tab, form feed, carriage return, and space. |
[:upper:] | Upper-case letters: 'A B C D E F G H I J K L M N O P Q R S T U V W X Y Z'. |
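A regular expression for grep may be followed by one of several repetition operators. With egrep (or grep -E) they are written as shown; with plain GNU grep the ?, + and braces need a backslash in front, as in \? or \{n\}: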
Expression | Matches |
---|---|
? | The preceding item is optional and matched at most once. |
* | The preceding item will be matched zero or more times. |
+ | The preceding item will be matched one or more times. |
{n} | The preceding item is matched exactly n times. |
{n,} | The preceding item is matched n or more times. |
{,m} | The preceding item is matched at most m times. |
{n,m} | The preceding item is matched at least n times, but not more than m times. |
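A short experiment shows how the repetition operators behave in basic and extended syntax. This is a sketch assuming GNU grep; the three sample words are invented test data, and || true only keeps the script going when grep finds nothing.

```shell
sample=$(printf '%s\n' color colour colouur)

# Basic syntax: ? is an ordinary character, so this looks for a literal "?".
basic=$(echo "$sample" | grep -c 'colou?r' || true)

# Extended syntax (-E): ? makes the preceding u optional.
extended=$(echo "$sample" | grep -c -E 'colou?r' || true)

# Interval in extended syntax: exactly two u's.
interval=$(echo "$sample" | grep -E 'colou{2}r' || true)

echo "$basic $extended $interval"    # 0 2 colouur
```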
For showing lines containing linux in a file:
$ grep linux fun.txt
- Those who do not understand linux are condemned to reinvent it, poorly.
- linux was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
$
For counting the empty lines in a file (-c prints the number of matching lines instead of the lines themselves):
$ grep -c "^$" [filename]
Searching all files in the current directory for the pattern "kernel: *.", i.e. kernel: followed by zero or more space characters and then any character:
$ grep "kernel: *." *
grep: Desktop: Is a directory
grep: eepsite: Is a directory
grep: Music: Is a directory
grep: scripts: Is a directory
grep: Templates: Is a directory
Use of bracket expressions:
$ grep '[[:upper:]]' filename
Wildcards, matching all three-character words starting with "b" and ending in "t":
$ grep '\<b.t\>' filename
Print all lines with exactly two characters:
$ grep '^..$' filename
The following regex to find the IP address 192.168.1.254 is too loose, because an unescaped dot matches any character, not just a dot:
$ grep '192.168.1.254' /etc/hosts
All three dots need to be escaped:
$ grep '192\.168\.1\.254' /etc/hosts
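What the unescaped dots let through can be checked on a few lines of test input. A minimal sketch, assuming a POSIX shell and grep; the three host lines are made up and no real /etc/hosts is read.

```shell
hosts=$(printf '%s\n' \
    '192.168.1.254 router' \
    '192.168.10254 typo' \
    '192x168y1z254 junk')

loose=$(echo "$hosts" | grep -c '192.168.1.254')        # . matches any character
strict=$(echo "$hosts" | grep -c '192\.168\.1\.254')    # \. matches only a literal dot

echo "$loose $strict"    # 3 1
```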
An IP address with egrep:
$ egrep '[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}' filename
The examples below use a pipe |, see I/O redirection for more on using pipes, Process management for more on ps and Network exploitation and monitoring for more on tcpdump.
For showing init lines from ps output:
$ ps auwx | grep init
root 1 0.0 0.0 29432 5416 ? Ss 10:14 0:01 /sbin/init
user 4999 0.0 0.0 12724 2092 pts/0 S+ 18:32 0:00 grep init
Using grep to search for specific network traffic with tcpdump:
$ sudo tcpdump -n -A | grep -e 'POST'
[sudo] password for user:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
E...=.@.@......e@.H..'.P(.o%~...P.9.PN..POST /blog/wp-admin/admin-ajax.php HTTP/1.1
E...c_@.@..=...e@.H..*.PfC<....wP.9.PN..POST /blog/wp-admin/admin-ajax.php HTTP/1.1
E.....@.@......e@.H...."g;.(.-,WP.9.Nj..POST /login/?login_only=1 HTTP/1.1
Sniffing passwords using egrep:
$ tcpdump port http or port ftp or port smtp or port imap or port pop3 -l -A | egrep -i 'pass=|pwd=|log=|login=|user=|username=|pw=|passw=|passwd=|password= |pass:|user:|username:|password:|login:|pass|user' --color=auto --line-buffered -B20
Input/Output redirection
I/O redirection is one of the easiest things to master. It allows for combining different utilities effectively. For example, you may want to search through the output from nmap or tcpdump or a key-logger by feeding its output to another file or program for further analysis.
File descriptors
Every running program starts with three files (data streams) already opened:
- STDIN (0) - Standard input (data fed into the program, defaults to keyboard)
- STDOUT (1) - Standard output (data printed by the program, defaults to terminal/console)
- STDERR (2) - Standard error (for error messages, also defaults to the terminal/console)
That "open file"? The value returned by an open call is called a file descriptor and it is an index into an array of open files kept by the kernel, making the file descriptor the gateway into the kernel's abstractions of underlying hardware.
For more on device drivers see Linux Device Drivers, Third Edition By Jonathan Corbet, Alessandro Rubini, Greg Kroah-Hartman http://www.oreilly.com/openbook/linuxdrive3/book/index.html It's a bit outdated (2005) but a good reference.
Piping and redirection is the means by which we may connect these streams between programs and files to direct data in interesting and useful ways.
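Descriptors 0, 1 and 2 are only the ones a program starts with; a shell can open more. The sketch below assumes a POSIX shell and uses descriptor 3 purely as an example number, writing to a temporary file created with mktemp.

```shell
log=$(mktemp)

exec 3> "$log"            # open descriptor 3 for writing to the file
echo "to stdout"          # descriptor 1: the terminal, as usual
echo "to fd 3" >&3        # descriptor 3: ends up in the file
exec 3>&-                 # close descriptor 3 again

logged=$(cat "$log")
echo "$logged"            # to fd 3
rm "$log"
```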
Redirecting to a file
Take all output from standard out (stdout) and place it into filename (using >> instead will append to the file, rather than overwrite it):
$ ls > filename
You do not have to create the file named filename in the example first. The way the mechanism works, the file filename is created first (if it did not exist already) and then the program is run and output saved into the file. If we save into a file which already exists, however, then its contents will be cleared, and then the new output saved to it.
Reading from a file
Copy all data from the file to the standard input (stdin) of the program:
$ cat < filename
Most programs allow us to input a file. So why is this redirection handy? An example of that using wc (word count, the -l is for printing newline counts):
$ wc -l fun.txt
5 fun.txt
while:
$ wc -l < fun.txt
5
When wc is supplied with the file to process as a command line argument, the output from the program included the name of the file that was processed. When redirecting the contents of fun.txt into wc the file name was not printed. When using redirection or piping, the data is sent anonymously. This mechanism is useful for keeping ancillary data from being printed.
We can use the sort command to process the contents of fun.txt. We can combine the two forms of redirection like this:
$ sort < fun.txt > sorted_fun.txt
$ cat sorted_fun.txt
- I mount my soul at /dev/null
- linux is sexy: who | grep -i blonde | date; cd ~; unzip; touch; strip; finger; mount; gasp; yes; uptime; umount; sleep
- linux was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
- Those who do not understand linux are condemned to reinvent it, poorly.
- Try this terminal: http://uni.xkcd.com/
The three streams have numbers associated with them (0, 1 and 2). STDERR is stream number 2 and we can use these numbers to identify the streams. If we place a number before the > operator then it will redirect that stream (if we don't use a number, then it defaults to stream 1).
$ [command] 2> errors.txt
We can save both normal output and error messages into a single file by redirecting the STDERR stream to the STDOUT stream and redirecting STDOUT to a file. We redirect to a file first then redirect the error stream. We identify the redirection to a stream by placing an & in front of the stream number (otherwise it would redirect to a file called 1).
$ [command] > commandoutput 2>&1
$ cat commandoutput
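The order of the two redirections matters, which a small test makes visible. This is a sketch assuming a POSIX shell; noisy is a hypothetical stand-in for any command that writes to both streams.

```shell
out=$(mktemp)

# noisy is a hypothetical stand-in for a command that uses both streams.
noisy() {
    echo "normal output"
    echo "error message" >&2
}

# Order as described: stdout to the file first, then stderr to where stdout points.
noisy > "$out" 2>&1
both=$(wc -l < "$out")            # both lines ended up in the file

# Reversed order: stderr is pointed at the old stdout before stdout moves to the file.
leaked=$(noisy 2>&1 > "$out")
only=$(wc -l < "$out")            # only the normal output is in the file

echo "lines with file first: $both"
echo "lines with reversed order: $only, escaped: $leaked"
rm "$out"
```

With the reversed order the error message never reaches the file, because 2>&1 copies whatever stdout points to at that moment.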
Piping
Take everything from standard out (stdout) of program1 and pass it to standard input (stdin) of program2:
$ ls | more
We can pipe as many programs together as we like. In the below example the output of ls is piped to head to give us the first three lines of the output of the ls command, and that is piped to tail so as to get only the third file:
$ ls
commandoutput firstfile filename foo1 fun.txt funny.png
$ ls | head -3
commandoutput
firstfile
filename
$ ls | head -3 | tail -1
filename
To make debugging of huge piped commands easier, build your pipes up incrementally. Run the first program and make sure it provides the output you were expecting. Then add the second program and check again before adding the third and so on. This can save you a lot of frustration. :D
When piping and redirecting, the actual data will always be the same, but the formatting of that data may be slightly different to what is normally printed to the screen.
Process management (job control)
One of the most powerful aspects of linux is its ability not only to keep many processes in memory at once but also to switch between them fast enough to make it appear as though they were all running at the same time, called multitasking. In much of the Linux code the references are to tasks, not to processes. Because the term process seems to be more common in *nix literature and I am used to that term, I will be using process.
Process management concepts
- A process is a single sequence of events utilizing memory and files. A new process is created by forking, which makes a copy of the calling process. The two processes are only distinguished by the parent being able to wait for the child process to finish. A process may replace itself by another program to be executed.
- Control of the multitasking is maintained in a preemptive or timesliced way. With timeslicing, after a certain amount of time (in ms) the operating system passes operation over from one process to the next, more deserving process. The scheduler chooses the most appropriate process to run next, and linux uses a number of scheduling strategies to ensure fairness. For scheduling in general, see http://www.hugovanhove.net/cursussen/OpSys/ProcessScheduling/ProcessScheduling.html. Prior to version 2.5.4, the linux kernel was non-preemptive, which means a process running in kernel mode could not be moved off the processor until it left of its own accord or waited for an input/output operation to complete. Generally a process in user mode can enter kernel mode using system calls. When the kernel was non-preemptive, a lower priority process could cause priority inversion: by repeatedly making system calls and remaining in kernel mode, it denied a higher priority process access to the processor. Even if the lower priority process's timeslice expired, it would continue running until it completed its work in the kernel or voluntarily relinquished control. If the higher priority process waiting to run is a text editor in which the user is typing or an MP3 player ready to refill its audio buffer, the result is very poor interactive performance. The kernel is now preemptive: http://www.informit.com/articles/article.aspx?p=414983&seqNum=2
- An image is a computer execution environment which includes the program, associated data, status of open files (i.e. file descriptor table and system file table), and the default directory. Some image attributes such as the user-id are accessible directly but other attributes such as the list of child processes can only be accessed through system calls.
- A process is the execution of an image. During execution it has four parts to its execution space: program code segment (read only and sharable), program data segment (writable, non-sharable), runtime stack segment, and system segment (system data localized to process).
- A system call is a standardized access method or hook from user scripts or programs. The process management system uses four main system calls:
- fork creates two copies (parent and child) of an image.
- wait allows a parent to pause until the child process completes.
- exec allows overlaying of the calling program with a new one.
- exit is a voluntary completion of the process.
- Processes communicate with each other using signals.
- A process table maintains records for each process on the system. These processes are organized in a tree structure of parents and children.
- Processes are normally but not necessarily associated with a terminal device. This is done automatically on creation.
- Daemons are processes that are NOT associated with a terminal. An example is the print spooler. These are identified in the process table as ? in the tty column.
- Processes may be run in the 'background' (often by using an ampersand (&) at the end of the command that starts the process). The shell does not wait for programs running in the background to complete before accepting the next command.
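The four system calls from the list above can be observed from the shell, which forks and then execs whenever it starts an external command. This is a rough sketch: the exit status 3 and the variable names are chosen arbitrarily for the example.

```shell
# '&' makes the shell fork a child; the child ends itself with exit and status 3.
( exit 3 ) &
child=$!            # PID of the forked child

# The parent pauses in wait until the child completes and collects its exit status.
status=0
wait "$child" || status=$?
echo "child $child exited with status $status"

# exec overlays the calling process with a new program and keeps the same PID:
# the inner shell prints its PID, then is replaced by a second sh that prints its own.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
echo "$pids"
```

The last command should print the same PID twice, because exec does not create a new process.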
When an executable program starts up, it runs as a process under management of the process table. The ps and top commands can be used to look at running processes; nice and renice raise and lower the priority of a process; processes can be moved to run in the background with bg or to the foreground with fg; kill and killall send signals to a process; stop, start and restart manage the running of a process; and cron can run commands at a scheduled time.
Looking at processes
Listing processes of current user at current shell:
$ ps
PID TTY TIME CMD
2446 pts/1 00:00:00 bash
5348 pts/1 00:00:00 ps
PID = Process ID (number)
TTY = Controlling TTY (terminal)
TIME = Total CPU time in [DD-]HH:MM:SS format
CMD = Command
Show all of a user's running processes (with CPU/MEM):
$ ps -u user u
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 2040 0.0 0.1 367732 13540 ? Ssl 09:04 0:00 x-session-manag
and a long list ...
%CPU = CPU utilisation over the process's lifetime in 00.0 format
%MEM = Percentage of the machine's physical memory used by the process (resident set size)
VSZ = Process's virtual memory (1024-byte units)
RSS = Non-swapped physical memory (resident set size) in Kb
START = Start time of the command in HH:MM
STAT = Multi-character state: one-character "s" state plus other state characters
"plus other state characters" ... The example shows Ssl. For a list of the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process on your machine:
$ man ps|grep -A 20 'output specifiers'
Grep again! :D And the command pgrep looks through the currently running processes and lists the process IDs matching the selection criteria to stdout. All criteria have to match.
For listing all processes named ssh AND owned by root:
$ pgrep -u root ssh
Listing processes owned by root OR daemon:
$ pgrep -u root,daemon
And more fun stuff like that. Some processes start up other processes. A webserver, for example, will spin off multiple httpd daemons to wait for requests to your webserver. You can view the hierarchy of processes in a tree view with ps -ejH, or in BSD style ps axjf, forest format ps -ef --forest or with pstree.
Changing process priority
The kernel schedules processes and allocates CPU time accordingly for each of them. When one of your processes requires higher priority to get more CPU time, you can use the nice and renice commands. The process scheduling priority ranges from -20 to 19. This is called the nice value. A nice value of -20 represents the highest priority, and a nice value of 19 represents the lowest priority for a process.
Launch a test program called test.sh (infinite loop testing script):
$ ./test.sh
Check with ps:
$ ps -u user fl|grep './test.sh'
0 1000 3940 3708 20 0 13248 2908 - S+ pts/0 0:00 | | \_ /bin/bash ./test.sh
0 1000 4087 3956 20 0 12720 2100 - S+ pts/1 0:00 | \_ grep ./test.sh
The sixth column is NI (nice) and it is set to 0 for /bin/bash ./test.sh. You can check what column is which by running that ps command without piping it to grep:
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
Instead of launching the program with the default priority, you can use the nice command to launch the process with a specific priority. In the command below, -10 sets the nice value of the process to 10: the - is the dash we use to pass options to a command, not a minus sign. With two dashes (--10) the nice value would be -10, a higher priority:
$ nice -10 ./test.sh
And check again:
$ ps -u user fl|grep './test.sh'
0 1000 4100 3708 30 10 13248 2980 - SN+ pts/0 0:00 | | \_ /bin/bash ./test.sh
0 1000 4104 3956 20 0 12720 2204 - S+ pts/1 0:00 | \_ grep ./test.sh
The test script is now launched with a nice value of 10, which means it runs at a lower priority when compared to other programs that are launched by default.
The process priority can also be adjusted with the help of the -n option. Increase priority (a negative adjustment usually requires root):
$ nice -n -5 ./test.sh
Decrease priority:
$ nice -n 5 ./test.sh
You can also change the priority of a running process with renice. For that you will need the PID (3rd column in the output of the ps command above; the 4th column is the PID of the parent process):
$ renice -n -19 -p 4100
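A quick way to see that the number given to nice is an adjustment relative to the current nice value. This sketch assumes the GNU coreutils behaviour that nice, run without a command, prints the niceness it is running with (other implementations may insist on a command); the variable names are only for illustration.

```shell
base=$(nice)                  # nice value of the current shell, usually 0
lowered=$(nice -n 10 nice)    # the inner nice reports the value after a +10 adjustment
echo "shell: $base, started with nice -n 10: $lowered"
```

An adjustment towards 19 never needs special privileges, which is why this runs as an ordinary user.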
Running jobs, jobs, jobs!
The at command runs a command or script at the time you set. You enter the commands to run at the at> prompt (leave with Ctrl-D):
$ at now +1 min
at> backupdb
at> <EOT>
job 1 at Sat Jul 11 14:28:00 2015
With atq you can list all jobs queued with at.
You can also use cron for running commands or scripts at a given date and time. You can schedule scripts to be executed periodically. It is usually used for sysadmin jobs such as backups or cleaning /tmp/ directories and more. The cron service (daemon) runs in the background and constantly checks the /etc/crontab file, and /etc/cron.*/ directories. It also checks the /var/spool/cron/ directory.
To create a personal crontab:
$ crontab -e
no crontab for user - using an empty one
Select an editor. To change later, run 'select-editor'.
1. /bin/nano <---- easiest
2. /usr/bin/vim.tiny
Likely you'll be given a choice as to which editor you wish to use, nano or vi(m), and then a new crontab file is opened for you in the chosen editor. Under the explanation you can enter your crontab line, in the given column format:
m h dom mon dow command
Crontab fields and allowed ranges (linux crontab syntax)
Field  Description    Allowed values
MIN    Minute field   0-59
HOUR   Hour field     0-23
DOM    Day of month   1-31
MON    Month field    1-12
DOW    Day of week    0-6
CMD    Command        Any command to be executed
Scheduling a job for a specific time (July 11th, 8:30 pm) every year:
30 20 11 07 * /home/user/scripts/backupdb
And using incremental backups, I can do this twice a day, evening and morning (the three * expand to every day of the month, every month and every day of the week):
30 8,20 * * * /home/user/scripts/incremental-backupdb
To view your crontab entries use crontab -l. To edit, use crontab -e.
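To see how a crontab line maps onto the columns m h dom mon dow command, the following sketch splits the twice-a-day example into its fields with the shell's read builtin. It only illustrates the column layout; cron itself does the real parsing.

```shell
# Example line from above; the variable names mirror the crontab column header.
line='30 8,20 * * * /home/user/scripts/incremental-backupdb'

# read splits on whitespace; the last variable receives the rest of the line.
read -r m h dom mon dow cmd <<EOF
$line
EOF

echo "minute=$m hour=$h day-of-month=$dom month=$mon day-of-week=$dow"
echo "command=$cmd"
```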
Network connections
Connecting to a network from a linux box is usually easy, but on occasion it is not. If a network interface does not come up or requires manual setup, there are many commands for configuring interfaces, checking network connections and setting up special routing. Once a connection is up there are more commands for getting information about the networks your machine is connected to.
Networking concepts
- Most Internet servers and personal computers use Internet Protocol version 4 (IPv4). This uses 32 bits to assign a network address as defined by the four octets of an IP address, up to 255.255.255.255. Each octet is converted to a decimal number (base 10) from 0–255 and separated by a period (a dot). This format is called dotted decimal notation. For example, the IPv4 address of 11000000101010000000001100011000 is:
- Segmented into 8-bit blocks: 11000000 10101000 00000011 00011000.
- Each block is converted to decimal: 192 168 3 24.
- The adjacent octets are separated by a period: 192.168.3.24.
- Internet Protocol version 6 (IPv6) was designed to answer the future exhaustion of the IPv4 address pool. IPv4 address space is 32 bits which translates to just above 4 billion addresses. IPv6 address space is 128 bits, which translates to about 3.4 × 10^38 potential addresses. The protocol has also been upgraded to include new quality of service features and security, but also has its vulnerabilities [2] [3]. IPv6 addresses are represented as eight groups of four hexadecimal digits with the groups being separated by colons, for example 2805:F298:0004:0148:0000:0000:0740:F5E9, but methods to abbreviate this full notation exist http://www.vorteg.info/ipv6-abbreviation-rules/.
- The Transmission Control Protocol/Internet Protocol (TCP/IP) uses a client - server model for communications. The protocol defines the data packets transmitted (packet header, data section), data integrity verification (error detection bytes), connection and acknowledgement protocol, and re-transmission.
- TCP/IP Time To Live (TTL) is a counting mechanism to determine how long a packet is valid before it reaches its destination. Each time a TCP/IP packet passes through a router it will decrement its TTL count. When the count reaches zero the packet is dropped by the router. This ensures that errant routing and looping aimless packets will not flood the network.
- A Media Access Control address (MAC Address) is the network card address used for communication between other network devices on the subnet. This information is not routable. The ARP table maps a (global internet) TCP/IP address to the local hardware on the local network. The MAC address uniquely identifies each node of a network and is used by the Ethernet protocol.
- Full Duplex allows the simultaneous sending and receiving of packets. Most modern modems support full duplex.
- Half Duplex allows the sending and receiving of packets in one direction at a time only.
- The International Organization for Standardization (ISO) has defined the Open Systems Interconnection (OSI) model for current networking protocols, commonly referred to as the ISO/OSI model.
- A Network Hub is a hardware device to connect network devices together. The devices will all be on the same network and/or subnet. All network traffic is shared and can be sniffed by any other node connected to the same hub.
- A Network Switch is like a hub but creates a private link between any two connected nodes when a network connection is established. This reduces the amount of network collisions and thus improves speed. Broadcast messages are still sent to all nodes.
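The dotted decimal conversion described in the IPv4 item above can be sketched in plain shell. The function name bin_to_dotted is made up for this example, and it assumes its argument is a string of exactly 32 binary digits.

```shell
# Hypothetical helper: convert a 32-bit binary string to dotted decimal notation.
bin_to_dotted() {
    bits=$1
    out=
    while [ -n "$bits" ]; do
        octet=$(printf '%s' "$bits" | cut -c1-8)   # segment off the next 8-bit block
        bits=$(printf '%s' "$bits" | cut -c9-)     # keep the remainder for the next round
        value=0
        # Convert the block to decimal, one bit at a time.
        for bit in $(printf '%s' "$octet" | sed 's/./& /g'); do
            value=$(( value * 2 + bit ))
        done
        out="${out:+$out.}$value"                  # separate adjacent octets with a period
    done
    printf '%s\n' "$out"
}

bin_to_dotted 11000000101010000000001100011000
```

Run on the address from the example, this prints 192.168.3.24.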
Linux TCP/IP network configuration files:
/etc/resolv.conf     Lists DNS servers for internet domain name resolution.
/etc/hosts           Lists hosts to be resolved locally (not by DNS).
/etc/nsswitch.conf   Lists the order of host name search. Typically look at local files, then NIS server, then DNS server.
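As an illustration of what local resolution through a hosts file amounts to, here is a small sketch that looks a name up in a file in /etc/hosts format (address first, then one or more names). The sample entries and the function name lookup_host are invented for the example; the real lookup is done by the system resolver, not by a script like this.

```shell
# Hypothetical sample in /etc/hosts format, written to a temporary file.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
127.0.0.1     localhost
# names on the local network (made up for this example)
192.168.3.24  printer.lan printer
EOF

# Print the address of the first entry that lists the given name.
# $1 = hosts-format file, $2 = host name to look up
lookup_host() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

lookup_host "$hosts_file" printer
```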
Managing network interface cards
Adding a network interface card (NIC)
Activating and de-activating your NIC
Configuring network interfaces
Assigning an IP address
ifconfig
/etc/network/interfaces
Network IP aliasing
Changing the host name
Subnets
Network classes
Enabling and disabling forwarding
Route
Tunneling
Network socket listener daemons: inetd, xinetd
Remote commands: rcp, rsh, rlogin, rwho, ...
Remote Procedure Calls (RPC, Portmapper)
Network Wrappers (PAM)
ICMP: Blocking ICMP and look invisible to ping
Traffic Control (TC) and TC New Generation (TCNG)
Configuring linux for network multicast
Monitoring network connections
Using tcpdump to monitor the network
Address Resolution Protocol (ARP)
Network intrusion and detection systems
Reconnaissance
Querying DNS servers
The whois system is used by system administrators to obtain contact information for IP address assignments or domain name administrators. Dig is a networking tool that can query DNS servers for information. It can be very helpful for diagnosing problems with domain pointing and is a good way to verify that your server configuration is working. An alternative to dig is a command called host. This command functions in a very similar way to dig, with many of the same options. And if dig and whois do not provide you with enough information, tools like dnsmap and dnsenum can be handy.
Enumerating targets
Enumerating targets on your local network can be done with nmap, arping, hping and fping. The last three allow for constructing arbitrary packets for almost any networking protocol, for analysis of replies.
Reverse engineering
Learn about reverse engineering and backdooring hosts, discover memory corruption, code injection, and general data- or file-handling flaws that may be used to instantiate arbitrary code execution vulnerabilities.
Metasploit
First some preps that make life a little easier. Metasploit can be used in the environment of the bash shell.
Disassembly
Disassembly is the process of reversing the effect of code compilation as much as possible. It does not make sense at all if you know nothing about the parts of your processor that are made visible to machine instructions. Minimally you need to know about its registers (which can be bit-vector/integer, floating point, machine address), how Arithmetic Logic Units work, how clocking circuits work and why some instructions take more than one clock cycle, how first and second level caches work, how Memory Management Units and Direct Memory Access work, etc.
Network exploitation and monitoring
Warning: Do not execute these on a network or system that you do not own. Execute only on your own network or system for learning purposes. Do not execute these on any production network or system.
Spoofing
Questioning servers
Brute-forcing authentication
Traffic filtering
Testing SSL implementation
Resources
- Commandlinefu is a place to record those command-line gems that you return to again and again. Delete that bloated snippets file you've been using and share your personal repository with the world. That way others can gain from your CLI wisdom and you from theirs too.
Related
References
- [1] Linux for Theatre Makers: Embodiment and *nix modus operandi http://networkcultures.org/blog/2007/04/23/linux-for-theatre-makers-embodiment-and-nix-modus-operandi/
- [2] Routing Loop Attack using IPv6 Automatic Tunnels: Problem Statement and Proposed Mitigations http://tools.ietf.org/html/draft-ietf-v6ops-tunnel-loops-07
- [3] When moving to IPv6, beware the risks http://gcn.com/articles/2013/03/20/risks-moving-to-ipv6.aspx