Clean up all the things

From Gender and Tech Resources

Revision as of 10:12, 13 September 2015 by Lilith2 (Talk | contribs) (Email streams)


“Granny bit her lip. She was never quite certain about children, thinking of them-when she thought about them at all-as coming somewhere between animals and people. She understood babies. You put milk in one end and kept the other as clean as possible. Adults were even easier, because they did the feeding and cleaning themselves. But in between was a world of experience that she had never really inquired about. As far as she was aware, you just tried to stop them catching anything fatal and hoped that it would all turn out all right.” ― Terry Pratchett, Equal Rites

CleanAllTheThings.jpg

The dirt on and stuff in it

Clean.jpg

There's a whole lot of information that we may not want other people to see:

  • Credit-card information
  • Social security numbers
  • Private correspondence
  • Personal details
  • Bank-account information
  • Medical health records
  • Other sensitive information, like metadata.

And we can't simply sweep it under the rug. No room for that. There are already too many files under the rug. Visible and invisible. Cookies, metadata of all kinds, temp files, cache files, and not just under the rug. Boxes full of 'em, stacked everywhere. The whole place is a mess. But it's an organised mess.

When my place is a big mess, I grab some empty boxes. I start at the front door and go from room-to-room picking up the stuff that doesn't belong in each room. I put the stuff in the empty boxes. I end up with basically just more boxes of random stuff. But each room looks less cluttered for now. Regularly I sort through the boxes and put stuff in the "correct place" (often the bin). Sometimes a "correct place" gets designated another physical location. Just in case I get guests, invited or otherwise, I make certain messes look like a pile of rubbish so it doesn't get noticed. Every year or so, except for some keys that I keep for access to community spaces and some personal stuff that I keep in a container outside of the house, I burn down the entire place and build a new house. Overall, this strategy works for me. :D

Removing metadata

If we ask whether a fact about a person identifies that person, it turns out that the answer isn’t simply yes or no. If all I know about a person is their ZIP code, I don’t know who they are. If all I know is their date of birth, I don’t know who they are. If all I know is their gender, I don’t know who they are. But it turns out that if I know these three things about a person, I could probably deduce their identity! Each of the facts is partially identifying. There is a mathematical quantity which allows us to measure how close a fact comes to revealing somebody’s identity uniquely. That quantity is called entropy, and it’s often measured in bits. Intuitively you can think of entropy being generalization of the number of different possibilities there are for a random variable: if there are two possibilities, there is 1 bit of entropy; if there are four possibilities, there are 2 bits of entropy, etc. Adding one more bit of entropy doubles the number of possibilities. ~ A Primer on Information Theory and Privacy https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy
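
As a rough illustration of the arithmetic in the quote above: the number of bits is the base-2 logarithm of the number of equally likely possibilities. This sketch is not from the EFF primer. The function name bits_of_entropy is made up for this example, and it assumes all possibilities are equally likely, which real-world facts such as ZIP codes are not.

```shell
#!/bin/sh
# bits_of_entropy N: print log2(N), i.e. how many bits are needed to
# single out one possibility among N equally likely ones.
# (Hypothetical helper for illustration; LC_ALL=C keeps the decimal point.)
bits_of_entropy() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.2f\n", log(n) / log(2) }'
}

bits_of_entropy 2      # prints 1.00
bits_of_entropy 4      # prints 2.00
bits_of_entropy 1024   # prints 10.00
```

Going from 2 to 4 possibilities adds exactly one bit, which matches the "one more bit doubles the number of possibilities" rule in the quote.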

Metadata is data about data.

To put it bluntly, metadata is hidden data that can fuck you over. Fuck you over real hard and rough like, savvy? Often defined as "data about data," metadata is information about a specific file that’s often included within the file itself but that’s often not readily visible or modifiable to the end-user when z is viewing the file in the standard application that z would typically use to view the file. In other words, metadata provides background information about a file. Chances are that every document you create, every digital photograph you take, every music file you download, and so on, all have little bits of metadata which can leak vital information about your identity. ~ The dangers of metadata, 2008[1]

Metadata is collected by corporations for psychological manipulation -- persuasion or advertising.

Metadata also plays a number of important roles in computer forensics:

  • It can provide corroborating information about the document data itself.
  • It can reveal information that someone tried to hide, delete, or obscure.
  • It can be used to automatically correlate documents from different sources.

And the Snowden leaks (see timeline masters of the internet) revealed a massive surveillance program, including interception of email and other internet communications and phone call tapping. Upstream collection, Hemisphere and XKeyScore by way of wealthycluster2 gobble up our metadata, and with interconnected systems such as ICReach that data can be shared and associated with other data. There are dozens of clever analyses you can perform with such linked databases. I'm sure that is what they're doing right now. If I can think of it, so did they.

Some of it appears illegal, while other documents show the US spying on friendly nations during various international summits, and on its own citizens. The programs are enabled by two US laws, the Patriot Act and the FISA Amendments Act (FAA), and a side dish called Executive Order 12333. And it is not only the NSA and the other agencies from the Five Eyes countries: these techniques are being used by many countries to intimidate and control their populations.

Images

Photos contain hidden information, including the GPS coordinates of the location they were taken at, the date and time, camera shutter setting details, and possibly even the name of the program you used to edit them. This type of metadata can be useful, but you may want to remove it from your photos before sharing them online.

exiftool

ExifTool is a Perl program that can be used to read and edit Exif metadata in images.

$ exiftool imagename.jpg 
ExifTool Version Number         : 9.74
File Name                       : imagename.jpg
Directory                       : .
File Size                       : 1165 kB
File Modification Date/Time     : 2015:07:18 00:03:06+01:00
File Access Date/Time           : 2015:07:18 00:03:09+01:00
File Inode Change Date/Time     : 2015:07:18 00:03:07+01:00
File Permissions                : rw-r--r--
File Type                       : JPEG
MIME Type                       : image/jpeg
JFIF Version                    : 1.02
Exif Byte Order                 : Little-endian (Intel, II)
Orientation                     : Horizontal (normal)
X Resolution                    : 300
Y Resolution                    : 300
Resolution Unit                 : inches
Software                        : ViewNX 2.10 W
Modify Date                     : 2015:01:17 07:02:01
Y Cb Cr Positioning             : Centered
Exif Version                    : 0230
Components Configuration        : Y, Cb, Cr, -
Maker Note Version              : 2.11
Compression                     : JPEG (old-style)
Preview Image Start             : 3780
Preview Image Length            : 36139
Nikon Capture Version           : ViewNX 2.10 W
IFD0 Offset                     : 490
Preview IFD Offset              : 390
Flashpix Version                : 0100
Color Space                     : Uncalibrated
Exif Image Width                : 1920
Exif Image Height               : 1275
Thumbnail Offset                : 614
Thumbnail Length                : 3166
Image Width                     : 1920
Image Height                    : 1275
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Image Size                      : 1920x1275
Preview Image                   : (Binary data 36139 bytes, use -b option to extract)
Thumbnail Image                 : (Binary data 3166 bytes, use -b option to extract)

imagemagick

The mogrify command can be used to strip Exif data from images. Note that mogrify overwrites the original file in place. Here it is run on the same file as above, and the result is checked with exiftool again:

$ mogrify -strip imagename.jpg
$ exiftool imagename.jpg
ExifTool Version Number         : 9.74
File Name                       : imagename.jpg
Directory                       : .
File Size                       : 1111 kB
File Modification Date/Time     : 2015:07:18 00:07:14+01:00
File Access Date/Time           : 2015:07:18 00:07:17+01:00
File Inode Change Date/Time     : 2015:07:18 00:07:14+01:00
File Permissions                : rw-r--r--
File Type                       : JPEG
MIME Type                       : image/jpeg
JFIF Version                    : 1.01
Resolution Unit                 : inches
X Resolution                    : 300
Y Resolution                    : 300
Image Width                     : 1920
Image Height                    : 1275
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Image Size                      : 1920x1275

For removing Exif data from all jpg images in a directory and all of its subdirectories recursively (-print0 and -0 keep filenames that contain spaces intact):

$ find ./path/directory -type f -iname '*.jpg' -print0 | xargs -0 mogrify -strip

exiv2

The exiv2 tool http://www.exiv2.org/manpage.html also has a command for deleting all Exif data from an image:

$ exiv2 rm imagename.jpg

For removing Exif data from all jpg images in the current directory:

$ exiv2 rm *.jpg

For removing Exif data from all jpg images in a directory and all of its subdirectories recursively (-print0 and -0 keep filenames that contain spaces intact):

$ find ./path/directory -type f -iname '*.jpg' -print0 | xargs -0 exiv2 rm

mat

MAT is a toolbox composed of a GUI application, a CLI application and a library, to anonymise/remove metadata https://mat.boum.org/.

Mat1.png
Mat2.png

Documents

Document metadata is information about one or more aspects of a document, spreadsheet or PDF file that is not always visible to the person creating it, but can be found by the person who receives it next. Comments, tracked changes, hidden text, markups, properties, attachments and bookmarks are all examples of document metadata. Metadata removal software identifies and removes the metadata contained within a document so that it is not shared along with the document.

hexedit

  • Be sure to back up your data before editing any file with hexedit.
  • In general, switch to ASCII mode, turn off “read only” mode, and start searching through the file. For navigation and commands see http://linux.die.net/man/1/hexedit.

For example, when scrubbing PDFs, look for "created" (metadata appears in a PDF file more than once). When you find metadata, change it to fake data or delete it. Then repeat your search for the terms "create", "creation", "modified", and "modify", and similarly either replace or delete the dates, repeating each search so that multiple instances of a field can be located and modified or blanked out.

Hexedit.png
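
Before opening a file in a hex editor, it can help to know which of these terms actually occur in it. What follows is a minimal sketch, assuming a grep that supports -a and -o (GNU and BSD grep do). The function name pdf_metadata_terms and the extra moddate pattern are additions for this example. Metadata stored inside compressed streams will not show up this way.

```shell
#!/bin/sh
# pdf_metadata_terms FILE: count each distinct word in FILE that starts
# with "creat" or "modif" (or equals "moddate"), ignoring case.
# -a treats the binary file as text, -o prints only the matching part.
# (Hypothetical helper for illustration, not part of hexedit.)
pdf_metadata_terms() {
    LC_ALL=C grep -a -o -i -E 'creat[a-z]*|modif[a-z]*|moddate' "$1" |
        sort | uniq -c
}

# Usage: pdf_metadata_terms filename.pdf
```

Each line of output is a count followed by a term, which gives a checklist of strings to search for in hexedit.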

vi

You can also use vi as a hex editor. It isn’t a real "hex mode": vi’s buffer is simply streamed through the external program xxd, but it works well for some cases of scrubbing.

Open a file in vi as usual, hit escape and to switch into hex mode type:

:%!xxd

And when you're done, to exit from hex mode, hit escape again and type:

:%!xxd -r

pdftk and sed

For pdftk commands see http://linux.die.net/man/1/pdftk.

To look at the metadata that Adobe Reader does not show by default:

$ pdftk filename.pdf dump_data
Dump data.png

To alter the metadata first put the metadata in a file:

$ pdftk filename.pdf dump_data output pdf-metadata

Open the pdf-metadata file and remove the data you wish scrubbed:

Metadatafile.png

Save the pdf-metadata file. Now you can use that data to scrub the metadata from your file:

$ pdftk filename.pdf update_info pdf-metadata output filename-no-metadata.pdf

And check the result with:

$ pdftk filename-no-metadata.pdf dump_data
Scrubbeddumpdata.png

The creation date is gone and a new modification date has appeared. The iText producer string also gives away the use of pdftk. These infokeys can be removed with sed:

$ sed -i 's/iText 2\.1\.7 by 1T3XT//;s/D:20120409144213+02'\''00'\''//' filename-no-metadata.pdf

Each '\'' sequence breaks out of the single-quoted string, adds an escaped single quote, and then reopens the quoted string.

PdfID0 and PdfID1 are file identifiers. They are an MD5 hash of various info about the file, so that it has a unique string to identify the doc without having to use the filename. If you want to scrub those too, use sed as above. There are no geeky tricks needed for cleaning those two from the metadata with sed, it’s pretty straightforward.

mat

Mat3.png

Once any metadata is detected by mat, "State" will be marked as "Dirty". You can double click the file to see detected metadata and click on the "Clean" button to empty all private metadata fields from the file.

Mat4.png

You can also run mat from the command-line. Without options, the default action is to remove metadata from files.

To check files in a directory and its subdirectories:

$ mat -c .

To display the metadata detected in a file:

$ mat -d filename

To clean up all files and store the original files as '*.bak' files:

$ mat -b .

Removing cookies

Browser configuration

Cookies are not software. They can’t be programmed, can’t carry viruses, and can’t unleash malware to go wilding through your hard drive. But tracking cookies and especially third-party tracking cookies are commonly used as ways to compile long-term records of individuals’ browsing histories [2] [3] [4].

To refuse third-party cookies:

  • Firefox: Preferences > Privacy > Accept third-party cookies > Never.
  • Chrome (also chromium): Settings > Show advanced settings… > Content settings > Block third-party cookies and site data.

When you change the default cookie "lifetime" from "Keep until: they expire" to "Keep until: I close Firefox", Firefox changes any persistent cookies that sites set to session cookies. To allow a site to set a persistent cookie, you need to make an exception (site permission). For clearing cookies on exit:

  • Firefox: Tools > Options > Privacy > "Use custom settings for history" > Cookies: Keep until: "I close Firefox".
  • Chrome (also chromium): Settings > Show advanced settings ... > Content settings > Keep local data only until you quit your browser.

When you turn on the clearing of history at shutdown and include cookies, that runs a completely separate process which does not pay any attention to cookie lifetime or exceptions (site permissions). It just nukes them all. Note that some cookies might survive clearing at shutdown if they are encoded into your session history file, the one Firefox uses to restore your previous session windows and tabs.

Removing evercookie

And then there was "evercookie":

Evercookie is a javascript API available that produces extremely persistent cookies in a browser. Its goal is to identify a client even after they've removed standard cookies, Flash cookies (Local Shared Objects or LSOs), and others. Evercookie accomplishes this by storing the cookie data in several types of storage mechanisms that are available on the local browser. Additionally, if evercookie has found the user has removed any of the types of cookies in question, it recreates them using each mechanism available. [5]

With BleachBit you can delete evercookies in Firefox, Safari, and Google Chrome http://bleachbit.sourceforge.net/ (http://katana.oooninja.com/bleachbit/).

Search leakage

In most search engines, when you do a search and then click on a link, your search terms are sent to the site you clicked on (via the HTTP Referer header). This is called “search leakage.”
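
To make the leak concrete, here is a small sketch of what the receiving site can do with the address it is handed. The URL is a made-up example, the function name search_terms_from_referrer is hypothetical, and the sketch assumes the search engine puts the query in a q= parameter. Percent-encoded characters are left undecoded.

```shell
#!/bin/sh
# search_terms_from_referrer URL: print the value of the q= parameter,
# turning "+" back into spaces. This is roughly what a site could do
# with the referring address it receives from your browser.
# (Illustrative helper; the q= parameter name is an assumption.)
search_terms_from_referrer() {
    printf '%s\n' "$1" |
        sed -n 's/.*[?&]q=\([^&#]*\).*/\1/p' |
        tr '+' ' '
}

# Made-up referring address for illustration:
search_terms_from_referrer 'https://search.example/search?q=clinic+near+me&hl=en'
# prints: clinic near me
```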

Not only the site you intend to visit, but also the search engine gets data from your machine. For example, it is no secret that Google tracks user searches and online behavior, and Google often shares this information with governments that request it. Google usually follows the law, and does not comply with requests that do not meet legal requirements. See the Google Transparency Report http://www.google.com/transparencyreport/ for more on that. Its CEO Eric Schmidt has made plenty of controversial statements in the past https://en.wikipedia.org/wiki/Eric_Schmidt#Public_positions.

Smart referer

Search engines

Startpage https://www.startpage.com/ can also be combined with the Ixquick proxy. On the Startpage search results page, a ‘View by Ixquick Proxy’ option can be used to visit the search result with a proxy. Startpage has SSL and HTTPS add-ons for Mozilla Firefox.

Masquerading

Browser

When your browser requests a page from a web server, the browser sends information about itself along with the request in HTTP request headers. These headers include values to indicate browser type (Firefox, Opera, Mozilla, etc.), browser version, and underlying platform (Windows XP, Linux, Mac OS X, etc.). See http://www.ericgiguere.com/tools/http-header-viewer.html (javascript) for the data your browser sends. The web server then uses this information to select an appropriate page format for the browser, since different browsers (and even different versions of the same browser) have varying incompatibilities in their support for HTML and JavaScript.

Sometimes the web server misinterprets or fails to recognize this information and sends you an incorrectly formatted page. The server may even deny you access to its pages, whether for political reasons, because you're using a browser that the site disapproves of, or because its pages have only been tested with specific browser versions. The solution is to fool the server by having your browser masquerade as another browser.

Built into Firefox and Chrome are a number of "under the rug" settings, which can be changed to improve your privacy and anonymity when browsing. Most of the things we can do to obscure our identity however, can also make our sessions stand out as unusual (although it's harder to link the session to identity). Depending on your context, purpose, threat model, do not make yourself stand out like a "Big Red A-Team Tank Vehicle" on the internet highway. Really, browsers talk too much.

With Panopticlick you can test how rare or unique your browser configuration is, based on the information it will share with sites it visits. Panopticlick gives a uniqueness score, letting you see how easily identifiable you might be as you surf the web: https://panopticlick.eff.org/

Notes from Carsten:

  • The database is not a representative database, because most users, who know something about the project and visit it, use a privacy-friendly browser configuration.
  • Old entries in the database were not deleted. Firefox 3.5.3 has one of the best ratings in this database. But nobody uses this old browser version any more. You will be unique with this user agent in real life.
  • It is easy to manipulate the database. You can call the page with your preferred browser multiple times and your preferred browser will be higher rated.

About:config

If you use Firefox, use about:config to change some of the settings to non-unique nonsense. See http://kb.mozillazine.org/About:config and the helpful guide at https://www.bestvpn.com/blog/8499/make-firefox-secure-using-aboutconfig/

User-Agent Switcher

You can add your own user-agents, and even mimic being a webspider and switch between them. For a list of convincing user-agents see http://www.user-agents.org/index.shtml

Email

E-mail messages also contain a header containing the IP address of the sender as well as other information that may get attached to the header along the way, such as spam ratings that anti-spam software running on your e-mail server may apply to the message and other information added by the server. E-mail clients use this information to help identify spam messages [6].

According to RFC 821 and its successor RFC 5321, an e-mail client is to identify itself in the HELO/EHLO command http://www.samlogic.net/articles/smtp-commands-reference.htm with its domain name or, failing that, its IP address. Masquerading ensues.

thunderbird

To view the header information in a Thunderbird e-mail message, select the message, then click on the View menu and select Headers > All. The header information for the message will replace the message in the Thunderbird window.

  • The Return-path is allegedly the e-mail address of the sender, although that is not a reliable method for identifying the sender because most spammers use any return e-mail address that they can find on spammers' lists.
  • The Received line is a bit more reliable, because that contains the IP address of the location from where the spam message was sent. That is, of course, unless the spammer hacked into an e-mail server or is using a relay server to disguise the true source of the message.
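
The Received lines can also be pulled out of a saved message from the command line. This is a minimal sketch, assuming the message is stored as a plain-text file with the headers first and that relays record IPv4 addresses in square brackets, as most do. The function name received_ips is made up for this example, and IPv6 addresses are ignored.

```shell
#!/bin/sh
# received_ips FILE: print the bracketed IPv4 addresses found in the
# Received: headers of a saved e-mail message, top (most recent) first.
# (Hypothetical helper for illustration.)
received_ips() {
    # 1. keep only the header block (stop at the first empty line)
    # 2. join folded continuation lines onto the header they belong to
    # 3. keep the Received: headers and extract [a.b.c.d] from them
    sed '/^$/q' "$1" |
        awk '/^[ \t]/ { line = line " " $0; next }
             { if (line != "") print line; line = $0 }
             END { if (line != "") print line }' |
        grep -i '^Received:' |
        grep -o -E '\[[0-9]{1,3}(\.[0-9]{1,3}){3}\]' |
        sed 's/^\[//; s/\]$//'
}

# Usage: received_ips message.eml
```

The last address printed usually belongs to the hop closest to the sender, with the same caveats about relays and forged headers as above.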

Technically the IP address value in the hello string is not relevant for sending/receiving the mail, but because it might be used for spam scoring or simply out of courtesy I recommend entering a valid IP/hostname, like 127.0.0.1.

To change it globally (for all accounts) in Thunderbird, go to Edit > Preferences > Advanced > Config Editor:

Helo1.png

Create (or edit) the entry named "mail.smtpserver.default.hello_argument" (If you need to create it, use right-click > New > String):

Helo2.png

Change the value to the desired IP or hostname (FQDN):

Helo3.png

To change it per SMTP server create (or edit) the entry named mail.smtpserver.smtp<number>.hello_argument where <number> is the ID for the SMTP server you would like to apply the setting to. Type "mail.smtpserver.smtp" in the search box up top in the Thunderbird configuration editor to see which ones are available and which ID they have. If you need to create the entry, use right-click > New > String.

TorBirdy is a plugin for Thunderbird. It tries to anonymize your connection (you need to have tor installed, see installing and configuring tor) and deletes and changes several information fields: https://trac.torproject.org/projects/tor/wiki/torbirdy/changes. Even if you do not install tor and TorBirdy, the list has some excellent examples of other strings you may wish to change in the Config Editor.

TorBirdy enforces the preferences it sets. Attempts to change them using Thunderbird's settings or the configuration editor will not work, as all such changes are discarded when Thunderbird restarts. This is because the tor project believes that these preferences should not be changed, whether deliberately, by mistake, or due to another extension, as doing so can compromise your anonymity. There are however some preferences that can be changed and they can be accessed through TorBirdy's preferences dialog. Please note that if you are not an advanced user, you should NOT change any setting unless you are very sure of what you are doing. The preferences that TorBirdy changes are restored to their original values when it is uninstalled or disabled.

mutt

Mutt doesn't really speak SMTP, see the mutt mail concept http://dev.mutt.org/trac/wiki/MailConcept, and although a minimal SMTP is provided these days, the preferred way is still sending via a Mail Transfer Agent (MTA) such as sendmail, exim or postfix.

postfix

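Open /etc/postfix/main.cf with your favourite editor (for example # vi /etc/postfix/main.cf) and set the name postfix announces in its HELO/EHLO greeting. Note that smtp_bind_address does not change the greeting: it sets the source address of outgoing connections, and binding it to 127.0.0.1 would stop delivery to remote servers. The parameter for the greeting is smtp_helo_name:

smtp_helo_name = [127.0.0.1]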

Restart postfix:

# /etc/init.d/postfix restart

Clean up your language

If you are blogging, mind authorship analysis. Have a bar of soap, or learn about Wordsmithing.

Freeing up disk space

apt

command-line

To delete downloaded packages (.deb) already installed (and no longer needed):

$ sudo apt-get clean

To remove all stored archives in your cache for packages that can not be downloaded anymore (thus packages that are no longer in the repository or that have a newer version in the repository):

$ sudo apt-get autoclean

To remove unnecessary packages (After uninstalling an app there could be packages you don't need anymore):

$ sudo apt-get autoremove

To delete old kernel versions:

$ sudo apt-get remove --purge linux-image-X.X.XX-XX-generic

If you don't know which kernel version to remove:

$ dpkg --get-selections | grep linux-image
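
To avoid removing the kernel you are currently running, the list can be compared against the output of uname -r. This is a small sketch, assuming Debian-style package names that begin with linux-image- followed by a version number. The function name old_kernel_packages is made up for this example, and it only prints candidates, it removes nothing.

```shell
#!/bin/sh
# old_kernel_packages [RUNNING]: read `dpkg --get-selections` output on
# stdin and print installed linux-image packages whose name does not
# contain the running kernel release (default: output of uname -r).
# (Hypothetical helper for illustration; review the list before removing.)
old_kernel_packages() {
    running=${1:-$(uname -r)}
    awk '$2 == "install" && $1 ~ /^linux-image-[0-9]/ { print $1 }' |
        grep -v -F -- "$running"
}

# Usage: dpkg --get-selections | old_kernel_packages
```

Meta-packages such as linux-image-generic are skipped on purpose, since removing them would stop future kernel updates.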

bleachbit

Bleachbit-apt.png

Shredding files and deleting data

Even when you erase everything on your hard disk, sometimes it is possible to recover (pieces of) data with forensics software and/or hardware. If that data is confidential, delete files and data securely so that no-one will recover them. Solid State Drives (SSD) may have introduced dramatic changes to the principles of computer forensics ...

When encrypting and compressing files, clear-text versions that existed before you compressed or encrypted the file, and clear-text copies that are created after you decrypt or decompress it, remain on your hard drive unless you purge (not just delete) them. There may also be "temp" files left behind.

Echoes of your personal data (swap files, temp files, hibernation files, erased files, browser artifacts, etc.) are likely to remain on any computer that you use to access (encrypted) data. Extracting those echoes is a trivial task: a hidden access trap. Purge, not just delete, those echoes.

Shredding files

shred

Linux, FreeBSD and many other *nix systems come with a command line tool called shred installed. The shred command can be useful for destroying files so that their contents are very difficult to recover, even using high-sensitivity data recovery equipment. It repeatedly overwrites the data with random data and, when asked to remove a file, renames it before deleting it. When used without options, current GNU coreutils versions of shred overwrite the given files or devices 3 times (older versions made 25 passes); the -n option sets the number of passes. A device can be a partition or an entire HDD, USB key drive, etc.

$ shred [option(s)] file(s)_or_devices(s)

For example

$ shred filename1 filename2

will shred both files, and

$ shred /dev/hda4

will shred the fourth partition on the first HDD.

By default, shred does not delete files or partitions after overwriting them. Overwritten files can be deleted by using the -u option.

$ shred -u filename1 filename2

This both frees up the disk space for other data and makes it harder to reconstruct the shredded data.

Shred relies on the assumption that the filesystem overwrites data in place. But journaling filesystems like Ext3 and ReiserFS, RAID-based filesystems, compressed filesystems, and filesystems that cache data in temporary locations do not satisfy this assumption. In addition, copies of files can be retained in filesystem backups and on remote mirrors. Shredding partitions is therefore more reliable than shredding files.

And even when shredding partitions, most HDDs map out bad sectors invisibly to application programs and utilities, and that includes shred. Sensitive data in such sectors will not be destroyed by shred.

Making deleted data hard to recover

dd

A hack that might work is to write zeroes or random data to a file on the drive until it fills up all of the available space, then delete it:

$ dd if=/dev/urandom of=/path/filename1

Then delete:

$ rm /path/filename1

This also works on partitions:

$ dd if=/dev/urandom of=/dev/sda4

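Then delete the partition (fdisk works on the whole disk, not on the partition device, and w writes the change):

$ fdisk /dev/sda
Command (m for help): d
Partition number (1-4): 4
Command (m for help): w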

Permanently delete files (including data in RAM or swap)

secure-delete tools

The Secure-Delete package comes with four commands:

srm 	 	Secure remove; used for deleting files or directories currently on your hard disk;
smem 	 	Secure memory wiper; used to wipe traces of data from your computer’s memory (RAM);
sfill 	 	Secure free space wiper; used to wipe all traces of data from the free space on your disk;
sswap 	 	Secure swap wiper; used to wipe all traces of data from your swap partition.

srm (secure remove) is a more advanced version of the “shred” command. It uses a combination of random data, zeros, and special values developed by cryptographer Peter Gutmann to make files irrecoverable. The shred tool allows you to specify the number of passes and the secure-delete tools use a default of 38 passes. It will also assign a random value for the filename, hiding that key piece of evidence:

$ srm filename

Removing a directory and all its subdirectories (recursive):

$ srm -r directory/

smem (secure memory wipe) removes residual traces of data that remain in memory. It is relatively easy for someone with the right tools to figure out what you had stored in RAM, which may be the contents of important files, internet activity, or whatever else it is you do with your computer. smem is slow. There are options to speed things up, but they increase risk by performing fewer overwrite passes.

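Invoke with (Debian-based systems install this command as sdmem, because smem is also the name of an unrelated memory-reporting tool):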

$ smem

sfill (secure free space wipe) wipes all the free space on your disk where past files have existed. This is particularly useful if you are getting rid of a hard disk for good; you can boot a LiveCD, delete everything on the disk, and then use sfill to make sure that nothing is recoverable (as root):

# sfill mountpoint/

NOTE: If you have /home/ on a separate partition and you try /home/hilarious/mistake as mountpoint, sfill will happily wipe the freespace on which the mistake directory resides (the entire /home/ partition).

sswap (secure swap wipe) wipes swap partitions. Swap partitions store data of running programs when RAM is filled up.

Find your mounted swap devices by running:

$ cat /proc/swaps

Or look in your /etc/fstab file for filesystems of type swap. It can be /dev/sda5 or /dev/dm-1, etc.
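
If there is more than one swap area, the device names can be read from /proc/swaps by a script. This is a minimal sketch, assuming the usual /proc/swaps layout with one header line and the device name in the first column. The function names swap_devices and swap_wipe_commands are made up for this example, and the second function only prints the commands described in this section so they can be reviewed before being run as root.

```shell
#!/bin/sh
# swap_devices [FILE]: print the name of each active swap area, skipping
# the header line. FILE defaults to /proc/swaps.
# (Hypothetical helper for illustration.)
swap_devices() {
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# swap_wipe_commands [FILE]: print, without running them, the
# swapoff / sswap / swapon sequence for each swap area.
swap_wipe_commands() {
    swap_devices "$@" | while read -r dev; do
        printf 'swapoff %s && sswap %s && swapon %s\n' "$dev" "$dev" "$dev"
    done
}

# Usage: swap_wipe_commands
```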

Disable the swap partition:

$ sudo swapoff /dev/sda6

Wipe:

$ sudo sswap /dev/sda6

Re-enable swap:

$ sudo swapon /dev/sda6

bleachbit

Bleachbit-system.jpg

Removing malware

And then of course, there is the possibility of people having visited without explicit invitation, without explicit consent, who may have left things lying about in odd places. And burglars leaving a payload or two to maintain access for continued pillaging and plundering of your private space. Or you may have downloaded something nasty somewhere.

A computer virus or a computer worm is a malicious software program that can self-replicate on computers or via computer networks – without you being aware that your machine has become infected. Because each subsequent copy of the virus or computer worm can also self-replicate, infections can spread very rapidly.

The word keylogger describes the program’s function. A keylogger can be software or hardware. Hardware keyloggers used to be relatively rare, but their numbers are growing rapidly.

There is a lot of legitimate keylogging software designed to allow administrators to track activities. The boundary between “justified monitoring” and “espionage” is a fine line: Parental control, jealous partners, company security, control over employees, government surveillance contractors, security services … it is a huge market. Keylogging software and devices are also popular for stealing passwords, user data relating to online payment systems, and data useful for social profiling. Virus writers are constantly writing new keylogger Trojans for all of these purposes.

Keylogger basics

The idea behind keyloggers is to get in between any two links in the chain of events between when a key is pressed and when information about that keystroke is displayed on the monitor.

This can be achieved using video surveillance, a hardware bug in the keyboard, wiring or the computer itself, intercepting input/output, substituting the keyboard driver, the filter driver in the keyboard stack, intercepting kernel functions by any means possible (substituting addresses in system tables, splicing function code, etc.), intercepting functions in user mode, and requesting information from the keyboard using standard documented methods.

More advanced keyloggers can intercept data from wireless keyboards, and even collect and decipher the electromagnetic radiation or electrical signals given off by a keyboard. Hardware keyloggers are small devices that can be fixed to the keyboard, or placed within a cable or the computer itself. Software keyloggers are dedicated programs designed to track and log keystrokes.

Linux keyloggers

A kernel module can pick up the input directly from the keyboard and catch everything. And for some linux versions, it is “kernel modules everywhere!” Still, such keyloggers aren’t exactly easy to install on machines. They require physical access. The keylogger must be downloaded and manually set to executable, or extracted from an archive that stored the permissions, and manually run (at least the first time). Changing start-up will require root permissions, which would have to be either socially engineered or gained through some type of kernel exploit.

If your system has been compromised at the root level, then the attacker can hide a keylogger from detection by linking in a custom kernel module that intercepts system calls that might lead to its detection at the kernel level. This requires compiling the attack code for each and every current kernel.

Backdoors

A backdoor in a computer system (or cryptosystem or algorithm) is a method of bypassing normal authentication, securing unauthorized remote access to a computer, obtaining access to plaintext, and so on, while attempting to remain undetected. The backdoor may take the form of an installed program or may subvert the system through a rootkit.

A backdoor Trojan gives malicious users remote control over the infected computer. It enables its author to do anything they wish on the infected computer – including sending, receiving, launching and deleting files, displaying data and rebooting the computer. Backdoor Trojans are often used to unite a group of victim computers to form a botnet or zombie network that can be used for criminal purposes. Unlike computer viruses and worms, Trojans are not able to self-replicate.

As if it were not enough that intelligence agencies intercept and spy on our email, phone calls, bank and credit card transactions, and other communications, the NSA actually intercepts, i.e., hijacks, computers in transit before they reach their destination.

Finfisher and linux

I haven't seen any reports yet of it being ported to "pure" *nix, but it has already been ported to Apple and Android. And while neither is "really linux", they are very close. It probably wouldn't be too much trouble for someone to port it to *nix.

The good news is that it is a Trojan and requires manual action on the part of the user to infect a system. Meaning we can take control and influence the odds. If we keep our wits about us regarding what we're installing and where we're getting it from, then the likelihood of any kind of infection on a *nix machine is very, very low. (Pretty close to, but not quite, zero.)

There have been a few isolated cases of reputable linux software being compromised at the server level, modified, and then being downloaded by people. But those have been cases where maintainers could have set up their server security/repositories/signing keys better, and it was fixed as soon as it was found. Afaik, in all but one case [7] that happened within hours.

The key signing procedures set up by the major distros avoid these types of problems by using multilevel authentication factors for allowing new/modified packages to be uploaded to the repositories or downloaded and installed from the repositories to your machine (see verification with checksums). Nothing is impossible to break, but this setup is extremely hard to break using contemporary concepts and technologies.

How did the coodies get in the house?

Phishing & email

Depending on your email provider, simply opening an email message shouldn’t infect you as you haven’t executed any code, yet. Opening an attachment very well might, if it’s infected. So one typical infection vector is phishing, which is designed to trick an email recipient into opening a malicious executable.

According to The Washington Post, the FBI uses this technique for infecting a system too. Supposedly the bureau uses it sparingly – in part to keep references to the capability out of news stories – and only after obtaining permission from a judge (which has not always been granted).

Some keyloggers have a feature to send emails to the attacker and/or to email addresses in your address book.

Browsing

Keyloggers can be installed via a web page script which exploits a browser vulnerability. The program will automatically be launched when a user visits an infected site. Compromising a browser is relatively easy and it is cross-platform, hence an often chosen target. See linux security: safer browsing.

Downloading and torrenting

A noted approach is to put malware up on sites labeled as something people want, such as “My caek loves YOU! 2 DVDRip.avi”. When downloading, it’s an avi, but it’s also bound with malware. And the caek isn’t real to begin with. On windows, binding malware to an avi file isn’t even necessary, because file extensions are hidden by default, so they can just name it file.avi.exe and it will show up as file.avi. The caek is still not real and happiness is not just around the corner.
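The double-extension trick can be spotted from a shell, where nothing is hidden. A minimal sketch, assuming a hypothetical helper name (has_double_extension) and two illustrative, far from complete, extension lists:

```shell
# has_double_extension NAME
# Succeeds (exit 0) when NAME ends in a media/document extension followed by
# an executable one, e.g. "file.avi.exe". The helper name and both extension
# lists are assumptions for illustration only.
has_double_extension() {
    printf '%s\n' "$1" |
        grep -Eiq '\.(avi|mp4|mkv|mp3|jpg|pdf)\.(exe|scr|bat|com|pif)$'
}

# Example use on a download folder (path is an assumption):
#   for f in "$HOME"/Downloads/*; do
#       has_double_extension "$f" && echo "suspicious: $f"
#   done
```

A clean result proves nothing about the file itself; it only rules out this one naming trick.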

Installation of applications and updates

Happens. So use the checksums. Checksums are used to ensure the integrity of data during transmission or storage. A checksum is a simple error-detection scheme in which each transmitted message is accompanied by a numerical value computed from the contents of the message. The receiving station then applies the same formula to the message and checks to make sure the accompanying numerical value is the same. If not, the receiver can assume that the message has been garbled (or was altered). Simple checksums only catch accidental garbling; catching deliberate alteration takes a cryptographic hash such as SHA-256, obtained from a source you trust. For more on its concepts see encrypting everything: checksums, and for its applications in linux see linux applications: verifying checksums.
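As a rough sketch of the compare step, assuming the download site publishes a SHA-256 value and that sha256sum from coreutils is installed (the helper name verify_sha256 is made up for this example):

```shell
# verify_sha256 FILE EXPECTED_HEX
# Succeeds (exit 0) only when the SHA-256 of FILE equals EXPECTED_HEX.
# The helper name is hypothetical; sha256sum itself ships with GNU coreutils.
verify_sha256() {
    actual=$(sha256sum "$1" 2>/dev/null | awk '{print $1}')
    expected=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    # An unreadable file gives an empty hash, which must never count as a match.
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example use (file name and hash are placeholders):
#   verify_sha256 some-download.iso "<hash published by the project>" \
#       && echo "checksum OK" || echo "checksum MISMATCH, do not install"
```

When a project publishes a whole checksum file, `sha256sum -c` on that file does the same comparison for every listed download. Either way the check is only as good as the place the expected value came from.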

I think my machine is infected with the coodies. Now what?

Overcome any embarrassment. Getting coodies can happen to the best. Plus embarrassment doesn't work, it lacks the head, hands and keyboard.

Check for new accounts

Check /etc/passwd for new accounts. New accounts you don't recognise with a UID below 500 are especially suspicious (on many current distros system accounts run up to UID 999). If a new account with a UID of 0 is in the list, definitely check it out. Also look for orphaned files, which indicate an account that has been deleted (may take a while):

$ sudo find / -nouser -print
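The /etc/passwd check itself can be sketched with awk. This is only an illustration: the function name and the output labels are invented here, and the UID limit of 500 follows the text above (pass 1000 as second argument on distros that start regular users there).

```shell
# list_suspect_accounts [PASSWD_FILE] [UID_LIMIT]
# Prints "EXTRA-UID-0 name" for every UID 0 account that is not root, and
# "system uid name" for every account below UID_LIMIT, so the list can be
# compared against what a clean install of your distro has.
# Function name and labels are assumptions for this sketch.
list_suspect_accounts() {
    awk -F: -v limit="${2:-500}" '
        $3 == 0 && $1 != "root" { print "EXTRA-UID-0", $1; next }
        $3 + 0 < limit          { print "system", $3, $1 }
    ' "${1:-/etc/passwd}"
}

# Example use, on the running system or on a disk mounted from a live-CD
# (mount point is an assumption):
#   list_suspect_accounts
#   list_suspect_accounts /mnt/suspect/etc/passwd 1000
```

Most "system" lines will be perfectly normal daemon accounts; the point is to notice the one that wasn't there before.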

Take a look at your processes

Take a look at your processes with ps aux, htop or pstree for "unusual" processes.

If you are new to linux and your install took care of running most of the programs it can be hard to know what’s really supposed to be running vs. what’s not supposed to be there. Plus some of the best rootkits hide from such checks. If you installed intrusion detection software, you may find some clues in its reports.

Use rootkit scanners for additional information

Boot the machine from a known safe live-CD image and scan for suspicious files with rkhunter and chkrootkit.

Check crontab

Check the crontabs: the keylogger may be relaunched regularly in case it is shut off or the system reboots.
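One way to read all cron entries in one go, sketched under the assumption that the usual locations apply (/etc/crontab, /etc/cron.d, and /var/spool/cron or /var/spool/cron/crontabs depending on the distro). The function name is made up for the example.

```shell
# list_cron_entries PATH...
# Prints every line that is neither blank nor a comment from all files found
# under the given paths, prefixed with the file it came from.
# The function name is an assumption; paths that do not exist are skipped.
list_cron_entries() {
    for path in "$@"; do
        [ -e "$path" ] || continue
        find "$path" -type f -exec grep -HvE '^[[:space:]]*(#|$)' {} +
    done
}

# Example use from a root shell (user crontabs are not world-readable):
#   list_cron_entries /etc/crontab /etc/cron.d /var/spool/cron
# For only your own entries, `crontab -l` is enough.
```

Look for entries that run something from odd places such as /tmp, a hidden directory or a home directory.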

Check email streams

Look at the email streams. If programs start sending email to an IP address other than that of your email server, something fishy might be going on. It can of course also be legitimate. Also consider that a keylogger might just log in to your email server with valid credentials and email from there.
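A quick way to see who is talking to mail servers right now is to filter the output of `ss -tnp` on the usual mail submission ports. A minimal sketch; the function name is invented, and the port list (25, 465, 587) is an assumption that will miss mail sent over anything else, webmail over HTTPS included.

```shell
# smtp_peers
# Reads `ss -tnp` output on stdin and prints only the connections whose peer
# port is 25, 465 or 587. Column 5 of that output is the peer address:port.
# Function name and port list are assumptions for this sketch.
smtp_peers() {
    awk '$5 ~ /:(25|465|587)$/'
}

# Example use (sudo so that process names of other users show up):
#   sudo ss -tnp | smtp_peers
```

Any peer address that is not your own mail server, or any process name that is not your mail client, deserves a closer look.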

Trick

A trick to try if you suspect a keylogger is present:

  1. Type a random unique string on your keyboard in the live running machine.
  2. Reboot the machine from a known LiveCD and grep for that string.
  3. Find out where the string is stored, and you may have the temp file of the keylogger.
  4. Check the folder it is in, and check the folders upward in the tree.
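Step 2 of the trick could look roughly like this once the suspect disk is mounted from the LiveCD. The function name and the mount point are assumptions, and the search only finds logs stored as plain text; a keylogger that encodes or encrypts its log will not show up.

```shell
# find_marker MARKER ROOT
# Lists every file under ROOT that contains MARKER as a literal string.
# -r recurses, -l prints file names only, -F disables regex interpretation,
# and -- keeps a marker starting with "-" from being read as an option.
# The function name is hypothetical.
find_marker() {
    grep -rlF -- "$1" "$2" 2>/dev/null
}

# Example use, with the suspect root filesystem mounted read-only
# (device name and mount point are assumptions):
#   sudo mount -o ro /dev/sdXN /mnt/suspect
#   sudo sh -c 'grep -rlF -- "the-string-you-typed" /mnt/suspect'
```

Expect some legitimate hits too, for instance shell history or editor swap files if the string was typed into a terminal or a document.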

Hit the mailinglists and forums

The above steps may minimally prepare you for receiving optimal help on forums and mailinglists or from us. You can find us on IRC.

If all else fails

  • If a VM was infected, revert back to an earlier snapshot that you think was clean or import an earlier exported ova of it. This does not help if the host was reached, of course.
  • A non-VM machine can be reinstalled. To make this experience an enjoyable event if and when it happens, keep regular backups of your data, and keep that personal information on an external disk that is still totally clean. We laugh in the face of coodies and their masters!

It is possible for malware to persist across a re-format and re-install, if it is sufficiently ingenious and sophisticated: e.g., it can persist in the BIOS, in the firmware for peripherals (some hardware devices have firmware that can be updated, and thus could be updated with malicious firmware), or with a virus infecting data files on removable storage or on your backups. However, *most* malware doesn't do anything quite this nasty. Therefore, while there are no guarantees, re-formatting and re-installing should get rid of almost all malware you're likely to encounter in the wild. But not all. Targeted surveillance and spy vs spy games are another game entirely.

This page continues in reverse engineering, really more of a montessori type pet peeve hack project to dive deeper in this direction.

Resources

News & watchdogs

Finfisher

Related

References

  1. The dangers of metadata, 2008 http://www.textfiles.com/uploads/diz-usp3.txt
  2. Recipes for Cookies: How Institutions Shape Communication Technologies http://papers.ssrn.com/sol3/papers.cfm?abstract_id=565041
  3. Thoughts on Mozilla and Privacy http://paranoia.dubfire.net/2010/12/thoughts-on-mozilla-and-privacy.html
  4. What about the "EU Cookie Directive" http://webcookies.org/faq/#Directive
  5. Evercookie http://samy.pl/evercookie/
  6. Spam Filtering for Mail Exchangers http://www.tldp.org/HOWTO/Spam-Filtering-for-MX/techniques.html
  7. Linux infection proves Windows malware monopoly is over; Gentoo ships backdoor? http://www.zdnet.com/article/linux-infection-proves-windows-malware-monopoly-is-over-gentoo-ships-backdoor-updated/