More L337 Translations

Dave Taylor Thu, 04/19/2018 - 09:20

Dave continues with his shell-script L33t translator.

In my last article, I talked about the inside jargon of hackers and computer geeks known as "Leet Speak" or just "Leet". Of course, that's a shortened version of the word Elite, and it's best written as L33T or perhaps L337 to be ultimately kewl. But hey, I don't judge.

Last time I looked at a series of simple letter substitutions that allow you to convert a sentence like "I am a master hacker with great skills" into something like this:


I AM A M@ST3R H@XR WITH GR3@T SKILLZ

It turns out that I missed some nuances of Leet and didn't realize that most often the letter "a" is actually turned into a "4", not an "@", although as with just about everything about the jargon, it's somewhat random.

In fact, every single letter of the alphabet can be randomly tweaked and changed, sometimes from a single letter to a sequence of two or three symbols. For example, another variation on "a" is "/-\" (for what are hopefully visually obvious reasons).

Continuing in that vein, "B" can become "|3", "C" can become "[", "I" can become "1", and one of my favorites, "M" can change into "[]V[]". That's a lot of work, but since one of the goals is to have a language no one else understands, I get it.
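Before getting to the script itself, here's a quick standalone taste (not part of the final script) of how a few of these multi-character swaps look as sed substitutions; this one turns "become" into "|3e[o[]V[]e":


echo "become" | sed 's/b/|3/g; s/c/[/g; s/m/[]V[]/g'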

There are additional substitutions: a word can have its trailing "S" replaced by a "Z", a trailing "ED" can become "'D" or just "D", and another interesting one is that words containing "and", "anned" or "ant" can have that sequence replaced by an ampersand (&).

Let's add all these L337 filters and see how the script is shaping up.

But First, Some Randomness

Since many of these transformations are going to have a random element, let's go ahead and produce a random number between 0 and 9 to decide whether to apply a given transformation. That's easily done with the $RANDOM variable:


doit=$(( $RANDOM % 10 ))       # random virtual coin flip

Now let's say that there's a 50% chance that a -ed suffix is going to change to "'D" and a 50% chance that it's just going to become "D", which is coded like this:


if [ $doit -ge 5 ] ;  then
  word="$(echo $word | sed "s/ed$/d/")"
else
  word="$(echo $word | sed "s/ed$/'d/")"
fi

Let's add the additional transformations, but not do them every time. Let's give each one a 60–90% chance of occurring, depending on the transform itself. Here are a few examples:


if [ $doit -ge 3 ] ;  then
  word="$(echo $word | sed "s/cks/x/g;s/cke/x/g")"
fi

if [ $doit -ge 4 ] ;  then
  word="$(echo $word | sed "s/and/\&/g;s/anned/\&/g;
     s/ant/\&/g")"
fi
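
The trailing-S rule mentioned earlier follows the same pattern. Here's a minimal sketch of how it might look, again keyed off the random value (this particular snippet isn't in the final script):


if [ $doit -ge 4 ] ;  then
  word="$(echo $word | sed "s/s$/z/")"
fi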

And so, here's the second translation, a bit more sophisticated:


$ l33t.sh "banned? whatever. elite hacker, not scriptie."
B&? WH4T3V3R. 3LIT3 H4XR, N0T SCRIPTI3.

Note that it hasn't realized that "elite" should become L337 or L33T, but since it is supposed to be rather random, let's just leave this script as is. Kk? Kewl.

If you want to expand it, an interesting programming problem is to break each word down into individual letters, then randomly change lowercase to uppercase or vice versa, so you get those great ransom-note-style WeiRD LeTtEr pHrASes.
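Here's one rough sketch of how you might approach that, processing a sample word ("hacker" is just a placeholder) one character at a time and flipping a virtual coin for each letter:


word="hacker"
newword=""
for (( i=0; i<${#word}; i++ )); do
  letter="${word:$i:1}"                    # grab a single character
  if [ $(( RANDOM % 2 )) -eq 1 ]; then     # 50/50 virtual coin flip
    letter="$(echo "$letter" | tr '[:lower:]' '[:upper:]')"
  else
    letter="$(echo "$letter" | tr '[:upper:]' '[:lower:]')"
  fi
  newword="$newword$letter"
done
echo "$newword"                            # e.g., "hAcKeR"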

Next time, I plan to move on, however, and look at the great command-line tool youtube-dl, exploring how to use it to download videos and even just the audio tracks as MP3 files.

Help Canonical Test GNOME Patches, Android Apps Illegally Tracking Kids, MySQL 8.0 Released and More

News briefs for April 19, 2018.

Help Canonical test the GNOME desktop memory leak fixes in Ubuntu 18.04 LTS (Bionic Beaver) by downloading and installing the current daily ISO for your hardware from here: http://cdimage.ubuntu.com/daily-live/current/bionic-desktop-amd64.iso. Then download the patched version of gjs, install, reboot, and then just use your desktop normally. If performance seems impacted by the new packages, re-install from the ISO again, but don't install the new packages and see if things are better. See the Ubuntu Community page for more detailed instructions.

Thousands of Android apps downloaded from the Google Play store may be tracking kids' data illegally, according to a new study. NBC News reports: "Researchers at the University of California's International Computer Science Institute analyzed 5,855 of the most downloaded kids apps, concluding that most of them 'are potentially in violation' of the Children's Online Privacy Protection Act of 1998, or COPPA, a federal law making it illegal to collect personally identifiable data on children under 13."

MySQL 8.0 has been released. This new version "includes significant performance, security and developer productivity improvements enabling the next generation of web, mobile, embedded and Cloud applications." MySQL 8.0 features include MySQL document store, transactional data dictionary, SQL roles, default to utf8mb4 and more. See the white paper for all the details.

KDE announced this morning that KDE Applications 18.04.0 are now available. New features include improvements to panels in the Dolphin file manager; Wayland support for KDE's JuK music player; improvements to Gwenview, KDE's image viewer and organizer; and more.

Collabora Productivity, "the driving force behind putting LibreOffice in the cloud", announced a new release of its enterprise-ready cloud document suite—Collabora Online 3.2. The new release includes implemented chart creation, data validation in Calc, context menu spell-checking and more.

An Update on Linux Journal

Carlie Fairchild Wed, 04/18/2018 - 12:41

So many of you have asked how to help Linux Journal continue to be published* for years to come.

First, keep the great ideas coming—we all want to continue making Linux Journal 2.0 something special, and we need this community to do it.

Second, subscribe or renew. Magazines have a built-in fundraising program: subscriptions. It's true that most magazines don't survive on subscription revenue alone, but having a strong subscriber base tells Linux Journal, prospective authors, and yes, advertisers, that there is a community of people who support and read the magazine each month.

Third, if you prefer reading articles on our website, consider becoming a Patron. We have different Patreon reward levels; one even gets your name immortalized in the pages of Linux Journal.

Fourth, spread the word within your company about corporate sponsorship of Linux Journal. We as a community reject tracking, but we explicitly invite high-value advertising that sponsors the magazine and values readers. This is new and unique in online publishing, and just one example of our pioneering work here at Linux Journal.  

Finally, write for us! We are always looking for new writers, especially now that we are publishing more articles more often.  

With all our gratitude,

Your friends at Linux Journal

 

*We'd be remiss not to acknowledge and thank Private Internet Access for saving the day and bringing Linux Journal back from the dead. They are incredibly supportive partners, and sincerely, we cannot thank them enough for keeping us going. At a certain point, however, Linux Journal has to become sustainable on its own.

Rise of the Tomb Raider Comes to Linux Tomorrow, IoT Developers Survey, New Zulip Release and More

News briefs for April 18, 2018.

Rise of the Tomb Raider: 20 Year Celebration comes to Linux tomorrow! A minisite dedicated to Rise of the Tomb Raider is available now from Feral Interactive, and you also can view the trailer on Feral's YouTube channel.

Zulip, the open-source team chat software, has announced the release of Zulip Server 1.8. This is a huge release, with more than 3500 new commits since the last release in October 2017. Zulip "is an alternative to Slack, HipChat, and IRC. Zulip combines the immediacy of chat with the asynchronous efficiency of email-style threading, and is 100% free and open-source software".

The IoT Developers Survey 2018 is now available. The survey was sponsored by the Eclipse IoT Working Group, Agile IoT, IEEE and the Open Mobile Alliance "to better understand how developers are building IoT solutions". The survey covers what people are building, key IoT concerns, top IoT programming languages and distros, and more.

Google released Chrome 66 to its stable channel for desktop/mobile users. This release includes many security improvements as well as new JavaScript APIs. See the Chrome Platform Status site for details.

openSUSE Leap 15 is scheduled for release May 25, 2018. Leap 15 "shares a common core with SUSE Linux Enterprise (SLE) 15 sources and has thousands of community packages on top to meet the needs of professional and semi-professional users and their workloads."

GIMP 2.10.0 RC 2 has been released. This release fixes 44 bugs and introduces important performance improvements. See the complete list of changes here.

Create Dynamic Wallpaper with a Bash Script

Patrick Wheelan Wed, 04/18/2018 - 09:58

Harness the power of bash and learn how to scrape websites for exciting new images every morning.

So, you want a cool dynamic desktop wallpaper without dodgy programs and a million viruses? The good news is, this is Linux, and anything is possible. I started this project because I was bored of my standard OS desktop wallpaper, and I have slowly created a plethora of scripts to pull images from several sites and set them as my desktop background. It's a nice little addition to my day—being greeted by a different cat picture or a panorama of a country I didn't know existed. The great news is that it's easy to do, so let's get started.

Why Bash?

Bash (the Bourne Again SHell) is standard across almost all *NIX systems and provides a wide range of operations "out of the box", which would take time and copious lines of code to achieve in a conventional coding or even scripting language. Additionally, there's no need to re-invent the wheel. It's much easier to use somebody else's program to download web pages, for example, than to deal with low-level system sockets in C.

How's It Going to Work?

The concept is simple. Choose a site with images you like and "scrape" its pages for those images. Then, once you have a direct link to an image, you download it and set it as the desktop wallpaper using the display manager. Easy, right?

A Simple Example: xkcd

To start off, let's venture to every programmer's second-favorite page after Stack Overflow: xkcd. Loading the page, you should be greeted by the daily comic strip and some other data.

Now, what if you want to see this comic without venturing to the xkcd site? You need a script to do it for you. First, you need to know how the webpage looks to the computer, so download it and take a look. To do this, use wget, an easy-to-use, commonly installed, non-interactive, network downloader. So, on the command line, call wget, and give it the link to the page:


user@LJ $: wget https://www.xkcd.com/


--2018-01-27 21:01:39--  https://www.xkcd.com/
Resolving www.xkcd.com... 151.101.0.67, 151.101.192.67,
 ↪151.101.64.67, ...
Connecting to www.xkcd.com|151.101.0.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2606 (2.5K) [text/html]
Saving to: 'index.html'

index.html                                  100%
[==========================================================>]
2.54K  --.-KB/s    in 0s

2018-01-27 21:01:39 (23.1 MB/s) - 'index.html' saved [6237]

As you can see in the output, the page has been saved to index.html in your current directory. Using your favourite editor, open it and take a look (I'm using nano for this example):


user@LJ $: nano index.html

Now you might realize, despite this being a rather bare page, there's a lot of code in that file. Instead of going through it all, let's use grep, which is perfect for this task. Its sole function is to print lines matching your search. Grep uses the syntax:


user@LJ $: grep [search] [file]

Looking at the daily comic, its current title is "Night Sky". Searching for "night" with grep yields the following results:


user@LJ $: grep "night" index.html

Night Sky
Image URL (for hotlinking/embedding):
 ↪https://imgs.xkcd.com/comics/night_sky.png

The grep search has returned two image links in the file, each related to "night". Looking at those two lines, one is the image in the page, and the other is for hotlinking and is already a usable link. You'll be obtaining the first link, however, as it is more representative of other pages that don't provide an easy link, and it serves as a good introduction to the use of grep and cut.

To get the first link out of the page, you first need to identify it in the file programmatically. Let's try grep again, but this time instead of using a string you already know ("night"), let's approach it as if you know nothing about the page. Although the link will be different, the HTML should remain the same; therefore, "img src=" always should appear before the link you want:


user@LJ $: grep "img src=" index.html

xkcd.com logo

Selected Comics

It looks like there are three images on the page. Comparing these results with those from the first grep, you'll see that only one of them is the comic itself. The other two links contain "/s/", whereas the link we want contains "/comics/". So, you need to grep the output of the last command for "/comics/". To pass along the output of the last command, use the pipe character (|):


user@LJ $: grep "img src=" index.html | grep "/comics/"



And, there's the line! Now you just need to separate the image link from the rest of it with the cut command. cut uses the syntax:


user@LJ $: cut [-d delimiter] [-f field] [-c characters]

To cut the link from the rest of the line, use the quotation mark as the delimiter and select the second field, the text between the first pair of quotes. In other words, you want the text between the quotes, which is the link itself. That's done like this:


user@LJ $: grep "img src=" index.html | grep "/comics/" |
 ↪cut -d\" -f2

//imgs.xkcd.com/comics/night_sky.png

And, you've got the link. But wait! What about those pesky forward slashes at the beginning? You can cut those out too:


user@LJ $: grep "img src=" index.html | grep "/comics/" |
 ↪cut -d\" -f 2 | cut -c 3-

imgs.xkcd.com/comics/night_sky.png

Now you've just cut the first two characters from the line, and you're left with a link straight to the image. Using wget again, you can download the image:


user@LJ $: wget imgs.xkcd.com/comics/night_sky.png


--2018-01-27 21:42:33--  http://imgs.xkcd.com/comics/night_sky.png
Resolving imgs.xkcd.com... 151.101.16.67, 2a04:4e42:4::67
Connecting to imgs.xkcd.com|151.101.16.67|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 54636 (53K) [image/png]
Saving to: 'night_sky.png'

night_sky.png                               100%
[===========================================================>]
53.36K  --.-KB/s    in 0.04s

2018-01-27 21:42:33 (1.24 MB/s) - 'night_sky.png'
 ↪saved [54636/54636]

Now you have the image in your directory, but its name will change when the comic's name changes. To fix that, tell wget to save it with a specific name:


user@LJ $: wget "$(grep "img src=" index.html | grep "/comics/"
 ↪| cut -d\" -f2 | cut -c 3-)" -O wallpaper
--2018-01-27 21:45:08--  http://imgs.xkcd.com/comics/night_sky.png
Resolving imgs.xkcd.com... 151.101.16.67, 2a04:4e42:4::67
Connecting to imgs.xkcd.com|151.101.16.67|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 54636 (53K) [image/png]
Saving to: 'wallpaper'

wallpaper                                   100%
[==========================================================>]
53.36K  --.-KB/s    in 0.04s

2018-01-27 21:45:08 (1.41 MB/s) - 'wallpaper' saved [54636/54636]

The -O option means that the downloaded image now has been saved as "wallpaper". Now that you know the name of the image, you can set it as a wallpaper. This varies depending upon which display manager you're using. The most popular are listed below, assuming the image is located at /home/user/wallpaper.

GNOME:


gsettings set org.gnome.desktop.background picture-uri
 ↪"File:///home/user/wallpaper"
gsettings set org.gnome.desktop.background picture-options
 ↪scaled

Cinnamon:


gsettings set org.cinnamon.desktop.background picture-uri
 ↪"file:///home/user/wallpaper"
gsettings set org.cinnamon.desktop.background picture-options
 ↪scaled

Xfce:


xfconf-query --channel xfce4-desktop --property
 ↪/backdrop/screen0/monitor0/image-path --set
 ↪/home/user/wallpaper

You can set your wallpaper now, but you need different images to mix in. Looking at the webpage, there's a "random" button that takes you to a random comic. Searching with grep for "random" returns the following:


user@LJ $: grep random index.html

<a href="//c.xkcd.com/random/comic/">Random</a>
<a href="//c.xkcd.com/random/comic/">Random</a>

    This is the link to a random comic, and downloading it with wget and reading the result, it looks like the initial comic page. Success!

    Now that you've got all the components, let's put them together into a script, replacing www.xkcd.com with the new c.xkcd.com/random/comic/:

    
    #!/bin/bash
    
    wget -O index.html c.xkcd.com/random/comic/   # always (over)write index.html
    
    wget "$(grep "img src=" index.html | grep /comics/ | cut -d\"
     ↪-f 2 | cut -c 3-)" -O wallpaper
    
    gsettings set org.gnome.desktop.background picture-uri
     ↪"File:///home/user/wallpaper"
    gsettings set org.gnome.desktop.background picture-options
     ↪scaled
    
    

    All of this should be familiar except the first line, which designates this as a bash script, and the second wget command. To capture the output of commands into a variable, you use $(). In this case, you're capturing the grepping and cutting process—capturing the final link and then downloading it with wget. When the script is run, the commands inside the parentheses are all run, producing the image link, before wget is called to download it.
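
    As an aside, this kind of command substitution works anywhere in bash, not just with wget; here's a trivial, standalone example (the variable name is arbitrary):


    today=$(date +%A)        # capture the command's output in a variable
    echo "Happy $today!"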

    There you have it—a simple example of a dynamic wallpaper that you can run anytime you want.

    If you want the script to run automatically, you can add a cron job to have cron run it for you. So, edit your crontab with:

    
    user@LJ $: crontab -e
    
    

    My script is called "xkcd", and my crontab entry looks like this:

    
    @reboot /bin/bash /home/user/xkcd
    
    

    This will run the script (located at /home/user/xkcd) with bash at every restart.

    Reddit

    The script above shows how to search for images in HTML code and download them. But, you can apply this to any website of your choice—although the HTML code will be different, the underlying concepts remain the same. With that in mind, let's tackle downloading images from Reddit. Why Reddit? Reddit is possibly the largest blog on the internet and the third-most-popular site in the US. It aggregates content from many different communities together onto one site. It does this through use of "subreddits", communities that join together to form Reddit. For the purposes of this article, let's focus on subreddits (or "subs" for short) that primarily deal with images. However, any subreddit, as long as it allows images, can be used in this script.

    Figure 1. Scraping the Web Made Simple—Analysing Web Pages in a Terminal

    Diving In

    Just as with the xkcd script, you need to download the web page from a subreddit to analyse it. I'm using reddit.com/r/wallpapers for this example. First, check for images in the HTML:

    
    user@LJ $: wget https://www.reddit.com/r/wallpapers/ && grep
     ↪"img src=" index.html
    
    --2018-01-28 20:13:39--  https://www.reddit.com/r/wallpapers/
    Resolving www.reddit.com... 151.101.17.140
    Connecting to www.reddit.com|151.101.17.140|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 27324 (27K) [text/html]
    Saving to: 'index.html'
    
    index.html                                  100%
    [==========================================================>]
    26.68K  --.-KB/s    in 0.1s
    
    2018-01-28 20:13:40 (270 KB/s) - 'index.html' saved [169355]
    
    
    --- SNIP: one long line of HTML containing all of the page's image tags ---

    All the images have been returned in one long line, because the HTML for the images is also in one long line. You need to split this one long line into the separate image links. Enter Regex.

    Regex is short for regular expression, a system used by many programs to let users match patterns against strings. It includes wildcards, special characters and tokens that match whole classes of characters; for example, the . character matches any single character, and .* matches any sequence of characters. For this example, you want an expression that matches every link in the HTML file. All HTML links have one thing in common: they take the form href="LINK". Let's write a regex to match that:

    
    href="([^"#]+)"
    
    

    Now let's break it down:

    • href=" — simply states that the first characters should match these.

    • () — forms a capture group.

    • [^"#] — forms a negated set; the match can't contain any of the characters inside (here, a quotation mark or a hash).

    • + — the string should match one or more of the preceding tokens.

    Altogether, the regex matches a string that begins with href=", doesn't contain any quotation marks or hash characters, and finishes with a quotation mark.

    This regex can be used with grep like this:

    
    user@LJ $: grep -o -E 'href="([^"#]+)"' index.html
    
    href="/static/opensearch.xml"
    href="https://www.reddit.com/r/wallpapers/"
    href="//out.reddit.com"
    href="//out.reddit.com"
    href="//www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-57x57.png"
    
    --- SNIP ---
    
    

    The -E option enables extended regular expressions, and the -o switch means grep prints only the portion of each line that matches the pattern, not the whole line. You now have a much more manageable list of links. From there, you can use the same techniques from the first script to extract the links and filter for images. This looks like the following:

    
    user@LJ $: grep -o -E 'href="([^"#]+)"' index.html | cut -d'"'
     ↪-f2 | sort | uniq | grep -E '.jpg|.png'
    
    https://i.imgur.com/6DO2uqT.png
    https://i.imgur.com/Ualn765.png
    https://i.imgur.com/UO5ck0M.jpg
    https://i.redd.it/s8ngtz6xtnc01.jpg
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪android-icon-192x192.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-114x114.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-120x120.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-144x144.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-152x152.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-180x180.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-57x57.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-60x60.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-72x72.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪apple-icon-76x76.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪favicon-16x16.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪favicon-32x32.png
    //www.redditstatic.com/desktop2x/img/favicon/
    ↪favicon-96x96.png
    
    

    The final grep uses regex again to match .jpg or .png. The | character acts as a boolean OR operator.

    As you can see, there are four matches for actual images: two .jpgs and two .pngs. The others are Reddit default images, like the logo. Once you remove those images, you'll have a final list of images to set as a wallpaper. The easiest way to remove these images from the list is with sed:

    
    user@LJ $: grep -o -E 'href="([^"#]+)"' index.html | cut -d'"'
     ↪-f2 | sort | uniq | grep -E '.jpg|.png' | sed /redditstatic/d
    
    https://i.imgur.com/6DO2uqT.png
    https://i.imgur.com/Ualn765.png
    https://i.imgur.com/UO5ck0M.jpg
    https://i.redd.it/s8ngtz6xtnc01.jpg
    
    

    sed works by matching what's between the two forward slashes. The d on the end tells sed to delete the lines that match the pattern, leaving the image links.

    The great thing about sourcing images from Reddit is that every subreddit contains nearly identical HTML; therefore, this small script will work on any subreddit.

    Creating a Script

    To create a script for Reddit, you'll want to be able to choose which subreddits to source images from. I've created a directory for my script and placed a file called "links" alongside it. This file contains the subreddit links in the following format:

    
    https://www.reddit.com/r/wallpapers
    https://www.reddit.com/r/wallpaper
    https://www.reddit.com/r/NationalPark
    https://www.reddit.com/r/tiltshift
    https://www.reddit.com/r/pic
    
    

    At run time, I have the script read the list and download these subreddits before stripping images from them.

    Since you can have only one image at a time as desktop wallpaper, you'll want to narrow down the selection of images to just one. First, however, it's best to have a wide range of images without using a lot of bandwidth. So you'll want to download the web pages for multiple subreddits and strip the image links but not download the images themselves. Then you'll use a random selector to select one image link and download that one to use as a wallpaper.
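
    That random selection is a one-liner with shuf, as in this small sketch (it assumes the stripped links already have been collected into a file called temp, just as in the full script below):


    wallpaper=$(shuf -n 1 temp)    # pick one link at random from the file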

    Finally, if you're downloading lots of subreddits' web pages, the script will become very slow. This is because the script waits for each command to complete before proceeding. To circumvent this, you can fork a command by appending an ampersand (&) character. This creates a new process for the command, "forking" it from the main process (the script).
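
    To make the pattern concrete, here's a tiny standalone sketch (the URLs are placeholders): each wget is forked into the background with &, and wait blocks until all of them have finished.


    wget -q https://example.com/page1 -O page1 &   # forked into the background
    wget -q https://example.com/page2 -O page2 &
    wget -q https://example.com/page3 -O page3 &
    wait                                           # resume only after all downloads finish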

    Here's my fully annotated script:

    
    #!/bin/bash
    
    DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
     ↪# Get the script's current directory
    
    linksFile="links"
    
    mkdir $DIR/downloads
    cd $DIR/downloads
    
    # Strip the image links from the html
    function parse {
    grep -o -E 'href="([^"#]+)"' $1 | cut -d'"' -f2 | sort | uniq
     ↪| grep -E '.jpg|.png' >> temp
    grep -o -E 'href="([^"#]+)"' $2 | cut -d'"' -f2 | sort | uniq
     ↪| grep -E '.jpg|.png' >> temp
    grep -o -E 'href="([^"#]+)"' $3 | cut -d'"' -f2 | sort | uniq
     ↪| grep -E '.jpg|.png' >> temp
    grep -o -E 'href="([^"#]+)"' $4 | cut -d'"' -f2 | sort | uniq
     ↪| grep -E '.jpg|.png' >> temp
    }
    
    # Download the subreddit's webpages
    function download {
    rname=$( echo $1 | cut -d / -f 5  )
    tname=$(echo t.$rname)
    rrname=$(echo r.$rname)
    cname=$(echo c.$rname)
    wget --load-cookies=../cookies.txt -O $rname $1
     ↪&>/dev/null &
    wget --load-cookies=../cookies.txt -O $tname $1/top
     ↪&>/dev/null &
    wget --load-cookies=../cookies.txt -O $rrname $1/rising
     ↪&>/dev/null &
    wget --load-cookies=../cookies.txt -O $cname $1/controversial
     ↪&>/dev/null &
    wait # wait for all forked wget processes to return
    parse $rname $tname $rrname $cname
    }
    
    
    # For each line in links file
    while read l; do
       if [[ $l != *"#"* ]]; then # if line doesn't contain a
     ↪hashtag (comment)
            download $l&
       fi
    done < ../$linksFile
    
    wait # wait for all forked processes to return
    
    sed -i '/www.redditstatic.com/d' temp # remove reddit pics that
     ↪exist on most pages from the list
    
    
    wallpaper=$(shuf -n 1 temp) # select randomly from file and DL
    
    echo $wallpaper >> $DIR/log # save image into log in case
     ↪we want it later
    
    wget -b $wallpaper -O $DIR/wallpaperpic 1>/dev/null # Download
     ↪wallpaper image
    
    gsettings set org.gnome.desktop.background picture-uri
     ↪file://$DIR/wallpaperpic # Set wallpaper (Gnome only!)
    
    
    rm -r $DIR/downloads # cleanup
    
    

    Just like before, you can set up a cron job to run the script for you at every reboot or whatever interval you like.
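
    For example, a crontab entry like this one (the script path is just a placeholder) would fetch a fresh wallpaper every morning at 8am rather than only at reboot:


    0 8 * * * /bin/bash /home/user/redditwall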

    And, there you have it—a fully functional cat-image harvester. May your morning logins be greeted with many furry faces. Now go forth and discover new subreddits to gawk at and new websites to scrape for cool wallpapers.

    Cooking With Linux (without a net): A CMS Smorgasbord


    by Marcel Gagné

    Note: You are watching a recording of a live show. It's Tuesday and that means it's time for Cooking With Linux (without a net), sponsored and supported by Linux Journal. Today, I'm going to install four popular content management systems. These will be Drupal, Joomla, WordPress, and Backdrop. If you're trying to decide on what your next CMS platform should be, this would be a great time to tune in. And yes, I'll do it all live, without a net, and with a high probability of falling flat on my face. Join me today, at 12 noon, Eastern Time. Be part of the conversation.

    Content management systems covered include Drupal, Joomla, WordPress, and Backdrop.


    The Agony and the Ecstasy of Cloud Billing

    Corey Quinn Tue, 04/17/2018 - 09:40

    Cloud billing is inherently complex; it's not just you.

    Back in the mists of antiquity when I started reading Linux Journal, figuring out what an infrastructure was going to cost was (although still obnoxious in some ways) straightforward. You'd sign leases with colocation providers, buy hardware that you'd depreciate on a schedule and strike a deal in blood with a bandwidth provider, and you were more or less set until something significant happened to your scale.

    In today's brave new cloud world, all of that goes out the window. The public cloud providers give with one hand ("Have a full copy of any environment you want, paid by the hour!"), while taking with the other ("A single Linux instance will cost you $X per hour, $Y per GB transferred per month, and $Z for the attached storage; we simplify this pricing into what we like to call 'We Make It Up As We Go Along'").

    In my day job, I'm a consultant who focuses purely on analyzing and reducing the Amazon Web Services (AWS) bill. As a result, I've seen a lot of environments doing different things: cloud-native shops spinning things up without governance, large enterprises transitioning into the public cloud with legacy applications that don't exactly support that model without some serious tweaking, and cloud migration projects that somehow lost their way severely enough that they were declared acceptable as they were, and the "multi-cloud" label was slapped on to them. Throughout all of this, some themes definitely have emerged that I find that people don't intuitively grasp at first. To wit:

    • It's relatively straightforward to do the basic arithmetic to figure out what a current data center would cost to put into the cloud as is—generally it's a lot! If you do a 1:1 mapping of your existing data center into the cloudy equivalents, it invariably will cost more; that's a given. The real cost savings arise when you start to take advantage of cloud capabilities—your web server farm doesn't need to have 50 instances at all times. If that's your burst load, maybe you can scale that in when traffic is low to five instances or so? Only once you fall into a pattern (and your applications support it!) of paying only for what you need when you need it do the cost savings of cloud become apparent.

    • One of the most misunderstood aspects of Cloud Economics is the proper calculation of Total Cost of Ownership, or TCO. If you want to do a break-even analysis on whether it makes sense to build out a storage system instead of using S3, you've got to include a lot more than just a pile of disks. You've got to factor in disaster recovery equipment and location, software to handle replication of data, staff to run the data center/replace drives, the bandwidth to get to the storage from where it's needed, the capacity planning for future growth—and the opportunity cost of building that out instead of focusing on product features.

    • It's easy to get lost in the byzantine world of cloud billing dimensions and lose sight of the fact that you've got staffing expenses. I've yet to see a company with more than five employees wherein the cloud expense wasn't dwarfed by payroll. Unlike the toy projects some of us do as labors of love, engineering time costs a lot of money. Retraining existing staff to embrace a cloud future takes time, and not everyone takes to this new paradigm quickly.

    • Accounting is going to have to weigh in on this, and if you're not prepared for that conversation, it's likely to be unpleasant. You're going from an old world where you could plan your computing expenses a few years out and be pretty close to accurate. Cloud replaces that with a host of variables to account for, including variable costs depending upon load, amortization of Reserved Instances, provider price cuts and a complete lack of transparency with regard to where the money is actually going (Dev or Prod? Which product? Which team spun that up? An engineer left the company six months ago, but their 500TB of data is still sitting there and so on).

    The worst part is that all of this isn't apparent to newcomers to cloud billing, so when you trip over these edge cases, it's natural to feel as if the problem is somehow your fault. I do this for a living, and I was stymied trying to figure out what data transfer was likely to cost in AWS. I started drawing out how it's billed to customers, and ultimately came up with the "AWS Data Transfer Costs" diagram shown in Figure 1.

    Figure 1. A convoluted mapping of how AWS data transfer is priced out.

    If you can memorize those figures, you're better at this than I am by a landslide! It isn't straightforward, it's not simple, and it's certainly not your fault if you don't somehow intrinsically know these things.

    That said, help is at hand. AWS billing is getting much more understandable, with the advent of such things as free Reserved Instance recommendations, the release of the Cost Explorer API and the rise of serverless technologies. For their part, Google's GCP and Microsoft's Azure learned from the early billing stumbles of AWS, and as a result, both have much more understandable cost structures. Additionally, there are a host of cost visibility Platform as a Service offerings out there; they all do more or less the same things as one another, but they're great for ad-hoc queries around your bill. If you'd rather build something you can control yourself, you can shove your billing information from all providers into an SQL database and run something like QuickSight or Tableau on top of it to aid visualization, as many shops do today.

    In return for this ridiculous pile of complexity, you get something rather special—the ability to spin up resources on-demand, for as little time as you need them, and pay only for the things that you use. It's incredible as a learning resource alone—imagine how much simpler it would have been in the late 1990s to receive a working Linux VM instead of having to struggle with Slackware's installation for the better part of a week. The cloud takes away, but it also gives.

    Microsoft Announces First Custom Linux Kernel, German Government Chooses Open-Source Nextcloud and More

    News briefs for April 17, 2018.

    Microsoft yesterday introduced Azure Sphere, a Linux-based OS and cloud service for securing IoT devices. According to ZDNet, "Microsoft President Brad Smith introduced Azure Sphere saying, 'After 43 years, this is the first day that we are announcing, and will distribute, a custom Linux kernel.'"

    The German government's Federal Information Technology Centre (ITZBund) has chosen open-source Nextcloud for its self-hosted cloud solution, iTWire reports. Nextcloud was chosen for meeting strict security requirements and for scalability "both in terms of large numbers of users and extensibility with additional features".

    European authorities have effectively ended the Whois public database of domain name registration, which ICANN oversees. According to The Register, the service isn't compliant with the GDPR and will be illegal as of May 25th: "ICANN now has a little over a month to come up with a replacement to the decades-old service that covers millions of domain names and lists the personal contact details of domain registrants, including their name, email and telephone number."

    A new release of PySolFC, a free and open-source collection of more than 1,000 card Solitaire and Mahjong games, was announced recently. The new stable release, 2.2.0, is the first since 2009.

    Deadline for proposals to speak at Open Source Summit North America is April 29. OSSN is being held in Vancouver, BC, this year from August 29–31.

    In other event news, Red Hat today announced the keynote speakers and agenda for its largest ever Red Hat Summit being held at the Moscone Center in San Francisco, May 8–10.

    Bassel Khartabil Free Fellowship, GNOME 3.28.1 Release, New Version of Mixxx and More

    News briefs for April 16, 2018.

    The Bassel Khartabil Free Fellowship was awarded yesterday to Majd Al-shihabi, a Palestinian-Syrian engineer and urban planning graduate based in Beirut, Lebanon: "The Fellowship will support Majd's efforts in building a unified platform for Syrian and Palestinian oral history archives, as well as the digitizing and release of previously forgotten 1940s era public domain maps of Palestine." The Creative Commons also announced the first three winners of the Bassel Khartabil Memorial Fund: Egypt-based The Mosireen Collective, and Lebanon-based Sharq.org and ASI-REM/ADEF Lebanon. For all the details, see the announcement on the Creative Commons website.

    GNOME 3.28 is ready for prime time after receiving its first point release on Friday, which includes numerous improvements and bug fixes. See the announcement for all the details on version 3.28.1.

    Apache Subversion 1.10 has been released. This version is "a superset of all previous Subversion releases, and is as of the time of its release considered the current "best" release. Any feature or bugfix in 1.0.x through 1.9.x is also in 1.10, but 1.10 contains features and bugfixes not present in any earlier release. The new features will eventually be documented in a 1.10 version of the free Subversion book." New features include improved path-based authorization, new interactive conflict resolver, added support for LZ4 compression and more. See the release notes for more information.

    A new version of Mixxx, the free and open-source DJ software, was released today. Version 2.1 has "new and improved controller mappings, updated Deere and LateNight skins, overhauled effects system, and much more".

    Kayenta, a new open-source project from Google and Netflix for automated deployment monitoring, was announced recently. GeekWire reports that the project's goal is "to help other companies that want to modernize their application deployment practices but don't exactly have the same budget and expertise to build their own solution."

    Multiprocessing in Python

    Reuven M. Lerner Mon, 04/16/2018 - 09:20

    Python's "multiprocessing" module feels like threads, but actually launches processes.

    Many people, when they start to work with Python, are excited to hear that the language supports threading. And, as I've discussed in previous articles, Python does indeed support native-level threads with an easy-to-use and convenient interface.

    However, there is a downside to these threads—namely the global interpreter lock (GIL), which ensures that only one thread runs at a time. Because a thread cedes the GIL whenever it uses I/O, this means that although threads are a bad idea in CPU-bound Python programs, they're a good idea when you're dealing with I/O.

    But even when you're using lots of I/O, you might prefer to take full advantage of a multicore system. And in the world of Python, that means using processes.

    In my article "Launching External Processes in Python", I described how you can launch processes from within a Python program, but those examples all demonstrated that you can launch a program in an external process. Normally, when people talk about processes, they work much like they do with threads, but are even more independent (and with more overhead, as well).

    So, it's something of a dilemma: do you launch easy-to-use threads, even though they don't really run in parallel? Or, do you launch new processes, over which you have little control?

    The answer is somewhere in the middle. The Python standard library comes with "multiprocessing", a module that gives the feeling of working with threads, but that actually works with processes.

    So in this article, I look at the "multiprocessing" library and describe some of the basic things it can do.

    Multiprocessing Basics

    The "multiprocessing" module is designed to look and feel like the "threading" module, and it largely succeeds in doing so. For example, the following is a simple example of a multithreaded program:

    
    #!/usr/bin/env python3
    
    import threading
    import time
    import random
    
    def hello(n):
        time.sleep(random.randint(1,3))
        print("[{0}] Hello!".format(n))
    
    for i in range(10):
        threading.Thread(target=hello, args=(i,)).start()
    
    print("Done!")
    
    

    In this example, there is a function (hello) that prints "Hello!" along with whatever argument is passed. It then runs a for loop that runs hello ten times, each of them in an independent thread.

    But wait. Before the function prints its output, it first sleeps for a few seconds. When you run this program, you then end up with output that demonstrates how the threads are running in parallel, and not necessarily in the order they are invoked:

    
    $ ./thread1.py
    Done!
    [2] Hello!
    [0] Hello!
    [3] Hello!
    [6] Hello!
    [9] Hello!
    [1] Hello!
    [5] Hello!
    [8] Hello!
    [4] Hello!
    [7] Hello!
    
    

    If you want to be sure that "Done!" is printed after all the threads have finished running, you can use join. To do that, you need to grab each instance of threading.Thread, put it in a list, and then invoke join on each thread:

    
    #!/usr/bin/env python3
    
    import threading
    import time
    import random
    
    def hello(n):
        time.sleep(random.randint(1,3))
        print("[{0}] Hello!".format(n))
    
    threads = [ ]
    for i in range(10):
        t = threading.Thread(target=hello, args=(i,))
        threads.append(t)
        t.start()
    
    for one_thread in threads:
        one_thread.join()
    
    print("Done!")
    
    

    The only difference in this version is that it puts each thread object in a list ("threads") and then iterates over that list, joining the threads one by one.

    But wait a second—I promised that I'd talk about "multiprocessing", not threading. What gives?

    Well, "multiprocessing" was designed to give the feeling of working with threads. This is so true that I basically can do some search-and-replace on the program I just presented:

    • threading → multiprocessing
    • Thread → Process
    • threads → processes
    • thread → process

    The result is as follows:

    
    #!/usr/bin/env python3
    
    import multiprocessing
    import time
    import random
    
    def hello(n):
        time.sleep(random.randint(1,3))
        print("[{0}] Hello!".format(n))
    
    processes = [ ]
    for i in range(10):
        t = multiprocessing.Process(target=hello, args=(i,))
        processes.append(t)
        t.start()
    
    for one_process in processes:
        one_process.join()
    
    print("Done!")
    
    

    In other words, you can run a function in a new process, with full concurrency and take advantage of multiple cores, with multiprocessing.Process. It works very much like a thread, including the use of join on the Process objects you create. Each instance of Process represents a process running on the computer, which you can see using ps, and which you can (in theory) stop with kill.

    What's the Difference?

    What's amazing to me is that the API is almost identical, and yet two very different things are happening behind the scenes. Let me try to make the distinction clearer with another pair of examples.

    Perhaps the biggest difference, at least to anyone programming with threads and processes, is the fact that threads share global variables. By contrast, separate processes are completely separate; one process cannot affect another's variables. (In a future article, I plan to look at how to get around that.)

    Here's a simple example of how a function running in a thread can modify a global variable (note that what I'm doing here is to prove a point; if you really want to modify global variables from within a thread, you should use a lock):

    
    #!/usr/bin/env python3
    
    import threading
    import time
    import random
    
    mylist = [ ]
    
    def hello(n):
        time.sleep(random.randint(1,3))
        mylist.append(threading.get_ident())   # bad in real code!
        print("[{0}] Hello!".format(n))
    
    threads = [ ]
    for i in range(10):
        t = threading.Thread(target=hello, args=(i,))
        threads.append(t)
        t.start()
    
    for one_thread in threads:
        one_thread.join()
    
    print("Done!")
    print(len(mylist))
    print(mylist)
    
    

    The program is basically unchanged, except that it defines a new, empty list (mylist) at the top. The function appends its ID to that list and then returns.

    Now, the way that I'm doing this isn't so wise, because Python data structures aren't thread-safe, and appending to a list from within multiple threads eventually will catch up with you. But the point here isn't to demonstrate threads, but rather to contrast them with processes.

    When I run the above code, I get:

    
    $ ./th-update-list.py
    [0] Hello!
    [2] Hello!
    [6] Hello!
    [3] Hello!
    [1] Hello!
    [4] Hello!
    [5] Hello!
    [7] Hello!
    [8] Hello!
    [9] Hello!
    Done!
    10
    [123145344081920, 123145354592256, 123145375612928,
     ↪123145359847424, 123145349337088, 123145365102592,
     ↪123145370357760, 123145380868096, 123145386123264,
     ↪123145391378432]
    
    

    So, you can see that the global variable mylist is shared by the threads, and that when one thread modifies the list, that change is visible to all the other threads.

    But if you change the program to use "multiprocessing", the output looks a bit different:

    
    #!/usr/bin/env python3
    
    import multiprocessing
    import time
    import random
    import os
    
    mylist = [ ]
    
    def hello(n):
        time.sleep(random.randint(1,3))
        mylist.append(os.getpid())
        print("[{0}] Hello!".format(n))
    
    processes = [ ]
    for i in range(10):
        t = multiprocessing.Process(target=hello, args=(i,))
        processes.append(t)
        t.start()
    
    for one_process in processes:
        one_process.join()
    
    print("Done!")
    print(len(mylist))
    print(mylist)
    
    

    Aside from the switch to multiprocessing, the biggest change in this version of the program is the use of os.getpid to get the current process ID.

    The output from this program is as follows:

    
    $ ./proc-update-list.py
    [0] Hello!
    [4] Hello!
    [7] Hello!
    [8] Hello!
    [2] Hello!
    [5] Hello!
    [6] Hello!
    [9] Hello!
    [1] Hello!
    [3] Hello!
    Done!
    0
    []
    
    

    Everything seems great until the end when it checks the value of mylist. What happened to it? Didn't the program append to it?

    Sort of. The thing is, there is no "it" in this program. Each time it creates a new process with "multiprocessing", each process has its own value of the global mylist list. Each process thus adds to its own list, which goes away when the processes are joined.

    This means the call to mylist.append succeeds, but it succeeds in ten different processes. When the function returns from executing in its own process, there is no trace left of the list from that process. The only mylist variable in the main process remains empty, because no one ever appended to it.

    Queues to the Rescue

    In the world of threaded programs, even when you're able to append to the global mylist variable, you shouldn't do it. That's because Python's data structures aren't thread-safe. Indeed, only one data structure is guaranteed to be thread safe—the Queue class in the multiprocessing module.

    Queues are FIFOs (that is, "first in, first out"). Whoever wants to add data to a queue invokes the put method on the queue. And whoever wants to retrieve data from a queue uses the get command.

    Now, queues in the world of multithreaded programs prevent issues having to do with thread safety. But in the world of multiprocessing, queues allow you to bridge the gap among your processes, sending data back to the main process. For example:

    
    #!/usr/bin/env python3
    
    import multiprocessing
    import time
    import random
    import os
    from multiprocessing import Queue
    
    q = Queue()
    
    def hello(n):
        time.sleep(random.randint(1,3))
        q.put(os.getpid())
        print("[{0}] Hello!".format(n))
    
    processes = [ ]
    for i in range(10):
        t = multiprocessing.Process(target=hello, args=(i,))
        processes.append(t)
        t.start()
    
    for one_process in processes:
        one_process.join()
    
    mylist = [ ]
    while not q.empty():
        mylist.append(q.get())
    
    print("Done!")
    print(len(mylist))
    print(mylist)
    
    

    In this version of the program, I don't create mylist until late in the game. However, I create an instance of multiprocessing.Queue very early on. That Queue instance is designed to be shared across the different processes. Moreover, it can handle any type of Python data that can be stored using "pickle", which basically means any data structure.

    In the hello function, it replaces the call to mylist.append with one to q.put, placing the current process' ID number on the queue. Each of the ten processes it creates will add its own PID to the queue.

    Note that this program takes place in stages. First it launches ten processes, then they all do their work in parallel, and then it waits for them to complete (with join), so that it can process the results. It pulls data off the queue, puts it onto mylist, and then performs some calculations on the data it has retrieved.

    The implementation of queues is so smooth and easy to work with, it's easy to forget that these queues are using some serious behind-the-scenes operating system magic to keep things coordinated. It's easy to think that you're working with threading, but that's just the point of multiprocessing; it might feel like threads, but each process runs separately. This gives you true concurrency within your program, something threads cannot do.

    Conclusion

    Threading is easy to work with, but threads don't truly execute in parallel. Multiprocessing is a module that provides an API that's almost identical to that of threads. This doesn't paper over all of the differences, but it goes a long way toward making sure things aren't out of control.

    FOSS Project Spotlight: Ravada

    Francesc Guasch Fri, 04/13/2018 - 13:48

    Ravada is an open-source project that allows users to connect to a virtual desktop.

    Currently, it supports KVM, but its back end has been designed and implemented to allow other hypervisors to be added to the framework in the future. The client's only requirements are a web browser and a remote viewer supporting the SPICE protocol.

    Ravada's main features include:

    • KVM back end.
    • LDAP and SQL authentication.
    • Kiosk mode.
    • Remote access for Windows and Linux.
    • Light and fast virtual machine clones for each user.
    • Instant clone creation.
    • USB redirection.
    • Easy and customizable end-user interface (i18n, l10n).
    • Administration from a web browser.

    It's very easy to install and use. Following the documentation, virtual machines can be deployed in minutes. It's an early release, but it's already used in production. The project is open source, and you can download the code from GitHub. Contributions welcome!

    Screenshots: choose a screen; list of virtual machines.

    Elisa Music Player Debuts, Zenroom Crypto-Language VM Reaches Version 0.5.0 and More

    News briefs for April 13, 2018.

    The Elisa music player, developed by the KDE community, debuted yesterday, with version 0.1. Elisa has good integration with the Plasma desktop and also supports other Linux desktop environments, as well as Windows and Android. In addition, the Elisa release announcement notes, "We are creating a reliable product that is a joy to use and respects our users' privacy. As such, we will prefer to support online services where users are in control of their data."

    Mozilla released Firefox 11.0 for iOS yesterday, and this new version turns on tracking protection by default. The feature uses a list provided by Disconnect to identify trackers, and it also provides options for turning it on or off overall or for specific websites.

    The Zenroom project, a brand-new crypto-language virtual machine, has reached version 0.5.0. Zenroom's goal is "improving people's awareness of how their data is processed by algorithms, as well facilitate the work of developers to create and publish algorithms that can be used both client and server side." In addition, it "has no external dependencies, is smaller than 1MB, runs in less than 64KiB memory and is ready for experimental use on many target platforms: desktop, embedded, mobile, cloud and browsers." The program is free software and is licensed under the GNU LGPL v3. Its main use case is "distributed computing of untrusted code where advanced cryptographic functions are required".

    ZFS On Linux, recently in the news for data-loss issues, may finally be getting SSD TRIM support, which has been in the works for years, according to Phoronix.

    System76 recently became a GNOME Foundation Advisory Board member. Neil McGovern, Executive Director of the GNOME Foundation, commented "System76's long-term ambition to see free software grow is highly commendable, and we're extremely pleased that they're coming on board to help support the Foundation and the community." See the betanews article for more details.

    Facebook Compartmentalization

    Kyle Rankin Thu, 04/12/2018 - 10:06

    I don't always use Facebook, but when I do, it's over a compartmentalized browser over Tor.

    Whenever people talk about protecting privacy on the internet, social-media sites like Facebook inevitably come up—especially right now. It makes sense—social networks (like Facebook) provide a platform where you can share your personal data with your friends, and it doesn't come as much of a surprise to people to find out they also share that data with advertisers (it's how they pay the bills after all). It makes sense that Facebook uses data you provide when you visit that site. What some people might be surprised to know, however, is just how much Facebook tracks them when they aren't using Facebook itself but just browsing around the web.

    Some readers may solve the problem of Facebook tracking by saying "just don't use Facebook"; however, for many people, that site may be the only way they can keep in touch with some of their friends and family members. Although I don't post on Facebook much myself, I do have an account and use it to keep in touch with certain friends. So in this article, I explain how I employ compartmentalization principles to use Facebook without leaking too much other information about myself.

    1. Post Only Public Information

    The first rule for Facebook is that, regardless of what you think your privacy settings are, you are much better off if you treat any content you provide there as being fully public. For one, all of those different privacy and permission settings can become complicated, so it's easy to make a mistake that ends up making some of your data more public than you'd like. Second, even with privacy settings in place, you don't have a strong guarantee that the data won't be shared with people willing to pay for it. If you treat it like a public posting ground and share only data you want the world to know, you won't get any surprises.

    2. Give Facebook Its Own Browser

    I mentioned before that Facebook also can track what you do when you browse other sites. Have you ever noticed little Facebook "Like" icons on other sites? Often websites will include those icons to help increase engagement on their sites. What those icons also do, however, is link the fact that you visited that site to your specific Facebook account—even if you didn't click "Like" or otherwise engage with the site. If you want to reduce how much you are tracked, I recommend selecting a separate browser that you use only for Facebook. So if you are a Firefox user, load Facebook in Chrome. If you are a Chrome user, view Facebook in Firefox. If you don't want to go to the trouble of managing two different browsers, at the very least, set up a separate Firefox profile (run firefox -P from a terminal) that you use only for Facebook.
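    To make that concrete, here's a minimal sketch of what that looks like from a terminal (the profile name "facebook" is just an example, not anything special to Firefox):

    firefox --no-remote -CreateProfile facebook    # create a profile used only for Facebook
    firefox --no-remote -P facebook &              # launch it alongside your everyday browser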

    3. View Facebook over Tor

    Many people don't know that Facebook itself offers a .onion service that allows you to view Facebook over Tor. It may seem counterintuitive that a site that wants so much of your data would also want to use an anonymizing service, but it makes sense if you think it through. Sure, if you access Facebook over Tor, Facebook will know it's you that's accessing it, but it won't know from where. More important, no other sites on the internet will know you are accessing Facebook from that account, even if they try to track via IP.

    To use Facebook's private .onion service, install the Tor Browser Bundle, or otherwise install Tor locally, and follow the Tor documentation to route your Facebook-only browser to its SOCKS proxy service. Then visit https://facebookcorewwwi.onion, and only you and Facebook will know you are hitting the site. By the way, one advantage to setting up a separate browser that uses a SOCKS proxy instead of the Tor Browser Bundle is that the Tor Browser Bundle attempts to be stateless, so you will have a tougher time making the Facebook .onion address your home page.
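    If you'd rather script the proxy setup than click through the preferences dialog, one option is to append the SOCKS settings to that Facebook-only profile's user.js file. This is only a sketch: the profile directory below is hypothetical (check ~/.mozilla/firefox/profiles.ini for yours), and it assumes a local Tor daemon listening on its default SOCKS port, 9050:

    PROFILE=~/.mozilla/firefox/xxxxxxxx.facebook   # hypothetical path; substitute your own
    {
        echo 'user_pref("network.proxy.type", 1);'                  # 1 = manual proxy settings
        echo 'user_pref("network.proxy.socks", "127.0.0.1");'
        echo 'user_pref("network.proxy.socks_port", 9050);'
        echo 'user_pref("network.proxy.socks_remote_dns", true);'   # resolve DNS over Tor as well
    } >> "$PROFILE/user.js"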

    Conclusion

    So sure, you could decide to opt out of Facebook altogether, but if you don't have that luxury, I hope a few of these compartmentalization steps will help you use Facebook in a way that doesn't completely remove your privacy.

    Mozilla’s Internet Health Report, Google’s Fuchsia, Purism Development Docs and More

    News briefs for April 12, 2018.

    Mozilla recently published its annual Internet Health Report. Its three major concerns are:

    • "Consolidation of power over the Internet, particularly by Facebook, Google, Tencent, and Amazon."
    • "The spread of 'fake news,' which the report attributes in part to the 'broken online advertising economy' that provides financial incentive for fraud, misinformation, and abuse."
    • The threat to privacy posed by the poor security of the Internet of Things.

    (Source: Ars Technica's "The Internet has serious health problems, Mozilla Foundation report finds")

    Idle power on some Linux systems could drop by 10% or more with the Linux 4.17 kernel, reports Phoronix. Evidently, that's not all that's in the works regarding power management features: "performance of workloads where the idle loop overhead was previously significant could now see greater gains too". See Rafael Wysocki's "More power management updates for v4.17-rc-1" pull request.

    Google's "not-so-secret" operating system named Fuchsia that's been in development for almost two years has attracted much speculation, but now we finally know what it is not. It's not Linux. According to a post on xda, Google published a documentation page called "the book" that explains what Fuchsia is and isn't. Several details still need to be filled in, but documentation will be added as things develop.

    Instagram will soon allow users to download their data, including photos, videos and messages, according to a TechCrunch report: "This tool could make it much easier for users to leave Instagram and go to a competing image social network. And as long as it launches before May 25th, it will help Instagram to comply with upcoming European GDPR privacy law that requires data portability."

    Purism has started its developer docs effort in anticipation of development boards being shipped this summer. According to the post on the Purism website, "There will be technical step-by-step instructions that are suitable for both newbies and experienced Debian developers alike. The goal of the docs is to openly welcome you and light your path along the way with examples and links to external documentation." You can see the docs here.

    Promote Drupal Initiative Announced at DrupalCon

    Promote Drupal Initiative Announced at DrupalCon
    Image
    Katherine Druckman Wed, 04/11/2018 - 11:03

    Yesterday's keynote from Drupal project founder Dries Buytaert opened the annual North American gathering of Drupalists from around the world and launched a new Drupal community initiative aimed at promoting the Drupal platform through a coordinated marketing effort funded by money raised within the community.

    The Drupal Association hopes to raise $100,000 so that a global group of staff and volunteers can complete the first two phases of a four-phase plan to create consistent, reusable marketing materials. Those materials will help agencies and other Drupal promoters communicate Drupal's benefits to organizations and potential customers quickly and effectively.

    Convincing non-geeks and non-technical decision-makers of Drupal's strengths has always been a pain point, and we'll be watching with great interest as this initiative progresses.

    Also among the announcements were demonstrations of how easy it could soon be to manipulate content within the Drupal back end using a drag-and-drop interface, which would provide great flexibility for site builders and content editors.

    We also expect to see improvements to the Drupal site-builder experience in upcoming releases, as well as improvements to the built-in configuration management process, which eases the deployment process when developing in Drupal.

    See the full keynote to get inspired by what's to come in the Drupalverse.

    And also see the DrupalCon Nashville Playlist!

    OSI’s Simon Phipps on Open Source’s Past and Future

    OSI's Simon Phipps on Open Source's Past and Future
    Image
    Christine Hall Wed, 04/11/2018 - 09:20

    With an eye on the future, the Open Source Initiative's president sits down and talks with Linux Journal about the organization's 20-year history.

    It would be difficult for anyone who follows Linux and open source to have missed the 20th birthday of open source in early February. This was a dual celebration, actually, noting the passing of 20 years since the term "open source" was first coined and since the formation of the Open Source Initiative (OSI), the organization that decides whether software licenses qualify to wear that label.

    The party came six months or so after Facebook was successfully convinced by the likes of the Apache Foundation; WordPress's developer, Automattic; the Free Software Foundation (FSF); and OSI to change the licensing of its popular React project away from the BSD + Patents license, a license that had flown under the radar for a while.

    The brouhaha began when Apache developers noticed a term in the license forbidding users from suing Facebook over any patent issues. That term was troublesome because it gave special consideration to a single entity, Facebook, which pretty much disqualified the license from being open source.

    Although the incident worked out well—after some grumblings Facebook relented and changed the license to MIT—the Open Source Initiative fell under some criticism for having approved the BSD + Patents license, with some people suggesting that maybe it was time for OSI to be rolled over into an organization such as the Linux Foundation.

    The problem was that OSI had never approved the BSD + Patents.

    Simon Phipps delivers the keynote at Kopano Conference 2017 in Arnhem, the Netherlands.

    "BSD was approved as a license, and Facebook decided that they would add the software producer equivalent of a signing statement to it", OSI's president, Simon Phipps, recently explained to Linux Journal. He continued:

    They decided they would unilaterally add a patent grant with a defensive clause in it. They found they were able to do that for a while simply because the community accepted it. Over time it became apparent to people that it was actually not an acceptable patent grant, that it unduly favored Facebook and that if it was allowed to grow to scale, it would definitely create an environment where Facebook was unfairly advantaged.

    He added that the Facebook incident was actually beneficial for OSI and ended up being a validation of the open-source approval process:

    I think the consequence of that encounter is that more people are now convinced that the whole licensing arrangement that open-source software is under needs to be approved at OSI.

    I think prior to that, people felt it was okay for there just to be a license and then for there to be arbitrary additional terms applied. I think that the consensus of the community has moved on from that. I think it would be brave for a future software producer to decide that they can add arbitrary terms unless those arbitrary terms are minimally changing the rights and benefits of the community.

    As for the notion that OSI should be folded into a larger organization such as the Linux Foundation?

    "When I first joined OSI, which was back in 2009 I think, I shared that view", Phipps said. He continued:

    I felt that OSI had done its job and could be put into an existing organization. I came to believe that wasn't the case, because the core role that OSI plays is actually a specialist role. It's one that needs to be defined and protected. Each of the organizations I could think of where OSI could be hosted would almost certainly not be able to give the role the time and attention it was due. There was a risk there would be a capture of that role by an actor who could not be trusted to conduct it responsibly.

    That risk of the license approval role being captured is what persuaded me that I needed to join the OSI board and that I needed to help it to revamp and become a member organization, so that it could protect the license approval role in perpetuity. That's why over the last five to six years, OSI has dramatically changed.

    This is Phipps' second go at being president at OSI. He originally served in the position from 2012 until 2015, when he stepped down in preparation for the end of his term on the organization's board. He returned to the position last year after his replacement, Allison Randal, suddenly stepped down to focus on her pursuit of a PhD.

    His return was pretty much universally seen in a positive light. During his first three-year stint, the organization moved toward a membership-based governance structure and started an affiliate membership program for nonprofit charitable organizations, industry associations and academic institutions. This eventually led to an individual membership program and the inclusion of corporate sponsors.

    Although OSI is one of the best known open-source organizations, its grassroots approach has helped keep it on the lean side, especially when compared to organizations like the behemoth Linux or Mozilla Foundations. Phipps, for example, collects no salary for performing his presidential duties. Compare that with the Linux Foundation's executive director, Jim Zemlin, whose salary in 2010 was reportedly north of $300,000.

    "We're a very small organization actually", Phipps said. He added:

    We have a board of directors of 11 people and we have one paid employee. That means the amount of work we're likely to do behind the scenes has historically been quite small, but as time is going forward, we're gradually expanding our reach. We're doing that through working groups and we're doing that through bringing together affiliates for particular projects.

    While the public perception might be that OSI's role is merely the approval of open-source licenses, Phipps sees a larger picture. According to him, the point of all the work OSI does, including the approval process, is to pave the way to make the road smoother for open-source developers:

    The role that OSI plays is to crystallize consensus. Rather than being an adjudicator that makes decisions ex cathedra, we're an organization that provides a venue for people to discuss licensing. We then identify consensus as it arises and then memorialize that consensus. We're more speaker-of-the-house than king.

    That provides an extremely sound way for people to reduce the burden on developers of having to evaluate licensing. As open source becomes more and more the core of the way businesses develop software, it's more and more valuable to have that crystallization of consensus process taking out the uncertainty for people who are needing to work between different entities. Without that, you need to constantly be seeking legal advice, you need to constantly be having discussions about whether a license meets the criteria for being open source or not, and the higher uncertainty results in fewer contributions and less collaboration.

    One of OSI's duties, and one it has in common with organizations such as the Free Software Foundation (FSF), is that of enforcer of compliance issues with open-source licenses. Like the FSF, OSI prefers to take a carrot rather than stick approach. And because it's the organization that approves open-source licenses, it's in a unique position to nip issues in the bud. Those issues can run the gamut from unnecessary licenses to freeware masquerading as open source. According to Phipps:

    We don't do that in private. We do that fairly publicly and we don't normally need to do that. Normally a member of the license review mailing list, who are all simply members of the community, will go back to people and say "we don't think that's distinctive", "we don't think that's unique enough", "why didn't you use license so and so", or they'll say, "we really don't think your intent behind this license is actually open source." Typically OSI doesn't have to go and say those things to people.

    The places where we do get involved in speaking to people directly is where they describe things as open source when they haven't bothered to go through that process and that's the point at which we'll communicate with people privately.

    The problem of freeware—proprietary software that's offered without cost—being marketed under the open-source banner is particularly troublesome. In those cases, OSI definitely will reach out and contact the offending companies, as Phipps says, "We do that quite often, and we have a good track record of helping people understand why it's to their business disadvantage to behave in that way."

    One of the reasons why OSI is able to get commercial software developers to heed its advice might be because the organization has never taken an anti-business stance. Founding member Michael Tiemann, now VP of open-source affairs at Red Hat, once said that one of the reasons the initiative chose the term "open source" was to "dump the moralizing and confrontational attitude that had been associated with 'free software' in the past and sell the idea strictly on the same pragmatic, business-case grounds that had motivated Netscape."

    These days, the organization has ties with many major software vendors and receives most of its financial support from corporate sponsors. However, it has taken steps to ensure that corporate sponsors don't dictate OSI policy. According to Phipps:

    If you want to join a trade association, that's what the Linux Foundation is there for. You can go pay your membership fees and buy a vote there, but OSI is a 501(c)(3). That means it's a charity that's serving the public's interest and the public benefit.

    It would be wrong for us to allow OSI to be captured by corporate interests. When we conceived the sponsorship scheme, we made sure that there was no risk that would happen. Our corporate sponsors do not get any governance role in the organization. They don't get a vote over what's happening, and we've been very slow to accept new corporate sponsors because we wanted to make sure that no one sponsor could have an undue influence if they decided that they no longer liked us or decided to stop paying the sponsorship fees.

    This pragmatic approach, which also puts "permissive" licenses like Apache and MIT on equal footing with "copyleft" licenses like the GPL, has traditionally not been met with universal approval from FOSS advocates. The FSF's Richard Stallman has been critical of the organization, although noting that his organization and OSI are essentially on the same page. Years ago, OSI co-founder and creator of The Open Source Definition, Bruce Perens, decried the "schism" between the Free Software and Open Source communities—a schism that Phipps seeks to narrow:

    As I've been giving keynotes about the first 20 years and the next ten years of open source, I've wanted to make very clear to people that open source is a progression of the pre-existing idea of free software, that there is no conflict between the idea of free software and the way it can be adopted for commercial or for more structured use under the term open source.

    One of the things that I'm very happy about over the last five to six years is the good relations we've been able to have with the Free Software Foundation Europe. We've been able to collaborate with them over amicus briefs in important lawsuits. We are collaborating with them over significant issues, including privacy and including software patents, and I hope in the future that we'll be able to continue cooperating and collaborating. I think that's an important thing to point out, that I want the pre-existing world of free software to have its due credit.

    Software patents represent one of several areas into which OSI has been expanding. Patents have long been a thorny issue for open source, because they have the potential to affect not only people who develop software, but also companies who merely run open-source software on their machines. They also can be like a snake in the grass; any software application can be infringing on an unknown patent. According to Phipps:

    We have a new project that is just getting started, revisiting the role of patents and standards. We have helped bring together a post-graduate curriculum on open source for educating graduates on how to develop open-source software and how to understand it.

    We also host other organizations that need a fiduciary host so that they don't have to do their own bookkeeping and legal filings. For a couple years, we hosted the Open Hatch Project, which has now wound up, and we host other activities. For example, we host the mailing lists for the California Association of Voting Officials, who are trying to promote open-source software in voting machines in North America.

    Like everyone else in tech these days, OSI is also grappling with diversity issues. Phipps said the organization is seeking to deal with that issue by starting at the membership level:

    At the moment I feel that I would very much like to see a more diverse membership. I'd like to see us more diverse geographically. I'd like to see us more diverse in terms of the ethnicity and gender of the people who are involved. I would like to see us more diverse in terms of the businesses from which people are employed.

    I'd like to see all those improve and so, over the next few years (assuming that I remain president because I have to be re-elected every year by the board) that will also be one of the focuses that I have.

    And to wrap things up, here's how he plans to go about that:

    This year is the anniversary year, and we've been able to arrange for OSI to be present at a conference pretty much every month, in some cases two or three per month, and the vast majority of those events are global. For example, FOSSASIA is coming up, and we're backing that. We are sponsoring a hostel where we'll be having 50 software developers who are able to attend FOSSASIA because of the sponsorship. Our goal here is to raise our profile and to recruit membership by going and engaging with local communities globally. I think that's going to be a very important way that we do it.

    Red Hat Enterprise Linux 7.5 Released, Valve Improves Steam Privacy Settings, New Distribution Specification Project for Containers and More

    News briefs for April 11, 2018.

    Red Hat Enterprise Linux 7.5 was released yesterday. New features include "enhanced security and compliance, usability at scale, continued integration with Windows infrastructure on-premise and in Microsoft Azure, and new functionality for storage cost controls. The release also includes continued investment in platform manageability for Linux beginners, experts, and Microsoft Windows administrators." See the release notes for more information.

    The Open Container Initiative (OCI) yesterday announced the launch of the Distribution Specification Project: "having a solid, common distribution specification with conformance testing will ensure long lasting security and interoperability throughout the container ecosystem". See also "Open Container Initiative nails down container image distribution standard" on ZDNet for more details.

    Valve is offering new and improved privacy settings for Steam users, providing more detailed descriptions of the settings so you can better manage what your friends and the wider Steam community see. The announcement notes, "Additionally, regardless of which setting you choose for your profile's game details, you now have the option to keep your total game playtime private. You no longer need to nervously laugh it off as a bug when your friends notice the 4,000+ hours you've put into Ricochet."

    Thousands of websites have been hacked to give "fake update notifications to install banking malware and remote access trojans on visitors' computers", according to researchers at security firm Malwarebytes. Ars Technica reports that "The attackers also fly under the radar by using highly obfuscated JavaScript. Among the malicious software installed in the campaign was the Chthonic banking malware and a commercial remote access trojan known as NetSupport."

    Krita 4.0.1 was released yesterday. This new version fixes more than 50 bugs since the 4.0 release and includes many improvements to the UI.

    Simple Cloud Hardening

    Simple Cloud Hardening
    Image
    Kyle Rankin Tue, 04/10/2018 - 10:30

    Apply a few basic hardening principles to secure your cloud environment.

    I've written about simple server-hardening techniques in the past. Those articles were inspired in part by the Linux Hardening in Hostile Networks book I was writing at the time, and the idea was to distill the many different hardening steps you might want to perform on a server into a few simple steps that everyone should do. In this article, I take the same approach only with a specific focus on hardening cloud infrastructure. I'm most familiar with AWS, so my hardening steps are geared toward that platform and use AWS terminology (such as Security Groups and VPC), but as I'm not a fan of vendor lock-in, I try to include steps that are general enough that you should be able to adapt them to other providers.

    New Accounts Are (Relatively) Free; Use Them

    One of the big advantages of cloud infrastructure is the ability to compartmentalize your infrastructure. With a bunch of servers racked in the same rack, isolating them from one another can be difficult, but on cloud infrastructure, you can take advantage of the same technology that isolates one customer from another to isolate one of your infrastructure types from the others. Although this doesn't come completely for free (it adds some extra overhead when you set things up), it's worth it for the strong isolation it provides between environments.

    One of the first security measures you should put in place is separating each of your environments into its own high-level account. AWS allows you to generate a number of different accounts and connect them to a central billing account. This means you can isolate your development, staging and production environments (plus any others you may create) completely into their own individual accounts that have their own networks, their own credentials and their own roles totally isolated from the others. With each environment separated into its own account, you limit the damage attackers can do if they compromise one infrastructure to just that account. You also make it easier to see how much each environment costs by itself.

    In a traditional infrastructure where dev and production are together, it is much easier to create accidental dependencies between those two environments and have a mistake in one affect the other. Splitting environments into separate accounts protects them from each other, and that independence helps you identify any legitimate links that environments need to have with each other. Once you have identified those links, it's much easier to set up firewall rules or other restrictions between those accounts, just like you would if you wanted your infrastructure to talk to a third party.

    Lock Down Security Groups

    One advantage to cloud infrastructure is that you have a lot tighter control over firewall rules. AWS Security Groups let you define both ingress and egress firewall rules, both with the internet at large and between Security Groups. Since you can assign multiple Security Groups to a host, you have a lot of flexibility in how you define network access between hosts.

    My first recommendation is to deny all ingress and egress traffic by default and add specific rules to a Security Group as you need them. This is a fundamental best practice for network security, and it applies to Security Groups as much as to traditional firewalls. This is particularly important if you use the Default security group, as it allows unrestricted internet egress traffic by default, so that should be one of the first things to disable. Although disabling egress traffic to the internet by default can make things a bit trickier to start with, it's still a lot easier than trying to add that kind of restriction after the fact.
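    As a rough sketch of what that looks like with the AWS CLI (the group ID, port and CIDR below are placeholders, not anything from a real environment), you first strip the group's default allow-all egress rule and then add only the ingress you actually need:

    aws ec2 revoke-security-group-egress \
        --group-id sg-0123456789abcdef0 \
        --ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'

    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123456789abcdef0 \
        --protocol tcp --port 443 --cidr 203.0.113.0/24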

    You can make things very complicated with Security Groups; however, my recommendation is to try to keep them simple. Give each server role (for instance web, application, database and so on) its own Security Group that applies to each server in that role. This makes it easy to know how your firewall rules are being applied and to which servers they apply. If one server in a particular role needs different network permissions from the others, it's a good sign that it probably should have its own role.

    The role-based Security Group model works pretty well but can be inconvenient when you want a firewall rule to apply to all your hosts. For instance, if you use centralized configuration management, you probably want every host to be allowed to talk to it. For rules like this, I take advantage of the Default Security Group and make sure that every host is a member of it. I then use it (in a very limited way) as a central place to define any firewall rules I want to apply to all hosts. One rule I define in particular is to allow egress traffic to any host in the Default Security Group—that way I don't have to write duplicate ingress rules in one group and egress rules in another whenever I want hosts in one Security Group to talk to another.
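    That one shared rule might look something like the following sketch, where the placeholder group ID stands for your VPC's default Security Group in both places, so egress is allowed only to other members of that same group:

    aws ec2 authorize-security-group-egress \
        --group-id sg-0fedcba9876543210 \
        --ip-permissions '[{"IpProtocol":"-1","UserIdGroupPairs":[{"GroupId":"sg-0fedcba9876543210"}]}]'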

    Use Private Subnets

    On cloud infrastructure, you are able to define hosts that have an internet-routable IP and hosts that only have internal IPs. In AWS Virtual Private Cloud (VPC), you define these hosts by setting up a second set of private subnets and spawning hosts within those subnets instead of the default public subnets.

    Treat the default public subnet like a DMZ and put hosts there only if they truly need access to the internet. Put all other hosts into the private subnet. With this practice in place, even if hosts in the private subnet were compromised, they couldn't talk directly to the internet even if an attacker wanted them to, which makes it much more difficult to download rootkits or other persistence tools without setting up elaborate tunnels.
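    Here's a minimal sketch of that setup with the AWS CLI (the VPC, AMI and subnet IDs and the CIDR block are placeholders). The subnet stays private simply because its route table is never given a route to an internet gateway, and hosts launched into it get no public IP:

    aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 --cidr-block 10.0.2.0/24

    aws ec2 run-instances \
        --image-id ami-0123456789abcdef0 \
        --instance-type t2.micro \
        --subnet-id subnet-0123456789abcdef0 \
        --no-associate-public-ip-address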

    These days it seems like just about every service wants unrestricted access to web ports on some other host on the internet, but an advantage to the private subnet approach is that instead of working out egress firewall rules to specific external IPs, you can set up a web proxy service in your DMZ that has broader internet access and then restrict the hosts in the private subnet by hostname instead of IP. This has the added benefit of giving you a nice audit trail on the proxy host of all the external hosts your infrastructure is accessing.

    Use Account Access Control Lists Minimally

    AWS provides a rich set of access control list tools by way of IAM. This lets you set up very precise rules about which AWS resources an account or role can access using a very complicated syntax. While IAM provides you with some pre-defined rules to get you started, it still suffers from the problem all rich access control lists have—the complexity makes it easy to create mistakes that grant people more access than they should have.

    My recommendation is to use IAM only as much as is necessary to lock down basic AWS account access (like sysadmin accounts or orchestration tools for instance), and even then, to keep the IAM rules as simple as you can. If you need to restrict access to resources further, use access control at another level to achieve it. Although it may seem like giving somewhat broad IAM permissions to an AWS account isn't as secure as drilling down and embracing the principle of least privilege, in practice, the more complicated your rules, the more likely you will make a mistake.
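    As an illustration of keeping IAM policies small, here's a sketch of a policy that grants read-only access to a single S3 bucket (the policy name and bucket ARN are made up for the example):

    aws iam create-policy \
        --policy-name readonly-deploy-artifacts \
        --policy-document '{
          "Version": "2012-10-17",
          "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::example-deploy-artifacts",
                         "arn:aws:s3:::example-deploy-artifacts/*"]
          }]
        }'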

    Conclusion

    Cloud environments provide a lot of complex options for security; however, it's more important to set a good baseline of simple security practices that everyone on the team can understand. This article provides a few basic, common-sense practices that should make your cloud environments safer while not making them too complex.

    Feral Interactive Releases GameMode, YouTube Music Videos Hacked, Oregon Passes Net Neutrality Law and More

    News briefs for April 10, 2018.

    Feral Interactive today released GameMode, an open-source tool that helps Linux users get the best performance out of their games. According to the press release, "GameMode instructs your CPU to automatically run in Performance Mode when playing games." Rise of the Tomb Raider, which is being released later this month, will be the first release to integrate this tool. GameMode is available now via GitHub.

    If you are using ZFS On Linux 0.7.7, which was released in March, upgrade immediately to version 0.7.8 to keep your data safe. Version 0.7.8 is an emergency release to deal with a possible data loss issue, Phoronix reports. See the ZOL bug report for more info.

    YouTube was hacked this morning, and many popular music videos were defaced, including the video for the hit song Despacito, as well as videos by Shakira, Selena Gomez, Drake and Taylor Swift. According to the BBC story, "A Twitter account that apparently belongs to one of the hackers posted: 'It's just for fun, I just use [the] script 'youtube-change-title-video' and I write 'hacked'."

    Linux computer maker System76 is moving its manufacturing from China to Denver, Colorado. In an interview with opensource.com about the move and bringing manufacturing in-house, System76 marketing director Louisa Bisio said, "Creating a computer that is open source from the physical design to the OS is the next step in our mission to empower our customers and the community. We believe that by leading with open source design, the rest of the industry will have to follow."

    Oregon has become the second state to pass a net neutrality law. Governor Kate Brown signed the bill yesterday, "withholding state business from internet providers who throttle traffic, making the state the second to finalize a proposal aimed at thwarting moves by federal regulators to relax net neutrality requirements".

    Blockchain, Part I: Introduction and Cryptocurrency

    Blockchain, Part I: Introduction and Cryptocurrency
    Image
    Petros Koutoupis Mon, 04/09/2018 - 10:45

    It seems nearly impossible these days to open a news feed discussing anything technology- or finance-related and not see a headline or two covering bitcoin and its underlying framework, blockchain. But why? What makes both bitcoin and blockchain so exciting? What do they provide? Why is everyone talking about this? And, what does the future hold?

    In this two-part series, I introduce this now-trending technology, describe how it works and provide instructions for deploying your very own private blockchain network.

    Bitcoin and Cryptocurrency

    The concept of cryptocurrency isn't anything new, although with the prevalence of the headlines alluded to above, one might think otherwise. Invented and released in 2009 by an unknown party under the name Satoshi Nakamoto, bitcoin is one such cryptocurrency: it provides a decentralized method for engaging in digital transactions. It is also a global technology, which is a fancy way of saying that it's a worldwide payment system. Because the technology is decentralized, no single entity owns it or has the ability to impose regulations on it.

    But, what does that truly mean? Transactions are secure. This makes them more difficult to track and, therefore, difficult to tax. This is because these transactions are strictly peer-to-peer, without an intermediary in between. Sounds too good to be true, right? Well, it is that good.

    Although transactions are limited to the two parties involved, they do, however, need to be validated across a network of independently functioning nodes, called a blockchain. Using cryptography and a distributed public ledger, transactions are verified.

    Now, aside from making secure and more-difficult-to-trace transactions, what is the real appeal to these cryptocurrency platforms? In the case of bitcoin, a "bitcoin" is generated as a reward through the process of "mining". And if you fast-forward to the present, bitcoin has earned monetary value in that it can be used to purchase both goods and services, worldwide. Remember, this is a digital currency, which means no physical "coins" exist. You must keep and maintain your own cryptocurrency wallet and spend the money accrued with retailers and service providers that accept bitcoin (or any other type of cryptocurrency) as a method of payment.

    All hype aside, predicting the price of cryptocurrency is a fool's errand, and there's not a single variable driving its worth. One thing to note, however, is that cryptocurrency is not in any way a monetary investment in a real currency. Instead, buying into cryptocurrency is an investment into a possible future where it can be exchanged for goods and services—and that future may be arriving sooner than expected.

    Now, this doesn't mean cryptocurrency has no cash value. In fact, it does. As of the day I am writing this (January 27, 2018), a single bitcoin is $11,368.56 USD. This value is extremely volatile, and who knows what direction it will take tomorrow. One thing influencing the value of a bitcoin is the rate of adoption. More people using the technology results in more transactions being verified by the people-owned nodes forming the underlying blockchain. In turn, the owners of the verification systems earn their rewards, thereby increasing the value of the technology. It's simple: verify more transactions, and earn more money. Sure, there is a bit more to it, but that's the general idea.

    The owners of the verification systems are referred to as "miners". Miners provide a service of record keeping. Such a service requires a good amount of processing power to handle the cryptographic computations. The purpose of the miner is to keep the underlying blockchain consistent, complete and unaltered. A miner repeatedly verifies and collects broadcasted transactions into groups of transactions referred to as blocks. Using an SHA-256 algorithm (Secure Hash Algorithm 256-bit hash), each new block contains a cryptographic hash of the block prior to it, establishing a link for forming the chain of blocks, hence the name, blockchain.
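    To see the chaining idea in miniature, here's a toy shell sketch (nothing like a real bitcoin block header) in which each block's hash covers the previous hash plus the block's data, so every block is linked to the one before it:

    prev=0    # placeholder "genesis" value
    for data in "alice pays bob 5" "bob pays carol 2" "carol pays dave 1"; do
        hash=$(printf '%s%s' "$prev" "$data" | sha256sum | awk '{print $1}')
        echo "data: $data"
        echo "hash: $hash"
        prev=$hash
    done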

    Figure 1. An Example of How Blocks of Data Are "Chained" to One Another

    A Global "Crisis"

    With the rise of cryptocurrency and the rise of miners competing to earn their fair share of the digital currency, we are now facing a dilemma—a global shortage of high-end PC graphics adapters. Even previously used adapters are resold at a much higher price than newly boxed versions. But why is that? Using such high-end cards with enough onboard memory and dedicated processing capabilities easily can yield several dollars in cryptocurrency per day. Remember, mining requires the processing of memory-hungry algorithms. And as cryptocurrency prices continue to increase, often at a rapid rate, the worth of the digital currency awarded to miners also increases. This shortage of graphics adapters has become an increasing bottleneck for existing miners looking to expand their operations or for new miners to get in on the action. Hopefully, graphics card vendors will address this shortage sooner rather than later.

    Comparing Blockchain Technologies

    Multiple platforms exist for crypto-trading. You may come across articles discussing bitcoin and comparing that currency to others like ethereum or litecoin. Initially, those articles can lead to confusion between the two different types of digital coins: 1) cryptocurrencies and 2) tokens. The key things to remember are the following:

    • A bitcoin or litecoin or any other form of cryptocurrency actively competes against existing money and gold in the hopes of replacing them as an accepted form of global currency. As mentioned previously, the technology promises a non-regulated and globally accessible currency—one that contains the same stable value regardless of location. This concept definitely could appeal to those living in unstable countries with unstable currencies.

    • And ethereum? Well, it deals in tokens. It works on the idea of contracts. Ethereum is a platform that allows its users to write conditional digital "smart contracts", showing proof of a transaction that never can be deleted.

    In the modern world, a traditionally written contract will outline the terms of a relationship, usually enforceable by law. A smart contract will enforce a relationship using cryptographic code—that is, by executing the conditions defined by its creators using a program. What makes ethereum more interesting is that unlike bitcoin (or litecoin for that matter), the platform does not limit itself to the currency use case.

    Much like bitcoin, when a transaction takes place utilizing one or more of these contracts, transaction fees are charged to source the computation power required. The more computational power needed, the higher the fee.

    What Is Blockchain?

    To understand this cryptocurrency phenomenon and its explosive growth in popularity, you need to understand the technology supporting it: the blockchain. As mentioned previously, a blockchain consists of a continuously growing list of records captured in the form of blocks. Using cryptography, each new block is linked and secured to an existing chain of blocks.

    Each block will contain a hash pointer to the previous block within the chain, a timestamp and transactional data. By design, the blockchain is resistant to any sort of modification of data. This is because a blockchain provides an open and distributed ledger to record transactions between two interested parties efficiently, reliably and permanently.

    Once data has been recorded, the data in a given block cannot be altered without altering all subsequent blocks.
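    Continuing the toy sketch from above (still purely illustrative), you can see why: change any earlier block and every hash computed after it, including the final one, no longer matches:

    blocks=("alice pays bob 5" "bob pays carol 2" "carol pays dave 1")

    chain_hash() {                       # recompute the hash of the whole chain
        local prev=0 b
        for b in "$@"; do
            prev=$(printf '%s%s' "$prev" "$b" | sha256sum | awk '{print $1}')
        done
        echo "$prev"
    }

    original=$(chain_hash "${blocks[@]}")
    blocks[0]="alice pays bob 500"       # tamper with an early block
    [ "$(chain_hash "${blocks[@]}")" = "$original" ] || echo "chain has been altered"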

    I guess you can think of this as a distributed "database" where its contents are duplicated hundreds, if not thousands, of times across a network of computers. This method of replication emphasizes the decentralized aspect of the technology. Without a centralized version or a single "master" copy, this database is public and, therefore, can be verified easily without risk or fear of hacking or corruption. Simultaneously hosted by millions of computing nodes, the contents of this database are accessible to anyone on the internet. As an added benefit, the distributed and decentralized model reassures its users that no single point of failure exists. Imagine that one or more of these computing nodes are either inaccessible or experiencing some sort of internal failures or are even producing corrupted data. The blockchain is resilient in that it will continue to make available the requested data contents and in their proper (that is, uncorrupted) format. This is because of a technique commonly referred to as the Byzantine Fault Tolerance method.

    Byzantine Fault Tolerance

    Systems fail, and they can fail for multiple reasons (such as hardware, software, power, networking connectivity and others). This is a fact. Also, not all failures are easily detectable (even through traditional fault-tolerance mechanisms) nor will they always appear the same to the rest of the systems in the networked cluster. Again, imagine a large network consisting of hundreds, if not thousands, of nodes. To handle such unpredictable conditions, one must employ a voting system to ensure that the cluster will tolerate the failure or misbehavior.

    A Byzantine fault is defined by any fault showcasing different types of symptoms to different observers (that is, distributed computing systems). A Byzantine failure is the loss of a system service due to a Byzantine fault in an environment where a consensus must be reached in order to perform that one service or operation.

    The purpose of Byzantine Fault Tolerance (BFT) is to defend the distributed platform against such Byzantine failures. Failing components of the system will not prevent the remaining components from reaching an agreement among themselves, where such an agreement is required to perform an operation. Correctly functioning components of a BFT system will continue to provide uninterrupted service, assuming that not too many faults exist.

    The name of this mechanism is derived from the Byzantine Generals' Problem (BGP). The BGP highlights an agreement problem, in which the participating members do not all agree. Imagine a scenario where several divisions of the Byzantine army are camped outside a fortified city. Each division has its own general, and the only way the generals are able to communicate with each other is through the use of messengers. The generals need to decide on a common plan of action. The problem is, some of the generals may very well be traitors. With one traitor in their midst, can the non-traitors decide on a common plan?

    In a BFT environment, the answer to this question is yes. In a group of three, a single traitor cannot prevent the loyal generals from reaching a majority consensus. For instance, if one general says "attack" while the other two say "retreat", it is easy to determine who the traitor of the group is, and the non-traitors can still reach an agreement. Now, apply this concept to a distributed network of computing nodes. For example, when f nodes go Byzantine, a cluster of 2f + 1 nodes can still tolerate the misbehavior: all you need is one more properly functioning node than the number of potentially faulty nodes.
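    As a toy illustration of that majority framing (three generals, one traitor), a simple vote count is all it takes to recover the loyal generals' plan:

    votes=("retreat" "retreat" "attack")            # one traitorous vote out of 2f + 1 = 3
    printf '%s\n' "${votes[@]}" | sort | uniq -c | sort -rn | head -1
    # prints "2 retreat": the plan agreed on by the loyal majority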

    Figure 2. The Byzantine Generals' Problem illustrated

    Now, why am I talking about this? BFT is at the core of a blockchain's resiliency. If a consensus cannot be reached to handle a transaction, the blockchain itself is no good.

    The Network

    A network consisting of computing nodes is what makes up the blockchain. A node gets an identical copy of the blockchain as soon as it joins the network. Each node is considered an administrator of the blockchain, and no node has any more control than the others within the cluster—again, a result of the network being decentralized.

    Figure 3. An Example of a Decentralized Blockchain Network

    This method of computing is what lends the blockchain its robustness. Aside from updating the blockchain, each node can and will act independently of the others, regardless of how it was accessed. And when it needs to append a new block to the chain, it will broadcast the update to the rest of the nodes (updating the public ledger).

    Whatever the user-driven event, it is considered to be a function of the network as a whole. It is the global network that manages the application, and it will operate on a user-to-user or peer-to-peer basis. Each node, when accessed independently, is tasked with confirming the requested transaction (such as mining). As alluded to previously, it is this core concept that makes the blockchain that much more secure. The blockchain technology eliminates the risks (and vulnerabilities) introduced when data is held (or managed) centrally and not replicated across the network. Another way to think of it is this: instead of having a single entity validate the transaction, you now have multiple entities validating the transaction after reaching a consensus. They act as witnesses, and not one single entity has more authority over the other. This leaves no room for ambiguity, and if one or more nodes misrepresents the original data, the BFT model will address that.

    Almost everyone reading this is familiar with the constant security problems running rampant on the internet. We personally attempt to protect both our identity and our assets online by relying on the traditional "user name" and "password" systems. Blockchain takes this a step further and differs in that its security stems from its use of encryption technologies. The authentication "problem" is solved with the generation of "keys". A user will create a public key (a long and randomly generated numeric string) and a private key (which acts like a password). The public key serves as the user's address within the blockchain, and any transaction involving that address will be recorded as belonging to that address. The private key gives its owner access to his or her digital assets. The combination of public and private keys provides a digital signature. The only concern here is taking the appropriate measures to protect private keys.
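    The mechanics are the same ones any public-key toolkit provides. Here's a small illustrative sketch with OpenSSL (the file names are arbitrary, and real cryptocurrencies use different key types than the RSA key shown here): the private key signs a transaction, and anyone holding the public key can verify that signature:

    openssl genrsa -out wallet-private.pem 2048                        # private key: keep secret
    openssl rsa -in wallet-private.pem -pubout -out wallet-public.pem  # public key: share freely

    echo "alice pays bob 5" > tx.txt
    openssl dgst -sha256 -sign wallet-private.pem -out tx.sig tx.txt
    openssl dgst -sha256 -verify wallet-public.pem -signature tx.sig tx.txt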

    Putting the Pieces Together

    By now, you should have more of a complete picture of how all of these components tie together.

    Figure 4. The General Handling of a Transaction across a Blockchain Network

    For example, let's say there's a bitcoin transaction (although it could be something else entirely): someone in the network requests a transaction. This requested transaction is then broadcasted across a peer-to-peer network of computing nodes. Using cryptographic algorithms, the network of nodes validates the user's status and the transaction. Once verified, the transaction is combined with other transactions, creating a new block of data for the public ledger. The new block of data is then appended to the existing blockchain in a way that makes it permanent and unalterable. Then the transaction is complete. Using timestamping schemes, all transactions are serialized.

    What Makes Blockchain Important?

    Much like TCP/IP, the blockchain is a foundation technology. As TCP/IP enabled the internet by the 1990s, you can expect wonderful new beginnings with the blockchain. It is still a bit too early to see how it will evolve. This revolutionary technology has enabled organizations to explore what it can and will mean for their businesses. Many of these same organizations already have begun the exploration, although it primarily has been focused around financial services. The possibilities are enormous, and it seems that any industry dealing with any sort of transaction-based model will be disrupted by the technology.

    Summary

    This article covers the rise and interest in cryptocurrencies and begins to dive into the underlying blockchain technology that enables it. In the next part of this series, using open-source tools, I start to describe how to build your very own private blockchain network. This private deployment will allow you to dig deeper into the details highlighted here. The technology may be centered around cryptocurrency today, but I also look at various industries the blockchain can help to redefine and the potential for a promising future leveraging the technology.