Miner One Is Launching Its Bitcoin-Mining High-Altitude Balloon Today, New Stable Version of GIMP and More

News briefs for April 30, 2018.

Miner One announced via press release that it is launching its bitcoin-mining balloon today. You can watch the launch on Facebook. Space Miner One is a capsule and high-altitude balloon that will "perform data-mining operations at the edge of space". Miner One's goal is to "remind people that cryptocurrency is really about the future and the revolutionary technology at its heart: so-called blockchain technology."

After six years of development, a new stable version of GIMP has been released. Version 2.10.0 has a new default Dark theme and supports HIDPI displays and the GEGL image processing library. GIMP 2.10.0 also includes new tools, better file format support and an upgraded user interface, among other things. See the release notes for all the details.

The European Union wants online platforms to incorporate their own bot-detection mechanisms, TechCrunch reports. This is "as part of a voluntary Code of Practice the European Commission now wants platforms to develop and apply—by this summer—as part of a wider package of proposals it's put out which are generally aimed at tackling the problematic spread and impact of disinformation online."

Ubuntu "Budgie" announced its very first LTS version. This release features "more customisation options via budgie welcome, lots more Budgie Applets available to be installed, dynamic Workspaces, hot-corners and Window Shuffler" and more. See the release notes for more info, and go here to download.

Linux Mint published its monthly update this morning. The report details what the Mint team is working on, including adapting to the Debian Stretch and Ubuntu Bionic package bases, finalizing Cinnamon 3.8, adding new features to Mint tools, working on documentation, plans for Mint 19 and LMDE 3 and more.

Working around Intel Hardware Flaws

Zack Brown Mon, 04/30/2018 - 07:07

Efforts to work around serious hardware flaws in Intel chips are ongoing. Nadav Amit posted a patch to improve compatibility mode with respect to Intel's Meltdown flaw. Compatibility mode is when the system emulates an older CPU in order to provide a runtime environment that supports an older piece of software that relies on the features of that CPU. The trick is to avoid also emulating the massive security holes that hardware flaws opened up in that older chip.

In this case, Linux is already protected from Meltdown by use of PTI (page table isolation), a patch that went into Linux 4.15 and that was subsequently backported all over the place. However, like the BKL (big kernel lock) in the old days, PTI is a heavy-weight solution, with a big impact on system speed. Any chance to disable it without reintroducing security holes is a chance worth exploring.

Nadav's patch was an attempt to do this. The goal was "to disable PTI selectively as long as x86-32 processes are running and to enable global pages throughout this time."

One problem that Nadav acknowledged was that since so many developers were actively working on anti-Meltdown and anti-Spectre patches, there was plenty of opportunity for one patch to step all over what another was trying to do. As a result, he said, "the patches are marked as an RFC since they (specifically the last one) do not coexist with Dave Hansen's enabling of global pages, and might have conflicts with Joerg's work on 32-bit."

Andrew Cooper remarked, chillingly:

Being 32bit is itself sufficient protection against Meltdown (as long as there is nothing interesting of the kernel's mapped below the 4G boundary). However, a 32bit compatibility process may try to attack with Spectre/SP2 to redirect speculation back into userspace, at which point (if successful) the pipeline will be speculating in 64bit mode, and Meltdown is back on the table. SMEP will block this attack vector, irrespective of other SP2 defenses the kernel may employ, but a fully SP2-defended kernel doesn't require SMEP to be safe in this case.

And Dave, nearby, remarked, "regardless of Meltdown/Spectre, SMEP is valuable. It's valuable to everything, compatibility-mode or not."

SMEP (Supervisor Mode Execution Prevention) is a hardware feature: the OS sets a bit in a control register on compatible CPUs, and the processor then refuses to execute code from userspace memory pages while it is running in supervisor (kernel) mode. That blocks attacks that try to trick the kernel into jumping into attacker-controlled user memory.

Andy Lutomirski said he didn't like Nadav's patch because it drew a distinction between "compatibility mode" tasks and "non-compatibility mode" tasks. Andy said no such distinction should be made, especially since it's not really clear how to make that distinction reliably, and because getting it wrong could expose significant security holes.

Andy felt that a better solution would be to enable and disable 32-bit mode and 64-bit mode explicitly as needed, rather than guessing at what might or might not be compatibility mode.

The drawback to this approach, Andy said, was that old software would need to be upgraded to take advantage of it, whereas with Nadav's approach, the judgment would be made automatically and would not require old code to be updated.

Linus Torvalds was not optimistic about any of these ideas. He said, "I just feel this all is a nightmare. I can see how you would want to think that compatibility mode doesn't need PTI, but at the same time it feels like a really risky move to do this." He added, "I'm not seeing how you keep user mode from going from compatibility mode to L mode with just a far jump."

In other words, the whole patch, and any alternative, may simply be a bad idea.

Nadav replied that with his patch, he tried to cover every conceivable case where someone might try to break out of compatibility mode and to re-enable PTI protections if that were to happen. Though he did acknowledge, "There is one corner case I did not cover (LAR) and Andy felt this scheme is too complicated. Unfortunately, I don't have a better scheme in mind."

Linus remarked:

Sure, I can see it working, but it's some really shady stuff, and now the scheduler needs to save/restore/check one more subtle bit.

And if you get it wrong, things will happily work, except you've now defeated PTI. But you'll never notice, because you won't be testing for it, and the only people who will are the black hats.

This is exactly the "security depends on it being in sync" thing that makes me go "eww" about the whole model. Get one thing wrong, and you'll blow all the PTI code out of the water.

So now you tried to optimize one small case that most people won't use, but the downside is that you may make all our PTI work (and all the overhead for all the _normal_ cases) pointless.

And Andy also remarked, "There's also the fact that, if this stuff goes in, we'll be encouraging people to deploy 32-bit binaries. Then they'll buy Meltdown-fixed CPUs (or AMD CPUs!) and they may well continue running 32-bit binaries. Sigh. I'm not totally a fan of this."

The whole thread ended inconclusively, with Nadav unsure whether folks wanted a new version of his patch.

The bottom line seems to be that Linux currently protects itself from Intel's hardware flaws, but at a cost of perhaps 5% to 30% in performance (the real number depends on how you use your system). And although it will be complex and painful, there is a very strong incentive to improve performance by adding subtler and more complicated workarounds that avoid the heavy-handed approach of the PTI patch. Ultimately, Linux will certainly develop a smooth, near-optimal approach to Meltdown and Spectre, and probably do away with PTI entirely, just as it did away with the BKL in the past. Until then, we're in for some very ugly and controversial patches.

Note: If you're mentioned above and want to post a response above the comment section, send a message with your response text to ljeditor@linuxjournal.com.

Weekend Reading: Privacy

Most people simply are unaware of how much personal data they leak on a daily basis as they use their computers. Enter this weekend's reading topic: Privacy.

FOSS Project Spotlight: Tutanota, the First Encrypted Email Service with an App on F-Droid by Matthias Pfau

Tutanota was built seven years ago as an encrypted email service with a strong focus on security, privacy and open source. Long before the Snowden revelations, Tutanota's team felt there was a need for easy-to-use encryption that would allow everyone to communicate online without being snooped upon.

The Wire by Shawn Powers

In the US, there has been recent concern over ISPs turning over logs to the government. During the past few years, awareness that governments and others snoop on our private data really has made encryption more popular than ever before. One of the problems with encryption, however, is that it's generally not user-friendly to add its protection to your conversations. Thankfully, messaging services are starting to take notice of the demand. I need a messaging service that works across multiple platforms, encrypts automatically, supports group messaging and ideally can handle audio/video as well. Thankfully, I found an incredible open-source package that ticks all my boxes: Wire.

Facebook Compartmentalization by Kyle Rankin

Whenever people talk about protecting privacy on the internet, social-media sites like Facebook inevitably come up—especially right now. It makes sense—social networks (like Facebook) provide a platform where you can share your personal data with your friends, and it doesn't come as much of a surprise to people to find out they also share that data with advertisers (it's how they pay the bills after all). It makes sense that Facebook uses data you provide when you visit that site. What some people might be surprised to know, however, is just how much Facebook tracks them when they aren't using Facebook itself but just browsing around the web.

Some readers may solve the problem of Facebook tracking by saying "just don't use Facebook"; however, for many people, that site may be the only way they can keep in touch with some of their friends and family members. Although I don't post on Facebook much myself, I do have an account and use it to keep in touch with certain friends. So in this article, I explain how I employ compartmentalization principles to use Facebook without leaking too much other information about myself.

Protection, Privacy and Playoffs by Shawn Powers

Randomly Switching Upper and Lowercase in a Shell Script

Dave Taylor Fri, 04/27/2018 - 16:49

Dave wraps up the shell-script L33t generator

Last time, I talked about what's known informally as l33t-speak, a series of letter and letter-pair substitutions that marks the jargon of the hacker elite (or some subset of hacker elite, because I'm pretty sure that real computer security experts don't need to substitute vowels with digits to sound cool and hip).

Still, it was an interesting exercise as a shell-scripting problem, because it's surprisingly simple to adapt a set of conversion rules into a sequence of commands. I sidestepped one piece of it, however, and that's what I want to poke around with in this article: changing uppercase and lowercase letters somewhat randomly.

This is where "Linux Journal" might become "LiNUx jOurNAl", for example. Why? Uhm, because it's a puzzle to solve. Jeez, you ask such goofy questions of me!

Breaking Down a Line Letter by Letter

The first and perhaps most difficult task is to take a line of input and break it down letter by letter so each can be analyzed and randomly transliterated. There are lots of ways to accomplish this in Linux (of course), but I'm going to use the built-in Bash substring variable reference sequence. It looks like this:


${variable:index:length}

So to get just the character at index 9 of variable input, for example, I could use ${input:9:1} (Bash substring indexes start at 0). Bash also has another handy variable reference that produces the length of the value of a particular variable: ${#variable}. Put the two together, and here's the basic initialization and loop:


input="$*"
length="${#input}"

while [ $charindex -lt $length ]
do
    char="${input:$charindex:1}"
    # conversion occurs here
    newstring="${newstring}$char"
    charindex=$(( $charindex + 1 ))
done

With charindex starting at 0 and newstring starting out empty, you can see how this quickly steps through every character, adding it to newstring. "Conversion occurs here" is not very exciting, but that's the placeholder you need.

Lower, Meet Upper, and Vice Versa

Last time, I also showed a quick and easy way to choose a random number between 0 and 9, so you can sometimes have something happen and other times not happen, with this command:


doit=$(( $RANDOM % 10 ))       # random virtual coin flip

Let's say there's only a 30% chance that an uppercase letter will convert to lowercase, but a 50% chance that a lowercase letter will become uppercase. How do you code that? To start, let's get the basic tests:


if [ -z "$(echo "$char" | sed -E 's/[[:lower:]]//')" ]
then
   # it's a lowercase character
elif [ -z "$(echo "$char" | sed -E 's/[[:upper:]]//')" ]
then
   # it's uppercase
fi

This is a classic shell-script trick: to ascertain whether a character is a member of a class, delete any character of that class from it, then test whether the resultant string is empty (the -z test).
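
To see the trick in isolation, compare what the substitution leaves behind for an uppercase versus a lowercase input (a quick check you can run yourself):


$ echo "G" | sed -E 's/[[:lower:]]//'
G
$ echo "g" | sed -E 's/[[:lower:]]//'

$

The uppercase "G" survives, so the -z test fails; the lowercase "g" is deleted, the result is empty, and the -z test succeeds.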

The last bit's easy. Generate the random number, then if it's below the threshold, transliterate the char; otherwise, do nothing. Thus:


if [ -z "$(echo "$char" | sed -E 's/[[:lower:]]//')" ]
then
  # lowercase. 50% chance we'll change it
  if [ $doit -lt 5 ] ; then
    char="$(echo $char | tr '[[:lower:]]' '[[:upper:]]')"
  fi
elif [ -z "$(echo "$char" | sed -E 's/[[:upper:]]//')" ]
then
  # uppercase. 30% chance we'll change it
  if [ $doit -lt 3 ] ; then
    char="$(echo $char | tr '[[:upper:]]' '[[:lower:]]')"
  fi
fi

Put it all together and you have this Frankenstein's monster of a script:
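
Assembled into a single file, the pieces above come together something like this (a reconstruction; the original changecase.sh may differ in small details):


#!/bin/bash
# changecase.sh - randomly flip the case of letters in the input

input="$*"
length="${#input}"
charindex=0
newstring=""

while [ $charindex -lt $length ]
do
    char="${input:$charindex:1}"
    doit=$(( $RANDOM % 10 ))       # random virtual coin flip

    if [ -z "$(echo "$char" | sed -E 's/[[:lower:]]//')" ]
    then
      # lowercase: 50% chance we'll change it
      if [ $doit -lt 5 ] ; then
        char="$(echo $char | tr '[[:lower:]]' '[[:upper:]]')"
      fi
    elif [ -z "$(echo "$char" | sed -E 's/[[:upper:]]//')" ]
    then
      # uppercase: 30% chance we'll change it
      if [ $doit -lt 3 ] ; then
        char="$(echo $char | tr '[[:upper:]]' '[[:lower:]]')"
      fi
    fi

    newstring="${newstring}$char"
    charindex=$(( $charindex + 1 ))
done

echo "$newstring"

Running it produces output along these lines: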


$ sh changecase.sh Linux Journal is a great read.
LiNuX JoURNal is a GrEaT ReAd.
$ !!
LINuX journAl IS a gREat rEAd
$

Now you're ready to write some ransom notes, it appears!

Mozilla’s New Mixed Reality Hubs, NanoPi K1 Plus, Wireshark Update and More

News briefs for April 27, 2018.

Mozilla announced yesterday a preview release of Hubs, a "new way to get together online within Mixed Reality, right in your browser." This preview lets you easily create a "room" online with a click, and you then can meet with others by sharing the link. When they open the link, they enter your room as avatars. Mozilla notes that this is "All with no app downloads, walled gardens, or content gatekeepers, and on any device you wish—and most importantly, through open source software that respects your privacy and is built on web standards."

The NanoPi K1 Plus from FriendlyElec is a new Raspberry Pi competitor. The NanoPi K1 Plus costs $35 and shares the same form factor as the RPi 3, but it has double the RAM, Gigabit Ethernet and 4K video playback. See the wiki for more details. (Source: ZDNet.)

Wireshark, the popular network protocol analyzer, just released version 2.6. This release brings many new or significantly updated features since version 2.5, including support for HTTP Request sequences, support for MaxMind DB files and much more. Download Wireshark from here.

Fedora announced that it now has a "curated set of third-party repositories" containing software that's not normally available in Fedora, such as Google Chrome, PyCharm and Steam. Fedora usually includes only free and open-source software, but with this new third-party repository, users can "opt-in" to these select extras.

Red Hat Enterprise Linux 6.10 beta is now available. According to the release announcement, "6.10 Beta is designed to support the next generation of cloud-native applications through an updated Red Hat Enterprise Linux 6 base image. The Red Hat Enterprise Linux 6.10 Beta base image enables customers to migrate their existing Red Hat Enterprise Linux 6 workloads into container-based applications - suitable for deployment on Red Hat Enterprise Linux 7, Red Hat Enterprise Linux Atomic Host, and Red Hat OpenShift Container Platform." See a full list of the changes here.

A Good Front End for R

Joey Bernard Thu, 04/26/2018 - 09:30

R is the de facto statistical package in the Open Source world. It's also quickly becoming the default data-analysis tool in many scientific disciplines.

R's core design includes a central processing engine that runs your code, with a very simple interface to the outside world. This basic interface means it's been easy to build graphical interfaces that wrap the core portion of R, so lots of options exist that you can use as a GUI.

In this article, I look at one of the available GUIs: RStudio. RStudio is a commercial program, with a free community version, available for Linux, macOS and Windows, so your data-analysis work should port easily regardless of environment.

For Linux, you can install the main RStudio package from the download page. From there, you can download RPM files for Red Hat-based distributions or DEB files for Debian-based distributions, then use either rpm or dpkg to do the installation.

For example, in Debian-based distributions, use the following to install RStudio:


sudo dpkg -i rstudio-xenial-1.1.423-amd64.deb

It's important to note that RStudio is only the GUI interface. This means you need to install R itself as a separate step. Install the core parts of R with:


sudo apt-get install r-base

There's also a community repository of available packages, called CRAN, that can add huge amounts of functionality to R. You'll want to install at least some of them in order to have some common tools to use:


sudo apt-get install r-recommended

There are equivalent commands for RPM-based distributions too.
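
For example, on a Fedora-style system, the rough equivalents would be the following (the RStudio filename here is only illustrative; substitute the RPM you actually downloaded):


$ sudo rpm -i rstudio-1.1.423-x86_64.rpm
$ sudo dnf install R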

At this point, you should have a complete system to do some data analysis.

When you first start RStudio, you'll see a window that looks somewhat like Figure 1.


Figure 1. RStudio creates a new session, including a console interface to R, where you can start your work.

The main pane of the window, on the left-hand side, provides a console interface where you can interact directly with the R session that's running in the back end.

The right-hand side is divided into two sections, where each section has multiple tabs. The default tab in the top section is an environment pane. Here, you'll see all the objects that have been created and exist within the current R session.

The other two tabs provide the history of every command given and a list of any connections to external data sources.

The bottom pane has five tabs available. The default tab gives you a file listing of the current working directory. The second tab provides a plot window where any data plots you generate are displayed. The third tab provides a nicely ordered view into R's library system. It shows a list of all of the currently installed libraries, along with tools to manage updates and install new libraries. The fourth tab is the help viewer. R includes a very complete and robust help system modeled on Linux man pages. The last tab is a general "viewer" pane to view other types of objects.

One part of RStudio that's a great help to people managing multiple areas of research is the ability to use projects. Clicking the menu item File→New Project pops up a window where you can select how your new project will exist on the filesystem.


Figure 2. When you create a new project, it can be created in a new directory, an existing directory or be checked out from a code repository.

As an example, let's create a new project hosted in a local directory. The file display in the bottom-right pane changes to the new directory, and you should see a new file named after the project name, with the filename ending .Rproj. This file contains the configuration for your new project. Although you can interact with the R session directly through the console, doing so doesn't really lead to easily reproduced workflows. A better solution, especially within a project, is to open a script editor and write your code within a script file. This way you automatically have a starting point when you move beyond the development phase of your research.

When you click File→New File→R Script, a new pane opens in the top left-hand side of the window.


Figure 3. The script editor allows you to construct more complicated pieces of code than is possible using just the console interface.

From here, you can write your R code with all the standard tools you'd expect in a code editor. To execute this code, you have two options. The first is simply to click the run button in the top right of this editor pane. This will run either the single line where the cursor is located or an entire block of code that previously had been highlighted.


Figure 4. You can enter code in the script editor and then have it run to make code development and data analysis a bit easier on your brain.

If you have an entire script file that you want to run as a whole, you can click the source button in the top right of the editor pane. This lets you reproduce analysis that was done at an earlier time.

The last item to mention is data visualization in RStudio. Actually, the visualization itself is handled by other libraries within R. There is a very complete, and complex, graphics capability within the core of R. For normal humans, several libraries are built on top of this. One of the most popular, and for good reason, is ggplot. If it isn't already installed on your system, you can get it with:


install.packages(c('ggplot2'))

Once it's installed, you can make a simple scatter plot of two numeric vectors (called a and b here) with this:


library(ggplot2)
# a and b: the numeric vectors to plot (example data filled in here)
a <- 1:100
b <- a + rnorm(100)
c <- data.frame(x=a, y=b)
ggplot(c, aes(x=x, y=y)) + geom_point()

As you can see, ggplot takes dataframes as the data to plot, and you control the display with aes() function calls and geom function calls. In this case, I used the geom_point() function to get a scatter plot of points. The plot then is generated in the bottom-left pane.


Figure 5. ggplot2 is one of the most powerful and popular graphing tools available in the R environment.

There's a lot more functionality available in RStudio, including a server portion that can be run on a cluster, allowing you to develop code locally and then send it off to a server for the actual processing.

Official Ubuntu 18.04 LTS Release, Gmail Redesign, New Cinnamon 3.8 Desktop and More

News briefs for April 26, 2018.

Ubuntu 18.04 "Bionic Beaver" LTS is scheduled to be released officially today. This release features major changes, including kernel version updated to 4.15, GNOME instead of Unity, Python 2 no longer installed by default and much more. According to the release announcement "Ubuntu Server 18.04 LTS includes the Queens release of OpenStack including the clustering enabled LXD 3.0, new network configuration via netplan.io, and a next-generation fast server installer. See the Release Notes for download links.

Google's new Gmail redesign launched yesterday, with several new privacy features including a new "confidential mode", which allows users to set an expiration date for private email, and "integrated rights management", which lets users block forwarding, copying, downloading or printing certain messages. See the story on The Verge for more information.

openSUSE Tumbleweed had four snapshots released this week, including new updates for the kernel, Mesa, KDE Frameworks and a major version update of libglvnd. See the post on openSUSE for all the details.

The Cinnamon 3.8 desktop has been released and is already available in some repositories, including Arch Linux. Cinnamon 3.8, which is scheduled to ship with Linux Mint 19 "Tara" later this summer, "brings numerous improvements, new features, and lots of Python 3 ports for a bunch of components." (Source: Softpedia News.)

Attackers subverted Amazon's domain-resolution service on Tuesday, and according to an Ars Technica report, "masqueraded as cryptocurrency website MyEtherWallet.com and stole about $150,000 in digital coins from unwitting end users. They may have targeted other Amazon customers as well."

ONNX: the Open Neural Network Exchange Format

Braddock Gaskill Wed, 04/25/2018 - 09:19

An open-source battle is being waged for the soul of artificial intelligence. It is being fought by industry titans, universities and communities of machine-learning researchers world-wide. This article chronicles one small skirmish in that fight: a standardized file format for neural networks. At stake is the open exchange of data among a multitude of tools instead of competing monolithic frameworks.

The good news is that the battleground is Free and Open. None of the big players are pushing closed-source solutions. Whether it is Keras and TensorFlow backed by Google, MXNet by Apache endorsed by Amazon, or Caffe2 or PyTorch supported by Facebook, all solutions are open-source software.

Unfortunately, while these projects are open, they are not interoperable. Each framework constitutes a complete stack that until recently could not interface in any way with any other framework. A new industry-backed standard, the Open Neural Network Exchange format, could change that.

Now, imagine a world where you can train a neural network in Keras, run the trained model through the NNVM optimizing compiler and deploy it to production on MXNet. And imagine that is just one of countless combinations of interoperable deep learning tools, including visualizations, performance profilers and optimizers. Researchers and DevOps no longer need to compromise on a single toolchain that provides a mediocre modeling environment and so-so deployment performance.

What is required is a standardized format that can express any machine-learning model and store trained parameters and weights, readable and writable by a suite of independently developed software.

Enter the Open Neural Network Exchange Format (ONNX).

The Vision

To understand the drastic need for interoperability with a standard like ONNX, we first must understand the ridiculous requirements we have for existing monolithic frameworks.

A casual user of a deep learning framework may think of it as a language for specifying a neural network. For example, I want 100 input neurons, three fully connected layers each with 50 ReLU outputs, and a softmax on the output. My framework of choice has a domain language to specify this (like Caffe) or bindings to a language like Python with a clear API.

However, the specification of the network architecture is only the tip of the iceberg. Once a network structure is defined, the framework still has a great deal of complex work to do to make it run on your CPU or GPU cluster.

Python, obviously, doesn't run on a GPU. To make your network definition run on a GPU, it needs to be compiled into code for the CUDA (NVIDIA) or OpenCL (AMD and Intel) APIs or processed in an efficient way if running on a CPU. This compilation is complex and why most frameworks don't support both NVIDIA and AMD GPU back ends.

The job is still not complete though. Your framework also has to balance resource allocation and parallelism for the hardware you are using. Are you running on a Titan X card with more than 3,000 compute cores, or a GTX 1060 with far less than half as many? Does your card have 16GB of RAM or only 4? All of this affects how the computations must be optimized and run.

And still it gets worse. Do you have a cluster of 50 multi-GPU machines on which to train your network? Your framework needs to handle that too. Network protocols, efficient allocation, parameter sharing—how much can you ask of a single framework?

Now you say you want to deploy to production? You wish to scale your cluster automatically? You want a solid language with secure APIs?

When you add it all up, it seems absolutely insane to ask one monolithic project to handle all of those requirements. You cannot expect the authors who write the perfect network definition language to be the same authors who integrate deployment systems in Kubernetes or write optimal CUDA compilers.

The goal of ONNX is to break up the monolithic frameworks. Let an ecosystem of contributors develop each of these components, glued together by a common specification format.

The Ecosystem (and Politics)

Interoperability is a healthy sign of an open ecosystem. Unfortunately, until recently, it did not exist for deep learning. Every framework had its own format for storing computation graphs and trained models.

Late last year that started to change. The Open Neural Network Exchange format initiative was launched by Facebook, Amazon and Microsoft, with support from AMD, ARM, IBM, Intel, Huawei, NVIDIA and Qualcomm. Let me rephrase that as everyone but Google. The format has been included in most well known frameworks except Google's TensorFlow (for which a third-party converter exists).

This seems to be the classic scenario where the clear market leader, Google, has little interest in upending its dominance for the sake of openness. The smaller players are banding together to counter the 500-pound gorilla.

Google is committed to its own TensorFlow model and weight file format, SavedModel, which shares much of the functionality of ONNX. Google is building its own ecosystem around that format, including TensorFlow Serving, Estimator and Tensor2Tensor to name a few.

The ONNX Solution

Building a single file format that can express all of the capabilities of all the deep learning frameworks is no trivial feat. How do you describe convolutions or recurrent networks with memory? Attention mechanisms? Dropout layers? What about embeddings and nearest neighbor algorithms found in fastText or StarSpace?

ONNX cribs a note from TensorFlow and declares everything is a graph of tensor operations. That statement alone is not sufficient, however. Dozens, perhaps hundreds, of operations must be supported, not all of which will be supported by all other tools and frameworks. Some frameworks may also implement an operation differently from their brethren.

There has been considerable debate in the ONNX community about what level tensor operations should be modeled at. Should ONNX be a mathematical toolbox that can support arbitrary equations with primitives such as sine and multiplication, or should it support higher-level constructs like integrated GRU units or Layer Normalization as single monolithic operations?

As it stands, ONNX currently defines about 100 operations. They range in complexity from arithmetic addition to a complete Long Short-Term Memory implementation. Not all tools support all operations, so just because you can generate an ONNX file of your model does not mean it will run anywhere.

Generation of an ONNX model file also can be awkward in some frameworks because it relies on a rigid definition of the order of operations in a graph structure. For example, PyTorch boasts a very pythonic imperative experience when defining models. You can use Python logic to lay out your model's flow, but you do not define a rigid graph structure as in other frameworks like TensorFlow. So there is no graph of operations to save; you actually have to run the model and trace the operations. The trace of operations is saved to the ONNX file.

Conclusion

It is early days for deep learning interoperability. Most users still pick a framework and stick with it. And an increasing number of users are going with TensorFlow. Google throws many resources and real-world production experience at it—it is hard to resist.

All frameworks are strong in some areas and weak in others. Every new framework must re-implement the full "stack" of functionality. Break up the stack, and you can play to the strengths of individual tools. That will lead to a healthier ecosystem.

ONNX is a step in the right direction.

Note: the ONNX GitHub page is here.

Disclaimer

Braddock Gaskill is a research scientist with eBay Inc. He contributed to this article in his personal capacity. The views expressed are his own and do not necessarily represent the views of eBay Inc.

About the Author

Braddock Gaskill has 25 years of experience in AI and algorithmic software development. He also co-founded the Internet-in-a-Box open-source project and developed the libre Humane Wikipedia Reader for getting content to students in the developing world.

Purism Partners with UBports to Offer Ubuntu Touch on the Librem 5, Red Hat Storage One Launches and More

News briefs for April 25, 2018.

Purism has partnered with UBports to offer Ubuntu Touch on its Librem 5 smartphone. By default, the smartphone runs Purism's PureOS, which supports GNOME and KDE Plasma mobile interfaces. UBports is ensuring Ubuntu Touch will run on the phones as well, so the Librem 5 can "now offer users three fully free and open mobile operating system options".

Red Hat today announced the general availability of Red Hat Storage One, a "new approach to software-defined storage aimed at providing pre-configured, workload optimized systems on a number of hardware choices." Features include ease of installation, workload and hardware optimization, flexible scalability and cost-effectiveness.

Microsoft yesterday announced vcpkg, a single C++ library manager for Linux, macOS and Windows: "This gives you immediate access to the vcpkg catalog of C++ libraries on two new platforms, with the same simple steps you are familiar with on Windows and UWP today."

The Linux Foundation and Dice are looking for people to take their open-source job surveys: "this is your chance to let companies, HR and hiring managers and industry organizations know what motivates you as an open source professional." There are two surveys: one for open-source professionals and one specifically for hiring managers.

Outreachy, which provides three-month internships for people from groups traditionally underrepresented in tech, has announced its accepted summer interns. Out of 1,264 applicants, 44 were chosen. Here's a list of all the interns and their projects. If you're interested in participating, the next round of Outreachy internships opens in September 2018 for the December 2018 to March 2019 internship round.

Lying with statistics, distributions, and popularity contests on Cooking With Linux (without a net)


It's Tuesday and that means it's time for Cooking With Linux (without a net), sponsored and supported by Linux Journal. Today, I'm courting controversy by discussing numbers, OS popularity, and how to pick the right Linux distribution if you want to be where the beautiful people hang out. And yes, I'll do it all live, without a net, and with a high probability of falling flat on my face.

Blockchain, Part II: Configuring a Blockchain Network and Leveraging the Technology

Petros Koutoupis Tue, 04/24/2018 - 11:30

How to set up a private ethereum blockchain using open-source tools and a look at some markets and industries where blockchain technologies can add value.

In Part I, I spent quite a bit of time exploring cryptocurrency and the mechanism that makes it possible: the blockchain. I covered details on how the blockchain works and why it is so secure and powerful. In this second part, I describe how to set up and configure your very own private ethereum blockchain using open-source tools. I also look at where this technology can bring some value or help redefine how people transact across a more open web.

Setting Up Your Very Own Private Blockchain Network

In this section, I explore the mechanics of an ethereum-based blockchain network—specifically, how to create a private ethereum blockchain, a private network to host and share this blockchain, an account, and then how to do some interesting things with the blockchain.

What is ethereum, again? Ethereum is an open-source and public blockchain platform featuring smart contract (that is, scripting) functionality. It is similar to bitcoin but differs in that it extends beyond monetary transactions.

Smart contracts are written in programming languages, such as Solidity (similar to C and JavaScript), Serpent (similar to Python), LLL (a Lisp-like language) and Mutan (Go-based). Smart contracts are compiled into EVM (see below) bytecode and deployed across the ethereum blockchain for execution. Smart contracts help in the exchange of money, property, shares or anything of value, and they do so in a transparent and conflict-free way that avoids the traditional middleman.

If you recall from Part I, a typical layout for any blockchain is one where all nodes are connected to every other node, creating a mesh. In the world of ethereum, these nodes are referred to as Ethereum Virtual Machines (EVMs), and each EVM will host a copy of the entire blockchain. Each EVM also will compete to mine the next block or validate a transaction. Once the new block is appended to the blockchain, the updates are propagated to the entire network, so that each node is synchronized.

In order to become an EVM node on an ethereum network, you'll need to download and install the proper software. To accomplish this, you'll be using Geth (Go Ethereum). Geth is the official Go implementation of the ethereum protocol. It is one of three such implementations; the other two are written in C++ and Python. These open-source software packages are licensed under the GNU Lesser General Public License (LGPL) version 3. The standalone Geth client packages for all supported operating systems and architectures, including Linux, are available here. The source code for the package is hosted on GitHub.

Geth is a command-line interface (CLI) tool that's used to communicate with the ethereum network. It's designed to act as a link between your computer and all other nodes across the ethereum network. When a block is being mined by another node on the network, your Geth installation will be notified of the update and then pass the information along to update your local copy of the blockchain. With the Geth utility, you'll be able to mine ether (similar to bitcoin but the cryptocurrency of the ethereum network), transfer funds between two addresses, create smart contracts and more.

Download and Installation

In my examples here, I'm configuring this ethereum blockchain on the latest LTS release of Ubuntu. Note that the tools themselves are not restricted to this distribution or release.

Downloading and Installing the Binary from the Project Website

Download the latest stable release, extract it and copy it to a proper directory:


$ wget https://gethstore.blob.core.windows.net/builds/
↪geth-linux-amd64-1.7.3-4bb3c89d.tar.gz
$ tar xzf geth-linux-amd64-1.7.3-4bb3c89d.tar.gz
$ cd geth-linux-amd64-1.7.3-4bb3c89d/
$ sudo cp geth /usr/bin/

Building from Source Code

If you are building from source code, you need to install both Go and C compilers:


$ sudo apt-get install -y build-essential golang
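
If you don't already have a local copy of the source, fetch it from the project's GitHub repository first (you also may need to install git):


$ git clone https://github.com/ethereum/go-ethereum.git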

Change into the newly cloned go-ethereum directory and run:


$ make geth

Installing from a Public Repository

If you are running on Ubuntu and decide to install the package from a public repository, run the following commands:


$ sudo apt-get install software-properties-common
$ sudo add-apt-repository -y ppa:ethereum/ethereum
$ sudo apt-get update
$ sudo apt-get install ethereum
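
Whichever installation method you pick, a quick sanity check confirms that the geth binary is on your path before you continue (the version string you see will depend on the release you installed):


$ geth version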

Getting Started

Here's the thing: you don't have any ether to start with. With that in mind, let's limit this deployment to a "private" blockchain network that will sort of run as a development or staging version of the main ethereum network. From a functionality standpoint, this private network will be identical to the main blockchain, with the exception that all transactions and smart contracts deployed on this network will be accessible only to the nodes connected in this private network. Geth will aid in this private or "testnet" setup. Using the tool, you'll be able to do everything the ethereum platform advertises, without needing real ether.

Remember, the blockchain is nothing more than a digital and public ledger preserving transactions in their chronological order. When new transactions are verified and collected into a block, the block is then appended to the chain, which is then distributed across the network. Every node on that network will update its local copy of the chain to the latest copy. But you need to start from some point—a beginning or a genesis. Every blockchain starts with a genesis block, that is, a block "zero" or the very first block of the chain. It will be the only block without a predecessor. To create your private blockchain, you need to create this genesis block. To do this, you need to create a custom genesis file and then tell Geth to use that file to create your own genesis block.

Create a directory path to host all of your ethereum-related data and configurations and change into the config subdirectory:


$ mkdir ~/eth-evm
$ cd ~/eth-evm
$ mkdir config data
$ cd config

Open your preferred text editor and save the following contents to a file named Genesis.json in that same directory:


{
    "config": {
        "chainId": 999,
        "homesteadBlock": 0,
        "eip155Block": 0,
        "eip158Block": 0
    },
    "difficulty": "0x400",
    "gasLimit": "0x8000000",
    "alloc": {}
}

This is what your genesis file will look like. This simple JSON-formatted string describes the following:

  • config — this block defines the settings for your custom chain.

  • chainId — this identifies your Blockchain, and because the main ethereum network has its own, you need to configure your own unique value for your private chain.

  • homesteadBlock — defines the version and protocol of the ethereum platform.

  • eip155Block / eip158Block — these fields add support for non-backward-compatible protocol changes to the Homestead version used. For the purposes of this example, you won't be leveraging these, so they are set to "0".

  • difficulty — this value controls block generation time of the blockchain. The higher the value, the more calculations a miner must perform to discover a valid block. Because this example is simply deploying a test network, let's keep this value low to reduce wait times.

  • gasLimit — gas is ethereum's fuel spent during transactions. As you do not want to be limited in your tests, keep this value high.

  • alloc — this section prefunds accounts, but because you'll be mining your ether locally, you don't need this option.

Now it's time to instantiate the data directory. Open a terminal window, and assuming you have the Geth binary installed and that it's accessible via your working path, type the following:


$ geth --datadir /home/petros/eth-evm/data/PrivateBlockchain
 ↪init /home/petros/eth-evm/config/Genesis.json
WARN [02-10|15:11:41] No etherbase set and no accounts found
 ↪as default
INFO [02-10|15:11:41] Allocated cache and file handles
    ↪database=/home/petros/eth-evm/data/PrivateBlockchain/
↪geth/chaindata cache=16 handles=16
INFO [02-10|15:11:41] Writing custom genesis block
INFO [02-10|15:11:41] Successfully wrote genesis state
    ↪database=chaindata
hash=d1a12d...4c8725
INFO [02-10|15:11:41] Allocated cache and file handles
    ↪database=/home/petros/eth-evm/data/PrivateBlockchain/
↪geth/lightchaindata cache=16 handles=16
INFO [02-10|15:11:41] Writing custom genesis block
INFO [02-10|15:11:41] Successfully wrote genesis state
    ↪database=lightchaindata

The command will need to reference a working data directory to store your private chain data. Here, I have specified eth-evm/data/PrivateBlockchain subdirectories in my home directory. You'll also need to tell the utility to initialize using your genesis file.

This command populates your data directory with a tree of subdirectories and files:


$ ls -R /home/petros/eth-evm/
.:
config  data

./config:
Genesis.json

./data:
PrivateBlockchain

./data/PrivateBlockchain:
geth  keystore

./data/PrivateBlockchain/geth:
chaindata  lightchaindata  LOCK  nodekey  nodes  transactions.rlp

./data/PrivateBlockchain/geth/chaindata:
000002.ldb  000003.log  CURRENT  LOCK  LOG  MANIFEST-000004

./data/PrivateBlockchain/geth/lightchaindata:
000001.log  CURRENT  LOCK  LOG  MANIFEST-000000

./data/PrivateBlockchain/geth/nodes:
000001.log  CURRENT  LOCK  LOG  MANIFEST-000000

./data/PrivateBlockchain/keystore:

Your private blockchain is now created. The next step involves starting the private network that will allow you to mine new blocks and have them added to your blockchain. To do this, type:


petros@ubuntu-evm1:~/eth-evm$ geth --datadir
 ↪/home/petros/eth-evm/data/PrivateBlockchain --networkid 9999
WARN [02-10|15:11:59] No etherbase set and no accounts found
 ↪as default
INFO [02-10|15:11:59] Starting peer-to-peer node
    ↪instance=Geth/v1.7.3-stable-4bb3c89d/linux-amd64/go1.9.2
INFO [02-10|15:11:59] Allocated cache and file handles
    ↪database=/home/petros/eth-evm/data/PrivateBlockchain/
↪geth/chaindata cache=128 handles=1024
WARN [02-10|15:11:59] Upgrading database to use lookup entries
INFO [02-10|15:11:59] Initialised chain configuration
    ↪config="{ChainID: 999 Homestead: 0 DAO:  DAOSupport:
 ↪false EIP150:  EIP155: 0 EIP158: 0 Byzantium: 
 ↪Engine: unknown}"
INFO [02-10|15:11:59] Disk storage enabled for ethash caches
    ↪dir=/home/petros/eth-evm/data/PrivateBlockchain/
↪geth/ethash count=3
INFO [02-10|15:11:59] Disk storage enabled for ethash DAGs
 ↪dir=/home/petros/.ethash count=2
INFO [02-10|15:11:59] Initialising Ethereum protocol
    ↪versions="[63 62]" network=9999
INFO [02-10|15:11:59] Database deduplication successful
    ↪deduped=0
INFO [02-10|15:11:59] Loaded most recent local header
    ↪number=0 hash=d1a12d...4c8725 td=1024
INFO [02-10|15:11:59] Loaded most recent local full block
    ↪number=0 hash=d1a12d...4c8725 td=1024
INFO [02-10|15:11:59] Loaded most recent local fast block
    ↪number=0 hash=d1a12d...4c8725 td=1024
INFO [02-10|15:11:59] Regenerated local transaction journal
    ↪transactions=0 accounts=0
INFO [02-10|15:11:59] Starting P2P networking
INFO [02-10|15:12:01] UDP listener up
    ↪self=enode://f51957cd4441f19d187f9601541d66dcbaf560
↪770d3da1db4e71ce5ba3ebc66e60ffe73c2ff01ae683be0527b77c0f96
↪a178e53b925968b7aea824796e36ad27@[::]:30303
INFO [02-10|15:12:01] IPC endpoint opened: /home/petros/eth-evm/
↪data/PrivateBlockchain/geth.ipc
INFO [02-10|15:12:01] RLPx listener up
    ↪self=enode://f51957cd4441f19d187f9601541d66dcbaf560
↪770d3da1db4e71ce5ba3ebc66e60ffe73c2ff01ae683be0527b77c0f96
↪a178e53b925968b7aea824796e36ad27@[::]:30303

Notice the use of the new parameter, networkid. This networkid helps ensure the privacy of your network. Any number can be used here. I have decided to use 9999. Note that other peers joining your network will need to use the same ID.
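
For example, a second node, on another machine or simply using another data directory and port on the same machine, could join this private network with something along these lines (an illustrative sketch; the second data directory also would need to be initialized with the same Genesis.json, and you still would need to point the peers at each other, for instance with a static-nodes file or the admin.addPeer() console call):


$ geth --datadir /home/petros/eth-evm/data/PrivateBlockchain2
 ↪--networkid 9999 --port 30304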

Your private network is now live! Remember, every time you need to access your private blockchain, you will need to use these last two commands with the exact same parameters (the Geth tool will not remember it for you):


$ geth --datadir /home/petros/eth-evm/data/PrivateBlockchain
 ↪init /home/petros/eth-evm/config/Genesis.json
$ geth --datadir /home/petros/eth-evm/data/PrivateBlockchain
 ↪--networkid 9999

Configuring a User Account

So, now that your private blockchain network is up and running, you can start interacting with it. But in order to do so, you need to attach to the running Geth process. Open a second terminal window. The following command will attach to the instance running in the first terminal window and bring you to a JavaScript console:


$ geth attach /home/petros/eth-evm/data/PrivateBlockchain/geth.ipc
Welcome to the Geth JavaScript console!

instance: Geth/v1.7.3-stable-4bb3c89d/linux-amd64/go1.9.2
 modules: admin:1.0 debug:1.0 eth:1.0 miner:1.0 net:1.0
 ↪personal:1.0 rpc:1.0 txpool:1.0 web3:1.0

>

Time to create a new account that will manipulate the Blockchain network:


> personal.newAccount()
Passphrase:
Repeat passphrase:
"0x92619f0bf91c9a786b8e7570cc538995b163652d"

Remember this string. You'll need it shortly. If you forget this hexadecimal string, you can reprint it to the console by typing:


> eth.coinbase
"0x92619f0bf91c9a786b8e7570cc538995b163652d"

Check your ether balance by typing the following script:


> eth.getBalance("0x92619f0bf91c9a786b8e7570cc538995b163652d")
0

Here's another way to check your balance without needing to type the entire hexadecimal string:


> eth.getBalance(eth.coinbase)
0

Mining

Doing real mining in the main ethereum blockchain requires some very specialized hardware, such as dedicated Graphics Processing Units (GPUs), like the ones found on the high-end graphics cards mentioned in Part I. However, since you're mining for blocks on a private chain with a low difficulty level, you can do without that requirement. To begin mining, run the following script on the JavaScript console:


> miner.start()
null

Updates in the First Terminal Window

You'll observe mining activity in the output logs displayed in the first terminal window:


INFO [02-10|15:14:47] Updated mining threads
    ↪threads=0
INFO [02-10|15:14:47] Transaction pool price threshold
 ↪updated price=18000000000
INFO [02-10|15:14:47] Starting mining operation
INFO [02-10|15:14:47] Commit new mining work
    ↪number=1 txs=0 uncles=0 elapsed=186.855us
INFO [02-10|15:14:57] Generating DAG in progress
    ↪epoch=1 percentage=0 elapsed=7.083s
INFO [02-10|15:14:59] Successfully sealed new block
    ↪number=1 hash=c81539...dc9691
INFO [02-10|15:14:59] mined potential block
    ↪number=1 hash=c81539...dc9691
INFO [02-10|15:14:59] Commit new mining work
    ↪number=2 txs=0 uncles=0 elapsed=211.208us
INFO [02-10|15:15:04] Generating DAG in progress
    ↪epoch=1 percentage=1 elapsed=13.690s
INFO [02-10|15:15:06] Successfully sealed new block
    ↪number=2 hash=d26dda...e3b26c
INFO [02-10|15:15:06] mined potential block
    ↪number=2 hash=d26dda...e3b26c
INFO [02-10|15:15:06] Commit new mining work
    ↪number=3 txs=0 uncles=0 elapsed=510.357us

[ ... ]

INFO [02-10|15:15:52] Generating DAG in progress
    ↪epoch=1 percentage=8 elapsed=1m2.166s
INFO [02-10|15:15:55] Successfully sealed new block
    ↪number=15 hash=d7979f...e89610
INFO [02-10|15:15:55] block reached canonical chain
    ↪number=10 hash=aedd46...913b66
INFO [02-10|15:15:55] mined potential block
    ↪number=15 hash=d7979f...e89610
INFO [02-10|15:15:55] Commit new mining work
    ↪number=16 txs=0 uncles=0 elapsed=105.111us
INFO [02-10|15:15:57] Successfully sealed new block
    ↪number=16 hash=61cf68...b16bf2
INFO [02-10|15:15:57] block reached canonical chain
    ↪number=11 hash=6b89ff...de8f88
INFO [02-10|15:15:57] mined potential block
    ↪number=16 hash=61cf68...b16bf2
INFO [02-10|15:15:57] Commit new mining work
    ↪number=17 txs=0 uncles=0 elapsed=147.31us

Back to the Second Terminal Window

Wait 10–20 seconds, and on the JavaScript console, start checking your balance:


> eth.getBalance(eth.coinbase)
10000000000000000000

Wait some more, and list it again:


> eth.getBalance(eth.coinbase)
75000000000000000000

Remember, this is fake ether, so don't open that bottle of champagne yet. Also note that Geth reports balances in wei, the smallest denomination of ether: 75000000000000000000 wei is 75 ether. You won't be able to use this ether in the main ethereum network.

To stop the miner, invoke the following script:


> miner.stop()
true

Well, you did it. You created your own private blockchain and mined some ether.

Who Will Benefit from This Technology Today and in the Future?

Although the blockchain originally was developed around cryptocurrency (more specifically, bitcoin), its uses don't end there. Today, it may seem like that's the case, but there are untapped industries and markets where blockchain technologies can redefine how transactions are processed. The following are some examples that come to mind.

Improving Smart Contracts

Ethereum, the same open-source blockchain project deployed earlier, already is doing the whole smart-contract thing, but the idea is still in its infancy, and as it matures, it will evolve to meet consumer demands. There's plenty of room for growth in this area. It probably will eventually creep into the governance of companies (such as verifying digital assets, equity and so on), trading stocks, handling intellectual property and managing property ownership, such as land title registration.

Enabling Market Places and Shared Economies

Think of eBay but refocused to be peer-to-peer. This would mean no more transaction fees, but it also will emphasize the importance of your personal reputation, since there will be no single body governing the market in which goods or services are being traded or exchanged.

Crowdfunding

Following in the same direction as my previous remarks about a decentralized marketplace, there also are opportunities for individuals or companies to raise the capital necessary to help "kickstart" their initiatives. Think of a more open and global Kickstarter or GoFundMe.

Multimedia Sharing or Hosting

A peer-to-peer network for aspiring or established musicians definitely could go a long way here—one where the content will reach its intended audiences directly and also avoid those hefty royalty costs paid out to the studios, record labels and content distributors. The same applies to video and image content.

File Storage and Data Management

By enabling a global peer-to-peer network, blockchain technology takes cloud computing to a whole new level. As the technology continues to push itself into existing cloud service markets, it will challenge traditional vendors, including Amazon AWS and even Dropbox and others—and it will do so at a fraction of the price. For example, cold storage data offerings are a multi-hundred billion dollar market today. By distributing your encrypted archives across a global and decentralized network, the need to maintain local data-center equipment by a single entity is reduced significantly.

Social media and how your posted content is managed would change under this model as well. Under the blockchain, Facebook or Twitter or anyone else cannot lay claim to what you choose to share.

Another added benefit to leveraging blockchain here is making use of the cryptography securing your valuable data from getting hacked or lost.

Internet of Things

What is the Internet of Things (IoT)? It is a broad term describing the networked management of very specific electronic devices, which include heating and cooling thermostats, lights, garage doors and more. Using a combination of software, sensors and networking facilities, people can easily enable an environment where they can automate and monitor home and/or business equipment.

Supply Chain Audits

With a distributed public ledger made available to consumers, retailers can't falsify claims made against their products. Consumers will have the ability to verify their sources, be it food, jewelry or anything else.

Identity Management

There isn't much to explain here. The threat is very real. Identity theft never takes a day off. The dated user name/password systems of today have run their course, and it's about time that existing authentication frameworks leverage the cryptographic capabilities offered by the blockchain.

Summary

This revolutionary technology has enabled organizations in ways that weren't possible a decade ago. Its possibilities are enormous, and it seems that any industry dealing with some sort of transaction-based model will be disrupted by the technology. It's only a matter of time until it happens.

Now, what will the future for blockchain look like? At this stage, it's difficult to say. One thing is certain, though: large companies, such as IBM, are investing big in the technology and building their own blockchain infrastructure that can be sold to and used by corporate enterprises and financial institutions. This may create some issues, however. As these large companies build their blockchain infrastructures, they will file for patents to protect their technologies. And with those patents in their arsenal, there exists the possibility that they may move aggressively against the competition in an attempt to discredit them and their value.

Anyway, if you will excuse me, I need to go make some crypto-coin.

Eclipse Foundation’s New Open-Source Governance Model for Jakarta EE, Turris MOX Modular Router Campaign and More

News briefs for April 24, 2018.

The Eclipse Foundation announced today a new open-source governance model and "a 'cloud native Java' path forward for Jakarta EE, the new community-led platform created from the contribution of Java EE." According to the press release, with this move to the community-driven open-source governance model, "Jakarta EE promises faster release and innovation cycles." See https://jakarta.ee for more details or to join the Jakarta EE Working Group.

A Czech Republic company has launched an Indiegogo campaign to create a modular and open-source router called Turris MOX. The Turris MOX team claims it's "probably the first router configurable as easily as a sandwich. Choose the parts you actually want." (Source: It's FOSS.)

Amnesty International's Technology and Human Rights researcher Joe Westby recently said that Google "shows total contempt for Android users' privacy" in reference to the launch of its new "Chat" messaging service: "With its baffling decision to launch a messaging service without end-to-end encryption, Google has shown utter contempt for the privacy of Android users and handed a precious gift to cybercriminals and government spies alike, allowing them easy access to the content of Android users' communications." (Source: BleepingComputer.)

SUSE recently announced that SUSE Linux Enterprise High Performance Computing is in the SLE 15 Beta program: "as part of the (upcoming) major version of SUSE Linux Enterprise 15, SUSE Linux Enterprise High Performance Computing (HPC) is now a dedicated SLE product base on SLES 15. This product is available for the x86_64 and ARM aarch64 hardware platforms." For more info on the SLE 15 Public Beta program, visit here.

According to Phoronix, the GNOME Shell memory leak has been fixed: "the changes are currently staged in Git for what will become GNOME 3.30 but might also be backported to 3.28". See GNOME developer Georges Stavracas' blog post for more details on the leak and its fix.

Heptio Announces Gimbal, Netflix Open-Sources Titus, Linux 4.15 Reaches End of Life and More

News briefs for April 23, 2018.

Heptio this morning announced Gimbal, "an open source initiative to unify and scale the flow of network traffic into hybrid environments consisting of multiple Kubernetes clusters and traditional infrastructure technologies including OpenStack". The initiative is in collaboration with Actapio, a subsidiary of Yahoo Japan Corporation, and according to Craig McLuckie, founder and CEO of Heptio, "This collaboration demonstrates the full potential of cloud native technologies and open source as a way to not only manage applications, but address broader infrastructure considerations."

Netflix open-sources its Titus container management system. According to Christine Hall's DataCenter Knowledge article, Titus is tightly integrated with AWS, and it "launches as many as three million containers per week, to host thousands of applications over seven regionally isolated stacks across tens of thousands of EC2 virtual machines."

Linux 4.15 has reached end of life. Greg Kroah-Hartman announced on the LKML that if you're still using the 4.15 kernel series, it's time to upgrade to the 4.16.y kernel tree.

GNU Parallel 20180422 was released yesterday. GNU Parallel is "a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input". More info is available here.

FFmpeg has a new major release: version 4.0 "Wu". Some highlights include "bitstream filters for editing metadata in H.264, HEVC and MPEG-2 streams", experimental Magic YUV encoder, TiVo ty/ty+ demuxer and more.

Userspace Networking with DPDK

Userspace Networking with DPDK
Image
Rami Rosen Mon, 04/23/2018 - 07:07

DPDK is a fully open-source project that operates in userspace. It's a multi-vendor, multi-architecture project, and it aims to achieve the high I/O performance and high packet-processing rates that are so important in the networking arena. It was created by Intel in 2010 and moved to the Linux Foundation in April 2017. This move positioned it as one of the most dominant and most important open-source Linux projects. DPDK was created for the telecom/datacom infrastructure, but today, it's used almost everywhere, including the cloud, data centers, appliances, containers and more. In this article, I present a high-level overview of the project and discuss features that were released in DPDK 17.08 (August 2017).

Undoubtedly, a lot of effort in many networking projects is geared toward achieving high speed and high performance. Several factors contribute to achieving this goal with DPDK. One is that DPDK is a userspace application that bypasses the heavy layers of the Linux kernel networking stack and talks directly to the network hardware. Another factor is the use of memory hugepages. By using hugepages (2MB or 1GB in size), fewer memory pages are needed than when using standard memory pages (which on many platforms are 4KB in size). As a result, the number of Translation Lookaside Buffer (TLB) misses is reduced significantly, and performance is increased. Yet another factor is that low-level optimizations are done in the code, some of them related to memory cache-line alignment, aiming at achieving optimal cache use, prefetching and so on. (Delving into the technical details of those optimizations is outside the scope of this article.)
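
To give a sense of what the hugepage setup looks like in practice, here's a minimal sketch of reserving 2MB hugepages and mounting hugetlbfs on a typical x86_64 Linux host; the page count and mountpoint are illustrative assumptions, so consult the DPDK Getting Started Guide for your system:


echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages   # reserve 1024 x 2MB pages
mkdir -p /mnt/huge                                                   # create a mountpoint
mount -t hugetlbfs nodev /mnt/huge                                   # make the pages available to DPDK apps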

DPDK has gained popularity in recent years, and it's used in many open-source projects. Many Linux distributions (Fedora, Ubuntu and others) have included DPDK support in their packaging systems as well.

The core DPDK ingredients are libraries and drivers, also known as Poll Mode Drivers (PMDs). There are more than 35 libraries at the time of this writing. These libraries abstract away the low-level implementation details, which provides flexibility as each vendor implements its own low-level layers.

The DPDK Development Model

DPDK is written mostly in C, but the project also has a few tools that are written in Python. All code contributions to DPDK are done by patches sent and discussed over the dpdk-dev mailing list. Patches aiming at getting feedback first are usually titled RFCs (Request For Comments). In order to keep the code as stable as possible, the preference is to preserve the ABI (Application Binary Interface) whenever possible. When it seems that there's no other choice, developers should follow a strict ABI deprecation process, including announcement of the requested ABI changes over the dpdk-dev mailing list ahead of time. The ABI changes that are approved and merged are documented in the Release Notes. When acceptance of new features is in doubt, but the respective patches are merged into the master tree anyway, they are tagged as "EXPERIMENTAL". This means that those patches may be changed or even could be removed without prior notice. Thus, for example, new rte_bus experimental APIs were added in DPDK 17.08. I also should note that usually whenever patches for a new, generic API (which should support multiple hardware devices from different vendors) are sent over the mailing list, it's expected that at least one hardware device that supports the new feature is available on the market (if the device is merely announced and not available, developers can't test it).

There's a technical board of nine members from various companies (Intel, NXP, 6WIND, Cavium and others). Meetings typically are held every two weeks over IRC, and the minutes of those meetings are posted on the dpdk-dev mailing list.

As with other large open-source projects, there are community-driven DPDK events across the globe on a regular basis every year. First, there are various DPDK Summits. Among them, DPDK Summit Userspace is focused on being more interactive and on getting feedback from the community. There also are several DPDK meetups in different locations around the world. Moreover, from time to time there is an online community survey, announced over the dpdk-dev mailing list, in order to get feedback from the community, and everyone can participate in it.

The DPDK website hosts the master DPDK repo, but several other repos are dedicated to new features. Several tools and utilities exist in the DPDK tree, among them the dpdk-devbind.py script, which associates a network or crypto device with DPDK, and testpmd, which is a CLI tool for various tasks, such as forwarding, monitoring statistics and more. There are almost 50 sample applications under the "examples" folder, bundled with full, detailed documentation.
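
For example, to see which devices are currently bound to DPDK-compatible drivers and which are still using kernel drivers, you can run the bind script's status report from the top of the DPDK tree:


./usertools/dpdk-devbind.py --status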

Apart from DPDK itself, the DPDK site hosts several other open-source projects. One is the DPDK Test Suite (DTS), which is a Python-based framework for testing DPDK. It has more than 100 test modules for various features, including the most advanced and most recent ones. It runs with IXIA and Scapy traffic generators. It includes both functional and benchmarking tests, and it's very easy to configure and run, as you need to set up only three or four configuration files. You also can set the DPDK version with which you want to run it. DTS development is handled over a dedicated mailing list, and it currently has support for Intel NICs and Mellanox NICs.

DPDK is released every three months. This release cadence is designed to allow DPDK to keep evolving at a rapid pace while giving enough opportunity to review, discuss and improve the contributions. There are usually 3–5 release candidates (RCs) before the final release. For the 17.08 release, there were 1,023 patches from 125 authors, including patches from Intel, Cavium, 6WIND, NXP and others. The release numbers follow the Ubuntu version-numbering convention. A Long Term Stable (LTS) release is maintained for two years. Plans for future LTS releases currently are being discussed in the DPDK community. The plan is to make every .11 release in an even-numbered year (16.11, 18.11 and so forth) an LTS release and to maintain it for two years.

Recent Features and New Ideas

Several interesting features were added last year. One of the most fascinating capabilities (added in DPDK 17.05, with new features enabled in 17.08 and 17.11) is "Dynamic Device Personalization" (DDP) for the Intel I40E driver (10Gb/25Gb/40Gb). This feature allows applying a per-device profile to the I40E firmware dynamically. You can load a profile by running a testpmd CLI command (ddp add), and you can remove it with ddp del. You also can apply or remove profiles while traffic is flowing, with only a small number of packets dropped while a profile is being handled. These profiles are created by Intel and not by customers, as I40E firmware programming requires deep knowledge of the I40E device internals.
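
In a testpmd session, loading and unloading a profile looks roughly like the following sketch. The port number and profile path are illustrative assumptions, and the exact argument syntax varies between DPDK releases, so check the testpmd documentation for your version:


testpmd> ddp add 0 ./profile.pkgo
testpmd> ddp del 0 ./profile.pkgo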

Other features to mention include Bruce Richardson's build system patch, which provides a more efficient build system for DPDK with meson and ninja, a new kernel module called Kernel Control Path (KCP), port representors and more.

DPDK and Networking Projects

DPDK is used in various important networking projects. The list is quite long, but I want to mention a few of them briefly:

  • Open vSwitch (OvS): the OvS project implements a virtual network switch. It was transitioned to the Linux Foundation in August 2016 and gained a lot of popularity in the industry. DPDK was first integrated into OvS 2.2 in 2015. Later, in OvS 2.4, support for vHost user, which is a virtual device, was added. Support for advanced features like multi-queue and NUMA awareness was added in subsequent releases.
  • Contrail vRouter: Contrail Systems was a startup that developed SDN controllers. Juniper Networks acquired it in 2012, and Juniper Networks released the Contrail vRouter later as an open-source project. It uses DPDK to achieve better network performance.
  • pktgen-dpdk: an open-source traffic generator based on DPDK (hosted on the DPDK site).
  • TREX: a stateful and stateless open-source traffic generator based on DPDK.
  • Vector Packet Processing (VPP): an FD.io project.

Getting Started with DPDK

For newcomers to DPDK, both users and developers, there is excellent documentation hosted on the DPDK site. It's recommended that you actually try running several of the sample applications (following the "Sample Applications User Guides"), starting with the "Hello World" application. It's also a good idea to follow the dpdk-users mailing list on a regular basis. For those who are interested in development, the Programmer's Guide is a good source of information about the architecture and development environment, and developers should follow the dpdk-dev mailing list as well.
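
With the make-based build system used in the 17.08 era, building and running the "Hello World" sample looks roughly like this sketch; the target name and the core/memory-channel arguments are typical values, not requirements:


make install T=x86_64-native-linuxapp-gcc    # build DPDK for the chosen target
export RTE_SDK=$(pwd)                        # point the sample makefiles at this DPDK tree
export RTE_TARGET=x86_64-native-linuxapp-gcc
cd examples/helloworld
make                                         # build the sample against $RTE_SDK
sudo ./build/helloworld -l 0-1 -n 4          # run on cores 0-1 with 4 memory channels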

DPDK and SR-IOV Example

I want to conclude this article with a very basic example (based on SR-IOV) of how to create a DPDK VF and how to attach it to a VM with qemu. I also show how to create a non-DPDK VF ("kernel VF"), attach it to a VM, run a DPDK app on that VF and communicate with it from the host.

As a preparation step, you need to enable IOMMU and virtualization on the host. To do so, add intel_iommu=on iommu=pt as kernel parameters to the kernel command line (in grub.cfg), and also enable virtualization and VT-d in the BIOS (VT-d stands for "Intel Virtualization Technology for Directed I/O"). You'll use the Intel I40E network interface card for this example. The I40E device driver supports up to 128 VFs per device, divided equally across ports, so if you have a quad-port I40E NIC, you can create up to 32 VFs on each port.
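
On a GRUB2-based distribution, that kernel-command-line change might look like the following sketch; the existing options shown are placeholders, and you should use whichever regeneration command your distribution provides:


# in /etc/default/grub, append the IOMMU options to the existing parameters:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt"
# then regenerate grub.cfg and reboot:
update-grub                                  # on Debian/Ubuntu
# grub2-mkconfig -o /boot/grub2/grub.cfg     # on Fedora/RHEL/SUSE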

For this example, I also show a simple usage of the testpmd CLI, as mentioned earlier. This example is based on DPDK 17.08, the most recent release of DPDK at the time of this writing. In this example, you'll use Single Root I/O Virtualization (SR-IOV), which is an extension of the PCI Express (PCIe) specification that allows sharing a single physical PCI Express resource across several virtual environments. This technology is very popular in data-center/cloud environments, and many network adapters and their drivers support it. Note that SR-IOV is not limited to network devices; it's available for other PCI devices as well, such as graphics cards.
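
You can confirm that an adapter exposes the SR-IOV capability by inspecting its PCI capabilities; the PCI address here is the one used throughout this example:


lspci -s 0000:07:00.0 -vvv | grep -A 6 "Single Root I/O Virtualization"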

DPDK VF

You create DPDK VFs by writing the number of requested VFs into a DPDK sysfs entry called max_vfs. Say that eth8 is the PF on top of which you want to create a VF, and its PCI address is 0000:07:00.0. (You can fetch the PCI address with ethtool -i eth8 | grep bus-info.) The following is the sequence you run on the host in order to create a VF and launch a VM. First, bind the PF to DPDK with usertools/dpdk-devbind.py, for example:


modprobe uio                                          # load the generic userspace I/O framework
insmod /build/kmod/igb_uio.ko                         # load DPDK's igb_uio module from your build tree
./usertools/dpdk-devbind.py -b igb_uio 0000:07:00.0   # bind the PF to igb_uio

Then, create two DPDK VFs with:


echo 2 > /sys/bus/pci/devices/0000:07:00.0/max_vfs

You can verify that the two VFs were created by checking whether two new entries appear when running lspci | grep "Virtual Function", or by verifying that you now have two new symlinks under /sys/bus/pci/devices/0000:07:00.0/ for the newly created VFs: virtfn0 and virtfn1.
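
Both checks look like this on the host (the exact lspci descriptions will vary by NIC model):


lspci | grep "Virtual Function"                          # the two VFs should appear here
ls -l /sys/bus/pci/devices/0000:07:00.0/ | grep virtfn   # virtfn0 and virtfn1 symlinks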

Next, launch the VMs via qemu using PCI Passthrough, for example:


qemu-system-x86_64 -enable-kvm -cpu host \
    -drive file=Ubuntu_1604.qcow2,index=0,media=disk,format=qcow2 \
    -smp 5 -m 2048 -vga qxl \
    -vnc :1 \
    -device pci-assign,host=0000:07:02.0 \
    -net nic,macaddr=00:00:00:99:99:01 \
    -net tap,script=/etc/qemu-ifup

Note: qemu-ifup is a shell script that's invoked when the VM is launched, usually for setting up networking.

Next, you can start a VNC client (such as RealVNC client) to access the VM, and from there, you can verify that the VF was indeed assigned to it, with lspci -n. You should see a single device, which has "8086 154c" as the vendor ID/device ID combination; "8086 154c" is the virtual function PCI ID of the I40E NIC. You can launch a DPDK application in the guest on top of that VF.
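
Inside the guest, that check might look like the following; the slot, class code and revision shown here are illustrative assumptions, and only the 8086:154c ID matters:


$ lspci -n | grep 154c
00:04.0 0200: 8086:154c (rev 01)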

Kernel VF

To conclude this example, let's create a kernel VF on the host, run a DPDK application on top of it in the VM, and then look at a simple interaction with the host PF.

First, create two kernel VFs with:


echo 2 > /sys/bus/pci/devices/0000:07:00.0/sriov_numvfs

Here again you can verify that these two VFs were created by running lspci | grep "Virtual Function".

Next, run this sequence:


echo "8086 154c" > /sys/bus/pci/drivers/pci-stub/new_id
echo 07:02.0 > /sys/bus/pci/devices/$VF_PCI_0/driver/unbind
echo 07:02.0 > /sys/bus/pci/drivers/pci-stub/bind

Then launch the VM the same way as before, with the same qemu-system-x86_64 command mentioned earlier. Again, in the guest, you should be able to see the I40E VF with lspci -n. On the host, running ip link show will show the two VFs of eth8: vf 0 and vf 1. You can set the MAC address of a VF from the host with ip link set—for example:


ip link set eth8 vf 0 mac 00:11:22:33:44:55

Then, when you run a DPDK application such as testpmd in the guest and issue, for example, show port info 0 from the testpmd CLI, you'll see that the MAC address you set on the host is indeed reflected for that VF in DPDK.
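
In the testpmd session inside the guest, that check is simply the following; the port information it prints should include a MAC address line matching the 00:11:22:33:44:55 value set on the host:


testpmd> show port info 0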

Summary

This article provides a high-level overview of the DPDK project, which is growing dynamically and gaining popularity in the industry. The near future likely will bring support for more network interfaces from different vendors, as well as new features.

Weekend Reading: Networking

Weekend Reading: Networking
Image
Banana Backup
Carlie Fairchild Sat, 04/21/2018 - 08:08

Networking is one of Linux's strengths and a popular topic for our subscribers. For your weekend reading, we've curated some of Linux Journal's most popular networking articles. 

 

NTPsec: a Secure, Hardened NTP Implementation

by Eric S. Raymond

Network time synchronization—aligning your computer's clock to the same Coordinated Universal Time (UTC) that everyone else is using—is both necessary and a hard problem. Many internet protocols rely on being able to exchange UTC timestamps accurate to small tolerances, but the clock crystal in your computer drifts (its frequency varies by temperature), so it needs occasional adjustments.

 

smbclient Security for Windows Printing and File Transfer

by Charles Fisher

Microsoft Windows is usually a presence in most computing environments, and UNIX administrators likely will be forced to use resources in Windows networks from time to time. Although many are familiar with the Samba server software, the matching smbclient utility often escapes notice.

 

Understanding Firewalld in Multi-Zone Configurations

by Nathan R. Vance and William F. Polik

Stories of compromised servers and data theft fill today's news. It isn't difficult for someone who has read an informative blog post to access a system via a misconfigured service, take advantage of a recently exposed vulnerability or gain control using a stolen password. Any of the many internet services found on a typical Linux server could harbor a vulnerability that grants unauthorized access to the system.

 

Papa's Got a Brand New NAS

by Kyle Rankin

It used to be that the true sign you were dealing with a Linux geek was the pile of computers lying around that person's house. How else could you experiment with networked servers without a mass of computers and networking equipment? If you work as a sysadmin for a large company, sometimes one of the job perks is that you get first dibs on decommissioned equipment. Through the years, I was able to amass quite a home network by combining some things I bought myself with some equipment that was too old for production. A major point of pride in my own home network was the 24U server cabinet in the garage. It had a gigabit top-of-rack managed switch, a 2U UPS at the bottom, and in the middle was a 1U HP DL-series server with a 1U eSATA disk array attached to it. Above that was a slide-out LCD and keyboard in case I ever needed to work on the server directly.

 

Banana Backups

by Kyle Rankin

I wrote an article called "Papa's Got a Brand New NAS" where I described how I replaced my rackmounted gear with a small, low-powered ARM device—the Odroid XU4. Before I settled on that solution, I tried out a few others, including a pair of Banana Pi computers—small single-board computers like Raspberry Pis, only with gigabit networking and SATA2 controllers on board. In the end, I decided to go with a single higher-powered board and use a USB3 disk enclosure with RAID instead of building a cluster of Banana Pis that each had a single disk attached. Since I had two Banana Pis left over after this experiment, I decided to put them to use, so in this article, I describe how I turned one into a nice little backup server.

 

Roll Your Own Enterprise Wi-Fi

by Shawn Powers

The UniFi line of products from Ubiquiti is affordable and reliable, but the really awesome feature is its (free!) Web-based controller app. The only UniFi products I have are wireless access points, even though the company also has added switches, gateways and even VoIP products to the mix. Even with my limited selection of products, however, the Web controller makes designing and maintaining a wireless network not just easy, but fun!

 

Tracking Down Blips

by Shawn Powers

In a previous article, I explained the process for setting up Cacti, which is a great program for graphing just about anything. One of the main things I graph is my internet usage. And, it's great information to have, until there is internet activity you can't explain. In my case, there was a "blip" every 20 minutes or so that would use about 4Mbps of bandwidth (Figure 1). In the grand scheme of things, it wasn't a big deal, because my connection is 60Mbps down. Still, it was driving me crazy. I don't like the idea of something on my network doing things on the internet without my knowledge. So, the hunt began.

 

Caption This!

Caption This!
Image
Caption This!
Carlie Fairchild Fri, 04/20/2018 - 11:21

Each month, we provide a cartoon in need of a caption. You submit your caption, we choose three finalists, and readers vote for their favorite. The winning caption for this month's cartoon will appear in the June issue of Linux Journal.

To enter, simply type your caption in the comments below or email us at publisher@linuxjournal.com.

Mozilla’s Common Voice Project, Red Hat Announces Vault Operator, VirtualBox 5.2.10 Released and More

News briefs April 20, 2018.

Participate in Mozilla's open-source Common Voice Project, an initiative to help teach machines how real people speak: "Now you can donate your voice to help us build an open-source voice database that anyone can use to make innovative apps for devices and the web." For more about the Common Voice Project, see the story on opensource.com.

Red Hat yesterday announced the Vault Operator, a new open-source project that "aims to make it easier to install, manage, and maintain instances of Vault—a tool designed for storing, managing, and controlling access to secrets, such as tokens, passwords, certificates, and API keys—on Kubernetes clusters."

Google might be working on implementing dual-boot functionality in Chrome OS to allow Chromebook users to boot multiple OSes. Softpedia News reports on a Reddit thread that references "Alt OS" in recent Chromium Gerrit commits. This is only speculation so far, and Google has not confirmed it is working on dual-boot support for Chrome OS on Chromebooks.

Oracle recently released VirtualBox 5.2.10. This release addresses the Critical Patch Update (CPU) Advisory for April 2018 related to Oracle VM VirtualBox and includes several other improvements, such as a fix for a KDE Plasma hang and support for multiple NVMe controllers with ICH9 enabled. See the Changelog for all the details.

Apple yesterday announced it has open-sourced its FoundationDB cloud database. Apple's goal is "to build a community around the project and make FoundationDB the foundation for the next generation of distributed databases". The project is now available on GitHub.

More L337 Translations

More L337 Translations
Image
bash
Dave Taylor Thu, 04/19/2018 - 09:20

Dave continues with his shell-script L33t translator.

In my last article, I talked about the inside jargon of hackers and computer geeks known as "Leet Speak" or just "Leet". Of course, that's a shortened version of the word Elite, and it's best written as L33T or perhaps L337 to be ultimately kewl. But hey, I don't judge.

Last time I looked at a series of simple letter substitutions that allow you to convert a sentence like "I am a master hacker with great skills" into something like this:


I AM A M@ST3R H@XR WITH GR3@T SKILLZ

It turns out that I missed some nuances of Leet and didn't realize that most often the letter "a" is actually turned into a "4", not an "@", although as with just about everything about the jargon, it's somewhat random.

In fact, every single letter of the alphabet can be randomly tweaked and changed, sometimes from a single letter to a sequence of two or three symbols. For example, another variation on "a" is "/-\" (for what are hopefully visually obvious reasons).

Continuing in that vein, "B" can become "|3", "C" can become "[", "I" can become "1", and one of my favorites, "M" can change into "[]V[]". That's a lot of work, but since one of the goals is to have a language no one else understands, I get it.
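
Just to make the idea concrete before getting back to the script, here's a quick sketch (not the actual translator) of what a handful of those swaps look like as sed substitutions:


echo "make it cryptic" | sed -e 's|a|/-\\|g' \
                             -e 's/b/|3/g' \
                             -e 's/c/[/g' \
                             -e 's/i/1/g' \
                             -e 's/m/[]V[]/g'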

There are additional substitutions: a word can have its trailing "S" replaced by a "Z", a trailing "ED" can become "'D" or just "D", and another interesting one is that words containing "and", "anned" or "ant" can have that sequence replaced by an ampersand (&).

Let's add all these L337 filters and see how the script is shaping up.

But First, Some Randomness

Since many of these transformations are going to have a random element, let's go ahead and produce a random number between 0 and 9 to figure out whether to do one or another action. That's easily done with the $RANDOM variable:


doit=$(( $RANDOM % 10 ))       # random virtual coin flip

Now let's say that there's a 50% chance that a -ed suffix is going to change to "'D" and a 50% chance that it's just going to become "D", which is coded like this:


if [ $doit -ge 5 ] ;  then
  word="$(echo $word | sed "s/ed$/d/")"
else
  word="$(echo $word | sed "s/ed$/'d/")"
fi

Let's add the additional transformations, but not do them every time. Let's give them a 70–90% chance of occurring, based on the transform itself. Here are a few examples:


if [ $doit -ge 3 ] ;  then
  word="$(echo $word | sed "s/cks/x/g;s/cke/x/g")"
fi

if [ $doit -ge 4 ] ;  then
  word="$(echo $word | sed "s/and/\&/g;s/anned/\&/g;
     s/ant/\&/g")"
fi

And so, here's the second translation, a bit more sophisticated:


$ l33t.sh "banned? whatever. elite hacker, not scriptie."
B&? WH4T3V3R. 3LIT3 H4XR, N0T SCRIPTI3.

Note that it hasn't realized that "elite" should become L337 or L33T, but since it is supposed to be rather random, let's just leave this script as is. Kk? Kewl.

If you want to expand it, an interesting programming problem is to break each word down into individual letters, then randomly change lowercase to uppercase or vice versa, so you get those great ransom-note-style WeiRD LeTtEr pHrASes.
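
Here's one way that could look, as a rough sketch rather than part of the actual script; it assumes bash 4 or later for the ${ch^^} and ${ch,,} case operators, and the function name is just made up:


ransomize() {
  local input="$1" output="" ch i
  for (( i = 0; i < ${#input}; i++ )); do
    ch="${input:i:1}"
    if (( RANDOM % 2 )); then
      output+="${ch^^}"   # uppercase this character
    else
      output+="${ch,,}"   # lowercase this character
    fi
  done
  echo "$output"
}
ransomize "weird letter phrases"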

Next time, I plan to move on, however, and look at the great command-line tool youtube-dl, exploring how to use it to download videos and even just the audio tracks as MP3 files.

Help Canonical Test GNOME Patches, Android Apps Illegally Tracking Kids, MySQL 8.0 Released and More

News briefs for April 19, 2018.

Help Canonical test the GNOME desktop memory leak fixes in Ubuntu 18.04 LTS (Bionic Beaver) by downloading and installing the current daily ISO for your hardware from here: http://cdimage.ubuntu.com/daily-live/current/bionic-desktop-amd64.iso. Then download the patched version of gjs, install it, reboot, and just use your desktop normally. If performance seems impacted by the new packages, re-install from the ISO, skip the new packages, and see whether things are better. See the Ubuntu Community page for more detailed instructions.

Thousands of Android apps downloaded from the Google Play store may be tracking kids' data illegally, according to a new study. NBC News reports: "Researchers at the University of California's International Computer Science Institute analyzed 5,855 of the most downloaded kids apps, concluding that most of them 'are potentially in violation' of the Children's Online Privacy Protection Act 1998, or COPPA, a federal law making it illegal to collect personally identifiable data on children under 13."

MySQL 8.0 has been released. This new version "includes significant performance, security and developer productivity improvements enabling the next generation of web, mobile, embedded and Cloud applications." MySQL 8.0 features include MySQL document store, transactional data dictionary, SQL roles, default to utf8mb4 and more. See the white paper for all the details.

KDE announced this morning that KDE Applications 18.04.0 are now available. New features include improvements to panels in the Dolphin file manager; Wayland support for KDE's JuK music player; improvements to Gwenview, KDE's image viewer and organizer; and more.

Collabora Productivity, "the driving force behind putting LibreOffice in the cloud", announced a new release of its enterprise-ready cloud document suite—Collabora Online 3.2. The new release includes implemented chart creation, data validation in Calc, context menu spell-checking and more.

An Update on Linux Journal

An Update on Linux Journal
Image
Linux Journal magazine covers
Carlie Fairchild Wed, 04/18/2018 - 12:41

So many of you have asked how to help Linux Journal continue to be published* for years to come.

First, keep the great ideas coming—we all want to continue making Linux Journal 2.0 something special, and we need this community to do it.

Second, subscribe or renew. Magazines have a built-in fundraising program: subscriptions. It's true that most magazines don't survive on subscription revenue alone, but having a strong subscriber base tells Linux Journal, prospective authors, and yes, advertisers, that there is a community of people who support and read the magazine each month.

Third, if you prefer reading articles on our website, consider becoming a Patron. We have different Patreon reward levels; one even gets your name immortalized in the pages of Linux Journal.

Fourth, spread the word within your company about corporate sponsorship of Linux Journal. We as a community reject tracking, but we explicitly invite high-value advertising that sponsors the magazine and values readers. This is new and unique in online publishing, and just one example of our pioneering work here at Linux Journal.  

Finally, write for us! We are always looking for new writers, especially now that we are publishing more articles more often.  

With all our gratitude,

Your friends at Linux Journal

 

*We'd be remiss not to acknowledge and thank Private Internet Access for saving the day and bringing Linux Journal back from the dead. They are incredibly supportive partners, and sincerely, we cannot thank them enough for keeping us going. At a certain point, however, Linux Journal has to become sustainable on its own.