Harness the power of bash and learn how to scrape websites for exciting new images every morning.
So, you want a cool dynamic desktop wallpaper without dodgy programs and a million viruses? The good news is, this is Linux, and anything is possible. I started this project because I was bored of my standard OS desktop wallpaper, and I have slowly created a plethora of scripts to pull images from several sites and set them as my desktop background. It's a nice little addition to my day—being greeted by a different cat picture or a panorama of a country I didn't know existed. The great news is that it's easy to do, so let's get started.
Bash (the Bourne Again shell) is standard across almost all *NIX systems and provides a wide range of operations "out of the box", which would take time and copious lines of code to achieve in a conventional coding or even scripting language. Additionally, there's no need to re-invent the wheel. It's much easier to use somebody else's program to download webpages, for example, than to deal with low-level system sockets in C.
How's It Going to Work?
The concept is simple. Choose a site with images you like and "scrape" the page for those images. Then once you have a direct link, you download them and set them as the desktop wallpaper using your desktop environment's tools. Easy, right?
A Simple Example: xkcd
Now, what if you want to see the latest xkcd comic without venturing to the xkcd site? You need a script to do it for you. First, you need to know how the webpage looks to the computer, so download it and take a look. To do this, use wget, an easy-to-use, commonly installed, non-interactive network downloader. So, on the command line, call wget, and give it the link to the page:
user@LJ $: wget https://www.xkcd.com/
--2018-01-27 21:01:39--  https://www.xkcd.com/
Resolving www.xkcd.com... 184.108.40.206, 220.127.116.11, 18.104.22.168, ...
Connecting to www.xkcd.com|22.214.171.124|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2606 (2.5K) [text/html]
Saving to: 'index.html'

index.html 100% [==========================================================>] 2.54K --.-KB/s in 0s

2018-01-27 21:01:39 (23.1 MB/s) - 'index.html' saved
As you can see in the output, the page has been saved to index.html in your current directory. Using your favourite editor, open it and take a look (I'm using nano for this example):
user@LJ $: nano index.html
Now you might realize, despite this being a rather bare page, there's a
lot of code in that file. Instead of going through it all, let's use
grep, which is perfect for this task. Its sole function is
to print lines matching your search. Grep uses the syntax:
user@LJ $: grep [search] [file]
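One detail worth knowing before searching is that grep is case-sensitive by default. The sketch below illustrates this on a small made-up file; the file contents are hypothetical stand-ins for a downloaded page, not the real xkcd HTML.

```shell
#!/bin/bash
# Hypothetical sample file standing in for a downloaded page
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<title>xkcd: Night Sky</title>
<img src="//imgs.xkcd.com/comics/night_sky.png" alt="Night Sky" />
<p>Nothing to see here</p>
EOF

# Plain search is case-sensitive, so "Night Sky" in the title does not match
plain_count="$(grep -c "night" "$sample")"

# -i ignores case, so the title line matches as well
nocase_count="$(grep -i -c "night" "$sample")"

echo "case-sensitive matching lines: $plain_count"
echo "case-insensitive matching lines: $nocase_count"
rm -f "$sample"
```

Here -c makes grep print the number of matching lines instead of the lines themselves, which is handy for a quick check of how specific a search term is.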
Looking at the daily comic, its current title is "Night Sky". Searching
for "night" with
grep yields the following results:
user@LJ $: grep "night" index.html
Image URL (for hotlinking/embedding): https://imgs.xkcd.com/comics/night_sky.png
The grep search has returned two image links in the file, each related to "night". Looking at those two lines, one is the image in the page, and the other is for hotlinking and is already a usable link. You'll be obtaining the first link, however, as it is more representative of other pages that don't provide an easy link, and it serves as a good introduction to the use of grep and cut.
To get the first link out of the page, you first need to identify it in the file programmatically. Let's try grep again, but this time instead of using a string you already know ("night"), let's approach as if you know nothing about the page. Although the link will be different, the HTML should remain the same; therefore, "img src=" always should appear before the link you want:
user@LJ $: grep "img src=" index.html
It looks like there are three images on the page. Comparing these results with those from the first grep, you'll see that the comic's image link is the one that also appeared in the first grep. The other two links contain "/s/", whereas the link you want contains "/comics/". So, you need to grep the output of the last command for "/comics/". To pass along the output of the last command, use the pipe character (|):
user@LJ $: grep "img src=" index.html | grep "/comics/"
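To make the filtering step concrete, here is a minimal sketch that runs the same two-stage grep against a made-up page. The three image lines are assumptions chosen to mirror the layout described above (two "/s/" images and one "/comics/" image); the file names logo.png and footer.png are hypothetical.

```shell
#!/bin/bash
# Hypothetical page with three images: two site graphics and one comic
page="$(mktemp)"
cat > "$page" <<'EOF'
<img src="/s/logo.png" alt="logo" />
<img src="//imgs.xkcd.com/comics/night_sky.png" alt="Night Sky" />
<img src="//imgs.xkcd.com/s/footer.png" alt="footer" />
EOF

# The first grep keeps every image line
image_count="$(grep -c "img src=" "$page")"

# The pipe hands those lines to a second grep, which keeps only the comic
comic_line="$(grep "img src=" "$page" | grep "/comics/")"

echo "image lines: $image_count"
echo "comic line:  $comic_line"
rm -f "$page"
```

Each grep in the chain only sees what the previous one let through, so narrowing a search is a matter of adding another stage to the pipe.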
And, there's the line! Now you just need to separate the image link from the rest of it with the cut command, which uses the syntax:
user@LJ $: cut [-d delimiter] [-f field] [-c characters]
To cut the link from the rest of the line, you'll want to cut next to the quotation mark and select the field before the next quotation mark. In other words, you want the text between the quotes, or the link, which is done like this:
user@LJ $: grep "img src=" index.html | grep "/comics/" | cut -d\" -f2
//imgs.xkcd.com/comics/night_sky.png
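It can help to see how cut numbers the fields when the delimiter is a quotation mark. This sketch splits a single made-up line of HTML; the alt attribute and its value are assumptions for illustration, not copied from the real page.

```shell
#!/bin/bash
# Hypothetical line of HTML; the attribute values are made up for illustration
line='<img src="//imgs.xkcd.com/comics/night_sky.png" alt="Night Sky" />'

# Splitting on the double quote gives these fields:
#   1: <img src=
#   2: //imgs.xkcd.com/comics/night_sky.png
#   3:  alt=
#   4: Night Sky
field1="$(printf '%s\n' "$line" | cut -d\" -f 1)"
field2="$(printf '%s\n' "$line" | cut -d\" -f 2)"
field4="$(printf '%s\n' "$line" | cut -d\" -f 4)"

echo "field 1: $field1"
echo "field 2: $field2"
echo "field 4: $field4"
```

Field 2 is the link because it is the text between the first and second quotation marks, which is why the command above selects it.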
And, you've got the link. But wait! What about those pesky forward slashes at the beginning? You can cut those out too:
user@LJ $: grep "img src=" index.html | grep "/comics/" | cut -d\" -f 2 | cut -c 3-
imgs.xkcd.com/comics/night_sky.png
Now you've just cut the first two characters from the line, and you're left with a link straight to the image. Using wget again, you can download the image:
user@LJ $: wget imgs.xkcd.com/comics/night_sky.png
--2018-01-27 21:42:33--  http://imgs.xkcd.com/comics/night_sky.png
Resolving imgs.xkcd.com... 126.96.36.199, 2a04:4e42:4::67
Connecting to imgs.xkcd.com|188.8.131.52|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 54636 (53K) [image/png]
Saving to: 'night_sky.png'

night_sky.png 100% [===========================================================>] 53.36K --.-KB/s in 0.04s

2018-01-27 21:42:33 (1.24 MB/s) - 'night_sky.png' saved [54636/54636]
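Note that the session above connects over plain HTTP on port 80, because a link without a scheme leaves wget to pick one. If you would rather request the image over HTTPS, one option is to add the scheme yourself instead of trimming the slashes. The helper below is a hypothetical sketch (to_https is not part of the original script), and it assumes the image host also serves HTTPS.

```shell
#!/bin/bash
# Turn a scraped link into an explicit HTTPS URL.
to_https() {
    local link="$1"
    case "$link" in
        //*)                printf 'https:%s\n' "$link" ;;    # //host/path -> https://host/path
        http://*|https://*) printf '%s\n' "$link" ;;          # already absolute, leave alone
        *)                  printf 'https://%s\n' "$link" ;;  # bare host/path, as left by cut -c 3-
    esac
}

to_https "//imgs.xkcd.com/comics/night_sky.png"
to_https "imgs.xkcd.com/comics/night_sky.png"
```

Both calls print the same HTTPS URL, so the helper can sit after either form of the cut pipeline.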
Now you have the image in your directory, but its name will change when the
comic's name changes. To fix that, tell
wget to save it with
a specific name:
user@LJ $: wget "$(grep "img src=" index.html | grep "/comics/" | cut -d\" -f2 | cut -c 3-)" -O wallpaper
--2018-01-27 21:45:08--  http://imgs.xkcd.com/comics/night_sky.png
Resolving imgs.xkcd.com... 184.108.40.206, 2a04:4e42:4::67
Connecting to imgs.xkcd.com|220.127.116.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 54636 (53K) [image/png]
Saving to: 'wallpaper'

wallpaper 100% [==========================================================>] 53.36K --.-KB/s in 0.04s

2018-01-27 21:45:08 (1.41 MB/s) - 'wallpaper' saved [54636/54636]
The -O option means that the downloaded image now has been saved as "wallpaper". Now that you know the name of the image, you can set it as a wallpaper. This varies depending upon which desktop environment you're using. The most popular are listed below, assuming the image is located at /home/user/wallpaper:
GNOME:
gsettings set org.gnome.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.gnome.desktop.background picture-options scaled

Cinnamon:
gsettings set org.cinnamon.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.cinnamon.desktop.background picture-options scaled

Xfce:
xfconf-query --channel xfce4-desktop --property /backdrop/screen0/monitor0/image-path --set /home/user/wallpaper
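If the same script has to work on more than one desktop, the choice between these commands could be made automatically. The following is only a sketch: the wallpaper_commands function is hypothetical, and the patterns it matches are assumptions about the values that the XDG_CURRENT_DESKTOP environment variable typically holds. It prints the commands rather than running them, so you can inspect the result first.

```shell
#!/bin/bash
# Print (rather than run) the wallpaper commands for a desktop name and image path.
# The patterns matched below are assumptions about what $XDG_CURRENT_DESKTOP holds.
wallpaper_commands() {
    local desktop="$1" image="$2"
    case "$desktop" in
        *GNOME*)
            echo "gsettings set org.gnome.desktop.background picture-uri \"file://$image\""
            echo "gsettings set org.gnome.desktop.background picture-options scaled"
            ;;
        *Cinnamon*)
            echo "gsettings set org.cinnamon.desktop.background picture-uri \"file://$image\""
            echo "gsettings set org.cinnamon.desktop.background picture-options scaled"
            ;;
        *XFCE*)
            echo "xfconf-query --channel xfce4-desktop --property /backdrop/screen0/monitor0/image-path --set $image"
            ;;
        *)
            echo "unsupported desktop: $desktop" >&2
            return 1
            ;;
    esac
}

wallpaper_commands GNOME /home/user/wallpaper
```

In a real script you might pass "$XDG_CURRENT_DESKTOP" as the first argument and run the printed commands once you are happy with them.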
You can set your wallpaper now, but you need different images to mix in.
Looking at the webpage, there's a "random" button that takes you
to a random comic. Searching with
grep for "random" returns the following:
The result contains the link to a random comic, c.xkcd.com/random/comic/. After downloading it with wget and reading the result, it looks like the initial comic page. Success!
Now that you've got all the components, let's put them together into a script, replacing www.xkcd.com with the new c.xkcd.com/random/comic/:
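#!/bin/bash

# Download a random comic page; -O overwrites index.html on every run
wget c.xkcd.com/random/comic/ -O index.html

# Extract the comic's image link from the page and save the image as "wallpaper"
wget "$(grep "img src=" index.html | grep /comics/ | cut -d\" -f 2 | cut -c 3-)" -O wallpaper

# Set the downloaded image as the GNOME desktop background
gsettings set org.gnome.desktop.background picture-uri "file:///home/user/wallpaper"
gsettings set org.gnome.desktop.background picture-options scaled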
All of this should be familiar except the first line, which designates this as a bash script, and the second wget command. To capture the output of commands, you use command substitution, $(). In this case, you're capturing the output of the grepping and cutting process (the final link) and then downloading it with wget. When the script is run, the commands inside the parentheses are all run first, producing the image link, before wget is called to download it.
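Command substitution also lets you store the link in a variable and check it before handing it to wget, which is useful if the page layout changes and the search comes back empty. The sketch below uses a made-up line of HTML in place of the downloaded page, and it only prints what it would download, so it runs without a network connection.

```shell
#!/bin/bash
# Hypothetical stand-in for the contents of the downloaded page
page='<img src="//imgs.xkcd.com/comics/night_sky.png" title="sample" />'

# Command substitution: run the pipeline and capture what it prints
link="$(printf '%s\n' "$page" | grep "img src=" | grep "/comics/" | cut -d\" -f 2 | cut -c 3-)"

# An empty result means the search found nothing, so there is nothing to download
if [ -n "$link" ]; then
    echo "would download: $link"
else
    echo "no comic image found" >&2
fi
```

In the real script, the echo in the first branch would be replaced by the wget call, and the second branch could exit early so that the old wallpaper stays in place.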
There you have it—a simple example of a dynamic wallpaper that you can run anytime you want.
If you want the script to run automatically, you can add a cron job to run it for you. So, edit your crontab with:
user@LJ $: crontab -e
My script is called "xkcd", and my crontab entry looks like this:
@reboot /bin/bash /home/user/xkcd
This will run the script (located at /home/user/xkcd) using bash on every restart.
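If your machine rarely restarts, a time-based entry may suit you better than @reboot. As a small sketch, the hypothetical helper below (daily_cron_line is not a standard tool) builds a crontab line that runs a script once a day at a chosen hour; the five leading fields are minute, hour, day of month, month and day of week.

```shell
#!/bin/bash
# Build a crontab line that runs a script every day at the given hour, minute 0.
daily_cron_line() {
    local hour="$1" script="$2"
    printf '0 %s * * * /bin/bash %s\n' "$hour" "$script"
}

# Example: run /home/user/xkcd every morning at 08:00
daily_cron_line 8 /home/user/xkcd
```

The printed line can be pasted into the editor opened by crontab -e. Be aware that cron runs jobs outside your desktop session, so on some setups the gsettings call may need extra environment variables (such as the session's D-Bus address) before the wallpaper actually changes.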
The script above shows how to search for images in HTML code and download them. But, you can apply this to any website of your choice—although the HTML code will be different, the underlying concepts remain the same. With that in mind, let's tackle downloading images from Reddit. Why Reddit? Reddit is possibly the largest blog on the internet and the third-most-popular site in the US. It aggregates content from many different communities together onto one site. It does this through use of "subreddits", communities that join together to form Reddit. For the purposes of this article, let's focus on subreddits (or "subs" for short) that primarily deal with images. However, any subreddit, as long as it allows images, can be used in this script.
Figure 1. Scraping the Web Made Simple—Analysing Web Pages in a Terminal
Just like the xkcd script, you need to download the web page from a subreddit to analyse it. I'm using reddit.com/r/wallpapers for this example. First, check for images in the HTML:
user@LJ $: wget https://www.reddit.com/r/wallpapers/ && grep "img src=" index.html
--2018-01-28 20:13:39--  https://www.reddit.com/r/wallpapers/
Resolving www.reddit.com... 18.104.22.168
Connecting to www.reddit.com|22.214.171.124|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 27324 (27K) [text/html]
Saving to: 'index.html'

index.html 100% [==========================================================>] 26.68K --.-KB/s in 0.1s

2018-01-28 20:13:40 (270 KB/s) - 'index.html' saved