Work with the Bash shell

Select and concatenate data files

In Bash, * is a wildcard that matches zero or more characters. When the shell sees a wildcard, it expands it into a list of matching filenames before running the command. As an exception, if a wildcard expression does not match any file, Bash passes the expression to the command unchanged. For example, typing ls *.pdf in the mountain_data/ directory (which does not contain any PDF files) results in an error message saying that there is no file called *.pdf.
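
The expansion behavior described above can be tried out safely in a scratch directory. This is a minimal sketch, assuming Bash's default settings (the nullglob and failglob options switched off). The directory and the empty files are stand-ins created for the demonstration, not the real workshop data.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Empty stand-in files (assumption: names borrowed from the workshop data).
touch annapurna-np.txt chooyu-np.txt nangaparbat-pl.txt

# The shell replaces *.txt with the matching names before echo runs,
# so echo only ever sees the three filenames.
echo *.txt

# No .pdf files exist, so the pattern is handed to echo unchanged.
echo *.pdf

# ls receives the literal string "*.pdf", finds no file with that name,
# and reports an error; the fallback message is only there for the demo.
ls *.pdf || echo "ls failed: no file is literally named *.pdf"
```

Using echo first is a handy habit: it shows what a pattern expands to before the pattern is given to a command that actually reads or changes files.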

Here’s how the wildcard works in practice: *.txt matches annapurna-np.txt, chooyu-np.txt and every other file whose name ends with ‘.txt’. On the other hand, n*.txt matches only nangaparbat-pl.txt, because the leading ‘n’ restricts the match to filenames that begin with the letter ‘n’.
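
To see the difference between the two patterns, the following sketch lists what each one matches. It assumes a scratch directory with empty stand-in files. The extra file n.txt is hypothetical and is not part of the workshop data; it is only there to show that * can also match zero characters.

```shell
# Work in a throwaway directory with empty stand-in files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch annapurna-np.txt chooyu-np.txt nangaparbat-pl.txt

# Every name ending in .txt matches, because * can stand for any run of characters.
printf '%s\n' *.txt

# The leading n is literal: only names that start with n and end in .txt match.
printf '%s\n' n*.txt

# Hypothetical extra file: * may match zero characters, so n*.txt also matches n.txt.
touch n.txt
printf '%s\n' n*.txt
```

printf '%s\n' prints one matched name per line, which makes longer lists easier to read than the single line produced by echo.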

Exercise: Explore data files

Take a look at the contents of the mountain_data/ directory. Use ls and cat to inspect the names, file extensions and contents of the files. What patterns and information do you see? Tip: you can write your observations in the comments section of the workshop website, and see what others observed. :)

Exercise: Select and concatenate data files

Right now, the data for each mountain in the mountain_data/ folder are stored in a separate file. That might be helpful for some applications, but we want all of the data in a single .txt file called tallest-mountains.txt. How could we do that? Hint: We did this during the teaching session for all mountains in Nepal. The command to use starts with an ‘a’ and reminds me of my teenage years… :)

Exercise: Add data to tallest-mountains.txt

Right now, the tallest-mountains.txt file contains geographic and ascent data for the 10 tallest mountains in the world. A good start, but there are many more cool mountains in the world! Add a few more mountains to the tallest-mountains.txt file using nano on the command line. Don’t forget to keep the same format as the rest of the data: name, lat, lon, height in meters, successful ascents (pre-2004), unsuccessful ascents (pre-2004). Hint: You can find a link to the data source in the mountain_data/README.md file.