Original computing articles by a systems administrator

Working with Spreadsheets from the Linux Command Line Interface

Preparing the Spreadsheet:
Before doing any of these operations you are going to want to save them as TAB delimited files.

In order for your tabs to line up when being displayed in the terminal you may need to change your tab stop settings. In a regular terminal you would use:

setterm -regtabs #


Replace # with the new number of spaces a tab stop should be, the default is 8. setterm -tabs will display an output of what the tabs will look like. If you are using xterm or gnome-terminal add -term linux so the full command might look like:

setterm -term linux -regtabs 20
Or:

setterm -term linux -tabs


I have also set my non-x console for a higher resolution, so I can get more of the sheet on the screen, this can be done by editing your kernel boot parameters in your grub or lilo conf. See: http://shrimpworks.za.net/2005/01/31/change-your-console-resolution-and-colour-depth/

Some Basic Operations:

To apply changes in all the following commands you are going to want redirect the standard output from the screen (essentially, as far as linux is concerned, the screen is just another file) to a new file using:

command -options original_file > new_file

DO NOT USE:

command -options orginal_file > orginal_file

So to add a column or row numbers to your spreadsheet:

cat -n oldfile > newfile


1. Sort all columns by a specific row:

sort -t $’\t’ +1 file

Or in descending (reverse) order:

sort –rt $’\t’ +1 file

Explanation: -t $’\t’ sends a literal TAB to sort, and has been the only way I can get bash and sort to reliably distinguish between spaces and tabs. This command will sort the file by the second column, so column 1 is considered column 0. If you want column 1 to be column one then:

sort –t &’\t’ -k1 file

2. Add the sum of all numbers in a column, in this case column 2 (thanks to http://www.pixelbeat.org/cmdline.html)

cut -f2 file | (tr ‘\n’ +; echo 0) | bc

Explanation: cut -f2 = extract column 2 from the file. | = sends the output of that command to the input of the next command (pipe). (tr ‘\n’ +; echo 0) = translate new lines to the plus sign, the semi-colon tells bash that a new command is starting so echo 0 is separate from the tr command. Bash understands new lines and tabs etc as characters, in this case \n = new line. The ‘echo zero’ is because the last new line will be a plus sign, so bc needs something after it, 0 won’t effect the mathematical results. bc is the Basic Calculator, and will print the mathematical result of its input.

3. Get the mathematical mean of a column 2, out to 20 decimal places

cut -f2 file | (echo –n \(;tr ‘\n’ +; echo 0\)\/`cat file | wc -l`) | bc -l

Explanation: I will just explain how this differs from the above. The text getting piped to bc if column 2 had 5 rows with one number in each row would be something like(3+5+2+8+6+0)/5 if row 1 had a value of 3, row 2 a value of 5 etc… The -n switch tells echo not to add a /n or newline character to the end of its output.
`cat file | wc -l` is command substitution, (note I am using back quotes a.k.a. back ticks, it is the character that exists on the same key as the tilde on a standard US keyboard), in command substitution the output of the command or commands within the back quotes are substituted in the place of the command. So in this case, wc -l outputs the number of lines in the file. I pipe the output of cat to wc instead of wc -l file because wc will normally output the filename as well.
When there is a \( or \/ I am escaping the character. Bash is often referred to as the Bash interpreter because it interprets what you type in before it sends it to the command, so since those characters have special meanings to Bash you need to tell bash to send the character literally to the command–without it being interpreted.

4. How to reorder the columns. With the following the original Column 3 is now column 1, and the original column 1 is now column 2, etc.

awk -F file \t’ ‘{OFS=”\t”}{print $3,$1,$2,$4}’ file

Due to the complexity of awk I am not going to explain this command, awk is considered a programming language in itself, but I wanted to include this basic operation.

I hope this is a good starting point for those who want to know how to perform some text manipulation from the command line with files in a spreadsheet format.


Leave a Reply

*