Today I Learned

A Hashrocket project

Parallel xargs fails if any of its children do

We like to write about xargs. In addition to all that, it turns out xargs is a great tool for easily parallelizing tests, linters, or anything else where some runs may pass and some may fail. If any of the processes that xargs spawns fails, the xargs call will also fail.

All child processes exit zero:


% echo "0\n0\n0" | xargs -Icode -P4 sh -c 'exit code'; echo exit code: $?
exit code: 0

And so does xargs! If any exit non-zero:

% echo "0\n1\n127" | xargs -Icode -P4 sh -c 'exit code'; echo exit code: $?
exit code: 1

xargs follows suit.
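As a rough sketch of how this could gate a script, the pipeline below runs a stand-in check in parallel and branches on the status of xargs. The `test "$1" != bad` check and the `ok`/`bad` inputs are made up for illustration; a real linter or test runner would take their place. The exact non-zero status differs between xargs implementations, so the sketch only tests for success or failure.

```shell
# Run one check per input item, up to four at a time.
# The trailing _ fills $0 so that each item arrives as $1.
if printf '%s\n' ok ok bad | xargs -n1 -P4 sh -c 'test "$1" != bad' _; then
  result="all passed"
else
  result="at least one failed"
fi
echo "$result"
```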

xargs substitution

Output piped to xargs will be placed at the end of the command passed to xargs. This can be problematic when we want the output to go in the middle of the command.

> echo "Bravo" | xargs echo "Alpha Charlie"
Alpha Charlie Bravo

xargs has a facility for substitution, however. Indicate the symbol or string you would like to replace with the -I flag.

> echo "Bravo" | xargs -I SUB echo "Alpha SUB Charlie"
Alpha Bravo Charlie

You can use the symbol or phrase twice:

> echo "Bravo" | xargs -I SUB echo "Alpha SUB Charlie, SUB"
Alpha Bravo Charlie, Bravo

If xargs is passed two lines, it will call the command with the substitution twice.

> echo "Bravo\nDelta" | xargs -I SUB echo "Alpha SUB Charlie SUB"
Alpha Bravo Charlie Bravo
Alpha Delta Charlie Delta
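Using the placeholder twice is handy when a command needs the same value as both source and destination. Here is a small sketch that makes a .bak copy of each file; the file names and the throwaway directory are invented for the example.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"

# FILE is replaced in both positions, once per input line.
printf '%s\n' "$workdir/a.txt" "$workdir/b.txt" | xargs -I FILE cp FILE FILE.bak

listing=$(ls "$workdir")
echo "$listing"
rm -r "$workdir"
```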

Parallel shell processing with xargs

Today I learned how to run a slow command in parallel in my shell. We can use xargs combined with the -n and -P flags. Let’s see how this works:

find . -type f | xargs -n1 -P8 slow_command
  • slow_command: your slow command, which receives a file as its first argument
  • -n: how many arguments are passed to each slow_command call
  • -P: how many parallel workers xargs will spawn to run slow_command

Check this out watch -d -n 0.1 "seq 10 | xargs -n2 -P8 echo":

[Image: watch-xargs]

In this example xargs spawns up to 8 workers to run the echo command, and passes 2 arguments to each echo execution. The arguments are produced by seq 10, and since multiple executions of echo run in parallel, we can highlight the output changes with watch.
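Without watch, the grouping done by -n2 can still be inspected by capturing the output once. This is only a sketch: the order in which parallel workers finish is not guaranteed, so the lines are sorted before printing.

```shell
# Each echo receives two numbers (-n2); up to eight echos run at once (-P8).
# Sorting makes the result stable regardless of which worker finishes first.
pairs=$(seq 10 | xargs -n2 -P8 echo | sort -n)
echo "$pairs"
```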

Confirming operations with `xargs -p`

xargs is a great tool to take a lot of input and execute a lot of different commands based on that input. Sometimes, though, if you are performing destructive or mutative actions with xargs, you want to proceed more cautiously.

> echo "banana apple orange" | tr ' ' '\n' | xargs -n1 echo "I like"

This outputs:

I like banana
I like apple
I like orange

But maybe I don’t like some of those things, please ask! Including the -p flag with xargs forces a prompt.

> echo "banana apple orange" | tr ' ' '\n' | xargs -p -n1 echo "I like"
echo I like banana ?...n
echo I like apple ?...y
I like apple
echo I like orange ?...n

Yep, I only like apples.

Xargs from a file

I’ve struggled with xargs conceptually for a long time, but it’s actually pretty easy. For commands that don’t read from stdin but do take arguments, like echo or kill, xargs turns newline-separated values from stdin into arguments.

Piping to echo does not work.

> echo 123 | echo
# nothing

Using xargs it does.

> echo 123 | xargs echo
123

xargs can also read a file with the -a flag, turning each line of the file into an argument.

> echo "123\nabc" > test.txt
> cat test.txt
123
abc
> xargs -a test.txt echo
123 abc
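As far as I know, -a is a GNU extension and may be missing from other xargs implementations. A sketch of an alternative that should work anywhere is to redirect the file into xargs on stdin; the temp file here is just for the example.

```shell
# Create a small input file, one value per line.
input=$(mktemp)
printf '123\nabc\n' > "$input"

# Redirecting the file to stdin gives xargs the same lines that -a would read.
from_stdin=$(xargs echo < "$input")
echo "$from_stdin"
rm -f "$input"
```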

H/T Brian Dunn

Two arguments in a command with xargs and bash -c

You can use substitution -I{} to put the argument into the middle of the command.

> echo "a\nb\nc\nd" | xargs -I{} echo {}!
a!
b!
c!
d!

I can use -L2 to provide exactly 2 arguments to the command:

> echo "a\nb\nc\nd" | xargs -I{} -L2 echo {}!
a b!
c d!

But I want to use two arguments, the first in one place, the next in another place:

> echo "a\nb\nc\nd" | xargs -I{} -L2 echo {}x{}!
a bxa b!
c dxa b!

I wanted axb! but got a bxa b!. In order to achieve this you have to pass arguments to a bash command.

> echo "a\nb\nc\nd" | xargs -L2 bash -c 'echo $0x$1!'
axb!
cxd!

Just like calling

bash -c 'echo $0x$1!' a b

Where $0 represents the first argument and $1 represents the second argument.
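A common variation, sketched here, passes a throwaway first argument so the real values land in $1 and $2, leaving $0 as the conventional "program name" slot. The `_` placeholder is an arbitrary choice, and this version uses sh with -n2 (two arguments per call) rather than bash with -L2 (two lines per call); for this input they behave the same.

```shell
# _ fills $0; the two values from each batch arrive as $1 and $2.
# Braces keep the literal x from being read as part of a parameter name.
joined=$(printf '%s\n' a b c d | xargs -n2 sh -c 'echo "${1}x${2}!"' _)
echo "$joined"
```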

Call a program one time for each argument w/ xargs

Generally, I’ve used xargs in combination with programs like kill or echo, both of which accept a variable number of arguments. Some programs only accept one argument.

For lack of a better example, let’s try adding 1 to each of 10 numbers. In shell environments you can add with the expr command.

> expr 1 + 1
2

I can combine this with seq and pass the piped values from seq to expr with xargs.

> seq 10 | xargs expr 1 + 
expr: syntax error

In the above, instead of adding 1 to 1 and then 1 to 2, it tries to run:

expr 1 + 1 2 3 4 5 6 7 8 9 10

Syntax Error!

We can use the -n flag to ensure that only one argument is applied at a time and the command runs 10 times.

> seq 10 | xargs -n1 expr 1 +
2
3
4
5
6
7
8
9
10
11

For more insight into what’s being called, use the -t flag to see the commands.
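A quick sketch of what -t shows: xargs writes each constructed command to stderr before running it, so here stderr is captured and the normal output of expr is discarded. The exact formatting of the trace may vary a little between xargs versions.

```shell
# 2>&1 >/dev/null keeps only stderr (the -t trace) and drops expr's results.
trace=$(seq 3 | xargs -n1 -t expr 1 + 2>&1 >/dev/null)
echo "$trace"
```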

Delete remote branches with confirmation

Branches on the git server can sometimes get out of control. Here’s a sane way to clean up those remote branches that offers a nice confirmation before deletion, so that you don’t delete something you don’t want to delete.

git branch -a | grep remotes | awk '{gsub(/remotes\/origin\//, ""); print;}' | xargs -I % -p git push origin :%

The -p flag of xargs provides the confirmation.

Get ONLY PIDs for processes listening on a port

The lsof utility on Linux is useful among other things for checking which process is listening on a specific port.

If you need to kill all processes listening on a particular port, normally you would reach for something like awk '{ print $2 }', but that would fail to remove the PID column header, so you would also need to pipe through something like tail -n +2 to skip that first line. It gets pretty verbose for something that should be pretty simple.

Fortunately, lsof provides a way to list all the pids without the PID header specifically so you can pipe the output to the kill command.

The -t flag removes everything from the output except the pids of the resulting processes from your query.

In this example I used a query to return all processes listening on port 3000 and return their PID:

lsof -ti tcp:3000

The output of which will look something like:

6540
6543
21715

This is perfect for piping into kill using xargs:

lsof -ti tcp:3000 | xargs kill

No awks or tails necessary! 🐕
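One edge case worth a sketch: when nothing is listening, lsof prints nothing, and GNU xargs still runs the command once with no arguments. Assuming GNU xargs, the -r flag (--no-run-if-empty) skips the command in that case; other implementations may behave differently or not accept the flag. Here echo stands in for kill and an empty printf stands in for an lsof query with no matches.

```shell
# Empty input stands in for an lsof query that matched nothing.
without_r=$(printf '' | xargs echo "kill")
with_r=$(printf '' | xargs -r echo "kill")

echo "without -r: [$without_r]"
echo "with -r:    [$with_r]"
```

With kill in place of echo, that would look like lsof -ti tcp:3000 | xargs -r kill.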

Mass-Delete Git Tags

Building off this post:

I’m an advocate of Semantic Version tagging. It communicates to a team about every deploy and makes that rare rollback easier. So when does it not make sense to use a tag?

When you’re the only developer (nobody to communicate with except yourself), and also using a platform like Heroku that tags every release (your tags are redundant). This is the case with my blog, so today I set out to delete all my Git tags.

First, delete them remotely (assuming a remote named origin):

$ git tag | xargs git push --delete origin

We also have to delete our local tags, or a tag push will create them again on the remote:

$ git tag | xargs git tag -d

$ git tag now returns nothing, and there are no remote tags.

Find (and kill) all processes listening on a port

To search for processes that listen on a specific port, use lsof, or “List Open Files”. The -n argument makes the command run faster by preventing it from doing an IP-to-hostname conversion (it’s still pretty slow). Use grep to show only lines containing the word LISTEN.

lsof -n | grep LISTEN

A SIGNIFICANTLY FASTER way is to use the -i option to filter for a specific port:

lsof -i tcp:[PORT]

To kill all processes listening on a specific port use:

lsof -ti tcp:5900 | xargs kill

The -t flag returns only the PIDs, exactly for the purpose of piping them somewhere, and xargs passes them to kill as arguments.

If the process is more persistent, and kill did not work, try kill -9 to kill it more aggressively.

Rerun only failed specs with tmux

You don’t want to run your whole suite again, just those failures?

Copy the bit below Failed examples: into your tmux buffer:

rspec ./spec/mailers/order_mailer_spec.rb:5 # OrderMailer reciept includes the support email and phone number
rspec ./spec/features/customer_checks_out_spec.rb:43 # customer checks out happy path
...

Now make command line args for rspec out of it:

tmux showb | ag -o '[^\s]+:\d+' | tr '\n' ' ' > rerun.txt

Run only this set of specs with xargs:

xargs rspec < rerun.txt
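If ag or tmux are not at hand, the same extraction can be sketched with grep -oE (the -o flag is widely available in GNU and BSD grep, though not guaranteed everywhere). In this sketch the failure lines are pasted into a variable instead of the tmux buffer, and echo is placed in front of rspec so the command is printed rather than run.

```shell
# The two failure lines from above, standing in for the tmux buffer.
failed='rspec ./spec/mailers/order_mailer_spec.rb:5 # OrderMailer reciept includes the support email and phone number
rspec ./spec/features/customer_checks_out_spec.rb:43 # customer checks out happy path'

# Keep only the path:line tokens, then let xargs assemble the command line.
# echo makes this a dry run: it prints the rspec command instead of executing it.
rerun=$(printf '%s\n' "$failed" | grep -oE '[^[:space:]]+:[0-9]+' | xargs echo rspec)
echo "$rerun"
```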

Grep For Files With Multiple Matches

The grep utility is a great way to find files that contain a certain pattern:

$ grep -r ".class-name" src/css/

This will recursively look through all the files in your css directory to find matches of .class-name.

Oftentimes these kinds of searches turn up too many results, and you’ll want to pare them back by providing some additional context.

For instance, we may only want results where @media only screen also appears, but on a different line. To do this, we need to chain a series of greps together.

$ grep -rl "@media only screen" src/css |
    xargs grep -l ".class-name"

This will produce a list of filenames (hence the -l flag) that contain both a line with @media only screen and a line with .class-name.

If you need to, chain more grep commands on to narrow things down even further.

See man grep for more details.
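To see the chain in action without a real project, here is a sketch against three made-up CSS files in a temporary directory. Only the file containing both patterns should survive the second grep.

```shell
# Three throwaway files: one with both patterns, two with only one each.
cssdir=$(mktemp -d)
printf '@media only screen {}\n.class-name {}\n' > "$cssdir/both.css"
printf '.class-name {}\n' > "$cssdir/only-class.css"
printf '@media only screen {}\n' > "$cssdir/only-media.css"

# First grep lists files with the media query; xargs hands them to the second grep.
matches=$(grep -rl "@media only screen" "$cssdir" | xargs grep -l ".class-name")
echo "$matches"
rm -r "$cssdir"
```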

Kill rogue shell processes

There is a particular type of attack where an inserted usb stick can act like a keyboard, open a terminal, and start something like this:

while (true); do something_malicious; sleep 3600; done & disown

This process endlessly loops and wakes every hour to do something malicious. The & puts it in the background and the disown will end its attachment to the current terminal. When the terminal is closed the process will get a parent of 1.

This process is still detectable and killable at the command line by finding all shell programs with a parent pid of 1 and killing them with -9.

ps ax -o pid,command,ppid | grep '.*zsh.*\s1$' | awk '{print $1}' | xargs kill -9

This will kill all running rogue zsh processes. There may be reasons why you’d want a process to be detached from its parent terminal, but you could easily decide that this isn’t something you ever want and place the above command into a cron job that runs every minute (the shortest interval cron supports).

Run Prettier on all #JavaScript files in a dir

If you are like me, you like formatters such as Prettier, and have probably set your editor to auto-format files on save.

That’s great for new projects but when working on an existing project, every file you touch will have a huge diff in git that can obscure the real changes made to the file.

To solve that, run Prettier on all your JavaScript files in an independent commit. You can do it with the following command:

find ./src/**/*.js | xargs prettier --write --print-width 80 --single-quote --trailing-comma es5

The flags after the prettier are all my personal preferences except for --write which tells prettier to write the file in place.

Note 1: Make sure you have all the files you are about to change committed to source control so that you can check them out if this did not go well.

Note 2: When committing this change it would be a good idea to use git add -p and go through the changes one by one (which is always a good idea…)

Note 3: To dry run and see which files will be changed run the find ./src/**/*.js by itself.
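The ** glob depends on shell settings, and plain xargs splits file names on spaces. A hedged alternative sketch lets find do the matching and uses -print0 with xargs -0, which GNU and BSD tools both support. The project layout below is invented for the example, and echo stands in for prettier so nothing gets rewritten.

```shell
# A throwaway project with a file name that contains a space.
proj=$(mktemp -d)
mkdir -p "$proj/src/components"
touch "$proj/src/index.js" "$proj/src/components/my button.js" "$proj/src/style.css"

# -print0 / -0 separate names with NUL bytes, so the space survives intact.
# echo is a stand-in for: prettier --write ...
count=$(find "$proj/src" -name '*.js' -print0 | xargs -0 -n1 echo | wc -l | tr -d ' ')
echo "$count"
rm -r "$proj"
```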

Replace ERB files with HAML

The gem htmltohaml provides an easy method to convert your ERB files into HAML files. It generates a new converted file with the .haml extension for each .erb file, leaving you with both copies.

Today I used this command to quickly remove about twenty of these redundant .erb files, after adding the .haml files.

$ find . -name \*.erb | xargs git rm