Published: 04.05.2021 | Edited: 09.05.2021 | Tags: 100daystooffload

Markdown posts by word count in bash

I wanted a quick overview of the word counts of my blog posts to roughly estimate the possible translation cost, and here's the one-liner I came up with:

find . -maxdepth 1 -type f -name "*.md" -exec printf '%s ' {} \; -exec ~/.local/bin/mwc {} \; | awk '{print $2 " " $1}' | sort -rnk1

The output should look similar to this:

1862 ./
1739 ./
1619 ./
1602 ./
1596 ./
1536 ./
1407 ./
1390 ./
1211 ./
1179 ./
1038 ./
1033 ./

The mwc command should exclude punctuation, footnotes, and other Markdown specifics, but I have not researched this extensively yet. It should nevertheless be possible to draw a general conclusion about the translation costs. I wonder whether translators are already accustomed to translating Markdown.
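To turn the sorted listing into a rough cost figure, the counts in the first column can be summed and multiplied by a per-word price. The following is only a sketch: the function name estimate_cost and the rate of 0.10 per word are made up for illustration and are not a real quote from any translator.

```shell
#!/usr/bin/env bash
# estimate_cost [RATE]
# Reads "COUNT PATH" lines on stdin, sums the first column and multiplies
# the total by RATE. RATE is a hypothetical price per word (default 0.10).
estimate_cost() {
  local rate="${1:-0.10}"
  # LC_ALL=C keeps the decimal separator a dot regardless of the locale.
  LC_ALL=C awk -v rate="$rate" '
    { total += $1 }
    END { printf "%d words, estimated cost %.2f\n", total, total * rate }
  '
}

# Example with the first three counts from the listing above
printf '%s\n' '1862 ./' '1739 ./' '1619 ./' | estimate_cost 0.10
# prints: 5220 words, estimated cost 522.00
```

In practice the output of the one-liner would be piped straight into estimate_cost instead of the hand-typed sample lines.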


The above line requires the mwc command, which is provided by the Python markdown-word-count script. Install it via pip:

pip3 install markdown-word-count

Apart from the script, the line only requires standard GNU commands.

Notes and findings

  • Passing ls output into xargs can introduce many security risks [1]
  • It might be better to consider using find -exec instead [2]
  • There are unavoidable security problems surrounding use of the -exec action; you should use the -execdir option instead [3]
  • Simply passing multiple -execdir parameters to find is sufficient [4]
  • Narrowing results of the find command is optional [5]
  • Using awk for swapping columns is very easy [6]
  • Sorting the output by a column is done with the -k parameter [7]
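Putting these notes together, here is a sketch of the same pipeline built on -execdir. It assumes GNU find and that mwc prints nothing but a number for the file it is given; the function name rank_by_words and its optional counter argument are my own additions for illustration, so the counting command can be swapped out.

```shell
#!/usr/bin/env bash
# rank_by_words DIR [COUNTER [ARGS...]]
# Lists the *.md files directly inside DIR as "COUNT ./NAME", largest first.
# COUNTER defaults to mwc and is assumed to print just a number for the
# file name that find appends as its last argument.
rank_by_words() {
  local dir="${1:-.}"
  [ "$#" -gt 0 ] && shift
  local -a counter=("$@")
  [ "${#counter[@]}" -eq 0 ] && counter=(mwc)

  # Two -execdir actions run in sequence for every match: the first prints
  # the name followed by a space, the second appends the count and a newline.
  find "$dir" -maxdepth 1 -type f -name '*.md' \
      -execdir printf '%s ' {} \; \
      -execdir "${counter[@]}" {} \; |
    awk '{ print $2 " " $1 }' |   # swap columns (names must not contain spaces)
    sort -k1,1nr                  # numeric, descending, on the first column only
}
```

Running rank_by_words . in the posts directory should give a listing like the one above. Note that GNU find refuses to run -execdir when PATH contains the current directory or an empty entry.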

This is the 55th post of #100daystooffload.