Analyzing my 11k bash commands
history | wc
informs me that I’ve typed exactly 11005 bash commands for the 278 days I’ve used Ubuntu (ls -lt /var/log/installer
). However, I only expanded my history
storage about 3 months ago, so I’ve probably lost out on 60% of the commands I’ve ever typed.
Still, I’m intrigued to investigate what commands have I been writing, so I set out to write some scripts to find out. I hypothesize the front runners to be cd
, ls
, and docker
.
Top 10 commands executed
I wrote the following script:
import os
import sys
from collections import Counter
from itertools import dropwhile
if __name__ == " __main__":
history_path = os.path.join("", os.path.expanduser("~"), ".bash_history")
raw_history = open(history_path, 'r').read().splitlines()
def nonEmpty(str: str) -> bool:
return len(str) > 0
def isSegmentEnvVar(segment: str):
return '=' in segment
def isSegmentDate(segment: str):
return segment[0].isnumeric()
# remove blank lines and dates and env varibales
history = list(map(lambda row: ' '.join(
dropwhile(lambda segment: isSegmentDate(segment) or isSegmentEnvVar(segment), row.split())), raw_history
))
history = list(filter(nonEmpty, history))
def top_n_command_name(n: int):
command_names = map(lambda row: row.split(" ")[0], history)
return Counter(command_names).most_common(n)
print(top_n_command_name(10))
Results:
Command | Times Executed |
---|---|
cd | 1522 |
ls | 1446 |
npm | 725 |
sudo | 684 |
python3 | 588 |
git | 513 |
cb-dev-kit | 449 |
cb-cli | 444 |
code | 396 |
node | 296 |
isSegmentEnvVar
serves to remove the leading environment variables (e.g, ENV1=hello python3 driver.py
).
isSegmentDate
serves to remove the leading date information that might appear in a standard history entry.
Looks like I use a lot of node and python. Interesting, I would’ve thought that I’ve used more gcc
than the other two.
10 most frequent commands with 2 keywords
def top_n_k_keyword(n: int, k: int):
keywords_list = [' '.join(row.split()[:k])
for row in history if len(row.split()) >= k]
return Counter(keywords_list).most_common(n)
print(top_n_k_keyword(10, 2))
Command | Times Executed |
---|---|
cd .. | 324 |
code . | 142 |
npm start | 74 |
cd packages/server/ | 71 |
ls -R | 44 |
cb-cli init | 43 |
cd ../.. | 42 |
stack build | 39 |
cb-dev-kit generate | 38 |
git push | 37 |
My full script with 3 more options
Link I designed 3 more options:
- Shortest N commands
- Longest N commands
- N most frequent full commands
Run the script I linked above with no arguments to get some default output.