it was very unsophisticated, and only worked in gnu/linux.
fast forward to when i am trying to introduce the command line to windows users, but want to do it in a gnu/linux way, not a powershell way.
i had one person install unxutils, only to find it had grown much, much larger than it was when i had installed those tools in windows myself.
so i created a python-based shell for the command line. is it good? its experimental. dont use it on a system with cherished data.
i use it, along with bash, all the time.
it allows you to add built-in commands in python, such as fsortplus. (fsortplus is the one command that can crash your system if you run it on a very large file, or on some things in /proc, /dev or /sys.) it gives you the following data fields:
size ; sha256sum ; date ; time ; fullpath
you can use it like this:
find | fsortplus
note that this works in windows, too. the shell will change "find" to "dir /b /a /s" and the pipeline works just the same.
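for the curious, here is a rough python sketch of the kind of thing fsortplus does. it is not the code from alex: the function names, the field widths and the chunked hashing are all assumptions made for illustration. reading each file in chunks keeps memory use flat on big files, and anything that isnt a regular file gets skipped, though that alone wont make /proc safe.

```python
import hashlib
import os
import stat
import tempfile
import time


def sha256_of(path, chunk_size=1 << 20):
    """hash a file one chunk at a time so memory use stays flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fsortplus_lines(paths):
    """return sorted 'size ; sha256 ; date ; time ; fullpath' lines."""
    rows = []
    for path in paths:
        try:
            info = os.stat(path)
            if not stat.S_ISREG(info.st_mode):
                continue  # skip folders, devices, pipes, sockets
            stamp = time.localtime(info.st_mtime)
            rows.append((
                info.st_size,
                sha256_of(path),
                time.strftime("%Y-%m-%d", stamp),
                time.strftime("%H:%M:%S", stamp),
                os.path.abspath(path),
            ))
        except OSError:
            continue  # unreadable or vanished file: leave it out
    rows.sort()  # by size first, then hash, then date and time
    # the 12-character size column is an assumption, not alex's real width
    return ["%12d ; %s ; %s ; %s ; %s" % row for row in rows]


if __name__ == "__main__":
    # small demo on throwaway files instead of a real "find"
    with tempfile.TemporaryDirectory() as folder:
        for name, data in (("a.txt", b"same"), ("b.txt", b"same"), ("c.txt", b"other")):
            with open(os.path.join(folder, name), "wb") as handle:
                handle.write(data)
        names = [os.path.join(folder, name) for name in os.listdir(folder)]
        for line in fsortplus_lines(names):
            print(line)
```

in the demo, the two files with the same contents end up on adjacent lines with the same size and hash, which is what the grouping commands further down rely on.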
it has isoname, minusname, isoleft, isonotleft, isoright, isonotright for grep-like features. these are case sensitive. it has ucase and lcase if you wish to change all text to either.
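a minimal python sketch of how filters like these could work on a list of lines. the command names come from alex, but the bodies are guesses: in particular it assumes "left" and "right" mean the start and the end of the line.

```python
def isoname(lines, text):
    """keep lines that contain text (case sensitive)."""
    return [line for line in lines if text in line]


def minusname(lines, text):
    """drop lines that contain text."""
    return [line for line in lines if text not in line]


def isoleft(lines, text):
    """keep lines that start with text (assumed meaning of 'left')."""
    return [line for line in lines if line.startswith(text)]


def isonotleft(lines, text):
    """keep lines that do not start with text."""
    return [line for line in lines if not line.startswith(text)]


def isoright(lines, text):
    """keep lines that end with text (assumed meaning of 'right')."""
    return [line for line in lines if line.endswith(text)]


def isonotright(lines, text):
    """keep lines that do not end with text."""
    return [line for line in lines if not line.endswith(text)]


def ucase(lines):
    """convert every line to upper case."""
    return [line.upper() for line in lines]


def lcase(lines):
    """convert every line to lower case."""
    return [line.lower() for line in lines]


if __name__ == "__main__":
    names = ["notes.txt", "Notes.TXT", "photo.jpg"]
    print(isoname(names, "notes"))         # case sensitive: one match
    print(isoright(lcase(names), ".txt"))  # lower the case first: two matches
```

since the filters are case sensitive, running lcase (or ucase) first is the way to get a case-insensitive match.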
for those familiar with uniq --all-repeated=separate -w 48: in alex (the shell), that is groupbyleft 48.
rainbow lets you display text in rainbow colours, by field or by position.
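colouring by field can be sketched in python with plain ansi escape codes. this is only an illustration: the colour order, the " ; " separator and the function name are assumptions, and older windows consoles may show the escape codes as raw text instead of colours.

```python
# ansi foreground codes: red, yellow, green, cyan, blue, magenta
COLOURS = [31, 33, 32, 36, 34, 35]


def rainbow_fields(line, sep=" ; "):
    """give each field of a line the next colour in the cycle."""
    painted = []
    for index, field in enumerate(line.split(sep)):
        code = COLOURS[index % len(COLOURS)]
        painted.append("\033[%dm%s\033[0m" % (code, field))
    return sep.join(painted)


if __name__ == "__main__":
    print(rainbow_fields("1024 ; abc123 ; 2019-03-01 ; 12:00:00 ; /root/important/a.txt"))
```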
but getting back to duplicates:
suppose you have a drive mounted on /mnt/usbdrive and you want to find files on there whose contents match files in some important folder, like /root/important
or, you want to find .txt files that are identical.
find /root/important /mnt/usbdrive -type f | fsortplus | groupbyleft 64 | isogroup important | groupsortlen
what does that do?
obviously, you find your files with find /root/important /mnt/usbdrive -type f |
fsortplus | sorts by size and hash (and date and time as well)
groupbyleft 64 | puts whitespace between each group of lines where the first 64 characters match. the 64 includes the size field, so only part of the sha256sum is compared. technically this could group files that arent identical, though the odds are very low.
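the grouping step can be sketched in a few lines of python. this is an illustration, not the alex source: it assumes the lines are already sorted (so matching prefixes are adjacent), and the keep_singles switch is a made-up parameter to show both behaviours, since the help text says groupbyleft includes singles while uniq --all-repeated drops them.

```python
from itertools import groupby


def groupbyleft(lines, width, keep_singles=True):
    """put a blank line between runs of lines sharing the first width chars."""
    out = []
    for _, run in groupby(lines, key=lambda line: line[:width]):
        members = list(run)
        if len(members) < 2 and not keep_singles:
            continue  # behave like uniq --all-repeated: drop lone lines
        if out:
            out.append("")  # blank separator between groups
        out.extend(members)
    return out


if __name__ == "__main__":
    sample = ["aaaa one", "aaaa two", "bbbb three", "cccc four", "cccc five"]
    print("\n".join(groupbyleft(sample, 4)))
```

groupby only merges neighbours, which is why the sort done by fsortplus has to come first.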
once groupbyleft separates identically hashed files into groups, you can search for groups that have one or more lines with certain text:
isogroup important | shows only groups of identical files where at least one line contains the text "important", which here means at least one copy is in your "important" folder.
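a rough python sketch of that group filter, assuming groups are separated by blank lines as described above. the helper name split_groups is made up for the example.

```python
def split_groups(lines):
    """split a list of lines into groups at blank lines."""
    groups, current = [], []
    for line in lines:
        if line.strip():
            current.append(line)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


def isogroup(lines, text):
    """keep whole groups in which at least one line contains text."""
    out = []
    for group in split_groups(lines):
        if any(text in line for line in group):
            if out:
                out.append("")  # keep the blank separator between groups
            out.extend(group)
    return out


if __name__ == "__main__":
    grouped = [
        "10 ; h1 ; /root/important/a.txt",
        "10 ; h1 ; /mnt/usbdrive/copy-of-a.txt",
        "",
        "20 ; h2 ; /mnt/usbdrive/b.txt",
        "20 ; h2 ; /mnt/usbdrive/b2.txt",
    ]
    print("\n".join(isogroup(grouped, "important")))
```

note that this is a plain text match on the whole line, so a path on the usb drive that happens to contain "important" would keep its group too.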
then groupsortlen sorts the entire thing, so that if you have 2 identical files in one group, and 15 identical files in another, the smallest groups will be at the top (otherwise sorted as they were before) and the largest groups will be at the bottom, where you can tend to them.
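sorting groups by how many lines they hold could look like this in python. again a sketch under assumptions (blank-line separated groups, a made-up split_groups helper). python's sort is stable, so groups of equal size stay in the order they arrived in, which matches the "otherwise sorted as they were before" behaviour.

```python
def split_groups(lines):
    """split a list of lines into groups at blank lines."""
    groups, current = [], []
    for line in lines:
        if line.strip():
            current.append(line)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


def groupsortlen(lines):
    """order groups from fewest lines to most, keeping ties in place."""
    out = []
    for group in sorted(split_groups(lines), key=len):
        if out:
            out.append("")  # blank separator between groups
        out.extend(group)
    return out


if __name__ == "__main__":
    grouped = ["a1", "a2", "a3", "", "b1", "b2", "", "c1", "c2"]
    print("\n".join(groupsortlen(grouped)))
```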
i created this for teaching the command line, but quickly added all the best features of my previous python shell-- it is by far the best command shell program ive written.
of course, if you compare it to really good command shells-- it isnt.
its a multitool that makes windows more tolerable, and a collection of awesome command line tools for gnu/linux which you dont have to use as a shell. instead you can use bash and access the tools like this:
with alex:
root:/mnt/usbdrive#> find /root/important /mnt/usbdrive -type f | fsortplus | groupbyleft 64 | isogroup important | groupsortlen
with bash:
# find /root/important /mnt/usbdrive -type f | alex28 --fsortplus | alex28 --groupbyleft 64 | alex28 --isogroup important | alex28 --groupsortlen
again, be sure not to use fsortplus on /proc, /dev or /sys, or on files that are several gb in size. ive tried to add an exception handler to keep that from crashing. the crash happens rarely, but its a pain.
(using fsortplus is optional, if you dont want to take the chance simply dont use the fsortplus command.)
# alex28.py --help
alex 2.8, mar 2019 mn
usage:
alex                                      run the alex line executive
or:
alex --help                               show this help information and exit
sleep [n]                                 pause for one second or [n] seconds
arrcurl [-a] url                          write contents of url as output
pserver                                   run a mini http server using the current folder
locate row column                         change the cursor position
colour colour highlight                   change the text colours
cat file1 [file2] [-n]                    concatenate files and/or number all lines
set vname value                           set vname to value
setrandint vname min max                  set vname to a random number between min / max
setinput vname                            set vname to whatever is input from the keyboard
setnum vname num                          set vname to numeric num (variable or value)
setadd vname v1 v2                        set vname to sum of v1 and v2 (string or numeric)
find | fsortplus                          find files, show size / sha256 / date / time
find | fsortplusnows                      fsortplus, but create hash without whitespace
find | dc [n] [d[srchdate]] [+/-szlimit]  list name/size/time, colour by type
| isoname text                            only show lines containing "text"
| isoplus text                            only list files with lines containing "text"
| minusname text                          remove lines containing "text"
| isoleft text                            include if left equals text
| isoright text                           include if right equals text
| isonotleft text                         include if left != text
| isonotright text                        include if right != text
| lcase [or] ucase                        convert lines to lower or upper case
| fields 1 2 3 4 _                        show 1st, 2nd, 3rd, 4th, _all fields/tokens
| replace what with                       replace "what" with "with"
| arrdo "do what"                         very powerful / do not use
| tops n                                  only show top n lines
| bots n                                  only show bottom n lines
| noreps                                  only show each line once, regardless of sort
| var varname                             pipe output to varname
| ascii [-h]                              display text as ascii codes (-h for hex)
| rainbow                                 rotate colours by -f field, -p pos, -l level
| findsim                                 find similar files
| isogroup text                           show only groups containing text
| groupbyleft howmuch                     uniq/sep by howmuch (includes singles)
| groupsortlen                            sort groups by size
| arrlen                                  prepend each line with length
while ;                                   repeatedly do part after ;
forin vname 500 ;                         do part after ; ...500 times
forin vname array ;                       do part after ; ...loop through array
next                                      mark the bottom of a while or forin loop
break                                     exit a while or forin loop
clear                                     clear the screen
pset x y c                                draw a dot at x, y in colour c
chr asciicode [or unicode]                output a character from ascii or unicode
line x y x2 y2 c                          draw a line from x, y to x2, y2 in colour c
echo $varname                             output $varname
quit, exit                                quit the shell