alex 2.8


nosystemdthanks · #1 Post by **nosystemdthanks** » Wed 20 Mar 2019, 09:33

probably not your cup of tea, but a while ago i made a python shell specifically for dealing with duplicate files.

it was very unsophisticated, and only worked in gnu/linux.

fast forward to when i am trying to introduce the command line to windows users, but want to do it in a gnu/linux way, not a powershell way.

i had one person install unxutils, only to find it was much, much larger than when i had installed those tools in windows.

so i created a python-based shell for the command line. is it good? its experimental. dont use it on a system with cherished data.

i use it, along with bash, all the time.

it allows you to add built-in commands in python, such as fsortplus. fsortplus (the one command that can crash your system if you run it on a very large file, or on some things in /proc, /dev or /sys) gives you the following data fields:

size ; sha256sum ; date ; time ; fullpath

you can use it like this:

find | fsortplus

note that this works in windows, too. it will change "find" to "dir /b /a /s" and work just the same in windows.
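for anyone curious what a filter like that involves, here is a minimal sketch in python. it is not the code from alex28.py, just an illustration of producing the same five fields. the function names (sha256_of, describe, fsortplus_sketch), the 12-character size padding and the date/time formats are all assumptions of mine. it hashes in chunks so a large file is never held in memory at once, but it does not make /proc, /dev or /sys safe to scan.

```python
import hashlib
import os
import time

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks instead of reading it all at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def describe(path):
    """Return one 'size ; sha256sum ; date ; time ; fullpath' line, or None."""
    # skip directories, symlinks and anything else that is not a regular file
    if os.path.islink(path) or not os.path.isfile(path):
        return None
    try:
        st = os.stat(path)
        digest = sha256_of(path)
    except OSError:
        # unreadable file: leave it out rather than crash
        return None
    stamp = time.localtime(st.st_mtime)
    return " ; ".join([
        str(st.st_size).rjust(12),          # padded so text sort == numeric sort (assumed width)
        digest,
        time.strftime("%Y-%m-%d", stamp),   # assumed date format
        time.strftime("%H:%M:%S", stamp),   # assumed time format
        os.path.abspath(path),
    ])

def fsortplus_sketch(paths):
    """Take path names (one per line, as from find) and return sorted field lines."""
    described = (describe(p.rstrip("\n")) for p in paths)
    return sorted(line for line in described if line is not None)
```

to use it as a pipe filter you could feed it sys.stdin and print each returned line. because the size comes first and the hash second, identical files end up on adjacent lines.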

it has isoname, minusname, isoleft, isonotleft, isoright, isonotright for grep-like features. these are case sensitive. it has ucase and lcase if you wish to change all text to either.
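as a rough idea of how simple these filters are, here is a sketch in python. this is not taken from alex28.py. i am assuming "left" and "right" mean the start and end of the line, which is how the help text reads, and each function here takes a list of lines rather than reading a pipe.

```python
def isoname(lines, text):
    """Keep only lines containing text (case sensitive)."""
    return [ln for ln in lines if text in ln]

def minusname(lines, text):
    """Drop lines containing text (case sensitive)."""
    return [ln for ln in lines if text not in ln]

def isoleft(lines, text):
    """Keep lines whose left side equals text (assumed: line starts with text)."""
    return [ln for ln in lines if ln.startswith(text)]

def isonotleft(lines, text):
    """Keep lines whose left side does not equal text."""
    return [ln for ln in lines if not ln.startswith(text)]

def isoright(lines, text):
    """Keep lines whose right side equals text (assumed: line ends with text)."""
    return [ln for ln in lines if ln.endswith(text)]

def isonotright(lines, text):
    """Keep lines whose right side does not equal text."""
    return [ln for ln in lines if not ln.endswith(text)]
```

for case-insensitive matching you would lower-case the lines first, which is what piping through lcase does.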

for those familiar with uniq --all-repeated=separate -w 48, the alex equivalent is groupbyleft 48 (though groupbyleft also keeps lines that have no duplicates).

rainbow lets you display text in rainbow colours, by field or by position.

but getting back to duplicates:

suppose you have a drive mounted on /mnt/usbdrive and you want to find files on there whose contents match files in some important folder, like /root/important

or, you want to find .txt files that are identical.

find /root/important /mnt/usbdrive -type f | fsortplus | groupbyleft 64 | isogroup important | groupsortlen

what does that do?

obviously, you find your files with find /root/important /mnt/usbdrive -type f |

fsortplus | sorts by size and hash (and date and time as well)

groupbyleft 64 | puts a blank line between each group of lines where the first 64 characters match. the 64 includes the size field, so technically this could report a false match on the sha256sum, though the odds are low.
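the grouping step can be sketched in a few lines of python. this is only an illustration, not the code in alex28.py. it assumes the input is already sorted (as it is after fsortplus), and it keeps single lines as their own groups, as the help text says.

```python
from itertools import groupby

def groupbyleft(lines, howmuch):
    """Insert an empty line between runs of lines sharing the same first howmuch characters."""
    out = []
    for _, group in groupby(lines, key=lambda ln: ln[:howmuch]):
        if out:
            out.append("")   # blank separator between groups
        out.extend(group)
    return out
```

groupby only merges neighbouring lines, so unsorted input would split one set of duplicates into several groups.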

once groupbyleft separates identically hashed files into groups, you can search for groups that have one or more lines with certain text:

isogroup important | shows only groups of identical files where at least one is a file in your "important" folder.
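a sketch of that group filter in python, again only as an illustration and not the real implementation. split_groups is a helper name i made up. it treats blank lines as the separators between groups.

```python
def split_groups(lines):
    """Split a list of lines into groups, using blank lines as separators."""
    groups, current = [], []
    for ln in lines:
        if ln.strip():
            current.append(ln)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups

def isogroup(lines, text):
    """Keep only the groups where at least one line contains text."""
    return [g for g in split_groups(lines) if any(text in ln for ln in g)]
```

the whole group is kept when one line matches, so you still see the copies on the usb drive next to the one in the important folder.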

then groupsortlen sorts the entire thing, so that if you have 2 identical files in one group, and 15 identical files in another, the smallest groups will be at the top (otherwise sorted as they were before) and the largest groups will be at the bottom, where you can tend to them.
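sorting groups that way is a one-liner in python, because its sort is stable: groups of equal size stay in the order they already had. this sketch is not the code from alex28.py, and it assumes the groups have already been collected as a list of lists of lines.

```python
def groupsortlen(groups):
    """Order groups from fewest lines to most; ties keep their original order."""
    return sorted(groups, key=len)
```

for example, groupsortlen([["a", "b", "c"], ["d"], ["e", "f"], ["g"]]) puts the two single-line groups first, in their original order, and the three-line group last.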

i created this for teaching the command line, but quickly added all the best features of my previous python shell-- it is by far the best command shell program ive written.

of course, if you compare it to really good command shells-- it isnt.

its a multitool that makes windows more tolerable, and a collection of awesome command line tools for gnu/linux which you dont have to use as a shell. instead you can use bash and access the tools like this:

with alex:

root:/mnt/usbdrive#> find /root/important /mnt/usbdrive -type f | fsortplus | groupbyleft 64 | isogroup important | groupsortlen

with bash:

# find /root/important /mnt/usbdrive -type f | alex28 --fsortplus | alex28 --groupbyleft 64 | alex28 --isogroup important | alex28 --groupsortlen

again, be sure not to use fsortplus on proc/dev/sys or files that are several gb in size. ive tried to add an exception to keep that from crashing-- it happens rarely but its a pain.

(using fsortplus is optional, if you dont want to take the chance simply dont use the fsortplus command.)


# alex28.py --help
alex 2.8, mar 2019 mn

usage:

    alex                        run the alex line executive
        or:
    alex --help                 show this help information and exit

    sleep [n]                   pause for one second or [n] seconds
    arrcurl [-a] url            write contents of url as output
    pserver                     run a mini http server using the current folder
    locate row column           change the cursor position
    colour colour highlight     change the text colours
    cat file1 [file2] [-n]      concatenate files and/or number all lines
    set vname value             set vname to value
    setrandint vname min max    set vname to a random number between min / max
    setinput vname              set vname to whatever is input from the keyboard
    setnum vname num            set vname to numeric num (variable or value)
    setadd vname v1 v2          set vname to sum of v1 and v2 (string or numeric)
    find | fsortplus            find files, show size / sha256 / date / time
    find | fsortplusnows        fsortplus, but create hash without whitespace
    find | dc [n] [d[srchdate]] [+/-szlimit] list name/size/time, colour by type

    | isoname text              only show lines containing "text"
    | isoplus text              only list files with lines containing "text"
    | minusname text            remove lines containing "text"
    | isoleft text              include if left equals text
    | isoright text             include if right equals text
    | isonotleft text           include if left != text
    | isonotright text          include if right != text
    | lcase [or] ucase          convert lines to lower or upper case
    | fields 1 2 3 4 _          show 1st, 2nd, 3rd, 4th, _all fields/tokens
    | replace what with         replace "what" with "with"
    | arrdo "do what"           very powerful / do not use
    | tops n                    only show top n lines
    | bots n                    only show bottom n lines
    | noreps                    only show each line once, regardless of sort
    | var varname               pipe output to varname
    | ascii [-h]                display text as ascii codes (-h for hex)
    | rainbow                   rotate colours by -f field, -p pos, -l level
    | findsim                   find similar files
    | isogroup text             show only groups containing text
    | groupbyleft howmuch       uniq/sep by howmuch (includes singles)
    | groupsortlen              sort groups by size
    | arrlen                    prepend each line with length

    while ;                     repeatedly do part after ;
    forin vname 500 ;           do part after ; ...500 times
    forin vname array ;         do part after ; ...loop through array
    next                        mark the bottom of a while or forin loop
    break                       exit a while or forin loop
    clear                       clear the screen
    pset x y c                  draw a dot at x, y in colour c
    chr asciicode [or unicode]  output a character from ascii or unicode
    line x y x2 y2 c            draw a line from x, y to x2, y2 in colour c
    echo $varname               output $varname
    quit, exit                  quit the shell

rockedge · #2 Post by **rockedge** » Wed 20 Mar 2019, 13:36

I am going to play with it...interesting stuff

nosystemdthanks · #3 Post by **nosystemdthanks** » Wed 20 Mar 2019, 14:17

rockedge wrote: I am going to play with it...interesting stuff

my favourite feature is | var varname although it is very limited at the moment. it doesnt work with variables that exist in the program itself.

i could make it append to the variable name, both when set and when called, which should fix it.

but there are so many things i would like to add or improve. the history feature has never worked properly (just the saving and loading parts are faulty; it did work once, and it doesnt work in windows). i use it on several machines (version 0.1 is from dec 2017), though on just one machine it recently stopped working when run as user. it still works fine as root, and still works as user on every other machine too. i checked the hash of the script, its identical.

so fixes, improvements and features happen from time to time, but not all the time. arrdo is very powerful-- be careful with it, its like a version of xargs that runs once per line (not limited to the folder youre in.) find / | arrdo rm is potentially worse than rm / -rf although it might actually stop as soon as it takes out /usr/bin/alex28.py.

find /root | tail -200 | fsortplus | var t

echo $t | leafpad

(old)Puppy Linux Discussion Forum
