Discussion: how to categorize packages for end user access?

deshlab
Posts: 82
Joined: Sat 23 Jul 2005, 09:57
Location: oldenburg, germany

#1 Post by deshlab »

(thread subject was renamed, because the focus wasn't set right)

Hi,
for my recent work on the Additional Software Index, I looked at the various categorization systems in use around the places where packages are available and came up with a new system. I'll attach a screenshot of the table I worked with. (edit: now with the computation freebies system)
[Image: screenshot of the category comparison table]

Since the FTP/repository is about to be populated with files, I wanted to present and discuss this system again.

Code:

Primary Categories (dotpup start menu)
      File Managers
      Graphics Processing
      Word Processing
      Information Managers
      Network
      Internet
      Multimedia
      Games

Additional Categories
      System Utilities - Programming - Window Managers
      Bugfixes, Updates & Security - Libraries & Resources - Miscellaneous
(15 categories)

The first part of the system mimics Puppy's original start menu, which is why the items are not in alphabetical order. The second part then collects everything else in (almost) as few categories as possible.

Obvious possible changes:
If the aim is to have as few categories as possible, one could merge "word processing", "graphics processing" and "information managers" into an "office" category, join "network" and "internet", fold games into multimedia, put the file managers, security apps and similar into "system utilities", and combine programming with libraries/resources (far out?). That would give:

Code:

Multimedia
Miscellaneous
Office
Net
Resources & Development
System Utilities
Window Management & Appearance
(7 categories)

This is the minimum I can think of, and apart from clarity there is not much to be said for it, since the categories would become huge.

Overly precise categories with only a handful of applications obviously aren't very useful either, but I would still propose that files like the system security programs get their own directory, since they are of a different nature than other system utilities.

So what this thread is supposed to be about is finding an adequate category system: not too many categories, not too few.

Maybe someone could post the category list from Synaptic, or from other repositories?
Last edited by deshlab on Tue 17 Jan 2006, 11:14, edited 3 times in total.
Lobster
Official Crustacean
Posts: 15522
Joined: Wed 04 May 2005, 06:06
Location: Paradox Realm

#2 Post by Lobster »

8) I only do this on special occasions

[Lobster goes into bowing frenzy]
"We are not worthy, we are not worthy"

Now we can point people to a location on the forum
Forum based package management - I better go do a news item . . .

There is a simple logic to your table; I'd be interested in what others think
:)

Hail Puppy :oops:
Puppy Raspup 8.2Final 8)
Puppy Links Page http://www.smokey01.com/bruceb/puppy.html :D
puppian
Posts: 537
Joined: Tue 19 Jul 2005, 03:58
Location: PuppyLand

Re: Suggestion for Package Category Structure (for FTP/Repos

#3 Post by puppian »

deshlab wrote:I missed the computation freebies board then
:)
Puppy Files:

Business - Word Processing, Spreadsheet, Fax, IME, etc.
Internet/Network
Multimedia
System - Antivirus, Bugfixes, File Managers, Libraries, Security, System Monitoring...

Desktop Enhancements - WM, Themes, Icon...
Graphics - Editors, Viewers...
Drivers
Games
Home & Education - Astronomy, Earth Science, Music, Self-Learning ... (I'm going to remove this one as Puppy doesn't have many apps of this type yet)
Programming
Misc.

More description on each of them:
http://freeforums.bizhat.com/index.php? ... howforum=9
Puppylinux.org (http://puppylinux.org) - Community home page of Puppy Linux hosted by Barry (creator of Puppy), created and maintained by the Puppy Linux Foundation (http://puppylinux.org/user/readarticle.php?article_id=8) since 2005

#4 Post by deshlab »

thanks Puppian, I made a new overview (image in first post).

At the moment I feel pretty positive about category names with more than one word (though I'm not sure this is good for the newbie user), so I also suggest the following change to my own system:

Take the dotpup development tools out of "misc", add them to the "programming" category and rename it "programming & development".

I would also suggest renaming "games" to "games, hobby, education" and moving the respective software there (from misc and elsewhere), but the games category is quite huge already and doesn't even include all the fancier games available outside of the forum (doom etc).

:?: Are fewer categories (<10) with longer names better than many categories (~15+) with single-word names?
jmarsden
Posts: 265
Joined: Sat 31 Dec 2005, 22:18
Location: California, USA

#5 Post by jmarsden »

While I don't have specific objections to the categories proposed, I'm not sure how well this approach meshes with the way pupget deals with repositories, either now or in my proposal for its enhancement. I noticed also that you did not include the ibiblio pupget-packages_1 repository in your comparison diagram!

/usr/sbin/pupget currently just expects a given repository to have a single collection of packages in a given place. My enhancement proposal did not change that, it just allows for multiple repositories to be configured. It is up to the meta info provided by a packages.txt (or somereponame.repo) file to provide details on what they are, categorize them, etc.

Having to treat such a categorized repo as (in effect) 15 or so separate repos (each with individual metainfo files that need updating) seems a bit unwieldy to me. Deshlab, how do you expect package management tools (such as /usr/sbin/pupget) to access (and be configured to use) repositories that are divided into these categories?

Thanks,

Jonathan
MU
Posts: 13649
Joined: Wed 24 Aug 2005, 16:52
Location: Karlsruhe, Germany

#6 Post by MU »

I also thought about this, as dotpups.de already uses categories.
I planned to use a categories.txt that a downloader would read.

It simply would contain the foldernames (and maybe a Meta-Category).
This also would allow subfolders (like the Doomsday-folder in http://dotpups.de/dotpups/Games ).

like:

Code:

# pget -configfile
:GROUP:Office
/dotpups/Text_Editors
:GROUP:Games
/dotpups/Games
/dotpups/Games/Doomsday
:GROUP:EOF

As I want Dotpups.de to be accessible with a web browser too, I uploaded some more files there, like pictures. So (in some cases) a screenshot quickly shows you what a package provides.
So a downloader would just have to check for filename.pet files, and maybe a filename.pet.htm with additional info.
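
A downloader could read a categories.txt like the one above with a few lines of shell. This is only a sketch, assuming the format is exactly what the example shows: comment lines starting with #, :GROUP:name lines that open a group, folder paths below them, and :GROUP:EOF as the terminator. The function name list_group_folders is made up for illustration.

```shell
# Sketch only: print "group<TAB>folder" pairs from a categories.txt that is
# read on standard input. list_group_folders is a hypothetical helper name.
list_group_folders() {
    awk '
        /^#/ || /^[ \t]*$/ { next }                        # skip comments and blank lines
        /^:GROUP:EOF$/     { exit }                        # terminator line
        /^:GROUP:/         { group = substr($0, 8); next } # a new group starts here
        group != ""        { print group "\t" $0 }         # folder of the current group
    '
}
```

Called as `list_group_folders < categories.txt`, it prints one line per folder, so subfolders such as /dotpups/Games/Doomsday simply show up as a second folder in the same group.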
Mark
babbs
Posts: 397
Joined: Tue 10 May 2005, 06:35
Location: Tijuana, BCN, Mexico

#7 Post by babbs »

I think that a central repository with all the different dotpups in a single directory would be easiest to maintain and access by programs. I also think that there needs to be a complete index file describing file version and associated Puppy version info. The different categories would be reflected in this index/readme file. My two pfennigs...

#8 Post by deshlab »

jmarsden wrote:While I don't have specific objections to the categories proposed, I'm not sure how well this approach meshes with the way pupget deals with repositories, either now or in my proposal for its enhancement? I noticed also that you did not include the ibiblio pupget-packages_1 repository in your comparison diagram!
Do you refer to ftp://ibiblio.org/pub/linux/distributio ... ackages-1/ ?
How should I have included this? There are no categories there.

What I am proposing is meant for the user side of package access. I know very little about the server side of things and don't dare to say much about that. I also don't know how you envision the future pupget tool. The current tool, which just shows all available pupgets with a short description but uncategorized, surely cannot be used once the hundreds of available dotpup packages (or variants thereof) have been added.
jmarsden wrote:Having to treat such a categorized repo as (in effect) 15 or so separate repos (each with individual metainfo files that need updating) seems a bit unwieldy to me. Deshlab, how are you expecting package management tools (such as /usr/sbin/pupget ) would access (and be configured to use) repositories that are divided into these categories?
OK, if I understand correctly, there are two ways to handle the repository:

- the ibiblio-pupget style (everything in one directory, one text file that sort of catalogues them)

- the dotpups.de style (a directory structure with subfolders for every category)

and if I get you right, you are saying that the ibiblio style of one folder for everything (which would be horrible for direct user access) is the better system, for more complex reasons.

I would then think up a system like this:
one large folder as the repository, every uploaded .p(g)et is accompanied by a .txt of the same name that contains info about it in a fixed system:

Code:

package file name, package name, category, version, creation date, creator, short description, home page addresses and other links

A script would parse these .txt files and create/update the main packages.txt with the necessary info.
A Pupget Downloader could then read the single packages.txt and pull out package name, and category (or more), display a categorized view of the pupget directory and offer two buttons : 'download package' and 'view package name.txt' (or it would automatically display the txt).
So basically the server looks like the pupget-packages_1 directory and the downloader looks like the dotpup downloader.
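
The two steps described here (merge the per-package .txt files into one index, then show that index grouped by category) could be sketched in shell as follows. Assumptions: each .txt holds a single line with the eight comma-separated fields listed above, no field contains a comma itself, and the names build_index and categorized_view are hypothetical.

```shell
# Sketch only: merge the one-line metadata file that accompanies each package
# into a single index, sorted by category (field 3) and package name (field 2).
build_index() {
    # $1 = repository directory holding foo.pet plus foo.txt pairs
    for meta in "$1"/*.txt; do
        [ -f "$meta" ] || continue
        [ "$(basename "$meta")" = "packages.txt" ] && continue   # skip the index itself
        head -n 1 "$meta"
    done | sort -t ',' -k3,3 -k2,2
}

# Print an index read on standard input as a list grouped by category.
categorized_view() {
    awk -F ', *' '
        $3 != last { print $3 ":"; last = $3 }   # new category heading
        { print "  " $2 " " $4 " - " $7 }        # name, version, short description
    '
}
```

On the server side something like `build_index /path/to/repo > packages.txt` would be run after each upload; a downloader would then only need the single packages.txt and a step like categorized_view to present it.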

I have no idea if this is a good setup, if it is possible with reasonable effort or not at all - or if I am thinking about the whole thing from a stupid perspective and should just shut up. It is just how I as a newbie user would want my package access.

edit: oh, two new answers while I typed :( of course there could also be an accompanying .jpg for all .p(g)ets with gui. the rules for these could be filesize/measurements.

MU, being able to access the repository by browser (and organize it via subfolders) would be great, if that does not cause too much extra work maintaining it?

Again, I don't know how the server should best look internally - my suggestions are aimed at the user interface (pupget downloader, server access page). Sorry for noobness.
Nathan F
Posts: 1764
Joined: Wed 08 Jun 2005, 14:45
Location: Wadsworth, OH (occasionally home)

#9 Post by Nathan F »

Here's my offering.

One folder for all of the packages on the repository.

An index page with separate categories; in each category would be links to the individual packages.

To make things simpler to maintain, a simple text file could be uploaded along with each package describing what it is and what it does. The first line could contain a keyword to be read by the index (using php?) in order to sort it all out into categories. Of course, someone could just maintain the index by hand but that gets to be very tedious. The same keyword info could be used by Pupget in a future incarnation to show categories in the package selection window.

This is introducing some complexity into the system but may be well worth it as the sheer number of packages increases. In other words, it would be responsible to get to work implementing something now rather than waiting until it's become a mess.

Nathan

#10 Post by Nathan F »

I just realized Deshlab mentioned almost the same thing right before me.

#11 Post by Lobster »

deshlab wrote: edit: oh, two new answers while I typed :( of course there could also be an accompanying .jpg for all .p(g)ets with gui. the rules for these could be filesize/measurements.
A great idea - I would suggest a thumbnail - reasonably large - linked to a full size image. Does any other package manager have thumbnails?

Are we having fun yet?

#12 Post by Nathan F »

Yes, that would be useful for some. It bears mention that having a metafile for each package is how Slapt-get works, so this is not a new idea and has been proven in use.

Nathan

#13 Post by jmarsden »

Rather than have the per-package metainfo in a separate file, it may be simpler to just include the necessary metainfo within each package file (as both .rpm and .deb files do). This way, the info is not accidentally separated from the package itself, and is also available locally to anyone who downloads the package (so you could easily put the contents of a repo on CD-R or DVD-R and browse it locally, without having to remember to add the separate metainfo files, for example). This could be useful for people with limited Internet access.

Likewise, instead of uploading separate .jpg files -- pupget packages that are GUI-oriented already contain a .xpm file which is intended as the "icon" for the package concerned... why duplicate that as a .jpg outside the package and upload it? That seems inefficient and unnecessary. A repository wanting to build a pretty web interface to its contents is free to unpack those icons and convert them to .jpg files, for browser-friendliness.

But please... can we get the package repo and its mirrors set up first, and worry about making a pretty web interface for its contents afterwards? Unless someone is volunteering to do the pretty web interface design and implementation right away? I'm concerned we'll delay the creation of a useful facility. Delaying repository implementation because some people want (apparently!) a categorized non-pupget-but-GUI interface to the package repository seems unnecessary. I'm not sure why an enhanced pupget can't be the normal, commonly-used, pretty GUI interface to Puppy package repositories myself.

Currently pupget packages do not contain their own description or dependency info, nor their "category". I'd suggest adding this into them in a standard way, rather than creating a new per-package metafile standard. While we are enhancing the pupget file format, providing a documented way to add things like author, packager and licence info (for example), for those packagers who wish to do so, could also be useful, IMO. As could a defined way to handle package signing.

If we do decide to go with separate per-package metafiles in the repositories, I would strongly urge that we check whether the (simple) LSM format is adequate for Puppy package repository metainfo needs, before we create a new one. Let's avoid re-inventing too many wheels.

Having a set of defined package categories is good, and is useful for displaying packages to a user to select a package to install. I'm not at all sure that splitting up repositories based on those categories is similarly good and useful :-)

Jonathan

#14 Post by jmarsden »

deshlab wrote:do you refer to ftp://ibiblio.org/pub/linux/distributio ... ackages-1/ ? how should I have included this? there are no categories there.
My point exactly :-) It is a large collection of packages, a repository. It works. It does not need any categorization within itself. However, to be fair, it does in fact categorize its packages. Look inside packages.txt a little. A quick

Code:

# list the distinct category tags used in the local pupget package index
sed -e 's/^.* on "//' ~/.packages/packages.txt | cut -d '"' -f1 | cut -d ' ' -f1 | grep -v '^ *$' | sort -u

will list them for you:
  • CONSAPPS
  • CONSCORE
  • GTK1APPS
  • GTK1CORE
  • GTK2APPS
  • GTK2CORE
  • MMCORE
  • MMGTK1APPS
  • MMGTK2APPS
  • MMTCLAPPS
  • TCLAPPS
  • TCLCORE
  • XLIBAPPS
  • XLIBCORE
Very different. Somewhat technical. I'd be interested to see the overall pupget design documentation and to learn the reasoning behind this categorization, as well as how it is used in practice. I can guess, from the list and from reading /usr/sbin/pupget sources, but it would be nice for this to be documented somewhere. And since you are analyzing categorization schemes for Puppy packages... I thought you might be doing that piece of documenting, or would at least be aware of where it is already documented. It sounds like you weren't really intending to look that deep?
deshlab wrote:what I am proposing is meant for the user-part of the package access situation.
OK, as long as others don't take this set of (user interface) categories as a reason to split up package repositories along category lines, that's fine with me.
deshlab wrote:I also don't know how you envision the future pupget tool, the current tool that just shows all available pupgets with short description but uncategorized can surely not be used when the hundreds of available dotpup packages (or variants thereof) have been added.
Indeed. Pupget packages can be (and are!) already tagged with a category, see above. Displaying packages grouped according to this or some other categorization scheme is not difficult.

Jonathan

#15 Post by Nathan F »

jmarsden wrote:Currently pupget packages do not contain their own description or dependency info, nor their "category".
Believe it or not this is somewhat misleading. It's true that the packages themselves don't have any dependency information, but Pupget can and does in some instances. It's part of the file 'packages.txt'. Here's an old entry for gxine.

Code:

"gxine-0.3.3" "gxine-0.3.3: GUI multimedia player, Xine frontend" on "MMGTK2APPS +xine-1.0,+libdvdcss-1.2.8 980K" \
The part that says '+xine-1.0,+libdvdcss-1.2.8' tells Pupget that if the user wants to install gxine, these other packages are required as well. In some ways this is a better and simpler way to track the information than adding extra scripting to read some kind of metafile. The drawback is that, from what I've seen, it isn't completely implemented yet, and that the information won't be available for any new package until it has made it into the 'official' unleashed suite and the info is hard-coded into the iso.
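
To make the format concrete, here is a rough shell sketch that pulls the dependency list for one package out of such entries. It assumes every entry has the same layout as the gxine line above (three double-quoted strings, the third holding the category, an optional +dep,+dep list, and the size), and that no description contains a double quote. package_deps is a made-up name, not part of pupget.

```shell
# Sketch only: print the dependencies recorded for one package in
# packages.txt-style entries read on standard input, one per line.
package_deps() {
    # $1 = exact package name, e.g. gxine-0.3.3
    awk -F '"' -v pkg="$1" '
        $2 == pkg {
            n = split($6, part, " ")          # category, dependency list, size
            for (i = 2; i <= n; i++)
                if (part[i] ~ /^\+/) {        # the "+a,+b" dependency list
                    m = split(part[i], dep, ",")
                    for (j = 1; j <= m; j++) {
                        sub(/^\+/, "", dep[j])
                        print dep[j]
                    }
                }
        }
    '
}
```

For example, `package_deps gxine-0.3.3 < ~/.packages/packages.txt` would be the way to query the local index, if the real file follows the layout assumed here throughout.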

Barry also added category information, but it's not very helpful from an end user's perspective. It's more useful for the developer(s).

One thing Pupget does that NO other system I'm aware of does is to run the ldd command on every executable. This virtually guarantees that if something is wrong or missing you will know it. Most users have never noticed because Barry's packages have always worked, but it was a nice bit of foresight.
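
The idea behind the ldd check can be illustrated with a short shell sketch. How /usr/sbin/pupget actually implements it is not shown in this thread, so the helper names missing_libs and check_executable are hypothetical; the only thing relied on is that ldd marks an unresolved library with "=> not found".

```shell
# Sketch only: report shared libraries that ldd cannot resolve.
# missing_libs filters ldd output read on standard input.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Run ldd on one executable and print the names of its missing libraries.
check_executable() {
    ldd "$1" 2>/dev/null | missing_libs
}
```

An installer could call check_executable on each program a package installs and warn the user whenever the output is not empty.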

I'm not opposed to changing this so that the dependency info is included in the package, but I just wanted to point out that there was already a system in place. I mentioned at least one very good reason why having this info right in the package might be better. I would like to stress the importance of simplicity in its implementation.

The meta-info could be added to the packages now and the system that makes use of it could be added later, so that isn't an excuse to delay the creation of a package repo. I'd also like to restate that having all of the Pupgets in one directory is the best way to do it unless you want to rebuild Pupget from scratch and make it more complicated (which I've already expressed an aversion to).

Nathan

#16 Post by jmarsden »

jmarsden wrote:Currently pupget packages do not contain their own description or dependency info, nor their "category".
Nathan F wrote:Believe it or not this is somewhat misleading.
It is 100% accurate. The dependency info is in a separate file, packages.txt (and I showed a quick way to extract the category in a post made just after the one you quote -- I'll leave extracting the dependency info as an "exercise for the reader").

I was arguing for adding such information to the package itself. I'm aware it is in packages.txt (though not in a terribly flexible format, compared to allowing statements that say "this package needs package A with version >= 1.2, or package B with version >= 2.7", for example). But the point I was making is that the info is not in the pupget package itself, yet. I have read the /usr/sbin/pupget code.
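
As an illustration of the kind of statement meant here, the following shell sketch checks "A >= 1.2 or B >= 2.7" against a list of installed packages. Nothing like this exists in pupget as far as this thread shows; version_ge and any_satisfied are hypothetical names, the "name>=version" argument syntax is invented for the example, and the comparison relies on sort -V, which GNU sort provides but which may be missing from smaller sort implementations.

```shell
# Sketch only: succeed when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed when at least one "name>=version" argument is met by the
# "name version" lines read on standard input (the installed packages).
any_satisfied() {
    installed=$(cat)
    for alt in "$@"; do
        name=${alt%%>=*}          # part before ">="
        min=${alt#*>=}            # part after ">="
        have=$(printf '%s\n' "$installed" |
               awk -v n="$name" '$1 == n { print $2; exit }')
        [ -n "$have" ] && version_ge "$have" "$min" && return 0
    done
    return 1
}
```

With installed packages "A 1.1" and "B 3.0" on standard input, `any_satisfied 'A>=1.2' 'B>=2.7'` succeeds because the second alternative is met.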
Nathan F wrote:The meta-info could be added to the packages now and the system to implement its use could be added later, so that isn't an excuse to delay the creation of a package repo.
Agreed.
Nathan F wrote:I'd also like to restate that having all of the Pupgets in one directory is the best way to do it
Agreed.
Nathan F wrote:... unless you want to rebuild Pupget from scratch and make it more complicated (which I've already expressed an aversion to).
Well, if we're going to add the capability for multiple repositories, and for users to add new ones to the set of repos pupget can "see", then pupget is going to get (somewhat) more complicated. If we are to allow it to present packages to users grouped by category, then it is also going to get (somewhat) more complicated. That's life. New capabilities generally do require increased code size. (As you say, splitting up repositories by category would make this worse, but even without that, some complexity increase seems unavoidable.)

Personally, I'd rather create readable and documented libraries of shell functions, from which we can (later) build a readable new pupget tool with some added functionality. If others want to hack on the current /usr/sbin/pupget to add that same or similar functionality, cool -- that approach may well be faster to implement. I just don't want the job of supporting such a script. So I'm not going to write it :-)

(If someone out there is planning on implementing this stuff by hacking pupget, please talk to me about it, so we can at least share design ideas and perhaps underlying package database format definitions, etc.!)

Jonathan

#17 Post by babbs »

I was just thinking... Another thing to consider would be how to decide what goes into the repository? Will the new dotpups first be vetted in the forum, or will they be just dumped in with the rest? Once placed into the repository, would they be replaced in the forum with a link?

Now that I have done everything that I can so far with ftp://puppyfiles.us/, all I can do is wait for a consensus on the issue in this thread. To see what I've added at this point (no dotpups yet), you can check out the FTP-Tree file at: ftp://puppyfiles.us/pub/puppyfiles-ftp-list.txt (16k)

#18 Post by deshlab »

babbs wrote:I was just thinking... Another thing to consider would be how to decide what goes into the repository? Will the new dotpups first be vetted in the forum, or will they be just dumped in with the rest?
I guess smaller files can still be released on the forum first; larger files already have to be stored externally. A release thread should be opened anyway. We could switch to a system of uploading everything directly to the repository and use the "incoming" folder as a place for beta packages:

New files are uploaded to incoming and stay available there for a week or 50 downloads (whichever comes first), and are then sorted into the main categories (moved to the general dotpup directory). A message could warn users that new files are less tested and that everyone should report problems on the forum. If problems come up, the package should be removed or renamed to "problematic*" or something.
babbs wrote:Once placed into the repository, would they be replaced in the forum with a link?
As soon as the repositories are up and running, it's no longer necessary to post a link. A new release announcement can just say "xyz compiled and uploaded, it does this and that", and people will look via ftp or a downloader tool, find the program by name and read the description there directly.

#19 Post by deshlab »

jmarsden wrote:it may be simpler to just include the necessary metainfo within each package file.
I wouldn't know whether metainfo inside the pupget itself could be extracted via the repository tool and displayed before download. If that is possible, that would be nice for reducing the amount of files on the server (to a third). On the other hand if you keep the metainfo (txt, jpg) outside of the file, they are accessible for people without a download tool that use the repository via ftp. That would be useful for people with limited internet access as well.
jmarsden wrote:pupget packages that are GUI-oriented already contain a .xpm file which is intended as the "icon" for the package concerned... why duplicate that as a .jpg outside the package and upload it?
I understood we weren't talking about icons for the list view but about screenshots (a picture says more than many words, e.g. about the difference between the adie, e3, elvis and geany text editors).
jmarsden wrote:I'm concerned we'll delay the creation of a useful facility.
Discussing usefulness and how to achieve it should be the only reason to delay anything.
jmarsden wrote:I would strongly urge that we check whether the (simple) LSM format is adequate for Puppy package repository metainfo needs, before we create a new one.
Thanks, that looks very useful!
jmarsden wrote:I'm not at all sure that splitting up repositories based on those categories is similarly good and useful
Please explain again what you mean by "splitting up repositories". If the scripted categorization I laid out in my earlier post is possible, then the main difference between one-directory and many-directories will be that one system can only be used via a downloader/web interface and the other can be accessed via ftp.
jmarsden wrote:
deshlab wrote:do you refer to ftp://ibiblio.org/pub/linux/distributio ... ackages-1/ ? how should I have included this? there are no categories there.
It is a large collection of packages, a repository. It works. It does not need any categorization within itself.
Sorry, but it doesn't work.
  1. It is quite accessible now (even for me, if I know what I'm looking for), but that is because it is a small collection of packages. "Large" is something else.
  2. If a user needs e.g. an address book tool and has no category system to help him, he will look under 'a' as in Address, not find it, and then read the whole list of packages until he finds 'g' as in Gaby. If the user uses ftp access instead of the pupget tool, he won't even realise that Gaby is what he is looking for.
jmarsden wrote:And since you are analyzing categorization schemes for Puppy packages... I thought you might be doing that piece of documenting, or would at least be aware of where it is already documented. It sounds like you weren't really intending to look that deep?
:( You sound a bit accusing there. I already said that my perspective is limited and that I am sorry for not having deeper insight into these things. I am trying to learn and do my best to contribute. I hoped that people like you could help me by pointing out aspects like this, not that they would make fun of me for missing them.
Also, you missed my point there. I did not try to find every categorization scheme available or possible. I saw that there were different ones in use and tried to come up with a unified one by comparing those I found, with the intention of saving the repository owners the work of thinking one up themselves.
I don't think Barry's category system is targeted at the end user, and therefore it doesn't belong in my analysis table. Should I add it there nevertheless?
jmarsden wrote:Pupget packages can (and are!) already be tagged with a category, see above. Displaying packages grouped according to this or some other categorization scheme is not difficult.
Then the only things missing here are the implementation of a different categorization system and an interface that allows browsing the categories one at a time.

edit: removed questions unrelated to the thread topic
Last edited by deshlab on Tue 17 Jan 2006, 11:11, edited 1 time in total.

sorry for triple posting

#20 Post by deshlab »

on second thought: :arrow: People, let's please try to return this thread to the topic (I'll rename it to make this more clear).

At the moment we are tackling multiple (related) topics here, and much of what was said here should be elsewhere:

the topic of this thread is the category system that the end user should see accessing repository/ftp/http/download tool.

http://www.murga.org/~puppy/viewtopic.php?t=5432 deals with the .p(g)et package management - we could talk about metainfo and one-directory vs multi-directory there

maybe we should use three separate threads:
- organization of the repository (directory structure)
- the .p(g)et file format (cti, metainfo, etc)
- the interfaces (improvement of pupget downloader, webmask)

Let's try to organize the discussion more so that no ideas get lost and the whole thing stays readable for newcomers.