Wiki/Report of Meeting 2024-03-14


Report of Meeting 2024-03-14

Present: Skip Cave, Ed Gottsman, Raul Miller, and Bob Therriault

Full transcripts of this meeting are now available below on this wiki page.

1) Ed mentioned that John Baker had advised that the J Viewer database be moved out of the temp directory. Some users make a practice of cleaning out their temp directories, so to avoid accidental erasure the J Viewer database will now be stored in an independent folder named jviewer inside the user folder. Ed has been working with Jan Jacobs on creating a two-dimensional display from vectors of similar documents in the J Viewer. There was an ongoing discussion about the challenges of mapping information with so many connections onto two dimensions. The word vectors have a hundred dimensions each, and these word vectors are converted to document vectors. Ed feels that the biggest question is what the user wants. Experts may have very different interests than newcomers. Some users may want to do deep research on specific topics, while others may be looking for a wider net of related ideas rather than a narrow deep dive. Ed believes that Jan's planned outcome is a clickable SVG to display the information.

2) Ed has been exploring Ollama https://ollama.com/ , a tool for running Large Language Models (LLMs), and is doing this by invoking Python programs from J. The challenge is getting the information for the LLM from J to the Python application and back. Ed had been exploring j.py https://code.jsoftware.com/wiki/David_Lambert/j.py There is also an add-on called python3 https://code.jsoftware.com/wiki/Addons/api/python3 that contains verbs that allow information to be sent between J and Python. Ed is still working through the process. Raul thought Eric had recommended the pandas library https://pandas.pydata.org/ as a model. Ed felt that this transfer was being done at the file level and not at the socket level. The advantage of Python is that many libraries have been written for it that exploit the parallelism of the GPU. Python is a glue language and it would be advantageous to have access to those libraries through J. Ed is using JSON as the common currency for the transfers. Ed feels using sockets would be faster but has concerns about how robust sockets might be over the long timeframes that LLM processing might require. Debugging in Python is a challenge, and if J is blocked waiting on the process and Python runs too long, there is not really a way to shut down the process from J. This is very much an exploration and Ed is not sure how far he will take it. Raul suggested that tmux might be useful to allow a language to drive the process. Raul also provided links to ways to interact with Python. https://github.com/daeken/Benjen/blob/master/daeken.com/entries/python-marshal-format.md and https://jakevdp.github.io/blog/2014/05/05/introduction-to-the-python-buffer-protocol/ Raul also knew of a Cygwin version that can support tmux.

3) Bob has been exploring the CSS grid display and found that it is good for some page styles because of the way that it allows pages to respond to the shape of the screen. https://code.jsoftware.com/wiki/User:Bob_Therriault/Newcomers Bob has also found that displaying the information as a table is also responsive. Bob thought that he might stay with tables for many of the simpler pages, but go to grid for more complicated pages because the grid pages are easier to change when that is necessary. We also reviewed the different screen sizes using Chrome's developer view. Table version of category page https://code.jsoftware.com/wiki/Category:Community_C Grid version of category page. https://code.jsoftware.com/wiki/User:Bob_Therriault/Community On a page such as NuVoc the complicated structure is easier to change with grid, although the question of how often a page like NuVoc may need to be changed may make the change from tables to grids not worth the effort. Raul thought that the information reminded him of the info pages that had been imported to the wiki from the previous Jsoftware site https://code.jsoftware.com/wiki/Help/Index_A Both are a form of links with brief descriptions.

4) Ed raised the question of the redundancies on the Community Category page https://code.jsoftware.com/wiki/Category:Community_C between the page content and the table of contents, and whether that is an issue. Bob explained that the redundancy allows the user to get hints at what might lie within the categories. Raul mentioned that there may be a lot of maintenance to change the information in multiple places. Bob agreed that this approach involved a lot of maintenance if changes were being made all the time. New pages would have to be placed in the wiki, perhaps with discussion within the wiki group to ensure that the page is located in the best position with the appropriate links. Bob thought that it might be possible to eliminate some of the information and try to lean more on the automated generation of pages. Ed felt that the Category tree might be all that is needed if the descriptions were expanded a bit. Ed mentioned the hover extension that Raul mentioned in the previous meeting. Bob pointed out that hover could be a challenge for phones or tablets. Since hover will not work on tablets, the category trees are good for structure, but require a bit more information to be truly useful. Ed suggested that categories could be named with separators and a series of points that would give the viewer a better sense of what the category and page would contain. Ed felt that good goals would be to reduce the redundancy and the administration burden.

5) Raul said that there was a new version of MediaWiki which has a dropdown menu for its sidebar and we might consider upgrading to get some of that functionality.

For access to previous meeting reports see https://code.jsoftware.com/wiki/Wiki_Development If you would like to participate in the development of the J wiki please contact us on the J forum and we will get you an invitation to the next J wiki meeting, held on Thursdays at 23:00 (UTC). The next meeting is March 28, 2024.

Transcript

Oh, do you have anything new about the J Viewer other than you're changing where you're

well, in addition to changing where you're making the storage?

Yeah, so I forget his name.

Hold on here one second.

John.

No, no, no, no, no.

Somebody else entirely.

John Baker, he of the JOD add on got in touch with me and said, get the hell out of the

temp directory.

So I did that.

He was actually a lot more tactful than that, to be honest.

But that was the burden of his message.

He's been on the call since last week.

He's quite a nice guy.

Oh, he is.

I'm sure.

What was the reason that I mean, I.

Oh, I tend to leave the temp directory pretty much alone.

I work with it, but I never clear it out.

I never really treat it as if it were really temp.

But apparently a lot of people do.

And John said that he had at least on one occasion wiped out the J Viewer database by accident.

So yeah, that's quite reasonable.

So yeah, I moved it.

Jan Jacobs and I have been going back and forth and I.

I'm having sort of a.

So.

Jan has this approach to identifying similar documents that involves.

Vectors doesn't really matter.

And his goal is to come up with a two dimensional layout of documents.

Where similar documents are close to each other and more similar documents are closer

to each other and less similar documents are less close to one another.

And I'm having sort of a crisis about that because I don't understand what it means for

a document to be similar.

Or dissimilar.

It strikes me that similarity could occur along a lot of different axes.

Two documents from the wiki could.

Have the same J primitives or the same J trains in them, and that would make them similar

along that axis.

Or they could be similar in that they're treating the same problem space or they could be similar

in that they are both pitched at beginners or both pitched at much more experienced people

or what have you.

I came up with six or seven possibilities.

And I explained that to Jan and he did respond, which and his response was thoughtful and

I very much appreciated that.

He has now, I've given him all my data and he is now pursuing that effort.

And I have just become willing to sit back and let him let him figure it out.

I've offered to help him in any way that I can.

I did give him nicely cleaned up text document versions of the wiki with all of the J mnemonics

converted to letter J numbers so that the tokenizers will treat them seriously.

I'll be interested to see what he comes up with, but I'm not actively pursuing it myself

at this point.

And that is all I have.

I didn't see the response from Jan.

I did see your message to him.

I was thinking he typically drops you off when he replies and I've gone back and forth on

whether that's deliberate.

So I'm never sure what I just hitting reply.

Yeah, I'm sure you're right.

I'll forward it to you.

Save me from going over what he's already gone over.

So how many, how many elements of his vector roughly?

The way it turns out, each document has 100 elements in the vector, 100 dimensions in

the vector.

I have no intuition about what that might mean.

Yeah, I'm just trying to think of the concept of distance between vectors.

Well, that part of it actually holds up pretty well.

So there's a Python library called Word2Vec, which Jan uses as part of the process of creating

these document vectors.

And Word2Vec will take a document corpus and create vectors for the words.

And they are, I think it was 100 actually dimensions on each word vector.

And that actually has some interesting properties.

So if you have a vector for queen, for example, and you subtract the vector for woman and

add the vector for man, you end up very close to the vector for king, just for example.
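
The arithmetic described here can be tried on a toy scale. The following is a minimal sketch, assuming hand-made three-dimensional vectors whose values are invented for illustration; it does not use Word2Vec, and real word vectors would be learned from a corpus and have about 100 dimensions.

```python
import math

# Hand-made toy vectors (hypothetical values, not Word2Vec output).
# The three axes loosely stand for: royalty, maleness, femaleness.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(start, minus, plus):
    """Return the word closest to start - minus + plus, excluding the inputs."""
    target = [s - m + p for s, m, p in
              zip(vectors[start], vectors[minus], vectors[plus])]
    candidates = [w for w in vectors if w not in (start, minus, plus)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

nearest = analogy("queen", "woman", "man")
print(nearest)
```

With these toy values the closest remaining word is "king"; with learned vectors the result is only approximately reliable.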

Now Jan goes a step beyond that in an effort to turn those word vectors into document vectors.

And that's where I lose him is the similarity of documents.
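
The transcript does not say how Jan turns word vectors into document vectors. One common baseline, shown here only as an assumption and not as Jan's method, is to average the word vectors of a document and compare documents by cosine similarity. The vocabulary and vector values below are hypothetical.

```python
import math

# Hypothetical 2-dimensional word vectors; learned vectors would have ~100 dimensions.
word_vectors = {
    "sort":   [1.0, 0.0],
    "bubble": [0.8, 0.2],
    "quick":  [0.9, 0.1],
    "plot":   [0.0, 1.0],
    "graph":  [0.1, 0.9],
}

def document_vector(words):
    """Average the vectors of the words in a document that have a known vector."""
    known = [word_vectors[w] for w in words if w in word_vectors]
    if not known:
        raise ValueError("no known words in document")
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

doc_bubble = document_vector(["bubble", "sort"])
doc_quick = document_vector(["quick", "sort"])
doc_plot = document_vector(["plot", "graph"])

sim_sorts = cosine(doc_bubble, doc_quick)   # two sorting documents
sim_mixed = cosine(doc_bubble, doc_plot)    # a sorting and a plotting document
print(sim_sorts > sim_mixed)
```

Averaging collapses every kind of similarity into one number, which is exactly the concern raised here: the measure cannot say along which axis two documents are alike.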

There's going to be, like you said, there's gonna be many kinds of similarity, but you

can't be all things to all people.

And it's probably as much about picking some useful ones as it is about tossing some unuseful

forms of matches.

The question is, what does the user want?

I mean, is he interested in similarly beginner?

Well, you're not going to get all users that ever there, since everybody's going to have

to very quickly go through the stuff that is of interest, of high interest to them and

then get into the secondary things.

Well let's disappear into the rabbit hole.

So the other measure that's of interest that isn't captured at all by this is related.

So I might have two documents on sorting, one that describes the bubble sort in some

detail and one that describes the quick sort in some detail.

They're not similar in any strong sense, I would argue, except that they both treat sorting,

but they are definitely related.

Maybe what I'm really interested in is related documents, and we're not capturing that either.

But then when you say related, they couldn't just be related by the fact that they're both

sorts.

You've decided that that's the relationship, right?

Sure.

All I'm saying is, in addition to the many dimensions along which documents might be

similar, there are dimensions along which documents might be related, and a user might

be interested in either of those sets of relationships.

But if it's sorting, they're probably already searching on sort already.

So what you're after is interesting stuff that the user might not have thought to search

on that's related to what they're reading.

And you shouldn't have too high of expectations there.

You should instead probably try and-- I mean, you can try and prove it, but it's also--

they can look at what are the sorts of things that creep into there that seem like they're

a waste of time or useless, and as a way of passing those from the related list, from

the interested list.

Well, I agree with you.

And I'm sorry, go ahead.

I just don't know if-- I mean, I-- go ahead.

That's fine.

So my feeling is that the magic eight ball mechanism I came up with a couple of weeks

ago is that, in effect.

It says low expectations.

It's just the magic eight ball.

It's not automatically going to find you exactly what you want.

And it's based on commonly occurring words.

So pick a word, and I'll show you some documents that are related to that, that include that

word which appeared in the document that you're looking at now.

So you get some agency.

You get some choice about what constitutes relatedness.

You're not presented with a hard-coded layout of relatedness that you're forced to live

with.
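
The "pick a word, see documents that contain it" idea can be sketched with a simple inverted index. This is only an illustration under assumed data; the page titles and texts are hypothetical and the J Viewer's actual implementation is not described in this meeting.

```python
from collections import defaultdict

# Hypothetical miniature corpus: page title -> page text.
documents = {
    "Bubble sort": "bubble sort swaps adjacent items",
    "Quick sort": "quick sort partitions items around a pivot",
    "Plotting": "draw a graph with the plot addon",
}

def build_index(docs):
    """Map each lowercase word to the set of titles whose text contains it."""
    index = defaultdict(set)
    for title, text in docs.items():
        for word in text.lower().split():
            index[word].add(title)
    return index

def related(index, current_title, word):
    """Titles other than the current page that contain the chosen word."""
    return sorted(index.get(word.lower(), set()) - {current_title})

index = build_index(documents)
choices = related(index, "Bubble sort", "sort")
print(choices)
```

The reader chooses the word, so the reader decides what counts as related, which is the agency described above.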

Yeah, it's interesting.

It'll be interesting to see where it goes.

Because to me, it's mapping 100 dimensions down to two.

Yeah, which I have not wrapped my mind around, I freely admit.

Yeah.

And I guess there's a few other ways you could-- well, it still comes down to two dimensions.

I mean, I'm just thinking, things being closer or further away, it's still a distance on

some form of two-dimensional thing.

And you could do a heuristic like, look at the words they're sorting on.

Go into the document and pick out words that are near those words, and do some quick searches

on those, and pick some of those out.

I'm sure there are a number of heuristics you could try.

I'll be interested to see what Jan comes up with.

He's got enough background in doing this already, and he's got enough information from me, I

think, that he should be able to come up with-- I believe he wants to produce an SVG map that's

clickable.

And I'll be very interested to see what that looks like.

Fascinating stuff.

I guess the only other thing I've recently seen you emailing about is the Python.

I think it was--

Oh, well, that's totally separate.

Yeah.

I can talk about that if this is the time for it.

Let me share my screen.

So this grew out of the fact that when I was using Word2Vec and when I was using some Ollama--

Ollama is one of these LLMs, or LLM manager, I guess you'd say, routines.

I found myself coding Python, which a few years ago when I retired, I promised myself

I would never do again, but I had to do it.

And so I have this notion that is not very well thought out.

I'd be interested to hear any comments on it, which is invoking Python from J. And as

Eric Iverson pointed out, that's not actually terribly hard.

You can invoke Python from J because you can shell out to programs from J. And you can

pass in a text file or a .py file, and it'll be executed.

It struck me that the real problem-- and this is what I ran into when I was trying to use

Word2Vec when I was working on Jan's problem-- is marshaling data between J and Python and

back again.

In order to do that, I was doing file IO and database IO on both sides, J and Python.

And it was-- it's not hard, exactly, but it was unnecessarily irritating.

So here's my notion.

On the left, you've got your J transcript, your J terminal.

And what you can do is there's an add-on called J.py or something like that, and it's got

four or five routines in it.

The first one is init.py.

And what init.py does is you pass in a path to a Python executable, and it'll fire it

up, and it'll get the process ID.

And then you can do a little J. You can say, all right, well, here I've got this great

big array, a million elements, which I assigned to the name K.

And then another one of the J.py routines is put.py.

And you can just give it the name of a J variable to send over to Python.

And it will do that.

And the way it does that is it turns it into JSON, it compresses it, and it passes it over

a socket.

And there are routines that were defined back up here when you did init.py that receive,

decompress, and parse the JSON.

And then they assign the resulting value to K in the Python namespace.

And then there's a run.py routine, which lets you execute arbitrary Python.

So you import neural netlib, which just imports neural netlib over on the Python side.

And then you run py, again, run Python, m equals some nifty AI routine, and you pass

in your million element array.

And that takes a while, but it comes back eventually.

It's just all synchronous.

It comes back eventually and says, yeah, OK, I finished that.

And then you can-- that code executed over on the Python side and assigned a vector to

m.

When you do get.py m, that on the Python side JSONifies the value of m, compresses it, and

transmits it.

And it turns out m is a string.

And you echo it, you see it, and you see what the nifty AI routine came back with, which

is answer unclear, try again later.

And then you close py to shut down the Python process.
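
The put step described above (JSON, then compression, then a socket) can be sketched in Python alone. This is a minimal sketch under assumptions: the function names, the 4-byte length prefix, and the use of zlib are illustrative choices, not the design of any existing J add-on, and both ends run in one process over a socket pair so the example is self-contained.

```python
import json
import socket
import struct
import zlib

def send_value(sock, name, value):
    """Encode name/value as JSON, compress it, and send it with a length prefix."""
    payload = zlib.compress(json.dumps({"name": name, "value": value}).encode("utf-8"))
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, count):
    """Read exactly count bytes, since recv may return fewer than requested."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)

def recv_value(sock, namespace):
    """Receive one message, decompress and parse it, and bind it in namespace."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    message = json.loads(zlib.decompress(recv_exact(sock, length)).decode("utf-8"))
    namespace[message["name"]] = message["value"]
    return message["name"]

# Both ends live in this one process purely for demonstration.
sender, receiver = socket.socketpair()
python_side = {}                      # stands in for the Python namespace
send_value(sender, "K", list(range(1000)))
received_name = recv_value(receiver, python_side)
sender.close()
receiver.close()
print(received_name, len(python_side["K"]))
```

The length prefix matters because a socket delivers a byte stream, not whole messages; without it the receiver cannot tell where one compressed value ends.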

And this is not my domain by any means.

So I'm not clear on how reasonable or feasible or whatever this is.

This is about as far as I've taken it.

I've got another slide with challenges that we may get to.

But I'll pause here for commentary on what I've presented so far.

What Eric had recommended was using Pandas, which is part of JD, I guess, as a model for

how to marshal.

But I haven't used that myself, so I don't know what that even involves.

I think he's marshaling by file.

I don't think he's marshaling over a socket, which is fine.

It makes perfect sense.

But I wanted something that was faster.

I guess one of the things to consider, and I don't know Python, but to try and make sure

that you're-- I guess one thing is the functionality.

If Python's got the functionality and J doesn't have the functionality, you're essentially

using J as a glue language to put together the Python scripts.

Is that accurate?

I would agree with your conclusion.

I would question the presumption.

So it's not that the J language lacks something that the Python language has.

It's that the Python language has been used to package up a bunch of really useful array

functionality that exploits the GPU, among other things.

And J can't do that at this point.

And the easiest thing to do is come up with a way, trivially, to exploit those Python-packaged

libraries.

And speaking of language, the actual GPU array routines, as I understand it, are mostly

in very, very optimized C. They're not actually in Python.

Python is just the packaging mechanism that's been used for those algorithms.

So yeah, I want to use J as a glue language for those libraries.

Yeah, because that's what I've heard, is that's the big advantage right now of Python, is

that there's so many people in there writing so many different libraries that if you want

to do something with data, chances are somebody's written a library that has the functionality

that gives you what you want.

Or maybe it's spread across a couple of libraries, but essentially the code has already been

written to be able to do the things you want to do.

Yeah.

My contention is it's not hard to write Python, and it's not hard to...

I used mostly a database.

I hit the database from Python, and I hit the same database from J. And that was my

marshalling mechanism, was reading and writing to that database from either side.

But that's work.

And maybe we can come up with something that's fairly lightweight and much less work, much

more trivial, that would still let us exploit those Python libraries.

And again, I'm not sure this is very much...

This is not had a lot of thought put into it, and I'm not the right guy to be doing

it.

And you're using JSON as a lingua franca between the two.

That's a common data structure.

Yeah.

So that takes care of... in some sense, takes care of the...

I'm not really sure how to compare that to Eric's use of the pandas format.

I'm not quite sure how to think about that.

Well, pandas is just a...

Yeah, this is...

Pandas is just a library of Python, isn't it?

It is a Python library.

It's got...

I think it uses NumPy for its vector definitions.

And Eric was exploiting that, which I think makes a lot of sense.

It struck me as simpler just to use JSON.

And if you want to convert Python vector to a NumPy vector, that seemed like something

you could do in the Python code that you passed over to the Python process.

But I could be wrong about that.

It might make a lot more sense to use Eric's approach.

But in any case, I really like the idea of using sockets rather than going through the

file system.

It seems faster.

On the other hand, these neural net and large language model routines run for so long.

I mean, they run for minutes at a time or hours that may not make any difference if

you're marshaling across the file system or across a socket.

That's another consideration.

Although I think I was mentioning on the emails, JHS, I've had it up running for days.

And it's all running on sockets between the browser and the J engine.

Yeah, my concern about sockets, and I do have some, is strictly the result of ignorance

on my part.

I've never done any socket work.

I've done MQTT work, which is a lot easier.

Most of the foibles of sockets are nicely hidden by MQTT.

But I'm sure you're right.

I'm sure properly done sockets would be entirely reliable.

There are some problems.

If you go to debug the Python code, there's no easy way to debug it.

That suggests that you should keep your Python code simple.

It would be nice to have some kind of logging mechanism that would come back to J so you

could write out progress reports from your Python code and see what was happening.

If J is blocked listening for a result from the Python process, it may be hard to terminate

the Python routine if you decide, oh, it's gone off by itself or it's going for too long.

I have to terminate it.

If the Python process breaks and it stops listening, that's obviously a problem.

You've got some cleanup you have to do.
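
One way to avoid blocking forever on a runaway worker is to wait with a timeout and kill the process when it expires. The sketch below shows that pattern from the Python side using the standard subprocess module; the function name is hypothetical, and a J caller would need its own equivalent, which is not shown here.

```python
import subprocess
import sys

def run_with_timeout(code, timeout_seconds):
    """Run a Python snippet in a child process, killing it if it runs too long."""
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        out, _err = proc.communicate(timeout=timeout_seconds)
        return {"finished": True, "returncode": proc.returncode, "stdout": out}
    except subprocess.TimeoutExpired:
        proc.kill()          # the cleanup step: stop the runaway child
        proc.communicate()   # reap it so no zombie process is left behind
        return {"finished": False, "returncode": proc.returncode, "stdout": ""}

quick = run_with_timeout("print('done')", 10)
stuck = run_with_timeout("import time; time.sleep(60)", 0.5)
print(quick["finished"], stuck["finished"])
```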

I can't find any discussion of how you can control IDLE, the Python interactive REPL,

basically.

I haven't found any information on how you can control one of those remotely.

The interesting thing about that would be you could actually have the Python, rather

than being a headless process, it could be a visible thing you could see happening.

That way, that increased visibility would enhance your ability to do things like terminate

a Python process that you didn't want to have running anymore.

All, as I say, speculation.

That's as much as I've done and I'm not sure I'm going to do any more.

That's what's been occupying me for the last eight hours or so.

Well, and there's some, I think I mentioned Py'n'APL, which is the Dyalog APL version

of their bridge that Rodrigo wrote.

You could use tmux if you really wanted to have a remote driver of a Python session,

I guess.

What is tmux, Raul?

tmux is a terminal multiplexer.

It basically gives you a command line environment that you can control with a language.

You can actually have, it's meant for users, so you can have multiple terminals doing multiple

different things.

It's frequently useful, like if you're remoting into another system, to leave a tmux session

there and then you can disconnect and come back later and connect to it.

It's that kind of a support.

But it also has a language-like API with it where you can send commands to things and

get, check for output necessary.

So you spawn a tmux instance and then tell it to launch Python, for example, and then

talk to Python via tmux.
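
A sketch of what driving Python through tmux might look like, assuming tmux and a python3 executable are on the PATH. The session name and helper functions are hypothetical; the tmux subcommands used (new-session, send-keys, capture-pane, kill-session) are standard ones. The demo function is defined but not called, so nothing is launched unless it is invoked deliberately.

```python
import shutil
import subprocess
import time

def tmux_commands(session, line):
    """Build the tmux command lines for one start/send/read/stop cycle."""
    return {
        "start": ["tmux", "new-session", "-d", "-s", session, "python3"],
        "send": ["tmux", "send-keys", "-t", session, line, "Enter"],
        "read": ["tmux", "capture-pane", "-p", "-t", session],
        "stop": ["tmux", "kill-session", "-t", session],
    }

def run_demo(session="pybridge", line="print(6 * 7)"):
    """Drive a Python REPL inside tmux; returns the pane text, or None without tmux."""
    if shutil.which("tmux") is None:
        return None
    cmds = tmux_commands(session, line)
    subprocess.run(cmds["start"], check=True)
    try:
        time.sleep(1.0)   # crude wait for the REPL prompt to appear
        subprocess.run(cmds["send"], check=True)
        time.sleep(0.5)   # crude wait for the line to be evaluated
        result = subprocess.run(cmds["read"], check=True,
                                capture_output=True, text=True)
        return result.stdout
    finally:
        subprocess.run(cmds["stop"], check=False)

commands = tmux_commands("pybridge", "print(6 * 7)")
print(commands["send"])
```

Because the session is a normal tmux session, a person can attach to it and watch or interrupt the Python process, which addresses the visibility concern raised earlier.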

Right.

Although I don't, tmux, I'm familiar with tmux from the Linux side of things.

I don't know if there's a Windows.

It seems like there should be, but I haven't used it recently.

I haven't used it for quite 10 years.

It seems like it should be.

Well I will come to tmux.

Thank you.

I'm also going to throw into chat a couple URLs that are kind of along the line of interoperating

with Python using a foreign function interface and having Python as the remote rather than

as the client.

Oh.

Just quick things I found.

They're not the best, but they might yield useful terminology for further searches.

Python buffer protocol.

That sounds very promising.

And is tmux built on top of sockets?

Is that its transmission?

I guess there's also a Cygwin version of tmux if you want to go that route.

I'm sorry, a what version of tmux?

Cygwin.

Cygwin is a Unix-like shell for Windows.

Ah, I see.

And it provides the Unix API, so a lot of Unix programs get compiled and available through

Cygwin.

C-Y-G-W-I-N.

I see.

It's been around for a long time.

Many moving parts.

You install it and you run it and at that point you've got a new home directory which

is the Cygwin home directory that you use when you're in those programs.

I guess in that sense it's moving parts.

So Bob, that is all that I have at this point.

I wish I could be more entertaining, but that's as good as it gets.

Oh, well, you know.

We all do our best.

We do.

No, I think you're quite entertaining.

It was certainly getting me to think a lot about a lot of different things with Python

interactions with language because it's certainly a discussion that we've had a lot on the podcast

about how you can do things with array languages and you can do things with Python and Python's

going to be slower, but a lot of the stuff's already been written.

So, you know, and well, it's a mistake, I think, to talk about Python.

It makes more sense to talk about libraries.

And a lot of those libraries are going to be much, much, much, much faster than anything

you can do with J as it exists today because they can exploit the GPU.

They can exploit farms of GPUs.

Faster for the right workload.

Right, right.

Obviously, there are.

But I mean, these LLM thingummies are, they live and breathe arrays.

They want arrays as inputs.

They give arrays as outputs.

It feels very natural to drive them from J, even if you're not necessarily doing the heavy

lifting with J code.

And it's kind of hard to do that these days.

It's kind of a pain.

I guess I will share my screen because I just got I really been I've been playing around,

of course, with the grid stuff.

Oh, yeah.

Which we had a chance to look at last time.

So I'm sharing my whole desktop.

See how I can.

If I click on this, there we go.

That gets put in the other way.

So now I'm seeing what you guys are seeing.

And this is an example like of the community page of a category page.

That's got the category tree on this side and the table of contents refers down to these

headings on this side.

And this was the original one that I built.

The neat thing I found about this is I was always concerned about how responsive it was.

And you can see it actually is fairly responsive until it gets to there and then it just flips

over like that.

But it's a perfectly reasonable response.

What's that?

That's a perfectly reasonable response.

Oh, yeah.

No, no, I will.

The contents.

Yeah.

So I'm going to show you what I did.

Well, this is the original.

This is built with tables.

This is built with just basically the same sort of information that we've always been

using.

So I didn't go to grids with this.

It's sort of the original.

So then I went and started to play around with grids.

And this is what I came up with here.

And this is grid based.

And so what I was able to do is each one of these is a separate group or its grid within

itself.

And it does the same kind of compression, slightly different when it gets narrow.

You can see it actually will squeeze these.

Yeah.

But it does lose this at a certain point.

So it's actually right there.

Now if I just a trick that we did with Raul showed me last time.

Let's go back out wider so I get a better view of it.

If I want the different oh, I'm still in Safari.

So I didn't want to do that.

Let's just see.

Let's see here.

I'm going to go off to my.

There we go.

So now I'm taking another look at it.

And this is again back to my grids.

If I go back to this, I can change the like this is the Pixel 7 version of it.

And iPhone 12 Pro and SE are sort of flatter.

So they're actually nicer.

Smaller screen, but it's actually nicer.

And some galaxies.

All those sort of line up nicely in that in that grid.

But they actually line up just as nicely without the grid.

So the question for me on these pages, like on some pages, I think the grid makes an awful

lot of sense.

Oh, and the other nice thing you can do with Chrome is you can rotate it.

So now I can see what it looks like in the different landscape with the same galaxy.

A51/71.

And I can see the different ones.

Surface Pro.

And I can rotate them.

So it's a really useful thing to look at the responsiveness.

But you can see that in this case with the grid.

And I actually end up getting the same thing.

I go back to my sky, I'll make a copy here.

We're back in the same thing.

But you can see that if I rotate these, this is the old grid.

It's still really serviceable.

And this is actually simpler to do when you're dealing with information on this scale.

When you get into the more complex grids where these things actually might move around depending

on how you squeeze things.

Like for instance, if you were looking at something like NuVoc, the advantage the

grid has is it gives you a better semantic view of your information.

So it's much easier to find what you want to change within that rather than looking

through all the tables, which are actually kind of difficult to parse when you're trying

to find out what part you want to change.

So the one advantage I find grid has over the tables is grid is cleaner when you actually

want to go in and change things.

But at this scale of information, that's not required.

And these category pages really are that scale of information.

Like they're not more complicated than this.

They might have longer lists, but essentially, they're always this sort of format.

You've got a table of contents that links to these.

You've got your tree for which this applies.

And then you've got your list down here.

So that really the structure, you're not building tables on top of tables or within tables.

When you get to doing that with grid, you can take table of contents and put it within

a grid and move it around, which makes it more flexible when you're dealing with tables

within tables get quite complicated.

So I definitely look at using the grid for my newcomers page where I've got all those

icons and everything.

But I don't really see an advantage to using these pages.

So my question is, of these two, which do you like better?

Do you like this format?

Or do you like that format?

I'll get rid of this.

I think they're both fine.

I don't think you should make a choice based on...

Sorry.

I don't have a strong preference myself.

I'm not used to the boxes around the bits of text.

I haven't really internalized that yet.

It seems like that when I see boxes around lines of text, I'm expecting this, like a

table of contents or your category things, it's lost information, site information.

It's not the key information.

So the first thing I guess I do is I look at this and I say, well, where's the content

of this page?

But then I realized that it's in the boxes.

You could drop the outlines, right?

Absolutely.

Yeah, no, that's a piece of cake to do.

I'll do that right now.

I don't think that the appearance or behavior is sufficiently different that it would drive

a decision.

I think your decision should be based on other things.

It's kind of like the indices in the books that we brought over.

It feels like it's that kind of content here.

Yeah.

When you compare this without the boxes to this, it feels to me like to me this is preferable.

It does take the headings and they link back to this pretty quickly.

You can see what's going on.

And then this becomes content that is in that box.

And those are separated as well.

All right.

What it's reminding me of is this thing that hardly anybody ever uses.

I threw it in chat.

Okay.

Well, I can stop sharing if you want to show it.

That's fine.

Oh, yeah.

Yeah.

Indexes.

Yeah.

It's that kind of summary information that it's mostly getting you from here to there.

And this was patterned after how book indices were redesigned.

You're filling in a little more information than just a word or two on the link, but it's still

that kind of a thing where it's just glue between here and there.

I shouldn't say just, but it's glue between here and there.

Yeah.

No, just is fine because I agree with you.

It's glue, but I mean, glue has purposes.

Oh, yeah.

And this does.

I'll just show that in case.

This particular page is probably duct tape glue, but glue is still useful.

Yeah, there's a similar flavor to it for sure.

So Bob, are you going to redo NuVoc with Grid?

Or should one of us do that?

Honestly, when I got to thinking about it, it was a case where if I were to build it

now, I would build it with Grid.

But it's a thing that actually doesn't change that much.

So it's considerably simpler with Grid, but it doesn't change that much.

So what do you gain by, you know, it's a pain in the butt to go in and figure out where

you want to do it and make the change, but how often do you have to do that?

And so that's the question about whether it's worthwhile to go back in and try and redo

everything.

I know that when I was doing NuVoc for the viewer, I started out trying to write a parser

for the table.

Yeah.

And I gave up.

It was too painful.

I just built it manually.

I extracted all the data and stuck it in a database and built it manually.

And then I generate a graphic from that.

And it was horrible.

You just don't want to live like that.

But you're right.

It doesn't change that often.

No.

If there was a similar page that I was starting out with, I think there's an advantage.

And I think the advantage would be in things like parsing.

Because you can put grids within grids.

And so you just do those jumps down or back up again.

It's similar to trying to organize the boxes in a presentation of a boxed noun.

But I think I'm the only one who ever would try to parse it.

So that's probably not compelling.

Yeah.

No.

That was the thing.

And I did look at it.

But when I thought about it, is that a good place to put my effort?

Because it cleans something up.

But it cleans something up that's not going to be really profitable.

Yeah.

No.

I think I agree with that.

Yeah.

But it certainly gives me-- it keeps me on the lookout for places where it could be.

Because there are so many things that Grid can do when you're trying to put information

together.

And I would say, if I was just guessing-- and it really is just a guess-- the point at which

Grid becomes useful is, I think, when you get that second layer of tables.

So if you had a table within a table, I think Grid really shows up there well.

But with the things I'm doing with the category pages, I could do a grid across the top.

I could do a grid across the table of contents in the category and keep that separate from

the rest of it.

But it doesn't really gain me anything.

It all reshrinks nicely anyway.

Yeah.

But I was more interested to see what the thoughts were regarding the differences, say,

between this format and then the more traditional format.

Having said that, I would stay, I think, with the more traditional format.

I think for most people playing around with these in the past, they've used tables.

And this isn't a difficult thing to do with tables.

And I think to some extent, it does-- like Raul was talking about the separation of the

headings from the content-- taking the boxes out to me doesn't separate these groups as

much as this does.

But I get the point where when you do separate them, are you tying forums to these other

lines underneath?

And that's something-- there's two different things there, the heading and then the information.

And this retains that difference, which is nicer.

Yeah.

OK.

Well, that's what I was looking at.

But there's something a little odd.

I mean, the contents on the left and community on the right are redundant to some extent.

And both are redundant with respect to what you might think of as the content below.

So you've got the headings, forums.

The heading forums, for example, repeated thrice, and the heading IRC channels repeated

thrice.

And I assume that if over on the right, you click on the IRC channels triangle, you'll

get overview bots and whatever that is, Jevil bot, Chey Val bot.

Well, Chey Val bot or whatever it is.

Yeah.

Those particular ones.

There's a lot of redundancy here.

There is.

And--

I admit that I'm complaining about it long after I probably should have.

No, no.

It's-- when I actually put this page together, I went through that same process saying, OK,

there's a ton of redundancy here.

What's the use of this?

Well, the use of this is when you get into situations where you might go a couple of

levels deep.

If I go to other array languages, I get this-- I'm into this section, so I can click on these.

It'll take me to the other array languages.

If I go here, I'm just guessing at this.

Yeah.

I can go a couple of levels deep on this quite quickly.

The thing that redundancy gets you into-- redundancy is great when you want a secure

system that people can't change.

That brings up the point of, if you do want to change it, what's the procedure here for

adding a new entry or changing some of these things?

If you wanted to add something to-- in part of this tree, it would be as simple as adding

the appropriate category tag.

Wouldn't you need to also edit the block?

You would need to go back and edit the block.

It would-- but it would show up in here for sure.

So you'd add the category to the destination page, and then you'd come back here to add

an entry to the block.

So the question is, if a person's editing the wiki and doesn't know about the system,

isn't familiar with it, how do we bring them up to speed on the process?

Or do we leave comments on the pages of-- do we have somebody keeping an eye on it and

coming in and filling in the blanks?

What's the long-term maintenance picture here?

I think the model to use would be that if a person is creating new pages-- I'm trying to

think of his name, Thomas something, who's doing a lot of particle physics stuff.

He's creating new pages all the time, but they all fit in underneath the same information.

You don't need to create a new category for him.

He's working within his own category.

But if somebody was actually going to create something that might show up in a new space

or link to something new, actually, I think that becomes an administration situation for

the people running the wiki.

They can decide what the best links are, and at that point, what would be put in and where,

and they would be the ones that would keep it up to date so that these category pages reflect

what the categories are.

So if I was to, say, develop something in Python, an interface for Python, and it was

a page that was going a different direction than anything had been done before, and so

I said, well, I'm going to create this page.

It's going to be based on this.

It'll fit into this category.

I'm actually kind of creating a new subcategory.

There'd be a discussion about that.

This new page has come up with this new subcategory.

It fits in here.

Is it significant enough that it should get a heading on this category page?

Should it make a change to this category page, or do we rely on people just to find it going

through the categories, and they'll be able to link it because, again, like the tags,

it shares a category with a bunch of other pages that are similar?

There's so much redundancy, there's a number of different ways to find it.

Well, so, Raul, I don't really understand redundancy as a security mechanism because

I don't think we have a security problem here, if I'm understanding you correctly.

No, that was just a general comment on the subject.

Okay.

All right.

Bob, it really strikes me that the rightmost tree could be all you need.

Work with me here.

If you look at, let's say, IRC channels down in the body of the page, IRC overview, "this

page provides an" is meaningless, unnecessary.

All you really need is "overview of the J IRC channels."

And I think that's true of a lot of them, I haven't looked at them all, a lot of those

descriptions.

There's a good deal more there, like 50% more or 100% more than there needs to be.

And it strikes me that the rightmost tree could be augmented with the remaining necessary

words for each entry and you'd be fine.

That's all you'd need.

You wouldn't need contents on the upper left and you wouldn't need the content on the left

and going downwards.

I don't get why there's all this.

It confuses me.

Okay.

Well there's two different levels to it.

I think Raul was the one who originally mentioned this: my descriptions in the different

boxes are not the best.

There should be more, like for instance, forums.

The point that he made is I don't really tell you what a J Forum is.

That line that says "access to J Forums on Google Groups" should actually be talking about the

importance of the J Forums as a means of communication within the J language community.

So those lines could have better information written into them.

Having said that though, your point is do you really need them at all?

Even if it's better information, does it gain you that much?

No, no, no, no.

You need them.

The question is how do you deliver them?

And last week we identified, Raul identified, an extension that is heavily used by Wikipedia

where you hover on something and it shows you the first couple of sentences from the

target page.

So my question is, couldn't you just have the tree on the right and when you hover,

get the contents from the target page, the first couple of sentences from the target

page?

There would be a number of advantages to that.

One is you get rid of all the redundancy.

Two is there wouldn't be an administrative issue anymore.

So the number of advantages is two, I guess.

And in the case of something that's based on a touchpad, I know hover can be a challenge.

Ah, touché.

Right.

If I'm on a phone or a tablet.

Yeah, I don't have a good answer to that.

Actually technically I don't have any answer to that.

I just turned it right.

When I started thinking about that in terms of interfaces, somebody came back with,

"Well, you know, if you're on a touchpad..." and I went, oh yeah, no, it's true.

It's true.

Yeah.

But I think the, okay, so maybe my second order conclusions are not very good, but I

think my first order conclusions are reduce the redundancy.

Try to get rid of the administrative burden and make it as automatic as possible.

Well, yeah, I don't know the answer to that.

Yeah.

But I guess what I'm thinking is the place that reduces the administrative burden the

most is this section in here.

Maybe that's all you need.

I don't know.

Well, no, because if you do that, I think you're losing more.

Like if you take this out, you're losing access to the whole structure of

the whole wiki.

Yeah, I think you should keep that and augment it with the content that you've got in the

content area currently, eliminating the unnecessary words.

And I take your point that you want to put more information in there, but I really wonder

about the level of effort required to do that.

It strikes me that you might be much better off just arbitrarily syntactically grabbing

the first few words from each target page.
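
As a rough illustration of the "grab the first few words" idea, the sketch below truncates a page's plain text to a fixed number of words. It assumes the page text has already been fetched and stripped of wiki markup; the function name `first_words` and the 12-word default are hypothetical and are not part of MediaWiki or the J wiki.

```python
def first_words(page_text: str, limit: int = 12) -> str:
    """Return the first `limit` whitespace-separated words of page_text.

    Hypothetical helper: the name and the default limit are assumptions
    made for illustration only.
    """
    words = page_text.split()
    snippet = " ".join(words[:limit])
    # Mark the snippet as truncated only when words were actually dropped.
    if len(words) > limit:
        snippet += " ..."
    return snippet
```

A purely syntactic cut like this needs no per-page editorial work, which is the appeal being described; the trade-off is that the opening words of a page may not be its best summary.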

That would be true when you get to the level of pages.

Right now, for instance, we're not on a page, we're on a category,

right?

Right.

But forums and IRC channels and blogs and so on don't have any words anyway, so it doesn't

really matter.

Well, there's the IRC overview page.

Oh, but you didn't have it on the content page we were looking at a moment ago.

No, I didn't.

No, no, the content page was basically sort of giving you a sense of what you're

going to see.

This page is an overview of the J IRC channels.

It's not a lot of information, but based on the page that you actually go to, see what

I'm thinking is if you have, we're on a community page.

If we go to the IRC channels page, that's what you see on the IRC channels page.

So it's just another layer, but again, it's just another layer of category, right?

Heavens, so this is yet more redundancy, right?

Exactly.

Yeah, no, no.

It's yeah.

Yeah.

It's turtles all the way down.

Yeah.

Yeah, I have to say I've become a tad dubious about the presentation.

I think the structure is fine, but I wonder very much about the way that it's presented

to the viewer.

So if we took out, I'm just looking at it.

I guess the thing to question, again, this is the high maintenance area, right?

I'll take your assertion on that.

Yes.

Well, it needs to be created.

This is being created automatically once you've done your curating and your categorization.

This is generated and these are generated based on this.

If you take this out, this ceases to be interesting.

That's good.

Yeah.

But you're dealing all your information from here.

Yes.

So is this enough information?

No, you need more, but I'm saying add more.

So I mean, if hover were an adequate mechanism, you would hover and get the first couple of

sentences from each of the target pages.

Hover is not an adequate mechanism.

So I would argue that the next best thing would be to include content from each of the

target pages with each of these outline headings.

And I'm unclear on whether MediaWiki will let you do that gracefully or not.

But I would try to make this as automatic as possible.

Yeah.

The other advantage of that is it allows you to focus your energy on the curation rather

than creation.

Yes.

Yes, exactly.

Exactly.

Yeah.

It's something that has been back in my mind for a while.

But so thank you for bringing it up because I think whenever it's niggling there, but

the fact you bring it up gives us another chance to look at it.

And again, if you're not willing to, we used to say when making TV, you got to learn to

kill your babies.

If something's not working in a television show, you take it out and it might be something

you love, but if it's wrecking the rest of the show, take it out.

Yeah.

I just think it could all be a lot simpler.

That's all in terms of presentation.

I can't speak for implementation.

The first thing that occurs to me with this table is this is just the name of the page.

Right.

We need more.

And again, if we could just hover, that would be easy.

Well, you give it a longer name.

Yeah, that goes as far as rebuilding the entire wiki though, doesn't it?

Putting new names on it.

It would be rebuilding the tree table.

I mean, it's rebuilding those categories, but if it saves having to generate all this,

I'm sure it's easier to rebuild this part, all these categories than it is to...

You want to say more with each of those outline entries than would fit in a headline.

I think you want more content than that.

I know you do because you've put it in the content area.

I was looking for the page somewhere.

I had a cut and paste page.

There we go.

Those are the categories you'd be changing.

You're suggesting that you might just, they'd still be headlines rather than pieces of prose,

but they'd be more expansive headlines.

Yeah.

Instead of ending this at N, which I use so that if you ended up with a...

The hierarchical was just to make sure these were independent so that you couldn't end

up going to a different tree and having the same name.

These are all independent.

They're completely...

You can't have duplicates the way this is set up.

But if you just took this part and...

Oh, I'd have to go in and edit this.

You can see I've set this up as cut and paste.

And then when you get to the dirty parts, because I make it for cut and paste to make

it easy, I have to go in and change these to nowikis and do all sorts of things with

them.

If I went in here and just said, array language, community...

Well, I'm not sure.

I'd say information, but to me, that's just almost telling you nothing.

But something that relates to what you're going to see in this category.

Well, I wouldn't be doing that for newcomers.

I might just say in this one, new to J and have that show up.

Yeah.

The other thing you could do is the option eight on the Macintosh, the dot.

New to J dot IDE dot debugging dot whatever.

You don't necessarily need to make a single coherent sentence or even phrase.

You could just have points that you wanted to make.

I gotcha.

So that you sort of give the next level down would be a series of words.

The gist.

Yeah.

It might not be exactly the next level down, but it would be enough to let you know what

you would probably get.

Yeah.

And then that would mean when you went back to a page like this, that information would

all show up over here, right?

Yeah, exactly.

Well, I mean, actually, I'm pointing, that's not very helpful.

So the contents go away.

What I think of currently as the content goes away and the outline in the upper right becomes

the body of the page.

Right.

And, and this would be an extension here of these words.

Yes.

Yes, exactly.

And then when you say click down to here, you get more.

Well, in this case, you go to a page.

That's fine.

You could still have additional information about the contents of the page.

It might be a very rich and extensive page that is worth saying a few things about.

Yeah.

Now the challenge there with pages is that to rename a page,

you're going to move the page.

I mean, you're going to basically be doing everything with every page you have.

Yeah, I can't tell you to do that work.

I can't even tell you it would be a good idea.

I don't know.

I'm not sure it would be a good idea, but maybe by the time you get to the page level

and you click on a page, it's going to take, you know, enough.

Yeah.

You know, I'm going to think about that, but I kind of like that.

Because again, it takes the focus away from trying to create stuff and puts it back into

curation, which honestly is

where the work is needed.

It is.

It's what wiki administration should be doing, not creating

pages, but curating what's being put in.

Yeah.

All right.

Just a thought.

That's a good thought.

Thank you.

I'll think about that for a bit before I do anything rash, but oh, yeah, for sure.

I may have something to-- oh, by the way, next week, I won't be available for a meeting.

So we won't be having a meeting next week.

The following week, we'll be back into it again.

Okay.

So two weeks from now, we shall reconvene and you'll see what I've come up with.

Excellent.

Also, we're thinking about the updates to MediaWiki, whether we want to encourage

Chris to upgrade to the current version, because it rearranges a few things that are related

to navigation.

So he made that change.

And there's a new version of the tools sidebar that's on the left for us.

In the new version, I'm seeing it's in a drop down menu.

Oh, okay.

And it has an option to put it in the sidebar.

But even that's a bit different than it used to be.

Okay.

So I'm guessing that relates back to the way Wikipedia have done it.

Because what they put in their sidebars is now headings, basically.

Yeah, there's MediaWiki and Wikimedia.

And MediaWiki is the software and Wikimedia is the site.

So, yeah, they are closely tied together.

Yeah.

Okay, well, I think that wraps us all up.

Thanks everybody for your participation and lots to think about.

I should stop sharing.

Thank you very much, Bob.