Mάθησις, Mathesis

Today I would like to make a proposal: Mathesis, a dependency road map for science.

Imagine that you're a student or researcher trying to learn a new topic. The topic is explained in a paper or a book, but it is not accessible to you because it assumes prerequisites that you haven't covered. Of course, the bibliography of the paper or book can help, but usually it is not very useful. How do you trace it back to the point where you should start reading? And what if you need to start from several different points that converge on the paper you need?

Precisely because of that, we scientists write books and review articles. But textbooks are linear structures, while knowledge is not. A textbook takes you from point A to B, along a certain excursion path. But, more likely than not, only part of it is relevant to your needs. Hopefully, you can reach your desired knowledge by linking paths taken from different books or papers.

This is the very ambitious target of Mathesis: a dependency tree for learning science. This means creating a graph whose nodes are (small) pieces of knowledge, and whose links are the dependency relations among them. Thus, if you want to learn X, you find the node for X. Its outgoing arrows point to the pieces of knowledge on which it depends. You can then trace them back until you reach nodes that correspond to your current knowledge, and proceed from those forwards to X.
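The tracing-back step can be sketched in a few lines of Python. This is only a minimal illustration: the toy graph `DEPENDS_ON` and the function name `study_plan` are made up for the example, and the dependencies listed are not meant as a serious claim about physics teaching.

```python
from graphlib import TopologicalSorter

# Hypothetical toy graph: each topic maps to the topics it depends on.
DEPENDS_ON = {
    "quantum mechanics": ["linear algebra", "complex numbers", "classical mechanics"],
    "linear algebra": ["vectors"],
    "complex numbers": ["real numbers"],
    "classical mechanics": ["calculus", "vectors"],
    "calculus": ["real numbers"],
    "vectors": [],
    "real numbers": [],
}


def study_plan(target, known, depends_on=DEPENDS_ON):
    """Return the unknown topics needed for `target`, prerequisites first."""
    needed = {}
    stack = [target]
    while stack:
        topic = stack.pop()
        if topic in known or topic in needed:
            continue  # stop tracing back at what is already known
        deps = [d for d in depends_on.get(topic, []) if d not in known]
        needed[topic] = deps
        stack.extend(deps)
    # TopologicalSorter lists dependencies before the topics that need them,
    # and raises graphlib.CycleError if the graph contains a loop.
    return list(TopologicalSorter(needed).static_order())


plan = study_plan("quantum mechanics", known={"real numbers", "vectors", "calculus"})
print(plan)  # the target always comes last
```

The order among topics that do not depend on each other is arbitrary; only the constraint "prerequisites first" is guaranteed.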

Each node need not contain a full explanation of the topic. That would mean building a full encyclopaedia of science, which is a meta-ambitious target. No, it should contain some good bibliography, chosen with the dependency structure in mind. Of course, it is much better if this bibliography is free.

This idea closely resembles the Debian repository dependency network, and an attempt to implement it for knowledge has already been made.

So, this is a call for collaboration. We need:

  • Examples. You can try to create the dependency tree for your favourite result. Or the dependency tree needed to understand one of your papers.
  • A standard format for the nodes. Each node should contain, at least, a brief description and a list of the nodes on which it depends. The dependencies might be weighted, with a low number meaning that only the general idea is required and 1 that the topic should be mastered. And, of course, some bibliography.
  • A nice visualization tool, in order to view the parts of the total tree which are relevant to you. Maybe in Java.
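To make the node format a bit more concrete, here is one possible sketch in Python. Nothing here is a settled standard: the class names `Node` and `Dependency`, the field names, and the choice of weights in the interval (0, 1] are all assumptions made for illustration, and the bibliography entry is a placeholder rather than a real reference.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Dependency:
    node_id: str
    # 1.0 means the topic should be mastered; a low number means that
    # only the general idea is required.
    weight: float = 1.0

    def __post_init__(self):
        if not 0.0 < self.weight <= 1.0:
            raise ValueError(f"weight must be in (0, 1], got {self.weight}")


@dataclass
class Node:
    node_id: str
    description: str
    depends_on: list[Dependency] = field(default_factory=list)
    bibliography: list[str] = field(default_factory=list)

    def to_dict(self):
        """Plain dict, ready to be stored as JSON or a similar text format."""
        return asdict(self)


# Illustrative node; the weights are invented for the example.
node = Node(
    node_id="quantum-mechanics",
    description="States, observables and time evolution.",
    depends_on=[
        Dependency("linear-algebra", 1.0),
        Dependency("classical-mechanics", 0.3),
    ],
    bibliography=["(placeholder: a freely available set of lecture notes)"],
)
print(json.dumps(node.to_dict(), indent=2))
```

A plain text serialization like this would make it easy for collaborators to contribute nodes by hand, whatever visualization tool ends up reading them.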

This stems from an idea that I had long ago, in 2004. I created project Euler, in Spanish, with the full text of my high-school maths classes and an associated dependency tree. And I still like the logo I prepared at that time… :)

P.S.: And off topic… can you guess some nice properties of the figure in the logo? ;)


11 thoughts on “Mάθησις, Mathesis”

  1. Javier,

    when it comes to the tech-tree of science, I would try to divide it into a few different parts.

    1. High-school and undergraduate knowledge

    It may be good to:
    – scan Wikipedia for articles about physics/mathematics
    – scan their interlinks
    – for each interlink ask the question “does X require Y?” (perhaps with a graded answer)
    – run a survey which asks random questions from the whole list of possibilities (e.g. people will be asked “Are COMPLEX NUMBERS required to understand QUANTUM MECHANICS?”)
    Then a graph explorer linked to Wikipedia articles could be really useful.

    Also see (a sort of “tech-tree” for theoretical physics, by Gerard 't Hooft, a Nobel Prize winner):
    http://www.staff.science.uu.nl/~hooft101/theorist.html

    2. Papers, research and other highly specialized fields

    First, it is often not a matter of knowledge but of notation, and all one needs to know is:
    – “what this letter means”
    – “are there any hidden assumptions” (or rather: “what are the hidden assumptions”)

    In that case I think better information about the references is needed, i.e. which references are general introductions (textbooks, review papers…).
    There are some attempts, see e.g.
    http://papercube.peterbergstrom.com/
    but unfortunately I know of nothing that really works (though maybe Mendeley (http://www.mendeley.com/, which I recommend wholeheartedly) will try to do it at some point).

    3. Relations between formulas in a single paper/book

    I think it may make reading easier. One just needs to write a LaTeX package that lets you state, inside an equation environment, that an equation requires another equation or a reference (e.g. “\req{eq:definition_x}” or “\req{cite:Xyzikson2008PRL}”).

    BTW, here is one of my papers with a (relatively) large graph of dependencies (or rather, of the structure of the paper/derivation):
    http://www.springerlink.com/content/n57345p035262823/fulltext.pdf
    Fig. 2. (page 661)

    4. Visualization

    When it comes to the visualization, it’s a hard thing to do (I mean: to do it in a neat and useful way). However, there are already many solutions, many open, which may be adopted.

    To visualize graphs there are some tools, e.g.
    http://www.spato.net/
    (but see also http://delicious.com/stared/graph or http://www.delicious.com/stared/visualization for more references)

  2. Nice idea!

    If I understand it right, you will have problems with entangled dependencies: A->B or B->A? You can avoid them by focusing on documents, much like Debian focuses on packages. Two documents may “provide” complex numbers: one requires wave mechanics, the other does not. Or eigensystems: one may require QM, or vibrations, for motivation, while the other does not.

    Are you going to start a webpage or something?

  3. Piotr, I see that you make different types of remarks. About the dependency, you’re right, Wikipedia can be a source. The problem is that in many cases Wikipedia will be “circular”… this is what always happens with dictionaries, right? :)

    About the notational difficulty, I see your point: very often, one cannot read papers from another field just because of the notation. And because of “universally accepted assumptions”. How to help in those cases?

    I’ve seen t’Hooft’s (like that?) webpage, and it’s really fascinating! :) Thanks for the information!

    Hypazia, I think you’re making a very important point here. Maybe the documents should be the nodes, not the topics themselves. This way, there might be different ways to learn a topic, depending on one's previous knowledge. This is also a way out of Piotr’s problem of the language difficulties.

    Hypazia

    • Javi,

      everyone knows that “To understand recursion you need to understand recursion” ;).

      But I guess there will be a lot of hierarchy. Of course some links will go both ways (e.g. WAVE EQUATION and COMPLEX NUMBERS), some topics will not be related at all, and some will clearly be related in one direction only.

      I think larger loops will be rare.

      When it comes to the visualization itself, one will need to make a cutoff, e.g. draw an arrow in one direction only when more than 70% of people say that X requires Y.
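A minimal Python sketch of that cutoff, assuming the survey answers arrive as simple (X, Y, yes/no) records. The function name `edges_from_survey`, the record layout and the sample votes below are all invented for the illustration.

```python
from collections import defaultdict


def edges_from_survey(answers, cutoff=0.7):
    """Keep the arrow X -> Y only if more than `cutoff` of voters say X requires Y.

    `answers` is an iterable of (x, y, says_required) tuples, one per vote.
    Returns a dict mapping (x, y) to the fraction of "yes" votes.
    """
    yes = defaultdict(int)
    total = defaultdict(int)
    for x, y, says_required in answers:
        total[(x, y)] += 1
        if says_required:
            yes[(x, y)] += 1
    return {
        pair: yes[pair] / n
        for pair, n in total.items()
        if yes[pair] / n > cutoff
    }


# Made-up votes, only to exercise the function.
votes = (
    [("quantum mechanics", "complex numbers", True)] * 9
    + [("quantum mechanics", "complex numbers", False)] * 1
    + [("complex numbers", "quantum mechanics", True)] * 1
    + [("complex numbers", "quantum mechanics", False)] * 9
    + [("wave equation", "complex numbers", True)] * 6
    + [("wave equation", "complex numbers", False)] * 4
)
edges = edges_from_survey(votes)
print(edges)
```

With a graded answer instead of yes/no, the same structure would work with an average score in place of the fraction of "yes" votes.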

  4. I like Hypazia’s idea of distinguishing between packages and topics. Different packages (documents) may provide certain pieces of knowledge with different requirements. This would solve the “loops”, because no loops would appear at the document level. And it might also provide “bridges” for the lost-in-translation problems between different fields.
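The package-like picture can be sketched as follows, in Python. This is a rough illustration under my own assumptions, not how Debian actually resolves packages: the `Document` class, the `reading_list` function and the document titles are hypothetical. The idea is to read, step by step, any document whose requirements are already met, and afterwards keep only the documents that the target really needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    provides: frozenset  # topics the document teaches
    requires: frozenset  # topics the reader must already know


def reading_list(target, known, documents):
    """Return document titles to read in order, or None if `target` is unreachable."""
    known = set(known)
    if target in known:
        return []
    source = {}  # topic -> document that first taught it
    progress = True
    while target not in known and progress:
        progress = False
        for doc in documents:
            new = doc.provides - known
            if new and doc.requires <= known:
                for topic in new:
                    source[topic] = doc
                known |= new
                progress = True
    if target not in known:
        return None

    # Walk back from the target, keeping only the documents actually needed.
    order, seen = [], set()

    def visit(topic):
        doc = source.get(topic)
        if doc is None or doc.title in seen:
            return  # topic was known from the start, or document already listed
        seen.add(doc.title)
        for req in doc.requires:
            visit(req)
        order.append(doc.title)

    visit(target)
    return order


# Hypothetical documents: two of them "provide" complex numbers.
docs = [
    Document("Complex numbers via algebra", frozenset({"complex numbers"}), frozenset({"real numbers"})),
    Document("Complex numbers via wave mechanics", frozenset({"complex numbers"}), frozenset({"wave mechanics"})),
    Document("Intro QM notes", frozenset({"quantum mechanics"}), frozenset({"complex numbers", "linear algebra"})),
    Document("Linear algebra primer", frozenset({"linear algebra"}), frozenset({"real numbers"})),
]
print(reading_list("quantum mechanics", {"wave mechanics", "linear algebra"}, docs))
print(reading_list("quantum mechanics", {"real numbers"}, docs))
```

Since a document is only read once its requirements are already known, the resulting list cannot contain a loop, even if the topics themselves depend on each other in a circle.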

  5. I’ve received the proposal to re-baptize the project as μανθάνω, “mantháno”, which means “I learn”. In any case, names are not so important, are they? :)

  6. Hi. What kind of format do you have in mind? It certainly is a very nice idea that would seem suitable for a web interface, with the input fed into a CGI or PHP-run form. The backend would resolve the dependencies, and PHP scripts or the like would generate the contents based on user input, perhaps with a few steps of feedback.
    My suggestion would be to start with a project that is slightly less ambitious in scope but already has the relevant features that the scripts must handle correctly. For instance, the script should be able to tell the very specific (e.g. “bosonization”) from the general (e.g. “solid state physics”), and so on.
    How useful these thoughts can be I don’t know, but I loved your idea and you got me in a brainstorming mode!
    Cheers

  7. Hi, Joigus, sorry for the late reply. Holidays, you know :) I agree with your suggestions about the web interface, although I am myself working on a quick and dirty version using Processing, a language built on top of Java that is much simpler to use. I am so happy I gave you food for thought! :)

