|Title|Wiki Links Through XPATH, including SubPages support
|Date|23-May-2006 22:17:36 EEST
|JSPWiki version|
|Submitter| [DF|DirkFrederickx]
|[Idea Category]|GenericIdea
|[Idea Status]|NewIdea

!! Wiki Link syntax extended with XPATH

See also [XPath]

This page describes an approach to extend the wiki [[link syntax] 
into powerful expressions based on XPATH and JSR-170.
It also aims to present a way to extend JSPWiki 
with support for [WikiSubPages] and [WikiFarms].  (wiki-farms todo)

Example of current syntax :
   [SandBox]                            =>yield link to wiki page
   [SandBox/attach.jpg]                 =>yield link to attachment
Example of extended Wiki Link syntax :
   [SandBox/w:properties/variableX]     =>yield value of wiki variable
   [SandBox/w:pages/SomeSubPage]        =>yield link to a subpage
   [SandBox/w:versions/v123]            =>yield link to wiki page version
   [SandBox/w:versions/v123/attach.jpg] =>yield link to attachment
or with a more compact convenience syntax :
   [SandBox/@variableX]                 =>yield value of wiki variable
   [SandBox/SomeSubPage]                =>yield link to the subpage
   [SandBox/v123/attach.jpg]            =>yield link to attachment

Normally, a wiki link returns a single result: a link to a wiki page or attachment.
As the previous examples show, the extended syntax can also return the value of page __variables__, as well as links to page __versions__ other than the latest one.

Additionally, XPATH expressions can return multiple results, separated by a space. 
   [SandBox/w:to]                       =>yield all referred-to page links
   [SandBox/w:from]                     =>yield all referred-from page links
   [SandBox/w:pages]                    =>yield links to all subpages of SandBox
   [SandBox/w:versions]                 =>yield links to all versions of SandBox
   [SandBox/w:properties]               =>yield all variables (how? name=value)
   [SandBox/w:attachments]              =>yield all attachment links

!! Wiki Metadata XML model

A Wiki Link can be written by means of an [XPath] expression.
To do that, you need to understand the underlying XML model.

The namespace ''w:'' is reserved for the predefined wiki elements.
\\''TODO: check how this will work with interwiki links.''

 <w:pages>
   <aPage1>
     <w:author> ... </w:author>
     <w:pagename> ... </w:pagename>
     <w:pathname> ... </w:pathname>
     <w:created> ... </w:created>
     <w:lastModified> ... </w:lastModified>
     <w:versionNumber> ... </w:versionNumber>
     <w:properties>
       <aProperty1> ... </aProperty1>
       <aProperty2> ... </aProperty2>
     </w:properties>
     <w:attachments>
       <anAttachment1>
         <w:filesize> ... </w:filesize>
         <w:author> ... </w:author>
         <w:filename> ... </w:filename>
         <w:pathname> ... </w:pathname>
         <w:created> ... </w:created>
         <w:lastModified> ... </w:lastModified>
         <w:versionNumber> ... </w:versionNumber>
         <w:versions>
           <v1> ... </v1>
           <v2> ... </v2>
           <v3> ... </v3>
         </w:versions>
       </anAttachment1>
     </w:attachments>
     <w:pages>
       <subPage1> ... </subPage1>
       <subPage2> ... </subPage2>
     </w:pages>
     <w:versions>
       <v1> ... </v1>
       <v2> ... </v2>
       <v3> ... </v3>
     </w:versions>
     <w:to>
       <aPageX1> ... </aPageX1>
       <aPageX2> ... </aPageX2>
     </w:to>
     <w:from>
       <aPageY1> ... </aPageY1>
       <aPageY2> ... </aPageY2>
     </w:from>
     ... <pagecontent - wiki markup text> ...
   </aPage1>
   <aPage2> ... </aPage2>
 </w:pages>

! Overview of predefined elements inside the Wiki Metadata XML model

|| Name            || Description                                   || Type
| <w:pages>        | Set of pages. Used as root element or as a collection of sub-pages | ''pages''
| <w:author>       | Name of the author of a page or attachment     | String
| <w:pagename>     | Page name, including punctuation               | String 
| <w:filename>     | File name of an attachment, including punctuation | String 
| <w:pathname>     | Complete path name, including names of parent pages | String 
| <w:created>      | Creation date of a page or attachment          | Date and Time
| <w:lastModified> | Last modification date of a page or attachment | Date and Time
| <w:versionNumber>| Version number of a page or attachment         | 1..n
| <w:filesize>     | Filesize (number of bytes) of an attachment    | number
| <w:properties>   | Set of properties, or metadata of a page or attachment | ''properties''
| <w:attachments>  | Set of attachments of a page                   | ''attachments''
| <w:versions>     | Set of page or attachment versions             | ''pages'' or ''attachments''
| <w:to>           | Set of pages which are referred to by this page (outgoing links) | ''pages''
| <w:from>         | Set of pages which refer to this page (incoming links) | ''pages''
| <w:parent>       | Parent page                                    | ''page''

! Additional notes to the Wiki Metadata XML Model

The contents of a ''page'' or ''attachment'' always refer to its current state.
The {{<w:versions>}} element provides the set of all past __and current__ versions 
of a page or attachment.

All sibling ''pages'', sibling ''attachments'' and sibling ''properties'' are assumed to be unique. In other words, a single w:pages element cannot contain two pages with the same name. 

The <w:from> and <w:to> elements are ''virtual'', in the sense that they are computed on request rather than being physically present in the document tree.
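
A minimal Python sketch of how such virtual elements might be computed on request: given each page's outgoing links (the w:to set), the incoming links (the w:from set) follow by inverting that mapping. The function name and the sample page names are hypothetical, and this is not how JSPWiki itself tracks references.

```python
def incoming_links(outgoing):
    """Derive w:from (incoming links) from w:to (outgoing links).

    `outgoing` maps a page name to the list of pages it links to.
    Illustrative helper only, not a JSPWiki API.
    """
    incoming = {page: [] for page in outgoing}
    for source, targets in outgoing.items():
        for target in targets:
            # setdefault covers link targets that have no entry of their own
            incoming.setdefault(target, []).append(source)
    return incoming


# Hypothetical link data
outgoing = {
    "SandBox": ["PageA", "PageB"],
    "PageA": ["PageB"],
    "PageB": ["SandBox"],
}
incoming = incoming_links(outgoing)
```

Because the inverse is derived from the outgoing links each time, the two sets cannot drift apart, at the cost of recomputing them per request.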

!! Wiki Link Compact Syntax

A compact convenience syntax is defined for properties, subpages, attachments and versions. 
Also the root-path {{/w:pages/}} can be dropped. This way, the syntax
becomes backwards compatible with the current syntax.
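  /w:pages/SandBox/w:properties/versionLabel
                                           => SandBox/@versionLabel

  /w:pages/SandBox/w:attachments/attach.png
                                           => SandBox/attach.png

  /w:pages/SandBox/w:pages/SomeSubPage/w:properties/someMetaData
                                           => SandBox/SomeSubPage/@someMetaData

  /w:pages/SandBox/w:versions/v127/w:properties/versionLabel
                                           => SandBox/v127/@versionLabel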
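
The expansion from compact to full form could be done mechanically. The Python sketch below is one possible set of rules, all of them assumptions made for illustration: a segment starting with @ is a property, a segment of the form v plus digits is a version, a segment containing a dot is an attachment, and anything else is a subpage. The function name {{expand}} and the {{is_attachment}} hook are hypothetical; a real implementation would consult the repository rather than guess from the name.

```python
import re

VERSION = re.compile(r"v\d+")


def expand(compact, is_attachment=lambda name: "." in name):
    """Expand a compact wiki path into its full w: form (illustrative rules only)."""
    first, *rest = compact.strip("/").split("/")
    parts = ["", "w:pages", first]          # leading "" yields the root slash
    for segment in rest:
        if segment.startswith("@"):
            parts += ["w:properties", segment[1:]]
        elif VERSION.fullmatch(segment):
            parts += ["w:versions", segment]
        elif is_attachment(segment):
            parts += ["w:attachments", segment]
        else:
            parts += ["w:pages", segment]   # assumed to be a subpage
    return "/".join(parts)
```

With these rules a subpage that happens to be named like a version (for example v12) would be misread, which is one reason the full syntax remains available.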

!! XPATH expression

XPATH expressions allow powerful querying of the wiki repository.
(See [XPath] for more details)

Some examples:

Return a list of to-pages having a fruit variable
  [SandBox/w:to[@fruit]]
Return a list of to-pages where the fruit variable equals 'apple'
  [SandBox/w:to[@fruit='apple']]
Use (parentheses) when you need operators such as the union bar ( | ), to avoid ambiguity with the wiki link separator
  [(SandBox/w:to[@fruit='apple'] | SandBox/w:to[@fruit='approved'])]
XPATH even supports string functions such as contains(), which can be used to 
return a list of pages matching a search string.


It should be possible to use a standard XPath processing Java library, as a plugin to the wiki, to support this kind of expression. 
We will probably need to write a back-end that provides the XPath processor with a ''virtual'' XML structure mimicking the internal repository of JSPWiki.
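
To make the idea of a virtual XML structure concrete, here is a small Python sketch using the standard library's ElementTree. Everything in it is an assumption made for illustration: the namespace URI, the sample repository contents and the helper names. It builds a tree of page elements with w:properties and a one-level w:to set, then selects the to-pages that carry a given variable.

```python
import xml.etree.ElementTree as ET

W_URI = "urn:example:wiki-metadata"  # hypothetical URI for the w: prefix
NS = {"w": W_URI}

# Hypothetical repository snapshot: page name -> variables and outgoing links
REPO = {
    "SandBox": {"props": {}, "to": ["PageA", "PageB", "PageC"]},
    "PageA": {"props": {"fruit": "apple"}, "to": []},
    "PageB": {"props": {"fruit": "pear"}, "to": ["SandBox"]},
    "PageC": {"props": {}, "to": []},
}


def w(name):
    """Qualified tag name for a predefined w: element."""
    return "{%s}%s" % (W_URI, name)


def add_page(parent, name, with_links=True):
    """Append one page element, its w:properties and (optionally) its w:to set."""
    page = ET.SubElement(parent, name)
    props = ET.SubElement(page, w("properties"))
    for key, value in REPO[name]["props"].items():
        ET.SubElement(props, key).text = value
    if with_links:
        to = ET.SubElement(page, w("to"))
        for target in REPO[name]["to"]:
            # one level deep only, so circular links cannot recurse forever
            add_page(to, target, with_links=False)
    return page


def build_model():
    """Materialise the whole repository under a root w:pages element."""
    root = ET.Element(w("pages"))
    for name in REPO:
        add_page(root, name)
    return root


def to_pages_with(root, page, prop, value=None):
    """Names of pages linked from `page` that define `prop`, optionally equal to `value`."""
    hits = []
    for target in root.findall("%s/w:to/*" % page, NS):
        found = target.find("w:properties/%s" % prop, NS)
        if found is not None and (value is None or found.text == value):
            hits.append(target.tag)
    return hits


model = build_model()
```

ElementTree only implements a subset of XPath, without the union operator or string functions, so the filtering is done in Python here. A full XPath 1.0 processor would be needed to evaluate the expressions above directly.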

Ref. [JXPATH|http://jakarta.apache.org/commons/jxpath/] Apache library. Check it.

!! Wiki Link Format

With the extended syntax, a wiki link can now return
(i) a hyperlink, (ii) a variable value, (iii) a set of hyperlinks or (iv) a set of values.
Therefore, additional formatting capabilities are needed.

Standard __wiki link format__ syntax allows a static ''format'' text.
  [Play around|Sandbox]
Use the ''@-syntax'' to retrieve the value of page or attachment properties.
  [Speed is @speed|Sandbox]
  [./w:filesize|Sandbox/attach.png]   --todo check this out
Combine property values and links like this (nested brackets):
  [This [.] is ranked at @ranking |Sandbox]

When a wiki link returns multiple results, the format string is applied to each result in turn.

Example: the following expression returns a bullet list of pages, each with its @liveVersion value 
and a link to that page.
  [* ./w:pagename has version @liveVersion, here is the [link|.] |Sandbox/w:to]
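
A rough Python sketch of how a format string could be applied to each result. The placeholder handling is an assumption based only on the examples on this page: ./w:pagename becomes the page name, @name becomes the value of that variable (empty if unset), and a dot used as link target, as in [[link|.] or [[.], becomes the page itself. The function names and the shape of the result records are hypothetical.

```python
import re

# Placeholders recognised by this sketch: ./w:pagename, @variable, [text|.] and [.]
TOKEN = re.compile(r"\./w:pagename|@(\w+)|\|\.\]|\[\.\]")


def render_one(fmt, page):
    """Fill the placeholders of `fmt` for a single result page."""
    def substitute(match):
        token = match.group(0)
        if token == "./w:pagename":
            return page["pagename"]
        if token == "|.]":
            return "|" + page["pagename"] + "]"
        if token == "[.]":
            return "[" + page["pagename"] + "]"
        # remaining case: @variable, looked up in the page's properties
        return page["properties"].get(match.group(1), "")
    return TOKEN.sub(substitute, fmt)


def render(fmt, results):
    """Apply the format string to every result, one output line per result."""
    return "\n".join(render_one(fmt, page) for page in results)
```

The output is still wiki markup, so it would be passed through the normal wiki renderer afterwards to turn the generated links and bullets into HTML.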

You can use other page variables to replace more complex format strings.
  [{SET format='* ./w:pagename has version @liveVersion, here is the [link|.]' }]
  [${format} |wiki-path] 
You can do the same with a format string from another page.
Here is an example which returns a tabular format.
By putting the format string in a separate variable, you avoid the need
to escape the vertical bar ( | ) which has a separate meaning inside a wiki link.
Obviously, you could also use the tilde ( ~ ) to escape the bars. 
  [{SET tableformat='| ./w:pagename | @liveVersion | [link|.]' }]
  || PageName     || LiveVersion   || Link
  [ ${tableformat} | wiki-path] 


Please log your remarks/suggestions here

Yup.  Great idea.  Would go together excellently with the idea of using JSR-170 as a backend repository...

-- JanneJalkanen


agree, it's a very good idea to use the namespace abbreviation in order to identify xpath queries inlined in the content.

some smaller questions towards a structure free of redundancy:

*would you mind specifying a property more than once in order to form a set of values under one name?
**looks like this is consistent with your use of set-returns in your xpath results
*I wonder how you would keep the ''to''s and ''from''s consistent over a wiki repository
*I wonder what's the difference between w:to and any other property set - other than it's used more often
*I would recommend to have one page element for each page version, as you did. I wonder why you need all the versions specified in one page, though.
*As a result here's [Rolfs slightly improved version]

--[rsc|mailto:rolf@august.de], 24-May-2006

;:Thx for the improved version -- definitely contains good ideas for improvement. I've updated the body of the doc, adding some stuff I found was missing. 
** I assumed that a wiki variable can only appear once on a page, so no multiple occurrences 
** the to's and from's are computed at run time, not physically present in the tree
** all the elements starting with w: are predefined wiki-syntax elements; this makes it possible to distinguish them from user-defined metadata
** the ''versions'' grouping allows to retrieve a list of all versions of a page -- such a query would mimic the page-info section of jspwiki


Great idea.
However, I see a usability drawback: it would add yet another way to build content or lists out of other wiki pages.
As far as I know, we have:
* [Tasks Plugin], although the name is misleading, is an elaborate query engine to build tables out of several source pages.
** Pros: outputs ''contents'' of source pages (not only links to source pages), filters using arbitrary criteria on the contents themselves
** cons: non-standard plugin, and slightly constrains the source pages structure, constrains the result format (table).
* [Query Plugin], again misnamed, builds a list of link references, based on references only (TOs and FROMs). This is in effect an elaborate form of the [Referring Pages Plugin], which lists only the FROMs.
** Pros: performance (supposedly), as it queries the ReferenceManager, and does not have to traverse page contents.
** Cons: non-standard plugin, outputs only links (no content), forces link structure on source pages, no content filtering
* IdeaWikiLinksThroughXPATHIncludingSubPagesSupport, the mechanism suggested on this very page.
** Pros: configurable formatting, filter criteria based on links ''and'' metadata contents (variables, attachments, versions), may end up into JSPWiki core as Janne seems interested :o)
** Cons: no filtering on raw text contents (other than variables or attachments), but maybe I misunderstood your examples, no filtering on page names (as of the examples, although it can surely be included for cheap)

Overall, merging the 3 ideas into a single mechanism, preferably included in JSPWiki core, would make for a powerful wiki query engine, enabling usage of [Wiki as database].

-- [JDuprez]

;: Indeed, that is exactly the idea: trying to define a not too complex syntax, so we can remove the need for several query-like plugins. However, I think xpath may still not be powerful enough ;-)  BTW, I would also categorize [InsertVersionPlugin] under overlapping plugins with this syntax. Maybe there are even more... --[DF|DirkFrederickx]


!Maybe we could structure the effort in __tasks__:
#define invariant Metadata, to be input/edited by the user
##define page properties, names are chosen by the user
##there are evolving core properties, like ''subpage''
#define a set of default dynamic Metadata for every page, representing its current state
##page URLs that link to the page
##page URLs that are linked to from the page
##available (other) versions of the page
\\this is how I understood JDuprez
\\I agree with DF: all Metadata should be subject to selection/query, regardless of static or dynamic
!I notice a difference in the understanding of what a __subpage__ is:
*Is a subpage of a page the same as any other page that the page links to?
*Is a subpage a page that exists only in conjunction with its parent?
*What about links in circles? Can a page be its own subpage?
*I would recommend a distinction (maybe it's defined elsewhere, too, please point me to that place)
**a subpage is something closely related to its parents
**a subpage could belong to more than one parent
**a subpage will be deleted in case all its parents are deleted (UML: composition)
**you can search through subpages and omit all loosely linked pages (__this is the point__)
**a page may be shown tabbed together with its first level subpages
**you may have a plugin offering navigation thru the tree of subpages
!__xpath__ has several advantages we should not overlook
*it's more fashionable than SQL
*it's better thought out than a home-grown solution can ever be
*I cannot see a limit
**the only thing you have to solve is ''insert dynamic Metadata somewhere in the pages properties at runtime''\\seems not to be a hard thing to do
\\The only disadvantage I can see is that xpath might not be easy to learn, but that's a question of [a good tutorial|http://www.zvon.org/xxl/XPathTutorial/General/examples.html], as easy questions are easy to construct, even in xpath.
!DF: single property values are the same concept as multiple ones
if you agree on the convention that the first value is returned in case only a single one is requested

Multiple values for properties -- or the same property name more than once (this is just syntax) -- are able to cover the concept of subpages, too.

--[Rolf Schumacher|mailto:rolf@august.de], 25-May-2006