|Title|Wiki Links Through XPATH, including SubPages support
|Date|23-May-2006 22:17:36 EEST
|JSPWiki version|
|Submitter| [DF|DirkFrederickx]
|[Idea Category]|GenericIdea
|Reference|
|[Idea Status]|NewIdea

%%tabbedSection
%%tab-Proposal
!! Wiki Link syntax extended with XPATH
See also [XPath]

This page describes an approach to extend the wiki [[link syntax] 
with powerful expressions based on XPATH and JSR-170[1].
It also aims to present an approach to extend JSPWiki 
with support for [SubPages] and [WikiFarms].  (wiki-farms todo)

Example of current syntax :
{{{
   [SandBox]                              =>yield link to wiki page
   [SandBox/attach.jpg]                   =>yield link to attachment
}}}
Example of extended Wiki Link syntax :
{{{
   [SandBox/w:properties/variableX]       =>yield value of wiki variable
   [SandBox/w:pages/SomeSubPage]          =>yield link to a subpage
   [SandBox/w:versions/w:v123]            =>yield link to wiki page version
   [SandBox/w:versions/w:v123/attach.jpg] =>yield link to attachment
}}}
or with a more compact convenience syntax :
{{{
   [SandBox/@variableX]                   =>yield value of wiki variable
   [SandBox/SomeSubPage]                  =>yield link to the subpage
   [SandBox/w:v123/attach.jpg]            =>yield link to attachment
}}}

Normally, the wiki link syntax returns a single result: a link to a wiki page or attachment.
From the previous examples, you can see that it can also return the value of page __variables__, as well as links to page __versions__ other than the latest one.

Additionally, XPATH expressions allow a link to return multiple results, separated by a space.
{{{
   [SandBox/w:to]                         =>yield all referred-to page links
   [SandBox/w:from]                       =>yield all referred-from page links
   [SandBox/w:pages]                      =>yield links to all subpages of SandBox
   [SandBox/w:versions]                   =>yield links to all versions of SandBox
   [SandBox/w:properties]                 =>yield all variables (how? name=value)
   [SandBox/w:attachments]                =>yield all attachment links
}}}


!! Wiki Metadata XML model

A Wiki Link can be written by means of an [XPath] expression.
In order to do that, you need to understand the underlying XML model.

The namespace ''w:'' is reserved for the predefined wiki elements.
\\''TODO: check how this will work with interwiki links.''

{{{
 <w:pages>
   <aPage1>
     <w:properties>
       <w:name> ... </w:name>
       <w:path> ... </w:path>
       <w:length> ... </w:length>
       <w:author> ... </w:author>
       <w:created> ... </w:created>
       <w:lastModified> ... </w:lastModified>
       <w:versionNumber> ... </w:versionNumber>
       <aProperty1> ... </aProperty1>
       <aProperty2> ... </aProperty2>
     </w:properties>
     <w:attachments>
       <anAttachment1>
         <w:properties>
           <w:name> ... </w:name>
           <w:path> ... </w:path>
           <w:length> ... </w:length>
           <w:author> ... </w:author>
           <w:created> ... </w:created>
           <w:lastModified> ... </w:lastModified>
           <w:versionNumber> ... </w:versionNumber>
         </w:properties>
         <w:versions>
           <w:v1> ... </w:v1>
           <w:v2> ... </w:v2>
           <w:v3> ... </w:v3>
         </w:versions>
       </anAttachment1>
     </w:attachments>
     <w:pages>
       <subPage1> ... </subPage1>
       <subPage2> ... </subPage2>
     </w:pages>
     <w:versions>
       <w:v1> ... </w:v1>
       <w:v2> ... </w:v2>
       <w:v3> ... </w:v3>
     </w:versions>
     <w:to>
       <aPageX1> ... </aPageX1>
       <aPageX2> ... </aPageX2>
     </w:to>
     <w:from>
       <aPageY1> ... </aPageY1>
       <aPageY2> ... </aPageY2>
     </w:from>
     ... <pagecontent - wiki markup text> ...
   </aPage1>
   <aPage2> ... </aPage2>
 </w:pages>
}}}
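
As a rough illustration of how this model could be queried, the sketch below builds a tiny instance of the tree and reads it with the limited XPath support of Python's standard {{xml.etree.ElementTree}} module. The namespace URI {{urn:example:jspwiki}}, the sample page and its values are assumptions made for this example only; the proposal does not define them.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the w: prefix; the proposal does not fix one.
NS = {"w": "urn:example:jspwiki"}

SAMPLE = """
<w:pages xmlns:w="urn:example:jspwiki">
  <SandBox>
    <w:properties>
      <w:name>Sand Box</w:name>
      <variableX>42</variableX>
    </w:properties>
    <w:pages><SomeSubPage/></w:pages>
    <w:versions><w:v1/><w:v2/></w:versions>
  </SandBox>
</w:pages>
"""

root = ET.fromstring(SAMPLE)  # the root element is <w:pages>


def local_names(elements):
    """Strip the '{uri}' prefix ElementTree puts in front of namespaced tags."""
    return [e.tag.rsplit("}", 1)[-1] for e in elements]


# [SandBox/w:properties/variableX] => value of a wiki variable
variable_x = root.find("SandBox/w:properties/variableX", NS).text

# [SandBox/w:pages] => all subpages of SandBox
subpages = local_names(root.findall("SandBox/w:pages/*", NS))

# [SandBox/w:versions] => all versions of SandBox
versions = local_names(root.findall("SandBox/w:versions/*", NS))
```

Because the root element is already {{<w:pages>}}, the paths start at the page name, which matches the compact form of the link syntax.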

! Overview of predefined elements inside the Wiki Metadata XML model

|| Child Node      || Description                                   || Type
| <w:pages>        | Set of pages. Used as root element or as a collection of sub-pages | ''page'' nodes
| <w:properties>   | Set of properties, or metadata of a page or attachment | ''properties''
| <w:attachments>  | Set of attachments                             | ''attachment'' nodes
| <w:versions>     | Set of page or attachment versions             | ''page'' or ''attachment'' nodes
| <w:to>           | Set of pages which are referred to by this page (outgoing links) | ''page'' nodes
| <w:from>         | Set of pages which refer to this page (incoming links) | ''page'' nodes
|| Property Node   || Description                                   || Type
| <w:name>         | Page or Attachment name, including punctuation | String 
| <w:path>         | Complete path name, including names of parent pages | String 
| <w:author>       | Name of the author of a page or attachment     | String
| <w:length>       | Length (number of bytes) of a page or attachment | Number
| <w:created>      | Creation date of a page or attachment          | Date and Time
| <w:lastModified> | Last modification date of a page or attachment | Date and Time
| <w:versionNumber>| Version number of a page or attachment         | 1..n

! Additional notes to the Wiki Metadata XML Model

* __Wiki Names__\\
  Page or Attachment nodes use a wiki-name without punctuation or blanks.
  The {{<w:name>}} property contains the full name, including punctuation and blanks.

* __Versions node__ \\
  Page or Attachment nodes always refer to their current state. 
  The {{<w:versions>}} node contains references to all past __and current__ versions. 
  Each version gets a unique sequence number property like this: {{w:v1}}, {{w:v2}}, {{w:v%%sub xx%%}}.

* __Same Name Siblings__ (JSR-170[1], chap 4.3) \\
  All nodes inside {{<w:pages>}}, {{<w:attachments>}} or {{<w:properties>}} must have a unique name. 
  In other words, no two pages inside a single {{<w:pages>}} node can share 
  the same name; likewise, all properties of a page have unique names, etc.
  (multi-value properties need further investigation)

* __Virtual nodes__ \\
  The {{<w:from>}} and {{<w:to>}} nodes are ''virtual'' in the sense that they are computed 
  on request, rather than being physically present in the document tree.
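
A minimal sketch of how such virtual nodes might be computed on request. The in-memory link table {{LINKS}} and the function names are hypothetical; a real implementation would presumably ask the wiki's reference tracking instead.

```python
# Hypothetical link table: page name -> set of pages it links to.
LINKS = {
    "Main": {"SandBox", "About"},
    "SandBox": {"Main"},
    "About": set(),
}


def w_to(links, page):
    """Outgoing links of a page, i.e. the children of its virtual <w:to> node."""
    return sorted(links.get(page, ()))


def w_from(links, page):
    """Incoming links of a page, i.e. the children of its virtual <w:from> node.

    Nothing is stored per page: the answer is derived from the outgoing
    links of all other pages each time it is requested.
    """
    return sorted(src for src, targets in links.items() if page in targets)
```

Deriving {{w:from}} from the outgoing links keeps the two views consistent by construction, at the cost of a scan per request.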

!! Wiki Link Compact Syntax

A compact convenience syntax is defined for properties, subpages, attachments and versions. 
Also the root-path {{/w:pages/}} can be dropped. This way, the syntax
becomes backwards compatible with the current syntax.
{{{
   /w:pages/SandBox/w:properties/versionLabel
                                           => SandBox/@versionLabel

   /w:pages/SandBox/w:attachments/attach.png
                                           => SandBox/attach.png

   /w:pages/SandBox/w:pages/SomeSubPage/w:properties/someMetaData
                                           => SandBox/SomeSubPage/@someMetaData

   /w:pages/SandBox/w:versions/w:v127/w:properties/versionLabel
                                           => SandBox/w:v127/@versionLabel
}}}
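
The mapping above could be implemented along these lines. This is only a sketch under stated assumptions: the function names are hypothetical, and the rule that a name containing a dot is an attachment (rather than a subpage) is a guess made for the example; the proposal does not say how the two are told apart.

```python
import re

# Collection nodes of the metadata model; a segment that follows one of
# them is already in explicit form and is kept as-is.
CONTAINERS = {"w:pages", "w:attachments", "w:properties",
              "w:versions", "w:to", "w:from"}
VERSION = re.compile(r"^w:v\d+$")


def looks_like_attachment(name):
    """Assumed heuristic: names with a file extension are attachments."""
    return "." in name


def expand(compact, is_attachment=looks_like_attachment):
    """Expand a compact wiki link path into the full /w:pages/... form."""
    root = "/w:pages/"
    if compact.startswith(root):  # already rooted: drop the prefix first
        compact = compact[len(root):]
    parts = ["", "w:pages"]
    for seg in (s for s in compact.split("/") if s):
        if parts[-1] in CONTAINERS or seg in CONTAINERS:
            parts.append(seg)                        # explicit form
        elif seg.startswith("@"):
            parts += ["w:properties", seg[1:]]       # @name => property
        elif VERSION.match(seg):
            parts += ["w:versions", seg]             # w:vNNN => version
        elif is_attachment(seg):
            parts += ["w:attachments", seg]
        else:
            parts += ["w:pages", seg]                # anything else => subpage
    return "/".join(parts)
```

A path that is already in full form passes through unchanged, so both notations can be fed to the same resolver.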

!! XPATH expression

XPATH expressions allow powerful querying of the wiki repository.
(See [XPath] for more details)

Some examples:

Return a list of to-pages having a fruit variable
{{{
  [SandBox/w:to[@fruit]]
}}}
Return a list of to-pages where the fruit variable equals 'apple'
{{{
  [SandBox/w:to[@fruit='apple']]
}}}                   
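
The two filters above could be evaluated roughly as follows. The sketch assumes that {{@fruit}} stands for {{w:properties/fruit}} on each referred-to page (as in the compact syntax), and it uses a hypothetical namespace URI and sample data. The predicate is checked in plain Python because {{xml.etree.ElementTree}} supports only a small XPath subset.

```python
import xml.etree.ElementTree as ET

NS = {"w": "urn:example:jspwiki"}  # hypothetical namespace URI

SAMPLE = """
<w:pages xmlns:w="urn:example:jspwiki">
  <SandBox>
    <w:to>
      <PageA><w:properties><fruit>apple</fruit></w:properties></PageA>
      <PageB><w:properties><fruit>pear</fruit></w:properties></PageB>
      <PageC><w:properties/></PageC>
    </w:to>
  </SandBox>
</w:pages>
"""


def to_pages(root, page, variable, value=None):
    """Names of the to-pages of `page` that define `variable`.

    With `value` given, only pages whose variable equals it are returned.
    """
    hits = []
    for target in root.findall(f"{page}/w:to/*", NS):
        prop = target.find(f"w:properties/{variable}", NS)
        if prop is None:
            continue  # this page has no such variable
        if value is None or (prop.text or "").strip() == value:
            hits.append(target.tag)
    return hits


root = ET.fromstring(SAMPLE)
has_fruit = to_pages(root, "SandBox", "fruit")            # [@fruit]
is_apple = to_pages(root, "SandBox", "fruit", "apple")    # [@fruit='apple']
```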
Use (parentheses) when you need operators such as the union operator, to avoid syntax ambiguity
{{{
  [(SandBox/w:to[@fruit='apple'] | SandBox/w:to[@fruit='approved'])]
}}}                     
XPATH even supports string functions. The following examples return a 
list of pages that match the search string
{{{
  [SandBox/w:to[contains(text(),'jsp')]]
  [SandBox/w:from[starts-with(text(),'Description')]]
}}}
You can use local wiki page variables inside xpath expressions too, prefix them with a $. 
{{{
  [{SET node='Main' }]

  [$node/w:to]
}}}
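
A small sketch of the variable substitution step, which would run before the expression is handed to the XPath processor. The function name and the choice to fail on unknown variables are assumptions made for the example.

```python
import re

# A variable reference is a $ followed by a name made of word characters.
VARIABLE = re.compile(r"\$(\w+)")


def substitute(expression, variables):
    """Replace each $name in a wiki link expression by the page variable."""
    def lookup(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined wiki variable: {name}")
        return variables[name]

    return VARIABLE.sub(lookup, expression)


# Page variables as they would be after [{SET node='Main' }]
page_variables = {"node": "Main"}
resolved = substitute("$node/w:to", page_variables)
```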

!Implementation

It should be possible to use a standard Java XPath processing library, as a plugin to the wiki, to support this kind of expression. 
We probably need to write a back-end that provides the XPath processor with a ''virtual'' XML structure mimicking the internal repository of JSPWiki.

Reference: the Apache [JXPATH|http://jakarta.apache.org/commons/jxpath/] library (still to be checked).


!! Wiki Link Format

With the extended syntax, a wiki link can now return
(i) a hyperlink, (ii) a variable value, (iii) a set of hyperlinks or (iv) a set of values.
Therefore, additional formatting capabilities are needed.

Standard __wiki link format__ syntax allows a static ''format'' text.
{{{
  [Play around|Sandbox]
}}}
Use the ''@-syntax'' to retrieve the value of page or attachment properties.
{{{
  [Speed is @speed|Sandbox]
  [@w:length|Sandbox/attach.png]
}}}
Combine property values and links like this (nested brackets):
{{{
  [This [.] is ranked at @ranking |Sandbox]
}}}

In case the wiki link returns multiple results, the format string is iterated over each result.

Example: the following expression returns a bullet list of pages with the value of @liveVersion 
and a link to that page as well.
{{{
  [* @w:name has version @liveVersion, here is the [link|.] |Sandbox/w:to]
}}}
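
The iteration of a format string over multiple results might look like the sketch below. The representation of a result as a dictionary of properties, and the function name, are assumptions for the example; the output is wiki markup, one line per result.

```python
import re

# @name or @w:name inside a format string refers to a property of the result.
PROPERTY = re.compile(r"@((?:w:)?\w+)")


def render(fmt, results):
    """Apply a wiki link format string to every result of a wiki link."""
    lines = []
    for page in results:
        # Replace each @property by its value (empty if the page lacks it).
        line = PROPERTY.sub(lambda m: str(page.get(m.group(1), "")), fmt)
        # A dot as link target stands for the current result.
        line = line.replace("|.]", "|" + page["w:name"] + "]")
        line = line.replace("[.]", "[" + page["w:name"] + "]")
        lines.append(line)
    return "\n".join(lines)


# Hypothetical result set of [... |Sandbox/w:to]
to_pages = [
    {"w:name": "PageA", "liveVersion": "3"},
    {"w:name": "PageB", "liveVersion": "7"},
]
markup = render("* @w:name has version @liveVersion, here is the [link|.]", to_pages)
```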

You can use other page variables to replace more complex format strings.
{{{
  [{SET format='* @w:name has version @liveVersion, here is the [link|.]' }]
  [${format} |wiki-path] 
}}}
You can do the same with a format string from another page:
{{{
  [[SandBox/@format]|wiki-path] 
}}}
Here is an example which returns a tabular format.
By putting the format string in a separate variable, you avoid the need
to escape the vertical bar ( | ) which has a separate meaning inside a wiki link.
Obviously, you could also use the tilde ( ~ ) to escape the bars. 
{{{
  [{SET tableformat='| @w:name | @liveVersion | [link|.]' }]
  || PageName     || LiveVersion   || Link
  [ ${tableformat} | wiki-path] 
}}}

----

[#1] JSR-170 : see [http://www.jcp.org/en/jsr/detail?id=170]

%%
%%tab-Discussion

Please log your remarks/suggestions here

Yup.  Great idea.  Would go together excellently with the idea of using JSR-170 as a backend repository...

-- JanneJalkanen

----

agree, it's a very good idea to use the namespace abbreviation in order to identify xpath queries inlined in the content.

some smaller questions towards a structure free of redundancy:

*would you mind specifying a property more than once in order to form a set of values under one name?
**looks like this is consistent with your use of set-returns in your xpath results
*I wonder how you would keep the ''to''s and ''from''s consistent over a wiki repository
*I wonder what's the difference between w:to and any other property set - other than it's used more often
*I would recommend to have one page element for each page version, as you did. I wonder why you need all the versions specified in one page, though.
*As a result here's [Rolfs slightly improved version]

--[rsc|mailto:rolf@august.de], 24-May-2006


;:Thx for the improved version -- definitely contains good ideas for improvement. I've updated the body of the doc, adding some stuff I found was missing. 
** I assumed that a wiki variable can only appear once on a page, so no multiple occurrences 
** the to's and from's are computed at run time, not physically present in the tree
** all the elements starting with w: are predefined wiki-syntax elements; this makes it possible to differentiate them from user defined metadata
** the ''versions'' grouping allows to retrieve a list of all versions of a page -- such a query would mimic the page-info section of jspwiki
--[DF|DirkFrederickx]

----

Great idea.
However, I see one drawback in terms of usage: it would add yet another way to build content or lists out of other Wiki pages.
As far as I know, we have:
* [Tasks Plugin], although the name is misleading, is an elaborate query engine to build tables out of several source pages.
** Pros: outputs ''contents'' of source pages (not only links to source pages), filters using arbitrary criteria on the contents itself
** Cons: non-standard plugin, and slightly constrains the source pages structure, constrains the result format (table).
* [Query Plugin], again misnamed, builds a list of link references, based on references only (TOs and FROMs). This is somehow an elaborate form of the [Referring Pages Plugin] (which lists only the FROMs).
** Pros: performance (supposedly), as it queries the ReferenceManager, and does not have to traverse page contents.
** Cons: non-standard plugin, outputs only links (no content), forces link structure on source pages, no content filtering
* IdeaWikiLinksThroughXPATHIncludingSubPagesSupport, the mechanism suggested on this very page.
** Pros: configurable formatting, filter criteria based on links ''and'' metadata contents (variables, attachments, versions), may end up into JSPWiki core as Janne seems interested :o)
** Cons: no filtering on raw text contents (other than variables or attachments), but maybe I misunderstood your examples, no filtering on page names (as of the examples, although it can surely be included for cheap)

Overall, merging the 3 ideas within a single mechanism, preferably included in JSPWiki core, would make up a powerful Wiki query engine, enabling usage of [Wiki as database].

-- [JDuprez]

;: Indeed, that is exactly the idea: Trying to define a not too complex syntax, so we can replace the need of several query-kind-of-plugins. However, I think xpath may still not be powerful enough ;-)  BTW, I would also categorize [InsertVersionPlugin] under overlapping plugins with this syntax. Maybe there are even more... --[DF|DirkFrederickx]

----

!Maybe we could structure the effort in __tasks__:
#define invariant Metadata, to be input/edited by the user
##define page properties where a property name is chosen by the user 
##there are an evolving set of static core properties a user might choose and assign values to, like ''subpage''
###core properties will be equipped with a defined meaning, implemented e.g. by a plugin
##...
#define a set of default dynamic Metadata to every page representing some actual state
##page URLs that link to the page
##page URLs that are linked to from the page
##available (other) versions to the page
##...
\\this way I understood JDuprez
\\as DF said: all Metadata should be subject to selection/query, regardless of static or dynamic
!I noticed a gap in the definition of __[subpage|SubPages]__:
*Is a subpage to a page the same as another one that the page links to? ''No.''
*Is a subpage a page that exists only in conjunction with its parent? ''Yes.''
*What about links in circles? Can a page be its own subpage? Maybe. ''Doesn't make sense.''
*I see a distinction (see [SubPages] and links from there)
**a subpage is text closely related to its parents
**a subpage could belong to more than one parent (''no embedding, no file system directory, but a static link, separate implementation from concept, any page provider should do'')
**a subpage will be deleted in case all its parents are deleted (UML: composition)
**you can search thru subpages, omit all loosely linked pages (__this is the point__)
**a page may be shown tabbed together with its first level subpages
**you may have a plugin offering navigation thru the tree of subpages
!__xpath__ has several advantages we should not overlook
*it's more fashionable than SQL
*it's better thought out than anything home-built can ever be
*I cannot see a limit
**the only thing you got to solve is ''insert dynamic Metadata somewhere in the pages properties at runtime''\\seems not to be a hard thing to do
\\To name some
\\A disadvantage I'm aware of: xpath might not be easy to learn, but that's a question of [a good tutorial |http://www.zvon.org/xxl/XPathTutorial/General/examples.html] as easy questions are easy to construct, even in xpath.
!single property values are the same concept as multiple ones
if you agree on the convention that the first value is returned when only a single one is requested

However, multiple values for properties -- or the same property name more than once (this is just syntax) -- are able to cover the concept of subpages, too.


--[Rolf Schumacher|mailto:rolf@august.de], 25-May-2006
%%
%%