|Title|Wiki Links Through XPATH, including SubPages support
|Date|23-May-2006 22:17:36 EEST
|JSPWiki version|
|Submitter| [DF|DirkFrederickx]
|[Idea Category]|GenericIdea
|[Idea Status]|NewIdea

This idea started as an XPATH extension of the page link syntax of JSPWiki.
%%(border-left:4px solid silver; padding-left: 1.5em; margin-top:-1em; margin-bottom:1em;)
It has been extended with new ideas of page organising concepts.

!!! Introduction : extend Wiki Link syntax with XPATH
See also [XPath]

This page describes an approach for extending the wiki [[link syntax]
with powerful expressions based on XPATH and JSR-170[1].

Example of current syntax :
   [SandBox]                              =>yield link to wiki page
   [SandBox/attach.jpg]                   =>yield link to attachment
Example of extended Wiki Link syntax :
   [SandBox/w:properties/variableX]       =>yield value of wiki variable
   [SandBox/w:pages/SomeSubPage]          =>yield link to a subpage
   [SandBox/w:versions/w:v123]            =>yield link to wiki page version
   [SandBox/w:versions/w:v123/attach.jpg] =>yield link to attachment
or with a more compact convenience syntax :
   [SandBox/@variableX]                   =>yield value of wiki variable
   [SandBox/SomeSubPage]                  =>yield link to the subpage
   [SandBox/w:v123/attach.jpg]            =>yield link to attachment

Normally, the wiki link syntax returns a single result, being a link to a wiki page or attachment.
From the previous examples, you can see that it is also possible to return the value of page __variables__ as well as links to page __versions__ different from the latest version.

Additionally, XPATH expressions can return multiple results, separated by a space.
   [SandBox/w:to]                         =>yield all referred-to page links
   [SandBox/w:from]                       =>yield all referred-from page links
   [SandBox/w:pages]                      =>yield links to all subpages of SandBox
   [SandBox/w:versions]                   =>yield links to all versions of SandBox
   [SandBox/w:properties]                 =>yield all variables (how? name=value)
   [SandBox/w:attachments]                =>yield all attachment links

%%(border-left:4px solid silver; padding-left: 1.5em;)

!!! Organisational Concepts 

The extended [[page link] syntax would also support further JSPWiki extensions for more advanced ''organisational concepts'' such as 
[WikiCategories|IdeaWikiLinksThroughXPATHIncludingSubPagesSupport#WikiCategories] and ~WikiTags.
See also ref.[2]

!! ~WikiFarm

URL Syntax:  {{{http://baseURL/...}}}

A ''~WikiFarm'' is a grouping of ~WikiWebs, running on the same wiki engine, behind the same ''baseURL''.
It provides for a common administration platform for a set of wikis.
All ~WikiWebs of a ~WikiFarm share the same ''user database'' and ''authentication'' policy.

The ~WikiFarm ''~SiteMap'' page provides an overview of all its ~WikiWebs.
  [/w:wikiwebs]       => renders a list of all WikiWebs' homepages in this WikiFarm

A ~WikiFarm provides overall administrative settings for properties such as:\\
baseURL, encoding, searchProvider, plugin.*, interWikiRef.* , rss.*, log4j.*
authorizer, userdatabase.*, groupManager, aclManager, 
(new) defaultwikiweb

!! ~WikiWeb

URL Syntax:  {{{http://baseURL/wikiwebnamespace:wikipagename}}}

A ~WikiWeb allows ~WikiPages to be grouped into separate namespaces.

This is especially useful when organising large chunks of information. 
It also improves ease of administration. \\ 
However, separate namespaces may also decrease usability 
as users must remember the ~WikiWeb name prior to the ~WikiPage name.
Preferably, the name of the current ~WikiWeb should always be visible in the GUI.

The ''current ~WikiWeb'' defines the scope of several wiki functions such as
[PageIndex], [RecentChanges], [UnusedPages], [UndefinedPages] and [FindPage].
An extra ''wiki-web'' option should be added to [FindPage] (and other plugins)
to allow searches in sibling ~WikiWebs.
A ~WikiWeb may want to provide its own global pages such as 
[LeftMenu], [LeftMenuFooter], [CopyrightNotice], [EditPageHelp], etc. [3]

A ~WikiWeb may specify a dedicated location of its page repository and have a dedicated security policy.

A ~WikiWeb provides administrative settings via a GUI for properties such as:\\
%%strike frontPage%% homePage, templateDir, translatorReader.*, pageProvider, attachmentProvider, diffProvider,
(new) wikiWebNamespace, wikiWebSkin, wikiWebSecurityPolicy

Possibly, these properties could take a prefix other than {{jspwiki.}}
  jspwiki.pageProvider=...   => default pageProvider
  zoo.pageProvider=...       => pageProvider for the 'zoo' WikiWeb 
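
The fallback from a ~WikiWeb-specific prefix to the default {{jspwiki.}} prefix could look roughly like the Python sketch below. The function name {{lookup_property}} and the provider values are hypothetical and only illustrate the idea; this is not existing JSPWiki behaviour.

```python
def lookup_property(props, wikiweb, key, default_prefix="jspwiki"):
    """Return the WikiWeb-specific value of `key`, else the default one."""
    specific = f"{wikiweb}.{key}"
    if specific in props:
        return props[specific]
    # Fall back to the farm-wide default (None when it is not set either).
    return props.get(f"{default_prefix}.{key}")


# Placeholder values, invented for this sketch.
props = {
    "jspwiki.pageProvider": "DefaultProvider",
    "zoo.pageProvider": "ZooProvider",
}

zoo_provider = lookup_property(props, "zoo", "pageProvider")
other_provider = lookup_property(props, "farm", "pageProvider")
```

With this rule a ~WikiWeb only needs to list the properties in which it differs from the default.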

  [zoo:]              => links to the home page of zoo:
  [zoo:Main]          => link to the top Main page of zoo:
  [zoo:*//Main]       => links to all Main pages of zoo:
  [Main]              => links to zoo:Main when zoo: is the current wiki-web

Q. Is the term ''~WikiWeb'' clear enough? Other suggestions: ''wiki-site'', ...

!! ~WikiSubPages

URL Syntax:  {{{http://baseURL/wikiwebnamespace:wikiparentpage/wikisubpage/wikisubsubpage}}}

~WikiPages can act as a container for other ~WikiSubPages as well as attachments.
Containment implies that e.g. renaming or deleting a ''parent'' ~WikiPage 
also impacts all its child pages and attachments.

~WikiSubPages inherit the ACLs of their ancestor pages. Child pages can add (=restrict) ACLs 
to those of the ancestors but never relax them. (parental control ;-)
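
A minimal Python sketch of the "restrict but never relax" rule, under the assumption that an ACL can be modelled as the set of principals allowed on a page. The effective ACL of a page is then the intersection of all ACLs found on the path from the root down to that page. The function name and the sample data are hypothetical.

```python
def effective_acl(page_path, acls):
    """Intersect the ACLs found along the path from the root to the page.

    `acls` maps a page path to the set of principals allowed on that page;
    a page without an entry adds no restriction of its own.
    Returns None when no ancestor (nor the page itself) defines an ACL.
    """
    allowed = None  # None means "no restriction seen so far"
    parts = page_path.strip("/").split("/")
    for depth in range(1, len(parts) + 1):
        ancestor = "/" + "/".join(parts[:depth])
        if ancestor in acls:
            if allowed is None:
                allowed = set(acls[ancestor])
            else:
                allowed &= acls[ancestor]
    return allowed


acls = {
    "/Zoo": {"alice", "bob", "carol"},
    "/Zoo/Birds/Stork": {"alice", "dave"},  # "dave" is not allowed on /Zoo
}

stork_acl = effective_acl("/Zoo/Birds/Stork", acls)
```

Because the sets are intersected, listing an extra principal on a child page (such as {{dave}} above) has no effect: the child cannot relax what an ancestor already restricts.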

  [Birds/Stork]         => child pages use / as delimiter
  [/Zoo/Birds/Stork]    => absolute link 
                           (start with / to denote the root of the current WikiWeb)
See also [Greedy Page Links|IdeaWikiLinksThroughXPATHIncludingSubPagesSupport#GreedyPageLinks] to keep your links short and backward compatible with the current JSPWiki links.

!! More organising concepts

While ~WikiFarms, ~WikiWebs and ~WikiSubPages implement purely hierarchical organising concepts,
richer grouping concepts are possible through 
[WikiCategories|IdeaWikiLinksThroughXPATHIncludingSubPagesSupport#WikiCategories] and ~WikiTags.

! ~WikiCategories

~WikiPages can be ''categorised'' into one or more ~WikiCategories. See also [WikiCategory]

In order to mark a page for inclusion in a special category, just link it to the ''category page''.
Category links cannot be distinguished from ordinary page links.
Typically the categorised pages link back to the category pages via ''Back To...'' or ''More Info...'' or ''See also ...'' links.
The category page may contain a textual description of the grouping concept. 
It may use the [[{ReferredPagesPlugin}] to automatically list all categorised pages.

Q. Can a category page categorise pages of multiple ~WikiWebs? I would tend to say yes,
although the [[{ReferredPagesPlugin}] would need an extra parameter to search for links 
outside the current ~WikiWeb.

! ~WikiTags

~WikiPages can be tagged with one or more ~WikiTags. 

In order to mark a page with a tag, you will add some ''metadata'' to the page.
Typically, GUI support is available for easy page tagging 
(select a tag from the existing set of tags of the current ~WikiWeb) 
and to facilitate page searches based on certain tags.

  [/w:tags/*]          =>list all tags in this WikiWeb
  [*[@tag='food']]     =>list all pages tagged with 'food'

!! Greedy Page Links

The concept of ''greedy page links'' will keep your page links short and provide backward compatibility 
with the current style of page-linking.

When resolving a [[page-link], the wiki-engine will first try to match ''any'' of the 
children of the ''current page''. 
When no match is found, the wiki-engine will continue its search from the root of the ''current ~WikiWeb''.
Use XPATH expressions to target more specific sets of pages. [4]

With the ''greedy page link algorithm'', a single [[page-link] can now result in one or more clickable page links.
By default, the generated links are separated by a space. See [Formatting Wiki Links|IdeaWikiLinksThroughXPATHIncludingSubPagesSupport#FormattingWikiLinks] to change this formatting.

Example of the ''greedy'' page-link algorithm :
|| Current Page     || Wiki markup... || Links to...
| /Zoo              | [[Food]         | /Zoo/Birds/flamingo/Food\\/Zoo/Birds/Stork/Food\\/Zoo/Birds/Stork/Summer/Food
| /Zoo/Birds        | [[Food]         | /Zoo/Birds/flamingo/Food\\/Zoo/Birds/Stork/Food\\/Zoo/Birds/Stork/Summer/Food
| /Zoo/Birds/Stork  | [[Food]         | /Zoo/Birds/Stork/Food\\/Zoo/Birds/Stork/Summer/Food
| /Zoo/Birds/Stork/Summer  | [[Food]  | /Zoo/Birds/Stork/Summer/Food
| /Zoo/Birds        | [[Main]         | /Main  ([[Birds] has no ''Main'' child page)
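
The rows of this table can be reproduced with a short Python sketch. It works on a plain list of page paths instead of a real repository; the function name {{greedy_resolve}} and the {{PAGES}} list are hypothetical and only mirror the Zoo example.

```python
def greedy_resolve(current_page, name, all_pages):
    """Resolve a bare page name the 'greedy' way.

    First look for pages called `name` below `current_page` (like .//name);
    if there are none, search the whole WikiWeb (like //name).
    """
    def named(prefix):
        return sorted(
            p for p in all_pages
            if p.startswith(prefix) and p.rsplit("/", 1)[-1] == name
        )

    below_current = named(current_page.rstrip("/") + "/")
    return below_current or named("/")


# Page tree of the Zoo example, as absolute paths.
PAGES = [
    "/Main",
    "/Zoo",
    "/Zoo/Birds",
    "/Zoo/Birds/flamingo",
    "/Zoo/Birds/flamingo/Food",
    "/Zoo/Birds/Stork",
    "/Zoo/Birds/Stork/Food",
    "/Zoo/Birds/Stork/Summer",
    "/Zoo/Birds/Stork/Summer/Food",
]

from_stork = greedy_resolve("/Zoo/Birds/Stork", "Food", PAGES)
from_birds = greedy_resolve("/Zoo/Birds", "Main", PAGES)
```

The sketch returns the matches in sorted order; the order in which a real implementation would list them is left open here.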

Example of XPATH expressions :
|| Current Page     || Wiki markup... || Links to...
| /Zoo/Birds/Stork  | [[Summer/Food]  | /Zoo/Birds/Stork/Summer/Food  
| /Zoo/Birds/Stork  | [[./Food]       | /Zoo/Birds/Stork/Food  
| /Zoo/Birds/Stork  | [[.//Food]      | /Zoo/Birds/Stork/Food\\/Zoo/Birds/Stork/Summer/Food   
| /Zoo/Birds/Stork/Summer | [[../Food]  | /Zoo/Birds/Stork/Food   
| /Zoo/Birds/Stork/Summer | [[..//Food] | /Zoo/Birds/Stork/Food\\/Zoo/Birds/Stork/Summer/Food   
| /Zoo/Birds/Stork  | [[../*/Food]    | /Zoo/Birds/flamingo/Food\\/Zoo/Birds/Stork/Food   
| any page          | [[/Zoo/Birds]   | /Zoo/Birds
| any page          | [[/Zoo/Birds/*/Food] | /Zoo/Birds/flamingo/Food\\/Zoo/Birds/Stork/Food

Q. What is the default format of a link? 
Is it the absolute path, such as {{/Zoo/Birds/Stork/Summer/Food}}, 
or the relative (and shorter) path, such as {{Stork/Summer/Food}} or {{./Stork/Summer/Food}}?

!!! Wiki Metadata XML model

A Wiki Link can be written by means of an [XPath] expression.
To do that, you need to understand the underlying XML model.

The namespace ''w:'' is reserved for the predefined wiki elements.

       <w:name> ... </w:name>
       <w:path> ... </w:path>
       <w:length> ... </w:length>
       <w:author> ... </w:author>
       <w:created> ... </w:created>
       <w:lastModified> ... </w:lastModified>
       <w:versionNumber> ... </w:versionNumber>
       <aProperty1> ... </aProperty1>
       <aProperty2> ... </aProperty2>
           <w:name> ... </w:name>
           <w:path> ... </w:path>
           <w:length> ... </w:length>
           <w:author> ... </w:author>
           <w:created> ... </w:created>
           <w:lastModified> ... </w:lastModified>
           <w:versionNumber> ... </w:versionNumber>
           <w:v1> ... </w:v1>
           <w:v2> ... </w:v2>
           <w:v3> ... </w:v3>
       <subPage1> ... </subPage1>
       <subPage2> ... </subPage2>
       <w:v1> ... </w:v1>
       <w:v2> ... </w:v2>
       <w:v3> ... </w:v3>
       <aPageX1> ... </aPageX1>
       <aPageX2> ... </aPageX2>
       <aPageY1> ... </aPageY1>
       <aPageY2> ... </aPageY2>
     ... <pagecontent - wiki markup text> ...
   <aPage2> ... </aPage2>
   <wikiweb1:homePage> ... </wikiweb1:homePage>
   <wikiweb2:homePage> ... </wikiweb2:homePage>

!! Overview of predefined elements inside the Wiki Metadata XML model

|| Child Node      || Description                                   || Type
| <w:pages>        | Set of pages. Used as root element or as a collection of sub-pages | ''page'' nodes
| <w:properties>   | Set of properties, or metadata of a page or attachment | ''properties''
| <w:attachments>  | Set of attachments                             | ''attachment'' nodes
| <w:versions>     | Set of page or attachment versions             | ''page'' or ''attachment'' nodes
| <w:to>           | Set of pages referred to by this page (outgoing links) | ''page'' nodes
| <w:from>         | Set of pages referring to this page (incoming links) | ''page'' nodes
| <w:wikiwebs>     | Set of pages referring to the root pages of all wikiwebs in this wikifarm | ''page'' nodes
|| Property Node   || Description                                   || Type
| <w:name>         | Page or attachment name, including punctuation | String 
| <w:path>         | Complete path name, including names of parent pages | String 
| <w:author>       | Name of the author of a page or attachment     | String
| <w:length>       | Length (number of bytes) of a page or attachment | Number
| <w:created>      | Creation date of a page or attachment          | Date and Time
| <w:lastModified> | Last modification date of a page or attachment | Date and Time
| <w:versionNumber>| Version number of a page or attachment         | 1..n

! Additional notes to the Wiki Metadata XML Model

* __Wiki Names__\\
  Page or Attachment nodes use a wiki-name without punctuation or blanks.
  The {{<w:name>}} property contains the full name, including punctuation and blanks.

* __Versions node__ \\
  Page or Attachment nodes always refer to their current state. 
  The {{<w:versions>}} node contains references to all past __and current__ versions. 
  Each version gets a unique sequence number property like this: {{w:v1}}, {{w:v2}}, {{w:v%%sub xx%%}}.

* __Same Name Siblings__ (JSR-170[1], chap 4.3) \\
  All nodes inside {{<w:pages>}}, {{<w:attachments>}} or {{<w:properties>}} must have a unique name. 
  In other words, there cannot be two pages with the same name 
  inside a single {{<w:pages>}} node; all page properties have unique names etc.
  (multi-value properties need further investigation)

* __Virtual nodes__ \\
  The {{<w:from>}} and {{<w:to>}} nodes are ''virtual'' in the sense that they are computed 
  on request, rather than being physically present in the document tree.
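
To illustrate the ''virtual nodes'' note: if the outgoing links of every page are known, the incoming links can be derived on request by inverting that mapping, so the two can never get out of sync. The Python sketch below assumes the outgoing links are available as a simple dictionary; the function name {{compute_from}} and the sample data are hypothetical.

```python
from collections import defaultdict


def compute_from(outgoing):
    """Invert a page -> outgoing-links mapping into page -> incoming links."""
    incoming = defaultdict(set)
    for source, targets in outgoing.items():
        for target in targets:
            incoming[target].add(source)
    return dict(incoming)


# w:to as it might be extracted from the page markup (sample data).
W_TO = {
    "SandBox": {"Main", "Zoo"},
    "Zoo": {"Main"},
}

# w:from is never stored, only derived from w:to.
W_FROM = compute_from(W_TO)
```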

!! Wiki Link Compact Syntax

A compact convenience syntax is defined for properties, subpages, attachments and versions. 
Also the root-path {{/w:pages/}} can be dropped. This way, the syntax
becomes backwards compatible with the current syntax.
                                           => SandBox/@versionLabel

                                           => SandBox/attach.png

                                           => SandBox/SomeSubPage/@someMetaData

                                           => SandBox/w:v127/@versionLabel
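
A rough Python sketch of how the compact forms could be expanded, assuming only the two rewrites shown in the introduction: {{@name}} stands for {{w:properties/name}} and {{w:vN}} stands for {{w:versions/w:vN}}. Subpage and attachment segments are left untouched, because telling them apart needs a lookup in the repository. The function name {{expand_compact}} is hypothetical.

```python
import re

# A version segment such as w:v123
VERSION = re.compile(r"^w:v\d+$")


def expand_compact(link):
    """Expand the compact forms @name and w:vN into their longer equivalents."""
    expanded = []
    for segment in link.split("/"):
        if segment.startswith("@"):
            expanded += ["w:properties", segment[1:]]
        elif VERSION.match(segment):
            # Do not add w:versions twice if the link already spells it out.
            if not expanded or expanded[-1] != "w:versions":
                expanded.append("w:versions")
            expanded.append(segment)
        else:
            expanded.append(segment)
    return "/".join(expanded)


variable_link = expand_compact("SandBox/@variableX")
version_link = expand_compact("SandBox/w:v123/attach.jpg")
```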

!! XPATH expression

XPATH expressions allow powerful querying of the wiki repository.
(See [XPath] for more details)

Some examples:

Return a list of to-pages having a fruit variable
Return a list of to-pages where the fruit variable equals 'apple'
Use (brackets) when you need logical operators (avoid syntax ambiguity)
  [(SandBox/w:to[@fruit='apple'] | SandBox/w:to[@fruit='approved'])]
XPATH even supports string functions. Following example returns a 
list of pages with the search string matched
You can use local wiki page variables inside xpath expressions too, prefix them with a $. 
  [{SET node='Main' }]
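
The first two queries can be tried out with Python's {{xml.etree.ElementTree}}, which implements only a subset of XPath. The sketch rests on several assumptions that are not part of the proposal: the namespace URI is made up, page variables are exposed as XML attributes so that {{@fruit}} works directly, the predicate is applied to the children of {{w:to}}, and the small XML document is written by hand instead of being generated from a repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the predefined w: elements.
NS = {"w": "urn:example:wiki"}

# Hand-written stand-in for the virtual XML view of one page.
# Assumption: page variables are exposed as XML attributes.
DOC = """
<SandBox xmlns:w="urn:example:wiki">
  <w:to>
    <Apples fruit="apple"/>
    <Pears fruit="pear"/>
    <About/>
  </w:to>
</SandBox>
"""

sandbox = ET.fromstring(DOC)

# to-pages having a fruit variable
has_fruit = [p.tag for p in sandbox.findall("w:to/*[@fruit]", NS)]

# to-pages where the fruit variable equals 'apple'
apples = [p.tag for p in sandbox.findall("w:to/*[@fruit='apple']", NS)]
```

A full XPath processor would also cover the union operator and string functions, which {{ElementTree}} does not support.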


! Implementation

It should be possible to use a standard XPath processing Java library as a plugin to the wiki to support this kind of expression. 
Probably, we need to write a back-end that provides the XPath processor with a ''virtual'' XML structure mimicking the internal repository of JSPWiki.

Ref. [JXPATH|http://jakarta.apache.org/commons/jxpath/] Apache library. Check it.

!!! Formatting Wiki Links

With the extended syntax, a wiki link can now return
(i) a hyperlink, (ii) a variable value, (iii) a set of hyperlinks or (iv) a set of values.
Therefore, additional formatting capabilities are needed.

Standard __wiki link format__ syntax allows a static ''format'' text.
  [Play around|Sandbox]
Use the ''@-syntax'' to retrieve the value of page or attachment properties.
  [Speed is @speed|Sandbox]
Combine property values and links like this (nested brackets):
  [This [.] is ranked at @ranking |Sandbox]

In case the wiki link returns multiple results, the format string is iterated over each result.

Example: the following expression returns a bullet list of pages with the value of @liveVersion 
and a link to that page as well.
  [* @w:name has version @liveVersion, here is the [link|.] |Sandbox/w:to]
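
The iteration of a format string over several results could work roughly like the Python sketch below. It assumes each result is available as a dictionary of property values and only handles the {{@name}} placeholders; nested links are not handled. The function name {{format_results}} and the sample values are hypothetical.

```python
import re


def format_results(fmt, results):
    """Apply `fmt` once per result, replacing each @name with its value.

    Placeholders without a matching property are left as they are.
    """
    def render(result):
        return re.sub(
            r"@([\w:]+)",
            lambda m: str(result.get(m.group(1), m.group(0))),
            fmt,
        )

    return [render(r) for r in results]


# Sample results, as a query might return them.
results = [
    {"w:name": "Main", "liveVersion": 12},
    {"w:name": "Zoo", "liveVersion": 3},
]

lines = format_results("* @w:name has version @liveVersion", results)
```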

You can use other page variables to replace more complex format strings.
  [{SET format='* @w:name has version @liveVersion, here is the [link|.]' }]
  [${format} |wiki-path] 
You can do the same with a format string from another page:
Here is an example which returns a tabular format.
By putting the format string in a separate variable, you avoid the need
to escape the vertical bar ( | ) which has a separate meaning inside a wiki link.
Obviously, you could also use the tilde ( ~ ) to escape the bars. 
  [{SET tableformat='| @w:name | @liveVersion | [link|.]' }]
  || PageName     || LiveVersion   || Link
  [ ${tableformat} | wiki-path] 

%%(border-left:4px solid silver; padding-left: 1.5em;)
!!! Footnotes

[#1] JSR-170 : see [http://www.jcp.org/en/jsr/detail?id=170]

[#2] Inspired by lots of stuff on the jspwiki mailing list, also on [http://twiki.org/cgi-bin/view/Codev/OrganizingPrinciples]

[#3] ~WikiWebs may define their own global pages such as [[LeftMenu] etc. 
By default, a ~WikiWeb will ''inherit'' these pages from the default ~WikiWeb at creation.\\
Example contents of [[LeftMenu] of a newly created ~WikiWeb: 
  [{InsertPage page='<default-wiki-web>:LeftMenu'}]

[#4] [The greedy page link|IdeaWikiLinksThroughXPATHIncludingSubPagesSupport#Greedy page link] algorithm actually executes the following XPATH expressions:
* If not-empty, return {{[[.//pagename]}} : matches all subpages of the current page
* Otherwise return {{[[//pagename]}} : matches all subpages of the root page of the current ~WikiWeb


Please log your remarks/suggestions here

Yup.  Great idea.  Would go together excellent with the idea of using JSR-170 as a backend repository...

-- JanneJalkanen


agree, it's a very good idea to use the namespace abbreviation in order to identify xpath queries inlined in the content.

some smaller questions towards a structure free of redundancy:

*do you mind to specify a property more than once in order to form a set of values under one name?
**looks like this is consistent with your use of set-returns in your xpath results
*I wonder how you would keep the ''to''s and ''from''s consistent over a wiki repository
*I wonder what's the difference between w:to and any other property set - other than it's used more often
*I would recommend to have one page element for each page version, as you did. I wonder why you need all the versions specified in one page, though.
*As a result here's [Rolfs slightly improved version]

--[rsc|mailto:rolf@august.de], 24-May-2006

;:Thx for the improved version -- definitely contains good ideas for improvement. I've updated the body of the doc, adding some stuff I found was missing. 
** I assumed that a wiki variable can only appear once on a page, so no multiple occurrences 
** the to's and from's are computed at run time, not physically present in the tree
** all the elements starting with w: are predefined wiki-syntax elements; this allows to differentiate them from user defined metadata
** the ''versions'' grouping allows to retrieve a list of all versions of a page -- such a query would mimic the page-info section of jspwiki


Great idea.
However, I see one drawback in terms of usability: it would add yet another way to build content or lists out of other Wiki pages.
As far as I know, we have:
* [Tasks Plugin], although the name is misleading, is an elaborate query engine to build tables out of several source pages.
** Pros: outputs ''contents'' of source pages (not only links to source pages), filters using arbitrary criteria on the contents itself
** cons: non-standard plugin, and slightly constrains the source pages structure, constrains the result format (table).
* [Query Plugin], again misnamed, builds a list of link references, based on references only (TOs and FROMs). This is a somewhat elaborate form of the [Referring Pages Plugin] (which lists only the FROMs).
** Pros: performance (supposedly), as it queries the ReferenceManager, and does not have to traverse page contents.
** Cons: non-standard plugin, outputs only links (no content), forces link structure on source pages, no content filtering
* IdeaWikiLinksThroughXPATHIncludingSubPagesSupport, the mechanism suggested on this very page.
** Pros: configurable formatting, filter criteria based on links ''and'' metadata contents (variables, attachments, versions), may end up into JSPWiki core as Janne seems interested :o)
** Cons: no filtering on raw text contents (other than variables or attachments), but maybe I misunderstood your examples, no filtering on page names (as of the examples, although it can surely be included for cheap)

Overall, merging the 3 ideas within a single mechanism, preferably included in JSPWiki core, would make up a powerful Wiki query engine, enabling usage of [Wiki as database].

-- [JDuprez]

;: Indeed, that is exactly the idea: Trying to define a not too complex syntax, so we can replace the need of several query-kind-of-plugins. However, I think xpath may still not be powerful enough ;-)  BTW, I would also categorize [InsertVersionPlugin] under overlapping plugins with this syntax. Maybe there are even more... --[DF|DirkFrederickx]


!May be we could structure the effort in __tasks__:
#define invariant Metadata, to be input/edited by the user
##define page properties where a property name is chosen by the user 
##there are an evolving set of static core properties a user might choose and assign values to, like ''subpage''
###core properties will be equipped with a defined meaning, implemented e.g. by a plugin
#define a set of default dynamic Metadata to every page representing some actual state
##page URLs that links to the page
##page URLs that are linked to from the page
##available (other) versions to the page
\\this way I understood JDuprez
\\as DF said: all Metadata should be subject to selection/query, regardless of static or dynamic
!I noticed a gap in the definition of __[subpage|SubPages]__:
*Is a subpage to a page the same as another one that the page links to? ''No.''
*Is a subpage a page that exists only in conjunction with its parent? ''Yes.''
*What about links in circles? Can a page be its own subpage? May be. ''Doesn't make sense.''
*I see a distinction (see [SubPages] and links from there)
**a subpage is text closely related to its parents
**a subpage could belong to more than one parent (''no embedding, no file system directory, but a static link, separate implementation from concept, any page provider should do'')
**a subpage will be deleted in case all its parents are deleted (UML: composition)
**you can search thru subpages, omit all loosely linked pages (__this is the point__)
**a page may be shown tabbed together with its first level subpages
**you may have a plugin offering navigation thru the tree of subpages
!__xpath__ has several advantages we should not overlook
*it's more fashionable than SQL
*it's more well thought than an own-build can ever be
*I cannot see a limit
**the only thing you got to solve is ''insert dynamic Metadata somewhere in the pages properties at runtime''\\seems not to be a hard thing to do
\\To name some
\\A disadvantage I'm aware of: xpath might not be easy to learn, but that's a question of [a good tutorial |http://www.zvon.org/xxl/XPathTutorial/General/examples.html] as easy questions are easy to construct, even in xpath.
!single property values are the same concept as multiple ones
if you agree to the convention that the first is returned in case only a single is requested

However, multiple values for properties -- or the same property name more than once (this is just syntax) -- are able to cover the concept of subpages, too.

--[Rolf Schumacher|mailto:rolf@august.de], 25-May-2006

