Proposal to normalize wikinames for 3.0:

Generic comments:

  • Turn the current list of allowed characters into a user setting (taking care of Murray's concerns)
  • Upon save, disallowed characters create an error instead of being silently ignored
  • Make the fallback lookup of a [free link] under its CamelCased page name a user setting (off by default for new wikis). This provides backwards compatibility by resolving [this is a link] to ThisIsALink.

When a freelink is encountered on a page (below, wikipath means the FQN of the page; in 2.x this is exactly the WikiName; in 3.0 it can also contain subpage+space information):

  • Link is parsed to its three elements: text, wikipath, parameters
  • Whitespace in front of and after the individual wikipath components (separated by / or :) is removed. Any excess whitespace is collapsed.
  • Each component is checked against the allowed character list; if disallowed characters are detected, link parsing stops and a warning condition is raised
  • Each component is lowercased with String.toLowerCase() to make all page name comparisons case-insensitive.
  • The resulting wikipath is turned into a JCR path (and any characters allowed by JSPWiki but not allowed by JCR spec are escaped)
  • The JCR path is then passed to the Repository to check whether the page exists
  • If the user setting so dictates and the wikipath is not found, it is CamelCased using the current TextUtil.wikifyLink() and the lookup is tried again.
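
The lookup steps above can be sketched roughly as follows. This is only an illustration, not JSPWiki code: the class FreeLinkSketch, its methods, the in-memory set standing in for the Repository, the default space "Main" and the "/pages" root are all assumptions (the last two are borrowed from the example further down the page). Escaping of characters that JCR forbids is left out, and the CamelCase fallback assumes that old CamelCased pages are also stored under a lowercased path, so the retry reduces to dropping the spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Sketch only: every name here is hypothetical. */
final class FreeLinkSketch {
    // Assumption: the Wikipedia-style list suggested in the summary below.
    static final String DISALLOWED = "#<>|{}";

    /** Splits the inside of "[text|wikipath|parameters]" into three slots. */
    static String[] parseLink(String body) {
        String[] parts = body.split("\\|", 3);
        String text = parts[0].trim();
        String path = parts.length > 1 ? parts[1].trim() : text; // [link] names itself
        String params = parts.length > 2 ? parts[2].trim() : "";
        return new String[] { text, path, params };
    }

    /** Trims, collapses whitespace, validates and lowercases each component. */
    static List<String> normalize(String wikiPath) {
        List<String> out = new ArrayList<>();
        for (String raw : wikiPath.split("[:/]")) {
            String c = raw.trim().replaceAll("\\s+", " ");
            for (int i = 0; i < c.length(); i++) {
                if (DISALLOWED.indexOf(c.charAt(i)) >= 0) {
                    throw new IllegalArgumentException("disallowed character in '" + c + "'");
                }
            }
            if (!c.isEmpty()) {
                out.add(c.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    /** Assumed mapping: "Space:Page/Sub" becomes "/pages/space/page/sub". */
    static String toJcrPath(String wikiPath) {
        String full = wikiPath.contains(":") ? wikiPath : "Main:" + wikiPath;
        return "/pages/" + String.join("/", normalize(full));
    }

    /** Returns the path of the existing page, or null if nothing matches. */
    static String resolve(String linkBody, Set<String> existingPaths, boolean tryCamelCase) {
        String path = toJcrPath(parseLink(linkBody)[1]);
        if (existingPaths.contains(path)) {
            return path;
        }
        if (tryCamelCase) {
            // Once lowercased, a CamelCased name is the same name without spaces.
            String squeezed = path.replace(" ", "");
            if (existingPaths.contains(squeezed)) {
                return squeezed;
            }
        }
        return null;
    }
}
```

With the fallback enabled, resolve("this is a link", ...) finds a page stored as "/pages/main/thisisalink"; with it disabled, the same call returns null and the link would be rendered as a link to a missing page.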

When a page is created, the following process takes place:

  • The wikiPath components are stripped of leading and trailing whitespace. Any excess whitespace is collapsed.
  • Each component is checked against the allowed character list
  • A "wiki:title" property is set to correspond to the typography of the resulting name of the page (=last component in the wikipath)
  • Each component is lowercased with String.toLowerCase()
  • The wikipath is then turned into a JCR path and the content (including properties) is saved
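
A minimal sketch of the creation steps, assuming a fully qualified wikipath such as "Main:Foo Bar" and using a plain map in place of the JCR repository. The class name, the "wiki:content" property name and the "/pages" root are hypothetical; only "wiki:title" comes from the proposal. The point it tries to show is the ordering: the title is captured after whitespace cleanup but before lowercasing.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch only: a map stands in for the repository. */
final class PageCreationSketch {
    // Assumption: the disallowed list proposed in the summary below.
    static final String DISALLOWED = "#<>|{}";

    /** JCR path mapped to the node's properties. */
    final Map<String, Map<String, String>> nodes = new LinkedHashMap<>();

    /** Creates a page and returns the path it was stored under. */
    String createPage(String wikiPath, String content) {
        StringBuilder path = new StringBuilder("/pages");
        String title = "";
        for (String raw : wikiPath.split("[:/]")) {
            String c = raw.trim().replaceAll("\\s+", " ");
            if (c.isEmpty()) {
                throw new IllegalArgumentException("empty path component");
            }
            if (c.chars().anyMatch(ch -> DISALLOWED.indexOf(ch) >= 0)) {
                throw new IllegalArgumentException("disallowed character in '" + c + "'");
            }
            title = c; // last component wins, captured before lowercasing
            path.append('/').append(c.toLowerCase(Locale.ROOT));
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("wiki:title", title);
        props.put("wiki:content", content); // hypothetical property name
        nodes.put(path.toString(), props);
        return path.toString();
    }
}
```

Creating "Main:  Foo   Bar " this way stores the node at "/pages/main/foo bar" with wiki:title set to "Foo Bar".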

When a page title is rendered, the following process takes place:

  • The "page" parameter is parsed into a wikipath
  • The wikipath components are stripped of extra whitespace
  • Each component is checked for illegal characters
  • The JCR path is formed
  • The Node is fetched, and WikiPage created. The title of the page is from the "wiki:title" property.

When a page is renamed:

  • The proper WikiPage object is located.
  • If the rename process would result in a different JCR path, the page is moved
  • In any case, the new title is written to the value of the wiki:title property
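
The rename rule can be illustrated with a small sketch. It assumes top-level pages only (no subpages) and uses a map from JCR path to wiki:title as a stand-in for the repository; RenameSketch, pathFor and rename are hypothetical names. A rename that only changes capitalization leaves the node in place and updates the title; any other rename moves the node as well.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch only: JCR path mapped to wiki:title, top-level pages only. */
final class RenameSketch {
    final Map<String, String> titles = new HashMap<>();

    /** Assumed path layout: "/pages/<space>/<page>", all lowercase. */
    static String pathFor(String space, String title) {
        return "/pages/" + space.toLowerCase(Locale.ROOT) + "/" + title.toLowerCase(Locale.ROOT);
    }

    /** Renames a page; returns true if the node had to be moved. */
    boolean rename(String space, String oldTitle, String newTitle) {
        String oldPath = pathFor(space, oldTitle);
        if (!titles.containsKey(oldPath)) {
            throw new IllegalArgumentException("no such page: " + oldPath);
        }
        String cleaned = newTitle.trim().replaceAll("\\s+", " ");
        String newPath = pathFor(space, cleaned);
        boolean moved = !newPath.equals(oldPath);
        if (moved) {
            if (titles.containsKey(newPath)) {
                throw new IllegalStateException("target already exists: " + newPath);
            }
            titles.remove(oldPath); // the "move"
        }
        titles.put(newPath, cleaned); // the title is rewritten in every case
        return moved;
    }
}
```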

Yes, this means that a page title and its JCR path will be subtly different, but the wiki:title property keeps the representation and the path keeps the organization.

E.g. "?page=Foo%20bar" => WikiName = "Main:Foo bar" => JCR path = "/pages/main/foo bar" => wiki:title = "Foo bar".
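
The same chain, traced in code as a rough sketch. It assumes the request parameter is URL-decoded first (java.net.URLDecoder also turns "+" into a space), that a name without a space prefix lands in "Main", and that the title is what would be stored at creation time. ExampleChain and trace are hypothetical names.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Sketch only: traces one "page" parameter through the naming steps. */
final class ExampleChain {
    /** Returns { WikiName, JCR path, wiki:title } for a raw "page" parameter. */
    static String[] trace(String pageParam) {
        String decoded = URLDecoder.decode(pageParam, StandardCharsets.UTF_8);
        String name = decoded.trim().replaceAll("\\s+", " ");
        // Assumption: pages without an explicit space belong to "Main".
        String wikiName = name.contains(":") ? name : "Main:" + name;
        String[] spaceAndPage = wikiName.split(":", 2);
        String jcrPath = "/pages/" + spaceAndPage[0].toLowerCase(Locale.ROOT)
                + "/" + spaceAndPage[1].toLowerCase(Locale.ROOT);
        String title = spaceAndPage[1]; // original case is kept for display
        return new String[] { wikiName, jcrPath, title };
    }
}
```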


Summary/Paraphrase of above?

  • A new property tryCamelCase=true|false controls if a request for "Test Name", "Test name", "Test+Name", "Test+name", "Test%20Name" or "Test%20name" looks for "TestName" in the repository.
  • A new property tryBeautified=true|false controls if a request for "TestName" gets broken into "Test Name" (and then further normalized to "Test name").
  • A new property illegalCharacters defaults to the same list as Wikipedia, "#<>|{}" (or alternatively allowedCharacters? not sure which sense is best.)
  • Any illegal characters in a path-component at page creation cause an error.
  • Any illegal characters in a [free link|Some:path] at time of rendering are silently dropped and a view link to the normalized name (if already existing) is rendered, else render "specially" as a "bad link name"?
  • Each path-component gets trimmed.
  • All internal whitespace of each path-component is collapsed to a single space.
    • e.g. (spaceName:topPageName\subPageName\attachmentName) has four components.
    • What about links to headings? (spaceName:\topPageName\subPageName#headingName) don't spaces get mutated into '-' characters currently?
  • There may be a difference between the name and title for a page:
    1. Names are all lowercase
    2. Titles maintain the original case of the name as given at time of creation.
      • Is this an invariant? (name.equalsIgnoreCase(title) == true)
  • Titles are a maintained property, separate from, but closely related to, the page name.
  • Renames affect both the name and the title of the page.
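
One way to read the tryCamelCase and tryBeautified properties is as switches that add extra lookup candidates after the normalized name. The sketch below follows that reading; the class, the method names and the regex used to split "TestName" into "Test Name" are assumptions, not existing JSPWiki behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Sketch only: lists the names a request might be looked up under, in order. */
final class LookupCandidates {
    /** Trim, collapse internal whitespace to single spaces, lowercase. */
    static String normalize(String name) {
        return name.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    static List<String> candidates(String requested, boolean tryCamelCase, boolean tryBeautified) {
        List<String> out = new ArrayList<>();
        String cleaned = requested.trim().replaceAll("\\s+", " ");
        out.add(normalize(cleaned));
        if (tryCamelCase && cleaned.contains(" ")) {
            // "Test Name" is also looked up as "TestName" (stored lowercased).
            out.add(normalize(cleaned.replace(" ", "")));
        }
        if (tryBeautified) {
            // Stand-in for beautification: a space at each lower-to-upper boundary.
            String spaced = cleaned.replaceAll("(?<=[a-z])(?=[A-Z])", " ");
            if (!spaced.equals(cleaned)) {
                out.add(normalize(spaced));
            }
        }
        return out;
    }
}
```

Under these assumptions a request for "Test Name" with tryCamelCase=true is tried as "test name" and then "testname", while a request for "TestName" with tryBeautified=true is tried as "testname" and then "test name".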

This page (revision-7) was last changed on 22-Mar-2010 22:01 by Allhours