This discussion has been refactored here from Ideas. It concerns the suggested modularization or modification of TranslatorReader to accommodate different content types.

Jan 15, 2003: Pluggable TranslatorReader

While thinking about PDF generation, it occurred to us that a pluggable TranslatorReader might be useful. We could deduce which translator to use from something in the request, produce a TranslatedObject (or something), and provide its contents (whether HTML, WML, PDF binary data, whatever) to the JSP page level. (Haven't thought about how I'd serve non-renderable data yet; maybe a custom JSP, or maybe utilizing the template system? Later.)

Comments? This has been done in portal systems a million times before, I'm sure..
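
To make the idea concrete, here is a minimal sketch of such a lookup. All names in it (PageTranslator, TranslatedObject, TranslatorRegistry, the "html" and "text" format keys) are hypothetical and not part of JSPWiki, and the two registered translators are placeholders that only wrap the text instead of parsing it.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical result holder: content type plus raw bytes, so binary formats fit too.
final class TranslatedObject {
    final String contentType;
    final byte[] data;

    TranslatedObject(String contentType, byte[] data) {
        this.contentType = contentType;
        this.data = data;
    }
}

// Hypothetical plug-in point: one implementation per output format.
interface PageTranslator {
    TranslatedObject translate(String wikiMarkup);
}

// Maps a format name taken from the request to a translator.
final class TranslatorRegistry {
    private final Map<String, PageTranslator> translators = new HashMap<>();
    private final String defaultFormat;

    TranslatorRegistry(String defaultFormat) {
        this.defaultFormat = defaultFormat;
    }

    void register(String format, PageTranslator translator) {
        translators.put(format, translator);
    }

    // Falls back to the default format when the request names none or an unknown one.
    PageTranslator lookup(String requestedFormat) {
        PageTranslator t = requestedFormat == null ? null : translators.get(requestedFormat);
        return t != null ? t : translators.get(defaultFormat);
    }
}

final class PluggableTranslatorDemo {
    static TranslatorRegistry newRegistry() {
        TranslatorRegistry registry = new TranslatorRegistry("html");
        // Placeholder translators; real ones would actually parse the markup.
        registry.register("html", markup -> new TranslatedObject(
                "text/html", ("<p>" + markup + "</p>").getBytes(StandardCharsets.UTF_8)));
        registry.register("text", markup -> new TranslatedObject(
                "text/plain", markup.getBytes(StandardCharsets.UTF_8)));
        return registry;
    }

    public static void main(String[] args) {
        TranslatedObject out = newRegistry().lookup("text").translate("Hello");
        System.out.println(out.contentType);
    }
}
```

Carrying the content type next to the bytes is what would let the JSP level decide whether the result can be rendered inline or has to be served as a download.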


I very much like the idea; making things more modular is justification enough. I have also already experimented with a JavaCC grammar for JSPWiki that produces a parse tree of a page. Although still far from perfect, it could provide a starting point. Things like the CamelCase code would just be visitors to this tree and could transform it as they want to. HTML output would just be another visitor to this tree.

Problems however:

  • plugins: they currently produce HTML. They have to be able to produce the other formats as well or even better only Wiki-Markup, but that is quite restricted of course
  • speed: Generating the parse-tree is slower than the current home-grown parsing, but could be compensated with caching (even gaining speed).

-- Torsten

I would rather concentrate on outputting XHTML with proper class-attributes so that we can then easily use techniques such as XSL to convert that into PDF or something. Maintaining two (or more) implementations of the same language parser is a hassle - we would need to standardize the WikiMarkup in a much deeper fashion than what we do now.

-- JanneJalkanen

As it produces a parse tree, it is very straightforward to convert that to well-formed XML, with every possibility of converting that to XHTML or whatever. On the other hand, I wouldn't want to restrict that to an XML format, and would at least keep the opportunity to have something else as well.

What do you actually mean with two parser-implementations? There would be just one parser that generates a tree. This tree is then transformed to do further processing, i.e. make CamelCase words a link, add an image after external links etc. Only the very last step would be an output-specific filter that converts that to whatever output format required. This filter would only have to recognize a few basic elements (currently lists, tables, links, images, text with a few formatting options). Most modifications should be achievable without modifying/adding to these basic elements (you might call that meta-WikiMarkup ;-) with only adding another tree visitor to the processing step after parsing and eventually having to modify the grammar a bit (which should be much better maintainable than the current solution).

-- Torsten
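
The pipeline described above (one parser, tree-transforming visitors, one output-specific visitor at the end) might look roughly like this. It is only a sketch under assumptions: the node types, the visitor interface, the CamelCase rule (two or more capitalized humps) and the link URL are all made up for illustration, and the paragraph is built by hand where a real parser would produce it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical node types; a real grammar would have more (lists, tables, images).
abstract class WikiNode {
    abstract void accept(WikiVisitor v);
}

final class TextNode extends WikiNode {
    final String text;
    TextNode(String text) { this.text = text; }
    void accept(WikiVisitor v) { v.visitText(this); }
}

final class LinkNode extends WikiNode {
    final String target;
    LinkNode(String target) { this.target = target; }
    void accept(WikiVisitor v) { v.visitLink(this); }
}

final class ParagraphNode extends WikiNode {
    final List<WikiNode> children = new ArrayList<>();
    void accept(WikiVisitor v) { v.visitParagraph(this); }
}

interface WikiVisitor {
    void visitText(TextNode n);
    void visitLink(LinkNode n);
    void visitParagraph(ParagraphNode n);
}

// Processing step: splits text nodes so that CamelCase words become link nodes.
// Paragraphs are assumed to be flat (no paragraph inside a paragraph).
final class CamelCaseLinker implements WikiVisitor {
    // Assumed definition of CamelCase: two or more capitalized humps.
    private static final Pattern CAMEL = Pattern.compile("\\b(?:[A-Z][a-z]+){2,}\\b");
    private List<WikiNode> replacement = new ArrayList<>();

    public void visitText(TextNode n) {
        Matcher m = CAMEL.matcher(n.text);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) replacement.add(new TextNode(n.text.substring(last, m.start())));
            replacement.add(new LinkNode(m.group()));
            last = m.end();
        }
        if (last < n.text.length()) replacement.add(new TextNode(n.text.substring(last)));
    }

    public void visitLink(LinkNode n) { replacement.add(n); }

    public void visitParagraph(ParagraphNode n) {
        replacement = new ArrayList<>();
        for (WikiNode child : new ArrayList<>(n.children)) child.accept(this);
        n.children.clear();
        n.children.addAll(replacement);
    }
}

// Output step: the only place that knows about HTML.
final class HtmlRenderer implements WikiVisitor {
    private final StringBuilder out = new StringBuilder();

    public void visitText(TextNode n) { out.append(escape(n.text)); }

    public void visitLink(LinkNode n) {
        // The URL pattern is a placeholder.
        out.append("<a href=\"Wiki.jsp?page=").append(n.target).append("\">")
           .append(n.target).append("</a>");
    }

    public void visitParagraph(ParagraphNode n) {
        out.append("<p>");
        for (WikiNode child : n.children) child.accept(this);
        out.append("</p>");
    }

    String result() { return out.toString(); }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

final class TreeVisitorDemo {
    static String render(String text) {
        ParagraphNode p = new ParagraphNode();
        p.children.add(new TextNode(text));   // stands in for the parser's output
        p.accept(new CamelCaseLinker());
        HtmlRenderer html = new HtmlRenderer();
        p.accept(html);
        return html.result();
    }
}
```

A LaTeX or plain-text formatter would be another WikiVisitor next to HtmlRenderer, and the CamelCase step would not need to change.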

I was talking about two parser implementations because the topic of this idea is "pluggable TranslatorReaders". It sort of implies that it should be possible to have two or more implementations against the same set of input files, doesn't it?

As for a separate parser with a tree... What would we gain that we can't do now? Why fix something that is not really broke? Why is XHTML not a good choice for the intermediate language? Most of the WikiMarkup is really meant for display formatting anyway, not establishing content structure.

And most importantly: I don't see why a WikiEngine should be able to do everything and every format on earth. What is so fundamentally wrong with XHTML that it cannot be transformed into something else? If anything, the current TranslatorReader should be more componentized (a la DevWiki), and able to produce much more metainformation than it currently does. Yeah, and full XHTML compliance would be nice, too. Currently we don't produce <p>-tags correctly anyway.

Besides, cases where people open italic first and then bold, but close italic before bold, will produce faulty XML. Any system where users can freely input content means that we have to deal with problems like that. Currently we don't, which is a real problem in TranslatorReader.

-- JanneJalkanen

Hum, XHTML and FOP would pretty much answer my concerns. --ebu

To Janne: There is absolutely nothing wrong with XHTML. I just want to say that the proposed solution is more generic than just allowing XHTML. Why restrict the design if you can easily have a more general solution at little cost? The idea behind this design actually came to me while thinking about LaTeX output from WikiMarkup.

Even more important than the above, IMO, is that it adds modularity. Currently TranslatorReader is a 1.5+kloc class that does it all: parsing, processing and formatting. With the above design you are able to separate these, and most of the processing steps, into separate classes. This should make maintaining the existing code and adding new functionality easier.

You can also think of it as an extension to the current plugin mechanism. For example, someone here recently came up with the idea of a table of contents, something I would also find useful. With this design you could write a plugin that collects all headings in a first step and then formats them using the current mechanism. This could be totally independent of the main distribution and would only require registering a new tree visitor.

The case of not properly nested markup would be a problem for a formatter at the end, if this format requires strictly hierarchical output. The parser would just create a ToggleItalic-node or something similar.

-- Torsten
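
As a sketch of what such a formatter could do with toggle nodes: keep a stack of open tags, and when a toggle closes a tag that is not innermost, close the inner tags first and reopen them afterwards. The token kinds and class names below are hypothetical, only italic and bold are handled, and escaping of text is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ToggleFormatter {
    // Hypothetical token kinds the parser might emit.
    enum Kind { TEXT, TOGGLE_ITALIC, TOGGLE_BOLD }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        static Token text(String s) { return new Token(Kind.TEXT, s); }
        static Token toggle(Kind k) { return new Token(k, null); }
    }

    static String toXhtml(List<Token> tokens) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>();   // innermost open tag is on top
        for (Token t : tokens) {
            if (t.kind == Kind.TEXT) {
                out.append(t.text);   // escaping omitted for brevity
            } else {
                toggle(t.kind == Kind.TOGGLE_ITALIC ? "i" : "b", open, out);
            }
        }
        // Close whatever the user forgot to close.
        while (!open.isEmpty()) out.append("</").append(open.pop()).append('>');
        return out.toString();
    }

    private static void toggle(String tag, Deque<String> open, StringBuilder out) {
        if (!open.contains(tag)) {
            open.push(tag);
            out.append('<').append(tag).append('>');
            return;
        }
        // Close everything opened after this tag, close the tag, then reopen the rest.
        List<String> reopened = new ArrayList<>();
        while (!open.peek().equals(tag)) {
            String inner = open.pop();
            out.append("</").append(inner).append('>');
            reopened.add(0, inner);
        }
        out.append("</").append(open.pop()).append('>');
        for (String inner : reopened) {
            open.push(inner);
            out.append('<').append(inner).append('>');
        }
    }
}
```

With this approach the parser stays simple, and only formatters that need strictly hierarchical output pay for the repair.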

BTW, see Radeox for more ideas. It comes from the SnipSnap folks.

-- JanneJalkanen

April 1, 2003: Make checkForCamelCaseLink a Public Method

The checkForCamelCaseLink method is currently private. It is useful and should be made public; alternatively, a public method isCamelCase(String word) could be added that returns a boolean by checking whether checkForCamelCaseLink returns null.

This would allow for newly created pages to be tested for their CamelCase quality. If isCamelCase is false, then a message before "Why don't you create it..." can be shown to suggest that the page is not in proper CamelCase.

Ideally, the isCamelCase method suggested above would be a static method, but this would involve some work to make it fit into the existing TranslatorReader.

-- JeffPhillips
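
For illustration, a static check could look like the sketch below. It does not call checkForCamelCaseLink and does not reproduce TranslatorReader's actual rules; the class name and the regular expression (two or more humps, each an uppercase letter followed by lowercase letters) are assumptions.

```java
import java.util.regex.Pattern;

final class CamelCaseCheck {
    // Assumed rule: at least two humps, each an uppercase letter followed by lowercase letters.
    private static final Pattern CAMEL_CASE = Pattern.compile("(?:[A-Z][a-z]+){2,}");

    // Returns true only if the whole word is CamelCase under the assumed rule.
    static boolean isCamelCase(String word) {
        return word != null && CAMEL_CASE.matcher(word).matches();
    }
}
```

Under this assumed rule "WikiPage" passes, while "Wiki", "wikipage" and "JSPWiki" do not, so the real method would probably need a more careful definition.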

May 5, 2003: Use clear CSS classes and fully comply to XHTML

This would require only relatively few changes, but allows for many things like simpler XSL translations (e.g. using the HULA reader) or page embedding. I think what's needed is

  • add a specific class-Attribute to each element written by the TranslatorReader (I mean really everything, incl. BR's, TABLE's etc.). e.g. <table class="jspwiki-table">...
  • close P-Elements

-- BobSchulze

I don't know whether it is a good idea to put class definitions everywhere - can't you just enclose everything inside a <div> and use CSS selectors?

-- JanneJalkanen

March 27, 2004: Remove HR tags and hard-coded table border widths?

The HR tag isn't 100% customizable in CSS in some browsers, and is handled differently between MSIE and Mozilla. For example, a style="color:red;height:3px;background-color:green;border:none;" gives a red bar in MSIE and a green bar in Mozilla. Are there any objections to replacing any HR tags generated by TranslatorReader with a more consistent DIV instead?

Also, I've found similar issues with TABLE elements with hard-coded border attributes. For example, a table border="1" style="border=3px dashed red;" renders with a 3px white border in Mozilla. Unfortunately, TranslatorReader also generates tables with this border attribute.

By replacing the HRs with DIVs, and by removing the border attribute from TABLEs, these elements would be far more configurable by using CSS.

This fix is pretty simple, I've addressed it in my wiki where you can read about the changes. I think that it makes the page look much cleaner and more modern, since you can get rid of the 1990's-era horizontal rulers.

My question is this: is there any reason to keep things the way they are, or would this change potentially break someone's wiki?

-- JeremySproat

September 17, 2004: XHTML now!

How long would it take to alter the TranslatorReader to output valid XHTML? There are several problems, not only HR tags, that make cross-browser styling impossible. There would be no need for this JavaScript CSS solution. One stylesheet would do if you used DIVs and produced valid XHTML, or even valid HTML. A UL tag within another UL tag has to be enclosed by the preceding LI tag. The list is important for the LeftMenu, as you probably know.

-- Matthias
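
The list nesting rule mentioned above can be sketched as follows: a nested UL is opened while the preceding LI is still open, and that LI is closed only after the nested list ends. The class and field names are hypothetical, the depth is assumed to come from the number of leading list markers, and text escaping is omitted.

```java
import java.util.List;

final class NestedListRenderer {
    static final class Item {
        final int depth;      // 1 for a top-level item, 2 for one level deeper, ...
        final String text;
        Item(int depth, String text) { this.depth = depth; this.text = text; }
    }

    static String render(List<Item> items) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (Item item : items) {
            if (item.depth > depth) {
                // Going deeper: open the new <ul> inside the still-open <li>.
                while (depth < item.depth) {
                    out.append("<ul>");
                    depth++;
                    // A skipped level gets an empty <li> so the nesting stays valid.
                    if (depth < item.depth) out.append("<li>");
                }
            } else {
                // Same level or shallower: close the previous item first.
                out.append("</li>");
                while (depth > item.depth) {
                    out.append("</ul></li>");
                    depth--;
                }
            }
            out.append("<li>").append(item.text);   // escaping omitted for brevity
        }
        while (depth > 0) {
            out.append("</li></ul>");
            depth--;
        }
        return out.toString();
    }
}
```

For the items A (depth 1), B (depth 2), C (depth 1) this yields `<ul><li>A<ul><li>B</li></ul></li><li>C</li></ul>`, where the inner list sits inside the LI for A.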

Category Ideas

« This page (revision-13) was last changed on 21-Jun-2007 22:48 by