Text_Wiki 0.23.0 released

Jeremy Cowgar was kind enough to put together a render set for LaTeX; this means that you can now transform Wiki text to XHTML, plain text, or LaTeX markup. LaTeX support is about 80% complete and is remarkably well done. I’ll be taking Jeremy on as a project developer once he can get a PEAR account and the related PHP CVS access.

I removed the ‘Translatehtml’ rule set; as it is XHTML-specific, it had no place in the general parse/render cycle. Instead, its functionality has been moved into the pre() method of Render/Xhtml.php. To modify its behavior, use setFormatConf() instead of setRenderConf(): the call takes ‘Xhtml’, ‘translate’, and then either a translation table or boolean false (to turn off translation).

Looking forward to DocBook rendering support, I have added a new rule called ‘Function’ that allows you to enter a function definition with ‘access’, ‘return’, ‘param’ descriptions and defaults, and ‘throws’ keys, along with an almost-complete Xhtml renderer and stub renderers for Plain and Latex.

For other changes, please see the change log. I’ll update the documentation next week.

Why Coding Standards Matter

David Mytton writes about coding standards on Sitepoint.

A coding standards document tells developers how they must write their code. Instead of each developer coding in their own preferred style, they will write all code to the standards outlined in the document. This makes sure that a large project is coded in a consistent style — parts are not written differently by different programmers. Not only does this solution make the code easier to understand, it also ensures that any developer who looks at the code will know what to expect throughout the entire application.

I completely agree.

When you start sharing code, or start reading code shared by others, you begin to realize that not everybody writes their code the way you do. You see that other, ugly coding style, and think “everybody needs to write in the same style so that things are easier for me to understand.”

Thus, it is natural that everybody wants their own habits turned into the standard, so they don’t have to change their habits. They’re used to reading code a certain way (their own way) and get irritated when they see code in a different format. The thing about defining a coding style standard is that there is no objective means by which to judge one style as “better” or “more-right” than another. Sure, we can talk about “readability” and “consistency” but what is readable is different for each coder (depending on what they’re used to) and consistency follows automatically because, well, why would you use another style?

Other than in the broadest outlines, defining a coding standard is an exercise in arbitrariness. Who says that a 75-character line is better than 72, or 80? Who says putting braces in one place is better than putting them elsewhere? And who are they to say; by what standard do they judge?

The point of a coding style standard is not to say one style is objectively better than another in the sense of the specific details (75-character lines and the one-true-brace convention, for example). Instead, the point is to set up known expectations on how code is going to look.

For example, look at any PHP project written by more than one person. If you examine Script A and see one coding style, then examine Script B and see another, the effect is very jarring; they don’t look like they belong to the same project. Similarly, when Developer Joe (who uses one coding style) attempts to patch or add to a separate project from Developer Bill (who uses another coding style) the mix-and-match result in the same project (or even the same file!) is doubly jarring.

As PHP developers, we need to define and adhere to a coding style not because one is better than another but because we need a standard by which to collaborate. In that sense, coding style is very important; not for itself, but for the social effects generated by adherence to defined standards.

It is for this reason that I abandoned “my” coding style to adopt the PEAR coding standard. Sometimes you need to give up a small thing to gain a greater thing; by giving up my coding style, I had to change my habits; but now, anybody familiar with the PEAR standard can read my code and add to it in a recognizable way.

Text_Wiki 0.22.0 alpha

I just released Text_Wiki 0.22.0 alpha. Lots of improvements and no backwards-compat breaks; see the change log here.

Special thanks go out to Bob Glamm, Aaron Kalin, and Stephane Le Solliec for their very active efforts leading to this release. Thanks, guys. 🙂

Text_Wiki is a PEAR package that abstracts parsing and rendering of wiki markup. It is object-oriented, so you can add, modify, or remove rules without having to edit the core code. It is used in YaWiki (a Yawp-based project) and the Horde Wicked project.

Form Processing Questions

Norbert Mocsnik has raised some important points about automated form generation and processing in this post. He outlines what he thinks of as the form processing steps:

1. build a form (e.g. call functions to add inputs, set form ACTION)

2. fill in the defaults (with one call or walking through the form inputs by inputs)

3. assign the form template (either a static template which is created for a specific form thus giving the greatest flexibility or a dynamic template which describes how to display different field types thus can be applied to any form)

4. parse (and display) the form template

The user fills in the form and posts it to the server. Any time the user posts the form, the server should start with step 5 (so it goes like 5-6; 5-6-7; 5-6-7-8 not 5; 6; 7; 8). This way it is guaranteed that the user gets back the form any time he/she posts invalid values, gets to the confirmation after then (only with valid values that are ready to be processed) and the form is processed only if both the form was valid and it was confirmed (if needed).

5. form validation

6. if the form was not valid, pass it back to the user for correction (go back to step 1 but instead of filling the defaults in step 2, fill it with the values the user just entered)

7. if this form should be confirmed before processing and it wasn’t confirmed after the user edited it the last time, pass it back to the user “freezed” (=read-only) for confirmation

8. process the form (this means storing it in a database in most cases)
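Norbert’s rule that every post starts again at step 5 can be sketched in a few lines. This is only an illustration, written in Python to keep it short; the Form class, handle_post(), and the action names are hypothetical and are not taken from his post or from any form library.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    """A toy form: field values, required field names, confirmation flag."""
    values: dict
    required: list = field(default_factory=list)
    needs_confirmation: bool = True

    def validate(self):
        # Step 5: return the names of required fields that are empty.
        return [name for name in self.required if not self.values.get(name)]


def handle_post(form, confirmed):
    """Run steps 5-8 from the top on every POST and return the next action."""
    errors = form.validate()                      # step 5
    if errors:
        return ("redisplay", errors)              # step 6: back to the user
    if form.needs_confirmation and not confirmed:
        return ("confirm", [])                    # step 7: frozen, read-only
    return ("process", [])                        # step 8: e.g. store in a DB
```

Because validation always runs first, a post that claims to be confirmed but carries invalid values still goes back to the user for correction.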

I am not 100% certain that I agree with these points as presented; allow me to revisit and restate them. The following is a thought experiment; it’s an outline of the client and server processes for a form submission.

0. (Optional, not the normal case.) The client decides to attack your script by submitting form data directly. This would cause us to skip steps 1, 2, and 3, going directly to step 4. This is why we cannot rely on client-side validation of data for any serious purposes.

1. Model logic is triggered for the first time by client browsing to the page. We need to send the client a blank form with some sane default values. The model logic talks to the database through an interface class to see what a default data set should look like, modifies it as needed, and presents the default data set to the View (template) logic.

2. View logic takes the default data set and parses it through to a form; the form in the template script may be dynamically generated, say with the Savant2 “form” plugin or the Smarty form element plugins; alternatively, it may be mostly static XHTML with placeholders for the data elements. When done, off we go back to the client.

3. Client gets the generated form, fills in some elements, submits the form.

4. Model logic gets the submitted form data. Now we have to sanitize and validate the data; either the model logic or the data interface class sanitizes the data, then the data interface class validates it. Obviously there are two possible outcomes: some or all of the sanitized data is not valid (see 4a), or all of the sanitized data is valid (see 4b).

4a. If some part of the data is not valid, we should re-present the submitted data (perhaps modified to make it more sane) to the client. We cycle back to the equivalent of steps 1 and 2 again, with the submitted or modified data set (not the default data set).

4b. If all of the data is valid, then we can continue to perform the model logic; this may mean changing database values, handing control off to another script, “freezing” the form for confirmation (in which case we may cycle back to step 3 again), or any other type of processing.
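Here is a rough sketch of step 4 and its two branches, again in Python for brevity. The field names (“name”, “age”), the sanitizing rule, and the function names are all assumptions made up for the example; a real application would take them from its own data definitions.

```python
def sanitize(raw):
    """Trim whitespace and drop keys the form never defined (hypothetical rule)."""
    allowed = {"name", "age"}
    return {k: str(v).strip() for k, v in raw.items() if k in allowed}


def validate(data):
    """Return a dict of field -> problem; an empty dict means the data is valid."""
    problems = {}
    if not data.get("name"):
        problems["name"] = "required"
    if not data.get("age", "").isdigit():
        problems["age"] = "must be a whole number"
    return problems


def handle_submission(raw):
    """Step 4: sanitize, validate, then branch to 4a or 4b."""
    data = sanitize(raw)
    problems = validate(data)
    if problems:
        # 4a: re-present the sanitized submission, not the default data set.
        return {"action": "show_form", "data": data, "problems": problems}
    # 4b: carry on with the model logic (save, hand off, freeze, ...).
    return {"action": "continue", "data": data, "problems": {}}
```

Since handle_submission() never assumes the data came from the generated form, the step 0 case (a client posting directly) passes through exactly the same checks.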

As you can see, the above outline isn’t strictly based on Model/View separation. Instead, it is more like Data/Model/View separation, where the Data logic is encapsulated in a class like DB_Table or DB_DataObject.

It seems to me that the Model logic should not be validating the data; because it is data, the Data logic should handle that behavior. The Model should only ask the Data logic if the information is valid, and the reasons for invalidation; that way, any Model that talks to the Data logic will always get the same responses for the same data sets.

This is where I find HTML_QuickForm and FormBuilder and the like to be not the best long-term solution. They pierce the veil between Data, Model, and View, trying to roll them all into one piece. This is fine for prototyping, but for extended use I’m beginning to think we need less of a monolithic approach. We need more “small pieces” to be “loosely joined”.

What would this entail?

The Data logic would entail a data-interface class that knows what its columns and their valid limitations are, effectively an increased-functionality version of DB_Table or DB_DataObject. The class would need to be able to validate data at the PHP level for reasons of data portability (can’t depend on the database backend for that, because all DB backends are different in different ways). The class would also need to be able to report validity failures in an extensible and programmatic way (validation codes in addition to messages). Finally, the class would need to be able to provide hints to the View logic as to what kinds of values are valid, so that the View logic can present a reasonable form to the client.
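To make that a little more concrete, here is a minimal sketch of such a data-interface class. It is written in Python only to keep it compact, and everything in it (Column, DataTable, the error codes, the shape of the hints) is hypothetical; it is not the API of DB_Table or DB_DataObject.

```python
def _is_integer(text):
    """Return True when the text parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


class Column:
    """One column definition: type, required flag and maximum length."""

    def __init__(self, kind, required=False, max_len=None):
        self.kind = kind
        self.required = required
        self.max_len = max_len


class DataTable:
    """Hypothetical data-interface class that validates in application code."""

    # Validation codes, so callers can react without parsing message text.
    ERR_REQUIRED = 1
    ERR_TOO_LONG = 2
    ERR_NOT_INTEGER = 3

    MESSAGES = {
        ERR_REQUIRED: "A value is required.",
        ERR_TOO_LONG: "The value is too long.",
        ERR_NOT_INTEGER: "The value must be an integer.",
    }

    def __init__(self, columns):
        self.columns = columns

    def validate(self, row):
        """Check a row without the database; return {column: (code, message)}."""
        failures = {}
        for name, col in self.columns.items():
            value = row.get(name)
            if value is None or value == "":
                if col.required:
                    failures[name] = self._fail(self.ERR_REQUIRED)
                continue
            text = str(value)
            if col.kind == "integer" and not _is_integer(text):
                failures[name] = self._fail(self.ERR_NOT_INTEGER)
            elif col.max_len is not None and len(text) > col.max_len:
                failures[name] = self._fail(self.ERR_TOO_LONG)
        return failures

    def hints(self):
        """Describe each column so a view can pick a sensible form element."""
        return {
            name: {"kind": col.kind, "required": col.required, "max_len": col.max_len}
            for name, col in self.columns.items()
        }

    def _fail(self, code):
        return (code, self.MESSAGES[code])
```

Any Model that calls validate() gets the same codes for the same row, which is the consistency I am after; the messages are there for people, the codes for programs.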

The Model logic is up in the air; its functions are the heart of the custom application, and as such cannot be strictly outlined here. At worst, the Model would serve as a mediator between the Data logic and the View logic.

The View logic would be able to take the data presented to it from the Model and construct a useful form, along with validation messages.
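As a sketch of what that could look like, assume the Model hands the View three plain dictionaries: hints per column, the current values, and any validation messages. The function name, the dictionary shapes, and the markup below are my own assumptions for illustration (in Python again), not the output of Savant2, Smarty, or HTML_QuickForm.

```python
from html import escape


def render_form(hints, values, messages):
    """Build XHTML input elements from column hints, values and messages."""
    lines = ['<form method="post" action="">']
    for name, hint in hints.items():
        # Escape everything that came from the client before echoing it back.
        value = escape(str(values.get(name, "")), quote=True)
        attrs = f'type="text" name="{escape(name, quote=True)}" value="{value}"'
        if hint.get("max_len"):
            attrs += f' maxlength="{hint["max_len"]}"'
        lines.append(f"  <label>{escape(name)} <input {attrs} /></label>")
        if name in messages:
            lines.append(f'  <p class="error">{escape(messages[name])}</p>')
    lines.append('  <input type="submit" value="Save" />')
    lines.append("</form>")
    return "\n".join(lines)
```

The View never decides whether a value is valid; it only displays what the Data logic reported, next to the field it belongs to.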

Man, that’s a lot; let me think on this a bit more and get back to it later. In the meantime, please comment freely and tell me why this outline is the worst thing you’ve ever seen. 😉

Yawp 1.0.3 Released

Yawp 1.0.3 is online now. This is a bugfix release; I was using “@include” instead of “include” for hook scripts, which effectively turns off error reporting inside those hook scripts. Yawp uses just “include” now; sorry for any trouble this may have caused when troubleshooting.

At the same time, this uncovered a bug inside PEAR Benchmark_Timer; you can read more about that here.