Form Processing Questions

Norbert Mocsnik has raised some important points about automated form generation and processing in this post. He outlines what he thinks of as the form processing steps:

1. build a form (e.g. call functions to add inputs, set form ACTION)

2. fill in the defaults (with one call or walking through the form inputs by inputs)

3. assign the form template (either a static template which is created for a specific form thus giving the greatest flexibility or a dynamic template which describes how to display different field types thus can be applied to any form)

4. parse (and display) the form template

The user fills in the form and posts it to the server. Any time the user posts the form, the server should start with step 5 (so it goes like 5-6; 5-6-7; 5-6-7-8 not 5; 6; 7; 8). This way it is guaranteed that the user gets back the form any time he/she posts invalid values, gets to the confirmation after then (only with valid values that are ready to be processed) and the form is processed only if both the form was valid and it was confirmed (if needed).

5. form validation

6. if the form was not valid, pass it back to the user for correction (go back to step 1 but instead of filling the defaults in step 2, fill it with the values the user just entered)

7. if this form should be confirmed before processing and it wasn’t confirmed after the user edited it the last time, pass it back to the user “freezed” (=read-only) for confirmation

8. process the form (this means storing it in a database in most cases)
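
The "always start with step 5" rule can be sketched as a single decision function. This is only an illustration in Python, not code from Norbert's post; `Step` and `next_step` are hypothetical names, and the caller is assumed to clear the `confirmed` flag whenever the user edits the form again.

```python
from enum import Enum


class Step(Enum):
    REDISPLAY = 6  # step 6: send the form back for correction
    CONFIRM = 7    # step 7: show the frozen (read-only) form
    PROCESS = 8    # step 8: process the form, e.g. store it


def next_step(is_valid: bool, needs_confirmation: bool, confirmed: bool) -> Step:
    """Decide what follows step 5 (validation) on every single post."""
    if not is_valid:
        return Step.REDISPLAY
    if needs_confirmation and not confirmed:
        return Step.CONFIRM
    return Step.PROCESS
```

Because every post runs through the same function, a request can never reach processing without having passed validation and, where required, confirmation.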

I am not 100% certain that I agree with these points as presented; allow me to revisit and restate them. The following is a thought experiment; it’s an outline of the client and server processes for a form submission.

0. (Optional, not the normal case.) The client decides to attack your script by submitting form data directly. This would cause us to skip steps 1, 2, and 3, going directly to step 4. This is why we cannot rely on client-side validation of data for any serious purposes.

1. Model logic is triggered for the first time by client browsing to the page. We need to send the client a blank form with some sane default values. The model logic talks to the database through an interface class to see what a default data set should look like, modifies it as needed, and presents the default data set to the View (template) logic.

2. View logic takes the default data set and parses it through to a form; the form in the template script may be dynamically generated, say with the Savant2 “form” plugin or the Smarty form element plugins; alternatively, it may be mostly static XHTML with placeholders for the data elements. When done, off we go back to the client.

3. Client gets the generated form, fills in some elements, submits the form.

4. Model logic gets the submitted form data. Now we have to sanitize and validate the data; either the model logic or the data interface class sanitizes the data, then the data interface class validates it. Obviously there are two possible outcomes: some or all of the sanitized data is not valid (see 4a), or all of the sanitized data is valid (see 4b).

4a. If some part of the data is not valid, we should re-present the submitted data (perhaps modified to make it more sane) to the client. We cycle back to the equivalent of steps 1 and 2 again, with the submitted or modified data set (not the default data set).

4b. If all of the data is valid, then we can continue to perform the model logic; this may mean changing database values, handing control off to another script, “freezing” the form for confirmation (in which case we may cycle back to step 3 again), or any other type of processing.
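
As a rough sketch of steps 1 through 4b, the Model logic might look something like the following. It is written in Python purely for illustration; `ContactData`, `handle`, and the two example fields are hypothetical stand-ins, not the API of DB_Table or any other real library. Note that a direct submission (step 0) simply enters at the same point as a normal one, so it gets the same sanitizing and validation.

```python
from typing import Optional


class ContactData:
    """Hypothetical data-interface class standing in for the Data logic."""

    def defaults(self) -> dict:
        return {"name": "", "email": ""}

    def sanitize(self, raw: dict) -> dict:
        # Keep only known fields and trim surrounding whitespace.
        return {key: str(raw.get(key, "")).strip() for key in self.defaults()}

    def validate(self, data: dict) -> dict:
        # Returns {field: message}; an empty dict means everything is valid.
        errors = {}
        if not data["name"]:
            errors["name"] = "Name is required."
        if "@" not in data["email"]:
            errors["email"] = "Email must contain '@'."
        return errors


def handle(data_logic: ContactData, submitted: Optional[dict]) -> dict:
    """Model logic: describe what the View (or the next stage) should do."""
    if submitted is None:
        # Steps 1-2: first visit, so present the default data set.
        return {"action": "show_form", "values": data_logic.defaults(), "errors": {}}
    clean = data_logic.sanitize(submitted)  # step 4: sanitize...
    errors = data_logic.validate(clean)     # ...then validate
    if errors:
        # Step 4a: re-present the submitted (sanitized) data, not the defaults.
        return {"action": "show_form", "values": clean, "errors": errors}
    # Step 4b: all valid, so carry on with the real work.
    return {"action": "process", "values": clean, "errors": {}}
```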

As you can see, the above outline isn’t strictly based on Model/View separation. Instead, it is more like Data/Model/View separation, where the Data logic is encapsulated in a class like DB_Table or DB_DataObject.

It seems to me that the Model logic should not be validating the data; because it is data, the Data logic should handle that behavior. The Model should only ask the Data logic whether the information is valid and, if it is not, why not; that way, any Model that talks to the Data logic will always get the same responses for the same data sets.

This is where I find HTML_QuickForm and FormBuilder and the like to be less than ideal as a long-term solution. They pierce the veil between Data, Model, and View, trying to roll them all into one piece. This is fine for prototyping, but for extended use I’m beginning to think we need less of a monolithic approach. We need more “small pieces” to be “loosely joined”.

What would this entail?

The Data logic would entail a data-interface class that knows what its columns and their valid limitations are, effectively an increased-functionality version of DB_Table or DB_DataObject. The class would need to be able to validate data at the PHP level for reasons of data portability (we can’t depend on the database backend for that, because all DB backends are different in different ways). The class would also need to be able to report validity failures in an extensible and programmatic way (validation codes in addition to messages). Finally, the class would need to be able to provide hints to the View logic as to what kinds of values are valid, so that the View logic can present a reasonable form to the client.
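
To make those three requirements a little more concrete, here is a minimal sketch in Python. Everything in it (`Column`, `Failure`, `Table`, the `REQUIRED` and `TOO_LONG` codes) is a hypothetical illustration and not the interface of DB_Table or DB_DataObject; it also assumes all submitted values are strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    required: bool = False
    max_length: int = 255


@dataclass(frozen=True)
class Failure:
    column: str
    code: str     # stable, machine-readable validation code
    message: str  # human-readable explanation


class Table:
    """Hypothetical data-interface class that knows its own columns."""

    def __init__(self, columns):
        self.columns = {c.name: c for c in columns}

    def validate(self, row: dict) -> list:
        # Validation happens in application code, not in the database backend.
        failures = []
        for col in self.columns.values():
            value = row.get(col.name, "")
            if col.required and value == "":
                failures.append(Failure(col.name, "REQUIRED", f"{col.name} is required."))
            elif len(value) > col.max_length:
                failures.append(Failure(
                    col.name, "TOO_LONG",
                    f"{col.name} may be at most {col.max_length} characters."))
        return failures

    def hints(self) -> dict:
        # Hints the View can use to present a reasonable form.
        return {c.name: {"required": c.required, "maxlength": c.max_length}
                for c in self.columns.values()}
```

Since the rules live in one class, any Model that calls `validate()` with the same row gets the same codes and messages back.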

The Model logic is up in the air; its functions are the heart of the custom application, and as such cannot be strictly outlined here. At worst, the Model would serve as a mediator between the Data logic and the View logic.

The View logic would be able to take the data presented to it from the Model and construct a useful form, along with validation messages.
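
A View along those lines could be as small as the function below. Again, this is a Python sketch under assumptions: `render_form`, the shape of the `hints` dictionary, and the text-only inputs are all invented for illustration, and a real template system such as Savant2 or Smarty would do this differently.

```python
from html import escape


def render_form(action: str, values: dict, hints: dict, messages: dict) -> str:
    """Build a simple text-input form from values, hints, and error messages."""
    lines = [f'<form method="post" action="{escape(action)}">']
    for name, hint in hints.items():
        # Escape everything that came from the client before echoing it back.
        attrs = f'type="text" name="{escape(name)}" value="{escape(values.get(name, ""))}"'
        if "maxlength" in hint:
            attrs += f' maxlength="{int(hint["maxlength"])}"'
        if hint.get("required"):
            attrs += " required"
        lines.append(f"  <label>{escape(name)} <input {attrs}></label>")
        if name in messages:
            lines.append(f'  <p class="error">{escape(messages[name])}</p>')
    lines.append('  <button type="submit">Send</button>')
    lines.append("</form>")
    return "\n".join(lines)
```

The View only consumes what it is handed; it never decides on its own whether a value is valid.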

Man, that’s a lot; let me think on this a bit more and get back to it later. In the meantime, please comment freely and tell me why this outline is the worst thing you’ve ever seen. 😉


4 thoughts on “Form Processing Questions”

  1. Sounds very much like the forms engine I’m working on at the moment 🙂

    My solution was to create a form structure (an associative array) that describes the elements in the form.

    Pass the form structure to the build functions, and it spits out a fully working, styled form.

    The same form structure is used to validate any user inputs and spits back a form with errors highlighted.

    Once the form has validated, the form structure is used to build the queries necessary to work with the database.

    It’s a work in progress (half of it is still on my box at work), but take a look:

  2. I would just like to make a small notice here.

    In my opinion, form validation should be done on both the client side and the server side. I know it is really boring for the programmer to code the validation in both Javascript and PHP, but if we think about efficiency and security at the same time, this is the best approach.

    Form validation on the client side would avoid unnecessary transactions between the client and the server. This has obvious benefits.

    Form validation on the server side will only fail if the client-side validation has a bug, or the user is trying to hack the server. Server-side validation may only throw an ugly error message, because if the client-side validation has a bug, support should be warned about it and fix it as soon as possible. If the user is trying to hack the server… well, this can have obvious solutions.

    Probably this would go against the MVC pattern, or maybe not. We can always see the Model as crossing the border between PHP and Javascript. We are talking about web applications, which take different approaches from regular applications.

    Best regards,
    Gama Franco

  3. Hi, Gama —

    Thanks for reading and taking the time to comment. 🙂

    You said, “Form validation on the client side would avoid unnecessary transactions between the client and the server. This has obvious benefits.” Indeed it does! However, it may be possible to use XMLHttpRequest to reduce the size of the round-trip transactions, thus reducing the benefit of client-side validation.

    You also said, “Form validation on the server side will only fail if the client-side validation has a bug, or the user is trying to hack the server. Server-side validation may only throw an ugly error message…” To me, this is a very strong reason to have a Data layer that throws useful error messages to the form.

    Finally, you say, “We are talking about web applications, which take different approaches from regular applications.” Absolutely! In almost all regular desktop applications, all stages of data entry, validation, and storage happen at the client; in a web-app, the data is stored at the server, which is where all serious validation and processing must occur.

    — pmj
