Simon Harriyott

Programming by text-file

I'm currently thinking about the role of configuration files (or the registry, or database storage) in programming. Clearly they have a role for user options, as the settings just can't be stored in code. I get scared when they become design or even coding options that are read from a file.

The idea behind using configuration files is sound. It is easier to change a setting at run-time by editing a text file than by recompiling a module and deploying it. This is a definite bonus when end-users call for support: "Just load the wotsit.xml file in Notepad, and change the 'stopworking' value from true to false".
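As a minimal sketch of what that support call relies on, here is how a program might read such a flag at start-up. The layout of wotsit.xml is an assumption (a flat list of setting elements with name and value attributes), the "retries" entry is invented for illustration, and load_settings is a hypothetical helper. The example is in Python purely for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of wotsit.xml; the real layout is not specified.
WOTSIT_XML = """
<settings>
    <setting name="stopworking" value="true" />
    <setting name="retries" value="3" />
</settings>
"""


def load_settings(xml_text):
    """Return the name/value pairs from the XML text as a dict of strings."""
    root = ET.fromstring(xml_text)
    return {s.get("name"): s.get("value") for s in root.iter("setting")}


settings = load_settings(WOTSIT_XML)
# Everything in the file is text, so the program has to interpret it.
stop_working = settings["stopworking"].strip().lower() == "true"
```

Changing the value in the file changes stop_working the next time the settings are loaded, with no recompile. It also means anything else in the file can be changed just as easily.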

That is also the weakness. The user-with-just-enough-knowledge-to-be-dangerous will hang up the phone, and then see what other settings can be changed. And for that matter, what other files.

As configuration files can't be compiled, they live outside the program. This makes me nervous. Having, for example, URLs in a config file seems like a good idea, but if the program flow depends on the URL pointing to, say, the next page in a wizard, then this is risky. The URL is used in the same manner as a method call in compiled code. As the file is outside of the compiled program, the program's control logic can be changed without changing code and recompiling. However good the unit testing is, the deployed version can still break really easily.
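To make the risk concrete, here is a rough sketch of a wizard whose page order comes from configuration. Every name in it (the page URLs, next_page, broken_links, the shape of the config) is hypothetical, and the config is shown as an in-memory dictionary rather than a real file.

```python
# Pages the compiled program actually knows how to show (hypothetical names).
KNOWN_PAGES = {"/wizard/start", "/wizard/details", "/wizard/confirm"}


def next_page(config, current):
    """Look up the page that follows `current` in the configured flow."""
    return config["next"][current]


def broken_links(config):
    """Return configured targets that do not match any known page."""
    return sorted(t for t in config["next"].values() if t not in KNOWN_PAGES)


# The configuration the unit tests were run against.
tested_config = {"next": {"/wizard/start": "/wizard/details",
                          "/wizard/details": "/wizard/confirm"}}

# The same settings after someone has edited them on the deployed machine.
deployed_config = {"next": {"/wizard/start": "/wizard/confirm",
                            "/wizard/details": "/wizard/detials"}}
```

With tested_config the wizard goes from start to details to confirm. With deployed_config it skips the details page, and the details page now points at a URL that doesn't exist, yet not a line of code has changed. A start-up check along the lines of broken_links would catch the dead link, but it can't tell that a step has been skipped.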

There comes a point in a program where there are enough (let's call it x) configuration name / value pairs to warrant categories of options. Then a handy utility to manage the settings is needed, with a tabbed view interface, one tab per category. Again, these are both good ideas, but only for the right sort of option. When a new option is added, not only does the code in the original program need to load it, but the config utility must be updated too. Once there are x settings, the work involved increases dramatically. [Note to self: I must get round to writing a sort of generic kind of config file manager editor application utility tool program.]

I notice that MSBuild, although quite simple, can have some gigantic files. These are basically configuration files, and people are asking for a tool to configure them, which they may or may not get. Whidbey has a web front-end onto the web.config file, which has its own settings in the machine.config file. A config file tool with a config file? Lather, rinse and repeat.

Documenting the settings is often a problem, as configuration files (in my experience) generally don't have comments. The details of each name / value pair must be written in a supporting document, which may be part of the help system. By comparison, code is commented in the same file, on the line above.

Configuration files are hard to localise. A menu control using an XML file to store the menu options would require one file per culture. One more thing to send to the translator.

XML config files are replacing .ini files, and thus XML schemas are needed to ensure that the config file is valid. One more thing to change. And document. XML is just plain hard for humans to read.

Debugging a program, n-tier application or multi-application system that uses several configuration files is really hard. Developers understand code. A programmer can often tell by looking what is wrong with a routine in code. A config file could be anything. There just aren't the rules to follow in text files. Debugging involves lots of matching one thing to another, and then another thing to something else, back and forth, to and fro. I personally prefer looking through code to looking through text files when debugging. I've seen someone trying to navigate through a large BizTalk configuration recently. That is really scary; I'm not going anywhere near that.

Because config files aren't compiled, enumerated values can't be enforced so easily. These are often used in switch statements. There are two less than ideal options when using enumerations in text files: have numbers in the file which are used in the code, maybe with an enumeration type, e.g.

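switch ((Destinations)destinationOption) {   // assumes an int read from the file, e.g. 12
    case Destinations.Email:                 // Email = 12, declared in the enumeration
        break;
}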

or have a string in the file which is used directly in the switch statement.

switch (destinationOption) {   // here a string taken straight from the file
    case "email":
        break;
}

In the first example, readability is reduced when debugging, as the correlation between 12 and "email" is made elsewhere in the code. The only enforcement of correct values in the text file would be in the XML schema, which would be duplicated in the enumeration in the code. In the second example, the readability improves but the problem arises of having hard-coded strings in switch statements. This is an example of coding in a text file.
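For comparison, here is a small sketch of both options side by side, written in Python rather than C# purely for brevity. The Destination enumeration, its PRINTER member and the two helper functions are hypothetical; only the number 12 and the string "email" come from the text above.

```python
from enum import Enum


class Destination(Enum):
    # Hypothetical values; 12 matches the number used in the text above.
    EMAIL = 12
    PRINTER = 13


def from_number(raw):
    """Option one: the file holds a number that maps onto the enumeration."""
    return Destination(int(raw))  # raises ValueError for an unknown number


def from_string(raw):
    """Option two: the file holds a string that the code compares directly."""
    if raw == "email":
        return Destination.EMAIL
    if raw == "printer":
        return Destination.PRINTER
    raise ValueError("unknown destination: " + repr(raw))
```

Either way, a bad value in the file (99, say, or "Email" with a capital E) is only discovered when the program runs and reads it. Nothing like a compiler gets the chance to reject it beforehand.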

I'm not trying to say that configuration files are wrong. They're not. There are very few programs that could survive without them. But there is a limit somewhere, and some things just aren't meant to be configured; they're meant to be coded. Clearly some of the problems I've mentioned are going to occur anyway, as user options have to be stored somewhere. I'm trying to point out that design and compile time options aren't necessarily best kept in text files.

I don't actually have a conclusion, because I don't know where the limit is. I'm open to arguments for and against configuration files, and I'm quite open to being wrong, and being told that I am.
1 March 2005