[Code Friday] The most useful advice I will ever give anyone.

Mar 19, 2010 11:13

Suppose you're designing a protocol, and you're deliberating over whether to use XML, YAML, JSON, s-expressions (!) or some other data representation format for it.

The question you need to ask yourself is, "have I written an EBNF definition for my protocol yet ( Read more... )

code friday, code, formal languages, common sense, you're welcome

steer March 19 2010, 11:45:46 UTC
Interesting -- but what does it actually gain you?

Designing the grammar first might mean your grammar is less than well suited to the particular data format you then wish to pick or, indeed, illegal in that format.

Or have I completely missed the point?

I guess, I'm just thinking, if I do the EBNF then I can, for example, immediately rule out XML unless I secretly thought "I'll use XML" then did the EBNF and then decided -- in which case the EBNF case was merely a ponderous waste.

What am I not seeing? I think I *AM* intermingling data representation and protocol structure but I cannot see how NOT to do this using EBNF.

vatine March 19 2010, 11:58:33 UTC
Data structure is part of your code, protocol grammar is part of a transport layer. It MAY be that having a clear 1:1 mapping between data structure and wire protocol is a good thing, but it may also be that it isn't. And don't force me to say "ASN.1", because I will start crying.

maradydd March 19 2010, 12:02:33 UTC
If you're not careful I'm going to start thinking of you as one of my acolytes. ;)

steer March 19 2010, 12:12:02 UTC
Perhaps questions related to specifics might make it clearer for me. I am thinking of an example where, say, I am designing the allowable layout of a config file for a simulation (read in at start up, generates parameters for simulation, somehow translated into a data structure internally). This affects data structure but is not data structure, it's a protocol.

I guess my methodology would be to pick (say) XML, design the grammar to work in XML (using one of the XML definition thingums) with an eye on the eventual data structure in the code but not genuinely dictated by that. What do I gain by adding EBNF? It seems there is a lot to lose.

m4dh4tt3r March 19 2010, 20:00:33 UTC
ASN.1 makes the baby jesus cry.

maradydd March 19 2010, 12:01:07 UTC
then I can, for example, immediately rule out XML

Why? XML can itself be defined by an EBNF. I actually think XML is a relatively decent way to represent a parse tree; tags correspond to nonterminals, text elements to terminals. You get a very nice separation between leaves and non-leaves that way. Of course the implementor then has to deal with XML, but that's his problem.
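
As a rough illustration of the mapping described above (tags for nonterminals, text for terminals), here is a minimal Python sketch. The toy grammar, the `parse_pair` helper and the `to_xml` helper are all assumptions made up for this example; nothing here comes from the original post.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy grammar, invented for this sketch (EBNF):
#   pair  = key , "=" , value ;
#   key   = letter , { letter } ;
#   value = digit , { digit } ;
# Letters and digits are approximated with str.isalpha / str.isdigit.


def parse_pair(text):
    """Parse 'name=123' into a (nonterminal, children) parse tree."""
    key, sep, value = text.partition("=")
    if not sep or not key.isalpha() or not value.isdigit():
        raise ValueError(f"not a valid pair: {text!r}")
    # The literal "=" carries no information, so it is left out of the tree.
    return ("pair", [("key", key), ("value", value)])


def to_xml(node):
    """Render a parse tree: nonterminals become tags, terminals become text."""
    name, body = node
    elem = ET.Element(name)
    if isinstance(body, str):
        elem.text = body          # leaf: terminal text
    else:
        for child in body:        # inner node: nested nonterminals
            elem.append(to_xml(child))
    return elem


tree = parse_pair("steps=100")
xml_text = ET.tostring(to_xml(tree), encoding="unicode")
print(xml_text)  # <pair><key>steps</key><value>100</value></pair>
```

The grammar is written once, and the XML is only one possible rendering of the resulting tree.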

steer March 19 2010, 12:10:12 UTC
If my EBNF grammar produces something which is valid XML then that's not a coincidence. If my EBNF grammar produces something which is not valid XML I cannot use XML unless I design a second grammar derived from the first. In this case then what was the point of designing the first grammar?

maradydd March 19 2010, 12:20:32 UTC
I'm having trouble visualizing what you're saying. What's a trivial example of an EBNF grammar that can't be transformed into valid XML?

Note that I'm not saying that the EBNF should generate XML; it can, though that somewhat defeats the purpose of what I'm asserting. Rather, I submit that any valid EBNF can be rendered as XML (and, for bonus points, you get the parse tree for free).

steer March 19 2010, 12:37:05 UTC
What's a trivial example of an EBNF grammar that can't be transformed into valid XML

Anything can be transformed into valid XML -- but if you're going to do that wouldn't it be better to have started in XML?

Your design process seems to be "design grammar", "pick parsing scheme", "design second grammar to fit parsing scheme".

Now, I think you'll agree (maybe not) that taking, say, a JSON way of doing things and moving it directly to XML will result in something "bodgy" and not quite in the spirit of XML.

In other words, the grammar you designed using EBNF would, I submit, spit out the answer as to whether XML or JSON better suits it even though that actually might be the "wrong" choice.

I suspect there is some subtlety to what you are suggesting that I'm not seeing.

ex_ben March 19 2010, 18:53:21 UTC
The point that came to my mind is that it is more important to design the abstract syntax (the concepts that make up the protocol) first than to focus too soon on the concrete syntax of how it is represented in some notation (the bits that communicate the protocol).
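
One way to picture that distinction is the sketch below, which defines a single abstract message and gives it two interchangeable concrete encodings. The `Move` message and its fields are hypothetical, chosen only for illustration.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Move:
    """Abstract syntax: what a message means, independent of any notation."""
    player: str   # hypothetical field
    column: int   # hypothetical field


# Concrete syntax 1: JSON
def to_json(move):
    return json.dumps({"player": move.player, "column": move.column})


def from_json(text):
    data = json.loads(text)
    return Move(data["player"], data["column"])


# Concrete syntax 2: XML (fields encoded as attributes)
def to_xml(move):
    elem = ET.Element("move", player=move.player, column=str(move.column))
    return ET.tostring(elem, encoding="unicode")


def from_xml(text):
    elem = ET.fromstring(text)
    return Move(elem.get("player"), int(elem.get("column")))


move = Move("steer", 3)
# Both notations round-trip to the same abstract value.
same = from_json(to_json(move)) == from_xml(to_xml(move)) == move
print(same)
```

Under these assumptions, swapping JSON for XML changes only the encode and decode functions, not the definition of the message itself.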

I don't know if that's what Meredith intended to say. But I would. :)

steer March 22 2010, 13:17:04 UTC
No problem with that. EBNF isn't an abstract syntax; it's a very concrete one.

steer March 19 2010, 12:52:18 UTC
Apologies -- I am putting this strongly because I want to understand. I respect your opinion and I want to really get what you're stating here. I hope you don't think I'm merely trying to gainsay; I believe you have a genuine reason for your recommendation which I am failing to understand well.

maradydd March 19 2010, 13:02:43 UTC
Oh, no apology needed -- I think there's a communication gap as well, and I'm sure not only that we can figure out how to resolve it, but that resolving it will help me to get my points across better to others. So this is actually a really valuable discussion for me.

However, I'm reviewing conference papers today and the deadline is tight, so I can't commit as much time to LJ today as I would prefer. I promise I'll come back to this, though.

steer March 19 2010, 13:08:07 UTC
No problem -- I should also get on with paper reviewing. :-)

medains March 19 2010, 13:22:42 UTC
If it helps with your discussion - there is a relatively simple protocol that may have suffered from the "designed it in XML" factor here...

http://jogre.sourceforge.net/protocol.htm

Gives a starting point for discussion without either of you having to make up a protocol.

I'm interested myself, I -feel- that maradydd is giving good advice, but I'm unable to come up with an example to illustrate in my lunch hour.

vatine March 19 2010, 13:09:01 UTC
A typical example would be the configuration files for GCA (GURPS Character Assistant). I am of the firm belief that you cannot write a "trivial" grammar for it and actually parse everything. It has clear marks of having grown ad-hoc from a simpler, more parseable format.

Also, the "protocol" is (or should be) your "data format". But not necessarily your data structure. More so in a configuration file than in an inter-process protocol, where your primary concern tends to be communication of state (or remote procedure calls that may or may not involve state). Extra bonus if the protocol is human-readable, makes it that much easier to read (for a hit in processing speed, but if you're using sufficiently loose constraints on hardware, you'll have to accommodate all sorts of things, like non-octet bytes and a vast variety of bit-orders).
