Comments | shevek: (no subject)

shevek

(Untitled)

Mar 06, 2012 18:06

I have yet to read a Java vs python comparison which was written by anyone with any serious enterprise development experience ( Read more... )

quercus March 7 2012, 18:11:07 UTC

I've got the experience of all three.

On the whole, I'd go with Java. This isn't because Java is better, or because Java people are better, but because Java is infinitesimally more robust against lazy code written by average coders. Bad Java is bad Java. It's harder to maintain than bad Python (bad Java tends to be verbose and over-complicated, whereas bad Python is over-simple and a bit easier to rework). However, bad Java smells worse. There's an awful lot of good practice around in how to run an effective Java shop, even if most of the suits wouldn't recognise it if it bit them. If you have an average team, and some better-than-average team leads, you can at least judge your product quality. You can know that some new Java is bad, which gives you the chance to fix both the code and its coders. Pair programming works well in Java; I've never seen it work in Python.

These days I mostly write Python. I write it on short contracts to make sticking plasters. However I'm also delivering production grade code (whether they want it or not) and it's now (after several years of Python) code that I consider to be good, robust code. Note that this doesn't look the same as good, robust Java. It's quite difficult to eyeball Python and judge it - some bad stuff looks bad, but good-looking stuff is often over-verbose and literal-minded by the standards of good Python. My Python file reader would be as robust as my Java code (and probably more so), but it would do it in one line and wouldn't look like it was having to work hard at it. Python is like tightrope walking - most of the time it just looks like some clown walking along a simple straight line, and any fool could do that. The clever bit comes in when something exceptional happens - then you see the difference between something that recovers cleanly as close as possible to the problem, and something that merely falls over. My Python is heavy with exception handling, and very often the code that handles those exceptions isn't at all obvious unless you know where to look for it. As it's not so reliant on handling every exception on the adjacent line, it's likely that more exceptions can be caught within its scope, which means in turn that I get to deliver more robust code with more trapping, for a given amount of effort / budget. I trap stuff in Python I'd never be allowed the coding time to trap in Java.
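As a rough illustration of that style (a sketch only, not the commenter's actual code; `read_lines` and the logger name are invented for the example), a small file reader can look trivial while one handler, sitting away from the line that fails, covers missing files, permission problems, directories and bad encodings:

```python
import logging

log = logging.getLogger("reader")  # hypothetical logger name


def read_lines(path, default=()):
    """Return the lines of a UTF-8 text file, or `default` if it cannot be read."""
    try:
        # The with-block closes the file even if reading fails half-way.
        with open(path, encoding="utf-8") as handle:
            return [line.rstrip("\n") for line in handle]
    except (OSError, UnicodeDecodeError) as err:
        # OSError covers missing files, permissions and "is a directory";
        # UnicodeDecodeError covers content that is not valid UTF-8.
        log.warning("could not read %s: %s", path, err)
        return list(default)
```

The happy path is the two lines inside the `with`; everything that makes it robust lives in the `except` clause, which is easy to miss when skimming.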

The inevitable canard of dynamic typing comes up. It's very difficult to judge Python modules as to how robust they are, looking from the outside. I've seen Python modules used by wrapping them in parameter validation wrappers, just to avoid the risk. While this obviously sucks and is the most un-dynamic thing you could imagine, one has to understand why someone would do it.
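A parameter validation wrapper of the kind described might be sketched like this. The decorator `validated` and the stand-in function `resize` are hypothetical names for illustration, not taken from any real module:

```python
import functools
import inspect


def validated(**expected):
    """Decorator: check named arguments against expected types before calling."""
    def decorate(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments onto parameter names.
            bound = signature.bind(*args, **kwargs)
            for name, wanted in expected.items():
                if name in bound.arguments:
                    value = bound.arguments[name]
                    if not isinstance(value, wanted):
                        raise TypeError(
                            f"{func.__name__}: {name} must be {wanted.__name__}, "
                            f"got {type(value).__name__}"
                        )
            return func(*args, **kwargs)
        return wrapper
    return decorate


# Stand-in for a third-party function that does not check its own inputs.
def resize(width, height):
    return width * height


safe_resize = validated(width=int, height=int)(resize)
```

Called directly, `resize("2", 3)` quietly returns the string `"222"`; `safe_resize("2", 3)` raises `TypeError` instead, which is the risk the wrapper is there to remove.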

Scala is the best of both worlds. Provided that you have the right sort of people - which limits you to well-funded startups where you can attract good people and don't have to fight through corporate HR first. Only at MuppetLabs was I anywhere close to that sort of environment.

So personally I like writing Python. I make good code, and I get more done.

I also know how to run a Java team and to achieve fantastic improvements in product quality (four years' work, same team size, six times as many live customer products at the end of it, and happy customers who were no longer threatening to sue us for a non-working product).

I have no idea how I'd find a big Python team, or organise them to build good code.

rhialto March 10 2012, 09:38:02 UTC

The thing with dynamic typing is that it is nice on the occasions you want it, but you don't want it everywhere, all the time. There is lots of code where you can assign some type to your data with certainty. You'd like to express that in the code, if only to aid the reader in understanding what's going on, and to help the compiler to double-check. However, to my knowledge (I'm not a Python expert), Python has no syntax to let you optionally specify the type of your data.

quercus March 14 2012, 12:11:58 UTC

Python has plenty of ability to build statically-typed code, but it's so tedious that you might as well do it in Eiffel. You can't do it by syntax alone, but the check is easy enough with isinstance().

Dynamic typing becomes useful, and more than a party trick, when you're dealing with truly dynamic data in a semi-structured context. Situations like SOAP or RDF, where the data has a structure, the structure is communicated by some meta-format (WSDL or RDFS / OWL), but you don't know this structure before execution, at the time of writing your application. A good SOAP framework, like Suds, reads the WSDL and generates a Python facade for you, using dynamically generated object classes.

It's then common for many applications that you don't even know, or need to know, the full structure of these objects. Your app merely knows certain of their behaviours (this is often something like a Dublin Core level metadata) and so it knows how to apply the "behaviours" that make sense in the app's context, no matter what the underlying object. Your app might pick up objects from one source, route them around a bit using a few properties that it's built to recognise, then spit them back down some other pipeline. It's also very easy with Python to build clean code that understands a number of potential behaviours (maybe media objects that separately carry either Dublin Core, MARC, ccREL or FOAF) and to expose some app-useful behaviour consistently as a result. Wrapping external objects up in lightweight local wrappers to flatten such variations is a good technique, and fits well with the dynamic typing notion.

A downside of dynamic typing in Python is the lightweight semantics of method naming, particularly if newbie coders have been heavy-handed with the "import *". If the names match, you get the method - even if it's utterly unrelated to what you expected.

To use the infamous duck typing example, a Duck object might work fine if you use its .quack() method, and a Goose might give a reasonable response too, but if you've imported VizComic.JohnnyFartpants you're going to get a shock.
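In code, the example comes out something like the sketch below. The classes are defined locally here (the return strings are made up), but the effect is the same as if the third had arrived through an over-eager import:

```python
class Duck:
    def quack(self):
        return "Quack"


class Goose:
    def quack(self):
        return "Honk"


class JohnnyFartpants:
    # Unrelated class that happens to share the method name.
    def quack(self):
        return "Parp"


def chorus(birds):
    # Nothing here checks what a "bird" is; any object with .quack() will do.
    return [bird.quack() for bird in birds]
```

`chorus([Duck(), Goose(), JohnnyFartpants()])` runs without complaint, because a matching method name is the only contract being checked.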

shevek March 18 2012, 10:04:11 UTC

While isinstance() is easy, it's also too late. It's like an anti-missile system which detects incoming missiles by the "Wait for bang" method.

The implication is that a strictly typed language has a harder time dealing with SOAP, RDF or WSDL? This is "not even wrong" - generating host-side code from a WSDL specification is a violation of the data/code divide in the Von Neumann machine, and lies somewhere between dangerous and highly dangerous.

A behaviour is an interface, and I'd rather have mine checked at compile time. Wrapping in a lightweight local wrapper is the adapter pattern, a classic for strong type systems, and described in the Gang of Four book as such.

I sat in the "interfaces and abstract base classes" (it's cool to call them "ABCs") talk at PyCon, and the conclusion was somewhere between "no thesis was made" and "the question was begged".

To use the duck typing example, Duck implements animal.Quack, Goose implements animal.Quack. JohnnyFartpants implements joke.Quack, and while you may get a shock, you get it privately, before you give the code to the customer.

quercus March 18 2012, 22:33:04 UTC

Type checking:

Yes, this is like an anti-missile system. There is a cost to providing it, and it keeps you safe from the wrong sort of incoming items. There is also an insignificant overhead to delaying every commercial flight in mid air to check its bona fides.

However it's not a "wait for the bang" method. We don't execute upon these incorrect types, and hope to catch some thrown exception.
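The distinction being drawn here can be sketched as follows. The function names and the ledger are invented for the example, and both versions still fail at run time rather than at compile time, so this only illustrates checking before acting versus finding out part-way through:

```python
def total_unchecked(ledger, amounts):
    """Adds as it goes; a bad item is only discovered part-way through."""
    for amount in amounts:
        ledger.append(amount + 0)  # a str raises TypeError here, after earlier appends
    return sum(ledger)


def total_checked(ledger, amounts):
    """Checks every item first, so the ledger is untouched if any item is wrong."""
    amounts = list(amounts)
    for amount in amounts:
        if not isinstance(amount, (int, float)):
            raise TypeError(f"expected a number, got {type(amount).__name__}")
    ledger.extend(amounts)
    return sum(ledger)
```

Given `[1, "2", 3]`, the unchecked version leaves `1` in the ledger before it fails; the checked version raises before touching anything.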

When it boils down from the source to the executable or VM code, then there is little to choose between what the compilers and optimisers can build. Only the most esoteric static type systems can avoid such a check of a type tag and there's no strong reason why this has to become significantly less efficient in a dynamically typed language (granted it's a burden in the source).

quercus March 18 2012, 23:19:21 UTC

Von Neumann is obsolete, at least at this level. We need something smarter and more flexible.

I've spent the last dozen years worrying about semi-structured data and metadata. A vast class of interesting problems are out there that I cannot solve by the classical application of statically-typed approaches. I cannot know the types of incoming objects because they are simply unknown or undefined at the time I choose my static types. This is either because they haven't been defined yet, or they haven't been defined "here", i.e. within visibility of my client coding effort. I need techniques that can work under this constraint, and I certainly need something better than many statically typed systems that collapse into a reference typed as no more than a pointer to "an object", with the hope of some exception trapping if it goes obviously wrong.

Yes, behaviours can be interfaces. If my statically-typed environment offers such, then that's great. So what's an interface? It's a subset of methods. It might include other checks too. If this subset granularises to the single-method level, then it's no longer conceptually different to what Python offers. As it happens, I do like interfaces - the ability to bundle a set of methods is a far better explanation of the infernal duckery nonsense.

Can I have a strongly-typed language where I can pass any type, of any class-based inheritance whatsoever, into a context that is strongly typed as requiring the support of a particular interface - and with no other implications than this, certainly not any implication as to the implementation inheritance tree? Now that's a good environment to address the metadata-handling issue.
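For what it's worth, Python releases much later than this thread (3.8 onwards) added `typing.Protocol`, which is close to what is being asked for: an interface that any class satisfies by having the right methods, with no shared base class. A minimal sketch, with `Titled`, `Book`, `Track` and `Blob` as made-up names:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Titled(Protocol):
    """Anything with a title() method, whatever it inherits from."""
    def title(self) -> str: ...


class Book:
    def title(self) -> str:
        return "Example Title"


class Track:
    def title(self) -> str:
        return "Side A"


class Blob:
    pass  # no title() method


def label(item: Titled) -> str:
    # The annotation only requires the interface, not an inheritance tree.
    return item.title().upper()
```

A static checker such as mypy would reject `label(Blob())` before the code runs. At run time, `isinstance(x, Titled)` only confirms that a `title` attribute exists, not that its signature is right.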

ABCs? They're still pretty flimsy. Good idea, but they're still heavily reliant on good coding practice, not really bringing their virtue to the level of things the compiler can just deal with for me.
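A small sketch of where that flimsiness shows, using the standard `abc` module (the class names are invented for the example): a missing method is only caught when someone instantiates the class, and `register()` will vouch for a class without checking anything at all.

```python
from abc import ABC, abstractmethod


class Quacker(ABC):
    @abstractmethod
    def quack(self):
        """Return the noise this thing makes."""


class Duck(Quacker):
    def quack(self):
        return "Quack"


class LazyDuck(Quacker):
    pass  # forgot quack(); defining the class is still accepted


class Impostor:
    pass  # no quack() and no relation to Quacker


# register() declares Impostor a Quacker without checking its methods.
Quacker.register(Impostor)
```

`LazyDuck()` raises `TypeError` only at the point of instantiation, and `isinstance(Impostor(), Quacker)` is `True` even though calling `.quack()` on it would fail.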

Adapters are a pattern from the outside view, and you can do these either statically or dynamically, with nothing much to choose between them. The difference is in the internal implementation - static approaches generally look like some vast switch statement, with a whole set of behaviours hung off each type-based branch. The dynamic approach is more granular. There's no concept of "type" (and no ducks either), there's just an attempt to access each property, with a chain of fallbacks for each way of accessing it. These operate quite independently between each property we're after. Less safe? Well if we fully understand the internal implementation of the "foreign" object (and the drawbacks of that are left as an exercise for the reader), then that's so and the implementation-class-based switch is fine and dandy. What you really need though, in a great many cases, particularly involving metadata, is a succession of "best attempt" accesses to each property on a one-by-one basis. This is done without making assumptions about foreign implementations, such that if the .title responds best through Dublin Core rather than MARC, then the .isbn will do so equally.
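The per-property fallback chain might be sketched as below. This is an illustration under assumptions: the `Record` wrapper and the attribute names (`dc_title`, `marc_245a`, `marc_020a` and so on) are hypothetical stand-ins for however a real foreign object exposes its Dublin Core or MARC fields.

```python
from types import SimpleNamespace


class Record:
    """Lightweight wrapper exposing .title and .isbn over a foreign object."""

    # For each local property, the foreign attributes to try, in order.
    _SOURCES = {
        "title": ("dc_title", "marc_245a", "foaf_name"),
        "isbn": ("dc_identifier", "marc_020a"),
    }

    def __init__(self, foreign):
        self._foreign = foreign

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for the wrapped properties.
        for candidate in self._SOURCES.get(name, ()):
            value = getattr(self._foreign, candidate, None)
            if value is not None:
                return value
        raise AttributeError(name)


# A foreign object whose title arrives as Dublin Core but whose ISBN is MARC-only.
mixed = Record(SimpleNamespace(dc_title="A Title", marc_020a="12345"))
```

Here `mixed.title` resolves through the Dublin Core name and `mixed.isbn` through the MARC one; each property walks its own chain, and nothing switches on the type of the foreign object.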