December days: Programming language improvements

Dec 26, 2014 15:10

As a follow-up to my post about my personal history of programming languages, cartesiandaemon asked me to expand on "which language(s) do you use most now (C and Python?) and what improvements would you like to see in them?"

Mostly I do indeed use C and Python these days. I'm still comfortable enough in Perl and often reach for it particularly for shorter text-processing kinds of jobs (some of which grow a bit more than I'd intended), but I really haven't kept up with work on the core language much since about Perl 5.10, so I'm not desperately qualified to talk about things I'd like to see improved. The trends I've seen there seem to be towards making it easier and more natural for people to use modern and safer techniques rather than perl4isms, things like Modern::Perl and Moose, and these seem like good things but I don't have a lot of experience to share there.

C

C is in many ways an awful language for the sorts of things it's often used for: manual memory management, no bounds checking, a standard library with a host of peculiar warts, support for closures that could be described as poor if you were feeling generous, and so on. (C++ fixes some of these at the cost of truly insane programmer-facing complexity.) Huge numbers of security vulnerabilities can be ascribed to defects in the design of C. On the other hand, it underpins so many other things, especially on Unix, that it isn't really possible or reasonable to avoid entirely, and it's pretty much the greatest common denominator for library interfaces that want to be usable from more than one language environment. (You can sometimes make C++ libraries usable from C, but you have to be pretty careful.)

GLib (and its object system add-on, GObject) is pretty pervasive these days, including in various things that otherwise have nothing to do with GNOME, and it's not at all a bad general supplemental library: better-designed and more comprehensive than much of the C standard library, and, sure, it may be a megabyte or so but you almost certainly had it installed anyway. I think for a new C project these days I would probably turn to it, unless it were particularly small or needed to function in particularly minimal environments. Its main flaw is verbosity, particularly where callbacks are involved. For this I think it's possible to do an excellent job with domain-specific languages that provide a thin wrapper over C. I recently converted performance-critical parts of a project at work from Python to Vala, and was very impressed: it made it actively enjoyable to program using GObject, which wasn't something I could really say before, and it was still possible to inspect and roughly understand the generated C code. GObject also provides very close to automatic binding generation for various other languages by way of gobject-introspection, which is extremely powerful for being able to use the right tool for individual jobs without having to commit to it for your whole project. I think that this kind of thing is probably the right path for many C projects.

For things that need to stick with plain C, it's certainly possible to incrementally evolve the facilities they're using to make things safer and easier. valgrind made quite a few waves when it was introduced a little over ten years ago: in C you often find that your memory management mistakes are only reported as a crash in some entirely different part of your program much later, and this is excruciatingly difficult to debug directly, so valgrind keeps track of absolutely everything you do with memory and tells you if it looks invalid. Compiler developers have been doing all kinds of interesting things recently such as AddressSanitizer and Undefined Behaviour Sanitizer which promise to make it easier to spot problems early, and there are all sorts of proactive hardening techniques one can use to stop bugs escaping into the rest of your system as exploits.

Library-wise, there's always more to be done. I made my own small contribution to this with libpipeline, which I'd like to see used by more C projects that invoke other programs, since it's very easy to get this kind of thing wrong.

Python

The Python world spent a long time collecting all the things that were hard to improve in the core language without breaking compatibility in some way, and batched up a lot of these into Python 3. Unfortunately the transition to Python 3 has been an extremely painful and protracted one. Despite considerable work on migration strategies such as 2to3, the changes were substantial enough that it's taken quite some time for many projects to be ported, and you can only start using Python 3 once all your dependencies have been ported and are available anywhere you might want your code to run. (In particular, the Unicode string changes are I think a significant net improvement - they're not without problems, but Python 2 was even worse for anything that might care about internationalisation - but porting to them requires going through your program and for each string-like variable in it determining whether its essential nature is to contain binary data or text. Easy enough to get right from scratch, but painful to retrofit.) The right answer for many projects is to write "bilingual" code for a while that works in both Python 2 and Python 3, which is largely possible with libraries like six to help, but this is a bit of extra cognitive load on programmers and of course not everyone cares. So I think if I got one wish here it would be for all the remaining stragglers to be magically ported to Python 3 so that we could stop caring about the old stuff and simplify things, but that's not likely to happen any time soon.
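To make the binary-versus-text decision a bit more concrete, here is a minimal sketch of the "decode at the edges" habit, written so that it should run unchanged on Python 2.7 and Python 3. It uses only the standard library rather than six, and the function name and sample data are made up for illustration.

```python
from __future__ import print_function, unicode_literals

import io


# Hypothetical helper: the name and the data below are illustrative only.
def read_names(stream, encoding="utf-8"):
    """Read raw bytes from a binary stream and return a list of text strings."""
    raw = stream.read()          # binary data: bytes on both Python 2 and 3
    text = raw.decode(encoding)  # text from here on: unicode / str
    return [line.strip() for line in text.splitlines() if line.strip()]


# What arrives from a file or socket is bytes; decode once, at the boundary.
wire = "Zo\u00eb\nBj\u00f6rn\n".encode("utf-8")
names = read_names(io.BytesIO(wire))

print(len(names[0]))              # length in characters
print(len(wire.splitlines()[0]))  # length in bytes, which is larger here
```

The point is only that each variable is unambiguously one or the other: `wire` and `raw` are binary data, `text` and `names` are text, and the conversion happens in exactly one place.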

I guess the main general improvement I can think of for Python, short of "please be faster" and various standard library warts, would be some form of partial static typing so that I don't have to rely quite so completely on test suites to defend me against my own foolishness. I hear that something like this may be underway for Python 3.5, which will be interesting.
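As a rough illustration of what partial static typing could look like, here is a sketch using the annotation syntax from the type hints proposal (PEP 484) and its typing module, assuming Python 3.5 or later. The function is invented for the example, and the annotations are not enforced when the program runs: they only help if a separate checker such as mypy is run over the code.

```python
from typing import Dict, List, Optional


# Hypothetical example function; only the annotations matter here.
def most_common_word(lines: List[str]) -> Optional[str]:
    """Return the most frequent word in lines, or None if there are no words."""
    counts = {}  # type: Dict[str, int]
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    if not counts:
        return None
    return max(counts, key=lambda w: counts[w])


result = most_common_word(["spam eggs", "spam"])
print(result)
```

Unannotated code keeps working alongside this, which is what makes the typing partial. A checker should reject a call such as most_common_word(42) before any test is run, and should also complain if the caller forgets that the result may be None.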

This post is part of my December days series. Please prompt me! This entry was originally posted at http://cjwatson.dreamwidth.org/15047.html. Please comment there using OpenID.

programming, december days
