Sunday, August 12, 2007

Responses to 'Silver Bullets Incoming!'

(((
These are the responses originally posted to "Silver Bullets Incoming". Many of these responses are worth reading in their own right, and I'd like to thank the respondents for taking the time to write such thoughtful posts.

Please do not post this to reddit, as it has already been discussed there under the original URL.
)))
  1. Stephen Says:

    Paul:

    I enjoyed your first article quite a bit - it got me thinking about technical language issues again (always fun).

    I’d like to comment on your update to the original article. Specifically, I have some comments regarding C++.

    C++ is not an “old” language: it incorporates many features of more “modern” languages, including exceptions, automatic memory management (via garbage collection libraries and RAII techniques), and templates, a language feature unique to C++ that supports generic programming and template metaprogramming, two relatively new programming paradigms. Yes, C++ has been around a while, but until I see programmers constantly exhausting the design and implementation possibilities of C++, I won’t call the language “old.”

    C++ was not designed to support just OO programming. From “Why C++ Isn’t Just An Object-Oriented Programming Language” (http://www.research.att.com/~bs/oopsla.pdf):

    “If I must apply a descriptive label, I use the phrase ‘multiparadigm language’ to describe C++.”

    Stroustrup identifies functional, object-oriented, and generic programming as the three paradigms supported by C++, and I would also include metaprogramming (via C++ templates or Lisp macros) as another paradigm, though it is not often used by most developers.

    Of course, we should also keep in mind Stroustrup’s statement regarding language comparisons (“The Design and Evolution of C++”, Bjarne Stroustrup, 1994, p. 5): “Language comparisons are rarely meaningful and even less often fair.”

    Take care, and have a good weekend!

    Stephen

  2. pongba Says:

    I find it weird that, on the one hand, you argue that Haskell is fast (to the extent that it might even be faster than a compiled language such as C++), while on the other hand you say “where correctness matters more than execution speed its fine today”.
    Does that sound paradoxical?

  3. Another Paul Says:

    Paul:

    “I think that Dijkstra had it right: a line of code is a cost, not an asset. It costs money to write, and then it costs money to maintain. The more lines you have, the more overhead you have when you come to maintain or extend the application”

    By that measure, there’s no such thing as an asset. Think about that a moment - someone buys a general ledger or CAD/CAM system and modifies it as companies do. Either system reduces staff, provides more accurate information much more quickly, and renders the company more competitive. Take it away and what happens?

    It’s been my experience that while these systems require maintenance (and sometimes a lot) they usually result in a net reduction in staff and the cost of doing business. And some types of systems provide a clear competitive edge as well. I think that makes many systems just as much an asset as a house, factory building, or a lathe.

    Interesting article. Thanks.

    Another Paul

  4. BobP Says:

    >> An order of magnitude is a factor of 10, no less

    > Well, the Wikipedia entry does say about 10. All this stuff is so approximate that anything consistently in excess of 5 is close enough.

    0.5 orders of magnitude = power(10.0,0.5) = sqrt(10.0) = 3.1623 (approx)
    1.5 orders of magnitude = power(10.0,1.5) = sqrt(1000.0) = 31.623 (approx)

    If we are rounding off, a factor of 4 is about one order of magnitude; also, a factor of 30 is about one order of magnitude.
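
    (To make the rounding concrete, here is a minimal Haskell sketch of the same arithmetic; the function name factor is just illustrative.)

        -- Factor corresponding to n orders of magnitude.
        factor :: Double -> Double
        factor n = 10 ** n

        -- factor 0.5 ~= 3.16, factor 1.0 == 10.0, factor 1.5 ~= 31.6, so any
        -- ratio between roughly 3.2 and 32 rounds to "about one order of
        -- magnitude".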

  5. Jeremy Bowers Says:

    You missed my point with Python, or at least failed to address it.

    My point wasn’t that Python is also good. My point was that you leapt from “10x improvement” to “it must be the chewy functional goodness!” But your logic falls down in the face of the fact that Python, Perl, Ruby, and a number of other non-functional languages also show a 10x improvement over C++, so it is clearly not a sound argument to just leap to the conclusion that “it must be the chewy functional goodness!” when there are clearly other factors in play.

    I’m not criticizing Haskell or functional programming, I’m criticizing your logic, and you’ve still got nothing to back it up.

    (This is par for the course for a claim of a silver bullet, though.)

  6. Sam Carter Says:

    “Libraries and languages are complicit: they affect each other in important ways. In the long run the language that makes libraries easier to write will accumulate more of them, and hence become more powerful.”

    This argument has a large flaw in it, namely that the current state of libraries doesn’t reflect this claim. The largest and most powerful collections of libraries seem to be .NET, CPAN, and the Java libraries, certainly not Lisp libraries.

    But the advocates of Lisp would argue that it’s the most powerful language, and it’s clearly been around for a long time, yet the Lisp community has not accumulated the most powerful collection of libraries. So unless the next 40 years are going to be different from the previous 40 years, you can’t really assert that language power is going to automatically lead to a rich set of libraries.

    I stand by my original comment on the previous article that programming is more about APIs and libraries than about writing your own code, and that if you are focused on measuring code-writing performance, you are measuring the wrong thing.

    I also disagree with the claim that this is unmeasurable because doing a real-world test is too expensive. As long as the project is solvable in a few programmer-weeks, you can test it out with different languages. I took a computer science class (Comp314 at Rice) where we were expected to write a web browser in 2 weeks. It wouldn’t be that hard to have a programming test which incorporated a database, a web or GUI front end, and some kind of client/server architecture, e.g. implementing a small version of Nagios, or an IM client, or some other toy application.

    I’m sorry but writing a command line application that parses a CSV file and does some fancy algorithm to simulate monkeys writing Shakespeare is about as relevant to modern software engineering as voodoo is to modern medicine.

  7. Paul Johnson Says:

    pongba:

    I’m arguing that Haskell programs are faster to *write*. Execution speed is a much more complicated issue. FP tends to lose in simple benchmarks, but big systems seem to do better in higher level languages because the higher abstraction allows more optimisation.

  8. Paul Johnson Says:

    Another Paul:

    The functionality that a package provides is an asset, but the production and maintenance of each line in that package is a cost. If you can provide the same asset with fewer lines of code then you have reduced your liabilities.

    Paul.

  9. Paul Johnson Says:

    Jeremy Bowers:

    Teasing apart what it is about Haskell and Erlang that gives them such a low line count is tricky, because it is more than the sum of its parts. One part of it is the high-level data manipulation and garbage collection that Python shares with functional languages. Another part of it is the chewy functional goodness. Another part, for Haskell at least, is the purity. OTOH, for Erlang it is the clean and simple semantics for concurrency.

    What I see in the results from the Prechelt paper is that Python was, on average, about 3 times better than C++, while the average Haskell program (from a sample of 2) was about 4 times better. Actually the longer Haskell program was mine, and I was really embarrassed when someone else showed me how much simpler it could have been.

    In terms of pure line count I have to concede that Python and Haskell don’t have a lot to choose between them. A 25% improvement isn’t that much. It’s a pity we can’t do a controlled test on a larger problem: I think that Haskell’s type system and monads are major contributors to code that is both reliable and short. Unfortunately I can’t prove it, any more than I could prove that garbage collection was a win back in the days when I was advocating Eiffel over C++.

    Paul.

  10. Paul Prescod Says:

    If you cannot “tease apart” what it is about Haskell and Erlang that makes them so productive then you cannot say that any one improvement is a silver bullet. It just feels truthy to you. Furthermore, if you are presented with counter-examples in the form of Python and Ruby then surely you must discard your thesis entirely. The best you can say is that there exist non-functional languages that are 10 times less productive than some functional languages for some projects.

    Duh.

  11. Paul Johnson Says:

    Sam Carter:

    On languages with expressive power gathering libraries: point mostly conceded, although Perl certainly is a very expressive language, so I don’t think it supports your point, and .NET has Microsoft paying its Mongolian hordes, so it’s not a fair comparison.

    There are two sorts of libraries: general purpose ones (e.g. data structures, string manipulation, file management) that get used in many applications, and vertical libraries (HTTP protocol, HTML parsing, SMTP protocol) that are only useful in specific applications. There is no hard dividing line of course, but the usefulness of a language for general purpose programming depends on the language and its general purpose libraries. The vertical libraries have a big impact on the applications that use them, but not elsewhere. So I would generally judge a language along with the general purpose libraries that are shipped with it. The special purpose libraries are useful as well, but they are a secondary consideration.

    Paul.

  12. Paul Johnson Says:

    Sam Carter (again):

    Sorry, just looked back at your post and realised I’d forgotten the second point.

    A worthwhile test is going to take about 10 versions to average out the impact of different developers. So that’s 2 weeks times 10 coders, or 20 developer-weeks: almost half a man-year per language. Say a coder is on $30K per year and the total cost of employment is three times that (which is typical). In round numbers that is $40-50K per language, so ten languages will cost the best part of half a million dollars to evaluate. Not small beer.
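
    (A rough Haskell sketch of that arithmetic, using only the round numbers above; the figures are illustrative, not real data.)

        -- Cost of evaluating one language, using the round numbers above.
        developerWeeks, annualSalary, overheadFactor :: Double
        developerWeeks = 2 * 10      -- 2 weeks each for 10 coders
        annualSalary   = 30000       -- dollars per year
        overheadFactor = 3           -- total cost of employment vs. salary

        costPerLanguage :: Double
        costPerLanguage = annualSalary * overheadFactor * (developerWeeks / 52)
        -- about $35K per language with these figures, i.e. the $40-50K
        -- ballpark above; ten languages come to several hundred thousand
        -- dollars.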

    Of course you could use students, but on average they will know Java or Javascript better than Python or Haskell, so how do you correct for that?

    Paul.

  13. pongba Says:

    “I’m arguing that Haskell programs are faster to *write*. Execution speed is a much more complicated issue. FP tends to lose in simple benchmarks, but big systems seem to do better in higher level languages because the higher abstraction allows more optimisation.”

    I always hear people saying that, but I really don’t get it.
    I know that *theoretically* abstraction (or freedom from side effects, etc.) gives more opportunity for optimization, but I have never seen anyone show real data that *really* proves it.
    One question constantly annoys me: if a higher level of abstraction allows more optimization, then why does .NET put the burden of discriminating between value types and reference types on us programmers? Shouldn’t referential transparency handle this better?

  14. Jonathon Duerig Says:

    I have two specific criticisms (and one general one) to make about your line of argumentation:

    First, I think you do not adequately address the criticisms about lines of code as a metric. The cost of a line of code is the sum of five factors: (a) Difficulty of formulating the operation involved (original coder, ×1), (b) Difficulty of translating that operation into the target programming language (original coder, ×1), (c) Difficulty of parsing the code involved to understand what the line does (maintainer, ×n), (d) Difficulty of later understanding the purpose of that operation (maintainer, ×n), and (e) Difficulty of modifying that line while keeping it consistent with the rest of the program (maintainer, ×n).

    (a) and (b) are done only once, but (c), (d), and (e) are done many times, whenever the program needs to be fixed or modified. Brooks’ argument was specifically that in the general case the time for (a) is more than 1/9 the time for (b), and the time for (d) is more than 1/9 the time for (c) and (e). This is important because (a) and (d) are both language and tool independent.

    When comparing the lines of code from different languages, it is important to realize that the formulation of the operations and the understanding of purpose are spread across those lines. And the verbosity of the language usually doesn’t make either of these any harder (unless it is extreme).

    For instance, take the creation of an iterator or enumeration in C++ or Java respectively and compare that to creating a fold function in Scheme. These are roughly equivalent tasks. In C++, an iterator is defined by first defining a class with various access operators like *, ->, ++ and --, and then implementing them. This adds a lot of baggage because there are half a dozen or so functions that must be defined and there is a separate class specification. In contrast, a Scheme fold function is much simpler from the language perspective. A single function is defined rather than half a dozen. It will almost certainly have fewer lines, possibly by 4 or 5 times.

    But let us look at what the creation of the iterator or fold function means from the perspective of items (a) and (d). Both of these are common idioms in their respective languages, so all of the code specifically dealing with iteration/folding is trivial to conceptualize and trivial to understand the purpose of. The difficulty in writing either a custom iterator or a custom fold function lies within the subtleties of the iteration. If it is a tree, what information needs to be maintained and copied to successive iterations (whether that be in the form of state, or in the form of argument passing)? Are there multiple kinds of iterations? How would they be supported? (For example, sometimes a user wants to traverse a tree in pre-order, sometimes in post-order, sometimes in-order, and sometimes level by level in a breadth-first order.) These are the questions which the original coder and the later maintainers will have to contend with. And these are really orthogonal to lines of code counts.
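
    (For concreteness, a minimal Haskell sketch of the fold side of this comparison: a binary tree with two of the traversal orders mentioned above. The Tree type and function names are made up for illustration.)

        -- A binary tree and two custom folds, the functional analogue of
        -- writing a custom iterator class in C++ or Java.
        data Tree a = Leaf | Node (Tree a) a (Tree a)

        -- Pre-order: visit the node, then the left subtree, then the right.
        foldPreOrder :: (a -> b -> b) -> b -> Tree a -> b
        foldPreOrder _ z Leaf         = z
        foldPreOrder f z (Node l x r) =
            foldPreOrder f (foldPreOrder f (f x z) l) r

        -- In-order: the left subtree, then the node, then the right subtree.
        foldInOrder :: (a -> b -> b) -> b -> Tree a -> b
        foldInOrder _ z Leaf         = z
        foldInOrder f z (Node l x r) =
            foldInOrder f (f x (foldInOrder f z l)) r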

    But there is another factor at work here which makes lines of code a faulty cross-language measurement. Every language has a grain to it. If you program with the grain, then any difficulty will be easily solved by the tools in the language. If you program against the grain, then you will run into difficulty after difficulty. This applies to fundamental language properties. You can bypass the type system in C++ and avoid all the type checks, but it is cumbersome and unpredictable if you do it wrong. Ruby allows you to be much more flexible with types and provides a safety net. If you try to enforce stricter typing in Ruby, then you will have to fight the language every step of the way.

    But the grain of the language also includes the question of scale. Some languages provide a lot of flexibility. They allow compact and loose representations of programs which can be customized to the problem domain easily. These languages include Scheme and Ruby and Haskell. These flexible languages are very useful for small projects with one or a few developers because they can be metaphorically molded to fit the hand of the person who wields them. But there is a trade-off, because they tend to be more difficult to use in large groups where it is harder for others to understand what is going on. This is a fundamental trade-off that programming languages must make. And it means that a language which is great at one end of the spectrum will likely be lousy at the other end. And this is reflected in the lines of code required for a particular scale of problem.

    My second criticism is in regard to your discussion of state. You point out that Brooks considered managing of state to be a major essential difficulty of programming and you then claim that functional languages obviate this difficulty and hypothesize this as the reason that they can be a silver bullet.

    I believe that you have misunderstood the kind of state that Brooks was referring to. He was not talking about run-time state but compile-time state. He was not talking about what variables are changed at run-time. He was talking about the interactions between components of the program. These interactions are still there, and just as complex, in functional languages as in imperative languages.

    Second, even when considering just the run-time state, the referential transparency of functional languages simplifies only the theoretical analysis of a program. A normal programmer informally reasoning about what a program does must consider how state is transformed in the same way whether a modified copy is made or a destructive write is made. It is the same kind of reasoning.

    Finally, I have seen many people talk about getting an order of magnitude improvement by finding some incredible programming tool. Functional programming is not unique in that respect. But in my experience this is more likely to be about finding a methodology that suits the person’s mindset than about finding the one true language or system. Somebody who thinks about programming in terms of a conceptual universe that changes over time will be an order of magnitude less effective in a functional environment. And somebody who thinks about programming in terms of a conceptual description of the result which is broken up into first-class functions will be an order of magnitude less effective in an imperative environment.

    I have programmed in both imperative and functional languages. I know and understand the functional idioms and have used them. My own mindset tends to the imperative, and I am a less effective programmer in functional languages. But I have seen programmers who can pull a metaphorical rabbit out of a hat while tap-dancing in them. This says to me that evangelism about functional or imperative languages is fundamentally misguided, regardless of the side.

  15. Paul Johnson Says:

    Jonathon Duerig:

    I had decided not to respond to any further comments and instead get on with my next article. But yours is long and carefully argued, so it merits a response regardless. It’s also nice to be arguing the point with someone who knows what a fold is.

    You make the point that during maintenance the difficulty of later understanding the purpose of an operation is language independent. I’m not so sure. A maintainer may suspect that a C++ iterator is truly orthogonal, but it can’t be taken for granted. There may be a bug hiding in those methods, or perhaps someone fixed a bug or worked around a problem by tweaking the semantics in an irregular way. Also a lot of the understanding of a piece of code comes from context, and it helps a lot to be able to see all the context at once (why else would 3 big monitors be a selling point for coding jobs?). So terse code makes it a lot easier to deduce context because you can see more at once.

    (Aside: I remember in my final year project at Uni going into the lab at night because then I could get two VT100s to myself).

    You say that Scheme, Ruby and Haskell can be moulded to fit the hand of the user, making them more productive for single person tasks, but less productive for groups because of mutual comprehension difficulties.

    This is hard to test because of the lack of statistics, but Haskell is strongly typed and the community has already developed conventions and tools for documentation and testing (Haddock and QuickCheck). I can see that Scheme macros can be used to construct an idiosyncratic personal language, but I really don’t see how this could happen in Haskell. Things that get done with macros in Scheme are usually done with monads in Haskell, but whereas Scheme macros are procedural, monads are declarative and must conform to mathematical laws, making them tractable. My experience with Haskell monads is that you generally build a monadic sub-language in a single module and provide libraries for it in some other modules (e.g. Parsec), and that the end result is intuitive and simple to use. But maybe I’ve only been exposed to well-designed monads.
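
    (A minimal sketch of a monadic sub-language built with Parsec; the grammar here is made up purely for illustration.)

        import Text.ParserCombinators.Parsec

        -- Comma-separated integers, e.g. "1,22,333", parsed with the
        -- Parsec monadic sub-language.
        number :: Parser Int
        number = fmap read (many1 digit)

        numberList :: Parser [Int]
        numberList = number `sepBy` char ','

        main :: IO ()
        main = print (parse numberList "(example)" "1,22,333")
        -- prints: Right [1,22,333]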

    On the subject of state and informal reasoning: personally I use whatever forms of reasoning work. In debugging a particularly complex monad I once resorted to writing out the algebraic substitutions long-hand in order to understand how the bind and return operators were interacting. It worked, and I got the monad to do what I wanted. I routinely use informal algebraic reasoning of this sort in simpler cases in order to understand what my program is doing. Any informal reasoning must be a hasty shorthand version of what a full formal proof would do, and it follows that language features that make full formal proof easier will make the informal shorthand mental reasoning easier too.
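
    (A minimal sketch of this kind of long-hand substitution, written out for the Maybe monad; the example is made up and is not the monad referred to above.)

        safeRecip :: Double -> Maybe Double
        safeRecip 0 = Nothing
        safeRecip x = Just (1 / x)

        -- Writing out the substitutions for return and (>>=):
        --
        --   return 4 >>= safeRecip
        -- = Just 4   >>= safeRecip   -- definition of return for Maybe
        -- = safeRecip 4              -- definition of (>>=) for Maybe
        -- = Just 0.25
        --
        -- which is the "left identity" monad law: return x >>= f == f x.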

    Pure functions are particularly valuable when trying to understand a large program because you don’t have to worry about the context and history of the system for each call; you just look at what the function does to its arguments. In a real sense this is as big a step forward as garbage collection, and for the same reason: any time you overwrite a value you are effectively declaring the old value to be garbage. Functional programs (at least notionally) never require you to make this decision, leaving it up to the GC and compiler to figure it out for you based on the global system context. Thus complex design patterns like Memento and Command are rendered trivial or completely obsolete.
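
    (A minimal sketch of the Memento point: with pure updates, keeping a history is just keeping the old values. The Document type and field names are made up for illustration.)

        -- With pure updates an undo history is simply a list of old values;
        -- no Memento machinery is needed.
        data Document = Document { contents :: String } deriving Show

        appendText :: String -> Document -> Document
        appendText s (Document c) = Document (c ++ s)

        type History = [Document]

        edit :: (Document -> Document) -> (Document, History) -> (Document, History)
        edit f (doc, hist) = (f doc, doc : hist)

        undo :: (Document, History) -> (Document, History)
        undo (_, prev : rest) = (prev, rest)
        undo state            = state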

    Finally you talk about the many over-hyped technologies in this industry. Yes, hype is a common problem. Those of you who think you have a silver bullet are very annoying for those of us who actually do. :-)

    Paul.

  17. Toby Says:

    Since I happened to stumble upon an actual Dijkstra citation just now, I thought I’d add it here (having read and appreciated your original post a few days ago).

    In EWD513, “Trip Report E.W. Dijkstra, Newcastle, 8-12 September 1975,” he writes,

    “The willingness to accept what is known to be wrong as if it were right was displayed very explicitly by NN4, who, as said, seems to have made up his mind many years ago. Like so many others, he expressed programmer productivity in terms of ‘number of lines of code produced’. During the discussion I pointed out that a programmer should produce solutions, and that, therefore, we should not talk about the number of lines of code produced, but the number of lines used, and that this number ought to be booked on the other side of the ledger. His answer was ‘Well, I know that it is inadequate, but it is the only thing we can measure.’. As if this undeniable fact also determines the side of the ledger….”

    That is the edited version as printed in “Selected Writings on Computing: A Personal Perspective”. The original text can be found in the EWD Archive, at http://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD513.html
