Monday, December 5, 2011

2 Misconcepts About Functional Programming (relating to context and monad)

It's interesting that almost every old and new language that claims to have some functional support has to go through the trial of its purity, like Perl, Scala, Erlang, Ruby, JavaScript and Python. Candle is no exception. Candle's last release announcement to LtU triggered some interesting discussion about whether the functional features in Candle are pure. I finally have sometime to share more about my thoughts on this issue. I'd like to talk about 2 misconcepts about functional programming. The first misconcept is that dynamic scope variables are not functional. The second misconcept is that monads are functional. I know the following writing is going to kick up the dust, but it is better than we settle without thorough thinking.

Of course, to continue the discussion, we must first define the judging criteria(s) to tell whether a programming feature is functional or not. I think everyone would agree the single most important criteria is referential transparency. Other salient features, like function composition, lazy evaluation and anonymous functions, are more prominent in functional programming, but they are not exclusive to just functional programming; imperative programming can sometimes exhibit or exploit these features as well.

1st Misconcept: Dynamic Scoped Variables are not Functional


This question arise due to the discussion whether the put statement in Candle is functional or not. Yes, by strict definition, dynamic scoped variables are not referentially transparent. But I'd argue it is still functional. I'd like to introduce a new concept here, so as to extend the scope of functional programming, a concept called contextually functional.

In other words, features in a functional programming language can be divided into two categories: purely or strictly functional and contextually functional. Strictly functional features are those obey the referential transparency rule. And a contextually functional feature is a feature that follows these rules:
  • Some API function or language construct is defined to allow user to retrieve information about the context at certain place in a program.
  • Some corresponding API function or language construct is defined to allow user to push value unto the context, which can be retrieved later by the previous API function or language construct. The pushed value must be hold constant when it is in scope. When the pushed value goes out of scope, they are popped out of the stack automatically.
  • The pushed value can shadow or occlude previous values of same identity. This is very important. Certain features like global variables, current date and time function (like in XQuery), data input functions are contextual information, but as they cannot be occluded, they can still be classified as strictly functional.
Some contextual functional features that I can think of are:
  • Dynamical scoped variable, as in Common Lisp, Candle and XSLT (which is named as tunneled parameters);
  • The context item in XPath/XQuery and current item in XSLT. Most people take for granted that XSLT and XQuery are purely functional. They are not.
  • Closure.
  • 'this' and 'super' accessors in functional OOP.
I won't argue that contextual functional features are RT (referentially transparent). As pointed out in the LtU discussion, the Wikipedia article is wrong or inaccurate in classifying dynamic scoped variable as RT. But I'd argue that they are still functional. The justification I can put forward is that a program containing contextual functional features can be transformed into a program containing just pure RT functions. The brute force way of transformation is to pass all context information as parameters to all functions in the program.

As can be seen from the LtU discussion, some people might not accept that as valid proof. They would argue that even procedural program can be transformed into functional program. But that's a misconception, which I'll elaborate later. The truth is that a procedural program with external side effects cannot be transformed into pure functional program. By this proof, I believe, Candle's put statement, even in its earlier version put expr = value { ... } is still functional. In the latest release, I change the put statement to put qname = value { ... }, not because I think the earlier version would render Candle impure, but because I think that the later version is more friendly to static analysis.

But I do acknowledge the concern that contextual functional features, if misused, can make a program look less functional. For example, say if a language goes to the extreme to allow any addressable data to be occluded, then a programmer can essentially program in a procedural style by occluding any data when he wants an assignment and passing on the continuation.

But when contextual functional features are carefully defined, we can come up with idiomatic usages, and our programs will not look any less functional then pure RT programs. Just look at XQuery and XSLT. We accept them as 'pure' functional programming languages without second thoughts.

Programming is both science and engineering. When used in science, which often deals with complicated problems, it is often enough to just use RT functions. But when used in engineering and real life programming, which often deals with complex problems, there is often a lot of context information. In such application system, it is often cumbersome to pass all necessary contextual information to a function. Contextual functional features are very important in making these large and complex application systems more modular and maintainable. The price that we have to pay is that contextual functions are harder to analyze and process than pure functions, for example, in caching the output of a function.

As programming evolves, I'm sure we'll see more contextual functional features. CSS (cascading style sheet) and SVG drawing context are good examples of functional contextual dependency in complex application modeling. And we can come up with new idiomatic usages when we selectively allow some context information to be shadowed. For example, if we allow the current date-time to be shadowed, then we can emulate a past or future time, which can be very useful in automatic testing. If we allow the OS specific information to be shadowed, then we may emulate a UNIX environment on a Windows platform.

If we restrict functional programming to just mean RT programming, then we lose all those rich contextual programming features.

2nd Misconcept: Monads are Functional


Monad is probably one of the mostly misunderstood programming feature. I myself have attempted very hard to understand it by carefully studying different tutorials for over 10 times at least, and I think I still just have elementary understanding of it.

Most people think monad is a 'functional' programming feature. But this really depends on how we define functional. If it is defined to be RT, as above, then I claim that monads are not functional when they are used to allow functional programming languages to carry out side effects.

Monad itself is just a way of computation. It transform a program into another program. Monad itself does not introduce additional side-effects. But if the result program is procedural, I cannot accept that monad used in such way can be called functional.

Monad doesn't have the magic power of transforming a procedural program into a purely functional program, as some might have mis-believed. This misconcept might be related to the misconcepts that "all programs (whether functional or procedural) are essentially the same" based on the Church-Turing thesis (for example in this LtU post). But this is wrong interpretation of the thesis. I don't think the thesis is actually of any relevance in judging functional programming, because the effective method in the thesis can be functional or non-functional (although the descriptions sound like functional computation). So we can put the thesis aside.

Here, I'll put it very clearly that pure functional program is distinctively different from procedural program. The fundamental assumption of pure functional program is that the inputs to a function must not change during the evaluation of that function. If the inputs might change, then it loses RT and all other fancy functional features. One obvious derivation of the assumption is that the output of a pure function must not overlap its input. Thus a pure function can never to used to directly perform the external side-effects.

A pure function can be used to compute the desired side-effects, but that is just a pending list. The actual side-effects have to be carried out by a procedural program. One good example to understand this is the XQuery Update Facility. This extension to XQuery only generates pending update lists. And the lists can be further processed (merged, validated) in a functional manner. The final 'blast' is left to the underlying engine. So XQuery Update can be considered purely functional, but the final 'blast' is not.

If a pure functional program can never carried out effective side effects, then a true procedural program (which performs side effects) can never be transformed into a pure functional program. Of course, there are many procedural programs out there that can be completely replaced by pure functional program, for example those programs that perform static analysis, report generation and program compilation. But it doesn't mean all procedural programs can be replaced by pure functional programs.

This misconcept is also probably due to the way monad is advocated in Haskell. People advocate Haskell as a pure functional language even when monads are used to perform side effects, which I don't accept. I think the classification of functional languages into pure and impure (e.g. in this wiki article) is simply wrong. You can say XQuery or XSLT are purely functional, but if you say Haskell is still purely functional when it is used to perform side effects, that is simply cheating. When you start mixing black with white, you get all shades of gray; you can't classify a gray color as white, no matter how light it is.

Furthermore, I don't think Haskell's monads when used to perform side effects are in any way superior than other ways of mixing functional and procedural programming. The model to carry out side-effects by monads is different from XQuery Update. XQuery Update only produces pending update lists, and the update commands in these PULs are unordered, and the PULs are mergable. But the side-effects performed by monads are threaded just like procedural program. They can't be reordered or merged. When you use monads to perform side effects, you are just writing a so called 'functional' program to generate a traditional procedural program. If that can be called 'pure' functional programming, then any C program can be called pure functional as well.

I don't doubt monads usefulness. I think it is a powerful meta-programming mechanism. But when monads are use to mimic every ways of procedural programming, like I/O monad, state monad, async monad, concurrent monad, I don't see any extra values that they bring into programming than the procedural counterparts. They are more like the Emperor's New Clothes, sophisticated enough that whoever don't praise them looks like a fool.

In the next blog article, I'll talk about the procedural design in Candle. The mechanisms in Candle to allow procedural programming to be cleanly mixed with functional programming.

(Update: I've posted a new blog article in response to the comments and feedback.)


  1. I can understand your reasoning here, and agree somewhat with *what* you say (the ideas) but I disagree with *how* you have said it (the terminology). Normally I wouldn't pick up on problems with terminology, but since that's part of the reason for the blog post I think it's justified.

    Dynamic scope: My problem here is that it seems you're moving the goalposts of "functional programming" such that it includes Candle's non-functional aspects. I think that it would be simpler just to state that Candle does contain some non-functional aspects, but these are invaluable when programming real-world systems. It doesn't matter if a language is or is not purely functional; it matters whether it is good at its job. If these features make Candle better at its job then they are good features to have, but that still doesn't make them (purely) functional.

    Monads: There's more to monads than IO! Your argument is sort-of valid when it comes to (in the case of Haskell) a "main" function in the IO monad, but that's only because *in that particular case* there is an implied expectation that it will be run by a procedural program, as you say. However the actual *concept* of monads makes has no such expectation, and it's perfectly possible to use all kinds of monads in a purely functional way. For example, a library which makes use of the Maybe monad is still purely functional (libraries don't have a main function). There is an argument against this, namely that libraries only matter in the context of programs (main functions), but that would also apply to every other pure function and there would be no reason to single out monads over, say, addition.

  2. Henry Luo wrote:
    "imperative programming can sometimes exhibit or exploit these features as well"

    The functional programming features you mentioned are not what distinguishes imperative vs. declarative programming.

    I explained at the above link, that I think you (and others) are conflating imperative vs. declarative with functional programming.

    Afaics, the axis of your taxonomy is incorrect. Declarative vs. imperative is only concerned with the immutability of stored values, or more generally the referential transparency (RT) of the functions (sub-expressions) invoked in the declarative expression.

    As the link below confirms, if all the expressions inside a function are declarative, then the containing function is RT. But it is also possible for the containing function to RT, even if the contained expressions are not declarative and only mutate _LOCAL_ state.

    I also explained at the first link above, that monadic composition is _always_ functional and declarative, although the IO monad is arguably not w.r.t. state external to the program (i.e. the World).

    Ehud Lamm censored me from LtU so I can't post there, but the following comment is incorrect. The "within the monad" is declarative RT functional composition. Although the order of execution is sequential because the functions are nested, this does remove the RT in any way at any scope.

    The author of the comment linked below (dmbarbour) is conflating how in C the expressions are not usually RT w.r.t. to the state in the program.

    It is wrong to define imperative as "order of execution dependent" and it is wrong to conflate RT with imperative. (Computer) Science requires we have precise definitions and do not conflate the terms.


  3. typo change "this does remove the RT" to "this does not remove the RT".

  4. shelby wrote:
    "The author of the comment linked below (dmbarbour) is conflating how in C the expressions are not usually RT w.r.t. to the state in the program."

    And the key point is that when C expression are not RT, then it is impossible to encapsulate them in RT functions in the 'environment passing style'. So that is why dmbarbour's analogy is not valid.

  5. Hi Warbo,

    I'm not trying to move goalposts of "functional programming" just for Candle, but to help clarify what is "functional programming". I found it funny that people accept languages like XQuery/XSLT as pure functional language, but don't accept dynamic scope variable as a functional feature. That's simply self-contradictory.

    Regarding the concept of 'contextual functional', I think it is more than just an idea or terminology, there can be a formal mathematical model or type theory behind it. However, I'm not an academic guru to formulate it. It can help classify a set of programming features that are not RT, but also distinctively different from procedural, destructive state-mutating programming features.

    If we classify these non-RT features as 'contextual procedural features', I think that would be even less appropriate and against most people's intuitive feeling.

  6. shelby wrote:
    >> "The author of the comment linked below
    >> (dmbarbour) is conflating how in C the
    >> expressions are not usually RT w.r.t. to the
    >> state in the program."
    > And the key point is that when C expression
    > are not RT, then it is impossible to
    > encapsulate them in RT functions in the
    > 'environment passing style'. So that is why
    > dmbarbour's analogy is not valid.

    The difference is that each C statement mutates the global state, whereas "inside" the state monad each corresponding statement generates a _COPY_ of the state (so modified). The latter is RT-- former is not. Monadic composition is always RT (referentially transparent).

  7. Thanks for all your detailed comments. I've posted a new blog article in response.