Saturday, December 17, 2022

My Dinner with ChatGPT

I have nothing original to say about OpenAI's ChatGPT. I'll say it anyway.

Calculating and Reasoning

ChatGPT will readily tell you that it is not a calculator and you shouldn't trust it for calculations. What's interesting about my interaction on this is that the problems that showed up were not with the accuracy of the calculations, but with the choice of what to calculate. For a language model, it gets the actual numbers right remarkably well (your mileage—or other units—may vary).

A Classic "Back of the Envelope" Question

I asked ChatGPT how many ping pong balls would fit in a 747. The first response was to chide me for asking such a thing, and advising me that it would be a bad idea to fill a 747 with ping pong balls. I explained that it was a thought experiment, and it allowed me to proceed.

It began with an estimate of the volume of a 747 as about 15,000 cubic meters. I didn't find an actual number in a quick search online, but based on the 747 dimensions I found that seems like it's in the right ballpark.

Next it told me that a ping pong ball has a diameter of 2.7 centimeters. This is way off. A ping pong ball's diameter is actually 4 centimeters (but it seems it used to be 3.8 centimeters). How did it come up with that number? Here's a scary fact: a ping pong ball's mass is 2.7 grams. Oh, boy.

The bot proceeded to correctly calculate the volume of a 2.7 cm sphere in cubic meters and divide 15,000 by that. So far, so good, but it left out an important consideration.

I prompted it, pointing out that spherical ping pong balls can't be packed with perfect density. It agreed, described two different ways of packing spheres (cubic and hexagonal), correctly reported the density for hexagonal packing as about 0.74, and proceeded to redo its calculation by multiplying 15,000 cubic meters by 0.74. In doing so, it effectively made every ping pong ball take up a full cubic meter, and it reported the new final number without regard for the huge difference between its original estimate and the revised one.

Interesting.
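For comparison, here is a minimal Python sketch of the arithmetic the question calls for. It uses the 15,000 cubic meter estimate and the 0.74 packing density from the chat; the function name and its parameters are my own, purely for illustration. The packing density scales the usable volume, and the result still has to be divided by the volume of a single ball.

```python
import math

def balls_that_fit(container_m3, ball_diameter_cm, packing_density=1.0):
    """Estimate how many spheres fit in a container of the given volume."""
    radius_m = ball_diameter_cm / 100 / 2              # centimeters -> meters, diameter -> radius
    ball_m3 = (4 / 3) * math.pi * radius_m ** 3        # volume of one sphere
    return container_m3 * packing_density / ball_m3

# The bot's first answer: 2.7 cm balls, perfect packing (about 1.5 billion).
naive = balls_that_fit(15_000, 2.7)

# A more defensible estimate: 4 cm balls at hexagonal packing density (about 330 million).
packed = balls_that_fit(15_000, 4.0, packing_density=0.74)

# What the revised answer amounted to: the division by ball volume went missing.
missing_division = 15_000 * 0.74

print(f"{naive:.3g} {packed:.3g} {missing_division:.3g}")
```

The first two numbers differ by a factor of four or so; the third is off by several orders of magnitude, which is the kind of jump that ought to prompt a second look.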

Confidence and Servility

You can tell that ChatGPT was trained on the Internet, because it never expresses any uncertainty or doubt. Yet it has also been trained to accept correction with extreme humility. The experience is a little bizarre.

Dressage

I had a brief chat about the sport of dressage. In the course of the chat, ChatGPT told me that the USDF (US Dressage Federation) was the National Governing Body for dressage in the US. I honestly can't recall whether that was completely spontaneous or a logical response to something I said, but I definitely did not ask that specifically.

I said I thought that the USEF (US Equestrian Federation) was the NGB for dressage. The bot immediately agreed, apologizing for its mistake. This is disconcerting. It just feels unnatural for a conversational partner to assert a fact with complete confidence and then turn around and admit it was entirely mistaken with total equanimity. "Bygones."

SQL "QUALIFY" and "HAVING"

Bear with me here, I won't make you learn SQL. I asked ChatGPT about "QUALIFY" clauses in SQL, because I only learned about them relatively recently and I'm still trying to get the hang of how to use them and when they are useful. After I asked the bot for an example, it gave me one that could have been done using the more familiar HAVING clause. So, I said "show me an example of using QUALIFY where HAVING would not work". And it did.

The example it gave did use "GROUP BY", though, which is used with "HAVING", so I said, "Are there cases where I would use QUALIFY and not have any GROUP BY?" It proceeded to show me a query written with just QUALIFY, and another version written with GROUP BY and HAVING. Then it said the fateful text:

Both of these queries would produce the same result, but the first query uses the QUALIFY clause without a GROUP BY clause, whereas the second query uses the GROUP BY and HAVING clauses.

But that wasn't true. I asked:

Are these two queries equivalent? Can't the first one have multiple rows with the same salesperson and product?

And, without hesitation, ChatGPT said:

The two queries are not equivalent because they produce different results.

It went on to explain exactly why the two queries produce different results. And that's where I'm stuck: ChatGPT will happily explain to you exactly how its earlier, confidently presented answer was wrong, yet it isn't able to figure out that the answer is wrong before you prompt it.
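To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module. These are not the queries from the chat: the table, the data, and the threshold are all made up for illustration. SQLite has no QUALIFY clause, so the sketch emulates it by filtering on a window function in a subquery (this assumes a SQLite build recent enough to support window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (salesperson TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("ann", "widget", 80), ("ann", "widget", 70), ("bob", "widget", 50)],
)

# Emulated QUALIFY: in a dialect with QUALIFY this would be roughly
#   SELECT ... FROM sales
#   QUALIFY SUM(amount) OVER (PARTITION BY salesperson, product) > 100
# It keeps every individual row whose partition total passes the filter.
qualify_rows = conn.execute("""
    SELECT salesperson, product, amount
    FROM (
        SELECT salesperson, product, amount,
               SUM(amount) OVER (PARTITION BY salesperson, product) AS total
        FROM sales
    )
    WHERE total > 100
    ORDER BY amount
""").fetchall()

# GROUP BY / HAVING: collapses each group to a single row before filtering.
having_rows = conn.execute("""
    SELECT salesperson, product, SUM(amount)
    FROM sales
    GROUP BY salesperson, product
    HAVING SUM(amount) > 100
""").fetchall()

print(qualify_rows)   # two rows for the same salesperson and product
print(having_rows)    # one aggregated row
```

The window-function version returns one row per original sale, while the grouped version returns one row per salesperson and product, so the two are only equivalent when every group happens to contain a single row.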

ChatGPT Is Not Intelligent

People much smarter than I am keep pointing out that ChatGPT is manipulating words, not concepts. We linguistically-oriented humans think being good with language is a distinguishing feature of intelligence. That's probably because we've no previous experience with something so good with words and bad with ideas.

Just above I said the bot isn't able to figure out on its own that the answer is wrong, and one reason, I think, is that ChatGPT lacks any sort of "executive function" that introspects and asks, "Is what I'm saying right, or does it just sound right?" And that—not the threat to professional writers, student essays, or junior programmers—is the truly alarming problem the technology poses in its current form.

Thursday, November 02, 2017

Tweets2Vec: Notes from a Toy Project with Word2Vec

I've been trying to educate myself about machine learning because it's all the rage. Also, understanding it on some level seems essential to grasp what the humongous Internet companies are doing to our lives.

Word2Vec has gotten some attention because it seems to extract sophisticated semantic information from a document collection just by looking at word usage in context. Synonyms are its most basic result, found by looking for the nearest neighbors of the vectors representing words (based on cosine similarity of the vectors).
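As a rough illustration of the nearest-neighbor idea (this is not how Spark implements it, and the three-dimensional vectors are invented for the example), cosine similarity and a brute-force neighbor search fit in a few lines of Python:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(word, vectors, k=3):
    """Return the k words whose vectors are most similar to `word`'s vector."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Made-up toy vectors; a real model would have hundreds of dimensions.
toy_vectors = {
    "fake": [0.9, 0.1, 0.0],
    "media": [0.8, 0.2, 0.1],
    "fema": [0.0, 0.1, 0.9],
}

print(nearest_neighbors("fake", toy_vectors, k=1))
```

With these toy values, "media" comes out as the closest neighbor of "fake" because their vectors point in nearly the same direction.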

What, I wondered, might Word2Vec tell me about the Tweeter-in-Chief? I had a pretty good idea, to be honest: next to nothing, because Word2Vec relies on context and lots of data. But that wasn't really the point of the exercise. The point was to use the tools as if I were doing something real.

I found a data collection of the tweets issued by the TiC since election day, which turns out to be not really very much data, and I started my journey.

The Setup

I'm focusing on Spark for data exploration these days. And I wanted to do this in a notebook, because Data Science! I've been making some contributions to the BeakerX project, which evolved from a standalone notebook system to a set of kernels and extensions for Jupyter. So, Spark's ML lib in a Jupyter/BeakerX Scala notebook it is.

The BeakerX Scala kernel doesn't have built-in Spark integration, but it does have a command to automatically download libraries (and their dependencies) and put them on the classpath. Specifying the spark-core, spark-sql, and spark-mllib libraries does the trick. Almost. More about that soon.

Reading and Preprocessing the Data

The data was nicely prepared in JSON, suitable for reading in with Spark's JSON reader. And the data provider had included information that the data contained many duplicates, an artifact of the way it was collected from Twitter. So, the first step was to remove duplicates and extract a single text field.

Next comes tokenization. I opted for a regular expression tokenizer that split on non-word characters. That means my tokens did not include at-signs: nytimes and @nytimes are the same token. It seemed like the right thing, but writing this now I wonder if I should test its impact.
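A small Python sketch of the same idea (not the Spark tokenizer itself; the lowercasing step and the function name are my assumptions) shows how splitting on non-word characters drops the at-sign:

```python
import re

def tokenize(text):
    """Lowercase the text and split on runs of non-word characters."""
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]

print(tokenize("Thank you @nytimes!"))
print(tokenize("nytimes") == tokenize("@nytimes"))
```

The empty-string filter matters because a leading or trailing separator (like the "@" or the "!") otherwise leaves an empty token at the edge of the list.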

To Stop or Not To Stop

Most text analysis removes stop words—common words that occur so frequently that they usually don't help differentiate documents. Word2Vec relies on context, though. Should I remove stop words or not? I went to the web to try to answer that question. I didn't find an authoritative answer, but the balance seemed to be toward removing stop words. So I did.
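Stop word removal itself is just a filter over the token list. A minimal Python sketch, with a deliberately tiny stop word set that I made up for the example (real lists run to a hundred or more words):

```python
# Hypothetical, deliberately short stop word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop word set."""
    return [tok for tok in tokens if tok not in stop_words]

print(remove_stop_words(["the", "media", "is", "fake"]))
```

The catch for Word2Vec is visible even here: after filtering, "media" and "fake" become adjacent, so removing stop words changes which words count as each other's context.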

Library Issues

Now I'm ready to run Word2Vec on my prepared data. But no. I'm getting a missing class at runtime. It turns out that there is a bug in the handling of transitive dependencies for libraries, apparently inherited from Ivy. It takes a while to figure out the problem: I need to manually add the arpack dependency because the dependency resolver is loading the arpack source jar instead of the binary jar. (I was lucky to find a bug report against SBT that mentioned this problem.) I also notice, after the missing library is resolved, a complaint about not finding native libraries for the linear algebra routines, which isn't fatal but hurts performance. The solution for that is at least documented.

Word2Vec finally runs, producing a model that I can query.

What Did I Get?

Word2VecModel supports a couple of handy built-in queries that save the trouble of feeding the results into a separate step to find nearest neighbors with cosine similarity. So, here's where I get to use the BeakerX notebook's table display functionality. I generated two tables. The first table iterates over the vocabulary and shows me, for each word, its three closest synonyms. The second table shows me just the closest synonym, but it includes the score. That makes it easy to sort by score and see the best matches. Which is where things get weird.

Tickets to the White House?

One of the top pairings is "tickets" and "whitehouse". Huh? I look at the tweets containing "tickets", and they are all promoting post-election campaign rallies. They all have a link to a website for tickets. The tweets containing "whitehouse" actually contain "@whitehouse" and they seem to be mentions/retweets of material published by the @whitehouse feed, also with links. So it turns out that the shared context is the tokens generated by a URL: "https", "t", "co", followed by a unique string.

Revisiting the data cleaning process, I figure the best thing to do is just remove all the URLs from the text before tokenization. So I add a regular expression substitution to map from text to text without URLs. Works great, except that I'm still seeing some URLs.

Your Data Is Never as Clean as You Think

I was looking for "https://t.co/\w+" in my regular expression. But, surprise, some of the URLs actually look like "https://t.c…". It seems that the original data extraction somehow truncated the tweets. I don't know exactly how or why. I fix the URL regular expression so that anything that starts with "https:" and has no spaces up to a "…" will also get removed from the input. I decide not to worry about the other truncations, I just want those URLs gone.
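Here is a Python sketch of the kind of substitution described above. The exact pattern is my reconstruction from the description, not the expression from the notebook, and the sample tweets are invented:

```python
import re

# Try the truncated form first: "https:" followed by non-space characters
# ending in an ellipsis. Otherwise fall back to a complete t.co short link.
URL_PATTERN = re.compile(r"https:\S*…|https://t\.co/\w+")

def strip_urls(text):
    """Remove complete and truncated https links from a tweet's text."""
    return URL_PATTERN.sub("", text)

print(repr(strip_urls("Tickets: https://t.co/AbC123 see you there")))
print(repr(strip_urls("Join us https://t.c…")))
```

Putting the truncated alternative first means a link cut off partway through its unique string is removed along with its trailing ellipsis, rather than leaving a stray "…" behind. Surrounding whitespace is left alone, since the tokenizer splits on it anyway.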

Results

As I expected, there weren't any surprises, and precious little that was even what you might expect. The interesting close synonyms are things like "repeal" and "border" (linked by both being legislative priorities?), and "state" and "alabama" (clearly semantically linked).

Looking past the closest match to the second or third, there are a few fun things from looking at interesting words.

"obamacare" is close to "repeal" and "healthcare". The closest match to "russia" is "hillary", and "russians" is close to "phony". "democrat" is close to "republican". "foxnews" is close to "seanhannity". "nytimes" is close to "media" and the closest match to "times" is "dishonest". "cnn" is close to "russia". "fake" is close to "media". "hurricaneharvey" (presumably a hashtag) is close to "fema". "god" is close to "military".

Somehow the closest match for "obama" is "ivankatrump". The latter is clearly an @ reference; it looks like it's because of retweets of Ivanka talking about the Trump administration overlapping with direct remarks about the Obama administration. "Administration" seems to be the common context, from my eyeballing it.

Conclusions

It's always educational to get hands-on and end-to-end with software, even for a toy project. A lot of the value comes from working through all the little things that crop up, since nothing ever works exactly as anticipated. Applying tools that really require a large dataset to a tiny one is not going to result in amazing insights, of course. Even if it didn't yield particularly interesting results, I think it was worth doing.

Wednesday, June 25, 2014

A Couple of Words about Swift

It seems everyone has something to say about Swift, the new language for developing on Apple’s Mac and iOS platforms.

Things like: “Do we really need another programming language?” Or, “Clearly, Swift borrowed feature X from language Y.” Or, “Boy, not including feature Z really cripples Swift.”

Am I any more qualified than the rest of the rabble to comment? No. But I won’t let that stop me.

Swift is an Amazing Achievement

Consider Objective-C, the language in which the Cocoa frameworks are written. At birth, Objective-C most closely resembled Smalltalk-80 glued to a C compiler. The syntax for message sends was clearly borrowed from Smalltalk, and the extreme late-binding dynamic behavior was, too. Over time, Objective-C acquired a little bit of static typing, but it was mostly hints, not guarantees that a compiler could (or should) rely on. At bottom, every message send went through dynamic dispatch (objc_msgSend).

Objective-C, some joked, combined the type-safety of C with the performance of Smalltalk.

Objective-C was the big loser in the battle for a mainstream language that extended C with object-oriented capabilities. (C++ won that battle, but it ultimately lost the war to Java.) Developers knew that Objective-C was an Apple-only ghetto. But they also knew that the language was not as important as the frameworks. Remember, all of NextStep Cocoa was written in Objective-C. So, they put up with the limitations of Objective-C, but something was happening in the world.

After a (Java-induced?) lull, people started paying more attention to programming language alternatives. C++ templates and Java generics made strict compile-time typing the norm, and programmers began to get used to the idea of having the compiler eliminate a large class of errors. Compile-time type guarantees also eliminate most run-time checks, yielding performance improvements. Type inferencing started making inroads into mainstream languages, so that typing declarations didn't require so much typing on the keyboard. Lambda expressions, first-class functions, and closures also started to make their way into the mainstream from the functional programming community (one area where Objective-C’s blocks were slightly ahead of the game).

The end result of this was, I think, a renewed interest in languages that combine expressiveness with static type safety. And Apple was at risk of having developers feel a pull away from the Cocoa platform because the Objective-C language was starting to seem behind the times. Objective-C was not as easy or as fun to use as the newer alternatives.

A New Language, or a New Language?

Apple’s challenge, then, was to introduce a new language that could be used with the existing Cocoa frameworks. And that, I claim, is exactly why no existing language—no matter how wonderful—could fill the need.

Watch the WWDC 2014 presentations on Integrating Swift with Objective-C and Swift Interoperability in Depth. Now ask yourself, for every existing candidate language, could it have been adapted to interoperate with Objective-C the way that Swift does? No. Could any of them have been tweaked a bit? Maybe, but at what cost? Don't forget that many of these languages are relying on an underlying virtual machine with a very specific memory model, and once that is changed, much of the existing implementation cannot be reused.

The constraints of the environment make a new language a better choice, a choice with fewer compromises.

So, yeah, Swift borrows liberally from “Rust, Haskell, Ruby, Python, C#, CLU, and far too many others to list.” But it couldn't be any of those, for the most basic of reasons, that requirement of interoperability.

The Good, the Bad, and the Ugly

Are you ready for a subjective laundry list? Let’s do it!

The Good

Interactive REPL
Semicolons? We don't need no semicolons!
Unicode
Immutable and mutable bindings (let and var)
Type inferencing (can handle recursion in some cases that Scala can’t, it would seem)
String interpolation
Array and dictionary literals
Pattern-matching!
Closures and first-class functions
Tuples
Explicit override
Optional
Generics with type constraints
Protocols
Extensions
Arithmetic overflow detection
Mandatory named parameters (sometimes, you don’t want people to have the option of screwing up)
Trailing closures, I think. I’m not entirely certain whether this is the right approach; but it does seem useable.

The Bad

Immutable arrays aren’t. This is documented behavior, but it may be changed. (Update: It was changed.)
Documentation is vague on when/whether dictionary keys are immutable.
Optional is baked in as a special case. Don’t expect to use the same syntax for anything you might conjure up yourself.
Mutable structs that are always passed by value. How can this be a good idea?
It seems that genericity and protocols don’t play well together. I am not sure how well I understand the issue, to be honest, since I have not actually written any Swift code.
~~The two-dot/three-dot half-open/closed range syntax is a horrible mistake.~~ (Update: This has been changed to use ..< and ..., which takes it out of Bad but maybe still leaves it in Ugly.)
Mutable strings
Generic type constraints are limited, with …
No explicit covariance/contravariance for generic type parameters
Not really functional, although the author of that piece is not criticizing, just observing

The Ugly

The test-optional-and-bind syntax (if let) is pretty ugly to my eye. For that matter, so is the pattern matching bind (case let). My brain does not want to see that let in the middle of a statement.
I just don’t think switch is the best name for the pattern matching statement. Pattern matching is so different from what a C switch does. (Perhaps a case of trying too hard to appear familiar.)
External names for parameters. I don’t object strongly to the idea, but the syntax leaves something to be desired.
String mutability is determined by the kind of binding? That is weird and confusing.
C-style for loops are probably unnecessary and may encourage bad style.
Labeled statements and directed break/continue. Bleah.
var parameters. Why bother?
inout parameters probably exist only for compatibility with existing code. A necessary evil, I suppose.
The keyword in for separating closure header from body grates on me.
Using static/class modifiers instead of having a single keyword for both cases.

Will It Fly? (sorry)

Contributions on Apple’s developer forums from the Swift team make it clear that Swift is a work in progress. The Swift creators are interested in feedback and they want to improve the language.

Swift could well achieve one of its open secret goals: to become a teaching language. (The benefit for Apple is obvious.) The playground feature is pretty nifty, and I can easily imagine wanting to use it if I were introducing some basic programming concepts.

Would Swift spread beyond the Cocoa domain if the implementation were opened up? That’s hard to say. As a language, it does not seem compelling compared to my current favorite, Scala. The native code compilation has some appeal, but the JVM has a pretty good JIT these days. The reference counting storage management could be a big deal in certain applications, but the developer overhead in manually managing cycles is hard to justify outside of narrow domains.

For its intended use, however, I would say Swift is a pretty good bet to be successful, and probably make a lot of Cocoa programmers more productive.

Wednesday, January 29, 2014

Scala: What’s not to like?

What if you like object-oriented programming and you like functional programming and you like having an interactive interface and you like the Java ecosystem and you like static typing (but you wish the compiler wouldn’t make you be explicit all the time about type declarations)?

Scala lives at the intersection of all those things and, after spending some time with it, I have to say it’s pretty good.

Maybe I’m just weird, but my initial reaction to Scala was that it felt enough like Python to make me happy. I like Python, but I’ve never tried to use it for anything big—it doesn't feel like I would be comfortable trying to move around in a lot of Python code. Scala mostly succeeds at feeling simple enough to start diving in scripting-style while having the muscle to build anything you might want a “systems language” for.

Things I Like

A type system that works

I still recall when I first learned that Java's type system made arrays covariant, so that a program could pass the static type checker, not do any explicit dynamic type casting, and still generate run-time type exceptions. Yay (ironic). It took years for Java to add generics (parameterized types), and by that time the compatibility concerns meant that it was too late to do things right. So, Java has static types with generics, but they're not as flexible, powerful, or reliable as you might like.

The type inferencer is good enough that most declarations are for humans, not the compiler. (There are exceptions, of course. The most common is that the compiler needs an explicit return type for a recursive function.)

The truth is that the time is long past when I could read a paper about F-bounded polymorphism and imagine that I understood it. But the Scala type system seems to work in the real world. Yes, sometimes the error messages are not as helpful as I'd like. But I'll take that in exchange for what I get in return.

Interactive read-eval-print

Is this really that important? I think so. All programming is exploratory to some degree, and all exploration is interactive. Being able to type in a few lines and see if they produce the result you expect is invaluable.

Value classes

You want zero-overhead wrappers for primitives like integers and doubles? You want to add behavior to integers by subclassing without any overhead? You can do that with the combination of value classes and implicits. This can be surprisingly handy.

Java interoperability

So, Java is not my favorite programming language, but it has an enormous ecosystem. Scala can plug into that by virtue of its JVM implementation and Java interoperability.

The interactivity comes into play here as well: Whenever you use someone else’s library, you have to be able to do some exploration. I’ve used both Rhino and Jython as ways to experiment with Java APIs, but they don't feel like an especially good fit, maybe because of the dynamic typing. Scala works well for this.

Decent IDEs

Well, I've only used IntelliJ IDEA, but people seem to think Eclipse is also fine for Scala. This isn't so much a good thing about the language as a necessity.

Call by name

This is something that really got done right. I suppose it’s just syntactic sugar for automatically generating no-argument lambda expressions, but it adds the right kind of control structure extensibility without being, well, weird.

A better Java?

I've seen people say that Scala is to Java as C++ is to C. Sometimes they mean that as a negative, and sometimes as a positive. I agree that C++ adds a lot of complexity to C, but I also think it makes a more expressive language that, used with care, makes for efficient, maintainable code. And that's pretty much how I view Scala.

Things I Don’t Like

There's more than one way to do it

This is, I guess, an unavoidable result of a multi-paradigm language. Somehow it seems worse in Scala than it does in Python. Part of that is the syntactic sugar aspect, where there really isn't more than one way to do it, but there's more than one way to express what you're doing. But another part is that Scala really lives in two worlds: the Java world of mutable data structures and the functional world of immutable data structures and higher-order functions. You can write code in either style and still be idiomatic, although people are likely to steer you more toward the functional style.

The existence of both traits and abstract classes is another side of this. Even the language's creator has trouble explaining exactly when to use which.

Strings

The idea of a programming language that doesn’t actually have a native definition of strings is just bizarre (“Scala’s String class is usually derived from the standard String class of the underlying host system (and may be identified with it).”). Scala strings are Java strings, and if some behavior of Java strings is just not quite Scala-like, that’s too bad. Yuck.

AC adapter not included

Here's where I think the Python comparison makes Scala look bad. The Python slogan “batteries included” refers to the standard library, which is full of useful stuff. Scala native libraries are mostly collections and utilities. Scala has direct access to all of the Java standard library, but there are no standard wrappers to make it easy to use those Java classes from Scala, and the Java standard libraries lack a lot of useful things that can be found in the Python library.

JVM crap

Thanks to the way Java handles generics, Scala has to live with the JVM's type erasure. That's a nuisance, and it makes some code clumsier than it ought to be.

Scala has to live with Java's exception and class/interface models, too. The former isn't hopeless, although interoperability means dealing with checked exceptions. The class/interface model means that the mapping between Scala types and Java types can be surprising.

That Darn @!

There really are only two things that @ is used for (I think), annotations and pattern matching, but somehow the many different ways this can manifest just makes me worry every time I see an @. Annotations on classes, methods and variables are simple enough, but annotations on types and expressions get a little confusing. The annotation is not always where you might expect it to be, and it's easy to lose track of what's going on. The cognitive load for distinguishing an annotation from a pattern match type constraint and figuring out what exactly the annotation applies to is higher than I would like.

Things I’m Just Not Sure About

Implicit magic

Scala's mechanisms for implicit parameters can do amazing things. Implicit parameters are the way that generic methods on collections can return new collections of the right type, for example. And the “pimp my type” pattern (using implicits) makes it easy to add behavior to an existing implementation.

Implicits are pretty carefully controlled: the compiler will complain about ambiguity, so the risk of surprises is fairly low (but not zero). But implicits are also a temptation to hide parameters that could be explicit—just to reduce the number of characters that have to be typed (silly) or read (maybe good, maybe bad). My bias is with Python: “Explicit is better than implicit.”

Syntax follies

Scala will let you define your own special-character sequences as names. This can and does lead to people doing things that, in the guise of “domain specific languages,” are really just silly. But if I ever need to, I can define the UFO operator as <-^->.

I think the rules for infix syntax are the same for both special-character names and ordinary names (+ or plus). But the rules for when parentheses are mandatory are not easily internalized, especially for parameters that are functions. It is not always clear why sometimes parentheses are necessary, sometimes braces are sufficient, and (maybe?) sometimes you need both.

XML baked in

There seems to be consensus that this was probably not the best idea.

Churn

Scala is very much a living language, and the rate of change is kind of high. I'm not sure whether I think that is bad or not. It will certainly depend on your use case.

Random Stuff

The Scala meaning of yield is quite different from the Python meaning of yield, since Scala doesn’t have generators and Python has different for-comprehension syntax. That threw me at first.
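To spell out the Python side of that comparison (the Scala line appears only as a comment, and the function name is mine):

```python
def doubled(numbers):
    """A Python generator: `yield` hands back one value at a time, lazily."""
    for n in numbers:
        yield n * 2

# The closest Python counterpart to Scala's
#   for (n <- numbers) yield n * 2
# is a comprehension, which builds the whole collection at once:
doubled_list = [n * 2 for n in [1, 2, 3]]

lazy = doubled([1, 2, 3])    # nothing has been computed yet
first = next(lazy)           # runs the generator up to its first yield

print(doubled_list, first)
```

So the same keyword marks "the value of this comprehension" in Scala and "suspend here and hand a value to the caller" in Python.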

I guess functional programming gurus have no problem with it, but I always have to translate flatMap into the equivalent nested for comprehension, which makes code using flatMap a little harder to read.
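The translation I do in my head looks like this when written out in Python. The helper names are made up, and Python has no built-in flatMap, so the sketch defines one:

```python
def flat_map(func, items):
    """Apply func to each item and concatenate the resulting lists."""
    return [result for item in items for result in func(item)]

def with_neighbor(n):
    """Hypothetical helper: return a number together with its successor."""
    return [n, n + 1]

# Scala's xs.flatMap(f) corresponds to: for (x <- xs; y <- f(x)) yield y
flattened = flat_map(with_neighbor, [1, 10])

# The same thing as an explicit nested loop.
nested = []
for n in [1, 10]:
    for m in with_neighbor(n):
        nested.append(m)

print(flattened, nested)
```

Both forms produce the same flat list; the nested version just makes the "for each item, for each result" structure visible.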

Companion objects are pretty handy, but it’s easy to confuse the companion object’s apply method with a constructor. (The classic example is the difference between Array(0) and new Array(0).)

Also, nothing actually guarantees that the unapply method used in pattern matching is consistent with the apply method (or case class constructor). That’s an opportunity for confusion.

Summary (or “TL;DR” if you prefer)

Scala is actually pretty cool. It has its warts, like any programming language, but there is a lot there that feels right. It’s too bad it doesn’t have more traction.

Saturday, August 24, 2013

Ashland Summer 2013

Herein a few notes on the summer plays we saw at the Oregon Shakespeare Festival.

The Tenth Muse

This commissioned work started out, we were told, to be an adaptation of one of the few surviving plays by Sor Juana Inés de la Cruz, a 17th century Mexican nun. It evolved into a story set in a fictionalized version of the convent at San Jeronimo twenty years after her death. Sor Juana was an intellectual and a feminist, and her writing attracted disapproval from the church hierarchy. The play has three young women discovering a cache of her works that were supposed to have been destroyed, and focuses on how they respond to the work and to the influence of one of the sisters who emulates Sor Juana.

The play is as much about race and class as it is about feminism: the three central characters are Spaniard, mestizo, and native. These ethnicities dictate their options and those options conflict with their capabilities and desires.

My biggest complaint about this work is that it seemed like we had not gotten very far in plot or character development before intermission. The second half was entirely satisfying, so I can't complain too much, but it would be better, I think, if a little more development moved earlier.

The story takes a particular liberty with Sor Juana's death in a way that serves drama and, perhaps, Truth. And the tone of the ending is a bit ambiguous; presumably this was the intention.

The Liquid Plain

I liked everything about this production of a play that deals with the slave trade in the northern states. The sets, the costumes, the performances were all excellent. My problem is with the play itself.

I feel like I missed some essential point. The play hinges around a climactic scene where a character does something surprising. Later we learn that the character has an important secret. But that revelation does not explain the character's behavior—especially when we're told that another character also knew the secret.

The characterizations and dialogue were otherwise fine, but I can't get past this problem.

The Heart of Robin Hood

This play spins a version of Robin Hood where the merry men are still on step 1 of “Steal from the rich, give to the poor” (their reputation is preceding them?) when Marion tries to join them. They also have a strict rule against women in the camp, so Marion has to go Shakespearean and dress up as a boy to teach them a thing or two.

Michael Elich gets to have far too much fun playing evil Prince John pursuing Marion and threatening small children. I'm not sure the two children added enough to the story to justify them; the play feels just a bit over-stuffed. But it was plenty of fun, and felt at home amid the Shakespeare.

The Unfortunates

I don't know what to call this. I wouldn't call it a musical—maybe a musical theatrical experience? Let's just say that it was well-performed but the story and the visuals were kind of hallucinatory. Not much plot to speak of, and the bare framing didn't really make sense.

A Midsummer Night’s Dream

I quite liked this production. Set in 1964 at a Catholic school (more or less), the framing of Theseus and Hippolyta as a priest and nun who are renouncing their vows to marry actually worked for me. (A nun, like an Amazon, lives in a society of women.) Having the “rustics” be the school's drama teacher, gym teacher, science teacher, lunch lady, and janitor was great, and Brent Hinckley was wonderful as Nick Bottom (but really, they were all good).

The lecture we heard drew attention to a peculiar conundrum of the young lovers that is often ignored: at the end of the play, one of them is still under the influence of a magic spell to “correct” his affection toward the proper target. It’s an interesting point, although Demetrius is supposed to have loved Helena before he switched his affections to Hermia, so maybe the magic is really setting things to the way they were supposed to be.

While I love her in other productions, I don't think Gina Daniels’s voice was a good match for the role of Puck this time around.

A Streetcar Named Desire

Okay, I have never seen the classic film adaptation with Marlon Brando, nor any other production, so this was my first Streetcar. I understand that some people have remarked on Stella being a stronger character than they are used to. I have no basis for comparison, but there’s no question that, in this show, Stella knows exactly what she wants, what she has, and the price she is paying for it.

Jeffrey King’s performance as Mitch was amazing. I really felt that he inhabited the character. Danforth Comins, who was outstanding as the tortured Brick in Cat on a Hot Tin Roof, brought a certain element of self-awareness to Stanley that I think was not right for the character. It didn't ruin things, but I think the play would have been better with a slightly more elemental Stanley.

Cymbeline

One of Shakespeare's later and lesser known plays, Cymbeline is classed as a “romance” along with The Tempest, The Winter's Tale, and Pericles. It’s full of familiar themes: a parent-child reunion after many years, forgiveness, a token of love that becomes false proof of infidelity, a woman dressed as a man, a daughter being forced to marry against her will, parental ghosts, a power-hungry scheming spouse … you get the idea. The preface speaker joked that it was like “Shakespeare's Greatest Hits.”

I gather that this production cut a fair bit from the play to make it manageable. Cymbeline has, we are told, the longest final act of any of Shakespeare’s plays, as all the plot threads have to be pulled together and tied up. There are a lot of those threads.

I think they did a good job. I found the play quite satisfying. I don’t even mind too much the director’s decision to play up the “fairy tale” aspects explicitly (the evil queen is dressed like Disney’s, and her poison is delivered in a shiny apple-shaped box; there are elfin ears aplenty and even a slightly ogre-ish jailer). I have to suppose that this production was unusually good, or the play would be more popular.

Tuesday, May 01, 2012

Software Design Values

A recent conversation made me think explicitly about my software design values. I believe that good software is as correct as is practical, is maintainable, and has adequate performance.

Correctness

Code that is “practically correct” is code that has been tested. Only the most trivial of programs is “obviously correct,” and formal verification is not yet practical for non-trivial programs. Testing is the only practical tool available for evaluating correctness. Code review (including pair programming) is also valuable, but it is not a substitute for testing the executable code against expected results.

Maintainability

Maintainable code is readable and testable. Without readability, code cannot easily be modified because it is hard to determine what the code is actually doing in the first place. Without testability, code cannot be modified because it is impossible to determine whether changes have the intended consequences.

Readability

Readable code is divided into chunks that are small enough to be easily described, because the human brain has a limited capacity for complexity and detail. “The competent programmer is fully aware of the strictly limited size of his own skull … .” [Dijkstra 1972]

Readable code must also be transparent: it says what it does and does what it says. The essential character of readable code is that it reduces the cognitive load on the reader, so the reader can easily see what the code is doing—without being faced with unnecessary detail. Code described as “clever” or “tricky” is rarely transparent; code that can be described as “elegant,” however, does not suffer that problem.

Another characteristic of readable code is good locality: all the code that manipulates some state is gathered together (conceptually, if not physically). This allows the reader to see what state transitions are possible and to confirm invariants.
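To make the locality point concrete, here is a minimal sketch in Python. The Counter class and its limit are hypothetical, invented only for illustration; the point is that every line that can change the state sits in one place, so a reader can check the invariant by reading one short class.

```python
class Counter:
    """A bounded counter (hypothetical example).

    All code that touches `_value` lives in this class, so the invariant
    0 <= _value <= _limit can be confirmed by reading these few methods.
    """

    def __init__(self, limit: int) -> None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self._limit = limit
        self._value = 0  # invariant holds on construction

    def increment(self) -> None:
        # The only transition that raises the value; it refuses to break the invariant.
        if self._value >= self._limit:
            raise OverflowError("counter is at its limit")
        self._value += 1

    def reset(self) -> None:
        # The only transition that lowers the value.
        self._value = 0

    @property
    def value(self) -> int:
        # Read-only access: callers cannot modify the state directly.
        return self._value
```

There are exactly two state transitions, and both are visible at a glance. If the same field were updated from several unrelated modules, confirming the invariant would mean reading all of them.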

Readability and Testability

Separation of concerns is vital for code to be readable and testable. Code that is concerned with one aspect of a problem should not be entangled with code that is concerned with another aspect of the problem. Such entanglement impairs both readability and testability: readability is impaired because the reader must think about two separate problems at the same time, and testability is impaired because the test harness has to deal with two sets of behaviors.

Testability

For code to be testable, it should be structured to allow testing in isolation (unit testing). Unit testing allows explicit, executable specification of behavior. Unit testing also allows small changes to be tested, and small changes are much easier to make correctly because they minimize the cognitive burden on the maintainer.
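As a rough illustration of how separation of concerns makes unit testing possible, consider the sketch below. The function names and the "lines of numbers" input format are assumptions made up for this example. Parsing and arithmetic are kept apart, so each can be tested in isolation without files or any other I/O.

```python
import unittest


def parse_amounts(lines):
    """Concern 1: turn raw text lines into numbers, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]


def total(amounts):
    """Concern 2: the arithmetic, with no knowledge of where the data came from."""
    return sum(amounts)


class TotalTest(unittest.TestCase):
    """The tests double as an executable specification of the behavior."""

    def test_blank_lines_are_skipped(self):
        self.assertEqual(parse_amounts(["1.5\n", "\n", "2.5\n"]), [1.5, 2.5])

    def test_total_of_parsed_lines(self):
        self.assertEqual(total(parse_amounts(["1.5\n", "2.5\n"])), 4.0)

    def test_total_of_nothing_is_zero(self):
        self.assertEqual(total([]), 0)
```

The tests can be run with python -m unittest. Because neither function reads a file or prints anything, a small change to either one can be checked in isolation.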

Performance

I mention performance last not because it is less important, but because it is notoriously seductive. “Premature optimization is the root of all evil.” [Knuth 1974] Measured performance and intuition often disagree, and time spent on optimization before the code is correct is almost always better spent on getting an adequate solution sooner.
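One way to act on "measure, don't guess" is sketched below using Python's standard timeit module. The two string-joining functions are hypothetical stand-ins for any pair of candidate implementations; the measure helper is likewise just an illustration, not a rigorous benchmark.

```python
import timeit


def join_with_loop(words):
    """Candidate A: build the string by repeated concatenation."""
    result = ""
    for w in words:
        result += w
    return result


def join_with_str_join(words):
    """Candidate B: let str.join do the work."""
    return "".join(words)


def measure(func, words, repeats=5, number=100):
    """Return the best observed per-call time in seconds.

    Taking the minimum over several repeats reduces noise from other
    activity on the machine.
    """
    timings = timeit.repeat(lambda: func(words), repeat=repeats, number=number)
    return min(timings) / number


if __name__ == "__main__":
    words = ["x"] * 1000
    # Confirm both candidates are correct before comparing their speed.
    assert join_with_loop(words) == join_with_str_join(words)
    print("loop:", measure(join_with_loop, words))
    print("join:", measure(join_with_str_join, words))
```

Note the order of operations: correctness is checked first, and only then are the timings compared. The numbers will vary by machine and interpreter version, which is the reason to measure rather than assume.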

And …

Finally, good programming is hard, and any strategy that makes it easier is valuable. One such strategy is to use existing software instead of creating new, when there is adequate confidence in the existing software’s suitability.

I’ve avoided using “object-oriented” or even “abstraction” in the above. That is because I am trying to identify the first principles of my design values, and these terms describe approaches that derive from those principles.

Saturday, September 03, 2011

Discover Card Doesn’t Get It

The folks at Discover Card just announced that they are ending their Secure Online Account Number program. It was a fraud prevention system that allowed a cardholder to generate a different credit card number for each merchant. The card number was bound to the merchant with whom it was first used—any attempt to use the same number elsewhere would be rejected. It was different from some other virtual credit card systems in that it did not let you specify a dollar limit or an explicit expiration date. Despite those limitations, it gave me a fair bit of peace of mind.

In explaining their decision, Discover says

Since Secure Online Account Numbers launched over a decade ago, we’ve continued to add layers of fraud detection and prevention, authentication and ID verification to protect your account information.

and

Discover provides you $0 Fraud Liability Guarantee to protect your account from unauthorized charges.

The problem with that $0 guarantee is that it ignores a fundamental truth of the modern world: time is money. What happens when Discover discovers that someone who shouldn’t have your account number has gotten hold of it? They cancel the card and issue a new account number. No problem, right?

Some years ago, I was traveling for work when my wife’s purse was stolen. The only credit card I had with me was also in the stolen purse, so the card had to be canceled. This created an unexpected problem when the hotel I was staying at demanded an alternate form of payment immediately: they would not wait for a new card to be delivered to me, which might not even arrive before my departure. I had to get a colleague to guarantee my room payment.

I’ve learned that lesson, and my wife and I each carry a card that the other doesn’t. But canceling a credit card can still have a substantial impact. The last time Discover decided to change my account number, I got a notice from the local public radio station that my monthly pledge was rejected. Various online accounts had to be updated. It took a while to track everything down and get everything sorted.

That’s why I have been a regular user of the virtual account feature: not to limit my financial risk, but to limit my non-financial risk. And that is what Discover Card does not seem to understand.

Update (2011-10-17): Discover just announced that they have changed their minds:

We recently announced the decision to discontinue the Secure Online Account Number feature.
Since then, we've heard customers like you tell us how much they love the added control of using Secure Online Account Numbers for their online purchases.
Based on the feedback we've decided to reinstate this feature. Beginning today you will once again be able to generate secure online account numbers for online purchases.

Somebody actually listened.

Write-Only