My favorite Game of Life videos

Conway’s Game of Life is the world’s most popular 2-dimensional cellular automaton. Programmers often implement it when learning a new programming language. It’s a nice little programming exercise and more challenging than a “hello, world”. There was a time when we ourselves implemented it in a lot of different ways during Code Retreat sessions.

The charm of Conway’s Game of Life is that from a small set of simple rules many interesting patterns can emerge: oscillators, gliders, spaceships, etc. On video platforms like YouTube you can find many videos of Conway’s Game of Life in action. I want to share with you some of my favorites that I personally found impressive:

Epic Game of Life

Life in Life – The Game of Life playing itself.

Turing Machine in Game of Life – The Game of Life has the power of a universal Turing machine, and here’s an implementation of a Turing Machine in Game of Life itself.

Game of Life in APL – This video is impressive in a different way: it demonstrates the expressiveness (and eccentricity) of an elder programming language named APL originating in the 1960s. APL is an array programming language and can be seen as a precursor to MATLAB or Mathematica. It’s based on a mathematical notation invented by (Turing Award winner) Kenneth E. Iverson. The implementation is basically a one-liner.

Smooth Life – A variation of Game of Life using floating point values instead of integers.

Game of Life producing a scrolling marquee of aliens.

And here you can watch John Conway himself, a very humble person, explain the rules of Game of Life with a handful of almonds.

Ansible: Play it again, Sam

Recently we started using Ansible for the provisioning of some of our servers. Ansible is one of many configuration management / provisioning tools that are popular right now. Puppet and Chef are probably more widely known representatives of their kind, but what attracted us to Ansible was the fact that it’s agentless: the target machines don’t need an agent installed, all you need is remote access via SSH. Well, almost. It turns out that Python is also required on the remote machines, otherwise you’ll be limited to a very basic set of functionality (the raw module). Fortunately, most Linux distributions have Python installed by default.

With Ansible you describe the desired target configuration as a sequence of tasks in a YAML file called Playbook: package installation, copying files, enabling and starting services, etc. The playbook is semi-declarative. Each step usually describes a goal, e.g. package XY should be present. Action is only taken if necessary. On the other hand it’s also very imperative: steps are executed sequentially and you can have conditionals and loops (e.g. “with_items”). You can also define handlers, which are executed once after they have been notified, for example if you want to restart the Apache web server after its configuration has changed.

Before a playbook is applied to a remote machine Ansible will query “facts” about this machine. These facts are available as variables in the playbook. You can also define your own variables.

A playbook is usually applied to a set of machines. Available machines are listed in a separate file, the inventory, where they can be grouped by roles. With one command you can configure or update all the machines of a specific role at once. You can also execute a “dry run”, which simulates a playbook run and tells you what changes would be applied.

So far our experience with Ansible has been good. The concepts are easy to grasp. YAML syntax requires getting used to, but at least it’s not XML. On the website the actual documentation is a bit hidden among promotion for their commercial products, but you can also directly visit docs.ansible.com.

Dynamic addition and removal of collection-bound items in an HTML form with Angular.js and Rails

A common pattern in one of our web applications is the management of a list of items in a web form where the user can add, remove and edit multiple items and finally submit the data:

form-fields

The basic skeleton for this type of functionality is very simple with Angular.js. We have an Angular controller with an “items” array:

angular.module('example', [])
  .controller('ItemController', ['$scope', function($scope) {
    $scope.items = [];
  }]);

And we have an HTML form bound to our Angular controller:

<form ... ng-app="example" ng-controller="ItemController"> 
  <table>
    <tr>
      <th></th>
      <th>Name</th>
      <th>Value</th>
    </tr>
    <tr ng-repeat="item in items track by $index">
      <td><span class="remove-button" ng-click="items.splice($index, 1)"></span></td>
      <td><input type="text" ng-model="item.name"></td>
      <td><input type="text" ng-model="item.value"></td>
    </tr>
    <tr>
      <td colspan="3">
        <span class="add-button" ng-click="items.push({})"></span>
      </td>
    </tr>
  </table>
  <!-- ... submit button etc. -->
</form>

The input fields for each item are placed in a table row, together with a remove button per row. At the end of the table there is an add button.

How do we connect this with a Rails model, so that existing items are filled into the form, and items are created, updated and deleted on submit?

First you have to transform the existing Ruby objects of your has-many association (in this example @foo.items) into JavaScript objects by converting them to JSON and assigning them to a variable:

<%= javascript_tag do %>
  var items = <%= escape_javascript @foo.items.to_json.html_safe %>;
<% end %>

Bring this data into your Angular controller scope by assigning it to a property of $scope:

.controller('ItemController', ['$scope', function($scope) {
  $scope.items = items;
}]);

Name the input fields according to Rails conventions and use the $index variable from the “ng-repeat” directive to provide the correct index value. You also need a hidden input field for the id, if the item already has one:

  <td>
    <input name="foo[items_attributes][$index][id]" type="hidden" ng-value="item.id" ng-if="item.id">
    <input name="foo[items_attributes][$index][name]" type="text" ng-model="item.name">
  </td>
  <td>
    <input name="foo[items_attributes][$index][value]" type="text" ng-model="item.value">
  </td>

In order for Rails to remove existing elements from a has-many association via submitted form data, a special attribute named “_destroy” must be set for each item to be removed. This only works if

accepts_nested_attributes_for :items, allow_destroy: true

is set in the Rails model class, which contains the has-many association.

We modify the click handler of the remove button to set a flag on the JavaScript object instead of removing it from the JavaScript items array:

<span class="remove-button" ng-click="item.removed = true"></span>

And we render an item only if the flag is not set by adding an “ng-if” directive:

<tr ng-repeat="item in items track by $index" ng-if="!item.removed">

At the end of the form we render hidden input fields for those items, which are flagged as removed and which already have an id:

<div style="display: none" ng-repeat="item in items track by $index"
ng-if="item.removed && item.id">
  <input type="hidden" name="foo[items_attributes][$index][id]" ng-value="item.id">
  <input type="hidden" name="foo[items_attributes][$index][_destroy]" value="1">
</div>

On submit Rails will delete those elements of the has-many association with the “_destroy” attribute set to “1”. The other elements will be either updated (if they have an id attribute set) or created (if they have no id attribute set).

The various ways of error handling

There are various approaches and philosophies regarding error handling in different programming languages. This article tries to give an overview.

Exceptions

Most of the current mainstream programming languages use exceptions for error handling. When an exception is raised (“thrown”), the call stack is being unwound until the exception is caught. The functions passed along the way through the call stack have to ensure that any opened resources are properly closed. This is usually done via finally blocks. If the exception is not caught the program terminates.

Exceptions come in two flavors: checked exceptions and unchecked exceptions. The handling of checked exceptions is enforced by the compiler. Checked exceptions are part of the function signatures. A function explicitly declares in its signature what exceptions can be thrown:

void f() throws A, B, C

The caller of a function has to either handle the exceptions (fully or partially) or let them pass through by re-declaring them in the throws clause of its own function signature.

Checked exceptions have the property that it’s hard to forget to handle them. However, proponents of unchecked exceptions argue that checked exceptions have two problems: versioning and scalability.

Once declared they are part of the interface and adding another exception will break all client code. Multiple exception types also tend to accumulate the more different subsystems are being aggregated. Proponents of unchecked exceptions prefer a catch-all clause further up the call stack. Some languages (e.g. Erlang) even follow a “let it crash” paradigm and simply respawn crashed processes. This approach is more viable in distributed systems than in user-facing applications.

Java is known for its checked exceptions. C#, C++, Scala and most dynamically typed languages decided to go with unchecked exceptions.

No exceptions

An alternative to exceptions is no exceptions. Exceptions overlay multiple different control flows, which makes it harder to reason about the control flow of a function. With exceptions functions can return at many other points than the explicit return points.

If an error is just a value that is returned by a function it can be handled by the usual control flow mechanisms of a language (like if and else) without the need of a special sub-language for error handling. These errors tend to be handled closer to the place of their occurrence rather than further up the call stack.

In such a language, which uses return values to flag errors, you’d better check all errors, otherwise you risk continuing with an incorrect, invalid or meaningless value. This can be enforced either by the compiler or via a lint tool.

There are different possibilities of how an error could be returned from a function:

In C a sentinel value in the range of the return type is often used to indicate an error, e.g. a negative value or zero. This is not a good solution, because it intermingles two things that do not belong together and it limits the range of valid return values. Another solution in C could be the use of an error output parameter. Prominent examples are NSError in Objective C or GError in GLib. This brings us to another possibility:

Some languages support multiple return values (e.g. Go) or tuple types (product types), which can act as multiple return values. One value can hold the actual result (e.g. number of bytes written), the other can indicate an error.

Multiple return types / product types are a simple solution, cover the necessary use cases and require little additional language support. A more sophisiticated and more restrictive solution are sum types, but they require a little bit more language support: instead of returning a value AND a possible error, a function returns either a value OR an error. This way the programmer is forced to check for an error by discriminating between the two cases. This is usually done via a feature called structural pattern matching (not to be confused with pattern matching on strings), either explicitly with a switch/case-like control structure or implicitly via convenience function. A popular example is Haskell’s Maybe monad or the Option or similarly named type in some other languages (e.g. Scala, Standard ML, OCaml).

 

Using Rails with a legacy database schema – Part 2

Part one of this blog post mini-series showed how to override default table names and primary key names in ActiveRecord model classes, and how to define alias attributes for legacy column names.

This part will discuss some options for primary key definitions in the schema file, which are relevant for legacy schemas, as well as primary key value generation on Oracle databases.

Primary key schema definition

The database schema definition of a Rails application is usually provided in a file called schema.rb via a simple domain specific language.

The create_table method implicitly adds a primary key column with name id (of numeric type) by default.

create_table 'users' do |t|
  t.string 'name', limit: 20
  # ...
end

If the primary key has a different name you can easily specify it via the primary_key option:

create_table 'users', primary_key: 'user_key' do |t|
  t.string 'name', limit: 20
  # ...
end

But what if a table has a primary key of non-numeric type? The Rails schema DSL does not directly support this. But there’s a workaround: you can set the id option of create_table to false, declare the primary key column like an ordinary non-nullable column, and add the primary key constraint afterwards via execute.

create_table 'users', id: false do |t|
  t.string 'user_key', null: false
  t.string 'name', limit: 20
  # ...
end
execute 'ALTER TABLE user ADD PRIMARY KEY (user_key)'

Primary key value generation

On Oracle databases new primary key values are usually created via sequences. The Oracle adapter for ActiveRecord assumes sequence names in the form of table name + “_seq”.  You can override this default sequence name in a model class via the sequence_name property:

class User < ActiveRecord::Base
  self.sequence_name = 'user_sequence'
  # ...
end

Sometimes primary key values are auto-generated via triggers. In this case you need the Oracle Enhanced adapter, which is a superset of the original ActiveRecord Oracle adapter, but with additional support for working with legacy databases. Now you can set the sequence_name property to the value :autogenerated:

class User < ActiveRecord::Base
  self.sequence_name = :autogenerated
  # ...
end

This circumvents the default convention and tells the adapter to not include primary key values in generated INSERT statements.

Don’t let one bad apple spoil the whole bunch

One exception in a collection operation like for-each or map/collect stops the processing of all the other elements. Instead of letting the whole task blow up it is often more desirable to skip those elements causing failures, log the errors (and possibly notify the user about the failing elements), but have all other elements processed. Examples for such operations are: sending bulk mails to users, bulk import/export, lists in user interfaces etc., and common errors are, for example, NullPointerExceptions, database errors or wrong email addresses.

Here’s some simple code for robust and reusable for-each and map operations in JavaScript:

function robustForEach(array, callback) {
  var failures = [];
  array.forEach(function(elem, i) {
    try {
      callback(elem, i);
    } catch (e) {
      failures.push({element: elem, index: i, error: e});
    }
  });
  return failures;
}

function robustMap(array, callback) {
  var result = { array: [] };
  result.failures = robustForEach(array, function(elem, i) {
    result.array.push(callback(elem, i));
  });
  return result;
}

Similar code can be easily implemented in other languages like Java (especially with Java 8 streams), Groovy, Ruby, etc.

If you decide to log the errors, you have to choose between two possible log strategies: one log operation per error, which can be annoying if you get a mail for each logged error, or one log operation bundling all occurred errors (make sure that a failing toString can’t spoil the whole bunch again).

function logAny(failures) {
  failures.forEach(function(fail) {
    log.error(failMessage(fail));
  });
}

function logAnyBundled(failures) {
  if (failures.length == 0) {
    return;
  }
  log.error(failures.map(function(fail) {
    return failMessage(fail);
  }).join('\n'));
}

function failMessage(fail) {
  return "Could not process '" +
         fail.element + "': " + fail.error;
}

You can easily combine the map and log operations:

function robustMapAndLog(array, callback) {
  var result = robustMap(array, callback);
  logAny(result.failures);
  return result.array;
}

Example usage:

var numbers = [1, 2, 3, 4, 5, 6, 7, 8];
var result = robustMapAndLog(numbers, function(n) {
  if (n == 5) {
    throw 'bad apple';
  }
  return n * n;
});
print(result);

// Error log output:
//  Could not process '5': bad apple
// Output:
//  [ 1, 4, 9, 16, 36, 49, 64 ]

One element could not be processed due to an error, but all other elements were not affected.

Conclusion

Be aware of the bad apple possibility for every loop you write (explicitly or implicitly) and consciously choose the appropriate error handling strategy depending on the situation. Don’t let indifference decide the fate of your bulk operations.

Translating strings in internationalized applications

Internationalization (“i18n”) and localization (“l10n”) of software is a complex topic with many facets. One aspect of internationalization is the translation of strings in programs into different languages.

Here’s an example of how not to do it (assuming t is a translation lookup function):

StringBuilder sb = new StringBuilder(t("User "));
sb.append(user.name());
sb.append(t(" logged in "));
sb.append(minutes);
sb.append(" ");
if (minutes == 1) {
    sb.append(t("minute"));
} else {
    sb.append(t("minutes"));
}
sb.append(t(" ago."));
return sb.toString();

Translatable strings and concatenation don’t mix well, be it via StringBuilder, the plus operator or in template files like JSPs. Different languages have different sentence structures. You can’t know in advance in which order the parts must appear in the translated text. So the most basic rule is: never construct sentences programmatically from sentence fragments if they are intended for translation.

Here’s a slightly better variant:

if (minutes == 1) {
    return t("User {0} logged in {1} minute ago.", user.name(), minutes);
}
return t("User {0} logged in {1} minutes ago.", user.name(), minutes);

I18n frameworks always offer the possibility to pass arguments to the translation lookup function. This way translators can freely choose the positions of these arguments via placeholders in the translated string.

However, not all languages have pluralization rules similar to English, where you have to handle only two cases (one and zero/many). For example, Russian and Polish use different forms of nouns with different numerals higher than one. Here’s an extensive table listing the plural rules for different languages: The rules are classified into these categories: “one”, “two”, “few”, “many”, “other”. Good i18n frameworks provide translation lookup functions where you can pass the count as an additional argument. The framework then dispatches to different translation keys, depending on the count and the target language:

user.login.minutes.one=...
user.login.minutes.two=...
user.login.minutes.many=...
user.login.minutes.other=...

There are other traps that you have to watch out for, e.g.

  • different punctuation marks: you can’t simply assume that you can convert any translated text into a label by appending “:” to it, or that you can convert any translated text into a quotation by surrounding it with ” and “.
  • gender rules, which can be handled similarly to the pluralization rules

Conclusion

This article gave a small glimpse into the topic of internationalization, to help avoid the most basic mistakes. Check out the documentation of your internationalization framework to see what it can offer.

Dart and TypeScript as JavaScript alternatives

JavaScript was designed at Netscape by Brendan Eich within a couple of weeks as a simple scripting language for the web browser. It’s an interesting mixture of Self‘s prototype-based object model, first-class functions inspired by LISP, a C/AWK-like syntax and a misleading name imposed by marketing.

Unfortunately, the haste in which JavaScript was designed by a single person shows in many places. Lots of features are inconsistent and violate the principle of least surprise. Just skim through the JavaScript Garden to get an idea.

Another aspect casting a poor light on JavaScript is the bad design of the browser DOM API, including incompatibilities between different browser implementations.

Douglas Crockford redeemed the reputation of JavaScript somewhat, by writing articles like “JavaScript: The World’s Most Misunderstood Programming Language“, the (relatively thin) book “JavaScript: The Good Parts” and discovering the JSON format. But even his book consists for the most part of advice on how to avoid the bad and the ugly parts.

However, JavaScript is ubiquitous. It is the world’s most widely deployed programming language, it’s the only programming language option available in all browsers on all platforms. The browser DOM API incompatibilities were ironed out by libraries like jQuery. And thanks to the JavaScript engine performance race started by Google some time ago with their V8 engine, there are now implementations available with decent performance – at least for a scripting language.

Some people even started to like JavaScript and are writing server-side code in it, for example the node.js community. People write office suites, emulators and 3D games in JavaScript. Atwood’s Law seems to be confirmed: “Any application that can be written in JavaScript, will eventually be written in JavaScript.”

Trans-compiling to JavaScript is a huge thing. There are countless transpilers of existing or new programming languages to JavaScript. One of these, CoffeeScript, is a syntactic sugar mixture of Ruby and Python on top of JavaScript semantics, and has gained some name recognition and adoption, at least in the Rails community.

But there are two other JavaScript alternatives, backed by large companies, which also happen to be browser manufacturers: Dart by Google and TypeScript by Microsoft. Both have recently reached version 1.0 (Dart even 1.2), and I will have a look at them in this blog post.

Large-scale application development and types

Scripting languages with dynamic type systems are neat and flexible for small and medium sized projects, but there is evidence that organizations with large code bases and large teams prefer at least some amount of static types. For example, Google developed the Google Web Toolkit, which compiled Java to JavaScript and the Closure compiler, which adds type information and checks to JavaScript via special comments, and now Dart. Facebook recently announced their Hack language, which adds a static type system to PHP, and Microsoft develops TypeScript, a static type add-on to JavaScript.

The reasoning is that additional type information can help finding bugs earlier, improve tool support, e.g. auto-completion in IDEs and refactoring capabilities such as safe, project-wide renaming of identifiers. Types can also help VMs with performance optimization.

TypeScript

This weekend the release of TypeScript 1.0 was announced by Microsoft’s language designer Anders Hejlsberg, designer of C#, also known as the creator of the Turbo Pascal compiler and Delphi.

TypeScript is not a completely new language. It’s a superset of JavaScript that mainly adds optional type information to the language via Pascal-like colon notation. Every JavaScript program is also a valid TypeScript program.

The TypeScript compiler tsc takes .ts files and translates them into .js files. The output code does not change a lot and is almost the same code that you would write by hand in JavaScript, but with erased type annotations. It does not add any runtime overhead.

The type system is heavily based on type inference. The compiler tries to infer as much type information as possible by following the flow of types through the code.

TypeScript has interfaces that are very similar to interfaces in Go: A type does not have to declare which interfaces it implements. Interfaces are satisfied implicitly if a type has all the required methods and properties – in short, TypeScript has a structural type system.

Type definitions for existing APIs and libraries such as the browser DOM API, jQuery, AngularJS, Underscore.js, etc. can be added via .d.ts files.
These definition files are very similar to C header files and contain type signatures of the API’s functions. There’s a community maintained repository of .d.ts files called Definitely Typed for almost all popular JavaScript libraries.

TypeScript also enhances JavaScript with functionaliy that is planned for ECMAScript 6, such as classes, inheritance, modules and shorthand lambda expressions. The syntax is the same as the proposed ES6 syntax, and the generated code follows the usual JavaScript patterns.

TypeScript is an open source project under Apache License 2.0. The project even accepts contributions and pull-requests (yes, Microsoft). Microsoft has integrated TypeScript support into Visual Studio 2013, but there is support for other IDEs and editors such as JetBrain’s IDEA or Sublime Text.

Dart

Dart is a JavaScript alternative developed by Google. Two of the main brains behind Dart are Lars Bak and Gilad Bracha. In the early 90s they worked in the Self VM group at Sun. Then they left Sun for LongView Technologies (Animorphic Systems), a company that developed Strongtalk, a statically typed variant of Smalltalk, and later the now-famous HotSpot VM for Java. Sun bought LongView Technologies and made HotSpot Java’s default VM. Bracha co-authored parts of the Java specification, and designed an object-oriented language in the tradition of Self and Smalltalk called Newspeak. At Google, Lars Bak was head developer of the V8 JavaScript engine team.

Unlike TypeScript, Dart is not a JavaScript superset, but a language of its own. It’s a curly-braces-and-semicolons language that aims for familiarity. The object model is very similar to Java: it has classes, inheritance, abstract classes and methods, and an @override annotation. But it also has the usual grab bag of features that “more sugar than Java but similar” languages like C#, Groovy or JetBrain’s Kotlin have:

Lambdas (via the fat arrow =>), mixins, operator overloading, properties (uniform access for getters and setters), string interpolation, multi-line strings (in triple quotes), collection literals, map access via [], default values for arguments, optional arguments.

Like TypeScript, Dart allows optional type annotations. Wrong type annotations do not stop Dart programs from executing, but they produce warnings. It has a simple notion of generics, which are optional as well.

Everything in Dart is an object and every variable can be nullable. There are no visibility modifiers like public or private: identifiers starting with an underscore are private. The “truthiness” rules are simple compared to JavaScript: all values except true are false.

Dart comes with batteries included: it has a standard library offering collections, APIs for asynchronous programming (event streams, futures), a sane HTML/DOM API, removing the need for jQuery, unit testing and support for interoperating with JavaScript. A port of Angular.js to Dart exists as well and is called AngularDart.

Dart supports a CSP-like concurrency model based on isolates – independent worker threads that don’t share memory and can communicate via SendPorts and
ReceivePorts.

However, the Dart language is only one half of the Dart project. The other important half is the Dart VM. Dart can be compiled to JavaScript for compatibility with every browser, but it offers enhanced performance compared to JavaScript when the code is directly executed on the Dart VM.

Dart is an open source project under BSD license. Google provides an Eclipse based IDE for Dart called the “Dart Editor” and Dartium, a special build of the Chromium browser that includes the Dart VM.

Conclusion

TypeScript follows a less radical approach than Dart. It’s a typed superset of JavaScript and existing JavaScript projects can be converted to TypeScript simply by renaming the source files from *.js to *.ts. Type annotations can be added gradually. It would even be simple to switch back from TypeScript to JavaScript, because the generated JavaScript code is extremely close to the original source code.

Dart is a more ambitious project. It comes with a new VM and offers performance improvements. It will be interesting to see if Google is going to ship Chrome with the Dart VM one day.

Using Rails with a legacy database schema

Rails is known for its convention over configuration design paradigm. For example, database table and column names are automatically derived and generated from the model classes. This is very convenient, but when you want to build a new application upon an existing (“legacy”) database schema you have to explore the configuration side of this paradigm.

The most basic operation for dealing with a legacy schema in Rails is to explicitly set the table_names and the primary_keys of model classes:

class User < ActiveRecord::Base
  self.table_name = 'benutzer'
  self.primary_key = 'benutzer_nr'
  # ...
end

Additionally you might want to define aliases for your column names, which are mapped by ActiveRecord to attributes:

class Article < ActiveRecord::Base
  # ...
  alias_attribute :author_id, :autor_nr
  alias_attribute :title, :titel
  alias_attribute :date, :datum
end

This automatically generates getter, setter and query methods for the new alias names. For associations like belongs_to, has_one, has_many you can explicitly specify the foreign_key:

class Article < ActiveRecord::Base
  # ...
  belongs_to :author, class_name: 'User', foreign_key: :autor_nr
end

Here you have to repeat the original name. You can’t use the previously defined alias attribute name. Another place where you have to live with the legacy names are SQL queries:

q = "%#{query.downcase}%"
Article.where('lower(titel) LIKE ?', q).order('datum')

While the usual attribute based finders such as find_by_* are aware of ActiveRecord aliases, Arel queries aren’t:

articles = Article.arel_table
Article.where(articles[:titel].matches("%#{query}%"))

And lastly, the YML files with test fixture data must be named after the database table name, not after the model name. So for the example above the fixture file name would be benutzer.yml, not user.yml.

Conclusion

If you step outside the well-trodden path of convention be prepared for some inconveniences.

Next part: Primary key schema definition and value generation