Using PostgreSQL for time-series data

The number of sensors and other things that periodically collect data is ever growing. The advent of the internet of things (IoT) demands a way of storing and analyzing all this so-called time-series data. There are many options for such data – the most prominent being dedicated time-series databases like InfluxDB or well-suited, nicely scaling databases like Apache Cassandra.

The problem: you have to tailor your solution to one of these technologies, whereas SQL already offers mature database management systems (DBMS) and drivers/bindings for almost any programming language.

Why not use a plain SQL database?

Relational SQL databases are a mature and well-understood piece of technology, albeit not as sexy as all those new NoSQL databases. Using them for time-series data may not be a problem for smaller datasets, but sooner or later your ingestion and query performance will degrade massively. So in general it is not a good option to store all your time-series data in a traditional relational DBMS (RDBMS).

Why use PostgreSQL with TimescaleDB?

With the PostgreSQL extension TimescaleDB you get the best of both worlds: a well-known query language, robust tools and scalability.

You access and manage your time-series database just like your ordinary PostgreSQL database. Almost everything, including replication and backups, will continue to work like before.
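For illustration, here is what setting up a time-series table could look like – ordinary SQL plus one call to TimescaleDB’s create_hypertable function (a minimal sketch; the table and column names are made up):

-- Enable the extension once per database
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- An ordinary table for sensor readings (hypothetical schema)
CREATE TABLE conditions (
  time        TIMESTAMPTZ      NOT NULL,
  sensor_id   INTEGER          NOT NULL,
  temperature DOUBLE PRECISION
);

-- Turn the table into a hypertable partitioned by time
SELECT create_hypertable('conditions', 'time');

From then on, inserts and queries against the table are plain SQL.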

You do not have to deal with limitations of specialized solutions or learn a completely new ecosystem just for one aspect of your solution.

The future

We are successfully using TimescaleDB in one of our projects and will continue to share tips and experiences with this technology, taking its rising importance into account.


The lorem ipsum in development

We’ve all been guilty of this: showing a new UI mockup to the user with lorem ipsum text or typing fake data into forms for testing. That’s bad, because fake data does not have the same characteristics as real data. Real data has specific lengths, formats and structure. Fake data is arbitrary. It is even more embarrassing if a lorem ipsum makes it into production. But using fake data is not only a problem in the user interface.
Using fake data is also a problem in programming. We see it in the names of classes: *Impl, *Manager or *Holder.
These names do not communicate well. Why not name classes with intent? Take a look at domain-driven design and its ubiquitous language for a start.
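A made-up example of what naming with intent can look like (both class names are hypothetical):

// A generic, fake name that says nothing about the domain:
class AccountDataManager { /* ... */ }

// A name taken from the domain and its ubiquitous language:
class AccountLedger { /* ... */ }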
Using only fake data in tests is bad as well. Yes, you need to test the extreme cases, but even here you need to inform your selection of data from the real world. Build in the constraints that the domain provides. Without the domain you build highly flexible software, which is a nightmare to maintain. The test data should not be arbitrary either: you need the corner cases, and to find them, look again into the domain, the real world. To do this you need to gather real data from users, domain experts or existing systems and reports. Real data has constraints, and constraints drive creativity and decisions. The problem with deferring decisions for too long is that you have to maintain your flexibility until then. So make decisions – but from real data, not fake data.

Developer Experience means nothing

You’ve probably already heard about the concept of User Experience (UX) and the matching virtue of User Experience Design. If not, you might want to go check it out. I would suggest you start with the excellent book “The Design of Everyday Things” by Donald Norman. It has nothing to do with software development, so you won’t mix up User Experience Design with Graphical User Interface Design or even Graphics Design.

In short, User Experience is what a user of your software will experience while using your software. If your software makes them smile, you’re some kind of UX god. If your software makes them curse repeatedly, you’re probably not. You can try to improve the experience of your users by making changes to the software. This is called User Experience Design and is an important part of software development. Most developers know nothing about User Experience Design.

What developers naturally know about is Developer Experience (DX), a concept that isn’t really defined in the literature. I made it up to explain my point to you. Every software developer has a favorite programming language and IDE. Remember, source code solving the same problem in different programming languages will ultimately yield the same machine code. The machine is totally agnostic about your preference for a certain programming language. Your choice of a programming language for a certain problem is more about your Developer Experience than anything else. Developer Experience is everything you feel and think during software development. A bad Developer Experience makes you swear about the tools, the code and everything and everybody else around you. A good Developer Experience gives you a sense of accomplishment and safety while coding and makes your work no harder than necessary.

Because you can’t read elsewhere about the concept of Developer Experience, I want to give two more examples and show how it affects the User Experience. The first example is a big, long-lasting software system that is in production and still developed further. If you have a software system in production, everything requires an additional thought about the matching upgrade strategy. You can’t just modify the database structure, you need to provide migrations. You can’t just add a configuration entry, you need to make it optional or consider the least harmful default value. In the project of our example, the developers had three possible approaches to a new configuration entry:

  • They could add it to the code but leave the configuration files unchanged. This required the code to handle the “absent from configuration” case in a useful manner. It required extra effort from the developer in the code. The user would not know about the new configuration entry unless it was stated in some external documentation.
  • They could add it to the code and write a configuration migration script that added it to the existing configuration files on the customer’s installation. The code could now expect the entry to be present, but the developer had to write and test the migration script code. The user would see the new configuration entry with the default value.
  • They could introduce a new configuration file to the system and place the new configuration entry in it. The code could expect the entry to be present, because new configuration files were added to an existing installation during the upgrade process. The user would see the new configuration file and the new configuration entry with the default value.

You can probably guess which of the three options got used so excessively that users complained about the configuration being all over the place, in a myriad of little one-liner configuration files with ominous names. The developers chose the best option for themselves and, in the short view, for the users. But in the long run, the User Experience declined.

The second example is from a computer game in the mid-2000s. It was a massively multiplayer online shooter with a decent implementation ahead of its time. But one thing was still from the last century: after each update of the game, the key bindings were reset to their defaults – as was every other local modification to the settings. After each update, you had to configure your video, audio, controls and everything else, like your in-game equipment, again and again. The game didn’t offer any means to copy or reload your settings. It was up to you to maintain a recent backup of the settings files, just in case. And if the file format had changed, you needed to combine their changes with yours. WinMerge is a decent tool for that task. But the problem is clear: the game developers couldn’t be bothered to think about how their upgrade strategy would affect their users. They ignored the problem and let the users figure it out. They chose better Developer Experience, free from complexities like user-side modifications, over good User Experience like a game that can be upgraded without drawbacks for the gamers.

Sadly, this is a common formula in software development: Developer Experience is treated as more important than User Experience. Just look at it from a utilitarian point of view: if you burden thousands or even millions of users with just an easy one-minute task that you could fix during development, you have a budget of two workdays up to 10 person-years. Do the math yourself: one million users, each spending one minute of work on your software, equal over 2000 workdays (a million minutes is roughly 16,700 hours, or about 2,000 eight-hour days) lost on a thing you could probably fix for everybody in an hour or two.

And this brings me to my central statement: Developer Experience, as opposed to User Experience, means nothing. It’s just not important – at least not to the users. It is important to the small group of developers and should not be forgotten. But never should a decision lead to better Developer Experience at the expense of User Experience. It’s a small inconvenience for us developers to think about smooth upgrades, meaningful and consistent control element titles or an easy installation. It’s a whole new game for our users.

Always choose User Experience over Developer Experience.

Call to action: If you have another good example of developers being lazy at the expense of their users, please share it as a comment.

Call to action 2: The initial picture is linked from the excellent website badhtml.com. If you ever feel like an imposter for your latest design, go and visit this website.

luabind deboostified tips and tricks

luabind deboostified is a fork of the luabind project, which helps you expose C++ APIs to Lua. As the name implies, the fork replaces the boost dependency with modern C++, which makes it a lot more pleasant to work with.

Here are a few tips and tricks I learned while working with it. Some tricks might be applicable to the original luabind – I do not know.

1. Splitting module registration

You can split the registration code for different classes. I usually add a register function per class, like this:

#include <luabind/luabind.hpp>

struct A {
  void doSomething();
  static luabind::scope registerWithLua();
};

struct B {
  void goodStuff();
  static luabind::scope registerWithLua();
};

You can then combine their registration code into a single module on the Lua side:

void registerAll(lua_State* L) {
  luabind::module(L)[
    A::registerWithLua(),
    B::registerWithLua()];
}

The implementation of a registration function looks like this:

luabind::scope A::registerWithLua()
{
  return luabind::class_<A>("A")
    .def("doSomething", &A::doSomething);
}

2. Multiple policies and multiple return values

Unlike C++, Lua has real multiple return values. You can make use of that via the return-value policies that luabind offers. Let’s say you want to write this in Lua:

local x, y = getPosition(a)

The C++ side could look like this:

void getPosition(A const& a, float& x, float& y);

The deboostified fork needs its policies supplied in a type list. Let’s use a small helper meta-function to build that:

template <typename... T>
using joined = 
  typename luabind::meta::join<T...>::type;

Once you have that, you can expose it like this:

luabind::def("getPosition", &getPosition,
              joined<
                luabind::pure_out_value<2>,
                luabind::pure_out_value<3>
              >());

3. Specialized data structures using luabind::object

Using the converters in luabind is not the only way to make Lua values from C++. Almost everything you can do in Lua itself, you can do with luabind::object. Here is a somewhat contrived example:

luabind::object repeat(luabind::object what,
                       int count) {
  // Create a new table object
  auto result = luabind::newtable(
    what.interpreter());
  // Fill it as an array [1..N]
  for (int i = 1; i <= count; ++i)
    result[i] = what;
  return result;
}

This function can then be exported via luabind::def and used just like any other function. And this is just the tip of the iceberg. For example, you can also write functions that, at runtime, behave differently when a number is passed in than when a table is passed in. You can find out the Lua type with luabind::type(myObject).
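A minimal sketch of such a type dispatch (the describe function is hypothetical, not part of luabind):

// Branch on the runtime Lua type of a luabind::object
luabind::object describe(luabind::object value) {
  auto* L = value.interpreter();
  switch (luabind::type(value)) {
  case LUA_TNUMBER:
    return luabind::object(L, std::string("a number"));
  case LUA_TTABLE:
    return luabind::object(L, std::string("a table"));
  default:
    return luabind::object(L, std::string("something else"));
  }
}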

Of course, as soon as you want to create new objects to return to Lua, you need the lua_State pointer in that function. Using the interpreter from a passed-in luabind::object is one way, but I have yet to find another pleasant way to do this. It is probably possible to use the policies to pass it in as a special parameter, but for now I am using some complicated machinery to bind lambda functions that capture the Lua interpreter.

That’s it for now.

Keep in mind that these are not thoroughly researched best practices, but patterns I have used to solve actual problems. There might be better solutions out there – if you know any, please let me know. I hope this helped!

Using Ansible vault for sensitive data

We like using Ansible for our automation because it has minimal requirements for the target machines and the surrounding infrastructure: you need nothing more than ssh and Python with some libraries. In contrast to alternatives like Puppet and Chef, you do not need special server and client programs running all the time and communicating with each other.

The problem

When setting up remote machines and deploying software systems for your customers, you will often have to use sensitive data like private keys, passwords and maybe machine or account names. On the one hand, you want to put your automation scripts and their data under version control and use them from your continuous integration infrastructure. On the other hand, you do not want to spread the secrets of your customers all around your infrastructure – and definitely never ever store them in your source code repository.

The solution

Ansible supports encrypting sensitive data and using it in playbooks with the concept of vaults and the accompanying commands. Setting it up requires some work, but then usage is straightforward and works seamlessly.

The high-level conversion process is the following:

  1. create a directory for the variables of the host or group in question
  2. extract all sensitive variables into vars.yml
  3. copy vars.yml to vault.yml
  4. prefix the variables in vault.yml with vault_
  5. reference the vault variables in vars.yml

Then you can encrypt vault.yml using the ansible-vault command providing a password.

All you have to do subsequently is to provide the vault password along with your usual playbook commands. Decryption for playbook execution is done transparently on-the-fly for you, so you do not need to care about decryption and encryption of your vault unless you need to update the data in there.

The step-by-step guide

Suppose we want to work on a target machine that is run by your customer but that you can access via ssh. You do not want to store your ssh user name and password in your repository, but you want to be able to run the automation scripts unattended, e.g. from a Jenkins job. Let us call the target machine ceres.

So first you set up the directory structure by creating a directory for the target machine called $ansible_script_root$/host_vars/ceres.

To log into the machine we need two sensitive variables: ansible_user and ansible_ssh_pass. We put them into a file called $ansible_script_root$/host_vars/ceres/vars.yml:

ansible_user: our_customer_ssh_account
ansible_ssh_pass: our_target_machine_pwd

Then we copy vars.yml to vault.yml and prefix the variables with vault_ resulting in $ansible_script_root$/host_vars/ceres/vault.yml with content of:

vault_ansible_user: our_customer_ssh_account
vault_ansible_ssh_pass: our_target_machine_pwd

Now we use these new variables in our vars.yml like this:

ansible_user: "{{ vault_ansible_user }}"
ansible_ssh_pass: "{{ vault_ansible_ssh_pass }}"

Now it is time to encrypt the vault using the following command, which will prompt you for the vault password:

ansible-vault encrypt host_vars/ceres/vault.yml

resulting in an encrypted vault that can be put in source control. It looks something like this:

$ANSIBLE_VAULT;1.1;AES256
35323233613539343135363737353931636263653063666535643766326566623461636166343963
3834323363633837373437626532366166366338653963320a663732633361323264316339356435
33633861316565653461666230386663323536616535363639383666613431663765643639383666
3739356261353566650a383035656266303135656233343437373835313639613865636436343865
63353631313766633535646263613564333965343163343434343530626361663430613264336130
63383862316361363237373039663131363231616338646365316236336362376566376236323339
30376166623739643261306363643962353534376232663631663033323163386135326463656530
33316561376363303339383365333235353931623837356362393961356433313739653232326638
3036

Using your playbook looks similar to before; you just need to provide the vault password using one of several options: interactive input (--ask-vault-pass), a vault password file (--vault-password-file) or the ANSIBLE_VAULT_PASSWORD_FILE environment variable. In our example we pass a password file on the command line, assuming the password is stored in ~/.vault_pass:

ansible-playbook --vault-password-file ~/.vault_pass -i inventory work-on-customer-machines.yml

After setting up your environment appropriately with a password file and the ANSIBLE_VAULT_PASSWORD_FILE environment variable, your playbook commands are exactly the same as without using a vault.
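For the unattended Jenkins scenario mentioned above, the setup could look like this (a sketch; the file location is just an example):

# Store the vault password in a file readable only by the CI user
echo 'ourpwd' > /var/lib/jenkins/.vault_pass
chmod 600 /var/lib/jenkins/.vault_pass

# Point Ansible at the password file, e.g. in the job environment
export ANSIBLE_VAULT_PASSWORD_FILE=/var/lib/jenkins/.vault_pass

# From now on, playbook commands need no vault-specific options
ansible-playbook -i inventory work-on-customer-machines.yml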

Conclusion

The Ansible vault feature allows you to safely store and use sensitive data in your infrastructure without changing too much about how you use your automation scripts.

Client-side web development: Drink the Kool-Aid or be cautious?

Client-side web development is a fast-changing world. JavaScript libraries and frameworks come and go monthly. A couple of years ago jQuery was a huge thing, then AngularJS, and nowadays people use React or Vue.js with a state container like Redux. And so do we for new projects. Unfortunately, these modern client-side frameworks are based on the npm ecosystem, which is notorious for its dependency bloat. Even if you only have a couple of direct dependencies, the package manager lock file will list hundreds of indirect dependencies. Experience has shown that lots of dependencies result in a maintenance burden as time passes, especially when you have to do major version updates. Also, as mentioned above, frameworks come and go out of fashion, and the maintainers of a framework may move on to their next pet project, leaving you and your project sitting on a barely or no longer maintained base. And frameworks can’t easily be replaced, because they tend to permeate every aspect of your application.

With this frustrating experience in mind, we recently did an experiment in a new medium-sized web project. We avoided frameworks and the npm ecosystem and only used JavaScript libraries with no or very few indirect dependencies, and only where really necessary. Browsers have become better at being compatible with web standards, at least regarding the basics. Libraries like jQuery and polyfills that paper over the incompatibilities can mostly be avoided – an interesting resource is the website You Might Not Need jQuery.

We still organised our views as components, and they communicate via a very simple event dispatcher. Some things had to be done by hand, but not too many. It works, although the result is not as pure as it would have been with declarative views as facilitated by React and a functional state container like Redux. We’re still fans of the React+Redux approach and we’re using it happily (at least for now) in other projects, but we’re also skeptical regarding the long-term costs, especially from relying on the npm ecosystem. Which approach will result in less maintenance burden? We don’t know yet. Time will tell.
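For illustration, such an event dispatcher can be as small as this (a minimal sketch in plain JavaScript, not our actual implementation):

class EventDispatcher {
  constructor() {
    this.handlers = {}; // event name -> list of callback functions
  }

  subscribe(eventName, handler) {
    (this.handlers[eventName] = this.handlers[eventName] || []).push(handler);
  }

  publish(eventName, payload) {
    (this.handlers[eventName] || []).forEach((handler) => handler(payload));
  }
}

// Components communicate without knowing each other:
const dispatcher = new EventDispatcher();
dispatcher.subscribe('itemSelected', (item) => console.log(item.name));
dispatcher.publish('itemSelected', { name: 'example' });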

Books and talks that shaped my mind as a developer

Over the years I’ve read many books and watched many talks, but a few stand out (at least for me) as having influenced me in my development career.

The inmates are running the asylum by Alan Cooper
This book opened my eyes to the fact that I had approached software development from completely the wrong standpoint: the software should serve the user, not vice versa.

Design Patterns by Erich Gamma et al.
Oh, others use the same patterns as me – and what’s more, you can even talk about them without explaining every detail…

Refactoring by Martin Fowler
This book taught me that you can change the structure and the design of the software without changing its function. Cool.

Inventing on Principle by Bret Victor
Seeing a new way of interacting with your software in development blew my mind. Think WYSIWYG on steroids.

Getting real by 37signals
Getting to the core of what is essential and what really needs to be done in software/product development is laid out here so clearly and in such a stripped-down way that it struck me.

Information visualization by Edward Tufte
Another book which reduces its topic (this time: presenting information) to the core, and in doing so identifies so much unnecessary practice that it hurts.

Start with why by Simon Sinek
Purpose. Why do you develop software? Why do I arrange a UI or the architecture of an application the way I do? This is what design is about.

Only openings by Frank Chimero
Do I try to eliminate failures and therefore options, or do I leave the user the possibility to choose…

Web design is 95% typography by Oliver Reichenstein
Concentrate on the main part, the bigger part, the 95%. If you get that right, the rest isn’t so important after all.

Discount usability by Jakob Nielsen
Do what you can do with what you have.