Use real(istic) data from early on

When developing software in general and also specifically user interfaces (UIs) one important aspect is often neglected: The form, shape and especially the amount of data.

One very common practice is to fill unknown texts with fragments of the famous Lorem ipsum placeholder text. This may be a good idea if you are designing a software for displaying a certain kind of articles similar in size and structure to your placeholder text. In all other cases I would regard using lorem ipsum as a smell.

My recommendation is to collect as many samples of real or at least realistic data as feasible. Use them to build and test your application. Why do I think it matters? Let me elaborate a bit in the following sections.

Data affects the layout

You can only choose a fitting layout if you have knowledge about the length of certain texts, size of image etc. The width of columns can be chosen more appropriately, you can descide if you need scrollbars, if you want them permantently visible for a more stable and calm layout, how large panels or text areas have to be for optimum readability and so on.

Data affects the choice of UI controls

The data your application has to handle should reflect not only in the layout but also in the type of controls to be used.

For example, the amount of options for the user to make a choice from drastically affects the selection of an adequate UI control. If you have only 2 or 3 options toggle buttons, checkboxes or radio buttons next to each other or layed out in one column may be a good fit. If the count of options is greater, dropdowns may be better. At some point maybe a full-blown list with filters, sorting and search may be necessary.

To make a good decision, you have to know the expected amount and shape of your data.

Data affects algorithms and technical decisions regarding performance

The data your system has to work with and to present to the user also has technical impact. If the datasets are moderate in size, you may be able to transfer them all to the frontend and do presentation, filtering etc. there. That has the advantage of reducing backend stress and putting computational effort in the hands of the clients.

Often this becomes unfeasible when the system and its data pool grows. Then you have to think about backend search and filtering, datacompression and the like.

Also algorithmns and datastructure may change from simple lists and linear search to search trees, indexes and lookup tables.

The better you know the scope of your system and the data therein the better your technical decisions can be. You will also be able to judge if the YAGNI principle applies or not.


To quickly sum-up the essence of the advice above: Get to know the expected amount and shape of data your application has to deal with to be able to design your system and the UI/UX accordingly.

Web Security for Frontend and Backend

The web is everywhere and we use it for tons of important tasks like online banking, shopping and communication. So it becomes increasingly important to implement proper security. As attacks like cross-site scripting (XSS) or cross-site request forgery (CSRF) are wide-spread browsers, web standards designers and web application developers implement more and more mechanisms to make such attacks harder or even impossible. This puts a certain burden on both frontend and backend developers.

Since security is hard and should not be an afterthought I would like to give you some advice when implementing a web app using a Javascript-frontend and a backend service written in some of the common languages/frameworks like .NET, Micronaut, Javalin, Flask or the like.

Frontend advice

I prefer traditional cookie-based sessions to JWT-based approaches for interactive web frontends because of simplicity, browser support and the possibility to use it without Javascript. For service-to-service communication bearer tokens of some kind may be more appropriate. Your Javascript client has to include the credentials in the fetch() calls to cause the browser to send the cookie.

Unfortunately, incorrect use of cookies may be insecure, so be sure to check up-to-date advice on cookies; see some hints below in the backend part because cookies are configured and issued there.

Backend advice

Modern web security requires additional measures on the server side to ensure secure authentication and communication with web clients. You should use https whereever possible to gain at least transport security and avoid many cases of sniffing credentials or changing content between client and backend.

Improving security of cookies

First of all, cookies should be HttpOnly so that scripts cannot access the contents of a cookie. Furthermore you should ideally set the SameSite and Secure attributes appropriately and use https whenever possible. That way you have mitigated the most common attacks on your session handling and authentication.

Another bonus for cookies is that browsers can inform you about problems with your cookie setup:

Configuring Cross-Origin Resource Sharing (CORS)

Nowadays it is common for web app to be served from a different host than the backend API. This is a potential problem because attackers may sneak scripts into the browser of a user and use the existing session to access the resources in an illegal way. Therefore another means of improving security of web apps running in browsers was introduced with the access control using CORS.

For browsers to be able to prevent or allow requests to certain resources the backend has to provide appropriate Access-Control-headers, most notably Access-Control-Allow-Origin and Access-Control-Allow-Credentials. Make sure to set these values correctly or your frontend will have trouble to access your backend or you introduce a potential security whole.

Fortunately many web frameworks make it easy to configure CORS, see Micronaut documentation for example.


Security is always important and browser vendors keep implementing additional measures to mitigate problems in the current web environment. Make sure you keep up with the latest advice and measures and implement them in your applications.

Speeding up your HQL

Using an object-relational-mapper (ORM) to persist your entities, manage their state and query subsets for lists or reports is a wide-spread practice and may speed up your development.

If not used correctly, it may introduce unexpected performance problems because of unefficient default queries and the overhead this mapping introduces as most of the time table rows are converted to domain objects. Often this results in many queries and the n+1 query problem.

Nevertheless, the benefits of using an ORM may outweigh the problems and most problems can be mitigated by features and a correct usage of the tool.

Today I want to present a performance problem we had using GORM/Hibernate and how we easily fixed it without major code restructuring or workarounds.

The Problem

We used a HQL-query to load quite a lot of entities which took about 3 seconds. This was acceptable for our customer. If the user however tried to narrow down the results using a filter loading a smaller amount of the same entities took over 1 minute. Obviously, this was totally unacceptable and counter-intuitive.

The Analysis

Further analysis revealed, that a particular part of the WHERE-clause was responsible for the observed slowdown:

FROM Report r

So we did filter the root entity Report on an entity called Proposal but needed to load an associated Project entity for all reports to consider. So even if we are just using entity-ids to filter the innocently looking path leads to loading and mapping of hundreds of Project entities.

The Solution

In our example we can fortunately do a lot better without big changes to our domain model, the application code or the query.

The relevant part of the schema looks like below:

In the above schema we can see, that both, a Report and a Proposal are associated with a certain project. Remember, that in Hibernate your entities contain only the id of their one-to-one mapped sub-entities by default. This means that if we change the filter clause to


we skip loading and mapping of all the Project entities and only load the needed reports and proposals. Since they both contain the project id we can use that in our filter. This resulted in more than a 10x speedup with such a simple and non-invasive change.

General Takeaway

ORMs can be a great tool but it is very easy to shoot yourself into the foot. With enough care you can achieve both simple code and good performance but you may run into non-obvious problems every now and then.

LDAP-Authentication in Wildfly (Elytron)

Authentication is never really easy to get right but it is important. So there are plenty of frameworks out there to facilitate authentication for developers.

The current installment of the authentication system in Wildfly/JEE7 right now is called Elytron which makes using different authentication backends mostly a matter of configuration. This configuration however is quite extensive and consists of several entities due to its flexiblity. Some may even say it is over-engineered…

Therefore I want to provide some kind of a walkthrough of how to get authentication up and running in Wildfly elytron by using a LDAP user store as the backend.

Our aim is to configure the authentication with a LDAP backend, to implement login/logout and to secure our application endpoints using annotations.


Of course you need to install a relatively modern Wildfly JEE server, I used Wildfly 26. For your credential store and authentication backend you may setup a containerized Samba server, like I showed in a previous blog post.

Configuration of security realms, domains etc.

We have four major components we need to configure to use the elytron security subsystem of Wildfly:

  • The security domain defines the realms to use for authentication. That way you can authenticate against several different realms
  • The security realms define how to use the identity store and how to map groups to security roles
  • The dir-context defines the connection to the identity store – in our case the LDAP server.
  • The application security domain associates deployments (aka applications) with a security domain.

So let us put all that together in a sample configuration:

<subsystem xmlns="urn:wildfly:elytron:15.0" final-providers="combined-providers" disallowed-providers="OracleUcrypto">
        <security-domain name="DevLdapDomain" default-realm="AuthRealm" permission-mapper="default-permission-mapper">
            <realm name="AuthRealm" role-decoder="groups-to-roles"/>
        <ldap-realm name="LdapRealm" dir-context="ldap-connection" direct-verification="true">
            <identity-mapping rdn-identifier="CN" search-base-dn="CN=Users,DC=ldap,DC=schneide,DC=dev">
                    <attribute from="cn" to="Roles" filter="(member={1})" filter-base-dn="CN=Users,DC=ldap,DC=schneide,DC=dev"/>
        <ldap-realm name="OtherLdapRealm" dir-context="ldap-connection" direct-verification="true">
            <identity-mapping rdn-identifier="CN" search-base-dn="CN=OtherUsers,DC=ldap,DC=schneide,DC=dev">
                    <attribute from="cn" to="Roles" filter="(member={1})" filter-base-dn="CN=auth,DC=ldap,DC=schneide,DC=dev"/>
        <distributed-realm name="AuthRealm" realms="LdapRealm OtherLdapRealm"/>
        <dir-context name="ldap-connection" url="ldap://" principal="CN=Administrator,CN=Users,DC=ldap,DC=schneide,DC=dev">
            <credential-reference clear-text="admin123!"/>
<subsystem xmlns="urn:jboss:domain:undertow:12.0" default-server="default-server" default-virtual-host="default-host" default-servlet-container="default" default-security-domain="DevLdapDomain" statistics-enabled="true">
        <application-security-domain name="myapp" security-domain="DevLdapDomain"/>

In the above configuration we have two security realms using the same identity store to allow authenticating users in separate subtrees of our LDAP directory. That way we do not need to search the whole directory and authentication becomes much faster.

Note: You may not need to do something like that if all your users reside in the same subtree.

The example shows a simple, but non-trivial use case that justifies the complexity of the involved entities.

Implementing login functionality using the Framework

Logging users in, using their session and logging them out again is almost trivial after all is set up correctly. Essentially you use HttpServletRequest.login(username, password), HttpServletRequest.getSession() , HttpServletRequest.isUserInRole(role) and HttpServletRequest.logout() to manage your authentication needs.

That way you can check for active session and the roles of the current user when handling requests. In addition to the imperative way with isUserInRole() we can secure endpoints declaratively as shown in the last section.

Declarative access control

In addition to fine grained imperative access control using the methods on HttpServletRequest we can use annotations to secure our endpoints and to make sure that only authenticated users with certain roles may access the endpoint. See the following example:

@WebServlet(urlPatterns = ["/*"], name = "MyApp endpoint")
        transportGuarantee = ServletSecurity.TransportGuarantee.NONE,
        rolesAllowed = ["oridnary_user", "super_admin"],
public class MyAppEndpoint extends HttpServlet {

To allow unauthenticated access you can use the value attribute instead of rolesAllowed in the HttpConstraint:

        transportGuarantee = ServletSecurity.TransportGuarantee.NONE,
        value = ServletSecurity.EmptyRoleSemantic.PERMIT)

I hope all of the above helps to setup simple and secure authentication and authorization in Wildfly/JEE.

Running a containerized ActiveDirectory for developers

If you develop software for larger organizations one big aspect is integrating it with existing infrastructure. While you may prefer simple deployments of services in docker containers a customer may want you to deploy to their wildfly infrastructure for example.

One common case of infrastructure is an Active Directory (AD) or plain LDAP service used for organization wide authentication and authorization. As a small company we do not have such an infrastructure ourselves and it would not be a great idea to use it for development anyway.

So how do you develop and test your authentication module without an AD being available for you?

Fortunately, nowadays this is relatively easy using tools like Docker and Samba. Let us see how to put such a development infrastructure up and where the pitfalls are.

Running Samba in a Container

Samba cannot only serve windows shares or act as an domain controller for Microsoft Windows based networks but includes a full AD implementation with proper LDAP support. It takes a small amount of work besides installing Samba in a container to set it up, so we have two small shell scripts for setup and launch in a container. I think most of the Dockerfile and scripts should be self-explanatory and straightforward:


FROM ubuntu:20.04

RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install samba krb5-config winbind smbclient 
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install iproute2
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install openssl
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install vim

RUN rm /etc/krb5.conf
RUN mkdir -p /opt/ad-scripts

WORKDIR /opt/ad-scripts

CMD chmod +x *.sh && ./ && ./


set -e

info () {
    echo "[INFO] $@"

info "Running setup"

# Check if samba is setup
[ -f /var/lib/samba/.setup ] && info "Already setup..." && exit 0

info "Provisioning domain controller..."

info "Given admin password: ${SMB_ADMIN_PASSWORD}"

rm /etc/samba/smb.conf

samba-tool domain provision\

mv /etc/samba/smb.conf /var/lib/samba/private/smb.conf

touch /var/lib/samba/.setup

Using we start samba directly instead of running it as a service which you would do outside a container:


set -e

[ -f /var/lib/samba/.setup ] || {
    >&2 echo "[ERROR] Samba is not setup yet, which should happen automatically. Look for errors!"
    exit 127

samba -i -s /var/lib/samba/private/smb.conf

With the scripts and the Dockerfile in place you can simply build the container image using a command like

docker build -t dev-ad -f Dockerfile .

We then run it like follows and use the local mounts to preserve the data in the AD we will be using for testing and toying around:

 docker run --name dev-ad --hostname --privileged -p 636:636 -e SMB_ADMIN_PASSWORD=admin123! -v $PWD/:/opt/ad-scripts -v $PWD/samba-data:/var/lib/samba dev-ad

To have everything running seamlessly you should add the specified hostname – in our example – to /etc/hosts so that all tools work as expected and like it was a real AD host somewhere.

Testing our setup

Now of course you may want to check if your development AD works as expected and maybe add some groups and users which you need for your implementation to work.

While there are a bunch of tools for working with an AD/LDAP I found the old and sturdy LdapAdmin the easiest and most straightforward to use. It comes as one self-contained executable file (downloadable from Sourceforge) ready to use without installation or other hassles.

After getting the container and LdapAdmin up and running and logging in you should see something like this below:

LdapAdmin Window showing our Samba AD

Then you can browse and edit your active directory to fit your needs allowing you to develop your authentication and authorization module based on LDAP.

I hope you found the above useful for you development setup.

Packaging Java-Project as DEB-Packages

Providing native installation mechanisms and media of your software to your customers may be a large benefit for them. One way to do so is packaging for the target linux distributions your customers are running.

Packaging for Debian/Ubuntu is relatively hard, because there are many ways and rules how to do it. Some part of our software is written in Java and needs to be packaged as .deb-packages for Ubuntu.

The official way

There is an official guide on how to package java probjects for debian. While this may be suitable for libraries and programs that you want to publish to official repositories it is not a perfect fit for your custom project that you provide spefically to your customers because it is a lot of work, does not integrate well with your delivery pipeline and requires to provide packages for all of your dependencies as well.

The convenient way

Fortunately, there is a great plugin for ant and maven called jdeb. Essentially you include and configure the plugin in your pom.xml as with all the other build related stuff and execute the jdeb goal in your build pipeline and your are done. This results in a nice .deb-package that you can push to your customers’ repositories for their convenience.

A working configuration for Maven may look like this:


If you are using gradle as your build tool, the ospackage-plugin may be worth a look. I have not tried it personally, but it looks promising.

Wrapping it up

Packaging your software for your customers drastically improves the user experience for users and administrators. Doing it the official debian-way is not always the best or most efficient option. There are many plugins or extensions for common build systems to conveniently build native packages that may easier for many use-cases.

Serving static resources in Javalin running as servlets

Javalin is a nice JVM-based microframework targetted at web APIs supporting Java and Kotlin as implementation language. Usually, it uses Jetty and runs standalone on the server or in a container.

However, those who want or need to deploy it to a servlet container/application server like Tomcat or Wildfly can do so by only changing a few lines of code and annotating at least one Url as a @WebServlet. Most of your application will continue to run unchanged.

But why do I say only “most of your application”?

Unfortunately, Javalin-jetty and Javalin-standalone do not provide complete feature parity. One important example is serving static resources, especially, if you do not want to only provide an API backend service but also serve resources like a single-page-application (SPA) or an OpenAPI-generated web interface.

Serving static resources in Javalin-jetty

Serving static files is straightforward and super-simple if you are using Javalin-jetty. Just configure the Javalin app using config.addStaticFiles() to specify some paths and file locations and your are done.

The OpenAPI-plugin for Javalin uses the above mechanism to serve it’s web interface, too.

Serving static resources in Javalin-standalone

Javalin-standalone, which is used for deployment to application servers, does not support serving static files as this is a jetty feature and standalone is built to run without jetty. So the short answer is: you can not!

The longer answer is, that you can implement a workaround by writing a servlet based on Javalin-standalone to serve files from the classpath for certain Url-paths yourself. See below a sample implementation in Kotlin using Javalin-standalone to accomplish the task:

package com.schneide.demo

import io.javalin.Javalin
import io.javalin.http.Context
import io.javalin.http.HttpCode
import javax.servlet.annotation.WebServlet
import javax.servlet.http.HttpServlet
import javax.servlet.http.HttpServletRequest
import javax.servlet.http.HttpServletResponse

private const val DEFAULT_CONTENT_TYPE = "text/plain"

@WebServlet(urlPatterns = ["/*"], name = "Static resources endpoints")
class StaticResourcesEndpoints : HttpServlet() {
    private val wellknownTextContentTypes = mapOf(
        "js" to "text/javascript",
        "css" to "text/css"

    private val servlet = Javalin.createStandalone()
        .get("/") { context ->
            serveResource(context, "/public", "index.html")
        .get("/*") { context ->
            serveResource(context, "/public")

    private fun serveResource(context: Context, prefix: String, fileName: String = "") {
        val filePath = context.path().replace(context.contextPath(), prefix) + fileName
        val resource = javaClass.getResourceAsStream(filePath)
        if (resource == null) {
        var mimeType = URLConnection.guessContentTypeFromName(filePath)
        if (mimeType == null) {
            mimeType = guessContentTypeForWellKnownTextFiles(filePath)

    private fun guessContentTypeForWellKnownTextFiles(filePath: String): String {
        if (filePath.indexOf(".") == -1) {
            return DEFAULT_CONTENT_TYPE
        val extension = filePath.substring(filePath.lastIndexOf('.') + 1)
        return wellknownTextContentTypes.getOrDefault(extension, DEFAULT_CONTENT_TYPE)

    override fun service(req: HttpServletRequest?, resp: HttpServletResponse?) {
        servlet.service(req, resp)

The code performs 3 major tasks:

  1. Register a Javalin-standalone app as a WebServlet for certain URLs
  2. Load static files bundled in the WAR-file from defined locations
  3. Guess the content-type of the files as good as possible for the response

Feel free to use and modify the code in your project if you find it useful. I will try to get this workaround into Javalin-standalone if I find the time to improve feature-parity between Javalin-jetty and Javalin-standalone. Until then I hopy you find the code useful.

Improving Windows Terminal

As mentioned in my earlier post about hidden gems in the Windows 10 eco system a very welcomed addition is Windows Terminal. Finally we get a well performing and capable terminal program that not only supports our beloved tabs and Unicode/UTF-8 but also a whole bunch of shells: CMD, PowerShell, WSL and even Git Bash.

See this video of a small ASCII-art code golf written in Julia and executed in a Windows Terminal PowerShell:The really curious may try running the code in the standard CMD-Terminal or the built-in PowerShell-Terminal…

But now on to some more productive tipps for getting more out of the already great Windows Terminal.

Adding a profile per Shell

One great thing in Windows Terminal is that you can provide different profiles for all of the shells you want to use in it. That means you can provide visual clues like Icons, Fonts and Color Schemes to instantly visually recognize what shell you are in (or what shell hides behind which tab). You can also set a whole bunch of other parameters like transparency, starting directory and behaviour of the tab title.

Nowadays most of this profile stuff can simply be configured using the built-in windows terminal settings GUI but you also have the option to edit the JSON-configuration file directly or copy it to a new machine for faster setup.

Here is my settings.json provided for inspiration. Feel free to use and modify it as you like. You will have to fix some paths and provide icons yourself.

Pimping it up with oh-my-posh

If that is still not enough for you there are a prompt theme engine like oh-my-posh using a command like

Install-Module oh-my-posh -Scope CurrentUser

and try different themes with Set-PoshPrompt -Theme <name>. Using your customized settings for a specific Windows Terminal profile can be done by specifying a commandline to execute expressions defined in a file:

powershell.exe -noprofile -noexit -command \"invoke-expression '. ''C:/Users/mmv/Documents/PowerShell/PoshGit.ps1

where PoshGit.ps1 contains the commands to set up the prompt:

Import-Module oh-my-posh

$DefaultUser = 'Your Name'

Set-PoshPrompt -Theme blueish

Even Microsoft has some tutorials for highly customized shells and prompts

How does my Window Terminal look like?

Because seeing is believing take a look at my setup below, which is based on the instructions and settings.json above:

I hope you will give Windows Terminal a try and wish a lot of fun with customizing it to fit your needs. I feel it makes working with a command prompt on Windows much more enjoyable than before and helps to speed you up when using many terminal windows/tabs.

A final hint

You may think, that you cannot run Windows Terminal as an administrator but the option appears if you click the downward-arrow in the start menu:

Migrating a Grails application from Oracle to PostgreSQL

In my previous post I explained how to migrate an Oracle schema with data to a PostgreSQL database management system (DBMS). Besides the general tasks and issues there are additional topics to migrate a complete application using the database to the other DBMS.

In our specific case we have a grails application which we maintain since Grails 1.0 times for more than 12 years. During that time we did a ton of feature development with lots of refactoring and many database migrations. So the source database will most like not be perfectly consistent and clean.

General approach

Since Grails/GORM and the DatabaseMigration-Plugin (DBM-Plugin) do a great job at preparing an empty database with a matching schema for the application to run we let the framework tools generate the schema and only migrate the data using Ora2Pg.

Sounds simple, but how it is done in detail and what else to look for?

Generating the initial dabase schema

The DBM-Plugin provides a script to create a database changelog with a schema matching the domain model of your grails application. It is integrated in gradle, so you can grails dbm-generate-gorm-changelog initialdb.groovy to create the migration scripts providing a fitting schema. You then include this script in grails-app/migrations/changelog.groovy or replace all the migrations you had before included there with this initial database changelog.

To prepare an empty database to run with your application you call the gradle task dbmUpdate.

Checking all plain SQL code

If you are only using GORM’s dynamic finders, save()/update()/delete()-methods, HQL and the criteria API you are probably fine to run your application or perform the data mirgration step.

Our application has some specific parts where we use plain SQL. Because of syntactical differences you will want to check all the plain SQL if it works with PostgreSQL. The most obvious stuff is dealing with sequences or other queries where you need the dual table in Oracle.

Migrating the data

This is probably the part where the most things can go wrong. We had quite some work with data-inconsistencies and left-overs from manual corrections that happened over the course of running and upgrading the application for so many year. For younger and simpler applications this may not present any challenges but for us it was quite time-consuming.

Now you can use Ora2Pg to import the data. After the whole data import using Ora2Pg worked as intended you should check the value of the hibernate_sequence . This sequence is used to generate the ids of all grails domain objects.

Do not let the sequences from the autoincrement columns of the tables of your domain objects confuse you! They are not used by Grails/GORM. To avoid this confusion you can remove the default value of the id columns and the accompanying sequences.

Checking the result

You should always run acceptance or manual tests to make sufficiently sure that the migration worked as intended. There is always the possibility of a configuration or software error or some oversights in checking the application code.

If possible tests the result on a dedicated system with some snapshot of the real world data before making the switch on the production system. Good luck!

Migrating from Oracle to PostgreSQL

We are maintaining several applications with a SQL-Database as our data storage. If we can decide freely, we usually opt for PostgreSQL as the database management system (DBMS). But sometimes our clients have specific requirements because they are running the services on-premises so we use our customers’ choice. SQL is SQL anyway, is it not?

No it isn’t. And this year one of our customers asked us to migrate our application from Oracle to PostgreSQL. The migration was challenging even though we are using an object-relational mapper (ORM) and the necessary changes to our application code were very limited.

In this post I want to explain the general, application-agnostic challenges of such a migration. A follow-up will cover the application- and framework-specific issues.

Why is it not easy?

Luckily, PostgreSQL supports most common SQL features of Oracle, especially sequences, PL/SQL like scripts, triggers, foreign keys etc. and all the important datatypes. So you are mostly migrating from an inferior to a more powerful solution, at least feature and capability-wise from a client perspective. Please note that I am not judging the performance, replication, clustering and other administrative features here!

Unfortunately there is no simple and powerful enough tool to simply dump the oracle database into some standard SQL text format that you could pipe into psql or use with pg_restore. In addition there is also a challenge to convert the different number-types of Oracle to sematically equivalent PostgreSQL types etc.

Another challenge is coping with the referential integrity. Especially data in complex schemas with a lot of foreign keys are harder to migrate without proper tool support as you have to figure out the correct order of tables to restore.

Nevertheless, such a migration is doable, especially if you do not have too much scripting logic in your database. And there is a free tool to help you with all this stuff called Ora2Pg.

What can Ora2Pg do for you?

It can export the full database schema including constraints, convert datatypes based on configuration provided by you and offers a basic automatic conversion of PL/SQL code to PLPGSQL. When running the migration you can interactively choose what to migrate and what to skip. That allows you to only migrate the data into a readily prepared schema, for example.

How to run Ora2Pg?

Ora2Pg is a collection of perl scripts and configuration files so you need a system capable of running these. If you do not want to mess with your whole system and install all of the dependencies I prepared a Dockerfile able to run Ora2Pg:

FROM centos:7

# Prepare the system for ora2pg 
RUN yum install -y wget
RUN wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle

COPY ol7-temp.repo /etc/yum.repos.d/
RUN yum install -y oraclelinux-release-el7
RUN mv /etc/yum.repos.d/ol7-temp.repo /etc/yum.repos.d/ol7-temp.repo.disabled
RUN yum install -y oracle-instantclient-release-el7
RUN yum install -y oracle-instantclient-basic
RUN yum install -y oracle-instantclient-devel
RUN yum install -y oracle-instantclient-sqlplus

RUN yum install -y perl perl-CPAN perl-DBI perl-Time-HiRes perl-YAML perl-local-lib make gcc
RUN yum install -y perl-App-cpanminus

RUN cpanm CPAN::Config
RUN cpanm CPAN::FirstTime

ENV LD_LIBRARY_PATH=/usr/lib/oracle/21/client64/lib
ENV ORACLE_HOME=/usr/lib/oracle/21/client64

RUN perl -MCPAN -e 'install DBD::Oracle'

COPY ora2pg-21.1.tar.gz /tmp

RUN tar zxf ora2pg-21.1.tar.gz && cd ora2pg-21.1 && perl Makefile.PL && make && make install

RUN mkdir -p /migration
RUN ora2pg --project_base /migration --init_project my_project
WORKDIR /migration/my_project

# uncomment this if you have a customized ora2pg.conf
#COPY ora2pg.conf /migration/my_project/config/

CMD ora2pg -t SHOW_VERSION -c config/ora2pg.conf && ora2pg -t SHOW_TABLE -c config/ora2pg.conf\
 && ora2pg -t SHOW_REPORT --estimate_cost -c config/ora2pg.conf\
 && ./ && ora2pg -t INSERT -o data.sql -b ./data -c ./config/ora2pg.conf

Here are the commands and the workflow to export the oracle database using the above docker image:

docker build -t o2pg .
# this will fail initially but create the project structure and generate a default configuration file
docker run --name oracle-export o2pg
# copy the project structure to the host system
docker cp oracle-export:/migration/my_project ./my_project_migration/

Now you can edit the configuration in my_project_migration/config and copy it to the directory you have built and run the docker commands. Most importantly you have to change the connection parameters at the top of the ora2pg.conf file. When you are ready to make the first go you need to enable configuration copying in the Dockerfile and rebuild the image. Now you should get your first somehow usable export.

The most import config options we changed for our projects are:

  • Connection parameters
  • Excluded tables that you do not want to migrate
  • Deletion of the contents of the target tables
  • Conversion of some datatypes like NUMBER(*,0) to bigint and NUMBER:1 to boolean for some columns

Most of the defaults are sensible to begin with but you can tailor the export specifically to your needs. If you feel ready to try the import you can run the import using a second docker image based on the following Dockerfile-import:

FROM centos:7

# Prepare the system for ora2pg 
RUN yum install -y wget
RUN wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle

COPY ol7-temp.repo /etc/yum.repos.d/
RUN yum install -y oraclelinux-release-el7
RUN mv /etc/yum.repos.d/ol7-temp.repo /etc/yum.repos.d/ol7-temp.repo.disabled
RUN yum install -y oracle-instantclient-release-el7
RUN yum install -y oracle-instantclient-basic
RUN yum install -y oracle-instantclient-devel
RUN yum install -y oracle-instantclient-sqlplus
RUN yum install -y postgresql-server

RUN yum install -y perl perl-CPAN perl-DBI perl-Time-HiRes perl-YAML perl-local-lib make gcc
RUN yum install -y perl-App-cpanminus

RUN cpanm CPAN::Config
RUN cpanm CPAN::FirstTime

ENV LD_LIBRARY_PATH=/usr/lib/oracle/21/client64/lib
ENV ORACLE_HOME=/usr/lib/oracle/21/client64

RUN perl -MCPAN -e 'install DBD::Oracle'

COPY ora2pg-21.1.tar.gz /tmp

RUN tar zxf ora2pg-21.1.tar.gz && cd ora2pg-21.1 && perl Makefile.PL && make && make install

# you need to mount the project volume to /my_project
WORKDIR /my_project

ENV pg_port=5432

CMD ./ -d $pg_db -h $pg_host -p $pg_port -U $pg_user -o $pg_user

To run the import with your exported project run build and run the import container as follows:

docker build -t postgres-import -f Dockerfile-import .
docker run -it --rm -e pg_host=target-db.intranet -e pg_db=my_project_db -e pg_user=my_db_user -v ./my_project_migration:/my_project postgres-import

Then you can interactively provide the database password and decide which migration steps to perform.


Depending on your schema, data and privileges in the target database it may be necessary to disable all triggers before importing and reenable them after a successful import. This can done by replacing all occurences of TRIGGER USER by TRIGGER ALL in the file data/data.sql. You may need appropriate privileges for this to work.

Final words

Such a migration is not an easy task but may be worth it in total cost of ownership and maybe developer satisfaction as Oracle has some oddities and limitations for backend developers.

I will tackle some application specific issues with such a migration in a follow-up article that we encountered when migrating our system from Oracle to PostgreSQL using the above approach and tools.