XML for Secured Message Passing

Securely passing XML Documents between two parties requires that the receiver can find out:

  • who sent the message
  • whether the message has been tampered with

and the sender can guarantee that:

  • enough information is available so the receiver can verify who the sender is
  • the receiver can find out whether the message has been tampered with

These requirements can be satisfied if the document is digitally signed. However, using XML as a message format brings additional complications that must be handled. Consider the following two documents:

They represent the same information but in different way:

  • the attribute value is double quoted in the first document while it is single quoted in the second one
  • the empty element uses different syntax

This is a simple example. Think about all the other things that can differ. For example:

  • the order of the attributes
  • the document encoding
  • arbitrary whitespaces
  • new lines
  • how new lines are encoded

All these differences will cause documents having the same data result in different signatures.

 

XML Canonicalization to the Rescue

To avoid these situations, the XML document must be canonicalized so it is always represented in the same way no matter what the original document structure is. The canonicalization is done by following the rules defined in https://www.w3.org/TR/xml-c14n/. The following list is a sample of the rules just to show what they look like:

  • XML declaration is omitted
  • All characters are encoded to UTF-8
  • Empty element must be converted to start-end tag pair
  • Attribute values use double quotes
  • CDATA sections are replaced by its content only

Example

To demonstrate XML canonicalization the Apache Santuario library will be used.

The following code uses the Canonicalizer class to convert the XML documents to have the same structure.

As mentioned early this is a simple example. The XML document can be much richer containing comments, namespaces, entity references etc. The canonicalization of these cannot be handled by just one mapping. That’s why there are different algorithms that can be specified when a Canonicalizer is instantiated. You can see the full documentation here or in the specification 

If you are curious what is the canonicalized form of your XMLs just give them a try following the example above.

LVM2: Extend file system

LVM2 refers to the userspace toolset that provide logical volume management facilities on Linux. It is reasonably backwards-compatible with the original LVM toolset.

Resizing volumes with the LVM2 toolset is easy. First you have to adjust your physical partition table using fdisk. Let’s say you have a 60GB physical disk and a 50GB physical partition on it.

You want to resize the partition to take the whole disk. To do that you delete the partition and create it again with the new geometry.

The highlighted rows show the commands issued to fdisk. After the partition table is rewritten reboot the system.

The new partition table should now look like this:

Now that the new partition is setup you can assign the new space to a PV (physical volume). You can check the physical volumes with the pvs command.

In this case you have only one PV (physical volume) of size 50GB. Since you know your partition is 60GB you can very easily resize the PV to take all space with the pvresize command.

As you can see now the PV is of size 60GB and there is 10GB of free space. You can now use the free space to resize an LV (logical volume) using the lvextend command.

The -r option tells lvextend to resize the underlying filesystem (e.g. ext4 or btrfs) along with the LV. The -l option tells lvextend to set the LV size in units of logical extents. In this case the argument to -l is +100%FREE which means that 100% of the free space will be used. In other words all free space will be added to the system/root LV. After lvextend returns the file system should be resized and operational without system reboot.

String to Numeric and Vice-Versa in Java

How often do you write code that converts String input to a number? How often do you need to convert it back?
For example consider the following code:

There are two conversions here. First, the literal “123123123” is converted to Integer and then that Integer is converted to String again (System.out.println calls the toString() method). So what do you expect the output to be? It should be the same as the original String argument. While this is true for the above-mentioned example this is not the case when operating with floating point numbers. What do you expect the output of this code to be:

Surprisingly or not (hopefully not) it is:

Yes, that’s right! After converting String to Float/Double and then back to String the value is different than the one we started with. This is not a big deal unless you develop an application that handels numbers without considering the context.

Let’s say there is a common method that converts Strings to numbers. Something like this:

At some point in time a defect comes up that the system cannot work with floating point numbers. So the easy fix would be:

This looks clever at first sight since the method is declared Number so its callers won’t break and the fix is in one place only. However, once this fix is deployed all the code that converts the number back to String will start to behave in an unexpected manner. There will be validation errors, wrong data display etc. For example, code that accepts String argument that is expected to be an integer representation will start getting values like “1.2312312E8”.
As an example see the program below

Run it with argument “123123123”. Now change lines 24 and 35 to

and run it again without changing the argument.

Yes I know that most of this is bad code and lack of good engineering. However, keep in mind that as a system grows it can get entangled and simple changes can lead to massive failures.

Java To JavaScript Cheat Sheet

Coming from the Java language and landing in the JavaScript realm can be kind of confusing experience. All you see is functions and thinking in objects can be hard if you are not familiar with the language idioms.

This post is an attempt to help overcome the initial struggle and can be used as a cheat sheets for Java developers who want to start writing in JavaScript.

Let’s start with something simple – how to create objects.

Object Creation

Java sample that declares a Book class and creates an instance of it

The Java code should be clear so no need to dwell on it. However let’s see the JavaScript code for the same thing.

JavaScript samples for creating and using a Book object

The easiest way is to define an Object literal (a.k.a inline object)

As you can see JavaScript does not have any notion of classes – you define objects instead.

What about creating several object that have the same properties and methods? One way is to place object literals in the code however this can be a bit frustrating. To avoid code duplication and accidental typos a function that creates such objects can be defined

There is a shortcut to this approach and it is to use constructor function:

The constructor function is the closest version of a class in JavaScript. It basically defines properties and methods.

Note that in this case the new operator is used to invoke the function. Also by convention the name of a constructor function starts with a capital letter.

There is still another way to create an object – by invoking the Object() constructor.

You can use this constructor to create generic object and attach properties and methods to it later.

Default values for properties and methods can be passed to the Object constructor:

Polymorphism

Java sample – Render Rectangle and Square

Again the Java code should be clear so let’s see the JavaScript part.

JavaScript samples showing the dynamic nature of prototypal inheritance

Since JavaScript has no notion of classes the inheritance is implemented by using a prototype object. Basically each object has a prototype object that serves as a template and extenders inherit properties and methods from it. It might sound confusing at first but after running the following examples it makes sense.

Let’s start by defining a Rectangle object.

It has two properties. If this object is designed to be extendable all methods that can be inherited must be defined in the prototype object.

(Remember that each object has a prototype object)

The code above adds the render method to the prototype object thus any object that inherits Rectangle can override it.

Now let’s define another object – Square, that will extend Rectangle

As you can see it has only one argument and calls Rectangle ‘s constructor passing it.

Now If we call render()

We’ll see the following error:

To fix this we have to copy the prototype object from the parent object

Calling render() again will result in:

Now we can override the behaviour of renderfor Square objects:

And here is the full code:

Can you guess the output?

Encapsulation

Java sample – private Rectangle properties

The Java code defines a Rectangle whose properties are not exposed and can be changed only through the resize method.

Let’s see how this can be achieved in JavaScript

JavaScript counterpart – function scoped variables

The code looks a bit stranger than the previous snippets so let’s start from the top. The expression

defines anonymous function and executes it. If the last () is omitted then only a function will be declared but not executed (if no variable points to it then the function cannot be called at later time).

The next important thing is this idiom:

This is an anonymous function that returns an empty object. Having in mind that the variables are scoped in a function this means that everything that is declared in the function and before the return statement will be visible to the inlined object and not visible to the callers of the function. Example:

This will log 10 in the console output. And this will log undefined

Hopefully the code we started with is more clear now. The next question is – What about having multiple objects with scoped state?

JavaScript sample – multiple objects with scoped state

As a last note – this approach is not restricted to variables only. Functions declared outside the return statement won’t be visible as well.

JAXB With Kotlin

Kotlin is a programming language that runs on the Java Virtual Machine (JVM) and is designed to interoperate with Java code and thus with the existing frameworks.

For the purpose of this tutorial let’s imagine we develop a library application. As a first step let’s define a class that represents a book. A book has a name, author and description

About the Kotlin language: More on classes

As a second step let’s marshal a Book just to make sure everything is working:

About the Kotlin language: More on functions

Note: Here you can find more information on the Book::class.java expression

Running the application shows an error stating that: unable to marshal type "Book" as an element because it is missing an @XmlRootElement annotation Let’s add that annotation to the Book class and run the application again – this time another error is printed: Book does not have a no-arg default constructor. When we added the Book class we declared only one constructor (the primary one) which has three arguments. Let’s add another one in order to make JAXB happy:

The new constructor calls the primary one with default values. Kotlin has special opinion on null values (see here) and the developer must specially declare the null possibility. Since I agree that null should be avoided as a value, default values are provided in this case. Let’s run the application again. This time the output is:

The book element is empty because we didn’t add @XmlElement to the fields we want to marshal and we did not tell JAXB how to access the properties. Let’s fix that:

And here is the output:

So far so good. We have a working marshaling of a Book. Let’s organize books in a Library by adding a new class that will hold a collection of Books:

About the Kotlin language: More on collections

This time the library name won’t be an element but an attribute. Let’s marshal it:

And here is the output:

About the Kotlin language: More on try-with-resource

Let’s unmarshal it and print the contents of the Library

(yes Kotlin supports multi-line strings. It’s omitted here in sake of brevity)
Note:Since the Unmarshaller returns Object (not always but in this case it does) but we expect Library object then we use “unsafe” cast with as
This is the output:

 

Marshall JAXB Elements Without Prior Declaration of Their Type

Java Architecture for XML Binding (JAXB) is an API for access of XML documents from applications written in the Java programming language. It’s convenient to use as it usually follows this pattern:

  1. Create JAXBContext
  2. Create Marshaller/Unmarshaller
  3. Marshal/Unmarshal the code

The following are examples that marshal:

and unmarshal XML document to object:

The marshalled XML looks like this:

The actual model is this:

As you can see the JAXBContext is initialized with the classes that take part in the marshall/unmarshall process (in this case the Project class has a reference to the Task class and that’s why there is no need to pass all classes to the JAXBContext). This means that those classes should be known beforehand. However often this is not the case. There are cases when the contained elements are known only at runtime. One such example is a Task Runner – an application that can run tasks and the tasks with their execution order and parameters are contained in single XML file (sounds familiar? Yep, Ant is what comes to mind). Consider the following example:

It contains several elements that have different names – tstamp, mkdir, delete – however they are all of the same type. They are all Tasks. And because such an application is extensible (users can add arbitrary Tasks) by plugins the Tasks are not known beforehand.

Let’s start with the model

Minor changes can be spotted. First of all Task is an abstract class so the actual implementors are not referenced. The other thing is the @XmlElementRef annotation. This annotation dynamically associates an XML element name with the JavaBean property. When used the XML element name is derived from the instance of the type of the JavaBean property at runtime.

This is the Java code defining the Tasks

There are two ways to marshal those objects. First add jaxb.index file to the package containing the JAXB annotated classes and list them. The second way is to provide ObjectFactory in the package containing the JAXB annotated classes.

 

Source: ObjectFactory.java
Here is an app that instantiates custom tasks and marshals them

Source: Application.java
As you can see in this case the JAXBContext is created by specifying the package names containing the JAXB annotated classes.

Full source code

 

Resources:
https://jaxb.java.net/nonav/2.2.4/docs/api/javax/xml/bind/annotation/XmlElementRef.html
https://jaxb.java.net/nonav/2.2.4/docs/api/javax/xml/bind/annotation/XmlElementDecl.html

Git cheats

git --everything-is-local

Git is a great and very powerful tool. There are probably over a hundred command line options that allow you to take advantage of its features. Here are few interesting things you can do easily.


Restore a deleted file

To restore a file that has been deleted several commits ago you have to find the commit that deleted the file. Then checkout the version of the file that was before the delete commit. Suppose you have the following in your repository.

As you can see originally there was a file named main.c but at some point it got deleted by: 2626b9b. To restore the content of main.c you can find the last commit that changed it.

And checkout the previous version (denoted by the caret character: ^).

Of course you checkout the previous version because the last commit that changed the file is the one that deleted it.


Rename files and directories

Git is very smart when it comes to renaming files. You don’t actually have to tell git that you’re renaming a file, it will figure it out on its own. Here’s an example, first we rename the file with whatever tool we like.

At this point git knows that you deleted a file that was in the repository (Main.java) and created a new file that is still untracked (Launcher.java).

Then you can just add these changes to the staging area and git will know that you renamed the file because the content is the same.

Actually it will detect that you renamed a file even if you made changes to the content. In this case I also changed the name of the class to keep it a valid Java file.

As you can see git calculates a similarity index of 68% and correctly concludes this is a rename operation.


Revert commits

Sometimes you’ll need to revert the changes introduced by one or more commits. There are generally two approaches to do that. One is to change the existing git history and the other to create new commits. Changing history can lead to issues if the commits are shared (when other people have them in their repositories). So probably the easiest and safest way is to use the revert command to create new commits with the opposite changes. Lets look at the following history.

Commit 329d44c adds a main method.

You can create a new commit that has exactly the opposite changes.

Now the history contains a new commit that reverted the changes 329d44c introduced.

Factory With Java 8

Consider an application that prints student names taken from a data access object.

The application can work with different data stores and the storage should not affect the operations performed. That’s why there is an interface DataAccess that provides the data and several implementations (in this case two) that know how to handle the reading from the actual data storage.

Another part of the application is the DataProcessor that prints students obtained from the DataAccess. The DataProcessor does not care what is the actual storage it needs just the data (the students’ names).

Here is a sample implementation:

Source: DataProcessor.java

Source: DataAccess.java

Source: MongoAccess.java

Source: RdbmsAccess.java

Here is how a main app would look like written in Java 7. First a Factory class that creates a DataAccess is required:

Source: Factory.java

and an Application that actually uses the DataProcessor

Source: Application.java

This code can consist of fewer lines if using Java 8 capabilities for Lambdas and the interfaces defined in java.util.function.Supplier

Source: ApplicationWithLambda.java

In this case the Factory class is replaced by java.util.function and the code is inlined using a lambda expression.
This code is more compact that the one using Java 7 but it can become event more compact using method references.

Source: ApplicationWithLambda.java

Here the Supplier is inlined using a method reference to the constructor of each data access class. A nice benefit of this approach is that in case of not supported type it is easy the show a readable error displaying all supported types.

You can find the full code here

Java 9 Process API Improvements

Introduction

Java 9 adds improvements to the Process API that deal with ongoing shortcommings like:

  • getting the PID of the running process
  • getting the children and/or the descendants of a process

Along with these a new class is added that helps:

  • listing all running processes
  • getting info for an arbitrary process
  • traversing process tree

Note: The information returned by these methods is just a snapshot of the processes running on the OS. Some of them may be missing, stopped etc. during the traversal so handle with care.

Samples

Showing Process Information

Sample showing obtaining information for running processes’ id and command

Getting information for a process returns Optional and based on that further filtering can be done. For example listing only processes that have available command property:

You can see there is no ‘N/A’ value for ‘Command’ in the output

Sample showing obtaining information for an arbitrary process

In this case the ‘pid’ is of the IDE used for writing the sample.

Getting the ID of the Running Process

Sample showing obtaining information for a process that was killed after taking a snapshot from the OS

The output of this snippet is:

In this case the PID is of a text editor. To repro start the sample code in debug and once the processHandler is obtained stop the text editor.

Traversing process tree

The output of this program is:

In this sample ‘pid’ is the running IDE. The output shows the difference between children() and descendants(). At the begging of the programm a new process is started however it is not listed when children() is called and it is shown when descendants() is called.

Sample Application

Sample application that shows:

  • all process when invoked with argument ‘ps’
  • process’ children when invoked with arguments ‘children <pid>’
  • process’ descendants when invoked with arguments ‘descendants <pid>’

Source:App.java

References

Arbitrary Java object serialization to YAML

There are several good options for java object serialization and deserialization. One of these is to use the YAML data serialization standard.

YAML (rhymes with “camel“) is a human-friendly, cross language, Unicode based data serialization language designed around the common native data types of agile programming languages. It is broadly useful for programming needs ranging from configuration files to Internet messaging to object persistence to data auditing.

There are also several Java libraries that implement the YAML standard (as listed on yaml.org).

  • JvYaml
  • SnakeYAML
  • YamlBeans
  • JYaml

For the purpose of this article we’ll be using SnakeYAML as a solid modern implementation of the YAML 1.1 standard.

Consider the following simple example:

Source: Weapon.java

This Java bean can be serialized and deserialized with just the following code:

Source: WeaponTest.java

Simple Java beans are automatically serialized and deserialized, the data looks like this:

Of course in reality things are not always that simple. Your domain model may contain types that are not Java beans like java.net.URL, java.util.UUID, java.net.InetAddress and others. In this case the serialization process will ignore the non-bean objects and deserialization will fail altogether.

Fortunately SnakeYAML’s represent and construct functionality can easily be extended. Consider the following more complex example:

Source: Cyborg.java

Now we have non-bean types and java.util.Collection. To handle serialization and deserialization properly we need to use custom representer and constructor.

The representer takes the non-bean objects and transforms them into custom-tagged scalars. Also RepresentInetAddress is registered for both Inet4Address and Inet6Address to handle both protocols.

Source: ApplicationRepresenter.java

The constructor does exactly the opposite of the representer does.

Source: ApplicationConstructor.java

Our custom representer and constructor are passed to the constructor of the YAML processor. Putting everything together looks like this:

Source: CyborgTest.java

Now the type hierarchy is represented in YAML like this:

As you can see the non-bean type values are tagged with custom tags: !inet and !uuid. That’s how the YAML document is later properly deserialized.