ScaLearning 3 – Is Scala Hard to Learn?


Like many developers who make the journey from Java to Scala, I often find myself amazed at how much easier it is to do some things, or how much easier it is to express myself in Scala.

“ScaLearning” will be a series of short blog-posts just documenting little tidbits I find interesting, confusing, amusing, or otherwise worthy of talking about.

Is Scala Hard to Learn?

If you want to skip the explanation, my answer is no.

To elaborate, I’ve heard a great deal of “buzz” about how complicated Scala is. I’ve heard it said that it will become an academic language, and that average programmers won’t be able to pick it up. Unfortunately, the blogosphere is so cluttered with rebuttals to the idea that it’s hard to find anyone actually critiquing Scala.

Some examples of pro-Scala posts:

Is Scala Complex? (Martin Odersky)
Is Scala Too Complex? (Eric Daugherty)
Is Scala More Complicated than Java? (Vassil Dichev)

Some examples critiquing the complexity:

My Verdict on the Scala Language (Doug Pardee)
Is the Scala 2.8 Collections Library a Case of the Longest Suicide Note in History? (oxbow_lakes)

Since a lot has been said already, I won’t go into any in-depth tirade about software complexity. I will, however, draw on my experience:

For me, Scala has been easy to learn after a few years of Java because:

  • Type system that’s similar to Java
  • Java interoperability allows me to ease myself in
  • It’s possible to write near-Java syntax, and have it be valid Scala
  • Many Scala keywords have direct analogs in Java (i.e. public final String s vs val s: String)
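To illustrate those last two points, here's a tiny made-up snippet (not from either of my projects) that a Java developer can almost read as Java, yet every line is valid Scala:

```scala
// Near-Java Scala: explicit types and semicolons are legal, just optional.
val s: String = "hello";             // roughly: public final String s = "hello";
var count: Int = 0;                  // roughly: int count = 0; (non-final)
for (i <- 0 until 3) { count += i; }
println(s + ", world " + count);     // prints: hello, world 3
```

Idiomatic Scala would drop the semicolons and let type inference fill in the types, but the point is you don't have to on day one.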

So far I’ve done two “projects” in Scala.

One was a Lift (version 1.x) application that talked to a back-end set of Jersey services. The native XML handling was a dream-come-true, but the mapping of Lift 1.0 made me think Scala was much harder than it is.

The second project was an experimental conversion of a Spring MVC application using JSP views into a Lift 2.x website. I left the backing services in Java, used Spring to get the dependencies, but rewrote the front-end to be Lift. The mapping I hated in 1.x had been completely overhauled, and was actually very pleasant to work with.
That brings me to the take-away of this post.
A large part of the “complexity” people see in Scala isn’t the language. It’s the libraries and frameworks.

Scala, as a language, is very simple and very powerful. Overall, while I don’t know every feature, I feel as if I know enough to be able to understand most of the code I read, and colleagues familiar with Scala seem to be able to read the code I’m writing.

I find Scala very easy to learn. Some libraries have been a challenge… but then, so is learning the exact magic incantation for Java’s file IO.



ScaLearning 2 – Reducing string-join to a line


Like many developers who make the journey from Java to Scala, I often find myself amazed at how much easier it is to do some things, or how much easier it is to express myself in Scala.

“ScaLearning” will be a series of short blog-posts just documenting little tidbits I find interesting, confusing, amusing, or otherwise worthy of talking about.

String Joining

I recently found myself writing a SQL generator that took a domain-specific query object and turned it into raw SQL. We had to write our own, as we needed to optimize several parts of the query, or face the rage of the customer when things were a little sluggish.

(It’s generally bad for business when the customer goes Hulk on you.)

As happens often in generating SQL, I found myself wanting to store a list and later convert it into a comma-separated string.

Java – First Attempt

StringBuilder result = new StringBuilder();
List<String> items = Arrays.asList("a", "b", "c", "d");
boolean first = true;
for(String i : items) {
    if(first) first = false;
    else result.append(", ");
    result.append(i);
}
System.out.println(result.toString());

Sloppy. Three lines in the for loop, a local variable hanging around. It’s readable, but having to read 5-7 lines just to realize it’s a string-join seems excessive. Try again!

Java – Second Attempt

StringBuilder result = new StringBuilder();
List<String> items = Arrays.asList("a", "b", "c", "d");
for(String i : items) {
    result.append(", ").append(i);
}
System.out.println(result.toString().replaceFirst(", ", ""));

Shorter, no local variable, but now we’re doing “unwork” to get rid of the leading comma. That’s an easy detail to miss, so fewer lines but not much more readable.

(I’m open to cleaner implementations! Comment with what you prefer.)


In Scala (and many other languages, e.g. PHP) this is a simple one-liner.

val items = List("a", "b", "c", "d")
println( items.reduceLeft(_ + ", " + _) )
println( items.mkString(", ") )

Two ways, both one line. Both are readable, and I don’t have a real preference between them.
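Worth noting for the SQL-generation case that started all this: mkString also has a three-argument form taking a start, a separator, and an end. A quick sketch with hypothetical values (and for illustration only; real SQL should use bind parameters rather than spliced strings):

```scala
val items = List("a", "b", "c", "d")

// mkString(start, sep, end) wraps the joined string for you.
val inClause = items.map(i => "'" + i + "'").mkString("IN (", ", ", ")")
println(inClause) // prints: IN ('a', 'b', 'c', 'd')
```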

While readability and line count aren’t always so tied, the fewer lines a person has to read the faster they can read them. While I could easily write a method for Java to join Strings, this is just another example of how Scala seems to actively try to make my life easier.


ScaLearning 1 – Closure Oddities


Welcome to my first ScaLearning post.

Like many developers who make the journey from Java to Scala, I often find myself amazed at how much easier it is to do some things, or how much easier it is to express myself in Scala.

“ScaLearning” will be a series of short blog-posts just documenting little tidbits I find interesting, confusing, amusing, or otherwise worthy of talking about.


Back in March I created a document called “ScalaWTF.txt”. Today I randomly found it, read it, and thought to myself, “WTF?! How does that work?!”. Yes, I think in acronyms 🙂

Here’s what the document said:

scala> (1 to 5) filter{x => println("hi"); x%2==0}                          
res61: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 4)

scala> (1 to 5) filter{println("hi"); _%2==0}                               
res62: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 4)

It only took me a moment or two to remember why I found this so confusing. The two closures I supply to “filter” look nearly identical, yet they behave very differently.

I decided to dig a little deeper:

scala> val f : (Int) => Boolean = (x) => {println("hi"); x % 2 == 0 }
f: (Int) => Boolean = <function1>

scala> f(2)                                                          
res1: Boolean = true

scala> f(3)
res2: Boolean = false

Interesting, so that behaves as I would expect. The “println”, being a part of the closure, is executed on each call to “f”. Now the other:

scala> val f : (Int) => Boolean = {println("hi"); _ % 2 == 0 }    
f: (Int) => Boolean = <function1>

scala> f(3)                                                   
res3: Boolean = false

scala> f(2)
res4: Boolean = true

Strange, and not what I expected. Let’s take it a step further.

scala> val f : (Int) => Boolean = {println(_); 2 % 2 == 0 }    
<console>:5: error: missing parameter type for expanded function ((x$1) => println(x$1))
       val f : (Int) => Boolean = {println(_); 2 % 2 == 0 }
<console>:5: error: type mismatch;
 found   : Boolean(true)
 required: (Int) => Boolean
       val f : (Int) => Boolean = {println(_); 2 % 2 == 0 }

So apparently, while I assumed what I was doing was passing in a closure to “f”, what I was actually doing was evaluating a block, the last element of which must be a closure. Also, only the last statement is capable of binding with the wildcard.

Based on that, the following will not work:

val f : (Int,Int) => Boolean = {println(_); _ % 2 == 0 }

But this one should:

val f : (Int,Int) => Boolean = {println("a"); _ % _ == 0 }

After trying it, the REPL proved my theory. The take-home is that using wildcards in your closure isn’t quite the same as explicitly declaring your variables.
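To make the distinction concrete, here's how I now read the two forms (a sketch; the behaviour matches what the REPL showed above):

```scala
// Explicit parameter: the println sits inside the function body,
// so "hi" prints on every call.
val perCall: Int => Boolean = (x: Int) => { println("hi"); x % 2 == 0 }

// Wildcard in a block: the block is evaluated once (printing "hi"
// immediately), and only the final expression expands into a function.
val once: Int => Boolean = { println("hi"); _ % 2 == 0 }

// Both compute the same predicate; they differ only in when "hi" prints.
assert(perCall(2) && once(2))
assert(!perCall(3) && !once(3))
```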


Test Flow and Method Contracts


Today’s (long overdue) blog entry is inspired by a recent twitter discussion I’ve been following. Uncle Bob (aka Robert Martin) made the bold statement that 100% test coverage should simply be a matter of conscience.
Now, I’m not going to delve into my thoughts on the discussion. However, I find it distressing that one of the biggest arguments I see against high test coverage appears to be “but tests don’t guarantee the code works…”.
As Bob said, “Tests cannot prove the absence of bugs. But tests can prove that code behaves as expected.”

What are automated tests?

Proponents of automated testing list a great many reasons why they believe in it. To name a few:

  • Prevent code/bug regression
  • Ease of refactoring
  • Provide confidence in code behaviour
  • Reduce (or eliminate?) time spent testing manually

All of these points actually come back to one thing:
Tests mean you know what the code does
Note that I didn’t claim the code works, just that you know what it does. That differentiation is important. I’ll talk more about the fallibility of tests after.

Function Contracts

Those familiar with design by contract or Liskov’s Substitution Principle are familiar with the idea of preconditions and postconditions:
Precondition – a condition or predicate that must always be true just prior to the execution of some section of code or before an operation
Postcondition – a condition or predicate that must always be true just after the execution of some section of code or after an operation
A more formal coding of pre and post conditions takes the form of Hoare Triples. A Hoare Triple is a statement which essentially says “given some precondition, the execution of my code will produce some postcondition”.
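As an aside, some languages let you state these conditions directly in code. Scala, for instance, ships require and ensuring in its standard library. A small sketch (halve is a made-up example, not from any real codebase):

```scala
// {P}: n is even      C: n / 2      {Q}: doubling the result gives n back
def halve(n: Int): Int = {
  require(n % 2 == 0, "precondition violated: n must be even")
  n / 2
} ensuring (r => r * 2 == n, "postcondition violated")

println(halve(10)) // prints: 5
```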

Tie it Together

While most of us don’t think about it, automated tests are Hoare Triples.

public void get_shouldReturnCachedValue_givenValuePutInCache() {
  String key = "key";
  String expectedValue = "value";
  Cache cache = new Cache();
  cache.put(key, expectedValue);

  String result = cache.get(key);

  assertThat(result, equalTo(expectedValue));
}

public void has_shouldReturnTrue_givenValuePutInCache() {
  String key = "key";
  String expectedValue = "value";
  Cache cache = new Cache();
  cache.put(key, expectedValue);

  boolean result = cache.has(key);

  assertThat(result, is(true));
}

If you put this in terms of a Hoare Triple ({P} C {Q}):
{ cache.put(x, y); } r = cache.get(x) { r == y }
{ cache.put(x, y); } r = cache.has(x) { r == true }
You also might note that the method name I chose is simply the Hoare Triple written as C {Q} {P}. This is the current practice on the team I work on, and ensuring all three clauses of the triple are distinguishable in a test name has been extremely valuable to us.
Note: The test could be more terse, but splitting setup/execute/assert allows us to think in terms of {P} C {Q}

Taking it a Step Further: Mocking

Some of you may be thinking “ok yeah, that test was easy”. It’s true. In the example we didn’t have to interact with external dependencies. But how can we ensure the system works correctly with external dependencies?
Note: The concept of “from repository” is somewhat simplified to keep things short. Imagine a more complex world where a couple of DAOs had to be combined to create a Member.

Cache mockCache = mock(Cache.class);
Repo mockRepo = mock(Repo.class);
MemberLookupService service = new MemberLookupService(mockCache, mockRepo);

public void getMember_shouldReturnMemberFromCache_whenCachedValuePresent() {
  String memberId = "member id";
  Member member = new Member(memberId);
  when(mockCache.has(memberId)).thenReturn(true);
  when(mockCache.get(memberId)).thenReturn(member);

  Member result = service.getMember(memberId);

  assertSame(member, result);
}

public void getMember_shouldReturnMemberFromRepository_whenCachedValueNotPresent() {
  String memberId = "member id";
  Member member = new Member(memberId);
  when(mockCache.has(memberId)).thenReturn(false);
  when(mockRepo.getMember(memberId)).thenReturn(member);

  Member result = service.getMember(memberId);

  assertSame(member, result);
}

public void getMember_shouldPlaceMemberInCache_whenValueLookedUpFromRepository() {
  String memberId = "member id";
  Member member = new Member(memberId);
  when(mockCache.has(memberId)).thenReturn(false);
  when(mockRepo.getMember(memberId)).thenReturn(member);

  Member result = service.getMember(memberId);

  verify(mockCache).put(memberId, result);
}

Here we’ve created three Hoare Triples.
{Cache.has(x) == true; Cache.get(x) == y} r = getMember(x) { r == y }
{Cache.has(x) == false; Repo.getMember(x) == y} r = getMember(x) { r == y }
{Cache.has(x) == false; Repo.getMember(x) == y} r = getMember(x) { Cache.has(x) == true }

Reasoning About our Code

Now that we’ve built up a set of Hoare Triples, let’s attempt to reason about our code. We have established a system with the following rules:
{ cache.put(x, y); } r = cache.get(x) { r == y }
{ cache.put(x, y); } r = cache.has(x) { r == true }
{Cache.has(x) == true; Cache.get(x) == y} r = getMember(x) { r == y }
{Cache.has(x) == false; Repo.getMember(x) == y} r = getMember(x) { r == y }
{Cache.has(x) == false; Repo.getMember(x) == y} r = getMember(x) { Cache.has(x) == true }

Based on this, let’s create a scenario and pose a question. Here’s the scenario:

  • Cache.put has not been called with key “Travis”
  • “Travis” exists in the Repository and is not null

The question:
Is it possible for MemberLookupService.getMember(“Travis”) to return null?
For the answer, I’ll refer you to Modus Ponens. Specifically, given the rule “P => Q”, knowing “not P” lets you conclude nothing about “Q”: all potential values for “Q” are possible.
So can “getMember” return null? Yes. We’ve not established any rules about what “has” does when there’s nothing in the cache.

Fix the bug

To fix the bug, we need to add a couple more tests, as well as whatever code makes our entire test base pass:

public void get_shouldReturnNull_givenEmptyCache() {
  String key = "key";
  Cache cache = new Cache();

  String result = cache.get(key);

  assertThat(result, is(nullValue()));
}

public void has_shouldReturnFalse_givenEmptyCache() {
  String key = "key";
  Cache cache = new Cache();

  boolean result = cache.has(key);

  assertThat(result, is(false));
}

Creating the following rules:
{new Cache} r = Cache.has(x) {r == false}
{new Cache} r = Cache.get(x) {r == null}
Based on our earlier scenario, we will no longer receive null:
{new Cache} r = Cache.has(Travis) {r == false}
{Cache.has(Travis) == false; Repo.getMember(Travis) == y} r = getMember(Travis) { r == y }

The Code Behaves as Expected

While I don’t spend every day thinking about Hoare Triples and the predicate calculus behind my system, it’s all still there. Whether reasoning formally about our system, or informally, we do it based on what we believe the rules of our system to be.
Tests prove that these logical rules exist. Correct tests prove that they are the rules that we think they are. Whether the test is manual or automatic, as long as it is correct it can prove that we are correct about what rules govern our software.
Of course, this requires the tests to be correct. Tests are fallible as well. Automating our tests is how we address the fallibility of the tester, but I’ll go into that next time.


Never Settle For Being Good Enough

Motivation truly is a fascinating thing. We deal with it in our personal lives all the time. Managers, especially, have to be attentive to what motivates their teams and how to produce results.

I’ve personally spent my life trying to understand what motivates me.

Have you ever been told, “that chore will be easier if you make it a game”? Doesn’t work for me at all, but tonight I realized what does.

My Sudden Realization

Tonight I did a group presentation for school. The leader of the group pulled my tail out of the fire a little bit (as best he could) when I failed miserably at my portion.

Why did I fail? That’s not the issue.

My failure motivated me!

When I sat down, my head was churning. How can I do better? How can I help my team succeed? Where did I go wrong? Where did other groups go right? How can I improve?

Tonight I realized, I’m motivated by self improvement.

Never Good Enough

Uncle Bob recently published a post ending in the quote, “Whatever you are, be a good one”.

That’s not good enough for me.

In software, at school, at mundane things like cleaning my apartment, I will never settle for being “good enough”.

In the past I’ve tried to control my motivation by creating challenges or relating unpleasant tasks to real interests. The fact is, it just hasn’t worked for me.

However, I am motivated to learn how to most effectively work with my team in order to be successful in the remainder of the project.

I am motivated to improve myself as a software developer, or perhaps even a software craftsman, every single day.

I am motivated when I say, “No. I am not good enough at this. I’m not the best I could be. I can be better, and I will prove it.”

Give it a try 😉

“Who has written this crappy code?” – I did!


Today I thumbed over an amusing comic while catching up on blog posts.

I actually found it very funny, but it prompted a long discussion with a friend on IRC and I felt like sharing this.

Who has written this crappy code?

Let’s be honest, we’ve all asked it. We’ve all wondered. We’ve all used svn blame (or the equivalent).

We work in an industry full of very passionate people. We are very passionate people. We’re passionate about the great software we write. We’re passionate about wanting to improve ourselves. When we see bad code, we’re passionate to the point of frustration. Sometimes we’re passionately fearful that it was us.

I learned a very interesting answer to this question from one of my peers though, and that’s what I’m sharing today.

Who has written this crappy code?

I did.

For me it signifies a few things.

First of all, we all write bad code sometimes. It’s not something to be embarrassed about. How could anyone ever improve if they were unable to admit they have flaws that need improving?

Second of all, does it matter? Would knowing the answer actually help you fix the code any faster? Just pretend I did it, swear under your breath, then move on and fix it. Come tomorrow, you won’t respect me any less or more for it, so let’s not waste time with SVN blame, I’ll just take the blame up-front.

Lastly, the team owns the code. All of the code. (Sometimes I use the answer “we did”.) Even if the code was written before the current team, the answer “I did” or “We did” really signifies the idea that as a team we are responsible for the code under our watch.

Of all the ideas I’ve been introduced to in the last two years, this is one of the simplest and most powerful. Give it a try on your team, see how people react. You may be pleasantly surprised.


Why Failing Tests Make Refactoring Safe


I recently read an interesting blog post about why it’s ok not to write unit tests.

I disagree.

However, I believe the topic does deserve a proper rebuttal of some form, and I would like to try my hand.

What are Unit Tests for?
I would like to start by clearing the clutter. Much of cashto’s blog post discusses jobs that unit tests just aren’t really well-suited for.

Do unit tests catch bugs?
Not very well, no. Unit tests can catch some very small insignificant bugs, but I’m sure most people reading this will agree that the real bugs come up from interactions between code. This just isn’t feasible to test in a unit test – we need functional tests, integration tests, and acceptance tests to do that.

Do unit tests improve design?
No, but I’ll admit they make poor design painful.

I recently worked with an object that evolved to require 10 constructor parameters. Mocking 9 parameters just so you can test interactions with the 1 you do care about is painful. The poor design hurt, and we refactored to a simpler design.

Do unit tests help you to refactor?

Why failing tests create safety.
Until recently I didn’t fully get the meaning of this idea.

The big problem I saw was that failing tests just got altered or deleted. How on earth does deleting or changing a failing test make the refactor easy? Couldn’t you have just refactored without the test?

The answer hit me like an untested jar-file.

When I change a piece of code and 3 tests fail, I can read those tests and know exactly what contracts I had violated. That doesn’t mean my change was wrong, but it does make it very clear what assumptions had previously been made about the system.

This empowers me to go into any areas that used those components, and modify them based on the new contract I am trying to put in place. I now know exactly how my alteration affects other areas of the system.

What about bugs and junk?
Yes, Unit Tests can help uncover bugs, but usually these bugs are very low-level simple algorithm bugs.

When Unit Tests drive my code, the unit tests act as living documentation for the assumptions I actually care about, but they do absolutely no work to ensure other areas of the system use the component properly.

So how do you really catch bugs? By testing the system in its assembled state. This is accomplished using Functional and Acceptance Tests. Then you test the interactions your system may have with other systems using Integration Tests. These catch real bugs, not unit tests.

So that’s why it’s the corporate standard?

As cashto makes very clear, many people are well aware of what unit tests are, but don’t fully understand what they provide or what a test failure actually means.

It’s entirely possible that someone at your organization knows exactly what Unit Testing is meant to do. If that’s the case, great! Encourage them to share that knowledge with the rest of the team(s), because it’s very valuable.

Then again, it’s entirely possible that nobody knows why Unit Testing is actually done. Maybe someone just heard it was a good thing. If that’s the case, I encourage you to go out, learn about the reasons behind TDD and Unit Testing, and help educate your team and corporation.

I believe one of the primary problems with testing, TDD, and Unit Tests is a lack of understanding.

I walked out of University appropriately dogmatic about the usefulness of Unit Tests and testing. I criticized how many tests my team had (on their already slow 45-minute build cycle). I ran around adding tests whenever I could. Up until recently I didn’t know why, and I honestly believe the quality of my tests suffered for it.

Like all practices, we have to know why what we are doing is good before we can reap the full benefits.


The Retrospective: A Key to Self-Organization

One of the most interesting aspects of writing computer software, at least for me, has been the collaborative nature of what we do. I enjoy seeing the various ways in which a team can be organized, or organize itself, and the results of each of them. What makes software development effective, in the end, is an effective team that can deliver quality solutions to their customers in an efficient manner.

That’s why lately I’ve been absolutely fascinated with the effect agile practices have had on my team. The results have been absolutely amazing, but none of them come even close to comparing with the power of self-organization.

Self-organization has been one of the driving factors of the success of our team in the past few months, and I believe one of the most important aspects of our behaviour is our sprint retrospectives.

What does Self-Organization mean?

From Wikipedia:

Self-organization is a process of attraction and repulsion in which the internal organization of a system, normally an open system, increases in complexity without being guided or managed by an outside source. Self-organizing systems typically (but not always) display emergent properties.

Self-organization usually relies on four basic ingredients:

  1. Positive feedback
  2. Negative feedback
  3. Balance of exploitation and exploration
  4. Multiple interactions

But what does that really mean for a software team?

Well it implies a state of flux and constant change. It also implies that outside forces will eventually affect the team and influence its growth. However, these outside forces are not allowed to guide or manage the growth of the team. The team will, instead, react to the pressures of the outside world and adapt to them in order to operate more effectively. It really is a powerful statement, but how on earth does it work?

Positive Feedback/Negative Feedback

One of the easiest concepts to understand is feedback, both in positive and negative form. In fact, this is why retrospectives are as powerful as they are.

The purpose of a good retrospective is to look back at the previous sprint or iteration, and identify what things contributed to the successes and failures of the team. Often teams will identify many positive driving factors which helped them, as individuals, feel as if they were doing a good job. They will also identify negative factors which slowed them down, or made them feel as if they could have done better.

For a team that’s attempting to self-organize, this feedback is absolutely vital. Without it, the team would push forward, often missing out on key practices that lead to extremely effective behaviors. They would also miss out on identifying what practices are preventing further success.

As an example, in a recent retrospective the team I am currently on identified that team professionalism has been declining. While we are still producing high-quality results each week, we’ve allowed bad habits to enter our daily workflow. For example, because we felt so comfortable with each other we had begun to use sarcasm and off-humour to illustrate our points, instead of more professional and intellectual alternatives. While in small doses this can be an effective stress-relief, our retrospective allowed us to identify that team meetings were starting to take longer because it was often harder to reach a team consensus.

However, the same retrospective identified something else far more illuminating. Occurrences of the unprofessional behavior actually seemed to be directly related to team frustration, which was often because we had failed to “fail early” on difficult tasks. Since identifying this pattern, we’ve started to be able to identify problems in our short sprints (1 week) as early as the second day, when it would normally take until “crunch time” on the fourth day to realize the danger.

Balance of Exploration and Exploitation

Identifying positive and negative aspects of a sprint is an excellent way to exploit the knowledge and experience the team has gained, but another key factor of a healthy self-organizing system is a balance between exploiting existing knowledge, and exploring new alternatives.

This is where another key part of retrospectives comes in: “Things to Try”.

Identifying what went wrong in a sprint is an excellent starting point, but that knowledge would be of little value if the team didn’t have ideas for how to fix the problems or prevent them from occurring in the future. Perhaps even more importantly, if the team isn’t aware of ways in which it can repeat the positive factors of the sprint, the benefit from them may quickly be lost.

After discussing what we felt went right and wrong within a sprint, my team shifts its focus to what things it can try in the future to continue to achieve excellent results. However, keep in mind that this is a balance. We can’t try all of the ideas we have every single week, the effort would overwhelm us and we would spend all of our time exploring new disciplines while failing to exploit our existing habits. The practice on my team is to allow each team member to vote on any two of the ideas presented during our “things to try”. From there, the items with the top two votes are adopted as team discipline until they become habit, or are found to be of little value.

Multiple Interactions

The last of the ingredients defined is the existence of multiple interactions. For clarification on this one, I actually decided to dive down to the source article:

Swarm intelligence: from natural to artificial systems, Page 11:

(Self Organization) generally requires a minimal density of mutually tolerant individuals. Moreover, individuals should be able to make use of the results of their own activities as well as of others’ activities.

This, perhaps, identifies the most important part of the sprint retrospective: the team! What makes agile, self-organization, and a good retrospective all so successful is a good team full of intelligent and talented individuals who truly care about their work. The retrospective is one of the many ways in which the team interacts. Daily scrum-meetings, for example, are extremely valuable for ensuring continual progress, but are designed to identify roadblocks. This isn’t the type of information that members of the team can easily make use of, but it is something that should be acted on.

The primary artifacts of a successful retrospective are knowledge and action. The team is empowered with knowledge of what works, and what doesn’t. This way each team member can learn from the experiences of their teammates. The team takes action based on this new knowledge, in order to help become more effective and efficient.


While there are many aspects of the agile mindset that make a team effective, I believe that one of the most powerful tools we have available is the retrospective.

A team that is empowered to self-organize is empowered to remove its own roadblocks, and do what it takes to build better software faster.

A good retrospective identifies common problems, key positive practices, and encourages the team to change, grow, and adapt based on those realizations. By doing that, the team can become more efficient, produce high-quality software, and achieve excellent results. Without that, a team will get bogged down in “the rules” without ever stopping to figure out if they are being helped, or hindered.

“I have but one lamp by which my feet are guided, and that is the lamp of experience. I know no way of judging of the future but by the past.” (Edward Gibbon)


XmlBeanUtils 1.0.1 Release – Legacy parser support

For those who read my blog, I released an open-source project a couple of days ago for working with XML using DynaBeans from BeanUtils.

Today, while working with XML directly, I discovered that some older XML parsers, such as older versions of the Xerces parser, do not actually support all of the methods of Node that are required. Here’s a comparison:
Java 1.4 – Node
Java 1.5 – Node

To help deal with this, I’ve released a new version of XmlBeanUtils today with support for legacy XML parsers.

I’ve also gone through and added javadoc. My next step will be to put it up on a Maven repository. That should be coming in the next few days, but I probably won’t post a notification, so keep an eye on the XmlBeanUtils site.

XmlBeanUtils – Working with XML Easily

Edit:A new version was released, see here for notes.

My team and I were recently working on a “Converter” utility whose job is to take in a String and output one of our domain objects.

The String contains a chunk of XML conforming to a provided Schema, and in all honesty the mappings are incredibly simple. It would take far more time and code to configure XStream or JAXB than it would for me to express, on a field-by-field basis, what goes where.

After parsing the XML into the appropriate Java DOM objects, I noticed that retrieving values from the DOM is far more complex than seems necessary. What I really wanted was a tool that took in XML and spit out something like a DynaBean.

I honestly thought it felt like an obvious use case, but after searching for twenty minutes I couldn’t turn up a single framework or library that did what I actually wanted.

So I made one.

XmlBeanUtils is an open-source library that does the very simple job of pumping raw XML into an easy-to-query bean. It’s an extension to the BeanUtils, which is required as a transitive dependency.

XmlBeanUtils 1.0.0 provides the simplest useful utility I could come up with to work with XML:

import com.codequirks.xmlbeanutils.XmlDynaBean;

public class Example {
    public static void main(String[] args) throws Exception {
        String randomXml = 
           "<root style=\"metasyn-tastic\" description=\"full of gibberish\">" +
                "<child name=\"foo\">Hello!</child>" +
                "<child name=\"bar\">World!</child>" +
                "<step-child name=\"baz\">" +
                    "<grandchild>Random Data</grandchild>" +
                "</step-child>" +
                "<child name=\"zap\">Goodbye.</child>" +
            "</root>";

        XmlDynaBean root = new XmlDynaBean(randomXml);

        /* Output: root */

        /* Output: metasyn-tastic (attributes are read with an "@" prefix) */
        System.out.println(root.get("@style"));

        /* Output: full of gibberish */
        System.out.println(root.get("@description"));

        /* children will contain foo, bar, and zap */
        List<XmlDynaBean> children = (List<XmlDynaBean>) root.get("child");

        /* step-children will contain baz */
        List<XmlDynaBean> stepChildren = (List<XmlDynaBean>) root.get("step-child");

        /* family will contain foo, bar, baz and zap */
        List<XmlDynaBean> family = (List<XmlDynaBean>) root.get("*");

        /* You can get nth child as well (zero-indexed) */
        /* Output: bar */
        XmlDynaBean bar = (XmlDynaBean) root.get("child", 1);

        /* You can grab child-attributes quickly for the 0th named child */
        /* Output: foo */
        System.out.println(root.get("child", "@name"));
        /* Output: baz */
        System.out.println(root.get("step-child", "@name"));
        /* Output: foo */
        System.out.println(root.get("*", "@name"));

        /* But getting attributes of the nth child is also pretty easy */
        /* Output: zap */
        System.out.println(root.get("child", 2).get("@name"));
        /* Grabbing the grand-children of the 0th named child also works */
        /* Output: baz */
        List<XmlDynaBean> grandChildren = 
                                (List<XmlDynaBean>) root.get("step-child", "*");
    }
}

Currently XmlDynaBeans are read-only.

Plans for the near future:

  • A mutable XmlDynaBean so you can “change” the XML
  • The ability for XmlDynaBeans to generate their String XML representation
  • Proper Javadocs
  • Beefing up the Google Code landing page

I would love some feedback. Please leave your comments below:

  • Has this already been done before?
  • Is this useful?
  • What should I add?
  • Have you noticed any unexpected behaviour?