Svend

Random thoughts about IT

How to install R packages with Ansible

with 5 comments

Here is a short snippet of an Ansible playbook that installs R and any required packages on any node of the cluster:

Read the rest of this entry »

Written by Svend

February 25, 2014 at 6:25 pm

Posted in Uncategorized

Error handling in Storm Trident topologies

with 3 comments

This post summarizes my current approach to error handling when designing Storm Trident topologies. I focus here on code design, not on deployment best practices such as supervision or redundancy.

Because of the real-time streaming nature of Storm, when facing most kinds of errors we ultimately have to move on to the next piece of data. Error handling in that context boils down to reporting the error (or not) and retrying the failed input data later (or not). Read the rest of this entry »
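The two independent decisions described above (report or not, retry or not) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the Storm or Trident API, and every name in it (`ErrorPolicySketch`, `Policy`, `process`, `classify`, the in-memory retry queue) is hypothetical rather than taken from the post.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

public class ErrorPolicySketch {

    /** Hypothetical policy: reporting and retrying are two independent choices. */
    enum Policy {
        IGNORE(false, false),
        REPORT_ONLY(true, false),
        RETRY_ONLY(false, true),
        REPORT_AND_RETRY(true, true);

        final boolean report;
        final boolean retry;

        Policy(boolean report, boolean retry) {
            this.report = report;
            this.retry = retry;
        }
    }

    // In-memory stand-ins for an error log and a replay mechanism (assumptions).
    final List<String> processed = new ArrayList<>();
    final List<String> reported = new ArrayList<>();
    final Queue<String> retryQueue = new ArrayDeque<>();

    /** Processes one input; whatever happens, returns normally so the stream moves on. */
    void handle(String input, Function<Exception, Policy> classifier) {
        try {
            processed.add(process(input));
        } catch (Exception e) {
            Policy policy = classifier.apply(e);
            if (policy.report) {
                reported.add(input + ": " + e.getMessage());
            }
            if (policy.retry) {
                retryQueue.add(input);
            }
        }
    }

    /** Hypothetical processing step that fails on some inputs. */
    static String process(String input) {
        if (input.isEmpty()) {
            throw new IllegalArgumentException("empty input");
        }
        if (input.startsWith("!")) {
            throw new IllegalStateException("backend unavailable");
        }
        return input.toUpperCase();
    }

    /** Assumed rule: a transient failure is worth retrying, malformed data is not. */
    static Policy classify(Exception e) {
        return (e instanceof IllegalStateException)
                ? Policy.REPORT_AND_RETRY
                : Policy.REPORT_ONLY;
    }

    public static void main(String[] args) {
        ErrorPolicySketch sketch = new ErrorPolicySketch();
        for (String input : Arrays.asList("a", "", "!b", "c")) {
            sketch.handle(input, ErrorPolicySketch::classify);
        }
        System.out.println("processed: " + sketch.processed);
        System.out.println("reported:  " + sketch.reported);
        System.out.println("to retry:  " + sketch.retryQueue);
    }
}
```

Keeping the policy as a function of the exception means the processing code itself never decides what happens to a failed input; it only signals the failure.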

Written by Svend

February 5, 2014 at 6:11 pm

How to compile Storm 0.8.2 on Mac OS X

with 3 comments

Here is a set of instructions to build and package from source either storm-0.8.2.jar or the complete storm-0.8.2.zip (with all dependencies). I assume packaging later versions will be similar; just be careful about dependency versions.
Read the rest of this entry »

Written by Svend

September 4, 2013 at 4:43 pm

Posted in Uncategorized

Scalable real time state update with Storm groupBy / persistentAggregate / IBackingMap

with 24 comments

In this post, I illustrate how to maintain in a database the current state of a real-time, event-driven process in a scalable and lock-free manner using the Storm framework.

Storm is an event-based data processing engine. Its model relies on basic primitives such as event transformation, filtering, aggregation… that we assemble into topologies. The execution of a topology is typically distributed over several nodes, and a Storm cluster can also execute several instances of a given topology in parallel. At design time, it is thus important to keep in mind which Storm primitives execute with partition scope, i.e. at the level of one cluster node, and which ones are cluster-wide Read the rest of this entry »

Written by Svend

July 30, 2013 at 1:01 am

Posted in Uncategorized

Introduction to clean javascript design

with 3 comments

Despite all the efforts spent to replace it with something more decent (e.g. Flex, Dart, Silverlight, …), JavaScript is still the language of choice for browser-side scripting. And given the huge spotlight that HTML5 is shining on the browser, it is once again becoming an extremely popular language.

JavaScript is powerful and comes with a rich ecosystem. Its major problem, though, is a syntax so permissive that care is required to avoid ending up with unmaintainable spaghetti code.

The purpose of this post is to present basic design tips that help keep things clean and organized. I hope it will be useful for JavaScript developers struggling with code design or experienced designers struggling with JavaScript… Read the rest of this entry »

Written by Svend

January 4, 2012 at 10:04 pm

What I love and hate about Play 1

with 5 comments

Overview

Play 2 was announced a few weeks ago.

I am currently working on a project based on version 1 of this framework. As I am considering a migration, I decided to write two blog posts about this experience. The text below contains my personal thoughts on Play 1; at a later stage, I'll try to find some time to use it as a comparison point for my feedback on Play 2 and the migration process (if I confirm it should happen…).

Here are some major characteristics of Play (check their online docs for more details; that is not the point of this post):

  • Web framework for Java and/or Scala
  • Not based on the Servlet API, although a project can be packaged as a WAR
  • No server-side session
  • Java controller methods are public static void

Read the rest of this entry »

Written by Svend

December 2, 2011 at 12:11 pm

Posted in Uncategorized

Migrating From AWS Beanstalk to Cloud Foundry in (almost) zero steps :-)

I have been developing a small web app for a few months in my free time. I used to deploy it to the AWS Beanstalk platform. Today I was amazed by how easy it was to migrate it to Cloud Foundry :-).

My app is mostly based on JSF2/jQuery/Spring/Jackson, plus the AWS-specific APIs for storing data into AWS SDB and S3.

The migration to Cloud Foundry is the most straightforward migration experience I have seen so far. It boiled down to: Read the rest of this entry »

Written by Svend

September 8, 2011 at 8:50 pm

Transactional event-based NOSQL storage

with 5 comments

I present here a simple two-step architectural approach based on stored events as a workaround for the lack of full atomic transaction support in so-called “NOSQL” databases.

Being fairly new to NOSQL-based architectures, I have the annoying intuition that I am about to write nothing but a set of obvious statements. On the other hand, I have not yet read a detailed description of this anywhere, so hopefully it will be useful to some other developers as well.

[Edit: 1st Oct 2012]: see also the slides of Nathan Marz's recent presentation on event-based “Big Data architecture”: http://www.slideshare.net/nathanmarz/runaway-complexity-in-big-data-and-a-plan-to-stop-it, which has some similarities with what I present below and is very clear.

Problem statement

NOSQL databases do not offer atomic transactions over several update operations on different data entities(*). A simplistic explanation for this is that Read the rest of this entry »

Written by Svend

August 26, 2011 at 8:33 pm

Web Service security and the human dimension of SOA roadmap

In most non-trivial SOA landscapes, keeping track of the constantly evolving integrations among systems can be hard unless a clearly identified way to publish and find the appropriate pieces of information is in place. An overview of the IT landscape, defining what is or will be connected to what, is a prerequisite for being able to maintain the environment. Its absence typically leads to a feeling of “Spaghetti Oriented Environment” and a reluctance to start anything big.

This statement sounds obvious, but it is not always taken into account in practice. Some organizations either do not have such centralized control of integration in place or have stopped using it because it “just got in the way of anything”. At best, this means that the integration information is kept in the head of some key Read the rest of this entry »

Written by Svend

June 3, 2011 at 7:13 pm