Keep Walking!

Announcing AM library 1.0.0

I am happy to announce the launch of a new library called AM, or Assert-Mocks, for unit-testing Java servlet API code. The library aims to make it much easier and faster to test servlet API code, including but not limited to servlets, servlet filters and tag libraries. It helps remove the need to use the Mockito framework for setting up expectations.


Read the full post here.

Releasing jerry-core 3.0.0

I am happy to announce that, exactly a year after the last release, I have a new major release of the jerry-core library: version 3.0.0. You can get the library from Maven Central with the following coordinates:

<dependency>
    <groupId>com.sangupta</groupId>
    <artifactId>jerry-core</artifactId>
    <version>3.0.0</version>
</dependency>

Read the full post here.

Fastest way to merge multiple integer sets

Problem

You are given multiple integer sets and you need to merge them into a single set in the fastest possible way.

Solution

The latest version of this article is always available on GitHub in the sangupta/ps repo.

Suppose we have three integer sets, each stored as an array, and we need to merge them. The fastest solution runs in O(n1 + n2 + n3) time, where n1, n2 and n3 are the lengths of the three sets respectively. The solution lies in using a bit-vector (also called a bit-set or a bit-array) to represent elements by their index and then iterating over it.

  • Construct a bit-array
  • Iterate over the first array and for each integer set the corresponding bit to true
  • Repeat the above process for remaining arrays
  • Any duplicate element will result in setting true an already set bit
  • The resultant bit-array is the merge of all the arrays
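
The steps above can be sketched with the JDK's java.util.BitSet. This is only a minimal illustration and not the code from the full post: the class name BitSetMerge and the method merge are hypothetical, and the sketch assumes every value is non-negative.

```java
import java.util.Arrays;
import java.util.BitSet;

public class BitSetMerge {

    // Merges any number of int arrays into one sorted, duplicate-free array.
    // Assumption: all values are non-negative, since BitSet indices cannot be negative.
    public static int[] merge(int[]... arrays) {
        BitSet bits = new BitSet();
        for (int[] array : arrays) {
            for (int value : array) {
                bits.set(value); // setting an already-set bit changes nothing
            }
        }
        // stream() yields the indices of the set bits in ascending order
        return bits.stream().toArray();
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[] {5, 1, 9}, new int[] {1, 3}, new int[] {9, 2});
        System.out.println(Arrays.toString(merged)); // prints [1, 2, 3, 5, 9]
    }
}
```

Note that reading the merged result back walks the bit-array, so that final step depends on the largest value stored as well as on the number of elements.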

Read the full post here.

Fastest sorting of integers

Problem

We are given a huge array of bounded, non-duplicate, positive integers that need to be sorted. What’s the fastest way to do it?

Solution

The latest version of this article is always available on GitHub in the sangupta/ps repo.

Most of the interview candidates I have discussed this problem with quickly answer with MergeSort or another divide-and-conquer sort. The cost of such sorting is O(N * log(N)), which in this case is not the fastest sorting time.

The fastest time to sort an integer array is O(N). Let me explain how.

  • Construct a boolean array of length N, where N covers the bound on the values (the largest possible integer plus one)
  • For every integer n in array, mark the boolean at index n in array as true
  • Once the array iteration is complete, just iterate over the boolean array again to print all the indices that have the value set as true
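
A minimal sketch of these steps follows. The class name BoundedIntSort, the method sort and the maxValue parameter are hypothetical names chosen for illustration; the sketch assumes the input has no duplicates and that every value lies in the range 0 to maxValue.

```java
import java.util.Arrays;

public class BoundedIntSort {

    // Sorts distinct integers in the range [0, maxValue].
    // Assumption: no duplicates, as stated in the problem.
    public static int[] sort(int[] input, int maxValue) {
        boolean[] seen = new boolean[maxValue + 1];
        for (int n : input) {
            seen[n] = true; // mark the value as present
        }
        int[] sorted = new int[input.length];
        int next = 0;
        // walking the boolean array in index order yields the values sorted
        for (int i = 0; i <= maxValue; i++) {
            if (seen[i]) {
                sorted[next++] = i;
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sort(new int[] {7, 3, 9, 1}, 10))); // prints [1, 3, 7, 9]
    }
}
```

The second loop runs over the whole boolean array, so the approach pays off when the bound on the values is of the same order as the number of elements.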

Read the full post here.

Finding degrees of separation in a social graph

Problem

Assume two users on a social network start with zero connections and add connections at a rate of 100 per minute. Explain how you would design a solution to find out the degrees of separation between two network profiles, given a known function getConnections(profile.id) that returns data from over the network. What differences would you make for a real-time but less consistent/optimal result versus a slower but more accurate result?

Solution

The latest version of this article is always available on GitHub in the sangupta/ps repo.

I can think of many different ways to compute this. I will put them out below.

Approach 2 seems the best considering the storage cost and traversal costs. Approach 3 can be near real-time if we can take care of the storage, and add optimal number of workers for fan-out.


Read the full post here.

Introducing CodeFix

I am happy to release the very first version of a command-line development tool that I have been using for years for my own work: CodeFix. The tool performs minor code refactoring tasks via a command-line interface, such as adding/updating copyright headers, removing trailing spaces in files, fixing file encoding, adding an empty line-ending to text files and more…

Some quick and dirty examples on what could be achieved are:

# add/update copyright
$ java -jar codefix.jar addcopy -r -f COPYRIGHT.txt -p *.java c:\code

# update line endings
$ java -jar codefix.jar ending -r -p *.txt c:\docs

# remove trailing white spaces
$ java -jar codefix.jar rtrim -r -p *.java c:\code

# change file encodings
$ java -jar codefix.jar encoding -r -p *.txt -s ISO-8859-1 -t UTF-8 c:\textdocs

You may download the binary or may take a dive into the source code.

Hope this helps.


Read the full post here.

Tabular data in a Java Console application

Displaying tabular data is a very common use-case, but doing so in a console application becomes a tedious task. Why? First, you have to take care of formatting the entire data. Second, you need to make sure that the data does not spill over the boundary of the cell. Third, such code takes time and precision, and spending time on this boilerplate is a total waste.

Thus, I created ConsoleTable, a new helper class in the jerry-core framework. This class makes it easy to create such console output.

Features

  • Supports automatic sizing of columns in full-width layout
  • Supports multi-line layout where column text is broken at word boundaries to spill over multiple lines
  • Supports export to CSV, XML and JSON formats

Examples

For example, to create a layout like:

 | ------------ | --------- | --------------- | 
 | Stock Ticker | Company   | Any one product | 
 | ------------ | --------- | --------------- | 
 | AAPL         | Apple     | iPhone          | 
 | GOOG         | Google    | GMail           | 
 | MSFT         | Microsoft | Windows         | 
 | ------------ | --------- | --------------- | 

you just need to code something like:

ConsoleTable table = new ConsoleTable();

table.addHeaderRow("Stock Ticker", "Company", "Any one product");

table.addRow("AAPL", "Apple", "iPhone");
table.addRow("GOOG", "Google", "GMail");
table.addRow("MSFT", "Microsoft", "Windows");

table.write(System.out);

The code itself takes care of proper alignment and spacing of each element. And that is not all.

The ConsoleTable also supports multi-line output by breaking the sentence at word boundaries to make sure that the text fits in the cell.

For example:

ConsoleTable table = new ConsoleTable(ConsoleTableLayout.MULTI_LINE);
		
table.addHeaderRow("Stock Ticker", "Company", "Products");
table.addRow("AAPL", "Apple", "iPhone, iPad, iPod, Mac, OSX, Mac Pro");
table.addRow("GOOG", "Google", "GMail, Blogger, AdSense, Analytics, Search");
table.addRow("MSFT", "Microsoft", "Windows, Office, Remote Desktop");

table.setColumnSize(2, 20);

table.write(System.out);

produces output as:

 | ------------ | --------- | -------------------- | 
 | Stock Ticker | Company   | Products             | 
 | ------------ | --------- | -------------------- | 
 | AAPL         | Apple     | iPhone, iPad, iPod,  | 
 |              |           | Mac, OSX, Mac Pro    | 
 | GOOG         | Google    | GMail, Blogger,      | 
 |              |           | AdSense,             | 
 |              |           | Analytics, Search    | 
 | MSFT         | Microsoft | Windows, Office,     | 
 |              |           | Remote Desktop       | 
 | ------------ | --------- | -------------------- | 

The table also supports export to various formats:

ConsoleTable table = getMyTable(); // some table data

// create a CSV
ConsoleTableWriter.writeCsv(table, System.out);

// create JSON output
ConsoleTableWriter.writeJson(table, System.out);

// create an XML
// wrap everything inside a <data> ... </data> tag
// each row is wrapped inside a <row> ... </row> tag
ConsoleTableWriter.writeXml(table, System.out, "data", "row"); 

Hope this helps.


Read the full post here.

Various Bit-Array Implementation in Java

Bit-arrays are a very useful and quite fast data structure with a variety of uses. One of the most important ones in the context of Big Data is their use in Bloom filters. To store a Bloom filter we need a very fast bit-array, and one that can be persisted as well. Java has only an in-memory bit-array implementation. I needed a file-persisted one to be used in conjunction with the bloomfilter project. Thus, I added the following implementations to the jerry-core project.

  • FastBitArray - an implementation that is much faster than the standard Java implementation
  • FileBackedBitArray - backed by file persistence, and all operations are flushed back automatically
  • MMapFileBackedBitArray - one that is file-persisted, but uses memory-mapped files for much faster performance
  • JavaBitSetArray - one that uses internal Java implementation underneath

Usage is pretty simple:

final int maxElements = 1000 * 1000; // 1 million
BitArray ba = new FileBackedBitArray(new File("my-bit-array.ba"), maxElements);

boolean updated = ba.setBit(13); // returns true
updated = ba.setBit(13); // returns false
ba.clearBit(13);
updated = ba.setBit(13); // returns true

boolean isSet = ba.getBit(13); // returns true


// using the memory-mapped version is similar
ba = new MMapFileBackedBitArray(new File("my-bit-array"), maxElements);

// all other operations are the same

I have used MMapFileBackedBitArray in production for the last few years, and it has been quite useful and fast.
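
For readers new to the data structure, here is a rough in-memory sketch of how a bit-array with the same setBit/getBit/clearBit semantics might be built on top of a long[]. It is not the jerry-core implementation; the class name SimpleBitArray is hypothetical, and bounds checking is left out for brevity.

```java
public class SimpleBitArray {

    // each long stores 64 bits
    private final long[] words;

    public SimpleBitArray(int maxElements) {
        // round up so that every index in [0, maxElements) has a slot
        this.words = new long[(maxElements + 63) >>> 6];
    }

    // Sets the bit; returns true only if the bit was previously clear.
    public boolean setBit(int index) {
        int word = index >>> 6;           // which long holds the bit
        long mask = 1L << (index & 63);   // position of the bit inside that long
        boolean changed = (words[word] & mask) == 0;
        words[word] |= mask;
        return changed;
    }

    public boolean getBit(int index) {
        return (words[index >>> 6] & (1L << (index & 63))) != 0;
    }

    public void clearBit(int index) {
        words[index >>> 6] &= ~(1L << (index & 63));
    }

    public static void main(String[] args) {
        SimpleBitArray ba = new SimpleBitArray(1000 * 1000);
        System.out.println(ba.setBit(13)); // prints true
        System.out.println(ba.setBit(13)); // prints false
        ba.clearBit(13);
        System.out.println(ba.getBit(13)); // prints false
    }
}
```

A file-backed variant would keep the same index arithmetic and replace the long[] with reads and writes against a file or a memory-mapped buffer.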

Hope this helps.


Read the full post here.

Iterating over Strings inside a String object

When reading file contents, or working with REST requests, we often want to read a String object line-by-line, that is, read the lines within a single String object. Java does not offer a simple solution for this: either you convert the String object to a byte[] and then use a ByteArrayInputStream, or you use a StringReader and then push this into another line-by-line reader.

For this, I wrote a simple utility class, StringLineIterator (available in the jerry-core project), which simplifies reading lines to the following code snippet:

String contents = "..."; // some contents that contains new-lines, and form-feed characters

StringLineIterator iterator = new StringLineIterator(contents);
while(iterator.hasNext()) {
  String line = iterator.next();

  // do something with this extracted line
}

This helps us read sub-string lines from the contents and reduces boilerplate code. Note that this implementation would use extra memory to the extent of each line, as it creates a new String object for each line that is read via the iterator.
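
For comparison, the plain-JDK route of wrapping a StringReader in a line-by-line reader looks roughly like the sketch below. The class name JdkLineReading and the method readLines are hypothetical and only serve to show the boilerplate that the iterator hides.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class JdkLineReading {

    // Splits the contents into lines using only JDK classes.
    public static List<String> readLines(String contents) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(contents))) {
            String line;
            // readLine() treats \n, \r and \r\n as line terminators
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // not expected for an in-memory reader, but the API declares it
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines("first\nsecond\r\nthird")); // prints [first, second, third]
    }
}
```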

Hope this helps.


Read the full post here.

Jetty + Spring 4 + Jersey 2 Integration

Today most enterprise systems are built as microservices. The advantages are well explained by the 12factor principles. In Java, this translates to using Jetty as the embedded HTTP server, running Jersey, RESTEasy or the like as the JAX-RS based REST framework.

Integration between Spring 3 and Jersey 2 is well documented and works great. With the coming of Spring 4, the integration still works if you are building a web application using Tomcat or JBoss or any other application server. However, for standalone Java applications this is broken.

Last night, I went poking my nose inside the jersey-spring3 module to dig out the reason, and finally found both the cause and a fix.

Quick Fix

For the impatient, a very simple fix is to create a new WebApplicationContext, with the parent set to the ApplicationContext you created manually, and then set it in Jetty's ServletContext as:

ServletContextHandler context = new ServletContextHandler(ServletContextHandler.NO_SESSIONS);
context.setContextPath("/");

AnnotationConfigWebApplicationContext webContext = new AnnotationConfigWebApplicationContext();
webContext.setParent(rootContext);
webContext.refresh();
webContext.start();

context.setAttribute(WebApplicationContext.class.getName() + ".ROOT", webContext);

This will ensure that all your dependencies get wired in your web-services.

For why it happens, continue reading.

Why is it broken?

The class org.glassfish.jersey.server.spring.SpringComponentProvider is responsible for detecting the existence of a Spring context and wiring it within the Jersey code so that all your dependencies can be @Autowired or @Injected. Let’s take a look at the initialize method of the class:

@Override
public void initialize(ServiceLocator locator) {
	this.locator = locator;

	if (LOGGER.isLoggable(Level.FINE)) {
		LOGGER.fine(LocalizationMessages.CTX_LOOKUP_STARTED());
	}

	ServletContext sc = locator.getService(ServletContext.class);

	if (sc != null) {
		// servlet container
		ctx = WebApplicationContextUtils.getWebApplicationContext(sc);
	} else {
		// non-servlet container
		ctx = createSpringContext();
	}
	if (ctx == null) {
		LOGGER.severe(LocalizationMessages.CTX_LOOKUP_FAILED());
		return;
	}

	// more code omitted for brevity
}

As you can see above, if Jersey finds that a ServletContext is already present, which it will be when you are running a Jetty server, it only reads the ApplicationContext (ctx) via this line:

ctx = WebApplicationContextUtils.getWebApplicationContext(sc);

Only if it detects that no ServletContext is present does it create a new ApplicationContext instance, via the call:

ctx = createSpringContext();

Now the call to WebApplicationContextUtils.getWebApplicationContext(sc) translates to the following code (some constant references have been modified to make the code more understandable):

public static WebApplicationContext getWebApplicationContext(ServletContext sc, String attrName) {
	Assert.notNull(sc, "ServletContext must not be null");
	Object attr = sc.getAttribute(WebApplicationContext.class.getName() + ".ROOT");
	if (attr == null) {
		return null;
	}
	if (attr instanceof RuntimeException) {
		throw (RuntimeException) attr;
	}
	if (attr instanceof Error) {
		throw (Error) attr;
	}
	if (attr instanceof Exception) {
		throw new IllegalStateException((Exception) attr);
	}
	if (!(attr instanceof WebApplicationContext)) {
		throw new IllegalStateException("Context attribute is not of type WebApplicationContext: " + attr);
	}
	return (WebApplicationContext) attr;
}

As there is no WebApplicationContext.class.getName() + ".ROOT" attribute present in the ServletContext, Jersey fails to wire the dependencies.

Now, let’s take a look at the createSpringContext() method (again, some constants have been inlined):

private ApplicationContext createSpringContext() {
  ApplicationHandler applicationHandler = locator.getService(ApplicationHandler.class);
  ApplicationContext springContext = (ApplicationContext) applicationHandler.getConfiguration().getProperty("contextConfig");
  if (springContext == null) {
    String contextConfigLocation = (String) applicationHandler.getConfiguration().getProperty("contextConfigLocation");
    springContext = createXmlSpringConfiguration(contextConfigLocation);
  }

  return springContext;
}

Another way to fix this would be to add an ApplicationHandler class that sets the contextConfig property in its configuration, but with annotation support and classpath scanning, I don’t see why someone would want to do that.

I will raise a pull request for this in the Jersey code sometime soon.

Hope this helps.


Read the full post here.

JOpenSurf now available in Maven Central

Some time back I happened to work with the JOpenSurf project for some prototyping work. The project is also a dependency of the LIRE project, the popular Lucene-based image retrieval library. The only drawback of the JOpenSurf project was its non-availability in Maven Central.

I took time and am pleased to announce that the library is now available in Maven Central. Feel free to include it in your code as:

<dependency>
    <groupId>com.sangupta</groupId>
    <artifactId>jopensurf</artifactId>
    <version>1.0.0</version>
</dependency>

  • JOpenSurf Home Page: https://code.google.com/p/jopensurf/
  • LIRE Home Page: https://code.google.com/p/lire/

Hope this helps.


Read the full post here.

Windows Shell and Wildcard Command-line argument resolution

Problem: Windows Shell parses any wildcard path arguments that you supply over the command line, before passing the arguments to the actual program that has been invoked.


Read the full post here.

Introducing Java based HTTP server

Introducing a new HTTP server that can be used by Java developers for simple development needs.

Read the full post here.

Building a REST Server using Jetty

Today, most enterprise applications are distributed. This requires that the various components communicate with each other, either using a message queue or via REST services. I still see people building Java web applications that are eventually deployed in a container like Tomcat. This is actually over-engineering.


Read the full post here.

Songs Downloader

Today brought the first glimpse of monsoon showers in Delhi. Today also marks 11 months since I last blogged. It’s not that I haven’t been coding, but my efforts were concentrated on shipping many applications, small and large, both at work and personally. As the weather changes today, I too change my course and come back to the blogging world.


Read the full post here.

Change Eclipse Juno UI to match Eclipse Indigo

Some Windows users, like me, who have switched to Eclipse Juno (Eclipse 4.2) might not have liked the theme that ships as default. Especially the toolbar backgrounds, the code editing theme, the absence of a left border alongside line numbers, and the overly flashy UI containers.


Read the full post here.

Pepmint now in Maven Central

I am pleased to announce the immediate availability of Pepmint, a Java wrapper over the Python Pygments library in Maven Central. Use the following to include it as a dependency,


Read the full post here.

Saving HTML5 canvas to Java server

If you are working with the HTML5 Canvas element and want to save the generated PNG file back to the server via Java, it is not as easy as saving the byte array. The reason is that the generated PNG data is URL encoded and is prefixed with the dataURI format headers.
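
As a rough illustration of the server side, the sketch below strips the data URI header and decodes the payload. It assumes the client sends the string produced by canvas.toDataURL(), which has the form data:image/png;base64,... and carries a Base64 payload, and that any URL-decoding by the transport has already happened. The class name DataUriDecoder is hypothetical and this is not the code from the full post.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class DataUriDecoder {

    // Extracts the raw bytes from a Base64 data URI.
    // Assumption: the header ends with ";base64", as in canvas.toDataURL() output.
    public static byte[] decode(String dataUri) {
        int comma = dataUri.indexOf(',');
        if (!dataUri.startsWith("data:") || comma < 0) {
            throw new IllegalArgumentException("Not a data URI");
        }
        String header = dataUri.substring(0, comma);
        if (!header.endsWith(";base64")) {
            throw new IllegalArgumentException("Only Base64 data URIs are handled here");
        }
        // everything after the first comma is the encoded payload
        return Base64.getDecoder().decode(dataUri.substring(comma + 1));
    }

    public static void main(String[] args) {
        byte[] bytes = decode("data:text/plain;base64,aGVsbG8=");
        System.out.println(new String(bytes, StandardCharsets.UTF_8)); // prints hello
    }
}
```

For a real canvas upload, the returned bytes would be the PNG file contents, ready to be written to disk.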


Read the full post here.

Trim down HTML content to desired text length

Given some HTML code, trim it down into valid HTML code that contains text of desired length.

Read the full post here.

Merge different SCM snapshots

This post is about MergeRepo, a small script that merges two different snapshots of the same repository from different SCMs into one.


Read the full post here.