Keep Walking!

I am happy to announce the launch of a new library called AM, or Assert-Mocks, for unit-testing Java servlet API code. The library aims to make it much easier and faster to test servlet API code, including but not limited to servlets, servlet filters and tag libraries. It removes the need to use the Mockito framework for setting up expectations.

Read the full post here.

I am happy to announce that, exactly a year after the last release, I have a new major release of the jerry-core library: version 3.0.0. You can get the library from Maven Central with the following coordinates:

``````
<dependency>
    <groupId>com.sangupta</groupId>
    <artifactId>jerry-core</artifactId>
    <version>3.0.0</version>
</dependency>
``````

Read the full post here.

Problem

You are given multiple integer sets and you need to merge them into a single set in the fastest possible way.

Solution

The latest version of this article is always available in the sangupta/ps repo on GitHub.

Suppose we have 3 arrays of integers, each representing a set, and we need to merge them. The fastest solution runs in `O(n1 + n2 + n3)` time, where `n1`, `n2`, `n3` are the lengths of the three arrays respectively. The solution lies in using a bit-vector (also called a bit-set or a bit-array), in which each element is represented by the bit at its index, and then iterating over the set bits.

• Construct a bit-array
• Iterate over the first array and, for each integer, set the corresponding bit to `true`
• Repeat the above process for the remaining arrays
• Any duplicate element simply sets a bit that is already `true`
• The resultant bit-array is the merge of all the arrays
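The steps above can be sketched with the JDK's own `java.util.BitSet`. This is only an illustration, not the code from the full post: the class and method names are made up here, and it assumes every value is non-negative so that it can serve as a bit index.

```java
import java.util.Arrays;
import java.util.BitSet;

class BitSetMerge {

    // Merges any number of int arrays into one sorted, duplicate-free array.
    // Assumption: all values are non-negative, because each value is used as a bit index.
    static int[] merge(int[]... arrays) {
        BitSet bits = new BitSet();
        for (int[] array : arrays) {
            for (int value : array) {
                bits.set(value); // setting an already-set bit changes nothing
            }
        }
        // stream() yields the indices of the set bits in ascending order
        return bits.stream().toArray();
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[] {5, 1, 9}, new int[] {1, 3}, new int[] {9, 2});
        System.out.println(Arrays.toString(merged)); // prints [1, 2, 3, 5, 9]
    }
}
```

Note that reading the result back out walks the bit-set, so that step depends on the largest value stored as well as on the number of elements.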

Read the full post here.

Problem

We are given a huge array of bounded, non-duplicate, positive integers that need to be sorted. What’s the fastest way to do it?

Solution

The latest version of this article is always available in the sangupta/ps repo on GitHub.

Most of the interview candidates that I have talked to about this problem quickly answer `MergeSort` or some other divide-and-conquer approach. That costs `O(N * Log(N))`, which in this case is not the fastest possible sorting time.

Because the integers are bounded and contain no duplicates, the array can be sorted in `O(N)` time, as long as the upper bound on the values is of the same order as `N`. Let me explain how.

• Construct a boolean array with one slot for every possible value, that is, of length `M + 1` where `M` is the upper bound
• For every integer `n` in the input array, mark the boolean at index `n` as `true`
• Once the input iteration is complete, iterate over the boolean array to print all the indices that have the value set to `true`
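A minimal sketch of these steps follows. The class name, the method name and the `maxValue` parameter are assumptions made for illustration; the sketch relies on the problem's guarantees that the values are non-negative, bounded by `maxValue` and free of duplicates.

```java
import java.util.Arrays;

class BoundedSort {

    // Sorts distinct, non-negative integers that are all <= maxValue.
    // Assumption: the input really has no duplicates; otherwise the result
    // array would be left with unused trailing slots.
    static int[] sort(int[] input, int maxValue) {
        boolean[] seen = new boolean[maxValue + 1];
        for (int n : input) {
            seen[n] = true; // mark the value as present
        }

        int[] sorted = new int[input.length];
        int next = 0;
        for (int i = 0; i <= maxValue; i++) {
            if (seen[i]) {
                sorted[next++] = i; // indices come out in ascending order
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] sorted = sort(new int[] {7, 3, 9, 1, 4}, 10);
        System.out.println(Arrays.toString(sorted)); // prints [1, 3, 4, 7, 9]
    }
}
```

The second loop runs over the whole value range, so the approach pays off only when the bound is not much larger than the number of elements.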

Read the full post here.

Problem

Assume two users on a social network start with zero (`0`) connections and add connections at a rate of 100 per minute. Explain how you would design a solution to find the degrees of separation between two network profiles, given a known function `getConnections(profile.id)` that returns data from over the network. What would you change for a real-time but less consistent/optimal result versus a slower but more accurate result?

Solution

The latest version of this article is always available in the sangupta/ps repo on GitHub.

I can think of many different ways to compute this. I will put them out below.

`Approach 2` seems the best considering the storage and traversal costs. `Approach 3` can be near real-time if we can take care of the storage and add an optimal number of workers for fan-out.
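For readers who want a concrete baseline for the problem itself, here is a plain breadth-first search over `getConnections`. It is not claimed to be any of the numbered approaches from the full post. Profile ids are assumed to be strings, the lookup is passed in as a function, and the `maxDepth` cut-off is a hypothetical knob for trading accuracy against response time.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class DegreesOfSeparation {

    // Returns the number of hops between two profiles, or -1 if none is found
    // within maxDepth levels. getConnections stands in for the network call.
    static int degrees(String from, String to,
                       Function<String, Collection<String>> getConnections,
                       int maxDepth) {
        if (from.equals(to)) {
            return 0;
        }
        Set<String> visited = new HashSet<>();
        visited.add(from);
        List<String> frontier = Collections.singletonList(from);

        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String id : frontier) {
                for (String neighbour : getConnections.apply(id)) {
                    if (neighbour.equals(to)) {
                        return depth; // first hit in BFS is the shortest path
                    }
                    if (visited.add(neighbour)) {
                        next.add(neighbour); // expand each profile only once
                    }
                }
            }
            frontier = next;
        }
        return -1;
    }
}
```

A smaller `maxDepth` answers faster but may miss longer paths, which loosely mirrors the real-time versus accurate trade-off raised in the problem.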

Read the full post here.

I am happy to release the very first version of a command-line development tool that I have been using for years for my own needs: CodeFix. The tool performs minor code refactoring tasks via a command-line interface, such as adding/updating copyright headers, removing trailing spaces in files, fixing file encoding, adding an empty line-ending to text files and more…

Some quick and dirty examples on what could be achieved are:

``````
# add/update copyright
$ java -jar codefix.jar addcopy -r -f COPYRIGHT.txt -p *.java c:\code

# update line endings
$ java -jar codefix.jar ending -r -p *.txt c:\docs

# remove trailing white spaces
$ java -jar codefix.jar rtrim -r -p *.java c:\code

# change file encodings
$ java -jar codefix.jar encoding -r -p *.txt -s ISO-8969 -t UTF-8 c:\textdocs
``````

You may download the binary or may take a dive into the source code.

Hope this helps.

Read the full post here.

Displaying tabular data is a very common use-case, but doing so in a console application becomes a tedious task. Why? First, you have to take care of formatting the entire data. Second, you need to make sure that the data does not spill over the boundary of the cell. Third, such code takes time and precision to write, and spending time on this boiler-plate is a total waste.

Thus, I created ConsoleTable, a new helper class in the Jerry-Core framework. This class is pretty useful for creating such console outputs.

Features

• Supports automatic sizing of columns in full-width layout
• Supports multi-line layout where column text is broken at word boundaries to spill over onto multiple lines
• Supports export to CSV, XML and JSON formats

Examples

For example, to create a layout like:

``````
| ------------ | --------- | --------------- |
| Stock Ticker | Company   | Any one product |
| ------------ | --------- | --------------- |
| APPL         | Apple     | iPhone          |
| GOOG         | Google    | GMail           |
| MSFT         | Microsoft | Windows         |
| ------------ | --------- | --------------- |
``````

you just need to code something like:

``````
ConsoleTable table = new ConsoleTable();

// the header row and the data rows are added here (not shown in this excerpt)

table.write(System.out);
``````

The code itself takes care of proper alignment and spacing of each element. This is not the end.

The `ConsoleTable` also supports multi-line output by breaking the sentence at word boundaries to make sure that the text fits in the cell.

For example:

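``````
ConsoleTable table = new ConsoleTable(ConsoleTableLayout.MULTI_LINE);

// the header row is added here (not shown in this excerpt)
table.addRow("APPL", "Apple", "iPhone, iPad, iPod, Mac, OSX, Mac Pro");
table.addRow("GOOG", "Google", "GMail, Blogger, AdSense, Analytics, Search");
table.addRow("MSFT", "Microsoft", "Windows, Office, Remote Desktop");

// limit the third column (index 2) to 20 characters
table.setColumnSize(2, 20);

table.write(System.out);
``````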

produces output as:

``````
| ------------ | --------- | -------------------- |
| Stock Ticker | Company   | Products             |
| ------------ | --------- | -------------------- |
| APPL         | Apple     | iPhone, iPad, iPod,  |
|              |           | Mac, OSX, Mac Pro    |
| GOOG         | Google    | GMail, Blogger,      |
|              |           | AdSense,             |
|              |           | Analytics, Search    |
| MSFT         | Microsoft | Windows, Office,     |
|              |           | Remote Desktop       |
| ------------ | --------- | -------------------- |
``````

The table also supports export to various formats:

``````
ConsoleTable table = getMyTable(); // some table data

// create a CSV
ConsoleTableWriter.writeCsv(table, System.out);

// create JSON output
ConsoleTableWriter.writeJson(table, System.out);

// create an XML
// wrap everything inside a <data> ... </data> tag
// each row is wrapped inside a <row> ... </row> tag
ConsoleTableWriter.writeXml(table, System.out, "data", "row");
``````

Hope this helps.

Read the full post here.

`Bit-Arrays` are a very useful and quite fast data structure with a variety of uses. One of the most important ones in the context of Big-Data is their use in Bloom filters. To store the bloom filter we need a very fast bit-array, and one that can be persisted as well. `Java` has only an in-memory bit-array implementation. I needed a file-persisted one to be used in conjunction with the bloomfilter project. Thus, I added the following implementations to the jerry-core project.

• `FastBitArray` - an implementation that is much faster than the standard Java implementation
• `FileBackedBitArray` - backed by file persistence, and all operations are flushed back automatically
• `MMapFileBackedBitArray` - one that is file-persisted, but uses memory-mapped files for much faster performance
• `JavaBitSetArray` - one that uses internal `Java` implementation underneath

Usage is pretty simple:

``````
final int maxElements = 1000 * 1000; // 1 million
BitArray ba = new FileBackedBitArray(new File("my-bit-array.ba"), maxElements);

boolean updated = ba.setBit(13); // returns true
updated = ba.setBit(13); // returns false
ba.clearBit(13);
updated = ba.setBit(13); // returns true

boolean isSet = ba.getBit(13); // returns true

// using the memory-mapped version is similar
ba = new MMapFileBackedBitArray(new File("my-bit-array"), maxElements);

// all other operations are the same
``````

I have used `MMapFileBackedBitArray` in production for the last few years and it has been quite useful and fast.
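To give a feel for how a memory-mapped bit-array can work, here is a minimal sketch built only on the JDK's `MappedByteBuffer`. It is not the jerry-core implementation: the `MappedBits` class and its layout (8 bits per byte, lowest bit first) are assumptions made purely for illustration, and only `setBit`/`getBit` semantics similar to the usage above are mirrored.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

class MappedBits implements AutoCloseable {

    private final RandomAccessFile file;
    private final MappedByteBuffer buffer;

    MappedBits(Path path, int maxBits) throws IOException {
        int bytes = (maxBits + 7) / 8; // round up to whole bytes
        this.file = new RandomAccessFile(path.toFile(), "rw");
        // mapping in READ_WRITE mode grows the file to the mapped size if needed
        this.buffer = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, bytes);
    }

    boolean getBit(int index) {
        return (buffer.get(index >> 3) & (1 << (index & 7))) != 0;
    }

    // Returns true only if the bit was clear before this call.
    boolean setBit(int index) {
        int pos = index >> 3;
        byte old = buffer.get(pos);
        byte updated = (byte) (old | (1 << (index & 7)));
        if (updated == old) {
            return false;
        }
        buffer.put(pos, updated);
        return true;
    }

    @Override
    public void close() throws IOException {
        buffer.force(); // flush pending changes to the file
        file.close();
    }
}
```

Because the operating system pages the file in and out, reads and writes touch memory directly and the data survives a restart, which is the general idea behind a file-persisted bit-array.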

Hope this helps.

Read the full post here.

When reading file contents, or working with REST requests, we often want to read a `String` object line-by-line, that is, read the lines within a single `String` object. `Java` does not offer a simple solution for this: either you convert the `String` object to a `byte[]` and then use `ByteArrayInputStream`, or you use a `StringReader` and then push this into another line-by-line reader.
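For reference, the plain-JDK route through a `StringReader` looks roughly like the sketch below. The `PlainJdkLines` class and its `readLines` method are names invented for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class PlainJdkLines {

    // Splits a String into lines using only JDK classes.
    static List<String> readLines(String contents) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(contents))) {
            String line;
            // readLine() treats \n, \r and \r\n as line terminators
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // reading from memory should not fail, but the API forces handling
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines("first\nsecond\r\nthird")); // prints [first, second, third]
    }
}
```

The two wrapped readers and the checked `IOException` are the kind of boiler-plate meant here.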

For this, I wrote a simple utility class, StringLineIterator (available in the jerry-core project), which reduces reading of lines to the following code snippet:

``````
String contents = "..."; // some contents that contains new-lines, and form-feed characters

StringLineIterator iterator = new StringLineIterator(contents);
while(iterator.hasNext()) {
    String line = iterator.next();

    // do something with this extracted line
}
``````

This helps us read sub-string lines from `contents` and reduces boiler-plate code. Note that this implementation uses extra memory to the extent of each line, as it creates a new `String` object for each line that is read via the `iterator`.

Hope this helps.

Read the full post here.

Today, most enterprise systems are built as microservices. The advantages are well explained by the 12factor principles. In Java, this translates to using `Jetty` as the embedded HTTP server, running `Jersey`, `RestEasy` or the like as the `JAX-RS` based REST framework.

Integration between `Spring 3` and `Jersey 2` is well documented and works great. With the coming of `Spring 4`, the integration still works if you are building a web application using `Tomcat`, `JBoss` or any other application server. However, for standalone `Java` applications it is broken.

Last night, I went poking my nose inside the `jersey-spring3` module to dig out the reason, and finally found both the cause and a fix.

Quick Fix

For the impatient, a very simple fix is to create a new `WebApplicationContext`, with the parent set to the `ApplicationContext` you created manually, and then set it in `Jetty`'s `ServletContext` as:

``````
ServletContextHandler context = new ServletContextHandler(ServletContextHandler.NO_SESSIONS);
context.setContextPath("/");

// rootContext is the ApplicationContext that you created manually
AnnotationConfigWebApplicationContext webContext = new AnnotationConfigWebApplicationContext();
webContext.setParent(rootContext);
webContext.refresh();
webContext.start();

context.setAttribute(WebApplicationContext.class.getName() + ".ROOT", webContext);
``````

This will ensure that all your dependencies get wired in your web-services.

For why it happens, continue reading.

Why is it broken?

The class `org.glassfish.jersey.server.spring.SpringComponentProvider` is responsible for detecting the existence of a `Spring` context and wiring it within the `Jersey` code so that all your dependencies can be `@Autowire`d or `@Inject`ed. Let’s take a look at the `initialize` method of the class:

``````
@Override
public void initialize(ServiceLocator locator) {
    this.locator = locator;

    if (LOGGER.isLoggable(Level.FINE)) {
        LOGGER.fine(LocalizationMessages.CTX_LOOKUP_STARTED());
    }

    ServletContext sc = locator.getService(ServletContext.class);

    if (sc != null) {
        // servlet container
        ctx = WebApplicationContextUtils.getWebApplicationContext(sc);
    } else {
        // non-servlet container
        ctx = createSpringContext();
    }
    if (ctx == null) {
        LOGGER.severe(LocalizationMessages.CTX_LOOKUP_FAILED());
        return;
    }

    // more code omitted for brevity
}
``````

As you can see above, if `Jersey` finds that a `ServletContext` is already present, which it will be when you are running a `Jetty` server, it only looks up the existing `ApplicationContext` (`ctx`) via the line:

``````
ctx = WebApplicationContextUtils.getWebApplicationContext(sc);
``````

Only if it detects that no `ServletContext` is present does it create a new `ApplicationContext` instance, via the call to:

``````
ctx = createSpringContext();
``````

Now the call to `WebApplicationContextUtils.getWebApplicationContext(sc)` translates to the following code (some constant references have been modified to make the code more understandable):

``````
public static WebApplicationContext getWebApplicationContext(ServletContext sc, String attrName) {
    Assert.notNull(sc, "ServletContext must not be null");
    Object attr = sc.getAttribute(WebApplicationContext.class.getName() + ".ROOT");
    if (attr == null) {
        return null;
    }
    if (attr instanceof RuntimeException) {
        throw (RuntimeException) attr;
    }
    if (attr instanceof Error) {
        throw (Error) attr;
    }
    if (attr instanceof Exception) {
        throw new IllegalStateException((Exception) attr);
    }
    if (!(attr instanceof WebApplicationContext)) {
        throw new IllegalStateException("Context attribute is not of type WebApplicationContext: " + attr);
    }
    return (WebApplicationContext) attr;
}
``````

As there is no `WebApplicationContext.class.getName() + ".ROOT"` attribute present in the `ServletContext`, `Jersey` fails to wire the dependencies.

Now, let’s take a look at the `createSpringContext()` method (again, some constants have been inlined):

``````
private ApplicationContext createSpringContext() {
    ApplicationHandler applicationHandler = locator.getService(ApplicationHandler.class);
    ApplicationContext springContext = (ApplicationContext) applicationHandler.getConfiguration().getProperty("contextConfig");
    if (springContext == null) {
        String contextConfigLocation = (String) applicationHandler.getConfiguration().getProperty("contextConfigLocation");
        springContext = createXmlSpringConfiguration(contextConfigLocation);
    }

    return springContext;
}
``````

Another way to fix this would be to add an `ApplicationHandler` class that sets the `contextConfig` property in its configuration, but with `annotation` support and `classpath` scanning, I don’t see why someone would want to do that.

I would raise a pull-request for the same in `Jersey` code sometime soon.

Hope this helps.

Read the full post here.

Some time back I happened to work with the `JOpenSurf` project for some prototyping work. The project is also a dependency of the `LIRE` project, the popular Lucene-based image retrieval library. The only drawback of the `JOpenSurf` project was its non-availability in Maven Central.

I took time and am pleased to announce that the library is now available in Maven Central. Feel free to include it in your code as:

``````
<dependency>
    <groupId>com.sangupta</groupId>
    <artifactId>jopensurf</artifactId>
    <version>1.0.0</version>
</dependency>
``````

Hope this helps.

Read the full post here.

`Problem:` `Windows Shell` parses any wildcard path arguments that you supply over the command line, before passing the arguments to the actual program that has been invoked.

Read the full post here.

Introducing a new HTTP server that can be used by Java developers for simple development needs.

Read the full post here.

Today, most enterprise applications are distributed. This also requires that the various components communicate with each other, either using a `Message Queue` or via `REST services`. I still see people building Java web applications that are eventually deployed in a container like `Tomcat`. This is actually over-engineering.

Read the full post here.

Today brought the first glimpse of monsoon showers in Delhi. Today also marks 11 months since I last blogged. It’s not that I haven’t been coding, but my efforts were concentrated on shipping many small and large applications, both at work and personally. As the weather changes today, I too change course and come back to the blogging world.

Read the full post here.

Some Windows users, like me, who have switched to Eclipse Juno (Eclipse 4.2) might not have liked the theme that ships as default: especially the toolbar backgrounds, the code editing theme, the absence of a left border alongside the line numbers, and the overly flashy UI containers.

Read the full post here.

I am pleased to announce the immediate availability of Pepmint, a Java wrapper over the Python Pygments library in Maven Central. Use the following to include it as a dependency,

Read the full post here.

If you are working with the HTML5 Canvas element and are looking to save the generated PNG file back on to the server via Java, it is not as easy as saving the byte array. The reason is that the generated PNG data is URL encoded and is prefixed with the dataURI format headers.
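A minimal sketch of the decoding step is shown below; it is not necessarily the code from the full post. It assumes the server receives a string of the form `data:image/png;base64,...`, as produced by the canvas `toDataURL()` call, and that any `+` characters turned into spaces by form decoding should be restored. The `CanvasPng` class and `decode` method are names invented for this example.

```java
import java.util.Base64;

class CanvasPng {

    // Extracts the raw image bytes from a base64 data URI.
    static byte[] decode(String dataUri) {
        int comma = dataUri.indexOf(',');
        if (!dataUri.startsWith("data:") || comma < 0) {
            throw new IllegalArgumentException("Not a data URI");
        }
        String header = dataUri.substring(0, comma); // e.g. data:image/png;base64
        if (!header.endsWith(";base64")) {
            throw new IllegalArgumentException("Only base64 payloads are handled");
        }
        // Assumption: spaces are '+' characters mangled by form decoding
        String payload = dataUri.substring(comma + 1).replace(' ', '+');
        return Base64.getDecoder().decode(payload);
    }
}
```

The returned bytes can then be written to disk as-is, for example with `java.nio.file.Files.write`.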

Read the full post here.

Given some HTML code, trim it down into valid HTML code that contains text of desired length.

Read the full post here.

This post is about `MergeRepo`, a small script that lets you merge two different snapshots of the same repository from different SCMs into one.

Read the full post here.