Easy File Manipulation With Groovy Scripts

I’m in the process of collecting useful file manipulation scripts I have in Groovy and posting them here. Here are a few for starters:

This script removes lines from a given file that do not match a particular pattern (in this case, the “to_match” regular expression):

File f = new File("file.txt")
def lines = []

// Collect only the lines that match the pattern
f.eachLine { line ->
    if (line =~ /to_match/)
        lines << line
}

// Overwrite the file with the matching lines
PrintWriter writer = new PrintWriter(f)
lines.each { writer.println(it) }
writer.close()
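
The same filter can be sketched more compactly with readLines() and findAll, assuming the file is small enough to hold in memory. The temporary file and its contents below are made up for illustration.

```groovy
// Hypothetical sample file, created only so the sketch is self-contained
File sample = File.createTempFile("filter-demo", ".txt")
sample.text = "keep to_match here\ndrop this line\nto_match again\n"

// Read every line and keep only those that contain the pattern
def kept = sample.readLines().findAll { it =~ /to_match/ }

// Overwrite the file with the surviving lines, one per line
sample.text = kept.collect { it + System.lineSeparator() }.join()

def result = sample.readLines()
sample.delete()
println result
```

For very large files, the streaming eachLine approach is likely the safer choice, since it avoids reading the whole file at once.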

 

This script replaces the first instance of “to_match” with “replacement” in a file called “input.txt” and saves it:


def f = new File("input.txt")
def fileText = f.text
fileText = (fileText =~ /to_match/).replaceFirst("replacement")
f.write(fileText)

 

This one downloads a file from a URL, and saves it to disk, using the file name at the end of the URL:

def url = "http://www.techscreen.net/questions/details.txt"

def file = new FileOutputStream(url.tokenize("/")[-1])
def outputStream = new BufferedOutputStream(file)

outputStream << new URL(url).openStream()
outputStream.close()

 
This script splits an input file into two output files, putting the first 1000 lines into the first file, and the rest in the second:

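def baseDir = '.'   // assumption: the directory containing input.txt; adjust as needed
def lines = []
def lines2 = []
def splitSize = 1000
// eachLine passes a 1-based line number as the second closure argument
new File(baseDir, 'input.txt').eachLine { line, nb ->
    if (nb > splitSize)
        lines2 << line
    else
        lines << line
}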

new File(baseDir, 'output1.txt').withWriter('utf-8') { writer ->
  lines.each { it -> 
    writer.writeLine(it)
  }
}
new File(baseDir, 'output2.txt').withWriter('utf-8') { writer ->
  lines2.each { it ->
    writer.writeLine(it)
  }
}
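
A shorter variant of the split, sketched with take() and drop() on the list of lines. The temporary directory, the five-line input, and the split size of 3 are assumptions chosen only to keep the example self-contained.

```groovy
import java.nio.file.Files

// Hypothetical working directory and a small split size, for demonstration only
def baseDir = Files.createTempDirectory("split-demo").toFile()
def splitSize = 3
new File(baseDir, 'input.txt').text = (1..5).collect { "line ${it}" }.join('\n')

def all = new File(baseDir, 'input.txt').readLines()

// take(n) returns the first n elements; drop(n) returns everything after them
def first = all.take(splitSize)
def rest = all.drop(splitSize)

new File(baseDir, 'output1.txt').text = first.join('\n')
new File(baseDir, 'output2.txt').text = rest.join('\n')

println "${first.size()} lines in output1.txt, ${rest.size()} in output2.txt"
```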

 


More Groovy Language Coolness – Awesome XML Support

One of the great features of Groovy is its excellent XML support. Basically, with utility classes like XmlParser and XmlSlurper, you can transform to/from XML text/Groovy objects with almost no effort. (Conceptually, the type of thing you could accomplish with JAXB in Java, only easier.)

def text = '''
  <books>
    <book id="1">
      <title>Smartness for Dummies</title>
      <author>John A. Smith</author>
    </book>
    <book id="2">
      <title>Cooking with Gas</title>
      <author>J.A.K. Gladney</author>
    </book>
  </books>
'''

def books = new XmlSlurper().parseText(text)
assert books.book[0].title == 'Smartness for Dummies'

Now one might wonder, when would I use XmlSlurper vs. XmlParser?

The key differences between these XML utilities are:

  • XmlSlurper evaluates the parsed structure lazily, as GPath expressions are navigated. This is useful if you only need to access a small fraction of the nodes in the XML document
  • XmlSlurper produces GPathResult objects, whereas XmlParser produces Node objects
  • XmlParser can efficiently handle reading + updating of the XML structure at the same time, whereas XmlSlurper can’t.

The consensus seems to be: Use XmlSlurper for things like transforming one XML document to another, or when you only need to access a few nodes in a piece of XML text. Use XmlParser when you’re reading + writing the same piece of XML potentially at the same time (because XmlSlurper evaluates lazily, the tree has to be evaluated again after an update before the change is visible).
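
As a rough illustration of the read + update case, here is a minimal XmlParser sketch. The import targets Groovy 3 or later; on Groovy 2.x the import can be dropped, since groovy.util.XmlParser is imported by default. The replacement title is a made-up value.

```groovy
import groovy.xml.XmlParser

def xml = '''
<books>
  <book id="1"><title>Smartness for Dummies</title></book>
  <book id="2"><title>Cooking with Gas</title></book>
</books>
'''

def root = new XmlParser().parseText(xml)

// XmlParser returns groovy.util.Node objects; text() extracts the string content
assert root instanceof Node
assert root.book[0].title.text() == 'Smartness for Dummies'

// Update a node in place (hypothetical new title), then read it back right away
root.book[1].title[0].setValue('Cooking with Induction')
def titles = root.book.collect { it.title.text() }

println titles
```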

Of course, once the XML has been transformed into either a GPathResult or a Node, you can use XPath-like query expressions to get at particular nodes and values (via GPath, Groovy’s path expression language).

books.book[0].title is already one example of this GPath selection. One can also access attributes like so:

def firstId = books.book[0].@id
assert firstId == "1"
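
A few more GPath queries, sketched against the same sample data with XmlSlurper. The import targets Groovy 3 or later; on Groovy 2.x it can be dropped, since groovy.util.XmlSlurper is imported by default.

```groovy
import groovy.xml.XmlSlurper

def books = new XmlSlurper().parseText('''
  <books>
    <book id="1">
      <title>Smartness for Dummies</title>
      <author>John A. Smith</author>
    </book>
    <book id="2">
      <title>Cooking with Gas</title>
      <author>J.A.K. Gladney</author>
    </book>
  </books>
''')

// Select a single book by attribute value
def second = books.book.find { it.@id == '2' }
assert second.title.text() == 'Cooking with Gas'

// Gather the text of every title element
def titles = books.book.title.collect { it.text() }

// '**' is shorthand for a depth-first traversal of the whole tree
def authorCount = books.'**'.findAll { it.name() == 'author' }.size()

println "${titles} / ${authorCount} authors"
```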

To go the other direction (i.e., to generate an XML string from an object tree), one common approach is to use a MarkupBuilder instance wrapping a Writer (a StringWriter here, so the result can be read back as a String), like so:

import groovy.xml.MarkupBuilder

def writer = new StringWriter()
def xmlMarkup = new MarkupBuilder(writer)

...
// add XML structures to xmlMarkup
...

// the markup is written to the wrapped writer, not held by the builder itself
def xmlString = writer.toString()
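
To make the builder step concrete, here is a minimal sketch that regenerates one of the sample books. It uses a StringWriter so the generated markup can be read back as a String; the element and attribute names simply mirror the earlier example.

```groovy
import groovy.xml.MarkupBuilder

def writer = new StringWriter()
def xml = new MarkupBuilder(writer)

// Each method call on the builder becomes an element; named arguments become attributes
xml.books {
    book(id: '1') {
        title('Smartness for Dummies')
        author('John A. Smith')
    }
}

def xmlString = writer.toString()
println xmlString
```

Note that MarkupBuilder pretty-prints its output and, by default, uses single quotes around attribute values.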

Tech Interview Screening with TechScreen.net

Tech Interview Screening – Why So Low Tech?

I’ve been doing software development for almost 20 years now, and the number of times I encountered employers with a very iffy, ad hoc interview screening process is surprisingly high. Typically, it would consist of some technical manager throwing together a dozen or so Java questions into a Word doc, asking a candidate to answer them and send them the doc, and then storing the doc in some folder somewhere, or even on the desktop of some seemingly random, barely used machine. Last year I conceived of a web application that could help improve that process, and that’s when TechScreen.net was born.

Minimum Viable Product

I figured the minimal feature set of TechScreen.net needed to be:

  • Ability to quickly put together a screening test
  • A good library of screening questions on a variety of topics
  • Ability to send the test to the candidate and have them take it online
  • Test scoring
  • Ability to share test results and keep track of them in some sort of dashboard

Additionally, it would be nice to support multiple question types, automatic scoring, management of job listings and company listings, etc.

Technology Stack

I ended up settling on Ruby on Rails, with Bootstrap and jQuery on the front end and PostgreSQL on the back end, along with the Cloud9 IDE and Heroku for rapid prototyping and deployment. They all play well together!

Total development time was about 2 months, and that was mostly part-time. I also hired a great QA person to test (and re-test) all the basic workflows, as well as a technical writer to do some of the writing for the question + answer technical library.

Screenshots

It was pretty quick to get some basic Bootstrap-y CRUD screens together. Here are a few of them:

Splash / Landing page:

[Screenshot: splash / landing page]

Soon-to-be Configurable Dashboard:

[Screenshot: dashboard]

The Screening Question Search (added an auto-complete field for question category):

[Screenshot: screening question search]

Metrics:

[Screenshot: metrics]

I will soon post a more detailed account of this whole development process.

Groovy Language Tidbits

Some Useful Groovy Language Features

In this post, I’m focusing on Groovy language features that would be significantly less elegant to write in Java.

Map creation is simple and clean, and is closer to the hash syntax you’d see in languages like JavaScript or Ruby. It looks like this, and is implemented via a LinkedHashMap (so the ordering of keys is maintained from insertion order):

def map = ["key1" : "value1", "key2" : 100, "key3" : [1, 2, 3]]
def emptyMap = [:]
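
A short sketch of how such a map can be read and extended. The extra key added here (key4) is made up for illustration.

```groovy
def map = ["key1" : "value1", "key2" : 100, "key3" : [1, 2, 3]]

// Values can be read with property-style or subscript-style access
assert map.key1 == "value1"
assert map["key2"] == 100

// Assigning to a new key adds an entry (key4 is a hypothetical key)
map.key4 = true

// The literal creates a LinkedHashMap, so keys keep their insertion order
assert map instanceof LinkedHashMap
def keys = map.keySet() as List
println keys
```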

Multiple assignment of variables on a single line looks like this:

def (v1, v2, v3) = ["v1Value", 22.2, true]

def (v4, v5) = ["first"]
assert v5 == null

Closures, which are basically pointers to code blocks potentially with references bound to scope outside of the block, look like:

def a = "some text"
def f = { println "A is accessible: ${a}" }

A single parameter to a closure does not need to be specified explicitly, as it is available via the variable “it”:

def squared = { return it * it }
println squared(2);

4

Closures that take more than one parameter declare them explicitly before the arrow, like so:

def test = { val, msg -> println (msg + ":" + val) }
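
A brief sketch of calling closures and passing them around. This variant of test returns the string instead of printing it, purely so the result can be checked.

```groovy
// Variant of the closure above that returns the string rather than printing it
def test = { val, msg -> msg + ":" + val }

// A closure can be invoked like a method, or explicitly via call()
assert test(42, "answer") == "answer:42"
assert test.call(1, "n") == "n:1"

// Closures are commonly passed to methods such as collect
def doubled = [1, 2, 3].collect { it * 2 }

// A closure can read and modify variables from the enclosing scope
def total = 0
[1, 2, 3].each { total += it }

println "${doubled} ${total}"
```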

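Getter and setter methods are implicit: a field declared without an access modifier becomes a property, and Groovy generates its getter and setter, so they do not need to be written.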
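
A minimal sketch of what that looks like in practice. The Person class and its values are hypothetical, used only for illustration.

```groovy
// Hypothetical class: fields without an access modifier are treated as properties
class Person {
    String name
    int age
}

// The default constructor accepts named arguments for the properties
def p = new Person(name: 'Ada', age: 36)

// Property assignment goes through the generated setter
p.name = 'Grace'

// The generated getter can be called explicitly, or implicitly via property access
assert p.getName() == 'Grace'
assert p.age == 36

println "${p.name} is ${p.age}"
```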

Ranges can be used in for loops like so:

for (i in 1..20) {
    println ("i is ${i}")
}

There are many useful list functions, which very easily allow for sorting, replacing values, splicing, reversing, etc:

def list = [1, 2, 3, 4]
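assert list.reverse() == [4, 3, 2, 1]   // reverse() returns a new list
assert list[1..3] == [2, 3, 4]          // a range index returns a sublist
assert list[3..1] == [4, 3, 2]          // a reversed range returns the sublist reversed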
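
A few of the other operations mentioned above (sorting, replacing values, splicing), sketched on a small made-up list:

```groovy
def list = [3, 1, 4, 2]

// sort(false) returns a sorted copy and leaves the original list untouched
def sorted = list.sort(false)

// collect builds a new list, here replacing the value 4 with 40
def replaced = list.collect { it == 4 ? 40 : it }

// Ranges can be combined to splice out the element at index 2
def spliced = list[0..1] + list[3..-1]

// Negative indexes count from the end of the list
def last = list[-1]

println "${sorted} ${replaced} ${spliced} ${last}"
```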

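String appending can be done easily via the overloaded left shift operator (<<), like so:

def s = "" << "first append" << "second append"
assert s.toString() == "first appendsecond append"

Note that a String itself is immutable: the first << returns a new StringBuffer, and subsequent appends go to that buffer, so s here is a StringBuffer rather than a String.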

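“Safe navigation” is pretty useful as well, to avoid those pesky NullPointerExceptions. The safe navigation operator (?.) does the null checks for you in a chain of property or method calls: if anything to its left is null, the whole expression evaluates to null instead of throwing, as in the following example: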

def val = possiblyNullObj?.possiblyNullProp?.value
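
A self-contained sketch of the behavior. The Customer and Address classes are hypothetical stand-ins for the objects in the line above.

```groovy
// Hypothetical classes used only to demonstrate the operator
class Address { String city }
class Customer { Address address }

def withCity = new Customer(address: new Address(city: 'Oslo'))
def noAddress = new Customer()
Customer missing = null

// Each ?. short-circuits to null if the value to its left is null
def city1 = withCity?.address?.city
def city2 = noAddress?.address?.city
def city3 = missing?.address?.city

// The Elvis operator (?:) pairs well with it to supply a default
def city4 = missing?.address?.city ?: 'unknown'

println "${city1} ${city2} ${city3} ${city4}"
```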