Contact

Ryan Cox


Jun
13th
2006
Tue
permalink

Vim7 Feature Roundup

I noticed Emacs 22 getting some press lately, so I figured it was time to spread a little vim7 love.

Here is a grab bag of features I noticed.

Omni Complete


Completion now comes to vim. Press <ctl-x><ctl-o> and vim does its best to complete the symbol. It’s not perfect. The implementations look like they are mostly regexp’ing around, with no knowledge of type systems. Based on the contents of $VIM/autoload it looks like there’s out-of-the-box support for: XML, HTML, CSS, C, JavaScript, PHP, Python, Ruby, and SQL. Sorry Java!

Quoted Text Select


New quoted text selection shortcuts are available in visual mode. With the cursor in a string, <ctl-q>i" will select all the text within double quotes. If you want the quotes too, <ctl-q>a" will do the trick. (<ctl-q> is the Windows stand-in for <ctl-v>; plain vi" and va" work as well.)

:vimgrep


Vim now comes out of the box with a grep implementation that works well with the existing error list (quickfix) functionality.

Recursively look for all the FIXME tags in the ruby source tree:

:vimgrep /FIXME/j C:/ruby/src/**/*.rb

View the hits and step through them

:cope

Tabs


Enable this feature and you’ll feel like you walked into a room with 3 switches that control the same light. It looks nice, but when cycling through buffers with :bnext you can see any buffer in any tab. It’s there if you want it, but I suspect this will largely go unused by most gvim people.

To open a new tab:

:tabnew

Spell Check


It’s 2006! I want a jetpack, not a spell checker!

Nothing too surprising here. Ships with English support. You can grab OpenOffice dictionaries and use :mkspell to convert them into something vim can cope with. Notice red squiggles under the spelling errors and blue under the capitalization errors.

:setlocal spell spelllang=fr

Undo Branches


This is an excellent feature with a confusing name. Think of it as restore to any point in time during an edit session. It will save your bacon when you make some edits, undo a few levels, make some more edits and realize you wanted something back from before the undo. New commands support restoring to a text state based on time:

  • g+ / g- This is like u / CTRL-R except it will walk across all changes over time. Good candidates for mapping.
  • :earlier 5m Restore to state 5 minutes ago
  • :later 1h Restore to state 1 hour later
  • :undolist Show leaves in undo / redo tree

:MkVimball

Package and distribute plugins and vim macros into vba files with this new command. When you open a vba file, vim recognizes it and prompts you to extract it. Uninstall would be a nice addition.

:GetLatestVimScripts

A very lightweight package manager is now included. Make sure you have a version of wget in your path. Then create a file $VIM/getlatest/GetLatestScripts.dat that looks like:

ScriptID SourceID Filename
--------------------------

A call to :GetLatestVimScripts will then attempt to update all plugins aware of this new functionality.

It will update the file with version and date information. Note that vim ships with more plugins than this, but GLVS, netrw and vimball are the only ones aware of this new feature.

ScriptID SourceID Filename
--------------------------
642 5228 :AutoInstall: GetLatestVimScripts.vim
1075 5814 :AutoInstall: netrw.vim
1502 5828 :AutoInstall: vimball.vim

Script Enhancements

Vim’s macro language continues to evolve. Things like this are now possible:

:let num = { 1: 'one', 2: 'two', 3: 'three' }
:echo num[2]

One step closer to emacs convergence?

Mar
22nd
2006
Wed
permalink

Music to code by!


The already great SomaFM just launched Space Station Soma.

Tune in, turn on, space out. Spaced-out ambient and mid-tempo electronica.

Dec
6th
2005
Tue
permalink

Aardvark’d: 12 Weeks With Geeks


I finished Joel’s Aardvark’d documentary last night. It captured the experiences of his 4 interns for 12 weeks. Pointing a camera at 4 guys typing for 3 months and editing that into something entertaining is non-trivial. I struggled to find the thesis, but there was plenty to like.

Some observations:

What it is:

A little bit of advertising. Without being cynical, let’s be realistic. I paid $20 for this thing. Now I’m blogging about it. Mission accomplished.

A look through the keyhole at this medium’s potential. Cameras and computers are cheap. There’s more of this type of thing to come and they will build on its strengths.

A tease for folks interested in the Y Combinator. Paul Graham’s work in Boston is briefly contrasted. Someone send a guy and a camera up there!

What it might be:

A HOWTO for companies starting internships. There are more than a few lessons to be learned: find them housing, buy them a TV, let them get some skin in the game the first day by putting together their workspace, give them the same hardware as the rest of the team, give them a spec, etc…

A depiction of the banality that comes along with software development. Long nights where the only interesting thing that happens is a cockroach crawling across the floor. Endless debates in the forums. The office manager / secretary provides us the view of someone looking in from the outside.

What it wasn’t:

A deep look into the journey of the intern; intern as Ubermensch; emerges from long dark 12 week tunnel having re-evaluated prior ideals.

An insightful portrayal of the starts and stops of software development; the highs and lows; peaks and valleys. While some demos crashed and some bugs were found, the contrast between the intoxicating sensation of shipping and sober realities along the way went largely unaddressed.

Jul
21st
2005
Thu
permalink

Book Review: Maven - A Developer’s Notebook


Maven: A Developer’s Notebook is a recently published book by Vincent Massol and Timothy O’Brien in O’Reilly’s Developer Notebook series. It fills a void in the market for a quality book on Maven. I had a chance to spend some time with Vincent while in Paris earlier in the summer; we talked about the book, his company (Pivolis), the French language and every manner of other topics. I’ve just now taken the time to read through the book and should have done so earlier. It’s excellent. Vincent’s passion and focus come across loud and clear.

According to O’Reilly’s site the Developer Notebook Series “is for early adopters of leading-edge technologies who want to get up to speed quickly on what they can do now with emerging technologies.” This book delivers on that promise, weighing in at under 200 pages, but providing the reader with more than enough to get them up and running quickly. It goes beyond the basics, covering customization via maven.xml, multiproject configuration, site customization, and plugin creation. A very practical set of labs is woven through the book as well, making the book suitable for use in a training class.

It is published at an interesting time with Maven2 in alpha. The book targets Maven 1.x, but Vincent and Tim don’t ignore Maven2. They make frequent references to features Maven2 will offer and help prepare the reader for migration when Maven2 is production ready.

The book comes along with a counterpart web site that offers up info on the book and additional content of interest to the reader: www.mavenbook.org

Some areas for improvement:

  • Help the reader understand if Maven isn’t for them. When is it overkill? When would something else be a better choice? What are the alternatives?
  • Additional coverage of XDoc; a poorly documented aspect of Maven that is generally learned by looking at other examples (read: cut and paste).

Three and a half out of four stars. Great job…

Jul
5th
2005
Tue
permalink

DAX Checked into Subversion

After a few months being off the air, the lights are back on. Relocation from the West Coast of the US to the East Coast, a month in France, selling a house, buying a house, and a new job have kept me a bit more occupied than I expected.

I’ve finally taken the time to check in DAX to the Subversion repository Lars set up for me. Point your svn clients over to: http://www.goshaky.com/svn/dax/

Some tiny changes:

  • Jar file naming is Maven compatible
  • @Path is now @Match
  • build.xml tidied up a bit

A few suggestions have been sent along; I may get to some of these…

  • Use annotations for the POJO -> XML mapping of simple Java types; mitigating the DOM4J dependency.
  • Support XPath2; perhaps via Saxon
Mar
3rd
2005
Thu
permalink

Introducing DAX: Declarative API for XML

Inspired by XSLTO, I’ve put together a bit of code to allow XSLT-like transformations from within Java. DAX glues together Java 5 annotations, Jaxen XPath and DOM4J to make possible the declarative style of processing shown below.

Getting Started

  • Install JDK 5 and a recent flavor of Ant
  • Download DAX here
  • Extract and do an “ant test”
  • “ant javadocs” for API guide
  • Look at the dax.examples package for sample code

More postings to follow. Stay tuned…


import java.util.ArrayList;
import java.util.List;

// DOM4J node type (DAX is built on DOM4J)
import org.dom4j.Node;

// Transformer and @Path are provided by DAX
public class BindingTransform extends Transformer {

	// typed list so the for-each loop in complete() compiles
	public List<RSSItem> items = new ArrayList<RSSItem>();
	private RSSItem currentItem;

	public BindingTransform() {
		// tell engine about anticipated namespace
		setNamespace("dc", "http://purl.org/dc/elements/1.1/");
	}

	public void init() {
		items.clear();
	}

	public void complete() {
		for (RSSItem i : items) {
			System.out.println(i.getTitle());
		}
	}

	// start a new RSSItem for each <item>, then process its children
	@Path("//item")
	public void item(Node node) {
		currentItem = new RSSItem();
		items.add(currentItem);
		applyTemplates(node);
	}

	@Path("item/title")
	public void title(Node node) {
		currentItem.setTitle(node.getStringValue());
	}

	@Path("item/link")
	public void link(Node node) {
		currentItem.setLink(node.getStringValue());
	}

	@Path("item/description")
	public void description(Node node) {
		currentItem.setDescription(node.getStringValue());
	}

	@Path("item/dc:creator")
	public void creator(Node node) {
		currentItem.setCreator(node.getStringValue());
	}
}
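
For readers wondering how annotated methods like the ones above can end up being called, here is a rough sketch of annotation-driven dispatch using plain reflection. It is not DAX's implementation: the @Handles annotation, the DispatchSketch class and the element-name matching are all made up for illustration, and a real engine would evaluate XPath expressions against DOM4J nodes rather than compare strings.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an annotation like @Path.
// It matches a plain element name, not an XPath expression.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Handles {
    String value();
}

class DispatchSketch {

    // Records which handlers fired, in call order.
    final List<String> seen = new ArrayList<String>();

    @Handles("title")
    public void title(String text) {
        seen.add("title=" + text);
    }

    @Handles("link")
    public void link(String text) {
        seen.add("link=" + text);
    }

    // Invokes every method on this class whose @Handles value equals the element name.
    void dispatch(String elementName, String text) {
        for (Method m : getClass().getDeclaredMethods()) {
            Handles h = m.getAnnotation(Handles.class);
            if (h != null && h.value().equals(elementName)) {
                try {
                    m.invoke(this, text);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("handler failed: " + m.getName(), e);
                }
            }
        }
    }

    public static void main(String[] args) {
        DispatchSketch d = new DispatchSketch();
        d.dispatch("title", "Vim7 Feature Roundup");
        d.dispatch("link", "http://example.org/vim7");
        d.dispatch("description", "no handler, silently skipped");
        System.out.println(d.seen);
    }
}
```

The annotation needs RUNTIME retention, otherwise getAnnotation() returns null and nothing is ever dispatched.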
Jan
20th
2005
Thu
permalink

XPath Performance Followup

Shirasu has extended my XPath benchmarking work to include testing across object models.

His observations:

  • Traversal of Jaxen is a bit slower than others.
  • Performance of JXPath does not depend on object models generally. JXPath uses the original traversal mechanism (i.e. NodePointers), and this measurement suggests that it does not use the object model explicitly.
  • Jaxen and Saxon in DOM are a bit slower than in JDOM and XOM.
  • The following-sibling traversal of Saxon in DOM is a bit slow, but good in JDOM and XOM.

Details in English

Details in Japanese

    Jan
    14th
    2005
    Fri
    permalink

    Tips for Computer Book Authors

    Over the last 18 months or so I’ve had the great experience of reviewing a handful of books in various stages of development. In doing so I noted several patterns that emerged and captured them here. Enjoy!

    Make sample code stand-alone test cases

    Whether you are developing Python or PowerBuilder, chances are there is a unit-test framework available. Leverage these frameworks in creating your samples. It forces discipline upon the author to keep examples small and self-contained. More significantly, it provides code that will compile and be immediately useful to readers.

    Never underestimate the power of copy and paste.

    Lucene in Action

    Testing Frameworks

    Develop reader personas

    Writers face the challenge of crystallizing target readers. What is their level of education? What is their experience with the subject being presented? More importantly, writers face the less obvious question: Who are they not writing for? Poor attention to these questions manifests itself in the form of inconsistencies like: assuming the reader understands the quicksort algorithm but in the previous chapter explaining how to download and unzip a file.

    Borrow a technique from the design world to mitigate this challenge: Develop full-blown personas of your readers. Include photos. Include a work-history. Is this the type of person who wants to go home at 5PM or understand the subject from the inside out? What is their reading style? Make a few copies. Hang them on the wall where you write. Send them to your editors.

    Mercilessly challenge yourself: Is this something the readers already know? Is this something readers need additional foundation on?

    Perfecting Personas

    Define your value proposition

    Odds are, your book is not the only one on the subject. In the planning stages of the book define your value proposition in clear terms. Send it to your editors. What does your book bring to the party that no other does? Does it consider common pitfalls? Does it describe internal implementation details? Is it aimed specifically at beginners or advanced readers? Is it a comprehensive treatise on a subject?

    Struggling to come up with your value proposition? Maybe the project isn’t such a great idea.

    Consider various styles of reading

    A small minority of your readers will make a front-to-back consideration of your manuscript. More readers look for answers to a specific problem. Still more read in an exploratory style, seeking an overall survey or looking for topics that interest them. Ask yourself: How do I serve these various types of readers? Devices such as “Best Practices” or “Pitfalls” break-out boxes help capture the attention of explorers. A topical Q&A summary can serve the answer seekers.

    Think about the guy at Borders flipping through your book on a rainy afternoon. Remember the frantic developer spilling cold coffee on your book as he works on solving a problem with a production system.

    JUnit In Action
    Head First Design Patterns

    “Don’t be too pleased with yourself.”

    This one is borrowed from the style guide provided to writers of the Economist magazine. The idea’s relevance to programmers is emphasized in Larry Wall’s consideration of the great virtues of a programmer. In it he defines “Hubris” as:

    “Excessive pride, the sort of thing Zeus zaps you for. Also the quality that makes you write (and maintain) programs that other people won’t want to say bad things about. Hence, the third great virtue of a programmer. See also laziness and impatience. (p.607)”

    Self-confidence is a good thing; leave it unchecked and you will endear yourself to few readers. For whatever reasons, the industry breeds excessive self-assurance. Challenge yourself: Can use of the first person singular be reduced? Better yet, eliminated entirely? Do you see phrases like: “A naive programmer would…” or “Inexperienced users might…”?

    Respect your reader; they could probably teach you a thing or two.

    3 Traits
    Economist Style Guide

    Leave your fingerprints on the manuscript

    This is the corollary to “Don’t be too pleased with yourself”. Without being self-serving, let your unique voice be heard. You represent a unique intersection of experience, education and ideas. Let this individuality come through. Have you been consulting for years and seen techniques succeed or fail? Tell readers about it! Have a relevant war story? Share it! Inject some personality; relentlessly combat a dry documentary style.

    If you are not consciously battling tedium, you have ceded to it.

    Enterprise Java w/o EJB

    Use a single problem domain for sample code

    Don’t be seduced by metasyntactic variables. Pick a realistic problem domain and stick to it. Don’t switch from stock quotes in the first chapter to class scheduling in the second. Let your reader reap the benefits of investing time in understanding the problem domain. Note: This is not to say examples should be cumulative and dependent on prior chapters.

    Allow them to focus on the message and not the means to convey it.

    Hibernate: A Developer’s Notebook
    Metasyntactic Variables

    Say more by saying less

    Keep your language simple and eliminate digressions. Simple words are the most powerful. “It was the best of times, it was the worst of times…”. “In the beginning…”. “We didn’t land on Plymouth Rock…” Unsparingly revise long and wordy sentences. Watch some DVD commentaries and listen to directors wax nostalgic about scenes they deleted in the interest of producing a better film. Learn from them. Don’t make your book the next Ishtar. Unsympathetically delete sentences, paragraphs and chapters.

    You’ve got work to do if you can’t remember the last time you simplified a sentence or deleted a portion of your manuscript.

    Cryptography Decrypted

    Cite other works

    Citations reinforce the hard work that went into the manuscript. More importantly they show readers where more can be learned on a specific topic. Subscribe to O’Reilly’s Safari service. Spend some time at a good book store. Work with your publisher to get copies of books they publish on related topics.

    Unix System Administration Book
    Code Complete

    Communicate compromises

    Every technology represents a series of compromises. Communicate these to the reader. Don’t make them learn the hard way. This isn’t a hammer for every nail. This isn’t a solution to every problem. Challenge yourself when outlining a topic: What are the relative merits of the implementation? What were the alternatives? When should this technology not be employed? What would my readers benefit most from knowing?

    Prove to your reader you’re not a zealot.

    Effective XML ( particularly items 24,25,41 )

    Jan
    4th
    2005
    Tue
    permalink

    java.util.UUID mini-FAQ

    J2SE 1.5 gives us a new class to generate identifiers. Here are some details I found after writing a bit of code and doing some reading.

    What is a UUID?

    According to Wikipedia:

    “…an identifier standard used in software construction, standardized by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE). The intent of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. Thus, anyone can create a UUID and use it to identify something with reasonable confidence that that identifier will never be unintentionally used by anyone for anything else. Information labelled with UUIDs can therefore be later combined into a single database without need to resolve name conflicts. The most widespread use of this standard is in Microsoft’s Globally Unique Identifiers (GUIDs) which implement this standard.

    A UUID is essentially a 16-byte number and in its canonical form a UUID may look like this:

    550E8400-E29B-11D4-A716-446655440000

    How do I generate a new UUID?

    Calling the randomUUID() factory method will give you a fresh instance. Calling toString() will provide a spec-compliant canonical string representation. See the testRandom method in the source below.
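
A minimal sketch of those two calls. UuidGenerate and newId are hypothetical names picked for this example; only java.util.UUID comes from the JDK.

```java
import java.util.UUID;

// Hypothetical demo class; not part of any library.
class UuidGenerate {

    // Returns the canonical string form of a fresh random UUID.
    static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String id = newId();
        System.out.println(id);          // five hex groups: 8-4-4-4-12, lowercase
        System.out.println(id.length()); // 36
    }
}
```

Note that toString() emits lowercase hex digits, while fromString() accepts either case.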

    I thought these were called GUIDs. What is a UUID?

    Per the draft spec referenced below these are synonymous:

    …UUIDs (Universally Unique IDentifier), also known as GUIDs (Globally
    Unique IDentifier).

    I’m a Windows guy. Is this the same thing as CoCreateGuid?

    According to MSDN the current version of CoCreateGuid also generates a 'version 4' random number based UUID.

    The CoCreateGuid function calls the RPC function UuidCreate, which creates a GUID, a globally unique 128-bit integer. Use the CoCreateGuid function when you need an absolutely unique number that you will use as a persistent identifier in a distributed environment. To a very high degree of certainty, this function returns a unique value; no other invocation, on the same or any other system (networked or not), should return the same value.

    For security reasons, it is often desirable to keep ethernet/token ring addresses on networks from becoming available outside a company or organization. In Windows XP/2000, the UuidCreate function generates a UUID that cannot be traced to the ethernet/token ring address of the computer on which it was generated. It also cannot be associated with other UUIDs created on the same computer. If you do not need this level of security, your application can use the UuidCreateSequential function, which behaves exactly as the UuidCreate function does on all other versions of the operating system.

    In Windows NT 4.0, Windows 95, DCOM release, and Windows 98, UuidCreate returns RPC_S_UUID_LOCAL_ONLY when the originating computer does not have an ethernet/token ring (IEEE 802.x) address. In this case, the generated UUID is a valid identifier, and is guaranteed to be unique among all UUIDs generated on the computer. However, the possibility exists that another computer without an ethernet/token ring address generated the identical UUID. Therefore you should never use this UUID to identify an object that is not strictly local to your computer. Computers with ethernet/token ring addresses generate UUIDs that are guaranteed to be globally unique.

    I thought UUIDs used the MAC address to guarantee uniqueness

    The specification defines multiple types of UUIDs. Version 1 UUIDs include a MAC address. Version 4 UUIDs are generated from a large random number and do not include the MAC address. The randomUUID() factory method of java.util.UUID creates version 4 UUIDs (the class also has nameUUIDFromBytes(), which creates name-based version 3 UUIDs). See the testVersion method in the source below.
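
To make the version 4 layout concrete, here is a minimal sketch that builds one by hand from 16 random bytes. It is not the JDK source; Version4Sketch and its methods are hypothetical names, and SecureRandom is assumed as the random source. Four bits are fixed for the version and two for the variant, which leaves 122 random bits.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Hypothetical illustration of the version 4 bit layout.
class Version4Sketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Stamps version and variant bits onto a copy of 16 bytes and packs them into a UUID.
    static UUID fromRandomBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("need exactly 16 bytes");
        }
        byte[] b = bytes.clone();
        b[6] &= 0x0f; // clear the 4 version bits
        b[6] |= 0x40; // set version to 4
        b[8] &= 0x3f; // clear the 2 variant bits
        b[8] |= 0x80; // set variant to binary 10 (variant 2)

        long msb = 0;
        long lsb = 0;
        for (int i = 0; i < 8; i++) {
            msb = (msb << 8) | (b[i] & 0xff);
        }
        for (int i = 8; i < 16; i++) {
            lsb = (lsb << 8) | (b[i] & 0xff);
        }
        return new UUID(msb, lsb);
    }

    // Draws 16 bytes from SecureRandom and turns them into a version 4 UUID.
    static UUID next() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return fromRandomBytes(bytes);
    }

    public static void main(String[] args) {
        UUID u = next();
        System.out.println(u + " version=" + u.version() + " variant=" + u.variant());
    }
}
```

In practice UUID.randomUUID() is the call to use; the sketch only shows where the fixed bits sit.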

    So randomUUID() only generates version 4 UUIDs. Can java.util.UUID parse and handle the other types?

    Yes. Using the fromString method you can instantiate and interrogate a UUID of version 1,2,3 or 4. See the testFromString method in the source below; note it’s a version 1 UUID.
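
A small sketch of that kind of interrogation, using only methods that exist on java.util.UUID (fromString, version, variant, timestamp, clockSequence). The UuidInspect class and its describe method are hypothetical names; the sample string is the version 1 UUID used in the sample code at the end of this FAQ.

```java
import java.util.UUID;

// Hypothetical helper for poking at a parsed UUID.
class UuidInspect {

    // Summarizes the fields java.util.UUID exposes for the given UUID string.
    static String describe(String text) {
        UUID u = UUID.fromString(text);
        StringBuilder sb = new StringBuilder();
        sb.append("version=").append(u.version());
        sb.append(" variant=").append(u.variant());
        if (u.version() == 1) {
            // timestamp() and clockSequence() throw UnsupportedOperationException
            // for anything other than a version 1 UUID, hence the guard.
            sb.append(" timestamp=").append(u.timestamp());
            sb.append(" clockSeq=").append(u.clockSequence());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("d27cdb6e-ae6d-11cf-96b8-444553540000"));
        System.out.println(describe(UUID.randomUUID().toString()));
    }
}
```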

    The draft spec talks about multiple variant types of UUIDs. Which variant types are supported by java.util.UUID?

    As indicated by the javadoc for variant(), java.util.UUID only supports variant type 2.

    I like the idea of using the MAC address. What makes version 4 UUIDs better?

    Using a version 4 UUID could save you 20 months in a United States Federal Prison. Evidently the writer of the Melissa worm was tracked down in part by the MAC address in a UUID.
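
To see why, note that the node field of a version 1 UUID can be read straight back out with node(). A minimal sketch, with UuidNode and nodeAsMac as hypothetical names:

```java
import java.util.UUID;

// Hypothetical helper showing how much a version 1 UUID gives away.
class UuidNode {

    // Formats the 48-bit node field of a version 1 UUID as a MAC-style string.
    static String nodeAsMac(UUID u) {
        long node = u.node(); // throws UnsupportedOperationException unless version() == 1
        StringBuilder sb = new StringBuilder();
        for (int shift = 40; shift >= 0; shift -= 8) {
            if (sb.length() > 0) {
                sb.append(':');
            }
            sb.append(String.format("%02x", (node >> shift) & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        UUID v1 = UUID.fromString("d27cdb6e-ae6d-11cf-96b8-444553540000");
        System.out.println(nodeAsMac(v1)); // 44:45:53:54:00:00
    }
}
```

Calling node() on a version 4 UUID throws, because there is no node field to read.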

    So these version 4 UUIDs are basically random numbers. Won’t my UUID collide with someone else’s?

    There are 122 significant bits in a version 4 UUID. 2^122 is a *very* large number. Assuming a random distribution of these bits, the probability of collision is *very* low.

    How is the “randomness” determined?

    Under the hood java.util.UUID is creating an instance of SecureRandom and using that to generate new UUIDs. If you are using the default Sun provider and default java.security file, you are using a SHA1PRNG ( Pseudo Random Number Generator based on Secure Hash Algorithm 1 ) seeded from the operating system.
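
Given a well-behaved random source, the collision odds for 122 random bits can be put into numbers with the birthday approximation p ~= n^2 / (2 * 2^122), which holds while n is far below 2^61. A rough sketch; CollisionOdds and approxCollisionProbability are hypothetical names, and the result is an estimate, not a guarantee.

```java
// Hypothetical helper for a back-of-the-envelope collision estimate.
class CollisionOdds {

    // Birthday approximation for n random values drawn from a space of 2^122.
    static double approxCollisionProbability(double n) {
        double space = Math.pow(2, 122);
        return (n * n) / (2.0 * space);
    }

    public static void main(String[] args) {
        System.out.println("1e9 UUIDs:  " + approxCollisionProbability(1e9));
        System.out.println("1e12 UUIDs: " + approxCollisionProbability(1e12));
    }
}
```

For a billion UUIDs the estimate comes out on the order of 1e-19.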

    java.security

    #
    # Select the source of seed data for SecureRandom. By default an
    # attempt is made to use the entropy gathering device specified by
    # the securerandom.source property. If an exception occurs when
    # accessing the URL then the traditional system/thread activity
    # algorithm is used.
    #
    # On Solaris and Linux systems, if file:/dev/urandom is specified and it
    # exists, a special SecureRandom implementation is activated by default.
    # This "NativePRNG" reads random bytes directly from /dev/urandom.
    #
    # On Windows systems, the URLs file:/dev/random and file:/dev/urandom
    # enables use of the Microsoft CryptoAPI seed functionality.
    #
    securerandom.source=file:/dev/urandom

    On WinTel platforms this may map down to hardware

    More on randomness

    I want to preserve backward JDK compatibility. I’ll just use java.rmi.server.UID. Cool?

    Probably a bad idea. Per the docs this gives you 2^16 significant digits. It also makes no provision for the system clock being set backward.

    What other options do I have?

    commons-id is a general purpose identifier generator capable of generating UUIDs.

    Where can I read more about UUIDs?

    Current Draft Specification

    Sample Code

    import java.util.UUID;

    import junit.framework.TestCase;

    public class UUIDTest extends TestCase {

    	public void testRandom() {
    		UUID a = UUID.randomUUID();
    		UUID b = UUID.randomUUID();
    		assertFalse(a.equals(b));
    	}

    	public void testVariant() {
    		UUID a = UUID.randomUUID();
    		assertEquals(2, a.variant());
    	}

    	public void testVersion() {
    		UUID a1 = UUID.randomUUID();
    		assertEquals(4, a1.version());
    		// here is a version 1 UUID plucked from my own HKLM\Software\Classes
    		// for history of UUIDs ending in 444553540000
    		// http://blogs.msdn.com/oldnewthing/archive/2004/02/11/71307.aspx#71356
    		UUID a2 = UUID.fromString("d27cdb6e-ae6d-11cf-96b8-444553540000");
    		assertEquals(2, a2.variant());
    		assertEquals(1, a2.version());
    	}

    	public void testFromString() {
    		String s = "d27cdb6e-ae6d-11cf-96b8-444553540000";
    		UUID a = UUID.fromString(s);
    		assertEquals(s, a.toString());
    	}

    	public void testVersionFromCommonsTestCase() {
    		// these UUIDs are from commons-id
    		UUID v1 = UUID.fromString("3051a8d7-aea7-1801-e0bf-bc539dd60cf3");
    		UUID v2 = UUID.fromString("3051a8d7-aea7-2801-e0bf-bc539dd60cf3");
    		UUID v3 = UUID.fromString("3051a8d7-aea7-3801-e0bf-bc539dd60cf3");
    		UUID v4 = UUID.fromString("3051a8d7-aea7-4801-e0bf-bc539dd60cf3");

    		//UUID v5 = UUID.fromString("3051a8d7-aea7-3801-e0bf-bc539dd60cf3");
    		assertEquals(1, v1.version());
    		assertEquals(2, v2.version());
    		assertEquals(3, v3.version());
    		assertEquals(4, v4.version());

    		// java.util.UUID doesn't support version 5 UUIDs while commons-id does
    		//assertEquals(5, v5.version());
    	}
    }
    