Friday, November 10, 2006

It is tough to be a Great Programmer

This article, The Law of Leaky Abstractions by Joel Spolsky, is worth reading.

He articulates the issue very well.

The most interesting part is here...

The law of leaky abstractions means that whenever somebody comes up with a wizzy new code-generation tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying "learn how to do it manually first, then use the wizzy tool to save time." Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don't save us time learning.

And all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder.

Ten years ago, we might have imagined that new programming paradigms would have made programming easier by now. Indeed, the abstractions we've created over the years do allow us to deal with new orders of complexity in software development that we didn't have to deal with ten or fifteen years ago, like GUI programming and network programming. And while these great tools, like modern OO forms-based languages, let us get a lot of work done incredibly quickly, suddenly one day we need to figure out a problem where the abstraction leaked, and it takes 2 weeks. And when you need to hire a programmer to do mostly VB programming, it's not good enough to hire a VB programmer, because they will get completely stuck in tar every time the VB abstraction leaks.

The Law of Leaky Abstractions is dragging us down.

Sunday, September 17, 2006

Very good video of REC Trichy - Reminiscence

A very good video of REC Trichy - Reminiscence is available on Google Video.

Your eyes will become moist for sure, and your heart will ache once again for the glory days....

Check it here.

Monday, July 24, 2006

All Dostoevsky said was 'we shall be with Christ.'

This article gave me a good idea of Dostoevsky's experiences in his early years.

The most revealing paragraph is:

As for Dostoevsky himself, though tortured by fear, he nevertheless held fast to that "philosophy" which the pervert Freud detested. As one of Dostoevsky's fellow conspirators on the scaffold recalled: "All Dostoevsky said was 'we shall be with Christ.'" Yes, Dostoevsky had Christian faith. Holy Bible faith. Hebrews Chapter 11 faith. In his moment of greatest fear the image of a waiting Christ sustained him, preserved his sanity. To the pervert Freud, this faith shackles man. But to the condemned man, the man facing his own grave, this faith is liberating.

Of course, history records that shortly after Dostoevsky uttered his famous words, just as the rifles were due to fire upon the first three, a reprieve was granted.

Monday, July 10, 2006

Garbage Collection and JVM Tuning

Most large-scale J2EE applications need some JVM tuning to perform reasonably.

I had a miserable time in 2003 trying to tune the JVM on Red Hat Linux 7.3 with JBoss.

Since then, I have kept notes on GC, and the following articles are worth sharing:

How we solved our GC Problem?

A Collection of JVM options
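For reference, a typical starting point when tuning looks something like the command line below. This is only an illustrative sketch: the heap sizes and collector choice are placeholders, and `myapp.jar` is a hypothetical application, not something from the articles above.

```shell
# Illustrative HotSpot flags: fix the heap size so it doesn't resize under load,
# pick the parallel young-generation collector, and turn on GC logging so you
# can see pause times before tuning any further.
java -Xms512m -Xmx512m \
     -XX:+UseParallelGC \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -jar myapp.jar
```

The GC log output is usually the first thing to study: it tells you whether the problem is frequent minor collections or long full GCs before you start changing flags.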

Friday, July 07, 2006

Brief Note about MD5, Hash, Checksum and Digest

Recently, I became a bit curious about checking the integrity of files downloaded from the net, and had to read up on MD5, hashes, checksums, et al.

I started off with Wikipedia and a Google search; both gave me a bunch of info, which I have summarized here as a quick intro.

What is "Hash"?
---------------
The term "hash" apparently comes by way of analogy with its standard meaning in
the physical world, to "chop and mix."

In the SHA-1 algorithm, for example, the domain is "flattened" and "chopped" into "words"
which are then "mixed" with one another using carefully chosen mathematical functions.


Hash Function
----------------
A hash function (or hash algorithm) is a way of creating a small digital "fingerprint"
from any kind of data.

The function chops and mixes the data to create the fingerprint, often called a hash
value.

The hash value is commonly represented as a short string of random-looking letters
and numbers (binary data written in hexadecimal notation).

A good hash function is one that yields few hash collisions in expected input domains.
In hash tables and data processing, collisions inhibit the distinguishing of data,
making records more costly to find.
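A minimal sketch of computing such a fingerprint, using the JDK's standard java.security.MessageDigest class (the helper method md5Hex is written for this note, not part of any library):

```java
import java.security.MessageDigest;

public class HashDemo {
    // Compute the MD5 fingerprint of a string and return it as a hex string.
    static String md5Hex(String input) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(input.getBytes("UTF-8"));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));  // each byte -> two hex chars
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // The same input always yields the same 128-bit (32 hex chars) value.
        System.out.println(md5Hex("hello"));  // 5d41402abc4b2a76b9719d911017c592
    }
}
```

The same pattern works for SHA-1 or SHA-256 by changing the algorithm name passed to getInstance().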


Error Detection - Use of Hash Function and Redundancy Check
-------------------------------------------------------------------------
Using a hash function to detect errors in transmission is straightforward.

The hash function is computed for the data at the sender, and the value of this hash is
sent with the data. The hash function is performed again at the receiving end, and if
the hash values do not match, an error has occurred at some point during the transmission.
This is called a redundancy check.

Checksum
---------------
Strictly, a checksum is a value calculated using addition, though the term is
sometimes used in a more general sense to refer to any kind of redundancy check.

A checksum is a form of redundancy check, a very simple measure for protecting the integrity
of data by detecting errors in data that is sent through space (telecommunications) or time (storage).

It works by adding up the basic components of a message, typically the asserted bits, and
storing the resulting value.

Later, anyone can perform the same operation on the data, compare the result to
the authentic checksum, and (assuming that the sums match) conclude that the
message was probably not corrupted.

The simplest form of checksum, which simply adds up the asserted bits in the data,
cannot detect a number of types of errors. In particular, such a checksum is not changed by:

* reordering of the bytes in the message
* inserting or deleting zero-valued bytes
* multiple errors which sum to zero
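A toy additive checksum (a helper written just for this note) makes the first weakness easy to see: reordering the bytes leaves the sum unchanged, so the error goes undetected.

```java
public class SumCheck {
    // Naive checksum: just add up all the byte values.
    static int checksum(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum += b & 0xFF;  // treat bytes as unsigned
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] original  = "network".getBytes();
        byte[] reordered = "workten".getBytes();  // same bytes, different order
        // Both messages produce the same checksum, so reordering is invisible.
        System.out.println(checksum(original) == checksum(reordered));  // true
    }
}
```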

More sophisticated types of redundancy check, including Fletcher's checksum, Adler-32, and
cyclic redundancy checks (CRCs), are designed to address these weaknesses by considering not only the value of each byte but also its position.

The cost of the ability to detect more types of errors is the increased complexity of computing
the checksum.

These types of redundancy check are useful in detecting accidental modification such as corruption to stored data or errors in a communication channel. However, they provide no security against a malicious agent as their simple mathematical structure makes them trivial to circumvent.

To provide this level of integrity, the use of a cryptographic hash function, such as SHA-256,
is necessary. (Collisions have been found in the popular MD5 algorithm and finding collisions in SHA-1 seems possible, but there is no evidence as of 2005 that SHA-256 suffers similar weaknesses.)

On Unix, there is a tool called "cksum" that generates both a 32-bit CRC and a byte count for
any given input file.

CRC
----
A cyclic redundancy check (CRC) is a type of hash function used to produce a checksum,
a small, fixed number of bits, from a block of data, such as a packet of network traffic or
a block of a computer file. The checksum is used to detect errors after transmission or storage.

A CRC is computed and appended before transmission or storage, and verified afterwards by the recipient to confirm that no changes occurred in transit. CRCs are popular because they are simple to implement in binary hardware, are easy to analyze mathematically, and are particularly good at detecting common errors caused by noise in transmission channels.
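The JDK ships a CRC implementation in java.util.zip.CRC32; a quick sketch showing that, unlike a plain byte sum, a CRC does notice reordered bytes (the inputs are arbitrary examples):

```java
import java.util.zip.CRC32;

public class CrcDemo {
    // Compute the 32-bit CRC of a byte array using the JDK's CRC32 class.
    static long crc(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);        // feed the whole block in
        return crc.getValue();   // the 32-bit checksum as a long
    }

    public static void main(String[] args) {
        long a = crc("network".getBytes());
        long b = crc("workten".getBytes());  // same bytes, reordered
        // A CRC depends on byte positions, so the two values differ.
        System.out.println(a != b);
    }
}
```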

Hashes are "digests", not "encryption"
-------------------------------------
Encryption transforms data from a cleartext to ciphertext and back (given the right keys),
and the two texts should roughly correspond to each other in size: big cleartext yields
big ciphertext, and so on. "Encryption" is a two-way operation.

Hashes, on the other hand, compile a stream of data into a small digest
(a summarized form: think "Reader's Digest"), and it's strictly a one-way operation. All hashes of the same type have the same size, no matter how big the inputs are.
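To see the fixed-size property, here is a small sketch (again using the JDK's MessageDigest; the inputs are arbitrary) showing that a one-byte input and a one-megabyte input yield digests of exactly the same length:

```java
import java.security.MessageDigest;

public class DigestSize {
    // MD5 always yields 16 bytes (128 bits), whatever the input size.
    static int digestLength(byte[] input) throws Exception {
        return MessageDigest.getInstance("MD5").digest(input).length;
    }

    public static void main(String[] args) throws Exception {
        byte[] tiny = "x".getBytes();
        byte[] huge = new byte[1000000];  // a megabyte of zero bytes
        System.out.println(digestLength(tiny));  // 16
        System.out.println(digestLength(huge));  // 16
    }
}
```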

We'll note here that though hashes and digests are often informally called "checksums",
they really aren't. True checksums, such as a cyclic redundancy check, are designed to
catch data-transmission errors, not deliberate attempts at tampering with data.

Cipher
---------
A cipher is an algorithm for performing encryption (and the reverse, decryption):
a series of well-defined steps that can be followed as a procedure. An alternative term is encipherment.

Excellent References:
-----------------------
1) An Illustrated Guide to Cryptographic Hashes


2) What is a Digital Signature

Thursday, May 11, 2006

IO Redirection in Java

A casual Google search led me to this interesting Java TechTip, which talks about redirecting IO to a file, just as we normally do at the command prompt:

java test >myfile

The same thing can be achieved in code by using a PrintStream cleverly.

The System class has methods that let you install your own PrintStream: System.setOut().
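A small sketch of redirecting System.out to a file from inside the program (the file name out.txt is arbitrary):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.nio.file.Files;

public class RedirectDemo {
    public static void main(String[] args) throws Exception {
        PrintStream console = System.out;  // remember the real stdout
        File log = new File("out.txt");
        PrintStream fileOut = new PrintStream(new FileOutputStream(log), true);

        System.setOut(fileOut);            // from here, println goes to the file
        System.out.println("goes to out.txt");

        System.setOut(console);            // restore the original stream
        fileOut.close();

        // Read the line back to confirm the redirection worked.
        System.out.println(Files.readAllLines(log.toPath()).get(0));
    }
}
```

System.setErr() and System.setIn() work the same way for the error and input streams.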

As a point of interest, the JDK 1.3 implementation of java.lang.System has internal code of this form:

    FileOutputStream fdOut =
        new FileOutputStream(FileDescriptor.out);
    setOut0(new PrintStream(
        new BufferedOutputStream(fdOut, 128), true));

Check out this TechTip for more info about IO redirection and array manipulation.

Friday, January 06, 2006

Good FAQ on Changing Unix Shell

This entry is worth reading before switching from one Unix shell to another.

Thursday, January 05, 2006

Web Security - Why SSL is not enough to protect Credit Card

I had been under the illusion that SSL is enough for point-to-point security. Two articles from Monitoring Central have broken the illusion for good!

Does SSL protect you, or is it a condom that is open at both ends?
Read this article to understand the limitation: SSL does not automatically ensure the authenticity of both ends.

Excerpt:

What it does not do is actually secure any of the data that passes through the pipe, or really know where either end of the pipe actually is. What you can be sure of is that anything put into one end of the pipe is going to come out wherever the other end is.

But surely the data is fully protected? Yes, whilst the data is in the pipe it is protected. Now, assuming - and unfortunately that's what we have to do - that you know for sure where each end of the pipe is, and you are sure that each end is very secure, and you know for certain who is at each end, then you're OK. If any of those is not true then you do have a problem.

My data is SSL protected between the server, and me so why should I worry? Well no one at the server end really knows whom the data is from because they don't know what your identity is. They assume that data arriving through the pipe is right, and that your identity can be presumed from the data, not the other way around. Unfortunately there are hacker attacks that divert your link through their own site, where they can pretend to each end that they are the other entity without either end being the wiser. (This is called a man-in-the-middle attack using web site spoofing.)

Why SSL is not enough to secure your credit card details
There is no easy way for the server to establish the identity of the client, and vice versa. Sure, we do get a padlock, but most people would not bother to check whether the certificate is valid, genuine, or fake.