Monday, September 26, 2016

Duplicate HTTP/HTTPS Post requests


If you are here, you are probably also puzzled by a weird error where the same HTTP/HTTPS request is resent to the server, sometimes after a few milliseconds and sometimes after a few seconds.

And it doesn't help that even POST requests are repeatedly sent to the server.

There are quite a few reports of the same thing in the wild. One of them is here.

It happened to us as well, so I tried to understand it, hoping that I would be able to reproduce it.

Finally, I was able to reproduce it on my localhost, and the behaviour confirms what the HTTP 1.1 spec says here.

The exact steps to reproduce are:

  • Install MITMProxy on your localhost
    • It runs on port 8080 by default
    • If you are troubleshooting HTTPS requests, ensure that the HTTPS certificate is available in the localhost trust store
  • Install Firefox if you don’t have it already
  • Install Wireshark to investigate the traffic on your machine
    • Note that traffic on the loopback address 127.0.0.1 can be easily captured by Wireshark on Mac and Linux
    • Windows may require some additional steps; please refer to the Wireshark documentation
  • Configure Firefox to proxy all requests (including localhost) to localhost:8080, where MITMProxy is running
  • Use a simple form to post the data; this ensures that the request has a body
    • Reference is TestPostForm.html
  • Write a simple test HTTP server which accepts requests and closes the connection without sending any response
    • Refer to AnotherTestHttpServer
    • The line that writes the response is commented out to ensure that the socket is closed without sending any response to the client
  • Run the test HTTP server by running the Java code from the step above
  • Launch Wireshark and filter HTTP traffic by typing http in the filter
  • Load TestPostForm.html in Firefox and hit submit
  • You should see 2 requests hitting the Java HTTP server while the browser displays the “HttpReadDisconnect('Server disconnected',)” error
  • Wireshark logs will show 2 requests from MITMProxy to the HTTP server, while the proxy conveys the error “502 Bad Gateway” to the browser

Source code of AnotherTestHttpServer is as follows
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Date;

public class AnotherTestHttpServer {

    public static void main(String[] args) throws IOException {
        // Listen on port 8000, the port that TestPostForm.html posts to.
        ServerSocket server = new ServerSocket(8000);
        System.out.println("Listening for connection on port 8000 ....");
        while (true) {
            // try-with-resources closes the socket when the block exits.
            try (Socket socket = server.accept()) {
                System.out.println("Creating a socket with port " + socket.getPort());
                System.out.println("Handling request...." + System.currentTimeMillis());

                Date today = new Date();
                String httpResponse = "HTTP/1.1 200 OK\r\n\r\n" + today;

                // The response write below is intentionally commented out so that
                // the connection is closed without sending anything to the client.
                //socket.getOutputStream().write(httpResponse.getBytes("UTF-8"));
                socket.close();
            }
        }
    }
}

Source Code of TestPostForm.html 

 <HTML>  
 <HEAD>  
       <TITLE>A Sample Form Using POST</TITLE>  
 </HEAD>  
 <BODY BGCOLOR="#FDF5E6">  
 <CENTER>  
 <H2>A Sample Form Using POST</H2>  
 <FORM ACTION="http://localhost:8000"  
    METHOD="POST">  
  First name:  
  <INPUT TYPE="TEXT" NAME="firstName" VALUE="ABC"><BR>  
  Last name:  
  <INPUT TYPE="TEXT" NAME="lastName" VALUE="DEFDFD"><P>  
  <INPUT TYPE="SUBMIT">  
 </FORM>  
 </CENTER>  
 </BODY></HTML>  


How to work around this problem
Put a GUID in each request and check your DB/cache before executing the request. If the GUID is found in the cache/DB, return the same response as before instead of executing the request again.

Putting a cache in front of the service layer may help; otherwise, locking at the DB layer may be required.
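
The GUID check can be sketched roughly as below. This is only a minimal illustration, not production code: the class name IdempotentRequestHandler, the handle method and the in-memory map are all made up for this example, and a real service would more likely keep the GUIDs in a shared cache or a DB table so that all server instances see them.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class IdempotentRequestHandler {

    // Hypothetical in-memory cache: request GUID -> response already produced.
    // Assumption for this sketch only; a shared cache or DB table would replace it.
    private final Map<String, String> responsesByRequestId = new ConcurrentHashMap<>();

    /**
     * Runs the action only the first time a given GUID is seen. For a repeated
     * GUID, the cached response is returned and the action is not run again.
     */
    public String handle(String requestId, Supplier<String> action) {
        // computeIfAbsent is atomic per key on ConcurrentHashMap, so two
        // concurrent duplicates cannot both execute the action.
        return responsesByRequestId.computeIfAbsent(requestId, id -> action.get());
    }

    public static void main(String[] args) {
        IdempotentRequestHandler handler = new IdempotentRequestHandler();
        AtomicInteger executions = new AtomicInteger();

        // The client generates the GUID once and sends it along with the request.
        String requestId = UUID.randomUUID().toString();

        String first = handler.handle(requestId, () -> "order-" + executions.incrementAndGet());
        // Simulates the same request being resent by a proxy or browser.
        String second = handler.handle(requestId, () -> "order-" + executions.incrementAndGet());

        System.out.println(first.equals(second)); // true: same response both times
        System.out.println(executions.get());     // 1: the action ran only once
    }
}
```

The important design point is that the "check" and the "execute" happen as one atomic step. If they are done separately, two duplicates arriving a few milliseconds apart can both pass the check, which is where the DB-level locking mentioned above comes in.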

Helpful References are







Tuesday, October 21, 2014

Bare bones Credit Card processing workflow



Some of my recent work deals with accepting credit cards and other forms of cards and settling them.

I had some difficulty understanding even the basic terminology, so I thought it would be good to share my summary with others in case they need it.

So, here it is...

There are quite a few key players


  1. Card Holder - the person who has the card, say, a VISA card.
  2. Issuing Bank - the bank which issued the card (a VISA-branded card) to the card holder, say, Bank of America.
  3. Merchant - the seller; let's say I want to buy something at www.amazon.com.
  4. Acquiring Bank - the bank where the merchant holds its account. Amazon.com may have an account with a bank like Wells Fargo, and when Amazon receives the money paid by customers, it is deposited into that account.
  5. Payment Processor - someone like Chase Paymentech, which accepts online payment requests from Amazon.com.
  6. Card Network - VISA/MasterCard/Amex have their own electronic networks which accept the payment request and route it to the Issuing Bank for authorization.
  7. Payment Service Provider - online players like PayPal and Amazon Payments who certify themselves as merchants with the card networks and act on behalf of small merchants (sub-merchants).
  8. Sub-Merchant - small merchants who may not have the muscle/resources to go through the lengthy and costly certification process with card networks and banks.
  9. Payment Gateway - CCBill or other online services which deal with processors and banks.

There are 2 basic workflows

  1. Authorize - when the card needs to be authorized for the requested sale amount
  2. Settlement - when the sale amount is deducted from the customer's account and deposited into the merchant's account
Please note that these are simplified descriptions for ease of understanding.

Overall workflow can be visualized as depicted here and here


Monday, September 09, 2013

Browser vs Layout Engine



Ever since I started working on a front-end-heavy project which involves a significant amount of HTML5 and CSS3 as well as elements of responsive design, I have heard terms like WebKit, Gecko, Browser Engine, JS Engine, Rendering Engine and so on.

Some of these terms are clear to me but some are not. So, I decided to do some digging into this area to educate myself.

The findings are good enough to share with a wider audience in the hope that they will be helpful to someone in the future.
  • What is WebKit?  
WebKit is an open source rendering engine which parses HTML, CSS and JavaScript and renders the web page in a browser.

The standard components of WebKit are WebCore, JSCore (used for JS parsing and execution), as well as a platform-specific stack for actually rendering the page.

WebKit Diagram

It is all very well explained in an article here

  • What are the other rendering engines in the wild? 
Well, there are numerous rendering engines (besides WebKit), but the most notable ones are Trident (from Microsoft) and Gecko (from Mozilla).

Conceptually, they are pretty similar to WebKit; the actual differences lie in the implementation.

  • What is a Browser?
A browser is software which is used to access resources over the internet (or an intranet).

Browsers use a rendering engine like WebKit/Gecko to render the page but have additional code for the browser UI as well as for dealing with different persistence layers like Cookies/LocalStorage/WebSQL/IndexedDB.


Image Source (how browsers work?)

Standard Components of a Browser are:

  • Parsing (HTML, XML, CSS, JavaScript)
  • Layout (common in all WebKit browsers)
  • Text and graphics rendering
  • Image decoding
  • GPU interaction
  • Network access
  • Hardware acceleration

More information can be found here and here

  • Does using WebKit mean that browsers will be compatible?
No. Just because 2 browsers use WebKit does not mean that they will be compatible.
For example, Chrome (up to version 27) and Safari are based on WebKit but, as we can see from the diagrams above, there are lots of other components which vary from one browser to another (for example, Chrome uses the V8 JS engine whereas Safari uses a different engine) or from one OS to another (like Safari on Windows vs Safari on iPad).

It is a complex world out there!

Thursday, May 30, 2013

How does Solr Sort Documents when their Score is Same

We have had cases where the same keyword search gives us results which seem to be ordered randomly.

On digging deeper, it seems that if the scores of documents are the same, Solr sorts them based on their internal DocId or the time when the document was indexed.

To demonstrate this, I used a very simple schema:

I added a few documents like this:

<doc>
<str name="id">201</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2022</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">1</str>
<int name="rank_id">20</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2</str>
<int name="rank_id">21</int>
<str name="name">Perl Harbor</str>
</doc>


When doing a simple query, 


I get the following documents, which are in the order in which they were added:

<doc>
<str name="id">201</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2022</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">1</str>
<int name="rank_id">20</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2</str>
<int name="rank_id">21</int>
<str name="name">Perl Harbor</str>
</doc>


This indicates that the order is based on the order in which the documents were added to Solr.
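
To make the tie-breaking behaviour concrete, here is a small Java sketch. It is not Solr code: the class TieBreakSortSketch, the Hit type and the score value 1.0 are all made up for illustration. It only models the idea that when every score is equal, a sort on score alone leaves the documents in their indexing order, while an explicit secondary key (rank_id here) gives an order that does not depend on when each document was indexed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TieBreakSortSketch {

    // Hypothetical stand-in for a search hit; this is not a Solr class.
    public static final class Hit {
        final String id;    // the "id" field of the document
        final int rankId;   // the "rank_id" field of the document
        final float score;  // relevance score (made up for this sketch)

        Hit(String id, int rankId, float score) {
            this.id = id;
            this.rankId = rankId;
            this.score = score;
        }
    }

    // The four sample documents, in the order they were indexed,
    // all with the same score.
    public static List<Hit> sampleHits() {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("201", 20111, 1.0f));
        hits.add(new Hit("2022", 20111, 1.0f));
        hits.add(new Hit("1", 20, 1.0f));
        hits.add(new Hit("2", 21, 1.0f));
        return hits;
    }

    // Returns the ids sorted by score descending. When tieBreakOnRank is true,
    // equal scores are further ordered by rank_id ascending.
    public static List<String> sortedIds(List<Hit> hits, boolean tieBreakOnRank) {
        Comparator<Hit> order = Comparator.comparingDouble((Hit h) -> h.score).reversed();
        if (tieBreakOnRank) {
            order = order.thenComparingInt(h -> h.rankId);
        }
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(order); // List.sort is stable: ties keep their input order
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(sortedIds(sampleHits(), false)); // [201, 2022, 1, 2]
        System.out.println(sortedIds(sampleHits(), true));  // [1, 2, 201, 2022]
    }
}
```

In Solr itself, the analogous step would be to add an explicit secondary sort to the query, for example sort=score desc, rank_id asc, assuming the field is defined in a way that allows sorting on it.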

You can see the same question being asked here

Saturday, October 27, 2012

Solr Schema Changes - Breaking Vs Non-Breaking



There are times when your Solr-based application needs to be extended by adding new fields, updating existing fields' definitions or deleting existing fields.

Whenever we run into these scenarios, one of the most important questions to answer is: does this change require the existing index to be deleted and recreated, or is it as simple as updating the schema without any deletion and re-indexing?

If a change requires the index to be deleted and all docs to be re-indexed, that is what I call a breaking change. One such case is when a field's omitNorms setting is changed from true to false or vice versa. This impacts all the documents, and unless all docs are deleted and then re-indexed, the index will still have the older information.

Changes like adding a new field or deleting a field are easy to deal with. These changes do not require all docs to be deleted; Solr handles them nicely. All newly added docs will follow the new schema. This is what I refer to as a non-breaking change.

I hope this clarifies some of the questions which people may have about the impact of making changes to the Solr schema.


Friday, May 25, 2012

Creativity, Brainstorming and Work Spaces

This article does a good job of clearing up misconceptions about brainstorming, creativity and how they can be influenced by people dynamics as well as the work space people are in.

Following is what I understood:
  • Brainstorming is not as productive as it is made out to be. Most notable is the notion that, during a brainstorming session, we should not be critical of ideas and should instead just build an inventory of them. This is supposed to ensure a free flow of ideas, but it does not seem to work that way in the real world.
  • Research shows that it is much better to let people question ideas; this seems to lead to better ideas which are far more original than what brainstorming produces.
  • The article seems to suggest that unfamiliar perspectives can be thought-provoking and can lead to new ideas. The success of Broadway musicals whose artists were a mix of people who had worked together before and some newcomers indicates that this is truly the case.
  • A lesson worth keeping in mind is that the more successful teams have a healthy mixture of people who have worked together as a team earlier and some new people who have a different perspective than the rest of the team.
  • For a team to succeed, it is important that the team is able to meet physically often, which leads to the insight that work spaces play an important role in fostering creative collaboration across different groups.
  • The success of the MIT Radar Lab in Building 20, as well as the Pixar setup by Steve Jobs, shows that workspaces are much more important than we think.

As I work in software development with different sets of people, these lessons are important for me and for people like me.

Wednesday, June 29, 2011

Using log4j to get ibatis and SQL logs

# Global logging configuration
log4j.rootLogger=ERROR, stdout

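# Uncomment the logger lines below to enable the corresponding debug output
# shows iBATIS framework activity
#log4j.logger.com.ibatis=DEBUG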

# shows SQL of prepared statements
#log4j.logger.java.sql.Connection=DEBUG

# shows parameters inserted into prepared statements
#log4j.logger.java.sql.PreparedStatement=DEBUG

# shows query results
#log4j.logger.java.sql.ResultSet=DEBUG

#log4j.logger.java.sql.Statement=DEBUG

# Console output
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%t] - %m%n