Parsing raw HTTP Request

I working on HTTP Traffic Data set which is composed of complete POST and GET request Like given below. I have written code in java that has separated each of these request and saved it as string element in array list. Now i am confused how to parse these raw HTTP request in java is there any method better than manual parsing?

GET http://localhost:8080/tienda1/imagenes/3.gif/ HTTP/1.1
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.8 (like Gecko)
Pragma: no-cache
Cache-control: no-cache
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: x-gzip, x-deflate, gzip, deflate
Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5
Accept-Language: en
Host: localhost:8080
Cookie: JSESSIONID=FB018FFB06011CFABD60D8E8AD58CA21
Connection: close

Answer

I [am] working on [an] HTTP Traffic Data set which is composed of complete POST and GET request[s]

So you want to parse a file or list that contains multiple HTTP requests. What data do you want to extract? Anyway here is a Java HTTP parsing class, which can read the method, version and URI used in the request-line, and that reads all headers into a Hashtable.

You can use that one or write one yourself if you feel like reinventing the wheel. Take a look at the RFC to see what a request looks like in order to parse it correctly:

Request       = Request-Line              ; Section 5.1
                    *(( general-header        ; Section 4.5
                     | request-header         ; Section 5.3
                     | entity-header ) CRLF)  ; Section 7.1
                    CRLF
                    [ message-body ]          ; Section 4.3

Leave a Reply

Your email address will not be published. Required fields are marked *