Web scraping a page that requires cookies with requests

Piripozzo :

I'm trying to scrape the results of this booking website. The site sets a cookie to identify the session. I've tried replicating this with requests, but I'm still getting an Invalid Session Number error in the response. What am I doing wrong?

import requests

url = 'https://alilauro-tickets.certusonline.com/php/proxy.php'
s = requests.Session()
s.get(url)
data = {
    'msg': 'TimeTable',
    'req': '{"getAvailability":"Y","getBasicPrice":"Y","getRouteAnalysis":"Y","directOnly":"Y","legs":1,"pax":1,"origin":"BEV","destination":"FOR","tripRequest":[{"tripfrom":"BEV","tripto":"FOR","tripdate":"2020-03-21","tripleg":0}]}'
}
r = s.post(url, data=data, cookies=s.cookies)

Here is the error I get:

'sessionID': none, 'errorCode': '620', 'errorDescription': 'Invalid Session Number'

Here is the cookie information: [screenshot: cookie information]

Bertrand Martel :

Indeed, the cookie is set when you call https://alilauro-tickets.certusonline.com/php/proxy.php, but it is not valid until a JavaScript function calls https://alilauro-tickets.certusonline.com/php/proxy.php?msg=Connect. This is a protection against CSRF, as Dan-Dev mentioned in the comments.

The following works:

import requests
import json

url = "https://alilauro-tickets.certusonline.com/php/proxy.php"

session = requests.Session()

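# Step 1: send msg=Connect so the server validates the session cookie
# (in the browser, the page's JavaScript does this).
r = session.post(url, data={"msg": "Connect"})
# Step 2: request the timetable with the now-valid session.
r = session.post(url, data={
    "msg": "TimeTable",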
    "req": json.dumps({
        "getAvailability":"Y",
        "getBasicPrice":"Y",
        "getRouteAnalysis":"Y",
        "directOnly":"Y",
        "legs":"1",
        "pax":1,
        "origin":"FOR",
        "destination":"BEV",
        "tripRequest":[{
            "tripfrom":"FOR",
            "tripto":"BEV",
            "tripdate":"2020-03-20",
            "tripleg":0
        }]
    })
})

print(json.loads(r.text)["VWS_Trips_Trip"])
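
The same two-step flow can be wrapped in a small helper so the Connect call is never forgotten. This is only a sketch under stated assumptions: the names build_timetable_payload and fetch_timetable are hypothetical, the payload simply mirrors the fields used above, and the check for VWS_Trips_Trip assumes a successful response always carries that key.

```python
import json

URL = "https://alilauro-tickets.certusonline.com/php/proxy.php"


def build_timetable_payload(origin, destination, date):
    """Return the form fields for a one-leg TimeTable request (hypothetical helper)."""
    req = {
        "getAvailability": "Y",
        "getBasicPrice": "Y",
        "getRouteAnalysis": "Y",
        "directOnly": "Y",
        "legs": "1",
        "pax": 1,
        "origin": origin,
        "destination": destination,
        "tripRequest": [{
            "tripfrom": origin,
            "tripto": destination,
            "tripdate": date,
            "tripleg": 0,
        }],
    }
    # The site expects the request as a JSON string inside the "req" form field.
    return {"msg": "TimeTable", "req": json.dumps(req)}


def fetch_timetable(session, origin, destination, date, url=URL):
    """Validate the session with Connect, then return the list of trips.

    `session` is expected to behave like a requests.Session.
    """
    # Step 1: the cookie only becomes valid after a Connect message.
    session.post(url, data={"msg": "Connect"}).raise_for_status()

    # Step 2: the actual query, sent on the same session so the cookie is reused.
    response = session.post(url, data=build_timetable_payload(origin, destination, date))
    response.raise_for_status()
    body = response.json()

    # Assumption: a missing key means the request failed (e.g. an invalid session).
    if "VWS_Trips_Trip" not in body:
        raise RuntimeError(f"Unexpected response: {body}")
    return body["VWS_Trips_Trip"]


# Usage (needs the requests package and network access):
#   import requests
#   with requests.Session() as session:
#       print(fetch_timetable(session, "FOR", "BEV", "2020-03-20"))
```

Passing the session in as a parameter keeps the cookie handling in one place and lets the function be exercised with a stand-in object instead of a live connection.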
