Piripozzo :
I'm trying to scrape the results of this booking website. The site drops a cookie to recognise the session. I've tried replicating it with requests, but I'm still getting an Invalid Session ID error in my response. What am I doing wrong?
import requests

url = 'https://alilauro-tickets.certusonline.com/php/proxy.php'
s = requests.Session()
s.get(url)
data = {
'msg': 'TimeTable',
'req': '{"getAvailability":"Y","getBasicPrice":"Y","getRouteAnalysis":"Y","directOnly":"Y","legs":1,"pax":1,"origin":"BEV","destination":"FOR","tripRequest":[{"tripfrom":"BEV","tripto":"FOR","tripdate":"2020-03-21","tripleg":0}]}'
}
r = s.post(url, data=data, cookies=s.cookies)
Here is the error I get:
'sessionID': none, 'errorCode': '620', 'errorDescription': 'Invalid Session Number'
Here is the cookie information: [screenshot: cookie information]
Bertrand Martel :
Indeed, the cookie is set when you call https://alilauro-tickets.certusonline.com/php/proxy.php, but it is not valid until a JavaScript function on the page calls https://alilauro-tickets.certusonline.com/php/proxy.php?msg=Connect. This is a protection against CSRF, as Dan-Dev mentioned in the comments.
The following works; it sends msg=Connect in a POST on the same session before making the TimeTable request:
import requests
import json

url = "https://alilauro-tickets.certusonline.com/php/proxy.php"
session = requests.Session()

# First call: "Connect" validates the session cookie on the server side.
r = session.post(url, data={"msg": "Connect"})

# Second call: the actual timetable query, sent on the same session.
r = session.post(url, data={
    "msg": "TimeTable",
    "req": json.dumps({
        "getAvailability": "Y",
        "getBasicPrice": "Y",
        "getRouteAnalysis": "Y",
        "directOnly": "Y",
        "legs": "1",
        "pax": 1,
        "origin": "FOR",
        "destination": "BEV",
        "tripRequest": [{
            "tripfrom": "FOR",
            "tripto": "BEV",
            "tripdate": "2020-03-20",
            "tripleg": 0
        }]
    })
})
print(json.loads(r.text)["VWS_Trips_Trip"])
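If you want to reuse this for other routes or dates, the two-step flow above can be wrapped in a small helper. This is only a sketch: the function names build_timetable_form and fetch_trips are made up for illustration, the request fields are copied from the snippet above, and it assumes that a response without a "VWS_Trips_Trip" key is a failure (such as the errorCode 620 from the question). Pass it a requests.Session() so the cookie from the Connect call is reused.

```python
import json

URL = "https://alilauro-tickets.certusonline.com/php/proxy.php"


def build_timetable_form(origin, destination, date):
    """Build the form fields for a one-leg, one-passenger TimeTable request.

    The field names and values mirror the request shown in the answer above.
    """
    request = {
        "getAvailability": "Y",
        "getBasicPrice": "Y",
        "getRouteAnalysis": "Y",
        "directOnly": "Y",
        "legs": "1",
        "pax": 1,
        "origin": origin,
        "destination": destination,
        "tripRequest": [{
            "tripfrom": origin,
            "tripto": destination,
            "tripdate": date,
            "tripleg": 0,
        }],
    }
    return {"msg": "TimeTable", "req": json.dumps(request)}


def fetch_trips(session, origin, destination, date, url=URL):
    """Validate the session with Connect, then return the list of trips.

    `session` is expected to behave like requests.Session (hypothetical helper,
    not part of the site's or the library's API).
    """
    # Step 1: the Connect message makes the session cookie valid.
    connect = session.post(url, data={"msg": "Connect"})
    connect.raise_for_status()

    # Step 2: the timetable query on the same session.
    response = session.post(url, data=build_timetable_form(origin, destination, date))
    response.raise_for_status()
    body = response.json()

    # Assumption: a body without "VWS_Trips_Trip" signals an error.
    if "VWS_Trips_Trip" not in body:
        raise RuntimeError("No trips in response: {} {}".format(
            body.get("errorCode"), body.get("errorDescription")))
    return body["VWS_Trips_Trip"]
```

Usage would look like fetch_trips(requests.Session(), "FOR", "BEV", "2020-03-20") after import requests. Raising on a missing key means an invalid session fails loudly instead of with a bare KeyError.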