I've posted this question here before, but the other thread is dying and I'm getting desperate.
I'm trying to scrape a webpage using rvest etc. Most of the stuff works, but now I need R to loop through a list of links and all it gives me is NA.
This is my code:
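install.packages("rvest")
library(rvest)  # provides html_nodes(), html_text() and the %>% pipe
library(xml2)   # provides read_xml()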
site20min <- read_xml("https://api.20min.ch/rss/view/1")
urls <- site20min %>% html_nodes('link') %>% html_text()
I need the next line because the first two links the API gives me point back to the homepage
urls <- urls[-c(1:2)]
If I print my links now it gives me a list of 109 links.
urls
Now this is my loop. I need it to give me the first link of urls so I can read_html it
I'm looking for something like: "https://beta.20min.ch/story/so-sieht-die-coronavirus-kampagne-des-bundes-aus-255254143692?legacy=true".
I use break so it shows me only the first link but all I get is NA.
for(i in i:length(urls)) {
link <- urls[i]
break
}
link
If I can get this far, I think I can handle the rest with rvest but I've tried for hours now and just ain't getting anywhere.
Thx for your help.
Your loop starts at i instead of 1 (for(i in i:length(urls))). Can you try out
for(i in 1:length(urls)) {
link <- urls[i]
break
}
link
instead?
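A likely reason for the NA (this is a guess, since it depends on what is in your session): if i did not exist at all, R would stop with an "object not found" error, so i is probably left over from an earlier loop and is larger than length(urls). Then i:length(urls) counts downwards and the first urls[i] is past the end of the vector, which gives NA. Here is a small sketch of that; the three example.com URLs and the value i <- 5 are made up for illustration and are not from your data:

```r
# Hypothetical stand-in for the scraped vector; the real `urls` has 109 entries.
urls <- c("https://example.com/story/a",
          "https://example.com/story/b",
          "https://example.com/story/c")

# Assumption: `i` is left over from an earlier loop and exceeds length(urls).
i <- 5
for (i in i:length(urls)) {   # i:length(urls) is 5:3, i.e. 5, 4, 3
  stale_link <- urls[i]       # first iteration reads urls[5], which does not exist
  break
}
stale_link                    # NA

# seq_along(urls) always starts at 1, so a stale `i` no longer matters.
for (i in seq_along(urls)) {
  link <- urls[i]
  break
}
link                          # "https://example.com/story/a"

# If only the first link is needed, no loop is required at all.
first_link <- urls[1]
```

seq_along(urls) is also a bit safer than 1:length(urls), because it yields an empty sequence when urls is empty, whereas 1:0 would run the loop twice.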