I am trying to get the article text, header, and published date from the URL below.
When I try to scrape the `article` container with class `news-container cf`, it returns 0 results.
```
import requests
from bs4 import BeautifulSoup

url = "https://www.argusmedia.com/en/news/2214037-us-hrc-prices-rise-as-supply-remains-tight"

# Request
r1 = requests.get(url, verify=False)
print(r1.status_code)

# We'll save in coverpage the cover page content
coverpage = r1.content

# Soup creation
soup1 = BeautifulSoup(coverpage, "html5lib")

# News identification
coverpage_news = soup1.find_all('article', class_='news-container cf')
print(len(coverpage_news))
```
That is because the article is loaded dynamically with JavaScript, so it is not in the HTML that `requests` receives. You need to call the API directly:
```
import requests

data = requests.get('https://www.argusmedia.com/api/news/2214037/us-hrc-prices-rise-as-supply-remains-tight').json()

body = data['AmpBody']
title = data['Title']
date = data['PublishedDate']
year = data['PublishedYear']

print(body, title, date, year, sep='\n')
# <article><p class="lead">US hot-roll...
# US HRC: Prices rise as supply remains tight
# 11 May
# 2021
```
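Since `AmpBody` comes back as an HTML fragment and the date is split across two fields, you will probably want to clean both up. Below is a minimal sketch using only the standard library. The helper names `html_to_text` and `parse_published` are made up for illustration, the `sample` dict is a hypothetical stand-in for the real response, and I am assuming the date always looks like `11 May` (day, then month name), which I have only seen for this one article.

```python
from datetime import datetime
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Skip whitespace-only nodes between tags.
        if data.strip():
            self.chunks.append(data.strip())


def html_to_text(fragment):
    """Return the visible text of an HTML fragment, one text node per line."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return "\n".join(parser.chunks)


def parse_published(day_month, year):
    """Combine strings such as '11 May' and '2021' into a datetime.date."""
    text = f"{day_month} {year}"
    # Assumption: it is not known whether the API abbreviates month names,
    # so try the full name first and the abbreviation second.
    for fmt in ("%d %B %Y", "%d %b %Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


# Hypothetical payload shaped like the response above; the body text is made up.
sample = {
    "AmpBody": '<article><p class="lead">First paragraph.</p>'
               '<p>Second paragraph.</p></article>',
    "Title": "US HRC: Prices rise as supply remains tight",
    "PublishedDate": "11 May",
    "PublishedYear": "2021",
}

print(html_to_text(sample["AmpBody"]))
print(parse_published(sample["PublishedDate"], sample["PublishedYear"]))
```

With the real response you would pass `data['AmpBody']`, `data['PublishedDate']` and `data['PublishedYear']` instead of the sample values. If you already have BeautifulSoup installed, `BeautifulSoup(body, "html.parser").get_text()` should give you similar text without the custom parser class.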