
PostgreSQL

PostgreSQL is a free, powerful SQL database that is frequently used with Python.

How to connect to postgres

sudo -i -u postgres

run psql

postgres@rag-laptop:~$ psql
psql (12.6 (Ubuntu 12.6-0ubuntu0.20.04.1))
Type "help" for help.

list databases

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 gis       | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | 
 postgres  | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | 
 suppliers | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | 
 template0 | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(5 rows)

connect to a database

postgres=# \c suppliers
You are now connected to database "suppliers" as user "postgres".

create a database

postgres=# CREATE DATABASE jml;
CREATE DATABASE
postgres=# \l
                                   List of databases
    Name     |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-------------+----------+----------+-------------+-------------+-----------------------
 gis         | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | 
 jml         | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |

create a table

postgres=# \c jml
You are now connected to database "jml" as user "postgres".
jml=# CREATE TABLE PRODUCTS (id serial, product varchar(40), price money);
CREATE TABLE
jml=# 

insert into table

jml=# INSERT INTO PRODUCTS (product,price) VALUES('jug',29.99);
INSERT 0 1
jml=# INSERT INTO PRODUCTS (product,price) VALUES('pot',9.99) RETURNING *;
 id | product | price 
----+---------+-------
  2 | pot     | £9.99
(1 row)

INSERT 0 1

select from a table

jml=# SELECT product FROM products WHERE price < '29.99';
 product 
---------
 pot
(1 row)

store ip addresses in a table

jml=# CREATE TABLE inet_test (  
jml(#     address INET
jml(# );
CREATE TABLE

check the table was created

jml=# \d
               List of relations
 Schema |      Name       |   Type   |  Owner   
--------+-----------------+----------+----------
 public | inet_test       | table    | postgres
 public | products        | table    | postgres
 public | products_id_seq | sequence | postgres
(3 rows)

jml=# INSERT INTO inet_test (address) VALUES ('192.168.1.0/24'); 
INSERT 0 1
jml=# INSERT INTO inet_test (address) VALUES ('172.16.11.0/24') RETURNING *; 
    address     
----------------
 172.16.11.0/24
(1 row)

Use CIDR notation

jml=# CREATE TABLE cidr_test (  
jml(#     address CIDR
jml(# );

jml=# INSERT INTO cidr_test (address) VALUES ('192.168.10/24');  
INSERT 0 1
jml=# INSERT INTO cidr_test (address) VALUES ('192.168.10');  
INSERT 0 1
jml=# INSERT INTO cidr_test (address) VALUES ('192.168.100.128/25');  
INSERT 0 1
jml=# INSERT INTO cidr_test (address) VALUES ('192.168.100.128');
INSERT 0 1
jml=# select * from cidr_test;
      address       
--------------------
 192.168.10.0/24
 192.168.10.0/24
 192.168.100.128/25
 192.168.100.128/32
(4 rows)

Note how the cidr type fills in any missing octets and, when no netmask is given, assumes one just large enough to cover the octets you wrote (/32 for a full four-octet address).

Create postgres user for Python Script

Syntax:

CREATE USER user3 WITH
LOGIN
SUPERUSER
CREATEDB
CREATEROLE
INHERIT
NOREPLICATION
CONNECTION LIMIT -1
VALID UNTIL '2025-04-03T11:50:38+05:30'
PASSWORD 'password3';

Full commands to run to set up the user on a Raspberry Pi (aka 'The Server'):

pi@pi4:~$ sudo -i -u postgres
postgres@pi4:~$ psql
psql (11.11 (Raspbian 11.11-0+deb10u1))
Type "help" for help.
postgres=# CREATE USER user3 WITH
postgres-# LOGIN
postgres-# SUPERUSER
postgres-# CREATEDB
postgres-# CREATEROLE
postgres-# INHERIT
postgres-# NOREPLICATION
postgres-# CONNECTION LIMIT -1
postgres-# VALID UNTIL '2025-04-03T11:50:38+05:30'
postgres-# PASSWORD 'password3';

Check the new user has been added

\du


Install psycopg2 to connect to postgres using Python

Connect to postgres using Python: https://github.com/RGGH/SQL
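
Once psycopg2 is installed (pip install psycopg2-binary), connecting from Python only takes a few lines. A minimal sketch, reusing the jml database, products table and user3 login created above (the host is an assumption):

import psycopg2

# database, table and user come from the examples above;
# the host is an assumption
conn = psycopg2.connect(host="localhost",
                        dbname="jml",
                        user="user3",
                        password="password3")
cur = conn.cursor()
cur.execute("SELECT id, product, price FROM products;")
for row in cur.fetchall():
    print(row)
cur.close()
conn.close()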

Extract links with Scrapy

Using Scrapy’s LinkExtractor method you can get the links from every page that you desire.

Link extraction can be achieved very quickly with Scrapy and Python.

https://www.programcreek.com/python/example/106165/scrapy.linkextractors.LinkExtractor


https://github.com/scrapy/scrapy/blob/2.5/docs/topics/link-extractors.rst

https://github.com/scrapy/scrapy/blob/master/scrapy/linkextractors/lxmlhtml.py


https://w3lib.readthedocs.io/en/latest/_modules/w3lib/url.html

What are Link Extractors?

    Link Extractors are objects used for extracting links from web pages, working on scrapy.http.Response objects.
    “A link extractor is an object that extracts links from responses.” Though Scrapy has a built-in extractor, scrapy.linkextractors.LinkExtractor, you can customise your own link extractor based on your needs by implementing a simple interface.
    The Scrapy link extractor makes use of w3lib.url
    Have a look at the source code for w3lib.url: https://w3lib.readthedocs.io/en/latest/_modules/w3lib/url.html
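
As a quick illustration of what w3lib.url provides, its canonicalize_url function normalises URLs so that duplicates can be spotted (the URL below is made up):

from w3lib.url import canonicalize_url

# query arguments are sorted into a canonical order
print(canonicalize_url("https://www.example.com/do?b=2&a=1"))
# -> https://www.example.com/do?a=1&b=2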

# -*- coding: utf-8 -*-
#+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
#|r|e|d|a|n|d|g|r|e|e|n|.|c|o|.|u|k|
#+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

import os

from scrapy import Spider
from scrapy.crawler import CrawlerProcess
from scrapy.linkextractors import LinkExtractor


class Ebayspider(Spider):

    name = 'ebayspider'
    allowed_domains = ['ebay.co.uk']
    start_urls = ['https://www.ebay.co.uk/deals']

    # start each run with a fresh output file
    try:
        os.remove('ebay2.txt')
    except OSError:
        pass

    custom_settings = {
        'CONCURRENT_REQUESTS': 2,
        'AUTOTHROTTLE_ENABLED': True,
        'AUTOTHROTTLE_DEBUG': True,
        'DOWNLOAD_DELAY': 1
    }

    def __init__(self):
        super().__init__()
        # only extract links matching the Superdry deals page, deduplicated
        self.link_extractor = LinkExtractor(
            allow="https://www.ebay.co.uk/e/fashion/up-to-50-off-superdry",
            unique=True)

    def parse(self, response):
        for link in self.link_extractor.extract_links(response):
            # record each extracted link, then follow it and repeat
            with open('ebay2.txt', 'a+') as f:
                f.write(f"\n{str(link)}")
            yield response.follow(url=link, callback=self.parse)


if __name__ == "__main__":
    process = CrawlerProcess()
    process.crawl(Ebayspider)
    process.start()

Summary

The above code gets all of the hrefs very quickly and gives you the flexibility to omit or include very specific attributes.

Watch the video: Extract Links | how to scrape website urls | Python + Scrapy Link Extractors


Read Scrapy ‘start_urls’ from csv file

How can the start_urls for scrapy be imported from csv?

Using a list comprehension and a csv file you can make Scrapy get specific URLs from a predefined list.

use the .strip() method to remove newline characters

Here you can see line.strip() performing the removal:

[line.strip() for line in file]

Demonstration of how to read a list of URLs from a CSV (and use in Scrapy)

with open('data.csv') as file:
    start_urls = [line.strip() for line in file]

use each entry in start_urls as the url for a request made by the start_requests method

def start_requests(self):
    for url in self.start_urls:
        yield Request(url=url, callback=self.parse)
Full example code – add your own selectors in the parse method!

Get the code on the Red and Green GitHub page https://github.com/RGGH/Scrapy18/blob/main/stackospider.py

This is also an answer to a question on stackoverflow:

https://stackoverflow.com/questions/67166565/how-can-the-start-urls-for-scrapy-be-imported-from-csv/67175549#67175549
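
Putting the two snippets together, a minimal self-contained spider might look like this (a sketch only - the spider in the linked repo is the full version, and data.csv is assumed to contain one URL per line):

import scrapy
from scrapy import Request

class StackoSpider(scrapy.Spider):
    name = "stackospider"

    def start_requests(self):
        # read the predefined list of URLs, stripping newline characters
        with open("data.csv") as file:
            start_urls = [line.strip() for line in file]
        for url in start_urls:
            yield Request(url=url, callback=self.parse)

    def parse(self, response):
        # add your own selectors here
        yield {"url": response.url}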


How to web scrape iframes with scrapy

Web scraping pages with iframes can be done with Scrapy if you use a separate URL to access the data inside the iframe.

You need to identify the URL of the page inside the iframe and then append that to your base URL to give the Scrapy spider a second URL to visit.

Watch the video here on YouTube

In the video you see how to extract the elements from within the iframe using Scrapy
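
The general shape in code is something like this (a sketch - the start URL, spider name and XPATH are assumptions; the repo below has a worked example):

import scrapy

class IframeSpider(scrapy.Spider):
    name = "iframespider"
    start_urls = ["https://example.com/page-with-iframe"]  # placeholder URL

    def parse(self, response):
        # the iframe's src attribute is the separate, second URL
        iframe_src = response.xpath("//iframe/@src").get()
        if iframe_src:
            yield response.follow(iframe_src, callback=self.parse_iframe)

    def parse_iframe(self, response):
        # the iframe's content is now an ordinary response
        yield {"url": response.url}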

View/fork/copy the code from our GitHub repo: https://github.com/RGGH/iframes

See also our article on scraping iframes using Selenium:

https://redandgreen.co.uk/web-scrape-iframes-how-to-accept-cookies-privacy-pop-up-when-web-scraping-with-python-and-selenium/

Let us know if you would like any more information or would like to discuss a project.

We can provide a quote and initial analysis within 1 working day.

For more details on iframes see: https://www.w3schools.com/html/html_iframe.asp


How to scrape iframes

If you are scraping a website, pop-up messages asking you to agree to accept cookies can prevent your scraper from continuing to the pages you want to scrape.

How can you get past these pop-ups with your scraper if they are in an iframe?

Using Selenium you need to switch to the iframe (which you can identify using browser tools / inspect in Chrome).

Here is a video showing the general concept:

Scrape a site with pop up “accept cookies” message

If you just want the basic idea…

driver.switch_to.frame(iframe)

“click on accept or ok”

Instead of “click on accept or ok” you need to identify the @id or @class name and use that in an XPATH so you can send the .click() command with Selenium.
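
For example (the button’s id here is hypothetical - inspect the page to find the real attribute):

driver.find_element_by_xpath("//button[@id='accept']").click()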

driver.switch_to.default_content()

Once past this you need to return to the default (or parent) content to carry on with the navigation towards the page you want to scrape.

from selenium import webdriver

# path_chrome is the path to your chromedriver executable
driver = webdriver.Chrome(executable_path=path_chrome)

# find the frame using id, title etc.
frames = driver.find_elements_by_xpath("//iframe[@title='iframe_to_get']")

# switch the webdriver object to the first matching iframe.
driver.switch_to.frame(frames[0])

See also: https://stackoverflow.com/questions/49825722/scraping-iframe-using-selenium#49825750