Web_Crawler

Automated web scraping program.
Scans mavin.io search results and saves the retrieved information in an ordered format (.csv).
Built with passion @ Dera Mobile Legacy
Author: CHIDERA C. ONWURAH

Each result is complete with:

a. product name,
b. date sold,
c. price sold,
d. shipping cost and
e. the pictures.

Each of the categories above is extracted to its own column in a CSV file.
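As a rough illustration of that column layout, the sketch below writes one row per sold item with Python's standard csv module. The column names, the sample values and the save_rows helper are assumptions made for this example; they are not taken from main.py.

```python
import csv
import io

# Hypothetical column names, one per category listed above.
FIELDS = ["product_name", "date_sold", "price_sold", "shipping_cost", "picture_url"]


def save_rows(rows, handle):
    """Write a header line followed by one CSV line per scraped item."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data, only to show the shape of a row.
sample = [{
    "product_name": "Example item",
    "date_sold": "2020-01-01",
    "price_sold": "10.00",
    "shipping_cost": "2.50",
    "picture_url": "https://example.com/item.jpg",
}]

# An in-memory buffer stands in for open("results.csv", "w", newline="").
buffer = io.StringIO()
save_rows(sample, buffer)
csv_text = buffer.getvalue()
```

Writing to a real file instead of the buffer only requires passing an open file handle to save_rows.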

Open cmd or a terminal and run this command to install the dependencies:

pip install bs4 requests

https://mavin.io/search?q=&bt=sold&cat=261332&sort=EndTimeSoonest&page=201
[note] Pages start from 201 and run up to the n’th page, usually in the range 201 – 400,000.
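The page-by-page walk over that URL could be sketched as follows. The page_url and page_urls helpers are hypothetical names for this example, and the query parameters are copied from the URL above; how main.py actually builds its URLs is not shown here.

```python
from urllib.parse import urlencode

BASE_URL = "https://mavin.io/search"


def page_url(page, category="261332"):
    """Build the search URL for one page of sold listings."""
    params = {
        "q": "",
        "bt": "sold",
        "cat": category,
        "sort": "EndTimeSoonest",
        "page": page,
    }
    return BASE_URL + "?" + urlencode(params)


def page_urls(first=201, last=203):
    """Return the URLs for pages first..last, both inclusive."""
    return [page_url(page) for page in range(first, last + 1)]


first_url = page_url(201)
urls = page_urls(201, 203)
```

Here first_url reproduces the example URL above, and urls holds one URL per page in the chosen range.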

The program takes the item number [ self.num = ("2956") ] as input at [ line 11 ] in the main.py file,
and it scrapes all results corresponding to that item number.
The only line that needs modification is [ line 11 ], which is changed for each of the remaining items that need to be scraped.
Let the program scrape as much information as it can for a specific item. You can check the [ log.csv ] file, and if the number of items scraped is up to 7 million, kill the program with:

Ctrl + C

Don’t worry, there will not be any duplicate results.
Network issues won’t affect the program, because it keeps restarting itself whenever the signal is not strong enough.
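How main.py restarts itself is not shown in this file. One common way to get similar behaviour is to retry a failed request after a short wait, as in the sketch below. The fetch_with_retry helper and its parameters are assumptions made for this example, not the program's actual code.

```python
import time


def fetch_with_retry(fetch, url, attempts=5, delay=2.0, sleep=time.sleep):
    """Call fetch(url) until it succeeds or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError as error:
            # Network failures such as ConnectionError are OSError subclasses,
            # and so is requests.exceptions.RequestException.
            last_error = error
            if attempt < attempts:
                sleep(delay * attempt)  # wait a little longer after each failure
    raise last_error
```

With requests installed, a call could look like fetch_with_retry(lambda u: requests.get(u, timeout=10), url).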
Contact [ mailto: [email protected] ] for further enquiries.

To start the program, run:

python main.py