mmohamedyaser
Bro this is not a udemy course request post.
Where is the cookie file opened? In an IDE?

You need to create it and add your access_token and client_id in it.
access_token||client_id
Thanks, brother, but something is missing in the instructions. First, do you have to install Python? Then in cmd you paste the access token and client ID?
So the text file looks like:
access_token||client_id
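For anyone unsure about that format, here is a hedged sketch of how such a file could be read in Python. The function name parse_cookie_file is made up for illustration; the actual script's parsing may differ.

```python
# Illustrative only: read a cookie.txt whose single line is
# "<access_token>||<client_id>", the format described above.
def parse_cookie_file(path):
    with open(path) as f:
        line = f.read().strip()
    access_token, client_id = line.split("||", 1)  # split on the first "||"
    return access_token, client_id
```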
Ohh yeah, thanks. I will add it in.
Hi @mmohamedyaser
Can you make a tutorial on how to add other website sources?
Currently, these are the websites available:
Discudemy
Udemy Freebies
Udemy Coupons
Real Discount
Tricks Info
Free Web Cart
Course Mania
Jojo Coupons
Online Tutorials
For example, I want the script to also scrape the following sites:
couponscorpion.com
100offdeal.online
udemyfreecourses.org
coursesity.com/provider/free/udemy-courses
www.guru99.com/free-udemy-course.htm
udemycoupons.me
www.onlinecourses.ooo
etc.
Any help would be much appreciated.
Thanks for sharing this anyway!
Hi @buggysite
Sure, let's go step by step. Note that some sites use redirects, which makes the links hard to get (the Udemy URL isn't visible for Python to grab; there is a way, but it's not straightforward). I took an example from one of the sites you requested.
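On the redirect point: one pattern worth checking before giving up on a site is that the redirect link often carries the final Udemy URL inside a query parameter, so it can sometimes be recovered without following the redirect at all. A sketch under that assumption; the parameter name murl is only an example, every site names it differently, and some don't expose it at all.

```python
from urllib.parse import urlparse, parse_qs

def extract_embedded_url(redirect_url, param="murl"):
    # If the target URL travels inside a query parameter (a common
    # coupon-site pattern), pull it out directly. parse_qs already
    # percent-decodes the value. Returns None when the param is absent.
    values = parse_qs(urlparse(redirect_url).query).get(param)
    return values[0] if values else None
```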
- First, go to __constants/constants.py and add a new line after TRICKSINF.
- Insert this line ->
ONLINECOURSES = "https://www.onlinecourses.ooo/page/"
- Insert
'onlinecourses.ooo',
into the total_sites array after 'Tricks Info'.
- Insert
5,
into site_range after the fifth element. This is the number of pages the script will go through; it will actually scrape 4 pages (the given number minus 1).
- Next, go to __functions/functions.py; this is where each site's scraper function lives.
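Taken together, the constants.py additions would leave the file looking roughly like this. The placeholder values stand in for lines that already exist in the file; only the three additions come from the steps above.

```python
# Sketch of __constants/constants.py after the edits. Placeholder values
# stand in for lines that already exist in the file.
TRICKSINF = "..."  # existing line; value unchanged
ONLINECOURSES = "https://www.onlinecourses.ooo/page/"  # new, right after TRICKSINF

total_sites = [
    # ... existing entries ...
    'Tricks Info',
    'onlinecourses.ooo',  # new entry, right after 'Tricks Info'
    # ... any remaining entries ...
]

site_range = [
    5, 5, 5, 5, 5,  # existing values for the first five sites (placeholders)
    5,              # new: scrape onlinecourses.ooo pages 1..4 (limit minus 1)
]
```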
- Go to line 170 and paste the code below. Everything is explained in the comments.
Python:
def onlinecourses(page):
    links_ls = []
    head = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
    }  # These headers spoof a regular browser so the site treats you as a normal user ;)
    r = requests.get(ONLINECOURSES + str(page), headers=head, verify=False)  # request the listing page using the constant we created in constants.py
    soup = BeautifulSoup(r.content, 'html.parser')  # parse the page so Python can read the HTML
    course_links = soup.find_all('a', attrs={"rel": "bookmark", "title": True})  # grab all the course links on the page, about 10 of them
    for index, items in enumerate(course_links):  # once grabbing is complete, loop over each link
        title = items.text  # the name of the course
        url2 = items['href']  # the URL of the course page (e.g. https://www.onlinecourses.ooo/coupon/learn-how-to-build-an-ecommerce-website-using-wordpress/)
        r2 = requests.get(url2, headers=head, verify=False)  # request the course page
        sys.stdout.write("\rLOADING URLS: " + animation[index % len(animation)])  # just for animation
        sys.stdout.flush()  # just for animation
        soup1 = BeautifulSoup(r2.content, 'html.parser')  # parse the course page
        link = soup1.find('div', 'link-holder').a['href']  # find the Udemy URL and save it
        links_ls.append(title + '||' + link)  # save each title||link pair; the list is used for signup next
    return links_ls  # return the list to the caller
- All text after # is a comment.
- Now open udemy.py in Visual Studio Code, Notepad++, or any text editor.
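If you want to convince yourself what that find_all call matches before wiring it in, here is a small offline check against a made-up HTML fragment. The snippet is fabricated; only the selector comes from the function above.

```python
from bs4 import BeautifulSoup

# Fabricated listing fragment: only anchors carrying BOTH rel="bookmark"
# and a title attribute should be selected, mirroring the scraper above.
html = """
<a rel="bookmark" title="Course A" href="https://example.com/coupon/a/">Course A</a>
<a rel="bookmark" href="https://example.com/coupon/b/">no title attribute</a>
<a title="Course C" href="https://example.com/c/">no rel attribute</a>
"""
soup = BeautifulSoup(html, "html.parser")
matches = soup.find_all('a', attrs={"rel": "bookmark", "title": True})
print([(m.text, m['href']) for m in matches])
```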
- On line 24, press Enter after the comma.
- Insert
lambda page : onlinecourses(page),
on the new line.
- The next part is optional, for people who want to select courses manually.
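For context, that lambda sits in a list of per-site scraper callables; after the edit the list would look something like this. The list name scrapers and the other entries here are placeholders, not the real names from udemy.py.

```python
# Sketch only: the dispatch list of scrapers with the new entry added.
def tricksinfo(page):      # placeholder for an existing scraper
    return []

def onlinecourses(page):   # placeholder for the new scraper defined earlier
    return []

scrapers = [
    lambda page: tricksinfo(page),
    lambda page: onlinecourses(page),  # the newly inserted line
]
```

Passing lambda page: onlinecourses(page) is equivalent to passing onlinecourses itself; the lambda just keeps every entry in the list the same shape.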
- Find
if site == 'Tricks Info':
and paste the code below after the end of d += 1, making sure the indents are placed properly.
Python:
if site == 'onlinecourses.ooo':
    limit = 8
    print('\n' + fc + sd + '-------' + fm + sb + '>>' + fb + ' Online Courses.ooo ' + fm + sb + '<<' + fc + sd + '-------\n')
    while d <= limit:
        list_st = onlinecourses(d)
        site = process(list_st, d, limit, site_index, cookies, access_token, csrftoken, head)
        d += 1
That should be it. Just run the script the same way as before:
python3 udemy.py -c cookie.txt