AppFiller

Origin

The idea was that I hate filling out job applications, so why not just automate it?

Overview

The application has two parts: spidering and applying. Spidering collects all the jobs that match a query, while applying requires a login and applies to the jobs that were found during spidering.

Step 1: Setup

A key part of the setup, since the program is built on Selenium, is the browser driver. I decided to use Edge because the majority of people use Windows, where it is built in. In the future, making it available for other browsers would be ideal.

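import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.edge.service import Service as EdgeService
from webdriver_manager.microsoft import EdgeChromiumDriverManager

def setup():
	options = webdriver.EdgeOptions()
	# Ignore certificate/SSL problems and only log fatal errors
	options.add_argument("--ignore-certificate-errors")
	options.add_argument("--ignore-ssl-errors")
	options.add_argument("--ignore-certificate-errors-spki-list")
	options.add_argument("--log-level=3")
	return webdriver.Edge(service=EdgeService(EdgeChromiumDriverManager().install()), options=options)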

driver = setup()
driver.maximize_window()
baseurl = "https://indeed.com/" 
driver.get(baseurl)

Step 2: Spidering

Since there are two different apps, I am going to cover the Indeed spider here. It is called a spider because it webs out to the subpages it finds, up to a length of x.

def spider(query):
	# Spaces in the search query are encoded as '+' in the URL
	query = query.replace(' ', '+')
	fullList = getLinks(query, 600)
	# Append, so links from earlier runs are kept
	with open('urlLinks.txt', 'a') as g:
		for q in fullList:
			g.write(q + "\n")

This sends the query to getLinks, which requests the results pages and collects all the links up to the specified limit.

def getLinks(query, limit):
	url6 = "https://indeed.com"
	joboption = "/jobs?q=" + query
	startCount = 0
	fullList = []
	while startCount < limit:
		if startCount == 0:
			driver.get(url6 + joboption)
		else:
			driver.get(url6 + joboption + "&start=" + str(startCount))
		time.sleep(1)
		src = driver.page_source
		newLinks = parseResultPage(src)
		for s in newLinks:
			fullList.append(s.strip("\""))
		startCount += 10
	return fullList
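The paging in getLinks can also be looked at separately from the browser. The following is a minimal sketch, not the code the project uses: it only builds the list of result-page URLs, assuming (as getLinks does) that each page holds 10 results and that later pages take a start offset. The names BASE_URL and build_page_urls are made up for this example.

```python
from urllib.parse import urlencode

# Assumption: same search endpoint that getLinks requests
BASE_URL = "https://indeed.com/jobs"


def build_page_urls(query, limit, page_size=10):
    """Return one search URL per results page, covering `limit` results."""
    urls = []
    for start in range(0, limit, page_size):
        params = {"q": query}
        if start > 0:
            # The first page carries no start offset, mirroring getLinks
            params["start"] = start
        # urlencode turns spaces into '+', so the query is passed in raw
        urls.append(BASE_URL + "?" + urlencode(params))
    return urls
```

Because urlencode already encodes spaces as '+', the query should be passed in as typed rather than after the replace(' ', '+') step in spider. With a limit of 600 this yields 60 URLs, one per driver.get call.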

Finally, I parse the links out of each results page. They are written to a file, along with the name of the job, for later review.

def parseResultPage(src):
	initInd1 = src.index("<td id=\"resultsCol\">")
	src = src[initInd1::]
	initInd2 = src.index('mosaic-zone')
	newSrc = src[initInd2::]
	fullList = []
	while 'href' in newSrc:
		hInd = newSrc.index('href')
		newSrc = newSrc[hInd::]
		hEnd = newSrc.index('>')
		hLink = newSrc[5:hEnd]
		# Default is past the cutoff below, so links without 'company' are skipped
		companyInd = 100
		try:
			companyInd = newSrc.index('company')
		except ValueError:
			pass
		if companyInd < 10:
			if " " in hLink:
				hLink = hLink.split(" ")[0]
			fullList.append(hLink)
		newSrc = newSrc[hEnd::]
	return fullList
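String indexing like this works, but it breaks easily when the page markup shifts. As a hedged alternative, here is a small sketch using Python's built-in html.parser. It rests on my reading of parseResultPage, namely that it keeps links whose href starts with /company; the class and function names are invented for the example, and the real page structure may differ.

```python
from html.parser import HTMLParser


class CompanyLinkParser(HTMLParser):
    """Collect href values from <a> tags whose path starts with /company."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: job links are the ones that begin with /company
        if href and href.startswith("/company"):
            self.links.append(href)


def parse_links(src):
    """Return the matching links found in a page source string."""
    parser = CompanyLinkParser()
    parser.feed(src)
    return parser.links
```

The parser hands back attribute values without the surrounding quotes, so the strip("\"") step in getLinks would not be needed with this approach.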

Conclusion

With this, we can use specific queries to find jobs that fit what we are looking for. A potential upgrade is to scan parts of the posting to narrow down whether it is a good fit, based on information known about the applicant.
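To make that upgrade idea concrete, here is one very rough sketch of how a posting could be scored against keywords. None of this exists in the project; score_posting and its parameters are hypothetical, and it only handles single-word keywords.

```python
import re


def score_posting(text, wanted, unwanted=()):
    """Score a posting: +1 per wanted keyword present, -1 per unwanted one."""
    # Split into lowercase tokens; '+' and '#' are kept for names like c++ or c#
    words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    score = sum(1 for k in wanted if k.lower() in words)
    score -= sum(1 for k in unwanted if k.lower() in words)
    return score
```

A posting would then only be passed on to the applying step if its score clears some threshold that the user picks.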

Step 3: Applying

Now that we have all the links, it is time to try applying. If the user isn't logged in to their account, a popup keeps appearing and the program cannot apply to anything; it just keeps landing on a page without getting any further.

def apply(driver , url):
	driver.get(url)
	time.sleep(3)
	try:
		driver.find_element(By.ID, 'indeedApplyButton').click()
	except Exception:
		# No apply button (or it could not be clicked), so skip this posting
		print("Page is not valid")
		return "NOPE"
	steps = 0
	while steps < 7:
		nexts = driver.find_elements(By.TAG_NAME, 'button')
		for n in nexts:
			if n.accessible_name == 'Continue':
				n.click()
				break
			if n.accessible_name == 'Submit your application':
				n.click()
				return "FINISHED"
		time.sleep(2)
		steps += 1
	return "MISSING DATA"

The sleeps are there to give the page time to load, since we don't want to look for objects that have not loaded yet, or because the page itself has a set time delay. Another issue was that leaving an application unfinished triggers a popup, which I had to work around by opening a new tab and closing the old one.
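Fixed sleeps either wait too long or not long enough. A common alternative is to poll until the thing you need shows up. Selenium ships WebDriverWait for this; the sketch below shows the same idea in plain Python so it can be read without a browser. The name wait_until and its defaults are assumptions for illustration, not part of the project.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns a truthy value or timeout elapses.

    Returns the truthy value, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

In apply, something like wait_until(lambda: driver.find_elements(By.ID, 'indeedApplyButton')) could stand in for time.sleep(3), since find_elements returns an empty (falsy) list until the button exists.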

driver.execute_script("window.open('');")
driver.close()
driver.switch_to.window(driver.window_handles[0])
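The write-up does not show how the saved links get fed back into apply, so the following is only a guess at what such a loop could look like. It reads the link file, drops duplicates (spider appends on every run, so repeats are likely), and counts the status strings that apply returns. Both function names are hypothetical.

```python
from collections import Counter


def load_unique_links(path):
    """Read one link per line, dropping blanks and duplicates, keeping order."""
    with open(path) as f:
        links = [line.strip() for line in f if line.strip()]
    return list(dict.fromkeys(links))


def apply_all(links, apply_fn):
    """Call apply_fn(link) for each link and count the returned statuses."""
    results = Counter()
    for link in links:
        # apply_fn is assumed to return a status string such as "FINISHED"
        results[apply_fn(link)] += 1
    return results
```

With the functions from this page it would be called roughly as apply_all(load_unique_links('urlLinks.txt'), lambda url: apply(driver, url)), with the tab reset above run between applications.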

Conclusion

It is a bare-minimum application, but it gets done what I needed. I plan on expanding on this idea, but for now this is all I need.

Improvements

I would love for the application to handle company websites as well, because easy apply is just the starting step for a lot of postings.


