Link Grabber

Link Grabber provides a quick and easy way to grab links from a single web page. This Python package is a simple wrapper around BeautifulSoup that focuses on HTML's hyperlink tag, <a>.
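To make "grabbing the <a> tag" concrete, here is a minimal sketch that collects links from an HTML string. It is not Link Grabber's own code: it swaps BeautifulSoup for the standard library's html.parser so that it runs without extra packages, and the names AnchorCollector and grab_links are made up for this illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect every <a> tag that has an href as a dict of its attributes plus its text."""

    def __init__(self):
        super().__init__()
        self.links = []        # finished link dicts, in document order
        self._current = None   # the <a> tag currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            if "href" in attributes:
                # Copy the attributes and add an empty "text" entry to fill in.
                self._current = dict(attributes, text="")

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


def grab_links(html):
    """Hypothetical helper: return the list of link dicts found in `html`."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.links


sample = '<p><a href="/a" class="nav">First</a> and <a href="/b">Second</a></p>'
for link in grab_links(sample):
    print(link)
```

Each link comes back as a plain dictionary, which is the same general shape as the output shown further down in this post.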

Documentation

http://linkgrabber.neurosnap.net/

find

Parameters:

filters (dict): Beautiful Soup's filters as a dictionary
limit (int): Maximum number of links to return, taken in sequential order
reverse (bool): Reverses the order of the list of <a> tags
sort (function): A key function that returns the value of each link to sort on
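The way limit, reverse and sort interact can be sketched on a plain list of link dictionaries. This is only an illustration under assumptions: apply_options is a hypothetical helper, and the order of the steps (reverse, then limit, then sort) is a guess, not something taken from the package.

```python
def apply_options(links, limit=None, reverse=False, sort=None):
    """Hypothetical helper: reorder and truncate a list of link dicts."""
    result = list(links)              # work on a copy, leave the input alone
    if reverse:
        result.reverse()              # start from the last link on the page
    if limit is not None:
        result = result[:limit]       # keep the first `limit` links in sequence
    if sort is not None:
        result = sorted(result, key=sort)  # `sort` picks the value to order by
    return result


page_links = [
    {"href": "/c", "text": "Cherry"},
    {"href": "/a", "text": "Apple"},
    {"href": "/b", "text": "Banana"},
]

# Take the first two links in page order, then sort them by their text.
print(apply_options(page_links, limit=2, sort=lambda link: link["text"]))

# Take only the last link on the page.
print(apply_options(page_links, limit=1, reverse=True))
```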

Sort by a link's attribute:

from linkGrabber import Links

links = Links("http://www.google.com")
links.find(limit=3, sort=lambda key: key['text'])

Exclude text:

import re

from linkGrabber import Links

links = Links("http://www.google.com")
links.find(exclude=[{ "text": re.compile("Read More") }])
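What an exclude rule such as { "text": re.compile("Read More") } does can be tried offline on a hand-written list of links. The sketch below is an assumption about the behaviour, not the package's code: exclude_links is a hypothetical helper that drops a link when any rule matches one of its values, either by regular expression or by equality.

```python
import re


def exclude_links(links, exclude):
    """Hypothetical helper: drop every link matched by at least one exclusion rule."""

    def rule_matches(rule, link):
        for key, pattern in rule.items():
            value = link.get(key)
            if value is None:
                continue  # the link has no such attribute, nothing to match
            if hasattr(pattern, "search"):
                # Compiled regular expression: search inside string values only.
                hit = isinstance(value, str) and pattern.search(value) is not None
            else:
                # Anything else is compared for equality.
                hit = pattern == value
            if hit:
                return True
        return False

    return [
        link for link in links
        if not any(rule_matches(rule, link) for rule in exclude)
    ]


page_links = [
    {"href": "/post/1", "text": "Read More"},
    {"href": "/contact", "text": "Contact"},
    {"href": "/post/2", "text": "Read More about SEO"},
]

print(exclude_links(page_links, [{"text": re.compile("Read More")}]))
```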

Remove duplicate URLs and make the output pretty:

from linkGrabber import Links

links = Links("http://www.google.com")
links.find(duplicates=False, pretty=True)
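The effect of duplicates=False and pretty=True can be approximated without a network connection. The sketch below assumes that a duplicate means "same href as an earlier link" and that pretty output is produced with the standard pprint module; both are assumptions, and drop_duplicate_hrefs is a hypothetical name.

```python
from pprint import pformat


def drop_duplicate_hrefs(links):
    """Hypothetical helper: keep only the first link seen for each href."""
    seen = set()
    unique = []
    for link in links:
        href = link.get("href")
        if href in seen:
            continue          # an earlier link already points at this URL
        seen.add(href)
        unique.append(link)
    return unique


page_links = [
    {"href": "/about", "text": "About"},
    {"href": "/about", "text": "About us"},
    {"href": "/terms", "text": "Terms"},
]

unique_links = drop_duplicate_hrefs(page_links)
print(pformat(unique_links, indent=2))
```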

The code works, provided the website can be reached. Sample output:

[ { 'class': ['gb1'],
'href': 'http://www.google.lt/imghp?hl=lt&tab=wi',
u'seo': 'imghp?hl=lt&tab=wi',
u'text': u'Vaizdai'},
{ 'class': ['gb1'],
'href': 'http://maps.google.lt/maps?hl=lt&tab=wl',
u'seo': 'maps?hl=lt&tab=wl',
u'text': u'\u017dem\u0117lapiai'},
----------------------------------------------------------------------------------
----------------------------------------------------------------------------------
{ 'href': '/intl/lt/policies/privacy/', u'seo': '', u'text': u'Privatumas'},
{ 'href': '/intl/lt/policies/terms/', u'seo': '', u'text': u'S\u0105lygos'}]
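The output above mixes absolute and relative href values. If absolute URLs are needed, one possible post-processing step is the standard library's urljoin. This is a sketch outside of Link Grabber itself: absolutize is a hypothetical helper, and the base URL http://www.google.lt/ is assumed from the absolute links in the output.

```python
from urllib.parse import urljoin


def absolutize(links, base_url):
    """Hypothetical helper: return copies of the links with absolute href values."""
    return [dict(link, href=urljoin(base_url, link["href"])) for link in links]


grabbed = [
    {"href": "http://maps.google.lt/maps?hl=lt&tab=wl", "text": "\u017dem\u0117lapiai"},
    {"href": "/intl/lt/policies/privacy/", "text": "Privatumas"},
]

for link in absolutize(grabbed, "http://www.google.lt/"):
    print(link["href"])
```

Links that are already absolute pass through unchanged, so the helper can be applied to the whole list.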



Friday, October 27, 2017

PyPI project for SEO LinkGrabber
