Friday, October 27, 2017

PyPI project for SEO: LinkGrabber

Link Grabber

Link Grabber provides a quick and easy way to grab links from a single web page. This Python package is a simple wrapper around BeautifulSoup that focuses on grabbing HTML's hyperlink tag, <a>.
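To give a feel for what grabbing <a> tags involves, here is a minimal sketch that uses only Python's standard-library html.parser instead of BeautifulSoup. The class name AnchorCollector and the sample markup are invented for illustration; this is not how linkGrabber itself is implemented.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects each <a> tag's attributes and text (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []       # one dict per <a> tag, in document order
        self._current = None  # the <a> tag currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current = dict(attrs)
            self._current["text"] = ""

    def handle_data(self, data):
        # Only text that appears inside an open <a> tag is recorded.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


# Sample markup, invented for this example.
sample_html = '<p><a href="/maps" class="gb1">Maps</a> and <a href="/images">Images</a></p>'

collector = AnchorCollector()
collector.feed(sample_html)
collector.close()
grabbed = collector.links
print(grabbed)
```

Each link ends up as a dictionary of its attributes plus its text, which is roughly the shape of the output shown at the end of this post.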
Documentation

find

Parameters:
  • filters (dict): Beautiful Soup's filters as a dictionary
  • limit (int): Maximum number of links to return, taken in sequential order
  • reverse (bool): Reverses the order in which the list of <a> tags is sorted
  • sort (function): A key function that picks which link attribute to sort on within the List class
Sort by a link's attribute:
from linkGrabber import Links

links = Links("http://www.google.com")
links.find(limit=3, sort=lambda key: key['text'])
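The effect of limit, sort, and reverse can be approximated on a plain list of dictionaries. This is only a sketch: the sample data and the helper name arrange are invented, and applying the limit before the sort is an assumption based on the "sequential order" wording above, not something confirmed from the library's source.

```python
# Invented sample data shaped like the dictionaries in the output below.
sample_links = [
    {"href": "/news", "text": "News"},
    {"href": "/maps", "text": "Maps"},
    {"href": "/images", "text": "Images"},
    {"href": "/video", "text": "Video"},
]


def arrange(links, limit=None, sort=None, reverse=False):
    """Hypothetical helper mimicking the limit, sort and reverse parameters."""
    result = list(links)
    if limit is not None:
        result = result[:limit]            # assumption: limit applies in page order
    if sort is not None:
        result = sorted(result, key=sort)  # sort receives one link dict at a time
    if reverse:
        result.reverse()
    return result


top_three = arrange(sample_links, limit=3, sort=lambda key: key["text"])
print([link["text"] for link in top_three])
```

The lambda receives a whole link dictionary, which is why key['text'] works as a sort key in the example above.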
Exclude text:
import re

from linkGrabber import Links

links = Links("http://www.google.com")
links.find(exclude=[{ "text": re.compile("Read More") }])
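As a rough picture of what an exclusion rule does, the sketch below drops every link whose attribute matches a compiled pattern. The helper apply_exclude and the sample data are invented, and matching with search (anywhere in the text) rather than a full match is an assumption.

```python
import re

# Invented sample data for illustration.
sample_links = [
    {"href": "/story-1", "text": "Read More"},
    {"href": "/about", "text": "About us"},
    {"href": "/story-2", "text": "Read More about pricing"},
]


def apply_exclude(links, exclude):
    """Hypothetical helper: drop links that match any exclusion rule."""
    kept = []
    for link in links:
        matched = any(
            pattern.search(link.get(attr, ""))
            for rule in exclude
            for attr, pattern in rule.items()
        )
        if not matched:
            kept.append(link)
    return kept


remaining = apply_exclude(sample_links, [{"text": re.compile("Read More")}])
print(remaining)
```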
Remove duplicate URLs and make the output pretty:
from linkGrabber import Links

links = Links("http://www.google.com")
links.find(duplicates=False, pretty=True)
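Removing duplicates and pretty-printing can be sketched with the standard library alone. The helper drop_duplicate_hrefs and the sample data are invented; keeping the first link seen for each href is an assumption, and pprint with indent=4 is used here only because it produces a layout similar to the output shown in this post.

```python
from pprint import pformat

# Invented sample data with one repeated URL.
sample_links = [
    {"href": "/maps", "text": "Maps"},
    {"href": "/images", "text": "Images"},
    {"href": "/maps", "text": "Maps (footer)"},
]


def drop_duplicate_hrefs(links):
    """Hypothetical helper: keep the first link seen for each href."""
    seen = set()
    unique = []
    for link in links:
        href = link.get("href")
        if href not in seen:
            seen.add(href)
            unique.append(link)
    return unique


unique_links = drop_duplicate_hrefs(sample_links)
print(pformat(unique_links, indent=4))
```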

The code works, though the result depends on the connection to the website. Example output:

[   {   'class': ['gb1'],
        'href': 'http://www.google.lt/imghp?hl=lt&tab=wi',
        u'seo': 'imghp?hl=lt&tab=wi',
        u'text': u'Vaizdai'},
    {   'class': ['gb1'],
        'href': 'http://maps.google.lt/maps?hl=lt&tab=wl',
        u'seo': 'maps?hl=lt&tab=wl',
        u'text': u'\u017dem\u0117lapiai'},
----------------------------------------------------------------------------------
----------------------------------------------------------------------------------
    {   'href': '/intl/lt/policies/privacy/', u'seo': '', u'text': u'Privatumas'},
    {   'href': '/intl/lt/policies/terms/', u'seo': '', u'text': u'S\u0105lygos'}]


