blot.summarize
-
class blot.summarize.Summary(url, article_html, title, summaries)
Bases: object
-
blot.summarize.compare_sents(sent1, sent2)
Compare two word-tokenized sentences for shared words.
-
blot.summarize.compare_sents_bounded(sent1, sent2)
Like compare_sents, but returns 0 when the result is not between LOWER_BOUND and UPPER_BOUND, so that outliers do not skew the sum.
-
blot.summarize.compute_score(sent, sents)
Compute the average score of sent against the other sentences in sents. Comparing sent with itself yields 1, which is above UPPER_BOUND, so that comparison is not counted.
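The scoring chain described above (compare_sents, compare_sents_bounded, compute_score) can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the overlap metric, the two bound values, and the choice to average over all of sents are assumptions made here, since the reference does not state them.

```python
# Hypothetical cutoffs: the real LOWER_BOUND and UPPER_BOUND values
# are not given in this reference.
LOWER_BOUND = 0.1
UPPER_BOUND = 0.9


def compare_sents(sent1, sent2):
    """Assumed metric: shared distinct words over the mean distinct-word count.

    Comparing a sentence with itself gives 1.0, matching the note above.
    """
    words1, words2 = set(sent1), set(sent2)
    if not words1 or not words2:
        return 0.0
    shared = len(words1 & words2)
    return shared / ((len(words1) + len(words2)) / 2.0)


def compare_sents_bounded(sent1, sent2):
    """Return the comparison score, or 0 if it lies outside the bounds."""
    score = compare_sents(sent1, sent2)
    return score if LOWER_BOUND <= score <= UPPER_BOUND else 0


def compute_score(sent, sents):
    """Average bounded score of sent against every sentence in sents.

    The self-comparison scores 1.0, exceeds UPPER_BOUND, and so adds 0.
    Dividing by len(sents) is an assumption of this sketch.
    """
    if not sents:
        return 0.0
    total = sum(compare_sents_bounded(sent, other) for other in sents)
    return total / float(len(sents))
```

With these assumed bounds, a sentence that shares two of three words with one neighbour and nothing with another receives a non-zero score only from the first neighbour.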
-
blot.summarize.find_likely_body(b)
Find the tag with the most direct <p> children.
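The heuristic can be illustrated with a short sketch. The type of the b argument is not documented here, so this example substitutes the standard library's xml.etree.ElementTree on a small well-formed document. That substitution, the tie-breaking rule, and the sample markup are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def find_likely_body(root):
    """Return the element with the most direct <p> children.

    Ties go to the first element in document order (an assumption).
    Returns None when no element has any <p> child.
    """
    best, best_count = None, 0
    for element in root.iter():
        # Iterating an element yields only its direct children.
        count = sum(1 for child in element if child.tag == "p")
        if count > best_count:
            best, best_count = element, count
    return best


# Hypothetical sample page: a navigation block and an article block.
doc = ET.fromstring(
    "<html><body>"
    "<div id='nav'><p>Home</p></div>"
    "<div id='article'><p>One.</p><p>Two.</p><p>Three.</p></div>"
    "</body></html>"
)
likely = find_likely_body(doc)
```

Here likely is the div with id "article", because it has three direct <p> children against one for the navigation block.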
-
blot.summarize.is_unimportant(word)
Decide whether a word can be discarded before sentences are compared.
-
blot.summarize.only_important(sent)
Thin wrapper that filters sent using is_unimportant.
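As a rough sketch of how this pair might fit together: the reference does not say which words count as unimportant, so the stop-word list and the punctuation rule below are hypothetical stand-ins rather than the module's actual criteria.

```python
import string

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


def is_unimportant(word):
    """Treat stop words and punctuation-only tokens as unimportant."""
    lowered = word.lower()
    is_punctuation = all(ch in string.punctuation for ch in lowered)
    return is_punctuation or lowered in STOP_WORDS


def only_important(sent):
    """Keep only the words of a tokenized sentence that pass the filter."""
    return [word for word in sent if not is_unimportant(word)]
```

Filtering before comparison keeps very common words from inflating the overlap between otherwise unrelated sentences.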
-
blot.summarize.summarize_block(block)
Return the sentence that best summarizes block.
-
blot.summarize.summarize_blocks(blocks)
-
blot.summarize.summarize_page(url)
-
blot.summarize.summarize_text(text, block_sep='\n\n', url=None, title=None)
-
blot.summarize.u(s)
Ensure the string is unicode regardless of Python version, since Python 3 versions before 3.3 do not support the u"..." prefix.
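A helper of this kind is often written along the following lines. This is a sketch under assumptions, not the module's source: in particular, decoding byte strings as UTF-8 is a choice made here for illustration.

```python
import sys


def u(s):
    """Return s as a text (unicode) string on Python 2 and Python 3.

    Byte strings are decoded as UTF-8, which is an assumption of this sketch.
    """
    if sys.version_info[0] >= 3:
        return s.decode("utf-8") if isinstance(s, bytes) else s
    # Python 2 only: the name `unicode` does not exist on Python 3,
    # and this branch is never reached there.
    if isinstance(s, unicode):  # noqa: F821
        return s
    return s.decode("utf-8")
```

Calling u("text") in place of writing a u"text" literal keeps the source valid on interpreters that reject the prefix.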