miniflux-v2

Commit Graph

Author	SHA1	Message	Date
Corey McCaffrey	0683074b8b	Added scraper rule for TheOatmeal.com. The default rule does not show the comic posted to the feed. The comic image is in a div with id "comic".	2020-05-13 21:28:00 -07:00
Corey McCaffrey	8f6c07afd6	Added scraper rule for RayWenderlich.com. RayWenderlich.com is a popular developer community for iOS and Android developers. The default rule results in "GROUP GROUP GROUP GROUP…" instead of the content posted on the blog.	2020-05-13 21:28:00 -07:00
Andrew Williams	9974e0f458	Addition of scraper rule for wdwnt.com. By default, fetching original content for wdwnt.com results in a snippet of the comments section; this rule captures the article content.	2020-02-28 20:24:58 -08:00
cinput	8e1ed8bef3	Return outer HTML when scraping elements	2019-12-21 21:18:31 -08:00
somini	30f22fbd78	Update scraper rule for "Le Monde"	2019-12-19 18:35:29 -08:00
Neo Ng	90064a8cf0	Update scraper rule for openingsource.org	2019-11-28 19:40:26 -08:00
Tom Matthews	8b40778ee1	Add BBC News scraping rule	2018-12-13 20:25:30 -08:00
Frédéric Guillot	6f5d93cbbe	Update scraper rule for lemonde.fr	2018-12-02 20:53:22 -08:00
Frédéric Guillot	311a133ab8	Refactor manual entry scraper	2018-12-02 20:51:06 -08:00
mapl	e47188eab2	Update scraper rule for heise.de	2018-12-01 11:49:30 -08:00
Frédéric Guillot	3b6e44c331	Allow the scraper to parse XHTML documents. Only "text/html" was authorized before.	2018-11-03 13:44:13 -07:00
Frédéric Guillot	5870f04260	Simplify feed parser and format detection: avoid doing multiple buffer copies; move parser and format detection logic to its own package.	2018-10-14 11:46:41 -07:00
Frédéric Guillot	9dc38a0803	Add missing package descriptions for GoDoc	2018-10-08 17:32:17 -07:00
Patrick	2538eea177	Add the possibility to override default user agent for each feed	2018-09-19 18:19:24 -07:00
Frédéric Guillot	df2bebaf3d	Update scraper rule for heise.de	2018-08-25 10:33:18 -07:00
Frédéric Guillot	dbcc5d8a97	Use canonical imports	2018-08-24 21:56:39 -07:00
Frédéric Guillot	1eba1730d1	Move HTTP client to its own package	2018-04-28 10:51:07 -07:00
aniran	322b265d7a	Scrape parent element for iframe. Current behavior: if you have an `iframe` scraper rule, `scrapContent` tries to return the inner HTML of the `iframe`, which turns up blank. New behavior: like `img` elements, if an `iframe` is matched by a scraper rule, the parent element's inner HTML (i.e. the `iframe`) is returned.	2018-04-27 17:57:22 -07:00
Frédéric Guillot	1d7fe892e1	Add scraper rule for darkreading.com	2018-01-06 13:25:12 -08:00
Frédéric Guillot	48aa0d07ef	Add more scraper rules	2018-01-04 19:32:24 -08:00
Frédéric Guillot	3c3f397bf5	Make sure the scraper parses only HTML documents	2018-01-02 18:32:01 -08:00
Frédéric Guillot	c454f67037	Add scraper rules for version2.dk and ing.dk	2017-12-27 19:44:23 -08:00
Frédéric Guillot	d4839b5597	Add more scraper rules	2017-12-27 13:36:07 -08:00
Frédéric Guillot	1d8193b892	Add logger	2017-12-15 18:55:57 -08:00
Frédéric Guillot	c6d9eb3614	Improve content scraper	2017-12-13 21:30:40 -08:00
Frédéric Guillot	84d912c979	Rewrite imports	2017-12-12 21:48:13 -08:00
Frédéric Guillot	ef097f02fe	Add the possibility to enable crawler for feeds	2017-12-12 19:19:36 -08:00
Frédéric Guillot	87ccad5c7f	Add scraper rules	2017-12-10 20:51:04 -08:00
Frédéric Guillot	7a35c58f53	Add readability package to fetch original content	2017-12-10 19:01:38 -08:00

29 Commits