Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications. https://www.fivefilters.org/full-text-rss/
2026-06-16 17:28:18 +02:00
.about.com.txt
.allthingsd.com.txt
.asahi.com.txt add .asahi.com.txt and biz-journal.jp.txt (#1911) 2026-03-04 15:00:28 +01:00
.blog.163.com.txt Update .blog.163.com.txt 2017-02-09 23:12:58 +01:00
.blog.hu.txt Create .blog.hu.txt (#1465) 2024-10-28 01:05:57 +01:00
.blogs.nytimes.com.txt
.blogspot.com.txt Update .blogspot.com.txt 2023-10-09 19:57:07 +02:00
.businessinsider.com.txt
.cnet.com.txt
.craigslist.org.txt
.ctv.ca.txt
.denfaminicogamer.jp.txt add .denfaminicogamer.jp.txt (#1827) 2025-12-20 18:59:32 +01:00
.dreamwidth.org.txt
.dxy.cn.txt
.elpais.com.txt Update elpais.com (#1007) 2022-11-02 09:59:04 +01:00
.etc.se.txt
.ew.com.txt
.fivefilters.org.txt
.fok.nl.txt
.gitattributes
.gitignore
.globo.com.txt Globo.com (#1494) 2024-11-16 08:52:26 +01:00
.hardware.info.txt
.ietf.org.txt Update .ietf.org.txt 2023-11-08 08:21:27 +01:00
.ifeng.com.txt
.ihned.cz.txt
.itmedia.co.jp.txt feat: Add configuration files for .itmedia.co.jp and atmarkit.itmedia.co.jp (#1797) 2025-12-02 19:22:14 +01:00
.lingolia.com.txt Update .lingolia.com.txt (#1624) 2025-05-25 09:32:52 +02:00
.livejournal.com.txt Initial commit 2013-02-27 23:43:10 +01:00
.m.wikihow.com.txt
.medium.com.txt Medium.com (#1169) 2023-07-25 06:45:51 +02:00
.metafilter.com.txt
.mitpress.mit.edu.txt Create .mitpress.mit.edu.txt 2022-07-14 21:41:16 -04:00
.mozilla.org.txt
.nasa.gov.txt Create .nasa.gov.txt (#1462) 2024-10-27 17:49:35 +01:00
.nytimes.com.txt Fix fetching nytimes.com articles (#1000) 2022-10-14 22:19:17 +02:00
.onliner.by.txt
.orf.at.txt
.over-blog.com.txt Create .over-blog.com.txt (#1682) 2025-06-24 23:45:16 +02:00
.philhist.unibas.ch.txt Rename .unibas.ch.txt to .philhist.unibas.ch.txt 2022-01-25 22:10:25 +01:00
.playblackdesert.com.txt Create .playblackdesert.com.txt 2021-03-03 18:04:17 +01:00
.quora.com.txt Update .quora.com.txt 2020-11-15 20:43:43 +01:00
.readthedocs.io.txt
.redbullmusicacademy.com.txt Add redbullmusicacademy.com config (#1022) 2023-01-02 07:02:59 +01:00
.repubblica.it.txt
.robweychert.com.txt add robweychert.com (#1097) 2023-06-19 09:05:59 +02:00
.rollingstone.com.txt Create .rollingstone.com.txt (#1775) 2025-10-11 19:50:17 +02:00
.schwab.com.txt
.signal-arnaques.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
.simonwillison.net.txt Update .simonwillison.net.txt (#1265) 2023-12-10 11:22:50 +01:00
.slashdot.org.txt slashdot: replace i tags with blockquote (#929) 2022-02-15 07:16:41 +01:00
.smashingmagazine.com.txt Improve smashingmagazine (#302) 2017-06-14 13:28:00 +02:00
.sodexo.com.txt
.sputniknews.com.txt
.stackexchange.com.txt Fix extracting body on stackexchange sites (#1785) 2025-11-09 13:55:42 +01:00
.stanford.edu.txt
.statista.com.txt Create statista.com / es.statista.com (#852) 2021-01-20 13:51:38 +01:00
.substack.com.txt Subst2 (#1713) 2025-07-15 10:07:06 +02:00
.theinventory.com.txt Create .theinventory.com.txt 2021-01-01 17:04:37 +01:00
.theonion.com.txt
.theplayerstribune.com.txt Create .theplayerstribune.com.txt (#1408) 2024-07-26 15:46:36 +02:00
.time.com.txt
.tvbs.com.tw.txt Update .tvbs.com.tw.txt (#1337) 2024-02-16 01:54:30 +01:00
.tweakblogs.net.txt
.usinenouvelle.com.txt rename .usinenouvelle.com to .txt 2015-04-05 22:19:11 +02:00
.vanityfair.com.txt Update .vanityfair.com.txt (#1415) 2024-08-01 21:37:11 +02:00
.visualcapitalist.com.txt Create .visualcapitalist.com.txt (#1115) 2023-06-26 06:33:38 +02:00
.watch.impress.co.jp.txt add .watch.impress.co.jp.txt and fujinkoron.jp.txt (#1917) 2026-03-20 14:43:02 +01:00
.watson.de.txt Update .watson.de.txt 2024-07-25 15:04:02 +02:00
.wikihow.com.txt
.wikimedia.org.txt
.wikipedia.org.txt remove redundant math so only image is kept. (#979) 2022-06-13 06:08:11 +02:00
.wired.com.txt Create .wired.com.txt (#1569) 2025-03-12 14:46:41 +01:00
.wordpress.com.txt chore: update wordpress (#1813) 2026-02-06 17:27:23 +01:00
.wp.pl.txt Update .wp.pl.txt (#1039) 2023-01-27 13:55:45 +01:00
.wyborcza.biz.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
.wyborcza.pl.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
.yahoo.com.txt Changed 3 Yahoo configs (#1400) 2024-07-07 10:12:10 +02:00
01net.com.txt Create 01net.com.txt (#1586) 2025-04-15 08:01:16 +02:00
3quarksdaily.com.txt
3voor12.vpro.nl.txt Initial commit 2013-02-27 23:43:10 +01:00
5by5.tv.txt
7newsbelize.com.txt 7newsbelize.com 2013-05-31 22:16:45 +02:00
8e-etage.fr.txt
9gag.com.txt
9to5google.com.txt Create 9to5google.com.txt (#1322) 2024-01-27 06:37:55 +01:00
9to5mac.com.txt Create 9to5mac.com.txt (#288) 2017-04-15 13:48:57 +02:00
16personalities.com.txt chore: add working rule without JS for 16personalities (#1883) 2026-02-26 17:17:49 +01:00
20min.ch.txt
20minutes.fr.txt Update 20minutes.fr.txt 2024-05-27 09:26:21 +02:00
24.ae.txt Create 24.ae.txt 2014-12-10 01:14:11 +01:00
24a11y.com.txt Added 24a11y.com.txt (#583) 2018-12-21 15:40:49 +01:00
24auto.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
24garten.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
24hamburg.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
24joursdeweb.fr.txt fix: add title and body to 24joursdeweb (#1808) 2025-12-07 23:28:27 +01:00
24rhein.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
24vita.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
24ways.org.txt
36kr.com.txt
37signals.com.txt
43folders.com.txt
404media.co.txt Update 404media.co.txt 2025-06-25 17:43:22 -04:00
500px.com.txt
512pixels.net.txt Update 512pixels.net.txt 2014-10-14 11:49:02 +02:00
a.tldrnewsletter.com.txt add support for a.tldrnewsletter.com (#1478) 2024-11-03 14:46:42 +01:00
a11ywithlindsey.com.txt
aachener-nachrichten.de.txt
aarp.org.txt Update aarp.org.txt 2021-05-12 16:40:02 +02:00
abc-luxe.com.txt
abc.es.txt updates 2013-03-22 15:38:36 +01:00
abc.net.au.txt Update abc.net.au.txt (#1181) 2023-08-13 23:15:31 +02:00
abcnews.go.com.txt Initial commit 2013-02-27 23:43:10 +01:00
abendblatt.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
abendzeitung-muenchen.de.txt Create abendzeitung-muenchen.de.txt (#1185) 2023-08-20 12:55:29 +02:00
abplive.com.txt Create abplive.com.txt 2024-05-02 10:36:18 +02:00
absolument-tout.net.txt Create absolument-tout.net.txt (#1757) 2025-09-06 09:35:09 +02:00
academic.oup.com.txt Shtrom 2023 03 (#1059) 2023-03-08 12:42:57 +01:00
academiedugout.fr.txt
accaglobal.com.txt Create accaglobal.com.txt 2022-06-13 00:36:28 +02:00
access.redhat.com.txt fix: body rule for redhat.com (#1897) 2026-02-26 17:58:46 +01:00
accesstoinsight.org.txt
achgut.com.txt add config for achgut.com (#984) 2022-07-18 06:55:23 +02:00
acidcow.com.txt
aclu.org.txt
acroswing.fr.txt
actualitte.com.txt
ad.nl.txt
addendum.org.txt
adfc-nrw.de.txt
adme.ru.txt Backport site_config changes from wallabag v1 2015-12-31 18:13:20 +01:00
adslzone.net.txt Update adslzone.net (#1005) 2022-11-02 09:58:34 +01:00
aei.org.txt
aeon.co.txt Update stripping rules in aeon.co.txt (#1792) 2025-11-24 08:49:33 +01:00
aerobuzz.fr.txt Add aerobuzz.fr.txt (#883) 2021-05-14 00:46:52 +02:00
afr.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
africaintelligence.fr.txt
aftenposten.no.txt
aftonbladet.se.txt
agirpourlatransition.ademe.fr.txt Create agirpourlatransition.ademe.fr.txt (#1298) 2024-01-07 14:39:28 +01:00
aht.seriouseats.com.txt Initial commit 2013-02-27 23:43:10 +01:00
aif.ru.txt Update and add some configs (#835) 2020-12-28 11:05:02 +01:00
aitnews.com.txt
akweb.de.txt Add configuration for ak - analyse & kritik (#834) 2020-12-17 20:15:24 +01:00
al-monitor.com.txt Create al-monitor.com.txt (#1247) 2023-11-14 22:53:36 +01:00
albayan.ae.txt
alberta.ca.txt Create alberta.ca.txt 2020-11-28 12:57:43 +01:00
alex.mullr.net.txt
alexduner.com.txt
alexmurrell.co.uk.txt Create alexmurrell.co.uk.txt (#1072) 2023-03-30 13:40:12 +02:00
alexwlchan.net.txt Create alexwlchan.net.txt (#1641) 2025-06-03 12:56:43 +02:00
alicewalkersgarden.com.txt Create alicewalkersgarden.com.txt (#1359) 2024-04-08 13:55:06 +02:00
aligneddev.net.txt Create aligneddev.net.txt (#1659) 2025-06-13 16:25:17 +02:00
alimentation-generale.fr.txt
alistapart.com.txt Update alistapart.com.txt (#863) 2021-03-06 23:48:49 +01:00
aljazeera.com.txt
allafrica.com.txt
allgemeine-zeitung.de.txt Update allgemeine-zeitung.de.txt 2023-10-15 14:59:48 +02:00
allphly.com.txt Create allphly.com.txt (#1209) 2023-09-25 05:44:03 +02:00
allrecipes.com.txt
allthingsd.com.txt
allyou.com.txt
alphabeta.argaam.com.txt
alriyadh.com.txt
alsacreations.com.txt
alseraj.net.txt
altaonline.com.txt Create altaonline.com.txt 2022-07-14 21:10:36 -04:00
alternatives-economiques.fr.txt Update alternatives-economiques.fr.txt (#1032) 2023-01-19 19:51:45 +01:00
alternator.science.txt Create alternator.science.txt (#1492) 2024-11-12 14:09:31 +01:00
alternet.org.txt alternet.org 2014-06-08 14:41:32 +02:00
altfoto.com.txt
alumni.stanford.edu.txt
amandala.com.bz.txt
amazon.com.txt
americandrink.net.txt
americanprogress.org.txt Create americanprogress.org.txt (#1555) 2025-02-04 09:32:20 +01:00
americanthinker.com.txt
americastestkitchenfeed.com.txt
amp.themercury.com.au.txt
amptoons.com.txt
anandtech.com.txt
androidandme.com.txt
androidcentral.com.txt Create androidcentral.com.txt (#1335) 2024-02-10 22:38:14 +01:00
androidpolice.com.txt Update androidpolice.com.txt (#1621) 2025-05-16 13:47:18 +02:00
andy-bell.design.txt
angrymetalguy.com.txt
annatravelling.wordpress.com.txt
annouchka.fr.txt
ansible.com.txt add config for ansible.com (#1094) 2023-06-16 21:55:33 +02:00
answers.microsoft.com.txt Create answers.microsoft.com.txt (#1688) 2025-06-28 07:00:18 +02:00
answersresearchjournal.org.txt Create answersresearchjournal.org.txt 2023-03-21 00:37:21 +01:00
antigone21.com.txt Create antigone21.com.txt (#1463) 2024-10-27 18:36:25 +01:00
antirez.com.txt
aoc.media.txt
apache.be.txt
apnews.com.txt chore: add rules for apnews.com (#1884) 2026-02-24 15:01:17 +01:00
apotheke-adhoc.de.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
apple.com.txt Create apple.com.txt 2023-11-06 15:39:29 +01:00
apple.news.txt
appleinsider.com.txt appleinsider.com 2013-10-03 23:24:32 +02:00
appleweblog.com.txt
aps.dz.txt Create aps.dz.txt (#1044) 2023-02-06 07:01:47 +01:00
araraneon.com.br.txt Create araraneon.com.br.txt (#1416) 2024-08-02 08:38:12 +02:00
archdaily.com.txt
archiloque.net.txt Create archiloque.net.txt 2021-01-06 16:44:55 +01:00
architecturaldigest.com.txt Create architecturaldigest.com.txt (#1495) 2024-11-16 10:44:58 +01:00
archive.pressthink.org.txt
archiveofourown.org.txt Update archiveofourown.org.txt (#1630) 2025-05-27 22:01:09 +02:00
archlinux.de.txt Create archlinux.de.txt (#1556) 2025-02-07 14:08:26 +01:00
arduino-tutorial.de.txt Create arduino-tutorial.de.txt (#560) 2018-11-23 01:30:16 +01:00
arretsurimages.net.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
arstechnica.com.txt Update arstechnica.com.txt (#1471) 2024-10-30 21:27:50 +01:00
artforum.com.txt Update artforum.com.txt 2024-07-15 14:23:53 +02:00
articles.courant.com.txt
articles.washingtonpost.com.txt
artofmanliness.com.txt
artresilia.com.txt Create artresilia.com.txt (#1634) 2025-05-30 15:46:53 +02:00
artsixmic.fr.txt
arxiv-vanity.com.txt Update arxiv-vanity.com.txt 2023-06-08 15:29:31 +02:00
arxiv.org.txt add publication date and author to arXiv (#1745) 2025-08-12 18:46:25 +02:00
as-web.jp.txt add as-web.jp.txt and mainichi.jp.txt (#1822) 2025-12-17 20:01:19 +01:00
asahi.com.txt add below site configs (#1849) 2026-01-17 12:51:20 +01:00
ascarter.net.txt
ascii.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
askingbox.de.txt
askubuntu.com.txt Create askubuntu.com.txt (#1772) 2025-10-09 20:36:35 +02:00
astronews.com.txt
astronomy.com.txt
asymco.com.txt
atlantico.fr.txt
atlasobscura.com.txt Create atlasobscura.com.txt (#1576) 2025-03-28 08:12:57 +01:00
atmarkit.itmedia.co.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
au.lifehacker.com.txt Update and rename lifehacker.com.au.txt to au.lifehacker.com.txt (#1528) 2024-12-12 16:38:55 +01:00
au.news.yahoo.com.txt Changed 3 Yahoo configs (#1400) 2024-07-07 10:12:10 +02:00
audiobookshelf.org.txt Create audiobookshelf.org.txt (#1755) 2025-09-02 08:41:48 +02:00
auto-motor-und-sport.de.txt add config for auto-motor-und-sport.de (#1302) 2024-01-11 02:04:16 +01:00
autoactu.com.txt
autoblog.com.txt
autocar.co.uk.txt Update autocar.co.uk.txt (#1672) 2025-06-17 09:05:55 +02:00
autocrypt.org.txt autocrypt.org, cloud.google.com (#647) 2019-05-09 15:13:49 +03:00
automobil-produktion.de.txt Create automobil-produktion.de.txt (#1357) 2024-04-07 04:05:42 +02:00
autoplus.fr.txt
avantivictoirerao.com.txt Create avantivictoirerao.com.txt 2020-04-27 12:13:58 +02:00
avclub.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
awealthofcommonsense.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
aws.amazon.com.txt
axesslab.com.txt
axiocap.com.txt Update axiocap.com.txt 2024-02-23 03:29:22 +01:00
axios.com.txt Update axios.com.txt (#1752) 2025-08-27 08:49:31 +02:00
az-online.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
az.lib.ru.txt Add az.lib.ru 2025-11-24 19:42:42 +01:00
backlinko.com.txt
bahnblogstelle.com.txt Update bahnblogstelle.com.txt (#1130) 2023-07-03 16:57:16 +02:00
baltimoresun.com.txt
banglarrannaghor.com.txt Add scraping rules for banglarrannaghor.com (#1815) 2025-12-13 01:02:21 +01:00
barrons.com.txt Create barrons.com.txt (#1509) 2024-12-01 08:09:10 +01:00
baseballprospectus.com.txt
basicthinking.de.txt
basketeurope.com.txt
bastamag.net.txt
bastibe.de.txt Create bastibe.de.txt (#1505) 2024-11-22 08:36:35 +01:00
batenka.ru.txt Create batenka.ru.txt 2021-10-17 09:52:46 +02:00
baylon-industries.com.txt
bayometric.com.txt Add bayometric.com.txt with scraping details (#1839) 2026-01-06 18:38:28 +01:00
bbc.co.uk.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
bbc.com.txt Update cleanup on bbc.com (#1949) 2026-05-19 05:13:13 +02:00
bbcgoodfood.com.txt Bbcgoodfood (#1407) 2024-07-26 13:52:51 +02:00
bbva.es.txt Create bbva.es.txt 2022-10-25 11:18:12 +02:00
bdaily.co.uk.txt Create bdaily.co.uk.txt 2022-07-06 12:14:32 -04:00
bearmetal.eu.txt
becomingminimalist.com.txt
begeek.fr.txt
ben-evans.com.txt Create ben-evans.com.txt (#1423) 2024-08-23 08:05:13 +02:00
benoitmaison.org.txt
berliner-zeitung.de.txt Update berliner-zeitung.de.txt (#1937) 2026-05-06 20:49:31 +02:00
berlingske.dk.txt
bernama.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
bernardinai.lt.txt Create bernardinai.lt.txt (#1382) 2024-05-27 22:00:17 +02:00
besabine.com.txt Create besabine.com.txt (#1500) 2024-11-17 10:26:33 +01:00
bestcarweb.jp.txt add bestcarweb.jp.txt (#1896) 2026-03-02 09:32:50 +01:00
betabeat.com.txt
betanews.com.txt
bez.es.txt
bgland24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
bild.de.txt
biography.com.txt
birthdayshoes.com.txt
bit-tech.net.txt Create bit-tech.net.txt 2014-08-20 14:35:23 +01:00
bitelia.com.txt
biz-journal.jp.txt add .asahi.com.txt and biz-journal.jp.txt (#1911) 2026-03-04 15:00:28 +01:00
bizjournals.com.txt
bjango.com.txt
blaetter.de.txt Create blaetter.de.txt (#1762) 2025-09-20 12:54:15 +02:00
blast-info.fr.txt blast: remove navigation links (#1838) 2026-01-05 10:35:57 +01:00
bleacherreport.com.txt
blog.angular.io.txt create 3 new configs (#1149) 2023-07-12 06:28:40 +02:00
blog.asmartbear.com.txt
blog.chriszacharias.com.txt
blog.cloudflare.com.txt
blog.dropbox.com.txt
blog.eleven-labs.com.txt blog.eleven-labs.com added (#279) 2017-03-20 10:24:15 +00:00
blog.eng.xogrp.com.txt
blog.engineering.publicissapient.fr.txt Create blog.engineering.publicissapient.fr.txt (#891) 2021-08-17 16:56:19 +02:00
blog.fefe.de.txt
blog.google.txt Create blog.google.txt (#1656) 2025-06-12 15:52:38 +02:00
blog.imirhil.fr.txt
blog.instagram.com.txt
blog.instapaper.com.txt
blog.kaelig.fr.txt
blog.landr.com.txt Site config for blog.landr.com (#946) 2022-03-04 13:53:34 +01:00
blog.lepine.pro.txt Create blog.lepine.pro.txt (#1013) 2022-11-28 22:50:40 +01:00
blog.lumen.com.txt Create blog.lumen.com.txt (#1503) 2024-11-22 07:39:55 +01:00
blog.mochi.is.txt Create blog.mochi.is.txt (#1534) 2024-12-19 09:22:58 +01:00
blog.mondediplo.net.txt Create blog.mondediplo.net.txt (#1520) 2024-12-06 07:41:29 +01:00
blog.mozilla.org.txt fix: mozilla blog selectors (#1892) 2026-02-26 13:29:36 +01:00
blog.native-instruments.com.txt Create blog.native-instruments.com.txt 2021-06-09 21:40:12 +02:00
blog.naver.com.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
blog.netinfluence.ch.txt
blog.nightly.mozilla.org.txt
blog.octo.com.txt Create blog.octo.com.txt (#892) 2021-07-09 08:26:56 +02:00
blog.pchome.net.txt
blog.pinboard.in.txt
blog.professeurjoachim.com.txt Add blog.professeurjoachim.com.txt (#1514) 2024-12-05 08:49:48 +01:00
blog.rchapman.org.txt Create blog.rchapman.org.txt (#1459) 2024-10-25 00:23:45 +02:00
blog.renren.com.txt
blog.robertelder.org.txt Add blog.robertelder.org.txt (#980) 2022-06-13 06:07:39 +02:00
blog.rust-lang.org.txt Update blog.rust-lang.org.txt 2023-10-17 10:11:11 +02:00
blog.sentry.io.txt Sentry.io (#1140) 2023-07-07 17:02:14 +02:00
blog.serverlessadvocate.com.txt Create blog.serverlessadvocate.com.txt (#1145) 2023-07-12 06:28:14 +02:00
blog.shaunfinglas.co.uk.txt
blog.sina.com.cn.txt Initial commit 2013-02-27 23:43:10 +01:00
blog.spu.edu.txt
blog.squad.fr.txt
blog.stenmans.org.txt Create blog.stenmans.org.txt (#1736) 2025-08-07 03:43:41 +02:00
blog.terkel.io.txt Create blog.terkel.io.txt 2023-02-08 00:23:16 +01:00
blog.trello.com.txt
blog.twitter.com.txt
blog.wells.ee.txt
blog.xebia.fr.txt
blog.youb.fr.txt
blogs.faz.net.txt
blogs.forbes.com.txt
blogs.gnome.org.txt Add blogs.gnome.org & happyassassin.net (#410) 2018-04-01 14:10:09 +02:00
blogs.lse.ac.uk.txt Add sub-headings activation and strip attributes (#1954) 2026-05-22 07:43:17 +02:00
blogs.oracle.com.txt Create blogs.oracle.com.txt 2023-11-10 15:23:38 +01:00
blogs.reuters.com.txt
blogs.sciencemag.org.txt
blogs.smithsonianmag.com.txt Initial commit 2013-02-27 23:43:10 +01:00
blogs.technet.com.txt
bloomberg.com.txt Update bloomberg.com.txt (#1545) 2025-01-08 16:03:52 +01:00
boagworld.com.txt
boards.greenhouse.io.txt Create boards.greenhouse.io.txt (#1197) 2023-09-04 07:07:41 +02:00
bobbyhiltz.com.txt added bobbyhiltz.com (#1799) 2025-12-03 17:52:04 +01:00
bobbyromeo.com.txt
bohaishibei.com.txt Add files via upload (#293) 2017-05-14 13:52:53 +02:00
boingboing.net.txt Add title and date extraction to boingboing.net (#1835) 2026-01-04 07:46:14 +01:00
bonpote.com.txt Update bonpote.com.txt (#1411) 2024-07-29 15:50:34 +02:00
book.douban.com.txt Initial commit 2013-02-27 23:43:10 +01:00
bookforum.com.txt
borderhouseblog.com.txt
bosch-presse.de.txt
bostonglobe.com.txt Update bostonglobe.com.txt (#1256) 2023-11-29 16:43:46 +01:00
bostonreview.net.txt Update bostonreview.net.txt 2022-07-14 21:54:13 -04:00
boundlessline.org.txt Initial commit 2013-02-27 23:43:10 +01:00
boxingnewsonline.net.txt
bpb.de.txt Update bpb.de.txt 2025-04-21 00:38:44 +02:00
br.de.txt Create br.de.txt 2023-11-11 14:16:28 +01:00
brainfacts.org.txt
brainpickings.org.txt
brandeins.de.txt
brandingstrategyinsider.com.txt
brasil.elpais.com.txt fix: elpais body rule (#1885) 2026-02-23 19:41:12 +01:00
braunschweiger-zeitung.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
breitengrad-nord.de.txt Create breitengrad-nord.de.txt (#1344) 2024-02-28 09:19:31 +01:00
brentozar.com.txt Create brentozar.com.txt (#866) 2021-03-17 21:00:18 +01:00
brettterpstra.com.txt Initial commit 2013-02-27 23:43:10 +01:00
briefly.co.za.txt Create briefly.co.za.txt 2021-12-21 14:51:30 +01:00
brightside.me.txt
brit.co.txt Create brit.co.txt (#1401) 2024-07-07 10:25:29 +02:00
brookings.edu.txt Add brookings.edu.txt (#865) 2021-03-15 01:40:41 +01:00
brooksreview.net.txt
brucelawson.co.uk.txt
bt.no.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
buerstaedter-zeitung.de.txt Update buerstaedter-zeitung.de.txt 2023-10-15 15:01:44 +02:00
buffed.de.txt Update buffed.de.txt 2023-10-24 00:02:22 +02:00
buildvirtual.net.txt Create buildvirtual.net.txt (#1474) 2024-10-31 00:22:16 +01:00
bunshun.jp.txt add below site configs (#1849) 2026-01-17 12:51:20 +01:00
buquad.com.txt Initial commit 2013-02-27 23:43:10 +01:00
business-standard.com.txt Create business-standard.com.txt 2024-09-26 11:57:12 +02:00
business.time.com.txt
business2community.com.txt
businessinsider.com.au.txt
businessinsider.com.txt Update businessinsider.com.txt (#1637) 2025-05-31 14:54:34 +02:00
businessinsider.jp.txt add tokyo-np.co.jp.txt and businessinsider.jp.txt (#1823) 2025-12-18 13:42:33 +01:00
businessnews.com.tn.txt
businessweek.com.txt
buzzfeed.com.txt
buzzfeed.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
bw24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
bzg.fr.txt Create bzg.fr.txt (#1593) 2025-04-17 22:52:23 +02:00
c.newsnow.co.uk.txt
c.newsnow.com.txt
cabinetmagazine.org.txt Create cabinetmagazine.org.txt 2021-10-10 11:42:37 +02:00
cable.co.uk.txt
cafebabel.com.txt
caffereggio.net.txt
callistaenterprise.se.txt Create callistaenterprise.se.txt (#1464) 2024-10-28 00:42:30 +01:00
canardpc.com.txt Update canardpc.com.txt (#1458) 2024-10-30 03:12:47 +01:00
canonrumors.com.txt
captaineconomics.fr.txt
car-it.com.txt add car-it.com.txt (#754) 2020-03-12 23:57:10 +01:00
caranddriver.com.txt Create caranddriver.com.txt (#1648) 2025-06-08 09:54:13 +02:00
caravanmagazine.in.txt Create caravanmagazine.in.txt 2022-11-09 00:59:25 +01:00
cardboardconnection.com.txt
carlchenet.com.txt
carnegie.ru.txt Rename carnegie.ru.tx to carnegie.ru.txt 2022-02-17 22:22:59 +01:00
carnegieeurope.eu.txt
cars.com.txt
caseinterview.com.txt Update caseinterview.com.txt 2021-06-19 16:30:13 +02:00
cashless.pl.txt
catapult.co.txt Update catapult.co.txt 2020-10-28 10:29:40 +01:00
catb.org.txt
cbsnews.com.txt Revamp CBS News (#1908) 2026-03-04 01:59:31 +01:00
cell.com.txt Update XPath selector for article body (#1842) 2026-01-08 18:44:22 +01:00
cert-bund.de.txt Make the feed from cert-bund.de more useful (#921) 2022-01-13 11:35:20 +01:00
certaintynews.com.txt Update certaintynews.com.txt 2023-11-30 11:23:48 +01:00
cfclrk.com.txt Create cfclrk.com.txt (#1574) 2025-03-26 16:03:18 +01:00
cgtrader.com.txt Cgtrader (#1640) 2025-06-01 01:51:30 +02:00
champeau.info.txt
channelnewsasia.com.txt Update channelnewsasia.com.txt 2026-02-12 16:43:26 +01:00
chaperonsetvous.fr.txt Create chaperonsetvous.fr.txt (#981) 2022-06-14 16:50:00 +02:00
chareidi.org.txt
charlotteobserver.com.txt
chat.openai.com.txt Update chat.openai.com.txt 2023-11-08 15:47:00 +01:00
chefkoch.de.txt Update chefkoch.de (#763) 2020-03-31 12:33:01 +02:00
chicagotribune.com.txt Create chicagotribune.com.txt 2021-05-26 00:25:07 +02:00
chiemgau24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
china-gadgets.de.txt add china-gadgets.de config (#1309) 2024-01-15 23:56:26 +01:00
chip.de.txt Create chip.de.txt (#1424) 2024-08-24 09:20:16 +02:00
choice.com.au.txt Update choice.com.au.txt 2026-02-20 19:16:38 +01:00
chomsky.info.txt chore: add body and fix author for chomsky.info (#1886) 2026-02-24 13:41:05 +01:00
chrisltd.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
christianitytoday.com.txt Initial commit 2013-02-27 23:43:10 +01:00
christies.com.txt Initial commit 2013-02-27 23:43:10 +01:00
christophe-casalegno.com.txt feat: add christophe-casalegno.com configuration (#1947) 2026-05-17 13:13:31 +02:00
chrome.google.com.txt
chronicle.com.txt
ciaosamin.com.txt
cicero.de.txt
cio.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
ciperchile.cl.txt
cityam.com.txt Update cityam.com.txt 2024-05-20 19:07:22 +02:00
citylab.com.txt
cjr.org.txt
clarin.com.txt Create clarin.com.txt (#1166) 2023-07-23 08:28:13 +02:00
classcentral.com.txt Create classcentral.com.txt (#1502) 2024-11-22 06:19:02 +01:00
cleafy.com.txt Update cleafy.com.txt (#1350) 2024-03-09 13:10:37 +01:00
cleantechnica.com.txt
clientk.com.txt
cloud.google.com.txt Update cloud.google.com.txt 2026-02-15 19:38:08 +01:00
cloudacademy.com.txt
clubic.com.txt Update clubic.com.txt 2025-10-13 10:39:26 +02:00
cmace.de.txt
cmns.umd.edu.txt Shtrom 2022 04 (#966) 2022-04-22 09:57:46 +02:00
cmswire.com.txt
cn.engadget.com.txt
cn.nytimes.com.txt Create cn.nytimes.com.txt 2022-11-08 23:07:05 +01:00
cn.reuters.com.txt
cnbc.com.txt Update cnbc.com.txt 2021-05-19 22:58:04 +02:00
cnet.com.txt Update cnet.com.txt 2025-06-25 17:54:26 -04:00
cnetfrance.fr.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
cnews.ru.txt Update cnews.ru.txt (#1553) 2025-01-28 17:19:37 +01:00
cnn.com.txt Update cnn.com.txt (#1798) 2025-12-03 17:56:50 +01:00
cnrs.fr.txt Added cnrs.fr.txt (#876) 2021-05-04 08:58:59 +02:00
cntraveller.com.txt Update cntraveller.com.txt 2023-01-13 01:17:57 +01:00
coalicionporelevangelio.org.txt Create coalicionporelevangelio.org.txt 2022-10-25 11:26:55 +02:00
code.activestate.com.txt
code.google.com.txt
codebase64.org.txt
codeproject.com.txt
codinghorror.com.txt Initial commit 2013-02-27 23:43:10 +01:00
codyhosterman.com.txt Create codyhosterman.com.txt (#867) 2021-03-17 20:58:13 +01:00
coffeecircle.com.txt
cohost.org.txt Add cohost.org ko-fi.com and pcgamer.com (#1364) 2024-04-13 16:18:23 +02:00
cointelegraph.com.txt Add cointelegraph.com.txt (#881) 2021-05-14 00:47:34 +02:00
collective-evolution.com.txt
collegehumor.com.txt
columbiaspectator.com.txt Create columbiaspectator.com.txt (#1434) 2024-09-23 20:40:54 +02:00
come-on.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
commentarymagazine.com.txt Update commentarymagazine.com.txt 2021-02-13 21:44:12 +01:00
commitstrip.com.txt
commlabindia.com.txt Add initial content for commlabindia.com.txt (#1933) 2026-05-04 10:00:54 +02:00
commondreams.org.txt Update commondreams.org.txt (#1766) 2025-10-01 21:45:22 +02:00
commonwealmagazine.org.txt Create commonwealmagazine.org.txt 2023-02-18 01:05:41 +01:00
communities-dominate.blogs.com.txt
community.element14.com.txt Create community.element14.com.txt (#1473) 2024-10-31 00:08:32 +01:00
community.lucid.co.txt Comment out callout strip rule in community.lucid.co.txt 2026-02-10 19:19:44 +01:00
community.openstreetmap.org.txt Update community.openstreetmap.org.txt (#1452) 2024-10-21 02:10:58 +02:00
community.readeck.org.txt Create community.readeck.org.txt (#1603) 2025-05-03 08:23:02 +02:00
community.silverbullet.md.txt add community.silverbullet.md (#1844) 2026-01-09 21:45:13 +01:00
composer.spitfireaudio.com.txt Update composer.spitfireaudio.com.txt 2021-07-10 02:01:39 +02:00
computerbase.de.txt
computerworld.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
computerworld.dk.txt
consortiumnews.com.txt Create consortiumnews.com.txt 2021-05-03 12:34:38 +02:00
consumerreports.org.txt Create consumerreports.org.txt 2024-06-13 17:05:22 +02:00
contexte.com.txt
contrepoints.org.txt
cooking.nytimes.com.txt Create cooking.nytimes.com.txt 2021-01-06 16:36:08 +01:00
cooper.com.txt
core77.com.txt
correctiv.org.txt Update correctiv.org.txt 2022-05-24 00:27:52 +02:00
costanachrichten.com.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
counterpunch.org.txt
countrylife.co.uk.txt Create countrylife.co.uk.txt (#1469) 2024-10-30 02:34:42 +01:00
courrierdesbalkans.fr.txt
courrierdeuropecentrale.fr.txt
courrierinternational.com.txt Update courrierinternational.com.txt (#1391) 2024-06-15 01:10:47 +02:00
creteinsider.com.txt Create creteinsider.com.txt (#1705) 2025-07-06 23:57:38 +02:00
crikey.com.au.txt crikey.com.au.txt: Initial commit (#862) 2021-03-07 00:08:29 +01:00
crimemagazine.com.txt
crimereads.com.txt Update crimereads.com.txt (#1073) 2023-03-31 16:10:56 +02:00
crimethinc.com.txt
criterion.com.txt Update criterion.com.txt 2022-08-20 00:12:35 +02:00
crn.de.txt Initial commit 2013-02-27 23:43:10 +01:00
crunchyroll.com.txt
csmonitor.com.txt
csnphilly.com.txt Initial commit 2013-02-27 23:43:10 +01:00
csoonline.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
css-tricks.com.txt Updated css-tricks.com.txt (#624) 2019-03-04 15:25:53 +01:00
csswizardry.com.txt Added csswizardry.com.txt (#655) 2019-06-11 09:17:05 +02:00
ctxt.es.txt Create ctxt.es.txt (#1566) 2025-03-11 14:57:47 +01:00
cucharasonica.com.txt
cultofmac.com.txt Update cultofmac.com.txt 2019-05-23 13:40:41 +02:00
culturebd.com.txt
cw.com.tw.txt
cwnp.com.txt
cyrille-borne.com.txt
da.feedsportal.com.txt
dadall.info.txt
dafoster.net.txt Create dafoster.net.txt (#1151) 2023-07-12 06:22:32 +02:00
dagogtid.no.txt
daily-osm-tips.getsendstack.com.txt add config for daily-osm-tips.getsendstack.com (#1009) 2022-11-02 10:01:43 +01:00
dailydot.com.txt
dailykos.com.txt
dailymail.co.uk.txt Update dailymail.co.uk.txt 2022-04-03 11:47:20 +02:00
dailymaverick.co.za.txt Update dailymaverick.co.za.txt 2021-05-03 13:08:26 +02:00
dailymotion.com.txt
dailynord.fr.txt
dailysabah.com.txt Create dailysabah.com.txt 2015-11-09 20:03:28 +01:00
dailyshincho.jp.txt add dailyshincho.jp.txt and shueisha.online.txt (#1824) 2025-12-19 13:36:11 +01:00
dailystar.com.lb.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
dallasnews.com.txt Update dallasnews.com.txt (#1406) 2024-07-18 07:15:03 +02:00
danbooru.donmai.us.txt Create danbooru.donmai.us.txt (#1356) 2024-03-26 22:22:23 +01:00
danburzo.ro.txt Add metadata to danburzo.ro.txt (#1848) 2026-01-13 11:39:46 +01:00
danluu.com.txt Fix body extraction for https://danluu.com (#1767) 2025-10-02 09:41:12 +02:00
dansdata.com.txt
dantri.com.vn.txt
daringfireball.net.txt
daserste.ndr.de.txt Add daserste.ndr.de (#235) 2016-12-31 00:26:50 +01:00
dasgelbeblatt.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
datasecuritybreach.fr.txt Create datasecuritybreach.fr.txt (#1642) 2025-06-06 09:28:17 +02:00
davidwalsh.name.txt
dazeddigital.com.txt Update dazeddigital.com.txt 2023-10-04 14:38:27 +02:00
dbazi.com.txt
dcurt.is.txt
deadline.com.txt
deadspin.com.txt Update deadspin.com.txt 2022-12-04 22:39:03 +01:00
declassifieduk.org.txt Create declassifieduk.org.txt 2023-05-24 21:51:58 +02:00
defenseone.com.txt Update defenseone.com.txt 2022-03-11 11:45:57 +01:00
deia.com.txt
deichstube.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
deliverydoubled.com.txt
deloitte.com.txt Modify content extraction rules in deloitte.com.txt (#1956) 2026-05-24 23:16:40 +02:00
delong.typepad.com.txt
democracynow.org.txt
demorgen.be.txt Re-enable cookie wall bypass for demorgen.be (#1912) 2026-03-16 07:04:39 +01:00
denikn.cz.txt Update denikn.cz.txt with new paywall detection selector (#1559) 2025-02-27 11:41:57 +01:00
denofgeek.com.txt Create denofgeek.com.txt (#1737) 2025-08-07 16:04:20 +02:00
der-postillon.com.txt
derbund.ch.txt
derekseaman.com.txt Create derekseaman.com.txt (#1472) 2024-10-30 23:15:33 +01:00
derstandard.at.txt Standard2 (#1721) 2025-07-21 08:46:30 +02:00
derstandard.de.txt Standard2 (#1721) 2025-07-21 08:46:30 +02:00
des-livres-pour-changer-de-vie.fr.txt
designsponge.com.txt
designtagebuch.de.txt
deutsche-apotheker-zeitung.de.txt
dev.to.txt
devblogs.microsoft.com.txt fix: update Microsoft devblogs configuration (#1942) 2026-05-10 09:29:40 +02:00
developer.mozilla.org.txt Update developer.mozilla.org.txt 2023-11-14 17:09:36 +01:00
developers.facebook.com.txt
devlinsangle.blogspot.co.at.txt
dezeen.com.txt Add dezeen.com.txt (#1207) 2023-09-22 17:38:14 +02:00
diagonalperiodico.net.txt
diamond-rm.net.txt add diamond-rm.net.txt (#1857) 2026-01-28 12:41:27 +01:00
diamond.jp.txt add diamond.jp (#1850) 2026-01-18 14:23:57 +01:00
dice.com.txt Create dice.com.txt 2023-11-18 18:05:42 +01:00
dictionary.reference.com.txt
diepresse.com.txt Update diepresse.com.txt (#1466) 2024-10-29 17:16:51 +01:00
digg.com.txt Create digg.com.txt 2022-01-08 12:17:16 +01:00
digiphoto.techbang.com.txt
digital-photography-school.com.txt Updated digital-photography-school.com.txt (#1351) 2024-03-12 13:43:25 +01:00
digitalcourage.de.txt Create digitalcourage.de.txt (#806) 2020-09-17 20:26:48 +02:00
digitalfernsehen.de.txt Update digitalfernsehen.de.txt 2023-10-20 15:43:13 +02:00
digitalforensics.com.txt
digitalkamera.de.txt add digitalkamera.de.txt for multipage fetching (#1288) 2024-01-01 00:26:58 +01:00
digitalspy.co.uk.txt
dilbert.com.txt
dinamalar.com.txt
disclose.ngo.txt Create disclose.ngo.txt (#1526) 2024-12-12 01:12:42 +01:00
discuss.logseq.com.txt Create discuss.logseq.com.txt (#1451) 2024-10-21 01:55:35 +02:00
discuss.pixls.us.txt add discuss.pixls.us.txt (#1782) 2025-11-03 18:23:21 +01:00
dispatchesjournal.org.txt
dissentmagazine.org.txt
distributistreview.com.txt
dn.pt.txt
dobreprogramy.pl.txt
doc.rust-lang.org.txt fix issue: wallabag/wallabag/issues/7854 (#1506) 2024-11-24 20:31:19 +01:00
doc.rust-lang.ru.txt fix issue: wallabag/wallabag/issues/7854 (#1506) 2024-11-24 20:31:19 +01:00
doc.wallabag.org.txt
docs.cloud.google.com.txt Update docs.cloud.google.com.txt 2026-02-15 19:44:30 +01:00
docs.opnsense.org.txt Create docs.opnsense.org.txt (#1283) 2023-12-28 16:32:13 +01:00
docs.redhat.com.txt Add configuration for processing Red Hat docs (#1957) 2026-05-30 10:09:34 +02:00
dodgersway.com.txt Create dodgersway.com.txt (#1012) 2022-11-28 09:50:06 -08:00
domo-blog.fr.txt Create domo-blog.fr.txt (#1536) 2024-12-21 21:16:49 +01:00
domusweb.it.txt
donnahay.com.au.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
dorkly.com.txt
dou.ua.txt
douban.com.txt
doughellmann.com.txt
dpreview.com.txt
dr-b.io.txt add dr-b.io (#1100) 2023-06-19 09:07:00 +02:00
dr.dk.txt Update dr.dk.txt 2024-06-05 23:20:18 +02:00
drdobbs.com.txt Initial commit 2013-02-27 23:43:10 +01:00
drgoulu.com.txt
drive2.ru.txt
dropbox.com.txt
drupal.org.txt
dummies.com.txt Create dummies.com.txt (#1163) 2023-07-21 06:25:41 +02:00
dushumashang.com.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
dw.com.txt fix: dw.com body rule and add date (#1888) 2026-02-24 06:43:06 +01:00
dzone.com.txt
earther.com.txt Add fossbytes.com, mercurynews.com, earther.com configs (#463) 2018-07-17 18:42:36 +02:00
earvingad.github.io.txt Create earvingad.github.io.txt (#1753) 2025-08-30 09:01:18 +02:00
eastoftheweb.com.txt
eatsmarter.de.txt fix: eatsmarter single_page_link (#1889) 2026-02-24 06:45:42 +01:00
ebay.com.txt
ecetia.com.txt
echo-online.de.txt Update echo-online.de.txt 2023-10-15 15:00:22 +02:00
echo24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
eckerd.edu.txt Create eckerd.edu.txt (#841) 2021-01-07 07:48:57 +01:00
econlog.econlib.org.txt
economichardship.org.txt Create economichardship.org.txt (#1578) 2025-03-30 22:27:22 +02:00
economie.gouv.fr.txt
economist.com.txt Update economist.com.txt (#1523) 2024-12-09 09:30:00 +01:00
ecranlarge.com.txt
edge-online.com.txt
edge.org.txt
edition.channel5belize.com.txt
edition.cnn.com.txt fix: cnn selectors (#1898) 2026-02-26 19:38:20 +01:00
edmunds.com.txt Create edmunds.com.txt (#1644) 2025-06-07 09:26:20 +02:00
edn.com.txt Create edn.com (#1138) 2023-07-06 22:37:46 +02:00
eetimes.com.txt
eff.org.txt eff.org: wrap quotes in blockquote (#912) 2021-10-29 22:36:48 +02:00
einfach-tasty.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
ekantipur.com.txt
ekultura.hu.txt Initial commit 2013-02-27 23:43:10 +01:00
elance.com.txt
elblogsalmon.com.txt
elconfidencial.com.txt Update elconfidencial.com.txt 2022-06-06 12:52:44 +02:00
elderscrollsonline.com.txt
eleconomista.es.txt
electrek.co.txt Add electrek.co.txt (#850) 2021-01-17 23:54:20 +01:00
electromaker.io.txt
elektroautomobil.com.txt add elektroautomobil.com (#1303) 2024-01-11 02:30:33 +01:00
elektroniknet.de.txt Update elektroniknet.de.txt (#1358) 2024-04-07 04:36:49 +02:00
elementor.contentlabs.ca.txt Create elementor.contentlabs.ca.txt 2022-12-20 00:02:09 +01:00
elespanol.com.txt Update elespanol.com (#1003) 2022-11-02 09:57:42 +01:00
elfster.com.txt add elfster.com to remove ads (#1096) 2023-06-19 09:05:43 +02:00
elmalpensante.com.txt
elmundo.es.txt Update elmundo.es (#1002) 2022-11-02 09:57:22 +01:00
elpais.com.txt Elpais (#1561) 2025-02-28 05:15:51 +01:00
eltonjohn.com.txt Create eltonjohn.com.txt (#1589) 2025-04-16 08:17:16 +02:00
emaratalyoum.com.txt
en.espnf1.com.txt Initial commit 2013-02-27 23:43:10 +01:00
energie-experten.org.txt Add site configs for energie-experten.org and reset.org (#1961) 2026-06-10 15:23:24 +02:00
engadget.com.txt
engineering.tumblr.com.txt
english.aljazeera.net.txt
enikos.gr.txt
enterprisersproject.com.txt
entertainment.timesonline.co.uk.txt
entheogenesis.org.txt Entheogenesis (#1430) 2024-09-02 14:29:31 +02:00
entrepreneurshandbook.co.txt Update entrepreneurshandbook.co.txt (#1170) 2023-07-25 06:45:09 +02:00
entwickler.de.txt
enviscope.com.txt
erdorin.org.txt Adding rules for erdorin.org (#1873) 2026-02-20 16:34:27 +01:00
ericsuh.com.txt
ernestmag.fr.txt
escapistmagazine.com.txt
esglobal.org.txt add esglobal.org 2015-10-03 21:23:19 +02:00
espacepolitique.revues.org.txt
espn.go.com.txt
esquire.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
esslinger-zeitung.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
essonneinfo.fr.txt
estadao.com.br.txt
eternabuenosaires.com.txt
eudi-wallet.gov.de.txt Add initial content to eudi-wallet.gov.de.txt (#1928) 2026-04-16 16:12:59 +02:00
euractiv.com.txt fix: euractiv.com (#1867) 2026-02-10 14:08:22 +01:00
euractiv.fr.txt Add euractiv.fr.txt (#1066) 2023-03-19 20:48:29 +01:00
eurogamer.net.txt Improvements to eurogamer.net, heise.de, rockpapershotgun.com, tagesschau.de and zeit.de. Fix golem.de (#936) 2022-02-28 06:39:51 +01:00
everway.com.txt Add content extraction rules for everway.com 2025-12-22 23:22:07 +01:00
everydayfeminism.com.txt
evo.co.uk.txt
eweek.com.txt
exoplanets.nasa.gov.txt Add exoplanets.nasa.gov.txt (#949) 2022-03-08 06:17:57 +01:00
explainthatstuff.com.txt Create explainthatstuff.com.txt 2021-02-01 13:57:24 +01:00
explosm.net.txt Update explosm.net.txt (#991) 2022-09-02 07:02:17 +02:00
expresso.sapo.pt.txt
extracine.com.txt
extratipp.com.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
f-droid.org.txt chore: add body and date rules for f-droid (#1894) 2026-02-24 15:09:08 +01:00
facebook.com.txt
facta.co.jp.txt
factuel.info.txt
fair.org.txt Create fair.org.txt 2019-06-20 15:29:17 +02:00
fairphone.com.txt
fakirpresse.info.txt
falter.at.txt
fanfiction.net.txt
fastcompany.com.txt Update fastcompany.com.txt 2021-05-19 22:52:48 +02:00
fathers.pl.txt Fix test_url entry in fathers.pl.txt (#1861) 2026-01-31 16:52:08 +01:00
favouritehumandesign.com.txt Create favouritehumandesign.com.txt (#1654) 2025-06-08 19:53:06 +02:00
faz.net.txt Add strip rule for Google preferences link in faz.net.txt (#1955) 2026-05-22 15:24:15 +02:00
feeds.feedblitz.com.txt
fehmarn24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
feinschwarz.net.txt Create feinschwarz.net.txt (#1734) 2025-08-04 00:07:00 +02:00
fernbahntunnel-frankfurt.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
fertigung.de.txt
fictionpress.com.txt
ficwad.com.txt
fidelitydigitalassets.com.txt Update fidelitydigitalassets.com.txt (#1700) 2025-07-04 09:36:30 +02:00
fiftytwo.in.txt Create fiftytwo.in.txt 2020-10-20 11:02:39 +02:00
filamentgroup.com.txt
filmstarts.de.txt
finance.yahoo.co.jp.txt add finance.yahoo.co.jp.txt and topnews.jp.txt (#1856) 2026-01-26 19:21:42 +01:00
findtheswagger.tumblr.com.txt
finexpert.e15.cz.txt
fingerprint.ippen.media.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
firstmonday.org.txt Create firstmonday.org.txt (#1291) 2024-01-05 01:31:42 +01:00
firstthings.com.txt
fivebooks.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
fivefilters.org.txt
fivethirtyeight.com.txt
flyingmachinestudios.com.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
fm4.orf.at.txt
fmhy.net.txt Update fmhy.net.txt (#1353) 2024-03-12 17:02:25 +01:00
fnal.gov.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
fnp.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
focus-numerique.com.txt Create focus-numerique.com.txt (#155) 2016-05-17 14:02:33 +02:00
focus.de.txt Focus.de (#1390) 2024-06-15 00:28:21 +02:00
fok.nl.txt fok.nl 2015-01-28 14:38:46 +01:00
fokus.se.txt
foley.com.txt
folklore.org.txt
food.com.txt
fool.com.txt
forbes.com.txt Update polygon.com and forbes.com (#843) 2021-01-08 13:36:50 +01:00
forbesjapan.com.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
forbiddenstories.org.txt Create forbiddenstories.org.txt 2023-04-19 17:35:11 +02:00
foreignaffairs.com.txt Update foreignaffairs.com.txt 2022-04-09 16:44:49 +02:00
foreignpolicy.com.txt Update foreignpolicy.com.txt 2024-12-13 14:06:52 +01:00
formula1.com.txt Create formula1.com.txt (#1579) 2025-03-31 00:12:39 +02:00
forsvaret.no.txt
fortelabs.co.txt Update fortelabs.co.txt 2021-04-22 15:35:42 +02:00
forum-geldpolitik.ch.txt Add site configs for republik.ch, forum-geldpolitik.ch, wendezeit.ch (#1963) 2026-06-06 18:34:41 +02:00
forum.revvox.de.txt add forum.revvox.de.txt (#1783) 2025-11-03 18:24:31 +01:00
forward.com.txt Create forward.com.txt 2023-11-18 17:44:56 +01:00
fossbytes.com.txt
fosslinux.com.txt Add scraping configuration for fosslinux.com (#1944) 2026-05-14 08:05:17 +02:00
foxnews.com.txt
fr.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
framablog.org.txt
france24.com.txt
franceculture.fr.txt Create franceculture.fr.txt (#403) 2018-03-11 18:22:28 +01:00
franceinfo.fr.txt Update franceinfo.fr.txt (#1761) 2025-09-19 22:42:44 +02:00
frandroid.com.txt Update frandroid.com.txt 2026-02-06 17:26:21 +01:00
frankenpost.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
frankwatching.com.txt chore: rename frankwatching and add body rule (#1887) 2026-02-23 19:42:53 +01:00
freecodecamp.org.txt Create freecodecamp.org.txt (#935) 2022-02-21 14:26:04 +01:00
freelancer.com.txt
freemovement.org.uk.txt Create freemovement.org.uk.txt (#1417) 2024-08-08 14:00:12 +02:00
fria.nu.txt
friatidningen.se.txt
frmplus.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
fromreformationtoreformation.com.txt Create fromreformationtoreformation.com.txt (#1622) 2025-05-21 08:21:45 +02:00
frontburner.dmagazine.com.txt Backport site_config changes from wallabag v1 2015-12-31 18:13:20 +01:00
frontpagelinux.com.txt
fs.blog.txt Create fs.blog.txt 2022-08-12 01:40:21 +02:00
ft.com.txt Update ft.com.txt (#1343) 2024-02-21 22:18:51 +01:00
ftchinese.com.txt updated ftchinese.com.txt (#836) 2020-12-28 16:26:00 +01:00
fujinkoron.jp.txt add .watch.impress.co.jp.txt and fujinkoron.jp.txt (#1917) 2026-03-20 14:43:02 +01:00
fularsizentellik.com.txt
fuldaerzeitung.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
funnyjunk.com.txt Update funnyjunk.com.txt 2021-03-14 13:51:25 +01:00
futura-sciences.com.txt
futurezone.at.txt
futurism.com.txt fix: futurism.com.txt (#1870) 2026-02-18 16:12:17 +01:00
fzone.cz.txt
gamasutra.com.txt
gameblog.fr.txt mmm, not a rocker at manual XPath. 2015-03-11 23:51:18 +01:00
gamedev.net.txt
gamekult.com.txt Updated gamekult.com.txt (#875) 2021-04-30 18:40:23 +02:00
gamer.no.txt
gamereactor.no.txt
gamesradar.com.txt Create gamesradar.com.txt (#1606) 2025-05-05 06:53:46 +02:00
gameswirtschaft.de.txt
ganglia.info.txt
gatesnotes.com.txt Update gatesnotes.com.txt (#1499) 2024-11-17 09:52:42 +01:00
gatopardo.com.txt
gauchiste.fr.txt
gawker.com.txt
geeksofdoom.com.txt Initial commit 2013-02-27 23:43:10 +01:00
geenstijl.nl.txt
gendai.media.txt add gendai.media.txt xenospectrum.com.txt taxacc.jp.txt (#1821) 2025-12-17 05:44:08 +01:00
generation-nt.com.txt Update generation-nt.com.txt (#1025) 2023-01-12 15:57:19 +01:00
germangirlinamerica.com.txt Create germangirlinamerica.com.txt (#1137) 2023-07-06 11:18:31 +02:00
geschichtedergegenwart.ch.txt Create geschichtedergegenwart.ch.txt (#1183) 2023-08-20 12:58:25 +02:00
getnews.jp.txt
getpocket.com.txt Update getpocket.com.txt 2021-05-12 22:01:55 +02:00
ghanaweb.com.txt
giantbomb.com.txt
giessener-allgemeine.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
giga.de.txt
gigaom.com.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
gihyo.jp.txt
gist.github.com.txt Initial commit 2013-02-27 23:43:10 +01:00
git-scm.com.txt
github.blog.txt
github.com.txt Fix XPath selector for body content 2026-01-17 20:15:34 +01:00
gizmodo.com.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
gizmodo.uol.com.br.txt
gizmologia.com.txt
gizmovil.com.txt
glasnaya.media.txt Update glasnaya.media.txt 2023-11-09 16:55:29 +01:00
glazman.org.txt
global.txt Update global.txt 2020-10-24 11:42:27 +02:00
globalgrind.com.txt fix: globalgrind.com rules (#1899) 2026-02-26 18:32:00 +01:00
globalissues.org.txt Initial commit 2013-02-27 23:43:10 +01:00
globalresearch.ca.txt
gloswielkopolski.pl.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
gnppn.fr.txt
gnu.org.txt fix: allow getting body from man pages for gnu.org (#1890) 2026-02-24 06:46:35 +01:00
gnz.de.txt Create gnz.de.txt (#1535) 2024-12-21 14:06:24 +01:00
go2senkyo.com.txt add merkmal-biz.jp and go2senkyo.com (#1934) 2026-05-04 14:30:06 +02:00
goal.com.txt
gocomics.com.txt
gofugyourself.com.txt
gokulkrishh.github.io.txt
gold.ac.uk.txt
goldseiten.de.txt Update goldseiten.de.txt (#1056) 2023-02-19 22:39:21 +01:00
golem.de.txt Update cookie consent value in golem.de.txt (#1891) 2026-02-24 06:47:36 +01:00
good.is.txt
goodfil.ms.txt
goodhousekeeping.com.txt Create goodhousekeeping.com.txt (#1742) 2025-08-10 05:36:56 +02:00
goodreads.com.txt
gorky.media.txt Create gorky.media.txt 2020-09-03 13:12:52 +02:00
gossip-tv.gr.txt
goteborgsfria.se.txt
gothamist.com.txt Initial commit 2013-02-27 23:43:10 +01:00
gov.uk.txt Update gov.uk.txt 2022-10-16 10:54:58 +02:00
gp.se.txt
gq-magazine.co.uk.txt Update gq-magazine.co.uk.txt (#1202) 2023-09-11 21:13:36 +02:00
gq.com.txt Update gq.com.txt 2020-05-03 14:24:42 +02:00
grafikart.fr.txt
granta.com.txt Update granta.com.txt 2021-03-26 21:46:36 +01:00
grantland.com.txt
greatergreaterwashington.org.txt
greaterwrong.com.txt Create greaterwrong.com.txt 2020-10-16 20:41:55 +02:00
greensavers.sapo.pt.txt Create greensavers.sapo.pt.txt (#985) 2022-07-19 15:13:38 +02:00
grisebouille.net.txt Grisebouille (#1881) 2026-02-23 14:25:31 +01:00
groene.nl.txt Update groene.nl.txt (#1158) 2023-07-17 11:28:59 +02:00
grokipedia.com.txt Add attribute stripping for span elements (#1922) 2026-03-24 05:24:24 +01:00
groups.drupal.org.txt
grubstreet.com.txt
grumpygamer.com.txt
gsmarena.com.txt
gulfnews.com.txt
guokr.com.txt
gurumed.org.txt
gurusblog.com.txt
gutenberg.org.txt Update gutenberg.org.txt 2026-01-29 12:51:47 +01:00
guyaweb.com.txt
haaretz.co.il.txt Create haaretz.co.il.txt (#1069) 2023-03-23 14:35:43 +01:00
haaretz.com.txt Update haaretz.com.txt 2025-06-25 17:37:36 -04:00
haberler.com.txt
habr.com.txt Update habr.com.txt (#1470) 2024-10-30 03:36:19 +01:00
habrahabr.ru.txt
hacf.fr.txt Create hacf.fr.txt (#1704) 2025-07-06 23:11:27 +02:00
hackersrepublic.org.txt
hackertarget.com.txt
hackmake.org.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
hackneycitizen.co.uk.txt Update hackneycitizen.co.uk.txt 2021-05-26 00:55:08 +02:00
hacks.mozilla.org.txt
hallo-muenchen.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
halo.bungie.org.txt Initial commit 2013-02-27 23:43:10 +01:00
hanau-wuerzburg-fulda.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
hanauer.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
handelsblatt.com.txt fix: bad format errors (#1811) 2025-12-09 13:48:06 +01:00
hanselman.com.txt
happyassassin.net.txt
hardware.fr.txt
hardware.no.txt Convert hardware.no.txt from latin1. 2014-07-01 20:17:14 +01:00
hardwareluxx.de.txt update hardwareluxx (#1809) 2025-12-08 01:31:34 +01:00
harpers.org.txt Update harpers.org.txt 2023-01-17 22:25:58 +01:00
harzkurier.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
has-sante.fr.txt Create has-sante.fr.txt (#1719) 2025-07-20 10:30:26 +02:00
hazlitt.net.txt
hbr.org.txt Refactor selectors and strip rules in hbr.org.txt (#1906) 2026-03-03 00:25:05 +01:00
headrush.typepad.com.txt
health.com.txt
health.gov.au.txt Shtrom 2019 02 2 (#615) 2019-02-17 17:15:41 +01:00
healthland.time.com.txt
healthletter.mayoclinic.com.txt
healthline.com.txt Update healthline.com.txt 2023-01-27 15:11:56 +01:00
heatmap.news.txt Shtrom 2024 01 (#1326) 2024-01-31 16:15:41 +01:00
heidelberg24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
heise.de.txt Add strip rule for SVG images (#1855) 2026-01-24 14:29:24 +01:00
hellofresh.de.txt Create hellofresh.de.txt (#1368) 2024-04-25 18:58:10 +02:00
help.fivefilters.org.txt Add some other missing news sites that do not always correctly work. 2015-03-10 11:50:15 +11:00
help.sharegate.com.txt Create help.sharegate.com.txt 2025-09-11 16:17:44 +02:00
hemmings.com.txt
herbstfest-rosenheim.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
hersfelder-zeitung.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
hespress.com.txt
hessen.de.txt Update hessen.de.txt (#1179) 2023-08-12 10:19:02 +02:00
hessenschau.de.txt Add new strip rule for Hessenschau app link (#1960) 2026-05-30 10:56:28 +02:00
higcapital.com.txt
highscalability.com.txt
hiiraan.com.txt
hillstreetgrocer.com.txt
hindustantimes.com.txt fix: hindustantimes rules (#1901) 2026-02-26 18:41:59 +01:00
hiperpop.com.txt
hipertextual.com.txt
hiphopleeft.nl.txt
histoire-filante.fr.txt
histoire.presse.fr.txt
historic-uk.com.txt
historytoday.com.txt
hln.be.txt
hmercer.com.txt
hna.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
hochheimer-zeitung.de.txt Update hochheimer-zeitung.de.txt 2023-10-15 15:02:12 +02:00
hodinkee.com.txt Update hodinkee.com.txt 2022-01-16 13:01:15 +01:00
hollywoodlife.com.txt
homeofsports.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
hometheaterreview.com.txt
hosted.ap.org.txt
hosted2.ap.org.txt
houstonchronicle.com.txt
howtogeek.com.txt Update howtogeek.com.txt (#1487) 2024-11-09 09:25:03 +01:00
hpd.de.txt Create hpd.de.txt (#1439) 2024-10-10 22:13:26 +02:00
hs.fi.txt Update hs.fi (#1710) 2025-07-14 20:11:24 +02:00
ht.ly.txt
huffingtonpost.co.uk.txt
huffingtonpost.fr.txt Update huffingtonpost.fr.txt 2025-07-24 14:22:13 +02:00
huffpost.com.txt
humanite.fr.txt
humantransit.org.txt
hurriyet.com.tr.txt
hvg.hu.txt
hypebeast.com.txt
ianlewis.org.txt
iansommerville.com.txt
icannabis.tumblr.com.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
ichkoche.at.txt Create ichkoche.at.txt (#1324) 2024-01-28 10:26:58 +01:00
ici.radio-canada.ca.txt Update ici.radio-canada.ca.txt (#1732) 2025-08-02 15:28:36 +02:00
idealog.co.nz.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
idlewords.com.txt
ieeexplore.ieee.org.txt Update ieeexplore.ieee.org.txt 2023-10-15 08:15:49 +02:00
ietf.org.txt Create ietf.org.txt 2023-11-04 10:46:47 +01:00
igen.fr.txt fix: update MacG websites configuration (#1941) 2026-05-10 09:39:30 +02:00
igeneration.fr.txt
ikz-online.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ilounge.com.txt
ilsoftware.it.txt Ilsoftware (#1645) 2025-06-07 12:59:11 +02:00
ilyabirman.ru.txt
immub.org.txt Update immub.org.txt (#1518) 2024-12-05 14:20:13 +01:00
imore.com.txt Update imore.com.txt 2023-04-02 00:38:51 +02:00
in-muenchen.de.txt changed 57 files for ippen.media sites (#1383) 2024-06-05 12:06:20 +02:00
inc.com.txt Update inc.com.txt (#1703) 2025-07-05 06:51:08 +02:00
indehekken.net.txt Add indehekken.net 2016-04-10 20:44:17 +03:00
independent.co.uk.txt Update independent.co.uk.txt 2021-08-29 01:24:25 +02:00
indiatimes.com.txt
indiehackers.com.txt
indiewire.com.txt Update indiewire.com.txt (#1739) 2025-08-08 13:35:53 +02:00
indiscreto.org.txt Create indiscreto.org.txt (#1655) 2025-06-11 17:23:59 +02:00
inessential.com.txt Initial commit 2013-02-27 23:43:10 +01:00
infolibre.es.txt Add infolibre.es (#970) 2022-05-18 06:58:32 +02:00
infoq.com.txt
informador.com.mx.txt
information.dk.txt
informationarchitects.net.txt
informationclearinghouse.info.txt
informit.com.txt
infovaticana.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
infoworld.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
infzm.com.txt Initial commit 2013-02-27 23:43:10 +01:00
ingame.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
inhabitat.com.txt
innsalzach24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
inquirer.com.txt Update inquirer.com.txt 2021-04-09 11:31:07 +02:00
inquirer.net.txt Update inquirer.net.txt 2024-06-07 12:32:00 +02:00
instagr.am.txt
instagram.com.txt Create instagram.com.txt (#1675) 2025-06-19 07:34:49 +02:00
instructables.com.txt Update instructables.com.txt 2021-03-05 19:42:36 +01:00
insuedthueringen.de.txt add config for insuedthueringen.de (#1086) 2023-06-09 06:17:39 +02:00
intelligenceonline.fr.txt
interconnected.org.txt Create interconnected.org.txt (#1481) 2024-11-04 02:01:46 +01:00
interestingengineering.com.txt add config for interestingengineering.com (#986) 2022-08-12 00:22:22 +02:00
intern-mag.com.txt Create intern-mag.com.txt 2022-02-04 00:26:54 +01:00
interviewmagazine.com.txt
investigation.rollingstone.com.txt Update investigation.rollingstone.com.txt 2023-02-05 20:40:27 +01:00
investopedia.com.txt Update investopedia.com.txt 2021-06-20 18:26:05 +02:00
inwestomat.eu.txt Create inwestomat.eu.txt (#1057) 2023-02-22 07:55:07 +01:00
ipadclub.nl.txt
ipadplanet.nl.txt
iphon.fr.txt Update iphon.fr.txt 2023-09-21 10:42:33 +02:00
iphoneaddict.fr.txt
iphoneclub.nl.txt
iphonehacks.com.txt
iphonetweak.fr.txt
iplaysoft.com.txt
ishadeed.com.txt Create ishadeed.com (#931) 2022-02-16 22:00:35 +01:00
iso.500px.com.txt
isource.com.txt
ispatguru.com.txt Create ispatguru.com.txt (#1311) 2024-01-19 08:36:06 +01:00
it-connect.fr.txt
italpassion.fr.txt add 3 files (#1829) 2025-12-23 11:30:11 +01:00
itavisen.no.txt
itmedia.co.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
itnews.com.au.txt
itsfoss.com.txt fix: itsfoss rules (#1895) 2026-02-24 15:39:39 +01:00
itstactical.com.txt
itunes.apple.com.txt
itwire.com.txt Create itwire.com.txt 2013-05-29 11:19:53 +10:00
izismile.com.txt
jack-vanlightly.com.txt Add initial content for jack-vanlightly.com analysis (#1836) 2026-01-04 08:42:41 +01:00
jacobin.com.br.txt Create scraping configuration for jacobin.com.br (#1780) 2025-10-26 21:25:03 +01:00
jacobin.com.txt Update and rename jacobinmag.com.txt to jacobin.com.txt 2022-10-20 17:01:24 +02:00
jacobnordby.com.txt add jacobnordby.com (#1795) 2025-11-29 19:23:23 +01:00
jalopnik.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
jamesclear.com.txt
jameslandrith.com.txt
jamieoliver.com.txt
jandan.net.txt
japoninfos.com.txt
javascript.plainenglish.io.txt Create javascript.plainenglish.io.txt (#1146) 2023-07-12 06:27:41 +02:00
jbpress.ismedia.jp.txt add two sites (#1832) 2025-12-29 14:56:41 +01:00
jdubuzz.com.txt
je-suis-papa.com.txt
jesuisundev.com.txt
jetzt.de.txt
jetzt.sueddeutsche.de.txt
jeuxvideo.com.txt Modify extraction rules for jeuxvideo (#1927) 2026-04-01 08:02:35 +02:00
jezebel.com.txt Update all Kinja websites (#579) 2018-12-19 02:43:28 +01:00
jjahnke.net.txt Initial commit 2013-02-27 23:43:10 +01:00
jneurosci.org.txt
jobbank.gc.ca.txt
joelonsoftware.com.txt
johannesbader.ch.txt
johnnysgamelogs.fr.txt
jollinger.com.txt Create jollinger.com.txt 2021-01-06 16:57:26 +01:00
journal.markusthoma.com.txt add journal.markusthoma.com (#1304) 2024-01-11 02:36:36 +01:00
journaldugamer.com.txt
journaldugeek.com.txt
journals.biologists.com.txt Create journals.biologists.com.txt (#1693) 2025-07-02 16:18:45 +02:00
journals.plos.org.txt
journals.sagepub.com.txt Update journals.sagepub.com.txt 2021-04-02 15:40:21 +02:00
joystiq.com.txt
jp.motorsport.com.txt add jp.motorsport.com (#1818) 2025-12-14 13:13:49 +01:00
jp.reuters.com.txt Update xenospectrum.com add newsphere.jp jp.reuters.com (#1847) 2026-01-11 16:34:29 +01:00
jpmens.net.txt Create jpmens.net.txt (#1182) 2023-08-20 12:59:00 +02:00
jsforcats.com.txt
juedische-allgemeine.de.txt Update juedische-allgemeine.de.txt (#1951) 2026-05-21 11:57:38 +02:00
juejin.cn.txt Add juejin.cn config (#916) 2021-11-22 07:04:30 +01:00
juliareda.eu.txt
julieandrieu.com.txt
jungle-world.com.txt Backport site_config changes from wallabag v1 2015-12-31 18:13:20 +01:00
juppy.org.txt
jvns.ca.txt Create jvns.ca.txt (#1587) 2025-04-15 08:59:21 +02:00
jvt.me.txt
kachestvo.ru.txt
kathimerini.gr.txt
kattascha.de.txt
kb.mailbox.org.txt
kenfm.de.txt
kenrockwell.com.txt Initial commit 2013-02-27 23:43:10 +01:00
keyboardmag.com.txt Configs for keyboardmag.com and lenta.ru (#182) 2016-07-17 14:20:58 +02:00
keycloak.org.txt Create keycloak.org.txt 2021-02-01 13:45:32 +01:00
kicker.de.txt
kickstarter.com.txt
kinder-verstehen.de.txt
kingarthurflour.com.txt
kingstonist.com.txt Create kingstonist.com.txt (#1515) 2024-12-05 10:01:12 +01:00
kingz.fr.txt Create kingz.fr.txt (#483) 2018-08-04 12:10:41 +02:00
kinocheck.de.txt Create kinocheck.de.txt (#1730) 2025-07-31 21:08:29 +02:00
klimareporter.de.txt fix: klimareporter.de rules (#1902) 2026-02-26 18:43:04 +01:00
knoten-stadion.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
knowablemagazine.org.txt Create knowablemagazine.org.txt (#1531) 2024-12-16 21:44:01 +01:00
ko-fi.com.txt Add cohost.org ko-fi.com and pcgamer.com (#1364) 2024-04-13 16:18:23 +02:00
kochbar.de.txt fix: kochbar body rule (#1904) 2026-02-26 18:45:54 +01:00
kommersant.ru.txt Update kommersant.ru.txt 2023-11-16 15:50:08 +01:00
kont.me.txt Add a custom user agent to retrieve kont.me (#872) 2021-04-08 19:06:16 +02:00
korben.info.txt Update korben.info.txt (#1729) 2025-07-30 20:55:48 +02:00
kotaku.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
kottke.org.txt Initial commit 2013-02-27 23:43:10 +01:00
kqed.org.txt Create kqed.org.txt (#1496) 2024-11-16 11:24:48 +01:00
krautreporter.de.txt Add site configs for republik.ch, forum-geldpolitik.ch, wendezeit.ch (#1963) 2026-06-06 18:34:41 +02:00
krebsonsecurity.com.txt
kreis-anzeiger.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
kreisbote.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
kreiszeitung.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
kresus.org.txt
kriswrites.com.txt Create kriswrites.com.txt (#1144) 2023-07-12 06:28:58 +02:00
krone.at.txt
krzbb.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
kuemmerle.name.txt Create kuemmerle.name.txt (#1448) 2024-10-19 04:03:39 +02:00
kulturegeek.fr.txt
kumailplus.com.txt
kumb.com.txt
kurier.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
kurierverlag.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
kwerfeldein.de.txt
kyoko-np.net.txt add two sites (#1832) 2025-12-29 14:56:41 +01:00
labs.bishopfox.com.txt Config for blogposts at labs.bishopfox.com (#853) 2021-01-23 11:46:37 +01:00
labs.mwrinfosecurity.com.txt
labs.ripe.net.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
lactualite.com.txt Update lactualite.com.txt (#1716) 2025-07-18 09:37:35 +02:00
lado.mx.txt Create lado.mx.txt 2024-02-12 16:09:33 -06:00
lalettrea.fr.txt
lalibre.be.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
lamontagne.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
lampertheimer-zeitung.de.txt Update lampertheimer-zeitung.de.txt 2023-10-15 15:02:55 +02:00
landetsfria.se.txt
landtiere.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
lapetiteokara.fr.txt Extract content structure from lapetiteokara.fr (#1837) 2026-01-04 09:20:28 +01:00
laphamsquarterly.org.txt Update laphamsquarterly.org.txt 2021-12-19 11:54:06 +01:00
lapin-blanc.blogs.docteo.net.txt Backport site_config changes from wallabag v1 2015-12-31 18:13:20 +01:00
lapresse.ca.txt Update XPath selectors and test URLs in lapresse.ca.txt (#1853) 2026-01-24 09:16:45 +01:00
lapresselibre.info.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
laquadrature.net.txt
laravel-france.com.txt feat: add laravel-france.com configuration (#1938) 2026-05-10 08:47:20 +02:00
larep.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
lareviewofbooks.org.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
larevuedesmedias.ina.fr.txt Add larevuedesmedias.ina.fr (#849) 2021-01-17 23:04:51 +01:00
lasemainedelallier.fr.txt feat: add lasemainedelallier.fr configuration (#1939) 2026-05-10 07:42:50 +02:00
latimes.com.txt Update latimes.com.txt 2020-10-24 11:47:58 +02:00
laughingsquid.com.txt
lauterbacher-anzeiger.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
lavenir.net.txt Add support for lavenir.net (#1871) 2026-02-20 07:30:17 +01:00
lawfareblog.com.txt
le-pays.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
leancrew.com.txt
learn.microsoft.com.txt fix: msdn rules (#1903) 2026-02-26 18:45:06 +01:00
learnexperts.ai.txt Add configuration for learnexperts.ai scraping (#1932) 2026-05-04 09:40:39 +02:00
leb.fbi.gov.txt Update leb.fbi.gov.txt 2022-11-09 00:42:06 +01:00
leberry.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
leblogduhacker.fr.txt
lececil.org.txt
lechorepublicain.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
lecker.de.txt
ledauphine.com.txt Create ledauphine.com.txt 2024-06-09 22:50:39 +02:00
ledoc-info.com.txt
leereamsnyder.com.txt Create leereamsnyder.com.txt (#1425) 2024-08-24 10:26:56 +02:00
lefigaro.fr.txt Update lefigaro.fr.txt 2025-02-19 17:01:04 +01:00
lefilrouge.media.txt
legrandcontinent.eu.txt XPath updates for legrandcontinent.eu (#1793) 2025-11-25 07:30:03 +01:00
lehollandaisvolant.net.txt
leinetal24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
lejdc.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
lejournal.cnrs.fr.txt
lemmy.ml.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
lemonde.fr.txt Update lemonde.fr.txt (#1915) 2026-03-18 16:11:57 +01:00
lenta.ru.txt
leon.jp.txt add leon.jp (#1826) 2025-12-20 11:47:29 +01:00
lepoint.fr.txt
lepopulaire.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
lequatreheures.com.txt
lequipe.fr.txt Add lequipe.fr (#827) 2020-11-17 12:03:05 +01:00
lesecolohumanistes.fr.txt Add lesecolohumanistes.fr.txt (#877) 2021-05-04 08:58:30 +02:00
lesjours.fr.txt Lesjours (#1533) 2024-12-17 15:53:04 +01:00
lesnumeriques.com.txt Fix lesnumeriques.com (#1723) 2025-07-26 09:59:54 +02:00
lesoir.be.txt
lesprosdelapetiteenfance.fr.txt Create lesprosdelapetiteenfance.fr.txt 2023-08-25 09:49:18 +02:00
lesswrong.com.txt Create lesswrong.com.txt 2021-01-22 22:24:10 +01:00
letraslibres.com.txt
leveil.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
lexpress.fr.txt Create lexpress.fr.txt (#1440) 2024-10-10 22:41:03 +02:00
lezephyrmag.com.txt Create lezephyrmag.com.txt (#485) 2018-08-04 12:11:19 +02:00
libcom.org.txt
liberation.fr.txt Update liberation.fr.txt 2025-10-13 10:43:44 +02:00
LICENSE.txt LICENSE.txt added (public domain) 2016-12-02 12:06:13 +01:00
lifeclub.org.txt Create lifeclub.org.txt (#1127) 2023-07-03 09:28:58 +02:00
lifehack.org.txt Create lifehack.org.txt 2019-01-09 00:40:59 +01:00
lifehacker.com.txt Update lifehacker.com.txt (#1676) 2025-06-19 19:21:50 +02:00
lifehacker.ru.txt Modify body selector and prevent indentation (#1784) 2025-11-06 09:13:32 +01:00
lifestyle.inquirer.net.txt
lifeweek.com.cn.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
lightreading.com.txt add site config for lightreading.com (#851) 2021-01-19 14:59:05 +01:00
limo.media.txt add limo.media.txt and news.infoseek.co.jp.txt (#1863) 2026-02-02 20:03:59 +01:00
limprevu.fr.txt
link.springer.com.txt Link.springer (#1325) 2024-01-28 17:38:34 +01:00
linkedin.com.txt Modify LinkedIn scraping configuration (#1802) 2025-12-04 13:07:08 +01:00
linux-community.de.txt Update linux-community.de.txt (#1108) 2023-06-19 16:57:42 +02:00
linux-magazin.de.txt update linux-magazin.de config (#1011) 2022-11-15 04:54:11 +01:00
linux.com.txt
linuxconfig.org.txt Create linuxconfig.org.txt (#1323) 2024-01-28 09:44:57 +01:00
linuxjournal.com.txt
linuxnix.com.txt
literaryreview.co.uk.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
lithub.com.txt Update XPath for article metadata extraction (#1910) 2026-03-04 01:03:44 +01:00
livescience.com.txt Cleanup livescience.com.txt (#848) 2021-01-17 23:04:26 +01:00
longform.org.txt Initial commit 2013-02-27 23:43:10 +01:00
longreads.com.txt
longreads.tni.org.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
loopinsight.com.txt
lostgarden.com.txt
lotro.com.txt Update lotro.com.txt (#1686) 2025-06-27 15:50:05 +02:00
loudersound.com.txt Create loudersound.com.txt (#1738) 2025-08-08 13:34:34 +02:00
lowtechmagazine.com.txt servethehome and lowtechmagazine (#208) 2016-10-11 23:23:05 +02:00
lrb.co.uk.txt Update lrb.co.uk.txt 2020-11-05 15:40:04 +01:00
ludwigshafen24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
lukew.com.txt
luminous-landscape.com.txt Initial commit 2013-02-27 23:43:10 +01:00
lupa.cz.txt
lux-magazine.com.txt Create lux-magazine.com.txt 2021-05-19 22:42:34 +02:00
luxuo.com.txt
lvsl.fr.txt
lwlies.com.txt
lwn.net.txt Fix lwn.net login (#1529) 2024-12-14 13:07:34 +01:00
lynalden.com.txt Create lynalden.com.txt (#1507) 2024-11-27 01:02:04 +01:00
lyonne.fr.txt feat: add Centre France newspapers configurations (#1940) 2026-05-10 09:08:58 +02:00
m.bbc.co.uk.txt
m.douban.com.txt
m.dw.com.txt
m.facebook.com.txt
m.theregister.co.uk.txt
m.wikihow.com.txt
m.xkcd.com.txt
m00natic.github.io.txt
mac4ever.com.txt Update mac4ever.com.txt 2023-01-04 15:16:07 +01:00
macdrifter.com.txt
macg.co.txt fix: update MacG websites configuration (#1941) 2026-05-10 09:39:30 +02:00
macmagazine.com.br.txt
macrumors.com.txt
macstories.net.txt
mactalk.com.au.txt
mactechnews.de.txt
macworld.com.txt
mailchi.mp.txt Update mailchi.mp.txt 2024-10-20 15:38:40 +02:00
main-spitze.de.txt Update main-spitze.de.txt 2023-10-15 15:06:48 +02:00
mainichi.jp.txt add as-web.jp.txt and mainichi.jp.txt (#1822) 2025-12-17 20:01:19 +01:00
mainpost.de.txt
maitre-eolas.fr.txt
make.wordpress.org.txt Create make.wordpress.org.txt 2020-12-29 01:29:14 +01:00
makramayache.com.txt Create www.makramayache.com.txt (#1524) 2024-12-10 22:36:18 +01:00
malekal.com.txt Create malekal.com.txt (#1214) 2023-10-02 06:14:13 +02:00
manager-magazin.de.txt Update manager-magazin.de.txt (#1320) 2024-01-26 19:59:13 +01:00
manager.co.th.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
manga-news.com.txt
mangfall24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
mannheim24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
marciniwuc.com.txt Create marciniwuc.com.txt (#1058) 2023-02-22 13:38:14 +01:00
marco.org.txt
marcobehler.com.txt Create marcobehler.com.txt (#1164) 2023-07-21 06:25:26 +02:00
marcvidal.net.txt
marginalrevolution.com.txt Create marginalrevolution.com.txt (#1563) 2025-03-04 22:11:37 +01:00
marigold.cz.txt
maritimedanmark.dk.txt Create maritimedanmark.dk.txt 2024-06-06 11:27:10 +02:00
marketresearchdirect.com.txt
markmanson.net.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
marksdailyapple.com.txt
marktechpost.com.txt Create marktechpost.com.txt (#1449) 2024-10-19 22:12:21 +02:00
marmiton.org.txt
marriedtothesea.com.txt
marsactu.fr.txt
martinfowler.com.txt Initial commit 2013-02-27 23:43:10 +01:00
mashable.com.txt
matija.suklje.name.txt Create matija.suklje.name.txt (#1486) 2024-11-09 09:18:08 +01:00
matt.might.net.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
mattcutts.com.txt
matthewball.co.txt Entheogenesis (#1430) 2024-09-02 14:29:31 +02:00
maxim.com.txt
mbari.org.txt
mbk-news.appspot.com.txt
mbl.is.txt Initial commit 2013-02-27 23:43:10 +01:00
mccarthy.ca.txt Create mccarthy.ca.txt 2025-07-31 15:05:33 +02:00
mcconnellsmedchem.com.txt Create mcconnellsmedchem.com.txt (#1577) 2025-03-30 07:17:51 +02:00
mcorbin.fr.txt Add mcorbin.fr configuration (#1189) 2023-08-22 00:16:41 +02:00
mdpi.com.txt
mdr.de.txt add siteconfig for mdr.de (#527) 2018-09-17 17:19:49 +02:00
mebedo.de.txt
mediacites.fr.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
medialens.org.txt Update medialens.org.txt 2021-05-09 15:32:13 +02:00
mediapart.fr.txt Updated mediapart.fr login and added new stripping rules (#1929) 2026-04-18 19:54:09 +02:00
medium.com.txt Medium.com (#1169) 2023-07-25 06:45:51 +02:00
medscape.com.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
meduza.io.txt
megamp3.eu.txt
mein-hbf-ffm.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
mein-mmo.de.txt
meine-anzeigenzeitung.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
mentalfloss.com.txt
meowni.ca.txt
mercatornet.com.txt Create mercatornet.com.txt (#1329) 2024-02-02 18:37:29 +01:00
mercurynews.com.txt
mereorthodoxy.com.txt Add scraping configuration for mereorthodoxy.com (#1840) 2026-01-07 04:29:22 +01:00
merkmal-biz.jp.txt add merkmal-biz.jp and go2senkyo.com (#1934) 2026-05-04 14:30:06 +02:00
merkur.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
merkurist.de.txt Update merkurist.de.txt 2023-10-08 10:51:03 +02:00
mesec.cz.txt Update mesec.cz.txt 2017-02-09 23:38:35 +01:00
metafilter.com.txt metafilter.com 2013-05-16 11:46:28 +02:00
metro.co.uk.txt Create metro.co.uk.txt (#1129) 2023-07-03 09:29:37 +02:00
metrocop.net.txt
mforum.cari.com.my.txt
miamiherald.com.txt
microsiervos.com.txt Add microsiervos.com (#1006) 2022-11-02 09:58:44 +01:00
middleeasteye.net.txt Update middleeasteye.net.txt 2023-06-02 13:28:55 +02:00
mikeash.com.txt
mikeindustries.com.txt
milanocittastato.it.txt Create milanocittastato.it.txt (#1671) 2025-06-17 07:35:08 +02:00
minnesota.publicradio.org.txt
minnpost.com.txt
mintpressnews.com.txt
miops.com.txt Add miops.com.txt (#974) 2022-06-01 16:21:58 +02:00
mirrorfootball.co.uk.txt
mises.org.txt
missnumerique.com.txt Create missnumerique.com.txt (#1015) 2022-11-28 22:49:25 +01:00
mithatkonar.com.txt
mitie.com.txt Create mitie.com.txt 2021-09-11 18:35:52 +02:00
mittelhessen.de.txt Update mittelhessen.de.txt 2023-10-15 15:07:14 +02:00
mlb.sbnation.com.txt Initial commit 2013-02-27 23:43:10 +01:00
mlssoccer.com.txt
mmo-champion.com.txt
mnn.com.txt
mno.hu.txt
mobile.lemondeinformatique.fr.txt
mobile.nytimes.com.txt Nyt (#1427) 2024-08-28 14:10:59 +02:00
mobile.twitter.com.txt twitter.com: fix content fetching using custom UA (#837) 2020-12-28 18:22:14 +01:00
mobilegeeks.de.txt
mobilenet.cz.txt
mobileopportunity.blogspot.com.txt
mobilmania.cz.txt
modernghana.com.txt
momentumsaga.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
moncarnet.com.txt Update moncarnet.com.txt (#1731) 2025-08-01 01:32:29 +02:00
mondayvatican.com.txt Create mondayvatican.com.txt (#1943) 2026-05-14 08:10:49 +02:00
monde-diplomatique.fr.txt Update monde-diplomatique.fr.txt (#1727) 2025-07-30 17:30:13 +02:00
money.cnn.com.txt
moneymorning.com.txt Add scraping rules for Money Morning article 2026-06-09 17:12:12 +02:00
moneysavingexpert.com.txt
monkeyuser.com.txt
monkeyzen.com.txt
montelimar-news.fr.txt
moo.nac.uci.edu.txt
moonsault.de.txt
morgenpost.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
mothering.com.txt
motherjones.com.txt
moto-net.com.txt
motorcyclistonline.com.txt
motorfull.com.txt
motorsport-magazin.com.txt Update motorsport-magazin.com.txt (#1461) 2024-10-27 17:09:51 +01:00
movie.douban.com.txt
mp.weixin.qq.com.txt Update mp.weixin.qq.com.txt (#1629) 2025-05-27 09:03:02 +02:00
msdvetmanual.com.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
msn.com.txt Update msn.com.txt (#1346) 2024-03-05 10:35:34 +01:00
msnbc.msn.com.txt
mtlblog.com.txt
muenster.de.txt Create muenster.de.txt (#711) 2019-11-14 21:46:49 +01:00
multinationales.org.txt Fix multinationales.org.txt (#1773) 2025-10-10 20:46:32 +02:00
muse.jhu.edu.txt muse.jhu.edu.txt added for the journal of democracy (#1548) 2025-01-10 19:43:52 +01:00
muycomputerpro.com.txt
muyinteresante.com.txt Fix w3349 (#1460) 2024-10-26 22:05:39 +02:00
muyinteresante.es.txt muyinteresante.es (#425) 2018-05-13 14:05:10 +02:00
muylinux.com.txt
mymodernmet.com.txt Add mymodernmet.com (#828) 2020-11-17 12:03:29 +01:00
myrecipes.com.txt
mysqlblog.fivefarmers.com.txt add config for mysqlblog.fivefarmers.com (#987) 2022-08-12 00:42:33 +02:00
mytotalretail.com.txt
n-tv.de.txt
n.survol.fr.txt
nachdenkseiten.de.txt Updated nachdenkseiten.de.txt (#683) 2019-09-17 16:03:39 +02:00
nachrichten.at.txt
naiz.eus.txt Correction to naiz.eus 2015-10-15 15:42:12 +02:00
najlepsze-ksiazki.pl.txt Create najlepsze-ksiazki.pl.txt (#1171) 2023-07-25 06:44:37 +02:00
nakedsecurity.sophos.com.txt Add nakedsecurity.sophos.com 2016-04-08 18:54:08 +03:00
narratively.com.txt Update narratively.com.txt 2021-12-30 02:22:40 +01:00
nasa.gov.txt
natalie.mu.txt add .watch.impress.co.jp.txt and fujinkoron.jp.txt (#1917) 2026-03-20 14:43:02 +01:00
nationalgeographic.de.txt Update nationalgeographic.de.txt 2021-10-29 00:22:12 +02:00
nationalpost.com.txt Update nationalpost.com.txt 2025-06-25 18:00:47 -04:00
nationalreview.com.txt Update nationalreview.com.txt (#1106) 2023-06-19 14:17:50 +02:00
natura-sciences.com.txt
nature.com.txt Update nature.com.txt (#1567) 2025-03-11 15:30:49 +01:00
nbnnews.com.au.txt
ncbi.nlm.nih.gov.txt Update ncbi.nlm.nih.gov.txt (#1295) 2024-01-07 06:11:06 +01:00
nejm.org.txt Update nejm.org.txt (#1362) 2024-04-09 11:38:19 +02:00
nerdy.dev.txt Rename custom/nerdy.dev.txt to nerdy.dev.txt (#1684) 2025-06-25 23:20:47 +02:00
net-security.org.txt
netflixtechblog.com.txt Create netflixtechblog.com.txt (#1147) 2023-07-12 06:27:21 +02:00
netmagazine.com.txt
networkworld.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
netzoekonom.de.txt
netzpolitik.org.txt Update netzpolitik.org.txt (#1825) 2025-12-19 14:08:55 +01:00
neues-deutschland.de.txt
neunetz.com.txt
newcriterion.com.txt Create newcriterion.com.txt 2020-12-01 14:07:28 +01:00
newlinesmag.com.txt Update extraction rules (#1925) 2026-03-29 15:29:41 +02:00
newmedia.calcalist.co.il.txt Create newmedia.calcalist.co.il.txt (#1083) 2023-06-02 13:22:32 +02:00
newrepublic.com.txt Update newrepublic.com.txt 2022-04-09 11:25:28 +02:00
news.bayern.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
news.cnet.com.txt Initial commit 2013-02-27 23:43:10 +01:00
news.com.au.txt Update news.com.au.txt (#1510) 2024-12-01 09:10:11 +01:00
news.detik.com.txt
news.google.com.txt Update news.google.com.txt 2023-02-16 15:58:36 +01:00
news.infoseek.co.jp.txt add limo.media.txt and news.infoseek.co.jp.txt (#1863) 2026-02-02 20:03:59 +01:00
news.jp.txt add news.jp.txt and web.gekisaka.jp.txt (#1820) 2025-12-16 14:43:32 +01:00
news.mynavi.jp.txt
news.pixelistes.com.txt
news.rambler.ru.txt
news.rub.de.txt
news.techmeme.com.txt
news.yahoo.co.jp.txt add news.yahoo.co.jp.txt (#1817) 2025-12-14 09:53:45 +01:00
news.ycombinator.com.txt Update news.ycombinator.com.txt (#1276) 2023-12-22 07:11:42 +01:00
news247.gr.txt
newsbomb.gr.txt Initial commit 2013-02-27 23:43:10 +01:00
newsinfo.inquirer.net.txt Update newsinfo.inquirer.net.txt 2024-06-07 12:31:54 +02:00
newsletter.pragmaticengineer.com.txt Create newsletter.pragmaticengineer.com.txt (#1174) 2023-07-25 16:48:09 +02:00
newsphere.jp.txt Update xenospectrum.com add newsphere.jp jp.reuters.com (#1847) 2026-01-11 16:34:29 +01:00
newstatesman.com.txt Update newstatesman.com.txt 2022-08-19 23:53:59 +02:00
newsunspun.org.txt
newsweek.com.txt
newswise.com.txt
newtimesslo.com.txt Create newtimesslo.com.txt (#1592) 2025-04-17 22:39:29 +02:00
newyorkaktuell.nyc.txt Add body field to newyorkaktuell.nyc.txt (#1913) 2026-03-17 20:52:01 +01:00
newyorker.com.txt Update newyorker.com.txt (#1249) 2023-11-17 20:47:20 +01:00
next.ink.txt Fix image lazy loading on next.ink (#1948) 2026-05-19 05:12:57 +02:00
nextcloud.com.txt
nextdraft.com.txt add nextdraft.com (#1101) 2023-06-19 09:07:15 +02:00
nextg.tv.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
nf-farn.de.txt Create nf-farn.de.txt 2022-01-16 12:58:32 +01:00
nfl.com.txt Create nfl.com.txt 2020-08-24 02:29:29 +02:00
nicj.net.txt Added nicj.net.txt (#826) 2020-10-27 09:30:45 +01:00
nifi.apache.org.txt Create nifi.apache.org.txt (#1482) 2024-11-05 09:01:36 +01:00
nikkei.com.txt add .watch.impress.co.jp.txt and fujinkoron.jp.txt (#1917) 2026-03-20 14:43:02 +01:00
nintendoworldreport.com.txt
nitter.net.txt Create nitter.net.txt (#1313) 2024-01-21 10:12:06 +01:00
nj.com.txt
noidea.dog.txt Create noidea.dog.txt (#1142) 2023-07-12 06:29:20 +02:00
nojesguiden.se.txt
nordmainische-s-bahn.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
northumberlandview.ca.txt
nos.nl.txt Create nos.nl (#1116) 2023-06-26 06:32:47 +02:00
nosalty.hu.txt
nota-bene.org.txt
notebookcheck.net.txt Update notebookcheck.net.txt (#1735) 2025-08-06 15:43:35 +02:00
notimx.mx.txt Create notimx.mx.txt 2024-02-12 16:29:32 -06:00
nouvelobs.com.txt Update nouvelobs.com.txt 2024-06-14 11:28:14 +02:00
novastan.org.txt
novinky.cz.txt
np-coburg.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
nplusonemag.com.txt Update nplusonemag.com.txt 2022-07-14 22:20:35 -04:00
npr.org.txt
nrc.nl.txt
nrz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ntoskrnl.org.txt
number.bunshun.jp.txt add number.bunshun.jp.txt and taxlabor.com.txt (#1833) 2026-01-03 22:18:54 +01:00
numerama.com.txt Update numerama.com.txt 2025-09-11 11:38:25 +02:00
nybooks.com.txt Update nybooks.com.txt 2020-10-24 12:35:05 +02:00
nymag.com.txt fix: nymag.com.txt (#1810) 2025-12-08 17:05:58 +01:00
nyra.nyc.txt Add configuration for nyra.nyc article scraping (#1859) 2026-01-28 18:44:13 +01:00
nytimes.com.txt Update nytimes.com.txt 2025-07-24 15:22:39 +02:00
nzz.ch.txt Remove "Optimize your browser" text (#1558) 2025-02-26 06:53:44 +01:00
o6asan.com.txt
oberhessische-zeitung.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
observers.france24.com.txt
ocu.org.txt
off.net.mk.txt Initial commit 2013-02-27 23:43:10 +01:00
oko.press.txt Update oko.press.txt (#1585) 2025-04-11 08:11:20 +02:00
oktoberfest.bayern.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
oltnertagblatt.ch.txt Create oltnertagblatt.ch.txt (#1201) 2023-09-05 15:01:08 +02:00
omgubuntu.co.uk.txt
omiliya.org.txt
onb.ac.at.txt Update onb.ac.at.txt (#1708) 2025-07-12 15:13:20 +02:00
oncletom.io.txt
onlinewelten.com.txt
ontologicalgeek.com.txt
op-online.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
open.online.txt Create open.online.txt (#1122) 2023-07-03 09:27:41 +02:00
openai.com.txt Update openai.com.txt (#1431) 2024-09-15 15:57:16 +02:00
opendemocracy.net.txt Create opendemocracy.net.txt 2021-11-01 22:53:35 +01:00
opensource.com.txt add config for opensource.com (#1093) 2023-06-16 21:49:51 +02:00
opensource.org.txt
openstreetmap.org.txt
openthemagazine.com.txt Initial commit 2013-02-27 23:43:10 +01:00
optimizesmart.com.txt Create optimizesmart.com.txt 2022-08-20 10:30:54 +02:00
orf.at.txt
orientxxi.info.txt
origo.hu.txt
oschina.net.txt
osmand.net.txt
osmc.tv.txt
ostechnix.com.txt Create ostechnix.com.txt 2021-02-01 14:27:15 +01:00
ostprog.de.txt Update ostprog.de.txt 2022-10-01 23:53:28 +02:00
otz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ourworldindata.org.txt ourworldindata.org file config (#366) 2017-11-27 11:06:26 +00:00
outsideonline.com.txt Update outsideonline.com.txt 2021-10-29 00:13:19 +02:00
ovb-online.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
overreacted.io.txt Update overreacted.io.txt 2025-10-11 14:06:05 +02:00
oxfordamerican.org.txt
paddle.com.txt Create paddle.com.txt 2024-05-16 16:11:14 +02:00
pagenotfound.cz.txt Update pagenotfound.cz.txt (#1601) 2025-04-29 13:05:38 +02:00
palmbeachpost.com.txt
pandemicequityinitiative.com.txt Shtrom 2024 01 (#1293) 2024-01-07 06:05:15 +01:00
pandodaily.com.txt
panewslab.com.txt Add article content selector and test URL (#1964) 2026-06-09 23:08:51 +02:00
panic.com.txt
paperpaper.ru.txt
papertohtml.org.txt Update papertohtml.org.txt 2024-12-26 13:19:48 +01:00
papodehomem.com.br.txt
paquier.xyz.txt
parislemon.com.txt
parliament.uk.txt
parlimen.gov.my.txt Create parlimen.gov.my.txt (#1687) 2025-06-27 16:52:36 +02:00
parool.nl.txt
pastebin.com.txt
pastepad.fivefilters.org.txt
pathawks.com.txt
patreon.com.txt Update patreon.com.txt (#1488) 2024-11-10 16:30:37 +01:00
pcgamer.com.txt Add cohost.org ko-fi.com and pcgamer.com (#1364) 2024-04-13 16:18:23 +02:00
pcmag.com.txt Update pcmag.com.txt 2023-03-29 23:24:28 +02:00
pcworld.com.txt Update pcworld.com.txt (#1681) 2025-06-24 17:20:41 +02:00
penny-arcade.com.txt Initial commit 2013-02-27 23:43:10 +01:00
pentaxforums.com.txt Initial commit 2013-02-27 23:43:10 +01:00
peoplesdispatch.org.txt Create peoplesdispatch.org.txt (#1418) 2024-08-08 14:29:29 +02:00
perell.com.txt
perspective-daily.de.txt Fix perspective-daily.de pixelated preview thumbnails (#1962) 2026-06-06 18:32:38 +02:00
pestemag.com.txt Create pestemag.com.txt 2022-10-16 14:51:20 +02:00
petbook.de.txt Add petbook.de scraping configuration (#1965) 2026-06-09 23:44:37 +02:00
pfefferminzia.de.txt Create pfefferminzia.de.txt (#1068) 2023-03-19 20:47:56 +01:00
pflegen-online.de.txt Create pflegen-online.de.txt (#1268) 2023-12-13 13:58:52 +01:00
pharmazeutische-zeitung.de.txt Update pharmazeutische-zeitung.de.txt (#1150) 2023-07-12 06:26:29 +02:00
phastidio.net.txt
philosophyforlife.org.txt Create philosophyforlife.org.txt (#1092) 2023-06-14 16:27:55 +02:00
philosophynow.org.txt Create philosophynow.org.txt 2020-12-01 14:10:23 +01:00
philstar.com.txt Create philstar.com.txt 2017-03-06 16:35:48 +01:00
phoronix.com.txt Update phoronix.com (#1267) 2023-12-10 21:35:04 -08:00
photo.tutsplus.com.txt
photografix-magazin.de.txt Update photografix-magazin.de.txt (#1266) 2023-12-10 11:41:34 +01:00
photopills.com.txt Create photopills.com.txt (#1176) 2023-08-12 10:21:15 +02:00
phototrend.fr.txt
php.net.txt
phys.org.txt Update phys.org.txt (#1572) 2025-03-19 19:58:00 +01:00
pinterest.com.txt
piped.video.txt Create piped.video.txt (#1203) 2023-09-18 09:20:56 +02:00
pitchfork.com.txt Update pitchfork.com.txt 2022-12-05 21:28:02 +01:00
pittsburghmagazine.com.txt
pixellibre.net.txt
pjmedia.com.txt
placegrenet.fr.txt
planet3dnow.de.txt
planetvita.de.txt
playboy.com.txt
playgroupnsw.org.au.txt
ploum.net.txt fix: ploum.net (#1542) 2025-01-03 18:16:00 +01:00
pluralistic.net.txt Update pluralistic.net.txt (#1489) 2024-11-10 21:14:08 +01:00
plus.google.com.txt
plzkthxbai.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
pmf.silvrback.com.txt
poetryfoundation.org.txt Update poetryfoundation.org.txt 2022-04-18 10:59:31 +02:00
poets.org.txt Update scraping rules for poets.org (#1854) 2026-01-24 14:02:16 +01:00
pogue.blogs.nytimes.com.txt
politico.com.txt Update politico.com.txt (#1636) 2025-05-31 04:48:03 +02:00
politifact.com.txt Initial commit 2013-02-27 23:43:10 +01:00
politiken.dk.txt
politis.fr.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
polka.academy.txt Update polka.academy.txt (#1078) 2023-04-13 19:31:29 +02:00
polygon.com.txt Update polygon.com.txt (#1136) 2023-07-06 10:16:07 +02:00
popehat.com.txt
popsci.com.txt
popularmechanics.com.txt Popularmechanics (#1649) 2025-06-08 10:26:54 +02:00
portertech.ca.txt
positioningmag.com.txt
posta.com.tr.txt
posteo.de.txt
postnauka.ru.txt Update postnauka.ru.txt 2021-03-18 09:44:37 +01:00
preparedfoods.com.txt
president.jp.txt Add president.jp (#1794) 2025-11-29 11:34:45 +01:00
presse-citron.net.txt Update presse-citron.net for multipage & new design (#612) 2019-02-13 11:01:52 +01:00
presseportal.de.txt modified: presseportal.de (#962) 2022-03-27 14:12:27 +02:00
primaonline.it.txt Create primaonline.it.txt (#1651) 2025-06-08 12:49:46 +02:00
privacyinternational.org.txt
pro-linux.de.txt
prog21.dadgum.com.txt
prolost.com.txt
propakistani.pk.txt
propublica.org.txt
proskauer.com.txt Update proskauer.com.txt 2020-06-24 00:14:21 +02:00
prospectmagazine.co.uk.txt
protocol.com.txt Add protocol.com.txt (#864) 2021-03-15 01:41:26 +01:00
protonmail.com.txt
protothema.gr.txt
psu.edu.txt Create psu.edu.txt (#919) 2022-01-03 10:02:13 +01:00
psyche.co.txt Update psyche.co.txt (#1639) 2025-06-01 01:37:32 +02:00
psychologytoday.com.txt
psypost.org.txt Create psypost.org.txt (#1631) 2025-05-28 10:01:26 +02:00
publications.aap.org.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
publications.parliament.uk.txt
publicdomainreview.org.txt Update publicdomainreview.org.txt (#1683) 2025-06-25 14:34:47 +02:00
publico.pt.txt
publicorthodoxy.org.txt Create publicorthodoxy.org.txt (#1950) 2026-05-21 11:47:41 +02:00
publik-forum.de.txt Create publik-forum.de.txt (#1952) 2026-05-21 19:37:40 +02:00
puri.sm.txt Update puri.sm.txt (#1060) 2023-03-09 22:46:20 +01:00
putaindecode.io.txt
putsch.media.txt Create putsch.media.txt (#508) 2018-08-11 14:15:36 +02:00
pxlnv.com.txt
pymotw.com.txt
python.org.txt Create python.org.txt 2022-03-01 21:38:18 +01:00
qctimes.com.txt
qntm.org.txt Create qntm.org.txt (#1504) 2024-11-22 08:20:04 +01:00
quantamagazine.org.txt Update quantamagazine.org.txt (#1660) 2025-06-13 17:53:15 +02:00
quantumdiaries.org.txt Initial commit 2013-02-27 23:43:10 +01:00
quechoisir.org.txt
queerty.com.txt
questionablecontent.net.txt
queue.acm.org.txt Add/update amp.themercury.com, businessinsider.com[.au], enterprisers… (#611) 2019-02-07 09:29:51 +01:00
quickanddirtytips.com.txt
quora.com.txt Update quora.com.txt 2020-11-15 20:44:02 +01:00
qz.com.txt Update qz.com.txt 2020-11-09 15:00:40 +01:00
rachelandrew.co.uk.txt
racjonalista.pl.txt
radar.oreilly.com.txt
radionz.co.nz.txt
radishzz.cc.txt Create radishzz.cc.txt (#1530) 2024-12-14 20:54:30 +01:00
rancher.com.txt
randsinrepose.com.txt
rasgolatente.es.txt
rbb24.de.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
reactjs.org.txt Update reactjs.org.txt 2022-08-20 10:37:03 +02:00
reactormag.com.txt Create reactormag.com.txt (#1760) 2025-09-11 10:59:02 +02:00
readingthechinadream.com.txt Create readingthechinadream.com.txt (#1187) 2023-08-20 12:54:45 +02:00
README.md Update README.md 2025-07-24 15:30:30 +02:00
real.gr.txt
rebelionenlagranja.com.txt Create rebelionenlagranja.com.txt (#933) 2022-02-17 20:22:47 +01:00
rebooti.com.txt
recode.net.txt
redalemeden.com.txt
redbull.com.txt
reddit.com.txt Update reddit.com.txt (#1691) 2025-06-30 16:19:20 +02:00
redeszone.net.txt
redmas.com.co.txt Create redmas.com.co.txt 2024-02-12 16:34:18 -06:00
redmondpie.com.txt
redtimmy.com.txt
refinery29.com.txt Update refinery29.com.txt 2022-07-04 16:48:47 -04:00
reflets.info.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
regionaltangente-west.de.txt Update regionaltangente-west.de.txt (#1564) 2025-03-06 21:41:00 +01:00
reitschuster.de.txt Create reitschuster.de.txt (#1319) 2024-01-26 19:49:12 +01:00
renenekuda.cz.txt Initial commit 2013-02-27 23:43:10 +01:00
renverse.co.txt Create renverse.co.txt 2022-04-09 10:41:04 +02:00
report-k.de.txt Create report-k.de (#1190) 2023-08-27 10:53:55 +02:00
reportermagazin.cz.txt
reporterre.net.txt Update reporterre.net.txt 2023-10-08 12:02:48 +02:00
republik.ch.txt Add site configs for republik.ch, forum-geldpolitik.ch, wendezeit.ch (#1963) 2026-06-06 18:34:41 +02:00
researchandmarkets.com.txt Create researchandmarkets.com.txt 2021-04-13 21:00:20 +02:00
researchgate.net.txt Update researchgate.net.txt (#1539) 2024-12-28 14:04:15 +01:00
reset.org.txt Add site configs for energie-experten.org and reset.org (#1961) 2026-06-10 15:23:24 +02:00
resilience.org.txt Create resilience.org.txt (#918) 2021-12-13 21:16:32 +01:00
retractionwatch.com.txt
retro-games.fr.txt add retro-games.fr (#1102) 2023-06-19 09:07:34 +02:00
reuters.com.txt Update stripping and replacing rules in reuters.com.txt 2026-04-08 21:55:44 +02:00
revdennismccarty.com.txt Create revdennismccarty.com.txt 2021-07-27 12:18:29 +02:00
reves-d-espace.com.txt adding reves-d-espace.com (#1876) 2026-02-21 12:22:04 +01:00
revue-farouest.fr.txt Create revue-farouest.fr.txt (#473) 2018-07-29 12:08:26 +02:00
rewe.de.txt Create rewe.de.txt (#1706) 2025-07-07 10:05:00 +02:00
rework.withgoogle.com.txt
rezeptwelt.de.txt
rfi.fr.txt Create rfi.fr.txt (#1602) 2025-04-29 13:30:54 +02:00
rhein-kreis-neuss.de.txt Update rhein-kreis-neuss.de.txt 2020-12-30 19:23:05 +01:00
richardkmorgan.com.txt Create richardkmorgan.com.txt (#1161) 2023-07-20 14:04:57 +02:00
riddle.press.txt Add riddle.press extraction configuration (#1967) 2026-06-16 17:28:18 +02:00
riffreporter.de.txt Create riffreporter.de.txt 2022-02-17 22:08:47 +01:00
ritimo.org.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
rnd.de.txt Add rnd.de.txt (#882) 2021-05-14 00:47:15 +02:00
roadandtrack.com.txt Create roadandtrack.com.txt (#1646) 2025-06-08 08:31:54 +02:00
robertsspaceindustries.com.txt
robots.thoughtbot.com.txt
rockpapershotgun.com.txt Improvements to eurogamer.net, heise.de, rockpapershotgun.com, tagesschau.de and zeit.de. Fix golem.de (#936) 2022-02-28 06:39:51 +01:00
rockylinux.org.txt Update rockylinux.org.txt (#1133) 2023-07-05 14:26:51 +02:00
rodrigo.sharpcube.com.txt
rogerebert.com.txt Update rogerebert.com.txt 2020-11-06 19:06:23 +01:00
rollingstone.com.txt
rom-game.fr.txt Added rom-game.fr.txt (#539) 2018-10-15 14:03:26 +02:00
romchip.org.txt Create romchip.org.txt (#1540) 2024-12-29 13:33:39 +01:00
roomescapeartist.com.txt
root.cz.txt
rosenheim24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
rottentomatoes.com.txt
roughtype.com.txt
roy.gbiv.com.txt
royalsocietypublishing.org.txt Add royalsocietypublishing.org.txt (#878) 2021-05-10 13:40:01 +02:00
rpgsite.net.txt
rtbf.be.txt Add author and update body extraction rules (#1862) 2026-01-31 17:48:15 +01:00
rtings.com.txt Rtings.com (#1113) 2023-06-26 06:34:25 +02:00
rubysfera.pl.txt
rue89bordeaux.com.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
rue89lyon.fr.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
rue89strasbourg.com.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
rugbyrama.fr.txt Create rugbyrama.fr.txt 2024-06-28 13:36:37 +02:00
ruhlman.com.txt
ruhr24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
rums.ms.txt Rename rums.ms to rums.ms.txt 2020-11-23 11:50:52 +01:00
rust-lang-nursery.github.io.txt
s6-frankfurt-friedberg.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
saadaalnews.net.txt
sacbee.com.txt
salon.com.txt fix: salon.com body rule (#1869) 2026-02-18 16:07:20 +01:00
saltyworld.net.txt
salzburg.com.txt
san.com.txt Create san.com.txt (#1369) 2024-04-27 12:07:41 +02:00
sankei.com.txt add sankei.com (#1851) 2026-01-21 14:25:02 +01:00
sanpedrosun.com.txt
sapiens.org.txt Create sapiens.org.txt (#1517) 2024-12-05 14:07:03 +01:00
sargasso.nl.txt
sauerlandkurier.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
saveyourself.ca.txt
sayidaty.net.txt
sbnation.com.txt
scheuch.de.txt Create scheuch.de.txt (#1455) 2024-10-23 04:22:48 +02:00
schneier.com.txt
schwarzwaelder-bote.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
science.org.txt Create science.org.txt 2021-10-05 13:47:43 +02:00
scienceblogs.de.txt
sciencedirect.com.txt Create sciencedirect.com.txt (#1308) 2024-01-15 16:41:00 +01:00
sciencepresse.qc.ca.txt Update sciencepresse.qc.ca.txt (#1724) 2025-07-26 08:53:25 +02:00
scienceticker.info.txt
scientificamerican.com.txt Update scientificamerican.com.txt 2020-10-17 13:29:47 +02:00
scilogs.de.txt Convert scilogs.de.txt from latin1. 2014-07-01 20:29:27 +01:00
scinfolex.com.txt
scnsrc.me.txt Two scene release announcement sites (#277) 2017-03-20 10:25:19 +00:00
scotthelme.co.uk.txt Add scotthelme.co.uk.txt (#944) 2022-03-03 06:47:39 +01:00
scottohara.me.txt
scotusblog.com.txt
scripting.com.txt
scroll.in.txt Create scroll.in.txt 2021-01-15 14:40:13 +01:00
sdxcentral.com.txt
searchenginejournal.com.txt
searchengineland.com.txt
seattletimes.com.txt Fix getting full text, similar to nytimes.com (#1168) 2023-07-25 06:46:38 +02:00
seattletransitblog.com.txt
sebsauvage.net.txt Update sebsauvage.net.txt (#1143) 2023-07-12 06:26:57 +02:00
sec.gov.txt Create sec.gov.txt 2025-08-19 13:09:58 +02:00
secouchermoinsbete.fr.txt
secretmag.ru.txt
securelist.com.txt
securityaffairs.co.txt add securityaffairs.co (#348) 2017-10-22 22:32:08 +01:00
securitylab.ru.txt Create securitylab.ru.txt (#1468) 2024-10-30 01:45:34 +01:00
secushare.org.txt
segment.com.txt
select.yeeyan.org.txt
semiaccurate.com.txt forgot to add semiaccurate.com 2015-11-12 19:53:04 +01:00
sempredirebanzai.it.txt Create sempredirebanzai.it.txt (#1643) 2025-06-06 09:50:26 +02:00
senscritique.com.txt Create senscritique.com.txt (#1677) 2025-06-20 13:18:23 +02:00
seriouseats.com.txt Refactor extraction rules for seriouseats.com (#1778) 2025-10-19 19:37:03 +02:00
serpentinegalleries.org.txt Create serpentinegalleries.org.txt (#1433) 2024-09-21 02:58:46 +02:00
servethehome.com.txt servethehome and lowtechmagazine (#208) 2016-10-11 23:23:05 +02:00
seznamzpravy.cz.txt Seznamzpravy.cz 2 (#1123) 2023-07-01 09:19:38 +02:00
sf.eater.com.txt
sfchronicle.com.txt Create sfchronicle.com.txt (#1653) 2025-06-08 14:27:29 +02:00
sfgate.com.txt
sfweekly.com.txt
shabayek.com.txt
shahinkalantari.com.txt Create shahinkalantari.com.txt (#1162) 2023-07-20 22:22:56 +02:00
share.ez.no.txt
shawnblanc.net.txt Initial commit 2013-02-27 23:43:10 +01:00
shepherd.com.txt Create shepherd.com.txt (#1475) 2024-11-02 00:08:02 +01:00
shifteleven.com.txt
shipilev.net.txt
shs.cairn.info.txt Update and rename cairn.info.txt to shs.cairn.info.txt (#1754) 2025-08-30 09:55:27 +02:00
shueisha.online.txt add dailyshincho.jp.txt and shueisha.online.txt (#1824) 2025-12-19 13:36:11 +01:00
shz.de.txt
siecledigital.fr.txt Update stripping rules and test URL in siecledigital.fr.txt (#1852) 2026-01-22 16:37:38 +01:00
signal.org.txt Updated signal.org.txt (#833) 2020-12-15 20:39:18 +01:00
singaporeanstocksinvestor.blogspot.com.txt
singularityhub.com.txt
sivers.org.txt
slashdot.org.txt slashdot: replace i tags with blockquote (#929) 2022-02-15 07:16:41 +01:00
slashfilm.com.txt
slate.com.txt slate.com: improve ad stripping (#839) 2021-01-04 07:02:50 +01:00
slate.fr.txt Update slate.fr.txt 2024-05-28 10:36:08 +02:00
slice.seriouseats.com.txt
slog.thestranger.com.txt Initial commit 2013-02-27 23:43:10 +01:00
slrlounge.com.txt Added slrlounge.com.txt (#963) 2022-03-31 20:53:40 +02:00
smarthomebeginner.com.txt
smashingmagazine.com.txt Update smashingmagazine.com.txt (#1084) 2023-05-16 06:38:14 +02:00
smbc-comics.com.txt
sme.sk.txt Initial commit 2013-02-27 23:43:10 +01:00
smh.com.au.txt Update smh.com.au.txt (#1445) 2024-10-16 14:51:16 +02:00
smithsonianmag.com.txt fix: bad format errors (#1811) 2025-12-09 13:48:06 +01:00
snip.ly.txt
snob.ru.txt Update snob.ru.txt 2021-12-29 16:26:21 +01:00
soester-anzeiger.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
somethingawful.com.txt
songshuhui.net.txt Fix line endings on three files. 2014-06-18 15:31:36 +01:00
soundcity.tv.txt Create soundcity.tv.txt 2015-01-08 12:46:19 +01:00
soundonsound.com.txt Update soundonsound.com.txt 2021-07-23 19:26:08 +02:00
sourcebooks.com.txt
sowetanlive.co.za.txt
space.com.txt Update space.com.txt (#1744) 2025-08-12 07:07:28 +02:00
spacetoday.com.br.txt chore: add configuration file for spacetoday.com.br (#1746) 2025-08-14 03:44:57 +02:00
spacex.com.txt Add scraping instructions for spacex.com (#1781) 2025-11-01 07:28:52 +01:00
spectator.co.uk.txt Update spectator.co.uk.txt (#1666) 2025-06-14 07:37:10 +02:00
spectrejournal.com.txt Create spectrejournal.com.txt 2021-03-12 10:12:43 +01:00
spectrum.ieee.org.txt Update spectrum.ieee.org.txt (#1204) 2023-09-18 08:59:30 +02:00
spektrum.de.txt Update spektrum.de.txt (#1107) 2023-06-19 14:33:49 +02:00
spiderum.com.txt Update spiderum.com.txt 2022-08-27 12:26:19 +02:00
spiegel.de.txt Update spiegel.de.txt 2025-06-25 17:39:40 -04:00
spiked-online.com.txt
spin.com.txt
splinternews.com.txt
sport.detik.com.txt Initial commit 2013-02-27 23:43:10 +01:00
sport365.fr.txt
sportiva.shueisha.co.jp.txt add sportiva.shueisha.co.jp (#1804) 2025-12-06 03:08:39 +01:00
sports.ru.txt Create sports.ru.txt (#1635) 2025-05-31 01:00:45 +02:00
sprengsatz.de.txt
sputniknews.com.txt
sqlite.org.txt
squashed.tumblr.com.txt Initial commit 2013-02-27 23:43:10 +01:00
srf.ch.txt
stackoverflow.blog.txt
stackoverflow.com.txt Fixed stackoverflow.com.txt (#1090) 2023-06-13 10:19:48 +02:00
stadt-bremerhaven.de.txt Update stadt-bremerhaven.de.txt with new selectors (#1958) 2026-05-30 10:28:01 +02:00
stadt-muenster.de.txt
stadtpost.de.txt Create stadtpost.de.txt (#1109) 2023-06-19 18:22:04 +02:00
staltz.com.txt Added staltz.com.txt (#605) 2019-02-05 11:32:38 +01:00
standard.co.uk.txt Update standard.co.uk.txt (#1591) 2025-04-17 22:37:53 +02:00
standardebooks.org.txt Add site config for standardebooks.org; update cbsnews.com (#1834) 2026-01-04 08:16:26 +01:00
standblog.org.txt Update XPath queries and add string replacements (#1879) 2026-02-23 14:16:43 +01:00
star-telegram.com.txt
statista.com.txt Create statista.com / es.statista.com (#852) 2021-01-20 13:51:38 +01:00
steamcommunity.com.txt Update steamcommunity.com.txt (#1562) 2025-02-28 05:35:40 +01:00
stefanjudis.com.txt
stephenfry.com.txt Initial commit 2013-02-27 23:43:10 +01:00
stiftung-gegm.de.txt Create stiftung-gegm.de.txt (#1702) 2025-07-04 10:44:50 +02:00
stjv.fr.txt
stockholmsfria.se.txt
stopgame.ru.txt Create stopgame.ru.txt 2021-02-10 14:22:35 +01:00
straightdope.com.txt
straitstimes.com.txt Create straitstimes.com.txt (#1312) 2024-01-21 09:29:34 +01:00
stratfor.com.txt
stratobuilds.com.txt Add configuration for scraping stratobuilds.com (#1905) 2026-03-02 22:53:05 +01:00
streetsblog.net.txt
stuff.co.nz.txt
stumbleupon.com.txt
stuttgarter-nachrichten.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
stuttgarter-zeitung.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
substack.com.txt Subst2 (#1713) 2025-07-15 10:07:06 +02:00
subtraction.com.txt
sueddeutsche.de.txt
sukusuku.tokyo-np.co.jp.txt add sukusuku.tokyo-np.co.jp.txt (#1864) 2026-02-03 14:53:21 +01:00
sulek.fr.txt Create sulek.fr.txt (#1456) 2024-10-24 15:15:10 +02:00
summitroute.com.txt
sun-connect.org.txt Create sun-connect.org.txt (#1043) 2023-02-06 07:05:02 +01:00
sunshinecoastdaily.com.au.txt
supchina.com.txt Create .supchina.com.txt (#896) 2021-08-17 16:53:32 +02:00
superuser.openinfra.dev.txt Create superuser.openinfra.dev (#1457) 2024-10-24 22:39:37 +02:00
svd.se.txt
svt.se.txt Update svt.se.txt 2024-05-23 14:43:05 +02:00
swcarpentry.github.io.txt Create swcarpentry.github.io.txt 2022-06-13 00:41:57 +02:00
swissinfo.ch.txt Update swissinfo.ch.txt 2024-12-06 20:06:10 +01:00
switchonpaper.com.txt Update switchonpaper.com.txt 2020-08-23 00:47:34 +02:00
sydsvenskan.se.txt
symmetrymagazine.org.txt Initial commit 2013-02-27 23:43:10 +01:00
symphozik.info.txt Fix w3349 (#1460) 2024-10-26 22:05:39 +02:00
synbioz.com.txt
syncfusion.com.txt Create syncfusion.com.txt (#1669) 2025-06-14 16:38:13 +02:00
sz-magazin.sueddeutsche.de.txt
t-online.de.txt Create t-online.de.txt (#1409) 2024-07-29 01:20:37 +02:00
t3n.de.txt Modify selectors and add content replacement for embeds (#1916) 2026-03-18 18:42:02 +01:00
t3terminal.com.txt Update t3terminal.com.txt 2020-11-10 12:45:56 +01:00
tabletmag.com.txt Update tabletmag.com.txt 2024-12-27 12:19:04 +01:00
tagblatt.de.txt
tagesanzeiger.ch.txt tagesanzeiger.ch.txt completely rewritten and replaced (#1021) 2022-12-27 13:31:15 +01:00
tagesschau.de.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
tagesspiegel.de.txt Update tagesspiegel.de.txt (#1180) 2023-08-12 10:18:21 +02:00
tailscale.com.txt Adam (#1667) 2025-06-14 08:59:23 +02:00
takt-magazin.de.txt
taste.com.au.txt
tasteofhome.com.txt
taxacc.jp.txt add gendai.media.txt xenospectrum.com.txt taxacc.jp.txt (#1821) 2025-12-17 05:44:08 +01:00
taxlabor.com.txt add number.bunshun.jp.txt and taxlabor.com.txt (#1833) 2026-01-03 22:18:54 +01:00
taz.de.txt Modify extraction rules and add test URLs (#1812) 2025-12-11 14:20:48 +01:00
tbray.org.txt
teamliquid.net.txt merge changes of my own config files with upstream 2015-11-12 00:43:44 +01:00
tech.sina.com.cn.txt Initial commit 2013-02-27 23:43:10 +01:00
techcommunity.microsoft.com.txt
techcrunch.com.txt
techdirt.com.txt
techhive.com.txt
techmeme.com.txt Update techmeme.com.txt 2014-10-15 12:16:40 +02:00
techno-science.net.txt
technologizer.com.txt
technologyreview.com.txt
techpinions.com.txt
techradar.com.txt Update techradar.com.txt 2023-04-02 00:39:57 +02:00
techstage.de.txt Update techstage.de.txt (#1396) 2024-06-18 03:50:17 +02:00
ted.com.txt
telegraph.co.uk.txt Update telegraph.co.uk.txt (#1547) 2025-01-10 14:32:47 +01:00
telepolis.de.txt Remove unnecessary comments and update XPath selectors (#1882) 2026-02-23 15:18:23 +01:00
telerama.fr.txt Add login settings for telerama.fr (#1010) 2022-11-08 17:06:05 +01:00
tennis.com.txt Update tennis.com.txt 2023-03-02 10:56:58 +01:00
terrestres.org.txt
texasmonthly.com.txt Update texasmonthly.com.txt 2021-08-23 14:58:58 +02:00
the-magazine.org.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
the-scientist.com.txt
the-tls.co.uk.txt
theage.com.au.txt
thealexandrian.net.txt Add files via upload https://thealexandrian.net (#1926) 2026-03-30 05:49:18 +02:00
theamericanscholar.org.txt
theathletic.com.txt
theatlantic.com.txt fix: bad format errors (#1811) 2025-12-09 13:48:06 +01:00
theatlanticcities.com.txt
thebaffler.com.txt Create thebaffler.com.txt 2023-01-17 22:13:52 +01:00
theblueprint.ru.txt Update theblueprint.ru.txt 2022-05-26 13:38:54 +02:00
thebulletin.org.txt Update thebulletin.org.txt 2023-07-22 13:16:24 +02:00
thecitypaperbogota.com.txt Create thecitypaperbogota.com.txt (#1498) 2024-11-17 07:06:15 +01:00
thecode.media.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
thecounter.org.txt Update thecounter.org.txt 2022-01-25 21:37:15 +01:00
thecreativeindependent.com.txt Create thecreativeindependent.com.txt 2021-01-06 16:20:03 +01:00
thecut.com.txt Update thecut.com.txt (#1047) 2023-02-13 07:56:15 +01:00
thedailybeast.com.txt
thedailymash.co.uk.txt
thedisneyblog.com.txt
thedrive.com.txt Update thedrive.com.txt 2021-03-13 16:36:11 +01:00
thefader.com.txt Update thefader.com.txt 2022-01-25 21:34:14 +01:00
thefilmexperience.net.txt
theflaw.org.txt Create theflaw.org.txt (#1581) 2025-04-01 02:03:48 +02:00
thegamedesignforum.com.txt
thegap.at.txt
theglobalmail.org.txt Initial commit 2013-02-27 23:43:10 +01:00
thegreatdiscontent.com.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
theguardian.com.txt Fix for theguardian (#1711) 2025-07-14 20:10:37 +02:00
thehansindia.com.txt Update thehansindia.com.txt 2021-10-29 01:04:44 +02:00
thehindu.com.txt Update thehindu.com.txt 2022-06-06 15:52:36 +02:00
theins.ru.txt Update theins.ru.txt 2022-04-09 11:06:35 +02:00
theintercept.com.txt Update some websites (#200) 2016-09-14 11:50:10 +02:00
theinventory.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
thekitchn.com.txt Create thekitchn.com.txt (#1112) 2023-06-26 06:34:41 +02:00
them.us.txt Adds config for them.us (#1087) 2023-06-05 09:18:16 +02:00
themarker.com.txt Update themarker.com.txt (#1082) 2023-05-12 19:37:35 +02:00
themillions.com.txt Updated themillions.com (#475) 2018-08-01 01:08:41 +02:00
thenation.com.txt
thenetworkgarden.blogs.com.txt
thenewatlantis.com.txt add site config for thenewatlantis (#830) 2020-11-27 09:24:04 +01:00
thenewdaily.com.au.txt Thenewdaily.com.au (#1321) 2024-01-27 05:54:13 +01:00
thenews.coop.txt Update thenews.coop.txt 2018-04-29 22:34:05 +02:00
thenewstribune.com.txt
thenextgeneration.org.txt
thenextweb.com.txt
theoaklandpress.com.txt
theodinproject.com.txt Create theodinproject.com.txt 2022-06-13 00:44:40 +02:00
theonion.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
theoutline.com.txt
theplayerstribune.com.txt Create theplayerstribune.com.txt (#1172) 2023-07-25 06:43:56 +02:00
thepointmag.com.txt
theregister.co.uk.txt Update theregister.com/.co.uk (#1945) 2026-05-16 20:29:50 +02:00
theregister.com.txt Update theregister.com/.co.uk (#1945) 2026-05-16 20:29:50 +02:00
theringer.com.txt Update theringer.com.txt 2020-11-04 11:35:38 +01:00
theroot.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
therumpus.net.txt
thesaturdaypaper.com.au.txt
theses.enc.sorbonne.fr.txt
thesimpledollar.com.txt Initial commit 2013-02-27 23:43:10 +01:00
theskepticalcardiologist.com.txt Create theskepticalcardiologist.com.txt (#1638) 2025-05-31 15:10:27 +02:00
thesocialitefamily.com.txt Create thesocialitefamily.com.txt 2020-07-17 09:12:50 +02:00
thespoof.com.txt Initial commit 2013-02-27 23:43:10 +01:00
thestranger.com.txt
thesun.co.uk.txt Create thesun.co.uk.txt 2020-10-13 13:32:21 +02:00
thetakeout.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
theteaspot.com.txt Create theteaspot.com.txt 2021-02-14 00:10:46 +01:00
thethaovanhoa.vn.txt
thetimes.com.txt Update stripping rules in thetimes.com.txt 2026-03-17 16:30:26 +01:00
thetorah.com.txt
theverge.com.txt Update theverge.com.txt (#1741) 2025-08-10 04:51:53 +02:00
theweek.com.txt
thewirecutter.com.txt
thingiverse.com.txt
thinkspot.com.txt Create thinkspot.com.txt 2020-08-31 23:57:55 +02:00
thinkwithgoogle.com.txt Update thinkwithgoogle.com.txt (#1124) 2023-07-03 09:28:08 +02:00
thisamericanlife.org.txt
thisiscolossal.com.txt
thoughtco.com.txt
threadreaderapp.com.txt Update threadreaderapp.com.txt 2021-05-04 21:40:11 +02:00
threatpost.com.txt
thrillist.com.txt
thueringer-allgemeine.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ticket.interpark.com.txt Create ticket.interpark.com.txt (#1077) 2023-04-13 09:33:46 +02:00
tidbits.com.txt
tijd.be.txt
time.com.txt Update time.com.txt 2022-05-06 15:24:11 +02:00
timeshighereducation.co.uk.txt Initial commit 2013-02-27 23:43:10 +01:00
timeshighereducation.com.txt
timesofisrael.com.txt Create timesofisrael.com.txt 2025-06-24 15:30:04 -04:00
tipb.com.txt
titanic-magazin.de.txt
tldp.org.txt Some site configuration and access mode matching (#290) 2017-04-19 23:49:25 +02:00
tlz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
tnr.com.txt
tobias-hartmann.net.txt
tofugu.com.txt
tokyo-np.co.jp.txt add tokyo-np.co.jp.txt and businessinsider.jp.txt (#1823) 2025-12-18 13:42:33 +01:00
tomdispatch.com.txt
tomsguide.com.txt Update tomsguide.com.txt (#1661) 2025-06-14 06:38:02 +02:00
tomshardware.com.txt Update tomshardware.com.txt 2023-04-02 00:39:37 +02:00
tomshardware.de.txt Initial commit 2013-02-27 23:43:10 +01:00
toolinux.com.txt
toolsandtoys.net.txt
topnews.jp.txt add finance.yahoo.co.jp.txt and topnews.jp.txt (#1856) 2026-01-26 19:21:42 +01:00
torgranate.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
torn.com.txt Torn (#1618) 2025-05-13 20:33:21 +02:00
torontolife.com.txt Update torontolife.com.txt 2022-12-20 00:23:38 +01:00
touilleur-express.fr.txt Add touilleur-express.fr.txt (#952) 2022-03-10 07:40:02 +01:00
tourmag.com.txt
touteduc.fr.txt Create touteduc.fr.txt (#513) 2018-08-11 14:13:40 +02:00
towardsdatascience.com.txt Create towardsdatascience.com.txt (#1148) 2023-07-12 06:25:30 +02:00
towerofthehand.com.txt
toyokeizai.net.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
tracks.ranea.org.txt
tradingforaliving.pl.txt Create tradingforaliving.pl.txt (#1476) 2024-11-03 15:12:49 +01:00
trailer.web-view.net.txt Initial commit 2013-02-27 23:43:10 +01:00
trailers.apple.com.txt
trailerzone.de.txt
traningslara.se.txt
trendmicro.com.txt Trendmicro (#1348) 2024-03-06 13:00:22 +01:00
triblive.com.txt
triple-c.at.txt Create triple-c.at.txt (#1378) 2024-05-15 16:24:53 +02:00
triplebyte.com.txt
trouw.nl.txt Modify rules for trouw.nl (#1931) 2026-05-01 17:00:58 +02:00
troyhunt.com.txt
trustedreviews.com.txt
truthdig.com.txt
truthout.org.txt Update truthout.org.txt 2021-04-04 16:25:28 +02:00
tthfanfic.org.txt
tuaw.com.txt
tuhdo.github.io.txt
turnoff.us.txt
tvline.com.txt Update tvline.com.txt (#1091) 2023-06-14 09:03:14 +02:00
tvtropes.org.txt
tweakers.net.txt
twitter.com.txt twitter.com: fix content fetching using custom UA (#837) 2020-12-28 18:22:14 +01:00
twog.fr.txt Update twog.fr.txt 2021-04-30 15:59:42 +02:00
typo3.com.txt Update typo3.com.txt 2021-03-29 13:18:42 +02:00
typo3.org.txt Update typo3.org.txt 2021-03-29 13:21:33 +02:00
tz.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
ubuntugeek.com.txt
udn.com.txt Update udn.com.txt (#1402) 2024-07-09 16:39:19 +02:00
uefa.com.txt
ufu.de.txt Create ufu.de.txt (#1701) 2025-07-04 10:15:36 +02:00
uk.xbox360.ign.com.txt
uncannymagazine.com.txt Create uncannymagazine.com.txt (#1565) 2025-03-10 18:22:49 +01:00
unherd.com.txt Update unherd.com.txt 2023-10-28 09:49:05 +02:00
uni-watch.com.txt
universe.shelfd.com.txt Create universe.shelfd.com.txt (#1454) 2024-10-23 01:02:59 +02:00
unsertirol24.com.txt
unwinnable.com.txt Mostly Instapaper changes 2013-05-13 00:52:49 +02:00
uol.com.br.txt Create uol.com.br.txt (#1583) 2025-04-06 09:55:33 +02:00
urbandictionary.com.txt
us-cert.gov.txt Adding config for us-cert.gov alerts (#338) 2017-09-25 15:53:29 +02:00
usatoday.com.txt
usbeketrica.com.txt Update usbeketrica.com.txt (#1307) 2024-01-14 20:01:11 +01:00
useit.com.txt
usenix.org.txt add usenix.org (#1104) 2023-06-19 09:08:05 +02:00
utcc.utoronto.ca.txt chore: add body and date rules for utcc utoronto (#1893) 2026-02-24 15:06:47 +01:00
utdailybeacon.com.txt Update utdailybeacon.com.txt 2015-06-14 12:26:57 +02:00
utiliser-lightroom.com.txt
utux.fr.txt 5 new websites (#420) 2018-05-04 17:09:16 +02:00
ux.artu.tv.txt
uxdesign.cc.txt create 3 new configs (#1149) 2023-07-12 06:28:40 +02:00
vakarm.net.txt
valdaiclub.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
vanityfair.com.txt Update vanityfair.com.txt 2022-02-24 04:15:56 +01:00
variety.com.txt
varsity.co.uk.txt
vc.ru.txt Create vc.ru.txt 2021-02-01 13:40:31 +01:00
vedonlyonti.com.txt Create vedonlyonti.com.txt 2024-05-07 23:21:33 +02:00
velomotion.de.txt
venturebeat.com.txt
verlagshaus-jaumann.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
version2.dk.txt
verybestbaking.com.txt
vg.no.txt
viaoccitanie.tv.txt Create viaoccitanie.tv.txt (#482) 2018-08-04 12:09:58 +02:00
vice.com.txt Update vice.com.txt 2022-05-12 14:15:23 +01:00
videogameschronicle.com.txt Create videogameschronicle.com.txt (#1717) 2025-07-19 08:55:46 +02:00
videogum.com.txt
vienna.at.txt Update vienna.at.txt with new parsing rules (#1946) 2026-05-17 10:49:26 +02:00
viget.com.txt Update viget.com.txt 2019-04-23 15:41:39 +02:00
villagevoice.com.txt
vimeo.com.txt
vincent.jousse.org.txt fix: index page parsing for https://vincent.jousse.org (#1544) 2025-01-08 17:35:18 +01:00
viply.de.txt
virten.net.txt
visir.is.txt
visual-planning.com.txt Update visual-planning.com.txt 2024-12-02 13:53:15 +01:00
visualcapitalist.com.txt Add visualcapitalist.com (#965) 2022-04-13 14:41:55 +02:00
vitispr.com.txt
vivirmexico.com.txt
vk.com.txt
vogue.co.uk.txt Add vogue.com and vogue.co.uk (#1480) 2024-11-04 00:37:31 +01:00
vogue.com.txt Add vogue.com and vogue.co.uk (#1480) 2024-11-04 00:37:31 +01:00
voices.washingtonpost.com.txt
voidstern.net.txt Create voidstern.net.txt (#1546) 2025-01-08 23:00:39 +01:00
volksfest-freising.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
volkskrant.nl.txt fix: update volkskrant.nl user agent to curl (#1966) 2026-06-12 15:08:01 +02:00
voltairenet.org.txt
vot-tak.tv.txt Update vot-tak.tv.txt 2021-08-23 14:23:32 +02:00
vox.com.txt vox.com.txt: restore h3 and strip related section (#939) 2022-02-28 06:32:19 +01:00
voxeurop.eu.txt
vozpopuli.com.txt
vr-zone.com.txt
vrt.be.txt Update vrt.be.txt 2021-09-19 10:15:43 +02:00
vulture.com.txt Update vulture.com.txt (#1619) 2025-05-13 21:01:27 +02:00
w3.org.txt
wa.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
wallabag.org.txt Create wallabag.org.txt (#1479) 2024-11-03 18:28:55 +01:00
warnerbros.fr.txt
warriordudimanche.net.txt
washingtoninstitute.org.txt washingtoninstitute.org 2013-10-03 16:33:01 +02:00
washingtonmonthly.com.txt
washingtonpost.com.txt Update washingtonpost.com.txt 2024-07-17 15:50:39 +02:00
wasserburg24.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
watchgeneration.fr.txt fix: update MacG websites configuration (#1941) 2026-05-10 09:39:30 +02:00
watchlist-internet.at.txt Update watchlist-internet.at.txt (#1413) 2024-07-30 17:24:32 +02:00
watoday.com.au.txt Create watoday.com.au.txt 2019-06-25 12:46:21 +02:00
watson.ch.txt Update watson.ch.txt 2024-07-25 15:03:15 +02:00
watson.de.txt Update watson.de.txt 2024-07-25 15:03:33 +02:00
waz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
web-libre.org.txt Initial commit 2013-02-27 23:43:10 +01:00
web.dev.txt
web.gekisaka.jp.txt add news.jp.txt and web.gekisaka.jp.txt (#1820) 2025-12-16 14:43:32 +01:00
web.motormagazine.co.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
webcg.net.txt add 3 files (#1829) 2025-12-23 11:30:11 +01:00
weblogs.asp.net.txt
webupd8.org.txt
wellcome.org.txt Update wellcome.org.txt 2022-09-27 00:58:52 +02:00
wellcomecollection.org.txt Update wellcomecollection.org.txt 2022-09-27 01:05:07 +02:00
welt.de.txt
wendezeit.ch.txt Add site configs for republik.ch, forum-geldpolitik.ch, wendezeit.ch (#1963) 2026-06-06 18:34:41 +02:00
wenow.com.txt Create wenow.com.txt (#1053) 2023-02-14 20:58:37 +01:00
werra-rundschau.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
westernadvocate.com.au.txt
wetterauer-zeitung.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
what-if.xkcd.com.txt
whatever.scalzi.com.txt
whereoware.com.txt Add metadata extraction for Whereoware page (#1936) 2026-05-04 14:05:07 +02:00
wienerzeitung.at.txt Update wienerzeitung.at.txt (#1420) 2024-08-18 11:06:29 +02:00
wiesbadener-kurier.de.txt Update wiesbadener-kurier.de.txt 2023-10-15 15:01:12 +02:00
wiesn.bayern.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
wiki.guildwars.com.txt
wiki.guildwars2.com.txt
wikihow.com.txt Update wikihow.com (#384) 2018-01-06 17:36:28 +01:00
wikitravel.org.txt
wikiwand.com.txt Create wikiwand.com.txt 2021-09-11 18:24:45 +02:00
will-self.com.txt
winfuture.de.txt
wired.co.uk.txt add wired.co.uk (#1098) 2023-06-19 09:06:17 +02:00
wired.com.txt Update wired.com.txt (#1750) 2025-08-25 18:29:31 +02:00
wired.jp.txt Update wired.jp.txt 2022-03-15 00:49:46 +01:00
wiwo.de.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
wlz-online.de.txt Update Ippen sites (#1914) 2026-03-18 16:04:26 +01:00
wmpoweruser.com.txt
wn.de.txt .dxy.cn and wn.de 2014-10-07 13:10:25 +02:00
wochenanzeiger.de.txt add wochenanzeiger.de.txt (#899) 2021-08-17 16:52:05 +02:00
woman.tvbs.com.tw.txt Update woman.tvbs.com.tw.txt (#1273) 2023-12-19 13:51:21 +01:00
wooclap.com.txt Create wooclap.com.txt (#1935) 2026-05-04 13:17:04 +02:00
woolworths.com.au.txt
wordpress.org.txt
wordswithoutborders.org.txt Create wordswithoutborders.org.txt (#1575) 2025-03-27 20:07:19 +01:00
wordyard.com.txt
world.hey.com.txt Shtrom 2023 05 (#1085) 2023-05-17 09:51:47 +02:00
worldcrunch.com.txt
worldpoultry.net.txt
worldwidewords.org.txt
wormser-zeitung.de.txt Create wormser-zeitung.de.txt 2023-10-15 14:58:52 +02:00
wornandwound.com.txt Update wornandwound.com.txt 2022-03-11 11:07:16 +01:00
woshub.com.txt Create woshub.com.txt (#871) 2021-04-06 18:00:08 +02:00
wow.joystiq.com.txt Initial commit 2013-02-27 23:43:10 +01:00
wp.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
wpbeginner.com.txt Create wpbeginner.com.txt 2022-03-11 11:29:27 +01:00
wphive.com.txt Create wphive.com.txt 2022-03-13 10:17:40 +01:00
wpmayor.com.txt
wr.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
writerunboxed.com.txt Create writerunboxed.com.txt (#1165) 2023-07-21 06:25:18 +02:00
wsj.com.txt Avoid stripping images on wsj.com (#1830) 2025-12-23 14:43:17 +01:00
wsws.org.txt Added World Socialist WebSite (wsws.org). (#822) 2020-10-16 10:26:07 +02:00
www.blueapron.com.txt
www.seriouseats.com.txt
www1.folha.uol.com.br.txt
www2.cnrs.fr.txt
wyborcza.biz.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
wyborcza.pl.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
wysokieobcasy.pl.txt Update wysokieobcasy.pl.txt (#1380) 2024-05-21 15:47:24 +02:00
wz-newsline.de.txt
xataka.com.txt Create xataka.com.txt (#1625) 2025-05-26 10:44:51 +02:00
xatakaciencia.com.txt
xatakamovil.com.txt Add xatakamovil.com (#1004) 2022-11-02 09:58:01 +01:00
xda-developers.com.txt Update xda-developers.com.txt 2022-10-25 10:17:42 +02:00
xenospectrum.com.txt Update xenospectrum.com add newsphere.jp jp.reuters.com (#1847) 2026-01-11 16:34:29 +01:00
xlsemanal.com.txt
xm.com.txt Update xm.com.txt 2023-09-28 14:44:58 +02:00
xn--protin-bva.com.txt
xplanereviews.com.txt Create xplanereviews.com.txt (#1652) 2025-06-08 13:13:40 +02:00
yahoo.com.txt Changed 3 Yahoo configs (#1400) 2024-07-07 10:12:10 +02:00
ycombinator.com.txt Create ycombinator.com.txt (#1497) 2024-11-17 06:39:52 +01:00
ynet.co.il.txt
yosoy.red.txt Create yosoy.red.txt 2021-03-11 21:37:43 +01:00
yostivanich.com.txt
yourerie.com.txt
youtu.be.txt Create youtu.be.txt (#1668) 2025-06-14 12:49:48 +02:00
youtube.com.txt Update youtube.com.txt (#1608) 2025-05-08 13:37:07 +02:00
zabbix.com.txt Create zabbix.com.txt (#1748) 2025-08-17 12:50:00 +02:00
zaknrw.de.txt
zakzak.co.jp.txt add below site configs (#1849) 2026-01-17 12:51:20 +01:00
zataz.com.txt
zdf.de.txt Update zdf.de.txt (#1367) 2024-04-24 17:13:13 +02:00
zdnet.com.txt
zdnet.fr.txt Create zdnet.fr.txt (#1678) 2025-06-22 19:54:13 +02:00
zdopravy.cz.txt
ze.tt.txt
zeit.de.txt fix zeit.de pagination (#1650) 2025-06-08 12:23:38 +02:00
zerohedge.com.txt
zerokspot.com.txt
zetland.dk.txt Create zetland.dk.txt 2020-11-25 23:54:11 +01:00
zhihu.com.txt Update zhihu.com.txt 2021-02-03 13:14:38 +01:00
zhuanlan.zhihu.com.txt Update zhuanlan.zhihu.com.txt (#902) 2021-08-17 16:50:01 +02:00
zinio.com.txt
zive.cz.txt
zoomit.ir.txt
zwiftinsider.com.txt Create zwiftinsider.com.txt (#1570) 2025-03-15 14:48:33 +01:00

Full-Text RSS site config files

Full-Text RSS, our article extraction tool, makes use of site-specific extraction rules to improve results. Each time a URL is processed, the tool checks whether there are extraction rules for the site being processed. If no rules are found, it tries to detect the content block automatically.

This repository contains the site-specific extraction rules we rely on in Full-Text RSS.

Contributing changes

We run automated tests on these files to detect issues. If you'd like to help keep these up to date, please look at the test results and see which files you'd like to contribute fixes for.

We chose GitHub for this set of files because they offer one feature which we hope will make contributing changes easier: file editing through the web interface.

You can now make changes to any of our site config files and request that your changes be pulled into the main set we maintain. When we receive a pull request we'll review the changes and if everything's okay we'll update our copy.

If a site is not in our set, you can create a file for it in the same way. See Creating files on GitHub.

How to write a site config file

The quickest and simplest way is to use our point-and-click interface. It is a simple tool, intended only to create a rule that extracts the correct content block.

For further refinements, e.g. selecting the title, stripping elements, dealing with multi-page articles, please see our help page.

File naming

Use example.com.txt for

  • www.example.com
  • example.com

Use .example.com.txt for

  • sport.example.com
  • news.example.com
  • environment.example.com
  • etc.

Use sport.example.com.txt to target just that sub-domain:

  • sport.example.com

Note: .example.com.txt will not match www.example.com or example.com
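
The naming rules above can be sketched as a small lookup function. This is only an illustration of the matching behaviour described in this section, not the code Full-Text RSS actually runs: the function name candidate_config_files, the order in which candidates are tried, and the handling of hosts with more than one sub-domain level are all assumptions.

```python
def candidate_config_files(host: str) -> list[str]:
    """Return site config file names to try for a host, most specific first.

    Hypothetical helper: the lookup order and the treatment of deeper
    sub-domains are assumptions, not taken from Full-Text RSS itself.
    """
    host = host.lower().rstrip(".")

    # www.example.com and example.com share the same file: example.com.txt
    if host.startswith("www."):
        host = host[len("www."):]

    # Exact match first, e.g. sport.example.com.txt
    candidates = [f"{host}.txt"]

    # Then wildcard files for parent domains, e.g. .example.com.txt.
    # The bare domain gets no wildcard candidate, which mirrors the note
    # that .example.com.txt does not match example.com or www.example.com.
    labels = host.split(".")
    for i in range(1, len(labels) - 1):
        parent = ".".join(labels[i:])
        candidates.append(f".{parent}.txt")

    return candidates
```

Under these assumptions, sport.example.com yields sport.example.com.txt followed by .example.com.txt, while www.example.com and example.com both yield only example.com.txt. The sketch knows nothing about public suffixes such as co.uk, so a real implementation would likely need extra care there.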

Instapaper

When we introduced site patterns, we chose to adopt the same format used by Instapaper. This allowed us to make use of the extraction rules contributed by Instapaper users.

Marco, Instapaper's creator, graciously opened up the database of contributions to everyone:

And, recognizing that your efforts could be useful to a wide range of other tools and services, I'll make the list of all of these site-specific configurations available to the public, free, with no strings attached.

You can see the list maintained by Instapaper at instapaper.com/bodytext/ (no longer available since Instapaper was sold).

Testing site config files

Currently, you need a copy of Full-Text RSS to test changes to the site config files. In the future we will try to make this process easier.